The input matrix must satisfy the following conditions:
- Directed graph
- No self edges
- No duplicate edges
- Compressed sparse column (CSC) format
- At most 64M vertices and 128M edges for this design; scalability remains at the board level.
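The conditions above can be checked on the host before the graph is handed to the kernel. The following is a minimal sketch, not part of the library: the function name `check_csc_graph`, its parameters, and the reading of "64M" and "128M" as multiples of 2^20 are all assumptions made for illustration.

```python
def check_csc_graph(num_vertices, col_ptr, row_idx,
                    max_vertices=64 * 2**20, max_edges=128 * 2**20):
    """Return a list of violated input conditions (empty if none).

    Hypothetical helper. col_ptr holds num_vertices + 1 column offsets,
    row_idx holds one row index per edge (standard CSC layout).
    The size limits assume "M" means 2**20, which the text does not state.
    """
    problems = []
    num_edges = len(row_idx)
    if num_vertices > max_vertices:
        problems.append("too many vertices")
    if num_edges > max_edges:
        problems.append("too many edges")

    # CSC structure: offsets start at 0, end at the edge count, never decrease.
    well_formed = (
        len(col_ptr) == num_vertices + 1
        and col_ptr[0] == 0
        and col_ptr[-1] == num_edges
        and all(col_ptr[j] <= col_ptr[j + 1] for j in range(num_vertices))
    )
    if not well_formed:
        problems.append("malformed column offsets")
        return problems

    has_self_edge = has_duplicate = has_bad_index = False
    for col in range(num_vertices):
        rows = row_idx[col_ptr[col]:col_ptr[col + 1]]
        if any(r < 0 or r >= num_vertices for r in rows):
            has_bad_index = True
        if col in rows:                   # entry on the diagonal
            has_self_edge = True
        if len(set(rows)) != len(rows):   # same (row, col) pair stored twice
            has_duplicate = True

    if has_bad_index:
        problems.append("row index out of range")
    if has_self_edge:
        problems.append("self edge")
    if has_duplicate:
        problems.append("duplicate edge")
    return problems
```

For a three-vertex cycle, `check_csc_graph(3, [0, 1, 2, 3], [2, 0, 1])` returns an empty list, meaning that none of the checked conditions is violated.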
To let the API exploit the higher bandwidth available on HBM-based boards, this HBM-optimized version is implemented. The algorithm implementation is shown in the figures below:
Figure 1 : PageRank calculate degree architecture on FPGA
Figure 2 : PageRank initiation module architecture on FPGA
Figure 3 : PageRank Adder architecture on FPGA
Figure 4 : PageRank calConvergence architecture on FPGA
As the figures show:
- Module calculate degree: first computes the weighted out-degree of each vertex and stores the results in one DDR buffer.
- Module initiation: initializes the PR DDR buffers and the constant value buffer.
- Module Adder: performs the sparse matrix multiplication.
- Module calConvergence: checks the convergence of the PageRank iteration.
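The data flow through the four modules can be illustrated with a small software model. This is a sketch under stated assumptions, not the FPGA implementation: it assumes that column j of the CSC matrix lists the source vertices of the edges pointing into vertex j, that the constant value buffer holds alpha divided by the weighted out-degree, and that the textbook normalized PageRank update is used. All function names are hypothetical, and dangling vertices (out-degree 0) simply contribute nothing.

```python
def calc_degree(num_vertices, row_idx, weights):
    """Weighted out-degree: sum the weights of every edge leaving each source."""
    degree = [0.0] * num_vertices
    for src, w in zip(row_idx, weights):
        degree[src] += w
    return degree


def initiation(num_vertices, degree, alpha):
    """Initialize the PR buffer and a per-vertex constant buffer.

    Assumption: the constant buffer stores alpha / out-degree so the
    adder can scale each contribution with a single multiply.
    """
    pr = [1.0 / num_vertices] * num_vertices
    const = [alpha / d if d > 0 else 0.0 for d in degree]
    return pr, const


def adder(num_vertices, col_ptr, row_idx, weights, pr, const, alpha):
    """One sparse matrix-vector product over the CSC columns."""
    base = (1.0 - alpha) / num_vertices
    new_pr = []
    for dst in range(num_vertices):
        acc = 0.0
        for e in range(col_ptr[dst], col_ptr[dst + 1]):
            src = row_idx[e]
            acc += weights[e] * pr[src] * const[src]
        new_pr.append(base + acc)
    return new_pr


def cal_convergence(old_pr, new_pr, tol):
    """Converged when the largest per-vertex change is below tol."""
    return max(abs(a - b) for a, b in zip(old_pr, new_pr)) < tol


def pagerank(num_vertices, col_ptr, row_idx, weights,
             alpha=0.85, tol=1e-6, max_iter=100):
    """Chain the four stages; returns (ranks, iterations used)."""
    degree = calc_degree(num_vertices, row_idx, weights)
    pr, const = initiation(num_vertices, degree, alpha)
    for iteration in range(1, max_iter + 1):
        new_pr = adder(num_vertices, col_ptr, row_idx, weights, pr, const, alpha)
        done = cal_convergence(pr, new_pr, tol)
        pr = new_pr
        if done:
            return pr, iteration
    return pr, max_iter
```

On a three-vertex cycle with unit weights, `pagerank(3, [0, 1, 2, 3], [2, 0, 1], [1.0, 1.0, 1.0])` leaves every vertex with a rank of about 1/3, since the uniform starting vector is already the fixed point of this update.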