Implementation - 2023.2 English

The input matrix must satisfy the following conditions:

Directed graph
No self edges
No duplicate edges
Compressed sparse column (CSC) format
A maximum of 64M vertices and 128M edges for this design; scalability is still at board level.
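The conditions above can be checked on the host before a graph is passed to the kernel. The following is a minimal sketch rather than part of the library: the function name `check_csc_graph`, the array names `offsets` and `indices`, and the reading of "64M" and "128M" as multiples of 2^20 are all assumptions made for illustration.

```python
def check_csc_graph(offsets, indices,
                    max_vertices=64 * 2**20,   # assumption: "64M" read as 64 * 2^20
                    max_edges=128 * 2**20):    # assumption: "128M" read as 128 * 2^20
    """Return a list of problems found in a CSC graph; an empty list means OK.

    offsets: column pointers, length num_vertices + 1 (hypothetical name)
    indices: row index of every stored edge (hypothetical name)
    """
    num_edges = len(indices)
    if not offsets or offsets[0] != 0 or offsets[-1] != num_edges:
        return ["malformed offsets array"]

    problems = []
    num_vertices = len(offsets) - 1
    if num_vertices > max_vertices:
        problems.append("too many vertices")
    if num_edges > max_edges:
        problems.append("too many edges")

    for col in range(num_vertices):
        start, end = offsets[col], offsets[col + 1]
        if end < start:
            problems.append(f"offsets decrease at column {col}")
            continue
        rows = indices[start:end]
        if any(r < 0 or r >= num_vertices for r in rows):
            problems.append(f"row index out of range in column {col}")
        if col in rows:
            problems.append(f"self edge at vertex {col}")
        if len(set(rows)) != len(rows):
            problems.append(f"duplicate edge in column {col}")
    return problems
```

An empty result means the graph passed every check in this sketch; it does not guarantee that the kernel accepts the graph.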

To let the API exploit the higher bandwidth of HBM-based boards, this HBM-optimized version is implemented. The algorithm implementation is shown in the figures below:

Figure 1: PageRankMultiChannels calculate degree architecture on FPGA

Figure 2: PageRankMultiChannels initiation module architecture on FPGA

Figure 3: PageRankMultiChannels Adder architecture on FPGA

Figure 4: PageRankMultiChannels calConvergence architecture on FPGA

As the figures show:

Module calculate degree: first computes the weighted out-degree of each vertex and stores the results in one DDR buffer.
Module initiation: initializes the PR DDR buffers and the constant-value buffer.
Module Adder: computes the sparse matrix multiplication.
Module calConvergence: checks the convergence of the PageRank iteration.
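To make the data flow between the four modules concrete, here is a small software reference sketch. It is not the FPGA implementation: all function names are hypothetical, and it assumes that column j of the CSC matrix lists the source vertices of the edges pointing to j, that all weights are positive, and that the textbook update PR[j] = (1 - alpha) / N + alpha * sum(w * PR[i] / outdeg[i]) is used. Dangling vertices are not treated specially.

```python
def calc_weighted_out_degree(num_vertices, indices, weights):
    # "calculate degree": sum the weights of the edges leaving each source vertex.
    out_degree = [0.0] * num_vertices
    for src, w in zip(indices, weights):
        out_degree[src] += w
    return out_degree


def initiate(num_vertices, alpha):
    # "initiation": uniform starting ranks plus the constant teleport term.
    pr = [1.0 / num_vertices] * num_vertices
    const = (1.0 - alpha) / num_vertices
    return pr, const


def adder(offsets, indices, weights, out_degree, pr, const, alpha):
    # "Adder": one sparse matrix-vector product, accumulated per CSC column.
    new_pr = []
    for dst in range(len(offsets) - 1):
        acc = 0.0
        for e in range(offsets[dst], offsets[dst + 1]):
            src = indices[e]
            acc += weights[e] * pr[src] / out_degree[src]
        new_pr.append(const + alpha * acc)
    return new_pr


def cal_convergence(old_pr, new_pr, tolerance):
    # "calConvergence": converged when the largest per-vertex change is small.
    return max(abs(a - b) for a, b in zip(old_pr, new_pr)) < tolerance


def pagerank(offsets, indices, weights, alpha=0.85, tolerance=1e-6, max_iter=100):
    n = len(offsets) - 1
    out_degree = calc_weighted_out_degree(n, indices, weights)
    pr, const = initiate(n, alpha)
    for _ in range(max_iter):
        new_pr = adder(offsets, indices, weights, out_degree, pr, const, alpha)
        done = cal_convergence(pr, new_pr, tolerance)
        pr = new_pr
        if done:
            break
    return pr
```

For example, `pagerank([0, 1, 2, 3], [2, 0, 1], [1.0, 1.0, 1.0])` describes the 3-cycle 0 -> 1 -> 2 -> 0, where every vertex ends up with the same rank. The out-degree pass is a separate step because CSC groups edges by destination, so the out-degree of a source vertex can only be obtained by scanning the whole index array.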