Overview - 2.2 English

HBM/DDR4 Binary CAM Search LogiCORE IP Product Guide (PG336)

Document ID: PG336
Release Date: 2021-08-12
Version: 2.2 English
The HBM/DDR4 Binary CAM Search LogiCORE IP core (HBM/DDR BCAM) is a member of the family of CAMs provided by Xilinx®. The family consists of four members:
Binary CAM (BCAM)
Used for exact matching. Entry storage is provided in UltraRAM or block RAM. See the Binary CAM Search LogiCORE IP Product Guide (PG317).
HBM/DDR4 Binary CAM (HBM/DDR BCAM)
Described in this document. Used for exact matching. Entry storage is provided in DRAM.
Semi TCAM (STCAM)
The STCAM is fully flexible in terms of number, size, and position of wildcard (don't care) fields. Every key bit has a corresponding mask bit. The number of allowed unique masks is limited, however, which allows for considerable memory and logic optimizations. See the Semi-Ternary CAM Search LogiCORE IP Product Guide (PG319).
Ternary CAM (TCAM)
The primary usage of TCAM is for tables requiring full flexibility in terms of size and position of wildcard (don't care) fields. Every key bit has a corresponding mask bit stored together with the key. All entries can have different masks. TCAMs are used for Access Control List (ACL) type lookups, requiring a large number of different masks. See the Ternary CAM Search LogiCORE IP Product Guide (PG318).

One or multiple instances of each type can be used inside the same FPGA. Different types can also be mixed inside the same FPGA. Each CAM type is optimized for its specific task in terms of hardware resource usage.

The HBM/DDR BCAM stores {key, response} entries in HBM DRAM or DDR4 DRAM.

The Lookup interface of the HBM/DDR BCAM receives a lookup key and outputs a result that contains a match flag indicating whether the lookup key matches the key of any entry in the HBM/DDR BCAM. If any HBM/DDR BCAM entry is matched, the response value of the matching entry is output.
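The lookup behavior described above can be illustrated with a small software reference model. This is only a sketch of the match/response semantics: the names (`entry_t`, `lookup_result_t`, `model_lookup`) and the tiny key and response sizes are assumptions made for illustration, and the linear scan is a stand-in that says nothing about how the hardware searches DRAM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative sizes only; the real widths are set by KEY_WIDTH and RESPONSE_WIDTH. */
#define KEY_BYTES  4
#define RESP_BYTES 2

/* One {key, response} entry (hypothetical layout). */
typedef struct {
    uint8_t key[KEY_BYTES];
    uint8_t response[RESP_BYTES];
} entry_t;

/* Lookup result: a match flag plus the response of the matching entry. */
typedef struct {
    bool    match;
    uint8_t response[RESP_BYTES];
} lookup_result_t;

/* Exact-match lookup over a table of entries (linear scan, model only). */
lookup_result_t model_lookup(const entry_t *table, size_t n,
                             const uint8_t key[KEY_BYTES])
{
    lookup_result_t r = { false, {0} };
    assert(table != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (memcmp(table[i].key, key, KEY_BYTES) == 0) {
            r.match = true;
            memcpy(r.response, table[i].response, RESP_BYTES);
            break;
        }
    }
    return r;
}
```

On a miss the model returns `match == false` and leaves the response zeroed; the source only specifies the response value for the matching case.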

The entries are read and written using a set of high-level Application Programming Interface (API) functions. The API functions are written in C and delivered as part of the IP. The API encapsulates the details of memory management and register access and provides a simple and efficient management interface. The API software, with detailed documentation, is available on the CAM IP Product Page. The user provides the API with the functions for basic hardware reads and writes. This allows for flexible hardware mapping: the communication link between the API software and the hardware can be designed to the user's specifications and could, for instance, be AXI4-Lite or PCIe®.
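The idea of user-provided hardware access functions can be sketched as a pair of callbacks plus a context pointer. Everything named here (`hw_access_t`, `hw_read_fn`, `hw_write_fn`, `mgmt_set_bits`, the mock bus) is hypothetical and is not the API delivered with the IP; the sketch only shows how management code can stay independent of the communication link.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback types; NOT the signatures of the delivered API. */
typedef void     (*hw_write_fn)(void *ctx, uint32_t addr, uint32_t data);
typedef uint32_t (*hw_read_fn)(void *ctx, uint32_t addr);

/* Bundle of user-supplied access functions and their private context. */
typedef struct {
    void       *ctx;
    hw_write_fn write;
    hw_read_fn  read;
} hw_access_t;

/* Mock backend: an array standing in for a register space. A real
 * backend would issue AXI4-Lite or PCIe accesses here instead. */
#define MOCK_WORDS 64u
typedef struct {
    uint32_t regs[MOCK_WORDS];
} mock_bus_t;

void mock_write(void *ctx, uint32_t addr, uint32_t data)
{
    mock_bus_t *bus = (mock_bus_t *)ctx;
    assert(addr % 4u == 0 && addr / 4u < MOCK_WORDS);
    bus->regs[addr / 4u] = data;
}

uint32_t mock_read(void *ctx, uint32_t addr)
{
    mock_bus_t *bus = (mock_bus_t *)ctx;
    assert(addr % 4u == 0 && addr / 4u < MOCK_WORDS);
    return bus->regs[addr / 4u];
}

/* Stand-in for management code: it touches hardware only via the callbacks. */
void mgmt_set_bits(const hw_access_t *hw, uint32_t addr, uint32_t mask)
{
    uint32_t value = hw->read(hw->ctx, addr);
    hw->write(hw->ctx, addr, value | mask);
}
```

Swapping the mock functions for ones that drive a real bus changes nothing in `mgmt_set_bits`, which is the design point the paragraph above describes.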

The HBM/DDR BCAM design is highly configurable at compile time to make it suitable for a large variety of applications. The following table lists configuration parameters. Some parameters are not configurable while others are calculated and only indirectly configurable.

Table 1. Configuration Parameters
Parameter Name | Valid Range | Description
KEY_WIDTH | 10 - 992 bits | The width of the lookup key. KEY_WIDTH + RESPONSE_WIDTH + 1 cannot exceed 1024.
RESPONSE_WIDTH | 1 - 1013 bits | The width of the lookup response. KEY_WIDTH + RESPONSE_WIDTH + 1 cannot exceed 1024.
NUM_ENTRIES | 1 - 230M | The supported number of entries (depth).
LOOKUP_RATE | 15 - 150 Msps | The supported lookup rate of the HBM/DDR BCAM, expressed in million searches per second (Msps). To save resources, do not set the lookup rate higher than required.
LOOKUP_INTERFACE_FREQ | 30 - 300 MHz | Calculated and only indirectly configurable. It must satisfy LOOKUP_INTERFACE_FREQ ≥ 2 * LOOKUP_RATE. The only reason for clocking higher than the minimum required frequency is that no clock of the correct frequency is available.
CORE_FREQ | 325 MHz | The clock frequency of the CAM Database logic and the AXI3-Full Masters.
REPLICATION_FACTOR | 1, 2, 4, or 8 | Calculated and only indirectly configurable, based on LOOKUP_RATE. For lookup rates higher than 18.75 Msps, entries must be replicated. LOOKUP_RATE/18.75 is rounded upwards to the closest REPLICATION_FACTOR. For example, a specified LOOKUP_RATE of 20 Msps uses REPLICATION_FACTOR = 2.
ENTRY_SIZE | 512 or 1024 bits | To optimize DRAM data transfer efficiency, a single entry of ENTRY_SIZE bits is read from the DRAM. All entries in the HBM/DDR BCAM have the same ENTRY_SIZE. Calculated and only indirectly configurable, based on KEY_WIDTH and RESPONSE_WIDTH: (KEY_WIDTH + RESPONSE_WIDTH + 1) is rounded upwards to the closest ENTRY_SIZE.
NUM_ENTRIES_PER_LIST | 1, 2, or 4 | Calculated and only indirectly configurable, based on ENTRY_SIZE: NUM_ENTRIES_PER_LIST = 1024/ENTRY_SIZE.
STORAGE_EFFICIENCY | 0.9 | The Cuckoo algorithm packs the HBM/DDR BCAM to 90% before returning a full indication. This is not configurable.
NUM_PCS | 1 - 32 | The number of Pseudo Channels (PCs) the HBM/DDR BCAM requires. Calculated and only indirectly configurable: MAX(REPLICATION_FACTOR, NUM_ENTRIES * REPLICATION_FACTOR/(512K * STORAGE_EFFICIENCY * NUM_ENTRIES_PER_LIST * 4)) is rounded upwards to the closest NUM_PCS.
START_PC | 0 - 31 | The first PC of the HBM/DDR BCAM. The HBM/DDR BCAM uses NUM_PCS consecutive PCs. Two HBM/DDR BCAMs cannot use the same PC. An HBM/DDR BCAM can use PCs from both HBM stacks.
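The derived parameters in Table 1 can be worked through with a few helper functions. This is a minimal sketch under stated assumptions, not the calculation performed by the tools: the function names are hypothetical, "512K" is taken to mean 512 * 1024, and "rounded upwards to the closest NUM_PCS" is taken to mean the next whole number of PCs in the range 1 to 32. The sketch only produces the ENTRY_SIZE values listed in the table (512 or 1024). Each function returns 0 when the input falls outside the documented range.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: "512K" in the NUM_PCS formula means 512 * 1024. */
#define BCAM_512K (512u * 1024u)

/* LOOKUP_RATE / 18.75 rounded up to 1, 2, 4 or 8; 0 if out of range. */
unsigned bcam_replication_factor(double lookup_rate_msps)
{
    if (lookup_rate_msps <= 0.0)
        return 0;
    for (unsigned rf = 1; rf <= 8; rf *= 2) {
        if (lookup_rate_msps <= 18.75 * rf)
            return rf;
    }
    return 0;
}

/* Minimum LOOKUP_INTERFACE_FREQ in MHz: 2 * LOOKUP_RATE. */
double bcam_min_lookup_interface_freq_mhz(double lookup_rate_msps)
{
    return 2.0 * lookup_rate_msps;
}

/* KEY_WIDTH + RESPONSE_WIDTH + 1 rounded up to 512 or 1024; 0 if too wide. */
unsigned bcam_entry_size(unsigned key_width, unsigned response_width)
{
    unsigned bits = key_width + response_width + 1u;
    if (bits <= 512u)
        return 512u;
    if (bits <= 1024u)
        return 1024u;
    return 0;
}

/* NUM_ENTRIES_PER_LIST = 1024 / ENTRY_SIZE. */
unsigned bcam_entries_per_list(unsigned entry_size)
{
    assert(entry_size != 0);
    return 1024u / entry_size;
}

/* MAX(RF, NUM_ENTRIES * RF / (512K * 0.9 * NUM_ENTRIES_PER_LIST * 4)),
 * rounded up to a whole number of PCs; 0 if more than 32 are needed.
 * STORAGE_EFFICIENCY = 0.9 is folded in as 9/10 to keep integer math. */
unsigned bcam_num_pcs(uint64_t num_entries, unsigned rf, unsigned entries_per_list)
{
    assert(rf >= 1 && entries_per_list >= 1);
    uint64_t num = num_entries * rf * 10u;
    uint64_t den = (uint64_t)BCAM_512K * 9u * entries_per_list * 4u;
    uint64_t by_capacity = (num + den - 1u) / den;   /* ceiling division */
    uint64_t pcs = by_capacity > rf ? by_capacity : rf;
    return pcs <= 32u ? (unsigned)pcs : 0;
}
```

With the table's own example, a LOOKUP_RATE of 20 Msps gives a replication factor of 2 and a minimum interface frequency of 40 MHz. Taking a 128-bit key with a 32-bit response as an illustrative case, the entry size is 512 bits with 2 entries per list, and under the assumptions above 10,000,000 entries require 6 PCs.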

All of these parameters are extracted or calculated from the P4 code by the VitisNetP4 tool during compilation. VitisNetP4 ensures that the parameters used to generate the hardware HBM/DDR BCAM and those used to create the software HBM/DDR BCAM are synchronized. For standalone usage, the user must guarantee that the parameters used to generate the hardware HBM/DDR BCAM and the parameters used to call the software HBM/DDR BCAM are identical.
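For standalone usage, one simple safeguard is to keep both parameter sets in the same structure and compare them before the software is initialized. The sketch below is illustrative only: `bcam_params_t` and `bcam_params_equal` are hypothetical names, and the chosen fields are an assumed subset of Table 1, not the parameter structure of the delivered API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parameter record holding directly configurable values. */
typedef struct {
    uint32_t key_width;        /* KEY_WIDTH in bits */
    uint32_t response_width;   /* RESPONSE_WIDTH in bits */
    uint64_t num_entries;      /* NUM_ENTRIES */
    uint32_t lookup_rate_msps; /* LOOKUP_RATE in Msps */
    uint32_t start_pc;         /* START_PC */
} bcam_params_t;

/* Field-by-field comparison; avoids memcmp so struct padding is ignored. */
bool bcam_params_equal(const bcam_params_t *hw, const bcam_params_t *sw)
{
    assert(hw != NULL && sw != NULL);
    return hw->key_width == sw->key_width
        && hw->response_width == sw->response_width
        && hw->num_entries == sw->num_entries
        && hw->lookup_rate_msps == sw->lookup_rate_msps
        && hw->start_pc == sw->start_pc;
}
```

A build or bring-up script could refuse to proceed when this check fails, which turns a silent hardware/software mismatch into an explicit error.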