Overview - 2.2 English

Binary CAM Search LogiCORE IP Product Guide (PG317)

Document ID: PG317
Release Date: 2021-07-27
Version: 2.2 English
The Binary CAM Search IP core (BCAM) is a member of the family of CAMs provided by Xilinx®. The family consists of three members:
Binary CAM (BCAM)
Described in this document. Used for exact matching. Entry storage is provided in UltraRAM or block RAM.
Semi TCAM (STCAM)
The STCAM is fully flexible in terms of the number, size, and position of wildcard (don't care) fields. Every key bit has a corresponding mask bit. The number of allowed unique masks is limited, however, which allows for considerable memory and logic optimizations. See the Semi-Ternary CAM Search LogiCORE IP Product Guide (PG319).
Ternary CAM (TCAM)
The primary usage of TCAM is for tables requiring full flexibility in terms of size and position of wildcard (don't care) fields. Every key bit has a corresponding mask bit stored together with the key. All entries can have different masks. TCAMs are used for Access Control List (ACL) type lookups, requiring a large number of different masks. See the Ternary CAM Search LogiCORE IP Product Guide (PG318).
One or multiple instances of each type can be used inside the same FPGA. Different types can also be mixed inside the same FPGA. Each CAM type is optimized for its specific task in terms of hardware resource usage and can be flexibly configured using VitisNetP4 or the IP integrator.

The BCAM stores {key, response} entries in either URAM or block RAM. The BCAM provides efficient use of FPGA resources, in contrast with basic BCAM implementations that store the keys in flip-flops and use logic resources for parallel key comparison.

The Lookup interface of the BCAM receives a lookup key and outputs a result that contains a match flag indicating whether the lookup key matches the key of any entry in the BCAM. If any BCAM entry is matched, the response value of the matching entry is output. The BCAM is pipelined so that it can process a Lookup Request every clock cycle.
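
The lookup behavior described above can be modeled in software as an exact-match dictionary. The following Python sketch is a behavioral model only: the class and method names are hypothetical, and it says nothing about how the IP stores entries in RAM or pipelines requests.

```python
from typing import Dict, Optional, Tuple


class BcamModel:
    """Behavioral model of an exact-match table (hypothetical names)."""

    def __init__(self, key_width: int, response_width: int) -> None:
        self.key_width = key_width
        self.response_width = response_width
        self._entries: Dict[int, int] = {}

    def insert(self, key: int, response: int) -> None:
        # Reject values that do not fit in the configured widths.
        if not 0 <= key < (1 << self.key_width):
            raise ValueError("key does not fit in KEY_WIDTH bits")
        if not 0 <= response < (1 << self.response_width):
            raise ValueError("response does not fit in RESPONSE_WIDTH bits")
        self._entries[key] = response

    def lookup(self, key: int) -> Tuple[bool, Optional[int]]:
        # Exact match: every key bit must be equal, there are no wildcards.
        if key in self._entries:
            return True, self._entries[key]
        return False, None


table = BcamModel(key_width=32, response_width=8)
table.insert(0xC0A80001, 0x2A)
hit = table.lookup(0xC0A80001)   # match flag set, response of the entry
miss = table.lookup(0xC0A80002)  # match flag clear, no response
```

The pair returned by `lookup` mirrors the match flag and response value of the Lookup interface; a real design would consume one such request per clock cycle.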

The entries are read and written using a set of high-level Application Programming Interface (API) functions. The API functions are written in C and delivered as part of the IP. The API encapsulates the details of memory management and register access and provides a simple and efficient management interface. The API software, with detailed documentation, is found on the CAM IP product page. The user provides the API with the functions for basic hardware reads and writes. This allows for flexible hardware mapping, because the communication link between the API software and the hardware can be designed to the user's specifications. The communication link could, for instance, be AXI4-Lite or PCIe.
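
The arrangement described above, in which the user supplies the basic hardware read and write functions and the management layer builds on them, is a callback-injection pattern. The sketch below illustrates that pattern in Python with an in-memory stand-in for the hardware. It is not the delivered C API: `RegisterAccess`, `FakeBus`, and the address used are all hypothetical.

```python
from typing import Callable, Dict


class RegisterAccess:
    """Management layer that reaches hardware only through two callbacks."""

    def __init__(self, hw_write: Callable[[int, int], None],
                 hw_read: Callable[[int], int]) -> None:
        self._write = hw_write
        self._read = hw_read

    def write_verified(self, address: int, value: int) -> bool:
        # Write a 32-bit word, then read it back to confirm the link works.
        word = value & 0xFFFFFFFF
        self._write(address, word)
        return self._read(address) == word


class FakeBus:
    """In-memory stand-in for a real link such as AXI4-Lite or PCIe."""

    def __init__(self) -> None:
        self.words: Dict[int, int] = {}

    def write(self, address: int, value: int) -> None:
        self.words[address] = value

    def read(self, address: int) -> int:
        return self.words.get(address, 0)


bus = FakeBus()
access = RegisterAccess(hw_write=bus.write, hw_read=bus.read)
ok = access.write_verified(0x10, 0xDEADBEEF)
```

Because the management layer never touches the bus directly, swapping `FakeBus` for a driver-backed implementation would not change `RegisterAccess`.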

The BCAM design is highly configurable at compile time to make it suitable for a large variety of applications. The table below lists the configuration parameters.

Table 1. Configuration Parameters

| Parameter Name | Valid Range | Description |
|---|---|---|
| KEY_WIDTH | 10-992 bits | The width of the lookup key. KEY_WIDTH + RESPONSE_WIDTH + 1 cannot exceed 1536/1024 [block RAM/URAM]. |
| RESPONSE_WIDTH | 1-1024 bits | The width of the lookup response. KEY_WIDTH + RESPONSE_WIDTH + 1 cannot exceed 1536/1024 [block RAM/URAM]. |
| NUM_ENTRIES | 1-1.25M | The number of usable entries (depth). To generate a BCAM with a certain memory depth (for example, 4K), specify 95% of the target: NUM_ENTRIES = 0.95 x 4096 = 3891. |
| MEMORY_PRIMITIVE | BLOCK, ULTRA, or AUTO | The compiler selects the best-suited memory type automatically. This can, however, be overridden as a user preference. |
| LOOKUP_RATE | 15-600 Mlps | The supported lookup rate of the instance, expressed in million lookups per second. To save resources, it is important not to set the lookup rate higher than required. |
| LOOKUP_INTERFACE_FREQ | 15-600 MHz | The clock frequency of the Lookup Request and response interfaces. Constraint: LOOKUP_INTERFACE_FREQ >= LOOKUP_RATE. |
| RAM_FREQ | 15-600 MHz | The clock frequency of the memories and the internal datapath. An optional, high-frequency RAM clock enables time division of the hardware resources, leading to significant savings. See the TDM_FACTOR parameter. Constraint: RAM_FREQ >= LOOKUP_INTERFACE_FREQ. |
| TDM_FACTOR | 1, 2, or 4 | Calculated from the ratio RAM_FREQ / LOOKUP_RATE. The ratio is rounded down to the nearest power of two and capped based on NUM_ENTRIES. This is further described in Resource Time Sharing. |
| CLOCKING_MODE | SINGLE-CLOCK or DUAL_CLOCK | The use of a separate RAM clock is optional. If RAM_FREQ = LOOKUP_INTERFACE_FREQ, single clock mode is enabled. In single clock mode, only the lookup interface clock is used for the lookup interfaces, RAM, and match logic. |
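
The rules in Table 1 can be checked with a few lines of arithmetic. The Python sketch below is an illustration under stated assumptions, not part of the IP or its tools: the function names are hypothetical, the width limit is read as 1536 bits for block RAM and 1024 bits for URAM, the width limit is not checked for AUTO, and the NUM_ENTRIES-based cap on TDM_FACTOR is not modeled because its rule is described elsewhere (Resource Time Sharing).

```python
from typing import List

# Assumed reading of "1536/1024 [block RAM/URAM]" in Table 1.
WIDTH_LIMIT = {"BLOCK": 1536, "ULTRA": 1024}


def usable_entries(target_depth: int) -> int:
    """NUM_ENTRIES to specify for a target memory depth (95% rule)."""
    return target_depth * 95 // 100


def tdm_factor(ram_freq_mhz: float, lookup_rate_mlps: float) -> int:
    """RAM_FREQ / LOOKUP_RATE rounded down to a power of two, at most 4."""
    if ram_freq_mhz < lookup_rate_mlps:
        raise ValueError("RAM_FREQ must be >= LOOKUP_RATE")
    ratio = ram_freq_mhz / lookup_rate_mlps
    factor = 1
    while factor * 2 <= ratio and factor * 2 <= 4:
        factor *= 2
    return factor  # the NUM_ENTRIES-based cap is NOT modeled here


def clocking_mode(ram_freq_mhz: float, lookup_interface_freq_mhz: float) -> str:
    """Single clock mode applies when both frequencies are equal."""
    if ram_freq_mhz == lookup_interface_freq_mhz:
        return "SINGLE-CLOCK"
    return "DUAL_CLOCK"


def check_parameters(key_width: int, response_width: int,
                     memory_primitive: str, lookup_rate: float,
                     lookup_interface_freq: float,
                     ram_freq: float) -> List[str]:
    """Return a list of violated Table 1 rules (empty if none)."""
    errors = []
    if not 10 <= key_width <= 992:
        errors.append("KEY_WIDTH out of range 10-992")
    if not 1 <= response_width <= 1024:
        errors.append("RESPONSE_WIDTH out of range 1-1024")
    limit = WIDTH_LIMIT.get(memory_primitive)  # None for AUTO: not checked
    if limit is not None and key_width + response_width + 1 > limit:
        errors.append("KEY_WIDTH + RESPONSE_WIDTH + 1 exceeds %d" % limit)
    for name, value in (("LOOKUP_RATE", lookup_rate),
                        ("LOOKUP_INTERFACE_FREQ", lookup_interface_freq),
                        ("RAM_FREQ", ram_freq)):
        if not 15 <= value <= 600:
            errors.append(name + " out of range 15-600")
    if lookup_interface_freq < lookup_rate:
        errors.append("LOOKUP_INTERFACE_FREQ < LOOKUP_RATE")
    if ram_freq < lookup_interface_freq:
        errors.append("RAM_FREQ < LOOKUP_INTERFACE_FREQ")
    return errors


entries_4k = usable_entries(4096)   # 3891, as in the NUM_ENTRIES row
factor = tdm_factor(600, 150)       # ratio 4.0 gives a factor of 4
problems = check_parameters(128, 32, "ULTRA", 150, 300, 600)
```

For a ratio that is not a power of two, such as RAM_FREQ = 600 MHz and LOOKUP_RATE = 200 Mlps (ratio 3.0), `tdm_factor` rounds down to 2.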

All of these parameters are extracted from the P4 code and the VitisNetP4 tool during compilation. If the BCAM is used without P4, these parameters are set in the IP Generator prior to generating the BCAM hardware and the software BCAM API. VitisNetP4 ensures that the parameters used to generate the hardware BCAM and those used to create the software BCAM are synchronized. For standalone usage, the user must guarantee that the parameters used to generate the hardware BCAM and the parameters used to call the software BCAM are identical.
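
For standalone usage, one way to guard against drift between the two parameter sets is to compare them before the software is brought up. The Python sketch below is a hypothetical helper, not something delivered with the IP; it assumes each side's parameters are available as a simple name-to-value mapping.

```python
from typing import Dict, List

_MISSING = object()  # marks a parameter that is absent on one side


def parameter_mismatches(hardware: Dict[str, object],
                         software: Dict[str, object]) -> List[str]:
    """Return the sorted names of parameters that differ between the two sets."""
    names = set(hardware) | set(software)
    return sorted(
        name for name in names
        if hardware.get(name, _MISSING) != software.get(name, _MISSING)
    )


hw_params = {"KEY_WIDTH": 128, "RESPONSE_WIDTH": 32, "NUM_ENTRIES": 3891}
sw_params = {"KEY_WIDTH": 128, "RESPONSE_WIDTH": 32, "NUM_ENTRIES": 4096}
mismatches = parameter_mismatches(hw_params, sw_params)
```

Here the software side was given the target depth instead of the 95% value, so `mismatches` contains `NUM_ENTRIES` and the setup should be rejected.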