Overview - 2.6 English

Ternary CAM Search v2.6 LogiCORE IP Product Guide (PG318)

Document ID: PG318
Release Date: 2023-11-01
Version: 2.6 English

The Ternary CAM Search IP core (TCAM) is a member of the family of CAMs provided by AMD. The family consists of four members:

  • Binary CAM (BCAM): Used for exact matching. See the Binary CAM Search LogiCORE IP Product Guide (PG317).
  • Hardware-managed binary CAM (CBCAM): Entries can be inserted and deleted through a hardware interface, without a software driver. Software-managed inserts and deletes are also supported. The CBCAM software driver keeps no shadow memory of the CAM contents and therefore requires less memory and fewer CPU resources. See the Binary CAM Search LogiCORE IP Product Guide (PG317).
  • Semi TCAM (STCAM): The STCAM is fully flexible in terms of the number, size, and position of wildcard (don't care) fields. Every key bit has a corresponding mask bit. The number of allowed unique masks is, however, limited, which allows for considerable memory and logic optimizations. See the Semi-Ternary CAM Search LogiCORE IP Product Guide (PG319).
  • Ternary CAM (TCAM): Described in this document. The TCAM is primarily used for tables that require full flexibility in the size and position of wildcard (don't care) fields. Every key bit has a corresponding mask bit stored together with the key, and all entries can have different masks. TCAMs are used for Access Control List (ACL) type lookups, which require a large number of different masks.

One or multiple instances of each type can be used inside the same FPGA. Different types can also be mixed inside the same FPGA. Each CAM type is optimized for its specific task in terms of hardware resource usage.

The TCAM stores {key, mask, priority, response} entries in either UltraRAM (URAM) or block RAM. The TCAM provides efficient use of FPGA resources, in contrast with basic TCAM implementations that store the keys in flip-flops and use logic resources for parallel key comparison.

The Lookup interface of the TCAM receives a lookup key and outputs a result containing a match flag, which indicates whether the lookup key matches the key of any entry in the TCAM. The match comparison allows for don't care bits by applying the entry mask to both the lookup key and the entry key. If one or more TCAM entries match, the priority and the response value of the matching entry with the lowest priority value are output. The TCAM is pipelined so that it can process a Lookup Request every clock cycle.
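
The matching rule described above can be modeled in a few lines of C. The following is a minimal software sketch, not the IP's implementation: the type and function names (tcam_entry_t, tcam_result_t, tcam_model_lookup), the 64-bit key width, and the mask polarity (1 = compare, 0 = don't care) are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of one TCAM entry (64-bit key for brevity). */
typedef struct {
    uint64_t key;       /* key bits */
    uint64_t mask;      /* assumed polarity: 1 = compare this bit, 0 = don't care */
    uint32_t priority;  /* lower value wins */
    uint32_t response;  /* value returned on a match */
} tcam_entry_t;

/* Hypothetical lookup result: match flag plus the winning priority/response. */
typedef struct {
    bool     match;
    uint32_t priority;
    uint32_t response;
} tcam_result_t;

/* Reference search over all entries. The hardware performs the comparison in a
 * pipelined fashion; this loop only models the result it would report. */
static tcam_result_t tcam_model_lookup(const tcam_entry_t *entries, size_t n,
                                       uint64_t lookup_key)
{
    tcam_result_t res = { false, 0, 0 };

    assert(entries != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        const tcam_entry_t *e = &entries[i];

        /* Apply the entry mask to both keys before comparing. */
        if ((lookup_key & e->mask) != (e->key & e->mask))
            continue;

        /* Among all matching entries, keep the lowest priority value. */
        if (!res.match || e->priority < res.priority) {
            res.match = true;
            res.priority = e->priority;
            res.response = e->response;
        }
    }
    return res;
}
```

In this model, a lookup key that matches both a wide-mask entry and a narrow-mask entry returns the response of whichever entry has the numerically lower priority, independent of the order in which the entries are stored.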

In addition to ternary matching, the TCAM supports range comparisons, for example, for TCP/UDP port range checking. Up to eight sub-fields of the key can optionally be designated for numerical range checking ([min:max]), as opposed to ternary (key/mask) matching. The position and the width of these sub-fields of the key are statically configured according to user specifications.
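
A range sub-field is compared numerically rather than bit by bit. The sketch below illustrates the idea for a key held in a 64-bit word. The helper names (extract_field, range_field_matches), the LSB-based bit numbering, and the inclusive treatment of both bounds are assumptions for illustration and are not taken from the IP or its API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract `width` bits starting at bit `pos` (bit 0 = LSB) from a 64-bit key.
 * The width limit mirrors the RANGE_SIZE parameter (2-16 bits). */
static uint32_t extract_field(uint64_t key, unsigned pos, unsigned width)
{
    assert(width >= 2 && width <= 16 && pos + width <= 64);
    return (uint32_t)((key >> pos) & ((1u << width) - 1u));
}

/* Numerical range check on one sub-field: true when min <= field <= max. */
static bool range_field_matches(uint64_t key, unsigned pos, unsigned width,
                                uint32_t min, uint32_t max)
{
    uint32_t v = extract_field(key, pos, width);
    return v >= min && v <= max;
}
```

For example, with a destination port placed in bits 31:16 of the key, range_field_matches(key, 16, 16, 19, 25) is true only for ports 19 through 25.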

The rule entries are read and written using a set of high-level API functions. The API functions are written in C and delivered as part of the IP. The API encapsulates the details of memory management and register access and provides a simple and efficient management interface. The API software, with detailed documentation, is available on the CAM IP product page. You provide only the functions for basic hardware reads and writes to the API. This allows for flexible hardware mapping, and the communication link between the API software and the hardware can be designed to your specifications. The communication link could, for instance, be AXI4-Lite or PCIe®. The TCAM API insert function has four arguments: data, mask, priority, and response. For range values, the minimum value is encoded in the data bits and the maximum value in the mask bits.

Examples of rules that can be programmed into the TCAM using the provided API functions are shown in the following table.

Table 1. Example of TCAM Rules
Priority  IPv4 Src Addr/Mask  IPv4 Dst Addr/Mask  TCP Src Port  TCP Dst Port  Protocol/Mask  Value
0         198.238.184.78/32   142.205.117.12/30   0:65535       0:65535       0x00/0x00      0
1         0.0.0.0/0           198.238.186.122/32  0:65535       19:25         0x11/0xff      2
2         0.0.0.0/0           198.238.190.245/32  1024:65535    8080:8080     0x11/0xff      3
3         0.0.0.0/0           198.238.190.174/32  0:65535       67:67         0x11/0xff      0

The rules in the table consist of the following five fields:

  • IPv4 source address and mask
  • IPv4 destination address and mask
  • TCP source port [min:max] range
  • TCP destination port [min:max] range
  • IPv4 protocol number and mask

The priority and value are provided per rule. The rules are ordered according to priority. If there are multiple matches, the rule with the lowest priority value wins.
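
To make the data/mask encoding concrete, the sketch below builds the four insert arguments (data, mask, priority, response) for a rule of the kind shown in Table 1. It is an illustration under stated assumptions, not the delivered API: the acl_key_t field layout, the acl_rule_t container, and the helpers ipv4, prefix_to_mask, and make_rule are hypothetical, and the actual bit packing of the key is defined by your TCAM configuration. The only behavior taken from this guide is that ternary fields carry a key and a mask, and that range fields carry the minimum in the data bits and the maximum in the mask bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical key layout for the five fields of Table 1 (104 bits in total). */
typedef struct {
    uint32_t src_addr;  /* ternary field */
    uint32_t dst_addr;  /* ternary field */
    uint16_t src_port;  /* range field   */
    uint16_t dst_port;  /* range field   */
    uint8_t  protocol;  /* ternary field */
} acl_key_t;

/* Hypothetical container for the four insert arguments. */
typedef struct {
    acl_key_t data;
    acl_key_t mask;
    uint32_t  priority;
    uint32_t  response;
} acl_rule_t;

/* Build a 32-bit IPv4 address from its four dotted-decimal octets. */
static uint32_t ipv4(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
    return (a << 24) | (b << 16) | (c << 8) | d;
}

/* Convert a prefix length (0-32) to a mask; /0 is handled separately
 * because shifting a 32-bit value by 32 is undefined in C. */
static uint32_t prefix_to_mask(unsigned prefix_len)
{
    assert(prefix_len <= 32);
    return prefix_len == 0 ? 0u : 0xFFFFFFFFu << (32 - prefix_len);
}

static acl_rule_t make_rule(uint32_t priority,
                            uint32_t src, unsigned src_len,
                            uint32_t dst, unsigned dst_len,
                            uint16_t sp_min, uint16_t sp_max,
                            uint16_t dp_min, uint16_t dp_max,
                            uint8_t proto, uint8_t proto_mask,
                            uint32_t response)
{
    acl_rule_t r;

    assert(sp_min <= sp_max && dp_min <= dp_max);

    /* Ternary fields: key in data, compare bits in mask. */
    r.data.src_addr = src;    r.mask.src_addr = prefix_to_mask(src_len);
    r.data.dst_addr = dst;    r.mask.dst_addr = prefix_to_mask(dst_len);
    r.data.protocol = proto;  r.mask.protocol = proto_mask;

    /* Range fields: minimum in the data bits, maximum in the mask bits. */
    r.data.src_port = sp_min; r.mask.src_port = sp_max;
    r.data.dst_port = dp_min; r.mask.dst_port = dp_max;

    r.priority = priority;
    r.response = response;
    return r;
}
```

The rule with priority 2 in Table 1 would be built as make_rule(2, ipv4(0, 0, 0, 0), 0, ipv4(198, 238, 190, 245), 32, 1024, 65535, 8080, 8080, 0x11, 0xff, 3). The resulting data and mask would then be serialized into the key layout of the generated TCAM before being passed to the insert function.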

The TCAM design is highly configurable at compile time to make it suitable for a large variety of applications. The following table lists the configuration parameters.

Table 2. Configuration Parameters
Parameter Name         Valid Range                 Description
KEY_WIDTH              10-992 bits                 The width of the lookup key. 2*KEY_WIDTH + RESPONSE_WIDTH + PRIORITY_WIDTH + 1 cannot exceed 2048.
RESPONSE_WIDTH         1-1024 bits                 The width of the lookup response. 2*KEY_WIDTH + RESPONSE_WIDTH + PRIORITY_WIDTH + 1 cannot exceed 2048.
PRIORITY_WIDTH         0-32 bits                   The priority is usually defined wide enough to support one unique priority value per entry. The width can be larger to facilitate easier TCAM management, or narrower if entries are order independent and guaranteed not to overlap.
NUM_RANGES             0-8 ranges                  The number of range fields. The bit positions of the range fields in the lookup key must not overlap.
RANGE_SIZE             2-16 bits                   The width of the range fields.
NUM_ENTRIES            1-32K                       The supported number of entries (depth).
MEMORY_PRIMITIVE       BLOCK or ULTRA or AUTO      The compiler selects the best-suited memory type automatically. This can, however, be overridden as a user preference.
LOOKUP_RATE            15-600 Mlps                 The supported lookup rate of the instance, expressed in million lookups per second (Mlps). To save resources, it is important not to set the lookup rate higher than required.
LOOKUP_INTERFACE_FREQ  15-600 MHz                  The clock frequency of the Lookup Request and response interfaces. Constraint: LOOKUP_INTERFACE_FREQ >= LOOKUP_RATE.
RAM_FREQ               15-600 MHz                  The clock frequency of the memories and the internal datapath. An optional, high-frequency RAM clock enables time division of the hardware resources, leading to significant savings. See the TDM_FACTOR parameter.
TDM_FACTOR             1, 2, 4, 8, 16, or 32       Calculated as RAM_FREQ / LOOKUP_RATE, with the ratio rounded down to the nearest power of two. Example: with a RAM clock frequency of 600 MHz and a lookup rate of 150 Mlps, TDM_FACTOR = 600 / 150 = 4. The RAM can then be accessed four times per lookup, reducing RAM and logic resources by up to a factor of four for small table configurations.
CLOCKING_MODE          SINGLE-CLOCK or DUAL_CLOCK  The use of a separate RAM clock is optional. If RAM_FREQ = LOOKUP_INTERFACE_FREQ, single clock mode is enabled. In single clock mode, only the lookup interface clock is used for the lookup interfaces, RAM, and match logic.
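
The arithmetic constraints in Table 2 can be checked before generating an instance. The sketch below is a hypothetical helper, not part of the delivered API: the function names entry_width_ok and tdm_factor are invented for illustration, and capping the result at 32 (the largest value listed for TDM_FACTOR) and returning 0 when RAM_FREQ is below LOOKUP_RATE are assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Entry width constraint from Table 2:
 * 2*KEY_WIDTH + RESPONSE_WIDTH + PRIORITY_WIDTH + 1 cannot exceed 2048. */
static bool entry_width_ok(unsigned key_width, unsigned response_width,
                           unsigned priority_width)
{
    return 2u * key_width + response_width + priority_width + 1u <= 2048u;
}

/* TDM_FACTOR = RAM_FREQ / LOOKUP_RATE, rounded down to the nearest power of
 * two. Assumptions: the result is capped at 32, and 0 is returned when the
 * RAM clock is slower than the lookup rate. */
static unsigned tdm_factor(unsigned ram_freq_mhz, unsigned lookup_rate_mlps)
{
    assert(lookup_rate_mlps > 0);

    unsigned ratio = ram_freq_mhz / lookup_rate_mlps;  /* integer division floors */
    unsigned factor = 0;

    /* Walk the powers of two up to 32 and keep the largest one <= ratio. */
    for (unsigned p = 1; p <= 32 && p <= ratio; p *= 2)
        factor = p;
    return factor;
}
```

With the values from the table example, tdm_factor(600, 150) evaluates to 4. A ratio that is not itself a power of two, such as 600 / 200 = 3, rounds down to 2.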

All of these parameters are extracted from the P4 code and the VitisNetP4 tool during compilation. If the TCAM is used without P4, these parameters must be set before generating the hardware TCAM or calling the software TCAM API. VitisNetP4 ensures that the parameters used to generate the hardware TCAM and those used to create the software TCAM instance are synchronized. For standalone usage, you must guarantee that the parameters used to generate the hardware TCAM and the parameters used to call the software TCAM API are identical.