Performance

Semi-Ternary CAM Search LogiCORE IP Product Guide (PG319)

Document ID
PG319
Release Date
2022-05-25
Version
2.3 English

Maximum Frequencies

The Semi-Ternary CAM Search IP is designed to run at up to 600 MHz in UltraScale+™ -2 speed grade devices.

Latency

The STCAM lookup latency depends on the size of the STCAM, the TDM factor, and the memory type. For a given configuration, the lookup latency is constant. The following table shows some examples.

Table 1. Lookup Latency [Block RAM / UltraRAM]
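Entries | TDM Factor = 1 | TDM Factor = 4 | TDM Factor = 16 | TDM Factor = 32
--------|----------------|----------------|-----------------|----------------
256     | 29/29          | 13/13          | 9/9             | 9/9
1K      | 29/29          | 13/13          | 9/9             | 9/9
4K      | 30/29          | 13/13          | 9/9             | 9/9
16K     | 32/29          | 13/13          | 10/10           | NA/9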
Note: Latency values are measured in cycles of the lookup interface clock (LOOKUP_INTERFACE_FREQ).
Note: Parameters used in this example: KEY_WIDTH = 32, RESPONSE_WIDTH = 16, NUM_MASKS = 32, LOOKUP_RATE = LOOKUP_INTERFACE_FREQ.

Throughput

The lookup throughput corresponds to the LOOKUP_RATE parameter. The highest possible lookup throughput is achieved when LOOKUP_RATE equals the RAM_FREQ parameter; one Lookup Request can then be issued per RAM clock cycle. Management Requests have strictly lower priority than Lookup Requests, so the Management Request throughput becomes:

Management Request Rate = RAM_FREQ - LOOKUP_RATE*TDM_FACTOR
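As a quick check of the formula, the following Python sketch evaluates it for one parameter set. The function name is hypothetical, the units (MHz for RAM_FREQ, million lookups per second for LOOKUP_RATE) are assumptions based on the 100 GbE example parameters given later in this section, and the negative-rate check is an added sanity test rather than documented IP behavior.

```python
def management_request_rate(ram_freq, lookup_rate, tdm_factor):
    """Evaluate Management Request Rate = RAM_FREQ - LOOKUP_RATE * TDM_FACTOR.

    Assumed units: ram_freq in MHz, lookup_rate in million lookups/s,
    result in million Management Requests/s.
    """
    rate = ram_freq - lookup_rate * tdm_factor
    if rate < 0:
        # Added sanity check (assumption): lookups alone would need more
        # RAM cycles than are available.
        raise ValueError("LOOKUP_RATE * TDM_FACTOR exceeds RAM_FREQ")
    return rate


# Parameters from the 100 GbE example: RAM_FREQ = 600, LOOKUP_RATE = 148.8,
# TDM_FACTOR = 4.
print(round(management_request_rate(600, 148.8, 4), 1))  # 4.8
```

The result is rounded only to hide floating-point noise in the subtraction.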

The ECC scrubbing process has the lowest priority. A memory read followed by a potential corrective write is only executed if both the Lookup Request and Management Request FIFOs are empty. Neither the lookup throughput nor the Management Request throughput is affected. ECC scrubbing of a new address is only initiated if both FIFOs are empty and any pending corrective write has been executed.
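The priority order described above can be summarized in a small behavioral model. This Python sketch is a simplification for illustration only: the function name, argument names, and returned labels are hypothetical, and it ignores TDM slot timing and all other implementation details of the IP.

```python
def next_ram_operation(lookup_pending, management_pending,
                       corrective_write_pending):
    """Pick the operation for the next free RAM cycle (simplified model).

    Priority, highest first: Lookup Request, Management Request,
    ECC scrubbing.
    """
    if lookup_pending:
        return "lookup"
    if management_pending:
        return "management"
    # Both FIFOs are empty from here on, so ECC scrubbing may proceed.
    if corrective_write_pending:
        # Finish the outstanding correction before scrubbing a new address.
        return "ecc_corrective_write"
    return "ecc_scrub_read"


print(next_ram_operation(True, True, True))     # lookup
print(next_ram_operation(False, True, True))    # management
print(next_ram_operation(False, False, True))   # ecc_corrective_write
print(next_ram_operation(False, False, False))  # ecc_scrub_read
```

Because scrubbing only uses cycles that neither FIFO needs, it cannot reduce lookup or Management Request throughput in this model.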

All read and write Management Requests are 32 bits wide. The only exception is a write Management Request of entry data, which can be wider, as described below. The Management Request width for entry data must be known to dimension the lookup rate and RAM frequency correctly, so that enough throughput headroom remains for Management Requests.

To perform management updates while maintaining correct state for lookups, the management operations need to be atomic. This means that a complete entry must be written to the CAM Database before the entry is made active (valid). To accomplish wide writes, an entry is written using multiple Management Requests, where the last Management Request sets the valid bit. When an existing entry is being updated, the valid bit is already set, so the response data must be written using a single Management Request to keep the update atomic. For this reason, a Management Request writes at least (priority + response + valid) width bits of data. The total width is rounded upwards to the next 64-bit boundary. For a 160-bit key with 72 bits of priority + response + valid, assume that the following is written:

  • Key, 160 bits
  • Priority + Response + Valid, 72 bits

In total, 232 bits are written. The width of priority + response + valid is 72 bits, which is rounded up to 128 bits, so each Management Request writes 128 bits. Because 232/128 = 1.8125, rounding up gives 2 write Management Requests.
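The arithmetic of this example can be sketched in Python as follows. The function and constant names are hypothetical; the calculation follows only the rounding rule and the example values stated above and is not taken from the IP or its API software.

```python
BOUNDARY_BITS = 64  # rounding granularity stated in the text


def entry_write_plan(key_bits, prv_bits):
    """Return (request_width_bits, request_count) for writing a new entry.

    prv_bits is the combined priority + response + valid width.
    """
    # Round priority + response + valid up to the next 64-bit boundary.
    request_width = -(-prv_bits // BOUNDARY_BITS) * BOUNDARY_BITS
    total_bits = key_bits + prv_bits
    # Each Management Request carries request_width bits; round the count up.
    request_count = -(-total_bits // request_width)
    return request_width, request_count


# 160-bit key with 72 bits of priority + response + valid.
print(entry_write_plan(160, 72))  # (128, 2)
```

The expression `-(-a // b)` is integer ceiling division, which avoids floating-point rounding.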

The AXI4-Lite bus uses 13 bits of address and 32 bits of data, so the API software issues multiple AXI4-Lite writes for every Management Request. The AXI4-Slave assembles the AXI4-Lite writes into a single Management Request. The AXI4-Lite interface is of a standard type; refer to AXI4-Lite IPIF LogiCORE IP Product Guide (PG155).

The following table shows an example calculation for the 100 Gb Ethernet rate. Keep in mind that the calculated update rate reflects only the hardware limit; the final update rate is most likely limited by the table management software.

Table 2. 100 GbE Update Rate Example Calculation (hardware limit)
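Management Request Size [bits] | AXI4-Lite Write Operations [min / max] | Management Update Rate [M updates/s]
-------------------------------|----------------------------------------|-------------------------------------
64                             | 1 / 3                                  | 4.8
128                            | 2 / 5                                  | 2.4
256                            | 4 / 9                                  | 1.2
512                            | 8 / 17                                 | 0.6
1024                           | 16 / 33                                | 0.3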
Note: Parameters used in this example: LOOKUP_RATE = 148.8 (million lookups per second), RAM_FREQ = 600 (MHz), TDM_FACTOR = 4.
Note: The maximum number of AXI4-Lite write operations applies to random data patterns. The minimum number of operations can be achieved when the write data is constant (for example, when initializing to zero).
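The update-rate column in Table 2 is consistent with the following reading, which is an inference from the table values rather than a statement in this guide: the Management Request Rate formula earlier in this section gives the rate for 64-bit requests, and a wider request lowers the update rate in proportion to its width. The Python sketch below reproduces the column under that assumption. The function name and the `base_bits` parameter are hypothetical.

```python
def update_rate(ram_freq, lookup_rate, tdm_factor, request_bits, base_bits=64):
    """Estimate the hardware-limited update rate in M updates/s.

    Assumption (inferred from Table 2): the Management Request Rate applies
    to base_bits-wide requests and scales inversely with request width.
    """
    request_rate = ram_freq - lookup_rate * tdm_factor
    return request_rate * base_bits / request_bits


# Parameters from the Table 2 note: RAM_FREQ = 600, LOOKUP_RATE = 148.8,
# TDM_FACTOR = 4.
for size in (64, 128, 256, 512, 1024):
    print(size, round(update_rate(600, 148.8, 4, size), 1))
```

With these parameters the loop prints 4.8, 2.4, 1.2, 0.6, and 0.3 M updates/s, matching the table. The AXI4-Lite write counts are not modeled here.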