Design Criteria - 2023.2 English

Vivado Design Suite User Guide: Dynamic Function eXchange (UG909)

Document ID
UG909
Release Date
2023-11-15
Version
2023.2 English

Some component types can be reconfigured and some cannot.

  • For 7 series devices, the component rules are as follows:
    • Reconfigurable resources include CLB, block RAM, and DSP component types as well as routing resources.
    • Clocks and clock modifying logic cannot be reconfigured, and therefore must reside in the static region.
      • Includes BUFG, BUFR, MMCM, PLL, and similar components
    • The following components cannot be reconfigured, and therefore must reside in the static region:
      • I/O and I/O related components (ISERDES, OSERDES, IDELAYCTRL)
      • Serial transceivers (MGTs) and related components
      • Individual architecture feature components (such as BSCAN, STARTUP, ICAP, XADC)
  • For UltraScale and UltraScale+ devices, the list of reconfigurable component types is more extensive:
    • CLB, block RAM, and DSP component types as well as routing resources
    • Clocks and clock modifying logic, including BUFG, MMCM, PLL, and similar components
    • I/O and I/O related components (ISERDES, OSERDES, IDELAYCTRL)
      Note: The types of changes allowed for I/O components are limited. See I/O Rules for more information.
    • Serial transceivers (MGTs) and related components
    • PCIe, CMAC, Interlaken, and SYSMON blocks
    • The bitstream granularity of these new components requires that certain rules be followed. For example, partial reconfiguration of I/O requires that the entire bank, plus all clocking resources in that frame, be reconfigured together.
    • Only the configuration components (such as BSCAN, STARTUP, ICAP, and FRAME_ECC) must remain in the static portion of the design.
  • For Versal devices, in addition to all elements in the programmable logic supported for UltraScale+, the Network on Chip (NoC) is dynamically reconfigurable. 
  • Global clocking resources to RPs are limited, depending on the device and on the clock regions occupied by these RPs.
  • IP restrictions may occur due to components used to implement the IP or due to connections required by the IP. Examples include:
    • Vivado Debug Cores (See Using Vivado Debug Cores for more information on using debug cores inside of RMs)
    • IP modules with embedded global buffers or I/O (7 series only)
    • Memory IP controller (MMCM and BSCAN)
  • RMs must be initialized to ensure a predictable starting condition after reconfiguration. For all devices other than 7 series, GSR is automatically applied after DFX completes. For 7 series devices, GSR can be turned on, after meeting Pblock requirements, with the RESET_AFTER_RECONFIG Pblock property.
  • Decoupling logic is highly recommended to disconnect the reconfigurable region from the static portion of the design during the act of partial reconfiguration.
    • GSR events hold all logic inside the RM in reset until configuration completes. However, RM outputs can be random during this time, so all downstream logic should be decoupled. For 7 series, if RESET_AFTER_RECONFIG is not used, additional decoupling of clocks and inputs can be required to prevent unintended capture of erroneous data during reconfiguration (for example, a spurious write to memory).
    • The Vivado Design Suite includes the DFX Decoupler IP. This IP allows users to easily insert multiplexers to efficiently decouple AXI4-Lite, AXI4-Stream, and custom interfaces. More information on the DFX Decoupler IP is available on the Xilinx.com website.
  • An RP must be floorplanned with a Pblock, so the module must be a block that can be physically isolated and meet timing. If the module is complete, it is recommended to run this design through a non-DFX flow to get an initial evaluation of placement, routing, and timing results. If the design has issues in a non-DFX flow, these should be resolved before moving on to the DFX flow.
  • Optimize an RP's interface as much as possible. An excessive number of interface pins on an RP can cause timing and routing issues. This is especially true if the partition pins are densely placed. This can happen for two reasons:
    1. The RP Pblock is relatively small compared to the number of partition pins.
    2. All the partition pins are placed in a small area because of their static connections.

    Consider the RP interface when designing and floorplanning for DFX.

  • Virtex 7 SSI devices (7V2000T, 7VX1140T, 7VH870T, 7VH580T) have two fundamental requirements:
    • Reconfigurable regions must be fully contained within a single SLR. This ensures that the global reset events are properly synchronized across all elements in the RM, and that all super long lines (SLLs) are contained within the static portion of the design. SLLs are not partially reconfigurable.
    • If the initial configuration of a 7 series SSI device is done through an SPIx1 interface, partial bitstreams must be delivered to the ICAP located on the SLR where the RP exists, or to an external port, such as JTAG. If the initial configuration is done through any other configuration port, the master ICAP can be used as the delivery port for partial bitstreams.
  • UltraScale devices have a new requirement related to partial reconfiguration events. Before a partial bitstream for a new RM is loaded, the current RM must be "cleared" to prepare for reconfiguration. UltraScale+ devices do not have this limitation. For more information, see Summary of BIT Files for UltraScale Devices.
  • Dedicated encryption support for partial bitstreams is available natively. See Known Limitations for specific unsupported use cases for UltraScale devices.
  • Devices can use a per-frame CRC checking mechanism, enabled by write_bitstream, to ensure each frame is valid before loading.
  • Optimization across the DFX boundary is prohibited by the implementation tools. Often the worst negative slack (WNS) paths in a DFX design are high fanout control/reset signals that cross the RP boundary. Avoid high fanout signals crossing the RP boundary because the drivers cannot be replicated. To allow the tools maximum flexibility of optimization/replication, consider the following:
    • For inputs to the RP, make the signal crossing the RP boundary a single fanout net, and register the signal inside the RM before the fanout. That register can then be replicated as necessary inside the RM (or its output put on global resources).
    • For outputs, again make the signal crossing the DFX boundary a single fanout net. Register the signal in static before the fanout for replication/optimization.
  • For designs with multiple RPs, AMD recommends not having direct connections between two RPs. This includes connections that pass through asynchronous static logic (that is, logic not registered in static). If direct connections exist between two RPs, all possible configurations must be verified in static timing analysis to ensure timing is met across these interfaces. This can be done for closed systems that are fully owned and maintained by a single user, but can be impossible to verify for designs where different RMs are developed by multiple users. Adding a synchronous endpoint in static ensures timing is always met on any configuration, as long as the configuration where the RM was implemented met timing.
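Several items in the list above (floorplanning an RP with a Pblock, the RESET_AFTER_RECONFIG Pblock property for 7 series, and per-frame CRC checking) correspond to a few Tcl/XDC commands. The fragment below is a minimal sketch, not a complete flow. The cell name inst_rp, the Pblock name pblock_rp, and the site range are hypothetical placeholders and must be replaced with values that suit the target device and meet the Pblock requirements for that family.

```tcl
# Hypothetical names: inst_rp is the cell that becomes the RP,
# pblock_rp is its Pblock. The SLICE range is a placeholder only.

# Mark the cell as a Reconfigurable Partition.
set_property HD.RECONFIGURABLE true [get_cells inst_rp]

# Floorplan the RP: create a Pblock, assign the cell, give it a range.
create_pblock pblock_rp
add_cells_to_pblock [get_pblocks pblock_rp] [get_cells inst_rp]
resize_pblock [get_pblocks pblock_rp] -add {SLICE_X0Y0:SLICE_X49Y49}

# 7 series only: request GSR for the RM after reconfiguration.
# Other families apply GSR automatically after DFX completes.
set_property RESET_AFTER_RECONFIG true [get_pblocks pblock_rp]

# Enable per-frame CRC checking; set this before running write_bitstream.
set_property BITSTREAM.GENERAL.PERFRAMECRC YES [current_design]
```

Running the design through a non-DFX flow first, as recommended above, is a reasonable way to check that the chosen Pblock range can be placed, routed, and timed before these constraints are applied.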

Dynamic Function eXchange is a powerful capability within AMD devices, and understanding the capabilities of the silicon and software is instrumental to success. While trade-offs must be recognized and considered during the development process, the overall result is a more flexible implementation of your FPGA design.