Watchdog Management - 2021.1 English

Zynq UltraScale+ MPSoC Software Developer Guide (UG1137)

Document ID: UG1137
Release Date: 2021-07-13
Version: 2021.1 English

The FPD WDT monitors the state of the APU. Software running on the APU periodically touches the FPD WDT to keep it from timing out. A WDT timeout indicates an unexpected condition on the APU that prevents the software from running properly, and it triggers an APU restart. The FPD WDT is configured by the PMU firmware at initialization, but it is serviced periodically by software running on the APU.

The default WDT timeout is 60 seconds; it can be changed with the RECOVERY_TIMEOUT flag in the PMU firmware. When the APU subsystem goes into a restart cycle, the FPD WDT is kept running to ensure that the restart lands in a clean running state in which software running on the APU can touch the WDT again. The timeout must therefore be long enough to cover the entire APU subsystem restart cycle, so that the WDT does not time out in the middle of the restart process.

It is advisable to start providing the heartbeat as early as is feasible in Linux. The PetaLinux BSP includes a recipe that adds the watchdog management service to init.d. Because the FPD WDT is owned by the PMU firmware, it would be unsafe to use a full-fledged Linux driver to handle the WDT. Instead, pump the heartbeat by writing the restart key (0x1999) to the restart register (WDT base + 0x8) of the WDT. This can be done from a C program daemon or from a bash script daemon.
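The timing constraint on the watchdog timeout can be written down as a simple check. The following is a minimal sketch, not part of the PMU firmware or the PetaLinux BSP: the helper names are hypothetical, and the restart duration and the delay until the first heartbeat are values you would have to measure on your own system.

```c
#include <assert.h>
#include <stdbool.h>

/* Default FPD WDT timeout stated in this section, in seconds. */
#define WDT_DEFAULT_TIMEOUT_S 60u

/*
 * Hypothetical helper: returns true if the watchdog timeout is longer than
 * the measured APU restart time plus the delay until the first heartbeat
 * is written after the restart. Both inputs are assumed measurements.
 */
bool wdt_timeout_covers_restart(unsigned timeout_s,
                                unsigned restart_s,
                                unsigned first_beat_delay_s)
{
    return restart_s + first_beat_delay_s < timeout_s;
}

/*
 * Hypothetical helper: number of whole heartbeat intervals that fit into
 * one timeout window. A larger value means more missed heartbeats are
 * tolerated before the watchdog expires.
 */
unsigned wdt_beats_per_timeout(unsigned timeout_s, unsigned interval_s)
{
    assert(interval_s > 0);
    return timeout_s / interval_s;
}
```

For example, with the default 60-second timeout and a heartbeat every 2 seconds, 30 heartbeat intervals fit into one timeout window.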

It is recommended that the heartbeat be issued from the idle thread or a similar low-priority thread, so that a hang of that thread can be treated as a hang of the subsystem.
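One way to approximate a low-priority heartbeat from user space is to lower the scheduling priority of the heartbeat daemon itself. The sketch below is an assumption about how this could be done, not the method used by the PetaLinux recipe: it uses the POSIX setpriority() call to move the calling process to a weak nice level, and the helper name is hypothetical. A nice level is only a stand-in for a true idle thread.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: lower the calling process to the given nice level
 * (0..19, where 19 is the weakest priority on Linux). Call it once before
 * entering the heartbeat loop. Returns 0 on success, -1 on failure.
 */
int heartbeat_lower_priority(int nice_level)
{
    /* This sketch only lowers priority; raising it needs privileges. */
    assert(nice_level >= 0 && nice_level <= 19);

    if (setpriority(PRIO_PROCESS, 0, nice_level) != 0) {
        perror("setpriority");
        return -1;
    }
    return 0;
}
```

With the daemon running at nice level 19, a system that is too overloaded to schedule it stops servicing the watchdog, which matches the intent that a starved heartbeat indicates a hung subsystem.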

The following command issues a single heartbeat stroke to the FPD WDT from the command prompt. It can be included in a bash script that runs periodically.

# devmem 0xFD4D0008 32 0x1999

The following wdt-heartbeat application periodically provides the heartbeat to the FPD WDT. For demonstration purposes, the application is launched as a daemon. Its code can be moved to a more appropriate location, such as an idle thread in Linux.

#include <stdio.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>

#define WDT_BASE            0xFD4D0000
#define WDT_RESET_OFFSET    0x8
#define WDT_RESET_KEY       0x1999

#define REG_WRITE(addr, off, val) (*(volatile unsigned int *)((addr) + (off)) = (val))
#define REG_READ(addr, off) (*(volatile unsigned int *)((addr) + (off)))

/* Map the WDT register page, write the restart key once, and unmap. */
void wdt_heartbeat(void)
{
    char *virt_addr;
    int fd;
    int map_len = getpagesize();

    fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (fd < 0) {
        perror("open /dev/mem failed");
        return;
    }

    virt_addr = mmap(NULL, map_len, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, WDT_BASE);
    if (virt_addr == MAP_FAILED) {
        perror("mmap failed");
        close(fd);
        return;
    }

    /* The mapping stays valid after the descriptor is closed. */
    close(fd);

    /* Restart the watchdog counter. */
    REG_WRITE(virt_addr, WDT_RESET_OFFSET, WDT_RESET_KEY);

    munmap(virt_addr, map_len);
}

int main(void)
{
    /* Service the watchdog every 2 seconds. */
    while (1) {
        wdt_heartbeat();
        sleep(2);
    }
    return 0;
}
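The application above maps and unmaps the register page on every heartbeat. As a possible variation, the page can be mapped once and reused. The following is a minimal sketch under that assumption; the function names wdt_kick, wdt_map, and wdt_heartbeat_loop are hypothetical and do not come from the PetaLinux BSP, and the register values are the ones given in this section.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define WDT_BASE         0xFD4D0000UL
#define WDT_RESET_OFFSET 0x8
#define WDT_RESET_KEY    0x1999

/* Write the restart key to the restart register of a mapped WDT. */
void wdt_kick(volatile uint32_t *wdt_regs)
{
    assert(wdt_regs != NULL);
    wdt_regs[WDT_RESET_OFFSET / sizeof(uint32_t)] = WDT_RESET_KEY;
}

/* Map the WDT register page once. Returns NULL on failure. */
volatile uint32_t *wdt_map(void)
{
    void *page;
    int fd = open("/dev/mem", O_RDWR | O_SYNC);

    if (fd < 0) {
        perror("open /dev/mem");
        return NULL;
    }

    page = mmap(NULL, (size_t)getpagesize(), PROT_READ | PROT_WRITE,
                MAP_SHARED, fd, (off_t)WDT_BASE);
    if (page == MAP_FAILED) {
        perror("mmap");
        close(fd);
        return NULL;
    }

    /* The mapping remains valid after the descriptor is closed. */
    close(fd);
    return (volatile uint32_t *)page;
}

/* Beat `count` times, or forever if `count` is 0. */
void wdt_heartbeat_loop(volatile uint32_t *wdt_regs,
                        unsigned interval_s, unsigned long count)
{
    for (unsigned long i = 0; count == 0 || i < count; i++) {
        wdt_kick(wdt_regs);
        sleep(interval_s);
    }
}
```

A daemon would call wdt_map() once at startup and, if the result is not NULL, pass it to wdt_heartbeat_loop() with an interval of 2 seconds and a count of 0. Keeping the register write in wdt_kick() also lets the logic be exercised against an ordinary array instead of the real device.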

When the watchdog expires, the PMU firmware receives and handles the WDT interrupt. The PMU firmware idles the subsystem's master CPUs, that is, all A53 cores (see APU Idling), and then carries out the APU-only restart flow, which includes the CPU reset and the idling and resetting of the peripherals (see Peripheral Idling) associated with the subsystem reset.

Note: If ESCALATION is enabled, the PMU firmware triggers the appropriate restart flow (which can be other than an APU-only restart), as explained in the Escalation section.