Textmountain Detection - 3.5 English

Vitis AI Library User Guide (UG1354)

This network performs multilingual text detection. It consists of a ResNet-FPN feature extractor followed by a detection predictor, and the model is trained on the ICDAR-2017 dataset. The input is an image that contains text. The output is a structure that holds the detected words and their positions. The following image shows the result of the Textmountain model.

Figure 1. Textmountain Detection

Table 1. Textmountain Model
No.  Model Name       Framework
1    textmountain_pt  PyTorch
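
To make the description of the output more concrete, the following is a minimal sketch of how a detection result of this kind could be represented and post-processed. It is not the Vitis AI Library API: the names TextBox, DetectionResult, bounding_rect, and keep_confident are hypothetical, and the four-corner box layout and per-box confidence score are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[int, int]  # (x, y) pixel coordinates


@dataclass
class TextBox:
    """One detected text region (hypothetical type, not the library's)."""
    corners: List[Point]  # assumed: four corners of a quadrilateral
    score: float          # assumed: detection confidence in [0, 1]


@dataclass
class DetectionResult:
    """All detections for one input image (hypothetical type)."""
    width: int            # input image width in pixels
    height: int           # input image height in pixels
    boxes: List[TextBox]


def bounding_rect(box: TextBox) -> Tuple[int, int, int, int]:
    """Return the axis-aligned rectangle (x_min, y_min, x_max, y_max) around a box."""
    xs = [x for x, _ in box.corners]
    ys = [y for _, y in box.corners]
    return min(xs), min(ys), max(xs), max(ys)


def keep_confident(result: DetectionResult, threshold: float = 0.5) -> List[TextBox]:
    """Keep only the boxes whose score is at least the threshold."""
    return [b for b in result.boxes if b.score >= threshold]


# Made-up example data, not real model output.
result = DetectionResult(
    width=960,
    height=960,
    boxes=[
        TextBox([(10, 20), (110, 22), (108, 60), (8, 58)], 0.93),
        TextBox([(200, 300), (260, 300), (260, 330), (200, 330)], 0.31),
    ],
)

for box in keep_confident(result):
    print(bounding_rect(box), box.score)  # prints: (8, 20, 110, 60) 0.93
```

In this sketch each detection keeps its raw corner points, so a caller can either draw the quadrilateral directly or reduce it to an axis-aligned rectangle for cropping. The 0.5 threshold is an arbitrary illustrative value.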