TRIG: Transformer-Based Text Recognizer with Initial Embedding Guidance

Scene text recognition (STR) is an important bridge between images and text and has attracted abundant research attention. While convolutional neural networks (CNNs) have achieved remarkable progress in this task, most existing works need an extra module (a context modeling module) to help the CNN to cap...

Bibliographic Details
Main Authors:	Yue Tao, Zhiwei Jia, Runze Ma, Shugong Xu
Format:	article
Language:	EN
Published:	MDPI AG 2021
Subjects:	scene text recognition; transformer; self-attention; 1-D split; initial embedding; Electronics; TK7800-8360
Online Access:	https://doaj.org/article/5d61a56968d34aa99a6d35d3cd35b304

Similar Items

Development of Vertical Text Interpreter for Natural Scene Images
by: Ong Yi Ling, et al.
Published: (2021)

Environmental Sound Recognition on Embedded Systems: From FPGAs to TPUs
by: Jurgen Vandendriessche, et al.
Published: (2021)

The presence of occupational structure in online texts based on word embedding NLP models
by: Zoltán Kmetty, et al.
Published: (2021)

Performance Evaluation of Offline Speech Recognition on Edge Devices
by: Santosh Gondi, et al.
Published: (2021)

K-Nearest Neighbor for Recognize Handwritten Arabic Character
by: Muhammad Athoillah
Published: (2019)

Two-Stage Recognition and beyond for Compound Facial Emotion Recognition
by: Dorota Kamińska, et al.
Published: (2021)

Machine Learning Based Embedded Code Multi-Label Classification
by: Yu Zhou, et al.
Published: (2021)

Task-Adaptive Embedding Learning with Dynamic Kernel Fusion for Few-Shot Remote Sensing Scene Classification
by: Pei Zhang, et al.
Published: (2021)

Embedded systems design

Embedded systems programming
Published: (1988)

GACM: A Graph Attention Capsule Model for the Registration of TLS Point Clouds in the Urban Scene
by: Jianjun Zou, et al.
Published: (2021)

CE-Net: A Coordinate Embedding Network for Mismatching Removal
by: Shiyu Chen, et al.
Published: (2021)

Understanding Public Attention towards the Beautiful Village Initiative in China and Exploring the Influencing Factors: An Empirical Analysis Based on the Baidu Index
by: Qin Ji, et al.
Published: (2021)

Ensemble of Deep Masked Language Models for Effective Named Entity Recognition in Health and Life Science Corpora
by: Nona Naderi, et al.
Published: (2021)

Remote Sensing Image Scene Classification Based on Global Self-Attention Module
by: Qingwen Li, et al.
Published: (2021)

Petru Dumitras a recognized expert in cavitation technologies (on the occasion of the 70th anniversary)
by: Bologa, Mircea
Published: (2015)

Three-Dimensional Outdoor Analysis of Single Synthetic Building Structures by an Unmanned Flying Agent Using Monocular Vision
by: Andrzej Bielecki, et al.
Published: (2021)

Printed Split-Ring Loops with High Q-Factor for Wireless Power Transmission
by: Jingchen Wang, et al.
Published: (2021)

World recognized physicist and bright personality of Moldova professor Vsevolod Moskalenko at his 80th anniversary
by: Canţer, Valeriu
Published: (2008)

Are Microcontrollers Ready for Deep Learning-Based Human Activity Recognition?
by: Atis Elsts, et al.
Published: (2021)

Jenseits des Sichtbaren
by: Di Noi, Barbara
Published: (2021)

Knowledge-infused Learning for Entity Prediction in Driving Scenes
by: Ruwan Wickramarachchi, et al.
Published: (2021)

EOESGC: predicting miRNA-disease associations based on embedding of embedding and simplified graph convolutional network
by: Shanchen Pang, et al.
Published: (2021)

Language Representation Models: An Overview
by: Thorben Schomacker, et al.
Published: (2021)

Recommendation Method Integrating Review Text Hierarchical Attention with Time Information
by: XING Changzheng, GUO Yalan, ZHANG Quangui, ZHAO Hongbao
Published: (2021)

Self-Attention-Based Models for the Extraction of Molecular Interactions from Biological Texts
by: Prashant Srivastava, et al.
Published: (2021)

DIFFICULTIES EMERGING IN THE PROCESS OF TEACHING RUSSIAN STUDENTS TO MAKE A SPEECH IN JAPANESE AT THE STAGE OF THE SPEECH TEXT PRELIMINARY DEVELOPMENT
by: N. L. Maksimenko
Published: (2014)

LPNet: Retina Inspired Neural Network for Object Detection and Recognition
by: Jie Cao, et al.
Published: (2021)

Dynamics Analysis of Multi-Flexible Body Ejection of Airborne Embedded Weapon
by: Wang Qinghai
Published: (2021)

Testing the embedding effect in the valuation of lagoon recovery
by: Donoso, Guillermo, et al.
Published: (2010)

Analysis of key components of the “Digital University” Model
by: I. N. Golyshkova
Published: (2020)

Making personnel selection smarter through word embeddings: A graph-based approach
by: Nikos Kanakaris, et al.
Published: (2022)

Masked Face Recognition Using Deep Learning: A Review
by: Ahmad Alzu’bi, et al.
Published: (2021)

Violence Recognition Based on Auditory-Visual Fusion of Autoencoder Mapping
by: Jiu Lou, et al.
Published: (2021)

Optical Recognition of Handwritten Logic Formulas Using Neural Networks
by: Vaios Ampelakiotis, et al.
Published: (2021)

Deep-Learning-Based Stress Recognition with Spatial-Temporal Facial Information
by: Taejae Jeon, et al.
Published: (2021)

Application of Spectral and Wavelet Analysis of Stator Current to Detect Angular Misalignment in PMSM Drive Systems
by: Pietrzak Przemysław, et al.
Published: (2021)

Management of educational and scientific laboratory: new requirements and competences
by: F. F. Sharipov
Published: (2020)

On Question of “Analytical Article” Text Type
by: D. T. Dorzhieva
Published: (2017)

Cross-Modal Guidance Assisted Hierarchical Learning Based Siamese Network for MR Image Denoising
by: Rabia Naseem, et al.
Published: (2021)