ScienceOpen: research and publishing network


Is Open Access

End-to-End Subtitle Detection and Recognition for Videos in East Asian Languages via CNN Ensemble with Near-Human-Level Performance

Preprint

Author(s): Yan Xu, Siyuan Shan, Ziming Qiu, Zhipeng Jia, Zhengyang Shen, Yipei Wang, Mengfei Shi, Eric I-Chao Chang

Publication date Created: 2016-11-18


Abstract

In this paper, we propose an innovative end-to-end subtitle detection and recognition system for videos in East Asian languages. The system consists of multiple stages. Subtitles are first detected by a novel image operator based on the sequence information of consecutive video frames. Then, an ensemble of Convolutional Neural Networks (CNNs) trained on synthetic data is adopted for detecting and recognizing East Asian characters. Finally, a dynamic programming approach leveraging language models is applied to assemble the recognition results into complete text lines. The proposed system achieves average end-to-end accuracies of 98.2% and 98.3% on 40 videos in Simplified Chinese and 40 videos in Traditional Chinese, respectively, significantly outperforming existing methods. This near-perfect accuracy dramatically narrows the gap between human cognitive ability and state-of-the-art algorithms for this task.
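
The final stage described in the abstract, dynamic programming guided by a language model, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the language model or the scoring rule, so the sketch assumes a bigram model and a Viterbi-style search, and the names decode_line, bigram_logp, lm_weight, and unseen_logp, as well as all probabilities, are hypothetical.

```python
import math

def decode_line(candidates, bigram_logp, lm_weight=1.0, unseen_logp=-10.0):
    """Pick the best character sequence for one text line (Viterbi-style).

    candidates:  list with one dict per character position, mapping each
                 candidate character to its recognizer log-probability
                 (hypothetical input format).
    bigram_logp: dict mapping (previous_char, char) to a language-model
                 log-probability (assumed bigram model).
    """
    if not candidates:
        return ""
    # best[c] = (score of the best path ending in c, that path as a string)
    best = {c: (lp, c) for c, lp in candidates[0].items()}
    for position in candidates[1:]:
        new_best = {}
        for c, lp in position.items():
            # Extend the best-scoring predecessor path with character c.
            score, path = max(
                (prev_score + lm_weight * bigram_logp.get((p, c), unseen_logp),
                 prev_path)
                for p, (prev_score, prev_path) in best.items()
            )
            new_best[c] = (score + lp, path + c)
        best = new_best
    return max(best.values())[1]

# Made-up scores: the recognizer slightly prefers the wrong second character.
candidates = [
    {"字": math.log(0.9), "宇": math.log(0.1)},
    {"幕": math.log(0.45), "慕": math.log(0.55)},
]
bigram_logp = {("字", "幕"): math.log(0.5), ("字", "慕"): math.log(0.001)}

with_lm = decode_line(candidates, bigram_logp)
without_lm = decode_line(candidates, bigram_logp, lm_weight=0.0)
```

In this toy input the recognizer alone would output a visually similar but wrong second character, while the bigram term steers the search to the more plausible word. Setting lm_weight to 0 disables the language model and shows the difference.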


Author and article information

Article

arXiv ID: 1611.06159

SO-VID: 748e6b00-c4d3-42ae-9a69-a3d681afe0af

License:

http://arxiv.org/licenses/nonexclusive-distrib/1.0/

Comments: 35 pages

Categories: cs.CV

ScienceOpen disciplines: Computer vision & Pattern recognition
