O. Abdel-HamidL. DengD. Yu 2013 Exploring convolutional neural network structures and optimization for speech recognition Interspeech, ISCA Lyon, France 3366 3370
A. AlvarezC. MendesM. RaffaelliT. LuısS. PauloN. PiccininiH. ArzelusJ. NetoC. del AliprandiA. Pozo 2016 Automating live and batch subtitling of multimedia contents for several European languages Multimed. Tools Appl 75 10823 10853
P. BellGales, MJF., T. HainJ. KilgourP. LanchantinX. LiuA. McParlandS. RenalsO. SazM. WesterPC. Woodland 2015 The MGB challenge: Evaluating multi-genre broadcast media recognition In Proc. of IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU2015) Scottsdale, Arizona, USA, 687 693
R. CollobertC. PuhrschG. Synnaeve 2016 Wav2letter: an end-to-end convnet-based speech recognition system CoRR, Vol. abs/1609.03193
J. ConnollyK. CurranP. McKevittJ. MacraeS. Craig 2014 Broadcast Language Identification System (BLIS) In: Proc. of the 16th Irish Machine Vision and Image Processing Conference (IMVIP-14) Ulster University, UK
G. DahlD. YuL. DengA. Acero 2011 Large vocabulary continuous speech recognition with context-dependent DBN-HMMs IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Prague, Czech Republic 4688 4691
G. DahlD. YuL. DengA. Acero 2012 Context Dependent Pre-trained Deep Neural Networks for Large Vocabulary Speech Recognition 20 1 30 42
G. DahlT. SainathG. Hinton 2013 Improving DNNs for LVCSR using rectified linear units and dropout IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Vancouver ca8609 8613
L. Fryer 2018 An introduction to Audio Description: a practical guide New York, USA Routledge
G. HintonS. OsinderoY. The 2006 A fast learning algorithm for deep belief nets Neural Computation 18 1527 1554
G. HintonR. Salakhutdinov 2006 Reducing the dimensionality of data with neural networks, Science 313 5786 504 507
G. HintonL. DengD. YuG. DahlA. MohamedN. JaitlyA. SeniorV. VanhouckeP. NguyenT. N. SainathB. Kingsbury 2012 Deep Neural Networks for Acoustic Modelling in Speech Recognition IEEE Signal Processing Magazine 29 6 82 97
N. JaitlyP. NguyenA. W. SeniorV. Vanhoucke 2012 Application of Pretrained Deep Neural Networks to Large Vocabulary Speech Recognition The 13th International Speech Communication Association, in Proc Interspeech, New York, USA 2578 2581
N. Jaitly 2017 Lecture 12: End-to-End Models for Speech Processing Stanford University School of Engineering http://www.youtube.com/watch?v=3MjIkWxXigM&app=desktop
Kaldi 2018 http://kaldi-asr.org/doc/index.html
S. KarpagavalliE. Chandra 2016 A Review on Automatic Speech Recognition Architecture and Approaches, International Journal of Signal Processing Image Processing and Pattern Recognition 9 4 393 404
I. KaurN. KaurA. UmmatJ. KaurK. Navjot 2016 Automatic Speech Recognition: A Review InternatIonal Journal of Computer ScIence and Technology (IJCST) 7 4 Oct.-Dec
D. LiG. HintonB. Kingsbury 2013 New Types of Deep Neural Network Learning for Speech Recognition and Related Applications: An Overview IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada 8599 8603
D. LiJ. LiJ. HuangK. YaoD. YuF. SeideM. SeltzerG. ZweigX. HeJ. WilliamsY. GongA. Acero 2013 Recent Advances in Deep Learning for Speech Research at Microsoft IEEE International Conference on Acoustics, Speech and Signal Processing B.C. Vancouver Canada, 8604 8608
V. LiptchinskyG. SynnaeveR. Collobert 2017 Letter-Based Speech Recognition with Gated ConvNets CoRR Vol. abs/1712.09444
H. Maxwell 2018 Can we talk about the other 7%? http://www.redbeemedia.com/blog/can-talk-7/ Red Bee Media Blog
A. MohamedG. DahlG. Hinton 2009 Deep belief networks for phone recognition In Proc. NIPS Workshop on Deep Learning for Speech Recognition and Related Applications B. C. Vancouver Canada 1 9
A. MohamedG. DahlG. Hinton 2012 Acoustic modeling using deep belief networks IEEE Trans. on Audio, Speech, and Language Processing 20 1 14 22
Ofcom (2013 Measuring the quality of live subtitling http://www.ofcom.org.uk/__data/assets/pdf_file/0017/51731/qos-statement.pdf
V. PanayotovG. ChenD. PoveyS. Khudanpur 2015 Librispeech: an ASR corpus based on public domain audio books In International Conference on Acoustics, Speech and Signal Processing (ICASSP), Queensland, Australia 5206 5210
J. PanC. LiuZ. WangY. HuH. Jiang 2012 Investigation of Deep Neural Networks (DNN) for Large Vocabulary Continuous Speech Recognition: Why DNN Surpasses GMMS in Acoustic Modelling In Proc. of 8th International Symposium on Chinese Spoken Language Processing (ISCSLP’2012) Hong Kong 301 305
D. PoveyX. ZhangS. Khudanpur 2015 Parallel Training of Deep Neural Networks with Natural Gradient and Parameter Averaging In Proc. of 3rd International Conference on Learning Representations (ICLR2015) San DiegoUSA
P. Romero-Fresco 2014 Subtitling through speech recognition: respeaking Manchester, UK St. Jerome Publishing
T. SainathA. MohamedB. KingsburyB. Ramabhadran 2013 Deep convolutional neural networks for lvcsr In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing Vancouver, BC, Canada 8614 8618
F. SeideG. LiD. Yu 2011 Conversational Speech Transcription Using Context-Dependent Deep Neural Networks In Proc. Interspeech, Florence, Italy 444 447
T. SercuV. Goel 2016 Advances in Very Deep Convolutional Neural Networks for LVCSR Multimodal Algorithms and Engines Group, IBM J. Watson Research Center USA
W. SongJ. Cai 2015 End-to-End Deep Neural Network for Automatic Speech Recognition Technical Report, Department of Computer Science Stanford University
G. A. Stevenson 2016 Aalysis of Pre-Trained Deep Neural Networks for Large-Vocabulary Automatic Speech Recognition LLNL-TH-698797 July 28, Lawrence Livermore National Laboratory
P. C. WoodlandX. LiuY. QianC. ZhangM. GalesP. KaranasouP. LanchantinL. Wang 2015 Cambridge University Transcription Systems for the Multi-Genre Broadcast Challenge, Automatic Speech Recognition and Understanding (ASRU) IEEE Automatic Speech Recognition and Understanding Workshop Scottsdale, Arizona, USA 639 646
C. ZhangP. C. Woodland 2015 A general artificial neural network extension for HTK In Proc. Interspeech Dresden, Germany 3581 3585
Y. ZhangM. Pezeshki 2016 Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks In Proc. Interspeech San Francisco, USA 410 414