      Multilingual Neural Machine Translation with Knowledge Distillation

      Preprint


          Abstract

Multilingual machine translation, which translates multiple languages with a single model, has attracted much attention due to its efficiency in offline training and online serving. However, traditional multilingual translation usually yields inferior accuracy compared with the counterpart that uses an individual model for each language pair, because of language diversity and limited model capacity. In this paper, we propose a distillation-based approach to boost the accuracy of multilingual machine translation. Specifically, individual models are first trained and regarded as teachers, and then the multilingual model is trained to simultaneously fit the training data and match the outputs of the individual models through knowledge distillation. Experiments on the IWSLT, WMT and Ted talk translation datasets demonstrate the effectiveness of our method. In particular, we show that one model is enough to handle multiple languages (up to 44 languages in our experiments), with comparable or even better accuracy than individual models.


Most cited references (3)

• A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning
• Multi-Task Learning for Multiple Language Translation
• Sequence-Level Knowledge Distillation

                Author and article information

Journal
27 February 2019
Article: 1902.10461

License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/

History
Custom metadata: Accepted to ICLR 2019
Categories: cs.CL

Theoretical computer science
