      Is Open Access

      Translationese as a Language in "Multilingual" NMT

      Preprint


          Abstract

          Machine translation has an undesirable propensity to produce "translationese" artifacts, which can lead to higher BLEU scores while being liked less by human raters. Motivated by this, we model translationese and original (i.e. natural) text as separate languages in a multilingual model, and pose the question: can we perform zero-shot translation between original source text and original target text? There is no data with original source and original target, so we train sentence-level classifiers to distinguish translationese from original target text, and use this classifier to tag the training data for an NMT model. Using this technique we bias the model to produce more natural outputs at test time, yielding gains in human evaluation scores on both accuracy and fluency. Additionally, we demonstrate that it is possible to bias the model to produce translationese and game the BLEU score, increasing it while decreasing human-rated quality. We analyze these models using metrics to measure the degree of translationese in the output, and present an analysis of the capriciousness of heuristically-based train-data tagging.
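The tagging step described in the abstract can be illustrated with a minimal sketch. It assumes, as the abstract does not specify, that the tag is a token prepended to the source sentence and that the classifier returns a probability that a target sentence is translationese. The tag strings, the function names, the 0.5 threshold, and the toy scorer are all hypothetical placeholders, not the authors' implementation.

```python
from typing import Callable, List, Tuple

# Hypothetical tag tokens; the abstract does not name the actual tags.
NATURAL_TAG = "<natural>"
TRANSLATIONESE_TAG = "<translationese>"


def tag_training_pairs(
    pairs: List[Tuple[str, str]],
    translationese_prob: Callable[[str], float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[Tuple[str, str]]:
    """Prepend a tag to each source sentence, chosen from the
    classifier's score for the corresponding target sentence."""
    tagged = []
    for source, target in pairs:
        is_translationese = translationese_prob(target) >= threshold
        tag = TRANSLATIONESE_TAG if is_translationese else NATURAL_TAG
        tagged.append((f"{tag} {source}", target))
    return tagged


def tag_for_inference(source: str, natural: bool = True) -> str:
    """At test time, request an output style by choosing the tag."""
    return f"{NATURAL_TAG if natural else TRANSLATIONESE_TAG} {source}"


def toy_translationese_prob(sentence: str) -> float:
    """Placeholder scorer, NOT a real classifier: flags one stiff word."""
    return 0.9 if "hereby" in sentence.lower() else 0.1


pairs = [
    ("Guten Morgen.", "Good morning."),
    ("Hiermit erkläre ich die Sitzung für eröffnet.",
     "I hereby declare the session open."),
]
tagged = tag_training_pairs(pairs, toy_translationese_prob)
```

In this sketch, biasing the model toward natural output at test time amounts to calling `tag_for_inference(source, natural=True)`. Passing `natural=False` corresponds to the abstract's experiment of deliberately requesting translationese.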


                Author and article information

                Date: 09 November 2019
                arXiv: 1911.03823
                License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
                Category: cs.CL
                Subject: Theoretical computer science
