
      Cross-lingual Distillation for Text Classification

      Preprint

          Abstract

          Cross-lingual text classification (CLTC) is the task of classifying documents written in different languages into the same taxonomy of categories. This paper presents a novel approach to CLTC that builds on model distillation, adapting and extending a framework originally proposed for model compression. Using soft probabilistic predictions for the documents in a label-rich language as the (induced) supervisory labels in a parallel corpus of documents, we successfully train classifiers for new languages in which labeled training data are not available. An adversarial feature adaptation technique is also applied during model training to reduce distribution mismatch. We conducted experiments on two benchmark CLTC datasets, treating English as the source language and German, French, Japanese and Chinese as the unlabeled target languages. The proposed approach achieved better or comparable performance relative to other state-of-the-art methods.
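
The core idea in the abstract, using a source-language classifier's soft predictions as training targets for a target-language classifier on parallel documents, can be illustrated with a minimal sketch. This is not the paper's implementation: the temperature parameter comes from the general model-distillation framework, and the function names, logit values, and three-category setup are hypothetical choices made for illustration only.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; a higher temperature gives softer outputs."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy between the teacher's soft labels and the student's predictions.

    Assumption: both models are softened with the same temperature, as in
    standard distillation. The paper's exact loss may differ.
    """
    soft_targets = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(p * math.log(q) for p, q in zip(soft_targets, student_probs))


# Hypothetical logits over three categories for one parallel document pair.
# Teacher: source-language (English) classifier applied to the English side.
teacher_logits = [2.0, 0.5, -1.0]
# Student: target-language classifier applied to the translated side.
student_agree = [1.8, 0.4, -0.9]
student_disagree = [-1.0, 0.5, 2.0]

loss_agree = distillation_loss(teacher_logits, student_agree)
loss_disagree = distillation_loss(teacher_logits, student_disagree)
print(f"agreeing student loss:    {loss_agree:.3f}")
print(f"disagreeing student loss: {loss_disagree:.3f}")
```

Minimizing this loss over the parallel corpus pushes the target-language classifier toward the source classifier's output distribution, so no target-language labels are required. The adversarial feature adaptation step mentioned in the abstract is not covered by this sketch.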


                Author and article information

                Date: 2017-05-04
                Type: Article
                arXiv ID: 1705.02073
                License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
                Note: Accepted at ACL 2017
                arXiv category: cs.CL

                Theoretical computer science
