      Is Open Access

      Multi Task Deep Morphological Analyzer: Context Aware Joint Morphological Tagging and Lemma Prediction

      Preprint

          Abstract

          Morphological analysis is an important first step in downstream tasks like machine translation and dependency parsing of morphologically rich languages (MRLs) such as those belonging to the Indo-Aryan and Dravidian families. However, the ambiguities introduced by the recombination of morphemes, which yields several possible inflections for a word, make the prediction of syntactic traits a notoriously complicated task for MRLs. We propose a character-level neural morphological analyzer, the Multi Task Deep Morphological Analyzer (MT-DMA), based on multitask learning of word-level tag markers for Hindi. To show the portability of our system to other related languages, we also present results on Urdu. MT-DMA predicts the complete set of morphological tags for words of Indo-Aryan languages: Part-of-speech (POS), Gender (G), Number (N), Person (P), Case (C), and Tense-Aspect-Modality (TAM) marker, as well as the Lemma (L), by jointly learning all of these in a single end-to-end framework. We show the effectiveness of training such deep neural networks through the simultaneous optimization of multiple loss functions and the sharing of initial parameters for context-aware morphological analysis. Our model outperforms the state-of-the-art analyzers for Hindi and Urdu. By exploring a set of character-level features in phonological space, optimized for each tag through a multi-objective genetic algorithm, and coupling them with effective training strategies, our model establishes a new state-of-the-art accuracy score on all seven tasks for both languages. MT-DMA is publicly accessible at http://35.154.251.44/.
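
          To illustrate what "simultaneous optimization of multiple loss functions" can look like, here is a minimal sketch that combines one cross-entropy loss per tag into a single joint objective. It is not the authors' implementation: only the tag names come from the abstract, while the function names, the uniform default weights, and the weighted-sum form are assumptions. Lemma prediction is left out because it is a character-sequence task and would need a sequence-level loss.

```python
import math

# Tag names are taken from the abstract; everything else is illustrative.
TAG_TASKS = ["POS", "G", "N", "P", "C", "TAM"]

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, gold_index):
    """Negative log-probability assigned to the gold class."""
    return -math.log(softmax(logits)[gold_index])

def joint_loss(task_logits, gold, weights=None):
    """Weighted sum of per-task losses (assumed form, hypothetical weights).

    task_logits: dict mapping task name -> list of class scores
    gold:        dict mapping task name -> index of the correct class
    weights:     optional dict mapping task name -> float (default 1.0 each)
    """
    weights = weights or {task: 1.0 for task in task_logits}
    return sum(
        weights[task] * cross_entropy(task_logits[task], gold[task])
        for task in task_logits
    )
```

          In a real multitask network the per-task scores would come from separate output heads on top of shared lower layers, so minimizing this single scalar updates the shared parameters with gradients from every task at once.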


                Author and article information

                Journal
                21 November 2018
                Article
                1811.08619
                6e0a6535-c34c-4f90-8f37-3f5b446f761b

                http://creativecommons.org/licenses/by/4.0/

                Custom metadata
                30 pages, 7 figures, 9 tables
                cs.CL

                Theoretical computer science
