
      When silver glitters more than gold: Bootstrapping an Italian part-of-speech tagger for Twitter

      Preprint

          Abstract

          We bootstrap a state-of-the-art part-of-speech tagger to tag Italian Twitter data, in the context of the Evalita 2016 PoSTWITA shared task. We show that training the tagger on native Twitter data, enriched with small amounts of specifically selected gold data and additional silver-labelled data scraped from Facebook, yields better results than using large amounts of manually annotated data from a mix of genres.

          Author and article information

          Journal
          2016-11-09
          Article
          1611.03057
          3beaeb1d-f61a-4bcb-b7f7-9df4eaa50521

          http://creativecommons.org/licenses/by-nc-sa/4.0/

          Custom metadata
          Proceedings of the 5th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (EVALITA 2016)
          cs.CL

          Theoretical computer science
