Open Access

      Sluice networks: Learning what to share between loosely related tasks

      Preprint


          Abstract

          Multi-task learning is partly motivated by the observation that humans bring to bear what they know about related problems when solving new ones. Similarly, deep neural networks can profit from related tasks by sharing parameters with other networks. However, humans do not consciously decide to transfer knowledge between tasks (and are typically not aware of the transfer). In machine learning, it is hard to estimate whether sharing will lead to improvements, especially if tasks are only loosely related. To overcome this, we introduce Sluice Networks, a general framework for multi-task learning in which trainable parameters control the amount of sharing, including which parts of the models to share. Our framework generalizes previous proposals by enabling hard or soft sharing of all combinations of subspaces, layers, and skip connections. We perform experiments on three task pairs from natural language processing, and across seven different domains, using data from OntoNotes 5.0, and achieve up to 15% average error reductions over common approaches to multi-task learning. We analyze when the architecture is particularly helpful, as well as its ability to fit noise. We show that a) label entropy is predictive of gains in sluice networks, confirming findings for hard parameter sharing, and b) while sluice networks easily fit noise, they are robust across domains in practice.
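
          The soft-sharing mechanism described above can be sketched compactly. Below is a minimal, illustrative PyTorch implementation, not the authors' code: each task gets its own layer, the hidden states are split into subspaces, and a trainable alpha matrix re-mixes those subspaces across the two task networks, so gradient descent decides how much, and what, to share. All names (SluiceLayer, alpha, the dimensions) are assumptions for illustration; the full model in the paper additionally learns mixture weights over layer outputs (skip connections), which this sketch omits.

          import torch
          import torch.nn as nn

          class SluiceLayer(nn.Module):
              # One pair of task-specific layers whose outputs are split into
              # subspaces and re-mixed by a trainable alpha matrix (soft sharing).
              def __init__(self, in_dim, hidden, n_subspaces=2):
                  super().__init__()
                  assert hidden % n_subspaces == 0
                  self.n = n_subspaces
                  self.layer_a = nn.Linear(in_dim, hidden)  # task A
                  self.layer_b = nn.Linear(in_dim, hidden)  # task B
                  k = 2 * n_subspaces
                  # Identity init = no sharing at the start; training moves alpha
                  # toward whatever degree of cross-task sharing reduces the loss.
                  self.alpha = nn.Parameter(torch.eye(k))

              def forward(self, xa, xb):
                  ha = torch.tanh(self.layer_a(xa))
                  hb = torch.tanh(self.layer_b(xb))
                  # Stack subspaces: (batch, 2 * n_subspaces, hidden / n_subspaces)
                  subs = torch.stack(ha.chunk(self.n, -1) + hb.chunk(self.n, -1), dim=1)
                  # Linear re-combination of all subspaces, weighted by alpha.
                  mixed = torch.einsum('ij,bjd->bid', self.alpha, subs)
                  out_a = mixed[:, :self.n].reshape(xa.size(0), -1)  # back to task A
                  out_b = mixed[:, self.n:].reshape(xb.size(0), -1)  # back to task B
                  return out_a, out_b

          layer = SluiceLayer(in_dim=50, hidden=64)
          xa, xb = torch.randn(8, 50), torch.randn(8, 50)
          ya, yb = layer(xa, xb)  # each (8, 64); alpha is trained with the task losses

          Because alpha is initialized to the identity, the sketch starts as two independent networks; hard sharing and no sharing are both special points in its parameter space, which is what lets the model interpolate between them during training.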


                Author and article information

                Journal
                Published: 2017-05-23
                arXiv ID: 1705.08142
                Record ID: cbf2aba0-5771-403f-877b-0ee0f4428648
                License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/

                Custom metadata: 10 pages, 3 figures, 5 tables
                Categories: stat.ML, cs.AI, cs.CL, cs.LG, cs.NE
                Subjects: Theoretical computer science, Machine learning, Artificial intelligence
