      Is Open Access

      From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification

      Preprint

          Abstract

          We propose sparsemax, a new activation function similar to the traditional softmax, but able to output sparse probabilities. After deriving its properties, we show how its Jacobian can be efficiently computed, enabling its use in a network trained with backpropagation. Then, we propose a new smooth and convex loss function which is the sparsemax analogue of the logistic loss. We reveal an unexpected connection between this new loss and the Huber classification loss. We obtain promising empirical results in multi-label classification problems and in attention-based neural networks for natural language inference. For the latter, we achieve performance similar to that of the traditional softmax, but with a selective, more compact attention focus.
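
          The abstract does not spell out the transformation itself. As a minimal sketch, assuming the usual definition of sparsemax as the Euclidean projection of the score vector onto the probability simplex (computed with a sort and a threshold), the contrast with softmax can be illustrated as follows. The function names and the example scores are illustrative only and are not taken from this page.

```python
import math


def softmax(z):
    """Standard softmax; every output is strictly positive."""
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def sparsemax(z):
    """Sketch of sparsemax, assumed here to be the Euclidean projection
    of z onto the probability simplex (sort-and-threshold form)."""
    z_sorted = sorted(z, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, v in enumerate(z_sorted, start=1):
        cumsum += v
        # Coordinate k stays in the support while 1 + k * z_(k) > sum_{j<=k} z_(j).
        # The condition holds for a prefix of the sorted scores, so the last
        # k that satisfies it determines the threshold tau.
        if 1.0 + k * v > cumsum:
            tau = (cumsum - 1.0) / k
    # Scores at or below the threshold are clipped to exactly zero.
    return [max(v - tau, 0.0) for v in z]


scores = [1.0, 0.8, 0.1]  # hypothetical scores
print(softmax(scores))    # all three entries are nonzero
print(sparsemax(scores))  # the smallest score receives exactly zero
```

          In this sketch both outputs sum to one, but only sparsemax can assign exactly zero probability to low-scoring entries, which is the "sparse probabilities" property the abstract refers to.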


                Author and article information

                Journal
                2016-02-05
                2016-02-08
                Article
                1602.02068

                http://arxiv.org/licenses/nonexclusive-distrib/1.0/

                History
                Custom metadata
                Minor corrections
                cs.CL cs.LG stat.ML

                Theoretical computer science, Machine learning, Artificial intelligence
