From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification

Preprint

Author(s): André F. T. Martins, Ramón Fernandez Astudillo

Publication date Created: 2016-02-05

Abstract

We propose sparsemax, a new activation function similar to the traditional softmax, but able to output sparse probabilities. After deriving its properties, we show how its Jacobian can be efficiently computed, enabling its use in a network trained with backpropagation. Then, we propose a new smooth and convex loss function which is the sparsemax analogue of the logistic loss. We reveal an unexpected connection between this new loss and the Huber classification loss. We obtain promising empirical results in multi-label classification problems and in attention-based neural networks for natural language inference. For the latter, we achieve performance similar to that of the traditional softmax, but with a selective, more compact attention focus.
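
The contrast the abstract draws between softmax and sparsemax can be illustrated with a minimal sketch. It assumes the usual definition of sparsemax as the Euclidean projection of the score vector onto the probability simplex, computed with a sort-and-threshold procedure; the abstract itself does not spell out the algorithm, and the function names and example scores below are illustrative, not the authors' code.

```python
import math


def softmax(z):
    """Standard softmax; every output is strictly positive."""
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def sparsemax(z):
    """Sketch of sparsemax as a projection onto the probability simplex.

    Assumption: a threshold tau is found by sorting the scores, and the
    output is max(z_i - tau, 0) for each coordinate.
    """
    z_sorted = sorted(z, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, v in enumerate(z_sorted, start=1):
        cumsum += v
        # Coordinate k stays in the support while 1 + k * z_(k) > sum_{j<=k} z_(j).
        # The condition always holds for k = 1, so tau is always set.
        if 1.0 + k * v > cumsum:
            tau = (cumsum - 1.0) / k
    return [max(v - tau, 0.0) for v in z]


scores = [2.0, 1.0, 0.1]  # hypothetical scores for illustration
dense = softmax(scores)
sparse = sparsemax(scores)
```

With these scores, softmax assigns nonzero probability to every entry, while sparsemax places all mass on the first entry and returns exact zeros elsewhere. Both outputs sum to one.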

Author and article information

Publication date Updated: 2016-02-08

Article

arXiv ID: 1602.02068

SO-VID: 1416ca60-e231-4a38-93cc-e07ca3e3a896

License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/

Custom metadata

Comments: Minor corrections

Categories: cs.CL, cs.LG, stat.ML

ScienceOpen disciplines: Theoretical computer science, Machine learning, Artificial intelligence
