      Non-Monotonic Sequential Text Generation

      Preprint
Sean Welleck, Kianté Brantley, Hal Daumé III, Kyunghyun Cho


          Abstract

          Standard sequential generation methods assume a pre-specified generation order, such as text generation methods which generate words from left to right. In this work, we propose a framework for training models of text generation that operate in non-monotonic orders; the model directly learns good orders, without any additional annotation. Our framework operates by generating a word at an arbitrary position, and then recursively generating words to its left and then words to its right, yielding a binary tree. Learning is framed as imitation learning, including a coaching method which moves from imitating an oracle to reinforcing the policy's own preferences. Experimental results demonstrate that using the proposed method, it is possible to learn policies which generate text without pre-specifying a generation order, while achieving competitive performance with conventional left-to-right generation.
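To make the generation order concrete, below is a minimal Python sketch of the binary-tree decoding the abstract describes. Everything in it (the Node class, the generate and in_order helpers, the <end>/<left>/<right> markers, and the scripted policy) is an illustrative assumption rather than the authors' implementation: a policy emits a word for the current slot or a special end symbol, the procedure recurses on the subsequence to the left and then the one to the right, and an in-order traversal of the finished tree yields the final sentence.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

END = "<end>"  # special token meaning "stop expanding this subtree"

@dataclass
class Node:
    token: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def generate(policy: Callable[[List[str]], str], context: List[str]) -> Optional[Node]:
    # Ask the policy for a word at the current slot; <end> closes the slot.
    token = policy(context)
    if token == END:
        return None
    node = Node(token)
    # Recurse: first everything to the left of this word, then the right.
    node.left = generate(policy, context + [token, "<left>"])
    node.right = generate(policy, context + [token, "<right>"])
    return node

def in_order(node: Optional[Node]) -> List[str]:
    # Left subtree, then the node's own word, then the right subtree.
    if node is None:
        return []
    return in_order(node.left) + [node.token] + in_order(node.right)

# Deterministic stand-in "policy" that produces "a b c d" middle-out: it
# emits "c" first, fills in the left side ("b", then "a"), then the right.
script = iter(["c", "b", "a", END, END, END, "d", END, END])
tree = generate(lambda _ctx: next(script), [])
print(" ".join(in_order(tree)))  # -> a b c d
```

In the paper the policy is a learned neural model trained by imitation learning; the hard-coded script here only illustrates how a middle-out generation order still reconstructs the left-to-right sentence.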


Author and article information

Published: 05 February 2019 (preprint)
arXiv: 1902.02192
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Subject classes: cs.CL, cs.LG, stat.ML
Keywords: Theoretical computer science, Machine learning, Artificial intelligence
