A novel text representation which enables image classifiers to perform text classification, applied to name disambiguation


Abstract

Patent data are often used to study the process of innovation and research, but patent databases lack unique identifiers for individual inventors, making it difficult to study innovation processes at the individual level. Here we introduce an algorithm that performs highly accurate disambiguation of inventors (named entities) in US patent data (F1: 99.09%, precision: 99.41%, recall: 98.76%). The algorithm involves a novel method for converting text-based record data into abstract image representations, in which text from a given pairwise comparison between two inventor name records is converted into a 2D RGB (stacked) image representation. We train an image classification neural network to discriminate between such pairwise comparison images, and then use the trained network to label each pair of records as either matched (same inventor) or non-matched (different inventors). The resulting disambiguation algorithm produces highly accurate results, out-performing other inventor name disambiguation studies on US patent data. Our new text-to-image representation method could potentially be used more broadly for other NLP comparison problems, as it allows image-based processing techniques (e.g. image classification networks) to be applied to text-based comparison problems (such as disambiguation of academic publications, or data linkage problems).
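The abstract does not spell out how a pair of records is laid out as an image, so the following is only a minimal sketch of the general idea, not the authors' encoding. It assumes a hypothetical scheme in which record A fills the red channel, record B fills the green channel, and the blue channel marks positions where the two records agree. The function names `text_to_plane` and `pair_to_rgb`, the 8x8 grid size, and the byte-value encoding are all illustrative assumptions.

```python
from typing import List

# An image is a nested list: height x width x 3 (RGB), each value in 0..255.
Image = List[List[List[int]]]


def text_to_plane(text: str, height: int, width: int) -> List[List[int]]:
    """Lay a string out row by row on a height x width grid of byte values.

    Hypothetical encoding: UTF-8 bytes, truncated or zero-padded to fit.
    """
    size = height * width
    data = text.encode("utf-8")[:size]
    padded = data + bytes(size - len(data))  # zero bytes act as background
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


def pair_to_rgb(record_a: str, record_b: str,
                height: int = 8, width: int = 8) -> Image:
    """Stack two records into one RGB image for a pairwise comparison.

    Assumed channel layout (not taken from the paper):
      R = bytes of record A, G = bytes of record B,
      B = 255 where both records hold the same non-padding byte, else 0.
    """
    plane_a = text_to_plane(record_a, height, width)
    plane_b = text_to_plane(record_b, height, width)
    image: Image = []
    for row_a, row_b in zip(plane_a, plane_b):
        image.append([
            [a, b, 255 if (a == b and a != 0) else 0]
            for a, b in zip(row_a, row_b)
        ])
    return image


if __name__ == "__main__":
    img = pair_to_rgb("SMITH JOHN", "SMITH JON")
    print(len(img), len(img[0]), len(img[0][0]))  # grid dimensions
    print(img[0][0])  # first pixel: both records start with "S"
```

An image built this way could then be passed to any standard image classifier trained to label pairs as matched or non-matched, which is the role the abstract assigns to the classification network.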


Author and article information

Publication date (created): 19 August 2019

arXiv ID: 1908.07846

SO-VID: 4bfb121a-bdda-4ce8-954d-2946ca2d87b6

License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/

Categories: cs.CL, cs.AI, cs.CV, cs.LG

ScienceOpen disciplines: Computer vision & Pattern recognition, Theoretical computer science, Artificial intelligence
