White-box Testing of NLP models with Mask Neuron Coverage

Abstract

Recent literature has seen growing interest in using black-box strategies like CheckList for testing the behavior of NLP models. Research on white-box testing has developed a number of methods for evaluating how thoroughly the internal behavior of deep models is tested, but they are not applicable to NLP models. We propose a set of white-box testing methods that are customized for transformer-based NLP models. These include Mask Neuron Coverage (MNCOVER), which measures how thoroughly the attention layers in models are exercised during testing. We show that MNCOVER can refine test suites generated by CheckList by substantially reducing their size, by more than 60% on average, while retaining failing tests, thereby concentrating the fault detection power of the test suite. Further, we show how MNCOVER can be used to guide CheckList input generation, evaluate alternative NLP testing methods, and drive data augmentation to improve accuracy.
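The general idea of coverage-based test-suite reduction can be illustrated with a minimal sketch. This is not the authors' MNCOVER implementation, whose details are not given in the abstract. It assumes each test has already been mapped to the set of internal units it activates, and the function name `coverage_filter` and the toy data are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set


def coverage_filter(activations: Dict[str, FrozenSet[int]]) -> List[str]:
    """Greedy filter: keep a test only if it activates a unit not yet covered.

    `activations` maps a test id to the set of (hypothetical) unit indices
    that the test activates. Tests are visited in insertion order.
    """
    covered: Set[int] = set()
    kept: List[str] = []
    for test_id, units in activations.items():
        if units - covered:  # the test adds new coverage
            kept.append(test_id)
            covered |= units
    return kept


# Toy suite (assumed data): four tests over three units.
suite = {
    "t1": frozenset({0, 1}),
    "t2": frozenset({1}),
    "t3": frozenset({2}),
    "t4": frozenset({0, 2}),
}
kept = coverage_filter(suite)
print(kept)
```

In this toy suite the filter keeps two of the four tests while the set of covered units stays the same. A real coverage metric for transformers would need its own definition of what counts as an activated unit in the attention layers.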

Author and article information

Journal

Publication date Created: 10 May 2022

Article

arXiv ID: 2205.05050

SO-VID: 24ebfa3d-b133-430d-a226-f5bc3a427746

License:

http://creativecommons.org/licenses/by/4.0/

History

Custom metadata

Journal reference Findings of NAACL 2022

Comments Findings of NAACL 2022 submission, 12 pages

Categories cs.CL cs.LG

ScienceOpen disciplines: Theoretical computer science, Artificial intelligence

