Abstract
Narrative reports in medical records contain a wealth of information that may augment
structured data for managing patient information and predicting trends in diseases.
Pertinent negatives are evident in text but are not usually indexed in structured
databases. The objective of the study reported here was to test a simple algorithm
for determining whether a finding or disease mentioned within narrative medical reports
is present or absent. We developed a simple regular expression algorithm called NegEx
that implements several phrases indicating negation, filters out sentences containing
phrases that falsely appear to be negation phrases, and limits the scope of the negation
phrases. We compared NegEx against a baseline algorithm that has a limited set of
negation phrases and a simpler notion of scope. In a test of 1235 findings and diseases
in 1000 sentences taken from discharge summaries indexed by physicians, NegEx had
a specificity of 94.5% (versus 85.3% for the baseline) and a positive predictive value
of 84.5% (versus 68.4% for the baseline), while maintaining a reasonable sensitivity
of 77.8% (versus 88.3% for the baseline). We conclude that, with little implementation
effort, a simple regular expression algorithm for determining whether a finding or
disease is absent can identify a large portion of the pertinent negatives from discharge
summaries.
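
The three steps named in the abstract (negation phrases, filtering of phrases that only look like negations, and a limited scope) can be illustrated with a minimal Python sketch. This is not the published NegEx implementation: the phrase lists, the five-token scope window, the choice to mask pseudo-negation phrases rather than discard the whole sentence, and the function name `is_negated` are all assumptions made for illustration.

```python
import re

# Illustrative phrase lists (assumptions, not the published NegEx lexicon).
NEGATION_PHRASES = ["no", "not", "denies", "without", "no evidence of"]
PSEUDO_NEGATION_PHRASES = ["no increase", "not certain if"]
SCOPE_WINDOW = 5  # assumed maximum number of tokens between trigger and finding


def _phrase_pattern(phrase):
    # Escape each word and allow any whitespace between the words.
    return r"\s+".join(re.escape(word) for word in phrase.lower().split())


def _alternation(phrases):
    # Longest phrases first, so "no evidence of" is tried before "no".
    ordered = sorted(phrases, key=len, reverse=True)
    return "|".join(_phrase_pattern(p) for p in ordered)


PSEUDO_RE = re.compile(rf"\b(?:{_alternation(PSEUDO_NEGATION_PHRASES)})\b")
TRIGGERS = _alternation(NEGATION_PHRASES)


def is_negated(sentence, finding):
    """Return True if `finding` follows a negation phrase within the scope window."""
    text = sentence.lower()
    # Step 1: mask pseudo-negations so they cannot act as triggers.
    text = PSEUDO_RE.sub(lambda m: "_" * len(m.group(0)), text)
    # Step 2: a trigger, then at most SCOPE_WINDOW other tokens, then the finding.
    pattern = re.compile(
        rf"\b(?:{TRIGGERS})\b(?:\W+\w+){{0,{SCOPE_WINDOW}}}?\W+"
        rf"{_phrase_pattern(finding)}\b"
    )
    return pattern.search(text) is not None


print(is_negated("The patient denies chest pain.", "chest pain"))     # True
print(is_negated("There was no increase in edema.", "edema"))         # False
print(is_negated("Not certain if pneumonia is present.", "pneumonia"))  # False
```

In this sketch the scope only extends forward from the negation phrase, so a finding mentioned before the trigger, or more than five tokens after it, is treated as present.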