Interrater reliability: the kappa statistic

Abstract

The kappa statistic is frequently used to test interrater reliability. Rater reliability matters because it represents the extent to which the data collected in a study are correct representations of the variables measured. Interrater reliability is the extent to which data collectors (raters) assign the same score to the same variable. Although a variety of methods have been used to measure it, interrater reliability was traditionally measured as percent agreement, calculated as the number of agreement scores divided by the total number of scores. In 1960, Jacob Cohen critiqued the use of percent agreement because it cannot account for chance agreement. He introduced Cohen’s kappa, developed to account for the possibility that raters actually guess on at least some variables due to uncertainty. Like most correlation statistics, kappa can range from −1 to +1. Although kappa is one of the most commonly used statistics for testing interrater reliability, it has limitations. Judgments about what level of kappa should be acceptable for health research are questioned. Cohen’s suggested interpretation may be too lenient for health-related studies because it implies that a score as low as 0.41 might be acceptable. Kappa and percent agreement are compared, and levels for both kappa and percent agreement that should be demanded in healthcare studies are suggested.
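
The two measures contrasted in the abstract can be illustrated with a minimal Python sketch. Percent agreement follows the abstract's description directly. The kappa formula used here, (p_o - p_e) / (1 - p_e), with chance agreement p_e estimated from each rater's marginal category proportions, is the standard two-rater definition and is an assumption in the sense that the abstract does not spell it out. The function names and the ratings are hypothetical and are not data from the article.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Number of agreement scores divided by the total number of scores."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    agreements = sum(a == b for a, b in zip(rater_a, rater_b))
    return agreements / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement; p_e is the agreement expected by chance,
    estimated from the product of the two raters' marginal proportions.
    """
    p_o = percent_agreement(rater_a, rater_b)
    n = len(rater_a)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Sum over categories of P(rater A picks c) * P(rater B picks c).
    p_e = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings of ten items by two raters (illustration only).
rater_a = ["yes", "yes", "no", "no", "yes", "no", "yes", "yes", "no", "yes"]
rater_b = ["yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "yes"]

print(f"percent agreement: {percent_agreement(rater_a, rater_b):.2f}")  # 0.80
print(f"Cohen's kappa:     {cohens_kappa(rater_a, rater_b):.2f}")       # 0.58
```

In this made-up example the raters agree on 8 of 10 items, but because each rater says "yes" 60% of the time, about 52% agreement would be expected by chance alone, so kappa (0.58) is noticeably lower than percent agreement (0.80).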

Author and article information

Journal

Journal ID (nlm-ta): Biochem Med (Zagreb)

Journal ID (iso-abbrev): Biochem Med (Zagreb)

Journal ID (publisher-id): Biochemia Medica

Title: Biochemia Medica

Publisher: Croatian Society of Medical Biochemistry and Laboratory Medicine

ISSN (Print): 1330-0962

ISSN (Electronic): 1846-7482

Publication date (Electronic): 15 October 2012

Publication date Collection: October 2012

Volume: 22

Issue: 3

Pages: 276-282

Affiliations

Department of Nursing, National University, Aero Court, San Diego, California

Author notes

Corresponding author: mchugh8688@gmail.com

Article

Publisher ID: biochem_med-22-3-276-4

DOI: 10.11613/BM.2012.031

PMC ID: 3900052

PubMed ID: 23092060

SO-VID: 36745228-f2c4-48aa-9004-c5fff2f07c91

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
