A Reliability-Generalization Study of Journal Peer Reviews: A Multilevel Meta-Analysis of Inter-Rater Reliability and Its Determinants

Lutz Bornmann 1,*, Rüdiger Mutz 2, Hans-Dieter Daniel 2,3

PLoS ONE

Public Library of Science


      Abstract

Background: This paper presents the first meta-analysis of the inter-rater reliability (IRR) of journal peer reviews. IRR is defined as the extent to which two or more independent reviews of the same scientific document agree.

Methodology/Principal Findings: Altogether, 70 reliability coefficients (Cohen's Kappa, intra-class correlation [ICC], and Pearson product-moment correlation [r]) from 48 studies were taken into account in the meta-analysis. The studies were based on a total of 19,443 manuscripts; on average, each study had a sample size of 311 manuscripts (minimum: 28, maximum: 1,983). The results of the meta-analysis confirmed the findings of the narrative literature reviews published to date: the level of IRR (mean ICC/r² = .34, mean Cohen's Kappa = .17) was low. To explain the study-to-study variation of the IRR coefficients, meta-regression analyses were calculated using seven covariates. Two covariates emerged as statistically significant in the meta-regression analyses needed to reach approximate homogeneity of the intra-class correlations: first, the more manuscripts a study is based on, the smaller the reported IRR coefficients are; second, studies that reported the rating system used by reviewers were associated with smaller IRR coefficients than studies that did not report it.

Conclusions/Significance: Studies that report a high level of IRR should therefore be considered less credible than those that report a low level of IRR. According to our meta-analysis, the IRR of peer assessments is quite limited and needs improvement (e.g., a reader system).
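For readers unfamiliar with the coefficients summarized above, the sketch below shows one common way to compute a one-way random-effects intra-class correlation, ICC(1), from a manuscripts-by-reviewers matrix of scores. The function name and the example data are purely illustrative and are not taken from the paper or from the primary studies it pools.

```python
import numpy as np

def icc_one_way(ratings):
    """One-way random-effects ICC(1): the share of score variance attributable
    to differences between manuscripts rather than between reviewers.

    ratings: (n_manuscripts, n_reviewers) array of scores.
    Illustrative sketch only, not the estimation procedure used in the paper.
    """
    ratings = np.asarray(ratings, dtype=float)
    n, k = ratings.shape
    grand_mean = ratings.mean()
    row_means = ratings.mean(axis=1)
    # Mean squares between manuscripts and within manuscripts (i.e., between reviewers)
    ms_between = k * ((row_means - grand_mean) ** 2).sum() / (n - 1)
    ms_within = ((ratings - row_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical example: 5 manuscripts, each scored by 2 reviewers on a 1-6 scale
print(icc_one_way([[2, 3], [5, 4], [1, 2], [6, 5], [3, 5]]))  # ~0.73
```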


      Most cited references (90)


      Meta-analysis in clinical trials.

      This paper examines eight published reviews each reporting results from several related trials. Each review pools the results from the relevant trials in order to evaluate the efficacy of a certain treatment for a specified medical condition. These reviews lack consistent assessment of homogeneity of treatment effect before pooling. We discuss a random effects approach to combining evidence from a series of experiments comparing two treatments. This approach incorporates the heterogeneity of effects in the analysis of the overall treatment efficacy. The model can be extended to include relevant covariates which would reduce the heterogeneity and allow for more specific therapeutic recommendations. We suggest a simple noniterative procedure for characterizing the distribution of treatment effects in a series of studies.
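The noniterative random-effects procedure this abstract refers to can be illustrated with a short sketch. The following is an assumed, simplified implementation that returns the pooled effect, its standard error, and the between-study variance τ²; all names and inputs are invented for illustration.

```python
import numpy as np

def dersimonian_laird(effects, variances):
    """Noniterative random-effects pooling of study-level effects.

    effects: per-study effect estimates (e.g., Fisher z-transformed correlations)
    variances: their within-study sampling variances
    Returns the pooled effect, its standard error, and the between-study
    variance tau^2. Assumed, simplified sketch of the approach described above.
    """
    y = np.asarray(effects, dtype=float)
    v = np.asarray(variances, dtype=float)
    w = 1.0 / v                                # fixed-effect (inverse-variance) weights
    y_fixed = (w * y).sum() / w.sum()
    q = (w * (y - y_fixed) ** 2).sum()         # Cochran's Q, heterogeneity statistic
    k = len(y)
    c = w.sum() - (w ** 2).sum() / w.sum()
    tau2 = max(0.0, (q - (k - 1)) / c)         # between-study variance estimate
    w_star = 1.0 / (v + tau2)                  # random-effects weights
    pooled = (w_star * y).sum() / w_star.sum()
    se = np.sqrt(1.0 / w_star.sum())
    return pooled, se, tau2

# Hypothetical example: three studies with effects 0.30, 0.45, 0.20
print(dersimonian_laird([0.30, 0.45, 0.20], [0.010, 0.020, 0.015]))
```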

        The measurement of observer agreement for categorical data.

         J. R. Landis, G. G. Koch (1977)
        This paper presents a general statistical methodology for the analysis of multivariate categorical data arising from observer reliability studies. The procedure essentially involves the construction of functions of the observed proportions which are directed at the extent to which the observers agree among themselves and the construction of test statistics for hypotheses involving these functions. Tests for interobserver bias are presented in terms of first-order marginal homogeneity and measures of interobserver agreement are developed as generalized kappa-type statistics. These procedures are illustrated with a clinical diagnosis example from the epidemiological literature.
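A minimal illustration of a kappa-type agreement statistic of the kind generalized in this paper: the sketch below computes plain Cohen's kappa for two raters' categorical judgements. The function and the example recommendations are hypothetical, and the paper's procedures cover multiple raters and hypothesis tests that this sketch does not.

```python
import numpy as np

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters'
    categorical judgements (e.g., accept / revise / reject recommendations)."""
    a, b = np.asarray(rater_a), np.asarray(rater_b)
    categories = np.union1d(a, b)
    p_observed = (a == b).mean()                       # raw agreement rate
    # Expected agreement under independence, from the raters' marginal rates
    p_chance = sum((a == c).mean() * (b == c).mean() for c in categories)
    return (p_observed - p_chance) / (1.0 - p_chance)

# Hypothetical example: two reviewers judging six manuscripts
print(cohens_kappa(
    ["accept", "reject", "revise", "reject", "accept", "revise"],
    ["accept", "revise", "revise", "reject", "reject", "revise"],
))  # 0.5
```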

            Author and article information

            Affiliations
            [1 ]Max Planck Society, Munich, Germany
            [2 ]Professorship for Social Psychology and Research on Higher Education, ETH Zurich, Zurich, Switzerland
            [3 ]Evaluation Office, University of Zurich, Zurich, Switzerland
            Author notes

            Conceived and designed the experiments: LB. Performed the experiments: RM. Analyzed the data: RM. Wrote the paper: LB HDD.

            Contributors
            Role: Editor (University of Glasgow, United Kingdom)

            Journal
            Journal: PLoS ONE
            Publisher: Public Library of Science (San Francisco, USA)
            ISSN: 1932-6203
            Published: 14 December 2010
            Volume: 5, Issue: 12
            PMCID: 3001856
            PMID: 21179459
            Manuscript ID: 10-PONE-RA-17982R1
            DOI: 10.1371/journal.pone.0014331
            Copyright: Bornmann et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
            Counts
            Pages: 10
            Categories
            Research Article
            Science Policy
            Mathematics/Statistics
            Science Policy/Education

