
      How to assess intra- and inter-observer agreement with quantitative PET using variance component analysis: a proposal for standardisation




          Quantitative measurement procedures need to be accurate and precise to justify their clinical use. Precision reflects how much repeated measurements deviate from one another and is often expressed as proportions of agreement, standard errors of measurement, coefficients of variation, or the Bland-Altman plot. We suggest variance component analysis (VCA) to estimate the influence of errors due to single elements of a PET scan (scanner, time point, observer, etc.), to express the composite uncertainty of repeated measurements, and to obtain relevant repeatability coefficients (RCs), which have a unique relation to Bland-Altman plots. Here, we present this approach for the assessment of intra- and inter-observer variation with PET/CT, exemplified with data from two clinical studies.
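The relation between an RC and the Bland-Altman limits of agreement can be illustrated with a minimal sketch. It assumes the common definition RC = 1.96 × √2 × within-subject SD for two readings per subject, estimates the within-subject variance directly from the paired differences, and uses made-up readings. The function names and the data are hypothetical. The article itself estimates variance components with linear mixed effects models, which this sketch does not reproduce.

```python
import math
import statistics

def bland_altman_limits(first, second):
    """Return (bias, lower, upper) 95% limits of agreement for paired readings."""
    diffs = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

def repeatability_coefficient(first, second):
    """RC = 1.96 * sqrt(2) * within-subject SD for two readings per subject.

    The within-subject variance is estimated as sum(d^2) / (2n), which
    assumes no systematic difference between the two readings.
    """
    diffs = [a - b for a, b in zip(first, second)]
    within_var = sum(d * d for d in diffs) / (2 * len(diffs))
    return 1.96 * math.sqrt(2 * within_var)

# Made-up SUVmax-like readings for illustration, NOT data from the study.
reading_1 = [4.1, 7.8, 12.3, 5.6, 9.9, 15.2]
reading_2 = [4.4, 7.1, 12.9, 5.2, 10.8, 14.5]

bias, lower, upper = bland_altman_limits(reading_1, reading_2)
rc = repeatability_coefficient(reading_1, reading_2)
half_width = (upper - lower) / 2
print(f"bias={bias:.3f}, LoA=({lower:.3f}, {upper:.3f})")
print(f"RC={rc:.3f}, half-width of LoA={half_width:.3f}")
```

With a bias close to zero, the two quantities differ only through the divisor used for the variance (n versus n − 1), so they approach each other as the number of subjects grows.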


          In study 1, 30 patients were scanned pre-operatively for the assessment of ovarian cancer, and their scans were assessed twice by the same observer to study intra-observer agreement. In study 2, 14 patients with glioma were scanned up to five times. The resulting 49 scans were assessed by three observers to examine inter-observer agreement. Outcome variables were SUVmax in study 1 and cerebral total hemispheric glycolysis (THG) in study 2.


          In study 1, we found an RC of 2.46, equalling half the width of the Bland-Altman limits of agreement. In study 2, the RC for identical conditions (same scanner, patient, time point, and observer) was 2392; allowing for different scanners increased the RC to 2543. Inter-observer differences were negligible compared to differences owing to other factors: the estimated difference between observers 1 and 2 was −10 (95 % CI: −352 to 332), and that between observers 1 and 3 was 28 (95 % CI: −313 to 370).
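As a rough reading of the two study 2 RCs, the following sketch back-calculates the variances they imply. It assumes RC = 1.96 × √(2 × sum of the variance components that may differ between two measurements) and that the scanner component simply adds to the residual variance. Both are assumptions made for illustration, and the helper names are hypothetical. The figures it prints are implied by those assumptions, not results reported in the article.

```python
import math

Z = 1.96  # two-sided 95% normal quantile

def rc_from_variance(variance_sum):
    """RC for the difference of two measurements whose differing
    sources of error have total variance `variance_sum`."""
    return Z * math.sqrt(2 * variance_sum)

def variance_from_rc(rc):
    """Invert rc_from_variance."""
    return (rc / (Z * math.sqrt(2))) ** 2

# RCs for THG quoted in the results for study 2.
rc_same_conditions = 2392
rc_different_scanners = 2543

var_residual = variance_from_rc(rc_same_conditions)
var_total = variance_from_rc(rc_different_scanners)
# Implied scanner component, valid only under the additivity assumption.
var_scanner = var_total - var_residual

print(f"implied residual variance: {var_residual:.0f}")
print(f"implied scanner variance:  {var_scanner:.0f}")
print(f"scanner share of total:    {var_scanner / var_total:.1%}")
```

Under these assumptions the scanner component accounts for a small share of the total variance, which is consistent with the modest increase in RC when different scanners are allowed.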


          VCA is an appealing approach for weighing different sources of variation against each other, summarised as RCs. The linear mixed effects models involved require carefully considered sample sizes, because variance components are difficult to estimate with sufficient accuracy.
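Why sample size matters for variance estimates can be seen in a small simulation. This is a deliberately simplified sketch: it looks at a single standard deviation estimated from independent normal draws, not at the variance components of a mixed model. The function name, the true SD of 1.0, the number of simulations, and the seed are all arbitrary choices made for illustration.

```python
import random
import statistics

def sd_estimate_spread(n_subjects, true_sd=1.0, n_sim=2000, seed=1):
    """Simulate how much the sample SD varies across studies of size n_subjects.

    Returns the empirical central 95% range of the estimated SD.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_sim):
        sample = [rng.gauss(0.0, true_sd) for _ in range(n_subjects)]
        estimates.append(statistics.stdev(sample))
    estimates.sort()
    return estimates[int(0.025 * n_sim)], estimates[int(0.975 * n_sim)]

# 14 and 30 match the patient counts of the two studies; 100 is for contrast.
for n in (14, 30, 100):
    low, high = sd_estimate_spread(n)
    print(f"n={n:3d}: 95% of SD estimates fall in ({low:.2f}, {high:.2f})")
```

The range narrows as the number of subjects increases. Because an RC is proportional to an estimated SD, its uncertainty shrinks in the same way.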

          Electronic supplementary material

          The online version of this article (doi:10.1186/s12880-016-0159-3) contains supplementary material, which is available to authorized users.


                Author and article information

                BMC Medical Imaging (BMC Med Imaging)
                BioMed Central (London)
                21 September 2016
                Volume: 16
                [1] Department of Nuclear Medicine, Odense University Hospital, Sdr. Boulevard 29, 5000 Odense C, Denmark
                [2] Centre of Health Economics Research, University of Southern Denmark, Campusvej 55, 5230 Odense M, Denmark
                [3] Epidemiology, Biostatistics and Biodemography, University of Southern Denmark, J. B. Winsløws Vej 9b, 5000 Odense C, Denmark
                [4] Department of Clinical Research, University of Southern Denmark, Winsløwparken 19, 5000 Odense C, Denmark
                © The Author(s). 2016

                Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

                Research Article

