
      pROC: an open-source package for R and S+ to analyze and compare ROC curves

      product-review


          Abstract

          Background

          Receiver operating characteristic (ROC) curves are useful tools to evaluate classifiers in biomedical and bioinformatics applications. However, conclusions are often reached through inconsistent use or insufficient statistical analysis. To support researchers in their ROC curve analysis, we developed pROC, a package for R and S+ that provides a set of tools for displaying, analyzing, smoothing and comparing ROC curves through a user-friendly, object-oriented and flexible interface.

          Results

          With data previously imported into the R or S+ environment, the pROC package builds ROC curves and includes functions for computing confidence intervals, statistical tests for comparing total or partial area under the curve or the operating points of different classifiers, and methods for smoothing ROC curves. Intermediary and final results are visualised in user-friendly interfaces. A case study based on published clinical and biomarker data shows how to perform a typical ROC analysis with pROC.
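
The quantities named above (an empirical ROC curve, its total area, and a partial area) can be illustrated with a minimal sketch. It is written in plain Python purely for illustration and is not pROC's R interface or implementation; the function names `roc_points`, `auc` and `partial_auc` and the toy data are assumptions made for this example. The partial area here is taken over a false positive rate range, which corresponds to a specificity range from 1 down to `1 - max_fpr`.

```python
def roc_points(labels, scores):
    """Return (fpr, tpr) lists, one point per threshold, strictest first."""
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    fpr, tpr = [0.0], [0.0]
    # A case is called positive when its score is >= the threshold.
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
        tpr.append(tp / pos)
        fpr.append(fp / neg)
    return fpr, tpr


def auc(fpr, tpr):
    """Total area under the curve by the trapezoidal rule."""
    return sum((fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
               for i in range(1, len(fpr)))


def partial_auc(fpr, tpr, max_fpr):
    """Area under the curve restricted to FPR in [0, max_fpr]."""
    area = 0.0
    for i in range(1, len(fpr)):
        x0, x1, y0, y1 = fpr[i - 1], fpr[i], tpr[i - 1], tpr[i]
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Linearly interpolate the segment that crosses the boundary.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data (hypothetical): 1 = diseased, 0 = healthy.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.3, 0.4, 0.6, 0.35, 0.5, 0.7, 0.9]

fpr, tpr = roc_points(labels, scores)
area = auc(fpr, tpr)
early = partial_auc(fpr, tpr, 0.5)
print(area)   # 0.8125
print(early)  # 0.3125
```

The total area equals the fraction of (diseased, healthy) pairs in which the diseased case has the higher score, here 13 of 16. The partial area focuses on the high-specificity region, which is often the clinically relevant part of the curve.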

          Conclusions

          pROC is a package for R and S+ specifically dedicated to ROC analysis. It proposes multiple statistical tests to compare ROC curves, and in particular partial areas under the curve, allowing proper ROC interpretation. pROC is available in two versions: in the R programming language or with a graphical user interface in the S+ statistical software. It is accessible at http://expasy.org/tools/pROC/ under the GNU General Public License. It is also distributed through the CRAN and CSAN public repositories, facilitating its installation.


          Most cited references (19)


          What's under the ROC? An introduction to receiver operating characteristics curves.

          It is often necessary to dichotomize a continuous scale to separate respondents into normal and abnormal groups. However, because the distributions of the scores in these 2 groups most often overlap, any cut point that is chosen will result in 2 types of errors: false negatives (that is, abnormal cases judged to be normal) and false positives (that is, normal cases placed in the abnormal group). Changing the cut point will alter the numbers of erroneous judgments but will not eliminate the problem. A technique called receiver operating characteristic (ROC) curves allows us to determine the ability of a test to discriminate between groups, to choose the optimal cut point, and to compare the performance of 2 or more tests. We discuss how to calculate and compare ROC curves and the factors that must be considered in choosing an optimal cut point.
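
The trade-off described above can be made concrete with a small sketch. It counts both error types at each candidate cut point and picks the cut that maximises Youden's J (sensitivity + specificity - 1). Youden's J is only one common criterion and is an assumption of this example, not necessarily the one the cited article recommends; the function names and toy scores are also hypothetical.

```python
def confusion_at(labels, scores, cut):
    """Count (tp, fn, fp, tn) when scores >= cut are judged abnormal."""
    tp = fn = fp = tn = 0
    for y, s in zip(labels, scores):
        judged_abnormal = s >= cut
        if y == 1 and judged_abnormal:
            tp += 1
        elif y == 1:
            fn += 1  # abnormal case judged to be normal
        elif judged_abnormal:
            fp += 1  # normal case placed in the abnormal group
        else:
            tn += 1
    return tp, fn, fp, tn


def best_cut_youden(labels, scores):
    """Return (cut, J) for the cut point with the largest Youden index."""
    best = None
    for cut in sorted(set(scores)):
        tp, fn, fp, tn = confusion_at(labels, scores, cut)
        j = tp / (tp + fn) + tn / (tn + fp) - 1
        if best is None or j > best[1]:
            best = (cut, j)
    return best


# Toy data (hypothetical): 1 = abnormal, 0 = normal; the groups overlap.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [1, 2, 3, 6, 4, 5, 7, 8]

print(confusion_at(labels, scores, 4))  # (4, 0, 1, 3)
print(best_cut_youden(labels, scores))  # (4, 0.75)
```

Because the two groups overlap, no cut removes both error types: a cut of 4 leaves one false positive, while a cut of 7 removes it at the cost of two false negatives. A criterion that weights the two errors differently would select a different cut point.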

            The Relative Operating Characteristic in Psychology: A technique for isolating effects of response bias finds wide use in the study of perception and cognition.

            J Swets (1973)
            The clinician looking, listening, or feeling for signs of a disease may far prefer a false alarm to a miss, particularly if the disease is serious and contagious. On the other hand, he may believe that the available therapy is marginally effective, expensive, and debilitating. The pilot seeing the landing lights only when they are a few yards away may decide that his plane is adequately aligned with the runway if he is alone and familiar with that plight. He may be more inclined to circle the field before another try at landing if he has many passengers and recent memory of another plane crashing under those circumstances. The Food and Drug administrator suspecting botulism in a canned food may not want to accept even a remote threat to the public health. But he may be less clearly biased if a recent false alarm has cost a canning company millions of dollars and left some damaged reputations. The making of almost any fine discrimination is beset with such considerations of probability and utility, which are extraneous and potentially confounding when one is attempting to measure the acuity of discrimination per se. The ROC is an analytical technique, with origins in statistical decision theory and electronic detection theory, that quite effectively isolates the effects of the observer's response bias, or decision criterion, in the study of discrimination behavior. This capability, pursued through a century of psychological testing, provides a relatively pure measure of the discriminability of different stimuli and of the capacity of organisms to discriminate. The ROC also treats quantitatively the response, or decision, aspects of choice behavior. The decision parameter can then be functionally related to the probabilities of the stimulus alternatives and to the utilities of the various stimulus-response pairs, or to the observer's expectations and motivations. 
In separating and quantifying discrimination and decision processes, the ROC promises a more reliable and valid solution to some practical problems and enhances our understanding of the perceptual and cognitive phenomena that depend directly on these fundamental processes. In several problem areas in psychology, effects that were supposed to reflect properties of the discrimination process have been shown by the ROC analysis to reflect instead properties of the decision process.

              Small-sample precision of ROC-related estimates.

              Receiver operator characteristic (ROC) curves are commonly used in biomedical applications to judge the performance of a discriminant across varying decision thresholds. The estimated ROC curve depends on the true positive rate (TPR) and false positive rate (FPR), with the key metric being the area under the curve (AUC). With small samples these rates need to be estimated from the training data, so a natural question arises: How well do the estimates of the AUC, TPR and FPR compare with the true metrics? Through a simulation study using data models and analysis of real microarray data, we show that (i) for small samples the root mean square differences of the estimated and true metrics are considerable; (ii) even for large samples, there is only weak correlation between the true and estimated metrics; and (iii) generally, there is weak regression of the true metric on the estimated metric. For classification rules, we consider linear discriminant analysis, linear support vector machine (SVM) and radial basis function SVM. For error estimation, we consider resubstitution, three kinds of cross-validation and bootstrap. Using resampling, we show the unreliability of some published ROC results. Companion web site: http://compbio.tgen.org/paper_supp/ROC/roc.html (contact: edward@mail.ece.tamu.edu).
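
To give a feel for how imprecise an AUC estimate from a small sample can be, the sketch below computes a percentile bootstrap interval for the AUC of an eight-case toy data set. It is an illustration only: the stratified resampling scheme, the function names, the seed and the data are assumptions of this example and do not reproduce the simulation design of the cited study.

```python
import random


def auc_rank(pos, neg):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_interval(pos, neg, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling each class separately."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    stats = []
    for _ in range(n_boot):
        boot_pos = [rng.choice(pos) for _ in pos]
        boot_neg = [rng.choice(neg) for _ in neg]
        stats.append(auc_rank(boot_pos, boot_neg))
    stats.sort()
    lo_idx = max(0, round(alpha / 2 * n_boot))
    hi_idx = min(n_boot - 1, round((1 - alpha / 2) * n_boot) - 1)
    return stats[lo_idx], stats[hi_idx]


# Toy data (hypothetical): scores of 4 positive and 4 negative cases.
pos = [0.35, 0.5, 0.7, 0.9]
neg = [0.1, 0.3, 0.4, 0.6]

point = auc_rank(pos, neg)
lo, hi = bootstrap_auc_interval(pos, neg)
print(point)  # 0.8125
print(lo, hi)
```

With so few cases the interval is typically wide, which is the practical reason to report a confidence interval alongside any AUC estimated from a small sample.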

                Author and article information

                Journal
                BMC Bioinformatics
                Publisher: BioMed Central
                ISSN: 1471-2105
                Published: 17 March 2011
                Volume: 12
                Article number: 77
                Affiliations
                [1 ]Biomedical Proteomics Research Group, Department of Structural Biology and Bioinformatics, Medical University Centre, Geneva, Switzerland
                [2 ]Swiss Institute of Bioinformatics, Medical University Centre, Geneva, Switzerland
                Article
                Publisher ID: 1471-2105-12-77
                DOI: 10.1186/1471-2105-12-77
                PMCID: PMC3068975
                PMID: 21414208
                Copyright ©2011 Robin et al; licensee BioMed Central Ltd.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 10 September 2010
                Accepted: 17 March 2011
                Categories
                Software

                Bioinformatics & Computational biology
