      Is Open Access

      A comparative study of survival models for breast cancer prognostication based on microarray data: does a single gene beat them all?




          Motivation: Survival prediction of breast cancer (BC) patients independently of treatment, also known as prognostication, is a complex task since clinically similar breast tumors, in addition to being molecularly heterogeneous, may exhibit different clinical outcomes. In recent years, the analysis of gene expression profiles by means of sophisticated data mining tools has emerged as a promising technology to bring additional insights into BC biology and to improve the quality of prognostication. The aim of this work is to assess quantitatively the accuracy of prediction obtained with state-of-the-art data analysis techniques for BC microarray data through an independent and thorough framework.

          Results: Due to the large number of variables, the small number of samples and the high degree of noise, complex prediction methods are highly exposed to performance degradation despite the use of cross-validation techniques. Our analysis shows that the most complex methods are not significantly better than the simplest one, a univariate model relying on a single proliferation gene. This result suggests that proliferation might be the most relevant biological process for BC prognostication and that the loss of interpretability deriving from the use of overly complex methods may not be sufficiently counterbalanced by an improvement in the quality of prediction.
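
          As an illustration of how a single-gene risk score could be evaluated against survival outcomes, the following is a minimal sketch in Python. It is not the authors' code and does not use the survcomp package. It assumes that accuracy is measured with a Harrell-style concordance index, a measure the abstract does not name, and the function name, variable names and toy data are all hypothetical.

```python
from typing import Sequence


def concordance_index(times: Sequence[float],
                      events: Sequence[int],
                      scores: Sequence[float]) -> float:
    """Harrell-style concordance index for right-censored data (illustrative).

    A pair (i, j) is usable when patient i has an observed event strictly
    before patient j's follow-up time. The pair is concordant when patient i
    also has the higher risk score; tied scores count as one half.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Only pairs whose ordering is known despite censoring are used.
            if events[i] == 1 and times[i] < times[j]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs in the data")
    return concordant / usable


# Hypothetical toy data: follow-up time (years), event flag (1 = event
# observed, 0 = censored), and expression of one gene used as risk score.
times = [2.0, 5.0, 3.5, 8.0, 6.0]
events = [1, 0, 1, 0, 1]
gene_expression = [2.1, 0.4, 1.7, 0.2, 0.9]

c_index = concordance_index(times, events, gene_expression)
```

          Under this measure, a value near 0.5 indicates a score no better than chance and a value near 1 indicates that higher scores consistently correspond to earlier events. The same function could be applied to the output of a multivariate model so that simple and complex predictors are compared on one scale.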

          Availability: The comparison study is implemented in an R package called survcomp and is available from http://www.ulb.ac.be/di/map/bhaibeka/software/survcomp/.

          Contact: bhaibeka@ulb.ac.be

          Supplementary information: Supplementary data are available at Bioinformatics online.



                Author and article information

                Oxford University Press
                1 October 2008
                17 July 2008
                17 July 2008
                Volume: 24
                Issue: 19
                Pages: 2200-2208
                (1) Machine Learning Group, Department of Computer Science and (2) Functional Genomics Unit, Department of Medical Oncology, Institut Jules Bordet, Université Libre de Bruxelles, Brussels, Belgium
                Author notes
                *To whom correspondence should be addressed.

                Associate Editor: Joaquin Dopazo

                © 2008 The Author(s)

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

                Received: 18 January 2008
                Revised: 30 May 2008
                Accepted: 15 June 2008
                Original Papers
                Gene Expression

                Bioinformatics & Computational biology