
      Semi-Supervised Methods to Predict Patient Survival from Gene Expression Data

      research-article
      Eric Bair 1, Robert Tibshirani 1,2
      PLoS Biology
      Public Library of Science


          Abstract

          An important goal of DNA microarray research is to develop tools to diagnose cancer more accurately based on the genetic profile of a tumor. There are several existing techniques in the literature for performing this type of diagnosis. Unfortunately, most of these techniques assume that different subtypes of cancer are already known to exist. Their utility is limited when such subtypes have not been previously identified. Although methods for identifying such subtypes exist, these methods do not work well for all datasets. It would be desirable to develop a procedure to find such subtypes that is applicable in a wide variety of circumstances. Even if no information is known about possible subtypes of a certain form of cancer, clinical information about the patients, such as their survival time, is often available. In this study, we develop some procedures that utilize both the gene expression data and the clinical data to identify subtypes of cancer and use this knowledge to diagnose future patients. These procedures were successfully applied to several publicly available datasets. We present diagnostic procedures that accurately predict the survival of future patients based on the gene expression profile and survival times of previous patients. This has the potential to be a powerful tool for diagnosing and treating cancer.

          Abstract

          Procedures that utilize both gene expression data and clinical data to identify subtypes of cancer can provide more accurate prognoses.
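
          The general idea in the abstract (use survival times to pick informative genes, find subtypes from those genes, then place a future patient in a subtype) can be illustrated with a minimal sketch. This is not the authors' procedure. It assumes simulated data, ranks genes by plain correlation with log survival time (which ignores censoring; a real analysis would need a survival-aware score), and uses a simple 2-means clustering. All function names and parameters below are hypothetical.

```python
import numpy as np

def select_genes(X, times, k):
    """Return indices of the k genes whose expression is most correlated
    (in absolute value) with log survival time. X is (n_patients, n_genes).
    Assumption: no censoring, so plain Pearson correlation is used."""
    y = np.log(times)
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    denom[denom == 0] = 1.0  # constant genes end up with correlation 0
    corr = Xc.T @ yc / denom
    return np.argsort(-np.abs(corr))[:k]

def two_means(Z, n_iter=100):
    """Plain 2-means clustering with a deterministic start: the first sample
    and the sample farthest from it serve as initial centroids."""
    c0 = Z[0]
    c1 = Z[np.argmax(((Z - c0) ** 2).sum(axis=1))]
    centroids = np.stack([c0, c1]).astype(float)
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new = np.stack([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(2)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

def assign_subtype(x_new, genes, centroids):
    """Assign a future patient to the nearest centroid on the selected genes."""
    dist = ((x_new[genes] - centroids) ** 2).sum(axis=1)
    return int(dist.argmin())

# Simulated cohort (an assumption for illustration, not data from the article):
# 80 patients, 200 genes, and a hidden subtype that shifts the first 10 genes
# and shortens survival.
rng = np.random.default_rng(0)
n, p, k = 80, 200, 10
subtype = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, p))
X[:, :k] += 3.0 * subtype[:, None]
times = np.exp(3.0 - 2.0 * subtype + 0.5 * rng.normal(size=n))

genes = select_genes(X, times, k)           # step 1: survival-guided gene selection
labels, centroids = two_means(X[:, genes])  # step 2: subtypes from selected genes

for j in range(2):
    print(f"cluster {j}: n={np.sum(labels == j)}, "
          f"median survival={np.median(times[labels == j]):.1f}")

# Step 3: a future patient simulated from the short-survival subtype.
x_new = rng.normal(size=p)
x_new[:k] += 3.0
print("future patient assigned to cluster", assign_subtype(x_new, genes, centroids))
```

          Selecting genes with the help of survival times before clustering is what makes the sketch semi-supervised: the clustering itself never sees the outcome, but it only operates on genes that appear related to it.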


                Author and article information

                Journal
                PLoS Biology (PLoS Biol)
                Publisher: Public Library of Science (San Francisco, USA)
                ISSN (print): 1544-9173
                ISSN (electronic): 1545-7885
                Issue date: April 2004
                Published: 13 April 2004
                Volume: 2
                Issue: 4
                Article number: e108
                Affiliations
                [1] Department of Statistics, Stanford University, Palo Alto, California, United States of America
                [2] Department of Health Research and Policy, Stanford University, Palo Alto, California, United States of America
                Article
                DOI: 10.1371/journal.pbio.0020108
                PMC ID: 387275
                PubMed ID: 15094809
                f3bb91b8-01ca-458a-b685-9df96613337c
                Copyright: © 2004 Bair and Tibshirani. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
                History
                Received: 18 July 2003
                Accepted: 10 February 2004
                Categories
                Research Article
                Bioinformatics/Computational Biology
                Cancer Biology
                Genetics/Genomics/Gene Therapy
                Homo (Human)

                Life sciences
