      RCFGL: Rapid Condition adaptive Fused Graphical Lasso and application to modeling brain region co-expression networks

      research-article


          Abstract

          Inferring gene co-expression networks is a useful process for understanding gene regulation and pathway activity. The networks are usually undirected graphs where genes are represented as nodes and an edge represents a significant co-expression relationship. When expression data of multiple (p) genes in multiple (K) conditions (e.g., treatments, tissues, strains) are available, joint estimation of networks harnessing shared information across them can significantly increase the power of analysis. In addition, examining condition-specific patterns of co-expression can provide insights into the underlying cellular processes activated in a particular condition. Condition adaptive fused graphical lasso (CFGL) is an existing method that incorporates condition specificity in a fused graphical lasso (FGL) model for estimating multiple co-expression networks. However, with computational complexity of O(p^2 K log K), the current implementation of CFGL is prohibitively slow even for a moderate number of genes and can only be used for a maximum of three conditions. In this paper, we propose a faster alternative of CFGL named rapid condition adaptive fused graphical lasso (RCFGL). In RCFGL, we incorporate the condition specificity into another popular model for joint network estimation, known as fused multiple graphical lasso (FMGL). We use a more efficient algorithm in the iterative steps compared to CFGL, enabling faster computation with complexity of O(p^2 K) and making it easily generalizable for more than three conditions. We also present a novel screening rule to determine if the full network estimation problem can be broken down into estimation of smaller disjoint sub-networks, thereby reducing the complexity further. We demonstrate the computational advantage and superior performance of our method compared to two non-condition adaptive methods, FGL and FMGL, and one condition adaptive method, CFGL in both simulation study and real data analysis. We used RCFGL to jointly estimate the gene co-expression networks in different brain regions (conditions) using a cohort of heterogeneous stock rats. We also provide an accommodating C and Python based package that implements RCFGL.
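
The idea of splitting one large estimation problem into disjoint sub-networks can be illustrated with a small sketch. The abstract does not spell out the RCFGL screening rule, so the code below is not that rule. It shows the well-known covariance-thresholding rule for a single-network graphical lasso instead: two genes are linked when the absolute value of their sample covariance exceeds the sparsity penalty, and the connected components of that graph can be estimated separately. The function name `screen_blocks`, the parameter `lam`, and the toy matrix `S` are assumptions made for illustration only.

```python
from typing import Dict, List, Sequence


def screen_blocks(cov: Sequence[Sequence[float]], lam: float) -> List[List[int]]:
    """Group variables into blocks that could be estimated separately.

    Genes i != j are linked when |cov[i][j]| > lam. The connected
    components of that graph are returned as sorted lists of indices.
    This is the single-network thresholding rule, used here only as an
    illustration; it is NOT the multi-condition rule from the paper.
    """
    p = len(cov)
    parent = list(range(p))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the components of every pair whose covariance passes the threshold.
    for i in range(p):
        for j in range(i + 1, p):
            if abs(cov[i][j]) > lam:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri

    # Collect the members of each component.
    blocks: Dict[int, List[int]] = {}
    for i in range(p):
        blocks.setdefault(find(i), []).append(i)
    return sorted(blocks.values())


# Hypothetical sample covariance matrix for five genes.
S = [
    [1.0, 0.6, 0.0, 0.05, 0.0],
    [0.6, 1.0, 0.0, 0.0, 0.02],
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.05, 0.0, 0.0, 1.0, 0.7],
    [0.0, 0.02, 0.0, 0.7, 1.0],
]

print(screen_blocks(S, lam=0.1))  # [[0, 1], [2], [3, 4]]
```

With `lam=0.1` the five genes fall into three blocks, so three small problems would be solved in place of one five-gene problem. A smaller penalty keeps more links and yields fewer, larger blocks.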

          Author summary

          Inferring gene co-expression networks can be useful for understanding pathway activity and gene regulation. While jointly estimating co-expression networks of multiple conditions, taking into account condition specificity, such as information about an edge being present only in a specific condition or an edge being present across all the conditions, substantially increases the power. In this paper, a computationally rapid condition adaptive method for jointly estimating gene co-expression networks of multiple conditions is proposed. The novelty of the method is demonstrated through a broad range of simulation studies and a real data analysis with multiple brain regions from a genetically diverse cohort of rats.

                Author and article information

                Contributors
                Roles: Conceptualization, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing
                Roles: Funding acquisition, Methodology, Writing – review & editing
                Roles: Methodology, Validation, Writing – review & editing
                Roles: Funding acquisition, Methodology, Resources, Writing – review & editing
                Roles: Conceptualization, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Visualization, Writing – original draft, Writing – review & editing
                Role: Editor
                Journal
                PLoS Comput Biol
                PLOS Computational Biology
                Public Library of Science (San Francisco, CA, USA)
                1553-734X
                1553-7358
                January 2023
                6 January 2023
                Volume: 19
                Issue: 1
                Article: e1010758
                Affiliations
                [1 ] Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, Colorado, United States of America
                [2 ] Department of Statistics, Pennsylvania State University, University Park, Pennsylvania, United States of America
                [3 ] Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, Colorado, United States of America
                OvGU; Medical Faculty, GERMANY
                Author notes

                No competing interests declared.

                Author information
                https://orcid.org/0000-0003-3268-610X
                https://orcid.org/0000-0003-0675-7648
                https://orcid.org/0000-0001-6343-1607
                https://orcid.org/0000-0002-3725-5459
                Article
                Manuscript ID: PCOMPBIOL-D-22-00194
                DOI: 10.1371/journal.pcbi.1010758
                PMCID: 9821764
                PMID: 36607897
                a2ae777d-0aaf-4b48-affb-240d856cc608
                © 2023 Seal et al

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 9 February 2022
                Accepted: 24 November 2022
                Page count
                Figures: 9, Tables: 5, Pages: 26
                Funding
                Funded by: National Institute of General Medical Sciences (funder ID: http://dx.doi.org/10.13039/100000057); Award IDs: R01GM109453, T32 GM102057
                Funded by: National Institute on Drug Abuse (funder ID: http://dx.doi.org/10.13039/100000026); Award IDs: P30DA044223, P50DA037844
                Funded by: National Institute on Alcohol Abuse and Alcoholism (funder ID: http://dx.doi.org/10.13039/100000027); Award ID: R24AA013162
                Funded by: National Heart, Lung, and Blood Institute (funder ID: http://dx.doi.org/10.13039/100000050); Award ID: R01HL152735
                Q.L. was supported by the National Institute of General Medical Sciences (NIGMS) of the National Institutes of Health (NIH) grant R01GM109453. E.B.B. was supported by the NIGMS training grant T32 GM102057 awarded to Pennsylvania State University. L.M.S. and K.K. were supported by the National Institute on Drug Abuse (NIDA) of the NIH under award number P30DA044223. L.M.S. was also supported by NIDA under award number P50DA037844 and by the National Institute on Alcohol Abuse and Alcoholism (NIAAA) of the NIH under award number R24AA013162. K.K. was also supported by the National Heart, Lung, and Blood Institute (NHLBI) of the NIH under award number R01HL152735. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Computer and Information Sciences > Network Analysis
                Biology and Life Sciences > Genetics > Gene Identification and Analysis > Genetic Networks
                Computer and Information Sciences > Network Analysis > Genetic Networks
                Biology and Life Sciences > Physiology > Physiological Parameters > Body Weight
                Medicine and Health Sciences > Oncology > Cancers and Neoplasms
                Physical Sciences > Mathematics > Applied Mathematics > Algorithms
                Research and Analysis Methods > Simulation and Modeling > Algorithms
                Biology and Life Sciences > Anatomy > Brain > Prefrontal Cortex
                Medicine and Health Sciences > Anatomy > Brain > Prefrontal Cortex
                Physical Sciences > Mathematics > Probability Theory > Random Variables > Covariance
                Computer and Information Sciences > Neural Networks
                Biology and Life Sciences > Neuroscience > Neural Networks
                Custom metadata
                Associated software package can be found at this link, https://github.com/sealx017/RCFGL. All the codes and the extracted results from the simulation studies are provided with detailed documentation. The real data can be accessed through GSE173141, https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE173141.

                Quantitative & Systems biology