
      An additional k-means clustering step improves the biological features of WGCNA gene co-expression networks

      research-article


          Abstract

          Background

          Weighted Gene Co-expression Network Analysis (WGCNA) is a widely used R software package for the generation of gene co-expression networks (GCN). WGCNA generates both a GCN and a derived partitioning of clusters of genes (modules). We propose k-means clustering as an additional processing step to conventional WGCNA, which we have implemented in the R package km2gcn (k-means to gene co-expression network, https://github.com/juanbot/km2gcn).
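          The abstract does not spell out the refinement procedure, so the following is only a minimal sketch of how a k-means style step could post-process an existing module assignment. It assumes that centroids are initialised from the starting modules, that each gene is reassigned to the centroid it correlates with most strongly, and that the loop stops when no gene moves. The names `refine_modules`, `centroid` and `pearson`, the mean-profile centroid (a stand-in for a module eigengene) and the toy data are all hypothetical and are not taken from km2gcn.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def centroid(profiles):
    """Mean expression profile of a module (assumed stand-in for its eigengene)."""
    return [sum(col) / len(profiles) for col in zip(*profiles)]


def refine_modules(expr, labels, max_iter=30):
    """Reassign genes to the most correlated module centroid until stable.

    expr:   dict gene -> list of expression values across samples
    labels: dict gene -> initial module label (for example, from WGCNA)
    """
    labels = dict(labels)
    for _ in range(max_iter):
        # Group expression profiles by their current module.
        modules = {}
        for gene, mod in labels.items():
            modules.setdefault(mod, []).append(expr[gene])
        centroids = {mod: centroid(p) for mod, p in modules.items()}
        # Move every gene to the module whose centroid it correlates with best.
        new_labels = {
            gene: max(sorted(centroids),
                      key=lambda m: pearson(expr[gene], centroids[m]))
            for gene in expr
        }
        if new_labels == labels:
            break  # converged: no gene changed module
        labels = new_labels
    return labels


# Toy data: g5 follows the rising pattern of module A but starts in module B.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [8.0, 6.0, 4.5, 2.0],
    "g5": [1.1, 2.1, 2.9, 4.2],
}
initial = {"g1": "A", "g2": "A", "g3": "B", "g4": "B", "g5": "B"}
refined = refine_modules(expr, initial)
```

          In this toy case the misplaced gene `g5` moves to module A while the other genes keep their labels, which is the kind of correction item (1) of the Results refers to.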

          Results

          We assessed our method on networks created from UKBEC data (10 different human brain tissues), on networks created from GTEx data (42 human tissues, including 13 brain tissues), and on simulated networks derived from GTEx data. We observed substantially improved module properties, including: (1) few or zero misplaced genes; (2) increased counts of replicable clusters in alternate tissues (×3.1 on average); (3) improved enrichment of Gene Ontology terms (seen in 48/52 GCNs); (4) improved cell type enrichment signals (seen in 21/23 brain GCNs); and (5) more accurate partitions in simulated data according to a range of similarity indices.
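          As an illustration of how a partition can be scored against a known simulated partition, the sketch below computes the Rand index, one common similarity index between two clusterings. The abstract does not say which indices were used, so the choice of the Rand index, the function name `rand_index` and the toy labels are assumptions made only for this example.

```python
from itertools import combinations


def rand_index(part_a, part_b):
    """Fraction of gene pairs on which two partitions agree.

    A pair agrees when both partitions put the two genes in the same
    module, or both put them in different modules. Label names need not
    match between the partitions.
    """
    if set(part_a) != set(part_b):
        raise ValueError("partitions must cover the same genes")
    if len(part_a) < 2:
        raise ValueError("need at least two genes")
    agree = 0
    total = 0
    for g, h in combinations(sorted(part_a), 2):
        same_a = part_a[g] == part_a[h]
        same_b = part_b[g] == part_b[h]
        agree += int(same_a == same_b)
        total += 1
    return agree / total


# Hypothetical simulated truth and two candidate partitions.
truth = {"g1": "A", "g2": "A", "g3": "B", "g4": "B", "g5": "A"}
before = {"g1": "A", "g2": "A", "g3": "B", "g4": "B", "g5": "B"}
after = {"g1": "X", "g2": "X", "g3": "Y", "g4": "Y", "g5": "X"}

score_before = rand_index(truth, before)  # 6 of 10 pairs agree
score_after = rand_index(truth, after)    # all 10 pairs agree
```

          A higher value means closer agreement with the reference partition; an index of 1.0 means the two partitions group the genes identically.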

          Conclusions

          Our results indicate that the k-means method, applied as an adjunct to standard WGCNA, yields better network partitions. These improved partitions enable more fruitful downstream analyses because the gene modules are more biologically meaningful.

          Electronic supplementary material

          The online version of this article (doi:10.1186/s12918-017-0420-6) contains supplementary material, which is available to authorized users.


                Author and article information

                Contributors
                j.botia@ucl.ac.uk
                jana.vandrovcova@ucl.ac.uk
                paola.forabosco@cnr.it
                m.guelfi@ucl.ac.uk
                karishma.d'sa@kcl.ac.uk
                john.hardy@ucl.ac.uk
                cathryn.lewis@kcl.ac.uk
                mina.ryten@ucl.ac.uk
                michael.weale@kcl.ac.uk
                Journal
                BMC Syst Biol
                BMC Systems Biology
                BioMed Central (London)
                1752-0509
                12 April 2017
                2017
                Volume: 11
                Article number: 47
                Affiliations
                [1] GRID grid.83440.3b, Department of Molecular Neuroscience, Institute of Neurology, University College London, Queen Square, London, WC1N UK
                [2] GRID grid.13097.3c, Department of Medical & Molecular Genetics, School of Medical Sciences, King’s College London, Guy’s Hospital, London, SE1 9RT UK
                [3] GRID grid.7763.5, Istituto di Ricerca Genetica e Biomedica, CNR, Cittadella Universitaria di Monserrato, Monserrato, 09042 CA Italy
                Author information
                http://orcid.org/0000-0002-6992-598X
                Article
                420
                10.1186/s12918-017-0420-6
                5389000
                28403906
                13cb78dc-7596-4349-b78b-d7fce0fb35e7
                © The Author(s) 2017

                Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                History
                Received: 26 August 2016
                Accepted: 17 March 2017
                Funding
                Funded by: FundRef http://dx.doi.org/10.13039/501100000265, Medical Research Council;
                Award ID: MR/K01417X/1
                Funded by: Alzheimer’s Research UK (GB)
                Categories
                Software
                Custom metadata

                Quantitative & Systems biology
                gene co-expression networks on brain, k-means applied to WGCNA, assessment of better gene clusters on bulk tissue
