
      Comprehensive genomic analysis identifies pathogenic variants in maturity-onset diabetes of the young (MODY) patients in South India


          Abstract

          Background

          Maturity-onset diabetes of the young (MODY) is an early-onset, autosomal dominant form of non-insulin-dependent diabetes. Genetic diagnosis of MODY can transform patient management. Earlier data on the genetic predisposition to MODY have come primarily from familial studies in populations of European origin.

          Methods

          In this study, we carried out a comprehensive genomic analysis of 289 individuals from India, including 152 clinically diagnosed MODY cases, to identify variants in known MODY genes. We also analyzed exome data to identify putative MODY-relevant variants in genes not previously implicated in MODY, and performed functional validation of MODY-relevant variants.

          Results

          MODY 3 (HNF1A; 7.2%) was the most frequently mutated subtype, followed by MODY 12 (ABCC8; 3.3%); together they account for ~11% of the cases. In addition to known MODY genes, we identified variants in RFX6, WFS1, AKT2, and NKX6-1 that may contribute to the development of MODY. Functional assessment showed that the NKX6-1 variants are functionally impaired.
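The combined figure quoted in the Results can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that both percentages are fractions of the 152 clinically diagnosed MODY cases mentioned in the Methods, and the implied case counts it derives are estimates from rounded percentages, not numbers reported in the abstract.

```python
# Percentages as reported in the abstract's Results section.
reported_pct = {"HNF1A": 7.2, "ABCC8": 3.3}

# Assumption (not stated explicitly in the abstract): both percentages
# are taken over the 152 clinically diagnosed MODY cases.
N_CASES = 152

# Sum of the two reported percentages; the abstract rounds this to "~11%".
combined_pct = sum(reported_pct.values())

# Implied case counts: estimates back-calculated from rounded percentages.
implied_counts = {
    gene: round(pct / 100 * N_CASES) for gene, pct in reported_pct.items()
}

if __name__ == "__main__":
    print(f"Combined share: {combined_pct:.1f}%")
    for gene, count in implied_counts.items():
        print(f"{gene}: about {count} of {N_CASES} cases (estimate)")
```

Under that assumption, the two genes together cover roughly one case in ten, which is consistent with the "~11%" stated in the abstract.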

          Conclusions

          Our findings show that HNF1A and ABCC8 are the most frequently mutated MODY genes in south India. We also provide evidence for additional MODY-relevant genes, such as NKX6-1, which require further validation.

          Electronic supplementary material

          The online version of this article (10.1186/s12881-018-0528-6) contains supplementary material, which is available to authorized users.


                Author and article information

                Contributors
                91-44-43968888 , drmohans@diabetes.ind.in
                drradha@mdrf.in
                thongn@gene.com
                Stawiski.Eric@gene.com
                bajaj.kanika@gene.com
                goldstein.leonard@gene.com
                tom.jennifer@gene.com
                dranjana@drmohans.com
                beltran.monica@gene.com
                bhangale.tushar@gene.com
                jahnavi.suresh@gmail.com
                chandnidr@gmail.com
                gayuuprabhu@gmail.com
                paul@medgenome.com
                nazhang2013@gmail.com
                sakthivel.m@medgenome.com
                sameer.p@medgenome.com
                chaudhuri.subhra@gene.com
                ravig@medgenome.com
                zhang.jingli@gene.com
                sams@medgenome.com
                stinson.jeremy@gene.com
                modrusan.zora@gene.com
                ramprasadv@medgenome.com
                650-225-1000 , sekar@gene.com
                650-225-1000 , andrewp@gene.com
                Journal
                BMC Medical Genetics (BMC Med Genet)
                Publisher: BioMed Central (London)
                ISSN: 1471-2350
                Published: 13 February 2018
                Volume 19, article 22
                Affiliations
                [1] Madras Diabetes Research Foundation & Dr. Mohan’s Diabetes Specialities Centre, No. 4, Conran Smith Road, Gopalapuram, Chennai, Tamil Nadu 600 086, India (GRID grid.410867.c)
                [2] Department of Molecular Biology, Genentech Inc., 1 DNA Way, South San Francisco, CA 94080, USA (ISNI 0000 0004 0534 4718, GRID grid.418158.1)
                [3] Department of Bioinformatics and Computational Biology, Genentech Inc., 1 DNA Way, South San Francisco, CA 94080, USA (ISNI 0000 0004 0534 4718, GRID grid.418158.1)
                [4] Department of Human Genetics, Genentech Inc., 1 DNA Way, South San Francisco, CA 94080, USA (ISNI 0000 0004 0534 4718, GRID grid.418158.1)
                [5] Department of General Medicine, Govt. Medical College, Kozhikode, 673008, India (ISNI 0000 0001 0705 6304, GRID grid.253527.4)
                [6] MedGenome, Bangalore, Karnataka 560 099, India
                Author information
                http://orcid.org/0000-0003-4272-6443
                Article
                528
                DOI: 10.1186/s12881-018-0528-6
                PMCID: 5811965
                PMID: 29439679
                55a84649-45be-413d-9a2f-c9923c62b454
                © The Author(s) 2018

                Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                History
                Received: 25 August 2017
                Accepted: 19 January 2018
                Categories
                Research Article
                Genetics
                MODY, diabetes, exome, genomics analysis, NKX6-1
