      Genetic variant classification by predicted protein structure: A case study on IRF6


          Abstract

          Next-generation genome sequencing has revolutionized genetic testing, identifying numerous rare disease-associated gene variants. However, computational approaches remain inadequate for imputing pathogenicity, and functional testing of gene variants is required to provide the highest level of evidence. The emergence of AlphaFold2 has transformed the field of protein structure determination, and here we outline a strategy that leverages predicted protein structure to enhance genetic variant classification. We used the gene IRF6 as a case study because of its clinical relevance, its critical role in cleft lip/palate malformation, and the availability of experimental data on the pathogenicity of IRF6 gene variants from phenotype rescue experiments in irf6 -/- zebrafish. We compared results from over 30 pathogenicity prediction tools on 37 IRF6 missense variants. IRF6 lacks an experimentally derived structure, so we used predicted structures to explore associations between mutational clustering and pathogenicity. We found that 19 of the 37 variants were unanimously predicted as deleterious by computational tools. Comparing in silico predictions with experimental findings, 12 variants predicted as pathogenic were experimentally determined to be benign. Even with the recently published AlphaMissense model, 15/18 (83%) of the predicted pathogenic variants were experimentally determined to be benign. In comparison, mapping variants to the protein revealed deleterious mutation clusters around the protein binding domain, whereas N-terminal variants tend to be benign, suggesting the importance of structural information in determining the pathogenicity of mutations in this gene. In conclusion, incorporating gene-specific structural features of known pathogenic/benign mutations may provide meaningful insights into pathogenicity predictions in a gene-specific manner and facilitate the interpretation of variant pathogenicity.
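The comparison described in the abstract, in silico labels set against experimental outcomes, can be illustrated with a minimal Python sketch. The function name, variant identifiers, and labels below are hypothetical placeholders and are not data or code from the study; only the 15/18 figure is taken from the abstract.

```python
from typing import Dict


def benign_fraction_among_predicted_pathogenic(
    predicted: Dict[str, str], experimental: Dict[str, str]
) -> float:
    """Fraction of predicted-pathogenic variants that were experimentally benign."""
    # Variants that the (hypothetical) tool called pathogenic.
    called = [v for v, label in predicted.items() if label == "pathogenic"]
    if not called:
        raise ValueError("no variants were predicted pathogenic")
    # Of those calls, count the ones the experiment scored as benign.
    benign = sum(1 for v in called if experimental.get(v) == "benign")
    return benign / len(called)


# Hypothetical placeholder labels, NOT data from the study.
predicted = {"var1": "pathogenic", "var2": "pathogenic",
             "var3": "benign", "var4": "pathogenic"}
experimental = {"var1": "benign", "var2": "pathogenic",
                "var3": "benign", "var4": "benign"}

toy_fraction = benign_fraction_among_predicted_pathogenic(predicted, experimental)

# Arithmetic check of the figure quoted in the abstract: 15 of 18.
abstract_percent = round(15 / 18 * 100)
```

In this toy input, two of the three predicted-pathogenic variants are experimentally benign; the same calculation applied to the counts in the abstract (15 of 18) gives the reported 83%.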

          Most cited references (51)

          Highly accurate protein structure prediction with AlphaFold

          Proteins are essential to life, and understanding their structure can facilitate a mechanistic understanding of their function. Through an enormous experimental effort [1–4], the structures of around 100,000 unique proteins have been determined [5], but this represents a small fraction of the billions of known protein sequences [6, 7]. Structural coverage is bottlenecked by the months to years of painstaking effort required to determine a single protein structure. Accurate computational approaches are needed to address this gap and to enable large-scale structural bioinformatics. Predicting the three-dimensional structure that a protein will adopt based solely on its amino acid sequence, the structure prediction component of the ‘protein folding problem’ [8], has been an important open research problem for more than 50 years [9]. Despite recent progress [10–14], existing methods fall far short of atomic accuracy, especially when no homologous structure is available. Here we provide the first computational method that can regularly predict protein structures with atomic accuracy even in cases in which no similar structure is known. We validated an entirely redesigned version of our neural network-based model, AlphaFold, in the challenging 14th Critical Assessment of protein Structure Prediction (CASP14) [15], demonstrating accuracy competitive with experimental structures in a majority of cases and greatly outperforming other methods. Underpinning the latest version of AlphaFold is a novel machine learning approach that incorporates physical and biological knowledge about protein structure, leveraging multi-sequence alignments, into the design of the deep learning algorithm. AlphaFold predicts protein structures with an accuracy competitive with experimental structures in the majority of cases using a novel deep learning architecture.

            Standards and Guidelines for the Interpretation of Sequence Variants: A Joint Consensus Recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology

            The American College of Medical Genetics and Genomics (ACMG) previously developed guidance for the interpretation of sequence variants [1]. In the past decade, sequencing technology has evolved rapidly with the advent of high-throughput next generation sequencing. By adopting and leveraging next generation sequencing, clinical laboratories are now performing an ever increasing catalogue of genetic testing spanning genotyping, single genes, gene panels, exomes, genomes, transcriptomes and epigenetic assays for genetic disorders. By virtue of increased complexity, this paradigm shift in genetic testing has been accompanied by new challenges in sequence interpretation. In this context, the ACMG convened a workgroup in 2013 comprised of representatives from the ACMG, the Association for Molecular Pathology (AMP) and the College of American Pathologists (CAP) to revisit and revise the standards and guidelines for the interpretation of sequence variants. The group consisted of clinical laboratory directors and clinicians. This report represents expert opinion of the workgroup with input from ACMG, AMP and CAP stakeholders. These recommendations primarily apply to the breadth of genetic tests used in clinical laboratories including genotyping, single genes, panels, exomes and genomes. This report recommends the use of specific standard terminology: ‘pathogenic’, ‘likely pathogenic’, ‘uncertain significance’, ‘likely benign’, and ‘benign’ to describe variants identified in Mendelian disorders. Moreover, this recommendation describes a process for classification of variants into these five categories based on criteria using typical types of variant evidence (e.g. population data, computational data, functional data, segregation data, etc.). Because of the increased complexity of analysis and interpretation of clinical genetic testing described in this report, the ACMG strongly recommends that clinical molecular genetic testing should be performed in a CLIA-approved laboratory with results interpreted by a board-certified clinical molecular geneticist or molecular genetic pathologist or equivalent.

              The Protein Data Bank.

              The Protein Data Bank (PDB; http://www.rcsb.org/pdb/) is the single worldwide archive of structural data of biological macromolecules. This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and near-term plans for the future development of the resource.

                Author and article information

                Journal
                Comput Struct Biotechnol J
                Computational and Structural Biotechnology Journal
                Research Network of Computational and Structural Biotechnology
                2001-0370
                03 February 2024
                December 2024
                03 February 2024
                Volume: 23
                Pages: 892-904
                Affiliations
                [a] Graduate Program in Biochemistry and Molecular Biophysics, University of Pennsylvania, Philadelphia, PA 19104, United States
                [b] Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, United States
                [c] Master of Biotechnology Program, University of Pennsylvania, Philadelphia, PA 19104, United States
                [d] Department of Surgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, United States
                [e] Center for Craniofacial Innovation, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, United States
                [f] Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA 19104, United States
                Author notes
                [*] Corresponding author at: Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, United States. wangk@chop.edu
                Article
                S2001-0370(24)00018-7
                10.1016/j.csbj.2024.01.019
                10869248
                38370976
                45301d2a-87db-4f3a-92a4-015ff8b66853
                © 2024 The Authors

                This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

                History
                Received: 2 December 2023
                Revised: 24 January 2024
                Accepted: 25 January 2024
                Categories
                Research Article

                variant interpretation, structural biology, clinical genetics
