2
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: not found

      Vaxign-ML: supervised machine learning reverse vaccinology model for improved prediction of bacterial protective antigens

      1 , 2 , 3 , 3 , 4 , 4 , 3 , 5 , 6
      Bioinformatics
      Oxford University Press (OUP)

      Read this article at

      ScienceOpenPublisherPubMed
      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          Motivation

          Reverse vaccinology (RV) is a milestone in rational vaccine design, and machine learning (ML) has been applied to enhance the accuracy of RV prediction. However, ML-based RV still faces challenges in prediction accuracy and program accessibility.

          Results

          This study presents Vaxign-ML, a supervised ML classification to predict bacterial protective antigens (BPAgs). To identify the best ML method with optimized conditions, five ML methods were tested with biological and physiochemical features extracted from well-defined training data. Nested 5-fold cross-validation and leave-one-pathogen-out validation were used to ensure unbiased performance assessment and the capability to predict vaccine candidates against a new emerging pathogen. The best performing model (eXtreme Gradient Boosting) was compared to three publicly available programs (Vaxign, VaxiJen, and Antigenic), one SVM-based method, and one epitope-based method using a high-quality benchmark dataset. Vaxign-ML showed superior performance in predicting BPAgs. Vaxign-ML is hosted in a publicly accessible web server and a standalone version is also available.

          Availability and implementation

          Vaxign-ML website at http://www.violinet.org/vaxign/vaxign-ml, Docker standalone Vaxign-ML available at https://hub.docker.com/r/e4ong1031/vaxign-ml and source code is available at https://github.com/VIOLINet/Vaxign-ML-docker.

          Supplementary information

          Supplementary data are available at Bioinformatics online.

          Related collections

          Most cited references31

          • Record: found
          • Abstract: not found
          • Article: not found

          MINIMUM REDUNDANCY FEATURE SELECTION FROM MICROARRAY GENE EXPRESSION DATA

            Bookmark
            • Record: found
            • Abstract: found
            • Article: not found

            High-throughput prediction of protein antigenicity using protein microarray data.

            Discovery of novel protective antigens is fundamental to the development of vaccines for existing and emerging pathogens. Most computational methods for predicting protein antigenicity rely directly on homology with previously characterized protective antigens; however, homology-based methods will fail to discover truly novel protective antigens. Thus, there is a significant need for homology-free methods capable of screening entire proteomes for the antigens most likely to generate a protective humoral immune response. Here we begin by curating two types of positive data: (i) antigens that elicit a strong antibody response in protected individuals but not in unprotected individuals, using human immunoglobulin reactivity data obtained from protein microarray analyses; and (ii) known protective antigens from the literature. The resulting datasets are used to train a sequence-based prediction model, ANTIGENpro, to predict the likelihood that a protein is a protective antigen. ANTIGENpro correctly classifies 82% of the known protective antigens when trained using only the protein microarray datasets. The accuracy on the combined dataset is estimated at 76% by cross-validation experiments. Finally, ANTIGENpro performs well when evaluated on an external pathogen proteome for which protein microarray data were obtained after the initial development of ANTIGENpro. ANTIGENpro is integrated in the SCRATCH suite of predictors available at http://scratch.proteomics.ics.uci.edu. pfbaldi@ics.uci.edu
              Bookmark
              • Record: found
              • Abstract: found
              • Article: not found

              Prediction of protein folding class using global description of amino acid sequence.

              We present a method for predicting protein folding class based on global protein chain description and a voting process. Selection of the best descriptors was achieved by a computer-simulated neural network trained on a data base consisting of 83 folding classes. Protein-chain descriptors include overall composition, transition, and distribution of amino acid attributes, such as relative hydrophobicity, predicted secondary structure, and predicted solvent exposure. Cross-validation testing was performed on 15 of the largest classes. The test shows that proteins were assigned to the correct class (correct positive prediction) with an average accuracy of 71.7%, whereas the inverse prediction of proteins as not belonging to a particular class (correct negative prediction) was 90-95% accurate. When tested on 254 structures used in this study, the top two predictions contained the correct class in 91% of the cases.
                Bookmark

                Author and article information

                Journal
                Bioinformatics
                Oxford University Press (OUP)
                1367-4803
                1460-2059
                May 15 2020
                May 01 2020
                February 25 2020
                May 15 2020
                May 01 2020
                February 25 2020
                : 36
                : 10
                : 3185-3191
                Affiliations
                [1 ]Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
                [2 ]Department of Pathogenobiology, Daqing Branch of Harbin Medical University, Daqing 163319, China
                [3 ]Unit for Laboratory Animal Medicine
                [4 ]College of Literature, Science, and the Arts, University of Michigan
                [5 ]Department of Microbiology and Immunology
                [6 ]Center of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
                Article
                10.1093/bioinformatics/btaa119
                32096826
                dffe1d6c-12fb-4d66-8288-41eb316f3ec6
                © 2020

                https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model

                History

                Comments

                Comment on this article