      Is Open Access

      Likelihood-Based Inference of B Cell Clonal Families

      research-article
      PLoS Computational Biology
      Public Library of Science


          Abstract

          The human immune system depends on a highly diverse collection of antibody-making B cells. B cell receptor sequence diversity is generated by a random recombination process called “rearrangement” forming progenitor B cells, then a Darwinian process of lineage diversification and selection called “affinity maturation.” The resulting receptors can be sequenced in high throughput for research and diagnostics. Such a collection of sequences contains a mixture of various lineages, each of which may be quite numerous, or may consist of only a single member. As a step to understanding the process and result of this diversification, one may wish to reconstruct lineage membership, i.e. to cluster sampled sequences according to which came from the same rearrangement events. We call this clustering problem “clonal family inference.” In this paper we describe and validate a likelihood-based framework for clonal family inference based on a multi-hidden Markov Model (multi-HMM) framework for B cell receptor sequences. We describe an agglomerative algorithm to find a maximum likelihood clustering, two approximate algorithms with various trade-offs of speed versus accuracy, and a third, fast algorithm for finding specific lineages. We show that under simulation these algorithms greatly improve upon existing clonal family inference methods, and that they also give significantly different clusters than previous methods when applied to two real data sets.
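The agglomerative idea in the abstract can be illustrated with a minimal sketch. It assumes a toy likelihood in place of the paper's multi-HMM: each cluster is scored by the marginal probability that all of its equal-length sequences descend from one unknown ancestor, with a made-up per-site mutation probability `MU`. The names `log_lik` and `agglomerate` are hypothetical and are not taken from the authors' software. Clusters are merged greedily while the best merge still raises the total log-likelihood.

```python
import math
from itertools import combinations

MU = 0.05       # assumed per-site mutation probability (toy value)
BASES = "ACGT"

def log_lik(cluster):
    """Toy log marginal likelihood that all sequences share one unknown ancestor."""
    total = 0.0
    for column in zip(*cluster):          # iterate over aligned sites
        site = 0.0
        for anc in BASES:                 # sum over the unknown ancestral base
            p = 0.25                      # uniform prior on the ancestor
            for obs in column:
                p *= (1 - MU) if obs == anc else MU / 3
            site += p
        total += math.log(site)
    return total

def agglomerate(seqs):
    """Greedily merge the pair of clusters with the largest positive log-likelihood gain."""
    clusters = [(s,) for s in seqs]       # start with every sequence alone
    while len(clusters) > 1:
        best_gain, best_pair = 0.0, None
        for i, j in combinations(range(len(clusters)), 2):
            a, b = clusters[i], clusters[j]
            # log of P(A u B) / (P(A) P(B)): positive means merging is favored
            gain = log_lik(a + b) - log_lik(a) - log_lik(b)
            if gain > best_gain:
                best_gain, best_pair = gain, (i, j)
        if best_pair is None:             # no merge improves the likelihood
            break
        i, j = best_pair
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters
```

Calling `agglomerate(["AAAAAAAAAA", "AAAAAAAAAT", "CCCCCCCCCC", "CCCCCCCCCG"])` groups the two A-rich and the two C-rich sequences separately. The pairwise search costs quadratic time per merge step, which is one reason a paper of this kind would also consider faster approximate algorithms.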

          Author Summary

          Antibodies must recognize a great diversity of antigens to protect us from infectious disease. The binding properties of antibodies are determined by the DNA sequences of their corresponding B cell receptors (BCRs). These BCR sequences are created in naive form by VDJ recombination, which randomly selects and trims the ends of V, D, and J genes, then joins the resulting segments together with additional random nucleotides. If they pass initial screening and bind an antigen, these sequences then undergo an evolutionary process of reproduction, mutation, and selection, revising the BCR to improve binding to its cognate antigen. It has recently become possible to determine the BCR sequences resulting from this process in high throughput. Although these sequences implicitly contain a wealth of information about both antigen exposure and the process by which we learn to resist pathogens, this information can only be extracted using computer algorithms. In this paper we describe a likelihood-based statistical method to determine, given a collection of BCR sequences, which of them are derived from the same recombination events. It is based on a hidden Markov model (HMM) of VDJ rearrangement which is able to calculate likelihoods for many sequences at once.
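The recombination step described in the summary (select V, D, and J genes, trim their ends, join them with random nucleotides) can be sketched as a small simulator. Everything concrete here is an assumption for illustration: the germline strings are short invented placeholders rather than real gene sequences, and the trimming and insertion length limits are arbitrary. The function names `random_insertion` and `rearrange` are hypothetical.

```python
import random

# Hypothetical toy germline segments; real V, D, and J genes are much longer.
V_GENES = ["CAGGTGCAGCTGGTG", "GAGGTGCAGCTGTTG"]
D_GENES = ["GGTATAGCAGC", "GTATTACTATG"]
J_GENES = ["ACTACTGGGGCCAG", "CTTTGACTACTGGG"]

def random_insertion(rng, max_len=4):
    """Return 0..max_len random nucleotides to place at a junction."""
    return "".join(rng.choice("ACGT") for _ in range(rng.randint(0, max_len)))

def rearrange(rng, max_trim=3):
    """Simulate one naive rearrangement: choose genes, trim ends, join with insertions."""
    v = rng.choice(V_GENES)
    d = rng.choice(D_GENES)
    j = rng.choice(J_GENES)
    v = v[: len(v) - rng.randint(0, max_trim)]        # trim the end of V that joins D
    d_start = rng.randint(0, max_trim)                # trim both ends of D
    d_end = len(d) - rng.randint(0, max_trim)
    d = d[d_start:d_end]
    j = j[rng.randint(0, max_trim):]                  # trim the end of J that joins D
    return v + random_insertion(rng) + d + random_insertion(rng) + j
```

For example, `rearrange(random.Random(0))` returns one simulated naive sequence. Sequences that share a rearrangement event form one clonal family; this sketch does not model the later mutation and selection steps.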


                Author and article information

                Contributors
                Role: Editor
                Journal
                Title: PLoS Computational Biology (abbreviated PLoS Comput Biol)
                Publisher: Public Library of Science (San Francisco, CA, USA)
                ISSN: 1553-734X, 1553-7358
                Published: 17 October 2016
                Volume: 12
                Issue: 10
                Article: e1005086
                Affiliations
                Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
                La Jolla Institute for Allergy and Immunology, United States
                Author notes

                The authors have declared that no competing interests exist.

                • Conceived and designed the experiments: DKR FAM.

                • Performed the experiments: DKR.

                • Analyzed the data: DKR FAM.

                • Contributed reagents/materials/analysis tools: DKR.

                • Wrote the paper: DKR FAM.

                Author information
                http://orcid.org/0000-0003-0607-6025
                Article
                Manuscript ID: PCOMPBIOL-D-15-01705
                DOI: 10.1371/journal.pcbi.1005086
                PMCID: PMC5066976
                PMID: 27749910
                © 2016 Ralph, Matsen

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 8 October 2015
                Accepted: 27 July 2016
                Page count
                Figures: 13, Tables: 1, Pages: 28
                Funding
                Funded by: National Institutes of Health (funder ID http://dx.doi.org/10.13039/100000002)
                Award ID: GM113246
                Both authors were supported by National Institutes of Health R01 GM113246 (PI Matsen), R01 AI103981, and U19 AI117891. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Physical sciences > Mathematics > Probability theory > Markov models > Hidden Markov models
                Biology and Life Sciences > Cell Biology > Cellular Types > Animal Cells > Immune Cells > Antibody-Producing Cells > B Cells
                Biology and Life Sciences > Immunology > Immune Cells > Antibody-Producing Cells > B Cells
                Medicine and Health Sciences > Immunology > Immune Cells > Antibody-Producing Cells > B Cells
                Biology and Life Sciences > Cell Biology > Cellular Types > Animal Cells > Blood Cells > White Blood Cells > B Cells
                Biology and Life Sciences > Cell Biology > Cellular Types > Animal Cells > Immune Cells > White Blood Cells > B Cells
                Biology and Life Sciences > Immunology > Immune Cells > White Blood Cells > B Cells
                Medicine and Health Sciences > Immunology > Immune Cells > White Blood Cells > B Cells
                Biology and Life Sciences > Immunology > Immune System Proteins > Immune Receptors > B Cell Receptors
                Medicine and Health Sciences > Immunology > Immune System Proteins > Immune Receptors > B Cell Receptors
                Biology and Life Sciences > Biochemistry > Proteins > Immune System Proteins > Immune Receptors > B Cell Receptors
                Biology and Life Sciences > Cell Biology > Signal Transduction > Immune Receptors > B Cell Receptors
                Physical Sciences > Mathematics > Applied Mathematics > Algorithms
                Research and Analysis Methods > Simulation and Modeling > Algorithms
                Research and Analysis Methods > Simulation and Modeling
                Biology and Life Sciences > Molecular Biology > Molecular Biology Techniques > Molecular Biology Assays and Analysis Techniques > Phylogenetic Analysis
                Research and Analysis Methods > Molecular Biology Techniques > Molecular Biology Assays and Analysis Techniques > Phylogenetic Analysis
                Biology and Life Sciences > Genetics > Mutation > Insertion Mutation
                Biology and Life Sciences > Molecular Biology > Molecular Biology Techniques > Sequencing Techniques > Sequence Analysis
                Research and Analysis Methods > Molecular Biology Techniques > Sequencing Techniques > Sequence Analysis
                Custom metadata
                The "Adaptive" data set is available at http://adaptivebiotech.com/link/mat2015. The "Vollmers" data set is available at http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000656.v1.p1.

                Quantitative & Systems biology