
      Characterization of Posttranslationally Modified Multidrug Efflux Pumps Reveals an Unexpected Link between Glycosylation and Antimicrobial Resistance


          Abstract

          Nearly all bacterial species have at least a single glycosylation system, but the direct effects of these posttranslational protein modifications are unresolved. Glycoproteome-wide analysis of several bacterial pathogens has revealed general glycan modifications of virulence factors and protein assemblies. Using Campylobacter jejuni as a model organism, we have studied the role of general N-linked glycans in the multidrug efflux pump commonly found in Gram-negative bacteria. We show, for the first time, the direct link between N-linked glycans and multidrug efflux pump activity. At the protein level, we demonstrate that N-linked glycans play a role in enhancing protein thermostability and mediating the assembly of the multidrug efflux pump to promote antimicrobial resistance, highlighting the importance of this posttranslational modification in bacterial physiology. Similar roles for glycans are expected to be found in other Gram-negative pathogens that possess general protein glycosylation systems.

          ABSTRACT

          The substantial rise in multidrug-resistant bacterial infections is a current global imperative. Cumulative efforts to characterize antimicrobial resistance in bacteria have demonstrated the spread of six families of multidrug efflux pumps, of which resistance-nodulation-cell division (RND) is the major mechanism of multidrug resistance in Gram-negative bacteria. RND is composed of a tripartite protein assembly and confers resistance to a range of unrelated compounds. In the major enteric pathogen Campylobacter jejuni, the three protein components of RND are posttranslationally modified with N-linked glycans. The direct role of N-linked glycans in C. jejuni and other bacteria has long been elusive. Here, we present the first detailed account of the role of N-linked glycans and the link between N-glycosylation and antimicrobial resistance in C. jejuni. We demonstrate the multifunctional role of N-linked glycans in enhancing protein thermostability, stabilizing protein complexes, and promoting protein-protein interaction, thus mediating antimicrobial resistance by enhancing multidrug efflux pump activity. This affirms that glycosylation is critical for multidrug efflux pump assembly. We present a generalized strategy that could be used to investigate general glycosylation systems in the genus Campylobacter and a potential target for developing antimicrobials against multidrug-resistant pathogens.


          Most cited references (52)


          Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega

          Introduction Multiple sequence alignments (MSAs) are essential in most bioinformatics analyses that involve comparing homologous sequences. The exact way of computing an optimal alignment between N sequences has a computational complexity of O(L^N) for N sequences of length L, making it prohibitive for even small numbers of sequences. Most automatic methods are based on the ‘progressive alignment’ heuristic (Hogeweg and Hesper, 1984), which aligns sequences in larger and larger subalignments, following the branching order in a ‘guide tree.’ With a complexity of roughly O(N^2), this approach can routinely make alignments of a few thousand sequences of moderate length, but it is tough to make alignments much bigger than this. The progressive approach is a ‘greedy algorithm’ where mistakes made at the initial alignment stages cannot be corrected later. To counteract this effect, the consistency principle was developed (Notredame et al, 2000). This has allowed the production of a new generation of more accurate aligners (e.g. T-Coffee (Notredame et al, 2000)) but at the expense of ease of computation. These methods give 5–10% more accurate alignments, as measured on benchmarks, but are confined to a few hundred sequences. In this report, we introduce a new program called Clustal Omega, which is accurate but also allows alignments of almost any size to be produced. We have used it to generate alignments of over 190 000 sequences on a single processor in a few hours. In benchmark tests, it is distinctly more accurate than most widely used, fast methods and comparable in accuracy to some of the intensive slow methods. It also has powerful features for allowing users to reuse their alignments so as to avoid recomputing an entire alignment, every time new sequences become available. The key to making the progressive alignment approach scale is the method used to make the guide tree. 
Normally, this involves aligning all N sequences to each other giving time and memory requirements of O(N^2). Protein families with >50 000 sequences are appearing and will become common from various wide scale genome sequencing projects. Currently, the only method that can routinely make alignments of more than about 10 000 sequences is MAFFT/PartTree (Katoh and Toh, 2007). It is very fast but leads to a loss in accuracy, which has to be compensated for by iteration and other heuristics. With Clustal Omega, we use a modified version of mBed (Blackshields et al, 2010), which has complexity of O(N log N), and which produces guide trees that are just as accurate as those from conventional methods. mBed works by ‘emBedding’ each sequence in a space of n dimensions where n is proportional to log N. Each sequence is then replaced by an n element vector, where each element is simply the distance to one of n ‘reference sequences.’ These vectors can then be clustered extremely quickly by standard methods such as K-means or UPGMA. In Clustal Omega, the alignments are then computed using the very accurate HHalign package (Söding, 2005), which aligns two profile hidden Markov models (Eddy, 1998). Clustal Omega has a number of features for adding sequences to existing alignments or for using existing alignments to help align new sequences. One innovation is to allow users to specify a profile HMM that is derived from an alignment of sequences that are homologous to the input set. The sequences are then aligned to these ‘external profiles’ to help align them to the rest of the input set. There are already widely available collections of HMMs from many sources such as Pfam (Finn et al, 2009) and these can now be used to help users to align their sequences. Results Alignment accuracy The standard method for measuring the accuracy of multiple alignment algorithms is to use benchmark test sets of reference alignments, generated with reference to three-dimensional structures. 
Here, we present results from a range of packages tested on three benchmarks: BAliBASE (Thompson et al, 2005), Prefab (Edgar, 2004) and an extended version of HomFam (Blackshields et al, 2010). For these tests, we just report results using the default settings for all programs but with two exceptions, which were needed to allow MUSCLE (Edgar, 2004) and MAFFT to align the biggest test cases in HomFam. For test cases with >3000 sequences, we run MUSCLE with the –maxiter parameter set to 2, in order to finish the alignments in reasonable times. Second, we have run several different programs from the MAFFT package. MAFFT (Katoh et al, 2002) consists of a series of programs that can be run separately or called automatically from a script with the --auto flag set. This flag chooses to run a slow, consistency-based program (L-INS-i) when the number and lengths of sequences is small. When the numbers exceed inbuilt thresholds, a conventional progressive aligner is used (FFT-NS-2). The latter is also the program that is run by default if MAFFT is called with no flags set. For very large data sets, the --parttree flag must be set on the command line and a very fast guide tree calculation is then used. The results for the BAliBASE benchmark tests are shown in Table I. BAliBASE is divided into six ‘references.' Average scores are given for each reference, along with total run times and average total column (TC) scores, which give the proportion of the total alignment columns that is recovered. A score of 1.0 indicates perfect agreement with the benchmark. There are two rows for the MAFFT package: MAFFT (auto) and MAFFT default. In most (203 out of 218) BAliBASE test cases, the number of sequences is small and the script runs L-INS-i, which is the slow accurate program that uses the consistency heuristic (Notredame et al, 2000) that is also used by MSAprobs (Liu et al, 2010), Probalign, Probcons (Do et al, 2005) and T-Coffee. 
These programs are all restricted to small numbers of sequences but tend to give accurate alignments. This is clearly reflected in the times and average scores in Table I. The times range from 25 min up to 22 h for these packages and the accuracies range from 55 to 61% of columns correct. Clustal Omega only takes 9 min for the same runs but has an accuracy level that is similar to that of Probcons and T-Coffee. The rest of the table is mainly taken by the programs that use progressive alignment. Some of these are very fast but this speed is matched by a considerable drop in accuracy compared with the consistency-based programs and Clustal Omega. The weakest program here, is Clustal W (Larkin et al, 2007) followed by PRANK (Löytynoja and Goldman, 2008). PRANK is not designed for aligning distantly related sequences but at giving good alignments for phylogenetic work with special attention to gaps. These gap positions are not included in these tests as they tend not to be structurally conserved. Dialign (Morgenstern et al, 1998) does not use consistency or progressive alignment but is based on finding best local multiple alignments. FSA (Bradley et al, 2009) uses sampling of pairwise alignments and ‘sequence annealing' and has been shown to deliver good nucleotide sequence alignments in the past. The Prefab benchmark test results are shown in Table II. Here, the results are divided into five groups according to the percent identity of the sequences. The overall scores range from 53 to 73% of columns correct. The consistency-based programs MSAprobs, MAFFT L-INS-i, Probalign, Probcons and T-Coffee, are again the most accurate but with long run times. Clustal Omega is close to the consistency programs in accuracy but is much faster. There is then a gap to the faster progressive based programs of MUSCLE, MAFFT, Kalign (Lassmann and Sonnhammer, 2005) and Clustal W. Results from testing large alignments with up to 50 000 sequences are given in Table III using HomFam. 
Here, each alignment is made up of a core of a Homstrad (Mizuguchi et al, 1998) structure-based alignment of at least five sequences. These sequences are then inserted into a test set of sequences from the corresponding, homologous, Pfam domain. This gives very large sets of sequences to be aligned but the testing is only carried out on the sequences with known structures. Only some programs are able to deliver alignments at all, with data sets of this size. We restricted the comparisons to Clustal Omega, MAFFT, MUSCLE and Kalign. MAFFT with default settings, has a limit of 20 000 sequences and we only use MAFFT with --parttree for the last section of Table III. MUSCLE becomes increasingly slow when you get over 3000 sequences. Therefore, for >3000 sequences we used MUSCLE with the faster but less accurate setting of –maxiters 2, which restricts the number of iterations to two. Overall, Clustal Omega is easily the most accurate program in Table III. The run times show MAFFT default and Kalign to be exceptionally fast on the smaller test cases and MAFFT --parttree to be very fast on the biggest families. Clustal Omega does scale well, however, with increasing numbers of sequences. This scaling is described in more detail in the Supplementary Information. We do have two further test cases with >50 000 sequences, but it was not possible to get results for these from MUSCLE or Kalign. These are described in the Supplementary Information as well. Table III gives overall run times for the four programs evaluated with HomFam. Figure 1 resolves these run times case by case. Kalign is very fast for small families but does not scale as well. Overall, MAFFT is faster than the other programs over all test case sizes but Clustal Omega scales similarly. Points in Figure 1 represent different families with different average sequence lengths and pairwise identities. Therefore, the scalability trend is fuzzy, with larger dots occurring generally above smaller dots. 
Supplementary Figure S3 shows scalability data, where subsets of increasing size are sampled from one large family only. This reduces variability in pairwise identity and sequence length. External profile alignment Clustal Omega can read extra information from a profile HMM derived from preexisting alignments. For example, if a user wishes to align a set of globin sequences and has an existing globin alignment, this alignment can be converted to a profile HMM and used as well as the sequence input file. This HMM is here referred to as an ‘external profile' and its use in this way as ‘external profile alignment' (EPA). During EPA, each sequence in the input set is aligned to the external profile. Pseudocount information from the external profile is then transferred, position by position, to the input sequence. Ideally, this would be used with large curated alignments of particular proteins or domains of interest such as are used in metagenomics projects. Rather than taking the input sequences and aligning them from scratch, every time new sequences are found, the alignment should be carefully maintained and used as an external profile for EPA. Clustal Omega also can align sequences to existing alignments using conventional alignment methods. Users can add sequences to an alignment, one by one or align a set of aligned sequences to the alignment. In this paper, we demonstrate the EPA approach with two examples. First, we take the 94 HomFam test cases from the previous section and use the corresponding Pfam HMM for EPA. Before EPA, the average accuracy for the test cases was 0.627 of correctly aligned Homstrad positions but after EPA it rises to 0.653. This is plotted, test case for test case in Figure 2A. Each dot is one test case with the TC score for Clustal Omega plotted against the score using EPA. The second example is illustrated in Figure 2B. 
Here, we take all the BAliBASE reference sets and align them as normal using Clustal Omega and obtain the benchmark result of 0.554 of columns correctly aligned, as already reported in Table I. For EPA, we use the benchmark reference alignments themselves as external profiles. The results now jump to 0.857 of columns correct. This is a jump of over 30% and while it is not a valid measure of Clustal Omega accuracy for comparison with other programs, it does illustrate the potential power of EPA to use information in external alignments. Iteration EPA can also be used in a simple iteration scheme. Once a MSA has been made from a set of input sequences, it can be converted into a HMM and used for EPA to help realign the input sequences. This can also be combined with a full recalculation of the guide tree. In Figure 3, we show the results of one and two iterations on every test case from HomFam. The graph is plotted as a running average TC score for all test cases with N or fewer test cases where N is plotted on the horizontal axis using a log scale. With some smaller test cases, iteration actually has a detrimental effect. Once you get near 1000 or more sequences, however, a clear trend emerges. The more sequences you have, the more beneficial the effect of iteration is. With bigger test cases, it becomes more and more beneficial to apply two iterations. This result confirms the usefulness of EPA as a general strategy. It also confirms the difficulty in aligning extremely large numbers of sequences but gives one partial solution. It also gives a very simple but effective iteration scheme, not just for guide tree iteration, as used in many packages, but for iteration of the alignment itself. Discussion The main breakthroughs since the mid 1980s in MSA methods have been progressive alignment and the use of consistency. Otherwise, most recent work has concerned refinements for speed or accuracy on benchmark test sets. 
The speed increases have been dramatic but, with just two major exceptions, the methods are still basically O(N^2) and incapable of being extended to data sets of >10 000 sequences. The two exceptions are mBed, used here, and MAFFT PartTree. PartTree is faster but at the expense of accuracy, at least as judged by the benchmarking here. The second group of recent developments has concerned accuracy. This has tended to focus on results from benchmarking, a potentially contentious issue (Aniba et al, 2010; Edgar, 2010). The benchmark test sets that we have are limited in scope and heavily biased toward single domain globular proteins. This has the potential to lead to methods that behave well on benchmarks but which are not so flexible or useful in real-world situations. One development to improve accuracy has been the recruitment of extra homologs to bulk up input data sets. This seems to work well with the consistency-based methods and for small data sets. It appears, however, that there is a limit to the extra accuracy that can be obtained this way, without further development. The extra sequences may also bring in noise and dramatically increase the complexity of the computational problem. This can be partly fixed by iteration, but EPA to a high-quality reference alignment might be a better solution. This also raises the need for methods to visualize such large alignments, in order to detect problems. A second major focus for development has been the use of external information such as RNA structure (Wilm et al, 2008) or protein structure predictions (Pirovano et al, 2008). EPA is a new approach that allows users to exploit information in their own or in publicly available alignments. It does not force new sequences to follow the older alignment exactly. 
The new sequences get aligned to each other using progressive alignment but the information in the external profile can help provide information as to which amino acids are most likely to occur at each position in a sequence. Most methods attempt to predict this from general models of protein evolution with secondary structure prediction as a refinement. In this paper, we have shown that even using the mass produced alignments from Pfam as external profiles provides a small increase in accuracy for a large general set of test cases. This opens up a new set of possibilities for users to make use of the information contained in large, publicly available alignments and creates an incentive for database providers to make very high-quality alignments available. One of the reasons for the great success of Clustal X was the very user-friendly graphical user interface (GUI). This, however, is not as critical as in the past due to the widespread availability of web-based services where the GUI is provided by the web-based front-end server. Further, there are several very high-quality alignment viewers and editors such as Jalview (Clamp et al, 2004) and Seaview (Gouy et al, 2010) that read Clustal Omega output or which can call Clustal Omega directly. Materials and methods Clustal Omega is licensed under the GNU Lesser General Public License. Source code as well as precompiled binaries for Linux, FreeBSD, Windows and Mac (Intel and PowerPC) are available at http://www.clustal.org. Clustal Omega is available as a command line program only, which uses GNU-style command line options, and also accepts ClustalW-style command options for backwards compatibility and easy integration into existing pipelines. Clustal Omega is written in C and C++ and makes use of a number of excellent free software packages. We used a modified version of Sean Eddy's Squid library (http://selab.janelia.org/software.html) for sequence I/O, allowing the use of a wide variety of file formats. 
We use David Arthur's k-means++ code (Arthur and Vassilvitskii, 2007) for fast clustering of sequence vectors. Code for fast UPGMA and guide tree handling routines was adopted from MUSCLE (Edgar, 2004). We use the OpenMP library to enable multithreaded computation of pairwise distances and alignment match states. The documentation for Clustal Omega's API is part of the source code, and in addition is available from http://www.clustal.org/omega/clustalo-api/. Full details of all algorithms are given in the accompanying Supplementary Information. The benchmarks that were used were BAliBASE 3 (Thompson et al, 2005), PREFAB 4.0 (posted March 2005) (Edgar, 2010) and a newly constructed data set (HomFam) using sequences from Pfam (version 25) and Homstrad (as of 2011-06-13) (Mizuguchi et al, 1998). The programs that were compared can be obtained from:
          • ClustalW2, v2.1 (http://www.clustal.org)
          • DIALIGN 2.2.1 (http://dialign.gobics.de/)
          • FSA 1.15.5 (http://sourceforge.net/projects/fsa/)
          • Kalign 2.04 (http://msa.sbc.su.se/cgi-bin/msa.cgi)
          • MAFFT 6.857 (http://mafft.cbrc.jp/alignment/software/source.html)
          • MSAProbs 0.9.4 (http://sourceforge.net/projects/msaprobs/files/)
          • MUSCLE version 3.8.31 posted 1 May 2010 (http://www.drive5.com/muscle/downloads.htm)
          • PRANK v.100802, 2 August 2010 (http://www.ebi.ac.uk/goldman-srv/prank/src/prank/)
          • Probalign v1.4 (http://cs.njit.edu/usman/probalign/)
          • PROBCONS version 1.12 (http://probcons.stanford.edu/download.html)
          • T-Coffee Version 8.99 (http://www.tcoffee.org/Projects_home_page/t_coffee_home_page.html#DOWNLOAD)
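
The embedding step described in the Introduction of this entry (each sequence is replaced by a vector of its distances to n reference sequences, with n proportional to log N) can be illustrated with a short sketch. This is not Clustal Omega's or mBed's code: the k-mer Jaccard distance, the use of the first n sequences as references, and all function names are assumptions chosen to keep the example self-contained.

```python
import math

def kmer_set(seq, k=2):
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def kmer_distance(a, b, k=2):
    """Jaccard distance between k-mer sets (an assumed stand-in distance)."""
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    union = ka | kb
    if not union:
        return 0.0
    return 1.0 - len(ka & kb) / len(union)

def embed(seqs, k=2):
    """Replace each sequence by its distances to n reference sequences.

    n grows with log2(N); the references are simply the first n sequences,
    which is an assumption made for illustration only.
    """
    if not seqs:
        return []
    n_refs = max(1, min(len(seqs), math.ceil(math.log2(len(seqs)))))
    refs = seqs[:n_refs]
    return [[kmer_distance(s, r, k) for r in refs] for s in seqs]

# Hypothetical toy input: three similar sequences and one unrelated one.
seqs = ["MKVLAAGIV", "MKVLAAGLV", "MKILSAGIV", "QQWERTYPP"]
vectors = embed(seqs)
for s, v in zip(seqs, vectors):
    print(s, [round(x, 2) for x in v])
```

Only N × n distances are computed rather than all N × N pairs, which is where the O(N log N) behaviour quoted in the text comes from. The resulting vectors could then be handed to any standard clustering routine, such as k-means or UPGMA, to build a guide tree.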

            Disability-adjusted life years (DALYs) for 291 diseases and injuries in 21 regions, 1990-2010: a systematic analysis for the Global Burden of Disease Study 2010.

            Measuring disease and injury burden in populations requires a composite metric that captures both premature mortality and the prevalence and severity of ill-health. The 1990 Global Burden of Disease study proposed disability-adjusted life years (DALYs) to measure disease burden. No comprehensive update of disease burden worldwide incorporating a systematic reassessment of disease and injury-specific epidemiology has been done since the 1990 study. We aimed to calculate disease burden worldwide and for 21 regions for 1990, 2005, and 2010 with methods to enable meaningful comparisons over time. We calculated DALYs as the sum of years of life lost (YLLs) and years lived with disability (YLDs). DALYs were calculated for 291 causes, 20 age groups, both sexes, and for 187 countries, and aggregated to regional and global estimates of disease burden for three points in time with strictly comparable definitions and methods. YLLs were calculated from age-sex-country-time-specific estimates of mortality by cause, with death by standardised lost life expectancy at each age. YLDs were calculated as prevalence of 1160 disabling sequelae, by age, sex, and cause, and weighted by new disability weights for each health state. Neither YLLs nor YLDs were age-weighted or discounted. Uncertainty around cause-specific DALYs was calculated incorporating uncertainty in levels of all-cause mortality, cause-specific mortality, prevalence, and disability weights. Global DALYs remained stable from 1990 (2·503 billion) to 2010 (2·490 billion). Crude DALYs per 1000 decreased by 23% (472 per 1000 to 361 per 1000). An important shift has occurred in DALY composition with the contribution of deaths and disability among children (younger than 5 years of age) declining from 41% of global DALYs in 1990 to 25% in 2010. 
YLLs typically account for about half of disease burden in more developed regions (high-income Asia Pacific, western Europe, high-income North America, and Australasia), rising to over 80% of DALYs in sub-Saharan Africa. In 1990, 47% of DALYs worldwide were from communicable, maternal, neonatal, and nutritional disorders, 43% from non-communicable diseases, and 10% from injuries. By 2010, this had shifted to 35%, 54%, and 11%, respectively. Ischaemic heart disease was the leading cause of DALYs worldwide in 2010 (up from fourth rank in 1990, increasing by 29%), followed by lower respiratory infections (top rank in 1990; 44% decline in DALYs), stroke (fifth in 1990; 19% increase), diarrhoeal diseases (second in 1990; 51% decrease), and HIV/AIDS (33rd in 1990; 351% increase). Major depressive disorder increased from 15th to 11th rank (37% increase) and road injury from 12th to 10th rank (34% increase). Substantial heterogeneity exists in rankings of leading causes of disease burden among regions. Global disease burden has continued to shift away from communicable to non-communicable diseases and from premature death to years lived with disability. In sub-Saharan Africa, however, many communicable, maternal, neonatal, and nutritional disorders remain the dominant causes of disease burden. The rising burden from mental and behavioural disorders, musculoskeletal disorders, and diabetes will impose new challenges on health systems. Regional heterogeneity highlights the importance of understanding local burden of disease and setting goals and targets for the post-2015 agenda taking such patterns into account. Because of improved definitions, methods, and data, these results for 1990 and 2010 supersede all previously published Global Burden of Disease results. Funding: Bill & Melinda Gates Foundation. Copyright © 2012 Elsevier Ltd. All rights reserved.
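
The headline arithmetic in this abstract can be checked with a few lines of code. The sketch below assumes only what the abstract states (DALYs are the sum of YLLs and YLDs; crude rates of 472 and 361 per 1000; global totals of 2.503 and 2.490 billion). The function names and the YLL/YLD inputs in the first call are hypothetical placeholders, not study data.

```python
def dalys(ylls, ylds):
    """DALYs as defined in the abstract: years of life lost plus years lived with disability."""
    return ylls + ylds

def percent_change(old, new):
    """Relative change from old to new, in percent (negative means a decrease)."""
    return (new - old) / old * 100.0

# Hypothetical inputs, used only to show the sum; not taken from the study.
example_total = dalys(ylls=120.0, ylds=80.0)

# Figures quoted in the abstract.
rate_change = percent_change(472, 361)        # crude DALYs per 1000, 1990 vs 2010
global_change = percent_change(2.503, 2.490)  # global DALYs in billions

print(f"example DALYs: {example_total}")
print(f"crude rate change: {rate_change:.1f}%")
print(f"global total change: {global_change:.2f}%")
```

From the rounded rates, the crude-rate change comes out at roughly 23.5%, in line with the reported 23% decrease, and the global total changes by about half a percent, which fits the description of global DALYs as stable.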

              The genome sequence of the food-borne pathogen Campylobacter jejuni reveals hypervariable sequences.

              Campylobacter jejuni, from the delta-epsilon group of proteobacteria, is a microaerophilic, Gram-negative, flagellate, spiral bacterium, properties it shares with the related gastric pathogen Helicobacter pylori. It is the leading cause of bacterial food-borne diarrhoeal disease throughout the world. In addition, infection with C. jejuni is the most frequent antecedent to a form of neuromuscular paralysis known as Guillain-Barré syndrome. Here we report the genome sequence of C. jejuni NCTC11168. C. jejuni has a circular chromosome of 1,641,481 base pairs (30.6% G+C) which is predicted to encode 1,654 proteins and 54 stable RNA species. The genome is unusual in that there are virtually no insertion sequences or phage-associated sequences and very few repeat sequences. One of the most striking findings in the genome was the presence of hypervariable sequences. These short homopolymeric runs of nucleotides were commonly found in genes encoding the biosynthesis or modification of surface structures, or in closely linked genes of unknown function. The apparently high rate of variation of these homopolymeric tracts may be important in the survival strategy of C. jejuni.

                Author and article information

                Journal
                mBio
                American Society for Microbiology (1752 N St., N.W., Washington, DC)
                2150-7511
                17 November 2020
                Nov-Dec 2020
                Volume: 11
                Issue: 6
                Article number: e02604-20
                Affiliations
                [a] Department of Pathogen Biology, London School of Hygiene and Tropical Medicine, London, United Kingdom
                [b] Biomolecular Spectroscopy Centre, King’s College London, Hodgkin, United Kingdom
                Department of Veterinary Medicine
                Author notes
                Address correspondence to Brendan W. Wren, brendan.wren@lshtm.ac.uk.
                Author information
                https://orcid.org/0000-0002-6140-9489
                Article
                Publisher ID: mBio02604-20
                DOI: 10.1128/mBio.02604-20
                PMCID: PMC7683400
                PMID: 33203757
                46033b85-1ed8-405c-99fe-93a9a1f212f9
                Copyright © 2020 Abouelhadid et al.

                This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license.

                History
                Received: 11 September 2020
                Accepted: 7 October 2020
                Page count
                supplementary-material: 5, Figures: 6, Tables: 2, Equations: 0, References: 52, Pages: 19, Words: 11430
                Funding
                Funded by: Wellcome, https://doi.org/10.13039/100004440;
                Award ID: 102978/Z/13/Z
                Categories
                Research Article
                Molecular Biology and Physiology

                Life sciences
                multidrug efflux pump, n-linked glycans, glycosylation
