      Calcium-induced conformational changes of the regulatory domain of human mitochondrial aspartate/glutamate carriers


          Abstract

          The transport activity of human mitochondrial aspartate/glutamate carriers is central to the malate–aspartate shuttle, urea cycle, gluconeogenesis and myelin synthesis. They have a unique three-domain structure, comprising a calcium-regulated N-terminal domain with eight EF-hands, a mitochondrial carrier domain, and a C-terminal domain. Here we present the calcium-bound and calcium-free structures of the N- and C-terminal domains, elucidating the mechanism of calcium regulation. Unexpectedly, EF-hands 4–8 are involved in dimerization of the carrier and form a static unit, whereas EF-hands 1–3 form a calcium-responsive mobile unit. On calcium binding, an amphipathic helix of the C-terminal domain binds to the N-terminal domain, opening a vestibule. In the absence of calcium, the mobile unit closes the vestibule. Opening and closing of the vestibule might regulate access of substrates to the carrier domain, which is involved in their transport. These structures provide a framework for understanding cases of the mitochondrial disease citrin deficiency.

          Abstract

          Human mitochondrial aspartate/glutamate carriers, citrin and aralar, are regulated by calcium. Here, the authors report the dimeric structure of calcium-free and -bound versions of the regulatory domains to elucidate calcium-dependent conformational changes that could regulate access of substrate to the carrier domain.


          Most cited references (49)


          An introduction to data reduction: space-group determination, scaling and intensity statistics

          1. Introduction Estimates of integrated intensities from X-ray diffraction images are not generally suitable for immediate use in structure determination. Theoretically, the measured intensity I_h of a reflection h is proportional to the square of the underlying structure factor, |F_h|², which is the quantity that we want, with an associated measurement error, but systematic effects of the diffraction experiment break this proportionality. Such systematic effects include changes in the beam intensity, changes in the exposed volume of the crystal, radiation damage, bad areas of the detector and physical obstruction of the detector (e.g. by the backstop or cryostream). If data from different crystals (or different sweeps of the same crystal) are being merged, corrections must also be applied for changes in exposure time and rotation rate. In order to infer |F_h|² from I_h, we need to put the measured intensities on the same scale by modelling the experiment and inverting its effects. This is generally performed in a scaling process that makes the data internally consistent by adjusting the scaling model to minimize the difference between symmetry-related observations. This process requires us to know the point-group symmetry of the diffraction pattern, so we need to determine this symmetry prior to scaling. The scaling process produces an estimate of the intensity of each unique reflection by averaging over all of the corrected intensities, together with an estimate of its error σ(I_h). The final stage in data reduction is estimation of the structure amplitude |F_h| from the intensity, which is approximately I_h^(1/2) (but with a skewing factor for intensities that are below or close to background noise, e.g. ‘negative’ intensities); at the same time, the intensity statistics can be examined to detect pathologies such as twinning.
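As a rough sketch of the scale-then-merge step just described — averaging symmetry-related observations into one estimate per unique reflection with a propagated σ — the following minimal Python illustration uses inverse-variance weighting; the function name, data layout and weighting choice are illustrative assumptions, not the actual CCP4 implementation:

```python
# Hypothetical sketch of merging symmetry-equivalent intensity observations
# after scaling, using inverse-variance weighting. All names are illustrative.
from collections import defaultdict

def merge_intensities(observations):
    """observations: iterable of (hkl, I, sigma), with hkl already mapped
    to the asymmetric unit. Returns {hkl: (I_merged, sigma_merged)}."""
    groups = defaultdict(list)
    for hkl, i_obs, sigma in observations:
        groups[hkl].append((i_obs, sigma))
    merged = {}
    for hkl, obs in groups.items():
        # weight each observation by 1/sigma^2; merged sigma from summed weights
        weights = [1.0 / s ** 2 for _, s in obs]
        i_mean = sum(w * i for w, (i, _) in zip(weights, obs)) / sum(weights)
        sigma_mean = (1.0 / sum(weights)) ** 0.5
        merged[hkl] = (i_mean, sigma_mean)
    return merged
```

Merging two equally precise observations of the same reflection, for example, returns their mean with a σ reduced by a factor of √2.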
This paper presents a brief overview of how to run CCP4 programs for data reduction through the CCP4 graphical interface ccp4i and points out some issues that need to be considered. No attempt is made to be comprehensive nor to provide full references for everything. Automated pipelines such as xia2 (Winter, 2010) are often useful and generally work well, but sometimes in difficult cases finer control is needed. In the current version of ccp4i (CCP4 release 6.1.3) the ‘Data Reduction’ module contains two major relevant tasks: ‘Find or Match Laue Group’, which determines the crystal symmetry, and ‘Scale and Merge Intensities’, which outputs a file containing averaged structure amplitudes. Future GUI versions may combine these steps into a simplified interface. Much of the advice given here is also present in the CCP4 wiki (http://www.ccp4wiki.org/). 2. Space-group determination The true space group is only a hypothesis until the structure has been solved, since it can be hard to distinguish between exact crystallographic symmetry and approximate noncrystallographic symmetry. However, it is useful to find the likely symmetry early on in the structure-determination pipeline, since it is required for scaling and indeed may affect the data-collection strategy. The program POINTLESS (Evans, 2006) examines the symmetry of the diffraction pattern and scores the possible crystallographic symmetry. Indexing in the integration program (e.g. MOSFLM) only indicates the lattice symmetry, i.e. the geometry of the lattice giving constraints on the cell dimensions (e.g. α = β = γ = 90° for an orthorhombic lattice), but such relationships can arise accidentally and may not reflect the true symmetry. For example, a primitive hexagonal lattice may belong to point groups 3, 321, 312, 6, 622 or indeed lower symmetry (C222, 2 or 1).
A rotational axis of symmetry produces identical true intensities for reflections related by that axis, so examination of the observed symmetry in the diffraction pattern allows us to determine the likely point group and hence the Laue group (a point group with added Friedel symmetry) and the Patterson group (with any lattice centring): note that the Patterson group is labelled ‘Laue group’ in the output from POINTLESS. Translational symmetry operators that define the space group (e.g. the distinction between a pure dyad and a screw dyad) are only visible in the observed diffraction pattern as systematic absences, along the principal axes for screws, and these are less reliable indicators since there are relatively few axial reflections in a full three-dimensional data set and some of these may be unrecorded. The protocol for determination of space group in POINTLESS is as follows. (i) From the unit-cell dimensions and lattice centring, find the highest compatible lattice symmetry within some tolerance, ignoring any input symmetry information. (ii) Score each potential rotational symmetry element belonging to the lattice symmetry using all pairs of observations related by that element. (iii) Score combinations of symmetry elements for all possible subgroups of the lattice-symmetry group (Laue or Patterson groups). (iv) Score possible space groups from axial systematic absences (the space group is not needed for scaling but is required later for structure solution). (v) Scores for rotational symmetry operations are based on correlation coefficients rather than R factors, since they are less dependent on the unknown scales. A probability is estimated from the correlation coefficient, using equivalent-size samples of unrelated observations to estimate the width of the probability distribution (see Appendix A ). 2.1. 
A simple example. POINTLESS may be run from the ‘Data Reduction’ module of ccp4i with the task ‘Find or Match Laue Group’ or from the ‘QuickSymm’ option of the iMOSFLM interface (Battye et al., 2011). Unless the space group is known from previous crystals, the appropriate major option is ‘Determine Laue group’. To use this, fill in the boxes for the title, the input and output file names and the project, crystal and data-set names (if not already set in MOSFLM). Table 1 shows the results for a straightforward example in space group P212121. Table 1(a) shows the scores for the three possible dyad axes in the orthorhombic lattice, all of which are clearly present. Combining these (Table 1(b)) shows that the Laue group is mmm with a primitive lattice, Patterson group Pmmm. Fourier analysis of systematic absences along the three principal axes shows that all three have alternating strong (even) and weak (odd) intensities (Fig. 1 and Table 1(c)), so are likely to be screw axes, implying that the space group is P212121. However, there are only three h00 reflections recorded along the a* axis, so confidence in the space-group assignment is not as high as the confidence in the Laue-group assignment (Table 1(d)). With so few observations along this axis, it is impossible to be confident that P212121 is the true space group rather than P22121. 2.2. A pseudo-cubic example Table 2 shows the scores for individual symmetry elements for a pseudo-cubic case with a ≃ b ≃ c. It is clear that only the orthorhombic symmetry elements are present: these are the high-scoring elements marked ‘***’. Neither the fourfolds characteristic of tetragonal groups nor the body-diagonal threefolds (along 111 etc.) characteristic of cubic groups are present. The joint probability score for the Laue group Pmmm is 0.989. The suggested solution (not shown) interchanges k and l to give the conventional axis order. […] greater than 1 if the anomalous differences are on average greater than their error.
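The per-element scoring idea described above — correlating pairs of observations related by each candidate symmetry operator — can be sketched roughly as follows. This is a bare-bones illustration of the principle, not the actual POINTLESS algorithm, and all names are invented:

```python
# Sketch of scoring one candidate rotational symmetry element by the
# correlation coefficient between operator-related intensity observations.
def pearson_cc(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def score_element(pairs):
    """pairs: list of (I1, I2) intensities related by the candidate operator.
    A high CC suggests the operator is a true symmetry element."""
    i1 = [p[0] for p in pairs]
    i2 = [p[1] for p in pairs]
    return pearson_cc(i1, i2)
```

A true symmetry element gives a CC near 1 for its related pairs; comparing against the CC of equal-sized samples of unrelated observations (as the text describes) turns this into a probability.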
Another way of detecting a significant anomalous signal is to compare the two estimates of ΔI_anom from random half data sets, ΔI_1 and ΔI_2 (provided there are at least two measurements of each, i.e. a multiplicity of roughly 4). Figs. 5(b) and 5(f) show the correlation coefficient between ΔI_1 and ΔI_2 as a function of resolution: Fig. 5(f) shows little statistical significance beyond about 4.5 Å resolution. Figs. 5(c) and 5(g) show scatter plots of ΔI_1 against ΔI_2: this plot is elongated along the diagonal if there is a large anomalous signal, and this can be quantified as the ‘r.m.s. correlation ratio’, which is defined as (root-mean-square deviation along the diagonal)/(root-mean-square deviation perpendicular to the diagonal) and is shown as a function of resolution in Figs. 5(d) and 5(h). The plots against resolution give a suggestion of where the data might be cut for substructure determination, but it is important to note that useful albeit weak phase information extends well beyond the point at which these statistics show a significant signal. 5. Estimation of amplitude |F| from intensity I If we knew the true intensity J we could just take the square root, |F| = J^(1/2). However, measured intensities have an error, so a weak intensity may well be measured as negative (i.e. below background); indeed, multiple measurements of a true intensity of zero should be equally often positive and negative. This is one reason why, when possible, it is better to use I rather than |F| in structure determination and refinement. The ‘best’ (most likely) estimate of |F| is larger than I^(1/2) for weak intensities, since we know |F| > 0, but |F| = I^(1/2) is a good estimate for stronger intensities, roughly those with I > 3σ(I). The program TRUNCATE and its newer version CTRUNCATE estimate |F| from I and σ(I) as the expectation 〈|F|〉 = ∫ J^(1/2) p(J|I) dJ, where the posterior p(J|I) ∝ exp[−(J − I)²/2σ²(I)] p(J) and the prior probability of the true intensity p(J) is estimated from the average intensity in the same resolution range (French & Wilson, 1978). 6.
Intensity statistics and crystal pathologies At the end stage of data reduction, after scaling and merging, the distribution of intensities and its variation with resolution can indicate problems with the data, notably twinning (see, for example, Lebedev et al., 2006; Zwart et al., 2008). The simplest expected intensity statistics as a function of resolution s = sinθ/λ arise from assuming that atoms are randomly placed in the unit cell, in which case 〈I〉(s) = 〈FF*〉(s) = Σ_j g(j, s)², where g(j, s) is the scattering from the jth atom at resolution s. This average intensity falls off with resolution mainly because of atomic motions (B factors). If all atoms were equal and had equal B factors, then 〈I〉(s) = C exp(−2Bs²) and the ‘Wilson plot’ of log[〈I〉(s)] against s² would be a straight line of slope −2B. The Wilson plot for proteins shows peaks at ∼10 and 4 Å and a dip at ∼6 Å arising from the distribution of interatomic spacings in polypeptides (fewer atoms 6 Å apart than 4 Å apart), but the slope at higher resolution does give an indication of the average B factor and an unusual shape can indicate a problem (e.g. 〈I〉 increasing at the outer limit, spuriously large 〈I〉 owing to ice rings etc.). For detection of crystal pathologies we are not so interested in resolution dependence, so we can use normalized intensities Z = I/〈I〉(s) ≃ |E|², which are independent of resolution and should ideally be corrected for anisotropy (as is performed in CTRUNCATE). Two useful statistics on Z are plotted by CTRUNCATE: the moments of Z as a function of resolution and its cumulative distribution. While 〈Z〉(s) = 1.0 by definition, its second moment 〈Z²〉(s) (equivalent to the fourth moment of E) is >1.0 and is larger if the distribution of Z is wider. The ideal value of 〈E⁴〉 is 2.0, but it will be smaller for the narrower intensity distribution from a merohedral twin (too few weak reflections), equal to 1.5 for a perfect twin, and larger if there are too many weak reflections, e.g.
from a noncrystallographic translation which leads to a whole class of reflections being weak. The cumulative distribution plot of N(z), the fraction of reflections with Z < z, is another useful indicator: twinned data have too few weak reflections. A related test uses the statistic L = (I_1 − I_2)/(I_1 + I_2) for pairs of unrelated reflections: for untwinned data N(|L|) = |L|, and N(|L|) = |L|(3 − L²)/2 for a perfect twin. This test seems to be largely unaffected by anisotropy or translational noncrystallographic symmetry, which may affect tests on Z. The calculation of Z = I/〈I〉(s) depends on using a suitable value for 〈I〉(s), and noncrystallographic translations or uncorrected anisotropy lead to the use of an inappropriate value for 〈I〉(s). These statistical tests are all unweighted, so it may be better to exclude weak high-resolution data or to examine the resolution dependence of, for example, the moments of Z (or possibly L). It is also worth noting that fewer weak reflections than expected may arise from unresolved closely spaced spots along a long real-space axis, so that weak reflections are contaminated by neighbouring strong reflections, thus mimicking the effect of twinning. 7. Summary: questions and decisions In the process of data reduction, a number of decisions need to be taken either by the programs or by the user. The main questions and considerations are as follows. (i) What is the point group or Laue group? This is usually unambiguous, but pseudosymmetry may confuse the programs and the user. Close examination of the scores for individual symmetry elements from POINTLESS may suggest lower symmetry groups to try. (ii) What is the space group? Distinction between screw axes and pure rotations from axial systematic absences is often unreliable and it is generally a good idea to try all the likely space groups (consistent with the Laue group) in the key structure-solution step: either molecular-replacement searches or substructure searches in experimental phasing. For example, in a primitive orthorhombic system the eight possible groups P2x2x2x should be tried.
This has the added advantage of providing some negative controls on the success of the structure solution. (iii) Is there radiation damage: should data collected after the crystal has had a high dose of radiation be ignored (possibly at the expense of resolution)? Cutting back data from the end may reduce completeness, and the optimum trade-off is hard to choose. (iv) What is the best resolution cutoff? An appropriate choice of resolution cutoff is difficult and sometimes seems to be performed mainly to satisfy referees. On the one hand, cutting back too far risks excluding data that do contain some useful information. On the other hand, extending the resolution further makes all statistics look worse and may in the end degrade maps. The choice is perhaps not as important as is sometimes thought: maps calculated with slightly different resolution cutoffs are almost indistinguishable. (v) Is there an anomalous signal detectable in the intensity statistics? Note that a weak anomalous signal may still be useful even if it is not detectable in the statistics. The statistics do give a good guide to a suitable resolution limit for location of the substructure, but the whole resolution range should be used in phasing. (vi) Are the data twinned? Highly twinned data sets can be solved by molecular replacement and refined, but probably cannot be solved by experimental phasing methods. Partially twinned data sets can often be solved by ignoring the twinning and then refined as a twin. (vii) Is this data set better or worse than those previously collected? One of the best things to do with a bad data set is to throw it away in favour of a better one. With modern synchrotrons, data collection is so fast that we usually have the freedom to collect data from several equivalent crystals and choose the best.
In most cases the data-reduction process is straightforward, but in difficult cases critical examination of the results may make the difference between solving and not solving the structure.
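Two of the computations described in this excerpt can be sketched in a few lines of Python: the French & Wilson-style amplitude estimate of §5 (here with an assumed acentric Wilson prior p(J) ∝ exp(−J/〈I〉) and direct numerical integration) and the twinning statistics of §6 (the second moment of Z and the cumulative N(|L|) test). Function names and numerical choices are illustrative, not the CTRUNCATE implementation:

```python
import math

def best_amplitude(i_obs, sigma, mean_i, n=20000):
    """French & Wilson-style <|F|> = integral of J^(1/2) p(J|I) dJ,
    with posterior p(J|I) proportional to exp(-(J-I)^2/2sigma^2) * p(J)
    and an assumed acentric prior p(J) proportional to exp(-J/mean_i)."""
    j_max = max(i_obs, 0.0) + 10.0 * sigma     # integrate well past the peak
    dj = j_max / n
    num = den = 0.0
    for k in range(1, n + 1):
        j = k * dj                             # trial true intensity J > 0
        post = math.exp(-(j - i_obs) ** 2 / (2.0 * sigma ** 2)) * math.exp(-j / mean_i)
        num += math.sqrt(j) * post
        den += post
    return num / den

def second_moment_z(intensities):
    """<Z^2> for one resolution shell: ~2.0 untwinned acentric, ~1.5 perfect twin."""
    mean_i = sum(intensities) / len(intensities)
    return sum((i / mean_i) ** 2 for i in intensities) / len(intensities)

def l_statistics(pairs):
    """|L| = |I1 - I2| / (I1 + I2) for pairs of unrelated reflections."""
    return [abs(i1 - i2) / (i1 + i2) for i1, i2 in pairs if i1 + i2 > 0]

def cumulative_n(l_values, l):
    """Observed N(|L|): fraction of |L| values <= l. Compare with |L| for
    untwinned data and |L|(3 - L^2)/2 for a perfect twin."""
    return sum(1 for v in l_values if v <= l) / len(l_values)
```

For a strong reflection best_amplitude returns nearly I^(1/2), while for a negative measured intensity it still returns a small positive amplitude, which is the point of the French & Wilson treatment.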

            Automated structure solution with autoSHARP.

            We present here the automated structure solution pipeline "autoSHARP." It is built around the heavy-atom refinement and phasing program SHARP, the density modification program SOLOMON, and the ARP/wARP package for automated model building and refinement (using REFMAC). It allows fully automated structure solution, from merged reflection data to an initial model, without any user intervention. We describe and discuss the preparation of the user input, the data flow through the pipeline, and the various results obtained throughout the procedure.

              Maximum-likelihood density modification

              1. Introduction The phase information obtained from experimental measurements on macromolecules using either multiple isomorphous replacement or multiwavelength anomalous diffraction is often insufficient by itself for constructing an electron-density map useful for model building and interpretation. Many density-modification methods have been developed in recent years for improving the quality of electron-density maps by incorporation of prior knowledge about the features expected in these maps when they are obtained at high or moderate resolution (2–4 Å). Among the most powerful of these methods are solvent flattening, non-crystallographic symmetry averaging, histogram matching, phase extension, molecular replacement, entropy maximization and iterative model building (Abrahams, 1997; Bricogne, 1984, 1988; Cowtan & Main, 1993, 1996; Giacovazzo & Siliqi, 1997; Goldstein & Zhang, 1998; Gu et al., 1997; Lunin, 1993; Perrakis et al., 1997; Podjarny et al., 1987; Prince et al., 1988; Refaat et al., 1996; Roberts & Brünger, 1995; Rossmann & Arnold, 1993; Vellieux et al., 1995; Wilson & Agard, 1993; Xiang et al., 1993; Zhang & Main, 1990; Zhang, 1993; Zhang et al., 1997). The fundamental basis of density-modification methods is that there are many possible sets of structure-factor amplitudes and phases that are all reasonably probable based on the limited experimental data, and those structure factors that lead to maps that are most consistent with both the experimental data and the prior knowledge are the most likely overall. In these methods, the choice of prior information that is to be used and the procedure for combining prior information about electron density with experimentally derived phase information are crucial.
Until recently, density modification, the combination of knowledge about expected features of an electron-density map with experimental phase information, has generally been carried out in a two-step procedure that is iterated until convergence. In the first step, an electron-density map obtained experimentally is modified in real space in order to make it consistent with expectations. This can consist of flattening solvent regions, averaging non-crystallographic symmetry-related regions or histogram matching, for example. In the second step, phases are calculated from the modified map and are combined with the experimental phases to form a new phase set. The disadvantage of this real-space modification approach is that it is not at all clear how to weight the observed phases with those obtained from the modified map. This is a consequence of the fact that the modified map contains some of the same information as the original map and some new information. This difficulty has been recognized for a long time and a number of approaches have been designed to improve the relative weighting from these two sources, recently including the use of maximum-entropy methods and the use of weighting optimized using cross-validation (Xiang et al., 1993; Roberts & Brünger, 1995; Cowtan & Main, 1996) and ‘solvent flipping’ (Abrahams, 1997). 2. Density modification by reciprocal-space-based likelihood optimization We have recently developed a very different approach to combining experimental phase information with prior knowledge about expected electron-density distributions in maps. Our approach is based on maximization of a combined likelihood function (Terwilliger, 1999).
The fundamental idea is to express our knowledge about the probability of a set of structure factors {F_h} in terms of two quantities: (i) the likelihood of having measured the observed set of structure factors if this structure-factor set were correct and (ii) the likelihood that the map resulting from this structure-factor set {F_h} is consistent with our prior knowledge about this and other macromolecular structures. When set up in this way, the overlap of information that occurred in the real-space modification methods is not present because the experimental and prior information are kept separate. Consequently, proper weighting of experimental and prior information only requires estimates of probability functions for each source of information. The likelihood-based density-modification approach has a second very important advantage. This is that the derivatives of the likelihood functions with respect to individual structure factors can be readily calculated in reciprocal space by FFT-based methods. As a consequence, density modification simply becomes an optimization of a combined likelihood function by adjustment of structure factors. This makes density modification a remarkably simple but powerful approach, only requiring that suitable likelihood functions be constructed for each aspect of prior knowledge that is to be incorporated. We previously showed that such an approach could be applied to solvent flattening and that the resulting algorithm was greatly improved over methods depending on real-space modification and phase recombination (Terwilliger, 1999). Here, we extend the idea of likelihood-based density modification to include prior information on the electron-density distribution from a wide variety of potential sources and demonstrate it on both the electron density in the solvent region and the region occupied by a macromolecule.
First, we describe the mathematics of likelihood-based density modification in a practical formulation that is modified somewhat from the one we used for reciprocal-space solvent flattening (Terwilliger, 1999). We then show how a likelihood function for a map that includes information on both the solvent- and macromolecule-containing regions can be constructed and used. 3. Likelihood-based density modification The basic idea of our likelihood-based density-modification procedure is that there are two key kinds of information about the structure factors for a crystal of a macromolecule. The first is the experimental phase and amplitude information. This can be thought of in terms of a likelihood (or log-likelihood) function LL_OBS(F_h) for each structure factor F_h, where the probability distribution for the structure factor is given by p_OBS(F_h) ∝ exp[LL_OBS(F_h)]. For reflections with accurately measured amplitudes, the chief uncertainty in F_h will be in the phase, while for unmeasured or poorly measured reflections it will be in both phase and amplitude. The second kind of information about structure factors in this formulation is the likelihood of the map resulting from them. For example, for most macromolecular crystals a set of structure factors {F_h} that leads to a map with a flat region corresponding to solvent is more likely to be correct than one that leads to a map with uniform variation everywhere. This map-likelihood function describes the probability that the map obtained from a set of structure factors is compatible with our expectations, p_MAP({F_h}) ∝ exp[LL_MAP({F_h})]. We then combine our two principal sources of information along with any prior knowledge of the structure factors to yield the likelihood of a particular set of structure factors, LL({F_h}) = LL_o({F_h}) + LL_OBS({F_h}) + LL_MAP({F_h}), where LL_o({F_h}) includes any structure-factor information that is known in advance, such as the distribution of intensities of structure factors (Wilson, 1949). 3.1.
Approximating the likelihood function to simplify the procedure In order to maximize the overall likelihood function in (3) we are going to need to know how the map-likelihood function changes in response to changes in structure factors. In the case of the map-likelihood function LL_MAP({F_h}) this can be thought of as two separate relationships: the response of the likelihood function to changes in electron density, and the changes in electron density as a function of changes in structure factors. In principle, the likelihood of a particular map is a complicated function of the electron density over the entire map. Furthermore, the value of any structure factor affects the electron density everywhere in the map. To simplify the mathematics, we explicitly use a low-order approximation to the likelihood function for a map instead of attempting to evaluate the function precisely. As Fourier transformation is a linear process, each reflection contributes independently to the electron density at a given point in the cell. Although the log-likelihood of the electron density might have any form, we expect that for sufficiently small changes in structure factors a first-order approximation to the log-likelihood function would apply and each reflection would also contribute relatively independently to changes in the log-likelihood function. Consequently, we construct a local approximation to the map-likelihood function, neglecting correlations among different points in the map and between reflections, expecting that it might describe reasonably accurately how the likelihood function would vary in response to small changes in structure factors.
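A toy one-dimensional illustration of this local, one-reflection-at-a-time view: with all other structure factors held fixed, scan the phase of a single reflection and score each trial phase with a crude experimental term plus a flat-solvent map term, as in the combined likelihood of (3). The whole setup — cell size, density model, weights — is invented for the demo and is not the paper's algorithm:

```python
import cmath, math

N = 64
# 1-D "crystal": a protein blob in the first half-cell, essentially flat
# (near-zero) solvent in the second half. All values illustrative.
rho_true = [math.exp(-((x - 16) ** 2) / 8.0) for x in range(N)]
F = [sum(rho_true[x] * cmath.exp(-2j * math.pi * h * x / N) for x in range(N))
     for h in range(N)]

h_test, mate = 3, N - 3
phi_true = cmath.phase(F[h_test])
phi_obs = phi_true                  # "experimental" phase (exact for the demo)

# density from every reflection except h_test and its Friedel mate
rho_others = [sum(F[h] * cmath.exp(2j * math.pi * h * x / N)
                  for h in range(N) if h not in (h_test, mate)).real / N
              for x in range(N)]

def total_ll(phi):
    fh = abs(F[h_test]) * cmath.exp(1j * phi)
    # map term: penalize solvent-half density for deviating from zero
    ll_map = -sum((rho_others[x]
                   + 2.0 * (fh * cmath.exp(2j * math.pi * h_test * x / N)).real / N) ** 2
                  for x in range(N // 2, N))
    ll_obs = 2.0 * math.cos(phi - phi_obs)   # crude experimental phase term
    return ll_map + ll_obs

# scan trial phases and keep the best-scoring one
phi_best = max((total_ll(2 * math.pi * k / 90), 2 * math.pi * k / 90)
               for k in range(90))[1]
```

Both terms peak at the true phase here, so the scan recovers it to within the grid spacing; with noisy data the two terms pull against each other and their relative widths do the weighting, which is the point of the likelihood formulation.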
By neglecting correlations among different points in the map, we can write the log-likelihood for the whole electron-density map as the sum of the log-likelihoods of the densities at each point in the map, normalized to the volume of the unit cell and the number of reflections used to construct it (Terwilliger, 1999). Additionally, by treating each reflection as independently contributing to the likelihood function, we can write a local approximation to the log-likelihood of the density at each point. This approximation is given by the sum over all reflections of the first few terms of a Taylor-series expansion around the value obtained with the starting structure factors used in a cycle of density modification, where ΔF_h∥ and ΔF_h⊥ are the differences between F_h and F_h⁰ along the directions of F_h⁰ and iF_h⁰, respectively. Combining (4) and (5), we can write an expression for the map log-likelihood function (6). 3.2. FFT-based calculation of the reciprocal-space derivatives of log-likelihood of electron density LL[ρ(x, {F_h})] The integrals in (6) can be rewritten in a form that is suitable for evaluation by an FFT-based approach. Considering the first integral in (6), we use the chain rule, noting that the derivative of ρ(x) with respect to ΔF_h∥ for a particular index h is given by the corresponding Fourier term. Now we can rearrange and rewrite the first integral in (6) in the form of a sum over reflections whose coefficients a_h are terms in the Fourier transform of LL[ρ(x, {F_h})]. In space groups other than P1, only a unique set of structure factors need be specified to calculate an electron-density map. Taking space-group symmetry into account, (9) can be generalized (Terwilliger, 1999) to read as a corresponding sum over the indices h′, where the indices h′ are all indices equivalent to h owing to space-group symmetry.
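The FFT route to the first-derivative coefficients a_h can be sketched as below, using an assumed Gaussian density log-likelihood LL(ρ) = −ρ²/2σ² and one particular normalization convention; the exact normalizations follow the paper's equations, which are elided in this excerpt:

```python
import cmath, math

N = 64
rho = [math.sin(2 * math.pi * x / N) for x in range(N)]   # current map (illustrative)
sigma = 0.5    # assumed width of the Gaussian density likelihood

# first derivative of the density log-likelihood at every grid point:
# d/drho of -rho^2/(2 sigma^2) is -rho/sigma^2
dll = [-r / sigma ** 2 for r in rho]

# a_h: terms of the (normalized) discrete Fourier transform of dLL/drho;
# under the local approximation these play the role of dLL_MAP/dF_h
a = [sum(dll[x] * cmath.exp(-2j * math.pi * h * x / N) for x in range(N)) / N
     for h in range(N)]
```

One transform thus yields the derivative with respect to every structure factor at once, which is what makes the reciprocal-space formulation cheap; a production version would use an FFT library rather than the explicit O(N²) sum shown here.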
A similar procedure can be used to rewrite the second integral in (6), yielding an expression in which the indices h′ and k′ are each all indices equivalent to h owing to space-group symmetry, and in which the coefficients b_h are again terms in a Fourier transform, this time of the second derivative of the log-likelihood of the electron density. The third and fourth integrals in (6) can be rewritten in a similar way, yielding corresponding expressions. The significance of (4) through (15) is that we now have a simple expression (6) describing how the map-likelihood function LL_MAP({F_h}) varies when small changes are made in the structure factors. Evaluating this expression only requires that we be able to calculate the first and second derivatives of the log-likelihood of the electron density with respect to electron density at each point in the map and carry out an FFT. Furthermore, maximization of the (local) overall likelihood function (3) becomes straightforward, as every reflection is treated independently. It consists simply of adjusting each structure factor to maximize its contribution to the approximation to the likelihood function through (3) to (15). In practice, instead of directly maximizing the overall likelihood function, we use it here to estimate the probability distribution for each structure factor (Terwilliger, 1999) and then integrate this probability distribution over the phase (or phase and amplitude) of the reflection to obtain a weighted mean estimate of the structure factor. Using (3) to (15), the probability distribution for an individual structure factor can be written in terms of the coefficients a_h and b_h given in (10) and (13), where, as above, the indices h′ and k′ are each all indices equivalent to h owing to space-group symmetry. Also as before, ΔF_h∥ and ΔF_h⊥ are the differences between F_h and F_h⁰ along the directions of F_h⁰ and iF_h⁰, respectively.
All the quantities in (16) can be readily calculated once a likelihood function for the electron density and its derivatives are obtained. 4. Likelihood function for an electron-density map with errors A key step in likelihood-based density modification is the decision as to the likelihood function for values of the electron density at a particular location in the map. For the present purpose, an expression for the log-likelihood of the electron density LL[ρ(x, {F_h})] at a particular location x in a map is needed that depends on whether the point x is within the solvent region or the protein region. In general, this function might depend on whether the point satisfies any of a wide variety of conditions, such as being at a certain location in a known fragment of structure or being at a certain distance from some other feature of the map. We discussed previously (Terwilliger, 1999) how one might incorporate information on the environment of x by writing the log-likelihood function as the log of the sum of conditional probabilities dependent on the environment of x, LL[ρ(x)] = log{p_PROT(x) p[ρ(x)|PROT] + p_SOLV(x) p[ρ(x)|SOLV]}, where p_PROT(x) is the probability that x is in the protein region and p[ρ(x)|PROT] is the conditional probability for ρ(x) given that x is in the protein region, and p_SOLV(x) and p[ρ(x)|SOLV] are the corresponding quantities for the solvent region. The probability that x is in the protein or solvent regions is estimated by a modification of the methods of Wang (1985) and Leslie (1987) as described previously (Terwilliger, 1999). If there were more than just solvent and protein regions that identified the environment of each point, then (17) could be modified to include those as well. In developing (13) to (15), the derivatives of the likelihood function for electron density were intended to represent how the likelihood function changed when small changes in one structure factor were made.
Surprisingly, the likelihood function that is most appropriate in this case is not a globally correct one. Instead, it is a likelihood function that represents how the overall likelihood varies in response to small changes in one structure factor, keeping all others constant. To see the difference, consider the electron density in the solvent region of a macromolecular crystal. In an idealized situation with all possible reflections included, the electron density in this region might be exactly equal to a constant. The goal in using (16) is to obtain the relative probabilities of each possible value of a particular unknown structure factor F_h. If all other structure factors were exact, then the globally correct likelihood function for the electron density (zero unless the solvent region is perfectly flat) would identify the correct value of the unknown structure factor. Now suppose we had imperfect phase information. The solvent region would contain a significant amount of noise and its value would no longer be constant. If we used the globally correct likelihood function for the electron density, we would assign zero probability to any value of the structure factor that did not lead to a perfectly flat solvent region. This is clearly unreasonable, because all the other (incorrect) structure factors contribute noise that exists regardless of the value of this structure factor.

This situation is very similar to the one encountered in refinement of macromolecular structures when there is a substantial deficiency in the model. The errors in all the other structure factors in the present discussion correspond to the deficiency in the macromolecular model in the refinement case. The appropriate variance to use as a weighting factor in refinement includes the estimated model error as well as the error in measurement (e.g. Terwilliger & Berendzen, 1996 ▶; Pannu & Read, 1996 ▶).
Similarly, the appropriate likelihood function for the electron density in the present method is one in which the overall uncertainty in the electron density arising from all reflections other than the one being considered is included in the variance. A likelihood function of this kind can be developed using a model in which the electron density arising from all reflections but one is treated as a random variable (Terwilliger & Berendzen, 1996 ▶; Pannu & Read, 1996 ▶). Suppose that the true value of the electron density at x were known and given by ρ_T, and that we have estimates of all the structure factors, each with substantial errors. The expected value of the estimate of this electron density obtained from the current estimates of all the structure factors (ρ_OBS) will be given by 〈ρ_OBS〉 = βρ_T and the expected value of the variance by 〈(ρ_OBS − βρ_T)²〉 = σ_MAP². The factor β represents the expectation that the calculated value of ρ will be smaller than the true value, for two reasons. One is that such an estimate may be calculated using figure-of-merit-weighted estimates of structure factors, which are smaller than the correct ones. The other is that phase error in the structure factors systematically biases the component of each structure factor along the direction of the true structure factor towards smaller values. This is the same effect that leads to the D correction factor in maximum-likelihood refinement (Pannu & Read, 1996 ▶). A probability function for the electron density at this point that is appropriate for assessing the probabilities of values of the structure factor for one reflection can now be written as in (18). In a slightly more complicated case, where the value of ρ_T is not known exactly but rather has an uncertainty σ_T, (18) becomes (19). Finally, in the case where only a probability distribution p(ρ_T) for ρ_T is known, (18) becomes (20).
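Putting the mixture of (17) together with the Gaussian observation model of (18)/(19), a per-point log-likelihood can be sketched as follows. This is a minimal illustration under simplifying assumptions (single-Gaussian conditionals for protein and solvent, a simple additive combination of variances, and invented parameter values), not the function used in the paper:

```python
import numpy as np

def gaussian(x, mu, sig):
    """Normalized Gaussian density."""
    return np.exp(-0.5 * ((x - mu) / sig) ** 2) / (sig * np.sqrt(2.0 * np.pi))

def log_likelihood_density(rho_obs, p_prot, beta, sigma_map,
                           mu_prot, sig_prot, mu_solv, sig_solv):
    """Log-likelihood of the observed density at one grid point.

    Combines the mixture over environments, cf. (17), with a Gaussian
    observation model, cf. (18)/(19): in each region the observed
    density is centred on beta times the expected true density, with
    variance sigma_map**2 plus the scaled spread of true values.
    """
    w_prot = np.sqrt((beta * sig_prot) ** 2 + sigma_map ** 2)
    w_solv = np.sqrt((beta * sig_solv) ** 2 + sigma_map ** 2)
    p_solv = 1.0 - p_prot
    return np.log(p_prot * gaussian(rho_obs, beta * mu_prot, w_prot)
                  + p_solv * gaussian(rho_obs, beta * mu_solv, w_solv))

# Example: at a point that is probably solvent (flat density near zero),
# a small observed density is more likely than a large one.
ll_low = log_likelihood_density(0.0, p_prot=0.2, beta=0.7, sigma_map=0.3,
                                mu_prot=1.0, sig_prot=0.5,
                                mu_solv=0.0, sig_solv=0.1)
ll_high = log_likelihood_density(1.5, p_prot=0.2, beta=0.7, sigma_map=0.3,
                                 mu_prot=1.0, sig_prot=0.5,
                                 mu_solv=0.0, sig_solv=0.1)
```

Because σ_MAP appears in both variances, a noisier map automatically flattens the likelihood, which is exactly the behaviour motivated in the discussion above: no single structure factor is penalized for noise contributed by all the others.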
4.1. Likelihood function for solvent- and macromolecule-containing regions of a map

Using (19) and (20), we are now in a position to use a histogram-based approach (Goldstein & Zhang, 1998 ▶; Lunin, 1993 ▶; Zhang & Main, 1990 ▶) to develop likelihood functions for the solvent region of a map and for the macromolecule-containing region of a map. The approach is simple. The probability distribution for the true electron density in the solvent or macromolecule regions of a crystal structure is obtained from an analysis of model structures and represented as a sum of Gaussian functions of the form given in (21). If the values of β and σ_MAP were known for an experimental map with unknown errors but identified solvent and protein regions, then using (19) we could write the probability distribution for the electron density in each region of the map as in (22), with the appropriate values of β and σ_MAP. In practice, the values of β and σ_MAP are estimated by least-squares fitting of the probability distribution given in (22) to the one found in the experimental map. This procedure has the advantage that the scale of the experimental map does not need to be accurately determined. Then (22) is used with the refined values of β and σ_MAP as the probability function for the electron density in the corresponding region (solvent or macromolecule) of the map.

5. Evaluation of maximum-likelihood density modification with model and real data

To evaluate the utility of maximum-likelihood density modification as described here, we carried out tests using the same model and experimental data that we previously analyzed using reciprocal-space solvent flattening and real-space solvent flattening (Terwilliger, 1999 ▶). The first test case consisted of a set of phases constructed from a model with 32–68% of the volume of the unit cell taken up by protein. The initial effective figure of merit of the phases overall [〈cos(Δϕ)〉] was about 0.40.
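The least-squares estimation of β and σ_MAP described in §4.1 can be illustrated with a toy calculation. Here the true density distribution is a single Gaussian rather than the sum in (21), and a grid search stands in for the least-squares fit; all names and parameter values are illustrative, not those of the actual procedure:

```python
import numpy as np

def fit_beta_sigma(rho_obs, mu_t, sig_t, betas, sigmas):
    """Estimate beta and sigma_map by matching the model distribution
    (a Gaussian true distribution scaled by beta and broadened by
    sigma_map, cf. (22)) to the histogram of the observed density.
    A grid search over the two parameters stands in for least squares.
    """
    hist, edges = np.histogram(rho_obs, bins=60, density=True)
    x = 0.5 * (edges[:-1] + edges[1:])      # bin centres
    best, best_err = (None, None), np.inf
    for b in betas:
        for s in sigmas:
            width = np.sqrt((b * sig_t) ** 2 + s ** 2)
            model = (np.exp(-0.5 * ((x - b * mu_t) / width) ** 2)
                     / (width * np.sqrt(2.0 * np.pi)))
            err = np.sum((hist - model) ** 2)
            if err < best_err:
                best_err, best = err, (b, s)
    return best

# Synthetic map: true density N(1.0, 0.5), damped by beta = 0.7 and
# blurred by map noise with sigma_map = 0.3.
rng = np.random.default_rng(0)
rho_t = rng.normal(1.0, 0.5, size=200000)
rho_obs = 0.7 * rho_t + rng.normal(0.0, 0.3, size=rho_t.size)
beta_hat, sigma_hat = fit_beta_sigma(
    rho_obs, mu_t=1.0, sig_t=0.5,
    betas=np.arange(0.4, 1.01, 0.05), sigmas=np.arange(0.1, 0.61, 0.05))
```

Because only the shape of the histogram is matched, the absolute scale of the map enters only through β, which is the robustness to scaling noted in the text.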
In our previous tests, we showed that both real-space and reciprocal-space solvent flattening improved the quality of phasing considerably. In the current tests, the real-space density modification included both solvent flattening and histogram matching, to be as comparable as possible to the maximum-likelihood density modification we have developed. Table 1 ▶ shows the quality of phases obtained after each method of density modification was applied to this model case. In all cases, maximum-likelihood density modification of this map resulted in phases with an effective figure of merit [〈cos(Δϕ)〉] higher than any of the other methods. When the fraction of solvent in the model unit cell was 50%, for example, maximum-likelihood density modification yielded an effective figure of merit of 0.83, while real-space solvent flattening with histogram matching gave 0.62 and reciprocal-space solvent flattening gave 0.67.

The utility of maximum-likelihood density modification was also compared with real-space density modification and with reciprocal-space solvent flattening using experimental multiwavelength (MAD) data on initiation factor 5A (IF-5A) recently determined in our laboratory (Peat et al., 1998 ▶). IF-5A crystallizes in space group I4, with unit-cell parameters a = 114, b = 114, c = 33 Å, one molecule in the asymmetric unit and a solvent content of about 60%. The structure was solved using MAD phasing based on three Se atoms in the asymmetric unit at a resolution of 2.2 Å. For the purposes of testing density-modification methods, only one of the three selenium sites was used in phasing here, resulting in a starting map with a correlation coefficient of 0.37 to the map calculated from the final refined structure.
The resulting electron-density map was improved by real-space density modification using solvent flattening and histogram matching with dm (Cowtan & Main, 1996 ▶), by real-space density modification using solvent flipping (Abrahams, 1997 ▶) and by maximum-likelihood density modification. The ‘experimental’ map, the dm-modified map and the maximum-likelihood map are shown in Fig. 1 ▶. As anticipated, the real-space modified map obtained with dm is improved over the starting map; it has a correlation coefficient of 0.65. Density modification including solvent flipping yielded a similar improvement, with a correlation coefficient of 0.61 to the model map. The maximum-likelihood modified map was much more substantially improved, with a correlation coefficient of 0.79 to the map based on the refined model.

6. Discussion

We have shown here that a maximum-likelihood approach can be used to carry out density modification on macromolecular crystal structures and that this approach is much more powerful than either conventional density modification based on solvent flattening and histogram matching or our recent reciprocal-space solvent-flattening procedure (Terwilliger, 1999 ▶). The approach works so well because the relative weighting of experimental phase information and of expected electron-density distributions is taken care of automatically, by keeping the two sources of information clearly delineated and by defining suitable probability distributions for each. The maximum-likelihood approach to the improvement of crystallographic phases has been developed extensively by Bricogne and others (e.g. Bricogne, 1984 ▶, 1988 ▶; Lunin, 1993 ▶). The importance of the present work, and of our recent work on reciprocal-space solvent flattening (Terwilliger, 1999 ▶), is that we have developed a simple, effective and general way to carry it out.
Although we have demonstrated only two sources of expected electron-density distributions here (probability distributions for solvent regions and for protein-containing regions), the methods developed here can be applied directly to a wide variety of sources of information. For example, any source of information about the expected electron density at a particular point in the unit cell that can be written in a form such as (22) can be used in our procedure to describe the likelihood that a particular value of electron density is consistent with expectations. Sources of expected electron-density information that are especially suitable for application to our method include non-crystallographic symmetry and knowledge of the locations of fragments of structure in the unit cell.

In the case of non-crystallographic symmetry, the probability distribution for the electron density at one point in the unit cell can be written using (22), with a value of ρ_T equal to the weighted mean of the density at all non-crystallographically equivalent points in the cell; the value of σ_T could be calculated on the basis of their variance and the value of σ_MAP. In the case of knowledge of the locations of fragments in the unit cell, this knowledge can be used to calculate estimates of the electron-density distribution for each point in the neighborhood of a fragment. These electron-density distributions can then in turn be used, just as described above, to estimate ρ_T and σ_T in this region. An iterative process could even be developed, in which fragment locations are identified by cross-correlation or related searches, density modification is applied and additional searches are carried out to further extend the model of the electron density, in an extension of the iterative chain-tracing methods described by Wilson & Agard (1993 ▶).
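As a concrete sketch of the non-crystallographic-symmetry case, ρ_T and σ_T at a point can be estimated from the densities at its NCS-equivalent positions. The function below is illustrative only; the name and the uniform default weighting are assumptions:

```python
import numpy as np

def ncs_target(rho_equiv, weights=None):
    """Estimate rho_T and sigma_T at a grid point from the densities at
    its non-crystallographically equivalent positions, following the
    weighted-mean prescription described in the text."""
    rho_equiv = np.asarray(rho_equiv, dtype=float)
    if weights is None:
        weights = np.ones_like(rho_equiv)   # equal weights by default
    weights = weights / weights.sum()
    rho_t = float(np.sum(weights * rho_equiv))
    # Spread of the equivalent densities about their weighted mean.
    sigma_t = float(np.sqrt(np.sum(weights * (rho_equiv - rho_t) ** 2)))
    return rho_t, sigma_t
```

The pair (ρ_T, σ_T) obtained this way can be inserted directly into (19), so agreement among NCS-related copies tightens the distribution and disagreement loosens it.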
Such a process could potentially even be used to construct a complete probabilistic model of a macromolecular structure, starting from structure-factor estimates obtained by molecular replacement with fragments of macromolecular structures. In all these cases, the electron-density information could be included in much the same way as the probability distributions used here for the solvent and protein regions of maps. In each case, the key is an estimate of the probability distribution for the electron density at a point in the map that restricts the likely values of electron density at that point. The procedure could be further extended by having probability distributions describing the likelihood that a particular point in the unit cell is within protein, within solvent, within a particular location in a fragment of protein structure, within a non-crystallographically related region and so on. These probability distributions could be overlapping or non-overlapping. For each category of points, the probability distribution for electron density within that category could then be formulated as in (22) and our current methods applied.

The procedure described here differs from the reciprocal-space solvent-flattening procedure described previously (Terwilliger, 1999 ▶) in two important ways. The first is that the expected electron-density distribution in the non-solvent region is included in the calculations, and a formalism is developed for incorporating information about the electron-density map from a wide variety of sources. The second is that the probability distribution for the electron density is calculated using (22) for both solvent and non-solvent regions, with the values of the scaling parameter β and the map uncertainty σ_MAP estimated by fitting of the model and observed electron-density distributions.
This fitting process makes the whole procedure very robust with respect to the scaling of the experimental data, which would otherwise have to be very accurate for the model electron-density distributions to be applicable.

Software for carrying out maximum-likelihood density modification (‘Resolve’), with complete documentation, is available on the WWW at http://resolve.lanl.gov.

                Author and article information

                Journal
                Nat Commun
                Nat Commun
                Nature Communications
                Nature Pub. Group
                2041-1723
                20 November 2014
                : 5
                : 5491
                Affiliations
                [1 ]The Medical Research Council, Mitochondrial Biology Unit , Wellcome Trust/MRC Building, Hills Road, Cambridge CB2 0XY, UK
                Author notes
                [*]

                These authors contributed equally to this work

                Article
                ncomms6491
                10.1038/ncomms6491
                4250520
                25410934
                032a1d7e-f261-4dfe-96b3-866f0ed94cc6
                Copyright © 2014, Nature Publishing Group, a division of Macmillan Publishers Limited. All Rights Reserved.

                This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

                History
                : 15 May 2014
                : 06 October 2014
                Categories
                Article

