Introduction The uneven distribution of species diversity is a key feature of life on Earth and has myriad implications. While the scale-dependence of the determinants of the global variation in diversity is well acknowledged –, to date a quantitative accounting of the roles of history and environment in generating and maintaining gradients in species richness is still lacking. Over the past three decades, increased data availability has facilitated analyses of contiguous geographic patterns in species richness at relatively fine spatial grains (100–200 km) at both continental – and global scales ,. At these spatial resolutions, environmental variables such as productivity or temperature have been shown to offer extremely strong statistical predictions of species richness ,–. However, it has been difficult to connect these results directly with underlying evolutionary and ecological processes. One problem is that the ultimate drivers underpinning diversity, namely speciation and extinction , operate at scales much larger than the spatial resolution (e.g., 100 km grids) of most analyses. A number of studies have confirmed the strong effect of regional richness on local richness –,, and have speculated on the role of energy driving diversification at regional scales – as well as sorting at local scales –. But attempts to integrate them at the appropriate scale have been limited, and we know of no study that has quantified the effect of productivity on richness gradients jointly at regional and local scales and both in terms of evolutionary and ecological processes. Another impediment to interpretations of gridded richness analyses has been that species' geographic ranges are generally much larger than, for example, 100 km×100 km grid cells, resulting in geographically non-random patterns of pseudo-replication, inflated spatial autocorrelation, and an overrepresentation of wide-ranging species and their respective climatic associations ,. These issues have to date precluded straightforward evolutionary and ecological interpretations of macroecological environment correlations of gridded richness patterns ,. While partly motivated by limits in the knowledge of fine-scale species distributions , macroecological analyses have also been conducted using, for example, ca. 800 ecoregions as spatial units ,, but these regions still incur significant and geographically variable redundancy in species. We are not aware of a study on richness gradients that has successfully overcome this problem and thus truly have given each species equal weight. Finally, while there is little doubt about the importance of time for diversification –, attempts to date to invoke paleoclimate for understanding richness have been hampered by the lack of data, especially at deeper time scales. Several studies have linked relatively recent climatic oscillations, for example, those causing quarternary high-latitude glaciation, to geographic richness patterns ,–. The geography of deeper time climate conditions and exactly how it relates to the tempo of past clade diversification is inherently difficult to estimate. But given deep conservatism in the environmental (e.g., biome) associations within clades , compared to relatively dynamic geographic ranges , clades are expected to much more strongly track climatically defined regions, or biomes, rather than specific geographical locations over evolutionary history. The ages of biomes may thus offer a promising avenue for understanding the role of paleoclimate contributing to contemporary patterns of species richness and have recently been successfully correlated with both turtle and tree richness at the regional scale ,. To date, analyses connecting the age and area of regions to finer grain richness patterns have not been attempted. Here, we aim to address these problems with a hierarchical framework that integrates the drivers of regional diversification of species with those of their sorting into finer grain assemblages at their respective scales of influence. We use this model to test the relative importance of past spatio-temporal variation of climatic conditions (specifically time-integrated area and productivity) versus contemporary environment for explaining both the regional and finer scale variation in the species richness of terrestrial vertebrates worldwide. Due to environmental niche conservatism, organisms are generally restricted to climatically defined bioregions, or “evolutionary arenas,” characterized by in situ speciation and extinction. We expect differences in species richness between such regions to arise from different levels of net diversification (speciation – extinction over time). The number of speciation and extinction events should vary among regions due to differences in the sizes of populations over time and the opportunities for reproductive isolation for all resident taxa . We expect these drivers to be associated with today's area ,, and energy availability (i.e., productivity) ,– of bioregions but, critically, also with the past levels of these factors—that is, how bioregions have varied in areal extent and productivity over time . Furthermore, regional rates of diversification have been hypothesized to vary with temperature and its effects on activity and biological rates such as rates of molecular evolution or species interactions –,. We expect all of these drivers in concert to shape broad-scale gradients of diversity and predict that in an integrative assessment of regional differences in diversity, (i) models accounting for the temporal availability of area or productivity will outperform those without (i.e., regions that are older and/or have in the past been larger in extent will support higher vertebrate species richness than younger and/or smaller regions), (ii) area times energy availability (net primary productivity) will be a stronger predictor of richness than area alone ,, and (iii) average bioregion temperature will positively affect richness above and beyond the effects of productivity and have a stronger effect in ectotherms compared to endotherms . We test these predictions for the 32 main subdivisions, or “bioregions,” of the world based on vegetation type and major landmass (Figure 1, Tables S1 and S2) ,. We excluded montane regions (which exclusively harbor ca. 5% of vertebrate species and represent ca. 15% of global land area) due to their extremely steep environmental gradients and associated species turnover, which impedes reliable bioregional delineation and estimates of their extent over time. Over historical time-scales these climatically and geographically distinct bioregions have been characterized by similar environmental and climatic conditions, but have changed in size and shape over time within their respective realms . All bioregions are within the range of scales over which allopatric speciation of terrestrial vertebrate speciation typically occurs (100–1,000 km scale ) and may thus be considered bio-climatically and geographically distinct “evolutionary arenas.” After deriving time-integrated models of bioregion species richness, we then in a second step assess their ability to predict the variation in richness at the scale of 110 km grid cell assemblages. We make these finer scale predictions first under a model of simple random sorting of species from those predicted for the bioregion and, second, under a model of sorting mediated by the relative productivity of a grid cell. The goals of this second step include (i) an evaluation of the ability of this hierarchical model to make strong fine-scale richness predictions (while including paleoclimate and avoiding regional-level conflation of sample size) and (ii) a demonstration of the separate roles energy availability has at different temporal and spatial scales. 10.1371/journal.pbio.1001292.g001 Figure 1 Map of study bioregions and their area and annual productivity dynamics. The variation in area (black) and annual productivity (red) over the last 55 million years forms the species richness predictors TimeArea (cumulative time-area, units 104 km2×million years) and TimeAreaProductivity (cumulative total productivity, units 1017 kg Carbon), respectively (values in upper right box corner). Panel boxes have one of three different y-axis scales (note different line thicknesses and legend). For example, in tropical woody savannas and dry forests, the land area for the last few million years has been ∼1×107 km2 in the Afrotropics, ∼2×106 km2 in Australia, and ∼1×105 km2 in Madagascar. See also Tables S1, S2, and S5. Results and Discussion Bioregion Historical Dynamics Paleoclimatic data reveal dramatic variation in the age and spatial dynamics of different bioregions from the end of the Paleocene (55 MY bp) to the present day (Figure 1, Table S5). For example, grasslands are not thought to have covered large areas on earth until 8 million years ago, resulting in a much smaller area over time than observed for, for example, temperate or tropical moist forests that have a longer history (Figure 1). Linking estimates of the extents of bioregions over time allows the calculation of “time-integrated area” (TimeArea) , a synthetic index of area available to the bioregion's biota over time, varying from just 48×104 km2 integrated over 55 million years in the case of the Mediterranean bioregion at the southern tip of Africa to over 100,000×104 km2 in Eurasian temperate and African moist tropical forests. Unlike bioregion extent and position, climatic conditions of bioregions are assumed to be relatively static over time , which allows the determination of average bioregion net primary productivity (Productivity) and Temperature. Summed Productivity over bioregion Area yields total bioregion productivity (AreaProductivity)—that is, total annual carbon flux measured in kg/year over a whole bioregion, a measure that exhibits joint dynamics with bioregion Area. But integrated over time in the form of TimeAreaProductivity, it exhibits very different geographic patterns than TimeArea (Figure 1) with, for example, African and IndoMalay tropical moist forests experiencing a flux of over 8,000×1017 kg of carbon over the past 55 million years and the Mediterranean regions of the New World and Africa just under 3×1017 kg. Bioregion Biotic Independence We summarized terrestrial vertebrate richness per bioregion as Total (every species found in a bioregion), Resident (species for which a given bioregion contains the largest portion of the range), and Endemic (species that are restricted to a single bioregion; Table S4). We find minimal overlap in Total species among bioregions (median Jaccard similarity among bioregions: 4% for birds, 0% for other taxa; Figure S1, Table S3), which confirms their relative evolutionary isolation in addition to climatic and spatial independence and a consistently strong pattern of biome conservatism ,,. It also confirms that across all four vertebrate groups these selected bioregions represent useful spatial units that avoid the pseudo-replication of species: for Resident species richness every species enters a given analysis exactly once, and the number of distribution records is equal to the global richness of species (13,860 endothermic mammals and birds, 11,836 ectothermic amphibians and reptiles; montane endemics excluded). For Endemic species (total of 13,111 species and records) bioregions are even more likely to represent the true regions of origin compared to Resident species. We therefore expect a stronger correlation of area and productivity integrated over time (TimeAreaProductivity) with the diversity of Endemic species. Bioregion Species Richness All three predictions of our integrative model regarding the effect of time-area-productivity on richness are confirmed (Table 1). The models that account for time-integrated productivity and also include temperature as an additional predictor yield the strongest fits. For endotherms, the time-integrated measures of area outperform models that ignore time only for the Endemic richness dataset, which offered the more direct test of our hypotheses. Predictions of the two-predictor TimeAreaProductivity+Temperature model are consistently strong across all four vertebrate taxa, which represent independent replicates (Figure 2), explaining over 77% of the variation in richness (Figure 2, N = 128, see Tables S7 and S8 for more details). Models that fit TimeArea and Productivity as statistically separate terms do not on the whole yield stronger predictions (Table S9). This lends support to Wright's  parallel findings for large islands, which represent similarly closed systems, and contrasts with previous results reported for 110×110 km grid cells . The shape of the Productivity-richness relationship is linear (in Endotherm Residents) or positive accelerating in linear space (in Ectotherm Residents and both Endemics groups). In contrast, the slopes of the AreaProductivity- and TimeAreaProductivity-richness relationships, whether fitted with or without Temperature, are all positive saturating—that is, species richness tends to increase more steeply in the low than in the high productivity ranges (coefficients in ln-ln space vary between 0.4 and 1, Table S7). We did not find evidence of a hump-shaped pattern for any measure of productivity and richness at the bioregional scale ,. 10.1371/journal.pbio.1001292.g002 Figure 2 Observed versus predicted bioregion species richness of terrestrial vertebrates. Observed bioregion species richness (A, Endemic species, B, Resident species) is plotted against that predicted by the two-predictor TimeAreaProductivity+Temperature model fit separately for each of the four taxa (different symbols). Lines indicate least squares fit of regressions relating to observed predicted richness for each of the four taxa over the 32 bioregions (r 2 [Endemic] = 0.78, r 2 [Resident] = 0.78, N = 128). For detailed results, see Table S7. Colors indicate biome membership (see the map in Figure 1 to match colors). 10.1371/journal.pbio.1001292.t001 Table 1 Relative performance of integrated single- and two-predictor models of bioregion species richness. Predictor Variables Endemic Resident Endotherms Ectotherms Endotherms Ectotherms ΔAIC r2 ΔAIC r2 ΔAIC r2 ΔAIC r2 TimeAreaProductivity+Temperature 0 0.67 0 0.82 20 0.70 1 0.86 TimeArea+Temperature 0 0.68 0 0.82 14 0.75 0 0.87 AreaProductivity+Temperature 10 0.56 13 0.73 0 0.84 3 0.85 TimeAreaProductivity 20 0.36 46 0.22 31 0.56 55 0.20 TimeArea 23 0.29 49 0.14 32 0.54 58 0.13 AreaProductivity 26 0.22 50 0.10 22 0.66 57 0.16 Productivity 26 0.23 42 0.29 52 0.12 52 0.29 Area 31 0.09 53 0.02 35 0.49 61 0.04 Temperature 22 0.31 24 0.60 52 0.14 29 0.65 Endemic species are those restricted to a single bioregion and Resident counts species with the largest portion of their range in a given bioregion. Endotherms combine mammals and birds, ectotherms combine reptiles and amphibians. Best models (ΔAIC 50% inside the selected bioregion boundaries as described above (and shown in Figure 1). Only cells with >50% dry land and with at least one species from each of the three vertebrate groups were included in the analysis, resulting in 9,253 cells. For each grid cell we summarized richness of Resident species (i.e., species were counted if they occurred in several grid cells only within the same bioregion) and of Total species (i.e., species were counted whether they occurred in multiple grid cells within the same or in a different bioregion). Values were log10-transformed before analysis. For Total species, the full database consisted of a total of 2,966,137 grid cell records (birds 2,010,091; mammals 695,133; and amphibians 260,913). Bioregion and Finer Scale Environmental Data Bioregion-typical temperature estimates (Temperature) were based on average annual temperatures calculated from the University of East Anglia's Climatic Research Unit gridded climatology 1961–1990 dataset at native 10-min resolution . For estimates of bioregion-typical annual net primary productivity, we used an average from 17 global models at a spatial resolution of 0.5 degrees latitude-longitude . Average bioregion productivity (Productivity, units grams Carbon m−2 year−1) was calculated from all 0.5×0.5 degree grid cells that predominantly fall inside a bioregion, and summed productivity (AreaProductivity, units grams Carbon year−1) was then given by the product of this value and bioregion Area. With bioregions defined by their typical environmental conditions, we assumed average productivity characteristic of a bioregion to have been constant through time ,. Time-integrated productivity (TimeAreaProductivity, unit grams Carbon) was thus given as the product of Productivity and TimeArea. Values for all bioregion predictor variables are given in Table S1. All response and predictor variables were natural log-transformed for analysis, except for temperature, which was 1/kT transformed (where k is the Boltzmann constant, see ). We used the same global net primary productivity dataset  to estimate productivity at the level of 110×110 km grid cells. First, we calculated average grid cell productivity (NPP) across all encompassing 0.5×0.5 degree grid cells. Second, we normalized each grid cell by dividing by the maximum productivity grid cell value observed in a bioregion, resulting in a measure of proportional productivity (PropNPP) varying from 0 to 1. Bioregion Analyses We performed a total of nine GLM models on the bioregion data and used the Akaike criterion to identify those offering the best fit . Six models were given in the form of single predictors (Temperature, Area, Productivity, AreaProductivity, TimeArea, and TimeAreaProductivity). An additional three models were formed by the combination of the latter three variables with Temperature. We performed a separate set of analyses to assess the potential additional effect of elevation range within a bioregion, but adding this variable to any of the three two-predictor models did not improve model fit, and thus we excluded the variable from further consideration. Because of the strong independence of sampling units both in terms of response (no overlap in species) and predictor variables (by definition each bioregion is environmentally highly distinct from neighboring bioregions), the usual concerns about spatial autocorrelation affecting model results , do not apply to this analysis, and additional spatial regression analysis was not performed. Finer Scale Analyses Having established models of bioregion richness, we assessed the success of predictions of resident bioregional richness to explain the species richness (Total and Resident, see above) of all 110×110 km grid cells within bioregions (for a conceptual overview of the analytical steps, see Figure 3). Note that unlike the bioregional tests described above, analyses at this scale do double-count species. In our study we make the simplifying assumption that diversification processes are sufficiently accounted for at the bioregional scale. The models at the within-bioregion scale then address the sorting of these species each into multiple grid cells, with multiple occurrences an integral part of the signal. We acknowledge that, depending on taxon and region, diversification processes may still exert influence on the within-bioregion patterns of distribution and richness, and we hope that our work will spur further research into additional approaches that can be integrated across all scales. We first evaluated bioregion predicted resident richness alone (in essence testing for a random sorting of bioregion species into finer scale assemblages), then included bioregion Area as an additional predictor, and finally we added estimates of grid cell NPP as a finer scale predictor. We first performed simple GLM models with all 9,253 grid cells as sampling units, together with bioregion Resident richness as predicted by the TimeAreaProductivity+Temperature and AreaProductivity+Temperature models as a first predictor (BioregPred) and bioregion Area as a second predictor (Figure 3, Table S9). In the same GLM we then added grid cell proportional net primary productivity (CellPropNPP, i.e., relative productivity within a bioregion, see above) as an additional predictor. In preliminary post hoc analyses with a number of environmental variables CellPropNPP remained by far the strongest, in line with recent work on within-regional richness filters that also find productivity-related variables to be dominant ,. Given the nested nature of these analyses we focus on pseudo-r 2 values (fit of observed versus predicted) and visual examination of results in the form of partial residual plots (Figure 3). For this first demonstration, focused on a single variable, we did not include further analyses additionally fitting the signal of spatial autocorrelation. We performed a second set of analyses in an explicit mixed effects model setting (Table S10), with bioregion as a random effect (R library lme4, Version 0.999375-32, function lmer). As in the GLM model, grid cell richness is first fitted by the predictions for regional resident species richness (BioregPred, see Table 1), and then by area of the region (Area), and grid-cell-level NPP (NPP). Region was fitted as a random effect, and the slope and strength of BioregPred and BioregPred+Area as fixed effects were assessed (model formula in R: lmer (y∼BioregPred+Area+(1|Bioregion)). The additional effect of grid cell NPP was then evaluated by fitting it as an additional fixed effect with a globally constant slope (NPPconst) and by allowing the NPP–richness relationship to vary within regions as random slope (NPPvar) (model formula in R: lmer (y∼BioregPred+Area+(1|Bioregion)+(NPP|Bioregion)). Data Deposition The data are deposited in the Dryad Repository (http://dx.doi.org/10.5061/dryad.45672js4). Supporting Information Figure S1 Bioregion independence at different taxonomic ranks. The frequency of Jaccard similarity values is shown ([count of shared taxa]/[count of taxa in both]) expressed in % (Jaccard * 100) for all bioregion combinations (N = 496) for different taxonomic ranks. Results confirm high independence of bioregions at the species and genus rank and moderate independence at family rank. See also Table S3. (TIF) Click here for additional data file. Figure S2 Partial residual plots for the joint effects TimeAreaProductivity and Temperature on Resident species richness (ln-transformed). Partial residual plots illustrate the relationship between a predictor and the response given other predictors in the model. Specifically, this is a plot of ri+bxi versus xi, where ri is the ordinary residual for the i-th observation, xi is the i-th observation, and b is the regression coefficient estimate. Colors indicate biome membership (see Figure 1 for legend). For detailed model results, see Table S7. (TIF) Click here for additional data file. Figure S3 Partial residual plots for the joint effects of TimeAreaProductivity and Temperature on Endemic species richness (ln-transformed). Partial residual plots illustrate the relationship between a predictor and the response given other predictors in the model. For other details, see Figure S2. (TIF) Click here for additional data file. Figure S4 Partial residual plots for the variation in grid cell richness of Resident species among bioregions. In this model Resident richness is predicted by the bioregion (TimeAreaProductivity+Temperature) model and bioregion current-day Area. Partial residual plots illustrate the relationship between a predictor and the response given other predictors in the model. Specifically, this is a plot of ri+bxi versus xi, where ri is the ordinary residual for the i-th observation, xi is the i-th observation, and b is the regression coefficient estimate. Colors indicate biome membership (see Figure 1 for color legend and Figure 3 for results without Area). (TIF) Click here for additional data file. Table S1 List of bioregions and predictor variables in the analysis. TimeArea and TimeAreaProductivity values are from integration of bioregion area over 55 million years. For further details and geographic locations, see Figure 1. (DOC) Click here for additional data file. Table S2 Bioregion species richness values. Total: includes all species with ranges extending into a given bioregion (many species represented several times in different bioregions). Resident: includes only species with the greatest portion of their range extending into a given bioregion (each species is represented only once). Endemic: includes only species with no portion of range extending beyond a given bioregion. Vert., Vertebrates (Birds+Mammals+Amphibians+Reptiles). Amph., Amphibians. (DOC) Click here for additional data file. Table S3 Median Jaccard similarity (%) of bioregion composition at three different taxonomic ranks. Jaccard similarity is given as ([count of shared taxa]/[count of taxa in both]) expressed in % (Jaccard * 100). For a given bioregion and taxon, values are medians from the comparison with all 31 other regions, respectively. See also Figure S1. (DOC) Click here for additional data file. Table S4 Spearman rank correlations among bioregion species richness values for Total, Resident, and Endemic categories for all vertebrates, endotherms, and ectotherms, and each vertebrate clade separately (N = 32 bioregions). For richness definitions see Table S2. (DOC) Click here for additional data file. Table S5 Details regarding the ages of biomes and the sources consulted in order to calculate the area over time for each of the world's bioregions. (DOC) Click here for additional data file. Table S6 Spearman rank correlations of predictor variables among bioregions (N = 32). (DOC) Click here for additional data file. Table S7 Predictors of bioregion richness for Endotherms (mammals+birds) and Ectotherms (amphibians+reptiles) with details on slope estimates. Species richness values and all predictors except temperature were ln-transformed; temperature is given as 1/kT (where k is the Boltzmann constant). For other details see Table 1. (DOC) Click here for additional data file. Table S8 Predictors of bioregion richness. Results for all taxa. For other details see Table 1. (DOC) Click here for additional data file. Table S9 Comparison of AIC values of alternative formulations of models combining the effects of TimeArea and Productivity, and of TimeArea, Productivity, and Temperature on bioregion species richness. TimeArea and Productivity are either integrated into a single variable (TimeAreaProductivity, see Figure 1), modeled additively, or modeled as an interaction. Models with >3 units AIC larger than the model with the smallest AIC within a group (i.e., significantly worse) are marked in bold. (DOC) Click here for additional data file. Table S10 Comparison of AIC values of TopoRange (log(maximum − minimum elevation) in a bioregion) as an alternative predictor of bioregion species richness. Null model is fitting the intercept only. The variable Area, which is correlated with TopoRange (rSpearman = 0.58, N = 32), offers either equal or better fit. (DOC) Click here for additional data file. Table S11 Spearman rank correlations among Total, Resident, and Endemic richness for different taxa across 110 km quadrants (N = 9,253). For richness definitions, see Table S2. (DOC) Click here for additional data file. Table S12 Prediction success (r 2) of bioregion-level models of Total and Resident richness of 110 km grid cell assemblages (N = 9,253) based on general linear models. Grid cell richness is first fitted by the predictions for Resident species richness (“[Bioregion] Predicted richness,” see Table 1), and then additionally by Area of the bioregion, and grid-cell-level relative productivity (CellPropProductivity, calculated as proportion of maximum grid cell productivity in the region). Pseudo-r 2 values of observed versus fitted are listed. (DOC) Click here for additional data file. Table S13 Prediction success of bioregion-level models of Total and Resident richness of 110 km grid cell assemblages (N = 9,253) based on mixed effects models. Grid cell richness is first fitted by the predictions for Resident species richness (“[Bioregion] Predicted richness,” see Table 1) and then additionally by bioregion Area, and grid-cell-level relative productivity (CellPropProductivity). Bioregion is fitted as a random effect, and the slope and strength of “[Bioregion] Predicted richness” and “[Bioregion] Predicted richness+Area” as fixed effects are assessed. Pseudo-r 2 values of observed versus fitted are listed. The additional effect of grid cell productivity was evaluated by fitting it as additional fixed effect with a globally constant slope (CellPropProductivity) and by allowing the relationship with richness to vary within regions as a random slope (CellPropProductivity var). (DOC) Click here for additional data file.