The first evidence for dark matter emerged from studies of galaxy clusters in the 1930s (Zwicky 1933), and the dark matter problem assumed a central position in cosmology after technological advances allowed dynamical measurements in the outer regions of individual galaxies (Rubin and Ford 1970, Rogstad and Shostak 1972; see review by Faber and Gallagher 1979). Because it clusters on small scales, dark matter has a rich phenomenology, and detailed studies of galaxies, galaxy clusters, large scale structure, the Lyα forest, and the CMB have largely pinned down its properties even though we have yet to identify the dark matter particle or particles. The implications of the dark matter problem have proven even more profound than might have been imagined in the 1930s, pointing the way to an entirely new form of matter whose cosmic mean density exceeds that of all baryonic material by a ratio of 6:1. There are now several plausible ideas of what dark matter might be — ideas that are rooted in well motivated extensions of the standard model of particle physics and that (at least in some cases) naturally explain the observed density of dark matter (Bertone et al. 2005). With experimental methods advancing on many fronts, there are good reasons to hope that dark matter will soon be identified in particle accelerators, detected directly in underground experiments, or detected indirectly via its annihilation into γ-rays, neutrinos, or cosmic rays.
Evidence for cosmic acceleration began to emerge in the early 1990s, and it rapidly evolved into a near-airtight case following the supernova discoveries of the late 1990s (see Section 1.1). Whether the cause is a new energy component or a breakdown of GR, the implications of cosmic acceleration are dramatic, even more so than those of dark matter. Cosmic acceleration may ultimately provide clues to the nature of quantum gravity, or to the structure of the universe on scales beyond the Hubble volume, or to its history over times longer than the Hubble time. There are already many theories of cosmic acceleration, but none of them offers a convincing explanation of the observed magnitude of the effect, and nearly all of them were introduced to explain the observed acceleration, rather than emerging naturally out of fundamental physics models. In contrast to dark matter, most models of dark energy predict that it is phenomenologically poor, affecting the overall expansion history of the universe but little else. That impression could yet prove incorrect: other signatures of "cosmic acceleration physics" might appear in small-scale gravitational experiments, in the behavior of gravity in different large scale environments, or in non-gravitational interactions.
While the solution to the cosmic acceleration problem could come from a surprising direction, including theory, there is a clear experimental path forward through increasingly precise measurements of expansion history and growth of structure. Relative to current knowledge, Stage IV experiments can improve the measurement of basic cosmological observables, H(z), D(z), and G(z), by one to two orders of magnitude. Correspondingly, they can achieve a 1-2 order-of-magnitude improvement in constraints on w, 2-3 orders-of-magnitude improvement in the DETF figure-of-merit, and still greater gains in higher dimensional parameterizations, including tests of GR violations. Any robust deviation from a cosmological constant model would have profound implications, and the greater the precision and detail with which such a deviation is characterized, the better the guidance for understanding its cause.
We have reviewed in considerable detail the four leading methods — supernovae, BAO, weak lensing, and clusters — and we have briefly discussed some of the emerging new methods, whose capabilities and limitations are as yet less thoroughly explored. We have also investigated the complementarity of these methods for constraining theories of cosmic acceleration. We have spent little time on the CMB as it has little direct constraining power on these theories, but it does provide crucial constraints on other cosmological parameters that are essential to precision tests. We now conclude our article with an editorial recap of our main takeaway points.
Type Ia supernovae have unbeatable precision for measuring distances at z ≲ 0.5. Future surveys can readily achieve statistical errors of 0.01 mag or less (0.5% in distance) averaged over bins of Δz = 0.2. The challenge is getting systematic uncertainties at or below the level of such statistical errors. In our view, the key systematics for SN studies are imperfect photometric calibration, evolution in the population of SNe represented at different redshifts, and the effects of dust extinction. The first can be addressed by careful technical design, of the instruments used for SN surveys and of the observing and calibration procedures. The second can be addressed by obtaining high quality observations of the SNe and their host galaxies that allow one to match the properties of high and low redshift systems. The third is best addressed by working in the rest-frame near-IR, where extinction is low. Rest-frame IR observations may also mitigate evolution systematics and improve statistical errors, since current observations indicate that the scatter in SN luminosities is smaller in the near-IR than in the optical.
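The correspondence quoted above between 0.01 mag and roughly 0.5% in distance follows from the definition of the distance modulus. As a quick check (with $\sigma_\mu$ denoting the magnitude error, a symbol introduced here for illustration), propagating a small magnitude error into a fractional luminosity-distance error gives:

```latex
\mu = 5\log_{10}\!\left(\frac{D_L}{10\,\mathrm{pc}}\right)
\quad\Longrightarrow\quad
\frac{\sigma_{D_L}}{D_L} = \frac{\ln 10}{5}\,\sigma_\mu \approx 0.461\,\sigma_\mu ,
\qquad
\sigma_\mu = 0.01\ \mathrm{mag}
\;\Rightarrow\;
\frac{\sigma_{D_L}}{D_L} \approx 0.46\% .
```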
The BAO method complements the SN method in several ways. SN measure distance ratios relative to local calibrators (i.e., distances in $h^{-1}$ Mpc), while BAO measure absolute distances (in Mpc) assuming a calibration of the sound horizon. SN and BAO measurements at the same redshift therefore provide complementary information, effectively constraining $H_0$, which is itself sensitive to acceleration when combined with CMB data. Spectroscopic BAO measurements that sample a constant fraction of the sky become more precise at high redshift because they cover a greater comoving volume and because they measure H(z) directly in addition to $D_A(z)$. (Of course, they also require a larger number of tracers to probe these larger volumes, and the tracers themselves are fainter at higher redshifts.) Cosmic variance limited BAO surveys have roughly constant sensitivity to dark energy over the range 1 < z < 3 because the decreasing dynamical impact of dark energy at higher redshifts is balanced by the greater BAO measurement precision. Furthermore, the BAO method is the only one that we expect to be statistics-limited even with Stage IV surveys. Non-linear matter clustering and non-linear galaxy bias may shift the BAO peak by more than the statistical errors of Stage IV experiments, but the shifts can be computed using theoretical models that are constrained by the smaller scale clustering data, and moderate fractional accuracy in these corrections is enough to keep any uncertainty in the corrections well below the statistical errors. Thus, we see the main challenge for the BAO method as finding ways to efficiently map the available structure. There are several promising ideas, both ground-based and space-based, and Stage IV BAO constraints will likely come from a union of several approaches covering different redshift ranges.
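The statement that a fixed sky fraction encloses more comoving volume at higher redshift can be illustrated with a short calculation. The following is only a sketch: it assumes a flat ΛCDM background with Ωm = 0.3 and h = 0.7 (illustrative values, not the fiducial cosmology of the forecasts), and the function names are hypothetical.

```python
import numpy as np

C_KM_S = 299792.458  # speed of light in km/s


def comoving_distance(z, omega_m=0.3, h=0.7, n_steps=2000):
    """Comoving distance in Mpc for a flat LCDM model (assumed parameters)."""
    zs = np.linspace(0.0, z, n_steps + 1)
    e_of_z = np.sqrt(omega_m * (1.0 + zs) ** 3 + (1.0 - omega_m))
    integrand = 1.0 / e_of_z
    # Trapezoidal rule written out by hand to avoid NumPy version differences.
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(zs))
    return (C_KM_S / (100.0 * h)) * integral


def shell_volume(z_lo, z_hi, f_sky=0.25, **cosmo):
    """Comoving volume (Gpc^3) of a redshift shell covering a sky fraction f_sky."""
    d_lo = comoving_distance(z_lo, **cosmo)
    d_hi = comoving_distance(z_hi, **cosmo)
    return f_sky * (4.0 * np.pi / 3.0) * (d_hi ** 3 - d_lo ** 3) / 1e9


if __name__ == "__main__":
    # Compare equal-width redshift shells over one quarter of the sky.
    for z_lo, z_hi in [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]:
        volume = shell_volume(z_lo, z_hi)
        print(f"{z_lo:.0f} < z < {z_hi:.0f}: {volume:.1f} Gpc^3")
```

Under these assumptions the 1 < z < 2 and 2 < z < 3 shells each contain several times the volume available at z < 1, which is the geometric origin of the improved sample-variance-limited precision at high redshift.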
Weak lensing measurements provide sensitivity to both the distance-redshift relation and the growth of structure. The statistical precision achievable with future facilities is very high, so the challenge is reducing systematic uncertainties to a level that does not overwhelm these statistical errors. The most important problem is reducing multiplicative shape measurement biases to the level of $\sim 10^{-3}$ or below, which requires (among other things) determining the PSF that affects the galaxy images to very high accuracy. This is an area of highly active research, and it is not yet clear what approach will prove most successful; we have advocated pursuit of a Fourier method that becomes exact in the limit of high S/N ratio. Since most shape measurement systematics depend inversely on the ratio $r_{50}/r_{\rm psf}$ of galaxy size to PSF radius, one can mitigate these systematics by restricting the analysis to larger galaxies, but this gives up statistical precision by reducing the surface density of usable sources. The second major challenge for WL studies is the measurement and calibration of photometric redshift distributions, characterizing both their means and their outlier fractions at the $\sim 10^{-3}$ level or below. Meeting this challenge requires optical and near-IR imaging for robust identification of spectral breaks, and large spectroscopic calibration data sets. The third systematics challenge for WL is intrinsic alignment of galaxies. With continuing theoretical work and good photometric redshifts, we believe that this systematic can be kept subdominant, but it remains a challenging problem. WL measurements are rich with observables, including higher order statistics and varied combinations of galaxy-galaxy lensing, galaxy clustering, and tomography. Despite the field's formidable technical obstacles, we think it quite possible that constraints from WL surveys will eventually exceed current forecasts because these additional observables provide cosmological sensitivity and/or allow systematic uncertainties to be calibrated away.
Cluster abundance measurements provide an alternative route to measuring the growth of structure and thus testing the consistency of GR growth predictions. In addition, by reducing uncertainty in Ωm and breaking other degeneracies, cluster abundance measurements can sharpen the equation-of-state constraints from SN, BAO, and WL distance measurements. The key challenge for cluster cosmology is achieving unbiased and precise calibration of the cluster mass scale. Realizing the statistical power of future surveys requires absolute mass calibration accurate at the 0.5-1% level. In our view, this is only achievable with weak lensing, because the baryonic physics associated with other observables is too uncertain to predict them this accurately from first principles. We thus see cluster studies as a natural byproduct of WL surveys and in some sense as a specialized branch of WL, one that takes advantage of the strong additional information afforded by knowing the locations of peaks in the optical galaxy density, X-ray flux, or SZ decrement. If WL provides the fundamental mass calibration, then the shape measurement and photometric redshift uncertainties that affect WL also affect cluster methods.
While all of these methods can be pursued at ambitious levels from the ground, all would benefit from the capabilities of a space mission, especially from the capability of wide-field near-IR imaging and spectroscopy, which is possible at the necessary depth only from space. For SN, a space platform provides the greater stability and sharp PSF needed for highly accurate photometric calibration, and it allows observations in the rest-frame near-IR, which is crucial for minimizing extinction systematics and may be valuable for reducing evolution systematics. For BAO, near-IR spectroscopy allows emission-line galaxy surveys over the huge comoving volume from 1.2 ≲ z ≲ 2, which is difficult to probe with ground-based optical or IR observations. (Intensity-mapping radio methods may be able to probe this redshift range from the ground, but this approach still has significant technological hurdles to overcome.) For WL, space observations allow the deep near-IR photometry that is essential for robust and accurate photometric redshifts, and they provide stable imaging with a sharp PSF that enables accurate shape measurements for a high surface-density source population. The above considerations motivated both WFIRST and the IR capabilities of Euclid. Space-based optical imaging, the other major element of Euclid, allows a significantly sharper PSF and thus potentially more powerful WL measurements, if the systematic errors are sufficiently well controlled. More generally, space-based WL measurements can employ a higher galaxy surface density than ground-based surveys to the same photometric depth, both because the PSF itself is smaller and because greater stability and the absence of atmospheric effects should allow accurate measurements down to a smaller ratio of r50 / rpsf.
The current generation of "Stage III" experiments such as BOSS, PS1, DES, HSC, and HETDEX are collectively pursuing all of these methods, and they should achieve dark energy constraints substantially better than those that exist today. It is crucial that the next-generation Stage IV experiments maintain, collectively, a balanced program that includes SN, BAO, and WL, as well as other methods (clusters, Alcock-Paczynski, redshift-space distortions) that can be applied to the same data sets. There is much more to be gained, and much lower risk, from doing a good job on all three methods than from doing a maximal job on one at the expense of the others. A balanced program takes advantage of the methods' complementary information content and areas of sensitivity, and it allows the best cross-checks for systematic errors. It is becoming standard practice to trade systematic uncertainties for statistical errors by parameterizing their impact and marginalizing, e.g., over an uncertain shear calibration multiplier or photometric redshift offset. While this is a powerful strategy for removing biases due to "known unknowns," it does not protect against "unknown unknowns." Any conclusion about cosmic acceleration will be more compelling if it is demonstrated by independent methods, and the more interesting the conclusion, the more crucial this independent confirmation will be.
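The practice of trading a systematic uncertainty for a statistical one by marginalizing over a nuisance parameter can be sketched in the Fisher-matrix approximation. The example below is a toy illustration only: the two-parameter Fisher matrix, the parameter names, and the prior width are hypothetical and are not taken from the forecasts in this article.

```python
import numpy as np


def marginalized_errors(fisher, prior_sigmas=None):
    """Return 1-sigma errors on each parameter, marginalized over the others.

    fisher       : (n, n) Fisher matrix from the data alone.
    prior_sigmas : optional length-n sequence; a finite entry adds an
                   independent Gaussian prior of that width on the
                   corresponding parameter, None means no prior.
    """
    total = np.array(fisher, dtype=float)  # copy so the input is not modified
    if prior_sigmas is not None:
        for i, sigma in enumerate(prior_sigmas):
            if sigma is not None and np.isfinite(sigma):
                total[i, i] += 1.0 / sigma ** 2
    covariance = np.linalg.inv(total)
    return np.sqrt(np.diag(covariance))


# Hypothetical toy problem: [cosmological parameter, shear multiplier m].
toy_fisher = np.array([[4.0e4, 3.0e4],
                       [3.0e4, 2.5e4]])

if __name__ == "__main__":
    fixed = toy_fisher[0, 0] ** -0.5                        # m known exactly
    free = marginalized_errors(toy_fisher)[0]               # m unconstrained
    prior = marginalized_errors(toy_fisher, [None, 2e-3])[0]  # m with a prior
    print(f"m fixed: {fixed:.4f}  m with prior: {prior:.4f}  m free: {free:.4f}")
```

The marginalized error always lies between the fixed-nuisance and free-nuisance limits, which is how an external calibration of the nuisance parameter translates into cosmological constraining power. As the text notes, this procedure accounts only for effects that have been parameterized.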
In Section 8 we have provided quantitative forecasts for a fiducial Stage IV program and for many variants upon it. Our fiducial SN program assumes 0.01 mag mean errors for a local calibrator sample at z = 0.05 and in three bins of Δz = 0.2 at 0.2 < z < 0.8, uncorrelated from bin to bin. Our fiducial BAO program assumes mapping 1/4 of the sky to z = 3, with errors that are 1.8 × the linear theory sample variance errors over this volume. Different combinations of redshift range and sky coverage that have the same comoving volume yield nearly the same results. Our fiducial WL program assumes statistical errors of a $\sim 10^{9}$-galaxy imaging survey (more precisely, $10^{4}$ deg$^2$ with 23 galaxies/arcmin$^2$), and systematic errors of $2 \times 10^{-3}$ in shear calibration and photometric redshift calibration. We also consider an optimistic case in which the total (systematic + statistical) errors are simply double the statistical errors, which effectively corresponds to total errors ~ 2-3 times smaller than those of the fiducial case. Our fiducial program corresponds fairly closely to the one recommended by the Astro2010 Cosmology and Fundamental Physics panel, and it is a reasonable, probably conservative forecast of what could be achieved by a combination of LSST, Euclid/WFIRST, and ground-based BAO and SN surveys.
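The approximate size of the fiducial WL source sample follows directly from the quoted area and surface density. As a simple arithmetic check (with $N_{\rm gal}$ a symbol introduced here for the total source count):

```latex
N_{\rm gal} \approx 10^{4}\ \mathrm{deg}^2
\times 3600\ \frac{\mathrm{arcmin}^2}{\mathrm{deg}^2}
\times 23\ \mathrm{arcmin}^{-2}
\approx 8.3 \times 10^{8} \sim 10^{9} .
```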
To quantify the expected performance of this program and its variants, we considered two dark energy models, one with $w(a) = w_0 + w_a(1 - a) = w_p + w_a(a_p - a)$, where $a_p = (1 + z_p)^{-1}$ is the expansion factor at which w is best constrained, and a second with w(a) allowed to vary freely in each of 36 bins of Δa = 0.025, reaching to z = 9. In both cases we allowed deviations from GR-predicted growth rates characterized by an overall multiplicative offset $G_9$ in G(z) and by a shift Δγ in the logarithmic growth rate $d\ln G / d\ln a \propto [\Omega_m(a)]^{\gamma + \Delta\gamma}$. We focused principally on the expected errors in $w_p$, $w_a$, Δγ, and $G_9$, together with the DETF FoM defined as $(\sigma_{w_p} \sigma_{w_a})^{-1}$. While principal components (PCs) of the general w(a) model allow a much richer characterization of the dark energy history (and its uncertainties), we regard the combination of the DETF FoM and the Δγ error to be as good as any alternative for characterizing the strength of a combined program.
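The relation between the w0-wa parameterization, the pivot expansion factor, and the DETF FoM can be made concrete with a few lines of code. This is a minimal sketch assuming a Gaussian posterior described by a 2×2 covariance matrix of (w0, wa); the function names and the toy covariance are hypothetical and are not results from this article.

```python
import numpy as np


def w_of_a(a, w0, wa):
    """Equation of state w(a) = w0 + wa * (1 - a)."""
    return w0 + wa * (1.0 - a)


def pivot_summary(cov_w0_wa):
    """Pivot quantities from the 2x2 covariance matrix of (w0, wa).

    Returns (a_p, z_p, sigma_wp, sigma_wa, fom), where a_p minimizes the
    variance of w(a) and fom = 1 / (sigma_wp * sigma_wa).
    """
    cov = np.asarray(cov_w0_wa, dtype=float)
    var_w0, cov_0a, var_wa = cov[0, 0], cov[0, 1], cov[1, 1]
    # Var[w(a)] = var_w0 + 2 (1 - a) cov_0a + (1 - a)^2 var_wa is minimized at:
    a_p = 1.0 + cov_0a / var_wa
    z_p = 1.0 / a_p - 1.0
    # w_p = w(a_p) is uncorrelated with w_a, so its variance is the residual.
    sigma_wp = np.sqrt(var_w0 - cov_0a ** 2 / var_wa)
    sigma_wa = np.sqrt(var_wa)
    fom = 1.0 / (sigma_wp * sigma_wa)
    return a_p, z_p, sigma_wp, sigma_wa, fom


if __name__ == "__main__":
    # Toy covariance: sigma_w0 = 0.1, sigma_wa = 0.3, correlation -0.9.
    toy_cov = [[0.01, -0.027], [-0.027, 0.09]]
    a_p, z_p, s_wp, s_wa, fom = pivot_summary(toy_cov)
    print(f"a_p = {a_p:.3f}, z_p = {z_p:.3f}, sigma_wp = {s_wp:.4f}, FoM = {fom:.1f}")
```

Because $w_p$ and $w_a$ are uncorrelated by construction, the FoM computed this way equals the inverse square root of the determinant of the (w0, wa) covariance matrix, so it does not depend on which of the two equivalent parameterizations is used.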
The primary results of our forecasting investigation appear in Tables 8 - 10 and, in distilled form, in Figures 33 and 38. The FoM of our fiducial program is 664, more than five times better than our Stage III forecast and a roughly 50-fold improvement on current knowledge. Within the adopted parameterization, 1σ errors on individual parameters are 0.014 on $w_p$, 0.11 on $w_a$, 0.034 on Δγ, 0.015 on $\ln G_9$, $5.5 \times 10^{-4}$ on $\Omega_k$, and $5.1 \times 10^{-3}$ on h. All three methods contribute significantly to these constraints. For our fiducial assumptions, BAO have the greatest leverage on the DETF FoM, in the sense that halving the BAO errors produces the greatest increase in the FoM while doubling the BAO errors produces the greatest decrease. WL has the least leverage, which implies that the fiducial BAO and SN measurements constrain the expansion history well enough that the WL measurements add relatively little constraining power. However, the error on Δγ scales nearly linearly with the WL errors, since all of the information on growth comes from the WL measurements. (Note that we scale the total WL errors, equivalent to multiplying systematic and statistical errors by the same factor.) Conversely, changing the SN or BAO errors has almost no impact on the Δγ constraint.
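As a rough consistency check on the numbers above, the DETF FoM can be recomputed from the quoted errors on $w_p$ and $w_a$. The inputs are rounded to two significant figures, so agreement with the quoted value of 664 is expected only at the level of several percent:

```latex
\mathrm{FoM} = \left(\sigma_{w_p}\,\sigma_{w_a}\right)^{-1}
\approx \frac{1}{0.014 \times 0.11} \approx 650 .
```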
Changing to our optimistic assumptions about WL systematics (total errors equal to twice the statistical errors), while retaining the fiducial SN and BAO assumptions, raises the FoM from 664 to 789 and lowers the Δγ error from 0.034 to 0.026. For the optimistic systematics model, WL measurements have the greatest leverage on the DETF FoM instead of the least, and the Δγ errors continue to scale approximately linearly with the WL errors. Thus, our conclusions about the power of WL relative to BAO and SN depend significantly on the assumed importance of WL systematics, which is difficult to predict at present.
When we move from the w0 - wa model to the general w(z) model, the forecast errors on Δγ barely change, since it is constrained by differential measurements of matter clustering over the redshift range of our fiducial data sets. The errors on G9, on the other hand, expand dramatically, because even within GR the overall amplitude of structure can be shifted by the behavior of w(z) outside of our constrained redshift range (i.e., at z > 3). If the amplitude of matter clustering proved inconsistent with that of a w0 - wa, G9 = 1 model, it would definitely indicate something interesting, but this measurement alone would not show whether the unusual behavior arises from a violation of GR or from unexpected behavior of w(z) at high redshift.
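The distinct roles of Δγ and G9 can be illustrated by integrating the growth-rate parameterization directly. The sketch below assumes a flat ΛCDM background with Ωm = 0.3 and the commonly used GR approximation γ ≈ 0.55; both values, the function names, and the choice to normalize the growth to the GR value at z = 9 are assumptions made here for illustration, not the article's forecasting code.

```python
import numpy as np


def omega_m_of_a(a, omega_m0=0.3):
    """Matter density parameter at scale factor a in flat LCDM (assumed)."""
    return omega_m0 / (omega_m0 + (1.0 - omega_m0) * a ** 3)


def growth_relative_to_z9(z, delta_gamma=0.0, g9=1.0, gamma=0.55,
                          omega_m0=0.3, n_steps=4000):
    """Growth at redshift z in units of the GR growth at z = 9.

    Integrates dlnG/dlna = Omega_m(a)^(gamma + delta_gamma) from a = 0.1
    (z = 9) to a = 1 / (1 + z), then applies the overall offset g9.
    """
    ln_a = np.linspace(np.log(0.1), np.log(1.0 / (1.0 + z)), n_steps + 1)
    rate = omega_m_of_a(np.exp(ln_a), omega_m0) ** (gamma + delta_gamma)
    # Trapezoidal rule in ln(a), written out by hand.
    ln_growth = np.sum(0.5 * (rate[1:] + rate[:-1]) * np.diff(ln_a))
    return g9 * np.exp(ln_growth)


if __name__ == "__main__":
    for z in (0.0, 0.5, 1.0, 2.0):
        base = growth_relative_to_z9(z)
        shifted = growth_relative_to_z9(z, delta_gamma=0.034)
        print(f"z = {z:.1f}: G/G(z=9) = {base:.3f}, "
              f"ratio with delta_gamma = 0.034: {shifted / base:.4f}")
```

In this parameterization Δγ changes the redshift dependence of the growth, so it can be measured differentially within the redshift range of the data, whereas G9 rescales the amplitude at all redshifts equally and is therefore degenerate with anything that alters growth at z > 9 or, in the general w(z) model, outside the constrained redshift range.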
For variations around our fiducial program, the impact of reducing the errors of SN measurements is greater than the impact of increasing the redshift range of these measurements. For example, reducing the error per redshift bin from 0.01 mag to 0.005 mag increases the FoM from 664 to 1197, while increasing the maximum redshift from 0.8 to 1.6 only raises the FoM to 841. These scalings imply that the highest priority for SN studies is to minimize statistical and systematic errors at z<1, and that pushing to higher redshifts is a lower priority until the reduction in z<1 systematics has been saturated. At fixed fsky, BAO constraints have a stronger dependence on maximum redshift, because at higher z the BAO measurements become more precise and the importance of the direct H(z) measurements grows.
We have not incorporated cluster abundances into our primary forecasts, but we have investigated how precisely our fiducial Stage III and Stage IV programs (CMB+SN+BAO+WL) predict the parameter combination $\sigma_{11,{\rm abs}}(z)\,\Omega_m^{0.4}$ that is best constrained by cluster abundances. For a $w_0$ - $w_a$ dark energy model, the forecast precision is ~ 1.5% for Stage III and ~ 0.75% for Stage IV if we assume GR is correct. If we allow GR deviations parameterized by $G_9$ and Δγ, then the forecast precision degrades significantly, especially for Stage III at z > 0.5. Our analysis in Section 6.3.3 indicates that clusters calibrated by stacked weak lensing should be able to achieve higher precision on $\sigma_{11,{\rm abs}}(z)\,\Omega_m^{0.4}$. When we add the anticipated cluster constraints for a $10^{4}$ deg$^2$ survey with a $10^{14}\,M_\odot$ mass threshold, assuming that calibration errors are limited by weak lensing statistics, we find that the DETF FoM grows by a factor of 1.4 at Stage III and 1.9 at Stage IV relative to the fiducial CMB+SN+BAO+WL program. The error on Δγ decreases by a factor of 3.2 for Stage III and by 1.6 for Stage IV. Cluster studies will be enabled automatically by large WL surveys, which can be used to identify clusters as optical galaxy concentrations and to provide mass calibration for clusters identified by any method (X-ray, SZ, optical). If they can achieve the limits imposed by weak lensing statistics, they can add considerable leverage to tests of dark energy models and deviations from GR.
We have adopted a similar strategy for some of the alternative probes discussed in Section 7. For a w0 - wa dark energy model, the forecast precision on H0 is 0.7% from our fiducial Stage IV program, 1.3% for Stage III. A direct measurement of H0 with 1% precision would improve the DETF FoM of the fiducial Stage IV program by 20%; a 2% measurement would improve the Stage III FoM by 15%. The forecast constraint on H0 degrades, dramatically, to ~ 60% in our general w(z) model, since large changes in w at low redshift can affect H0 significantly while having minimal impact on probes at higher redshift. Thus, a discrepancy between direct measurements and H0 constraints from CMB+SN+BAO+WL data could be a diagnostic for unusual low-z evolution of dark energy.
The Alcock-Paczynski parameter $H(z) D_A(z)$ is constrained to ~ 0.2-0.3% by our fiducial Stage IV program over the redshift range 0.2 < z < 3, setting a demanding target for AP tests. The corresponding precision forecast for Stage III is ~ 0.5%. Redshift-space distortions and galaxy clustering can measure the parameter combination $\sigma_8(z) f(z)$, which is constrained by our Stage IV fiducial program to about 5% at z ≈ 0.1, 2.5% at z = 0.5, and ~ 1% beyond z = 1, numbers that improve only slightly for the optimistic weak lensing systematics. For Stage III, the constraints at a given redshift are considerably weaker. In all cases the constraints are tighter if we assume GR (Δγ = 0, $G_9$ = 1), but the main purpose of redshift-space distortion analyses would be to test GR growth, so we regard the looser constraints as the more relevant targets for such analyses. This level of precision appears within reach of large galaxy redshift surveys if theoretical systematics can be adequately controlled, making redshift-space distortions a potentially powerful addition to the arsenal of cosmic acceleration probes. While WL and redshift-space distortions both probe structure growth, they have different dependences on the two distinct potentials that enter the GR spacetime metric (see Section 7.7), so a discrepancy between them could reveal a GR deviation that might not be captured by Δγ alone. Galaxy redshift surveys designed for BAO measurements should allow redshift-space distortion analyses (and AP tests) as an automatic by-product, which may greatly increase their science return. Precise measurements of the shape of the galaxy power spectrum could also reveal signs of scale-dependent growth, another possible consequence of modified gravity models, though these may be difficult to distinguish from other factors that affect the power spectrum shape (see Section 7.7).
The aggregate precision of our fiducial Stage IV program measurements, in the sense described in Section 8.6, is 0.23% in $D_L$ (from SN), 0.13% in $D_A$ (from BAO), 0.21% in H (also from BAO), and 0.33% in $\sigma_8$ for fixed geometry (for fiducial WL; optimistic WL yields 0.14%). For the fiducial Stage IV cluster program with weak lensing mass calibration we forecast 0.20% aggregate precision on $\sigma_8(z)\,\Omega_m^{0.4}$, while our fiducial Stage IV RSD forecast yields 0.22% aggregate precision on $f(z)\,\sigma_8(z)$. The ultimate limits on $H_0$ and Alcock-Paczynski measurements are still difficult to predict, but sub-percent precision appears well within reach on the Stage IV timescale. These forecasts represent a dramatic advance over the current state of the art, which is roughly 1-5% for distance measurements (SN, BAO, $H_0$) and ~ 5% for structure growth measurements ($\sigma_8$, f(z)). The cosmological measurements of the past two decades have established a "standard model" of cosmology based on inflation, cold dark matter, a cosmological constant, and a flat universe. The measurements of the next two decades will test that model far more stringently than it has been tested to date.
The future of cosmic acceleration studies depends partly on the facilities built to enable them, partly on the ingenuity of experimenters and theorists in controlling systematic errors and fully exploiting their data sets, and partly on the kindness of nature. The next generation of experiments could merely tighten the noose around w = -1, ruling out many specific theories but leaving us no more enlightened than we are today about the origin of cosmic acceleration. However, barely a decade after the first supernova measurements of an accelerating universe, it seems unwise to bet that we have uncovered the last "surprise" in cosmology. Equally important, the powerful data sets required to study cosmic acceleration support a broad range of astronomical investigations. These observational efforts are natural next steps in a long-standing astronomical tradition: mapping the universe with increasing precision over ever larger scales, from the solar system to the Galaxy to large scale structure to the CMB. These ever growing maps have taught us extraordinary things — that gravity is a universal phenomenon, that we live in a galaxy populated by 100 billion stars, that our galaxy is one of 100 billion within our Hubble volume, that our entire observable universe has expanded from a hot big bang 14 billion years in the past, that the dominant form of matter in the universe is non-baryonic, and that the early universe was seeded by Gaussian (or nearly Gaussian) fluctuations that have grown by gravity into all of the structure that we observe today. We hope that the continuation of this tradition will lead to new insights that are equally profound.
We gratefully acknowledge the many mentors, collaborators, and students with whom we have learned this subject over the years. For valuable comments and suggestions on the draft manuscript, we thank Joshua Frieman, Dragan Huterer, Chris Kochanek, Andrey Kravtsov, Mark Sullivan, and Alexey Vikhlinin. We also thank the many readers who sent comments in response to the original arXiv posting of the article, which led to numerous improvements in the text and referencing. We gratefully acknowledge support from the National Science Foundation, the National Aeronautics and Space Administration, the Department of Energy Office of Science, including NSF grants AST-0707725, AST-0707985, AST-0807337, and AST-1009505, NASA grant NNX07AH11G1320, and DOE grants DE-FG03-02-ER40701 and DE-SC0006624. DW acknowledges the hospitality of the Institute for Advanced Study and the support of an AMIAS membership during critical phases of this work. MM was supported by the Center for Cosmology and Astro-Particle Physics (CCAPP) at Ohio State University. CH acknowledges additional support from the Alfred P. Sloan Foundation and the David & Lucile Packard Foundation. ER was supported by the NASA Einstein Fellowship Program, grant PF9-00068.