In modern cosmological models, ∼5/6 of the mass in the Universe is made of dark matter (Planck Collaboration et al., 2016). This dark matter forms the skeleton on which galaxies form, evolve, and merge. In the context of this model, which is now well established from a wide range of observations, fluctuations in the matter distribution were created in the first fraction of a second during an inflationary period. Gravitational instability amplified these fluctuations over time. Gas and dark matter were initially well mixed; as the Universe evolved, gas was able to dissipate energy and fell to the centers of dark matter halos. In large enough dark matter halos, gas was able to cool and form stars, producing a protogalaxy. The power spectrum of matter indicates that small objects should form first, and that halos should grow and merge over time. Galaxies within these halos then continue to form stars (in situ) as well as to grow through merging (ex situ), because their dark matter halos merge. After they form, galaxies affect their surroundings through energetic processes, collectively known as feedback, which influence future gas accretion and star formation.
In this context, the growth, internal properties, and spatial distribution of galaxies are expected to be closely connected to the growth, internal properties, and spatial distribution of dark matter halos. Put simply, the luminous matter in the Universe is arranged in galaxies, and in a cold dark matter model, the dark matter in the Universe is arranged in dark matter halos. The physical and statistical connection between them is the focus of this review. We call this the galaxy–halo connection, which in detail can refer to the full multivariate distribution of properties of halos and the galaxies that form within them. Elucidating this connection is a stepping stone to answering several of the largest questions in astrophysics and cosmology today. These include the following:
The galaxy–halo connection as a concept started with the earliest understanding of modern galaxy formation within the framework of cold dark matter (CDM) models, as did the understanding that the spatial distribution of galaxies can lead to insight into their formation properties. For example, Peebles (1980) discussed the two-point statistics of galaxies, and early work recognized that massive galaxies and clusters should have different clustering properties than average galaxies (Davis & Peebles, 1983, Bahcall & Soneira, 1983, Klypin & Kopylov, 1983, Kaiser, 1984), and that measuring these clustering properties could provide information about the masses of the dark matter halos that they lived in, because of the strong dependence of halo clustering on halo mass (Bardeen et al., 1986, Mo & White, 1996). It also was recognized early on that this relationship could be complex and scale dependent (e.g. Klypin, Primack & Holtzman, 1996, Jenkins et al., 1998).
However, it was not until the late 1990s that cosmological simulations were able to resolve the substructures within larger dark matter halos. At about the same time, the first large galaxy surveys were beginning. The APM survey (Baugh, 1996) was the first to measure the galaxy correlation function for a large sample of galaxies. Kravtsov & Klypin (1999) and Colín et al. (1999) were able to resolve substructures in simulations of cosmological volumes, and they measured approximately power-law correlation functions that were consistent with measurements from APM. This field was then revolutionized by the Two-degree Field Galaxy Redshift Survey (2dFGRS; Colless et al., 2001) and the Sloan Digital Sky Survey (SDSS; York et al., 2000). For the first time, these surveys measured the spatial clustering of galaxy samples large enough to be split by physical properties such as luminosity and color, or stellar mass and star formation rate (SFR). Pioneering detections of galaxies at high redshift (Adelberger et al., 1998) also enabled the first studies of galaxy clustering at these epochs.
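The correlation function measured by these surveys and simulations quantifies the excess probability, relative to an unclustered distribution, of finding a pair of objects at separation r. The following is a minimal sketch of the simplest pair-count ("natural") estimator, ξ(r) = DD/RR − 1 with normalized pair counts. It assumes points in a non-periodic unit box and a brute-force pair loop; the function names, sample sizes, and bin edges are illustrative choices, not taken from any of the works cited here. Analyses of real surveys typically use more robust estimators and must account for the survey geometry.

```python
import math
import random


def pair_counts(points, edges):
    """Count unique pairs whose separation falls in each [edges[k], edges[k+1]) bin."""
    counts = [0] * (len(edges) - 1)
    n = len(points)
    for i in range(n):
        xi_, yi_, zi_ = points[i]
        for j in range(i + 1, n):
            xj, yj, zj = points[j]
            r = math.sqrt((xi_ - xj) ** 2 + (yi_ - yj) ** 2 + (zi_ - zj) ** 2)
            for k in range(len(counts)):
                if edges[k] <= r < edges[k + 1]:
                    counts[k] += 1
                    break
    return counts


def natural_estimator(data, randoms, edges):
    """Return xi(r) = DD/RR - 1, with DD and RR normalized by the total number of pairs."""
    dd = pair_counts(data, edges)
    rr = pair_counts(randoms, edges)
    n_dd = len(data) * (len(data) - 1) / 2
    n_rr = len(randoms) * (len(randoms) - 1) / 2
    return [(d / n_dd) / (r / n_rr) - 1.0 for d, r in zip(dd, rr)]


def uniform_box(n, rng):
    """Draw n unclustered points in the unit cube (hypothetical stand-in for a catalog)."""
    return [(rng.random(), rng.random(), rng.random()) for _ in range(n)]


rng = random.Random(0)
edges = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]  # illustrative separation bins
data = uniform_box(400, rng)      # stand-in "galaxy" sample with no clustering
randoms = uniform_box(800, rng)   # random catalog with the same geometry
xi = natural_estimator(data, randoms, edges)
```

Because the stand-in data sample here is itself unclustered, the estimated ξ should scatter around zero; a clustered galaxy or halo catalog would instead give positive ξ that rises toward small separations.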
These developments amount to two joint revolutions: (a) the advent of numerical simulations that could resolve the dark matter structures and substructures hosting galaxies, over volumes large enough to measure their spatial clustering properties, and (b) the advent of large galaxy surveys that could identify large samples of galaxies and measure their spatial clustering, including over a range of redshifts. Together, they have led to a new set of approaches to statistically connect these two distributions and infer the connection between galaxies and halos, which have flourished over the last ∼15 years.¹ The primary focus of this review is on the inference of this statistical connection between galaxies and halos enabled by these two advances. We will highlight (a) theoretical approaches to the problem, (b) the primary insights that we have gained from studying the galaxy–halo connection, and (c) outstanding issues. We note that the development of the galaxy–halo connection is connected to the development of the “halo model” (Ma & Fry, 2000, Peacock & Smith, 2000, Seljak, 2000, Cooray & Sheth, 2002), a method for analytically calculating the non-linear clustering of dark matter, using the properties of dark matter halos (including their abundance and spatial clustering) as the basic unit. The halo model can be combined with models of the galaxy–halo connection to predict galaxy clustering.
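For orientation, the halo model's decomposition of the matter power spectrum is commonly written as the sum of a one-halo term (pairs of mass elements within the same halo) and a two-halo term (pairs in different halos). The notation below is an assumption introduced for illustration and is not defined elsewhere in this section: n(M) is the halo mass function (abundance), b(M) the halo bias (spatial clustering), u(k|M) the normalized Fourier transform of the halo density profile, ρ̄ the mean matter density, and P_lin(k) the linear matter power spectrum.

```latex
\begin{align}
P(k) &= P_{\rm 1h}(k) + P_{\rm 2h}(k), \\
P_{\rm 1h}(k) &= \int {\rm d}M \, n(M) \left(\frac{M}{\bar{\rho}}\right)^{2} \left| u(k|M) \right|^{2}, \\
P_{\rm 2h}(k) &= P_{\rm lin}(k) \left[ \int {\rm d}M \, n(M) \, \frac{M}{\bar{\rho}} \, b(M) \, u(k|M) \right]^{2}.
\end{align}
```

Predicting galaxy clustering in this framework amounts to replacing the mass weighting with a model for how galaxies populate halos of a given mass.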
Elucidating the statistical connection between galaxies and halos relies on another major advance of the last two decades: the establishment of a standard cosmological model, ΛCDM, in which the Universe consists of roughly 5% baryonic matter, 25% dark matter, and 70% dark energy. The parameters of this model are now known to high precision (Betoule et al., 2014, Planck Collaboration et al., 2016, DES Collaboration et al., 2017, Alam et al., 2017); this allows robust predictions, using numerical simulations, for the growth of structure and the formation and evolution of dark matter halos. Although this review relies heavily on basic predictions of the ΛCDM model and the properties of dark matter halos, these are well reviewed elsewhere; see for example Frenk & White (2012) and Primack (2012). The cosmological simulations that form the basis of many of the predictions described here were reviewed by Kuhlen, Vogelsberger & Angulo (2012). The current status of physical models of galaxy formation was reviewed by Somerville & Davé (2015), and outstanding theoretical challenges were reviewed by Naab & Ostriker (2017); the formation of galaxy clusters was reviewed by Kravtsov & Borgani (2012). Modern cosmological probes with galaxy surveys were reviewed by Weinberg et al. (2013); here we focus on cosmological studies that require an understanding of the galaxy–halo connection. The interplay between the galaxy–halo connection and models of dark matter on small scales was well reviewed by Bullock & Boylan-Kolchin (2017), so we only briefly touch on these issues here.
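As a quick consistency check, the rounded energy budget quoted above reproduces the ∼5/6 dark matter share of the total mass cited at the start of this section. The symbols Ω_b, Ω_dm, and Ω_Λ are the conventional density parameters, introduced here only for illustration; the numbers are the rounded percentages from the text, not precise measurements.

```latex
\[
\frac{\Omega_{\rm dm}}{\Omega_{\rm dm} + \Omega_{\rm b}}
  = \frac{0.25}{0.25 + 0.05}
  = \frac{5}{6} \approx 0.83,
\qquad
\Omega_{\rm b} + \Omega_{\rm dm} + \Omega_{\Lambda}
  = 0.05 + 0.25 + 0.70 = 1 .
\]
```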
We introduce the methods of modeling and predicting the galaxy–halo connection in Section 2. In Section 3, we review the primary observational handles on the galaxy–halo connection. In Section 4, we discuss complications to the simplest modeling approaches. In Section 5, we discuss current observational constraints on the mean and scatter of the relationship between galaxy mass and halo mass; Section 6 expands this discussion to other aspects of the galaxy–halo connection. We review the primary applications of the galaxy–halo connection in Section 7, including understanding galaxy formation physics, constraining cosmological parameters, and mapping and understanding the physics of dark matter. In Section 8, we summarize the key aspects of the galaxy–halo connection that have been understood from the last decade of studies, assess the outlook for such studies over the next decade, including how they will be influenced by upcoming surveys, and highlight outstanding questions for future work.
¹ We note that prior to 1999, the only use of the phrase “the galaxy–halo connection” in the literature was to describe the Milky Way Galaxy and its stellar halo; see e.g. van den Bergh (1996).