Large Scale Structure of the Universe

2. THE TWO-POINT CORRELATION FUNCTION

To quantify the clustering of galaxies, one must survey not only galaxies in clusters but the entire galaxy density distribution, from voids to superclusters. The most commonly used quantitative measure of large scale structure is the galaxy two-point correlation function, ξ(r), which traces the amplitude of galaxy clustering as a function of scale. ξ(r) is defined as a measure of the excess probability dP, above what is expected for an unclustered random Poisson distribution, of finding a galaxy in a volume element dV at a separation r from another galaxy,

$$dP = n\,[1 + \xi(r)]\,dV \qquad (1)$$

where n is the mean number density of the galaxy sample in question (Peebles 1980). Measurements of ξ(r) are generally performed in comoving space, with r having units of h^-1 Mpc. The Fourier transform of the two-point correlation function is the power spectrum, which is often used to describe density fluctuations observed in the cosmic microwave background.
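For reference, the relation between ξ(r) and the power spectrum can be written out explicitly. The symbol P(k) and the Fourier convention below are assumptions (the placement of the factors of 2π differs between authors), and the field is taken to be isotropic, so that both functions depend only on the magnitudes r and k:

```latex
P(k) = \int \xi(r)\, e^{-i \mathbf{k} \cdot \mathbf{r}}\, d^3r
     = 4\pi \int_0^\infty \xi(r)\, \frac{\sin(kr)}{kr}\, r^2\, dr,
\qquad
\xi(r) = \frac{1}{2\pi^2} \int_0^\infty P(k)\, \frac{\sin(kr)}{kr}\, k^2\, dk .
```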

To measure ξ(r), one counts pairs of galaxies as a function of separation and divides by what is expected for an unclustered distribution. To do this one must construct a "random catalog" that has the same three-dimensional coverage as the data (including the same sky coverage and smoothed redshift distribution) but is populated with randomly distributed points. The ratio of pairs of galaxies observed in the data relative to pairs of points in the random catalog is then used to estimate ξ(r). Several different estimators for ξ(r) have been proposed and tested. An early estimator that was widely used is from Davis & Peebles (1983):

$$\xi = \frac{n_R}{n_D}\,\frac{DD}{DR} - 1 \qquad (2)$$

where DD and DR are counts of pairs of galaxies (in bins of separation) in the data catalog and between the data and random catalogs, respectively, and n_D and n_R are the mean number densities of galaxies in the data and random catalogs. Hamilton (1993) later introduced an estimator with smaller statistical errors,

$$\xi = \frac{DD \cdot RR}{(DR)^2} - 1 \qquad (3)$$

where RR is the count of pairs of points as a function of separation in the random catalog. The most commonly used estimator is from Landy & Szalay (1993),

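$$\xi = \frac{1}{RR}\left[ DD \left(\frac{n_R}{n_D}\right)^2 - 2\,DR\,\frac{n_R}{n_D} + RR \right] \qquad (4)$$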

This estimator has been shown to perform as well as the Hamilton estimator (Eqn. 3). While it requires more computational time, it is less sensitive to the size of the random catalog and handles well the edge corrections that can affect clustering measurements on large scales (Kerscher et al. 2000).
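As a rough illustration of Eqns. 2-4, the sketch below counts pairs by brute force and evaluates the three estimators in bins of separation. It is a minimal example under stated assumptions, not a production method: the function names (`pair_counts`, `xi_estimators`) and the toy catalogs are hypothetical, positions are assumed to be Cartesian comoving coordinates, and each pair count is normalised by the total number of pairs available, which here plays the role of the n_R/n_D factors.

```python
import math
import random
from bisect import bisect_right


def pair_counts(a, b, edges, auto=False):
    """Histogram pair separations in bins [edges[k], edges[k+1]).

    a, b  : lists of (x, y, z) comoving positions (e.g. in h^-1 Mpc)
    edges : increasing list of bin edges in the same units
    auto  : if True, only catalog `a` is used and each unordered
            pair is counted once
    """
    counts = [0] * (len(edges) - 1)
    for i, p in enumerate(a):
        others = a[i + 1:] if auto else b
        for q in others:
            k = bisect_right(edges, math.dist(p, q)) - 1
            if 0 <= k < len(counts):
                counts[k] += 1
    return counts


def xi_estimators(data, rand, edges):
    """Davis-Peebles, Hamilton and Landy-Szalay estimates per bin."""
    nd, nr = len(data), len(rand)
    # Normalise each count by the number of pairs that could be formed.
    dd = [c / (nd * (nd - 1) / 2) for c in pair_counts(data, data, edges, auto=True)]
    dr = [c / (nd * nr) for c in pair_counts(data, rand, edges)]
    rr = [c / (nr * (nr - 1) / 2) for c in pair_counts(rand, rand, edges, auto=True)]
    out = {"davis_peebles": [], "hamilton": [], "landy_szalay": []}
    for d, x, r in zip(dd, dr, rr):
        # Empty bins give an undefined estimate, reported as NaN.
        out["davis_peebles"].append(d / x - 1 if x > 0 else math.nan)
        out["hamilton"].append(d * r / x**2 - 1 if x > 0 else math.nan)
        out["landy_szalay"].append((d - 2 * x + r) / r if r > 0 else math.nan)
    return out


# Toy usage: an unclustered "data" set in a unit box, so every
# estimator should scatter around zero.
rng = random.Random(1)


def uniform_box(n):
    return [(rng.random(), rng.random(), rng.random()) for _ in range(n)]


data = uniform_box(200)
rand = uniform_box(400)
xi = xi_estimators(data, rand, edges=[0.2, 0.3, 0.4])
```

The double loop scales as the square of the catalog size, so it is only practical for small samples; real surveys would typically use a tree-based or gridded pair counter instead.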

As can be seen from the form of the estimators given above, measuring ξ(r) depends sensitively on having a random catalog that accurately reflects the various spatial and redshift selection effects in the data. These can include effects such as edges of slitmasks or fiber plates, overlapping slitmasks or plates, gaps between chips on the CCD, and changes in spatial sensitivity within the detector (e.g., the effective radial dependence within X-ray detectors). If one is measuring a full three-dimensional correlation function (discussed below) then the random catalog must also accurately include the redshift selection of the data. The random catalog should also be large enough that it does not introduce significant Poisson error in the estimator. This can be checked by ensuring that the RR pair counts in the smallest bin are high enough that their Poisson errors are subdominant.
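The two requirements on the random catalog can be sketched in code. The example below is a simplified illustration, not a survey pipeline: it assumes a plain rectangular footprint in RA and Dec with no mask, gaps or sensitivity variations, uses a boxcar-smoothed histogram as the "smoothed redshift distribution", and adopts an arbitrary threshold (`max_ratio=0.3`) for when the Poisson error on RR counts as subdominant. All function names and parameter values are hypothetical.

```python
import math
import random


def smoothed_nz(redshifts, z_min, z_max, n_bins=20, half_width=2):
    """Boxcar-smoothed redshift histogram; returns (weights, bin width)."""
    width = (z_max - z_min) / n_bins
    hist = [0] * n_bins
    for z in redshifts:
        k = int((z - z_min) / width)
        if 0 <= k < n_bins:
            hist[k] += 1
    smooth = []
    for k in range(n_bins):
        window = hist[max(0, k - half_width):k + half_width + 1]
        smooth.append(sum(window) / len(window))
    return smooth, width


def make_random_catalog(data, n_random, ra_range, dec_range, z_range, rng=random):
    """Draw (ra, dec, z) points that are uniform on the sky within a
    rectangular footprint and follow the smoothed redshift
    distribution of `data`, a list of (ra_deg, dec_deg, z)."""
    weights, width = smoothed_nz([z for _, _, z in data], *z_range)
    sin_lo = math.sin(math.radians(dec_range[0]))
    sin_hi = math.sin(math.radians(dec_range[1]))
    bins = rng.choices(range(len(weights)), weights=weights, k=n_random)
    catalog = []
    for k in bins:
        ra = rng.uniform(*ra_range)
        # Uniform in sin(dec) gives uniform density per unit solid angle.
        dec = math.degrees(math.asin(rng.uniform(sin_lo, sin_hi)))
        z = z_range[0] + (k + rng.random()) * width
        catalog.append((ra, dec, z))
    return catalog


def randoms_are_sufficient(dd_counts, rr_counts, max_ratio=0.3):
    """True if, in every bin, the fractional Poisson error on RR
    (1/sqrt(RR)) is at most max_ratio times that on DD."""
    for dd, rr in zip(dd_counts, rr_counts):
        if rr <= 0:
            return False
        if dd > 0 and 1 / math.sqrt(rr) > max_ratio / math.sqrt(dd):
            return False
    return True


# Toy usage with made-up numbers.
rng = random.Random(42)
data = [(rng.uniform(150, 160), rng.uniform(0, 5), rng.uniform(0.2, 0.4))
        for _ in range(500)]
randoms = make_random_catalog(data, 2000, (150, 160), (0, 5), (0.0, 1.0), rng=rng)
```

In practice the footprint step would be replaced by the survey's actual angular selection function, and the size of the random catalog would be increased until the check on the smallest-separation bin passes.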