2.3.1 Statistics
Statistics appeared in cosmology before the nature of the particles to be used was established. In the first paper (Sect. 3.1.1) to describe the universe in terms of the general theory of relativity (Einstein 1917), it is assumed that the particles filling the universe follow a Boltzmann distribution. Einstein's particles are stars.
Charlier (1922) calculated the number of collisions between galaxies from Maxwell's equation in his hierarchical universe, which is based on Lambert's (1761) principles but starts with galaxies instead of planetary (or even Jovian) systems.
The universe described by Milne (1933) starts from statistics.
``We . . . impose the condition that this spatiovelocity distribution formula shall actually represent, statistically, a concourse of material objects; this is done by a modification of Boltzmann's gas equation.
. . . since we expect the clots or subsystems to be formed in the spatial positions of the singularities in the original swarm, we were led to infer an exact correlation between position and velocity . . .; this being contrasted with the merely statistical correlation . . . It is clear that once we introduce an exact correlation between velocity and position, our system is not a statistical one but a hydrodynamical one.''
Milne concludes:
``Relativity → statistics → hydrodynamics → dynamics → gravitation - such was our course of investigation.''
Several points seem noteworthy concerning the earliest applications of statistics to cosmology:
- they include, implicitly or explicitly, the distributions of both positions and motions;
- no assumptions about the distributions are made by Milne; the assumptions made by the other authors are simple statistical or hierarchical distributions.
Another statistical and probabilistic aspect entered extragalactic astronomy with Wirtz (1918) in his search for correlations between parameters. Covariance functions, now generally called correlation functions, became the standard tools.
Positional distributions became of interest as a consequence of the increasing body of data. The early application of probabilistic methods is well exemplified by the work of Zwicky (1937c, Sect. 2.2):
``By a bold extrapolation of well-known results of ordinary statistical mechanics we adopt the following working hypothesis . . .:
1. The system of extragalactic nebulae throughout the known parts of the universe forms a statistically stationary system.
2. Every constellation of nebulae is to be endowed with a probability weight f(ε) which is a function of the total energy of this constellation. Quantitatively the probability P of the occurrence of a certain configuration of nebulae is assumed to be of the type
    P = A(V / V_{0}) f(ε / ε̄)
Here V is the volume occupied by the configuration or cluster considered, V_{0} is the volume to be allotted, on the average, to any individual nebula in the known parts of the universe, and ε is the total energy of the cluster in question, while ε̄ will probably be found to be proportional to the average kinetic energy of individual nebulae. The function A(V / V_{0}) can be determined a priori. On the other hand, f(ε / ε̄) presumably will be found to be a monotonously decreasing function in ε / ε̄, analogous in type to a Boltzmann factor
    f ∝ exp(-ε / ε̄)
Assuming the principles stated in the preceding to be correct, we may draw the following hypothetical conclusions:
a) The clustering of nebulae is favoured by high values of f and is partially checked by low values of the a priori probability A.
b) If, as would appear to be certain, nebulae are not all of the same mass, nebulae of high mass are favored in the process of clustering, since they contribute most to produce high values of the weight function f.
c) As a consequence of b, we should expect that the frequency with which different types of nebulae occur will not be the same among field nebulae and among cluster nebulae. In other words, clustering is a process which tends to segregate certain types of nebulae from the remaining types. This may contribute toward the correct interpretation of the well-known fact that cluster nebulae are preponderantly of the globular and elliptical types, whereas field nebulae are mostly spirals . . . It is not necessary as yet to call on evolutionary processes to explain why the representation of nebular types in clusters differs from that in the general field . . .
d) If cluster nebulae, on the average, are really more massive than field nebulae, the conclusion suggests itself that globular nebulae may, somewhat unexpectedly, be among the most massive systems. It will be of great interest to check this inference by a search for gravitational lens effects among globular nebulae.''
Compare also the work of Holmberg in Sect. 2.3.2.
Another step in analyzing galaxy distributions is the adaptation of the correlation function from parameter space to real space through the definition of n-point correlation functions for two and three dimensions: the angular and the spatial correlation functions.
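What a two-point correlation function measures can be illustrated with a minimal sketch. It assumes a small, flat patch of sky (so that angular separations reduce to plane distances) and uses the simple "natural" pair-count estimator, DD/RR - 1, in which pair counts in the data are compared with pair counts in an unclustered random catalogue covering the same area. The function names `pair_counts` and `natural_estimator` are hypothetical, and this estimator is only one common modern choice; it is not claimed to be the procedure of any of the authors cited in this section.

```python
import numpy as np

def pair_counts(points, bin_edges):
    """Histogram of separations over all distinct pairs of points.

    points    : array of shape (n, 2), positions in a flat patch
    bin_edges : increasing array of separation bin edges
    """
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(len(points), k=1)  # count each pair once
    counts, _ = np.histogram(dist[upper], bins=bin_edges)
    return counts

def natural_estimator(data, random, bin_edges):
    """Estimate w in each separation bin as (normalised DD / RR) - 1.

    A positive value means an excess of pairs over an unclustered
    distribution at that separation; bins with no random pairs give NaN.
    """
    dd = pair_counts(data, bin_edges).astype(float)
    rr = pair_counts(random, bin_edges).astype(float)
    nd, nr = len(data), len(random)
    # Rescale for the different numbers of pairs in the two catalogues.
    norm = (nr * (nr - 1)) / (nd * (nd - 1))
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(rr > 0, norm * dd / rr - 1.0, np.nan)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy clustered catalogue: members scattered around random centres.
    centres = rng.random((40, 2))
    members = centres[rng.integers(0, 40, 500)]
    data = members + 0.01 * rng.standard_normal((500, 2))
    random = rng.random((1000, 2))
    edges = np.linspace(0.0, 0.2, 11)
    print(natural_estimator(data, random, edges))
```

Because the data and the random catalogue occupy the same region, edge effects enter both pair counts in the same way and largely cancel in the ratio. The all-pairs distance matrix keeps the sketch short but limits it to a few thousand points.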
The concept of clustering (Sect. 2.1), basic to some of the formal statistical approaches used since the 1950s, is presented by Carpenter (1938):
``It is concluded that the density restrictions and the mass restrictions in metagalactic space are real and represent a fundamental property of the distribution of matter in space. It follows as a corollary and from the interpretation of [Fig. 18] that there is no basic and essential distinction between the large, rich clusters and the small, loose groups. Rather, the objects commonly recognized as physical clusterings are merely the extremes of a nonuniform though not random distribution which is limited by density as well as by population. From this point of view, the term `supergalaxy' is of questionable propriety, since it implies a distinctive and coherent organic structure inherently of a higher order than the individual galaxies themselves.''
Figure 18. Density-diameter relation of clusters of galaxies (Carpenter 1938).

Upon the publication of the first data from the Lick catalogue, Neyman and Scott (1952) and Neyman et al. (1953) introduced the angular pair correlation function, and Limber (1953, 1954) introduced both the angular and the spatial pair correlation functions together with the integral equation relating the two. This initiated the first period of probabilistic studies of the distribution of galaxies. Extensive discussions were given by Neyman et al. (1956).
Results of the first applications to the Lick Survey by Neyman et al. and Limber are displayed in Figs. 19 and 20.
Figure 19. Quasi-correlations in Lick survey fields (Neyman et al. 1953).

The full bloom of statistical studies of galaxy clustering using correlation functions began to develop almost two decades after the initial phase. Essential steps leading to it were the publication of the complete Lick catalogue in 1967, its new reduction by Seldner et al. (1977), the impact of Peebles' books (1971, 1980), and the availability of powerful computing devices.
Figure 20. Correlation functions (Limber 1954).

The importance of Limber's equation was realized when more galaxy redshifts became available. Totsuji and Kihara (1969) published an approximate solution for small angles, and the equation was extended to the general relativistic case by Groth and Peebles (1977). More detail is given in an extensive review article covering the 1970s (Fall 1979).
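For orientation, the integral relation between the spatial and the angular function can be written out. The block below is a sketch in modern textbook notation, which is an assumption here and is not taken from the papers cited above: w(θ) is the angular and ξ(r) the spatial pair correlation function, x is the distance from the observer, and φ(x) is a survey selection function (the fraction of galaxies at distance x that enter the catalogue). Euclidean geometry, small angles, and a correlation length much smaller than the survey depth are assumed.

```latex
% Small-angle form of Limber's equation (Euclidean space; notation assumed)
\[
  w(\theta) \;=\;
  \frac{\displaystyle \int_0^\infty dx\, x^4\, \phi(x)^2
        \int_{-\infty}^{\infty} du\; \xi\!\left(\sqrt{u^2 + x^2\theta^2}\right)}
       {\displaystyle \left[\int_0^\infty dx\, x^2\, \phi(x)\right]^2}
\]

% Power-law case: \xi(r) = (r_0/r)^{\gamma} with \gamma > 1
\[
  \int_{-\infty}^{\infty} du\, \left(u^2 + x^2\theta^2\right)^{-\gamma/2}
  \;=\; (x\theta)^{1-\gamma}\,\sqrt{\pi}\,
        \frac{\Gamma\!\left(\frac{\gamma-1}{2}\right)}{\Gamma\!\left(\frac{\gamma}{2}\right)}
  \quad\Longrightarrow\quad
  w(\theta) \;\propto\; \theta^{\,1-\gamma}
\]
```

Under these assumptions a power-law spatial correlation function projects onto a power-law angular correlation function whose index is shallower by one, which is why a slope measured on the sky can be translated into a slope in space.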
Further statistical techniques are mentioned by Ott (1988). Fractal analysis is a newer approach to testing hierarchical clustering. It was introduced by Mandelbrot and first applied by him to astronomical problems in 1975.
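One way to make the fractal idea concrete is the correlation dimension: if the fraction C(r) of point pairs closer than r scales as a power of r, the exponent serves as a dimension of the point set. The sketch below is an illustrative technique chosen here, not necessarily the measure used by Mandelbrot or discussed by Ott; the function name `correlation_dimension` and the choice of radii are assumptions.

```python
import numpy as np

def correlation_dimension(points, radii):
    """Slope of log C(r) against log r, where C(r) is the fraction of
    distinct pairs separated by less than r.

    points : array of shape (n, d)
    radii  : increasing array of radii; each must enclose at least
             one pair, otherwise log C(r) is undefined
    """
    n = len(points)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))[np.triu_indices(n, k=1)]
    c = np.array([(dist < r).mean() for r in radii])
    slope, _intercept = np.polyfit(np.log(radii), np.log(c), 1)
    return slope

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    radii = np.logspace(-2, -1, 8)  # 0.01 to 0.1 in box units
    # Points on a line embedded in the plane: dimension close to 1.
    t = rng.random(1000)
    line = np.column_stack([t, 0.3 * t])
    # Points filling the unit square: dimension close to 2.
    plane = rng.random((1000, 2))
    print(correlation_dimension(line, radii))
    print(correlation_dimension(plane, radii))
```

The finite size of the sample biases the fitted slope slightly low at the larger radii, so the range of radii should stay well below the extent of the point set.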