In the 1930s Hubble & Humason, aiming at a high-redshift extension of the velocity-distance relationship, measured the velocities of several galaxies in clusters. In 1931, they provided the first estimates of the velocity dispersions in four clusters of galaxies (9). Hubble & Humason noted that the velocity range spanned by Coma galaxies was larger than in the other clusters (Virgo, Pegasus, Pisces). This was a first hint of the relation between richness and velocity dispersion that Bahcall later established in 1981. Hubble's early estimate of the cluster velocity dispersion was 700 km/s (see Fig. 21, from Smith), a value remarkably close to modern estimates. Zwicky [512, 513] immediately saw the great potential of Hubble & Humason's data, and used them to derive the mass of the Coma cluster by applying the virial theorem (10). Smith followed Zwicky and derived the virial mass of the Virgo cluster.
Figure 21. The distribution of velocities of Virgo cluster galaxies. From Smith (1936).
Zwicky's milestone paper, ``On the Masses of Nebulae and of Clusters of Nebulae'', published in 1937, is an exceptional work. In that paper, Zwicky correctly noted that the masses of nebulæ derived from rotation curves are underestimated. By assuming, ``as a first approximation'', that clusters of nebulæ are stationary systems, and using the virial theorem, he derived a very conservative estimate of the Coma cluster mass. This implied a cluster mass-to-light ratio of 68 M⊙/L⊙ (after conversion to a modern value of the Hubble constant). Zwicky had discovered the missing mass problem.
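As a rough illustration of the kind of estimate involved, the sketch below applies the virial theorem to a uniform sphere of galaxies, for which M = 5 R σ²/G, with σ the line-of-sight velocity dispersion. The function name, the uniform-sphere geometry, and the 1 Mpc radius are assumptions made here for illustration. The 700 km/s dispersion is simply the value quoted earlier in the text, and the result is not meant to reproduce Zwicky's own numbers.

```python
# Gravitational constant in units of Mpc (km/s)^2 / Msun.
G_MPC_KMS2_PER_MSUN = 4.30091e-9


def virial_mass_uniform_sphere(sigma_los_kms, radius_mpc):
    """Virial mass in solar masses for a uniform sphere.

    Assumes a stationary system with isotropic velocities, so that the
    mean square 3D velocity is 3 * sigma_los^2, and a potential energy
    of -3 G M^2 / (5 R). The virial theorem then gives
    M = 5 R sigma_los^2 / G.
    """
    return 5.0 * radius_mpc * sigma_los_kms ** 2 / G_MPC_KMS2_PER_MSUN


# Illustrative (assumed) inputs: sigma = 700 km/s, R = 1 Mpc.
mass = virial_mass_uniform_sphere(700.0, 1.0)
print(f"{mass:.2e} Msun")
```

With these assumed inputs the mass comes out at several times 10^14 solar masses. Dividing such a mass by the total luminosity of the cluster galaxies gives the mass-to-light ratio discussed above.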
His discovery relied very much on the hypothesis of cluster stability. In support of his hypothesis, Zwicky noted that galaxies in the field have a much lower velocity dispersion than galaxies in clusters. This indicated that field galaxies could not originate from cluster disruption, or they would have much higher velocities than observed. In this context, Zwicky implicitly criticized the work of Smith, and emphasized the danger of applying the virial theorem to irregular systems of galaxies, which are not likely to be stable systems. Because of the possible biases inherent in the virial mass estimates, Zwicky suggested using gravitational lensing as the ``simplest and most accurate mass determination''. He was half a century ahead of the observations [289, 428]!
Smith's paper essentially followed in the steps of Zwicky, but was published one year before the English version of Zwicky's paper, and not surprisingly Hubble quoted Smith and not Zwicky (although Zwicky was quoted by Smith himself). Hubble remarked that galaxy mass estimates were likely to be lower limits, while virial theorem estimates of cluster masses were likely to be upper limits, so that eventually the two might come into agreement. As a matter of fact, Zwicky's estimate of the Coma mass and Smith's estimate of the Virgo mass were quite correct, or, if anything, too low (Zwicky having tried to be conservative). Still, a straightforward application of the virial theorem was not without problems. In 1959 Limber obtained a more general expression for the virial theorem, in order to account for the possible presence of diffuse intracluster (IC) matter. Much later Nezhinskii & Osipkov showed that the uncertainties in the virial mass estimates are much larger than generally assumed if the diffuse matter is not distributed like the galaxies and dominates the potential. However, as it turned out, the cluster virial mass estimates were essentially correct, and it was the galaxy mass estimates which had to be revised upwards.
Figure 22. Portraits of Fritz Zwicky (left) and George O. Abell.
Holmberg was possibly the first to criticize Zwicky's dark mass hypothesis, which he considered an ``unlikely assumption''. He attributed the high velocity dispersion of cluster galaxies to the presence of a large number of galaxies on hyperbolic orbits, i.e. interlopers. In 1954 Schwarzschild tried to get rid of ``interlopers'' to improve the estimate of the Coma cluster velocity dispersion. After eliminating many supposed interlopers from the Coma cluster sample (far too many, in fact) he arrived at the wrong estimate of 630 km/s for the velocity dispersion of the Coma cluster. Some years later Abell pointed out that the existence of superclusters enhances the probability of projection effects, leading to overestimates of the cluster velocity dispersions. In 1977 Yahil & Vidal devised a method for getting rid of interlopers in galaxy clusters that remained in use until recently.
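To show what removing interlopers can mean in practice, here is a minimal sketch of a generic iterative sigma-clipping scheme. It is used here as an assumed stand-in and is not necessarily identical to any of the procedures cited above. The function name, the 3σ threshold, and the velocity sample are all hypothetical.

```python
import statistics


def clip_interlopers(velocities_kms, n_sigma=3.0, max_iter=20):
    """Iteratively drop galaxies whose velocity lies more than
    n_sigma dispersions from the sample mean.

    Returns the retained velocities and their dispersion (km/s).
    """
    kept = list(velocities_kms)
    for _ in range(max_iter):
        mean = statistics.fmean(kept)
        sigma = statistics.pstdev(kept)
        survivors = [v for v in kept if abs(v - mean) <= n_sigma * sigma]
        if len(survivors) == len(kept):
            break  # nothing was rejected: the sample has converged
        kept = survivors
    return kept, statistics.pstdev(kept)


# Hypothetical sample: 20 cluster members plus one background galaxy.
sample = [6000.0 + 100.0 * i for i in range(20)] + [20000.0]
members, sigma = clip_interlopers(sample)
print(len(members), round(sigma))
```

The rejection threshold is the delicate choice. A cut that is too aggressive removes genuine members and biases the dispersion low, which is the kind of error attributed to Schwarzschild's analysis above.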
Schwarzschild's estimate was too low, yet not low enough to solve the discrepancy between the mass-to-light ratios of clusters and those of individual galaxies, or galaxy pairs. Page had just found that galaxy pairs have much lower mass-to-light ratios than clusters. Of course, estimating the masses of galaxy pairs was not simpler than estimating the masses of clusters (11), as Limber pointed out. Despite the intrinsic uncertainties due to poorly controlled selection biases, Page's work strongly influenced the astronomical community, leading to widespread scepticism towards the cluster mass estimates. Interestingly, however, the nearest galaxy pair (M 31 and the Milky Way) was shown in those years to display the same missing mass problem as clusters (Kahn & Woltjer). The mass estimate of Kahn & Woltjer relied on the simple assumption that M 31 and the Milky Way are on a bound orbit. Apparently, Kahn & Woltjer were unaware of Zwicky's and Smith's results on the mass of galaxy clusters.
Around 1960, Ambartsumian [28, 29] reversed Zwicky's hypothesis on the stability of clusters. According to Ambartsumian, the large velocity dispersions of clusters indicate that they have positive total energy, i.e. they are disintegrating, and missing mass is not needed. In those years astronomers were discovering the wild world of radio galaxies, with their jets, suggestive of a mechanism for ejecting matter from galaxies. Similarly, interacting galaxies looked to many like the result of a fragmentation process rather than the result of encounters. Somewhat later, Noerdlinger invoked quasars as the source of the energy leading to cluster disruption. Ambartsumian's hypothesis became quite popular in the astronomical community because
``unless one is prepared to make wild hypotheses outside the realm of verification by direct observation [...] the 'hidden-mass' hypothesis must be ruled out'' (de Vaucouleurs)
The stability of groups and irregular clusters started to be questioned. Zwicky [517, 518] insisted on the stability of clusters, even the Cancer cluster, which Bothun et al. much later proved to be just ``an unbound collection of groups''. On the other hand, the Burbidges suggested that the Hercules cluster was just an unbound collection of groups, but in fact it is not; it is only rich in substructure. de Vaucouleurs [130, 131, 132] suggested that groups might result from random encounters of unbound field galaxies. He also provided marginal evidence that Virgo was not a single dynamical unit, but two different clusters seen in projection. His hypothesis was rejected first by Kowal, who used supernovæ to estimate the distances of Virgo galaxies, and then by Sandage & Tammann, who used a much larger sample of Virgo galaxy velocities. Finally Helou et al. closed this issue by determining the relative distances of galaxies in Virgo with the Tully-Fisher relation.
In contrast to irregular clusters and small groups, the stability of Coma was never in question, given the high degree of symmetry and regularity of this cluster. This implied that the Coma cluster contains a large quantity of unseen mass, and so ``why should not the others?'' (Burbidge & Sargent). Abell used the cluster virial mass estimates to provide an estimate of the mean density of the Universe, Ω_0 ≈ 0.1.
A possible solution to the missing mass problem was to revise the estimates of cluster velocity dispersions. Internal subclustering was known to be a potential source of error in the velocity dispersion estimates (12). However, subclustering in Coma took a long time to be recognized, and Abell pointed out that the correction for subclustering, while important, was nevertheless too small to get rid of the missing mass (Ozernoy & Reinhardt later came to the same conclusion). Godfredsen and Holmberg suggested that the cluster velocity dispersion estimates were inflated by large errors in the galaxy velocities. Their hypothesis was rejected by de Vaucouleurs & de Vaucouleurs and, later, by Kirshner, who found a similar mass discrepancy in groups, despite a considerably improved determination of galaxy velocities. Finally, Rood pointed out that an a-priori assumption of isotropic galaxy orbits could lead to an overestimate of a cluster velocity dispersion, if these orbits were instead mainly radial.
In the early 1960s Burbidge & Burbidge [83, 81] and Limber advanced the major argument in favour of the stability of galaxy clusters. If clusters have positive energy, the time-scale for their disruption is very short. Clusters must therefore be young systems. However, clusters are populated by ellipticals, which are old galaxies, as inferred from their stellar populations. This argument seemed ironclad, yet many astronomers still preferred to question the old age of ellipticals (and the models of stellar evolution), rather than accept the existence of dark matter (see, e.g., Neyman et al.)!
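The time-scale argument can be made concrete with an order-of-magnitude sketch: an unbound cluster disperses in roughly one crossing time, t ~ R/σ. The function name and the round values of 1 Mpc and 1000 km/s are assumptions chosen here for illustration, not figures taken from the papers cited.

```python
KM_PER_MPC = 3.0857e19        # kilometres in one megaparsec
SECONDS_PER_GYR = 3.15576e16  # seconds in 10^9 Julian years


def crossing_time_gyr(radius_mpc, sigma_kms):
    """Time (Gyr) for a galaxy moving at sigma_kms to cross radius_mpc."""
    return radius_mpc * KM_PER_MPC / sigma_kms / SECONDS_PER_GYR


# Illustrative (assumed) cluster: R = 1 Mpc, sigma = 1000 km/s.
print(f"{crossing_time_gyr(1.0, 1000.0):.2f} Gyr")
```

For these assumed values the crossing time is about 1 Gyr, an order of magnitude shorter than the ages of roughly 10 Gyr inferred for old stellar populations. This mismatch is the tension the argument relies on.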
After 1965 the growing evidence for dark matter in single galaxies started to change the situation. As early as 1939 Babcock had shown that the rotation curve of M 31, as measured in the optical, was still rising at the last measured point. But the observational evidence for non-Keplerian galaxy rotation curves really came from radio observations. In 1965 Seielstad & Whiteoak noted that the turn-over radii of the galaxy rotation curves were larger when measured in the radio than when measured in the optical. More 21 cm measurements accumulated, in particular through the work of Roberts and Roberts & Rots. In 1969 Vorontsov-Velyaminov argued that the 21 cm measurements indicated flat rotation curves for galaxies, and Freeman and Lewis suggested that this implied a mass-to-light ratio increasing with radius. Arp & Bertola and de Vaucouleurs argued for a high mass of the giant elliptical M 87, a suggestion later confirmed by Fabricant et al. Hunt & Sciama suggested that the brighter galaxies may have X-ray coronæ, a prediction later confirmed by Mathews. In 1973, Ostriker & Peebles argued for the need for a massive halo to stabilize spiral disks.
Progress was also being made in the dynamical modeling of galaxy systems. In 1970 Allen derived a velocity-independent distance for NGC 7320, based on the hydrogen-mass to optical-luminosity ratio. He found that this galaxy lies at a different distance from the other galaxies of Stephan's Quintet, thus reducing the mass discrepancy in this system. On the other hand, the N-body simulations of Aarseth & Saslaw indicated that the group masses were underestimated by the use of the virial theorem, thus anticipating the conclusions of Tully and of Giuricin et al. A few years later, Geller & Peebles obtained a robust statistical estimate of the masses of groups, and showed that interlopers cannot account for the whole of the mass discrepancy. Gott et al. and Turner & Sargent argued, however, that only a fraction of all groups are bound, and of these, very few are virialized.
In 1966 Aarseth's simulations had established that a cluster in equilibrium should be characterized by a Gaussian distribution of galaxy velocities. Six years later Rood et al. showed the velocity distribution of galaxies in the Coma cluster to be Gaussian, lending support to the idea that the Coma cluster was a stable dynamical system. Using a larger data-set, they confirmed Mayall's earlier suggestion that the velocity dispersion of Coma decreases with increasing radius. Previously, a similar trend in the Virgo cluster had been explained by Karachentsev as an indication of the expansion of the cluster. Rood et al. instead correctly pointed out that the decreasing velocity dispersion profile was due to the finite extent of the cluster. They fitted the profile with a model in which galaxies on isotropic orbits trace the mass distribution (see Fig. 23).
Figure 23. The Coma cluster velocity dispersion profile. A model with isotropic galaxy orbits is also plotted. From Rood et al. (1972).
Despite the observational and theoretical progress, in the early 1970s the general feeling of the astronomical community about the dark matter issue was still quite negative. As an example, here are Chincarini & Rood's conclusions from their 1971 paper on the dynamics of the Perseus cluster:
``We are not inclined to admit this possibility of adequate intergalactic mass in the cluster [...] The large 'mass' of the Perseus cluster therefore is explained with difficulty if the cluster is bound, and may suggest instability''
Another telling example is the obituary of Fritz Zwicky, written by Cecilia Payne-Gaposchkin in 1974. Many of Zwicky's major contributions to astrophysics were mentioned, but not the discovery of dark matter.
I do not know how Zwicky managed to change astronomers' minds from Heaven. It is a fact, however, that only a few months after his death, Einasto et al. and Ostriker et al. published two papers that catalyzed a paradigm change in favour of the existence of dark matter in the Universe. Einasto et al. and, independently, Ostriker et al. summarized the evidence supporting the existence of dark halos around galaxies, and argued that the mass-to-light ratio increases with scale, from galaxies to galaxy clusters. Despite some residual criticism from Burbidge, the existence of dark matter was rapidly accepted, to the point that in 1980 Jim Gunn claimed that ``observations now leaves little doubt of its presence.''
The paradigm had changed, and dark matter rapidly became a very popular subject in astronomy. Many different determinations of the masses of galaxy systems reached very similar conclusions. Peebles developed the ``cosmic virial theorem'' and performed the first analysis of the peculiar velocity field in the Local Supercluster. Davis et al. followed in his steps a few years later. Capelato et al. [92, 89, 90] developed their ``Multi-Mass Model'', which accounted for a distribution of the masses of cluster galaxies. Ozernoy & Reinhardt and, independently, Valtonen & Byrd developed a binary model for Coma, later shown to be inconsistent with the X-ray and optical data by Tanaka et al. and The & White, respectively. Bahcall & Tremaine invented the ``projected mass estimator'' as an alternative to the virial theorem. In 1982 Kent & Gunn analyzed the phase-space distribution of galaxies in Coma, and found that an isotropic mass-follows-light model was the best fit to the data, thus confirming Rood et al.'s result. On the other hand, Bailey, using the same data, showed that many other dynamical models were equally acceptable, and that the cluster mass was poorly constrained. One year later, Kent & Sargent found that radial orbits were needed to model the dynamics of another cluster, Perseus. Beers et al., following in the steps of Kahn & Woltjer, applied a two-body dynamical analysis to the double cluster Abell 98. In 1980 Lucey et al. showed Centaurus to be another example of a double cluster.
The virial mass estimates of galaxy clusters received definitive confirmation through gravitational lensing analyses (see, e.g., Fort & Mellier), just as predicted by a visionary Fritz Zwicky some 60 years earlier. New methods of cluster mass determination are reviewed by GELLER (these proceedings).
9 Hubble & Humason were interested in cluster velocity dispersions because they wanted to estimate the uncertainties in the cluster mean velocities, which were relevant to the velocity-distance relationship.
10 The virial theorem had been first used in astronomy by Poincaré in 1911.
11 The work of Page required 165 hours of observations!
12 A detailed account of the topic of subclustering is given in Section 4.1.