Annu. Rev. Astron. Astrophys. 1988. 26:
Copyright © 1988 by . All rights reserved
3.1.3. DENDROGRAM ANALYSIS
All the information on large-scale structure extractable from a homogeneous sample of N cosmic objects is contained in the N(N - 1) / 2 separation vectors of length Rij. The two-point correlation function extracts from these data two parameters, a slope (x) and amplitude (A), and the n-point correlation functions contain 2n corresponding parameters. Correlation functions do not cleanly separate internal from global structural properties of systems of objects. Complete-linkage dendrograms, which use only those Rij's that refer to global structure, were introduced to astronomy by Materne (115).
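As a concrete illustration of the raw data involved, the following minimal Python sketch enumerates the N(N - 1) / 2 separations Rij for a small set of positions. The function name `pairwise_separations` and the four Cartesian positions are hypothetical, chosen only for illustration, and Euclidean separations are assumed.

```python
import itertools
import math


def pairwise_separations(points):
    """Return a dict mapping each index pair (i, j), i < j, to the separation R_ij."""
    return {
        (i, j): math.dist(points[i], points[j])
        for i, j in itertools.combinations(range(len(points)), 2)
    }


# Hypothetical positions (arbitrary length units), for illustration only.
points = [
    (0.0, 0.0, 0.0),
    (3.0, 4.0, 0.0),
    (0.0, 0.0, 12.0),
    (3.0, 4.0, 12.0),
]

seps = pairwise_separations(points)
n = len(points)
print(len(seps), n * (n - 1) // 2)  # both counts are the number of unordered pairs
```

Any statistic discussed in this section (correlation functions, linkage distances, multiplicities) is some reduction of this table of separations.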
A complete-linkage dendrogram (e.g. Figure 7) diagrams the results of an abstract elimination tournament. A sphere with sweeping radius Rs [cf. Section 2.2.1 and Oort (128, pp. 409-11)] is centered on each object. The x-axis of the dendrogram contains N evenly spaced nodes, one for each object. As Rs increases, the spheres begin to overlap. The first contact of two expanding spheres is represented by a dendrogram hoop, composed of two nodes, two vertical branches, and a horizontal similarity bar. The branches connect the nodes to the ends of the similarity bar. The midpoint of a similarity bar forms a new node. The y-coordinate of a similarity bar specifies the similarity level of the corresponding dendrogram system. The similarity level could be defined, for example, by the logarithm of Rc, the complete-linkage distance, i.e. the largest Rij among the mutual separations of the M objects in the dendrogram system. (The single-linkage distance, i.e. the sweeping radius that resulted in the link, is the smallest Rij among the mutual separations of the M objects in the dendrogram system.) Rc is an approximation to Dc, the major diameter of the dendrogram system.
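The construction can be sketched in a few lines of Python. This is a minimal sketch only: it assumes the standard agglomerative complete-linkage rule (at each step the two systems with the smallest complete-linkage distance are joined), which may differ in detail from the procedure used in the cited work. The function name `complete_linkage`, the dictionary keys, and the five one-dimensional positions are hypothetical. Each of the N - 1 merges corresponds to one dendrogram hoop; the sketch records Rc, the single-linkage distance (here `Rs`), and the multiplicity M.

```python
import itertools
import math


def complete_linkage(points):
    """Agglomerative complete-linkage clustering (standard textbook rule).

    Returns N - 1 merges, one per dendrogram hoop, each with the member
    indices, Rc (largest separation between the two merged subsystems),
    Rs (smallest separation between them), and the multiplicity M.
    """
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in itertools.combinations(clusters, 2):
            cross = [math.dist(points[i], points[j]) for i in a for j in b]
            # Join the pair of systems whose largest cross separation is smallest.
            if best is None or max(cross) < best[0]:
                best = (max(cross), min(cross), a, b)
        rc, rs, a, b = best
        clusters.remove(a)
        clusters.remove(b)
        merged = a + b
        clusters.append(merged)
        # Under this rule rc is also the largest R_ij among all members of
        # the merged system, because merge levels never decrease.
        merges.append({"members": merged, "Rc": rc, "Rs": rs, "M": len(merged)})
    return merges


# Hypothetical one-dimensional positions, for illustration only.
points = [(0.0,), (1.0,), (5.0,), (6.5,), (20.0,)]
merges = complete_linkage(points)
for m in merges:
    # The similarity level could be taken as log10(Rc), as suggested in the text.
    print(sorted(m["members"]), m["M"], m["Rc"], math.log10(m["Rc"]))
```

The pairwise search is recomputed at every step, which is adequate for small samples; a large catalog would call for a cached distance matrix.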
A dendrogram contains information about both large-scale structure and individual structures. Parameters such as the average multiplicity <M> and the rms multiplicity of the (N - 1) dendrogram systems comprising the N objects are model sensitive. Dc and M, the diameter and multiplicity of a dendrogram system, are fundamental global properties of a system. Subsets of physical systems among the dendrogram systems can be readily extracted by adopting a criterion for physical reality based on, for example, global or local number density contrast or, similarly, overdensity (if physical superclusters are to be identified) or underdensity (if physical voids are to be identified). For an observational sample and mathematical/physical models with the same number of objects and the same selection functions, one can construct (by analogy with the definition of the two-point correlation function) the function SI of Dc, defined as SI(Dc) = [N(Dc) / N_Poisson(Dc)] - 1. This function compares the distribution of characteristic lengths of the observed dendrogram systems, N(Dc), with the corresponding distribution, N_Poisson(Dc), for a Poisson sample with the same number of objects and the same selection functions. Or one may wish to calculate an especially void-sensitive function, e.g. SI of Dv (where Dv is the characteristic length of a void), derived from the dendrogram systems whose left and right nodes (L and R) are each located either on the x-axis [characteristic lengths Dc(L) = 0 and Dc(R) = 0] or on the similarity bar of the hoop representing a previously identified dendrogram system (nonzero characteristic lengths). An estimate of Dv is provided, for example, by Dv = Dc - [Dc(L) + Dc(R)], i.e. the diameter of the merged system less the diameters of its two subsystems; for two nodes on the x-axis this reduces to Dv = Dc. The wealth of research possibilities of dendrograms in general, and complete-linkage dendrograms in particular, toward identifying physical systems in astronomy and clearly describing basic properties of large-scale structure has only begun to be tapped (115, 158, 191, 201a, and references therein).
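The two quantities defined in this paragraph can be illustrated with a short Python sketch. Several assumptions are made and should be read as such: N(Dc) is taken to be a count of dendrogram systems in bins of Dc; the void estimate is read as Dv = Dc - [Dc(L) + Dc(R)], the dimensionally consistent form, which reduces to Dv = Dc when both nodes lie on the x-axis; and the function names and all numerical values are hypothetical toy inputs, not measurements.

```python
import bisect


def histogram(values, bin_edges):
    """Count values in half-open bins [edge_k, edge_k+1)."""
    counts = [0] * (len(bin_edges) - 1)
    for v in values:
        k = bisect.bisect_right(bin_edges, v) - 1
        if 0 <= k < len(counts):
            counts[k] += 1
    return counts


def si_function(dc_observed, dc_poisson, bin_edges):
    """SI per bin: N(Dc) / N_Poisson(Dc) - 1 (None where the Poisson count is 0)."""
    observed = histogram(dc_observed, bin_edges)
    poisson = histogram(dc_poisson, bin_edges)
    return [
        (o / p - 1.0) if p > 0 else None
        for o, p in zip(observed, poisson)
    ]


def void_length(dc, dc_left, dc_right):
    """Assumed estimate Dv = Dc - [Dc(L) + Dc(R)]."""
    return dc - (dc_left + dc_right)


# Toy characteristic lengths (arbitrary units), for illustration only.
dc_observed = [1.0, 1.5, 6.5, 20.0]
dc_poisson = [2.0, 7.0, 8.0, 9.0]
bin_edges = [0.0, 5.0, 10.0, 25.0]

si = si_function(dc_observed, dc_poisson, bin_edges)
dv = void_length(6.5, 1.0, 1.5)
print(si, dv)
```

In practice the Poisson counts would be averaged over many random realizations with the same number of objects and the same selection functions, which also reduces the number of empty bins.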
Complete-linkage dendrogram analysis is one of a family of broadly related techniques. Oort (128, pp. 409-11) reviews (a) "percolation cluster analysis," which analyzes plots of the complete-linkage distance vs. the single-linkage distance, and (b) "multiplicity function analysis," which analyzes the frequency distribution of multiplicities of dendrogram systems. Bhavsar & Barrow (27) recently did a percolation cluster analysis on the CfA (93a) sample of galaxies and a corresponding sample generated by N-body simulations of gravitational clustering (2a). Dekel & West (55) find that percolation is an insensitive discriminant between models of clustering that are very different. (The reader should be cautioned not to assume a priori that this result necessarily also applies to any other given technique based on the concept of a sweeping radius.) Interesting new techniques that for some purposes may provide information complementary or preferable to that accessible through the study of complete-linkage diagrams include analyses with "minimal spanning trees" described by Barrow et al. (22b) and "taxonomical analysis" introduced to astronomy by Paturel (141a) and applied recently by Moles et al. (120; also 145, 145a).
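To make the notion of a multiplicity function at a given sweeping radius concrete, the following Python sketch groups objects by single linkage and tabulates the multiplicities of the resulting systems. It is an illustration under stated assumptions, not the procedure of any of the cited papers: two objects are taken to be linked when their separation does not exceed Rs, groups are formed with a simple union-find structure, and the function names and positions are hypothetical.

```python
import itertools
import math
from collections import Counter


def groups_at_radius(points, rs):
    """Single-linkage groups: objects are linked when their separation <= rs."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in itertools.combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= rs:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def multiplicity_function(groups):
    """Frequency distribution of multiplicities: {M: number of systems with M members}."""
    return Counter(len(g) for g in groups)


# Hypothetical one-dimensional positions, for illustration only.
points = [(0.0,), (1.0,), (5.0,), (6.5,), (20.0,)]
mf_small = multiplicity_function(groups_at_radius(points, 2.0))
mf_large = multiplicity_function(groups_at_radius(points, 4.0))
print(dict(mf_small), dict(mf_large))
```

Repeating the grouping over a range of Rs values and recording the size of the largest system at each step gives the kind of curve examined in percolation-type analyses.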