Annu. Rev. Astron. Astrophys. 1988. 26: 245-294. Copyright © 1988 by Annual Reviews. All rights reserved.
3.1.3. DENDROGRAM ANALYSIS
All the information on large-scale structure extractable from a homogeneous sample of N cosmic objects is contained in the N(N - 1) / 2 separation vectors of length Rij. The two-point correlation function extracts from these data two parameters, a slope (x) and amplitude (A), and the n-point correlation functions contain 2n corresponding parameters. Correlation functions do not cleanly separate internal from global structural properties of systems of objects. Complete-linkage dendrograms, which use only those Rij's that refer to global structure, were introduced to astronomy by Materne (115).
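As a concrete illustration of the raw data referred to above, the following minimal sketch enumerates the N(N - 1) / 2 separations Rij for a toy sample. The function name `pair_separations` and the four positions are hypothetical, and Euclidean distances in arbitrary units are assumed.

```python
import itertools
import math


def pair_separations(positions):
    """Return {(i, j): Rij} for every unordered pair i < j."""
    return {
        (i, j): math.dist(positions[i], positions[j])
        for i, j in itertools.combinations(range(len(positions)), 2)
    }


# Hypothetical toy sample of N = 4 objects (arbitrary units).
positions = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 1.0), (6.0, 8.0, 0.0)]
separations = pair_separations(positions)

# N objects give N(N - 1) / 2 separations.
n = len(positions)
print(len(separations), n * (n - 1) // 2)
```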
A complete-linkage dendrogram (e.g. Figure 7) diagrams the results of an abstract elimination tournament. A sphere with sweeping radius Rs [cf. Section 2.2.1 and Oort (128, pp. 409-11)] is centered on each object. The x-axis of the dendrogram contains N evenly spaced nodes, one for each object. As Rs increases, the spheres begin to overlap. The first contact of two expanding spheres is represented by a dendrogram hoop, composed of two nodes, two vertical branches, and a horizontal similarity bar. The branches connect the nodes to the ends of the similarity bar. The midpoint of a similarity bar forms a new node. The y-coordinate of a similarity bar specifies the similarity level of the corresponding dendrogram system. The similarity level could be defined, for example, by the logarithm of Rc, the complete-linkage distance, i.e. the largest Rij among the mutual separations of the M objects in the dendrogram system. (The single-linkage distance, i.e. the sweeping radius that resulted in the link, is the smallest Rij among the mutual separations of the M objects in the dendrogram system.) Rc is an approximation to Dc, the major diameter of the dendrogram system.
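The tournament described above can be sketched in a few lines. This is one reading of the text, not necessarily the procedure used in the cited work: systems are merged in order of first sphere contact (the smallest separation between members of two different systems, recorded as Rs), and each hoop also records the complete-linkage distance Rc (the largest mutual separation within the merged system). Standard complete-linkage clustering would instead choose each merge by minimizing the largest inter-system separation. The function name `build_dendrogram`, the dictionary keys, and the toy positions are hypothetical.

```python
import itertools
import math


def build_dendrogram(positions):
    """Merge systems in order of first contact; return one record per hoop."""
    n = len(positions)

    def dist(i, j):
        return math.dist(positions[i], positions[j])

    # Active systems: node id -> member indices. Nodes 0..N-1 lie on the x-axis.
    systems = {i: [i] for i in range(n)}
    hoops = []
    next_id = n
    while len(systems) > 1:
        # First contact: the two systems with the smallest member separation.
        (left, right), rs = min(
            (
                ((a, b), min(dist(i, j) for i in systems[a] for j in systems[b]))
                for a, b in itertools.combinations(systems, 2)
            ),
            key=lambda item: item[1],
        )
        members = systems.pop(left) + systems.pop(right)
        # Complete-linkage distance: largest mutual separation in the new system.
        rc = max(dist(i, j) for i, j in itertools.combinations(members, 2))
        hoops.append(
            {"node": next_id, "left": left, "right": right,
             "M": len(members), "Rs": rs, "Rc": rc}
        )
        systems[next_id] = members  # the similarity bar becomes a new node
        next_id += 1
    return hoops


# Hypothetical toy sample: four objects on a line (arbitrary units).
hoops = build_dendrogram([(0.0,), (1.0,), (3.0,), (7.0,)])
for hoop in hoops:
    print(hoop)
```

N objects yield N - 1 hoops. The similarity level of each hoop could then be taken as, for example, the logarithm of its `Rc` value. The brute-force search is adequate only for small samples; it is meant to expose the logic rather than to scale.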
A dendrogram contains information about both large-scale structure and individual structures. Parameters such as the average multiplicity <M> and the rms multiplicity of the (N - 1) dendrogram systems comprising the N objects are model sensitive. Dc and M, the diameter and multiplicity of a dendrogram system, are fundamental global properties of a system. Subsets of physical systems among the dendrogram systems can be readily extracted by adopting a criterion for physical reality based on, for example, global or local number density contrast or, similarly, overdensity (if physical superclusters are to be identified) or underdensity (if physical voids are to be identified). For an observational sample and mathematical/physical models with the same number of objects and same selection functions, one can construct (by analogy with the definition of the two-point correlation function) the function SI of Dc, defined as SI(Dc) = [N(Dc) / N_Poisson(Dc)] - 1, which compares (for the sample of interest) the distribution of characteristic lengths of the observed dendrogram systems with that for a Poisson sample with the same number of objects and the same selection functions. Or one may wish to calculate an especially void-sensitive function, e.g. SI of Dv (where Dv is the characteristic length of a void), derived from the dendrogram systems with left and right nodes (L and R) each located on either the x-axis [characteristic lengths Dc(L) = 0 and Dc(R) = 0] or the similarity bar of the hoop representing a previously identified dendrogram system (nonzero characteristic lengths). An estimate of Dv is provided, for example, by Dv = Dc - [Dc(L) + Dc(R)]. The wealth of research possibilities of dendrograms in general, and complete-linkage dendrograms in particular, toward identifying physical systems in astronomy and clearly describing basic properties of large-scale structure has only begun to be tapped (115, 158, 191, 201a, and references therein).
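The two statistics just described can be illustrated with a short sketch. It assumes that the Dv estimate is the diameter of the merged system minus the diameters of its left and right subsystems, and that SI is evaluated in bins of fixed width, which the text does not specify. The function names, the bin width, and the Dc values are hypothetical; in practice the Poisson counts would presumably be averaged over many random realizations.

```python
from collections import Counter


def void_length(dc, dc_left, dc_right):
    """Dv estimate: merged-system diameter minus the two subsystem diameters."""
    return dc - (dc_left + dc_right)


def binned_counts(lengths, bin_width):
    """Count characteristic lengths in bins [k * bin_width, (k + 1) * bin_width)."""
    return Counter(int(d // bin_width) for d in lengths)


def si_function(observed, poisson, bin_width):
    """SI per bin = N_obs / N_Poisson - 1, keyed by the lower bin edge.

    Bins that are empty in the Poisson sample are skipped to avoid
    division by zero.
    """
    n_obs = binned_counts(observed, bin_width)
    n_poi = binned_counts(poisson, bin_width)
    return {
        b * bin_width: n_obs.get(b, 0) / n_poi[b] - 1.0
        for b in sorted(n_poi)
    }


# Hypothetical Dc values (arbitrary units) for samples of equal size.
observed_dc = [1.0, 1.5, 3.0, 7.0, 7.5, 8.0]
poisson_dc = [1.0, 3.0, 3.5, 5.0, 7.0, 9.0]
si = si_function(observed_dc, poisson_dc, bin_width=2.0)
print(si)

# Two single objects (both nodes on the x-axis) give Dv = Dc.
print(void_length(4.0, 0.0, 0.0))
# A system of diameter 7 built from a subsystem of diameter 3 and a single object.
print(void_length(7.0, 3.0, 0.0))
```

Positive SI values mark length scales at which the observed sample has more dendrogram systems than the Poisson comparison, and negative values mark a deficit. Applying the same binning to Dv values instead of Dc values would give the void-sensitive variant.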
Complete-linkage dendrogram analysis is one of a family of broadly related techniques. Oort (128, pp. 409-11) reviews (a) "percolation cluster analysis," which analyzes plots of the complete-linkage distance vs. the single-linkage distance, and (b) "multiplicity function analysis," which analyzes the frequency distribution of multiplicities of dendrogram systems. Bhavsar & Barrow (27) recently did a percolation cluster analysis on the CfA (93a) sample of galaxies and a corresponding sample generated by N-body simulations of gravitational clustering (2a). Dekel & West (55) find that percolation is an insensitive discriminant between models of clustering that are very different. (The reader should be cautioned not to assume a priori that this result necessarily also applies to any other given technique based on the concept of a sweeping radius.) Interesting new techniques that for some purposes may provide complementary or preferred information to that accessible through study of complete-linkage diagrams include analyses with "minimal spanning trees" described by Barrow et al. (22b) and "taxonomical analysis" introduced to astronomy by Paturel (141a) and applied recently by Moles et al. (120; also 145, 145a).