Annu. Rev. Astron. Astrophys. 1988. 26: 245-294
Copyright © 1988 by . All rights reserved

**3.1.3.** DENDROGRAM ANALYSIS

All the information on large-scale structure extractable from a
homogeneous sample of *N* cosmic objects is contained in the
*N*(*N* - 1) / 2
separation vectors of length *R*_{ij}. The two-point
correlation function
extracts from these data two parameters, a slope (*x*) and amplitude
(*A*), and the *n*-point correlation functions contain 2*n*
corresponding
parameters. Correlation functions do not cleanly separate internal
from global structural properties of systems of
objects. Complete-linkage dendrograms, which use only those
*R*_{ij}'s that
refer to global structure, were introduced to astronomy by Materne
(115).
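
As a minimal illustration of the bookkeeping involved (not part of the analyses cited here), the following Python sketch enumerates the *N*(*N* - 1) / 2 separations *R*_{ij} for a small set of positions. The function name `pairwise_separations` and the coordinates are hypothetical and serve only as an example.

```python
import itertools
import math


def pairwise_separations(points):
    """Return {(i, j): R_ij} for every pair of objects with i < j."""
    return {
        (i, j): math.dist(points[i], points[j])
        for i, j in itertools.combinations(range(len(points)), 2)
    }


# Hypothetical positions in arbitrary units, for illustration only.
points = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 12.0), (1.0, 1.0, 1.0)]
separations = pairwise_separations(points)

n = len(points)
print(len(separations), "separations for", n, "objects")
```

For large *N* the quadratic growth of this list is the practical limit of such a brute-force enumeration.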

A complete-linkage dendrogram (e.g.
Figure 7) diagrams the results
of an abstract elimination tournament. A sphere with sweeping radius
*R*_{s} [cf. Section 2.2.1 and Oort (128, pp. 409-11)] is centered on
each object. The *x*-axis of the dendrogram contains *N*
evenly spaced
*nodes*, one for each object. As *R*_{s} increases,
the spheres begin to
overlap. The first contact of two expanding spheres is represented by
a *dendrogram hoop*, composed of two nodes, two vertical branches,
and a
horizontal *similarity bar*. The branches connect the nodes to the ends
of the similarity bar. The midpoint of a similarity bar forms a new
node. The *y*-coordinate of a similarity bar specifies the similarity
level of the corresponding dendrogram system. The similarity level
could be defined, for example, by the logarithm of *R*_{c}, the
complete-linkage distance, i.e. the *largest* *R*_{ij}
among the mutual
separations of the *M* objects in the dendrogram system. (The
single-linkage distance, i.e. the sweeping radius that resulted in the
link, is the *smallest* *R*_{ij} among the mutual
separations of the *M*
objects in the dendrogram system.) *R*_{c} is an
approximation to
*D*_{c}, the major diameter of the dendrogram system.
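
The construction can be sketched in Python as follows. This is a naive illustration under stated assumptions, not the procedure of any of the cited papers: it assumes the standard agglomerative scheme in which, at each step, the two systems whose union has the smallest *R*_{c} are joined by a hoop, and it records the raw *R*_{c} (the similarity level could then be taken as its logarithm). The function names, the node-numbering convention, and the coordinates are hypothetical.

```python
import itertools
import math


def diameter(points, members):
    """R_c: the largest R_ij among the mutual separations of the members."""
    return max(
        (math.dist(points[i], points[j])
         for i, j in itertools.combinations(members, 2)),
        default=0.0,
    )


def complete_linkage(points):
    """Return the N - 1 hoops as tuples (left_node, right_node, R_c, M).

    Nodes 0..N-1 are the objects on the x-axis; nodes N, N+1, ... are the
    midpoints of similarity bars, numbered in order of creation.
    """
    n = len(points)
    systems = {i: [i] for i in range(n)}  # node id -> member objects
    hoops = []
    next_node = n
    while len(systems) > 1:
        # Join the pair of current systems whose union has the smallest R_c.
        left, right = min(
            itertools.combinations(sorted(systems), 2),
            key=lambda pair: diameter(points, systems[pair[0]] + systems[pair[1]]),
        )
        members = systems.pop(left) + systems.pop(right)
        hoops.append((left, right, diameter(points, members), len(members)))
        systems[next_node] = members
        next_node += 1
    return hoops


# Hypothetical positions in arbitrary units, for illustration only.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (7.0, 0.0)]
hoops = complete_linkage(points)
for left, right, r_c, m in hoops:
    print(f"nodes {left} and {right}: R_c = {r_c}, M = {m}")
```

The repeated recomputation of diameters makes this sketch far too slow for real catalogs; it is meant only to make the definitions of nodes, hoops, *R*_{c}, and *M* concrete.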

A dendrogram contains information about both large-scale structure
and individual structures. Parameters such as
<*M*> and
σ_{M}, the average
multiplicity and rms multiplicity of the (*N* - 1) dendrogram systems
comprising the *N* objects, are model
sensitive. *D*_{c} and *M*, the diameter
and multiplicity of a dendrogram system, are fundamental global
properties of a system. Subsets of physical systems among the
dendrogram systems can be readily extracted by adopting a criterion
for physical reality based on, for example, global or local number
density contrast or, similarly, overdensity (if physical superclusters
are to be identified) or underdensity (if physical voids are to be
identified). For an observational sample and mathematical/physical
models with the same number of objects and same selection functions,
one can construct (by analogy with the definition of the two-point
correlation function) the function *SI* of *D*_{c},
defined as
*SI* = [*N*(*D*_{c}) /
*N*(*D*_{c})_{Poisson}] - 1, which compares
(for the sample of interest) the
distribution of characteristic lengths of the observed dendrogram
systems with that for a Poisson sample with the same number of objects
and the same selection functions. Or one may wish to calculate an
especially void-sensitive function, e.g. *SI* of
*D*_{v} (where *D*_{v} is the
characteristic length of a void), derived from the dendrogram systems
with left and right nodes (L and R) each located on either the *x*-axis
[characteristic lengths *D*_{c}(*L*) = 0 and
*D*_{c}(*R*) = 0] or the similarity bar
of the hoop representing a previously identified dendrogram system
(nonzero characteristic lengths). An estimate of *D*_{v} is
provided, for example, by
*D*_{v} = *D*_{c} - [*D*_{c}(*L*) + *D*_{c}(*R*)].
The wealth of research
possibilities of dendrograms in general, and complete-linkage
dendrograms in particular, toward identifying physical systems in
astronomy and clearly describing basic properties of large-scale
structure has only begun to be tapped
(115, 158, 191, 201a, and references therein).
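
The definition of *SI* can be illustrated with a short Python sketch. It assumes that *N*(*D*_{c}) is estimated as a count of dendrogram systems in bins of *D*_{c}; the binning, the function names, and the two lists of characteristic lengths are hypothetical and are not taken from any observed or simulated sample.

```python
def counts_in_bins(values, edges):
    """Count the values falling in each half-open bin [edges[k], edges[k+1])."""
    counts = [0] * (len(edges) - 1)
    for value in values:
        for k in range(len(counts)):
            if edges[k] <= value < edges[k + 1]:
                counts[k] += 1
                break  # values outside all bins are simply not counted
    return counts


def si(observed_counts, poisson_counts):
    """SI = N(D_c) / N(D_c)_Poisson - 1 per bin (None where the Poisson count is 0)."""
    return [
        (obs / poi - 1.0) if poi > 0 else None
        for obs, poi in zip(observed_counts, poisson_counts)
    ]


# Hypothetical D_c values for the N - 1 = 6 dendrogram systems of N = 7 objects.
edges = [0.0, 2.0, 4.0, 8.0]
d_c_observed = [1.0, 1.5, 3.0, 5.0, 6.0, 7.0]
d_c_poisson = [0.5, 2.5, 3.5, 4.5, 5.0, 7.5]

si_values = si(counts_in_bins(d_c_observed, edges),
               counts_in_bins(d_c_poisson, edges))
print(si_values)
```

In practice the Poisson reference would presumably be averaged over many random realizations with the same selection functions, so that the denominator is not dominated by shot noise.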

Complete-linkage dendrogram analysis is one of a family of broadly
related techniques. Oort
(128,
pp. 409-11) reviews (*a*) "percolation
cluster analysis," which analyzes plots of the complete-linkage
distance vs. the single-linkage distance, and (*b*) "multiplicity
function analysis," which analyzes the frequency distribution of
multiplicities of dendrogram systems. Bhavsar & Barrow
(27) recently
carried out a percolation cluster analysis of the CfA
(93a) sample of galaxies
and a corresponding sample generated by N-body simulations of
gravitational clustering
(2a).
Dekel & West (55)
find that percolation
is an insensitive discriminant between models of clustering that are
very different. (The reader should be cautioned *not* to assume
*a priori*
that this result necessarily also applies to any other given technique
based on the concept of a sweeping radius.) Interesting new techniques
that for some purposes may provide information complementary or
preferable to that accessible through the study of complete-linkage
diagrams include analyses with "minimal spanning trees" described by
Barrow et al. (22b)
and "taxonomical analysis" introduced to astronomy by Paturel (141a)
and applied recently by Moles et al. (120; also 145, 145a).