We thus find that astronomy and astrophysics today require a vast range of statistical capabilities. In statistical jargon, it helps for astronomers to know something about: sampling theory, survival analysis with censoring and truncation, measurement error models, multivariate classification and analysis, harmonic and autoregressive time series analysis, wavelet analysis, spatial point processes and continuous surfaces, density estimation, linear and non-linear regression, model selection, and bootstrap resampling. In some cases, astronomers need combinations of methodologies that have not yet been fully developed (Section 6 below).

Faced with such a complex of challenges, mechanical exposure to a
wider variety of techniques is a necessary but not sufficient
prerequisite for high-quality statistical analyses. Astronomers
also need to be imbued with established principles of statistical
inference; *e.g.*, hypothesis testing and parameter
estimation, nonparametric and parametric inference, Bayesian and
frequentist approaches, and the assumptions underlying and
applicability conditions for any given statistical method.

Unfortunately, we find that the majority of the thousands of
astronomical studies requiring statistical analyses use a very
limited set of classical methods. The most common tools used by
astronomers are: Fourier transforms for temporal analysis
(developed by Fourier in 1807), least squares regression and
χ^{2} goodness-of-fit (Legendre in 1805, Pearson in 1900,
Fisher in 1924), the nonparametric Kolmogorov-Smirnov 1- and
2-sample tests (Kolmogorov in 1933),
and principal components analysis for multivariate tables
(Hotelling in 1936).

Even traditional methods are often misused. Feigelson & Babu
[9]
found that astronomers use interchangeably up to 6 different fits
for bivariate linear least squares regression: ordinary least
squares (OLS), inverse regression, orthogonal regression, major
axis regression, the OLS mean, and the OLS bisector. Not only did
this lead to confusion in comparing studies (*e.g.*, in
measuring the expansion of the Universe via Hubble's constant,
*H*_{o}), but astronomers did not realize that the confidence
intervals on the fitted parameters cannot be correctly estimated
with standard analytical formulae. Similarly,
Protassov et al. [24]
found that the majority of astronomical applications of the
*F* test, or more generally the likelihood ratio test, are
inconsistent with asymptotic statistical theory.
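
To make the ambiguity among these six fits concrete, the following is a minimal sketch, not the code of Feigelson & Babu. It computes the six slopes from the sample sums of squares using the standard textbook expressions. The function name `bivariate_slopes`, the dictionary keys, and the simulated data are illustrative assumptions, and "major axis regression" is taken here to mean the reduced major axis (the geometric mean of the two OLS slopes).

```python
import numpy as np

def bivariate_slopes(x, y):
    """Return six common slope estimates for a bivariate linear fit.

    Hypothetical helper for illustration; all slopes are expressed as dy/dx.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    sxx = np.sum(dx * dx)
    syy = np.sum(dy * dy)
    sxy = np.sum(dx * dy)
    if sxy == 0.0:
        raise ValueError("x and y are uncorrelated; slopes are undefined")

    b_yx = sxy / sxx   # OLS of y on x
    b_xy = syy / sxy   # OLS of x on y, inverted to a dy/dx slope
    return {
        "ols_y_on_x": b_yx,
        "inverse_x_on_y": b_xy,
        # Minimizes perpendicular distances to the line.
        "orthogonal": ((syy - sxx) + np.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2))
                      / (2.0 * sxy),
        # Geometric mean of the two OLS slopes, with the sign of the correlation.
        "reduced_major_axis": np.sign(sxy) * np.sqrt(b_yx * b_xy),
        # Arithmetic mean of the two OLS slopes.
        "ols_mean": 0.5 * (b_yx + b_xy),
        # Line bisecting the angle between the two OLS lines.
        "ols_bisector": (b_yx * b_xy - 1.0
                         + np.sqrt((1.0 + b_yx ** 2) * (1.0 + b_xy ** 2)))
                        / (b_yx + b_xy),
    }

# Simulated data (an assumption for this demo): true slope 2 with unit scatter in y.
rng = np.random.default_rng(0)
x_demo = rng.normal(size=200)
y_demo = 2.0 * x_demo + rng.normal(size=200)
for name, slope in bivariate_slopes(x_demo, y_demo).items():
    print(f"{name:20s} {slope:.3f}")
```

On perfectly correlated data all six slopes coincide. With scatter they differ systematically, so two studies quoting "the least squares slope" need not be comparing the same quantity.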

But, while the *average* astronomical study is limited to
often-improper usage of a limited repertoire of statistical
methods, a significant *tail of outliers* is much more
sophisticated. The maximization of likelihoods, often developed
specially for the problem at hand, is perhaps the most common of
these improvements. Bayesian approaches are also becoming
increasingly in vogue.

In a number of cases, sometimes buried in technical appendices of
observational papers, astronomers independently develop
statistical methods. Some of these are rediscoveries of known
procedures; for example,
Avni et al. [2]
and others recovered
elements of survival analysis for treatments of left-censored data
arising from nondetections of known objects. Some are quite
possibly mathematically incorrect, such as various revisions to
χ^{2} for Poissonian data that assume the resulting statistic
still follows the χ^{2} distribution. On rare occasions, truly
new and correct methods have emerged; for example, astrophysicist
Lynden-Bell [19]
discovered the maximum-likelihood estimator for a
randomly truncated dataset, for which the theoretical validity was
later established by statistician
Woodroofe [31].
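
As an illustration of the survival-analysis treatment of nondetections mentioned above, here is a minimal sketch of the Kaplan-Meier product-limit estimator applied to left-censored data (upper limits). It is not the procedure of Avni et al. It uses the standard device of treating upper limits as right-censored after reversing the ordering of the data. The function name `km_left_censored` and the sample fluxes are hypothetical.

```python
import numpy as np

def km_left_censored(values, detected):
    """Kaplan-Meier estimate of P(X < x) for data with upper limits.

    values   : measured value for detections, upper limit for nondetections
    detected : True where the object was detected, False for an upper limit
    Returns the distinct detected values (ascending) and the estimated
    probability that X lies strictly below each of them.
    """
    values = np.asarray(values, dtype=float)
    detected = np.asarray(detected, dtype=bool)

    # Walk from the largest detected value downward; this is the
    # right-censored product-limit estimator applied to the reversed data.
    grid = np.unique(values[detected])[::-1]
    prob_below = np.empty(grid.size)
    prob = 1.0
    for j, xj in enumerate(grid):
        # Objects still "at risk": every detection or limit at or below xj.
        n_at_risk = np.sum(values <= xj)
        n_detected_here = np.sum(detected & (values == xj))
        prob *= 1.0 - n_detected_here / n_at_risk
        prob_below[j] = prob
    return grid[::-1], prob_below[::-1]

# Hypothetical fluxes: the second entry is an upper limit, not a measurement.
flux = [1.0, 2.0, 3.0, 4.0]
is_detected = [True, False, True, True]
x_grid, cdf_below = km_left_censored(flux, is_detected)
for xv, pv in zip(x_grid, cdf_below):
    print(f"P(X < {xv:g}) = {pv:.3f}")
```

With no upper limits the estimate reduces to the ordinary empirical distribution function. The estimator assumes that the censoring is independent of the true values, an assumption that should be checked before applying it to flux-limited surveys.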

A growing group of astronomers, recognizing the potential for new
liaisons with the accomplishments of modern statistics, have
promoted astrostatistical innovation through cross-disciplinary
meetings and collaborations. Fionn Murtagh, an applied
mathematician at Queen's University (Belfast) with long experience
in astronomy, and his colleagues have run conferences and authored
many useful monographs (*e.g.*,
[16],
[17],
[22] and
[27]). We
at Penn State have run a series of *Statistical Challenges in
Modern Astronomy* meetings with both communities in attendance
(*e.g.*, [3] and
[10]).
Alanna Connors has organized brief
statistics sessions at large astronomy meetings, and we have
organized brief astronomy sessions at large *Joint Statistical
Meetings*. We wrote a short volume called *Astrostatistics*
[3]
intended to familiarize scholars in one discipline with
relevant issues in the other discipline. Other conference series
are devoted to technical issues in astronomical data analysis but
typically have limited participation by statisticians. These
include the dozen *Astronomical Data Analysis Software and
Systems* meetings (*e.g.*,
[23]),
several Erice workshops on *Data Analysis in Astronomy* (*e.g.*,
[8]),
and the new SPIE *Astronomical Data Analysis* conferences
(*e.g.*,
[26]).

Most importantly, several powerful astrostatistical research
collaborations have emerged. At Harvard University and the
Smithsonian Astrophysical Observatory, David van Dyk worked with
scientists at the *Chandra*^{3} X-ray Center on
several issues, particularly Bayesian approaches to parametric
modeling of spectra in light of complicated instrumental effects.
At Carnegie Mellon University and the University of Pittsburgh,
the Pittsburgh Computational Astrophysics group addressed several
issues, such as developing powerful techniques for multivariate
classification of extremely large datasets and applying
nonparametric regression methods to cosmology. Both of these
groups involved academics, researchers and graduate students from
both fields working closely for several years to achieve a
critical mass of cross-disciplinary capabilities.

Other astrostatistical collaborations must be mentioned. David
Donoho (Statistics at Stanford University) works with Jeffrey
Scargle (NASA Ames Research Center) and others on applying
advanced wavelet methods to astronomical problems. James Berger
(Statistics at Duke University) has worked with astronomers
William Jefferys (University of Texas), Thomas Loredo (Cornell
University), and Alanna Connors (Eureka Inc.) on Bayesian
methodologies for astronomy. Bradley Efron (Statistics at
Stanford University) has worked with astrophysicist Vahé
Petrosian (also at Stanford) on survival methods for interpreting
γ-ray bursts. Philip Stark (Statistics at the University of
California, Berkeley) has collaborated with solar physicists in
the *GONG* program to improve analysis of oscillations of the Sun
(helioseismology). More such collaborations exist in the U.S.,
Europe and elsewhere.

^{3} The *Chandra* X-ray
Observatory is one of NASA's Great Observatories. It was launched
in 1999 with a total budget around $2 billion.