Contemporary astronomy abounds in questions of a statistical nature. In addition to the exploratory data analysis and simple heuristic (usually linear) modeling common in other fields, astronomers often interpret data in terms of complicated non-linear models based on deterministic astrophysical processes. The phenomena studied must obey the known behaviors of atomic and nuclear physics, gravitation and mechanics, thermodynamics and radiative processes, and so forth. "Modeling" data may thus involve both the selection of a model family based on an astrophysical understanding of the conditions under study, and a statistical effort to find parameters for the specified model. A wide variety of issues arise:
Does an observed group of stars (or galaxies or molecular clouds or γ-ray sources) constitute a typical and unbiased sample of the vast underlying population of similar objects?
When and how should we divide/classify these objects into 2, 3 or more subclasses?
What is the intrinsic physical relationship between two or more properties of a class of objects, especially when confounding variables or observational selection effects are present?
How do we answer such questions in the presence of observations with measurement errors and flux limits?
When is a blip in a spectrum (or image or time series) a real signal rather than a random event from Gaussian (or often Poissonian) noise or confounding variables?
How do we interpret the vast range of temporally variable objects: periodic signals from rotating stars or orbiting extrasolar planets, stochastic signals from accreting neutron stars or black holes, and explosive signals from magnetic reconnection flares or γ-ray bursts?
How do we model point distributions in 2, 3, ..., 6 dimensions representing photons in an image, galaxies in the Universe, or Galactic stars in phase space?
How do we quantify continuous structures seen in the sky, such as the cosmic microwave background and the interstellar and intergalactic gaseous media?
How do we fit astronomical spectra to highly non-linear astrophysical models based on atomic physics and radiative processes, including confidence limits on the best-fit parameters?
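The question above about whether a blip is a real signal under Gaussian or Poissonian noise can be made concrete with a small calculation. The following is a minimal sketch, not a method taken from the text: the background level `mu_background` and the observed count `k_observed` are hypothetical values chosen for illustration, and the function names are our own. It compares the exact Poisson upper-tail probability with the Gaussian approximation that uses the same mean and variance.

```python
import math


def poisson_tail(k_obs: int, mu: float) -> float:
    """Return P(N >= k_obs) for N ~ Poisson(mu)."""
    term = math.exp(-mu)  # P(N = 0)
    cdf = 0.0
    for k in range(k_obs):
        cdf += term            # accumulate P(N = k)
        term *= mu / (k + 1)   # recurrence: P(N = k+1) from P(N = k)
    return 1.0 - cdf


def gaussian_tail(k_obs: int, mu: float) -> float:
    """Upper-tail probability using a Normal(mu, sqrt(mu)) approximation."""
    z = (k_obs - mu) / math.sqrt(mu)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical example: 3 background counts expected in a bin, 10 observed.
mu_background = 3.0
k_observed = 10

p_poisson = poisson_tail(k_observed, mu_background)
p_gauss = gaussian_tail(k_observed, mu_background)

print(f"Poisson tail probability:  {p_poisson:.2e}")
print(f"Gaussian tail probability: {p_gauss:.2e}")
```

For these assumed low counts, the Gaussian approximation gives a noticeably smaller chance probability than the exact Poisson calculation, so it would overstate the significance of the blip.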
From a superficial examination of the astronomical literature (2), we can show that such questions are very common today. Of 15,000 refereed papers published annually, 1% have "statistics" or "statistical" in their title, 5% have "statistics" in their abstract, 10% treat time-variable objects, 5-10% (est.) present or analyze multivariate datasets, and 5-10% (est.) fit parametric models. Accounting for overlaps, we roughly estimate that around 3,000 distinct studies each year require non-trivial statistical methodologies. Roughly 10% of these are principally involved with statistical methods; indeed, some of these purport to develop new methods or improve on established ones.
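The arithmetic behind these figures can be checked with a short sketch. It uses only the percentages quoted above; the category labels and variable names are our own, and the "naive" totals assume the categories do not overlap at all, which is an illustrative bound rather than the procedure used to account for overlaps.

```python
papers_per_year = 15_000

# Fractions quoted in the text, as (low, high) ranges.
categories = {
    "statistics in title": (0.01, 0.01),
    "statistics in abstract": (0.05, 0.05),
    "time-variable objects": (0.10, 0.10),
    "multivariate datasets": (0.05, 0.10),
    "parametric model fits": (0.05, 0.10),
}

# Upper bounds on the number of studies if no paper fell in two categories.
low_sum = sum(low for low, _ in categories.values())
high_sum = sum(high for _, high in categories.values())
naive_low = round(papers_per_year * low_sum)
naive_high = round(papers_per_year * high_sum)

# The overlap-corrected estimate quoted in the text.
estimate = 3_000
estimate_fraction = estimate / papers_per_year

# Roughly 10% of those studies are principally about statistical methods.
methods_papers = round(0.10 * estimate)

print(f"No-overlap total: {naive_low} to {naive_high} papers")
print(f"Quoted estimate:  {estimate} papers ({estimate_fraction:.0%} of the literature)")
print(f"Methods-focused:  about {methods_papers} papers per year")
```

Under the no-overlap assumption the categories would sum to 3,900 to 5,400 papers, so the quoted figure of around 3,000 (20% of the literature) implies substantial overlap between categories, and the methods-focused subset comes to about 300 papers per year.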
2 Such bibliometric measures are easily accomplished, as the entire astronomical research literature is on-line (in full text at subscribing institutions) through the NASA-supported Astrophysics Data System, http://adsabs.harvard.edu/abstract_service.html.