**1.1. What is density estimation?**

The *probability density function* is a fundamental concept in
statistics. Consider any random quantity *X* that has probability
density function *f*. Specifying the function *f* gives a natural
description of the distribution of *X*, and allows probabilities
associated with *X* to be found from the relation

$$P(a < X < b) = \int_a^b f(x)\,dx \quad \text{for all } a < b.$$

Suppose, now, that we have a set of observed data points assumed to
be a sample from an unknown probability density function. *Density
estimation*, as discussed in this book, is the construction of an
estimate of the density function from the observed data. The two main
aims of the book are to explain how to estimate a density from a given
data set and to explore how density estimates can be used, both in
their own right and as an ingredient of other statistical procedures.

One approach to density estimation is *parametric*. Assume that the
data are drawn from one of a known parametric family of distributions,
for example the normal distribution with mean *µ* and variance *σ*². The
density *f* underlying the data could then be estimated by finding
estimates of *µ* and *σ*² from the data and substituting these estimates
into the formula for the normal density. In this book we shall not be
considering parametric estimates of this kind; the approach will be
more *nonparametric* in that less rigid assumptions will be made about
the distribution of the observed data. Although it will be assumed
that the distribution has a probability density *f*, the data will be
allowed to speak for themselves in determining the estimate of *f* more
than would be the case if *f* were constrained to fall in a given
parametric family.
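
For contrast with the nonparametric methods treated in this book, the parametric recipe described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `fit_normal_density`, the sample values, and the choice of the maximum-likelihood variance estimate (dividing by *n*) are all ours and are not taken from the text.

```python
import math
import statistics


def fit_normal_density(data):
    """Fit a normal model to data and return (mu_hat, var_hat, f_hat).

    Hypothetical helper for illustration only: mu and sigma^2 are
    estimated from the data and substituted into the normal density.
    """
    mu_hat = statistics.fmean(data)
    # Assumption: maximum-likelihood variance estimate (divides by n, not n - 1).
    var_hat = statistics.pvariance(data, mu=mu_hat)

    def f_hat(x):
        # Normal density with the estimated parameters plugged in.
        z = (x - mu_hat) ** 2 / (2.0 * var_hat)
        return math.exp(-z) / math.sqrt(2.0 * math.pi * var_hat)

    return mu_hat, var_hat, f_hat


# Made-up sample, not data from the book.
sample = [4.1, 5.3, 4.8, 6.0, 5.2, 4.6, 5.9, 5.1]
mu_hat, var_hat, f_hat = fit_normal_density(sample)
print(mu_hat, var_hat, f_hat(5.0))
```

Whatever the data look like, the estimate returned here is always a symmetric bell-shaped curve; that is the rigidity which the nonparametric approach relaxes.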

Density estimates of the kind discussed in this book were first proposed by Fix and Hodges (1951) as a way of freeing discriminant analysis from rigid distributional assumptions. Since then, density estimation and related ideas have been used in a variety of contexts, some of which, including discriminant analysis, will be discussed in the final chapter of this book. The earlier chapters are mostly concerned with the question of how density estimates are constructed. In order to give a rapid feel for the idea and scope of density estimation, one of the most important applications, to the exploration and presentation of data, will be introduced in the next section and elaborated further by additional examples throughout the book. It must be stressed, however, that the exploration and presentation of data, valuable as they are, are by no means the only settings in which density estimates can be used.