A Statistician: ``a man prepared to estimate the probability that the Sun will rise tomorrow in the light of its past performance'' (3). This definition, which I admit to taking out of context, is by an eminent statistician; it may confirm our worst suspicions about statisticians, eminent or otherwise. At the very least it should emphasize care in definition - what is tomorrow when the Sun does not rise? Let us be more careful:
``Statistics'' is a term loosely used to describe both the science and the values derived from samples. In fact, the science is
Statistical Inference: the determination of properties of the population from the sample,
while
Statistics are values (usually, but not necessarily, numerical) determined from some or all of the values of a sample.
Good statistics are those from which our conclusions concerning the population are stable from sample to sample; good samples provide good statistics, and obtaining them requires appropriate design of experiment.
The ``goodness'' of the experiment, the sample, or the statistic is indicated by the
Level of Significance: suppose we perform an experiment to distinguish between two rival hypotheses, the null hypothesis (H_{0}; ``failure'', no result, no detection, no correlation) and its alternative (H_{1}; ``success'', etc.). Before the experiment we make ourselves very familiar with ``failure'' by determining, assuming H_{0} to be true, the set of all possible values of the statistic under test, the statistic we have chosen to use in deciding between H_{0} and H_{1}. Furthermore, suppose that when we do the experiment we obtain a value of the statistic which is ``unusual'' in comparison with this set, so unusual that, say, only 1 per cent of all values computed under the H_{0} hypothesis are so extreme. We can then reject H_{0} in favour of H_{1} at the 1 per cent level of significance. The level of significance is thus the probability of rejecting H_{0} when it is, in fact, true.
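The procedure above can be sketched numerically: build the distribution of the statistic under H_{0}, then ask how extreme the observed value is. This is a minimal illustration only; the coin-tossing experiment, the observed count of 65 and the trial numbers are invented for the example, not taken from the text.

```python
import random

# Illustrative sketch of a level-of-significance calculation by simulation.
# H0: a coin is fair (p = 0.5). Statistic: number of heads in 100 tosses.
# The observed value 65 is an invented example.
random.seed(1)

N_TRIALS = 20000
TOSSES = 100
observed = 65  # heads actually seen in the (hypothetical) experiment

# The set of values of the statistic under H0, built by simulation
null_counts = [sum(random.random() < 0.5 for _ in range(TOSSES))
               for _ in range(N_TRIALS)]

# One-sided level of significance: the probability, under H0, of a value
# at least as extreme as the one observed.
p_value = sum(c >= observed for c in null_counts) / N_TRIALS
print(f"P(statistic >= {observed} | H0) ~ {p_value:.4f}")
if p_value < 0.01:
    print("Reject H0 in favour of H1 at the 1 per cent level")
```

Here fewer than 1 per cent of the simulated H_{0} values reach 65 heads, so H_{0} is rejected at the 1 per cent level.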
Now consider N values of x_{i} where i = 1, 2 . . . N and x may have a continuous or a discrete distribution. The following definitions are general:
1. Location measures
Mean: the arithmetic mean, x̄ = (1/N) Σ_{i=1}^{N} x_{i}.
Median: arrange x_{i} according to size; renumber. Then
x_{med} = x_{j} where j = N/2 + 0.5, N odd,
        = 1/2 (x_{j} + x_{j+1}) where j = N/2, N even.
Mode: x_{mode} is the value of x_{i} occurring most frequently.
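The three location measures can be computed directly from their definitions; a minimal sketch (the sample values are invented for illustration):

```python
from collections import Counter

# An invented sample of N = 7 values
x = [3.1, 2.7, 3.1, 4.0, 2.9, 3.1, 3.6]

# Arithmetic mean
mean = sum(x) / len(x)

# Median: arrange according to size, take the middle value (N odd)
# or the average of the two middle values (N even)
s = sorted(x)
N = len(s)
if N % 2 == 1:
    median = s[N // 2]                       # j = N/2 + 0.5 in 1-based numbering
else:
    median = 0.5 * (s[N // 2 - 1] + s[N // 2])

# Mode: the value occurring most frequently
mode = Counter(x).most_common(1)[0][0]

print(mean, median, mode)
```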
2. Dispersion measures
Variance: σ^{2} = (1/N) Σ_{i=1}^{N} (x_{i} - x̄)^{2}, the mean squared deviation from the mean.
Standard deviation: σ, the positive square root of the variance.
3. Moments
µ_{r} = (1/N) Σ_{i=1}^{N} (x_{i} - x̄)^{r}, r = 0, 1, 2, . . .
(Moments may be taken about any value of x; those about the arithmetic mean as above are termed central moments.) Note that µ_{2} = σ^{2}; this and the next few moments characterize a probability distribution (defined below). The first two moments are useless, since µ_{0} ≡ 1 and µ_{1} ≡ 0.
Skewness: β_{1} = µ_{3}^{2} / µ_{2}^{3} indicates deviation from symmetry; β_{1} = 0 for symmetry about µ.
Kurtosis: β_{2} = µ_{4} / µ_{2}^{2} indicates degree of peakiness; β_{2} = 3 for the Normal distribution.
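A quick numerical check of these definitions: for a Normal sample the central moments should give β_{1} near 0 and β_{2} near 3. A sketch under the assumption that the sample is drawn from a unit Normal distribution (the sample itself is simulated, not from the text):

```python
import random

# Simulate a large sample from a Normal (Gaussian) distribution
random.seed(2)
x = [random.gauss(0.0, 1.0) for _ in range(100_000)]

mean = sum(x) / len(x)

def central_moment(data, r, m):
    """r-th central moment: mu_r = (1/N) * sum of (x_i - m)^r."""
    return sum((xi - m) ** r for xi in data) / len(data)

mu2 = central_moment(x, 2, mean)   # = sigma^2, the variance
mu3 = central_moment(x, 3, mean)
mu4 = central_moment(x, 4, mean)

beta1 = mu3 ** 2 / mu2 ** 3        # skewness; 0 for a symmetric distribution
beta2 = mu4 / mu2 ** 2             # kurtosis; 3 for the Normal distribution

print(f"beta1 ~ {beta1:.4f}, beta2 ~ {beta2:.3f}")
```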
Finally, consider probability distributions: if x is a continuous random variable, then f(x) is its probability density function if it meets these conditions:
f(x) ≥ 0 for all x,
and
∫_{-∞}^{+∞} f(x) dx = 1.
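The two defining conditions of a density - non-negativity and unit total integral - can be verified numerically for any candidate f(x). A sketch using the Normal density with invented parameters µ and σ, and simple trapezoidal integration over a finite range as an approximation to the full integral:

```python
import math

mu, sigma = 1.5, 0.7  # invented parameters for illustration

def f(x):
    """Normal probability density function."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Trapezoidal integration over mu +/- 10 sigma (the tails beyond are negligible)
lo, hi, n = mu - 10 * sigma, mu + 10 * sigma, 20_000
h = (hi - lo) / n
xs = [lo + i * h for i in range(n + 1)]

assert all(f(x) >= 0 for x in xs)                        # condition (i)
integral = h * (sum(f(x) for x in xs) - 0.5 * (f(lo) + f(hi)))
print(f"integral of f over all x ~ {integral:.6f}")      # condition (ii): ~ 1
```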
Distribution | Density function | Mean | Variance | Raison d'Être
Uniform | f(x) = 1/(b - a), a ≤ x ≤ b | (a + b)/2 | (b - a)^{2}/12 | In the study of rounding errors, and as a tool in theoretical studies of other continuous distributions.
Binomial | f(x; n, p) = [n!/(x!(n - x)!)] p^{x} q^{n - x} | np | npq | x is the number of ``successes'' in an experiment with two possible outcomes, one (``success'') of probability p, and the other (``failure'') of probability q = 1 - p. Becomes a Normal distribution as n → ∞.
Poisson | f(x; µ) = µ^{x} e^{-µ}/x! | µ | µ | The limit of the Binomial distribution as p << 1, setting µ ≡ np. It is the ``count-rate'' distribution, e.g. take a star from which an average of µ photons are received per Δt (out of a total of n emitted; p << 1); the probability of receiving x photons in a Δt is f(x; µ). Tends to the Normal distribution as µ → ∞.
Normal (Gaussian) | f(x; µ, σ) = [1/(σ√(2π))] exp[-(x - µ)^{2}/(2σ^{2})] | µ | σ^{2} | The essential distribution; see text. The Central Limit Theorem ensures that the majority of ``scattered things'' are dispersed according to f(x; µ, σ).
Chi-square | f(χ^{2}; ν) = [2^{ν/2} Γ(ν/2)]^{-1} (χ^{2})^{(ν/2) - 1} e^{-χ^{2}/2} | ν | 2ν | Vital in the comparison of samples and in model testing; characterizes the dispersion of observed samples from the expected dispersion, because if the x_{i} are a sample of n variables Normally and independently distributed with means µ_{i} and variances σ_{i}^{2}, then χ^{2} = Σ_{i=1}^{n} (x_{i} - µ_{i})^{2}/σ_{i}^{2} obeys f(χ^{2}; ν) with ν = n. Invariably tabulated and used in integral form. Tends to the Normal distribution as ν → ∞.
Student t | f(t; ν) = [Γ((ν + 1)/2)/(√(νπ) Γ(ν/2))] (1 + t^{2}/ν)^{-(ν + 1)/2} | 0 | ν/(ν - 2) (for ν > 2) | For comparison of means of Normally-distributed populations; if n x_{i}s are taken from a Normal population (µ, σ), and if x̄_{s} and σ_{s} are found as in text, then t = (x̄_{s} - µ)√(n - 1)/σ_{s} is distributed as f(t; ν), where the ``degrees of freedom'' ν = n - 1. The statistic t can also be formulated to compare means for samples from Normal populations with the same σ, different µ (4). Tends to Normal as ν → ∞.
F | f(F; ν_{1}, ν_{2}) ∝ F^{(ν_{1}/2) - 1} (1 + ν_{1}F/ν_{2})^{-(ν_{1} + ν_{2})/2} | ν_{2}/(ν_{2} - 2) (for ν_{2} > 2) | 2ν_{2}^{2}(ν_{1} + ν_{2} - 2)/[ν_{1}(ν_{2} - 2)^{2}(ν_{2} - 4)] (for ν_{2} > 4) | For comparison of two variances, or of more than two means; if two statistics (χ_{1}^{2} and χ_{2}^{2}, with ν_{1} and ν_{2} degrees of freedom) each follow the Chi-square distribution, then F = (χ_{1}^{2}/ν_{1})/(χ_{2}^{2}/ν_{2}) is distributed as f(F; ν_{1}, ν_{2}). Care required in application; see (4), (9).
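The table's Poisson entry - the Binomial with p << 1 tends to a Poisson with mean = variance = µ ≡ np - can be checked by simulation. A sketch with invented values of n, p and the trial count:

```python
import random

# Simulate a Binomial(n, p) count many times with p << 1, and check that
# the sample mean and variance both approach mu = n*p, as for a Poisson.
random.seed(3)
n, p = 1000, 0.005          # mu = n*p = 5; p << 1, so near the Poisson limit
trials = 4000

counts = [sum(random.random() < p for _ in range(n)) for _ in range(trials)]

mean = sum(counts) / trials
var = sum((c - mean) ** 2 for c in counts) / trials

print(f"mean ~ {mean:.2f}, variance ~ {var:.2f}; both should be near mu = {n * p}")
```

For the true Poisson limit the two values coincide exactly; here the Binomial variance npq = 4.975 is already indistinguishable from µ = 5 at this sample size.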
Probability densities and distribution functions may be similarly defined for sets of discrete values x = x_{1}, x_{2}. . . .x_{n}, and for multivariate distributions. The better-known (continuous) functions appear in Table 1, together with location and dispersion measures - note that the previous definitions for these may be written in integral form for continuous distributions. The table includes some indication of how and/or where each distribution arises, and for most of them, I avoid further discussion. But there is one whose rôle is so fundamental that it cannot be treated in such a cavalier manner. This follows in the next section.