Notes on Statistics for Physicists, Revised

16. GOODNESS OF FIT, THE χ² DISTRIBUTION

The numerical value of the likelihood function at ℒ(α*) can, in principle, be used as a check on whether one is using the correct type of function for f(α; x). If one is using the wrong f, the likelihood function will be lower in height and of greater width. In principle, one can calculate, using direct probability, the distribution of ℒ(α*) assuming a particular true f(α₀; x). Then the probability of getting an ℒ(α*) smaller than the value observed would be a useful indication of whether the wrong type of function for f had been used. If for a particular experiment one got the answer that there was one chance in 10⁴ of getting such a low value of ℒ(α*), one would seriously question either the experiment or the function f(α; x) that was used.

In practice, the determination of the distribution of ℒ(α*) is usually an impossibly difficult numerical integration in N-dimensional space. However, in the special case of the least-squares problem, the integration limits turn out to be the radius vector in p-dimensional space. In this case we use the distribution of S(α*) rather than of ℒ(α*). We shall first consider the distribution of S(α₀). According to Eqs. (23) and (24) the probability element is

$$d^p P \propto e^{-S/2}\, d^p y_i$$

Note that S = ρ², where ρ is the magnitude of the radius vector in p-dimensional space. The volume of a p-dimensional sphere is U ∝ ρ^p. The volume element in this space is then

$$d^p y_i \propto \rho^{p-1}\, d\rho \propto S^{(p-1)/2}\, S^{-1/2}\, dS$$

Thus

$$dP(S) \propto S^{p/2 - 1}\, e^{-S/2}\, dS$$

The normalization is obtained by integrating from S = 0 to S = ∞.

$$dP(S_0) = \frac{1}{2^{p/2}\,\Gamma(p/2)}\, S_0^{\,p/2 - 1}\, e^{-S_0/2}\, dS_0 \qquad (30a)$$

where S₀ ≡ S(α₀).
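The normalization integral can be written out explicitly. The following is a short sketch, assuming the unnormalized form S^{p/2 − 1} e^{−S/2} dS and using the substitution t = S/2 (the variable t is introduced here only for the integration and does not appear elsewhere in these notes):

$$\int_0^\infty S^{p/2 - 1}\, e^{-S/2}\, dS = 2^{p/2} \int_0^\infty t^{\,p/2 - 1}\, e^{-t}\, dt = 2^{p/2}\, \Gamma(p/2)$$

The normalized distribution therefore carries the constant factor 1/(2^{p/2} Γ(p/2)).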

This distribution is the well-known χ² distribution with p degrees of freedom. χ² tables of

$$P(\chi^2) = \int_{\chi^2}^{\infty} dP(S)$$

for several degrees of freedom are commonly available; see Appendix V for plots of the above integral.
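Where tables or plots are not at hand, the χ² density and its upper-tail integral can be evaluated numerically. The following is a minimal sketch using only the Python standard library. The function names chi2_pdf and chi2_tail are illustrative and not part of these notes, and the tail is computed with the standard power series for the incomplete gamma function, which is one convenient choice among several.

```python
import math


def chi2_pdf(s, dof):
    """Chi-square density with `dof` degrees of freedom, evaluated at s."""
    if s <= 0:
        return 0.0
    half = dof / 2.0
    # Work with logarithms so that large dof do not overflow the gamma function.
    log_pdf = (half - 1.0) * math.log(s) - s / 2.0 \
        - half * math.log(2.0) - math.lgamma(half)
    return math.exp(log_pdf)


def chi2_tail(s, dof, tol=1e-12, max_terms=10000):
    """Probability of a chi-square value larger than s (upper-tail integral).

    Uses the series for the regularized lower incomplete gamma function
    P(a, x) with a = dof/2 and x = s/2, then returns 1 - P(a, x).
    For extremely small tail probabilities the subtraction loses precision.
    """
    if s <= 0:
        return 1.0
    a = dof / 2.0
    x = s / 2.0
    term = 1.0 / a          # n = 0 term of sum x^n / (a (a+1) ... (a+n))
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < tol * total:
            break
    log_prefactor = a * math.log(x) - x - math.lgamma(a)
    lower = total * math.exp(log_prefactor)
    return max(0.0, 1.0 - lower)


# One degree of freedom: the tail reduces to erfc(sqrt(s/2)).
print(chi2_tail(3.2, 1), math.erfc(math.sqrt(3.2 / 2.0)))
```

For one degree of freedom the result should agree with erfc(√(S/2)), and for two degrees of freedom with exp(−S/2); both serve as quick checks on the series.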

From the definition of S (Eq. (24)) it is obvious that S̄₀ = p. One can show, using Eq. (29), that the mean-square deviation of S₀ from its mean, i.e. the average of (S₀ − S̄₀)², equals 2p. Hence, one should be suspicious if the experimental result gives an S-value much greater than p + √(2p).

Usually α₀ is not known. In such a case one is interested in the distribution of

$$S^* \equiv S(\alpha^*)$$

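Fortunately, this distribution is also quite simple. It is merely the χ² distribution of (p − M) degrees of freedom, where p is the number of experimental points and M is the number of parameters solved for. Thus we have

$$dP(S^*) = \chi^2 \text{ distribution for } (p - M) \text{ degrees of freedom}, \qquad \overline{S^*} = p - M, \quad \Delta S^* = \sqrt{2(p - M)} \qquad (31)$$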

Since the derivation of Eq. (31) is somewhat lengthy, it is given in Appendix II.
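The loss of one degree of freedom per fitted parameter can be checked by simulation. The sketch below is an illustration under assumed conditions, not the derivation of Appendix II: it draws p Gaussian measurements of a single quantity (so M = 1), fits the weighted mean, and compares S evaluated at the true value with S evaluated at the fitted value. The true value, the measurement errors, the seed, and the function name simulate are all arbitrary choices made for this example.

```python
import random
import statistics


def simulate(p, n_trials, true_value=10.0, seed=1):
    """Return lists of S(true value) and S(fitted value) over many trials."""
    rng = random.Random(seed)
    sigmas = [1.0 + 0.5 * i for i in range(p)]      # assumed unequal errors
    weights = [1.0 / s ** 2 for s in sigmas]
    s0_values, s_star_values = [], []
    for _ in range(n_trials):
        y = [rng.gauss(true_value, s) for s in sigmas]
        # Least-squares solution for one constant: the weighted mean.
        fitted = sum(w * yi for w, yi in zip(weights, y)) / sum(weights)
        s0_values.append(
            sum(((yi - true_value) / s) ** 2 for yi, s in zip(y, sigmas)))
        s_star_values.append(
            sum(((yi - fitted) / s) ** 2 for yi, s in zip(y, sigmas)))
    return s0_values, s_star_values


s0, s_star = simulate(p=5, n_trials=20000)
# Expect mean near p and variance near 2p for S at the true value ...
print(statistics.fmean(s0), statistics.variance(s0))
# ... and mean near p - 1, variance near 2(p - 1) for S at the fitted value.
print(statistics.fmean(s_star), statistics.variance(s_star))
```

With p = 5 the sample means should come out close to 5 and 4, and the sample variances close to 10 and 8, up to statistical fluctuations.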

Example 8

Determine the χ² probability of the solution to Example 6.

According to the χ² table for one degree of freedom the probability of getting S* > 0.674 is 0.41. Thus the experimental data are quite consistent with the theoretical shape assumed in Example 6.

Example 9 Combining Experiments

Two different laboratories have measured the lifetime of the K₁⁰ to be (1.00 ± 0.01) × 10⁻¹⁰ sec and (1.04 ± 0.02) × 10⁻¹⁰ sec respectively. Are these results really inconsistent?

According to Eq. (6) the weighted mean is α* = 1.008 × 10⁻¹⁰ sec. (This is also the least-squares solution for τ_K⁰.)

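Thus

$$S^* = \left(\frac{1.00 - 1.008}{0.01}\right)^2 + \left(\frac{1.04 - 1.008}{0.02}\right)^2 = 3.2$$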

According to the χ² table for one degree of freedom, the probability of getting S* > 3.2 is 0.074. Therefore, according to statistics, two measurements of the same quantity should be at least this far apart 7.4% of the time.
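The arithmetic of this example can be reproduced in a few lines. This is a sketch only: the variable names are illustrative, the lifetimes are expressed in units of 10⁻¹⁰ sec, and the table lookup is replaced by the closed form erfc(√(S/2)), which holds for one degree of freedom.

```python
import math

# Measured K1-zero lifetimes and their errors, in units of 1e-10 sec.
values = [1.00, 1.04]
errors = [0.01, 0.02]

# Weighted mean with weights 1/sigma^2.
weights = [1.0 / e ** 2 for e in errors]
mean = sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Least-squares sum at the fitted value; p - M = 2 - 1 = 1 degree of freedom.
s_star = sum(((v - mean) / e) ** 2 for v, e in zip(values, errors))

# Upper-tail chi-square probability for one degree of freedom.
prob = math.erfc(math.sqrt(s_star / 2.0))

print(f"weighted mean = {mean:.3f}e-10 sec, S* = {s_star:.2f}, P = {prob:.3f}")
```

The printed values should match the weighted mean of 1.008 × 10⁻¹⁰ sec, S* = 3.2, and the probability of about 0.074 quoted above.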
