When M parameters are to be determined from a single experiment containing N events, the error formulas of the preceding section are applicable only in the rare case in which the errors are uncorrelated. Errors are uncorrelated only when $\langle(\alpha_i - \bar{\alpha}_i)(\alpha_j - \bar{\alpha}_j)\rangle = 0$ for all cases with $i \neq j$. For the general case we Taylor-expand $w(\alpha)$ about $\alpha^*$:

$$w(\alpha) = w(\alpha^*) + \sum_i \left(\frac{\partial w}{\partial \alpha_i}\right)_{\alpha^*} \beta_i \;-\; \frac{1}{2}\sum_{i,j} H_{ij}\,\beta_i \beta_j \;+\; \cdots,$$

where $\beta_i \equiv \alpha_i - \alpha_i^*$ and

$$H_{ij} \equiv -\left(\frac{\partial^2 w}{\partial \alpha_i\,\partial \alpha_j}\right)_{\alpha^*}. \qquad (9)$$
The second term of the expansion vanishes because $\partial w/\partial \alpha_i = 0$ are the equations that determine $\alpha^*$.
Neglecting the higher-order terms, we have

$$\mathcal{L}(\alpha) = \mathcal{L}(\alpha^*)\,\exp\!\Bigl(-\tfrac{1}{2}\sum_{i,j} H_{ij}\,\beta_i\beta_j\Bigr)$$

(an M-dimensional Gaussian surface). As before, our error formulas depend on the approximation that $\mathcal{L}(\alpha)$ is Gaussian-like in the region $\alpha_i \approx \alpha_i^*$. As mentioned in Section 4, if the statistics are so poor that this is a poor approximation, then one should merely present a plot of $\mathcal{L}(\alpha)$ (see Appendix IV).
According to Eq. (9), H is a symmetric matrix. Let U be the unitary matrix that diagonalizes H:

$$U H U^{-1} = h, \qquad \text{where } h_{ij} = h_i\,\delta_{ij}. \qquad (10)$$
Let $\beta \equiv \alpha - \alpha^*$ and $\gamma \equiv U\beta$. The element of probability in $\beta$-space is

$$dP \propto \exp\!\Bigl(-\tfrac{1}{2}\,\beta^{T} H \beta\Bigr)\, d^M\beta.$$

Since $|U| = 1$ is the Jacobian relating the volume elements $d^M\gamma$ and $d^M\beta$, we have

$$dP \propto \exp\!\Bigl(-\tfrac{1}{2}\,\gamma^{T} h \gamma\Bigr)\, d^M\gamma = \prod_{i=1}^{M} \exp\!\Bigl(-\tfrac{1}{2}\, h_i \gamma_i^2\Bigr)\, d\gamma_i.$$

Now that the general M-dimensional Gaussian surface has been put in the form of the product of independent one-dimensional Gaussians, we have

$$\langle \gamma_i \gamma_j \rangle = \frac{\delta_{ij}}{h_i}, \qquad \text{i.e.,} \quad \langle \gamma \gamma^{T} \rangle = h^{-1}.$$

Then

$$\langle \beta_i \beta_j \rangle = \langle (U^{-1}\gamma)_i\,(U^{-1}\gamma)_j \rangle = (U^{-1} h^{-1} U)_{ij}.$$
According to Eq. (10), $H = U^{-1} h U$, so that the final result is

$$\langle \beta_i \beta_j \rangle = (H^{-1})_{ij}. \qquad \text{Maximum-likelihood errors, M parameters} \qquad (11)$$
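Equation (11) can be checked numerically. The following sketch (with a hypothetical 2×2 matrix H, not taken from the text) draws the independent Gaussians $\gamma_i$ of the derivation above, rotates back to $\beta$, and compares the sample covariance $\langle\beta_i\beta_j\rangle$ with $(H^{-1})_{ij}$:

```python
import numpy as np

# Hypothetical 2x2 Hessian H for illustration only.
H = np.array([[5.0, 1.0],
              [1.0, 2.0]])

# Diagonalize H as in Eq. (10): columns of U are eigenvectors,
# so H = U diag(h) U^T with eigenvalues h_i.
h, U = np.linalg.eigh(H)

# Draw the independent one-dimensional Gaussians gamma_i
# (each with variance 1/h_i), then rotate back to beta.
rng = np.random.default_rng(2)
gamma = rng.normal(size=(200_000, 2)) / np.sqrt(h)
beta = gamma @ U.T

# Sample covariance <beta_i beta_j> should approximate (H^-1)_ij.
cov = beta.T @ beta / len(beta)
```

With 200 000 samples the agreement is at the percent level, as expected from sampling statistics.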
(A rule for calculating the inverse matrix is $(H^{-1})_{ij} = (\text{cofactor of } H_{ji}) / \det H$.)
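The cofactor rule can be spelled out in code. The sketch below uses an arbitrary symmetric matrix standing in for H (the values are illustrative, not from the text):

```python
import numpy as np

# Hypothetical symmetric matrix standing in for the Hessian H.
H = np.array([[4.0, 1.0],
              [1.0, 3.0]])

def inverse_by_cofactors(M):
    """Invert M via (M^-1)_ij = (cofactor of M_ji) / det M."""
    n = M.shape[0]
    inv = np.empty_like(M)
    for i in range(n):
        for j in range(n):
            # cofactor of M_ji: delete row j and column i, then take
            # the signed determinant of the remaining minor
            minor = np.delete(np.delete(M, j, axis=0), i, axis=1)
            inv[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return inv / np.linalg.det(M)

V = inverse_by_cofactors(H)  # the error matrix H^-1
```

In practice one would of course call a library routine such as `np.linalg.inv`; the explicit loop only makes the cofactor prescription concrete.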
If we use the alternate notation V for the error matrix $H^{-1}$, then wherever H appears it must be replaced with $V^{-1}$; i.e., the likelihood function is

$$\mathcal{L}(\alpha) = \mathcal{L}(\alpha^*)\,\exp\!\Bigl(-\tfrac{1}{2}\,\beta^{T} V^{-1} \beta\Bigr). \qquad (11a)$$
Example 2
Assume that the ranges of monoenergetic particles are Gaussian-distributed with mean range $\alpha_1$ and straggling coefficient $\alpha_2$ (the standard deviation). N particles having ranges $x_1, \ldots, x_N$ are observed. Find $\alpha_1^*$, $\alpha_2^*$, and their errors. Then

$$w(\alpha_1, \alpha_2) = -\sum_{i=1}^{N} \frac{(x_i - \alpha_1)^2}{2\alpha_2^2} \;-\; N \ln \alpha_2 \;-\; \frac{N}{2}\ln 2\pi,$$

$$\frac{\partial w}{\partial \alpha_1} = \sum_{i=1}^{N} \frac{x_i - \alpha_1}{\alpha_2^2}, \qquad \frac{\partial w}{\partial \alpha_2} = \sum_{i=1}^{N} \left[\frac{(x_i - \alpha_1)^2}{\alpha_2^3} - \frac{1}{\alpha_2}\right].$$

The maximum-likelihood solution is obtained by setting the above two equations equal to zero:

$$\alpha_1^* = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad \alpha_2^* = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \alpha_1^*)^2}.$$
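The maximum-likelihood solution (the sample mean, and the rms deviation with a 1/N denominator) can be checked numerically. A short sketch with simulated "ranges" (the true values 5 and 2 are hypothetical) confirms that both likelihood equations vanish at the solution:

```python
import numpy as np

# Simulated ranges with hypothetical true values (mean 5, sigma 2).
rng = np.random.default_rng(0)
N = 1000
x = rng.normal(5.0, 2.0, size=N)

# Closed-form maximum-likelihood estimates (note the 1/N denominator):
a1 = x.mean()
a2 = np.sqrt(((x - a1) ** 2).mean())

# Both likelihood equations dw/d(alpha) should vanish at (a1, a2):
dw_da1 = np.sum((x - a1) / a2**2)
dw_da2 = np.sum((x - a1) ** 2 / a2**3 - 1.0 / a2)
```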
The reader may remember a standard-deviation formula in which N is replaced by (N − 1):

$$\alpha_2 = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N} (x_i - \alpha_1^*)^2}.$$

This is because in this case the most probable value, $\alpha_2^*$, and the mean, $\langle \alpha_2 \rangle$, do not occur at the same place. Mean values of such quantities are studied in Section 16. The matrix H is obtained by evaluating the following quantities at $\alpha_1^*$ and $\alpha_2^*$:

$$-\frac{\partial^2 w}{\partial \alpha_1^2} = \frac{N}{\alpha_2^2}, \qquad -\frac{\partial^2 w}{\partial \alpha_2^2} = \sum_{i=1}^{N}\left[\frac{3(x_i - \alpha_1)^2}{\alpha_2^4} - \frac{1}{\alpha_2^2}\right] \to \frac{2N}{\alpha_2^{*2}}, \qquad -\frac{\partial^2 w}{\partial \alpha_1\,\partial \alpha_2} = \sum_{i=1}^{N} \frac{2(x_i - \alpha_1)}{\alpha_2^3} \to 0,$$

so that

$$H = \begin{pmatrix} N/\alpha_2^{*2} & 0 \\ 0 & 2N/\alpha_2^{*2} \end{pmatrix}, \qquad H^{-1} = \begin{pmatrix} \alpha_2^{*2}/N & 0 \\ 0 & \alpha_2^{*2}/2N \end{pmatrix}.$$
According to Eq. (11), the errors on $\alpha_1$ and $\alpha_2$ are the square roots of the diagonal elements of the error matrix $H^{-1}$:

$$\Delta\alpha_1 = \frac{\alpha_2^*}{\sqrt{N}}, \qquad \Delta\alpha_2 = \frac{\alpha_2^*}{\sqrt{2N}}$$

($\Delta\alpha_2$ is sometimes called the error of the error).
We note that the error of the mean is $\sigma/\sqrt{N}$, where $\sigma = \alpha_2$ is the standard deviation. The error on the determination of $\sigma$ is $\sigma/\sqrt{2N}$.
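Both error formulas can be checked by Monte Carlo. The sketch below (with hypothetical σ and N) repeats the "experiment" many times and compares the scatter of the estimates with $\sigma/\sqrt{N}$ and $\sigma/\sqrt{2N}$:

```python
import numpy as np

# Monte Carlo check of the error formulas (hypothetical sigma, N).
rng = np.random.default_rng(1)
sigma, N, trials = 2.0, 100, 20_000
x = rng.normal(0.0, sigma, size=(trials, N))

a1 = x.mean(axis=1)                                  # ML mean, each trial
a2 = np.sqrt(((x - a1[:, None]) ** 2).mean(axis=1))  # ML sigma, each trial

# Observed scatter of the estimates vs. the predicted errors:
err_mean = a1.std()    # should be close to sigma / sqrt(N)
err_sigma = a2.std()   # should be close to sigma / sqrt(2N)
```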
Correlated Errors
The matrix $V_{ij} \equiv \langle \beta_i \beta_j \rangle$ is defined as the error matrix (also called the covariance matrix of $\alpha$). In Eq. (11) we have shown that $V = H^{-1}$, where $H_{ij} = -\partial^2 w / (\partial\alpha_i\,\partial\alpha_j)$. The diagonal elements of V are the variances of the $\alpha$'s. If all the off-diagonal elements are zero, the errors in $\alpha$ are uncorrelated, as in Example 2. In this case contours of constant w plotted in $(\alpha_1, \alpha_2)$ space would be ellipses as shown in Fig. 2a. The errors in $\alpha_1$ and $\alpha_2$ would be the semi-axes of the contour ellipse where w has dropped by 1/2 unit from its maximum-likelihood value. Only in the case of uncorrelated errors is the rms error $\Delta\alpha_j = (H_{jj})^{-1/2}$, and only then is there no need to perform a matrix inversion.
In the more common situation there will be one or more off-diagonal elements in H, and the errors are correlated (V has off-diagonal elements). In this case (Fig. 2b) the contour ellipses are inclined with respect to the $\alpha_1$, $\alpha_2$ axes. The rms spread of $\alpha_1$ is still $\Delta\alpha_1 = \sqrt{V_{11}}$, but now it is the extreme limit of the ellipse projected on the $\alpha_1$-axis. (The ellipse "half-width" along that axis is $(H_{11})^{-1/2}$, which is smaller.) In cases where Eq. (11) cannot be evaluated analytically, the $\alpha^*$'s can be found numerically, and the errors in $\alpha$ can be found by plotting the ellipsoid where w is 1/2 unit less than $w^*$. The extremes of this ellipsoid are the rms errors in the $\alpha$'s: one should allow all the other $\alpha_j$ to vary freely and search for the maximum change in $\alpha_i$ that makes $w = w^* - \tfrac{1}{2}$. This maximum change in $\alpha_i$ is the error in $\alpha_i$, and it equals $\sqrt{V_{ii}}$.
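The scan recipe can be sketched for a quadratic w with a hypothetical correlated Hessian: at each trial value of the first parameter, let the second float freely (maximize w over it), then bisect for the point where the profiled w has fallen by 1/2 unit. The result should match $\sqrt{V_{11}}$, not the smaller half-width $(H_{11})^{-1/2}$:

```python
import numpy as np

# Hypothetical Hessian with a nonzero off-diagonal (correlated errors).
H = np.array([[4.0, 2.0],
              [2.0, 3.0]])
V = np.linalg.inv(H)  # error matrix

def w(b):
    """Quadratic log-likelihood, measured from its maximum w* = 0."""
    return -0.5 * b @ H @ b

def profile(b1):
    """Let the second parameter float freely: maximize w over b2."""
    b2 = -H[0, 1] * b1 / H[1, 1]   # stationary point in b2
    return w(np.array([b1, b2]))

# Bisect for the b1 at which the profiled w has dropped by 1/2 unit.
lo, hi = 0.0, 10.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if profile(mid) > -0.5:
        lo = mid
    else:
        hi = mid
err1 = 0.5 * (lo + hi)              # should equal sqrt(V[0, 0])
half_width = 1.0 / np.sqrt(H[0, 0])  # smaller, since errors are correlated
```

For this H the scan gives $\sqrt{V_{11}} = \sqrt{3/8} \approx 0.61$, while the half-width $(H_{11})^{-1/2} = 0.5$, illustrating the distinction drawn above.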