Until now we have been discussing the situation in which the experimental result is $N$ events giving precise values $x_1, \ldots, x_N$, where the $x_i$ may or may not all be different.
From now on we shall confine our attention to the case of $p$ measurements (not $p$ events) at the points $x_1, \ldots, x_p$. The experimental results are $(y_1 \pm \sigma_1), \ldots, (y_p \pm \sigma_p)$. One such type of experiment is where each measurement consists of $N_i$ events. Then $y_i = N_i$ and is Poisson-distributed with $\sigma_i = \sqrt{N_i}$. In this case the likelihood function is

$$L = \prod_{i=1}^{p} \frac{\bar{y}(x_i)^{N_i}\, e^{-\bar{y}(x_i)}}{N_i!}$$

and

$$w(\alpha) = \ln L = \sum_{i=1}^{p} \left[ N_i \ln \bar{y}(x_i) - \bar{y}(x_i) \right] + \text{const.}$$

We use the notation $\bar{y}(\alpha_1, \ldots, \alpha_M; x)$ for the curve that is to be fitted to the experimental points. The best-fit curve corresponds to $\alpha_j = \alpha_j^*$. In this case of Poisson-distributed points, the solutions are obtained from the $M$ simultaneous equations

$$\sum_{i=1}^{p} \left( \frac{N_i}{\bar{y}(x_i)} - 1 \right) \frac{\partial \bar{y}(x_i)}{\partial \alpha_j} = 0, \qquad j = 1, \ldots, M$$
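For concreteness, here is a minimal numerical sketch of such a Poisson fit. The straight-line shape for $\bar{y}$ and the counts are invented for illustration, and the maximization is done with the general-purpose scipy.optimize.minimize rather than by solving the $M$ equations analytically:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical data: N_i events observed at each point x_i.
x = np.array([0.5, 1.5, 2.5, 3.5, 4.5])
N = np.array([12, 19, 31, 42, 48])

def ybar(alpha):
    # Illustrative curve shape, linear in x; any positive shape would do.
    return alpha[0] + alpha[1] * x

def neg_w(alpha):
    # Negative Poisson log-likelihood; the alpha-independent ln(N_i!) is dropped.
    yb = ybar(alpha)
    if np.any(yb <= 0):
        return np.inf          # keep ln(ybar) defined during the search
    return -np.sum(N * np.log(yb) - yb)

best = minimize(neg_w, x0=[10.0, 5.0], method="Nelder-Mead")
print("alpha* =", best.x)      # maximum-likelihood solution
```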
If all the $N_i \gg 1$, then it is a good approximation to assume each $y_i$ is Gaussian-distributed with standard deviation $\sigma_i$. (It is better to use $\bar{y}_i$ rather than $N_i$ for $\sigma_i^2$, where $\bar{y}_i$ can be obtained by integrating $\bar{y}(x)$ over the $i$th interval.) Then one can use the famous least-squares method.
The remainder of this section is devoted to the case in which the $y_i$ are Gaussian-distributed with standard deviations $\sigma_i$. See Fig. 4. We shall now see that the least-squares method is mathematically equivalent to the maximum likelihood method. In this Gaussian case the likelihood function is

$$L = \prod_{i=1}^{p} \frac{1}{\sigma_i \sqrt{2\pi}} \exp\left[ -\frac{\left(y_i - \bar{y}(\alpha; x_i)\right)^2}{2\sigma_i^2} \right] \tag{23}$$

where

$$w(\alpha) = \ln L = -\frac{S(\alpha)}{2} + \text{const}, \qquad S(\alpha) \equiv \sum_{i=1}^{p} \frac{\left(y_i - \bar{y}(\alpha; x_i)\right)^2}{\sigma_i^2} \tag{24}$$
Figure 4. $\bar{y}(x)$ is a function of known shape to be fitted to the 7 experimental points.
The solutions $\alpha_j = \alpha_j^*$ are given by minimizing $S(\alpha)$ (maximizing $w$):

$$\frac{\partial S}{\partial \alpha_j} = 0, \qquad j = 1, \ldots, M \tag{25}$$
This minimum value of $S$ is called $S^*$, the least-squares sum. The values of $\alpha_j$ which minimize $S$ are called the least-squares solutions. Thus the maximum-likelihood and least-squares solutions are identical. According to Eq. (11), the least-squares errors are

$$\Delta\alpha_j = \left( \frac{1}{2} \frac{\partial^2 S}{\partial \alpha_j^2} \right)_{\alpha = \alpha^*}^{-1/2}$$
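The equivalence can be checked numerically: since $w = -S/2 + \text{const}$, minimizing $S$ and maximizing $w$ must land on the same $\alpha^*$. A small sketch with invented straight-line data (the points, errors, and curve shape are assumptions for illustration):

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical Gaussian measurements y_i ± sigma_i at points x_i.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])
sigma = np.array([0.2, 0.3, 0.2, 0.4])

def ybar(alpha):
    return alpha[0] + alpha[1] * x          # assumed curve shape

def S(alpha):                               # least-squares sum, Eq. (24)
    return np.sum((y - ybar(alpha))**2 / sigma**2)

def neg_w(alpha):                           # -w = S/2 + const
    return 0.5 * S(alpha)

a_ls = minimize(S, x0=[0.0, 1.0]).x         # least-squares solution
a_ml = minimize(neg_w, x0=[0.0, 1.0]).x     # maximum-likelihood solution
print(a_ls, a_ml)                           # identical within solver tolerance
```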
Let us consider the special case in which $\bar{y}(\alpha; x)$ is linear in the $\alpha_j$:

$$\bar{y}(x) = \sum_{j=1}^{M} \alpha_j f_j(x)$$
(Do not confuse this $f(x)$ with the $f(x)$ on page 2.)
Then

$$S = \sum_{i=1}^{p} \frac{\left( y_i - \sum_{j=1}^{M} \alpha_j f_j(x_i) \right)^2}{\sigma_i^2} \tag{26}$$
Differentiating with respect to $\alpha_j$ gives

$$\frac{1}{2} \frac{\partial S}{\partial \alpha_j} = \sum_{i=1}^{p} \frac{\left( \sum_{k=1}^{M} \alpha_k f_k(x_i) - y_i \right) f_j(x_i)}{\sigma_i^2} = 0 \tag{27}$$
Define

$$h_{jk} \equiv \sum_{i=1}^{p} \frac{f_j(x_i)\, f_k(x_i)}{\sigma_i^2}, \qquad g_j \equiv \sum_{i=1}^{p} \frac{y_i\, f_j(x_i)}{\sigma_i^2} \tag{28}$$
Then

$$\sum_{k=1}^{M} h_{jk}\, \alpha_k = g_j, \qquad j = 1, \ldots, M$$
In matrix notation the $M$ simultaneous equations giving the least-squares solution are

$$H\alpha = g \tag{29}$$
Then $\alpha^* = H^{-1} g$ is the solution for the $\alpha^*$'s. The errors in $\alpha^*$ are obtained using Eq. 11. To summarize:

$$\alpha^* = H^{-1} g, \qquad \left(\Delta\alpha_j\right)^2 = \left(H^{-1}\right)_{jj} \tag{30}$$
Equation (30) is the complete procedure for calculating the least-squares solutions and their errors. Note that even though this procedure is called curve fitting, it is never necessary to plot any curves. Quite often the complete experiment may be a combination of several experiments in which several different curves (all functions of the $\alpha_j$) may be jointly fitted. Then the $S$-value is the sum over all the points on all the curves. Note that since $w(\alpha)$ decreases by $\tfrac{1}{2}$ unit when one of the $\alpha_j$ takes the value $(\alpha_j^* \pm \Delta\alpha_j)$, the $S$-value must increase by one unit. That is,

$$S(\alpha_j^* \pm \Delta\alpha_j) = S^* + 1$$
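Eq. (30) translates directly into a few lines of linear algebra. The following sketch (the function name and NumPy formulation are mine, not from the text) builds $H$ and $g$ from Eq. (28) and returns $\alpha^*$ together with the error matrix $H^{-1}$:

```python
import numpy as np

def least_squares_fit(x, y, sigma, funcs):
    """Fit ybar(x) = sum_j alpha_j * f_j(x) by Eq. (30).

    funcs: list of the M basis functions f_j.
    Returns alpha* and the error matrix H^{-1}; (delta alpha_j)^2 = Hinv[j, j].
    """
    x, y, sigma = map(np.asarray, (x, y, sigma))
    F = np.array([[f(xi) for f in funcs] for xi in x])  # F[i, j] = f_j(x_i)
    w = 1.0 / sigma**2                                  # weights 1/sigma_i^2
    H = F.T @ (w[:, None] * F)                          # h_jk of Eq. (28)
    g = F.T @ (w * y)                                   # g_j  of Eq. (28)
    Hinv = np.linalg.inv(H)
    return Hinv @ g, Hinv                               # alpha* = H^{-1} g
```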
Example 5 Linear regression with equal errors
$\bar{y}(x)$ is known to be of the form $\bar{y}(x) = \alpha_1 + \alpha_2 x$. There are $p$ experimental measurements $(y_i \pm \sigma)$. Using Eq. (30) with $f_1(x) = 1$ and $f_2(x) = x$, we have

$$H = \frac{1}{\sigma^2} \begin{pmatrix} p & \sum x_i \\ \sum x_i & \sum x_i^2 \end{pmatrix}, \qquad g = \frac{1}{\sigma^2} \begin{pmatrix} \sum y_i \\ \sum x_i y_i \end{pmatrix}$$

$$\alpha_1^* = \frac{\sum x_i^2 \sum y_i - \sum x_i \sum x_i y_i}{p \sum x_i^2 - \left(\sum x_i\right)^2}, \qquad \alpha_2^* = \frac{p \sum x_i y_i - \sum x_i \sum y_i}{p \sum x_i^2 - \left(\sum x_i\right)^2}$$
These are the linear regression formulas which are programmed into many pocket calculators. They should not be used in those cases where the $\sigma_i$ are not all the same. If the $\sigma_i$ are all equal, the errors are

$$\left(\Delta\alpha_1\right)^2 = \frac{\sigma^2 \sum x_i^2}{p \sum x_i^2 - \left(\sum x_i\right)^2}, \qquad \left(\Delta\alpha_2\right)^2 = \frac{p\, \sigma^2}{p \sum x_i^2 - \left(\sum x_i\right)^2}$$
or, equivalently, in terms of $\bar{x} \equiv \frac{1}{p} \sum x_i$,

$$\left(\Delta\alpha_1\right)^2 = \sigma^2 \left( \frac{1}{p} + \frac{\bar{x}^2}{\sum (x_i - \bar{x})^2} \right), \qquad \left(\Delta\alpha_2\right)^2 = \frac{\sigma^2}{\sum (x_i - \bar{x})^2}$$
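These closed-form sums are easy to transcribe into code. A sketch of the equal-error case (the function name is mine):

```python
import numpy as np

def linear_regression(x, y, sigma):
    """Equal-error straight-line fit ybar = a1 + a2*x (Example 5)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    p = len(x)
    D = p * np.sum(x**2) - np.sum(x)**2         # common denominator
    a1 = (np.sum(x**2) * np.sum(y) - np.sum(x) * np.sum(x*y)) / D
    a2 = (p * np.sum(x*y) - np.sum(x) * np.sum(y)) / D
    da1 = np.sqrt(sigma**2 * np.sum(x**2) / D)  # errors for equal sigma
    da2 = np.sqrt(p * sigma**2 / D)
    return a1, a2, da1, da2
```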
Example 6 Quadratic regression with unequal errors
The curve to be fitted is known to be a parabola. There are four experimental points at $x = -0.6$, $-0.2$, $0.2$, and $0.6$. The experimental results are $5 \pm 2$, $3 \pm 1$, $5 \pm 1$, and $8 \pm 2$. Find the best-fit curve.

Solution: Here $f_1 = 1$, $f_2 = x$, $f_3 = x^2$, and Eq. (28) gives

$$H = \begin{pmatrix} 2.5 & 0 & 0.26 \\ 0 & 0.26 & 0 \\ 0.26 & 0 & 0.068 \end{pmatrix}, \qquad g = \begin{pmatrix} 11.25 \\ 0.85 \\ 1.49 \end{pmatrix}$$

Then $\alpha^* = H^{-1} g$ and $(\Delta\alpha_j)^2 = (H^{-1})_{jj}$ give the result below.
$\bar{y}(x) = (3.685 \pm 0.815) + (3.27 \pm 1.96)\,x + (7.808 \pm 4.94)\,x^2$ is the best-fit curve. This is shown with the experimental points in Fig. 5.
Figure 5. This parabola is the least-squares fit to the 4 experimental points in Example 6.
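Example 6 can be reproduced with a few lines of NumPy; this sketch simply instantiates Eq. (30) for the parabola basis $(1, x, x^2)$:

```python
import numpy as np

# Example 6 data
x = np.array([-0.6, -0.2, 0.2, 0.6])
y = np.array([5.0, 3.0, 5.0, 8.0])
sigma = np.array([2.0, 1.0, 1.0, 2.0])

F = np.vstack([np.ones_like(x), x, x**2]).T   # F[i, j] = f_j(x_i)
w = 1.0 / sigma**2
H = F.T @ (w[:, None] * F)                    # Eq. (28)
g = F.T @ (w * y)
Hinv = np.linalg.inv(H)                       # error matrix
alpha = Hinv @ g                              # alpha* = H^{-1} g
print(alpha)                                  # ≈ [3.69, 3.27, 7.81]
print(np.sqrt(np.diag(Hinv)))                 # ≈ [0.81, 1.96, 4.94]
```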
Example 7
In Example 6, what is the best estimate of $y$ at $x = 1$? What is the error of this estimate?
Solution: Putting $x = 1$ into the above equation gives

$$y^* = 3.685 + 3.27 + 7.808 = 14.763$$

$\Delta y$ is obtained using Eq. 12, with the error matrix $H^{-1}$ of Example 6:

$$(\Delta y)^2 = \sum_{j,k} \frac{\partial \bar{y}}{\partial \alpha_j} \frac{\partial \bar{y}}{\partial \alpha_k} \left(H^{-1}\right)_{jk}$$

Setting $x = 1$ gives $\partial \bar{y}/\partial \alpha_j = f_j(1) = 1$ for all $j$, so $(\Delta y)^2$ is the sum of all nine elements of $H^{-1}$:

$$(\Delta y)^2 = 0.664 + 3.846 + 24.414 + 2(-2.539) = 23.85$$

So at $x = 1$, $y = 14.76 \pm 4.88$. Note that the anticorrelation between $\alpha_1^*$ and $\alpha_3^*$ makes this error smaller than the value obtained by ignoring the off-diagonal elements of $H^{-1}$.
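The same propagation in code; the Example 6 fit is recomputed here so the block stands alone:

```python
import numpy as np

# Refit Example 6, then propagate the errors to y(1) via Eq. 12.
x = np.array([-0.6, -0.2, 0.2, 0.6])
y = np.array([5.0, 3.0, 5.0, 8.0])
sigma = np.array([2.0, 1.0, 1.0, 2.0])
F = np.vstack([np.ones_like(x), x, x**2]).T
Hinv = np.linalg.inv(F.T @ ((1/sigma**2)[:, None] * F))
alpha = Hinv @ (F.T @ (y / sigma**2))

f = np.array([1.0, 1.0, 1.0])      # (1, x, x^2) evaluated at x = 1
print(f @ alpha)                   # ≈ 14.77, best estimate of y(1)
print(np.sqrt(f @ Hinv @ f))       # ≈ 4.88, off-diagonal terms included
```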
Least Squares When the $y_i$ Are Not Independent
Let

$$V_{ij} \equiv \left\langle (y_i - \bar{y}_i)(y_j - \bar{y}_j) \right\rangle$$

be the error matrix of the $y$ measurements. Now we shall treat the more general case where the off-diagonal elements need not be zero; i.e., the quantities $y_i$ are not independent. We see immediately from Eq. 11a that the log-likelihood function is

$$w = -\frac{1}{2} \sum_{i,j} \left(y_i - \bar{y}_i\right) \left(V^{-1}\right)_{ij} \left(y_j - \bar{y}_j\right) + \text{const}$$
The maximum-likelihood solution is found by minimizing

$$S = \sum_{i,j} \left(y_i - \bar{y}_i\right) \left(V^{-1}\right)_{ij} \left(y_j - \bar{y}_j\right) \qquad \text{(generalized least-squares sum)}$$
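For a curve linear in the $\alpha_j$, minimizing this generalized sum again reduces to linear algebra, with $H = F^{\mathsf T} V^{-1} F$ and $g = F^{\mathsf T} V^{-1} y$, which collapse to Eq. (28) when $V$ is diagonal. A sketch with an invented $3 \times 3$ error matrix and a straight-line basis (all numbers are illustrative assumptions):

```python
import numpy as np

# Correlated measurements: V is the error matrix of the y_i.
x = np.array([0.0, 1.0, 2.0])
y = np.array([1.2, 2.9, 5.1])
V = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.04]])
Vinv = np.linalg.inv(V)

F = np.vstack([np.ones_like(x), x]).T       # straight-line basis f = (1, x)
H = F.T @ Vinv @ F                          # generalizes h_jk of Eq. (28)
g = F.T @ Vinv @ y
Hinv = np.linalg.inv(H)                     # error matrix of the alpha*
alpha = Hinv @ g                            # generalized least-squares solution
S_min = (y - F @ alpha) @ Vinv @ (y - F @ alpha)
print(alpha, S_min)
```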