7.1 The Least Squares Method
Let us suppose that measurements at $n$ points $x_i$ are made of the variable $y_i$ with an error $\sigma_i$ ($i = 1, 2, \ldots, n$), and that it is desired to fit a function $f(x; a_1, a_2, \ldots, a_m)$ to these data, where $a_1, a_2, \ldots, a_m$ are unknown parameters to be determined. Of course, the number of points must be greater than the number of parameters. The method of least squares states that the best values of the $a_j$ are those for which the sum

$$S = \sum_{i=1}^{n} \frac{\left[ y_i - f(x_i; a_1, \ldots, a_m) \right]^2}{\sigma_i^2} \tag{70}$$
is a minimum. Examining (70), we can see that this is just the sum of the squared deviations of the data points from the curve $f(x_i)$, weighted by the respective errors on $y_i$. The reader might also recognize this as the chi-square in (22). For this reason, the method is also sometimes referred to as chi-square minimization. Strictly speaking, this is not quite correct, as $y_i$ must be Gaussian distributed with mean $f(x_i; a_j)$ and variance $\sigma_i^2$ in order for $S$ to be a true chi-square. However, as this is almost always the case for measurements in physics, this is a valid hypothesis most of the time. The least squares method, however, is totally general and does not require knowledge of the parent distribution. If the parent distribution is known, the method of maximum likelihood may also be used; in the case of Gaussian distributed errors, this yields identical results.
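As a minimal numerical sketch, the sum (70) can be evaluated directly for a set of measurements. The data, errors, and straight-line model $f(x; a_1, a_2) = a_1 + a_2 x$ below are invented purely for illustration:

```python
import numpy as np

# Hypothetical data: n = 5 measurements y_i at points x_i with errors sigma_i.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])
sigma = np.array([0.2, 0.2, 0.3, 0.3, 0.4])

def S(a, x, y, sigma):
    """Weighted sum of squared deviations -- the quantity (70) --
    for a straight-line model f(x; a1, a2) = a1 + a2*x."""
    residuals = y - (a[0] + a[1] * x)
    return np.sum((residuals / sigma) ** 2)

print(S([1.0, 2.0], x, y, sigma))
```

The best values of $a_1$ and $a_2$ are those that make this quantity as small as possible.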
To find the values of the $a_j$, one must now solve the system of equations

$$\frac{\partial S}{\partial a_j} = 0, \qquad j = 1, 2, \ldots, m. \tag{71}$$

Depending on the function $f(x)$, (71) may or may not yield an analytic solution. In general, numerical methods requiring a computer must be used to minimize $S$.
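For a model that is linear in the parameters, (71) does have an analytic solution: the conditions $\partial S / \partial a_j = 0$ reduce to the weighted normal equations. A sketch for the same invented straight-line data as above, with the usual design-matrix formulation (each column one parameter, weights $1/\sigma_i^2$):

```python
import numpy as np

# Invented straight-line data, f(x; a1, a2) = a1 + a2*x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])
sigma = np.array([0.2, 0.2, 0.3, 0.3, 0.4])

# Design matrix: one column per parameter (constant term, slope term).
A = np.column_stack([np.ones_like(x), x])
W = np.diag(1.0 / sigma**2)          # weight matrix, 1/sigma_i^2

# dS/da_j = 0 gives the weighted normal equations (A^T W A) a = A^T W y.
a = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)
print(a)   # best-fit values of a1 (intercept) and a2 (slope)
```

For a model that is nonlinear in the $a_j$, one would instead hand $S$ to a general-purpose minimizer.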
Assuming we have the best values for the $a_j$, it is necessary to estimate the errors on the parameters. For this, we form the so-called covariance or error matrix $V_{ij}$,

$$\left( V^{-1} \right)_{ij} = \frac{1}{2} \, \frac{\partial^2 S}{\partial a_i \, \partial a_j},$$

where the second derivatives are evaluated at the minimum. (Note that the second derivatives form the inverse of the error matrix.) The diagonal elements $V_{ii}$ can then be shown to be the variances for the $a_i$, while the off-diagonal elements $V_{ij}$ represent the covariances between $a_i$ and $a_j$. Thus,

$$\sigma^2(a_i) = V_{ii}, \qquad \operatorname{cov}(a_i, a_j) = V_{ij}.$$
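Continuing the invented straight-line example: for a model linear in the parameters, the Hessian of $S$ is constant and equals $2\,A^T W A$, so the error matrix is simply $(A^T W A)^{-1}$ and the parameter errors are the square roots of its diagonal. A sketch under those assumptions:

```python
import numpy as np

# Same invented measurement points and errors as before.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
sigma = np.array([0.2, 0.2, 0.3, 0.3, 0.4])

A = np.column_stack([np.ones_like(x), x])
W = np.diag(1.0 / sigma**2)

# (V^-1)_ij = (1/2) d^2 S / da_i da_j; for a model linear in the
# parameters the Hessian of S is exactly 2 A^T W A, so:
V = np.linalg.inv(A.T @ W @ A)

errors = np.sqrt(np.diag(V))   # standard errors on a1 and a2
print(errors)
```

Note that $V$ depends only on the measurement points and their errors, not on the measured $y_i$ themselves; this is a peculiarity of linear models.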