
APPENDIX C: PROPERTIES OF THE STATISTIC $\chi^2_\xi$

In Section 5, we introduced the statistic $\chi^2_\xi$ (eq. [26]) as a measure of the coherence of the residual field between the IRAS and TF data. Here we demonstrate that it has approximately the properties of a true $\chi^2$ statistic, and indicate how and why it departs from true $\chi^2$ behavior.

The measure of residual coherence at separation $\tau$ is

$$\xi(\tau) = \sum_{\substack{i<j \\ d_{ij} = \tau \pm \Delta\tau}} \delta_{m,i}\,\delta_{m,j} \tag{C1}$$

where $d_{ij}$ is the separation in IRAS-distance space between objects $i$ and $j$, and $\delta_m$ is the normalized magnitude residual (eq. [23]). The sum runs over the $N_p(\tau)$ distinct pairs of objects with separation $\tau \pm \Delta\tau$; note that a given object may appear in more than one of these pairs. The hypothesis we wish to test is that the IRAS-TF residuals are incoherent, which signifies a good fit on all scales. A formal statement of this condition is that the individual $\delta_{m,i}$ are independent random variables. Furthermore, the $\delta_m$ have been constructed to have mean zero and unit variance. Thus, our hypothesis of uncorrelated residuals implies that the expectation value of the product $\delta_{m,i}\,\delta_{m,j}$ vanishes for $i \neq j$, and that the expectation value of its square is unity.
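
As an illustration of how this sum might be evaluated in practice, the following Python sketch accumulates the products over all ordered pairs $i < j$ whose separation falls in the bin. The function name `xi_of_tau`, the use of Euclidean distance between position tuples, and the inclusive bin edges $|d_{ij} - \tau| \le \Delta\tau$ are assumptions made for the example, not details taken from the analysis itself.

```python
import math
from itertools import combinations


def xi_of_tau(positions, delta_m, tau, dtau):
    """Return (xi, n_pairs) for the separation bin tau +/- dtau.

    positions : list of coordinate tuples (hypothetical stand-in for
                IRAS-distance-space positions)
    delta_m   : list of normalized magnitude residuals, one per object
    """
    xi = 0.0
    n_pairs = 0
    # combinations() yields each pair once with i < j (the ordered sum).
    for i, j in combinations(range(len(positions)), 2):
        d_ij = math.dist(positions[i], positions[j])
        if abs(d_ij - tau) <= dtau:  # assumed bin convention
            xi += delta_m[i] * delta_m[j]
            n_pairs += 1
    return xi, n_pairs


# Toy data: four objects on a line with made-up residuals.
positions = [(0.0,), (1.0,), (2.0,), (3.0,)]
delta_m = [1.0, -1.0, 2.0, 0.5]
xi, n_pairs = xi_of_tau(positions, delta_m, tau=1.0, dtau=0.5)
print(xi, n_pairs)  # prints: -2.0 3
```

In the toy data, object 1 and object 2 each enter two of the three pairs, which is the multiple-membership property noted above.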

It follows that

$$E[\xi(\tau)] = \sum_{\substack{i<j \\ d_{ij} = \tau \pm \Delta\tau}} E(\delta_{m,i}\,\delta_{m,j}) = 0 \tag{C2}$$

The variance of $\xi(\tau)$ is

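$$E[\xi^2(\tau)] = \sum_{\substack{i<j \\ d_{ij} = \tau \pm \Delta\tau}} \; \sum_{\substack{k<l \\ d_{kl} = \tau \pm \Delta\tau}} E(\delta_{m,i}\,\delta_{m,j}\,\delta_{m,k}\,\delta_{m,l}) \tag{C3}$$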

Now the expectation value within the sum will vanish under our assumption of uncorrelated residuals unless $i = k$ and $j = l$. (Notice that we cannot have $i = l$ and $j = k$ because of the ordered nature of the summation.) Thus, the only nonzero terms in equation (C3) are those in which the two pairs are identical, each of which contributes unity, and it follows that $E[\xi^2(\tau)] = N_p(\tau)$.
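
The mean and variance results can be checked numerically with a small Monte Carlo experiment. The sketch below is only a toy: it assumes 60 objects at integer positions on a line, a single bin $\tau = 3 \pm 1$ in those units, and Gaussian residuals, none of which come from the actual IRAS-TF sample. The helper names `pairs_in_bin` and `simulate_xi` are hypothetical.

```python
import random
from itertools import combinations


def pairs_in_bin(x, tau, dtau):
    """Ordered pairs (i < j) whose 1-D separation lies within tau +/- dtau."""
    return [(i, j) for i, j in combinations(range(len(x)), 2)
            if abs(abs(x[i] - x[j]) - tau) <= dtau]


def simulate_xi(pairs, n_obj, n_trials, rng):
    """Draw independent N(0, 1) residuals n_trials times; return the xi values."""
    draws = []
    for _ in range(n_trials):
        delta = [rng.gauss(0.0, 1.0) for _ in range(n_obj)]
        draws.append(sum(delta[i] * delta[j] for i, j in pairs))
    return draws


rng = random.Random(12345)           # fixed seed for reproducibility
x = [float(k) for k in range(60)]    # toy positions, not real data
pairs = pairs_in_bin(x, tau=3.0, dtau=1.0)
n_pairs = len(pairs)                 # separations 2, 3, 4 -> 58 + 57 + 56 = 171

draws = simulate_xi(pairs, len(x), 3000, rng)
mean_xi = sum(draws) / len(draws)
var_xi = sum(d * d for d in draws) / len(draws)  # E[xi^2], taking the mean as zero

print(n_pairs, mean_xi, var_xi / n_pairs)
```

Under the null hypothesis the printed mean should be small compared with $\sqrt{N_p}$ and the ratio `var_xi / n_pairs` should be close to one, even though most objects here belong to six pairs.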

Because $\xi(\tau)$ is the sum of $N_p(\tau)$ random variables, each of zero mean and unit variance, we are tempted to suppose that, by the central limit theorem, its distribution is Gaussian with mean zero and variance $N_p(\tau)$ when $N_p(\tau)$ is large. Indeed, for the 200 km s$^{-1}$ bins used in its construction (cf. Section 5.2), $N_p$ is typically $\gtrsim 10^4$. And, as shown in the previous paragraph, $\xi(\tau)$ does indeed have mean zero and variance $N_p(\tau)$. One may also ask about the correlation among the $\xi(\tau)$ for different $\tau$. Specifically, one may compute

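$$E[\xi(\tau_1)\,\xi(\tau_2)] = \sum_{\substack{i<j \\ d_{ij} = \tau_1 \pm \Delta\tau}} \; \sum_{\substack{k<l \\ d_{kl} = \tau_2 \pm \Delta\tau}} E(\delta_{m,i}\,\delta_{m,j}\,\delta_{m,k}\,\delta_{m,l}) \tag{C4}$$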

Now it is possible to have $i = k$ within this sum. However, because $\tau_1 \neq \tau_2$, if $i = k$ then $j \neq l$. Similarly, one may have $j = l$, but in that case $i \neq k$. Thus, all of the individual expectation values in the sum vanish, and we find $E[\xi(\tau_1)\,\xi(\tau_2)] = 0$. To the extent that the above considerations hold, the $\xi(\tau_i)$ are independent Gaussian random variables of variance $N_p(\tau_i)$. It then follows that the statistic $\chi^2_\xi$ is distributed like a $\chi^2$ variable with $M$ degrees of freedom. This is the statistic proposed in the main text as a measure of goodness of fit.
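
A toy simulation can illustrate both the vanishing cross-bin expectation and the expected mean of the statistic. The sketch assumes that eq. (26) has the form $\chi^2_\xi = \sum_{i=1}^{M} \xi^2(\tau_i)/N_p(\tau_i)$, which is inferred from the variance result above rather than quoted from the main text. The geometry (60 objects on a line, three non-overlapping bins) and the helper name `pairs_in_bin` are likewise hypothetical.

```python
import random
from itertools import combinations


def pairs_in_bin(x, tau, dtau):
    """Ordered pairs (i < j) whose 1-D separation lies within tau +/- dtau."""
    return [(i, j) for i, j in combinations(range(len(x)), 2)
            if abs(abs(x[i] - x[j]) - tau) <= dtau]


rng = random.Random(2024)
x = [float(k) for k in range(60)]                   # toy positions
bins = [pairs_in_bin(x, tau, 1.0) for tau in (2.0, 5.0, 8.0)]
n_p = [len(p) for p in bins]                        # pairs per bin, M = 3 bins

n_trials = 3000
cross_sum = 0.0
chi2_sum = 0.0
for _ in range(n_trials):
    delta = [rng.gauss(0.0, 1.0) for _ in x]        # null-hypothesis residuals
    xi = [sum(delta[i] * delta[j] for i, j in p) for p in bins]
    cross_sum += xi[0] * xi[1]
    # Assumed form of the statistic: sum over bins of xi^2 / N_p.
    chi2_sum += sum(v * v / n for v, n in zip(xi, n_p))

# Normalized estimate of E[xi(tau_1) xi(tau_2)]; should be near zero.
cross_corr = cross_sum / n_trials / (n_p[0] * n_p[1]) ** 0.5
# Mean of the statistic; should be near M = 3.
mean_chi2 = chi2_sum / n_trials
print(n_p, cross_corr, mean_chi2)
```

Note that the mean of the assumed statistic equals $M$ whenever $E[\xi^2(\tau_i)] = N_p(\tau_i)$, whether or not the $\xi(\tau_i)$ are Gaussian; only the higher moments are sensitive to the approximations.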

However, the central limit theorem applies only to sums of independent random variables. The individual products $\delta_{m,i}\,\delta_{m,j}$ that enter into $\xi(\tau)$ are uncorrelated in the specific sense $E[(\delta_{m,i}\,\delta_{m,j})(\delta_{m,k}\,\delta_{m,l})] = \delta^K_{i,k}\,\delta^K_{j,l}$ (where $\delta^K$ is the Kronecker-delta symbol). They are not, however, strictly independent of one another, because the same object can occur in more than one pair at a given $\tau$. We thus expect the central limit theorem to apply only approximately, so that the $\xi(\tau)$ are not strictly Gaussian. As a result, $\chi^2_\xi$ cannot be a true $\chi^2$ statistic.
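
The effect of shared objects can be seen in two deliberately extreme toy configurations, neither of which represents the real sample. In the first, no object is shared among the $n$ pairs, so $\xi$ is a sum of independent products and its kurtosis $E[\xi^4]/E[\xi^2]^2$ is $3 + 6/n$, approaching the Gaussian value of 3. In the second, a single object belongs to every pair, so $\xi$ is the product of that object's residual with a Gaussian sum and its kurtosis is 9 however large $n$ becomes. The Python sketch below estimates both by Monte Carlo, assuming Gaussian residuals; the function name `kurtosis_of_xi` is hypothetical.

```python
import random


def kurtosis_of_xi(pairs, n_obj, n_trials, rng):
    """Estimate E[xi^4] / E[xi^2]^2, which is 3 for a zero-mean Gaussian."""
    m2 = 0.0
    m4 = 0.0
    for _ in range(n_trials):
        delta = [rng.gauss(0.0, 1.0) for _ in range(n_obj)]
        xi = sum(delta[i] * delta[j] for i, j in pairs)
        m2 += xi ** 2
        m4 += xi ** 4
    m2 /= n_trials
    m4 /= n_trials
    return m4 / m2 ** 2


rng = random.Random(7)
n = 40                                              # pairs in each toy bin
disjoint = [(2 * k, 2 * k + 1) for k in range(n)]   # no object shared
star = [(0, k) for k in range(1, n + 1)]            # object 0 in every pair

kurt_disjoint = kurtosis_of_xi(disjoint, 2 * n, 20000, rng)
kurt_star = kurtosis_of_xi(star, n + 1, 20000, rng)
print(kurt_disjoint, kurt_star)
```

Both configurations have $E[\xi] = 0$ and $E[\xi^2] = n$, so they differ only in the higher moments. A realistic pair geometry lies between these extremes.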

Furthermore, just as a single object appears in many pairs at a given $\tau$, it can appear in pairs at different $\tau$ as well. Let us suppose object $i$ contributes to both $\xi(\tau_1)$ and $\xi(\tau_2)$. Then the latter are not strictly independent, even though the expectation value of their product vanishes, as shown above. This effect, too, will result in a departure from $\chi^2$ behavior.
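
The distinction between uncorrelated and independent can be made concrete with another extreme toy case, again assuming Gaussian residuals and a geometry invented for illustration. Suppose one object is paired with $n$ partners in the bin at $\tau_1$ and with $n$ different partners in the bin at $\tau_2$. Then $E[\xi(\tau_1)\,\xi(\tau_2)] = 0$, but $E[\xi^2(\tau_1)\,\xi^2(\tau_2)] = 3n^2$, three times the value $n^2$ that independence would give. The Python sketch below estimates both quantities.

```python
import random

rng = random.Random(11)
n = 20             # partners of the shared object in each bin (toy value)
n_trials = 30000

s12 = 0.0          # accumulates xi1 * xi2
s11 = 0.0          # accumulates xi1^2
s22 = 0.0          # accumulates xi2^2
s1122 = 0.0        # accumulates xi1^2 * xi2^2
for _ in range(n_trials):
    hub = rng.gauss(0.0, 1.0)                            # shared object's residual
    sum_a = sum(rng.gauss(0.0, 1.0) for _ in range(n))   # partners at tau_1
    sum_b = sum(rng.gauss(0.0, 1.0) for _ in range(n))   # partners at tau_2
    xi1 = hub * sum_a
    xi2 = hub * sum_b
    s12 += xi1 * xi2
    s11 += xi1 ** 2
    s22 += xi2 ** 2
    s1122 += xi1 ** 2 * xi2 ** 2

# Correlation coefficient of xi1 and xi2: expected to be near zero.
corr = s12 / (s11 * s22) ** 0.5
# Ratio E[xi1^2 xi2^2] / (E[xi1^2] E[xi2^2]): 1 if independent, 3 here.
square_ratio = (s1122 / n_trials) / ((s11 / n_trials) * (s22 / n_trials))
print(corr, square_ratio)
```

A ratio well above one shows that a large $|\xi(\tau_1)|$ makes a large $|\xi(\tau_2)|$ more likely, which broadens the distribution of a sum of squares relative to a true $\chi^2$.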
