In Section 5, we introduced the statistic $\chi^2_\psi$ (eq. [26]) as a measure of the coherence of the residual field between the IRAS and TF data. Here we demonstrate that it has approximately the properties of a true $\chi^2$ statistic, and indicate how and why it departs from true $\chi^2$ behavior.
The measure of residual coherence at separation $\tau$ is
\[
\psi(\tau) \;=\; \sum_{i<j;\; |d_{ij}-\tau| \le \Delta\tau/2} \delta_{m,i}\,\delta_{m,j}\,, \qquad \mathrm{(C1)}
\]
where $d_{ij}$ is the separation in IRAS-distance space between objects $i$ and $j$, and $\delta_m$ is the normalized magnitude residual (eq. [23]). The sum runs over the $N_p(\tau)$ distinct pairs of objects with separation $\tau \pm \Delta\tau/2$; note that a given object may appear in more than one of these pairs. The hypothesis we wish to test is that the IRAS-TF residuals are incoherent, which signifies a good fit on all scales. A formal statement of this condition is that the individual $\delta_{m,i}$ are independent random variables. Furthermore, the $\delta_m$ have been constructed to have mean zero and unit variance. Thus, our hypothesis of uncorrelated residuals implies that the expectation value of the product $\delta_{m,i}\,\delta_{m,j}$ vanishes for $i \ne j$, and that the expectation value of its square is unity.
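In code, $\psi(\tau)$ is simply a binned sum of residual products over distinct pairs. The following is a minimal sketch, assuming 3-D object positions in km s$^{-1}$ units and a uniform bin width; the function name and array layout are ours, not the paper's:

```python
import numpy as np

def pair_sums(pos, delta_m, bin_width=200.0, n_bins=3):
    """psi[b]: sum of delta_m[i]*delta_m[j] over distinct pairs i < j whose
    separation falls in bin b of width bin_width; n_p[b]: pair count N_p."""
    n = len(pos)
    psi = np.zeros(n_bins)
    n_p = np.zeros(n_bins, dtype=int)
    for i in range(n):
        for j in range(i + 1, n):                # each distinct pair once (i < j)
            d = np.linalg.norm(pos[i] - pos[j])  # separation in distance space
            b = int(d // bin_width)              # index of the separation bin
            if b < n_bins:
                psi[b] += delta_m[i] * delta_m[j]
                n_p[b] += 1
    return psi, n_p
```

Here bin $b$ collects pairs with $b\,\Delta\tau \le d_{ij} < (b+1)\,\Delta\tau$, i.e. $\tau$ is the bin center.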
It follows that
\[
E[\psi(\tau)] \;=\; \sum_{i<j;\; |d_{ij}-\tau| \le \Delta\tau/2} E[\delta_{m,i}\,\delta_{m,j}] \;=\; 0. \qquad \mathrm{(C2)}
\]
The variance of $\psi(\tau)$ is
\[
E[\psi^2(\tau)] \;=\; \sum_{i<j}\; \sum_{k<l} E[\delta_{m,i}\,\delta_{m,j}\,\delta_{m,k}\,\delta_{m,l}]\,, \qquad \mathrm{(C3)}
\]
where both pair sums are restricted to separations $\tau \pm \Delta\tau/2$.
Now the expectation value within the sum will vanish under our assumption of uncorrelated residuals unless $i = k$ and $j = l$. (Notice that we cannot have $i = l$ and $j = k$ because of the ordered nature of the summation.) Thus, the only nonzero terms in equation (C3) are identical pairs, and it follows that $E[\psi^2(\tau)] = N_p(\tau)$.
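These two moments are easy to verify by simulation. The sketch below uses an arbitrary fixed list of 200 pairs to stand in for one separation bin; objects are deliberately shared among many pairs, yet with independent unit-variance residuals the mean and variance of $\psi$ come out as derived:

```python
import numpy as np

rng = np.random.default_rng(0)
n_obj, n_trials = 30, 20000

# a fixed, hypothetical pair list standing in for one separation bin;
# note that individual objects appear in many pairs
pairs = [(i, j) for i in range(n_obj) for j in range(i + 1, n_obj)][:200]
ii, jj = np.array(pairs).T
n_p = len(pairs)

psi = np.empty(n_trials)
for t in range(n_trials):
    dm = rng.standard_normal(n_obj)   # independent, zero-mean, unit-variance residuals
    psi[t] = np.sum(dm[ii] * dm[jj])  # psi for this realization

# under uncorrelated residuals: E[psi] = 0 and Var[psi] = N_p
print(psi.mean(), psi.var() / n_p)
```

The variance result holds even though objects are shared, because the cross terms in equation (C3) vanish in expectation; sharing affects higher moments, not the variance.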
Because $\psi(\tau)$ is the sum of $N_p(\tau)$ random variables, each of zero mean and unit variance, we are tempted to suppose that, by the central limit theorem, its distribution is Gaussian with mean zero and variance $N_p(\tau)$ when $N_p(\tau)$ is large. Indeed, for the 200 km s$^{-1}$ bins used in its construction (cf. Section 5.2), $N_p$ is typically of order $10^4$. And, as shown in the previous paragraph, $\psi(\tau)$ does indeed have mean zero and variance $N_p(\tau)$. One also may ask about the correlation among the $\psi(\tau)$ for different $\tau$. Specifically, one may compute
\[
E[\psi(\tau_1)\,\psi(\tau_2)] \;=\; \sum_{i<j;\; |d_{ij}-\tau_1| \le \Delta\tau/2}\;\; \sum_{k<l;\; |d_{kl}-\tau_2| \le \Delta\tau/2} E[\delta_{m,i}\,\delta_{m,j}\,\delta_{m,k}\,\delta_{m,l}]. \qquad \mathrm{(C4)}
\]
Now it is possible to have $i = k$ within this sum. However, because $\tau_1 \ne \tau_2$, if $i = k$ then $j \ne l$: the same pair cannot lie in two different separation bins. Similarly, one may have $j = l$, but in that case $i \ne k$. Thus, all of the individual expectation values in the sum vanish, and we find $E[\psi(\tau_1)\,\psi(\tau_2)] = 0$. To the extent the above considerations hold, the $\psi(\tau_i)$ are independent Gaussian random variables of variance $N_p(\tau_i)$. It then follows that the statistic
\[
\chi^2_\psi \;=\; \sum_{i=1}^{M} \frac{\psi^2(\tau_i)}{N_p(\tau_i)}
\]
is distributed like a $\chi^2$ variable with $M$ degrees of freedom. This is the statistic proposed in the main text as a measure of goodness of fit.
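Given binned coherence sums and pair counts, the goodness-of-fit statistic is a one-line reduction: each bin's squared coherence is normalized by its pair count, which is its variance under the null hypothesis. A sketch (the function name is ours; we assume bins containing no pairs are simply dropped from the count of degrees of freedom):

```python
import numpy as np

def chi2_psi(psi, n_p):
    """Return (statistic, M): sum over occupied separation bins of
    psi(tau_i)^2 / N_p(tau_i), which is approximately chi^2-distributed
    with M degrees of freedom if the psi(tau_i) are independent Gaussians
    of variance N_p(tau_i)."""
    psi = np.asarray(psi, dtype=float)
    n_p = np.asarray(n_p, dtype=float)
    keep = n_p > 0                      # ignore empty bins
    return float(np.sum(psi[keep] ** 2 / n_p[keep])), int(keep.sum())
```

For example, bins with $(\psi, N_p) = (2, 4)$ and $(3, 9)$ give $\chi^2_\psi = 1 + 1 = 2$ with $M = 2$.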
However, the central limit theorem applies only to sums of independent random variables. The individual products $\delta_{m,i}\,\delta_{m,j}$ that enter into $\psi(\tau)$ are uncorrelated in the specific sense $E(\delta_{m,i}\,\delta_{m,j}\,\delta_{m,k}\,\delta_{m,l}) = K_{ik}\,K_{jl}$ (where $K$ is the Kronecker delta symbol). However, they are not strictly independent of one another, because the same object can occur in more than one pair at a given $\tau$. We thus expect the central limit theorem to apply only approximately, so that the $\psi(\tau)$ are not strictly Gaussian. As a result, $\chi^2_\psi$ cannot be a true $\chi^2$ statistic.
Furthermore, just as a single object appears in many pairs at a given $\tau$, it can appear in pairs at different $\tau$ as well. Let us suppose object $i$ contributes to both $\psi(\tau_1)$ and $\psi(\tau_2)$. Then the latter are not strictly independent, even though the expectation value of their product vanishes, as shown above. This factor, too, will result in a departure from $\chi^2$ behavior.
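Both effects, object sharing within a bin and across bins, can be seen in a small Monte Carlo. The geometry below is entirely invented (40 objects distributed uniformly in a box, ideal uncorrelated Gaussian residuals): the sample mean of the statistic stays near $M$, as the moment calculations require, while its sample variance tends to exceed the value $2M$ that a true $\chi^2_M$ variable would have:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
n_obj, n_trials = 40, 4000
pos = rng.uniform(0.0, 3000.0, size=(n_obj, 3))    # toy positions (km/s units)

# assign every distinct pair to a 200 km/s separation bin
pairs = np.array(list(combinations(range(n_obj), 2)))
ii, jj = pairs.T
d = np.linalg.norm(pos[ii] - pos[jj], axis=1)
bins = (d // 200.0).astype(int)
n_p = np.bincount(bins)                            # pair counts per bin
good = n_p > 0
M = int(good.sum())                                # number of occupied bins

chi2 = np.empty(n_trials)
for t in range(n_trials):
    dm = rng.standard_normal(n_obj)                # ideal uncorrelated residuals
    psi = np.bincount(bins, weights=dm[ii] * dm[jj])
    chi2[t] = np.sum(psi[good] ** 2 / n_p[good])

# a true chi^2 with M dof has mean M and variance 2M; object sharing
# leaves the mean at M but inflates the variance beyond 2M
print(M, chi2.mean(), chi2.var())
```

With only 40 objects the sharing is much stronger than in the real sample, so this deliberately exaggerates the departure; with $N_p$ of order $10^4$ the Gaussian approximation is far better.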