Published in Q. Jl R. astr. Soc. (1996), 37, 519-563

# PRACTICAL STATISTICS FOR ASTRONOMERS: II. CORRELATION, DATA-MODELLING AND SAMPLE COMPARISON

## J. V. Wall

SUMMARY. Correlation is discussed in the various guises in which astronomers may stumble across it, beginning with the pitfalls of searching for correlations between two variables and the tests, both parametric and non-parametric, for such correlations. This leads to the subject of regression analysis, a particular form of data modelling. Some general aspects and procedures in data modelling and parameter estimation are then described, including least squares, maximum likelihood, Bayesian techniques and minimum chi-square. The final topic is sample comparison, an area of hypothesis testing: seven tests are described, three for the comparison of single samples with prediction and four for inter-sample comparison.

Non-parametric methods are emphasized throughout as being of most use to the astronomer, who frequently faces (i) very small samples and (ii) a lack of the control over the Universe that would be needed either to rerun the experiments or to understand the frequency distributions from which the small samples were drawn.

INTRODUCTION

CORRELATION BETWEEN TWO VARIABLES
The Fishing Trip
Correlation Testing

DATA MODELLING: PARAMETER ESTIMATION
The Method of Least Squares: Regression Analysis
The Maximum-Likelihood (ML) Method
Bayes Estimation
The Minimum Chi-Square Method
The Bootstrap

HYPOTHESIS TESTING: COMPARISON OF SAMPLES
Methodology
Single-Sample Goodness-of-Fit Tests
Tests for Comparison of Two Independent Samples
Summary, One- and Two-Sample Tests

CONCLUSION

REFERENCES

Table A I

Table A II

Table A III

Table A IV

Table A V

Table A VI

Table A VII

Table A VIII

Table A IX