
1. INTRODUCTION

At first glance, it might appear surprising that a trivial mathematical result obtained by an obscure minister over 200 years ago should still excite so much interest across so many disciplines, from econometrics to biostatistics, from financial risk analysis to cosmology. Published posthumously thanks to Richard Price in 1763, "An essay towards solving a problem in the doctrine of chances" by the Rev. Thomas Bayes (1701(?)-1761) [1] had nothing in it that could herald the growing importance and enormous domain of application that the subject of Bayesian probability theory would acquire more than two centuries afterwards. However, upon reflection there is a very good reason why Bayesian methods are undoubtedly on the rise in this particular historical epoch: the exponential increase in computational power over the last few decades has made massive numerical inference feasible for the first time, thus opening the door to the exploitation of the power and flexibility of a rich set of Bayesian tools. Thanks to fast and cheap computing machines, previously unsolvable inference problems became tractable, and algorithms for numerical simulation flourished almost overnight.

Historically, the connections between physics and Bayesian statistics have always been very strong. Many ideas were developed because of related physical problems, and physicists made several distinguished contributions. One has only to think of people like Laplace, Bernoulli, Gauss, Metropolis, Jeffreys, etc. Cosmology is perhaps among the latest disciplines to have embraced Bayesian methods, a development mainly driven by the data explosion of the last decade, as Figure 1 indicates. However, motivated by difficult and computationally intensive inference problems, cosmologists are increasingly coming up with new solutions that add to the richness of a growing Bayesian literature.


Figure 1. The evolution of the B-word: number of articles in astronomy and cosmology with "Bayesian" in the title, as a function of publication year. The number of papers employing one form or another of Bayesian methods is of course much larger than that. Up until about 1995, Bayesian papers were concerned mostly with image reconstruction techniques, while in subsequent years the domain of application grew to include signal processing, parameter extraction, object detection, cosmological model building, decision theory and experiment optimization, and much more. It appears that interest in Bayesian statistics began growing around 2002 (source: NASA/ADS).

Some cosmologists are sceptical about the usefulness of employing more advanced statistical methods, perhaps because they think with Mark Twain that there are "lies, damned lies and statistics". One argument that is often heard is that there is no point in bothering too much about refined statistical analyses, as better data will in the future resolve the question one way or another, be it the nature of dark energy or the initial conditions of the Universe. I strongly disagree with this view, and would instead argue that sophisticated statistical tools will be increasingly central for modern cosmology. This opinion is motivated by the following reasons:

  1. The complexity of the modelling of both our theories and observations will always increase, thus requiring correspondingly more refined statistical and data analysis skills. In fact, the scientific return of the next generation of surveys will be limited by the level of sophistication and efficiency of our inference tools.
  2. The discovery zone for new physics is when a potentially new effect is seen at the 3-4 sigma level. This is when tantalizing hints of an effect start to accumulate but there is no firm evidence yet. In this potential discovery region a careful application of statistics can make the difference between claiming or missing a new discovery.
  3. If you are a theoretician, you do not want to waste your time trying to explain an effect that is not there in the first place. A better appreciation of the interpretation of statistical statements might help in distinguishing robust claims from spurious ones.
  4. Limited resources mean that we need to focus our efforts on the most promising avenues. Experiment forecasting and optimization will become increasingly prominent as we need to use all of our current knowledge (and the associated uncertainty) to identify the observations and strategies that are likely to give the highest scientific return in a given field.
  5. Sometimes there will be no better data! This is the case for the many problems associated with cosmic-variance-limited measurements on large scales, for example in the cosmic background radiation, where the small number of independent directions on the sky makes it impossible to reduce the error below a certain level.
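
The irreducible error mentioned in the last point can be made concrete with a short numerical sketch. It assumes the standard textbook idealization, which is not spelled out in the text above: a Gaussian, statistically isotropic field observed without noise, for which each multipole ell of the angular power spectrum is estimated from only 2*ell + 1 independent modes, giving a fractional uncertainty of sqrt(2 / (2*ell + 1)). The function name and the `f_sky` argument (a commonly used approximate correction for partial sky coverage) are illustrative choices, not taken from this review.

```python
import math


def cosmic_variance_fraction(ell, f_sky=1.0):
    """Fractional 1-sigma uncertainty on C_ell from cosmic variance alone.

    Assumes a Gaussian, isotropic, noiseless field. `f_sky` is the observed
    sky fraction; dividing the mode count by it is only an approximation.
    """
    if ell < 2:
        raise ValueError("ell must be >= 2 (monopole and dipole excluded)")
    if not 0.0 < f_sky <= 1.0:
        raise ValueError("f_sky must lie in (0, 1]")
    n_modes = (2 * ell + 1) * f_sky  # effective number of independent modes
    return math.sqrt(2.0 / n_modes)


if __name__ == "__main__":
    # The error floor shrinks with ell but never vanishes on large scales.
    for ell in (2, 10, 100, 1000):
        frac = cosmic_variance_fraction(ell)
        print(f"ell = {ell:5d}  ->  Delta C_ell / C_ell = {frac:.3f}")
```

Under these assumptions the quadrupole (ell = 2) cannot be known to better than about 63 per cent, whatever the quality of the instrument, while at ell = 100 the floor is about 10 per cent.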

This review focuses on Bayesian methodologies and related issues, presenting some illustrative results where appropriate and reviewing the current state of the art of Bayesian methods in cosmology. The emphasis is on the innovative character of Bayesian tools. The level is introductory, pitched at graduate students who are approaching the field for the first time, and aims to bridge the gap between basic textbook examples and applications to current research. In the last sections we present some more advanced material that we hope might be useful for the seasoned practitioner, too. A basic understanding of cosmology and of the interplay between theory and cosmological observations (at the level of the introductory chapters in [2]) is assumed. A full list of references is provided as a comprehensive guide to the relevant literature across disciplines.

This paper is organized in two main parts. The first part, sections 2-4, focuses on probability theory, methodological issues and Bayesian methods generally. In section 2 we present the fundamental distinction between probability as frequency and probability as degree of belief, introduce Bayes' Theorem and discuss the meaning and role of priors in Bayesian theory. Section 3 is devoted to Bayesian parameter inference and related issues in parameter extraction. Section 4 deals with Bayesian model comparison from a conceptual and technical point of view, covering Occam's razor principle, its practical implementation in the form of the Bayesian evidence, the effective number of model parameters and information criteria for approximate model comparison. The second part presents applications to cosmological parameter inference and related topics (section 5) and to Bayesian cosmological model building (section 6), including multi-model inference and model comparison forecasting. Section 7 gives our conclusions.
