Modeling the Panchromatic Spectral Energy Distributions of Galaxies

Annu. Rev. Astron. Astrophys. 2013. 51:393-455
Copyright © 2013 by Annual Reviews. All rights reserved

3. MASS-TO-LIGHT RATIOS & STELLAR MASSES

There are three basic techniques for estimating the stellar mass-to-light ratio, M / L, of a galaxy: (1) using tabulated relations between color and M / L; (2) modeling broadband photometry; (3) modeling moderate resolution spectra. The first technique is the simplest to use as it requires photometry in only two bands and no explicit modeling. The other techniques require construction of a library of models and a means to fit those models to the data. How do these different techniques compare?

3.1.1. Color-based M / L Ratios. The color-based M / L estimators have their origin in the pioneering work of Bell & de Jong (2001). These authors used a preliminary version of the Bruzual & Charlot (2003) SPS model to chart out the relation between M / L and color as a function of metallicity and SFH. Remarkably, they found that variation in metallicity and SFH (parameterized as a τ model) moved galaxies along a well-defined locus in the space of M / L_B vs. B-R, suggesting that the B-R color could be a useful proxy for M / L_B. Perhaps most importantly, they demonstrated that the dust reddening vector was approximately parallel to the inferred color-M / L relation, implying that dust should have only a second-order effect on derived M / L values. They also investigated NIR colors and found much larger variation in M / L at fixed I-K, due mostly to variation in the SFH. Late bursts of SF complicate the interpretation of color-M / L relations by driving the M / L ratios lower at fixed color compared to smoothly declining SFHs. Their analysis was for a fixed IMF; allowing for IMF variation will change the M / L while leaving the color basically unchanged. These authors concluded that for a fixed IMF and assuming that large starbursts and low-metallicity systems are not common properties of galaxies, and neglecting uncertainties in the SPS models, one could estimate M / L from a single color to an accuracy of ~ 0.1-0.2 dex.
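
As a rough illustration of how a color-based estimator is applied, the sketch below assumes the relation takes a log-linear form, log10(M / L) = a + b × color, and converts a luminosity and a color into a stellar mass. The coefficients `A_PLACEHOLDER` and `B_PLACEHOLDER`, the luminosity, and the color are invented for illustration and are not values from any published calibration; the ±0.2 dex band corresponds to the upper end of the accuracy quoted above.

```python
def log_mass_to_light(color, a, b):
    """Assumed log-linear relation: log10(M/L) = a + b * color."""
    return a + b * color


def stellar_mass(luminosity_lsun, color, a, b):
    """Stellar mass (solar masses) from a luminosity (solar units) and a color."""
    return luminosity_lsun * 10.0 ** log_mass_to_light(color, a, b)


# Placeholder coefficients: NOT a published calibration.
A_PLACEHOLDER = -1.0
B_PLACEHOLDER = 1.0

lum_b = 1.0e10   # hypothetical B-band luminosity in solar units
color_br = 1.2   # hypothetical B-R color in magnitudes

mass = stellar_mass(lum_b, color_br, A_PLACEHOLDER, B_PLACEHOLDER)

# A 0.2 dex uncertainty on M/L maps onto a multiplicative band in mass.
SCATTER_DEX = 0.2
mass_low = mass * 10.0 ** (-SCATTER_DEX)
mass_high = mass * 10.0 ** (+SCATTER_DEX)

print(f"M* = {mass:.3e} Msun (range {mass_low:.3e} to {mass_high:.3e})")
```

A change of IMF would enter this sketch only through the zero point `a`, consistent with the statement that IMF variation changes M / L while leaving the color basically unchanged.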

In subsequent work, Bell et al. (2003) analyzed the optical-NIR SEDs of ~ 12,000 galaxies with SDSS and 2MASS photometry. They constructed a grid of SPS models with varying SFHs, including bursts, and metallicity. They did not allow for reddening due to dust. They derived best-fit M / L values from SED fitting and created new, observationally-constrained color-M / L relations. The resulting B-R vs. M / L_B relation was similar to Bell & de Jong (2001), but the NIR relation was considerably more shallow than the earlier work. At the bluest colors, the M / L_K ratios differed by ≈0.3 dex between the old and new relations. The difference arose due to a population of galaxies with blue observed colors and high inferred M / L ratios, which Bell et al. (2003) interpreted as being due to the lower metallicities allowed in their fitting procedure compared to the analytic models of Bell & de Jong. These differences are of some concern because the Bell et al. (2003) color-M / L relations are very widely used to derive `cheap' stellar mass estimates of galaxies.

Zibetti, Charlot & Rix (2009) revisited this issue with new SPS models from Charlot & Bruzual (in prep). They followed a somewhat different approach compared to Bell et al. (2003) in that they constructed color-M / L relations directly from a library of model galaxies. This approach is sensitive to how one chooses to populate parameter space, since the resulting color-M / L relations are simple averages over the model space. In contrast, Bell et al. (2003) constructed color-M / L relations only from those models that provided good fits to observed optical-NIR photometry. Zibetti et al. allowed for dust attenuation, and they also allowed for larger mass fractions of stars formed in recent bursts compared to Bell et al. Zibetti, Charlot & Rix (2009) found substantial differences in the g-i color-M / L relation compared to Bell et al. (~ 0.5 dex at the bluest colors), which they attributed to their allowance for much stronger bursts of star formation than allowed in Bell et al. (2003). Gallazzi & Bell (2009) also found a very different color-M / L relation compared to Bell et al. (2003), which they attributed to the different age distributions in their model library. Taylor et al. (2011) followed the philosophy of Bell et al. (2003) in fitting models to observed SEDs in order to derive color-M / L relations. They found systematic differences of order 0.1-0.2 dex compared to the Zibetti et al. relations, which they attributed mostly to their decision to select only those models that fit the photometric data, and also the treatment of dust reddening.
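
The sensitivity of a library-averaged relation to how parameter space is populated can be seen in a toy calculation. The sketch below takes the median log10(M / L) in bins of color for a small, entirely hypothetical model library, once with and once without a set of burst models that have low M / L at blue colors. The library values, bin edges, and the use of a median are illustrative assumptions, not the procedure of any of the papers cited above.

```python
from statistics import median


def binned_relation(models, bin_edges):
    """Median log10(M/L) per color bin; None where a bin holds no models."""
    relation = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        values = [log_ml for color, log_ml in models if lo <= color < hi]
        relation.append(median(values) if values else None)
    return relation


# Hypothetical library entries: (color, log10 M/L).
SMOOTH_MODELS = [(0.5, -0.30), (0.6, -0.20), (1.0, 0.10), (1.1, 0.20)]
BURST_MODELS = [(0.5, -0.80), (0.6, -0.70)]  # bursts: lower M/L at fixed color

BIN_EDGES = [0.4, 0.8, 1.2]  # one blue bin and one red bin

with_bursts = binned_relation(SMOOTH_MODELS + BURST_MODELS, BIN_EDGES)
without_bursts = binned_relation(SMOOTH_MODELS, BIN_EDGES)

# Offset between the two relations in each bin, in dex.
offsets = [a - b for a, b in zip(without_bursts, with_bursts)]
print("offsets (dex):", [round(x, 2) for x in offsets])
```

In this toy library the two relations agree in the red bin and differ by 0.25 dex in the blue bin, which mirrors qualitatively (not quantitatively) the pattern that the largest disagreements between published relations occur at the bluest colors.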

Zibetti, Charlot & Rix (2009) found a large difference in M / L_H ratios estimated from i-H colors between the new Charlot & Bruzual models and the previous version (Bruzual & Charlot 2003), with typical offsets of 0.2-0.4 dex. This difference is largely due to the different treatment of the TP-AGB evolutionary phase (see Section 3.2 for details). In contrast, optical color-M / L relations showed little difference between the two SPS models, and also showed slightly less discrepancy with Bell et al. (2003). These authors argued that optical color-M / L relations should generally be preferred to NIR-based ones owing to the larger impact of the uncertain TP-AGB phase in the NIR. Exceptions to this rule could be made for galaxies with very large amounts of dust reddening and/or very strong starbursts, in which cases the optical SED becomes a poor indicator of the SFH and hence M / L. The upshot is that NIR color-based M / L estimates should be treated with caution, since they are subject to large stellar evolution-based systematics. However, both optical and NIR-based M / L estimates will depend on the allowed range of SFHs and metallicities in the model library, with large systematics appearing for the bluest colors.

3.1.2. M / L From Broadband and Spectral Fitting Techniques. The modern era of SED fitting with SPS models was ushered in by Sawicki & Yee (1998) and Giallongo et al. (1998), who analyzed optical-NIR broadband photometry of high-redshift galaxies. The model grid of Sawicki & Yee (1998) consisted of a τ-model SFH with varying τ and start time for the SFH, and variation in metallicity and reddening. Subsequent modeling of broadband data has largely followed this approach (e.g., Brinchmann & Ellis 2000, Papovich, Dickinson & Ferguson 2001, Shapley et al. 2001, Salim et al. 2007). Generically, when fitting broadband SEDs one finds that stellar masses for `normal' galaxies (i.e., not including pathological SFHs) can be recovered at the ≈ 0.3 dex level (1σ uncertainty). This uncertainty does not include potential systematics in the underlying SPS models; see Section 3.2. Stellar masses appear to be the most robust parameter estimated from SED fitting (e.g., Papovich, Dickinson & Ferguson 2001, Shapley et al. 2001, Wuyts et al. 2009, Muzzin et al. 2009, Lee et al. 2009b). The reason for this appears to not be fully understood, although it probably is at least partly due to the fact that the dust reddening vector is approximately parallel to SFH and metallicity variations in color-M / L diagrams (Bell & de Jong 2001). The choice of the SFH, in particular whether it is rising, declining, or bursty, can significantly change the best-fit stellar mass, by perhaps as much as 0.6 dex in extreme cases (Pforr, Maraston & Tonini 2012). A general rule of thumb is that M / L ratios estimated via simple SFHs (or single-age models) will be lower limits to the true M / L ratios (Papovich, Dickinson & Ferguson 2001, Shapley et al. 2001, Trager, Faber & Dressler 2008, Graves & Faber 2010, Pforr, Maraston & Tonini 2012). 
This is a consequence of the fact that young stars outshine older ones, making it relatively easy to `hide' old stellar populations in galaxies with a large number of young stars. This also explains why color-M / L relations are so uncertain for very blue colors. This general rule is not universal: the modeling of certain galaxy types may lead to systematic biases in the opposite direction (Gallazzi & Bell 2009). Partly for these reasons, stellar masses estimated for quiescent systems should be more reliable than for star-forming ones. Relatedly, Zibetti, Charlot & Rix (2009) demonstrated that stellar masses estimated for individual nearby galaxies based on a pixel-by-pixel analysis of colors are generally larger than those estimated from integrated colors. The differences are typically ~ 0.05-0.15 dex, depending on galaxy type. This should not be surprising, especially for systems with star-forming disks and quiescent bulges, since the M / L ratio estimated from the integrated colors will tend to be biased low by the more luminous young stars in the star-forming component. Wuyts et al. (2012) performed a similar analysis on galaxies at 0.5 < z < 2.5 with WFC3 Hubble Space Telescope (HST) photometry and found no systematic difference in stellar masses derived from resolved vs. integrated light, although the fact that their galaxies were at high redshift means that they were probing spatial scales an order of magnitude larger than in Zibetti et al.
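
The grid-based fitting described above can be sketched as follows, under simplifying assumptions. Each template gives fluxes per unit stellar mass in a few bands, the mass is the analytic least-squares normalization of the template, and the best fit is the template with the lowest χ². The template fluxes, band count, labels, and errors are hypothetical; a real analysis would draw templates from an SPS model and would also vary metallicity, dust, and the SFH start time.

```python
def best_scale(obs, err, model):
    """Analytic least-squares normalization (here, the stellar mass)."""
    num = sum(o * m / e ** 2 for o, m, e in zip(obs, model, err))
    den = sum(m * m / e ** 2 for m, e in zip(model, err))
    return num / den


def chi_square(obs, err, model, scale):
    """Chi-square of the scaled template against the observed fluxes."""
    return sum(((o - scale * m) / e) ** 2 for o, m, e in zip(obs, model, err))


def fit_grid(obs, err, grid):
    """Return (chi2, mass, label) for the best-fitting template in the grid."""
    best = None
    for label, model in grid.items():
        mass = best_scale(obs, err, model)
        chi2 = chi_square(obs, err, model, mass)
        if best is None or chi2 < best[0]:
            best = (chi2, mass, label)
    return best


# Hypothetical templates: flux per unit mass in three bands (arbitrary units).
GRID = {
    "tau = 0.1 Gyr": [1.0, 2.0, 3.0],
    "tau = 1 Gyr": [2.0, 2.5, 3.0],
    "tau = 10 Gyr": [4.0, 3.5, 3.0],
}

# Noise-free mock galaxy built from one template with a known mass.
TRUE_MASS = 5.0
observed = [TRUE_MASS * f for f in GRID["tau = 1 Gyr"]]
errors = [0.1, 0.1, 0.1]

chi2, mass, label = fit_grid(observed, errors, GRID)
print(f"best template: {label}, mass = {mass:.2f}, chi2 = {chi2:.2e}")
```

Because the mock data are noise-free and drawn from the grid, the input mass is recovered exactly; the ≈ 0.3 dex uncertainty discussed above arises once noise and a mismatch between the true and assumed SFH are present.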

There has been some confusion in the literature regarding the importance of restframe NIR photometry for estimating stellar masses. First, as indicated in Figure 5, for smoothly varying SFHs, photometry at and redward of the V-band is sensitive to the same light-weighted age, and so redder bands do not provide stronger constraints on the mean stellar age. Taylor et al. (2011) analyzed mock galaxies and concluded that the addition of NIR data did not yield more accurate masses. Taylor et al. also found that different SPS models produced good agreement in derived properties when NIR data were excluded from their fits, but poor agreement when NIR data were included. These authors also found much larger residuals in their SED fits when NIR data were included, suggesting that the models are still poorly calibrated in this regime. As discussed in Section 5.2, the NIR is at present probably most useful for constraining metallicities (within the context of a particular SPS model), and so NIR data may be useful in cases where there is a degeneracy between M / L and Z. In general, however, stellar mass estimates do not appear to be strongly improved with the addition of NIR data, at least with currently available models. Exceptions to this rule may be made for galaxies with very high dust opacities.

As first emphasized by Bell & de Jong (2001) and Bell et al. (2003), some of the largest uncertainties in derived M / L ratios stem from uncertainties in the assumed SFHs, in particular the presence of bursty SF episodes. The consideration of the Balmer lines with other age and metallicity-sensitive features, such as the 4000 Å break (D_n4000), can constrain the burstiness of the SFH. Optical spectra therefore offer the possibility of providing stronger constraints on the M / L ratio. Kauffmann et al. (2003) modeled the Hδ and D_n4000 spectral features measured from SDSS spectra in order to constrain SFHs and M / L ratios. They obtained 95% confidence limits on stellar masses of ~ 0.2 dex and ~ 0.3 dex for quiescent and star-forming galaxies, respectively. Again, these are statistical uncertainties because the underlying SPS model and other aspects such as the adopted parameterization of the SFH were held fixed. Chen et al. (2012) employed principal component analysis (PCA) to model the optical spectra of massive galaxies in SDSS. These authors found systematic uncertainties in the recovered stellar masses of order ~ 0.1 dex depending on the assumed metallicity, dust model, and SFH. They also demonstrated that their PCA technique was capable of measuring parameters at much lower S/N than direct fitting to selected spectral indices. They determined statistical uncertainties on their mass measurements to be ~ 0.2 dex, which they showed to be comparable to the formal errors estimated from modeling broadband photometry. This result is in agreement with Gallazzi & Bell (2009), who argued that for galaxies with simple SFHs and lacking dust, the uncertainties on the derived masses are not much larger when using color-based estimators compared to spectroscopic-based estimators. With regard to estimating M / L, the real value of spectra appears to be restricted to galaxies with unusual SFHs.
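
For concreteness, a break index of the D_n4000 type can be measured as the ratio of the mean flux density per unit frequency in a band redward of 4000 Å to that in a band blueward of it. The sketch below assumes the commonly used narrow-band definition (3850-3950 Å and 4000-4100 Å); the text above does not state the band limits, so they are an assumption here, as are the uniform wavelength sampling and the mock spectra.

```python
BLUE_BAND = (3850.0, 3950.0)  # assumed narrow-band limits, in Angstroms
RED_BAND = (4000.0, 4100.0)


def mean_fnu(wave, flam, lo, hi):
    """Mean F_nu (up to a constant factor) over lo <= wavelength < hi.

    Uses F_nu proportional to wavelength**2 * F_lambda and assumes the
    spectrum is sampled uniformly in wavelength.
    """
    values = [f * w * w for w, f in zip(wave, flam) if lo <= w < hi]
    if not values:
        raise ValueError("no spectral samples fall inside the band")
    return sum(values) / len(values)


def dn4000(wave, flam):
    """Ratio of mean F_nu in the red band to mean F_nu in the blue band."""
    blue = mean_fnu(wave, flam, *BLUE_BAND)
    red = mean_fnu(wave, flam, *RED_BAND)
    return red / blue


# Mock spectra on a 1 Angstrom grid, specified through F_nu for clarity.
wave = [float(w) for w in range(3800, 4151)]
flat_fnu = [1.0 / (w * w) for w in wave]                      # F_nu constant
step_fnu = [(1.0 if w < 3975.0 else 1.5) / (w * w) for w in wave]

print(f"flat spectrum: {dn4000(wave, flat_fnu):.2f}")
print(f"step spectrum: {dn4000(wave, step_fnu):.2f}")
```

A spectrum that is flat in F_nu returns a ratio of 1, and a 50% step across the break returns 1.5, which is a quick sanity check on the band bookkeeping.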

The comparison between spectroscopically-based and photometrically-based stellar masses is informative because the approaches suffer from different, though by no means orthogonal, systematics. The most obvious difference is with regard to dust: spectroscopic masses are much less sensitive to dust attenuation than photometric masses. Drory et al. (2004) compared their own photometrically-derived masses to the spectroscopic masses from Kauffmann et al. (2003). They found an rms scatter of ~ 0.2 dex between the two estimates, with a modest systematic trend that correlated with Hα EW. Blanton & Roweis (2007) estimated photometric stellar masses via a technique that is similar to PCA except that the templates are constrained to be non-negative and are based on an SPS model. They compared their derived stellar masses to those of Kauffmann et al. and found good agreement, with systematic trends between the two restricted to ≲ 0.2 dex.

A comparison between stellar masses for the same galaxies estimated with different SPS models, priors, and fitting techniques is shown in Figure 6 (from Moustakas et al. 2012). The galaxies included in this figure are predominantly `normal' star-forming and quiescent z ~ 0 galaxies. For these galaxies the mean absolute differences between various mass estimators are less than 0.2 dex, in agreement with other work on the systematic uncertainties in stellar masses of normal galaxies.

Figure 6. Comparison of different fitting codes, SPS models, and priors on derived stellar masses (from Moustakas et al. 2012). The fiducial masses are based on fitting SDSS and GALEX photometry of z ~ 0 galaxies using the iSEDfit code (Moustakas et al. 2012), with SSPs from FSPS (Conroy, Gunn & White 2009), including dust attenuation, a range in metallicities, and SFHs with both smooth and bursty components. The left panel compares stellar mass catalogs produced by different groups/codes. K-correct and MPA/JHU-DR7 are based on SDSS photometry; MPA/JHU-DR4 is based on SDSS spectral indices, and Salim+07 is based on SDSS and GALEX photometry. The middle panel shows the effect of different SPS models (i.e., different SSPs), and the right panel shows the effect of varying the priors on the model library. The mean systematic differences between mass estimators are less than ± 0.2 dex. Figure courtesy of J. Moustakas.

3.2. Uncertainties in M / L due to Stellar Evolution Uncertainties

Maraston et al. (2006) were the first to draw attention to the sensitivity of derived stellar masses to uncertain stellar evolutionary phases, in particular the TP-AGB. The Maraston (2005) SPS model predicts much more luminosity arising from TP-AGB stars than in previous models (e.g., Fioc & Rocca-Volmerange 1997, Bruzual & Charlot 2003). At ages where the TP-AGB phase is most prominent (~ 3 × 10⁸ - 2 × 10⁹ yr), Maraston's model predicts roughly a factor of two more flux at >1 μm compared to earlier work. Maraston et al. (2006) found that Maraston's SPS model implied a factor of two smaller stellar mass, on average, compared to masses derived with the Bruzual & Charlot (2003) SPS model for galaxies at z ~ 2. When the fits excluded the possibility of dust reddening, Maraston's model provided a better fit to the optical-NIR SEDs than previous models, whereas when dust was allowed, the quality of the fits became indistinguishable, although the offsets in best-fit masses remained.

Subsequent work has largely confirmed the sensitivity of estimated stellar masses to the adopted SPS model (Wuyts et al. 2007, Kannappan & Gawiser 2007, Cimatti et al. 2008, Muzzin et al. 2009, Longhetti & Saracco 2009, Conroy, Gunn & White 2009). Essentially all work on this topic has focused on comparing Maraston's model to Bruzual & Charlot's, where the difference in TP-AGB treatment is probably the most significant, though not the only difference (other differences include different RGB temperatures and different treatments for core convective overshooting). Many authors have found a maximum factor of ~ 2-3 difference in derived M / L ratios between the two models, especially when NIR data were included. Kannappan & Gawiser (2007) showed that the differences between Maraston's and Bruzual & Charlot's models are relatively modest (≲1.3) when NIR data are excluded. Conroy, Gunn & White (2009) constructed a new SPS model in which the luminosity contribution from the TP-AGB phase could be arbitrarily varied. These authors isolated the importance of the TP-AGB phase and confirmed previous work indicating that the adopted weight given to this phase in the models can have a large modulating effect on the stellar mass.
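
Because this section quotes some uncertainties as multiplicative factors and others in dex, it can help to convert between the two. The helper below is simple arithmetic (1 dex is a factor of 10); the function names are chosen here for illustration.

```python
import math


def factor_to_dex(factor):
    """Convert a multiplicative factor into dex (base-10 logarithm)."""
    return math.log10(factor)


def dex_to_factor(dex):
    """Convert an offset in dex into a multiplicative factor."""
    return 10.0 ** dex


# Factors quoted in the text for model-to-model differences in M/L.
for factor in (1.3, 2.0, 3.0):
    print(f"factor {factor:.1f} = {factor_to_dex(factor):.2f} dex")

# Offsets quoted in dex elsewhere in this section.
for dex in (0.1, 0.2, 0.3, 0.6):
    print(f"{dex:.1f} dex = factor {dex_to_factor(dex):.2f}")
```

On this scale the factor of ~ 2-3 found when NIR data are included corresponds to roughly 0.3-0.5 dex, while the ≲ 1.3 found without NIR data corresponds to about 0.1 dex.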

The contribution of TP-AGB stars to the integrated light peaks at ~ 3 × 10⁸ - 2 × 10⁹ yr, depending on metallicity, and so the importance of this phase to modeling SEDs will depend on the SFH of the galaxy. It was for this reason that Maraston et al. (2006) focused their efforts on quiescent galaxies at z ~ 2; at this epoch even a quiescent galaxy will have a typical stellar age not older than several Gyr. In addition to high-redshift quiescent galaxies, Lançon et al. (1999) suggested that post-starburst galaxies should harbor large numbers of TP-AGB stars because their optical spectra show strong Balmer lines, indicative of large numbers of 10⁸ - 10⁹ yr old stars. This fact was exploited by Conroy & Gunn (2010), Kriek et al. (2010), and Zibetti et al. (2012) to constrain the TP-AGB contribution to the integrated light. These authors all find evidence for a low contribution from TP-AGB stars, and in particular they argue that Maraston's models predict significantly too much flux in the NIR for these objects.

The situation with TP-AGB stars is extremely complex owing to the fact that this phase is so sensitive to age and metallicity. Moreover, using galaxy SEDs to constrain the importance of this phase is difficult because many parameters must be simultaneously constrained (metallicity, SFH, dust, etc.). A case in point is the analysis of high-redshift galaxies, where Maraston et al. (2006) found large differences in the quality of the fit of Bruzual & Charlot models depending on whether or not dust was included in the fits (as an aside, if these galaxies do have copious numbers of TP-AGB stars then one may expect them to also contain dust, since these stars are believed to be efficient dust factories). It is also worth stressing that the conclusion to the TP-AGB controversy may not be an `either-or' situation in the sense that some models may perform better for some ranges in age and metallicity while other models may perform better in different regions of parameter space. Ultimately of course we desire models that perform equally well over the full range of parameter space, and this requires a continual evolution and improvement of SPS models.

Of course, there are other aspects of stellar evolution that are poorly constrained, including convective overshooting, blue stragglers and the HB, and even the temperature of the RGB. The propagation of these uncertainties into derived properties such as M / L ratios is only just beginning (e.g., Conroy, Gunn & White 2009). Melbourne et al. (2012) have, for example, demonstrated that the latest Padova isochrones fail to capture the observed flux originating from massive core He burning stars as observed in nearby galaxies. These stars are very luminous and can dominate the NIR flux of young stellar populations (≲ 300 Myr).

Finally, it is worth stressing that the effect of TP-AGB stars is limited to galaxies of a particular type, in particular those that are dominated by stars with ages in the range ~ 3 × 10⁸ - 2 × 10⁹ yr. Examples include post-starburst galaxies, which are rare at most epochs, or quiescent galaxies at z ≳ 2. For typical galaxies at z ~ 0, the treatment of this phase seems to be of little relevance for estimating stellar masses, at least on average, as is evident in Figure 6.

3.3. Stellar Masses at High Redshift

As first emphasized by Papovich, Dickinson & Ferguson (2001), galaxies with high SFRs can in principle contain a large population of `hidden' older stars, since these stars have high M / L ratios. Papovich et al. found that the data allowed for significantly larger stellar masses when two-component SFH models (young+old) were compared to their fiducial single component models (a median difference of a factor of ≈3 with extreme cases differing by an order of magnitude). This issue becomes more severe at higher redshifts because the typical SFRs are higher and because of the increasing possibility of rising SFHs (to be discussed in detail in Section 4). This makes the analysis of high redshift galaxy SEDs much more complicated.
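
The arithmetic behind `hidden' mass can be made explicit with a two-component toy model. The sketch below assumes a young population supplying most of the light and an old population with a much higher M / L; the light fraction and both M / L values are invented for illustration and are not taken from any SPS model. The single-component estimate is idealized as assigning the young M / L to all of the light.

```python
def two_component(light_frac_young, ml_young, ml_old):
    """Return (total M/L, fraction of the mass in the old component).

    Masses are expressed per unit total luminosity in the band considered.
    """
    mass_young = light_frac_young * ml_young
    mass_old = (1.0 - light_frac_young) * ml_old
    total = mass_young + mass_old
    return total, mass_old / total


# Illustrative placeholders, not SPS predictions.
LIGHT_FRAC_YOUNG = 0.9  # young stars supply 90% of the light
ML_YOUNG = 0.1
ML_OLD = 2.0

ml_true, old_mass_frac = two_component(LIGHT_FRAC_YOUNG, ML_YOUNG, ML_OLD)

# Idealized single-component fit: all light is assigned the young M/L.
underestimate = ml_true / ML_YOUNG

print(f"true M/L = {ml_true:.2f}")
print(f"old stars hold {100 * old_mass_frac:.0f}% of the mass")
print(f"single-component mass is low by a factor of {underestimate:.1f}")
```

With these placeholder numbers the old stars contribute only 10% of the light but about two-thirds of the mass, so the single-component mass is low by a factor of about three.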

However, at the highest redshifts (z ≳ 6), the analysis of SEDs may actually become simpler. At z = 8 the age of the universe is only ≈ 640 Myr, and if one presumes that galaxy formation commences after z ≈ 20, then even the oldest stars at z = 8 will be no more than ≈ 460 Myr old. The oldest possible main sequence turnoff stars will thus be A type stars. Put another way, the M / L ratio in the blue (U through V bands) at 13 Gyr is 4.5-15 times higher for an instantaneous burst of SF compared to a constant SFH, whereas the M / L ratio for these two SFHs differs by the more modest 2.5-5 at 500 Myr. So there is some expectation that `hidden mass' will be less hidden when modeling SEDs at the highest redshifts. This expectation appears to be borne out by detailed modeling. Finkelstein et al. (2010) analyzed the SEDs of z ~ 7-8 galaxies and concluded that even allowing for the possibility of extreme amounts of old stars (90% by mass), the best-fit stellar masses increased by no more than a factor of two compared to a fiducial single-component SFH (see also Curtis-Lake et al. 2012).
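
The age limits quoted above follow from the cosmic age-redshift relation. The sketch below uses the closed-form age of a flat universe containing only matter and a cosmological constant, with H0 = 70 km/s/Mpc and Ω_m = 0.3. These parameter values are assumptions made here, not necessarily those adopted in the text, and radiation is neglected.

```python
import math

# Assumed cosmology (not taken from the text).
H0_KM_S_MPC = 70.0
OMEGA_M = 0.3
OMEGA_L = 1.0 - OMEGA_M  # flatness assumed

KM_PER_MPC = 3.0857e19
SEC_PER_GYR = 3.15576e16


def age_gyr(z):
    """Age of a flat matter + Lambda universe at redshift z, in Gyr."""
    hubble_time = KM_PER_MPC / H0_KM_S_MPC / SEC_PER_GYR
    x = math.sqrt(OMEGA_L / OMEGA_M) * (1.0 + z) ** -1.5
    return 2.0 / (3.0 * math.sqrt(OMEGA_L)) * hubble_time * math.asinh(x)


age_z8_myr = 1000.0 * age_gyr(8.0)
age_z20_myr = 1000.0 * age_gyr(20.0)

# Oldest possible stars at z = 8 if star formation begins at z = 20.
max_stellar_age_myr = age_z8_myr - age_z20_myr

print(f"age at z = 8:  {age_z8_myr:.0f} Myr")
print(f"age at z = 20: {age_z20_myr:.0f} Myr")
print(f"maximum stellar age at z = 8: {max_stellar_age_myr:.0f} Myr")
```

With these assumed parameters the age at z = 8 comes out near 630 Myr and the maximum stellar age near 450 Myr, close to the ≈ 640 Myr and ≈ 460 Myr quoted above; the small differences reflect the choice of cosmological parameters.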

At even higher redshifts (z ≳ 10), the analysis of galaxy SEDs may become even simpler, at least with regards to estimating stellar masses. At sufficiently high redshifts the age of the oldest possible stars will eventually become comparable to the SF timescale probed by the restframe UV (~ 10⁸ yr at ~ 2500 Å ; see Figure 5). The measured SFR will thus be averaged over roughly the entire age of the galaxy, implying that a reasonable estimate of the stellar mass can be obtained by multiplying the UV-derived SFR with the age of the universe. This may in fact explain the unexpectedly strong correlation between stellar mass and UV luminosity at z > 4 reported by González et al. (2011). Of course, at the highest redshifts other difficulties arise, including accounting for the effects of very high EW emission lines and the reliability of the underlying models at very low metallicities.

3.4. Summary

M / L ratios for most galaxies with normal SEDs are probably accurate at the ~ 0.3 dex level, for a fixed IMF, with the majority of the uncertainty dominated by systematics (depending on the type of data used). Galaxies with light-weighted ages in the range of ~ 0.1-1 Gyr will have more uncertain M / L ratios, with errors as high as factors of several, owing to the uncertain effect of TP-AGB stars and perhaps other stars in advanced evolutionary phases. Galaxies that have very young light-weighted ages (e.g., high-redshift and starburst galaxies) will also have very uncertain M / L ratios because of the difficulty in constraining their past SFHs. Uncertainties due to dust seem to be subdominant to other uncertainties, at least for standard reddening laws and modest amounts of total attenuation. Essentially all stellar masses are subject to an overall normalization offset due both to the IMF and the uncertain contribution from stellar remnants. Owing to the lingering model uncertainties in the NIR (e.g., from TP-AGB and cool core He burning stars), it may be advisable to derive physical parameters by modeling the restframe UV-optical (see e.g., the discussion in Taylor et al. 2011).