Bayesian photometric redshifts with empirical training sets
Abstract
We combine in a single framework the two complementary benefits of χ² template fits and empirical training sets used, e.g., in neural nets: χ² is more reliable when its probability density functions (PDFs) are inspected for multiple peaks, while empirical training is more accurate when calibration and priors of query data and training set match. We present a χ² empirical method that derives PDFs from empirical models as a subclass of kernel regression methods, and apply it to the Sloan Digital Sky Survey Data Release 5 sample of >75 000 quasi-stellar objects, which is full of ambiguities. Objects with single-peak PDFs show 2.5, these figures are two times better. Outliers result purely from the discrete nature and limited size of the model, and rms errors are dominated by the intrinsic variety of object colours. PDFs classed as ambiguous provide accurate probabilities for alternative solutions and thus weights for using both solutions and avoiding needless outliers. For example, the PDFs predict 78.0 per cent of the stronger peaks to be correct, which is true for 77.9 per cent of them. Redshift incompleteness is common in faint spectroscopic surveys and turns into a massive undetectable outlier risk above other performance limitations, but we can quantify residual outlier risks stemming from size and completeness of the model. We propose a matched χ² error scale for noisy data and show that it produces correct error estimates and redshift distributions accurate within Poisson errors. Our method can easily be applied to future large galaxy surveys, which will benefit from the reliability in ambiguity detection and residual risk quantification.
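The idea of deriving a redshift PDF from an empirical model by kernel regression can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `chi2_empirical_pdf`, the Gaussian kernel exp(-χ²/2), the binning, and the synthetic colour-redshift relation are all assumptions made for illustration, and the matched χ² error scale proposed in the abstract is not reproduced here.

```python
import numpy as np

def chi2_empirical_pdf(query, query_err, train_colours, train_z, z_edges):
    """Redshift PDF of one query object from an empirical training set (sketch)."""
    # chi^2 distance between the query and every training object in colour space
    chi2 = np.sum(((train_colours - query) / query_err) ** 2, axis=1)
    # Assumed Gaussian kernel weights; subtracting the minimum avoids underflow
    weights = np.exp(-0.5 * (chi2 - chi2.min()))
    # Weighted histogram of the training redshifts gives a discrete PDF
    pdf, _ = np.histogram(train_z, bins=z_edges, weights=weights)
    return pdf / pdf.sum()

# Purely synthetic training set: two toy "colours" that depend on redshift
rng = np.random.default_rng(0)
train_z = rng.uniform(0.0, 4.0, size=5000)
train_colours = np.column_stack([np.sin(train_z), 0.5 * train_z])
train_colours += rng.normal(0.0, 0.05, size=train_colours.shape)

# Query object placed at the toy colours of redshift 1.0
z_edges = np.linspace(0.0, 4.0, 41)
query = np.array([np.sin(1.0), 0.5])
query_err = np.array([0.05, 0.05])

pdf = chi2_empirical_pdf(query, query_err, train_colours, train_z, z_edges)
z_mid = 0.5 * (z_edges[:-1] + z_edges[1:])
z_peak = z_mid[np.argmax(pdf)]
```

In such a sketch, a PDF with more than one well-separated peak would flag an ambiguous object, and the integrated probability under each peak would serve as the weight of the corresponding redshift solution.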
Journal: Monthly Notices of the Royal Astronomical Society