Estimating Non-Gaussianity in The Microwave Background: A. F. Heavens
A. F. Heavens
Institute for Astronomy, University of Edinburgh, Blackford Hill, Edinburgh EH9 3HJ
Accepted 1998 May 13. Received 1998 April 23; in original form 1998 January 7
ABSTRACT
© 1998 RAS
correlated noise, and that the estimates of the bispectrum coefficients come with error bars and covariances between the errors. This last point is particularly important when one realizes that a single bispectrum coefficient estimate is unlikely to rule out a model, because the cosmic variance is often larger than the expected signal, and so one is going to need many coefficients in practice. A final point is that, for Gaussian fluctuations, the estimator below cannot be improved, in the sense of having a smaller error bar.

2 METHOD

The optimization procedure in this paper is a generalization of the

where $d\Omega$, $\Delta\Omega_i$ represent elements of solid angle, and $\theta$ and $\phi$ are polar coordinates. The power spectrum is defined as

$$C_\ell = \langle |a_\ell^m|^2 \rangle, \tag{2}$$

where the angle brackets indicate an ensemble average. Expectation values of products of distinct spherical harmonic coefficients are zero by isotropy, independently of whether the temperature map is Gaussian or not. The bispectrum is defined as

$$B(\ell_1 \ell_2 \ell_3; m_1 m_2 m_3) \equiv \langle a_{\ell_1}^{m_1} a_{\ell_2}^{m_2} a_{\ell_3}^{m_3} \rangle. \tag{3}$$

It is zero, unless the indices comply with the following triangle closure constraints (e.g. Edmonds 1957; Luo 1994a):

$$m_1 + m_2 + m_3 = 0, \qquad \ell_1 + \ell_2 + \ell_3 = \mathrm{even}, \qquad |\ell_i - \ell_j| \le \ell_k \le \ell_i + \ell_j$$

for $i, j, k = 1, 2, 3$.

We seek an estimator of $B$ that is lossless, if possible, in the sense that it contains as much information as the original map $\{x_i\}$. Ideally it should be unbiased, and with calculable statistical properties. In the spirit of Tegmark's optimal quadratic estimator for the power spectrum, we seek an estimator for the bispectrum that is cubic. We consider quantities $y_\alpha$ of the following form:

$$y_\alpha = \sum_{\mathrm{pixels}\ ijk} E^\alpha_{ijk}\, x_i x_j x_k. \tag{4}$$

We will find that the $y_\alpha$ are related to the bispectrum estimates, but will not be the bispectrum estimates themselves. We introduce the shorthand notation $\alpha \equiv \{\ell_1, \ell_2, \ell_3, m_1, m_2, m_3\}$, and we also combine the list of data triplets into a data vector with elements labelled by $A$:

$$D_A \equiv x_i x_j x_k, \tag{5}$$

where $A$ represents some triplet $\{i, j, k\}$. The $E^\alpha_{ijk}$ are some coefficients to be determined. The mean of $y_\alpha$ involves the three-point function, which may be written in terms of the bispectra as follows:

$$\mu_A \equiv \langle x_i x_j x_k \rangle = \sum_\alpha B_\alpha R^\alpha_{ijk}, \tag{6}$$

where we have assumed that the noise has a zero three-point function. If it is known and non-zero, it may be added. The functions connecting the three-point functions in real and harmonic space are (Gangui et al. 1994):

$$R^\alpha_{ijk} = N(\gamma_{ij}, \gamma_{jk}, \gamma_{ki}) \sqrt{\frac{\pi}{2}}\, W_{\ell_1} W_{\ell_2} W_{\ell_3} \sum_{\substack{\ell_4 \ell_5 \ell_6 \\ m_4 m_5 m_6}} P_{\ell_4}(\cos\gamma_{ij})\, P_{\ell_5}(\cos\gamma_{jk})\, P_{\ell_6}(\cos\gamma_{ki})\, H^{m_4 m_6 m_1}_{\ell_4 \ell_6 \ell_1} H^{m_5 m_4 m_2}_{\ell_5 \ell_4 \ell_2} H^{m_5 m_6 m_3}_{\ell_5 \ell_6 \ell_3}, \tag{7}$$

where

$$H^{m_1 m_2 m_3}_{\ell_1 \ell_2 \ell_3} \equiv \int d\Omega\, Y^{m_1 *}_{\ell_1} Y^{m_2}_{\ell_2} Y^{m_3}_{\ell_3} \tag{8}$$

and can be related to Clebsch–Gordan coefficients. The effect of beam-smearing (here modelled by a Gaussian) is through the

2.1 Optimal estimator $y_\alpha$

We wish to minimize the variance of $y$ (cf. Tegmark 1997 for the power spectrum), which involves the six-point function. The means are

$$\langle y_\alpha \rangle = \sum_{\alpha'\, ijk} B_{\alpha'} R^{\alpha'}_{ijk} E^\alpha_{ijk}. \tag{11}$$

The covariance between the $y$s is $C_{\alpha\alpha'} \equiv \langle y_\alpha y_{\alpha'} \rangle - \langle y_\alpha \rangle \langle y_{\alpha'} \rangle$, which we obtain from the triplet data covariance matrix:

$$\langle x_i x_j x_k x_{i'} x_{j'} x_{k'} \rangle - \langle x_i x_j x_k \rangle \langle x_{i'} x_{j'} x_{k'} \rangle. \tag{12}$$

We now make an assumption concerning the departures from Gaussianity. Since these are expected to be small, we approximate the covariance matrix by the covariance matrix for a Gaussian field with the same power spectrum. This assumes that the bispectrum is small compared with the cosmic variance, and also assumes that the connected four-point function is small. Strictly, this method is optimal for testing the hypothesis that the field is Gaussian, but it should be very close to optimal for practical cases, since the expectation is that the bispectrum will be small. If the assumption is not justified, and the bispectrum is not small compared with the cosmic variance, detection will not be difficult in any event. If this turns out to be the case, it will be important to check that the estimator remains unbiased in the case of large intrinsic bispectrum.

In the Gaussian approximation, $\langle x_i x_j x_k \rangle = 0$, and we use Wick's theorem to write

$$\langle x_i x_j x_k x_{i'} x_{j'} x_{k'} \rangle = \xi_{ij}\, \xi_{k i'}\, \xi_{j' k'} + \mathrm{permutations\ (15\ terms)}, \tag{13}$$

where we have defined the two-point function of the temperature field:

$$\xi_{ij} \equiv \langle x_i x_j \rangle = \sum_\ell \frac{2\ell + 1}{4\pi}\, C_\ell\, P_\ell(\cos\gamma_{ij})\, W_\ell^2 + N_{ij}. \tag{14}$$

$P_\ell$ are Legendre polynomials and $N_{ij}$ is the noise covariance matrix. We can then compute the covariance matrix for $y_\alpha$:

$$V_{\alpha\alpha'} \equiv \langle y_\alpha y_{\alpha'} \rangle = \left( \xi_{ij}\, \xi_{k i'}\, \xi_{j' k'} + \mathrm{perm.} \right) E^\alpha_{ijk} E^{\alpha'}_{i'j'k'}, \tag{15}$$

and we have from now adopted the summation convention for repeated pixel indices, and also, unless stated otherwise, $\alpha$ indices.
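As a check on the counting in equation (13), the following minimal Python sketch enumerates the distinct ways of pairing six indices and sums the corresponding products of two-point functions. It assumes a zero-mean Gaussian field, as in the approximation above. The function names `pairings` and `gaussian_six_point` and the toy matrix `xi_toy` are illustrative assumptions introduced here, not part of the paper's implementation.

```python
def pairings(items):
    """Yield every way of splitting `items` into unordered pairs.

    Pairing is done by position, so repeated pixel labels are allowed.
    """
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for n, partner in enumerate(rest):
        remaining = rest[:n] + rest[n + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def gaussian_six_point(xi, pixels):
    """Six-point function of a zero-mean Gaussian field (Wick's theorem).

    `xi` is the two-point matrix as nested lists; `pixels` holds the six
    pixel indices (i, j, k, i', j', k').
    """
    total = 0.0
    for pairing in pairings(list(pixels)):
        term = 1.0
        for a, b in pairing:
            term *= xi[a][b]
        total += term
    return total


# Toy two-point matrix for two pixels (hypothetical numbers, unit variance,
# correlation 0.5); a real xi_ij would come from equation (14).
xi_toy = [[1.0, 0.5],
          [0.5, 1.0]]

n_terms = sum(1 for _ in pairings(list(range(6))))           # number of pairings
same_pixel = gaussian_six_point(xi_toy, (0, 0, 0, 0, 0, 0))  # <x^6> for unit variance
mixed = gaussian_six_point(xi_toy, (0, 0, 0, 1, 1, 1))       # <x_0^3 x_1^3>
```

For six indices the enumeration gives 15 pairings, matching the number of terms quoted in equation (13). A brute-force sum of this kind is only practical for very small pixel sets, since the covariance in equation (15) involves all pairs of pixel triplets.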
3 LOSSLESS ESTIMATOR OF THE BISPECTRUM

In order to estimate how well the $y_\alpha$ will perform in estimating the desired parameters $B_\alpha$, we compute the Fisher information matrix (Tegmark, Taylor & Heavens 1997)

$$F_{\alpha\alpha'} \equiv -\left\langle \frac{\partial^2 \ln p}{\partial B_\alpha\, \partial B_{\alpha'}} \right\rangle, \tag{22}$$

where $p$ is the posterior probability distribution for the parameters (equal to the likelihood, if uniform priors for the parameters are assigned). For a data vector with components with means $\mu$ and covariance matrix $C$, the Fisher matrix is

$$F_{\alpha\alpha'} = \frac{1}{2} \mathrm{Tr}\left[ C^{-1} \frac{\partial C}{\partial B_\alpha} C^{-1} \frac{\partial C}{\partial B_{\alpha'}} + 2\, C^{-1} \frac{\partial \mu}{\partial B_\alpha} \frac{\partial \mu^{T}}{\partial B_{\alpha'}} \right]. \tag{23}$$

The error on the parameters $B_\alpha$ is contained in this matrix: if all other parameters are known, the minimum error is the conditional one, $\sigma_{B_\alpha} = 1/\sqrt{F_{\alpha\alpha}}$. If all parameters are to be estimated from the data, then the appropriate error is the marginal error $\sqrt{F^{-1}_{\alpha\alpha}}$. This assumes that the probability surface is adequately approximated by a second-order Taylor expansion at the peak.

As expected for the 'near-Gaussian' approximation, the covariance matrix for either the triplets $x_i x_j x_k$ or the $y_\alpha$ does not depend on

$$= F^{-1}_{\alpha\alpha'} B_{\alpha''} F_{\alpha'\alpha''} = B_\alpha, \tag{30}$$

and the bispectrum estimates also have calculable covariance properties:

$$\begin{aligned}
C_{\alpha\alpha'} &\equiv \langle \hat{B}_\alpha \hat{B}_{\alpha'} \rangle - \langle \hat{B}_\alpha \rangle \langle \hat{B}_{\alpha'} \rangle \\
&= F^{-1}_{\alpha\alpha''} F^{-1}_{\alpha'\alpha'''} \langle y_{\alpha''} y_{\alpha'''} \rangle - F^{-1}_{\alpha\alpha''} F^{-1}_{\alpha'\alpha'''} \langle y_{\alpha''} \rangle \langle y_{\alpha'''} \rangle \\
&= F^{-1}_{\alpha\alpha''} F^{-1}_{\alpha'\alpha'''} F_{\alpha''\alpha'''} \\
&= F^{-1}_{\alpha\alpha'}.
\end{aligned} \tag{31}$$

This also proves that the estimators are optimal, by the Fisher–Cramér–Rao inequality.

4 APPLICATION TO COBE 4-YR DMR DATA

We illustrate the method by applying it to the COBE DMR 4-yr data, focusing on measuring low-order coefficients. For this experiment, the width of the approximately Gaussian beam is $\sigma = 3.2^\circ$ (Wright et al. 1992). The method is computationally expensive, and is in the process of being optimized, but, for the moment, the approach taken is to average the $\sim 4000$ unmasked pixels of the COBE data set into larger pixels of roughly 12 degrees square. This introduces an additional effective Gaussian smoothing