Entropy of The Gaussian Distribution: Appendix A
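The goal is to maximize the entropy
\[
H(f) = -\int_{-\infty}^{\infty} f(x) \log f(x)\, dx \tag{A.1}
\]
with respect to $f(x)$, under the constraints that a) $f(x)$ is a pdf, that is, the probabilities of all $x$ integrate to 1:
\[
\int_{-\infty}^{\infty} f(x)\, dx = 1, \tag{A.2}
\]
b) it has mean $\mu$:
\[
\int_{-\infty}^{\infty} x f(x)\, dx = \mu, \tag{A.3}
\]
and c) it has variance $\sigma^2$:
\[
\int_{-\infty}^{\infty} x^2 f(x)\, dx - \mu^2 = \sigma^2. \tag{A.4}
\]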
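As a rough illustration of what this maximization is expected to yield, the sketch below compares the differential entropy (in nats) of three densities that share the same variance $\sigma^2$. The closed-form entropies of the Gaussian, uniform and Laplace distributions used here are standard textbook results and are not part of the original derivation; the function names are illustrative only.

```python
import math

def entropy_gaussian(sigma):
    # Differential entropy of N(mu, sigma^2): 0.5 * log(2 * pi * e * sigma^2)
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma**2)

def entropy_uniform(sigma):
    # A uniform density of width w has variance w^2 / 12 and entropy log(w)
    width = sigma * math.sqrt(12.0)
    return math.log(width)

def entropy_laplace(sigma):
    # A Laplace density of scale b has variance 2 * b^2 and entropy 1 + log(2 * b)
    scale = sigma / math.sqrt(2.0)
    return 1.0 + math.log(2.0 * scale)

sigma = 1.0  # arbitrary illustrative value
for name, h in [("gaussian", entropy_gaussian(sigma)),
                ("laplace", entropy_laplace(sigma)),
                ("uniform", entropy_uniform(sigma))]:
    print(f"{name:8s} {h:.4f}")
```

For any common value of `sigma`, the Gaussian entry should be the largest of the three, which is consistent with the result derived in this appendix.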
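The Lagrangian function under these constraints is:
\begin{align*}
\Lambda(f, \lambda_0, \lambda_1, \lambda_2) = {} & -\int_{-\infty}^{\infty} f(x) \log f(x)\, dx \\
& + \lambda_0 \left( \int_{-\infty}^{\infty} f(x)\, dx - 1 \right) \\
& + \lambda_1 \left( \int_{-\infty}^{\infty} x f(x)\, dx - \mu \right) \\
& + \lambda_2 \left( \int_{-\infty}^{\infty} x^2 f(x)\, dx - \mu^2 - \sigma^2 \right)
\end{align*}
The critical values of $\Lambda$ are zero-gradient points, that is, the points where the partial derivatives of $\Lambda$ are zero:
\begin{align*}
\frac{\partial \Lambda(f, \lambda_0, \lambda_1, \lambda_2)}{\partial f(x)} &= -\log f(x) - 1 + \lambda_0 + \lambda_1 x + \lambda_2 x^2 = 0 \tag{A.9} \\
\frac{\partial \Lambda(f, \lambda_0, \lambda_1, \lambda_2)}{\partial \lambda_0} &= \int_{-\infty}^{\infty} f(x)\, dx - 1 = 0 \\
\frac{\partial \Lambda(f, \lambda_0, \lambda_1, \lambda_2)}{\partial \lambda_1} &= \int_{-\infty}^{\infty} x f(x)\, dx - \mu = 0 \\
\frac{\partial \Lambda(f, \lambda_0, \lambda_1, \lambda_2)}{\partial \lambda_2} &= \int_{-\infty}^{\infty} x^2 f(x)\, dx - \mu^2 - \sigma^2 = 0
\end{align*}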
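These equations form a system from which $\lambda_0$, $\lambda_1$, $\lambda_2$ and, more importantly, $f(x)$ can be calculated. From Eq. A.9:
\[
f(x) = e^{\lambda_2 x^2 + \lambda_1 x + \lambda_0 - 1} \tag{A.13}
\]
Substituting this expression into the three constraints gives:
\begin{align*}
\int_{-\infty}^{\infty} e^{\lambda_2 x^2 + \lambda_1 x + \lambda_0 - 1}\, dx &= 1 \\
\int_{-\infty}^{\infty} x\, e^{\lambda_2 x^2 + \lambda_1 x + \lambda_0 - 1}\, dx &= \mu \\
\int_{-\infty}^{\infty} x^2 e^{\lambda_2 x^2 + \lambda_1 x + \lambda_0 - 1}\, dx &= \mu^2 + \sigma^2
\end{align*}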
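To solve the latter system of three equations with three unknowns ($\lambda_0$, $\lambda_1$, $\lambda_2$), the three improper integrals have to be computed. In 1778 Laplace proved that
\[
I = \int_{-\infty}^{\infty} e^{-x^2}\, dx = \sqrt{\pi}. \tag{A.17}
\]
Squaring the integral gives:
\begin{align*}
I^2 = \left( \int_{-\infty}^{\infty} e^{-x^2}\, dx \right)^2 &= \int_{-\infty}^{\infty} e^{-x^2}\, dx \int_{-\infty}^{\infty} e^{-y^2}\, dy \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-(x^2 + y^2)}\, dx\, dy \tag{A.18} \\
&= \int_{0}^{2\pi} \int_{0}^{\infty} e^{-r^2} r\, dr\, d\theta \\
&= 2\pi \left[ -\frac{1}{2} e^{-r^2} \right]_{0}^{\infty} = \pi \tag{A.21}
\end{align*}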
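The value of this integral can also be checked numerically. The following is a minimal sketch, assuming a composite Simpson rule and a truncation of the real line to $[-10, 10]$ (outside that interval the integrand is below $e^{-100}$); the helper name `simpson` is illustrative and not taken from the original text.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson's rule on [a, b] using an even number n of subintervals."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        # Interior nodes alternate between weight 4 (odd index) and 2 (even index)
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

# Truncated version of the integral I in Eq. A.17
I = simpson(lambda x: math.exp(-x * x), -10.0, 10.0)
print(I, math.sqrt(math.pi))
```

The two printed numbers should agree to many decimal places, since the integrand is smooth and decays very quickly.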
In Eq. A.18, the differential $dx\,dy$ represents an element of area in Cartesian coordinates in the $xy$-plane. The same plane can be described in polar coordinates $r, \theta$, which are given by $x = r\cos\theta$, $y = r\sin\theta$ and $r^2 = x^2 + y^2$. The element of area turns into $r\,dr\,d\theta$. The $2\pi$ factor in Eq. A.21 comes from the integration over $\theta$. The integral over $r$ can be calculated by substitution, $u = r^2$, $du = 2r\,dr$.

The more general integral of $x^n e^{-ax^2+bx+c}$ has the following closed form ([Wei98]):
\[
\int_{-\infty}^{\infty} x^n e^{-ax^2+bx+c}\, dx = \sqrt{\frac{\pi}{a}}\; e^{b^2/(4a)+c} \sum_{k=0}^{\lfloor n/2 \rfloor} \frac{n!}{k!\,(n-2k)!}\, \frac{(2b)^{n-2k}}{(4a)^{n-k}} \tag{A.22}
\]
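for non-negative integer $n$, the variables $a$, $b$ belonging to the punctured plane (the complex plane with the origin 0 removed), and the real part of $a$ being positive. Given the previous result with $a = -\lambda_2$, $b = \lambda_1$ and $c = \lambda_0 - 1$, the system of equations becomes:
\begin{align*}
\sqrt{\frac{\pi}{-\lambda_2}}\; e^{-\lambda_1^2/(4\lambda_2)+\lambda_0-1} &= 1 \\
\sqrt{\frac{\pi}{-\lambda_2}}\; e^{-\lambda_1^2/(4\lambda_2)+\lambda_0-1} \left( -\frac{\lambda_1}{2\lambda_2} \right) &= \mu \\
\sqrt{\frac{\pi}{-\lambda_2}}\; e^{-\lambda_1^2/(4\lambda_2)+\lambda_0-1} \left( \frac{\lambda_1^2}{4\lambda_2^2} - \frac{1}{2\lambda_2} \right) &= \mu^2 + \sigma^2
\end{align*}
By applying the natural logarithm to the equations (shown here for the first one),
\[
\log e^{-\lambda_1^2/(4\lambda_2)+\lambda_0-1} + \log \sqrt{\frac{\pi}{-\lambda_2}} = \log 1, \tag{A.26}
\]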
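the unknown $\lambda_0$ can be isolated from each equation of the system:
\begin{align*}
\lambda_0 &= \frac{\lambda_1^2}{4\lambda_2} + 1 - \log \sqrt{\frac{\pi}{-\lambda_2}} \\
\lambda_0 &= \frac{\lambda_1^2}{4\lambda_2} + 1 - \log \sqrt{\frac{\pi}{-\lambda_2}} + \log \left( -\frac{2\lambda_2 \mu}{\lambda_1} \right) \\
\lambda_0 &= \frac{\lambda_1^2}{4\lambda_2} + 1 - \log \sqrt{\frac{\pi}{-\lambda_2}} + \log \left( \frac{\mu^2 + \sigma^2}{\dfrac{\lambda_1^2}{4\lambda_2^2} - \dfrac{1}{2\lambda_2}} \right)
\end{align*}
From which it is deduced that $\lambda_1$ and $\lambda_2$ satisfy the following relation:
\[
0 = \log \left( -\frac{2\lambda_2 \mu}{\lambda_1} \right) = \log \left( \frac{\mu^2 + \sigma^2}{\dfrac{\lambda_1^2}{4\lambda_2^2} - \dfrac{1}{2\lambda_2}} \right), \tag{A.31}
\]
then,
\begin{align*}
\lambda_1 &= -2\lambda_2 \mu \tag{A.32} \\
\mu^2 + \sigma^2 &= \frac{\lambda_1^2}{4\lambda_2^2} - \frac{1}{2\lambda_2} \tag{A.33}
\end{align*}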
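These two equations provide the values of $\lambda_1$ and $\lambda_2$, and by substituting them in any equation of the system, $\lambda_0$ is also obtained. The result is
\begin{align*}
\lambda_0 &= -\frac{\mu^2}{2\sigma^2} - \log \sqrt{2\pi\sigma^2} + 1 \\
\lambda_1 &= \frac{\mu}{\sigma^2} \\
\lambda_2 &= -\frac{1}{2\sigma^2}
\end{align*}
These solutions can be substituted in the $f(x)$ expression (Eq. A.13):
\begin{align*}
p(x) &= e^{-\frac{1}{2\sigma^2} x^2 + \frac{\mu}{\sigma^2} x - \frac{\mu^2}{2\sigma^2} - \log \sqrt{2\pi\sigma^2} + 1 - 1} \\
&= \frac{1}{\sqrt{2\pi\sigma^2}}\; e^{-\frac{1}{2\sigma^2} \left( \mu^2 - 2\mu x + x^2 \right)} \\
&= \frac{1}{\sigma\sqrt{2\pi}}\; e^{-\frac{(x-\mu)^2}{2\sigma^2}},
\end{align*}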
which, indeed, is the pdf of the Gaussian distribution. Finally, the negative sign of the second derivative of $\Lambda$ ensures that the obtained solution is a maximum, and not a minimum or an inflection point.
\[
\frac{\partial^2 \Lambda(f, \lambda_0, \lambda_1, \lambda_2)}{\partial f(x)^2} = -\frac{1}{f(x)} \tag{A.40}
\]
B. Bonev
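The solution for the multipliers can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original derivation: it reads the closed form of Eq. A.22 with the sign convention $e^{-ax^2+bx+c}$, sets $a = -\lambda_2$, $b = \lambda_1$, $c = \lambda_0 - 1$, and confirms that the three constraint integrals evaluate to $1$, $\mu$ and $\mu^2 + \sigma^2$. All function names and the values of `mu` and `sigma` are hypothetical.

```python
import math

def multipliers(mu, sigma):
    """Lagrange multipliers (lambda_0, lambda_1, lambda_2) from the solution above."""
    lam2 = -1.0 / (2.0 * sigma**2)
    lam1 = mu / sigma**2
    lam0 = 1.0 - mu**2 / (2.0 * sigma**2) - 0.5 * math.log(2.0 * math.pi * sigma**2)
    return lam0, lam1, lam2

def gaussian_moment(n, a, b, c):
    """Closed form of the integral of x^n * exp(-a x^2 + b x + c) over the real line (real a > 0)."""
    prefactor = math.sqrt(math.pi / a) * math.exp(b * b / (4.0 * a) + c)
    series = sum(
        math.factorial(n) / (math.factorial(k) * math.factorial(n - 2 * k))
        * (2.0 * b) ** (n - 2 * k) / (4.0 * a) ** (n - k)
        for k in range(n // 2 + 1)
    )
    return prefactor * series

def constraint_values(mu, sigma):
    """Zeroth, first and second moments of f(x) = exp(l2 x^2 + l1 x + l0 - 1)."""
    lam0, lam1, lam2 = multipliers(mu, sigma)
    a, b, c = -lam2, lam1, lam0 - 1.0
    return tuple(gaussian_moment(n, a, b, c) for n in range(3))

def f(x, mu, sigma):
    """Maximum-entropy density of Eq. A.13 evaluated with the solved multipliers."""
    lam0, lam1, lam2 = multipliers(mu, sigma)
    return math.exp(lam2 * x * x + lam1 * x + lam0 - 1.0)

def gaussian_pdf(x, mu, sigma):
    """Gaussian pdf written in its usual form, for comparison with f."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma**2)) / (sigma * math.sqrt(2.0 * math.pi))

mu, sigma = 1.5, 0.7  # arbitrary illustrative values
mass, mean, second = constraint_values(mu, sigma)
print(mass, mean, second - mean**2)                       # total mass, mean, variance
print(f(2.0, mu, sigma), gaussian_pdf(2.0, mu, sigma))    # the two forms of the density
```

The first line printed should be close to `1`, `mu` and `sigma**2`, and the two densities on the second line should coincide up to rounding.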