
Entropy of The Gaussian Distribution: Appendix A

Entropy-maximizing distributions play an important role in several entropy estimation methods. Among all probability distributions with a given finite mean and a given finite variance, the Gaussian distribution is the one with maximum entropy. This can be proved using Lagrange multipliers. A detailed proof follows.
Copyright
© Attribution Non-Commercial (BY-NC)

Appendix A

Entropy of the Gaussian distribution


Among the distributions which cover the entire real line $\mathbb{R}$ and have a finite mean $\mu$ and a finite variance $\sigma^2$, the maximum entropy distribution is the Gaussian. The analytical proof presented below uses optimization under constraints: the Lagrange multipliers method offers quite a straightforward way of finding the Gaussian probability distribution. The function to maximize is the Shannon entropy of a pdf $f(x)$:

$$H(f) = -\int_{-\infty}^{\infty} f(x) \log f(x)\,dx \tag{A.1}$$

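with respect to $f(x)$, under the constraints that

a) $f(x)$ is a pdf, that is, it integrates to 1 over the real line:

$$\int_{-\infty}^{\infty} f(x)\,dx = 1, \tag{A.2}$$

b) it has a finite mean $\mu$:

$$\int_{-\infty}^{\infty} x f(x)\,dx = \mu \tag{A.3}$$

c) and a finite variance $\sigma^2$:

$$\int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2 = \sigma^2 \tag{A.4}$$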
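The entropy (A.1) and the constraints (A.2) to (A.4) can be evaluated numerically for a few candidate densities. The sketch below is only an illustration, not part of the proof: the Laplace and logistic densities, the midpoint rule, the integration range and the values `MU = 0`, `SIGMA = 1` are assumptions chosen for the example. All three densities are scaled to the same mean and variance, and the Gaussian should come out with the largest entropy.

```python
import math

def midpoint(g, lo=-30.0, hi=30.0, n=20_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

def describe(pdf):
    """Return (mass, mean, variance, entropy in nats) of a pdf, cf. Eqs. A.1 to A.4."""
    mass = midpoint(pdf)
    mean = midpoint(lambda x: x * pdf(x))
    second_moment = midpoint(lambda x: x * x * pdf(x))

    def entropy_integrand(x):
        p = pdf(x)
        return -p * math.log(p) if p > 0.0 else 0.0  # treat 0 log 0 as 0

    return mass, mean, second_moment - mean ** 2, midpoint(entropy_integrand)

MU, SIGMA = 0.0, 1.0  # illustrative targets for the mean and standard deviation

def gaussian(x):
    return math.exp(-(x - MU) ** 2 / (2 * SIGMA ** 2)) / math.sqrt(2 * math.pi * SIGMA ** 2)

def laplace(x):
    b = SIGMA / math.sqrt(2)            # Laplace variance is 2 b^2
    return math.exp(-abs(x - MU) / b) / (2 * b)

def logistic(x):
    s = SIGMA * math.sqrt(3) / math.pi  # logistic variance is s^2 pi^2 / 3
    z = math.exp(-abs(x - MU) / s)      # symmetric form avoids overflow
    return z / (s * (1 + z) ** 2)

results = {name: describe(pdf) for name, pdf in
           [("gaussian", gaussian), ("laplace", laplace), ("logistic", logistic)]}
for name, (mass, mean, var, ent) in results.items():
    print(f"{name:9s} mass={mass:.4f} mean={mean:.4f} var={var:.4f} entropy={ent:.4f}")
```

Comparing three densities does not prove anything about all densities, of course; the comparison only makes the claim concrete before the derivation.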

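The Lagrangian function under these constraints is:

\begin{align}
\Lambda(f, \lambda_0, \lambda_1, \lambda_2) = {} & -\int_{-\infty}^{\infty} f(x) \log f(x)\,dx \tag{A.5} \\
& + \lambda_0 \left( \int_{-\infty}^{\infty} f(x)\,dx - 1 \right) \tag{A.6} \\
& + \lambda_1 \left( \int_{-\infty}^{\infty} x f(x)\,dx - \mu \right) \tag{A.7} \\
& + \lambda_2 \left( \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2 - \sigma^2 \right) \tag{A.8}
\end{align}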

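The critical values of $\Lambda$ are its zero-gradient points, that is, the points where all the partial derivatives of $\Lambda$ are zero:

\begin{align}
\frac{\partial \Lambda(f, \lambda_0, \lambda_1, \lambda_2)}{\partial f(x)} &= -\log f(x) - 1 + \lambda_0 + \lambda_1 x + \lambda_2 x^2 = 0 \tag{A.9} \\
\frac{\partial \Lambda(f, \lambda_0, \lambda_1, \lambda_2)}{\partial \lambda_0} &= \int_{-\infty}^{\infty} f(x)\,dx - 1 = 0 \tag{A.10} \\
\frac{\partial \Lambda(f, \lambda_0, \lambda_1, \lambda_2)}{\partial \lambda_1} &= \int_{-\infty}^{\infty} x f(x)\,dx - \mu = 0 \tag{A.11} \\
\frac{\partial \Lambda(f, \lambda_0, \lambda_1, \lambda_2)}{\partial \lambda_2} &= \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2 - \sigma^2 = 0 \tag{A.12}
\end{align}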

These equations form a system from which $\lambda_0$, $\lambda_1$, $\lambda_2$ and, more importantly, $f(x)$, can be calculated. From Eq. A.9:

$$f(x) = e^{\lambda_2 x^2 + \lambda_1 x + \lambda_0 - 1} \tag{A.13}$$
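Eq. A.9 and its solution A.13 can be sanity-checked on a discretised version of the problem, where $f$ is sampled on a grid and the integrals become sums. The following is a minimal sketch under that assumption; the grid, the multiplier values and the test function are arbitrary illustrative choices. It compares the analytic derivative of Eq. A.9 with a central finite difference of the discretised Lagrangian, and then confirms that the exponential form of Eq. A.13 makes that derivative vanish.

```python
import math

# Hypothetical discretisation: f is sampled on a uniform grid with spacing dx.
dx = 0.5
xs = [-3.0 + dx * i for i in range(13)]
mu, sigma = 0.3, 1.2                  # arbitrary illustrative targets
lam0, lam1, lam2 = 0.2, -0.1, -0.4    # arbitrary multipliers (lam2 < 0)

def lagrangian(f):
    """Discretised version of Eqs. A.5 to A.8."""
    entropy = -sum(fi * math.log(fi) for fi in f) * dx
    c0 = sum(f) * dx - 1.0
    c1 = sum(x * fi for x, fi in zip(xs, f)) * dx - mu
    c2 = sum(x * x * fi for x, fi in zip(xs, f)) * dx - mu ** 2 - sigma ** 2
    return entropy + lam0 * c0 + lam1 * c1 + lam2 * c2

def analytic_gradient(f):
    """Eq. A.9 at each grid point; the factor dx comes from the discretisation."""
    return [dx * (-math.log(fi) - 1.0 + lam0 + lam1 * x + lam2 * x * x)
            for x, fi in zip(xs, f)]

def numeric_gradient(f, eps=1e-6):
    """Central finite differences of the discretised Lagrangian."""
    grads = []
    for i in range(len(f)):
        up, down = list(f), list(f)
        up[i] += eps
        down[i] -= eps
        grads.append((lagrangian(up) - lagrangian(down)) / (2 * eps))
    return grads

# Any strictly positive test function will do for the derivative check.
f_test = [0.05 + 0.02 * (i % 4) for i in range(len(xs))]
max_gap = max(abs(a - n) for a, n in
              zip(analytic_gradient(f_test), numeric_gradient(f_test)))

# Eq. A.13: the stationary point has an exponential-quadratic form.
f_star = [math.exp(lam2 * x * x + lam1 * x + lam0 - 1.0) for x in xs]
max_grad_at_star = max(abs(g) for g in analytic_gradient(f_star))

print(max_gap, max_grad_at_star)
```

With arbitrary multipliers, `f_star` does not yet satisfy the constraints; fixing the multipliers is the purpose of the remaining steps.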

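Now $f(x)$ can be substituted in Eqs. A.10, A.11, A.12:

\begin{align}
\int_{-\infty}^{\infty} e^{\lambda_2 x^2 + \lambda_1 x + \lambda_0 - 1}\,dx &= 1 \tag{A.14} \\
\int_{-\infty}^{\infty} x\, e^{\lambda_2 x^2 + \lambda_1 x + \lambda_0 - 1}\,dx &= \mu \tag{A.15} \\
\int_{-\infty}^{\infty} x^2 e^{\lambda_2 x^2 + \lambda_1 x + \lambda_0 - 1}\,dx &= \mu^2 + \sigma^2 \tag{A.16}
\end{align}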

To solve this system of three equations with three unknowns ($\lambda_0$, $\lambda_1$, $\lambda_2$), the three improper integrals have to be computed. In 1778 Laplace proved that

$$I = \int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}. \tag{A.17}$$

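B. Bonev

This result is obtained by switching to polar coordinates:

\begin{align}
I^2 &= \left( \int_{-\infty}^{\infty} e^{-x^2}\,dx \right)^2 = \int_{-\infty}^{\infty} e^{-x^2}\,dx \int_{-\infty}^{\infty} e^{-y^2}\,dy \tag{A.18} \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-(x^2 + y^2)}\,dx\,dy \tag{A.19} \\
&= \int_{0}^{2\pi} \int_{0}^{\infty} e^{-r^2} r\,dr\,d\theta \tag{A.20} \\
&= 2\pi \left[ -\frac{1}{2} e^{-r^2} \right]_{0}^{\infty} = \pi \tag{A.21}
\end{align}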
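The value $I = \sqrt{\pi}$ and the polar-coordinates step can be checked numerically. This is a small sketch, assuming a midpoint rule on a truncated range is accurate enough for these rapidly decaying integrands; the helper name `midpoint` and the cut-off at 10 are choices made for the example.

```python
import math

def midpoint(g, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

# Direct evaluation of I on a truncated range; exp(-100) is negligible.
gauss_integral = midpoint(lambda x: math.exp(-x * x), -10.0, 10.0)

# Radial integral of exp(-r^2) r over r >= 0, which should be close to 1/2.
radial = midpoint(lambda r: math.exp(-r * r) * r, 0.0, 10.0)

# The angular integration contributes the factor 2 pi.
i_squared_polar = 2.0 * math.pi * radial

print(gauss_integral, math.sqrt(math.pi))
print(i_squared_polar, gauss_integral ** 2)
```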

In Eq. A.19, the differential $dx\,dy$ represents an element of area in Cartesian coordinates in the $xy$-plane. The same element can be expressed in polar coordinates $r, \theta$, which are given by $x = r\cos\theta$, $y = r\sin\theta$ and $r^2 = x^2 + y^2$. The element of area turns into $r\,dr\,d\theta$. The $2\pi$ factor in Eq. A.21 comes from the integration over $\theta$. The integral over $r$ can be calculated by substitution, $u = r^2$, $du = 2r\,dr$.

The more general integral of $x^n e^{-ax^2 + bx + c}$ has the following closed form ([Wei98]):

$$\int_{-\infty}^{\infty} x^n e^{-ax^2 + bx + c}\,dx = \sqrt{\frac{\pi}{a}}\; e^{b^2/(4a) + c} \sum_{k=0}^{\lfloor n/2 \rfloor} \frac{n!}{k!\,(n-2k)!} \frac{(2b)^{n-2k}}{(4a)^{n-k}} \tag{A.22}$$

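for integer $n \ge 0$, the variables $a$, $b$ belonging to the punctured plane (the complex plane with the origin 0 removed), and the real part of $a$ being positive. Applying this result with $a = -\lambda_2$, $b = \lambda_1$ and $c = \lambda_0 - 1$, the system of equations becomes:

\begin{align}
\sqrt{\frac{\pi}{-\lambda_2}}\; e^{-\lambda_1^2/(4\lambda_2) + \lambda_0 - 1} &= 1 \tag{A.23} \\
\sqrt{\frac{\pi}{-\lambda_2}}\; e^{-\lambda_1^2/(4\lambda_2) + \lambda_0 - 1} \left( -\frac{\lambda_1}{2\lambda_2} \right) &= \mu \tag{A.24} \\
\sqrt{\frac{\pi}{-\lambda_2}}\; e^{-\lambda_1^2/(4\lambda_2) + \lambda_0 - 1} \left( \frac{\lambda_1^2}{4\lambda_2^2} - \frac{1}{2\lambda_2} \right) &= \mu^2 + \sigma^2 \tag{A.25}
\end{align}

By applying the natural logarithm to the equations, for instance to the first one,

$$\log e^{-\lambda_1^2/(4\lambda_2) + \lambda_0 - 1} + \log \sqrt{\frac{\pi}{-\lambda_2}} = \log 1, \tag{A.26}$$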

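the unknown $\lambda_0$ can be isolated from each equation of the system:

\begin{align}
\lambda_0 &= \frac{\lambda_1^2}{4\lambda_2} + 1 - \log \sqrt{\frac{\pi}{-\lambda_2}} \tag{A.27} \\
\lambda_0 &= \frac{\lambda_1^2}{4\lambda_2} + 1 - \log \sqrt{\frac{\pi}{-\lambda_2}} + \log \frac{-2\lambda_2 \mu}{\lambda_1} \tag{A.28} \\
\lambda_0 &= \frac{\lambda_1^2}{4\lambda_2} + 1 - \log \sqrt{\frac{\pi}{-\lambda_2}} \tag{A.29} \\
&\quad + \log \frac{\mu^2 + \sigma^2}{\dfrac{\lambda_1^2}{4\lambda_2^2} - \dfrac{1}{2\lambda_2}} \tag{A.30}
\end{align}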

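From which it is deduced that $\lambda_1$ and $\lambda_2$ satisfy the following relation:

$$0 = \log \frac{-2\lambda_2 \mu}{\lambda_1} = \log \frac{\mu^2 + \sigma^2}{\dfrac{\lambda_1^2}{4\lambda_2^2} - \dfrac{1}{2\lambda_2}}, \tag{A.31}$$

then,

\begin{align}
\lambda_1 &= -2\lambda_2 \mu \tag{A.32} \\
\mu^2 + \sigma^2 &= \frac{\lambda_1^2}{4\lambda_2^2} - \frac{1}{2\lambda_2} \tag{A.33}
\end{align}

provides the values of $\lambda_1$ and $\lambda_2$, and by substituting them in any equation of the system, $\lambda_0$ is also obtained. The result is

\begin{align}
\lambda_0 &= -\frac{\mu^2}{2\sigma^2} - \log \sqrt{2\pi\sigma^2} + 1 \tag{A.34} \\
\lambda_1 &= \frac{\mu}{\sigma^2} \tag{A.35} \\
\lambda_2 &= -\frac{1}{2\sigma^2} \tag{A.36}
\end{align}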
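The multipliers of Eqs. A.34 to A.36 can be checked against the closed form of Eq. A.22. In this sketch the function name `gaussian_moment` and the sample values of `mu` and `sigma` are illustrative assumptions; the closed form is evaluated for real $a > 0$ only, with $a = -\lambda_2$, $b = \lambda_1$ and $c = \lambda_0 - 1$. The three moments should reproduce the right-hand sides of Eqs. A.14 to A.16.

```python
import math

def gaussian_moment(n, a, b, c):
    """Closed form (A.22) for the integral of x^n exp(-a x^2 + b x + c), real a > 0."""
    total = 0.0
    for k in range(n // 2 + 1):
        coeff = math.factorial(n) / (math.factorial(k) * math.factorial(n - 2 * k))
        total += coeff * (2 * b) ** (n - 2 * k) / (4 * a) ** (n - k)
    return math.sqrt(math.pi / a) * math.exp(b * b / (4 * a) + c) * total

mu, sigma = 1.5, 0.7  # arbitrary illustrative mean and standard deviation

# Multipliers from Eqs. A.34 to A.36.
lam0 = -mu ** 2 / (2 * sigma ** 2) - math.log(math.sqrt(2 * math.pi * sigma ** 2)) + 1
lam1 = mu / sigma ** 2
lam2 = -1 / (2 * sigma ** 2)

# Match the exponent lam2 x^2 + lam1 x + lam0 - 1 to -a x^2 + b x + c.
a, b, c = -lam2, lam1, lam0 - 1

m0 = gaussian_moment(0, a, b, c)  # Eq. A.14: should equal 1
m1 = gaussian_moment(1, a, b, c)  # Eq. A.15: should equal mu
m2 = gaussian_moment(2, a, b, c)  # Eq. A.16: should equal mu^2 + sigma^2

print(m0, m1, m2)
```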

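These solutions can be substituted in the $f(x)$ expression (Eq. A.13):

\begin{align}
f(x) &= e^{-\frac{1}{2\sigma^2} x^2 + \frac{\mu}{\sigma^2} x - \frac{\mu^2}{2\sigma^2} - \log \sqrt{2\pi\sigma^2}} \tag{A.37} \\
&= \frac{1}{\sqrt{2\pi\sigma^2}}\; e^{-\frac{1}{2\sigma^2} \left( \mu^2 - 2\mu x + x^2 \right)} \tag{A.38} \\
&= \frac{1}{\sqrt{2\pi\sigma^2}}\; e^{-\frac{(x - \mu)^2}{2\sigma^2}}, \tag{A.39}
\end{align}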

which, indeed, is the pdf of the Gaussian distribution. Finally, the negative sign of the second derivative of $\Lambda$ (negative because $f(x) > 0$) ensures that the obtained solution is a maximum, and not a minimum or an inflection point:

$$\frac{\partial^2 \Lambda(f, \lambda_0, \lambda_1, \lambda_2)}{\partial f(x)^2} = -\frac{1}{f(x)} \tag{A.40}$$
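The maximum can also be illustrated numerically by perturbing the Gaussian while keeping the constraints satisfied. The sketch below assumes a standard Gaussian and a Laplace density with the same mean (0) and variance (1); any mixture of the two keeps that mean and variance, so it remains a feasible point. The mixture weights and the midpoint rule are illustrative choices. Every mixture should show a lower entropy than the Gaussian itself.

```python
import math

def midpoint(g, lo=-30.0, hi=30.0, n=20_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

def entropy(pdf):
    """Differential entropy in nats, Eq. A.1."""
    def integrand(x):
        p = pdf(x)
        return -p * math.log(p) if p > 0.0 else 0.0  # treat 0 log 0 as 0
    return midpoint(integrand)

def gaussian(x):
    """Gaussian pdf with mean 0 and variance 1."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def laplace(x):
    """Laplace pdf with mean 0 and variance 1 (scale b = 1/sqrt(2))."""
    b = 1 / math.sqrt(2)
    return math.exp(-abs(x) / b) / (2 * b)

def mixture(eps):
    """Feasible perturbation: same mean and variance for every eps in [0, 1]."""
    return lambda x: (1 - eps) * gaussian(x) + eps * laplace(x)

h_gauss = entropy(gaussian)
h_mixed = {eps: entropy(mixture(eps)) for eps in (0.1, 0.5, 1.0)}

print(h_gauss)
for eps, h in h_mixed.items():
    print(eps, h, h_gauss - h)
```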
