
Lecture 4

Applications of Harmonic Analysis


February 4, 2005
Lecturer: Nati Linial
Notes: Matthew Cary
4.1 Useful Facts
Most of our applications of harmonic analysis to computer science will involve only Parseval's identity.
Theorem 4.1 (Parseval's Identity). $\|f\|_2 = \|\hat{f}\|_2$.

Corollary 4.2. $\langle f, g\rangle = \langle \hat{f}, \hat{g}\rangle$.
Proof. Note that $\langle f+g, f+g\rangle = \|f+g\|_2^2 = \|\widehat{f+g}\|_2^2 = \|\hat{f}+\hat{g}\|_2^2$. Now as $\langle f+g, f+g\rangle = \|f\|_2^2 + \|g\|_2^2 + 2\langle f, g\rangle$, and similarly $\|\hat{f}+\hat{g}\|_2^2 = \|\hat{f}\|_2^2 + \|\hat{g}\|_2^2 + 2\langle \hat{f}, \hat{g}\rangle$, applying Parseval to $\|f\|_2^2$ and $\|g\|_2^2$ and equating finishes the proof.
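Parseval's identity is easy to sanity-check numerically. The following sketch (mine, not from the lecture) works on the discrete circle $\mathbb{Z}_N$ and scales NumPy's FFT as $\hat{f} = \mathrm{FFT}(f)/N$, so that the averaged inner product on the time side matches the summed inner product on the frequency side.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64
f = rng.standard_normal(N) + 1j * rng.standard_normal(N)
g = rng.standard_normal(N) + 1j * rng.standard_normal(N)

# Transform normalized as fhat(r) = (1/N) * sum_t f(t) exp(-2*pi*i*r*t/N)
fhat = np.fft.fft(f) / N
ghat = np.fft.fft(g) / N

# <f,g> = (1/N) sum_t f(t) conj(g(t));  <fhat,ghat> = sum_r fhat(r) conj(ghat(r))
lhs = np.vdot(g, f) / N   # np.vdot conjugates its FIRST argument
rhs = np.vdot(ghat, fhat)
assert np.allclose(lhs, rhs)                                 # Corollary 4.2
assert np.allclose(np.vdot(f, f) / N, np.vdot(fhat, fhat))   # Theorem 4.1
```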
The other basic identity is the following.
Lemma 4.3. $\widehat{f * g} = \hat{f}\cdot\hat{g}$.
Proof. We will show this for the unit circle $T$, but one should note that it is true more generally. Recall that by definition $h = f * g$ means that
$$h(t) = \frac{1}{2\pi}\int_T f(s)\,g(t-s)\,ds.$$
Now to calculate $\widehat{f*g}$ we manipulate $\hat{h}$:
$$\hat{h}(r) = \frac{1}{2\pi}\int_T h(x)\,e^{-irx}\,dx = \frac{1}{4\pi^2}\iint_{T^2} f(s)\,g(x-s)\,e^{-irx}\,ds\,dx = \frac{1}{4\pi^2}\iint_{T^2} f(s)\,g(x-s)\,e^{-irs}\,e^{-ir(x-s)}\,dx\,ds,$$
using $e^{-irx} = e^{-irs}e^{-ir(x-s)}$ and interchanging the order of integration. Then by taking $u = x - s$ we have
$$= \frac{1}{4\pi^2}\iint_{T^2} f(s)\,g(u)\,e^{-irs}\,e^{-iru}\,du\,ds = \frac{1}{4\pi^2}\int_T f(s)\,e^{-irs}\,ds\int_T g(u)\,e^{-iru}\,du = \left(\frac{1}{2\pi}\int_T f(s)\,e^{-irs}\,ds\right)\left(\frac{1}{2\pi}\int_T g(u)\,e^{-iru}\,du\right) = \hat{f}(r)\,\hat{g}(r).$$
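This too can be checked numerically on the discrete circle $\mathbb{Z}_N$. The sketch below (mine, not from the notes) uses the averaged cyclic convolution $h(t) = \frac{1}{N}\sum_s f(s)g(t-s)$, matching the $\frac{1}{2\pi}$ in the definition above.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 64
f = rng.standard_normal(N)
g = rng.standard_normal(N)

# Averaged cyclic convolution: h(t) = (1/N) * sum_s f(s) g((t - s) mod N)
h = np.array([sum(f[s] * g[(t - s) % N] for s in range(N))
              for t in range(N)]) / N

fhat, ghat, hhat = (np.fft.fft(a) / N for a in (f, g, h))
assert np.allclose(hhat, fhat * ghat)   # Lemma 4.3: (f*g)^ = fhat * ghat
```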
4.2 Hurwitz's Proof of the Isoperimetric Inequality
Recall from last lecture that the isoperimetric problem is to show that a circle encloses the largest area among all curves of a fixed length. Formally, if $L$ is the length of a curve and $A$ the area enclosed, then we want to show that $L^2 - 4\pi A \geq 0$, with equality if and only if the curve is a circle. We will prove the following stronger theorem.
Theorem 4.4. Let $\gamma = (x, y) : T \to \mathbb{R}^2$ be an anticlockwise arc-length parametrization of a non-self-intersecting curve of length $L$ enclosing an area $A$. If $x, y \in C^1$, then
$$L^2 - 4\pi A = 2\pi^2\sum_{n\neq 0}\Big[\,|n\hat{x}(n) - i\hat{y}(n)|^2 + |n\hat{y}(n) + i\hat{x}(n)|^2 + (n^2 - 1)\big(|\hat{x}(n)|^2 + |\hat{y}(n)|^2\big)\Big].$$
In particular, $L^2 \geq 4\pi A$, with equality if and only if $\gamma$ is a circle.
We will not define arc-length parametrization formally, only remark that intuitively it means that if one views the parametrization as describing the motion of a particle in the plane, then an arc-length parametrization is one in which the speed of the particle is constant. In our context, where we view time as the unit circle $T$ of circumference $2\pi$, we have that $(\dot{x})^2 + (\dot{y})^2$ is a constant, namely $(L/2\pi)^2$, so that the total distance covered is $(L/2\pi)\cdot 2\pi = L$.
Proof. First we use our identity about the parametrization to relate the length to the transform of the parametrization:
$$\left(\frac{L}{2\pi}\right)^2 = \frac{1}{2\pi}\int_T\Big[\big(\dot{x}(s)\big)^2 + \big(\dot{y}(s)\big)^2\Big]\,ds = \|\hat{\dot{x}}\|_2^2 + \|\hat{\dot{y}}\|_2^2 \qquad\text{by Parseval's}$$
$$= \sum_n |in\,\hat{x}(n)|^2 + |in\,\hat{y}(n)|^2 \qquad\text{by the Fourier differentiation identities}$$
$$= \sum_n n^2\big(|\hat{x}(n)|^2 + |\hat{y}(n)|^2\big). \tag{4.1}$$
[Figure 4.1: Computing the area enclosed by a curve (labels: $dx/ds$, $y$).]
Now we compute the area. As the curve is anticlockwise,
$$A = -\int_T y\,\frac{dx}{ds}\,ds,$$
where the negative sign comes from the fact that the curve is anticlockwise. See Figure 4.1. This area integral looks like an inner product, so we write
$$\frac{A}{2\pi} = -\langle y, \dot{x}\rangle = -\langle \hat{y}, \hat{\dot{x}}\rangle.$$
By symmetry, considering the area integral from the other direction, we also have that
$$\frac{A}{2\pi} = \langle x, \dot{y}\rangle = \langle \hat{x}, \hat{\dot{y}}\rangle;$$
note there is no negative sign in this expression. Hence by adding we have that
$$\frac{A}{\pi} = \langle \hat{x}, \hat{\dot{y}}\rangle - \langle \hat{y}, \hat{\dot{x}}\rangle = \sum_n in\big(\hat{x}(n)\hat{y}(n)^* - \hat{x}(n)^*\hat{y}(n)\big), \tag{4.2}$$
using the Fourier differentiation identities and using the notation $a^*$ for the complex conjugate of $a$. Now $L^2 - 4\pi A = 4\pi^2\big[(4.1) - (4.2)\big]$; expanding and completing the squares term by term yields the right-hand side of the theorem.
To see why the right-hand side is zero if and only if $\gamma$ is a circle, consider when it vanishes. As it is a sum of many squares, $\hat{x}(n)$ and $\hat{y}(n)$ must vanish for all $n \neq 0, \pm 1$. Looking carefully at what the remaining terms mean shows that they vanish if and only if $\gamma$ is a circle.
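As a quick numerical illustration of the inequality itself (a sketch of mine, not part of the notes), one can approximate $L$ and $A$ for sampled closed curves and watch the deficit $L^2 - 4\pi A$ vanish exactly for the circle:

```python
import numpy as np

def length_and_area(x, y):
    # Closed curve sampled at equally spaced parameter values:
    # length from summed chord lengths, area from the shoelace formula.
    L = np.hypot(np.diff(np.r_[x, x[0]]), np.diff(np.r_[y, y[0]])).sum()
    A = 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))
    return L, A

t = np.linspace(0, 2 * np.pi, 20000, endpoint=False)
for a, b in [(1.0, 1.0), (2.0, 1.0), (5.0, 1.0)]:   # a circle and two ellipses
    L, A = length_and_area(a * np.cos(t), b * np.sin(t))
    print(f"a={a}, b={b}:  L^2 - 4*pi*A = {L**2 - 4*np.pi*A:.6f}")
```

The deficit is (numerically) zero for the circle and grows as the ellipse becomes more eccentric.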
4.3 Harmonic Analysis on the Cube for Coding Theory
The theory of error-correcting codes is broad and has numerous practical applications. We will look at the asymptotic theory of block coding, which like many problems in coding theory is well known, has a long history, and is still not well understood. The Boolean or Hamming cube $\{0,1\}^n$ is the set of all $n$-bit strings over $\{0,1\}$. The usual distance on $\{0,1\}^n$ is the Hamming distance $d_H(x,y)$, defined over $x, y \in \{0,1\}^n$ by the number of positions where $x$ and $y$ are not equal: $d_H(x,y) = |\{i : x_i \neq y_i\}|$. A code $C$ is a subset of $\{0,1\}^n$. The minimum distance of $C$ is the minimum distance between any two distinct elements of $C$:
$$\mathrm{dist}(C) = \min\{d_H(x,y) : x, y \in C,\ x \neq y\}.$$
The asymptotic question is to estimate the size of the largest code for any given minimum distance,
$$A(n, d) = \max\big\{|C| : C \subseteq \{0,1\}^n,\ \mathrm{dist}(C) \geq d\big\},$$
as $n \to \infty$. The problem is easier if we restrict the parameter space by fixing $d$ to be a constant fraction of the bit-length $n$, that is, consider $A(n, \delta n)$. Simple constructions show for $1/2 > \delta > 0$ that $A(n, \delta n)$ is exponential in $n$, so the interesting quantity will be the bit-rate of the code. Accordingly, we define the rate of a code as $\frac{1}{n}\log|C|$, and then define the asymptotic rate limit as
$$R(\delta) = \limsup_{n\to\infty}\left\{\frac{1}{n}\log|C| : C \subseteq \{0,1\}^n,\ \mathrm{dist}(C) \geq \delta n\right\}.$$
It is a sign of our poor knowledge of the area that we do not even know whether in the above we can replace the limsup by lim, i.e., whether the limit exists. If $|C| = 2^k$, we may think of the code as mapping $k$-bit strings into $n$-bit strings which are then communicated over a channel. The rate is then the ratio $k/n$, and measures the efficiency with which we utilize the channel.
A code is linear if $C$ is a linear subspace of $\{0,1\}^n$, viewed as a vector space over GF(2). In a linear code, if the minimum distance is realized by two codewords $x$ and $y$, then $x - y$ is a codeword whose Hamming weight equals the minimum distance. Hence for linear codes we have that
$$\mathrm{dist}(C) = \min\big\{|w| : w \in C \setminus \{0\}\big\}.$$
Here we use $|\cdot|$ to indicate the Hamming weight of a codeword, the number of nonzero positions. Note that this agrees with several other common norms on GF(2).
A useful entity is the orthogonal code to a given code. If $C$ is a linear code, we define
$$C^\perp = \{y : \langle x, y\rangle = 0 \text{ for all } x \in C\},$$
where we compute the inner product $\langle\cdot,\cdot\rangle$ over GF(2), that is, $\langle x, y\rangle = \sum_{i=1}^n x_i y_i \pmod{2}$.
4.3.1 Distance Distributions and the MacWilliams Identities
Our first concrete study of codes will be into the distance distribution, which is the collection of probabilities
$$\Pr\big[|x - y| = k\big], \qquad x, y \text{ chosen independently at random from } C,$$
for $0 \leq k \leq n$. If $C$ is linear, our discussion above shows that the distance distribution is identical to the weight distribution of the code, the probabilities that a randomly selected codeword has a specified weight.

The MacWilliams Identities are important identities about this distribution that are easily derived using Parseval's Identity. Let $f = 1_C$, the indicator function of the code. We first need the following lemma.
Lemma 4.5. $\hat{f} = \frac{|C|}{2^n}\, 1_{C^\perp}$.
Proof.
$$\hat{f}(u) = \frac{1}{2^n}\sum_v f(v)\,\chi_v(u) = \frac{1}{2^n}\sum_v f(v)(-1)^{\langle u,v\rangle} = \frac{1}{2^n}\sum_{v\in C}(-1)^{\langle u,v\rangle}.$$
If $u \in C^\perp$, then $\langle u, v\rangle = 0$ for all $v \in C$, so that $\hat{f}(u) = |C|/2^n$. Suppose otherwise, so that $\sum_{v\in C}(-1)^{\langle u,v\rangle} = |C_0| - |C_1|$, where $C_0$ are the codewords of $C$ that are perpendicular to $u$, and $C_1 = C \setminus C_0$. As $u \notin C^\perp$, $C_1$ is nonempty. Pick an arbitrary $w$ in $C_1$. Then any $y \in C_1 \setminus \{w\}$ corresponds to a unique $x \in C_0$, namely $w + y$. Similarly, any $x \in C_0 \setminus \{0\}$ corresponds to $w + x \in C_1 \setminus \{w\}$. As $w \in C_1$ corresponds to $0 \in C_0$, we have that $|C_0| = |C_1|$. Hence $\sum_{v\in C}(-1)^{\langle u,v\rangle} = 0$, so that
$$\hat{f}(u) = \begin{cases} |C|/2^n & \text{if } u \in C^\perp,\\ 0 & \text{otherwise,}\end{cases}$$
which proves the lemma.
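The lemma can be verified directly for a small code by brute force; the sketch below is mine, and the specific code is just an assumed example (the GF(2) span of $1111$ and $1010$).

```python
from itertools import product

n = 4
# A small linear code: the span of 1111 and 1010 over GF(2)
C = {(0, 0, 0, 0), (1, 1, 1, 1), (1, 0, 1, 0), (0, 1, 0, 1)}

def dot(u, v):
    return sum(a * b for a, b in zip(u, v)) % 2

def fhat(u):   # (1/2^n) * sum_{v in C} (-1)^{<u,v>}
    return sum((-1) ** dot(u, v) for v in C) / 2 ** n

for u in product((0, 1), repeat=n):
    in_dual = all(dot(u, v) == 0 for v in C)
    expected = len(C) / 2 ** n if in_dual else 0.0
    assert fhat(u) == expected   # Lemma 4.5
```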
We now define the weight enumerator of a code to be
$$P_C(x, y) = \sum_{w\in C} x^{|w|}\, y^{n-|w|}.$$
The MacWilliams Identity connects the weight enumerators of $C$ and $C^\perp$ for linear codes.

Theorem 4.6 (The MacWilliams Identity).
$$P_C(x, y) = \frac{|C|}{2^n}\, P_{C^\perp}(y - x,\ y + x).$$
Proof. Harmonic analysis provides a nice proof of the identity by viewing it as an inner product. Define $f = 1_C$ and $g(w) = x^{|w|}y^{n-|w|}$. Then, using Parseval's,
$$P_C(x, y) = 2^n\langle f, g\rangle = 2^n\langle \hat{f}, \hat{g}\rangle.$$
$\hat{f}$ has already been computed in Lemma 4.5, so we turn our attention to $\hat{g}$:
$$\hat{g}(u) = \frac{1}{2^n}\sum_v g(v)(-1)^{\langle u,v\rangle} = \frac{1}{2^n}\sum_v x^{|v|}\,y^{n-|v|}\,(-1)^{\langle u,v\rangle}.$$
Let $u$ have $k$ ones and $n-k$ zeros. For a given $v$, let $s$ be the number of ones of $v$ that coincide with the ones of $u$, and let $t$ be the number of ones of $v$ coinciding with the zeros of $u$. Then we rewrite the sum as
$$= \frac{1}{2^n}\sum_{s,t}\binom{k}{s}\binom{n-k}{t}\,x^{s+t}\,y^{n-s-t}\,(-1)^s = \frac{y^n}{2^n}\sum_s\binom{k}{s}\left(-\frac{x}{y}\right)^{s}\sum_t\binom{n-k}{t}\left(\frac{x}{y}\right)^{t}$$
$$= \frac{y^n}{2^n}\left(1-\frac{x}{y}\right)^{k}\left(1+\frac{x}{y}\right)^{n-k} = \frac{1}{2^n}\,(y-x)^k\,(y+x)^{n-k} = \frac{1}{2^n}\,(y-x)^{|u|}\,(y+x)^{n-|u|}.$$
Now, as $\langle f, g\rangle = \langle \hat{f}, \hat{g}\rangle = 2^{-n}P_C(x, y)$, we plug in our expressions for $\hat{f}$ and $\hat{g}$ to get
$$2^{-n}P_C(x, y) = \sum_{w\in C^\perp}\frac{|C|}{2^n}\cdot\frac{1}{2^n}\,(y-x)^{|w|}\,(y+x)^{n-|w|} = \frac{|C|}{4^n}\,P_{C^\perp}(y-x,\ y+x),$$
which implies
$$P_C(x, y) = \frac{|C|}{2^n}\,P_{C^\perp}(y-x,\ y+x).$$
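The identity, including the $\frac{|C|}{2^n}$ normalization (which depends on the convention used for $\hat{f}$ above), can be checked by brute force on a small code. A sketch of mine, with an arbitrary example code:

```python
from itertools import product

def dot(u, v):
    return sum(a * b for a, b in zip(u, v)) % 2

def dual(C, n):
    return {u for u in product((0, 1), repeat=n)
            if all(dot(u, v) == 0 for v in C)}

def P(C, n, x, y):   # weight enumerator P_C evaluated at the point (x, y)
    return sum(x ** sum(w) * y ** (n - sum(w)) for w in C)

n = 4
C = {(0, 0, 0, 0), (1, 1, 1, 1), (1, 0, 1, 0), (0, 1, 0, 1)}   # linear, |C| = 4
Cperp = dual(C, n)

for x, y in [(0.3, 1.7), (2.0, 5.0), (-1.0, 3.0)]:
    lhs = P(C, n, x, y)
    rhs = len(C) / 2 ** n * P(Cperp, n, y - x, y + x)
    assert abs(lhs - rhs) < 1e-9   # Theorem 4.6
```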
4.3.2 Upper and Lower Bounds on the Rate of Codes
We now turn our attention to upper and lower bounds for codes. We remind any complexity theorists reading these notes that the senses of upper bound and lower bound are reversed from their usage in complexity theory. Namely, a lower bound on $R(\delta)$ shows that good codes exist, and an upper bound shows that superb codes don't.

In the remainder of this lecture we show several simple upper and lower bounds, and then set the stage for the essentially strongest known upper bound on the rate of codes, the McEliece, Rodemich, Rumsey, and Welch (MRRW) upper bound. This is also referred to as the JPL bound, after the lab where the authors worked, or the linear programming (LP) bound, after its proof method.
Our first bound is a lower bound. Recall the binary entropy function
$$H(x) = -x\log x - (1-x)\log(1-x).$$
Theorem 4.7 (Gilbert-Varshamov Bound).
$$R(\delta) \geq 1 - H(\delta),$$
and there exists a linear code satisfying the bound.
Proof. We will sequentially pick codewords, where each new point avoids the spheres of radius $\delta n$ around all previously selected points. The resulting code $C$ will satisfy
$$|C| \geq \frac{2^n}{\mathrm{vol}(\text{sphere of radius } \delta n)} = \frac{2^n}{\sum_{j=0}^{\delta n}\binom{n}{j}}.$$
Now, note that $\frac{1}{n}\log\binom{n}{\delta n} \to H(\delta)$ as $n\to\infty$, so that $2^n/\sum_{j\leq\delta n}\binom{n}{j} \geq 2^{n(1-H(\delta)) - o(n)}$, and take logs to prove the first part of the theorem.
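The entropy approximation for binomial sums used here is worth seeing numerically; a small sketch of mine:

```python
from math import comb, log2

def H(x):
    return -x * log2(x) - (1 - x) * log2(1 - x)

delta = 0.3
for n in (100, 1000, 10000):
    vol = sum(comb(n, j) for j in range(int(delta * n) + 1))
    print(n, log2(vol) / n, H(delta))   # first column tends to H(0.3) = 0.8813
```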
We now show that there's a linear code satisfying this rate bound. This proof is different than the one given in class, as I couldn't get that to work out. The presentation is taken from Trevisan's survey of coding theory for computational complexity. We can describe a linear $k$-dimensional code $C_A$ by an $n\times k$ 0-1 matrix $A$, via $C_A = \{Ax : x \in \{0,1\}^k\}$. We'll show that if $k/n \leq 1 - H(\delta)$, then with positive probability over a uniformly random $A$ we have $\mathrm{dist}(C_A) \geq \delta n$. As the code is linear, it suffices to show that the weight of all nonzero codewords is at least $\delta n$. As for a given nonzero $x \in \{0,1\}^k$, $Ax$ is uniformly distributed over $\{0,1\}^n$, we have
$$\Pr\big[|Ax| < \delta n\big] = 2^{-n}\sum_{i=0}^{\delta n - 1}\binom{n}{i} \leq 2^{-n}\,2^{nH(\delta)+o(n)},$$
using our approximation to the binomial sum. Now we take a union bound over all $2^k$ choices for $x$ to get
$$\Pr\big[\exists x \neq 0 : |Ax| < \delta n\big] \leq 2^k\,2^{-n}\,2^{nH(\delta)+o(n)} = 2^{k + n(H(\delta)-1) + o(n)} < 1$$
by our choice of $k \leq n(1 - H(\delta))$ (taking $k$ slightly smaller to absorb the $o(n)$ term).
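The random-matrix argument is easy to simulate at small block lengths. The following sketch (mine, with arbitrary small parameters) draws a random generator matrix at the GV rate and reports the minimum weight:

```python
import numpy as np
from math import log2

def H(x):
    return -x * log2(x) - (1 - x) * log2(1 - x)

rng = np.random.default_rng(2)
n, delta = 20, 0.25
k = int(n * (1 - H(delta)))            # dimension at the GV bound
A = rng.integers(0, 2, size=(n, k))    # random n x k generator matrix over GF(2)

min_wt = n
for m in range(1, 2 ** k):             # all nonzero messages (fine for small k)
    x = np.array([int(c) for c in np.binary_repr(m, width=k)])
    min_wt = min(min_wt, int(((A @ x) % 2).sum()))

print(k, min_wt, delta * n)   # with positive probability min_wt >= delta * n
```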
We now turn to upper bounds on $R(\delta)$.
Theorem 4.8 (Sphere-Packing Bound).
$$R(\delta) \leq 1 - H(\delta/2).$$
Proof. The theorem follows from noting that balls of radius $\delta n/2$ around codewords must be disjoint, and applying the approximations used above for the volume of spheres in the cube.
We note in Figure 4.2 that the sphere-packing bound is far from the GV bound. In particular, the GV bound reaches zero at $\delta = 1/2$, while the sphere-packing bound is positive until $\delta = 1$. However, we have the following simple claim.

Claim 4.1. $R(\delta) = 0$ for $\delta > 1/2$.
Proof. We will show the stronger statement that if $|C|$ is substantial then not only is it impossible to have $d_H(x,y) > \delta n$ for all $x, y \in C$, but even the average of $d_H(x,y)$ over all pairs $x, y \in C$ will be at most about $n/2$. This average distance is
$$\frac{1}{\binom{|C|}{2}}\sum_{\{x,y\}\subseteq C} d(x,y),$$
[Figure 4.2: The GV bound contrasted with the Sphere-Packing Bound (the Elias bound is also shown); the plot shows $R(\delta)$ against $\delta$, with tick marks at $\delta = 0.5$ and $\delta = 1$.]
and we will expand the distance $d(x,y) = |\{i : x_i \neq y_i\}|$. Reversing the order of summation,
$$\text{Average distance} = \frac{1}{\binom{|C|}{2}}\sum_{x,y}\sum_i 1_{x_i \neq y_i} = \frac{1}{\binom{|C|}{2}}\sum_i z_i\big(|C| - z_i\big),$$
where $z_i$ is the number of zeros in the $i$th position over all the codewords of $C$. Hence
$$\text{Average distance} \leq \frac{1}{\binom{|C|}{2}}\cdot n\,\frac{|C|^2}{4} = \frac{1}{2}\,n\,\frac{|C|}{|C|-1}.$$
So unless $C$ is very small, the average distance is essentially at most $n/2$.
Our next upper bound improves on the sphere-packing bound, at least achieving $R(\delta) = 0$ for $\delta > 1/2$. It still leaves a substantial gap with the GV bound.

Theorem 4.9 (Elias Bound).
$$R(\delta) \leq 1 - H\left(\frac{1 - \sqrt{1 - 2\delta}}{2}\right).$$
Proof. The proof begins by considering the calculation of average distance from the previous theorem. It follows from Jensen's inequality that if the average weight of the vectors in $C$ is $\lambda n$, then the maximum of $\sum_i z_i(|C| - z_i)$ is obtained when $z_i = (1-\lambda)|C|$ for all $i$. We sketch the argument for those not familiar with Jensen's inequality. The inequality states that if $f$ is concave, then for $x_1, \ldots, x_n$, $\frac{1}{n}\sum f(x_i) \leq f\big(\sum x_i/n\big)$, with equality if and only if $x_1 = \cdots = x_n$. For our case, the function $f(x) = x(|C| - x)$ is easily verified concave, and so the sum is maximized, for a given average weight, when the $z_i$ are all equal. This makes the average distance in $C$ at most $2\lambda(1-\lambda)n$.
With this calculation in mind, choose a spherical shell $S$ in $\{0,1\}^n$ centered at some $x_0$ with radius $r$ such that
$$|S \cap C| \geq |C|\,\frac{|S|}{2^n}.$$
Such a shell exists, as the right-hand side of the inequality is the expected intersection size when the sphere is chosen randomly. Set $r = pn$, so that $|S| \approx 2^{nH(p)}$, which means
$$|S \cap C| \geq \frac{|C|}{2^{n(1-H(p))}}.$$
Now apply the argument above to $x_0 + (C \cap S)$. Every word in this set has weight exactly $pn$, so by our discussion its average distance is at most $2p(1-p)n$ (up to a $\frac{m}{m-1}$ factor, where $m = |S \cap C|$), while its minimum distance is still at least $\delta n$. So if $\delta > 2p(1-p)$, then $|S \cap C|$ is bounded by a constant; but $|S \cap C| \geq |C|\,2^{-n(1-H(p))}$, implying
$$\frac{1}{n}\log|C| \leq 1 - H(p) + o(1).$$
Let us rewrite our condition $\delta > 2p(1-p)$ as follows:
$$1 - 2p > \sqrt{1 - 2\delta} \iff p < \frac{1 - \sqrt{1 - 2\delta}}{2}.$$
This is the critical value of $p$: for any $p$ below it the argument applies, and letting $p$ approach the critical value gives the theorem.

Figure 4.2 shows how the Elias bound improves the sphere-packing bound to something reasonable. The gap between it and the GV bound is still large, however.
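The three bounds are simple enough to tabulate; this sketch of mine prints them at a few values of $\delta$ and shows the Elias bound reaching zero at $\delta = 1/2$:

```python
from math import log2, sqrt

def H(x):
    return 0.0 if x in (0.0, 1.0) else -x * log2(x) - (1 - x) * log2(1 - x)

print("delta  GV(lower)  Sphere(upper)  Elias(upper)")
for d in (0.1, 0.2, 0.3, 0.4, 0.5):
    gv = 1 - H(d)
    sp = 1 - H(d / 2)
    p = (1 - sqrt(1 - 2 * d)) / 2     # critical p for the Elias bound
    print(f"{d:.1f}    {gv:.3f}      {sp:.3f}          {1 - H(p):.3f}")
```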
4.4 Aside: Erdős-Ko-Rado Theorem
The proof of the Elias bound that we just saw is based on the following clever idea: we investigate an unknown object (the code $C$) by intersecting it with random elements of a cleverly chosen set (the spheres). This method of a randomly chosen fish-net is also the basis for the following beautiful proof, due to Katona, of the Erdős-Ko-Rado theorem.
Definition 4.1. An intersecting family is a family $\mathcal{F}$ of $k$-sets in $\{1, \ldots, n\}$ (compactly, $\mathcal{F} \subseteq \binom{[n]}{k}$), with $2k \leq n$, such that for any $A, B \in \mathcal{F}$, $A \cap B \neq \emptyset$.

Informally, an intersecting family is a collection of sets which are pairwise intersecting. One way to construct such a family is to pick a common point of intersection, and then choose all possible $(k-1)$-sets from the remaining $n-1$ points to fill out the sets. The Erdős-Ko-Rado Theorem says that this easy construction is the best possible.
Theorem 4.10 (Erdős-Ko-Rado). If $\mathcal{F} \subseteq \binom{[n]}{k}$ is an intersecting family with $2k \leq n$, then
$$|\mathcal{F}| \leq \binom{n-1}{k-1}.$$
Proof (Katona). Given an intersecting family $\mathcal{F}$, arrange $1 \ldots n$ in a random permutation $\sigma$ along a circle, and count the number of sets $A \in \mathcal{F}$ such that $A$ appears as an arc of $\sigma$. This will be our random fish-net.

There are $(n-1)!$ cyclic permutations, that is, $n!$ permutations divided by $n$, as rotations of the circle are identical. There are $k!$ ways for a given $k$-set to be arranged as an arc, and $(n-k)!$ ways of arranging the other elements without interfering with that arc, so that the set appears consecutively on the circle. Hence the probability that a given $k$-set appears as an arc is
$$\frac{k!\,(n-k)!}{(n-1)!} = \frac{n}{\binom{n}{k}},$$
which by linearity of expectation implies
$$\mathbb{E}\big[\#\text{ arcs belonging to } \mathcal{F}\big] = \frac{n\,|\mathcal{F}|}{\binom{n}{k}}.$$
Now, as $2k \leq n$, at most $k$ members of an intersecting family can appear as arcs on the circle, as otherwise two of the arcs wouldn't intersect. (Fix one arc $A \in \mathcal{F}$ on the circle; every other arc of $\mathcal{F}$ must have an endpoint at one of the $k-1$ internal boundary points of $A$, and for each such point at most one of the two arcs through it can lie in $\mathcal{F}$, since $2k \leq n$ makes those two arcs disjoint.) Hence
$$\frac{n\,|\mathcal{F}|}{\binom{n}{k}} \leq k,$$
implying
$$|\mathcal{F}| \leq \frac{k}{n}\binom{n}{k} = \binom{n-1}{k-1}.$$