Spatial and Temporal Linear Filters: 6.8300/6.8301 Advances in Computer Vision
Spring 2024
Sara Beery, Kaiming He, Vincent Sitzmann, Mina Konaković Luković
Remember, an image is just an array of numbers
(Figure: what we see, the image, versus what the machine gets, an array of pixel values.)
Some visual areas…
From M. Lewicki
A system f is linear if it satisfies:
f(αx) = αf(x) (homogeneity)
f(x + y) = f(x) + f(y) (additivity)
We need translation invariance
Now we also want translation-invariant operations. These can be linear or non-linear; we will focus on linear translation-invariant systems.
(Figure: a classifier is applied at every image location; it outputs “Bird” at the bird’s position and “Sky” everywhere else. The same operation f should apply at every translation of the input.)
One of Fourier’s original examples of sine series is the expansion of the ramp signal:
x/2 = sin(x) − sin(2x)/2 + sin(3x)/3 − …
The Discrete Fourier Transform
Discrete Fourier Transform (DFT) transforms a signal f[n] into F[u] as:

F[u] = ∑_{n=0}^{N−1} f[n] exp(−2πj un / N)

with the inverse transform

f[n] = (1/N) ∑_{u=0}^{N−1} F[u] exp(2πj un / N)

In two dimensions:

F[u, v] = ∑_{n=0}^{N−1} ∑_{m=0}^{M−1} f[n, m] exp(−2πj (un/N + vm/M))

F[u, v] is complex-valued and is visualized via its amplitude and phase.
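As a sanity check on the definition, here is a minimal sketch of the 1-D DFT written directly from the sum above, compared against NumPy's FFT (the function name `dft` is ours, not from any library):

```python
import numpy as np

def dft(f):
    """Naive 1-D DFT: F[u] = sum_n f[n] * exp(-2*pi*j*u*n / N)."""
    f = np.asarray(f, dtype=complex)
    N = len(f)
    n = np.arange(N)
    u = n.reshape(-1, 1)  # one row of the exponent matrix per frequency u
    return (f * np.exp(-2j * np.pi * u * n / N)).sum(axis=1)

f = np.array([1.0, 2.0, 3.0, 4.0])
F = dft(f)
print(np.allclose(F, np.fft.fft(f)))  # True: same convention as np.fft
```

Note that `np.fft.fft` uses exactly this sign convention, so the two agree to floating-point precision.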
Visualizing the image Fourier transform
(Figure: an image f[n, m] and the amplitude and phase of its Fourier transform F[u, v].)
Location information goes into the phase; strength goes into the magnitude.
Simple Fourier transforms
Images are 64×64 pixels. The wave is a cosine, so the DFT phase is zero.
Box filter h[n]: a constant filter with 2N+1 taps (for N = 1, h = [1 1 1]). In 2D, a (2N+1) × (2M+1) array of ones.
Box filter
(Figure: convolving a 256×256 image with a 21×21 box filter replaces each pixel by the local mean, blurring the image.)
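A minimal 1-D sketch of the box (mean) filter, normalized so the output is a local average (the helper name `box_filter_1d` is ours):

```python
import numpy as np

def box_filter_1d(f, N=1):
    """(2N+1)-tap box (mean) filter, 'same'-length output."""
    h = np.ones(2 * N + 1) / (2 * N + 1)
    return np.convolve(f, h, mode='same')

print(box_filter_1d([0.0, 0.0, 3.0, 0.0, 0.0]))  # [0. 1. 1. 1. 0.]
```

An isolated spike of height 3 is spread into three samples of its local 3-tap mean.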
Frequency responses of [1 1 1] and [1 2 1]; for the 3-tap box filter the response is 1 + 2 cos(2πu/20).
h1[n] vs b2[n]
Gaussian filter
In the continuous domain:
g(x, y) = (1 / (2πσ²)) exp(−(x² + y²) / (2σ²))
Scale
Gaussian filter for low-pass filtering
Dali
Properties of the Gaussian filter
Binomial filter
b1 = [1 1]
b2 = [1 1] ∗ [1 1] = [1 2 1]
b3 = [1 1] ∗ [1 1] ∗ [1 1] = [1 3 3 1]
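The construction above, repeated convolution of [1 1] with itself, can be sketched in a few lines (the function name `binomial` is ours):

```python
import numpy as np

def binomial(order):
    """b_order: convolve [1, 1] with itself order-1 times (b1 = [1 1])."""
    out = np.array([1.0, 1.0])
    for _ in range(order - 1):
        out = np.convolve(out, [1.0, 1.0])
    return out

print(binomial(2))  # [1. 2. 1.]
print(binomial(3))  # [1. 3. 3. 1.]
```

The coefficients are exactly the rows of Pascal's triangle, which is why these are called binomial filters.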
Properties of binomial filters
Note: these properties are analogous to those of the Gaussian in the continuous domain (but the binomial filter is different from a discretization of a Gaussian).
B2[n]
Laplacian filter
(Figure: center-surround weights, a positive center flanked by negative lobes.)
Sharpening
(Figure: a sharpening filter combines a Laplacian filter with a Gaussian filter; adding the Laplacian response back boosts the high frequencies.)
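A minimal 1-D sketch of sharpening by adding back a Laplacian response (an unsharp-masking style example; the name `sharpen_1d` and the choice of `alpha` are ours):

```python
import numpy as np

def sharpen_1d(f, alpha=1.0):
    """Boost edges by subtracting the discrete Laplacian response [1 -2 1]."""
    lap = np.convolve(f, [1.0, -2.0, 1.0], mode='same')
    return np.asarray(f, dtype=float) - alpha * lap

step = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
print(sharpen_1d(step))  # overshoot/undershoot appears on either side of the edge
```

On a step edge the output dips below 0 just before the edge and overshoots above 1 just after it, which is what makes edges look crisper.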
Contrast sensitivity
(Figure: human contrast sensitivity vs. spatial frequency, 0.1 to 100 cycles/degree on a log axis. The curve separates visible from invisible contrasts, with maximum sensitivity at roughly 6 cycles per degree of visual angle.)
When you are far away you see only the low spatial frequencies; when you are close you see the high spatial frequencies.
Hybrid Images
Oliva & Schyns
We start by taking two pictures; we isolate the details and contours in one (here, the woman) and blur the second face. As you can see, the man seems to go out of focus and the details of the woman are superimposed.
Hybrid Images
(Figure: high-pass image + low-pass image = hybrid image.)
Hybrid Images
http://cvcl.mit.edu/hybrid_gallery/gallery.html
High-pass filters
Finding edges in the image
Image gradient: ∇f = (∂f/∂x, ∂f/∂y)
Edge strength: ‖∇f‖ = ((∂f/∂x)² + (∂f/∂y)²)^(1/2)
Edge orientation: θ = arctan((∂f/∂y) / (∂f/∂x))
Edge normal: n = ∇f / ‖∇f‖
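A minimal sketch of these quantities using NumPy's finite-difference gradient (the helper name `edge_features` is ours):

```python
import numpy as np

def edge_features(img):
    """Gradient magnitude (edge strength) and orientation of a 2-D image."""
    gy, gx = np.gradient(np.asarray(img, dtype=float))  # d/drow, d/dcol
    strength = np.hypot(gx, gy)      # ||grad f||
    orientation = np.arctan2(gy, gx) # angle of the edge normal
    return strength, orientation

img = np.zeros((5, 5))
img[:, 3:] = 1.0  # vertical step edge
strength, orientation = edge_features(img)
```

For a vertical step edge the strength is largest along the edge column and the orientation there is 0 (the gradient points horizontally).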
Image derivative: convolving with h = [−1, 1]
(Figure: g[m, n] = f[m, n] ∗ h, the horizontal derivative.)
Convolving with h = [−1, 1]ᵀ
(Figure: g[m, n] = f[m, n] ∗ h, the vertical derivative.)
Discrete derivatives
This centered version gives better localization; with the two-tap filter [−1, 1], the edge would fall at a half-pixel position.
Derivatives
We want to compute the image derivative:
∂/∂x (f ∗ g)
But derivatives and convolutions are linear, so we can move them around:
∂/∂x (f ∗ g) = f ∗ (∂g/∂x)
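This identity is why, instead of differentiating the blurred image, we can convolve once with a sampled derivative-of-Gaussian filter. A minimal sketch (the name `gaussian_deriv` and the 3σ truncation radius are our choices):

```python
import numpy as np

def gaussian_deriv(sigma=1.0, radius=None):
    """Sampled derivative of a 1-D Gaussian: g'(x) = -(x / sigma**2) * g(x)."""
    if radius is None:
        radius = int(3 * sigma)  # truncate the tails at 3 sigma
    x = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-x**2 / (2 * sigma**2))
    g /= g.sum()                 # normalize the Gaussian first
    return -(x / sigma**2) * g

d = gaussian_deriv(1.0)
```

As a derivative filter should, it is antisymmetric and its taps sum to zero, so it gives no response on constant regions.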
Gaussian derivatives
Gaussian Scale
Edges at a different spatial scale: not picking up the stripes, just the edges of the body.
Orientation
Sampling
You should always blur your image before you subsample it!
Sampling
(Figure: the continuous world is sampled onto a grid of pixels.)
The question is: how many samples do we need to capture the continuous signal well?
Aliasing
Let’s start with this continuous image (it is not really continuous…)
Aliasing
(Figure sequence: the red curve is the signal, a sinusoid plus a constant. Sampling replicates the baseband spectrum at multiples of the sampling frequency; as the sampling rate drops, the replicated spectra overlap the baseband and create aliased components.)
A high frequency appears as the low frequency of another signal. To avoid this, apply a low-pass filter first, so that you capture at least 2 samples per period of your sinusoid.
Aliasing (spatial domain and frequency domain): sampling repeats copies of the spectrum at a spacing given by the sampling frequency.
Antialiasing filtering
Before sampling, apply a low-pass filter to remove the frequencies that would produce aliasing: “blur before you subsample.”
(Figure: sampling without vs. with the antialiasing filter.)
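A minimal 1-D sketch of “blur before you subsample,” using the small binomial filter from earlier as the low-pass stage (the function name `subsample` is ours):

```python
import numpy as np

def subsample(f, factor=2, blur=True):
    """Subsample a 1-D signal, optionally low-pass filtering first."""
    f = np.asarray(f, dtype=float)
    if blur:
        h = np.array([1.0, 2.0, 1.0]) / 4.0  # small binomial low-pass
        f = np.convolve(f, h, mode='same')
    return f[::factor]

f = np.array([1.0, -1.0] * 4)       # the highest representable frequency
print(subsample(f, blur=False))     # aliases to a constant: [1. 1. 1. 1.]
print(subsample(f, blur=True))      # near zero: the frequency was removed
```

Without the filter, the fastest oscillation aliases into a constant signal; with it, that frequency is suppressed before sampling, so almost nothing (spurious) survives.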
• Temporal filtering
• A motion illusion, involving aliasing, that addresses whether humans match spatial patterns or use temporal filters to measure motion.
Temporal filtering
Sequences: a video is a signal over space and time.
Global constant motion: the whole frame translates with velocity (vx, vy).
The sampled signal is the original signal multiplied by a delta train. The Fourier transform of a box is a sinc function (sin x / x). The Fourier domain is just a different representation that makes certain operations simpler; for example, to characterize motion, we can measure the orientation of the signal’s spectrum to track the direction of motion.
Temporal Gaussian
Filters remove some frequencies we don’t like and keep the ones we like.
This filter blurs more in time than in space.
Spatio-temporal Gaussian
How could we create a filter that keeps sharp objects that
move at some velocity (vx , vy) while blurring the rest?
With u0 = 0.1:

ψc(x, y) = exp(−(x² + y²) / (2σ²)) cos(2πu0 x)
ψs(x, y) = exp(−(x² + y²) / (2σ²)) sin(2πu0 x)

Adding a temporal phase φt makes the filter spatio-temporal:

ψc(x, y, t) = exp(−(x² + y²) / (2σ²)) cos(2πu0 x + φt)
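A minimal sketch of the spatial cosine/sine (Gabor) pair defined above, sampled on a grid (the function name `gabor_pair` and the default `sigma`/`size` values are our choices):

```python
import numpy as np

def gabor_pair(sigma=4.0, u0=0.1, size=21):
    """Even (cosine) and odd (sine) Gabor filters from the formulas above."""
    r = size // 2
    y, x = np.mgrid[-r:r + 1, -r:r + 1].astype(float)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    psi_c = envelope * np.cos(2 * np.pi * u0 * x)
    psi_s = envelope * np.sin(2 * np.pi * u0 * x)
    return psi_c, psi_s

psi_c, psi_s = gabor_pair()
```

The cosine filter is even-symmetric in x and the sine filter is odd-symmetric, which is what makes the pair useful for measuring oriented energy regardless of phase.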
Space-time plot of a slice through the spatio-temporal filter of the previous slide:

ψc(x, y, t) = exp(−(x² + y²) / (2σ²)) cos(2πu0 x + φt)
Spatio-temporal sampling illusion, due to
Edward Adelson and Jim Bergen
Evidence for filter-based analysis of motion in the
human visual system shown via spatio-temporal
visual illusion based on sampling
Two potential theories for how humans compute our motion perceptions:
(a) We match the pattern in the image that we see at one moment and compare
it with what we see at subsequent times.
(b) We use spatio-temporal filters to measure spatio-temporal energy in order to measure local motion.
Visual signal: a static sinusoid (this “video” is static).
(Figure: space-time plot and its spatial/temporal frequency spectrum. Demo parameters: alpha: 1, squareFlag: 0, offset: 0.)
Start with a picture of a sine wave: a static wave that is not moving in time.
Moving sinusoid
(Figure: space-time plot and its spatial/temporal frequency spectrum.)
A square wave is an infinite sum of sinusoids; we can use spatio-temporal filters to analyze its motion.
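The square-wave decomposition can be made concrete with its Fourier sine series, 4/π times the sum of sin(kx)/k over odd k (the helper name `square_wave_partial` is ours):

```python
import numpy as np

def square_wave_partial(x, n_terms):
    """Partial Fourier sum of a unit square wave:
    (4/pi) * sum over odd k of sin(k*x)/k."""
    s = np.zeros_like(x, dtype=float)
    for k in range(1, 2 * n_terms, 2):  # odd harmonics 1, 3, 5, ...
        s += np.sin(k * x) / k
    return 4.0 / np.pi * s

val = square_wave_partial(np.array([np.pi / 2]), 200)
```

With enough terms, the partial sum at the middle of the “high” half-period approaches the square wave's value of 1.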
Moving square waves
(Figure: space-time plot and its spatial/temporal frequency spectrum.)
Jittered square wave vs. blurred square wave
(Figure: space-time plots and their spatial/temporal frequency spectra. Demo parameters: alpha: 0, squareFlag: 1, offset: period/4.)
(Figure: the visual signal in space-time and in the frequency domain. Demo parameters: alpha: 0, squareFlag: 1, offset: period/4.)
(Demo sequence: blend between the two conditions, varying the fraction of square wave vs. the fundamental frequency, at faster display speeds. Parameters shown: alpha: 1 or 0, squareFlag: 1, offset: period/4; finally, the fast blended condition.)
Lecture summary
• We have “inverted U shaped” sensitivity to spatial
frequencies, peaking at 6 cycles per degree.
• We discussed ways to filter out different spatial frequency
components of an image.
• Aliasing: “blur before you subsample”.
• Spatio-temporal filtering enables motion analysis.
• The motion illusion gives evidence that temporal filtering mechanisms are involved in our motion processing.