Spatial and Temporal Linear Filters: 6.8300/6.8301 Advances in Computer Vision


Lecture 4

Spatial and Temporal Linear Filters

Spring 2024
6.8300/6.8301 Advances in Computer Vision Sara Beery, Kaiming He, Vincent Sitzmann, Mina Konaković Luković
Announcements

• Pset 1 is out (due next Thu)

• Lecture recordings, lecture slides, course notes available at https://advances-in-vision.github.io/schedule.html

• Register for Piazza and Canvas (participation on Piazza rewarded)

• Max 3 person team for final project

• Check new tutorial rooms on webpage/Canvas
Remember, an image is just an array of numbers
What we see What the machine gets
Some visual areas…

Slides from M. Lewicki


Signals and systems
Input Output

One important class of systems is the set of linear systems.

A function f is linear if it satisfies:

f(αx) = αf(x)
f(x + y) = f(x) + f(y)
We need translation invariance

Now we also want translation-invariant operations. These can be linear or non-linear; we will focus on linear translation-invariant operations.
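The two linearity conditions and translation invariance can be checked numerically. The sketch below uses circular convolution with a small kernel as an assumed example system; the function name `system` and the kernel values are illustrative and not from the slides.

```python
import numpy as np

def system(x, h=np.array([1.0, 2.0, 1.0]) / 4.0):
    """Hypothetical example system: circular convolution of x with kernel h."""
    n = len(x)
    # Multiplying DFTs implements circular convolution, so shifts wrap around.
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h, n)))

rng = np.random.default_rng(0)
x, y = rng.normal(size=16), rng.normal(size=16)
alpha = 3.0

homogeneous = np.allclose(system(alpha * x), alpha * system(x))
additive = np.allclose(system(x + y), system(x) + system(y))
# Translation invariance: shifting the input shifts the output by the same amount.
shift_invariant = np.allclose(system(np.roll(x, 5)), np.roll(system(x), 5))
print(homogeneous, additive, shift_invariant)
```

A pointwise nonlinearity such as squaring would pass the shift test but fail the two linearity tests.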
[Figure: the same classifier f is applied at every location of the image, labeling each window “Bird” or “Sky”.]

Convolution
Input Image Output Image

The same weighting occurs within each window
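A minimal sketch of direct 2D convolution, written with explicit loops so that the shared weighting is visible. The function name `convolve2d_valid` and the choice to keep only positions where the kernel fits entirely inside the image are assumptions made for illustration.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Direct 2D convolution, keeping only positions where the kernel fits."""
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]  # convolution flips the kernel; correlation does not
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # The same weights are applied to every window of the input.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * flipped)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
# An impulse one column to the right of the kernel center shifts the image right by one.
kernel = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]])
result = convolve2d_valid(image, kernel)
```

With this off-center impulse, each output pixel equals the input pixel one column to its left, which is a quick way to confirm that the kernel is being flipped.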


Fourier series
Any function f(t) with t in the interval (0,π) could be expressed as an infinite linear combination of harmonically related sinusoids:

with

One of Fourier’s original examples of sine series is the expansion of the ramp signal:
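The formulas on this slide did not survive extraction. Assuming the standard sine series, f(t) = Σ a_n sin(nt) for n ≥ 1 with a_n = (2/π) ∫ f(t) sin(nt) dt over (0,π), the ramp example is t/2 = sin(t) − sin(2t)/2 + sin(3t)/3 − … The sketch below checks that expansion numerically; the function name and the numbers of terms are illustrative.

```python
import numpy as np

def ramp_partial_sum(t, n_terms):
    """Partial sum of sin(t) - sin(2t)/2 + sin(3t)/3 - ..., which tends to t/2."""
    n = np.arange(1, n_terms + 1)
    signs = (-1.0) ** (n + 1)
    return np.sum(signs * np.sin(np.outer(t, n)) / n, axis=1)

t = np.linspace(0.1, 2.5, 25)  # stay inside (0, pi)
for n_terms in (5, 50, 5000):
    err = np.max(np.abs(ramp_partial_sum(t, n_terms) - t / 2))
    print(n_terms, err)
```

The error shrinks slowly as terms are added, which is typical when the periodic extension of the signal has a jump.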

The Discrete Fourier Transform
The Discrete Fourier Transform (DFT) transforms a signal f[n] into F[u] as:

$$F[u] = \sum_{n=0}^{N-1} f[n] \exp\left(-2\pi j \frac{un}{N}\right)$$

The inverse of the DFT is:

$$f[n] = \frac{1}{N} \sum_{u=0}^{N-1} F[u] \exp\left(2\pi j \frac{un}{N}\right)$$

The signal f[n] is a weighted linear combination of complex exponentials with weights F[u].
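The DFT and its inverse can be written as matrix products, one row of complex exponentials per frequency. This is a sketch for small N (the function names `dft` and `idft` are illustrative), compared against `np.fft.fft`, which uses the same sign convention.

```python
import numpy as np

def dft(f):
    """F[u] = sum_n f[n] exp(-2*pi*j*u*n/N), evaluated as a matrix product."""
    N = len(f)
    n = np.arange(N)
    basis = np.exp(-2j * np.pi * np.outer(n, n) / N)  # rows index u, columns index n
    return basis @ f

def idft(F):
    """f[n] = (1/N) sum_u F[u] exp(+2*pi*j*u*n/N)."""
    N = len(F)
    u = np.arange(N)
    basis = np.exp(2j * np.pi * np.outer(u, u) / N)
    return (basis @ F) / N

f = np.array([1.0, 2.0, 0.0, -1.0, 3.0, 0.5])
F = dft(f)
matches_numpy = np.allclose(F, np.fft.fft(f))
round_trip = np.allclose(idft(F), f)
print(matches_numpy, round_trip)
```

The matrix form costs O(N²) operations; the FFT computes the same values in O(N log N).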
Visualizing the image Fourier transform

$$F[u, v] = \sum_{n=0}^{N-1} \sum_{m=0}^{M-1} f[n, m] \exp\left(-2\pi j \left(\frac{un}{N} + \frac{vm}{M}\right)\right)$$

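The values of F[u, v] are complex.

Using the real and imaginary components:

$$F[u, v] = \mathrm{Re}\{F[u, v]\} + j\,\mathrm{Im}\{F[u, v]\}$$

Or using a polar decomposition:

$$F[u, v] = A[u, v] \exp(j\,\theta[u, v])$$

where A[u, v] is the amplitude and θ[u, v] is the phase.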
Visualizing the image Fourier transform

[Figure: example images f[n, m] (axes n, m) shown next to their Fourier transforms F[u, v] (axes u, v).]

Location information goes into the phase, strength goes into magnitude
Simple Fourier transforms

Images are 64x64 pixels. The wave is a cosine, therefore DFT phase is zero.
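The zero-phase claim can be checked with a synthetic 64x64 cosine image. The frequency `u0 = 4` is an arbitrary value chosen for this sketch; any integer number of cycles across the image behaves the same way.

```python
import numpy as np

N = 64
n = np.arange(N)
u0 = 4  # hypothetical: number of cycles across the image along n
image = np.cos(2 * np.pi * u0 * n / N)[:, None] * np.ones((1, N))

F = np.fft.fft2(image)
amplitude = np.abs(F)
# Indices of the nonzero coefficients: (u0, 0) and (N - u0, 0), i.e. +/- u0.
peaks = sorted((int(u), int(v)) for u, v in np.argwhere(amplitude > 1e-6))
max_imag = np.max(np.abs(F.imag))
print(peaks, max_imag)
```

Replacing the cosine with a sine keeps the same two amplitude peaks but moves all the energy into the imaginary part, which is a phase of ±π/2.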
High frequency means farther from the origin.


Today: A collection of useful filters in space and time, and aliasing.

Low-pass filters and band-pass filters
Low-pass filters
Box filter

[Figure: box filter kernel of size (2N+1) x (2M+1) with all coefficients equal to 1; 1D example h[n] with N = 1.]
Box filter
[Figure: a 256x256 image convolved with a 21x21 box (mean) kernel, giving a 256x256 output.]

What does it do?


• Replaces each pixel with an average of its neighborhood
• Achieves a smoothing effect (removes sharp features)
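A minimal sketch of a box filter, assuming an odd kernel size and edge replication at the borders (the slides do not specify border handling). It uses the fact that the box kernel is separable, so the 2D average is a row average followed by a column average.

```python
import numpy as np

def box_filter(image, size):
    """Replace each pixel with the mean of its size x size neighborhood.

    Assumes an odd size. Borders are handled by replicating edge pixels,
    which is one choice among several.
    """
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")
    kernel = np.ones(size) / size
    # The box filter is separable: average along rows, then along columns.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

rng = np.random.default_rng(0)
image = rng.random((32, 32))
blurred = box_filter(image, 5)
print(image.std(), blurred.std())
```

The drop in standard deviation on a noise image is the smoothing effect in numbers.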
3-tap box filter versus b2[n]

[1 1 1] versus [1 2 1]

Which one is a better low-pass filter?

DFT of [1 2 1] (N = 20): 2 + 2 cos(2π u/20)

DFT of [1 1 1] (N = 20): 1 + 2 cos(2π u/20)
h1[n] vs b2[n] (here * denotes convolution)

[1, 1, 1] * […, 1, -1, 1, -1, 1, -1, …] = […, -1, 1, -1, 1, -1, 1, …]

[1, 2, 1] * […, 1, -1, 1, -1, 1, -1, …] = […, 0, 0, 0, 0, 0, 0, …]
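The comparison can be reproduced in a few lines. The sketch filters the highest-frequency discrete signal with both kernels, then evaluates the two frequency responses for N = 20; the variable names are illustrative.

```python
import numpy as np

box3 = np.array([1.0, 1.0, 1.0])
b2 = np.array([1.0, 2.0, 1.0])

# Highest-frequency discrete signal: alternating +1 / -1.
alternating = np.array([1.0, -1.0] * 8)
out_box = np.convolve(alternating, box3, mode="valid")
out_b2 = np.convolve(alternating, b2, mode="valid")
print(out_box[:4])  # still alternating, with sign opposite to the centered input sample
print(out_b2[:4])   # completely removed

# Frequency responses for N = 20 with each kernel centered at n = 0.
u = np.arange(20)
H_box = 1 + 2 * np.cos(2 * np.pi * u / 20)
H_b2 = 2 + 2 * np.cos(2 * np.pi * u / 20)

centered_b2 = np.zeros(20)
centered_b2[[0, 1, -1]] = [2.0, 1.0, 1.0]  # [1 2 1] centered at n = 0, wrapping around
matches_dft = np.allclose(np.fft.fft(centered_b2), H_b2)
```

At u = 10, the highest frequency, the box response is −1 while the binomial response is 0, so [1 2 1] attenuates high frequencies without the sign flip.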

Gaussian filter
In the continuous domain:

Gaussian filter

Continuous Gaussian:

Discretization of the Gaussian:
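The formulas on this slide did not survive extraction. Assuming the standard isotropic Gaussian g(x, y; σ) = 1/(2πσ²) exp(−(x² + y²)/(2σ²)), one common discretization samples it at integer positions and renormalizes. The truncation radius of about 3σ and the function name are assumptions of this sketch, not something stated in the lecture.

```python
import numpy as np

def gaussian_kernel_2d(sigma, radius=None):
    """Sample exp(-(n^2 + m^2) / (2 sigma^2)) on an integer grid and normalize."""
    if radius is None:
        radius = int(np.ceil(3 * sigma))  # assumed truncation at about 3 sigma
    n = np.arange(-radius, radius + 1)
    g1d = np.exp(-n**2 / (2 * sigma**2))
    kernel = np.outer(g1d, g1d)  # the 2D Gaussian is separable
    return kernel / kernel.sum()  # renormalize so the weights sum to 1

kernel = gaussian_kernel_2d(sigma=1.0)
print(kernel.shape)
```

Renormalizing after sampling keeps the mean brightness of a filtered image unchanged, which the continuous normalization constant alone does not guarantee once the kernel is truncated.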

Scale

Gaussian filter for low-pass filtering
Dali
Properties of the Gaussian filter

• The (continuous) Fourier transform of a Gaussian is another Gaussian

• The convolution of two n-dimensional Gaussians is an n-dimensional Gaussian, where the variance of the result is the sum of the two variances: σ² = σ₁² + σ₂²

(it is easy to prove this using the FT of the Gaussian)
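The variance property can be checked numerically in 1D with sampled Gaussians. The values σ = 2 and σ = 3 and the helper names are arbitrary choices for this sketch.

```python
import numpy as np

def sampled_gaussian(sigma, radius):
    n = np.arange(-radius, radius + 1)
    g = np.exp(-n**2 / (2 * sigma**2))
    return g / g.sum()

def variance(kernel):
    """Variance of a normalized, centered kernel treated as a distribution."""
    n = np.arange(len(kernel)) - (len(kernel) - 1) / 2
    return np.sum(kernel * n**2)

g1 = sampled_gaussian(sigma=2.0, radius=20)
g2 = sampled_gaussian(sigma=3.0, radius=20)
g12 = np.convolve(g1, g2)  # full convolution of the two kernels

print(variance(g1), variance(g2), variance(g12))
```

In practice this means that blurring twice with σ₁ and σ₂ is equivalent to blurring once with √(σ₁² + σ₂²).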


Binomial filter

• Binomial coefficients provide a compact approximation of the Gaussian coefficients using only integers

• The simplest blur filter (low pass) is [1 1]

• Binomial filters are the family of filters obtained as successive convolutions of [1 1]
Binomial filter

b1 = [1 1]

b2 = [1 1] * [1 1] = [1 2 1]

b3 = [1 1] * [1 1] * [1 1] = [1 3 3 1]
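The family can be generated by repeated convolution. This is a sketch with an illustrative function name, where `n` counts how many copies of [1 1] are convolved together.

```python
import numpy as np

def binomial_filter(n):
    """b_n: the result of convolving n copies of [1 1] (n >= 1)."""
    b = np.array([1])
    for _ in range(n):
        b = np.convolve(b, [1, 1])
    return b

for n in range(1, 5):
    print(n, binomial_filter(n))
```

The coefficients are the rows of Pascal's triangle, so b_n has n + 1 taps.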

Properties of binomial filters

• Sum of the values is 2^n
• The variance of b_n is σ² = n/4
• The convolution of two binomial filters is also a binomial filter, b_n * b_m = b_{n+m}, with a variance: σ²_{n+m} = σ²_n + σ²_m = (n + m)/4

Note: These properties are analogous to the Gaussian property in the continuous domain (but the binomial filter is different from a discretization of a Gaussian)
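A short numerical check of these properties, assuming b_n denotes n copies of [1 1] convolved together and that the variance is taken over tap positions after normalizing the weights to sum to 1. The helper names and the values n = 4, m = 6 are illustrative.

```python
import numpy as np

def binomial_filter(n):
    b = np.array([1.0])
    for _ in range(n):
        b = np.convolve(b, [1.0, 1.0])
    return b

def variance(weights):
    """Variance of the positions 0..len-1 under the normalized weights."""
    p = weights / weights.sum()
    k = np.arange(len(p))
    mean = np.sum(p * k)
    return np.sum(p * (k - mean) ** 2)

n, m = 4, 6
bn, bm = binomial_filter(n), binomial_filter(m)
b_nm = np.convolve(bn, bm)

print(bn.sum(), 2**n)                 # sum of the values versus 2^n
print(variance(bn), n / 4)            # variance of b_n versus n/4
print(variance(b_nm), (n + m) / 4)    # variance after convolution
```

The match with n/4 is exact because the normalized taps are a binomial distribution with n trials and probability 1/2.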

B2[n]


2D version by convolving 2 1D filters


What about the opposite of blurring?
[Figure: Gaussian filter and Laplacian filter kernels.]

Sharpening
[Figure: a sharpening filter obtained by combining the Gaussian filter and the Laplacian filter.]

Our system responds to different spatial frequencies differently


Contrast Sensitivity Function
Blakemore & Campbell (1969)

Maximum sensitivity ~ 6 cycles / degree of visual angle

[Figure: contrast sensitivity versus spatial frequency (cycles/degree, log axis from 0.1 to 100, low to high), with regions labeled visible and invisible. At the low end: things that are very close and/or large are hard to see. At the high end: things far away are hard to see.]

When you’re far away you just see the low spatial frequencies; when you are close you see the high spatial frequencies.
Hybrid Images
Oliva & Schyns


We start by taking two pictures. We isolate the details and contours in one (here, the woman) and blur the second face. As you can see, the man seems to go out of focus and the details of the woman are superposed.
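The recipe in this note can be sketched as a low-pass filtered image plus a high-pass filtered image. The Gaussian blur, the two σ values, the function names and the random placeholder inputs are all assumptions for illustration; the original hybrid images were built from aligned photographs with filters chosen by their authors.

```python
import numpy as np

def gaussian_blur(image, sigma):
    """Separable Gaussian low-pass filter with edge replication at the borders."""
    radius = int(np.ceil(3 * sigma))
    n = np.arange(-radius, radius + 1)
    g = np.exp(-n**2 / (2 * sigma**2))
    g /= g.sum()
    padded = np.pad(image, radius, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, g, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, g, mode="valid")

def hybrid_image(far_image, near_image, sigma_low=6.0, sigma_high=3.0):
    """Low frequencies of far_image plus high frequencies of near_image."""
    low = gaussian_blur(far_image, sigma_low)
    high = near_image - gaussian_blur(near_image, sigma_high)  # details and contours
    return low + high

# Placeholder inputs; in practice these would be two aligned grayscale photos.
rng = np.random.default_rng(0)
far_image = rng.random((64, 64))
near_image = rng.random((64, 64))
hybrid = hybrid_image(far_image, near_image)
```

Viewed from far away, only the `low` component is resolved; up close, the `high` component dominates.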
Hybrid Images

[Figure: low-pass filtered image + high-pass filtered image = hybrid image.]
http://cvcl.mit.edu/hybrid_gallery/gallery.html
High-pass filters
Finding edges in the image
Image gradient:

Approximation of the image derivative:

Edge strength:

Edge orientation:

Edge normal:
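The formulas for these quantities did not survive extraction. The sketch below assumes the standard definitions: the gradient is (∂I/∂x, ∂I/∂y), the derivative is approximated by a backward difference, edge strength is the gradient magnitude, edge orientation is the gradient angle, and the edge normal is the unit vector along the gradient. The function name and the synthetic step image are illustrative.

```python
import numpy as np

def image_gradient(image):
    """Finite-difference gradient: I(x, y) - I(x - 1, y) and I(x, y) - I(x, y - 1)."""
    gx = np.zeros_like(image, dtype=float)
    gy = np.zeros_like(image, dtype=float)
    gx[:, 1:] = image[:, 1:] - image[:, :-1]  # x runs along columns
    gy[1:, :] = image[1:, :] - image[:-1, :]  # y runs along rows
    return gx, gy

# Synthetic image with a vertical step edge between columns 3 and 4.
image = np.zeros((8, 8))
image[:, 4:] = 1.0

gx, gy = image_gradient(image)
strength = np.hypot(gx, gy)        # edge strength |grad I|
orientation = np.arctan2(gy, gx)   # edge orientation, angle of grad I
# Edge normal: unit vector along the gradient wherever the gradient is nonzero.
safe = np.where(strength > 0, strength, 1.0)
normal = np.stack([gx / safe, gy / safe], axis=-1)
```

For the vertical step, the strength is nonzero only in column 4 and the normal there points along +x, from the dark side to the bright side.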

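Image derivative [-1 1]

[Figure: convolution with the kernel h[m,n] = [-1, 1]; panels labeled g[m,n], h[m,n], and f[m,n].]

Image derivative [-1 1]^T

[Figure: convolution with the kernel h[m,n] = [-1, 1]^T; panels labeled g[m,n], h[m,n], and f[m,n].]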
Discrete derivatives


This version can give you better centering. In the first example the edge would be located at a half-pixel position.
Derivatives
We want to compute the image derivative:

If there is noise, we might want to “smooth” it with a blurring filter

But derivatives and convolutions are linear and we can move them around:
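Because both operations are convolutions, smoothing and then differentiating gives the same result as convolving once with a pre-differentiated blur kernel. The 1D noisy step, the difference kernel and the σ = 2 Gaussian below are illustrative choices for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
signal = np.concatenate([np.zeros(32), np.ones(32)]) + 0.1 * rng.normal(size=64)

d = np.array([1.0, -1.0])       # np.convolve with this kernel gives f[n] - f[n-1]
n = np.arange(-6, 7)
g = np.exp(-n**2 / (2 * 2.0**2))
g /= g.sum()                    # Gaussian blur with sigma = 2

smooth_then_diff = np.convolve(np.convolve(signal, g), d)
dg = np.convolve(g, d)          # precomputed derivative-of-Gaussian kernel
single_pass = np.convolve(signal, dg)
print(np.allclose(smooth_then_diff, single_pass))
```

Precomputing `dg` means the image is filtered once instead of twice.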

Gaussian derivatives

The continuous derivative is:
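The formula on this slide did not survive extraction. Assuming the normalized isotropic Gaussian, the derivative along x follows from the chain rule:

```latex
g(x, y; \sigma) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right)

\frac{\partial g(x, y; \sigma)}{\partial x}
  = -\frac{x}{\sigma^2}\, g(x, y; \sigma)
  = -\frac{x}{2\pi\sigma^4} \exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right)
```

The result is odd in x and integrates to zero, so it responds to changes in intensity and not to constant regions.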

Gaussian Scale

σ=2 σ=4 σ=8



Analyze edges of the image (high-frequency edges) at a different scale


Derivatives of Gaussians: Scale

σ=2 σ=4 σ=8



Edges at a different spatial scale - not picking strips, picking just edges of the body
Orientation

Sampling

You should always blur your image before you subsample it!
Sampling
Pixels

Continuous world


The question is: how many samples do we need to capture the continuous signal well?
Aliasing

Let’s start with this continuous image (it is not really continuous…)
Subsample it (take every fourth pixel or so)

Both waves fit the same samples. Aliasing consists in “perceiving” the red wave when the actual input was the blue wave.
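Two sinusoids that fit the same samples are easy to construct. In this sketch the sampling frequency and the two signal frequencies are hypothetical values: a cosine at 9 cycles per unit, sampled 10 times per unit, yields exactly the samples of a cosine at 1 cycle per unit.

```python
import numpy as np

fs = 10.0                      # sampling frequency (samples per unit length), hypothetical
f_high = 9.0                   # actual input frequency, above the Nyquist limit fs / 2
f_alias = fs - f_high          # 1.0: the low frequency that fits the same samples

n = np.arange(20)
t = n / fs
samples_high = np.cos(2 * np.pi * f_high * t)
samples_alias = np.cos(2 * np.pi * f_alias * t)
same_samples = np.allclose(samples_high, samples_alias)
print(same_samples)
```

From the samples alone there is no way to tell which of the two waves was the input.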

[Figure: spatial domain and frequency domain panels. The red curve is the signal (sinusoid + constant); blue shows the sampled signal. The frequency domain panel shows the signal baseband spectrum and the replicated spectra. Sampled at the Nyquist frequency.]

The spacing between the replicated spectra is equal to the sampling frequency.


[Figure: the same panels repeated over several slides; the last one marks aliased components in the frequency domain.]

A high frequency appears to be a low frequency of another signal. To avoid this situation, apply a low-pass filter first, so that you capture at least 2 samples per period of your sinusoid.
Aliasing

[Figure: aliasing shown in the spatial domain and in the frequency domain.]

The copies of the spectrum repeat at a spacing set by the sampling frequency.
Antialiasing filtering
Before sampling, apply a low-pass filter to remove all the frequencies that will produce aliasing: “blur before you subsample”.

Without antialiasing filter.

With antialiasing filter.
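A 1D sketch of "blur before you subsample". The Gaussian low-pass filter, its width (σ equal to half the subsampling factor), the wrap-around border handling and the test signal are assumptions chosen for illustration, not the filter used in the lecture figures.

```python
import numpy as np

def subsample(signal, factor, antialias=True):
    """Keep every factor-th sample, optionally low-pass filtering first."""
    if antialias:
        # Assumed filter: Gaussian with sigma tied to the subsampling factor.
        sigma = factor / 2.0
        radius = int(np.ceil(3 * sigma))
        n = np.arange(-radius, radius + 1)
        g = np.exp(-n**2 / (2 * sigma**2))
        g /= g.sum()
        signal = np.convolve(np.pad(signal, radius, mode="wrap"), g, mode="valid")
    return signal[::factor]

x = np.arange(256)
# 112 cycles over 256 samples: above the Nyquist limit once we keep every 4th sample.
fine_detail = np.cos(2 * np.pi * 112 * x / 256)

naive = subsample(fine_detail, 4, antialias=False)
filtered = subsample(fine_detail, 4, antialias=True)
print(np.abs(naive).max(), np.abs(filtered).max())
```

Without the filter, the fine detail reappears at full amplitude as a lower-frequency pattern; with the filter it is almost entirely removed before sampling.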
• Temporal filtering
• Motion illusion, involving aliasing, addressing whether humans match spatial patterns, or use temporal filters, to measure motion.
Temporal filtering

Why filter videos over time?


Sequences

[Figure: video frames stacked along the time axis into a space-time cube.]

Cube size = 128x128x90

Slope says how fast it’s moving

[Figure: a box moving with speed vx, shown as a space-time plot over t.]
Global constant motion

(vx,vy)

A global motion of the image can be written as:

where:
The Fourier transform of the moving signal is the transform of the original signal multiplied by a delta function.
The Fourier transform of a box is a sinc function (sin x / x).
The Fourier domain is just a different representation that makes certain operations simpler. For example, motion can be characterized by computing the orientation of the signal, which tracks the direction of motion.
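The equations on this slide did not survive extraction. Assuming the standard model f(x, y, t) = f0(x − vx t, y − vy t), the Fourier transform is F0(ωx, ωy) δ(ωt + vx ωx + vy ωy): all the energy lies on a plane whose tilt encodes the velocity. The sketch below checks the 1D analogue with a randomly chosen signal, an integer speed and wrap-around borders, all of which are illustrative assumptions.

```python
import numpy as np

N = 32          # samples in space and in time (equal so that frequencies line up)
v = 3           # hypothetical speed in pixels per frame
rng = np.random.default_rng(0)
f0 = rng.random(N)

# video[x, t] = f0[x - v t], with wrap-around at the borders.
video = np.stack([np.roll(f0, v * t) for t in range(N)], axis=1)

F = np.fft.fft2(video)          # axis 0: spatial frequency u, axis 1: temporal frequency w
u, w = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
on_line = (w + v * u) % N == 0  # the line w = -v u (modulo N)
energy_off_line = np.abs(F[~on_line]).max()
energy_on_line = np.abs(F[on_line]).max()
print(energy_on_line, energy_off_line)
```

Changing `v` rotates the line in the (u, w) plane, which is the sense in which orientation in the Fourier domain measures motion.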
Temporal Gaussian

This filter keeps stationary things sharp, and blurs moving things.

Filters remove some frequencies we don’t like and keep the ones we like.
Blurring more in time than in space with this filter.
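A sketch of a temporal Gaussian applied to a synthetic video with one stationary bar and one moving bar. The (time, height, width) layout, σ_t = 2 frames, the edge replication in time and the function name are assumptions for illustration.

```python
import numpy as np

def temporal_gaussian(video, sigma_t):
    """Blur a (time, height, width) video along the time axis only."""
    radius = int(np.ceil(3 * sigma_t))
    n = np.arange(-radius, radius + 1)
    g = np.exp(-n**2 / (2 * sigma_t**2))
    g /= g.sum()
    padded = np.pad(video, ((radius, radius), (0, 0), (0, 0)), mode="edge")
    return np.apply_along_axis(np.convolve, 0, padded, g, mode="valid")

T, H, W = 30, 8, 32
video = np.zeros((T, H, W))
video[:, :, 4] = 1.0                    # stationary bar
for t in range(T):
    video[t, :, 10 + t % 20] = 1.0      # bar moving one pixel per frame

blurred = temporal_gaussian(video, sigma_t=2.0)
print(blurred[15, 0, 4], blurred[15, 0, 25])
```

The stationary bar keeps its full contrast, while the moving bar is spread over neighboring frames and loses most of its peak value.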
Spatio-temporal Gaussian
How could we create a filter that keeps objects moving at some velocity (vx, vy) sharp while blurring the rest?

We can design Gaussians that pick up a particular direction.


Grab stationary objects
Grab people moving to the left…
Quadrature pair of Gabor filters

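u0 = 0.1

$$\psi_c(x, y) = e^{-\frac{x^2 + y^2}{2\sigma^2}} \cos(2\pi u_0 x)$$

$$\psi_s(x, y) = e^{-\frac{x^2 + y^2}{2\sigma^2}} \sin(2\pi u_0 x)$$

A Gabor filter is a Gaussian envelope multiplied by a cosine (or, for its quadrature partner, a sine).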
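A sketch that builds the quadrature pair on a pixel grid. Only u0 = 0.1 comes from the slide; the kernel size, σ = 8 and the function name are assumed values for illustration.

```python
import numpy as np

def gabor_pair(size=65, sigma=8.0, u0=0.1):
    """Cosine (even) and sine (odd) Gabor filters sharing one Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    psi_c = envelope * np.cos(2 * np.pi * u0 * x)
    psi_s = envelope * np.sin(2 * np.pi * u0 * x)
    return psi_c, psi_s

psi_c, psi_s = gabor_pair()
# Squaring and summing the pair recovers the squared envelope, with no ripples.
energy = psi_c**2 + psi_s**2
```

The same squaring and summing applied to the two filter outputs gives a local energy measure that does not depend on the phase of the underlying pattern.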


Using phase changes of local Gabor filters to analyze or generate motion

$$\psi_c(x, y) = e^{-\frac{x^2 + y^2}{2\sigma^2}} \cos(2\pi u_0 x + \varphi t)$$

Space-time plot of a slice through the spatio-temporal filter of the previous slide

[Figure: space-time (x, t) slice of the filter above.]

Orientation in space time is speed in real world


Motion without movement

Spatio-temporal sampling illusion, due to Edward Adelson and Jim Bergen
Evidence for filter-based analysis of motion in the human visual system shown via spatio-temporal visual illusion based on sampling

Two potential theories for how humans compute our motion perceptions:

(a) We match the pattern in the image that we see at one moment and compare it with what we see at subsequent times.
(b) We use spatio-temporal filters to measure spatio-temporal energy in order to measure local motion.

This illusion favors one theory over the other.
[Figure: static sinusoid. Visual signal plotted as space versus time (this “video” is static), with its spectrum plotted as spatial frequency versus temporal frequency. Parameters: alpha: 1, squareFlag: 0, offset: 0.]

Start with a picture of a sine wave: a static wave that is not moving in time.
[Figure: moving sinusoid. Visual signal (space versus time) and its spectrum (spatial frequency versus temporal frequency).]
A square wave is an infinite sum of sinusoids
filters to analyze motion

Instead of looking at sinusoids we are going to use square waves

Add higher-frequency sines; the square wave is represented as a sum of many sines.
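The standard expansion of a unit square wave is (4/π) Σ sin(kx)/k over odd k. The sketch below adds harmonics and measures how close the partial sum gets to 1 on the positive half-period; the function name and the harmonic counts are illustrative.

```python
import numpy as np

def square_wave_partial_sum(x, n_harmonics):
    """(4 / pi) * sum over odd k of sin(k x) / k, truncated to n_harmonics terms."""
    k = 2 * np.arange(n_harmonics) + 1     # 1, 3, 5, ...
    return (4 / np.pi) * np.sum(np.sin(np.outer(x, k)) / k, axis=1)

x = np.linspace(0.5, np.pi - 0.5, 50)      # away from the jumps at 0 and pi
for n_harmonics in (1, 5, 200):
    err = np.max(np.abs(square_wave_partial_sum(x, n_harmonics) - 1.0))
    print(n_harmonics, err)
```

Only odd harmonics appear, with amplitudes falling off as 1/k, so the fundamental carries most of the energy.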


[Figure: moving square waves. Visual signal (space versus time) and its spectrum (spatial frequency versus temporal frequency).]
[Figure: jitter square wave. Visual signal (space versus time) and its spectrum (spatial frequency versus temporal frequency). Parameters: alpha: 1, squareFlag: 1, offset: period/4.]


[Figure: blur condition. Visual signal (space versus time) and its spectrum (spatial frequency versus temporal frequency). Parameters: alpha: 0, squareFlag: 1, offset: period/4.]

[Figure: blend over the two conditions; axis label: fraction of square wave fundamental frequency.]

[Figure: the same stimuli at faster display speed, for alpha: 1 and for alpha: 0 (squareFlag: 1, offset: period/4), followed by a fast blended version.]
Lecture summary
• We have “inverted U shaped” sensitivity to spatial frequencies, peaking at about 6 cycles per degree.
• We discussed ways to filter out different spatial frequency components of an image.
• Aliasing: “blur before you subsample”.
• Spatio-temporal filtering enables motion analysis.
• Motion illusion gives evidence that some temporal filtering mechanisms are involved in our motion processing.
