Conv2d Intro


Convolution and Cross-correlation
Dr. Thanh-Sach LE
LTSACH@hcmut.edu.vn

GVLab: Graphics and Vision Laboratory
Faculty of Computer Science and Engineering, HCMUT
Contents
❖ What & Why?
❖ Mathematical Definition
❖ Computation of convolution
What & Why?
❖ Convolution and cross-correlation
✤ are important operations in signal processing
✤ transform an input signal (e.g. an image) into an output feature map, and an input feature map into an output feature map
✤ hence, in deep neural networks, they are used to learn (extract) features from input signals

❖ Convolution in Deep Learning
✤ Convolution is widely used to design deep neural networks (see the following slides)
Figure: AlexNet (2012) architecture, from input to output: Image (3x227x227) → CONV → NORM → POOL → CONV → NORM → POOL → CONV → CONV → CONV → POOL → FC → FC → FC → Softmax → discrete distribution for 1000 classes.
Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks,” in Advances in Neural Information Processing Systems, pp. 1097-1105, 2012
In AlexNet, the CONV layers extract features by transforming raw pixels into a meaningful representation.
Figure: VGG-16 (2014) architecture, from input to output: CONV → CONV → POOL → CONV → CONV → POOL → CONV → CONV → CONV → POOL → CONV → CONV → CONV → POOL → CONV → CONV → CONV → POOL → FC → FC → FC → Softmax. The five CONV/POOL blocks extract features by transforming raw pixels into a meaningful representation.
Karen Simonyan, Andrew Zisserman, “Very Deep Convolutional Networks for Large-Scale Image Recognition,” arXiv:1409.1556v6
Figure: UNet (2015) architecture. Here too, convolution is used to extract features by transforming raw pixels into a meaningful representation.
Olaf Ronneberger, Philipp Fischer, Thomas Brox, “U-Net: Convolutional Networks for Biomedical Image Segmentation,” arXiv:1505.04597v1 [cs.CV]
Mathematical Definition

❖ Computation node: X and W enter the CONV node, which outputs Y
✤ Input X (image or feature map):
x11 x12 x13
x21 x22 x23
x31 x32 x33
✤ Convolution's parameters W (filter's kernel):
w11 w12
w21 w22
✤ Output Y (a feature map):
y11 y12
y21 y22

❖ Notation:
Y = X*W
* : NOT a matrix multiplication
❖ Definition:
✤ Figure: the input X is indexed by (u, v), with indices 0, 1, …, 6 along each axis; the kernel W is indexed by (i, j) relative to its center, with i and j running from −3 to 3.
✤ Radius of kernel: W : 7x7 ⇒ $r = \lfloor 7/2 \rfloor = 3$
❖ Definition (with kernel radius r):
✤ Convolution, $Y = X * W$:
$$Y(u, v) = \sum_{i=-r}^{r} \sum_{j=-r}^{r} X(u - i, v - j)\, W(i, j)$$
✤ Cross-correlation, $Y = X \star W$:
$$Y(u, v) = \sum_{i=-r}^{r} \sum_{j=-r}^{r} X(u + i, v + j)\, W(i, j)$$
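The two sums differ only in the sign of the offsets, which a direct (and deliberately slow) translation makes visible. The sketch below is one possible reading of the definition, not a reference implementation: the function names, the `sign` parameter, the choice to store W(i, j) at array position `[i + r, j + r]`, and the sample values of `X` and `W` are all assumptions made for illustration. Only output positions where the kernel fits entirely inside X are evaluated.

```python
import numpy as np

def apply_kernel(X, W, sign):
    """Evaluate Y(u, v) = sum_{i,j} X(u + sign*i, v + sign*j) * W(i, j).

    sign = -1 gives convolution, sign = +1 gives cross-correlation.
    Assumes a square kernel with odd side length.
    """
    r = W.shape[0] // 2                      # kernel radius
    rows, cols = X.shape
    Y = np.zeros((rows - 2 * r, cols - 2 * r), dtype=np.result_type(X, W))
    for u in range(r, rows - r):
        for v in range(r, cols - r):
            for i in range(-r, r + 1):
                for j in range(-r, r + 1):
                    # W(i, j) is stored at array position [i + r, j + r]
                    Y[u - r, v - r] += X[u + sign * i, v + sign * j] * W[i + r, j + r]
    return Y

def convolve(X, W):
    return apply_kernel(X, W, sign=-1)

def cross_correlate(X, W):
    return apply_kernel(X, W, sign=+1)

# Illustrative values only
X = np.array([[3, 1, 0, 1],
              [1, 1, 2, 0],
              [1, 2, 2, 1],
              [0, 1, 0, 2]])
W = np.array([[1, 0, 2],
              [1, 2, 0],
              [0, 1, 1]])

print(convolve(X, W))
print(cross_correlate(X, W))
```

For a kernel that is not symmetric under a 180° rotation, such as this one, the two printed results differ.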
❖ Convolution vs cross-correlation:
✤ Rot180° rotates the kernel W by 180° around its center:
w11 w12        w22 w21
w21 w22   →    w12 w11
✤ Correlation of X with W = Convolution of X with Rot180°(W); both give the same Y:
$$X \star W = X * \mathrm{Rot180}(W)$$
✤ Likewise, Convolution of X with W = Correlation of X with Rot180°(W); both give the same Y:
$$X * W = X \star \mathrm{Rot180}(W)$$
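Both identities can be checked numerically. The following is a minimal sketch under stated assumptions: `cross_correlate` and `convolve` are hypothetical helper names, the random test data is arbitrary, and `sliding_window_view` requires NumPy 1.20 or later. Convolution is written with mirrored indices rather than by rotating the kernel, so the comparison is not circular.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def cross_correlate(X, W):
    # windows[u, v] is the sub-image of X whose top-left corner is at (u, v)
    windows = sliding_window_view(X, W.shape)
    return np.einsum("uvab,ab->uv", windows, W)

def convolve(X, W):
    k1, k2 = W.shape
    windows = sliding_window_view(X, W.shape)
    Y = np.zeros(windows.shape[:2], dtype=np.result_type(X, W))
    for a in range(k1):
        for b in range(k2):
            # W[a, b] meets the window entry mirrored about the kernel center
            Y += W[a, b] * windows[:, :, k1 - 1 - a, k2 - 1 - b]
    return Y

rng = np.random.default_rng(0)
X = rng.integers(0, 5, size=(6, 6))
W = rng.integers(-2, 3, size=(3, 3))
W_rot = np.rot90(W, 2)                     # Rot180(W)

print(np.array_equal(convolve(X, W), cross_correlate(X, W_rot)))   # True
print(np.array_equal(cross_correlate(X, W), convolve(X, W_rot)))   # True
```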
Computation of convolution

• How to perform the computation (logical view):

Step 1:
• Flip the kernel around the x-axis and then around the y-axis
• Or, rotate the kernel 180° around its center

2×2 kernel:
w11 w12        w22 w21
w21 w22   →    w12 w11

3×3 kernel:
w11 w12 w13        w33 w32 w31
w21 w22 w23   →    w23 w22 w21
w31 w32 w33        w13 w12 w11


Step 2:
• Compute the cross-correlation between the kernel (obtained from Step 1) and the input image:
(a) Align the kernel with the top-left corner of the image
(b) Take the dot product between the kernel and the sub-image covered by the kernel, and assign the result to the output at the corresponding location
(c) Slide the kernel to the right and down; repeat task (b) after each slide

See the following illustration for details


Convolution algorithm (flowchart):
Start
1. Rotate the kernel 180° and flatten it to a vector
2. Pad the input with zeros and allocate the output buffer
3. Align the kernel window with the top-left of the input
4. Extract the sub-image overlapped by the kernel, flatten it to a vector, take its dot product with the flattened kernel, and write the result to the output
5. All output values assigned? If no, slide the kernel window to the next position on the input and go back to 4; if yes, stop
End
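The flowchart maps almost line for line onto a short loop. The sketch below assumes stride 1 and a symmetric zero padding given by a hypothetical `pad` argument; the function name `conv2d` and the sample arrays are illustrative and not part of the slides.

```python
import numpy as np

def conv2d(x, w, pad=0):
    """2-D convolution with stride 1 and optional zero padding."""
    k1, k2 = w.shape
    # 1. Rotate the kernel by 180 degrees and flatten it to a vector
    w_vec = np.rot90(w, 2).flatten()
    # 2. Zero-pad the input and allocate the output buffer
    x = np.pad(x, pad, mode="constant", constant_values=0)
    out_h, out_w = x.shape[0] - k1 + 1, x.shape[1] - k2 + 1
    y = np.zeros((out_h, out_w), dtype=np.result_type(x, w))
    # 3. and 5. Start at the top-left and visit every window position
    for u in range(out_h):
        for v in range(out_w):
            # 4. Extract the sub-image, flatten it, take the dot product
            sub_vec = x[u:u + k1, v:v + k2].flatten()
            y[u, v] = np.dot(sub_vec, w_vec)
    return y

x = np.array([[3, 1, 0, 1],
              [1, 1, 2, 0],
              [1, 2, 2, 1],
              [0, 1, 0, 2]])
w = np.array([[1, 0, 2],
              [1, 2, 0],
              [0, 1, 1]])

print(conv2d(x, w))
# [[12 10]
#  [ 8 12]]
print(conv2d(x, w, pad=1).shape)   # (4, 4)
```

Flattening both the kernel and each sub-image keeps the inner step a plain vector dot product, which is the form the flowchart describes.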
Worked example: Output = CONV(Input image, Filter's kernel) = ?

Input image:
3 1 0 1
1 1 2 0
1 2 2 1
0 1 0 2

Filter's kernel:
1 0 2
1 2 0
0 1 1
Algorithm step 1: Rotate the kernel

W:             Rot180°(W):
1 0 2          1 1 0
1 2 0    →     0 2 1
0 1 1          2 0 1

Rot180°(W) = flip in the horizontal direction + flip in the vertical direction


Algorithm step 1 (continued): Flatten the rotated kernel

Flattening Rot180°(W) row by row gives the vector:
1 1 0 0 2 1 2 0 1
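The rotation and flattening can be reproduced with NumPy. This is a small sketch; `np.flip`, `np.rot90`, and `flatten` are standard NumPy calls, while the variable names are chosen here for illustration.

```python
import numpy as np

W = np.array([[1, 0, 2],
              [1, 2, 0],
              [0, 1, 1]])

# Flip in the vertical direction, then in the horizontal direction
flipped = np.flip(np.flip(W, axis=0), axis=1)
# Two 90-degree turns give a 180-degree rotation
rotated = np.rot90(W, 2)

print(rotated)
# [[1 1 0]
#  [0 2 1]
#  [2 0 1]]
print(np.array_equal(flipped, rotated))   # True
print(rotated.flatten())                  # [1 1 0 0 2 1 2 0 1]
```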
Algorithm step 2: Pad the input and allocate the output

X:
3 1 0 1
1 1 2 0
1 2 2 1
0 1 0 2

This section illustrates convolution with no padding and stride = 1, so X is used as it is.
The output Y has size 2x2; the reason will be clear shortly.
Algorithm step 3: Align the rotated kernel with the input image

Rot180°(W):
1 1 0
0 2 1
2 0 1

Input image with the kernel aligned at the top-left (covered cells shown as kernel value x input value):
1x3 1x1 0x0 1
0x1 2x1 1x2 0
2x1 0x2 1x2 1
0   1   0   2
Algorithm step 4: Compute the dot product (X and W aligned at the top-left)

Sub-image:
3 1 0
1 1 2
1 2 2

Flattened sub-image: 3 1 0 1 1 2 1 2 2
Flattened Rot180°(W): 1 1 0 0 2 1 2 0 1

Dot product:
3x1 + 1x1 + 0x0 +
1x0 + 1x2 + 2x1 +
1x2 + 2x0 + 2x1
= 12
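A single window position can be checked in isolation. In the sketch below, `window_dot` is a hypothetical helper that takes the row and column of the window's top-left corner; it is not a function from the slides.

```python
import numpy as np

x = np.array([[3, 1, 0, 1],
              [1, 1, 2, 0],
              [1, 2, 2, 1],
              [0, 1, 0, 2]])
w_rot = np.array([[1, 1, 0],
                  [0, 2, 1],
                  [2, 0, 1]])
w_vec = w_rot.flatten()            # [1 1 0 0 2 1 2 0 1]

def window_dot(row, col):
    """Dot product of the flattened 3x3 sub-image at (row, col) with w_vec."""
    sub_vec = x[row:row + 3, col:col + 3].flatten()
    return int(np.dot(sub_vec, w_vec))

print(window_dot(0, 0))            # 12
```

Calling the helper with the other three corners, (0, 1), (1, 0) and (1, 1), fills in the rest of the 2x2 output.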
Algorithm step 4 (repeated): kernel slid one column to the right (top-right position)

Sub-image:
1 0 1
1 2 0
2 2 1

Flattened sub-image: 1 0 1 1 2 0 2 2 1
Flattened Rot180°(W): 1 1 0 0 2 1 2 0 1

Dot product:
1x1 + 0x1 + 1x0 +
1x0 + 2x2 + 0x1 +
2x2 + 2x0 + 1x1
= 10

Output so far (top row): 12 10
Algorithm step 4 (repeated): kernel moved to the bottom-left position

Sub-image:
1 1 2
1 2 2
0 1 0

Flattened sub-image: 1 1 2 1 2 2 0 1 0
Flattened Rot180°(W): 1 1 0 0 2 1 2 0 1

Dot product:
1x1 + 1x1 + 2x0 +
1x0 + 2x2 + 2x1 +
0x2 + 1x0 + 0x1
= 8

Output so far: top row 12 10; bottom row 8 (last entry pending)
Algorithm step 4 (repeated): kernel slid one column to the right (bottom-right position)

Sub-image:
1 2 0
2 2 1
1 0 2

Flattened sub-image: 1 2 0 2 2 1 1 0 2
Flattened Rot180°(W): 1 1 0 0 2 1 2 0 1

Dot product:
1x1 + 2x1 + 0x0 +
2x0 + 2x2 + 1x1 +
1x2 + 0x0 + 2x1
= 12

Output:
12 10
8 12
Final result

Input image:
3 1 0 1
1 1 2 0
1 2 2 1
0 1 0 2

Filter's kernel:
1 0 2
1 2 0
0 1 1

Output = CONV(Input image, Filter's kernel):
12 10
8 12
Output size
✤ Input image: 4×4; filter's kernel: 3×3; output: (4 − 3 + 1) × (4 − 3 + 1) = 2×2
✤ In general (no padding, stride = 1): an $i_1 \times i_2$ input convolved with a $k_1 \times k_2$ kernel gives an output of size $(i_1 - k_1 + 1) \times (i_2 - k_2 + 1)$
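The size rule is easy to encode as a helper. The function below is a hypothetical utility, assuming no padding and stride 1 as in the worked example.

```python
def valid_output_shape(input_shape, kernel_shape):
    """Output size (i1 - k1 + 1, i2 - k2 + 1) for no padding and stride 1."""
    i1, i2 = input_shape
    k1, k2 = kernel_shape
    if k1 > i1 or k2 > i2:
        raise ValueError("kernel must not be larger than the input")
    return (i1 - k1 + 1, i2 - k2 + 1)

print(valid_output_shape((4, 4), (3, 3)))   # (2, 2)
```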
