Cvpr17 Pointnet Slides
Charles R. Qi*
Hao Su*
Kaichun Mo
Leonidas J. Guibas
Big Data + Deep Representation Learning
Emerging 3D Applications
3D Representation: Point Cloud
Point cloud is close to raw sensor data
Figure labels: LiDAR, Depth Sensor, Point Cloud, Volumetric, Depth Map
Previous Works
Most existing point cloud features are handcrafted for specific tasks
Source: https://github.com/PointCloudLibrary/pcl/wiki/Overview-and-Comparison-of-Features
Previous Works
Voxelization → 3D CNN
Projection/Rendering → 2D CNN
PointNet
Our Work: PointNet
Object Classification
Unordered Input: a set of N points, each a D-dimensional vector
Permutation Invariance: Symmetric Function
Examples:
$f(x_1, x_2, \ldots, x_n) = \max\{x_1, x_2, \ldots, x_n\}$
$f(x_1, x_2, \ldots, x_n) = x_1 + x_2 + \cdots + x_n$
…
How can we construct a family of symmetric functions by neural networks?
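The two examples above can be spot-checked numerically. The sketch below is only an illustration: `f_first` and `is_symmetric_on` are hypothetical helpers introduced here, and checking every permutation of one small input is a sanity test, not a proof.

```python
import itertools

def f_max(xs):
    """Symmetric: the maximum ignores the order of its arguments."""
    return max(xs)

def f_sum(xs):
    """Symmetric: addition is commutative, so order does not matter."""
    return sum(xs)

def f_first(xs):
    """Not symmetric: the result depends on which element comes first."""
    return xs[0]

def is_symmetric_on(f, xs):
    """Evaluate f on every permutation of xs and check that all results agree."""
    return len({f(list(p)) for p in itertools.permutations(xs)}) == 1

values = [3, 1, 2]
print(is_symmetric_on(f_max, values))    # True
print(is_symmetric_on(f_sum, values))    # True
print(is_symmetric_on(f_first, values))  # False
```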
Observe:
$f(x_1, x_2, \ldots, x_n) = \gamma \circ g(h(x_1), \ldots, h(x_n))$ is symmetric if $g$ is symmetric
Diagram: points (1,2,3), (1,1,1), (2,3,2), …, (2,3,4) → h (applied to each point) → g (simple symmetric function) → γ
Diagram: PointNet (vanilla) as a subset of the symmetric functions
Universal Set Function Approximator
Theorem:
A Hausdorff continuous symmetric function $f : 2^{X} \to \mathbb{R}$ can be arbitrarily approximated by PointNet.
$S \subseteq \mathbb{R}^d$ → PointNet (vanilla)
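For reference, the statement can be written out more formally. The version below paraphrases the theorem as given in the PointNet paper, using the slide's notation with $g = \max$. The tolerance $\epsilon$, the restriction to sets of exactly $n$ points in the unit cube, and the dimension symbol $m$ come from the paper rather than from this slide.

```latex
Let $\mathcal{X} = \{ S : S \subseteq [0,1]^m,\ |S| = n \}$ and let
$f : \mathcal{X} \to \mathbb{R}$ be continuous with respect to the
Hausdorff distance $d_H(\cdot,\cdot)$. Then for every $\epsilon > 0$ there
exist continuous functions $h$ and $\gamma$ such that, for any
$S = \{x_1, \ldots, x_n\} \in \mathcal{X}$,
\[
  \left| f(S) - \gamma\!\left( \max_{x_i \in S} h(x_i) \right) \right| < \epsilon ,
\]
where $\max$ is taken elementwise over the vectors $h(x_i)$.
```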
Basic PointNet Architecture
Diagram: points (1,2,3), (1,1,1), (2,3,2), … → MLP applied to each point (h) → max (g) → MLP (γ)
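A minimal NumPy sketch of this MLP, max, MLP pipeline follows. The layer sizes, the random untrained weights, and the helper names `init_mlp`, `mlp`, and `pointnet_vanilla` are assumptions made for illustration; they are not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random weights for a small fully connected network (illustrative only)."""
    return [(rng.normal(0.0, 0.5, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Apply the layers to x, with a ReLU after every layer except the last."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x

h_params = init_mlp([3, 16, 32])      # h: shared per-point MLP, 3 -> 32
gamma_params = init_mlp([32, 16, 4])  # gamma: MLP on the pooled feature, 32 -> 4

def pointnet_vanilla(points):
    """points: (N, 3) array. Returns gamma(max_i h(x_i)) as a length-4 vector."""
    per_point = mlp(h_params, points)   # (N, 32), same weights for every row
    pooled = per_point.max(axis=0)      # g: elementwise max over the N points
    return mlp(gamma_params, pooled)    # (4,)

points = np.array([[1., 2., 3.], [1., 1., 1.], [2., 3., 2.], [2., 3., 4.]])
shuffled = points[rng.permutation(len(points))]

out_a = pointnet_vanilla(points)
out_b = pointnet_vanilla(shuffled)
print(np.allclose(out_a, out_b))  # True: row order does not affect the output
```

Because the max is the only step that mixes information across points, any reordering of the rows leaves the pooled vector unchanged.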
Input Alignment by Transformer Network
Diagram: T-Net takes the data (Nx3) and predicts transform params (3x3 matrix); Matrix Mult. of the data with this matrix gives the transformed data (Nx3)
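The alignment step can be sketched as a small network that predicts a 3x3 matrix from the whole point cloud, followed by a matrix multiplication. Everything below is a toy under stated assumptions: the hypothetical `t_net` uses a single shared layer with max pooling and random weights, and adding the identity to its output is an assumed convention so that near-zero weights leave the input almost unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical T-Net weights: a shared per-point layer, then a layer that maps
# the pooled feature to 9 numbers. The sizes are illustrative assumptions.
w_point = rng.normal(0.0, 0.5, (3, 16))
w_out = rng.normal(0.0, 0.1, (16, 9))

def t_net(points):
    """Predict a 3x3 matrix from an (N, 3) point cloud."""
    feat = np.maximum(points @ w_point, 0.0)  # (N, 16) per-point features
    pooled = feat.max(axis=0)                 # (16,) order-independent summary
    # Assumed convention: add the identity so zero weights give no transform.
    return (pooled @ w_out).reshape(3, 3) + np.eye(3)

def align(points):
    """Matrix-multiply the (N, 3) data by the predicted 3x3 transform."""
    transform = t_net(points)   # (3, 3)
    return points @ transform   # (N, 3) transformed data

points = rng.normal(size=(5, 3))
aligned = align(points)
print(aligned.shape)  # (5, 3)
```

Since the matrix is predicted from a max-pooled feature, the same transform is obtained for any ordering of the input points.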
Embedding Space Alignment
Diagram: T-Net takes the input embeddings (Nx64) and predicts transform params (64x64 matrix); Matrix Mult. gives the transformed embeddings (Nx64)
Regularization:
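The regularization formula itself did not survive in the slide text. The PointNet paper uses a penalty that pushes the 64x64 feature transform $A$ toward an orthogonal matrix, $\|I - AA^T\|_F^2$. A small sketch of that penalty follows, with `orthogonality_penalty` as a hypothetical helper name.

```python
import numpy as np

def orthogonality_penalty(a):
    """Squared Frobenius norm of (I - A A^T), zero when A is orthogonal."""
    identity = np.eye(a.shape[0])
    diff = identity - a @ a.T
    return float(np.sum(diff ** 2))

rng = np.random.default_rng(2)

# An orthogonal 64x64 matrix from a QR decomposition: the penalty is about 0.
q, _ = np.linalg.qr(rng.normal(size=(64, 64)))
print(orthogonality_penalty(q) < 1e-9)       # True

# A uniformly scaled copy is no longer orthogonal: the penalty is positive.
print(orthogonality_penalty(1.5 * q) > 1.0)  # True
```

In training, a term like this would be added to the task loss with a small weight. The weight is a hyperparameter and is not specified here.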
PointNet Classification Network
Extension to PointNet Segmentation Network
3D CNNs
Input
Output
3D CNN
Visualizing Global Point Cloud Features
Diagram: nx3 points → shared MLP → nx1024 → maxpool → global feature
Original Shape:
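One way to inspect a max-pooled global feature is to ask which input points actually attain the maximum in at least one of the 1024 dimensions. The sketch below is an assumption-laden illustration: a single random layer stands in for the shared MLP, and `global_feature_and_contributors` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-in for the shared MLP: one random layer from 3 to 1024 dimensions.
w = rng.normal(size=(3, 1024))

def global_feature_and_contributors(points):
    """Return the max-pooled feature and the indices of points that produce it."""
    per_point = np.maximum(points @ w, 0.0)  # (n, 1024)
    global_feature = per_point.max(axis=0)   # (1024,)
    winners = per_point.argmax(axis=0)       # which point wins each dimension
    return global_feature, np.unique(winners)

points = rng.normal(size=(200, 3))
feature, contributors = global_feature_and_contributors(points)

# Pooling only the contributing points reproduces the same global feature.
subset_feature, _ = global_feature_and_contributors(points[contributors])
print(np.allclose(feature, subset_feature))  # True
```

Plotting `points[contributors]` next to the original shape would show which part of the input the global feature depends on.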
Permutation Invariance: How about Sorting?
Diagram: (1,2,3), (1,1,1), (2,3,2), (2,3,4) → lexsorted → (1,1,1), (1,2,3), (2,3,2), (2,3,4) → MLP
Multi-Layer Perceptron (ModelNet shape classification)
                      Accuracy
Unordered Input       12%
Lexsorted Input       40%
PointNet (vanilla)    87%
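The lexsorting step can be reproduced on the four example points with plain Python, since tuples compare element by element. The helper name `lexsort_points` and the perturbed example are illustrative additions, not part of the slides.

```python
import random

def lexsort_points(points):
    """Return the points in lexicographic order (compare x, then y, then z)."""
    return sorted(points)  # Python compares tuples element by element

points = [(1, 2, 3), (1, 1, 1), (2, 3, 2), (2, 3, 4)]
canonical = lexsort_points(points)
print(canonical)  # [(1, 1, 1), (1, 2, 3), (2, 3, 2), (2, 3, 4)]

# Any reordering of the same set sorts to the same sequence.
shuffled = points[:]
random.Random(0).shuffle(shuffled)
print(lexsort_points(shuffled) == canonical)  # True

# A small change in one coordinate can swap the order of two points.
perturbed = [(1, 2, 3), (1.01, 1, 1), (2, 3, 2), (2, 3, 4)]
print(lexsort_points(perturbed)[0])  # (1, 2, 3)
```

The last case hints at why sorting is a fragile canonical order: a tiny perturbation moves a point to a different slot, so an MLP reading the sorted rows sees a very different input.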
Permutation Invariance: How about RNNs?
LSTM Network (ModelNet shape classification)
Diagram: each point → MLP → LSTM, with the LSTM cells chained in sequence
                      Accuracy
LSTM                  75%
ModelNet40 Accuracy
PointNet (vanilla) 87.1%
+ input 3x3 87.9%
+ feature 64x64 86.9%
+ feature 64x64 + reg 87.4%
+ both 89.2%
Visualizing Point Functions
Compact View: 1x3 → FCs → 1x1024
Expanded View: 1x3 → FC (64) → FC (64) → FC (64) → FC (128) → FC → 1x1024
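A point function maps one 3D point to 1024 activations, so it can be probed by evaluating it over a grid of points and looking at where a chosen unit is active. The sketch below assumes the layer widths from the expanded view with a final width of 1024, random untrained weights, and a ReLU after every layer. The grid range and the name `point_function` are also assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

widths = [3, 64, 64, 64, 128, 1024]  # input size, then the five FC output sizes
layers = [(rng.normal(0.0, np.sqrt(2.0 / m), (m, n)), np.zeros(n))
          for m, n in zip(widths[:-1], widths[1:])]

def point_function(points):
    """Map (M, 3) points to (M, 1024) activations through the five FC layers."""
    x = points
    for w, b in layers:
        x = np.maximum(x @ w + b, 0.0)  # FC followed by ReLU
    return x

# Sample a coarse grid over the cube [-1, 1]^3.
axis = np.linspace(-1.0, 1.0, 9)
grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1).reshape(-1, 3)

activations = point_function(grid)   # one row of 1024 activations per grid point
unit = 0                             # which of the 1024 functions to inspect
active_region = grid[activations[:, unit] > 0.0]
print(grid.shape, activations.shape)  # (729, 3) (729, 1024)
```

With trained weights, a scatter plot of `active_region` for different values of `unit` would show the region of space each point function responds to.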