L3 - UCLxDeepMind DL2020

The document describes a lecture on convolutional neural networks for image recognition. It provides background on CNNs and how they take advantage of the topological structure of images. It then discusses the basic building blocks of CNNs, including convolutional layers, pooling layers, and how they are stacked to create hierarchical representations of images.


WELCOME TO THE

UCL x DeepMind
lecture series
In this lecture series, research scientists from DeepMind, a leading AI research lab, will give 12 lectures on an exciting selection of topics in Deep Learning, ranging from the fundamentals of training neural networks, via advanced ideas around memory, attention and generative modelling, to the important topic of responsible innovation. Please join us for a deep dive into Deep Learning!

#UCLxDeepMind

General information

Exits: at the back, the way you came in
Wifi: UCL guest
TODAY’S SPEAKER

Sander Dieleman
Sander Dieleman is a Research Scientist at DeepMind in London, UK, where he has worked on the development of AlphaGo and WaveNet. He was previously a PhD student at Ghent University, where he conducted research on feature learning and deep learning techniques for learning hierarchical representations of musical audio signals. During his PhD he also developed the deep learning library Lasagne and won solo and team gold medals in Kaggle's "Galaxy Zoo" competition and the first National Data Science Bowl, respectively. In the summer of 2014, he interned at Spotify in New York, where he worked on implementing audio-based music recommendation using deep learning on an industrial scale.
TODAY’S LECTURE

Convolutional Neural Networks for Image Recognition
Sander Dieleman

In the past decade, convolutional neural networks have revolutionised computer vision. In this lecture, we will take a closer look at convolutional network architectures through several case studies, ranging from the early 90's to the current state of the art. We will review some of the building blocks that are in common use today, discuss the challenges of training deep models and strategies for finding effective architectures, with a focus on image recognition.

UCL x DeepMind Lectures


Plan for this lecture

01 Background
02 Building blocks
03 Convolutional neural networks
04 Going deeper: case studies
05 Advanced topics
06 Beyond image recognition
1 Background
Last week:
neural networks
[Diagram: a simple feedforward network with linear, sigmoid, linear and softmax nodes and a cross-entropy loss, mapping data to a target]
How can we feed
images to a neural
network?
Neural networks for images

A digital image is a 2D grid of pixels.

A neural network expects a vector of numbers as input.
Locality and translation invariance

Locality: nearby pixels are more strongly correlated

Translation invariance: meaningful patterns can occur anywhere in the image


Taking advantage of topological structure

Weight sharing: use the same network parameters to detect local patterns at many locations in the image

Hierarchy: local low-level features are composed into larger, more abstract features

edges and textures → object parts → objects

Data drives
research
The ImageNet challenge

Major computer vision benchmark
Ran from 2010 to 2017
1.4M images, 1000 classes
Image classification

Want to learn more?
Russakovsky, O. et al. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision 115.3 (2015)
[Chart: top-5 classification error rate of the competition winners, annotated with traditional computer vision techniques, AlexNet, VGGNet and GoogLeNet, and ResNet]
2 Building
blocks

UCL x DeepMind Lectures


From fully connected to locally connected

fully-connected unit
locally-connected units, 3✕3 receptive field
From locally connected to convolutional

convolutional units, 3✕3 receptive field

[Diagram: a receptive field in the input produces one value in the feature map]
Implementation: the convolution operation

The kernel slides across the image and produces an output value at each position
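To make the sliding-window picture concrete, here is a minimal NumPy sketch of a single-channel "valid" convolution (strictly speaking a cross-correlation, which is what most deep learning libraries actually compute; the function and variable names are illustrative, not from the lecture):

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` and produce one output value per position."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh, ow = ih - kh + 1, iw - kw + 1          # "valid" output size
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # elementwise product of the kernel with the image patch it covers
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(25.0).reshape(5, 5)
kernel = np.ones((3, 3)) / 9.0                 # a simple 3x3 averaging kernel
print(conv2d_valid(image, kernel).shape)       # (3, 3)
```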
Implementation: the convolution operation

We convolve multiple kernels and obtain multiple feature maps or channels
Inputs and outputs are tensors

channels ✕ height ✕ width
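Extending the sketch above to tensors: the input has shape (channels, height, width), each kernel spans all input channels, and each of the C_out kernels produces one output feature map. Again an illustrative sketch rather than the lecture's own code:

```python
import numpy as np

def conv2d_multichannel(x, kernels):
    """x: (C_in, H, W); kernels: (C_out, C_in, kH, kW).
    Returns a (C_out, H - kH + 1, W - kW + 1) stack of feature maps."""
    c_in, h, w = x.shape
    c_out, _, kh, kw = kernels.shape
    oh, ow = h - kh + 1, w - kw + 1
    out = np.zeros((c_out, oh, ow))
    for o in range(c_out):                     # one feature map per kernel
        for i in range(oh):
            for j in range(ow):
                out[o, i, j] = np.sum(x[:, i:i + kh, j:j + kw] * kernels[o])
    return out

x = np.random.randn(3, 8, 8)                   # e.g. an RGB image: 3 channels
k = np.random.randn(16, 3, 3, 3)               # 16 kernels spanning all 3 channels
print(conv2d_multichannel(x, k).shape)         # (16, 6, 6)
```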
Variants of the convolution operation

Valid convolution: output size = input size - kernel size + 1
Full convolution: output size = input size + kernel size - 1
Same convolution: output size = input size
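The three size rules can be checked directly with SciPy's `convolve2d`, which supports all three modes (SciPy computes a true convolution with a flipped kernel, but the output sizes are the same either way):

```python
import numpy as np
from scipy.signal import convolve2d

image = np.random.randn(10, 10)
kernel = np.random.randn(3, 3)

print(convolve2d(image, kernel, mode='valid').shape)  # (8, 8):   10 - 3 + 1
print(convolve2d(image, kernel, mode='full').shape)   # (12, 12): 10 + 3 - 1
print(convolve2d(image, kernel, mode='same').shape)   # (10, 10): same as the input
```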


Variants of the convolution operation

Strided convolution: kernel slides along the image with a step > 1
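With a stride s the kernel only visits every s-th position, so for a valid convolution the output size becomes floor((input size - kernel size) / s) + 1. This is a standard formula, stated here as background rather than taken from the slide:

```python
def strided_valid_output_size(input_size, kernel_size, stride):
    # Number of positions the kernel visits when stepping by `stride` (no padding)
    return (input_size - kernel_size) // stride + 1

print(strided_valid_output_size(10, 3, 1))  # 8
print(strided_valid_output_size(10, 3, 2))  # 4
```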
Variants of the convolution operation

Dilated convolution: kernel is spread out, step > 1 between kernel elements
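A kernel with k taps and dilation d covers an effective extent of d·(k - 1) + 1 input positions while keeping the same number of weights; the small helper below illustrates this (again a standard formula, not from the slide):

```python
def dilated_effective_size(kernel_size, dilation):
    # A kernel with `kernel_size` taps and gaps of (dilation - 1) between them
    return dilation * (kernel_size - 1) + 1

print(dilated_effective_size(3, 1))  # 3: ordinary convolution
print(dilated_effective_size(3, 2))  # 5: same number of weights, wider coverage
```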
Variants of the convolution operation

Depthwise convolution: each output channel is connected only to one input channel
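In other words, each input channel gets its own spatial kernel and maps to its own output channel, with no mixing across channels. A hedged NumPy/SciPy sketch (function names are illustrative):

```python
import numpy as np
from scipy.signal import correlate2d

def depthwise_conv2d(x, kernels):
    """x: (C, H, W); kernels: (C, kH, kW), one spatial kernel per channel.
    Output channel c depends only on input channel c."""
    return np.stack([correlate2d(x[c], kernels[c], mode='valid')
                     for c in range(x.shape[0])])

x = np.random.randn(3, 8, 8)
k = np.random.randn(3, 3, 3)
print(depthwise_conv2d(x, k).shape)  # (3, 6, 6)
```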
Pooling

Pooling: compute mean or max over small windows to reduce resolution
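A minimal sketch of 2✕2 max-pooling in NumPy, assuming the spatial dimensions divide evenly by the window size (illustrative, not the lecture's code):

```python
import numpy as np

def max_pool2d(x, window=2):
    """x: (H, W) feature map, with H and W divisible by `window`.
    Each non-overlapping window-by-window block is reduced to its maximum."""
    h, w = x.shape
    blocks = x.reshape(h // window, window, w // window, window)
    return blocks.max(axis=(1, 3))

fmap = np.arange(16.0).reshape(4, 4)
print(max_pool2d(fmap))
# [[ 5.  7.]
#  [13. 15.]]
```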


3 Convolutional
neural networks

UCL x DeepMind Lectures


Stacking the building blocks

CNNs or “convnets”
Up to 100s of layers
Alternate convolutions and pooling to create a hierarchy
Recap: neural networks as computational graphs
[Diagram: computation nodes connecting the input and parameters to the loss]

Simplified diagram: implicit parameters and loss
[Diagram: only the input and computation nodes are shown]
Computational building blocks of convnets: input, convolution, nonlinearity, pooling, fully connected
4 Going deeper:
Case studies

UCL x DeepMind Lectures


LeNet-5 (1998)

Architecture of LeNet-5, a convnet for handwritten digit recognition

Want to learn more?
LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11) (1998)
[Figure: the LeNet-5 pipeline: input image → convolution → nonlinearity → pooling → convolution → nonlinearity → pooling → fully connected → nonlinearity → fully connected → nonlinearity]
AlexNet (2012)

Figure from Krizhevsky et al. (2012)

Architecture: 8 layers, ReLU, dropout, weight decay
Infrastructure: large dataset, trained 6 days on 2 GPUs

Want to learn more?
Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Neural Information Processing Systems (2012)
AlexNet (2012)

Input image: 224✕224✕3
Layer 1 convolution: kernel 11✕11, 96 channels, stride 4 → 56✕56✕96
ReLU
Max-pooling: window 2✕2 → 28✕28✕96
...
Layer 8 fully connected: → 1000
Softmax
[Figure: the full AlexNet stack with feature-map sizes: 224✕224✕3 input, then 56✕56✕96, 28✕28✕96, 28✕28✕256, 14✕14✕256, 14✕14✕384, 14✕14✕384, 14✕14✕256 and 7✕7✕256 feature maps, ReLU after each convolution, fully connected layers of 4096, 4096 and 1000 units, and a softmax output]
Deeper is better

Each layer is a linear classifier by itself
More layers – more nonlinearities
What limits the number of layers in convnets?
VGGNet (2014): building very deep convnets

Stack many convolutional layers before pooling
Use “same” convolutions to avoid resolution reduction

Want to learn more?
Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. International Conference on Learning Representations (2015)
VGGNet (2014): stacking 3✕3 kernels

[Figure: a 1st and a 2nd 3✕3 conv. layer stacked, together covering a 5✕5 region of the input]

Architecture: up to 19 layers, 3✕3 kernels only, “same” convolutions
Infrastructure: trained for 2-3 weeks on 4 GPUs (data parallelism)
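A back-of-the-envelope way to see why stacking 3✕3 kernels pays off: n stacked 3✕3 layers (stride 1) see a (2n + 1)✕(2n + 1) region of their input but use fewer weights than one layer with that kernel size. The comparison below assumes C channels in and out and ignores biases; it is our illustration, not a calculation from the slides:

```python
def stacked_3x3_params(n_layers, channels):
    # n stacked 3x3 convolutions, `channels` in and out at every layer
    return n_layers * 3 * 3 * channels * channels

def single_layer_params(kernel_size, channels):
    return kernel_size * kernel_size * channels * channels

C = 256
# Two 3x3 layers cover a 5x5 region; three cover a 7x7 region.
print(stacked_3x3_params(2, C), single_layer_params(5, C))  # ~1.18M vs ~1.64M weights
print(stacked_3x3_params(3, C), single_layer_params(7, C))  # ~1.77M vs ~3.21M weights
```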


VGGNet (2014): error plateaus after 16 layers
Challenges of depth

Computational complexity
Optimisation difficulties

Improving optimisation

Careful initialisation
Sophisticated optimisers
Normalisation layers
Network design
GoogLeNet (2014)

[Figure from Szegedy et al. (2015): the Inception module, with parallel 1✕1, 3✕3 and 5✕5 convolution branches (using 1✕1 convolutions for dimensionality reduction) and a 3✕3 pooling branch followed by a 1✕1 convolution]

Want to learn more?
Szegedy, C. et al. Going deeper with convolutions. IEEE Conference on Computer Vision and Pattern Recognition (2015)
Batch normalisation

Figure from Ioffe et al. (2015)

Reduces sensitivity to initialisation
Introduces stochasticity and acts as a regulariser

Want to learn more?
Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. International Conference on Machine Learning (2015)
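For reference, a minimal sketch of what a batch-normalisation layer computes at training time: normalise each channel using the statistics of the current batch, then rescale with learned parameters gamma and beta. The variable names are ours, and the running statistics used at test time are omitted:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """x: (N, C, H, W) batch of feature maps; gamma, beta: (C,) learned parameters."""
    mean = x.mean(axis=(0, 2, 3), keepdims=True)   # per-channel batch mean
    var = x.var(axis=(0, 2, 3), keepdims=True)     # per-channel batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma.reshape(1, -1, 1, 1) * x_hat + beta.reshape(1, -1, 1, 1)

x = np.random.randn(8, 16, 32, 32)
out = batch_norm(x, gamma=np.ones(16), beta=np.zeros(16))
print(round(out.mean(), 3), round(out.std(), 3))   # close to 0 and 1
```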
ResNet (2015): residual connections

[Figure: a residual block: two convolution → batch norm → ReLU stages, with a residual connection adding the block's input back onto its output]

Residual connections facilitate training deeper networks

Want to learn more?
He, K. et al. Deep residual learning for image recognition. IEEE Conference on Computer Vision and Pattern Recognition (2016)
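The core idea fits in a couple of lines: the stacked layers learn a residual F(x) that is added back onto the input, so gradients have a direct path through the addition. A hedged sketch with a toy residual function (the two-stage linear/ReLU block below is illustrative, not the exact block from the paper):

```python
import numpy as np

def residual_block(x, block_fn):
    """Apply `block_fn` (e.g. conv -> batch norm -> ReLU -> conv -> batch norm)
    and add the result back onto the input via the skip connection."""
    return x + block_fn(x)

rng = np.random.default_rng(0)
W1 = rng.normal(size=(16, 16)) * 0.1
W2 = rng.normal(size=(16, 16)) * 0.1
block = lambda x: np.maximum(x @ W1, 0.0) @ W2     # toy stand-in for the conv stages

x = rng.normal(size=(4, 16))
print(residual_block(x, block).shape)              # (4, 16): shape is preserved
```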
ResNet (2015): different flavours

[Figure: residual-block variants built from 3✕3 and 1✕1 convolutions, each ending in a residual addition (+)]

ResNet V2 (bottom) avoids all nonlinearities in the residual pathway

Want to learn more?
He, K. et al. Identity mappings in deep residual networks. European Conference on Computer Vision (2016)
ResNet (2015): up to 152 layers

Table from He et al. (2015)


DenseNet (2016): connect layers to all previous layers

Figures from Huang et al. (2017)

Want to learn more?
Huang, G. et al. Densely connected convolutional networks. IEEE Conference on Computer Vision and Pattern Recognition (2017)

Squeeze-and-excitation networks (2017)

Figure from Hu et al. (2018)

Features can incorporate global context

Want to learn more?
Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. IEEE Conference on Computer Vision and Pattern Recognition (2018)
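A sketch of the squeeze-and-excitation idea: global-average-pool each channel ("squeeze"), pass the channel summary through a small bottleneck network ("excitation"), and use the resulting per-channel gates to rescale the feature maps, so every feature can incorporate global context. The reduction ratio and weight shapes below are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def squeeze_excite(x, W1, W2):
    """x: (C, H, W) feature maps; W1: (C, C//r), W2: (C//r, C) for reduction ratio r."""
    s = x.mean(axis=(1, 2))                        # squeeze: per-channel global average
    gates = sigmoid(np.maximum(s @ W1, 0.0) @ W2)  # excitation: FC -> ReLU -> FC -> sigmoid
    return x * gates[:, None, None]                # rescale each channel by its gate

C, r = 16, 4
x = np.random.randn(C, 8, 8)
W1 = np.random.randn(C, C // r) * 0.1
W2 = np.random.randn(C // r, C) * 0.1
print(squeeze_excite(x, W1, W2).shape)             # (16, 8, 8)
```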
AmoebaNet (2018): neural architecture search

Figure from Real et al. (2019)

Architecture found by evolution
Search acyclic graphs composed of predefined layers

Want to learn more?
Real, E. et al. Regularized evolution for image classifier architecture search. AAAI Conference on Artificial Intelligence (2019)
Reducing complexity

Depthwise convolutions
Separable convolutions
Inverted bottlenecks (MobileNetV2, MNasNet, EfficientNet)
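To see where the savings come from: a standard k✕k convolution from C_in to C_out channels needs k·k·C_in·C_out weights, whereas a depthwise-separable convolution (a k✕k depthwise convolution followed by a 1✕1 pointwise convolution) needs k·k·C_in + C_in·C_out. A rough illustrative comparison, ignoring biases:

```python
def standard_conv_params(k, c_in, c_out):
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    # k x k depthwise convolution + 1 x 1 pointwise convolution
    return k * k * c_in + c_in * c_out

k, c_in, c_out = 3, 256, 256
print(standard_conv_params(k, c_in, c_out))         # 589824
print(depthwise_separable_params(k, c_in, c_out))   # 67840, roughly 8.7x fewer
```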
5 Advanced
topics

UCL x DeepMind Lectures


Data augmentation

By design, convnets are only robust against translation

Data augmentation makes them robust against other transformations: rotation, scaling, shearing, warping, ...
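A minimal sketch of label-preserving augmentation applied on the fly: a random horizontal flip followed by a random crop, both easy to express in NumPy (the crop size and function name are illustrative choices):

```python
import numpy as np

def augment(image, crop=24, rng=np.random.default_rng()):
    """image: (H, W, C). Randomly flip left-right, then take a random crop."""
    if rng.random() < 0.5:
        image = image[:, ::-1, :]                  # horizontal flip
    h, w, _ = image.shape
    top = rng.integers(0, h - crop + 1)
    left = rng.integers(0, w - crop + 1)
    return image[top:top + crop, left:left + crop, :]

img = np.random.rand(32, 32, 3)
print(augment(img).shape)                          # (24, 24, 3), a new view each call
```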
Visualising what a convnet learns

Figures from Zeiler et al. (2014)

Want to learn more?
Zeiler, M.D.; Fergus, R. Visualizing and understanding convolutional networks. European Conference on Computer Vision (2014)
Visualising what a convnet learns

Figure from Simonyan et al. (2013)


Visualising what a convnet learns

Figure from Nguyen et al. (2016)


https://distill.pub/2017/feature-visualization/ by Chris Olah, Alexander Mordvintsev and Ludwig Schubert
Other topics to explore

Pre-training and fine-tuning
Group equivariant convnets: invariance to e.g. rotation
Recurrence and attention: other building blocks to exploit topological structure
6 Beyond image
recognition

UCL x DeepMind Lectures


What else can we do
with convnets?
Figures from Lin et al. (2015)
Generative models of images

Generative adversarial nets
Variational autoencoders
Autoregressive models (PixelCNN)
More convnets

Representation learning and self-supervised learning
Convnets for video, audio, text, graphs, ...
Convolutional neural networks replaced handcrafted features with handcrafted architectures.

Prior knowledge is not obsolete: it is merely incorporated at a higher level of abstraction.
Thank you
Questions
