2023 AN2DL DeepLearning Intro


Artificial Neural Networks and Deep Learning

- Introduction to the course -


https://boracchi.faculty.polimi.it/
Prof. Giacomo Boracchi – giacomo.boracchi@polimi.it
Loris Giulivi – loris.giulivi@polimi.it
but also …
Prof. Matteo Matteucci – matteo.matteucci@polimi.it
Eng. Eugenio Lomurno – eugenio.lomurno@polimi.it
Eng. Francesco Lattari – francesco.lattari@polimi.it
Who I am

Giacomo Boracchi (https://boracchi.faculty.polimi.it/ )


• Mathematician (Università Statale degli Studi di Milano 2004),
• PhD in Information Technology (DEIB, Politecnico di Milano 2008)
• Associate Professor since 2019 at DEIB, Polimi (Computer Science)

My Research Interests are mathematical and statistical methods for:


• Machine Learning and in particular unsupervised learning, change
and anomaly detection
• Image analysis and processing
… and the two combined

Teaching

Advanced courses taught:


• Artificial Neural Networks and Deep Learning (MSc)
• Mathematical Models and Methods for Image Processing (MSc, spring 2023)
• Advanced Deep Learning Models And Methods (PhD, Winter 2022 with Prof.
Matteucci)
• Online Learning and Monitoring (PhD, Spring 2022 with Prof Trovò)

• Computer Vision and Pattern Recognition (MSc in USI, Spring 2020)


• Learning Sparse Representations for image and signal modeling (PhD)

• Informatica A (Mathematical Engineering!!!)

Course Objectives

“The course's major goal is to provide students with the theoretical
background and the practical skills to understand and use Neural
Networks and, at the same time, become familiar with Deep Learning
for solving complex engineering problems ... especially in vision tasks”

A Course with Code Sharing

This course is offered to Bioengineering and Mathematical Engineering


• 056869 - ARTIFICIAL NEURAL NETWORKS AND DEEP LEARNING - 5 CFU
• Prof. Giacomo Boracchi, Eng. Eugenio Lomurno

… equivalent course for Computer Science and Engineering students


• 054307 - ARTIFICIAL NEURAL NETWORKS AND DEEP LEARNING - 5 CFU
• Prof. Matteo Matteucci, Eng. Francesco Lattari

The same teachers will teach the same topics to both classes, but you
need to be enrolled in the right course and attend the right lectures …

The Teachers
Official teacher, please refer
to me for bureaucratic stuff!
Prof. Matteo Matteucci
• Neural Networks
• Deep Learning
• Sequence Learning

Prof. Giacomo Boracchi


• Deep Learning for visual recognition (Classification, Segmentation, Detection..)
Loris Giulivi, Francesco Lattari and Eugenio Lomurno
• Programming DL in Py
• Online Challenges
https://boracchi.faculty.polimi.it/teaching/AN2DLCalendar.htm

A detailed schedule on
Google Calendar

Each event includes


- Teacher
- Possibly last-minute slides
- Links to video recordings

The Students
???
Students are expected to:
• Attend the proper classes according to their program
• Feel comfortable with basic statistics and calculus
• Feel comfortable with basic programming (Python)
• Be ready to act as «guinea pigs» for this course edition
• Be curious and willing to learn ...
Students are not expected to:
• Know more than what is usually taught in basic engineering courses
• Already know about machine learning (although it doesn’t hurt)
• Be hyper-skilled Python hackers (you won’t need it)
• ...

Course syllabus

Introduction to Neural Network and Deep Learning (2h lectures)

Neural Networks and Deep Learning (16h lectures)
• From the Perceptron to neural networks
• Backpropagation and neural networks training
• Best practices in neural network training
• Recurrent architectures
• Autoencoders and long short-term memories
Visual Recognition with Deep Neural Networks (16h lectures)
• Image Classification and Convolutional Neural Networks
• CNN Training Tricks and Best Practices
• CNN for Advanced Vision Tasks (Segmentation, Detection,…)
ANN and Deep Learning Coding (with Keras) (16h practicals)

Course Website and Detailed Schedule

All details and info are on the course website


https://boracchi.faculty.polimi.it/teaching/AN2DL.htm

https://chrome.deib.polimi.it/index.php?title=Artificial_Neural_Networks_and_Deep_Learning

How to get there?


• From our websites https://boracchi.faculty.polimi.it/
• Select “Teaching And Available Thesis”, then “Artificial Neural Network and
Deep Learning”

What do you find there:


• Detailed schedule
• Lecture slides / links
Lectures Schedule and Timings MTM/BIO

Classes (there is no distinction between lecture and exercises):


• Monday in Room B.4.2. from 8.45 till 10.15 (we postpone a bit the lecture start)
• Tuesday in Room 7.1.3 from 10.30 till 12.00
Check the teacher who will be in class on the detailed schedule
• Lectures will be recorded and made available afterwards
• Lectures won’t be streamed

Lectures Schedule and Timings CSE

You might consider attending these to avoid overlaps


• Wednesday, 16:15 – 18:15, in T2.2 (starts at 16:30 ends at 18:10)
• Thursday, 14:15 – 16:15, in T2.1 (starts at 14:30 ends at 16:10)

Please drop us an email if you plan to attend the other course, as this
needs to be authorized
Check the detailed schedule (including Lecture Topics) as due to calendar
issues, the two courses are not perfectly aligned!
CSE: https://boracchi.faculty.polimi.it/teaching/AN2DLCalendar_CS.htm
BIO+MTM: https://boracchi.faculty.polimi.it/teaching/AN2DLCalendar.htm

Course Evaluation AN2DL

Grading comprises a theoretical part and a practical part:


• Written examination covering the whole program up to 20/30
• Home project in the form of 2 coding challenges up to 10/30
• Final score will be the sum of the grades of the two 30/30
Challenges are graded based on what you do, not based on the position in the rank!

Written Examination

• Digital exam on moodle: bring your own laptop


• We will use the platform: https://remoteexam.polimi.it/
• Safe Exam Browser (SEB) will be required
• It does not run on Linux…. Sorry for that… make sure you can borrow a
Windows or Mac laptop

Please make sure you can run the test quiz well ahead of the exam.

Go to https://remoteexam.polimi.it/

Search for our course (that will be updated…)

Run the test (it’s already there)

Run the test (but read instructions)!

The test is configured exactly as the exam form. Please, make sure you can successfully
accomplish the following tasks with your laptop at your earliest convenience:
• Go to the https://remoteexam.polimi.it/ website and select the "[2023-2024] Artificial
Neural Networks and Deep Learning [Giacomo Boracchi Matteo Matteucci]" course
• Log in with your student credentials and select the "test exam for AY 23-24 sessions".
• In case you do not have SEB installed, you will have the option to install SEB. Make sure
SEB is installed before the exam.
• Make sure everything goes smoothly and you can fill in the quiz.
• After submitting the answers, you can quit the SEB session. You will be prompted for a quit
password, use "IAmDone".

If you cannot access the form through SEB, you won't have the chance to take the exam.

Course Evaluation!

Grading comprises a theoretical part and a practical part:


• Written examination covering the whole program up to 20/30
• Home project in the form of 2 coding challenges up to 10/30
• Final score will be the sum of the grades of the two 30/30

Comments and notes about the grading


• 10 points of the theoretical part will be given by Prof. Matteucci
• 10 points of the theoretical part will be given by Prof Boracchi
• 5 points for each homework challenge are given by Francesco Lattari
• Homework challenges are not repeated, they are just run once a year
• Challenge 1 around 2nd November, Challenge 2 around 6th December
Challenges are graded based on what you do, not based on the position in the rank!
Laude

Laude is meant to reward brilliant students that:


• Actively participate in lectures
• Provide outstanding homework solutions
• Solve the written exam very timely

IC Course Evaluation

METHODS & APPLICATIONS OF AI IN BIOMEDICINE [I.C. 10 CFUs]

The final grade will be the average of


- Applied AI in Biomedicine (5 CFUs from Prof. Valentina Corino)
- AN2DL (5 CFUs).

Synergies with Other Courses

AN2DL is a course on machine learning: synergies might exist with other
courses on the same topic, but it has been designed not to overlap with:
• Machine Learning: there you see classical machine learning tools; some
concepts such as generalization, overfitting, and cross-validation might be
similar ... up to 4-5h out of 50h (< 10%)
• Uncertainty in Artificial Intelligence: neural networks have been removed from
this course and they have been replaced by Bayesian Networks and Graphical
Models … 0h out of 50h (0%)
• Image Analysis and Computer Vision: the image classification part has been
removed… there is a shared background on image filtering… 0h out of 50h (0%)
• Data Mining and Text Mining: does not cover neural networks and it is mostly
based on unsupervised methods, up to 4h out of 50h (< 10%)
Even taking them all, the overlap ends up being at most 10h (< 20%)

Ironing out the kinks ...

Some details have not been sorted out yet; we are working on those ...
• WeBeep use
  • No, we use the calendar and enrolled students' emails
• Projects/Competitions:
  • How many people per group (2-3 people)
  • Competitions out on 2nd November & 6th December
• Practical evaluation of challenges:
  • Not doing it scores 0 points
  • Doing it with the basic tools presented in class: up to 1-4 points (?)
  • Doing it with passion and in a proactive manner: up to 5 points (?)
• Automated scoring / code plagiarism check (?)

Frequently Asked Questions (up to now)

I cannot attend all classes, do you follow a book?

You can find all covered topics in the Deep Learning book, but we are going to present the course in
a personalized manner. We suggest that you attend and follow our material, then check the book to
complete your preparation. Slides will be made available, as well as lecture recordings.

We are not computer scientists, will we be able to do the competition?

We are going to use simple libraries; we expect that, with basic competencies in programming, you
should be able to do it autonomously, at least to a minimum level.

Are you going to stream/record lectures?

We are going to record lectures and share the links on the Google Calendar. No lecture streaming, though.

Frequently Asked Questions (up to now)

I have overlaps, can I attend AN2DL with CS?

Sure, that’s fine by us. However, please inform us so that we can keep track of how many students
are going to attend
• Wednesday from 16.30 till 18.00
• Thursday from 14.30 till 16.00

Other questions?

https://boracchi.faculty.polimi.it/teaching/AN2DLCalendar_CS.htm

Similar Calendar for AN2DL for CSE students…

You might want to check this out in case of lecture overlaps

Artificial Neural Networks and Deep Learning
- Machine Learning vs Deep Learning-

Giacomo Boracchi, PhD


https://boracchi.faculty.polimi.it/
Politecnico di Milano
Standard Programming
/* What is this program about? */
#include <stdio.h>

int main(void)
{
    int a, sum = 0;

    /* Read integers and accumulate them; stop at the first
       non-positive value or at invalid input. */
    printf("\nInsert a:");
    while (scanf("%d", &a) == 1 && a > 0)
    {
        sum += a;
        printf("\nInsert a:");
    }
    printf("\nSum = %d\n", sum);
    return 0;
}

Standard Programming
Can you write a program that takes as input an image and tells whether it contains a car
or a motorbike?

Machine Learning Paradigms

ML is the solution, as the program becomes a very big parametric function
$f_\theta$, whose parameters $\theta$ are learned from data!

[Diagram: input $x$ → ML model $f_\theta$ → prediction $y = f_\theta(x)$]
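
As a minimal sketch of this idea (not part of the original slides), the snippet below treats a straight line as the parametric function $f_\theta$ and estimates $\theta$ from a few made-up $(x, y)$ pairs by least squares. The model, the function names and the data are illustrative assumptions; a neural network plays the same role with many more parameters.

```python
def f(theta, x):
    """Parametric model: a straight line with theta = (bias, slope)."""
    bias, slope = theta
    return bias + slope * x


def fit_least_squares(xs, ys):
    """Closed-form least-squares estimate of (bias, slope) from data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    bias = mean_y - slope * mean_x
    return (bias, slope)


# Hypothetical toy data generated by y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

theta = fit_least_squares(xs, ys)   # parameters learned from data
prediction = f(theta, 4.0)          # the "program" applied to a new input
```

Nobody wrote the rule "multiply by 2 and add 1" by hand here: it is recovered from the examples.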

Learning consists in (automatically) defining the parameters $\theta$ of the model $f$.
Different settings apply: supervised and unsupervised.

Machine Learning Paradigms

Supervised Learning
• Classification
• Regression

Supervised Learning

In Supervised Learning we are given a training set in the form:

$TR = \{(x_1, y_1), \dots, (x_n, y_n)\}$
where
• $x_i \in \mathbb{R}^d$ is the input
• $y_i \in \Lambda$ is the target, the expected output of the model for $x_i$
The set Λ can be
• A discrete set, as in classification, e.g., Λ = {"brown", "green", "blue"}
(possible eye colors)
• An ordinal set (often a continuous set, ℝ) in case of regression.
Λ can also be multivariate (e.g., regressing the weight and height of an
individual, or estimating their eye color and hair color)
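
A small sketch of how such training sets could be represented in code. The numbers, the variable names and the helper `check_training_set` are hypothetical and only meant to make the notation concrete: inputs are vectors in $\mathbb{R}^d$, targets belong to Λ.

```python
# Classification: the target set Lambda is discrete (eye colors, as in the slide).
LAMBDA = {"brown", "green", "blue"}

# Each pair is (x_i, y_i); the inputs are made-up vectors in R^d with d = 2.
TR_classification = [
    ((0.2, 1.4), "brown"),
    ((0.9, 0.3), "green"),
    ((0.5, 0.8), "blue"),
]

# Multivariate regression: each target is (weight in kg, height in cm).
TR_regression = [
    ((0.2, 1.4), (70.5, 175.0)),
    ((0.9, 0.3), (58.0, 162.0)),
]


def check_training_set(training_set, d, is_valid_target):
    """Return True if every input has d components and every target is valid."""
    return all(len(x) == d and is_valid_target(y) for x, y in training_set)


classification_ok = check_training_set(
    TR_classification, d=2, is_valid_target=lambda y: y in LAMBDA
)
regression_ok = check_training_set(
    TR_regression,
    d=2,
    is_valid_target=lambda y: len(y) == 2 and all(isinstance(v, float) for v in y),
)
```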
Training Set for (binary) Image Classification

[Figure: example images of the two classes, Cars and Motorcycles]
$TR = \{(x_1, y_1), \dots, (x_n, y_n)\}$
• $x_i \in \mathbb{R}^{R \times C \times 3}$ is the input image
• $y_i \in \{\text{"car"}, \text{"motorcycle"}\}$

Inference Using the Trained Classifier

[Figure: the classifier trained on the Cars and Motorcycles images assigns a label ("car" or "motorcycle") to each new input image]

Supervised learning: Regression

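[Figure: training images annotated with prices (12000 $, 15000 $, 6000 $, 2000 $, 8000 $, 22000 $, 4000 $, 28000 $, 6000 $, 35000 $); the trained regressor outputs a price for each new image (e.g., 25000 $, 3800 $)]
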
Training Set for Regression

$TR = \{(x_1, y_1), \dots, (x_n, y_n)\}$
• $x_i \in \mathbb{R}^{R \times C \times 3}$ is the input image
• $y_i \in \mathbb{R}$

Remarks

• The number of classes can be larger than two (multiclass classification,
e.g., {"car", "motorcycle", "truck"})
• The input size in general needs to be fixed
• The number of outputs for regression can be larger than one (multivariate
regression, e.g., estimating cost and weight of the vehicle)
• Training a Classifier or a Regressor requires different losses
• The difference between classification and regression is not only whether
Λ is discrete, but whether it is ordinal
• Λ categorical (not ordinal) -> classification
• Λ ordinal (either discrete or continuous) -> regression
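
To illustrate the remark on losses, here is a hedged sketch of two common choices: mean squared error for regression and cross-entropy for classification. The slides do not name specific losses at this point, so this pairing, the function names and the toy numbers are assumptions.

```python
import math


def mean_squared_error(targets, predictions):
    """Regression loss: average squared difference (targets are ordinal)."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


def cross_entropy(true_labels, predicted_probs):
    """Classification loss: average negative log-probability of the true class."""
    total = 0.0
    for label, probs in zip(true_labels, predicted_probs):
        total += -math.log(probs[label])
    return total / len(true_labels)


# Regression: hypothetical vehicle prices and model predictions
prices = [12000.0, 6000.0]
predicted_prices = [11000.0, 8000.0]
mse = mean_squared_error(prices, predicted_prices)

# Classification: hypothetical class probabilities produced by a model
labels = ["car", "motorcycle"]
probabilities = [
    {"car": 0.9, "motorcycle": 0.1},
    {"car": 0.2, "motorcycle": 0.8},
]
ce = cross_entropy(labels, probabilities)
```

The squared error uses the distance between target and prediction, which only makes sense when Λ is ordinal; cross-entropy only asks how much probability was given to the correct category.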

Give a few examples of

• Regression problems on images
• Classification problems on images

Machine Learning Paradigms

Supervised Learning
• Classification
• Regression
Unsupervised Learning
• Clustering
• Anomaly Detection
• …
Unsupervised Learning

In Unsupervised Learning, the training set contains only inputs,

$TR = \{x_1, \dots, x_n\}$
and the goal is to find structure in the data, like
• grouping or clustering of data points
• estimating probability density distribution
• detecting outliers
• …
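
As one possible illustration of "grouping or clustering of data points", the sketch below runs a few iterations of k-means on made-up 2D points. K-means is a choice made here for illustration (the slides do not name a clustering algorithm), and the data and initial centroids are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centroids, iterations=10):
    """Plain k-means: alternate assignment and centroid update steps."""
    centroids = [tuple(c) for c in initial_centroids]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid
        assignments = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assignments


# Hypothetical unlabeled data: two well-separated groups, no targets given
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, assignments = kmeans(points, initial_centroids=[points[0], points[3]])
```

No labels are used anywhere: the two groups emerge from the geometry of the inputs alone.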

Unsupervised learning: Clustering

Unsupervised learning: Anomaly Detection

To Summarize: Machine Learning Paradigms

Imagine you have a certain experience E, i.e., data, and let’s name it

$D = x_1, x_2, x_3, \dots, x_N$

• Supervised learning: given a training set of pairs (input, desired output)
$\{(x_1, y_1), \dots, (x_N, y_N)\}$, learn to produce the correct output for new inputs
• Unsupervised learning: exploit regularities in $D$ to build a meaningful/compact
representation of the data, which can help regression/prediction
• Reinforcement learning: producing actions $a_1, a_2, a_3, \dots, a_N$ which affect
the environment, and receiving rewards $r_1, r_2, r_3, \dots, r_N$, learn to act in order
to maximize rewards in the long term

This course focuses mostly on Supervised Learning (with some unsupervised spots)

Machine Learning

Deep Learning
Hand-Crafted Features
How images / signals were classified before deep learning
Assume you need to automate this process

An Illustrative Example: Parcel Classification

Images acquired from an RGB-D sensor:

• No color information provided
• A few pixels report depth measurements (the envelope height at that pixel)
• Images of 3 classes
  • ENVELOPE
  • PARCEL
  • DOUBLE

Hand Crafted Features

Engineers:
• know what’s meaningful in an image (e.g. a specific color/shape,
the area, the size)
• can implement algorithms (Feature Extraction) to map this information
into a set of measurements, a feature vector

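Hand Crafted Features

[Figure: Feature Extraction maps the input image to a set of measurements, labeled "h average", "area", "h max", "h min", "perimeter", "ratio"]

$\mathbf{x} \in \mathbb{R}^d$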
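
A minimal sketch of what such a feature extraction algorithm could look like. It assumes, for illustration only, that the depth image is a list of rows in which `None` marks pixels without a depth measurement, and that "area" is the number of pixels with a valid measurement; the actual definitions used in the slides are not given.

```python
def extract_features(depth_image):
    """Map a depth image to a small dictionary of hand-crafted measurements."""
    heights = [h for row in depth_image for h in row if h is not None]
    if not heights:
        raise ValueError("no depth measurements in the image")
    return {
        "h_mean": sum(heights) / len(heights),
        "h_max": max(heights),
        "h_min": min(heights),
        "area": len(heights),  # assumption: count of measured pixels
    }


# Hypothetical 3x4 depth image (heights in cm, None = no measurement)
image = [
    [None, 2.0, 2.5, None],
    [None, 3.0, 2.5, None],
    [None, None, None, None],
]

features = extract_features(image)
# Feature vector x in R^d (here d = 2: mean height and area)
x = [features["h_mean"], features["area"]]
```

The image has 12 pixels but the classifier downstream only sees the short vector `x`.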

This is exactly what a doctor would do to classify ECG tracings

Heartbeat morphology has been widely investigated

Doctors know which patterns are meaningful for classifying each beat

Features are extracted from landmarks indicated by doctors:
e.g. QT distance, RR distance…

Created by Agateller (Anthony Atkielski), Public Domain, https://commons.wikimedia.org/w/index.php?curid=1560893


The Training Set

The training set is a set of annotated examples

$TR = \{(\mathbf{x}, \mathbf{y})_i, \; i = 1, \dots, N\}$
Each pair $(\mathbf{x}, \mathbf{y})_i$ corresponds to:
• an image $\mathbf{x}_i$
• the corresponding label $\mathbf{y}_i$

This is meant for a Supervised Learning Problem!

The Training Set: images + labels
The Training Set: features + labels
The Training Set
• If height < 2.5: $l$ = "parcel"
• If height > 6.2: $l$ = "envelope"
• If 3.5 < height < 6.2 & area > 200: $l$ = "double"
• If 3.5 < height < 6.2 & area < 200: $l$ = "envelope"
Classifier output
A tree classifying image features

[Diagram: input image $I_1 \in \mathbb{R}^{r_1 \times c_1}$ → Feature Extraction Algorithm → feature vector $\mathbf{x} = (h, a) \in \mathbb{R}^2$ → decision tree with nodes "if ($h$ < 3.5cm)", "if ($h$ > 6.2cm)", "if ($a$ < 200px)" and leaves «parcel», «envelope», «double» → output label among “double”, “envelope”, “parcel”]
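
The tree can be sketched as a plain function of the two features. This is an illustrative reading, not the authors' code: it uses the tree's thresholds (3.5 cm, 6.2 cm, 200 px), and it assumes the leaf assignment given by the textual rules (area above 200 gives "double", below gives "envelope"), since the extracted diagram does not show unambiguously which branch leads to which leaf.

```python
def classify_parcel(height_cm, area_px):
    """Hand-written decision tree on the feature vector x = (h, a)."""
    if height_cm < 3.5:
        return "parcel"
    if height_cm > 6.2:
        return "envelope"
    # 3.5 <= h <= 6.2: the area decides (assumed leaf assignment)
    if area_px < 200:
        return "envelope"
    return "double"


# Hypothetical feature vectors (height in cm, area in px)
examples = [(2.0, 150), (7.0, 300), (5.0, 100), (5.0, 250)]
predictions = [classify_parcel(h, a) for h, a in examples]
```

Every number in this function was chosen by a person looking at the data, which is what makes the classifier "rule based".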
Limitations of Rule Based Classifier

It is difficult to grasp what the meaningful dependencies over multiple
variables are (it is also impossible to visualize these)

Let’s resort to a data-driven model for the sole task of separating feature
vectors into different classes.

How can a classifier achieve better performance?

A tree classifying image features

The classifier has a few parameters:
• The splitting criteria
• The splitting thresholds $T_i$

This is our first solution
There are a few errors
Can I do better?

Classification error: 14.2%


Let’s try different parameters

Classification error: 13.7%
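
"Trying different parameters" amounts to measuring the classification error for each candidate set of thresholds and keeping the best one. The sketch below does this on a tiny made-up sample set; the samples, the threshold names and the resulting error values are hypothetical and unrelated to the 14.2% and 13.7% figures above.

```python
def classify(height_cm, area_px, thresholds):
    """Rule-based classifier; `thresholds` holds the three split values."""
    if height_cm < thresholds["h_low"]:
        return "parcel"
    if height_cm > thresholds["h_high"]:
        return "envelope"
    if area_px < thresholds["area"]:
        return "envelope"
    return "double"


def classification_error(samples, thresholds):
    """Fraction of samples whose predicted label differs from the annotation."""
    wrong = sum(1 for h, a, label in samples
                if classify(h, a, thresholds) != label)
    return wrong / len(samples)


# Hypothetical annotated samples: (height in cm, area in px, label)
samples = [
    (2.0, 150, "parcel"),
    (3.2, 180, "parcel"),
    (3.8, 120, "envelope"),
    (5.0, 260, "double"),
    (6.0, 240, "double"),
    (6.5, 300, "envelope"),
    (6.1, 190, "envelope"),
]

first_try = {"h_low": 3.0, "h_high": 6.2, "area": 200}
second_try = {"h_low": 3.5, "h_high": 6.2, "area": 200}

error_first = classification_error(samples, first_try)
error_second = classification_error(samples, second_try)
best = min((first_try, second_try),
           key=lambda t: classification_error(samples, t))
```

Automating this search over the parameters, instead of doing it by hand, is the step that leads to data-driven models.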


Data Driven Models
Data Driven Models are defined from a training set of (supervised) pairs
$TR = \{(\mathbf{x}, \mathbf{y})_i, \; i = 1, \dots, N\}$

The model parameters $\theta$ (e.g. Neural Network weights) are set to minimize a loss
function (e.g., the classification error in case of discrete output or the reconstruction
error in case of continuous output)
$\theta^* = \operatorname{argmin}_{\theta} \mathcal{L}(\theta, TR)$

Network training is an optimization process to find the parameters minimizing the loss function.
This can definitely boost the image classification performance
• Annotated training set is always needed
• Classification performance depends on the training set
• Generalization is not guaranteed
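
A hedged sketch of what "finding $\theta^*$ by minimizing a loss over $TR$" can look like in practice. It uses plain gradient descent on the cross-entropy loss of a one-feature logistic model; the model, the optimizer settings and the toy training set are assumptions chosen for brevity, not the method prescribed by the slides.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss(theta, training_set):
    """Average cross-entropy of the model p(y=1|x) = sigmoid(w*x + b)."""
    w, b = theta
    total = 0.0
    for x, y in training_set:
        p = sigmoid(w * x + b)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(training_set)


def gradient(theta, training_set):
    """Gradient of the loss with respect to (w, b)."""
    w, b = theta
    grad_w = grad_b = 0.0
    for x, y in training_set:
        error = sigmoid(w * x + b) - y
        grad_w += error * x
        grad_b += error
    n = len(training_set)
    return grad_w / n, grad_b / n


def train(training_set, learning_rate=0.5, steps=200):
    """Gradient descent: repeatedly move theta against the gradient."""
    theta = (0.0, 0.0)
    history = [loss(theta, training_set)]
    for _ in range(steps):
        gw, gb = gradient(theta, training_set)
        theta = (theta[0] - learning_rate * gw, theta[1] - learning_rate * gb)
        history.append(loss(theta, training_set))
    return theta, history


# Hypothetical training set TR of (feature, binary label) pairs
TR = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]
theta_star, history = train(TR)
```

The loss recorded in `history` goes down during training, but a low loss on `TR` says nothing by itself about new data, which is the point of the last bullet above.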
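Hand Crafted Feature Extraction, data-driven Classification

[Diagram: input image $I_1 \in \mathbb{R}^{r_1 \times c_1}$ → Feature Extraction Algorithm (mean, max, min, area, per., ratio) → feature vector $\mathbf{x} \in \mathbb{R}^d$ ($d \ll r \times c$) → Classifier → label $t \in \Lambda$, e.g. “double”]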
Are there better classifiers?

Neural networks provide non-linear separation
boundaries among classes
And Neural Networks are not the only..
Neural Networks

[Diagram: input image $I_1 \in \mathbb{R}^{r_1 \times c_1}$ → Feature Extraction Algorithm → feature vector $\mathbf{x} \in \mathbb{R}^d$ → neural network with input layer, hidden layer(s) and output layer]

• Input layer: same size as the feature vector ($x_1, \dots, x_d$)
• Output layer: same size as the number of classes; in the diagram its outputs are $P(t = \text{"doub."} \mid \mathbf{x})$, $P(t = \text{"env."} \mid \mathbf{x})$, $P(t = \text{"parc."} \mid \mathbf{x})$
• Hidden layers: arbitrary size
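
To make the layer sizes concrete, here is a minimal forward pass of a small fully connected network in plain Python. The hidden size, the ReLU and softmax activations, and the random (untrained) weights are assumptions for illustration; the course practicals use Keras for this, and the probabilities below carry no meaning until the weights are trained.

```python
import math
import random


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum per output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw scores into probabilities that sum to one."""
    shift = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(n_out, n_in, rng):
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


CLASSES = ["double", "envelope", "parcel"]
d = 2        # input layer: same size as the feature vector
hidden = 4   # hidden layer: arbitrary size (assumed value)

rng = random.Random(0)
W1, b1 = random_layer(hidden, d, rng)
W2, b2 = random_layer(len(CLASSES), hidden, rng)  # output: one neuron per class


def forward(x):
    """Compute P(t = class | x) for each class, with untrained weights."""
    h = relu(dense(x, W1, b1))
    return softmax(dense(h, W2, b2))


probabilities = forward([0.5, 0.25])  # hypothetical (scaled) feature vector
predicted = CLASSES[probabilities.index(max(probabilities))]
```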
Image Classification by Hand Crafted Features

[Diagram: input image $I_1 \in \mathbb{R}^{r_1 \times c_1}$ → Feature Extraction Algorithm (Hand Crafted) → neural network classifier (Data Driven)]


Hand Crafted Features, pros:

• Exploit a priori / expert information


• Features are interpretable (you might understand why they are not
working)
• You can adjust features to improve your performance
• Limited amount of training data needed
• You can give more relevance to some features

Hand Crafted Features, cons:

• Requires a lot of design/programming efforts


• Not viable in many visual recognition tasks that are easily performed
by humans (e.g. when dealing with natural images)
• Risk of overfitting the training set used in the feature design
• Not very general and "portable"

What is Deep Learning after all?

[Diagram: Hand-crafted Features (Height, Area) → Learned Classifier → “double”]

Machine learns how to take the Iris apart
Sometimes the decision might be more complex
Sometimes the decision might be impossible!
This happens if you do not know which features to extract!!!

Data Driven Features
… the advent of Deep Learning
Data-Driven Features

[Diagram: input image $I_1 \in \mathbb{R}^{r_1 \times c_1}$ → Feature Extraction (Data Driven) → classifier (Data Driven)]


What is Deep Learning after all?

[Diagram: Machine Learned Features (Height, Area) → Learned Classifier → “double”. Annotations: "Optimized for the task!", "Easier to learn!"]

What is Deep Learning after all?

Learn from data! Hierarchical representation optimized for the task!

[Diagram: Learned features → Learned features → Learned features → Learned Classifier → “double”]

Deep Learning is about learning data representation from data!

But which data?

Deep Learning
a Breakthrough in Visual Recognition
Image Classification on ImageNet

Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems 25 (2012).
The impact of Deep Learning in Visual Recognition

[Figure: classification accuracy on ILSVRC over the years; annotation: "Many layers!"]

ILSVRC: ImageNet Large Scale Visual Recognition Challenge

How was this possible?


Large Collections of Annotated Data

The ImageNet project is a large visual database designed for use in visual
object recognition software research. More than 14 million images have been
hand-annotated by the project to indicate what objects are pictured and, in at
least one million of the images, bounding boxes are also provided. ImageNet
contains more than 20,000 categories.

From Wikipedia October 2021

J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei, ImageNet: A Large-Scale Hierarchical Image Database. CVPR, 2009.
Parallel Computing Architectures

https://www.flickr.com/photos/nvidia/34686550412
And more recently…. Software libraries

Google LLC, Public domain, via Wikimedia Commons


PyTorch, BSD <http://opensource.org/licenses/bsd-license.php>, via Wikimedia Commons
And of course "New" Network Architectures…

…but these were around since ‘97


You will learn to read this!
(required to pass the exam)

LeCun, Y., Bottou, L., Bengio, Y., Haffner, P. “Gradient-based learning applied to document recognition” Proceedings of
the IEEE, 1998 86(11), 2278-2324.
Technical Workshop
05.07.2022

https://awards.acm.org/about/2018-turing
Advanced Visual Recognition
Problems with DL
https://github.com/alexjc/neural-enhance

Style Transfer
https://github.com/jcjohnson/neural-style
https://github.com/jcjohnson/fast-neural-style
https://ml4a.github.io/ml4a/style_transfer/
https://github.com/luanfujun/deep-photo-styletransfer

Image Captioning

"little girl is eating piece of cake." "black cat is sitting on top of suitcase."

Andrej Karpathy, Li Fei-Fei "Deep Visual-Semantic Alignments for Generating Image Descriptions" CVPR 2015
Generative Adversarial Networks (these people do not exist)

Tero Karras, Samuli Laine, Timo Aila, “A Style-Based Generator Architecture for Generative Adversarial Networks”, CVPR 2019
Image Generation by Generative Adversarial Networks

Even though sometimes it fails…
