Machine Learning Algorithms for IoT
IFA’2021 1
DATA SCIENCE OVERVIEW
MACHINE LEARNING
How do we deal with so many devices and such a huge amount of data in IoT?
• Device Management
  – The number of devices in IoT is extremely large
  – They connect to each other and to the sinks over very large distances
  – Device connectivity is very important
  – The very large amount of collected data must be managed efficiently
MACHINE LEARNING BASICS
n Traditional Programming:
    Data (Input) + Model → Computer → Output
n Machine Learning:
    Data (Input) + Output → Computer → Learned Model
MACHINE LEARNING BASICS
n Machine learning gives computers/machines the ability to learn without being explicitly programmed
Training Data → Machine Learning Algorithm → Prediction
n It consists of methods that can learn from and make predictions on data
EXAMPLES OF MACHINE LEARNING PROBLEMS
n Computer Vision & Speech Processing & Data Analytics
n Pattern Recognition
– Facial identities or facial expressions
– Handwritten or spoken words (e.g., Siri)
– Medical images
– Sensor Data/IoT
n Pattern Generation
– Generating images or motion sequences
n Anomaly Detection
– Unusual patterns in the telemetry from physical and/or virtual plants
– Unusual sequences of credit card transactions
– Unusual patterns of sensor data from a nuclear power plant
(Figure: Facial Recognition, a Pattern Recognition Example)
n Prediction
– Future stock prices or currency exchange rates
EXAMPLES OF MACHINE LEARNING PROBLEMS
n Object Recognition Example:
Object Detected: Motorbike
PURPOSE OF MACHINE LEARNING ALGORITHMS
MACHINE LEARNING ALGORITHMS: INTRODUCTION
n Algorithms and techniques come from diverse fields, including statistics, mathematics, neuroscience, electrical engineering, mechanical engineering, industrial/systems engineering, and computer science.
MACHINE LEARNING ALGORITHMS: MAIN CATEGORIES
FOCUS: SUPERVISED and UNSUPERVISED LEARNING, since they have been and still are widely applied in IoT smart data analysis.
MACHINE LEARNING ALGORITHMS:
SUPERVISED LEARNING
n Objective is to learn how to predict the appropriate output vector for a given input
vector.
n Cases where the target labels are composed of one or more continuous
variables are known as regression tasks.
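Since regression targets are continuous, a minimal worked sketch may help: an ordinary least-squares line fit to a few (x, y) pairs. The numbers are invented for illustration, not from the slides.

```python
# Minimal regression sketch: fit a line y = slope*x + intercept by least squares.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]          # continuous targets, roughly y = 2x

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form least-squares estimates
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x

def predict(x):
    """Predict a continuous output for a new input."""
    return slope * x + intercept
```

The learned model then predicts a continuous value for unseen inputs, which is exactly what distinguishes a regression task from classification.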
MACHINE LEARNING ALGORITHMS:
SUPERVISED LEARNING EXAMPLE
Known Data + Known Responses (Labels) → trained model → New Data → "It's an apple!"
MACHINE LEARNING ALGORITHMS:
UNSUPERVISED LEARNING
n Used as a preprocessing stage, unsupervised learning can significantly improve the result of the subsequent machine learning algorithm; this stage is named feature extraction.
MACHINE LEARNING ALGORITHMS:
UNSUPERVISED LEARNING: CLUSTERING EXAMPLE
Input Data (No Labels) → Model (Clustering Algorithm) → Response: data clustered into groups (without knowing their classes). Pattern Detected!
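The pipeline above can be sketched with a plain k-means loop. The two-blob data and the deterministic initialization are illustrative assumptions, not from the slides.

```python
import numpy as np

def kmeans(X, k, iters=20):
    """Plain k-means: assign each point to its nearest centroid, then
    move each centroid to the mean of its assigned points."""
    # Deterministic initialization: k points spread across the dataset
    centroids = X[np.linspace(0, len(X) - 1, k).astype(int)].copy()
    for _ in range(iters):
        # Assignment step: label each point with its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: recompute each centroid from its members
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

# Two well-separated blobs; no class labels are given
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (20, 2)),
               rng.normal(8.0, 1.0, (20, 2))])
labels, centroids = kmeans(X, k=2)   # pattern detected: two clusters
```

The algorithm recovers the two groups purely from the geometry of the data, without ever seeing their classes.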
OVERVIEW OF ML ALGORITHMS
Data Classification:
  • k-Nearest Neighbors
  • Naïve Bayes
  • Support Vector Machines
Data Regression:
  • Linear Regression
  • Support Vector Regression
Data Classification/Regression:
  • Classification and Regression Trees
  • Random Forests
  • Bagging
Data Clustering:
  • k-Means
  • Density-Based Spatial Clustering of Applications with Noise
Data Feature Extraction:
  • Principal Component Analysis
  • Canonical Correlation Analysis
Anomaly Detection:
  • One-class Support Vector Machines
  • Feed Forward Neural Network*
OVERVIEW OF ML ALGORITHMS AND THEIR USE CASES IN IoT
Machine Learning Algorithm | IoT, Smart City Use Cases | Metric to Optimize
Classification | Smart Traffic | Traffic Prediction, Increase Data Abbreviation
Clustering | Smart Traffic, Smart Health | Traffic Prediction, Increase Data Abbreviation
Anomaly Detection | Smart Traffic, Smart Environment | Traffic Prediction, Increase Data Abbreviation, Finding Anomalies in Power Dataset
Support Vector Regression | Smart Weather Prediction | Forecasting
Linear Regression | Economics, Market Analysis, Energy Usage | Real-Time Prediction, Reducing Amount of Data
Classification and Regression Trees | Smart Citizens | Real-Time Prediction, Passengers' Travel Pattern
Support Vector Machine | All Use Cases | Classify Data, Real-Time Prediction
K-Nearest Neighbors | Smart Citizen | Passengers' Travel Pattern, Efficiency of the Learned Metric
Naive Bayes | Smart Agriculture, Smart Citizen | Food Safety, Passengers' Travel Pattern, Estimate the Number of Nodes
k-Means | Smart City, Smart Home, Smart Citizen, Controlling Air and Traffic | Outlier Detection, Fraud Detection, Analyze Small Data Sets, Forecasting Energy Consumption, Passengers' Travel Pattern, Stream Data Analysis
OVERVIEW OF ML ALGORITHMS AND THEIR USE CASES IN IoT
Machine Learning Algorithm | IoT, Smart City Use Cases | Metric to Optimize
Density-Based Clustering | Smart Citizen | Labeling Data, Fraud Detection, Passengers' Travel Pattern
Feed Forward Neural Network | Smart Health | Reducing Energy Consumption, Forecast the States of Elements, Overcome Redundant Data and Information
Principal Component Analysis | Monitoring Public Places | Fault Detection
Canonical Correlation Analysis | Monitoring Public Places | Fault Detection
One-class Support Vector Machines | Smart Human Activity Control | Fraud Detection, Emerging Anomalies in the Data
NEURAL NETWORKS: INTRODUCTION
(Figure: a biological neuron (dendrites, soma, axon) compared with an artificial node: inputs x1 … xn feed a function f(x) producing output y1; a synapse corresponds to a weight.)

Inputs are received by dendrites, and if the input levels are over a threshold, the neuron fires, passing a signal through the axon to the synapse, which then connects to another neuron.
NEURAL NETWORKS: INTRODUCTION
(Figure: mapping the brain to neural networks: inputs flow through an Input Layer, a Hidden Layer, and an Output Layer; weighted connections link the nodes.)
NEURAL NETWORKS:
FEEDFORWARD NEURAL NETWORK
n Hidden Nodes:
  – No direct connection with the outside world (hence the name "hidden").
  – They perform computations and transfer information from the input nodes to the output nodes.
  – A collection of hidden nodes forms a "Hidden Layer".
  – While a feedforward network will only have a single input layer and a single output layer, it can have zero or multiple hidden layers.
n Information is distributed
n Information processing is parallel
n Internal representation (interpretation) of data
PERCEPTRONS: FORWARD PROPAGATION
A linear (weighted) combination of the inputs and a bias (b = 1, weight θ0) is passed through a non-linear activation function g:

    ŷ = g(θ0 + Σ_{i=1..m} x_i θ_i)

Inputs (x1 … xm) → Weights (θ1 … θm) → Sum → Activation Function → Output ŷ
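A minimal sketch of this forward pass, assuming a sigmoid for the activation g; the inputs and weights are illustrative, not from the slides.

```python
import math

def sigmoid(z):
    """Smooth activation squashing any input into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def perceptron_forward(x, theta, theta0, g=sigmoid):
    """y_hat = g(theta0 + sum_i x_i * theta_i): weighted sum of the
    inputs plus the bias weight, passed through the activation g."""
    z = theta0 + sum(xi * ti for xi, ti in zip(x, theta))
    return g(z)

# Illustrative inputs and weights
y_hat = perceptron_forward(x=[1.0, 2.0], theta=[0.5, -0.25], theta0=0.1)
```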
ACTIVATION (TRANSFER) FUNCTIONS
(Figure: activation (transfer) functions plotted as output versus input, including a threshold function at t; outputs range from 0 to +1.)
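In code, common activation choices might look like the following. Exactly which curves the original figure plotted is an assumption here; a step, a sigmoid, and a ReLU are shown.

```python
import math

def step(z, t=0.0):
    """Threshold (step) activation: output +1 once the input exceeds t."""
    return 1.0 if z > t else 0.0

def sigmoid(z):
    """Smooth, differentiable activation squashing input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    """Rectified linear unit: passes positive inputs, zeroes negatives."""
    return max(0.0, z)
```

The step function matches the firing-threshold view of a neuron, while the differentiable sigmoid is what gradient-based training methods such as backpropagation rely on.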
BUILDING NEURAL NETWORKS WITH PERCEPTRONS:
MULTI-LAYER
FEEDFORWARD NEURAL NETWORKS:
TWO TYPES
(Figure: two feedforward network diagrams, each showing a hidden layer and an output layer.)
NEURAL NETWORKS: MODEL TRAINING MECHANISM
Training Dataset
(Table: input columns x1 x2 x3 x4 and target columns y1 y2 y3; each row is one training example, and the network learns the mapping x → y.)
PERCEPTRONS: TRAINING PERCEPTRONS
STEP 1:
– Inputs are given random weights (usually between –0.5 and 0.5)
STEP 2:
– An item of training data, an (x, y) pair, is presented, and the perceptron produces an estimated output ŷ
STEP 3:
– The loss function ||y_i – ŷ_i|| computes the error
STEP 4:
– Based on the error, the weights are modified according to (also known as back propagation):

    θ_i ← θ_i + a · x_i · (y_i − ŷ_i)

n The process of fine-tuning the weights and biases from the input data is known as training the Neural Network.
n Determine a weight vector that causes the perceptron to produce the correct ±1 output for each of the given training examples
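The four steps can be sketched as a training loop. Logical AND (which is linearly separable) serves as illustrative training data; the 0/1 thresholded output and the learning rate a = 0.1 are assumptions, not from the slides.

```python
import random

def train_perceptron(data, epochs=50, a=0.1, seed=0):
    """STEP 1: random weights in [-0.5, 0.5]; then for each (x, y) pair,
    compute y_hat (STEP 2), the error y - y_hat (STEP 3), and apply
    theta_i <- theta_i + a * x_i * (y - y_hat)  (STEP 4)."""
    rng = random.Random(seed)
    theta = [rng.uniform(-0.5, 0.5) for _ in range(len(data[0][0]))]
    theta0 = rng.uniform(-0.5, 0.5)          # bias weight (input fixed at 1)
    for _ in range(epochs):
        for x, y in data:
            z = theta0 + sum(xi * ti for xi, ti in zip(x, theta))
            y_hat = 1 if z > 0 else 0        # thresholded output
            err = y - y_hat
            theta = [ti + a * xi * err for xi, ti in zip(x, theta)]
            theta0 += a * err
    return theta, theta0

# Logical AND is linearly separable, so the updates converge
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
theta, theta0 = train_perceptron(data)
```

After training, the learned weight vector classifies all four training examples correctly, which is exactly the convergence behavior the perceptron rule guarantees for linearly separable data.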
PERCEPTRONS: DELTA RULE AND GRADIENT DESCENT
n If the training data are not linearly separable, another approach called the delta rule uses gradient descent
  – Same basic rule for finding the update values for weights
  – Changes/Differences:
    l Do not incorporate the threshold in the output value (un-thresholded perceptrons)
    l Wait to update the weights until the cycle is complete
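A minimal sketch of those two differences, assuming a tiny y = 2x dataset and a learning rate of 0.05 (both illustrative): the unit's output is un-thresholded, and the updates are accumulated over the whole cycle before being applied.

```python
# Delta rule with gradient descent on an un-thresholded linear unit.
def delta_rule_epoch(data, theta, a=0.05):
    """One batch epoch: accumulate a * (y - o) * x_i over all (x, y) pairs."""
    delta = [0.0] * len(theta)
    for x, y in data:
        o = sum(xi * ti for xi, ti in zip(x, theta))   # un-thresholded output
        for i, xi in enumerate(x):
            delta[i] += a * (y - o) * xi
    # Apply the accumulated updates only after the cycle is complete
    return [ti + di for ti, di in zip(theta, delta)]

# Illustrative data for y = 2x
data = [([1.0], 2.0), ([2.0], 4.0), ([3.0], 6.0)]
theta = [0.0]
for _ in range(100):
    theta = delta_rule_epoch(data, theta)
```

The weight converges smoothly toward 2.0 because each batch update follows the gradient of the squared error rather than reacting to individual thresholded mistakes.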
CONVERGENCE OF DELTA RULE
PERCEPTRONS: BACKPROPAGATION
n Each hidden node j is "responsible" for some fraction of the error δ_j^(l) in each of the output nodes to which it connects
n δ_j^(l) is divided according to the strength of the connection between the hidden node and the output node
n Then, the "blame" is propagated back to provide the error values for the hidden layer
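A numerical sketch of this blame assignment for one hidden node feeding one output node. The weights, input, and sigmoid activation are illustrative assumptions; the hidden δ also carries the sigmoid derivative a·(1 − a).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# One hidden node feeding one output node; all numbers are illustrative
x, y = 0.5, 1.0
w_hidden, w_out = 0.8, 1.2

# Forward pass
a_hidden = sigmoid(w_hidden * x)
a_out = sigmoid(w_out * a_hidden)

# Output-node error, as on the slides: delta = a - y
delta_out = a_out - y

# The hidden node's share of the blame flows back through the connecting
# weight, scaled by the sigmoid derivative a * (1 - a)
delta_hidden = w_out * delta_out * a_hidden * (1.0 - a_hidden)
```

The hidden error has the same sign as the output error (the connecting weight is positive) but a smaller magnitude, since only a fraction of the blame is assigned to this node.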
PERCEPTRONS: BACKPROPAGATION ALGORITHM
PERCEPTRONS: BACKPROPAGATION ALGORITHM
n When to stop:
– After fixed number of iterations
– Error falls below some threshold
– Once the error on a separate validation set of examples meets some criterion
– May not find the global minimum because there are many local minima
– Can run several times to find the global minimum; in practice it works well
PERCEPTRONS: BACKPROPAGATION
(Figure: a four-layer network with inputs x1, x2 and bias units +1; each node computes z → a = g(z), and each node j in layer l carries an error term δ_j^(l).)

δ_j^(l) = "error" of node j in layer l
PERCEPTRONS: BACKPROPAGATION
(Figure: the same network; the error is first computed at the output layer.)

δ_j^(l) = "error" of node j in layer l
δ_1^(4) = a_1^(4) − y
PERCEPTRONS: BACKPROPAGATION
(Figure: the output error flows back through the connecting weight Θ_12^(3).)

δ_j^(l) = "error" of node j in layer l
δ_2^(3) = Θ_12^(3) × δ_1^(4)
PERCEPTRONS: BACKPROPAGATION
(Figure: each layer-3 node receives its share of the output error through its own connecting weight.)

δ_j^(l) = "error" of node j in layer l
δ_1^(3) = Θ_11^(3) × δ_1^(4)
δ_2^(3) = Θ_12^(3) × δ_1^(4)
PERCEPTRONS: BACKPROPAGATION
(Figure: a layer-2 node accumulates blame from every layer-3 node it feeds, through weights Θ_12^(2) and Θ_22^(2).)

δ_j^(l) = "error" of node j in layer l
δ_2^(2) = Θ_12^(2) × δ_1^(3) + Θ_22^(2) × δ_2^(3)
EXAMPLE: HOUSING MARKET
n Given the data about the previous house sales, can we predict whether
a current house will be sold or not?
(Table: query row: price $100K, feature value 2, sold = ?)
EXAMPLE: HOUSING MARKET
(Figure: a forward pass on one training example: target 0, prediction 0.6, so Error = 0 − 0.6 = −0.6.)
EXAMPLE: STUDENT PASS/FAIL
n Given the data about the previous students, can we predict whether a
current student will pass or fail?
(Table: previous student: 12, 75 → 0 (fail); current student: 25, 70 → ?)
EXAMPLE: STUDENT PASS/FAIL
(Figure: Input Layer → Hidden Layer → Output Layer; for 25 hours studied, the network outputs probability of pass = 0.8, so the prediction says the student will pass!)
ONLINE TOOL: TENSORFLOW PLAYGROUND (http://playground.tensorflow.org)
EXAMPLE: SENSOR LOCALIZATION
n Example: the sensor node localization problem (i.e., determining a node's geographical position)
n Node localization can be based on propagating angle and distance measurements of the received
signals from anchor nodes.
n Such measurements may include received signal strength indicator (RSSI), time of arrival (TOA),
and time difference of arrival (TDOA) as in Figure.
n After several training rounds, the neurons can compute the location of the node.
DEEP LEARNING: INTRODUCTION
(Figure: a deep network classifying images as Car / Not Car.)
DEEP LEARNING: PERSPECTIVE
Artificial Intelligence: any technique that enables computers to mimic human intelligence.
  Machine Learning: a subset of AI that includes complex statistical techniques that enable machines to improve at tasks with data.
    Deep Learning: the subset of machine learning composed of algorithms that permit software to train itself, using multilayered neural networks with vast amounts of data.
DEEP LEARNING: VITAL FOR IoT
§ DL and IoT are among the top 3 strategic technology trends for 2017 that
were announced at Gartner Symposium/ITxpo 2016
§ IoT systems need different modern data analytic approaches and AI methods
according to the hierarchy of IoT data generation and management.
DEEP LEARNING ALGORITHMS:
SUMMARY OF DEEP LEARNING MODELS & IoT APPLICATIONS
RESEARCH TRENDS AND OPEN ISSUES
Challenge 1. IoT Data Characteristics
n High-quality information is required, since quality directly affects the accuracy of knowledge extraction.
n IoT data characteristics:
  – High volume
  – Fast velocity
  – Variety of data
  – Consists mostly of raw data
  – Distributed nature
n Solution: Semantic technologies tend to enhance the abstraction of IoT data through annotation algorithms, while they require further effort to overcome its velocity and volume.
RESEARCH TRENDS AND OPEN ISSUES
RESEARCH TRENDS AND OPEN ISSUES
n According to the characteristics of smart data, analytic algorithms should be able to handle big data
n Algorithms must be able to analyze
– Data coming from a variety of sources
– In real time
n Solution: Deep learning algorithms can reach high accuracy if they have enough data and time
– Cons:
l They can be easily influenced by noisy smart data
l Neural-network-based algorithms lack interpretability
  (data scientists cannot understand the reasons for the model's results)
l Semi-supervised algorithms, which model a small amount of labeled data together with a large amount of unlabeled data, can assist