Machine Learning & Deep Learning in Python & R
Submitted
By
SWATI
(03196302818)
DECLARATION
I, Swati, hereby declare that I have completed my two-week winter training at
Udemy (one of the world's leading online certification training providers) from 10
March 2021 to 23 March 2021 under the guidance of Vivek Sridhar. I further declare
that I have worked with full dedication during these two weeks of training and that my
learning outcomes fulfill the requirements of training for the award of the degree of
Bachelor of Technology (B.Tech.) in ECE, Maharaja Surajmal Institute of
Technology, New Delhi.
CERTIFICATE
ACKNOWLEDGEMENT
A research work owes its success, from commencement to completion, to the people who support the
researcher at various stages. Let me on this page express my gratitude to all those who helped me at
various stages of this study. First, I would like to express my sincere gratitude and indebtedness to
Dr. Pradeep Sangwan (HOD, Department of Electronics and Communication Engineering,
Maharaja Surajmal Institute of Technology, New Delhi) for allowing me to undergo the winter
training of 3 weeks at Udemy.
I am very grateful to our guide, Mrs. Neetu Sehrawat, for the help provided in the completion of the
report which was assigned to me. Without her friendly help and guidance it would have been difficult
to develop this report.
I pay my sincere thanks and gratitude to all the staff members of Udemy for their support and for
making our training valuable and fruitful.
ABSTRACT
Artificial Intelligence, Machine Learning and Deep Learning are buzzwords that have
held the interest of many researchers for a number of years.
Machine learning is an application of artificial intelligence (AI) that provides systems the
ability to automatically learn and improve from experience without being explicitly
programmed. Machine learning focuses on the development of computer programs that can
access data and use it to learn for themselves.
6. Technology Learnt
6.1. Introduction to Artificial Intelligence and Machine Learning
6.1.1. Definition of Artificial Intelligence
6.1.2. Definition of Machine Learning
6.1.3. Machine Learning Algorithms
6.1.4. Applications of Machine Learning
6.2. Techniques of Machine Learning
6.2.1. Supervised Learning
6.2.2. Unsupervised Learning
6.2.3. Semi-supervised Learning
6.2.4. Reinforcement Learning
6.2.5. Some Important Considerations in Machine Learning
6.3. Data Preprocessing
6.3.1. Data Preparation
6.3.2. Feature Engineering
6.3.3. Feature Scaling
6.3.4. Datasets
6.3.5. Dimensionality Reduction with Principal Component Analysis
6.4. Math Refresher
6.4.1. Concept of Linear Algebra
6.4.2. Eigenvalues, Eigenvectors, and Eigendecomposition
6.4.3. Introduction to Calculus
6.4.4. Probability and Statistics
6.5. Supervised Learning
6.5.1. Regression
6.5.1.1. Linear Regression
6.5.1.2. Multiple Linear Regression
6.5.1.3. Polynomial Regression
6.5.1.4. Decision Tree Regression
6.5.1.5. Random Forest Regression
6.5.2. Classification
6.5.2.1. Linear Models
6.5.2.1.1. Logistic Regression
6.5.2.1.2. Support Vector Machines
6.5.2.2. Nonlinear Models
6.5.2.2.1. K-Nearest Neighbors (KNN)
6.5.2.2.2. Kernel Support Vector Machines (SVM)
6.5.2.2.3. Naïve Bayes
6.5.2.2.4. Decision Tree Classification
6.5.2.2.5. Random Forest Classification
6.6. Unsupervised Learning
6.6.1. Clustering
6.6.1.1. Clustering Algorithms
6.6.1.2. K-means Clustering
6.7. Introduction to Deep Learning
6.7.1. Meaning and Importance of Deep Learning
6.7.2. Artificial Neural Networks
6.7.3. TensorFlow
7. Reason for Choosing Machine Learning
8. Learning Outcome
9. Bibliography
9.1. Content source
9.2. Picture from
9.3. Book referred
1. Introduction
1.1. A Taste of Machine Learning
Arthur Samuel, an American pioneer in the field of computer gaming and
artificial intelligence, coined the term "Machine Learning" in 1959.
Over the past two decades Machine Learning has become one of the mainstays
of information technology.
With the ever-increasing amounts of data becoming available there is good
reason to believe that smart data analysis will become even more pervasive as a
necessary ingredient for technological progress.
1.2. Relation to Data Mining
Data mining uses many machine learning methods, but with different goals; on
the other hand, machine learning also employs data mining methods as
"unsupervised learning" or as a preprocessing step to improve learner accuracy.
1.3. Relation to Optimization
Machine learning also has intimate ties to optimization: many learning problems
are formulated as minimization of some loss function on a training set of
examples. Loss functions express the discrepancy between the predictions of the
model being trained and the actual problem instances.
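As a minimal sketch of this idea, the following Python code minimizes a mean squared error loss with plain gradient descent. The toy training set, the linear model w*x + b, the learning rate and the number of steps are all assumptions made for illustration; they are not taken from the training material.

```python
# Hypothetical toy training set: inputs xs and targets ys (exactly y = 2x + 1).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def mse_loss(w, b):
    """Mean squared error between the predictions w*x + b and the targets."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(steps=2000, lr=0.05):
    """Minimize the loss with plain gradient descent, starting from w = b = 0."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


w, b = train()
print(round(w, 3), round(b, 3), mse_loss(w, b))
```

The loss starts at 33.0 for w = b = 0 and shrinks towards zero as the parameters approach the values that generated the data.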
1.4. Relation to Statistics
Michael I. Jordan suggested the term data science as a placeholder to call the
overall field.
Leo Breiman distinguished two statistical modelling paradigms: data model and
algorithmic model, wherein "algorithmic model" refers more or less to machine
learning algorithms such as random forests.
2. Technology Learnt
2.1. Introduction to AI & Machine Learning
2.1.1. Definition of Artificial Intelligence
Data Economy
The world is witnessing a real-time flow of all types of structured and unstructured data
from social media, communication, transportation, sensors, and devices.
International Data Corporation (IDC) forecasts that 180 zettabytes of data will
be generated by 2025.
This explosion of data has given rise to a new economy known as the
Data Economy.
Data is the new oil that is precious but useful only when cleaned and processed.
There is a constant battle for ownership of data between enterprises to
derive benefits from it.
Define Artificial Intelligence
Artificial intelligence refers to the simulation of human intelligence in machines that are
programmed to think like humans and mimic their actions. The term may also be applied to
any machine that exhibits traits associated with a human mind such as learning and problem-
solving.
Machine Learning is an approach or subset of Artificial Intelligence that is based on the idea
that machines can be given access to data along with the ability to learn from it.
Define Machine Learning
Machine learning is an application of artificial intelligence (AI) that provides systems the
ability to automatically learn and improve from experience without being explicitly
programmed. Machine learning focuses on the development of computer programs that can
access data and use it to learn for themselves.
Features of Machine Learning
Machine Learning is computing-intensive and generally requires a large amount
of training data.
It involves repetitive training to improve the learning and decision making of
algorithms.
As more data gets added, Machine Learning training can be automated for
learning new data patterns and adapting its algorithm.
2.1.3. Machine Learning Algorithms
Traditional Programming vs. Machine Learning Approach
Traditional Approach
Traditional programming relies on hard-coded rules.
Machine Learning Approach
Machine Learning relies on learning patterns based on sample data.
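The difference can be sketched in Python as follows. The spam example, the sample data and the threshold of 100 are hypothetical and chosen only to contrast a rule written by hand with a rule learned from samples.

```python
# Hypothetical sample data: (message length, is_spam) pairs, invented for illustration.
samples = [(12, 0), (15, 0), (20, 0), (48, 1), (55, 1), (60, 1)]


def rule_based(length):
    """Traditional approach: the threshold 100 is hard-coded by the programmer."""
    return 1 if length > 100 else 0


def learn_threshold(data):
    """Machine learning approach: pick the threshold with the fewest errors on the data."""
    candidates = sorted(x for x, _ in data)
    best_t, best_errors = None, len(data) + 1
    for t in candidates:
        errors = sum((1 if x > t else 0) != label for x, label in data)
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t


threshold = learn_threshold(samples)


def learned(length):
    """Classify with the threshold that was learned from the samples."""
    return 1 if length > threshold else 0


rule_errors = sum(rule_based(x) != y for x, y in samples)
learned_errors = sum(learned(x) != y for x, y in samples)
print(threshold, rule_errors, learned_errors)  # 20 3 0
```

The hard-coded rule misclassifies every spam sample, while the learned threshold fits the sample data without any rule being written by hand.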
Classification
Answers “What class?”
Applied when the output has finite and discrete values. Example: social
media sentiment analysis has three potential outcomes: positive,
negative, or neutral.
Regression
Answers “How much?”
Clustering
The most common unsupervised learning method is cluster analysis. It is used to find data
clusters so that each cluster has the most closely matched data.
Visualization Algorithms
Visualization algorithms are unsupervised learning algorithms that accept unlabeled data and
display this data in an intuitive 2D or 3D format. The data is separated into somewhat clear
clusters to aid understanding.
Anomaly Detection
This algorithm detects anomalies in data without any prior training.
Semi-supervised learning falls between unsupervised learning (without any labeled training
data) and supervised learning (with completely labeled training data).
Example of Semi-supervised Learning
Google Photos automatically detects the same person in multiple photos
from a vacation trip (clustering –unsupervised).
One has to just name the person once (supervised), and the name tag
gets attached to that person in all the photos.
2.2.4. Reinforcement Learning
Define Reinforcement Learning
Reinforcement Learning is a type of Machine Learning that allows the learning system to
observe the environment and learn the ideal behavior based on trying to maximize some
notion of cumulative reward.
It differs from supervised learning in that labelled input/output pairs need not be presented,
and sub-optimal actions need not be explicitly corrected. Instead, the focus is on finding a
balance between exploration (of uncharted territory) and exploitation (of current knowledge).
Features of Reinforcement Learning
The learning system (agent) observes the environment, selects and takes
certain actions, and gets rewards in return (or penalties in certain cases).
The agent learns the strategy or policy (choice of actions) that maximizes
its rewards over time.
Example of Reinforcement Learning
In a manufacturing unit, a robot uses deep reinforcement learning to identify
a device from one box and put it in a container.
The robot learns this by means of a rewards-based learning system,
which incentivizes it for the right action.
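The balance between exploration and exploitation can be illustrated with a small epsilon-greedy agent. This is only a sketch under stated assumptions: the environment is a hypothetical three-action "bandit" with Gaussian rewards, and the names run_bandit, true_means and epsilon are illustrative, not part of the course material. It is much simpler than the deep reinforcement learning used by the robot in the example above.

```python
import random


def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a multi-armed bandit with Gaussian rewards."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms          # how often each action was taken
    estimates = [0.0] * n_arms     # running average reward per action
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n_arms)  # exploration: try a random action
        else:
            # exploitation: take the action with the best estimate so far
            action = max(range(n_arms), key=lambda a: estimates[a])
        reward = rng.gauss(true_means[action], 1.0)
        counts[action] += 1
        # Incremental update of the average reward of the chosen action.
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward
    return counts, estimates, total_reward


# Hypothetical environment: the third action has the highest average reward.
counts, estimates, total = run_bandit([0.2, 0.5, 1.5])
print(counts, [round(e, 2) for e in estimates])
```

With epsilon = 0.1 the agent spends about one step in ten exploring and otherwise repeats the action it currently believes to be best, so over time it chooses the most rewarding action most often.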
2.2.5. Some Important Considerations in Machine Learning
Bias & Variance Tradeoff
Bias refers to error in the machine learning model due to wrong assumptions.
A high-bias model will underfit the training data.
Variance refers to error caused by oversensitivity of the model to small
variations in the training data, which leads to overfitting. A model with
many degrees of freedom (such as a high-degree polynomial model) is likely
to have high variance and thus overfit the training data.
Bias & Variance Dependencies
Increasing a model’s complexity will reduce its bias and increase its variance.
Conversely, reducing a model’s complexity will increase its bias and reduce its
variance. This is why it is called a tradeoff.
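The tradeoff can be observed on a small experiment. The sketch below does not use the polynomial model mentioned above; as a simpler stand-in it compares a constant model (very low complexity, high bias) with a 1-nearest-neighbour model (very high complexity, high variance). The data are hypothetical noisy samples of y = 2x.

```python
import random

rng = random.Random(42)


def make_data(n):
    """Hypothetical noisy samples of y = 2x on [0, 1)."""
    data = []
    for _ in range(n):
        x = rng.random()
        data.append((x, 2 * x + rng.gauss(0, 0.3)))
    return data


def mse(model, data):
    """Mean squared error of a model on a list of (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


train, test = make_data(30), make_data(200)

# High-bias model: always predicts the training mean and ignores x (underfits).
mean_y = sum(y for _, y in train) / len(train)


def constant_model(x):
    return mean_y


# High-variance model: copies the target of the nearest training point,
# so it memorizes the noise in the training set (overfits).
def nearest_neighbour_model(x):
    return min(train, key=lambda p: abs(p[0] - x))[1]


for name, model in [("constant", constant_model), ("1-NN", nearest_neighbour_model)]:
    print(name, round(mse(model, train), 3), round(mse(model, test), 3))
```

The 1-nearest-neighbour model reproduces the training set perfectly but makes errors on unseen data, while the constant model makes errors even on the data it was fitted to.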
What is Representational Learning
In Machine Learning, representation refers to the way the data is presented. This often makes
a huge difference in understanding.
Unlabeled Data
Test Data
Validation Data
2.3.2. Feature Engineering
Define Feature Engineering
The transformation stage in the data preparation process includes an important step known as
Feature Engineering.
Feature Engineering refers to selecting and extracting the right features from the data, that is,
those relevant to the task and model under consideration.
Aspects of Feature Engineering
Feature Selection
Most useful and relevant features are selected from the available data
Feature Addition
New features are created by gathering new data
Feature Extraction
Existing features are combined to develop more useful ones
Feature Filtering
Filter out irrelevant features to make the modelling step easy
Standardization
Standardization is a popular feature scaling method, which rescales each
feature to have the mean and standard deviation of a standard normal
distribution (a Gaussian distribution with mean 0 and standard deviation 1).
Standardization shifts and rescales the values; it does not change
the shape of their distribution.
The mean of each feature is centered at zero, and the feature column
has a standard deviation of one.
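A minimal Python sketch of standardization is given below. The function name standardize and the sample column are hypothetical, and the use of the population standard deviation (dividing by n) is an assumption of this sketch.

```python
import math


def standardize(column):
    """Return z-scores: subtract the column mean, divide by the standard deviation."""
    n = len(column)
    mean = sum(column) / n
    # Population standard deviation (divides by n); an assumption of this sketch.
    # A constant column has zero standard deviation and cannot be scaled this way.
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    return [(v - mean) / std for v in column]


heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]  # hypothetical feature column
z = standardize(heights_cm)
print([round(v, 3) for v in z])
```

After scaling, the column has a mean of zero and a standard deviation of one, whatever its original unit was.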
Normalization
In most cases, normalization refers to rescaling of data features
between 0 and 1, which is a special case of Min-Max scaling.
For each feature, subtract the min value from each feature instance
and divide by the spread between max and min:
x' = (x - min) / (max - min)
In effect, it measures the relative percentage of distance of each
instance from the min value for that feature.
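The rescaling described above, x' = (x - min) / (max - min), can be sketched in Python as follows. The function name min_max_scale and the sample column are hypothetical.

```python
def min_max_scale(column):
    """Rescale values to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(column), max(column)
    spread = hi - lo  # a constant column has zero spread and cannot be scaled this way
    return [(v - lo) / spread for v in column]


ages = [18, 30, 42, 66]  # hypothetical feature column
scaled = min_max_scale(ages)
print(scaled)  # [0.0, 0.25, 0.5, 1.0]
```

The smallest value maps to 0, the largest to 1, and every other value to its relative distance from the minimum.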
2.3.4. Datasets
Machine Learning problems often need training or testing datasets.
A dataset is a large repository of structured data.
In many cases, it has input and output labels that assist in Supervised Learning.
2.3.5. Dimensionality Reduction with Principal Component Analysis
Define Dimensionality Reduction
Dimensionality reduction involves transformation of data to new dimensions
in a way that facilitates discarding of some dimensions without losing any key
information.
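As a rough sketch of Principal Component Analysis, the code below reduces hypothetical 2-D points to one dimension. It finds the first principal component with power iteration on the covariance matrix, which is one simple way to obtain the leading eigenvector; library implementations typically use a different numerical method. All names and data here are assumptions made for illustration.

```python
import math


def first_principal_component(points, iterations=100):
    """Return (mean, unit direction) of the first principal component of 2-D points."""
    n = len(points)
    mean = [sum(p[i] for p in points) / n for i in range(2)]
    centered = [[p[0] - mean[0], p[1] - mean[1]] for p in points]
    # 2x2 covariance matrix of the centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(2)] for i in range(2)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    v = [1.0, 0.0]
    for _ in range(iterations):
        w = [cov[0][0] * v[0] + cov[0][1] * v[1], cov[1][0] * v[0] + cov[1][1] * v[1]]
        norm = math.hypot(w[0], w[1])
        v = [w[0] / norm, w[1] / norm]
    return mean, v


# Hypothetical 2-D points lying close to the line y = x.
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.0)]
mean, direction = first_principal_component(points)
# Keep one coordinate per point instead of two: the projection onto the component.
reduced = [(p[0] - mean[0]) * direction[0] + (p[1] - mean[1]) * direction[1] for p in points]
print([round(d, 3) for d in direction], [round(r, 3) for r in reduced])
```

Because the points lie almost on a line, the single retained coordinate keeps nearly all of the variance of the original two, which is the sense in which a dimension can be discarded without losing key information.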
2.5. Supervised Learning
2.5.1. Regression
In statistical modeling, regression analysis is a set of statistical processes for
estimating the relationships among variables.
It includes many techniques for modeling and analyzing several variables, when the
focus is on the relationship between a dependent variable and one or more
independent variables (or 'predictors').
More specifically, regression analysis helps one understand how the typical value of the
dependent variable (or 'criterion variable') changes when any one of the independent
variables is varied, while the other independent variables are held fixed.
2.5.1.1. Linear Regression
y = wx + b
where w is the weight (the slope of the line) and b is the bias, i.e. the value of the output for zero input
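A minimal sketch of fitting y = wx + b is shown below. It uses the closed-form least-squares estimates for one input variable; the function name fit_line and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares estimates of w (slope) and b (bias) for y = w*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    w = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    b = mean_y - w * mean_x
    return w, b


# Hypothetical data generated from y = 3x + 2 without noise.
hours = [1.0, 2.0, 3.0, 4.0]
scores = [5.0, 8.0, 11.0, 14.0]
w, b = fit_line(hours, scores)
print(w, b)  # 3.0 2.0
print(w * 0 + b)  # the bias b is the predicted output for zero input
```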
2.5.1.2. Multiple Linear Regression
It is a statistical technique used to predict the outcome of a response
variable through several explanatory variables and model the relationships
between them.
It represents a line fit between multiple inputs and one output, typically:
y = w1x1 + w2x2 + ... + wnxn + b
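The following sketch fits such a model, assuming the usual form y = w1x1 + w2x2 + ... + wnxn + b, by solving the least-squares normal equations in plain Python. The helper names solve and fit_multiple and the data are hypothetical; in practice a numerical library would normally be used for this step.

```python
def solve(a, rhs):
    """Solve the linear system a @ x = rhs by Gaussian elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_multiple(rows, ys):
    """Least-squares weights for y = w1*x1 + ... + wn*xn + b via the normal equations."""
    design = [list(row) + [1.0] for row in rows]  # extra column of ones for the bias b
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical data generated from y = 2*x1 + 3*x2 + 1.
rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 5.0)]
ys = [2 * x1 + 3 * x2 + 1 for x1, x2 in rows]
w1, w2, b = fit_multiple(rows, ys)
print(round(w1, 6), round(w2, 6), round(b, 6))
```

The last value returned is the bias, because the column of ones is appended after the input columns.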
2.5.2.2.1. K-Nearest Neighbors (KNN)
A new input point is classified in the category such that it has the
greatest number of neighbors from that category.
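The majority vote among neighbors can be sketched as follows. The function name knn_classify, the choice of k = 3, the Euclidean distance and the labelled points are assumptions made for illustration.

```python
from collections import Counter
import math


def knn_classify(train, point, k=3):
    """Label `point` by majority vote among its k nearest training points."""
    by_distance = sorted(train, key=lambda item: math.dist(item[0], point))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labelled points: (features, category).
train = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"),
    ((6.0, 6.0), "B"), ((6.5, 7.0), "B"), ((7.0, 6.5), "B"),
]
print(knn_classify(train, (1.8, 1.6)))  # A
print(knn_classify(train, (6.2, 6.4)))  # B
```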
2.5.2.2.4. Decision Tree Classification
Start at the tree root and split the data on the feature that, according to
the decision algorithm, results in the largest information gain (IG).
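The search for the split with the largest information gain can be sketched as below. This sketch assumes entropy as the impurity measure and a single numeric feature; other impurity measures, such as Gini impurity, are also in common use. The function names and data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the two children."""
    n = len(parent)
    weighted = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(parent) - weighted


def best_split(values, labels):
    """Try each threshold on one feature and return (gain, threshold) of the best one."""
    best = (0.0, None)
    for t in sorted(set(values))[:-1]:
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        gain = information_gain(labels, left, right)
        if gain > best[0]:
            best = (gain, t)
    return best


# Hypothetical feature values and class labels.
values = [1, 2, 3, 10, 11, 12]
labels = ["no", "no", "no", "yes", "yes", "yes"]
print(best_split(values, labels))  # (1.0, 3)
```

A full decision tree repeats this search on each resulting subset until the subsets are pure or another stopping rule applies.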
2.5.2.2.5. Random Forest Classification
Random decision forests correct for decision trees' habit of overfitting to
their training set.
Clustering means
Clustering is a Machine Learning technique that involves the
grouping of data points.
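One common clustering algorithm is K-means. The sketch below is a minimal version of it (Lloyd's algorithm) in plain Python; the points, the starting centroids and the fixed number of iterations are assumptions made for illustration.

```python
import math


def k_means(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate between assigning points and moving centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical 2-D points forming two visible groups; the starting centroids are arbitrary.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```

No labels are supplied: the grouping comes only from the distances between the data points.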
https://www.wikipedia.org/
https://towardsdatascience.com/
https://www.expertsystem.com/
https://www.udemy.org/
https://www.edureka.co/
https://subhadipml.tech/
https://www.forbes.com/
https://medium.com/
https://www.google.com/
4.2. All Pictures are from
https://www.google.com/
https://www.wikipedia.org/
https://www.youtube.com/
https://www.edureka.co/
4.3. Books referred
Hands-On Machine Learning with Scikit-Learn and TensorFlow by Aurélien Géron
Python Machine Learning by Sebastian Raschka
MAHARAJA SURAJMAL INSTITUTE OF TECHNOLOGY
Mobile No.: 9971964204
Broader Area: Machine Learning and Deep Learning with Python & R
No. of days scheduled for the training | Number of days actually attended | Curriculum scheduled for the student | Curriculum actually covered by the student
Week 3
Swati
(Signature of the student)
Any comments or suggestions for the student performance during the training program(to be filled by
instructor)
Abhishek Bansal
(Signature of the Instructor)
along with Seal
Note: Every student has to fill in and submit this proforma, duly signed by his/her instructor, to the
faculty-in-charge by the first week of September.