My Internship Report This Is My Project Work
DEV BHOOMI
UTTARAKHAND
UNIVERSITY
Submitted By:
RISHABH TYAGI
Under the guidance of my mentors and my respected
faculty members.
SoCSE
July 2022 – August 2022
Page 1 of 49
DECLARATION
I hereby declare that I completed my eight-week summer
training at YHills (one of the country’s leading online
certification training providers) from 1st July, 2022 to 31st
August, 2022 under the guidance of my tutor. I further declare
that I worked with full dedication during these eight weeks
of training and that my learning outcomes fulfil the training
requirements for the award of the degree of Bachelor of Science (B.Sc) in
IT, Dev Bhoomi Uttarakhand University.
__________________
(Signature of Student)
Name – Rishabh Tyagi
ID – 21BSIT0004
Date – 28/Sep/2022
ACKNOWLEDGEMENT
The success and final outcome of learning Machine Learning
required a lot of guidance and assistance from many people, and I
am extremely privileged to have received it throughout
my course and a few of the projects. All that I have done is
due to such supervision and assistance, and I would not forget to
thank them.
I respect and thank YHills for providing me an opportunity to
do the course and project work and for giving me all the support and
guidance that helped me complete the course duly. I am
extremely thankful to the course advisor, Miss Lata Thiru.
I am thankful and fortunate to have received constant
encouragement, support and guidance from all the teaching staff of
YHills, which helped me successfully complete my course
and project work.
__________________
(Signature of Student)
Name – Rishabh Tyagi
ID – 21BSIT0004
Date – 28/Sep/2022
TABLE OF CONTENTS
1. Introduction
1.1. A Taste of Machine Learning
1.2. Relation to Data Mining
1.3. Relation to Optimization
1.4. Relation to Statistics
1.5. Future of Machine Learning
2. Technology Learnt
2.1. Introduction to Artificial Intelligence and Machine Learning
2.1.1. Definition of Artificial Intelligence
2.1.2. Definition of Machine Learning
2.1.3. Machine Learning Algorithms
2.1.4. Applications of Machine Learning
2.2. Techniques of Machine Learning
2.2.1. Supervised Learning
2.2.2. Unsupervised Learning
2.2.3. Semi-supervised Learning
2.2.4. Reinforcement Learning
2.2.5. Some Important Considerations in Machine Learning
2.3. Data Preprocessing
2.3.1. Data Preparation
2.3.2. Feature Engineering
2.3.3. Feature Scaling
2.3.4. Datasets
2.3.5. Dimensionality Reduction with Principal Component Analysis
2.4. Math Refresher
2.4.1. Concept of Linear Algebra
2.4.2. Eigenvalues, Eigenvectors, and Eigendecomposition
2.4.3. Introduction to Calculus
2.4.4. Probability and Statistics
2.5. Supervised Learning
2.5.1. Regression
2.5.1.1. Linear Regression
2.5.1.2. Multiple Linear Regression
2.5.1.3. Polynomial Regression
2.5.1.4. Decision Tree Regression
2.5.1.5. Random Forest Regression
2.5.2. Classification
2.5.2.1. Linear Models
2.5.2.1.1. Logistic Regression
2.5.2.1.2. Support Vector Machines
1. Introduction
1.1. A Taste of Machine Learning
Arthur Samuel, an American pioneer in the field of computer gaming and artificial intelligence, coined
the term "Machine Learning" in 1959.
Over the past two decades Machine Learning has become one of the mainstays of information
technology.
With the ever-increasing amounts of data becoming available there is good reason to believe that smart
data analysis will become even more pervasive as a necessary ingredient for technological progress.
1.2. Relation to Data Mining
• Data mining uses many machine learning methods, but with different goals; on the other hand,
machine learning also employs data mining methods as "unsupervised learning" or as a
preprocessing step to improve learner accuracy.
1.3. Relation to Optimization
Machine learning also has intimate ties to optimization: many learning problems are formulated as
minimization of some loss function on a training set of examples.
Loss functions express the discrepancy between the predictions of the model being trained and the
actual problem instances.
1.4. Relation to Statistics
Michael I. Jordan suggested the term data science as a placeholder to call the overall
field.
Leo Breiman distinguished two statistical modelling paradigms: the data model and the algorithmic model,
where "algorithmic model" means, more or less, machine learning algorithms such as random
forest.
1.5. Future of Machine Learning
Machine Learning can be a competitive advantage to any company, be it a top MNC or a startup, as
tasks that are currently done manually will be done by machines tomorrow.
The Machine Learning revolution is here to stay, and the field has a long future ahead of it.
2. Technology Learnt
2.1. Introduction to AI & Machine Learning
2.1.1. Definition of Artificial Intelligence
Data Economy
The world is witnessing a real-time flow of all types of structured and unstructured data from social media,
communication, transportation, sensors, and devices.
International Data Corporation (IDC) forecasts that 180 zettabytes of data will be generated by 2025.
This explosion of data has given rise to a new economy known as the Data Economy.
Data is the new oil that is precious but useful only when cleaned and processed.
There is a constant battle for ownership of data between enterprises to derive benefits from it.
Define Artificial Intelligence
Artificial intelligence refers to the simulation of human intelligence in machines that are programmed to think
like humans and mimic their actions. The term may also be applied to any machine that exhibits traits associated
with a human mind such as learning and problem-solving.
Machine Learning is an approach or subset of Artificial Intelligence that is based on the idea that machines can be
given access to data along with the ability to learn from it.
Traditional Approach
Traditional programming relies on hard-coded rules.
Machine Learning can learn from labelled data (known as supervised learning) or unlabeled
data (known as unsupervised learning).
Image Processing
• Optical Character Recognition (OCR)
• Self-driving cars
• Image tagging and recognition
Robotics
• Industrial robotics
• Human simulation
Data Mining
• Association rules
• Anomaly detection
• Grouping and Predictions
Video games
• Pokémon
• PUBG
Text Analysis
• Spam Filtering
• Information Extraction
• Sentiment Analysis
Healthcare
• Emergency Room & Surgery
• Research
• Medical Imaging & Diagnostics
2.2. Techniques of Machine Learning
2.2.1. Supervised Learning
❖ Define Supervised Learning
Supervised learning is the machine learning task of learning a function that maps an input to an output based on
example input-output pairs. It infers a function from labeled training data consisting of a set of training
examples.
In supervised learning, each example is a pair consisting of an input object (typically a vector) and a desired
output value (also called the supervisory signal).
Feature Engineering
Reserve 80% of data for Training (Train_X) and 20% for Evaluation (Train_E)
Training Step
Design algorithmic logic
If the accuracy score is high, you have the final learned algorithm y = f(x).
If the accuracy score is low, go back to the training step.
Production Deployment
Use the learned algorithm y = f(x) to predict production data.
The algorithm can be improved by more training data, capacity, or algo redesign.
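The workflow above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the course's actual code: the toy dataset, the threshold "model", and the helper names split_train_eval, accuracy and make_classifier are hypothetical. Only the 80/20 split and the names Train_X and Train_E come from the text.

```python
import random


def split_train_eval(rows, train_fraction=0.8, seed=0):
    """Shuffle a copy of rows and split it into training and evaluation parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(predict, rows):
    """Fraction of (x, y) rows for which predict(x) equals y."""
    correct = sum(1 for x, y in rows if predict(x) == y)
    return correct / len(rows)


def make_classifier(threshold):
    """Hypothetical one-parameter model: predict 1 when x >= threshold."""
    return lambda x: int(x >= threshold)


# Toy labeled data (an assumption): the label is 1 when x is at least 50.
data = [(x, int(x >= 50)) for x in range(100)]
train_x, train_e = split_train_eval(data)  # 80% training, 20% evaluation

# Training step: keep the threshold with the best accuracy on the training part.
best_threshold = max(range(101),
                     key=lambda t: accuracy(make_classifier(t), train_x))
model = make_classifier(best_threshold)

# Evaluation step: score the learned model on data it has not seen.
eval_accuracy = accuracy(model, train_e)
print(f"threshold={best_threshold}, evaluation accuracy={eval_accuracy:.2f}")
```

If the evaluation accuracy were low, the loop would return to the training step with more data or a different model, as the workflow describes.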
If the learning is poor, we have an underfitted situation. The algorithm will not work well on
test data. Retraining may be needed to find a better fit.
If learning on training data is too intensive, it may lead to overfitting, a situation where the
algorithm is not able to handle new testing data that it has not seen before. The technique used to
keep the model general is called regularization.
Classification
Answers "What class?"
Applied when the output has finite and discrete values. Example: Social media
sentiment analysis has three potential outcomes: positive, negative, or neutral.
✓ Regression
Here the task of the machine is to group unsorted information according to similarities, patterns and differences
without any prior training on labeled data.
Clustering
The most common unsupervised learning method is cluster analysis. It is used to find data clusters so that each
cluster has the most closely matched data.
Visualization Algorithms
Visualization algorithms are unsupervised learning algorithms that accept unlabeled data and display this data
in an intuitive 2D or 3D format. The data is separated into somewhat clear clusters to aid understanding.
Anomaly Detection
This algorithm detects anomalies in data without any prior training.
2.2.3. Semi-supervised Learning
Semi-supervised learning falls between unsupervised learning (without any labeled training data) and
supervised learning (with completely labeled training data).
2.2.4. Reinforcement Learning
Reinforcement learning differs from supervised learning in that labelled input/output pairs need not be presented, and sub-optimal
actions need not be explicitly corrected. Instead, the focus is on finding a balance between exploration (of
uncharted territory) and exploitation (of current knowledge).
Conversely, reducing a model’s complexity will increase its bias and reduce its variance. This
is why it is called a tradeoff.
What is Representational Learning
In Machine Learning, representation refers to the way the data is presented. This often makes a huge difference
in understanding.
Unlabeled Data
Test Data
Validation Data
2.3.2. Feature Engineering
Define Feature Engineering
The transformation stage in the data preparation process includes an important step known as Feature
Engineering.
Feature Engineering refers to selecting and extracting the right features from the data, those that are relevant to the task
and model under consideration.
Feature Addition
New features are created by gathering new data
Feature Extraction
Feature Filtering
Standardization
Standardization is a popular feature scaling method that gives each feature the
mean and standard deviation of a standard normal distribution (also known as Gaussian distribution).
It rescales the values but does not change the shape of the feature's distribution.
The mean of each feature is centered at zero, and the feature column has a standard
deviation of one.
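As a rough sketch of the description above, standardization subtracts the feature mean and divides by the feature's standard deviation. The sample heights and the function name standardize are made up for illustration, and the population standard deviation is assumed.

```python
import statistics


def standardize(values):
    """Rescale values to zero mean and unit standard deviation (z-scores)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation (assumption)
    return [(v - mean) / std for v in values]


# Hypothetical feature column.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
scaled = standardize(heights_cm)

print(scaled)
print(statistics.fmean(scaled), statistics.pstdev(scaled))
```

A constant feature has a standard deviation of zero, so a real pipeline would need to guard against division by zero.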
✓ Normalization
Normalization (min-max scaling) rescales each feature to the range 0 to 1:
x' = (x - min) / (max - min)
In the given equation, subtract the min value for each feature from each feature
instance and divide by the spread between max and min.
In effect, it measures the relative percentage of distance of each instance from the min
value for that feature.
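The min-max rule described above might look like this in Python. The function name min_max_normalize and the sample ages are illustrative assumptions.

```python
def min_max_normalize(values):
    """Map each value to its relative position between the min and the max."""
    lo, hi = min(values), max(values)
    spread = hi - lo  # assumed non-zero, i.e. the feature is not constant
    return [(v - lo) / spread for v in values]


# Hypothetical feature column.
ages = [20, 30, 40, 60]
normalized = min_max_normalize(ages)
print(normalized)  # [0.0, 0.25, 0.5, 1.0]
```

The value 30 maps to 0.25 because it lies a quarter of the way from the minimum (20) to the maximum (60).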
2.3.4. Datasets
Machine Learning problems often need training or testing datasets.
A dataset is a large repository of structured data.
In many cases, it has input and output labels that assist in Supervised Learning.
2.3.5. Dimensionality Reduction with Principal Component Analysis
Define Dimensionality Reduction
Dimensionality reduction involves transformation of data to new dimensions in a way that
facilitates discarding of some dimensions without losing any key information.
Applications of PCA
Noise reduction
Compression
Preprocessing
2.4. Math Refresher
2.4.1. Concept of Linear Algebra
Linear Equation
Linear algebra is a branch of mathematics that deals with the study of vectors and linear
functions and equations.
A linear equation does not involve any products, inverses, or roots of variables. All variables
occur only to the first power and not as arguments for trigonometric, logarithmic, or
exponential functions.
A linear system that has a solution is called consistent, and the one with no solution is termed
inconsistent.
Matrix
An m × n matrix: the m rows are horizontal and the n columns are vertical. Each element of a matrix is often
denoted by a variable with two subscripts. For example, a2,1 represents the element at the second row and first
column of the matrix.
Addition
Two matrices can be added only if they have the same number of rows and columns. Also, during addition,
A + B = B + A.
Subtraction
Two matrices can be subtracted only if they have the same number of rows and columns. Also, during
subtraction, A - B is in general not equal to B - A.
Multiplication
The matrix product AB is defined only when the number of columns in A is equal to the number of rows in B.
BA is defined only when the number of columns in B is equal to the number of rows in A. AB is not always
equal to BA.
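The shape rule and the fact that AB and BA generally differ can be checked with a small pure-Python sketch. The helper matmul and the example matrices are illustrative assumptions; matrices are represented as lists of rows.

```python
def matmul(a, b):
    """Multiply matrix a (m x n) by matrix b (n x p), both given as lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("number of columns in A must equal number of rows in B")
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]

AB = matmul(A, B)  # multiplying by B on the right swaps the columns of A
BA = matmul(B, A)  # multiplying by B on the left swaps the rows of A
print(AB)  # [[2, 1], [4, 3]]
print(BA)  # [[3, 4], [1, 2]]
```

Both products are defined here because A and B are square, yet the results differ, which is the non-commutativity the text mentions.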
Transpose
Inverse
Symmetric Matrix
Identity Matrix
❖ Vector
Addition
The operation of adding two or more vectors together into a vector sum is referred to
as vector addition.
Subtraction
Vector subtraction is the process of subtracting two or more vectors to get a vector
difference.
Multiplication
Vector multiplication refers to a technique for the multiplication of two (or more)
vectors with each other. The dot product and the cross product are its two common forms.
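The three vector operations can be illustrated element by element. This is a sketch with hypothetical helper names (vec_add, vec_sub, dot); for multiplication it assumes the dot product, which is only one of several ways to multiply vectors.

```python
def vec_add(u, v):
    """Element-wise sum of two vectors of equal length."""
    return [a + b for a, b in zip(u, v)]


def vec_sub(u, v):
    """Element-wise difference of two vectors of equal length."""
    return [a - b for a, b in zip(u, v)]


def dot(u, v):
    """Dot product: multiply matching elements and add the results."""
    return sum(a * b for a, b in zip(u, v))


u = [1, 2, 3]
v = [4, 5, 6]
print(vec_add(u, v))  # [5, 7, 9]
print(vec_sub(u, v))  # [-3, -3, -3]
print(dot(u, v))      # 32
```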
Eigendecomposition
Integers can be broken into their prime factors to understand them, for example, 12 = 2 × 2 × 3.
From this, useful properties can be derived, for example, the number is not divisible by 5 and
is divisible by 2 and 3.
Similarly, matrices can be decomposed. This will help you discover information about the
matrix.
Differential Calculus
Differential calculus is a part of calculus that deals with the study of the rates at which quantities
change.
Let x and y be two real numbers such that y is a function of x, that is, y = f(x).
If f(x) is the equation of a straight line (linear equation), then the equation is represented as y =
mx + b.
where m is the slope, determined by the following equation:
m = (y2 - y1) / (x2 - x1)
for any two distinct points (x1, y1) and (x2, y2) on the line.
Integral Calculus
Integral Calculus assigns numbers to functions to describe displacement, area, volume, and
other concepts that arise by combining infinitesimal data.
Given a function f of a real variable x and an interval [a, b] of the real line, the definite integral
is defined informally as the signed area of the region in the xy-plane that is bounded by the graph
of f, the x-axis, and the vertical lines x = a and x = b.
The probability of any specific event is between 0 and 1 (inclusive), that is, 0 <= p(x) <= 1, and
the total probability over all possible outcomes is exactly 1. For a distribution over a continuous
variable x, this means ∫ p(x) dx = 1 (the integral of p over x).
Conditional Probability
Conditional Probability is a measure of the probability of an event occurring given that another
event has occurred.
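Conditional probability is usually computed as P(A|B) = P(A and B) / P(B). The sketch below checks this on a fair six-sided die; the die, the two events, and the helper names are assumptions chosen for illustration.

```python
from fractions import Fraction

# Sample space of a fair six-sided die (an illustrative assumption).
outcomes = range(1, 7)


def probability(event):
    """Probability of an event, given as a predicate over equally likely outcomes."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))


def conditional(event_a, event_b):
    """P(A | B) = P(A and B) / P(B)."""
    p_b = probability(event_b)
    p_a_and_b = probability(lambda o: event_a(o) and event_b(o))
    return p_a_and_b / p_b


def is_even(o):
    return o % 2 == 0


def greater_than_3(o):
    return o > 3


p = conditional(is_even, greater_than_3)
print(p)  # 2/3
```

Knowing that the roll is greater than 3 leaves the outcomes 4, 5 and 6, of which two are even, so the probability of an even roll rises from 1/2 to 2/3.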
Standard Deviation
Standard deviation is a quantity that expresses how much the members of a group differ
from the mean value for the group.
Standard deviation is used more often than variance because the unit in which it is measured is
the same as that of mean, a measure of central tendency.
Variance
Variance refers to the spread of the data set, that is, how far the numbers are
from the mean.
Variance is particularly useful when calculating the probability of future events or performance.
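The relationship between the two measures can be seen in a short computation: variance is the mean squared distance from the mean, and standard deviation is its square root. The sketch assumes the population form (dividing by n), and the sample scores are made up.

```python
import math


def variance(values):
    """Population variance: the mean of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def std_dev(values):
    """Standard deviation: square root of the variance, in the data's own units."""
    return math.sqrt(variance(values))


# Hypothetical data with mean 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(scores))  # 4.0
print(std_dev(scores))   # 2.0
```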
Logistic Sigmoid
The Logistic Sigmoid is a useful function that follows the S curve. It saturates when input is
very large or very small.
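The logistic sigmoid is commonly written as sigmoid(z) = 1 / (1 + e^(-z)). A minimal sketch, with the function name and the test inputs chosen here for illustration, shows the S-shape and the saturation at both ends:

```python
import math


def sigmoid(z):
    """Logistic sigmoid, written in a form that avoids overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # safe: z is negative, so exp(z) is at most 1
    return e / (1.0 + e)


for z in (-50, -2, 0, 2, 50):
    print(z, sigmoid(z))
```

At z = 0 the output is exactly 0.5. For very large inputs it approaches 1 and for very small inputs it approaches 0, which is the saturation described above.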
Gaussian Distribution
The distribution where the data tends to be around a central value with lack of bias or minimal
bias toward the left or right is called Gaussian distribution, also known as normal distribution.
2.5.1.1. Linear Regression
Linear regression is a linear approach for modeling the relationship between a scalar
dependent variable y and an independent variable x.
y = wx + b
where w is the weight (slope) and b is the bias, the value of the output for zero input
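One common way to find w and b is ordinary least squares, which picks the line minimizing the sum of squared errors. The sketch below uses the closed-form solution for a single input; the function name fit_line and the sample points are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = w*x + b for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / sxx
    # Bias: makes the line pass through the point of means.
    b = mean_y - w * mean_x
    return w, b


# Hypothetical points lying exactly on y = 2x + 1.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
w, b = fit_line(xs, ys)
print(w, b)  # 2.0 1.0
```

With noisy data the fitted line would not pass through every point, but the same formulas still give the best fit in the squared-error sense.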
2.5.1.2. Multiple Linear Regression
It is a statistical technique used to predict the outcome of a response variable through
several explanatory variables and model the relationships between them.
It represents a line (more precisely, a hyperplane) fitted between multiple inputs and one output, typically:
y = w1x1 + w2x2 + ... + wnxn + b
The probability in the logistic regression is often represented by the Sigmoid function
(also called the logistic function or the S-curve)
The optimization objective is to find the "maximum margin hyperplane" that is farthest from
the closest points in the two classes (these points are called support vectors).
2.5.2.2. Nonlinear Models
2.5.2.2.1. K-Nearest Neighbors (KNN)
The K-nearest Neighbors algorithm assigns a data point to a class based on a
similarity measurement.
A new input point is classified into the category that has the greatest number of
members among its K nearest neighbors.
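The voting rule can be sketched as follows. This is an illustrative version only: it assumes squared Euclidean distance as the similarity measure and K = 3, and the function name knn_classify and the training points are made up.

```python
from collections import Counter


def knn_classify(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train is a list of (point, label) pairs; points are tuples of numbers.
    """
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    nearest = sorted(train, key=lambda item: sq_dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labeled points forming two well-separated groups.
train = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
    ((6.0, 6.0), "B"), ((7.0, 6.5), "B"), ((6.5, 7.0), "B"),
]

print(knn_classify(train, (1.2, 1.4)))  # A
print(knn_classify(train, (6.4, 6.2)))  # B
```

An odd K is a common choice for two-class problems because it avoids tied votes.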
2.5.2.2.2. Kernel Support Vector Machines (SVM)
Kernel SVMs are used for classification of nonlinear data.
In the chart, nonlinear data is projected into a higher dimensional space via a mapping
function where it becomes linearly separable.
A reverse projection of the higher dimension back to the original feature space takes it back to
its nonlinear shape.
2.5.2.2.3. Naïve Bayes
According to Bayes' model, the conditional probability P(Y|X) can be calculated as:
P(Y|X) = P(X|Y) P(Y) / P(X)
This means you have to estimate a very large number of P(X|Y) probabilities for a
relatively small vector space X.
Start at the tree root and, using the decision algorithm, split the data on the feature
that results in the largest information gain (IG).
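Information gain is the impurity of the parent node minus the weighted impurity of the child nodes produced by a split. The sketch below assumes entropy as the impurity measure (Gini impurity is another common choice); the function names and toy labels are illustrative.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical node with four "yes" and four "no" samples.
parent = ["yes"] * 4 + ["no"] * 4
perfect_split = [["yes"] * 4, ["no"] * 4]
useless_split = [["yes", "yes", "no", "no"], ["yes", "yes", "no", "no"]]

print(information_gain(parent, perfect_split))  # 1.0
print(information_gain(parent, useless_split))  # 0.0
```

A tree-building algorithm would evaluate each candidate feature this way and split on the one with the highest gain.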
2.5.2.2.5. Random Forest Classification
Random decision forests correct for decision trees' habit of overfitting to their training set.
Random forests or random decision forests are an ensemble learning method for
classification, regression and other tasks that operates by constructing a multitude of
decision trees at training time and outputting the class that is the mode of the classes
(classification) or mean prediction (regression) of the individual trees.
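The final aggregation step described above is simple to sketch: take the mode of the trees' votes for classification and the mean of their outputs for regression. The per-tree predictions below are made-up inputs, and training the individual trees is outside the scope of this sketch.

```python
from collections import Counter
from statistics import fmean


def forest_classify(tree_votes):
    """Classification: return the class predicted by the most trees (the mode)."""
    return Counter(tree_votes).most_common(1)[0][0]


def forest_regress(tree_outputs):
    """Regression: return the mean of the individual tree predictions."""
    return fmean(tree_outputs)


# Hypothetical predictions from five and three trees respectively.
print(forest_classify(["spam", "ham", "spam", "spam", "ham"]))  # spam
print(forest_regress([10.0, 12.0, 14.0]))                       # 12.0
```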
Clustering means
✓ Clustering is a Machine Learning technique that involves the grouping of data points.
Step 4: calculate the distance of the centroids from each point again
Step 5: move points across clusters and re-calculate the distance from the centroid
Step 6: keep moving the points across clusters until the Euclidean distance is
minimized
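The steps above amount to repeating two moves until nothing changes: assign every point to its nearest centroid, then move each centroid to the mean of its points. The sketch below is one possible reading, not the report's exact procedure: the starting centroids are supplied by hand (an assumption, since the initial steps are not shown here), and the sample points and function names are illustrative.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, centroids, max_iterations=100):
    """Alternate assignment and centroid update until the centroids stop moving."""
    clusters = []
    for _ in range(max_iterations):
        # Assignment: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update: move each centroid to the mean of its cluster
        # (an empty cluster keeps its old centroid).
        new_centroids = [mean_point(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, clusters


# Hypothetical 2D points forming two groups, with hand-picked starting centroids.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
initial = [(0.0, 0.0), (10.0, 10.0)]
centroids, clusters = k_means(points, initial)
print(centroids)
print(clusters)
```

In practice the starting centroids are usually chosen at random or with a seeding heuristic, and the result can depend on that choice.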
Elbow Method
One could plot the distortion against the number of clusters K. Intuitively, if K increases,
distortion should decrease. This is because the samples will be closer to their assigned centroids.
Choosing K from this plot is called the Elbow method.
It indicates the optimum number of clusters at the position of the elbow, the point beyond which
adding more clusters reduces distortion only slightly.
Euclidean Distance
✓ K-means is based on finding points close to cluster centroids. The distance between
two points x and y can be measured by the squared Euclidean distance between them
in an m-dimensional space.
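Written out, the squared Euclidean distance between two points x and y in an m-dimensional space takes the standard form below. The index j over dimensions is notation introduced here for illustration.

```latex
d(x, y)^2 = \sum_{j=1}^{m} (x_j - y_j)^2 = \lVert x - y \rVert_2^2
```

For example, for x = (1, 2) and y = (4, 6) the squared distance is 3^2 + 4^2 = 25.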
Experts have discovered multi-layered learning networks that can be leveraged for deep learning
as they learn in layers.
Scientists have figured out that high-performing graphics processing units (GPU) can be used for
deep learning.
ML Vs Deep Learning
It is an interconnected group of nodes akin to the vast network of layers of neurons in a brain.
2.7.3. TensorFlow
TensorFlow is an open-source Deep Learning library provided by Google.
It allows development of a variety of neural network applications such as computer vision, speech
processing, or text recognition.
It uses data flow graphs for numerical computations.
3. Reason for choosing Machine Learning
➢ Learning machine learning brings in better career opportunities
Machine learning is the shining star of the moment.
With every industry looking to apply AI in its domain, studying machine learning opens a world of
opportunities to develop cutting-edge machine learning applications in various verticals, such
as cyber security, image recognition, medicine, or face recognition.
With several companies on the verge of hiring skilled ML engineers, machine learning is becoming
the brain behind business intelligence.
➢ Machine Learning Jobs on the rise
Major hiring is happening at all top tech companies in search of that special kind of person
(the machine learning engineer) who can build a hammer (machine learning algorithms).
The job market for machine learning engineers is not just hot, it’s sizzling.
Machine Learning Jobs on Indeed.com - 2,500+(India) & 12,000+(US)
4. Learning Outcome
Have a good understanding of the fundamental issues and challenges of machine learning: data, model
selection, model complexity, etc.
Have an understanding of the strengths and weaknesses of many popular machine learning approaches.
Appreciate the underlying mathematical relationships within and across Machine Learning algorithms
and the paradigms of supervised and unsupervised learning.
Be able to design and implement various machine learning algorithms in a range of real-world
applications.
Be able to integrate machine learning libraries and mathematical and statistical tools with modern
technologies.
Be able to understand and apply techniques for scaling up machine learning, along with the associated computing
techniques and technologies.
5. Gantt Chart
6. Bibliography
6.1. All Content used in this report is from
https://www.simplilearn.com/
https://www.wikipedia.org/
https://towardsdatascience.com/
https://www.expertsystem.com/
https://www.coursera.org/
https://www.edureka.co/
https://subhadipml.tech/
https://www.forbes.com/
https://medium.com/
https://www.google.com/
6.2. All Pictures are from
https://www.simplilearn.com/
https://www.google.com/
https://www.wikipedia.org/
https://www.youtube.com/
https://www.edureka.co/
6.3. Books I referred to are