ML Unit 1 CS
Introduction
KCS 055/ KOE 073: Machine Learning
Anurag Malik
(Associate Prof. CS & E)
4. Bishop, C., Pattern Recognition and Machine Learning. Berlin: Springer-Verlag.
• Some learning is immediate, induced by a single event (e.g. being burned by a hot
stove), but much skill and knowledge accumulates from repeated experiences.
Well-Posed Learning Problems
Learning can be defined through a computer program that improves its
performance at some task through experience.
Definition of Learning: A computer program is said to learn from
experience E with respect to some class of tasks T and performance
measure P, if its performance at tasks in T, as measured by P, improves
with experience E.
Let's look at some examples of well-posed learning problems:
Learn to Play Checkers
Learn to recognize spoken words (SPHINX System)
Learning to drive an autonomous vehicle (ALVINN System)
Learning to classify new astronomical structures
Predict recovery rates of pneumonia patients
Detect fraudulent use of credit cards
What experience?
E: play games against itself (advantage of getting a lot of data this way)
Now use the legal moves to generate every subsequent board state
and use V to choose the best one and therefore the best legal move
V: Board ->
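The move-selection step described above can be sketched as follows. This is a minimal illustration, not the course's implementation: `choose_best_move`, `toy_legal_moves`, `toy_apply_move`, and `toy_V` are hypothetical names, and the "board" is a plain integer standing in for a real checkers position.

```python
from typing import Any, Callable, Iterable


def choose_best_move(
    board: Any,
    legal_moves: Callable[[Any], Iterable[Any]],
    apply_move: Callable[[Any, Any], Any],
    V: Callable[[Any], float],
) -> Any:
    """Return the legal move whose successor board state scores highest under V."""
    return max(legal_moves(board), key=lambda move: V(apply_move(board, move)))


# Toy stand-in for a game (assumption): a board is an integer, a move adds to it.
def toy_legal_moves(board: int) -> list:
    return [-1, 1, 2]


def toy_apply_move(board: int, move: int) -> int:
    return board + move


def toy_V(board: int) -> float:
    # Hypothetical evaluation function: boards closer to 10 score higher.
    return -abs(board - 10)


best = choose_best_move(7, toy_legal_moves, toy_apply_move, toy_V)
print(best)
```

Here the successor states of board 7 are 6, 8, and 9, and the move leading to 9 is chosen because V ranks that state highest.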
2. Critic: Take history of problem as input and produce a set of training examples of
target function as output.
3. Generalizer: Take training examples as input and produce estimate of target function
as output hypothesis. It generalizes from specific training examples, hypothesizing a
general function that covers all examples.
https://www.youtube.com/watch?v=Cx5aNwnZYDc
https://www.youtube.com/watch?v=YhSeTEumjVA
https://www.youtube.com/watch?v=ZoemTySxFso
1. Data Science: Data Science is a field about processes and systems to extract data from structured and semi-structured data.
   Machine Learning: Machine Learning is a field of study that gives computers the capability to learn without being explicitly programmed.
2. Data Science: Needs the entire analytics universe.
   Machine Learning: A combination of machine and data science.
7. Data Science: Many operations of data science, that is, data gathering, data cleaning, data manipulation, etc.
   Machine Learning: It is of three types: unsupervised learning, reinforcement learning, and supervised learning.
4. Deeplearning4j
Deeplearning4j is described as the first open-source, commercial-grade, distributed deep learning library
developed for Scala and Java. Its easy-to-use infrastructure makes it attractive for non-researchers. The
most fascinating quality of DL4J is that it can import neural net models from many major frameworks via
Keras, including Theano, Caffe, and TensorFlow.
5. Torch
Torch is also an open-source machine learning library, used by many large IT organizations,
including Yandex, IBM, Idiap Research Institute, and Facebook AI Research Group. It can also be described
as a scientific computing framework and a scripting language based on the Lua programming language.
After its successful use on web platforms, Torch has also been extended for use on iOS and
Android.
[Figure: learning system diagram. Recoverable labels: universal set (unobserved), data acquisition, input samples, training, testing, learning method, system, practical usage.]
[Figure labels: supervised learning, unsupervised learning.]
Steps 1 to 4 are different forms of data preprocessing, where the data are prepared for
mining. The data mining step may interact with the user or a knowledge base. The
interesting patterns are presented to the user and may be stored as new knowledge in the
knowledge base.
• Developer labels sample data and sets strict boundaries upon which the
algorithm operates.
• Example:
CONS
• Concrete examples are required for training classifiers.
• Linear Regression
• k-Nearest Neighbor
• Naive Bayes
• Decision Trees
• Support Vector Machine (SVM)
• Random Forest
• Neural Networks (Deep learning)
[Figure: decision tree. Recoverable branch labels: <=30, overcast, 31..40, >40; leaf labels: no, yes, no, yes.]
P(H | X) = P(X | H) P(H) / P(X)
Informally, this can be written as
posterior = likelihood × prior / evidence
Predicts X belongs to Ci iff the probability P(Ci|X) is the
highest among all the P(Ck|X) for all the k classes
Practical difficulty: requires initial knowledge of many
probabilities and incurs significant computational cost
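The classification rule above (pick the class Ci with the highest P(Ci|X)) can be sketched in a few lines. This is an illustrative sketch only: the function names `posterior` and `predict`, the class labels, and the prior and likelihood values are all made-up assumptions, not data from the course.

```python
def posterior(priors: dict, likelihoods: dict) -> dict:
    """Apply Bayes' theorem per class: P(Ci|X) = P(X|Ci) P(Ci) / P(X)."""
    # Evidence P(X) is the sum over all classes of P(X|Ck) P(Ck).
    evidence = sum(likelihoods[c] * priors[c] for c in priors)
    return {c: likelihoods[c] * priors[c] / evidence for c in priors}


def predict(priors: dict, likelihoods: dict) -> str:
    """Return the class with the highest posterior probability."""
    post = posterior(priors, likelihoods)
    return max(post, key=post.get)


# Hypothetical numbers: P(Ci) and P(X|Ci) for two classes.
priors = {"yes": 0.6, "no": 0.4}
likelihoods = {"yes": 0.1, "no": 0.3}

post = posterior(priors, likelihoods)
label = predict(priors, likelihoods)
print(post, label)
```

With these assumed numbers, the class "no" wins even though its prior is lower, because its likelihood is three times higher.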
September 30, 2023 69
Artificial Neural Networks
Artificial neural networks (ANNs) were started by psychologists and neurobiologists to develop and
test computational analogues of neurons.
Other names:
1. Connectionist learning
2. Prediction by NN
3. Adaptive networks
4. Neural computation
5. Parallel distributed processing
6. Collective computation
Artificial neural networks components:
Units: A neural network is composed of a number of nodes, or units. A unit is a metaphor for the
nerve cell body.
Links: Units are connected by links. Links represent synaptic connections from one unit to
another.
Weight: Each link has a numeric weight.
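To make the three components concrete, here is a minimal sketch of a single unit that combines the values arriving on its incoming links using their weights. The sigmoid activation and the `bias` parameter are assumptions added for illustration; the text above only specifies units, links, and weights.

```python
import math


def unit_output(inputs: list, weights: list, bias: float = 0.0) -> float:
    """Compute one unit's output from its incoming links.

    Each input arrives over a link, and each link has a numeric weight.
    The sigmoid squashing function is an assumed choice of activation.
    """
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-total))


# Two incoming links with weights 0.5 and -0.5 (made-up values).
out = unit_output([1.0, 0.0], [0.5, -0.5])
print(out)
```

A full network would feed the outputs of such units into further units through more weighted links.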
• PCA
• t-SNE
• k-means
• DBSCAN
• Apriori algorithm
• FP-Growth
• Output: There are many possible outputs, as there are a variety of solutions to a
particular problem.
• Training: Training is based upon input. The model returns a state, and the
user decides whether to reward or punish the model based on its output.
• Q-learning
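As a rough sketch of how reward and punishment can drive learning, the following shows one tabular Q-learning update step. The learning rate `alpha`, discount factor `gamma`, the state and action names, and the reward values are all illustrative assumptions, not values given in the course material.

```python
from collections import defaultdict


def q_update(Q, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """Apply one Q-learning update and return the new Q(state, action).

    Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(Q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    Q[(state, action)] += alpha * (target - Q[(state, action)])
    return Q[(state, action)]


Q = defaultdict(float)          # all action values start at 0.0
actions = ["left", "right"]     # hypothetical action set

q_update(Q, "s0", "right", 1.0, "s1", actions)   # rewarded action
q_update(Q, "s0", "left", -1.0, "s1", actions)   # punished action
print(Q[("s0", "right")], Q[("s0", "left")])
```

After these two updates the rewarded action has a higher value than the punished one, so a greedy policy in state s0 would prefer "right".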
Missing values
Outliers
Bad encoding (for text)
Wrongly-labeled examples
Biased data
• Do I have many more samples of one class than the rest?
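One simple way to answer the class-balance question is to count the labels and look at each class's share. This is a minimal sketch; the helper name `class_balance` and the example labels are made up for illustration.

```python
from collections import Counter


def class_balance(labels: list) -> dict:
    """Return the fraction of samples belonging to each class."""
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}


# Hypothetical label column: 2 fraud cases among 10 transactions.
labels = ["fraud"] * 2 + ["ok"] * 8
shares = class_balance(labels)
print(shares)
```

A share far from an even split, as in this made-up example, signals that one class has many more samples than the rest.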
• Two steps:
  Variable transformation (e.g., dates into weekdays,
  normalizing)
  Feature creation (e.g., n-grams for texts, whether a word is
  capitalized to detect names, etc.)
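The two feature engineering steps can be illustrated with a few small helpers. This is a sketch using only the Python standard library; the function names and sample inputs are hypothetical, and min-max scaling is just one assumed form of normalizing.

```python
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


# --- Variable transformation ---
def weekday_name(d: date) -> str:
    """Turn a date into its weekday name."""
    return DAYS[d.weekday()]


def min_max_normalize(values: list) -> list:
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# --- Feature creation ---
def word_ngrams(text: str, n: int = 2) -> list:
    """Return all word-level n-grams of a text as tuples."""
    words = text.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def is_capitalized(word: str) -> bool:
    """Flag words starting with an uppercase letter (a crude name detector)."""
    return word[:1].isupper()


print(weekday_name(date(2023, 9, 30)))
print(min_max_normalize([10, 20, 30]))
print(word_ngrams("machine learning is fun"))
print(is_capitalized("Anurag"), is_capitalized("stove"))
```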
4. Algorithm Selection & Training
• Supervised
    • Linear classifier
    • Naive Bayes
    • Support Vector Machines (SVM)
    • Decision Tree
    • Random Forests
    • k-Nearest Neighbors
    • Neural Networks (Deep learning)
• Unsupervised
    • PCA
    • t-SNE
    • k-means
    • DBSCAN
    • Apriori algorithm
    • FP-Growth
• Reinforcement
    • SARSA-λ
    • Q-Learning
    • Markov Decision Process
• Goal of training: make the correct prediction as often as
possible.
•Incremental improvement:
Interval-Scaled Attributes
Continuous measurements on a roughly
2. Machine Learning: The data representation in Machine Learning is quite different from that in Deep Learning, as it uses structured data.
   Deep Learning: The data representation in Deep Learning is quite different, as it uses neural networks (ANN).
5. Machine Learning: Outputs: a numerical value, like a classification or score.
   Deep Learning: Anything from numerical values to free-form elements, such as free text and sound.
6. Machine Learning: Uses various types of automated algorithms that turn to model functions and predict future action from data.
   Deep Learning: Uses a neural network that passes data through processing layers to interpret data features and relations.
7. Machine Learning: Algorithms are directed by data analysts to examine specific variables in data sets.
   Deep Learning: Algorithms are largely self-directed on data analysis once they're put into production.
8. Machine Learning: Machine Learning is highly used to stay in the competition and learn new things.
   Deep Learning: Deep Learning solves complex machine learning issues.