Machine learning
The goals of machine learning can vary depending on the specific application, but
some common objectives include:
1. Prediction: Predicting future outcomes or trends based on past data. This could
include forecasting stock prices, predicting customer behaviour, or estimating
disease risk.
2. Classification: Categorizing data into different classes or groups based on its
features. For example, classifying emails as spam or non-spam, or identifying
whether a tumour is malignant or benign based on medical imaging data.
3. Clustering: Organizing data into natural groupings or clusters based on
similarities between data points. This could be used for customer segmentation
in marketing or for grouping similar documents in text analysis.
4. Pattern Recognition: Identifying patterns and relationships within data that
can be used to make decisions or extract insights. This could involve
recognizing handwriting, detecting anomalies in network traffic, or identifying
fraudulent transactions.
5. Optimization: Finding the best possible solution to a problem by iteratively
improving performance based on feedback. This is often used in areas like
recommendation systems, where the goal is to recommend items that are most
relevant to a user's preferences.
Machine learning has a wide range of applications across various industries. However, building effective models involves a number of common challenges, including:
1. Overfitting:
o Definition: The model learns the training data too well, capturing noise
and details that do not generalize to new data.
2. Underfitting:
o Definition: The model is too simple to capture the underlying patterns in
the data, leading to poor performance on both training and test data.
3. Data Quality:
o Definition: Poor quality data, such as missing values, noise, and outliers,
can significantly affect model performance.
4. Data Quantity:
o Definition: Insufficient data can lead to models that do not generalize
well to new, unseen data.
5. Bias and Variance Trade-off:
o Definition: Balancing bias (error due to overly simplistic models) and
variance (error due to models being too complex) is crucial for model
performance.
6. Interpretability:
o Definition: Complex models (like deep neural networks) can be difficult
to interpret, making it hard to understand how decisions are made.
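To make the overfitting / underfitting trade-off concrete, here is a minimal sketch (assuming NumPy and scikit-learn are available; the noisy sine-wave data is made up purely for illustration) that fits polynomials of increasing degree and compares training and test error:

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Made-up noisy data: a sine wave plus Gaussian noise.
rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 1, 60)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(scale=0.2, size=60)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

for degree in (1, 4, 15):  # too simple, reasonable, too flexible
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_err = mean_squared_error(y_train, model.predict(X_train))
    test_err = mean_squared_error(y_test, model.predict(X_test))
    print(f"degree={degree:2d}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")

A low training error paired with a much higher test error signals overfitting; high error on both signals underfitting.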
Understanding the main types of machine learning, and some commonly used algorithms, helps you choose the best approach for your problem:
1. Supervised Learning:
a. Supervised learning trains a model on labeled data, where each input example is paired with the correct output, so the model learns to map inputs to outputs.
b. Examples: Classifying emails as spam or non-spam, or forecasting stock prices from historical data.
2. Unsupervised Learning:
a. Unsupervised learning trains a model on unlabelled data, and the model discovers structure on its own, such as clusters of similar points.
b. Examples: Customer segmentation in marketing, or grouping similar documents in text analysis.
3. Semi-Supervised Learning:
a. Semi-supervised learning lies between supervised and unsupervised
learning. It involves training a model on a dataset that contains both
labeled and unlabelled data.
b. The model learns from the labeled data while also leveraging the
unlabelled data to improve performance.
c. Semi-supervised learning is useful when labeled data is scarce or
expensive to obtain.
d. Examples: Training a model to classify images using a combination of
labeled and unlabelled data, where only a small portion of the images
are labeled.
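As a minimal sketch of the semi-supervised idea (assuming scikit-learn is available; the two-moons data stands in for a real dataset), label propagation below learns from only 10 labeled points out of 300, treating the rest as unlabelled:

import numpy as np
from sklearn.datasets import make_moons
from sklearn.semi_supervised import LabelPropagation

X, y_true = make_moons(n_samples=300, noise=0.1, random_state=42)

rng = np.random.RandomState(42)
y = np.full_like(y_true, -1)                # -1 marks a point as unlabelled in scikit-learn
labeled_idx = rng.choice(len(y), size=10, replace=False)
y[labeled_idx] = y_true[labeled_idx]        # keep labels for only 10 of the 300 points

model = LabelPropagation()
model.fit(X, y)                             # learns from labeled and unlabelled points together
print("accuracy on all points:", (model.predict(X) == y_true).mean())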
4. Reinforcement Learning:
a. Reinforcement learning involves training an agent to interact with an
environment and learn to make decisions in order to maximize some
notion of cumulative reward.
b. The agent learns through trial and error, receiving feedback from the
environment in the form of rewards or penalties.
c. The goal is to learn a policy that maps states of the environment to
actions that maximize the cumulative reward over time.
d. Reinforcement learning is used in various applications, including game
playing, robotics, and autonomous vehicle control.
e. Examples: Training an AI agent to play chess or Go, teaching a robot to
navigate through a maze, or optimizing resource allocation in a
manufacturing process.
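The following is a minimal sketch of tabular Q-learning on a made-up five-state corridor (not a real library environment): the agent starts in state 0 and receives a reward of +1 only when it reaches state 4.

import numpy as np

n_states, n_actions = 5, 2                  # actions: 0 = move left, 1 = move right
Q = np.zeros((n_states, n_actions))         # Q-table: estimated return for each (state, action)
alpha, gamma, epsilon = 0.1, 0.9, 0.2       # learning rate, discount factor, exploration rate

rng = np.random.RandomState(0)
for episode in range(500):
    state = 0
    while state != n_states - 1:            # an episode ends when the goal state is reached
        # Epsilon-greedy: explore occasionally, otherwise take the best-known action.
        action = rng.randint(n_actions) if rng.rand() < epsilon else int(np.argmax(Q[state]))
        next_state = max(0, state - 1) if action == 0 else min(n_states - 1, state + 1)
        reward = 1.0 if next_state == n_states - 1 else 0.0
        # Q-learning update: move Q towards reward + discounted best future value.
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state

print(Q)                                    # the learned policy should prefer "move right" everywhere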
5. Generative and Discriminative Models:
1. Generative Models:
Modeling Approach: Generative models aim to learn the joint probability
distribution of the input features and the class labels.
Learning Process: During training, generative models estimate the
probability of observing each feature given each class, as well as the prior
probability of each class.
Example: Naive Bayes classifier is a classic example of a generative
model. In email classification, Naive Bayes learns the probability
distribution of word frequencies in spam and non-spam emails. It models
how likely it is to see specific words in each type of email, as well as the
overall likelihood of an email being spam or non-spam.
Usage: Generative models can be used for both classification and
generation of new data samples.
2. Discriminative Models:
Modeling Approach: Discriminative models directly learn the decision
boundary between classes without explicitly modeling the underlying
probability distribution of the features.
Learning Process: Discriminative models learn the conditional
probability of each class given the input features directly from the training
data.
Example: Logistic Regression and Support Vector Machines (SVM) are
examples of discriminative models. In email classification, logistic
regression learns a linear decision boundary that separates spam emails
from non-spam emails based on the input features (e.g., word
frequencies).
Usage: Discriminative models are primarily used for classification tasks
and don't have the capability to generate new data samples.
Comparison:
Generative Models:
Learns joint probability distribution.
Can be used for classification and generation.
Examples: Naive Bayes, Gaussian Mixture Models (GMM).
Discriminative Models:
Learns conditional probability directly.
Used solely for classification.
Examples: Logistic Regression, Support Vector Machines (SVM).
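A minimal sketch of this contrast (assuming scikit-learn; a synthetic dataset stands in for the email example) trains a generative model and a discriminative model on the same data:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

generative = GaussianNB().fit(X_train, y_train)                            # models P(features | class) and P(class)
discriminative = LogisticRegression(max_iter=1000).fit(X_train, y_train)   # models P(class | features) directly

print("Naive Bayes accuracy:        ", generative.score(X_test, y_test))
print("Logistic regression accuracy:", discriminative.score(X_test, y_test))

Because the generative model keeps per-class feature distributions, it can also be used to sample new, hypothetical feature vectors; logistic regression has no such sampling step.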
6. Decision Tree Algorithm:
Example:
Consider a dataset containing information about customers' purchasing behavior,
including features such as age, income, and whether they purchased a product ("Yes"
or "No"). We want to predict whether a new customer will purchase the product based
on their age and income.
The Decision Tree algorithm may split the dataset based on the feature that
provides the most information gain, such as age.
It continues splitting the dataset into subsets based on different features until it
reaches a leaf node where all instances belong to the same class (e.g., all
customers above a certain age range purchased the product).
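The following minimal sketch (assuming scikit-learn; the age and income values are made up purely for illustration) fits a small decision tree to data like that in the example and prints the learned splits:

from sklearn.tree import DecisionTreeClassifier, export_text

customers = [[22, 25000], [25, 32000], [47, 54000], [52, 61000],
             [46, 48000], [56, 75000], [23, 28000], [60, 83000]]    # [age, income]
purchased = [0, 0, 1, 1, 1, 1, 0, 1]                                # 1 = "Yes", 0 = "No"

# criterion="entropy" makes the splits follow information gain, as described above.
tree = DecisionTreeClassifier(criterion="entropy", max_depth=2, random_state=0)
tree.fit(customers, purchased)

print(export_text(tree, feature_names=["age", "income"]))            # the learned decision rules
print(tree.predict([[30, 40000]]))                                   # prediction for a new customer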
8. K-Nearest Neighbour (KNN):
1. Training Phase: KNN stores all available data points and their corresponding
labels.
2. Prediction Phase:
o Given a new, unseen data point, the algorithm calculates the distance
between this point and all other points in the training dataset. Common
distance metrics include Euclidean distance, Manhattan distance, or
Minkowski distance.
o It then identifies the K nearest data points (neighbors) to the new point
based on the calculated distances.
o For classification tasks, KNN takes a majority vote among the labels of the
K nearest neighbors and assigns the most frequent class label to the new
data point.
o For regression tasks, KNN takes the average of the labels of the K nearest
neighbors and assigns it as the predicted value for the new data point.
3. Choosing K: The choice of K (the number of neighbors) is a hyperparameter
that needs to be specified before applying the algorithm. It can significantly
affect the performance of the model. A small K value may lead to noise
sensitivity and overfitting, while a large K value may lead to oversmoothing and
underfitting.
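A minimal sketch of KNN classification (assuming scikit-learn; the Iris dataset stands in for a real problem) shows how the choice of K affects accuracy:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for k in (1, 5, 25):                        # small K: noise-sensitive; large K: oversmoothed
    knn = KNeighborsClassifier(n_neighbors=k, metric="euclidean")
    knn.fit(X_train, y_train)               # "training" simply stores the data points
    print(f"K={k:2d}  test accuracy={knn.score(X_test, y_test):.3f}")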
UNIT-2
1. Bagging and Boosting:
Bagging: Trains several models independently on bootstrap samples (random samples drawn with replacement) of the training data and combines their predictions by voting or averaging, which mainly reduces variance.
Boosting: Trains models sequentially, with each new model giving more weight to the examples the previous models misclassified, which mainly reduces bias; the final prediction combines all the models.
2. AdaBoost Algorithm:
Example:
Suppose we have a dataset of emails labeled as spam or not spam.
AdaBoost trains multiple weak learners, such as decision stumps, on the
data. It iteratively adjusts the instance weights, giving more importance to
misclassified emails in each iteration. Finally, it combines the predictions
of all weak learners to classify new emails as spam or not spam.
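A minimal sketch of AdaBoost (assuming scikit-learn; a synthetic dataset stands in for the spam / not-spam emails) follows; scikit-learn's default weak learner is a depth-1 decision tree, i.e. a decision stump:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each boosting round re-weights the training samples so that misclassified
# ones count more for the next weak learner; the final prediction is a weighted vote.
model = AdaBoostClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))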
3. SVM (Support Vector Machines): SVMs are best understood by contrast with the simple perceptron, whose main limitations are listed below (a sketch comparing the two follows the list):
1. Linear Separation Only: Perceptrons can only handle problems where the
data can be separated by a straight line or plane. If the data is more
complicated, perceptrons won't work well.
2. Binary Output: Perceptrons give only yes or no answers. They can't tell you
probabilities or give you a range of outputs like other models can.
3. Shallow Structure: Perceptrons are like a single layer of neurons. They can't
understand complex patterns that need many layers to figure out.
4. Sensitive to Scaling: If your data has different scales (like weight and height),
perceptrons might get confused. They need all the input data to be in a similar
range.
5. Limited Complexity: Because they're simple, perceptrons can't handle
complicated tasks. They struggle with things like recognizing faces or
understanding language.
6. Noise Sensitivity: If your data has mistakes or outliers, perceptrons might
learn the wrong thing. They're not good at handling messy data.
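The sketch below (assuming scikit-learn; the two-moons data is made up for illustration) contrasts a linear perceptron with an SVM using an RBF kernel on data that is not linearly separable:

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Perceptron
from sklearn.svm import SVC

X, y = make_moons(n_samples=500, noise=0.2, random_state=0)   # a non-linear class boundary
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

linear_model = Perceptron().fit(X_train, y_train)             # can only draw a straight line
kernel_svm = SVC(kernel="rbf").fit(X_train, y_train)          # kernel trick allows a curved boundary

print("Perceptron accuracy:     ", linear_model.score(X_test, y_test))
print("RBF-kernel SVM accuracy: ", kernel_svm.score(X_test, y_test))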