
NIT ML SUGG (SUVECCHHA PAUL)

1. Eigenvalues and Eigenvectors (in Machine Learning)

• Eigenvectors represent the directions in which data varies the most.
• Eigenvalues show how much the data varies along those directions.
• In PCA (Principal Component Analysis), eigenvectors define the new axes, and eigenvalues tell us how much information (variance) each axis captures (see the sketch below).
• Why important: they help reduce data dimensions and focus on the most useful features.
• Keywords: direction, variance, PCA, dimensionality reduction, important features.
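A minimal NumPy sketch of this idea (the data values are invented for illustration): it centers a tiny 2-D dataset, builds its covariance matrix, and reads off the eigenvalues and eigenvectors that PCA would use as its new axes.

```python
import numpy as np

# toy 2-D data (illustrative values only)
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
              [1.9, 2.2], [3.1, 3.0], [2.3, 2.7]])

X_centered = X - X.mean(axis=0)          # center the data first
cov = np.cov(X_centered, rowvar=False)   # 2x2 covariance matrix

# eigh is used because the covariance matrix is symmetric
eigenvalues, eigenvectors = np.linalg.eigh(cov)

print("eigenvalues :", eigenvalues)                  # variance along each new axis
print("eigenvectors:\n", eigenvectors)               # the directions (columns)
print("variance captured:", eigenvalues / eigenvalues.sum())
```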

2. Decision Boundaries (in Classification)

• A decision boundary is a line or surface that separates different classes.
• It is formed by models like SVM, Logistic Regression, or the Perceptron.
• For example, it separates "spam" from "not spam" based on features.
• It helps the model decide the class of new data points (see the sketch below).
• Keywords: boundary, separation, classifier, line, margin.
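A small sketch, assuming scikit-learn is available and using made-up 2-D points: a fitted Logistic Regression gives a linear decision boundary where w1·x1 + w2·x2 + b = 0, and new points are classified by which side of that line they fall on.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# toy 2-D data: one low cluster (class 0) and one high cluster (class 1)
X = np.array([[1, 1], [2, 1], [1, 2], [6, 5], [7, 6], [6, 7]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression().fit(X, y)
w1, w2 = clf.coef_[0]
b = clf.intercept_[0]

# the points where w1*x1 + w2*x2 + b = 0 form the decision boundary (a line)
print(f"boundary: {w1:.2f}*x1 + {w2:.2f}*x2 + {b:.2f} = 0")
print(clf.predict([[2, 2], [6, 6]]))   # which side each new point falls on
```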


3. Logistic vs Linear Regression

• Linear Regression is used for predicting numbers (e.g., salary, temperature).
• Logistic Regression is used for predicting categories (e.g., yes/no, pass/fail).
• Logistic regression uses a sigmoid function to give results between 0 and 1 (as probabilities).
• When to use:
o Linear: for continuous values.
o Logistic: for classification problems (see the sketch below).
• Keywords: classification, regression, sigmoid, prediction, probability.
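A hedged side-by-side sketch with scikit-learn (the numbers are invented for illustration): linear regression outputs a number, while logistic regression outputs a probability between 0 and 1 via the sigmoid.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Linear regression: predict a number (e.g., salary from years of experience)
years = np.array([[1], [2], [3], [4], [5]])
salary = np.array([30, 35, 41, 47, 52])          # illustrative values
lin = LinearRegression().fit(years, salary)
print("predicted salary for 6 years:", lin.predict([[6]]))

# Logistic regression: predict a category (e.g., pass/fail from hours studied)
hours = np.array([[1], [2], [3], [4], [5], [6]])
passed = np.array([0, 0, 0, 1, 1, 1])
log = LogisticRegression().fit(hours, passed)
print("P(pass | 3.5 hours):", log.predict_proba([[3.5]])[0, 1])  # sigmoid output in (0, 1)
print("predicted class:", log.predict([[3.5]]))
```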

4. Principal Component Analysis (PCA)

• A technique to reduce the number of features while keeping the important information.
• It finds new axes (principal components) using eigenvalues and eigenvectors.
• Reduces complexity, increases speed, and helps avoid overfitting.
• Used in image processing, text data, etc. (see the sketch below).
• Keywords: feature reduction, dimensionality reduction, variance, eigenvalues, eigenvectors.
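A minimal scikit-learn sketch on random demo data: PCA reduces 5 features to 2 and reports how much variance each new axis keeps (the explained variance ratio comes from the eigenvalues).

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
X = rng.normal(size=(100, 5))            # 100 samples, 5 features (random demo data)

pca = PCA(n_components=2)                # keep only 2 principal components
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                   # (100, 2) - fewer features, same samples
print(pca.explained_variance_ratio_)     # share of variance kept by each component
```
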
5. Perceptron Update Rule

• A learning rule that updates the model weights whenever a prediction is wrong.
• Formula: w = w + learning_rate × (true − predicted) × input
• It helps draw the best possible decision boundary.
• The update is repeated until a line that separates the classes correctly is found (this works when the data is linearly separable); see the sketch below.
• Keywords: weights, learning, prediction, update, decision boundary.
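A short NumPy sketch of the rule above, on a tiny made-up AND-style dataset: whenever a prediction is wrong, the weights (and, by the same rule, the bias) are nudged in the direction of the error.

```python
import numpy as np

# tiny linearly separable dataset (logical AND), labels 0/1
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

w = np.zeros(2)          # weights
b = 0.0                  # bias term
lr = 0.1                 # learning rate

for epoch in range(20):
    for xi, target in zip(X, y):
        predicted = 1 if np.dot(w, xi) + b > 0 else 0
        error = target - predicted
        w += lr * error * xi       # w = w + lr * (true - predicted) * input
        b += lr * error            # same rule applied to the bias

print("weights:", w, "bias:", b)
print("predictions:", [(1 if np.dot(w, xi) + b > 0 else 0) for xi in X])
```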

6. Measuring Cluster Quality

• Silhouette Score: measures how well a point fits its own cluster versus the other clusters.
• Inertia: sum of squared distances of points to their cluster center (lower is better).
• The Dunn Index and Davies-Bouldin Index are other measures.
• Good clustering = similar points grouped together, far from other clusters (see the sketch below).
• Keywords: cluster quality, distance, similarity, evaluation.
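A small scikit-learn sketch on synthetic blob data: K-Means is fitted and the two measures named above, silhouette score and inertia, are printed.

```python
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# synthetic data with 3 well-separated blobs (demo only)
X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

km = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)

print("silhouette score:", silhouette_score(X, km.labels_))  # closer to 1 is better
print("inertia:", km.inertia_)                               # lower is better
```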

7. KNN Advantages & Disadvantages

• ✅ Simple and easy to implement.
• ✅ No training phase (a "lazy learner").
• ❌ Slow at prediction time with large datasets.
• ❌ Affected by irrelevant features and noise.
• Works well with properly scaled and cleaned data.
• Keywords: neighbors, voting, distance, simplicity, sensitive.


8. Maximum Margin Classification (SVM)

• SVM tries to draw a line (or plane) that maximizes the gap (margin) between classes.
• A larger margin = better generalization and accuracy.
• Support Vectors: the data points closest to the decision boundary; they lie on the margin and define it.
• This helps the model work well even on new, unseen data (see the sketch below).
• Keywords: SVM, margin, support vectors, separation.
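A brief scikit-learn sketch on made-up 2-D points: a linear SVM is fitted and its support vectors, the points that define the margin, are printed.

```python
import numpy as np
from sklearn.svm import SVC

# toy 2-D data: two separable groups (illustrative values)
X = np.array([[1, 1], [2, 1], [1, 2], [5, 5], [6, 5], [5, 6]])
y = np.array([0, 0, 0, 1, 1, 1])

# a large C keeps the margin "hard", so the maximum-margin idea is easy to see
svm = SVC(kernel="linear", C=1e3).fit(X, y)

print("support vectors:\n", svm.support_vectors_)   # points closest to the boundary
print("prediction for [3, 3]:", svm.predict([[3, 3]]))
```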


9. KNN (Working)

• For classification: looks at the 'K' nearest neighbors and predicts the majority class among them.
• For regression: takes the average value of the 'K' nearest neighbors.
• Depends on a good distance measure (e.g., Euclidean distance).
• No training needed; the model simply memorizes the data (see the sketch below).
• Keywords: K neighbors, vote, average, distance, prediction.
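A minimal scikit-learn sketch with toy numbers showing both uses: majority vote for classification and averaging for regression.

```python
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor

X = [[1], [2], [3], [10], [11], [12]]      # one feature, toy values

# classification: the 3 nearest neighbors vote on the class
y_class = [0, 0, 0, 1, 1, 1]
knn_c = KNeighborsClassifier(n_neighbors=3).fit(X, y_class)
print(knn_c.predict([[2.5], [10.5]]))      # -> [0 1]

# regression: the values of the 3 nearest neighbors are averaged
y_reg = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
knn_r = KNeighborsRegressor(n_neighbors=3).fit(X, y_reg)
print(knn_r.predict([[2.5], [10.5]]))      # -> [2. 11.]
```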

10. Bagging vs Boosting

• Bagging (Bootstrap Aggregating):
o Builds many independent models in parallel and averages them (like Random Forest).
o Reduces variance.
• Boosting:
o Builds models one by one, each correcting the previous errors (like AdaBoost).
o Reduces bias (see the comparison sketch below).
• Keywords: ensemble, bagging = parallel, boosting = sequential, Random Forest, AdaBoost.
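A hedged comparison sketch with scikit-learn on a synthetic dataset: Random Forest (a bagging-style ensemble of parallel trees) next to AdaBoost (a boosting ensemble of sequential weak learners).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Bagging: many independent trees trained in parallel, predictions averaged
bagging = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

# Boosting: weak learners added one by one, each focusing on the previous errors
boosting = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

print("Random Forest accuracy:", bagging.score(X_te, y_te))
print("AdaBoost accuracy     :", boosting.score(X_te, y_te))
```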

11. Handling Outliers & Missing Values

• Outliers:
o Can be removed or replaced (e.g., with the mean/median).
o Use the Z-score or the IQR rule to detect them.
• Missing values:
o Fill them with the mean/median/mode (imputation).
o Or remove rows/columns with too many missing values (see the sketch below).
• Keywords: clean data, missing data, imputation, outlier removal.
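A small pandas sketch on an invented toy column: IQR-based outlier detection followed by median imputation of the outlier and the missing value.

```python
import pandas as pd
import numpy as np

# toy column with one obvious outlier and one missing value
df = pd.DataFrame({"age": [22, 25, 24, 23, 26, 120, np.nan, 24]})

# detect outliers with the IQR rule
q1, q3 = df["age"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = (df["age"] < q1 - 1.5 * iqr) | (df["age"] > q3 + 1.5 * iqr)
print("outlier rows:\n", df[outliers])

# replace outliers with the median, then impute missing values with the median
median = df.loc[~outliers, "age"].median()
df.loc[outliers, "age"] = median
df["age"] = df["age"].fillna(median)
print(df)
```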

12. Logistic Regression Applications

• Predicts the probability that a sample belongs to a class.
• Used for binary classification: disease/no disease, spam/not spam.
• The output is sigmoid shaped (between 0 and 1).
• Better suited than linear regression for classification tasks.
• Keywords: probability, classification, sigmoid, logistic model.

13. DBSCAN Clustering

• Density-Based Spatial Clustering of Applications with Noise.
• Groups data based on density (points packed close together form a cluster).
• Detects outliers naturally (points in low-density regions are labeled as noise).
• Better than K-Means for irregularly shaped clusters and noisy data (see the sketch below).
• Keywords: density, clustering, core points, noise, outliers.
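A minimal scikit-learn sketch on synthetic "two moons" data, a non-convex shape K-Means handles poorly: DBSCAN groups points by density and labels sparse points as noise (label -1). The eps and min_samples values here are illustrative choices, not tuned settings.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.cluster import DBSCAN

# two interleaving half-circles - a shape K-Means struggles with
X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)

db = DBSCAN(eps=0.2, min_samples=5).fit(X)

labels = db.labels_                      # cluster ids; -1 means "noise"/outlier
print("clusters found:", len(set(labels)) - (1 if -1 in labels else 0))
print("noise points  :", np.sum(labels == -1))
```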

14. Hierarchical Divisive Clustering

• Start with all the data as one cluster.
• Divide it recursively until small groups (or single points) are formed.
• Builds a tree-like diagram (dendrogram).
• The opposite of agglomerative (bottom-up) clustering: divisive works top-down (see the sketch below).
• Keywords: top-down, split, tree, hierarchical clustering.
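Scikit-learn does not ship a ready-made divisive hierarchical routine, so this is only a rough top-down sketch: it repeatedly bisects the current cluster with 2-means (a "bisecting" style split). The `bisect` helper and the depth limit are my own illustrative choices, not part of the original notes.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

def bisect(points, depth=0, max_depth=2):
    """Recursively split a cluster in two (top-down / divisive)."""
    if depth == max_depth or len(points) < 4:
        print("  " * depth, f"leaf cluster with {len(points)} points")
        return
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(points)
    print("  " * depth, f"split {len(points)} points into two groups")
    bisect(points[labels == 0], depth + 1, max_depth)
    bisect(points[labels == 1], depth + 1, max_depth)

X, _ = make_blobs(n_samples=200, centers=4, random_state=7)
bisect(X)   # start with everything in one cluster and divide recursively
```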


15. Spam Detection (Supervised Learning)

• Collect labeled data: spam or not spam.
• Extract features such as keywords, links, and sender info.
• Use classification models like Logistic Regression or Naive Bayes.
• Train the model on past data and test it on new emails (see the sketch below).
• Keywords: supervised learning, labels, features, classification.
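A toy end-to-end sketch with scikit-learn: the example messages are invented, the features are simple bag-of-words counts, and the classifier is Naive Bayes as suggested above.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# tiny labeled dataset (messages invented for illustration)
messages = ["win a free prize now", "limited offer click here",
            "meeting at 10 tomorrow", "please review the attached report",
            "free money guaranteed", "lunch with the team today"]
labels = [1, 1, 0, 0, 1, 0]              # 1 = spam, 0 = not spam

# bag-of-words features + Naive Bayes classifier in one pipeline
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(messages, labels)

print(model.predict(["free prize offer", "see you at the meeting"]))  # expected: [1 0]
```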

16. Reinforcement Learning

• Learning by trial and error.
• An agent interacts with an environment and receives rewards or penalties.
• It learns which actions give the best long-term reward.
• Used in games (e.g., Chess), robotics, and self-driving cars (see the sketch below).
• Keywords: agent, action, reward, environment, learning loop.
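A tiny tabular Q-learning sketch used as an illustration of the learning loop (the environment, a 1-D line of 5 states with a goal on the right, is invented here): the agent tries actions, receives a reward at the goal, and its Q-table gradually encodes which action gives the best long-term reward in each state.

```python
import numpy as np

n_states, n_actions = 5, 2      # states 0..4, actions: 0 = left, 1 = right
goal = 4                        # reaching state 4 gives a reward of +1
alpha, gamma = 0.5, 0.9         # learning rate and discount factor
Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(0)

for episode in range(200):
    state = 0
    for _ in range(100):
        # explore with random actions; Q-learning is off-policy, so the
        # greedy (best-action) policy is still learned from this experience
        action = int(rng.integers(n_actions))
        next_state = min(state + 1, goal) if action == 1 else max(state - 1, 0)
        reward = 1.0 if next_state == goal else 0.0
        # Q-learning update: nudge Q towards reward + discounted future value
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state
        if state == goal:
            break

print(Q.argmax(axis=1)[:goal])  # best action for states 0..3; expected: all 1 (move right)
```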


17. Overfitting and Underfitting

• Overfitting: the model memorizes the training data and fails on new data.
• Underfitting: the model is too simple and cannot learn enough from the data.
• Solutions:
o Use a simpler model (to fix overfitting) or a more complex model (to fix underfitting).
o Add more training data.
o Use regularization (see the sketch below).
• Keywords: generalization, performance, model fit, complexity.
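A hedged sketch using polynomial regression on noisy made-up data: a degree-1 model typically underfits, a moderate degree fits well, and a very high degree typically overfits (high train score but a lower test score).

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(60, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=60)   # noisy sine curve

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for degree in (1, 4, 15):          # underfit, reasonable fit, overfit
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_tr, y_tr)
    print(f"degree {degree:2d}: train R2 = {model.score(X_tr, y_tr):.2f}, "
          f"test R2 = {model.score(X_te, y_te):.2f}")
```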

18. Regularization in Regression

• Adds a penalty for large coefficients to reduce overfitting.
• Two types:
o L1 (Lasso): can drive some weights exactly to zero (acts as feature selection).
o L2 (Ridge): shrinks all weights towards zero but does not make them exactly zero.
• Helps the model stay simple and generalize better (see the sketch below).
• Keywords: penalty, L1, L2, overfitting, shrink weights.
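A short scikit-learn sketch on synthetic data with many irrelevant features: Lasso (L1) zeroes out many coefficients while Ridge (L2) only shrinks them. The alpha values are illustrative, not tuned.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, Lasso

# 50 features, but only 5 actually influence the target
X, y = make_regression(n_samples=100, n_features=50, n_informative=5,
                       noise=10, random_state=0)

ridge = Ridge(alpha=1.0).fit(X, y)   # L2: shrinks all weights, none exactly zero
lasso = Lasso(alpha=1.0).fit(X, y)   # L1: drives many weights exactly to zero

print("Ridge zero coefficients:", np.sum(ridge.coef_ == 0))
print("Lasso zero coefficients:", np.sum(lasso.coef_ == 0))
```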
