The document outlines key concepts in machine learning, including the machine learning pipeline, ensemble learning, and comparisons between EM and K-Means clustering. It also discusses Hidden Markov Models for sequence data and the backpropagation process in deep learning. Each section highlights important techniques, challenges, and applications relevant to the field.
Answer Key - MCS1009 Machine Learning
PART C (1 x 15 = 15 Marks)
QC101 (a) (15 Marks):
Machine Learning Pipeline:
1. Problem Formulation: Define the objective (e.g., spam detection).
2. Data Collection: Gather structured/unstructured data.
3. Exploratory Data Analysis: Understand distributions and outliers.
4. Feature Engineering: Normalize, select, or extract features.
5. Model Selection: Choose algorithms (SVM, Random Forest, etc.).
6. Training and Validation: Use techniques like k-fold cross-validation.
7. Performance Metrics: Accuracy, F1-score, AUC-ROC, etc.
8. Deployment: APIs, cloud integration, or edge deployment.
Challenges:
- Data Imbalance: Use SMOTE or resampling.
- Noisy Data: Use filtering and robust models.
- Interpretability: Use SHAP or LIME for explanations.
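Step 6 (k-fold cross-validation) can be sketched in plain Python. The "model" here is a deliberately trivial majority-class baseline, an assumption made only to keep the example self-contained; any real classifier would slot into the same loop:

```python
from collections import Counter

def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds."""
    fold_size = n // k
    return [list(range(i * fold_size, (i + 1) * fold_size if i < k - 1 else n))
            for i in range(k)]

def majority_class(labels):
    """Trivial 'model': always predict the most common training label."""
    return Counter(labels).most_common(1)[0][0]

def cross_validate(labels, k=5):
    """Average accuracy of the majority-class baseline over k folds."""
    accuracies = []
    for test_idx in k_fold_indices(len(labels), k):
        held_out = set(test_idx)
        train = [labels[i] for i in range(len(labels)) if i not in held_out]
        prediction = majority_class(train)  # 'train' on the other k-1 folds
        correct = sum(1 for i in test_idx if labels[i] == prediction)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)

# 12 ham (0) and 8 spam (1) labels, unshuffled: the sorted folds show
# why imbalance and missing shuffling/stratification hurt the estimate.
labels = [0] * 12 + [1] * 8
print(cross_validate(labels, k=5))
```

Because the labels are sorted, the last folds contain only the minority class and score zero, which ties this example back to the Data Imbalance challenge above.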
QC201 (b) (15 Marks):
Ensemble Learning:
- Combines multiple models to produce better predictive performance.
- Reduces overfitting and increases robustness.
Types:
• Bagging: Random Forest
• Boosting: AdaBoost, XGBoost
Principle: Different models learn different patterns/errors, and combining them reduces variance or bias.
Applications: Fraud detection, recommendation systems.
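The bagging principle can be sketched in plain Python: each base model is trained on a bootstrap resample (sampling with replacement), and predictions are combined by majority vote. The base learner here is a hypothetical 1-D threshold stump, chosen only to keep the sketch short:

```python
import random

def train_stump(xs, ys):
    """Fit a 1-D stump: threshold at the midpoint of the two class means."""
    mean0 = sum(x for x, y in zip(xs, ys) if y == 0) / max(1, ys.count(0))
    mean1 = sum(x for x, y in zip(xs, ys) if y == 1) / max(1, ys.count(1))
    threshold = (mean0 + mean1) / 2
    return lambda x: 1 if x >= threshold else 0

def bagging_ensemble(xs, ys, n_models=25, seed=0):
    """Train each stump on a bootstrap resample; vote at prediction time."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(train_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    def predict(x):
        votes = sum(m(x) for m in models)  # majority vote over all stumps
        return 1 if votes > n_models / 2 else 0
    return predict

# Two separable 1-D clusters: class 0 near 1.0, class 1 near 4.0.
xs = [0.8, 1.0, 1.2, 0.9, 3.8, 4.0, 4.2, 4.1]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
predict = bagging_ensemble(xs, ys)
print(predict(1.1), predict(3.9))
```

Each resample sees slightly different data, so individual stumps disagree near the boundary; the vote averages those disagreements away, which is exactly the variance reduction the answer describes.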
QC301 (a) (15 Marks):
EM vs. K-Means:
- EM uses probabilistic assignment (soft clustering).
- K-Means uses hard assignments.
- EM models data as Gaussian Mixtures.
- K-Means minimizes Euclidean distances.
Application: EM is used in voice recognition and image segmentation, where overlapping clusters are common.
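The soft-versus-hard distinction is visible in a single assignment step. A minimal 1-D sketch: K-Means picks one winning centroid, while the E-step of EM computes a responsibility (posterior probability) for every Gaussian component:

```python
import math

def hard_assign(x, centroids):
    """K-Means style: assign x to the single nearest centroid."""
    return min(range(len(centroids)), key=lambda k: (x - centroids[k]) ** 2)

def soft_assign(x, means, variances, weights):
    """EM E-step style: responsibility of each Gaussian component for x."""
    densities = [
        w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
        for m, v, w in zip(means, variances, weights)
    ]
    total = sum(densities)
    return [d / total for d in densities]  # normalize to probabilities

# A point between two 1-D clusters centred at 0 and 3.
x = 1.2
print(hard_assign(x, centroids=[0.0, 3.0]))
print([round(r, 2) for r in soft_assign(x, [0.0, 3.0], [1.0, 1.0], [0.5, 0.5])])
```

K-Means commits the point entirely to cluster 0, while EM keeps roughly a 71/29 split of responsibility, which is why EM copes better when clusters overlap.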
QC401 (b) (15 Marks):
Hidden Markov Models (HMMs):
- Used for sequence data like speech and text.
Components:
• States: Hidden variables
• Observations: Emitted symbols
• Transition Probabilities
• Emission Probabilities
Modeling:
- HMMs estimate the probability of a sequence based on state transitions and emissions.
Use Cases: POS tagging, speech recognition, bioinformatics.
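Estimating the probability of an observation sequence from the four components above is done with the forward algorithm. A minimal sketch, using a hypothetical two-state weather HMM (the states, symbols, and probabilities are illustrative, not from the source):

```python
def forward_probability(observations, states, start_p, trans_p, emit_p):
    """Forward algorithm: P(observation sequence) under an HMM."""
    # Initialize: start probability times emission of the first symbol.
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    for obs in observations[1:]:
        # Recurse: sum over all previous states, then emit the next symbol.
        alpha = {
            s: sum(alpha[prev] * trans_p[prev][s] for prev in states) * emit_p[s][obs]
            for s in states
        }
    return sum(alpha.values())  # marginalize out the final hidden state

states = ['Rainy', 'Sunny']
start_p = {'Rainy': 0.6, 'Sunny': 0.4}
trans_p = {'Rainy': {'Rainy': 0.7, 'Sunny': 0.3},
           'Sunny': {'Rainy': 0.4, 'Sunny': 0.6}}
emit_p = {'Rainy': {'walk': 0.1, 'shop': 0.9},
          'Sunny': {'walk': 0.6, 'shop': 0.4}}
print(round(forward_probability(['walk', 'shop'], states, start_p, trans_p, emit_p), 4))
```

The same recursion with a max in place of the sum gives the Viterbi algorithm used for decoding in POS tagging.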
QC501 (b) (15 Marks):
Backpropagation in Deep Learning:
- Calculates the gradient of the loss function with respect to the weights.
- Uses the chain rule to propagate error backward.
- Updates weights using gradient descent.
Importance:
• Enables multi-layer training.
• Improves accuracy over epochs.
• Is the basis for training deep networks like CNNs and RNNs.
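The three bullets above can be sketched end to end for the smallest possible deep network: a 1-1-1 sigmoid network with two weights and no biases (an assumption to keep the chain rule readable), trained on a single example with squared-error loss:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(x, target, w1, w2, lr=0.5):
    """One forward + backward pass for a 1-1-1 sigmoid network."""
    # Forward pass.
    h = sigmoid(w1 * x)              # hidden activation
    y = sigmoid(w2 * h)              # output
    loss = 0.5 * (y - target) ** 2
    # Backward pass: chain rule, layer by layer.
    dloss_dy = y - target
    dy_dz2 = y * (1 - y)             # sigmoid derivative at the output
    grad_w2 = dloss_dy * dy_dz2 * h
    dh = dloss_dy * dy_dz2 * w2      # error propagated back to the hidden unit
    grad_w1 = dh * h * (1 - h) * x
    # Gradient descent update.
    return w1 - lr * grad_w1, w2 - lr * grad_w2, loss

# Repeated updates should drive the loss down on a fixed example.
w1, w2 = 0.5, -0.3
first_loss = None
for step in range(200):
    w1, w2, loss = train_step(x=1.0, target=1.0, w1=w1, w2=w2)
    if first_loss is None:
        first_loss = loss
print(loss < first_loss)
```

The `grad_w1` line is the whole point: the output error reaches the first layer only by multiplying derivatives along the path, which is exactly the chain rule the answer names.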