CENG3300 Lecture 10

This document provides an overview of machine learning techniques relevant to molecular engineering, including clustering, dimension reduction, and reinforcement learning. It discusses k-means clustering and how it is used to group similar chemical compounds or protein sequences. Principal component analysis (PCA) is introduced as an unsupervised technique that projects high-dimensional data into a lower-dimensional space while maximizing variance. Dimension reduction techniques like PCA and t-SNE are used for data visualization, noise removal, and preprocessing for machine learning models. Autoencoders and generative adversarial networks are also discussed in the context of dimension reduction. Reinforcement learning is defined as a method for self-learning complex decisions through simulated experiences and has seen success in chemistry applications.


Data Science for Molecular Engineering
Lecture 10
ILOs (Intended Learning Outcomes)
• Understand the concept and methods of clustering;
• Understand the motivation and methods for dimension reduction;
• Understand the elements of reinforcement learning;
Clustering
• Goal: partition unlabeled data into groups of similar data points
• When and why?
• Data organization (e.g. for easier search)
• Understand the underlying structure of data
• Preprocessing for further analysis
Applications of clustering
• Cluster chemical compounds for better organization
• Cluster protein sequences by function, or genes according to expression profile.
K-means clustering
• Partition the data around k centers

• Given a set of input points and a distance measure, find the positions of the k centers that minimize the sum of distances from each point to its nearest center (a short code sketch follows).
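
A minimal k-means sketch in Python with scikit-learn (an assumed library, not specified in the lecture); the feature matrix X and the choice of k = 3 are hypothetical placeholders for illustration.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))              # hypothetical data: 100 compounds x 5 descriptors

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(kmeans.labels_[:10])                 # cluster assignment of each point
print(kmeans.inertia_)                     # sum of squared distances to the nearest center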
Similar clustering algorithms
Principal component analysis
• What is PCA?
• Unsupervised learning technique for extracting variance structure
from high dimensional datasets

PCA is an orthogonal projection (transformation) of the data into a (possibly lower-dimensional) subspace such that the variance of the projected data is maximized.
Principal component analysis

In the simplest two-dimensional example, PCA rotates the axes of the coordinate system so that almost all of the variance lies along one new axis, and only that one feature needs to be kept.
Principal component analysis

PCA can be applied to higher-dimensional data as well


Principal component analysis
• Principal components (PC) are orthogonal directions that capture the
most variance in the data
• The first principal component is the direction of the greatest data variability
• The second PC is the direction orthogonal to the first PC with the greatest remaining variability
• And so on… (see the code sketch below)
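
A minimal PCA sketch with scikit-learn; the feature matrix X is a hypothetical placeholder, and keeping two components is an illustrative choice, not a value from the lecture.

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))             # hypothetical data: 200 samples x 50 features

pca = PCA(n_components=2)                  # keep the first two principal components
X_2d = pca.fit_transform(X)                # project the data onto PC1 and PC2

print(pca.explained_variance_ratio_)       # fraction of variance captured by each PC
print(X_2d.shape)                          # (200, 2)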
What is PCA used for?
• Visualization
• Noise removal
• Feature transformation – lower dimension can lead to better
generalizability
• Preprocessing for machine learning models
t-distributed stochastic neighbor embedding (t-SNE)
• Non-linear dimension reduction technique
• Project high dimensional data onto a 2-D or 3-D space, where similar
points are modeled by similar lower dimensional points with high
probability
• One of the most popular methods for data visualization, especially for high-dimensional data (a short sketch follows)
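
A minimal t-SNE sketch with scikit-learn; the dataset X and the perplexity value are hypothetical placeholders chosen only for illustration.

import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 30))             # hypothetical data: 300 samples x 30 features

tsne = TSNE(n_components=2, perplexity=30, random_state=0)
X_embedded = tsne.fit_transform(X)         # 2-D embedding suitable for plotting
print(X_embedded.shape)                    # (300, 2)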
t-SNE examples
Autoencoders
Dimension reduction techniques such as PCA and t-SNE transform the original features into new (latent) features, and can therefore be viewed as encoders.
Autoencoders add a decoder that recovers the original input from the latent features (a minimal sketch follows below).
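
A minimal autoencoder sketch in PyTorch (the framework, layer sizes, and 50-feature input are assumptions, not details from the lecture). The encoder compresses the input to a 2-D latent code; the decoder reconstructs the input from that code, and the reconstruction error drives training.

import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, n_features=50, n_latent=2):
        super().__init__()
        # encoder: original features -> latent features
        self.encoder = nn.Sequential(
            nn.Linear(n_features, 16), nn.ReLU(),
            nn.Linear(16, n_latent),
        )
        # decoder: latent features -> reconstruction of the original input
        self.decoder = nn.Sequential(
            nn.Linear(n_latent, 16), nn.ReLU(),
            nn.Linear(16, n_features),
        )

    def forward(self, x):
        z = self.encoder(x)                # latent (compressed) representation
        return self.decoder(z)             # reconstructed input

model = Autoencoder()
x = torch.randn(8, 50)                     # a hypothetical batch of 8 samples
loss = nn.MSELoss()(model(x), x)           # reconstruction error to minimize
loss.backward()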
Generative adversarial network (GAN)
Examples of AE and GAN
Reinforcement learning
• A method for self-learning to make complex decisions through numerous simulated experiences (a minimal Q-learning sketch follows)
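
A minimal tabular Q-learning sketch to illustrate the basic elements of RL: states, actions, rewards, and a policy learned from many simulated experiences. The toy chain environment and all parameter values are hypothetical, chosen only for illustration.

import numpy as np

n_states, n_actions = 5, 2                 # toy environment: a short 1-D chain
Q = np.zeros((n_states, n_actions))        # action-value estimates; the policy follows from Q
alpha, gamma, epsilon = 0.1, 0.9, 0.1      # learning rate, discount factor, exploration rate

def step(state, action):
    """Toy dynamics: action 1 moves right; reaching the last state gives reward +1."""
    next_state = min(state + action, n_states - 1)
    reward = 1.0 if next_state == n_states - 1 else 0.0
    return next_state, reward

rng = np.random.default_rng(0)
for episode in range(500):                 # many simulated experiences
    state = 0
    while state != n_states - 1:
        # epsilon-greedy policy: explore occasionally, otherwise act greedily
        if rng.random() < epsilon:
            action = int(rng.integers(n_actions))
        else:
            action = int(Q[state].argmax())
        next_state, reward = step(state, action)
        # Q-learning update: move toward reward + discounted best future value
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state

print(Q)                                   # learned action values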
Striking successes have been achieved using RL
Similar concepts have been brought into chemistry
