air-quality-randomforest

Random Forest

Uploaded by

nijir70713


# This Python 3 environment comes with many helpful analytics libraries installed

# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python

# For example, here's several helpful packages to load

import numpy as np # linear algebra

import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory

# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

/kaggle/input/air-quality-and-pollution-assessment/pollution_dataset.csv

Libraries
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
import matplotlib.pyplot as plt
import seaborn as sns

Data Upload
df=pd.read_csv('/kaggle/input/air-quality-and-pollution-assessment/pollution_dataset.csv').dropna()

df.head(3)

   Temperature  Humidity  PM2.5  PM10   NO2   SO2    CO  Proximity_to_Industrial_Areas  Population_Density  Air Quality
0         27.2      51.7   35.1  46.2  26.7  32.2  0.98                           11.2                 314    Hazardous
1         26.3      59.3    1.0   6.2  38.3  20.4  0.68                           13.5                 298         Good
2         27.9      73.2   20.0  39.4  19.6   5.8  0.95                            5.4                 309         Good

Data Preprocessing
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5000 entries, 0 to 4999
Data columns (total 10 columns):
 #   Column                         Non-Null Count  Dtype
---  ------                         --------------  -----
 0   Temperature                    5000 non-null   float64
 1   Humidity                       5000 non-null   float64
 2   PM2.5                          5000 non-null   float64
 3   PM10                           5000 non-null   float64
 4   NO2                            5000 non-null   float64
 5   SO2                            5000 non-null   float64
 6   CO                             5000 non-null   float64
 7   Proximity_to_Industrial_Areas  5000 non-null   float64
 8   Population_Density             5000 non-null   int64
 9   Air Quality                    5000 non-null   object
dtypes: float64(8), int64(1), object(1)
memory usage: 390.8+ KB

df['Air Quality'].unique()

array(['Hazardous', 'Good', 'Poor', 'Moderate'], dtype=object)
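Besides listing the distinct labels, it can help to see how many rows fall into each class before encoding. The sketch below runs on a small hypothetical frame (`toy` and its values are invented for illustration and are not taken from pollution_dataset.csv); on the real data the same calls would be applied to `df['Air Quality']`.

```python
import pandas as pd

# Hypothetical toy labels; not taken from pollution_dataset.csv.
toy = pd.DataFrame(
    {"Air Quality": ["Good", "Good", "Moderate", "Poor", "Good", "Hazardous"]}
)

# Absolute counts and relative shares per class.
counts = toy["Air Quality"].value_counts()
shares = toy["Air Quality"].value_counts(normalize=True)

print(counts)
print(shares.round(2))
```

A strongly uneven distribution is worth knowing about before reading accuracy figures later on.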

Encoding
custom_mapping = {'Hazardous': 0, 'Poor': 1, 'Moderate': 2, 'Good': 3}
df['Air Quality'] = df['Air Quality'].map(custom_mapping)

print("Updated DataFrame:")
print(df)

Updated DataFrame:
      Temperature  Humidity  PM2.5  PM10   NO2   SO2    CO  \
0            27.2      51.7   35.1  46.2  26.7  32.2  0.98
1            26.3      59.3    1.0   6.2  38.3  20.4  0.68
2            27.9      73.2   20.0  39.4  19.6   5.8  0.95
3            23.9      51.9   14.7  24.3   5.2  12.6  1.24
4            25.2      59.0   26.3  30.9  26.8  13.5  1.06
...           ...       ...    ...   ...   ...   ...   ...
4995         29.3      36.8   80.3  90.9   9.2  14.1  0.97
4996         15.7      51.7    0.7  11.4  40.5  13.8  1.07
4997         27.8      48.1    8.9  16.4   8.6  17.7  0.54
4998         30.4      50.4    2.2  18.8  13.1  22.3  0.94
4999         21.5      76.5   45.0  58.0  37.9   0.0  0.96

      Proximity_to_Industrial_Areas  Population_Density  Air Quality
0                              11.2                 314            0
1                              13.5                 298            3
2                               5.4                 309            3
3                               4.5                 282            1
4                               5.6                 293            1
...                             ...                 ...          ...
4995                           10.2                 287            2
4996                            4.2                 320            3
4997                            0.3                 302            2
4998                            6.7                 308            2
4999                            0.2                 290            0

[5000 rows x 10 columns]

#Hazardous -> 0
#Poor -> 1
#Moderate -> 2
#Good -> 3
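One thing to watch with a dictionary-based encoding is that `Series.map` yields NaN for any label that has no key in the mapping. The sketch below reuses the notebook's `custom_mapping` on a hypothetical label series (the misspelled `'moderate'` is invented to show the failure mode) and lists the labels that were not mapped.

```python
import pandas as pd

custom_mapping = {'Hazardous': 0, 'Poor': 1, 'Moderate': 2, 'Good': 3}

# Hypothetical labels, including one misspelling to show the failure mode.
labels = pd.Series(['Good', 'Poor', 'moderate', 'Hazardous'])

encoded = labels.map(custom_mapping)

# Series.map leaves NaN wherever a label has no key in the mapping.
unmapped = labels[encoded.isna()].unique().tolist()
print(unmapped)
```

An empty list here means every label was covered by the mapping.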

df.head(3)

   Temperature  Humidity  PM2.5  PM10   NO2   SO2    CO  Proximity_to_Industrial_Areas  Population_Density  Air Quality
0         27.2      51.7   35.1  46.2  26.7  32.2  0.98                           11.2                 314            0
1         26.3      59.3    1.0   6.2  38.3  20.4  0.68                           13.5                 298            3
2         27.9      73.2   20.0  39.4  19.6   5.8  0.95                            5.4                 309            3

Train-Test
y=df["Air Quality"]
x=df.drop("Air Quality", axis=1)

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=42)
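The split above is a plain random split. As an optional variation (an assumption, not something the notebook does), `train_test_split` also accepts a `stratify` argument that keeps the class shares roughly equal in the train and test parts. The sketch uses hypothetical labels in the same 0..3 encoding; `X_demo` and `y_demo` are invented placeholders.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced labels in the same 0..3 encoding (10/20/30/40 %).
y_demo = np.repeat([0, 1, 2, 3], [100, 200, 300, 400])
X_demo = np.arange(len(y_demo)).reshape(-1, 1)  # placeholder feature

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.33, random_state=42, stratify=y_demo
)

# Class shares in the test split stay close to the overall 10/20/30/40 %.
test_shares = np.bincount(y_te) / len(y_te)
print(test_shares.round(2))
```

On the notebook's data the equivalent call would pass `stratify=y`.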

scaler = StandardScaler()

Standard Scaler
x_train_scaled = scaler.fit_transform(x_train)
x_test_scaled = scaler.transform(x_test)
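The scaler above is fitted on the training rows only and then applied to the test rows. One way to keep that order automatic is to wrap the scaler and the classifier in a scikit-learn pipeline. This is a sketch on synthetic stand-in data from `make_classification`, not the pollution dataset, and the names `X_demo`, `y_demo`, and `pipe` are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (4 classes, 9 features); not the pollution dataset.
X_demo, y_demo = make_classification(
    n_samples=400, n_features=9, n_informative=5, n_classes=4, random_state=42
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.33, random_state=42
)

# fit() learns the scaling from the training rows only, then trains the forest.
pipe = make_pipeline(StandardScaler(), RandomForestClassifier(random_state=42))
pipe.fit(X_tr, y_tr)

# score() applies the already-fitted scaler to the test rows before predicting.
test_accuracy = pipe.score(X_te, y_te)
print(f"Accuracy: {test_accuracy:.2f}")
```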

df.head(3)

   Temperature  Humidity  PM2.5  PM10   NO2   SO2    CO  Proximity_to_Industrial_Areas  Population_Density  Air Quality
0         27.2      51.7   35.1  46.2  26.7  32.2  0.98                           11.2                 314            0
1         26.3      59.3    1.0   6.2  38.3  20.4  0.68                           13.5                 298            3
2         27.9      73.2   20.0  39.4  19.6   5.8  0.95                            5.4                 309            3

Random Forest
clf = RandomForestClassifier(random_state=42)
clf.fit(x_train_scaled, y_train)

RandomForestClassifier(random_state=42)
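A fitted random forest exposes impurity-based feature importances, which can hint at which inputs the trees relied on. The sketch below uses synthetic stand-in data and hypothetical column names `f0`..`f8`; on the notebook's model the analogous call would read `clf.feature_importances_` and pair it with `x.columns`.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data; the column names f0..f8 are hypothetical.
X_demo, y_demo = make_classification(
    n_samples=300, n_features=9, n_informative=5, n_classes=4, random_state=42
)
feature_names = [f"f{i}" for i in range(X_demo.shape[1])]

forest = RandomForestClassifier(random_state=42)
forest.fit(X_demo, y_demo)

# Impurity-based importances: one non-negative value per feature, summing to 1.
importances = pd.Series(forest.feature_importances_, index=feature_names)
print(importances.sort_values(ascending=False).round(3))
```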

Predict
y_pred = clf.predict(x_test_scaled)
print(classification_report(y_test, y_pred))

              precision    recall  f1-score   support

           0       0.00      0.00      0.00       174
           1       0.18      0.04      0.07       309
           2       0.33      0.28      0.30       516
           3       0.39      0.69      0.50       651

    accuracy                           0.37      1650
   macro avg       0.23      0.25      0.22      1650
weighted avg       0.29      0.37      0.31      1650

accuracy = accuracy_score(y_test, y_pred)

print(f"Accuracy: {accuracy:.2f}")
print("\nClassification Report:\n", classification_report(y_test, y_pred))

Accuracy: 0.37

Classification Report:
              precision    recall  f1-score   support

           0       0.00      0.00      0.00       174
           1       0.18      0.04      0.07       309
           2       0.33      0.28      0.30       516
           3       0.39      0.69      0.50       651

    accuracy                           0.37      1650
   macro avg       0.23      0.25      0.22      1650
weighted avg       0.29      0.37      0.31      1650
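To put the 0.37 accuracy in context, it can be compared with a majority-class baseline. The sketch below rebuilds a label vector from the support counts printed in the report (174, 309, 516, 651) and scores a `DummyClassifier` that always predicts the most frequent class. For illustration only, the baseline is fitted on the same labels it is scored on; in the notebook it would be fitted on `y_train` and scored on `y_test`.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score

# Test-set class supports as printed in the classification report.
supports = {0: 174, 1: 309, 2: 516, 3: 651}
y_true = np.repeat(list(supports.keys()), list(supports.values()))
X_placeholder = np.zeros((len(y_true), 1))  # the baseline ignores features

# Illustration only: fitted and scored on the same label counts.
baseline = DummyClassifier(strategy="most_frequent")
baseline.fit(X_placeholder, y_true)

baseline_accuracy = accuracy_score(y_true, baseline.predict(X_placeholder))
print(f"Baseline accuracy: {baseline_accuracy:.2f}")
```

With these supports, always predicting class 3 is right on 651 of 1650 rows, about 0.39, which is slightly above the reported 0.37.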

Data Visualization
import warnings
warnings.filterwarnings('ignore')

sns.pairplot(df, hue='Air Quality', palette='Set1', diag_kind='kde')

plt.suptitle('Pairplot of Features vs Air Quality', y=1.02)
plt.show()
cm = confusion_matrix(y_test, y_pred)
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=clf.classes_, yticklabels=clf.classes_)
plt.title('Confusion Matrix')
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.show()
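Raw counts in a confusion matrix are hard to compare across classes of different size. As an optional variation, `confusion_matrix` accepts `normalize='true'`, which divides each row by its total so that the diagonal holds the per-class recall. The labels below are hypothetical toy values, not the notebook's predictions.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical toy labels in the 0..3 encoding; not the notebook's predictions.
y_true_demo = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred_demo = [3, 3, 1, 3, 2, 3, 3, 3]

# normalize='true' divides each row by the number of actual samples in that class.
cm_norm = confusion_matrix(y_true_demo, y_pred_demo, normalize='true')

# The diagonal of the row-normalized matrix is the recall of each class.
per_class_recall = np.diag(cm_norm)
print(per_class_recall)
```

The normalized matrix can be passed to `sns.heatmap` in the same way as the count matrix, with `fmt='.2f'` instead of `fmt='d'`.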
"Thank you for exploring this notebook! I hope you found the analysis insightful and informative. Feel free to share your thoughts and questions in the comments below!"
