Supervised Classification Using GEE and Google Colab

The document outlines a supervised classification project using Google Colab and Google Earth Engine, focusing on land cover mapping in Galle. It details the steps involved, including study area selection, ground truth data collection, classification processes, and accuracy assessment methods. The project aims to improve classification accuracy by utilizing a larger study area and various geospatial tools.

Uploaded by

Saffi Ur Rehman
Copyright
© © All Rights Reserved
SUPERVISED CLASSIFICATION

USING GOOGLE COLAB & GOOGLE EARTH ENGINE

SOORIYAARACHCHI NS
PL 3508 – ADVANCED GIS AND REMOTE SENSING FOR PLANNING
DEPARTMENT OF TOWN AND COUNTRY PLANNING
UNIVERSITY OF MORATUWA
212365B – SOORIYAARACHCHI NS
SUPERVISED CLASSIFICATION

CONTENTS
1 INTRODUCTION
2 STUDY AREA SELECTION
3 COLLECTING GROUND TRUTH DATA
4 CLASSIFICATION
4.1 Step 1
4.2 Step 2 - Define the study area boundary
4.3 Step 3 - Create training data samples
4.4 Step 4 – Merge trained data and export the classified dataset
4.5 Step 5 – Importing and Initializing Earth Engine
4.6 Step 6 – Define the study area
4.7 Step 7 – Load and Preprocess Landsat Data
4.8 Step 8 - Load and Prepare Training Data
4.9 Step 9 - Train the Classifier and Classify the Image
4.10 Step 10 - Clip and Visualize the Classified Image
4.11 Step 11 - Clip with Polygon Shapefile and Export
4.12 Step 12 - Export Classified Image
4.13 Map layout of final output
5 ACCURACY ASSESSMENT
5.1 Step 1 – Add land use class to each point according to the classified result
5.2 Step 2 - Loading the Data in Google Colab and import relevant libraries
5.3 Step 3 - Load the CSV File
5.4 Step 4 - Extract Columns
5.5 Define Class Labels
5.6 Calculate Confusion Matrix and Create DataFrame for Confusion Matrix
5.7 Calculating Accuracy Metrics
5.8 Printing the Results
5.9 Accuracy Metrics Results from the Google Colab
5.10 Accuracy Results from Excel
6 INTERPRETATION OF THE RESULTS
7 LIMITATIONS TO IMPROVE ACCURACY
8 GOOGLE COLAB CODES


1 INTRODUCTION
Supervised classification is a widely used technique in remote sensing that involves categorizing
pixels in satellite imagery into predefined classes based on training data. This technique is essential
for various applications such as land cover mapping, urban planning, and environmental
monitoring. The main objective of this assignment is to apply supervised classification to a selected
study area using Google Earth Engine (GEE) and Google Colab, leveraging their powerful
computational capabilities and extensive data repositories.
In this assignment, the focus is on classifying land cover types within a selected area in my
hometown. The classification process involves several key steps, including data collection,
preprocessing, training data selection, model training, classification, and accuracy assessment.

2 STUDY AREA SELECTION


For this assignment, I selected several Grama Niladhari Divisions (GNDs) from my hometown,
Galle, for supervised classification. This decision was based on the observation that classified
results tend to be more accurate over larger areas compared to smaller ones. During preliminary
trials, I found that classifying smaller areas was challenging and often resulted in less accurate
representations of actual land cover classes. Here, I attach some of the classification results from
small areas, illustrating the low accuracy achieved by considering a small study area.


As the figures above show, it was difficult to obtain an accurate classification for a
small area. Therefore, I chose a larger study area.
The larger study area provided a more diverse and representative sample of different land cover
types, leading to improved training and classification accuracy. By encompassing a variety of land
covers within a larger area, the classifier could better differentiate between classes, ultimately
yielding more reliable and precise results.
This is the study area I selected for the supervised classification.


3 COLLECTING GROUND TRUTH DATA


After selecting the study area, the next crucial step was to collect ground truth data to ensure the
accuracy of the supervised classification. For this purpose, I utilized the QField application, a
mobile GIS tool that allows for efficient and precise data collection in the field. Using QField, I
visited various locations within the selected GNDs to gather accurate information on land use
types. This involved recording the land use classes for each sample point. The collected ground
truth data provided a reliable reference for training and validating the classifier, ensuring that the
classification results closely matched the actual land cover conditions observed in the field.
The steps I followed to collect ground truth data with the QField application are given below.
• Install the QField application on the mobile phone.

• Create a QGIS project, add a base map and create a point layer with suitable fields to
collect ground truth data.

o Id
o Place name
o Land use class of the place
o Photo
o Date on which the data was collected


• Change the attribute form settings of the created point layer as follows.


• Add the created QGIS project with the points layer to the QField Cloud using the
QFieldSync plugin.


• Open QField, find the relevant project and add data to the fields.


Here, I added around 60 points representing all land use classes I identified in the study area.

• Synchronize the QField project back to the QGIS project.


4 CLASSIFICATION

The classification was done using Google Earth Engine and Google Colab.

4.1 Step 1
• Create a Google Earth Engine project

4.2 Step 2 - Define the study area boundary


• The study area coordinates were extracted by uploading a KML file of the selected
study area to geojson.ai.

4.3 Step 3 - Create training data samples


• Use the point geometry feature.
• I identified four land use classes for the classification of the selected area: Built-up
Areas, Waterbodies, Vegetation and Bare Soil.


• The settings of the point geometry layer for Built-up Areas were adjusted as shown in the
figure below.

• Points were then added to the geometry layer over the built-up areas visible in the satellite
base map.

• In the same way, layers were created and points were added for all the identified land use
classes.


• Around 200 points were added for each land use class, but bare soil is much rarer than the
other land use classes.
• Therefore, fewer training points were added for that class.

• The figure below shows the result after adding points for all land use classes.

4.4 Step 4 – Merge trained data and export the classified dataset

• After running the code shown in the figure above, Classified_dataset was added to the Assets
of the Google Earth Engine project.
• The path of that dataset can then be copied and used in the Google Colab project.


4.5 Step 5 – Importing and Initializing Earth Engine

• Imports the Earth Engine library, which is essential for accessing and manipulating
geospatial data.
• Authenticates the Earth Engine account, allowing access to the project and data.
• Specifies the project ID for your Earth Engine project
• Initializes the Earth Engine library with the specified project ID

4.6 Step 6 – Define the study area


• Defines the geographical area of interest using a rectangular geometry with specified
coordinates.
• The coordinates from geojson.ai are used in this step.
• This is the study area where we created training data for each land use class.
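Steps 5 and 6 can be sketched as follows. This is a minimal illustration rather than the notebook's exact code: the project ID and the bounding coordinates are hypothetical placeholders, and the `ee` module is passed into `make_study_area` only so that the helper can be checked without an Earth Engine login.

```python
# Hypothetical placeholders: replace with your own project ID and the
# bounding coordinates obtained for the study area.
PROJECT_ID = "my-ee-project"
STUDY_AREA_BOUNDS = [80.17, 6.02, 80.27, 6.10]  # [west, south, east, north], degrees


def init_earth_engine(project_id=PROJECT_ID):
    """Authenticate and initialize Earth Engine, then return the ee module."""
    import ee  # imported here so this file loads even where ee is not installed

    ee.Authenticate()
    ee.Initialize(project=project_id)
    return ee


def make_study_area(ee, bounds=STUDY_AREA_BOUNDS):
    """Build a rectangular geometry from [west, south, east, north]."""
    return ee.Geometry.Rectangle(bounds)
```

In Colab this would be used as `ee = init_earth_engine()` followed by `study_area = make_study_area(ee)`.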

4.7 Step 7 – Load and Preprocess Landsat Data


• Loads the Landsat 8 Collection 2, Tier 1 Top of Atmosphere (TOA) Reflectance data.
• Filters the image collection to include only images that intersect with the study area.
• Filters the image collection to include only images within the specified date range (January
1, 2023, to December 31, 2023)
• Sorts the filtered images by cloud cover in ascending order
• Selects the first image (the one with the least cloud cover)
• Selects specific bands (B2, B3, B4, B5, B6, B7) from the image collection
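The filtering chain listed above could look roughly like this. It is a sketch under assumptions: the function name is illustrative, `ee` is an already initialized Earth Engine module, and `region` is the study area geometry. Note that `filterDate` treats its end date as exclusive, so the sketch ends on 2024-01-01 to keep images from 31 December 2023.

```python
COLLECTION_ID = "LANDSAT/LC08/C02/T1_TOA"  # Landsat 8, Collection 2, Tier 1, TOA
BANDS = ["B2", "B3", "B4", "B5", "B6", "B7"]


def least_cloudy_image(ee, region, start="2023-01-01", end="2024-01-01"):
    """Return the least cloudy scene over `region` in the date range, limited to BANDS."""
    return (
        ee.ImageCollection(COLLECTION_ID)
        .filterBounds(region)      # only scenes that intersect the study area
        .filterDate(start, end)    # start inclusive, end exclusive
        .sort("CLOUD_COVER")       # ascending, so the clearest scene comes first
        .first()
        .select(BANDS)
    )
```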


4.8 Step 8 - Load and Prepare Training Data


• Loads the training data, which is a feature collection containing labeled sample points.
• Samples the image subset using the training data to create a training dataset. The
sampleRegions function extracts pixel values at the locations of the training samples and
associates them with the corresponding class labels.


4.9 Step 9 - Train the Classifier and Classify the Image


• Trains a Support Vector Machine (SVM) classifier using the training dataset. The SVM
uses a radial basis function (RBF) kernel with specified gamma and cost parameters.
• Applies the trained classifier to the image subset to generate a classified image, where each
pixel is assigned a class label.
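A possible shape for Steps 8 and 9 is sketched below. Several details are assumptions, not values taken from the report: the class property name `landcover`, the `gamma` and `cost` defaults, and the function name are all illustrative. `ee` is an initialized Earth Engine module, `image` is the band-subset Landsat image, and `training_asset` is the asset path copied from the Earth Engine project.

```python
BANDS = ["B2", "B3", "B4", "B5", "B6", "B7"]
LABEL = "landcover"  # hypothetical name of the class property on the training points


def classify_with_svm(ee, image, training_asset, gamma=0.5, cost=10):
    """Sample the image at the training points, train an RBF SVM, classify the image.

    gamma and cost are illustrative defaults, not the values used in the report.
    """
    points = ee.FeatureCollection(training_asset)
    # Extract the band values under each labeled point at Landsat resolution.
    training = image.sampleRegions(collection=points, properties=[LABEL], scale=30)
    classifier = ee.Classifier.libsvm(kernelType="RBF", gamma=gamma, cost=cost).train(
        features=training, classProperty=LABEL, inputProperties=BANDS
    )
    return image.classify(classifier)
```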

4.10 Step 10 - Clip and Visualize the Classified Image


• Clips the classified image to the study area to focus the analysis on the selected area.
• Imports folium for creating interactive maps, and display from IPython for displaying the
map.
• Defines a helper function to add Earth Engine layers to a Folium map
• Defines a color palette for visualizing different land cover classes.
• Creates a Folium map centered on the study area with a specified zoom level.
• Adds the classified image to the map using the defined visualization parameters.


• The resulting map is given below.

4.11 Step 11 - Clip with Polygon Shapefile and Export


• As I did the classification for a large area, I clipped the classified image to a few GNDs,
as instructed for the study area selection.
• Here I selected an area of 6 GNDs: Megalle, Dewata, Kadagoda,
Dewathura, Bataduwa West and Nugadoowa.


• The clipped map is given below.

4.12 Step 12 - Export Classified Image


• Defines an export task to save the classified image as a TIFF file to your Google Drive.
The export task specifies the image to be exported, a description, the destination folder, the
filename prefix, the region, the scale (30 meters per pixel), the coordinate reference system
(EPSG:32644), and the maximum number of pixels.
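The export task described above might be defined along these lines. The scale and coordinate reference system come from the text; the description, folder, filename prefix and `maxPixels` value are hypothetical placeholders, and `ee` is again an initialized Earth Engine module.

```python
def export_to_drive(ee, image, region, folder="GEE_exports", prefix="classified_image"):
    """Start a Drive export of the classified image as a GeoTIFF and return the task."""
    task = ee.batch.Export.image.toDrive(
        image=image,
        description="classified_image_export",  # hypothetical task name
        folder=folder,                          # hypothetical Drive folder
        fileNamePrefix=prefix,                  # hypothetical file name
        region=region,
        scale=30,                               # 30 meters per pixel
        crs="EPSG:32644",                       # WGS 84 / UTM zone 44N
        maxPixels=1e13,                         # illustrative upper limit
    )
    task.start()
    return task
```

The export runs on Google's servers, so `task.status()` can be polled to see when the file has arrived in Drive.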


4.13 Map layout of final output


5 ACCURACY ASSESSMENT

After completing the supervised classification, it is crucial to evaluate the accuracy of the
classification results. Accuracy assessment helps determine how well the classified image matches
the actual land cover on the ground. This process involves comparing the classified results with
ground truth data that was not used during the training phase. The following steps outline the key
components of the accuracy assessment:

Confusion Matrix

• A confusion matrix is constructed to summarize the performance of the classifier. It shows
the number of correct and incorrect classifications for each class.
• The matrix includes rows representing the actual classes and columns representing the
predicted classes.
Overall Accuracy

• The ratio of correctly classified instances to the total number of instances. It is calculated
as the sum of the diagonal elements of the confusion matrix divided by the total number of
instances.
Producer's Accuracy

• Indicates the accuracy of individual classes, representing the probability that a pixel in a
given class was classified correctly.
• It is calculated by dividing the number of correctly classified pixels for a class by the total
number of actual pixels in that class (row total).
User's Accuracy
• Represents the reliability of the classified map from the user's perspective.
• It is calculated by dividing the number of correctly classified pixels for a class by the total
number of pixels classified in that class (column total).
Kappa Coefficient
• A statistical measure that accounts for the agreement occurring by chance.
• It is calculated using the confusion matrix and provides a more robust assessment of
classification accuracy.
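The definitions above can be written compactly as follows. The notation is introduced here for illustration only: n_{ij} is the number of points of actual class i assigned to class j, n_{i+} and n_{+i} are the row and column totals for class i, k is the number of classes, and N is the total number of points.

```latex
\begin{align*}
\text{Overall accuracy} &= \frac{\sum_{i=1}^{k} n_{ii}}{N} \\
\text{Producer's accuracy}_i &= \frac{n_{ii}}{n_{i+}} \\
\text{User's accuracy}_i &= \frac{n_{ii}}{n_{+i}} \\
\kappa &= \frac{N \sum_{i=1}^{k} n_{ii} - \sum_{i=1}^{k} n_{i+}\, n_{+i}}
               {N^{2} - \sum_{i=1}^{k} n_{i+}\, n_{+i}}
\end{align*}
```

These follow the convention stated above, with rows as actual classes and columns as predicted classes. If the matrix is laid out the other way round, the roles of the row and column totals in the producer's and user's accuracies swap.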

• I calculated the accuracy both in Google Colab and manually in Excel.
• The steps I followed in the accuracy assessment are given below.


5.1 Step 1 – Add land use class to each point according to the classified result

• First, I added a land use class to each point manually, and then obtained the class again by
running the sample raster points tool.
• Both results were the same, so I used the output of the sample raster points tool for the further steps.
• Then I exported that points layer as a comma-separated values (CSV) file.

5.2 Step 2 - Loading the Data in Google Colab and import relevant libraries
• Upload the Accuracy_csv_new CSV file to the content folder of the Google Colab project.

• The pandas library is used for data manipulation and analysis.


• The Sklearn library provides tools for building and evaluating machine learning models,
including metrics for classification.

5.3 Step 3 - Load the CSV File


• Reads the uploaded CSV file into a pandas DataFrame.

5.4 Step 4 - Extract Columns


• Extracts the actual and classified land use classes from the DataFrame.

5.5 Define Class Labels


• Gets the unique class labels from the actual land use column and sorts them.

5.6 Calculate Confusion Matrix and Create DataFrame for Confusion Matrix
• Computes the confusion matrix, a table that describes the performance of the classification
model by comparing the actual and classified labels.
• Then converts the confusion matrix into a DataFrame for better readability.
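To make the confusion matrix step concrete, here is a dependency-free sketch of what it computes. The notebook itself uses pandas and scikit-learn, so this is an illustration of the idea and not the notebook's code; the five sample labels are made up and merely stand in for the actual and classified columns of the CSV file.

```python
def confusion_matrix(actual, classified, labels):
    """Count label pairs: rows are actual classes, columns are classified classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, c in zip(actual, classified):
        matrix[index[a]][index[c]] += 1
    return matrix


# Made-up sample values standing in for the two CSV columns.
actual = ["Vegetation", "Waterbodies", "Vegetation", "Bare Soil", "Builtup Areas"]
classified = ["Vegetation", "Waterbodies", "Bare Soil", "Bare Soil", "Builtup Areas"]

labels = sorted(set(actual))  # unique class labels, sorted as in section 5.5
matrix = confusion_matrix(actual, classified, labels)

for label, row in zip(labels, matrix):
    print(f"{label:15s} {row}")
```

With scikit-learn, `sklearn.metrics.confusion_matrix(actual, classified, labels=labels)` produces the same counts, and wrapping the result in `pandas.DataFrame(matrix, index=labels, columns=labels)` gives the labeled table described above.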


5.7 Calculating Accuracy Metrics

5.8 Printing the Results


5.9 Accuracy Metrics Results from the Google Colab


• These are the results I received for the confusion matrix, user accuracy, producer accuracy,
overall accuracy and Kappa coefficient.

5.10 Accuracy Results from Excel


• Confusion Matrix
                 Builtup Areas  Waterbodies  Vegetation  Bare Soil  Row Total
Builtup Areas               14            2           1          0         17
Waterbodies                  2           13           0          0         15
Vegetation                   1            2          11          2         16
Bare Soil                    0            0           5          5         10
Column Total                17           18          17          6         58

• User Accuracy
User Accuracy = (Number of correctly classified pixels in a category / Total number of pixels classified into that category (row total)) * 100

                 User Accuracy (%)
Builtup Areas    82.35294118
Waterbodies      86.66666667
Vegetation       68.75
Bare Soil        50


• Producer Accuracy
Producer Accuracy = (Number of correctly classified pixels in a category / Total number of reference pixels in that category (column total)) * 100

                 Producer Accuracy (%)
Builtup Areas    82.35294118
Waterbodies      72.22222222
Vegetation       64.70588235
Bare Soil        83.33333333

• Kappa Coefficient
Kappa Coefficient = ((TS × TCS) − ∑(Column Total × Row Total)) / (TS^2 − ∑(Column Total × Row Total)) × 100
TS = Total samples
TCS = Total correct samples, i.e. the sum of the diagonal values

Kappa Coefficient
TS × TCS                       2494
∑(Column Total × Row Total)    891
TS^2                           3364

Kappa Coefficient              64.82005661

• Overall Accuracy

Overall Accuracy = (Total number of correctly classified pixels / Total number of pixels)* 100

Overall Accuracy 74.13793103

• The results from Google Colab and from Excel are the same.
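The Excel figures can be re-derived with a few lines of plain Python. This sketch takes the diagonal values and the row and column totals exactly as printed in the confusion matrix above. The printed Waterbodies and Bare Soil column totals (18 and 6) are used as given, not re-summed from the individual cells, because those are the totals the reported percentages are based on. Following the Excel table, user accuracy divides by the row total and producer accuracy by the column total.

```python
classes = ["Builtup Areas", "Waterbodies", "Vegetation", "Bare Soil"]
correct = [14, 13, 11, 5]        # diagonal of the confusion matrix
row_totals = [17, 15, 16, 10]    # last column of the table
col_totals = [17, 18, 17, 6]     # last row of the table, as printed

total = sum(row_totals)          # TS
total_correct = sum(correct)     # TCS

user_acc = [100 * c / r for c, r in zip(correct, row_totals)]
producer_acc = [100 * c / t for c, t in zip(correct, col_totals)]
overall = 100 * total_correct / total

# Sum of (column total x row total) over the classes, the chance-agreement term.
chance = sum(r * c for r, c in zip(row_totals, col_totals))
kappa = 100 * (total * total_correct - chance) / (total ** 2 - chance)

for name, u, p in zip(classes, user_acc, producer_acc):
    print(f"{name:15s} user {u:6.2f}%  producer {p:6.2f}%")
print(f"Overall accuracy {overall:.2f}%, Kappa {kappa:.2f}%")
```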


6 INTERPRETATION OF THE RESULTS


• This is the Kappa value interpretation by Landis & Koch (1977).
(https://towardsdatascience.com/interpretation-of-kappa-values-2acd1ca7b18f)
Value of Kappa    Level of Agreement
< 0               No agreement
0 – 0.20          Slight
0.21 – 0.40       Fair
0.41 – 0.60       Moderate
0.61 – 0.80       Substantial
0.81 – 1.00       Almost perfect

• I got a Kappa coefficient of 64.82% (0.648), which indicates a substantial level of agreement
between the ground truth data and the classified data.
• Waterbodies and built-up areas have relatively high producer and user accuracies (72% to 87%),
indicating reliable classification for these classes.
• Vegetation has moderate producer and user accuracies (about 65% and 69%), suggesting some
confusion with other classes, particularly Bare Soil.
• Bare Soil has the lowest user accuracy (50%), indicating significant misclassification, primarily
as Vegetation.
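The lookup against the Landis & Koch bands can also be expressed as a small helper. This is an illustrative sketch: the function name is hypothetical, and it expects Kappa on the 0 to 1 scale, so the 64.82% reported above is passed as 0.6482.

```python
def kappa_agreement(kappa):
    """Map a Kappa value (0 to 1 scale) to its Landis & Koch (1977) agreement level."""
    if kappa < 0:
        return "No agreement"
    if kappa <= 0.20:
        return "Slight"
    if kappa <= 0.40:
        return "Fair"
    if kappa <= 0.60:
        return "Moderate"
    if kappa <= 0.80:
        return "Substantial"
    return "Almost perfect"


print(kappa_agreement(64.82 / 100))
```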

7 LIMITATIONS TO IMPROVE ACCURACY


• If certain classes have significantly more training samples than others, the classifier might
perform better for those classes and worse for underrepresented ones.
• In my study area, bare soil covers much less area than the other land use types. Therefore, that
class has fewer training samples than the other classes.
• Some classes might have similar spectral characteristics, leading to confusion (for example,
Bare Soil and Vegetation).
• The spatial resolution of the Landsat data (30 meters) might not be sufficient to capture
fine details, leading to misclassification in heterogeneous areas.


8 GOOGLE COLAB CODES


• Supervised Classification
https://colab.research.google.com/drive/1mAXIEXMAHppGj_Mnfe3ixa2aQrGvSqR8?usp=sharing
• Accuracy Assessment
https://colab.research.google.com/drive/1i8Iq72YqtlJw349JA5Rxonzz2TrLXMKM?usp=sharing
