INTERNSHIP


AICTE EDUSKILLS GOOGLE AI-ML

VIRTUAL INTERNSHIP
An Internship Report Submitted at the end of seventh semester

BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE AND ENGINEERING

Submitted By
BODDEPALLI MEENESH
(21981A0520)

Under the esteemed guidance of


KARTHIK PADMANABHAN
(EduSkills Foundation)

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

RAGHU ENGINEERING COLLEGE


(AUTONOMOUS)
Affiliated to JNTU GURAJADA, VIZIANAGARAM
Approved by AICTE, Accredited by NBA, Accredited by NAAC with A+ grade
2024-2025

CERTIFICATE

This is to certify that the project entitled “GOOGLE AI-ML”, done by BODDEPALLI
MEENESH (21981A0520), a student of B.Tech in the Department of Computer Science and Engineering,
Raghu Engineering College, during the period 2021-2025, in partial fulfillment of the requirements for the
award of the Degree of Bachelor of Technology in Computer Science and Engineering by Jawaharlal Nehru
Technological University, Gurajada Vizianagaram, is a record of bonafide work carried out under my
guidance and supervision.
The results embodied in this internship report have not been submitted to any other University or
Institute for the award of any Degree.

Internal Guide
Dr.P.Appala Naidu,
Professor,
Dept of CSE,
Raghu Engineering College,
Dakamarri (V),
Visakhapatnam.

Head of the Department
Dr.R.Sivaranjani,
Professor,
Dept of CSE,
Raghu Engineering College,
Dakamarri (V),
Visakhapatnam.

External Examiner
DISSERTATION APPROVAL SHEET
This is to certify that the dissertation titled
Google AI-ML Virtual Internship
BY
BODDEPALLI MEENESH (21981A0520)

Is approved for the degree of Bachelor of Technology

Dr.P.Appala Naidu
INTERNSHIP GUIDE
(Professor)

Internal Examiner

External Examiner

Dr.R.Sivaranjani
HOD
(Professor)

Date:
DECLARATION

This is to certify that this internship titled “GOOGLE AI-ML” is bonafide work done by me,
in partial fulfillment of the requirements for the award of the degree B.Tech, and submitted to the
Department of Computer Science and Engineering, Raghu Engineering College, Dakamarri.
I also declare that this internship is a result of my own effort, that it has not been copied from
anyone, and that I have taken citations only from the sources mentioned in the references.
This work was not submitted earlier at any other University or Institute for the award of any
degree.

Date:
Place:

BODDEPALLI MEENESH
(21981A0520)
CERTIFICATE
ACKNOWLEDGEMENT

I express sincere gratitude to my esteemed Institute “Raghu Engineering College”, which has
provided me an opportunity to fulfill my most cherished desire to reach my goal.
I take this opportunity with great pleasure to put on record my ineffable personal indebtedness to
Mr. Raghu Kalidindi, Chairman of Raghu Engineering College, for providing the necessary
departmental facilities.
I would like to thank the Principal, Dr. CH. Srinivasu of “Raghu Engineering College”, for
providing the requisite facilities to carry out projects on campus. Your expertise in the subject matter
and dedication towards our project have been a source of inspiration for all of us.
I sincerely express my deep sense of gratitude to Dr.R.Sivaranjani, Professor, Head of the
Department of Computer Science and Engineering, Raghu Engineering College, for
her perspicacity, wisdom and sagacity coupled with compassion and patience. It is my great pleasure
to submit this work under her wing. I thank you for guiding me to the successful completion of this
project work.
I would like to thank Karthik Padmanabhan, EduSkills Foundation, for providing the
technical guidance to carry out the module assigned. Your expertise in the subject matter and
dedication towards our project have been a source of inspiration for all of us.
I extend my heartfelt thanks to all faculty members of the Computer Science department for
their value-based teaching of theory and practical subjects, which were used in the project.
I thank the non-teaching staff of the Department of Computer Science and Engineering, Raghu
Engineering College, for their invaluable support.

Regards
BODDEPALLI MEENESH
(21981A0520)
ABSTRACT
The Google AI/ML virtual internship is a rigorous program designed to let participants
learn in depth about artificial intelligence and machine learning, including but not limited to
the use of TensorFlow. Interns work through different areas, with theory and practice courses
that are meant to promote practical application of the knowledge gained.
Progress through the course is shown by successively unlocking badges on the Google
Developer Profile, each marking proficiency in essential skills such as object detection,
image classification, and product image search.
By working hands-on with TensorFlow over the course of the internship, the
participant becomes capable of designing and deploying machine learning models.
Participants also use Google Colab, an online platform that provides a hosted Jupyter
notebook environment for scientific computing and for writing scientific documents.
This makes experimentation and model development more flexible.
The program structure involves mentorship meetings, group projects, and
code reviews together with other interns, aiming at a balanced learning experience and skill
building.
These skills are put to good use in real-world applications of AI/ML.
Code reviews and presentations further exercise communication and problem-solving skills,
preparing students for real-world workplace scenarios. The technology
employed in the program reflects industry standards, and the emphasis on team
learning gives participants skills and knowledge that are highly valued in
the modern AI/ML field. At the end of the internship, trainees, with strong knowledge of the basic
approaches to Artificial Intelligence/Machine Learning, obtain a practical toolkit that helps
them solve such tasks. Through the internship program, trainees are well equipped
to handle future opportunities in the digital space.
Table of Contents
S.NO CONTENT

1. Introduction

2. Program neural networks with TensorFlow

3. Get started with object detection

4. Go further with object detection

5. Get started with product image search

6. Go further with product image search

7. Go further with image classification

8. Conclusion
1. INTRODUCTION
Understanding and handling visual information is now needed in almost every field, for
instance self-driving technology, e-health, and e-commerce. Computer vision, a branch of
artificial intelligence, gives computers the capacity to process visual information and has a number
of applications that improve user experiences and business processes.

In this project we focus on the application of TensorFlow to important computer vision
problems, namely object detection, image classification, and product search. TensorFlow is a broad
open-source library created by Google that allows advanced solutions to be integrated into existing
architectures.

Self-driving cars and pattern recognition technologies require object detection, whereas image
classification is useful in analyzing medical images as well as in content security. Visual
product search has also gained importance, especially in online shopping, where images are used to
search for products instead of written text.

Within the scope of this project, we will consider the methods, algorithms, datasets, and other
relevant components related to the aforementioned tasks. We will demonstrate the application of
TensorFlow in building effective and precise vision systems that are implemented in practice.

This internship program is designed to equip participants with foundational and advanced
knowledge in AI and ML, focusing on real-world applications using Google’s industry-leading tools and
platforms. Interns engage in hands-on projects covering key topics such as supervised and unsupervised
learning, neural networks, natural language processing, and cloud-based AI solutions. Through a
comprehensive curriculum, participants not only gain technical expertise but also develop
problem-solving and analytical skills necessary to build and deploy AI models effectively.

The Google AI/ML Virtual Internship aims to nurture a new generation of AI/ML professionals by
providing in-depth insights into the industry and fostering a deep understanding of AI’s transformative
potential.
2. Program neural networks with TensorFlow

2.1 Introduction to Computer Vision:


Computer vision focuses on teaching machines to understand and interpret visual data (images, videos).
It allows computers to mimic human vision tasks like recognizing objects, detecting motion, and
understanding scenes. Applications include facial recognition, self-driving cars, medical imaging, and
more.
2.2 Introduction to Convolutions:
Convolution is a mathematical operation essential in image processing. It involves applying
filters to images, which helps in detecting patterns such as edges, textures, and corners. These filters are
small matrices that move across the image, capturing essential features.
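The sliding-filter idea above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not how TensorFlow implements it: the function name convolve2d, the tiny 4x4 image, and the 2x2 kernel are all made up for the example. As in most deep learning libraries, the kernel is applied without flipping (strictly speaking, a cross-correlation), with no padding and a stride of 1.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the kernel with the image patch under it and sum the products.
            total = 0
            for di in range(kernel_h):
                for dj in range(kernel_w):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 4x4 grayscale image: dark left half (0), bright right half (9).
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# Hypothetical kernel that responds where brightness increases from left to right.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, vertical_edge)
for row in feature_map:
    print(row)
```

The feature map is non-zero only in the column where the dark and bright halves meet, which is the sense in which a filter "detects" an edge.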
2.3 Convolutional Neural Networks (CNNs):
CNNs are deep learning models designed specifically for processing and analyzing visual data.
CNNs are composed of layers:
● Convolutional layers: extract features from input images by applying filters (kernels).
● Pooling layers: reduce the dimensionality of the image data, retaining the most important
information.
● Fully connected layers: connect the final features to the output, making predictions such as
classification.
CNNs are widely used in image recognition tasks because they efficiently capture hierarchical patterns
in images—starting with simple edges and progressing to complex structures.
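To make the pooling step concrete, here is a small sketch of 2x2 max pooling in plain Python. The function name max_pool2d and the sample feature map are assumptions for illustration only; a real CNN would use a library layer instead.

```python
def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map produced by a convolutional layer.
feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [1, 1, 3, 7],
]

pooled = max_pool2d(feature_map)
print(pooled)
```

The 4x4 input shrinks to 2x2 while the strongest response in each region is retained, which is how pooling reduces dimensionality without discarding the most prominent features.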
2.4 Using CNNs with Larger Datasets:
When working with large datasets, CNNs show exceptional performance but require a lot of
computational power. Key techniques for managing large datasets include:
● Data augmentation: This involves modifying images (e.g., rotating, flipping, or scaling) to
artificially expand the dataset without collecting more data.
● Transfer learning: Involves using pre-trained CNNs and fine-tuning them for specific tasks. This
saves time and computational resources while leveraging the knowledge from previously trained
models.
By employing CNNs and these techniques, developers can handle complex, large-scale visual data,
building systems that achieve high accuracy in image classification, object detection, and more.
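As a rough illustration of data augmentation, the sketch below derives a flipped and a rotated copy from one small image stored as nested lists. The helper names and the 2x3 image are hypothetical; in practice an image library or the training framework would perform these transformations.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in image]


def rotate_90_clockwise(image):
    """Rotate the image a quarter turn clockwise."""
    return [list(row) for row in zip(*image[::-1])]


# Hypothetical 2x3 image; each number stands for a pixel value.
image = [
    [1, 2, 3],
    [4, 5, 6],
]

# One original image becomes three training examples with the same label.
augmented = [image, flip_horizontal(image), rotate_90_clockwise(image)]
for variant in augmented:
    print(variant)
```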
3. Get started with object detection
ML Kit is a mobile SDK that brings Google's on-device machine learning expertise to Android
and iOS apps. You can use the powerful yet simple to use Vision and Natural Language APIs to solve
common challenges in your apps or create brand-new user experiences. All are powered by Google's
best-in-class ML models and offered to you at no cost.
ML Kit's APIs all run on-device, allowing for real-time use cases where you want to process a
live camera stream, for example. This also means that the functionality is available offline.
This codelab will walk you through simple steps to add Object Detection and Tracking (ODT) for a
given image into your existing Android app. Please note that this codelab takes some shortcuts to
highlight ML Kit ODT usage.
3.1 Add ML Kit Object Detection and Tracking API to the project

First, there is a Button at the bottom of the starter app that lets you:


● bring up the camera app integrated in your device/emulator
● take a photo inside your camera app
● receive the captured image in starter app
Try out the Take photo button, follow the prompts to take a photo, accept the photo and observe it
displayed inside the starter app.

fig 3.1 interface of the object detection mobile app
fig 3.2 capturing photo in starter app
fig 3.3 captured image
3.2 Add on-device object detection
In this step, you will add the functionality to the starter app to detect objects in images. As you
saw in the previous step, the starter app contains boilerplate code to take photos with the camera app on
the device. There are also 3 preset images in the app that you can try object detection on if you are
running the codelab on an Android emulator.
When you have selected an image, either from the preset images or taking a photo with the
camera app, the boilerplate code decodes that image into a Bitmap instance, shows it on the screen and
calls the runObjectDetection method with the image.
3.3 Set up and run on-device object detection on an image
There are only 3 simple steps with 3 APIs to set up ML Kit ODT:
● prepare an image: InputImage
● create a detector object: ObjectDetection.getClient(options)
● connect the 2 objects above: process(image)
Step 1: Create an InputImage
Step 2: Create a detector instance
ML Kit follows the Builder Design Pattern. You will pass the configuration to the builder, then acquire a
detector from it. There are 3 options to configure:
● detector mode (single image or stream)
● detection mode (single or multiple object detection)
● classification mode (on or off)
Step 3: Feed image(s) to the detector
Object detection and classification is async processing:
● You send an image to the detector (via process()).
● The Detector reports the result back to you via a callback.
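The callback pattern described above can be sketched in Python as an analogy. This is not ML Kit code: detect_objects is a hypothetical stand-in for the detector, and the label and confidence it returns are made-up values. The point is only the flow of control: the image is submitted, the caller does not block on the result, and a callback receives the result when processing completes.

```python
from concurrent.futures import ThreadPoolExecutor


def detect_objects(image_name):
    """Hypothetical stand-in for a detector; returns made-up results."""
    return [{"label": "Food", "confidence": 0.87}]


results = []


def on_success(future):
    # Called once the background task has finished.
    results.extend(future.result())


with ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(detect_objects, "salad.jpg")  # send the image
    future.add_done_callback(on_success)                    # register the callback

# Leaving the `with` block waits for the worker, so the callback has run by now.
print(results)
```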
3.4 Post-processing the detection results
In this section, you'll draw the detection result onto the image:
● draw the bounding box on image
● draw the category name and confidence inside bounding box
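Before anything is drawn, each detection has to be turned into a box plus a caption. The sketch below shows one way that conversion might look. The BoxWithText name comes from the utility signature in this report, but its fields, the dictionary layout of the detections, and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class BoxWithText:
    box: tuple  # assumed layout: (left, top, right, bottom) in pixels
    text: str   # caption to draw inside the box


def to_boxes_with_text(detections):
    """Build one caption per detected object from its best label."""
    boxes = []
    for detection in detections:
        if detection["labels"]:
            best = max(detection["labels"], key=lambda label: label["confidence"])
            text = f'{best["text"]}, {round(best["confidence"] * 100)}%'
        else:
            # Classification may return no label for an object.
            text = "Unknown"
        boxes.append(BoxWithText(detection["bounding_box"], text))
    return boxes


# Hypothetical detection results.
detections = [
    {
        "bounding_box": (10, 20, 110, 220),
        "labels": [
            {"text": "Food", "confidence": 0.87},
            {"text": "Plant", "confidence": 0.50},
        ],
    },
    {"bounding_box": (0, 0, 50, 50), "labels": []},
]

for item in to_boxes_with_text(detections):
    print(item.box, item.text)
```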
3.5 Understand the visualization utilities
● fun drawDetectionResult(bitmap: Bitmap, detectionResults: List<BoxWithText>): Bitmap
This method draws the object detection results in detectionResults on the input bitmap and returns the
modified copy of it.
Here is an example of an output of the drawDetectionResult utility method:

fig 3.4 result of detection result

3.6 Visualize the ML Kit detection result


Use the visualization utilities to draw the ML Kit object detection result on top of the input
image. Once the app loads, press the Button with the camera icon, point your camera to an object, take a
photo, accept the photo (in Camera App) or you can easily tap any preset images. You should see the
detection results; press the Button again or select another image to repeat a couple of times to experience
the latest ML Kit ODT!

fig 3.5 detection results of a photo


4. Go further with object detection
In this unit, you'll learn how to train a custom object detection model using a set of training
images with TFLite Model Maker, then deploy your model to an Android app using TFLite Task
Library. You will:
● Build an Android app that detects ingredients in images of meals.
● Integrate a TFLite pre-trained object detection model and see the limit of what the model can
detect.
● Train a custom object detection model to detect the ingredients/components of a meal using a
custom dataset called salad and TFLite Model Maker.
● Deploy the custom model to the Android app using TFLite Task Library.

4.1 Object Detection


Object detection is a set of computer vision tasks that can detect and locate objects in a digital
image. Given an image or a video stream, an object detection model can identify which of a known set
of objects might be present, and provide information about their positions within the image.
TensorFlow provides pre-trained, mobile optimized models that can detect common objects, such
as cars, oranges, etc. You can integrate these pre-trained models in your mobile app with just a few lines
of code. However, you may want or need to detect objects in more distinctive or offbeat categories. That
requires collecting your own training images, then training and deploying your own object detection
model.
4.2 TensorFlow Lite
TensorFlow Lite is a cross-platform machine learning library that is optimized for running
machine learning models on edge devices, including Android and iOS mobile devices. TensorFlow Lite
is actually the core engine used inside ML Kit to run machine learning models. There are two
components in the TensorFlow Lite ecosystem that make it easy to train and deploy machine learning
models on mobile devices:
● Model Maker is a Python library that makes it easy to train TensorFlow Lite models using your
own data with just a few lines of code, no machine learning expertise required.
● Task Library is a cross-platform library that makes it easy to deploy TensorFlow Lite models
with just a few lines of code in your mobile apps.
There is also a utility method to help you visualize the detection result:
● fun drawDetectionResult(bitmap: Bitmap, detectionResults: List<DetectionResult>): Bitmap
This method draws the object detection results in detectionResults on the input bitmap.
Here is an example of an output of the drawDetectionResult utility method.

fig 4.1 output of the detection result

The TFLite Task Library makes it easy to integrate mobile-optimized machine learning models
into a mobile app. It supports many popular machine learning use cases, including object detection,
image classification, and text classification. You can load the TFLite model and run it with just a few
lines of code.
The starter app is a minimal Android application that:
- Uses either the device camera or the available preset images.
- Already contains methods for taking pictures and presenting object detection output.
You will add object detection to the application by filling out the method
`runObjectDetection()`.
The function is defined as follows:
`runObjectDetection(bitmap: Bitmap)`: This function runs the object detection model on an input image.
Add a Pre-trained Object Detection Model
● Download the model. The pre-trained TFLite model is EfficientDet-Lite. This model is designed
to be mobile efficient, and it is trained on the COCO 2017 dataset.
● Add dependencies
● Configure and perform object detection
● Render the detection results
Train a Custom Object Detection Model
You will train a custom model to detect meal ingredients using TFLite Model Maker and Google
Colab. The dataset is composed of labeled images of ingredients like cheese and baked
products.

fig 4.2 accuracy of the predicted items

In this unit, you developed an Android application that can detect objects in images, first with a
pre-trained TFLite model, and then with a custom object detection model that you trained and deployed.
You used TFLite Model Maker for model training and TFLite Task Library to integrate the model into
the application.
5. Get started with product image search
5.1 Detect objects in images to build a visual product search with ML Kit: Android
Have you seen the Google Lens demo, where you can point your phone camera at an object and
find where you can buy it online? If you want to learn how you can add the same feature to your app,
then this codelab is for you. It is part of a learning pathway that teaches you how to build a product
image search feature into a mobile app.
In this codelab, you will learn the first step to build a product image search feature: how to detect
objects in images and let the user choose the objects they want to search for. You will use ML Kit Object
Detection and Tracking to build this feature.

5.1.1 Import the app into Android Studio


Start by importing the starter app into the Android Studio.
Go to Android Studio, select Import Project (Gradle, Eclipse ADT, etc.) and choose the starter folder from
the source code that you have downloaded earlier.
5.1.2 Add the dependencies for ML Kit Object Detection and Tracking
The ML Kit dependencies allow you to integrate the ML Kit ODT SDK in your app.
Go to the app/build.gradle file of your project and confirm that the dependency is already there:
build.gradle
5.2 Add on-device object detection
In this step, you'll add the functionality to the starter app to detect objects in images. As you saw
in the previous step, the starter app contains boilerplate code to take photos with the camera app on the
device. There are also 3 preset images in the app that you can try object detection on, if you are running
the codelab on an Android emulator.
When you select an image, either from the preset images or by taking a photo with the camera
app, the boilerplate code decodes that image into a Bitmap instance, shows it on the screen and calls the
runObjectDetection method with the image.
In this step, you will add code to the runObjectDetection method to do object detection!
Step 1: Create an InputImage
Step 2: Create a detector instance
Step 3: Feed image(s) to the detector
Object detection and classification is async processing:
● You send an image to the detector (via process()).
● The detector reports the result back to you via a callback.
Upon completion, the detector notifies you with:
1. The total number of objects detected
2. A description of each detected object, consisting of:
● trackingId: an integer you use to track it across frames (NOT used in this codelab)
● boundingBox: the object's bounding box
● labels: a list of label(s) for the detected object (only when classification is enabled)
● text: the text of each label, including "Fashion Goods", "Food", "Home Goods", "Place",
"Plant"
5.3 Understand the visualization utilities
There is some boilerplate code inside the codelab to help you visualize the detection result. Leverage these
utilities to make our visualization code simple:
● fun drawDetectionResults(results: List<DetectedObject>) This method draws white circles at the
center of each object detected.
● fun setOnObjectClickListener(listener: ((objectImage: Bitmap) -> Unit)) This is a callback to
receive the cropped image that contains only the object that the user has tapped on. You will
send this cropped image to the image search backend in a later codelab to get a visually similar
result. In this codelab, you won't use this method yet.

fig 5.1 interface of the product image search app


6. Go further with product image search
6.1 Call Vision API Product Search backend on Android
Have you seen the Google Lens demo, where you can point your phone camera to an object and
find where you can buy it online? If you want to learn how you can add the same feature to your app,
then this codelab is for you. It is part of a learning pathway that teaches you how to build a product
image search feature into a mobile app.
In this codelab, you will learn how to call a backend built with Vision API Product Search from a
mobile app. This backend can take a query image and search for visually similar products from a
product catalog.
6.2 About Vision API Product Search
Vision API Product Search is a feature in Google Cloud that allows users to search for visually
similar products from a product catalog. Retailers can create products, each containing reference images
that visually describe the product from a set of viewpoints. You can then add these products to product
sets (i.e. product catalog). Currently Vision API Product Search supports the following product
categories: homegoods, apparel, toys, packaged goods, and general.
When users query the product set with their own images, Vision API Product Search applies
machine learning to compare the product in the user's query image with the images in the retailer's
product set, and then returns a ranked list of visually and semantically similar results.
6.3 Handle object selection
6.3.1 Allow users to tap on a detected object to select
Now you'll add code to allow users to select an object from the image and start the product
search. The starter app already has the capability to detect objects in the image. It's possible that there
are multiple objects in the image, or the detected object only occupies a small portion of the image.
Therefore, you need to have the user tap on one of the detected objects to indicate which object they
want to use for product search.
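The selection step amounts to a hit test followed by a crop. The sketch below illustrates that idea in plain Python; it is not the code of the starter app's custom view. The function names, the (left, top, right, bottom) box layout, and the tiny nested-list "image" are all assumptions made for the example.

```python
def find_tapped_object(tap_x, tap_y, boxes):
    """Return the index of the first box that contains the tap, or None."""
    for index, (left, top, right, bottom) in enumerate(boxes):
        if left <= tap_x < right and top <= tap_y < bottom:
            return index
    return None


def crop(image, box):
    """Cut the region described by `box` out of a nested-list image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


# Hypothetical 4-row by 6-column image; each value encodes its own row and column.
image = [[10 * r + c for c in range(6)] for r in range(4)]

# Hypothetical bounding boxes of two detected objects.
boxes = [(0, 0, 2, 2), (3, 1, 6, 4)]

selected = find_tapped_object(4, 2, boxes)   # the user taps at x=4, y=2
cropped = crop(image, boxes[selected])       # only this region is sent to the search
print(selected, cropped)
```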
The view that displays the image in the main activity (ObjectDetectorActivity) is actually a
custom view (ImageClickableView) that extends Android OS's default ImageView. It implements some
convenient utility methods, including:
● fun setOnObjectClickListener(listener: ((objectImage: Bitmap) -> Unit)) This is a callback to
receive the cropped image that contains only the object that the user has tapped on. You will
send this cropped image to the product search backend.
fig 6.1 interface of the product image search app
The onObjectClickListener is called whenever the user taps on any of the detected objects on the screen.
It receives the cropped image that contains only the selected object.
The code snippet does 3 things:
● Takes the cropped image and serializes it to a PNG file.
● Starts the ProductSearchActivity to execute the product search sequence.
● Includes the cropped image URI in the start-activity intent so that ProductSearchActivity can
retrieve it later to use as the query image.
There are a few things to keep in mind:
● The logic for detecting objects and querying the backend has been split into 2 activities only to
make the codelab easier to understand. It's up to you to decide how to implement them in your
app.
● You need to write the query image into a file and pass the image URI between activities because
the query image can be larger than the 1MB size limit of an Android intent.
● You can store the query image in PNG because it's a lossless format.
6.3.2 Explore the product search backend
Build the product image search backend
This codelab requires a product search backend built with Vision API Product Search. There are
two options to achieve this:
Option 1: Use the demo backend that has been deployed for you
Option 2: Create your own backend by following the Vision API Product Search quickstart
You will come across these concepts when interacting with the product search backend:
● Product Set: A product set is a simple container for a group of products. A product catalog can
be represented as a product set and its products.
● Product: After you have created a product set, you can create products and add them to the
product set.
● Product's Reference Images: They are images containing various views of your products.
Reference images are used to search for visually similar products.
● Search for products: Once you have created your product set and the product set has been
indexed, you can query the product set using the Cloud Vision API.
6.3.3 Understand the preset product catalog
The product search demo backend used in this codelab was created using the Vision API Product
Search and a product catalog of about a hundred shoes and dress images. Here are some images from the
catalog:

fig 6.2 products images from the catalog

6.3.4 Call the product search demo backend


You can call the Vision API Product Search directly from a mobile app by setting up a Google
Cloud API key and restricting access to the API key to just your app.
To keep this codelab simple, a proxy endpoint has been set up that allows you to access the demo
backend without worrying about the API key and authentication. It receives the HTTP request from the
mobile app, appends the API key, and forwards the request to the Vision API Product Search backend.
Then the proxy receives the response from the backend and returns it to the mobile app.
6.4 Implement the API client
6.4.1 Understand the product search workflow
Follow this workflow to conduct product search with the backend:
● Encode the query image as a base64 string
● Call the projects.locations.images.annotate endpoint with the query image
● Receive the product image IDs from the previous API call and send them to the
projects.locations.products.referenceImages.get endpoints to get the URIs of the product images
in the search result.
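The first two steps of this workflow can be sketched as follows. This is a hedged illustration in Python rather than the Android client: the function name is hypothetical, the product set path and category are placeholder values, and the JSON field names follow the Vision API request format as best understood here, so they should be checked against the official documentation before use.

```python
import base64
import json


def build_product_search_request(image_bytes, product_set, category, max_results=5):
    """Encode the query image as base64 and wrap it in an annotate request body."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "requests": [
            {
                "image": {"content": encoded},
                "features": [{"type": "PRODUCT_SEARCH", "maxResults": max_results}],
                "imageContext": {
                    "productSearchParams": {
                        "productSet": product_set,
                        "productCategories": [category],
                    }
                },
            }
        ]
    }


# Placeholder inputs; a real client would read the cropped query image from disk.
request_body = build_product_search_request(
    image_bytes=b"fake image bytes",
    product_set="projects/my-project/locations/my-location/productSets/my-product-set",
    category="apparel",
)
print(json.dumps(request_body, indent=2))
```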

6.5 Implement the API client class


Now you'll implement code to call the product search backend in a dedicated class called
ProductSearchAPIClient. Some boilerplate code has been implemented for you in the starter app:
● class ProductSearchAPIClient: This class is mostly empty now but it has some methods that you
will implement later in this codelab.
● fun convertBitmapToBase64(bitmap: Bitmap): Convert a Bitmap instance into its base64
representation to send to the product search backend
● fun annotateImage(image: Bitmap): Task<List<ProductSearchResult>>: Call the
projects.locations.images.annotate API and parse the response.
● fun fetchReferenceImage(searchResult: ProductSearchResult): Task<ProductSearchResult>:
Call the projects.locations.products.referenceImages.get API and parse the response.
● SearchResult.kt: This file contains several data classes to represent the types returned by the
Vision API Product Search backend.
6.6 Explore the API request and response format
You can find similar products to a given image by passing the image's Google Cloud Storage URI,
web URL, or base64 encoded string to Vision API Product Search.
Here are some important fields in the product search result object:
● product.name: The unique identifier of a product in the format of projects/{project-
id}/locations/{location-id}/products/{product_id}
● product.score: A value indicating how similar the search result is to the query image. Higher
values mean more similarity.
● product.image: The unique identifier of the reference image of a product in the format of
projects/{project-id}/locations/{location-id}/products/{product_id}/referenceImages/{image_id}.
You will need to send another API request to projects.locations.products.referenceImages.get to
get the URL of this reference image so that it will display on the screen.
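To show how these fields might be used, the sketch below ranks a few search results by score and splits a reference image name into its IDs. The flat dictionaries are a simplification of the real response, and every project, location, product, and image ID in them is invented for the example.

```python
def parse_resource_name(name):
    """Split 'projects/p/locations/l/...' into a {collection: id} mapping."""
    parts = name.split("/")
    return dict(zip(parts[0::2], parts[1::2]))


# Hypothetical, simplified search results.
results = [
    {
        "name": "projects/demo/locations/us-east1/products/shoe-1",
        "score": 0.42,
        "image": "projects/demo/locations/us-east1/products/shoe-1/referenceImages/img-1",
    },
    {
        "name": "projects/demo/locations/us-east1/products/dress-7",
        "score": 0.91,
        "image": "projects/demo/locations/us-east1/products/dress-7/referenceImages/img-3",
    },
]

# Higher scores mean more similarity, so sort in descending order.
ranked = sorted(results, key=lambda result: result["score"], reverse=True)
best_ids = parse_resource_name(ranked[0]["image"])
print(best_ids)
```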
6.7 Get the product reference images
Explore the API request and response format
You'll send a GET HTTP request with an empty request body to the
projects.locations.products.referenceImages.get endpoint to get the URIs of the product images returned
by the product search endpoint.
The reference images of the demo product search backend were set up to have public-read
permission. Therefore, you can easily convert the GCS URI to an HTTP URL and display it on the app
UI. You only need to replace the gs:// prefix with https://storage.googleapis.com/.
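The prefix replacement described above is a one-line string operation. A minimal Python sketch, with a hypothetical function name and a made-up bucket and object path:

```python
def gcs_uri_to_http_url(gcs_uri):
    """Turn gs://bucket/object into https://storage.googleapis.com/bucket/object."""
    prefix = "gs://"
    if not gcs_uri.startswith(prefix):
        raise ValueError(f"Not a GCS URI: {gcs_uri}")
    return "https://storage.googleapis.com/" + gcs_uri[len(prefix):]


# Made-up URI for illustration.
print(gcs_uri_to_http_url("gs://demo-bucket/images/shoe-1.jpg"))
```

This only works for objects that are publicly readable, as is the case for the demo backend; private objects would need authenticated access instead.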
6.8 Implement the API call
Next, craft a reference image API request and send it to the backend. You'll use Volley and Task
API similarly to the product search API call.
6.9 Connect the two API requests
Go back to annotateImage and modify it to get all the reference images' HTTP URLs before
returning the ProductSearchResult list to its caller.
Once the app loads, tap any preset image, select a detected object, and tap the Search button to see
the search results, this time with the product images.

fig 6.3 interface of product image search app after connecting the two APIs
7. Go further with image classification
In the previous codelab you created an app for Android and iOS that used a basic image labeling
model that recognizes several hundred classes of image. It recognized a picture of a flower very
generically – seeing petals, flower, plant, and sky.
To update the app to recognize specific flowers, daisies or roses for example, you'll need a
custom model that's trained on lots of examples of each type of flower you want to recognize.
This codelab will not go into the specifics of how a model is built. Instead, you'll learn about the
APIs from TensorFlow Lite Model Maker that make it easy.
7.1 Install and import dependencies
Install TensorFlow Lite Model Maker. You can do this with a pip install. The &> /dev/null at the
end just suppresses the output. Model Maker outputs a lot of stuff that isn't immediately relevant. It's
been suppressed so you can focus on the task at hand.
7.2 Download and Prepare your Data
If your images are organized into folders, and those folders are zipped up, then if you download
the zip and decompress it, you'll automatically get your images labeled based on the folder they're in.
This directory will be referenced as data_path.
This data path can then be loaded into a neural network model for training with TensorFlow Lite
Model Maker's ImageClassifierDataLoader class. Just point it at the folder and you're good to go.
One important element in training models with machine learning is to not use all of your data for
training. Hold back a little to test the model with data it hasn't previously seen. This is easy to do with
the split method of the dataset that comes back from ImageClassifierDataLoader.
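To make the two ideas above concrete (labels taken from folder names, and a held-back test set), here is a
small standard-library Python sketch. It only illustrates the concept and is not how Model Maker implements
its loader; the function names, the 90/10 default, and the fixed shuffle seed are assumptions chosen for
the example.

```python
import random
from pathlib import Path


def load_labeled_images(data_path):
    """Return (image_path, label) pairs; the label is the parent folder name."""
    samples = []
    for class_dir in sorted(p for p in Path(data_path).iterdir() if p.is_dir()):
        for image_file in sorted(class_dir.iterdir()):
            if image_file.is_file():
                samples.append((image_file, class_dir.name))
    return samples


def split_samples(samples, train_fraction=0.9, seed=0):
    """Shuffle, then hold back the last part as unseen test data."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Shuffling before the cut matters here: the samples are listed folder by folder, so an unshuffled split
would leave whole classes out of the training set.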
7.3 Create the Image Classifier Model
Model Maker abstracts a lot of the specifics of designing the neural network so you don't have to
deal with network design, and things like convolutions, dense, relu, flatten, loss functions and optimizers.
The model went through 5 epochs – where an epoch is a full cycle of training where the neural network
tries to match the images to their labels. By the time it went through 5 epochs, in around 1 minute, it was
93.85% accurate on the training data. Given that there are 5 classes, a random guess would
be right only 1 time in 5, or 20% accurate, so that's progress!
7.4 Export the Model
Now that the model is trained, the next step is to export it in the .tflite format that a mobile
application can use. Model Maker provides an easy export method that you can use; simply specify
the directory to output to.
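Put together, the load, split, train, evaluate, and export steps of this section might look like the
Python sketch below. It assumes the tflite-model-maker package is installed; the import location of the
data loader has differed between releases, so both are tried. The function name train_and_export and its
default values are hypothetical and are not taken from the codelab.

```python
def train_and_export(data_path, export_dir, train_fraction=0.9, epochs=5):
    """Train an image classifier on folder-labeled images and export it."""
    # Imported lazily so the function can be defined without the package.
    from tflite_model_maker import image_classifier
    try:
        # Name used in this report (older releases; an assumption).
        from tflite_model_maker import ImageClassifierDataLoader as DataLoader
    except ImportError:
        # Location in later releases (also an assumption).
        from tflite_model_maker.image_classifier import DataLoader

    # Labels come from the sub-folder names under data_path.
    data = DataLoader.from_folder(data_path)
    # Hold back part of the data so the model is tested on unseen images.
    train_data, test_data = data.split(train_fraction)
    model = image_classifier.create(train_data, epochs=epochs)
    loss, accuracy = model.evaluate(test_data)
    # Writes the .tflite model file into export_dir.
    model.export(export_dir=export_dir)
    return loss, accuracy
```

The accuracy returned here is measured on the held-back test data, so it can differ from the training
accuracy reported during the epochs.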
For the rest of this lab, I'll be running the app in the iPhone simulator, which should support the
build targets from the codelab. If you want to use your own device, you might need to change the build
target in your project settings to match your iOS version.
Run it and you'll see something like this:
Note the very generic classifications: petal, flower, sky. The model you created in the previous
codelab was trained to detect 5 varieties of flower, including this one, a daisy. For the rest of this
codelab, you'll look at what it will take to upgrade your app with the custom model.

fig 7.1 classification of an image

7.5 Update your Code for the Custom Model

1. Open your ViewController.swift file. You may see an error on the 'import MLKitImageLabeling' line
at the top of the file. This is because you removed the generic image labeling libraries when you updated
your pod file. Delete that line and use the following imports instead:
import MLKitVision
import MLKit
import MLKitImageLabelingCommon
import MLKitImageLabelingCustom

It is easy to skim the last two imports and think they repeat the same line, but one ends in "Common" and
the other in "Custom".

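2. Next you'll load the custom model that you added in the previous step. Find the getLabels() func.
Beneath the line that reads visionImage.orientation = image.imageOrientation, add the code that loads
the bundled .tflite file into a local model object named localModel, which the next step passes to the
labeler options.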

3. Find the code for specifying the options for the generic ImageLabeler. It's probably giving you
an error since those libraries were removed:
let options = ImageLabelerOptions()
Replace that with this code, which uses CustomImageLabelerOptions and specifies the local model:
let options = CustomImageLabelerOptions(localModel: localModel)
...and that's it! Try running your app now! When you try to classify the image it should be more accurate
– and tell you that you're looking at a daisy with high probability!

fig 7.2 showing accuracy of the classification of an image


CONCLUSION

In conclusion, I am proud to have successfully completed the virtual internship as a Google
AI/ML Intern, which has been an incredibly enriching experience in my professional development.
Throughout this internship, I engaged with a series of comprehensive courses and hands-on projects that
provided me with a strong foundation in artificial intelligence and machine learning.

Starting with the AI Foundations course, I gained essential knowledge about the fundamental
concepts of AI and ML. This course covered critical topics such as supervised and unsupervised
learning, neural networks, and deep learning algorithms. Understanding these key concepts has been
instrumental in shaping my perspective on the growing impact of AI and its applications in various
industries.

Building on this foundation, I progressed to the Applied Machine Learning course, which offered
deeper insights into deploying machine learning models in real-world scenarios. This course provided
practical exposure to training models, fine-tuning hyperparameters, and evaluating model performance
using Google’s AI tools and frameworks. The hands-on experience from this course equipped me with
the ability to build and optimize models to solve real-world problems effectively.

Finally, I completed a capstone project focused on using Google Cloud AI and ML tools, where I
implemented a machine learning solution to a real business problem. This project allowed me to apply
everything I learned, from data preprocessing and model building to deployment. This practical
experience has prepared me for real-world challenges where I can apply AI and ML to drive meaningful
results.

Overall, this virtual internship has not only expanded my technical knowledge but also solidified
my passion for pursuing a career in artificial intelligence and machine learning. The combination of
theoretical learning and practical application through these courses has significantly enriched my
understanding of AI and ML. I am excited to leverage this knowledge as I continue to explore the field
and contribute to innovative solutions in the AI/ML domain.
