CG - REPORT Nithin
CHAPTER 1
INTRODUCTION
This summer, I had the opportunity to work alongside a very talented group of individuals at
Invisible AI. In this blog post, I will be discussing my project focused on barcode detection using
deep learning. This is the first of a series of technical posts scheduled for this month so stay tuned :)
Barcodes are used in various commercial applications including manufacturing, healthcare, and
advertising to embed useful information for product identification, expiration date, batch number,
and more. The general appearance of a 1D barcode is long black stripes while QR codes are 2D
with black rectangles. Given the ubiquity of barcodes in our modern world, several robust and
efficient techniques are used to read them, including laser scanners and camera readers.
However, there are a few limitations preventing them from being used for large scale applications.
Laser scanners, for example, can only read 1D barcodes. They are also unable to read barcodes
from screens since they rely on the reflection of light. These drawbacks motivate the need to
explore methods of barcode detection which are more reliable and versatile.
Relevant Work
Early barcode detection methods relied primarily on traditional signal processing techniques such
as corner detection[1], gradient methods[2], morphological operations[3], and so on. These
methods were typically assessed against two standard barcode datasets, namely the Muenster
Barcode Dataset (WWU) and the Artelab 1D Medium barcode database[4]. These datasets contain
a large collection of annotated 1D and 2D barcodes.
More recently, with the growing success of artificial neural networks and deep learning, researchers
have been quick to apply such methods to the domain of barcode detection. In [5], the authors used the
YOLO (You Only Look Once) detector first to detect bounding boxes of barcodes and then fed the
detections as inputs to another neural network to predict the orientation. After correcting the
orientation of the detected barcode, they fed the result into an open-sourced barcode reader to read
the contents. Using ANN approaches, researchers were able to achieve state-of-the-art performance
and establish new baselines on the WWU and Artelab datasets.
Unlike two-stage detectors that first generate region proposals, SSD (Single Shot MultiBox Detector)
predicts bounding boxes and classes directly from feature maps in a single pass.
To compensate for the lowered accuracy, SSD adopts strategies like using small convolution filters
to predict object classes and predicting the offsets to the predefined bounding boxes. Even without
the region proposal network, SSD is able to run inference in real time while still achieving
satisfactory performance. We at Invisible AI are developing AI-enabled cameras for real-time
monitoring applications, which necessitates the use of efficient algorithms. Real-time object
detection is important since we want to deploy these models onto our production-ready cameras.
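The offset prediction mentioned above can be illustrated with a short sketch. It assumes the center-size parameterization described in the SSD paper, where each prediction is four offsets relative to a default box given as (cx, cy, w, h). The function name decode_box and the sample numbers are hypothetical, and real implementations often also scale the offsets by fixed variance terms, which are omitted here.

```python
import math


def decode_box(default_box, offsets):
    """Apply predicted offsets to one default box.

    Both arguments are 4-tuples: the box is (cx, cy, w, h) and the
    offsets are (t_x, t_y, t_w, t_h). This layout is an assumption.
    """
    d_cx, d_cy, d_w, d_h = default_box
    t_x, t_y, t_w, t_h = offsets
    # Center offsets are expressed as fractions of the default box size.
    cx = d_cx + t_x * d_w
    cy = d_cy + t_y * d_h
    # Size offsets are log-scale factors, so exp() keeps width and height positive.
    w = d_w * math.exp(t_w)
    h = d_h * math.exp(t_h)
    return (cx, cy, w, h)


# Zero offsets return the default box unchanged.
unchanged = decode_box((0.5, 0.5, 0.2, 0.1), (0.0, 0.0, 0.0, 0.0))
# A positive t_x moves the box center to the right by half a box width.
shifted = decode_box((0.5, 0.5, 0.2, 0.1), (0.5, 0.0, 0.0, 0.0))
```

Because the network only has to predict small corrections to boxes that already have plausible shapes, a single forward pass can cover many candidate locations at once.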
1.1 OBJECTIVES
1. Efficiency
Since scanning a barcode automatically enters a large amount of data into a system, barcodes are
incredibly valuable for streamlining recordkeeping and improving efficiency. Modern supply chain
and inventory management simply would not be possible without the use of barcodes. Rather than
manually entering inventory and shipment data for every item into a system, employees can simply
scan entire pallets, crates, and even shipping containers to know instantly what they
contain. Given the sheer scale of products moving through a supply chain, barcodes allow
companies to automate a key process to save time and money even as they scale operations.
Barcode scanners can also streamline onboarding and training since it takes much less time to teach
someone to use a scanner than to manually enter data.
2. Error Reduction
Manual data entry is notorious for its high levels of human error. According to research conducted
over several decades, even workplaces with the best performance measures in place see human
error rates of five to ten failures in every hundred opportunities. That’s a lot of opportunities for
things to go wrong, whether it takes the form of inverted characters, skipped lines, misreadings,
illegible markings, or faulty keystrokes. Even worse, once an error occurs, it will often be
reproduced across a system, making it very difficult to locate and remediate the original
mistake. According to one estimate, errors resulting in bad data cost businesses more than $600
billion each year. Scanning a barcode, by contrast, completely automates the data entry process and
significantly reduces the risks associated with manual errors. Information encoded into a barcode
will be reproduced accurately each and every time the image is scanned to ensure consistency
across systems.
3. Tracking
Each time a barcode is scanned, it creates another step in a data trail that can be easily referenced to
locate items and events. This allows businesses to greatly improve real-time visibility into their
operations. From identifying a shipment’s most recent location to determining whether or not a
patient picked up their prescription from a pharmacy, barcodes help organizations and customers
alike to track down information quickly and accurately. By improving visibility throughout their
systems, companies can deliver a better customer experience that builds trust and prioritizes
transparency. Since barcodes are easy to create and print, they can be added to almost any type of
business process to streamline productivity and track essential activities.
4. Data Collection
Today’s organizations rely heavily upon data analytics to formulate their business strategy and
make key decisions. The more data they have available to them, the more nuanced and accurate
their analysis will be. Barcodes play a critical role in data collection strategies. Not only are they
used to gather information about inventory, supply chain, and sales activity, but the latest
generation of QR codes (a common form of 2D barcode) is also being deployed to learn more
Dept of CSE, SEACET 2023-24 Page 3
Computer Graphics and Image Processing
about customer behavior and preferences. Thanks to real-time QR code tracking, companies can
see how many times the barcode is scanned, where it was scanned, and what devices were used to
scan it. Gathering more extensive barcode data provides a more detailed picture of what’s actually
happening “on the ground” throughout an organization and in the market. By eliminating
conjecture and guesswork, businesses can make much more informed decisions that will help them
to sustainably scale operations and capitalize on opportunities.
1.2 LIMITATION
Cellular barcodes are designed in an equally sized manner, which should allow for simultaneous
amplification and an unbiased quantification of every single barcode. Although the approach by
Schepers et al. was based on a specific microarray platform, subsequent application of a similar
barcode construct to hematopoietic stem cells showed the quantitative potential of this marking
strategy. In particular, Gerrits et al. demonstrated that subcloning strategies of the barcode
sequences enabled clonal analysis with unmatched sensitivity and precision. Given an appropriate
sequencing depth, the barcode design should, in principle, facilitate a sufficiently high resolution
for the detection of very few or even single cells within a marked cell population. As a logical next
step, several other groups, including ourselves, established protocols to combine viral barcoding
with next-generation sequencing (NGS) to mark and track cells not only in the field of
hematopoiesis, but also in several other settings. Despite their enormous potential for clonal
analysis, only a few reports have addressed the accuracy of the quantitative results of experiments based
on cellular barcodes. In particular, it is still uncertain to what extent NGS barcode read
counts faithfully reproduce clonal abundances and how barcode isolation and sequencing mask a
unique relationship. Understanding the methodological constraints of the cellular
barcoding technology is therefore of utmost importance for the quantitative interpretation of
experimental results.
1.3 SCOPE
The scope of barcode detection is quite broad and spans various industries and applications. Here
are some key areas where barcode detection is commonly used:
1. Retail and Inventory Management: Barcodes are essential for tracking products, managing
inventory, and speeding up the checkout process in retail stores.
2. Logistics and Supply Chain: Barcodes help in tracking shipments, managing warehouse
inventory, and ensuring accurate delivery of goods.
3. Healthcare: Barcodes are used for patient identification, tracking medical supplies, and
managing medication administration.
4. Manufacturing: Barcodes assist in tracking parts and products through the manufacturing
process, ensuring quality control and efficient production.
5. Library Systems: Barcodes are used to manage book inventories, track checkouts, and
streamline the return process.
6. Event Management: Barcodes on tickets help in managing entry to events, tracking
attendance, and ensuring security.
7. Document Management: Barcodes are used to organize and track documents, making it
easier to retrieve and manage records.
8. Payment Systems: QR codes, a type of barcode, are widely used for mobile payments and
online transactions.
CHAPTER 2
LITERATURE REVIEW
IMAGE PROCESSING
Image processing is a method used to perform operations on an image to enhance it or extract
useful information. It is a type of signal processing where the input is an image, such as a
photograph or video frame, and the output can be either an image or a set of characteristics or
parameters related to the image.
Types of Image Processing
1. Analog Image Processing: This involves processing images in their analog form. It is used
for hard copies like printouts and photographs.
2. Digital Image Processing: This involves manipulating digital images using computers. It has
a wide range of applications, from medical imaging to remote sensing.
Key Steps in Image Processing
1. Image Acquisition: This is the first step, where an image is captured by a sensor (such as a
camera) and converted into a manageable entity.
3. Image Restoration: This aims to reconstruct or recover an image that has been degraded by
using a priori knowledge of the degradation phenomenon.
4. Image Compression: This reduces the amount of data required to represent an image. It is
essential for efficient storage and transmission.
5. Image Segmentation: This involves partitioning an image into its constituent parts or
objects. It is a crucial step in image analysis and pattern recognition.
6. Image Representation and Description: This phase involves converting the segmented
image into a form suitable for computer processing, such as boundary representation or
region representation.
7. Image Recognition: This involves assigning a label to an object based on its descriptors.
Image Processing Techniques
2. Image Restoration: Techniques like inverse filtering, Wiener filtering, and blind
deconvolution are used to restore images degraded by factors like motion blur or noise.
3. Image Compression: Techniques like JPEG, PNG, and GIF are used to reduce the size of
image files.
5. Morphological Processing: Techniques like dilation, erosion, opening, and closing are used
to process geometrical structures in an image.
6. Color Image Processing: Techniques like color space conversion and color enhancement are
used to process color images.
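The morphological operations named in the list above can be sketched in plain Python. This is a minimal illustration, assuming a binary image stored as a list of rows of 0s and 1s, a fixed 3x3 neighbourhood, and windows that are clipped at the image border. The function names are hypothetical, and a library such as OpenCV would normally be used instead.

```python
def _window(img, y, x):
    """Yield the pixels of the 3x3 neighbourhood around (y, x), clipped at the border."""
    h, w = len(img), len(img[0])
    for ny in range(max(0, y - 1), min(h, y + 2)):
        for nx in range(max(0, x - 1), min(w, x + 2)):
            yield img[ny][nx]


def dilate(img):
    """A pixel becomes 1 if any pixel in its neighbourhood is 1."""
    return [[int(any(_window(img, y, x))) for x in range(len(img[0]))]
            for y in range(len(img))]


def erode(img):
    """A pixel stays 1 only if every pixel in its neighbourhood is 1."""
    return [[int(all(_window(img, y, x))) for x in range(len(img[0]))]
            for y in range(len(img))]


def opening(img):
    """Erosion followed by dilation: removes specks smaller than the window."""
    return dilate(erode(img))


def closing(img):
    """Dilation followed by erosion: fills small holes and gaps."""
    return erode(dilate(img))


# A 3x3 block of foreground pixels plus one isolated noise pixel on the right.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 1],
    [0, 0, 0, 0, 0, 0],
]
cleaned = opening(image)
```

In this example the opening removes the isolated pixel while the 3x3 block is restored by the dilation step, which is why opening is commonly used for noise removal before further analysis.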
Applications of Image Processing
1. Medical Imaging: Used in techniques like X-ray, MRI, and CT scans to enhance and
analyze medical images.
2. Remote Sensing: Used in satellite and aerial image analysis for environmental monitoring,
weather forecasting, and land-use mapping.
3. Industrial Inspection: Used for quality control and inspection in manufacturing processes.
4. Robotics: Used in vision systems for navigation, object recognition, and manipulation.
5. Security and Surveillance: Used in facial recognition, fingerprint recognition, and video
surveillance systems.
6. Entertainment: Used in image editing, animation, and special effects in movies and video
games.
CHAPTER 3
METHODOLOGY
Python is a dynamic, object-oriented programming language used to develop many kinds of software.
It is developed under the Python Software Foundation. A significant advantage is that it integrates
easily with other programming languages and software development tools. In addition, its extensive
standard library reduces the amount of code a developer has to write and helps keep source code
clean.
Python's design emphasizes readable source code through clear syntax. Besides the object-oriented
paradigm, Python supports other programming styles such as functional and imperative
programming. It also has a dynamic type system and automatic memory management, which make it
suitable as a general software development tool, and it is well suited to scripting. Python's
development model is community based, and its reference implementation is free and open source.
Interpreters are available for many platforms, so most Python programs run unchanged across
operating systems.
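The language features described above can be seen in a few lines. The following is an illustrative sketch only; the function and class names are hypothetical and are not part of the project code.

```python
from functools import reduce

# Dynamic typing: a name is not tied to one type and can be rebound.
value = 42
value = "forty-two"


def total_imperative(numbers):
    """Imperative style: update a running total step by step."""
    total = 0
    for n in numbers:
        total += n
    return total


def total_functional(numbers):
    """Functional style: fold the sequence with a pure function."""
    return reduce(lambda acc, n: acc + n, numbers, 0)


class Accumulator:
    """Object-oriented style: state and behaviour live in one object."""

    def __init__(self):
        self.total = 0

    def add(self, n):
        self.total += n
        return self


acc = Accumulator()
for n in [1, 2, 3]:
    acc.add(n)
```

All three approaches compute the same sum, which shows how Python lets the programmer pick the style that best fits the task.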
3.1 OPENCV
OpenCV, short for Open Source Computer Vision Library, is a powerful and widely used
open-source computer vision and machine learning software library. Originally developed by Intel, it is
now maintained by the OpenCV Foundation and a community of developers. OpenCV is designed
to provide a common infrastructure for computer vision applications and to accelerate the use of
machine perception in commercial products. OpenCV was initially released in 2000 and has since
grown to become one of the most popular libraries for computer vision. It is written in optimized C
and C++ and has interfaces for Python, Java, and MATLAB/Octave. The library is cross-platform,
supporting Windows, Linux, macOS, iOS, and Android.
Core Functionalities
OpenCV offers a wide range of functionalities, including:
1. Image Processing: OpenCV provides tools for image filtering, transformation, and
enhancement. This includes operations like blurring, sharpening, edge detection, and
geometric transformations (e.g., scaling, rotation).
2. Video Analysis: The library supports video capture and analysis, including background
subtraction, object tracking, and motion analysis.
3. Feature Detection and Matching: OpenCV includes algorithms for detecting and matching
features in images, such as corners, edges, and blobs. This is useful for tasks like image
stitching and object recognition.
4. Object Detection: OpenCV provides pre-trained models and tools for detecting objects in
images and videos. This includes face detection, pedestrian detection, and more advanced
techniques like deep learning-based object detection.
5. Machine Learning: The library includes a module for machine learning, offering tools for
training and using models for classification, regression, and clustering.
6. 3D Reconstruction: OpenCV supports 3D reconstruction from multiple images, which is
useful for applications like augmented reality and 3D modeling.
3.2 NUMPY
NumPy (Numerical Python) is a powerful open-source library used for numerical and scientific
computing in Python. It provides support for large, multi-dimensional arrays and matrices, along
with a collection of mathematical functions to operate on these arrays efficiently.
Key Features of NumPy
1. N-Dimensional Array Object (ndarray):
o The core of NumPy is the ndarray, a fast and space-efficient multidimensional array
providing vectorized operations and complex broadcasting capabilities. Unlike
Python lists, NumPy arrays are homogeneous, meaning all elements are of the same data type.
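A short sketch of these ndarray properties follows. The variable names and values are illustrative assumptions, chosen only to show a single element type per array, vectorized arithmetic, and broadcasting.

```python
import numpy as np

# Every element of an ndarray shares one dtype.
codes = np.array([1, 2, 3])        # an integer dtype is inferred
mixed = np.array([1, 2.5, 3])      # integers are upcast so all elements are floats

# Vectorized operation: the multiplication is applied to every element at once.
scaled = codes * 2

# Broadcasting: the 1-D row is stretched across both rows of the 2-D grid.
grid = np.arange(6).reshape(2, 3)
shifted = grid + np.array([10, 20, 30])
```

Here grid has shape (2, 3) and the added row has shape (3,), so NumPy repeats the row for each of the two rows of grid without copying it explicitly.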
3.3 PYZBAR
Pyzbar is a Python library that facilitates the reading and decoding of one-dimensional barcodes
and QR codes. It leverages the capabilities of the ZBar library, making it a powerful tool for
barcode detection in Python. Pyzbar is compatible with both Python 2 and 3, and it works
seamlessly with various image processing libraries such as PIL/Pillow, OpenCV, and imageio. This
versatility allows developers to decode barcodes from different image formats and sources,
including raw bytes. One of the key advantages of pyzbar is its simplicity and ease of use. The
library provides a straightforward decode function that can process images and return detailed
information about the detected barcodes, including their data, type, and location within the image.
This makes it an excellent choice for applications that require barcode scanning capabilities, such
as inventory management systems, point-of-sale systems, and mobile apps. To use pyzbar, you
need to install the library along with its dependencies. On Windows, the ZBar DLLs are included
with the Python wheels, while on other operating systems, you may need to install the ZBar shared
library separately. Once installed, you can start decoding barcodes by importing
the decode function from the pyzbar.pyzbar module and passing an image to it. The function can
handle images loaded with PIL, OpenCV, or even raw pixel data. For example, you can load an
image using PIL and decode it with pyzbar to extract barcode information. The library supports a
wide range of barcode types, including CODE128, EAN13, and QR codes, making it a versatile
tool for various barcode scanning needs. Additionally, pyzbar can handle images in different
formats, such as JPEG, PNG, and BMP, further enhancing its utility. The library’s ability to decode
barcodes from numpy arrays also makes it suitable for integration with computer vision
applications that use OpenCV. Pyzbar generally performs well and can often detect and
decode barcodes even in challenging conditions, such as low-resolution images or images with
noise, although results depend on image quality.
This reliability is crucial for applications where accurate barcode detection is essential.
Furthermore, pyzbar’s open-source nature allows developers to contribute to its development and
improvement, ensuring that it remains up-to-date with the latest advancements in barcode
technology. In summary, pyzbar is a highly effective and user-friendly library for barcode detection
in Python. Its compatibility with various image processing libraries, support for multiple barcode
types, and ease of use make it an ideal choice for developers looking to integrate barcode scanning
capabilities into their applications.
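The usage described in this section can be sketched as follows. The helper names describe and decode_file and the file name barcode.png are hypothetical. The sketch assumes that Pillow and pyzbar are installed and that each result exposes data, type, and rect (with left, top, width, and height), as pyzbar's decode function returns.

```python
def describe(results):
    """Convert pyzbar decode() results into plain dictionaries."""
    summary = []
    for result in results:
        summary.append({
            "text": result.data.decode("utf-8"),  # payload arrives as bytes
            "type": result.type,                  # for example 'QRCODE' or 'EAN13'
            "left": result.rect.left,
            "top": result.rect.top,
            "width": result.rect.width,
            "height": result.rect.height,
        })
    return summary


def decode_file(path):
    """Decode every barcode found in the image file at the given path."""
    # Imported here so that describe() stays usable without these packages.
    from PIL import Image
    from pyzbar.pyzbar import decode

    return describe(decode(Image.open(path)))


if __name__ == "__main__":
    # 'barcode.png' is a placeholder; replace it with a real image file.
    for item in decode_file("barcode.png"):
        print(item)
```

Keeping the formatting step separate from the decoding step makes it easy to reuse the same summary code for frames captured from a webcam.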
3.4 Decode
Python's bytes.decode() method converts a bytes object into a string using the codec registered for
the given encoding, so that the original text of an encoded byte sequence can be recovered. The
method takes two optional parameters, encoding and errors. Python provides many standard text
encodings, such as ascii, gbk, hz, iso2022_kr, utf_32, utf_16, and utf_8 (the default). The bytes are
decoded according to the specified encoding, and the errors parameter selects the error-handling
scheme (for example strict, ignore, or replace) that is applied when a byte sequence is invalid. The
method returns the decoded string. In this project, decode('utf-8') converts the raw bytes returned by
pyzbar into readable text.
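The effect of the encoding and errors parameters can be seen in a small example. The sample text is an arbitrary choice made for illustration.

```python
# Encode a string that contains one non-ASCII character.
raw = "café".encode("utf-8")

# Decoding with the matching codec recovers the original text.
text = raw.decode("utf-8")

# With the default errors='strict', an unsuitable codec raises an exception.
try:
    raw.decode("ascii")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# 'replace' substitutes U+FFFD for each undecodable byte.
replaced = raw.decode("ascii", errors="replace")

# 'ignore' silently drops the undecodable bytes.
ignored = raw.decode("ascii", errors="ignore")
```

For barcode data, decoding with errors='replace' can be a reasonable fallback when a payload is not valid UTF-8, since it avoids stopping the program on a single bad scan.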
3.6 VS CODE
Visual Studio Code (VS Code) is a popular open-source code editor developed by Microsoft. It's
known for its versatility, extensive feature set, and ease of use. Here are some key features and
aspects of VS Code:
Key Features:
1. Extensibility:
o Extensions: VS Code has a vast marketplace with thousands of extensions that add
functionality such as language support, themes, linters, debuggers, and more.
o Customizable: Users can customize the editor through settings, keybindings, themes,
and snippets.
2. Intelligent Code Editing:
o IntelliSense: Provides intelligent code completions based on variable types, function
definitions, and imported modules.
o Syntax Highlighting: Supports a wide range of programming languages and provides
syntax highlighting to make code easier to read.
3. Built-in Git Integration:
o VS Code has built-in Git support, allowing users to perform version control
operations like commit, push, pull, and merge without leaving the editor.
4. Debugging:
o The editor includes a powerful debugger with breakpoints, call stacks, and an
interactive console for multiple programming languages.
5. Integrated Terminal:
o VS Code has an integrated terminal that allows users to run shell commands and
scripts directly within the editor.
6. Multi-Language Support:
o VS Code supports a wide range of programming languages out of the box and
through extensions, making it suitable for various types of development.
CHAPTER 4
IMPLEMENTATION
Program
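import cv2
import numpy as np
from pyzbar.pyzbar import decode

# Open the default webcam and request a 640x480 frame size.
cap = cv2.VideoCapture(0)
cap.set(3, 640)  # property 3: frame width
cap.set(4, 480)  # property 4: frame height

# Optional: load the list of authorized codes, one per line.
# with open('myDataFile.text') as f:
#     myDataList = f.read().splitlines()
# print(myDataList)

while True:
    success, img = cap.read()
    if not success:
        break  # stop when no frame could be read
    for barcode in decode(img):
        # The barcode payload arrives as bytes; convert it to text.
        myData = barcode.data.decode('utf-8')
        print(myData)
        # if myData in myDataList:
        #     myOutput = 'authorized'
        # else:
        #     myOutput = 'Unauthorized'
        # Outline the barcode using its corner points.
        pts = np.array([barcode.polygon], np.int32)
        pts = pts.reshape((-1, 1, 2))
        cv2.polylines(img, [pts], True, (255, 0, 255), 5)
        # Write the decoded text at the top-left corner of the bounding rectangle.
        pts2 = barcode.rect
        cv2.putText(img, myData, (pts2[0], pts2[1]), cv2.FONT_HERSHEY_SIMPLEX, 0.9,
                    (255, 0, 255), 2)
    cv2.imshow('result', img)
    cv2.waitKey(1)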
Program Explanation
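Importing Libraries
import cv2: Imports OpenCV, used for video capture, drawing, and display.
import numpy as np: Imports NumPy, used to hold the polygon points as an array.
from pyzbar.pyzbar import decode: Imports the decode function that detects and reads barcodes.
All three libraries must be installed.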
Main Loop
while True:
    success, img = cap.read()
    if not success:
        break
    for barcode in decode(img):
        myData = barcode.data.decode('utf-8')
        print(myData)
        pts = np.array([barcode.polygon], np.int32)
        pts = pts.reshape((-1, 1, 2))
        cv2.polylines(img, [pts], True, (255, 0, 255), 5)
        pts2 = barcode.rect
        cv2.putText(img, myData, (pts2[0], pts2[1]), cv2.FONT_HERSHEY_SIMPLEX, 0.9,
                    (255, 0, 255), 2)
    cv2.imshow('result', img)
    cv2.waitKey(1)
Capturing and Processing Frames
success, img = cap.read(): Captures a frame from the video stream. success is False if no
frame could be read.
for barcode in decode(img): Iterates over each barcode detected in the frame. The decode
function (from pyzbar) scans the frame for barcodes. This line will only work if the decode
function is properly imported.
Decoding Barcode Data
myData = barcode.data.decode('utf-8'): Decodes the barcode data from bytes to a string.
print(myData): Prints the decoded barcode data to the console.
Authorization Check (Commented Out)
The code for checking if the decoded barcode data (myData) is in the authorized list
(myDataList) is commented out. If it were active, it would set myOutput to 'authorized' or
'unauthorized' based on the check.
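A minimal sketch of how this check could work is shown below. It assumes that the data file holds one authorized code per line; the helper names load_authorized and check_code are hypothetical and do not appear in the program.

```python
def load_authorized(lines):
    """Build a set of authorized codes from an iterable of text lines."""
    # Blank lines and surrounding whitespace are ignored.
    return {line.strip() for line in lines if line.strip()}


def check_code(code, authorized):
    """Return the label the program would display for a decoded code."""
    return 'authorized' if code in authorized else 'Unauthorized'


# Example with an in-memory list instead of a file.
authorized_codes = load_authorized(["111111\n", "222222\n", "\n"])
label_known = check_code("111111", authorized_codes)
label_unknown = check_code("999999", authorized_codes)
```

An open file object can be passed directly to load_authorized, for example inside a with open('myDataFile.text') as f: block, because iterating over a file yields its lines. A set is used so that each membership test stays fast even for long lists.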
Drawing Bounding Box Around Barcode
pts = np.array([barcode.polygon], np.int32): Converts the barcode polygon points into a
NumPy array.
pts = pts.reshape((-1, 1, 2)): Reshapes the points array to the format required by
cv2.polylines.
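cv2.polylines(img, [pts], True, (255, 0, 255), 5): Draws a closed polygon around the barcode using
the points, with a color of (255, 0, 255) and a thickness of 5 pixels.
Displaying the Decoded Text and the Frame
pts2 = barcode.rect: Gets the bounding rectangle of the barcode. Its first two values are the
left and top coordinates.
cv2.putText(img, myData, (pts2[0], pts2[1]), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (255, 0, 255), 2):
Writes the decoded text starting at the top-left corner of that rectangle.
cv2.imshow('result', img): Displays the frame with the drawn barcode outlines and
text in a window named 'result'.
cv2.waitKey(1): Waits for 1 millisecond for a key press. This allows the display to update
in real time, creating a video feed effect.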
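The array conversion and reshape used for the bounding polygon can be checked in isolation. In this sketch the Point tuple and the coordinates are stand-ins for the polygon that pyzbar reports, so no camera or barcode is needed.

```python
from collections import namedtuple

import numpy as np

# Stand-in for the (x, y) points that make up barcode.polygon.
Point = namedtuple("Point", ["x", "y"])
polygon = [Point(10, 20), Point(110, 20), Point(110, 70), Point(10, 70)]

# Wrapping the list in another list gives one polygon with four 2-D points.
pts = np.array([polygon], np.int32)
shape_before = pts.shape

# cv2.polylines expects each polygon as an array of shape (points, 1, 2).
pts = pts.reshape((-1, 1, 2))
shape_after = pts.shape
```

The value -1 lets NumPy work out the number of points itself, so the same two lines work for polygons with any number of corners.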
Summary
This code captures video from a webcam, detects barcodes in real-time, decodes the barcode data,
and displays the barcode data and a bounding box around the barcode on the video feed. The
commented-out parts suggest additional functionality for checking if the barcode data is authorized.
CHAPTER 5
FINAL OUTPUTS
OUTPUT 1:
OUTPUT 2:
CONCLUSION
This program demonstrates real-time barcode detection using a webcam by leveraging OpenCV for
video capture and image processing. It sets the webcam resolution to 640x480 pixels and enters a
continuous loop to read frames from the webcam. The program decodes barcodes
present in each frame using the decode function of the pyzbar library, which must be imported
for the loop to run. For each detected barcode, it decodes the data and
prints it to the console. Additionally, it draws a bounding polygon around the detected barcode and
overlays the decoded data as text on the image. The processed frame is then displayed in a window
named 'result', updating in real-time to create a continuous video feed with visual barcode
indications. The program includes a commented-out section for checking if the decoded barcode
data matches entries in an authorized list, suggesting that it could be enhanced to include
authorization verification. Overall, the program provides a functional base for real-time barcode
detection and could be extended with minor adjustments to include
authorization features.
BIBLIOGRAPHY
2) https://stackoverflow.com/
3) Geeky Gadgets
4) https://www.geeksforgeeks.org