
VISUAL TRACKPAD USING PYTHON

A PROJECT REPORT

Submitted by

ANAND G 312321106016
DHANUSU MAHARAJAN M 312321106044

in partial fulfilment for the award of the degree

of

BACHELOR OF ENGINEERING
in
ELECTRONICS AND COMMUNICATION ENGINEERING

St. JOSEPH’S COLLEGE OF ENGINEERING


(An AUTONOMOUS INSTITUTION)

CHENNAI 600 119


MAY-2024
St. JOSEPH’S COLLEGE OF ENGINEERING
CHENNAI-600119

BONAFIDE CERTIFICATE

Certified that this project report “VISUAL TRACKPAD USING PYTHON” is the

bonafide work of “ANAND G (312321106016) and DHANUSU MAHARAJAN M

(312321106044)” who carried out the project work under my supervision.

SIGNATURE SIGNATURE

HEAD OF THE DEPARTMENT SUPERVISOR


Dr. S. Rajesh Kannan, M.E., Ph.D. Dr. Shirley Selvan, M.E., Ph.D.
Associate Professor Associate Professor
Electronics and Communication Engineering Electronics and Communication Engineering

St.Joseph’s College of Engineering St. Joseph’s College of Engineering


CERTIFICATE OF EVALUATION

COLLEGE NAME : St. JOSEPH’S COLLEGE OF ENGINEERING


BRANCH : ELECTRONICS AND COMMUNICATION ENGINEERING
SEMESTER : VI

S. NO   Name of the students who have done the project   Title of the project           Name of the Supervisor with Designation

1.      ANAND G (312321106016)                            VISUAL TRACKPAD USING PYTHON   Dr. Shirley Selvan, M.E., Ph.D.,
2.      DHANUSU MAHARAJAN M (312321106044)                VISUAL TRACKPAD USING PYTHON   Associate Professor

This report of the project work submitted by the above students in partial fulfilment for the
award of the Bachelor of Engineering degree in Electronics and Communication Engineering
of Anna University was evaluated and confirmed to be a report of the work done by the
above students.

Submitted for UNIVERSITY VIVA VOCE Examination held on


……………….

INTERNAL EXAMINER EXTERNAL EXAMINER


ACKNOWLEDGEMENT

The contentment and elation that accompany the successful completion of any work
would be incomplete without mentioning the people who made it possible.

Words are inadequate in offering our sincere thanks and gratitude to our respected
Chairman Dr. B. Babu Manoharan, M.A., M.B.A., Ph.D. for his outstanding
leadership, unwavering dedication, and invaluable contributions to our organization.
His vision, guidance, and commitment have been instrumental in shaping our success
and driving us towards excellence.

We express our sincere gratitude to our beloved Managing Director Mr. B. Shashi
Sekar, M.Sc., (Intl. Business) and Executive Director Mrs. B. Jessie Priya,
M.Com., (Commerce) for their support and encouragement.
We also thank our beloved Principal Dr. Vaddi Seshagiri Rao, M.E., M.B.A., Ph.D.,
for having encouraged us to pursue our undergraduate degree in Electronics and
Communication Engineering in this esteemed institution.

We also express our sincere thanks and most heartfelt sense of gratitude to our eminent
Head of the Department, Dr. S. Rajesh Kannan, M.E., Ph.D., for having extended
his helping hand at all times.

We humbly express our profound gratitude for the invaluable guidance and expert
suggestions generously shared by our supervisor, Dr. Shirley Selvan, M.E., Ph.D.
We thank all staff members of our department, our family members and friends who
have been the greatest source of support to us.
ABSTRACT

There are several reasons why people need an artificial means of interaction such as
a virtual keyboard; the main one is the need for alternative and innovative ideas that
make everyday computer use easier. The project's main objectives are to design an
accurate and efficient eye-tracking system, develop software to process the captured
data, and integrate the software with the computer's operating system to allow hands-
free mouse control. The system comprises an eye-tracking device and a software
module. The eye-tracking device captures the user's eye movements, while the
software module analyses the captured data to determine the intended mouse
movements. The software module then communicates these movements to the
computer's operating system, allowing the user to interact with the computer hands-
free. The project uses machine learning algorithms and computer vision techniques
to improve the accuracy of the eye-tracking system. A growing number of people
need assistance to interact with their surroundings because of illness, and a control
system that lets them operate a computer without another person's help is therefore
very valuable. The idea of eye control is of great use not only to the future of natural
input but, more importantly, to handicapped and disabled users. A camera captures
images of the eye movement. The system first detects the pupil centre position of the
eye; different variations in pupil position are then mapped to different command sets
for the virtual keyboard. The signals pass through the motor driver to interface with
the virtual keyboard itself, and the motor driver controls both speed and direction so
that the virtual keyboard can move forward, left, right and stop.
TABLE OF CONTENTS
CHAPTER NO TITLE PAGE NO
ABSTRACT IV
LIST OF ABBREVIATIONS VII
1 INTRODUCTION 1
1.1 MACHINE LEARNING 2
1.2 MACHINE LEARNING APPLICATIONS 3
1.3 SENSOR TECHNOLOGY IN AUTONOMOUS 4
SYSTEMS
1.4 INTEGRATION WITH AI AND MACHINE 5
LEARNING
1.5 PRACTICAL APPLICATION 6
2 LITERATURE REVIEW 8
3 METHODOLOGY 14
3.1 REQUIREMENT ANALYSIS 15
3.2 SOFTWARE REQUIREMENTS 17
4 RESULTS AND DISCUSSION 21
5 CONCLUSION 24
SOURCE CODE 25
REFERENCES
LIST OF ABBREVIATIONS

HCI Human-Computer Interface


ASR Automatic Speech Recognition
ML Machine Learning
SVR Support Vector Regression
PCA Principal Component Analysis
IE Information Extraction
LDA Linear Discriminant Analysis
ANOVA Analysis of Variance
ANN Artificial Neural Networks
SVM Support Vector Machine
RDBMS Relational Database Management System
LASSO Least Absolute Shrinkage and Selection Operator
OpenCV Open-Source Computer Vision
STT Speech To Text
GAN Generative Adversarial Networks
CHAPTER 1
INTRODUCTION
As computer technologies grow rapidly, the importance of human-computer interaction
becomes highly notable. Some persons with disabilities are unable to use computers in
the conventional way. Eye-ball movement control is mainly intended for such users;
incorporating this eye-controlling system into computers allows them to work without
the help of another individual. Human-Computer Interface (HCI) is focused on the use
of computer technology to provide an interface between the computer and the human.
There is a need to find a suitable technology that makes communication between human
and computer effective, and human-computer interaction plays an important role here.
Thus, there is a need for a method that offers an alternate way of communication between
the human and the computer to individuals who have impairments and gives them an
equivalent opportunity to be part of the Information Society. In recent years, human-
computer interfaces have been attracting the attention of researchers across the globe.
The proposed system is a vision-based implementation of eye-movement detection for
disabled people; it includes face detection, face tracking, eye detection and interpretation
of a sequence of eye blinks in real time for controlling a non-intrusive human-computer
interface. The conventional method of interacting with the computer through the mouse
is replaced with human eye movements. This technique helps paralysed and physically
challenged people, especially persons without hands, to use a computer efficiently and
with ease. First, the camera captures an image and focuses on the eye in the image using
OpenCV code for pupil detection; this yields the centre position of the human eye (pupil).
The centre position of the pupil is then taken as a reference, and based on it the user
controls the cursor by moving the eye left and right. This report also describes existing
solutions that derive cursor movement from 3D models, explains how the cursor is driven
purely by eyeball movement using the OpenCV methodology, illustrates this with
examples, and presents the conclusions.

1.1.Machine Learning – Overview

Machine learning is a very hot topic for one key reason: it provides the ability to
automatically obtain deep insights, recognize unknown patterns, and create
high-performing predictive models from data, all without requiring explicit
programming instructions.

This high-level understanding is critical for anyone involved in a decision-making
process surrounding the usage of machine learning: how it can help achieve business
and project goals, which machine learning techniques to use, potential pitfalls, and
how to interpret the results.

Machine learning is a subfield of computer science, but is often also referred to


as predictive analytics, or predictive modelling. Its goal and usage are to build new
and/or leverage existing algorithms to learn from data, in order to build generalizable
models that give accurate predictions, or to find patterns, particularly with new and
unseen similar data.

Imagine a dataset as a table, where the rows are each observation (aka
measurement, data point, etc), and the columns for each observation represent the
features of that observation and their values.

At the outset of a machine learning project, a dataset is usually split into two
or three subsets. The minimum subsets are the training and test datasets, and often an
optional third validation dataset is created as well.

Once these data subsets are created from the primary dataset, a predictive
model or classifier is trained using the training data, and then the model’s predictive
accuracy is determined using the test data.
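As a minimal sketch of this split-and-evaluate workflow (assuming the scikit-learn library and its bundled iris dataset, neither of which is part of this project), the split and accuracy check might look like this:

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)              # rows = observations, columns = features

# Hold out 20% of the rows as the test subset; the rest is used for training.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LogisticRegression(max_iter=1000)      # a simple classifier
model.fit(X_train, y_train)                    # learn from the training subset only

print("Test accuracy:", model.score(X_test, y_test))   # predictive accuracy on unseen data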

As mentioned, machine learning leverages algorithms to automatically model
and find patterns in data, usually with the goal of predicting some target output or
response. These algorithms are heavily based on statistics and mathematical
optimization.

Optimization is the process of finding the smallest or largest value (minima or
maxima) of a function, often referred to as a loss or cost function in the
minimization case. One of the most popular optimization algorithms used in machine
learning is called gradient descent, and another is known as the normal equation.
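As a simple illustration (a toy example, not tied to any particular library), gradient descent on the one-dimensional quadratic loss L(w) = (w - 3)^2 can be written in a few lines of Python; the step size and starting point below are arbitrary choices:

# Gradient descent on the toy loss L(w) = (w - 3)**2, whose minimum is at w = 3.
def loss_gradient(w):
    return 2 * (w - 3)           # derivative dL/dw

w = 0.0                          # arbitrary starting point
learning_rate = 0.1              # step size
for step in range(100):
    w -= learning_rate * loss_gradient(w)   # move against the gradient

print(w)                         # converges close to 3.0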

In a nutshell, machine learning is all about automatically learning a highly


accurate predictive or classifier model, or finding unknown patterns in data, by
leveraging learning algorithms and optimization techniques.

1.2.Machine Learning Applications

As we move forward into the digital age, one of the modern innovations we have
seen is the creation of machine learning. This form of artificial intelligence is
already being used in various industries and professions; examples include image
and speech recognition, medical diagnosis, prediction, classification, learning
associations, statistical arbitrage, information extraction and regression. This
section looks at some of these machine learning applications in today's world.

1.2.1.Image Recognition

It is one of the most common machine learning applications. There are many
situations where the object to be classified is a digital image. For digital images,
the measurements describe the outputs of each pixel in the image.

In the case of a black-and-white image, the intensity of each pixel serves as
one measurement, so if a black-and-white image has N*N pixels, the total number
of pixels, and hence of measurements, is N². In a coloured image, each pixel is
considered as providing 3 measurements, the intensities of the 3 main colour
components (RGB), so for an N*N coloured image there are 3N² measurements.
 For face detection – The categories might be face versus no face present.
There might be a separate category for each person in a database of several
individuals.
 For character recognition – It can segment a piece of writing into smaller
images, each containing a single character. The categories might consist of
the 26 letters of the English alphabet, the 10 digits, and some special
characters.
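To make the measurement counts above concrete, a short NumPy sketch (purely illustrative, with N chosen arbitrarily) shows that an N*N grayscale image yields N² feature values while an N*N colour image yields 3N²:

import numpy as np

N = 28
gray = np.zeros((N, N))            # black-and-white image: one intensity per pixel
color = np.zeros((N, N, 3))        # coloured image: three intensities (R, G, B) per pixel

print(gray.size)                   # N*N   = 784 measurements
print(color.size)                  # 3*N*N = 2352 measurements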

1.3 Sensor Technologies In Autonomous Systems


In the realm of human-computer interaction, the development of innovative input
methods has been a subject of extensive research and exploration. The emergence of
computer vision and machine learning techniques has opened up new possibilities for
creating intuitive and immersive interfaces. In this and the following chapter, we explore
the relevant research and technologies that inform the development of a visual trackpad
using Python.

1.3.1 Principles Of Ultrasonic Sensor


Ultrasonic sensors operate on the principle of emitting high-frequency sound
waves and measuring the time it takes for the waves to bounce back after hitting an
object. This technology is commonly used for distance measurement and obstacle
detection in robotics and automation applications. While not directly related to visual
trackpads, understanding the principles of ultrasonic sensors can inform the
development of complementary input methods or hybrid systems that combine
multiple sensing modalities.
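As a small illustration of this time-of-flight principle (the only assumed constant is the speed of sound in air, roughly 343 m/s at room temperature), the distance to an object can be computed from the measured echo time as follows:

# Time-of-flight distance estimate for an ultrasonic sensor.
SPEED_OF_SOUND = 343.0            # metres per second in air at about 20 degrees C

def distance_from_echo(echo_time_s):
    # The pulse travels to the object and back, so divide the round-trip time by two.
    return SPEED_OF_SOUND * echo_time_s / 2.0

print(distance_from_echo(0.01))   # a 10 ms echo corresponds to about 1.7 m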

1.3.2 Advantages Of Ultrasonic Sensors
Ultrasonic sensors offer several advantages, including non-contact operation,
high accuracy, and reliability in various environmental conditions. These qualities
make them suitable for applications where precise distance measurement is required,
such as robotic navigation and object detection. While not inherently tied to visual
trackpads, these advantages highlight the potential for integrating ultrasonic sensors
into multi-modal input systems to enhance user interaction and expand functionality.

1.3.3 Challenges And Limitations


Despite their advantages, ultrasonic sensors also have limitations and
challenges to consider. These may include limited range, susceptibility to interference
from ambient noise, and difficulty in accurately detecting certain materials or
surfaces. When designing a visual trackpad system, it's essential to assess these
limitations and explore strategies to mitigate their impact or complement them with
other sensing technologies.

1.3.4 Comparative Analysis


A comparative analysis of different sensing modalities, including cameras,
ultrasonic sensors, and other proximity sensors, can provide insights into their
respective strengths and weaknesses for human-computer interaction applications. By
evaluating factors such as accuracy, robustness, cost, and ease of integration,
researchers can make informed decisions about the most suitable sensor technologies
for specific use cases, including visual trackpads.

1.4 Integration With Ai And Machine Learning


The integration of AI and machine learning techniques holds promise for
enhancing the functionality and performance of visual trackpad systems. By training
models to recognize and interpret hand gestures, researchers can enable more

sophisticated interaction capabilities, such as gesture-based commands, multi-touch
gestures, and gesture recognition in diverse environments and lighting conditions.
Machine learning algorithms can also adapt and improve over time based on user
feedback, further enhancing the user experience.

1.5 Practical Application

A practical application for a visual trackpad using Python could be its integration
into interactive kiosks or digital signage systems. Here's how it could be implemented:

1 Interactive Kiosks:

Wayfinding and Navigation: In a shopping mall or large public space, users can interact
with a visual trackpad to navigate through maps, find directions to specific stores or
facilities, and explore points of interest.

Product Catalog Browsing: In retail environments, users can browse through product
catalogs by swiping through images or categories using hand gestures on the visual
trackpad. They can select items for more information or to make a purchase.

Information Retrieval: At information kiosks in airports, museums, or libraries, users


can use the visual trackpad to search for information, access multimedia content, or
interact with interactive exhibits.

Virtual Tour Guides: In tourist attractions or historical sites, users can use the

visual trackpad to access virtual tour guides, view augmented reality overlays, and
interact with interactive maps or exhibits.

2 Digital Signage Systems:


Interactive Advertisements: In retail stores or public spaces, interactive digital signage
displays equipped with a visual trackpad can engage users with interactive
advertisements, allowing them to swipe through product images, watch videos, or
participate in interactive games or surveys.

Information Display: In corporate or educational environments, digital signage displays
can provide interactive information displays, allowing users to browse through event
schedules, campus maps, or company announcements using hand gestures on the visual
trackpad.

Wayfinding and Directory Systems: In large buildings or complexes, digital signage


displays equipped with visual trackpads can serve as wayfinding and directory systems,
allowing users to search for specific locations, view floor plans, and navigate through
interactive maps.

CHAPTER 2

LITERATURE REVIEW
2.1 Hand Gesture Recognition Based Virtual Mouse Events

Manav Ranawat, Madhur Rajadhyaksha, Neha Lakhani, Radha


Shankarmani (2021) presented a virtual mouse application based on the tracking of
different hand gestures. The system eliminates the dependency on any external
hardware required to perform mouse actions. A built-in camera tracks the user's
hands, predefined gestures are recognized and the corresponding mouse events are
executed. This system has been implemented in Python using OpenCV and
PyAutoGUI. Researchers have studied background conditions, effects of differences
in illuminance and skin colour individually. However, the proposed system aims to
consider all the above factors to build an application most suitable in the real world.

2.2 Virtual Mouse Control Using Coloured Finger Tips And Hand Gesture
Recognition

Vantukala VishnuTeja Reddy, Thumma Dhyanchand, Galla Vamsi Krishna,


Satish Maheshwaram (2022) noted that, in human-computer interaction, a virtual
mouse implemented with fingertip recognition and hand gesture tracking based on
images from a live video is one of the studied approaches. In this paper, virtual mouse control using
fingertip identification and hand gesture recognition is proposed. This study consists
of two methods for tracking the fingers, one is by using coloured caps and other is
by hand gesture detection. This includes three main steps that are finger detection
using colour identification, hand gesture tracking and implementation on on-screen
cursor. In this study, hand gesture tracking is generated through the detection of the
contour and formation of a convex hull around it. Features of hands are extracted
with the area ratio of contour and hull formed. Detailed tests are performed to check

this algorithm in real world scenarios. A method for on-screen cursor control without
any physical connection to a sensor is presented. Identification of coloured caps on
the fingertips and their tracking is involved in this work. Different hand gestures can
be replaced in place of coloured caps for the same purpose. Different operations of
mouse controlled are single left click, double left click, right click and scrolling.
Various combinations of the coloured caps are used for different operations. Range
of skin colour can be varied in the program in accordance with the person to be used,
surrounding lightening conditions. An approximate area ratio that is not being used
by the hand in the convex hull is taken after analysing the program output at different
gestures of the hand. This work can be used in various real time applications like
cursor control in a computer, android based smart televisions etc. Although there are
devices like mouse and laser remotes for the same purpose, this work is so simple so
that it reduces the usage of external hardware in such a way that the motion of fingers
in front of a camera will result in the necessary operation on the screen.
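A rough OpenCV sketch of the contour-and-convex-hull idea described above (an illustrative reconstruction, not the authors' code; the binary mask is assumed to come from a prior skin-colour or colour-cap segmentation step):

import cv2
import numpy as np

def hull_area_ratio(mask):
    # mask: a binary image in which the hand region is white (255).
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    hand = max(contours, key=cv2.contourArea)      # largest contour = the hand
    hull = cv2.convexHull(hand)                    # convex hull around the hand
    contour_area = cv2.contourArea(hand)
    hull_area = cv2.contourArea(hull)
    # The fraction of the hull NOT covered by the hand grows as the fingers spread apart,
    # which is what makes the area ratio a usable gesture feature.
    return (hull_area - contour_area) / hull_area if hull_area > 0 else None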

2.3 Mouse On A Ring: A Mouse Action Scheme Based On Imu And Multi-Level
Decision Algorithm

Yuliang Zhao (2022) determined that the traditional mouse has been used as
a main tool for human-computer interaction for more than 50 years. However, it has
become unable to cater to people’s need for mobile officing and all-weather use due
to its reliance on the support of a two-dimensional plane, poor portability,
wearisomeness, and other problems. In this paper, they proposed a portable ring- type
wireless mouse scheme based on IMU sensors and a multi-level decision algorithm.
The user only needs to operate in the air with a smart ring worn on the middle finger
of their right hand to realize the interactive function of a mouse. The smart ring first
captures changes in the finger’s attitude angle to reflect how the cursor position
changes. And, it captures the rapid rotation of the user’s palm to the left and right to
achieve mouse clicking. In addition, a multi-level decision algorithm is developed to
improve the response speed and recognition accuracy of the virtual mouse. The
experimental results show that the virtual mouse has a target selection accuracy of
over 96%, which proves its practicability in real-world applications. This virtual
mouse is expected to be used as a portable and reliable tool for multi-scenario human-
computer interaction applications in the future. In this paper, they proposed a ring-
type wireless mouse scheme based on IMU sensors and a multi-level decision
algorithm. This scheme does not require the support of a 2D desktop. The user can
realize the interactive function of a traditional mouse in the air by simply wearing a
smart ring on the middle finger of their right hand. First, the smart ring captures
changes in the finger’s attitude angle to control how the cursor moves. This reduces
the dependence on the environment and allows for unrestricted interaction in multiple
scenarios. Second, the smart ring, which is integrated on the ring carrier without extra
auxiliary sensors, captures the rapid rotation of the user’s palm to the left and right
to achieve mouse clicking. This design greatly reduces the virtual mouse’s size and
improves its portability. Finally, considering that the increase in user motion types
affects the recognition performance, a multilevel decision algorithm is developed
based on the hierarchy in the data set and used to improve the response speed and
recognition accuracy of the virtual mouse. The experimental results show that the
target selection accuracy of the virtual mouse is over 96%, and its target selection
time and path efficiency are very close to those of a standard mouse. These findings
demonstrate the effectiveness and usability of the virtual mouse in real-world
applications. They believe that this virtual mouse will provide a portable, highly
accurate, and responsive solution to support the needs for multi-scenario human-
computer interaction in the future.

2.4 Wireless Gyro-Mouse For Text Input On A Virtual Keyboard


Rares Pogoreanu and Radu Gabriel Bozomitu (2022) presented a
gyroscopic pointing device that allows a user to type text by using a specialized

virtual keyboard. They compared different typing methods that use the dwell time of
the pointer as a character selection method. The proposed solution can be used in
different applications, such as: computer gaming, virtual reality, remote control and
to facilitate communication with people affected by certain disabilities and can easily
accommodate more types of users. The performances of different methods have been
analysed by using the speed of typing measured in words per minute, the task success
rate, as well as the feedback from the user regarding comfort and perceived ease-of-
use. In this paper, a robust method of typing based on a wireless gyro- mouse has
been presented. The proposed interface includes a specialized virtual keyboard with
dwell time as a selection method and a wireless gyro-mouse based on an IMU sensor
that combines an accelerometer and gyroscope in a single package. The proposed
device has been tested on 10 healthy subjects, students of “Gheorghe Asachi”
Technical University of Iasi, Romania. According to the obtained results, all subjects
succeeded in using the device as intended, and obtained good performances in the
typing test. All subjects rapidly and easily learned to use the device and succeed in
completing the required tasks. The best performance results have been obtained for
a dwell time of 750 ms, where the average speed of typing for all subjects was
4.1WPM.

2.5 Optical Mouse And Keyboard Using Hand Gestures
Nowadays computer vision has reached its pinnacle, where a computer can
identify its owner using a simple program of image processing. In this stage of
development, people are using this vision in many aspects of day to day life, like
Face Recognition, Colour detection, Automatic car, etc. In this project, computer
vision is used in creating an Optical mouse and keyboard using hand gestures. The
camera of the computer will read the image of different gestures performed by a
person's hand and according to the movement of the gestures the Mouse or the cursor

of the computer will move, even perform right and left clicks using different gestures.
Similarly, the keyboard functions may be used with some different gestures, like
using a one-finger gesture for alphabet selection and a four-finger gesture to swipe left and
right.

2.6 Real-Time Eye Gaze Direction Classification Using Convolutional Neural


Network

Anjith George and Aurobinda Routray (2016) noted that estimating eye gaze
direction is useful in various human-computer interaction tasks. Knowledge of gaze
direction can give valuable information regarding a user's point of attention. Certain
patterns of eye movements known as eye accessing cues are reported to be related to
the cognitive processes in the human brain. They proposed a real-time framework for
the classification of eye gaze direction and estimation of eye accessing cues. In the
first stage, the algorithm detects faces using a modified version of the Viola- Jones
algorithm. A rough eye region is obtained using geometric relations and facial
landmarks. The eye region obtained is used in the subsequent stage to classify the
eye gaze direction. A convolutional neural network is employed in this work for the

classification of eye gaze direction. The proposed algorithm was tested on Eye
Chimera database and found to outperform state of the art methods. The
computational complexity of the algorithm is very low in the testing phase. The
algorithm achieved an average frame rate of 24 fps in the desktop environment. In
this work, a framework for real-time classification of eye gaze direction is presented.
The estimated eye gaze direction is used to infer eye accessing cues, giving
information about the cognitive states. The computational complexity is very low;
they achieved frame rates of around 24 Hz in a Python implementation on a 2.0 GHz
Core i5 PC running 64-bit Ubuntu (4 GB RAM). The per-frame computational time
is 42 ms, which is much less than that of other state-of-the-art methods (250 ms in

[20]). Off the shelf webcams can be used for computing the Eye gaze direction. The
proposed algorithm works even with in plane rotations of the face. The eye gaze
direction obtained can also be used for human-computer interaction applications. The
computational complexity of the algorithm in the testing phase is low, which makes it
suitable for smart devices with low resolution cameras using pre-trained models.
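For readers unfamiliar with the model class, a schematic convolutional network for three-way gaze-direction classification could be defined as below; this is only a generic, hypothetical sketch (assuming TensorFlow/Keras, 64x64 grayscale eye crops and three classes), not the architecture used by George and Routray:

import tensorflow as tf

# A generic, illustrative CNN: 64x64 grayscale eye crops in, 3 gaze classes out.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64, 64, 1)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),   # one probability per gaze direction
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])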

CHAPTER 3

METHODOLOGY

BLOCK DIAGRAM

[Block diagram: Image Capturing → Face Detection → Eye Detection, followed by three
parallel stages (Face Detection/Tracking, Eye Tracking, Blink Detection) that drive
Mouse Scrolling, Cursor Movement and Click Events.]

Fig.3.0 Block diagram of the proposed system

Fig.3.0 represents how the camera detects the user and performs its functions. The user
sits in front of the display screen of a personal computer with a specialised video camera
mounted above the screen to observe the user's eyes. The computer constantly analyses
the video image of the eye and determines where the user is looking on the screen;
nothing is attached to the user's head or body. To "pick out" any key, the user looks at it
for a specific amount of time, and to "press" any key, the user simply blinks. On this
device, calibration procedures are not required, entry is straightforward, and no external
hardware is connected or required. The camera gets its input from the eye. After receiving
the streaming video from the camera, the system breaks it into frames. For each frame it
checks the lighting conditions, because the camera requires sufficient lighting from
external sources; otherwise, errors will be shown on the screen. The movement of the
pupil is detected and determines where the cursor moves, and blinking of the
left eye results in a left click on the virtual mouse pad.
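The dwell-time selection described above can be sketched as a small helper; the 1-second threshold and the key_under_gaze argument are hypothetical placeholders, not values taken from the actual implementation:

import time

DWELL_SECONDS = 1.0      # assumed dwell threshold; a real system would tune this

class DwellSelector:
    def __init__(self):
        self.current_key = None
        self.gaze_start = None

    def update(self, key_under_gaze):
        """Call once per frame with the key the user is currently looking at.
        Returns the key to 'press' once the gaze has dwelt long enough, else None."""
        now = time.time()
        if key_under_gaze != self.current_key:
            self.current_key = key_under_gaze     # gaze moved to a new key: restart timer
            self.gaze_start = now
            return None
        if key_under_gaze is not None and now - self.gaze_start >= DWELL_SECONDS:
            self.gaze_start = now                 # reset so the key is not selected repeatedly
            return key_under_gaze
        return None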

3.1. Requirement Analysis:

- Define the functional requirements of the visual trackpad, including supported


gestures, interaction scenarios, and performance expectations.

- Identify hardware and software requirements, such as cameras, computing platforms,


and Python libraries.

3.1.2. Selection of Computer Vision Libraries:

- Evaluate and select appropriate computer vision libraries for gesture recognition and
image processing tasks.

- Consider factors such as compatibility with Python, support for real-time processing,
and availability of pre-trained models.

3.1.3. Camera Setup:

- Choose a suitable camera setup based on the application requirements, such as


webcam, depth camera, or RGB-D sensor.

- Ensure proper calibration and alignment of the camera to capture clear and consistent
images for gesture recognition.

3.1.4. Gesture Recognition Algorithm:

- Develop or select a gesture recognition algorithm capable of detecting and


interpreting hand movements from camera input.
- Explore techniques such as background subtraction, hand segmentation, feature
extraction, and machine learning-based classification.
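As a hedged example of one of these techniques, OpenCV's built-in MOG2 background subtractor can isolate a moving hand from a static background; this is only one possible starting point for segmentation, not the project's final algorithm:

import cv2

cap = cv2.VideoCapture(0)
subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=False)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)          # white pixels = moving foreground (e.g. a hand)
    mask = cv2.medianBlur(mask, 5)          # suppress small specks of noise
    cv2.imshow("foreground mask", mask)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()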

3.1.5. Training Data Collection (if applicable):

- Gather a diverse dataset of hand gestures to train and validate the gesture recognition
algorithm.

- Capture hand movements under various conditions, including different lighting


environments, backgrounds, and hand poses.

3.1.6. Model Training and Optimization (if applicable):

- Train machine learning models, such as convolutional neural networks (CNNs) or


deep learning-based classifiers, using the collected training data.

- Optimize model hyperparameters, architecture, and training strategies to improve


accuracy and generalization performance.

3.1.7. Real-time Processing Pipeline:

- Implement a real-time processing pipeline to capture video frames from the camera,
apply the gesture recognition algorithm, and generate corresponding input commands.

- Optimize the processing pipeline for efficiency and low latency to ensure responsive
interaction with the visual trackpad.
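A bare-bones shape of such a pipeline is sketched below; recognize_gesture() and gesture_to_command() are hypothetical placeholders standing in for the recognition and mapping stages described above:

import cv2
import pyautogui

def recognize_gesture(frame):
    # Placeholder: a real implementation would return a gesture label such as "move" or "click".
    return None

def gesture_to_command(gesture):
    # Placeholder mapping from a gesture label to an input action.
    if gesture == "click":
        pyautogui.click()

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gesture = recognize_gesture(frame)     # stage 2: gesture recognition
    gesture_to_command(gesture)            # stage 3: emit the corresponding input command
    cv2.imshow("visual trackpad", frame)   # optional preview for debugging
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()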

3.1.8. Integration with User Interface (UI):

- Integrate the visual trackpad with a user interface (UI) framework or application to
provide visual feedback to the user.

- Design intuitive user controls and feedback mechanisms to guide users in performing
gestures and interpreting system responses.

3.1.9. Testing and Evaluation:

- Conduct thorough testing of the visual trackpad system under various conditions,
including different lighting, backgrounds, and user scenarios.

- Evaluate the accuracy, responsiveness, and usability of the visual trackpad through
user testing, feedback collection, and performance metrics analysis.

3.1.10. Documentation and Deployment:

- Document the implementation details, including software architecture, algorithms,


parameters, and usage instructions.

- Prepare the visual trackpad system for deployment, including packaging, installation,
and distribution to end-users or deployment platforms.

3.1.11. Maintenance and Updates:

- Provide ongoing maintenance and support for the visual trackpad system, including
bug fixes, performance optimizations, and updates to accommodate changes in hardware
or software dependencies.

- Incorporate user feedback and feature requests to iteratively improve the visual
trackpad's functionality and usability over time.

3.2 SOFTWARE AND ALGORITHM:

SOFTWARE REQUIREMENTS

 Python

 OpenCV

 PyAutoGUI

3.2.1 Python
Python is a high-level, interpreted programming language known for its
simplicity, readability, and versatility. It was first created in the late 1980s by Guido
van Rossum and has since become one of the most popular languages for a wide
range of applications, including web development, data analysis, machine learning,
scientific computing, and more.
One of the key features of Python is its simplicity and ease of use. Its syntax is
easy to learn and read, making it an ideal language for beginners. The language
emphasizes code readability, which means that it is designed to be easily understood
and maintained by developers.
Python is also known for its vast collection of libraries and frameworks, which
provide a wealth of functionality for developers. Some of the most popular libraries
include NumPy, Pandas, Matplotlib, and SciPy, which provide powerful tools for
data analysis and scientific computing. Additionally, Python has a large and active
community of developers, who contribute to the language by creating new libraries
and tools, as well as supporting each other through online forums and communities.
Another advantage of Python is its cross-platform compatibility. Python code can
run on various operating systems, including Windows, Linux, and macOS, without
the need for modification. This makes it a popular choice for developing software
that needs to run on multiple platforms.

Finally, Python is an interpreted language, which means that it does not need
to be compiled before running. This allows for rapid prototyping and iteration,
making it an ideal choice for developing software in an agile environment.

3.2.2 Python OpenCV


OpenCV (Open Source Computer Vision Library) is a popular open-source
computer vision and machine learning software library, originally developed by Intel
in 1999, and now maintained by the OpenCV community. It is written in C++, but

also provides bindings for various programming languages, including Python.
Python OpenCV is a Python interface for the OpenCV library that allows
developers to use OpenCV functions and algorithms in Python code. It provides a
rich set of image processing and computer vision algorithms that can be used for
various tasks such as object detection, recognition, tracking, face detection, image
and video analysis, and more.
Python OpenCV provides various modules for image and video processing,
including image filtering, feature detection and description, object recognition,
motion analysis, and more. It also provides support for reading and writing various
image and video file formats, as well as for capturing live video streams from
cameras.
One of the main benefits of using Python OpenCV is its simplicity and ease of
use. It provides a Pythonic interface that makes it easy to write Python code for image
processing and computer vision tasks. Additionally, it has a large community of
developers who contribute to its development, provide support, and share their
knowledge and code through various forums and resources.
In summary, Python OpenCV is a powerful tool for image and video processing
and computer vision tasks that provides a simple and easy-to-use Python interface
for the OpenCV library.
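A small, self-contained example of the kind of task mentioned above, face detection with one of OpenCV's bundled Haar cascades ("sample.jpg" is an assumed input file; the calls shown are standard OpenCV usage, not project-specific code):

import cv2

# Load one of the Haar cascade models that ships with OpenCV.
cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
face_detector = cv2.CascadeClassifier(cascade_path)

image = cv2.imread("sample.jpg")                       # any test image on disk
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)         # cascades operate on grayscale
faces = face_detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in faces:
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)   # draw each detection

cv2.imwrite("faces.jpg", image)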

3.2.3 PYAUTOGUI
PyAutoGUI is a Python library that allows you to automate mouse and
keyboard operations on a computer. It provides a simple and cross-platform interface
for controlling GUI applications, automating repetitive tasks, and testing software.
Some of the common tasks that can be automated using PyAutoGUI include clicking,
typing, dragging, scrolling, and taking screenshots.

PyAutoGUI works by using platform-specific modules such as pywin32 (on
Windows) and Quartz (on macOS) to simulate keyboard and mouse input events. It can
also retrieve information about the screen, such as the position of windows, mouse
cursor position, and colour values of pixels. This allows for a wide range of
possibilities when automating tasks.
One of the main advantages of PyAutoGUI is its simplicity. Its easy-to-use
functions and intuitive interface make it accessible to beginners who have little or no
experience in automation. Additionally, PyAutoGUI is cross-platform, meaning that
it can be used on Windows, Mac, and Linux operating systems.
However, it is important to note that PyAutoGUI may not work well in some
situations, especially in cases where there is a delay or lag in the system. It is also
important to use PyAutoGUI carefully, as it can potentially cause damage to the
system if not used properly.
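A few representative PyAutoGUI calls are shown below purely as a usage sketch; the coordinates, text and file name are arbitrary choices, not values used by the project:

import pyautogui

pyautogui.FAILSAFE = True                  # moving the mouse to a screen corner aborts the script

width, height = pyautogui.size()           # current screen resolution
pyautogui.moveTo(width // 2, height // 2, duration=0.5)   # glide the cursor to the screen centre
pyautogui.click()                          # left click at the current position
pyautogui.write("hello", interval=0.05)    # type text with a small delay between keys
pyautogui.screenshot("screen.png")         # save a screenshot to disk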

3.2.4 ALGORITHM:

STEP 1 : Start the Program.

STEP 2 : Open a file and using the file location go to the terminal.

STEP 3 : Import the necessary packages.

STEP 4 : Initialize the code and start video capture from the webcam.

STEP 5 : Detect eyes and recognize the gesture.

STEP 6 : Perform mouse operation according to the gesture.

STEP 7 : Terminate the Program.

CHAPTER 4

RESULTS AND DISCUSSION

RESULT:

Fig.4.1.Camera sensing the pupil of the eye

Fig.4.1. shows the detection of the eye (pupil) from the webcam.

Fig.4.2.Mouse using face recognition

Fig.4.2. shows the result and working of the visual trackpad. After implementing
the visual trackpad using Python and conducting thorough testing, several key results
and discussion points emerge:
Functional Visual Trackpad System: The primary outcome of the project would be the
development of a functional visual trackpad system implemented in Python. This system
would allow users to control a cursor or interact with digital content using hand gestures
captured by a camera.
Gesture Recognition Algorithm: A key outcome would be the implementation of a
robust gesture recognition algorithm capable of accurately detecting and interpreting a
variety of hand movements and gestures. This algorithm would form the core of the
visual trackpad system and enable intuitive interaction with digital devices.
Real-Time Processing Pipeline: The project would involve the design and
implementation of a real-time processing pipeline to capture video frames from the
camera, apply the gesture recognition algorithm, and generate corresponding input
commands. This pipeline would ensure responsive interaction with the visual trackpad
system.
User Interface Integration: Another outcome would be the integration of the visual
trackpad system with a user interface (UI) framework or application, allowing users to
visualize their hand gestures and receive feedback on their interactions. The UI would
provide an intuitive and user-friendly interface for controlling digital content.
Testing and Evaluation Results: The project would include testing and evaluation of
the visual trackpad system under various conditions, including different lighting
environments, backgrounds, and user scenarios. Results from testing would provide
insights into the accuracy, responsiveness, and usability of the system.
Documentation and User Guide: A comprehensive documentation and user guide
would be produced as an outcome of the project, detailing the implementation details,
usage instructions, and troubleshooting tips for the visual trackpad system. This
documentation would facilitate the deployment and use of the system by other
developers and end-users.
Demonstration and Presentation: Finally, the project outcomes would include a
demonstration of the visual trackpad system and a presentation showcasing its features,
functionality, and potential applications. This demonstration would highlight the
project's achievements and contributions to the field of human-computer interaction.

CHAPTER 5
CONCLUSION
In this work, the system first locates the pupil at the centre of the eye. Then, various
commands are established for the virtual keyboard depending on the variations in pupil
position. The signals pass through the motor driver on their way to the virtual keyboard
itself; the motor driver's ability to regulate direction and speed allows the virtual mouse
to move forward, left, right, and stop. The system provides a control method based on
eye tracking that enables people to interact with computers easily and intuitively using
only their eyes. It combines keyboard and mouse functionality so that users may perform
practically all computer inputs without the need for conventional input devices. The
technology not only makes it possible for disabled people to use computers in the same
way as able-bodied users, but it also gives sighted persons a fresh option for using
computers. In a browsing trial, the proposed method increased browsing effectiveness
and enjoyment and allowed users to engage easily with multimedia. The development
and implementation of the visual trackpad using Python represent a significant
advancement in human-computer interaction (HCI) technology. Through the integration
of computer vision techniques, machine learning algorithms, and real-time processing,
the visual trackpad offers users a novel and intuitive input method that enhances their
interaction with digital devices and environments. Future improvements could focus on
enhancing the robustness and adaptability of the gesture recognition algorithm through
advanced machine learning techniques.

SOURCE CODE

import cv2
import mediapipe as mp
import pyautogui

# Open the default webcam and create a MediaPipe Face Mesh with iris landmarks enabled.
cam = cv2.VideoCapture(0)
face_mesh = mp.solutions.face_mesh.FaceMesh(refine_landmarks=True)
screen_w, screen_h = pyautogui.size()

while True:
    _, frame = cam.read()
    frame = cv2.flip(frame, 1)                      # mirror the frame for natural movement
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    output = face_mesh.process(rgb_frame)           # run face and iris landmark detection
    landmark_points = output.multi_face_landmarks
    frame_h, frame_w, _ = frame.shape
    if landmark_points:
        landmarks = landmark_points[0].landmark
        # Landmarks 474-477 outline the right iris; the second point of this slice drives the cursor.
        for id, landmark in enumerate(landmarks[474:478]):
            x = int(landmark.x * frame_w)
            y = int(landmark.y * frame_h)
            cv2.circle(frame, (x, y), 3, (0, 255, 0))
            if id == 1:
                screen_x = screen_w * landmark.x
                screen_y = screen_h * landmark.y
                pyautogui.moveTo(screen_x, screen_y)
        # Landmarks 145 and 159 are the lower and upper left eyelid; a small gap means a blink.
        left = [landmarks[145], landmarks[159]]
        for landmark in left:
            x = int(landmark.x * frame_w)
            y = int(landmark.y * frame_h)
            cv2.circle(frame, (x, y), 3, (0, 255, 255))
        if (left[0].y - left[1].y) < 0.004:
            pyautogui.click()                       # blink detected: perform a left click
            pyautogui.sleep(1)                      # pause to avoid repeated clicks
    cv2.imshow('Eye Controlled Mouse', frame)
    cv2.waitKey(1)
REFERENCES

[1] Brooks, R. E. (1997) "Towards a theory of the cognitive processes in computer programming," Int. J. Man-Mach. Studies, vol. 9, pp. 737-751.

[2] Cheng-Chih Wu and Ting-Yun Hou (2015) "Tracking Students' Cognitive Processes During Program Debugging - An Eye-Movement Approach," IEEE.

[3] D. LeBlanc, H. Hamam, and Y. Bouslimani, "Infrared-based human-machine interaction," in Proc. 2nd Int. Conf. Inf. Commun. Technol., 2006, pp. 870-875.

[4] Ehrlich, K. and Soloway, E. (1983) "Cognitive strategies and looping constructs: An empirical study," Commun. ACM, vol. 26, no. 11, pp. 853-860.

[5] Eric Sung and Jian-Gang Wang (2002) "Study on Eye Gaze Estimation," IEEE, vol. 32, no. 3, June.

[6] Murphy, L. (2008) "Debugging: The good, the bad, and the quirky - A qualitative analysis of novices' strategies," SIGCSE Bull., vol. 40, no. 1, pp. 163-167.

[7] N. Shaker and M. A. Zliekha, "Real-time Finger Tracking for Interaction," 2007 5th International Symposium on Image and Signal Processing and Analysis, Istanbul, 2007, pp. 141-145.

[8] Qiang Ji and Zhiwei Zhu (2007) "Novel Eye Gaze Tracking Techniques Under Natural Head Movement," IEEE, vol. 54, no. 12, December.

[9] R. S. Batu, B. Yeilkaya, M. Unay and A. Akan, "Virtual Mouse Control by Webcam for the Disabled," 2018 Medical Technologies National Congress (TIPTEKNO), Magusa, 2018, pp. 1-4.

[10] S. Yousefi, F. A. Kondori, and H. Li, "Camera-based gesture tracking for 3D interaction behind mobile devices," Int. J. Pattern Recognit. Artif. Intell., vol. 26, no. 8, Dec. 2012, Art. no. 1260008.

[11] S. K. Kang, M. Y. Nam and P. K. Rhee, "Color Based Hand and Finger Detection Technology for User Interaction," 2008 International Conference on Convergence and Hybrid Information Technology, Daejeon, 2008, pp. 229-236.

[12] Z. Ma and E. Wu, "Real-time and robust hand tracking with a single depth camera," Vis. Comput., vol. 30, no. 10, pp. 1133-1144, Oct. 2014.
