Figure 1: The flowchart details the various steps involved in the development of the virtual desktop assistant.
2) Speech Recognition and Text to Speech: We implemented a speech recognition system that converts spoken language into text, using the SpeechRecognition library in Python. Along with this, we implemented a text-to-speech system that converts the assistant's responses (text) into natural-sounding speech.
3) Task Execution and Personalization: We developed modules to perform various tasks based on user intents, such as opening websites, playing music, telling the time and date, etc.
4) User Interface Design: A user-friendly interface will be developed to facilitate intuitive interactions with
the voice assistant, ensuring a smooth user experience across various devices and platforms.
5) Integration of External APIs: Integrate external APIs for accessing relevant information and services,
such as weather forecasts, news updates, and online search functionalities, to enrich the assistant’s
capabilities.
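Steps 2 and 3 above can be sketched as a minimal command router. This is an illustrative sketch only, not the paper's actual code: the function name `handle_command` and the "open <site>" URL rule are assumptions. In the full assistant, the input string would come from the SpeechRecognition library and the returned reply would be spoken aloud by a text-to-speech engine.

```python
import datetime
import webbrowser


def handle_command(command: str) -> str:
    """Map a recognized utterance to an action and a reply string.

    A minimal sketch of intent handling (step 3); a real assistant
    would cover many more intents and speak the reply aloud.
    """
    command = command.lower().strip()
    if "time" in command:
        return datetime.datetime.now().strftime("The time is %H:%M")
    if "date" in command:
        return datetime.datetime.now().strftime("Today is %d %B %Y")
    if command.startswith("open "):
        # Hypothetical rule: "open youtube" -> https://www.youtube.com
        site = command[len("open "):].replace(" ", "")
        url = f"https://www.{site}.com"
        webbrowser.open(url)  # launches the default browser
        return f"Opening {url}"
    return "Sorry, I did not understand that."
```

Keeping each intent behind a plain function like this makes it straightforward to add the external-API intents of step 5 (weather, news, search) as additional branches or as a dispatch table.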
B. System Design
The virtual desktop assistant takes voice commands, recognizes them, processes them, and executes the requested task. It responds dynamically to user input, executing tasks based on the user's desires, as depicted by the data flow diagrams (DFDs) of Figure 2 and Figure 3 respectively.
Figure 2: The diagram depicts the Level 0 DFD of an end user-virtual assistant interaction.
The user sends a command, the assistant processes it, retrieving the relevant information through APIs, and the user receives a response. The diagram shows the user's role in the communication process, with the assistant using various APIs.
Figure 3: The diagram depicts the Level 1 DFD illustrating the end user's input process and the subsequent system response and actions.
IV. IMPLEMENTATION AND RESULTS
A. Implementation Details
For implementation of the voice desktop assistant, the following are the minimum hardware configurations required:
• Processor: The program needs a processor that is at least as powerful as an Intel Core 2 Duo. Although this is the bare minimum, a more potent CPU is advised for lag-free performance. Performance will be much enhanced with a multi-core processor, like an Intel Core i5 or i7, especially for jobs requiring voice recognition and artificial intelligence.
• RAM (Random Access Memory): A minimum of 6 GB of RAM is required. However, to handle resource-intensive tasks and ensure smooth multitasking, it's advisable to have more RAM, such as 8 GB or 16 GB. The amount of RAM affects the software's ability to process voice commands and AI tasks efficiently.
• Hard Drive (HDD/SSD): A minimum of 256 GB of storage space is essential for software installation, data
storage, and updates. Consider using a Solid State Drive (SSD) instead of a Hard Disk Drive (HDD) for
faster data access and improved overall system performance. Additionally, having extra storage space is
beneficial if you plan to store a significant amount of data.
The minimum software configurations required are as follows:
• Operating System: This software is compatible with Windows 11 (64-bit), which provides access to the latest features and security updates; ensure that your system meets the necessary requirements.
• Integrated Development Environment (IDE): Visual Studio Code (VS Code) is a free, open-source code
editor that supports Python and offers real-time functionality, creating an efficient environment for code
development, debugging, and collaboration.
• Programming Language (Python): The software, developed using Python, is suitable for AI and voice recognition tasks and requires Python version 3.9.11 for proper functionality.
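Since the software requires Python 3.9.11, a startup guard can verify the interpreter version before the assistant loads. The paper does not describe such a check; checking only the major and minor version, and the exact error message, are assumptions of this sketch.

```python
import sys

# Refuse to start on interpreters older than the 3.9 series the
# assistant was developed against.
REQUIRED = (3, 9)
if sys.version_info[:2] < REQUIRED:
    raise SystemExit(
        f"Python {REQUIRED[0]}.{REQUIRED[1]}+ is required, "
        f"found {sys.version.split()[0]}"
    )
print("Python version check passed")
```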
Why Choose Python Language for Vikram? Python, originating before the surge in popularity of machine learning and AI, remains a preferred choice owing to its distinctive attributes, which distinguish it from other programming languages. These qualities include:
1) Rich Collection of Packages and Libraries: Python boasts an extensive array of packages and libraries, which, despite its inception predating the rise of machine learning and AI, render it invaluable for these domains.
2) Enhanced Code Readability: Python’s elegant and concise syntax significantly aids machine learning
endeavors by simplifying program composition and enhancing readability, facilitating comprehension and
maintenance.
Figure 4: The above figure showcases the top 10 most popular programming languages globally, including
Python, Java, and JavaScript.
B. Results
The following figures provide visual results of our AI assistant's functionality and user interface, highlighting the user interaction flow, interface design, and tasks executed.
Figure 5: The assistant greets the user based on the time of day and asks how it can assist.
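The time-of-day greeting shown in Figure 5 can be reproduced with a small helper. The function name, hour thresholds, and wording below are illustrative assumptions, not the paper's actual code; accepting an explicit hour keeps the function easy to test.

```python
import datetime
from typing import Optional


def greet(hour: Optional[int] = None) -> str:
    # Pick the salutation from the current hour, or from an
    # explicitly supplied one.
    if hour is None:
        hour = datetime.datetime.now().hour
    if hour < 12:
        salutation = "Good morning"
    elif hour < 18:
        salutation = "Good afternoon"
    else:
        salutation = "Good evening"
    return f"{salutation}! How can I assist you?"
```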
Figure 6: Controlling video playback on YouTube; the assistant allows the user to perform actions like pausing, playing, seeking backward or forward, toggling full screen, muting, etc. Here, on the user's request to "pause" the video, the assistant pauses it.
Figure 7: On saying "WhatsApp", the assistant enables the user to send WhatsApp messages to specified contacts by dictating the message via voice input.
Figure 8: As can be seen, the WhatsApp message is sent to the specified person.
Figure 9: On saying "Wikipedia (query)", the assistant gives the relevant information for the given query.
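The "Wikipedia (query)" command of Figure 9 implies a small parsing step before the lookup: strip the trigger word, then hand the remainder to a search library. The function below is an illustrative sketch of that parsing only; the actual lookup (e.g. via a third-party Wikipedia package) is omitted, and the function name is an assumption.

```python
def extract_wikipedia_query(command: str) -> str:
    """Return the search term from a 'Wikipedia <query>' utterance.

    Returns an empty string when the trigger word is absent, so the
    caller can fall through to other intents.
    """
    words = command.strip().split()
    if words and words[0].lower() == "wikipedia":
        return " ".join(words[1:])
    return ""
```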
www.irjmets.com @International Research Journal of Modernization in Engineering, Technology and Science
e-ISSN: 2582-5208
International Research Journal of Modernization in Engineering Technology and Science
( Peer-Reviewed, Open Access, Fully Refereed International Journal )
Volume:06/Issue:03/March-2024 Impact Factor- 7.868 www.irjmets.com
V. CONCLUSION
AI desktop assistants are revolutionizing human-machine interactions, offering benefits across sectors like
customer service, healthcare, and education. They streamline tasks, automate processes, and provide
personalized support, enhancing productivity and efficiency. In customer service, they handle inquiries and
provide personalized assistance. In healthcare, they automate administrative tasks, aid in diagnosis and
treatment planning, and facilitate communication. In education, they offer personalized learning experiences.
Beyond these sectors, AI desktop assistants drive innovation across finance, manufacturing, and research and development. Their versatility and ability to adapt to user needs make them significant catalysts for progress in the digital age.
VI. FUTURE SCOPE
While AI desktop assistants have already demonstrated their value, there are several avenues for future development and enhancement to fully unlock their potential, considering the system's limitations, security, and stability. Here are some suggestions for future scope:
• Offline Functionality: Enhances utility and accessibility in diverse environments without an internet connection.
• Enhanced Security Features: Implements encryption, multi-factor authentication, and biometric recognition for user data protection.
• AI Model Training: Continuously refines AI models for improved accuracy, responsiveness, and natural language understanding.
• Voice Recognition Improvement: Invests in research and development for better understanding of
diverse accents and environments.
• Integration with Smart Home Devices: Enables seamless control and automation of connected devices.
• User Feedback Mechanisms: Gathers feedback for continuous improvement of features, usability, and user experience.
By addressing these areas for future development, we can further elevate the capabilities, security, and user
experience of AI desktop assistants, ensuring their continued relevance and effectiveness in meeting the
evolving needs of users across various domains.
Divisha Pandey, Afra Ali, Shweta Dubey, Muskan Srivastava, Shyam Dwivedi, Md. Saif Raza (2022). ”Voice
Assistant Using Python and AI.”
ACKNOWLEDGMENT
First, we would like to thank Professor Rajesh Kolte, Head of Department (Data Science), and our guide, Ms. Aarti Dharmani, for her valuable guidance and continuous support during the project, and for her patience, motivation, enthusiasm, and immense knowledge. Her direction and mentoring helped us to work successfully on the project topic.
Our sincere gratitude to Dr. Yogesh Nerkar, Principal (Usha Mittal Institute of Technology) for his valuable
encouragement and insightful comments.
We would also like to thank all the teaching and non-teaching staff for their valuable support.
Last but not least, we would like to thank our parents and friends.
VII. REFERENCES
[1] Leandro Tibola, Liane Margarida Rockenbach Tarouco (2013). "Interoperability in Virtual World."
[2] Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever (2018). "Improving Language Understanding by Generative Pre-Training."
[3] Deepak Shende, Ria Umahiya, Monika Raghorte, Aishwarya Bhisikar, Anup Bhange (2019). ”AI Based
Voice Assistant Using Python.”
[4] Ravivanshikumar Sangpal, Tanvee Gawand, Sahil Vaykar, Neha Madhavi (2019). "JARVIS: An interpretation of AIML with integration of gTTS and Python."
[5] Rajat Sharma, Adweteeya Dwivedi (2022). "JARVIS - AI Voice Assistant."