Alen
ABSTRACT
Further enhancing its utility, the assistant can perform web searches directly in
Google Chrome, streamlining the process of information retrieval. It also supports
launching commonly used applications such as Notepad, VS Code, and Command
Prompt, thereby improving workflow efficiency. The integration with YouTube allows
the assistant to search for and play videos, catering to the user’s entertainment needs.
TABLE OF CONTENTS
1 Introduction
2 Objectives
4 Methodologies
7 References
8 Conclusion
In addition to weather updates, Alen is equipped to fetch and narrate jokes and
random pieces of advice. These features, powered by the JokeAPI and Advice Slip API,
respectively, are designed to offer users a blend of entertainment and wisdom, making
interactions with the assistant more engaging and enjoyable.
This voice assistant project not only exemplifies the practical applications of
modern AI technologies but also highlights the potential for future enhancements. With
its scalable design and robust feature set, Alen serves as a testament to the seamless
integration of AI into everyday life, paving the way for more advanced and personalized
digital assistants in the future.
To enhance the utility of the voice assistant, the project aims to integrate various
Application Programming Interfaces (APIs) to fetch real-time information. This includes:
The project aims to design the system architecture to be easily scalable, allowing
for the addition of new features and functionalities in the future. This involves:
By achieving these objectives, the project aims to deliver a versatile and reliable
voice assistant that not only simplifies daily tasks but also enriches the user's digital
experience through robust AI and NLP technologies.
3.2 Libraries
pyttsx3: This library is used for text-to-speech conversion. It allows the voice
assistant to communicate with users by converting text responses into spoken words,
ensuring clear and natural interaction.
SpeechRecognition: This library is utilized for capturing and processing spoken
language. It converts the user's speech into text, which the voice assistant can then
interpret and respond to accordingly.
requests: This library facilitates easy HTTP requests to interact with APIs. It is
used to fetch real-time data from various online services, such as weather updates and
jokes.
youtubesearchpython: This library is used to search for YouTube videos based
on user queries. It retrieves video results and helps in playing the most relevant
videos, enhancing the multimedia capabilities of the assistant.
webbrowser: This standard library module is used to open URLs in the default
web browser. It allows the assistant to perform web searches and open web pages
based on user commands.
subprocess: This module is used to spawn new processes, connect to their
input/output/error pipes, and obtain their return codes. It helps in launching various
applications like Notepad, VS Code, and Command Prompt.
os: This module provides a way to interact with the operating system. It is used for
miscellaneous operating system interfaces, such as executing system commands.
JokeAPI: This API is used to fetch jokes. It enhances the user experience by providing
entertainment and humor through random jokes.
Advice Slip API: This API is utilized to fetch random pieces of advice. It offers users
insightful and interesting tips, adding value to their interactions with the assistant.
Visual Studio Code (VS Code) is the chosen integrated development environment (IDE)
for this project. VS Code is highly versatile and supports a wide range of programming
languages and extensions, making it an excellent tool for development. Key features
include:
Extensibility
Integrated Terminal
Debugging Tools
Version Control Integration
4.1.2 Details:
4.2.2 Details:
4.3.2 Details:
The system integrates multiple APIs to retrieve real-time data and provide
dynamic responses:
4.4.2 Details:
4.5 Workflow
4.5.1 Speech Input: The user speaks a command, which is captured by the
microphone.
4.5.3 NLP Processing: The text is analyzed to determine the user's intent and the
appropriate action to take.
4.5.5 Text-to-Speech: The pyttsx3 library converts the textual response into
spoken words, which are then communicated to the user.
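The workflow steps can be sketched as a single pass through the pipeline. This is a minimal illustration rather than the project's actual code: the function name handle_utterance and its four callable parameters are hypothetical stand-ins for the speech input, NLP, action, and text-to-speech stages, so the flow can be followed without a microphone or speaker.

```python
def handle_utterance(listen, interpret, act, speak):
    """Run one pass of the workflow: capture, interpret, act, respond aloud."""
    text = listen()               # speech input, already converted to text
    if not text:
        # Nothing was recognized, so ask the user to repeat
        speak("I'm sorry, I didn't catch that.")
        return None
    intent = interpret(text)      # NLP step: map the text to an intent
    response = act(intent, text)  # carry out the action, get a textual reply
    speak(response)               # text-to-speech step
    return response


# Stand-in stages (assumptions for illustration only)
spoken = []
handle_utterance(
    listen=lambda: "tell me a joke",
    interpret=lambda text: "joke" if "joke" in text else "unknown",
    act=lambda intent, text: f"Handling intent: {intent}",
    speak=spoken.append,
)
```

Passing the stages in as callables keeps each one replaceable, which fits the scalability objective: a new recognizer or speech engine can be swapped in without touching the control flow.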
CHAPTER 5
DEPT. OF CSE AGMRCET Varur,Hubli 8 page
Mini-project 2k24-25
FUTURE ENHANCEMENTS
The Alen Voice Assistant project, while already robust and functional, has
significant potential for further enhancements. Future developments could focus on
improving user interaction, expanding functionalities, and integrating advanced
technologies to create an even more versatile and intelligent assistant. Here are some
potential areas for future enhancements:
IoT Device Control: Enable the assistant to control smart home devices such
as lights, thermostats, and security systems, providing users with a comprehensive
smart home management solution.
Routine Automation: Allow users to set up routines and automate tasks, such
as turning off lights and locking doors at a specified time.
Data Encryption: Ensure all user data and interactions are encrypted to
maintain privacy and security.
Voiceprint Authentication: Use voiceprint authentication to verify user
identity and enhance security.
Music and Podcast Integration: Allow the assistant to play music and
podcasts from various streaming services, creating a richer multimedia
experience.
Visual Display Support: Integrate with smart displays to provide visual
feedback and enhance user interaction through multimedia presentations.
CHAPTER 6
CODE ORGANIZATION
Alen Voice Assistant is designed to capture user commands, process them, and
execute the appropriate actions. It comprises the following components:
6.1 Speech Recognition
Explanation: After converting speech to text, basic NLP techniques are employed to
understand the user's intent. This involves parsing the text to identify keywords and
commands. The following structure represents how the commands are interpreted:
last_joke = None
while True:
    # take_command() is a placeholder name for the speech recognition
    # step (6.1); it is assumed to return the recognized text in lowercase.
    query = take_command()
    if "weather" in query:
        # Extract the city from the query
        city = query.replace("weather in", "").strip()
        weather_report = get_weather(city)
        print(weather_report)  # Output the weather information
        speak(weather_report)
    elif "search" in query:
        search_in_chrome(query)
        speak("Searching in Chrome...")
    elif "open" in query:
        open_application(query)
        speak("Opening application...")
    elif "youtube" in query:
        find_and_play_youtube_video(query)
        speak("Playing video on YouTube...")
    elif "joke" in query:
        # Fetch a new joke on the first request or when asked for another;
        # otherwise repeat the last one
        if "another" in query or last_joke is None:
            last_joke = get_joke()
        speak(last_joke)
    elif "advice" in query:
        advice = get_advice()
        speak(advice)
    elif "exit" in query or "bye" in query:
        speak("Goodbye!")
        break
    else:
        speak("I'm sorry, I didn't understand that.")
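The keyword matching can also be isolated into a small pure function, which makes the interpretation step easy to test without a microphone. The sketch below is an illustration under assumptions, not the project's code: detect_intent and INTENT_KEYWORDS are hypothetical names, and the keywords are checked in the same order as the if/elif chain above.

```python
import re

# (keyword, intent) pairs, checked in order; "bye" maps to the same
# intent as "exit"
INTENT_KEYWORDS = [
    ("weather", "weather"),
    ("search", "search"),
    ("open", "open"),
    ("youtube", "youtube"),
    ("joke", "joke"),
    ("advice", "advice"),
    ("exit", "exit"),
    ("bye", "exit"),
]


def detect_intent(query):
    """Return (intent, argument); intent is None when no keyword matches."""
    text = query.lower().strip()
    for keyword, intent in INTENT_KEYWORDS:
        if keyword in text:
            break
    else:
        return None, ""
    if intent == "weather":
        # Treat whatever follows "weather in" as the city name
        match = re.search(r"weather in\s+(.+)", text)
        return intent, match.group(1).strip() if match else ""
    if intent in ("search", "open", "youtube"):
        # Drop the keyword itself and keep the rest as the argument
        argument = re.sub(rf"\b{keyword}\b", "", text, count=1)
        return intent, " ".join(argument.split())
    return intent, ""


print(detect_intent("What's the weather in Hubli"))
print(detect_intent("open notepad"))
```

Because the function returns the argument separately, a query such as "what's the weather in Hubli" yields only the city name, whereas removing the phrase "weather in" with replace() would leave the leading words in place.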
6.3 Text-to-Speech (TTS)
import pyttsx3

def speak(text):
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()
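import requests

def get_weather(city):
    api_key = "your_api_key_here"
    url = "https://api.openweathermap.org/data/2.5/weather"
    # Pass the query as params so the city name is URL-encoded
    params = {"q": city, "appid": api_key, "units": "metric"}
    response = requests.get(url, params=params, timeout=10)
    data = response.json()
    # Compare "cod" as a string, since error responses may return it as text
    if str(data.get("cod")) == "200":
        weather_description = data["weather"][0]["description"]
        temperature = data["main"]["temp"]
        weather_report = (
            f"The weather today in {city} is {weather_description}. "
            f"The temperature is {temperature} degrees Celsius."
        )
        return weather_report
    # Fallback when the city is not found or the request is rejected
    return f"Sorry, I couldn't fetch the weather for {city}."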
JOKE:
def get_joke():
    url = "https://v2.jokeapi.dev/joke/Any"
    response = requests.get(url)
    if response.status_code == 200:
        joke = response.json()
        if joke["type"] == "single":
            return f"Here's a joke: {joke['joke']}"
        elif joke["type"] == "twopart":
            return f"Here's a joke: {joke['setup']} ... {joke['delivery']}"
    # Reached when the request fails or the joke type is not recognized
    return "Sorry, I couldn't fetch a joke at the moment."
ADVICE:
def get_advice():
    url = "https://api.adviceslip.com/advice"
    response = requests.get(url)
    if response.status_code == 200:
        advice = response.json()
        return f"Here's some advice: {advice['slip']['advice']}"
    else:
        return "Sorry, I couldn't fetch any advice at the moment."
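import webbrowser
from urllib.parse import quote_plus

def search_in_chrome(query):
    query = query.replace("search", "").strip()
    # URL-encode the query so spaces and symbols survive in the URL
    url = f"https://www.google.com/search?q={quote_plus(query)}"
    # webbrowser.open uses the system default browser, which may not be Chrome
    webbrowser.open(url)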
from youtubesearchpython import VideosSearch

def find_and_play_youtube_video(query):
    query = query.replace("youtube", "").strip()
    videos_search = VideosSearch(query, limit=1)
    results = videos_search.result()
    if results['result']:
        video_url = results['result'][0]['link']
        webbrowser.open(video_url)
    else:
        print("No video results found.")
import os
import subprocess

def open_application(query):
    applications = {
        "notepad": "notepad.exe",
        "vs code": "code",  # Just use 'code' as the command
        "cmd": "cmd.exe",  # Command Prompt
    }
    query = query.replace("open", "").strip().lower()
    if query in applications:
        try:
            if query == "vs code":
                # 'code' is run through the shell with os.system
                os.system(applications[query])
            else:
                subprocess.Popen(applications[query])
        except Exception as e:
            print(f"Error opening {query}: {e}")
    else:
        print("Application not found.")
CHAPTER 7
REFERENCES
SpeechRecognition Library:
pyttsx3 Library:
OpenWeatherMap API:
JokeAPI:
"Advice Slip JSON API." Advice Slip, accessed July 28, 2024,
https://api.adviceslip.com/.
youtubesearchpython Library:
CHAPTER 8
OUTPUT IMAGE