AI ChatBot Final
“AI CHATBOT”
(Session : 2023-2025)
ACKNOWLEDGEMENT
Many people helped, directly or indirectly, in the successful completion of our
mini project.
We would like to thank the Dean of USCS, Dr. Sonal Sharma, for providing us with all the
necessary resources required to complete the project. We are profoundly thankful to the Head of
the Department, Dr. Sameer Dev Sharma, for their valuable guidance.
We are very thankful to our project guide, Mr. Parminder Singh, who has been an inspiring
guide and committed caretaker, for his unflinching devotion. His encouragement and support,
especially in carrying out this project, motivated us to complete it.
Finally, we are very much indebted to our parents for their moral support and encouragement to
achieve our goals. We have no words to express our gratitude to our parents, who have shown
us this world, for every support they gave us.
Akshit Goyal
Roll No. 09
Harshit Sharma
Roll No. 29
Nikhil Panwar
Roll No. 62
DECLARATION
We hereby declare that the project report entitled “AI ChatBot” is submitted by us
to Uttaranchal School of Computing Sciences. The project was done under the
guidance of Mr. Parminder Singh. We further declare that the work reported in
this project has not been submitted and will not be submitted, either in part or
in full, for the award of any other degree or diploma in this university or any
other university or institute.
Akshit Goyal
Roll No. 09
Harshit Sharma
Roll No. 29
Nikhil Panwar
Roll No. 62
CERTIFICATE OF ORIGINALITY
This is to certify that the project entitled “AI CHATBOT” by Akshit Goyal, Nikhil Panwar
and Harshit Sharma has been submitted in the partial fulfilment of the requirements for the
award of the degree of MCA from Uttaranchal University, Dehradun. The results embodied in
this project have not been submitted to any other University or Institution for the award of any
degree.
CHAPTER 1
INTRODUCTION
Chatbots are software applications used to conduct online chat conversations via text or
text-to-speech rather than direct contact with a real human agent. A chatbot must convincingly
simulate how a person behaves as a conversational partner. Bots can be built using languages
such as AIML (Artificial Intelligence Mark-up Language), an XML-based language that allows
developers to write the rules that the bot must follow[13]. There are two categories of
chatbots. The first category is command-based chatbots, which rely on a database of
answers to generate a response. The user has to be very specific when asking questions for the
bot to answer. These bots can therefore answer only a limited number of questions and cannot
perform any function outside their code. The second category is chatbots powered by AI or
machine learning. Their algorithms allow these bots to answer loosely worded questions, so users
do not have to be specific when asking. These bots create responses to user
requests using natural language processing (NLP)[1].
Figure 1 shows how a chatbot works. Every time the user asks a question, the bot
first analyzes the request, then identifies the intent and entity, builds a response, and sends the
response to the user. Here, intent means the intention of the query and entity means the details of
the query. For example, if a student wants to know the office hours of the college, the intent in
this case is office hours and the entity is the college.
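The intent and entity idea can be illustrated with a minimal keyword-matching sketch. This is not
the project's classifier, which is a trained neural network described later in the report. The
keyword tables and the analyze() helper below are hypothetical and exist only to show what the two
terms mean for the office-hours example.

```python
import re

# Hypothetical keyword tables for illustration; a trained model would replace them.
INTENT_KEYWORDS = {
    "office_hours": ["office hours", "timings"],
    "fees": ["fee", "fees", "tuition"],
}
ENTITY_KEYWORDS = ["college", "library", "hostel"]


def contains(text, phrase):
    """True when `phrase` occurs in `text` as whole words."""
    return re.search(r"\b" + re.escape(phrase) + r"\b", text) is not None


def analyze(query):
    """Return (intent, entity); either part is None when nothing matches."""
    text = query.lower()
    intent = next(
        (name for name, keys in INTENT_KEYWORDS.items()
         if any(contains(text, key) for key in keys)),
        None,
    )
    entity = next((e for e in ENTITY_KEYWORDS if contains(text, e)), None)
    return intent, entity


result = analyze("What are the office hours of the college?")
```

Here the query is reduced to an intent label and an entity label, which is the information the bot
needs before it can build a response.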
1.1 MOTIVATION
AI-powered chatbots are motivated by the need for traditional websites to provide a chat
feature in which a bot chats with users and resolves their queries. Live agents can
handle only two or three conversations at a time, whereas chatbots work without an upper limit,
which greatly increases the number of conversations handled. Also, if a school or business receives
a lot of requests, a chatbot on its website eases the burden on the support team. Chatbots
significantly improve response rates compared to a human support team. We also chose chatbots
because millennials prefer live chat over phone calls.
A chatbot offers a highly engaging, interactive marketing platform. Chatbots can also automate
repetitive tasks. A company or school receives the same requests many times a day, and the
support team has to respond to each query repeatedly. Above all, the most important advantage
of chatbots is that they are available 24/7, so the user can resolve a request at any time. All these
advantages of chatbots are the motivation for implementing the College Enquiry chatbot.
Before implementing the College Enquiry chatbot, we examined various existing chatbots
such as Amazon Alexa, Google Assistant, Siri and Bixby. To understand the
requirement for a chatbot, consider the Amazon shopping app as an example. In this app, customers
purchase items, but there is no information on how to return a product. To get this information,
customers must call and wait a long time to speak to a customer representative. This whole process
is cumbersome for the customer[17]. Therefore, Amazon created a chatbot to answer simple
customer requests.
Similarly, the College Enquiry Chatbot is designed to help students resolve their questions at
the click of a button. The main drawbacks
we found when using existing chatbots are a lack of personality and of conversational flow.
Another downside we found when researching chatbots is that the bots are designed to
follow a specific route and most likely cannot handle anything other than a previously defined
script. This means that if a query is not part of the predefined script, quite a few bots cannot
understand it even if it is the most basic type of query, which makes for a repetitive and
frustrating experience[6]. To resolve this issue, we identify the intent of
the query before responding to it, so as to give the best possible answer to the request.
1.2 PROBLEM STATEMENT
In this modern age of technology, people do not want to go to college and spend their time
asking for routine information. Traditional methods are generally slow. Universities have
different departments for this, but they still need chatbots to provide student information. At
present, to get the right answer, a student must follow certain guidelines and go through each
step of the process. It saves a lot of time if the user does not have to enter data manually to
obtain the information. It is efficient and time-saving if students can retrieve data with one click.
• College students face several problems with college information.
• Burden on students to get information.
• Lack of information about recently held programs.
• Wasting students' time to find out complete information.
• Need to visit person to person for required information.
a. Literature Survey: This is the first section of the project, where we discuss the
similar existing projects and literature that we surveyed for the project. We give
examples of the different methods and technologies that have been used for the
development of chatbots. The next part of this section states and explains the limitations
of these existing systems.
b. Software and Hardware Requirements: This section discusses the different kinds of
software used in our project and contains information about the different functional
and non-functional requirements of our project. Furthermore, we discuss the
minimum hardware specifications that are required for the chatbot to work efficiently.
c. Proposed System Design: This section contains a detailed review of the proposed
system design of our project. We have included various UML diagrams, namely a use
case diagram, class diagram, sequence diagram and activity diagram, along with a
full description of every diagram. This section provides a clear picture of the
architecture, flow, and functionality of our project.
d. Implementation and Testing: This section gives details about the implementation of our
project and the testing results. Every component of the project is explained clearly. The
application is then tested against all the functional requirements. Detailed explanations
and screenshots are provided for the same.
e. Conclusion and Future Scope: This section states and explains the final results that we
have obtained from our project. We also discuss a few additional features that can be
added to our project to make it more efficient and accurate and to give it a wider scope.
CHAPTER 2
LITERATURE SURVEY
A chatbot (or chatterbot) is software that interacts with human users. It is a virtual assistant that
can answer a series of user questions and provide the best possible answer [7]. In recent years,
the use of chatbots has expanded rapidly in various fields such as healthcare, marketing,
education and support systems.
Companies are developing several chatbots for both industrial solutions and research. The best
known are Apple Siri, Microsoft Cortana and Facebook M. These are just a few of the most
popular systems.
Chatbots were originally designed to entertain and mimic human conversations. This is still a
reason for the popularity of chatbot development, but as the technology has grown in
popularity, so have its uses. Chatbot technology is used for a variety of purposes,
including getting information, answering questions, helping with fact-based decision-making,
shopping assistants, museum guides, language partners, and education.
This matters especially in a world where tech-savvy students rely heavily on social media and
instant messaging platforms like Slack and Facebook Messenger. Chatbots have the potential to
provide students with standardized information on the fly. Using chatbots, it is possible to adapt
the pace at which a student learns without being too pushy.
1. Harshala Gawade, Prachi Vishe, Vedika Patil and Sonali Kolpe[2] designed a chatbot
that uses knowledge stored in a database. The proposed system features an online inquiry and
an online chatbot system. Development is done using various programming languages,
with user-friendly graphical interfaces for sending and receiving responses. It
makes use of SQL (Structured Query Language) for pattern matching.
2. Ms. Ch. Lavanya Susanna, R. Pratyusha, P. Swathi, P. Rishi Krishna and V. Sai Pradeep[3]
created a rule-based chatbot in which the user is provided with a set of categories
or questions to ask, and answers are provided to those questions only.
3. Hrushikesh Koundinya K, Ajay Krishna Palakurthi, Vaishnavi Putnala and Dr. Ashok
Kumar K [4] designed a chatbot using ML and Python. It is also a rule-based
chatbot: if the query matches the database, the response is
provided to the user; otherwise a predefined response is provided.
4. Gandhar Khandagale, Meghana Wagh, Pranali Patil and Prof. Satish Kuchiwale [5] created
a chatbot that displays a list of options to the user. The user inputs the number of the
option to be answered, and the chatbot provides a link to the college
institution on the user’s request.
CHAPTER 3
SOFTWARE & HARDWARE SPECIFICATIONS
CHAPTER 4
PROPOSED SYSTEM DESIGN
NATURAL LANGUAGE PROCESSING (NLP):
Natural language processing strives to build machines that understand and respond to text or
voice data—and respond with text or speech of their own—in much the same way humans
do.
What is natural language processing?
Natural language processing (NLP) refers to the branch of computer science, and more
specifically the branch of AI, concerned with giving computers the ability to understand text and
spoken words in much the same way human beings can. NLP combines computational
linguistics (rule-based modelling of human language) with statistical, machine learning, and
deep learning models. Together, these technologies enable computers to process human
language in the form of text or voice data and to ‘understand’ its full meaning, complete with
the speaker or writer’s intent and sentiment. NLP drives computer programs that translate text
from one language to another, respond to spoken commands, and summarize large volumes
of text rapidly, even in real time. There is a good chance you have interacted with NLP in the
form of voice-operated GPS systems, digital assistants, speech-to-text dictation software,
customer service chatbots, and other consumer conveniences. But NLP also plays a growing
role in enterprise solutions that help streamline business operations, increase employee
productivity, and simplify mission-critical business processes.
NLP use cases
Natural language processing is the driving force behind machine intelligence in many modern
real-world applications. Here are a few examples:
• Spam detection: You may not think of spam detection as an NLP solution, but the
best spam detection technologies use NLP's text classification capabilities to scan emails
for language that often indicates spam or phishing. These indicators can include
overuse of financial terms, characteristic bad grammar, threatening language,
inappropriate urgency, misspelled company names, and more. Spam detection is one
of a handful of NLP problems that experts consider 'mostly solved' (although you may
argue that this does not match your email experience).
• Social media sentiment analysis: NLP has become an essential business tool for
uncovering hidden data insights from social media channels. Sentiment analysis can
analyze language used in social media posts, responses, reviews, and more to extract
attitudes and emotions in response to products, promotions, and events. Companies
can use this information in product designs, advertising campaigns, and more.
• Text summarization: Text summarization uses NLP techniques to digest huge
volumes of digital text and create summaries and synopses for indexes, research
databases, or busy readers who do not have time to read full text. The best text
summarization applications use semantic reasoning and natural language generation
(NLG) to add useful context and conclusions to summaries.
Bots offer a new way to communicate with customers. With chatbots, we can capture
customers’ attention at just the right moment. Chatbots help businesses better understand
consumer issues and take action to address those issues[11]. One operator can serve one
customer at a time. Chatbots, on the other hand, can answer thousands of requests. Chatbots
operate within a pre-defined framework and rely on a single authoritative source within a
catalog of commands to answer questions, reducing the risk of confusion or inconsistency in
responses[12].
Before going deeper into the methodology, we need to know the following:
• Neural Network
• Bag-of-Words Model
• Lemmatization
NEURAL NETWORK: This is a deep learning algorithm that resembles the way neurons in
the brain process information (hence the name). It is often used to find patterns between
input features in a dataset and the corresponding outputs.
In the above figure, purple circles represent the input vector x_i, where i = 1, 2, ..., D; these
are just the features of the data set. Blue circles are hidden layer neurons. These are the layers that
learn the mathematics required to relate inputs to outputs. Finally, the pink circles
make up the output layer. The dimensionality of the output layer depends on the number of
different classes used. For example, say you have a 5x4 dataset with 5 input vectors, each with
values for 4 features (A, B, C, D). Suppose you want to classify each row as good or bad
and use the number 0 to represent good and 1 to represent bad. The neural network then has 4
neurons in the input layer and 2 neurons in the output layer.
This step connects the input layer to the output layer through a series of hidden layers. The first
layer of neurons (l = 1) receives the weighted sum of the elements of the input vector (x_i) along
with the bias term b. Each neuron then transforms the weighted sum a that it receives, using
a differentiable nonlinear activation function h(•), to produce the output z.
For neurons in subsequent layers, the weighted sum of the outputs of all neurons in the previous
layer is passed as input along with the bias term. The neurons of subsequent layers transform the
input they receive using the activation function.
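The two paragraphs above can be written out as equations. The formulas from the original figure
are not reproduced in this text, so the following is a reconstruction in conventional notation. The
weight symbol w and the layer superscripts are assumptions; a, b, h, z, x_i, D, l and L are the
symbols used in the surrounding text.

```latex
% First hidden layer (l = 1): weighted sum of the D inputs plus bias, then activation
a_j^{(1)} = \sum_{i=1}^{D} w_{ji}^{(1)} x_i + b_j^{(1)}, \qquad z_j^{(1)} = h\left(a_j^{(1)}\right)

% Subsequent layers (l = 2, \dots, L): weighted sum of the previous layer's outputs plus bias
a_j^{(l)} = \sum_{k} w_{jk}^{(l)} z_k^{(l-1)} + b_j^{(l)}, \qquad z_j^{(l)} = h\left(a_j^{(l)}\right)
```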
This process continues until the outputs of the neurons in the last layer (l = L) are evaluated.
These neurons in the output layer are responsible for identifying the class to which the input
vector belongs. Input vectors are tagged with the class whose corresponding neuron has the
highest output value.
Activation functions may differ from layer to layer. The two activation
functions used in our chatbot are the Rectified Linear Unit (ReLU) function and the softmax
function. The former is used for the hidden layers and the latter for the output layer. The softmax
function is usually used at the output because it gives a probabilistic output, that is, one
probability per class.
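The forward pass with these two activation functions can be sketched in plain Python. This is a
minimal illustration and not the project's Keras model: the layer sizes (4 inputs, 3 hidden
neurons, 2 classes) and all weight values are made up, and the helper names are hypothetical.

```python
import math


def relu(values):
    """Rectified Linear Unit applied element-wise."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(values)  # subtracting the maximum keeps exp() from overflowing
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """Weighted sum plus bias for every neuron in one layer."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def forward(x, hidden_w, hidden_b, out_w, out_b):
    """One hidden layer with ReLU followed by an output layer with softmax."""
    z = relu(dense(x, hidden_w, hidden_b))
    return softmax(dense(z, out_w, out_b))


# Illustrative weights only.
hidden_w = [[0.5, -0.2, 0.1, 0.0], [0.3, 0.8, -0.5, 0.2], [-0.6, 0.1, 0.4, 0.7]]
hidden_b = [0.0, 0.1, -0.1]
out_w = [[1.0, -1.0, 0.5], [-0.5, 0.7, 0.2]]
out_b = [0.0, 0.0]

probs = forward([1.0, 0.0, 1.0, 0.0], hidden_w, hidden_b, out_w, out_b)
predicted_class = probs.index(max(probs))  # class with the highest output value
```

The input vector is tagged with the class whose output neuron has the highest probability, which
is what predicted_class holds.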
This step is the most important. The job of a neural network algorithm is to find the
set of weights and biases for all layers that gives the correct output, and that is the whole
purpose of this step. Imagine an input vector is passed to the network and we know that it
belongs to class A. Suppose the output layer gives the highest value for class B. Our
prediction is therefore wrong. Since we can compute the error only at the output, we need to
propagate that error backwards to learn the correct set of weights and biases.
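A minimal sketch of one such correction step is shown below for a single softmax output layer
trained with cross-entropy loss. It is an illustration under stated assumptions, not the project's
training code: the sgd_step() helper, the learning rate of 0.1 and the sample data are all
hypothetical. Propagating the error further back through hidden layers applies the chain rule in
the same way and is omitted here.

```python
import math


def softmax(scores):
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sgd_step(weights, biases, x, target, learning_rate=0.1):
    """Apply one gradient-descent update in place and return the loss before it.

    `target` is a one-hot list. For softmax combined with cross-entropy, the
    error at output neuron j is (probability_j - target_j).
    """
    scores = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    probs = softmax(scores)
    loss = -sum(t * math.log(p) for t, p in zip(target, probs))
    for j, (p, t) in enumerate(zip(probs, target)):
        error = p - t
        for i, xi in enumerate(x):
            weights[j][i] -= learning_rate * error * xi
        biases[j] -= learning_rate * error
    return loss


weights = [[0.0, 0.0], [0.0, 0.0]]
biases = [0.0, 0.0]
x, target = [1.0, 2.0], [1.0, 0.0]  # this input belongs to class A (index 0)
losses = [sgd_step(weights, biases, x, target) for _ in range(20)]
```

Each call nudges the weights so that the probability of the correct class rises, so the recorded
loss shrinks over the 20 repetitions.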
ii. PREPROCESS DATA: When working with text data, it is necessary to perform
various preprocessing operations on the data before building a machine learning or deep
learning model. Which operations to apply depends on the requirements.
Tokenization is the most basic and the first thing you can do with
text data. Tokenization breaks text into smaller word-like pieces. Here we iterate over
the patterns, tokenize each sentence using the nltk.word_tokenize() function, and add each
word to the word list. We also create a list of classes for the tags.
Then we lemmatize each word and remove duplicate words from the list. Lemmatization
is the process of converting a word into its lemma form. Finally, we create a pickle file
to store the Python objects used in making predictions.
iii. MAKING THE DATA MACHINE-FRIENDLY: In this step, we will convert our
text into numbers using the bag-of-words (bow) model. The two lists words and classes
act as a vocabulary for patterns and tags respectively. We’ll use them to create an array
of numbers of size the same as the length of vocabulary lists. The array will have values
1 if the word is present in the pattern/tag being read and 0 if it is absent. The data has
thus been converted into numbers and stored in two arrays.
iv. BUILDING THE NEURAL NETWORK MODEL: Now we create a neural network
using the Keras Sequential model. The input to this network is the array created in the
previous step. It passes through a model of 3 layers,
the first having 128 neurons, the second having 64 neurons, and the last having
the same number of neurons as the length of the classes array. Next, to reach the correct
weights, we have chosen the SGD optimizer and defined our error function using the
categorical cross-entropy function. The metric we have chosen is accuracy. We
train the Python chatbot model for about 200 epochs so that it reaches the desired
accuracy. We have also used a Dropout layer, which helps prevent overfitting
during training.
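The bag-of-words conversion described in step iii can be sketched as follows. This is a minimal
illustration: the vocabulary, the tags and the helper names bag_of_words() and one_hot() are
made up for the example and are not taken from the project's intents file.

```python
def bag_of_words(tokens, vocabulary):
    """Return a 0/1 list with one slot per vocabulary word."""
    present = set(tokens)
    return [1 if word in present else 0 for word in vocabulary]


def one_hot(tag, classes):
    """Return a 0/1 list marking the position of `tag` in `classes`."""
    return [1 if c == tag else 0 for c in classes]


# Illustrative vocabulary and tags.
words = ["fee", "hour", "library", "office", "what"]
classes = ["fees", "office_hours"]

pattern_tokens = ["what", "office", "hour"]   # an already tokenized, lemmatized pattern
train_x = bag_of_words(pattern_tokens, words)
train_y = one_hot("office_hours", classes)
```

Both arrays have the same length as their vocabulary list, which is why the input layer size
follows the words list and the output layer size follows the classes list.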
A class diagram is a static diagram. It represents the static view of an application. A class
diagram is used not only for visualizing, describing, and documenting different aspects of a
system but also for constructing executable code of the software application.
A class diagram describes the attributes and operations of a class and also the constraints
imposed on the system. Class diagrams are widely used in the modeling of object-oriented systems
because they are the only UML diagrams that can be mapped directly to object-oriented
languages.
Figure 4.2.1 represents the class diagram. There are 4 classes in the system: ChatBot, NLP
Techniques, Data Cleaning and Preprocessing, and User Interface. The ChatBot class is responsible
for loading the data from the intents file, preprocessing it, and training the model. Data Cleaning
and Preprocessing preprocesses the input text and predicts the response. NLP
Techniques controls the lemmatization of text, and finally the User Interface is responsible for
extracting the input data and displaying the response to the user.
4.3 USE CASE DIAGRAM
A use case diagram is used to represent the dynamic behavior of a system. It encapsulates the
system's functionality by incorporating use cases, actors, and their relationships. It models the
tasks, services, and functions required by a system/subsystem of an application. It depicts the
high-level functionality of a system and also tells how the user handles a system.
The above diagram shows that there are 3 actors in the proposed system: User, Admin and
Chatbot. The User is responsible for asking the query and viewing the response. The Admin is
responsible for asking the query, viewing the response, and adding, deleting, updating and viewing
information in the database. Lastly, the Chatbot is responsible for processing the query and
responding with the most suitable answer.
4.4 ACTIVITY DIAGRAM
We use Activity Diagrams to illustrate the flow of control in a system and refer to the steps
involved in the execution of a use case. We model sequential and concurrent activities using
activity diagrams. So, we basically depict workflows visually using an activity diagram. An
activity diagram focuses on condition of flow and the sequence in which it happens. We
describe or depict what causes a particular event using an activity diagram. An activity
diagram portrays the control flow from a start point to a finish point showing the various
decision paths that exist while the activity is being executed.
4.5 SEQUENCE DIAGRAM
The sequence diagram represents the flow of messages in the system and is also termed an
event diagram or event scenario. It helps in envisioning several dynamic scenarios. It portrays the
communication between any two lifelines as a time-ordered sequence of events in which these
lifelines take part at run time. In UML, a lifeline is drawn as a vertical dashed line
descending from its participant, whereas messages are drawn as horizontal arrows between
lifelines. The diagram can show iterations as well as branching. A sequence diagram simply depicts
interaction between objects in sequential order, i.e. the order in which these interactions
take place. Sequence diagrams describe how and in what order the objects in a system function.
These diagrams are widely used by business professionals and software developers to document and
understand requirements for new and existing systems.
Figure 4.5.1 represents the sequence diagram, where we have 3 lifelines: User, Chatbot and
Database. The user sends the query, which is received by the chatbot. Initially, the chatbot
pre-processes the data and then trains the model with the help of the database. The chatbot then
preprocesses the query and searches for the response in the database. The chatbot in turn returns
the response to the user.
This section describes the overall architecture of the proposed system. The main purpose of
this chatbot is to respond to user queries without manpower. Users can use the chatbot in any web
browser. Whenever the user makes a request, the chatbot receives the request, analyses it, and
responds to the user in return. This analysis makes use of various machine learning algorithms.
The queries are defined with certain tags for each set. A tag is nothing but a keyword that
helps the chatbot analyze the user request. After analysis, the chatbot replies to the user with the
required response. When the user's request is unclear to the chatbot, the response is a standard
message defined by the developer. Almost all user questions are clearly answered. Only rare
cases are exceptional[1].
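One way to decide when a request counts as unclear is to compare the model's highest class
probability against a threshold. The sketch below assumes this approach. The threshold of 0.6, the
fallback wording, the sample responses and the helper names are all hypothetical and are not taken
from the project.

```python
FALLBACK = "Sorry, I did not understand that. Could you rephrase?"  # assumed wording


def pick_tag(probabilities, classes, threshold=0.6):
    """Return the most likely tag, or None when the model is not confident enough."""
    best = max(range(len(classes)), key=lambda i: probabilities[i])
    return classes[best] if probabilities[best] >= threshold else None


def reply(probabilities, classes, responses):
    """Map the model output to a response, using the standard message when unsure."""
    tag = pick_tag(probabilities, classes)
    return FALLBACK if tag is None else responses[tag]


# Illustrative tags and responses.
classes = ["fees", "office_hours"]
responses = {
    "fees": "Fee details are available from the accounts office.",
    "office_hours": "The office is open on working days.",
}

clear_answer = reply([0.1, 0.9], classes, responses)
unclear_answer = reply([0.5, 0.5], classes, responses)
```

A higher threshold makes the bot fall back to the standard message more often, while a lower one
makes it answer more queries at the risk of answering the wrong tag.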
4.7 TECHNOLOGY DESCRIPTION
1. FRONT END
HTML & CSS
HTML stands for Hyper Text Markup Language, the web's most popular language for
developing web pages. In order to provide the user with an easy and responsive user interface,
we have created the web page using HTML to place various elements such as buttons and text
fields.
CSS is the language we use to style an HTML document. CSS describes how HTML elements
should be displayed. We use CSS to control text color, font style, spacing between paragraphs,
column size and layout, background images or colors, layout design, display variations
on different devices and screen sizes, and a host of other effects.
The HTML elements we have used in the project are:
• The <input> element is the most important form element. It is the tag that specifies
an input field where the user can enter a query.
• The <button> tag defines a clickable button. After entering the query in the input field,
the user clicks the button and the query is sent to the server code to be processed.
• The <div> tag is used to group a large section of HTML elements together. Here we
place the entire chat container in a div tag. By wrapping a set of elements in a div tag,
you can take advantage of CSS styles to apply font styles to all paragraphs at once,
rather than coding the same style for each paragraph element.
• The <p> tag is used to define paragraphs in web pages.
We make use of attributes such as id and class to access various elements of the web page.
JAVASCRIPT
jQuery is a popular JavaScript library used for HTML DOM manipulation, event
handling, animations, and Ajax. Many tasks that would take many lines of JavaScript
code can be done with a single line of jQuery code, because jQuery wraps those
common tasks into methods. Until the document is "ready", the page cannot be safely
manipulated. jQuery detects this ready state for us. Code included inside $(document).ready()
will only run once the page's Document Object Model (DOM) is ready for JavaScript code to
execute. Whenever the button is clicked after a query is entered, a POST request is sent to the
server along with the data, which includes the question that the user asked. On a successful
request, the result is stored in a variable and displayed in the browser.
2. BACK END
PYTHON
• random
• json
• numpy
• pickle
• nltk
• tensorflow.keras
random: random is a built-in Python module used to generate random numbers. These
are pseudo-random numbers, which are not completely random. In our project we
use the shuffle method of this module, random.shuffle(), to pick a
random response from the list of responses after classifying the intent.
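A minimal sketch of picking a response this way is shown below. The sample responses are made
up. Note that random.shuffle() reorders a list in place and returns None, so the response has to
be read from the shuffled list afterwards; random.choice() from the same module is shown as a
shorter alternative.

```python
import random

responses = ["Hello!", "Hi there, how can I help?", "Good to see you."]  # sample data

# random.shuffle reorders the list in place and returns None,
# so shuffle a copy and take its first element.
shuffled = list(responses)
random.shuffle(shuffled)
answer = shuffled[0]

# random.choice picks one element directly without reordering anything.
alternative = random.choice(responses)
```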
json: JSON stands for JavaScript Object Notation. It is a syntax for storing and exchanging
data. From this module we use the loads method, json.loads(), to parse the data read
from the text file. This method converts the JSON data into a Python dictionary.
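A minimal sketch of this conversion is shown below. The JSON text stands in for the contents of
the intents file; the key names "intents", "tag", "patterns" and "responses" are assumptions about
its layout and are not confirmed by this report.

```python
import json

# A made-up fragment standing in for the text read from the intents file.
raw_text = """
{"intents": [
  {"tag": "greeting",
   "patterns": ["Hi", "Hello there"],
   "responses": ["Hello!", "Hi, how can I help?"]}
]}
"""

data = json.loads(raw_text)   # JSON text -> Python dictionary
first = data["intents"][0]    # nested objects become nested dicts and lists
```

json.loads() works on a string that is already in memory; json.load() from the same module reads
directly from an open file object instead.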
numpy: NumPy stands for Numerical Python. It is a Python library for working with arrays.
Python has lists, which serve the purpose of arrays, but they are slow to process.
NumPy aims to provide array objects that are up to fifty times faster than traditional Python
lists. From this module we use the array method, numpy.array(), which converts
any array-like Python object into an ndarray.
pickle: The Python pickle module is used to serialize and deserialize Python object structures.
Most Python objects can be pickled so that they can be saved to disk. Pickle first "serializes" the
object before writing it to the file. Pickling is a way to convert a Python object (list, dict, etc.)
to a byte stream. The idea is that this byte stream contains all the information needed to
reconstruct the object in another Python script. From this module we use two methods,
pickle.dump() and pickle.load(). pickle.dump() is used to store the object data in a file.
To retrieve pickled data, we use pickle.load().
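A minimal round trip with these two methods looks like the sketch below. The word list and the
file name words.pkl are illustrative; the file is written to a temporary folder so the example
leaves the working directory untouched.

```python
import os
import pickle
import tempfile

words = ["fee", "hour", "office"]          # sample vocabulary

folder = tempfile.mkdtemp()
path = os.path.join(folder, "words.pkl")   # hypothetical file name

with open(path, "wb") as handle:           # pickle needs the file in binary mode
    pickle.dump(words, handle)

with open(path, "rb") as handle:
    restored = pickle.load(handle)         # rebuilds an equal list object
```

Only files from a trusted source should be unpickled, because loading a pickle can execute
arbitrary code.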
nltk: NLTK (Natural Language Toolkit) is a Python library with pre-built functions and utilities
for ease of use and implementation. It is one of the most widely used libraries in natural language
processing and computational linguistics. From this library we import WordNetLemmatizer from
nltk.stem for lemmatization of the data. Lemmatization is the process of removing inflection
from words. It reduces words to their base forms, which are themselves meaningful words.
After compiling and fitting the model, the model is stored in a .h5 file, which stores the data
in the Hierarchical Data Format 5.
CHAPTER 5
IMPLEMENTATION AND TESTING
This section describes the working of the system as a whole, with specific
focus on the software part of the chatbot and the predefined query data set. An algorithm of
the process, preceded by the design motive of the system, is also included.
The coding is done in Python, HTML, CSS and JavaScript. It uses many
libraries, such as NLTK, TensorFlow, NumPy and a few others. These libraries help
the chatbot to analyze the user request and decide the response to be given. Python also has a
package for chatbots, which helps in the development of a user-friendly
chatbot[1].
We extract the words from the patterns together with their corresponding tags. This is achieved
by iterating over each pattern using a nested for loop and tokenizing it
using nltk.word_tokenize. The words are stored in “words” and each pattern with its corresponding
tag is stored in “documents”.
For the list words, punctuation is excluded using a simple conditional
statement and the words are converted into their root words using
NLTK's WordNetLemmatizer(). This is an important step when writing a chatbot in Python, as
it saves a lot of time when we feed these words to our deep learning model. Finally,
both lists are sorted and any duplicates are removed.
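The extraction loop described above can be sketched as follows. To keep the example free of
external downloads, tokenize() and lemmatize() are simplified stand-ins for nltk.word_tokenize
and WordNetLemmatizer().lemmatize and do not reproduce their exact behaviour. The intents data
is made up, and the key names are assumptions about the file layout.

```python
import re
import string

# Made-up intents in the shape described above.
intents = {"intents": [
    {"tag": "greeting", "patterns": ["Hi!", "Hello there"]},
    {"tag": "office_hours", "patterns": ["What are the office hours?"]},
]}


def tokenize(sentence):
    """Stand-in for nltk.word_tokenize: words and punctuation as separate tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def lemmatize(word):
    """Stand-in for WordNetLemmatizer: lower-case and strip a plural 's'."""
    word = word.lower()
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


words, classes, documents = [], [], []
for intent in intents["intents"]:
    for pattern in intent["patterns"]:
        tokens = tokenize(pattern)
        words.extend(tokens)
        documents.append((tokens, intent["tag"]))   # pattern tokens with their tag
        if intent["tag"] not in classes:
            classes.append(intent["tag"])

# Drop punctuation, reduce to root form, remove duplicates, and sort.
words = sorted({lemmatize(w) for w in words if w not in string.punctuation})
classes = sorted(classes)
```

After the loop, words holds the vocabulary, classes holds the tags, and documents pairs every
tokenized pattern with its tag for building the training arrays.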
CHAPTER 6
CONCLUSION AND FUTURE SCOPE
The goal of the project is to reduce manpower and to respond to user queries at a faster rate. In
earlier days, users used to send a query by mail to the site administrator, and it would take
a few days for the site administrator to reply to the mails. Chatbots can overcome this delay:
a chatbot satisfies the user request or query immediately with relevant responses. These days
many websites of banks, educational institutions and business sectors have developed their own
chatbots to satisfy user requests faster. Chatbots are user-friendly artificial machines.
This project can be developed further by adding support for multiple languages and speech
recognition. We can add many more tags to the data set as the website develops. The chat history of a
particular user can be sent as a mail to him/her after the conversation is over. This can be done
by authorizing the users and collecting their mail IDs. This project is a small initiative to make
the website user-friendly and easily understandable by the user.
REFERENCES: