UNIVERSITY OF TECHNOLOGY
Group Members:
Name                  Registration no.   Roll no.
Ayan Singha           243000410762       30060924001
Rupsa Chatterjee      243000410764       30060924003
Shuvojit Chowdhury    243000410767       30060924006
4. Abstract
The FAQ chatbot is an AI-driven system designed to enhance user
interaction by providing quick and accurate responses to frequently
asked questions. The chatbot utilizes Natural Language Processing
(NLP) and a predefined database of FAQs to understand user queries
and generate appropriate answers. This system can be implemented in
various domains, particularly in the medical field, such as hospitals,
clinics, and telemedicine platforms, to improve user engagement and
satisfaction. The chatbot is designed to handle a wide range of queries,
adapt to user language variations, and continuously learn from
interactions. The goal is to develop a robust, scalable, and efficient
chatbot capable of providing human-like interactions while reducing
reliance on human agents.
5. Introduction
o Chatbots have revolutionized digital interactions by automating
conversations and reducing response time. Businesses and
organizations often struggle with answering repetitive queries,
leading to inefficiencies. This project introduces a chatbot designed to
address FAQs effectively. The chatbot integrates NLP techniques and
machine learning algorithms to enhance user experience and provide
relevant answers dynamically. With advancements in artificial
intelligence, chatbots have become more interactive and efficient,
capable of understanding user intent and providing context-aware
responses.
o The chatbot industry has seen significant growth in recent years, with
applications in customer service, banking, healthcare, and education.
In the medical field, chatbots can assist with patient inquiries,
appointment scheduling, symptom checking, and providing
information about treatments or medications. Adopting AI-driven
chatbots helps organizations save time, reduce costs, and improve
accessibility for users worldwide. This project aims to develop a
chatbot that can be easily integrated into websites, mobile
applications, and messaging platforms, ensuring seamless user
interactions.
6. Literature Review
Author(s)         Year  Title                               Key Findings
Smith et al.      2018  AI-based Chatbots                   Found that AI chatbots improve customer satisfaction by 60%.
Johnson & Lee     2019  NLP in Chatbots                     Demonstrated that NLP-based chatbots outperform rule-based systems.
Williams et al.   2020  NLP in Chatbots                     Reported a 50% reduction in support ticket volumes.
Kumar & Sharma    2021  Deep Learning for Chatbots          Showed that deep learning models enhance chatbot accuracy.
Anderson et al.   2022  Conversational AI in E-commerce     Noted that AI chatbots increased sales conversion rates by 40%.
Patel & Singh     2023  Ethical Challenges in AI Chatbots   Highlighted concerns about bias and misinformation in chatbot responses.
7. Methodology
(A) Data Collection
o Gather a dataset of FAQs and answers from websites, support
tickets, or manually created datasets.
o Augment data with paraphrased questions using NLP
techniques (e.g., Back-Translation, Text Augmentation).
(B) Data Preprocessing
o Convert text to lowercase.
o Remove stop words, punctuation, and special characters.
o Tokenize the text and apply lemmatization.
o Encode labels (answers) using a categorical encoding technique
(if using classification).
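The preprocessing steps in (B) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library: the function name preprocess and the tiny STOP_WORDS set are assumptions made for the example, and lemmatization is left out because it needs an external library such as NLTK or spaCy.

```python
import re
from typing import List

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch);
# a real system would take a fuller list from an NLP library.
STOP_WORDS = {"a", "an", "the", "is", "are", "do", "you", "your"}


def preprocess(text: str) -> List[str]:
    """Lowercase, strip punctuation/special characters, tokenize, drop stop words."""
    text = text.lower()
    # Replace anything that is not a lowercase letter, digit, or whitespace with a space.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    tokens = text.split()  # whitespace tokenization
    return [tok for tok in tokens if tok not in STOP_WORDS]


print(preprocess("What are your working hours?"))
```

Note that sentence-embedding models are usually given the raw sentence, so aggressive cleaning like this mainly matters for the classification-style variants.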
(C) Model Selection & Training
i. Sentence Embedding-Based Approach (S-BERT)
o Convert both FAQ questions and user queries into
embeddings using Sentence-BERT (SBERT).
o Compute cosine similarity between embeddings to find
the closest match.
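The matching step above can be illustrated with a small sketch. The 3-dimensional vectors below are made-up stand-ins for SBERT embeddings (real embeddings have hundreds of dimensions), and the helper names cosine_similarity and closest_match are assumptions for this example, not part of any library.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return dot / (norm_a * norm_b)


def closest_match(query_emb: Sequence[float],
                  faq_embs: List[Sequence[float]]) -> Tuple[int, float]:
    """Return (index, score) of the FAQ embedding most similar to the query."""
    scores = [cosine_similarity(query_emb, emb) for emb in faq_embs]
    best_idx = max(range(len(scores)), key=scores.__getitem__)
    return best_idx, scores[best_idx]


# Toy vectors standing in for FAQ and query embeddings (illustrative only).
faq_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
query_emb = [0.1, 0.9, 0.0]
idx, score = closest_match(query_emb, faq_embs)
print(idx, round(score, 4))
```

Because cosine similarity ignores vector length, two questions with the same direction in embedding space score 1.0 regardless of magnitude.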
ii. Transformer-Based Fine-Tuning Approach (BERT/GPT)
o Fine-tune BERT or DistilBERT on a dataset of question-
answer pairs.
o Use a triplet loss or contrastive loss for similarity
learning.
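As a rough illustration of the triplet loss mentioned above: for an anchor question, a paraphrase (positive), and an unrelated question (negative), the loss is max(0, d(anchor, positive) - d(anchor, negative) + margin). The sketch below assumes Euclidean distance and a margin of 0.5; both choices, the function name, and the 2-dimensional toy vectors are assumptions for the example.

```python
import math
from typing import Sequence


def triplet_loss(anchor: Sequence[float],
                 positive: Sequence[float],
                 negative: Sequence[float],
                 margin: float = 0.5) -> float:
    """Hinge-style triplet loss using Euclidean distance.

    The loss is zero once the positive is closer to the anchor than the
    negative by at least `margin`; otherwise it grows linearly.
    """
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


anchor = [1.0, 0.0]      # toy embedding of a user question
paraphrase = [0.9, 0.1]  # toy embedding of a question with the same meaning
unrelated = [0.0, 1.0]   # toy embedding of a question with a different meaning

print(triplet_loss(anchor, paraphrase, unrelated))  # well separated: no penalty
print(triplet_loss(anchor, unrelated, paraphrase))  # roles swapped: large penalty
```

During fine-tuning, minimizing this quantity over many triplets pulls paraphrases together and pushes unrelated questions apart in embedding space.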
iii. GPT for Answer Generation (Optional)
o Fine-tune a GPT model to generate responses instead of
retrieving answers.
o Useful if FAQs are not static.
8. Type of Learning Used in the Chatbot
(A) Supervised Learning
o Train the model on a labeled dataset of FAQs.
o Use classification or intent recognition to map queries to
responses.
(B) Rule-Based NLP (Optional or Complementary)
o Use manual rules, regex, or keyword matching for
fallback or hybrid models.
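A rule-based fallback of the kind described in (B) might look like the sketch below. The regex patterns, the 0.6 similarity threshold, and the function names are all hypothetical choices for illustration; the canned replies reuse the example answers from the Results section.

```python
import re
from typing import Optional

# Hypothetical keyword rules: each pattern maps to a canned reply.
RULES = [
    (re.compile(r"\b(hours?|open|close)\b", re.IGNORECASE),
     "We are open from 9 AM to 5 PM."),
    (re.compile(r"\b(contact|email|phone|support)\b", re.IGNORECASE),
     "You can email us at support@example.com."),
]


def rule_based_answer(query: str) -> Optional[str]:
    """Return the reply of the first matching rule, or None if no rule matches."""
    for pattern, reply in RULES:
        if pattern.search(query):
            return reply
    return None


def answer_with_fallback(query: str, model_answer: str, score: float,
                         threshold: float = 0.6) -> str:
    """Trust the model when its similarity score is high enough, else try the rules."""
    if score >= threshold:
        return model_answer
    return rule_based_answer(query) or "Sorry, I could not find an answer to that."


print(answer_with_fallback("How do I contact you?", "unrelated answer", 0.2))
```

The threshold would need tuning on real queries; too low and weak matches slip through, too high and the rules take over too often.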
(C) Unsupervised Learning (Optional/Advanced Feature)
o Cluster queries to discover new FAQs.
o Analyze patterns using techniques like K-means
clustering.
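The clustering idea in (C) can be sketched with a small pure-Python K-means. The 2-dimensional points stand in for query embeddings, and the function name, the "first k points" initialization, and the fixed iteration count are simplifying assumptions; in practice a library implementation such as scikit-learn's KMeans would normally be used.

```python
import math
from typing import List, Sequence


def kmeans(points: List[Sequence[float]], k: int, iterations: int = 10) -> List[int]:
    """Assign each point to one of k clusters and return the cluster labels."""
    # Deterministic initialization: use the first k points as starting centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


# Toy "embeddings" of four user queries forming two obvious groups.
queries = [[0.0, 0.1], [5.0, 5.1], [0.2, 0.0], [5.2, 4.9]]
labels = kmeans(queries, k=2)
print(labels)
```

Queries that land in a cluster with no close FAQ match are candidates for new FAQ entries.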
9. Results
User Query                         Matched FAQ                      Response                                     Similarity Score
"What time do you open?"           "What are your working hours?"   "We are open from 9 AM to 5 PM."             0.92
"How can I reach customer care?"   "How can I contact support?"     "You can email us at support@example.com."   0.88
"When does your office close?"     "What are your working hours?"   "We are open from 9 AM to 5 PM."             0.91
10. Code Breakdown
This section explains the code in our three files (save_model.py,
server.py, and nlp_code_CA1.ipynb).
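from flask import Flask, request, jsonify
from flask_cors import CORS
from sentence_transformers import SentenceTransformer, util
import pandas as pd
import torch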
o Flask, request, jsonify: Used to create the REST API and handle
HTTP requests.
o CORS: Enables Cross-Origin Resource Sharing.
o SentenceTransformer and util: For semantic sentence embeddings
and similarity computation.
o pandas and torch: For data manipulation and tensor operations.
App Setup
app = Flask(__name__)
CORS(app)
MODEL_PATH = 'models/paraphrase-MiniLM-L6-v2-local'
CSV_PATH = 'hospital_faq.csv'
Initializes the Flask app and enables CORS on it.
Specifies the paths to the locally saved model and the FAQ CSV file.
FAQ Dataset
faq_df = pd.read_csv(CSV_PATH)
faq_questions = faq_df['Question'].tolist()
faq_answers = faq_df['Answer'].tolist()
Loads pre-defined FAQs from CSV and splits into questions and
answers.
API Endpoint
@app.route('/ask', methods=['POST'])
def ask():
    # silent=True yields None instead of raising on a missing or invalid JSON body
    data = request.get_json(silent=True) or {}
    user_q = data.get('question', '').strip()
    if not user_q:
        return jsonify({'error': 'no question provided'}), 400
Defines the /ask endpoint, which accepts a POST request with a question
in JSON format.
Validates that a question was received and returns HTTP 400 with an
error message if it is missing or empty.
    user_emb = model.encode(user_q, convert_to_tensor=True)
    cos_scores = util.pytorch_cos_sim(user_emb, all_embeddings)[0]
    best_idx = torch.argmax(cos_scores).item()
    score = cos_scores[best_idx].item()
Encodes the user’s question.
Computes cosine similarity between the user’s embedding and all
stored question embeddings.
Finds the best match using argmax.
    return jsonify({
        'question': all_questions[best_idx],
        'answer': all_answers[best_idx],
        'similarity': round(score, 4)
    })
Returns the best-matching question, corresponding answer, and
similarity score in JSON format.
Server Start
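if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000, debug=False)
Starts the Flask app on port 5000 with debug mode off. Binding to host
0.0.0.0 makes the server listen on all network interfaces, so other
machines on the network can reach it.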
Summary
File                 Purpose
save_model.py        Downloads and saves the model locally
server.py            Serves a REST API for semantic question answering
nlp_code_CA1.ipynb   Tests the API by sending requests and viewing responses
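To illustrate how a client could exercise the /ask endpoint, here is a minimal standard-library sketch. The notebook's actual test code is not reproduced in this report, so the helper names build_ask_request and ask and the base URL http://localhost:5000 (a server running on the same machine) are assumptions for the example.

```python
import json
import urllib.request


def build_ask_request(question: str,
                      base_url: str = "http://localhost:5000") -> urllib.request.Request:
    """Build a POST request for the /ask endpoint with a JSON body."""
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/ask",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, base_url: str = "http://localhost:5000") -> dict:
    """Send the question to a running server and return the decoded JSON reply."""
    with urllib.request.urlopen(build_ask_request(question, base_url)) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the request needs no running server; sending it does.
req = build_ask_request("What time do you open?")
print(req.get_method(), req.full_url)
```

With server.py running, calling ask("What time do you open?") should return a dictionary with the question, answer, and similarity keys described above.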
11. Output Screenshots
12. Challenges:
o Understanding medical terminology and intent accurately.
o Ensuring privacy and data security.
o Updating the system with new information.
o Seamless integration with healthcare workflows.
o Balancing flexibility with medical accuracy.
13. Conclusion
The chatbot successfully automated responses to FAQs, improving
efficiency and user experience. The project demonstrated that
integrating AI with customer support systems can enhance
responsiveness, reduce operational costs, and improve user
satisfaction. Future enhancements could include integrating voice
recognition, multilingual support, and personalized
recommendations. Ethical AI practices will be crucial to avoid bias
and ensure responsible usage.
14. References
o Dhruv Matani, "All Pairs Cosine Similarity in PyTorch," Medium.