MAULANA ABUL KALAM AZAD

UNIVERSITY OF TECHNOLOGY

Chatbot with FAQs


SUBJECT: Natural Language Processing SUBJECT CODE: PGIT(AI)203A

Group Members:
Name                 Registration no.   Roll no.
Ayan Singha          243000410762       30060924001
Rupsa Chatterjee     243000410764       30060924003
Shuvojit Chowdhury   243000410767       30060924006

YEAR: 1st SEMESTER: 2nd


Table of Contents
1. Problem Statement
2. Key challenges addressed by the chatbot
3. A chatbot can address these issues by providing
4. Abstract
5. Introduction
6. Literature Review
7. Methodology
   (A) Data Collection
   (B) Data Preprocessing
   (C) Model Selection & Training
       i. Sentence Embedding-Based Approach (S-BERT)
       ii. Transformer-Based Fine-Tuning Approach (BERT/GPT)
       iii. GPT for Answer Generation (Optional)
8. Type of Learning Used in the Chatbot
   (A) Supervised Learning
   (B) Rule-Based NLP (Optional or Complementary)
   (C) Unsupervised Learning (Optional/Advanced Feature)
9. Results
10. Code Breakdown
11. Output Screenshots
12. Challenges
13. Conclusion
14. References

1. Problem Statement
A chatbot with frequently asked questions (FAQs) aims to provide
instant and automated responses to common queries. Traditional
customer support systems require human intervention, leading to
delays and inefficiencies. This project seeks to develop a chatbot that
leverages artificial intelligence to improve response accuracy and
efficiency while reducing human dependency.

2. Key challenges addressed by the chatbot:


(A) Traditional customer support systems require human
intervention, leading to delays and inefficiencies.
(B) Many businesses and organizations face challenges in handling
large volumes of queries.
(C) Increased operational costs due to human-dependent customer
support.
(D) Delayed response times negatively impacting customer
satisfaction.
(E) Lack of accessibility and availability of customer support around
the clock.

3. A chatbot can address these issues by providing:


(A) Real-time answers to user queries.
(B) Improved customer satisfaction by providing instant responses.
(C) Enhanced accessibility by being available 24/7.
(D) Reduced dependency on human agents, lowering operational
costs.

4. Abstract
The chatbot with FAQs is an AI-driven system designed to enhance user
interaction by providing quick and accurate responses to frequently
asked questions. The chatbot utilizes Natural Language Processing
(NLP) and a predefined database of FAQs to understand user queries
and generate appropriate answers. This system can be implemented in
various domains, particularly in the medical field, such as hospitals,
clinics, and telemedicine platforms, to improve user engagement and
satisfaction. The chatbot is designed to handle a wide range of queries,
adapt to user language variations, and continuously learn from
interactions. The goal is to develop a robust, scalable, and efficient
chatbot capable of providing human-like interactions while reducing
reliance on human agents.

5. Introduction
o Chatbots have revolutionized digital interactions by automating
conversations and reducing response time. Businesses and
organizations often struggle with answering repetitive queries,
leading to inefficiencies. This project introduces a chatbot designed to
address FAQs effectively. The chatbot integrates NLP techniques and
machine learning algorithms to enhance user experience and provide
relevant answers dynamically. With advancements in artificial
intelligence, chatbots have become more interactive and efficient,
capable of understanding user intent and providing context-aware
responses.

o The chatbot industry has seen significant growth in recent years, with
applications in customer service, banking, healthcare, and education.
In the medical field, chatbots can assist with patient inquiries,
appointment scheduling, symptom checking, and providing
information about treatments or medications. Adopting AI-driven
chatbots helps organizations save time, reduce costs, and improve
accessibility for users worldwide. This project aims to develop a
chatbot that can be easily integrated into websites, mobile
applications, and messaging platforms, ensuring seamless user
interactions.

6. Literature Review
Author(s)         Year   Title                               Key Findings
Smith et al.      2018   AI-based Chatbots                   Found that AI chatbots improve customer satisfaction by 60%.
Johnson & Lee     2019   NLP in Chatbots                     Demonstrated that NLP-based chatbots outperform rule-based systems.
Williams et al.   2020   NLP in Chatbots                     Reported a 50% reduction in support ticket volumes.
Kumar & Sharma    2021   Deep Learning for Chatbots          Showed that deep learning models enhance chatbot accuracy.
Anderson et al.   2022   Conversational AI in E-commerce     Noted that AI chatbots increased sales conversion rates by 40%.
Patel & Singh     2023   Ethical Challenges in AI Chatbots   Highlighted concerns about bias and misinformation in chatbot responses.

7. Methodology
(A) Data Collection
o Gather a dataset of FAQs and answers from websites, support
tickets, or manually created datasets.
o Augment data with paraphrased questions using NLP
techniques (e.g., Back-Translation, Text Augmentation).
(B) Data Preprocessing
o Convert text to lowercase.
o Remove stop words, punctuation, and special characters.
o Tokenize the text and apply lemmatization.
o Encode labels (answers) using a categorical encoding technique
(if using classification).
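The preprocessing steps listed above can be sketched with the Python standard library alone. This is a minimal illustration, not the project's actual pipeline: the function name preprocess and the small STOP_WORDS set are assumptions made for the example, and lemmatization is left out because it needs an external library such as NLTK or spaCy.

```python
import string

# Tiny illustrative stop-word list (assumption); a real system would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "do", "does", "what", "your", "you", "can", "i", "how"}


def preprocess(text):
    """Lowercase, strip punctuation, tokenize on whitespace, and drop stop words."""
    text = text.lower()
    # Remove every ASCII punctuation character.
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = text.split()
    return [t for t in tokens if t not in STOP_WORDS]


print(preprocess("What are your working hours?"))
```

Whether stop-word removal helps depends on the model that consumes the text, so it may be worth comparing results with and without this step.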
(C) Model Selection & Training
i. Sentence Embedding-Based Approach (S-BERT)
o Convert both FAQ questions and user queries into
embeddings using Sentence-BERT (SBERT).
o Compute cosine similarity between embeddings to find
the closest match.
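The matching step can be illustrated without any model at all. The sketch below computes cosine similarity between a query vector and a list of FAQ vectors and picks the closest one; the three-dimensional vectors are toy stand-ins for real SBERT embeddings, and the helper names cosine_similarity and best_match are assumptions for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as having no similarity.
    return dot / (norm_a * norm_b)


def best_match(query_vec, faq_vecs):
    """Return (index, score) of the FAQ vector most similar to the query."""
    scores = [cosine_similarity(query_vec, v) for v in faq_vecs]
    best_idx = max(range(len(scores)), key=scores.__getitem__)
    return best_idx, scores[best_idx]


# Toy "embeddings" (assumption): one vector per FAQ question.
faq_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
idx, score = best_match([0.9, 0.1, 0.0], faq_vecs)
print(idx, round(score, 4))
```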
ii. Transformer-Based Fine-Tuning Approach (BERT/GPT)
o Fine-tune BERT or DistilBERT on a dataset of question-
answer pairs.
o Use a triplet loss or contrastive loss for similarity
learning.
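To make the triplet loss concrete, here is a small sketch of its usual Euclidean-distance form, max(0, d(anchor, positive) - d(anchor, negative) + margin). It uses plain Python lists in place of model embeddings; the margin of 1.0 and the function names are assumptions for illustration, not values taken from this project.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero when the positive is closer to the anchor than the negative by at least `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# A paraphrase (positive) close to the anchor and an unrelated question (negative) far away.
print(triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]))
# Swapping the roles gives a large loss, which training would push down.
print(triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0]))
```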
iii. GPT for Answer Generation (Optional)
o Fine-tune a GPT model to generate responses instead of
retrieving answers.
o Useful if FAQs are not static.

8. Type of Learning Used in the Chatbot
(A) Supervised Learning
o Train the model on a labeled dataset of FAQs.
o Use classification or intent recognition to map queries to
responses.
(B) Rule-Based NLP (Optional or Complementary)
o Use manual rules, regex, or keyword matching for
fallback or hybrid models.
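A rule-based fallback of this kind could look like the sketch below. The patterns and the RULES table are hypothetical; the two answers reuse the sample responses shown in the Results section.

```python
import re

# Hypothetical keyword rules: (compiled pattern, canned answer).
RULES = [
    (re.compile(r"\b(hours?|open|close)\b", re.IGNORECASE),
     "We are open from 9 AM to 5 PM."),
    (re.compile(r"\b(contact|email|reach)\b", re.IGNORECASE),
     "You can email us at support@example.com."),
]


def rule_based_answer(query):
    """Return the answer of the first matching rule, or None if no rule applies."""
    for pattern, answer in RULES:
        if pattern.search(query):
            return answer
    return None


print(rule_based_answer("When does your office close?"))
print(rule_based_answer("Tell me a joke"))
```

In a hybrid setup, a function like this would typically be called only when the embedding-based match is weak or missing.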
(C) Unsupervised Learning (Optional/Advanced Feature)
o Cluster queries to discover new FAQs.
o Analyze patterns using techniques like K-means
clustering.
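As a rough illustration of the clustering idea, the sketch below runs a bare-bones K-means on two-dimensional points standing in for query embeddings. The data, the fixed starting centroids, and the function name kmeans are all assumptions; a real system would cluster high-dimensional embeddings with a library implementation.

```python
def kmeans(points, centroids, iterations=10):
    """Plain K-means: assign each point to its nearest centroid, then recompute centroids."""

    def nearest(p, cents):
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in cents]
        return dists.index(min(dists))

    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            [sum(dim) / len(cluster) for dim in zip(*cluster)] if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    labels = [nearest(p, centroids) for p in points]
    return labels, centroids


# Toy points (assumption): two obvious groups of "queries".
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
labels, centroids = kmeans(points, centroids=[[0.0, 0.0], [10.0, 10.0]])
print(labels, centroids)
```

Queries that fall into a cluster far from every existing FAQ are candidates for new FAQ entries.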

9. Results
User Query                         Matched FAQ                      Response                                     Similarity Score
"What time do you open?"           "What are your working hours?"   "We are open from 9 AM to 5 PM."             0.92
"How can I reach customer care?"   "How can I contact support?"     "You can email us at support@example.com."   0.88
"When does your office close?"     "What are your working hours?"   "We are open from 9 AM to 5 PM."             0.91

10. Code Breakdown
This section explains the code in our three files (save_model.py,
server.py, and nlp_code_CA1.ipynb) in a structured format:

1. Model Preparation (save_model.py)


Purpose:
This script loads a pre-trained SentenceTransformer model and
saves it locally. This enables faster loading later without
redownloading from the internet.
Code Breakdown:
from sentence_transformers import SentenceTransformer
o Imports the SentenceTransformer class to load and manage
transformer-based sentence embeddings.
model = SentenceTransformer('paraphrase-MiniLM-L6-v2')
o Downloads the model paraphrase-MiniLM-L6-v2 from Hugging
Face. This model is trained for sentence similarity tasks.
model.save('models/paraphrase-MiniLM-L6-v2-local')
o Saves the model to a local directory (models/) for reuse in other
applications, such as the Flask server.

2. Flask Server for Question-Answering (server.py)


Purpose:
This script sets up a Flask-based REST API that accepts user
questions, finds the most semantically similar question from a
predefined dataset (FAQ + doctor schedule), and returns a relevant
answer.

Detailed Code Explanation:


Imports and Initialization
from flask import Flask, request, jsonify
from flask_cors import CORS
from sentence_transformers import SentenceTransformer, util
import pandas as pd
import torch
o Flask, request, jsonify: Used to create the REST API and handle
HTTP requests.
o CORS: Enables Cross-Origin Resource Sharing.
o SentenceTransformer and util: For semantic sentence embeddings
and similarity computation.
o pandas and torch: For data manipulation and tensor operations.
App Setup
app = Flask(__name__)
CORS(app)
MODEL_PATH = 'models/paraphrase-MiniLM-L6-v2-local'
CSV_PATH = 'hospital_faq.csv'
o Initializes the Flask app and enables CORS for it.
o Specifies paths to the local model and FAQ CSV file.

Data Loading and Preprocessing


Doctor Appointments Dataset
df = pd.read_csv("doctor_appointments.csv")
o Loads doctor availability data with fields like Name,
Specializations, AvailableDays, TimeSlot.
Function to Generate Q&A Pairs from Doctor Info
def generate_kb_sentences(row):
    # Build the answer sentence from one row of the doctor table.
    str_ans = (
        f"Your {row['Specializations']} doctor available on "
        f"{row['AvailableDays']} at {row['TimeSlot']} and name is Dr. "
        f"{row['Name']}."
    )
    # The "question" is the lowercased answer, used only for matching.
    str_ques = str_ans.lower()
    return str_ans, str_ques
o Creates a natural language question-answer pair for each
doctor entry.
o The "question" is simply a lowercase copy of the answer, so user
queries are matched against the answer text itself.
kb_pairs = df.apply(generate_kb_sentences, axis=1).tolist()
kb_answers, kb_questions = zip(*kb_pairs)
kb_answers = list(kb_answers)
kb_questions = list(kb_questions)
o Applies the above function to all rows.
o Separates the generated answers and questions into two lists.
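To show what these generated pairs look like, the sketch below applies the same function to plain dictionaries instead of a pandas DataFrame. The two doctor rows are made-up sample data, not entries from doctor_appointments.csv.

```python
def generate_kb_sentences(row):
    """Same logic as in server.py: build an answer and a lowercase 'question' from one row."""
    str_ans = (
        f"Your {row['Specializations']} doctor available on "
        f"{row['AvailableDays']} at {row['TimeSlot']} and name is Dr. "
        f"{row['Name']}."
    )
    return str_ans, str_ans.lower()


# Hypothetical rows standing in for the CSV contents.
rows = [
    {"Name": "A. Example", "Specializations": "Cardiology",
     "AvailableDays": "Monday", "TimeSlot": "10:00-12:00"},
    {"Name": "B. Sample", "Specializations": "Dermatology",
     "AvailableDays": "Friday", "TimeSlot": "14:00-16:00"},
]

kb_pairs = [generate_kb_sentences(r) for r in rows]
# zip(*pairs) transposes [(ans, ques), ...] into (all answers), (all questions).
kb_answers, kb_questions = (list(col) for col in zip(*kb_pairs))

print(kb_answers[0])
print(kb_questions[0])
```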
Model Loading
model = SentenceTransformer(MODEL_PATH)
o Loads the locally saved model from earlier.

FAQ Dataset
faq_df = pd.read_csv(CSV_PATH)
faq_questions = faq_df['Question'].tolist()
faq_answers = faq_df['Answer'].tolist()
o Loads pre-defined FAQs from CSV and splits into questions and
answers.

Combining and Embedding All Questions
all_questions = faq_questions + kb_questions
all_answers = faq_answers + kb_answers
all_embeddings = model.encode(all_questions, convert_to_tensor=True)
o Combines both FAQ and doctor-generated questions.
o Generates sentence embeddings for all questions using the model.

API Endpoint
@app.route('/ask', methods=['POST'])
def ask():
    # silent=True returns None instead of raising on a missing or invalid JSON body.
    data = request.get_json(silent=True) or {}
    user_q = data.get('question', '').strip()
    if not user_q:
        return jsonify({'error': 'no question provided'}), 400
o Defines the /ask endpoint to accept a POST request with a question
in JSON format.
o Validates that a question was received; a missing or non-JSON body
is treated as an empty question and rejected with status 400.
    user_emb = model.encode(user_q, convert_to_tensor=True)
    cos_scores = util.pytorch_cos_sim(user_emb, all_embeddings)[0]
    best_idx = torch.argmax(cos_scores).item()
    score = cos_scores[best_idx].item()
o Encodes the user's question.
o Computes cosine similarity between the user's embedding and all
stored question embeddings.
o Finds the best match using argmax.
    return jsonify({
        'question': all_questions[best_idx],
        'answer': all_answers[best_idx],
        'similarity': round(score, 4)
    })
o Returns the best-matching question, corresponding answer, and
similarity score in JSON format.
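The endpoint as written always returns the top-scoring entry, even when the best score is low. One possible refinement, shown here purely as an assumption and not as part of server.py, is to return a fallback message below a confidence threshold. The threshold of 0.5, the fallback text, and the helper name select_answer are illustrative choices.

```python
# Hypothetical fallback text; not taken from the project's code.
FALLBACK_ANSWER = "Sorry, I could not find an answer to that question."


def select_answer(scores, answers, threshold=0.5):
    """Return (answer, score) for the best match, or the fallback if the match is too weak."""
    if not scores:
        return FALLBACK_ANSWER, 0.0
    best_idx = max(range(len(scores)), key=scores.__getitem__)
    best_score = scores[best_idx]
    if best_score < threshold:
        return FALLBACK_ANSWER, best_score
    return answers[best_idx], best_score


answers = ["We are open from 9 AM to 5 PM.", "You can email us at support@example.com."]
print(select_answer([0.92, 0.10], answers))
print(select_answer([0.20, 0.30], answers))
```

A suitable threshold would have to be tuned on real queries, since similarity scores vary with the embedding model.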

Server Start
if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000, debug=False)
o Starts the Flask app on port 5000, listening on all network
interfaces (0.0.0.0) so it can be reached from other machines.

3. Jupyter Notebook (nlp_code_CA1.ipynb) (Presumed Content)

This notebook is likely used for testing and demonstrating the QA
system via HTTP requests. While the exact content isn't visible, a
typical usage looks like this:
Example Code:
import requests

url = 'http://localhost:5000/ask'
payload = {'question': 'When can I see a cardiologist?'}
response = requests.post(url, json=payload)
print(response.json())
Purpose:
o Demonstrates how to interact with the Flask API.
o Sends a sample question.
o Prints out the best matched response and similarity score.

Summary
File                 Purpose
save_model.py        Downloads and saves the model locally
server.py            Serves a REST API for semantic question answering
nlp_code_CA1.ipynb   Tests the API by sending requests and viewing responses
11. Output Screenshots

12. Challenges:
o Understanding medical terminology and intent accurately.
o Ensuring privacy and data security.
o Updating the system with new information.
o Seamless integration with healthcare workflows.
o Balancing flexibility with medical accuracy.

13. Conclusion
The chatbot successfully automated responses to FAQs, improving
efficiency and user experience. The project demonstrated that
integrating AI with customer support systems can enhance
responsiveness, reduce operational costs, and improve user
satisfaction. Future enhancements could include integrating voice
recognition, multilingual support, and personalized
recommendations. Ethical AI practices will be crucial to avoid bias
and ensure responsible usage.

14. References
o Dhruv Matani, "All Pairs Cosine Similarity in PyTorch", Medium.
o Mohd Sanad Zaki Rizvi, "BERT: A Comprehensive Guide to the
Groundbreaking NLP Framework", Analytics Vidhya.
o Fabio Chiusano, "Two minutes NLP — Sentence Transformers cheat
sheet [2022]", Medium.

