Finals
A main project thesis submitted in partial fulfillment of the requirements for the award
of the degree for the VIII semester
by
J.BHAVYA SRI (20131A4223)
RAMEEZ AHMAD (20131A04245)
NUZHATH TAHSEEN (21145A4204)
P.VIJAYASIMHA REDDY (21135A4205)
CERTIFICATE
This is to certify that the main project entitled “AI-Powered Interview Assistant”
being submitted by
The results embodied in this record have not been submitted to any other
university or institution for the award of any Degree or Diploma.
We would like to express our deep sense of gratitude to our esteemed institute
Gayatri Vidya Parishad College of Engineering (Autonomous), which has provided
us an opportunity to fulfill our cherished desire.
We express our profound gratitude and our deep indebtedness to our guide
We also thank our coordinator, Dr. CH. SITA KUMARI, Associate Professor,
Department of Computer Science and Engineering, for the kind suggestions and guidance
for the successful completion of our project work.
The challenge is posed by inflexible schedules and fixed interview timelines, which make it
difficult for students to connect theoretical knowledge with the practical skills necessary
in the competitive tech industry. With technology evolving rapidly, students must
continually adapt and acquire advanced skills to remain competitive. While current models
contribute to skill development, they often fall short in providing personalized
one-on-one interview experiences.
Our proposed approach leverages the OpenAI API and LangChain models to revolutionize
interview preparation by generating tailored questions from user resumes, enhancing the
learning experience. Streamlit facilitates seamless interaction, while OpenAI integration
enhances simulation sophistication, bridging the gap between theory and practice. This
comprehensive solution represents a paradigm shift, empowering students to excel
confidently in job interviews.
CHAPTER 1. INTRODUCTION
1.1 Objective
1.2 About the Algorithm
1.3 Purpose
1.4 Scope
CHAPTER 2. SRS DOCUMENT
2.1 Functional Requirements
2.2 Non-functional Requirements
2.3 Minimum Hardware Requirements
2.4 Minimum Software Requirements
FUTURE SCOPE
REFERENCE LINKS
1. INTRODUCTION
1.1 OBJECTIVE
In our algorithm, we aim to develop a robust system for efficient document retrieval and
processing, leveraging advanced techniques such as document loaders, text splitting,
embedding models, vector stores, retrievers, and indexing. This algorithmic framework is
crucial for enabling streamlined access to information, enhancing search capabilities, and
facilitating seamless integration with user interfaces.
Text Loader:
The Text Loader component serves as a foundational element in our system, responsible
for sourcing textual documents from various data repositories. By seamlessly interfacing
with diverse sources including local files and cloud-based storage solutions, Text Loader
ensures the reliable acquisition of data essential for subsequent processing and analysis.
The Unstructured URL Loader expands our system's capabilities by enabling the retrieval
of unstructured data from web sources. Through sophisticated web scraping techniques,
this component facilitates the extraction of information from publicly accessible URLs,
enriching our dataset with external content for comprehensive analysis and insight
generation.
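The scraping step described above can be pictured with a minimal sketch. The snippet below is not the Unstructured URL Loader's implementation: it uses only the standard library's html.parser, and the HTML is inlined rather than fetched from a live URL, purely to illustrate how markup is reduced to plain text.

```python
from html.parser import HTMLParser

class _TextExtractor(HTMLParser):
    """Collects the visible text content of an HTML page, ignoring tags."""
    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.parts.append(text)

def extract_text(html):
    """Return the plain text of an HTML document, space-joined."""
    parser = _TextExtractor()
    parser.feed(html)
    return " ".join(parser.parts)

page = "<html><body><h1>Careers</h1><p>We are hiring Python developers.</p></body></html>"
print(extract_text(page))  # Careers We are hiring Python developers.
```

A production loader additionally handles fetching, encodings, scripts, and boilerplate removal; this sketch only shows the core idea of stripping tags to enrich the dataset with textual content.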
Text Splitter efficiently breaks down large documents into manageable chunks,
enhancing processing efficiency and enabling targeted analysis. Coherent Chunking:
Utilizes advanced algorithms to ensure that text chunks maintain coherence and
relevance, preserving the contextual integrity of the original document. Optimized
Processing: By segmenting text into smaller units, Text Splitter optimizes subsequent
retrieval and analysis processes, facilitating faster and more accurate information
extraction.
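A simplified sketch of the chunking idea follows. The split_text helper is hypothetical; LangChain's RecursiveCharacterTextSplitter exposes this behaviour through its chunk_size and chunk_overlap parameters, with smarter separator-aware splitting. The sketch only illustrates fixed-size chunks whose overlap preserves context across boundaries.

```python
def split_text(text, chunk_size=50, overlap=10):
    """Split text into chunks of at most chunk_size characters,
    with `overlap` characters repeated between consecutive chunks
    so that context is not lost at chunk boundaries."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
        start += chunk_size - overlap
    return chunks

demo = split_text("a" * 120, chunk_size=50, overlap=10)
print([len(c) for c in demo])  # [50, 50, 40]
```

Each chunk after the first begins with the last 10 characters of its predecessor, which is what lets downstream retrieval match queries that span a chunk boundary.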
FAISS
FAISS supports essential functionalities like CRUD operations and metadata filtering,
simplifying data management. Additionally, FAISS enables horizontal scaling,
distributing index structures across multiple machines for enhanced performance and
scalability. As a cornerstone technology, FAISS empowers AI systems with swift and
precise retrieval of semantic information.
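What FAISS accelerates can be shown with a brute-force sketch. The vectors and the cosine/top_k helpers below are invented for illustration; FAISS performs the same top-k similarity search over embedding vectors, but with optimized (and optionally approximate, distributed) index structures rather than a Python loop.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, index, k=2):
    """Brute-force nearest-neighbour search over a dict of name -> vector."""
    ranked = sorted(index, key=lambda name: cosine(query_vec, index[name]),
                    reverse=True)
    return ranked[:k]

# Toy 3-dimensional "embeddings" standing in for real model output.
index = {
    "python experience": [0.9, 0.1, 0.0],
    "sql experience":    [0.1, 0.9, 0.0],
    "team leadership":   [0.0, 0.1, 0.9],
}
print(top_k([1.0, 0.2, 0.1], index, k=2))  # ['python experience', 'sql experience']
```

The brute-force loop is O(n) per query; FAISS's index structures exist precisely to avoid that cost at scale.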
1.5.1 Retrieval:
Retrieval mechanisms orchestrate the process of fetching relevant data based on user
queries, bridging the gap between raw data and actionable insights. The
RetrievalQAWithSourcesChain leverages sophisticated algorithms to identify and
retrieve pertinent information, taking into account multiple data sources and query types.
By employing techniques such as semantic search and ensemble retrieval, it enhances the
precision and comprehensiveness of search results, empowering users with actionable
knowledge.
RetrievalQAWithSourcesChain
The purpose of the provided code and application is to streamline and enhance the
interview preparation process for job seekers. By leveraging advanced technologies such
as Streamlit, LangChain, and OpenAI, the application offers a sophisticated platform for
generating personalized technical interview questions based on the content of uploaded
resumes.
Through seamless integration with document loaders and text splitters, the
application efficiently extracts relevant information from resumes, ensuring that
generated questions are tailored to each candidate's unique skills and experiences.
Additionally, the incorporation of audio recording functionality allows candidates
to verbally respond to interview questions, fostering dynamic and immersive preparation
sessions. The application's objective is to empower job seekers with the tools and
resources needed to confidently navigate the interview process and secure their desired
career opportunities.
Overall, the code and application aim to revolutionize interview preparation by
providing a user-friendly interface, intelligent question generation capabilities, and
interactive features for audio-based responses.
By combining cutting-edge technologies with a focus on user-centric design, the
application strives to enhance the efficiency, effectiveness, and confidence of job seekers
as they prepare for interviews. With its comprehensive approach and innovative features,
the application sets out to redefine the standard for interview preparation in the modern
job market.
At its core, the application seeks to empower individuals with a strategic
advantage in their career pursuits. Through intelligent question generation and
personalized feedback mechanisms, it fosters a deeper understanding of one's strengths
and areas for improvement, enabling candidates to showcase their capabilities with
confidence and precision during interviews.
1.4 SCOPE
Python-based natural language processing and deep learning libraries will be exploited
for the development and experimentation of the project.
OpenAI's high API usage costs and ethical concerns regarding biases in
question generation may hinder its suitability for large-scale interview
preparation. Similarly, platforms like Gemini, with their focus on
scheduling, may lack customization, while Hugging Face's models might
require complex integration and lack specialized capabilities, contrasting
with the project's objectives.
Brad.ai's focus on coaching may not align with automated question
generation goals, and 1:1 mock interviews could lack scalability compared
to automated systems.
• Very time-saving
• Dynamic question generation
• Accurate results
• Automated resume parsing
• User-friendly graphical interface
• Highly reliable
• Cost-effective
• What are the alternatives among which a solution will be chosen (during
subsequent phases)?
• To analyze if the software meets organizational requirements.
There are various types of feasibility that can be determined. They are:
Operational - Define the urgency of the problem and the acceptability of any
solution, includes people-oriented and social issues: internal issues, such as
manpower problems, labor objections, manager resistance, organizational conflicts,
and policies; also, external issues, including social acceptability, legal aspects, and
government regulations.
Technical - Is the proposal within the limits of current technology? Does the
technology exist at all? Is it available within the given resources?
Economic - Is the project possible, given resource constraints? Are the benefits that
will accrue from the new system worth the costs? What are the savings that will
result from the system, including tangible and intangible ones? What are the
development and operational costs?
The financial and the economic questions during the preliminary investigation are
verified to estimate the following:
• The cost of the hardware and software for the class of application being
considered.
• The benefits in the form of reduced cost.
• As a result, the proposed system will provide detailed information.
• Performance is improved which in turn may be expected to provide
increased profits.
• This feasibility checks whether the system can be developed with the
available funds.
• This can be done economically if planned judiciously, so it is economically
feasible.
The cost of the project depends upon the number of man-hours required.
4. SOFTWARE DESCRIPTION
4.2. Langchain
4.3. Python
4.4 Open AI
4.5 Pycharm
PyCharm stands as a premier integrated development environment (IDE)
meticulously crafted for Python programming, renowned for its robust features and
user-friendly interface. Developed by JetBrains, PyCharm offers a comprehensive
suite of tools designed to enhance the productivity and efficiency of Python
developers. Its intelligent code completion, advanced debugging capabilities, and
seamless integration with version control systems streamline the development
workflow. PyCharm provides support for various Python frameworks and libraries,
facilitating the creation of diverse applications ranging from web development to
data analysis and machine learning. With its extensive plugin ecosystem and
customizable settings, PyCharm caters to the unique needs of developers, enabling
them to build high-quality software with ease. Whether working on personal
projects or large-scale enterprise applications, PyCharm remains a preferred choice
for Python developers seeking a feature-rich and intuitive development
environment.
4.6 Streamlit
Streamlit is a Python library that simplifies the creation of interactive web
applications for data science and machine learning projects. It offers a
straightforward and intuitive way to build user-friendly interfaces without the need
for extensive web development experience. With Streamlit, developers can
seamlessly integrate data visualizations, input widgets, and text elements to create
dynamic applications that enable users to explore and interact with data in real-
time. Its declarative syntax and automatic widget rendering make prototyping and
deploying applications quick and efficient. Streamlit's seamless integration with
popular data science libraries like Pandas, Matplotlib, and TensorFlow further
enhances its capabilities, allowing developers to leverage their existing knowledge
and tools. Overall, Streamlit empowers data scientists and machine learning
engineers to share insights, prototypes, and models with stakeholders effectively,
accelerating the development and deployment of data-driven applications.
5. PROBLEM DESCRIPTION
Our project's primary goal is to assist users, particularly technical students, in preparing
for job interviews effectively. By allowing users to record their answers using audio
input, the system aims to facilitate practice and improvement, ultimately enhancing their
interview performance and boosting confidence levels.
The output of our project is a user-friendly interface where technical students can
upload their resumes; the final result is a score between 0 and 100 for each
question, along with the areas of improvement.
5.3.2. MODEL
Fig 5.2 Model
Error Handling and Overall Application Workflow
The script encompasses robust error handling mechanisms to gracefully navigate
exceptions that may arise during execution. Instances of errors, such as file upload or
audio recording mishaps, prompt the display of informative error messages, preserving a
seamless user experience. Throughout the codebase, various control flow structures,
including conditional statements and loops, orchestrate the application's workflow and
handle diverse scenarios adeptly. Modular code architecture enhances maintainability and
readability, facilitating comprehension and modification endeavors. In sum, the script
embodies adept utilization of libraries and tools, such as Streamlit, PyPDF2, and
LangChain, culminating in the development of an interactive and user-centric application
tailored for personalized interview preparation.
In the project, LangChain's LLM (Large Language Model) interface plays a crucial role in
generating tailored interview questions based on the content of uploaded resumes.
Leveraging advanced natural language processing techniques, the LLM comprehensively
analyzes the textual data to identify relevant skills and experiences. It then formulates
personalized questions to simulate real-world interview scenarios. Additionally, the LLM
evaluates user responses, providing constructive feedback and areas for improvement. By
harnessing the power of the LLM, the project enhances interview preparation by offering
dynamic and targeted question-answering interactions, ultimately empowering users to
refine their technical communication skills.
The PdfReader() class initializes a PdfReader object, allowing the script to read the
content of PDF files.
The extract_text() method is then utilized to extract text content from individual
pages of the PDF, enabling further processing and analysis.
The load_qa_chain() function is responsible for loading a question-answering chain
for processing documents. This chain is essential for generating responses to user
queries based on the content of the documents provided.
Similarly, the get_openai_callback() function retrieves an OpenAI callback function,
which is crucial for interacting with OpenAI's API during the question-answering
process. These functions encapsulate complex logic and functionality, enabling
streamlined document processing and response generation.
The record_audio() function facilitates audio recording for a specified duration
using the sounddevice library and saves the recorded audio to a temporary WAV
file. This functionality is vital for allowing users to provide verbal responses to
interview questions, adding an interactive element to the application.
Additionally, the convert_audio_to_text() function leverages the Google Speech
Recognition API to convert recorded audio files to text format. This conversion
enables seamless integration of spoken responses into the question-answering
workflow.
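The save-to-temporary-WAV step that record_audio() performs can be sketched with the standard library alone. Note that this is an illustrative stand-in: the actual script captures microphone input with sounddevice and writes it with soundfile, whereas the helper below writes caller-supplied PCM samples using the wave module.

```python
import os
import struct
import tempfile
import wave

def save_samples_to_wav(samples, fs=44100):
    """Write 16-bit mono PCM samples to a temporary WAV file and return its
    path. This mirrors what record_audio() does after capturing audio, but
    takes the samples as a plain list instead of recording them."""
    fd, path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(fs)
        wav.writeframes(b"".join(struct.pack("<h", s) for s in samples))
    return path

path = save_samples_to_wav([0] * 44100)  # one second of silence at 44.1 kHz
```

The temporary path returned here is what would then be handed to the speech-to-text step for transcription.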
• A lot of time is saved down the line when teams can visualize processes,
user interactions and static structure of the system.
UML is linked with object-oriented design and analysis. UML makes the
use of elements and forms associations between them to form diagrams. Diagrams
in UML can be broadly classified as:
Building Blocks of the UML
• Things
• Relationships
• Diagrams
Things are the abstractions that are first-class citizens in a model; relationships tie
these things together; diagrams group interesting collections of things.
3.3.2 Things in the UML
• Structural things
• Behavioural things
• Grouping things
• Annotational things
These things are the basic object-oriented building blocks of the UML. You use
them to write well-formed models.
Structural things are the nouns of UML models. These are the mostly static
parts of a model, representing elements that are either conceptual or physical.
Collectively, the structural things are called classifiers.
Class - A class is a description of a set of objects that share the same
attributes and operations. It may also represent an abstract class, whose
operations are not fully defined. Its notation is as follows:
Use Case - A description of a sequence of actions that a system performs to yield
an observable result. It is used to structure behavior in a model and is realized
by a collaboration.
Node - A physical element that exists at run time and represents a computational resource.
3.3.4 Behavioral Things
Behavioral things are the dynamic parts of UML models. These are the verbs of a
model, representing behavior over time and space. In all, there are two primary
kinds of behavioral things:
• Interaction
• State machine
3.3.5 Interaction
Package − A package is the only grouping thing available for gathering structural
and behavioural things.
• Dependency
• Association
• Generalization
• Realization
3.3.10 Dependency
A dependency is a semantic relationship between two things in which a change to one
thing (the independent one) may affect the semantics of the other thing (the
dependent one). Graphically, a dependency is rendered as a dashed line, possibly
directed, and occasionally including a label.
3.3.11 Association
Association is basically a set of links that connects the elements of a UML
model. It also describes how many objects are taking part in that relationship.
3.3.12 Generalization
It is a specialization/generalization relationship in which the specialized
element (the child) builds on the specification of the generalized element (the
parent). The child shares the structure and the behavior of the parent. Graphically, a
generalization relationship is rendered as a solid line with a hollow arrowhead
pointing to the parent.
3.3.13 Realization
Realization can be defined as a relationship in which two elements are
connected: one element specifies a responsibility that it does not implement,
and the other element implements it. This relationship is most often found with
interfaces.
6.3 UML DIAGRAMS
• Class diagram
• Object diagram
• Component diagram
• Composite structure diagram
• Use case diagram
• Sequence diagram
• Communication diagram
• State diagram
• Activity diagram
7.1. RAW DATA
7.2. SAMPLE CODE
app.py
import streamlit as st
import pickle
from PyPDF2 import PdfReader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import OpenAI
from langchain.chains.question_answering import load_qa_chain
from langchain.callbacks import get_openai_callback
import os
import time
import sounddevice as sd
import soundfile as sf
import tempfile
import speech_recognition as sr
# Sidebar contents
with st.sidebar:
    st.markdown('''
    ## About
    References used for building the APP:
    - [Streamlit](https://streamlit.io/)
    - [LangChain](https://python.langchain.com/)
    - [OpenAI](https://platform.openai.com/docs/models) LLM model
    ''')

# Sample rate
fs = 44100
# Default recording duration (seconds)
duration = 60

# Main function
def main():
    try:
        main_placeholder.text("Data Loading...Started...✅✅✅")

        # Generate questions
        query = (f"Give {st.session_state.n} technical questions on the skills "
                 "and projects from the above pdf")
        if query:
            docs = st.session_state.vectorstore.similarity_search(query=query, k=3)
            llm = OpenAI(model_name="gpt-3.5-turbo-instruct", temperature=0.6,
                         max_tokens=500)
            chain = load_qa_chain(llm=llm, chain_type="stuff")
            with get_openai_callback() as cb:
                response = chain.run(input_documents=docs, question=query)
            if not st.session_state.questions:
                st.session_state.questions = list(response.split('?'))[0:-1]

        st.header("Questions")
        for i, question in enumerate(st.session_state.questions):
            st.write(f"{question}")
            start_recording = st.button(f"Start Answering {i+1}")
            if start_recording:
                st.write("Listening...")
                audio_file = record_audio(duration, fs)
                st.write("Time's up!")

        query = f"""Analyze all the above questions and corresponding answers and give
a score between 0 to 100 and also provide the areas of improvement for betterment of the
candidate. The list of questions and answers are as follows, providing a review only for
answered questions: {str(st.session_state.recorded_answers)}. Give analysis for every
question and corresponding answer. The format of the review is '[Question number] :
[score]/100 Areas of improvement: [suggestions to improve]'. Every question's response
should be separated by '###'. For example:
Question 2: Score - N/A Areas of improvement: The candidate did not provide an
answer for this question, so no score or areas of improvement can be given
and question number starts from 1. Please give each answer in a newline"""

        count = 0
        for i, question in enumerate(st.session_state.questions):
            if st.session_state.recorded_answers[i]["Answer"] != "Not Answered Yet":
                count += 1
    except Exception as e:
        st.error(f"An error occurred: {str(e)}")
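The '###'-separated review format requested in the analysis prompt lends itself to simple post-processing. The parse_review helper below is hypothetical (it does not appear in the sample code); it only sketches how the per-question scores could be pulled out of the model's response, with None standing in for unanswered (N/A) questions.

```python
import re

def parse_review(response):
    """Split the model's '###'-separated review into per-question entries,
    extracting the numeric score where one was given."""
    results = []
    for block in response.split("###"):
        block = block.strip()
        if not block:
            continue
        m = re.search(r"(\d+)\s*/\s*100", block)
        score = int(m.group(1)) if m else None  # None for unanswered (N/A)
        results.append({"text": block, "score": score})
    return results

sample = ("Question 1 : 80/100 Areas of improvement: add more detail###"
          "Question 2: Score - N/A Areas of improvement: The candidate did not answer")
print([entry["score"] for entry in parse_review(sample)])  # [80, None]
```

Structured parsing of this kind is also what makes it possible to display each question's score and suggestions separately in the Streamlit interface.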
Testing can be done in different ways. The main idea behind testing is to reduce
errors with minimum time and effort.
• Security: This is one of the most sensitive benefits of software testing. People
look for trusted products, and testing helps in removing risks and problems earlier.
Integration Testing: Integration tests verify that different modules or services used by
your application work well together. For example, it can be testing the interaction with
the database or making sure that microservices work together as expected. These types
of tests are more expensive to run as they require multiple parts of the application to be
up and running.
There is sometimes a confusion between integration tests and functional tests as they
both require multiple components to interact with each other. The difference is that an
integration test may simply verify that you can query the database while a functional test
would expect to get a specific value from the database as defined by the product
requirements.
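The distinction above can be made concrete with a small sketch. Using an in-memory SQLite database (the table and values below are invented for illustration), an integration-style check only confirms that the database can be queried at all, while a functional-style check asserts the specific value the product requirements call for.

```python
import sqlite3

def integration_check(conn):
    """Integration-style check: can we talk to the database at all?"""
    conn.execute("SELECT 1")
    return True

def functional_check(conn):
    """Functional-style check: does the query return the specific value
    defined by the (hypothetical) product requirements?"""
    row = conn.execute("SELECT score FROM results WHERE candidate = 'A'").fetchone()
    return row[0] == 85

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (candidate TEXT, score INTEGER)")
conn.execute("INSERT INTO results VALUES ('A', 85)")
print(integration_check(conn), functional_check(conn))  # True True
```

If the stored score were changed, the integration check would still pass while the functional check would fail, which is exactly the difference the text describes.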
Regression Testing: Regression testing is a crucial stage for the product & very useful
for the developers to identify the stability of the product with the changing requirements.
Regression testing is a testing that is done to verify that a code change in the software
does not impact the existing functionality of the product.
System Testing: System testing of software or hardware is testing conducted on a
complete integrated system to evaluate the system’s compliance with its specified
requirements. System testing is a series of different tests whose primary purpose is to
fully exercise the computer-based system.
Performance Testing: It checks the speed, response time, reliability, resource usage,
scalability of a software program under their expected workload. The purpose of
Performance Testing is not to find functional defects but to eliminate performance
bottlenecks in the software or device.
Alpha Testing: This is a form of internal acceptance testing performed mainly by the in-
house software QA and testing teams. Alpha testing is the last testing done by the test
teams at the development site after the acceptance testing and before releasing the
software for the beta test. It can also be done by the potential users or customers of the
application. But still, this is a form of in-house acceptance testing.
Beta Testing: This is a testing stage followed by the internal full alpha test cycle. This is
the final testing phase where the companies release the software to a few external user
groups outside the company test teams or employees. This initial software version is
known as the beta version. Most companies gather user feedback in this release.
White Box Testing: White box testing (also known as Clear Box Testing, Open Box
Testing, Glass Box Testing, Transparent Box Testing, Code-Based Testing or Structural
Testing) is a software testing method in which the internal
structure/design/implementation of the item being tested is known to the tester. The
tester chooses inputs to exercise paths through the code and determines the appropriate
outputs. Programming know-how and implementation knowledge are essential. White
box testing goes beyond the user interface and into the nitty-gritty of a system.
The method is so named because, in the eyes of the tester, the software program is
like a white/transparent box, the inside of which can be clearly seen.
Fig 8.1.2 Whitebox Testing
8. CONCLUSION
Both benefits and drawbacks exist with our project. On the positive side, it automates
question generation and response recording, streamlining the interview preparation
process. Additionally, it provides personalized feedback and analysis, enhancing
candidate performance and confidence. However, reliance on machine learning
algorithms may introduce biases or inaccuracies in question generation, impacting the
quality of interview practice. Our system may not fully replicate the nuances of human
interaction in interview scenarios, and users should supplement their preparation with
real-world practice and feedback.
The main challenge that we faced while working on this project was that the need for
internet connectivity and API access may limit accessibility and usability in certain
environments.
9. FUTURE SCOPE
The future scope of our project is expansive, driven by our overarching objective of
revolutionizing interview preparation processes.
As we continue to refine our system, we aim to leverage cutting-edge technologies to
enhance user experience and effectiveness.
This includes exploring advanced natural language processing techniques to generate
more contextually relevant and diverse interview questions.
Additionally, we envision integrating machine learning algorithms to provide
personalized feedback and performance analytics to users.
Moreover, we plan to expand the application's capabilities by incorporating features such
as mock interview simulations and industry-specific question sets.
These enhancements will ensure that our platform remains at the forefront of interview
preparation innovation, catering to diverse user needs and preferences.
Therefore, these are some upcoming upgrades or enhancements that we intend to
make.
10. REFERENCE LINKS