IntentEmail PROJECT REPORT
U. Rakshitha - 21211A7259
S. Ashish Rao - 21211A7254
P. Harishwar - 21211A7250
August 23, 2023
CONTENTS
1 Abstract
Acknowledgments
2 Introduction
2.1 Intent Email
3 Literature Review
3.1 Introduction and Motivation
3.2 Approaches to Event Extraction
3.2.1 Supervised Machine Learning
3.2.2 Semi-Supervised and Unsupervised Methods
3.2.3 Deep Learning and Neural Networks
3.3 Challenges and Limitations
3.3.1 Ambiguity and Context
3.3.2 Entity Resolution
3.3.3 Data Privacy
3.4 Conclusion
4 Project Uniqueness
4.1 Innovative Event Extraction
4.2 Contextual Understanding
4.3 Smart Scheduler Integration
4.4 Real-World Applications
4.4.1 Professional Efficiency
4.4.2 Personal Productivity
4.4.3 Student Life
5 Project Applications
5.1 Case Study 1: Professional Efficiency Enhancement
5.2 Case Study 2: Personal Productivity Boost
5.3 Case Study 3: Student Organization and Success
7 Implementation Details
7.1 Data Collection
7.1.1 Kaggle
7.1.2 Enron Dataset
7.1.3 Personal Emails
7.2 Preprocessing
7.2.1 Text Cleaning
7.2.2 Tokenization and Normalization
7.2.3 Lemmatization and Stopwords
7.2.4 Intent Labeling
7.2.5 Feature Extraction
7.2.6 Balancing
7.3 Support Vector Machines (SVM) Classification
7.3.1 Introduction
7.3.2 Feature Extraction
7.3.2.1 Hyperplane and Margin
7.3.2.2 Handling Non-Linearity
7.3.2.3 Training and Optimization
7.3.2.4 Regularization and C-Parameter
7.3.2.5 Model Evaluation
7.4 Event Extraction
7.4.1 Leveraging spaCy for Linguistic Analysis
7.4.1.1 Tokenization and Part-of-Speech Tagging
7.4.1.2 Dependency Parsing
7.4.1.3 Named Entity Recognition (NER)
7.5 Extracting Dates and Times with dateutil
7.5.1 Date Parsing
7.5.2 Relative Time Expression
7.6 Using regex for Pattern Matching
7.6.1 Keyword Extraction
7.6.2 Contextual Patterns
7.7 Integration for Comprehensive Event Extraction
7.7.1 Linguistic Analysis and Named Entities
7.7.2 Temporal Parsing
7.7.3 Pattern Matching and Specific Events
10 References, Annexure
10.1 References
A Annexure-1: List of Tables
B Annexure-2: List of Figures
1.ABSTRACT
In today's fast-paced world, people are inundated daily with an
overwhelming number of emails. Managing and organizing these
emails can be a daunting task, especially when it comes to keeping track of
important tasks and deadlines. Our proposed application aims to solve this
problem by automatically generating a to-do list from emails. The application will
use natural language processing (NLP) algorithms to identify important keywords
and phrases related to tasks and deadlines within incoming emails. These
keywords will then be used to automatically generate a to-do list, which can be
easily accessed and managed by the user. The to-do list will be organized based
on the priority of tasks, and will include information such as the task
description, due date, and any relevant attachments. The user will also have the
ability to customize and edit the to-do list as needed. One of the key benefits of
our proposed application is its ability to save time and reduce the risk of
important tasks and deadlines being forgotten. By automatically generating a to-
do list from emails, users can quickly and easily keep track of their tasks and
stay organized without having to manually sift through their inbox. Additionally,
the application will have a user-friendly interface, making it easy for users of all
levels of technological proficiency to use. The application will also be compatible
with a wide range of email clients and platforms, including Gmail, Outlook, and
Apple Mail. In terms of potential users, our application could be beneficial for a
variety of individuals and organizations, including students, professionals, and
small businesses. Students could use the application to manage their
coursework and assignments, while professionals could use it to stay on top of
important meetings, deadlines, and projects. Small businesses could also benefit
from the application, as it could help them stay organized and improve
productivity. In terms of technical implementation, the application will use a
combination of NLP algorithms and machine learning models to analyze
incoming emails and identify important
tasks and deadlines. The application will also incorporate a user-friendly
interface, which will allow users to easily manage and customize their to-do lists.
In conclusion, our proposed application has the potential to revolutionize the
way people manage their emails and tasks. By automatically generating a to-do
list from emails, the application will save users time and reduce the risk of
important tasks and deadlines being forgotten. With its user-friendly interface
and compatibility with a wide range of email clients and platforms, our
application has the potential to benefit individuals and organizations of all kinds.
ACKNOWLEDGEMENTS
Our gratitude also goes to the institution for providing us with the
necessary resources and environment to conduct this research. The
learning experience we gained during this project will undoubtedly
shape our future endeavors.
2.INTRODUCTION
2.1.Intent Email:
In the realm of modern living, time is currency, and efficiency the venerated
anthem. Intent Email, with its elegant amalgamation of technological prowess
and intuitive design, rises to meet this clarion call for streamlined efficiency. The
very name "Intent Email" resonates with purpose, encapsulating its primary
objective - to unearth the latent intent concealed within the convoluted tapestry
of email content. This app, resplendent in its potential, presents a solution to a
ubiquitous dilemma - the mastery of one's digital domain.
Picture, if you will, the intricate threads of an email's fabric - the ephemeral
words woven with intention, suggesting meetings, appointments, invitations,
and obligations. Herein lies the crux of Intent Email's finesse, the ability to
decipher the unspoken yet palpable intent hidden beneath linguistic veils. As a
digital sleuth, Intent Email employs an intricate ensemble of algorithms, a
symphony of machine learning, natural language processing, and pattern
recognition, to extract events that matter. From the sprawling lexicon of an
email's narrative, it unearths the gems of intent, from the fleeting "Let's meet
next week" to the grandiloquent "The annual symposium is scheduled for the
24th."
It is not mere event extraction that distinguishes Intent Email, but the
transformation of raw data into refined pearls of productivity. Like an alchemist,
it transmutes extracted events into tangible to-do lists - personalized
choreographies that dance in harmony with individual routines and preferences.
These choreographies are not arbitrary, but woven from the fabric of
understanding. An email announcing a board meeting at noon is transmuted
into a task, scheduled deftly into the user's day, respecting pre-existing
engagements and circadian rhythms.
In a world where information is gold, Intent Email dons the mantle of a guardian,
a custodian of the user's digital identity. Its security measures, akin to an
impervious vault, shield sensitive data from the prying grasp of malevolent
entities. It understands that the orchestration of productivity is a dance of trust,
and thus, it safeguards user information with the tenacity of a sentinel.
Stepping into tomorrow, Intent Email is not a stagnant entity but a journey into
boundless potential. As technology burgeons, its algorithms evolve, its
intelligence matures, and its proficiency deepens. It is a phoenix rising from each
interaction, each keystroke, each commitment fulfilled. A journey through the
labyrinthine garden of Intent Email is a sojourn towards productivity, a voyage
towards clarity, and a passage towards mastery.
3.LITERATURE REVIEW
In today's digital age, email communication has become an integral part of our
personal and professional lives. The deluge of emails flooding our inboxes
necessitates efficient email management, where identifying and acting upon
crucial events, appointments, and tasks becomes paramount. Enter "Intent
Email," an innovative solution designed to revolutionize this landscape by
automating the event extraction process, deciphering user intent, and
seamlessly generating actionable to-do lists. This literature review delves into the
multifaceted world of email management, event extraction, and task generation,
elucidating the varied approaches, challenges, and motivations that drive the
development of Intent Email.
Rule-based methods: The initial foray into event extraction entailed rule-based
techniques. These methods relied on predefined patterns and rules to capture event-
related information within emails. Matching common phrases, recognizing date formats, and
spotting keywords formed the foundation of this approach. However, the inflexibility of
rule-based systems limited their adaptability to diverse language expressions and
contextual intricacies. Informal expressions like "Let's grab coffee
next week", which name no explicit date, posed a daunting challenge for these systems.
The advent of labeled email datasets catalyzed the rise of supervised machine
learning techniques. By harnessing the power of algorithms like support vector
machines, conditional random fields, and neural networks, models were trained
to discern intricate patterns and relationships among entities and attributes.
These algorithms exhibited a commendable aptitude for event extraction and
classification, provided they were equipped with substantial and accurately
annotated training data.
The scarcity and expense of labeled data necessitated the exploration of semi-
supervised and unsupervised approaches. These methodologies
combined limited labeled data with a large pool of unlabeled data to
improve event extraction precision. Techniques like clustering, bootstrapping,
and topic modeling contributed to the gradual
refinement of event extraction methods.
The meteoric rise of deep learning beckoned forth a new era in event extraction.
Transformer-based models, notably BERT, emerged as the harbinger of
transformation in this domain. The innate ability of these models to capture
contextual nuances and intricate word relationships fueled their superior
performance in deciphering event mentions and delineating their intricate
attributes. The comprehensive understanding of event-centric semantics set the
stage for more accurate and contextually nuanced event extraction.
3.3.2.Entity Resolution:
The landscape of event extraction is dotted with the formidable challenge of
entity resolution. Accurately identifying and resolving entities such as dates,
times, locations, and participants assumes pivotal importance. Contextual
variations and the presentation of these entities in diverse formats pose intricate
puzzles that event extraction systems must adeptly solve.
3.3.3.Data Privacy:
As the trajectory of event extraction accelerates, the aspect of data privacy looms
large. Ensuring that event-related information is extracted from emails while
safeguarding user privacy and adhering to stringent data protection regulations
becomes a compelling conundrum. Striking a balance between functionality and
user data security emerges as a formidable challenge in the Intent Email
landscape.
3.4.Conclusion
4.PROJECT UNIQUENESS
Intent Email is not just another email management app; it's a game-changer that
combines event extraction, intelligent scheduling, and personalized to-do lists to
revolutionize the way individuals interact with their emails and manage their
tasks. This uniqueness is underpinned by its innovative features, adaptability to
user preferences, and real-world applications.
4.2.Contextual Understanding
What truly differentiates Intent Email is its ability to comprehend the contextual
intricacies of emails. While other apps may struggle with deciphering ambiguous
phrases or handling informal language, Intent Email excels. For instance, an
email containing "Let's catch up sometime next week!" might baffle traditional
systems. However, Intent Email skillfully interprets the intent and timeframe,
creating a task for scheduling a catch-up event in the upcoming week.
into users' calendars, optimizing their time and avoiding scheduling conflicts.
Imagine a user with a busy day ahead; Intent Email recognizes this and
schedules the extracted tasks strategically, maximizing efficiency without
overwhelming the user.
4.4.1.Professional Efficiency:
4.4.2.Personal Productivity:
4.4.3.Student Life:
5.PROJECT APPLICATIONS
Intent Email, the innovative email management app that combines an SVM
classifier with NLP libraries, brings a paradigm shift to how individuals interact
with their emails and manage tasks. This unique synergy has far-reaching
applications across various domains, revolutionizing productivity and
organization. In this discussion, we delve into three diverse case studies to
showcase the real-world impact and versatility of Intent Email.
Scenario:
Application:
Intent Email offers a tailored solution for professionals like John, streamlining
their workflow and maximizing efficiency.
Event Extraction:
Intent Email's SVM classifier identifies event-related emails, and the extraction
stage then pulls out the key details. When John receives an email titled "Client Presentation Tomorrow
at 10 AM," Intent Email instantly recognizes the event, extracts the date, time,
and participant, and categorizes it as a "Meeting."
Intent Email compiles a personalized to-do list for John, populating it with the
extracted event. The task is scheduled to align with John's preferences, ensuring
that he receives timely reminders without disrupting his workflow.
Recognizing John's busy schedule, Intent Email integrates the extracted event
into his calendar, avoiding conflicts with ongoing tasks and ensuring that John
is well-prepared for the client presentation.
Impact:
Scenario:
Application:
Intent Email becomes Sarah's trusted ally, enabling her to seamlessly manage
personal and professional obligations.
Event Extraction:
Intent Email recognizes the importance of family events. When Sarah receives
an email about her son's school play, the app identifies the event, extracts the
date and time, and categorizes it as "Family Event."
Intent Email curates a to-do list for Sarah, prioritizing family events. The app's
flexibility ensures that Sarah's schedule isn't overwhelmed, providing her with
manageable tasks to fit around her other commitments.
Intent Email synchronizes the extracted family event with Sarah's calendar,
preventing any clashes with work meetings or personal appointments.
Impact:
Sarah now enjoys a more balanced life, thanks to Intent Email's assistance in
managing family events and tasks. She attends her son's play, juggles work
responsibilities, and indulges in personal pursuits with ease.
Scenario:
Application:
Intent Email transforms Alex's academic journey, empowering him to excel both
in academics and extracurricular activities.
Event Extraction:
Intent Email identifies academic events in emails. When Alex receives an email
about a group study session, the app recognizes the intent, extracts the date,
time, and location, categorizing it as a "Study Session."
Intent Email assembles a tailored to-do list for Alex, focusing on academic events
and assignments. The app ensures that the extracted study session is seamlessly
integrated into his study routine.
Intent Email incorporates the study session into Alex's calendar, aligning it with
his class timings and part-time work shifts.
Impact:
Alex gains an academic edge as Intent Email optimizes his task management. He
attends study sessions, submits assignments on time, and maintains an active
presence in extracurricular activities without sacrificing his studies.
6.SOFTWARE AND HARDWARE REQUIREMENTS
Fig 6.1 illustrates the email processing system's architecture, including POP3 retrieval,
a Python-based backend with SVM classification, and a React.js UI.
6.1.Comprehensive software requirements for Intent Email:
6.1.2.1.Python:
6.1.2.2.Node.js:
Used to build the backend server that handles communication with the frontend,
manages requests, and interacts with the SVM classifier and NLP libraries.
Intent Email's email extraction process relies on the POP3 protocol to retrieve
emails from mail servers. The software should have the capability to connect to
email servers, authenticate users, and fetch email content.
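The retrieval step described above can be sketched with Python's standard-library poplib and email modules. This is a minimal sketch under stated assumptions, not the project's own code: the helper names fetch_raw_messages and parse_message are hypothetical, and the sketch assumes the mail server accepts POP3 over SSL on port 995.

```python
import poplib
from email import message_from_bytes, policy


def fetch_raw_messages(host, user, password, limit=10):
    """Fetch up to `limit` raw messages over POP3-SSL (hypothetical helper)."""
    conn = poplib.POP3_SSL(host, 995)  # assumed port; adjust for the server in use
    try:
        conn.user(user)
        conn.pass_(password)
        count, _mailbox_size = conn.stat()
        raw_messages = []
        for index in range(1, min(count, limit) + 1):
            _response, lines, _octets = conn.retr(index)
            raw_messages.append(b"\r\n".join(lines))
        return raw_messages
    finally:
        conn.quit()


def parse_message(raw_bytes):
    """Return the subject and plain-text body of one raw message."""
    msg = message_from_bytes(raw_bytes, policy=policy.default)
    body_part = msg.get_body(preferencelist=("plain",))
    body = body_part.get_content() if body_part is not None else ""
    return {"subject": str(msg["subject"] or ""), "body": body.strip()}
```

Keeping parsing separate from fetching means the later classification and extraction stages can be exercised on stored messages without a live mail server.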
6.1.3.3.spaCy:
6.1.3.4.dateutil:
Intent Email utilizes regex patterns to capture specific formats and patterns
relevant to event extraction. The software's regex engine should support
advanced pattern matching and extraction.
6.1.4.1React.js:
Intent Email's user interface is built using the React.js library. This requires the
setup of a Node.js environment, npm package manager, and the ability to develop
and bundle React components.
6.1.5. Database (Optional):
PostgreSQL or MySQL: If Intent Email requires data storage for user preferences,
extraction history, or any other user-related information, the software should be
capable of integrating with a relational database.
6.1.6.1.Internet Connectivity:
Intent Email requires a stable and reliable internet connection for tasks such as
retrieving emails, sending responses, and real-time interaction with users.
6.1.7.Deployment:
Cloud Hosting Platforms: Intent Email can be deployed on cloud platforms such
as AWS, Microsoft Azure, or Google Cloud. These platforms offer scalability,
security, and infrastructure management, facilitating hassle-free deployment
and management.
7.IMPLEMENTATION DETAILS
Fig 7.1 outlines the intent-based email application: dataset collection from
diverse sources; preprocessing using spaCy, dateutil, and regex; SVM-based
intent classification; and event extraction for organizing tasks from emails.
7.1.Data Collection:
comprising around 20,000 emails. The following outlines the data collection and
preprocessing steps.
7.1.1.Kaggle:
7.1.2.Enron Dataset:
The Enron email dataset, infamous due to the Enron scandal, provides a
valuable resource for real-world email analysis. It contributes a diverse set of
emails from a corporate environment, aiding in capturing professional
communication patterns.
7.1.3.Personal Emails:
7.2.Preprocessing:
7.2.1.Text Cleaning:
The emails were tokenized into words and sentences using spaCy. Text was then
normalized by converting to lowercase and handling punctuation to ensure
consistent text representation.
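The tokenization and normalization step can be illustrated as follows. The project used spaCy for this; the sketch below deliberately swaps in plain regular expressions from the standard library so that it runs without a language model. The function names and the two patterns are assumptions made for illustration.

```python
import re

# Word-like tokens, optionally with an apostrophe suffix such as "let's".
TOKEN_RE = re.compile(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?")
# Split on whitespace that follows sentence-ending punctuation (rough heuristic).
SENTENCE_RE = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text):
    """Split text into sentences using the punctuation heuristic above."""
    return [s for s in SENTENCE_RE.split(text.strip()) if s]


def normalize_tokens(sentence):
    """Lowercase the sentence and keep word-like tokens, dropping punctuation."""
    return TOKEN_RE.findall(sentence.lower())
```

A trained tokenizer handles abbreviations, URLs, and email addresses far better than these two patterns; the sketch only shows the shape of the output that later stages consume.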
7.2.3.Lemmatization and Stopwords:
7.2.4.Intent Labeling:
Each email was labeled with its corresponding intent category, which was
manually assigned based on context. This allowed the SVM model to learn from
labeled examples during training.
7.2.5.Feature Extraction:
Features for SVM training were extracted using spaCy. Part-of-speech tags,
named entities, dependency relations, and other linguistic features were
captured to represent the content and structure of the emails.
7.2.6.Balancing:
maintaining a maximal distance (margin) between the classes and the
hyperplane.
Fig 7.2 illustrates how Intent Email employs SVM for categorization,
beginning with the input email, feature extraction, and transformation.
7.3.2.Feature Extraction:
The fundamental idea of SVM is to find the hyperplane that best separates the
two classes while maximizing the margin between them. The margin is the
minimum distance between the hyperplane and the closest data points from each
class. SVM aims to find the hyperplane that has the largest margin, as this
generally leads to better generalization and improved performance on unseen
data.
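The margin definition above can be made concrete with a small calculation. For a hyperplane w.x + b = 0, the distance of a point x from the hyperplane is |w.x + b| / ||w||, and the margin is the smallest such distance over the data. The sketch below assumes plain Python lists for vectors; the function names are illustrative.

```python
import math


def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def margin(w, b, points):
    """Smallest absolute distance from any point to the hyperplane."""
    return min(abs(signed_distance(w, b, x)) for x in points)
```

For example, with w = [3, 4] and b = -5, the origin lies at distance 1 on the negative side, so a dataset containing the origin has a margin of at most 1 for that hyperplane.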
7.3.2.2.Handling Non-Linearity:
While SVM is highly effective for linearly separable data, many real-world
problems involve complex decision boundaries that cannot be accurately
described by a linear hyperplane. In email filtering, where the distinction
between important emails and spam can be nuanced, SVM can use
kernel functions to map the data into a higher-dimensional space where
separation becomes possible. Common kernels include polynomial, radial basis
function (RBF), and sigmoid kernels.
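Two of the kernels named above can be written out directly. This is a minimal sketch of the standard formulas; the default values for gamma, degree, and coef0 are illustrative assumptions, not settings taken from the project.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel: K(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * sq_dist)


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel: K(x, y) = (x.y + coef0) ** degree."""
    return (sum(xi * yi for xi, yi in zip(x, y)) + coef0) ** degree
```

The RBF kernel returns 1 for identical points and decays toward 0 as they move apart, which is what lets the classifier draw curved boundaries around clusters of similar emails.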
Training SVM involves finding the hyperplane parameters that maximize the
margin while correctly classifying the training data. This is achieved through
optimization techniques like gradient descent or quadratic programming. SVM
strives to balance the margin maximization with minimizing classification errors
on the training set.
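The trade-off between a wide margin and few training errors can be sketched as sub-gradient descent on the objective 0.5 * ||w||^2 + C * sum(max(0, 1 - y * (w.x + b))). This is an illustrative stand-in, not the project's training code: the function names, learning rate, epoch count, and value of C are assumptions, and library solvers typically use more refined optimizers.

```python
def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=200):
    """Minimize 0.5*||w||^2 + C * sum(hinge losses) by sub-gradient descent.

    X is a list of feature vectors and y a list of labels in {-1, +1}.
    C, lr and epochs are illustrative values, not tuned settings.
    """
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)  # gradient of the 0.5*||w||^2 term
        grad_b = 0.0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1:  # inside the margin or misclassified
                for j in range(n_features):
                    grad_w[j] -= C * yi * xi[j]
                grad_b -= C * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1
```

A larger C penalizes margin violations more heavily and fits the training set more tightly; a smaller C tolerates some violations in exchange for a wider margin.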
Fig 7.3 This code loads and preprocesses text data from a dataset, trains an SVM
classifier with a linear kernel, and handles NaN values for both training and
testing.
7.3.2.5.Model Evaluation:
After training, the SVM model is evaluated on a separate test dataset to assess
its generalization performance. Common evaluation metrics for email filtering
include accuracy, precision, recall, and F1-score. These metrics provide insights
into the model's ability to correctly classify both important emails and spam
while minimizing false positives and false negatives.
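The four metrics named above follow directly from the counts of true and false positives and negatives. The sketch below computes them for the "important" versus "spam" labels used in this report; the function name and the choice of "important" as the positive class are assumptions for illustration.

```python
def classification_metrics(y_true, y_pred, positive="important"):
    """Compute accuracy, precision, recall and F1 for one positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "confusion": [[tp, fn], [fp, tn]],  # rows: actual, columns: predicted
    }
```

Precision answers "of the emails flagged important, how many really were", while recall answers "of the important emails, how many were flagged"; F1 is their harmonic mean.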
Fig 7.4 This code converts the test data matrix for SVM classification, predicts
classes, calculates accuracy, and generates a classification report along with a
confusion matrix for evaluation.
7.4.Event Extraction
Fig 7.3 illustrates how Intent Email integrates spaCy for NLP tasks. Starting with
text input, spaCy performs named entity recognition, part-of-speech tagging,
and syntactic analysis, enhancing email content understanding and
organization.
spaCy, a powerful NLP library, provides a range of tools for linguistic analysis
that are crucial for event extraction.
7.4.1.1.Tokenization and Part-of-Speech Tagging:
Tokenization breaks down the email text into individual words or tokens, while
part-of-speech tagging assigns grammatical categories to each token. This
information aids in identifying relevant phrases and entities related to events.
7.4.1.2.Dependency Parsing:
NER identifies entities like dates, times, locations, and names. In event
extraction, NER helps identify specific dates and times mentioned in the email.
7.5.1.Date Parsing:
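dateutil can parse absolute dates in various formats, such as "August 22,
2023" or "22/08/2023" (read day-first when its dayfirst option is set). Relative
expressions such as "tomorrow" are not resolved by dateutil's parser and must be
converted separately against a reference date. Together, these steps ensure that event dates
are correctly identified and extracted.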
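A minimal sketch of date resolution with dateutil follows. Absolute dates go through dateutil.parser.parse; relative words such as "tomorrow" are not understood by that parser, so the sketch resolves them against a reference date with relativedelta. The RELATIVE_OFFSETS table and the parse_event_date helper are hypothetical names introduced for illustration.

```python
from datetime import datetime

from dateutil import parser
from dateutil.relativedelta import relativedelta

# Hypothetical lookup for relative expressions; extend as needed.
RELATIVE_OFFSETS = {
    "today": relativedelta(days=0),
    "tomorrow": relativedelta(days=+1),
    "next week": relativedelta(weeks=+1),
}


def parse_event_date(text, reference, dayfirst=True):
    """Resolve `text` to a datetime, relative to `reference` when needed."""
    key = text.strip().lower()
    if key in RELATIVE_OFFSETS:
        return reference + RELATIVE_OFFSETS[key]
    # dayfirst=True reads ambiguous numeric dates such as 05/08/2023 as 5 August.
    return parser.parse(text, dayfirst=dayfirst)
```

The reference date would normally be the time the email was sent, so that "tomorrow" in an old message still resolves to the day the sender meant.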
7.6.Using regex for Pattern Matching:
7.6.1.Keyword Extraction:
7.6.2.Contextual Patterns:
Contextual patterns can be developed to capture event details that follow certain
linguistic structures. For instance, a pattern that identifies a sentence starting
with "Regarding our meeting on [date]," followed by relevant details, can
efficiently extract meeting information.
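The "Regarding our meeting on [date]," example can be expressed as a regular expression with named groups. This is a sketch under assumptions: the date sub-pattern covers only numeric dates like 22/08/2023 and month-name dates like "August 22, 2023", and the names DATE, MEETING_RE, and extract_meeting are illustrative.

```python
import re

# Assumed date shapes: 22/08/2023, or a month name followed by a day and optional year.
DATE = r"(?:\d{1,2}/\d{1,2}/\d{4}|[A-Za-z]+\.? \d{1,2}(?:, \d{4})?)"
MEETING_RE = re.compile(
    rf"^Regarding our meeting on (?P<date>{DATE}),\s*(?P<details>.+)$",
    re.IGNORECASE,
)


def extract_meeting(sentence):
    """Return the date text and trailing details, or None if no match."""
    match = MEETING_RE.match(sentence.strip())
    if match is None:
        return None
    return {
        "date": match.group("date"),
        "details": match.group("details").strip(),
    }
```

The captured date is still raw text; it would be passed to the date-parsing step to obtain an actual datetime.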
7.7.2.Temporal Parsing:
7.7.3.Pattern Matching for Specific Events:
Fig 7.4 This code utilizes spaCy to extract event names from email subjects. It
identifies keywords related to events, iterates through tokens, and captures
event names based on the first matching keyword, enhancing email content
understanding.
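The original listing is not reproduced in this text, so the following is only a rough stand-in for the idea the caption describes: scan the subject's tokens and build an event name around the first event keyword found. It uses whitespace tokens rather than spaCy tokens, and the keyword set and the rule of keeping one preceding word are assumptions.

```python
# Hypothetical keyword list; the project's actual list is not shown in the report.
EVENT_KEYWORDS = {"meeting", "presentation", "symposium", "session", "workshop", "play"}
PUNCTUATION = ".,:;!?"


def extract_event_name(subject):
    """Return the first keyword plus the word before it, or None."""
    tokens = subject.split()
    for i, token in enumerate(tokens):
        if token.strip(PUNCTUATION).lower() in EVENT_KEYWORDS:
            start = max(i - 1, 0)  # keep one preceding word as a modifier
            return " ".join(tokens[start:i + 1]).strip(PUNCTUATION)
    return None
```

With spaCy, the same loop would iterate over doc tokens and could use noun chunks instead of the fixed one-word window.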
Fig 7.5 This code uses spaCy to process the email body, extracting event
information like event ID, title, and description. It employs regular expressions
to detect and process event dates, enhancing event details extraction.
8.RESULTS AND DISCUSSION
In this section, we present the results of our intent email conversion project,
which aims to automatically convert emails into a to-do list using Support Vector
Machines (SVM), spaCy for natural language processing, regular expressions,
and the dateutil library for event extraction. Our approach offers an efficient way
to transform email content into structured tasks, providing users with organized
and actionable information.
8.1.Results Overview
Our project involved a series of critical steps, starting from data collection and
preprocessing, followed by SVM classification for intent identification, event
extraction using spaCy, date extraction using regular expressions, and
ultimately generating a to-do list from the extracted event data. To evaluate our
system, we utilized a diverse dataset consisting of 20,000 emails gathered from
various sources, including Kaggle, the Enron dataset, and our own personal
emails.
Our trained SVM classifier achieved an impressive accuracy of 97.04% on the
test set. Furthermore, the precision and recall values were well-balanced,
indicating that the model can effectively classify both "important" and "spam"
emails without a significant bias toward one category. The F1-score of 0.97
validates the model's overall performance in capturing true positives while
minimizing false positives and false negatives.
Fig 8.1 The accuracy report summarizes Intent Email's performance on the test set,
with an accuracy of 97%. Accurate categorization lets it
identify and manage important messages.
Fig 8.2 The confusion matrix highlights the classification results, showing a
generally accurate categorization of emails.
Fig 8.3 depicts the output generated from event extraction. It showcases two
instances of event information.
Having extracted event and date information, the next step was to generate a
structured to-do list for users. This to-do list serves as a concise summary of
upcoming events, allowing users to efficiently plan their schedules. We further
integrated this feature into the React user interface using the dhtmlx scheduler. This
integration enables users to visualize their tasks and events within a user-
friendly calendar interface, enhancing their ability to manage their time
effectively.
This strategic integration seamlessly translates users' tasks and events into an
intuitive calendar interface. By harnessing the capabilities of dhtmlx scheduler,
users gain a comprehensive visual representation of their commitments, further
strengthening their capacity to efficiently navigate their schedules. This unified
system underscores our commitment to fostering enhanced time management
and productivity within a user-friendly environment.
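The hand-off from extracted events to the calendar view can be sketched as a conversion into sorted, scheduler-style records. The ExtractedEvent class, the one-hour default duration, and the date string format are assumptions; the field names id, text, start_date, and end_date follow the event-object convention commonly used with dhtmlx scheduler and should be checked against the version in use.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ExtractedEvent:
    title: str
    start: datetime
    duration_minutes: int = 60  # assumed default when the email gives no end time


def to_scheduler_items(events):
    """Sort events by start time and emit scheduler-style dictionaries."""
    items = []
    ordered = sorted(events, key=lambda e: e.start)
    for idx, ev in enumerate(ordered, start=1):
        end = ev.start + timedelta(minutes=ev.duration_minutes)
        items.append({
            "id": idx,
            "text": ev.title,
            "start_date": ev.start.strftime("%Y-%m-%d %H:%M"),
            "end_date": end.strftime("%Y-%m-%d %H:%M"),
        })
    return items
```

The resulting list can be serialized with json.dumps and served to the front end, which keeps the Python extraction code independent of the calendar component.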
Fig 8.4 Streamline planning with this robust tool, visually presenting events and
tasks for seamless schedule management in a dynamic, interactive calendar
interface.
Fig 8.5 A comprehensive list showcasing tasks, priorities, and completion status,
aiding efficient task management and promoting organized productivity.
8.2.Discussion
The results of our intent email conversion project underline the potential benefits
and challenges associated with automating email processing and organization.
The following discussion elaborates on key findings, limitations, and future
directions.
analysis tasks such as sentiment analysis, content categorization, and
information extraction.
Test Email
Hi U RAKSHITHA,
We're excited to invite you to Google Cloud Next this year, where Kaggle Models
will be featured in a demo showing you how to deploy and fine-tune openly
available LLMs with Vertex AI.
Google Cloud Next is an annual hybrid event where you'll get the chance to see
the latest Google Cloud technology in action. It’s happening next week on Aug. 29-
31, 2023 at the Moscone Center in San Francisco or online.
In addition to the demo where we're featured, there are dozens more machine
learning demos and workshops worth checking out. Discover the entire AI and ML
session library here.
DESCRIPTION:Google Cloud Next is an annual hybrid event where you'll get the
chance to see the latest Google Cloud technology in action. It’s happening next
week on Aug. 29-31, 2023 at the Moscone Center in San Francisco or online.
In addition to the demo where we're featured, there are dozens more machine
learning demos and workshops worth checking out. Discover the entire AI and ML
session library here.
,
admin_id: [1, 2, 3, 4], }
Status pass
9.CONCLUSION AND FUTURE SCOPE
9.1.Conclusion:
extraction is pivotal for ensuring that users have a clear understanding of event
timings.
The integration of a to-do list within the React user interface, powered by the dhtmlx
scheduler, provides a tangible and visual way for users to engage with their
upcoming events. This integration promotes efficient task management and
planning.
9.2 Future Scope
The current system combines machine learning, natural language processing, and
user interface integration, and its initial results lay the groundwork for
several extensions. This section outlines the directions we can explore to
evolve the system into a fully functional app: expanding linguistic
capabilities, real-time processing, customizable filters, and advanced event
detection. Guided by user feedback, newer technologies, and a priority on data
security, the aim is an app that fits into users' daily lives and simplifies
how they handle emails and tasks. These directions serve as a roadmap for our
ongoing work.
9.2.3 Customizable Filters and Priority Settings
Empowering users to customize their filters and priority settings will offer a more
tailored experience. Users can define their own criteria for categorizing emails as
"important" and can assign varying levels of priority to different email sources or
senders. This customization ensures that the system aligns with individual
preferences and workflows, making it a highly adaptable tool for diverse users
with unique needs.
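One way to picture such user-defined settings is as a list of rules, each pairing sender or keyword criteria with a priority level. The Python sketch below is illustrative only and assumes a design the report does not specify: the `PriorityRule` class, its fields, the `classify` function, and the example addresses are all hypothetical and not part of the current system.

```python
from dataclasses import dataclass, field


@dataclass
class PriorityRule:
    """Hypothetical user-defined rule describing what makes an email important."""
    priority: int                                  # higher number = more important
    senders: set = field(default_factory=set)      # lowercase addresses or "@domain" suffixes
    keywords: set = field(default_factory=set)     # lowercase words to look for in the subject

    def matches(self, sender, subject):
        sender = sender.lower()
        sender_hit = any(
            sender == s or (s.startswith("@") and sender.endswith(s))
            for s in self.senders)
        keyword_hit = any(k in subject.lower() for k in self.keywords)
        return sender_hit or keyword_hit


def classify(sender, subject, rules, default=0):
    """Return the highest priority among the matching rules, or the default."""
    hits = [rule.priority for rule in rules if rule.matches(sender, subject)]
    return max(hits, default=default)


# Example configuration a user might define (all values are made up).
rules = [
    PriorityRule(priority=3, senders={"@university.example"}),
    PriorityRule(priority=2, keywords={"deadline", "invitation"}),
    PriorityRule(priority=1, senders={"newsletter@shop.example"}),
]
```

With these rules, a message from any address at `university.example` gets priority 3, a subject containing "invitation" gets priority 2, and an unmatched message falls back to the default of 0. Taking the highest matching priority is one possible policy; a first-match or weighted scheme would be equally valid choices.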
In conclusion, developing a fully functional app is the next major phase of our
intent email conversion project. Moving from a conceptual framework to a working
application would make the system accessible, user-friendly, and integrated into
users' daily routines. The envisioned app not only streamlines email processing
but also gives users a centralized platform for managing their commitments and
engagements. As we develop the app, we remain focused on building a robust and
reliable tool that improves users' efficiency and organization, extending the
project's impact beyond its current capabilities.
10. REFERENCES, ANNEXURE
A LIST OF TABLES
Fig A.2 Difference between POP3 and IMAP
B LIST OF FIGURES
Fig B.2 An overview of Email access protocols