Fyp Final Report


FINAL YEAR PROJECT

THE TRENDS

SESSION 2015-2019

G1F15BSCS0112 USAMA ABDUL SAMAD


G1F15BSCS0106 SYED ZEESHAN JAFAR
G1F15BSCS0103 HUZAIFA RUBBANI BUTT

BACHELOR OF SCIENCE
IN
COMPUTER SCIENCE

Punjab Group of Colleges


University of Central Punjab
THE TRENDS

By

USAMA ABDUL SAMAD


SYED ZEESHAN JAFAR
HUZAIFA RUBBANI BUTT

Project submitted to
Faculty of Information Technology,
University of Central Punjab,
Lahore, Pakistan.
In partial fulfillment of the requirements for the degree of

BACHELOR OF SCIENCE
IN
COMPUTER SCIENCE

Lecturer: Ghulam Ayzed Khan Dean: Prof. Shahid Saleem

___________________ _______________
DECLARATION

We hereby declare that “No portion of the work referred to in this project has been
submitted in support of an application for another degree or qualification of this or any
other university, institute or other institution of learning”. It is further declared that this
undergraduate project, neither as a whole nor in part, has been copied from any source,
except where references have been provided.

MEMBERS’ SIGNATURES

University of Central Punjab,Gujranwala Department of Computer Science
ACKNOWLEDGEMENTS

We have put considerable effort into this project. However, it would not have been
possible without the kind support and help of many individuals and organizations. We
would like to extend our sincere thanks to all of them.

We are highly indebted to Professor Ghulam Ayzed Khan for their guidance and
constant supervision, for providing the necessary information regarding the project,
and for their support in completing it.

We would like to express our gratitude to our parents and to the members of the
University of Central Punjab for their kind cooperation and encouragement, which helped
us complete this project. We would also like to express our special gratitude and thanks
to our fellow classmates for giving us their attention and time.

Our thanks and appreciation also go to our family members, who have always been our
emotional and mental support in developing the project, and to our batch mates, who
willingly helped us with their abilities.

DEDICATION
We would like to dedicate this project to our parents, supervisor and teachers and to
those of our friends who have helped us through this project by being a great source of
comfort and encouragement to us.

Table of Contents

1 Introduction
1.1 Purpose of The Trends
1.2 Existing Examples/ Solutions
1.3 Business Scope
1.4 Useful Tools and Technologies
1.5 Project Work Breakdown
1.6 Project Timeline
1.7 Architecture Diagram
2 Software Requirement Analysis
2.1 Purpose
2.2 Scope
2.3 Overall Description
2.4 Software Development Model
2.5 System Use Case Modeling
2.6 Use Case Description
2.7 System Sequence Diagrams
3 System Design
3.1 Software Architecture Diagram
3.2 Sequence Diagram
3.3 Activity Diagram
3.4 Data Flow Diagram Level 0
3.5 Data Flow Diagram Level 1
3.6 Entity Relationship Diagram
3.7 Class Diagram
3.8 Interface
4 Implementation
4.1 Introduction
4.2 Web Snippets
4.3 Results and Accuracy
5 Testing
5.1 Introduction
5.2 Test Cases
6 Software Deployment
7 Project Evaluation
7.1 Evaluation of Objectives and Aims
7.2 Evaluation of Project Management
7.3 Discussion
References

Chapter 1
Introduction

1 Introduction
The application is a web-based application that performs sentiment analysis of tweets,
which are brought into the web application using data stream processing. The main idea
of stream processing is to compute on data as it is produced; typically the data consists
of sensor events, user activity on a website (such as Twitter), financial trends and so on.
Before stream processing, data was first saved in a database and the required data was
then accessed from that database (batch processing). Stream processing turns the tables,
because data flows through the application continuously. As soon as the stream receives
data, the stream application responds to it and immediately updates an aggregate or
remembers it for future use.

Stream processing can also process multiple data streams jointly, and each computation
over an event data stream may produce other event data streams. The systems that
receive and send the data streams are called stream processors.

The stream processing paradigm naturally addresses many challenges that developers
of real-time data analytics and event-driven applications face today:

• Applications and analytics react to events instantly: There is no lag time; the
system responds in milliseconds. Data is up to date, meaningful and valuable.

• Data quantity: Event streams are processed directly, even when they contain
millions of events, and only a meaningful subset of the data is persisted.

• Stream processing naturally and easily models the continuous and timely
nature of most data: This is in contrast to scheduled (batch) queries and
analytics on static/resting data. Incrementally computing updates, rather than
periodically recomputing over all data, fits naturally with the stream processing
model.

• Stream processing decentralizes and decouples the infrastructure: The
streaming paradigm reduces the need for large and expensive shared
databases. Instead, each stream processing application maintains its own data
and state, which is made simple by the stream processing framework. In this
way, a stream processing application fits naturally in a microservices
architecture.

In other words, stream processing offers better performance, accessibility and
applicability.
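
To make the contrast between batch and stream processing concrete, the following minimal sketch keeps a running average that is updated as each event arrives, instead of recomputing over all stored data. The names `RunningMean`, `batch_mean` and `consume`, and the sample polarity values, are illustrative assumptions and are not taken from The Trends code base.

```python
from typing import Iterable, List


def batch_mean(stored_values: List[float]) -> float:
    """Batch style: all data is stored first, then processed in one pass."""
    return sum(stored_values) / len(stored_values)


class RunningMean:
    """Stream style: the aggregate is updated as each event arrives."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0

    def update(self, value: float) -> float:
        # Incremental update: earlier events do not need to be kept or re-read.
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


def consume(stream: Iterable[float]) -> RunningMean:
    """Feed every event of a (possibly unbounded) stream into the aggregate."""
    aggregate = RunningMean()
    for event in stream:
        aggregate.update(event)
    return aggregate


# Hypothetical sentiment polarity values, one per incoming tweet.
events = [0.5, -0.25, 0.0, 0.75]
streamed = consume(iter(events))
```

Both approaches arrive at the same average, but the streaming version has an up-to-date value after every event and never needs the full history.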
1.1 Purpose of The Trends
The main purpose of our web-based application is to perform sentiment analysis of
tweets on Twitter. Tweets under the top trending hashtags are brought in through a
stream processor (Apache Kafka); the text of each tweet is then analyzed to determine
whether people are speaking well or badly of the topic, or are in favor of or against it.
The application then offers different options for visualizing the analysis, such as an
indicator, a scatter graph or a bar graph. The results of the analysis can be used for
future predictions or simply as a record of past sentiment about the trend.

The application can be used on data from any domain, such as the stock market or politics.
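
The good/bad decision described above can be sketched as follows. This is a minimal illustration and not the project's actual code: it assumes the polarity score comes from TextBlob's `sentiment.polarity` attribute (a float between -1.0 and 1.0), and the function names and the zero threshold are hypothetical choices.

```python
def label_from_polarity(polarity: float, threshold: float = 0.0) -> str:
    """Map a polarity score in [-1.0, 1.0] to a sentiment label."""
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


def analyze_tweet(text: str) -> str:
    """Score a tweet with TextBlob and return its sentiment label.

    Requires the third-party textblob package. It is imported lazily so
    that the labelling logic above can be used and tested without it.
    """
    from textblob import TextBlob  # third-party dependency

    return label_from_polarity(TextBlob(text).sentiment.polarity)
```

Raising `threshold` above zero would widen the neutral band, which may be useful when many tweets score close to zero.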

1.2 Existing Examples/ Solutions


There are some websites and software products whose purpose is also sentiment
analysis. All of them work in batch processing. The Trends will have more features and
will differ from these existing systems.

Similar systems are listed below:

Meltwater: This software analyzes how a brand's marketing is performing on social
media and whether the brand is meeting its PR goals.

https://www.meltwater.com/me/social-media-analysis/?gclid=Cj0KCQjwuafdBRDmARIsAPpBmVWsDDaEX3yILS_mD7TUfSsflSO58cTMk2bfj5KnDbAj7RpXjZUdb0AaAmzYEALw_wcB

Octoparse: Octoparse is extraction software that requires no coding. It extracts content
from any website and saves it in a clean, structured format, and it also has the option of
converting the data into an API. It offers a simple point-and-click interface and deals
with almost all websites: it can extract text, image URLs and HTML, extract data behind
a login, capture data from search result pages, and use XPath (a query language for
selecting nodes from an XML document; it navigates through pages that are in XML
format).

https://www.octoparse.com/

1.3 Business Scope


The main scope of our application is to target people who have an interest in sports or
are sports analysts. There are many sports events running every day, e.g. the Football
World Cup, football leagues, the Cricket World Cup, cricket leagues etc. The NFL match
between the Dallas Cowboys and the Carolina Panthers was watched by 23.3 million
viewers.

People always want to know the hot favorite team or player of a match or event. Our
website will show trends related to that event or match. This will bring many people to
our website and grow our business.

1.4 Useful Tools and Technologies

• PHP will be used to implement the web application.
• Python and Java are used.
• Apache Kafka will be used to bring the tweets from Twitter (stream processing).
• Sentiment analysis will be done through the TextBlob API.

1.5 Project Work Breakdown


The Trends

Initiation:      Proposal
Requirements:    Functional, Non Functional, Use Case, Domain Model
Design:          Software Architecture, Class Diagram, ER Diagram, Database Schema
Implementation:  Front-End, Back-End
Testing:         Deployment
Project End:     Sign-off

1.6 Project Timeline

No  Milestones                Deliverables
1   7th semester mid term     • Requirements
                              • Design
                              • Technology learning
                              • Working batch processing
2   7th semester final term   • First prototype based on batch processing
3   8th semester mid term     • Changing domain from batch processing to real-time
                                processing and making the code optimal
4   8th semester final term   • Marketing
                              • Error correction
                              • Complete system delivered

1.7 Architecture Diagram
The diagram shows that data from the source application (Twitter) is brought in through
stream processing with Kafka, sentiment analysis is then performed on the data using the
TextBlob API, and the results of the analysis are shown in the web application.

Chapter 2
Software Requirement Analysis

2 Software Requirement Analysis
2.1 Purpose
Requirements analysis, also called requirements engineering, is the process of
determining user expectations for a new or modified product. These features, called
requirements, must be quantifiable, relevant and detailed. The requirements defined
in this section will later be used to validate and verify the project.

2.2 Scope
We are developing this project, with a friendly user interface, for cricket enthusiasts
and analysts.

2.3 Overall Description


2.3.1 Product Perspective
The main purpose of this system is to provide quick and easy access to sports
predictions.

2.3.2 Product Functions


The system to be developed is to involve only one actor:
2.3.2.1 User:
S.No  Functional Requirements         Non Functional Requirement      Status
1     User can submit a query by      All text fields must be filled  Fulfilled
      entering a keyword and
      selecting a sentiment
2     User can view the result on     User should submit a query by   Fulfilled
      which sentiment analysis has    entering a keyword
      been performed

2.3.2.2 System:
S.No  Functional Requirements         Non Functional Requirement      Status
1     System will get the tweets by   Kafka should be connected       Fulfilled
      using Kafka
2     System will do the real-time    Authentication with Twitter     Fulfilled
      analysis of tweets              should be there
3     System will do the sentiment                                    Fulfilled
      analysis of tweets
4     System will generate the        Graph option should be          Fulfilled
      sentiment analysis result in    available
      different forms

2.4 Software Development Model
The software development process model that we plan to use for the development of
this application is the agile model. The agile model is a combination of the iterative
and incremental process models. In the agile model a working product is delivered
after each iteration; this is the basic reason for choosing it. It is also well suited
because it is relatively quick, and as we have time constraints for finishing this
project we would like to have a working version available.

By using this model we will develop our web application in stages, with every working
version having more features than the previous one. This will largely decrease the
complexity of the project, and after each iteration we will be able to get feedback
from users on any shortcomings, which we can then address in the next iteration. The
product at our final iteration will therefore have all the features initially planned,
plus any variations that arise from user feedback.

Because we get a working version after every iteration, this model ensures that at no
point will we be completely blank or have nothing to show for our efforts.

2.5 System Use Case Modeling


A use case defines a set of use-case instances, where each instance is a sequence
of actions a system performs that yields an observable result of value to a particular
actor.

2.5.1 User Use Case Model

2.6 Use Case Description


2.6.1 Use Case Description “User Search Sentiment”
Use Case ID: 1
Use Case Name: Search Sentiment
Created By: Usama
Last Updated By: 16/7/2019
Date Created: 01/02/2019
Last Revision Date: 16/7/2019
Actors: User
Description: The user who wants to search any sentiment will type any query or keyword.
Trigger: Search button after any query is written in search box
Preconditions:
• Server should be running.
• Browser should be connected to the server.
• Twitter API should be working.
Post conditions: Results will be displayed according to the given query
Normal Flow:
1) User: The user will enter a query in the context field.
2) User: The user will then select the sentiment analysis type.
3) User: The user will then click on the Search button to get the results.
4) System: System will process the query and then will show the sentiment analysis
   result to the user.
Alternative Flows: User will cancel that page
Exceptions: 1. Query is invalid

2.6.2 Use Case description “User Display Type”:


Use Case ID: 2
Use Case Name: Display Type
Created By: Zeeshan
Last Updated By: 16/07/2019
Date Created: 01/02/2019
Last Revision Date: 16/07/2019
Actors: User
Description: The user wants to see the result in a particular form (e.g. scatter chart,
bar chart).
Trigger: Result type drop-down
Preconditions:
• Server should be running.
• Browser should be connected to the server.
• Twitter API should be working.
• Real-time data should be working.
• The Search button has been clicked after a query is written in the search box.
• The drop-down is displayed.
Post conditions: Results will be displayed according to the selected type.
Normal Flow:
1) User: The user will select the result type from the given types.
2) System: The system will display the results in the form of the selected type.
Alternative Flows: The user does not want to select any type.
Exceptions: The selected type may not display the result accurately.

2.6.3 Use Case description “Sentiment Result”:


Use Case ID: 3
Use Case Name: View Sentiment Result
Created By: Huzaifa
Last Updated By: 16/07/2019
Date Created: 01/02/2019
Last Revision Date: 16/07/2019
Actors: User
Description: The result is shown to the user in a chosen form (e.g. scatter chart, bar
chart).
Trigger: Result will be viewed according to the search
Preconditions:
• Server should be running.
• Browser should be connected to the server.
• Twitter API should be working.
• Real-time data should be working.
• The Search button has been clicked after a query is written in the search box.
• The drop-down is displayed.
• The result window is shown.
Normal Flow:
1) User: The user will get the desired result for that particular keyword.
2) System: The system will display the results to the user.
Alternative Flows: The user did not provide a keyword.
Exceptions: Keyword is invalid

2.7 System Sequence Diagrams


System Sequence diagrams are created to show the sequence of events among
user and the system to complete an action / use case.

2.7.1 SSD User search sentiment

Chapter 3
System Design

3 System Design
The main focus of this chapter is the design of the software. Designing the software is
just as important as coding it. The design discussed in this chapter is presented by
explaining its architecture to the stakeholders of the software being developed.

3.1 Software Architecture Diagram

3.2 Sequence Diagram


The behavior of various components of the software being developed is presented via
Sequence Diagrams.

3.2.1 SD User Search Sentiment

3.3 Activity Diagram

3.4 Data Flow Diagram Level 0

3.5 Data Flow Diagram Level 1

3.6 Entity Relationship Diagram

3.7 Class Diagram

3.8 Interface
3.8.1 Home

3.8.2 Search

3.8.3 Result

Chapter 4
Implementation

4 Implementation
4.1 Introduction
The main focus of this chapter is the implementation issues faced during the development
of the software. This part of the document covers chunks of the software code along with
their descriptions. The first major issue faced in these modules was bringing real-time
tweets into the web application; the second was checking the polarity of the tweets and
assigning them to a specific partition.
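
The second issue, assigning tweets to a partition according to their polarity, could look roughly like the sketch below. The one-partition-per-sentiment layout and the names `PARTITION_BY_LABEL`, `partition_for` and `route` are assumptions made for illustration; the report does not state how partitions were actually laid out.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical layout: one partition per sentiment class.
PARTITION_BY_LABEL = {"negative": 0, "neutral": 1, "positive": 2}


def partition_for(polarity: float) -> int:
    """Choose a partition index from a polarity score in [-1.0, 1.0]."""
    if polarity > 0:
        label = "positive"
    elif polarity < 0:
        label = "negative"
    else:
        label = "neutral"
    return PARTITION_BY_LABEL[label]


def route(scored_tweets: Iterable[Tuple[str, float]]) -> Dict[int, List[str]]:
    """Group (text, polarity) pairs by their target partition."""
    partitions: Dict[int, List[str]] = {
        index: [] for index in PARTITION_BY_LABEL.values()
    }
    for text, polarity in scored_tweets:
        partitions[partition_for(polarity)].append(text)
    return partitions


# Hypothetical scored tweets.
routed = route([("great win", 0.8), ("bad loss", -0.6), ("match today", 0.0)])
```

If the kafka-python client were used, the chosen index could be passed as the `partition` argument of `KafkaProducer.send`; here the grouping is done in memory so the routing logic can be checked without a running broker.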

Tools and Technology


Following are the tools and the technologies used for the development of the
software along with their explanation.
Parts of Software            Tools and Technologies   Reason
Web Application Development                           It is used as the desired tool to develop the website.
                             PHP                      Programming language used for the development of websites.
                             HTML and CSS             These are used to design the client end.
                             Google Chrome            A web browser used for testing the web application.
For Analysis                 Python IDLE              IDLE (short for integrated development environment or integrated development and learning environment) is an integrated development environment for Python.
                             Python                   Programming language used for the development of websites and data analysis.
                             Libraries                Tweepy is used to bring the tweets from Twitter.
                                                      Pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
                                                      Matplotlib is a Python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms.

4.2 Web Snippets
4.2.1 Python code for searching required tweets
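
A minimal sketch of keyword-based tweet selection is given here for orientation. It assumes the tweets have already been fetched (for example with Tweepy) and are available as dictionaries with a hypothetical `text` field; the function name `matching_tweets` and the sample data are illustrative and not the project's own code.

```python
from typing import Dict, Iterable, Iterator


def matching_tweets(
    tweets: Iterable[Dict[str, str]], query: str, limit: int
) -> Iterator[Dict[str, str]]:
    """Yield up to `limit` tweets whose text contains every word of `query`."""
    words = query.lower().split()
    found = 0
    for tweet in tweets:
        if found >= limit:
            return
        text = tweet["text"].lower()
        if all(word in text for word in words):
            found += 1
            yield tweet


# Hypothetical sample data standing in for fetched tweets.
sample = [
    {"text": "Pakistan vs South Africa starts today"},
    {"text": "Weather update for Lahore"},
    {"text": "South Africa bat first vs Pakistan"},
]
results = list(matching_tweets(sample, "Pakistan vs South Africa", limit=20))
```

The `limit` parameter plays the role of the "Number of tweets" input used in the test cases of Chapter 5.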

4.2.2 Python code for saving the tweets in a CSV file
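
Saving analyzed tweets to CSV can be sketched with Python's standard `csv` module. The column layout in `FIELDS` and the function name `write_tweets_csv` are assumptions for illustration; the project's actual columns are not given in the report.

```python
import csv
import io
from typing import Dict, Iterable, TextIO

FIELDS = ["text", "polarity", "label"]  # hypothetical column layout


def write_tweets_csv(rows: Iterable[Dict[str, object]], handle: TextIO) -> int:
    """Write a header plus one CSV row per tweet; return the number of rows."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count


# An in-memory buffer stands in for a file on disk in this example.
buffer = io.StringIO()
written = write_tweets_csv(
    [{"text": "Great innings, well played", "polarity": 0.8, "label": "positive"}],
    buffer,
)
```

When writing to a real file, opening it with `open(path, "w", newline="", encoding="utf-8")` avoids blank lines between rows on some platforms and keeps non-ASCII tweet text intact.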

4.2.3 Python code for showing the analysis in charts
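
Charting the analysis can be sketched as a count of tweets per sentiment label followed by a Matplotlib bar chart. The label order in `LABELS` and the function names are hypothetical, and the plotting function imports Matplotlib lazily so that the counting step can be run on its own.

```python
from collections import Counter
from typing import Iterable, List

LABELS = ["positive", "neutral", "negative"]  # hypothetical display order


def sentiment_counts(labels: Iterable[str]) -> List[int]:
    """Count how many tweets fall under each label, in LABELS order."""
    counts = Counter(labels)
    return [counts.get(label, 0) for label in LABELS]


def show_bar_chart(labels: Iterable[str]) -> None:
    """Draw a bar chart of the counts (requires the matplotlib package)."""
    import matplotlib.pyplot as plt  # third-party dependency

    plt.bar(LABELS, sentiment_counts(labels))
    plt.ylabel("Number of tweets")
    plt.title("Sentiment of matched tweets")
    plt.show()
```

The same counts could feed a scatter chart or an indicator, the other display types offered by the application.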

4.3 Results and Accuracy
4.3.1 Result 1

Chapter 5
Testing

5 Testing
5.1 Introduction
This section of the document briefly explains the testing phase of the project. The main
advantage of this phase is that it helps to resolve issues that may occur during actual
use. The section covers the important cases that a user encounters most often over the
lifecycle of their use of the system. Software testing is the most crucial stage of the
software development process. The main purpose of testing is to investigate or evaluate
a software component in order to find the bugs and errors that may occur during use.
Testing is done by executing a system in such a way that it identifies gaps, errors, or
missing requirements relative to the actual requirements.

Testing Methodology

It is essential to have a testing plan in place to ensure that the product delivered is
robust and stable, and is delivered on a predictable timeline. The testing methodology
that we have adopted to test the robustness of our tool is “unit testing”. The reason for
adopting this technique is that it allows testing at the most basic level, with
code-level access. This methodology is also time-efficient and is best suited to our
development strategy.
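
As an illustration of what a unit test at this level might look like, the sketch below uses Python's standard `unittest` module. The function under test, `label_from_polarity`, is a hypothetical stand-in for the project's polarity check and is not taken from the project's code.

```python
import unittest


def label_from_polarity(polarity: float) -> str:
    """Hypothetical unit under test: map a polarity score to a label."""
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "neutral"


class LabelFromPolarityTest(unittest.TestCase):
    """Each test checks one small, isolated behaviour of the function."""

    def test_positive_score(self) -> None:
        self.assertEqual(label_from_polarity(0.6), "positive")

    def test_negative_score(self) -> None:
        self.assertEqual(label_from_polarity(-0.3), "negative")

    def test_zero_is_neutral(self) -> None:
        self.assertEqual(label_from_polarity(0.0), "neutral")


def run_tests() -> unittest.TestResult:
    """Run the test case programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(LabelFromPolarityTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_tests()` executes the three checks and reports any failures; in a project layout the same class would normally be run with `python -m unittest`.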

5.2 Test Cases


The following test cases were performed on the web application.

Test case 1: User Search Query

Date: 21-01-2019
Tested by: Usama, Zeeshan, Huzaifa
System: The Trends (Sports)
Environment: Web based
Objective: Test for User Search Query
Test ID: 3
Version: 1.1
Test type: Unit
Input:
  Query: Pakistan vs South Africa
  Number of tweets: 20
Expected result: The query should be performed and the result shown in a chart.
Actual result: Passed

Test case 2: Guest Search Query

Date: 21-01-2019
Tested by: Usama, Zeeshan, Huzaifa
System: The Trends (Sports)
Environment: Web based
Objective: Test for Guest Search
Test ID: 4
Version: 1.1
Test type: Unit
Input:
  Query: India vs New Zealand
  Number of tweets: 80
Expected result: The query should be performed and the result shown in a chart.
Actual result: Passed

5.2.1.1 Query Testing


5.2.1.1.1 Query= Cricket

5.2.1.1.2 Query= Zardari

5.2.1.1.3 Query= Pakistan

5.2.1.1.4 Query= Canada

Chapter 6
Software Deployment

6 Software Deployment
This section explains how the project was deployed on a live server. There is only a
small difference between localhost and a live server, but deploying the project on a
live server is still not easy.
The first step was to list all the languages, libraries and other software necessary to
run the code on the server. The next step was to install the language on the server, in
our case Python 3. We then installed the required libraries and software, which include
NLTK, TextBlob, Matplotlib, Apache Kafka and ZooKeeper, from the terminal of cPanel,
and finally uploaded the PHP files and the Python script.

URL of The Trends:


www.thetrends-pk.com

Chapter 7
Project Evaluation

7 Project Evaluation
In this chapter, we focus on the objectives achieved in the development of the project
and the ideas that grew during development.
7.1 Evaluation of Objectives and Aims
We are proud to say that we have successfully achieved the objectives that we planned
at the start of the project.
7.2 Evaluation of Project Management
Our supervisor, Professor Ghulam Ayzed Khan, played an important role in motivating us
throughout the development of the project and helped us learn the technology more
efficiently.
7.3 Discussion
At the beginning of the project, we were not clear about how to meet the proposed
objectives. First we researched stream processing and how it differs from batch
processing. After learning this, we moved on to the next step: finding which software
works on the concept of stream processing. Apache Kafka was one of many software
products that fulfilled our requirements. The next big hurdle was to gather useful
information from the incoming tweets (the tweet cleaning process); this was done using
the NLTK library for Python. Our project also proposed to perform sentiment analysis of
the tweets gathered through stream processing; this was completed using the TextBlob
library for Python. The last proposed point was to display this analysis in graphs that
keep changing (real-time graphs); to fulfill this need we used the Matplotlib library
for Python. Finally, the script was hosted on a web server, as it is a web-based
application.
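
The tweet cleaning step mentioned above can be illustrated with a small sketch. Note that this example uses plain regular expressions from the standard library as a stand-in, not the NLTK calls the project actually used, and the function name `clean_tweet` and the patterns are assumptions.

```python
import re

URL_PATTERN = re.compile(r"https?://\S+")
MENTION_PATTERN = re.compile(r"@\w+")
NON_TEXT_PATTERN = re.compile(r"[^A-Za-z0-9\s]")


def clean_tweet(text: str) -> str:
    """Strip links, mentions and punctuation, then normalise spacing and case."""
    text = URL_PATTERN.sub(" ", text)
    text = MENTION_PATTERN.sub(" ", text)
    text = text.replace("#", " ")  # keep the hashtag word, drop the symbol
    text = NON_TEXT_PATTERN.sub(" ", text)
    return " ".join(text.split()).lower()


# Hypothetical raw tweet.
cleaned = clean_tweet("RT @fan: What a WIN!! #PAKvSA https://t.co/abc123")
```

Cleaning before scoring keeps links and user handles from influencing the sentiment analysis; a fuller pipeline might also remove stop words or retweet markers.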

References:
1. https://kafka.apache.org/documentation/
2. https://www.nltk.org/
3. https://zookeeper.apache.org/
4. https://matplotlib.org/
5. https://textblob.readthedocs.io/en/dev/
