IJCRT2305032


www.ijcrt.org © 2023 IJCRT | Volume 11, Issue 5 May 2023 | ISSN: 2320-2882

STOCK MARKET PREDICTION USING CNN AND LSTM
Harsh P Kothari, Nikhil Nahar, Aman Gautam
Department of Computer Science, SRM Institute of Science and Technology, Chennai, India

Abstract

Stock market price prediction is a difficult undertaking that has historically required substantial human-computer cooperation. Because stock prices are interrelated, traditional batch processing techniques cannot be used effectively for stock market research. It is practically impossible to estimate stock prices to the penny because the factors that affect price movement are unpredictable. However, an informed estimate of pricing is attainable. The data is provided in the form of corporate stock data, which is evaluated in real time. The system produces graphical output that depicts the predicted future stock price. We therefore propose a solution that predicts stock market prices using a machine learning algorithm.

Keywords: stock market, price prediction, stock market research, machine learning algorithm.

INTRODUCTION

Stock market price modeling is a constant challenge for many company analysts and academics. Estimating stock market prices is both a fascinating and difficult subject of study. The share market is extremely difficult to predict accurately since it is greatly affected by external factors, including social, psychological, geopolitical, and economic considerations.

Stock market data is typically time-variant and nonlinear. In stock trading, making predictions about the market is extremely important. Investors who are insufficiently informed run the biggest risk of losing their money.

To make large gains, traders must correctly forecast the future stock value of firms. Numerous prediction algorithms have been created to make accurate predictions on the financial markets. Before computational tools for risk assessment were available, two conventional methods were widely used for forecasting stock values.

1.1 Analysis of The Stock Market

There are two methods of analysis in the stock market: Fundamental Analysis and Technical Analysis. Each is discussed in detail below.

1.2 Fundamental Analysis

Fundamental analysis has proven to be highly advantageous throughout the history of the stock market. A security's intrinsic value is
assessed using economic, financial, qualitative, and quantitative variables in fundamental analysis. Both macroeconomic and microeconomic factors are thought to have an impact on the value of a share.

These variables may include the state of the economy, the business environment, the company's financial situation, and its management skills. The main objective of a fundamental analysis is to assess whether a stock is priced too high or too low by evaluating its intrinsic value and comparing it to its current market value.

Process For Fundamental Analysis of a Stock

The steps for Fundamental Analysis are:

Knowing the organization

Understanding the company in which you intend to invest is essential. It helps you judge how the business operates, whether it is making the best choices for its long-term objectives, and whether you should keep or sell the shares. A smart way to gather such knowledge is to visit the company's website and learn about the business, its management, its promoters, and its products.

Review the company's financial reports

Once you understand the company, start looking at its financial records, including the balance sheet, profit and loss accounts, cash flow statements, operating costs, revenues, and expenses, among others. If the company's net profit has been rising over the past five years, this can be viewed as a positive indication for the business. You can also assess its compound annual growth rate (CAGR), sales, and other factors.

Verify the debt

Debt is a significant factor that can negatively impact a company's profitability. A company carrying a sizable debt is unlikely to perform well and reward its investors. It is advised that you steer clear of businesses with significant debt. As a rule of thumb, look for companies with a debt-to-equity ratio lower than 1.

Locate the company's rivals

To make a successful investment, it is essential to choose a company that outperforms its competitors. The selected company should have a positive outlook for the future, such as upcoming projects or new facilities, and stronger growth potential.

Examine the potential outcomes

Fundamental analysis is most effective for long-term investments that are made on a recurring basis. It is advisable to invest in companies whose products or services will remain significant 15 to 25 years into the future.

Review each aspect periodically

Don't invest in a business and then ignore it. Keep yourself informed about the business you have invested in, including all company news and financial results. In the event of an issue with the corporation, sell the share.

1.3 Qualitative and Quantitative Fundamental Analysis

Defining the term "fundamentals" can be challenging since it encompasses all factors that influence a company's financial well-being. This can include a company's market position, the quality of its management team, and financial metrics such as revenue and profits. Fundamental factors can be grouped into two categories, quantitative and qualitative. In finance, these terms mean much the same as they do in everyday use:

Qualitative Factors:

Qualitative factors concern the quality, character, and nature of something rather than its quantity. These include four things: the Business Model, the Competitive Advantage, Management, and Corporate Governance.

Quantitative Factors:

Quantitative factors are expressed as figures, numbers, formulas, and ratios. These come from the statement of cash flows, the income statement, and the balance sheet.
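The two quantitative checks named in the fundamental-analysis steps, CAGR and the debt-to-equity ratio, can be sketched in a few lines. This is a minimal illustration only: the function names and all figures are hypothetical and are not taken from the paper.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate between two values over `years` years."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


def debt_to_equity(total_debt: float, shareholder_equity: float) -> float:
    """Debt-to-equity ratio; values below 1 pass the rule of thumb above."""
    if shareholder_equity <= 0:
        raise ValueError("shareholder_equity must be positive")
    return total_debt / shareholder_equity


# Hypothetical figures, for illustration only.
net_profit_5y_ago = 100.0
net_profit_now = 161.051  # equals 100 * 1.1 ** 5, i.e. 10% growth per year

growth = cagr(net_profit_5y_ago, net_profit_now, years=5)
ratio = debt_to_equity(total_debt=40.0, shareholder_equity=80.0)
passes_debt_check = ratio < 1.0

print(f"CAGR: {growth:.2%}, debt-to-equity: {ratio:.2f}, passes: {passes_debt_check}")
```

In practice the inputs would come from the company's balance sheet and income statement rather than from hard-coded numbers.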

1.4 Technical Analysis

Technical analysis is the process of studying past price and volume data in the market to predict future market trends. It draws upon knowledge from quantitative research, behavioral economics, and market psychology. Technical analysis includes a wide range of methodologies, including statistical indicators and chart analysis. Because their main goal is to anticipate whether a trend will continue or reverse, technical analysts use a number of tools, including trendlines, candlestick patterns, and mathematical visualization approaches, to find probable entry and exit points. Analysts often use a combination of tools to make trading decisions.

The basic tenet of technical analysis is that market prices already reflect all relevant data, making it unnecessary to consider economic, fundamental, or new developments. Technical analysts believe that prices in the market tend to move in trends and that historical patterns often repeat themselves, reflecting the market's psychology.

Chart patterns are a discretionary form of technical analysis in which analysts pinpoint areas of support and resistance on a chart. These patterns aim to forecast the price changes that will occur after a breakout or breakdown from a specific price level and period, and they are supported by psychological variables. Technical indicators, which are a statistical subset of technical analysis, are produced by applying mathematical formulae to prices and volumes. Trading strategies are developed from these indicators, such as moving averages, which smooth price data to make trend identification easier, and the more sophisticated moving average convergence divergence (MACD), which looks at how different moving averages interact.

1.5 How Does the Stock Market Work?

Buying stock means purchasing a small portion of a publicly traded company, which can be done through stock exchanges. An initial public offering (IPO) allows companies to list their shares on an exchange and sell them to investors to raise capital for business expansion. These shares can then be traded among investors. The term "bid-ask spread" refers to the difference between the highest price a buyer is ready to pay (the "bid") and the lowest price a seller is willing to accept (the "ask"). Computer algorithms typically perform the price computations, and bid, ask, and bid-ask spreads can be viewed on a broker's website. Stock trades are now conducted electronically thanks to the internet and online stockbrokers. Various factors such as news, political events, and economic reports can affect stock prices.

1.6 Stock Market Predictions Using LSTM

Due to the volatility of stock market prices, accurately estimating stock prices is challenging. Time series modeling, such as the use of an LSTM neural network architecture, can aid in forecasting future values. A set of predictors, including fundamental market data and technical indicators, is used to represent stock market behavior. Single-layer and multilayer LSTM models are created using the selected input variables, and their performance is compared using RMSE, MAPE, and the correlation coefficient. Empirical findings indicate that the single-layer LSTM model outperforms the multilayer LSTM model in terms of fit and prediction accuracy.

II. Literature Survey

3. MODULE DESCRIPTION

MODULE 1: DATA ACQUISITION AND PRE-PROCESSING

Stock market prediction aims to predict a company's financial stock value in the future. Machine learning has emerged as a popular approach for predicting stock values by training on prior stock market indices. Multiple machine learning models are used for accurate stock price prediction. This paper specifically examines the use of LSTM-based machine learning and regression for stock value prediction. The date format is changed to

slash format (16/09/22) to better represent the time-series nature of stock prices.

Python's strftime() method is used for this. Although dates have a standard representation, you may want to print them in another form. In that case, you can use the various format codes to get a customized string representation. strftime() uses a number of standard directives to represent a datetime in string form. The strptime() and strftime() methods share the same list of directives.

Finding Null Values

The Kaggle dataset used here is deficient in a number of areas. This can be the result of insufficient data collection methods or a dearth of relevant information. NaN (not a number) or None is used to represent these values in the dataset. Whatever their origin, missing values complicate and distort our computations. As a result, we track down the missing entries and replace them with usable values.

Replacing Null Values

Once the null values have been identified, the rows and columns containing them can be removed, or the null values can be replaced with the mean, median, or mode. We used the dropna() function to eliminate every null value from the dataset. This operation can remove null values from both columns and rows. The mean or median can replace the missing data if the relevant columns have integer or float data types.

The mode, the value that occurs most frequently, can be used in place of a missing value in the absence of more specific information. This works for both floating-point numbers and integers, and it is especially useful when the relevant columns hold strings. The fillna() function fills the empty entries of the dataset with the chosen mean, median, or mode. For categorical data, the mode is used in place of any missing values; in the remaining cases, the median is applied.

Pandas treats None and NaN as equivalent representations of missing or null values. Pandas DataFrame provides numerous helpful functions that make it simpler to identify, remove, and replace null values. The isnull() and notnull() functions detect null values; either can be used to determine whether a value is NaN. The same methods are available on a Pandas Series, where they can be used to find the missing values in a series.

Normalization of data

Normalization is a scaling method used in machine learning to adjust numerical values to a standard scale. It is necessary only when the feature ranges vary. With standardization scaling, also called Z-score normalization, the data is centered around the mean with a unit standard deviation.

3.2. MODULE 2: MODEL IMPLEMENTATION

a. Model

GridSearchCV is an automated method used to find the best hyperparameter settings for a specific model. It helps to improve the model's performance by iterating over a range of hyperparameters. scikit-learn's GridSearchCV evaluates the model for each combination of hyperparameters, allowing us to select the best-performing combination. It is also useful when constructing an ensemble model that combines predictions from different models to enhance prediction accuracy.

b. Splitting The Dataset

Splitting the dataset into two distinct subsets, the training dataset and the test dataset, is a typical way of assessing a model's performance. The model is first trained on the training dataset, and its performance is then evaluated on the test dataset to find out how effectively it generalizes to fresh, unseen data. Two libraries, scikit-learn and Pandas, are frequently used to divide the dataset into these two subsets. The dataset is split with a test size of 0.3 and a random state of 0, so that records are allocated between the two subsets at random but reproducibly.
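The pre-processing and splitting steps described in Modules 1 and 2 can be sketched end to end with Pandas. This is a minimal sketch under stated assumptions, not the authors' code: the sample data and the column names (Date, Close, Volume) are hypothetical, and the 70/30 split uses DataFrame.sample as a dependency-light stand-in for the scikit-learn splitter the text mentions.

```python
import pandas as pd

# Hypothetical sample standing in for the Kaggle dataset (column names are assumptions).
raw = pd.DataFrame(
    {
        "Date": [
            "2022-09-05", "2022-09-06", "2022-09-07", "2022-09-08", "2022-09-09",
            "2022-09-12", "2022-09-13", "2022-09-14", "2022-09-15", "2022-09-16",
        ],
        "Close": [100.0, 101.5, None, 103.0, 102.0, 104.5, 105.0, None, 106.0, 107.5],
        "Volume": [1200.0, 1500.0, 1100.0, None, 1300.0, 1250.0, 1400.0, 1350.0, 1600.0, 1450.0],
    }
)

# 1. Date handling: parse the strings, then render them in slash format (dd/mm/yy).
raw["Date"] = pd.to_datetime(raw["Date"], format="%Y-%m-%d")
raw["DateStr"] = raw["Date"].dt.strftime("%d/%m/%y")

# 2. Finding null values: count the missing entries per column.
null_counts = raw.isnull().sum()

# 3a. Option one: drop every row that contains a null value.
dropped = raw.dropna()

# 3b. Option two: fill numeric gaps with the column median or mean.
filled = raw.fillna({"Close": raw["Close"].median(), "Volume": raw["Volume"].mean()})

# 4. Z-score normalization: zero mean and unit standard deviation per feature.
features = ["Close", "Volume"]
scaled = filled.copy()
scaled[features] = (filled[features] - filled[features].mean()) / filled[features].std(ddof=0)

# 5. 70/30 split with a fixed seed, mirroring test size 0.3 and random state 0.
test = scaled.sample(frac=0.3, random_state=0)
train = scaled.drop(test.index)

print(null_counts)
print(f"{len(train)} training rows, {len(test)} test rows")
```

Both the dropna() route and the fillna() route are shown because the text describes both; a real pipeline would pick one of them.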

c. Dataset Training

Deep learning is especially well suited to GPUs since it requires the same kinds of calculations that GPUs were designed to perform. Any operation on an image, such as a zoom-in effect or a camera rotation, is in effect a mathematical transformation applied to a matrix, which is how images, videos, and other visuals are represented.

In this work, a CNN-LSTM approach that considers the sequential features of the stock data is suggested for predicting the next day's closing price of a stock. The approach takes a variety of data inputs, including the opening price, highest price, lowest price, closing price, volume, turnover, ups and downs, and change of the stock data. The CNN extracts features from these inputs, and the LSTM model is trained on the extracted features. The Shanghai Composite Index is used as an example to validate the findings experimentally.

In terms of performance and accuracy, CNN-LSTM performs better than CNN-RNN, MLP, RNN, LSTM, and CNN. It has the lowest MAE and RMSE values, and its R² is close to 1. Investors can benefit from using CNN-LSTM for stock price forecasting, and it provides valuable real-world experience for financial time series researchers.

The required packages are installed with conda install numba and conda install cudatoolkit. Then, using the training dataset, we execute the standard CPU function. Next, we create a function that is optimized for the GPU. Finally, we compare the time taken with and without a GPU. Our model benefits from faster and more efficient GPU training.

Thanks to NVIDIA's CUDA parallel computing technology, the software can use both the CPU and the GPU simultaneously. NVIDIA is currently the most popular GPU supplier for cloud computing and machine learning, so we use its GPUs. Additionally, the majority of Python libraries that support GPUs can be used with NVIDIA GPUs.

We then train our model after loading the data onto the CUDA GPU. Since our system uses CUDA, the data must be transferred from CPU memory to GPU memory during the training phase. Enabling pin_memory=True causes the data to be placed in page-locked memory, which accelerates this transfer. The data is then ready for training and is loaded in batches.

d. Saving The Best Model

The system proposed in this study was able to identify some relationships within the data, showcasing its potential. The results indicate that the CNN architecture used in the methodology is effective in detecting changes in trends. Despite being more commonly used in other kinds of time-dependent data processing, the CNN architecture outperformed other models in predicting stock market behavior from current data, owing to the unpredictable nature of the market's abrupt shifts. Stock market fluctuations may not always follow a predictable pattern or cycle.

Fig 1.2 compares the classification accuracy of the machine learning models used: CNN, DNN, and LSTM. The LSTM model has the highest accuracy of the three.

Fig 1.2 Classification Accuracy

Fig 1.3 Precision, Recall and F1 Score

Figure 1.3 compares further metrics, namely Precision, Recall, and F1 score, obtained from the same machine learning models: CNN, DNN, and LSTM. The chart shows that the LSTM model has evenly balanced values for Precision, Recall, and F1 score, unlike the CNN and DNN models.
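The evaluation metrics named in this module (MAE, RMSE, and R² for the price forecasts; accuracy, precision, recall, and F1 score for Figs 1.2 and 1.3) can be computed as follows. This is a minimal sketch using the standard definitions; the function names and the sample values are hypothetical and do not reproduce the paper's results. Treating "price goes up" as the positive class is also an assumption.

```python
import math


def regression_metrics(actual, predicted):
    """Return (MAE, RMSE, R^2) for two equal-length sequences of numbers."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


def classification_metrics(actual, predicted, positive=1):
    """Return (accuracy, precision, recall, F1) for binary labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    accuracy = sum(1 for a, p in pairs if a == p) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return accuracy, precision, recall, f1


# Hypothetical closing prices and forecasts.
actual_close = [100.0, 102.0, 101.0, 105.0]
predicted_close = [101.0, 101.0, 102.0, 104.0]
mae, rmse, r2 = regression_metrics(actual_close, predicted_close)

# Hypothetical direction labels: 1 = price went up, 0 = price went down.
actual_up = [1, 0, 1, 1, 0, 1]
predicted_up = [1, 0, 0, 1, 1, 1]
accuracy, precision, recall, f1 = classification_metrics(actual_up, predicted_up)

print(f"MAE={mae:.3f} RMSE={rmse:.3f} R2={r2:.3f}")
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

An R² close to 1 together with low MAE and RMSE is the pattern the text reports for CNN-LSTM; evenly balanced precision, recall, and F1 is the pattern it reports for LSTM.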

MODULE 3: IMPLEMENTING WEBAPP

Creating a Webapp Using STREAMLIT

We write code to handle the server-side processing. Requests are sent to this code, which determines the subject and type of each request and chooses the type of response to send to the user. Flask is used for all of this. It simplifies the procedure for designing web applications, so we can focus on the requests that users make and the best way to respond.

Within the directory that will house the project files, we create a virtual environment for the project. First install virtualenv using pip install virtualenv, then create and activate a virtual environment before installing Flask with the pip install command. The main goal of this project is to precisely predict the stock's closing price over an extended period of time. Dash HTML components and Dash core components were used in this project to create the website's framework and enhance its user interface.

4.1 SYSTEM DESIGN

This illustration depicts the integration of different entities in the system, offering a clear and concise overview of their interrelationships. It displays the connections between various actions and decisions, presenting the entire process as a visual representation. The diagram portrays the functional links between different entities.

Fig 4.1 – Architecture Diagram

Data Flow Diagram

The whole system is shown as a single process in a level-0 DFD. Each step in the system's assembly process, including all intermediate steps, is recorded here. The "basic system model" consists of this and the 2-level data flow diagrams.

Fig 4.2 – Data Flow Diagram Level 0

Fig 4.3 – Data Flow Diagram Level 1

Sequence Diagram

Sequence diagrams are another type of interaction-based diagram used to display the workings of the system. They record the conditions under which objects and processes cooperate.

Class Diagram

In essence, this is a "context diagram," another name for a contextual diagram. It represents only the highest level, Level 0, of the process. The system as a whole is shown as a single process, and its connection to external entities is shown in an abstract manner.

5. TESTING

1. THE DATASET

2. CNN

3. LSTM

4. CNN+LSTM

6. RESULT

We use Python for the entire pipeline, which involves data cleaning, data modeling, and testing the data with CNN-LSTM; this combination provides high accuracy and fits well with Artificial Intelligence and Machine Learning workflows. Streamlit is used to create a website for running the model and displaying the data and graphs. We used Google Colab to clean the data and to train and test the model. Visual Studio Code is used to collect the model for each technique; finally, all models are combined and rendered into a web page.

7. CONCLUSION

The most important steps in data preprocessing are data cleansing and data visualisation, which are crucial for increased stock market forecast accuracy. Due to overfitting brought on by improper use of input parameters, the existing system exhibits subpar accuracy. The proposed model has been used as a technique for developing special feature sets and obtaining accurate predictions. The experiments were carried out with a non-linear RBF kernel, which gave us highly accurate results. Most importantly, the evaluation described above helped us predict the outcome and gave us insight into the kind of data that might be used in the future to train our models in the most efficient way.
