Financial Forecasting and Gradient Descent
Abstract
Multilayer neural networks have been successfully applied to time series forecasting. Steepest descent, a
popular learning algorithm for backpropagation networks, converges slowly and makes it difficult to
determine the network parameters. In this paper, a conjugate gradient learning algorithm with a restart
procedure is introduced to overcome these problems. In addition, the commonly used random weight initialization
does not guarantee a set of initial connection weights close to the optimal weights, which leads to slow
convergence. Multiple linear regression (MLR) provides a better alternative for weight initialization.
Daily trading data of listed companies from the Shanghai Stock Exchange are collected for technical analysis
by means of neural networks. Two learning algorithms and two weight initializations are compared.
The results show that neural networks can model the time series satisfactorily, regardless of which learning
algorithm and weight initialization are adopted. However, the proposed conjugate gradient with MLR weight
initialization requires a lower computation cost and learns better than steepest descent with random
initialization.
Keywords: time series forecasting, technical analysis, learning algorithm, conjugate gradient, multiple linear regression weight
initialization, backpropagation neural network
To compute the gradient in steps 2 and 7b, the objective function is first defined. The aim is to minimize the
network error, which depends on the connection weights. The objective function is defined by the error function:

f(w) = \frac{1}{2N} \sum_{n} \sum_{j} \big( t_{nj} - y_{nj}(w) \big)^2    (11)

where N is the number of patterns in the training set; w is a one-dimensional weight vector in which the
weights are ordered by layer and then by neuron; t_{nj} and y_{nj}(w) are the desired and actual outputs of the
j-th output neuron for the n-th pattern, respectively.

With the arguments in [34], the gradient component for the weight connecting the i-th neuron to the j-th neuron is

g(w) = \frac{1}{N} \sum_{n} \delta_{nj} \, y_{ni}(w)    (12)

For output nodes,

\delta_{nj} = -\big( t_{nj} - y_{nj}(w) \big) \, s'_j(net_{nj})    (13)

where s'_j(net_{nj}) is the derivative of the activation function evaluated at net_{nj}, the input of the j-th neuron.
For hidden nodes,

\delta_{nj} = s'_j(net_{nj}) \sum_{k} \delta_{nk} \, w_{jk}    (14)

where w_{jk} is the weight from the j-th to the k-th neuron.

... where f is a transfer function. The output value of the output node can be calculated as

y_s = f\Big( \sum_{j} v_j R_{sj} \Big)    (16)

where v_j is the weight between the hidden layer and the output layer.

Assume the sigmoid function f(x) = \frac{1}{1 + e^{-x}} is used as the transfer function of the network. By Taylor's
expansion,

f(x) \cong \frac{1}{2} + \frac{x}{4}    (17)

Applying the linear approximation in (17) to (16), we have the following approximated linear relationship between the
output y_s and the v_j's:

y_s = \frac{1}{2} + \frac{1}{4} \sum_{j=1}^{m} v_j R_{sj}    (18)

or

4 y_s - 2 = v_1 R_{s1} + v_2 R_{s2} + \dots + v_m R_{sm}, \quad s = 1, 2, \dots, N    (19)

where m is the number of hidden nodes and N is the total number of training samples.

The set of equations in (19) is a typical multiple linear regression model. The R_{sj}'s are considered as the
regressors, and the v_j's can be estimated by a standard regression method.
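To make this concrete, the following Python sketch (not from the paper; it uses numpy, hypothetical function names, and assumes a single output node with the desired outputs t_s substituted for y_s when solving (19)) estimates the hidden-to-output weights v_j by ordinary least squares, and also shows the corresponding delta-rule gradient of (12)-(13):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def output_gradient(R, t, v):
    """Gradient of the error (11) with respect to the hidden-to-output
    weights v for a single sigmoid output node, following (12)-(13).
    R: (N, m) hidden-node outputs R_sj; t: (N,) desired outputs."""
    N = R.shape[0]
    y = sigmoid(R @ v)                  # actual outputs y_s, as in (16)
    delta = -(t - y) * y * (1.0 - y)    # (13): s'(net) = y(1 - y) for the sigmoid
    return (R.T @ delta) / N            # (12): average of delta_s * R_sj over the N patterns

def mlr_init_output_weights(R, t):
    """MLR weight initialization: least-squares solution of (19),
    4*y_s - 2 = sum_j v_j R_sj, with the desired outputs t_s in place of y_s."""
    target = 4.0 * t - 2.0
    v, *_ = np.linalg.lstsq(R, target, rcond=None)
    return v

# Toy usage with random data (shapes only, not the paper's data set):
rng = np.random.default_rng(0)
R = sigmoid(rng.standard_normal((200, 5)))   # 200 patterns, 5 hidden nodes
t = rng.uniform(0.2, 0.8, size=200)          # desired outputs scaled into (0, 1)
v0 = mlr_init_output_weights(R, t)           # regression-based starting point
g0 = output_gradient(R, t, v0)               # gradient at that starting point
```

Solving (19) in this way costs a single least-squares fit, so the regression-based starting point adds little overhead compared with the subsequent iterative training.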
Figure 1: A sample result from the neural network. (a) Predicted ∆EMA(t) vs actual ∆EMA(t); (b) predicted stock price vs actual stock price.
In figure 1a, although the predicted ∆EMA(t) and the actual ∆EMA(t) deviate considerably in some regions, the network can still model the actual EMA reasonably well. On the other hand, after the transformation of ∆EMA(t) to exact price values, the deviation between the actual price and the predicted price is small; the two curves in figure 1b nearly coincide. This indicates that the selection of the network forecaster was appropriate.

The performance of the scenarios mentioned above is evaluated by the average number of iterations required for training, the average MSE in the testing phase, and the percentage of correct direction predictions in the testing phase. The results are summarized in Table 1.

Scenario     Average number of iterations   Average MSE   % of correct direction prediction
CG / RI            56.636                     0.001753         73.055
CG / MLRI          30.273                     0.001768         73.545
SD / RI           497.818                     0.001797         72.564
SD / MLRI         729.367                     0.002580         69.303

Table 1: Performance evaluation for the four scenarios (CG: conjugate gradient; SD: steepest descent; RI: random initialization; MLRI: MLR weight initialization)
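For reference, the two testing-phase measures reported in Table 1 could be computed along the following lines. This is an illustrative sketch only, not the authors' evaluation code; it assumes that a direction prediction is counted as correct when the predicted and actual changes have the same sign:

```python
import numpy as np

def test_metrics(actual, predicted):
    """Average MSE and percentage of correct direction predictions on a test series."""
    mse = np.mean((actual - predicted) ** 2)
    same_direction = np.sign(np.diff(predicted)) == np.sign(np.diff(actual))
    return mse, 100.0 * np.mean(same_direction)
```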
All scenarios, except for steepest descent with MLR initialization, achieve a similar average MSE and percentage of correct direction predictions. All scenarios perform satisfactorily: the mean square error produced is on average below 0.258%, and more than 69% of the direction predictions are correct.

Conjugate gradient learning on average requires significantly fewer iterations than steepest descent learning. Owing to the complexity of its line search, conjugate gradient requires a longer computation time per iteration than steepest descent. However, the overall convergence of the conjugate gradient network is still faster than that of the steepest descent network.

5. Conclusion & Discussion

The experimental results show that it is possible to model stock prices based on historical trading data using a three-layer neural network. In general, both the steepest descent network and the conjugate gradient network produce the same level of error and reach the same level of direction prediction accuracy.

The conjugate gradient approach has advantages over the steepest descent approach. It does not require empirical determination of the network parameters. As opposed to the zigzag motion of the steepest descent approach, its orthogonal search prevents a good point from being spoiled. Theoretically, the convergence of the second-order conjugate gradient method is faster than that of the first-order steepest descent approach. This is verified in our experiment.

With regard to the initial starting point, the experimental results show that the good starting point generated by multiple linear regression weight initialization is spoiled by the subsequent search directions in the steepest descent network. On the contrary, regression initialization provides a good starting point, improving the convergence of conjugate gradient learning.

To sum up, the efficiency of backpropagation can be improved by conjugate gradient learning with multiple linear regression weight initialization.

It is believed that the computation time of conjugate gradient can be reduced by Shanno's approach [7]. The initialization scheme may be improved by also estimating the weights between the input nodes and the hidden nodes, instead of initializing them randomly. Enriching the inputs with more relevant data, such as fundamental data and data from derivative markets, may improve the predictability of the network. Finally, more sophisticated network architectures can be attempted for price prediction.
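As an illustration of the learning procedure discussed above, a generic nonlinear conjugate gradient loop with periodic restart might look as follows. This is a sketch under stated assumptions, not the paper's exact procedure: it uses the Fletcher-Reeves update, a simple backtracking line search in place of the paper's line search, and a fixed restart period n_restart; the function names and parameters are hypothetical.

```python
import numpy as np

def conjugate_gradient_restart(f, grad, w0, n_restart, max_iter=500, tol=1e-6):
    """Minimize f(w) given its gradient, using nonlinear conjugate gradient
    (Fletcher-Reeves) with a periodic restart to the steepest descent direction."""
    w = w0.copy()
    g = grad(w)
    d = -g                                   # initial direction: steepest descent
    for k in range(max_iter):
        step, fw = 1.0, f(w)                 # backtracking (Armijo) line search along d
        while f(w + step * d) > fw + 1e-4 * step * (g @ d):
            step *= 0.5
            if step < 1e-12:
                break
        w = w + step * d
        g_new = grad(w)
        if np.linalg.norm(g_new) < tol:
            break
        if (k + 1) % n_restart == 0:         # restart: drop back to steepest descent
            d = -g_new
        else:
            beta = (g_new @ g_new) / (g @ g) # Fletcher-Reeves coefficient
            d = -g_new + beta * d
        g = g_new
    return w

# Example: minimize a quadratic as a stand-in for the network error f(w)
A = np.diag([1.0, 10.0, 100.0])
f = lambda w: 0.5 * w @ A @ w
grad = lambda w: A @ w
w_star = conjugate_gradient_restart(f, grad, np.ones(3), n_restart=3)
```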