The field of stock price prediction economics essay

The stock market is one of the most popular places to invest because of its expected high returns, so investors want an estimate of the return on an investment before committing money. In recent years, many researchers have concentrated on predicting future stock market returns using Artificial Neural Networks (ANNs). These systems are trained with different learning algorithms, typically by minimizing a least-squares objective function or by gradient descent. Our motivation is also to model this problem with an ANN, but we apply Particle Swarm Optimization (PSO) to update the parameters of the network. Because PSO optimizes a problem by iteratively trying to improve a candidate solution, using it to update the ANN weights may yield a better estimate of stock market returns than existing systems provide. In this article we propose a solution to the stock price prediction problem that performs better than the back-propagation baseline we compare against when only the next day's stock price needs to be predicted, which makes it suitable for small-scale applications. Our contributions are the following: we introduce the use of Particle Swarm Optimization techniques in the stock price prediction problem, and we obtain considerable improvements by using an ANN trained with PSO to predict the next-day stock value.

There has been a lot of research on stock price prediction in the recent past. The problem interests both academia and industry and is very challenging, because prices follow a complex pattern that is difficult to learn with any statistical or machine learning algorithm.
The problem has many applications, ranging from personal use to large-scale business problems. Vaisla et al. \cite{1} use neural networks to predict daily stock prices and compare their findings with statistical forecasting results. Since stock prices depend on many factors, some known and some unknown, Artificial Neural Networks (ANNs) \cite{7} are well suited to the task because they are governed by the concept of `Learn by Example'. Coupelon et al. \cite{2} carry out an investigative study on the applicability of various learning and statistical techniques to stock price prediction. The authors present an in-depth literature survey citing works that range from Hidden Markov Models (HMMs) to optimization techniques such as Genetic Programming. Khan et al. \cite{3} train their ANN with the back-propagation algorithm and use a multilayer feed-forward network as the model for predicting stock prices. They test their algorithm on a pharmaceutical stock using $2$ and $5$ inputs, respectively, and achieve better performance than other statistical techniques. Mohatram et al. \cite{4} observe that modelling a variable as complex as a stock price with statistical techniques such as regression may be very time consuming, and they stress the need for fast and efficient alternatives. They propose an ANN approach along the same lines as Khan et al. \cite{3}, emphasizing the efficiency of ANN-based approaches over statistical techniques, which is useful for practical stock exchange prediction systems. Egeli et al. \cite{5} use ANNs with architectures similar to those in the works discussed so far, and additionally incorporate multilayer perceptrons alongside the feed-forward architecture. They test their approach on an Istanbul Stock Exchange data set and report a considerable improvement in performance over the Moving Average (MA) method.
Heinkel et al. \cite{6} propose methods that proceed in a different direction from the works discussed so far. They tackle the separate problem of predicting missing stock return values (for days with no observed trading or with data loss), which is an interesting contribution to the field. They also study and characterize the effect of the bid-ask spread on the time-series behavior of daily stock prices. The authors favor statistical techniques and use an ordinary/generalized least squares approach to estimate the missing data. Gudise et al. \cite{9} proposed training an ANN with a population-based optimization algorithm, Particle Swarm Optimization (PSO) \cite{8}, instead of the standard back-propagation algorithm. They test their multilayer feed-forward network architecture with both back-propagation and PSO and find that PSO converges faster, a considerable gain in performance. We take our major ideas from them and test our approach on Microsoft Corporation (MSFT) stock price data. We also consulted several online resources during the course of this work, the most important of which were \cite{10, 11}.

The Particle Swarm Optimization algorithm is a population-based optimization method, related to but distinct from genetic algorithms, and is widely used as swarm intelligence gains popularity. A formal description is given in Algorithm
\ref{alg:PSO}. We also define the `Inertia Weight' in this context:
\[ v_{id}^{new} = w_i \cdot v_{id}^{old} + c_1 \cdot rand_1 \cdot (p_{id} - x_{id}) + c_2 \cdot rand_2 \cdot (p_{gd} - x_{id}) \]
\[ x_{id}^{new} = x_{id}^{old} + v_{id}^{new} \]
where $d$ is the dimension, $p_{id}$ is the best position found so far by particle $i$, $p_{gd}$ is the best position found so far by the swarm, $c_1$ and $c_2$ are positive constants, $rand_1$ and $rand_2$ are random numbers, and $w_i$ is the inertia weight. The velocity can be limited to $V_{max}$.

The basic ANN architecture consists of three types of neuron layers: input, hidden, and output layers; see Figure
\ref{fig:ann_example} for a sample multilayer feed-forward neural network. In feed-forward networks, the signal flows from the input units to the output units, strictly in the forward direction. The data processing can extend over multiple layers of units, but no feedback connections are present. We use a single hidden layer; the number of hidden nodes is selected during training and testing by running both phases with different numbers of hidden nodes and choosing the number that gives the best result.

The data must be preprocessed before analysis. A problem arises when there is no trading data, or only partial trading data, for certain days. Heinkel and Kraus \cite{6} state that there are three possible ways of dealing with days that have no trading: ignore those days and use data for trading days only; assign a zero value to the days with no trading; or build a linear model to estimate the value for a day with no trading. In most cases, the weekly closing price refers to each Friday's closing price; if Friday is a holiday, the most recently available closing price for the stock is used. Here we use data for trading days only, and we also discard trading days for which some element of the data (such as the stock volume) is missing.

Model analysis has two main stages: (1) the training phase and (2) the prediction, or testing, phase. The training phase can in turn be divided into two parts: building the ANN model and updating its weights. We use a multilayer feed-forward neural network to model the stock price prediction problem, and we predict the closing price of a stock on the next day.
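The preprocessing rule described above (keep trading days only, drop days with any missing element, and predict the next day's close) can be illustrated with a minimal Python sketch. The record layout, the field names in `FIELDS`, and the helper `build_samples` are hypothetical and chosen only for illustration; they are not taken from the original implementation.

```python
# Assumed field names for one daily record; any missing value is stored as None.
FIELDS = ("open", "high", "low", "volume", "close")

def build_samples(days):
    """Turn chronologically ordered daily records into (features, target) pairs.

    features: the five values of one complete trading day
    target:   the closing price of the next complete trading day
    """
    # Keep only days on which every field is present.
    complete = [d for d in days if all(d.get(f) is not None for f in FIELDS)]
    samples = []
    # Pair each retained day with the retained day that follows it.
    for prev, nxt in zip(complete, complete[1:]):
        features = [prev[f] for f in FIELDS]
        samples.append((features, nxt["close"]))
    return samples
```

Because incomplete days are removed before pairing, a day with a missing volume is skipped entirely and its neighbours are treated as consecutive trading days, which is one possible reading of the rule stated in the text.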
The input variables considered to affect the stock exchange market are the last-day opening price, the last-day high price, the last-day low price, the last-day stock volume, and the last-day closing price.

\begin{algorithm}[t]
\caption{Particle Swarm Optimisation Algorithm}
\label{alg:PSO}
\begin{algorithmic}[1]
\STATE Initialize the population in hyperspace.
\STATE Evaluate the fitness of the individual particles.
\STATE Update each particle in each generation from its previous best and the global (or neighborhood) best positions, where $c_1$ and $c_2$ are learning factors (weights):
\STATE $v[i] \gets v[i]+c_1 \cdot rand() \cdot (pbest[i]-present[i])+c_2 \cdot rand() \cdot (gbest[i]-present[i])$
\STATE $present[i] \gets present[i]+v[i]$
\STATE Terminate if the stopping condition is met; otherwise go to step 2.
\end{algorithmic}
\end{algorithm}

The input layer of our neural network model thus consists of five input nodes. The output layer consists of a single node that gives the predicted closing stock price. We use one hidden layer; after trying $5$ to $15$ hidden nodes, we obtained the best result with $10$. We use the sigmoid function $f(x)= \frac{1}{1+e^{-x}}$ as the activation function.

We use Particle Swarm Optimization to update the weights of our neural network model. The PSO parameters were tuned by varying the inertia weight $(w_i)$, the maximum velocity $(V_{max})$, the cognitive and social coefficients ($c_1$ and $c_2$), and the swarm size. The values that gave the best result on our training set are: inertia weight $= 0.3$, maximum velocity $= 2.0$, $c_1 = 0.15$, $c_2 = 0.8$, and swarm size $= 100$.

We use the historical stock price data of Microsoft Corporation (MSFT) from $01/01/11$ to $31/08/11$, collected from http://in.finance.yahoo.com.
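To make the interplay between the network and the optimizer concrete, the following is a minimal, pure-Python sketch of a 5-10-1 feed-forward network whose weights are searched by PSO with the parameter values reported above. Several details are assumptions that the text does not state: bias terms, a sigmoid on the output node, inputs and targets scaled to $[0, 1]$, mean squared error as the fitness, initial positions drawn from $[-1, 1]$, zero initial velocities, and a fixed iteration count. The names `predict`, `mse`, and `train_pso` are hypothetical.

```python
import math
import random

N_IN, N_HID = 5, 10                       # layer sizes reported in the text
DIM = N_HID * (N_IN + 1) + N_HID + 1      # all weights plus biases (biases assumed)

def sigmoid(x):
    # Clamp the argument so math.exp cannot overflow.
    x = max(-60.0, min(60.0, x))
    return 1.0 / (1.0 + math.exp(-x))

def predict(weights, features):
    """Forward pass; `weights` is one flat particle position of length DIM."""
    k, hidden = 0, []
    for _ in range(N_HID):
        s = weights[k + N_IN]             # bias of this hidden node
        for j in range(N_IN):
            s += weights[k + j] * features[j]
        hidden.append(sigmoid(s))
        k += N_IN + 1
    out = weights[k + N_HID]              # bias of the output node
    for j in range(N_HID):
        out += weights[k + j] * hidden[j]
    return sigmoid(out)                   # assumes targets scaled to [0, 1]

def mse(weights, samples):
    # Fitness of one particle: mean squared error over the training samples.
    return sum((predict(weights, x) - y) ** 2 for x, y in samples) / len(samples)

def train_pso(samples, swarm_size=100, iterations=200,
              w=0.3, v_max=2.0, c1=0.15, c2=0.8, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(-1.0, 1.0) for _ in range(DIM)] for _ in range(swarm_size)]
    vel = [[0.0] * DIM for _ in range(swarm_size)]
    pbest = [p[:] for p in pos]
    pbest_fit = [mse(p, samples) for p in pos]
    g = min(range(swarm_size), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(iterations):
        for i in range(swarm_size):
            for d in range(DIM):
                # Inertia term plus attraction to personal and global bests.
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))   # limit to V_max
                pos[i][d] += vel[i][d]
            fit = mse(pos[i], samples)
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit
```

A call such as `best_weights, best_error = train_pso(samples)` would return the best particle found and its training error. Treating the whole weight vector as one particle position keeps the optimizer independent of the network layout, at the cost of a 71-dimensional search space for this architecture.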
We use the data from $01/01/2011$ to $29/07/2011$ (146 data points in total) as training data, predict the closing price for August, i.e., from $01/08/2011$ to $31/08/2011$, and compare the predictions with the actual closing prices for that period. The error is calculated using the following formula:
\[ \mathrm{Relative\ Absolute\ Error}\ (\%) = \frac{|\mathrm{Actual}-\mathrm{Predicted}|}{\mathrm{Actual}} \times 100 \]
Table
\ref{tab:table1} shows the percentage relative error in prediction for that period. The stock prediction curve is shown in Figure
\ref{fig:ann_pso}, which shows how closely the PSO-based prediction approximates the actual stock price curve. Table
\ref{tab:table2} shows the relative percentage error, and Figure
\ref{fig:ann_back} shows the predicted and actual stock prices for an ANN-based stock price prediction system in which the back-propagation algorithm is used to update the weights of the ANN model, on the same MSFT data set that we use. The stock curve predicted by this approach tracks the actual prices poorly. The errors reported in Table
\ref{tab:table2} are also very large. From this comparison we can say that the relative error $(\%)$ of our proposed model is considerably lower than that of the back-propagation-based ANN model. The back-propagation-based ANN model cannot capture the variations of the stock market, whereas our proposed PSO-based ANN model captures them better. We can also observe the following: the output of the model is highly dependent on the PSO parameters, and when the input changes sharply the prediction is not close to the actual value, i.e., the error during a sudden fall or rise of the index value is much larger.

As researchers and investors strive to outperform the market, the use of neural networks to forecast stock market prices will remain an active area of research. Here we used only 146 data points for training; our model may give better results if more data are used during training. We used the last-day opening price, high price, low price, stock volume, and closing price as inputs to the ANN. If different economic measures such as the General Index, Net Asset Value, P/E ratio, Earnings per Share, and Share Volume were used as inputs instead, as proposed in \cite{3}, our approach of updating weights with PSO may give better results, but the PSO parameter values would need to be retuned.

We would like to thank Prof. Jaisankar N. for his valuable suggestions and constant help, which aided the successful completion of this work. We would also like to thank Yahoo! for providing a valuable and informative web service that gives easy access to financial data and thus helps analysis projects like this one.
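For reference, the relative absolute error used in the evaluation can be computed per prediction as in the short Python sketch below. The function name `relative_absolute_error` is hypothetical, and the example prices in the usage note are made up for illustration; they are not values from the reported tables.

```python
def relative_absolute_error(actual, predicted):
    """Relative absolute error in percent: |actual - predicted| / actual * 100."""
    if actual == 0:
        # The metric is undefined for a zero actual price.
        raise ValueError("actual value must be non-zero")
    return abs(actual - predicted) / abs(actual) * 100.0
```

For instance, `relative_absolute_error(25.0, 24.5)` measures how far a predicted close of 24.5 lies from an actual close of 25.0, as a percentage of the actual price.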