Stock Market Prediction Using Machine Learning [Step-by-Step Implementation]
Updated on Sep 22, 2022 | 12 min read | 13.1k views
Prediction and analysis of the stock market are among the most complicated tasks to perform. There are several reasons for this, such as market volatility and the many dependent and independent factors that decide the value of a particular stock. These factors make it very difficult for any stock market analyst to predict the rise and fall of a stock with a high degree of accuracy.
However, with the advent of Machine Learning and its robust algorithms, recent developments in market analysis and Stock Market Prediction have started to incorporate such techniques to understand stock market data.
In short, Machine Learning algorithms are widely used by many organisations to analyse and predict stock values. This article walks through a simple implementation of analysing and predicting the stock values of Microsoft Corporation using a Long Short-Term Memory (LSTM) network in Python.
Before we get into the program’s implementation to predict the stock market values, let us look at the data we will be working with. Here, we analyse the stock value of Microsoft Corporation (MSFT) as listed on NASDAQ (National Association of Securities Dealers Automated Quotations). The data comes as a comma-separated values (.csv) file, which can be opened and viewed in Excel or any other spreadsheet application.
MSFT is listed on NASDAQ, and its values are updated on every working day of the stock market. Note that the market does not trade on Saturdays and Sundays; hence there is a gap between a Friday’s date and the following Monday’s. For each date, the file records the stock’s Opening value, its Highest and Lowest values on that day, and its Closing value at the end of the day.
The Adjusted Close value is the closing value adjusted for corporate actions such as dividends. The volume of shares traded on that day is also given. With this data, it is up to the Machine Learning engineer or Data Scientist to study it and implement algorithms that can extract patterns from the historical data of the Microsoft Corporation stock.
To develop a Machine Learning model to predict the stock prices of Microsoft Corporation, we will use Long Short-Term Memory (LSTM). By definition, LSTM is an artificial recurrent neural network (RNN) architecture used in deep learning. Its cells make small modifications to the stored information through multiplications and additions.
Unlike standard feed-forward neural networks, LSTM has feedback connections. It can process not only single data points (such as images) but also entire data sequences (such as speech or video). To understand the concept behind LSTM, let us take a simple example: an online customer review of a mobile phone.
Suppose we want to buy a mobile phone. We usually refer to online reviews by verified users and, depending on their opinions, decide whether the phone is good or bad before buying it. As we read the reviews, we look for keywords such as “amazing”, “good camera”, “best battery backup”, and other terms related to a mobile phone.
We tend to ignore common English words such as “it”, “gave”, “this”, etc. Thus, when we decide whether to buy the phone, we remember only the keywords mentioned above and most probably forget the other words.
The Long Short-Term Memory algorithm works in the same way. It remembers only the relevant information and uses it to make predictions, ignoring the non-relevant data. In this way, we have to build an LSTM model that recognises only the essential data about the stock and leaves out the rest.
Though the structure of an LSTM may seem intimidating at first, it is sufficient to remember that an LSTM is an advanced version of a Recurrent Neural Network that retains memory in order to process sequences of data. It can remove information from or add information to its cell state, and these changes are carefully regulated by structures called gates.
The LSTM unit comprises a cell, an input gate, an output gate, and a forget gate. The cell remembers values over arbitrary time intervals, and the three gates regulate the flow of information into and out of the cell.
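To make the role of the cell and the three gates concrete, here is a minimal NumPy sketch of a single LSTM time step using the standard textbook equations. It is only an illustration and not the Keras implementation: the function names `sigmoid` and `lstm_step`, the weight layout, and the random weights are all assumptions made for this example.

```python
import numpy as np

def sigmoid(x):
    """Squash values into (0, 1) so they can act as gate 'openness'."""
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step (illustrative layout, not Keras internals).

    W: (4*units, n_inputs) input weights
    U: (4*units, units)    recurrent weights
    b: (4*units,)          biases
    """
    units = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0 * units:1 * units])  # input gate: how much new info to store
    f = sigmoid(z[1 * units:2 * units])  # forget gate: how much old state to keep
    g = np.tanh(z[2 * units:3 * units])  # candidate values to add to the cell
    o = sigmoid(z[3 * units:4 * units])  # output gate: how much state to expose
    c_t = f * c_prev + i * g             # cell state updated by multiply and add
    h_t = o * np.tanh(c_t)               # hidden state passed to the next step
    return h_t, c_t

# Run five time steps of 4 features through a 3-unit cell with random weights.
rng = np.random.default_rng(0)
n_inputs, units = 4, 3
W = rng.normal(size=(4 * units, n_inputs))
U = rng.normal(size=(4 * units, units))
b = np.zeros(4 * units)
h, c = np.zeros(units), np.zeros(units)
for x_t in rng.normal(size=(5, n_inputs)):
    h, c = lstm_step(x_t, h, c, W, U, b)
print(h.shape, c.shape)
```

The line `c_t = f * c_prev + i * g` is the "multiplications and additions" step: the forget gate scales down old memory and the input gate decides how much of the new candidate to add.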
We shall move on to the part where we put the LSTM into use in predicting the stock value using Machine Learning in Python.
As we all know, the first step is to import libraries that are necessary to preprocess the stock data of Microsoft Corporation and the other required libraries for building and visualising the outputs of the LSTM model. For this, we will use the Keras library under the TensorFlow framework. The required modules are imported from the Keras library individually.
#Importing the Libraries
import pandas as pd
import numpy as np
# Jupyter magic: render plots inside the notebook
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from sklearn import linear_model
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import mean_squared_error, r2_score
import keras.backend as K
from keras.models import Sequential, load_model
from keras.layers import LSTM, Dense, Dropout
from keras.callbacks import EarlyStopping
from keras.optimizers import Adam
# plot_model needs the pydot and graphviz packages installed
from keras.utils import plot_model
Using pandas, we load the stock data from a local comma-separated values (.csv) file into a DataFrame. Finally, we view the first few rows.
#Get the Dataset
df = pd.read_csv("MicrosoftStockData.csv", na_values=['null'], index_col='Date', parse_dates=True)
df.head()
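For readers who do not have the .csv file at hand, the same loading call can be tried on an in-memory sample. The sketch below reuses the five rows shown in this article's sample output; the names `sample_csv`, `sample`, and `gaps` are introduced only for this illustration. It also checks the weekend gap mentioned earlier.

```python
import io
import pandas as pd

# Five rows copied from the article's sample output, held in memory
# instead of in MicrosoftStockData.csv.
sample_csv = io.StringIO(
    "Date,Open,High,Low,Close,Adj Close,Volume\n"
    "1990-01-02,0.605903,0.616319,0.598090,0.616319,0.447268,53033600\n"
    "1990-01-03,0.621528,0.626736,0.614583,0.619792,0.449788,113772800\n"
    "1990-01-04,0.619792,0.638889,0.616319,0.638021,0.463017,125740800\n"
    "1990-01-05,0.635417,0.638889,0.621528,0.622396,0.451678,69564800\n"
    "1990-01-08,0.621528,0.631944,0.614583,0.631944,0.458607,58982400\n"
)

# Same options as the article: 'null' strings become NaN, and the Date
# column is parsed into a DatetimeIndex.
sample = pd.read_csv(sample_csv, na_values=["null"], index_col="Date", parse_dates=True)

# Days elapsed between consecutive rows; the weekend shows up as a 3-day gap.
gaps = sample.index.to_series().diff().dt.days
print(sample.shape)
print(gaps)
```

The jump from 1990-01-05 (a Friday) to 1990-01-08 (a Monday) is the only gap larger than one day in this sample.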
In this next step, we first print the shape of the dataset and then check the DataFrame for null values. Null values tend to cause problems during training: missing entries propagate through the arithmetic as NaN and can make the loss undefined.
#Print Dataframe shape and Check for Null Values
print("Dataframe Shape: ", df.shape)
print("Null Value Present: ", df.isnull().values.any())
>> Dataframe Shape: (7334, 6)
>>Null Value Present: False
Date | Open | High | Low | Close | Adj Close | Volume |
1990-01-02 | 0.605903 | 0.616319 | 0.598090 | 0.616319 | 0.447268 | 53033600 |
1990-01-03 | 0.621528 | 0.626736 | 0.614583 | 0.619792 | 0.449788 | 113772800 |
1990-01-04 | 0.619792 | 0.638889 | 0.616319 | 0.638021 | 0.463017 | 125740800 |
1990-01-05 | 0.635417 | 0.638889 | 0.621528 | 0.622396 | 0.451678 | 69564800 |
1990-01-08 | 0.621528 | 0.631944 | 0.614583 | 0.631944 | 0.458607 | 58982400 |
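The null check above returned False, so no cleaning was needed for this dataset. Had it returned True, two common options would be to drop the affected rows or to carry the last known value forward. The sketch below shows both on a made-up toy DataFrame (the values and the names `toy`, `dropped`, and `filled` are purely illustrative); which option is appropriate depends on the data.

```python
import numpy as np
import pandas as pd

# Toy data with two missing entries (made up for illustration).
toy = pd.DataFrame(
    {"Open": [1.0, np.nan, 3.0], "Volume": [10.0, 20.0, np.nan]},
    index=pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
)

null_counts = toy.isnull().sum()  # missing entries per column
dropped = toy.dropna()            # option 1: discard rows with any NaN
filled = toy.ffill()              # option 2: repeat the previous day's value

print(null_counts)
print(dropped)
print(filled)
```

For daily stock data, forward filling keeps the date index intact, while dropping rows shortens the series.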
The value to be predicted by the Machine Learning model is the Adjusted Close value. It represents the stock’s closing value on a given trading day, adjusted for dividends.
#Plot the True Adj Close Value
df['Adj Close'].plot()
In the next step, we assign the output column to the target variable. In this case, it is the Adjusted Close value of the Microsoft stock. We also select the features that act as the independent variables for the target (dependent) variable. For training, we choose four features: Open, High, Low, and Volume.
#Set Target Variable
output_var = pd.DataFrame(df['Adj Close'])
#Selecting the Features
features = ['Open', 'High', 'Low', 'Volume']
Before training, we scale the feature values to the range 0 to 1. This puts features with very different magnitudes, such as prices below one dollar and volumes in the tens of millions, on a comparable scale, which generally helps gradient-based training. The scaling is performed by the MinMaxScaler class of the scikit-learn library.
#Scaling
scaler = MinMaxScaler()
feature_transform = scaler.fit_transform(df[features])
feature_transform= pd.DataFrame(columns=features, data=feature_transform, index=df.index)
feature_transform.head()
Date | Open | High | Low | Volume |
1990-01-02 | 0.000129 | 0.000105 | 0.000129 | 0.064837 |
1990-01-03 | 0.000265 | 0.000195 | 0.000273 | 0.144673 |
1990-01-04 | 0.000249 | 0.000300 | 0.000288 | 0.160404 |
1990-01-05 | 0.000386 | 0.000300 | 0.000334 | 0.086566 |
1990-01-08 | 0.000265 | 0.000240 | 0.000273 | 0.072656 |
As the output above shows, the feature values have been scaled down to the range 0 to 1.
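Min-max scaling maps each value x in a column to (x - min) / (max - min). As a small check, the sketch below computes this by hand on a made-up price column and compares the result with MinMaxScaler; the array `prices` and the name `scaler_demo` are assumptions for this example only.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# A single made-up feature column.
prices = np.array([[10.0], [15.0], [20.0], [30.0]])

# Min-max scaling by hand: (x - min) / (max - min), column by column.
col_min = prices.min(axis=0)
col_max = prices.max(axis=0)
manual = (prices - col_min) / (col_max - col_min)

# The same transformation with scikit-learn.
scaler_demo = MinMaxScaler()
scaled = scaler_demo.fit_transform(prices)

# inverse_transform maps scaled values back to the original units.
restored = scaler_demo.inverse_transform(scaled)

print(manual.ravel())
print(scaled.ravel())
print(restored.ravel())
```

Note that this article fits the scaler on the full dataset before splitting. A stricter alternative is to fit it on the training rows only and reuse it on the test rows, so that no test-set statistics reach the model.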
Before feeding the data into the model, we need to split the dataset into a training set and a test set. The LSTM model is trained (through backpropagation) on the training set and then evaluated on the test set.
For this, we use the TimeSeriesSplit class of the scikit-learn library with the number of splits set to 10. TimeSeriesSplit yields 10 successive folds in which the training rows always precede the test rows, so the model is never evaluated on data older than what it was trained on. Because the loop below overwrites its variables on every iteration, only the last fold is kept: with 7,334 rows, that is the first 6,668 rows (about 91%) for training and the final 666 rows (about 9%) for testing.
#Splitting to Training set and Test set
timesplit= TimeSeriesSplit(n_splits=10)
for train_index, test_index in timesplit.split(feature_transform):
    # Each iteration overwrites the previous one, so only the last fold survives
    X_train, X_test = feature_transform[:len(train_index)], feature_transform[len(train_index): (len(train_index)+len(test_index))]
    y_train, y_test = output_var[:len(train_index)].values.ravel(), output_var[len(train_index): (len(train_index)+len(test_index))].values.ravel()
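To see exactly how many rows each fold receives, the split can be run on a placeholder array with the same number of rows as the dataset. This is a small sketch; `dummy` and `fold_sizes` are names introduced here, and the array holds zeros because only the row count matters.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

n_rows = 7334                    # rows in the article's DataFrame
dummy = np.zeros((n_rows, 4))    # placeholder with 4 feature columns

# Collect (training rows, test rows) for each of the 10 folds.
fold_sizes = [
    (len(train_idx), len(test_idx))
    for train_idx, test_idx in TimeSeriesSplit(n_splits=10).split(dummy)
]

first_fold, last_fold = fold_sizes[0], fold_sizes[-1]
for number, (n_train, n_test) in enumerate(fold_sizes, start=1):
    print(f"fold {number}: train={n_train}, test={n_test}")
```

The training window grows from fold to fold while the test window keeps a fixed size and always sits immediately after the training rows. The last fold is the one the article's loop ends up using.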
Once the training and test sets are ready, we can feed the data into the LSTM model after it is built. Before that, we need to convert the training and test data into a form that the LSTM model accepts. We first convert both sets to NumPy arrays and then reshape them to the format (Number of Samples, 1, Number of Features), as the LSTM requires 3D input. Since the last fold of the split holds 6,668 training samples and there are 4 features, the training set is reshaped to (6668, 1, 4). The test set is reshaped in the same way.
#Process the data for LSTM
trainX =np.array(X_train)
testX =np.array(X_test)
X_train = trainX.reshape(X_train.shape[0], 1, X_train.shape[1])
X_test = testX.reshape(X_test.shape[0], 1, X_test.shape[1])
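The reshape above gives every sample exactly one time step. The sketch below shows that reshape on a tiny array and, for comparison, a hypothetical helper `make_windows` (not part of this article's pipeline) that stacks several consecutive rows into each sample, which is one common way to give an LSTM more history per prediction.

```python
import numpy as np

# 6 samples with 4 features each (values are arbitrary).
flat = np.arange(24, dtype=float).reshape(6, 4)

# The article's approach: one time step per sample -> (samples, 1, features).
one_step = flat.reshape(flat.shape[0], 1, flat.shape[1])

def make_windows(data, timesteps):
    """Hypothetical helper: each sample holds `timesteps` consecutive rows."""
    n_windows = len(data) - timesteps + 1
    return np.stack([data[i:i + timesteps] for i in range(n_windows)])

# Alternative: 3 time steps per sample -> (4, 3, 4).
windows = make_windows(flat, timesteps=3)

print(one_step.shape)
print(windows.shape)
```

If windows were used, the targets would need to be aligned with the last row of each window (for example `y[timesteps - 1:]`), and the model's `input_shape` would change accordingly.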
Next, we come to the stage where we build the LSTM model. Here, we create a Sequential Keras model with one LSTM layer of 32 units, followed by one Dense layer with a single neuron.
We compile the model with the Adam optimizer and Mean Squared Error as the loss function, a common combination for regression with an LSTM. Additionally, the model architecture is plotted with plot_model.
#Building the LSTM Model
lstm = Sequential()
lstm.add(LSTM(32, input_shape=(1, trainX.shape[1]), activation='relu', return_sequences=False))
lstm.add(Dense(1))
lstm.compile(loss='mean_squared_error', optimizer='adam')
plot_model(lstm, show_shapes=True, show_layer_names=True)
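As a sanity check on the size of this model, the number of trainable parameters can be worked out by hand. The sketch below uses the standard counting formulas for an LSTM layer with biases and for a Dense layer; the helper names `lstm_param_count` and `dense_param_count` are introduced here for illustration.

```python
def lstm_param_count(units, n_features):
    """Weights in a standard LSTM layer with biases.

    There are 4 blocks (input gate, forget gate, cell candidate, output gate),
    each with input weights, recurrent weights, and a bias vector.
    """
    return 4 * (units * (n_features + units) + units)

def dense_param_count(n_inputs, n_outputs):
    """Weights plus biases in a fully connected layer."""
    return n_inputs * n_outputs + n_outputs

lstm_params = lstm_param_count(units=32, n_features=4)
dense_params = dense_param_count(n_inputs=32, n_outputs=1)
total_params = lstm_params + dense_params

print(lstm_params, dense_params, total_params)
```

These figures should match what `lstm.summary()` reports for the model built above, which gives a quick way to confirm that the layers were configured as intended.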
Finally, we train the LSTM model designed above on the training data for 100 epochs with a batch size of 8 using the fit function.
#Model Training
history = lstm.fit(X_train, y_train, epochs=100, batch_size=8, verbose=1, shuffle=False)
Epoch 1/100
834/834 [==============================] – 3s 2ms/step – loss: 67.1211
Epoch 2/100
834/834 [==============================] – 1s 2ms/step – loss: 70.4911
Epoch 3/100
834/834 [==============================] – 1s 2ms/step – loss: 48.8155
Epoch 4/100
834/834 [==============================] – 1s 2ms/step – loss: 21.5447
Epoch 5/100
834/834 [==============================] – 1s 2ms/step – loss: 6.1709
Epoch 6/100
834/834 [==============================] – 1s 2ms/step – loss: 1.8726
Epoch 7/100
834/834 [==============================] – 1s 2ms/step – loss: 0.9380
Epoch 8/100
834/834 [==============================] – 2s 2ms/step – loss: 0.6566
Epoch 9/100
834/834 [==============================] – 1s 2ms/step – loss: 0.5369
Epoch 10/100
834/834 [==============================] – 2s 2ms/step – loss: 0.4761
.
.
.
.
Epoch 95/100
834/834 [==============================] – 1s 2ms/step – loss: 0.4542
Epoch 96/100
834/834 [==============================] – 2s 2ms/step – loss: 0.4553
Epoch 97/100
834/834 [==============================] – 1s 2ms/step – loss: 0.4565
Epoch 98/100
834/834 [==============================] – 1s 2ms/step – loss: 0.4576
Epoch 99/100
834/834 [==============================] – 1s 2ms/step – loss: 0.4588
Epoch 100/100
834/834 [==============================] – 1s 2ms/step – loss: 0.4599
We see that the loss dropped sharply during the first ten epochs, from 67.1211 to 0.4761, and then levelled off, ending at 0.4599 after 100 epochs.
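Two numbers in the training log can be checked with simple arithmetic. The sketch below assumes 6,668 training rows, which is what scikit-learn's last TimeSeriesSplit fold yields for 7,334 rows and 10 splits. It also converts the final loss into a root mean squared error; since Keras reports the loss as a running average over batches, this is only an approximate figure for the training set.

```python
import math

n_train = 6668        # assumed: rows in the last TimeSeriesSplit training fold
batch_size = 8        # from the fit() call

# Number of batches per epoch; the last batch is smaller than the rest.
steps_per_epoch = math.ceil(n_train / batch_size)

# The loss is a mean squared error, so its square root is in the target's units.
final_mse = 0.4599    # loss reported at epoch 100
final_rmse = math.sqrt(final_mse)

print(steps_per_epoch)
print(round(final_rmse, 3))
```

The step count agrees with the 834 shown on every line of the log. Because the target was left unscaled, the RMSE is in the same units as the Adj Close column.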
With our model ready, it is time to apply the trained LSTM network to the test set and predict the Adjusted Close value of the Microsoft stock. This is done by calling the predict function on the lstm model.
#LSTM Prediction
y_pred= lstm.predict(X_test)
Finally, having predicted the test set’s values, we can plot the true Adj Close values against the values predicted by the LSTM model.
#True vs Predicted Adj Close Value - LSTM
plt.plot(y_test, label='True Value')
plt.plot(y_pred, label='LSTM Value')
plt.title("Prediction by LSTM")
plt.xlabel('Time Scale')
# The target (Adj Close) was not scaled, so the axis is in plain USD
plt.ylabel('USD')
plt.legend()
plt.show()
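The plot gives a visual comparison; the `mean_squared_error` and `r2_score` functions imported at the start can put numbers on it. The sketch below uses small made-up arrays in place of `y_test` and `y_pred`, so the printed values illustrate the calculation only and are not results from the Microsoft data.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Made-up stand-ins for y_test and y_pred.
y_true_demo = np.array([1.0, 2.0, 3.0, 4.0])
# Keras predict() returns shape (n_samples, 1), so mimic that here.
y_pred_demo = np.array([[1.1], [1.9], [3.2], [3.8]])

flat_pred = y_pred_demo.ravel()          # flatten to match y_true's shape
mse = mean_squared_error(y_true_demo, flat_pred)
rmse = float(np.sqrt(mse))               # error in the target's own units
r2 = r2_score(y_true_demo, flat_pred)    # 1.0 would be a perfect fit

print(f"MSE={mse:.4f}  RMSE={rmse:.4f}  R2={r2:.4f}")
```

With the real arrays, the same calls would be `mean_squared_error(y_test, y_pred.ravel())` and `r2_score(y_test, y_pred.ravel())`.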
The above graph shows that some pattern is detected by the very basic single LSTM network model built above. By fine-tuning several parameters and adding more LSTM layers to the model, we can achieve a more accurate representation of any given company’s stock value.