Time Series Forecasting
Time series forecasting is the technique of predicting future events from a sequence of observations ordered in time. It is used in many fields of study, from geology to behavioural science to economics. Forecasting methods predict future values by analyzing trends in past data, under the assumption that future patterns will resemble historical ones.
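For intuition, here is a minimal sketch of the simplest possible forecast, a "persistence" model that predicts each value as the one before it. The numbers are made up purely for illustration:

import numpy

# toy series (hypothetical values, purely for illustration)
series = numpy.array([112.0, 118.0, 132.0, 129.0, 121.0, 135.0, 148.0])

# persistence forecast: predict each value as the previous observation
predictions = series[:-1]
actuals = series[1:]

rmse = numpy.sqrt(numpy.mean((actuals - predictions) ** 2))
print("Persistence baseline RMSE: %.2f" % rmse)

A model like the LSTM built below is only worthwhile if it beats this kind of naive baseline.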
In this article, I will create a machine learning model to forecast a time series with an LSTM network.
I will start by importing all the packages we will need:
import numpy
import matplotlib.pyplot as plt
import pandas
import math
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error

# fix random seed for reproducibility
numpy.random.seed(7)
Now let’s load the data and prepare it for the LSTM model. The values are scaled to the range 0–1 with MinMaxScaler, since LSTMs are sensitive to the scale of the input data:
# load the dataset (the values are in the second column of the CSV)
dataframe = pandas.read_csv('dataaa.csv', usecols=[1], engine='python')
dataset = dataframe.values
dataset = dataset.astype('float32')

# normalize the dataset to the range [0, 1]
scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(dataset)
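The file 'dataaa.csv' is the author's own dataset and is not included here. If you want to reproduce the steps without it, a hypothetical stand-in with the same layout (values in the second column) can be written first; the 115/29 split printed below implies the original series has 144 observations, so this sketch generates 144 values:

import numpy
import pandas

# hypothetical stand-in data: a noisy seasonal series with 144 observations,
# saved in the same layout as the author's file (values in the second column)
t = numpy.arange(144)
values = 100 + 50 * numpy.sin(2 * numpy.pi * t / 12) + numpy.random.normal(0, 5, size=len(t))
pandas.DataFrame({'Month': t, 'Value': values}).to_csv('dataaa.csv', index=False)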
Now, I will split the data into a training set and a test set in an 80:20 ratio. Since this is a time series, the split is chronological rather than random: the first 80% of the observations form the training set and the remaining 20% form the test set.
train_size = int(len(dataset) * 0.80)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size, :], dataset[train_size:len(dataset), :]
print(len(train), len(test))
115 29
Before training the LSTM model, we need to turn the series into supervised-learning samples: for each position in the series, the input is the previous look_back values and the target is the value that follows them. For this task, I will define a helper function:
# convert an array of values into a dataset matrix
def create_dataset(dataset, look_back=1):
    dataX, dataY = [], []
    for i in range(len(dataset) - look_back - 1):
        a = dataset[i:(i + look_back), 0]
        dataX.append(a)
        dataY.append(dataset[i + look_back, 0])
    return numpy.array(dataX), numpy.array(dataY)
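To see what this helper produces, here is a quick illustration on a made-up array of five scaled values; note that because of the -1 in the loop bound, the last value is never used as a target:

sample = numpy.array([[0.1], [0.2], [0.3], [0.4], [0.5]])
X, y = create_dataset(sample, look_back=1)
print(X)  # inputs:  0.1, 0.2, 0.3  (shape (3, 1))
print(y)  # targets: 0.2, 0.3, 0.4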
Now, we need to reshape the data into the [samples, time steps, features] format that the LSTM layer expects:
# reshape into X=t and Y=t+1
look_back = 1
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)

# reshape input to be [samples, time steps, features]
trainX = numpy.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
testX = numpy.reshape(testX, (testX.shape[0], 1, testX.shape[1]))
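As a quick sanity check, the resulting arrays follow the [samples, time steps, features] convention; with the 115/29 split above and look_back=1, the shapes should come out roughly as shown in the comments:

print(trainX.shape)  # (113, 1, 1): 113 samples, 1 time step, 1 feature
print(testX.shape)   # (27, 1, 1)
print(trainY.shape)  # (113,)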
Now that the data preparation is complete, it’s time to define the LSTM model and train it on the training set:
model = Sequential()
model.add(LSTM(4, input_shape=(1, look_back)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=30, batch_size=1, verbose=2)
Epoch 1/30  113/113 - 1s - loss: 0.0612
Epoch 2/30  113/113 - 0s - loss: 0.0287
Epoch 3/30  113/113 - 0s - loss: 0.0212
Epoch 4/30  113/113 - 0s - loss: 0.0192
Epoch 5/30  113/113 - 0s - loss: 0.0176
Epoch 6/30  113/113 - 0s - loss: 0.0158
Epoch 7/30  113/113 - 0s - loss: 0.0141
Epoch 8/30  113/113 - 0s - loss: 0.0124
Epoch 9/30  113/113 - 0s - loss: 0.0107
Epoch 10/30  113/113 - 0s - loss: 0.0092
Epoch 11/30  113/113 - 0s - loss: 0.0077
Epoch 12/30  113/113 - 0s - loss: 0.0065
Epoch 13/30  113/113 - 0s - loss: 0.0054
Epoch 14/30  113/113 - 0s - loss: 0.0045
Epoch 15/30  113/113 - 0s - loss: 0.0038
Epoch 16/30  113/113 - 0s - loss: 0.0034
Epoch 17/30  113/113 - 0s - loss: 0.0031
Epoch 18/30  113/113 - 0s - loss: 0.0029
Epoch 19/30  113/113 - 0s - loss: 0.0027
Epoch 20/30  113/113 - 0s - loss: 0.0027
Epoch 21/30  113/113 - 0s - loss: 0.0026
Epoch 22/30  113/113 - 0s - loss: 0.0027
Epoch 23/30  113/113 - 0s - loss: 0.0026
Epoch 24/30  113/113 - 0s - loss: 0.0026
Epoch 25/30  113/113 - 0s - loss: 0.0027
Epoch 26/30  113/113 - 0s - loss: 0.0026
Epoch 27/30  113/113 - 0s - loss: 0.0026
Epoch 28/30  113/113 - 0s - loss: 0.0026
Epoch 29/30  113/113 - 0s - loss: 0.0026
Epoch 30/30  113/113 - 0s - loss: 0.0027
<tensorflow.python.keras.callbacks.History at 0x7ffb2f5eaed0>
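The loss levels off at roughly 0.0026 after about 20 epochs, so 30 epochs are enough here. As an optional, hedged variation (not used in the rest of this article), the same architecture could be trained with a validation split and early stopping instead of a fixed epoch count; the settings below are illustrative only:

from keras.callbacks import EarlyStopping

# illustrative variant: same architecture, trained with early stopping
model_alt = Sequential()
model_alt.add(LSTM(4, input_shape=(1, look_back)))
model_alt.add(Dense(1))
model_alt.compile(loss='mean_squared_error', optimizer='adam')

early_stop = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)
model_alt.fit(trainX, trainY, epochs=100, batch_size=1, verbose=2,
              validation_split=0.1, callbacks=[early_stop], shuffle=False)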
Now, let’s make predictions with the trained model, invert the scaling back to the original units, and visualize the time series trends using the matplotlib package in Python:
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)

# invert predictions back to the original scale
trainPredict = scaler.inverse_transform(trainPredict)
trainY = scaler.inverse_transform([trainY])
testPredict = scaler.inverse_transform(testPredict)
testY = scaler.inverse_transform([testY])

# calculate root mean squared error
trainScore = math.sqrt(mean_squared_error(trainY[0], trainPredict[:, 0]))
testScore = math.sqrt(mean_squared_error(testY[0], testPredict[:, 0]))
print('Train Score: %.2f RMSE' % trainScore)
print('Test Score: %.2f RMSE' % testScore)

# shift train predictions for plotting
trainPredictPlot = numpy.empty_like(dataset)
trainPredictPlot[:, :] = numpy.nan
trainPredictPlot[look_back:len(trainPredict) + look_back, :] = trainPredict

# shift test predictions for plotting
testPredictPlot = numpy.empty_like(dataset)
testPredictPlot[:, :] = numpy.nan
testPredictPlot[len(trainPredict) + (look_back * 2) + 1:len(dataset) - 1, :] = testPredict

# plot baseline and predictions
plt.plot(scaler.inverse_transform(dataset))
plt.plot(trainPredictPlot)
plt.plot(testPredictPlot)
plt.show()
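The resulting plot shows the original series with the training and test predictions overlaid on it. As a final hedged sketch (not part of the code above), a one-step-ahead forecast beyond the observed data can be made by feeding the last scaled observation through the model and inverting the scaling; this assumes look_back=1 as used throughout:

# illustrative one-step-ahead forecast from the last observed (scaled) value
last_window = dataset[-look_back:].reshape(1, 1, look_back)
next_scaled = model.predict(last_window)
next_value = scaler.inverse_transform(next_scaled)
print("Forecast for the next time step: %.2f" % next_value[0, 0])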