Time Series forecasting using Python

Time series forecasting is a technique for predicting future values of a quantity that is observed over time. It is used in many fields of study, from geology to behavioural science to economics. Forecasting methods analyze trends in past observations and assume that future trends will resemble the historical ones.

In this article, I will build a machine learning model that forecasts a time series with an LSTM (Long Short-Term Memory) network.

I will start by importing all the packages we need:

In [1]:
import numpy
import matplotlib.pyplot as plt
import pandas
import math
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
# fix random seed for reproducibility (the value 7 is an arbitrary choice;
# this seeds NumPy only)
numpy.random.seed(7)

Now let’s load the data and prepare it for the LSTM model. The values are cast to float32 and rescaled to the range [0, 1] with MinMaxScaler:

In [2]:
dataframe = pandas.read_csv('dataaa.csv', usecols=[1], engine='python')
dataset = dataframe.values
dataset = dataset.astype('float32')

scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(dataset)
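
To make the scaling step concrete, here is a minimal NumPy sketch of the min-max formula x' = (x - min) / (max - min), which is what MinMaxScaler with feature_range=(0, 1) applies to each column. The helper names scale_to_unit and invert_scale and the sample values are made up for illustration; they are not part of scikit-learn.

```python
import numpy as np

def scale_to_unit(values):
    """Rescale an array to [0, 1]; also return the min and max that were used."""
    lo, hi = values.min(), values.max()
    return (values - lo) / (hi - lo), lo, hi

def invert_scale(scaled, lo, hi):
    """Undo scale_to_unit, mapping [0, 1] back to the original units."""
    return scaled * (hi - lo) + lo

# Illustrative values only, shaped like a one-column dataset.
raw = np.array([[100.0], [150.0], [300.0]], dtype="float32")
scaled, lo, hi = scale_to_unit(raw)
restored = invert_scale(scaled, lo, hi)

print(scaled.ravel())    # values 0.0, 0.25, 1.0
print(restored.ravel())  # back to 100, 150, 300
```

The inverse mapping matters later on: predictions come out of the model in the scaled range and have to be converted back before an error in the original units can be computed.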

Next, I will split the data into a training set and a test set in an 80:20 ratio, keeping the time order:

In [3]:
train_size = int(len(dataset) * 0.80)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size,:], dataset[train_size:len(dataset),:]
print(len(train), len(test))
115 29
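
Note that the split is chronological: the first 80% of the rows become the training set and the remaining rows become the test set, with no shuffling. A small sketch on a made-up 10-point series (the array is illustrative, not the data used in this article) shows the same slicing:

```python
import numpy as np

# Toy series 0..9, shaped (10, 1) like the scaled dataset.
series = np.arange(10, dtype="float32").reshape(-1, 1)

train_size = int(len(series) * 0.80)  # first 80% of the rows
train, test = series[:train_size, :], series[train_size:, :]

print(len(train), len(test))       # 8 2
print(train[-1, 0], test[0, 0])    # last training value, first test value
```

Every test value comes after every training value, so the model is always evaluated on observations that lie in its future.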

Before training the LSTM model, we need to turn the series into input/output pairs. For this task I will define a helper function that converts an array of values into a dataset matrix: each input is a window of look_back consecutive values, and the target is the value that follows the window.

In [4]:
def create_dataset(dataset, look_back=1):
	# dataX collects the input windows, dataY the value that follows each window
	dataX, dataY = [], []
	for i in range(len(dataset)-look_back-1):
		a = dataset[i:(i+look_back), 0]
		dataX.append(a)
		dataY.append(dataset[i + look_back, 0])
	return numpy.array(dataX), numpy.array(dataY)
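
A small worked example may help to see what the windowing produces. The sketch below uses a hypothetical helper, make_windows, that mirrors the logic of create_dataset (including its loop bound), applied to a made-up six-point series:

```python
import numpy as np

def make_windows(series, look_back=1):
    """Return (X, y): X[i] holds look_back consecutive values, y[i] the next one."""
    X, y = [], []
    # Same loop bound as create_dataset; it stops one pair short of the end.
    for i in range(len(series) - look_back - 1):
        X.append(series[i:i + look_back, 0])
        y.append(series[i + look_back, 0])
    return np.array(X), np.array(y)

# Illustrative values only.
toy = np.array([[10.0], [20.0], [30.0], [40.0], [50.0], [60.0]])

X1, y1 = make_windows(toy, look_back=1)
X3, y3 = make_windows(toy, look_back=3)

print(X1.ravel(), y1)  # inputs 10, 20, 30, 40 paired with targets 20, 30, 40, 50
print(X3, y3)          # windows [10 20 30] and [20 30 40] paired with targets 40 and 50
```

With this loop bound, a series of n points yields n - look_back - 1 pairs. For the 115 training rows and look_back = 1 that gives 113 samples, which is the count shown in the training log.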

Now we need to build the input/output pairs and reshape the inputs into the three-dimensional form the LSTM layer expects:

In [5]:
# reshape into X=t and Y=t+1
look_back = 1
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)
# reshape input to be [samples, time steps, features]
trainX = numpy.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
testX = numpy.reshape(testX, (testX.shape[0], 1, testX.shape[1]))
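
The reshape is easier to follow on a tiny array. In this sketch (made-up values, with look_back set to 3 so the effect is visible), a 2-D matrix of shape (samples, look_back) becomes a 3-D array of shape (samples, 1, look_back), meaning one time step that carries look_back features:

```python
import numpy as np

look_back = 3
# 4 samples, each holding 3 lagged values (illustrative numbers).
X = np.arange(12, dtype="float32").reshape(4, look_back)

# Same call pattern as above: [samples, time steps, features]
X_lstm = np.reshape(X, (X.shape[0], 1, X.shape[1]))

print(X.shape, X_lstm.shape)  # (4, 3) (4, 1, 3)
```

The values themselves are untouched; only an axis of length 1 is added. A different layout, (samples, look_back, 1), would instead present the lagged values as successive time steps of a single feature, and the model's input_shape would have to change to match.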

Now that the data preparation is complete, it is time to build the model and train it:

In [6]:
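model = Sequential()
# one LSTM layer with 4 units; each sample is 1 time step with look_back features
model.add(LSTM(4, input_shape=(1, look_back)))
# single output unit, so the model predicts one value per sample
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=30, batch_size=1, verbose=2)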
Epoch 1/30
113/113 - 1s - loss: 0.0612
Epoch 2/30
113/113 - 0s - loss: 0.0287
Epoch 3/30
113/113 - 0s - loss: 0.0212
Epoch 4/30
113/113 - 0s - loss: 0.0192
Epoch 5/30
113/113 - 0s - loss: 0.0176
Epoch 6/30
113/113 - 0s - loss: 0.0158
Epoch 7/30
113/113 - 0s - loss: 0.0141
Epoch 8/30
113/113 - 0s - loss: 0.0124
Epoch 9/30
113/113 - 0s - loss: 0.0107
Epoch 10/30
113/113 - 0s - loss: 0.0092
Epoch 11/30
113/113 - 0s - loss: 0.0077
Epoch 12/30
113/113 - 0s - loss: 0.0065
Epoch 13/30
113/113 - 0s - loss: 0.0054
Epoch 14/30
113/113 - 0s - loss: 0.0045
Epoch 15/30
113/113 - 0s - loss: 0.0038
Epoch 16/30
113/113 - 0s - loss: 0.0034
Epoch 17/30
113/113 - 0s - loss: 0.0031
Epoch 18/30
113/113 - 0s - loss: 0.0029
Epoch 19/30
113/113 - 0s - loss: 0.0027
Epoch 20/30
113/113 - 0s - loss: 0.0027
Epoch 21/30
113/113 - 0s - loss: 0.0026
Epoch 22/30
113/113 - 0s - loss: 0.0027
Epoch 23/30
113/113 - 0s - loss: 0.0026
Epoch 24/30
113/113 - 0s - loss: 0.0026
Epoch 25/30
113/113 - 0s - loss: 0.0027
Epoch 26/30
113/113 - 0s - loss: 0.0026
Epoch 27/30
113/113 - 0s - loss: 0.0026
Epoch 28/30
113/113 - 0s - loss: 0.0026
Epoch 29/30
113/113 - 0s - loss: 0.0026
Epoch 30/30
113/113 - 0s - loss: 0.0027
<tensorflow.python.keras.callbacks.History at 0x7ffb2f5eaed0>
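
To get a feel for how small this network is, the sketch below counts trainable parameters and weight updates per epoch in plain Python. It relies on the standard LSTM layout of four gates, each with input weights, recurrent weights, and a bias. The helper names are hypothetical, and the single-unit dense output layer is an assumption about the model rather than something read from the training log.

```python
import math

def lstm_param_count(units, input_dim):
    """Four gates, each with input weights, recurrent weights, and a bias."""
    return 4 * (units * (input_dim + units) + units)

def dense_param_count(units, input_dim):
    """One weight per input per unit, plus one bias per unit."""
    return units * input_dim + units

lstm_params = lstm_param_count(units=4, input_dim=1)    # look_back = 1 feature
dense_params = dense_param_count(units=1, input_dim=4)  # assumed output layer
steps_per_epoch = math.ceil(113 / 1)                    # 113 samples, batch_size=1

print(lstm_params, dense_params, steps_per_epoch)  # 96 5 113
```

With batch_size=1 the weights are updated after every sample, which is why each epoch in the log reports 113 steps.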

Now let’s make predictions, compute the error in the original units of the data, and plot the results with matplotlib:

In [7]:
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)
# invert predictions
trainPredict = scaler.inverse_transform(trainPredict)
trainY = scaler.inverse_transform([trainY])
testPredict = scaler.inverse_transform(testPredict)
testY = scaler.inverse_transform([testY])
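# calculate root mean squared error in the original units of the data
trainScore = math.sqrt(mean_squared_error(trainY[0], trainPredict[:,0]))
print('Train Score: %.2f RMSE' % (trainScore))
testScore = math.sqrt(mean_squared_error(testY[0], testPredict[:,0]))
print('Test Score: %.2f RMSE' % (testScore))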
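
As a quick illustration of the metric, root mean squared error is the square root of the average squared difference between actual and predicted values. The helper rmse and the numbers below are made up for this sketch and are not results from the model:

```python
import math
import numpy as np

def rmse(actual, predicted):
    """Square root of the mean squared difference between two sequences."""
    actual = np.asarray(actual, dtype="float64")
    predicted = np.asarray(predicted, dtype="float64")
    return math.sqrt(np.mean((actual - predicted) ** 2))

# Illustrative values: errors of 10, -10 and 30.
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 330.0]

print(round(rmse(actual, predicted), 2))  # 19.15
```

Because the squares are averaged before the root is taken, the single error of 30 weighs more heavily than the two errors of 10, and the result is expressed in the same units as the data.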

# shift train predictions for plotting
trainPredictPlot = numpy.empty_like(dataset)
trainPredictPlot[:, :] = numpy.nan
trainPredictPlot[look_back:len(trainPredict)+look_back, :] = trainPredict
# shift test predictions for plotting
testPredictPlot = numpy.empty_like(dataset)
testPredictPlot[:, :] = numpy.nan
testPredictPlot[len(trainPredict)+(look_back*2)+1:len(dataset)-1, :] = testPredict
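# plot baseline and predictions
plt.plot(scaler.inverse_transform(dataset))
plt.plot(trainPredictPlot)
plt.plot(testPredictPlot)
plt.show()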
[Figure: the original series with the training and test predictions overlaid]
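
The index arithmetic used to place the predictions on the plot is easy to get wrong, so here is a small sketch that checks it. The helper prediction_slots is hypothetical; it assumes, as in the code above, that windowing a series of n points yields n - look_back - 1 samples, and it uses the sizes from this article (144 rows in total, 115 of them for training).

```python
import numpy as np

def prediction_slots(n_total, train_size, look_back):
    """Row ranges where train and test predictions belong on the full timeline."""
    n_train_pred = train_size - look_back - 1
    n_test_pred = (n_total - train_size) - look_back - 1
    # A prediction for row i is made from the look_back rows before it.
    train_slot = slice(look_back, n_train_pred + look_back)
    test_slot = slice(n_train_pred + 2 * look_back + 1, n_total - 1)
    return train_slot, test_slot, n_train_pred, n_test_pred

train_slot, test_slot, n_tr, n_te = prediction_slots(144, 115, look_back=1)

# NaN-filled canvas: matplotlib leaves gaps wherever the value is NaN.
canvas = np.full((144, 1), np.nan)
canvas[test_slot, :] = np.ones((n_te, 1))

print(train_slot.start, train_slot.stop)  # 1 114
print(test_slot.start, test_slot.stop)    # 116 143
```

Each slot is exactly as long as the corresponding prediction array (113 and 27 rows), so the assignments succeed and both curves line up with the dates they refer to.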

