Thanks to theidioms.com

Time-Series Forecasting with TensorFlow 2.0

Creating Helper Functions

Since we will be building many time-series forecasting models, it is a good idea to create helper functions that keep our workflow smooth. In this lesson, we will develop helper functions for creating a data window, splitting a window into inputs and labels, creating visualizations, and building a tf.data.Dataset.

1. Creating a data window

In time-series forecasting, our model looks at a certain number of consecutive past observations to make a forecast. For example, we can predict one hour into the future by using the data from the past 6 hours.

This run of consecutive observations used for forecasting is known as a data window. Similarly, we can generate a prediction 24 hours into the future by using a data window of the past 24 hours.
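As a rough illustration of the idea, the sketch below pairs each run of 6 readings with the reading one step later. The series values and the helper name make_windows are made up for this example and are not part of the course code.

```python
# Toy hourly series: one reading per hour (values are made up).
series = [10.0, 10.5, 11.0, 11.5, 12.0, 12.5, 13.0, 13.5, 14.0]

def make_windows(values, input_width, shift):
    """Pair each run of `input_width` readings with the reading `shift` steps later."""
    total = input_width + shift
    pairs = []
    for start in range(len(values) - total + 1):
        inputs = values[start:start + input_width]
        label = values[start + total - 1]
        pairs.append((inputs, label))
    return pairs

windows = make_windows(series, input_width=6, shift=1)
first_inputs, first_label = windows[0]
print(len(windows))    # 3
print(first_inputs)    # [10.0, 10.5, 11.0, 11.5, 12.0, 12.5]
print(first_label)     # 13.0
```

Each pair is one training example: six past hours as input and the next hour as the target.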

So, let us construct a Python class that creates a data window to our requirements, given the training, validation and test DataFrames.

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

class WindowGenerator():
    def __init__(self, input_width, label_width, shift,
               train_df=train_df, val_df=val_df, test_df=test_df,
               label_columns=None):
        # Store the raw data. Refer to the previous chapter for the DataFrames.
        self.train_df = train_df
        self.val_df = val_df
        self.test_df = test_df

        # Work out the label column indices.
        self.label_columns = label_columns
        if label_columns is not None:
            self.label_columns_indices = {name: i for i, name in
                                        enumerate(label_columns)}
        self.column_indices = {name: i for i, name in
                               enumerate(train_df.columns)}

        # Work out the window parameters.
        self.input_width = input_width
        self.label_width = label_width
        self.shift = shift

        self.total_window_size = input_width + shift

        self.input_slice = slice(0, input_width)
        self.input_indices = np.arange(self.total_window_size)[self.input_slice]

        self.label_start = self.total_window_size - self.label_width
        self.labels_slice = slice(self.label_start, None)
        self.label_indices = np.arange(self.total_window_size)[self.labels_slice]

    def __repr__(self):
        return '\n'.join([
            f'Total window size: {self.total_window_size}',
            f'Input indices: {self.input_indices}',
            f'Label indices: {self.label_indices}',
            f'Label column name(s): {self.label_columns}'])  
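The index arithmetic in __init__ can be checked on its own, without any DataFrames. The function below is a minimal sketch that mirrors those lines with plain Python lists; window_indices is a hypothetical name used only here.

```python
def window_indices(input_width, label_width, shift):
    """Return (total_window_size, input_indices, label_indices) as plain lists."""
    total_window_size = input_width + shift
    positions = list(range(total_window_size))
    # Inputs occupy the first `input_width` positions of the window.
    input_indices = positions[:input_width]
    # Labels occupy the last `label_width` positions of the window.
    label_start = total_window_size - label_width
    label_indices = positions[label_start:]
    return total_window_size, input_indices, label_indices

# Six hours in, one label one hour ahead.
print(window_indices(input_width=6, label_width=1, shift=1))
# → (7, [0, 1, 2, 3, 4, 5], [6])

# 24 hours in, a single label 24 hours ahead.
total, inputs, labels = window_indices(input_width=24, label_width=1, shift=24)
print(total, inputs[0], inputs[-1], labels)
# → 48 0 23 [47]
```

Note that shift sets how far the last label lies beyond the last input, while label_width sets how many trailing positions count as labels.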

Great! Let us test if our window generator is working as intended.

# Predicting one hour into the future by using a data window of the past 6 hours
w1 = WindowGenerator(input_width=6, label_width=1, shift=1,
                     label_columns=['T (degC)'])
print(f'First Window: \n{w1}')

# Predicting 24 hours into the future by using a data window of the past 24 hours.
w2 = WindowGenerator(input_width=24, label_width=1, shift=24,
                     label_columns=['T (degC)'])
print(f'\nSecond Window: \n{w2}')
First Window: 
Total window size: 7
Input indices: [0 1 2 3 4 5]
Label indices: [6]
Label column name(s): ['T (degC)']

Second Window: 
Total window size: 48
Input indices: [ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
Label indices: [47]
Label column name(s): ['T (degC)']

2. Splitting a window

After creating a data window, we need to split it into two windows: a window of inputs and a window of labels.

In the above example, the first window (w1) can be split into two separate windows: an input window covering indices 0 to 5 and a label window covering index 6.

For this purpose, we will create a helper function split_window() that takes a batch of consecutive windows and converts it into a window of inputs and a window of labels.

def split_window(self, features):
    inputs = features[:, self.input_slice, :]
    labels = features[:, self.labels_slice, :]
    if self.label_columns is not None:
        labels = tf.stack(
            [labels[:, :, self.column_indices[name]] for name in self.label_columns],
            axis=-1)

    # Slicing doesn't preserve static shape information, so set the shapes
    # manually. This way the `tf.data.Datasets` are easier to inspect.
    inputs.set_shape([None, self.input_width, None])
    labels.set_shape([None, self.label_width, None])

    return inputs, labels

WindowGenerator.split_window = split_window
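The same slicing can be tried on a small NumPy array without TensorFlow. This is only a sketch: the toy batch, the column names and the choice of 'target' as the label column are assumptions made for illustration, and np.stack stands in for tf.stack.

```python
import numpy as np

# Hypothetical toy batch: 2 windows, 7 time steps, 3 feature columns.
batch, total_window_size, num_features = 2, 7, 3
features = np.arange(batch * total_window_size * num_features,
                     dtype=np.float32).reshape(batch, total_window_size, num_features)

input_width, label_width = 6, 1
input_slice = slice(0, input_width)
labels_slice = slice(total_window_size - label_width, None)

# Assumed column layout; 'target' stands in for the label column.
column_indices = {'a': 0, 'target': 1, 'b': 2}
label_columns = ['target']

inputs = features[:, input_slice, :]
labels = features[:, labels_slice, :]
# Keep only the label columns, stacked along a new last axis.
labels = np.stack([labels[:, :, column_indices[name]] for name in label_columns],
                  axis=-1)

print(inputs.shape)  # (2, 6, 3)
print(labels.shape)  # (2, 1, 1)
```

The inputs keep every feature column, while the labels keep only the columns named in label_columns.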

Let us try using the function for the above example (w1).

# Stack three slices, the length of the total window:
example_window = tf.stack([np.array(train_df[:w1.total_window_size]),
                           np.array(train_df[100:100+w1.total_window_size]),
                           np.array(train_df[200:200+w1.total_window_size])])


example_inputs, example_labels = w1.split_window(example_window)

print('All shapes are: (batch, time, features)')
print(f'Window shape: {example_window.shape}')
print(f'Inputs shape: {example_inputs.shape}')
print(f'labels shape: {example_labels.shape}')
All shapes are: (batch, time, features) 
Window shape: (3, 7, 19) 
Inputs shape: (3, 6, 19) 
labels shape: (3, 1, 1)

3. Visualizing data

Now, let us create a helper function to visualize the dataset.

def plot(self, model=None, plot_col='T (degC)', max_subplots=3):
    inputs, labels = self.example
    plt.figure(figsize=(12, 8))
    plot_col_index = self.column_indices[plot_col]
    max_n = min(max_subplots, len(inputs))
    for n in range(max_n):
        # One subplot per example in the batch.
        plt.subplot(max_n, 1, n+1)
        plt.ylabel(f'{plot_col} [normed]')
        plt.plot(self.input_indices, inputs[n, :, plot_col_index],
                 label='Inputs', marker='.', zorder=-10)

        # Find where plot_col sits among the label columns, if at all.
        if self.label_columns:
            label_col_index = self.label_columns_indices.get(plot_col, None)
        else:
            label_col_index = plot_col_index

        # Skip labels and predictions when plot_col is not a label column.
        if label_col_index is None:
            continue

        plt.scatter(self.label_indices, labels[n, :, label_col_index],
                    edgecolors='k', label='Labels', c='#2ca02c', s=64)
        if model is not None:
            predictions = model(inputs)
            plt.scatter(self.label_indices, predictions[n, :, label_col_index],
                        marker='X', edgecolors='k', label='Predictions',
                        c='#ff7f0e', s=64)

        if n == 0:
            plt.legend()

    plt.xlabel('Time [h]')

# Creating an example plot
w1.example = example_inputs, example_labels
WindowGenerator.plot = plot
w1.plot()

Figure: Visualized time-series data

4. Creating a tf.data.Dataset

The last step is to build a helper function that creates a tf.data.Dataset from a pandas DataFrame. The tf.data.Dataset will be useful later in this course when building time-series forecasting models.

def make_dataset(self, data):
  data = np.array(data, dtype=np.float32)
  ds = tf.keras.preprocessing.timeseries_dataset_from_array(
      data=data,
      targets=None,
      sequence_length=self.total_window_size,
      sequence_stride=1,
      shuffle=True,
      batch_size=32,)

  ds = ds.map(self.split_window)

  return ds

WindowGenerator.make_dataset = make_dataset
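To get a feel for what make_dataset produces, the sketch below emulates the windowing and batching with NumPy only. It assumes that a stride of 1 yields one window per valid start position, and it leaves out the shuffling that shuffle=True performs. The helper name sliding_windows and the toy data are hypothetical; this is not how TensorFlow implements the function.

```python
import numpy as np

def sliding_windows(data, sequence_length, batch_size):
    """Emulate stride-1 windowing and batching (no shuffling) with NumPy only."""
    data = np.asarray(data, dtype=np.float32)
    num_windows = len(data) - sequence_length + 1
    windows = np.stack([data[i:i + sequence_length] for i in range(num_windows)])
    # Split the windows into consecutive batches; the last one may be smaller.
    return [windows[i:i + batch_size] for i in range(0, num_windows, batch_size)]

# Hypothetical data: 40 time steps, 2 features.
toy = np.arange(80).reshape(40, 2)
batches = sliding_windows(toy, sequence_length=7, batch_size=32)
print(len(batches))                # 2
print([b.shape for b in batches])  # [(32, 7, 2), (2, 7, 2)]
```

Each batch has the shape (batch, time, features), which is the layout split_window expects.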

@property
def train(self):
  return self.make_dataset(self.train_df)

@property
def val(self):
  return self.make_dataset(self.val_df)

@property
def test(self):
  return self.make_dataset(self.test_df)

@property
def example(self):
  """Get and cache an example batch of `inputs, labels` for plotting."""
  result = getattr(self, '_example', None)
  if result is None:
    # No example batch was found, so get one from the `.train` dataset
    result = next(iter(self.train))
    # And cache it for next time
    self._example = result
  return result

WindowGenerator.train = train
WindowGenerator.val = val
WindowGenerator.test = test
WindowGenerator.example = example
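The two techniques used above, attaching a property to a class after it has been defined and caching a value on first access, can be seen in isolation in the following sketch. The Loader class and its make_batch method are hypothetical stand-ins for WindowGenerator and its train dataset.

```python
class Loader:
    """Hypothetical stand-in for WindowGenerator with a costly batch step."""
    def __init__(self):
        self.calls = 0

    def make_batch(self):
        # Count how often the costly step actually runs.
        self.calls += 1
        return [1, 2, 3]

@property
def example(self):
    # Reuse the cached batch if one exists; otherwise build and store it.
    result = getattr(self, '_example', None)
    if result is None:
        result = self.make_batch()
        self._example = result
    return result

# Attach the property after the class has been defined.
Loader.example = example

loader = Loader()
first = loader.example
second = loader.example
print(loader.calls)     # 1
print(first is second)  # True
```

The second access returns the cached object, so the costly step runs only once per instance.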

Now the WindowGenerator object gives us access to the tf.data.Dataset objects, so we can easily iterate over the data.

w1.train.element_spec
(TensorSpec(shape=(None, 6, 19), dtype=tf.float32, name=None), 
TensorSpec(shape=(None, 1, 1), dtype=tf.float32, name=None))

Iterating over a Dataset yields concrete batches:

for example_inputs, example_labels in w1.train.take(1):
  print(f'Inputs shape (batch, time, features): {example_inputs.shape}')
  print(f'Labels shape (batch, time, features): {example_labels.shape}')
Inputs shape (batch, time, features): (32, 6, 19) 
Labels shape (batch, time, features): (32, 1, 1)
