Multi dimensional input multi dimensional output rnn keras data preprocessing - python

I want to create an RNN model in Keras. At each time step the input has 9 elements and the output has 4 elements.
input_size = (304414,9)
target_size = (304414,4)
How can I create a dataset of sliding windows over the time series?

You can use the following code, choosing a window size and a stride (input_data is the array of shape (304414, 9)):
windows = []
for idx in range(0, input_data.shape[0] - window_size - 1, stride):
    windows.append(input_data[idx + 1: idx + 1 + window_size, :])
windows = np.reshape(windows, (len(windows), windows[0].shape[0], windows[0].shape[1]))
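The question also has a target array, so each window usually needs a matching target. The following is a minimal sketch, not the answerer's code: the helper name make_windows is hypothetical, and pairing each window with the target at its last time step is an assumption (for sequence-to-sequence training you would take targets[start:end] instead).

```python
import numpy as np

def make_windows(inputs, targets, window_size, stride=1):
    """Pair each input window with the target at the window's last time step."""
    xs, ys = [], []
    for start in range(0, len(inputs) - window_size + 1, stride):
        end = start + window_size
        xs.append(inputs[start:end])      # shape (window_size, n_features)
        ys.append(targets[end - 1])       # assumption: predict the last step
    return np.stack(xs), np.stack(ys)

# Small stand-in for the (304414, 9) inputs and (304414, 4) targets.
inputs = np.random.rand(1000, 9)
targets = np.random.rand(1000, 4)
X, y = make_windows(inputs, targets, window_size=20, stride=5)
print(X.shape, y.shape)  # (197, 20, 9) (197, 4)
```

X already has the (samples, timesteps, features) layout that Keras recurrent layers expect, so no further reshape should be needed.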

Related

Problem with dimensionality in Keras RNN - reshape isn't working?

Let's consider this random dataset on which I want to perform RNN:
import random
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense, SimpleRNN
from keras.optimizers import SGD
import numpy as np
df_train = random.sample(range(1, 100), 50)
I want to apply RNN with lag equal to 1. I'll use my own function:
def create_dataset(dataset, lags):
    dataX, dataY = [], []
    for i in range(lags):
        subdata = dataset[i:len(dataset) - lags + i]
        dataX.append(subdata)
    dataY.append(dataset[lags:len(dataset)])
    return np.array(dataX), np.array(dataY)
This narrows the data according to the number of lags. It returns two NumPy arrays: the first holds the independent variables and the second the dependent variable.
x_train, y_train = create_dataset(df_train, lags = 1)
But when I try to fit the model:
model = Sequential()
model.add(SimpleRNN(1, input_shape=(1, 1)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer=SGD(lr = 0.1))
history = model.fit(x_train, y_train, epochs=1000, batch_size=50, validation_split=0.2)
I obtain error:
ValueError: Error when checking input: expected simple_rnn_18_input to have 3 dimensions, but got array with shape (1, 49)
I've read about it and the solution is just to apply reshape:
x_train = np.reshape(x_train, (x_train.shape[0], 1, x_train.shape[1]))
but when I apply it I obtain error:
ValueError: Error when checking input: expected simple_rnn_19_input to have shape (1, 1) but got array with shape (1, 49)
and I'm not sure where the mistake is. Could you please tell me what I'm doing wrong?
What you call lags is known as look back in the literature. This technique feeds the RNN more contextual data so that it can learn mid- and long-range dependencies.
The error is telling you that you are feeding the layer (which expects shape 1x1) with a dataset of shape 1x49.
There are two reasons behind the error:
The first is your create_dataset, which builds a stack of 1x(50 - lags) = 1x49 vectors, the opposite of what you want: 1x(lags) = 1x1 vectors.
In particular, this line is responsible:
subdata = dataset[i:len(dataset) - lags + i]
# with lags = 1 you have just one
# iteration in range(1): i = 0
subdata = dataset[0:50 - 1 + 0]
subdata = dataset[0:49] # which is a 1x49 vector
# In order to create the right vector
# you need to change your function:
def create_dataset(dataset, lags = 1):
    dataX, dataY = [], []
    # iterate to a max of (50 - lags - 1) times
    # because we need "lags" element in each vector
    for i in range(len(dataset) - lags - 1):
        # get "lags" elements from the dataset
        subdata = dataset[i:i + lags]
        dataX.append(subdata)
        # get only the last label representing
        # the current element iteration
        dataY.append(dataset[i + lags])
    return np.array(dataX), np.array(dataY)
Second, if you use look back in your RNN you also need to increase the input dimensions, because you are also looking at preceding samples.
The network is indeed looking at more data than just one sample, because it needs to "look back" at more samples to understand mid- and long-range dependencies.
This is more conceptual than practical here; your code is fine because lags = 1:
model.add(SimpleRNN(1, input_shape=(1, 1)))
# you should use lags in the input shape
model.add(SimpleRNN(1, input_shape=(1, LAGS)))
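To make the shapes concrete, here is a small sketch that runs the corrected create_dataset with a larger look back and reshapes the result into the 3D (samples, timesteps, features) array that Keras recurrent layers expect. LAGS = 3 is an arbitrary value chosen for illustration, and the reshape follows the input_shape=(1, LAGS) convention used above; this only checks shapes and is not a full training script.

```python
import random
import numpy as np

def create_dataset(dataset, lags=1):
    data_x, data_y = [], []
    for i in range(len(dataset) - lags - 1):
        data_x.append(dataset[i:i + lags])   # "lags" consecutive values
        data_y.append(dataset[i + lags])     # the value that follows them
    return np.array(data_x), np.array(data_y)

LAGS = 3  # illustrative value, not from the question
series = random.sample(range(1, 100), 50)
x_train, y_train = create_dataset(series, lags=LAGS)
print(x_train.shape, y_train.shape)  # (46, 3) (46,)

# One time step with LAGS features, matching input_shape=(1, LAGS).
x_train = np.reshape(x_train, (x_train.shape[0], 1, x_train.shape[1]))
print(x_train.shape)  # (46, 1, 3)
```

An alternative layout is (samples, LAGS, 1) with input_shape=(LAGS, 1), which treats each lagged value as its own time step; which one suits the problem better is a modelling choice.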

handwriting text recognition (CNN + LSTM + CTC) RNN explanation required

I am trying to understand the following code, which is written in Python and TensorFlow. I'm trying to implement handwriting text recognition. I am referring to the following code here
I don't understand why the RNN output is put through "atrous_conv2d".
This is the architecture of my model: it takes the CNN output, passes it through this RNN, and then passes the result to CTC.
def build_RNN(self, rnnIn4d):
    rnnIn3d = tf.squeeze(rnnIn4d, axis=[2]) # squeeze remove 1 dimensions, here it removes the 2nd index
    n_hidden = 256
    n_layers = 2
    cells = []
    for _ in range(n_layers):
        cells.append(tf.nn.rnn_cell.LSTMCell(num_units=n_hidden))
    stacked = tf.nn.rnn_cell.MultiRNNCell(cells) # combine the 2 LSTMCell created
    # BxTxF -> BxTx2H
    ((fw, bw), _) = tf.nn.bidirectional_dynamic_rnn(cell_fw=stacked, cell_bw=stacked, inputs=rnnIn3d,
                                                    dtype=rnnIn3d.dtype)
    # BxTxH + BxTxH -> BxTx2H -> BxTx1X2H
    concat = tf.expand_dims(tf.concat([fw, bw], 2), 2)
    # project output to chars (including blank): BxTx1x2H -> BxTx1xC -> BxTxC
    kernel = tf.Variable(tf.truncated_normal([1, 1, n_hidden * 2, len(self.char_list) + 1], stddev=0.1))
    rnn = tf.nn.atrous_conv2d(value=concat, filters=kernel, rate=1, padding='SAME')
    return tf.squeeze(rnn, axis=[2])
The input to the CTC loss layer has the form B x T x C:
B - batch size
T - max length of the output (twice the max word length due to the blank char)
C - number of characters + 1 (blank char)
The input to atrous_conv2d has shape (B x T x 1 x 2H) == (batch, height, width, channels)
The filter we are using is (1, 1, 2H, C) == (height, width, input channels, output channels)
After the atrous convolution we get (B, T, 1, C); squeezing out the width axis gives B x T x C, the desired input for CTC.
Note: we take a transpose before we feed our image to the CNN since tf is row major.
An atrous convolution with rate 1 is the same as a normal conv layer.
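Because the filter is 1x1, the convolution amounts to applying the same 2H x C matrix at every time step, i.e. a per-time-step linear projection onto the character classes. A NumPy sketch of that equivalence follows; the sizes are toy values chosen for illustration, and the 1x1 convolution is written with einsum rather than a TensorFlow call.

```python
import numpy as np

B, T, H, C = 2, 7, 4, 5                      # toy sizes; C already includes the blank
rng = np.random.default_rng(0)
concat = rng.normal(size=(B, T, 1, 2 * H))   # BxTx1x2H, as after expand_dims
kernel = rng.normal(size=(1, 1, 2 * H, C))   # height, width, in channels, out channels

# 1x1 convolution: every (b, t, w) position is multiplied by the same matrix.
conv = np.einsum('btwi,io->btwo', concat, kernel[0, 0])   # BxTx1xC
logits = np.squeeze(conv, axis=2)                          # BxTxC

# The same projection written as a matrix product at every time step.
dense = concat[:, :, 0, :] @ kernel[0, 0]

print(logits.shape)                 # (2, 7, 5)
print(np.allclose(logits, dense))   # True
```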

How to create and execute a basic LSTM network in TensorFlow?

I want to create a basic LSTM network that accepts sequences of 5-dimensional vectors (for example, as N x 5 arrays) and returns the corresponding sequences of 4-dimensional hidden and cell vectors (N x 4 arrays), where N is the number of time steps.
How can I do it in TensorFlow?
ADDED
So far, I have the following code working:
num_units = 4
lstm = tf.nn.rnn_cell.LSTMCell(num_units = num_units)
timesteps = 18
num_input = 5
X = tf.placeholder("float", [None, timesteps, num_input])
x = tf.unstack(X, timesteps, 1)
outputs, states = tf.contrib.rnn.static_rnn(lstm, x, dtype=tf.float32)
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
x_val = np.random.normal(size = (12,18,5))
res = sess.run(outputs, feed_dict = {X:x_val})
sess.close()
However, there are many open questions:
Why is the number of time steps preset? Shouldn't an LSTM be able to accept sequences of arbitrary length?
Why do we split the data by time steps (using unstack)?
How to interpret the "outputs" and "states"?
Why is the number of time steps preset? Shouldn't an LSTM be able to accept sequences of arbitrary length?
If you want to accept sequences of arbitrary length, I recommend using dynamic_rnn. You can refer here to understand the difference between static_rnn and dynamic_rnn.
For example:
num_units = 4
lstm = tf.nn.rnn_cell.LSTMCell(num_units = num_units)
num_input = 5
X = tf.placeholder("float", [None, None, num_input])
outputs, states = tf.nn.dynamic_rnn(lstm, X, dtype=tf.float32)
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
x_val = np.random.normal(size = (12,18,5))
res = sess.run(outputs, feed_dict = {X:x_val})
x_val = np.random.normal(size = (12,16,5))
res = sess.run(outputs, feed_dict = {X:x_val})
sess.close()
dynamic_rnn requires all sequences in one batch to have the same length, but when you need arbitrary lengths within a batch you can pad the batch data and then specify each sequence's true length with the sequence_length parameter.
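The padding step can be sketched in NumPy as follows. The helper name pad_batch and the zero pad value are assumptions made for illustration; only the resulting arrays matter.

```python
import numpy as np

def pad_batch(sequences, pad_value=0.0):
    """Pad [length_i, features] arrays to a common length; also return true lengths."""
    lengths = np.array([len(s) for s in sequences], dtype=np.int32)
    max_len = int(lengths.max())
    num_features = sequences[0].shape[1]
    batch = np.full((len(sequences), max_len, num_features), pad_value, dtype=np.float32)
    for i, seq in enumerate(sequences):
        batch[i, :len(seq), :] = seq          # real data first, padding after
    return batch, lengths

rng = np.random.default_rng(0)
sequences = [rng.normal(size=(n, 5)) for n in (18, 16, 7)]
x_val, seq_len = pad_batch(sequences)
print(x_val.shape, seq_len.tolist())  # (3, 18, 5) [18, 16, 7]
```

x_val would then be fed for X, and seq_len passed as the sequence_length argument of tf.nn.dynamic_rnn so that the padded steps are ignored.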
Why do we split data by time-steps (using unstack)?
Only static_rnn needs the data split with unstack; this comes from their different input requirements. The input to static_rnn is a list of length timesteps whose elements are 2D tensors of shape [batch_size, features]. The input to dynamic_rnn is a single tensor of shape either [timesteps, batch_size, features] or [batch_size, timesteps, features], depending on whether time_major is True or False.
How to interpret the "outputs" and "states"?
The states of an LSTMCell have shape [2, batch_size, num_units]: one [batch_size, num_units] tensor represents C and the other [batch_size, num_units] tensor represents h. You can see pictures below.
In the same way, the states of a GRUCell have shape [batch_size, num_units].
outputs represents the output of each time step, so by default (time_major=False) its shape is [batch_size, timesteps, num_units]. You can easily conclude that
states[1][i, :] == outputs[i, -1, :] for every batch index i.
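The relation between outputs and states can be checked with a hand-written LSTM recurrence. This is a NumPy sketch with random weights, written for illustration only; it is not TensorFlow's implementation, and run_lstm is a hypothetical helper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def run_lstm(x, num_units, seed=0):
    """Run a randomly initialised LSTM over x of shape [batch, timesteps, features]."""
    rng = np.random.default_rng(seed)
    batch, timesteps, features = x.shape
    # One weight matrix holding the four gates (input, candidate, forget, output).
    W = rng.normal(scale=0.1, size=(features + num_units, 4 * num_units))
    b = np.zeros(4 * num_units)
    c = np.zeros((batch, num_units))   # cell state C
    h = np.zeros((batch, num_units))   # hidden state h
    outputs = []
    for t in range(timesteps):
        z = np.concatenate([x[:, t, :], h], axis=1) @ W + b
        i, g, f, o = np.split(z, 4, axis=1)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)              # the output at step t is the hidden state h
    return np.stack(outputs, axis=1), (c, h)

x_val = np.random.default_rng(1).normal(size=(12, 18, 5))
outputs, (c, h) = run_lstm(x_val, num_units=4)
print(outputs.shape, c.shape, h.shape)       # (12, 18, 4) (12, 4) (12, 4)
print(np.array_equal(h, outputs[:, -1, :]))  # True
```

outputs collects h at every time step, while the returned state keeps only the final C and h, which is why the final h equals the last slice of outputs.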

Iterate over a tensor dimension in Tensorflow

I am trying to develop a seq2seq model from a low-level perspective (creating all the needed tensors myself). I am trying to feed the model a sequence of vectors as a two-dimensional tensor; however, I can't iterate over one dimension of the tensor to extract it vector by vector. Does anyone know how I could feed a batch of vectors and later get them one by one?
This is my code:
batch_size = 100
hidden_dim = 5
input_dim = embedding_dim
time_size = 5
input_sentence = tf.placeholder(dtype=tf.float64, shape=[embedding_dim,None], name='input')
output_sentence = tf.placeholder(dtype=tf.float64, shape=[embedding_dim,None], name='output')
input_array = np.asarray(input_sentence)
output_array = np.asarray(output_sentence)
gru_layer1 = GRU(input_array, input_dim, hidden_dim) #This is a class created by myself
for i in range(input_array.shape[-1]):
    word = input_array[:,i]
    previous_state = gru_encoder.h_t
    gru_layer1.forward_pass(previous_state,word)
And this is the error that I get
TypeError: Expected binary or unicode string, got <tf.Tensor 'input_7:0' shape=(10, ?) dtype=float64>
TensorFlow uses deferred execution: a placeholder is a symbolic tensor, not an array of values, so it cannot be converted with np.asarray.
You usually can't know in advance how big the vector will be (words in a sentence, audio samples, etc.). The common approach is to cap it at some reasonably large value and then pad the shorter sequences with an empty token.
Once you do this you can select the data for a time slice with the slice operator:
data = tf.placeholder(tf.float32, shape=(batch_size, max_size, number_of_inputs))
....
for i in range(max_size):
    time_data = data[:, i, :]
    DoStuff(time_data)
Also look up tf.transpose for swapping the batch and time indices. It can help with performance in certain cases.
Alternatively, consider something like tf.nn.static_rnn or tf.nn.dynamic_rnn to handle the boilerplate for you.
Finally I found an approach that solves my problem. It worked using tf.scan() instead of a loop, which doesn't require the input tensor to have a defined size in the second dimension. Consequently, you have to prepare the input tensor beforehand so that tf.scan() parses it the way you want. In my case this is the code:
batch_size = 100
hidden_dim = 5
input_dim = embedding_dim
time_size = 5
input_sentence = tf.placeholder(dtype=tf.float64, shape=[embedding_dim,None], name='input')
output_sentence = tf.placeholder(dtype=tf.float64, shape=[embedding_dim,None], name='output')
input_array = np.asarray(input_sentence)
output_array = np.asarray(output_sentence)
x_t = tf.transpose(input_array, [1, 0], name='x_t')
h_0 = tf.convert_to_tensor(h_0, dtype=tf.float64)
h_t_transposed = tf.scan(forward_pass, x_t, h_0, name='h_t_transposed')
h_t = tf.transpose(h_t_transposed, [1, 0], name='h_t')
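To see what the scan is doing, here is a NumPy analogue: a function is applied along the first axis while a state is carried forward, and every intermediate state is stacked. The scan helper below only mimics the fn(accumulator, element) calling convention, and forward_pass is a hypothetical tanh recurrence standing in for the GRU step, whose code is not shown in the post.

```python
import numpy as np

def scan(fn, elems, initializer):
    """Apply fn(state, x) along the first axis of elems and stack every state."""
    state = initializer
    states = []
    for x in elems:                  # iterates over axis 0
        state = fn(state, x)
        states.append(state)
    return np.stack(states)

embedding_dim, hidden_dim, num_words = 10, 5, 7   # illustrative sizes
rng = np.random.default_rng(0)
W_x = rng.normal(scale=0.1, size=(embedding_dim, hidden_dim))
W_h = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))

def forward_pass(h_prev, x_t):
    # Hypothetical stand-in for the GRU step: a plain tanh recurrence.
    return np.tanh(x_t @ W_x + h_prev @ W_h)

input_sentence = rng.normal(size=(embedding_dim, num_words))  # [embedding_dim, time]
x_t = input_sentence.T                                        # [time, embedding_dim]
h_0 = np.zeros(hidden_dim)
h_t_transposed = scan(forward_pass, x_t, h_0)                 # [time, hidden_dim]
h_t = h_t_transposed.T                                        # [hidden_dim, time]
print(h_t.shape)  # (5, 7)
```

The transpose before the scan is what puts time on the first axis, which is the axis a scan walks over.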

Understanding dimension of input to pre-defined LSTM

I am trying to design a model in TensorFlow that predicts the next word using an LSTM.
The TensorFlow tutorial for RNNs gives pseudocode showing how to use an LSTM for the PTB dataset.
I have reached the step of generating batches and labels.
def generate_batches(raw_data, batch_size):
    global data_index
    data_len = len(raw_data)
    num_batches = data_len // batch_size
    #batch = dict.fromkeys([i for i in range(num_batches)])
    #labels = dict.fromkeys([i for i in range(num_batches)])
    batch = np.ndarray(shape=(batch_size), dtype=np.float64)
    labels = np.ndarray(shape=(batch_size, 1), dtype=np.float64)
    for i in range(batch_size):
        batch[i] = raw_data[i + data_index]
        labels[i, 0] = raw_data[i + data_index + 1]
        data_index = (data_index + 1) % len(raw_data)
    return batch, labels
This code gives a batch and labels of size (batch_size x 1).
The batch and labels can also have size (batch_size x vocabulary_size) using tf.nn.embedding_lookup().
So the problem here is how to proceed next, using either the function rnn_cell.BasicLSTMCell or a user-defined LSTM model. What will the input dimension to the LSTM cell be, and how is it used with num_steps?
Which size of batch and labels is useful in any scenario?
The full example for PTB is in the source code. There are recommended defaults (SmallConfig, MediumConfig, and LargeConfig) that you can use.
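As a rough illustration of how num_steps enters the picture, the sketch below splits a list of word ids into batch_size parallel streams and cuts each into windows of num_steps, with the labels shifted one position ahead. This is written in the spirit of a language-model data reader and is an assumption about the layout, not a copy of the tutorial code; iterate_batches is a hypothetical name.

```python
import numpy as np

def iterate_batches(raw_data, batch_size, num_steps):
    """Yield (inputs, labels) pairs of word ids, each of shape [batch_size, num_steps]."""
    raw_data = np.asarray(raw_data, dtype=np.int32)
    batch_len = len(raw_data) // batch_size
    # Lay the corpus out as batch_size parallel streams of length batch_len.
    data = raw_data[:batch_size * batch_len].reshape(batch_size, batch_len)
    epoch_size = (batch_len - 1) // num_steps
    for i in range(epoch_size):
        x = data[:, i * num_steps:(i + 1) * num_steps]
        y = data[:, i * num_steps + 1:(i + 1) * num_steps + 1]   # next-word labels
        yield x, y

raw_data = list(range(100))     # stand-in for a list of word ids
batches = list(iterate_batches(raw_data, batch_size=4, num_steps=5))
x, y = batches[0]
print(len(batches), x.shape, y.shape)  # 4 (4, 5) (4, 5)
```

After an embedding lookup on x, the tensor fed to the LSTM would have shape [batch_size, num_steps, embedding_size], and the cell is unrolled over the num_steps axis.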
