Multiple Layer hidden layer in LSTM in Keras - python

x = Input(shape=(timesteps, input_dim,))
# LSTM encoding
h = LSTM(2048)(x)
This is few lines of code from the file I downloaded from the internet. I think h holds for single layer LSTM layer with 2048 units. How can it make it multi layer i.e 2 hidden layers.

Just add another layer (lets call it g)! But since we are passing to another LSTM layer, we're going to have to add return_sequences keyword parameter to the first layer so that we can get the right input shape to the second layer.
x = Input(shape=(timesteps, input_dim,))
# LSTM encoding
h = LSTM(2048, return_sequences=true)(x)
g = LSTM(10)(h)

Related

How do I set the initial state of a keras.layers.RNN instance?

I have created a stacked keras decoder model using the following loop:
# Create the encoder
# Define an input sequence.
encoder_inputs = keras.layers.Input(shape=(None, num_input_features))
# Create a list of RNN Cells, these are then concatenated into a single layer with the RNN layer.
encoder_cells = []
for hidden_neurons in hparams['encoder_hidden_layers']:
encoder_cells.append(keras.layers.GRUCell(hidden_neurons,
kernel_regularizer=regulariser,
recurrent_regularizer=regulariser,
bias_regularizer=regulariser))
encoder = keras.layers.RNN(encoder_cells, return_state=True)
encoder_outputs_and_states = encoder(encoder_inputs)
# Discard encoder outputs and only keep the states. The outputs are of no interest to us, the encoder's job is to create
# a state describing the input sequence.
encoder_states = encoder_outputs_and_states[1:]
print(encoder_states)
if hparams['encoder_hidden_layers'][-1] != hparams['decoder_hidden_layers'][0]:
encoder_states = Dense(hparams['decoder_hidden_layers'][0])(encoder_states[-1])
# Create the decoder, the decoder input will be set to zero
decoder_inputs = keras.layers.Input(shape=(None, 1))
decoder_cells = []
for hidden_neurons in hparams['decoder_hidden_layers']:
decoder_cells.append(keras.layers.GRUCell(hidden_neurons,
kernel_regularizer=regulariser,
recurrent_regularizer=regulariser,
bias_regularizer=regulariser))
decoder = keras.layers.RNN(decoder_cells, return_sequences=True, return_state=True)
# Set the initial state of the decoder to be the output state of the encoder. his is the fundamental part of the
# encoder-decoder.
decoder_outputs_and_states = decoder(decoder_inputs, initial_state=encoder_states)
# Only select the output of the decoder (not the states)
decoder_outputs = decoder_outputs_and_states[0]
# Apply a dense layer with linear activation to set output to correct dimension and scale (tanh is default activation for
# GRU in Keras
decoder_dense = keras.layers.Dense(num_output_features,
activation='linear',
kernel_regularizer=regulariser,
bias_regularizer=regulariser)
decoder_outputs = decoder_dense(decoder_outputs)
model = keras.models.Model(inputs=[encoder_inputs, decoder_inputs], outputs=decoder_outputs)
model.compile(optimizer=optimiser, loss=loss)
model.summary()
This setup works when I have a single layer encoder and a single layer decoder where the number of neurons is the same. However it does not work when the number of layers of the decoder is more than one.
I get the following error message:
ValueError: An `initial_state` was passed that is not compatible with `cell.state_size`. Received `state_spec`=[InputSpec(shape=(None, 48), ndim=2)]; however `cell.state_size` is (48, 58)
My decoder_layers list contains the entries [48, 58]. Therefore my RNN layer that the decoder is comprised of, is a stacked GRU where the first GRU contains 48 neurons and the second contains 58. I would like to set the initial state of the first GRU. I run the states through a Dense layer so that the shape is compatible with the first layer of the decoder. The error message indicates that I am trying to set the initial state of both the first layer and the second layer when I pass the initial state keyword to the decoder RNN layer. Is this correct behaviour? Normally I would set the initial state of the first decoder layer (not built using a cell structure like this) which then would just feed it's inputs into subsequent layers. Is there a way to achieve such behaviour in keras by default when creating a keras.layers.RNN from a list of GRUCell of LSTMCells?
In my own experiments, your intial_states should have batch_size as its first dimension. In other words, each element in one batch may have a different initial state. From your code, I think you missed this dimension.

How to get featuremap of convolutional layer in keras

I have a model that I loaded using Keras. I need to be able to find individual feature maps (print values of each feature map). I was able to print weights. Following is my code:
for layer in model.layers:
g=layer.get_config()
h=layer.get_weights()
print g
print h
The model consists of one convlayer which has total 384 neurons. First 128 have filter size 3, next 4 and last 128 have filter size 5. Then, there are relu and maxpool layers and then it is fed into softmax layer. I want to be able to find outputs (values not shapes) of convlayer, relu and maxpool. I have seen codes online but I'm unable to comprehend on how to map them to my situation.
If you are looking for a way to find the activation (i.e. feature map or output) of a layer given one or more input samples, you can simply define a backend function that takes the input array(s) and gives the activation(s) as its output. Here is an example for illustration (i.e. you may need to adapt it to your needs and your model architecture):
from keras import backend as K
# define a function to get the activation of all layers
outputs = [layer.output for layer in model.layers]
active_func = K.function([model.input], [outputs])
# you can use it like this
activations = active_func([my_input_array])

How to use Bidirectional RNN and Conv1D in keras when shapes are not matching?

I am brand new to Deep-Learning so I'm reading though Deep Learning with Keras by Antonio Gulli and learning a lot. I want to start using some of the concepts. I want to try and implement a neural network with a 1-dimensional convolutional layer that feeds into a bidirectional recurrent layer (like the paper below). All the tutorials or code snippets I've encountered do not implement anything remotely similar to this (e.g. image recognition) or use an older version of keras with different functions and usage.
What I'm trying to do is a variation of this paper:
(1) convert DNA sequences to one-hot encoding vectors; ✓
(2) use a 1 dimensional convolutional neural network; ✓
(3) with max pooling; ✓
(4) send the output to a bidirectional RNN; ⓧ
(5) classify the input;
I cannot figure out how to get the shapes to match up on the Bidirectional RNN. I can't even get an ordinary RNN to work at this stage. How can I restructure the incoming layers to work with a Bidirectional RNN?
Note:
The original code came from https://github.com/uci-cbcl/DanQ/blob/master/DanQ_train.py but I simplified the output layer to just do binary classification. This processed was described (kind of) in https://github.com/fchollet/keras/issues/3322 but I cannot get it to work with the updated keras. The original code (and the 2nd link) work on a very large dataset so I am generating some fake data to illustrate the concept. They are also using an older version of keras where key functionality changes have been made since then.
# Imports
import tensorflow as tf
import numpy as np
from tensorflow.python.keras._impl.keras.layers.core import *
from tensorflow.python.keras._impl.keras.layers import Conv1D, MaxPooling1D, SimpleRNN, Bidirectional, Input
from tensorflow.python.keras._impl.keras.models import Model, Sequential
# Set up TensorFlow backend
K = tf.keras.backend
K.set_session(tf.Session())
np.random.seed(0) # For keras?
# Constants
NUMBER_OF_POSITIONS = 40
NUMBER_OF_CLASSES = 2
NUMBER_OF_SAMPLES_IN_EACH_CLASS = 25
# Generate sequences
https://pastebin.com/GvfLQte2
# Build model
# ===========
# Input Layer
input_layer = Input(shape=(NUMBER_OF_POSITIONS,4))
# Hidden Layers
y = Conv1D(100, 10, strides=1, activation="relu", )(input_layer)
y = MaxPooling1D(pool_size=5, strides=5)(y)
y = Flatten()(y)
y = Bidirectional(SimpleRNN(100, return_sequences = True, activation="tanh", ))(y)
y = Flatten()(y)
y = Dense(100, activation='relu')(y)
# Output layer
output_layer = Dense(NUMBER_OF_CLASSES, activation="softmax")(y)
model = Model(input_layer, output_layer)
model.compile(optimizer="adam", loss="categorical_crossentropy", )
model.summary()
# ~/anaconda/lib/python3.6/site-packages/tensorflow/python/keras/_impl/keras/layers/recurrent.py in build(self, input_shape)
# 1049 input_shape = tensor_shape.TensorShape(input_shape).as_list()
# 1050 batch_size = input_shape[0] if self.stateful else None
# -> 1051 self.input_dim = input_shape[2]
# 1052 self.input_spec[0] = InputSpec(shape=(batch_size, None, self.input_dim))
# 1053
# IndexError: list index out of range
You don't need to restructure anything at all to get the output of a Conv1D layer into an LSTM layer.
So, the problem is simply the presence of the Flatten layer, which destroys the shape.
These are the shapes used by Conv1D and LSTM:
Conv1D: (batch, length, channels)
LSTM: (batch, timeSteps, features)
Length is the same as timeSteps, and channels is the same as features.
Using the Bidirectional wrapper won't change a thing either. It will only duplicate your output features.
Classifying.
If you're going to classify the entire sequence as a whole, your last LSTM must use return_sequences=False. (Or you may use some flatten + dense instead after)
If you're going to classify each step of the sequence, all your LSTMs should have return_sequences=True. You should not flatten the data after them.

Keras - How to construct a shared Embedding() Layer for each Input-Neuron

I want to create a deep neural network in keras, where each element of the input layer is "encoded" using the same, shared Embedding()-layer, before it is fed into the deeper layers.
Each input would be a number that defines the type of an object, and the network should learn an embedding that encapsulates some internal representation of "what this object is".
So, if the input layer has X dimensions, and the embedding has Y dimensions, the first hidden layer should consist of X*Y neurons (each input neuron embedded).
Here is a little image that should show the network architecture that I would like to create, where each input-element is encoded using a 3D-Embedding
How can I do this?
from keras.layers import Input, Embedding
first_input = Input(shape = (your_shape_tuple) )
second_input = Input(shape = (your_shape_tuple) )
...
embedding_layer = Embedding(embedding_size)
first_input_encoded = embedding_layer(first_input)
second_input_encoded = embedding_layer(second_input)
...
Rest of the model....
The emnedding_layer will have shared weights. You can do this in form of lists of layers if you have a lot of inputs.
If what you want is transforming a tensor of inputs, the way to do it is :
from keras.layers import Input, Embedding
# If your inputs are all fed in one numpy array :
input_layer = Input(shape = (num_input_indices,) )
# the output of this layer will be a 2D tensor of shape (num_input_indices, embedding_size)
embedded_input = Embedding(embedding_size)(input_layer)
Is this what you were looking for?

Add bias to Lasagne neural network layers

I am wondering if there is a way to add bias node to each layer in Lasagne neural network toolkit? I have been trying to find related information in documentation.
This is the network I built but i don't know how to add a bias node to each layer.
def build_mlp(input_var=None):
# This creates an MLP of two hidden layers of 800 units each, followed by
# a softmax output layer of 10 units. It applies 20% dropout to the input
# data and 50% dropout to the hidden layers.
# Input layer, specifying the expected input shape of the network
# (unspecified batchsize, 1 channel, 28 rows and 28 columns) and
# linking it to the given Theano variable `input_var`, if any:
l_in = lasagne.layers.InputLayer(shape=(None, 60),
input_var=input_var)
# Apply 20% dropout to the input data:
l_in_drop = lasagne.layers.DropoutLayer(l_in, p=0.2)
# Add a fully-connected layer of 800 units, using the linear rectifier, and
# initializing weights with Glorot's scheme (which is the default anyway):
l_hid1 = lasagne.layers.DenseLayer(
l_in_drop, num_units=800,
nonlinearity=lasagne.nonlinearities.rectify,
W=lasagne.init.Uniform())
# We'll now add dropout of 50%:
l_hid1_drop = lasagne.layers.DropoutLayer(l_hid1, p=0.5)
# Another 800-unit layer:
l_hid2 = lasagne.layers.DenseLayer(
l_hid1_drop, num_units=800,
nonlinearity=lasagne.nonlinearities.rectify)
# 50% dropout again:
l_hid2_drop = lasagne.layers.DropoutLayer(l_hid2, p=0.5)
# Finally, we'll add the fully-connected output layer, of 10 softmax units:
l_out = lasagne.layers.DenseLayer(
l_hid2_drop, num_units=2,
nonlinearity=lasagne.nonlinearities.softmax)
# Each layer is linked to its incoming layer(s), so we only need to pass
# the output layer to give access to a network in Lasagne:
return l_out
Actually you don't have to explicitly create biases, because DenseLayer(), and convolution base layers too, has a default keyword argument:
b=lasagne.init.Constant(0.).
Thus you can avoid creating bias, if you don't want to have with explicitly pass bias=None, but this is not that case.
Thus in brief you do have bias parameters while you don't pass None to bias parameter e.g.:
hidden = Denselayer(...bias=None)

Categories