I am using Keras with the TensorFlow backend, via the functional API. What I want to achieve is the following architecture at some layer.
Tensor of (20, 200) ----> LSTM ----> split into two tensors of size (20, 100) each
Then use those two tensors as two branches of a further network. (We can think of this as being the opposite of a Merge operation)
I am given to understand that the only way to achieve this is currently using the Lambda layer since there is no "Split" functionality in Keras.
However, looking at the documentation for the Lambda layer, it seems the output_shape argument is only relevant when using the Theano backend.
Can anyone offer any advice on how to achieve this? This is rough pseudo-code of what I want to achieve:
#Other code before this
lstm_1st_layer = LSTM(nos_hidden_neurons, return_sequences=True)(lstm_0th_layer)
#this layer outputs a tensor of size (20, 200)
#Split it into two tensors of size (20,100) each (call them left and right)
left_lstm = LSTM(200, return_sequences=True)(left)
right_lstm = LSTM(200, return_sequences=True)(right)
#Other code after this
In your place I would simply use two LSTM layers with half the number of units each.
Then you get the two outputs ready to go:
left = LSTM(half_nos_hidden_neurons,.....)(lstm_0th_layer)
right = LSTM(half_nos_hidden_neurons,.....)(lstm_0th_layer)
The effect is the same.
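That said, if you do want a literal split, a minimal sketch with Lambda layers should work (assuming the LSTM output has shape (batch, 20, 200), the split runs along the feature axis, and the layer sizes below are just placeholders):

from tensorflow.keras.layers import Input, LSTM, Lambda

inp = Input(shape=(20, 200))
seq = LSTM(200, return_sequences=True)(inp)  # (batch, 20, 200)

# split along the feature axis into two (batch, 20, 100) tensors
left = Lambda(lambda t: t[:, :, :100])(seq)
right = Lambda(lambda t: t[:, :, 100:])(seq)

left_lstm = LSTM(100, return_sequences=True)(left)
right_lstm = LSTM(100, return_sequences=True)(right)

With the TensorFlow backend the output shapes of the Lambda layers are inferred, so output_shape does not need to be passed explicitly.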
For example, say I wanted to apply a 1D convolution to FFT data and raw time-series data in the first layer (suppose the first 400 nodes, as an example), but use a simple feed-forward network on some 1D statistical features in the remaining, say, 20 nodes. Then combine those outputs in the next layer.
I'm mostly used to just adding a layer which is able to interact with any node in the previous layer, which is why I'm confused here. Any help is appreciated, thanks.
edit:
One thing I forgot to add to the answer below: segments of the feature vector can be selected by taking a tf.slice of the input data, or by using slicing notation similar to numpy arrays, i.e. Data = Input(shape=...) and then first_10 = Data[:, :10] (slicing along the feature axis rather than the batch axis).
This seems easiest with the functional API. You should be able to do something like
from tensorflow.keras.layers import Input, Conv1D, Dense, Concatenate
from tensorflow.keras.models import Model

def model(input_shape):
    Data = Input(shape=input_shape, name="Input")
    # placeholder hyperparameters; padding="same" keeps the time axis aligned for concatenation
    ConvOutput = Conv1D(filters=32, kernel_size=3, padding="same")(Data)
    SimpleFeatures = Dense(16)(Data)
    Combined = Concatenate()([ConvOutput, SimpleFeatures])
    return Model(inputs=Data, outputs=Combined)
to control the inputs and outputs of specific layers, as well as to combine the results of multiple different nodes.
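For the exact split described in the question (a hypothetical 420-value input whose first 400 entries feed the convolution and whose last 20 feed the dense branch; all layer sizes are placeholders), a rough sketch using Lambda slicing might look like:

import tensorflow as tf

def split_model():
    # hypothetical input: 400 raw/FFT values followed by 20 statistical features
    data = tf.keras.Input(shape=(420,), name="Input")

    raw = tf.keras.layers.Lambda(lambda t: t[:, :400])(data)
    stats = tf.keras.layers.Lambda(lambda t: t[:, 400:])(data)

    # Conv1D expects a channel axis, so reshape the raw part to (400, 1)
    raw = tf.keras.layers.Reshape((400, 1))(raw)
    conv_out = tf.keras.layers.Conv1D(filters=32, kernel_size=3, padding="same")(raw)
    conv_out = tf.keras.layers.GlobalAveragePooling1D()(conv_out)

    stat_out = tf.keras.layers.Dense(16, activation="relu")(stats)

    combined = tf.keras.layers.Concatenate()([conv_out, stat_out])
    return tf.keras.Model(inputs=data, outputs=combined)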
To solve the issue that I've posted here: Adjust the output of a CNN as an input for TimeDistributed tensorflow layer, which is about the input data format of the TimeDistributed TensorFlow layer, I thought of another idea: instead of passing two inputs to a CNN model, what if, before designing the CNN model, I merge the two inputs into one input using pandas or numpy, pass that to the CNN model, and then, after the input layer and before the convolution layer, add a customized layer that separates the features I concatenated? Is this possible? The following picture explains more of what I am talking about:
Thank you @Marco for the help. Exactly as Marco says, I separated the input using index slicing, done with a Lambda layer. This is the code:
# the input is a 4D tensor; the Lambdas below split it along axis 1
input_layer1 = tf.keras.Input(shape=input_shape)
# everything except the last entry along axis 1
separate_features1 = tf.keras.layers.Lambda(lambda x: x[:, :-1, :, :])(input_layer1)
# only the last entry along axis 1
separate_features2 = tf.keras.layers.Lambda(lambda x: x[:, -1:, :, :])(input_layer1)
This is the model architecture:
I am currently trying to understand the architecture of Inception v3 as implemented in tf.keras.applications.InceptionV3.
I am looking at the list of names in the model's layers:
print([layer.name for layer in model.layers])
#Outputs:
['input_1',
'conv2d',
'batch_normalization',
'activation',
'conv2d_1',
'batch_normalization_1',
'activation_1',
'conv2d_2',
...
]
I understand how batch normalization, pooling and conv layers transform inputs, but deeper in the model there are layers named mixed1, mixed2, and so on. I am trying to understand how these mixed layers transform their inputs.
So far, I couldn't find any information about them.
How does a mixed layer work? What does it do?
Refer to the InceptionV3 paper.
You can see that the mixed layers are made of four parallel branches with a single input, and the output is obtained by concatenating all the parallel outputs into one. Note that to concatenate all the outputs, the parallel feature maps must have identical spatial dimensions (the number of feature maps can differ), and this is achieved by strides and pooling.
mixed1, mixed2, ... are layers of type tf.keras.layers.Concatenate.
You can read more about these layers here :
https://keras.io/api/layers/merging_layers/concatenate/
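For illustration, a mixed-style block is just several parallel branches joined by a Concatenate layer. Here is a rough sketch with made-up filter counts (not the exact InceptionV3 configuration):

import tensorflow as tf
from tensorflow.keras import layers

def mini_inception_block(x):
    # four parallel branches; strides of 1 and "same" padding keep the spatial size identical
    b1 = layers.Conv2D(32, 1, padding="same", activation="relu")(x)

    b2 = layers.Conv2D(32, 1, padding="same", activation="relu")(x)
    b2 = layers.Conv2D(48, 3, padding="same", activation="relu")(b2)

    b3 = layers.Conv2D(32, 1, padding="same", activation="relu")(x)
    b3 = layers.Conv2D(48, 5, padding="same", activation="relu")(b3)

    b4 = layers.MaxPooling2D(3, strides=1, padding="same")(x)
    b4 = layers.Conv2D(32, 1, padding="same", activation="relu")(b4)

    # this concatenation along the channel axis is what appears as a "mixed" layer
    return layers.Concatenate(axis=-1)([b1, b2, b3, b4])

inputs = tf.keras.Input(shape=(35, 35, 192))
outputs = mini_inception_block(inputs)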
Let's assume I want to make the following layer in a neural network: instead of having a square convolutional filter that moves over some image, I want the shape of the filter to be some other shape, say a rectangle, circle, triangle, etc. (this is of course a silly example; the real case I have in mind is something different). How would I implement such a layer in TensorFlow?
I found that one can define custom layers in Keras by extending tf.keras.layers.Layer, but the documentation is quite limited and has few examples. A Python implementation of a convolutional layer, for example one extending tf.keras.layers.Layer, would probably help as well, but it seems that the convolutional layers are implemented in C. Does this mean that I have to implement my custom layer in C to get any reasonable speed, or would Python TensorFlow operations be enough?
Edit: Perhaps it is enough if I can just define a tensor of weights in which I can force certain entries to be identically zero and have some weights appear in multiple places; then I should be able to build a convolutional layer (and other layers) by hand. How would I do this, and how would I include these variables in training?
Edit2: Let me add some more clarification. Take the example of building a 5x5 convolutional layer with one output channel from scratch. If the input is, say, 10x10 (plus padding so the output is also 10x10), I would imagine doing this by creating a matrix of size 100x100. Then I would fill in the 25 weights at the correct locations in this matrix (so some entries are zero, and some entries are equal, i.e. each of the 25 weights shows up in many locations in the matrix). I then multiply the input with this matrix to get the output. So my question is twofold: 1. How do I do this in TensorFlow? 2. Would this be very inefficient, and is some other approach recommended (assuming that I later want to customize what this filter looks like, so the standard conv2d is not good enough)?
Edit3: It seems doable by using sparse tensors and assigning values via a previously defined tf.Variable. However, I don't know whether this approach will suffer from performance issues.
Just use regular conv. layers with square filters, and zero out some values after each weight update:
# TF1-style graph/session code
g = tf.get_default_graph()
# one training step
sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
# fetch the first conv layer's filters by name (the exact name depends on how the layer was built)
conv1_filter = g.get_tensor_by_name('conv1:0')
# zero out the masked entries so the filter keeps the desired shape
sess.run(tf.assign(conv1_filter, tf.multiply(conv1_filter, my_mask)))
where my_mask is a binary tensor (of the same shape and type as your filters) that matches the desired pattern.
EDIT: If you're not familiar with TensorFlow, you might get confused by the code above. I recommend looking at this example, and specifically at the way the model is constructed (if you do it like this, you can access the first layer's filters as 'conv1/weights'). Also, I recommend switching to PyTorch :)
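In current tf.keras the same after-each-update masking can be expressed with a kernel constraint. This is a hypothetical sketch (the cross-shaped mask is only an example of a non-square filter pattern):

import numpy as np
import tensorflow as tf

class FilterMask(tf.keras.constraints.Constraint):
    """Zero out kernel entries after every weight update so the filter keeps a custom shape."""
    def __init__(self, mask):
        self.mask = tf.constant(mask, dtype=tf.float32)

    def __call__(self, w):
        return w * self.mask  # the mask broadcasts over the input/output channel axes

# 5x5 kernel restricted to a plus/cross shape
mask = np.zeros((5, 5, 1, 1), dtype="float32")
mask[2, :, 0, 0] = 1.0
mask[:, 2, 0, 0] = 1.0

layer = tf.keras.layers.Conv2D(16, 5, padding="same", kernel_constraint=FilterMask(mask))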
So I have a custom layer that does not have any weights.
In a first step, I tried to implement the functions manipulating the input tensors in Keras, but I did not succeed, for many reasons. My second approach was to implement the functions with numpy operations. Since the custom layer I am implementing does not have any weights, I would say that I could use numpy operations, as I don't need backpropagation when there are no weights, right? I would then just convert the output of my layer to a tensor with:
keras.backend.variable(value=output)
So the main idea is to implement a custom layer that takes tensors, converts them to numpy arrays, operates on them with numpy operations, and then converts the output back to a tensor.
The problem is that I seem not to be able to use .eval() in order to convert the input tensors of my layer into numpy arrays, so that they could be manipulated with numpy operations.
Can anybody tell me how I can get around this problem?
As mentioned by Daniel Möller in the comments, Keras needs to be able to backpropagate through your layer in order to calculate the gradients for the previous layers. Your layer needs to be differentiable for this reason.
For the same reason you can only use Keras operations, as those can be automatically differentiated with autograd. If your layer is something simple, have a look at the Lambda layer where you can implement custom layers quickly.
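For instance, a weightless per-sample standardisation layer (a made-up example) can be written entirely with backend operations inside a Lambda, which keeps it differentiable:

from keras import backend as K
from keras.layers import Lambda

# hypothetical weightless layer: standardise each sample using only backend ops
def standardize(x):
    mean = K.mean(x, axis=-1, keepdims=True)
    std = K.std(x, axis=-1, keepdims=True)
    return (x - mean) / (std + K.epsilon())

custom_layer = Lambda(standardize)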
As an aside, Keras backend functions should cover a lot of use cases, so if you're stuck writing your layer with those, you might want to post another question here.
Hope this helps.