I am currently trying to understand the architecture of Inception v3 as implemented in tf.keras.applications.InceptionV3.
I am looking at the list of names in the model's layers:
print([layer.name for layer in model.layers])
#Outputs:
['input_1',
'conv2d',
'batch_normalization',
'activation',
'conv2d_1',
'batch_normalization_1',
'activation_1',
'conv2d_2',
...
]
I understand how batch normalization, pooling and conv layers transform their inputs, but deeper in the model there are layers named mixed1, mixed2, and so on. I am trying to understand how these mixed layers transform their inputs.
So far, I couldn't find any information about them.
How does a mixed layer work? What does it do?
Refer to the InceptionV3 paper.
You can see that the mixed layers are made of four parallel branches with a single input, and the output is obtained by concatenating all parallel outputs into one. Note that for the concatenation to work, all parallel feature maps must have identical spatial dimensions (the number of feature maps can differ), and this is achieved via strides and pooling.
mixed1, mixed2, ... are layers of type tf.keras.layers.Concatenate.
You can read more about these layers here:
https://keras.io/api/layers/merging_layers/concatenate/
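For intuition, here is a minimal sketch of one such block in the functional API. The branch widths are illustrative (loosely following the first Inception block); the real model builds each branch with conv + batch norm + ReLU helpers:

import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(35, 35, 192))

# Four parallel branches, all preserving the 35x35 spatial size
branch1x1 = layers.Conv2D(64, 1, padding='same', activation='relu')(inputs)

branch5x5 = layers.Conv2D(48, 1, padding='same', activation='relu')(inputs)
branch5x5 = layers.Conv2D(64, 5, padding='same', activation='relu')(branch5x5)

branch3x3 = layers.Conv2D(64, 1, padding='same', activation='relu')(inputs)
branch3x3 = layers.Conv2D(96, 3, padding='same', activation='relu')(branch3x3)

branch_pool = layers.AveragePooling2D(3, strides=1, padding='same')(inputs)
branch_pool = layers.Conv2D(32, 1, padding='same', activation='relu')(branch_pool)

# The "mixed" layer itself: concatenation along the channel axis
mixed = layers.Concatenate(axis=-1)([branch1x1, branch5x5, branch3x3, branch_pool])
# spatial dims match (35x35); channels simply add up: 64 + 64 + 96 + 32 = 256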
I'm trying to apply a separate convolution to each layer of a 3-dimensional array, which brought me to the Keras TimeDistributed layer. But the documentation notes that:
"Because TimeDistributed applies the same instance of Conv2D to each of the
timestamps, the same set of weights are used at each timestamp."
However, I want to perform a separate convolution (with independently defined weights / filters) for each layer of the array, not the same set of weights. Is there some built-in way to do this? Any help is appreciated!
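There is no dedicated built-in layer for this as far as I know, but one workaround is to slice the stacked axis and give each slice its own Conv2D. A sketch with made-up shapes (8 slices of 32x32x1):

import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(8, 32, 32, 1))  # 8 slices, each 32x32x1

branches = []
for i in range(8):
    # pick slice i -> (batch, 32, 32, 1); each iteration creates its own Conv2D,
    # so every slice gets an independent set of filters
    s = layers.Lambda(lambda t, i=i: t[:, i])(inputs)
    branches.append(layers.Conv2D(4, 3, padding='same', activation='relu')(s))

# re-stack the per-slice outputs on axis 1 -> (batch, 8, 32, 32, 4)
outputs = layers.Lambda(lambda ts: tf.stack(ts, axis=1))(branches)
model = tf.keras.Model(inputs, outputs)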
I've been trying to understand LSTM inputs for a while now, and I think I understand, but I keep getting confused about how to implement them.
This is what I think, please correct me if I am wrong.
When specifying an LSTM, you specify the number of cells and the input shape (I've been having issues with the input shape). The number of cells specifies how many cells should look at the given data and does not affect the required input shape. The input shape (when stateful) goes by batch size, timesteps per batch, and features per timestep. A stateful LSTM retains its internal states until reset. Is this right?
If so, I'm having a lot of confusion trying to specify the input shape for my network, because I'm trying to upgrade an existing network and I can't figure out how and where to specify the input shape without an error.
Here is the upgrade I'm attempting. Initially I have a CNN feeding into a dense layer. I'm trying to change it so that an LSTM takes the CNN's flattened 1-D output as one batch with one timestep, with the number of features determined by the size of the CNN's output. The LSTM's output is then concatenated with the CNN's output (the LSTM's input) and fed into the dense layer, so the network behaves like an LSTM with a skip connection. What I can't seem to understand is when and how to specify the LSTM layer's input shape, since it has no input_shape parameter in the functional API. Or maybe I'm just confused; everyone uses a different API in different examples, and it gets confusing what is and isn't specified and how.
Thank you, even if you just help with one of the two parts.
TLDR:
Do I understand the LSTM parameters correctly?
How and when do I specify LSTM input shapes, if at all?
The LSTM units argument sets the dimensionality of the LSTM's internal matrices and therefore its output shape.
With the functional API you can specify the input shape only for the very first layer. If your LSTM layer follows a CNN, its input shape is determined automatically from the CNN's output.
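As a minimal sketch of the wiring described in the question (layer sizes are made up; the LSTM gets no input_shape because the functional API infers it from the tensor you call it on):

import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(64, 64, 3))
x = layers.Conv2D(16, 3, activation='relu')(inputs)
flat = layers.Flatten()(x)              # CNN output as a 1-D feature vector
seq = layers.Reshape((1, -1))(flat)     # one timestep, features = CNN output size
lstm_out = layers.LSTM(32)(seq)         # no input_shape argument needed here
merged = layers.Concatenate()([flat, lstm_out])  # skip connection around the LSTM
outputs = layers.Dense(10, activation='softmax')(merged)
model = tf.keras.Model(inputs, outputs)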
Let's assume I want to make the following layer in a neural network: instead of having a square convolutional filter that moves over some image, I want the filter to have some other shape, say a rectangle, circle, triangle, etc. (this is of course a silly example; the real case I have in mind is something different). How would I implement such a layer in TensorFlow?
I found that one can define custom layers in Keras by extending tf.keras.layers.Layer, but the documentation is quite limited and has few examples. A Python implementation of a convolutional layer, for example one extending tf.keras.layers.Layer, would probably help as well, but it seems that the convolutional layers are implemented in C. Does this mean that I have to implement my custom layer in C to get any reasonable speed, or would Python TensorFlow operations be enough?
Edit: Perhaps it is enough if I can just define a tensor of weights in which certain entries are identically zero and some weights appear in multiple places; then I should be able to build a convolutional layer (and other layers) by hand. How would I do this, and how do I include these variables in training?
Edit 2: Let me add some more clarification. Take the example of building a 5x5 convolutional layer with one output channel from scratch. If the input is, say, 10x10 (plus padding, so the output is also 10x10), I would imagine doing this by creating a 100x100 matrix and filling in the 25 weights at the correct locations (so some entries are zero, some entries are equal, i.e. each of the 25 weights appears in many locations in this matrix). I would then multiply the input by this matrix to get the output. So my question is twofold: 1. How do I do this in TensorFlow? 2. Would this be very inefficient, and is some other approach recommended (assuming that I later want to customize what the filter looks like, so the standard conv2d is not good enough)?
Edit 3: It seems doable using sparse tensors and assigning values via a previously defined tf.Variable. However, I don't know whether this approach will suffer from performance issues.
Just use regular conv. layers with square filters, and zero out some values after each weight update:
g = tf.get_default_graph()
# one regular training step
sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
# fetch the first conv layer's filter variable by name
conv1_filter = g.get_tensor_by_name('conv1/weights:0')
# zero out the masked entries (ideally build this assign op once, outside the loop)
sess.run(tf.assign(conv1_filter, tf.multiply(conv1_filter, my_mask)))
where my_mask is a binary tensor (of the same shape and type as your filters) that matches the desired pattern.
EDIT: if you're not familiar with TensorFlow, you might get confused by the code above. I recommend looking at this example, and specifically at the way the model is constructed (if you build it like that, you can access the first layer's filters as 'conv1/weights'). Also, I recommend switching to PyTorch :)
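For completeness, in tf.keras the same masking idea can be expressed with a kernel constraint, which is re-applied after every weight update. This is a sketch, not part of the original answer; the cross-shaped mask is just an example:

import numpy as np
import tensorflow as tf

# Example pattern: a cross-shaped 5x5 filter; 1 keeps a weight, 0 disables it
pattern = np.zeros((5, 5), dtype=np.float32)
pattern[2, :] = 1.0
pattern[:, 2] = 1.0
mask = pattern[:, :, None, None]  # broadcasts over (in_channels, out_channels)

class MaskWeights(tf.keras.constraints.Constraint):
    """Multiply the kernel by a binary mask after each update."""
    def __init__(self, mask):
        self.mask = tf.constant(mask)
    def __call__(self, w):
        return w * self.mask

layer = tf.keras.layers.Conv2D(16, 5, padding='same',
                               kernel_constraint=MaskWeights(mask))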
I want to use TensorFlow Hub to generate features for my images, but it seems that the 2048 features of the Inception module are not enough for my problem, because the images of my classes are very similar. So I decided to use the features from a hidden layer of this module, for example:
"module/InceptionV3/InceptionV3/Mixed_7c/concat:0"
So how can I write a function that gives me these ?x8x8x2048 features from my input images?
Please try
module = hub.Module(...)  # As before.
outputs = module(dict(images=images),
                 signature="image_feature_vector",
                 as_dict=True)
print(outputs.items())
Besides the default output with the final feature vector, you should see a bunch of intermediate feature maps under keys starting with InceptionV3/ (or whichever other architecture you selected). These are 4-D tensors with shape [batch_size, feature_map_height, feature_map_width, num_features], so you may want to remove the middle dimensions by avg- or max-pooling over them before feeding the result into classification.
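Continuing the snippet above, pooling one of those maps into a flat vector could look like this (the exact key name is an assumption; print outputs.keys() to check what your module exposes):

mixed_7c = outputs["InceptionV3/Mixed_7c"]        # [batch, 8, 8, 2048] (assumed key)
features = tf.reduce_mean(mixed_7c, axis=[1, 2])  # global average pool -> [batch, 2048]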
I am using Keras with the TensorFlow backend, with functional layers. What I want to achieve is the following architecture at some layer:
Tensor of (20,200) ----> LSTM ----> split into two tensors of size (20,100) each
Then use those two tensors as two branches of a further network. (We can think of this as the opposite of a merge operation.)
I am given to understand that the only way to achieve this currently is using the Lambda layer, since there is no "Split" layer in Keras.
However, looking at the documentation for the Lambda layer, it seems the output_shape argument is only relevant when using the Theano backend.
Can anyone offer any advice on how to achieve this? This is rough pseudo-code of what I want to achieve:
# Other code before this
lstm_1st_layer = LSTM(nos_hidden_neurons, return_sequences=True)(lstm_0th_layer)
# this layer outputs a tensor of shape (20, 200)

# Split it into two tensors of shape (20, 100) each (call them left and right)
# left, right = ???   <-- this is the part I don't know how to do

left_lstm = LSTM(200, return_sequences=True)(left)
right_lstm = LSTM(200, return_sequences=True)(right)
# Other code after this
In your place I would simply use two LSTM layers with half the number of units each.
Then you get the two outputs ready to go:
left = LSTM(half_nos_hidden_neurons,.....)(lstm_0th_layer)
right = LSTM(half_nos_hidden_neurons,.....)(lstm_0th_layer)
The effect is the same.
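That said, if you really do need to split one tensor, slicing inside a Lambda layer is one way. A sketch reusing the names from the question's pseudo-code (with the TensorFlow backend the output shape can usually be inferred, so output_shape is optional):

from tensorflow.keras.layers import Lambda

# halves of the feature axis of the (20, 200) tensor
left = Lambda(lambda x: x[:, :, :100])(lstm_1st_layer)
right = Lambda(lambda x: x[:, :, 100:])(lstm_1st_layer)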