I went through the GAN tutorial on the official TensorFlow site.
There I came across this snippet:
generator = make_generator_model()
noise = tf.random.normal([1, 100])
generated_image = generator(noise, training=False)
plt.imshow(generated_image[0, :, :, 0], cmap='gray')
make_generator_model() returns a Sequential model. That part is clear. But what about generated_image? Isn't it a tensor? How can we generate an image and inspect it when we have not run a session, and how can the matplotlib pyplot function plot a tensor object? As far as I know, pyplot expects a NumPy array to plot an image, doesn't it? Can anyone help me with this?
That method is defined as
def make_generator_model():
    model = tf.keras.Sequential()
    model.add(layers.Dense(4*4*1024, use_bias=False, input_shape=(100,)))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())
    # ... remaining layers omitted in this excerpt
As you can see, what you get is a tf.keras.Sequential
Dense Layer
In Keras, you build models out of layers. A model is usually a network of layers, and the most common type is a plain stack of layers (a Sequential model).
The densely-connected layer added first takes input arrays of shape (batch_size, 100) and outputs arrays of shape (batch_size, 4*4*1024). Later layers do not need an input size because Keras infers their shapes automatically.
Batch normalization works like a preprocessing step at every layer of the network: it re-centers and re-scales the layer's inputs per mini-batch.
ReLU is linear for all positive values and set to zero for all negative values. Leaky ReLU has a smaller slope for negative values, instead of altogether zero.
For example, leaky ReLU may have y = 0.01x when x < 0
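The difference between the two activations can be sketched in a few lines of plain Python. The slope of 0.01 follows the example above and is only illustrative; the Keras LeakyReLU layer has its own default slope, so check the documentation for your installed version.

```python
def relu(x):
    """Standard ReLU: identity for positive inputs, zero otherwise."""
    return x if x > 0 else 0.0


def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: identity for positive inputs, small slope alpha otherwise."""
    return x if x > 0 else alpha * x


# Compare both activations on a negative, a zero and a positive input.
for value in (-2.0, 0.0, 3.0):
    print(value, relu(value), leaky_relu(value))
```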
More info https://towardsdatascience.com/developing-a-dcgan-model-in-tensorflow-2-0-396bc1a101b2
The tutorial uses TF 2.0, which employs eager execution by default. This means that ops are run as they are defined, similar to e.g. PyTorch. Because of this, you can think about control flow in a much more "natural" way (as you would with NumPy functions). Calling generator immediately returns a tensor with concrete values, which plt.imshow converts to a NumPy array; there are no more sessions. I encourage you to check out the tutorials on the TF website that talk about the 2.0 changes.
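A minimal sketch of this behaviour, assuming TensorFlow 2.x and NumPy are installed. It uses only the noise tensor from the question rather than the tutorial's generator, so the variable names as_array and also_array are illustrative.

```python
import numpy as np
import tensorflow as tf

# In TF 2.x eager execution is on by default, so no session is needed.
print(tf.executing_eagerly())

# Same call as in the question; the op runs immediately.
noise = tf.random.normal([1, 100])

# The tensor already holds concrete values and converts to NumPy directly.
as_array = noise.numpy()
# Implicit conversion, similar to what array-consuming libraries rely on.
also_array = np.asarray(noise)

print(type(as_array), as_array.shape)
```

The same conversion applies to generated_image, which is why slicing it and handing it to plt.imshow works without any explicit session.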
I use a MultiHeadAttention layer in my transformer model (my model is very similar to named entity recognition models). Because my data comes in different lengths, I use padding and the attention_mask parameter of MultiHeadAttention to mask the padding. If I use a Masking layer before MultiHeadAttention, will it have the same effect as the attention_mask parameter? Or should I use both attention_mask and a Masking layer?
The TensorFlow documentation on masking and padding with Keras may be helpful.
The following is an excerpt from the document.
When using the Functional API or the Sequential API, a mask generated
by an Embedding or Masking layer will be propagated through the
network for any layer that is capable of using them (for example, RNN
layers). Keras will automatically fetch the mask corresponding to an
input and pass it to any layer that knows how to use it.
tf.keras.layers.MultiHeadAttention also supports automatic mask propagation as of TF 2.10.0. From the release notes:
Improved masking support for tf.keras.layers.MultiHeadAttention.
Implicit masks for query, key and value inputs will automatically be
used to compute a correct attention mask for the layer. These padding
masks will be combined with any attention_mask passed in directly when
calling the layer. This can be used with tf.keras.layers.Embedding
with mask_zero=True to automatically infer a correct padding mask.
Added a use_causal_mask call time argument to the layer. Passing
use_causal_mask=True will compute a causal attention mask, and
optionally combine it with any attention_mask passed in directly when
calling the layer.
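How an implicit padding mask and an explicit attention_mask might be combined can be sketched in NumPy. This is an illustration of the behaviour the release note describes, not the library's actual implementation; the token ids, the padding id 0 and the explicit mask are all made up for the example.

```python
import numpy as np

# Hypothetical batch of token ids, with 0 used as the padding id.
token_ids = np.array([[5, 8, 2, 0, 0],
                      [7, 3, 0, 0, 0]])

# Padding mask of the kind an Embedding layer with mask_zero=True produces.
padding_mask = token_ids != 0                       # shape (batch, seq)

# Self-attention: position i may attend to j only if both are real tokens.
implicit_mask = padding_mask[:, :, None] & padding_mask[:, None, :]

# Hypothetical explicit attention_mask: nobody may attend to the first token.
explicit_mask = np.ones_like(implicit_mask)
explicit_mask[:, :, 0] = False

# Both masks are combined with a logical AND.
combined = implicit_mask & explicit_mask            # shape (batch, seq, seq)
print(combined[0].astype(int))
```

Under this reading, passing attention_mask on top of a propagated mask only restricts attention further; it does not undo the padding mask.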
The Masking layer keeps the input vector as is and creates a mask vector that is propagated to the following layers if they need one (like RNN layers). You can use it if you implement your own model. If you use models from Hugging Face, you can add a Masking layer if, for example, you want to save the mask vector for future use. Otherwise the masking operations are already built in, so there is no need to add a Masking layer at the beginning.
I'm trying to apply a separate convolution to each layer of a 3-dimensional array, which brought me to the Keras TimeDistributed layer. But the documentation notes that:
"Because TimeDistributed applies the same instance of Conv2D to each of the
timestamps, the same set of weights are used at each timestamp."
However, I want to perform a separate convolution (with independently defined weights / filters) for each layer of the array, not using the same set of weights. Is there some built in way to do this? Any help is appreciated!
Let's assume I want to make the following layer in a neural network: instead of having a square convolutional filter that moves over some image, I want the filter to have some other shape, say a rectangle, circle, triangle, etc. (this is of course a silly example; the real case I have in mind is something different). How would I implement such a layer in TensorFlow?
I found that one can define custom layers in Keras by extending tf.keras.layers.Layer, but the documentation is quite limited and has few examples. A Python implementation of a convolutional layer, for example one that extends tf.keras.layers.Layer, would probably help as well, but it seems that the convolutional layers are implemented in C. Does this mean that I have to implement my custom layer in C to get any reasonable speed, or would Python TensorFlow operations be enough?
Edit: Perhaps it is enough if I can define a tensor of weights in which I control which entries are identically zero and which weights show up in multiple places. Then I should be able to build a convolutional layer and other layers by hand. How would I do this, and also include these variables in training?
Edit2: Let me add some more clarifications. We can take the example of building a 5x5 convolutional layer with one output channel from scratch. If the input is, say, 10x10 (plus padding so the output is also 10x10), I would imagine doing this by creating a matrix of size 100x100. Then I would fill in the 25 weights in the correct locations in this matrix (so some entries are zero and some entries are equal, i.e. all 25 weights show up in many locations in this matrix). I then multiply the flattened input with this matrix to get an output. So my question is twofold: 1. How do I do this in TensorFlow? 2. Would this be very inefficient, and is some other approach recommended (assuming that I want to later customize what this filter looks like, so the standard conv2d is not good enough)?
Edit3: It seems doable by using sparse tensors and assigning values via a previously defined tf.Variable. However I don't know if this approach will suffer from performance issues.
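The matrix construction described in Edit2 can be sketched in NumPy to check that it reproduces an ordinary zero-padded convolution. This is only an illustration of the idea, not a TensorFlow layer; the helper names conv_as_matrix and direct_conv are made up for the example, and the kernel values are arbitrary.

```python
import numpy as np


def conv_as_matrix(kernel, height, width):
    """Build the (H*W, H*W) matrix for a zero-padded 'same' cross-correlation."""
    r = kernel.shape[0] // 2          # assumes a square, odd-sized kernel
    mat = np.zeros((height * width, height * width))
    for i in range(height):
        for j in range(width):
            row = i * width + j       # index of the output pixel (i, j)
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ii, jj = i + di, j + dj
                    # Neighbours outside the image are padding and contribute 0.
                    if 0 <= ii < height and 0 <= jj < width:
                        mat[row, ii * width + jj] = kernel[di + r, dj + r]
    return mat


def direct_conv(image, kernel):
    """Reference implementation: slide the kernel over a zero-padded image."""
    k = kernel.shape[0]
    padded = np.pad(image, k // 2)
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + k, j:j + k] * kernel)
    return out


kernel = np.arange(1, 26, dtype=float).reshape(5, 5)   # 25 distinct weights
image = np.random.default_rng(0).normal(size=(10, 10))

matrix = conv_as_matrix(kernel, 10, 10)
via_matrix = (matrix @ image.ravel()).reshape(10, 10)

print(matrix.shape, np.count_nonzero(matrix))
print(np.allclose(via_matrix, direct_conv(image, kernel)))
```

Most of the 10,000 entries are zero and the 25 weights repeat many times, which is the reason a dense matrix of this kind is wasteful compared with a dedicated convolution op.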
Just use regular convolutional layers with square filters, and zero out some values after each weight update (TensorFlow 1.x graph-mode code):
g = tf.get_default_graph()
sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
conv1_filter = g.get_tensor_by_name('conv1:0')
sess.run(tf.assign(conv1_filter, tf.multiply(conv1_filter, my_mask)))
where my_mask is a binary tensor (of the same shape and type as your filters) that matches the desired pattern.
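As an illustration of what such a mask could look like, here is a NumPy sketch that builds a disk-shaped binary mask for a 5x5 filter and re-applies it after a stand-in weight update. The helper disk_mask, the random weights and the fake gradient are assumptions made for the example; in a real model the update would come from the optimizer.

```python
import numpy as np


def disk_mask(size):
    """Binary mask that keeps filter entries inside a centered disk."""
    center = (size - 1) / 2
    yy, xx = np.mgrid[:size, :size]
    inside = (yy - center) ** 2 + (xx - center) ** 2 <= center ** 2
    return inside.astype(np.float32)


my_mask = disk_mask(5)
print(my_mask)

rng = np.random.default_rng(0)
# Start from a filter that already respects the mask.
weights = rng.normal(size=(5, 5)).astype(np.float32) * my_mask

# Stand-in for one optimizer step: the update touches every entry...
fake_gradient = rng.normal(size=(5, 5)).astype(np.float32)
weights = weights - 0.1 * fake_gradient

# ...so the mask is re-applied afterwards, mirroring the tf.assign line above.
weights = weights * my_mask
```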
EDIT: if you're not familiar with tensorflow, you might get confused about using the code above. I recommend looking at this example, and specifically at the way the model is constructed (if you do it like this you can access first layer filters as 'conv1/weights'). Also, I recommend switching to PyTorch :)
I am working on a project where I need deconvolution. I read that gen_nn_ops.max_pool_grad_v2() can do that. I load the function from tensorflow.python.ops.
As far as I understand, the function takes an input and output tensor where the input is a convolutional layer before max pooling and the output the result of the max pooling operation. But what is grad? And what exactly does the output of the function represent?
ksize = [1,2,2,1]
strides = [1,2,2,1]
padding = 'SAME'
u = gen_nn_ops.max_pool_grad_v2(input, output, grad, ksize, strides, padding)
Unfortunately I did not find anything useful on the Internet.
Regarding deconvolution, max_pool_grad_v2 is probably not the op you're looking for. For deconvolution, you probably want to use the Keras layer Conv2DTranspose instead.
max_pool_grad_v2 is a gradient function for computing the gradient of the max-pooling function (you'll see that it's used for that very purpose internally within TensorFlow). A gradient function such as _MaxPoolGradGrad computes gradients with respect to the op's inputs given gradients with respect to the op's outputs, so grad is the incoming gradient with respect to the pooling output. You don't really need to understand how gradients are implemented in TensorFlow in order to use it unless you want to implement some of your own, but if you do, there is a guide on the main TensorFlow site.
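What a max-pooling gradient computes can be sketched in NumPy for the simplest case: a single 2D array, a 2x2 window and stride 2. This is a simplified illustration under those assumptions, not the TensorFlow op itself, and the function names are made up; in particular, ties inside a window are not handled the way a real implementation would handle them.

```python
import numpy as np


def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on a 2D array with even height and width."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))


def max_pool_2x2_grad(x, grad):
    """Route each upstream gradient value to the position that produced the max."""
    pooled = max_pool_2x2(x)
    # Repeat pooled values and gradients so they line up with the input grid.
    pooled_up = np.repeat(np.repeat(pooled, 2, axis=0), 2, axis=1)
    grad_up = np.repeat(np.repeat(grad, 2, axis=0), 2, axis=1)
    return np.where(x == pooled_up, grad_up, 0.0)


x = np.array([[1., 2., 5., 6.],
              [3., 4., 7., 8.],
              [9., 1., 2., 3.],
              [0., 2., 4., 1.]])
# Gradient of some loss with respect to the pooled output (arbitrary values).
grad = np.array([[10., 20.],
                 [30., 40.]])

dx = max_pool_2x2_grad(x, grad)
print(max_pool_2x2(x))
print(dx)
```

The result has the shape of the input, with each gradient value placed where the corresponding maximum was found and zeros elsewhere.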
I am new to Keras. How can I print the outputs of a layer, whether intermediate or final, during the training phase?
I am trying to debug my neural network and want to know how the layers behave during training. To do so, I am trying to extract the input and output of a layer at every training step.
The FAQ (https://keras.io/getting-started/faq/#how-can-i-obtain-the-output-of-an-intermediate-layer) has a method to extract output of intermediate layer for building another model but that is not what I want. I don't need to use the intermediate layer output as input to other layer, I just need to print their values out and perhaps graph/chart/visualize it.
I am using Keras 2.1.4
I think I have found an answer myself, although not strictly accomplished by Keras.
Basically, to access layer output during training, one needs to modify the computation graph by adding a print node.
A more detailed description can be found in this StackOverflow question:
How can I print the intermediate variables in the loss function in TensorFlow and Keras?
I will quote an example here. Say you would like to have your loss printed at every step; then you need to define your custom loss function as follows.
for Theano backend:
def custom_loss(y_true, y_pred):  # illustrative function name
    diff = y_pred - y_true
    diff = theano.printing.Print('shape of diff', attrs=['shape'])(diff)
    return K.square(diff)
for Tensorflow backend:
def custom_loss(y_true, y_pred):  # illustrative function name
    diff = y_pred - y_true
    diff = tf.Print(diff, [tf.shape(diff)])  # tf.Print is the TF 1.x op
    return K.square(diff)
Outputs of other layers can be accessed similarly.
There is also a nice tutorial from Google about using tf.Print():
Using tf.Print() in TensorFlow
If you want more information on each neuron, you can read out the layer's weights and biases like this:
weights = model.layers[0].get_weights()[0]
biases = model.layers[0].get_weights()[1]
Index 0 holds the weights and index 1 holds the biases.
You can also get them for every layer:
for layer in model.layers:
    weights = layer.get_weights()  # list of numpy arrays
If you read each layer's weights and biases into NumPy arrays after each training run, you should be able to visualize how the neurons change over training.
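A minimal sketch of such an inspection loop, assuming TensorFlow 2.x. The two-layer toy model is a hypothetical stand-in for the network being debugged, and the printed statistics are just one possible summary.

```python
import tensorflow as tf

# Hypothetical toy model standing in for the one being debugged.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),
    tf.keras.layers.Dense(4),
    tf.keras.layers.Dense(1),
])

# Print the shape, mean and standard deviation of every parameter array.
for layer in model.layers:
    for array in layer.get_weights():   # list of NumPy arrays
        print(layer.name, array.shape, float(array.mean()), float(array.std()))
```

Running the loop after each epoch and storing the arrays gives a series that can be plotted, for example as histograms per layer.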
Hope it helps.