I'm trying to create a custom RNN cell in TensorFlow that accepts a tuple as an input, but I'm running into the problem that the parent class BasicLSTMCell requires that inputs be two-dimensional:
# Inputs must be 2-dimensional.
self.input_spec = base_layer.InputSpec(ndim=2)
How can I get around this restriction? I can't add the logic to handle the tuple in the call() method because execution never reaches the method - a dimensionality check raises an error.
I actually found this problem as well. There is a bug in the tensorflow platform. You can solve by changing the get_step_input_shape function in the recurrent.py file. Just add [0] to the end of this line: nest.map_structure(get_input_spec, input_shape))
Related
I'm trying to add a scalar parameter to my model (code too complex to attach), but it is effectively like:
class WholeModel:
def __init__(...):
self.new_parameter = Parameter(torch.scalar_tensor(0.1, requires_grad=True))
self.model = self.make_model()
def make_model(self):
d = distribution() # returns a Distribution which is a Module
d = transform_distribution(d, self.new_parameter)
d.register_parameter(name='new', param=self.new_parameter)
return d
However, I run into this error RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.
If I change self.new_parameter = Parameter(torch.scalar_tensor(0.1)) to self.new_parameter = torch.scalar_tensor(0.1) and remove the register_parameter, then it compiles and runs (but then obviously its not then learning the parameter).
I've also tried using a tensor rather than a scalar_tensor but this also doesn't work. The error occurs with/without requires_grad.
Any ideas? It really is just a simple addition to a blackbox model.
Thanks
I want to extract features of a optical image and save them into numpy array . I've seen similar questions , also can be seen here : https://keras.io/getting_started/faq/#how-can-i-obtain-the-output-of-an-intermediate-layer-feature-extraction , but don't know how to go about it .
Keras documentation does exaclty specify how to do that. If you have defined your model model_full you can create another one, that is just a part of it - from the input layer, to the one you're interested in.
model_part = Model(
inputs=model_full.input,
outputs=model_full.get_layer("intermed_layer").output)
Then you should be able to obtain output from intermediate layer using:
intermed_output = model_part(data)
In order to do that, you just need a model_full defined, which I assume you already have.
2nd approach
You can also use built-in Keras function, which I guess you already saw in documentation as well. It may look kind of complicated at first, but it's just creating a function with bound values i.e.
from keras import backend as K
get_3rd_layer_output = K.function(
[model.layers[0].input], # param 1 will be treated as layer[0].output
[model.layers[3].output]) # and this function will return output from 3rd layer
# here X is param 1 (input) and the function returns output from layers[3]
output = get_3rd_layer_output([X])[0]
Clearly, again model has to be defined. Not sure if there are any other requirements apart from that.
For my task, I do not need to compute gradients. I am simply replacing nn.L1Loss with a numpy function (corrcoef) in my loss evaluation but I get the following error:
RuntimeError: Can’t call numpy() on Variable that requires grad. Use var.detach().numpy() instead.
I couldn’t figure out how exactly I should detach the graph (I tried torch.Tensor.detach(np.corrcoef(x, y)) but I still get the same error. I eventually wrapped everything using with torch.no_grad as follow:
with torch.no_grad():
predFeats = self.forward(x)
targetFeats = self.forward(target)
loss = torch.from_numpy(np.corrcoef(predFeats.cpu().numpy().astype(np.float32), targetFeats.cpu().numpy().astype(np.float32))[1][1])
But this time I get the following error:
TypeError: expected np.ndarray (got numpy.float64)
I wonder, what am I doing wrong?
TL;DR
with torch.no_grad():
predFeats = self(x)
targetFeats = self(target)
loss = torch.tensor(np.corrcoef(predFeats.cpu().numpy(),
targetFeats.cpu().numpy())[1][1]).float()
You would avoid the first RuntimeError by detaching the tensors (predFeats and targetFeats) from the computational graph.
i.e. Getting a copy of the tensor data without the gradients and the gradient function (grad_fn).
So, instead of
torch.Tensor.detach(np.corrcoef(x.numpy(), y.numpy())) # Detaches a newly created tensor!
# x and y still may have gradients. Hence the first error.
which does nothing, do
# Detaches x and y properly
torch.Tensor(np.corrcoef(x.detach().numpy(), y.detach().numpy()))
But let's not bother with all the detachments.
Like you rightfully fixed, it, let's disable the gradients.
torch.no_grad()
Now, compute the features.
predFeats = self(x) # No need for the explicit .forward() call
targetFeats = self(target)
I found it helpful to break your last line up.
loss = np.corrcoef(predFeats.numpy(), targetFeats.numpy()) # We don't need to detach
# Notice that we don't need to cast the arguments to fp32
# since the `corrcoef` casts them to fp64 anyway.
print(loss.shape, loss.dtype) # A 2-dimensional fp64 matrix
loss = loss[1][1]
print(type(loss)) # Output: numpy.float64
# Loss now just a simple fp64 number
And that is the problem!
Because, when we do
loss = torch.from_numpy(loss)
we're passing in a number (numpy.float64) while it expects a numpy tensor (np.ndarray).
If you're using PyTorch 0.4 or up, there's inbuilt support for scalars.
Simply replace the from_numpy() method with the universal tensor() creation method.
loss = torch.tensor(loss)
P.S. You might also want to look at setting rowvar=False in corrcoef since the rows in PyTorch tensors usually represent the observations.
I just recently started playing around with Keras and got into making custom layers. However, I am rather confused by the many different types of layers with slightly different names but with the same functionality.
For example, there are 3 different forms of the concatenate function from https://keras.io/layers/merge/ and https://www.tensorflow.org/api_docs/python/tf/keras/backend/concatenate
keras.layers.Concatenate(axis=-1)
keras.layers.concatenate(inputs, axis=-1)
tf.keras.backend.concatenate()
I know the 2nd one is used for functional API but what is the difference between the 3? The documentation seems a bit unclear on this.
Also, for the 3rd one, I have seen a code that does this below. Why must there be the line ._keras_shape after the concatenation?
# Concatenate the summed atom and bond features
atoms_bonds_features = K.concatenate([atoms, summed_bond_features], axis=-1)
# Compute fingerprint
atoms_bonds_features._keras_shape = (None, max_atoms, num_atom_features + num_bond_features)
Lastly, under keras.layers, there always seems to be 2 duplicates. For example, Add() and add(), and so on.
First, the backend: tf.keras.backend.concatenate()
Backend functions are supposed to be used "inside" layers. You'd only use this in Lambda layers, custom layers, custom loss functions, custom metrics, etc.
It works directly on "tensors".
It's not the choice if you're not going deep on customizing. (And it was a bad choice in your example code -- See details at the end).
If you dive deep into keras code, you will notice that the Concatenate layer uses this function internally:
import keras.backend as K
class Concatenate(_Merge):
#blablabla
def _merge_function(self, inputs):
return K.concatenate(inputs, axis=self.axis)
#blablabla
Then, the Layer: keras.layers.Concatenate(axis=-1)
As any other keras layers, you instantiate and call it on tensors.
Pretty straighforward:
#in a functional API model:
inputTensor1 = Input(shape) #or some tensor coming out of any other layer
inputTensor2 = Input(shape2) #or some tensor coming out of any other layer
#first parentheses are creating an instance of the layer
#second parentheses are "calling" the layer on the input tensors
outputTensor = keras.layers.Concatenate(axis=someAxis)([inputTensor1, inputTensor2])
This is not suited for sequential models, unless the previous layer outputs a list (this is possible but not common).
Finally, the concatenate function from the layers module: keras.layers.concatenate(inputs, axis=-1)
This is not a layer. This is a function that will return the tensor produced by an internal Concatenate layer.
The code is simple:
def concatenate(inputs, axis=-1, **kwargs):
#blablabla
return Concatenate(axis=axis, **kwargs)(inputs)
Older functions
In Keras 1, people had functions that were meant to receive "layers" as input and return an output "layer". Their names were related to the merge word.
But since Keras 2 doesn't mention or document these, I'd probably avoid using them, and if old code is found, I'd probably update it to a proper Keras 2 code.
Why the _keras_shape word?
This backend function was not supposed to be used in high level codes. The coder should have used a Concatenate layer.
atoms_bonds_features = Concatenate(axis=-1)([atoms, summed_bond_features])
#just this line is perfect
Keras layers add the _keras_shape property to all their output tensors, and Keras uses this property for infering the shapes of the entire model.
If you use any backend function "outside" a layer or loss/metric, your output tensor will lack this property and an error will appear telling _keras_shape doesn't exist.
The coder is creating a bad workaround by adding the property manually, when it should have been added by a proper keras layer. (This may work now, but in case of keras updates this code will break while proper codes will remain ok)
Keras historically supports 2 different interfaces for their layers, the new functional one and the old one, that requires model.add() calls, hence the 2 different functions.
For the TF -- their concatenate() functions does not do everything that required for Keras to work, hence, the additional calls to make ._keras_shape variable correct and not to upset Keras that expects that variable to have some particular value.
I want to be able to feed a initial state to a network via a placeholder and TensorFlow only allow array or tensor to be feed (And I don't know how to create a zer initiale state tuple) . But the tf.nn.dynamic_rnn function recquire a tuple of size 3.
In the answer of this post:
How do I set TensorFlow RNN state when state_is_tuple=True?
is exposed a method to do this conversion but the function utilised l = tf.unpack(state_placeholder, axis=0) doesn't exist anymore. How can i perform the conversion from a tensor of shape (num_layer,2,batch_size,hidden_layers) feed to a placeholder to a tupple acceptable by tf.nn.dynamic_rnn as a initial_state argument?
tf.unpack got replaced with tf.unstack. Can you use that instead?
tf.unstack seems to do the work but the tf.nn.dynamic_rnn still throw me an eror message : AttributeError: 'LSTMStateTuple' object has no attribute 'get_shape'
If It's not an LSTMStateTuple that is expected what is it?
The total stack trace error is: