I am using the intermediate outputs of a larger model as the input to smaller models, and I'm trying to combine everything into one contiguous model. To do so, I have to use K.function() as part of the model. This leads to the question:
Is there any way to use a K.function() within a Keras layer?
I created a simple custom layer using:
class ActivationExtraction(Layer):
    """
    Extracts all of the outputs of the input_model network and feeds it as input
    to the next layer
    """
    def __init__(self, input_model, **kwargs):
        self.input_model = input_model
        # Extracts all outputs
        outputs = [layer.output for layer in input_model.layers]
        self.output_dim = np.array(outputs).shape
        self.names = [layer.name for layer in input_model.layers]
        # Evaluation function
        self.output_function = K.function([input_model.input] + [K.learning_phase()],
                                          outputs)
        super(ActivationExtraction, self).__init__(**kwargs)

    def build(self, input_shape):
        super(ActivationExtraction, self).build(input_shape)

    def call(self, x):
        return self.output_function([x, 0])

    def compute_output_shape(self, input_shape):
        return self.output_dim
However, when I define the model, it returns the error
TypeError: The value of a feed cannot be a tf.Tensor object. Acceptable feed
values include Python scalars, strings, lists, numpy ndarrays, or TensorHandles.
I don't know a workaround for evaluating the tensor at compile time (because of the input shape being dynamic). I have tried using
def call(self, x):
    x = K.get_session().run(x)
    return self.output_function([x, 0])
as a long shot to try and evaluate the tensor, but I'm not sure what I would feed it (I have limited experience with tensorflow).
As a last resort, I also looked for a way to evaluate a tensor on the fly inside a Keras layer, but haven't been able to find one.
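For reference, one way to avoid K.function() entirely and stay symbolic is to wrap the intermediate outputs in a Keras Model; this is only a rough sketch (assuming input_model is the trained model referred to above):

from keras.models import Model
from keras.layers import Input

# A sub-model whose outputs are all the intermediate outputs of input_model.
# Because it stays symbolic, it can be called directly inside a larger model,
# with no K.function() or feed values involved.
extractor = Model(inputs=input_model.input,
                  outputs=[layer.output for layer in input_model.layers])

new_input = Input(shape=input_model.input_shape[1:])
intermediate_outputs = extractor(new_input)  # a list of symbolic tensors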
Related
So, I'm trying to create a custom layer in TensorFlow 2.4.1, using a function for a neuron I defined.
# NOTE: this is not the actual neuron I want to use,
# it's just a simple example.
def neuron(x, W, b):
    return W @ x + b
The W and b it receives would be of shape (1, x.shape[0]) and (1, 1) respectively, so this behaves like a single neuron in a dense layer. I want to create a dense layer by stacking however many of these individual neurons I want.
class Layer(tf.keras.layers.Layer):
    def __init__(self, n_units=5):
        super(Layer, self).__init__()  # handles standard arguments
        self.n_units = n_units  # Number of neurons to be in the layer

    def build(self, input_shape):
        # Create weights and biases for all neurons individually
        for i in range(self.n_units):
            # Create weights and bias for ith neuron
            ...

    def call(self, inputs):
        # Compute outputs for all neurons
        ...
        # Concatenate outputs to create layer output
        ...
        return output
How can I create a layer as a stack of individual neurons (also in a way it can train)? I have roughly outlined the idea for the layer in the above code, but the answer doesn't need to follow that as a blueprint.
Finally: yes, I'm aware that to create a dense layer you don't need to go about it in such a roundabout way (you just need one weight and one bias matrix), but in my actual use case this is necessary. Thanks!
So, person who asked this question here, I have found a way to do it, by dynamically creating variables and operations.
First, let's re-define the neuron to use tensorflow operations:
def neuron(x, W, b):
    return tf.add(tf.matmul(W, x), b)
Then, let's create the layer (this uses the blueprint laid out in the question):
class Layer(tf.keras.layers.Layer):
    def __init__(self, n_units=5):
        super(Layer, self).__init__()
        self.n_units = n_units

    def build(self, input_shape):
        for i in range(self.n_units):
            exec(f'self.kernel_{i} = self.add_weight("kernel_{i}", shape=[1, int(input_shape[0])])')
            exec(f'self.bias_{i} = self.add_weight("bias_{i}", shape=[1, 1])')

    def call(self, inputs):
        for i in range(self.n_units):
            exec(f'out_{i} = neuron(inputs, self.kernel_{i}, self.bias_{i})')
        return eval(f'tf.concat([{", ".join([ f"out_{i}" for i in range(self.n_units) ])}], axis=0)')
As you can see, we're using exec and eval to dynamically create variables and perform operations.
That's it! We can perform a few checks to see if TensorFlow could use this:
# Check to see if it outputs the correct thing
layer = Layer(5) # With 5 neurons, it should return a (5, 6)
print(layer(tf.zeros([10, 6])))
# Check to see if it has the right trainable parameters
print(layer.trainable_variables)
# Check to see if TensorFlow can find the gradients
layer = Layer(5)
x = tf.ones([10, 6])
with tf.GradientTape() as tape:
    z = layer(x)
print(f"Parameter: {layer.trainable_variables[2]}")
print(f"Gradient: {tape.gradient(z, layer.trainable_variables[2])}")
This solution works, but it's not very elegant... I wonder if there's a better way to do it, some magical TF method that can map the neuron to create a layer; I'm too inexperienced to know for the moment. So please answer if you have a (better) answer, and I'll be happy to accept it :)
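For comparison, here is a rough, untested sketch of the same layer without exec/eval, keeping the per-neuron weights in plain Python lists (Keras tracks variables stored in list attributes; the class name ListLayer is made up):

class ListLayer(tf.keras.layers.Layer):
    def __init__(self, n_units=5):
        super(ListLayer, self).__init__()
        self.n_units = n_units

    def build(self, input_shape):
        # One (kernel, bias) pair per neuron, kept in ordinary lists
        self.kernels = [self.add_weight(f"kernel_{i}", shape=[1, int(input_shape[0])])
                        for i in range(self.n_units)]
        self.biases = [self.add_weight(f"bias_{i}", shape=[1, 1])
                       for i in range(self.n_units)]

    def call(self, inputs):
        # Apply each neuron and stack the results, just like the exec/eval version
        outs = [neuron(inputs, W, b) for W, b in zip(self.kernels, self.biases)]
        return tf.concat(outs, axis=0)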
I was using TensorFlow 2.1.0 to build a model, and while building a custom layer that needs to convert a tensor to a numpy array, a problem occurred. I distinctly remember that aTensor.numpy() is a real thing, so it must be something I did wrong. Could anyone tell me how I can fix it? I'm still a noob at TensorFlow.
Here is the code (it's just the layer, not the whole model):
class CIN_Layer(tf.keras.layers.Layer):
    def __init__(self, in_shape):
        super(CIN_Layer, self).__init__()
        self.in_shape = in_shape

    # this is the custom part
    def get3DTensor(self, inputs, lastLayerOutput=None):
        print(type(inputs))
        inputs = inputs.numpy()  # FIX HERE: problem occurs here
        interaction = []
        if lastLayerOutput == None:
            lastLayerOutput = inputs.copy()
        else:
            lastLayerOutput = lastLayerOutput.numpy()
        for i in range(inputs.shape[0]):
            interaction.append(np.dot(inputs[i].reshape([-1,1]), lastLayerOutput[i].reshape([1,-1])))
        return tf.convert_to_tensor(np.array(interaction))

    def build(self, input_shape):
        self.kernel = self.add_weight('CIN_kernel', shape=[self.in_shape[-1] for i in range(3)])

    def call(self, inputs, lastLayerOutput=None):
        interaction = self.get3DTensor(inputs, lastLayerOutput)
        return tf.reduce_sum(tf.matmul(inputs, self.kernel))
inputs = tf.keras.layers.Input(shape=(5,10))
cin_layer = CIN_Layer(in_shape=(5,10))
lastLayerOutput = cin_layer(inputs)
output = tf.keras.layers.Dense(1)(lastLayerOutput)
model = tf.keras.Model(inputs=inputs, outputs=output)
model.compile(loss='mean_squared_error', optimizer=optimizer)
model.summary()
If there are other ways to insert some numpy code in a tensorflow model, please do tell.
In TensorFlow 2.0 there are two kinds of tensor objects: Tensor and EagerTensor. Only EagerTensor has a numpy() method, and EagerTensors are those whose values are readily available at runtime; e.g. tf.ones(2, 3) creates an EagerTensor because its value is known immediately, and any operation performed on it returns another EagerTensor. In your code, the inputs parameter of the call method is a symbolic Tensor whose value is only known during graph execution (the forward pass), so you cannot call numpy() on it. During the forward pass you should perform your operations using tensors only; you cannot alternate between tensors and numpy arrays, as that makes it impossible to trace the graph for backpropagation.
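To illustrate that advice, the outer-product loop in get3DTensor above can be expressed with tensor ops only; here is a rough sketch (it assumes the same per-sample flattening as the numpy loop, and is not tested against the original model):

def get3DTensor(self, inputs, lastLayerOutput=None):
    # Per-sample outer product, as in the numpy loop, but with tensor ops only,
    # so it also works on symbolic tensors during graph construction.
    if lastLayerOutput is None:
        lastLayerOutput = inputs
    batch = tf.shape(inputs)[0]
    a = tf.reshape(inputs, [batch, -1, 1])           # (batch, n, 1)
    b = tf.reshape(lastLayerOutput, [batch, 1, -1])  # (batch, 1, m)
    return tf.matmul(a, b)                           # (batch, n, m) stack of outer products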
In TF 1.x, it was possible to build layers with custom variables. Here's an example:
import numpy as np
import tensorflow as tf
def make_custom_getter(custom_variables):
    def custom_getter(getter, name, **kwargs):
        if name in custom_variables:
            variable = custom_variables[name]
        else:
            variable = getter(name, **kwargs)
        return variable
    return custom_getter
# Make a custom getter for the dense layer variables.
# Note: custom variables can result from arbitrary computation;
# for the sake of this example, we make them just constant tensors.
custom_variables = {
    "model/dense/kernel": tf.constant(
        np.random.rand(784, 64), name="custom_kernel", dtype=tf.float32),
    "model/dense/bias": tf.constant(
        np.random.rand(64), name="custom_bias", dtype=tf.float32),
}
custom_getter = make_custom_getter(custom_variables)
# Compute hiddens using a dense layer with custom variables.
x = tf.random.normal(shape=(1, 784), name="inputs")
with tf.variable_scope("model", custom_getter=custom_getter):
    Layer = tf.layers.Dense(64)
    hiddens = Layer(x)
print(Layer.variables)
The printed variables of the constructed dense layer will be custom tensors we specified in the custom_variables dict:
[<tf.Tensor 'custom_kernel:0' shape=(784, 64) dtype=float32>, <tf.Tensor 'custom_bias:0' shape=(64,) dtype=float32>]
This allows us to create layers/models that use provided tensors in custom_variables directly as their weights, so that we could further differentiate the output of the layers/models with respect to any tensors that custom_variables may depend on (particularly useful for implementing functionality in modulating sub-nets, parameter generation, meta-learning, etc.).
Variable scopes used to make it easy to nest all of the graph-building inside scopes with custom getters and build models on top of the provided tensors as their parameters. Since sessions and variable scopes are no longer advisable in TF 2.0 (and all of that low-level stuff has moved to tf.compat.v1), what would be the best practice to implement the above using Keras and TF 2.0?
(Related issue on GitHub.)
Answer based on the comment below
Given you have:
kernel = createTheKernelVarBasedOnWhatYouWant() #shape (784, 64)
bias = createTheBiasVarBasedOnWhatYouWant() #shape (64,)
Make a simple function copying the code from Dense:
def custom_dense(x):
    inputs, kernel, bias = x
    outputs = K.dot(inputs, kernel)
    outputs = K.bias_add(outputs, bias, data_format='channels_last')
    return outputs
Use the function in a Lambda layer:
layer = Lambda(custom_dense)
hiddens = layer([x, kernel, bias])
Warning: kernel and bias must be produced from a Keras layer, or come from a kernel = Input(tensor=the_kernel_var) and bias = Input(tensor=bias_var).
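For example, the Input(tensor=...) route could be wired roughly like this (a sketch; the_kernel_var and bias_var are assumed to be pre-existing tensors/variables of shapes (784, 64) and (64,)):

from keras.layers import Input, Lambda
from keras.models import Model

kernel = Input(tensor=the_kernel_var)  # wraps the existing (784, 64) tensor as a Keras input
bias = Input(tensor=bias_var)          # wraps the existing (64,) tensor
x = Input(shape=(784,))

hiddens = Lambda(custom_dense)([x, kernel, bias])
model = Model([x, kernel, bias], hiddens)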
If the warning above is bad for you, you can always use kernel and bias "from outside", like:
def custom_dense(inputs):
    outputs = K.dot(inputs, kernel)  # where kernel is not part of the arguments anymore
    outputs = K.bias_add(outputs, bias, data_format='channels_last')
    return outputs

layer = Lambda(custom_dense)
hiddens = layer(x)
This last option makes it a bit more complicated to save/load models.
Old answer
You should probably use a Keras Dense layer and set its weights in a standard way:
layer = tf.keras.layers.Dense(64, name='the_layer')
layer.set_weights([np.random.rand(784, 64), np.random.rand(64)])
If you need these weights to be non-trainable, set this before compiling the Keras model:
model.get_layer('the_layer').trainable=False
If you want direct access to the variables as tensors, they are:
kernel = layer.kernel
bias = layer.bias
There are plenty of other options, but that depends on your exact intention, which is not clear in your question.
Below is a general-purpose solution that works with arbitrary Keras models in TF2.
First, we need to define an auxiliary function canonical_variable_name and a context manager custom_make_variable with the following signatures (see implementation in meta-blocks library).
import contextlib
from typing import Dict

def canonical_variable_name(variable_name: str, outer_scope: str):
    """Returns the canonical variable name: `outer_scope/.../name`."""
    # ...

@contextlib.contextmanager
def custom_make_variable(
    canonical_custom_variables: Dict[str, tf.Tensor], outer_scope: str
):
    """A context manager that overrides `make_variable` with a custom function.

    When building layers, Keras uses the `make_variable` function to create weights
    (kernels and biases for each layer). This function wraps `make_variable` with
    a closure that infers the canonical name of the variable being created (of the
    form `outer_scope/.../var_name`) and looks it up in the `custom_variables` dict
    that maps canonical names to tensors. The function adheres to the following logic:

    * If there is a match, it does a few checks (shape, dtype, etc.) and returns
      the found tensor instead of creating a new variable.
    * If there is a match but checks fail, it throws an exception.
    * If there are no matching `custom_variables`, it calls the original
      `make_variable` utility function and returns a newly created variable.
    """
    # ...
Using these functions, we can create arbitrary Keras models with custom tensors used as variables:
import numpy as np
import tensorflow as tf

canonical_custom_variables = {
    "model/dense/kernel": tf.constant(
        np.random.rand(784, 64), name="custom_kernel", dtype=tf.float32),
    "model/dense/bias": tf.constant(
        np.random.rand(64), name="custom_bias", dtype=tf.float32),
}

# Compute hiddens using a dense layer with custom variables.
x = tf.random.normal(shape=(1, 784), name="inputs")

with custom_make_variable(canonical_custom_variables, outer_scope="model"):
    Layer = tf.keras.layers.Dense(64)
    hiddens = Layer(x)

print(Layer.variables)
Not entirely sure I understand your question correctly, but it seems to me that it should be possible to do what you want with a combination of custom layers and keras functional api.
Custom layers allow you to build any layer you want in a way that is compatible with Keras, e.g.:
class MyDenseLayer(tf.keras.layers.Layer):
    def __init__(self, num_outputs, activation=None):
        super(MyDenseLayer, self).__init__()
        self.num_outputs = num_outputs
        # Resolve the activation (e.g. 'relu', 'softmax'); None means linear
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        self.kernel = self.add_weight("kernel",
                                      shape=[int(input_shape[-1]),
                                             self.num_outputs],
                                      initializer='normal')
        self.bias = self.add_weight("bias",
                                    shape=[self.num_outputs,],
                                    initializer='normal')

    def call(self, inputs):
        return self.activation(tf.matmul(inputs, self.kernel) + self.bias)
and the functional api allows you to access the outputs of said layers and re-use them:
inputs = keras.Input(shape=(784,), name='img')
x1 = MyDenseLayer(64, activation='relu')(inputs)
x2 = MyDenseLayer(64, activation='relu')(x1)
outputs = MyDenseLayer(10, activation='softmax')(x2)
model = keras.Model(inputs=inputs, outputs=outputs, name='mnist_model')
Here x1 and x2 can be connected to other subnets.
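For instance, x1 could also feed a second, hypothetical auxiliary head (just a sketch, not part of the original example):

aux = MyDenseLayer(10)(x1)  # hypothetical extra head branching off x1
multi_head_model = keras.Model(inputs=inputs, outputs=[outputs, aux])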
I'm running into an issue with a model I'm trying to build. I've been trying to debug it and ran into an oddity that I think may be the cause, but I'm not sure what I'm doing wrong. I've reduced what I think the problem is into a small snippet you can run on colab.
Here's a colab where you can try running this:
https://colab.research.google.com/drive/1pSTwCwMFGlWgJOP3gn9WF6pZq2CiP4XJ
import keras
from keras.layers import Layer, Dense, Input, Reshape
import keras.backend as K
class SimplePermute(Layer):
    def __init__(self, **kwargs):
        super(SimplePermute, self).__init__(**kwargs)

    def call(self, inputs, **kwargs):
        return K.permute_dimensions(inputs, [0,2,1])
test_i = Input(shape=(10, 256))
test = SimplePermute()(test_i)
print(test.get_shape())
print(K.int_shape(test))
test = Dense(units=100, activation="softmax", name="sft2")(test)
print(test.get_shape())
print(K.int_shape(test))
I'd expect the second series of prints to print the permuted tensor shape - that is [?, 256, 10]. However, the K.int_shape() returns [?, 10, 256], while TF's get_shape() returns the properly permuted shape.
I believe this internal mismatch is causing the errors I'm seeing downstream in the model.
Your custom layer doesn't have the compute_output_shape method implemented. This is what Keras uses to determine the _keras_shape property of the tensors, which is returned by K.int_shape.
You can use the standard Permute((2,1)) layer.
Or you can use a Lambda(lambda x: K.permute_dimensions(x, [0,2,1])) layer.
Or you can implement the compute_output_shape method:
def compute_output_shape(self, input_shape):
    return (input_shape[0], input_shape[2], input_shape[1])
I am working with the keras-capsnet implementation of Capsule Networks, and am trying to apply the same layer to 30 images per sample.
The weights are initialized within the __init__ and build methods of the class, shown below. I have successfully shared the weights between the primary routing layers, which just use tf.layers.conv2d; there I can assign them the same name and set reuse=True.
Does anyone know how to initialize weights in a Keras custom layer so that they may be reused? I am much more familiar with the tensorflow API than with the Keras one!
def __init__(self, num_capsule, dim_capsule, routings=3,
             kernel_initializer='glorot_uniform',
             **kwargs):
    super(CapsuleLayer, self).__init__(**kwargs)
    self.num_capsule = num_capsule
    self.dim_capsule = dim_capsule
    self.routings = routings
    self.kernel_initializer = initializers.get(kernel_initializer)

def build(self, input_shape):
    assert len(input_shape) >= 3, "The input Tensor should have shape=[None, input_num_capsule, input_dim_capsule]"
    self.input_num_capsule = input_shape[1]
    self.input_dim_capsule = input_shape[2]

    # Weights are initialized here each time the layer is called
    self.W = self.add_weight(shape=[self.num_capsule, self.input_num_capsule,
                                    self.dim_capsule, self.input_dim_capsule],
                             initializer=self.kernel_initializer,
                             name='W')
    self.built = True
The answer was simple: set up the layer once without calling it on an input, and then use that single built layer to call each piece of data individually, so every call shares the same weights.
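In code, that looks roughly like this (a sketch; the capsule arguments and input shapes below are made up rather than taken from keras-capsnet):

from keras.layers import Input
from keras.models import Model

# Build the layer once; do not call it yet.
shared_caps = CapsuleLayer(num_capsule=10, dim_capsule=16, routings=3)

# Calling the same instance on each of the 30 images reuses a single set of weights.
image_inputs = [Input(shape=(1152, 8)) for _ in range(30)]
caps_outputs = [shared_caps(x) for x in image_inputs]

model = Model(inputs=image_inputs, outputs=caps_outputs)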