I am trying to use a function that applies some OpenCV operations to the image, but the data I am getting is a tensor and I am not able to convert it into an image.
def image_func(img):
    img = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
    img = cv2.resize(img, (200, 66))
    return img

model = Sequential()
model.add(Lambda(image_func, input_shape=(r, c, ch), output_shape=(r, c, ch)))
When I run this snippet it throws an error in the cvtColor function saying that img is not a numpy array. I printed out img and it seemed to be a tensor.
I do not know how to convert the tensor to an image, apply the operations, and return a tensor again. I want the model to contain this layer.
If I cannot achieve this with a lambda layer what else can I do?
You are confusing the symbolic operations of the Lambda layer with the numerical operations of a plain Python function.
Basically, your custom operation accepts numerical inputs but not symbolic ones. To fix this, you need something like py_func in TensorFlow.
In addition, you have not considered backpropagation. In short, although this layer is non-parametric and non-learnable, you still need to take care of its gradient.
import numpy as np
import tensorflow as tf
from keras.layers import Input, Conv2D, Lambda, Layer
from keras.models import Model
from keras import backend as K
import cv2

def image_func(img):
    img = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
    img = cv2.resize(img, (200, 66))
    return img.astype('float32')

def image_tensor_func(img4d):
    # apply the numpy/OpenCV function to each image in the batch
    results = []
    for img3d in img4d:
        rimg3d = image_func(img3d)
        results.append(np.expand_dims(rimg3d, axis=0))
    return np.concatenate(results, axis=0)

class CustomLayer(Layer):
    def call(self, xin):
        # tf.py_func wraps a numpy function as a TF op
        # (it was deprecated in favor of tf.py_function in TF 2.x)
        xout = tf.py_func(image_tensor_func,
                          [xin],
                          'float32',
                          stateful=False,
                          name='cvOpt')
        xout = K.stop_gradient(xout)  # explicitly set no grad
        xout.set_shape([xin.shape[0], 66, 200, xin.shape[-1]])  # explicitly set output shape
        return xout

    def compute_output_shape(self, sin):
        return (sin[0], 66, 200, sin[-1])

x = Input(shape=(None, None, 3))
f = CustomLayer(name='custom')(x)
y = Conv2D(1, (1, 1), padding='same')(f)

model = Model(inputs=x, outputs=y)
model.summary()
Now you can test this layer with some dummy data.
a = np.random.randn(2, 100, 200, 3)
b = model.predict(a)
print(b.shape)

model.compile('sgd', loss='mse')
model.fit(a, b)
I'm going to assume your image_func does what you want (color conversion and resizing) on an image. Note that an image is represented by a numpy array. Since you are using the TensorFlow backend, you are operating over Tensors (this you knew).
The job now is to convert a Tensor to a numpy array. To do that we need to evaluate the Tensor, and in order to do that we need to grab a TensorFlow session.
Use the get_session() method of the keras backend module to grab the current tensorflow session.
Here is the docstring for get_session()
def get_session():
    """Returns the TF session to be used by the backend.

    If a default TensorFlow session is available, we will return it.
    Else, we will return the global Keras session.
    If no global Keras session exists at this point:
    we will create a new global session.

    Note that you can manually set the global session
    via `K.set_session(sess)`.

    # Returns
        A TensorFlow session.
    """
So try:
def image_func(img):
    from keras import backend as K
    sess = K.get_session()
    img = sess.run(img)  # now img is a proper numpy array
    img = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
    img = cv2.resize(img, (200, 66))
    return img
Note, I haven't been able to test this
EDIT: Just tested this and it won't work (as you noticed). The lambda function needs to return a Tensor. Computation flows through the Tensor, so it also needs to be smooth in the sense of differentiation.
I see that essentially the lambda is changing the color space and resizing the image; why don't you do this in a pre-processing step instead?
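For example, a minimal sketch of that pre-processing approach (X_raw is an assumed numpy array of BGR images; adapt to however you actually load your data):
import cv2
import numpy as np

def preprocess(img):
    # same OpenCV operations, applied before the data ever reaches Keras
    img = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
    img = cv2.resize(img, (200, 66))  # note: result has shape (66, 200, 3)
    return img.astype('float32')

X = np.stack([preprocess(img) for img in X_raw])
# now train with input_shape=(66, 200, 3); no Lambda layer required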
I'm trying to implement a custom loss function on my neural network, which would look like this, if tensors were, instead, numpy arrays:
def custom_loss(y_true, y_pred):
    activated = y_pred[y_true > 1]
    return np.abs(activated.mean() - activated.std()) / activated.std()
The y's have a shape of (batch_size, 1); that's to say, it's a scalar output for each input row.
Note: this post (Converting Tensor to np.array using K.eval() in Keras returns InvalidArgumentError) gave me an initial direction to walk in.
Edit:
This is a reproducible setup for which I'm trying to apply the custom loss function:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

X = np.random.normal(0, 1, (256, 5))
Y = np.random.normal(0, 1, (256, 1))

model = keras.Sequential([
    layers.Dense(1),
])
model.compile(optimizer='adam', loss=custom_loss)
model.fit(X, Y)
The .fit() on the last line throws the error AttributeError: 'Tensor' object has no attribute 'mean', if I define custom_loss as stated above on my question.
It's a simple catch. You can use your custom loss as follows
def custom_loss(y_true, y_pred):
    activated = y_pred[y_true > 1]
    return tf.math.abs(tf.reduce_mean(activated) -
                       tf.math.reduce_std(activated)) / tf.math.reduce_std(activated)
Or, if you want to use tf.boolean_mask(tensor, mask, ...), then you need to ensure that the mask is 1D, i.e. of shape (None,). The condition y_true > 1 produces a 2D tensor of shape (None, 1), so in your case it needs to be reshaped first.
def custom_loss(y_true, y_pred):
    activated = tf.boolean_mask(y_pred, tf.reshape(y_true > 1, [-1]))
    return tf.math.abs(tf.reduce_mean(activated) -
                       tf.math.reduce_std(activated)) / tf.math.reduce_std(activated)
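As a quick sanity check (a sketch, assuming TF 2.x eager execution), you can call the loss directly on dummy tensors:
y_true = tf.constant([[0.5], [2.0], [3.0], [1.5]])
y_pred = tf.constant([[0.1], [0.9], [0.4], [0.7]])
# only the rows where y_true > 1 contribute to the statistics
print(custom_loss(y_true, y_pred).numpy())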
Have you tried writing it in tensorflow and had gradient problems? Or is this just a question of how to do it in tensorflow? -- Don't worry, I won't give you a classic toxic SO response!
I would try something like this (not tested, but seems along the right track):
def custom_loss(y_true, y_pred):
    activated = tf.boolean_mask(y_pred, tf.where(y_true > 1))
    return tf.math.abs(tf.reduce_mean(activated) - tf.math.reduce_std(activated)) / tf.math.reduce_std(activated)
You may need to play around with dimensions in there, since all of those functions allow for specifying the dimensions to work with.
Also, you will lose the loss function when you save the model, unless you subclass the general loss function. That may be more detail than you are looking for, but if you have problems saving and loading the model, let me know.
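For reference, the usual way to keep a custom loss through a save/load round trip is to pass it back in via custom_objects when loading ('model.h5' is just a placeholder path):
from tensorflow import keras

model.save('model.h5')  # the loss is stored by name only, not by code
reloaded = keras.models.load_model(
    'model.h5',
    custom_objects={'custom_loss': custom_loss},
)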
I am attempting to understand more about computer vision models, and I'm trying to explore how they work. To better understand how to interpret feature vectors, I'm trying to use PyTorch to extract a feature vector. Below is my code, pieced together from various places.
import torch
import torch.nn as nn
import torchvision.models as models
import torchvision.transforms as transforms
from torch.autograd import Variable
from PIL import Image

img = Image.open("Documents/01235.png")

# Load the pretrained model
model = models.resnet18(pretrained=True)

# Use the model object to select the desired layer
layer = model._modules.get('avgpool')

# Set model to evaluation mode
model.eval()

transforms = torchvision.transforms.Compose([
    torchvision.transforms.Resize(256),
    torchvision.transforms.CenterCrop(224),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def get_vector(image_name):
    # Load the image with Pillow library
    img = Image.open("Documents/Documents/Driven Data Competitions/Hateful Memes Identification/data/01235.png")
    # Create a PyTorch Variable with the transformed image
    t_img = transforms(img)
    # Create a vector of zeros that will hold our feature vector
    # The 'avgpool' layer has an output size of 512
    my_embedding = torch.zeros(512)
    # Define a function that will copy the output of a layer
    def copy_data(m, i, o):
        my_embedding.copy_(o.data)
    # Attach that function to our selected layer
    h = layer.register_forward_hook(copy_data)
    # Run the model on our transformed image
    model(t_img)
    # Detach our copy function from the layer
    h.remove()
    # Return the feature vector
    return my_embedding

pic_vector = get_vector(img)
When I do this I get the following error:
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [64, 3, 7, 7], but got 3-dimensional input of size [3, 224, 224] instead
I'm sure this is an elementary error, but I can't seem to figure out how to fix this. It was my impression that the "totensor" transformation would make my data 4-d, but it seems it's either not working correctly or I'm misunderstanding it. Appreciate any help or resources I can use to learn more about this!
All the default nn.Modules in pytorch expect an additional batch dimension. If the input to a module is shape (B, ...) then the output will be (B, ...) as well (though the later dimensions may change depending on the layer). This behavior allows efficient inference on batches of B inputs simultaneously. To make your code conform you can just unsqueeze an additional unitary dimension onto the front of t_img tensor before sending it into your model to make it a (1, ...) tensor. You will also need to flatten the output of layer before storing it if you want to copy it into your one-dimensional my_embedding tensor.
A couple of other things:
You should infer within a torch.no_grad() context to avoid computing gradients since you won't be needing them (note that model.eval() just changes the behavior of certain layers like dropout and batch normalization, it doesn't disable construction of the computation graph, but torch.no_grad() does).
I assume this is just a copy paste issue but transforms is the name of an imported module as well as a global variable.
o.data is just returning a copy of o. In the old Variable interface (circa PyTorch 0.3.1 and earlier) this used to be necessary, but the Variable interface was deprecated way back in PyTorch 0.4.0 and no longer does anything useful; now its use just creates confusion. Unfortunately, many tutorials are still being written using this old and unnecessary interface.
Updated code is then as follows:
import torch
import torchvision
import torchvision.models as models
from PIL import Image

img = Image.open("Documents/01235.png")

# Load the pretrained model
model = models.resnet18(pretrained=True)

# Use the model object to select the desired layer
layer = model._modules.get('avgpool')

# Set model to evaluation mode
model.eval()

transforms = torchvision.transforms.Compose([
    torchvision.transforms.Resize(256),
    torchvision.transforms.CenterCrop(224),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def get_vector(image):
    # Create a PyTorch tensor with the transformed image
    t_img = transforms(image)
    # Create a vector of zeros that will hold our feature vector
    # The 'avgpool' layer has an output size of 512
    my_embedding = torch.zeros(512)

    # Define a function that will copy the output of a layer
    def copy_data(m, i, o):
        my_embedding.copy_(o.flatten())  # <-- flatten

    # Attach that function to our selected layer
    h = layer.register_forward_hook(copy_data)
    # Run the model on our transformed image
    with torch.no_grad():  # <-- no_grad context
        model(t_img.unsqueeze(0))  # <-- unsqueeze
    # Detach our copy function from the layer
    h.remove()
    # Return the feature vector
    return my_embedding

pic_vector = get_vector(img)
Instead of model(t_img), just do
model(t_img[None])
This will add an extra dimension, hence the image will have shape [1, 3, 224, 224], and it will work.
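Indexing with None is equivalent to unsqueeze(0); a quick sketch to confirm the shapes:
import torch

t_img = torch.randn(3, 224, 224)
print(t_img[None].shape)         # torch.Size([1, 3, 224, 224])
print(t_img.unsqueeze(0).shape)  # torch.Size([1, 3, 224, 224])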
I am trying to change the activation function of the last layer of a Keras model without replacing the whole layer; in this case, only the softmax function.
import keras.backend as K
from keras.models import load_model
from keras.preprocessing.image import load_img, img_to_array
import numpy as np
model = load_model(model_path) # Load any model
img = load_img(img_path, target_size=(224, 224))
img = img_to_array(img)
print(model.predict(img))
My output:
array([[1.53172877e-07, 7.13159451e-08, 6.18941920e-09, 8.52070968e-07,
1.25813088e-07, 9.98970985e-01, 1.48254022e-08, 6.09538893e-06,
1.16236095e-07, 3.91888688e-10, 6.29304608e-08, 1.79565995e-09,
1.75571788e-08, 1.02110009e-03, 2.14380114e-09, 9.54465733e-08,
1.05938483e-07, 2.20544337e-07]], dtype=float32)
Then I do this to change the activation:
model.layers[-1].activation = custom_softmax
print(model.predict(test_img))
and the output I got is exactly the same. Any ideas how to fix? Thanks!
You could try to use the custom_softmax below:
def custom_softmax(x, axis=-1):
    """Softmax activation function.
    # Arguments
        x: Tensor.
        axis: Integer, axis along which the softmax normalization is applied.
    # Returns
        Tensor, output of softmax transformation.
    # Raises
        ValueError: In case `dim(x) == 1`.
    """
    ndim = K.ndim(x)
    if ndim >= 2:
        return K.zeros_like(x)
    else:
        raise ValueError('Cannot apply softmax to a tensor that is 1D')
At the current state of things there's no official, clean way to do that. As pointed out by @layser in the comments, the TensorFlow graph isn't being updated, which results in the lack of change in your output. One option is to use keras-vis' utils. My recommendation is to isolate that in your own utils.py, like so:
from vis.utils.utils import apply_modifications

def update_layer_activation(model, activation, index=-1):
    model.layers[index].activation = activation
    return apply_modifications(model)
Which would lead to a similar use:
model = update_layer_activation(model, custom_softmax)
If you follow the given link, you'll see what they do is quite simple: they save the model to a temporary path, then load it back and return, finally deleting the temp file.
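If you'd rather not depend on keras-vis, here is a minimal sketch of that same save-and-reload trick (illustrative, not the library's exact code):
import os
import tempfile
from keras.models import load_model

def update_layer_activation(model, activation, index=-1, custom_objects=None):
    model.layers[index].activation = activation
    # saving and reloading forces Keras to rebuild the graph,
    # picking up the modified activation along the way
    path = os.path.join(tempfile.gettempdir(), 'tmp_model.h5')
    try:
        model.save(path)
        # a custom activation must be passed back in via custom_objects
        return load_model(path, custom_objects=custom_objects)
    finally:
        if os.path.exists(path):
            os.remove(path)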
As stated in the title, I was wondering how to have a custom layer return multiple tensors: out1, out2, ... outn.
I tried
keras.backend.concatenate([out1, out2], axis = 1)
But this only works for tensors of the same length, and there must be a better solution than concatenating tensors two by two every time, right?
In the call method of your layer, where you perform the layer calculations, you can return a list of tensors:
def call(self, inputTensor):
    # calculations with inputTensor and the weights you defined in "build"
    # inputTensor may be a single tensor or a list of tensors
    # output can also be a single tensor or a list of tensors
    return [output1, output2, output3]
Take care of the output shapes:
def compute_output_shape(self, inputShape):
    # calculate shapes from input shape
    return [shape1, shape2, shape3]
The result of using the layer is a list of tensors.
Naturally, some kinds of keras layers accept lists as inputs, others don't.
You have to manage the outputs properly using a functional API Model. You're probably going to have problems using a Sequential model while having multiple outputs.
I tested this code on my machine (Keras 2.0.8) and it works perfectly:
from keras.layers import *
from keras.models import *
import numpy as np

class Lay(Layer):
    def __init__(self):
        super(Lay, self).__init__()

    def build(self, inputShape):
        super(Lay, self).build(inputShape)

    def call(self, x):
        return [x[:, :1], x[:, -1:]]

    def compute_output_shape(self, inputShape):
        return [(None, 1), (None, 1)]

inp = Input((2,))
out = Lay()(inp)
print(type(out))

out = Concatenate()(out)

model = Model(inp, out)
model.summary()

data = np.array([[1, 2], [3, 4], [5, 6]])
print(model.predict(data))

import keras
print(keras.__version__)
I want to take an input image img (which also has negative values) and feed it into two activation layers. However, I want to apply a simple transformation first, e.g. multiply the whole image by -1.0:
left = Activation('relu')(img)
right = Activation('relu')(tf.mul(img, -1.0))
If I do it this way I am getting:
TypeError: Output tensors to a Model must be Keras tensors. Found: Tensor("add_1:0", shape=(?, 5, 1, 3), dtype=float32)
and I am not sure how I can fix that. Is there a Keras side mul() method that I can use for such a thing? Or can I wrap the result of tf.mul(img, -1.0) somehow such that I can pass it on to Activation?
Please note: The negative values may be important. Thus transforming the image s.t. the minimum is simply 0.0 is not a solution here.
I am getting the same error for
left = Activation('relu')(conv)
right = Activation('relu')(-conv)
The same error for:
import tensorflow as tf
minus_one = tf.constant([-1.])
# ...
right = merge([conv, minus_one], mode='mul')
Does creating a Lambda layer to wrap your function work? See the Keras documentation for the Lambda layer.
from keras.layers import Lambda, Activation
import tensorflow as tf

def mul_minus_one(x):
    # note: tf.mul was renamed tf.multiply in TensorFlow 1.0+
    return tf.mul(x, -1.0)

def mul_minus_one_output_shape(input_shape):
    return input_shape

myCustomLayer = Lambda(mul_minus_one, output_shape=mul_minus_one_output_shape)

right = myCustomLayer(img)
right = Activation('relu')(right)
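Alternatively, since the transformation is just a sign flip, a plain Python lambda keeps everything inside Keras (a sketch of the same idea; with the TensorFlow backend the output shape is inferred automatically):
from keras.layers import Activation, Lambda

# negation is shape-preserving, so no output_shape is needed here
right = Lambda(lambda x: -x)(img)
right = Activation('relu')(right)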