Changing the activation function of a Keras layer without replacing the whole layer - python

I am trying to change the activation function of the last layer of a Keras model without replacing the whole layer. In this case, I only want to replace the softmax function:
import keras.backend as K
from keras.models import load_model
from keras.preprocessing.image import load_img, img_to_array
import numpy as np

model = load_model(model_path)  # load any model
img = load_img(img_path, target_size=(224, 224))
img = img_to_array(img)
img = np.expand_dims(img, axis=0)  # predict() expects a batch dimension
print(model.predict(img))
My output:
array([[1.53172877e-07, 7.13159451e-08, 6.18941920e-09, 8.52070968e-07,
1.25813088e-07, 9.98970985e-01, 1.48254022e-08, 6.09538893e-06,
1.16236095e-07, 3.91888688e-10, 6.29304608e-08, 1.79565995e-09,
1.75571788e-08, 1.02110009e-03, 2.14380114e-09, 9.54465733e-08,
1.05938483e-07, 2.20544337e-07]], dtype=float32)
Then I do this to change the activation:
model.layers[-1].activation = custom_softmax
print(model.predict(img))
and the output I get is exactly the same. Any ideas how to fix this? Thanks!
You could try to use the custom_softmax below:
def custom_softmax(x, axis=-1):
    """Softmax activation function.
    # Arguments
        x: Tensor.
        axis: Integer, axis along which the softmax normalization is applied.
    # Returns
        Tensor, output of softmax transformation.
    # Raises
        ValueError: In case `dim(x) == 1`.
    """
    ndim = K.ndim(x)
    if ndim >= 2:
        # Deliberately returns zeros so any change in the output is obvious.
        return K.zeros_like(x)
    else:
        raise ValueError('Cannot apply softmax to a tensor that is 1D')

At the current state of things there's no official, clean way to do that. As pointed out by @layser in the comments, the TensorFlow graph isn't being updated, which explains the lack of change in your output. One option is to use keras-vis' utils. My recommendation is to isolate that in your own utils.py, like so:
from vis.utils.utils import apply_modifications

def update_layer_activation(model, activation, index=-1):
    model.layers[index].activation = activation
    return apply_modifications(model)
You would then use it like this:
model = update_layer_activation(model, custom_softmax)
If you follow the given link, you'll see that what they do is quite simple: they save the model to a temporary path, load it back, and return the reloaded model, finally deleting the temp file.
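If you'd rather avoid the keras-vis dependency, here is a minimal sketch of that same save-and-reload trick (the helper name is my own; it assumes a writable temp directory and that any custom activation is passed via custom_objects):

import os
import tempfile
from keras.models import load_model

def rebuild_model(model, custom_objects=None):
    # Saving and reloading forces Keras to rebuild the TensorFlow graph,
    # picking up the modified activation.
    path = os.path.join(tempfile.gettempdir(), 'rebuilt_model.h5')
    try:
        model.save(path)
        return load_model(path, custom_objects=custom_objects)
    finally:
        os.remove(path)

For example: model = rebuild_model(model, custom_objects={'custom_softmax': custom_softmax}).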

Related

How to deal with the batch dimension while creating a Model

from keras_multi_head import MultiHeadAttention
import keras
from keras.layers import Dense, Input, Multiply
from keras import backend as K
from keras.layers.core import Dropout, Layer
from keras.models import Sequential, Model
import numpy as np
import tensorflow as tf
from self_attention_layer import Encoder


## multi source attention
class Multi_source_attention(keras.Model):
    def __init__(self, read_n, embed_dim, num_heads, ff_dim, num_layers):
        super().__init__()
        self.read_n = read_n
        self.embed_dim = embed_dim
        self.num_heads = num_heads
        self.ff_dim = ff_dim
        self.num_layers = num_layers
        self.get_weights = Dense(49, activation='relu', name="get_weights")

    def compute_output_shape(self, input_shape):
        # ([batch,7,7,256],[1,256])
        return input_shape

    def call(self, inputs):
        ## weights matrix
        # (1,49)
        weights_res = self.get_weights(inputs[1])
        # (1,7,7)
        weights = tf.reshape(weights_res, (1, 7, 7))
        # (256,7,7)
        weights = tf.tile(weights, [256, 1, 1])
        ## img from mobilenet
        img = tf.reshape(inputs[0], [-1, 7, 7])
        inter_res = tf.multiply(img, weights)
        inter_res = tf.reshape(inter_res, (-1, 256, 49))
        print(inter_res.shape)
        att = Encoder(self.embed_dim, self.num_heads, self.ff_dim, self.num_layers)(inter_res)
        return att
I am trying to construct a network to implement the part circled in the image. The output from the LSTM is (1,256) and the output from the preceding MobileNet is (batch,7,7,256). The LSTM output is then transformed into a weights matrix of shape (7,7).
But the problem is that the output from MobileNet has a batch dimension. I have no idea how to deal with "batch", or how to set up a parameter to constrain the batch size.
Could someone give me a tip?
Also, if I remove the compute_output_shape() function, an UnimplementedError occurs, even though the official Keras docs tell me I don't need to override that function.
Could someone explain that to me?
compute_output_shape is crucial when customizing a layer. If summary() is called, the corresponding graph is generated, in which the input and output shapes are shown for every layer; compute_output_shape is what reports the layer's output shape.
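As an illustration (a minimal sketch of my own, not the asker's architecture), a custom layer whose output shape matches its input shape might look like:

from keras.layers import Layer

class ScaleLayer(Layer):
    # Toy layer: multiplies its input by one learned scalar.
    def build(self, input_shape):
        self.scale = self.add_weight(name='scale', shape=(1,),
                                     initializer='ones', trainable=True)
        super().build(input_shape)

    def call(self, inputs):
        return inputs * self.scale

    def compute_output_shape(self, input_shape):
        # Element-wise scaling preserves the shape, so report it unchanged.
        return input_shape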

Cannot convert a symbolic Keras input/output to a numpy array in a simple CNN

This is a very simple problem that I cannot get around. I am new to tensorflow and this is the second time I am facing this problem.
import tensorflow as tf
from tensorflow.keras.layers import Dense, Conv2D, MaxPooling2D, Dropout, Flatten, Input
from tensorflow.keras.models import Model
import numpy as np

x = tf.keras.Input(shape=(128, 128, 4))
conv = Conv2D(30, (3, 3), activation='relu', input_shape=(128, 128, 4))(x)
conv = Conv2D(12, (5, 5))(conv)
conv = MaxPooling2D(pool_size=(2, 2))(conv)
print(conv[2])
conv = np.array(conv[2])  # <---- here is the problem
input_mean = np.mean(conv[1:], axis=0)
input_std = np.std(conv, axis=0)
conv = (conv - input_mean) / input_std
conv = Flatten()(conv)
conv = Dense(157, activation='relu')(conv)
model = Model(inputs=x, outputs=conv)
#model.summary()
The error that I am getting is,
Cannot convert a symbolic Keras input/output to a numpy array. This error may indicate that you're trying to pass a symbolic value to a NumPy call, which is not supported. Or, you may be trying to pass Keras symbolic inputs/outputs to a TF API that does not register dispatching, preventing Keras from automatically converting the API call to a lambda layer in the Functional Model.
My question is: how would I take the output from my MaxPooling layer and compute the mean and standard deviation for each incoming channel? The result would be a tensor in which each channel is separately normalized. I would then flatten this output and send it to my fully connected Dense layer.
Thanks in advance.
I obtained a similar error and resolved it by adding
del model
before the line
model = Model(inputs=x, outputs=conv)
It resolved my issue. I am eager to know if it solves yours too :).
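That said, the error itself comes from calling NumPy on symbolic tensors. A hedged sketch of the per-channel normalization done with TensorFlow ops instead (my own suggestion, dropping the problematic np.array/np.mean/np.std block) would be:

import tensorflow as tf

# conv is the symbolic MaxPooling2D output of shape (batch, h, w, channels).
mean = tf.reduce_mean(conv, axis=[1, 2], keepdims=True)     # per-channel mean
std = tf.math.reduce_std(conv, axis=[1, 2], keepdims=True)  # per-channel std
conv = (conv - mean) / (std + 1e-7)                         # normalize each channel

Keras can convert these ops to a lambda layer automatically, so Flatten and Dense can follow as before.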

keras K.function error for layer output extraction

I currently have a modified ResNet 50 architecture that takes two inputs. Building and training the model work fine, but when I try to extract layer outputs using the backend function, I encounter errors.
I would prefer to extract layers using the backend function, rather than creating a new truncated model with just my layer of interest as the output.
The following snippet is self-contained, and should run and give the error I've been seeing.
I've tried reformatting the function in a few ways, such as
K.function([mymodel.input[0], mymodel.input[1]], [mymodel.layers[-1].layers[-6].output])
or
K.function([mymodel.layers[0].input, mymodel.layers[1].input], [mymodel.layers[-1].layers[-6].output])
but nothing seems to fix the issue.
##imports
from keras.applications.resnet50 import ResNet50
from keras.layers import Input
from keras.layers import Lambda
from keras.models import Model
from keras.optimizers import Adam
import keras
import keras.backend as K
import numpy as np

#pop off the input
res = ResNet50(weights=None, include_top=True, classes=2)
res.layers.pop(0)

#add two inputs
auxinput = Input(batch_shape=(None, 224, 224, 1), name='aux_input')
main_input = Input(batch_shape=(None, 224, 224, 3), name='main_input')

#use a lambda function to return just our main input (avoids errors from our auxiliary input not being used in the resnet50 component)
l_output = Lambda(lambda x: x[0])([main_input, auxinput])

#feed our main input to resnet50
data_passed_thru = res(l_output)

#assemble the model with our two inputs and output
mymodel = Model(inputs=[main_input, auxinput], outputs=[data_passed_thru])
mymodel.compile(optimizer=Adam(lr=0.001), loss=keras.losses.poisson, metrics=['accuracy'])
print("my model summary:")
mymodel.summary()
##generate some fake data for testing
fake_aux = np.zeros((224, 224))
fake_aux = fake_aux[None, ...]
fake_aux = fake_aux[..., None]
print('fake aux input shape:', fake_aux.shape)
fake_main = np.zeros((224, 224, 3))
fake_main = fake_main[None, ...]
print('fake main input shape:', fake_main.shape)

#check our model inputs and target layer
print("inputs:", mymodel.input)
print("layer output I'm trying to extract:", mymodel.layers[-1].layers[-6])

#create function to feed inputs, get our desired layer outputs
get_output_func = K.function(mymodel.input, [mymodel.layers[-1].layers[-6].output])

##this is the line that fails
X = [fake_main, fake_aux]
preds = get_output_func(X)
The error message I get is
InvalidArgumentError: You must feed a value for placeholder tensor 'input_1' with dtype float and shape [?,224,224,3]
[[{{node input_1}}]]
I managed to fix it by accessing the ResNet50 inputs directly in the function, rather than the whole model's initial inputs. The K.function that works is
get_output_func = K.function([mymodel.layers[-1].get_input_at(0)], [mymodel.layers[-1].layers[-6].output])
X = [fake_main]
preds = get_output_func(X)
It only works because my architecture depends on just the one input passing through, so I'm not sure what the solution would be in other situations, but it works for my case.
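For cases where more than one input genuinely reaches the target layer, the usual fallback is the truncated-model approach mentioned above. A rough sketch of the same extraction expressed that way (untested here, with the same single-input caveat as the K.function fix):

from keras.models import Model

inner = mymodel.layers[-1]  # the ResNet50 sub-model
extractor = Model(inputs=inner.get_input_at(0),
                  outputs=inner.layers[-6].output)
preds = extractor.predict(fake_main)  # feed the tensor that reaches the sub-model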

complete Keras Lambda confusion

I was trying to define a Lambda layer in Keras, as follows:
First, a function which computes the wavelet transform of an image and then gloms it together:
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras.layers import BatchNormalization
from keras.layers import Lambda
from keras import regularizers
from keras import backend as K
import pywt
import numpy as np
from keras.engine.topology import Layer

def mkwtarray(image):
    channels = K.image_data_format()
    if channels == 'channels_first':  # 'is' compares identity; use '==' for strings
        axbase = 1
    else:
        axbase = 0
    print(axbase)
    print(image.shape)
    (a, (b, c, d)) = pywt.dwt2(image, 'db1', axes=(axbase, axbase + 1))
    ab = np.concatenate((a, b), axis=axbase)
    cd = np.concatenate((c, d), axis=axbase)
    abcd = np.concatenate((ab, cd), axis=axbase + 1)
    return abcd

def wtoutshape(input_shape):
    return input_shape

train_data_dir = 'train'
validation_data_dir = 'validation'
nb_train_samples = 21558
nb_validation_samples = 3446
epochs = 30
batch_size = 32
img_width, img_height = 150, 150  # assumed from the printed shape (?, 150, 150, 3)

if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)

model = Sequential()
model.add(Lambda(mkwtarray, input_shape=input_shape, output_shape=wtoutshape))
<more random layers>
Much to my amazement, as I was defining the model (that is, evaluating the lines above), it errored out, claiming:
ValueError: Input array has fewer dimensions than the specified axes
Also, the print statements fired, printing the expected values 0 and (?, 150, 150, 3), which means that the function was actually evaluated at definition time, not when the model was running. I am obviously missing something about Keras' Lambda functionality; any enlightenment would be appreciated.
UPDATE: The exact same problem presents itself if you define a layer in the "general" way (via a class, where the lambda body is now in the layer's call function), so this is not Lambda-specific.
This looks like a disastrous mix of NumPy and Keras. Let's look at the two main confusion points:
Once you are inside a Keras model, for example a Lambda layer, you are dealing with tensors and not NumPy arrays. However convenient it would be, you can't use NumPy operations or external libraries inside models. That said, tensor operations are very similar to array operations, for good reason. Because this is your first layer, you can pre-process the data in NumPy and then pass the result into your model; this would work.
Why do the prints fire? There are two main steps in Keras/TensorFlow: 1 -> build the computation graph, 2 -> actually run it. So you are building the graph and your operations do get called, but they create symbolic tensors that hold no values. You can therefore print the shape, which can be determined while building the graph, but not, for example, the values a tensor holds.
Take-away message: don't mix NumPy with TensorFlow inside computation graphs (models), and by all means print shapes while building the graph to get an idea of what it looks like, but you won't get anything more out of symbolic tensors at build time.
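A minimal sketch of that pre-processing route (assuming channels-last NumPy arrays of shape (n, 150, 150, 3) and the asker's mkwtarray from above):

import numpy as np

def preprocess_batch(images):
    # Apply the wavelet transform to each image outside the graph, in NumPy.
    return np.stack([mkwtarray(img) for img in images])

# x_wt = preprocess_batch(x_train)
# model.fit(x_wt, y_train, ...)  # the Sequential model then starts at Conv2D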
Maybe it's a little late, but this week I've been having a similar problem and managed to solve it.
To fix the problem I stopped using Lambda layers and instead created my own layer.
You can see how it works in my GitHub or Hugging Face repository.
GitHub: https://github.com/FernandoPerezLara/image-preprocessing-layer
Hugging Face: https://huggingface.co/fernandoperlar/preprocessing_image
I hope it at least solves the problem for some future person.
/ Fernando

How to use OpenCV functions in Keras Lambda Layer?

I am trying to use a function that applies some OpenCV operations to an image. But the data I am getting is a tensor and I am not able to convert it into an image.
def image_func(img):
    img = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
    img = cv2.resize(img, (200, 66))
    return img

model = Sequential()
model.add(Lambda(image_func, input_shape=(r, c, ch), output_shape=(r, c, ch)))
When I run this snippet it throws an error in the cvtColor function saying that img is not a numpy array. I printed out img and it seemed to be a tensor.
I do not know how to change the tensor to an image and then return the tensor as well. I want the model to have this layer.
If I cannot achieve this with a lambda layer what else can I do?
You're confusing the symbolic operations in the Lambda layer with the numerical operations in a plain Python function.
Basically, your custom operation accepts numerical inputs but not symbolic ones. To fix this, what you need is something like tf.py_func in TensorFlow.
In addition, you have not considered backpropagation. In short, although this layer is non-parametric and non-learnable, you need to take care of its gradient as well.
import tensorflow as tf
import numpy as np
import cv2
from keras.layers import Input, Conv2D, Lambda, Layer
from keras.models import Model
from keras import backend as K

def image_func(img):
    img = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
    img = cv2.resize(img, (200, 66))
    return img.astype('float32')

def image_tensor_func(img4d):
    results = []
    for img3d in img4d:
        rimg3d = image_func(img3d)
        results.append(np.expand_dims(rimg3d, axis=0))
    return np.concatenate(results, axis=0)

class CustomLayer(Layer):
    def call(self, xin):
        xout = tf.py_func(image_tensor_func,
                          [xin],
                          'float32',
                          stateful=False,
                          name='cvOpt')
        xout = K.stop_gradient(xout)  # explicitly set no grad
        xout.set_shape([xin.shape[0], 66, 200, xin.shape[-1]])  # explicitly set output shape
        return xout

    def compute_output_shape(self, sin):
        return (sin[0], 66, 200, sin[-1])

x = Input(shape=(None, None, 3))
f = CustomLayer(name='custom')(x)
y = Conv2D(1, (1, 1), padding='same')(f)  # feed the custom layer's output onward
model = Model(inputs=x, outputs=y)
print(model.summary())
Now you can test this layer with some dummy data.
a = np.random.randn(2, 100, 200, 3)
b = model.predict(a)
print(b.shape)
model.compile('sgd', loss='mse')
model.fit(a, b)
I'm going to assume image_func does what you want (converts the color and resizes the image). Note that an image is represented by a NumPy array. Since you are using the TensorFlow backend, you are operating over tensors (this you knew).
The job now is to convert a tensor to a NumPy array. To do that we need to evaluate the tensor, and in order to do that we need to grab a TensorFlow session.
Use the get_session() method of the Keras backend module to grab the current TensorFlow session.
Here is the docstring for get_session():
def get_session():
    """Returns the TF session to be used by the backend.
    If a default TensorFlow session is available, we will return it.
    Else, we will return the global Keras session.
    If no global Keras session exists at this point:
    we will create a new global session.
    Note that you can manually set the global session
    via `K.set_session(sess)`.
    # Returns
        A TensorFlow session.
    """
So try:
def image_func(img):
    from keras import backend as K
    sess = K.get_session()
    img = sess.run(img)  # now img is a proper numpy array
    img = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
    img = cv2.resize(img, (200, 66))
    return img
Note: I haven't been able to test this.
EDIT: I just tested this and it won't work (as you noticed). The Lambda layer needs to return a tensor, and computation flows through that tensor, so it also needs to be smooth in the sense of differentiation.
I see that essentially the Lambda is just changing the color space and resizing the image, so why don't you do this in a pre-processing step?
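A minimal sketch of that pre-processing step (my own suggestion; it assumes BGR input arrays as produced by cv2.imread):

import cv2
import numpy as np

def preprocess(images):
    # Convert each BGR image to YUV and resize to 66x200 before training.
    out = [cv2.resize(cv2.cvtColor(img, cv2.COLOR_BGR2YUV), (200, 66))
           for img in images]
    return np.stack(out).astype('float32')

# x_train = preprocess(raw_images)
# the model then takes input_shape=(66, 200, ch) with no Lambda layer needed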
