Keras with CNTK backend: Writing custom layers

I'm trying to write a custom layer in Keras to replicate a particular architecture proposed in a paper. The layer has no trainable weights, which I believe is relevant here, since it means it shouldn't be necessary to extend the Layer class.
I'm using the CNTK backend, but I'm trying to keep the code as backend-agnostic as possible, so I'm relying on the interfaces defined in keras.backend, instead of directly using CNTK.
Right now I'm just trying to get a small example to work. The example is as follows:
import numpy as np
from scipy.misc import imread
from keras import backend as K
im = imread('test.bmp')
#I'm extending a grayscale image to behave as a color image
ex_im = np.empty([im.shape[0],im.shape[1],3])
ex_im[:,:,0] = im
ex_im[:,:,1] = im
ex_im[:,:,2] = im
conv_filter = K.ones([3,3,ex_im.shape[2],ex_im.shape[2]])
x = K.conv2d(ex_im,conv_filter,padding='same')
This code, however, results in the following error:
RuntimeError: Convolution currently requires the main operand to have
dynamic axes
CNTK requires the input to the convolution to have dynamic axes, otherwise it would interpret the first dimension of the input as the batch size. So I tried to make the axes dynamic with placeholders (the only way I could find of doing so):
import numpy as np
from scipy.misc import imread
from keras import backend as K
im = imread('test.bmp')
ex_im = np.empty([1,im.shape[0],im.shape[1],3])
ex_im[0,:,:,0] = im
ex_im[0,:,:,1] = im
ex_im[0,:,:,2] = im
place = K.placeholder(shape=((None,) + ex_im.shape[1:]))
conv_filter = K.ones([3,3,ex_im.shape[3],ex_im.shape[3]])
x = K.conv2d(place,conv_filter,padding='same')
The image is now an array of images, with what is basically a batch size of 1.
This works correctly. However, I can't figure out how to feed an input to the placeholder in order to test my code. eval() doesn't take any arguments, and there doesn't seem to be a way to pass the input as an argument to the evaluation.
Is there a way to do this without placeholders? Or a way to feed the inputs to the placeholder? Am I doing something fundamentally wrong and should be following another path?
I should add that I really want to avoid being locked in to a backend, so any solutions should be backend-agnostic.

When writing custom layers, you don't define the tensors yourself; let Keras do it for you. Just create the layer, and whatever is passed to it will already be a proper tensor:
import numpy as np
from keras import backend as K
from keras.layers import Input, Lambda
from keras.models import Model

images = np.ones((1,50,50,3))

def myFunc(x):
    # A fixed (non-trainable) 3x3 all-ones filter over 3 input/output channels
    conv_filter = K.ones([3,3,3,3])
    return K.conv2d(x, conv_filter, padding='same')

inp = Input((50,50,3))
out = Lambda(myFunc, output_shape=(50,50,3))(inp)
model = Model(inp, out)
print(model.predict(images))
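If you do still want to evaluate backend tensors directly (for example to sanity-check the placeholder version from the question), K.function is the backend-agnostic way to feed values into a placeholder. A minimal sketch, reusing place, x and ex_im from the second snippet above (I haven't verified this against every CNTK corner case):
# Build a callable that maps the placeholder to the convolution output
conv_fn = K.function([place], [x])
# Feed one batch of shape (1, H, W, 3) and take the first (only) output
result = conv_fn([ex_im])[0]
print(result.shape)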

Related

Save a block diagram of model in PyTorch [duplicate]

import torch
import torch.nn as nn
import torch.optim as optim
import torch.utils.data as data
import torchvision.models as models
import torchvision.datasets as dset
import torchvision.transforms as transforms
from torch.autograd import Variable
from torchvision.models.vgg import model_urls
from torchviz import make_dot
batch_size = 3
learning_rate = 0.0002
epoch = 50
resnet = models.resnet50(pretrained=True)
print(resnet)
make_dot(resnet)
I want to visualize the ResNet model from the PyTorch models library. How can I do it? I tried to use torchviz, but it gives an error:
'ResNet' object has no attribute 'grad_fn'
Here are three different graph visualizations using different tools.
In order to generate example visualizations, I'll use a simple RNN to perform sentiment analysis taken from an online tutorial:
class RNN(nn.Module):
    def __init__(self, input_dim, embedding_dim, hidden_dim, output_dim):
        super().__init__()
        self.embedding = nn.Embedding(input_dim, embedding_dim)
        self.rnn = nn.RNN(embedding_dim, hidden_dim)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, text):
        embedding = self.embedding(text)
        output, hidden = self.rnn(embedding)
        return self.fc(hidden.squeeze(0))
Here is the output if you print() the model.
RNN(
  (embedding): Embedding(25002, 100)
  (rnn): RNN(100, 256)
  (fc): Linear(in_features=256, out_features=1, bias=True)
)
Below are the results from three different visualization tools.
For all of them, you need to have dummy input that can pass through the model's forward() method. A simple way to get this input is to retrieve a batch from your Dataloader, like this:
batch = next(iter(dataloader_train))
yhat = model(batch.text) # Give dummy batch to forward().
Torchviz
https://github.com/szagoruyko/pytorchviz
I believe this tool generates its graph using the backwards pass, so all the boxes use the PyTorch components for back-propagation.
from torchviz import make_dot
make_dot(yhat, params=dict(list(model.named_parameters()))).render("rnn_torchviz", format="png")
This tool produces the following output file:
This is the only output that clearly mentions the three layers in my model, embedding, rnn, and fc. The operator names are taken from the backward pass, so some of them are difficult to understand.
HiddenLayer
https://github.com/waleedka/hiddenlayer
This tool uses the forward pass, I believe.
import hiddenlayer as hl
transforms = [ hl.transforms.Prune('Constant') ] # Removes Constant nodes from graph.
graph = hl.build_graph(model, batch.text, transforms=transforms)
graph.theme = hl.graph.THEMES['blue'].copy()
graph.save('rnn_hiddenlayer', format='png')
Here is the output. I like the shade of blue.
I find that the output has too much detail and obfuscates my architecture. For example, why is unsqueeze mentioned so many times?
Netron
https://github.com/lutzroeder/netron
This tool is a desktop application for Mac, Windows, and Linux. It relies on the model being first exported into ONNX format. The application then reads the ONNX file and renders it. There is then an option to export the model to an image file.
input_names = ['Sentence']
output_names = ['yhat']
torch.onnx.export(model, batch.text, 'rnn.onnx', input_names=input_names, output_names=output_names)
Here's what the model looks like in the application. I think this tool is pretty slick: you can zoom and pan around, and you can drill into the layers and operators. The only negative I've found is that it only does vertical layouts.
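As an aside, if I remember correctly Netron also ships as a pip package, so the same ONNX file can be opened without the desktop app; a minimal sketch (assuming the netron package is installed):
import netron
# Serve the exported ONNX file in a local browser-based viewer
netron.start('rnn.onnx')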
make_dot expects a variable (i.e., a tensor with a grad_fn), not the model itself.
Try:
x = torch.zeros(1, 3, 224, 224, dtype=torch.float, requires_grad=False)
out = resnet(x)
make_dot(out) # plot graph of variable, not of a nn.Module
You can have a look at PyTorchViz (https://github.com/szagoruyko/pytorchviz), "A small package to create visualizations of PyTorch execution graphs and traces."
Here is how you do it with torchviz if you want to save the image:
# http://www.bnikolic.co.uk/blog/pytorch-detach.html
import torch
from torchviz import make_dot
x = torch.ones(10, requires_grad=True)
weights = {'x': x}
y = x**2
z = x**3
r = (y + z).sum()
make_dot(r).render("attached", format="png")
Screenshot of the image you get:
source: http://www.bnikolic.co.uk/blog/pytorch-detach.html
This might be a late answer, but now that __torch_function__ has been developed, it is possible to get a better visualization. You can try my project, torchview.
For your example of resnet50, you can check the Colab notebook here,
where I demonstrate visualization of the resnet18 model. The image of resnet18 is produced by the following code:
import torchvision
from torchview import draw_graph

model_graph = draw_graph(torchvision.models.resnet18(), input_size=(1,3,224,224), expand_nested=True)
model_graph.visual_graph
It also accepts a wide range of output/input types (e.g. list, dictionary)

Variable size inputs for CNTK in Keras

I want to feed a CNN with images of different resolutions using Keras. Thus, I defined the input layer shape as (None,None,3), since the images have 3 channels. My problem is that this works well on TensorFlow, but gives an error on CNTK (and I must use CNTK).
The following python code illustrates my problem:
import numpy as np
from keras.models import Model
from keras.layers import Conv2D, Input
input_layer = Input(shape=(None,None,3),name='input')
x = Conv2D(16,3)(input_layer)
x = Conv2D(16,3)(x)
model = Model(input=input_layer,output=x)
model.compile('adam','mse')
X = np.random.random((1,32,32,3))
Y = model.predict(X)
print(Y.shape)
If I run using Keras+TensorFlow it will execute nicely, however changing Keras backend to CNTK gives the error:
ValueError: Convolution operation requires that kernel dim 3 <= input dim 1.
As far as I could find on the internet, this problem should have been fixed in CNTK 2.2 and later; however, I'm using CNTK 2.5. Any ideas on how I can overcome this issue?

Train convolutional neural network with theano/lasagne

I'm trying to implement a CNN using theano/lasagne.
I've made a neural network but can't figure out how to train it with the current state.
This is how I'm trying to get the output of the network with the current_states as input.
train = theano.function([input_var], lasagne.layers.get_output(l.out))
output = train(current_states)
However I get this error:
theano.compile.function_module.UnusedInputError: theano.function was asked to create a function computing outputs given certain inputs, but the provided input variable at index 0 is not part of the computational graph needed to compute the outputs: inputs.
To make this error into a warning, you can pass the parameter on_unused_input='warn' to theano.function. To disable it completely, use on_unused_input='ignore'.
Why is current_states not used?
I want to get the output of the model on the current_states. How do I do this?
(the CNN build code: http://pastebin.com/Gd35RncU)
The following code snippet works for me:
import lasagne, theano
import theano.tensor as T
import numpy as np
input_var = theano.tensor.tensor4('inputs')
l_out = build_cnn(input_var)
train = theano.function([input_var], lasagne.layers.get_output(l_out))
x = np.random.randn(10, 4, 80, 80).astype(theano.config.floatX)
train(x)
You didn't post your entire code, but check whether your script passes the input_var variable to your build_cnn function. If it does not, then input_var will not be part of your computational graph, which is why Theano raises the error.
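For reference, here is a minimal sketch of what that wiring usually looks like; the layer sizes are only illustrative and build_cnn stands in for the construction function from the question's pastebin:
import theano.tensor as T
import lasagne

def build_cnn(input_var=None):
    # The input variable must be attached to the InputLayer so that it becomes
    # part of the computational graph returned by get_output()
    network = lasagne.layers.InputLayer(shape=(None, 4, 80, 80), input_var=input_var)
    network = lasagne.layers.Conv2DLayer(network, num_filters=16, filter_size=(3, 3))
    return network

input_var = T.tensor4('inputs')
l_out = build_cnn(input_var)  # input_var is now part of the graph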

Caffe - inconsistency in the activation feature values - GPU mode

Hi, I am using Caffe on Ubuntu 14.04 with:
CUDA version 7.0 (latest)
cuDNN version 2 (latest)
GPU: NVIDIA GT 730
In Caffe, I first perform the initialization and then load the ImageNet model (AlexNet). I also initialize the GPU using set_mode_gpu().
After that I take an image, copy it onto the Caffe source blob, and then perform a forward pass for this image using net.forward(end='fc7').
Then I extract the 4096-dimensional fc7 output (the activation features of the fc7 layer).
The problem I am facing is that when I run the same code multiple times, I obtain a different result every time. That is, in GPU mode, the activation features are different for the same image on every run. In a forward pass, the function of the network is supposed to be deterministic, right? So I should get the same output every time for the same image.
On the other hand, when I run Caffe on the CPU using set_mode_cpu(), everything works perfectly, i.e., I get the same output each time.
The code used and the outputs obtained are shown below. I am not able to understand what the problem is. Is the problem caused by GPU rounding? But the errors are very large. Or is it due to some issue with the latest cuDNN version? Or is it something else altogether?
The following is the code:
1) IMPORT libraries
from cStringIO import StringIO
import numpy as np
import scipy.ndimage as nd
import PIL.Image
from IPython.display import clear_output, Image, display
from google.protobuf import text_format
import scipy
import matplotlib.pyplot as plt
import caffe
2) IMPORT Caffe Models and define utility functions
model_path = '../../../caffe/models/bvlc_alexnet/'
net_fn = model_path + 'deploy.prototxt'
param_fn = model_path + 'bvlc_reference_caffenet.caffemodel'
model = caffe.io.caffe_pb2.NetParameter()
text_format.Merge(open(net_fn).read(), model)
model.force_backward = True
open('tmp.prototxt', 'w').write(str(model))
net = caffe.Classifier('tmp.prototxt', param_fn,
mean = np.float32([104.0, 116.0, 122.0]), # ImageNet mean, training set dependent
channel_swap = (2,1,0),# the reference model has channels in BGR order instead of RGB
image_dims=(227, 227))
caffe.set_mode_gpu()
# caffe.set_mode_cpu()
# a couple of utility functions for converting to and from Caffe's input image layout
def preprocess(net, img):
    return np.float32(np.rollaxis(img, 2)[::-1]) - net.transformer.mean['data']

def deprocess(net, img):
    return np.dstack((img + net.transformer.mean['data'])[::-1])
3) LOADING Image and setting constants
target_img = PIL.Image.open('alpha.jpg')
target_img = target_img.resize((227,227), PIL.Image.ANTIALIAS)
target_img=np.float32(target_img)
target_img=preprocess(net, target_img)
end='fc7'
4) Setting the source image and making the forward pass to obtain fc7 activation features
src = net.blobs['data']
src.reshape(1,3,227,227) # resize the network's input image size
src.data[0] = target_img
dst = net.blobs[end]
net.forward(end=end)
target_data = dst.data[0]
print dst.data
The following is the output I obtained from 'print dst.data' when I ran the above code multiple times.
output on 1st execution of code
[[-2.22313166 -1.66219997 -1.67641115 ..., -3.62765646 -2.78621101
-5.06158161]]
output on 2nd execution of code
[[ -82.72431946 -372.29296875 -160.5559845 ..., -367.49728394 -138.7151947
-343.32080078]]
output on 3rd execution of code
[[-10986.42578125 -10910.08105469 -10492.50390625 ..., -8597.87011719
-5846.95898438 -7881.21923828]]
output on 4th execution of code
[[-137360.3125 -130303.53125 -102538.78125 ..., -40479.59765625
-5832.90869141 -1391.91259766]]
The output values keep becoming larger and larger and then again become smaller after some time. I am not able to understand the issue.
Switch your network to test mode to prevent the effect of dropout, which is non-deterministic and only needed in training mode.
Add the following line right after initializing your network:
net.set_phase_test()
That way you'll always get the same results.
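As a side note, newer versions of pycaffe removed set_phase_test(); there the phase is passed when the net is constructed instead. A hedged sketch of that variant, reusing the file names from the question:
import caffe
# Loading the net in TEST phase disables dropout at inference time
net = caffe.Net('tmp.prototxt', param_fn, caffe.TEST)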

How to get a layer from a caffe model using torch

In Python, when I want to get the data from a layer using Caffe, I have the following code:
input_image = caffe.io.load_image(imgName)
input_oversampled = caffe.io.resize_image(input_image, self.net.crop_dims)
prediction = self.net.predict([input_image])
caffe_input = np.asarray(self.net.preprocess('data', prediction))
self.net.forward(data=caffe_input)
data = self.net.blobs['fc7'].data[4]  # I want to get this value in Lua
However, when I'm using Torch I'm a bit stuck, since I don't know how to perform the same action.
Currently I have the following code:
require 'caffe'
require 'image'
net = caffe.Net('/opt/caffe/models/bvlc_reference_caffenet/deploy.prototxt', '/opt/caffe/models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel')
img = image.lena()
dest = torch.Tensor(3, 227,227)
img = image.scale(dest, img)
img = img:resize(10,3,227,227)
output = net:forward(img:float())
conv_nodes = net:findModules('fc7') -- not working
Any help would be appreciated
First of all, please note that torch-caffe-binding (i.e. the tool you use with require 'caffe') is a direct wrapper around the Caffe library, built on the LuaJIT FFI.
This means that it allows you to conveniently run a forward or backward pass with a Torch tensor, but behind the scenes these operations are performed on a caffe::Net, not on a Torch nn network.
So if you want to manipulate a plain Torch network what you should use is the loadcaffe library which fully converts the network into a nn.Sequential:
require 'loadcaffe'
local net = loadcaffe.load('net.prototxt', 'net.caffemodel')
Then you can use findModules. However, please note that you cannot use the layers' original labels anymore (like conv1 or fc7), as they are discarded during conversion.
Here fc7 (an INNER_PRODUCT layer) corresponds to the second-to-last linear transformation, so you can get it as follows:
local nodes = net:findModules('nn.Linear')
local fc7 = nodes[#nodes-1]
Then you can read the data (weights and biases) via fc7.weight and fc7.bias - these are regular torch.Tensor-s.
UPDATE
As of commit 2516fac, loadcaffe also saves the layer names. So to retrieve the 'fc7' layer you can now do something like:
local fc7
for _, m in pairs(net:listModules()) do
    if m.name == 'fc7' then
        fc7 = m
        break
    end
end
