Save a block diagram of model in PyTorch [duplicate] - python

import torch
import torch.nn as nn
import torch.optim as optim
import torch.utils.data as data
import torchvision.models as models
import torchvision.datasets as dset
import torchvision.transforms as transforms
from torch.autograd import Variable
from torchvision.models.vgg import model_urls
from torchviz import make_dot
batch_size = 3
learning_rate =0.0002
epoch = 50
resnet = models.resnet50(pretrained=True)
print(resnet)
I want to visualize resnet from the pytorch models. How can I do it? I tried to use torchviz but it gives an error:
'ResNet' object has no attribute 'grad_fn'

Here are three different graph visualizations using different tools.
In order to generate example visualizations, I'll use a simple RNN to perform sentiment analysis taken from an online tutorial:
class RNN(nn.Module):
    def __init__(self, input_dim, embedding_dim, hidden_dim, output_dim):
        super().__init__()  # must run before submodules are assigned
        self.embedding = nn.Embedding(input_dim, embedding_dim)
        self.rnn = nn.RNN(embedding_dim, hidden_dim)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, text):
        embedding = self.embedding(text)
        output, hidden = self.rnn(embedding)
        return self.fc(hidden.squeeze(0))
Here is the output if you print() the model.
RNN(
  (embedding): Embedding(25002, 100)
  (rnn): RNN(100, 256)
  (fc): Linear(in_features=256, out_features=1, bias=True)
)
Below are the results from three different visualization tools.
For all of them, you need a dummy input that can pass through the model's forward() method. A simple way to get this input is to retrieve a batch from your DataLoader, like this:
batch = next(iter(dataloader_train))
yhat = model(batch.text) # Give dummy batch to forward().
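If no DataLoader is at hand, a random batch of token indices can stand in as the dummy input. This is only a minimal sketch: it assumes the RNN class above with the dimensions from the printed model, and the sequence length and batch size are arbitrary values picked for illustration.

```python
import torch
import torch.nn as nn


class RNN(nn.Module):
    def __init__(self, input_dim, embedding_dim, hidden_dim, output_dim):
        super().__init__()
        self.embedding = nn.Embedding(input_dim, embedding_dim)
        self.rnn = nn.RNN(embedding_dim, hidden_dim)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, text):
        embedding = self.embedding(text)
        output, hidden = self.rnn(embedding)
        return self.fc(hidden.squeeze(0))


vocab_size = 25002  # matches Embedding(25002, 100) in the printout
model = RNN(vocab_size, 100, 256, 1)

# Assumed layout: (sequence length, batch size), because nn.RNN
# defaults to batch_first=False. Both sizes are arbitrary.
seq_len, batch_size = 12, 4
dummy_text = torch.randint(0, vocab_size, (seq_len, batch_size))

yhat = model(dummy_text)  # one score per sequence: shape (batch_size, 1)
```

The resulting yhat carries a grad_fn, so it can be handed to the visualization tools in the same way as the output computed from a real batch.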
The first tool is torchviz. I believe it generates its graph using the backward pass, so all the boxes use the PyTorch components for back-propagation.
from torchviz import make_dot
make_dot(yhat, params=dict(list(model.named_parameters()))).render("rnn_torchviz", format="png")
This tool produces the following output file:
This is the only output that clearly mentions the three layers in my model, embedding, rnn, and fc. The operator names are taken from the backward pass, so some of them are difficult to understand.
The second tool is HiddenLayer, which uses the forward pass, I believe.
import hiddenlayer as hl
transforms = [ hl.transforms.Prune('Constant') ] # Removes Constant nodes from graph.
graph = hl.build_graph(model, batch.text, transforms=transforms)
graph.theme = hl.graph.THEMES['blue'].copy()
graph.save('rnn_hiddenlayer', format='png')
Here is the output. I like the shade of blue.
I find that the output has too much detail and obfuscates my architecture. For example, why is unsqueeze mentioned so many times?
The third tool, Netron, is a desktop application for Mac, Windows, and Linux. It relies on the model first being exported to ONNX format. The application then reads the ONNX file and renders it, and offers an option to export the rendered model to an image file.
input_names = ['Sentence']
output_names = ['yhat']
torch.onnx.export(model, batch.text, 'rnn.onnx', input_names=input_names, output_names=output_names)
Here's what the model looks like in the application. I think this tool is pretty slick: you can zoom and pan around, and you can drill into the layers and operators. The only negative I've found is that it only does vertical layouts.

make_dot expects a variable (i.e., a tensor with a grad_fn), not the model itself.
x = torch.zeros(1, 3, 224, 224, dtype=torch.float, requires_grad=False)
out = resnet(x)
make_dot(out) # plot graph of variable, not of a nn.Module
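The difference between the module and its output can be checked without torchviz. A minimal sketch, using a small nn.Linear as a stand-in for resnet (an assumption made to keep the example light):

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 2)   # small stand-in for resnet
x = torch.zeros(1, 8)     # the input itself does not need requires_grad
out = model(x)

# An nn.Module is not a tensor, so it has no grad_fn attribute.
module_has_grad_fn = hasattr(model, "grad_fn")

# The output was computed from parameters that require gradients,
# so autograd attached a grad_fn to it.
output_has_grad_fn = out.grad_fn is not None

print(module_has_grad_fn, output_has_grad_fn)
```

This is why passing the model raises the error from the question, while passing the result of a forward call works.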

You can have a look at PyTorchViz, "A small package to create visualizations of PyTorch execution graphs and traces."

Here is how you do it with torchviz if you want to save the image:
import torch
from torchviz import make_dot
x=torch.ones(10, requires_grad=True)
weights = {'x':x}
r = (x**2 + x**3).sum()  # example result; any tensor computed from x works
make_dot(r).render("attached", format="png")
screenshot of image you get:

This might be a late answer, but now that __torch_function__ is available, better visualizations are possible. You can try my project, torchview.
For your resnet50 example, you can check the Colab notebook, where I demonstrate visualization of a resnet18 model. The image of resnet18 is produced by the following code:
from torchvision.models import resnet18
from torchview import draw_graph
model_graph = draw_graph(resnet18(), input_size=(1,3,224,224), expand_nested=True)
model_graph.visual_graph  # graphviz object; displays inline in a notebook
It also accepts a wide range of output/input types (e.g. list, dictionary)


Difference between Experimental Preprocessing layers and normal preprocessing layers in Tensorflow

import tensorflow as tf
import keras
import tensorflow.keras.layers as tfl
from tensorflow.keras.layers.experimental.preprocessing import RandomFlip, RandomRotation
I am trying to figure out which I should use for Data Augmentation. In the documentation, there is:
tf.keras.layers.RandomFlip and RandomRotation
Then, in tf.keras.layers.experimental.preprocessing, we have the same things: RandomFlip and RandomRotation.
Which should I use? I've seen guides that use both.
This is my current code:
def data_augmenter():
    data_augmentation = tf.keras.Sequential([
        RandomFlip('horizontal'),  # illustrative argument
        RandomRotation(0.2),       # illustrative argument
    ])
    return data_augmentation
and this is a part of my model:
def ResNet50(image_shape = IMG_SIZE, data_augmentation=data_augmenter()):
    input_shape = image_shape + (3,)
    # Remove top layer in order to put mine with the correct classification labels, get weights for imageNet
    base_model = tf.keras.applications.resnet_v2.ResNet50V2(input_shape=input_shape, include_top=False, weights='imagenet')
    # Freeze base model
    base_model.trainable = False
    # Define input layer
    inputs = tf.keras.Input(shape=input_shape)
    # Apply Data Augmentation
    x = data_augmentation(inputs)
I am a bit confused here..
If you find something in an experimental module and something in the same package by the same name, these will typically be aliases of one another. For the sake of backwards compatibility, they don't remove the experimental one (at least not for a few iterations.)
You should generally use the non-experimental one if it exists, since this is considered stable and should not be removed or changed later.
The following page shows the Keras experimental preprocessing module. If it redirects to the preprocessing module, it's an alias.
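Whether two dotted names point at the same object can also be checked directly in Python. The helper below is a hypothetical sketch (resolve and is_alias are names made up for this example, not part of TensorFlow); it is demonstrated on the standard library, where os.path is itself an alias of a platform-specific module.

```python
import importlib
import os


def resolve(dotted_name):
    """Import the longest importable module prefix, then walk the remaining attributes."""
    parts = dotted_name.split(".")
    for i in range(len(parts), 0, -1):
        try:
            obj = importlib.import_module(".".join(parts[:i]))
        except ImportError:
            continue  # not a module; try a shorter prefix
        try:
            for attr in parts[i:]:
                obj = getattr(obj, attr)
        except AttributeError:
            return None  # the module exists, but the attribute path does not
        return obj
    return None


def is_alias(name_a, name_b):
    """True/False if both names resolve; None if either one cannot be found."""
    a, b = resolve(name_a), resolve(name_b)
    if a is None or b is None:
        return None
    return a is b


# os.path is an alias of posixpath or ntpath, depending on the platform.
print(is_alias("os.path", os.path.__name__))
print(is_alias("math.sin", "math.cos"))
```

For the layers in the question, the call would be is_alias("tensorflow.keras.layers.RandomFlip", "tensorflow.keras.layers.experimental.preprocessing.RandomFlip"). A result of None would mean that one of the two paths does not exist in the installed version.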

How to get a particular layer output of a pretrained VGG16 in pytorch

I am very new to pytorch and I am trying to get the output of the pretrained model VGG16 feature vector in 1*4096 format which is returned by the layers just before the final layer. I found that there are similar features available in keras. Is there any direct command in pytorch for the same?
The code I am using:
import torch
from torch import nn
from torch import optim
import torch.nn.functional as F
from torchvision import models, transforms
from torch.autograd import Variable
from PIL import Image
image1 = Image.open(r"C:\Users\user\Pictures\user.png")  # raw string keeps the backslashes literal
model = models.vgg16(pretrained=True)
scaler = transforms.Resize((224, 224))
to_tensor = transforms.ToTensor()
img = to_tensor(scaler(image1)).unsqueeze(0)
Part of the network responsible for creating features is named... features (not only in VGG, it's like that for most of the pretrained networks inside torchvision).
Just use this field and pass your image like this:
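import torch
from PIL import Image
from torchvision import models, transforms
image = Image.open(r"C:\Users\user\Pictures\user.png")  # raw string keeps the backslashes literal
# Get features part of the network
model = models.vgg16(pretrained=True).features
tensor = transforms.ToTensor()(transforms.Resize((224, 224))(image)).unsqueeze(dim=0)
model(tensor)  # convolutional feature maps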
To see what happens inside any torchvision model, you can check its source code. For VGG (any variant), there is a base class at the top of this file.
To get the 4096 flattened features, you could use operations similar to those defined in forward:
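# `tensor` is the preprocessed image from the previous code snippet
vgg = models.vgg16(pretrained=True)  # full network, not only .features
x = vgg.features(tensor)
x = vgg.avgpool(x)
x = torch.flatten(x, 1)
final_x = vgg.classifier[0](x) # only first classifier layer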
You could also iterate over modules or children up to wherever you want and output the result (or results or however you want)
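Both ways of stopping at an intermediate layer can be sketched on a small stand-in network (an assumption, used here so that no pretrained weights are needed). The first option slices the children, the second registers a forward hook. Note that torchvision's VGG flattens with torch.flatten inside forward rather than with a module, so for that model the hook variant is likely the safer of the two.

```python
import torch
import torch.nn as nn

# Small stand-in with a VGG-like layout: conv features -> pooling -> flatten -> classifier.
net = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d((2, 2)),
    nn.Flatten(),
    nn.Linear(4 * 2 * 2, 8),
    nn.ReLU(),
    nn.Linear(8, 5),
)
net.eval()
x = torch.zeros(1, 3, 16, 16)

# Option 1: keep only the children up to and including the first Linear layer.
truncated = nn.Sequential(*list(net.children())[:5])
with torch.no_grad():
    feats_truncated = truncated(x)

# Option 2: register a forward hook on the layer of interest and run the full model.
captured = {}


def save_output(module, inputs, output):
    captured["fc1"] = output.detach()


handle = net[4].register_forward_hook(save_output)
with torch.no_grad():
    net(x)
handle.remove()  # remove the hook so later forward passes are unaffected
```

Both feats_truncated and captured["fc1"] hold the output of the first Linear layer for the same input.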

How can I make inferences using the Tensorflow Cifar10 tutorial code?

I am an absolute beginner to TensorFlow.
If I have a picture (or set of pictures) that I would like to attempt to classify using the code from the Cifar10 TensorFlow tutorial, how would I do so?
I have absolutely no idea where to start.
1. Train the model using the base CIFAR10 dataset exactly as per the tutorial.
2. Create a new graph with your own inputs. It is probably easiest to just use a tf.placeholder and feed the data as per below, but there are lots of other ways.
3. Start a session and load the previously saved weights.
4. Run the session (with a feed_dict if you're using a placeholder as above).
import numpy as np
import tensorflow as tf
import cifar10  # assumes the tutorial's cifar10.py is importable
train_dir = '/tmp/cifar10_train' # or use FLAGS as in the train example
batch_size = 8
height = 32
width = 32
image = tf.placeholder(shape=(batch_size, height, width, 3), dtype=tf.uint8)
std_img = tf.image.per_image_standardization(image)
logits = cifar10.inference(std_img)
predictions = tf.argmax(logits, axis=-1)
def get_image_data_batches():
    n_batchs = 100
    for i in range(n_batchs):
        # random stand-in data; replace with your own images
        yield (np.random.uniform(size=(batch_size, height, width, 3)) * 255).astype(np.uint8)
def do_stuff_with(logit_vals, prediction_vals):
    pass  # placeholder: use the results here
with tf.Session() as sess:
    # restore variables
    saver = tf.train.Saver()
    saver.restore(sess, tf.train.latest_checkpoint(train_dir))
    # run inference
    for batch_data in get_image_data_batches():
        logit_vals, prediction_vals = sess.run([logits, predictions], feed_dict={image: batch_data})
        do_stuff_with(logit_vals, prediction_vals)
There are better ways of getting data into the graph, but I believe tf.placeholders are the easiest way for learning and getting something up and running initially.
Also check out tf.estimator.Estimators for a cleaner way of managing sessions. It's very different to the way it's done in this tutorial and -slightly- less flexible, but for standard networks they save you writing a lot of boilerplate code.
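The two array-level steps around the network in the code above, per-image standardization and taking the argmax of the logits, can be sketched in plain NumPy. This is an illustration rather than the tutorial's code: the random logits are a stand-in for the output of cifar10.inference, and the floor on the standard deviation follows the documented behaviour of tf.image.per_image_standardization.

```python
import numpy as np


def per_image_standardization(images):
    """Scale each image of a (batch, H, W, C) array to zero mean and unit variance."""
    images = images.astype(np.float32)
    n = np.prod(images.shape[1:])  # number of values per image
    mean = images.mean(axis=(1, 2, 3), keepdims=True)
    std = images.std(axis=(1, 2, 3), keepdims=True)
    adjusted_std = np.maximum(std, 1.0 / np.sqrt(n))  # avoids dividing by zero on flat images
    return (images - mean) / adjusted_std


rng = np.random.default_rng(0)
batch = (rng.uniform(size=(8, 32, 32, 3)) * 255).astype(np.uint8)
std_batch = per_image_standardization(batch)

fake_logits = rng.normal(size=(8, 10))     # stand-in for the network output (10 classes)
predictions = fake_logits.argmax(axis=-1)  # one class index per image
```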

Keras with CNTK backend: Writing custom layers

I'm trying to write a custom layer in Keras to replicate one particular architecture proposed in a paper. The layer has no trainable weights. I believe this might be relevant, since it wouldn't be necessary to extend the class Layer.
I'm using the CNTK backend, but I'm trying to keep the code as backend-agnostic as possible, so I'm relying on the interfaces defined in keras.backend, instead of directly using CNTK.
Right now I'm just trying to get a small example to work. The example is as follows:
import numpy as np
from scipy.misc import imread
from keras import backend as K
im = imread('test.bmp')
#I'm extending a grayscale image to behave as a color image
ex_im = np.empty([im.shape[0],im.shape[1],3])
ex_im[:,:,0] = im
ex_im[:,:,1] = im
ex_im[:,:,2] = im
conv_filter = K.ones([3,3,ex_im.shape[2],ex_im.shape[2]])
x = K.conv2d(ex_im,conv_filter,padding='same')
This code, however, results in the following error:
RuntimeError: Convolution currently requires the main operand to have
dynamic axes
CNTK requires the input to the convolution to have dynamic axes, otherwise it would interpret the first dimension of the input as the batch size. So I tried to make the axes dynamic with placeholders (the only way I could find of doing so):
import numpy as np
from scipy.misc import imread
from keras import backend as K
im = imread('test.bmp')
ex_im = np.empty([1,im.shape[0],im.shape[1],3])
ex_im[0,:,:,0] = im
ex_im[0,:,:,1] = im
ex_im[0,:,:,2] = im
place = K.placeholder(shape=((None,) + ex_im.shape[1:]))
conv_filter = K.ones([3,3,ex_im.shape[3],ex_im.shape[3]])
x = K.conv2d(place,conv_filter,padding='same')
The image is now an array of images, with what is basically a batch size of 1.
This works correctly. However, I can't figure out how to feed an input to the placeholder in order to test my code. eval() doesn't take any arguments, and there doesn't seem to be a way to pass the input as an argument to the evaluation.
Is there a way to do this without placeholders? Or a way to feed the inputs to the placeholder? Am I doing something fundamentally wrong and should be following another path?
I should add that I really want to avoid being locked in to a backend, so any solutions should be backend-agnostic.
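For custom layers, you don't define the tensors yourself; let Keras do it for you. Just create the layer, and whatever is passed to it will already be a proper tensor:
import numpy as np
from keras import backend as K
from keras.layers import Input, Lambda
from keras.models import Model
images = np.ones((1,50,50,3))
def myFunc(x):
    conv_filter = K.ones([3,3,3,3])
    return K.conv2d(x,conv_filter,padding='same')
inp = Input((50,50,3))
out = Lambda(myFunc, output_shape=(50,50,3))(inp)
model = Model(inp,out)
result = model.predict(images)  # feeds the NumPy array through the layer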
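To sanity-check what such a layer should return, the same all-ones convolution can be written in NumPy. This is a hand-rolled reference under the assumption of zero 'same' padding and channels-last layout; conv2d_same_ones is a name invented for this sketch, not a Keras function.

```python
import numpy as np


def conv2d_same_ones(x, k=3):
    """Convolve a (batch, H, W, C) array with an all-ones k x k x C x C filter, zero 'same' padding."""
    pad = k // 2
    padded = np.pad(x, ((0, 0), (pad, pad), (pad, pad), (0, 0)))
    b, h, w, c = x.shape
    window_sum = np.zeros((b, h, w))
    for dy in range(k):
        for dx in range(k):
            # add the shifted copy of the image, summed over input channels
            window_sum += padded[:, dy:dy + h, dx:dx + w, :].sum(axis=-1)
    # every output channel uses the same all-ones weights, so all channels are equal
    return np.repeat(window_sum[..., None], c, axis=-1)


images = np.ones((1, 50, 50, 3))
out = conv2d_same_ones(images)

print(out.shape)         # (1, 50, 50, 3)
print(out[0, 25, 25, 0]) # interior: 3 * 3 positions * 3 channels = 27.0
print(out[0, 0, 0, 0])   # corner:   2 * 2 positions * 3 channels = 12.0
```

For the all-ones image used in the answer, interior pixels come out as 27, edge pixels as 18 and corner pixels as 12, which gives concrete numbers to compare a layer's output against.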

Caffe - inconsistency in the activation feature values - GPU mode

Hi I am using caffe on Ubuntu 14.04,
CUDA version 7.0 (latest)
cudnn version 2 (latest)
In caffe first I get the initialization done and then I load the imagenet model (Alexnet). I also initialize the gpu using set_mode_gpu()
After that I take an image. I copy this image onto the caffe source blob. Then I perform a forward pass for this image by using : net.forward(end='fc7')
Then I extract the 4096 dimensional fc7 output.(the activation features of the fc7 layer)
The problem I am facing is that when I run the same code multiple times, I obtain a different result every time. That is, in GPU mode, the activation features are different each time for the same image. During a forward pass, the function computed by the network is supposed to be deterministic, right? So I should get the same output every time for the same image.
On the other hand, when I run caffe on the CPU by using set_mode_cpu(), everything works perfectly, i.e., I get the same output each time.
The code used and the outputs obtained are shown below. I am not able to understand what the problem is. Is the problem caused by GPU rounding? But the errors are very large. Or is it due to some issue with the latest cuDNN version? Or is it something else altogether?
Following is the CODE
1) IMPORT libraries
from cStringIO import StringIO
import numpy as np
import scipy.ndimage as nd
import PIL.Image
from IPython.display import clear_output, Image, display
from google.protobuf import text_format
import scipy
import matplotlib.pyplot as plt
import caffe
2) IMPORT Caffe Models and define utility functions
model_path = '../../../caffe/models/bvlc_alexnet/'
net_fn = model_path + 'deploy.prototxt'
param_fn = model_path + 'bvlc_reference_caffenet.caffemodel'
model = caffe.io.caffe_pb2.NetParameter()
text_format.Merge(open(net_fn).read(), model)
model.force_backward = True
open('tmp.prototxt', 'w').write(str(model))
net = caffe.Classifier('tmp.prototxt', param_fn,
                       mean = np.float32([104.0, 116.0, 122.0]), # ImageNet mean, training set dependent
                       channel_swap = (2,1,0), # the reference model has channels in BGR order instead of RGB
                       image_dims=(227, 227))
# caffe.set_mode_cpu()
# a couple of utility functions for converting to and from Caffe's input image layout
def preprocess(net, img):
    return np.float32(np.rollaxis(img, 2)[::-1]) - net.transformer.mean['data']
def deprocess(net, img):
    return np.dstack((img + net.transformer.mean['data'])[::-1])
3) LOADING Image and setting constants
target_img = PIL.Image.open('alpha.jpg')
target_img = target_img.resize((227,227), PIL.Image.ANTIALIAS)
target_img=preprocess(net, target_img)
4) Setting the source image and making the forward pass to obtain fc7 activation features
end = 'fc7'
src = net.blobs['data']
src.reshape(1,3,227,227) # resize the network's input image size
src.data[0] = target_img
dst = net.blobs[end]
net.forward(end=end)
target_data = dst.data[0]
FOLLOWING is the output that I obtained from printing target_data when I ran the above code multiple times
output on 1st execution of code
[[-2.22313166 -1.66219997 -1.67641115 ..., -3.62765646 -2.78621101
output on 2nd execution of code
[[ -82.72431946 -372.29296875 -160.5559845 ..., -367.49728394 -138.7151947
output on 3rd execution of code
[[-10986.42578125 -10910.08105469 -10492.50390625 ..., -8597.87011719
-5846.95898438 -7881.21923828]]
output on 4th execution of code
[[-137360.3125 -130303.53125 -102538.78125 ..., -40479.59765625
-5832.90869141 -1391.91259766]]
The output values keep becoming larger and larger and then again become smaller after some time. I am not able to understand the issue.
Switch your network to test mode to disable dropout, which is non-deterministic and only needed during training.
Add the following line right after initializing your network:
So that you'll always have the same results.
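Why dropout makes repeated forward passes differ can be illustrated outside Caffe. The NumPy sketch below is a stand-in, not Caffe's implementation; it assumes inverted dropout with a rate of 0.5 and uses a vector of ones in place of a real activation vector.

```python
import numpy as np


def dropout(x, rate, training, rng):
    """Inverted dropout: random mask while training, identity at test time."""
    if not training:
        return x
    keep = rng.random(x.shape) >= rate  # a fresh random mask on every call
    return x * keep / (1.0 - rate)


rng = np.random.default_rng()
activations = np.ones(4096, dtype=np.float32)  # stand-in for an fc-layer output

# Training mode: two passes over the same input give different results.
train_a = dropout(activations, 0.5, training=True, rng=rng)
train_b = dropout(activations, 0.5, training=True, rng=rng)

# Test mode: the layer is the identity, so repeated passes agree.
test_a = dropout(activations, 0.5, training=False, rng=rng)
test_b = dropout(activations, 0.5, training=False, rng=rng)

print(np.array_equal(train_a, train_b), np.array_equal(test_a, test_b))
```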
