I am trying to build a custom CNN architecture in PyTorch, with about the same level of control I would have if I wrote the architecture in NumPy alone.
I am new to PyTorch and would like to see some code samples of CNNs implemented without the nn.Module class, if possible.
If you want that level of control, you have to implement the backward() function in your custom class (in PyTorch this means subclassing torch.autograd.Function).
However, from your question it is not clear whether:
- you just need a new series of CNN blocks, in which case you are better off using nn.Module and something like nn.Sequential(nn.Conv2d(...), ...), or
- you just need gradient descent with a hand-written forward pass (see https://github.com/jcjohnson/pytorch-examples#pytorch-autograd), in which case you can work with raw tensors and still let autograd compute the backward pass for you.
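For the second route, here is a minimal sketch of a one-conv-layer CNN with no nn.Module at all; the weights are plain tensors with requires_grad=True and the update loop is written by hand, NumPy-style (all shapes and hyperparameters below are made up for illustration):

    import torch
    import torch.nn.functional as F

    # Hypothetical setup: 1-channel 28x28 inputs (e.g. MNIST), 10 classes.
    w_conv = (0.1 * torch.randn(8, 1, 3, 3)).requires_grad_()   # conv filter bank
    b_conv = torch.zeros(8, requires_grad=True)
    w_fc = (0.1 * torch.randn(8 * 13 * 13, 10)).requires_grad_()
    b_fc = torch.zeros(10, requires_grad=True)
    params = [w_conv, b_conv, w_fc, b_fc]

    def forward(x):
        x = F.relu(F.conv2d(x, w_conv, b_conv))   # 28x28 -> 26x26
        x = F.max_pool2d(x, 2)                    # 26x26 -> 13x13
        x = x.reshape(x.size(0), -1)
        return x @ w_fc + b_fc

    x = torch.randn(32, 1, 28, 28)                # dummy batch
    y = torch.randint(0, 10, (32,))
    for step in range(5):
        loss = F.cross_entropy(forward(x), y)
        loss.backward()                           # autograd computes all gradients
        with torch.no_grad():                     # manual SGD update, NumPy-style
            for p in params:
                p -= 0.1 * p.grad
                p.grad.zero_()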
Related
I'm learning to use PyTorch. If anyone is familiar with PyTorch, could they tell me whether all models can be expressed as nn.Sequential?
I'm asking because some framework features only accept as input a model defined as nn.Sequential.
Yes, but only in the sense that you could wrap a highly complex model as a single step in nn.Sequential. If you want an example of a model that breaks sequential behavior, look up ResNet and its ilk. These require data to be passed around layers via skip connections; however, even those can be implemented with nn.Sequential by creating special nn.Module classes to handle the residual functionality and stacking those blocks together into a sequential container.
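As a concrete sketch of that idea (channel counts are arbitrary), the residual add lives inside a small custom nn.Module, so the resulting blocks compose cleanly in nn.Sequential:

    import torch
    import torch.nn as nn

    class ResidualBlock(nn.Module):
        # Wraps the skip connection so the block exposes a plain y = f(x) interface.
        def __init__(self, channels):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=1),
                nn.ReLU(),
                nn.Conv2d(channels, channels, 3, padding=1),
            )

        def forward(self, x):
            return torch.relu(x + self.body(x))   # residual add hidden inside the module

    # Because each block is now a single Module, the whole network can be sequential.
    model = nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1),
        ResidualBlock(16),
        ResidualBlock(16),
    )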
TensorFlow has some docs for subclassing the (tf.)Keras Model and Layer classes.
However, it is unclear which of the two to use for "modules" or "blocks" (e.g. several layers collectively).
Since such a block is technically several layers, I feel that subclassing Layer would be misleading, and while subclassing Model works, I am unsure whether there are any penalties for doing so.
e.g.
x = inputs
a = self.dense_1(x)   # self.dense_1 = tf.keras.layers.Dense(...)
b = self.dense_2(a)   # self.dense_2 = tf.keras.layers.Dense(...)
c = self.add([x, b])  # self.add = tf.keras.layers.Add()
Which of the two is appropriate to use?
(Please note that this answer is old; Keras later changed to allow and regularly use subclassing.)
Initially, there is no need to subclass anything with Keras. Unless you have a particular reason (something other than building, training, or predicting), subclassing is unnecessary.
Building a Keras model:
Whether you use Sequential (the model is essentially ready, just add layers) or Model (create a graph of layers and finally call Model(inputs, outputs)), you don't need to subclass; both styles are sketched below.
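A minimal sketch of both construction styles, with made-up layer sizes (written with the tf.keras spelling; standalone Keras is analogous):

    from tensorflow import keras
    from tensorflow.keras import layers

    # Sequential: the model is essentially ready; just add layers.
    seq_model = keras.Sequential([
        layers.Dense(64, activation="relu", input_shape=(32,)),
        layers.Dense(10),
    ])

    # Functional: wire a graph of layers, then call Model(inputs, outputs).
    inputs = keras.Input(shape=(32,))
    x = layers.Dense(64, activation="relu", name="hidden")(inputs)
    outputs = layers.Dense(10)(x)
    func_model = keras.Model(inputs, outputs)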
The moment you create an instance of Sequential or Model, you have a fully defined model, ready to use in all situations.
This model can even be used as a part of other models, and its layers can be easily accessed to get intermediate outputs and create new branches in your graph.
So, I don't see any reason at all to subclass Model, unless you are using some additional framework that requires it (I don't think any does). This habit seems to come from PyTorch users, because this kind of model building is typical there: create a subclass of Module, add layers, and write a call/forward method. But PyTorch doesn't offer the same ease as Keras for getting intermediate results.
The main advantage of Keras is exactly this: you can easily access layers from blocks and models and instantly start branching from that point, without needing to rebuild any call methods or add any extra code to the models. So, when you subclass Model, you defeat the purpose of Keras and make everything more difficult, as sketched below.
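Continuing the functional sketch above (the layer name "hidden" was made up there just for this illustration):

    # Grab an intermediate tensor by layer name and branch from that point,
    # with no call method to rewrite.
    hidden = func_model.get_layer("hidden").output
    branch = layers.Dense(5)(hidden)
    branched = keras.Model(func_model.input, [func_model.output, branch])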
The docs you mentioned say:
Model subclassing is particularly useful when eager execution is enabled since the forward pass can be written imperatively.
I don't really understand what "imperatively" means here, and I don't see how it would be easier than just building a model with regular layers.
Another quote from the docs:
Key Point: Use the right API for the job. While model subclassing offers flexibility, it comes at a cost of greater complexity and more opportunities for user errors. If possible, prefer the functional API.
Well... it's always possible.
Subclassing layers
Here, there may be good reasons to do so, namely:
- You want a layer that performs custom calculations that are not available with regular layers.
- This layer must have persistent (trainable) weights.
If you don't need both of these, you don't need to subclass a layer. If you just want custom calculations without weights, a Lambda layer is enough. Both cases are sketched below.
I'm trying to train my LSTM model in TensorFlow, and my module has to calculate a parameter inside another parameter, and I want to train both parameters together.
More details are in the picture below.
I think the TensorFlow LSTM module's input must be a complete sequence, fed in through something like tf.placeholder.
How can I do this in TensorFlow? Or can you recommend another framework better suited than TensorFlow to this task?
Sorry for my poor English.
First of all, your usage of the word "parameter" is quite confusing. Normally, parameters means trainable parameters, i.e. every variable that is updated by the optimizer. There are also so-called hyper-parameters, which have to be set by hand, e.g. the model topology.
TensorFlow works with tensors, which are representations of the data used to build the computation graph; they are filled with data at run time via placeholders, which act as entry points for the data.
Also, if you have trouble building your model in raw TensorFlow, there is Keras. Keras can run with TensorFlow as its backend, but model building is much easier, and it is also available inside the TensorFlow API as tf.keras. In Keras, one or more LSTMs are reduced to a layer that can simply be added to your model, as sketched below.
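For instance, here is a minimal tf.keras sketch in which the LSTM is just a layer; all shapes and hyperparameters below are made up:

    import tensorflow as tf

    # Hypothetical shapes: sequences of 20 time steps with 8 features each,
    # one regression target; all LSTM gate weights are created and trained for you.
    model = tf.keras.Sequential([
        tf.keras.layers.LSTM(32, input_shape=(20, 8)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    # model.fit(x_train, y_train, epochs=10) then updates all parameters jointly.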
If you would like a more specific answer to your question, please provide code describing your problem.
I'm trying to use a TensorFlow op inside a Keras model. I previously tried to wrap it in a Lambda layer, but I believe that disables backpropagation through that layer.
More specifically, I'm trying to use the layers from here in a Keras model, without porting them to Keras layers (I hope to deploy to TensorFlow later on). I can compile these layers into a shared library and load them into Python. That gives me TensorFlow ops, and I don't know how to combine them into a Keras model.
A simple example of a Keras MNIST model, where for example one Conv2D layer is replaced by a tf.nn.conv2d op, would be exactly what I'm looking for.
I've seen this tutorial, but it appears to do the opposite of what I'm looking for: it inserts Keras layers into a TensorFlow graph, whereas I want to insert TensorFlow ops into a Keras model.
Roughly two weeks have passed, and it seems I can answer my own question now.
It turns out that TensorFlow can look up gradients for custom ops if you register them using this decorator. As of writing, this functionality is not (yet) available in C++, which is what I was looking for. A workaround is to define a normal op in C++ and wrap it in a Python function using the mentioned decorator. Once these functions and their corresponding gradients are registered with TensorFlow, backpropagation happens 'automagically'.
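The specific decorator linked above is not reproduced here; as an illustration of the same registration idea, tf.custom_gradient lets you attach a Python-side gradient to an op (the clipping op below is made up):

    import tensorflow as tf

    @tf.custom_gradient
    def clip_pass_through(x):
        # Made-up op: clip in the forward pass, pass the gradient straight through.
        def grad(dy):
            return dy          # the gradient registered for backprop
        return tf.clip_by_value(x, -1.0, 1.0), grad

    # Once TensorFlow knows the gradient, the op can sit inside a Keras model:
    layer = tf.keras.layers.Lambda(clip_pass_through)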
I am trying to create a CNN using Theano and TensorFlow. The layers (MLP, convolvemaxpool) are defined in separate classes, and now I am trying to assemble a CNN from these classes depending on input provided by the user. The CNN should call the Theano or TensorFlow classes depending on that input. Say both Theano and TensorFlow have classes MLP and convolvemaxpool implementing the MLP and the convolution respectively. My problem is how to call the right classes depending on the input. I don't want to use if/else, since adding more libraries would mean more if/else statements for each class, which I don't think is the right solution. Any help will be appreciated. Thanks.
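For what it's worth, one common way to avoid growing if/else chains is a small registry that maps the user's input to the implementing classes; everything below (names included) is a made-up sketch:

    BACKENDS = {}

    def register(name, **classes):
        # Map a backend name to its implementation classes.
        BACKENDS[name] = classes

    # Stand-ins for the real Theano/TensorFlow implementations (names made up):
    class TheanoMLP: ...
    class TheanoConvolveMaxpool: ...
    class TfMLP: ...
    class TfConvolveMaxpool: ...

    register("theano", mlp=TheanoMLP, convolvemaxpool=TheanoConvolveMaxpool)
    register("tensorflow", mlp=TfMLP, convolvemaxpool=TfConvolveMaxpool)

    def make(backend, kind, *args, **kwargs):
        # Dispatch on user input by dictionary lookup instead of chained if/else;
        # supporting a new library is one register() call, no new branches.
        return BACKENDS[backend][kind](*args, **kwargs)

    mlp = make("tensorflow", "mlp")   # e.g. the user chose "tensorflow"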