I'm looking for an example of classifying labeled images with an RNN on custom data. I can't find any example other than the MNIST dataset. A pointer like this repository, where a CNN is used for classification, would be appreciated. Any help regarding the classification of images using an RNN would be helpful. I'm trying to replace the CNN network of the following tutorial.
Aymericdamien has some of the best examples out there, and they have an example of using an RNN with images.
https://github.com/aymericdamien/TensorFlow-Examples
https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/3_NeuralNetworks/recurrent_network.ipynb
The example is using MNIST, but it can be applied to any image.
However, I'll point out that you're unlikely to find many examples of using an RNN to classify images, because RNNs are inferior to CNNs for most image processing tasks. The example linked above is meant for educational purposes more than practical ones.
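For reference, the core of that approach (treating each image row as one timestep of the sequence) can be sketched in tf.keras roughly like this; the image size, number of classes, and layer widths below are placeholders, not values taken from the linked notebook:

import tensorflow as tf

# Treat each image as a sequence: one row per timestep.
# Placeholder shapes: 28x28 grayscale images, 10 classes.
num_rows, num_cols, num_classes = 28, 28, 10

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(num_rows, num_cols)),        # (timesteps, features)
    tf.keras.layers.LSTM(128),                                 # reads the rows in order
    tf.keras.layers.Dense(num_classes, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# model.fit(x_train, y_train, ...) with x_train shaped (batch, num_rows, num_cols)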
Now, if you are attempting to use an RNN because you have a sequence of images you wish to process, such as a video, a more natural approach is to combine a CNN (for the image processing part) with an RNN (for the sequence processing part). To do this, you would typically pretrain the CNN on some classification task such as ImageNet, feed each image through the CNN, and use the CNN's last layer as the input to each timestep of the RNN. You would then train the entire network with the loss function defined on the RNN's output.
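As a rough illustration only (not the poster's tutorial code), such a CNN-plus-RNN layout could look like the following sketch; the backbone, frame count, image size, and layer sizes are all assumptions:

import tensorflow as tf

num_frames, height, width, num_classes = 16, 224, 224, 10  # assumed values

# Pretrained CNN used as a per-frame feature extractor (weights frozen here).
cnn = tf.keras.applications.MobileNetV2(include_top=False, pooling='avg',
                                        input_shape=(height, width, 3))
cnn.trainable = False

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(num_frames, height, width, 3)),
    tf.keras.layers.TimeDistributed(cnn),   # CNN features for every timestep
    tf.keras.layers.LSTM(256),              # sequence model over the frame features
    tf.keras.layers.Dense(num_classes, activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')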
Related
So I followed the tutorial code for image-to-image translation using CycleGAN from here. I am training it on a custom dataset in Keras. What could cause the network to fail to learn a translation? Both generators in the CycleGAN output the same image that was given as input.
I used Keras to build a Siamese network using the coding format of one of the questions posted (please see the code sample here). To explain this briefly, I built a Siamese network using a pretrained EfficientNet, so that each copy of the network produces a dense layer, and the two are then combined into an L1-similarity output.
However, during prediction time, I only want to obtain the dense output of one of the layers (as an embedding). I plan on using a variety of unsupervised learning methods (including KNN) on these outputs.
During prediction, how can I ask Keras to run only one copy of my network graph on a single input? Can I extract only a part of the NN graph? I don't want to always have to generate pairs of images, or pay the cost of running two images when I only need one output.
Let me just make sure that I understand your question and context. You are using a Siamese network (EfficientNet) and you want to generate embeddings for your input images.
From the image below, you only want to save the image encodings for one of the ConvNets?
If that is the case, I don't really see the point of building a Siamese network at all. Just go for a single ConvNet (using EfficientNet), because the Siamese network model will always require image pairs as input.
If you go for a single ConvNet model and identify the layer you want to use for the embeddings, you can use tf.keras.backend.function like this:
get_layer_output = tf.keras.backend.function([fine_tuned_model.layers[0].input], [fine_tuned_model.layers[-2].output])
Then, at prediction time, you can call it like this:
features = get_layer_output([x])[0]
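Alternatively, in TensorFlow 2.x you can build a small sub-model that shares the trained weights and run just that on single images. This is only a sketch, assuming fine_tuned_model is your trained Keras model and layers[-2] is the embedding layer you identified:

import tensorflow as tf

# Sub-model from the input up to the dense embedding layer (shared weights).
embedding_model = tf.keras.Model(inputs=fine_tuned_model.layers[0].input,
                                 outputs=fine_tuned_model.layers[-2].output)

features = embedding_model.predict(x)  # x is a batch of single images, no pairs needed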
I am trying to implement the following paper:
https://ieeexplore.ieee.org/document/8281622
In this paper, a patch extraction step is carried out before training the convolutional denoising autoencoder model. I wanted to know whether this step is necessary or whether I can train on whole images as well. What difference will it make?
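For context, the kind of patch extraction described can be sketched in TensorFlow like this; the patch size and stride below are assumed placeholders, not the paper's values:

import tensorflow as tf

def extract_patches(images, patch_size=32, stride=16):
    # images: float tensor of shape (batch, height, width, channels)
    patches = tf.image.extract_patches(
        images=images,
        sizes=[1, patch_size, patch_size, 1],
        strides=[1, stride, stride, 1],
        rates=[1, 1, 1, 1],
        padding='VALID')
    # Reshape the flattened patches back into image-shaped tensors for the autoencoder.
    channels = images.shape[-1]
    return tf.reshape(patches, (-1, patch_size, patch_size, channels))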
I want to learn how to train a convolutional neural network in TensorFlow. Could you share a link with some code, for example a GitHub project? After I train the convolutional neural network with my own dataset, how do I save it as .pb or .pbtxt? Could you describe the procedure to do that?
Well, I'm not sure if you want code you can understand or just a model. If you want just the model, check out this retraining code from TensorFlow. For full understanding, this link might be useful. The .pb file you're referring to is also called a SavedModel; see this link.
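For the saving step itself, here is a minimal sketch assuming a trained tf.keras model and TensorFlow 2.x, where the SavedModel export directory contains a saved_model.pb (in TF 1.x you would freeze the graph instead):

import tensorflow as tf

# 'model' is your trained tf.keras model.
# Writes a SavedModel directory containing saved_model.pb plus the variables.
tf.saved_model.save(model, 'exported_model')

# Later, reload it for inference:
restored = tf.saved_model.load('exported_model')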
I'm running the default classify_image code of the ImageNet model. Is there any way to visualize the features it has extracted? If I use 'pool_3:0', that gives me the feature vector. Is there any way to overlay this on top of my image to see which features it has picked as important?
Ross Girshick described one way to visualize what a pooling layer has learned: https://www.cs.berkeley.edu/~rbg/papers/r-cnn-cvpr.pdf
Essentially, instead of visualizing features directly, you find a few images that a neuron fires on most strongly. You repeat that for a few, or all, neurons from your feature vector. The algorithm of course needs lots of images to choose from, e.g. the test set.
I wrote my implementation of this idea for the CIFAR-10 model in TensorFlow today, which I want to share (it uses OpenCV): https://gist.github.com/kukuruza/bb640cebefcc550f357c
You can use it if you provide the images tensor that reads images in batches, plus the pool_3:0 tensor.
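The gist is the actual implementation; purely to illustrate the idea, the "top activating images per neuron" bookkeeping might be sketched like this (the feature-extraction function and the image list are assumptions, not part of the gist):

import numpy as np

def top_activating_images(image_paths, get_features, top_k=5):
    # get_features(path) is assumed to return the 1-D feature vector
    # (e.g. the pool_3:0 activations) for one image.
    features = np.stack([get_features(p) for p in image_paths])   # (num_images, num_neurons)
    top = {}
    for neuron in range(features.shape[1]):
        order = np.argsort(features[:, neuron])[::-1][:top_k]     # strongest activations first
        top[neuron] = [image_paths[i] for i in order]
    return top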