I am working on deep learning. I am using Keras with the TensorFlow backend, and I have 36,980 images to train on. I want to use VGG16, so I resized all of them to 224x224. The training array is therefore around 22 GB (36980*224*224*3*4 bytes). When I try to load this much data into a NumPy array, Python raises a MemoryError.
I have thought of splitting the training set into 10 pieces and training my model on only one piece at a time. Is there any better approach to this problem? I am using Python 3 (64-bit).
N.B.
To get good accuracy, I need the images to be as large as possible, so I can't resize them to a smaller size. Moreover, it's necessary to use RGB images here.
No fit_generator() solution please. A model trained using fit_generator() behaves abnormally while predicting, at least as far as I have seen.
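Roughly what I have in mind for the chunked approach (a sketch only; the .npy chunk file names, the 10-class setup, and the epoch counts are placeholders, assuming I first resize the images and save them to disk in pieces):

import numpy as np
from keras.applications.vgg16 import VGG16

model = VGG16(weights=None, classes=10)  # hypothetical: 10 classes, random initialization
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

n_chunks = 10  # training set split into 10 pieces on disk beforehand
for epoch in range(5):
    for i in range(n_chunks):
        # each chunk is small enough to fit in memory
        x_chunk = np.load('train_images_%d.npy' % i)   # shape (N, 224, 224, 3)
        y_chunk = np.load('train_labels_%d.npy' % i)   # one-hot labels
        model.fit(x_chunk, y_chunk, batch_size=32, epochs=1)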
Related
For my exam based on data crunching, we've received a small Simpsons dataset of four characters (Bart, Homer, Lisa, Marge) to build a convolutional neural network around. However, the dataset contains only a fairly small number of images: around 2200 to split into test & train sets.
Since I'm very new to neural networks and deep learning: is it acceptable to augment my data (I'm rotating each image by X degrees, nine times) and then split it afterwards using sklearn's train_test_split function?
Since I made this change, I'm getting a training and test accuracy of around 95% after 50 epochs with my current model. Since that's more than I expected, I started questioning whether augmenting data that ends up in the test set is acceptable, or whether it gives a biased or wrong result in the end.
So:
a) Can you augment your data before splitting it with sklearn's train_test_split without influencing your results in a wrong way?
b) If my method is wrong, what's another method I could try?
Thanks in advance!
Augment the data after the train/test split. To do this correctly, make sure you only augment data from the training split.
If you augment the data before splitting the dataset, you will likely inject small variations of the training images into the test set. The network will then overestimate its accuracy (and it might be over-fitting as well, among other issues).
A good way to avoid this pitfall is to augment the data after the original dataset has been split.
Many libraries implement Python generators that randomly apply one or more combinations of image modifications to augment the data. These might include:
Image rotation
Image shearing
Image zoom (cropping and re-scaling)
Adding noise
Small shifts in hue
Image shifting
Image padding
Image blurring
Image embossing
This GitHub library has a good overview of classical image augmentation techniques: https://github.com/aleju/imgaug (I have not used this library, so I cannot endorse its speed or implementation quality, but the overview in its README.md seems quite comprehensive.)
Some neural network libraries already ship utilities for this. For example, Keras provides image preprocessing tools such as ImageDataGenerator: https://keras.io/preprocessing/image/
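As a minimal sketch of the split-first-then-augment idea using Keras' ImageDataGenerator (the random placeholder arrays and the augmentation parameters below are only for illustration):

import numpy as np
from sklearn.model_selection import train_test_split
from keras.preprocessing.image import ImageDataGenerator

# placeholder data standing in for the real images (4 classes, one-hot labels)
images = np.random.rand(2200, 64, 64, 3).astype('float32')
labels = np.eye(4)[np.random.randint(0, 4, size=2200)]

# split the ORIGINAL, un-augmented data first
x_train, x_test, y_train, y_test = train_test_split(images, labels, test_size=0.2)

# augmentation is applied only to the training split
train_gen = ImageDataGenerator(rotation_range=20, horizontal_flip=True)
test_gen = ImageDataGenerator()  # no augmentation for the test split

train_flow = train_gen.flow(x_train, y_train, batch_size=32)
test_flow = test_gen.flow(x_test, y_test, batch_size=32)
# model.fit_generator(train_flow, validation_data=test_flow, epochs=50, steps_per_epoch=len(x_train) // 32)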
What I am trying to accomplish is doing inference in TensorFlow with a batch of images at a time instead of a single image. I am wondering what the most appropriate way is to handle multiple images in order to speed up inference.
Doing inference on a single image is straightforward and covered in most tutorials, but I have not yet seen it done in a batched style.
Here's what I am currently using at a high level:
pl = tf.placeholder(tf.float32)
...
sess.run([boxes, confs], feed_dict={pl: image})
I would appreciate any input on this.
Depending on how your model is designed, you can just feed an array of images to pl. The first dimension of your outputs then corresponds to the index of your image in the batch.
Many tensor ops have an implementation for multiple examples in a batch. There are some exceptions though, for example tf.image.decode_jpeg. In this case, you will have to rewrite your network, using tf.map_fn, for example.
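A minimal sketch of batched feeding, assuming a placeholder whose batch dimension is left unspecified; the boxes and confs ops here are trivial stand-ins for whatever your model actually defines:

import numpy as np
import tensorflow as tf

# first (batch) dimension left as None so any batch size is accepted
pl = tf.placeholder(tf.float32, shape=[None, 224, 224, 3])
# the real model would define `boxes` and `confs` on top of `pl`;
# trivial stand-in ops so this sketch runs:
boxes = tf.reduce_mean(pl, axis=[1, 2, 3])
confs = tf.reduce_max(pl, axis=[1, 2, 3])

batch = np.random.rand(8, 224, 224, 3).astype('float32')  # 8 images at once
with tf.Session() as sess:
    b, c = sess.run([boxes, confs], feed_dict={pl: batch})
    # b[i] and c[i] correspond to the i-th image in the batch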
When preparing the training set for neural network training, I found two possible ways.
The traditional way: calculate the mean over the whole training set, and subtract this fixed mean value from each image before sending it to the network. Handle the standard deviation in a similar way.
I found that TensorFlow provides a function, tf.image.per_image_standardization, that does normalization on a single image.
I wonder which way is more appropriate.
Both ways are possible and the choice mostly depends on the way you read the data.
Whole-training-set normalization is convenient when you can load the whole dataset into a NumPy array at once; e.g., the MNIST dataset is usually loaded fully into memory. This way is also preferable in terms of convergence when the individual images vary significantly: two training images, one mostly white and the other mostly black, will have very different means.
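A minimal sketch of whole-training-set normalization, with random placeholder arrays standing in for the real data:

import numpy as np

x_train = np.random.rand(1000, 28, 28, 1).astype('float32')  # placeholder data
x_test = np.random.rand(200, 28, 28, 1).astype('float32')

# statistics are computed on the training set only...
mean = x_train.mean()
std = x_train.std()

x_train = (x_train - mean) / std
x_test = (x_test - mean) / std  # ...and reused, unchanged, for the test set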
Per-image normalization is convenient when the images are loaded one by one or in small batches, for example from a TFRecord. It's also the only viable option when the dataset is too large to fit in memory. In this case, it's better to organize the input pipeline in TensorFlow and transform the image tensors just like the other tensors in the graph. I've seen pretty good accuracy with this normalization on CIFAR-10, so it's a viable way despite the issues stated earlier. Also note that you can reduce the negative effect via batch normalization.
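And a sketch of per-image standardization inside a TensorFlow input pipeline; for brevity it reads hypothetical JPEG files by name rather than a TFRecord:

import tensorflow as tf

def parse_fn(filename):
    image = tf.image.decode_jpeg(tf.read_file(filename), channels=3)
    image = tf.image.resize_images(image, [224, 224])
    # zero mean, unit variance per individual image
    return tf.image.per_image_standardization(image)

filenames = ['img0.jpg', 'img1.jpg']  # hypothetical files
dataset = tf.data.Dataset.from_tensor_slices(filenames).map(parse_fn).batch(32)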
I already know how to build a neural network using the MNIST dataset. I have been searching for tutorials on how to train a neural network on your own dataset for three months now, but I'm just not getting it. If someone can suggest any good tutorials or explain how all of this works, please help.
P.S. I won't install NLTK. It seems like a lot of people train their neural networks on text, but I won't do that. If I installed NLTK, I would only use it once.
I suggest you use the OpenCV library. Whether you load the MNIST data or use PIL, once loaded it's all just NumPy arrays. If you want to make your own images fit an MNIST-style trained model, here's how I did it:
1. Use cv2.imread to load all the images you want to use as training data.
2. Use cv2.cvtColor to convert all the images to grayscale, and resize them to 28x28.
3. Divide each pixel value in all the datasets by 255.
4. Do the training as usual!
I haven't tried it with a format of your own, but theoretically it's the same; a rough sketch of the steps above follows.
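A sketch of the four steps, with hypothetical file names:

import cv2
import numpy as np

paths = ['digit_0.png', 'digit_1.png']  # hypothetical image files

images = []
for p in paths:
    img = cv2.imread(p)                          # 1. load the image (BGR)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # 2. convert to grayscale...
    img = cv2.resize(img, (28, 28))              #    ...and resize to 28x28
    images.append(img)

x = np.array(images, dtype='float32') / 255.0    # 3. scale pixels to [0, 1]
x = x.reshape(-1, 28, 28, 1)                     #    add a channel axis if the model expects one
# 4. train as usual, e.g. model.fit(x, y, ...)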
I have installed TensorFlow and followed the tutorial here:
https://www.tensorflow.org/versions/0.6.0/tutorials/mnist/tf/index.html#tensorflow-mechanics-101
and built it successfully. I can get the evaluation result for a fixed-size dataset, like 1000x784 for the training set and 1000x784 for the testing set.
But what if I want to test a single example, 1x784, and find out its output using the model trained above?
I am new to TensorFlow and new to machine learning; I hope I have described my problem clearly.
It's not clear to me which part you're having trouble with, but I think what you're asking is how to use batch size 1000 for training, but only predict on a single input. I assume you already know how to predict on batches of size 1000.
If the first dimension of your model's input placeholder, which is usually the batch size, is set to be None, the size is inferred when you provide an input. So, if you change the 1000 to be None, you should then be able to pass an input of size 1 by 784 to make predictions.
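A minimal sketch of that, with a toy one-layer model just for illustration:

import numpy as np
import tensorflow as tf

# batch dimension set to None: the same graph accepts 1000 examples or just 1
x = tf.placeholder(tf.float32, shape=[None, 784])
w = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
logits = tf.matmul(x, w) + b
prediction = tf.argmax(logits, axis=1)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    single = np.random.rand(1, 784).astype('float32')  # one 1x784 example
    print(sess.run(prediction, feed_dict={x: single}))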
The solution you found, feeding a single 1x784 example, is great for getting quick feedback. However, for bigger networks that need a lot of time (hours) to train, that approach is not feasible.
TensorFlow has a newer feature called TensorFlow Serving: you give it a trained model and then interact with the model as a client.
Here is their website for more information: https://github.com/tensorflow/serving