I am following the official TensorFlow/Keras docs on image classification, in particular the section on image augmentation. There it says:
Data augmentation takes the approach of generating additional training data from your existing examples by augmenting them using random transformations that yield believable-looking images. This helps expose the model to more aspects of the data and generalize better.
So my understanding of this is that, for example if I don't have many training images, I want to generate additional training data by creating new, augmented images alongside the existing training images.
Then the Keras docs linked above show how some preprocessing layers from the layers.experimental.preprocessing module are added as the first layers of the example's Sequential model. In theory that makes sense: those preprocessing layers augment the input data (= images) before they enter the real TF model.
However, as quoted above, what I thought we want to do is create additional images, i.e. new images on top of the existing training images. But how would such a set of preprocessing layers in the model create additional images? Wouldn't they simply (randomly) augment the existing training images before they enter the model, rather than create new, additional images?
It is creating additional images, but that doesn't necessarily mean it will create new .jpg files: the random layers transform each batch freshly every time it passes through, so over the epochs the model effectively sees many variants of every image.
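A minimal sketch of that idea, using the same experimental module the question mentions (the layer choices here are arbitrary):

from tensorflow.keras import layers, models

model = models.Sequential([
    # Each pass through these layers yields a differently transformed batch
    layers.experimental.preprocessing.RandomFlip('horizontal'),
    layers.experimental.preprocessing.RandomRotation(0.1),
    # ... the rest of the model goes here
])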
If creating files is what you're trying to do, ImageDataGenerator can do that with the save_to_dir argument.
Wouldn't they simply (randomly) augment the existing training images before they enter the model, but not create new, additional images?
Yes, it creates new images. But it doesn't create new files on your machine. If you do want the augmented images written to disk, you can use this:
from tensorflow.keras.preprocessing.image import ImageDataGenerator
datagen = ImageDataGenerator(rotation_range=20)
datagen.flow_from_directory(directory, target_size=(256, 256),
                            save_to_dir='augmented_images', save_format='png')
I have a set of 20,000 images that I am importing from disk like below.
import os
import numpy as np
from PIL import Image

imgs_dict = {}
path = "Documents/data/img"
valid_images = [".png"]
for f in os.listdir(path):
    ext = os.path.splitext(f)[1]
    if ext.lower() not in valid_images:
        continue
    # Use the file name (without extension) as the dictionary key
    img_name = os.path.splitext(os.path.basename(f))[0]
    img = np.asarray(Image.open(os.path.join(path, f)))
    imgs_dict[img_name] = img
The reason I am converting this to a dictionary at the end is that I also have two other dictionaries specifying the image id, the classification, and whether it is part of the training or validation set. One of these dictionaries corresponds to all the data that should be part of the training data, and the other specifies the validation data. After I separate them out, I need to get them back into the standard array format for images (height, width, channels). How can I take a dictionary of images and convert it back into the format I want here? When I do the below, it produces an array with a shape of (8500,), which is the number of images in my training set but obviously does not reflect the height, width and channels.
x_train = np.array(list(training_images.values()))
np.shape(x_train)
# (8500,)
Or, secondarily, am I going about this all wrong? Is there an easier way to handle images than this? It would seem much nicer to just keep the images in a NumPy array from the beginning, but as far as I can tell there is no way for arrays to have a key or label of any sort, so I can't pull out specific images.
Edit: For some more context, what I'm essentially trying to do is get my data into a format like what is described in the following link.
https://elitedatascience.com/keras-tutorial-deep-learning-in-python
The specific part in question I'm having trouble with is this:
from keras.datasets import mnist
# Load pre-shuffled MNIST data into train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()
When we load the MNIST data, how is the relation between X_train and y_train determined? How can I replicate that with my data?
Yes, there is an easier way of handling image data in Keras. Specifically, when dealing with large datasets you want to use a generator instead of loading all of the images into memory, so refer to the ImageDataGenerator class. This class is a data generator already implemented in Keras, so unless you need any special operations it can be the go-to choice, at least for basic projects. It also lets you define basic augmentation and normalization (for example rescaling, normalizing the data, rotation, etc.).
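For instance, a minimal generator with rescaling and some light augmentation (the parameter values here are arbitrary):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rescale pixel values to [0, 1] and apply random augmentation on the fly
datagen = ImageDataGenerator(rescale=1./255,
                             rotation_range=15,
                             horizontal_flip=True)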
Specifically, you can automatically load images per class either by arranging them in subdirectories (put all the images from a single label under the same subdirectory), or by creating a data frame that indicates the label for each image path. Refer to flow_from_directory and flow_from_dataframe accordingly.
For train-test splitting, the easiest way is to keep your train and test sets in different directories (e.g. data/train and data/test) and create two different generators; this tutorial shows a worked example.
In case you don't want to put the train and test data in different directories, you can use the validation_split argument when initializing the generator (e.g. validation_split=0.2); then, when invoking flow_from_directory, add the argument subset='validation' or subset='training', as in the sketch below.
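A minimal sketch of that single-directory variant ('data/images' and the target size are placeholders):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2)
train_gen = datagen.flow_from_directory('data/images', target_size=(256, 256),
                                        subset='training')
val_gen = datagen.flow_from_directory('data/images', target_size=(256, 256),
                                      subset='validation')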
Having said all that, in case you want to load all of the images into memory as you did and just split them easily, you can use scikit-learn's train_test_split, as described here, for example.
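For example, assuming images and labels are NumPy arrays of equal length:

from sklearn.model_selection import train_test_split

# 80/20 split; random_state fixes the shuffle for reproducibility
X_train, X_val, y_train, y_val = train_test_split(
    images, labels, test_size=0.2, random_state=42)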
PS
Regarding MNIST: this is a well-established benchmark with a strictly defined train/test split, so that everyone can compare their evaluations on the exact same images. That is why it comes already split.
I have a small image dataset, so I employed the ImageDataGenerator class in Keras to augment it. I put my dataset in a folder and take advantage of the flow_from_directory function to load and use the images.
Now I have to implement k-fold cross-validation in my code, and I don't know how to manage my dataset, since the name of each image is its label.
Does anybody have any idea to handle this situation?
I am new to TensorFlow and to implementing deep learning. I have a dataset of images (images of the same object).
I want to train a neural network model using Python and TensorFlow for object detection.
I am trying to import the data into TensorFlow, but I am not sure of the right way to do it.
Most of the tutorials available online use public datasets (e.g. MNIST), which are straightforward to import but not helpful when I need to use my own data.
Is there a procedure or tutorial that I can follow?
There are many ways to import images for training. You can use TensorFlow itself, but the images will be imported as TensorFlow objects, which you won't be able to visualize until you run the session.
My favorite tool for importing images is skimage.io.imread. The imported images will have the shape (height, width, channels).
Or you can use the import tools from scipy.misc.
To resize images, you can use skimage.transform.resize.
Before training, you will need to normalize all the images to have values between 0 and 1. To do that, simply divide the images by 255.
The next step is to one-hot encode your labels into arrays of 0s and 1s.
Then you can build and train your CNN. A short sketch of these preprocessing steps follows.
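A minimal sketch of the steps above (the file name, target size and label values are hypothetical):

import numpy as np
from skimage.io import imread
from skimage.transform import resize

img = imread('example.png')        # shape: (height, width, channels)
img = resize(img, (64, 64))        # note: resize also rescales values to [0, 1]
# If your loader returns uint8 pixels instead, normalize manually:
# img = img.astype(np.float32) / 255.0

# One-hot encode integer labels, e.g. three classes
labels = np.array([0, 2, 1])
one_hot = np.eye(3)[labels]        # each row is an array of 0s and a single 1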
You could create a data directory containing one subdirectory per image class containing the respective image files and use flow_from_directory of tf.keras.preprocessing.image.ImageDataGenerator.
A tutorial on how to use this can be found in the Keras Blog.
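For example, a minimal sketch assuming a hypothetical layout with one subdirectory per class:

# data/
#   cats/  cat001.png, ...
#   dogs/  dog001.png, ...
from tensorflow.keras.preprocessing.image import ImageDataGenerator

gen = ImageDataGenerator(rescale=1./255).flow_from_directory(
    'data', target_size=(224, 224), class_mode='categorical')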
I would like to use TensorFlow's Estimator to simplify training with LSTM networks. Apparently, to use TensorFlow's Estimator, one must define a model function like so:
def some_model_fn(features, labels, mode):
...
I have no problem using placeholders to get the inputs and labels. How do I turn images into the shape accepted by TensorFlow LSTMs, which is [batch_size, num_time_steps, num_features]?
I suggest using NumPy to load the images into a multi-dimensional array. This does take quite a bit of memory, depending on the image sizes and the number of time steps.
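For example, a minimal sketch with a hypothetical batch of RGB images, treating each image row as one time step:

import numpy as np

# Hypothetical batch: 100 RGB images of shape (28, 28, 3)
images = np.random.rand(100, 28, 28, 3).astype(np.float32)

# Flatten width and channels into features:
# [batch_size, num_time_steps, num_features] = (100, 28, 84)
sequences = images.reshape(100, 28, 28 * 3)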
The purpose of the task is to classify images by means of an SVM. The variable 'images' is supposed to contain the image information, and correspondingly 'labels' contains the image labels. What format and dimensions should images and labels have? I tried, unsuccessfully, making images a Python list (appending flattened images) and then, in another attempt, NumPy arrays:
images = np.zeros((number_of_images, image_size))
labels = np.zeros((number_of_images, 1))
svm = cv2.SVM()
svm.train(images, labels)
Is this the right approach to the problem, and if so, what is the correct way to train the classifier?
I don't think you can use raw image data to train an SVM model. OK, you can, but it won't be very fruitful.
The basic approach is to extract some features from each image and to use those features for training your model. A set of features forms a dictionary of visual words, each of which describes your image. Because you use the same set of words to describe each image, you can compare the features corresponding to different images. This link introduces more details; check it out.
What's next?
Choose a feature extractor for your algorithm: HOG, SURF, or SIFT (link)
Extract features from each image. You'll get an array of the same length as the images array.
Initialize a bag-of-words (BoW) model
Train the SVM on the BoW features (a rough sketch follows)
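To make those steps concrete, here is a rough, hypothetical sketch using OpenCV's BoW tools (it assumes opencv-python with SIFT available, and train_images / train_labels as placeholders for your grayscale images and integer labels):

import cv2
import numpy as np

sift = cv2.SIFT_create()

# 1. Collect SIFT descriptors from every training image
bow_trainer = cv2.BOWKMeansTrainer(50)  # vocabulary size is arbitrary here
for img in train_images:
    _, descriptors = sift.detectAndCompute(img, None)
    if descriptors is not None:
        bow_trainer.add(descriptors)
vocabulary = bow_trainer.cluster()

# 2. Describe each image as a histogram over the visual vocabulary
matcher = cv2.BFMatcher(cv2.NORM_L2)
bow_extractor = cv2.BOWImgDescriptorExtractor(sift, matcher)
bow_extractor.setVocabulary(vocabulary)
features = [bow_extractor.compute(img, sift.detect(img, None))
            for img in train_images]
X = np.vstack(features).astype(np.float32)
y = np.array(train_labels, dtype=np.int32)

# 3. Train an SVM on the BoW histograms
svm = cv2.ml.SVM_create()
svm.train(X, cv2.ml.ROW_SAMPLE, y)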
Useful links:
A very detailed C++ example
Documentation for the existing BoW classifier