I have been trying to label my grayscale images for my CNN model. At this stage I aim for binary classification (Normal vs. Fault conditions). I have distributed my images into two separate folders (Normal and Fault) and want to use the corresponding directories in my code. However, I could not find a resource on labelling grayscale images using directories. Does anyone know how we can label grayscale images using directories? To put the question another way: can we label grayscale images using the ImageDataGenerator function?
NOTE: I use the following code, but it converts my grayscale images to coloured ones.
train_path = 'the directory'
train_batches = ImageDataGenerator(preprocessing_function=tf.keras.applications.vgg16.preprocess_input) \
    .flow_from_directory(directory=train_path, target_size=(64, 64), classes=['Normal', 'fault'], batch_size=10)
Best regards.
Well, you can create a training dataframe with two columns, image_path and label, then iterate through the different directories, saving the directory name in the label column and the path of each image in the image_path column, like below:
import os
import pandas as pd

images = []
labels = []
for i in os.listdir(normal_directory_path):
    images.append(os.path.join(normal_directory_path, i))
    labels.append("Normal")
# similarly for the fault directory

data = pd.DataFrame({"image_path": images, "label": labels})
Then shuffle the dataframe, and use a data loader to load the dataset in batches; while loading, open each image_path, extract the image array from it, and train the model on those arrays.
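As a rough sketch of that loading step with tf.data (the decode format, target size, and label mapping are assumptions for illustration; the dataframe and column names follow the snippet above):

import tensorflow as tf

data = data.sample(frac=1).reset_index(drop=True)  # shuffle the dataframe

def load_image(path, label):
    raw = tf.io.read_file(path)
    img = tf.io.decode_png(raw, channels=1)       # keep a single (grayscale) channel; assumes PNG files
    img = tf.image.resize(img, (64, 64)) / 255.0  # assumed target size and scaling
    return img, label

# Assumes the fault images were appended with the label "Fault"
labels = (data["label"] == "Fault").astype("int32").values
ds = tf.data.Dataset.from_tensor_slices((data["image_path"].values, labels))
ds = ds.map(load_image, num_parallel_calls=tf.data.AUTOTUNE).batch(10)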
VGG is an artificial neural network trained on images with three channels. If you take a look at the official documentation (https://www.tensorflow.org/api_docs/python/tf/keras/applications/vgg16/preprocess_input), you will see that this preprocessing function accepts images with three channels and preprocesses them as follows (again from the documentation):
Returns
Preprocessed numpy.array or a tf.Tensor with type float32.
The images are converted from RGB to BGR, then each color channel is zero-centered with respect to the ImageNet dataset, without scaling.
So, if you are going to feed grayscale images to your model, you need to do the preprocessing yourself. To do so, you will need the ImageNet dataset statistics, i.e. the per-channel "mean" values, to zero-center every pixel in your images. Since these means are per channel, take the weighted mean of the three channel means (using the colorimetric, perceptual luminance-preserving conversion to grayscale) to obtain a single-channel mean.
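A minimal sketch of that idea (the per-channel means below are the ImageNet means Keras uses in its 'caffe'-style preprocessing, the weights are the colorimetric Rec. 709 coefficients, and the function name is just for illustration):

import numpy as np

# ImageNet per-channel means in (R, G, B) order, as used by Keras' VGG16 preprocessing
IMAGENET_MEAN_RGB = np.array([123.68, 116.779, 103.939], dtype=np.float32)

# Colorimetric (luminance-preserving) grayscale weights for (R, G, B)
LUMA_WEIGHTS = np.array([0.2126, 0.7152, 0.0722], dtype=np.float32)

# Single-channel mean = weighted mean of the three channel means
GRAY_MEAN = float(np.dot(LUMA_WEIGHTS, IMAGENET_MEAN_RGB))

def preprocess_grayscale(img):
    # Zero-center a grayscale image with pixel values in [0, 255], without scaling
    return img.astype(np.float32) - GRAY_MEAN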
References: Colorimetric (perceptual luminance-preserving) conversion to grayscale - https://en.wikipedia.org/wiki/Grayscale#Converting_color_to_grayscale
Related
Problem statement
I am working on a project using a YOLO model to detect tools from a given picture, i.e. hammers, screwdrivers, bolts, etc. I have only 100 pictures as a training dataset, and I have labelled them using polygons. I have decided to augment the data with the code given below. I've got 500 new images, but the problem is that I don't want to label them again. I am looking for any way to make the label bounding boxes (polygons) adjust to (be preserved in) the newly augmented images, so that I can get the polygon data without doing the labelling again. In short, I want to preserve the labels during the image augmentation process.
Code used for Augmentation
import os
from PIL import Image, ImageEnhance

Brightness = [0.7, 0.8, 1]  # Different brightness levels
Rotation = [10, -10]        # Different rotation angles (degrees)

# Main source folder
main_path = r"./Augmentation"
# Directory listing from the main path
dir = os.listdir(main_path)[1:]

# Read and iterate over all images in the directory
for name in dir:
    image = Image.open(os.path.join(main_path, name))
    # Apply rotation to the image
    for j in Rotation:  # Different rotation levels
        rotated = image.rotate(j)
        transImageRGB = rotated.convert("RGB")
        apply_br = ImageEnhance.Brightness(transImageRGB)
        # Apply the brightness values to the rotated image
        for i in Brightness:  # Different brightness levels
            lightness = apply_br.enhance(i)
            # Save the augmented image
            lightness.save(os.path.join(main_path, 'augmented_Images', str(i) + str(j) + name))
            print('image has been augmented successfully')
Look into imgaug. The augmentations from this module also augment the labels. One more thing: what you are doing right now is offline augmentation. You might want to look at online augmentation instead; then every epoch the pictures are augmented in a different way and you only train on the augmented pictures. This way you don't need a lot of disk space.
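A minimal sketch of augmenting an image together with its polygon labels in imgaug (the image, polygon coordinates, and chosen augmenters are placeholders for illustration):

import numpy as np
import imgaug.augmenters as iaa
from imgaug.augmentables.polys import Polygon, PolygonsOnImage

image = np.zeros((480, 640, 3), dtype=np.uint8)  # placeholder image

# Placeholder polygon annotation (x, y vertices of one labelled tool)
polys = PolygonsOnImage(
    [Polygon([(100, 100), (200, 100), (200, 220), (100, 220)])],
    shape=image.shape)

# Augmenters similar to the original script: rotation and brightness changes
aug = iaa.Sequential([
    iaa.Affine(rotate=(-10, 10)),
    iaa.Multiply((0.7, 1.0)),
])

# imgaug transforms the polygons together with the image
image_aug, polys_aug = aug(image=image, polygons=polys)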
If you are using YOLOv4 with Darknet, image augmentation is performed automatically.
Let's assume I have a small dataset. I want to implement data augmentation. First I implement image segmentation (after this, the image will be a binary image) and then implement data augmentation. Is this a good way to do it?
For image augmentation in semantic and instance segmentation, you have to either leave the positions of the objects in the image unchanged (by manipulating colours, for example), or modify those positions by applying translations and rotations.
So yes, this approach works, but you have to take into consideration the type of data you have and what you are looking to achieve. Data augmentation isn't a ready-to-go process with good results everywhere.
In case you have:
Semantic segmentation: each pixel of your image, at row i and column j, is labelled with its enclosing object. This means having your main image I and a label image L of the same size, linking every pixel to its object label. In this case, your data augmentation is applied to both I and L, giving a pair of identically transformed images.
Instance segmentation: here a mask is generated for every instance in the original image, and the augmentation is applied to all of them, including the original; from these transformed masks we get our new instances.
EDIT:
Take a look at CLoDSA (Classification, Localization, Detection and Segmentation Augmentor) it may help you implement your idea.
In case your dataset is small, you should add data augmentation during training. It is important to change the original image and the targets (masks) in the same way!
For example, if an image is rotated 90 degrees, then its mask should also be rotated 90 degrees. Since you are using the Keras library, you should check whether the ImageDataGenerator also changes the target images (masks) along with the inputs. If it doesn't, you can implement the augmentations yourself. This repository shows how it is done with OpenCV:
https://github.com/kochlisGit/random-data-augmentations
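As a rough sketch of the Keras route (the directory paths, target size, and augmentation parameters are placeholders; the key point is passing the same seed to both generators so images and masks receive identical transforms):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

aug_args = dict(rotation_range=90, horizontal_flip=True)
seed = 42  # same seed => identical random transforms for images and masks

image_gen = ImageDataGenerator(**aug_args).flow_from_directory(
    'data/images', target_size=(256, 256), class_mode=None, seed=seed)
mask_gen = ImageDataGenerator(**aug_args).flow_from_directory(
    'data/masks', target_size=(256, 256), class_mode=None,
    color_mode='grayscale', seed=seed)

# Yields (image_batch, mask_batch) pairs that have been augmented identically
train_gen = zip(image_gen, mask_gen)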
I have large images (5000x3500) and I want to divide each one into small 512x512 images without losing the original image coordinates. The large images are annotated/labelled, which is why I want to keep the original coordinates, and I will use the small images to train a YOLO model. I am not sure if that is called tiling or not, but is there any suggestion on how to do it using Python or opencv-python?
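One possible sketch of such tiling with plain NumPy slicing (the file name is a placeholder; the recorded (x, y) offsets are what you would use to shift the original annotations into each tile's coordinate frame):

import cv2

tile_size = 512
image = cv2.imread('large_image.jpg')  # placeholder path
h, w = image.shape[:2]

tiles = []
for y in range(0, h, tile_size):
    for x in range(0, w, tile_size):
        tile = image[y:y + tile_size, x:x + tile_size]  # edge tiles may be smaller than 512x512
        # Keep the offset so labels can be mapped: x_tile = x_orig - x, y_tile = y_orig - y
        tiles.append(((x, y), tile))
        cv2.imwrite(f'tile_{y}_{x}.jpg', tile)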
I'm training a deep neural network to improve the quality of images. The images contain some specific types of noise that I want to reduce/remove by means of a deep learning model. To do so, I'm using a huge dataset of similar clear high-res images with barely any noise, adding the specific types of noise to the images, and training the network to regenerate the original image (a custom autoencoder network). With one of the several noise types this works very well so far. Without going too far into the details, adding that particular type of noise was easy.
Now I need to add another noise type to the images, more precisely: chroma noise like in the following image (the bottom right one): link
How do I artificially generate and add chroma noise to an image in Python? I can use the full range of image processing packages, PIL, numpy, OpenCV, torchvision...
You need to convert the image to a colour space such as HSV or CIE Lab. You then add noise to the chromaticity channels (a and b in Lab, or H and S in HSV). Finally, convert back to RGB.
This colorspace conversion step is very common and most image toolkits should have that functionality.
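A minimal sketch of that idea with OpenCV and NumPy (the use of Gaussian noise and the noise strength are assumptions for illustration):

import cv2
import numpy as np

def add_chroma_noise(image_rgb, sigma=10.0):
    # Add Gaussian noise only to the a/b (chroma) channels of an RGB uint8 image
    lab = cv2.cvtColor(image_rgb, cv2.COLOR_RGB2LAB).astype(np.float32)
    noise = np.random.normal(0, sigma, lab[..., 1:].shape).astype(np.float32)
    lab[..., 1:] = np.clip(lab[..., 1:] + noise, 0, 255)  # leave the L (lightness) channel untouched
    return cv2.cvtColor(lab.astype(np.uint8), cv2.COLOR_LAB2RGB)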
I'm currently using Keras' image pre-processing functions to augment some training image data. As part of this I'm trying to visualise the augmentations, which can be done by saving the images to a directory using the flow method from the ImageDataGenerator class:
https://keras.io/preprocessing/image/#flow
datagenerator.flow(image, batch_size=1, save_to_dir=args["imgdir"], save_prefix='aug', save_format='png')
The problem is that the images I pass in are RGB, but the images saved in the directory are BGR. The only transform I'm applying is a rotation, so why is it converting them to BGR? I can remedy the situation by converting the image to BGR before passing it to the generator's flow method.
The generator itself is not producing BGR images - those remain in RGB format; they're just being converted when they're saved.
The mismatch in channels is most likely due to the libraries you are using to load and save the images. Checking this should help you solve the problem.
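For instance, a common culprit (an assumption here, since the loading code isn't shown) is reading images with OpenCV, which returns BGR, while the saving path assumes RGB; converting the channel order right after loading avoids the swap:

import cv2

image = cv2.imread('example.jpg')               # OpenCV loads images in BGR order
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # convert to RGB before passing to the generator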