Detectron2 saving predicted annotation images to COCO file - python

I am doing semantic segmentation. Detectron2 can produce prediction mask labels as pictures.
Can I save the predicted annotations to a COCO file instead of pictures? The COCO file could then be edited in annotation tools like labelme.
Do you have a Python script for exporting the predicted annotations in COCO format?
Thanks.
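Not an official Detectron2 utility, but as a rough sketch of the idea: assuming the predictions are saved as label masks where each pixel value is a class id, and that those ids line up with your category list, something like the following could turn one mask into a minimal COCO-style JSON (the function name and the background-is-zero convention are my own assumptions):

import json
import cv2
import numpy as np

def mask_to_coco(mask_path, image_path, class_names, out_json="predictions_coco.json"):
    # convert a predicted label mask (H x W image of class ids) into a
    # minimal COCO-style JSON with polygon segmentations
    mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
    h, w = mask.shape

    coco = {
        "images": [{"id": 1, "file_name": image_path, "height": h, "width": w}],
        "categories": [{"id": i + 1, "name": n} for i, n in enumerate(class_names)],
        "annotations": [],
    }

    ann_id = 1
    for class_id in np.unique(mask):
        if class_id == 0:          # assume 0 is background
            continue
        binary = np.uint8(mask == class_id)
        contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        for cnt in contours:
            if len(cnt) < 3:       # need at least three points for a polygon
                continue
            x, y, bw, bh = cv2.boundingRect(cnt)
            coco["annotations"].append({
                "id": ann_id,
                "image_id": 1,
                "category_id": int(class_id),
                "segmentation": [cnt.flatten().astype(float).tolist()],
                "bbox": [float(x), float(y), float(bw), float(bh)],
                "area": float(cv2.contourArea(cnt)),
                "iscrowd": 0,
            })
            ann_id += 1

    with open(out_json, "w") as f:
        json.dump(coco, f)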

Related

label our grayscale images using corresponding directories

I have been trying to label my grayscale images for my CNN model. At this stage I am aiming for binary classification (Normal vs Fault conditions). I have distributed my images into two separate folders (Normal and Fault) and want to use the corresponding directories in my code. However, I could not find a resource on labelling grayscale images using directories. Does anyone know how we can label grayscale images using directories? To put my question another way: can we label grayscale images using the ImageDataGenerator function?
NOTE: I use the following code, but it converts my grayscale images to coloured ones.
train_path = 'the directory'
train_batches = ImageDataGenerator(preprocessing_function=tf.keras.applications.vgg16.preprocess_input) \
    .flow_from_directory(directory=train_path, target_size=(64,64), classes=['Normal','fault'], batch_size=10)
Best regards.
You can create a training dataframe with two columns, image_path and label, then iterate through the different directories, saving the directory name in the label column and the path of each image in the image_path column, like below:
import os
import pandas as pd

images = []
labels = []

# collect image paths and labels from the Normal directory
for fname in os.listdir(normal_directory_path):
    images.append(os.path.join(normal_directory_path, fname))
    labels.append("Normal")
# similarly for the fault directory

data = pd.DataFrame({"image_path": images, "label": labels})
Then shuffle the dataset. You can then use a data loader to load it in batches; while loading, open each image_path, extract the image array from it, and train the model.
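As a minimal sketch of that shuffling and batched loading with tf.data (the class names, image size, and batch size below simply mirror the question and are assumptions):

import tensorflow as tf

# shuffle the dataframe built above
data = data.sample(frac=1).reset_index(drop=True)

# map the string labels to integers (assuming the two classes "Normal" and "Fault")
label_ids = data["label"].map({"Normal": 0, "Fault": 1}).values

def load_image(path, label):
    # decode as a single-channel image so it is NOT converted to RGB
    img = tf.io.decode_image(tf.io.read_file(path), channels=1, expand_animations=False)
    img = tf.image.resize(img, (64, 64)) / 255.0
    return img, label

dataset = (tf.data.Dataset.from_tensor_slices((data["image_path"].values, label_ids))
           .map(load_image, num_parallel_calls=tf.data.AUTOTUNE)
           .batch(10)
           .prefetch(tf.data.AUTOTUNE))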
VGG is an artificial neural network trained on images with three channels. If you take a glance at the official documentation (https://www.tensorflow.org/api_docs/python/tf/keras/applications/vgg16/preprocess_input), you will see that this preprocessing method accepts three-channel images and preprocesses them (again from the documentation):
Returns
Preprocessed numpy.array or a tf.Tensor with type float32.
The images are converted from RGB to BGR, then each color channel is zero-centered with respect to the ImageNet dataset, without scaling.
So, if you are going to feed grayscale images to your model, you need to do the preprocessing yourself. To do so, you will need ImageNet statistics such as the mean value, to zero-center every pixel in your images. This mean is also given per channel, so take the weighted mean of the three channel means (using the colorimetric, perceptual luminance-preserving, conversion to grayscale) to obtain a single-channel mean.
Reference: Colorimetric (perceptual luminance-preserving) conversion to grayscale - https://en.wikipedia.org/wiki/Grayscale#Converting_color_to_grayscale
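As a sketch of that single-channel mean computation (the ImageNet means below are the ones used by Keras' "caffe"-style preprocessing and the weights are the Rec. 709 luminance coefficients; treat the exact constants as something to verify for your setup):

import numpy as np

# ImageNet per-channel means used by Keras' "caffe"-style preprocessing, in RGB order
IMAGENET_MEAN_RGB = np.array([123.68, 116.779, 103.939])

# Rec. 709 luminance weights for a colorimetric RGB -> grayscale conversion
LUMA_WEIGHTS = np.array([0.2126, 0.7152, 0.0722])

# single-channel mean for grayscale inputs (roughly 117.3)
GRAY_MEAN = float(LUMA_WEIGHTS @ IMAGENET_MEAN_RGB)

def preprocess_grayscale(img):
    # zero-center a grayscale image (H, W) or (H, W, 1) without scaling,
    # mirroring what preprocess_input does for three-channel inputs
    return img.astype("float32") - GRAY_MEAN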

YOLO V4 Tiny - Making more photos from one annotated image

I am trying to make a YOLOv4-tiny custom dataset using Google Colab. I am using labelImg.py for image annotations, as shown in https://github.com/tzutalin/labelImg.
I have annotated one image, as shown below.
The .txt file with the annotated coordinates looks like the following:
0 0.580859 0.502083 0.303906 0.404167
I only have one class, a calculator class. I want to use this one image to produce 4 more annotated images: I want to rotate the annotated image by 45 degrees each time and create a new annotated image and a .txt coordinate file. I have seen something like this done in Roboflow, but I can't figure out how to do it manually with a Python script. Is it possible to do it? If so, how?
You can look into the repo and article below for Python-based data augmentation, including rotation, shearing, resizing, translation, flipping, etc.
https://github.com/Paperspace/DataAugmentationForObjectDetection
https://blog.paperspace.com/data-augmentation-for-bounding-boxes/
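If you would rather script it directly instead of using the repo above, here is a rough, unverified sketch of rotating one YOLO-annotated sample with OpenCV; the helper name rotate_yolo_sample is mine, and the axis-aligned box taken from the rotated corners will be somewhat looser than a tight box around the rotated object:

import cv2
import numpy as np

def rotate_yolo_sample(image, yolo_box, angle_deg):
    # rotate an image and its YOLO-format box (class, xc, yc, w, h, normalized)
    # around the image centre; the canvas size is kept, so large rotations
    # may clip content near the corners
    h, w = image.shape[:2]
    cls, xc, yc, bw, bh = yolo_box

    # denormalize and build the four box corners in pixel coordinates
    cx, cy, pw, ph = xc * w, yc * h, bw * w, bh * h
    corners = np.array([
        [cx - pw / 2, cy - ph / 2],
        [cx + pw / 2, cy - ph / 2],
        [cx + pw / 2, cy + ph / 2],
        [cx - pw / 2, cy + ph / 2],
    ])

    # rotate the image about its centre
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle_deg, 1.0)
    rotated = cv2.warpAffine(image, M, (w, h))

    # transform the corners with the same matrix and take the new
    # axis-aligned bounding box, clipped to the image
    ones = np.ones((4, 1))
    new_corners = (M @ np.hstack([corners, ones]).T).T
    x_min, y_min = new_corners.min(axis=0).clip(0)
    x_max = min(new_corners[:, 0].max(), w)
    y_max = min(new_corners[:, 1].max(), h)

    new_box = (cls,
               (x_min + x_max) / 2 / w, (y_min + y_max) / 2 / h,
               (x_max - x_min) / w, (y_max - y_min) / h)
    return rotated, new_box

Calling this four times with angles 45, 90, 135 and 180 and writing out each rotated image plus a new .txt line would give the extra annotated copies described in the question.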
If you are using AlexeyAB's darknet repo for yolov4, then there are some augmentations you can use to increase training data size and variation.
https://github.com/AlexeyAB/darknet/wiki/CFG-Parameters-in-the-%5Bnet%5D-section
Look at the Data Augmentation section, which describes various augmentations for object detection that you can enable by adding them to the YOLO cfg file.
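For reference, the stock yolov4.cfg already turns on a few of these in its [net] section; whether each option is honoured for detection depends on your darknet build, so double-check against the wiki page above:

# [net]-section augmentation options (values as in the stock yolov4.cfg)
saturation = 1.5
exposure = 1.5
hue = .1
mosaic = 1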

Transferring training data from matlab to tensorflow

I have used MATLAB's Image Labeler App to create PixelLabelData for 500 images. So I have the original images and the class labels for each image. This information is stored in a gTruth file in .mat format. I want to use this dataset to train a U-Net in TensorFlow (Google Colab).
I could not do the training in MATLAB because of system limitations (insufficient RAM and no GPU). However, I have read that we can import training data from MATLAB for use in Colab. So I uploaded the original image set, the labelled pixels, and the corresponding .mat file (gTruth.mat) to Google Drive and then mounted the drive in the Colab environment. But I don't know how to proceed with the .mat file in Colab.
The pixelLabelTrainingData function will allow you to obtain two separate datastores for the input and pixel labeled images.
[imds,pxds] = pixelLabelTrainingData(gTruth);
https://www.mathworks.com/help/vision/ref/pixellabeltrainingdata.html
Given those, you can write each image and its labeled image to parallel directories, using the same naming convention, with imwrite and the image file format of your choice.
inputImageDir = 'pathOfYourChoice';
count = 0;
while hasdata(imds)
    img = read(imds);
    fname = sprintf('img%d.png',count);
    name = fullfile(inputImageDir,fname);
    imwrite(img,name);
    count = count + 1;   % increment so each file gets a unique name
end
% repeat the same loop over pxds to write the corresponding label images
From there, you should be able to use standard TensorFlow tooling (e.g. tf.data.Dataset) to read the directories of images.
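A minimal sketch of that on the TensorFlow side, assuming the images and label masks were written to two parallel folders (the folder names here are placeholders) with matching file names:

import tensorflow as tf

IMG_DIR = "images"      # directory of input images exported from MATLAB (assumed name)
MASK_DIR = "labels"     # parallel directory of pixel-label PNGs (assumed name)

def load_pair(img_path, mask_path):
    # decode an input image and its pixel-label mask
    img = tf.io.decode_png(tf.io.read_file(img_path), channels=3)
    mask = tf.io.decode_png(tf.io.read_file(mask_path), channels=1)
    img = tf.image.convert_image_dtype(img, tf.float32)
    return img, mask

img_paths = sorted(tf.io.gfile.glob(IMG_DIR + "/*.png"))
mask_paths = sorted(tf.io.gfile.glob(MASK_DIR + "/*.png"))

dataset = (tf.data.Dataset.from_tensor_slices((img_paths, mask_paths))
           .map(load_pair, num_parallel_calls=tf.data.AUTOTUNE)
           .batch(8)
           .prefetch(tf.data.AUTOTUNE))

The resulting dataset can then be passed straight to model.fit for the U-Net.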

How to use boundary boxes with images for multi label image training?

I am working on machine learning for image classification and have managed to complete several projects successfully. All of those projects had images that always belonged to one class. Now I want to try images with multiple labels in each image. I read that I have to draw boxes (bounding boxes) around the objects for training.
My questions are:
Do I have to crop those areas into single images and use them as before for training?
Are the drawn boxes only used for cropping?
Or do we really feed the original images and the box coordinates (top left [X, Y], width and height) into training?
Any tutorials or materials related to this are appreciated.
Basically, you need to detect various objects in an image which belong to different classes. This is where object detection comes into the picture.
Object Detection tries to classify labels for various objects in an image and also predict the bounding boxes.
There are many algorithms for object detection. If you are a seasoned TensorFlow user, you can directly use the TensorFlow Object Detection API. You can select the architecture you need and feed the annotations along with the images.
To annotate the images (i.e. draw bounding boxes around the objects and store the coordinates separately), you can use the LabelImg tool.
You can refer to these blogs:
Creating your own object detector
A Step-by-Step Introduction to the Basic Object Detection Algorithms
Instead of training a whole new object detector, you can use one of the available pretrained object detectors. Pretrained models from the TensorFlow Object Detection model zoo are typically trained on COCO and can detect 80 object classes. If the objects you need to classify are included among these classes, you get a ready-to-use model. The model draws a bounding box around the object of your interest.
You can crop this part of the image and build a classifier on it, according to your needs.
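As a hedged sketch of that cropping step, assuming you already have normalized [ymin, xmin, ymax, xmax] boxes and scores back from a detector (the usual output convention of the TensorFlow Object Detection API):

import numpy as np
from PIL import Image

def crop_detections(image_path, boxes, scores, min_score=0.5):
    # crop detected regions out of an image; boxes are assumed to be
    # normalized [ymin, xmin, ymax, xmax] rows
    image = Image.open(image_path)
    w, h = image.size
    crops = []
    for box, score in zip(boxes, scores):
        if score < min_score:
            continue
        ymin, xmin, ymax, xmax = box
        crops.append(image.crop((int(xmin * w), int(ymin * h),
                                 int(xmax * w), int(ymax * h))))
    return crops

The returned crops can then be fed to whatever classifier you build for your own labels.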

dlib training a new faces dataset

I'm trying to train my own set of images with 68 landmarks, and I'm using the train_shape_predictor.py script. The script creates my .dat file, but it does not locate the landmarks well. The resulting detections are only the rectangles around the faces.
I have created the XML file with the landmarks using the imglab tool. It looks different from the test file supplied with dlib.
Where is the problem?
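For reference, a minimal sketch of what the training call in that script boils down to (the file names are placeholders; the key inputs are the imglab XML, which needs the face box and all 68 <part> entries for every image, and the output .dat path):

import dlib

# reasonable starting options; tune for your dataset
options = dlib.shape_predictor_training_options()
options.oversampling_amount = 300
options.nu = 0.05
options.tree_depth = 4
options.be_verbose = True

# train from the imglab XML and write the predictor model
dlib.train_shape_predictor("my_training_landmarks.xml", "predictor.dat", options)

# mean landmark error on the training (or a held-out) XML
print(dlib.test_shape_predictor("my_training_landmarks.xml", "predictor.dat"))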
