dlib training a new faces dataset - python

I'm trying to train a shape predictor on my own set of images with 68 landmarks, using the train_shape_predictor.py script. The script creates my .dat file, but it does not locate the landmarks well: the resulting detections are only the rectangles around the faces.
I created the XML file with the landmarks using the imglab tool. It looks different from the test file supplied with dlib.
Where is the problem?
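In case it helps to compare against a known-good pipeline, here is a minimal sketch of training a shape predictor from an imglab XML file and then applying it alongside the face detector. The file names (training_with_face_landmarks.xml, predictor.dat, test_face.jpg) are placeholders, and the option values are just illustrative defaults, not tuned settings:

```python
import dlib

# Train a shape predictor from the imglab-generated XML file.
options = dlib.shape_predictor_training_options()
options.oversampling_amount = 300
options.nu = 0.05
options.tree_depth = 2
options.be_verbose = True

dlib.train_shape_predictor("training_with_face_landmarks.xml",
                           "predictor.dat", options)

# Apply it: the detector only returns rectangles; the predictor
# must be called on each rectangle to get the 68 landmark points.
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("predictor.dat")

img = dlib.load_rgb_image("test_face.jpg")
for rect in detector(img, 1):
    shape = predictor(img, rect)
    landmarks = [(shape.part(i).x, shape.part(i).y) for i in range(shape.num_parts)]
    print(landmarks)
```

If the output only shows rectangles, one common cause is that only the detector output is being drawn: the landmark points come from calling the trained predictor on each detected rectangle.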

Related

convert segmented images to coco json

I have images saved as PNG in which the object I want to generate COCO-style segmentation annotations for is present. Take this image of a bulb as an example: bulb
Now let's say I want to generate segmentation annotations in COCO format from this image automatically with a program. How can I do it?
One thing we could do is write a program that stores a fixed number of co-ordinates along the edges and then uses those co-ordinates in the segmentation field of the .json file, but that could create problems, since the number of co-ordinates needed to accurately capture the boundary differs from object to object.
Any kind of help is greatly appreciated.
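One way around the fixed-number-of-points problem is to trace the object's contour and let a polygon-approximation step decide how many points are needed. Below is a minimal sketch using OpenCV, assuming the object can be separated from the background by a simple threshold; the file name bulb.png and the category/image ids are placeholders:

```python
import json
import cv2

# Load the image and derive a binary mask of the object.
# (A simple Otsu threshold is assumed here; real data may need a
#  proper segmentation step or an existing mask image.)
img = cv2.imread("bulb.png", cv2.IMREAD_GRAYSCALE)
_, mask = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Trace the object boundary; approxPolyDP reduces the number of points
# adaptively, so the polygon length varies with the object's complexity.
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contour = max(contours, key=cv2.contourArea)
epsilon = 0.005 * cv2.arcLength(contour, True)
polygon = cv2.approxPolyDP(contour, epsilon, True).reshape(-1).tolist()

x, y, w, h = cv2.boundingRect(contour)
annotation = {
    "id": 1,
    "image_id": 1,
    "category_id": 1,
    "segmentation": [polygon],          # COCO expects [x1, y1, x2, y2, ...]
    "bbox": [x, y, w, h],
    "area": float(cv2.contourArea(contour)),
    "iscrowd": 0,
}
print(json.dumps(annotation, indent=2))
```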

Splitting an already labelled image for object detection

I have a large geotiff file of elevation data that I would like to use for object detection. I've labelled the objects (originally as a .shp) and converted the labels into a single geojson.
From reading object detection tutorials it seems that I need to split this large image into multiple images for training/testing. Is there a way to do this using the original labels, so that I don't need to re-label each smaller image?
If anyone has any useful tutorials/end to end examples of preparing satellite data for object detection that would also be really helpful.
The GeoJSON file you have should contain the co-ordinates needed to compute the bounding box of each named portion of the original image (if you want to know how to do that, see https://gis.stackexchange.com/a/313023/120175). Once you have a bounding box, you can use any imaging library (Pillow or Pillow-SIMD) to extract the sub-image it covers, keeping the name stored in the same GeoJSON feature that provided the coordinates. You can operate on the crops in memory or save them as independent images with the same libraries, and those images can then be used for training. A sketch of this is shown below.
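As a rough sketch of that workflow using rasterio and shapely (assuming the GeoJSON and the GeoTIFF share the same CRS; the file names and the "name" property are placeholders):

```python
import json
import rasterio
from rasterio.windows import Window
from shapely.geometry import shape

# Placeholder file names.
with open("labels.geojson") as f:
    features = json.load(f)["features"]

with rasterio.open("elevation.tif") as src:
    for i, feat in enumerate(features):
        # Geographic bounding box of the labelled geometry.
        minx, miny, maxx, maxy = shape(feat["geometry"]).bounds

        # Convert geographic corners to pixel row/col indices.
        row_min, col_min = src.index(minx, maxy)   # top-left corner
        row_max, col_max = src.index(maxx, miny)   # bottom-right corner
        window = Window(col_min, row_min,
                        col_max - col_min, row_max - row_min)
        crop = src.read(window=window)

        # Write each crop as its own GeoTIFF, preserving georeferencing.
        profile = src.profile.copy()
        profile.update(height=crop.shape[1], width=crop.shape[2],
                       transform=src.window_transform(window))
        name = feat["properties"].get("name", f"tile_{i}")
        with rasterio.open(f"{name}.tif", "w", **profile) as dst:
            dst.write(crop)
```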

YOLO V4 Tiny - Making more photos from one annotated image

I am trying to make a YOLOv4-tiny custom data set using Google Colab. I am using labelImg.py for image annotations, which is shown in https://github.com/tzutalin/labelImg.
I have annotated one image, as shown below.
The .txt file with the annotated coordinates looks like the following:
0 0.580859 0.502083 0.303906 0.404167
I only have one class, the calculator class. I want to use this one image to produce 4 more annotated images. I want to rotate the annotated image 45 degrees each time and create a new annotated image and a .txt coordinate file. I have seen something like this done in Roboflow, but I can't figure out how to do it manually with a Python script. Is it possible to do it? If so, how?
You can look into the repo and article below for Python-based data augmentation, including rotation, shearing, resizing, translation, flipping, etc.; a rough manual-rotation sketch follows the links.
https://github.com/Paperspace/DataAugmentationForObjectDetection
https://blog.paperspace.com/data-augmentation-for-bounding-boxes/
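If you only need the single rotation step and don't want to pull in a full augmentation library, the idea looks roughly like this (file names such as calc.jpg/calc.txt are placeholders; note that the axis-aligned box around a 45-degree-rotated object is necessarily looser than the original):

```python
import cv2
import numpy as np

def rotate_yolo_sample(image_path, label_path, angle_deg, out_stem):
    """Rotate an image and its YOLO-format boxes, writing new files.
    Each rotated box is the axis-aligned box around the rotated corners."""
    img = cv2.imread(image_path)
    h, w = img.shape[:2]

    # Rotation matrix, expanded so the whole rotated image stays visible.
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle_deg, 1.0)
    cos, sin = abs(M[0, 0]), abs(M[0, 1])
    new_w, new_h = int(w * cos + h * sin), int(w * sin + h * cos)
    M[0, 2] += new_w / 2 - w / 2
    M[1, 2] += new_h / 2 - h / 2
    rotated = cv2.warpAffine(img, M, (new_w, new_h))

    new_lines = []
    for line in open(label_path):
        cls, xc, yc, bw, bh = line.split()
        xc, yc, bw, bh = float(xc) * w, float(yc) * h, float(bw) * w, float(bh) * h
        # Four corners of the original box, in pixels.
        corners = np.array([[xc - bw / 2, yc - bh / 2], [xc + bw / 2, yc - bh / 2],
                            [xc + bw / 2, yc + bh / 2], [xc - bw / 2, yc + bh / 2]])
        rotated_corners = np.hstack([corners, np.ones((4, 1))]) @ M.T
        x_min, y_min = rotated_corners.min(axis=0)
        x_max, y_max = rotated_corners.max(axis=0)
        # Back to normalised YOLO format on the new canvas size.
        new_lines.append("%s %.6f %.6f %.6f %.6f" % (
            cls, (x_min + x_max) / 2 / new_w, (y_min + y_max) / 2 / new_h,
            (x_max - x_min) / new_w, (y_max - y_min) / new_h))

    cv2.imwrite(out_stem + ".jpg", rotated)
    with open(out_stem + ".txt", "w") as f:
        f.write("\n".join(new_lines))

# Example: create four rotated copies at 45-degree steps.
for k in range(1, 5):
    rotate_yolo_sample("calc.jpg", "calc.txt", 45 * k, "calc_rot%d" % k)
```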
If you are using AlexeyAB's darknet repo for yolov4, then there are some augmentations you can use to increase training data size and variation.
https://github.com/AlexeyAB/darknet/wiki/CFG-Parameters-in-the-%5Bnet%5D-section
Look into the Data Augmentation section, which lists various augmentations for object detection that you can enable by adding them to the YOLO cfg file.

Transferring training data from matlab to tensorflow

I have used MATLAB's Image Labeler app to create PixelLabelData for 500 images. So I have the original images and class labels for each image. This information is stored in a gTruth file, which is in .mat format. I want to use this dataset for training a U-Net in TensorFlow (Google Colab).
I could not carry out the training in MATLAB because of system limitations (insufficient RAM and no GPU). However, I have read that we can import training data from MATLAB for use in Colab. So I uploaded the original image set, the labelled pixels, and the corresponding mat file (gTruth.mat) to Google Drive and then mounted the drive in the Colab environment. But I don't know how to proceed with the mat file in Colab.
The pixelLabelTrainingData function will allow you to obtain two separate datastores for the input and pixel labeled images.
[imds,pxds] = pixelLabelTrainingData(gTruth);
https://www.mathworks.com/help/vision/ref/pixellabeltrainingdata.html
Given those, you can write each image and its labeled image to parallel directories, using the same naming convention and imwrite with the image file format of your choice.
inputImageDir = 'pathOfYourChoice';
count = 0;
while hasdata(imds)
    img = read(imds);
    fname = sprintf('img%d.png', count);   % shared naming convention
    name = fullfile(inputImageDir, fname);
    imwrite(img, name);
    count = count + 1;                     % increment so files are not overwritten
end
% Repeat the same loop over pxds to write the label images to a parallel directory.
From there, you should be able to use standard tensorflow tooling (e.g. Dataset) to read in directories of images.
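On the Python side, a minimal sketch of pairing the exported images and labels with tf.data might look like this (the directory paths are placeholders and assume the parallel-directory, shared-filename convention described above):

```python
import tensorflow as tf

IMAGE_DIR = "drive/MyDrive/images"   # placeholder paths
LABEL_DIR = "drive/MyDrive/labels"

def load_pair(image_path):
    # The label file shares its file name with the input image.
    label_path = tf.strings.regex_replace(image_path, IMAGE_DIR, LABEL_DIR)
    image = tf.io.decode_png(tf.io.read_file(image_path), channels=3)
    label = tf.io.decode_png(tf.io.read_file(label_path), channels=1)
    return tf.cast(image, tf.float32) / 255.0, label

dataset = (tf.data.Dataset.list_files(IMAGE_DIR + "/*.png", shuffle=True)
           .map(load_pair, num_parallel_calls=tf.data.AUTOTUNE)
           .batch(8)
           .prefetch(tf.data.AUTOTUNE))
```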

How to recognize faces without training

I am building a type of "person counter" that is getting face images from live video footage.
If a new face is detected in some frame the program will count that face/person. I thus need a way to check if a particular face has already been detected.
I have tried using a training program to recognize a template image to avoid counting the same face multiple times but due to there being only one template, the system was massively inaccurate and slightly too slow to run for every frame of the feed.
To better understand the process: at the beginning, as a face is detected the frame is cropped and the (new) face is saved in a file location. Afterwards, faces detected in subsequent frames need to go through a process to detect whether a similar face has been detected before and exist in the database (if they do, they shouldn't get added to the database).
One recipe to face (pun! ;) this could be, for every frame:
get all the faces in the frame (with OpenCV you can detect and crop them)
generate face embeddings for the collected faces (e.g. using a pre-trained model for the purpose <- most likely this is the pre-trained component you are looking for, since it allows you to "condense" a face image into a vector)
add all the so-obtained face embeddings to a list
Then, at some pre-defined time interval, run a clustering algorithm (see also Face clustering using Chinese Whispers algorithm) on the list of collected face embeddings. This will group together the faces belonging to the same person, and thus let you count the people appearing in the video; a sketch is shown below.
Once the clusters are consolidated, you could prune some of the faces belonging to the same cluster/person (to save storage, if you want).
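A minimal sketch of the embedding-plus-clustering part, assuming the face_recognition package (a dlib wrapper) for embeddings and scikit-learn's DBSCAN as the clustering step; the video path and the distance threshold are placeholders that need tuning for your setup:

```python
import cv2
import face_recognition
import numpy as np
from sklearn.cluster import DBSCAN

embeddings = []
video = cv2.VideoCapture("feed.mp4")   # placeholder source

while True:
    ok, frame = video.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    # Detect faces in this frame and turn each one into a 128-d embedding.
    boxes = face_recognition.face_locations(rgb)
    embeddings.extend(face_recognition.face_encodings(rgb, boxes))

video.release()

# Cluster the embeddings; faces of the same person should land in one cluster.
# eps is a placeholder threshold; DBSCAN labels outliers as -1 (noise).
labels = DBSCAN(eps=0.5, min_samples=3, metric="euclidean").fit_predict(np.array(embeddings))
num_people = len(set(labels)) - (1 if -1 in labels else 0)
print("estimated distinct people:", num_people)
```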
