I am trying to make a YOLOv4-tiny custom dataset using Google Colab. I am using labelImg.py for image annotation, as shown in https://github.com/tzutalin/labelImg.
I have annotated one image, as shown below.
The .txt file with the annotated coordinates looks as follows:
0 0.580859 0.502083 0.303906 0.404167
I only have one class, the calculator class. I want to use this one image to produce 4 more annotated images, rotating the annotated image 45 degrees each time to create a new annotated image and a .txt coordinate file. I have seen something like this done in Roboflow, but I can't figure out how to do it manually with a Python script. Is it possible? If so, how?
You can look into the repo and article below for Python-based data augmentation, including rotation, shearing, resizing, translation, flipping, etc.:
https://github.com/Paperspace/DataAugmentationForObjectDetection
https://blog.paperspace.com/data-augmentation-for-bounding-boxes/
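For the specific rotate-and-rewrite workflow you describe, here is a minimal manual sketch (an illustration under assumptions, not a definitive implementation: it assumes OpenCV and NumPy are available, the box values come from your .txt file above, and the file names are placeholders). Note that rotating an axis-aligned box gives a tilted rectangle, so the script keeps the axis-aligned box enclosing the rotated corners, which is slightly looser than the original:

    import cv2
    import numpy as np

    def rotate_image_and_yolo_box(img, box, angle_deg):
        """Rotate an image and one YOLO box (cls, cx, cy, w, h, normalized)."""
        h, w = img.shape[:2]
        # Rotation matrix about the image center; expand the canvas so
        # nothing is clipped.
        M = cv2.getRotationMatrix2D((w / 2, h / 2), angle_deg, 1.0)
        cos, sin = abs(M[0, 0]), abs(M[0, 1])
        new_w = int(w * cos + h * sin)
        new_h = int(w * sin + h * cos)
        M[0, 2] += new_w / 2 - w / 2
        M[1, 2] += new_h / 2 - h / 2
        rotated = cv2.warpAffine(img, M, (new_w, new_h))

        cls, cx, cy, bw, bh = box
        # Four corners of the box in pixel coordinates.
        x1, y1 = (cx - bw / 2) * w, (cy - bh / 2) * h
        x2, y2 = (cx + bw / 2) * w, (cy + bh / 2) * h
        corners = np.array([[x1, y1], [x2, y1], [x2, y2], [x1, y2]])
        rotated_corners = np.hstack([corners, np.ones((4, 1))]) @ M.T
        # Axis-aligned box enclosing the rotated corners, renormalized.
        nx1, ny1 = rotated_corners.min(axis=0)
        nx2, ny2 = rotated_corners.max(axis=0)
        return rotated, (cls,
                         (nx1 + nx2) / 2 / new_w, (ny1 + ny2) / 2 / new_h,
                         (nx2 - nx1) / new_w, (ny2 - ny1) / new_h)

    img = cv2.imread('calculator.jpg')  # placeholder file name
    box = (0, 0.580859, 0.502083, 0.303906, 0.404167)  # from the .txt above
    for i, angle in enumerate([45, 90, 135, 180], start=1):
        out_img, out_box = rotate_image_and_yolo_box(img, box, angle)
        cv2.imwrite(f'calculator_rot{i}.jpg', out_img)
        with open(f'calculator_rot{i}.txt', 'w') as f:
            f.write('%d %.6f %.6f %.6f %.6f\n' % out_box)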
If you are using AlexeyAB's darknet repo for YOLOv4, there are also built-in augmentations you can use to increase training data size and variation:
https://github.com/AlexeyAB/darknet/wiki/CFG-Parameters-in-the-%5Bnet%5D-section
Look into the data augmentation section, where various augmentations for object detection are defined; you enable them by adding them to the YOLO .cfg file.
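For example, the [net] section of the .cfg could enable some of the documented augmentations like this (parameter names are the ones listed on that wiki page; the values here are arbitrary, and note that, per the wiki, the angle rotation parameter applies to classification training only, so 45-degree rotation for detection still needs a script like the one above):

    [net]
    ...
    saturation = 1.5   # random saturation change
    exposure = 1.5     # random exposure change
    hue = .1           # random hue change
    flip = 1           # horizontal flip
    mosaic = 1         # combine four training images into one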
I have images saved as PNG that contain the object for which I want to generate COCO-style segmentation annotations. Take this image of a bulb as an example.
Now let's say I want to generate segmentation annotations in COCO format from this image automatically with a program. How can I do it?
One thing we could do is write a program that stores a fixed number of coordinates lying on the object's edges and then uses those coordinates in the segmentation field of the .json file, but that could create problems, since the number of coordinates needed to accurately capture an object's boundary varies from object to object.
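Here is a rough sketch of that idea, assuming OpenCV 4 and that the object can be separated from the background by a simple threshold (the file name and threshold value are placeholders); approxPolyDP at least gives some control over how many coordinates end up in the segmentation field:

    import cv2
    import json

    img = cv2.imread('bulb.png', cv2.IMREAD_GRAYSCALE)
    # Assume a light background: everything darker than the threshold is object.
    _, mask = cv2.threshold(img, 200, 255, cv2.THRESH_BINARY_INV)

    # OpenCV 4 signature: findContours returns (contours, hierarchy).
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contour = max(contours, key=cv2.contourArea)  # largest blob = the bulb
    # Simplify the boundary; a larger epsilon means fewer coordinates.
    approx = cv2.approxPolyDP(contour, 1.0, True)

    # COCO stores each polygon as a flat [x1, y1, x2, y2, ...] list.
    segmentation = [float(v) for point in approx for v in point[0]]
    x, y, w, h = cv2.boundingRect(contour)
    annotation = {
        'id': 1, 'image_id': 1, 'category_id': 1,
        'segmentation': [segmentation],
        'area': float(cv2.contourArea(contour)),
        'bbox': [x, y, w, h],
        'iscrowd': 0,
    }
    print(json.dumps(annotation))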
Any kind of help is greatly appreciated.
Let's assume I have a small dataset and I want to implement data augmentation. First I implement image segmentation (after which the image will be a binary image) and then implement data augmentation. Is this a good approach?
For image augmentation in semantic and instance segmentation, you must either leave the positions of the objects in the image unchanged (by manipulating colors, for example) or modify those positions consistently by applying translations and rotations.
So yes, this way works, but you have to take into consideration the type of data you have and what you are trying to achieve. Data augmentation isn't a ready-to-go process that gives good results everywhere.
In case you have:
Semantic segmentation: each pixel of your image, at row i and column j, is labeled with its enclosing object. This means having your main image I and a label image L of the same size, linking every pixel to its object label. In this case, your data augmentation is applied to both I and L, giving a consistent pair of transformed images (see the sketch after this list).
Instance segmentation: here a mask is generated for every instance in the original image, and the augmentation is applied to all of them, including the original; from these transformed masks we get our new instances.
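As a minimal sketch of the semantic case (an illustration only; OpenCV is assumed and the file names are placeholders), the key point is that every geometric transform hits both I and L, while color transforms hit I alone:

    import cv2

    I = cv2.imread('image.png')
    L = cv2.imread('label.png', cv2.IMREAD_GRAYSCALE)

    # Geometric augmentation: apply identically to the image and the labels.
    I_rot = cv2.rotate(I, cv2.ROTATE_90_CLOCKWISE)
    L_rot = cv2.rotate(L, cv2.ROTATE_90_CLOCKWISE)

    # Color augmentation: apply to the image only; labels stay untouched.
    I_jit = cv2.convertScaleAbs(I, alpha=1.2, beta=10)  # contrast/brightness
    L_jit = L.copy()

    # For general warps (cv2.warpAffine etc.), use nearest-neighbor
    # interpolation on L so that no new label values are invented.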
EDIT:
Take a look at CLoDSA (Classification, Localization, Detection and Segmentation Augmentor) it may help you implement your idea.
If your dataset is small, you should add data augmentation during training. It is important to change the original image and the targets (masks) in the same way!
For example, if an image is rotated 90 degrees, then its mask should also be rotated 90 degrees. Since you are using the Keras library, you should check whether ImageDataGenerator also changes the target images (masks) along with the inputs. If it doesn't, you can implement the augmentations yourself. This repository shows how it is done with OpenCV:
https://github.com/kochlisGit/random-data-augmentations
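If you stay with Keras, one common pattern (a sketch; the directory layout and argument values are placeholders) is to run two ImageDataGenerators with identical arguments and the same seed, so the masks receive exactly the same geometric augmentation as the inputs:

    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # Identical transform arguments for both streams.
    data_gen_args = dict(rotation_range=90, horizontal_flip=True, zoom_range=0.2)
    image_datagen = ImageDataGenerator(**data_gen_args)
    mask_datagen = ImageDataGenerator(**data_gen_args)

    seed = 1  # the shared seed is what keeps the two streams aligned
    image_generator = image_datagen.flow_from_directory(
        'data/images', class_mode=None, seed=seed)
    mask_generator = mask_datagen.flow_from_directory(
        'data/masks', class_mode=None, seed=seed)

    # Yields (image_batch, mask_batch) pairs with matching transforms.
    train_generator = zip(image_generator, mask_generator)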
I started studying machine learning in Python a few days ago. I was trying some examples online when I decided to try it for myself using a custom dataset.
However, I noticed that most datasets involve images taken with a camera, comprising hundreds, if not thousands, of images of the same target.
If I create a custom icon in Photoshop, do I need to take a picture of my monitor a thousand times to achieve this? Is it possible to train an AI using only a single PNG file?
My goal right now is to have the AI do object detection on another, bigger image, where it needs to find the custom icon inside that image, kind of like finding Waldo. All of these are digital images straight from Photoshop, though, so I don't know if it is possible.
Right now, I am using a python-based Computer Vision library called ImageAI.
You can use a data preparation strategy called Data Augmentation.
There are mainly two types of augmentation:
Linear Transformation
Affine Transformation
Here is a good white paper:
http://cs231n.stanford.edu/reports/2017/pdfs/300.pdf
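To make the single-PNG idea concrete, here is a small sketch (assumptions: OpenCV and NumPy; the file name, output folder, and parameter ranges are placeholders) that generates many training variants of one icon with random affine transforms:

    import os
    import cv2
    import numpy as np

    icon = cv2.imread('icon.png')
    h, w = icon.shape[:2]
    os.makedirs('augmented', exist_ok=True)

    rng = np.random.default_rng(0)
    for i in range(100):
        angle = rng.uniform(-30, 30)            # random rotation (degrees)
        scale = rng.uniform(0.7, 1.3)           # random scale
        tx, ty = rng.uniform(-10, 10, size=2)   # random translation (pixels)
        M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, scale)
        M[:, 2] += (tx, ty)
        variant = cv2.warpAffine(icon, M, (w, h), borderValue=(255, 255, 255))
        cv2.imwrite(f'augmented/icon_{i:03d}.png', variant)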
There is an implementation of Mask R-CNN on GitHub by Matterport.
I'm trying to train it on my own data. I'm adding polygons to images with this tool, drawing the polygons manually, but I already have a manually segmented image (the black and white one below).
My questions are:
1) When adding JSON annotations for the region data, is there a way to use that pre-segmented image below?
2) Is there a way to train my data for this algorithm without adding JSON annotations, using the manually segmented images instead? The tutorials and posts I've seen all use JSON annotations for training.
3) This algorithm's output is obviously an image with masks; is there a way to get a black and white output for the segmentations?
Here's the code that I'm working on in Google Colab.
Original Repo
My Fork
Manually segmented image
I think questions 1 and 2 point to the same solution: you need to convert your masks to JSON annotations. For that, I suggest you read this link, posted in the repository of the cocodataset. There you can read about this repository, which you could use for what you need. You could also use the COCO PythonAPI directly, calling the methods defined here.
For question 3, a mask is already a binary image, so you can show it as black and white pixels.
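For instance (a sketch following the Matterport demo's variable names, where r is the results dict from model.detect([image])[0]):

    import numpy as np
    import cv2

    def save_masks_bw(r, prefix='mask'):
        """Write each instance mask from a Matterport results dict as B/W PNG."""
        masks = r['masks']  # boolean array (H, W, N), one slice per instance
        for i in range(masks.shape[-1]):
            bw = masks[:, :, i].astype(np.uint8) * 255  # True -> white
            cv2.imwrite(f'{prefix}_{i}.png', bw)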
I am trying to detect a vehicle in an image (actually in a sequence of frames from a video). I am new to OpenCV and Python, and I am working under Windows 7.
Is there a way to get the horizontal and vertical edges of an image and then sum the resulting images into respective vectors?
Is there Python code or a function available for this?
I looked at this and this but would not get a clue how to do it.
You may use the following image for illustration.
EDIT
I was inspired by the idea presented in the following paper (sorry if you do not have access):
Betke, M., Haritaoglu, E., & Davis, L. S. (2000). Real-time multiple vehicle detection and tracking from a moving vehicle. Machine Vision and Applications, 12, 69-83. Springer-Verlag.
I would take a look at the squares example for OpenCV, posted here. It uses Canny and then does a contour find to return the sides of each square. You should be able to modify this code to get the horizontal and vertical lines you are looking for. Here is a link to the documentation for the Python call of Canny; it is rather helpful for all-around edge detection. In about an hour I can get home and give you a working example of what you want.
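As a starting point in the meantime, the core of it looks roughly like this (a sketch only; the file name and hysteresis thresholds are placeholders you will want to tune):

    import cv2

    img = cv2.imread('vehicle.jpg', cv2.IMREAD_GRAYSCALE)
    edges = cv2.Canny(img, 50, 150)  # low/high hysteresis thresholds
    # OpenCV 4 signature: findContours returns (contours, hierarchy).
    contours, _ = cv2.findContours(edges, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)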
Do some reading on Sobel filters.
http://en.wikipedia.org/wiki/Sobel_operator
You can basically get vertical and horizontal gradients at each pixel.
Here is the OpenCV function for it.
http://docs.opencv.org/modules/imgproc/doc/filtering.html?highlight=sobel#sobel
Once you have these filtered images, you can collect statistics column- or row-wise, decide whether a given location is an edge, and record its position.
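A minimal sketch of that pipeline (OpenCV and NumPy assumed; the file name is a placeholder):

    import cv2
    import numpy as np

    img = cv2.imread('vehicle.jpg', cv2.IMREAD_GRAYSCALE)
    gx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)  # x-gradient -> vertical edges
    gy = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)  # y-gradient -> horizontal edges

    # Sum edge strength into the column-wise and row-wise vectors you described.
    col_profile = np.abs(gx).sum(axis=0)  # one value per column
    row_profile = np.abs(gy).sum(axis=1)  # one value per row
    # Peaks in these profiles mark columns/rows with strong edge structure.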
Typically, geometrical approaches to object detection are not hugely successful, as the appearance model you assume can quite easily be violated by occlusion, noise, or orientation changes.
Machine learning approaches typically work much better in my opinion and would probably provide a more robust solution to your problem. Since you appear to be working with OpenCV, you could take a look at cascade classifiers, for which OpenCV provides Haar-wavelet-based and local-binary-pattern-based feature classifiers.
The link I have provided is a tutorial with very complete steps explaining how to create a classifier with several prewritten utilities. Basically, you will create a directory of 'positive' images of cars and a directory of 'negative' images of typical backgrounds. The opencv_createsamples utility can then create training images warped to simulate different orientations and average intensities from a small set of images, and the opencv_traincascade utility, given a few command-line parameters to select the training options, outputs a trained classifier for you.
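For example, the invocations look roughly like this (illustrative only; the file names, sample counts, and window size are placeholders, and the tutorial covers the full set of flags):

    opencv_createsamples -img car.png -bg negatives.txt -vec cars.vec -num 1000 -w 48 -h 24
    opencv_traincascade -data classifier -vec cars.vec -bg negatives.txt \
        -numPos 900 -numNeg 500 -numStages 15 -w 48 -h 24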
Detection can be performed using either the C++ or the Python interface with this trained classifier.
For instance, using Python you can load the classifier and perform detection on an image, getting back a set of bounding rectangles:
    import cv2

    image = cv2.imread('path/to/image')
    cc = cv2.CascadeClassifier('path/to/classifierfile')
    objs = cc.detectMultiScale(image)  # bounding rectangles as (x, y, w, h)
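Since detectMultiScale returns the rectangles as (x, y, w, h) tuples, you can, for example, draw them back onto the image to inspect the detections:

    for (x, y, w, h) in objs:
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imwrite('detections.jpg', image)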