I'm training a CNN to learn an unusual task where the labels for each image depend on the other images in the minibatch. Further, the data undergo a number of transformations and pass through a bunch of different threads and queues before the net can actually access them.
I'd like to be able to validate that the label and image mapping is correct. However, it doesn't seem like it's possible to get TensorFlow to surface the filename for each image summary as part of tf.image_summary(...). Does anyone know if it's possible to do this? Because the labeling regime is so unusual, you cannot gather immediately from the image if the label itself is correct. I need to be able to get the filename to recover the label properly. I can provide the filenames along with the images themselves and their labels in the queue without any problem.
Edit: To clarify, I'm using TensorBoard to display the image summaries.
Related
Problem statement
I am working on a project using a YOLO model to detect tools in a given picture, e.g. hammers, screwdrivers, bolts, etc. I have only 100 pictures as a training dataset, and I have labelled them using polygons. I decided to augment the data with the code given below. I now have 500 new images, but the problem is that I don't want to label them again. I am looking for a way to have the labels (polygons) adjusted along with the new augmented images, so that I get the polygon data without labelling again. In short, I want to preserve the labels during the image augmentation process.
Code used for Augmentation
    import os
    from PIL import Image, ImageEnhance

    Brightness = [0.7, 0.8, 1]  # Different brightness levels
    Rotation = [10, -10]  # Different rotation levels (degrees)
    # Main source folder
    main_path = r"./Augmentation"
    # Output folder for the augmented images
    out_dir = os.path.join(main_path, 'augmented_Images')
    os.makedirs(out_dir, exist_ok=True)
    # Directory listing from main path (the first entry is skipped, as before)
    names = os.listdir(main_path)[1:]
    # Read and iterate over all images in the directory
    for name in names:
        path = os.path.join(main_path, name)
        if not os.path.isfile(path):
            continue  # skip sub-folders such as augmented_Images
        image = Image.open(path)
        # Apply each rotation to the image
        for j in Rotation:
            rotated = image.rotate(j)
            rotated_rgb = rotated.convert("RGB")
            apply_br = ImageEnhance.Brightness(rotated_rgb)
            # Apply each brightness level to the rotated image
            for i in Brightness:
                # Save as <brightness><rotation><original name>
                apply_br.enhance(i).save(os.path.join(out_dir, str(i) + str(j) + name))
    print('image has been augmented successfully')
Look into imgaug. The augmentations from this module also augment the labels. One more thing: what you are doing right now is offline augmentation. You might want to look at online augmentation, where the pictures are augmented in a different way every epoch and you train only on the augmented pictures. This way you don't need a lot of disk space.
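To illustrate what "augmenting the labels" means for the rotation step in the question, here is a minimal sketch that applies the same rotation to polygon vertices. It is not the imgaug API; `rotate_polygon` is a hypothetical helper. It assumes the image is rotated counter-clockwise about its centre (width/2, height/2) without expanding the canvas, which is how `Image.rotate(angle)` is used above; it is worth checking the result visually on one image. Brightness changes do not move pixels, so the polygons stay unchanged for that step.

```python
import math


def rotate_polygon(points, angle_deg, width, height):
    """Rotate (x, y) vertices counter-clockwise about the image centre.

    Image coordinates are assumed: x grows to the right, y grows downwards.
    """
    cx, cy = width / 2.0, height / 2.0
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        # With y pointing down, a visually counter-clockwise turn has this form.
        rotated.append((cx + cos_t * dx + sin_t * dy,
                        cy - sin_t * dx + cos_t * dy))
    return rotated


# Hypothetical polygon label on a 100x100 image.
polygon = [(60.0, 50.0), (80.0, 50.0), (80.0, 70.0)]
for angle in (10, -10):
    print(angle, rotate_polygon(polygon, angle, width=100, height=100))
```

Vertices that end up outside the image after rotation would still need to be clipped or the polygon discarded, which a library such as imgaug handles for you.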
If you are using Yolov4 with darknet, image augmentation is performed automatically.
I am using CV2 to resize various images with different dimensions (e.g. 70*300, 800*500, 60*50) to a specific dimension of 200*200 pixels. Later, I feed the pictures to a CNN to classify the images. (My understanding is that pictures must have the same size when fed into a CNN.)
My questions:
1- How are low-resolution pictures converted into higher-resolution ones, and how are higher resolutions converted into lower ones? Will this affect the information stored in the pictures?
2- Is it good practice to use this approach with CNN? Or is it better to Pad zeros to the end of the image to get the desired resolution? I have seen many researchers pad the end of a file with zeros when trying to detect Malware files to have a common dimension for all the files. Does this mean that padding is more accurate than resizing?
Using interpolation. https://chadrick-kwag.net/cv2-resize-interpolation-methods/
Definitely, resizing is a lossy process and you'll lose information.
Both are okay, and which one is used depends on the needs. Resizing is equally applicable. If your CNN can't cope with the difference between the original and resized images, it must be badly overfitted. Resizing is also a very light form of regularization; it is even advisable to apply more augmentation schemes to the images before CNN training.
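To make the two options concrete, here is a small NumPy-only sketch. It is an illustration under stated assumptions, not what `cv2.resize` does internally: `resize_nearest` and `letterbox` are hypothetical helpers, and nearest-neighbour sampling stands in for the interpolation methods described in the link above. The first call stretches the image to 200*200 (aspect ratio distorted); the second scales the longer side to 200 and zero-pads the rest (aspect ratio kept).

```python
import numpy as np


def resize_nearest(img, new_h, new_w):
    """Nearest-neighbour resize: each output pixel copies one source pixel."""
    h, w = img.shape[:2]
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return img[rows][:, cols]


def letterbox(img, size=200):
    """Scale so the longer side equals `size`, then zero-pad bottom and right."""
    h, w = img.shape[:2]
    scale = size / max(h, w)
    new_h = max(1, round(h * scale))
    new_w = max(1, round(w * scale))
    resized = resize_nearest(img, new_h, new_w)
    canvas = np.zeros((size, size) + img.shape[2:], dtype=img.dtype)
    canvas[:new_h, :new_w] = resized
    return canvas


tall = np.full((300, 70, 3), 255, dtype=np.uint8)  # 70 wide, 300 high
stretched = resize_nearest(tall, 200, 200)          # plain resize
padded = letterbox(tall, 200)                        # resize + zero padding
print(stretched.shape, padded.shape)
```

Either way some information is lost when an image is scaled down, and pure padding only works when the image is already no larger than the target size.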
I want to make a CNN or FCN that takes grayscale images as input and outputs a color image. It is very important to me that the size of the images can vary. I heard that I can only do this with an FCN, using one batch with images of one size and another batch with images of another size. But I don't know how to implement this concept in TensorFlow Keras (the Python version), and I was wondering if you could provide some sample code or pseudocode? I would appreciate that. Thanks!
I know you want to keep them all in their original size, but that's not possible. Don't worry, though, because the resizing can take place while the images are being fed into the model (while in memory); the image will never be touched except to be read.
Here's a great example that I frequently reference!
So I've been messing around with tensorflow's object detection api and specifically the re-training of models, essentially doing this.
I made it detect my object fairly well with a small number of images. I wanted to increase the number of images I train with, but the labeling process is long and boring, so I found a data set of cropped images in which only my object appears.
If there's a way to send whole images to be trained with the TensorFlow API without labeling them, I didn't find it, but I thought that writing a program that labels the whole image would not be that hard.
The format of the labeling is a csv file with these entries: filename, width, height, class, xmin, ymin, xmax, ymax.
This is my code:
    import os
    import cv2

    path = "D:/path/to/image/folder"
    # Write one line per image: filename,width,height,class,xmin,ymin,xmax,ymax
    with open("D:/result/train.txt", "w") as text:
        for filename in os.listdir(path):
            if filename.endswith(".jpg"):
                impath = path + "/" + filename
                img = cv2.imread(impath)
                if img is None:
                    continue  # cv2.imread returns None for unreadable files
                height, width = img.shape[:2]
                # The box spans the whole image, since the crop shows only the object
                res = f"{filename},{width},{height},person,1,1,{width - 1},{height - 1}\n"
                text.write(res)
                print(res)
This seems to be working fine.
Now here's the problem. After converting the .txt to .csv and running the training until the loss stops decreasing, the detections on my test set are awful. It puts a huge bounding box around the entirety of the image, as if it were trained to detect only the edges of the image.
I figure it's somehow learning to detect the edges of the images since the labeling is around the whole image. But how do I make it learn to "see" what's in the picture? Any help would be appreciated.
The model predicts exactly what it was trained for: huge bounding boxes covering the entire image. Obviously, if your training data comprises only boxes with (normalized) coordinates [0, 0, 1, 1], the model will learn that and predict the same for the test set.
You may try some kind of augmentation: put your images on a larger black/grey canvas and adjust the bounding boxes correspondingly. That is what SSD augmentation does, for instance. However, there is no free and good way to compensate for the absence of a properly labelled training set.
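The canvas idea can be sketched roughly as follows. This is a minimal NumPy illustration, not the SSD implementation; `place_on_canvas`, the canvas size and the grey fill value are assumptions chosen for the example. The returned box is in pixel coordinates in the same order as the CSV in the question (xmin, ymin, xmax, ymax).

```python
import numpy as np


def place_on_canvas(img, canvas_h, canvas_w, top, left, fill=0):
    """Paste `img` onto a larger canvas.

    Returns the canvas and the pixel box (xmin, ymin, xmax, ymax)
    that the pasted image occupies on it.
    """
    h, w = img.shape[:2]
    if top < 0 or left < 0 or top + h > canvas_h or left + w > canvas_w:
        raise ValueError("image does not fit on the canvas at that offset")
    canvas = np.full((canvas_h, canvas_w) + img.shape[2:], fill, dtype=img.dtype)
    canvas[top:top + h, left:left + w] = img
    box = (left, top, left + w - 1, top + h - 1)
    return canvas, box


# Hypothetical 60x50 crop placed at a random offset on a 400x300 grey canvas.
rng = np.random.default_rng(0)
crop = np.full((50, 60, 3), 200, dtype=np.uint8)
canvas_h, canvas_w = 300, 400
top = int(rng.integers(0, canvas_h - crop.shape[0] + 1))
left = int(rng.integers(0, canvas_w - crop.shape[1] + 1))
canvas, box = place_on_canvas(crop, canvas_h, canvas_w, top, left, fill=128)
print(canvas.shape, box)
```

Varying the offset and canvas size per sample gives boxes of different positions and relative sizes, although the flat background is still far from real scenes.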
I have a set of images in a folder, where each image either has a square shape or a triangle shape on a white background (like this and this). I would like to separate those images into different folders (note that I don't care about detecting whether image is a square/triangle etc. I just want to separate those two).
I am planning to use more complex shapes in the future (e.g. pentagons, or other non-geometric shapes) so I am looking for an unsupervised approach. But the main task will always be clustering a set of images into different folders.
What is the easiest/best way to do it? I looked at image clustering algorithms, but they cluster colors/shapes inside a single image. In my case I simply want to separate the image files based on the shapes they contain.
Any pointers/help is appreciated.
You can follow this method:
1. Create a look-up table of the shapes you are using in the images.
2. Do template matching on the images stored in a single folder.
3. According to the result of the template matching, store them in different folders.
4. You can create the folders beforehand and just replace the strings in the program according to the usage.
I hope this helps
It's really going to depend on what your data set looks like (e.g., what your shape images look like), and how robust you want your solution to be. The tricky part is going to be extracting features from each shape image that produce a clustering result you're satisfied with. A few ideas:
You could compute SIFT features for each image and then cluster the images based on those features: http://en.wikipedia.org/wiki/Scale-invariant_feature_transform
If you don't want to go the SIFT route, you could try something like HOG: http://en.wikipedia.org/wiki/Histogram_of_oriented_gradients
A somewhat more naive approach: if the shapes are always the same scale and the background color is fixed, you could get rid of the background and cluster the images based on shape area (e.g., the number of pixels taken up by the shape).
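The area-based idea could look roughly like this. It is a sketch on synthetic grayscale images, assuming a pure white (255) background and shapes of similar scale; `shape_area`, `two_means_1d`, `make_square` and `make_triangle` are hypothetical helpers, and the tiny 1-D k-means stands in for whatever clustering routine you prefer.

```python
import numpy as np


def shape_area(img, background=255):
    """Count pixels that differ from the background value."""
    return int(np.count_nonzero(img != background))


def two_means_1d(values, iterations=20):
    """Split 1-D values into two clusters with a tiny k-means (k=2)."""
    values = np.asarray(values, dtype=float)
    centers = np.array([values.min(), values.max()])
    labels = np.zeros(len(values), dtype=int)
    for _ in range(iterations):
        # Assign each value to its nearest centre, then recompute the centres.
        labels = np.abs(values[:, None] - centers[None, :]).argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = values[labels == k].mean()
    return labels


def make_square(side, size=64):
    img = np.full((size, size), 255, dtype=np.uint8)
    img[10:10 + side, 10:10 + side] = 0
    return img


def make_triangle(side, size=64):
    img = np.full((size, size), 255, dtype=np.uint8)
    for r in range(side):
        img[10 + r, 10:10 + r + 1] = 0  # right triangle, one row at a time
    return img


images = [make_square(20), make_square(22), make_triangle(20), make_triangle(22)]
areas = [shape_area(img) for img in images]
labels = two_means_1d(areas)
print(areas, labels.tolist())
```

Each label can then be mapped to a target folder, e.g. by moving the i-th file into a folder named after `labels[i]`. With more shape types, area alone will stop being discriminative and features such as HOG or SIFT become the better choice.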