labelme json file, extracting coordinates of a rectangle - python

I am doing an object detection task and annotated my images using LabelMe. The images were annotated with rectangular bounding boxes in the LabelMe tool. The JSON file of the annotation only shows two point coordinates instead of the four required to define a rectangle. I need all four coordinates, or at least two coordinates plus the width and height of the rectangle, so I can prepare the mask. Can someone help me with that?

You only need two points to define a bounding box.
Let's say that the 4 point coordinates that form the bounding box are the following:
(x, y)
(x + w, y)
(x, y + h)
(x + w, y + h)
What LabelMe is showing you is the xmin, ymin, xmax and ymax. So, using the example, LabelMe would show you:
<bndbox>
<xmin>x</xmin>
<ymin>y</ymin>
<xmax>x+w</xmax>
<ymax>y+h</ymax>
</bndbox>
Which is enough to get the other 2 point coordinates.
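For instance, a minimal sketch of reading those two stored corners back out of a LabelMe JSON file and expanding them into all four corners might look like this ("annotation.json" is a placeholder filename; the shapes/points layout is the usual LabelMe rectangle format):

import json

with open("annotation.json") as f:
    data = json.load(f)

for shape in data["shapes"]:
    if shape.get("shape_type") != "rectangle":
        continue
    (x1, y1), (x2, y2) = shape["points"]
    # Sort so (xmin, ymin) is the top-left corner, regardless of
    # which corner was clicked first in the tool.
    xmin, xmax = sorted((x1, x2))
    ymin, ymax = sorted((y1, y2))
    w, h = xmax - xmin, ymax - ymin
    corners = [(xmin, ymin), (xmax, ymin), (xmin, ymax), (xmax, ymax)]
    print(shape["label"], corners, "w/h:", (w, h))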

Related

How to convert horizontal bounding box to oriented bounding box in object detection task

I have been trying to detect oriented bounding boxes (OBB) with Faster R-CNN for a long time, but I couldn't get it to work. I aim to detect objects in the DOTA dataset. I was using the built-in Faster R-CNN model in PyTorch, but realized that it does not support OBB. Then I found another library named Detectron2, which is built on the PyTorch framework. The built-in Faster R-CNN network in Detectron2 is actually compatible with OBB, but I could not make that model work with DOTA, because I could not convert the DOTA box annotations to (cx, cy, w, h, a). In DOTA, objects are annotated by the coordinates of their 4 corners: (x1, y1, x2, y2, x3, y3, x4, y4).
I can't come up with a solution that converts these 4 coordinates to (cx, cy, w, h, a), where cx and cy are the centre point of the OBB, and w, h and a are the width, height and angle respectively.
Any suggestions?
If you have your boxes in an Nx8 tensor/array, you can convert them to (cx, cy, w, h, a) as follows (assuming the first point is the top-left corner, the second the bottom-left, then the bottom-right, then the top-right):
import numpy as np
import torch

def DOTA_2_OBB(boxes):
    # Angle of the right edge (bottom-right -> top-right), via arctan, in degrees.
    angle = (torch.atan((boxes[:, 7] - boxes[:, 5]) / (boxes[:, 6] - boxes[:, 4])) * 180 / np.pi).float()
    # The centre is the mean of diagonally opposite corners
    # (top-left/bottom-right for x, bottom-left/top-right for y).
    cx = boxes[:, [4, 0]].mean(1)
    cy = boxes[:, [7, 3]].mean(1)
    # w is the length of the top edge (top-left -> top-right),
    # h the length of the left edge (top-left -> bottom-left).
    w = ((boxes[:, 7] - boxes[:, 1]) ** 2 + (boxes[:, 6] - boxes[:, 0]) ** 2) ** 0.5
    h = ((boxes[:, 1] - boxes[:, 3]) ** 2 + (boxes[:, 0] - boxes[:, 2]) ** 2) ** 0.5
    return torch.stack([cx, cy, w, h, angle]).T
Then giving this a test...
In [40]: boxes = torch.tensor([[0,2,1,0,2,2,1,3],[4,12,8,2,12,12,8,22]]).float()
In [43]: DOTA_2_OBB(boxes)
Out[43]:
tensor([[ 1.0000, 1.5000, 1.4142, 2.2361, -45.0000],
[ 8.0000, 12.0000, 10.7703, 10.7703, -68.1986]])

YOLOv4 annotations save dimensions in a [0,1] float interval

This is from an annotations file for an image:
0 0.6142131979695431 0.336 0.467005076142132 0.392
The first 0 is the class label. 0.6142131979695431 and 0.336 are the x and y coordinates of the bounding box. 0.467005076142132 and 0.392 are the width and the height of the bounding box. However, what I don't understand is why the x, y, width and height are in a [0,1] float interval. Someone told me that it is a percentage, but a percentage relative to what?
For example, I am writing software that builds a synthetic dataset. This is one training image that I have produced. It has the bounding boxes around the objects that I want to detect.
The bounding boxes wrap the Wizards and Ubuntu logos perfectly. So, how can I annotate them in the format above?
The width/height in YOLO format is the fraction of the total width/height of the entire image. So the top-left corner is always (0, 0) and the bottom-right corner is always (1, 1), irrespective of the size of the image.
See this question for the conversion of a bounding box (x1, y1, x2, y2) to the YOLO style (x, y, w, h).
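For reference, that conversion amounts to something like the sketch below, assuming the box comes as top-left/bottom-right pixel corners (the function name to_yolo is made up here):

def to_yolo(x1, y1, x2, y2, img_w, img_h):
    # Normalized centre of the box.
    xc = ((x1 + x2) / 2) / img_w
    yc = ((y1 + y2) / 2) / img_h
    # Normalized width and height.
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return xc, yc, w, h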

Rectangles overlapping

I am using OpenCV to draw rectangles over images, with the xmin, ymin, xmax, ymax values of the rectangles given in a list.
The list of points is:
points = [(1707.0, 1865.0, 2331.0, 2549.0), (1348.0, 1004.0, 1987.0, 1746.0),
          (749.0, 2129.0, 1674.0, 2939.0), (25.0, 1134.0, 1266.0, 2108.0),
          (253.0, 1731.0, 1403.0, 2449.0)]
image = cv2.imread("pathtoimage")
for point in points:
    xmin, ymin, xmax, ymax = point
    result_image = cv2.rectangle(image, (int(xmin), int(xmax)), (int(ymin), int(ymax)), (0, 255, 0), 8)
os.remove("/home/atul/Documents/CarLabel/imagemapping1-wp-BD489663-BD55-484E-9EA7-EB5662B626B9.png")
cv2.imwrite("/home/atul/Documents/CarLabel/imagemapping1-wp-BD489663-BD55-484E-9EA7-EB5662B626B9.png", result_image)
The rectangles end up overlapping each other.
How can I resolve this?
Original Image
Resulting image
cv2.rectangle needs the coordinates of the top-left and bottom-right points. So you should use:
result_image = cv2.rectangle(image, (int(xmin), int(ymin)), (int(xmax), int(ymax)), (0, 255, 0), 8)
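Put together, the corrected loop might look like the following (keeping the placeholder input path from the question; "output.png" is a stand-in). Note that cv2.rectangle draws on the image in place, so every box accumulates on the same image:

import cv2

image = cv2.imread("pathtoimage")
for xmin, ymin, xmax, ymax in points:
    # (xmin, ymin) is the top-left corner, (xmax, ymax) the bottom-right.
    cv2.rectangle(image, (int(xmin), int(ymin)), (int(xmax), int(ymax)),
                  (0, 255, 0), 8)
cv2.imwrite("output.png", image)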

Returning "chunk" of Image using Python and Pillow

This is a very rudimentary question and I'm sure that there's some part of the Pillow library/documentation I've missed...
Let's say you have a 128x128 image, and you want to save the "chunk" of it whose top-left corner lies x pixels right of and y pixels down from the top-left corner of the original image. If you know the chunk is a pixels wide and b pixels tall, so its four corners are (x, y), (x+a, y), (x, y+b) and (x+a, y+b), how would you save that chunk as a separate image file?
More concisely, how can I save pieces of images given their pixel coordinates using PIL? Any help/pointers are appreciated.
Came up with:
"""
The function "crop" takes in large_img, small_img, x, y, w, h and returns the image lying within these restraints:
large_img: the filename of the large image
small_img: the desired filename of the smaller "sub-image"
x: x coordinate of the upper left corner of the bounding box
y: y coordinate of the upper left corner of the bounding box
w: width of the bounding box
h: height of the bounding box
"""
def crop(large_img, small_img, x, y, w, h):
img = Image.open(large_img)
box = (x, y, x+w, y+h)
area = img.crop(box)
area.save(small_img, 'jpeg')
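For example, to save a 50x50 patch whose top-left corner sits at (10, 20), the call would look like this (both filenames are hypothetical):

crop("large.jpg", "chunk.jpg", x=10, y=20, w=50, h=50)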

How to find the overlapping of an image on TkInter Canvas (Python)?

In Tkinter, when I create an image on a canvas and find the coordinates of it, it only returns two coordinates, so the find_overlapping method doesn't work with it (naturally). Is there an alternative?
You should be able to get the image's bounding box (bbox) by calling bbox = canvas.bbox(imageID). Then you can use canvas.find_overlapping(*bbox).
The coordinates it returns should be those of the top-left corner of the image (assuming the image item was created with anchor='nw'; the default anchor is the centre). So if the coordinates you got were (x, y), and your image object (assuming it's a PhotoImage) is img, then you can do:
w, h = img.width(), img.height()
canvas.find_overlapping(x, y, x + w, y + h)
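A minimal sketch tying the two approaches together ("sprite.png" is a hypothetical image file; anchor="nw" makes the item's coordinates its top-left corner, which is what the arithmetic above assumes):

import tkinter as tk

root = tk.Tk()
canvas = tk.Canvas(root, width=400, height=400)
canvas.pack()

img = tk.PhotoImage(file="sprite.png")  # hypothetical image file
image_id = canvas.create_image(50, 60, image=img, anchor="nw")

# Option 1: let the canvas compute the bounding box.
hits = canvas.find_overlapping(*canvas.bbox(image_id))

# Option 2: build the box from the anchor point and the image size.
x, y = canvas.coords(image_id)
hits = canvas.find_overlapping(x, y, x + img.width(), y + img.height())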
