how to crop image using yolo format image coordinate

Greetings, Stack Overflow community,
I have 200 images, each with a labelled txt file for a custom YOLO model.
Now I want to crop all the heads present in those images using the coordinates in the txt files.
I have tried with OpenCV, but I am getting an error.
Could you please help me crop all the heads from those images automatically?
Please see the updated code:
import cv2
img = cv2.imread(<image path>)
dh, dw, _ = img.shape
print(dh,dw)
x,y,w,h = 0.360667, 0.089000, 0.113333, 0.130000
x,y,w,h = int(x*dw), int(y*dh), int(w*dw), int(h*dh)
print(x, y, w, h)
imgCrop = img[y:y+h,x:x+w]
cv2.imshow("Crop Image",imgCrop)
cv2.waitKey(0)
To better understand the problem, please see these images:

# Resource: https://github.com/AlexeyAB/darknet
# <x_center> <y_center> <width> <height> - float values relative to width and height of image,
# it can be equal from (0.0 to 1.0]
# <x> = <absolute_x> / <image_width>
# <height> = <absolute_height> / <image_height>
# attention: <x_center> <y_center> - are center of rectangle (are not top-left corner)
box = "1 0.615234 0.254688 0.148438 0.178125"
class_id, x_center, y_center, w, h = box.strip().split()
x_center, y_center, w, h = float(x_center), float(y_center), float(w), float(h)
x_center = round(x_center * dw)
y_center = round(y_center * dh)
w = round(w * dw)
h = round(h * dh)
x = round(x_center - w / 2)
y = round(y_center - h / 2)
imgCrop = img[y:y + h, x:x + w]
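The conversion above can be wrapped up for a whole label file. The following is a minimal sketch rather than a definitive implementation: `yolo_line_to_box` is a hypothetical helper name, and it assumes each label line has the five fields `class x_center y_center width height` in the relative format described above. It also clamps the box to the image so that slicing never uses a negative index.

```python
def yolo_line_to_box(line, img_w, img_h):
    """Convert one YOLO label line to (class_id, x1, y1, x2, y2) in pixels.

    Hypothetical helper: assumes `line` holds five whitespace-separated
    fields, with the center and size given relative to the image size.
    """
    class_id, xc, yc, w, h = line.split()
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    # Center -> corners, clamped to the image bounds
    x1 = max(0, round(xc - w / 2))
    y1 = max(0, round(yc - h / 2))
    x2 = min(img_w, round(xc + w / 2))
    y2 = min(img_h, round(yc + h / 2))
    return int(class_id), x1, y1, x2, y2


if __name__ == "__main__":
    # A box centered in a 100x200 (width x height) image
    print(yolo_line_to_box("0 0.5 0.5 0.2 0.4", 100, 200))
```

With the image loaded as `img`, each crop is then `img[y1:y2, x1:x2]`, looping over the lines of the matching txt file.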

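You need to convert those float values to integer pixel values. You do this by multiplying them by the width and height of the image and then casting them to ints. Note that the YOLO x and y are the center of the box, so subtract half the width and height to get the top-left corner.
Example:
w, h = int(w*img_width), int(h*img_height)
x, y = int(x*img_width - w/2), int(y*img_height - h/2)
Then index the image (rows first, then columns):
imgCrop = img[y:y+h, x:x+w]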

Related

Why my photo collage output using numpy has strange color profile?

After a lot of research and asking questions, I have written prototype code that makes a collage from photos given as a list of strs.
It resizes the images according to their positions in the list, then randomly rotates them and arranges them in a minimum bounding area.
It uses cv2, numpy, PIL and rpack. To be honest, I have no real idea how these libraries work or why my code works; I only know how to put them together.
So here is my code:
import cv2
import numpy as np
import random
import rpack
from fractions import Fraction
from math import prod
from pathlib import Path
from PIL import Image
from typing import Tuple
folder = 'D:/test/'
images = [
'Mass Effect.jpg',
'Dragon Age Origins.jpg',
'Life Is Strange.jpg',
'Star Wars KOTOR.jpg',
'Dragon Age 2.jpg',
'Choice of Robots.jpg',
'Perfect Match.png',
'Jade Empire.jpg',
"Serafina's Saga.jpg",
'Rising Angels Reborn.jpg',
'Across The Void.png',
"Heart's Blight.png",
'The Gray Wolf And The Little Lamb.jpg',
'Night of the Lesbian Vampires.png',
'Tethered.png',
'Contract Demon.jpg',
"Yuki's 4P.png"
]
def resize_guide(image_size: Tuple[int, int], unit_shape: Tuple[int, int], target_ratio: float) -> Tuple[int, int]:
    aspect_ratio = Fraction(*image_size).limit_denominator()
    horizontal = aspect_ratio.numerator
    vertical = aspect_ratio.denominator
    target_area = prod(unit_shape) * target_ratio
    unit_length = (target_area/(horizontal*vertical))**.5
    return (int(horizontal*unit_length), int(vertical*unit_length))
images = [cv2.imread(folder+name) for name in images]
size_hint = [i**.75 for i in range(1, len(images)+1)][::-1]
resized_images = []
for image, hint in zip(images, size_hint):
    height, width = image.shape[:2]
    guide = resize_guide((width, height), (640,360), hint)
    resized = cv2.resize(image, guide, interpolation = cv2.INTER_AREA)
    resized_images.append(resized)
def make_border(image, value, border=16):
    return cv2.copyMakeBorder(
        image,
        top=border,
        bottom=border,
        left=border,
        right=border,
        borderType=cv2.BORDER_CONSTANT,
        value=value
    )
def rotate_image(image, angle):
    h, w = image.shape[:2]
    cX, cY = (w // 2, h // 2)
    M = cv2.getRotationMatrix2D((cX, cY), -angle, 1.0)
    cos = np.abs(M[0, 0])
    sin = np.abs(M[0, 1])
    nW = int((h * sin) + (w * cos))
    nH = int((h * cos) + (w * sin))
    M[0, 2] += (nW / 2) - cX
    M[1, 2] += (nH / 2) - cY
    return cv2.warpAffine(image, M, (nW, nH))
rotated_images = []
sizes = []
for image in resized_images:
    image = make_border(image, (255, 255, 255))
    rotated = rotate_image(image, random.randrange(-15, 16))
    image = make_border(image, (0,0,0))
    rotated_images.append(rotated)
    height, width = rotated.shape[:2]
    sizes.append((width, height))
shapes = [(x, y, w, h) for (x, y), (w, h) in zip(rpack.pack(sizes), sizes)]
rightmost = sorted(shapes, key=lambda x: -x[0] - x[2])[0]
bound_width = rightmost[0] + rightmost[2]
downmost = sorted(shapes, key=lambda x: -x[1] - x[3])[0]
bound_height = downmost[1] + downmost[3]
collage = np.zeros([bound_height, bound_width, 3],dtype=np.uint8)
for image, (x, y, w, h) in zip(rotated_images, shapes):
    collage[y:y+h, x:x+w] = image
collage = Image.fromarray(collage, 'RGB')
collage.save('D:/collages/' + random.randbytes(4).hex() + '.png')
Because the output is way too large (over 20 MiB) it can't fit here, I have uploaded it to Google Drive: https://drive.google.com/file/d/16w4wsC_od4dh4QI7BYj8MM2gMngbSLV1/view?usp=sharing
So far the results seem promising, the only complaint I have is that the colors look very strange, I swear the original images have normal colors.
Can someone please tell me what I did wrong?
OK so while executing the code, the interpreter complained a lot about:
libpng warning: iCCP: known incorrect sRGB profile
I used cracked Adobe Photoshop CS6 to edit the images, is this the source of the problem or is it something else?

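Your code is fine apart from how the image is saved. cv2.imread loads images in BGR channel order, but Image.fromarray(collage, 'RGB') treats the array as RGB, so the red and blue channels end up swapped. Remove the last two lines and save with OpenCV instead: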
cv2.imwrite('D:/collages/' + random.randbytes(4).hex() + '.png', collage)

Making Automatic Annotation tool

I am trying to make an automatic annotation tool for YOLO object detection that uses a previously trained model to find the detections. I managed to put together some code, but I am a little stuck. As far as I know, this needs to be the annotation format for YOLO:
18 0.154167 0.431250 0.091667 0.612500
And with my code I get:
0.5576068858305613, 0.5410404056310654, -0.7516528169314066, 0.33822181820869446
I am not sure why I get the minus sign on the third number, or whether I need to shorten my float numbers.
I will post the code below in case someone can help me. After completing this project I will post the whole code if someone wants to use it.
def convert(size, box):
    dw = 1./size[0]
    dh = 1./size[1]
    x = (box[0] + box[1])/2.0
    y = (box[2] + box[3])/2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x*dw
    w = w*dw
    y = y*dh
    h = h*dh
    return (x,y,w,h)
The code above is the function that converts the coordinates to YOLO format. For size you need to pass (w, h), and for box you need to pass (x, x+w, y, y+h).
net = cv2.dnn.readNetFromDarknet(config_path, weights_path)
# path_name = "images/city_scene.jpg"
path_name = image
image = cv2.imread(path_name)
file_name = os.path.basename(path_name)
filename, ext = file_name.split(".")
h, w = image.shape[:2]
# create 4D blob
blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416), swapRB=True, crop=False)
# sets the blob as the input of the network
net.setInput(blob)
# get all the layer names
ln = net.getLayerNames()
ln = [ln[i[0] - 1] for i in net.getUnconnectedOutLayers()]
# feed forward (inference) and get the network output
# measure how much it took in seconds
start = time.perf_counter()
layer_outputs = net.forward(ln)
time_took = time.perf_counter() - start
print(f"Time took: {time_took:.2f}s")
boxes, confidences, class_ids = [], [], []
b=[]
a=[]
# loop over each of the layer outputs
for output in layer_outputs:
    # loop over each of the object detections
    for detection in output:
        # extract the class id (label) and confidence (as a probability) of
        # the current object detection
        scores = detection[5:]
        class_id = np.argmax(scores)
        confidence = scores[class_id]
        # discard weak predictions by ensuring the detected
        # probability is greater than the minimum probability
        if confidence > CONFIDENCE:
            # scale the bounding box coordinates back relative to the
            # size of the image, keeping in mind that YOLO actually
            # returns the center (x, y)-coordinates of the bounding
            # box followed by the boxes' width and height
            box = detection[0:4] * np.array([w, h, w, h])
            (centerX, centerY, width, height) = box.astype("float")
            # use the center (x, y)-coordinates to derive the top and
            # and left corner of the bounding box
            x = int(centerX - (width / 2))
            y = int(centerY - (height / 2))
            a = w, h
            convert(a, box)
            boxes.append([x, y, int(width), int(height)])
            confidences.append(float(confidence))
            class_ids.append(class_id)
idxs = cv2.dnn.NMSBoxes(boxes, confidences, SCORE_THRESHOLD,
                        IOU_THRESHOLD)
font_scale = 1
thickness = 1
# ensure at least one detection exists
if len(idxs) > 0:
    # loop over the indexes we are keeping
    for i in idxs.flatten():
        # extract the bounding box coordinates
        x, y = boxes[i][0], boxes[i][1]
        w, h = boxes[i][2], boxes[i][3]
        # draw a bounding box rectangle and label on the image
        color = [int(c) for c in colors[class_ids[i]]]
        ba=w,h
        print(w,h)
        cv2.rectangle(image, (x, y), (x + w, y + h), color=color, thickness=thickness)
        text = "{}".format(labels[class_ids[i]])
        conf = "{:.3f}".format(confidences[i], x, y)
        int1, int2 = (x, y)
        print(text)
        #print(convert(ba, box))
        #b=w,h
        #print(convert(b, boxes))
        #print(convert(a, box)) #coordinates
        ivan = str(int1)
        b.append([text, ivan])
        #a.append(float(conf))
        #print(a)
        # calculate text width & height to draw the transparent boxes as background of the text
        (text_width, text_height) = \
            cv2.getTextSize(text, cv2.FONT_HERSHEY_SIMPLEX, fontScale=font_scale, thickness=thickness)[0]
        text_offset_x = x
        text_offset_y = y - 5
        box_coords = ((text_offset_x, text_offset_y), (text_offset_x + text_width + 2, text_offset_y - text_height))
        overlay = image.copy()
        cv2.rectangle(overlay, box_coords[0], box_coords[1], color=color, thickness=cv2.FILLED)
        # add opacity (transparency to the box)
        image = cv2.addWeighted(overlay, 0.6, image, 0.4, 0)
        # now put the text (label: confidence %)
        cv2.putText(image, text, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX,
                    fontScale=font_scale, color=(0, 0, 0), thickness=thickness)
        text = "{}".format(labels[class_ids[i]],x,y)
        conf = "{:.3f}".format(confidences[i])

The problem is the indexes in your function. The box you pass in holds:
box[0] => center x
box[1] => center y
box[2] => width of your bbox
box[3] => height of your bbox
According to the documentation, YOLO labels look like this:
<object-class> <x> <y> <width> <height>
where x and y are the center of the bounding box. So your code should be like this:
def convert(size, box):
    dw = 1./size[0]
    dh = 1./size[1]
    x = box[0]*dw
    y = box[1]*dh
    w = box[2]*dw
    h = box[3]*dh
    return (x,y,w,h)
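On the question of shortening the floats: the example label in the question has six decimal places, which a format specifier can produce. A minimal sketch, assuming the box is given in pixels as a top-left corner plus width and height (the `[x, y, width, height]` layout stored in `boxes` in the question); `to_yolo_line` is a hypothetical name:

```python
def to_yolo_line(class_id, x, y, w, h, img_w, img_h):
    """Format a pixel box (top-left x, y, width, height) as a YOLO label line."""
    x_center = (x + w / 2) / img_w
    y_center = (y + h / 2) / img_h
    return "{} {:.6f} {:.6f} {:.6f} {:.6f}".format(
        class_id, x_center, y_center, w / img_w, h / img_h
    )


if __name__ == "__main__":
    print(to_yolo_line(18, 40, 60, 20, 80, img_w=100, img_h=200))
```

Each returned string can be written as one line of the image's txt label file.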

Maybe this can help you
def bounding_box_2_yolo(obj_detections, frame, index):
    yolo_info = []
    for object_det in obj_detections:
        left_x, top_y, right_x, bottom_y = object_det.boxes
        xmin = left_x
        xmax = right_x
        ymin = top_y
        ymax = bottom_y
        xcen = float((xmin + xmax)) / 2 / frame.shape[1]
        ycen = float((ymin + ymax)) / 2 / frame.shape[0]
        w = float((xmax - xmin)) / frame.shape[1]
        h = float((ymax - ymin)) / frame.shape[0]
        yolo_info.append((index, xcen, ycen, w, h))
    return yolo_info
The labelImg project also has a lot of code that you can reuse:
https://github.com/tzutalin/labelImg/blob/master/libs/yolo_io.py

Python3: Resize rectangular image to a different kind of rectangle, keeping ratio and fill background with black

I have a very similar question to this: Resize rectangular image to square, keeping ratio and fill background with black, but I would like to resize to a nonsquare image and center the image either horizontally or vertically if needed.
Here are some examples of desired outputs. I made this image entirely with Paint, so the images might not actually be perfectly centered, but centering is what I'd like to achieve:
I tried the following code that I edited from the question linked:
def fix_size(fn, desired_w=256, desired_h=256, fill_color=(0, 0, 0, 255)):
    """Edited from https://stackoverflow.com/questions/44231209/resize-rectangular-image-to-square-keeping-ratio-and-fill-background-with-black"""
    im = Image.open(fn)
    x, y = im.size
    #size = max(min_size, x, y)
    w = max(desired_w, x)
    h = max(desired_h, y)
    new_im = Image.new('RGBA', (w, h), fill_color)
    new_im.paste(im, ((w - x) // 2, (h - y) // 2))
    return new_im.resize((desired_w, desired_h))
That doesn't work, however, as it still stretches some images into square-shaped ones (at least image b in the example). As for big images, it seems to rotate them instead!

The problem lies in your incorrect calculation of the image size:
w = max(desired_w, x)
h = max(desired_h, y)
You're simply taking the maximum of each dimension independently, without taking the aspect ratio of the target into account. Imagine your input is a square 1000x1000 image and the desired size is 244x138. You would end up creating a black 1000x1000 image, pasting the original image over it, and then resizing it to 244x138, which squashes it. To get the correct result, you would have to create a 1768x1000 canvas instead of a 1000x1000 one.
Here's the updated code that takes the aspect ratio into account:
def fix_size(fn, desired_w=256, desired_h=256, fill_color=(0, 0, 0, 255)):
    """Edited from https://stackoverflow.com/questions/44231209/resize-rectangular-image-to-square-keeping-ratio-and-fill-background-with-black"""
    im = Image.open(fn)
    x, y = im.size
    ratio = x / y
    desired_ratio = desired_w / desired_h
    w = max(desired_w, x)
    h = int(w / desired_ratio)
    if h < y:
        h = y
        w = int(h * desired_ratio)
    new_im = Image.new('RGBA', (w, h), fill_color)
    new_im.paste(im, ((w - x) // 2, (h - y) // 2))
    return new_im.resize((desired_w, desired_h))
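An alternative worth considering is to scale the image first and pad afterwards, so the padding itself is never resized. The sketch below only computes the sizes; `fit_within` is a hypothetical helper, not part of PIL, and it assumes the image should be scaled to the largest size that fits inside the target while keeping its aspect ratio.

```python
def fit_within(x, y, desired_w, desired_h):
    """Return ((new_w, new_h), (left, top)) for an aspect-preserving fit.

    (new_w, new_h) is the scaled image size; (left, top) is where to paste
    it so that it is centered on a desired_w x desired_h canvas.
    """
    scale = min(desired_w / x, desired_h / y)
    new_w = min(desired_w, max(1, round(x * scale)))
    new_h = min(desired_h, max(1, round(y * scale)))
    left = (desired_w - new_w) // 2
    top = (desired_h - new_h) // 2
    return (new_w, new_h), (left, top)


if __name__ == "__main__":
    # Square 1000x1000 input, 244x138 target
    print(fit_within(1000, 1000, 244, 138))
```

With PIL this would be used as `im.resize((new_w, new_h))`, followed by pasting the result at `(left, top)` onto `Image.new('RGBA', (desired_w, desired_h), fill_color)`.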

Adjust size and position of bounding boxes while keeping it somewhat centered

I have a bunch of images with respective bounding box co-ordinates (x,y,w,h). Some of the bounding boxes are rectangular, so firstly I want to make them square while still centered on the region of interest. Using the following example of an apple, with a bounding box on the stalk, I'd want to expand the box to a square while still keeping it centered on the stalk.
Secondly, after I've extracted out the contents of the bounding box, I want to capture contextual information by increasing the bounding box size by n pixels and extracting and then repeat. After that, I want to shift the geometric center of the region of interest just by a few pixels and repeat the multiple bounding box extraction. Like the below image, where the differently colored boxes represent the different boxes I want to extract. The right image shows the small shift in center that I want to achieve.
I have an idea on how to do this in numpy, but are there any higher-level functions/libraries that would help me with defining the bounding box and manipulating it as such?

I use this image to demonstrate the same effect:
The code, with comments as description:
#!/usr/bin/python3
# 2017.11.25 17:10:34 CST
# 2017.12.01 11:23:02 CST
import cv2
import numpy as np
## Read and copy
img = cv2.imread("cat.jpg")
canvas = img.copy()
## set and crop the ROI
x,y,w,h = bbox = (180, 100, 50, 100)
cv2.rectangle(canvas, (x,y), (x+w,y+h), (0,0,255), 2)
croped = img[y:y+h, x:x+w]
cv2.imshow("croped", croped)
## get the center and the radius
cx = x+w//2
cy = y+h//2
cr = max(w,h)//2
## set offset, repeat enlarger ROI
dr = 10
for i in range(0,4):
    r = cr+i*dr
    cv2.rectangle(canvas, (cx-r, cy-r), (cx+r, cy+r), (0,255,0), 1)
    croped = img[cy-r:cy+r, cx-r:cx+r]
    cv2.imshow("croped{}".format(i), croped)
## display
cv2.imshow("source", canvas)
cv2.waitKey()
cv2.destroyAllWindows()
The result:
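Near the image border, `cy-r` or `cx-r` in the loop above can become negative, and a negative slice start is interpreted by NumPy as counting from the end of the array rather than raising an error. A small sketch of a clamped variant follows; `square_box` and its `dx`/`dy` shift parameters are hypothetical names, the 640x480 image size in the demo is an assumption, and the clamped box is no longer square when it touches the border.

```python
def square_box(x, y, w, h, pad=0, dx=0, dy=0, img_w=None, img_h=None):
    """Return (x1, y1, x2, y2) of a square around the center of (x, y, w, h).

    pad grows the half-side by that many pixels; dx, dy shift the center.
    If img_w / img_h are given, the box is clamped to the image.
    """
    cx = x + w // 2 + dx
    cy = y + h // 2 + dy
    r = max(w, h) // 2 + pad
    x1, y1, x2, y2 = cx - r, cy - r, cx + r, cy + r
    if img_w is not None:
        x1, x2 = max(0, x1), min(img_w, x2)
    if img_h is not None:
        y1, y2 = max(0, y1), min(img_h, y2)
    return x1, y1, x2, y2


if __name__ == "__main__":
    for i in range(4):
        # Same bbox and 10-pixel step as in the answer above;
        # the 640x480 image size is assumed for illustration
        print(square_box(180, 100, 50, 100, pad=i * 10, img_w=640, img_h=480))
```

The crop is then `img[y1:y2, x1:x2]`, and passing small `dx`/`dy` values gives the shifted-center boxes described in the question.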

Handling Corner Cases (Improved)
Hi, I used GitHub Copilot to generate this code, which expands the bounding box by 10 percent on each side and also handles corner cases.
# expand bounding box by 10 percent
x = x - (w * 0.1)
y = y - (h * 0.1)
w = w + (w * 0.2)
h = h + (h * 0.2)
# make sure bounding box is within frame
x = max(int(x), 0)
y = max(int(y), 0)
w = min(int(w), imw - x)
h = min(int(h), imh - y)
# getting the bounding box
face_image = frame[y:y+h, x:x+w]
Before and After
White Represents Original Bounding Box and Green the New One.

A Javascript solution with proportionate bounding box values:
const expansionFactor = 0.1;
const x = imageWidth * boundingBox["Left"];
const y = imageHeight * boundingBox["Top"];
const w = imageWidth * boundingBox["Width"];
const h = imageHeight * boundingBox["Height"];
const left = Math.max(x - w * expansionFactor, 0);
const top = Math.max(y - h * expansionFactor, 0);
const width = Math.min(w + w * 2 * expansionFactor, imageWidth - left);
const height = Math.min(h + h * 2 * expansionFactor, imageHeight - top);

overlay a smaller image on a larger image python OpenCv

Hi, I am creating a program that replaces a face in an image with someone else's face. However, I am stuck on inserting the new face into the original, larger image. I have researched ROI and addWeighted (which needs the images to be the same size), but I haven't found a way to do this in Python. Any advice would be great. I am new to OpenCV.
I am using the following test images:
smaller_image:
larger_image:
Here is my code so far, a mix of other samples:
import cv2
import cv2.cv as cv
import sys
import numpy
def detect(img, cascade):
    rects = cascade.detectMultiScale(img, scaleFactor=1.1, minNeighbors=3, minSize=(10, 10), flags = cv.CV_HAAR_SCALE_IMAGE)
    if len(rects) == 0:
        return []
    rects[:,2:] += rects[:,:2]
    return rects
def draw_rects(img, rects, color):
    for x1, y1, x2, y2 in rects:
        cv2.rectangle(img, (x1, y1), (x2, y2), color, 2)
if __name__ == '__main__':
    if len(sys.argv) != 2: ## Check for error in usage syntax
        print "Usage : python faces.py <image_file>"
    else:
        img = cv2.imread(sys.argv[1],cv2.CV_LOAD_IMAGE_COLOR) ## Read image file
        if (img == None):
            print "Could not open or find the image"
        else:
            cascade = cv2.CascadeClassifier("haarcascade_frontalface_alt.xml")
            gray = cv2.cvtColor(img, cv.CV_BGR2GRAY)
            gray = cv2.equalizeHist(gray)
            rects = detect(gray, cascade)
            ## Extract face coordinates
            x1 = rects[0][3]
            y1 = rects[0][0]
            x2 = rects[0][4]
            y2 = rects[0][5]
            y=y2-y1
            x=x2-x1
            ## Extract face ROI
            faceROI = gray[x1:x2, y1:y2]
            ## Show face ROI
            cv2.imshow('Display face ROI', faceROI)
            small = cv2.imread("average_face.png",cv2.CV_LOAD_IMAGE_COLOR)
            print "here"
            small=cv2.resize(small, (x, y))
            cv2.namedWindow('Display image') ## create window for display
            cv2.imshow('Display image', small) ## Show image in the window
            print "size of image: ", img.shape ## print size of image
            cv2.waitKey(1000)

A simple way to achieve what you want:
import cv2
s_img = cv2.imread("smaller_image.png")
l_img = cv2.imread("larger_image.jpg")
x_offset=y_offset=50
l_img[y_offset:y_offset+s_img.shape[0], x_offset:x_offset+s_img.shape[1]] = s_img
Update
I suppose you want to take care of the alpha channel too. Here is a quick and dirty way of doing so:
s_img = cv2.imread("smaller_image.png", -1)
y1, y2 = y_offset, y_offset + s_img.shape[0]
x1, x2 = x_offset, x_offset + s_img.shape[1]
alpha_s = s_img[:, :, 3] / 255.0
alpha_l = 1.0 - alpha_s
for c in range(0, 3):
    l_img[y1:y2, x1:x2, c] = (alpha_s * s_img[:, :, c] +
                              alpha_l * l_img[y1:y2, x1:x2, c])

Using @fireant's idea, I wrote up a function to handle overlays. This works well for any position argument (including negative positions).
def overlay_image_alpha(img, img_overlay, x, y, alpha_mask):
    """Overlay `img_overlay` onto `img` at (x, y) and blend using `alpha_mask`.
    `alpha_mask` must have same HxW as `img_overlay` and values in range [0, 1].
    """
    # Image ranges
    y1, y2 = max(0, y), min(img.shape[0], y + img_overlay.shape[0])
    x1, x2 = max(0, x), min(img.shape[1], x + img_overlay.shape[1])
    # Overlay ranges
    y1o, y2o = max(0, -y), min(img_overlay.shape[0], img.shape[0] - y)
    x1o, x2o = max(0, -x), min(img_overlay.shape[1], img.shape[1] - x)
    # Exit if nothing to do
    if y1 >= y2 or x1 >= x2 or y1o >= y2o or x1o >= x2o:
        return
    # Blend overlay within the determined ranges
    img_crop = img[y1:y2, x1:x2]
    img_overlay_crop = img_overlay[y1o:y2o, x1o:x2o]
    alpha = alpha_mask[y1o:y2o, x1o:x2o, np.newaxis]
    alpha_inv = 1.0 - alpha
    img_crop[:] = alpha * img_overlay_crop + alpha_inv * img_crop
Example usage:
import numpy as np
from PIL import Image
# Prepare inputs
x, y = 50, 0
img = np.array(Image.open("img_large.jpg"))
img_overlay_rgba = np.array(Image.open("img_small.png"))
# Perform blending
alpha_mask = img_overlay_rgba[:, :, 3] / 255.0
img_result = img[:, :, :3].copy()
img_overlay = img_overlay_rgba[:, :, :3]
overlay_image_alpha(img_result, img_overlay, x, y, alpha_mask)
# Save result
Image.fromarray(img_result).save("img_result.jpg")
Result:
If you encounter errors or unusual outputs, please ensure:
img should not contain an alpha channel. (e.g. If it is RGBA, convert to RGB first.)
img_overlay has the same number of channels as img.
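The range arithmetic at the top of `overlay_image_alpha` is the part that makes negative and out-of-bounds positions work. As an illustration only, here it is pulled out into a standalone helper (`overlap_ranges` is a hypothetical name) that works on plain sizes, so the numbers are easy to check by hand.

```python
def overlap_ranges(img_h, img_w, ov_h, ov_w, x, y):
    """Return the overlapping region in image and overlay coordinates.

    Result: ((y1, y2, x1, x2), (y1o, y2o, x1o, x2o)), or None if the
    overlay placed at (x, y) does not intersect the image at all.
    """
    # Region of the image that the overlay covers
    y1, y2 = max(0, y), min(img_h, y + ov_h)
    x1, x2 = max(0, x), min(img_w, x + ov_w)
    # Matching region inside the overlay
    y1o, y2o = max(0, -y), min(ov_h, img_h - y)
    x1o, x2o = max(0, -x), min(ov_w, img_w - x)
    if y1 >= y2 or x1 >= x2 or y1o >= y2o or x1o >= x2o:
        return None
    return (y1, y2, x1, x2), (y1o, y2o, x1o, x2o)


if __name__ == "__main__":
    # 30x40 (HxW) overlay hanging off the left and bottom of a 100x100 image
    print(overlap_ranges(100, 100, 30, 40, x=-10, y=90))
```

Both returned regions always have the same height and width, which is what lets the blend line assign one slice to the other.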

Based on fireant's excellent answer above, here is the alpha blending but a bit more human legible. You may need to swap 1.0-alpha and alpha depending on which direction you're merging (mine is swapped from fireant's answer).
o* == s_img.*
b* == b_img.*
for c in range(0,3):
    alpha = s_img[oy:oy+height, ox:ox+width, 3] / 255.0
    color = s_img[oy:oy+height, ox:ox+width, c] * (1.0-alpha)
    beta = l_img[by:by+height, bx:bx+width, c] * (alpha)
    l_img[by:by+height, bx:bx+width, c] = color + beta

Here it is:
import cv2
import numpy as np
def put4ChannelImageOn4ChannelImage(back, fore, x, y):
    rows, cols, channels = fore.shape
    trans_indices = fore[...,3] != 0 # Where not transparent
    overlay_copy = back[y:y+rows, x:x+cols]
    overlay_copy[trans_indices] = fore[trans_indices]
    back[y:y+rows, x:x+cols] = overlay_copy
#test
background = np.zeros((1000, 1000, 4), np.uint8)
background[:] = (127, 127, 127, 1)
overlay = cv2.imread('imagee.png', cv2.IMREAD_UNCHANGED)
put4ChannelImageOn4ChannelImage(background, overlay, 5, 5)

A simple function that blits an image front onto an image back and returns the result. It works with both 3 and 4-channel images and deals with the alpha channel. Overlaps are handled as well.
The output image has the same size as back, but always 4 channels.
The output alpha channel is given by (u+v)/(1+uv) where u,v are the alpha channels of the front and back image and -1 <= u,v <= 1. Where there is no overlap with front, the alpha value from back is taken.
import cv2
def merge_image(back, front, x,y):
    # convert to rgba
    if back.shape[2] == 3:
        back = cv2.cvtColor(back, cv2.COLOR_BGR2BGRA)
    if front.shape[2] == 3:
        front = cv2.cvtColor(front, cv2.COLOR_BGR2BGRA)
    # crop the overlay from both images
    bh,bw = back.shape[:2]
    fh,fw = front.shape[:2]
    x1, x2 = max(x, 0), min(x+fw, bw)
    y1, y2 = max(y, 0), min(y+fh, bh)
    front_cropped = front[y1-y:y2-y, x1-x:x2-x]
    back_cropped = back[y1:y2, x1:x2]
    alpha_front = front_cropped[:,:,3:4] / 255
    alpha_back = back_cropped[:,:,3:4] / 255
    # replace an area in result with overlay
    result = back.copy()
    print(f'af: {alpha_front.shape}\nab: {alpha_back.shape}\nfront_cropped: {front_cropped.shape}\nback_cropped: {back_cropped.shape}')
    result[y1:y2, x1:x2, :3] = alpha_front * front_cropped[:,:,:3] + (1-alpha_front) * back_cropped[:,:,:3]
    result[y1:y2, x1:x2, 3:4] = (alpha_front + alpha_back) / (1 + alpha_front*alpha_back) * 255
    return result

To simply give s_img a uniform transparency, I call cv2.addWeighted before the line
l_img[y_offset:y_offset+s_img.shape[0], x_offset:x_offset+s_img.shape[1]] = s_img
as follows:
s_img=cv2.addWeighted(l_img[y_offset:y_offset+s_img.shape[0], x_offset:x_offset+s_img.shape[1]],0.5,s_img,0.5,0)

If you try to write to the destination image using any of the answers above and get the following error:
ValueError: assignment destination is read-only
A quick potential fix is to set the WRITEABLE flag to true.
img.setflags(write=1)

A simple 4-channel-on-4-channel pasting function that works:
def paste(background,foreground,pos=(0,0)):
    # NOTE: shape[0] is the number of rows (image height), so despite the
    # names, x and the *Width variables run along rows and y along columns;
    # pos is effectively (row, column)
    #get position and crop pasting area if needed
    x = pos[0]
    y = pos[1]
    bgWidth = background.shape[0]
    bgHeight = background.shape[1]
    frWidth = foreground.shape[0]
    frHeight = foreground.shape[1]
    width = bgWidth-x
    height = bgHeight-y
    if frWidth<width:
        width = frWidth
    if frHeight<height:
        height = frHeight
    # normalize alpha channels from 0-255 to 0-1
    alpha_background = background[x:x+width,y:y+height,3] / 255.0
    alpha_foreground = foreground[:width,:height,3] / 255.0
    # set adjusted colors
    for color in range(0, 3):
        fr = alpha_foreground * foreground[:width,:height,color]
        bg = alpha_background * background[x:x+width,y:y+height,color] * (1 - alpha_foreground)
        background[x:x+width,y:y+height,color] = fr+bg
    # set adjusted alpha and denormalize back to 0-255
    background[x:x+width,y:y+height,3] = (1 - (1 - alpha_foreground) * (1 - alpha_background)) * 255
    return background
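For reference, the alpha arithmetic can be checked on a single pixel. The sketch below implements the standard "over" compositing operator with straight (non-premultiplied) alpha; unlike `paste` above, it divides the blended color by the output alpha. `over_pixel` is a hypothetical name, and all alphas are floats in [0, 1].

```python
def over_pixel(fg, fg_alpha, bg, bg_alpha):
    """Composite one foreground pixel over one background pixel.

    fg and bg are color tuples; alphas are floats in [0, 1].
    Returns (color_tuple, out_alpha).
    """
    # Same value as 1 - (1 - fg_alpha) * (1 - bg_alpha)
    out_alpha = fg_alpha + bg_alpha * (1.0 - fg_alpha)
    if out_alpha == 0:
        # Fully transparent result: color is undefined, use zeros
        return tuple(0.0 for _ in fg), 0.0
    color = tuple(
        (f * fg_alpha + b * bg_alpha * (1.0 - fg_alpha)) / out_alpha
        for f, b in zip(fg, bg)
    )
    return color, out_alpha


if __name__ == "__main__":
    # Half-transparent red over opaque blue
    print(over_pixel((255, 0, 0), 0.5, (0, 0, 255), 1.0))
```

When the background is fully opaque the division changes nothing, so the two approaches agree in that common case.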

I reworked @fireant's concept to allow for optional alpha masks and any x or y, including values outside the bounds of the image. It will crop to the bounds.
def overlay_image_alpha(img, img_overlay, x, y, alpha_mask=None):
    """Overlay `img_overlay` onto `img` at (x, y) and blend using optional `alpha_mask`.
    `alpha_mask` must have same HxW as `img_overlay` and values in range [0, 1].
    """
    if y < 0 or y + img_overlay.shape[0] > img.shape[0] or x < 0 or x + img_overlay.shape[1] > img.shape[1]:
        y_origin = 0 if y > 0 else -y
        y_end = img_overlay.shape[0] if y < 0 else min(img.shape[0] - y, img_overlay.shape[0])
        x_origin = 0 if x > 0 else -x
        x_end = img_overlay.shape[1] if x < 0 else min(img.shape[1] - x, img_overlay.shape[1])
        img_overlay_crop = img_overlay[y_origin:y_end, x_origin:x_end]
        alpha = alpha_mask[y_origin:y_end, x_origin:x_end] if alpha_mask is not None else None
    else:
        img_overlay_crop = img_overlay
        alpha = alpha_mask
    y1 = max(y, 0)
    y2 = min(img.shape[0], y1 + img_overlay_crop.shape[0])
    x1 = max(x, 0)
    x2 = min(img.shape[1], x1 + img_overlay_crop.shape[1])
    img_crop = img[y1:y2, x1:x2]
    img_crop[:] = alpha * img_overlay_crop + (1.0 - alpha) * img_crop if alpha is not None else img_overlay_crop
