OpenCV, how to overcrop an image? - python

I have a set of arbitrary images. Half the images are pictures, half are masks defining ROIs.
In the current version of my program I use the ROI to crop the image (i.e. I extract the rectangle in the image matching the bounding box of the ROI mask). The problem is, the ROI mask isn't perfect and it's better to over-predict than under-predict in my case.
So I want to crop more than the ROI rectangle, but if I do this, I may be trying to crop outside of the image.
i.e.:
x, y, w, h = cv2.boundingRect(mask_contour)
img = img[int(y - h * 0.05):int(y + h * 1.05), int(x - w * 0.05):int(x + w * 1.05)]
can fail because it tries to access out-of-bounds pixels (with NumPy slicing, a too-large stop is silently clamped, but a negative start index wraps around to the other side of the image). I could just clamp the values, but I wanted to know if there is a better approach.

You can add a border using OpenCV:
import cv2 as cv
import random

src = cv.imread('/home/stephen/lenna.png')
borderType = cv.BORDER_REPLICATE
borderSize = .5
top = int(borderSize * src.shape[0])   # shape[0] = rows
bottom = top
left = int(borderSize * src.shape[1])  # shape[1] = cols
right = left
# random color; only used if borderType is cv.BORDER_CONSTANT
value = [random.randint(0, 255), random.randint(0, 255), random.randint(0, 255)]
dst = cv.copyMakeBorder(src, top, bottom, left, right, borderType, None, value)
cv.imshow('img', dst)
c = cv.waitKey(0)
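To use this for the over-crop itself, note that padding shifts the coordinate system: after copyMakeBorder, every x coordinate moves right by left and every y coordinate moves down by top. A sketch of how the two might be combined (the 5% margin mirrors the question; the variable names mx and my are mine):
import cv2 as cv

x, y, w, h = cv.boundingRect(mask_contour)
mx, my = int(w * 0.05), int(h * 0.05)  # extra margin on each side

# Pad the image by the margin so the expanded crop can never leave the image.
padded = cv.copyMakeBorder(img, my, my, mx, mx, cv.BORDER_REPLICATE)

# The original (x, y) is now at (x + mx, y + my) in the padded image,
# so the expanded crop starts back at (x, y).
crop = padded[y:y + h + 2 * my, x:x + w + 2 * mx]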

Maybe you could try to limit the coordinates beforehand; note that the clamping has to use the image dimensions, not the box width and height. Please see the code below:
img_h, img_w = img.shape[:2]
ymin, ymax = max(0, int(y - h * 0.05)), min(img_h, int(y + h * 1.05))
xmin, xmax = max(0, int(x - w * 0.05)), min(img_w, int(x + w * 1.05))
img = img[ymin:ymax, xmin:xmax]
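If this pattern repeats across many images, it may be worth wrapping the margin and the clamping into one helper; a minimal sketch (the helper name and margin parameter are my own):
def crop_with_margin(img, x, y, w, h, margin=0.05):
    """Crop box (x, y, w, h) from img, expanded by margin * box size
    on every side and clamped to the image bounds."""
    H, W = img.shape[:2]
    y0, y1 = max(0, int(y - h * margin)), min(H, int(y + h * (1 + margin)))
    x0, x1 = max(0, int(x - w * margin)), min(W, int(x + w * (1 + margin)))
    return img[y0:y1, x0:x1]

x, y, w, h = cv2.boundingRect(mask_contour)
cropped = crop_with_margin(img, x, y, w, h)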

Related

How to crop face detected via Mediapipe in Python

I have a problem with mediapipe coordinates. What I want to do is crop the box of the detected face.
https://google.github.io/mediapipe/solutions/face_detection.html
And I use this code below:
import cv2
import mediapipe as mp
from matplotlib import pyplot as plt

mp_face_detection = mp.solutions.face_detection
# Set up the face detection function.
face_detection = mp_face_detection.FaceDetection(model_selection=0, min_detection_confidence=0.5)
# Initialize the mediapipe drawing class.
mp_drawing = mp.solutions.drawing_utils
# Read an image from the specified path.
sample_img = cv2.imread('12345.jpg')
# Specify a size of the figure.
plt.figure(figsize=[10, 10])
# Display the sample image, also convert BGR to RGB for display.
plt.title("Sample Image"); plt.axis('off'); plt.imshow(sample_img[:, :, ::-1]); plt.show()
# Perform face detection (mediapipe expects RGB).
face_detection_results = face_detection.process(sample_img[:, :, ::-1])
# Check if the face(s) in the image are found.
if face_detection_results.detections:
    # Iterate over the found faces.
    for face_no, face in enumerate(face_detection_results.detections):
        # Display the number of the face we are iterating over.
        print(f'FACE NUMBER: {face_no+1}')
        print('---------------------------------')
        # Display the face confidence.
        print(f'FACE CONFIDENCE: {round(face.score[0], 2)}')
        # Get the face bounding box and face key points coordinates.
        face_data = face.location_data
        # Display the face bounding box coordinates.
        print(f'\nFACE BOUNDING BOX:\n{face_data.relative_bounding_box}')
        # Iterate twice, as we only want the first two key points of each detected face.
        for i in range(2):
            # Display the found normalized key points.
            print(f'{mp_face_detection.FaceKeyPoint(i).name}:')
            print(f'{face_data.relative_keypoints[mp_face_detection.FaceKeyPoint(i).value]}')
So the results are in this form:
FACE NUMBER: 1
FACE CONFIDENCE: 0.89
FACE BOUNDING BOX:
xmin: 0.2784463167190552
ymin: 0.3503175973892212
width: 0.1538110375404358
height: 0.23071599006652832
RIGHT_EYE:
x: 0.3447018265724182
y: 0.4222590923309326
LEFT_EYE:
x: 0.39114508032798767
y: 0.3888365626335144
And I want to CROP the image to the coordinates of the BOX.
Like
face = Image.fromarray(image).crop(face_rect)
or any other crop procedure.
My problem is that I can't get the coordinates of the detected item from mediapipe.
Any ideas?
Got the solution, guys:
import numpy as np
from PIL import Image

h, w, c = sample_img.shape
print('width: ', w)
print('height:', h)

# relative_bounding_box from the detection above, in normalized coordinates
data = face_data.relative_bounding_box
xleft = int(data.xmin * w)
xtop = int(data.ymin * h)
xright = int(data.width * w + xleft)
xbottom = int(data.height * h + xtop)

detected_faces = [(xleft, xtop, xright, xbottom)]
for n, face_rect in enumerate(detected_faces):
    # sample_img is BGR; convert with [:, :, ::-1] first if colors matter
    face = Image.fromarray(sample_img).crop(face_rect)
    face_np = np.asarray(face)
    plt.imshow(face_np)
Assume the objective is to crop a single face detected by mediapipe. Note the [0], indicating that we are only interested in a single face:
results = mp_face.process(image_input)
detection=results.detections[0]
By default mediapipe returns detection data in normalized form, and we have to convert it to the original size by multiplying x values by the width and y values by the height of the input image.
We can employ the _normalized_to_pixel_coordinates helper available in mediapipe:
relative_bounding_box = location.relative_bounding_box
rect_start_point = _normalized_to_pixel_coordinates(
    relative_bounding_box.xmin, relative_bounding_box.ymin, image_cols,
    image_rows)
rect_end_point = _normalized_to_pixel_coordinates(
    relative_bounding_box.xmin + relative_bounding_box.width,
    relative_bounding_box.ymin + relative_bounding_box.height, image_cols,
    image_rows)
This essentially produces
xleft,ytop=rect_start_point
xright,ybot=rect_end_point
In other words, ytop, ybot, xleft, and xright represent face_top, face_bottom, face_left, and face_right, respectively.
Since the image is simply a 3D np array, we can crop it as below:
crop_img = image_input[ytop: ybot, xleft: xright]
The complete code is below:
import cv2
import mediapipe as mp
from mediapipe.python.solutions.drawing_utils import _normalized_to_pixel_coordinates

# load face detection model
mp_face = mp.solutions.face_detection.FaceDetection(
    model_selection=1,            # model selection
    min_detection_confidence=0.5  # confidence threshold
)

dframe = cv2.imread('xx.png')  # read in color; the code below expects 3 channels
image_rows, image_cols, _ = dframe.shape
image_input = cv2.cvtColor(dframe, cv2.COLOR_BGR2RGB)

results = mp_face.process(image_input)
detection = results.detections[0]
location = detection.location_data

relative_bounding_box = location.relative_bounding_box
rect_start_point = _normalized_to_pixel_coordinates(
    relative_bounding_box.xmin, relative_bounding_box.ymin, image_cols,
    image_rows)
rect_end_point = _normalized_to_pixel_coordinates(
    relative_bounding_box.xmin + relative_bounding_box.width,
    relative_bounding_box.ymin + relative_bounding_box.height, image_cols,
    image_rows)

## Let's draw a bounding box
color = (255, 0, 0)
thickness = 2
cv2.rectangle(image_input, rect_start_point, rect_end_point, color, thickness)

xleft, ytop = rect_start_point
xright, ybot = rect_end_point

crop_img = image_input[ytop:ybot, xleft:xright]
cv2.imwrite('crop_image0.jpg', crop_img)  # note: crop_img is RGB; imwrite expects BGR
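One caveat: in mediapipe's drawing_utils, _normalized_to_pixel_coordinates returns None when a normalized value falls outside [0, 1] (at least in the versions I have used), which can happen when a detected face touches the image border. A small guard before cropping avoids a confusing error on unpacking; a sketch:
# Hypothetical guard; clamping the normalized values first is another option.
if rect_start_point is None or rect_end_point is None:
    raise ValueError('Face bounding box extends beyond the image; '
                     'clamp the normalized coordinates before converting.')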

Creating a new image of a given size containing a cropped image

I am currently working on the below and am struggling to understand the best approach.
I've searched a lot but was not able to find answers that would match what I am trying to do.
The problem:
Relocating an object (e.g. a shoe) within the existing image (white background) to a certain location (e.g. move up)
Inserting and positioning the object (e.g. the shoe) at a user-specified location within a new background (still white) with a user-specified new height / width
How far I got:
I've managed to identify the object within the picture using CV2, got the outer contours, added a little padding and cropped the object (see below). I am happy with cropping it that way, as all my images have a single-coloured background and I will keep the background in the same colour.
Where I am stuck:
My cropped object and the old image background / new background do not share the same shape, hence I am not able to overlay / concatenate / merge them.
Given both images are stored as np arrays, I assume the answer will be to somehow place the shoe crop np.array within the background np.array, however I have no clue how to do this.
Maybe there is an easier / different way to do this?
Would be very grateful to hear from anyone who can lead me in the right direction.
Code
# importing dependencies
import os
import numpy as np
import cv2
from matplotlib import pyplot as plt

# Config
path = '/Users/..../Shoes/'
img_list = os.listdir(path)
img_path = path + img_list[0]

# Outline
color = (0, 255, 0)
thickness = 3
padding = 10

# read image and convert to RGB
image = cv2.imread(img_path)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# convert to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
# create a binary thresholded image
_, binary = cv2.threshold(gray, 225, 255, cv2.THRESH_BINARY_INV)
# find the contours from the thresholded image
contours, hierarchy = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Identifying outer contours
x_axis = []
y_axis = []
for i in range(len(contours)):
    for y in range(len(contours[i])):
        x_axis.append(contours[i][y][0][0])
        y_axis.append(contours[i][y][0][1])
min_x = min(x_axis) - padding
min_y = min(y_axis) - padding
max_x = max(x_axis) + padding
max_y = max(y_axis) + padding

# Defining start and end point of the outline rectangle based on the identified
# outer corners + padding; drawing on a copy so the outline does not end up in the crop
start_point = (min_x, min_y)
end_point = (max_x, max_y)
image_outline = cv2.rectangle(image.copy(), start_point, end_point, color, thickness)
plt.imshow(image_outline)
plt.show()

# Crop image
crop_img = image[min_y:max_y, min_x:max_x]
print(crop_img.shape)
plt.imshow(crop_img)
plt.show()
I think I got to the solution; this centers the image for any given new background height / width.
Still interested in quicker / cleaner ways.
# Define the new height and width you want to have
new_height = 1200
new_width = 1200

# Check current height and width of the cropped image
crop_height = crop_img.shape[0]
crop_width = crop_img.shape[1]

# Calculate how much you need to add to the sides and top - basically half of the
# remaining height / width ... currently not working correctly for odd numbers
add_sides = int((new_width - crop_width) / 2)
add_top_and_btm = int((new_height - crop_height) / 2)

# Adding white background to the sides ([0] inserts before the first column,
# [new_crop_img.shape[1]] appends after the last one)
bg_sides = 255 * np.ones(shape=[crop_height, add_sides, 3], dtype=np.uint8)
new_crop_img = np.insert(crop_img, [0], bg_sides, axis=1)
new_crop_img = np.insert(new_crop_img, [new_crop_img.shape[1]], bg_sides, axis=1)

# Then adding white background to top and bottom (using the current width so the
# shapes match even when new_width is not hit exactly)
bg_top_and_btm = 255 * np.ones(shape=[add_top_and_btm, new_crop_img.shape[1], 3],
                               dtype=np.uint8)
new_crop_img = np.insert(new_crop_img, [0], bg_top_and_btm, axis=0)
new_crop_img = np.insert(new_crop_img, [new_crop_img.shape[0]], bg_top_and_btm, axis=0)
plt.imshow(new_crop_img)
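Regarding quicker / cleaner ways: one option is cv2.copyMakeBorder, which pads in a single call and handles odd remainders by giving the extra pixel to the bottom / right. A sketch, assuming crop_img, new_height and new_width from above:
import cv2

add_h = new_height - crop_img.shape[0]
add_w = new_width - crop_img.shape[1]
# white constant border, split (unevenly if needed) between the two sides
centered = cv2.copyMakeBorder(crop_img,
                              add_h // 2, add_h - add_h // 2,  # top, bottom
                              add_w // 2, add_w - add_w // 2,  # left, right
                              cv2.BORDER_CONSTANT, value=(255, 255, 255))
plt.imshow(centered)
plt.show()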

Image translation using numpy

I want to perform image translation by a certain amount (shift the image vertically and horizontally).
The problem is that when I paste the cropped image back on the canvas, I just get back a white blank box.
Can anyone spot the issue here?
Many thanks
import random
import numpy as np
from matplotlib import pyplot as plt

img_shape = image.shape

# translate image
# percentage of the dimension of the image to translate
translate_factor_x = random.uniform(*translate)  # translate is a (low, high) range
translate_factor_y = random.uniform(*translate)

# initialize a black image the same size as the image
canvas = np.zeros(img_shape)

# get the top-left corner coordinates of the shifted image
corner_x = int(translate_factor_x * img_shape[1])
corner_y = int(translate_factor_y * img_shape[0])

# determine which part of the image will be pasted
mask = image[max(-corner_y, 0):min(img_shape[0], -corner_y + img_shape[0]),
             max(-corner_x, 0):min(img_shape[1], -corner_x + img_shape[1]),
             :]

# determine which part of the canvas the image will be pasted on
target_coords = [max(0, corner_y),
                 max(corner_x, 0),
                 min(img_shape[0], corner_y + img_shape[0]),
                 min(img_shape[1], corner_x + img_shape[1])]

# paste image on selected part of the canvas
canvas[target_coords[0]:target_coords[2], target_coords[1]:target_coords[3], :] = mask
transformed_img = canvas
plt.imshow(transformed_img)
This is what I get:
For image translation, you can make use of the somewhat obscure numpy.roll function. In this example I'm going to use a white canvas so it is easier to visualize.
import cv2
import numpy as np

# white canvas the same size as the original image
image = np.full_like(original_image, 255)
height, width = image.shape[:-1]
shift = 100

# shift image down and to the right
rolled = np.roll(image, shift, axis=[0, 1])

# black out the wrapped-around parts (top and left bands)
rolled = cv2.rectangle(rolled, (0, 0), (width, shift), 0, -1)
rolled = cv2.rectangle(rolled, (0, 0), (shift, height), 0, -1)
If you want to flip the image so the black part is on the other side, you can use both np.fliplr and np.flipud.
Result:
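Shifting in the opposite direction is the same idea with a negative shift; the wrapped-around content then lands on the bottom and right edges, so those are the bands to black out. A sketch under the same setup:
rolled = np.roll(image, -shift, axis=[0, 1])
# black out the bottom and right bands where the wrap-around landed
rolled = cv2.rectangle(rolled, (0, height - shift), (width, height), 0, -1)
rolled = cv2.rectangle(rolled, (width - shift, 0), (width, height), 0, -1)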
Here is a simple solution that translates an image by ty and tx pixels using only array indexing; it does not roll over, and it handles negative values as well (rows are the y axis and columns the x axis, so the y terms index the first axis):
ty, tx = 5, 8  # translation on the y (rows) and x (columns) axes, in pixels
N, M = image.shape  # N rows, M columns
image_translated = np.zeros_like(image)
image_translated[max(ty, 0):N + min(ty, 0), max(tx, 0):M + min(tx, 0)] = \
    image[-min(ty, 0):N - max(ty, 0), -min(tx, 0):M - max(tx, 0)]
Example:
(Note that for simplicity it does not handle cases where |tx| > M or |ty| > N.)
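For instance, a quick check on a small array (values chosen only for illustration):
import numpy as np

image = np.arange(16).reshape(4, 4)
ty, tx = 1, -2  # one row down, two columns left
N, M = image.shape
image_translated = np.zeros_like(image)
image_translated[max(ty, 0):N + min(ty, 0), max(tx, 0):M + min(tx, 0)] = \
    image[-min(ty, 0):N - max(ty, 0), -min(tx, 0):M - max(tx, 0)]
print(image_translated)
# [[ 0  0  0  0]
#  [ 2  3  0  0]
#  [ 6  7  0  0]
#  [10 11  0  0]]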

How to make a shape larger or smaller without changing the resolution of the image using OpenCV or PIL in Python

I would like to be able to make a certain shape in either a PIL image or an OpenCV image 3 times larger or smaller, without changing the resolution of the image or distorting the form of the shape I want to resize. I have tried using OpenCV's dilation method, but that is not its intended use, plus it changed the form of the shape. For example:
Thanks.
Here's a way of doing it:
find the interesting shape, i.e. non-white ROI area
extract it
scale it up by a factor
clear the original image to white
paste the scaled ROI back into image with same centre
#!/usr/bin/env python3
import cv2
import numpy as np

if __name__ == "__main__":
    # Open image
    orig = cv2.imread('image.png', cv2.IMREAD_COLOR)

    # Get extent of interesting part, i.e. non-white part
    y, x, _ = np.nonzero(~orig)
    y0, y1 = np.min(y), np.max(y)  # top and bottom rows
    x0, x1 = np.min(x), np.max(x)  # left and right cols
    h, w = y1 - y0, x1 - x0        # height and width
    ROI = orig[y0:y1, x0:x1]       # extract ROI
    cv2.imwrite('ROI.png', ROI)    # DEBUG only

    # Upscale ROI
    factor = 3
    scaledROI = cv2.resize(ROI, (w * factor, h * factor), interpolation=cv2.INTER_NEAREST)
    newH, newW = scaledROI.shape[:2]

    # Clear original image to white
    orig[:] = [255, 255, 255]

    # Get centre of original shape, and position of top-left of ROI in output image
    cx, cy = (x0 + x1) // 2, (y0 + y1) // 2
    top = cy - newH // 2
    left = cx - newW // 2

    # Paste in rescaled ROI
    orig[top:top + newH, left:left + newW] = scaledROI
    cv2.imwrite('result.png', orig)
That transforms this:
to this:
Puts me in mind of a pantograph:

Using masks to apply different thresholds to different parts of an image

I have an image in which I want to threshold the part within a circular region with one value, and the remainder of the image outside of this region with another.
Unfortunately my attempts seem to be thresholding the image as a whole, ignoring the masks. How can this be properly achieved? See my code attempt below.
import cv2
import numpy as np
from matplotlib import pyplot as plt

def circular_mask(h, w, centre=None, radius=None):
    if centre is None:  # use the middle of the image
        centre = [int(w / 2), int(h / 2)]
    if radius is None:  # use the smallest distance between the centre and image walls
        radius = min(centre[0], centre[1], w - centre[0], h - centre[1])
    Y, X = np.ogrid[:h, :w]
    dist_from_centre = np.sqrt((X - centre[0]) ** 2 + (Y - centre[1]) ** 2)
    mask = dist_from_centre <= radius
    return mask

img = cv2.imread('image.png', 0)  # read image as grayscale
h, w = img.shape[:2]
mask = circular_mask(h, w, centre=(135, 140), radius=75)  # create a boolean circle mask

mask_img = img.copy()

inside = np.ma.array(mask_img, mask=~mask)
t1 = inside < 50  # threshold part of image within the circle, ignore rest of image
plt.imshow(inside)
plt.imshow(t1, alpha=.25)
plt.show()

outside = np.ma.array(mask_img, mask=mask)
t2 = outside < 20  # threshold image outside circle region, ignoring image in circle
plt.imshow(outside)
plt.imshow(t2, alpha=.25)
plt.show()

fin = np.logical_or(t1, t2)  # combine the results from both thresholds together
plt.imshow(fin)
plt.show()
Working solution:
img = cv2.imread('image.png', 0)
h, w = img.shape[:2]
mask = circular_mask(h, w, centre=(135, 140), radius=75)

inside = img.copy() * mask  # zero out everything outside the circle
t1 = inside < 50  # get_threshold(inside, 1)
plt.imshow(inside)
plt.show()

outside = img.copy() * ~mask  # zero out everything inside the circle
t2 = outside < 70
plt.imshow(outside)
plt.show()

plt.imshow(t1)
plt.show()
plt.imshow(t2)
plt.show()
plt.imshow(np.logical_and(t1, t2))
plt.show()
I assume your image is single layered (e.g. greyscale).
You can make two copies of the image. Multiply (or logical-AND) your mask with one of them and the inverse of that mask with the other one. Now apply your desired threshold to each of them. In the end, merge both images using a logical OR operation.
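A minimal sketch of that recipe, assuming a grayscale img and the boolean mask from circular_mask above (the threshold values are arbitrary):
inside = img * mask    # circle region, everything else zeroed
outside = img * ~mask  # everything except the circle

t_inside = (inside < 50) & mask     # threshold, counted only inside the circle
t_outside = (outside < 70) & ~mask  # threshold, counted only outside it

combined = np.logical_or(t_inside, t_outside)
plt.imshow(combined)
plt.show()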
