I've been trying to remove a green screen and crop the result. It works, but it is pretty slow, especially when I run it on hundreds of pictures. I am not very familiar with image processing or the PIL library, so I would love advice on how to make my code faster.
How it works: it loops over the pixels, recording each pixel it visits, until it hits a non-green color, at which point it records the distance in pixels from the edge. I went with four loops, one per edge, because I wanted to minimize the number of pixels I had to traverse (I could do the same thing with one loop, but it would traverse every pixel). The visitedPixel set prevents handling the same pixel twice. Once the loops are done, the recorded offsets form a box that is used to trim off the green-screen edges, thus cropping the image.
from PIL import Image

def trim_greenscreen_and_crop(image_name, output_name):
    img = Image.open(image_name)
    pixels = img.load()
    width = img.size[0]
    height = img.size[1]
    visitedPixel = set()
    box = [0, 0, 0, 0]
    # left edge
    break_flag = False
    for x in range(width):
        for y in range(height):
            coords = (x, y)
            r, g, b = pixels[x, y]
            if not (g > r and g > b and g > 200) and coords not in visitedPixel:
                box[0] = x - 1
                break_flag = True
                break
            visitedPixel.add(coords)
        if break_flag:
            break
    # top edge
    break_flag = False
    for y in range(height):
        for x in range(width):
            coords = (x, y)
            r, g, b = pixels[x, y]
            if not (g > r and g > b and g > 200) and coords not in visitedPixel:
                box[1] = y-1
                break_flag = True
                break
            visitedPixel.add(coords)
        if break_flag:
            break
    # right edge
    break_flag = False
    for x in range(width - 1, -1, -1):
        for y in range(height):
            coords = (x, y)
            r, g, b = pixels[x, y]
            if not (g > r and g > b and g > 200) and coords not in visitedPixel:
                box[2] = x + 1
                break_flag = True
                break
            visitedPixel.add(coords)
        if break_flag:
            break
    # bottom edge
    break_flag = False
    for y in range(height - 1, -1, -1):
        for x in range(width):
            coords = (x, y)
            r, g, b = pixels[x, y]
            if not (g > r and g > b and g > 200) and coords not in visitedPixel:
                box[3] = y + 1
                break_flag = True
                break
            visitedPixel.add(coords)
        if break_flag:
            break
    cropped_img = img.crop(box)
    if cropped_img.size == (0, 0):
        return img.size
    # cropped_img.save(output_name)
    return cropped_img.size
So I tried NumPy and got this much faster solution, which finds the variance of the rows and columns, thanks to MarkSetchell's idea.
Draft:
import numpy as np
from PIL import Image

def trim_greenscreen_and_crop(image_name, output_name):
    # read the image once; the array has shape (height, width, channels)
    img = Image.open(image_name)
    np_img = np.array(img)
    # variance along each axis: axis=0 gives one value per column (x),
    # axis=1 gives one value per row (y)
    row_var = np.var(np_img, axis=0)
    col_var = np.var(np_img, axis=1)
    # select the columns and rows with some variance (basically not all green)
    no_variance_row = np.where(row_var > 5)
    no_variance_col = np.where(col_var > 5)
    # if the entire image is green, don't trim
    if len(no_variance_row[0]) == 0 or len(no_variance_col[0]) == 0:
        return img.size
    else:
        # crop from the first to the last varying column and row; the right
        # and lower bounds of crop() are exclusive, hence the + 1
        cropped_img = img.crop((no_variance_row[0][0], no_variance_col[0][0], no_variance_row[0][-1] + 1, no_variance_col[0][-1] + 1))
        cropped_img.save(output_name)
        return cropped_img.size
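For comparison, here is a minimal sketch of a NumPy version that keeps the original "is this pixel green?" test instead of the variance heuristic. The function name non_green_bbox is made up for illustration, and it assumes an RGB array of shape (height, width, 3); treat it as a starting point rather than a tuned solution.

```python
import numpy as np

def non_green_bbox(rgb):
    """Return (left, top, right, bottom) around all non-green pixels, or None.

    `rgb` is assumed to be an array of shape (height, width, 3).
    The right and bottom bounds are exclusive, as Image.crop() expects.
    """
    rgb = np.asarray(rgb)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # Same criterion as the loop version: green dominates and is bright.
    green = (g > r) & (g > b) & (g > 200)
    rows = np.flatnonzero((~green).any(axis=1))  # rows with a non-green pixel
    cols = np.flatnonzero((~green).any(axis=0))  # columns with a non-green pixel
    if rows.size == 0:
        return None  # the whole image is green
    return int(cols[0]), int(rows[0]), int(cols[-1]) + 1, int(rows[-1]) + 1

# Synthetic example: a green frame with a red block in the middle.
demo = np.zeros((6, 8, 3), dtype=np.uint8)
demo[...] = (0, 255, 0)
demo[2:4, 3:6] = (255, 0, 0)
demo_box = non_green_bbox(demo)
```

With a PIL image, the returned tuple could be passed straight to img.crop(...) when it is not None.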
You could speed up edge detection if you consider the image itself 'big enough': instead of looping over every pixel of the source image, walk along the diagonal, incrementing both x and y in one go until you reach a non-green color. You can repeat this process from all four corners. I hope you get the idea.
Edit:
You can also speed things up by checking not every pixel but only pixels on a 'grid', i.e. incrementing x and y by some big enough step. This also only works if your image is big enough.
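A rough sketch of the grid idea, in plain Python. The names coarse_bounds and is_background are hypothetical, and the result is only an approximate box: it is padded by one step on each side because the true edge may lie between two samples, and it assumes the subject is larger than the step.

```python
def coarse_bounds(is_background, width, height, step=16):
    """Sample every `step`-th pixel and return an approximate bounding box.

    `is_background(x, y)` is a caller-supplied test (e.g. the green check).
    Returns (left, top, right, bottom) with exclusive right/bottom, or None
    if no sampled pixel is foreground.
    """
    xs, ys = [], []
    for y in range(0, height, step):
        for x in range(0, width, step):
            if not is_background(x, y):
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    # Pad by one step: the real edge is at most step - 1 pixels beyond the
    # outermost foreground sample, provided the subject is wider than `step`.
    left = max(0, min(xs) - step)
    top = max(0, min(ys) - step)
    right = min(width, max(xs) + step)
    bottom = min(height, max(ys) + step)
    return left, top, right, bottom
```

With PIL, is_background could wrap pixels[x, y] and the green test from the question. An exact per-pixel scan can then be limited to the padded margin instead of the whole image.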
I'm using an MTCNN network (https://towardsdatascience.com/face-detection-using-mtcnn-a-guide-for-face-extraction-with-a-focus-on-speed-c6d59f82d49) to detect faces and heads. For this I'm using the classic face-detection code: I get the coordinates of the top-left corner of the face's bounding box (x, y) plus the width and height of the box (w, h), then I expand the box to get the head in my crop:
import cv2
from mtcnn import MTCNN

# assumed setup: the original snippet did not show how `detector` was created
detector = MTCNN()
img = cv2.imread('images/' + path_res)
faces = detector.detect_faces(img)  # result
for result in faces:
    x, y, w, h = result['box']
    x1, y1 = x + w, y + h
    # expand the face box by fixed pixel margins, clamped to the image bounds
    a = max(0, x - 100)
    b = max(0, y - 150)
    c = min(img.shape[1], x1 + 100)
    d = min(img.shape[0], y1 + 60)
    crop = img[b:d, a:c]  # <--- final crop of the head
The problem is that this solution works for some images, but for many others the crop also includes the shoulders and neck of the target person. I think it's because the pixels per inch differ between images (i.e. +150 pixels in one image doesn't cover the same area as in another). Hence, what can I do to extract the head properly?
Many thanks
You can use relative instead of absolute sizes for the margins around the detected faces. For example, 50% on top, bottom, left and right:
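import cv2
from mtcnn import MTCNN

# assumed setup: the original snippet did not show how `detector` was created
detector = MTCNN()
img = cv2.imread('images/' + path_res)
faces = []
for result in detector.detect_faces(img):
    x, y, w, h = result['box']
    # margins are half the box size on each side, clamped to the image bounds
    b = max(0, y - (h//2))
    d = min(img.shape[0], (y+h) + (h//2))
    a = max(0, x - (w//2))
    c = min(img.shape[1], (x+w) + (w//2))
    face = img[b:d, a:c, :]
    faces.append(face)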
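If you want to tune the margin per image set, the clamping arithmetic can be pulled into a small helper. This is only a sketch: expand_box and its margin parameter are names invented here, not part of mtcnn, and the box is assumed to be in the (x, y, w, h) form that the detector returns.

```python
def expand_box(box, img_width, img_height, margin=0.5):
    """Grow an (x, y, w, h) box by `margin` times its size on every side.

    Returns (left, top, right, bottom), clamped to the image, suitable for
    NumPy slicing as img[top:bottom, left:right].
    """
    x, y, w, h = box
    dx = int(w * margin)  # horizontal margin relative to the face width
    dy = int(h * margin)  # vertical margin relative to the face height
    left = max(0, x - dx)
    top = max(0, y - dy)
    right = min(img_width, x + w + dx)
    bottom = min(img_height, y + h + dy)
    return left, top, right, bottom
```

For example, expand_box(result['box'], img.shape[1], img.shape[0], margin=0.3) would give a tighter crop than the 50% used above.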
In Python, I have written some code that draws a circle using Bresenham's midpoint algorithm:
from PIL import Image, ImageDraw

radius = 100 #radius of circle
xpts = [] #array to hold x pts
ypts = [] #array to hold y pts
img = Image.new('RGB', (1000, 1000))
draw = ImageDraw.Draw(img) #to use draw.line()
pixels = img.load()
d = (5/4) - radius
x = 0
y = radius
xpts.append(x) #initial x value
ypts.append(y) #initial y value
while x < y:
    if d < 0:
        d += (2*x + 3)
        x += 1
        xpts.append(x + 500) #translate points to center by 500px
        ypts.append(y - 500)
    else:
        d += (2 * (x - y) + 5)
        x += 1
        y -= 1
        xpts.append(x + 500) #translate points to center by 500px
        ypts.append(y - 500)
for i in range(len(xpts)): #draw initial and reflected octant points
    pixels[xpts[i] ,ypts[i]] = (255,255,0) #initial octant
    pixels[xpts[i],-ypts[i]] = (255,255,0)
    pixels[-xpts[i],ypts[i]] = (255,255,0)
    pixels[-xpts[i],-ypts[i]] = (255,255,0)
    pixels[ypts[i],xpts[i]] = (255,255,0)
    pixels[-ypts[i],xpts[i]] = (255,255,0)
    pixels[ypts[i],-xpts[i]] = (255,255,0)
    pixels[-ypts[i],-xpts[i]] = (255,255,0)
img.show()
To fill it, I had planned to use ImageDraw to draw a line horizontally within the circle from each point that is generated from the initial octant using draw.line(). I have the x and y coordinates stored in arrays. However, I am stuck interpreting each point and its reflection point to draw the horizontal line using draw.line(). Could someone clarify this?
Instead of drawing individual pixels, you would just add a line that connects the pixels corresponding to each other (either -x and +x or -y and +y). For each Bresenham step, you draw four lines (each connecting two octants).
Here is your adapted sample code. I dropped the points array and instead drew the lines directly. I also added the cx and cy variables that define the circle center. In your code, you sometimes used negative indices. This only works by coincidence because the circle is in the center:
from PIL import Image, ImageDraw

radius = 100  # radius of circle
img = Image.new('RGB', (1000, 1000))
draw = ImageDraw.Draw(img)  # to use draw.line()
d = (5 / 4) - radius
x = 0
y = radius
cx = 500  # circle center
cy = 500

def draw_scanlines(x, y):
    # one horizontal line per pair of mirrored octant points
    color = (255, 255, 0)
    draw.line((cx - x, cy + y, cx + x, cy + y), fill=color)
    draw.line((cx - x, cy - y, cx + x, cy - y), fill=color)
    draw.line((cx - y, cy + x, cx + y, cy + x), fill=color)
    draw.line((cx - y, cy - x, cx + y, cy - x), fill=color)

draw_scanlines(x, y)
while x < y:
    if d < 0:
        d += (2 * x + 3)
        x += 1
    else:
        d += (2 * (x - y) + 5)
        x += 1
        y -= 1
    draw_scanlines(x, y)
img.show()
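If it helps to see which horizontal lines end up being drawn, here is a sketch that separates the midpoint stepping from the drawing and just collects one half-width per row. filled_circle_spans is a name I made up. It uses the integer start value d = 1 - radius, which makes the same d < 0 decisions as 5/4 - radius when x and y are integers.

```python
def filled_circle_spans(radius):
    """Return {dy: half_width} for every row offset dy in [-radius, radius].

    Row cy + dy of the filled circle runs from cx - half_width to
    cx + half_width (inclusive) for a circle centered at (cx, cy).
    """
    spans = {}

    def record(dx, dy):
        # Mirror the point into the rows above and below the center.
        for row in (dy, -dy):
            spans[row] = max(spans.get(row, 0), dx)

    x, y = 0, radius
    d = 1 - radius  # integer form of the midpoint decision variable
    record(x, y)
    record(y, x)
    while x < y:
        if d < 0:
            d += 2 * x + 3
        else:
            d += 2 * (x - y) + 5
            y -= 1
        x += 1
        record(x, y)  # octant point and its mirror across the diagonal
        record(y, x)
    return spans
```

Each entry then becomes a single call such as draw.line((cx - w, cy + dy, cx + w, cy + dy), fill=color), so every row is drawn once instead of several times.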
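Instead of drawing lines, you can fill all points inside a circle with radius radius in O(n^2) (with n = radius) by testing every point of one quadrant and mirroring it into the other three:
# ... setup from the question (img, pixels, radius) goes here ...
for x in range(radius):
    for y in range(radius):
        if x**2 + y**2 < radius**2:
            # the center is (500, 500); offsets are added explicitly
            # rather than relying on negative indices wrapping around
            pixels[500 + x, 500 + y] = (255, 255, 0)
            pixels[500 + x, 500 - y] = (255, 255, 0)
            pixels[500 - x, 500 + y] = (255, 255, 0)
            pixels[500 - x, 500 - y] = (255, 255, 0)
img.show()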
I am trying to implement the circular Hough transform using the equation r = sqrt((x-h)^2 + (y-k)^2) to detect circles in an image.
I applied a list of preprocessing steps such as Gaussian blur and Canny edge detection. After that, I don't understand how to apply the equation above when the radius range and the boundary points are available. The implementation should produce an accumulator space that contains the radius and center of each detected circle. I want to implement this without OpenCV's HoughCircles function. Does anyone have an idea that could help? My current code takes a very long time.
import numpy as np
import cv2
import math

image = cv2.imread(imagepath)
h, w = image.shape[:2]
print(h, w)
grayimg = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
bimg = cv2.bilateralFilter(grayimg, 5, 175, 175)
cann = cv2.Canny(bimg, 100, 200)
pixel = np.argwhere(cann == 255)  # (row, col) of every edge pixel
accum = []
ct = 0
for r in range(10, 21):
    for h in range(0, 20):
        for k in range(0, 20):
            for p in pixel:
                print(r, h, k, p)
                xpart = (h - p[0])**2
                ypart = (k - p[1])**2
                rhs = xpart + ypart
                lhs = r * r
                if lhs == rhs:
                    accum.append((r, (h, k), p))
print(len(accum))
cv2.waitKey(0)
This code still needs some improvement to make filling the accumulator space fast.
accum = [[[0 for r in range(10, 21)] for h in range(0, 30)] for k in range(0, 30)]
print(accum)
ct = 0
for r in range(10, 21):
    for h in range(0, 30):
        for k in range(0, 30):
            for p in pixel:
                # print(r, h, k, p)
                xpart = (h - p[0])**2
                ypart = (k - p[1])**2
                rhs = xpart + ypart
                lhs = r * r
                if lhs == rhs:
                    accum[k][h][r-10] += 1
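One way the innermost loops might be sped up is to let NumPy compute the distance from every candidate center to every edge pixel at once. The sketch below makes two assumptions that differ from the code above: votes are counted when the distance is within a tolerance tol of r (instead of requiring exact equality of the squares), and the accumulator is indexed as accum[radius_index, h, k]. The function name is made up.

```python
import numpy as np

def hough_circle_accumulator(edge_points, radii, h_values, k_values, tol=0.5):
    """Vote for circles (r, h, k) given edge points of the form (p0, p1).

    As in the loops above, h is compared against p[0] and k against p[1].
    Returns an integer array of shape (len(radii), len(h_values), len(k_values)).
    """
    pts = np.asarray(edge_points, dtype=float)  # shape (N, 2)
    hs = np.asarray(h_values, dtype=float)
    ks = np.asarray(k_values, dtype=float)
    # Distance from every candidate center to every point: shape (H, K, N).
    dh = hs[:, None, None] - pts[None, None, :, 0]
    dk = ks[None, :, None] - pts[None, None, :, 1]
    dist = np.sqrt(dh ** 2 + dk ** 2)
    accum = np.zeros((len(radii), len(hs), len(ks)), dtype=int)
    for i, r in enumerate(radii):
        # A point votes for (r, h, k) if it lies within `tol` of that circle.
        accum[i] = (np.abs(dist - r) <= tol).sum(axis=2)
    return accum

# Synthetic check: 60 points on a circle of radius 10 centered at (15, 12).
theta = np.linspace(0.0, 2.0 * np.pi, 60, endpoint=False)
points = np.column_stack((15 + 10 * np.cos(theta), 12 + 10 * np.sin(theta)))
accum = hough_circle_accumulator(points, range(10, 21), range(0, 30), range(0, 30))
r_idx, h_best, k_best = np.unravel_index(accum.argmax(), accum.shape)
```

With real data, edge_points would be the pixel array from np.argwhere. The (H, K, N) distance array can get large for many edge pixels, so the centers may need to be processed in chunks.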
Hi, I am creating a program that replaces a face in an image with someone else's face. However, I am stuck on inserting the new face into the original, larger image. I have researched ROI and addWeighted (which needs the images to be the same size), but I haven't found a way to do this in Python. Any advice is appreciated. I am new to OpenCV.
I am using two test images, smaller_image and larger_image (not reproduced here).
Here is my code so far, a mix of other samples:
import sys
import cv2

def detect(img, cascade):
    rects = cascade.detectMultiScale(img, scaleFactor=1.1, minNeighbors=3, minSize=(10, 10), flags=cv2.CASCADE_SCALE_IMAGE)
    if len(rects) == 0:
        return []
    rects[:, 2:] += rects[:, :2]  # convert (x, y, w, h) to (x1, y1, x2, y2)
    return rects

def draw_rects(img, rects, color):
    for x1, y1, x2, y2 in rects:
        cv2.rectangle(img, (x1, y1), (x2, y2), color, 2)

if __name__ == '__main__':
    if len(sys.argv) != 2:  ## Check for error in usage syntax
        print("Usage : python faces.py <image_file>")
    else:
        img = cv2.imread(sys.argv[1], cv2.IMREAD_COLOR)  ## Read image file
        if img is None:
            print("Could not open or find the image")
        else:
            cascade = cv2.CascadeClassifier("haarcascade_frontalface_alt.xml")
            gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            gray = cv2.equalizeHist(gray)
            rects = detect(gray, cascade)
            ## Extract face coordinates (detect() returns x1, y1, x2, y2)
            x1, y1, x2, y2 = rects[0]
            y = y2 - y1
            x = x2 - x1
            ## Extract face ROI (NumPy indexes rows, i.e. y, first)
            faceROI = gray[y1:y2, x1:x2]
            ## Show face ROI
            cv2.imshow('Display face ROI', faceROI)
            small = cv2.imread("average_face.png", cv2.IMREAD_COLOR)
            print("here")
            small = cv2.resize(small, (x, y))
            cv2.namedWindow('Display image')  ## create window for display
            cv2.imshow('Display image', small)  ## Show image in the window
            print("size of image: ", img.shape)  ## print size of image
            cv2.waitKey(1000)
A simple way to achieve what you want:
import cv2
s_img = cv2.imread("smaller_image.png")
l_img = cv2.imread("larger_image.jpg")
x_offset=y_offset=50
l_img[y_offset:y_offset+s_img.shape[0], x_offset:x_offset+s_img.shape[1]] = s_img
Update
I suppose you want to take care of the alpha channel too. Here is a quick and dirty way of doing so:
s_img = cv2.imread("smaller_image.png", -1)
y1, y2 = y_offset, y_offset + s_img.shape[0]
x1, x2 = x_offset, x_offset + s_img.shape[1]
alpha_s = s_img[:, :, 3] / 255.0
alpha_l = 1.0 - alpha_s
for c in range(0, 3):
    l_img[y1:y2, x1:x2, c] = (alpha_s * s_img[:, :, c] +
                              alpha_l * l_img[y1:y2, x1:x2, c])
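The per-channel loop can probably also be written as one broadcast expression. The sketch below uses small synthetic arrays in place of the images read with cv2.imread, and the function name blend_rgba_onto_bgr is invented here. It assumes the overlay fits entirely inside the larger image.

```python
import numpy as np

def blend_rgba_onto_bgr(l_img, s_img, x_offset, y_offset):
    """Alpha-blend a 4-channel `s_img` onto a 3-channel `l_img` in place.

    Assumes the overlay fits entirely inside `l_img` at the given offset.
    """
    h, w = s_img.shape[:2]
    roi = l_img[y_offset:y_offset + h, x_offset:x_offset + w]  # view into l_img
    # Slicing with 3:4 keeps the channel axis, so alpha has shape (h, w, 1)
    # and broadcasts across the three color channels.
    alpha = s_img[:, :, 3:4].astype(np.float32) / 255.0
    blended = alpha * s_img[:, :, :3] + (1.0 - alpha) * roi
    roi[:] = np.rint(blended).astype(l_img.dtype)
    return l_img

# Synthetic stand-ins for the two images.
l_img = np.full((4, 4, 3), 100, dtype=np.uint8)
s_img = np.full((2, 2, 4), 200, dtype=np.uint8)
s_img[:, :, 3] = [[255, 0], [51, 255]]  # opaque, transparent, 20%, opaque
blend_rgba_onto_bgr(l_img, s_img, x_offset=1, y_offset=1)
```

Rounding before the cast back to uint8 avoids the truncation that a plain assignment of floats would apply.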
Using @fireant's idea, I wrote a function to handle overlays. It works for any position argument (including negative positions).
def overlay_image_alpha(img, img_overlay, x, y, alpha_mask):
    """Overlay `img_overlay` onto `img` at (x, y) and blend using `alpha_mask`.

    `alpha_mask` must have same HxW as `img_overlay` and values in range [0, 1].
    """
    # Image ranges
    y1, y2 = max(0, y), min(img.shape[0], y + img_overlay.shape[0])
    x1, x2 = max(0, x), min(img.shape[1], x + img_overlay.shape[1])
    # Overlay ranges
    y1o, y2o = max(0, -y), min(img_overlay.shape[0], img.shape[0] - y)
    x1o, x2o = max(0, -x), min(img_overlay.shape[1], img.shape[1] - x)
    # Exit if nothing to do
    if y1 >= y2 or x1 >= x2 or y1o >= y2o or x1o >= x2o:
        return
    # Blend overlay within the determined ranges
    img_crop = img[y1:y2, x1:x2]
    img_overlay_crop = img_overlay[y1o:y2o, x1o:x2o]
    alpha = alpha_mask[y1o:y2o, x1o:x2o, np.newaxis]
    alpha_inv = 1.0 - alpha
    img_crop[:] = alpha * img_overlay_crop + alpha_inv * img_crop
Example usage:
import numpy as np
from PIL import Image
# Prepare inputs
x, y = 50, 0
img = np.array(Image.open("img_large.jpg"))
img_overlay_rgba = np.array(Image.open("img_small.png"))
# Perform blending
alpha_mask = img_overlay_rgba[:, :, 3] / 255.0
img_result = img[:, :, :3].copy()
img_overlay = img_overlay_rgba[:, :, :3]
overlay_image_alpha(img_result, img_overlay, x, y, alpha_mask)
# Save result
Image.fromarray(img_result).save("img_result.jpg")
If you encounter errors or unusual outputs, please ensure:
img should not contain an alpha channel. (e.g. If it is RGBA, convert to RGB first.)
img_overlay has the same number of channels as img.
Based on fireant's excellent answer above, here is the alpha blending but a bit more human legible. You may need to swap 1.0-alpha and alpha depending on which direction you're merging (mine is swapped from fireant's answer).
# o* (ox, oy) are offsets into the overlay, s_img
# b* (bx, by) are offsets into the background, l_img
for c in range(0, 3):
    alpha = s_img[oy:oy+height, ox:ox+width, 3] / 255.0
    color = s_img[oy:oy+height, ox:ox+width, c] * (1.0-alpha)
    beta = l_img[by:by+height, bx:bx+width, c] * (alpha)
    l_img[by:by+height, bx:bx+width, c] = color + beta
Here it is:
import cv2
import numpy as np

def put4ChannelImageOn4ChannelImage(back, fore, x, y):
    rows, cols, channels = fore.shape
    trans_indices = fore[..., 3] != 0  # Where not transparent
    overlay_copy = back[y:y+rows, x:x+cols]
    overlay_copy[trans_indices] = fore[trans_indices]
    back[y:y+rows, x:x+cols] = overlay_copy

# test
background = np.zeros((1000, 1000, 4), np.uint8)
background[:] = (127, 127, 127, 1)
overlay = cv2.imread('imagee.png', cv2.IMREAD_UNCHANGED)
put4ChannelImageOn4ChannelImage(background, overlay, 5, 5)
A simple function that blits an image front onto an image back and returns the result. It works with both 3 and 4-channel images and deals with the alpha channel. Overlaps are handled as well.
The output image has the same size as back, but always 4 channels.
The output alpha channel is given by (u+v)/(1+uv) where u,v are the alpha channels of the front and back image and -1 <= u,v <= 1. Where there is no overlap with front, the alpha value from back is taken.
import cv2

def merge_image(back, front, x, y):
    # convert to BGRA (OpenCV channel order plus alpha)
    if back.shape[2] == 3:
        back = cv2.cvtColor(back, cv2.COLOR_BGR2BGRA)
    if front.shape[2] == 3:
        front = cv2.cvtColor(front, cv2.COLOR_BGR2BGRA)
    # crop the overlay from both images
    bh, bw = back.shape[:2]
    fh, fw = front.shape[:2]
    x1, x2 = max(x, 0), min(x+fw, bw)
    y1, y2 = max(y, 0), min(y+fh, bh)
    front_cropped = front[y1-y:y2-y, x1-x:x2-x]
    back_cropped = back[y1:y2, x1:x2]
    # 3:4 keeps the channel axis so the alphas broadcast over the colors
    alpha_front = front_cropped[:, :, 3:4] / 255
    alpha_back = back_cropped[:, :, 3:4] / 255
    # replace an area in result with overlay
    result = back.copy()
    result[y1:y2, x1:x2, :3] = alpha_front * front_cropped[:, :, :3] + (1-alpha_front) * back_cropped[:, :, :3]
    result[y1:y2, x1:x2, 3:4] = (alpha_front + alpha_back) / (1 + alpha_front*alpha_back) * 255
    return result
To blend s_img semi-transparently onto the background, I just call cv2.addWeighted before the line
l_img[y_offset:y_offset+s_img.shape[0], x_offset:x_offset+s_img.shape[1]] = s_img
as follows:
s_img=cv2.addWeighted(l_img[y_offset:y_offset+s_img.shape[0], x_offset:x_offset+s_img.shape[1]],0.5,s_img,0.5,0)
If you try to write to the destination image using any of the answers above and get the following error:
ValueError: assignment destination is read-only
a quick potential fix is to set the WRITEABLE flag to true:
img.setflags(write=1)
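If setting the flag is not possible (NumPy can refuse when the array does not own its memory), copying the array is another option. A small sketch, with ensure_writable as a made-up helper name and a deliberately locked array standing in for the read-only image:

```python
import numpy as np

def ensure_writable(arr):
    """Return `arr` itself if it is writable, otherwise a writable copy."""
    if arr.flags.writeable:
        return arr
    return arr.copy()  # copies own their data and are writable

# Simulate a read-only image array.
locked = np.zeros((2, 2, 3), dtype=np.uint8)
locked.setflags(write=False)

try:
    locked[0, 0] = 255
    raised = False
except ValueError:  # "assignment destination is read-only"
    raised = True

img = ensure_writable(locked)
img[0, 0] = 255  # works on the copy; `locked` is left untouched
```

The copy costs extra memory, but it also keeps the original image data unmodified.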
A simple function for pasting a 4-channel image onto a 4-channel image that works:
def paste(background, foreground, pos=(0, 0)):
    # get position and crop pasting area if needed
    x, y = pos
    # NumPy shape is (rows, cols, channels), i.e. height comes first
    bgHeight, bgWidth = background.shape[:2]
    frHeight, frWidth = foreground.shape[:2]
    width = min(frWidth, bgWidth - x)
    height = min(frHeight, bgHeight - y)
    # normalize alpha channels from 0-255 to 0-1
    alpha_background = background[y:y+height, x:x+width, 3] / 255.0
    alpha_foreground = foreground[:height, :width, 3] / 255.0
    # set adjusted colors
    for color in range(0, 3):
        fr = alpha_foreground * foreground[:height, :width, color]
        bg = alpha_background * background[y:y+height, x:x+width, color] * (1 - alpha_foreground)
        background[y:y+height, x:x+width, color] = fr + bg
    # set adjusted alpha and denormalize back to 0-255
    background[y:y+height, x:x+width, 3] = (1 - (1 - alpha_foreground) * (1 - alpha_background)) * 255
    return background
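For reference, the alpha expression 1 - (1 - a_f)(1 - a_b) used here equals a_f + a_b(1 - a_f), which is the output alpha of the usual "over" compositing operator. Here is a scalar sketch of that operator for a single channel value with straight (non-premultiplied) alpha. The function name over is my own, and note that it divides the color by the output alpha, which the paste function does not appear to do.

```python
def over(c_front, a_front, c_back, a_back):
    """Composite one front channel value over a back value.

    Alphas are in [0, 1]; returns (c_out, a_out) with straight alpha.
    """
    a_out = a_front + a_back * (1.0 - a_front)
    if a_out == 0.0:
        return 0.0, 0.0  # fully transparent result; color is arbitrary
    c_out = (c_front * a_front + c_back * a_back * (1.0 - a_front)) / a_out
    return c_out, a_out
```

For an opaque background (a_back = 1) the division has no effect, so both approaches should give the same colors in that case.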
I reworked @fireant's concept to allow for optional alpha masks and to allow any x or y, including values outside the bounds of the image. The overlay is cropped to the bounds.
import numpy as np

def overlay_image_alpha(img, img_overlay, x, y, alpha_mask=None):
    """Overlay `img_overlay` onto `img` at (x, y) and blend using optional `alpha_mask`.

    `alpha_mask` must have same HxW as `img_overlay` and values in range [0, 1].
    """
    if y < 0 or y + img_overlay.shape[0] > img.shape[0] or x < 0 or x + img_overlay.shape[1] > img.shape[1]:
        y_origin = 0 if y > 0 else -y
        y_end = img_overlay.shape[0] if y < 0 else min(img.shape[0] - y, img_overlay.shape[0])
        x_origin = 0 if x > 0 else -x
        x_end = img_overlay.shape[1] if x < 0 else min(img.shape[1] - x, img_overlay.shape[1])
        img_overlay_crop = img_overlay[y_origin:y_end, x_origin:x_end]
        alpha = alpha_mask[y_origin:y_end, x_origin:x_end] if alpha_mask is not None else None
    else:
        img_overlay_crop = img_overlay
        alpha = alpha_mask
    y1 = max(y, 0)
    y2 = min(img.shape[0], y1 + img_overlay_crop.shape[0])
    x1 = max(x, 0)
    x2 = min(img.shape[1], x1 + img_overlay_crop.shape[1])
    img_crop = img[y1:y2, x1:x2]
    if alpha is None:
        img_crop[:] = img_overlay_crop
    else:
        # add a trailing axis so an HxW mask broadcasts over the color channels
        if alpha.ndim == 2 and img_crop.ndim == 3:
            alpha = alpha[:, :, np.newaxis]
        img_crop[:] = alpha * img_overlay_crop + (1.0 - alpha) * img_crop