How to remove contours from the edge of the image? - python

I have the following image, generated by the script below.
How can I eliminate the contours along the borders, i.e. between the black background and the purple pixels?
You can find the image as a pytorch tensor here
import cv2
import numpy as np
import matplotlib.pyplot as plt

img = np.moveaxis(image.cpu().numpy(), 0, -1)  # image is a pytorch tensor
img *= 255.0 / img.max()  # normalize to 0..255
img = img.astype(np.uint8)
img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
_, t_img = cv2.threshold(img, 90, 155, cv2.THRESH_TOZERO_INV)
c_img = cv2.Canny(t_img, 10, 100)
contours, _ = cv2.findContours(c_img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cv2.drawContours(img, contours, -1, (255, 0, 0), 1)  # draws in place on img
plt.imshow(img)

I still don't know if this is exactly what you want, but one approach is: generate a binary mask of the foreground, either by simple thresholding or, since the background appears to always be black here, by masking out black regions. Then erode the foreground mask to remove a defined number of pixels from the border of the mask (side note: use binary_dilation for the opposite operation):
import scipy.ndimage as ndimage

# img should be a numpy array with RGB channels in the last dimension
fg_mask = (img > (0, 0, 0)).any(axis=-1)  # binary foreground mask (all pixels which are not black)
filled = ndimage.binary_fill_holes(fg_mask)  # fill potential holes in mask (not needed here)
eroded = ndimage.binary_erosion(filled,
                                iterations=1,
                                structure=ndimage.generate_binary_structure(2, connectivity=2))
new_img = img * eroded[..., None]  # apply eroded mask to img
The parameters of binary_erosion to adjust are iterations (the higher the value, the more pixels are removed from the foreground/background border) and structure (a connectivity of 1 delivers smoother edges; 2 is used here).
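If you prefer to stay within OpenCV, roughly the same effect can be had with cv2.erode. This is just a sketch under the assumption that img and fg_mask are the arrays from the snippet above; the full 3x3 kernel corresponds to connectivity=2:
import cv2
import numpy as np
kernel = np.ones((3, 3), np.uint8)  # full 3x3 kernel, comparable to connectivity=2
eroded_cv = cv2.erode(fg_mask.astype(np.uint8), kernel, iterations=1).astype(bool)
new_img = img * eroded_cv[..., None]  # apply the eroded mask, as above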
Results:
Original picture:
Processed picture (new_img) with eroded borders:

Related

How to clean (delete) a black frame/boundary/contour at the edges of the image after adaptive thresholding

I applied adaptive gaussian thresholding to my .tif image, but a black frame (contour) was created at the edges. I can't understand why, or how to delete it.
I would be very grateful for your help!
P.S. After cv2.threshold(img, 127, 255, cv2.THRESH_BINARY) there is no frame.
This is my original image:
https://drive.google.com/file/d/1DfdmQQ9AS-U2SXtyJzU94oYsLLSUmV7N/view?usp=share_link
This is a fragment of my image (colored) and after gaussian thresholding (black and white). The black contour on the edge of the image is clearly visible.
My code:
img = cv2.imread(" My.tiff", 0)
th = cv2.adaptiveThreshold(img,255,cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY,115,2)
I tried this (found on Stack Overflow), but it made no difference:
img = cv2.imread(file_path, 0)
rows, cols = img.shape
cv2.floodFill(img, None, seedPoint=(0, 0), newVal=255, loDiff=1, upDiff=1) # Fill the top left corner.
cv2.floodFill(img, None, seedPoint=(cols-1, 0), newVal=255, loDiff=1, upDiff=1) # Fill the top right corner.
cv2.floodFill(img, None, seedPoint=(0, rows-1), newVal=255, loDiff=1, upDiff=1) # Fill the bottom left corner.
cv2.floodFill(img, None, seedPoint=(cols-1, rows-1), newVal=255, loDiff=1, upDiff=1) # Fill the bottom right corner.
My image after adaptive gaussian thresholding (the thresholding itself is fine, but I can't understand why the black border was created or how to remove it):
The black edge is a result of the steep change in gray levels between the image data and the white margins.
To fix the issue, we may fill the margins with values that are closer to the image pixels, apply adaptiveThreshold, and then restore the margins.
Filling the margins with values that are close to the image pixels is not so simple.
Assuming the image is relatively homogeneous we may apply the following stages for covering the white margins:
Resize the image by a factor of 1.5 in each axis (1.5 is about sqrt(2), which covers a 45 degree rotation).
Blur the resized image with a large kernel.
Crop the center that is the same size as the original image.
Replace the white margins with the matching pixels in the resized, blurred cropped image.
After covering the margins, execute adaptiveThreshold, and fill the margins with zeros.
Code sample:
import cv2
import numpy as np
img = cv2.imread('ndsi.tiff')
# Assume the margins are white, and there are very few white pixels in the other parts of the image.
# Create a mask that is 255 where pixels are NOT white (the image content) and 0 at the white margins.
mask = np.all(img != 255, axis=-1).astype(np.uint8)*255
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) # Convert to grayscale.
# Apply morphological closing for removing small white parts inside the image.
# Note for getting a better mask, we may find minAreaRect as suggested by Micka
#mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, np.ones((5, 5), np.uint8))
#mask = cv2.erode(mask, np.ones((5, 5), np.uint8)) # Erode the mask, because there are too many artifacts
# Resize the image by a factor of 1.5 in each axis
resized_img = cv2.resize(img, (img.shape[1]*3//2, img.shape[0]*3//2))
# Blur with large kernel
resized_img_blurred = cv2.GaussianBlur(resized_img, (51, 51), 50)
# Crop the center that is the same size as the original image.
center_img_blurred = resized_img_blurred[
    (resized_img.shape[0] - img.shape[0])//2:(resized_img.shape[0] + img.shape[0])//2,
    (resized_img.shape[1] - img.shape[1])//2:(resized_img.shape[1] + img.shape[1])//2]
tmp_img = img.copy()
tmp_img[mask==0] = center_img_blurred[mask==0] # Replace white margins with resized blurred image.
# Apply the threshold on tmp_img
th = cv2.adaptiveThreshold(tmp_img, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 2)
# Remove the margins from th
th[mask == 0] = 0
# Show images for testing:
cv2.imshow('mask', cv2.resize(mask, (1024, 1024)))
cv2.imshow('center_img_blurred', cv2.resize(center_img_blurred, (1024, 1024)))
cv2.imshow('tmp_img', cv2.resize(tmp_img, (1024, 1024)))
cv2.imshow('th', cv2.resize(th, (1024, 1024)))
cv2.waitKey()
cv2.destroyAllWindows()
cv2.imwrite('tmp_img.jpg', cv2.resize(tmp_img, (1024, 1024)))
cv2.imwrite('th.png', cv2.resize(th, (1024, 1024)))
Result:
tmp_img:

How to replace a checked pattern in a PNG image with transparent in Python?

I am trying to replace the checkered background (which represents a transparent background in Adobe Illustrator and Photoshop) with real transparency (an alpha channel) in some PNGs with a Python script.
First, I use template matching:
import cv2
import numpy as np
from matplotlib import pyplot as plt
img_rgb = cv2.imread('testimages/fake1.png', cv2.IMREAD_UNCHANGED)
img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
template = cv2.imread('pattern.png', 0)
w, h = template.shape[::-1]
res = cv2.matchTemplate(img_gray, template, cv2.TM_CCOEFF_NORMED)
threshold = 0.8
loc = np.where( res >= threshold)
for pt in zip(*loc[::-1]):
    if len(img_rgb[0][0]) == 3:
        # add alpha channel
        rgba = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2RGBA)
        rgba[:, :, 3] = 255  # default: not transparent
        img_rgb = rgba
    # replace the matched area with a transparent rectangle
    cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (255, 255, 255, 0), -1)
cv2.imwrite('result.png', img_rgb)
Source Image: fake1.png
Pattern Template: pattern.png
Output: result.png (the gray area is actually transparent; enlarge it a bit for easier viewing)
I know this approach has problems: in some cases the template cannot be matched fully, because part of the pattern is hidden by the graphics in the PNG image.
My question is: how can I match such a pattern perfectly using OpenCV? Via FFT filtering?
References:
How particular pixel to transparent in opencv python?
Detecting a pattern in an image and retrieving its position
https://python.plainenglish.io/how-to-remove-image-background-using-python-6f7ffa8eab15
https://answers.opencv.org/question/232506/make-the-background-of-the-image-transparent-using-a-mask/
https://dsp.stackexchange.com/questions/36679/which-image-filter-can-be-applied-to-remove-gridded-pattern-from-corrupt-jpegs
Here is one way to do that in Python/OpenCV simply by thresholding on the checks color range.
Input:
import cv2
import numpy as np
# read input
img = cv2.imread("fake.png")
# threshold on checks
low = (230,230,230)
high = (255,255,255)
mask = cv2.inRange(img, low, high)
# invert alpha
alpha = 255 - mask
# convert img to BGRA
result = cv2.cvtColor(img, cv2.COLOR_BGR2BGRA)
result[:,:,3] = alpha
# save output
cv2.imwrite('fake_transparent.png', result)
cv2.imshow('img', img)
cv2.imshow('mask', mask)
cv2.imshow('result', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Download the resulting image to see that it is actually transparent.
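If you want to verify the transparency programmatically rather than by downloading and inspecting the file, a quick check like this should work (a sketch that simply re-reads the file written above):
check = cv2.imread('fake_transparent.png', cv2.IMREAD_UNCHANGED)
print(check.shape)  # expect 4 channels (BGRA)
print('fraction of transparent pixels:', (check[:, :, 3] == 0).mean())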
Here is one way to use DFT to process the image in Python/OpenCV/Numpy. One does need to know the size of the checkerboard pattern (light or dark square size).
Read the input
Separate channels
Apply DFT to each channel
Shift origin from top left to center of each channel
Extract magnitude and phase images from each channel
Define the checkerboard pattern size
Create a black and white checkerboard image of the same size
Apply similar DFT processing to the checkerboard image
Get the spectrum from the log(magnitude)
Threshold the spectrum to form a mask
Zero out the DC center point in the mask
OPTION: if needed, apply a morphological dilate to thicken the white dots (it does not seem to be needed here)
Invert the mask so the background is white and the dots are black
Convert the mask to range 0 to 1 and make 2 channels
Apply the two-channel mask to the center shifted DFT channels
Shift the center back to the top left in each masked image
Do the IDFT to get back from complex domain to real domain on each channel
Merge the resulting channels back to a BGR image as the final reconstituted image
Save results
Input:
import numpy as np
import cv2
import math
# read input
# note: opencv fft only works on grayscale
img = cv2.imread('fake.png')
hh, ww = img.shape[:2]
# separate channels
b,g,r = cv2.split(img)
# convert images to floats and do dft saving as complex output
dft_b = cv2.dft(np.float32(b), flags = cv2.DFT_COMPLEX_OUTPUT)
dft_g = cv2.dft(np.float32(g), flags = cv2.DFT_COMPLEX_OUTPUT)
dft_r = cv2.dft(np.float32(r), flags = cv2.DFT_COMPLEX_OUTPUT)
# apply shift of origin from upper left corner to center of image
dft_b_shift = np.fft.fftshift(dft_b)
dft_g_shift = np.fft.fftshift(dft_g)
dft_r_shift = np.fft.fftshift(dft_r)
# extract magnitude and phase images
mag_b, phase_b = cv2.cartToPolar(dft_b_shift[:,:,0], dft_b_shift[:,:,1])
mag_g, phase_g = cv2.cartToPolar(dft_g_shift[:,:,0], dft_g_shift[:,:,1])
mag_r, phase_r = cv2.cartToPolar(dft_r_shift[:,:,0], dft_r_shift[:,:,1])
# set check size (size of either dark or light square)
check_size = 15
# create checkerboard pattern
white = np.full((check_size,check_size), 255, dtype=np.uint8)
black = np.full((check_size,check_size), 0, dtype=np.uint8)
checks1 = np.hstack([white,black])
checks2 = np.hstack([black,white])
checks3 = np.vstack([checks1,checks2])
numht = math.ceil(hh / (2*check_size))
numwd = math.ceil(ww / (2*check_size))
checks = np.tile(checks3, (numht,numwd))
checks = checks[0:hh, 0:ww]
# apply dft to checkerboard pattern
dft_c = cv2.dft(np.float32(checks), flags = cv2.DFT_COMPLEX_OUTPUT)
dft_c_shift = np.fft.fftshift(dft_c)
mag_c, phase_c = cv2.cartToPolar(dft_c_shift[:,:,0], dft_c_shift[:,:,1])
# get spectrum from magnitude (add tiny amount to avoid divide by zero error)
spec = np.log(mag_c + 0.00000001)
# threshold the spectrum
mask = cv2.threshold(spec, 1, 255, cv2.THRESH_BINARY)[1]
# mask DC point (center spot)
centx = int(ww/2)
centy = int(hh/2)
dot = np.zeros((3,3), dtype=np.uint8)
mask[centy-1:centy+2, centx-1:centx+2] = dot
# If needed do morphology dilate by small amount.
# But does not seem to be needed in this case
# invert mask
mask = 255 - mask
# apply mask to real and imaginary components
mask1 = (mask/255).astype(np.float32)
mask2 = cv2.merge([mask1,mask1])
complex_b = dft_b_shift*mask2
complex_g = dft_g_shift*mask2
complex_r = dft_r_shift*mask2
# shift origin from center to upper left corner
complex_ishift_b = np.fft.ifftshift(complex_b)
complex_ishift_g = np.fft.ifftshift(complex_g)
complex_ishift_r = np.fft.ifftshift(complex_r)
# do idft with normalization saving as real output and crop to original size
img_notch_b = cv2.idft(complex_ishift_b, flags=cv2.DFT_SCALE+cv2.DFT_REAL_OUTPUT)
img_notch_b = img_notch_b.clip(0,255).astype(np.uint8)
img_notch_b = img_notch_b[0:hh, 0:ww]
img_notch_g = cv2.idft(complex_ishift_g, flags=cv2.DFT_SCALE+cv2.DFT_REAL_OUTPUT)
img_notch_g = img_notch_g.clip(0,255).astype(np.uint8)
img_notch_g = img_notch_g[0:hh, 0:ww]
img_notch_r = cv2.idft(complex_ishift_r, flags=cv2.DFT_SCALE+cv2.DFT_REAL_OUTPUT)
img_notch_r = img_notch_r.clip(0,255).astype(np.uint8)
img_notch_r = img_notch_r[0:hh, 0:ww]
# combine b,g,r components
img_notch = cv2.merge([img_notch_b, img_notch_g, img_notch_r])
# write result to disk
cv2.imwrite("fake_checks.png", checks)
cv2.imwrite("fake_spectrum.png", (255*spec).clip(0,255).astype(np.uint8))
cv2.imwrite("fake_mask.png", mask)
cv2.imwrite("fake_notched.png", img_notch)
# show results
cv2.imshow("ORIGINAL", img)
cv2.imshow("CHECKS", checks)
cv2.imshow("SPECTRUM", spec)
cv2.imshow("MASK", mask)
cv2.imshow("NOTCH", img_notch)
cv2.waitKey(0)
cv2.destroyAllWindows()
Checkerboard image:
Spectrum of checkerboard:
Mask:
Result (notch filtered image):
The checkerboard pattern in the result is mitigated compared to the original, but it is still there upon close inspection.
From here, one needs to threshold on the white background and invert the result to make an image for the alpha channel, then convert the image to BGRA and insert the alpha channel into it, as I described in my other answer above.
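A minimal sketch of that last step, assuming img_notch from the code above and borrowing the same near-white threshold range used in the other answer (the 230 lower bound is an assumption to tune):
bg_mask = cv2.inRange(img_notch, (230, 230, 230), (255, 255, 255))  # near-white background
alpha = 255 - bg_mask  # content opaque, background transparent
result = cv2.cvtColor(img_notch, cv2.COLOR_BGR2BGRA)
result[:, :, 3] = alpha
cv2.imwrite('fake_notched_transparent.png', result)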
Since you're working on PNGs with transparent backgrounds, it would probably be equally viable to extract the content that isn't checkered instead of trying to detect the checkered background itself. This could be achieved with a color check on all pixels, using OpenCV's inRange() function. Below is a StackOverflow link that detects dark spots in an image:
Inrange example
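A minimal sketch of that idea, assuming the checks fall in a light-gray-to-white range (the bounds here are guesses you would tune for your images):
import cv2
import numpy as np
img = cv2.imread('fake1.png')
checks = cv2.inRange(img, (200, 200, 200), (255, 255, 255))  # assumed check color range
content = cv2.bitwise_and(img, img, mask=cv2.bitwise_not(checks))  # keep the non-checkered content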

Color percentage in image for Python using OpenCV

I'm writing code that can detect the percentage of green colour in an image.
I have a little experience with OpenCV but am still pretty new to image processing, and would like some help with my code. How should I change this code so that it is capable of calculating the percentage of green instead of brown? And if it isn't too much trouble, could someone please explain how the changes affect the code? Below is the link to the image I would like to use.
Credit for the code goes to @mmensing
import numpy as np
import cv2
img = cv2.imread('potato.jpg')
brown = [145, 80, 40] # RGB
diff = 20
boundaries = [([brown[2]-diff, brown[1]-diff, brown[0]-diff],
               [brown[2]+diff, brown[1]+diff, brown[0]+diff])]
for (lower, upper) in boundaries:
    lower = np.array(lower, dtype=np.uint8)
    upper = np.array(upper, dtype=np.uint8)
    mask = cv2.inRange(img, lower, upper)
    output = cv2.bitwise_and(img, img, mask=mask)
    ratio_brown = cv2.countNonZero(mask)/(img.size/3)
    print('brown pixel percentage:', np.round(ratio_brown*100, 2))
    cv2.imshow("images", np.hstack([img, output]))
    cv2.waitKey(0)
I've modified your script so you can find the (approximate) percent of green color in your test images. I've added some comments to explain the code:
# Imports
import cv2
import numpy as np
# Read image
imagePath = "D://opencvImages//"
img = cv2.imread(imagePath+"leaves.jpg")
# Here, you define your target color as
# a tuple of three values: RGB
green = [130, 158, 0]
# You define an interval that covers the values
# in the tuple and are below and above them by 20
diff = 20
# Be aware that opencv loads image in BGR format,
# that's why the color values have been adjusted here:
boundaries = [([green[2], green[1]-diff, green[0]-diff],
               [green[2]+diff, green[1]+diff, green[0]+diff])]
# Scale your BIG image into a small one:
scalePercent = 0.3
# Calculate the new dimensions
width = int(img.shape[1] * scalePercent)
height = int(img.shape[0] * scalePercent)
newSize = (width, height)
# Resize the image:
img = cv2.resize(img, newSize, None, None, None, cv2.INTER_AREA)
# check out the image resized:
cv2.imshow("img resized", img)
cv2.waitKey(0)
# for each range in your boundary list:
for (lower, upper) in boundaries:
    # You get the lower and upper part of the interval:
    lower = np.array(lower, dtype=np.uint8)
    upper = np.array(upper, dtype=np.uint8)
    # cv2.inRange is used to binarize (i.e., render in white/black) an image
    # All the pixels that fall inside your interval [lower, upper] will be white
    # All the pixels that do not fall inside this interval will
    # be rendered in black, for all three channels:
    mask = cv2.inRange(img, lower, upper)
    # Check out the binary mask:
    cv2.imshow("binary mask", mask)
    cv2.waitKey(0)
    # Now, you AND the mask and the input image
    # All the pixels that are white in the mask will
    # survive the AND operation, all the black pixels
    # will remain black
    output = cv2.bitwise_and(img, img, mask=mask)
    # Check out the ANDed mask:
    cv2.imshow("ANDed mask", output)
    cv2.waitKey(0)
    # You can use the mask to count the number of white pixels.
    # Remember that the white pixels in the mask are those that
    # fall in your defined range, that is, every white pixel corresponds
    # to a green pixel. Divide by the image size and you get the
    # percentage of green pixels in the original image:
    ratio_green = cv2.countNonZero(mask)/(img.size/3)
    # This is the color percent calculation, considering the resize I did earlier:
    colorPercent = (ratio_green * 100) / scalePercent
    # Print the color percent, use 2 figures past the decimal point
    print('green pixel percentage:', np.round(colorPercent, 2))
    # numpy's hstack is used to stack two images horizontally,
    # so you see the various images generated in one figure:
    cv2.imshow("images", np.hstack([img, output]))
    cv2.waitKey(0)
Output:
green pixel percentage: 89.89
I've produced some images, this is the binary mask of the green color:
And this is the ANDed out of the mask and the input image:
Some additional remarks about this snippet:
Be careful loading images with OpenCV, as they are loaded in BGR format rather than the usual RGB. Here, the snippet has this covered by reversing the elements in the boundary list, but keep an eye open for this common pitfall.
Your input image was too big to even display properly using cv2.imshow, so I resized it and processed that instead. At the end, you see that I took this resize scale into account in the final percent calculation.
Depending on the target color you define and the difference you use, you could produce negative values. In this case, for instance, for the R = 0 value, subtracting diff would give -20. That doesn't make sense when you are encoding color intensity in unsigned 8 bits; the values must be in the [0, 255] range. Watch out for negative values when using this method.
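One simple guard against that (my addition, not part of the snippet above) is to clip the bounds before converting them to uint8:
lower = np.clip(np.array(lower, dtype=np.int32), 0, 255).astype(np.uint8)  # clamp to valid range
upper = np.clip(np.array(upper, dtype=np.int32), 0, 255).astype(np.uint8)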
Now, you may see that the method is not very robust. Depending on what you are doing, you could switch to the HSV color space to get a nicer and more accurate binary mask.
You can try the HSV-based mask with this:
# The HSV mask values, defined for the green color:
lowerValues = np.array([29, 89, 70])
upperValues = np.array([179, 255, 255])
# Convert the image to HSV:
hsvImage = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Create the HSV mask
hsvMask = cv2.inRange(hsvImage, lowerValues, upperValues)
# AND mask & input image:
hsvOutput = cv2.bitwise_and(img, img, mask=hsvMask)
Which gives you this nice masked image instead:
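If you are unsure which HSV bounds to use for a different target color, a common trick (a sketch, not part of the original answer) is to convert a single pixel of that color and use the result as a starting point:
pixel = np.uint8([[[0, 158, 130]]])  # one BGR pixel of the target green
print(cv2.cvtColor(pixel, cv2.COLOR_BGR2HSV))  # tune lower/upper values around this result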

How to multiply two image in Python? [duplicate]

How can I apply a mask to a color image in the latest python binding (cv2)? In the previous python binding the simplest way was to use cv.Copy, e.g.
cv.Copy(dst, src, mask)
But this function is not available in the cv2 binding. Is there a workaround that avoids boilerplate code?
Here, you can use the cv2.bitwise_and function if you already have the mask image.
Check the code below:
img = cv2.imread('lena.jpg')
mask = cv2.imread('mask.png',0)
res = cv2.bitwise_and(img,img,mask = mask)
The output will be as follows for a lena image with a rectangular mask.
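As a side note, newer OpenCV versions also expose copyTo directly in the Python bindings, which is the closest match to the old cv.Copy (a sketch; assumes OpenCV 4.x):
res2 = cv2.copyTo(img, mask)  # equivalent masked copy, assuming cv2 >= 4.x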
Well, here is a solution if you want the background to be something other than a solid black color. We only need to invert the mask and apply it to a background image of the same size, and then combine the background and foreground. A pro of this solution is that the background can be anything (even another image).
This example is modified from Hough Circle Transform. First image is the OpenCV logo, second the original mask, third the background + foreground combined.
# http://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_houghcircles/py_houghcircles.html
import cv2
import numpy as np
# load the image
img = cv2.imread('E:\\FOTOS\\opencv\\opencv_logo.png')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
# detect circles
gray = cv2.medianBlur(cv2.cvtColor(img, cv2.COLOR_RGB2GRAY), 5)
circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, 1, 20, param1=50, param2=50, minRadius=0, maxRadius=0)
circles = np.uint16(np.around(circles))
# draw mask
mask = np.full((img.shape[0], img.shape[1]), 0, dtype=np.uint8)  # mask is only one channel
for i in circles[0, :]:
    cv2.circle(mask, (i[0], i[1]), i[2], (255, 255, 255), -1)
# get first masked value (foreground)
fg = cv2.bitwise_or(img, img, mask=mask)
# get second masked value (background) mask must be inverted
mask = cv2.bitwise_not(mask)
background = np.full(img.shape, 255, dtype=np.uint8)
bk = cv2.bitwise_or(background, background, mask=mask)
# combine foreground+background
final = cv2.bitwise_or(fg, bk)
Note: It is better to use the opencv methods because they are optimized.
import cv2 as cv
im_color = cv.imread("lena.png", cv.IMREAD_COLOR)
im_gray = cv.cvtColor(im_color, cv.COLOR_BGR2GRAY)
At this point you have a color and a gray image. We are dealing with 8-bit, uint8 images here. That means the images can have pixel values in the range of [0, 255] and the values have to be integers.
Let's do a binary thresholding operation. It creates a black and white masked image. The black regions have value 0 and the white regions 255
_, mask = cv.threshold(im_gray, thresh=180, maxval=255, type=cv.THRESH_BINARY)
im_thresh_gray = cv.bitwise_and(im_gray, mask)
The mask can be seen below on the left. The image on its right is the result of applying the bitwise_and operation between the gray image and the mask: spatial locations where the mask has pixel value zero (black) become zero in the result image, while locations where the mask has pixel value 255 (white) retain their original gray value.
To apply this mask to our original color image, we need to convert the mask into a 3 channel image as the original color image is a 3 channel image.
mask3 = cv.cvtColor(mask, cv.COLOR_GRAY2BGR) # 3 channel mask
Then, we can apply this 3 channel mask to our color image using the same bitwise_and function.
im_thresh_color = cv.bitwise_and(im_color, mask3)
mask3 from the code is the image below on the left, and im_thresh_color is on its right.
You can plot the results and see for yourself.
cv.imshow("original image", im_color)
cv.imshow("binary mask", mask)
cv.imshow("3 channel mask", mask3)
cv.imshow("im_thresh_gray", im_thresh_gray)
cv.imshow("im_thresh_color", im_thresh_color)
cv.waitKey(0)
The original image is lenacolor.png that I found here.
The answer given by Abid Rahman K is not completely correct. I tried it too and found it very helpful, but I got stuck.
This is how I copy image with a given mask.
x, y = np.where(mask != 0)
pts = zip(x, y)
# Assuming dst and src are of same sizes
for pt in pts:
    dst[pt] = src[pt]
This is a bit slow but gives correct results.
EDIT:
Pythonic way.
idx = (mask!=0)
dst[idx] = src[idx]
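A tiny self-contained demo of the boolean-index copy (the arrays here are made up purely for illustration):
import numpy as np
src = np.random.randint(0, 256, (4, 4, 3), dtype=np.uint8)  # dummy source image
dst = np.zeros_like(src)                                    # dummy destination
mask = np.random.randint(0, 2, (4, 4), dtype=np.uint8)      # dummy binary mask
idx = (mask != 0)
dst[idx] = src[idx]  # copies only the masked pixels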
The other methods described assume a binary mask. If you want to use a real-valued single-channel grayscale image as a mask (e.g. from an alpha channel), you can expand it to three channels and then use it for interpolation:
assert len(mask.shape) == 2 and issubclass(mask.dtype.type, np.floating)
assert len(foreground_rgb.shape) == 3
assert len(background_rgb.shape) == 3
alpha3 = np.stack([mask]*3, axis=2)
blended = alpha3 * foreground_rgb + (1. - alpha3) * background_rgb
Note that mask needs to be in range 0..1 for the operation to succeed. It is also assumed that 1.0 encodes keeping the foreground only, while 0.0 means keeping only the background.
If the mask may have the shape (h, w, 1), this helps:
alpha3 = np.squeeze(np.stack([np.atleast_3d(mask)]*3, axis=2))
Here np.atleast_3d(mask) makes the mask (h, w, 1) if it is (h, w) and np.squeeze(...) reshapes the result from (h, w, 3, 1) to (h, w, 3).
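If your mask comes from a uint8 alpha channel rather than a float image, scale it into the 0..1 range first (a one-line sketch; alpha_u8 is a hypothetical 8-bit channel):
mask = alpha_u8.astype(np.float32) / 255.0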

Overlaying an image with another non-rectangular image containing black pixels using OpenCV in Python

I want to programmatically overlay an image, e.g. the blissfully familiar Windows XP wallpaper:
with another non-rectangular image that contains black pixels, e.g. a standard large cursor icon:
Copy-pasting the code from this and this tutorial, which both use OpenCV bitwise masking magic, I arrived at:
import cv2 as cv
# Load two images
img1 = cv.imread('bliss.png') # The image I want the overlay to be diplayed on.
img2 = cv.imread('cursor.png') # The image I want to overlay with, containing black pixels.
# I want to put logo on top-left corner, So I create a ROI.
rows, cols, channels = img2.shape
roi = img1[0:rows, 0:cols ]
# Now create a mask of logo and create its inverse mask also.
img2gray = cv.cvtColor(img2, cv.COLOR_BGR2GRAY)
ret, mask = cv.threshold(img2gray, 20, 255, cv.THRESH_BINARY)
mask_inv = cv.bitwise_not(mask)
# Now black-out the area of logo in ROI.
img1_bg = cv.bitwise_and(roi, roi, mask = mask_inv)
# Take only region of logo from logo image.
img2_fg = cv.bitwise_and(img2, img2, mask = mask)
# Put logo in ROI and modify the main image
dst = cv.add(img1_bg, img2_fg)
img1[0:rows, 0:cols ] = dst
cv.imshow('res',img1)
cv.waitKey(0)
cv.destroyAllWindows()
While naively permuting the parameters of cv.threshold (the thresh and maxval arguments as well as the thresholding types), I always find that a significant number of black pixels present in the original image are missing from the overlaid one. In the zoomed-in picture below, on the left you can see the overlaid cursor, and on the right the original:
I reckon this loss of pixels is due to the grayscale conversion and/or the inverse (?) masking, but I could not figure out how or what to change in the code above. In the tutorials linked above, the images used for the overlay contained no black pixels, and the results look fine. Is there a way to do the same with images containing black pixels?
The problem here is that you lose the black pixels of cursor.png at cv.threshold(img2gray, 20, 255, cv.THRESH_BINARY). Only the white pixels remain, so your mask is just too small. Since cursor.png has transparency information stored in it, you can use its alpha channel for your mask.
Here's your code, modified accordingly (I removed all of your comments; comments show my changes):
import cv2 as cv
img1 = cv.imread('bliss.png')
img2 = cv.imread('cursor.png', cv.IMREAD_UNCHANGED) # Added cv.IMREAD_UNCHANGED parameter to maintain alpha channel information
alpha = img2[:, :, 3] # Save alpha channel for later use
_, alpha = cv.threshold(alpha, 5, 255, cv.THRESH_BINARY) # Threshold alpha channel to prevent gradual transparency
img2 = cv.cvtColor(img2, cv.COLOR_BGRA2BGR) # Remove alpha channel information, so that code below still works
rows, cols, channels = img2.shape
roi = img1[0:rows, 0:cols ]
# img2gray no longer needed
mask = alpha # Mask is just the alpha channel saved above
mask_inv = cv.bitwise_not(mask)
img1_bg = cv.bitwise_and(roi, roi, mask = mask_inv)
img2_fg = cv.bitwise_and(img2, img2, mask = mask)
dst = cv.add(img1_bg, img2_fg)
img1[0:rows, 0:cols ] = dst
cv.imshow('res',img1)
cv.waitKey(0)
cv.destroyAllWindows()
Hopefully, the output image then looks like something you expected:
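If you would rather keep the cursor's soft, anti-aliased edges instead of thresholding the alpha channel to a hard mask, a blending variant could look like this (a sketch under the same file-name assumptions as above):
import cv2 as cv
import numpy as np
img1 = cv.imread('bliss.png')
img2 = cv.imread('cursor.png', cv.IMREAD_UNCHANGED)
rows, cols = img2.shape[:2]
alpha = (img2[:, :, 3].astype(np.float32) / 255.0)[..., None]  # gradual alpha in 0..1
roi = img1[0:rows, 0:cols].astype(np.float32)
fg = img2[:, :, :3].astype(np.float32)
img1[0:rows, 0:cols] = (alpha * fg + (1.0 - alpha) * roi).astype(np.uint8)
cv.imshow('res', img1)
cv.waitKey(0)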
