OpenCV - Splitting and merging alpha channels slow - python

I am using Python OpenCV to split channels and remove black background like this...
b_channel, g_channel, r_channel = cv2.split(image_1)
alpha_channel = np.zeros_like(b_channel)
for p in range(alpha_channel.shape[0]):
    for q in range(alpha_channel.shape[1]):
        if b_channel[p][q]!=0 or g_channel[p][q]!=0 or r_channel[p][q]!=0:
            alpha_channel[p][q] = 255
merged = cv2.merge((b_channel, g_channel, r_channel, alpha_channel))
This works, but it takes around 10 seconds to complete on an image that is only 200 KB.
Is there a more efficient way to do this, or are there speed gains I could make with the code I have?

Iterating over pixels with a Python for loop is very slow and inefficient. Also, as per the OpenCV documentation,
cv2.split() is a costly operation (in terms of time). So do it only if
you need it. Otherwise go for Numpy indexing.
You can try vectorising and indexing with numpy as below:
# create the image with an alpha channel
img_rgba = cv2.cvtColor(img, cv2.COLOR_RGB2RGBA)
# mask: True where any channel of the pixel is non-zero
mask = (img[:, :, 0:3] != [0,0,0]).any(2)
# assign the mask to the last channel of the image
img_rgba[:,:,3] = (mask*255).astype(np.uint8)
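The same idea can be tried without any image file. This is a minimal NumPy-only sketch on a small synthetic BGR array; the array contents and the name img_bgra are made up for illustration, and np.dstack is used here in place of cv2.cvtColor to append the alpha channel.

```python
import numpy as np

# Synthetic 2x3 BGR image: black everywhere except two coloured pixels.
img = np.zeros((2, 3, 3), dtype=np.uint8)
img[0, 1] = (0, 0, 200)   # red-ish pixel (B, G, R)
img[1, 2] = (10, 0, 0)    # dark blue pixel

# True wherever at least one channel is non-zero.
mask = (img != 0).any(axis=2)

# Opaque (255) for non-black pixels, fully transparent (0) for black ones.
alpha = np.where(mask, 255, 0).astype(np.uint8)

# Append alpha as a fourth channel -> shape (rows, cols, 4).
img_bgra = np.dstack((img, alpha))

print(img_bgra[..., 3])
```

Only the two non-black pixels end up opaque; every other pixel gets alpha 0.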

For what you're doing, using cv2.bitwise_or seems to be the fastest method:
image_1 = img
# your method
start_time = time.time()
b_channel, g_channel, r_channel = cv2.split(image_1)
alpha_channel = np.zeros_like(b_channel)
for p in range(alpha_channel.shape[0]):
    for q in range(alpha_channel.shape[1]):
        if b_channel[p][q]!=0 or g_channel[p][q]!=0 or r_channel[p][q]!=0:
            alpha_channel[p][q] = 255
elapsed_time = time.time() - start_time
print('for cycles: ' + str(elapsed_time*1000.0) + ' milliseconds')
# my method
start_time = time.time()
b_channel, g_channel, r_channel = cv2.split(image_1)
alpha_channel2 = cv2.bitwise_or(g_channel,r_channel)
alpha_channel2 = cv2.bitwise_or(alpha_channel2, b_channel)
_,alpha_channel2 = cv2.threshold(alpha_channel2,0,255,cv2.THRESH_BINARY)
elapsed_time2 = time.time() - start_time
print('bitwise + threshold: '+ str(elapsed_time2*1000.0) + ' milliseconds')
# anubhav's method
start_time = time.time()
img_rgba = cv2.cvtColor(image_1, cv2.COLOR_RGB2RGBA)
# mask: True where any channel of the pixel is non-zero
mask = (image_1[:, :, 0:3] != [0,0,0]).any(2)
# assign the mask to the last channel of the image
img_rgba[:,:,3] = (mask*255).astype(np.uint8)
elapsed_time3 = time.time() - start_time
print('anubhav: ' + str(elapsed_time3*1000.0) + ' milliseconds')
for cycles: 2146.300792694092 milliseconds
bitwise + threshold: 4.959583282470703 milliseconds
anubhav: 27.924776077270508 milliseconds
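Before trusting timings like these, it is worth checking that the fast version produces exactly the same mask as the loop. The sketch below does that on a synthetic image. It uses NumPy's `|` operator and np.where as stand-ins for cv2.bitwise_or and cv2.threshold, so it illustrates the technique rather than reproducing the benchmark above, and the function names are hypothetical.

```python
import time
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 120x160 BGR image; roughly half of the pixels are forced to black.
image = rng.integers(0, 256, size=(120, 160, 3), dtype=np.uint8)
image[rng.random((120, 160)) < 0.5] = 0

def alpha_loop(img):
    """Per-pixel Python loop, as in the question."""
    alpha = np.zeros(img.shape[:2], dtype=np.uint8)
    for p in range(img.shape[0]):
        for q in range(img.shape[1]):
            if img[p, q, 0] != 0 or img[p, q, 1] != 0 or img[p, q, 2] != 0:
                alpha[p, q] = 255
    return alpha

def alpha_vectorised(img):
    """OR the three channels together, then map non-zero to 255."""
    combined = img[..., 0] | img[..., 1] | img[..., 2]
    return np.where(combined > 0, 255, 0).astype(np.uint8)

t0 = time.perf_counter()
a_loop = alpha_loop(image)
t1 = time.perf_counter()
a_vec = alpha_vectorised(image)
t2 = time.perf_counter()

print("identical masks:", np.array_equal(a_loop, a_vec))
print(f"loop: {(t1 - t0) * 1e3:.2f} ms, vectorised: {(t2 - t1) * 1e3:.2f} ms")
```

The absolute numbers depend on the machine and image size, so only the ratio between the two is meaningful.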

Fastest Solution
If a function built on cv2.split is too slow on a large image, you can resize or crop the image first and run the calculation on the smaller copy. In my case I had to calculate the colorfulness of an image using cv2.split, so I cropped and resized the image before splitting it.
Resizing the image first makes the cv2.split calculation much faster.
Code
def image_colorfulness(self,image):
    # split the image into its respective RGB components
    (B, G, R) = cv2.split(image.astype("float"))
    print(f'Split Image to B G R {(B, G, R)}')
    # compute rg = R - G
    rg = np.absolute(R - G)
    print(f'Computed RG to {rg}')
    # compute yb = 0.5 * (R + G) - B
    yb = np.absolute(0.5 * (R + G) - B)
    # compute the mean and standard deviation of both `rg` and `yb`
    print('Performing Absolute')
    (rbMean, rbStd) = (np.mean(rg), np.std(rg))
    (ybMean, ybStd) = (np.mean(yb), np.std(yb))
    # combine the mean and standard deviations
    print('Performing Standard Deviation')
    stdRoot = np.sqrt((rbStd ** 2) + (ybStd ** 2))
    meanRoot = np.sqrt((rbMean ** 2) + (ybMean ** 2))
    # derive the "colorfulness" metric and return it
    return stdRoot + (0.3 * meanRoot)
def crop_square(self, img, size, interpolation=cv2.INTER_AREA):
    h, w = img.shape[:2]
    min_size = np.amin([h,w])
    # Centralize and crop
    crop_img = img[int(h/2-min_size/2):int(h/2+min_size/2), int(w/2-min_size/2):int(w/2+min_size/2)]
    resized = cv2.resize(crop_img, (size, size), interpolation=interpolation)
    return resized
img = cv2.imread(image_path)
resize_img = self.crop_square(img, 300)
## perform your calculation on resize_img, then continue with the original img
colorness = self.image_colorfulness(resize_img)
Resizing Only
If you prefer not to crop and only want to resize the image, use just this line from the crop_square function (passing the original image instead of crop_img).
resized = cv2.resize(crop_img, (size, size), interpolation=interpolation)
Testing Results
Before
I tested with a 5.0 MB PNG image. With the full-size image passed to cv2.split, processing took 8 minutes.
After
After resizing, the same calculation took 0.001 ms on the resized image.
Standard Image
Resized Image
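As a rough illustration of the shrink-first idea, the sketch below computes the same colorfulness metric with plain NumPy slicing instead of cv2.split. It uses stride subsampling (`full[::4, ::4]`) as a crude stand-in for cv2.resize, which is an assumption made to keep the example dependency-free and is not equivalent to INTER_AREA interpolation. The test images are synthetic.

```python
import numpy as np

def colorfulness(image_bgr):
    """Colorfulness metric from the answer, using slicing instead of cv2.split."""
    img = image_bgr.astype(np.float64)
    b, g, r = img[..., 0], img[..., 1], img[..., 2]
    rg = np.abs(r - g)
    yb = np.abs(0.5 * (r + g) - b)
    std_root = np.sqrt(rg.std() ** 2 + yb.std() ** 2)
    mean_root = np.sqrt(rg.mean() ** 2 + yb.mean() ** 2)
    return std_root + 0.3 * mean_root

rng = np.random.default_rng(1)
full = rng.integers(0, 256, size=(600, 800, 3), dtype=np.uint8)
small = full[::4, ::4]   # crude stand-in for cv2.resize

# The metric on the subsampled image stays close to the full-size value.
print(colorfulness(full), colorfulness(small))

# A uniformly gray image has no color, so the metric is exactly zero.
gray_img = np.full((10, 10, 3), 128, dtype=np.uint8)
print(colorfulness(gray_img))
```

On real photographs the full-size and reduced values can differ more than on this random image, so it is worth comparing them on a few samples first.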

Related

How Do I Develop a negative film image using python

I have tried inverting the colors of a negative film image with the bitwise_not() function in Python, but the result has a blue tint. I would like to know how I could develop a negative film image so that it looks reasonably good. Here's the outcome of what I did. (I just cropped the negative image for a new test I was doing, so don't mind that.)
If you don't use exact maximum and minimum, but 1st and 99th percentile, or something nearby (0.1%?), you'll get some nicer contrast. It'll cut away outliers due to noise, compression, etc.
Additionally, you will want to adjust gamma, or scale the values linearly, to achieve white balance.
I'll apply a "gray world assumption" and scale each plane so the mean is gray. I'll also mess with gamma, but that's just messing around.
And... all of that completely ignores gamma mapping, both of the "negative" and of the outputs.
import numpy as np
import cv2 as cv
import skimage
im = cv.imread("negative.png")
(bneg,gneg,rneg) = cv.split(im)
def stretch(plane):
    # take 1st and 99th percentile
    imin = np.percentile(plane, 1)
    imax = np.percentile(plane, 99)
    # stretch the image
    plane = (plane - imin) / (imax - imin)
    return plane
b = 1 - stretch(bneg)
g = 1 - stretch(gneg)
r = 1 - stretch(rneg)
bgr = cv.merge([b,g,r])
cv.imwrite("positive.png", bgr * 255)
b = 1 - stretch(bneg)
g = 1 - stretch(gneg)
r = 1 - stretch(rneg)
# gray world
b *= 0.5 / b.mean()
g *= 0.5 / g.mean()
r *= 0.5 / r.mean()
bgr = cv.merge([b,g,r])
cv.imwrite("positive_grayworld.png", bgr * 255)
b = 1 - np.clip(stretch(bneg), 0, 1)
g = 1 - np.clip(stretch(gneg), 0, 1)
r = 1 - np.clip(stretch(rneg), 0, 1)
# goes in the right direction
b = skimage.exposure.adjust_gamma(b, gamma=b.mean()/0.5)
g = skimage.exposure.adjust_gamma(g, gamma=g.mean()/0.5)
r = skimage.exposure.adjust_gamma(r, gamma=r.mean()/0.5)
bgr = cv.merge([b,g,r])
cv.imwrite("positive_gamma.png", bgr * 255)
Here's what happens when gamma is applied to the inverted picture... a reasonably tolerable transfer function results from applying the same factor twice, instead of applying its inverse.
Trying to "undo" the gamma while ignoring that the values were inverted... causes serious distortions:
And the min/max values for contrast stretching also affect the whole thing.
A simple photo of a negative simply won't do. It'll include stray light that offsets the black point, at the very least. You need a proper scan of the negative.
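The percentile stretch and inversion described in this answer can be condensed into one helper that also clips and converts back to 8-bit before saving. This is a NumPy-only sketch on a synthetic "negative". The function name develop_plane and the value ranges are hypothetical, and it covers only the stretch and invert steps, not the gray-world or gamma variants.

```python
import numpy as np

def develop_plane(plane, low_pct=1.0, high_pct=99.0):
    """Stretch one channel between two percentiles, invert it, return uint8."""
    plane = plane.astype(np.float64)
    lo, hi = np.percentile(plane, [low_pct, high_pct])
    if hi <= lo:                      # flat channel: nothing to stretch
        return np.zeros(plane.shape, dtype=np.uint8)
    stretched = np.clip((plane - lo) / (hi - lo), 0.0, 1.0)
    positive = 1.0 - stretched        # invert the negative
    return np.round(positive * 255).astype(np.uint8)

rng = np.random.default_rng(2)
# Synthetic "negative": each channel occupies a different narrow range.
negative = np.dstack([
    rng.integers(60, 120, size=(50, 50)),   # B
    rng.integers(90, 180, size=(50, 50)),   # G
    rng.integers(150, 240, size=(50, 50)),  # R
]).astype(np.uint8)

positive = np.dstack([develop_plane(negative[..., c]) for c in range(3)])
print(positive.dtype, positive.min(), positive.max())
```

Because values outside the two percentiles are clipped, each channel of the result spans the full 0 to 255 range regardless of how narrow the input range was.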
Here is one simple way to do that in Python/OpenCV. Basically one stretches each channel of the image to full dynamic range separately. Then recombines. Then inverts.
Input:
import cv2
import numpy as np
import skimage.exposure
# read image
img = cv2.imread('boys_negative.png')
# separate channels (cv2.imread returns them in B, G, R order)
b,g,r = cv2.split(img)
# stretch each channel
b_stretch = skimage.exposure.rescale_intensity(b, in_range='image', out_range=(0,255)).astype(np.uint8)
g_stretch = skimage.exposure.rescale_intensity(g, in_range='image', out_range=(0,255)).astype(np.uint8)
r_stretch = skimage.exposure.rescale_intensity(r, in_range='image', out_range=(0,255)).astype(np.uint8)
# combine channels
img_stretch = cv2.merge([b_stretch, g_stretch, r_stretch])
# invert
result = 255 - img_stretch
cv2.imshow('input', img)
cv2.imshow('result', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
# save results
cv2.imwrite('boys_negative_inverted.jpg', result)
Result:
Caveat: This works for this image, but may not be a universal solution for all images.
ADDITION
In the above, I did not clip when stretching, as I wanted to preserve all information. But if one wants to clip and use skimage.exposure.rescale_intensity for stretching, it is easy enough to do as follows:
import cv2
import numpy as np
import skimage.exposure
# read image
img = cv2.imread('boys_negative.png')
# separate channels (cv2.imread returns them in B, G, R order)
b,g,r = cv2.split(img)
# compute clip points -- clip 1% only on high side
clip_bmax = np.percentile(b, 99)
clip_gmax = np.percentile(g, 99)
clip_rmax = np.percentile(r, 99)
clip_bmin = np.percentile(b, 0)
clip_gmin = np.percentile(g, 0)
clip_rmin = np.percentile(r, 0)
# stretch each channel
b_stretch = skimage.exposure.rescale_intensity(b, in_range=(clip_bmin,clip_bmax), out_range=(0,255)).astype(np.uint8)
g_stretch = skimage.exposure.rescale_intensity(g, in_range=(clip_gmin,clip_gmax), out_range=(0,255)).astype(np.uint8)
r_stretch = skimage.exposure.rescale_intensity(r, in_range=(clip_rmin,clip_rmax), out_range=(0,255)).astype(np.uint8)
# combine channels
img_stretch = cv2.merge([b_stretch, g_stretch, r_stretch])
# invert
result = 255 - img_stretch
cv2.imshow('input', img)
cv2.imshow('result', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
# save results
cv2.imwrite('boys_negative_inverted2.jpg', result)
Result:

Image stitching problem using Python and OpenCV

After stitching the result of the first 24 images to the 25th image, I got the output shown below. Up to that point the stitching was good.
Does anyone know why or when stitching output comes out like this? What could be the reason?
The stitching code follows the standard steps: finding keypoints and descriptors, matching points, calculating the homography, and then warping the images. But I do not understand why this output is produced.
Core part of stitching is like below:
detector = cv2.SIFT_create(400)
# find the keypoints and descriptors with SIFT
gray1 = cv2.cvtColor(image1,cv2.COLOR_BGR2GRAY)
ret1, mask1 = cv2.threshold(gray1,1,255,cv2.THRESH_BINARY)
kp1, descriptors1 = detector.detectAndCompute(gray1,mask1)
gray2 = cv2.cvtColor(image2,cv2.COLOR_BGR2GRAY)
ret2, mask2 = cv2.threshold(gray2,1,255,cv2.THRESH_BINARY)
kp2, descriptors2 = detector.detectAndCompute(gray2,mask2)
keypoints1Im = cv2.drawKeypoints(image1, kp1, outImage = cv2.DRAW_MATCHES_FLAGS_DEFAULT, color=(0,0,255))
keypoints2Im = cv2.drawKeypoints(image2, kp2, outImage = cv2.DRAW_MATCHES_FLAGS_DEFAULT, color=(0,0,255))
# BFMatcher with default params
matcher = cv2.BFMatcher()
matches = matcher.knnMatch(descriptors2,descriptors1, k=2)
# Apply ratio test
good = []
for m, n in matches:
    if m.distance < 0.75 * n.distance:
        good.append(m)
print (str(len(good)) + " Matches were Found")
if len(good) <= 10:
    return image1
matches = copy.copy(good)
matchDrawing = util.drawMatches(gray2,kp2,gray1,kp1,matches)
#Aligning the images
src_pts = np.float32([ kp2[m.queryIdx].pt for m in matches ]).reshape(-1,1,2)
dst_pts = np.float32([ kp1[m.trainIdx].pt for m in matches ]).reshape(-1,1,2)
H = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC,5.0)[0]
h1,w1 = image1.shape[:2]
h2,w2 = image2.shape[:2]
pts1 = np.float32([[0,0],[0,h1],[w1,h1],[w1,0]]).reshape(-1,1,2)
pts2 = np.float32([[0,0],[0,h2],[w2,h2],[w2,0]]).reshape(-1,1,2)
pts2_ = cv2.perspectiveTransform(pts2, H)
pts = np.concatenate((pts1, pts2_), axis=0)
# print("pts:", pts)
[xmin, ymin] = np.int32(pts.min(axis=0).ravel() - 0.5)
[xmax, ymax] = np.int32(pts.max(axis=0).ravel() + 0.5)
t = [-xmin,-ymin]
Ht = np.array([[1,0,t[0]],[0,1,t[1]],[0,0,1]]) # translate
result = cv2.warpPerspective(image2, Ht.dot(H), (xmax-xmin, ymax-ymin))
resizedB = np.zeros((result.shape[0], result.shape[1], 3), np.uint8)
resizedB[t[1]:t[1]+h1,t[0]:w1+t[0]] = image1
# Now create a mask of logo and create its inverse mask also
img2gray = cv2.cvtColor(result,cv2.COLOR_BGR2GRAY)
ret, mask = cv2.threshold(img2gray, 0, 255, cv2.THRESH_BINARY)
kernel = np.ones((5,5),np.uint8)
k1 = (kernel == 1).astype('uint8')
mask = cv2.erode(mask, k1, borderType=cv2.BORDER_CONSTANT)
mask_inv = cv2.bitwise_not(mask)
difference = cv2.bitwise_or(resizedB, resizedB, mask=mask_inv)
result2 = cv2.bitwise_and(result, result, mask=mask)
result = cv2.add(result2, difference)
Edit:
This image shows match drawing while stitching 25 to result until 24 images:
And before that match drawing:
I have a total of 97 images to stitch. If I stitch the 24th and 25th images separately, they stitch properly. If I start stitching from the 23rd image onwards, the stitching is also good, but I get the problem when I start from the 1st image. I am not able to understand the problem.
Result after stitching 23rd image:
Result after stitching 24th image:
Result after stitching 25th image is as above which went wrong.
Strange observation: if I stitch images 23, 24 and 25 separately with the same code, they stitch correctly. If I stitch images 23 through 97, they also stitch correctly. But if I stitch starting from the 1st image, it breaks while stitching the 25th image. I do not understand why this happens.
I have tried different combinations: different keypoint detection and extraction methods, matching methods, homography calculations, and warping code, but none of them worked. Something is missing or wrong in the way the steps are combined, and I am not able to figure it out.
Sorry for this long question. As I am completely new to this I am not able to explain and get the things properly. Thanks for your help and guidance.
Stitched result of 23,24,25 images separately with SAME code:
With different code (gives black lines in between stitching), if I stitched 97 images then 25th goes up in stitching and stitches as shown below (right corner point):
Firstly, I was not able to recreate your problem and solve it as the images were too big for my system to process. However, I had faced the same problem in my Panorama Stitching project, so I am sharing the reason behind it and my approach to solving my problem. Hope this helps you too.
Here's what my problem looked like when I stitched 4 images together just like you did.
As you can see, the 4th image was getting distorted a lot which must not happen. The same thing happened with you but on a greater level.
Now, here's the output when I stitched 8 images after some image pre-processing.
After some pre-processing on the input images, I was able to stitch 8 images together perfectly without any distortion.
To understand the exact reason behind this kind of distortion, watch this video by Joseph Redmon between 50:26 - 1:07:23.
As suggested in the video, we'll first have to project the images onto a cylinder and then unroll them and then stitch these unrolled images together.
Below is the initial input image(left) and the image after projection and unrolling onto a cylinder(right).
For your problem, as you are using satellite images, I guess projection onto a sphere would work better than the cylinder however you'll have to give it a try.
Sharing below my code for projecting the image onto a cylinder and unrolling it for reference. The mathematics used behind it is the same as given in the video.
def Convert_xy(x, y):
    global center, f
    xt = ( f * np.tan( (x - center[0]) / f ) ) + center[0]
    yt = ( (y - center[1]) / np.cos( (x - center[0]) / f ) ) + center[1]
    return xt, yt
def ProjectOntoCylinder(InitialImage):
    global w, h, center, f
    h, w = InitialImage.shape[:2]
    center = [w // 2, h // 2]
    f = 1100 # 1100 field; 1000 Sun; 1500 Rainier; 1050 Helens
    # Creating a blank transformed image
    TransformedImage = np.zeros(InitialImage.shape, dtype=np.uint8)
    # Storing all coordinates of the transformed image in 2 arrays (x and y coordinates)
    AllCoordinates_of_ti = np.array([np.array([i, j]) for i in range(w) for j in range(h)])
    ti_x = AllCoordinates_of_ti[:, 0]
    ti_y = AllCoordinates_of_ti[:, 1]
    # Finding corresponding coordinates of the transformed image in the initial image
    ii_x, ii_y = Convert_xy(ti_x, ti_y)
    # Truncating the coordinate values to get the integer pixel position (top-left corner)
    ii_tl_x = ii_x.astype(int)
    ii_tl_y = ii_y.astype(int)
    # Finding transformed image points whose corresponding
    # initial image points lie inside the initial image
    GoodIndices = (ii_tl_x >= 0) * (ii_tl_x <= (w-2)) * \
                  (ii_tl_y >= 0) * (ii_tl_y <= (h-2))
    # Removing all the outside points from everywhere
    ti_x = ti_x[GoodIndices]
    ti_y = ti_y[GoodIndices]
    ii_x = ii_x[GoodIndices]
    ii_y = ii_y[GoodIndices]
    ii_tl_x = ii_tl_x[GoodIndices]
    ii_tl_y = ii_tl_y[GoodIndices]
    # Bilinear interpolation
    dx = ii_x - ii_tl_x
    dy = ii_y - ii_tl_y
    weight_tl = (1.0 - dx) * (1.0 - dy)
    weight_tr = (dx) * (1.0 - dy)
    weight_bl = (1.0 - dx) * (dy)
    weight_br = (dx) * (dy)
    TransformedImage[ti_y, ti_x, :] = ( weight_tl[:, None] * InitialImage[ii_tl_y, ii_tl_x, :] ) + \
                                      ( weight_tr[:, None] * InitialImage[ii_tl_y, ii_tl_x + 1, :] ) + \
                                      ( weight_bl[:, None] * InitialImage[ii_tl_y + 1, ii_tl_x, :] ) + \
                                      ( weight_br[:, None] * InitialImage[ii_tl_y + 1, ii_tl_x + 1, :] )
    # Getting x coordinate to remove black region from right and left in the transformed image
    min_x = min(ti_x)
    # Cropping out the black region from both sides (using symmetricity)
    TransformedImage = TransformedImage[:, min_x : -min_x, :]
    return TransformedImage, ti_x-min_x, ti_y
You just have to call the function ProjectOntoCylinder and pass it an image to get the resultant image and the coordinates of white pixels in the mask image. Use the code below to call this function and get the mask image.
# Applying Cylindrical projection on Image
Image_Cyl, mask_x, mask_y = ProjectOntoCylinder(Image)
# Getting Image Mask
Image_Mask = np.zeros(Image_Cyl.shape, dtype=np.uint8)
Image_Mask[mask_y, mask_x, :] = 255
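For large images, the Python-level list comprehension that builds AllCoordinates_of_ti can be a noticeable cost. The sketch below shows one possible replacement using np.meshgrid and checks that it yields the same coordinates in the same order. It also restates the coordinate mapping as a function without globals; the name convert_xy and its extra parameters are hypothetical, not part of the answer's code.

```python
import numpy as np

h, w = 4, 6   # tiny image size for illustration

# Coordinate list as built in the answer: x varies slowest, y fastest.
coords_loop = np.array([np.array([i, j]) for i in range(w) for j in range(h)])

# Same coordinates without a Python-level loop.
xs, ys = np.meshgrid(np.arange(w), np.arange(h), indexing="ij")
coords_vec = np.stack((xs.ravel(), ys.ravel()), axis=1)

print(np.array_equal(coords_loop, coords_vec))

def convert_xy(x, y, center, f):
    """Map cylinder-image coordinates back to source-image coordinates."""
    xt = f * np.tan((x - center[0]) / f) + center[0]
    yt = (y - center[1]) / np.cos((x - center[0]) / f) + center[1]
    return xt, yt

center = (w // 2, h // 2)
f = 1100.0
src_x, src_y = convert_xy(coords_vec[:, 0], coords_vec[:, 1], center, f)

# The image centre is a fixed point of the mapping.
cx, cy = convert_xy(np.array([float(center[0])]), np.array([float(center[1])]), center, f)
print(cx[0], cy[0])
```

Passing center and f as arguments instead of globals also makes it easier to try different focal lengths per image set.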
Here are links to my project and its detailed documentation for reference:
Part 1:
Source Code,
Documentation
Part 2:
Source Code,
Documentation

Manipulate RGB values in image

I would like to apply a simple algebraic operation to the RGB values of an image that I have loaded via PIL. My current version works, but is slow:
from PIL import Image
import numpy as np
file_name = '1'
im = Image.open('data/' + file_name + '.jpg').convert('RGB')
pixels = np.array(im)
s = pixels.shape
p = pixels.reshape((s[0] * s[1], s[2]))
def update(ratio=0.5):
    p2 = np.array([[min(rgb[0] + rgb[0] * ratio, 1), max(rgb[1] - rgb[1] * ratio, 0), rgb[2]] for rgb in p])
    img = Image.fromarray(np.uint8(p2.reshape(s)))
    img.save('result/' + file_name + '_test.png')
    return 0
update(0.5)
Has someone a more efficient idea?
Make use of NumPy's vectorized operations to get rid of the loop.
I modified your original approach to compare performance between the following, different solutions. Also, I added a PIL only approach using ImageMath, if you want to get rid of NumPy completely.
Furthermore, I assume, there is/was a bug:
p2 = np.array([[min(rgb[0] + rgb[0] * ratio, 1), max(rgb[1] - rgb[1] * ratio, 0), rgb[2]] for rgb in p])
You actually do NOT convert to float, so it should be 255 instead of 1 in the min call.
Here's what I've done:
import numpy as np
from PIL import Image, ImageMath
import time
# Modified, original implementation; fixed most likely wrong compare value in min (255 instead of 1)
def update_1(ratio=0.5):
    pixels = np.array(im)
    s = pixels.shape
    p = pixels.reshape((s[0] * s[1], s[2]))
    p2 = np.array([[min(rgb[0] + rgb[0] * ratio, 255), max(rgb[1] - rgb[1] * ratio, 0), rgb[2]] for rgb in p])
    img = Image.fromarray(np.uint8(p2.reshape(s)))
    img.save('result_update_1.png')
    return 0
# More efficient vectorized approach using NumPy
def update_2(ratio=0.5):
    pixels = np.array(im)
    pixels[:, :, 0] = np.minimum(pixels[:, :, 0] * (1 + ratio), 255)
    pixels[:, :, 1] = np.maximum(pixels[:, :, 1] * (1 - ratio), 0)
    img = Image.fromarray(pixels)
    img.save('result_update_2.png')
    return 0
# More efficient approach only using PIL
def update_3(ratio=0.5):
    (r, g, b) = im.split()
    r = ImageMath.eval('min(float(r) / 255 * (1 + ratio), 1) * 255', r=r, ratio=ratio).convert('L')
    g = ImageMath.eval('max(float(g) / 255 * (1 - ratio), 0) * 255', g=g, ratio=ratio).convert('L')
    Image.merge('RGB', (r, g, b)).save('result_update_3.png')
    return 0
im = Image.open('path/to/your/image.png')
t1 = time.perf_counter()
update_1(0.5)
print(time.perf_counter() - t1)
t1 = time.perf_counter()
update_2(0.5)
print(time.perf_counter() - t1)
t1 = time.perf_counter()
update_3(0.5)
print(time.perf_counter() - t1)
The performance on a [400, 400] RGB image on my machine:
1.723889293 s # your approach
0.055316339 s # vectorized NumPy approach
0.062502050 s # PIL only approach
Hope that helps!
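The same clamping can also be written with np.clip on a float copy of the array, which keeps the scaling and the range limit in separate, readable steps. This is a small sketch on a hand-made two-pixel array rather than a loaded image; the function name shift_red_green is hypothetical.

```python
import numpy as np

def shift_red_green(pixels_rgb, ratio=0.5):
    """Scale R up and G down by `ratio`, clamping to the valid 0-255 range."""
    out = pixels_rgb.astype(np.float32)   # work in float to avoid uint8 overflow
    out[..., 0] *= 1.0 + ratio
    out[..., 1] *= 1.0 - ratio
    return np.clip(out, 0, 255).astype(np.uint8)

pixels = np.array([[[100, 100, 100], [200, 40, 7]]], dtype=np.uint8)
result = shift_red_green(pixels, 0.5)
print(result.tolist())   # [[[150, 50, 100], [255, 20, 7]]]
```

The second pixel shows the clamp at work: 200 * 1.5 would be 300, which is limited to 255.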

Denoise a "Lang-Stereotest"

I'm trying to denoise a "Lang-Stereotest" (so it's called in Germany...) like this one:
here
I have used some filters as you can see in my source code:
(some code before...)
# Blur
output = cv2.blur(image, (10, 10))
img = Image.fromarray(output, 'RGB')
img.save("images/Filters/" + filePath.split('/')[1].split('.')[0] + " - Blur.jpg")
# Bilateral
output = cv2.bilateralFilter(image, 50, 50, 50)
img = Image.fromarray(output, 'RGB')
img.save("images/Filters/" + filePath.split('/')[1].split('.')[0] + " - Bilateral.jpg")
# MedianBlur
output = cv2.medianBlur(image, 5)
img = Image.fromarray(output, 'RGB')
img.save("images/Filters/" + filePath.split('/')[1].split('.')[0] + " - MedianBlur.jpg")
# Weighted
output = cv2.addWeighted(image, 5, image, -5, 128)
img = Image.fromarray(output, 'RGB')
img.save("images/Filters/" + filePath.split('/')[1].split('.')[0] + " - Weighted.jpg")
# Try to combine...
output = ... # here I want to combine the filters to gain best results..
img.save("images/Filters/" + filePath.split('/')[1].split('.')[0] + " - Best.jpg")
(some code after...)
As a result I got Bilateral:
[Blur], [Median Blur]
(I'll add "Blur" and "Median Blur" once I hit 10 reputation.... Sorry)
Of course the results are far from perfect, and I know that there is no hundred percent solution, but I think it should be possible to do significantly better.
Maybe one of you has an idea on how to get a better result!
I have two approaches in mind
FIRST - Brute-Force approach
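Here I manually set a threshold level of 100. The last argument, 1, is cv2.THRESH_BINARY_INV, so pixels brighter than the threshold become 0 (black) and all others become 255 (white).
ret,th = cv2.threshold(gray, 100, 255, cv2.THRESH_BINARY_INV)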
It looks pretty OK. But we can go further.
SECOND - Calculative approach
Here I set a threshold based on the median value of the gray scale image. This is a method statisticians use for separating data into different classes in data science. So I thought 'Why not try it out for images?'
Here is the code snippet for that:
sigma = 0.33
v = np.median(gray)
threshold = (1.0 - sigma) * v
for i in range(gray1.shape[0]):
    for j in range(gray1.shape[1]):
        if (gray[i, j] < threshold):
            gray1[i, j] = 0
        else:
            gray1[i, j] = 255
cv2.imwrite('gray1.jpg',gray1)
Yes, it does not look so perfect, but this is where I could go.
From here on it is up to you. You can apply median filtering followed by some morphological operations to attain what you want.
EDIT
I just copied the gray image into gray1 as reference to be used in the for loop.
Here is the complete code for a better understanding:
import cv2
import numpy as np
filename = '1.jpg'
img = cv2.imread(filename)
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
gray1 = gray.copy()  # real copy, so the source image stays untouched
sigma = 0.33
v = np.median(gray)
threshold = (1.0 - sigma) * v
for i in range(gray1.shape[0]):
    for j in range(gray1.shape[1]):
        if (gray[i, j] < threshold):
            gray1[i, j] = 0
        else:
            gray1[i, j] = 255
cv2.imwrite('gray1.jpg',gray1)
Hope this helped!!!!!!
:)
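The double loop in this answer can be replaced by a single vectorised comparison. Here is a minimal sketch of the median-based threshold using np.where on a small hand-made array, so the numbers can be checked by hand; the array values are made up.

```python
import numpy as np

gray = np.array([[10, 80, 120],
                 [200, 90, 60]], dtype=np.uint8)

sigma = 0.33
v = np.median(gray)                 # 85.0 for this array
threshold = (1.0 - sigma) * v       # about 56.95

# Pixels below the threshold become black, everything else white.
binary = np.where(gray < threshold, 0, 255).astype(np.uint8)
print(binary)
```

Only the pixel with value 10 falls below the threshold here, so it is the only one set to 0.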
This is in response to your second image.
I performed histogram equalization of the gray scale image as mentioned in the comments:
equ = cv2.equalizeHist(gray)
I then applied binary threshold followed by dilation:
ret,th = cv2.threshold(equ, 50, 255, 0)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(3,3))
dilate = cv2.morphologyEx(th, cv2.MORPH_DILATE, kernel, 3)
To reduce noise and spores in the image:
close = cv2.morphologyEx(dilate, cv2.MORPH_CLOSE, kernel, 3)
I inverted the image followed by morphological close:
ret,th1 = cv2.threshold(close, 50, 255, 1)
kernel1 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(5,5))
opened = cv2.morphologyEx(th1, cv2.MORPH_CLOSE, kernel1, 3)
I then performed morphological dilation:
dd = cv2.morphologyEx(opened, cv2.MORPH_DILATE, kernel1, 3)
This is the maximum I could get to.
Now you can find contours and eliminate the small dots falling below a certain area.
:)
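The last step, dropping small dots below a certain area, would normally be done with OpenCV's contour or connected-component functions. The sketch below is a dependency-free stand-in that uses a breadth-first flood fill over 4-connected white pixels and counts pixels per region, which is not the same measure as a contour area. The function name remove_small_blobs and the min_area value are hypothetical.

```python
from collections import deque
import numpy as np

def remove_small_blobs(binary, min_area):
    """Zero out 4-connected white regions smaller than `min_area` pixels."""
    h, w = binary.shape
    out = binary.copy()
    seen = np.zeros((h, w), dtype=bool)
    for sy in range(h):
        for sx in range(w):
            if binary[sy, sx] == 0 or seen[sy, sx]:
                continue
            # Breadth-first flood fill to collect one connected region.
            queue = deque([(sy, sx)])
            seen[sy, sx] = True
            region = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and binary[ny, nx] != 0 and not seen[ny, nx]):
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            # Erase the region if it is too small to keep.
            if len(region) < min_area:
                for y, x in region:
                    out[y, x] = 0
    return out

img = np.zeros((6, 8), dtype=np.uint8)
img[1:4, 1:4] = 255    # 3x3 blob, area 9
img[4, 6] = 255        # isolated dot, area 1
cleaned = remove_small_blobs(img, min_area=4)
print(cleaned)
```

The 3x3 blob survives and the isolated dot is removed. On full-size images a library routine will be much faster than this pure-Python loop.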

Using openCV to overlay transparent image onto another image

How can I overlay a transparent PNG onto another image without losing its transparency using OpenCV in Python?
import cv2
background = cv2.imread('field.jpg')
overlay = cv2.imread('dice.png')
# Help please
cv2.imwrite('combined.png', background)
Desired output:
Sources:
Background Image
Overlay
import cv2
background = cv2.imread('field.jpg')
overlay = cv2.imread('dice.png')
added_image = cv2.addWeighted(background,0.4,overlay,0.1,0)
cv2.imwrite('combined.png', added_image)
The correct answer to this was far too hard to come by, so I'm posting this answer even though the question is really old. What you are looking for is "over" compositing, and the algorithm for this can be found on Wikipedia: https://en.wikipedia.org/wiki/Alpha_compositing
I am far from an expert with OpenCV, but after some experimentation this is the most efficient way I have found to accomplish the task:
import cv2
background = cv2.imread("background.png", cv2.IMREAD_UNCHANGED)
foreground = cv2.imread("overlay.png", cv2.IMREAD_UNCHANGED)
# normalize alpha channels from 0-255 to 0-1
alpha_background = background[:,:,3] / 255.0
alpha_foreground = foreground[:,:,3] / 255.0
# set adjusted colors
for color in range(0, 3):
    background[:,:,color] = alpha_foreground * foreground[:,:,color] + \
        alpha_background * background[:,:,color] * (1 - alpha_foreground)
# set adjusted alpha and denormalize back to 0-255
background[:,:,3] = (1 - (1 - alpha_foreground) * (1 - alpha_background)) * 255
# display the image
cv2.imshow("Composited image", background)
cv2.waitKey(0)
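For reference, the "over" operator from the Wikipedia article can be written for straight (non-premultiplied) alpha as a_out = a_fg + a_bg * (1 - a_fg) and C_out = (C_fg * a_fg + C_bg * a_bg * (1 - a_fg)) / a_out. The sketch below implements that on float arrays in the 0 to 1 range. Unlike the code above, it divides by the output alpha to return straight colours; the function name over and the sample pixel values are made up for illustration.

```python
import numpy as np

def over(fg_rgba, bg_rgba):
    """Composite fg over bg. Inputs are float arrays in [0, 1] with straight alpha."""
    fg_c, fg_a = fg_rgba[..., :3], fg_rgba[..., 3:4]
    bg_c, bg_a = bg_rgba[..., :3], bg_rgba[..., 3:4]
    out_a = fg_a + bg_a * (1.0 - fg_a)
    premult = fg_c * fg_a + bg_c * bg_a * (1.0 - fg_a)
    # Divide by the output alpha where it is non-zero to get straight colours back.
    safe_a = np.where(out_a > 0, out_a, 1.0)
    out_c = np.where(out_a > 0, premult / safe_a, 0.0)
    return np.concatenate((out_c, out_a), axis=-1)

fg = np.array([[[1.0, 0.0, 0.0, 0.5]]])   # half-transparent, first channel full
bg = np.array([[[0.0, 0.0, 1.0, 1.0]]])   # opaque, third channel full
result = over(fg, bg)
print(result)   # [[[0.5 0.  0.5 1. ]]]
```

When the background is fully opaque the output alpha is 1 and the division has no effect, which is why the simpler code above gives the same colours in that case.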
The following code uses the alpha channel of the overlay image to blend it correctly into the background image. Use x and y to set the top-left corner of the overlay image.
import cv2
import numpy as np
def overlay_transparent(background, overlay, x, y):
    background_width = background.shape[1]
    background_height = background.shape[0]
    if x >= background_width or y >= background_height:
        return background
    h, w = overlay.shape[0], overlay.shape[1]
    if x + w > background_width:
        w = background_width - x
        overlay = overlay[:, :w]
    if y + h > background_height:
        h = background_height - y
        overlay = overlay[:h]
    if overlay.shape[2] < 4:
        overlay = np.concatenate(
            [
                overlay,
                np.ones((overlay.shape[0], overlay.shape[1], 1), dtype = overlay.dtype) * 255
            ],
            axis = 2,
        )
    overlay_image = overlay[..., :3]
    mask = overlay[..., 3:] / 255.0
    background[y:y+h, x:x+w] = (1.0 - mask) * background[y:y+h, x:x+w] + mask * overlay_image
    return background
This code will mutate background so create a copy if you wish to preserve the original background image.
Been a while since this question appeared, but I believe this is the right simple answer, which could still help somebody.
background = cv2.imread('road.jpg')
overlay = cv2.imread('traffic sign.png')
rows,cols,channels = overlay.shape
overlay=cv2.addWeighted(background[250:250+rows, 0:0+cols],0.5,overlay,0.5,0)
background[250:250+rows, 0:0+cols ] = overlay
This will overlay the image over the background image such as shown here:
Ignore the ROI rectangles
Note that I used a background image of size 400x300 and an overlay image of size 32x32. The overlay appears in the x[0-32], y[250-282] region of the background, according to the coordinates I set: the code first calculates the blend for that region and then writes the blended result back into it.
(The overlay is loaded from disk, not taken from the background image itself. Unfortunately the overlay image has its own white background, so you can see that in the result too.)
If performance isn't a concern then you can iterate over each pixel of the overlay and apply it to the background. This isn't very efficient, but it does help to understand how to work with png's alpha layer.
slow version
import cv2
background = cv2.imread('field.jpg')
overlay = cv2.imread('dice.png', cv2.IMREAD_UNCHANGED) # IMREAD_UNCHANGED => open image with the alpha channel
height, width = overlay.shape[:2]
for y in range(height):
    for x in range(width):
        overlay_color = overlay[y, x, :3]  # first three elements are color (BGR order in OpenCV)
        overlay_alpha = overlay[y, x, 3] / 255  # 4th element is the alpha channel, convert from 0-255 to 0.0-1.0
        # get the color from the background image
        background_color = background[y, x]
        # combine the background color and the overlay color weighted by alpha
        composite_color = background_color * (1 - overlay_alpha) + overlay_color * overlay_alpha
        # update the background image in place
        background[y, x] = composite_color
cv2.imwrite('combined.png', background)
result:
fast version
I stumbled across this question while trying to add a png overlay to a live video feed. The above solution is way too slow for that. We can make the algorithm significantly faster by using numpy's vector functions.
note: This was my first real foray into numpy so there may be better/faster methods than what I've come up with.
import cv2
import numpy as np
background = cv2.imread('field.jpg')
overlay = cv2.imread('dice.png', cv2.IMREAD_UNCHANGED) # IMREAD_UNCHANGED => open image with the alpha channel
# separate the alpha channel from the color channels
alpha_channel = overlay[:, :, 3] / 255 # convert from 0-255 to 0.0-1.0
overlay_colors = overlay[:, :, :3]
# To take advantage of the speed of numpy and apply transformations to the entire image with a single operation
# the arrays need to be the same shape. However, the shapes currently look like this:
# - overlay_colors shape:(height, width, 3)  3 color values for each pixel, (blue, green, red)
# - alpha_channel  shape:(height, width)     1 single alpha value for each pixel
# We will construct an alpha_mask that has the same shape as the overlay_colors by duplicating the alpha channel
# for each color so there is a 1:1 alpha channel for each color channel
alpha_mask = np.dstack((alpha_channel, alpha_channel, alpha_channel))
# The background image is larger than the overlay so we'll take a subsection of the background that matches the
# dimensions of the overlay.
# NOTE: For simplicity, the overlay is applied to the top-left corner of the background(0,0). An x and y offset
# could be used to place the overlay at any position on the background.
h, w = overlay.shape[:2]
background_subsection = background[0:h, 0:w]
# combine the background with the overlay image weighted by alpha
composite = background_subsection * (1 - alpha_mask) + overlay_colors * alpha_mask
# overwrite the section of the background image that has been updated
background[0:h, 0:w] = composite
cv2.imwrite('combined.png', background)
How much faster? On my machine the slow method takes ~3 seconds and the optimized method takes ~30 ms. So about 100 times faster!
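As a possible simplification, NumPy broadcasting makes the np.dstack step unnecessary: giving the alpha channel a trailing axis of length 1 lets it be multiplied against a three-channel array directly. The sketch below compares both variants on random synthetic data rather than on the images from the answer.

```python
import numpy as np

rng = np.random.default_rng(3)
background = rng.integers(0, 256, size=(4, 5, 3)).astype(np.float64)
overlay_colors = rng.integers(0, 256, size=(4, 5, 3)).astype(np.float64)
alpha_channel = rng.random((4, 5))          # values in [0, 1)

# Version from the answer: replicate alpha for each colour channel.
alpha_mask = np.dstack((alpha_channel, alpha_channel, alpha_channel))
composite_dstack = background * (1 - alpha_mask) + overlay_colors * alpha_mask

# Broadcasting version: add a trailing axis of length 1 instead.
alpha = alpha_channel[..., np.newaxis]      # shape (4, 5, 1)
composite_bcast = background * (1 - alpha) + overlay_colors * alpha

print(np.allclose(composite_dstack, composite_bcast))
```

Both variants give the same composite; the broadcasting one simply avoids allocating the replicated mask.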
Wrapped up in a function
This function handles foreground and background images of different sizes and also supports negative and positive offsets that move the overlay across the bounds of the background image in any direction.
import cv2
import numpy as np
def add_transparent_image(background, foreground, x_offset=None, y_offset=None):
    bg_h, bg_w, bg_channels = background.shape
    fg_h, fg_w, fg_channels = foreground.shape
    assert bg_channels == 3, f'background image should have exactly 3 channels (RGB). found:{bg_channels}'
    assert fg_channels == 4, f'foreground image should have exactly 4 channels (RGBA). found:{fg_channels}'
    # center by default
    if x_offset is None: x_offset = (bg_w - fg_w) // 2
    if y_offset is None: y_offset = (bg_h - fg_h) // 2
    w = min(fg_w, bg_w, fg_w + x_offset, bg_w - x_offset)
    h = min(fg_h, bg_h, fg_h + y_offset, bg_h - y_offset)
    if w < 1 or h < 1: return
    # clip foreground and background images to the overlapping regions
    bg_x = max(0, x_offset)
    bg_y = max(0, y_offset)
    fg_x = max(0, x_offset * -1)
    fg_y = max(0, y_offset * -1)
    foreground = foreground[fg_y:fg_y + h, fg_x:fg_x + w]
    background_subsection = background[bg_y:bg_y + h, bg_x:bg_x + w]
    # separate alpha and color channels from the foreground image
    foreground_colors = foreground[:, :, :3]
    alpha_channel = foreground[:, :, 3] / 255  # 0-255 => 0.0-1.0
    # construct an alpha_mask that matches the image shape
    alpha_mask = np.dstack((alpha_channel, alpha_channel, alpha_channel))
    # combine the background with the overlay image weighted by alpha
    composite = background_subsection * (1 - alpha_mask) + foreground_colors * alpha_mask
    # overwrite the section of the background image that has been updated
    background[bg_y:bg_y + h, bg_x:bg_x + w] = composite
example usage:
background = cv2.imread('field.jpg')
overlay = cv2.imread('dice.png', cv2.IMREAD_UNCHANGED) # IMREAD_UNCHANGED => open image with the alpha channel
x_offset = 0
y_offset = 0
print("arrow keys to move the dice. ESC to quit")
while True:
    img = background.copy()
    add_transparent_image(img, overlay, x_offset, y_offset)
    cv2.imshow("", img)
    key = cv2.waitKey()
    if key == 0: y_offset -= 10 # up
    if key == 1: y_offset += 10 # down
    if key == 2: x_offset -= 10 # left
    if key == 3: x_offset += 10 # right
    if key == 27: break # escape
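To see how the offset clipping in `add_transparent_image` behaves, the sketch below pulls just that arithmetic into a hypothetical helper, `overlap_region`. The helper name and its tuple return format are assumptions made for illustration and are not part of the answer; the `min`/`max` expressions are copied from the function above.

```python
def overlap_region(bg_w, bg_h, fg_w, fg_h, x_offset, y_offset):
    """Return (bg_x, bg_y, fg_x, fg_y, w, h) for the overlapping area, or None if there is none."""
    # Width and height of the part of the foreground that lands on the background.
    w = min(fg_w, bg_w, fg_w + x_offset, bg_w - x_offset)
    h = min(fg_h, bg_h, fg_h + y_offset, bg_h - y_offset)
    if w < 1 or h < 1:
        return None
    # A positive offset shifts the start inside the background;
    # a negative offset shifts the start inside the foreground instead.
    bg_x, bg_y = max(0, x_offset), max(0, y_offset)
    fg_x, fg_y = max(0, -x_offset), max(0, -y_offset)
    return bg_x, bg_y, fg_x, fg_y, w, h

# A 100x80 background with a 40x30 foreground (made-up sizes).
inside = overlap_region(100, 80, 40, 30, 10, 20)     # fully inside
top_left = overlap_region(100, 80, 40, 30, -15, -5)  # hangs off the top-left corner
bottom_right = overlap_region(100, 80, 40, 30, 90, 70)  # hangs off the bottom-right corner
outside = overlap_region(100, 80, 40, 30, 100, 0)    # completely off the right edge

print(inside, top_left, bottom_right, outside)
```

When the foreground hangs off the top-left corner, the background slice starts at 0 and the foreground slice starts partway in, which is why the function slices both images before blending.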
You need to open the transparent PNG image using the flag IMREAD_UNCHANGED:
Mat overlay = cv::imread("dice.png", IMREAD_UNCHANGED);
Then split the channels, group the RGB channels, and use the alpha channel as a mask, like this:
/**
 * @brief Draws a transparent image over a frame Mat.
 *
 * @param frame the frame where the transparent image will be drawn
 * @param transp the Mat image with transparency, read from a PNG image, with the IMREAD_UNCHANGED flag
 * @param xPos x position of the frame image where the image will start.
 * @param yPos y position of the frame image where the image will start.
 */
void drawTransparency(Mat frame, Mat transp, int xPos, int yPos) {
    Mat mask;
    vector<Mat> layers;
    split(transp, layers); // separate channels
    Mat rgb[3] = { layers[0], layers[1], layers[2] };
    mask = layers[3]; // PNG's alpha channel used as mask
    merge(rgb, 3, transp); // put the RGB channels back together; transp is no longer transparent
    transp.copyTo(frame.rowRange(yPos, yPos + transp.rows).colRange(xPos, xPos + transp.cols), mask);
}
It can be called like this:
drawTransparency(background, overlay, 10, 10);
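For readers following along in Python, the masked `copyTo` above can be approximated with NumPy boolean indexing. This is a minimal sketch, assuming the overlay fits entirely inside the frame at `(x_pos, y_pos)`; `draw_transparency` is a hypothetical name chosen to mirror the C++ function. Like a masked copy, it treats any nonzero alpha as fully opaque rather than blending.

```python
import numpy as np

def draw_transparency(frame, transp, x_pos, y_pos):
    """Copy the color pixels of a 4-channel image onto frame wherever alpha is nonzero."""
    h, w = transp.shape[:2]
    colors = transp[:, :, :3]      # color channels without alpha
    mask = transp[:, :, 3] > 0     # True where the overlay is not fully transparent
    # Basic slicing returns a view, so writing to roi modifies frame in place.
    roi = frame[y_pos:y_pos + h, x_pos:x_pos + w]
    roi[mask] = colors[mask]

# Made-up data: a black 4x4 frame and a 2x2 overlay with one opaque pixel.
frame = np.zeros((4, 4, 3), dtype=np.uint8)
transp = np.zeros((2, 2, 4), dtype=np.uint8)
transp[0, 0] = (10, 20, 30, 255)   # opaque pixel, should be copied
transp[1, 1] = (40, 50, 60, 0)     # fully transparent pixel, should be skipped
draw_transparency(frame, transp, 1, 1)
print(frame[1, 1], frame[2, 2])
```

Because partially transparent pixels are copied at full strength, soft edges will look harder than with the weighted blend used in the earlier answer.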
To overlay a PNG watermark on a normal 3-channel JPEG image:
import cv2
import numpy as np

def logoOverlay(image,logo,alpha=1.0,x=0, y=0, scale=1.0):
    (h, w) = image.shape[:2]
    image = np.dstack([image, np.ones((h, w), dtype="uint8") * 255])

    overlay = cv2.resize(logo, None,fx=scale,fy=scale)
    (wH, wW) = overlay.shape[:2]
    output = image.copy()
    # blend the two images together using transparent overlays
    try:
        if x<0 : x = w+x
        if y<0 : y = h+y
        if x+wW > w: wW = w-x
        if y+wH > h: wH = h-y
        print(x,y,wW,wH)
        overlay=cv2.addWeighted(output[y:y+wH, x:x+wW],alpha,overlay[:wH,:wW],1.0,0)
        output[y:y+wH, x:x+wW ] = overlay
    except Exception as e:
        print("Error: Logo position is overshooting image!")
        print(e)

    output= output[:,:,:3]
    return output
Usage:
background = cv2.imread('image.jpeg')
overlay = cv2.imread('logo.png', cv2.IMREAD_UNCHANGED)

print(overlay.shape) # must be (height, width, 4)
print(background.shape) # must be (height, width, 3)
# downscale the logo by half and position it relative to the bottom-right corner
out = logoOverlay(background,overlay,scale=0.5,y=-100,x=-100)

cv2.imshow("test",out)
cv2.waitKey(0)
import cv2
import numpy as np
background = cv2.imread('background.jpg')
overlay = cv2.imread('cloudy.png')
overlay = cv2.resize(overlay, (200,200))
# overlay = for_transparent_removal(overlay)
h, w = overlay.shape[:2]
shapes = np.zeros_like(background, np.uint8)
shapes[0:h, 0:w] = overlay
alpha = 0.8
mask = shapes.astype(bool)
# first option
background[mask] = cv2.addWeighted(shapes, alpha, shapes, 1 - alpha, 0)[mask]
cv2.imwrite('combined.png', background)
# second option
# blend against `shapes` rather than `overlay`: addWeighted needs both inputs to have the same size
background[mask] = cv2.addWeighted(background, alpha, shapes, 1 - alpha, 0)[mask]
# NOTE: both options overlay the image, but the visual effect differs
cv2.imwrite('combined.1.png', background)
Use this function to place your overlay on any background image.
If you want to resize the overlay, use overlay = cv2.resize(overlay, (200,200)) and then pass the resized overlay into the function.
import cv2
import numpy as np
def image_overlay_second_method(img1, img2, location, min_thresh=0, is_transparent=False):
    h, w = img1.shape[:2]
    h1, w1 = img2.shape[:2]
    x, y = location
    # region of the background that the overlay will cover
    roi = img1[y:y + h1, x:x + w1]
    # build a binary mask: overlay pixels brighter than min_thresh count as foreground
    gray = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, min_thresh, 255, cv2.THRESH_BINARY)
    mask_inv = cv2.bitwise_not(mask)
    # keep the background where the overlay is empty and the overlay everywhere else
    img_bg = cv2.bitwise_and(roi, roi, mask=mask_inv)
    img_fg = cv2.bitwise_and(img2, img2, mask=mask)
    dst = cv2.add(img_bg, img_fg)
    if is_transparent:
        dst = cv2.addWeighted(img1[y:y + h1, x:x + w1], 0.1, dst, 0.9, 0)
    img1[y:y + h1, x:x + w1] = dst
    return img1

if __name__ == '__main__':
    background = cv2.imread('background.jpg')
    overlay = cv2.imread('overlay.png')
    output = image_overlay_second_method(background, overlay, location=(800,50), min_thresh=0, is_transparent=True)
    cv2.imwrite('output.png', output)
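The masking idea in this function (overlay pixels brighter than `min_thresh` replace the background, everything else is left alone) can also be sketched in plain NumPy. This is an illustration under stated assumptions, not a drop-in replacement: `overlay_by_threshold` is a hypothetical name, and the channel mean is used as a simple stand-in for `cv2.cvtColor`'s weighted grayscale conversion, so pixels close to the threshold may be classified differently.

```python
import numpy as np

def overlay_by_threshold(roi, overlay, min_thresh=0):
    """Take overlay pixels brighter than min_thresh and background pixels elsewhere."""
    gray = overlay.mean(axis=2)      # stand-in for a grayscale conversion (assumption)
    mask = gray > min_thresh         # True where the overlay has visible content
    # Broadcast the 2-D mask across the three color channels.
    return np.where(mask[:, :, np.newaxis], overlay, roi)

# Made-up data: a uniform gray background patch and an overlay with one colored pixel.
roi = np.full((2, 2, 3), 200, dtype=np.uint8)
overlay = np.zeros((2, 2, 3), dtype=np.uint8)
overlay[0, 0] = (0, 0, 255)
result = overlay_by_threshold(roi, overlay)
print(result[0, 0], result[1, 1])
```

Note that with either version, genuinely black pixels in the overlay are treated as background when `min_thresh` is 0, so this approach suits overlays on a black backdrop rather than images with a real alpha channel.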
