I have a grayscale image and I want to create an alpha layer based on a range of pixel values. How can I create a fall-off function to generate such an image?
The original image is the following:
I can use Color Range in Photoshop to select the shadows with a fuzziness of 20%:
And the resultant alpha channel is the following:
With fuzziness of 100%:
How can I generate such alpha channels in Python with PIL?
I thought maybe a subtraction, but it does not generate the same result.
The code to generate the image with Numpy and PIL:
from PIL import Image
import numpy as np

# 2560 samples from 0 to 255.9; casting to uint8 truncates them to integers,
# so each gray value is repeated ~10 times along the gradient
img = np.arange(0, 256, 0.1).astype(np.uint8)
img = np.reshape(img, (img.shape[0], 1))
img = np.repeat(img, 500, axis=1)
img = Image.fromarray(img.T)
I tried to create a fall-off function based on the distance from the selected pixel value, but it does not produce the same gradient. Maybe there is a different way?
def gauss_falloff(distance, c=0.2, alpha=255):
    new_value = alpha * np.exp(-1 * (distance ** 2) / (c ** 2))
    new_value = new_value.clip(0, 255)
    return new_value.astype(np.uint8)

# img is a PIL Image at this point, so convert it back to an array first;
# pixel is the normalized gray value to select, e.g. 0.0 for shadows
test = np.asarray(img) / 255
test = np.abs(test - pixel)
test = gauss_falloff(test, c=0.2, alpha=255)
test = Image.fromarray(test)
With my code:
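For comparison, here is a minimal sketch of a linear fall-off driven by a fuzziness fraction. The linear ramp and the gradient.png / alpha.png filenames are assumptions for illustration, not Photoshop's documented model:

import numpy as np
from PIL import Image

def fuzziness_alpha(gray, target, fuzziness):
    # full alpha at `target`, fading linearly to 0 over `fuzziness` * 255 gray levels
    dist = np.abs(gray.astype(np.float64) - float(target))
    falloff = max(255.0 * fuzziness, 1e-6)
    alpha = np.clip(1.0 - dist / falloff, 0.0, 1.0) * 255
    return alpha.astype(np.uint8)

gray = np.asarray(Image.open("gradient.png").convert("L"))  # hypothetical input file
alpha = fuzziness_alpha(gray, target=0, fuzziness=0.2)  # select shadows with 20% fuzziness
Image.fromarray(alpha).save("alpha.png")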
Here's how you could do that:
from PIL import Image, ImageDraw
# Create a new image with a transparent background
width, height = 300, 300
image = Image.new('RGBA', (width, height), (255, 255, 255, 0))
# Create a drawing context for the image
draw = ImageDraw.Draw(image)
# Set the starting and ending colors for the gradient
start_color = (255, 0, 0)
end_color = (0, 0, 255)
# Draw a gradient line with the specified color range
for x in range(width):
    color = tuple(int(start_color[i] + (end_color[i] - start_color[i]) * x / width)
                  for i in range(3))
    draw.line((x, 0, x, height), fill=color)
# Save the image
image.save('gradient.png')
This code creates a new image with a transparent background and a drawing context for that image. Then it draws a gradient line on the image with the specified color range. Finally, it saves the image as a PNG file.
Note: The original Python Imaging Library (PIL) has been replaced by Pillow, a fork of PIL. Pillow installs under the same PIL namespace, so the imports above work unchanged with Pillow.
I work with logos and other simple graphics in which there are no gradients or complex patterns. My task is to extract the segments containing letters and other elements from the logo.
To do this, I determine the background color and then walk through the picture to segment it. Here is my code, for better understanding:
import sys
import cv2 as cv
import numpy as np

MAXIMUM_COLOR_TRANSITION_DELTA = 100  # 0 - 765

def expand_segment_recursive(image, unexplored_foreground, segment, point, color):
    height, width, _ = image.shape
    # Unpack coordinates from point
    py, px = point
    # Create list of pixels to check
    neighbourhood_pixels = [(py, px + 1), (py, px - 1), (py + 1, px), (py - 1, px)]

    allowed_zone = unexplored_foreground & np.invert(segment)

    for y, x in neighbourhood_pixels:
        # Add pixel to segment if its coordinates are within the image shape and its color differs
        # from the segment color by no more than MAXIMUM_COLOR_TRANSITION_DELTA
        if y in range(height) and x in range(width) and allowed_zone[y, x]:
            color_delta = np.sum(np.abs(image[y, x].astype(int) - color.astype(int)))
            print(color_delta)
            if color_delta <= MAXIMUM_COLOR_TRANSITION_DELTA:
                segment[y, x] = True
                segment = expand_segment_recursive(image, unexplored_foreground, segment, (y, x), color)
            allowed_zone = unexplored_foreground & np.invert(segment)

    return segment
if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Pass image as the argument to use the tool")
        exit(-1)

    IMAGE_FILENAME = sys.argv[1]
    print(IMAGE_FILENAME)

    image = cv.imread(IMAGE_FILENAME)
    height, width, _ = image.shape

    # To filter out the background I use the median value of the image,
    # as the background in most cases takes > 50% of the image area.
    background_color = np.median(image, axis=(0, 1))
    print("Background color: ", background_color)

    # Create foreground mask to find segments in it (TODO: Optimize this part)
    foreground = np.zeros(shape=(height, width, 1), dtype=bool)
    for y in range(height):
        for x in range(width):
            if not np.array_equal(image[y, x], background_color):
                foreground[y, x] = True

    unexplored_foreground = foreground

    for y in range(height):
        for x in range(width):
            if unexplored_foreground[y, x]:
                segment = np.zeros(foreground.shape, foreground.dtype)
                segment[y, x] = True
                segment = expand_segment_recursive(image, unexplored_foreground, segment, (y, x), image[y, x])

                cv.imshow("segment", segment.astype(np.uint8) * 255)
                while cv.waitKey(0) != 27:
                    continue
Here is the desired result:
At the end of the run I expect 13 separate extracted segments (for this particular image). But instead I get RecursionError: maximum recursion depth exceeded, which is not surprising, as expand_segment_recursive() can be called for every pixel of the image. Even with a small image resolution of 600x500 I got up to 300K calls.
My question is how can I get rid of recursion in this case and possibly optimize the algorithm with Numpy or OpenCV algorithms?
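One way to avoid the RecursionError would be to replace the call stack with an explicit stack; here is a minimal sketch of the same expansion loop, reusing MAXIMUM_COLOR_TRANSITION_DELTA and the arrays defined above:

def expand_segment_iterative(image, unexplored_foreground, segment, start, color):
    height, width, _ = image.shape
    stack = [start]  # an explicit stack replaces the recursive calls
    while stack:
        py, px = stack.pop()
        for y, x in ((py, px + 1), (py, px - 1), (py + 1, px), (py - 1, px)):
            if 0 <= y < height and 0 <= x < width and unexplored_foreground[y, x] and not segment[y, x]:
                color_delta = np.sum(np.abs(image[y, x].astype(int) - color.astype(int)))
                if color_delta <= MAXIMUM_COLOR_TRANSITION_DELTA:
                    segment[y, x] = True
                    stack.append((y, x))
    return segment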
You can use a thresholded (binary) image and connectedComponents to do this job in a couple of steps. Alternatively, you may use findContours or other methods.
Here is the code:
import numpy as np
import cv2
# load image as greyscale
img = cv2.imread("hp.png", 0)
# set the white (background) pixels to 0 and everything else (greyscale value < 250) to 255
_, thresholded = cv2.threshold(img, 250, 255, cv2.THRESH_BINARY_INV)
# gets the labels and the amount of labels, label 0 is the background
amount, labels = cv2.connectedComponents(thresholded)
# lets draw it for visualization purposes
preview = np.zeros((img.shape[0], img.shape[1], 3), dtype=np.uint8)
print (amount) #should be 3 -> two components + background
# draw label 1 blue and label 2 green
preview[labels == 1] = (255, 0, 0)
preview[labels == 2] = (0, 255, 0)
cv2.imshow("frame", preview)
cv2.waitKey(0)
At the end, the thresholded image will look like this:
and the preview image (the one with the colored segments) will look like this:
With the mask you can always use numpy functions to get things like the coordinates of the segments you want, or to color them (like I did with preview).
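For example, a quick sketch of pulling the pixel coordinates and bounding box of one segment out of the labels array above:

# (row, col) coordinates of every pixel belonging to label 1
coords = np.argwhere(labels == 1)
# bounding box of that segment
(y0, x0), (y1, x1) = coords.min(axis=0), coords.max(axis=0)
print("segment 1 spans rows %d-%d, cols %d-%d" % (y0, y1, x0, x1))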
UPDATE
To get the differently colored segments, you may try to create a "border" between the segments. Since they are plain colors and not gradients, you can apply an edge detector like Canny and then paint the detected edges black in the image:
import numpy as np
import cv2
img = cv2.imread("total.png", 0)
# background to black
img[img>=200] = 0
# get edges
canny = cv2.Canny(img, 60, 180)
# make them thicker
kernel = np.ones((3,3),np.uint8)
canny = cv2.morphologyEx(canny, cv2.MORPH_DILATE, kernel)
# apply edges as border in the image
img[canny==255] = 0
# same as before
amount, labels = cv2.connectedComponents(img)
preview = np.zeros((img.shape[0], img.shape[1], 3), dtype=np.uint8)
print (amount) #should be 14 -> 13 components + background
# color them randomly
for i in range(1, amount):
    preview[labels == i] = np.random.randint(0, 255, size=3, dtype=np.uint8)
cv2.imshow("frame", preview )
cv2.waitKey(0)
The result is:
I need to change the white pixels to black and the black pixels to white in the picture given below.
import cv2
img=cv2.imread("cvlogo.png")
A basic OpenCV logo with a white background; I resized the picture to a fixed, known size:
img=cv2.resize(img, (300,300))#(width,height)
row,col=0,0
i=0
Now I check each pixel by its row and column position with a for loop.
If a pixel is white, I change it to black; if it is black, I change it to white.
for row in range(0, 300, 1):
    print(row)
    for col in range(0, 300, 1):
        print(col)
        if img[row, col] is [255, 255, 255]:  # I have used == instead of 'is'... but there is no change
            img[row, col] = [0, 0, 0]
        elif img[row, col] is [0, 0, 0]:
            img[row, col] = [255, 255, 255]
There is no error on execution, but it does not change the pixel values to black or white. Moreover, the if statement never executes. I am quite confused.
cv2.imshow('img',img)
cv2.waitKey(0)
cv2.destroyAllWindows()
I am not very experienced, but I would do it using numpy.where(), which is faster than the loops.
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Read the image
original_image=cv2.imread("cvlogo.png")
# Not necessary. Make a copy to plot later
img=np.copy(original_image)
#Isolate the areas where the color is black(every channel=0) and white (every channel=255)
black=np.where((img[:,:,0]==0) & (img[:,:,1]==0) & (img[:,:,2]==0))
white=np.where((img[:,:,0]==255) & (img[:,:,1]==255) & (img[:,:,2]==255))
#Turn black pixels to white and vice versa
img[black]=(255,255,255)
img[white]=(0,0,0)
# Plot the images
fig=plt.figure()
ax1 = fig.add_subplot(1,2,1)
ax1.imshow(original_image)
ax1.set_title('Original Image')
ax2 = fig.add_subplot(1,2,2)
ax2.imshow(img)
ax2.set_title('Modified Image')
plt.show()
I think this should work. :)
(I used numpy just to get the width and height values - you don't need this)
import cv2
img=cv2.imread("cvlogo.png")
img=cv2.resize(img, (300,300))
height, width, channels = img.shape
white = [255,255,255]
black = [0,0,0]
for x in range(0, width):
    for y in range(0, height):
        channels_xy = img[y, x]
        if all(channels_xy == white):
            img[y, x] = black
        elif all(channels_xy == black):
            img[y, x] = white
cv2.imshow('img',img)
cv2.waitKey(0)
cv2.destroyAllWindows()
This is another method of solving this problem.
Credits: ajlaj25
import cv2
img=cv2.imread("cvlogo.png")
img=cv2.resize(img, (300,300))
height, width, channels = img.shape
print(height,width,channels)
for x in range(0, width):
    for y in range(0, height):
        if img[x, y, 0] == 255 and img[x, y, 1] == 255 and img[x, y, 2] == 255:
            img[x, y, 0] = 0
            img[x, y, 1] = 0
            img[x, y, 2] = 0
        elif img[x, y, 0] == 0 and img[x, y, 1] == 0 and img[x, y, 2] == 0:
            img[x, y, 0] = 255
            img[x, y, 1] = 255
            img[x, y, 2] = 255
img[x,y] denotes the channel values - all three: [ch1,ch2,ch3] - at the x,y coordinates, and img[x,y,0] is the ch1 channel's value at those coordinates. Note that x and y denote the pixel's location, not the RGB values of the pixel.
cv2.imshow('Converted Image',img)
cv2.waitKey(0)
cv2.destroyAllWindows()
A bit late, but I'd like to contribute another approach to solve this situation. My approach is based on image indexation, which is faster than looping through the image as the accepted answer does.
I did some time measurement of both codes to illustrate what I just said. Take a look at the code below:
import cv2
from matplotlib import pyplot as plt
# Reading image to be used in the montage, this step is not important
original = cv2.imread('imgs/opencv.png')
# Starting time measurement
e1 = cv2.getTickCount()
# Reading the image
img = cv2.imread('imgs/opencv.png')
# Converting the image to grayscale
imgGray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
# Converting the grayscale image into a binary image to get the whole image
ret,imgBinAll = cv2.threshold(imgGray,175,255,cv2.THRESH_BINARY)
# Converting the grayscale image into a binary image to get the text
ret,imgBinText = cv2.threshold(imgGray,5,255,cv2.THRESH_BINARY)
# Changing white pixels from original image to black
img[imgBinAll == 255] = [0,0,0]
# Changing black pixels from original image to white
img[imgBinText == 0] = [255,255,255]
# Finishing time measurement
e2 = cv2.getTickCount()
t = (e2 - e1)/cv2.getTickFrequency()
print(f'Time spent in seconds: {t}')
At this point I stopped timing because the next step is just to plot the montage. The code follows:
# Plotting the image
plt.subplot(1,5,1),plt.imshow(original)
plt.title('original')
plt.xticks([]),plt.yticks([])
plt.subplot(1,5,2),plt.imshow(imgGray,'gray')
plt.title('grayscale')
plt.xticks([]),plt.yticks([])
plt.subplot(1,5,3),plt.imshow(imgBinAll,'gray')
plt.title('binary - all')
plt.xticks([]),plt.yticks([])
plt.subplot(1,5,4),plt.imshow(imgBinText,'gray')
plt.title('binary - text')
plt.xticks([]),plt.yticks([])
plt.subplot(1,5,5),plt.imshow(img,'gray')
plt.title('final result')
plt.xticks([]),plt.yticks([])
plt.show()
That is the final result:
Montage showing all steps of the proposed approach
And this is the time consumed (printed in the console):
Time spent in seconds: 0.008526025
In order to compare both approaches, I commented out the line where the image is resized. I also stopped timing before the imshow command. These were the results:
Time spent in seconds: 1.837972522
Final result of the looping approach
If you examine both images you'll see some contour differences. When you are working with image processing, efficiency is often key, so it is a good idea to save time where possible. This approach can be adapted to different situations; take a look at the threshold documentation.
How can I overlay a transparent PNG onto another image without losing its transparency, using OpenCV in Python?
import cv2
background = cv2.imread('field.jpg')
overlay = cv2.imread('dice.png')
# Help please
cv2.imwrite('combined.png', background)
Desired output:
Sources:
Background Image
Overlay
import cv2
background = cv2.imread('field.jpg')
overlay = cv2.imread('dice.png')
added_image = cv2.addWeighted(background,0.4,overlay,0.1,0)
cv2.imwrite('combined.png', added_image)
The correct answer to this was far too hard to come by, so I'm posting this answer even though the question is really old. What you are looking for is "over" compositing, and the algorithm for this can be found on Wikipedia: https://en.wikipedia.org/wiki/Alpha_compositing
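For reference, the over operator from that page, written with straight (non-premultiplied) alpha values scaled to 0-1, is:

alpha_out = alpha_fg + alpha_bg * (1 - alpha_fg)
C_out = (C_fg * alpha_fg + C_bg * alpha_bg * (1 - alpha_fg)) / alpha_out

Note that the code below skips the final division by alpha_out for the color channels, which matches the exact formula only where alpha_out is 1 (for example, wherever the background is fully opaque).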
I am far from an expert with OpenCV, but after some experimentation this is the most efficient way I have found to accomplish the task:
import cv2
background = cv2.imread("background.png", cv2.IMREAD_UNCHANGED)
foreground = cv2.imread("overlay.png", cv2.IMREAD_UNCHANGED)
# normalize alpha channels from 0-255 to 0-1
alpha_background = background[:,:,3] / 255.0
alpha_foreground = foreground[:,:,3] / 255.0
# set adjusted colors
for color in range(0, 3):
    background[:,:,color] = alpha_foreground * foreground[:,:,color] + \
        alpha_background * background[:,:,color] * (1 - alpha_foreground)
# set adjusted alpha and denormalize back to 0-255
background[:,:,3] = (1 - (1 - alpha_foreground) * (1 - alpha_background)) * 255
# display the image
cv2.imshow("Composited image", background)
cv2.waitKey(0)
The following code will use the alpha channels of the overlay image to correctly blend it into the background image, use x and y to set the top-left corner of the overlay image.
import cv2
import numpy as np
def overlay_transparent(background, overlay, x, y):
    background_width = background.shape[1]
    background_height = background.shape[0]

    if x >= background_width or y >= background_height:
        return background

    h, w = overlay.shape[0], overlay.shape[1]

    if x + w > background_width:
        w = background_width - x
        overlay = overlay[:, :w]

    if y + h > background_height:
        h = background_height - y
        overlay = overlay[:h]

    if overlay.shape[2] < 4:
        overlay = np.concatenate(
            [
                overlay,
                np.ones((overlay.shape[0], overlay.shape[1], 1), dtype=overlay.dtype) * 255
            ],
            axis=2,
        )

    overlay_image = overlay[..., :3]
    mask = overlay[..., 3:] / 255.0

    background[y:y+h, x:x+w] = (1.0 - mask) * background[y:y+h, x:x+w] + mask * overlay_image

    return background
This code will mutate background so create a copy if you wish to preserve the original background image.
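A usage sketch (the filenames here are placeholders):

import cv2
background = cv2.imread('field.jpg')
overlay = cv2.imread('dice.png', cv2.IMREAD_UNCHANGED)  # keep the alpha channel
result = overlay_transparent(background.copy(), overlay, 10, 10)  # copy to preserve the original
cv2.imwrite('combined.png', result)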
It's been a while since this question appeared, but I believe this is the right simple answer, which could still help somebody.
background = cv2.imread('road.jpg')
overlay = cv2.imread('traffic sign.png')
rows,cols,channels = overlay.shape
overlay=cv2.addWeighted(background[250:250+rows, 0:0+cols],0.5,overlay,0.5,0)
background[250:250+rows, 0:0+cols ] = overlay
This will overlay the image over the background image, as shown here:
Ignore the ROI rectangles.
Note that I used a background image of size 400x300 and an overlay image of size 32x32; it is shown in the x[0-32] and y[250-282] part of the background image, according to the coordinates I set for it. The blend is calculated first and then placed into the part of the image where I want to have it.
(The overlay is loaded from disk, not from the background image itself; unfortunately, the overlay image has its own white background, so you can see that in the result too.)
If performance isn't a concern, then you can iterate over each pixel of the overlay and apply it to the background. This isn't very efficient, but it does help to understand how to work with a PNG's alpha layer.
slow version
import cv2
background = cv2.imread('field.jpg')
overlay = cv2.imread('dice.png', cv2.IMREAD_UNCHANGED) # IMREAD_UNCHANGED => open image with the alpha channel
height, width = overlay.shape[:2]
for y in range(height):
    for x in range(width):
        overlay_color = overlay[y, x, :3]  # first three elements are color (RGB)
        overlay_alpha = overlay[y, x, 3] / 255  # 4th element is the alpha channel, convert from 0-255 to 0.0-1.0

        # get the color from the background image
        background_color = background[y, x]

        # combine the background color and the overlay color weighted by alpha
        composite_color = background_color * (1 - overlay_alpha) + overlay_color * overlay_alpha

        # update the background image in place
        background[y, x] = composite_color
cv2.imwrite('combined.png', background)
result:
fast version
I stumbled across this question while trying to add a png overlay to a live video feed. The above solution is way too slow for that. We can make the algorithm significantly faster by using numpy's vector functions.
note: This was my first real foray into numpy so there may be better/faster methods than what I've come up with.
import cv2
import numpy as np
background = cv2.imread('field.jpg')
overlay = cv2.imread('dice.png', cv2.IMREAD_UNCHANGED) # IMREAD_UNCHANGED => open image with the alpha channel
# separate the alpha channel from the color channels
alpha_channel = overlay[:, :, 3] / 255 # convert from 0-255 to 0.0-1.0
overlay_colors = overlay[:, :, :3]
# To take advantage of the speed of numpy and apply transformations to the entire image with a single operation
# the arrays need to be the same shape. However, the shapes currently look like this:
# - overlay_colors shape: (height, width, 3) - 3 color values for each pixel (red, green, blue)
# - alpha_channel  shape: (height, width)    - a single alpha value for each pixel
# We will construct an alpha_mask that has the same shape as overlay_colors by duplicating the alpha channel
# for each color, so there is a 1:1 alpha value for each color channel
alpha_mask = np.dstack((alpha_channel, alpha_channel, alpha_channel))
# The background image is larger than the overlay so we'll take a subsection of the background that matches the
# dimensions of the overlay.
# NOTE: For simplicity, the overlay is applied to the top-left corner of the background(0,0). An x and y offset
# could be used to place the overlay at any position on the background.
h, w = overlay.shape[:2]
background_subsection = background[0:h, 0:w]
# combine the background with the overlay image weighted by alpha
composite = background_subsection * (1 - alpha_mask) + overlay_colors * alpha_mask
# overwrite the section of the background image that has been updated
background[0:h, 0:w] = composite
cv2.imwrite('combined.png', background)
How much faster? On my machine the slow method takes ~3 seconds and the optimized method takes ~ 30 ms. So about
100 times faster!
Wrapped up in a function
This function handles foreground and background images of different sizes and also supports negative and positive offsets that move the overlay across the bounds of the background image in any direction.
import cv2
import numpy as np
def add_transparent_image(background, foreground, x_offset=None, y_offset=None):
    bg_h, bg_w, bg_channels = background.shape
    fg_h, fg_w, fg_channels = foreground.shape

    assert bg_channels == 3, f'background image should have exactly 3 channels (RGB). found:{bg_channels}'
    assert fg_channels == 4, f'foreground image should have exactly 4 channels (RGBA). found:{fg_channels}'

    # center by default
    if x_offset is None: x_offset = (bg_w - fg_w) // 2
    if y_offset is None: y_offset = (bg_h - fg_h) // 2

    w = min(fg_w, bg_w, fg_w + x_offset, bg_w - x_offset)
    h = min(fg_h, bg_h, fg_h + y_offset, bg_h - y_offset)

    if w < 1 or h < 1: return

    # clip foreground and background images to the overlapping regions
    bg_x = max(0, x_offset)
    bg_y = max(0, y_offset)
    fg_x = max(0, x_offset * -1)
    fg_y = max(0, y_offset * -1)
    foreground = foreground[fg_y:fg_y + h, fg_x:fg_x + w]
    background_subsection = background[bg_y:bg_y + h, bg_x:bg_x + w]

    # separate alpha and color channels from the foreground image
    foreground_colors = foreground[:, :, :3]
    alpha_channel = foreground[:, :, 3] / 255  # 0-255 => 0.0-1.0

    # construct an alpha_mask that matches the image shape
    alpha_mask = np.dstack((alpha_channel, alpha_channel, alpha_channel))

    # combine the background with the overlay image weighted by alpha
    composite = background_subsection * (1 - alpha_mask) + foreground_colors * alpha_mask

    # overwrite the section of the background image that has been updated
    background[bg_y:bg_y + h, bg_x:bg_x + w] = composite
example usage:
background = cv2.imread('field.jpg')
overlay = cv2.imread('dice.png', cv2.IMREAD_UNCHANGED) # IMREAD_UNCHANGED => open image with the alpha channel
x_offset = 0
y_offset = 0
print("arrow keys to move the dice. ESC to quit")
while True:
    img = background.copy()
    add_transparent_image(img, overlay, x_offset, y_offset)

    cv2.imshow("", img)
    key = cv2.waitKey()
    if key == 0: y_offset -= 10  # up
    if key == 1: y_offset += 10  # down
    if key == 2: x_offset -= 10  # left
    if key == 3: x_offset += 10  # right
    if key == 27: break  # escape
You need to open the transparent png image using the flag IMREAD_UNCHANGED
Mat overlay = cv::imread("dice.png", IMREAD_UNCHANGED);
Then split the channels, merge the RGB channels back together and use the alpha channel as a mask, like this:
/**
 * @brief Draws a transparent image over a frame Mat.
 *
 * @param frame the frame where the transparent image will be drawn
 * @param transp the Mat image with transparency, read from a PNG image, with the IMREAD_UNCHANGED flag
 * @param xPos x position of the frame image where the image will start.
 * @param yPos y position of the frame image where the image will start.
 */
void drawTransparency(Mat frame, Mat transp, int xPos, int yPos) {
    Mat mask;
    vector<Mat> layers;

    split(transp, layers);  // separate channels
    Mat rgb[3] = { layers[0], layers[1], layers[2] };
    mask = layers[3];  // png's alpha channel used as mask
    merge(rgb, 3, transp);  // put together the RGB channels, now transp isn't transparent
    transp.copyTo(frame.rowRange(yPos, yPos + transp.rows).colRange(xPos, xPos + transp.cols), mask);
}
It can be called like this:
drawTransparency(background, overlay, 10, 10);
To overlay a PNG image watermark over a normal 3-channel JPEG image:
import cv2
import numpy as np
def logoOverlay(image, logo, alpha=1.0, x=0, y=0, scale=1.0):
    (h, w) = image.shape[:2]
    image = np.dstack([image, np.ones((h, w), dtype="uint8") * 255])

    overlay = cv2.resize(logo, None, fx=scale, fy=scale)
    (wH, wW) = overlay.shape[:2]
    output = image.copy()
    # blend the two images together using transparent overlays
    try:
        if x < 0: x = w + x
        if y < 0: y = h + y
        if x + wW > w: wW = w - x
        if y + wH > h: wH = h - y
        print(x, y, wW, wH)
        overlay = cv2.addWeighted(output[y:y+wH, x:x+wW], alpha, overlay[:wH, :wW], 1.0, 0)
        output[y:y+wH, x:x+wW] = overlay
    except Exception as e:
        print("Error: Logo position is overshooting image!")
        print(e)

    output = output[:, :, :3]
    return output
Usage:
background = cv2.imread('image.jpeg')
overlay = cv2.imread('logo.png', cv2.IMREAD_UNCHANGED)
print(overlay.shape) # must be (x,y,4)
print(background.shape) # must be (x,y,3)
# downscale logo by half and position on bottom right reference
out = logoOverlay(background,overlay,scale=0.5,y=-100,x=-100)
cv2.imshow("test",out)
cv2.waitKey(0)
import cv2
import numpy as np
background = cv2.imread('background.jpg')
overlay = cv2.imread('cloudy.png')
overlay = cv2.resize(overlay, (200,200))
# overlay = for_transparent_removal(overlay)
h, w = overlay.shape[:2]
shapes = np.zeros_like(background, np.uint8)
shapes[0:h, 0:w] = overlay
alpha = 0.8
mask = shapes.astype(bool)
# option first
background[mask] = cv2.addWeighted(shapes, alpha, shapes, 1 - alpha, 0)[mask]
cv2.imwrite('combined.png', background)
# option second
background[mask] = cv2.addWeighted(background, alpha, overlay, 1 - alpha, 0)[mask]
# NOTE: both options will overlay the image, but the resulting effect differs
cv2.imwrite('combined.1.png', background)
Use this function to place your overlay on any background image. If you want to resize the overlay, use overlay = cv2.resize(overlay, (200,200)) and then pass the resized overlay into the function.
import cv2
import numpy as np
def image_overlay_second_method(img1, img2, location, min_thresh=0, is_transparent=False):
    h, w = img1.shape[:2]
    h1, w1 = img2.shape[:2]
    x, y = location

    roi = img1[y:y + h1, x:x + w1]

    gray = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, min_thresh, 255, cv2.THRESH_BINARY)
    mask_inv = cv2.bitwise_not(mask)

    img_bg = cv2.bitwise_and(roi, roi, mask=mask_inv)
    img_fg = cv2.bitwise_and(img2, img2, mask=mask)
    dst = cv2.add(img_bg, img_fg)

    if is_transparent:
        dst = cv2.addWeighted(img1[y:y + h1, x:x + w1], 0.1, dst, 0.9, None)

    img1[y:y + h1, x:x + w1] = dst
    return img1
if __name__ == '__main__':
    background = cv2.imread('background.jpg')
    overlay = cv2.imread('overlay.png')

    output = image_overlay_second_method(background, overlay, location=(800, 50), min_thresh=0, is_transparent=True)
    cv2.imwrite('output.png', output)
background.jpg
output.png
I am using the PIL to take an image with a black background and make a mask out of it. What I want the program to do is iterate through all the pixels in the image and if the pixel is black make it white and if it is any other color make it black, but I am not sure how to appropriately compare pixel values to determine what to do with the pixel.
Here is my code so far which creates an all black image.
import os, sys
import Image
filename = "C:\Users\pdiffley\Dropbox\C++2\Code\Test\BallSpriteImage.bmp"
height = 50
width = 50
im = Image.open(filename)
im = im.load()
i = 0
j = 0
while i < height:
    while j < width:
        if im[j, i] == (0, 0, 0):
            im[j, i] = (255, 255, 255)
        else:
            im[j, i] = (0, 0, 0)
        j = j + 1
    i = i + 1
mask = Image.new('RGB', (width, height))
newfile = filename.partition('.')
newfile = newfile[0] + "Mask.bmp"
mask.save(newfile)
I believe the problem is in the if statement comparing the im[j,i] to the RGB value (0,0,0) which always evaluates to false. What is the correct way to compare the pixel?
The pixel data comparison is correct. But there are two problems with the logic:
When you are finished with a row, you should reset j to 0.
You are modifying the object "im", but writing "mask".
This should work (as long as you have no alpha channel - as andrewdski pointed out):
img = Image.open(filename)
im = img.load()
i = 0
while i < height:
    j = 0
    while j < width:
        if im[j, i] == (0, 0, 0):
            im[j, i] = (255, 255, 255)
        else:
            im[j, i] = (0, 0, 0)
        j = j + 1
    i = i + 1
newfile = filename.partition('.')
newfile = newfile[0] + "Mask.png"
img.save(newfile)
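If your source image does have an alpha channel, one minimal way to drop it before comparing (an illustrative addition, not part of the original answer) would be:

img = Image.open(filename).convert("RGB")  # drop alpha so pixels compare as (R, G, B) tuples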
Here's how I'd rewrite it. This version avoids the pixel-index reset problem by using for loops, writes the data to a separate mask image rather than back onto the source, and removes the hardcoded image size. I also added an r prefix to the filename string to handle the backslashes in it.
import os, sys
import Image
BLACK = (0,0,0)
WHITE = (255, 255, 255)
filename = r"C:\Users\pdiffley\Dropbox\C++2\Code\Test\BallSpriteImage.bmp"
img = Image.open(filename)
width, height = img.size
im = img.load()
mask = Image.new('RGB', (width, height))
msk = mask.load()
for y in xrange(height):
    for x in xrange(width):
        if im[x, y] == BLACK:
            msk[x, y] = WHITE
        else:  # not really needed since mask's initial color is black
            msk[x, y] = BLACK
newfilename = filename.partition('.')
newfilename = newfilename[0] + "Mask.bmp"
mask.save(newfilename)
The following function uses the .point method and works separately on each band of the image:

CVT_TABLE = (255,) + 255 * (0,)  # maps value 0 to 255 (white) and everything else to 0 (black)

def do_convert(img):
    return img.point(CVT_TABLE * len(img.getbands()))
Working separately on each band means that a picture like this:
will be converted into this:
However, you can get almost what you want if you convert the image to mode "L" first:
CVT_TABLE = (255,) + 255 * (0,)

def do_convert(img):
    return img.convert("L").point(CVT_TABLE)
producing the following result:
The only drawback is that a few darkest colors (e.g. #000001, the darkest blue possible) will probably be converted to black by the mode conversion.
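For example, a quick check of that drawback (this assumes Pillow's standard L-mode weights, L = R*299/1000 + G*587/1000 + B*114/1000):

from PIL import Image

# the darkest possible blue has luminance 0.114, which truncates to 0 in mode "L",
# so the point table above would treat it exactly like pure black
px = Image.new("RGB", (1, 1), "#000001").convert("L").getpixel((0, 0))
print(px)  # 0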