I have the following RGB image (shape (50, 200, 3)):
I want to reduce its dimensions by converting it to pure black and white (the image looks black and white, but it actually has 3 channels, as mentioned).
With some help from the internet, I wrote the following function:
def rgb2gray(rgb):
    r, g, b = rgb[:,:,0], rgb[:,:,1], rgb[:,:,2]
    gray = (0.2989 * r + 0.5870 * g + 0.1140 * b)
    for x in range(rgb.shape[1]):
        for y in range(rgb.shape[0]):
            if gray[y][x]>128: #if bright
                gray[y][x] = 255.0 #white
            else:
                gray[y][x] = 0.0 #black
    return gray
Then I ran:
im = cv2.imread("samples/55y2m.png")
print(im.shape)
print(rgb2gray(im).shape)
plt.imshow(rgb2gray(im))
And got the following output:
(50, 200, 3) #for the input
(50, 200) #for the output
Why is the image yellow and purple, and how can I change it to black and white?
p.s. I tried to change the function to:
def rgb2gray(rgb):
    r, g, b = rgb[:,:,0], rgb[:,:,1], rgb[:,:,2]
    gray = (0.2989 * r + 0.5870 * g + 0.1140 * b)
    for x in range(rgb.shape[1]):
        for y in range(rgb.shape[0]):
            if gray[y][x]>128:
                rgb[y][x] = 255.0 #changed
            else:
                rgb[y][x] = 0.0 #changed
    return rgb #changed
With that change I did get a pure black-and-white image, but it still had 3 channels (RGB). So I tried to remove the last axis, and got purple and yellow again.
You don't need this:
r, g, b = rgb[:,:,0], rgb[:,:,1], rgb[:,:,2]
gray = (0.2989 * r + 0.5870 * g + 0.1140 * b)
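because your image is already grayscale, which means R == G == B, so you can take the green channel (or any other) and use it directly.
Also, specify the colormap for matplotlib; without one, imshow renders a single-channel array with its default colormap, which is why it looks yellow and purple: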
plt.imshow(im[:,:,1], cmap='gray')
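If the goal is a strictly two-valued image, the thresholding loop from the question can probably be replaced by a vectorized NumPy expression. The following is only a minimal sketch: the function name to_black_and_white, the default threshold of 128, and the tiny synthetic array standing in for the loaded image are assumptions made for illustration.

```python
import numpy as np

def to_black_and_white(image, threshold=128):
    """Binarize an H x W x 3 image whose three channels are identical."""
    gray = image[:, :, 1]  # any channel works when R == G == B
    # Vectorized threshold: pixels brighter than the threshold become white.
    return np.where(gray > threshold, 255, 0).astype(np.uint8)

# Tiny synthetic 2 x 2 "image" standing in for the result of cv2.imread(...)
demo = np.array([[[10, 10, 10], [200, 200, 200]],
                 [[128, 128, 128], [129, 129, 129]]], dtype=np.uint8)
bw = to_black_and_white(demo)
print(bw.shape)  # two-dimensional: one value per pixel
print(bw)
```

The result has a single channel, so it still needs cmap='gray' when passed to plt.imshow, as in the line above.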
I am trying to detect numbers by converting a portion of the screen from its original colour to grayscale and applying fixed-level thresholding. The idea is to detect the character's HP and MP values. Neither approach gives a correct and accurate result. I also upscaled the portion, thinking that the numbers would be easier to read.
Original portion:
x, y, w, h = rectangles[0]
w = w - 155
h = h - 9
y = y + 5
x = x + 60
screenshot = window_capture.screenshot[y:y + h, x:x + w]
screenshot = cv2.cvtColor(screenshot, cv2.COLOR_BGR2GRAY)
scale_percent = 250
width = int(w * scale_percent / 100)
height = int(h * scale_percent / 100)
dim = (width, height)
screenshot = cv2.resize(screenshot, dim, interpolation=cv2.INTER_AREA)
ret, thresh = cv2.threshold(screenshot, 150, 255, cv2.THRESH_BINARY)
config = r'--oem 3 --psm 13 -c tessedit_char_whitelist=0123456789/'
text = pytesseract.image_to_string(screenshot).replace('\n', ',')
cv2.imshow('threshold', screenshot)
print(text)
print('------')
When I only convert to grayscale and print the text, I get the following with psm 13 and/or 6. Other psm values did not show anything.
When I apply the threshold with psm 13 and/or 6 (other psm values did not show anything), I get the following.
What am I doing wrong that Tesseract does not return the values correctly? Is this approach the best way to detect HP and MP values?
Update:
I updated the code to convert the image from BGR to HSV, which gives a black background with red text. Tesseract can now read it fine, but it does not read the numbers when a gray background appears behind them: after the conversion, that gray background turns red. How can I always get a black background, or is there a better approach?
x, y, w, h = rectangles[0]
w = w - 140
h = h - 9
y = y + 5
x = x + 60
screenshot = window_capture.screenshot[y:y + h, x:x + w]
screenshot = cv2.cvtColor(screenshot, cv2.COLOR_BGR2HSV)
lower_white = np.array([0, 0, 0], dtype=np.uint8)
upper_white = np.array([0, 0, 255], dtype=np.uint8)
mask = cv2.inRange(screenshot, lower_white, upper_white)
res = cv2.bitwise_and(screenshot, screenshot, mask=mask)
config = r'--psm 11 --oem 3 -c tessedit_char_whitelist=0123456789/'
text = pytesseract.image_to_string(screenshot).replace('\n', ',')
cv2.imshow('display', screenshot)
print(text)
print('------')
I get better results with --psm 11. Note that the posted code does not use the thresholded image thresh for the OCR (it passes screenshot instead) and never passes config to image_to_string. Finally, Tesseract generally works best with dark text on a light background, so you could use cv2.THRESH_BINARY_INV instead of cv2.THRESH_BINARY. The "INV" tells OpenCV to invert the result, turning your white text on a dark background into black text on a white background.
import pytesseract
import cv2
pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
screenshot = cv2.imread('hpmp.png', cv2.IMREAD_GRAYSCALE)
scale_percent = 250
screenshot = cv2.resize(screenshot, (0,0), fx=2.5, fy=2.5)
ret, thresh = cv2.threshold(screenshot, 150, 255, cv2.THRESH_BINARY_INV)
config = r'--oem 3 --psm 11 -c tessedit_char_whitelist=0123456789/'
text = pytesseract.image_to_string(thresh, config=config).replace('\n', ',')
print(text)
print('------')
which gives
1263/1263,,/ 2101/2191,♀
------
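The raw string still contains stray separators and a spurious character, so some post-processing is likely needed before the values can be used. A minimal sketch, assuming the values always appear as current/maximum pairs; the helper name parse_pairs and the regular expression are illustrative choices, not part of pytesseract:

```python
import re

def parse_pairs(ocr_text):
    """Return (current, maximum) integer pairs such as '1263/1263' found in OCR text."""
    return [(int(a), int(b))
            for a, b in re.findall(r"(\d+)\s*/\s*(\d+)", ocr_text)]

# The OCR output shown above
raw = "1263/1263,,/ 2101/2191,♀"
pairs = parse_pairs(raw)
print(pairs)
```

Anything that is not a digit group, a slash, and another digit group is ignored, so the extra commas and the trailing symbol do not affect the result.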
I am new to OpenCV and I am not sure how to tackle this problem. I have this 500x500 pixel image with red dots and white lines in it.
Treating each red dot as a center, how could I draw a fixed 25x25 bounding box around it? I need to identify every red dot in the image.
Note: the bounding box must have a fixed size (25x25) and the red dot must be in the center of the bounding box.
Any help would be appreciated. Thank you in advance.
Another solution uses numpy slicing to get the red channel, builds a mask of the red dots from it, and calls cv2.findContours to get the bounding rectangles of the dots. We can use this info to draw the new 25 x 25 rectangles:
# Imports
import cv2
import numpy as np
# Read image
imagePath = "C://opencvImages//"
inputImage = cv2.imread(imagePath + "oHk9s.png")
# Deep copy for results:
inputImageCopy = inputImage.copy()
# Slice the Red channel from the image:
r = inputImage[:, :, 2]
# Convert type to unsigned integer (8 bit):
r = np.where(r == 237, 255, 0).astype("uint8")
# Extract blobs (the red dots are all the white pixels in this mask):
contours, _ = cv2.findContours(r, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
# Store bounding rectangles here:
boundingRectangles = []
# Loop through the blobs and draw a 25 x 25 green rectangle around them:
for c in contours:
    # Get dot bounding box:
    x, y, w, h = cv2.boundingRect(c)
    # Set new bounding box dimensions:
    boxWidth = 25
    boxHeight = 25
    # Center rectangle around blob:
    boxX = int(x + 0.5 * (w - boxWidth))
    boxY = int(y + 0.5 * (h - boxHeight))
    # Store data:
    boundingRectangles.append((boxX, boxY, boxWidth, boxHeight))
    # Draw and show new bounding rectangles
    color = (0, 255, 0)
    cv2.rectangle(inputImageCopy, (boxX, boxY), (boxX + boxWidth, boxY + boxHeight), color, 2)
    cv2.imshow("Boxes", inputImageCopy)
    cv2.waitKey(0)
Additionally, I've stored the top left coordinate, width and height of the rectangles in the boundingRectangles list. This is the output:
Here is how you can use an HSV mask to mask out everything in your image except for the red pixels:
import cv2
import numpy as np
def draw_box(img, cnt):
    x, y, w, h = cv2.boundingRect(cnt)
    half_w = w // 2
    half_h = h // 2
    # The horizontal center uses the width, the vertical center uses the height
    x1 = x + half_w - 12
    x2 = x + half_w + 13
    y1 = y + half_h - 12
    y2 = y + half_h + 13
    cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0))
img = cv2.imread("red_dots.png")
img_hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
ranges = np.array([[100, 0, 0], [179, 255, 255]])
mask = cv2.inRange(img_hsv, *ranges)
img_masked = cv2.bitwise_and(img, img, mask=mask)
img_gray = cv2.cvtColor(img_masked, cv2.COLOR_BGR2GRAY)
contours, _ = cv2.findContours(img_gray, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
for cnt in contours:
    draw_box(img, cnt)
cv2.imshow("Image", img)
cv2.waitKey(0)
Output:
Notice this part of the draw_box() function:
x1 = x + half_w - 12
x2 = x + half_w + 13
y1 = y + half_h - 12
y2 = y + half_h + 13
Ideally, instead of - 12 and + 13, the offsets would be - 12.5 and + 12.5, but cv2.rectangle() only accepts integer pixel coordinates, so passing floats would raise an error.
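The centering arithmetic can be checked on its own, without OpenCV. This is just a sketch: the helper name centered_box and the sample rectangle are made up for illustration, and the input is assumed to be the (x, y, w, h) tuple that cv2.boundingRect returns.

```python
def centered_box(x, y, w, h, size=25):
    """Return (x1, y1, x2, y2) of a size x size box centered on a bounding rect."""
    cx = x + w // 2          # center column comes from the width
    cy = y + h // 2          # center row comes from the height
    x1 = cx - size // 2      # for size 25 this is cx - 12
    y1 = cy - size // 2
    # Adding the full size gives cx + 13 / cy + 13, so the box spans 25 pixels.
    return x1, y1, x1 + size, y1 + size

# Hypothetical dot whose bounding rect is 5 wide and 7 tall
box = centered_box(100, 200, 5, 7)
print(box)
```

Keeping everything in integer division avoids the half-pixel problem mentioned above; the box is simply one pixel longer on the right and bottom sides of the center.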
I'm trying to create a chessboard with Pillow for a small but complicated project of mine. I'm new to Pillow and have been following the documentation, which is how I got to this point. My issue is that my code does not work as intended: I'm trying to create a 200x200 image and draw the first row of the board, but the code does not alternate between the colors (red and blue) to create differently colored squares.
from PIL import Image, ImageDraw
image_size = 200
with Image.new("RGBA", (image_size, image_size)) as im:
    x, y = 0, 0
    w, h = 50, 50
    red = "RED"
    blue = "BLUE"
    color = red
    draw = ImageDraw.Draw(im)
    for i in range(0, int((image_size/5)), 10):
        draw.rectangle((int(i/10) * w, y, w, h), color)
        if color == red: color = blue
        else: color = red
    im.show()
Expected result:
Actual result:
You can change your code to:
#....
    for step in range(0, image_size // w):
        draw.rectangle([step * w, y, (step + 1) * w, h], color)
        if color == red:
            color = blue
        else:
            color = red
#....
This way you draw one square per iteration, starting at step*w and ending at (step+1)*w.
You call draw.rectangle((x, y, w, h)), but Pillow expects the second pair of values to be the lower-right corner, so the call should be draw.rectangle((x, y, x + w, y + h)):
from PIL import Image, ImageDraw
image_size = 200
with Image.new("RGBA", (image_size, image_size)) as im:
    x, y = 0, 0
    w, h = 50, 50
    red = "RED"
    blue = "BLUE"
    color = red
    draw = ImageDraw.Draw(im)
    for i in range(0, int((image_size/5)), 10):
        x = int(i/10) * w
        print(x)
        print(y)
        draw.rectangle((x, y, x + w, y + h), fill=color)
        if color == red:
            color = blue
        else:
            color = red
    im.show()
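Extending the same idea from one row to the whole board could look roughly like the sketch below. It assumes the color of a square is decided by the parity of row + column, which replaces the manual color toggling; the function names chessboard_squares and render_board are made up for this example. Pillow treats both corners of a rectangle as inclusive, which is why the lower-right corner is reduced by one pixel here.

```python
def chessboard_squares(image_size=200, square=50):
    """Return (box, color) pairs for a full board; box is (x0, y0, x1, y1)."""
    cells = image_size // square
    squares = []
    for row in range(cells):
        for col in range(cells):
            # Parity of row + col alternates the color in both directions.
            color = "red" if (row + col) % 2 == 0 else "blue"
            x0, y0 = col * square, row * square
            # Both corners are inclusive in Pillow, hence the -1.
            squares.append(((x0, y0, x0 + square - 1, y0 + square - 1), color))
    return squares


def render_board(image_size=200, square=50):
    """Draw the squares with Pillow (imported here so the helper above has no dependencies)."""
    from PIL import Image, ImageDraw
    im = Image.new("RGBA", (image_size, image_size))
    draw = ImageDraw.Draw(im)
    for box, color in chessboard_squares(image_size, square):
        draw.rectangle(box, fill=color)
    return im


board = chessboard_squares()
print(len(board), board[0], board[1])
```

Calling render_board().show() would display the result; the coordinate helper is kept separate so the layout can be inspected without opening an image window.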
I need to overlay two images based on a third image that acts as a mask.
Example:
1. I have this background.
2. I have this object image, and I also have its segmentation image.
Object image
I'm trying to merge the background and the object image based on the third image (the mask image).
(mask image)
The final result should be the background image plus the object image (only where the mask selects it).
Any ideas?
I tried
import cv2
added_image = cv2.addWeighted(back_img,0.4,aug_demoimage,0.1,0)
But it is not working as expected. Any suggestions? Thanks!
Solved
def get_only_object(img, mask, back_img):
    fg = cv2.bitwise_or(img, img, mask=mask)
    #imshow(fg)
    # invert mask
    mask_inv = cv2.bitwise_not(mask)
    #fg_back = cv2.bitwise_or(back_img, back_img, mask=mask)
    fg_back_inv = cv2.bitwise_or(back_img, back_img, mask=mask_inv)
    #imshow(fg_back_inv)
    final = cv2.bitwise_or(fg, fg_back_inv)
    #imshow(final)
    return final
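For a binary mask, the same selection can also be expressed directly in NumPy. This is only a sketch under the assumption that all three arrays share the same height and width and that any non-zero mask value means "take the object pixel"; the name composite_with_mask and the small test arrays are invented for the example.

```python
import numpy as np

def composite_with_mask(obj_img, back_img, mask):
    """Take pixels from obj_img where mask is non-zero, else from back_img."""
    # Add a trailing axis so the H x W mask broadcasts over the color channels.
    keep = (mask > 0)[:, :, np.newaxis]
    return np.where(keep, obj_img, back_img)

# Tiny stand-ins for the object image, the background and the mask
obj = np.full((2, 2, 3), 200, dtype=np.uint8)
back = np.full((2, 2, 3), 50, dtype=np.uint8)
mask = np.array([[255, 0], [0, 255]], dtype=np.uint8)
result = composite_with_mask(obj, back, mask)
print(result[:, :, 0])
```

Unlike cv2.addWeighted, which blends every pixel of both images with fixed weights, this picks each pixel from exactly one of the two images.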
You need to convert the object image into an RGBA image where the alpha channel is the mask image you have created. Once you do this, you can paste it to the background image.
def convert_to_png(img, a):
    # alpha and img must have the same dimensions
    fin_img = cv2.cvtColor(img, cv2.COLOR_RGB2RGBA)
    b, g, r, alpha = cv2.split(fin_img)
    alpha = a
    # plt.imshow(alpha);plt.title('alpha image');plt.show()
    # plt.imshow(img);plt.title('original image');plt.show()
    # plt.imshow(alpha);plt.title('fin alpha image');plt.show()
    fin_img[:,:, 0] = img[:,:,0]
    fin_img[:,:, 1] = img[:,:,1]
    fin_img[:,:, 2] = img[:,:,2]
    fin_img[:,:, 3] = alpha[:,:]
    # plt.imshow(fin_img);plt.title('fin image');plt.show()
    return fin_img
This function will combine the two images into an RGBA image.
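y1, y2 = new_loc[1], new_loc[1] + img.shape[0]
x1, x2 = new_loc[0], new_loc[0] + img.shape[1]
alpha_s = img[:, :, 3] / 255.0
alpha_l = 1.0 - alpha_s
for c in range(0, 3):
    # Blend with the destination pixels (fin_img), not with img itself
    fin_img[y1:y2, x1:x2, c] = (alpha_s * img[:, :, c] +
                                alpha_l * fin_img[y1:y2, x1:x2, c])
And this will copy the object image onto the background image. Here img is the RGBA object image, fin_img is assumed to be the background, and new_loc is the (x, y) position of the object's top-left corner.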
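The per-channel loop can probably be written as a single broadcast expression. The following is a minimal sketch, not the answer's exact code: the function name paste_rgba and its argument names are assumptions, the overlay is expected to have four channels, and it must fit entirely inside the background at the given offset.

```python
import numpy as np

def paste_rgba(background, overlay, x, y):
    """Alpha-blend a 4-channel overlay onto a 3-channel background at (x, y)."""
    out = background.astype(np.float32)
    h, w = overlay.shape[:2]
    # Keep a trailing axis on alpha so it broadcasts over the 3 color channels.
    alpha = overlay[:, :, 3:4].astype(np.float32) / 255.0
    region = out[y:y + h, x:x + w, :3]
    out[y:y + h, x:x + w, :3] = alpha * overlay[:, :, :3] + (1.0 - alpha) * region
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

# Background of value 40, overlay of value 100 that is opaque on its diagonal only
bg = np.full((4, 4, 3), 40, dtype=np.uint8)
ov = np.zeros((2, 2, 4), dtype=np.uint8)
ov[:, :, :3] = 100
ov[0, 0, 3] = 255
ov[1, 1, 3] = 255
pasted = paste_rgba(bg, ov, 1, 1)
print(pasted[:, :, 0])
```

Working in float and converting back at the end avoids the silent truncation that happens when fractional blend results are written straight into a uint8 array.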
I'm trying to overlay random images (natural scene images should be overlaid with sign images) using OpenCV and Python. They can vary in size, file extension and number of channels (and probably more). So I resize the sign images according to the size of the natural scene image and put them onto the latter.
I have implemented fireant's code found here: overlay a smaller image on a larger image python OpenCv
But it only works for images with 4 channels.
Using cv2.addWeighted() always crops the larger image (scene image) to the size of the smaller image (sign image). Does anybody have an idea how to do that? Help is highly appreciated.
EDIT: See the expected output below. At first, the escape route sign and the background are separate images.
And this is my code. It works, but since a lot of my images seem to have only 3 channels, I would like to get it working for those as well.
import cv2
import time
import math
import os
pathSigns = "/home/moritz/Schreibtisch/Signs"
pathScenes = "/home/moritz/Schreibtisch/Scenes"
i = 0
for fSigns in os.listdir(pathSigns):
    fSigns = os.path.join(pathSigns, fSigns)
    s_img = cv2.imread(fSigns, -1)
    for fScenes in os.listdir(pathScenes):
        try:
            l_img = cv2.imread(os.path.join(pathScenes, fScenes))
            l_height, l_width, l_channels = l_img.shape
            TARGET_PIXEL_AREA = (l_height * l_width) * 0.05
            ratio = float(s_img.shape[1]) / float(s_img.shape[0])
            s_new_h = int(math.sqrt(TARGET_PIXEL_AREA / ratio) + 0.5)
            s_new_w = int((s_new_h * ratio) + 0.5)
            s_img = cv2.resize(s_img,(s_new_w, s_new_h))
            x_offset=y_offset=50
            # l_img[y_offset:y_offset+s_img.shape[0],
            #       x_offset:x_offset+s_img.shape[1]] = s_img
            y1, y2 = y_offset, y_offset + s_img.shape[0]
            x1, x2 = x_offset, x_offset + s_img.shape[1]
            height, width, channels = s_img.shape
            if channels <= 3:
                alpha_s = s_img[:, :, 2] / 255.0
                alpha_l = 1.0 - alpha_s
            else:
                alpha_s = s_img[:, :, 3] / 255.0
                alpha_l = 1.0 - alpha_s
            for c in range(0, 3):
                l_img[y1:y2, x1:x2, c] = (alpha_s * s_img[:, :, c] +
                                          alpha_l * l_img[y1:y2, x1:x2, c])
            fResult = "/home/moritz/Schreibtisch/results/data_" + str(i) + ".png"
            i += 1
            cv2.imwrite(fResult, l_img)
        except IndexError:
            pass
Thanks to @DanMašek's hint and to How to crop or remove white background from an image, I have worked out a solution. The following code first removes the white background from the smaller image, then converts all images to 4 channels, and then overlays the smaller image on the larger one. Works for me.
import cv2
import time
import math
import os
import numpy as np
pathSigns = "/home/moritz/Schreibtisch/Signs"
pathScenes = "/home/moritz/Schreibtisch/Scenes"
i = 0
for fSigns in os.listdir(pathSigns):
    fSigns = os.path.join(pathSigns, fSigns)
    s_img = cv2.imread(fSigns, -1)
    s_height, s_width, s_channels = s_img.shape
    # crop image
    gray = cv2.cvtColor(s_img, cv2.COLOR_BGR2GRAY)
    th, threshed = cv2.threshold(gray, 240, 255, cv2.THRESH_BINARY_INV)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (11,11))
    morphed = cv2.morphologyEx(threshed, cv2.MORPH_CLOSE, kernel)
    # [-2] picks the contour list on both OpenCV 3 (3 return values) and OpenCV 4 (2)
    cnts = cv2.findContours(morphed, cv2.RETR_EXTERNAL,
                            cv2.CHAIN_APPROX_SIMPLE)[-2]
    cnt = sorted(cnts, key=cv2.contourArea)[-1]
    x,y,w,h = cv2.boundingRect(cnt)
    s_img = s_img[y:y+h, x:x+w]
    # set channels to 4
    if s_channels < 4:
        s_img = cv2.cvtColor(s_img, cv2.COLOR_BGR2BGRA)
    for fScenes in os.listdir(pathScenes):
        try:
            l_img = cv2.imread(os.path.join(pathScenes, fScenes))
            l_height, l_width, l_channels = l_img.shape
            if l_channels < 4:
                l_img = cv2.cvtColor(l_img, cv2.COLOR_BGR2BGRA)
            TARGET_PIXEL_AREA = (l_height * l_width) * 0.05
            ratio = float(s_img.shape[1]) / float(s_img.shape[0])
            s_new_h = int(math.sqrt(TARGET_PIXEL_AREA / ratio) + 0.5)
            s_new_w = int((s_new_h * ratio) + 0.5)
            s_img = cv2.resize(s_img,(s_new_w, s_new_h))
            x_offset=y_offset=50
            y1, y2 = y_offset, y_offset + s_img.shape[0]
            x1, x2 = x_offset, x_offset + s_img.shape[1]
            alpha_s = s_img[:, :, 3] / 255.0
            alpha_l = 1.0 - alpha_s
            for c in range(0, 3):
                l_img[y1:y2, x1:x2, c] = (alpha_s * s_img[:, :, c] + alpha_l *
                                          l_img[y1:y2, x1:x2, c])
            fResult = "/home/moritz/Schreibtisch/results/data_" + str(i) + ".png"
            i += 1
            cv2.imwrite(fResult, l_img)
        except IndexError:
            pass
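The "set channels to 4" step is what lets the blending code index channel 3 on every image. As a rough NumPy illustration of that step (the helper name ensure_alpha is hypothetical, and 8-bit images are assumed), a fully opaque alpha channel can be appended to any image that lacks one:

```python
import numpy as np

def ensure_alpha(img):
    """Return a 4-channel copy of img, adding an opaque alpha channel if needed."""
    if img.ndim == 2:
        # Grayscale input: replicate the single channel into three.
        img = np.stack([img] * 3, axis=-1)
    if img.shape[2] == 4:
        return img.copy()
    # Assumes 8-bit data, where 255 means fully opaque.
    alpha = np.full(img.shape[:2] + (1,), 255, dtype=img.dtype)
    return np.concatenate([img, alpha], axis=2)

three = np.zeros((2, 3, 3), dtype=np.uint8)
four = ensure_alpha(three)
print(four.shape, four[:, :, 3].min())
```

An opaque alpha channel means every overlay pixel replaces the scene pixel beneath it, which is why the code above crops away the white border first.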