So I'm trying to recognize a region that's already been defined by a bounding box. Example:
Some of the areas within these rectangles in these images are white and some are black, and most of them are completely different sizes. The only common characteristic between these images is the red rectangle:
Essentially what I'm trying to do is create a randomly generated meme bot that places a random source image in the region defined by these rectangles. I already have tons of these template images with the red rectangles predefined. I want to automate the process somehow; currently the resize shape and offset have to be defined by hand for each template. So what I need is to recognize the area within the rectangle and have it return the resize shape and offset needed to place the source image.
How should I go about this? Should I use something in OpenCV or am I going to have to train a CNN? Just really looking for a push in the right direction because I'm pretty lost as to the best approach to this problem.
I think OpenCV can do it. Below is a short example of the steps you need; read the comments in the code for more details.
import cv2
import numpy as np
img = cv2.imread("1.jpg")
#STEP1: get only red color (or the bounding box color) in the image
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# define range of red color in HSV
lower_red = np.array([0,50,50])
upper_red = np.array([10,255,255])   # widen the hue upper bound slightly so near-red shades still match
# Threshold the HSV image to get only red colors
mask = cv2.inRange(hsv, lower_red, upper_red)
red_only = cv2.bitwise_and(img, img, mask=mask)
#STEP2: find contour
gray_img = cv2.cvtColor(red_only,cv2.COLOR_BGR2GRAY)
_,thresh = cv2.threshold(gray_img,1,255,cv2.THRESH_BINARY)
contours = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2]   # [-2] works in both OpenCV 3 and 4
#max contour in the image is the box you want
cnt = max(contours, key=cv2.contourArea)
r = cv2.boundingRect(cnt)
cv2.rectangle(img,(r[0],r[1]),(r[0]+r[2],r[1]+r[3]),(0,255,0),3)
cv2.imshow("img",img)
cv2.imshow("red_only",red_only)
cv2.imshow("thresh",thresh)
cv2.waitKey()
cv2.destroyAllWindows()
So I've been trying to code a Python script that takes an image as input and then cuts out a rectangle with a specific background color. However, the rectangle is not at a fixed position in every image (the position will be random), which is where my coding skills hit their limit.
I don't really understand how to use the numpy functions for this. I also read something about OpenCV, but I'm totally new to it. So far I've just cropped the images through the .crop function, but then I would have to use fixed values.
This is how the input image could look and now I would like to detect the position of the yellow rectangle and then crop the image to its size.
Help is appreciated, thanks in advance.
Edit: @MarkSetchell's way works pretty well, but I found an issue with a different test picture. The problem with that picture is that there are 2 small pixels of the same color at the top and the bottom of the picture, which cause errors or a bad crop.
Updated Answer
I have updated my answer to cope with specks of noisy outlier pixels of the same colour as the yellow box. This works by running a 3x3 median filter over the image first to remove the spots:
#!/usr/bin/env python3
import numpy as np
from PIL import Image, ImageFilter
# Open image and make into Numpy array
im = Image.open('image.png').convert('RGB')
na = np.array(im)
orig = na.copy() # Save original
# Median filter to remove outliers
im = im.filter(ImageFilter.MedianFilter(3))
na = np.array(im)   # re-make the Numpy array from the filtered image
# Find X,Y coordinates of all yellow pixels
yellowY, yellowX = np.where(np.all(na==[247,213,83],axis=2))
top, bottom = yellowY.min(), yellowY.max()
left, right = yellowX.min(), yellowX.max()   # X coordinates are not returned sorted, so use min/max
print(top,bottom,left,right)
# Extract Region of Interest from unblurred original
ROI = orig[top:bottom, left:right]
Image.fromarray(ROI).save('result.png')
Original Answer
Ok, your yellow colour is rgb(247,213,83), so we want to find the X,Y coordinates of all yellow pixels:
#!/usr/bin/env python3
from PIL import Image
import numpy as np
# Open image and make into Numpy array
im = Image.open('image.png').convert('RGB')
na = np.array(im)
# Find X,Y coordinates of all yellow pixels
yellowY, yellowX = np.where(np.all(na==[247,213,83],axis=2))
# Find first and last row containing yellow pixels
top, bottom = yellowY.min(), yellowY.max()
# Find first and last column containing yellow pixels
# (np.where does not return the X coordinates sorted, so use min/max rather than [0]/[-1])
left, right = yellowX.min(), yellowX.max()
# Extract Region of Interest
ROI=na[top:bottom, left:right]
Image.fromarray(ROI).save('result.png')
You can do the exact same thing in Terminal with ImageMagick:
# Get trim box of yellow pixels
trim=$(magick image.png -fill black +opaque "rgb(247,213,83)" -format %# info:)
# Check how it looks
echo $trim
251x109+101+220
# Crop image to trim box and save as "ROI.png"
magick image.png -crop "$trim" ROI.png
If still using ImageMagick v6 rather than v7, replace magick with convert.
What I see is dark and light gray areas on sides and top, a white area, and a yellow rectangle with gray triangles inside the white area.
The first stage I suggest is converting the image from RGB color space to HSV color space.
The S channel in HSV space is the "color saturation" channel.
All colorless pixels (gray/black/white) are zero in the S channel, and yellow pixels are above zero.
Next stages:
Apply a threshold on the S channel (convert it to a binary image).
The yellow pixels go to 255, and the others go to zero.
Find contours in thresh (only the outer contour - only the rectangle).
Invert the polarity of the pixels inside the rectangle.
The gray triangles become 255, and the other pixels go to zero.
Find contours in thresh again - this time the gray triangles.
Here is the code:
import numpy as np
import cv2
# Read input image
img = cv2.imread('img.png')
# Convert from BGR to HSV color space
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Get the saturation plane - all black/white/gray pixels are zero, and colored pixels are above zero.
s = hsv[:, :, 1]
# Apply threshold on s - use automatic threshold algorithm (use THRESH_OTSU).
ret, thresh = cv2.threshold(s, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
# Find contours in thresh (find only the outer contour - only the rectangle).
contours = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2] # [-2] indexing takes return value before last (due to OpenCV compatibility issues).
# Mark rectangle with green line
cv2.drawContours(img, contours, -1, (0, 255, 0), 2)
# Assume there is only one contour, get the bounding rectangle of the contour.
x, y, w, h = cv2.boundingRect(contours[0])
# Invert polarity of the pixels inside the rectangle (on thresh image).
thresh[y:y+h, x:x+w] = 255 - thresh[y:y+h, x:x+w]
# Find contours in thresh (find the triangles).
contours = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2] # [-2] indexing takes return value before last (due to OpenCV compatibility issues).
# Iterate triangle contours
for c in contours:
    if cv2.contourArea(c) > 4:  # Ignore very small contours
        # Mark triangle with blue line
        cv2.drawContours(img, [c], -1, (255, 0, 0), 2)
# Show result (for testing).
cv2.imshow('img', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
S color channel in HSV color space:
thresh - S after threshold:
thresh after inverting polarity of the rectangle:
Result (rectangle and triangles are marked):
Update:
In case there are some colored dots on the background, you can crop the largest colored contour:
import cv2
import imutils # https://pypi.org/project/imutils/
# Read input image
img = cv2.imread('img2.png')
# Convert from BGR to HSV color space
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Get the saturation plane - all black/white/gray pixels are zero, and colored pixels are above zero.
s = hsv[:, :, 1]
cv2.imwrite('s.png', s)
# Apply threshold on s - use automatic threshold algorithm (use THRESH_OTSU).
ret, thresh = cv2.threshold(s, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
# Find contours
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
cnts = imutils.grab_contours(cnts)
# Find the contour with the maximum area.
c = max(cnts, key=cv2.contourArea)
# Get bounding rectangle
x, y, w, h = cv2.boundingRect(c)
# Crop the bounding rectangle out of img
out = img[y:y+h, x:x+w, :].copy()
Result:
In OpenCV you can use inRange. This basically makes whatever color is in the range you specified white, and the rest black. This way all the yellow will be white.
Here is the documentation: https://docs.opencv.org/3.4/da/d97/tutorial_threshold_inRange.html
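For this yellow-box case, a minimal sketch of that idea (the HSV bounds below are a guess derived from rgb(247,213,83); tune them for your own images):
import cv2
import numpy as np

img = cv2.imread('image.png')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Rough HSV range for the yellow box - assumed values, adjust as needed
lower_yellow = np.array([20, 100, 100])
upper_yellow = np.array([35, 255, 255])
mask = cv2.inRange(hsv, lower_yellow, upper_yellow)

# Crop to the extent of the white (formerly yellow) pixels in the mask
ys, xs = np.where(mask > 0)
crop = img[ys.min():ys.max()+1, xs.min():xs.max()+1]
cv2.imwrite('cropped.png', crop)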
I have 100+ images with 2 different texts on them. The images are below; one says occupied and the other unoccupied.
So is there any way in Python to differentiate these images by detecting the text in them?
If so, I want to identify the occupied images and delete the unoccupied ones.
Since I am new to Python, can anyone help me with this?
This answer is based on the assumption that there are only two different texts on the images, as you posted in the question. So I assume that the number of characters and the color of the text are always the same ("Room status: Unoccupied" and "Room status: Occupied", in red). That being said, I would try a simpler way to differentiate between the two types. These images contain characters that are very close to each other, so in my opinion it would be very difficult to separate each character and identify it with an OCR. I would try a simpler approach, like finding the area containing the text and measuring its plain length - "Unoccupied" has two more characters than "Occupied" and hence a bigger length. So you can transform the image to HSV color space and use the cv2.inRange() function to extract the text (red color). Then you can merge the characters into one contour with cv2.morphologyEx() and get its length with cv2.minAreaRect(). Hope it helps, or at least gives you a new perspective on how to find your solution. Cheers!
Example code:
import cv2
import numpy as np
# Read the image and transform to HSV colorspace.
img = cv2.imread('ocupied.jpg')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Extract the red text.
lower_red = np.array([0,150,50])
upper_red = np.array([40,255,255])
mask_red = cv2.inRange(hsv, lower_red, upper_red)
# Search for contours on the mask.
contours = cv2.findContours(mask_red, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)[-2]   # [-2] works in both OpenCV 3 and 4
# Create a new mask for further processing.
mask = np.ones(img.shape, np.uint8)*255
# Draw contours on the mask with size and ratio of borders for threshold (to remove other noises from the image).
for cnt in contours:
    size = cv2.contourArea(cnt)
    x, y, w, h = cv2.boundingRect(cnt)
    if 10000 > size > 50 and w*2.5 > h:
        cv2.drawContours(mask, [cnt], -1, (0,0,0), -1)
# Connect neighbour contours and select the biggest one (text).
kernel = np.ones((50,50),np.uint8)
opening = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
gray_op = cv2.cvtColor(opening, cv2.COLOR_BGR2GRAY)
_, threshold_op = cv2.threshold(gray_op, 150, 255, cv2.THRESH_BINARY_INV)
contours_op = cv2.findContours(threshold_op, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)[-2]   # [-2] works in both OpenCV 3 and 4
cnt = max(contours_op, key=cv2.contourArea)
# Create rotated rectangle to get the 4 points of the rectangle.
rect = cv2.minAreaRect(cnt)
# Create bounding and calculate the "lenght" of the text.
box = cv2.boxPoints(rect)
box = box.astype(int)   # astype(int) instead of the deprecated np.int0
(x1, y1) = (box[:, 0].min(), box[:, 1].min())
(x2, y2) = (box[:, 0].max(), box[:, 1].max())
# Draw the rectangle.
cv2.rectangle(img,(x1,y1),(x2,y2),(0,255,0),1)
# Identify the room status.
if x2 - x1 > 200:
print('unoccupied')
else:
print('occupied')
# Display the result
cv2.imshow('img', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result:
occupied
unoccupied
Using the Tesseract OCR engine and the Python wrapper pytesseract, this is a task of only a few lines:
import pytesseract
from PIL import Image
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"
img = Image.open('D:\\tmp2.jpg').crop((0,0,250,35))
print(pytesseract.image_to_string(img, config='--psm 7'))
I have tested this on Windows 7. Of course, I have assumed that the text appears at the same position in every image (from your example, that does seem to be the case). Otherwise, you will need a better cropping mechanism.
I'm currently working on an image where I have to find the outer region of each box, but I've failed to find the regions of the white and black boxes.
input image:
https://i.imgur.com/gec9eP5.png
output image:
https://i.imgur.com/Giz1DAW.png
Update edit:
If I use HLS instead of HSV I can find 3 more box regions, but 2 are still missing.
here is new output:
https://i.imgur.com/eUqltKI.png
and here is my code:
import cv2
import numpy as np
img = cv2.imread("1.png")
imghsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
lower_blue = np.array([0,50,0])
upper_blue = np.array([255,255,255])
mask_blue = cv2.inRange(imghsv, lower_blue, upper_blue)
_, contours, _ = cv2.findContours(mask_blue, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
im = np.copy(img)
cv2.drawContours(im, contours, -1, (0, 255, 0), 2)
cv2.imwrite("contours_blue.png", im)
The mask you're generating with
mask_blue = cv2.inRange(imghsv, lower_blue, upper_blue)
does not include the bottom row at all, so it's impossible to detect these outlines with
_, contours, _ = cv2.findContours(mask_blue, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
You could try to work with multiple masks / thresholds to account for the different color ranges and merge the detected contours.
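For illustration, a rough sketch of that merging idea (the two ranges below are placeholders, not tuned to your image):
import cv2
import numpy as np

img = cv2.imread('1.png')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Placeholder ranges: one for saturated colors, one for near-black boxes
# that a saturation-only threshold misses. Tune both for your image.
mask_color = cv2.inRange(hsv, np.array([0, 50, 50]), np.array([179, 255, 255]))
mask_dark = cv2.inRange(hsv, np.array([0, 0, 0]), np.array([179, 255, 60]))

# Merge the masks, then detect the outlines on the combined result
mask = cv2.bitwise_or(mask_color, mask_dark)
contours = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
im = img.copy()
cv2.drawContours(im, contours, -1, (0, 255, 0), 2)
cv2.imwrite('contours_merged.png', im)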
Color channel thresholding is not the optimal solution when you are dealing with objects of many different colors (not known in advance) and a background color that is not necessarily distinctly different from all object colors. A combination of multiple thresholds/conditions could do the job for this particular image, but the same combination can fail on slightly different input, so I think this approach is generally not very robust.
I think the problem is elementary in nature, so I would recommend sticking to a simple approach. For example, if you apply the Sobel operator to your image, you get a result like the one below. The intensity of the result is weak on some borders, so I inverted the image colors to make it more visible.
There are tons of tutorials on the Sobel operator on the web, so I won't go into detail here. On your input image there is no noise, so the intensity outside and within the boxes is zero; I would therefore suggest masking out all zero values. If you do contour detection after that, you will end up with two contours per square - one on the inner side of the border and one on the outer side. If you only want contours on the outer border, see how contour hierarchy works in the OpenCV documentation. If you want the contour exactly on the border, combine the outer contour with erosion.
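A minimal sketch of that Sobel route, assuming a noise-free input where flat regions have exactly zero gradient:
import cv2
import numpy as np

img = cv2.imread('1.png', cv2.IMREAD_GRAYSCALE)

# Gradient magnitude from Sobel derivatives in x and y
gx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)
gy = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)
mag = cv2.magnitude(gx, gy)

# Mask out all zero values - on a noise-free image only borders remain
edges = ((mag > 0) * 255).astype(np.uint8)

# RETR_EXTERNAL keeps only the contour on the outer side of each border
contours = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
out = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
cv2.drawContours(out, contours, -1, (0, 255, 0), 2)
cv2.imwrite('sobel_contours.png', out)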
How can I draw a rectangle around the white object (in the mask window) that appears in the original cam (in the frame window)?
see image
my code:
import cv2
import numpy as np
cap = cv2.VideoCapture(0)
while(1):
    # Take each frame
    _, frame = cap.read()
    # Convert BGR to HSV
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # define range of red color in HSV
    lower_red = np.array([0,89,190])
    upper_red = np.array([180,255,255])
    # Threshold the HSV image to get only red colors
    mask = cv2.inRange(hsv, lower_red, upper_red)
    # Bitwise-AND mask and original image
    res = cv2.bitwise_and(frame, frame, mask=mask)
    cv2.imshow('frame', frame)
    cv2.imshow('mask', mask)
    cv2.imshow('res', res)
    k = cv2.waitKey(5) & 0xFF
    if k == 27:
        break
cv2.destroyAllWindows()
Sorry for my bad English, I'm doing my best to improve it.
Like Tiphel said, you can use cv2.findContours and cv2.drawContours. Alternatively, after getting the contours, you can draw a box using the cv2.boundingRect() function. This returns 4 values, say x, y, w and h: (x, y) is a point and w, h are the width and height of the rectangle respectively. You can then use cv2.rectangle to draw the rectangle. You can fit other shapes similarly too, like ellipses, circles, etc.
contours = cv2.findContours(a_thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]   # [-2] works in both OpenCV 3 and 4
cont_sorted = sorted(contours, key=cv2.contourArea, reverse=True)[:5]
x, y, w, h = cv2.boundingRect(cont_sorted[0])
cv2.rectangle(a, (x, y), (x+w, y+h), (0, 0, 255), 5)
Here, a_thresh is the binary image after thresholding the input image. In the cv2.rectangle() function, the first argument is the image to draw on, the fourth argument specifies the color, and the fifth the thickness of the line used to draw the rectangle.
Also, I use sorted to get the top 5 contours by area; ideally the object of interest would be the one with the largest area.
You can find documentation for these online. I suggest you go through the documentation for all functions used above to use it appropriately for your application!
Use cv2.findContours to find the object on your masked image then cv2.drawContours to display it.
Doc here
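As a minimal sketch (meant to sit inside the question's while loop, right after the mask is computed; frame and mask are the names from the question's code):
import cv2

# Find the white object in the mask and outline it on the original frame
contours = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
cv2.drawContours(frame, contours, -1, (0, 255, 0), 2)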
I am using Python and OpenCV. I am now using grabcut() to crop out the object I want. Here is my code:
img = cv2.imread('test.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
mask = np.zeros(img.shape[:2], np.uint8)
bgdModel = np.zeros((1, 65), np.float64)
fgdModel = np.zeros((1, 65), np.float64)
rect = (2,2,630,930)
cv2.grabCut(img,mask,rect,bgdModel,fgdModel,5,cv2.GC_INIT_WITH_RECT)
mask2 = np.where((mask==2)|(mask==0), 0,1).astype('uint8')
img = img*mask2[:,:, np.newaxis]
Afterwards, I try to find out the contour.
I have tried to find the contour by the code below:
imgray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
ret,thresh = cv2.threshold(imgray,127,255,0)
im2, contours, hierarchy = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
And it returns a contours array with length 48. When I draw this out:
First question is how can I get the contour (array) of this grab cut?
Second question: as you can see, the background color is black. How can I change the background color to white?
Thank you.
First, you need to get the background: subtract the masked foreground image from the original image. Then change the black background to white (or any color), and finally add the foreground image back.
import numpy as np
import cv2
cv2.namedWindow('image', cv2.WINDOW_NORMAL)
#Load the Image
imgo = cv2.imread('input.jpg')
height, width = imgo.shape[:2]
#Create a mask holder
mask = np.zeros(imgo.shape[:2], np.uint8)
#Grab Cut the object
bgdModel = np.zeros((1,65), np.float64)
fgdModel = np.zeros((1,65), np.float64)
#Hard Coding the Rect... The object must lie within this rect.
rect = (10, 10, width-30, height-30)
cv2.grabCut(imgo, mask, rect, bgdModel, fgdModel, 5, cv2.GC_INIT_WITH_RECT)
mask = np.where((mask==2)|(mask==0), 0, 1).astype('uint8')
img1 = imgo*mask[:,:,np.newaxis]
#Get the background
background = imgo - img1
#Change all pixels in the background that are not black to white
background[np.where((background > [0,0,0]).all(axis=2))] = [255,255,255]
#Add the background and the image
final = background + img1
#To be done - Smoothening the edges....
cv2.imshow('image', final)
k = cv2.waitKey(0)
if k == 27:
    cv2.destroyAllWindows()
Information taken from the site
https://nxtify.wordpress.com/2015/02/24/image-background-removal-using-opencv-in-python/
If you want a single contour-like boundary, you can run edge detection on the output of grabcut, then morphological dilation on the edge image to get a properly connected contour, from which you can get the array of boundary pixels.
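A rough sketch of that idea, assuming img is the masked grabcut result from the question (note the question converted it to RGB):
import cv2
import numpy as np

# Edge detection on the grabcut output, then dilation to join broken
# edge fragments into one connected boundary.
gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
edges = cv2.Canny(gray, 50, 150)          # thresholds are a guess
edges = cv2.dilate(edges, np.ones((5, 5), np.uint8), iterations=1)

# The largest external contour is the object boundary; its points are
# the array of boundary pixels.
contours = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2]
boundary = max(contours, key=cv2.contourArea)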
For making the background white, all pixels outside your bounding box can be made white by default. For the black pixels inside the bounding box, compare against the corresponding gray level in the original image: if it is black there too, keep it; otherwise make it white. If the original pixel is not black but was made black by grabcut, it is background. If a black pixel belongs to the foreground, grabcut never makes it black (in the ideal case).
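A hedged sketch of that rule (assumptions: imgo is the original image, img1 the grabcut output with the background zeroed, rect = (x, y, w, h) the grabcut box, and the "black" threshold of 10 is a guess):
import cv2
import numpy as np

x, y, w, h = rect
result = np.full_like(img1, 255)                # everything white by default
roi, orig_roi = img1[y:y+h, x:x+w], imgo[y:y+h, x:x+w]

# Inside the box: keep a pixel if grabcut kept it, or if it is black in the
# original too (genuine foreground black). Pixels blacked out by grabcut
# whose original is not black are background and stay white.
gray_orig = cv2.cvtColor(orig_roi, cv2.COLOR_BGR2GRAY)
keep = (roi.sum(axis=2) > 0) | (gray_orig < 10)
result[y:y+h, x:x+w][keep] = orig_roi[keep]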