I am attempting to create a model that can recognize green circles or green rectangular arrows as shown in the "image".
I figured I would need to make my own classifier and I am using Cascade Trainer to accomplish this. (See repo for the classification images and .xml file.)
Complete code and supporting files here:
https://github.com/ThePieMonster/Computer-Vision
import cv2
# Analyze Screenshots
print("Analyze Screenshots")
img = cv2.imread('Data/image.png')
classifier_path = 'Data/train/classifier/cascade.xml'
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
print("- load classifier")
cascade = cv2.CascadeClassifier(classifier_path)
print("- detect objects")
objects = cascade.detectMultiScale(image=img_rgb, scaleFactor=1.10, minNeighbors=3)
print("- draw rectangle around objects")
for (x, y, w, h) in objects:
    # cv2.rectangle(<image>, (x,y), (x+w,y+h), <rectangle BGR color>, <rectangle thickness>)
    img = cv2.rectangle(img, (x, y), (x+w, y+h), (0, 255, 0), 2)
cv2.imshow('img', img)
cv2.waitKey(0)
print()
As you can see in the image below, I can't get the classifier to recognize the green dots. There are about 41 green dots in the image and I am hoping to spot them all. Currently it marks a few red dots and some random map squares. Any help is much appreciated!
I need help with OpenCV
I have a picture of a complex form lying on the ground. I need to extract this form from the picture and clean it of noise. There is also a logo that I need to remove and 4 holes to identify. What could I do?
Original image
My code so far:
import cv2
import numpy as np
# Read the original image
img = cv2.imread('Amoebe_1.jpg')
# resize image
scale_down = 0.4
img = cv2.resize(img, None, fx= scale_down, fy= scale_down, interpolation= cv2.INTER_LINEAR)
# Display original image
cv2.imshow('Original', img)
cv2.waitKey(0)
# Denoising
dst = cv2.fastNlMeansDenoisingColored(img,None,20,10,10,21)
# Canny Edge Detection
edges = cv2.Canny(image=dst, threshold1=100, threshold2=200) # Canny Edge Detection
# Contour Detection
contours1, hierarchy1 = cv2.findContours(edges, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# draw contours on the original image for `CHAIN_APPROX_SIMPLE`
image_copy1 = img.copy()
cv2.drawContours(image_copy1, contours1, -1, (0, 255, 0), 2, cv2.LINE_AA)
# see the results
cv2.imshow('Simple approximation', image_copy1)
# Display Canny Edge Detection Image
cv2.imshow('Canny Edge Detection', edges)
cv2.waitKey(0)
#Floodfill
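h, w, chn = img.shape
# Pixel coordinates must be integers (seed is currently unused; the fill below starts at (0, 0))
seed = (w // 2, h // 2)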
mask = np.zeros((h+2,w+2),np.uint8)
bucket = edges.copy()
cv2.floodFill(bucket, mask, (0,0), (255,255,255))
cv2.imshow('Mask', bucket)
cv2.waitKey(0)
cv2.destroyAllWindows()
Having a shot at this in ImageJ, extracting the red channel from the raw image gives me this:
That is already close to a binary image. Running a small (3-pixel) median filter and thresholding gives this binary image:
Running cv.findContours() on that last one and analysing contour areas should give you the little holes and the "eye". Use cv.drawContours() with the bigger objects to fill up the eye and logo area, maybe dilate() to fill small discrepancies.
I have a borderless table like the one in the picture.
I tried to use the example from this post.
I got this result:
But I need something like this:
How can I tune OpenCV to get the result I need?
You can achieve this easily with the image_to_data method of pytesseract.
You also need to know:
Pre-processing methods
Page-segmentation-modes
Steps:
Load the image in BGR mode and convert it to gray-scale.
Get the region-of-interest areas.
From each area, get the coordinates and draw a rectangle.
Result:
Code:
# Load the library
import cv2
import pytesseract
# Load the image
img = cv2.imread("1Tksb.jpg")
# Convert to gray-scale
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# OCR detection
d = pytesseract.image_to_data(gry, config="--psm 6", output_type=pytesseract.Output.DICT)
# Get ROI part from the detection
n_boxes = len(d['level'])
# For each detected part
for i in range(n_boxes):
    # Get the localized region
    (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])
    # Draw rectangle to the detected region
    cv2.rectangle(img, (x, y), (x+w, y+h), (0, 0, 255), 1)
# Display
cv2.imshow("img", img)
cv2.waitKey(0)
If you want to read the text itself, you can use the image_to_string method.
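The loop above draws a box for every entry image_to_data returns, which includes page, block, paragraph and line entries as well as words. If only the words are wanted, something like the sketch below could filter the dictionary first. The helper name `word_boxes` and the `min_conf` cutoff are assumptions for illustration, not part of pytesseract.

```python
def word_boxes(data, min_conf=0):
    """Return (x, y, w, h, text) for each non-empty word.

    data is the dict returned by pytesseract.image_to_data with
    output_type=pytesseract.Output.DICT.
    min_conf is a hypothetical confidence cutoff.
    """
    boxes = []
    for i, text in enumerate(data["text"]):
        # Structural entries (page, block, paragraph, line) carry empty
        # text and a confidence of -1, so they are skipped here.
        if not text.strip():
            continue
        # conf may arrive as str, int or float depending on the version.
        if float(data["conf"][i]) < min_conf:
            continue
        boxes.append((data["left"][i], data["top"][i],
                      data["width"][i], data["height"][i], text))
    return boxes
```

With the variables from the code above, `for x, y, w, h, _ in word_boxes(d): cv2.rectangle(img, (x, y), (x+w, y+h), (0, 0, 255), 1)` would then draw only the word boxes.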
I have images of Math question papers which have multiple questions per page. Example:
Math questions image
I want to use Python to extract the contents of each question separately and store them in a database table. From my research, I have a rough idea for my workflow: Pre-process image --> Find contours of each question --> Snip and send those individual images to pyTesseract --> Store the transcribed text.
I was very happy to find a great thread about a similar problem, but when I tried that approach on my image, the ROI that was identified covered the whole page. In other words, it identified all the questions as one block of text.
How do I make OpenCV recognize multiple ROIs within a page and draw bounding boxes? Is there something different to be done during the pre-processing?
Please suggest an approach - thanks so much!
First, convert the image to grayscale.
Apply Otsu's threshold, which binarizes the image well and removes the background.
Specify the structuring element shape and kernel size. The kernel size increases or decreases the area of the rectangles that will be detected.
Apply dilation to the thresholded image with the kernel; dilation makes the text regions thicker.
Find the contours.
Loop through the identified contours; a rectangle can then be drawn around each one with the cv2.rectangle method.
import cv2
img = cv2.imread("text.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray,(5,5),0)
ret, thresh1 = cv2.threshold(blur, 0, 255, cv2.THRESH_OTSU + cv2.THRESH_BINARY_INV)
rect_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (18, 18))
dilation = cv2.dilate(thresh1, rect_kernel, iterations = 1)
contours, hierarchy = cv2.findContours(dilation, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_NONE)
for cnt in contours:
    x, y, w, h = cv2.boundingRect(cnt)
    # Drawing a rectangle on the image
    rect = cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite('drawed.png', img)
Sample output image
Good afternoon. I currently have some code that detects eyes and faces using Haar cascades. I was curious whether anybody knows how to get the program to recognize movement of the head (e.g. a nod) or of the eye (e.g. a blink).
Here is what I currently have:
import cv2
import numpy as np
"""
Created on Mon Mar 2 11:38:49 2020
#author: bradl
"""
# Open Camera
camera = cv2.VideoCapture(0)
camera.set(10, 200)
face_cascade = cv2.CascadeClassifier('haarcascades/face.xml')
##smile = cv2.CascadeClassifier('haarcascades/smile.xml')
eye_cascade = cv2.CascadeClassifier('haarcascades/eye.xml')
while True:
    ret, img = camera.read()
    ## stop if no frame could be read from the camera
    if not ret:
        break
    ## converts to gray
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    ## determines what a face is and how it is found
    faces = face_cascade.detectMultiScale(gray, 1.3, 5)
    for (x, y, w, h) in faces:
        ## Determines the starting and ending co-ordinates for a blue rectangle to be drawn around the face
        cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)
        ## Declares the region of the image where the eyes will be
        roi_gray = gray[y:y+h, x:x+w]
        roi_color = img[y:y+h, x:x+w]
        ## Determines what an eye is based on the eye haar cascade xml file
        eyes = eye_cascade.detectMultiScale(roi_gray)
        for (ex, ey, ew, eh) in eyes:
            ## Draws green rectangles around the co-ordinates for eyes
            cv2.rectangle(roi_color, (ex, ey), (ex+ew, ey+eh), (0, 255, 0), 2)
    ## Displays camera
    cv2.imshow('Image', img)
    ## Requires the user to press escape to exit the program
    k = cv2.waitKey(40)
    if k == 27:
        break
Does anybody have any ideas to get the program to recognize head or eye movement?
There are a number of ways to detect eye blinking.
In the eye ROI, apply white color detection.
Draw contours around this mask. If the area of the contour is above a threshold, you can interpret the eye as open; a sudden change in the contour area marks a blink.
This method would fail if you move towards or away from the camera.
Another way of doing this would be facial landmark detection using a library like dlib.
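For the contour-area approach, the "sudden change" test could look something like the sketch below. It assumes you already collect one contour area per frame from the white mask in the eye ROI; the function name `find_blinks` and the `drop_ratio` value are hypothetical and would need tuning.

```python
def find_blinks(areas, drop_ratio=0.5):
    """Return the frame indices at which a blink starts.

    areas: one contour area per frame (0 when no contour is found).
    drop_ratio: hypothetical setting; the eye counts as closed when the
    area falls below drop_ratio times the last open-eye area.
    """
    blinks = []
    baseline = None  # area seen in the most recent open-eye frame
    closed = False
    for i, area in enumerate(areas):
        if baseline is None:
            # Wait for the first frame with a visible eye.
            if area > 0:
                baseline = area
            continue
        if not closed and area < baseline * drop_ratio:
            # Sharp drop relative to the last open frame: blink starts.
            blinks.append(i)
            closed = True
        elif area >= baseline * drop_ratio:
            # Eye is open (again); follow slow changes in apparent size.
            closed = False
            baseline = area
    return blinks
```

Because the baseline follows the most recent open-eye frame, a gradual change in size (for example when leaning slowly towards the camera) should not register as a blink, although fast movement could still cause false detections.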
So I'm trying to recognize a region that's already been defined by a bounding box. Example:
Some of the areas within these rectangles in these images are white and some are black, and most of them are completely different sizes. The only common characteristic between these images is the red rectangle:
Essentially what I'm trying to do is create a randomly generated meme bot that places a random source image in the region defined by these rectangles. I already have tons of these images with areas predefined by these red rectangles. I want to automate the process somehow; currently every resize shape and offset has to be defined by hand for each template. So what I need to do is recognize the area within the rectangle and have it return the resize shape and offset needed to place the source image.
How should I go about this? Should I use something in OpenCV or am I going to have to train a CNN? Just really looking for a push in the right direction because I'm pretty lost as to the best approach to this problem.
I think OpenCV can do it. Below is a short example of the steps for what you need. Read the comments in the code for more details.
import cv2
import numpy as np
img = cv2.imread("1.jpg")
#STEP1: get only red color (or the bounding box color) in the image
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# define range of red color in HSV
lower_red = np.array([0,50,50])
upper_red = np.array([0,255,255])
# Threshold the HSV image to get only red colors
mask = cv2.inRange(hsv, lower_red, upper_red)
red_only = cv2.bitwise_and(img,img, mask= mask)
#STEP2: find contour
gray_img = cv2.cvtColor(red_only,cv2.COLOR_BGR2GRAY)
_,thresh = cv2.threshold(gray_img,1,255,cv2.THRESH_BINARY)
# findContours returns 3 values in OpenCV 3 and 2 in OpenCV 4; [-2:] works with both
contours, _ = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2:]
#max contour in the image is the box you want
areas = [cv2.contourArea(c) for c in contours]
sorted_areas = np.sort(areas)
cnt=contours[areas.index(sorted_areas[-1])]
r = cv2.boundingRect(cnt)
cv2.rectangle(img,(r[0],r[1]),(r[0]+r[2],r[1]+r[3]),(0,255,0),3)
cv2.imshow("img",img)
cv2.imshow("red_only",red_only)
cv2.imshow("thresh",thresh)
cv2.waitKey()
cv2.destroyAllWindows()