I have some colored rectangles in my image that I successfully recognize by HSV thresholding. The result looks like this:
Now I want to detect the big blob as one point. I tried it with cv2.SimpleBlobDetector() and custom parameters:
import cv2
import numpy as np
mask = cv2.imread('mask.png')
original = cv2.imread('original.png')
params = cv2.SimpleBlobDetector_Params()
# thresholds
params.minThreshold = 10
params.maxThreshold = 200
#params.thresholdStep = 20
# filter by area
params.filterByArea = True
params.minArea = 1
params.maxArea = 10000
# filter by circularity
params.filterByCircularity = False
# filter by convexity
params.filterByConvexity = False
# filter by inertia
params.filterByInertia = False
detector = cv2.SimpleBlobDetector(params)
keypoints = detector.detect(mask)
img_keypoints = cv2.drawKeypoints(original, keypoints, np.array([]), (0,0,255), cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
cv2.imwrite('keypoints.png', img_keypoints)
This is what the result and the original picture look like:
I would expect a red point sitting in the center of the green point.
How can I fix this? Help is very appreciated.
EDIT: I forgot to mention: is it really necessary for cv2.SimpleBlobDetector() to generate multiple binary images, since I already have a binary image as input? Is it OK to change the values to the following:
params.minThreshold = 127
params.maxThreshold = 127
to reduce unnecessary CPU usage by generating binary images?
EDIT2: Please note that I'm using OpenCV 2, not 3.
Thank you.
By default, cv2.SimpleBlobDetector() looks for dark blobs. In your case the rectangle in the mask is white while the rest of the image is dark, so the detector is unable to find any blobs with the custom parameters you set.
I just made a few changes to the existing code:
Read the mask as a grayscale image, not a color image:
mask = cv2.imread('mask.png', 0)
Convert the mask to a binary image with the rectangle highlighted in dark:
ret, mask = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY_INV)
Proceeding from here with your code gave the result you expected.
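Putting both changes together, a minimal sketch of the adjusted pipeline (using the OpenCV 2 API and the parameter values from your question) would be something like:
import cv2
import numpy as np

# Read the mask as grayscale and the original in color
mask = cv2.imread('mask.png', 0)
original = cv2.imread('original.png')

# Invert the mask so the blob becomes dark on a light background
ret, mask = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY_INV)

params = cv2.SimpleBlobDetector_Params()
params.minThreshold = 10
params.maxThreshold = 200
params.filterByArea = True
params.minArea = 1
params.maxArea = 10000
params.filterByCircularity = False
params.filterByConvexity = False
params.filterByInertia = False

detector = cv2.SimpleBlobDetector(params)  # OpenCV 2; use SimpleBlobDetector_create on OpenCV 3+
keypoints = detector.detect(mask)

img_keypoints = cv2.drawKeypoints(original, keypoints, np.array([]), (0, 0, 255),
                                  cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
cv2.imwrite('keypoints.png', img_keypoints)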
Result:
I play a sport called "Stockschiessen" in Austria. It's similar to curling. I had the idea of automating the detection of the winner, so I started coding something with OpenCV in Python and ran into a big problem: I can detect circles, but the "Stocks" won't be directly under the camera, so I have to detect ellipses or ovals. I tried a lot but found no solution. I just need a way to detect these; after that I can calculate the distances of each "Stock".
The last code I tried was from GeeksforGeeks:
import cv2
import numpy as np
# Load image
image = cv2.imread('C://gfg//images//blobs.jpg', 0)
# Set our filtering parameters
# Initialize parameter setting using cv2.SimpleBlobDetector
params = cv2.SimpleBlobDetector_Params()
# Set Area filtering parameters
params.filterByArea = True
params.minArea = 100
# Set Circularity filtering parameters
params.filterByCircularity = True
params.minCircularity = 0.9
# Set Convexity filtering parameters
params.filterByConvexity = True
params.minConvexity = 0.2
# Set inertia filtering parameters
params.filterByInertia = True
params.minInertiaRatio = 0.01
# Create a detector with the parameters
detector = cv2.SimpleBlobDetector_create(params)
# Detect blobs
keypoints = detector.detect(image)
# Draw blobs on our image as red circles
blank = np.zeros((1, 1))
blobs = cv2.drawKeypoints(image, keypoints, blank, (0, 0, 255),
                          cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
number_of_blobs = len(keypoints)
text = "Number of Circular Blobs: " + str(len(keypoints))
cv2.putText(blobs, text, (20, 550),
            cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 100, 255), 2)
# Show blobs
cv2.imshow("Filtering Circular Blobs Only", blobs)
cv2.waitKey(0)
cv2.destroyAllWindows()
And here are example images: one with the "Stocks" directly under the camera, and a second one that is a bit off-angle. It should find the ellipses in the second image.
Example 1
Example 2
Edit:
Here are the two pictures after grayscale conversion and thresholding. I think the circles can be seen well.
BlackAndWhite of Example1
BlackAndWhite of Example2
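For reference, a minimal sketch of the grayscale-and-threshold preprocessing behind these black-and-white versions (the file name and the threshold value 127 are assumptions to adjust for your lighting):
import cv2

img = cv2.imread('example1.jpg')                             # hypothetical file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)                 # convert to grayscale
ret, bw = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)   # simple global threshold
cv2.imwrite('example1_bw.png', bw)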
I'm trying to make a general object-counting algorithm using Python and OpenCV (open to trying other methods), but I can't seem to get a good count on a variety of objects and don't know how to accommodate for that.
https://imgur.com/a/yAkRxWH are some example test images.
This is to speed up inventory counting of smaller objects.
EDIT:
This is my current code (simple blob detector)
# Standard imports
import cv2
import numpy as np;
# Read image
im = cv2.imread("./images/screw_simple.jpg", cv2.IMREAD_GRAYSCALE)
im = cv2.resize(im, (1440, 880))
# Setup SimpleBlobDetector parameters.
params = cv2.SimpleBlobDetector_Params()
# Change thresholds
params.minThreshold = 10 #10
params.maxThreshold = 200 #200
# Filter by Area.
params.filterByArea = True # True
params.minArea = 500 #1500
# Filter by Circularity
params.filterByCircularity = True #True
params.minCircularity = 0.1 #0.1
# Filter by Convexity
params.filterByConvexity = True #True
params.minConvexity = 0.0 #0.87
# Filter by Inertia
params.filterByInertia = True #True
params.minInertiaRatio = 0.0 #0.01
# Create a detector with the parameters
ver = (cv2.__version__).split('.')
if int(ver[0]) < 3:
    detector = cv2.SimpleBlobDetector(params)
else:
    detector = cv2.SimpleBlobDetector_create(params)
# Detect blobs.
keypoints = detector.detect(im)
# Draw detected blobs as red circles.
# cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS ensures
# the size of the circle corresponds to the size of blob
total_count = 0
for i in keypoints:
    total_count = total_count + 1
im_with_keypoints = cv2.drawKeypoints(im, keypoints, np.array([]), (0, 0, 255), cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
# Show blobs
cv2.imshow("Keypoints", im_with_keypoints)
cv2.waitKey(0)
print(total_count)
Here are the results I'm getting: https://imgur.com/a/id6OlIA
How can I improve this algorithm to get better detection across a general range of objects, without having to modify the parameters for each object type?
You can try an OpenCV approach: you could use a SimpleBlobDetector.
Obviously this is a test image and the result I got is also not perfect, since there are a lot of hyperparameters to set. The hyperparameters make it pretty flexible, so it is a decent place to start from.
This is what the Detector does (see details here):
Thresholding: Convert the source images to several binary images by thresholding the source image with thresholds starting at minThreshold. These thresholds are incremented by thresholdStep until maxThreshold. So the first threshold is minThreshold, the second is minThreshold + thresholdStep, the third is minThreshold + 2 x thresholdStep, and so on.
Grouping: In each binary image, connected white pixels are grouped together. Let’s call these binary blobs.
Merging: The centers of the binary blobs in the binary images are computed, and blobs located closer than minDistBetweenBlobs are merged.
Center & Radius Calculation: The centers and radii of the new merged blobs are computed and returned.
Find the code below the image.
# Standard imports
import cv2
import numpy as np
# Read image
im = cv2.imread("petri.png", cv2.IMREAD_COLOR)
# Setup SimpleBlobDetector parameters.
params = cv2.SimpleBlobDetector_Params()
# Change thresholds
params.minThreshold = 0
params.maxThreshold = 255
# Set edge gradient
params.thresholdStep = 5
# Filter by Area.
params.filterByArea = True
params.minArea = 10
# Create a detector with the parameters
detector = cv2.SimpleBlobDetector_create(params)
# Detect blobs.
keypoints = detector.detect(im)
# Draw detected blobs as red circles.
# cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS ensures the size of the circle corresponds to the size of blob
im_with_keypoints = cv2.drawKeypoints(im, keypoints, np.array([]), (0, 0, 255),
cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
# Show keypoints
cv2.imshow("Keypoints", im_with_keypoints)
cv2.waitKey(0)
For better readability I would rather put this in a second answer: you could use a segmentation approach, e.g. the watershed algorithm.
Any grayscale image can be viewed as a topographic surface where high intensity denotes peaks and hills while low intensity denotes valleys. You start filling every isolated valley (local minimum) with differently colored water (labels). As the water rises, depending on the peaks (gradients) nearby, water from different valleys, obviously with different colors, will start to merge. To avoid that, you build barriers at the locations where the water merges. You continue filling water and building barriers until all the peaks are under water. The barriers you have built then give you the segmentation result. This is the "philosophy" behind the watershed.
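A minimal sketch of the marker-based watershed workflow in OpenCV, following the standard tutorial (the file name, kernel size, and the 0.5 distance-transform factor are assumptions you would tune for your images):
import cv2
import numpy as np

img = cv2.imread('objects.png')                      # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# assumes dark objects on a light background; drop THRESH_BINARY_INV otherwise
ret, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Remove noise, then find the sure background by dilation
kernel = np.ones((3, 3), np.uint8)
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=2)
sure_bg = cv2.dilate(opening, kernel, iterations=3)

# Sure foreground: peaks of the distance transform
dist = cv2.distanceTransform(opening, cv2.DIST_L2, 5)
ret, sure_fg = cv2.threshold(dist, 0.5 * dist.max(), 255, 0)
sure_fg = np.uint8(sure_fg)
unknown = cv2.subtract(sure_bg, sure_fg)

# Label markers, reserve 0 for the unknown region, then flood
ret, markers = cv2.connectedComponents(sure_fg)
markers = markers + 1
markers[unknown == 255] = 0
markers = cv2.watershed(img, markers)
img[markers == -1] = [0, 0, 255]                     # watershed boundaries in red

print('objects found:', markers.max() - 1)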
I have some old bank statements as scans and would like to use Google's Tesseract engine to extract the text. It works pretty well unless the image is slightly rotated. I thought of detecting the dashed lines in order to estimate the slope and then the angle of rotation. However, it is tricky to get the parameters right.
If I could get rid of the large line artefact, I might use the minimum rotated bounding box (cv2.minAreaRect) on the text characters.
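As a rough sketch of that idea (assuming the line artefact is already removed; the file name is a placeholder), estimating and correcting the skew with cv2.minAreaRect might look like this:
import cv2
import numpy as np

img = cv2.imread('statement.png')                    # placeholder file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(cv2.bitwise_not(gray), 0, 255,
                       cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]

# Minimum rotated bounding box around all foreground (text) pixels
coords = np.column_stack(np.where(thresh > 0)).astype(np.float32)
angle = cv2.minAreaRect(coords)[-1]
# note: minAreaRect's angle convention changed in OpenCV 4.5; this follows the older (-90, 0] range
if angle < -45:
    angle = -(90 + angle)
else:
    angle = -angle

# Rotate the image around its center by the estimated angle
(h, w) = img.shape[:2]
M = cv2.getRotationMatrix2D((w // 2, h // 2), angle, 1.0)
deskewed = cv2.warpAffine(img, M, (w, h), flags=cv2.INTER_CUBIC,
                          borderMode=cv2.BORDER_REPLICATE)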
Maybe another strategy is better suited? Any ideas?
An example image (deleted some characters for data protection):
EDIT: I have found a solution which seems to work. However, I am still wondering if there might be a faster solution (it takes about 1.5 seconds per image).
I use template matching from skimage with the following template:
import cv2
import numpy as np
import matplotlib.pyplot as plt
from skimage.color import rgb2gray
from skimage.feature import match_template
from skimage.filters import threshold_mean

template = plt.imread('template_long.png')
template = rgb2gray(template)
template = template > threshold_mean(template)
for i in range(1):
    # read in image
    img = cv2.imread('conversion/umsatz_{}.png'.format(i))
    # convert to grayscale
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    gray = cv2.bitwise_not(gray)
    # threshold the image, setting all foreground pixels to
    # 255 and all background pixels to 0
    thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
    # edge detection
    #edges = cv2.Canny(thresh, 2, 100, apertureSize=3)
    # fill the holes from detected edges
    #kernel = np.ones((2,2), np.uint8)
    #dilate = cv2.dilate(thresh, kernel, iterations=1)
    result = match_template(thresh, template)
    mask = result < 0.5
    r = result.copy()
    r[mask] = 0
    r[~mask] = 1
    plt.imshow(r)
I am very new to OpenCV Python and I really need some help here.
So what I am trying to do here is to extract out these words in the image below.
The words and shapes are all hand drawn, so they are not perfect. I have done some coding below.
First of all, I convert the image to grayscale:
img = cv2.imread(file_name)
img2gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
Then I use THRESH_BINARY_INV to bring out the content:
ret, new_img = cv2.threshold(img2gray, 100, 255, cv2.THRESH_BINARY_INV)
After which, I dilate the content
kernel = cv2.getStructuringElement(cv2.MORPH_CROSS,(3 , 3))
dilated = cv2.dilate(new_img,kernel,iterations = 3)
I dilate the image so that each piece of text can be identified as one cluster.
After that, I find the contours, apply boundingRect to each one, and draw the rectangle:
contours, hierarchy = cv2.findContours(dilated,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_NONE) # get contours
index = 0
for contour in contours:
    # get rectangle bounding contour
    [x, y, w, h] = cv2.boundingRect(contour)

    # don't plot small false positives that aren't text
    if w < 10 or h < 10:
        continue

    # draw rectangle around contour on original image
    cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 255), 2)
This is what I got after that.
I am only able to detect one of the words. I have tried many other methods, but this is the closest result I have got and it does not fulfill the requirement.
The reason I want to identify the text is so that I can get the X and Y coordinates of each word in this image from its bounding rectangle (boundingRect()).
Please help me out. Thank you so much
You can use the fact that the connected components of the letters are much smaller than the large strokes of the rest of the diagram.
I used OpenCV 3's connected components in the code, but you can do the same thing using findContours (see the rough sketch after the code below).
The code:
import cv2
import numpy as np
# Params
maxArea = 150
minArea = 10
# Read image
I = cv2.imread('i.jpg')
# Convert to gray
Igray = cv2.cvtColor(I,cv2.COLOR_RGB2GRAY)
# Threshold
ret, Ithresh = cv2.threshold(Igray,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
# Keep only small components, but not too small
comp = cv2.connectedComponentsWithStats(Ithresh)
labels = comp[1]
labelStats = comp[2]
labelAreas = labelStats[:,4]
for compLabel in range(1, comp[0], 1):
    if labelAreas[compLabel] > maxArea or labelAreas[compLabel] < minArea:
        labels[labels == compLabel] = 0
labels[labels > 0] = 1
# Do dilation
se = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(25,25))
IdilateText = cv2.morphologyEx(labels.astype(np.uint8),cv2.MORPH_DILATE,se)
# Find connected component again
comp = cv2.connectedComponentsWithStats(IdilateText)
# Draw a rectangle around the text
labels = comp[1]
labelStats = comp[2]
#labelAreas = labelStats[:,4]
for compLabel in range(1, comp[0], 1):
    cv2.rectangle(I, (labelStats[compLabel, 0], labelStats[compLabel, 1]),
                  (labelStats[compLabel, 0] + labelStats[compLabel, 2],
                   labelStats[compLabel, 1] + labelStats[compLabel, 3]), (0, 0, 255), 2)
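As mentioned above, the area filtering could also be done with findContours instead of connected components; a rough equivalent sketch (reusing Ithresh, minArea and maxArea from the code above, with contourArea standing in for the component area):
# note: the [-2] index picks the contour list across OpenCV 2.x/3.x/4.x return conventions
contours = cv2.findContours(Ithresh.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
small = np.zeros_like(Ithresh)
for cnt in contours:
    area = cv2.contourArea(cnt)
    if minArea <= area <= maxArea:
        cv2.drawContours(small, [cnt], -1, 255, -1)   # keep only letter-sized components
# 'small' can then be dilated and boxed exactly as in the code above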
I wrote a little script to transform pictures of chalkboards into a form that I can print off and mark up.
I take an image like this:
Auto-crop it, and binarize it. Here's the output of the script:
I would like to remove the largest connected black regions from the image. Is there a simple way to do this?
I was thinking of eroding the image to eliminate the text and then subtracting the eroded image from the original binarized image, but I can't help thinking that there's a more appropriate method.
Sure, you can just get the connected components (of a certain size) with findContours or floodFill and erase them, leaving some smear. However, if you want to do it right, you should think about why you have the black area in the first place.
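A rough sketch of that first option (the file name and the area threshold are assumptions to tune for your images):
import cv2
import numpy as np

bw = cv2.imread('binarized.png', 0)                   # placeholder: the binarized board, text and blobs dark
inv = cv2.bitwise_not(bw)                             # make the black regions white for contour finding
contours = cv2.findContours(inv, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
for cnt in contours:
    if cv2.contourArea(cnt) > 5000:                   # assumed "large blob" threshold
        cv2.drawContours(bw, [cnt], -1, 255, -1)      # paint the large black regions white
cv2.imwrite('cleaned.png', bw)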
You did not use adaptive thresholding (locally adaptive) and this made your output sensitive to shading. Try not to get the black region in the first place by running something like this:
Mat img = imread("desk.jpg", 0);
Mat img2, dst;
pyrDown(img, img2);
adaptiveThreshold(255 - img2, dst, 255, ADAPTIVE_THRESH_MEAN_C, THRESH_BINARY, 9, 10);
imwrite("adaptiveT.png", dst);
imshow("dst", dst);
waitKey(-1);
In the future, you may want to read about adaptive thresholds and how to sample colors locally. I personally found it useful to sample binary colors orthogonally to the image gradient (that is, on both sides of it). This way the samples of white and black are of equal size, which is a big deal, since typically there is more background color, which biases the estimation. Using SWT and MSER may give you even more ideas about text segmentation.
I tried this:
import numpy as np
import cv2
im = cv2.imread('image.png')
gray = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
grayout = 255*np.ones((im.shape[0],im.shape[1],1), np.uint8)
blur = cv2.GaussianBlur(gray,(5,5),1)
thresh = cv2.adaptiveThreshold(blur,255,1,1,11,2)
contours,hierarchy = cv2.findContours(thresh,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)
wcnt = 0
for item in contours:
    area = cv2.contourArea(item)
    print(wcnt, area)
    [x, y, w, h] = cv2.boundingRect(item)
    if area > 10 and area < 200:
        roi = gray[y:y+h, x:x+w]
        cntd = 0
        for i in range(x, x+w):
            for j in range(y, y+h):
                if gray[j, i] == 0:
                    cntd = cntd + 1
        density = cntd / float(h*w)
        if density < 0.5:
            for i in range(x, x+w):
                for j in range(y, y+h):
                    grayout[j, i] = gray[j, i]
            wcnt = wcnt + 1
cv2.imwrite('result.png',grayout)
You have to balance two things: removing the black spots while not losing the contents of what is on the board. The output I got is this:
Here is a Python/numpy implementation (using my own mahotas package) of the method from the top answer (almost the same, I think):
import mahotas as mh
import numpy as np
Import mahotas & numpy with standard abbreviations.
im = mh.imread('7Esco.jpg', as_grey=1)
Load the image & convert to gray
im2 = im[::2,::2]
im2 = mh.gaussian_filter(im2, 1.4)
Downsample and blur (for speed and noise removal).
im2 = 255 - im2
Invert the image
mean_filtered = mh.convolve(im2.astype(float), np.ones((9,9))/81.)
Mean filtering is implemented "by hand" with a convolution.
imc = im2 > mean_filtered - 4
You might need to adjust the number 4 here, but it worked well for this image.
mh.imsave('binarized.png', (imc*255).astype(np.uint8))
Convert to 8 bits and save in PNG format.