Opencv find Contours on image - python

I am writing code that detects a book in an image. The first step is to find the contour in the image, but I have a problem with some books. Sometimes I can't detect the contour correctly (a book is a rectangle, so I just look for a contour with 4 corners) because the edges are not detected completely and there are gaps between them, as shown in the image. Is there a way to extend the detected edges?
This is my code:
import cv2
imgg = cv2.imread(r'\book.jpg')  # raw string, so '\b' is not read as a backspace escape
gray = cv2.cvtColor(imgg, cv2.COLOR_BGR2GRAY)
gray = cv2.bilateralFilter(gray, 11, 17, 17)
edged = cv2.Canny(gray , 10, 250)
(cnts, _) = cv2.findContours(edged.copy(), cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
total = 0
#binary = cv2.bitwise_not(gray)
for c in cnts:
    area = cv2.contourArea(c)
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.03 * peri, True)
    # keep only large contours that simplify to four corners
    if (len(approx) == 4) and (area > 100000):
        cv2.drawContours(imgg, [approx], -1, (0, 255, 0), 4)
cv2.imshow('image',imgg)
cv2.waitKey(0)
cv2.destroyAllWindows()

Here is a quick example of thresholding; remember to put a test.png in the same folder as the following script. Apply it before findContours and it should give a significant improvement. Otherwise, look up Otsu's binarization.
import cv2 as cv
import numpy as np
from matplotlib import pyplot as plt
img = cv.imread('test.png',0)
img = cv.medianBlur(img,5)
ret,th1 = cv.threshold(img,127,255,cv.THRESH_BINARY)
th2 = cv.adaptiveThreshold(img,255,cv.ADAPTIVE_THRESH_MEAN_C,\
cv.THRESH_BINARY,11,2)
th3 = cv.adaptiveThreshold(img,255,cv.ADAPTIVE_THRESH_GAUSSIAN_C,\
cv.THRESH_BINARY,11,2)
titles = ['Original Image', 'Global Thresholding (v = 127)',
'Adaptive Mean Thresholding', 'Adaptive Gaussian Thresholding']
images = [img, th1, th2, th3]
for i in range(4):  # range replaces the Python 2 xrange
    plt.subplot(2,2,i+1),plt.imshow(images[i],'gray')
    plt.title(titles[i])
    plt.xticks([]),plt.yticks([])
plt.show()
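For the Otsu suggestion above, here is a minimal NumPy sketch of what Otsu's binarization computes: the gray level that maximizes the between-class variance of the "dark" and "bright" pixels. The function name `otsu_threshold` and the synthetic image are illustrative assumptions, not part of the answer; in practice `cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)` does this for you.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the gray level t that maximizes between-class variance (pixels <= t vs. > t)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                    # weight of the dark class for each candidate t
    mu = np.cumsum(prob * np.arange(256))      # cumulative mean up to each candidate t
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    sigma_b = np.zeros(256)
    valid = denom > 0                          # skip candidates that leave one class empty
    sigma_b[valid] = (mu_total * omega[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(sigma_b))

# Synthetic stand-in for a blurred grayscale photo: bright rectangle on a dark background
gray = np.full((100, 100), 50, dtype=np.uint8)
gray[30:70, 20:80] = 200

t = otsu_threshold(gray)
binary = np.where(gray > t, 255, 0).astype(np.uint8)
```

The resulting `binary` image is the kind of input that findContours handles well, because each region is a solid blob rather than a broken edge.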

Edge detection often works poorly. Here you discard highly relevant information, which is the color contrast.
Binarization of the saturation component will be much more effective.

Related

How should I properly extract the digits from a 7-segment display in Python

I am working on a project about extracting the digit from the 7-segment display and I am following this guide: https://pyimagesearch.com/2017/02/13/recognizing-digits-with-opencv-and-python/
Firstly, I have successfully extracted the ROI of the LED display, but I have some difficulty generating the black-and-white image needed for `cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)` to find the digits.
What should I do to generate a usable thresholded image when part of the display is in shadow?
Original photo:
The extracted black white photo:
Code:
import cv2
import imutils
from imutils.perspective import four_point_transform
img_name = 'test2.jpeg'
image = cv2.imread(img_name)
image = imutils.resize(image, height=1000)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(blurred, 50, 200, 255)
#cv2.imshow("test", edged)
#cv2.waitKey(0)
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
displayCnt = None
# loop over the contours
for c in cnts:
    # approximate the contour
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.02 * peri, True)
    # if the contour has four vertices, then we have found
    # the thermostat display
    if len(approx) == 4:
        displayCnt = approx
        break
warped = four_point_transform(gray, displayCnt.reshape(4, 2))
output = four_point_transform(image, displayCnt.reshape(4, 2))
# with THRESH_OTSU the threshold is chosen automatically, so the 222 is ignored
thresh = cv2.threshold(warped, 222, 255, cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)[1]
cv2.imwrite("black.png", thresh)
Due to different parts of the image having different overall brightness levels, a global threshold will result in some parts of the image having a threshold that's too low and some too high. This can be remedied by using a median filter on the image to determine local thresholds for the entire image. Here are the steps described (and demonstrated using Paint.NET).
Apply a median filter to the image
Take the difference between the original image and the filtered image and convert it to grayscale
Use a global threshold on this new image
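Regarding the brightness difference, division normalization and sharpening can be applied:
import numpy as np
from skimage import filters
smooth = cv2.GaussianBlur(warped, (95,95), 0)
division = cv2.divide(warped, smooth, scale=255)
# multichannel and preserve_range are left at their defaults (both False in the original call);
# the multichannel keyword is deprecated in newer scikit-image releases
sharp = filters.unsharp_mask(division, radius=1.5, amount=1.5)
sharp = (255*sharp).clip(0,255).astype(np.uint8)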
Output:

How to rotate an image to align the text for extraction?

I am using pytesseract to extract text from images, but it doesn't work on images that are inclined. Consider the image given below:
Here is the code to extract text, which is working fine on images which are not inclined.
import cv2
import numpy as np
import pytesseract
from google.colab.patches import cv2_imshow  # assumption: the code runs in Colab, which provides cv2_imshow
img = cv2.imread(<path_to_image>)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (5,5),0)
ret3, thresh = cv2.threshold(blur,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
def findSignificantContours (img, edgeImg):
    contours, heirarchy = cv2.findContours(edgeImg, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
    # Find level 1 contours
    level1 = []
    for i, tupl in enumerate(heirarchy[0]):
        # Each array is in format (Next, Prev, First child, Parent)
        # Filter the ones without parent
        if tupl[3] == -1:
            tupl = np.insert(tupl, 0, [i])
            level1.append(tupl)
    significant = []
    tooSmall = edgeImg.size * 5 / 100 # If contour isn't covering 5% of total area of image then it probably is too small
    for tupl in level1:
        contour = contours[tupl[0]];
        area = cv2.contourArea(contour)
        if area > tooSmall:
            significant.append([contour, area])
            # Draw the contour on the original image
            cv2.drawContours(img, [contour], 0, (0,255,0),2, cv2.LINE_AA, maxLevel=1)
    significant.sort(key=lambda x: x[1])
    #print ([x[1] for x in significant]);
    mx = (0,0,0,0) # biggest bounding box so far
    mx_area = 0
    for cont in contours:
        x,y,w,h = cv2.boundingRect(cont)
        area = w*h
        if area > mx_area:
            mx = x,y,w,h
            mx_area = area
    x,y,w,h = mx
    # Output to files
    roi = img[y:y+h,x:x+w]
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    blur = cv2.GaussianBlur(gray, (5,5),0)
    ret3, thresh = cv2.threshold(blur,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
    cv2_imshow(thresh)
    text = pytesseract.image_to_string(roi);
    print(text); print("\n"); print(pytesseract.image_to_string(thresh));
    print("\n")
    return [x[0] for x in significant];
edgeImg_8u = np.asarray(thresh, np.uint8)
# Find contours
significant = findSignificantContours(img, edgeImg_8u)
mask = thresh.copy()
mask[mask > 0] = 0
cv2.fillPoly(mask, significant, 255)
# Invert mask
mask = np.logical_not(mask)
#Finally remove the background
img[mask] = 0;
Tesseract can't extract the text from this image. Is there a way I can rotate it to align the text perfectly and then feed it to pytesseract? Please let me know if my question requires any more clarity.
Here's a simple approach:
Obtain binary image. Load the image, convert to grayscale, Gaussian blur, then Otsu's threshold.
Find contours and sort for the largest contour. We find contours, then sort them by area using cv2.contourArea() so that the largest candidates come first.
Perform perspective transform. Next we perform contour approximation with cv2.arcLength() and cv2.approxPolyDP() to obtain the rectangular contour. Finally we use imutils.perspective.four_point_transform to obtain the bird's-eye view of the image.
Binary image
Result
To actually extract the text, take a look at
Use pytesseract OCR to recognize text from an image
Cleaning image for OCR
Detect text area in an image using python and opencv
Code
from imutils.perspective import four_point_transform
import cv2
import numpy
# Load image, grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread("1.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (7,7), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
# Find contours and sort for largest contour
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
displayCnt = None
for c in cnts:
    # Perform contour approximation
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.02 * peri, True)
    if len(approx) == 4:
        displayCnt = approx
        break
# Obtain birds' eye view of image
warped = four_point_transform(image, displayCnt.reshape(4, 2))
cv2.imshow("thresh", thresh)
cv2.imshow("warped", warped)
cv2.waitKey()
To solve this problem you can also use OpenCV's minAreaRect, which returns the minimum-area rotated rectangle together with its angle of rotation. You can then build the rotation matrix and apply warpAffine to straighten the image. I have also attached a Colab notebook that you can play around with.
Colab notebook : https://colab.research.google.com/drive/1SKxrWJBOHhGjEgbR2ALKxl-dD1sXIf4h?usp=sharing
import cv2
from google.colab.patches import cv2_imshow
import numpy as np
def rotate_image(image, angle):
    image_center = tuple(np.array(image.shape[1::-1]) / 2)
    rot_mat = cv2.getRotationMatrix2D(image_center, angle, 1.0)
    result = cv2.warpAffine(image, rot_mat, image.shape[1::-1], flags=cv2.INTER_LINEAR)
    return result
img = cv2.imread("/content/sxJzw.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
mask = np.zeros((img.shape[0], img.shape[1]))
blur = cv2.GaussianBlur(gray, (5,5),0)
ret, thresh = cv2.threshold(blur,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
cv2_imshow(thresh)
contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
largest_countour = max(contours, key = cv2.contourArea)
binary_mask = cv2.drawContours(mask, [largest_countour], 0, 1, -1)
new_img = img * np.dstack((binary_mask, binary_mask, binary_mask))
minRect = cv2.minAreaRect(largest_countour)
rotate_angle = minRect[-1] if minRect[-1] < 0 else -minRect[-1]
new_img = rotate_image(new_img, rotate_angle)
cv2_imshow(new_img)

openCV: Warp Perspective for various size of images

I am learning computer vision and trying to warp the perspective of pictures of a single sheet of paper for OCR. The sample picture is
I succeeded in binarizing the image and detecting contours, but I am having difficulty warping the perspective based on the contours.
def display_cv_image(image, format='.png'):
    """
    Display image from 2d array
    """
    decoded_bytes = cv2.imencode(format, image)[1].tobytes()
    display(Image(data=decoded_bytes))
def get_contour(img,original, thresh):
    contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    areas = []
    for cnt in contours:
        area = cv2.contourArea(cnt)
        if area > 10000:
            epsilon = 0.1*cv2.arcLength(cnt,True)
            approx = cv2.approxPolyDP(cnt,epsilon,True)
            areas.append(approx)
    cv2.drawContours(original,areas,-1,(0,255,0),3)
    display_cv_image(original)
    return areas[0]
def perspective(original, target):
    dst = []
    pts1 = np.float32(target)
    pts2 = np.float32([[1000,2000],[1000,0],[0,0],[0,2000]])
    M = cv2.getPerspectiveTransform(pts1,pts2)
    dst = cv2.warpPerspective(original,M,(1000,2000))
    display_cv_image(dst)
# Driver codes
original = cv2.imread('image.jpg')
thresh, grey = binarize(original)
target = get_contour(grey,original, thresh)
perspective(original, target)
The problem is pts2 in the perspective function. I have tried multiple values for the variable, but none of them works. I want to back-calculate the mapping matrix and, if possible, make the function adapt to images of various sizes.
A good description for four point perspective transform can be obtained from Adrian's tutorial: https://www.pyimagesearch.com/2014/08/25/4-point-opencv-getperspective-transform-example/
There is a function four_point_transform in imutils module.
For the picture above, the following snippet warps and binarizes the image so that the result can be used as OCR input.
import cv2
import numpy as np
from imutils.perspective import four_point_transform
import imutils
original = cv2.imread('image.jpg')
blurred = cv2.GaussianBlur(original, (3, 3), 0)
blurred_float = blurred.astype(np.float32) / 255.0
edgeDetector = cv2.ximgproc.createStructuredEdgeDetection('model.yml')
edged = edgeDetector.detectEdges(blurred_float)
edged = (255 * edged).astype("uint8")
edged = cv2.threshold(edged, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
cnts = cv2.findContours(edged, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
cnts = sorted(cnts, key = cv2.contourArea, reverse = True)[:5]
screenCnt = None  # stays None if no four-point contour is found
for c in cnts:
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.02 * peri, True)
    if len(approx) == 4:
        screenCnt = approx
        break
if screenCnt is not None:
    warped = four_point_transform(original, screenCnt.reshape(4, 2))
    warped = cv2.cvtColor(warped, cv2.COLOR_BGR2GRAY)
    T = cv2.ximgproc.niBlackThreshold(warped, maxValue=255, type=cv2.THRESH_BINARY_INV, blockSize=81, k=0.1, binarizationMethod=cv2.ximgproc.BINARIZATION_WOLF)
    warped = (warped > T).astype("uint8") * 255
    cv2.imshow("Original", imutils.resize(original, height = 650))
    cv2.imshow("Edged", imutils.resize(edged, height = 650))
    cv2.imshow("Warped", imutils.resize(warped, height = 650))
    cv2.waitKey(0)
Following are the original, edged and final warped binarized output:
Please note that StructuredEdgeDetection is used for better edge detection. You can download the model.yml file from the link: https://cdn.rawgit.com/opencv/opencv_extra/3.3.0/testdata/cv/ximgproc/model.yml.gz
Also note that the Wolf & Jolion binarization technique is used for better output.
You need to install opencv-contrib-python package through pip for StructuredEdgeDetection and niBlackThreshold.
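On the pts2 part of the question, one way to make the destination rectangle adapt to each image is to derive it from the detected corners instead of hard-coding 1000x2000. The sketch below follows the general idea behind four_point_transform, but it is not imutils' actual code; the helper names `order_points` and `destination_points` and the sample corners are assumptions for illustration.

```python
import numpy as np

def order_points(pts):
    """Order four corners as top-left, top-right, bottom-right, bottom-left."""
    pts = np.asarray(pts, dtype=np.float32).reshape(4, 2)  # also accepts approxPolyDP's (4, 1, 2)
    s = pts.sum(axis=1)               # x + y: smallest at top-left, largest at bottom-right
    d = np.diff(pts, axis=1).ravel()  # y - x: smallest at top-right, largest at bottom-left
    return np.array([pts[np.argmin(s)], pts[np.argmin(d)],
                     pts[np.argmax(s)], pts[np.argmax(d)]], dtype=np.float32)

def destination_points(src):
    """Build the target rectangle from the longest opposite edges of the quadrilateral."""
    tl, tr, br, bl = src
    width = int(round(max(np.linalg.norm(br - bl), np.linalg.norm(tr - tl))))
    height = int(round(max(np.linalg.norm(tr - br), np.linalg.norm(tl - bl))))
    dst = np.array([[0, 0], [width - 1, 0],
                    [width - 1, height - 1], [0, height - 1]], dtype=np.float32)
    return dst, (width, height)

# Hypothetical corners in arbitrary order, as a contour approximation might return them
corners = np.array([[100, 20], [10, 20], [10, 220], [100, 220]])
src = order_points(corners)
dst, size = destination_points(src)
```

With these, `cv2.getPerspectiveTransform(src, dst)` followed by `cv2.warpPerspective(original, M, size)` would replace the fixed pts2 in the question. The key point is that pts1 and pts2 must list the corners in the same order.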

Find contours based on edges

I want to detect the contours of an equipment label. Although the code runs correctly, it never quite detects the contours of the label.
Original Image
Using this code:
import numpy as np
import cv2
import imutils #resizeimage
import pytesseract # convert img to string
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
# Read the image file
image = cv2.imread('Car Images/5.JPG')
# Resize the image - change width to 500
image = imutils.resize(image, width=500)
# Display the original image
cv2.imshow("Original Image", image)
cv2.waitKey(0)
# RGB to Gray scale conversion
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
cv2.imshow("1 - Grayscale Conversion", gray)
cv2.waitKey(0)
# Noise removal with iterative bilateral filter(removes noise while preserving edges)
gray = cv2.bilateralFilter(gray, 11, 17, 17)
cv2.imshow("2 - Bilateral Filter", gray)
cv2.waitKey(0)
# Find Edges of the grayscale image
edged = cv2.Canny(gray, 170, 200)
cv2.imshow("3 - Canny Edges", edged)
cv2.waitKey(0)
# Find contours based on Edges
cnts, new = cv2.findContours(edged.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
# Create copy of original image to draw all contours
img1 = image.copy()
cv2.drawContours(img1, cnts, -1, (0,255,0), 3)
cv2.imshow("4- All Contours", img1)
cv2.waitKey(0)
#sort contours based on their area keeping minimum required area as '30' (anything smaller than this will not be considered)
cnts=sorted(cnts, key = cv2.contourArea, reverse = True)[:30]
NumberPlateCnt = None #we currently have no Number plate contour
# Top 30 Contours
img2 = image.copy()
cv2.drawContours(img2, cnts, -1, (0,255,0), 3)
cv2.imshow("5- Top 30 Contours", img2)
cv2.waitKey(0)
# loop over our contours to find the best possible approximate contour of number plate
count = 0
idx =7
for c in cnts:
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.02 * peri, True)
    # print ("approx = ",approx)
    if len(approx) == 4: # Select the contour with 4 corners
        NumberPlateCnt = approx #This is our approx Number Plate Contour
        # Crop those contours and store it in Cropped Images folder
        x, y, w, h = cv2.boundingRect(c) #This will find out co-ord for plate
        new_img = gray[y:y + h, x:x + w] #Create new image
        cv2.imwrite('Cropped Images-Text/' + str(idx) + '.png', new_img) #Store new image
        idx+=1
        break
# Drawing the selected contour on the original image
#print(NumberPlateCnt)
cv2.drawContours(image, [NumberPlateCnt], -1, (0,255,0), 3)
cv2.imshow("Final Image With Number Plate Detected", image)
cv2.waitKey(0)
Cropped_img_loc = 'Cropped Images-Text/7.png'
cv2.imshow("Cropped Image ", cv2.imread(Cropped_img_loc))
# Use tesseract to convert image into string
text = pytesseract.image_to_string(Cropped_img_loc, lang='eng')
print("Equipment Number is :", text)
cv2.waitKey(0) #Wait for user input before closing the images displayed
Displayed output
Is there a better way to narrow down the contour to the equipment label?
Here is the code for your reference on github:
https://github.com/AjayAndData/Licence-plate-detection-and-recognition---using-openCV-only/blob/master/Car%20Number%20Plate%20Detection.py
I think this code may help you
import numpy as np
import cv2
from matplotlib import pyplot as plt
img = cv2.imread('C:/Users/DELL/Desktop/download (5).png')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
corners = cv2.goodFeaturesToTrack(gray,60,0.001,10)
corners = corners.astype(int)  # np.int0 was removed in NumPy 2.0
for i in corners:
    x,y = i.ravel()
    cv2.circle(img,(int(x),int(y)),0,255,-1)  # plain ints for the center point
coord = np.where(np.all(img == (255, 0, 0),axis=-1))
plt.imshow(img)
plt.show()

Extract car license plate number in image

I want to find the car plate number to search in a database. Since Saudi plates are different, I face this problem
The result of the code
My current approach is to search for the cross in OpenCV using edge detection. How can I find the cross and take the characters below it (using contours and edge detection)?
import numpy as np
import pytesseract
from PIL import Image
import cv2
import imutils
import matplotlib.pyplot as plt
import numpy as np
img = cv2.imread('M4.png')
img = cv2.resize(img, (820,680) )
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) #convert to grey scale
gray = cv2.blur(gray, (3,3))#Blur to reduce noise
edged = cv2.Canny(gray, 10, 100) #Perform Edge detection
# find contours in the edged image, keep only the largest
# ones, and initialize our screen contour
cnts = cv2.findContours(edged.copy(), cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
cnts = sorted(cnts, key = cv2.contourArea, reverse = True)[:10]
screenCnt = None
# loop over our contours
for c in cnts:
    # approximate the contour
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.1 * peri, True)
    # if our approximated contour has four points, then
    # we can assume that we have found our screen
    if len(approx) == 4:
        screenCnt = approx
        break
if screenCnt is None:
    detected = 0
    print("No contour detected")
else:
    detected = 1
if detected == 1:
    cv2.drawContours(img, [screenCnt], -1, (0, 255, 0), 3)
# Masking the part other than the number plate
imgs = img
mask = np.zeros(gray.shape,np.uint8)
new_image = cv2.drawContours(mask,[screenCnt],0,255,-1,)
new_image = cv2.bitwise_and(imgs,imgs,mask=mask)
# Now crop
(x, y) = np.where(mask == 255)
(topx, topy) = (np.min(x), np.min(y))
(bottomx, bottomy) = (np.max(x), np.max(y))
Cropped = gray[topx:bottomx+1, topy:bottomy+1]
#Read the number plate
text = pytesseract.image_to_string(Cropped, config='--psm 11')
print("Detected Number is:",text)
plt.title(text)
plt.subplot(1,4,1),plt.imshow(img,cmap = 'gray')
plt.title('Original'), plt.xticks([]), plt.yticks([])
plt.subplot(1,4,2),plt.imshow(gray,cmap = 'gray')
plt.title('gray'), plt.xticks([]), plt.yticks([])
plt.subplot(1,4,3),plt.imshow(Cropped,cmap = 'gray')
plt.title('Cropped'), plt.xticks([]), plt.yticks([])
plt.subplot(1,4,4),plt.imshow(edged,cmap = 'gray')
plt.title('edged'), plt.xticks([]), plt.yticks([])
plt.show()
# check the database
# record the entry
cv2.waitKey(0)
cv2.destroyAllWindows()
Thanks for your help
Here's an approach:
Convert image to grayscale and Gaussian blur
Otsu's threshold to get a binary image
Find contours and sort contours from left-to-right to maintain order
Iterate through contours and filter for the bottom two rectangles
Extract ROI and OCR
After converting to grayscale and Gaussian blurring, we apply Otsu's threshold to get a binary image. We find contours, then sort them using imutils.contours.sort_contours() with the left-to-right parameter. This step keeps the contours in order. From here we iterate through the contours and filter them using these three conditions:
The contour must be larger than some specified threshold area (3000)
The width must be larger than the height
The center of each ROI must be in the bottom half of the image. We find the center of each contour and compare it to where it is located on the image.
If an ROI passes these filtering conditions, we extract it using NumPy slicing and then pass it to Pytesseract. Here are the detected ROIs that pass the filter, highlighted in green
Since we already have the bounding box, we extract each ROI
We throw each individual ROI into Pytesseract one at a time to construct our license plate string. Here's the result
License plate: 430SRU
Code
import cv2
import pytesseract
from imutils import contours
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
image = cv2.imread('1.png')
height, width, _ = image.shape
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (5,5), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
cnts = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
cnts, _ = contours.sort_contours(cnts, method="left-to-right")
plate = ""
for c in cnts:
    area = cv2.contourArea(c)
    x,y,w,h = cv2.boundingRect(c)
    center_y = y + h/2
    if area > 3000 and (w > h) and center_y > height/2:
        ROI = image[y:y+h, x:x+w]
        data = pytesseract.image_to_string(ROI, lang='eng', config='--psm 6')
        plate += data
print('License plate:', plate)
