See: Converting An Image To A Cartoon Using OpenCV
In the article above, they start with the following image:
And they want to obtain an output like the following:
I ran the following script:
import cv2

window_name = 'image'

# show the original photo
img = cv2.imread("photo.png")
cv2.imshow(window_name, img)
cv2.waitKey(0)

# edge mask: grayscale, median blur, adaptive threshold
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.medianBlur(gray, 5)
edges = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 9, 9)
cv2.imshow(window_name, edges)
cv2.waitKey(0)

# smooth colours with a bilateral filter, then apply the edge mask
color = cv2.bilateralFilter(img, 9, 250, 250)
cartoon = cv2.bitwise_and(color, color, mask=edges)
cv2.imshow(window_name, cartoon)
cv2.waitKey(0)
Firstly, the script is very slow.
Secondly, the output is not what they promised it would be:
How can I fix these two issues?
One simple approach is to use OpenCV's stylization filter in Python (from Non-Photorealistic Rendering, part of the Computational Photography module) to make a "cartoon". The reference for the algorithm is https://www.inf.ufrgs.br/~eslgastal/DomainTransform/Gastal_Oliveira_SIGGRAPH2011_Domain_Transform.pdf
Input:
import cv2
# read image
img = cv2.imread('beard_man.png')
# apply edge-preserving stylization
result = cv2.stylization(img, sigma_s=50, sigma_r=0.8)
# write result to disk
cv2.imwrite("beard_man_cartoon.png", result)
# display it
cv2.imshow("RESULT", result)
cv2.waitKey(0)
Result:
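As a side note on the two parameters: in OpenCV's stylization, sigma_s (spatial) ranges from 0 to 200 and controls the size of the smoothed regions, while sigma_r (range) goes from 0 to 1 and controls how strongly edges are preserved. A quick sketch for comparing a few settings, reusing the input file name from above; the parameter combinations are arbitrary starting points:

import cv2

img = cv2.imread('beard_man.png')

# Larger sigma_s smooths over bigger regions; smaller sigma_r keeps
# more edge detail.
for sigma_s, sigma_r in [(50, 0.8), (100, 0.4), (150, 0.25)]:
    out = cv2.stylization(img, sigma_s=sigma_s, sigma_r=sigma_r)
    cv2.imwrite("beard_man_s{}_r{}.png".format(sigma_s, sigma_r), out)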
Brief description
Your question caught my interest, so I tried the code from the website you linked, the code you posted, and a few more examples I googled myself. I even discussed it with my peers and with the professor who taught the introductory image processing/computer vision course (in C#) that I took a couple of years ago.
Discussion feedback
Sadly, they all responded the same way, matching what I initially thought: it's not possible to transform/convert a photo directly into the second picture in your post; that second picture is most likely an artistic graphics rendering. Well, maybe if you dig deeper there's actually a module or library that can transform/convert it 100% like the second picture.
Example code testing
So I began trying out the contents of the website you posted, snipped a bit here, adjusted some there, but overall got nowhere near the second cartoon picture.
The code and result of "Converting An Image To A Cartoon Using OpenCV"
import cv2
from matplotlib import pyplot as plt
# Reading image
img = cv2.imread("img.png")
plt.imshow(img)
# Converting to RGB
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(img)
# Detecting edges of the input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.medianBlur(gray, 9)
edges = cv2.adaptiveThreshold(
    gray, 255,
    cv2.ADAPTIVE_THRESH_MEAN_C,
    cv2.THRESH_BINARY, 9, 9
)
# Cartoonifying the image
color = cv2.bilateralFilter(img, 9, 250, 250)
cartoon = cv2.bitwise_and(color, color, mask=edges)
plt.imshow(cartoon)
plt.savefig("cartoonify.png")
plt.show()
Moving on, I then tried the code in your post, and it actually made some difference; it didn't run slowly for me either. The code stays pretty much the same, I just added a method to save the image at the end, cv2.imwrite().
import cv2
import matplotlib.pyplot as plt
window_name = "image"
img = cv2.imread("img.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.medianBlur(gray, 5)
edges = cv2.adaptiveThreshold(
    gray, 255,
    cv2.ADAPTIVE_THRESH_MEAN_C,
    cv2.THRESH_BINARY,
    9, 9
)
color = cv2.bilateralFilter(img, 9, 250, 250)
cartoon = cv2.bitwise_and(color, color, mask=edges)
cv2.imshow(window_name, cartoon)
cv2.waitKey(0)
cv2.imwrite("cartoon_op.png", cartoon)
cv2.waitKey(0)
cv2.destroyAllWindows()
Third, I searched on GitHub and found this code. For this one I used my Stack Overflow profile picture, which is a headshot; I thought the white background might make the difference more visible, but it didn't. Compared to the previous examples, the result is pretty much the same.
import cv2
import numpy as np
from tkinter.filedialog import *
photo = askopenfilename()
img = cv2.imread(photo)
grey = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
grey = cv2.medianBlur(grey, 5)
edges = cv2.adaptiveThreshold(grey, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 9, 9)
#cartoonize
color = cv2.bilateralFilter(img, 9, 250, 250)
cartoon = cv2.bitwise_and(color, color, mask = edges)
cv2.imshow("Image", img)
cv2.imshow("Cartoon", cartoon)
#save
cv2.imwrite("cartoon-git.png", cartoon)
cv2.waitKey(0)
cv2.destroyAllWindows()
Just before finishing this answer, I found the example that gives the closest result to a cartoonized picture, on Dev - How to cartoonize an image with Python. This example uses Elon Musk's photo to demonstrate. Although it's the closest to a cartoon, the output somehow comes out really small (the resize_image() step below shrinks the input to 30% via scale_ratio = 0.3).
import numpy as np
import cv2

file_name = "elon.jpg"

def resize_image(image):
    scale_ratio = 0.3
    width = int(image.shape[1] * scale_ratio)
    height = int(image.shape[0] * scale_ratio)
    new_dimensions = (width, height)
    resized = cv2.resize(
        image, new_dimensions,
        interpolation=cv2.INTER_AREA
    )
    return resized

def find_countours(image):
    contoured_image = image
    gray = cv2.cvtColor(contoured_image, cv2.COLOR_BGR2GRAY)
    edged = cv2.Canny(gray, 30, 100)
    contours, hierarchy = cv2.findContours(
        edged, cv2.RETR_EXTERNAL,
        cv2.CHAIN_APPROX_NONE
    )
    cv2.drawContours(
        contoured_image, contours,
        contourIdx=-1, color=1,
        thickness=1
    )
    cv2.imshow("Image after contouring", contoured_image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    return contoured_image

def color_quantization(image, k=4):
    z = image.reshape((-1, 3))
    z = np.float32(z)
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER,
                10000, 0.0001)
    compactness, label, center = cv2.kmeans(z, k, None, criteria,
                                            1, cv2.KMEANS_RANDOM_CENTERS)
    center = np.uint8(center)
    res = center[label.flatten()]
    res2 = res.reshape((image.shape))
    return res2

if __name__ == '__main__':
    image = cv2.imread(file_name)
    resized_image = resize_image(image)
    coloured = color_quantization(resized_image)
    contoured = find_countours(coloured)
    final_image = contoured
    save_q = input("Save the image? [y]/[n]: ")
    if save_q == "y":
        cv2.imwrite("cartoonized_" + file_name, final_image)
        print("Image saved!")
Original Elon.jpg
Cartoonized Elon.jpg
Wrapping up
I hope this long answer helps, even if it sounds like no definitive answer; it's just what I found interesting, and I decided to share the process of discovering it.
Related
I'm currently working on a small OCR bot. I got pretty much everything to work and am now trying to improve the OCR. Specifically, it has problems with two things: the orange/red-ish text on the same-colored gradient, and, for some reason, the first 1 of "1/1". Sadly, I haven't found anything that worked in my case yet. I've made a small test image, consisting of multiple images, below:
Source Image
Results
Adaptive Threshold
As you can see, the gradient results in a blob that is sometimes big enough to overlap with the first word (see "apprentice"), resulting in garbage.
I've tried many variations and played around with thresholds, blurs, erode, dilation, box detection with the dilation method, etc., but nothing worked well. The only way I managed to get rid of the blob was an adaptive threshold, but sadly I wasn't able to get good results from its output image.
If anyone knows how to make the OCR more robust, increase accuracy, and get rid of the blob, I'd appreciate your help. Thanks.
The following code is my 'playground' to figure out a better way:
import cv2
import pytesseract
import numpy as np

pytesseract.pytesseract.tesseract_cmd = YOUR_PATH

def resize(img, scale_percent=300):
    # use this instead?
    # resize = image = imutils.resize(image, width=300)
    # automatically resizes it about 300% by default
    width = int(img.shape[1] * scale_percent / 100)
    height = int(img.shape[0] * scale_percent / 100)
    dim = (width, height)
    resized = cv2.resize(img, dim, interpolation=cv2.INTER_AREA)
    return resized

def preprocessImage(img, scale=300, threshhold=127):
    """ input RGB colour space """
    # makes results more accurate - inspired from https://stackoverflow.com/questions/58103337/how-to-ocr-image-with-tesseract
    # another resource to improve accuracy - https://tesseract-ocr.github.io/tessdoc/ImproveQuality.html
    # converts from rgb to grayscale then enlarges it
    # applies gaussian blur
    # convert to b&w
    # invert black and white colours (white background, black text)
    grayscale = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    cv2.imshow('grayscale', grayscale)
    resized = resize(grayscale, scale)
    cv2.imshow('resized', resized)
    blurred = cv2.medianBlur(resized, 5)
    #cv2.imshow('median', blurred)
    blurred = cv2.GaussianBlur(resized, (5, 5), 5)
    cv2.imshow('1', blurred)
    cv2.waitKey()
    blackAndWhite = cv2.threshold(blurred, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
    cv2.imshow('blackAndWhite', blackAndWhite)
    th3 = cv2.adaptiveThreshold(blurred, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 11, 2)
    cv2.floodFill(th3, None, (0, 0), 255)
    cv2.imshow('th3', th3)
    #kernel = np.ones((3, 3), np.uint8)
    #erode = cv2.erode(th3, kernel)
    kernel = np.ones((5, 5), np.uint8)
    #opening = cv2.morphologyEx(blackAndWhite, cv2.MORPH_OPEN, kernel)
    invertedColours = cv2.bitwise_not(blackAndWhite)
    return invertedColours

# excerpt from https://www.youtube.com/watch?v=6DjFscX4I_c
def imageToText(img):
    # returns item name from image, preprocess if needed
    boxes = pytesseract.image_to_data(img)
    num = []
    for count, box in enumerate(boxes.splitlines()):
        if (count != 0):
            box = box.split()
            if (len(box) == 12):
                text = box[11].strip('#®')
                if (text != ''):
                    num.append(text)
    text = ' '.join(num)
    ## Alternate method
    # text = pytesseract.image_to_string(img)
    # print("Name:", text)
    return text

if __name__ == "__main__":
    img = cv2.imread("test.png")
    img = preprocessImage(img, scale=300)
    print(imageToText(img))

    ##############################################
    #####        Detecting Words           ######
    ##############################################
    #[ 0        1           2            3          4           5          6      7      8        9       10     11   ]
    #['level', 'page_num', 'block_num', 'par_num', 'line_num', 'word_num', 'left', 'top', 'width', 'height', 'conf', 'text']
    boxes = pytesseract.image_to_data(img)

    # convert back to colored image
    img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)

    # draw boxes and text
    for a, b in enumerate(boxes.splitlines()):
        print(b)
        if a != 0:
            b = b.split()
            if len(b) == 12:
                x, y, w, h = int(b[6]), int(b[7]), int(b[8]), int(b[9])
                cv2.putText(img, b[11], (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX, 1, (50, 50, 255), 2)
                cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 2)

    cv2.imshow('img', img)
    cv2.waitKey(0)
I couldn't get it perfect but almost...
I got a lot of benefit from CLAHE equalization (see tutorial here), but that wasn't enough on its own; thresholding was still needed. Adaptive techniques didn't work well, but cv2.THRESH_TOZERO gives OK results (see the thresholding tutorial here).
import cv2
from pytesseract import image_to_string, image_to_data
img = cv2.imread('gradient.png', cv2.IMREAD_GRAYSCALE)
img = cv2.resize(img, (0,0), fx=2.0, fy=2.0)
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
img = clahe.apply(img)
img = 255-img # invert image. tesseract prefers black text on white background
ret, img = cv2.threshold(img, 127, 255, cv2.THRESH_TOZERO)
cv2.imwrite('output.png', img)
ocr = image_to_string(img, config='--psm 6')
print(ocr)
which gives this OCR output:
Tool Crafting Part
Apprentice Craft Kit
Adept Craft Kit
Expert Craft Kit
=
Master Craft Kit
1/1
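As a side note, the snippet above imports image_to_data but never uses it. If stray symbols like that "=" are a problem, one option is to filter out low-confidence words with it; a minimal sketch, where img is the preprocessed image from above and the cutoff of 40 is an arbitrary choice:

from pytesseract import image_to_data, Output

data = image_to_data(img, config='--psm 6', output_type=Output.DICT)

# keep only words Tesseract is reasonably confident about
words = [w for w, conf in zip(data['text'], data['conf'])
         if w.strip() and float(conf) > 40]
print(' '.join(words))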
I have a drone FPV video from which I need to extract GPS coordinates. The text is white, but because of the bad video quality it appears gray and light blue. Since the background changes, I have some problems: in some frames the background has a totally different color from the text, and in other frames a similar one.
Here are 2 original images (frames) from the video:
Dark background
Light background
And here is the code that I've found after googling:
import numpy as np
import cv2
import pytesseract

cap = cv2.VideoCapture('v1.avi')
p = 10000

while(cap.isOpened()):
    ret, frame = cap.read()
    img = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    img = img[380:460, 220:640]
    img = cv2.bilateralFilter(img, 9, 27, 27)
    img = cv2.threshold(img, 0, 255,
                        cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
    img = cv2.GaussianBlur(img, (9, 9), 0)
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    img = cv2.morphologyEx(img, cv2.MORPH_TOPHAT, kernel)
    img = cv2.threshold(img, 0, 255, cv2.THRESH_OTSU)[1]
    img = cv2.dilate(img, kernel)
    img = cv2.threshold(img, 0, 250, cv2.THRESH_BINARY_INV)[1]
    cv2.imshow('frame', img)
    cv2.imshow('or', frame)
    print('\n==============')
    print(pytesseract.image_to_string(img, config='digits'))
    if cv2.waitKey(50) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
And also the results:
Dark background
Light background
As you can see, in the second case the background isn't clear, there is some noise, and from that image Tesseract doesn't extract the text properly.
EDIT:
For some reason I can't share the video I wrote about above, but here is a similar video from YouTube. If the text can be extracted from that video, I guess the method will also work for mine, or at least solve many of the problems:
I was able to get something working using a combination of cv2.bilateralFilter and cv2.adaptiveThreshold. Once the background is in one main blob, the numbers can be extracted based on their patch sizes.
import cv2

# 'frame' is one video frame, as read in the question's capture loop
img = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# Bilateral filter and adaptive histogram thresholding to get background into mostly one patch
img = cv2.bilateralFilter(img, 9, 29, 29)
thresh = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 13, 0)

# Add padding to join any background around edges into the same patch
pad = 2
img_pad = cv2.copyMakeBorder(thresh, pad, pad, pad, pad, cv2.BORDER_CONSTANT, value=1)

# Label patches and remove padding
ret, markers = cv2.connectedComponents(img_pad)
markers = markers[pad:-pad, pad:-pad]

# Count pixels in each patch
counts = [(markers == i).sum() for i in range(markers.max() + 1)]

# Keep patches based on pixel counts
maxCount = 200  # removes large background patches
minCount = 40   # removes specs and centres of numbering
keep = [c < maxCount and c > minCount for c in counts]

output = markers.copy()
for i, k in enumerate(keep):
    output[markers == i] = k
Here is what the images look like at each stage.
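To actually run Tesseract on that result, the labeled output can be turned back into an 8-bit image first. A minimal sketch, assuming pytesseract is configured as in the question:

import numpy as np
import pytesseract

# 'output' is 1 on kept patches, 0 elsewhere; Tesseract prefers
# black text on a white background, so invert while converting
ocr_img = np.where(output > 0, 0, 255).astype(np.uint8)
print(pytesseract.image_to_string(ocr_img, config='digits'))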
I want to stitch multiple image patches onto a new, mainly gray background image. The image patches contain colored elements which should not be changed, if possible; their shape and color are diverse. Like the new background image, the borders of the image patches are also gray, just slightly different, but you can see strong borders if I simply do:
ImgPatch = cv2.imread("C://...//ImagePatch.png")
NewBackground = cv2.imread("C://...//NewBackground.png")
height, width, channels = ImgPatch.shape
NewBackground[y:y+height, x:x+width] = ImgPatch  # (x, y): paste position
I tried cv2.seamlessClone() (docs.opencv.org) as explained in this tutorial:
www.learnopencv.com/seamless-cloning-using-opencv-python-cpp
The edges are perfectly smoothed, but unfortunately the colors of the elements are changed way too much. I know the approximate width and height of the gray border of each image patch; if I could smooth specifically that area, it might be a start and would already look better than what I have. I tried different masks with cv2.seamlessClone(), but none of them worked. So unfortunately I couldn't find a way to blend only the border of the patches so far.
The following images visualize my problem in a very abstract way.
What I have:
Left: Background, Right: Image patch
What I want:
What I currently get by using cv2.seamlessClone():
Any help would be very much appreciated!
EDIT: As I probably was not clear enough: the real images are way more complex, so unfortunately I cannot get reasonable results for all image patches by using cv2.findContours(). What I am looking for is a method to merge the borders, so that you can no longer see the exact transition from patch to background.
import cv2
import numpy as np

patch = cv2.imread('patch.png', cv2.IMREAD_UNCHANGED)
image = cv2.imread('image.png', cv2.IMREAD_UNCHANGED)

mask = 255 * np.ones(patch.shape, patch.dtype)
width, height, channels = image.shape
center = (height // 2, width // 2)
mixed_clone = cv2.seamlessClone(patch, image, mask, center, cv2.NORMAL_CLONE)
You could try to find the contour in your image patch with cv2.findContours() (the red spot), then remove the background around the contour and save the image. Finally, you can combine the saved image (red spot without background) with the gray background image using cv2.add(). I have combined some code I once played with and the code in the OpenCV docs (for cv2.add()). Hope it helps a bit. (Note that the example adds the image in the upper left corner; if you want it elsewhere, you should change the code.) Cheers!
Example:
import cv2
import numpy as np
from PIL import Image
img = cv2.imread('background2.png', cv2.IMREAD_UNCHANGED)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, threshold = cv2.threshold(gray, 100, 255, cv2.THRESH_BINARY_INV)
height,width = gray.shape
mask = np.zeros((height,width), np.uint8)
contours, hierarchy = cv2.findContours(threshold, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)  # OpenCV 4.x returns (contours, hierarchy)
cnt = max(contours, key=cv2.contourArea)
cv2.drawContours(mask,[cnt], -1, (255,255,255),thickness=-1)
masked = cv2.bitwise_and(img, img, mask=mask)
_,thresh = cv2.threshold(mask,1,255,cv2.THRESH_BINARY)
contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
x, y, w, h = cv2.boundingRect(contours[0])
circle = masked[y:y+h,x:x+w]
cv2.imwrite('temp.png', circle)
cv2.waitKey(0)
cv2.destroyAllWindows()
img = Image.open('temp.png')
img = img.convert("RGBA")
datas = img.getdata()
newData = []
for item in datas:
    if item[0] == 0 and item[1] == 0 and item[2] == 0:
        newData.append((255, 255, 255, 0))
    else:
        newData.append(item)
img.putdata(newData)
img.save('background3.png', "PNG")
img1 = cv2.imread('background1.png')
img2 = cv2.imread('background3.png')
rows,cols,channels = img2.shape
roi = img1[0:rows, 0:cols ]
img2gray = cv2.cvtColor(img2,cv2.COLOR_BGR2GRAY)
ret, mask = cv2.threshold(img2gray, 110, 255, cv2.THRESH_BINARY_INV)
mask_inv = cv2.bitwise_not(mask)
img1_bg = cv2.bitwise_and(roi,roi,mask = mask_inv)
img2_fg = cv2.bitwise_and(img2,img2,mask = mask)
dst = cv2.add(img1_bg,img2_fg)
img1[0:rows, 0:cols] = dst
cv2.imshow('img',img1)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result:
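If seamlessClone keeps shifting the colours, a simpler alternative to consider is plain alpha feathering: keep the patch interior untouched and blend only a band around its border into the background. A rough sketch, where the file names, the paste position, and the 10 px feather width are all assumptions:

import cv2
import numpy as np

patch = cv2.imread('ImagePatch.png')
background = cv2.imread('NewBackground.png')

x, y = 50, 50                     # hypothetical paste position
h, w = patch.shape[:2]

# Mask: 0 at the patch edge, 1 in the interior, blurred so the
# transition is gradual (patch must be larger than 2 * feather)
feather = 10
mask = np.zeros((h, w), np.float32)
mask[feather:-feather, feather:-feather] = 1.0
mask = cv2.GaussianBlur(mask, (2 * feather + 1, 2 * feather + 1), 0)
mask = mask[..., None]            # broadcast over the colour channels

# Blend only where the mask falls off; the interior stays unchanged
roi = background[y:y + h, x:x + w].astype(np.float32)
blended = mask * patch.astype(np.float32) + (1 - mask) * roi
background[y:y + h, x:x + w] = blended.astype(np.uint8)
cv2.imwrite('result.png', background)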
I'm trying to detect this Code128 barcode with Python + zbar module:
(Image download link here).
This works:
import cv2, numpy
import zbar
from PIL import Image
import matplotlib.pyplot as plt
scanner = zbar.ImageScanner()
pil = Image.open("000.jpg").convert('L')
width, height = pil.size
plt.imshow(pil); plt.show()
image = zbar.Image(width, height, 'Y800', pil.tobytes())
result = scanner.scan(image)
for symbol in image:
    print(symbol.data, symbol.type, symbol.quality, symbol.location, symbol.count, symbol.orientation)
but only one point is detected: (596, 210).
If I apply a black and white thresholding:
pil = Image.open("000.jpg").convert('L')
pil = pil.point(lambda x: 0 if x < 100 else 255, '1').convert('L')
it's better, and we have 3 points: (596, 210), (482, 211), (596, 212). But it adds one more difficulty (finding the optimal threshold - here 100 - automatically for every new image).
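As an aside, Otsu's method can pick such a threshold automatically instead of hard-coding 100. A minimal sketch with OpenCV; whether its single global threshold suits every image is something to verify:

import cv2

gray = cv2.imread('000.jpg', cv2.IMREAD_GRAYSCALE)
# Otsu chooses the threshold that best separates the two intensity classes
thresh_val, bw = cv2.threshold(gray, 0, 255,
                               cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print('chosen threshold:', thresh_val)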
Still, we don't have the 4 corners of the barcode.
Question: how to reliably find the 4 corners of a barcode on an image, with Python? (and maybe OpenCV, or another library?)
Notes:
It is possible; here is a great example (but sadly not open-source, as mentioned in the comments):
Object detection, very fast and robust blurry 1D barcode detection for real-time applications
The corner detection seems excellent and very fast, even when the barcode is only a small part of the whole image (this is important for me).
Interesting solution: Real-time barcode detection in video with Python and OpenCV, but there are limitations to the method (see the article: the barcode should be close up, etc.) that restrict its potential use. Also, I'm looking more for a ready-to-use library.
Interesting solution 2: Detecting Barcodes in Images with Python and OpenCV, but again it does not seem like a production-ready solution, more like research in progress. Indeed, I tried their code on this image but the detection does not yield a successful result. It should also be noted that it doesn't take any spec of the barcode into consideration for the detection (the fact that there's a start/stop symbol, etc.):
import numpy as np
import cv2
image = cv2.imread("000.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gradX = cv2.Sobel(gray, ddepth = cv2.CV_32F, dx = 1, dy = 0, ksize = -1)
gradY = cv2.Sobel(gray, ddepth = cv2.CV_32F, dx = 0, dy = 1, ksize = -1)
gradient = cv2.subtract(gradX, gradY)
gradient = cv2.convertScaleAbs(gradient)
blurred = cv2.blur(gradient, (9, 9))
(_, thresh) = cv2.threshold(blurred, 225, 255, cv2.THRESH_BINARY)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (21, 7))
closed = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
closed = cv2.erode(closed, None, iterations = 4)
closed = cv2.dilate(closed, None, iterations = 4)
(_, cnts, _) = cv2.findContours(closed.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
c = sorted(cnts, key = cv2.contourArea, reverse = True)[0]
rect = cv2.minAreaRect(c)
box = np.int0(cv2.boxPoints(rect))
cv2.drawContours(image, [box], -1, (0, 255, 0), 3)
cv2.imshow("Image", image)
cv2.waitKey(0)
Solution 2 is pretty good. The critical factor that made it fail on your image was the thresholding. If you drop the parameter 225 way down to 55, you'll get much better results.
I've reworked the code, making some tweaks here and there. The original code is fine if you prefer. The documentation for OpenCV is quite good, and there are very good Python tutorials.
import numpy as np
import cv2
image = cv2.imread("barcode.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# equalize lighting
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
gray = clahe.apply(gray)
# edge enhancement
edge_enh = cv2.Laplacian(gray, ddepth = cv2.CV_8U,
                         ksize = 3, scale = 1, delta = 0)
cv2.imshow("Edges", edge_enh)
cv2.waitKey(0)
retval = cv2.imwrite("edge_enh.jpg", edge_enh)
# bilateral blur, which keeps edges
blurred = cv2.bilateralFilter(edge_enh, 13, 50, 50)
# use simple thresholding. adaptive thresholding might be more robust
(_, thresh) = cv2.threshold(blurred, 55, 255, cv2.THRESH_BINARY)
cv2.imshow("Thresholded", thresh)
cv2.waitKey(0)
retval = cv2.imwrite("thresh.jpg", thresh)
# do some morphology to isolate just the barcode blob
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 9))
closed = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
closed = cv2.erode(closed, None, iterations = 4)
closed = cv2.dilate(closed, None, iterations = 4)
cv2.imshow("After morphology", closed)
cv2.waitKey(0)
retval = cv2.imwrite("closed.jpg", closed)
# find contours left in the image
(_, cnts, _) = cv2.findContours(closed.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 3.x; in OpenCV 4.x use: cnts, _ = cv2.findContours(...)
c = sorted(cnts, key = cv2.contourArea, reverse = True)[0]
rect = cv2.minAreaRect(c)
box = np.int0(cv2.boxPoints(rect))
cv2.drawContours(image, [box], -1, (0, 255, 0), 3)
print(box)
cv2.imshow("found barcode", image)
cv2.waitKey(0)
retval = cv2.imwrite("found.jpg", image)
edge_enh.jpg
thresh.jpg
closed.jpg
found.jpg
output from console:
[[596 249]
[470 213]
[482 172]
[608 209]]
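With those four corners you can also rectify the barcode into an axis-aligned crop, which often helps a decoder afterwards. A sketch using the printed box; the 400x120 output size is an arbitrary choice:

import cv2
import numpy as np

box = np.float32([[596, 249], [470, 213], [482, 172], [608, 209]])

# order the corners as top-left, top-right, bottom-right, bottom-left
s = box.sum(axis=1)
d = np.diff(box, axis=1).ravel()   # y - x per point
src = np.float32([box[np.argmin(s)], box[np.argmin(d)],
                  box[np.argmax(s)], box[np.argmax(d)]])

w, h = 400, 120                    # assumed output size
dst = np.float32([[0, 0], [w, 0], [w, h], [0, h]])

M = cv2.getPerspectiveTransform(src, dst)
image = cv2.imread("barcode.jpg")
warped = cv2.warpPerspective(image, M, (w, h))
cv2.imwrite("barcode_rectified.jpg", warped)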
For the following to work, you need to have the contrib package installed, using pip install opencv-contrib-python.
Your OpenCV version will then have a separate class for detecting barcodes.
cv2.barcode_BarcodeDetector() comes equipped with 3 in-built functions:
decode(): returns decoded information and type
detect(): returns the 4 corner points enclosing each detected barcode
detectAndDecode(): returns all the above
Sample Image used is from pyimagesearch blog:
The 4 corners are captured in points.
Code:
import cv2
import numpy as np

img = cv2.imread('barcode.jpg')

barcode_detector = cv2.barcode_BarcodeDetector()

# 'retval' is a boolean indicating whether a barcode has been detected or not
retval, decoded_info, decoded_type, points = barcode_detector.detectAndDecode(img)

# copy of original image
img2 = img.copy()

# proceed further only if at least one barcode is detected:
if retval:
    points = points.astype(np.int32)
    for i, point in enumerate(points):
        img2 = cv2.drawContours(img2, [point], 0, (0, 255, 0), 2)

        # uncomment the following to print decoded information
        #x1, y1 = point[1]
        #y1 = y1 - 10
        #cv2.putText(img2, decoded_info[i], (x1, y1), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 0), 3, 2)
Result:
Detected barcode:
Detected barcode and information:
One solution not discussed here is PyZbar.
It is helpful to know that there are a number of different types of barcodes, so reading this can be helpful. Each solution for decoding may have limitations on the types it can decode. @Jeru Luke's solution currently seems to support only EAN-13 barcodes; see the docs here.
Using PyZbar, a simple solution for getting the rect object (4 corners) along with the decoded data, plus the bonus of finding out which type the barcode is, can be done with this script.
Using this barcode
import cv2
from pyzbar.pyzbar import decode

file_path = r'c:\my_file'
img = cv2.imread(file_path)

detectedBarcodes = decode(img)
for barcode in detectedBarcodes:
    (x, y, w, h) = barcode.rect
    cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 5)
    print(barcode.rect)
    print(barcode.data)
    print(barcode.type)
output
Rect(left=77, top=1, width=665, height=516)
b'9771234567003'
EAN13
Using @Jeru Luke's code, you can then drawContours and putText.
ZBar supports
EAN-13/UPC-A,
UPC-E, EAN-8,
Code 128,
Code 93,
Code 39,
Codabar,
Interleaved 2 of 5,
QR Code
SQ Code.
So I think PyZbar will also support these types.
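If you need the actual four corner points rather than the axis-aligned rect, pyzbar also exposes a polygon attribute on each detection. A small sketch, with the file name assumed:

import cv2
import numpy as np
from pyzbar.pyzbar import decode

img = cv2.imread('barcode.jpg')
for barcode in decode(img):
    # polygon holds the corner points of the detected symbol
    pts = np.array([(p.x, p.y) for p in barcode.polygon], np.int32)
    cv2.polylines(img, [pts], isClosed=True, color=(0, 0, 255), thickness=3)
    print(pts.tolist())
cv2.imwrite('corners.png', img)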
I have recently downloaded some Python code that tries to scan a receipt (or prepare it for scanning). I tried to run the code, but there seems to be a problem: Python doesn't recognize the module 'rect'. I tried to download the module, but no such module is available. I'm stuck and am wondering what to do.
Note: only one line of code uses the module.
Code:
import cv2
import numpy as np
import rect
# add image here.
image = cv2.imread('test_pic.jpg')
# resize image
# choose optimal dimensions
image = cv2.resize(image, (1500, 880))
# create copy of original image
orig = image.copy()
# convert to grayscale and blur to smooth
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# gaussian blur to smoothen texture
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
#blurred = cv2.medianBlur(gray, 5)
# apply Canny Edge Detection
edged = cv2.Canny(blurred, 0, 50)
orig_edged = edged.copy()
# find the contours in the edged image
# keep only the largest ones, and
# initialize screen contour
(contours, _) = cv2.findContours(edged, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)
contours = sorted(contours, key=cv2.contourArea, reverse=True)
#x,y,w,h = cv2.boundingRect(contours[0])
#cv2.rectangle(image,(x,y),(x+w,y+h),(0,0,255),0)
# get approximate contour
for c in contours:
    p = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.02 * p, True)
    if len(approx) == 4:
        target = approx
        break
# map target points to 800x800 quadrilateral
approx = rect.rectify(target)
pts2 = np.float32([[0,0],[800,0],[800,800],[0,800]])
M = cv2.getPerspectiveTransform(approx,pts2)
dst = cv2.warpPerspective(orig,M,(800,800))
cv2.drawContours(image, [target], -1, (0, 255, 0), 2)
dst = cv2.cvtColor(dst, cv2.COLOR_BGR2GRAY)
# use thresholding on warped image to get scanned effect (If Required)
ret,th1 = cv2.threshold(dst,127,255,cv2.THRESH_BINARY)
th2 = cv2.adaptiveThreshold(dst, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                            cv2.THRESH_BINARY, 11, 2)
th3 = cv2.adaptiveThreshold(dst, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                            cv2.THRESH_BINARY, 11, 2)
ret2,th4 = cv2.threshold(dst,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
# show results
cv2.imshow("Original.jpg", orig)
cv2.imshow("Original Gray.jpg", gray)
cv2.imshow("Original Blurred.jpg", blurred)
cv2.imshow("Original Edged.jpg", orig_edged)
cv2.imshow("Outline.jpg", image)
cv2.imshow("Thresh Binary.jpg", th1)
cv2.imshow("Thresh mean.jpg", th2)
cv2.imshow("Thresh gauss.jpg", th3)
cv2.imshow("Otsu's.jpg", th4)
cv2.imshow("dst.jpg", dst)
# other thresholding methods
"""
ret,thresh1 = cv2.threshold(dst,127,255,cv2.THRESH_BINARY)
ret,thresh2 = cv2.threshold(dst,127,255,cv2.THRESH_BINARY_INV)
ret,thresh3 = cv2.threshold(dst,127,255,cv2.THRESH_TRUNC)
ret,thresh4 = cv2.threshold(dst,127,255,cv2.THRESH_TOZERO)
ret,thresh5 = cv2.threshold(dst,127,255,cv2.THRESH_TOZERO_INV)
cv2.imshow("Thresh Binary", thresh1)
cv2.imshow("Thresh Binary_INV", thresh2)
cv2.imshow("Thresh Trunch", thresh3)
cv2.imshow("Thresh TOZERO", thresh4)
cv2.imshow("Thresh TOZERO_INV", thresh5)
"""
cv2.waitKey(0)
cv2.destroyAllWindows()
In the GitHub repo you presumably grabbed the code from, there's another file called rect.py, containing a single function rectify() that is used by the main program. In Python, you can split functions into separate .py modules and import them into your code for better encapsulation, although in this code it seems really unnecessary to keep the rectify() function in a different file altogether. Something equally basic but more common is one .py file with all your functions and a main .py file that uses them.
Edit: so to be clear, the rect module is in the repo itself. A word of advice: generally clone the whole repo unless you know for sure that you don't need the other files in it.
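If you'd rather not carry the extra file around, rectify() in such scanner scripts is typically just the classic corner-ordering helper for a perspective transform. A sketch of that idea, not necessarily the exact code from the repo:

import numpy as np

def rectify(h):
    # order 4 contour points as top-left, top-right, bottom-right,
    # bottom-left, the source order expected by
    # cv2.getPerspectiveTransform in the script above
    h = h.reshape((4, 2)).astype(np.float32)
    hnew = np.zeros((4, 2), dtype=np.float32)
    add = h.sum(axis=1)
    hnew[0] = h[np.argmin(add)]   # top-left: smallest x + y
    hnew[2] = h[np.argmax(add)]   # bottom-right: largest x + y
    diff = np.diff(h, axis=1)
    hnew[1] = h[np.argmin(diff)]  # top-right: smallest y - x
    hnew[3] = h[np.argmax(diff)]  # bottom-left: largest y - x
    return hnew

Dropping this into the main script (removing import rect and calling rectify(target) directly) should behave equivalently, assuming the repo's rectify does this standard ordering.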