I would like to extract the time from some images using pytesseract in Python, but it doesn't produce anything.
The code I was using is as follows:
import pytesseract
from PIL import Image, ImageOps
im = Image.open(r'im.jpg')
im_invert = ImageOps.invert(im)
text = pytesseract.image_to_string(im_invert)
print(text)
The original image:
Image after inversion operation:
When I ran the code above, the only thing I got was an empty result.
Is there anything wrong with my code?
If you can use EasyOCR, then the approach below works for your input image.
I tested the given original image in Google Colab. To show the output images locally, use cv2.imshow(...) and cv2.waitKey(0) instead of cv2_imshow.
Here, a median blur is first applied to the grayscale image; then thresholding, erosion, and dilation are applied. Median blur + thresholding produces almost the same confidence as median blur + thresholding + erosion + dilation in this case.
OCR prediction (bounding box, text, confidence) for each preprocessed image:
Thresholding:
[([[3, 1], [270, 1], [270, 60], [3, 60]], '09:01:00', 0.797291100025177)]
Erosion:
[([[2, 2], [270, 2], [270, 58], [2, 58]], '09:01:00', 0.4145631492137909)]
Dilation:
[([[3, 1], [270, 1], [270, 60], [3, 60]], '09:01:00', 0.7948697805404663)]
Code
import cv2
import easyocr
import numpy as np
from PIL import Image
from google.colab.patches import cv2_imshow
# need to run only once to load model into memory
reader = easyocr.Reader(['ch_sim','en'])
# read as grayscale, denoise, and binarize
img = cv2.imread('1.jpg', 0)
img = cv2.medianBlur(img, 5)
ret, th1 = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
#th1 = cv2.bitwise_not(th1)
# morphological cleanup with a 3x3 kernel
kernel = np.ones((3, 3), np.uint8)
erosion = cv2.erode(th1, kernel, iterations=1)
dilation = cv2.dilate(erosion, kernel, iterations=1)
print("Thresholding:\n")
cv2_imshow(th1)
print("\nErosion:\n")
cv2_imshow(erosion)
print("\nDilation:\n")
cv2_imshow(dilation)
print("Thresholding:")
result = reader.readtext(th1)
print(result)
print("Erosion:")
result = reader.readtext(erosion)
print(result)
print("Dilation:")
result = reader.readtext(dilation)
print(result)
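If you would rather stay with pytesseract, as in the question, here is a minimal sketch along the same preprocessing lines; the page-segmentation mode and character whitelist are untested suggestions, not something verified on the exact image:
import cv2
import pytesseract
# binarize as above, then tell Tesseract to expect a single line of digits and colons
img = cv2.imread('1.jpg', 0)
img = cv2.medianBlur(img, 5)
_, th = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
text = pytesseract.image_to_string(
    th, config='--psm 7 -c tessedit_char_whitelist=0123456789:')
print(text)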
Related
I am new to image processing. I am processing a fiber image to generate its skeleton using the morphology functions from skimage. The generated skeleton shows lots of small, unnecessary loops/circles around it. I need a single medial-axis line (skeleton) for each fiber in the image. I used the following code to generate the skeleton; the original and skeleton images are attached for reference. Can someone help me improve the skeleton generation step?
import cv2
import numpy as np
from skimage import morphology
img = cv2.imread('img.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# dilate and threshold
kernel = np.ones((2, 2), np.uint8)
dilated = cv2.dilate(gray, kernel, iterations=1)
ret, thresh = cv2.threshold(dilated, 215, 255, cv2.THRESH_BINARY_INV)
cv2.imwrite('Binary.jpg',thresh)
# skeletonize
skeleton = morphology.skeletonize(thresh, method='lee')
skeleton = morphology.remove_small_objects(skeleton.astype(bool), 100, connectivity=2)
cv2.imwrite('Skeleton.jpg', skeleton*255)
I'm not seeing the skeleton image. Here's what I was able to come up with using binary thresholding + morphological operations + small-object cleanup:
from skimage.morphology import binary_opening, binary_closing, thin, disk, remove_small_objects, remove_small_holes
from skimage.filters import threshold_otsu
from skimage.color import rgb2gray
from skimage.io import imread
import matplotlib.pyplot as plt
img = imread('img.jpg')
img_gray = rgb2gray(img)  # image was rgb for me
# threshold slightly above Otsu to pick up the fainter fiber pixels
mask = img_gray < threshold_otsu(img_gray) + 15/255
# close small gaps, then remove thin protrusions
mask = binary_opening(binary_closing(mask, disk(2)), disk(2))
mask = remove_small_objects(mask, 300)
mask = remove_small_holes(mask, 300)
skeleton = thin(mask)
plt.imshow(skeleton, cmap='gray')
plt.show()
You can probably play around with the kernel shape for the opening/closing, as well as use individual erosion/dilation operations, to get a stronger output. You may also try different skeletonizing techniques, since different ones produce varying results; a quick way to compare them is sketched below.
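For example, a side-by-side comparison of a few of skimage's skeletonization methods (a sketch, assuming the mask variable from the snippet above):
from skimage.morphology import skeletonize, medial_axis, thin
import matplotlib.pyplot as plt
# run four skeletonization variants on the same binary mask
methods = {
    'thin': thin(mask),
    'skeletonize': skeletonize(mask),
    'skeletonize (lee)': skeletonize(mask, method='lee'),
    'medial_axis': medial_axis(mask),
}
fig, axes = plt.subplots(1, 4, figsize=(16, 4))
for ax, (name, skel) in zip(axes, methods.items()):
    ax.imshow(skel, cmap='gray')
    ax.set_title(name)
    ax.axis('off')
plt.show()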
I want to extract numbers from a captcha image, so I tried the code from this answer:
try:
from PIL import Image
except ImportError:
import Image
import pytesseract
import cv2
file = 'sample.jpg'
img = cv2.imread(file, cv2.IMREAD_GRAYSCALE)
img = cv2.resize(img, None, fx=10, fy=10, interpolation=cv2.INTER_LINEAR)
img = cv2.medianBlur(img, 9)
th, img = cv2.threshold(img, 185, 255, cv2.THRESH_BINARY)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (4,8))
img = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)
cv2.imwrite("sample2.jpg", img)
file = 'sample2.jpg'
text = pytesseract.image_to_string(file)
print(''.join(x for x in text if x.isdigit()))
It worked fine for this image (output: 436359), but when I tried it on this image, it gave me nothing (the output was empty).
How can I modify my code to get the numbers as a string from the second image?
EDIT:
I tried Matt's answer and it worked just fine for the image above, but it doesn't recognise numbers like the 8 and 1 in image A, or the 7 in image B.
image A
image B
How to fix that?
Often, getting OCR just right on an image like this comes down to the order and parameters of the transformations. For example, in the following code snippet, I first convert to grayscale, then erode the pixels, then dilate, then erode again. I use thresholding to convert to binary (just blacks and whites) and then dilate and erode one more time. For me this produces the correct value of 859917 and should be reproducible.
import cv2
import numpy as np
import pytesseract
file = 'sample2.jpg'
img = cv2.imread(file)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# erode with a thin kernel, dilate with a slightly larger one, then erode again
ekernel = np.ones((1, 2), np.uint8)
eroded = cv2.erode(gray, ekernel, iterations=1)
dkernel = np.ones((2, 3), np.uint8)
dilated = cv2.dilate(eroded, dkernel, iterations=1)
ekernel = np.ones((2, 2), np.uint8)
eroded_again = cv2.erode(dilated, ekernel, iterations=1)
# binarize, then dilate and erode one final time
th, threshed = cv2.threshold(eroded_again, 200, 255, cv2.THRESH_BINARY)
dkernel = np.ones((2, 2), np.uint8)
threshed_dilated = cv2.dilate(threshed, dkernel, iterations=1)
ekernel = np.ones((2, 2), np.uint8)
threshed_eroded = cv2.erode(threshed_dilated, ekernel, iterations=1)
text = pytesseract.image_to_string(threshed_eroded)
print(''.join(x for x in text if x.isdigit()))
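If some digits are still missed, pytesseract's image_to_data can show what Tesseract actually detected and with what confidence, which helps narrow down whether a digit is lost in preprocessing or in recognition. A debugging sketch, reusing the variables above (the whitelist config is an untested suggestion):
# inspect per-word detections and confidences
data = pytesseract.image_to_data(threshed_eroded,
                                 config='--psm 7 -c tessedit_char_whitelist=0123456789',
                                 output_type=pytesseract.Output.DICT)
for word, conf in zip(data['text'], data['conf']):
    if word.strip():
        print(word, conf)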
I have an image that looks like this:
And this is the processed image
I have tried pretty much everything. I processed the image like this:
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # converting to grayscale
(h, w) = gray.shape[:2]
gray = cv2.resize(gray, (w*2, h*2))  # up-sample 2x
thresh = cv2.threshold(gray, 150, 255.0, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
rectKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))  # kernel for the closing (missing from the original snippet; 3x3 assumed)
gray = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, rectKernel)
blur = cv2.GaussianBlur(gray, (1, 1), cv2.BORDER_DEFAULT)
text = pytesseract.image_to_string(blur, config="--oem 1 --psm 6")
But Tesseract doesn't print out anything. I am using this version of Tesseract:
5.0.0-alpha.20201127
How do I improve its performance? It's highly unreliable.
Edit:
The answer below did a wonderful job on the said image, but when I apply the same technique to an image like this one, I get the wrong output.
Why is that? They seem roughly the same.
The problem is that the characters are not centered in the image.
Tesseract sometimes has difficulty recognizing characters or digits that are not in the center.
Therefore my suggestions are:
Center the characters
Up-sample and convert to gray-scale
Centering the characters:
cv2.copyMakeBorder(img, 50, 50, 50, 50, cv2.BORDER_CONSTANT, value=[255])
50 is just a padding value; you can set it to any other value.
The background turns blue because of the value argument. OpenCV reads images in BGR order, so passing 255 is the same as [255, 0, 0], which fills the blue channel but not the green and red ones.
You can try other values, as shown below. For me it won't matter, since I'll convert the image to gray-scale in the next step.
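For example, passing an explicit white triple avoids the blue tint:
img = cv2.copyMakeBorder(img, 50, 50, 50, 50, cv2.BORDER_CONSTANT,
                         value=[255, 255, 255])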
Up-sampling and converting to gray-scale:
The same steps you have already done: the first three lines of your code.
Now when you run OCR, the result is:
MEHVISH MUQADDAS
Code:
import cv2
import pytesseract
# Load the image
img = cv2.imread("onf0D.jpg")
# Center the image
img = cv2.copyMakeBorder(img, 50, 50, 50, 50, cv2.BORDER_CONSTANT, value=[255])
# Up-sample
img = cv2.resize(img, (0, 0), fx=2, fy=2)
# Convert to gray-scale
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# OCR
txt = pytesseract.image_to_string(gry, config="--psm 6")
print(txt)
Read more at tesseract-improve-quality.
You don't need thresholding, GaussianBlur, or morphologyEx here.
The reasons are:
Simple thresholding is used to extract the features of the image, but the input image's features are already clearly visible.
You don't have to smooth the image; there is no illumination effect on it.
You don't need segmentation, since the background is plain white.
Update-1
The second image requires pre-processing; however, a simple threshold won't work on it. You need to remove the background using a binary mask, and then you can apply OCR.
Result of the binary-mask:
Now, if you apply OCR:
IRUM FEROZ
Code:
import cv2
import numpy as np
import pytesseract
# Load the image
img = cv2.imread("jCMft.jpg")
# Center the image
img = cv2.copyMakeBorder(img, 50, 50, 50, 50, cv2.BORDER_CONSTANT, value=[255])
# Up-sample
img = cv2.resize(img, (0, 0), fx=2, fy=2)
# Convert to HSV color-space
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Binary mask: keep dark pixels (the text), drop the bright background
msk = cv2.inRange(hsv, np.array([0, 0, 0]), np.array([179, 255, 130]))
# OCR
txt = pytesseract.image_to_string(msk, config="--psm 6")
print(txt)
Q: How do I find the lower and upper bounds for the cv2.inRange method?
A: You can use a small trackbar script, like the sketch below.
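A minimal version of such a tuning script (the window and trackbar names are illustrative); move the sliders until only the text survives in the mask:
import cv2
import numpy as np
img = cv2.imread("jCMft.jpg")
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
cv2.namedWindow("mask")
# one trackbar per HSV bound; the upper bounds start at their maximum
for name, start, maxval in [("H low", 0, 179), ("S low", 0, 255), ("V low", 0, 255),
                            ("H high", 179, 179), ("S high", 255, 255), ("V high", 255, 255)]:
    cv2.createTrackbar(name, "mask", start, maxval, lambda x: None)
while True:
    lo = np.array([cv2.getTrackbarPos(n, "mask") for n in ("H low", "S low", "V low")])
    hi = np.array([cv2.getTrackbarPos(n, "mask") for n in ("H high", "S high", "V high")])
    cv2.imshow("mask", cv2.inRange(hsv, lo, hi))
    if cv2.waitKey(30) & 0xFF == 27:  # press Esc to quit
        break
cv2.destroyAllWindows()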
Q: What did you change in the second image?
A: First, I converted the image to the HSV format instead of gray-scale, because I wanted to remove the background. If you experiment with adaptiveThreshold, you will see there are a lot of artifacts on the background that limit Tesseract's recognition. Then I used cv2.inRange to get a binary mask, and feeding that binary mask to Tesseract gave me the desired result.
I am working on some X-ray images, and I want to detect and segment a region of interest from the image.
Consider the input image:
I want to detect square-like shapes in the image, as highlighted here:
Output: the region of interest should look something like this:
This is the code I have so far:
import cv2
import numpy as np
import pandas as pd
import os
from PIL import Image
import matplotlib.pyplot as plt
from skimage.io import imread, imshow
img = cv2.imread('image.jpg')  # read in color; imread(..., 0) is already grayscale and would make the cvtColor below fail
imshow(img)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
imshow(gray)
equ = cv2.equalizeHist(gray)
imshow(equ)
img_resize = cv2.resize(img, (300, 300))
print(img_resize.shape)
figure_size = 9
new_image_gauss = cv2.GaussianBlur(img_resize, (figure_size, figure_size),0)
imshow(new_image_gauss)
img_edge = cv2.Canny(equ,100,200)
# show the image edges on the newly created image window
imshow(img_edge)
kernel = np.ones((5,5), np.uint8)
img_erosion = cv2.erode(img_edge, kernel, iterations=1)
img_dilation = cv2.dilate(img_edge, kernel, iterations=1)
imshow(img_erosion)
These are the results I have gotten:
Please guide me.
TIA
One thing that might help is to apply a morphology gradient to your image in Python/OpenCV to emphasize the edges before you do your Canny edge detection.
Input:
import cv2
import numpy as np
# read image
img = cv2.imread("xray2.jpg")
# convert img to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# do morphology gradient
kernel = cv2.getStructuringElement(cv2.MORPH_RECT , (3,3))
morph = cv2.morphologyEx(gray, cv2.MORPH_GRADIENT, kernel)
# apply gain
morph = cv2.multiply(morph, 5)
# write results
cv2.imwrite("xray2_gradient_edges.jpg",morph)
# show lines
cv2.imshow("morph", morph)
cv2.waitKey(0)
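From there, one possible way to pick out the square-like regions the question asks about, reusing the img and morph variables from the snippet above (a sketch that is not part of the answer itself; the threshold and minimum area are guesses that will need tuning):
# binarize the gradient image, then look for large four-corner contours
_, bw = cv2.threshold(morph, 60, 255, cv2.THRESH_BINARY)
contours = cv2.findContours(bw, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]  # works on OpenCV 3.x and 4.x
for c in contours:
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.04 * peri, True)
    if len(approx) == 4 and cv2.contourArea(c) > 1000:
        x, y, w, h = cv2.boundingRect(approx)
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imshow("squares", img)
cv2.waitKey(0)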
I am trying to extract a small scratch from the noisy image shown below. It is quite noticeable by eye, but I would like to identify it using OpenCV in Python.
I tried blurring the image, subtracting the blur from the original, and then thresholding, which produced the second image.
Could anybody please advise how to extract this scratch?
Original image:
Image after blurring, subtraction, and threshold:
This is how I process this image:
import cv2
import numpy as np
from matplotlib import pyplot as plt
img = cv2.imread("scratch0.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur = cv2.blur(gray,(71,71))
diff = cv2.subtract(blur, gray)
ret, th = cv2.threshold(diff, 13, 255, cv2.THRESH_BINARY_INV)
cv2.imshow("threshold", th)
cv2.waitKey(0)
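One direction worth exploring from here (a sketch, not tested on the actual image): invert the threshold so the scratch is white, remove speckle noise with a small opening, and then keep only elongated connected components, since a scratch is much longer than it is wide.
import cv2
import numpy as np
img = cv2.imread("scratch0.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur = cv2.blur(gray, (71, 71))
diff = cv2.subtract(blur, gray)
ret, th = cv2.threshold(diff, 13, 255, cv2.THRESH_BINARY_INV)
# make the scratch white on black, then drop isolated noise pixels
fg = cv2.bitwise_not(th)
fg = cv2.morphologyEx(fg, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))
# keep only blobs that are reasonably large and elongated (scratch candidates)
n, labels, stats, _ = cv2.connectedComponentsWithStats(fg)
mask = np.zeros_like(fg)
for i in range(1, n):
    x, y, w, h, area = stats[i]
    if area > 50 and max(w, h) > 3 * min(w, h):
        mask[labels == i] = 255
cv2.imshow("scratch candidates", mask)
cv2.waitKey(0)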