I know this has been asked before, and I have been trying several different methods and changing things, but I cannot figure out how to get this to work. I have a bunch of pages where this works perfectly: the text is clear and perfectly laid out. But for some reason, on one of the sheets it reads completely wrong information. Below I have attached my code, the output, and the image.
import pytesseract
import cv2
import numpy as np

img = cv2.imread('page_3.jpg')
img = cv2.resize(img, None, fx=2, fy=2)
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
kernel = np.ones((1, 1), np.uint8)  # defined but not used below
cv2.imwrite('thresh.png', img)

# Try several page segmentation modes and compare the results
for psm in range(6, 13 + 1):
    config = '--oem 3 --psm %d' % psm
    txt = pytesseract.image_to_string(img, config=config, lang='eng')
    print('psm ', psm, ':', txt)
Here is the photo:
And then here is the output. It works perfectly until the end for some reason, and all of the outputs (psm 6, 11, and 12) read exactly the same. Any help is appreciated.
1885-1015
1886-1280
1956-0044
2087-0047
2087-0155
2087-1433
2221-0093L
2221-0093R
2331-4628R
2992-/114R
29593-0007R
Your image does not require any pre-processing at all; it is already clean and well structured. So do not resize the image before passing it to tesseract. Resizing is not needed in your case.
Hope this helps.
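For illustration, a minimal sketch of that suggestion (assuming the same page_3.jpg from the question): read the page and hand it straight to tesseract, with no resizing or other pre-processing.

import cv2
import pytesseract

# Read the page as-is; no resizing, no grayscale conversion
img = cv2.imread('page_3.jpg')

# The page is a single uniform block of text, so --psm 6 is a reasonable choice
txt = pytesseract.image_to_string(img, config='--oem 3 --psm 6', lang='eng')
print(txt)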
Related
It seems the resolution of the image affects whether the output succeeds or not.
The resolution/quality of the images coming from the production line is usually like test image 1. Instead of changing the camera quality, is there any way to make the success rate higher? For example, improving the code or adding some simple AI to help with detection? I need a hand, thanks.
Here is the demo .py code I found in a tutorial:
from PIL import Image
import pytesseract
img = Image.open('new_003.png')
text = pytesseract.image_to_string(img, lang='eng')
print("size")
print(img.size)
print(text)
(pic) test image 1: https://ibb.co/VLsM9LL
size
(122, 119)
# the output is:
R carac7
(pic) test image 2: https://ibb.co/XyRcf45
size
(329, 249)
# the output is:
R1 oun,
2A
R ca7ac2
(pic) test image 3: https://ibb.co/fNtDRc7
this one is just for testing, but it is the only one that comes out 100% correct
size
(640, 640)
# the output is:
BREAKING THE STATUE
i have always known
i just didn't understand
the inner conflictions
arresting our hands
gravitating close enough
expansive distamce between
i couldn't give you more
but i meant everything
when the day comes
you find your heart
wants something more
than a viece and a part
your life will change
like astatue set free
to walk among us
to created estiny
we didn't break any rules
we didn't make mistakes
making beauty in loving
making lovine for days
SHILOW
I tried to find out/prove whether the solution can only be the image resolution, or whether there is an alternative way to solve this issue.
I tried dilation and erosion on the image, hoping to get a clearer image for the OCR to recognize, like the demo pic in this link: https://ibb.co/3pDgDnF
import cv2
import numpy as np
import matplotlib.pyplot as plt
import glob
from IPython.display import clear_output
def show_img(img, bigger=False):
    if bigger:
        plt.figure(figsize=(15, 15))
    image_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    plt.imshow(image_rgb)
    plt.show()

def sharpen(img, sigma=100):
    # sigma = 5, 15, 25
    blur_img = cv2.GaussianBlur(img, (0, 0), sigma)
    usm = cv2.addWeighted(img, 1.5, blur_img, -0.5, 0)
    return usm

def img_processing(img):
    # do something here
    img = sharpen(img)
    return img
img = cv2.imread("/home/joy/桌面/all_pic_OCR/simple_pic/03.png")
cv2.imshow('03', img) # Original image
img2 = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # grayscale copy (the dilation/erosion below still uses the original color image)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (11, 11))
img = cv2.dilate(img, kernel) # tried Dilation
cv2.imshow('image_after_Dilation', img) # image after Dilation
img = cv2.erode(img, kernel) # tried Erosion
cv2.imshow('then_Erosion', img) # image after Erosion
cv2.waitKey(0)
cv2.destroyAllWindows()
result: https://ibb.co/TbZjg3d
So I am still trying to get Python OCR to recognize the image text with 99.9999% accuracy.
I'm working with this kind of image (Original_Image) and I'm having some problems applying character recognition. I tried some image treatments (grayscale, black and white, noise removal, ...) and got only bad results. This is part of the code I'm working on in Python.
import cv2
from matplotlib import pyplot as plt
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Users\14231744700\AppData\Local\Tesseract-OCR\tesseract.exe"
image_file = '5295_down.bmp'
img = cv2.imread(image_file)
height,width,channels= img.shape
#The attached image is this one (img_cropped) and I want this data as a string to work on it
img_cropped = img[41*height//50:92*height//100,2*width//14:81*width//100]
#cv2.imshow('Image_cropped', img_cropped)
#cv2.imwrite('image_cropped.png', img_cropped)
#cv2.waitKey(0)
def image_to_string(image):
    data = pytesseract.image_to_string(image, lang='eng', config='--psm 6')
    return data

print(image_to_string(img_cropped))
If someone knows about a preprocessing step or any other possibility to get better results, I'll be very thankful.
I have a problem with the recognition: some of my input images that are visibly a 1 turn into a 4 after the .image_to_string() command.
My input image is this:
unedited img
I then run some preprocessing steps over it (greyscale, thresholding with Otsu, and enlarging the picture), leading to this:
preprocessed img
I also tried dilating the picture, with no change in the output.
After running:
custom_config = "-c tessedit_char_whitelist=0123456789LV --psm 13"
pytesseract.image_to_string(processed_img, config=custom_config)
The final result is a string displaying 4LV♀, and I don't understand what I can change to get a 1 instead of the 4.
Thanks in advance for your time.
The ♀ (or \n\x0c) appears because you need custom_config = "-c page_separator=''" in the config; for some reason tesseract adds it as the page separator. You don't need anything else in your config to remove it.
Getting the correct number is down to the processing, mainly the size. However, this code I found works best:
import pytesseract
from PIL import Image
import cv2
import numpy as np

pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
imagepath = "./Pytesseract Wrong Number/kD3Zy.jpg"
read_img = Image.open(imagepath)
# convert PIL image to cv2 image locally
read_img = read_img.convert('RGB')
level_img = np.array(read_img)
level_img = level_img[:, :, ::-1].copy()
# convert to grayscale
level_img = cv2.cvtColor(level_img, cv2.COLOR_RGB2GRAY)
level_img, img_bin = cv2.threshold(level_img, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
level_img = cv2.bitwise_not(img_bin)
kernel = np.ones((2, 1), np.uint8)
# make the image bigger, because it needs at least 30 pixels for the height for the characters
level_img = cv2.resize(level_img,(0,0),fx=4,fy=4, interpolation=cv2.INTER_CUBIC)
level_img = cv2.dilate(level_img, kernel, iterations=1)
# --debug--
#cv2.imshow("Debug", level_img)
#cv2.waitKey()
#cv2.destroyAllWindows
#cv2.imwrite("1.png", level_img)
custom_config = "-c page_separator=''"
level = pytesseract.image_to_string(level_img, config=custom_config)
print(level)
If you want to save the result, uncomment #cv2.imwrite("1.png", level_img).
Try the settings "--psm 8 --oem 3". The full list of page segmentation modes and engine modes is in the Tesseract documentation, though psm 8 and oem 3 generally work fine.
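For reference, a minimal sketch of passing those settings through pytesseract (the file name here is a placeholder, not from the question):

import cv2
import pytesseract

# 'digits.png' is a placeholder for your own image
img = cv2.imread('digits.png')

# psm 8: treat the image as a single word; oem 3: default engine selection
text = pytesseract.image_to_string(img, config='--psm 8 --oem 3')
print(text)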
I have some simple code that applies an effect to one of my pictures:
from matplotlib import pyplot as plt
import os
import cv2
path_in=os.path.join("C:/Users/Desktop/Images","glass.jpg")
img = cv2.imread(path_in, cv2.IMREAD_COLOR)
img=cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(img)
plt.show()
but the problem is that I want to apply this to all images in my folder, and I don't know their names.
So I understand that I need to create a loop over the list of images in the folder, but I tried this and it didn't work:
path_in = os.path.join("C:/Users/Desktop/Images")
list = os.listdir(path_in)
for img in list:
    img = cv2.imread(path_in, cv2.IMREAD_COLOR)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    io.imsave("C:/Users/Desktop/Images_new", image_converted)
    plt.imshow(img)
    plt.show()
I would be very glad if someone could tell me, what I am doing wrong. Thank you
You are already on the right track. There are just a few minor issues with your code:
First, define path_in inside the loop and join your folder path with the image name. Second, avoid reusing names that Python already defines: list is a built-in type name, so shadowing it with your own variable is a bad idea.
I don't know the functions imsave, imshow, and show in detail, so I'm not sure whether they should be inside the loop; you may need to just test it. Also, for imsave you may need to set a separate path for each image. In that case you could do it like io.imsave("C:/Users/Desktop/Images_new/{}".format(img), image_converted).
path_folder = "C:/Users/Desktop/Images"
img_list = os.listdir(path_folder)
for img in img_list:
    path_in = os.path.join(path_folder, img)
    img = cv2.imread(path_in, cv2.IMREAD_COLOR)
    image_converted = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    io.imsave("C:/Users/Desktop/Images_new", image_converted)
    plt.imshow(img)
    plt.show()
No 100% guarantee it works exactly as is. Let me know if you get an error.
path_folder = "C:/Users/Desktop/Images"
img_list = os.listdir(path_folder)
for img in img_list:
    path_in = os.path.join(path_folder, img)
    path_out = os.path.join("C:/Users/Desktop/Images_new", img)
    img = cv2.imread(path_in, cv2.IMREAD_COLOR)
    image_converted = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    io.imsave(path_out, image_converted, format='jpg')
    plt.imshow(image_converted)
    plt.show()
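One possible refinement, not part of the original answer: os.listdir returns every entry in the folder, so if the folder contains anything other than images, cv2.imread will return None for it and the conversion will crash. A sketch that filters by extension with glob, assuming io refers to skimage.io (the snippets above never show the import):

import glob
import os

import cv2
from skimage import io  # assumed source of io.imsave used above

path_folder = "C:/Users/Desktop/Images"
path_out_folder = "C:/Users/Desktop/Images_new"

# Only pick up .jpg files; add more patterns if the folder mixes formats
for path_in in glob.glob(os.path.join(path_folder, "*.jpg")):
    img = cv2.imread(path_in, cv2.IMREAD_COLOR)
    image_converted = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    io.imsave(os.path.join(path_out_folder, os.path.basename(path_in)), image_converted)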
Does anyone know how I can get these results better?
Total Kills: 15,230,550
Kill Details: (recorded after 2019/10,/Z3]
993,151 331,129
1,330,450 33,265,533
5,031,168
This is what it returns; however, it is meant to be the same as the image posted below. I am new to Python, so are there any parameters I can add to make it read the image better?
import cv2
import pytesseract

img = cv2.imread("kills.jpeg")
text = pytesseract.image_to_string(img)
print(text)
This is my code to read the image. Is there anything I can add to make it read better? The black boxes are there to cover images that were interfering with the reading; I added them to check whether the images behind them were causing the issue, but I still get the same result.
The missing piece of knowledge is the page segmentation mode (psm). You need to set it when you can't get the desired result.
If we look at your image, the only artifacts are the black columns. Other than that, the image looks like a binary image, suitable for tesseract to recognize the characters and the digits.
Let's try reading the image by setting the psm to 6.
6 Assume a single uniform block of text.
print(pytesseract.image_to_string(img, config="--psm 6"))
The result will be:
Total Kills: 75,230,550
Kill Details: (recorded after 2019/10/23)
993,161 331,129
1,380,450 33,265,533
5,031,168
Update
The second way to solve the problem is getting a binary mask and applying OCR to the mask features.
Binary-mask
Features of the binary-mask
As we can see, the result is slightly different from the input image. Now when we apply OCR, the result will be:
Total Kills: 75,230,550
Kill Details: (recorded after 2019/10/23)
993,161 331,129
1,380,450 33,265,533
5,031,168
Code:
import cv2
import numpy as np
import pytesseract
# Load the image
img = cv2.imread("LuKz3.jpg")
# Convert to hsv
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Get the binary mask
msk = cv2.inRange(hsv, np.array([0, 0, 0]), np.array([179, 255, 154]))
# Extract
krn = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 3))
dlt = cv2.dilate(msk, krn, iterations=5)
res = 255 - cv2.bitwise_and(dlt, msk)
# OCR
txt = pytesseract.image_to_string(res, config="--psm 6")
print(txt)
# Display
cv2.imshow("res", res)
cv2.waitKey(0)