Tesseract doesn't recognize certain pictures. Python

Tesseract works fine with other pictures, but whenever I use this picture it doesn't recognize any text in it.
Can someone explain why, please?
import cv2
import pytesseract
import time
import random
from pynput.keyboard import Controller
keyboard = Controller() # Create the controller
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
img = cv2.imread("capture5.png")
#img = cv2.resize(img, (300, 300))
cv2.imshow("capture5", img)
text = pytesseract.image_to_string(img)
print(text)
cv2.waitKey(0)
cv2.destroyAllWindows()

I fixed my problem. All I needed to do was add this code to my script:
text = pytesseract.image_to_string(
    img,
    config="-c tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZ --psm 10",
)

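The config string in the fix above combines two settings: tessedit_char_whitelist restricts recognition to the listed characters, and --psm 10 tells Tesseract to treat the image as a single character. A minimal sketch of building that string, assuming a hypothetical helper named build_tesseract_config (it is not part of pytesseract):

```python
import string


def build_tesseract_config(whitelist, psm):
    """Build a Tesseract config string (hypothetical helper, not a pytesseract API)."""
    # tessedit_char_whitelist limits recognition to the given characters.
    # --psm selects the page segmentation mode; 10 means "single character".
    return f"-c tessedit_char_whitelist={whitelist} --psm {psm}"


config = build_tesseract_config(string.ascii_uppercase, 10)
print(config)
```

The result can then be passed as pytesseract.image_to_string(img, config=config). If the picture contains a whole word or line rather than one character, a different --psm value may suit it better.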
Related

I'm trying to read the data from an image using pytesseract

I tried to read the image using pytesseract, but the code is not able to read it. Please check the photo I am using for reading the text.
Below is my code:
import cv2
import time
import pyscreenshot as ImageGrab
import pytesseract
pytesseract.pytesseract.tesseract_cmd=r'C:/Users/RTam/AppData/Local/Programs/Tesseract-OCR/tesseract.exe'
def takescreenshot():
    path = r'C:\Users\RTam\Desktop\python basics\web scraping\Pyautogui\photos'
    im = ImageGrab.grab(bbox=(900, 1000, 1200, 1100))
    im.save(path + '\\' + 'ss.png')
img= cv2.imread(r'C:\Users\RTam\Desktop\python basics\web scraping\Pyautogui\photos\ss3.png')
cv2.imshow('sample',img)
cv2.waitKey(0)
cv2.destroyAllWindows()
sample_text= pytesseract.image_to_string(img)
print(sample_text)
The only output I am getting is an empty space. Please help.

Eventually, I found the answer to my question.
Note that this code did not run properly for me in the Spyder IDE, so make sure you have the latest Tesseract version installed.
import cv2
import time
import pyscreenshot as ImageGrab
import pytesseract
pytesseract.pytesseract.tesseract_cmd=r'C:/Users/RTam/AppData/Local/Programs/Tesseract-OCR/tesseract.exe'
def takescreenshot():
    path = r'C:\Users\RTam\Desktop\python basics\web scraping\Pyautogui\photos'
    im = ImageGrab.grab(bbox=(900, 1000, 1200, 1100))
    im.save(path + '\\' + 'ss.png')
img = cv2.imread(r'C:\Users\RTam\Desktop\python basics\web scraping\Pyautogui\photos\ss3.png')
def clerify_pic():
    # Upscale 2x so small characters are easier to recognize
    img2 = cv2.resize(img, (0, 0), fx=2, fy=2)
    # Convert to grayscale, then binarize with Otsu's automatic threshold
    gry = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)
    thr = cv2.threshold(gry, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
    return pytesseract.image_to_string(thr)
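
To illustrate what the Otsu step in the answer does, here is a minimal pure-Python sketch of Otsu's method on a flat list of grayscale values. It is an illustration only, not OpenCV's implementation; the function names otsu_threshold and binarize are made up for this example.

```python
def otsu_threshold(pixels):
    """Return the threshold (0-255) that maximises between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels at or below the candidate threshold
    sum_bg = 0  # sum of their intensities
    for t in range(256):
        w_bg += hist[t]
        if w_bg == 0:
            continue  # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # everything is background from here on
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, threshold):
    """Map pixels above the threshold to 255 and the rest to 0."""
    return [255 if p > threshold else 0 for p in pixels]


pixels = [20, 25, 30, 200, 210, 220]  # dark text values and bright background values
t = otsu_threshold(pixels)
print(t, binarize(pixels, t))
```

The threshold lands between the dark and bright groups, so the image splits cleanly into black and white, which is generally easier for Tesseract to read than a low-contrast original.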

Blank output after solving the captcha using pytesseract?

I am trying to solve a captcha:
and run this script:
from PIL import Image
from pytesseract import pytesseract
path_to_tesseract = r"/usr/local/Cellar/tesseract/5.0.1/bin/tesseract"
image_path2 = r"captcha2.jpg"
img = Image.open(image_path2)
pytesseract.tesseract_cmd = path_to_tesseract
text = pytesseract.image_to_string(img)
print(text[:-1])
captchaText=text[:-1]
But the output is blank. When I use the same script with the following captcha:
it works great.

How to run pyttsx3 and OpenCV simultaneously?

Hello, I have the text detector code below. It uses OpenCV to draw green squares around each piece of detected text, and it works well. I wanted to expand the project by speaking the detected words with the pyttsx3 module, but when I run the code, the window is not displayed while the detected text is being spoken.
import cv2
import pyttsx3
from pytesseract import pytesseract
pytesseract.tesseract_cmd = "C:\\Program Files\\Tesseract-OCR\\tesseract.exe"
img = cv2.imread('C:\\users\\HP\\Documents\\Yt thumbnails\\vector.png')
# Print the text contained in image
img2text = pytesseract.image_to_string(img)
print(img2text)
height,width,c = img.shape
letter_boxes = pytesseract.image_to_boxes(img)
say = pyttsx3.init()
speech = img2text
say.say(speech)
say.runAndWait()
for box in letter_boxes.splitlines():
    box = box.split()
    x,y,w,h = int(box[1]),int(box[2]),int(box[3]),int(box[4]) # Left, bottom, right, top of the box
    cv2.rectangle(img, (x,height-y), (w,height-h),(0,0,255),3) # Draw the box
    cv2.putText(img,box[0],(x,height-h+32), cv2.FONT_HERSHEY_COMPLEX,1,(0,255,0),2) # Label it with the character
cv2.imshow('Window',img) # Display window
cv2.waitKey(0)
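
The loop in the question relies on the format of image_to_boxes output: one line per character, "char left bottom right top page", with the origin at the bottom-left of the image, whereas OpenCV uses a top-left origin. A minimal sketch of that conversion, using a hypothetical helper boxes_to_rectangles and a hand-written sample string instead of real OCR output:

```python
def boxes_to_rectangles(box_text, image_height):
    """Convert Tesseract box lines into (char, top_left, bottom_right) tuples.

    Hypothetical helper: Tesseract measures y from the bottom of the image,
    so y values are flipped to match OpenCV's top-left origin.
    """
    rects = []
    for line in box_text.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        char = parts[0]
        left, bottom, right, top = (int(v) for v in parts[1:5])
        rects.append((char, (left, image_height - top), (right, image_height - bottom)))
    return rects


# Hand-written sample in the box format, for an image 100 pixels tall
sample = "H 10 20 30 60 0\ni 35 20 45 55 0"
rects = boxes_to_rectangles(sample, image_height=100)
print(rects)
```

Each tuple's two corner points can be passed directly to cv2.rectangle as the top-left and bottom-right corners.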

NameError: name 'img_new' is not defined, how to fix?

I'm trying to get pytesseract to identify single characters in an image rather than words.
This code works, but it only detects words, not single characters in the image:
#importing modules
import pytesseract
from PIL import Image
# If you don't have tesseract executable in your PATH, include the following:
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract'
#converting image to text
print(pytesseract.image_to_string(Image.open(r'C:\Program Files\Tesseract-OCR\image2.png')))
My attempt at detecting single characters:
#importing modules
import pytesseract
from PIL import Image
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract'
#converting image to text
text = pytesseract.image_to_string(img_new, lang='eng', config='--psm 10')
print(pytesseract.image_to_string(Image.open(r'C:\Program Files\Tesseract-OCR\image2.png')))
I get this error:
text = pytesseract.image_to_string(img_new, lang='eng', config='--psm 10')
NameError: name 'img_new' is not defined

How to put text on multiple images using python?

from PIL import Image, ImageDraw, ImageFont
import glob
import os
images = glob.glob("directory_path/*.jpg")
for img in images:
    images = Image.open(img)
    draw = ImageDraw.Draw(images)
    font = ImageFont.load_default() #Downloaded Font from Google font
    text = "Text on all images from directory"
    draw.text((0,150),text,(250,250,250),font=font)
    images.save(img)
I have to put text on all images. I have tried the code above, but it's not working.

This code worked fine for me, but the text was hard to read because it was small and white. I changed directory_path to images and put my images in there. The resulting images looked like this, with small text on the left side:

Here is the solution
from PIL import Image,ImageDraw,ImageFont
import glob
import os
images=glob.glob("path/*.jpg")
for img in images:
    images=Image.open(img)
    draw=ImageDraw.Draw(images)
    font=ImageFont.load_default()
    text="Whatever text"
    draw.text((0,240),text,(250,250,250),font=font)
    images.save(img)

One possible problem with the code is that you use the images variable both for the list of image paths and for each opened image inside the loop.
Try this code, which uses separate names and a more visible text color:
from PIL import Image, ImageDraw, ImageFont
import glob
import os
images = glob.glob("new_dir/*.jpg")
print(images)
for img in images:
    image = Image.open(img)
    draw = ImageDraw.Draw(image)
    font = ImageFont.load_default() # Pillow's built-in default font
    text = "Text on all images from directory"
    draw.text((0,150),text,fill = 'red' ,font=font)
    image.save(img)
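
As a side note on the variable naming point: rebinding the list's name inside the loop does not stop the iteration in Python, because the for statement keeps its own iterator over the original list. Separate names mainly help readability. A minimal sketch with placeholder strings instead of real image files:

```python
paths = ["a.jpg", "b.jpg", "c.jpg"]  # placeholder file names, no real files needed

visited = []
images = paths
for img in images:
    # Rebind the name "images", as the question's code does with Image.open(img)
    images = f"opened:{img}"
    visited.append(img)

print(visited)  # every path is still visited
print(images)   # the name now refers to the last "opened" item, not the list
```

So if no text shows up on the images, the cause is more likely something else, such as white text on a light background or a glob pattern that matches no files.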
