I have thousands of images of a scale, and I would like to extract the scale reading from each one. However, Tesseract returns wrong values. I have tried several preprocessing steps, but I keep running into the same issue. My understanding so far is that, after defining the region of interest, the image has to be converted to white text on a black background. I am new to Python and have tried some functions to do this, without success. I would appreciate any help with this. The image is at the following link, as I couldn't upload it here because it is larger than 2 MiB:
https://mega.nz/file/fZMUDRbL#tg4Tc2VmGMMdEpnZzt7blxZjVLdlhMci9jll0FLnIGI
import cv2
import pytesseract
import matplotlib.pyplot as plt
import numpy as np
import imutils
## Reading Image File
Filename = 'C:\\Users\\Abdullah\\Desktop\\Scale Reading\\' #File Path For Images
IName = 'Disk_Test_1_09_07-00000_0.tif' # Image Name
Image = cv2.imread(Filename + IName,0)
## Image Processing
Image_Crop = Image[1680:1890, 550:1240] # Define Region of Interest of the image
#cv2.imshow("cropped", Image_Crop) # Show Cropped Image
#cv2.waitKey(0) # Show Cropped Image
Mask = Image_Crop > 10 # Threshold the image at a value of 10
Mask = np.array(Mask, dtype=np.uint8)
plt.imshow(Mask, alpha=1) # Set Opacity (Max 1)
ret,Binary = cv2.threshold(Mask,0,255,cv2.THRESH_BINARY)
#plt.imshow(Image_Crop, cmap="gray") # Transform Image to Gray
#plt.show()
plt.imshow(Binary,'gray',vmin=0,vmax=255)
plt.show()
## Number Recognition
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe' # Call Location of Tesseract-OCR
data = pytesseract.image_to_string(Binary, lang='eng',config='--psm 6')
print(data)
Here is the image after processing
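One thing that may be worth checking: Tesseract's documentation recommends dark text on a light background for the 4.x and later engines, which is the opposite of the polarity described above. A minimal sketch, assuming the binarised crop is a uint8 array of 0 and 255 and that the text covers less than half of the crop; `ensure_dark_on_light` is a hypothetical helper name, not part of OpenCV or pytesseract:

```python
import numpy as np

def ensure_dark_on_light(binary):
    """Return a 0/255 uint8 image with dark text on a light background.

    Assumes text pixels are the minority, so an image that is mostly
    black is treated as white-on-black and inverted.
    """
    binary = np.asarray(binary, dtype=np.uint8)
    if np.count_nonzero(binary) < binary.size / 2:
        return 255 - binary  # mostly black: invert the polarity
    return binary

# Synthetic stand-in for the thresholded crop: a white "digit" on black.
demo = np.zeros((20, 60), dtype=np.uint8)
demo[5:15, 10:20] = 255
fixed = ensure_dark_on_light(demo)
```

The result could then be passed to `pytesseract.image_to_string` in place of `Binary`. Whether this alone fixes the readings depends on the font and image quality.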
I'm working with this kind of image (Original_Image) and I'm having problems applying character recognition. I have tried some image treatments (grayscale, black and white, noise removal, ...) and got only bad results. This is part of the code I'm working on in Python.
import cv2
from matplotlib import pyplot as plt
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Users\14231744700\AppData\Local\Tesseract-OCR\tesseract.exe"
image_file = '5295_down.bmp'
img = cv2.imread(image_file)
height,width,channels= img.shape
#The attached image is this one (img_cropped) and I want this data as a string to work on it
img_cropped = img[41*height//50:92*height//100,2*width//14:81*width//100]
#cv2.imshow('Image_cropped', img_cropped)
#cv2.imwrite('image_cropped.png', img_cropped)
#cv2.waitKey(0)
def image_to_string(image):
    data = pytesseract.image_to_string(image, lang='eng', config='--psm 6')
    return data
image_to_string(img_cropped)
If anyone knows of a preprocessing step or any other way to get better results, I'd be very thankful.
I am trying to extract the values from photographs of a Ritter biogas counter; specifically, I want to get the numbers on the black meter. Here is an example:
I am trying to do this in Python, using the cv2 and pytesseract libraries. Currently my script looks like this:
import argparse
import cv2
import pytesseract
from matplotlib import pyplot as plt
# Parsing input arguments
parser = argparse.ArgumentParser(description='Analyze an image from Ritter counter and extract the measured gas volume')
parser.add_argument("--img", required=True, help="Route to image to be analyzed")
args = parser.parse_args()
img=str(args.img)
# Reading photo as a grayscale image
img = cv2.imread(img, 0)
print("Pixels (height x width):")
print(img.shape[:2])
# Cropping image
img = img[377:420, 295:660]
#Transforming image to a binary image using a fixed threshold
for i in range(65,85,1):
    thresh = cv2.threshold(img, i, 255, cv2.THRESH_TOZERO)[1]
    data = pytesseract.image_to_string(thresh, lang='eng',config='--psm 6')
    plt.imshow(thresh)
    plt.title("Fixed: " + str(i) + "; Result: " + data)
    plt.show()
However, glare differences across the image and the white lines of flash reflection in the counter's glass make it hard to preprocess the image for pytesseract. This is currently my best result:
I have tried OpenCV's adaptive thresholding, with no better results. The goal is to process several images similar to the uploaded one, each with small differences in the intensity and angle of the light reflection.
I am trying to crop an image with cv2, convert it to bytes (so that it does not need to be saved) and then run pytesseract on it.
This way I won't need to save the image twice during the process:
First, when I create the image
Second, when cropping the image
Process...
## CROPPING THE IMAGE REGION
ys, xs = np.nonzero(mask2)
ymin, ymax = ys.min(), ys.max()
xmin, xmax = xs.min(), xs.max()
croped = image[ymin:ymax, xmin:xmax]
pts = np.int32([[xmin, ymin],[xmin,ymax],[xmax,ymax],[xmax,ymin]])
cv2.drawContours(image, [pts], -1, (0,255,0), 1, cv2.LINE_AA)
#OPENCV IMAGE TO BYTES WITHOUT SAVING TO DISK
is_success, im_buf_arr = cv2.imencode(".jpg", croped)
byte_im = im_buf_arr.tobytes()
#PYTESSERACT IMAGE USING A BYTES FILE
Results = pytesseract.image_to_string(byte_im, lang="eng")
print(Results)
Unfortunately, I get the error: Unsupported image object
Am I missing something? Is there a way to do this without saving the file when cropping? Any help is highly appreciated.
You have croped, which is a NumPy array.
According to the pytesseract examples, you can simply do this:
# tesseract needs the right channel order
cropped_rgb = cv2.cvtColor(croped, cv2.COLOR_BGR2RGB)
# give the numpy array directly to pytesseract, no PIL or other acrobatics necessary
Results = pytesseract.image_to_string(cropped_rgb, lang="eng")
from PIL import Image
img_tesseract = Image.fromarray(croped)
Results = pytesseract.image_to_string(img_tesseract, lang="eng")
from PIL import Image
import io
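def bytes_to_image(image_bytes):
    io_bytes = io.BytesIO(image_bytes)
    return Image.open(io_bytes)
# Assumption: byte_array_image is the PIL image decoded from the question's byte_im
byte_array_image = bytes_to_image(byte_im)
pytesseract.image_to_data(byte_array_image,lang='eng')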
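The error comes from the input type: as far as I know, pytesseract accepts a PIL image, a NumPy array or a file path string, and rejects anything else, including raw bytes, with "Unsupported image object". A sketch of normalising the input before OCR, assuming Pillow is installed; `to_ocr_input` is a hypothetical helper, not a pytesseract function, and the array is a synthetic stand-in for the cropped region:

```python
import io

import numpy as np
from PIL import Image

def to_ocr_input(obj):
    """Decode encoded image bytes to a PIL image; pass other inputs through."""
    if isinstance(obj, (bytes, bytearray)):
        image = Image.open(io.BytesIO(obj))
        image.load()  # force decoding now rather than lazily
        return image
    return obj

# Synthetic stand-in for the cropped region.
crop = np.full((10, 30), 200, dtype=np.uint8)

# Encode to PNG bytes in memory, without touching the disk.
buffer = io.BytesIO()
Image.fromarray(crop).save(buffer, format="PNG")
encoded = buffer.getvalue()

decoded = to_ocr_input(encoded)
```

With the question's variables, `pytesseract.image_to_string(to_ocr_input(byte_im), lang="eng")` should then be accepted, although passing the array directly avoids the encode and decode round trip altogether.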
The requirement is to crop the region of interest from a binary image.
I need a rectangular image from a binary image, with the extra space around the region of interest removed.
For example:
From this original image I want only the region of interest marked with the yellow rectangle.
Note: The yellow rectangle is just for reference; it is not present in the image that will be processed.
I tried the following Python code, but it does not give the required output.
from PIL import Image
from skimage.io import imread
from skimage.morphology import convex_hull_image
import numpy as np
from matplotlib import pyplot as plt
from skimage import io
from skimage.color import rgb2gray
im = imread('binaryImageEdited.png')
plt.imshow(im)
plt.title('input image')
plt.show()
# create a binary image
im1 = 1 - rgb2gray(im)
threshold = 0.8
im1[im1 <= threshold] = 0
im1[im1 > threshold] = 1
chull = convex_hull_image(im1)
plt.imshow(chull)
plt.title('convex hull in the binary image')
plt.show()
imageBox = Image.fromarray((chull*255).astype(np.uint8)).getbbox()
cropped = Image.fromarray(im).crop(imageBox)
cropped.save('L_2d_cropped.png')
plt.imshow(cropped)
plt.show()
Thank you.
Your image is not actually binary on account of two things:
firstly, it has 26 colours, and
secondly it has an (entirely unnecessary) alpha channel.
You can trim it like this:
#!/usr/bin/env python3
from PIL import Image, ImageOps
# Open image, ensure greyscale and discard useless alpha channel
im = Image.open("thing.png").convert('L')
# Threshold and invert image as not actually binary
thresh = im.point(lambda p: p < 64 and 255)
# Get bounding box of thresholded image
bbox1 = thresh.getbbox()
crop1 = thresh.crop(bbox1)
# Invert and crop again
crop1n = ImageOps.invert(crop1)
bbox2 = crop1n.getbbox()
crop2 = crop1.crop(bbox2) # You don't actually need this - it's just for debug
# Trim original, unthresholded, uninverted image to the two bounding boxes
result = im.crop(bbox1).crop(bbox2)
result.save('result.png')
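For comparison, the first bounding-box step can also be sketched with NumPy alone: threshold, find the rows and columns that contain foreground, and slice. This is an alternative under assumptions, not the method above; `crop_to_foreground` is a hypothetical name, the threshold of 64 is borrowed from the snippet above, and the input is a synthetic image:

```python
import numpy as np

def crop_to_foreground(gray, threshold=64):
    """Crop a greyscale array to the bounding box of pixels darker than threshold."""
    mask = gray < threshold
    if not mask.any():
        return gray  # no dark pixels, nothing to trim
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    # +1 because slice ends are exclusive
    return gray[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

# Synthetic page: white background with a dark 20x40 block.
page = np.full((50, 80), 255, dtype=np.uint8)
page[10:30, 20:60] = 0
trimmed = crop_to_foreground(page)
```

The `+ 1` matters: slicing with the raw maximum index drops the last row and column of the region.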
I have a similar problem. It would also be helpful if the saved image were 32x32 px.
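For the 32x32 request, one possible follow-up step is to resize the trimmed image before saving it. A sketch assuming Pillow and that distorting the aspect ratio is acceptable; the image here is a synthetic stand-in for the trimmed result:

```python
from PIL import Image

# Synthetic stand-in for the trimmed greyscale image.
trimmed = Image.new("L", (120, 45), color=255)

# Nearest-neighbour resampling avoids introducing new grey levels,
# which keeps a thresholded image strictly two-valued.
small = trimmed.resize((32, 32), resample=Image.NEAREST)
```

Calling `small.save(...)` with a file name of your choice would then write the 32x32 image. If the aspect ratio has to be preserved, the image would need padding to a square first.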
I am new to python and I was playing around with background subtraction to visualize changes in pre and post change images.
I wrote a short and simple script using the cv2 library:
#!/usr/bin/env python
import cv2 as cv
import numpy as np
from matplotlib import pyplot as plt
#GRAYSCALE ONLY FOR TESTING
#Test with person appearing in image
img1 = cv.imread("images/1.jpg", 0)
img2 = cv.imread("images/2.jpg", 0)
img3 = cv.subtract(img1, img2)
ret,thresh1 = cv.threshold(img3,90,255,cv.THRESH_BINARY)
#Test with satellite image of Japan landslide changes after earthquake
jl_before = cv.imread("images/japan_earthquake_before.jpg",0)
jl_after = cv.imread("images/japan_earthquake_after.jpg",0)
jl_subtraction = cv.subtract(jl_before, jl_after)
ret,thresh2 = cv.threshold(img3,20,255,cv.THRESH_BINARY)
images = [img1, img2, thresh1, jl_before, jl_after, thresh2]
titles = ["Image1", "Image2", "Changes", "Japan_Before", "Japan_After", "Japan_Changes" ]
for i in range(6):
    plt.subplot(2,3,i+1),plt.imshow(images[i],'gray')
    plt.title(titles[i])
    plt.xticks([]),plt.yticks([])
plt.show()
The result looks like this:
Why is the mask with changes from the first set of images present in the mask of the second set of images?
I used different variables, thresh1 and thresh2.
Any help would be greatly appreciated as I can't seem to find the problem.
Because you missed a change when copy-pasting: the second threshold call still uses img3 instead of jl_subtraction:
ret,thresh2 = cv.threshold(img3,20,255,cv.THRESH_BINARY)
                           ^^^^
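One way to make this kind of slip less likely is to put the subtract-and-threshold step in a function, so each pair of images produces its own difference image. A sketch using NumPy stand-ins for `cv.subtract` (saturating subtraction) and `cv.threshold` with `THRESH_BINARY`; `change_mask` is a hypothetical name and the arrays are toy data:

```python
import numpy as np

def change_mask(before, after, threshold):
    """Return a 0/255 mask where before exceeds after by more than threshold."""
    # Saturating subtraction: negative differences are clipped to 0.
    diff = np.clip(before.astype(np.int16) - after.astype(np.int16), 0, 255)
    # Binary threshold: strictly greater than threshold becomes 255.
    return np.where(diff > threshold, 255, 0).astype(np.uint8)

before = np.array([[100, 200], [50, 10]], dtype=np.uint8)
after = np.array([[100, 100], [60, 0]], dtype=np.uint8)
mask = change_mask(before, after, 20)
```

With the question's variables this would read `thresh1 = change_mask(img1, img2, 90)` and `thresh2 = change_mask(jl_before, jl_after, 20)`, so there is no intermediate variable to mix up.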