I am trying to crop an image with cv2 (converting it to a bytes object so I don't need to save it) and afterwards run pytesseract on it.
This way I won't need to save the image twice during the process: once when I create the image and once when cropping it. The process:
## CROPPING THE IMAGE REGION
ys, xs = np.nonzero(mask2)
ymin, ymax = ys.min(), ys.max()
xmin, xmax = xs.min(), xs.max()
croped = image[ymin:ymax, xmin:xmax]
pts = np.int32([[xmin, ymin],[xmin,ymax],[xmax,ymax],[xmax,ymin]])
cv2.drawContours(image, [pts], -1, (0,255,0), 1, cv2.LINE_AA)
#OPENCV IMAGE TO BYTES WITHOUT SAVING TO DISK
is_success, im_buf_arr = cv2.imencode(".jpg", croped)
byte_im = im_buf_arr.tobytes()
#PYTESSERACT IMAGE USING A BYTES FILE
Results = pytesseract.image_to_string(byte_im, lang="eng")
print(Results)
Unfortunately I get the error: Unsupported image object
Am I missing something? Is there a way to do this without needing to save the file when cropping? Any help is highly appreciated.
You have croped, which is a NumPy array. According to the pytesseract examples, you can simply do this:
# tesseract needs the right channel order
cropped_rgb = cv2.cvtColor(croped, cv2.COLOR_BGR2RGB)
# give the numpy array directly to pytesseract, no PIL or other acrobatics necessary
Results = pytesseract.image_to_string(cropped_rgb, lang="eng")
Alternatively, wrap the array in a PIL image first:
from PIL import Image
# note: croped is still in BGR order here; convert with cv2.cvtColor first if the channel order matters
img_tesseract = Image.fromarray(croped)
Results = pytesseract.image_to_string(img_tesseract, lang="eng")
Another option is to turn the encoded bytes back into a PIL image and hand that to pytesseract:
from PIL import Image
import io

def bytes_to_image(image_bytes):
    io_bytes = io.BytesIO(image_bytes)
    return Image.open(io_bytes)

pytesseract.image_to_data(bytes_to_image(byte_im), lang='eng')
I have thousands of scale images from which I would like to extract the reading of the scale in each image. However, when using Tesseract it gives wrong values. I tried several processing steps on the image but still run into the same issue. From my understanding so far, after defining the region of interest in the image, it has to be converted to white text on a black background. However, I am new to Python; I tried some functions to do so but still hit the same issue. I would appreciate it if someone could help me with this. The following link is for the image, as I couldn't upload it here since it is more than 2 MiB:
https://mega.nz/file/fZMUDRbL#tg4Tc2VmGMMdEpnZzt7blxZjVLdlhMci9jll0FLnIGI
import cv2
import pytesseract
import matplotlib.pyplot as plt
import numpy as np
import imutils
## Reading Image File
Filename = 'C:\\Users\\Abdullah\\Desktop\\Scale Reading\\' #File Path For Images
IName = 'Disk_Test_1_09_07-00000_0.tif' # Image Name
Image = cv2.imread(Filename + IName,0)
## Image Processing
Image_Crop = Image[1680:1890, 550:1240] # Define Region of Interest of the image
#cv2.imshow("cropped", Image_Crop) # Show Cropped Image
#cv2.waitKey(0) # Show Cropped Image
Mask = Image_Crop > 10 # Threshold Image to Value of X
Mask = np.array(Mask, dtype=np.uint8)
plt.imshow(Mask, alpha=1) # Set Opacity (Max 1)
ret,Binary = cv2.threshold(Mask,0,255,cv2.THRESH_BINARY)
#plt.imshow(Image_Crop, cmap="gray") # Transform Image to Gray
#plt.show()
plt.imshow(Binary,'gray',vmin=0,vmax=255)
plt.show()
## Number Recognition
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe' # Call Location of Tesseract-OCR
data = pytesseract.image_to_string(Binary, lang='eng',config='--psm 6')
print(data)
Here is the image after processing
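Since the post mentions that the region of interest has to end up as white text on a black background before OCR, here is a minimal sketch of that step, continuing from the Binary image above. It assumes the digits currently appear dark on a light background (skip the inversion otherwise); the upscaling, --psm 7 and digit whitelist are only suggestions, not part of the original code:
## Hypothetical follow-up: white digits on black background, upscaled before OCR
Inverted = cv2.bitwise_not(Binary) # Flip foreground/background
Inverted = cv2.resize(Inverted, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC) # Upscaling often helps Tesseract
data = pytesseract.image_to_string(Inverted, lang='eng', config='--psm 7 -c tessedit_char_whitelist=0123456789.')
print(data)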
I'm converting a .png image to float32 the following way and I'm obtaining a broken image as shown below. If I remove the tf.image.convert_image_dtype call, everything goes well.
image = tf.io.read_file(filename)
image = tf.image.decode_png(image, channels=3)
image = tf.image.convert_image_dtype(image, tf.float32)
I've also tried different images in different formats like .bmp and .jpg, but the same thing happens. The code I use to visualize the image generated the above way is just:
a = a.numpy()
a = Image.fromarray(a, 'RGB')
As I've said, if I just remove the tf.image.convert_image_dtype call everything goes well.
Here are the download links of both images (I have less than 10 reputation here so I can't upload photos yet).
original_image
obtained_image
You can convert it back to integers like this:
import tensorflow as tf
import numpy as np
from PIL import Image
image = tf.io.read_file("C:\\<your file>.png")
image = tf.image.decode_png(image, channels=3)
image = tf.image.convert_image_dtype(image, tf.float32)
a = image.numpy()
a = (a * 255 / np.max(a)).astype('uint8')
a = Image.fromarray(a, 'RGB')
a.show()
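Alternatively, if you only need the tensor back in a displayable form, a small sketch reusing the same image tensor is to let TensorFlow do the reverse conversion, which rescales the [0, 1] floats back to 0-255 uint8:
# Convert the float image back to uint8; convert_image_dtype handles the scaling
a = tf.image.convert_image_dtype(image, tf.uint8).numpy()
a = Image.fromarray(a, 'RGB')
a.show()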
I'm trying to create an image system in Python 3 to be used in a web app. The idea is to load an image from disk and add some random noise to it. When I try this, I get what looks like a totally random image, not resembling the original:
import cv2
import numpy as np
from skimage.util import random_noise
from random import randint
from pathlib import Path
from PIL import Image
import io
image_files = [
    {
        'name': 'test1',
        'file': 'test1.png'
    },
    {
        'name': 'test2',
        'file': 'test2.png'
    }
]

def gen_image():
    rand_image = randint(0, len(image_files)-1)
    image_file = image_files[rand_image]['file']
    image_name = image_files[rand_image]['name']
    image_path = str(Path().absolute())+'/img/'+image_file
    img = cv2.imread(image_path)
    noise_img = random_noise(img, mode='s&p', amount=0.1)
    img = Image.fromarray(noise_img, 'RGB')
    fp = io.BytesIO()
    img.save(fp, format="PNG")
    content = fp.getvalue()
    return content

gen_image()
I have also tried using pypng:
import png
# Added the following to gen_image()
content = png.from_array(noise_img, mode='L;1')
content.save('image.png')
How can I load a png (with alpha transparency) from disk, add some noise to it, and return it so that it can be displayed by web server code (flask, aiohttp, etc.)?
As indicated in the answer by makayla, this makes it better: noise_img = (noise_img*255).astype(np.uint8), but the colors are still wrong and there's no transparency.
Here's the updated function for that:
def gen_image():
    rand_image = randint(0, len(image_files)-1)
    image_file = image_files[rand_image]['file']
    image_name = image_files[rand_image]['name']
    image_path = str(Path().absolute())+'/img/'+image_file
    img = cv2.imread(image_path)
    cv2.imshow('dst_rt', img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    # Problem exists somewhere below this line.
    img = random_noise(img, mode='s&p', amount=0.1)
    img = (img*255).astype(np.uint8)
    img = Image.fromarray(img, 'RGB')
    fp = io.BytesIO()
    img.save(fp, format="png")
    content = fp.getvalue()
    return content
This will pop up the pre-noise image and return the noised image. The RGB (and alpha) problem exists in the returned image.
I think the problem is that it needs to be RGBA, but when I change to that, I get ValueError: buffer is not large enough
Given all the new information I am updating my answer with a few more tips for debugging the issue.
I found a site here which creates sample transparent images. I created a 64x64 cyan (R=0, G=255, B=255) image with a transparency layer of 0.5. I used this to test your code.
I read in the image two ways to compare: im1 = cv2.imread(fileName) and im2 = cv2.imread(fileName,cv2.IMREAD_UNCHANGED). np.shape(im1) returned (64,64,3) and np.shape(im2) returned (64,64,4). This is why that flag is required--the default imread settings in opencv will read in a transparent image as a normal RGB image.
However, OpenCV reads it in as BGR instead of RGB, and since you are not saving it back out with OpenCV, you'll need to convert it to the correct order, otherwise the image will have reversed colors. For example, my cyan image, when viewed with the reversed colors, appears like this:
You can change this using OpenCV's color conversion function like this: im = cv2.cvtColor(im, cv2.COLOR_BGRA2RGBA) (here is a list of all the color conversion codes). Again, double-check the size of your image if you need to; it should still have four channels since you converted it to RGBA.
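For reference, a minimal sketch of those two steps together (the file name is a placeholder):
import cv2
# Read the PNG with its alpha channel intact; the array shape will be (H, W, 4)
im = cv2.imread('transparent_test.png', cv2.IMREAD_UNCHANGED)
# Reorder BGRA -> RGBA so PIL and other RGB-based tools see the colors correctly
im = cv2.cvtColor(im, cv2.COLOR_BGRA2RGBA)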
You can now add your noise to your image. Just so you know, this is also going to add noise to your alpha channel as well, randomly making some pixels more transparent and others less transparent. The random_noise function from skimage converts your image to float and returns it as float. This means the image values, normally integers ranging from 0 to 255, are converted to decimal values from 0 to 1. Your line img = Image.fromarray(noise_img, 'RGB') does not know what to do with the floating point noise_img. That's why the image is all messed up when you save it, as well as when I tried to show it.
So I took my cyan image, added noise, and then converted the floats back to 8 bits.
noise_img = random_noise(im, mode='s&p', amount=0.1)
noise_img = (noise_img*255).astype(np.uint8)
img = Image.fromarray(noise_img, 'RGBA')
It now looks like this (screenshot) using img.show():
I used the PIL library to save out my image instead of openCV so it's as close to your code as possible.
fp = 'saved_im.png'
img.save(fp, format="png")
I loaded the image into powerpoint to double-check that it preserved the transparency when I saved it using this method. Here is a screenshot of the saved image overlaid on a red circle in powerpoint:
I'm using PIL to resize a JPG. I'm expecting the same image, just resized, as output, but instead I get a correctly sized black box. The new image file is completely devoid of any information, just an empty file. Here is an excerpt from my script:
basewidth = 300
img = Image.open(path_to_image)
wpercent = (basewidth/float(img.size[0]))
hsize = int((float(img.size[1])*float(wpercent)))
img = img.resize((basewidth,hsize))
img.save(dir + "/the_image.jpg")
I've tried resizing with Image.LANCZOS as the second argument (it defaults to Image.NEAREST with one argument), but it didn't make a difference. I'm running Python 3 on Ubuntu 16.04. Any ideas on why the image file is empty?
I also encountered the same issue when trying to resize an image with a transparent background. The resize works after I add a white background to the image.
Code to add a white background and then resize the image:
from PIL import Image

im = Image.open("path/to/img")
if im.mode == 'RGBA':
    alpha = im.split()[3]
    bgmask = alpha.point(lambda x: 255-x)
    im = im.convert('RGB')
    im.paste((255,255,255), None, bgmask)
im = im.resize((new_width, new_height), Image.ANTIALIAS)
ref:
Other's code for making thumbnail
Python: Image resizing: keep proportion - add white background
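Putting the question's scaling code and this flattening step together, a hypothetical end-to-end version might look like the sketch below (the paths and basewidth are placeholders; Image.LANCZOS is used since ANTIALIAS is an alias for it):
from PIL import Image

basewidth = 300
im = Image.open("path/to/img.png")
# Flatten any transparency onto a white background before saving as JPEG
if im.mode == 'RGBA':
    alpha = im.split()[3]
    bgmask = alpha.point(lambda x: 255 - x)
    im = im.convert('RGB')
    im.paste((255, 255, 255), None, bgmask)
# Scale to a fixed width while keeping the aspect ratio
wpercent = basewidth / float(im.size[0])
hsize = int(float(im.size[1]) * wpercent)
im = im.resize((basewidth, hsize), Image.LANCZOS)
im.save("path/to/the_image.jpg")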
The simplest way to get to the bottom of this is to post your image! Failing that, we can check the various aspects of your image.
So, import NumPy and PIL, open your image, and convert it to a NumPy ndarray; you can then inspect its characteristics:
import numpy as np
from PIL import Image
# Open image
img = Image.open('unhappy.jpg')
# Convert to Numpy Array
n = np.array(img)
Now you can print and inspect the following things:
n.shape # we are expecting something like (1580, 1725, 3)
n.dtype # we expect dtype('uint8')
n.max() # if there's white in the image, we expect 255
n.min() # if there's black in the image, we expect 0
n.mean() # we expect some value between 50-200 for most images
import cv2
fname = '1.png'
img=cv2.imread(fname, 0)
print(img)  # the outcome is an array of values from 0 to 255 (grayscale)
ret, thresh = cv2.threshold(img, 254, 255, cv2.THRESH_BINARY)
thresh = cv2.bitwise_not(thresh)
nums, labels = cv2.connectedComponents(thresh, None, 4, cv2.CV_32S)
dst = cv2.convertScaleAbs(255.0*labels/nums)
cv2.imwrite(dest_dir+"output.png", dst)
That code works just fine, so I moved on to adjusting it so it can take a portion of the image, not the entire image:
from PIL import Image
img = Image.open(fname)
img2 = img.crop((int(xmin), int(yMin), int(xMax), int(yMax)))
xmin, yMin, xMax, and yMax are simply the top-left and bottom-right coordinates of the box.
Then I did img = cv2.imread(img2) to continue as in the previous code, but got an error. I printed img2 and got <PIL.Image.Image image mode=RGB size=54x10 at 0x7F4D283AFB70>. How can I adjust this so I can pass that crop or image portion instead of fname in my code above? Kindly note that I don't want to save img2 as an image and carry on from there, because I need to keep working on the main image.
Try cv2.imshow() instead of printing it. In order to see an image you cropped, you need to use a cv2 function. Here is some sample code:
import numpy as np
import cv2
# Load a color image in grayscale
img = cv2.imread('messi5.jpg',0)
cv2.imshow('image',img)
cv2.waitKey(0)
cv2.destroyAllWindows()
The simple answer is no, you cannot.
Open up your terminal/IDE and type help(cv2.imread).
It clearly states that the function imread loads an image from the specified file and returns it. So in order to use cv2.imread() you must pass it a file path, not an image object.
Your best bet would be to save your cropped image as a file and then read it.
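If you do go that route, a minimal sketch (the temporary file name is just a placeholder) is to write the PIL crop to disk and read it back in grayscale, so the rest of the connected-components code can stay unchanged:
# Save the PIL crop, then re-read it with OpenCV in grayscale mode
img2.save('crop_tmp.png')
img = cv2.imread('crop_tmp.png', 0)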