I'm writing a Python app where I need to do some image tasks.
I'm trying PIL and its ImageOps module, but it looks like the unsharp_mask method is not working properly. It should return another image, but it returns an ImagingCore object, and I don't know what that is.
Here's some code:
import Image
import ImageOps
file = '/home/phius/test.jpg'
img = Image.open(file)
img = ImageOps.unsharp_mask(img)
#This fails with AttributeError: save
img.save(file)
I'm stuck on this.
What I need: the ability to do some image tweaks like PIL's autocontrast and unsharp_mask, and to resize, rotate and export as JPEG while controlling the quality level.
What you want is the filter method on your image, together with the PIL ImageFilter module[1]:
import Image
import ImageFilter
file = '/home/phius/test.jpg'
img = Image.open(file)
img2 = img.filter(ImageFilter.UnsharpMask) # note it returns a new image
img2.save(file)
The other filtering operations are part of the ImageFilter module[1] as well and are applied the same way. The transforms (rotation, resize) are handled by calling methods[2] on the image object itself, e.g. img.resize. JPEG quality is addressed in this question: How to adjust the quality of a resized image in Python Imaging Library?
[1] http://effbot.org/imagingbook/imagefilter.htm
[2] http://effbot.org/imagingbook/image.htm
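To tie the pieces of this answer together, here is a minimal sketch of the whole workflow the question asks for (autocontrast, unsharp mask, resize, rotate, JPEG export with a quality setting). It assumes Pillow, the maintained PIL fork, and uses a small synthetic image in place of the question's file; the sizes and the UnsharpMask parameters are illustrative values, not recommendations.

```python
import io

from PIL import Image, ImageDraw, ImageFilter, ImageOps

# Stand-in for Image.open('/home/phius/test.jpg'): a synthetic RGB image
# with a lighter rectangle so the filters have something to work on.
img = Image.new("RGB", (64, 48), (120, 80, 40))
ImageDraw.Draw(img).rectangle([16, 12, 48, 36], fill=(200, 180, 160))

img = ImageOps.autocontrast(img)   # returns a new, contrast-stretched image
img = img.filter(ImageFilter.UnsharpMask(radius=2, percent=150, threshold=3))
img = img.resize((32, 24))         # target size is (width, height)
img = img.rotate(90, expand=True)  # expand=True grows the canvas to fit

# Write the JPEG into memory; pass a filename instead of buf to write to disk.
buf = io.BytesIO()
img.save(buf, format="JPEG", quality=85)
jpeg_bytes = buf.getvalue()
```

Every call returns a new image rather than modifying the original, which is why each result is assigned back to img.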
import pytesseract
from pdf2image import convert_from_path, convert_from_bytes
import cv2,numpy
def pil_to_cv2(image):
    open_cv_image = numpy.array(image)
    return open_cv_image[:, :, ::-1].copy()
path='OriginalsFile.pdf'
images = convert_from_path(path)
cv_h=[pil_to_cv2(i) for i in images]
img_header = cv_h[0][:160,:]
#print(pytesseract.image_to_string(Image.open('test.png'))) I only found this in tesseract docs
Hello, is there a way to read img_header directly with pytesseract without saving it to disk?
pytesseract docs
pytesseract.image_to_string() input format
As documentation explains pytesseract.image_to_string() needs a PIL image as input.
So you can convert your CV image into PIL one easily, like this:
from PIL import Image
... (your code)
print(pytesseract.image_to_string(Image.fromarray(img_header)))
If you really don't want to use PIL, see:
https://github.com/madmaze/pytesseract/blob/master/src/pytesseract.py
pytesseract is a thin wrapper that runs the tesseract command. Look at the def run_and_get_output() line: you'll see that it saves your image to a temporary file and then passes that file's path to tesseract.
Hence, you could do the same with OpenCV by rewriting pytesseract's single .py file, although I don't see any performance improvement in doing so.
The fromarray function lets you pass the image to tesseract as a PIL image without saving it to disk, but you should also make sure that you don't send a list of PIL images to tesseract. The convert_from_path function returns a list of PIL images, one per page, so for a multi-page PDF you need to send each page to tesseract individually.
import pytesseract
from pdf2image import convert_from_path
from PIL import Image
import cv2, numpy
def pil_to_cv2(image):
    # PIL gives RGB; reversing the last axis yields OpenCV's BGR order
    open_cv_image = numpy.array(image)
    return open_cv_image[:, :, ::-1].copy()
path = 'OriginalsFile.pdf'  # same PDF as in the question
doc = convert_from_path(path)
for page_number, page_data in enumerate(doc):
    cv_h = pil_to_cv2(page_data)
    img_header = cv_h[:160, :]  # top 160 rows of the page
    print(f"{page_number} - {pytesseract.image_to_string(Image.fromarray(img_header))}")
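One detail worth noting: after pil_to_cv2 the array is in BGR order, while Image.fromarray interprets a three-channel array as RGB. For OCR on dark text this rarely matters, but if the colors should be preserved, the channels can be flipped back before building the PIL image. The sketch below shows only the crop-and-convert step on a synthetic array; header_as_pil is a hypothetical helper name, and the tesseract call is left as a comment because it needs the tesseract binary installed.

```python
import numpy as np
from PIL import Image


def header_as_pil(bgr_page, rows=160):
    """Crop the top `rows` rows of a BGR array and return an RGB PIL image."""
    header = bgr_page[:rows, :]
    # Reverse the channel axis (BGR -> RGB) and make the result contiguous.
    rgb = np.ascontiguousarray(header[:, :, ::-1])
    return Image.fromarray(rgb)


# Synthetic stand-in for one converted PDF page: 400 rows, 300 columns, pure blue in BGR.
page = np.zeros((400, 300, 3), dtype=np.uint8)
page[:, :, 0] = 255

header = header_as_pil(page)
# text = pytesseract.image_to_string(header)  # requires tesseract to be installed
```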
I have read mountains of posts on pytesseract, but I cannot get it to read text off a dead simple image; it returns an empty string.
Here is the image:
I have tried scaling it, grayscaling it, and adjusting the contrast, thresholding, blurring, everything it says in other posts, but my problem is that I don't know what the OCR wants to work better. Does it want blurry text? High contrast?
Code to try:
import pytesseract
from PIL import Image
print(pytesseract.image_to_string(Image.open(IMAGE_FILE)))  # IMAGE_FILE: path to the image
As you can see in my code, the image is stored locally on my computer, hence Image.open()
Trying something along the lines of
import pytesseract
from PIL import Image
import requests
import io
response = requests.get('https://i.stack.imgur.com/J2ojU.png')
img = Image.open(io.BytesIO(response.content))
text = pytesseract.image_to_string(img, lang='eng', config='--psm 7')
print(text)
with --psm values equal to or larger than 6 did yield "Gm" for me.
If the image is stored locally (and in your working directory), just drop the response variable and change the definition of text with the lines
image_name = "J2ojU.png" # or whatever appropriate
text = pytesseract.image_to_string(Image.open(image_name), lang='eng', config='--psm 7')
There are several reasons:
1) Edges are not sharp and continuous (by sharp I mean smooth, without teeth)
2) The image is too small; you need to resize it
3) The font is missing (not mandatory, but a trained font greatly improves the chance of recognition)
Based on points 1) and 2) I was able to recognize the text:
I resized the image 3x (point 2) and blurred it to make the edges smooth (point 1).
import pytesseract
import cv2
import numpy as np
import urllib.request
import requests
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'
from PIL import Image
def url_to_image(url):
    resp = urllib.request.urlopen(url)
    image = np.asarray(bytearray(resp.read()), dtype="uint8")
    image = cv2.imdecode(image, cv2.IMREAD_COLOR)
    return image
url = 'https://i.stack.imgur.com/J2ojU.png'
img = url_to_image(url)
retval, img = cv2.threshold(img,200,255, cv2.THRESH_BINARY)
img = cv2.resize(img,(0,0),fx=3,fy=3)
img = cv2.GaussianBlur(img,(11,11),0)
img = cv2.medianBlur(img,9)
cv2.imshow('asd',img)
cv2.waitKey(0)
cv2.destroyAllWindows()
txt = pytesseract.image_to_string(img)
print('recognition:', txt)
>> recognition: Gm
Note:
This script is good for testing any image on the web
Note 2:
All processing is based on your posted image
Note 3:
Text recognition is not easy. Every recognition task requires its own processing. If you try these steps with a different image, they may not work at all. It is important to try recognition on many images so that you understand what tesseract wants.
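For readers without OpenCV, the same sequence of steps (threshold, 3x upscale, Gaussian blur, median filter) can be approximated with Pillow alone. This is a rough sketch under stated assumptions rather than an equivalent of the script above: prepare_for_ocr is a hypothetical helper name, the blur radius of 2 is only an approximate counterpart of the 11x11 OpenCV kernel, and the synthetic sample stands in for the posted image.

```python
from PIL import Image, ImageDraw, ImageFilter


def prepare_for_ocr(img, scale=3, cutoff=200, blur_radius=2, median_size=9):
    """Threshold, upscale, then smooth the edges, in the order used above."""
    gray = img.convert("L")
    binary = gray.point(lambda p: 255 if p > cutoff else 0)
    big = binary.resize((binary.width * scale, binary.height * scale), Image.BICUBIC)
    smooth = big.filter(ImageFilter.GaussianBlur(blur_radius))
    return smooth.filter(ImageFilter.MedianFilter(size=median_size))


# Synthetic stand-in for the posted image: a black block on a white background.
sample = Image.new("RGB", (40, 20), "white")
ImageDraw.Draw(sample).rectangle([10, 5, 30, 15], fill="black")

prepared = prepare_for_ocr(sample)
# text = pytesseract.image_to_string(prepared)  # requires tesseract to be installed
```

The parameters are exposed as arguments because, as Note 3 says, each image tends to need its own tuning.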
Referring to the answer to this question, I tried to save my own JPG image files, after some basic image processing. I've only applied a rotation and a shear. This is my code:
import numpy as np
import sys
from skimage import data, io, filter, color, exposure
import skimage.transform as tf
from skimage.transform import rotate, setup, AffineTransform
from PIL import Image
mypath = PATH_TO_FILENAME
readfile = FILENAME
img = color.rgb2gray(io.imread(mypath + readfile))
myimg = rotate(img, angle=10, order=2)
afine_tf = tf.AffineTransform(shear=0.1)
editedimg = tf.warp(myimg, afine_tf)
# IF I UNCOMMENT THE TWO LINES BELOW, I CAN SEE THE EDITED IMAGE AS EXPECTED
#io.imshow(editedimg)
#io.show()
saveimg= np.array(editedimg)
result = Image.fromarray((saveimg).astype(np.uint8))
newfile = "edited_" + readfile
result.save(mypath + newfile)
I know that the image processing was fine because if I display it before saving, it's just the original image with a bit of rotation and shearing, as expected. But I'm doing something wrong while saving it. I tried without the astype(np.uint8) part but got an error. Then I removed some of the code from the link mentioned above because I guessed it was specific to Fourier transforms: when I included some of their code, I got an image that was all gray but with white lines in the direction of the shear I'd applied. But now the image that gets saved is just 2KB of nothing but blackness.
And when I tried something as simple as:
result = Image.fromarray(editedimg)
result.save(mypath + newfile)
then I got this error:
raise IOError("cannot write mode %s as JPEG" % im.mode)
IOError: cannot write mode F as JPEG
I don't really need to use PIL, if there's another way to simply save my image, I'm fine with that.
Look into the PIL fork, Pillow; it is not as outdated and is what you should probably be using for this.
Also, depending on your operating system, you may need a few other libraries to compile PIL with JPEG support properly; see here.
This may also help: it says you need to convert your image to RGB mode before saving.
Image.open('old.jpeg').convert('RGB').save('new.jpeg')
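A likely cause of the all-black file, offered here as an assumption based on the code in the question: rgb2gray, rotate and warp produce floating-point images with values in the range 0 to 1, so astype(np.uint8) truncates nearly every pixel to 0. Scaling to 0..255 before the cast avoids both the black output and the "mode F" error. The sketch below uses a synthetic float array in place of editedimg and writes the JPEG to memory.

```python
import io

import numpy as np
from PIL import Image

# Stand-in for the float image produced by the skimage pipeline (values in [0, 1]).
editedimg = np.linspace(0.0, 1.0, 64 * 48).reshape(48, 64)

# Casting directly truncates values below 1.0 to 0, hence the black image.
naive = editedimg.astype(np.uint8)

# Scale to the 0..255 range first, then cast to 8-bit.
scaled = (np.clip(editedimg, 0.0, 1.0) * 255).round().astype(np.uint8)

result = Image.fromarray(scaled)  # 2-D uint8 array becomes a mode "L" image
buf = io.BytesIO()
result.save(buf, format="JPEG")   # pass a filename instead of buf to write to disk
```

scikit-image also ships skimage.util.img_as_ubyte, which performs this kind of conversion, if staying within skimage is preferred.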
When loading a png image with PIL and OpenCV, there is a color shift. Black and white remain the same, but brown gets changed to blue.
I can't post the image because this site does not allow newbies to post images.
The code is written as below rather than use cv.LoadImageM, because in the real case the raw image is received over tcp.
Here is the code:
#! /usr/bin/env python
import sys
import cv
import cv2
import numpy as np
import Image
from cStringIO import StringIO
if __name__ == "__main__":
    # load raw image from file
    f = open('frame_in.png', "rb")
    rawImage = f.read()
    f.close()
    # convert to mat
    pilImage = Image.open(StringIO(rawImage))
    npImage = np.array(pilImage)
    cvImage = cv.fromarray(npImage)
    # show it
    cv.NamedWindow('display')
    cv.MoveWindow('display', 10, 10)
    cv.ShowImage('display', cvImage)
    cv.WaitKey(0)
    cv.SaveImage('frame_out.png', cvImage)
How can the color shift be fixed?
OpenCV's images have color channels arranged in the order BGR whereas PIL's is RGB. You will need to switch the channels like so:
import PIL.Image
import cv2
...
image = np.array(pilImage) # Convert PIL Image to numpy/OpenCV image representation
image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR) # You can use cv2.COLOR_RGBA2BGRA if you are sure you have an alpha channel. You will only have alpha channel if your image format supports transparency.
...
@Krish: Thanks for pointing out the bug. I didn't have time to test the code the last time.
Hope this helps.
Change
pilImage = Image.open(StringIO(rawImage))
to
pilImage = Image.open(StringIO(rawImage)).convert("RGB")
Light alchemist's answer did not work, but it did explain the issue. Wouldn't the reverse be thrown off by the alpha channel, i.e. it changes RGBA to ABGR? I would think Froyo's answer would solve it, but it did not change the displayed image at all. What did work was reversing the colors in OpenCV. I'm too much of a newbie to know why; the two approaches seem equivalent to me. Reversing the colors in numpy would be preferred, as additional processing is planned in numpy. But thanks for the help; the answers steered me in the right direction.
pilImage = Image.open(StringIO(rawImage));
bgrImage = np.array(pilImage)
cvBgrImage = cv.fromarray(bgrImage)
# Reverse BGR
cvRgbImage = cv.CreateImage(cv.GetSize(cvBgrImage),8,3)
cv.CvtColor(cvBgrImage, cvRgbImage, cv.CV_BGR2RGB)
#show it
cv.ShowImage('display', cvRgbImage)
cv.WaitKey(30) # ms to allow display
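Since doing the swap in numpy was the stated preference, here is a minimal sketch of one way to do it that leaves an alpha channel, if present, in place. A plain reversal of the last axis only works for three-channel images; indexing with an explicit channel order handles both cases. rgb_to_bgr is a hypothetical helper name, and the single brown pixel is illustrative data.

```python
import numpy as np


def rgb_to_bgr(arr):
    """Swap the R and B channels; an alpha channel, if present, stays last."""
    if arr.ndim != 3 or arr.shape[2] not in (3, 4):
        return arr  # grayscale or unexpected layout: nothing to swap
    order = [2, 1, 0] + ([3] if arr.shape[2] == 4 else [])
    return np.ascontiguousarray(arr[:, :, order])


# One brown RGBA pixel, as np.array(pilImage) would give for a PNG with alpha.
brown_rgba = np.array([[[139, 69, 19, 255]]], dtype=np.uint8)
swapped = rgb_to_bgr(brown_rgba)
print(swapped.tolist())  # [[[19, 69, 139, 255]]]
```

Applying the function twice returns the original channel order, so the same helper converts BGR back to RGB.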
I want to convert a pyglet AbstractImage object to a PIL image for further manipulation.
Here is my code:
from pyglet import image
from PIL import Image
pic = image.load('pic.jpg')
data = pic.get_data('RGB', pic.pitch)
im = Image.fromstring('RGB', (pic.width, pic.height), data)
im.show()
But the image shown is wrong.
So how do I convert an image from pyglet to PIL properly?
I think I found the solution.
The pitch of a pyglet AbstractImage instance is not compatible with PIL.
I found that in pyglet 1.1 there is a codec function that encodes a pyglet image to PIL;
here is the link to the source.
So the code above should be modified to this:
from pyglet import image
from PIL import Image
pic = image.load('pic.jpg')
pitch = -(pic.width * len('RGB'))
data = pic.get_data('RGB', pitch) # using the new pitch
im = Image.frombytes('RGB', (pic.width, pic.height), data) # Image.fromstring in old PIL; Pillow replaced it with frombytes
im.show()
I'm using a 461x288 image in this case and find that pic.pitch is -1384, but the new pitch is -1383 (461 pixels x 3 bytes per pixel).
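The two numbers can be reproduced with a little arithmetic. The sketch below assumes that the 1384 comes from rows being padded to a multiple of 4 bytes, which matches the figures reported here but is not confirmed by the source; packed_pitch and aligned_pitch are hypothetical helper names. The sign is ignored because, in pyglet, a negative pitch only indicates top-to-bottom row order.

```python
def packed_pitch(width, fmt="RGB"):
    """Bytes per row with no padding: one byte per channel per pixel."""
    return width * len(fmt)


def aligned_pitch(width, fmt="RGB", alignment=4):
    """Bytes per row when each row is padded up to a multiple of `alignment`."""
    packed = packed_pitch(width, fmt)
    return (packed + alignment - 1) // alignment * alignment


width = 461
print(packed_pitch(width))   # 1383, the row length PIL expects
print(aligned_pitch(width))  # 1384, matching the magnitude of pic.pitch
```

With the padded pitch, each row carries one extra byte that PIL would read as pixel data, which would explain why the original image came out wrong.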
This is an open wishlist item:
AbstractImage to/from PIL image.