I'm trying to use Python to run OCR with pytesseract on some PowerPoint slides that have images (of text) and I'm stuck on getting the images to pass to pytesseract.
So far I have this, but the last line is the problem:
for slide in presentation.Slides:
    for shape in slide.Shapes:
        if 'Picture' in shape.Name:  # in my case, the images I want have this
            picture_text = image_to_string(shape)
This gives an error, I assume because a PowerPoint Shape is not an image:
Traceback (most recent call last):
File "C:/Users/agent/Desktop/Chaelon Stuff on Desktop/Walpole/make_Word_rough_pass_from_PowerPoint_chapter.py", line 61, in <module>
worddoc.Content.Text = image_to_string(shape)
File "C:\Python27\lib\site-packages\pytesseract\pytesseract.py", line 143, in image_to_string
if len(image.split()) == 4:
File "C:\Python27\lib\site-packages\win32com\client\dynamic.py", line 522, in __getattr__
raise AttributeError("%s.%s" % (self._username_, attr))
AttributeError: <unknown>.split
So then I tried using shape.Image, but I get this error:
Traceback (most recent call last):
File "C:/Users/agent/Desktop/Chaelon Stuff on Desktop/Walpole/make_Word_rough_pass_from_PowerPoint_chapter.py", line 61, in <module>
worddoc.Content.Text = image_to_string(shape.Image)
File "C:\Python27\lib\site-packages\win32com\client\dynamic.py", line 522, in __getattr__
raise AttributeError("%s.%s" % (self._username_, attr))
AttributeError: <unknown>.Image
Given the image is in the presentation, I was hoping there could be some way to get each image from its Shape object and then pass each image directly to pytesseract for OCR (without having to save it to disk as an image first). Is there?
Or do I have to save it to disk as an image and then read it into pytesseract? If so, how best to do that?
You have already answered your own question, but you are not yet sure you are right, or just don't want to believe it is the way it is. Yes:
You need to save the image to disk and then read it into pytesseract, unless you find a way to convert the image you got from PowerPoint into an image object usable by PIL (the Python Imaging Library).
Maybe someone else can explain how to do the conversion from a PowerPoint image to a PIL image, as I am not on Windows and am not using Microsoft PowerPoint, so I cannot test proposed solutions myself, but maybe THIS link already provides enough information to satisfy your needs:
https://codereview.stackexchange.com/questions/101803/process-powerpoint-xml
Picture shapes in python-pptx have an image property, which returns an Image object:
http://python-pptx.readthedocs.io/en/latest/api/shapes.html#picture-objects
http://python-pptx.readthedocs.io/en/latest/api/image.html
The image object provides access to the image file bytes and the filename extension (e.g. "png"), which should give you what you need:
for shape in slide.shapes:
    if 'Picture' in shape.name:
        picture = shape
        image = picture.image
        image_file_bytes = image.blob
        file_extension = image.ext
        # save image as a file, or perhaps as an in-memory file, using the bytes and ext
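If you'd rather skip the disk round-trip entirely, the blob can be wrapped in an in-memory buffer and handed straight to PIL, and from there to pytesseract. A minimal sketch, assuming Pillow is installed (the pytesseract call is shown commented out, since it requires the Tesseract binary):

```python
import io

from PIL import Image

def blob_to_image(blob):
    """Wrap raw image-file bytes (e.g. image.blob from python-pptx)
    in an in-memory buffer so PIL can open them without a disk file."""
    return Image.open(io.BytesIO(blob))

# Then OCR it directly, no temporary file needed:
# import pytesseract
# picture_text = pytesseract.image_to_string(blob_to_image(image.blob))
```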
I cannot figure out how to do what Ben Eater did.
I have the exact same code (different file name), but I get an error saying I cannot pass pixels[x, y] to chr() when writing to a binary file.
The video I linked has all the information about what I am trying to accomplish. If you have a specific question for me, ask away.
By the way, I have literally been trying to make this work for about a year and have not figured it out, so... yeah.
from PIL import Image
image = Image.open("Margarita3.png")
pixels = image.load()
out_file = open("Margarita3.bin", "wb")
for y in range(150):
    for x in range(200):
        try:
            out_file.write(chr(pixels[x, y]))
        except IndexError:
            out_file.write(chr(0))
Here is the error message:
Traceback (most recent call last):
File "C:\Users\Nicky\Desktop\tolaptop\wincupl_vga_timings\convert.py", line 11, in <module>
out_file.write(chr(pixels[x,y]))
TypeError: an integer is required
Make sure the image is in the correct location. According to your current code, it should be in the same directory as the Python script. If you want to specify a different location, do it like so (note the raw-string prefix, so the backslashes are not treated as escape sequences):
image = Image.open(r"C:\Users\Nicky\Desktop\tolaptop\...Margarita3.png")
pixels = image.load()
out_file = open(r"C:\Users\Nicky\Desktop\tolaptop\...Margarita3.bin", "wb")
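As for the TypeError itself: chr() needs a single integer, but for an RGB image pixels[x, y] returns a tuple of channel values. Converting the image to single-channel grayscale first gives one integer per pixel. A minimal sketch of that fix (Python 3, writing one byte per pixel; the filenames and 200x150 dimensions mirror the original code):

```python
from PIL import Image

def image_to_bin(png_path, bin_path, width=200, height=150):
    # "L" mode gives one 8-bit integer per pixel, so chr()/bytes() gets
    # a plain int instead of an (R, G, B) tuple
    image = Image.open(png_path).convert("L")
    pixels = image.load()
    with open(bin_path, "wb") as out_file:
        for y in range(height):
            for x in range(width):
                try:
                    out_file.write(bytes([pixels[x, y]]))
                except IndexError:
                    out_file.write(bytes([0]))  # pad beyond the image edge
```

Usage would then be image_to_bin("Margarita3.png", "Margarita3.bin").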
Basically I followed this tutorial to stream processed video (not just retrieving frames and broadcasting), and it works for me (I'm new to HTML and Flask). But I want to save some computation here:
I wonder if it's possible to avoid saving the OpenCV image object to a JPEG file and then reading it back again. Isn't that wasted computation?
I think it would be even better if the Flask/HTML template could render the image directly from the raw three RGB data channels.
Any ideas? Thanks!
P.S.: I actually tried the following code:
_, encoded_img = cv2.imencode('.jpg', img, [int(cv2.IMWRITE_JPEG_QUALITY), 95])
But it gives the following error:
Debugging middleware caught exception in streamed response at a point where response headers were already sent.
Traceback (most recent call last):
File "/home/trungnb/virtual_envs/tf_cpu/lib/python3.5/site-packages/werkzeug/wsgi.py", line 704, in next
return self._next()
File "/home/trungnb/virtual_envs/tf_cpu/lib/python3.5/site-packages/werkzeug/wrappers.py", line 81, in _iter_encoded
for item in iterable:
File "/home/trungnb/workspace/coding/Mask_RCNN/web.py", line 25, in gen
if frame == None:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
You would want to compress it to JPEG anyway, as sending the raw RGB data would be slower due to its size.
You could try using cv2.imencode to compress the image in memory. Then you may be able to send the image in a similar way to flask return image created from database.
I am using pydicom to extract image data from a DICOM file. Unfortunately pydicom fails to directly extract a numpy array I can use; instead I get a data string containing all the values in hex (e.g. \x03\x80\x01\x0c\xa0\x00\x02P\x00\x04#\x00\t\x80\x00\x03...). I know that the image data is encoded in JPEG2000 format. Is there a way to reconstruct an image from these data? I already tried
img = Image.fromstring('RGB', len(pixelData), pixelData)
but there I get the error
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.7/dist-packages/PIL/Image.py", line 2064, in fromstring
return frombytes(*args, **kw)
File "/usr/local/lib/python2.7/dist-packages/PIL/Image.py", line 2049, in frombytes
im = new(mode, size)
File "/usr/local/lib/python2.7/dist-packages/PIL/Image.py", line 2015, in new
return Image()._new(core.fill(mode, size, color))
TypeError: must be 2-item sequence, not int
Is there another way to create an image out of these data?
The second parameter (size) to Image.fromstring should be a 2-tuple of width and height, not a length:
:param size: A 2-tuple, containing (width, height) in pixels.
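For illustration, the call then looks like this; the dimensions and the zero-filled stand-in for the pixel bytes are made up, and in current Pillow the method is named frombytes (fromstring is the old PIL name):

```python
from PIL import Image

width, height = 4, 3                    # illustrative dimensions
pixel_data = bytes(width * height * 3)  # stand-in for the raw RGB pixel bytes
# frombytes expects a (width, height) 2-tuple, not len(pixel_data)
img = Image.frombytes('RGB', (width, height), pixel_data)
```

Note that this only works on raw, already-decoded pixel bytes; JPEG2000-compressed data would still need to be decompressed first.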
Unfortunately pydicom has issues with JPEG compression. Is there no way of getting the images as TIFF or some other uncompressed format? Is it scan data?
I use uvccapture to take pictures and want to process them with Python and the Python Imaging Library (PIL).
The problem is that PIL cannot open those images. It throws the following error message:
Traceback (most recent call last):
File "process.py", line 6, in <module>
im = Image.open(infile)
File "/usr/lib/python2.7/dist-packages/PIL/Image.py", line 1980, in open
raise IOError("cannot identify image file")
IOError: cannot identify image file
My python code looks like this:
import Image
infile = "snap.jpg"
im = Image.open(infile)
I tried saving the images in different formats before processing them, but this does not help. Changing file permissions and owners does not help either.
The only thing that helps is to open the images, for example with jpegoptim, and overwrite the old image with the optimized one. After this, PIL can deal with these images.
What is the problem here? Are the files generated by uvccapture corrupt?
EDIT: I also found out that it is not possible to open the images generated with uvccapture using scipy. Running the command
im = scipy.misc.imread("snap.jpg")
produces the same error.
IOError: cannot identify image file
I only found a workaround for this problem: I processed the captured picture with jpegoptim, and afterwards PIL could deal with the optimized image.
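That workaround can be automated from Python. A sketch, assuming the jpegoptim binary is installed and on PATH; it falls back to re-raising the original error when jpegoptim is not available:

```python
import shutil
import subprocess

from PIL import Image

def open_capture(path):
    """Try PIL directly; if it cannot identify the file, let jpegoptim
    rewrite the JPEG in place and retry."""
    try:
        return Image.open(path)
    except IOError:
        if shutil.which("jpegoptim") is None:
            raise  # jpegoptim not installed; nothing more we can do
        subprocess.check_call(["jpegoptim", path])  # rewrites the file in place
        return Image.open(path)
```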
I'm trying to take an fft of an image in python, alter the transformed image and take a reverse fft. Specifically, I have a picture of a grid that I'd like to transform, then black out all but a central, narrow vertical slit of the transform, then take a reverse fft.
The code I'm working with now, for no alteration to transform plane:
import os
os.chdir('/Users/terra/Desktop')
import Image, numpy
i = Image.open('grid.png')
i = i.convert('L') #convert to grayscale
a = numpy.asarray(i) # a is readonly
b = abs(numpy.fft.rfft2(a))
j = Image.fromarray(b)
j.save('grid2.png')
As of now, I'm getting an error message:
Traceback (most recent call last):
File "/Users/terra/Documents/pic2.py", line 11, in
j.save('grid2.png')
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/PIL/Image.py", line 1439, in save
save_handler(self, fp, filename)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/PIL/PngImagePlugin.py", line 506, in _save
raise IOError, "cannot write mode %s as PNG" % mode
IOError: cannot write mode F as PNG
I'm very new to programming and Fourier transforms, so most related threads I've found online are over my head. Very specific help is greatly appreciated. Thanks!
The main problem is that the array contains floats after the FFT, but for it to be useful for PNG output, you need to have uint8s.
The simplest thing is to convert it to uint8 directly:
b = abs(numpy.fft.rfft2(a)).astype(numpy.uint8)
This probably will not produce the image you want, so you'll have to normalize the values in the array somehow before converting them to integers.