Converting an UploadedFile to PIL image in Django - python

I'm trying to check an image's dimension, before saving it. I don't need to change it, just make sure it fits my limits.
Right now, I can read the file, and save it to AWS without a problem.
output['pic file'] = request.POST['picture_file']
conn = myproject.S3.AWSAuthConnection(aws_key_id, aws_key)
filedata = request.FILES['picture'].read()
content_type = 'image/png'
conn.put(
    bucket_name,
    request.POST['picture_file'],
    myproject.S3.S3Object(filedata),
    {'x-amz-acl': 'public-read', 'Content-Type': content_type},
)
I need to put a step in the middle that makes sure the file has the right size / width dimensions. My file isn't coming from a form that uses ImageField, and all the solutions I've seen use that.
Is there a way to do something like this?
img = Image.open(filedata)

image = Image.open(file)
# Get the image size in pixels (size is an attribute, not a method).
(width, height) = image.size
# Check the width and height against your limits and resize if needed.
image = image.resize((width_new, height_new))

I've done this before but I can't find my old snippet... so here we go, off the top of my head:
from io import BytesIO
from PIL import Image
picture = request.FILES['picture']
img = Image.open(picture)
# check sizes .... probably using img.size and then resize
# re-save if necessary
imgstr = BytesIO()
img.save(imgstr, 'PNG')
imgstr.seek(0)  # rewind so the next read starts at the beginning
filedata = imgstr.read()
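A minimal sketch of the dimension check on its own, assuming Pillow is installed and the upload is available as raw bytes (as with `request.FILES['picture'].read()` in the question). The `MAX_WIDTH` / `MAX_HEIGHT` limits and the `check_dimensions` name are hypothetical.

```python
from io import BytesIO

from PIL import Image

# Hypothetical limits; substitute your own.
MAX_WIDTH = 1024
MAX_HEIGHT = 768


def check_dimensions(filedata, max_width=MAX_WIDTH, max_height=MAX_HEIGHT):
    """Return (width, height, ok) for image bytes without modifying them."""
    # Image.open is lazy: it identifies the file and reads the header,
    # so the size is available without decoding the pixel data.
    with Image.open(BytesIO(filedata)) as img:
        width, height = img.size
    return width, height, (width <= max_width and height <= max_height)


# Build a small PNG in memory to stand in for an uploaded file.
buf = BytesIO()
Image.new("RGB", (640, 480)).save(buf, "PNG")
result = check_dimensions(buf.getvalue())
```

Because the bytes are only inspected, the same `filedata` can be passed on to the S3 upload unchanged when the check passes.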

The code below creates the image from the request, as you want:
from PIL import ImageFile
def image_upload(request):
    for f in request.FILES.values():
        p = ImageFile.Parser()
        while 1:
            s = f.read(1024)
            if not s:
                break
            p.feed(s)
        im = p.close()
        im.save("/tmp/" + f.name)

Related

Rasterio MemoryFile to Pillow image (Or base64 output)

I'm trying to write an AWS Lambda function that takes a TIFF, converts it to JPEG, then outputs it in base64 so that lambda can serve it. But I keep running into malformed response, or issues with reshape_as_image saying axes doesn't match array.
My understanding was that the return of memfile.read() would allow me to use reshape_as_image, however my logic seems faulty.
Without saving to disk, how can I get from memfile to a base64 jpeg representation so that lambda can serve it? I've also tried pillow but I think the necessary step is where I'm failing.
def get_image(self, driver="jpeg"):
    data = self.get_image()
    with MemoryFile() as memfile:
        # Change the driver for output
        data[1]['driver'] = driver
        with memfile.open(**data[1]) as dataset:
            dataset.write(data[0])
        image = memfile.read()
    image = reshape_as_image(image)
    im = Image.open(io.BytesIO(image))
    b64data = base64.b64encode(im.tobytes()).decode('utf-8')
    return b64data
It seems reshape_as_image isn't necessary, presumably because memfile.read() already returns the encoded bytes of the image file, which Image.open can read directly.
def get_image(self, store=False, driver="GTiff"):
    data = self.crop_ortho(store)
    with MemoryFile() as memfile:
        # Change the driver for output
        data[1]['driver'] = driver
        with memfile.open(**data[1]) as dataset:
            dataset.write(data[0])
        image = memfile.read()
    im = Image.open(io.BytesIO(image))
    im = im.convert('RGB')
    # Save bytes to a byte array
    imgByteArr = io.BytesIO()
    im.save(imgByteArr, format='jpeg')
    b64data = base64.b64encode(imgByteArr.getvalue())
    return b64data
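A minimal sketch of wrapping the JPEG bytes for a Lambda response, assuming the function sits behind an API Gateway proxy integration (the question does not say how the function is invoked). `base64.b64encode` returns `bytes`, so the result is decoded to `str` to keep the response JSON-serializable. `make_image_response` is a hypothetical helper name.

```python
import base64
import json


def make_image_response(jpeg_bytes):
    """Wrap raw JPEG bytes in an API Gateway proxy-style response dict."""
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "image/jpeg"},
        # Tells API Gateway to decode the body back to binary.
        "isBase64Encoded": True,
        "body": base64.b64encode(jpeg_bytes).decode("ascii"),
    }


# Stand-in bytes; in the answer these would come from imgByteArr.getvalue().
response = make_image_response(b"\xff\xd8\xff\xe0 not a real JPEG")
json.dumps(response)  # would raise TypeError if any value were still bytes
```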

Change color scheme when extracting an image from PDF in Python

I am trying to read an image from a pdf following this post:
Extract images from PDF without resampling, in python?
So far I have managed to get the image file from the PDF, but it uses a CMYK color scheme and the picture comes out messed up.
My code is the following:
import PyPDF2
import struct
pdf_filename = 'document.pdf'
pdf_file = open(pdf_filename, 'rb')
cond_scan_reader = PyPDF2.PdfFileReader(pdf_file)
page = cond_scan_reader.getPage(4)
xObject = page['/Resources']['/XObject'].getObject()
for obj in xObject:
    print(xObject[obj])
    if xObject[obj]['/Subtype'] == '/Image':
        if xObject[obj]['/Filter'] == '/DCTDecode':
            data = xObject[obj]._data
            img = open("image" + ".jpg", "wb")
            img.write(data)
            img.close()
pdf_file.close()
The point is that when I save, the colors are all weird, I believe it's because of the colorScheme. I have the following in the console:
{'/Type': '/XObject', '/Subtype': '/Image', '/Width': 1122, '/Height': 502, '/Interpolate': <PyPDF2.generic.BooleanObject object at 0x1061574a8>, '/ColorSpace': '/DeviceCMYK', '/BitsPerComponent': 8, '/Filter': '/DCTDecode'}
As you can see, the ColorSpace is CMYK, and I believe that's why the colors of the image are weird.
That's the image I have:
This is the original image (it is inside a pdf file):
Can anyone help me?
Thanks in advance.
Israel
A CMYK-mode JPEG image embedded in a PDF must be inverted.
But PIL does not support inverting CMYK-mode images,
so I solved it using numpy.
The full source is at the link below.
https://github.com/Gaia3D/pdfImageExtractor/blob/master/extrectImage.py
imgData = np.frombuffer(img.tobytes(), dtype='B')
invData = np.full(imgData.shape, 255, dtype='B')
invData -= imgData
img = Image.frombytes(img.mode, img.size, invData.tobytes())
img.save(outFileName + ".jpg")
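The inversion itself is just `255 - value` for every 8-bit sample. A minimal sketch of the same step without numpy, using a byte lookup table; `invert_bytes` is a hypothetical helper name and the sample pixel is made up for illustration.

```python
# Lookup table mapping every byte value v to 255 - v.
INVERT_TABLE = bytes(255 - value for value in range(256))


def invert_bytes(raw):
    """Invert each 8-bit sample in raw channel data."""
    return raw.translate(INVERT_TABLE)


# The four CMYK samples of one pixel, as would come from img.tobytes().
pixel = bytes([0, 255, 16, 128])
inverted = invert_bytes(pixel)
print(list(inverted))  # [255, 0, 239, 127]
```

The result can be passed to `Image.frombytes(img.mode, img.size, inverted)` in place of `invData.tobytes()`.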

Image drawn to reportlab pdf bigger than pdf paper size

I'm writing a program which takes all the pictures in a given folder and aggregates them into a PDF. The problem I have is that when the images are drawn, they are too big and are oddly rotated to the left. I've searched everywhere and haven't found anything, even in the ReportLab documentation.
Here's the code:
import os
from PIL import Image
from PyPDF2 import PdfFileWriter, PdfFileReader
from reportlab.pdfgen import canvas
from reportlab.lib.units import cm
from StringIO import StringIO
def main():
    images = image_search()
    output = PdfFileWriter()
    for image in images:
        Image_file = Image.open(image) # need to convert the image to the specific size first.
        width, height = Image_file.size
        im_width = 1 * cm
        # Using ReportLab to insert image into PDF
        watermark_str = "watermark" + str(images.index(image)) + '.pdf'
        imgDoc = canvas.Canvas(watermark_str)
        # Draw image on Canvas and save PDF in buffer
        # define the aspect ratio first
        aspect = height / float(width)
        ## Drawing the image
        imgDoc.drawImage(image, 0,0, width = im_width, height = (im_width * aspect)) ## at (399,760) with size 160x160
        imgDoc.showPage()
        imgDoc.save()
        # Get the watermark file just created
        watermark = PdfFileReader(open(watermark_str, "rb"))
        #Get our files ready
        pdf1File = open('sample.pdf', 'rb')
        page = PdfFileReader(pdf1File).getPage(0)
        page.mergePage(watermark.getPage(0))
        #Save the result
        output.addPage(page)
    output.write(file("output.pdf","wb"))
#The function which searches the current directory for image files.
def image_search():
    found_images = []
    for doc in os.listdir(os.curdir):
        image_ext = ['.jpg', '.png', '.PNG', '.jpeg', '.JPG']
        for ext in image_ext:
            if doc.endswith(ext):
                found_images.append(doc)
    return found_images
main()
I also tried scaling and specifying the aspect ratio using the im_width variable, which gave the same output.
After a little bit of confusion about your goal, I figured out that you want to make a PDF overview of the images in the current folder. For that we actually don't need PyPDF2, as ReportLab offers everything we need.
See the code below with the comments as guidelines:
from PIL import Image
from reportlab.lib.pagesizes import A4
from reportlab.pdfgen import canvas
def main():
    output_file_loc = "overview.pdf"
    imgDoc = canvas.Canvas(output_file_loc)
    imgDoc.setPageSize(A4)  # This is actually the default page size
    document_width, document_height = A4
    images = image_search()  # image_search() is the function from the question
    for image in images:
        # Open the image file to get image dimensions
        Image_file = Image.open(image)
        image_width, image_height = Image_file.size
        image_aspect = image_height / float(image_width)
        # Determine the dimensions of the image in the overview
        print_width = document_width
        print_height = document_width * image_aspect
        # Draw the image on the current page
        # Note: As reportlab uses bottom left as (0,0) we need to determine the start position by subtracting the
        # dimensions of the image from those of the document
        imgDoc.drawImage(image, document_width - print_width, document_height - print_height, width=print_width,
                         height=print_height)
        # Inform Reportlab that we want a new page
        imgDoc.showPage()
    # Save the document
    imgDoc.save()
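The answer scales every image to the full page width, so a tall portrait image can come out taller than the page. A minimal sketch of the arithmetic for fitting both dimensions, assuming sizes in points with A4 written out explicitly (210 mm x 297 mm); `fit_within` is a hypothetical helper, not part of ReportLab.

```python
# A4 in PostScript points (210 mm x 297 mm at 72 points per inch).
A4_WIDTH = 210 / 25.4 * 72
A4_HEIGHT = 297 / 25.4 * 72


def fit_within(image_width, image_height, page_width=A4_WIDTH, page_height=A4_HEIGHT):
    """Return (print_width, print_height) that fits the page and keeps the aspect ratio."""
    # Use the tighter of the two constraints so neither dimension overflows.
    scale = min(page_width / float(image_width), page_height / float(image_height))
    return image_width * scale, image_height * scale


landscape = fit_within(1122, 502)  # limited by the page width
portrait = fit_within(500, 2000)   # limited by the page height
```

The returned values can be used as `width` and `height` in `drawImage`, with the y position computed as `document_height - print_height` as in the answer.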

Wand convert pdf to jpeg and storing pages in file-like objects

I am trying to convert a PDF to JPEGs using wand by iterating over the SingleImages in image.sequence and saving each image separately. I am saving each image on AWS, with database references, using Django.
image_pdf = Image(blob=pdf_blob)
image_jpeg = image_pdf.convert('jpeg')
for page in image_jpeg.sequence:
    memory_file = SimpleUploadedFile(
        "{}.jpeg".format(page.page_number),
        page.container.make_blob())
    spam = Spam.objects.create(
        page_image=memory_file,
        caption="Spam")
This doesn't work: page.container refers to the parent Image class, so the first page is written over and over again. How do I get the second frame/page for saving?
Actually, you can get per-file blobs:
for img in image_jpeg.sequence:
    img_page = Image(image=img)
Then you can work with each img_page variable like a full-fledged image: change format, resize, save, etc.
It seems you cannot get per-file blobs without messing with ctypes. So this is my solution:
from path import path  # wrapper for os.path
import re
import shutil
import tempfile
image_pdf = Image(blob=pdf_blob)
image_jpeg = image_pdf.convert('jpeg')
temp_dir = path(tempfile.mkdtemp())
# set base file name (join)
image_jpeg.save(temp_dir / 'pdf_title.jpeg')
images = temp_dir.files()
sorted_images = sorted(
    images,
    key=lambda img_path: int(re.search(r'\d+', img_path.name).group())
)
for img in sorted_images:
    with open(img, 'rb') as img_fd:
        memory_file = SimpleUploadedFile(
            img.name,
            img_fd.read()
        )
        spam = Spam.objects.create(
            page_image=memory_file,
            caption="Spam Spam",
        )
# tempfile has no rmtree; shutil.rmtree removes the directory and its contents
shutil.rmtree(temp_dir)
Not as clean as doing it all in memory, but it gets it done.
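The numeric sort key matters because a plain string sort would put page 10 before page 2. A minimal stdlib-only sketch of the same temp-directory pattern, with empty placeholder files standing in for the JPEGs that wand would write. The `pdf_title-N.jpeg` naming is an assumption about how the pages get numbered, and `page_number` is a hypothetical helper.

```python
import os
import re
import tempfile


def page_number(filename):
    """Extract the first run of digits in a file name as an integer."""
    return int(re.search(r"\d+", filename).group())


# TemporaryDirectory removes the directory and its contents on exit.
with tempfile.TemporaryDirectory() as temp_dir:
    # Placeholder files standing in for the per-page JPEGs.
    for n in (10, 2, 0, 1):
        open(os.path.join(temp_dir, "pdf_title-{}.jpeg".format(n)), "wb").close()

    plain_order = sorted(os.listdir(temp_dir))
    page_order = sorted(os.listdir(temp_dir), key=page_number)

print(plain_order)  # ['pdf_title-0.jpeg', 'pdf_title-1.jpeg', 'pdf_title-10.jpeg', 'pdf_title-2.jpeg']
print(page_order)   # ['pdf_title-0.jpeg', 'pdf_title-1.jpeg', 'pdf_title-2.jpeg', 'pdf_title-10.jpeg']
```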

Resizing uploaded files in django using PIL

I am using PIL to resize an uploaded file using this method:
def resize_uploaded_image(buf):
    imagefile = StringIO.StringIO(buf.read())
    imageImage = Image.open(imagefile)
    (width, height) = imageImage.size
    (width, height) = scale_dimensions(width, height, longest_side=240)
    resizedImage = imageImage.resize((width, height))
    return resizedImage
I then use this method to get the resizedImage in my main view method:
image = request.FILES['avatar']
resizedImage = resize_uploaded_image(image)
content = django.core.files.File(resizedImage)
acc = Account.objects.get(account=request.user)
acc.avatar.save(image.name, content)
However, this gives me the 'read' error.
Trace:
Exception Type: AttributeError at /myapp/editAvatar
Exception Value: read
Any idea how to fix this? I have been at it for hours!
Thanks!
Nikunj
Here's how you can take a file-like object, manipulate it as an image in PIL, then turn it back into a file-like object:
def resize_uploaded_image(buf):
    image = Image.open(buf)
    (width, height) = image.size
    (width, height) = scale_dimensions(width, height, longest_side=240)
    resizedImage = image.resize((width, height))
    # Turn back into file-like object
    resizedImageFile = StringIO.StringIO()
    resizedImage.save(resizedImageFile, 'PNG', optimize=True)
    resizedImageFile.seek(0)  # So that the next read starts at the beginning
    return resizedImageFile
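`scale_dimensions` is not shown in the question. A minimal sketch, assuming it shrinks the image so that its longest side equals `longest_side` and never enlarges smaller images (both are guesses about the asker's helper):

```python
def scale_dimensions(width, height, longest_side=240):
    """Return (width, height) scaled so the longest side is at most longest_side."""
    longest = max(width, height)
    if longest <= longest_side:
        # Already small enough; leave the dimensions unchanged.
        return width, height
    ratio = longest_side / float(longest)
    # Round to whole pixels and keep each side at least 1 pixel.
    return max(1, int(round(width * ratio))), max(1, int(round(height * ratio)))


print(scale_dimensions(1122, 502))  # (240, 107)
print(scale_dimensions(100, 50))    # (100, 50)
```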
Note that there's already a handy thumbnail() method for PIL images. It resizes the image in place and returns None, so keep working with the original object afterwards. This is a variant of the thumbnail code I use in my own project:
def resize_uploaded_image(buf):
    from io import BytesIO
    from PIL import Image
    image = Image.open(buf)
    maxSize = (240, 240)
    # thumbnail() modifies image in place and preserves the aspect ratio.
    # Image.LANCZOS is the filter older PIL releases called Image.ANTIALIAS.
    image.thumbnail(maxSize, Image.LANCZOS)
    # Turn back into file-like object
    resizedImageFile = BytesIO()
    image.save(resizedImageFile, 'PNG', optimize=True)
    resizedImageFile.seek(0)  # So that the next read starts at the beginning
    return resizedImageFile
It would be better to save the uploaded image as is and then resize it for display in the template as you wish. This way you will be able to resize images at runtime. sorl-thumbnail is a Django app which you can use for resizing images in templates; it is easy to use, and you can use it in a view too. Its documentation includes examples.
