Uploading Image with PIL+twython on twitter without saving it - python

After much searching I finally found the answer.
Step 1
Write the code for the user to provide the link and for PIL to resize the image according to some conditions (not relevant to the question).
PIL opens the image from the link like this:
img=Image.open(io.BytesIO(requests.get(url).content))
where url is the link.
Step 2
Then PIL must save the image into an io.BytesIO object, and you must call seek(0) on that object before uploading:
blob = io.BytesIO()
img.save(blob, 'JPEG')
blob.seek(0)
response = twitter.upload_media(media=blob)
Step 3
Proceed according to the documentation:
twitter.update_status(status='Checkout this cool image!', media_ids=[response['media_id']])
Twython's documentation is outdated: StringIO has moved to the io package and no longer accepts bytes objects. I also don't see the logic in converting a bytes object to a string only to send it as a bytes object again.
https://twython.readthedocs.io/en/latest/usage/advanced_usage.html
The media parameter also accepts io.BytesIO objects, as shown above and in this simpler example:
response = twitter.upload_media(media=io.BytesIO(requests.get(url).content))
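The seek(0) call in Step 2 matters because writing to a BytesIO leaves the position at the end of the buffer, so anything that reads from it afterwards gets no data. Here is a minimal sketch using only the standard library. The write() call stands in for img.save(blob, 'JPEG'), and the payload bytes and variable names are made up for illustration:

```python
import io

blob = io.BytesIO()
# Stand-in for img.save(blob, 'JPEG'): any write moves the position to the end.
blob.write(b"\xff\xd8\xff fake JPEG bytes")

read_without_seek = blob.read()  # position is at the end, so nothing is left to read
blob.seek(0)                     # rewind to the start of the buffer
read_after_seek = blob.read()    # now the whole payload comes back

print(len(read_without_seek), len(read_after_seek))
```

This is presumably also why the simpler io.BytesIO(requests.get(url).content) form needs no seek: a BytesIO constructed from existing bytes starts at position 0.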

Related

Save JPEG comment using Pillow

I need to save an image in Python (created as a NumPy array) as a JPEG file, while including a "comment" in the file with some specific metadata. This metadata will be used by another (third-party) application and is a simple ASCII string. I have a sample image including such a "comment", which I can read out using Pillow (PIL), via the image.info['comment'] or the image.app['COM'] property. However, when I try a simple round-trip, i.e. loading my sample image and saving it again under a different file name, the comment is no longer preserved. Equally, I found no way to include a comment in a newly created image.
I am aware that EXIF tags are the preferred way to save metadata in JPEG images, but as mentioned, the third-party application only accepts this data as a "comment", not as EXIF, which I cannot change. After reading this question, I looked into the binary structure of my sample file and found the comment at the start of the file, after a few bytes of some other (meta)data. I do however not know a lot about binary file manipulation, and also I was wondering if there is a more elegant way, other than messing with the binary...
EDIT: minimum example:
from PIL import Image
img = Image.open(path) # where path is the path to the sample image
# this prints the desired metadata if it is correctly saved in loaded image
print(img.info["comment"])
img.save(new_path) # save with different file name
img.close()
# now open to see if it has been saved correctly
new_img = Image.open(new_path)
print(new_img.info['comment']) # now results in KeyError
I also tried img.save(new_path, info=img.info), but this does not seem to have any effect. Since img.info['comment'] appears identical to img.app['COM'], I tried img.save(new_path, app=img.app), which does not work either.
Just been having a play with this and I couldn't see anything directly in Pillow to support this. I've found that the save() method supports a parameter called extra that can be used to pass arbitrary bytes to the output file.
We then just need a simple method to turn a comment into a valid JPEG segment, for example:
import struct
from PIL import Image
def make_jpeg_variable_segment(marker: int, payload: bytes) -> bytes:
    "make a JPEG segment from the given payload"
    return struct.pack('>HH', marker, 2 + len(payload)) + payload
def make_jpeg_comment_segment(comment: bytes) -> bytes:
    "make a JPEG comment/COM segment"
    return make_jpeg_variable_segment(0xFFFE, comment)
# open source image
with Image.open("foo.jpeg") as im:
    # save out with new JPEG comment
    im.save('bar.jpeg', extra=make_jpeg_comment_segment("hello world".encode()))
# read file back in to ensure comment round-trips
with Image.open('bar.jpeg') as im:
    print(im.app['COM'])
    print(im.info['comment'])
Note that in my initial attempts I tried appending the comment segment at the end of the file, but Pillow wouldn't load this comment even after calling the .load() method to force it to load the entire JPEG file.
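To make the segment layout concrete, here is a small standard-library sketch of what the helper above produces and how the payload can be read back from the raw bytes. The function names make_com_segment and read_com_segment are my own for illustration and are not part of Pillow:

```python
import struct

COM_MARKER = 0xFFFE  # JPEG comment (COM) marker


def make_com_segment(comment: bytes) -> bytes:
    # The 2-byte length field counts itself plus the payload, but not the marker.
    return struct.pack(">HH", COM_MARKER, 2 + len(comment)) + comment


def read_com_segment(segment: bytes) -> bytes:
    # Unpack the big-endian marker and length, then slice out the payload.
    marker, length = struct.unpack(">HH", segment[:4])
    if marker != COM_MARKER:
        raise ValueError("not a COM segment")
    return segment[4:2 + length]


segment = make_com_segment(b"hello world")
print(segment.hex(" "))
print(read_com_segment(segment))
```

For an 11-byte comment the length field is 13 (0x000d), so the segment starts with ff fe 00 0d followed by the comment bytes.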
Update: The upcoming Pillow version 9.4.0 will support this directly by passing a comment parameter while saving, e.g.:
with Image.open("foo.jpeg") as im:
    im.save('bar.jpeg', comment="hello world")
Hopefully that makes things easier!

How do I stop creating a broken png when converting from base64 using Python

I've tried this a number of ways and have searched high and low, but no matter what I try (including all posts I could find here on the subject) I can't manage to convert my base64 string of an HTML document / canvas containing JavaScript into a valid png.
I'm not getting the incorrect padding error which is quite common (I have ensured 'data:text/html;base64,' is not included at the start of the base64 string.)
I have also checked the base64 string both by checking and running the original .html file, which renders in browser with no issue, and decoding the string with an online decoder.
I know I must be missing something very simple here, but after several hours I'm ready to pull my hair out.
My encoding step is as follows:
htmlSource = bytes(htmlSource,'UTF-8')
fullBase64 = base64.b64encode(htmlSource)
The resultant base64 string is included in my attempts below, which should generate a turquoise oval with shadow on a dirty white background in 4k.
The following attempts all create a png file, only 1kb in size, which cannot be opened - 'It may be damaged or use a file format that Preview doesn’t recognise.':
import base64
img_data = b'PCFET0NUWVBFIGh0bWw+CjxodG1sPgogIDxoZWFkPgogICAgPG1ldGEgbmFtZT0ndmlld3BvcnQnIGNvbnRlbnQ9J3dpZHRoPWRldmljZS13aWR0aCwgaW5pdGlhbC1zY2FsZT0xLjAnPgogIDwvaGVhZD4KPGJvZHk+CjxzdHlsZT4KICAgIGJvZHksIGh0bWwgewogICAgICBwYWRkaW5nOiAwICFpbXBvcnRhbnQ7CiAgICAgIG1hcmdpbjogMCAhaW1wb3J0YW50OwogICAgICBtYXJnaW46IDA7CiAgICB9CiAgICAqIHsKICAgICAgcGFkZGluZzogMDsKICAgICAgbWFyZ2luOiAwOwogICAgfQo8L3N0eWxlPgoKPGNhbnZhcyBpZD0nbXlDYW52YXMnIHN0eWxlPSdvYmplY3QtZml0OiBjb250YWluOyB3aWR0aDogOTl2dzsgaGVpZ2h0OiA5OXZoOyc+CllvdXIgYnJvd3NlciBkb2VzIG5vdCBzdXBwb3J0IHRoZSBIVE1MNSBjYW52YXMgdGFnLjwvY2FudmFzPgoKPHNjcmlwdD4KdmFyIGNhbnZhcyA9IGRvY3VtZW50LmdldEVsZW1lbnRCeUlkKCdteUNhbnZhcycpOwpjYW52YXMud2lkdGggPSA0MDk2OwpjYW52YXMuaGVpZ2h0ID0gNDA5NjsKY2FudmFzLnN0eWxlLndpZHRoID0gJzk5dncnOwpjYW52YXMuc3R5bGUuaGVpZ2h0ID0gJzk5dmgnOwp2YXIgY3R4ID0gY2FudmFzLmdldENvbnRleHQoJzJkJyk7CnZhciBjYW52YXNXID0gY3R4LmNhbnZhcy53aWR0aDsKdmFyIGNhbnZhc0ggPSBjdHguY2FudmFzLmhlaWdodDsKCmN0eC5maWxsU3R5bGUgPSAncmdiYSgyMDAsIDE5NywgMTc3LCAxKSc7CmN0eC5maWxsUmVjdCgwLCAwLCBjYW52YXNXLCBjYW52YXNIKTsKCmN0eC5zaGFkb3dCbHVyID0gY2FudmFzVzsKY3R4LnNoYWRvd0NvbG9yID0gJ3JnYmEoMCwgMCwgMCwgMC4zKSc7CmN0eC5iZWdpblBhdGgoKTsKY3R4LmZpbGxTdHlsZSA9ICdyZ2JhKDUxLCAyMjAsIDE5MSwgMSknOwpjdHguZWxsaXBzZShjYW52YXNXIC8gMiwgY2FudmFzSCAvIDIgLCBjYW52YXNXICogLjQsIGNhbnZhc0ggKiAuNDUsIDAsIDAsIDIgKiBNYXRoLlBJKTsKY3R4LmZpbGwoKTsKCgoKPC9zY3JpcHQ+Cgo8L2JvZHk+CjwvaHRtbD4='
with open("turquoise egg.png", "wb") as fh:
    fh.write(base64.decodebytes(img_data))
Version 2
from binascii import a2b_base64
data = 'PCFET0NUWVBFIGh0bWw+CjxodG1sPgogIDxoZWFkPgogICAgPG1ldGEgbmFtZT0ndmlld3BvcnQnIGNvbnRlbnQ9J3dpZHRoPWRldmljZS13aWR0aCwgaW5pdGlhbC1zY2FsZT0xLjAnPgogIDwvaGVhZD4KPGJvZHk+CjxzdHlsZT4KICAgIGJvZHksIGh0bWwgewogICAgICBwYWRkaW5nOiAwICFpbXBvcnRhbnQ7CiAgICAgIG1hcmdpbjogMCAhaW1wb3J0YW50OwogICAgICBtYXJnaW46IDA7CiAgICB9CiAgICAqIHsKICAgICAgcGFkZGluZzogMDsKICAgICAgbWFyZ2luOiAwOwogICAgfQo8L3N0eWxlPgoKPGNhbnZhcyBpZD0nbXlDYW52YXMnIHN0eWxlPSdvYmplY3QtZml0OiBjb250YWluOyB3aWR0aDogOTl2dzsgaGVpZ2h0OiA5OXZoOyc+CllvdXIgYnJvd3NlciBkb2VzIG5vdCBzdXBwb3J0IHRoZSBIVE1MNSBjYW52YXMgdGFnLjwvY2FudmFzPgoKPHNjcmlwdD4KdmFyIGNhbnZhcyA9IGRvY3VtZW50LmdldEVsZW1lbnRCeUlkKCdteUNhbnZhcycpOwpjYW52YXMud2lkdGggPSA0MDk2OwpjYW52YXMuaGVpZ2h0ID0gNDA5NjsKY2FudmFzLnN0eWxlLndpZHRoID0gJzk5dncnOwpjYW52YXMuc3R5bGUuaGVpZ2h0ID0gJzk5dmgnOwp2YXIgY3R4ID0gY2FudmFzLmdldENvbnRleHQoJzJkJyk7CnZhciBjYW52YXNXID0gY3R4LmNhbnZhcy53aWR0aDsKdmFyIGNhbnZhc0ggPSBjdHguY2FudmFzLmhlaWdodDsKCmN0eC5maWxsU3R5bGUgPSAncmdiYSgyMDAsIDE5NywgMTc3LCAxKSc7CmN0eC5maWxsUmVjdCgwLCAwLCBjYW52YXNXLCBjYW52YXNIKTsKCmN0eC5zaGFkb3dCbHVyID0gY2FudmFzVzsKY3R4LnNoYWRvd0NvbG9yID0gJ3JnYmEoMCwgMCwgMCwgMC4zKSc7CmN0eC5iZWdpblBhdGgoKTsKY3R4LmZpbGxTdHlsZSA9ICdyZ2JhKDUxLCAyMjAsIDE5MSwgMSknOwpjdHguZWxsaXBzZShjYW52YXNXIC8gMiwgY2FudmFzSCAvIDIgLCBjYW52YXNXICogLjQsIGNhbnZhc0ggKiAuNDUsIDAsIDAsIDIgKiBNYXRoLlBJKTsKY3R4LmZpbGwoKTsKCgoKPC9zY3JpcHQ+Cgo8L2JvZHk+CjwvaHRtbD4='
binary_data = a2b_base64(data)
fd = open('turquoise egg.png', 'wb')
fd.write(binary_data)
fd.close()
Version 3
import base64
fileString = 'PCFET0NUWVBFIGh0bWw+CjxodG1sPgogIDxoZWFkPgogICAgPG1ldGEgbmFtZT0ndmlld3BvcnQnIGNvbnRlbnQ9J3dpZHRoPWRldmljZS13aWR0aCwgaW5pdGlhbC1zY2FsZT0xLjAnPgogIDwvaGVhZD4KPGJvZHk+CjxzdHlsZT4KICAgIGJvZHksIGh0bWwgewogICAgICBwYWRkaW5nOiAwICFpbXBvcnRhbnQ7CiAgICAgIG1hcmdpbjogMCAhaW1wb3J0YW50OwogICAgICBtYXJnaW46IDA7CiAgICB9CiAgICAqIHsKICAgICAgcGFkZGluZzogMDsKICAgICAgbWFyZ2luOiAwOwogICAgfQo8L3N0eWxlPgoKPGNhbnZhcyBpZD0nbXlDYW52YXMnIHN0eWxlPSdvYmplY3QtZml0OiBjb250YWluOyB3aWR0aDogOTl2dzsgaGVpZ2h0OiA5OXZoOyc+CllvdXIgYnJvd3NlciBkb2VzIG5vdCBzdXBwb3J0IHRoZSBIVE1MNSBjYW52YXMgdGFnLjwvY2FudmFzPgoKPHNjcmlwdD4KdmFyIGNhbnZhcyA9IGRvY3VtZW50LmdldEVsZW1lbnRCeUlkKCdteUNhbnZhcycpOwpjYW52YXMud2lkdGggPSA0MDk2OwpjYW52YXMuaGVpZ2h0ID0gNDA5NjsKY2FudmFzLnN0eWxlLndpZHRoID0gJzk5dncnOwpjYW52YXMuc3R5bGUuaGVpZ2h0ID0gJzk5dmgnOwp2YXIgY3R4ID0gY2FudmFzLmdldENvbnRleHQoJzJkJyk7CnZhciBjYW52YXNXID0gY3R4LmNhbnZhcy53aWR0aDsKdmFyIGNhbnZhc0ggPSBjdHguY2FudmFzLmhlaWdodDsKCmN0eC5maWxsU3R5bGUgPSAncmdiYSgyMDAsIDE5NywgMTc3LCAxKSc7CmN0eC5maWxsUmVjdCgwLCAwLCBjYW52YXNXLCBjYW52YXNIKTsKCmN0eC5zaGFkb3dCbHVyID0gY2FudmFzVzsKY3R4LnNoYWRvd0NvbG9yID0gJ3JnYmEoMCwgMCwgMCwgMC4zKSc7CmN0eC5iZWdpblBhdGgoKTsKY3R4LmZpbGxTdHlsZSA9ICdyZ2JhKDUxLCAyMjAsIDE5MSwgMSknOwpjdHguZWxsaXBzZShjYW52YXNXIC8gMiwgY2FudmFzSCAvIDIgLCBjYW52YXNXICogLjQsIGNhbnZhc0ggKiAuNDUsIDAsIDAsIDIgKiBNYXRoLlBJKTsKY3R4LmZpbGwoKTsKCgoKPC9zY3JpcHQ+Cgo8L2JvZHk+CjwvaHRtbD4='
decodeit = open('turquoise egg.png', 'wb')
decodeit.write(base64.b64decode((fileString)))
decodeit.close()
FWIW I originally used the following code to create a png from the HTML without using base64, but it would only ever save the first element of JavaScript generated on the canvas (i.e. the background). Since I require the information in base64 anyway, I thought I would approach it this way in order to capture the complete image.
import imgkit
with open('html.html', 'r') as file:
    imgkit.from_file(file, 'png.png')
Html2Image provided the solution I was looking for.
Whilst imgkit wasn't saving the fully rendered canvas, taking a screenshot with Html2Image does. Documentation is here and I implemented it as follows:
from html2image import Html2Image
hti = Html2Image()  # instance creation was missing from the snippet; default constructor assumed
hti.screenshot(
    html_file='html.html',
    size=(imageW, imageH),  # imageW, imageH: output width and height in pixels
    save_as='png.png',
)
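As a side note on why the three base64 attempts produce a file that cannot be opened: decoding base64 only gives back the original bytes, and here those bytes are HTML text, not PNG data, whatever extension the output file gets. A minimal check, using just the first 20 characters of the string from the question (the constant and variable names are mine):

```python
import base64

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# First 20 characters of the base64 string from the question.
encoded_prefix = b"PCFET0NUWVBFIGh0bWw+"

decoded_prefix = base64.b64decode(encoded_prefix)
looks_like_png = decoded_prefix.startswith(PNG_SIGNATURE)

print(decoded_prefix)
print(looks_like_png)
```

The decoded prefix is the start of an HTML document, so something still has to render the page (as Html2Image does) before there is any image data to save.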

Only one image from 5 is downloaded and it knocks out an error

import requests
from PIL import Image
url_shoes_for_choice = [
"https://content.adidas.co.in/static/Product-CM7531/Unisex_OUTDOOR_SANDALS_CM7531_1.jpg",
"https://cdn.shopify.com/s/files/1/0080/1374/2161/products/product-image-897958210_640x.jpg?v=1571713841",
"https://cdn.chamaripashoes.com/media/catalog/product/cache/9/image/9df78eab33525d08d6e5fb8d27136e95/1/_/1_8_3.jpg",
"https://ae01.alicdn.com/kf/HTB1EyKjaI_vK1Rjy0Foq6xIxVXah.jpg_q50.jpg",
"https://www.converse.com/dw/image/v2/BCZC_PRD/on/demandware.static/-/Sites-cnv-master-catalog/default/dwb9eb8c43/images/a_107/167708C_A_107X1.jpg"
]
def img():
    for url in url_shoes_for_choice:
        image = requests.get(url, stream=True).raw
        out = Image.open(image)
        out.save('image/image.jpg', 'jpg')
if __name__=="__main__":
    img()
Error:
OSError: cannot identify image file <_io.BytesIO object at 0x7fa185c52d58>
The problem is with the byte data returned by requests.get(url, stream=True).raw for one of the images. I'm not sure, but I guess the raw data of the third image is invalid. Instead of reading the raw stream, we can fetch the response content and wrap it in BytesIO, which fixes the byte data.
I also fixed one more thing in your original code: I added numbering to the file names so each image is saved under a different name.
from io import BytesIO
def img():
    for count, url in enumerate(url_shoes_for_choice):
        image = requests.get(url, stream=True)
        with BytesIO(image.content) as f:
            with Image.open(f) as out:
                # out.show() # See the images
                out.save('image/image{}.jpg'.format(count))
(This works fine, but I'm not sure what the main issue was. If anyone knows exactly what the issue is, please comment and explain.)
I opened the first link in my browser and saved the image. It's actually a webp file.
$ file Unisex_OUTDOOR_SANDALS_CM7531_1.webp
Unisex_OUTDOOR_SANDALS_CM7531_1.webp: RIFF (little-endian) data, Web/P image, VP8 encoding, 500x500, Scaling: [none]x[none], YUV color, decoders should clamp
You explicitly tell the image library that it should expect a jpg. When you remove that parameter and let it figure it out on its own using out.save('image/image.jpg') the first image successfully downloads for me.
The first two images work this way if you make sure to save each under a different name:
def img():
    i = 0
    for url in url_shoes_for_choice:
        i += 1
        image = requests.get(url, stream=True).raw
        out = Image.open(image)
        out.save('image{}.jpg'.format(i))
The third is a valid JPEG file, as is the fourth, but using the JFIF standard 1.01, which I'm hearing of for the first time. I'm pretty sure you'll have to figure out support for such different file types yourself.
It is worth noting that if I download the images in Chrome and open those with Python, nothing fails. So Chrome might be adding information to the file.
The PIL/Pillow documentation explains here that you need a new enough version for animated images, but that is not your problem:
Support for animated WebP files will only be enabled if the system
WebP library is v0.5.0 or later. You can check webp animation support
at runtime by calling features.check("webp_anim").
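The file check above can also be approximated in Python by looking at the first bytes of the download before handing it to Pillow. This is only a rough sketch: sniff_image_type is a hypothetical helper, it knows just three signatures, and the sample byte strings are made-up headers rather than real downloads:

```python
def sniff_image_type(data: bytes) -> str:
    """Guess the container format from the leading bytes of a file."""
    if data[:3] == b"\xff\xd8\xff":
        return "jpeg"  # JPEG start-of-image marker followed by another marker
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"  # RIFF container whose form type is WEBP
    if data[:8] == b"\x89PNG\r\n\x1a\n":
        return "png"   # fixed 8-byte PNG signature
    return "unknown"


# Made-up headers for illustration only.
webp_header = b"RIFF\x00\x00\x00\x00WEBPVP8 "
jpeg_header = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00"
html_body = b"<!DOCTYPE html><html>"

print(sniff_image_type(webp_header))
print(sniff_image_type(jpeg_header))
print(sniff_image_type(html_body))
```

A result of "unknown" would hint that the server returned something other than an image (an error page, for example), which may be worth ruling out when Pillow raises "cannot identify image file".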

How to get Pillow to make identical copies (edit EXIF in-line)

You would think it's quite simple, but the following code doesn't work as I would expect:
from hashlib import md5
from PIL import Image
im = Image.open("/tmp/original.jpg")
im.save("/tmp/new.jpg", quality="keep")
original = Image.open("/tmp/original.jpg")
new = Image.open("/tmp/new.jpg")
assert md5(original.tobytes()).hexdigest() == md5(new.tobytes()).hexdigest()
Why is it that when I'm simply saving an image as a new file, and keeping the quality settings the same, that the image data isn't identical? What am I missing?
Update (Explanation):
My problem is that I have a Pillow JpegImage object being handed to my code as part of a pipeline and I don't have control over the step at which the file is saved to disk:
<magic> → <my code> → <magic that saves to disk>
All I want my code to do is add/update/replace (any of these) the EXIF data for the to-be-saved JPEG image. As this info doesn't appear to be editable on an image object, the only way that I can figure to do this is to save the image to a temporary place (like BytesIO) and then re-open it with Image.open() before passing it to the next function in the chain.
Please tell me that there's a smarter, more efficient way to do this?
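For reference, the BytesIO round trip described above can be sketched as follows. This is only an illustration of that workaround, not a smarter alternative: the function name and the choice of the ImageDescription tag (0x010E) are mine, the source image is generated in memory, and the save step re-encodes the JPEG, so the pixel data is not guaranteed to stay identical:

```python
import io

from PIL import Image

IMAGE_DESCRIPTION = 0x010E  # EXIF ImageDescription tag, chosen for illustration


def replace_exif_in_memory(im, description):
    """Re-encode `im` as JPEG in memory with an updated ImageDescription."""
    exif = im.getexif()
    exif[IMAGE_DESCRIPTION] = description
    buffer = io.BytesIO()
    # Saving re-encodes the image; the exif bytes are written alongside it.
    im.save(buffer, format="JPEG", exif=exif.tobytes())
    buffer.seek(0)  # rewind so Image.open reads from the start
    return Image.open(buffer)


# Stand-in for the image object handed over by the earlier pipeline step.
source = Image.new("RGB", (16, 16), (0, 128, 128))
tagged = replace_exif_in_memory(source, "edited in memory")
description = tagged.getexif()[IMAGE_DESCRIPTION]
print(tagged.format, description)
```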

Python/Django download Image from URL, modify, and save to ImageField

I've been looking for a way to download an image from a URL, perform some image manipulation (resize) on it, and then save it to a Django ImageField. Using the two great posts (linked below), I have been able to download and save an image to an ImageField. However, I've been having some trouble manipulating the file once I have it.
Specifically, the model field save() method requires a File() object as the second parameter. So my data has to eventually be a File() object. The blog posts linked below show how to use urllib2 to save an image URL into a File() object. This is great; however, I also want to manipulate the image using PIL as an Image() object (or ImageFile object).
My preferred approach would be to load the image URL directly into an Image() object, perform the resize, convert it to a File() object and then save it in the model. However, my attempts to convert an Image() to a File() have failed. If at all possible, I want to limit the number of times I write to the disk, so I'd like to do this object transformation in memory or using a NamedTemporaryFile(delete=True) object so I don't have to worry about extra files lying around. (Of course, I want the file to be written to disk once it is saved via the model.)
import urllib2
from PIL import Image, ImageFile
from django.core.files import File
from django.core.files.temp import NamedTemporaryFile
inStream = urllib2.urlopen('http://www.google.com/intl/en_ALL/images/srpr/logo1w.png')
parser = ImageFile.Parser()
while True:
    s = inStream.read(1024)
    if not s:
        break
    parser.feed(s)
inImage = parser.close()
# convert to RGB to avoid error with png and tiffs
if inImage.mode != "RGB":
    inImage = inImage.convert("RGB")
# resize could occur here
# START OF CODE THAT DOES NOT SEEM TO WORK
# I need to somehow convert an image .....
img_temp = NamedTemporaryFile(delete=True)
img_temp.write(inImage.tostring())
img_temp.flush()
file_object = File(img_temp)
# .... into a file that the Django object will accept.
# END OF CODE THAT DOES NOT SEEM TO WORK
my_model_instance.image.save(
    'some_filename',
    file_object,  # this must be a File() object
    save=True,
)
With this approach, the file appears corrupt whenever I view it as an image. Does anyone have an approach that takes a file from a URL, allows one to manipulate it as an Image and then saves it to a Django ImageField?
Any help is much appreciated.
Programmatically saving image to Django ImageField
Django: add image in an ImageField from image url
Update 08/11/2010: I did end up going with StringIO. However, StringIO was throwing an unusual exception when I tried to save it in a Django ImageField. Specifically, the stack trace showed an error about the name attribute:
AttributeError exception: "StringIO instance has no attribute 'name'"
After digging through the Django source, it looks like this error was caused when the model save tries to access the size attribute of the StringIO "File". (Though the error above indicates a problem with the name, the root cause of this error appears to be the lack of a size property on the StringIO image). As soon as I assigned a value to the size attribute of the image file, it worked fine.
To kill two birds with one stone, why not use a (c)StringIO object instead of a NamedTemporaryFile? You won't have to store it on disk anymore, and I know for a fact that something like this works (I use similar code myself).
from cStringIO import StringIO
img_temp = StringIO()
inImage.save(img_temp, 'PNG')
img_temp.seek(0)
file_object = File(img_temp, filename)
