I have a list of image filenames with about 150 entries. Every image is downloaded via urllib and stored on the system. The result is a zipfile containing several broken images. The last part of some images is missing / corrupt.
The image download works perfectly. Every image in the list is completely downloaded and is a valid image. It looks like I have to wait until zf.write() is completely done before the next image is added. Is there a way to ensure this?
images = ['image-01.jpg', 'image-02.jpg', 'image-03.jpg']
zf = zipfile.ZipFile('file.zip', mode='w')
for image in images:
    download_url = 'http://foo/bar/' + image
    image_file = open(image, 'wb')
    image_file.write(urllib.urlopen(download_url).read())
    image_file.close
    zf.write(image)
zf.close()
Thanks to alecxe. The solution is to close the file correctly.
image_file.close()
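A sketch of the same loop with the fix applied, assuming Python 3 (urllib.request instead of Python 2's urllib.urlopen). The names fetch and download_and_zip are illustrative and not from the original post. Writing each image inside a with block closes the file before zf.write() reads it, so forgetting the parentheses on close cannot happen.

```python
import zipfile
from urllib.request import urlopen


def fetch(url):
    """Return the response body at url as bytes."""
    with urlopen(url) as response:
        return response.read()


def download_and_zip(images, base_url, zip_path, fetch=fetch):
    """Download each image to disk, then add it to the archive.

    The fetch parameter is a hypothetical hook so the download step
    can be swapped out, for example in a test without network access.
    """
    with zipfile.ZipFile(zip_path, mode='w') as zf:
        for image in images:
            # Leaving this with block flushes and closes the file,
            # so zf.write() below sees the complete image.
            with open(image, 'wb') as image_file:
                image_file.write(fetch(base_url + image))
            zf.write(image)
```

It could be called as download_and_zip(images, 'http://foo/bar/', 'file.zip') with the list from the question.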
Related
I am doing an ML project in Google Colab. I need to preprocess every image in the train set and replace the originals with the preprocessed versions. The train set images are already uploaded in "content/train/images/". I created an image_preprocessing function that takes an image as input and returns the preprocessed image. Now I need to save each result in place of the original.
This is my code :
import cv2
import glob
import os
path = "/content/train/images/*.jpg"
for file in glob.glob(path):
    img = cv2.imread(file)
    file_name = os.path.basename(file)
    img_preprocessed = image_preprocessing(img)
    with open(file, 'w') as f:
        f.write(img_preprocessed)
    print(file_name + " preprocessed and saved\n")
I am a newbie in python. Please help. Thanks in advance.
What you have in your example is not too far off, but you are trying to save an image using syntax meant for writing a text file (with open(file, 'w') as f).
As you are using openCV, you can directly save with cv2.imwrite(file, img_preprocessed). All put together:
import cv2
import glob
import os
path = "/content/train/images/*.jpg"
for file in glob.glob(path):
    img = cv2.imread(file)
    file_name = os.path.basename(file)
    img_preprocessed = image_preprocessing(img)
    # Save the img_preprocessed as a picture with a path matching 'file'
    cv2.imwrite(file, img_preprocessed)
    print(file_name + " preprocessed and saved")
NOTE: This example overwrites your original images, as requested in the question. That can hurt repeatability, so it may be better to save the results in a preprocessed_images folder and retain the source. Whether that is needed depends on your usage.
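A minimal sketch of the alternative mentioned in the note, writing to a separate folder so the originals survive. The function names output_path and preprocess_folder and the preprocess parameter are hypothetical. The asker's image_preprocessing function would be passed in as preprocess.

```python
import glob
import os


def output_path(src_path, out_dir):
    """Return the path inside out_dir with the same file name as src_path."""
    return os.path.join(out_dir, os.path.basename(src_path))


def preprocess_folder(src_dir, out_dir, preprocess):
    """Write a preprocessed copy of every .jpg in src_dir into out_dir."""
    # Imported here so output_path() stays usable without OpenCV installed.
    import cv2

    os.makedirs(out_dir, exist_ok=True)
    for src in sorted(glob.glob(os.path.join(src_dir, "*.jpg"))):
        img = cv2.imread(src)
        if img is None:
            # cv2.imread returns None when the file cannot be decoded.
            print(src + " could not be read, skipped")
            continue
        cv2.imwrite(output_path(src, out_dir), preprocess(img))
```

For the paths in the question this might be called as preprocess_folder("/content/train/images", "/content/train/preprocessed_images", image_preprocessing).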
import requests
from PIL import Image
url_shoes_for_choice = [
    "https://content.adidas.co.in/static/Product-CM7531/Unisex_OUTDOOR_SANDALS_CM7531_1.jpg",
    "https://cdn.shopify.com/s/files/1/0080/1374/2161/products/product-image-897958210_640x.jpg?v=1571713841",
    "https://cdn.chamaripashoes.com/media/catalog/product/cache/9/image/9df78eab33525d08d6e5fb8d27136e95/1/_/1_8_3.jpg",
    "https://ae01.alicdn.com/kf/HTB1EyKjaI_vK1Rjy0Foq6xIxVXah.jpg_q50.jpg",
    "https://www.converse.com/dw/image/v2/BCZC_PRD/on/demandware.static/-/Sites-cnv-master-catalog/default/dwb9eb8c43/images/a_107/167708C_A_107X1.jpg"
]
def img():
    for url in url_shoes_for_choice:
        image = requests.get(url, stream=True).raw
        out = Image.open(image)
        out.save('image/image.jpg', 'jpg')
if __name__=="__main__":
    img()
Error:
OSError: cannot identify image file <_io.BytesIO object at 0x7fa185c52d58>
The problem is that one of the images has an issue with the byte data returned by requests.get(url, stream=True).raw. I'm not sure, but I guess the data of the third image is invalid byte data. Instead of reading the raw stream, we can fetch the content and wrap it in BytesIO, which fixes the byte data.
I fixed one more thing in your original code: I added numbering to the images so each one is saved under a different name.
from io import BytesIO
def img():
    for count, url in enumerate(url_shoes_for_choice):
        image = requests.get(url, stream=True)
        with BytesIO(image.content) as f:
            with Image.open(f) as out:
                # out.show() # See the images
                out.save('image/image{}.jpg'.format(count))
(Though this works fine, I'm not sure what the main issue was. If anyone knows exactly what the issue is, please comment and explain.)
I opened the first link in my browser and saved the image. It's actually a webp file.
$ file Unisex_OUTDOOR_SANDALS_CM7531_1.webp
Unisex_OUTDOOR_SANDALS_CM7531_1.webp: RIFF (little-endian) data, Web/P image, VP8 encoding, 500x500, Scaling: [none]x[none], YUV color, decoders should clamp
You explicitly tell the image library that it should expect a jpg. When you remove that parameter and let it figure it out on its own using out.save('image/image.jpg') the first image successfully downloads for me.
The first two images work this way if you make sure to save each under a different name:
def img():
    i = 0
    for url in url_shoes_for_choice:
        i+=1
        image = requests.get(url, stream=True).raw
        out = Image.open(image)
        out.save('image{}.jpg'.format(i))
The third is a valid JPEG file, as is the fourth, but both use the JFIF 1.01 standard, which I had not heard of before. I'm pretty sure you'll have to figure out support for such different file types.
It is worth noting that if I download the images in Chrome and open those with Python, nothing fails. So Chrome might be adding information to the file.
The documentation of PIL/pillow explains here that you need a new enough version for animated images, but that is not your problem.
Support for animated WebP files will only be enabled if the system
WebP library is v0.5.0 or later. You can check webp animation support
at runtime by calling features.check("webp_anim").
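The mismatch described in this answer (a URL ending in .jpg that actually serves WebP data) can be checked without an image library by looking at the first bytes of the download. The sketch below covers only four common signatures, and the function name sniff_image_format is illustrative.

```python
def sniff_image_format(data):
    """Guess the image format from the leading bytes of a download.

    Returns "JPEG", "PNG", "WEBP", "GIF", or None when the signature
    is not one of the few checked here.
    """
    if data[:3] == b"\xff\xd8\xff":
        return "JPEG"
    if data[:8] == b"\x89PNG\r\n\x1a\n":
        return "PNG"
    # WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "WEBP"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "GIF"
    return None
```

With the code from the question, sniff_image_format(requests.get(url).content) would show what each URL really returns before a file extension is chosen.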
I have a simple script that can download almost any image and store it in a file. However, I have encountered a Chinese website that causes me to download the wrong image.
Here are a few sample images that I attempted to save:
Image1/ Image2/ Image3
import urllib.request
def Save_Image(url, file_path, file_name):
    full_path = file_path + file_name + '.jpg'
    urllib.request.urlretrieve(url, full_path)
url = 'http://photo.yupoo.com/evakicks/05269e07/7bd1fc86.png'
file_name = 'Image1'
#!) Manually create an Image1 Folder in the same directory as this script
Save_Image(url,
           'imageFolder/',
           file_name)
However, if you try running my script, with the following links, it will download a dummy image. I simply wish to know why this is so? :O
As the title mentioned, I can actually right click and save image, however cannot save it via software. Am I missing something here and are there ways around this?
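No answer is recorded here. One possible explanation, which is an assumption and not confirmed for this site, is hotlink protection: some image hosts return a placeholder unless the request carries headers a browser would send, such as Referer. The sketch below shows how such headers can be attached with urllib.request. The function names and the referer value are hypothetical, and the right Referer (if any is needed) would have to be found by inspecting the browser's request.

```python
import urllib.request


def build_request(url, referer, user_agent="Mozilla/5.0"):
    """Build a request that carries browser-like headers.

    Both header values are assumptions; adjust them to match what
    the browser actually sends for the page that embeds the image.
    """
    headers = {"Referer": referer, "User-Agent": user_agent}
    return urllib.request.Request(url, headers=headers)


def save_image(url, full_path, referer):
    """Download url with the extra headers and write it to full_path."""
    with urllib.request.urlopen(build_request(url, referer)) as response:
        data = response.read()
    with open(full_path, "wb") as f:
        f.write(data)
```

Note also that the script saves a .png URL under a .jpg name. That does not change the bytes, but it can confuse tools that trust the extension.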
Sorry for the title... The goal of this script is to take a folder full of images that are listed in a particular order and chunk them into groups of 3. It then takes each group of 3 images and blends them together using PIL. The code below does a great job of doing what I want: if I show imgbld2, it creates 4 images in a temporary folder.
My problem is that when I save the images using imgbld2.save(), it only saves the first created image into 4 image files, instead of saving the 4 created images into 4 separate files.
I can work around this by pointing another script at the temp folder and retrieving the images with glob.glob(). But that would require running the script on a freshly restarted computer, which seems too messy for my taste.
Is there a better way to achieve what I'm trying to do? Or is there a saving method that I'm missing?
Any help would be appreciated, here is the code:
from PIL import Image
import os.path
import glob
#Lists Directory
Dir = os.listdir('/path/to/Directory/of/Images')
#Glob all jpgs
im = glob.glob( '/path/to/Directory/of/Images/*.jpg')
#sort jpg according to name
imsort = sorted(im)
def chunker(imsort,size = 3):
    for i in range(0, len(imsort), size):
        yield imsort[i:i + size]
print('what does it look like?')
for j in chunker(imsort):
    print(j)
    img1 = Image.open(j[0])
    img2 = Image.open(j[1])
    img3 = Image.open(j[2])
    imgbld1 = Image.blend(img1, img2, 0.3)
    imgbld2 = Image.blend(imgbld1, img3, 0.3)
    imgbld2.show()
    imgbld2.save('path/to/new/folder/' + 'blended' , 'JPEG')
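One likely culprit in the code above is that every iteration saves to the same name, 'blended', so each result overwrites the previous one. A sketch of one way around that is to number the output files with enumerate. The names blended_name and blend_all are illustrative, and skipping a final group with fewer than 3 images is an assumption about the desired behavior.

```python
import os


def chunker(items, size=3):
    """Yield consecutive groups of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def blended_name(out_dir, index):
    """Return a distinct output path for the group at position index."""
    return os.path.join(out_dir, "blended_{:03d}.jpg".format(index))


def blend_all(paths, out_dir, alpha=0.3):
    """Blend each group of 3 images and save every result to its own file."""
    # Pillow is imported here so the two helpers above need no extra package.
    from PIL import Image

    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for index, chunk in enumerate(chunker(sorted(paths))):
        if len(chunk) < 3:
            break  # leftover images that do not fill a group of 3
        # Image.blend requires images of the same size and mode.
        img1, img2, img3 = [Image.open(p) for p in chunk]
        blended = Image.blend(Image.blend(img1, img2, alpha), img3, alpha)
        target = blended_name(out_dir, index)
        blended.save(target, "JPEG")
        saved.append(target)
    return saved
```

Because the results are written directly to numbered files, there is no need to collect them from the temporary folder that show() uses.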
I'm looking for a way to download a 640x640 image from a URL, resize the image to 180x180 and append the word small to the end of the resized image filename.
For example, the image is located at this link
http://0height.com/wp-content/uploads/2013/07/18-japanese-food-instagram-1.jpg
Once resized, I would like to append the word small to the end of the filename like so:
18-japanese-food-instagram-1small.jpeg
How can this be done? Also will the downloaded image be saved to memory or will it save to the actual drive? If it does save to the drive, is it possible to delete the original image and keep the resized version?
Why don't you try urllib?
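# Python 3; in Python 2 this call was urllib.urlretrieve
import urllib.request
urllib.request.urlretrieve("http://0height.com/wp-content/uploads/2013/07/18-japanese-food-instagram-1.jpg", "18-japanese-food-instagram-1.jpg")
Then, to resize this you can use PIL or another library
from PIL import Image
im1 = Image.open("18-japanese-food-instagram-1.jpg")
# 180x180 is the target size from the question.
# Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter.
im_small = im1.resize((180, 180), Image.LANCZOS)
im_small.save("18-japanese-food-instagram-1_small.jpg")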
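On the question of memory versus disk: urlretrieve writes the original to disk, so it would have to be deleted afterwards (for example with os.remove). A sketch of an alternative, assuming Python 3 and Pillow, reads the download into memory and only ever writes the resized file. The function names small_name and download_resized are illustrative.

```python
import os
from io import BytesIO
from urllib.parse import urlsplit
from urllib.request import urlopen


def small_name(url, suffix="small", ext=".jpeg"):
    """Derive the output name from the URL, e.g. photo.jpg -> photosmall.jpeg."""
    base = os.path.basename(urlsplit(url).path)
    stem, _ = os.path.splitext(base)
    return stem + suffix + ext


def download_resized(url, size=(180, 180)):
    """Fetch url into memory, resize it, and save only the small version."""
    # Pillow is imported here so small_name() needs no extra package.
    from PIL import Image

    with urlopen(url) as response:
        data = response.read()  # the original stays in memory, not on disk
    out_name = small_name(url)
    with Image.open(BytesIO(data)) as im:
        # JPEG cannot store an alpha channel, so convert to RGB first.
        im.convert("RGB").resize(size, Image.LANCZOS).save(out_name, "JPEG")
    return out_name
```

For the URL in the question, small_name returns 18-japanese-food-instagram-1small.jpeg, matching the name asked for.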
References:
http://www.daniweb.com/software-development/python/code/216637/resize-an-image-python
Downloading a picture via urllib and python