I'm trying to figure out how to upload a Pillow Image instance to a Firebase storage bucket. Is this possible?
Here's some code:
from PIL import Image
image = Image.open(file)
# how to upload to a firebase storage bucket?
I know there's a gcloud-python library, but does it support Image instances? Is converting the image to a string my only option?
The gcloud-python library is the correct library to use. It supports uploads from strings, file pointers, and local files on the file system (see the docs).
from PIL import Image
from google.cloud import storage
client = storage.Client()
bucket = client.get_bucket('bucket-id-here')
blob = bucket.blob('image.png')
# use Pillow to open and transform the file ('file' and 'outfile' are placeholder paths)
image = Image.open(file)
# perform transforms
image.save(outfile)
# upload the transformed file from an open file handle
with open(outfile, 'rb') as of:
    blob.upload_from_file(of)
# or... (no need to use Pillow if you're not transforming)
blob.upload_from_filename(filename=outfile)
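If you would rather skip the intermediate file entirely, here is a minimal in-memory sketch, assuming image and blob from the snippet above; the PNG format choice is just an example:
from io import BytesIO
# serialize the Pillow image into an in-memory buffer instead of a temp file
buffer = BytesIO()
image.save(buffer, format="PNG")
# upload the raw bytes; content_type is optional but helps the object be served correctly
blob.upload_from_string(buffer.getvalue(), content_type="image/png")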
This is how to upload the Pillow image directly to Firebase Storage:
import io

from PIL import Image
from firebase_admin import credentials, initialize_app, storage
# Init firebase with your credentials
cred = credentials.Certificate("YOUR DOWNLOADED CREDENTIALS FILE (JSON)")
initialize_app(cred, {'storageBucket': 'YOUR FIREBASE STORAGE PATH (without gs://)'})
bucket = storage.bucket()
blob = bucket.blob('image.jpg')
bs = io.BytesIO()
im = Image.open("test_image.jpg")
im.save(bs, "jpeg")
blob.upload_from_string(bs.getvalue(), content_type="image/jpeg")
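Going the other way (pulling the object back into a Pillow image without touching disk) works similarly; a small sketch, assuming the same blob as above:
# download the object back into memory and reopen it with Pillow
downloaded = blob.download_as_bytes()
im2 = Image.open(io.BytesIO(downloaded))
print(im2.size)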
I want to load "fonts" from Google Storage, I've try two ways, but none of them work. Any pointers? Appreciated for any advices provided.
First:
I followed the load_font_from_gcs(uri) instruction given in the answer here, but I received a NameError: name 'load_font_from_gcs' is not defined message. I installed the google-cloud-storage dependency and executed from google.cloud import storage.
Second:
I tried to execute the following code (reference #1) and ran into a blob has no attribute open() error, the same answer I got here, even though the referenced link suggests it should work.
reference #1
bucket = storage_client.bucket(bucket_name)
blob = bucket.get_blob(blob_name)
with blob.open("r") as img:
    imgblob = Image.open(img)
    draw = ImageDraw.Draw(imgblob)
According to the provided links, your code must use BytesIO in order to work with the font file loaded from GCS.
load_font_from_gcs is a custom function written by the author of the question you're referencing; it is not part of the google-cloud-storage package.
Next, according to the official Google Cloud Storage documentation here, files from storage can be accessed this way (this example loads the font file into PIL.ImageFont.truetype):
# Import PIL
from PIL import Image, ImageFont, ImageDraw
# Import the Google Cloud client library
from google.cloud import storage
# Import BytesIO module
from io import BytesIO
# Instantiate a client
storage_client = storage.Client()
# The name of the bucket
bucket_name = "my-new-bucket"
# Required blob
blob_name = "somefont.otf"
# Creates the bucket & blob instance
bucket = storage_client.bucket(bucket_name)
blob = bucket.blob(blob_name)
# Download the given blob
blob_content = blob.download_as_string()
# Make an ImageFont out of it (or whatever you want)
font = ImageFont.truetype(BytesIO(blob_content), 18)
So, your reference code can be changed accordingly:
bucket = storage_client.bucket(bucket_name)
blob_content = bucket.get_blob(blob_name).download_as_string()
image_bytes = BytesIO(blob_content)
imgblob = Image.open(image_bytes)
draw = ImageDraw.Draw(imgblob)
You can read more about PIL here.
Also, don't forget to check the official Google Cloud Storage documentation.
(There are plenty of examples using Python code.)
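For completeness, here is a hedged end-to-end sketch of the same idea: downloading both a font and an image from GCS, drawing text on the image, and saving the result. The bucket and object names are placeholders, and the drawing step is only an illustration:
from io import BytesIO
from PIL import Image, ImageDraw, ImageFont
from google.cloud import storage

storage_client = storage.Client()
bucket = storage_client.bucket("my-new-bucket")

# download the font and the image into memory (placeholder object names)
font_bytes = bucket.blob("somefont.otf").download_as_string()
image_bytes = bucket.blob("photo.png").download_as_string()

font = ImageFont.truetype(BytesIO(font_bytes), 18)
img = Image.open(BytesIO(image_bytes))

# draw some text and save the annotated copy locally
draw = ImageDraw.Draw(img)
draw.text((10, 10), "Hello from GCS", font=font)
img.save("annotated.png")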
I'm struggling to download a JPG file from Amazon S3 using Python. I want to deploy this code to Heroku, so I need the image to be loaded into memory rather than written to disk.
The code I'm using is:
import boto3
s3 = boto3.client(
"s3",
aws_access_key_id = access_key,
aws_secret_access_key = access_secret
)
s3.upload_fileobj(image_conv, bucket, Key = "image_3.jpg")
new_obj = s3.get_object(Bucket=bucket, Key="image_3.jpg")
image_dl = new_obj['Body'].read()
Image.open(image_dl)
I'm getting the error message:
File ..... line 2968, in open
fp = builtins.open(filename, "rb")
ValueError: embedded null byte
Calling image_dl returns a very long sequence of what I assume are bytes; one small section looks like the following:
f\xbc\xdc\x8f\xfe\xb5q\xda}\xed\xcb\xdcD\xab\xe6o\x1c;\xb7\xa0\xf5\xf5\xae\xa6)\xbe\xee\xe6\xc3vn\xdfLVW:\x96\xa8\xa3}\xa4\xd8\xea\x8f*\x89\xd7\xcc\xe8\xf0\xca\xb9\x0b\xf4\x1f\xe7\x15\x93\x0f\x83ty$h\xa6\x83\xc8\x99z<K\xc3c\xd4w\xae\xa4\xc2\xfb\xcb\xee\xe0
Before I uploaded it to S3, the image returned the representation below, and that's the format I'm trying to get back. Can anyone help me figure out where I'm going wrong?
<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1440x1440 at 0x7F2BB4005EB0>
Pillow's Image class needs either a filename to open, or a file-like object that it can call read on. Since you don't have a filename, you'll need to provide a stream. It's easiest to use BytesIO to turn the byte array into a stream:
import boto3
from PIL import Image
from io import BytesIO
bucket = "--example-bucket--"
s3 = boto3.client("s3")
with open("image.jpg", "rb") as image_conv:
s3.upload_fileobj(image_conv, bucket, Key="image_3.jpg")
new_obj = s3.get_object(Bucket=bucket, Key="image_3.jpg")
image_dl = new_obj['Body'].read()
image = Image.open(BytesIO(image_dl))
print(image.width, image.height)
First try loading the raw data into a BytesIO container (it must be BytesIO, not StringIO, because the object contains binary data):
from io import BytesIO
from PIL import Image
file_stream = BytesIO()
s3.download_fileobj(bucket, "image_3.jpg", file_stream)
file_stream.seek(0)  # rewind the buffer before handing it to Pillow
img = Image.open(file_stream)
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.download_fileobj
I can successfully access the google cloud bucket from my python code running on my PC using the following code.
client = storage.Client()
bucket = client.get_bucket('bucket-name')
blob = bucket.get_blob('images/test.png')
Now I don't know how to retrieve and display the image from the blob without writing it to a file on the hard drive.
You could, for example, generate a temporary signed URL:
from google.cloud import storage
client = storage.Client() # Implicit environ set-up
bucket = client.bucket('my-bucket')
blob = bucket.blob('my-blob')
url_lifetime = 3600 # Seconds in an hour
serving_url = blob.generate_signed_url(url_lifetime)
Otherwise, you can make the image public in your bucket and use the permanent link, which you can find in the object details:
https://storage.googleapis.com/BUCKET_NAME/OBJECT_NAME
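A small sketch of doing that from the client library (this assumes the bucket uses fine-grained ACLs rather than uniform bucket-level access):
# make the object publicly readable, then use its permanent URL
blob.make_public()
print(blob.public_url)  # https://storage.googleapis.com/BUCKET_NAME/OBJECT_NAME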
Download the image from GCS as bytes, wrap it in BytesIO object to make the bytes file-like, then read in as a PIL Image object.
from io import BytesIO
from PIL import Image
img = Image.open(BytesIO(blob.download_as_bytes()))
Then you can do whatever you want with img -- for example, to display it, use plt.imshow(img).
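For instance, a minimal display sketch with matplotlib (assuming matplotlib is installed and img is the Pillow image from above):
import matplotlib.pyplot as plt
# render the PIL image inline
plt.imshow(img)
plt.axis("off")
plt.show()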
In Jupyter notebooks you can display the image directly with download_as_bytes:
from google.cloud import storage
from IPython.display import Image
client = storage.Client() # Implicit environment set up
# with explicit set up:
# client = storage.Client.from_service_account_json('key-file-location')
bucket = client.get_bucket('bucket-name')
blob = bucket.get_blob('images/test.png')
Image(blob.download_as_bytes())
I want to load a model which is saved as a joblib file from a Google Cloud Storage bucket. When it is on a local path, we can load it as follows (where model_file is the full path on the filesystem):
loaded_model = joblib.load(model_file)
How can we do the same task with Google Cloud Storage?
For anyone googling around for an answer to this:
Here are two more options besides the obvious one, which is to use Google AI Platform for model hosting (and online predictions).
Option 1 is to use TemporaryFile like this:
from google.cloud import storage
import joblib  # sklearn.externals.joblib was removed in newer scikit-learn versions
from tempfile import TemporaryFile
storage_client = storage.Client()
bucket_name=<bucket name>
model_bucket='model.joblib'
bucket = storage_client.get_bucket(bucket_name)
#select bucket file
blob = bucket.blob(model_bucket)
with TemporaryFile() as temp_file:
    #download blob into temp file
    blob.download_to_file(temp_file)
    temp_file.seek(0)
    #load into joblib
    model=joblib.load(temp_file)
#use the model
model.predict(...)
Option 2 is to use BytesIO like this:
from google.cloud import storage
import joblib  # sklearn.externals.joblib was removed in newer scikit-learn versions
from io import BytesIO
storage_client = storage.Client()
bucket_name=<bucket name>
model_bucket='model.joblib'
bucket = storage_client.get_bucket(bucket_name)
#select bucket file
blob = bucket.blob(model_bucket)
#download blob into an in-memory file object
model_file = BytesIO()
blob.download_to_file(model_file)
model_file.seek(0)
#load into joblib
model=joblib.load(model_file)
An alternative answer as of 2020: using TF2, you can do this:
import joblib
import tensorflow as tf
gcs_path = 'gs://yourpathtofile'
loaded_model = joblib.load(tf.io.gfile.GFile(gcs_path, 'rb'))
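The same GFile approach should also work in the other direction for saving a model back to GCS; this is a hedged sketch (the gs:// path is a placeholder):
import joblib
import tensorflow as tf

gcs_path = 'gs://yourpathtofile'
# write the model straight back to GCS through a GFile handle
with tf.io.gfile.GFile(gcs_path, 'wb') as f:
    joblib.dump(loaded_model, f)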
I found using gcsfs to be the fastest (and most compact) method:
import gcsfs
import joblib

def load_joblib(bucket_name, file_name):
    fs = gcsfs.GCSFileSystem()
    with fs.open(f'{bucket_name}/{file_name}') as f:
        return joblib.load(f)
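Usage might then look like this; the bucket and file names are placeholders:
# load a model stored at gs://my-model-bucket/model.joblib (hypothetical names)
model = load_joblib("my-model-bucket", "model.joblib")
print(type(model))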
I don't think that's possible, at least not in a direct way. I thought about a workaround, but it might not be as efficient as you want.
By using the Google Cloud Storage client libraries [1] you can download the model file first, load it, and when your program ends, delete it. Of course, this means that you need to download the file every time you run the code. Here is a snippet:
from google.cloud import storage
import joblib  # sklearn.externals.joblib was removed in newer scikit-learn versions
storage_client = storage.Client()
bucket_name=<bucket name>
model_bucket='model.joblib'
model_local='local.joblib'
bucket = storage_client.get_bucket(bucket_name)
#select bucket file
blob = bucket.blob(model_bucket)
#download that file and name it 'local.joblib'
blob.download_to_filename(model_local)
#load that file from local file
job=joblib.load(model_local)
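As noted above, you may want to remove the local copy once the model is loaded; a small sketch:
import os
# optional clean-up: delete the downloaded copy when the program is done with it
os.remove(model_local)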
For folks who are Googling around with this problem - here's another option. The open source modelstore library is a wrapper that deals with the process of saving, uploading, and downloading models from Google Cloud Storage.
Under the hood, it saves scikit-learn models using joblib, creates a tar archive with the files, and up/downloads them from a Google Cloud Storage bucket using blob.upload_from_file() and blob.download_to_filename().
In practice it looks a bit like this (a full example is here):
# Create modelstore instance
import os
from modelstore import ModelStore
modelstore = ModelStore.from_gcloud(
    os.environ["GCP_PROJECT_ID"],  # Your GCP project ID
    os.environ["GCP_BUCKET_NAME"],  # Your Cloud Storage bucket name
)
# Train and upload a model (this currently works with 9 different ML frameworks)
model = train() # Replace with your code to train a model
meta_data = modelstore.sklearn.upload("my-model-domain", model=model)
# ... and later when you want to download it
model_path = modelstore.download(
local_path="/path/to/a/directory",
domain="my-model-domain",
model_id=meta_data["model"]["model_id"],
)
The full documentation is here.
This is the shortest way I found so far:
import joblib
from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket("my-gcs-bucket")
blob = bucket.blob("model.joblib")
with blob.open(mode="rb") as file:
    model = joblib.load(file)
I want to play a sound file in a Datalab notebook that I read from a Google Cloud Storage bucket. How can I do this?
import numpy as np
import IPython.display as ipd
import librosa
import soundfile as sf
import io
from google.cloud import storage
BUCKET = 'some-bucket'
# Create a Cloud Storage client.
gcs = storage.Client()
# Get the bucket that the file will be uploaded to.
bucket = gcs.get_bucket(BUCKET)
# specify a filename
file_name = 'some_dir/some_audio.wav'
# read a blob
blob = bucket.blob(file_name)
file_as_string = blob.download_as_string()
# convert the string to bytes and then finally to audio samples as floats
# and the audio sample rate
data, sample_rate = sf.read(io.BytesIO(file_as_string))
left_channel = data[:,0] # I assume the left channel is column zero
# enable play button in datalab notebook
ipd.Audio(left_channel, rate=sample_rate)
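Note that sf.read returns a 1-D array for mono files, so the column indexing above only applies to multi-channel audio. A small guard for that case (an assumption about your files, not part of the original snippet):
# handle mono files, where sf.read returns a 1-D array instead of (samples, channels)
if data.ndim == 1:
    left_channel = data
else:
    left_channel = data[:, 0]
ipd.Audio(left_channel, rate=sample_rate)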