Get Cloud Storage upload response - python

I am uploading a file to a Cloud Storage bucket using the Python SDK:
from google.cloud import storage
bucket = storage.Client().get_bucket('mybucket')
df = ...  # pandas DataFrame to save
csv = df.to_csv(index=False)
output = 'test.csv'
blob = bucket.blob(output)
blob.upload_from_string(csv)
How can I get the response to know if the file was uploaded successfully? I need to log the response to notify the user about the operation.
I tried with:
response = blob.upload_from_string(csv)
but it always returns None, even when the operation has succeeded.

You can try the tqdm library to show upload progress:
import os
from google.cloud import storage
from tqdm import tqdm
def upload_function(client, bucket_name, source, dest, content_type=None):
    bucket = client.bucket(bucket_name)
    blob = bucket.blob(dest)
    with open(source, "rb") as in_file:
        total_bytes = os.fstat(in_file.fileno()).st_size
        # Wrap the file's read() so tqdm updates the progress bar as bytes are sent.
        with tqdm.wrapattr(in_file, "read", total=total_bytes, miniters=1, desc="upload to %s" % bucket_name) as file_obj:
            blob.upload_from_file(file_obj, content_type=content_type, size=total_bytes)
    return blob
if __name__ == "__main__":
    # Backslashes are doubled so they are not treated as escape characters.
    upload_function(storage.Client(), "bucket", "C:\\files\\", "Cloud:\\blob.txt", "text/plain")

There are also several ways to get notifications about changes made to your buckets:
Pub/Sub notifications: This is the recommended approach. Cloud Storage sends information about changes to objects in your buckets to a Pub/Sub topic of your choice in the form of messages. The documentation includes a Python example, as in your case, along with gsutil, other supported languages, and the REST APIs.
Object change notification with watchbucket: This creates a notification channel, using a gsutil command, that sends notification events for the given bucket to the given application URL.
Cloud Functions with Cloud Storage triggers: Event-driven functions handle events from Cloud Storage. You can configure these notifications to trigger in response to various events inside a bucket: object creation, deletion, archiving, and metadata updates. The documentation explains how to implement this.
Eventarc: This lets you build event-driven architectures. It offers a standardized solution to manage the flow of state changes, called events, between decoupled microservices. Eventarc routes these events to Cloud Run while managing delivery, security, authorization, observability, and error handling for you. There is a guide on how to implement it.
Related posts with the same issue and answers:
Using a Storage-triggered Cloud Function.
With Object Change Notification and Cloud Pub/Sub Notifications for Cloud Storage.
An answer with a Cloud Pub/Sub topic example.

You can catch the error raised by a failed upload and read the status from the exception:
from google.api_core.exceptions import GoogleAPICallError
def upload(blob, content):
    try:
        blob.upload_from_string(content)
    except GoogleAPICallError as e:
        # e.code is the HTTP status code; e.message is the error description.
        status_code = e.code
        status_desc = e.message
    else:
        status_code = 200
        status_desc = 'success'
    # Return after the try block; a return inside finally would swallow other exceptions.
    return status_code, status_desc
Refs:
https://googleapis.dev/python/google-api-core/latest/_modules/google/api_core/exceptions.html
https://docs.python.org/3/tutorial/errors.html
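Since the question is about logging the outcome for the user, here is a minimal sketch of how the try/except idea could be combined with logging. The names `upload_with_status` and `FakeBlob` are hypothetical and not part of the SDK; the helper only assumes that the blob object exposes `upload_from_string()` and raises on failure, so it can be exercised offline with a stand-in object.

```python
import logging

logger = logging.getLogger(__name__)


def upload_with_status(blob, content):
    """Upload content and return (ok, detail) instead of None.

    Assumes `blob` exposes upload_from_string() and raises when the
    upload fails, as google.cloud.storage.Blob does.
    """
    try:
        blob.upload_from_string(content)
    except Exception as exc:  # narrow this to the library's error types in real code
        # google.api_core errors carry an HTTP status in `code`;
        # other exceptions may not, so fall back to None.
        code = getattr(exc, "code", None)
        logger.error("upload failed (%s): %s", code, exc)
        return False, f"{code}: {exc}"
    logger.info("upload succeeded")
    return True, "success"


class FakeBlob:
    """Hypothetical stand-in used only to exercise the helper offline."""

    def __init__(self, fail=False):
        self.fail = fail

    def upload_from_string(self, content):
        if self.fail:
            raise RuntimeError("simulated failure")


print(upload_with_status(FakeBlob(), "a,b\n1,2\n"))
print(upload_with_status(FakeBlob(fail=True), "a,b\n1,2\n"))
```

With a real bucket you would pass `bucket.blob(output)` instead of `FakeBlob()`. No exception from `upload_from_string` is the success signal, which is why the call itself returns None.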

Related

How to run Python code with an impersonated service account

I searched Google before coming here but could not find an answer I can understand, so I would appreciate help with my query.
If I want to access a GCP resource using an impersonated service account, I know I can do it with commands. For example, to list the buckets in a project:
gsutil -i service-account-id ls -p project-id
But how can I run Python code (for example, test1.py) that accesses the resources using an impersonated service account?
Is there a package or class that I need to use? If so, how do I use it? Please find the scenario and code below:
I have a Pub/Sub topic hosted in project-A, where the owner is xyz@gmail.com, and I have Python code hosted in project-B, where the owner is abc@gmail.com.
In project-A I have created a service account with the Pub/Sub admin role, and I have allowed abc@gmail.com to impersonate it. Now how can I access the Pub/Sub topic from my Python code in project-B without using keys?
"""Publishes multiple messages to a Pub/Sub topic with an error handler."""
import os
from collections.abc import Callable
from concurrent import futures
from google.cloud import pubsub_v1
# os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "C:\gcp_poc\key\my-GCP-project.JSON"
project_id = "my-project-id"
topic_id = "myTopic1"
publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path(project_id, topic_id)
publish_futures = []
def get_callback(publish_future: pubsub_v1.publisher.futures.Future, data: str) -> Callable[[pubsub_v1.publisher.futures.Future], None]:
    def callback(publish_future: pubsub_v1.publisher.futures.Future) -> None:
        try:
            # Wait 60 seconds for the publish call to succeed.
            print(f"Printing the publish future result here: {publish_future.result(timeout=60)}")
        except futures.TimeoutError:
            print(f"Publishing {data} timed out.")
    return callback
for i in range(4):
    data = str(i)
    # When you publish a message, the client returns a future.
    publish_future = publisher.publish(topic_path, data.encode("utf-8"))
    # Non-blocking. Publish failures are handled in the callback function.
    publish_future.add_done_callback(get_callback(publish_future, data))
    publish_futures.append(publish_future)
# Wait for all the publish futures to resolve before exiting.
futures.wait(publish_futures, return_when=futures.ALL_COMPLETED)
print(f"Published messages with error handler to {topic_path}.")
Create a Service Account with the appropriate role. Then create a Service Account key file and download it. Then put the path of the key file in the "GOOGLE_APPLICATION_CREDENTIALS" environment variable. The Client Library will pick that key file and use it for further authentication/authorization. Please read this official doc to know more about how Application Default Credentials work.
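Because the question asks for a way that avoids key files, a keyless alternative may also be worth considering: the google-auth library provides `google.auth.impersonated_credentials.Credentials`, which wraps the caller's own credentials and requests short-lived tokens for a target service account. The sketch below is an illustration under assumptions, not a verified setup for this project: the function name `build_impersonated_publisher` is hypothetical, the caller is assumed to have been granted the Service Account Token Creator role on the target service account, and the imports are deferred into the function only so that the snippet loads without the Google libraries installed.

```python
def build_impersonated_publisher(target_principal):
    """Return a Pub/Sub publisher that acts as `target_principal`.

    `target_principal` is the email address of the service account to
    impersonate (a placeholder here, supplied by the caller).
    """
    # Deferred imports: requires google-auth and google-cloud-pubsub.
    import google.auth
    from google.auth import impersonated_credentials
    from google.cloud import pubsub_v1

    # The caller's own identity, resolved through Application Default
    # Credentials (for example after `gcloud auth application-default login`).
    source_credentials, _ = google.auth.default()

    # Short-lived credentials for the target service account; no key file involved.
    target_credentials = impersonated_credentials.Credentials(
        source_credentials=source_credentials,
        target_principal=target_principal,
        target_scopes=["https://www.googleapis.com/auth/cloud-platform"],
        lifetime=3600,  # seconds
    )
    return pubsub_v1.PublisherClient(credentials=target_credentials)
```

In the publishing code from the question, `publisher = pubsub_v1.PublisherClient()` would then become `publisher = build_impersonated_publisher(...)` with the service account's email address, and `project_id` would be the project that hosts the topic.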

How best to handle async response from google long running operation with cloud functions

I'm using Google Cloud Functions (Python) to initiate an asset inventory export from GCP by calling the exportAssets() method here. The method returns an Operation object, defined here, which can be used to poll the operation until it is complete. Of course, since this is a Cloud Function I'm limited to 540 seconds, so I cannot do that forever. The Google API Python client offers the add_done_callback() method, where one can await an async response, but as far as I can tell it requires me to keep a thread alive within the Cloud Function. Is there a way to tell the Asset Inventory API executing the operation to send the async response (success or failure) to a Pub/Sub topic where I can properly handle the response? I'm trying to avoid spinning up an App Engine instance with basic_scaling to support 24-hour timeouts.
from google.cloud import asset_v1
# .....
# Setup request to asset inventory API
parent = "organizations/{}".format(GCP_ORGANIZATION)
requested_type = 'RESOURCE'
dataset = 'projects/{}/datasets/gcp_assets_{}'.format(GCP_PROJECT, requested_type)
partition_spec = asset_v1.PartitionSpec()  # instantiate; do not set attributes on the class itself
partition_key = asset_v1.PartitionSpec.PartitionKey.REQUEST_TIME
partition_spec.partition_key = partition_key
output_config = asset_v1.OutputConfig()
output_config.bigquery_destination.dataset = dataset
output_config.bigquery_destination.table = 'assets'
output_config.bigquery_destination.separate_tables_per_asset_type = True
output_config.bigquery_destination.partition_spec.partition_key = partition_key
# Make API request to asset inventory API
print("Creating job to load 'asset types: {}' to {}".format(
    requested_type,
    dataset
))
response = ASSET_CLIENT.export_assets(
    request={
        "parent": parent,
        "content_type": content_type,
        "output_config": output_config,
    }
)
print(response.result())  # This waits for the job to complete
Cloud Asset Inventory export doesn't offer a Pub/Sub notification at the end of the export. However, at my previous company, it took about 5 minutes to export 100k+ assets, which is not so bad. And if you have more assets, I'm sure you can contact Google Cloud (through your Customer Engineer) to ask for this notification to be added to the roadmap.
Anyway, if you want to build a workaround, you can use Workflows:
1. Use a Cloud Function to trigger your workflow.
2. In your workflow:
   - Call the Cloud Asset API to export data to BigQuery.
   - Get the response and loop: test the export job status and, if it is not done, sleep X seconds and test again.
   - When the job is over, call the Pub/Sub API (or a Cloud Function directly) to submit the job status and process it.
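The polling step of that workaround can be sketched in Python to make the control flow concrete. In Workflows itself this would be written in the workflow definition rather than in Python, so treat this purely as an illustration of the loop. The function `wait_for_operation` and the `get_status` callable are hypothetical; the status dictionary is assumed to look like a long-running Operation resource, with a `done` flag and an `error` field on failure.

```python
import time


def wait_for_operation(get_status, interval_seconds=10.0, max_attempts=30, sleep=time.sleep):
    """Poll get_status() until it reports completion or attempts run out.

    get_status is a hypothetical callable returning a dict shaped like a
    long-running Operation resource: {"done": bool, "error": {...}}.
    """
    for attempt in range(1, max_attempts + 1):
        status = get_status()
        if status.get("done"):
            # A finished operation carries "error" on failure.
            outcome = "failure" if "error" in status else "success"
            return {"outcome": outcome, "attempts": attempt}
        sleep(interval_seconds)
    return {"outcome": "timeout", "attempts": max_attempts}


# Offline demonstration with canned responses instead of real API calls.
responses = iter([{"done": False}, {"done": False}, {"done": True}])
result = wait_for_operation(lambda: next(responses), sleep=lambda seconds: None)
print(result)  # {'outcome': 'success', 'attempts': 3}
```

The returned dictionary is what the final step would publish to Pub/Sub or pass to the follow-up Cloud Function. Injecting `sleep` keeps the loop testable without real waiting.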

How to access authentication by Strava API using Python?

I am starting a small python script (not an application) that can upload my *.fit activity files on Strava whenever they are created in a desired folder.
The main steps I plan to do are:
1. monitor *.fit file system modifications
2. access authentication to Strava to enable my program to upload files
(This tool will be personal use only, thus I expect no need to authenticate every time uploading)
3. upload the file to my Strava account
4. automatically doing this fixed routine with the help of Windows Task Scheduler
(For example, there will be 4-5 new riding activities generated in my computer folder, I expect this tool can automatically upload all of them once a week so that I do not need to manually complete the task.)
For step 2, I really have no idea how to implement it, even after reading through the Strava Authentication Documentation and several pieces of source code other people have developed (e.g. toravir's "rk2s (RunKeeper 2 Strava)" project on GitHub). I gathered that Python modules such as stravalib, swagger_client, requests, json, etc., as well as concepts like OAuth2, may be related to step 2, but I still cannot put everything together...
Can anyone experienced give me some advice on implementing step 2? Any related reading would also be perfect!
Advice for other parts of this project will also be very welcomed and appreciated.
I thank you very much in advance:)
This is a code example on how you can access the Strava API, check out this gist or use the code below:
import time
import pickle
from fastapi import FastAPI
from fastapi.responses import RedirectResponse
from stravalib.client import Client
CLIENT_ID = 'GET FROM STRAVA API SITE'
CLIENT_SECRET = 'GET FROM STRAVA API SITE'
REDIRECT_URL = 'http://localhost:8000/authorized'
app = FastAPI()
client = Client()
def save_object(obj, filename):
    with open(filename, 'wb') as output:  # Overwrites any existing file.
        pickle.dump(obj, output, pickle.HIGHEST_PROTOCOL)
def load_object(filename):
    with open(filename, 'rb') as in_file:
        loaded_object = pickle.load(in_file)
        return loaded_object
def check_token():
    # Refresh the access token once it has expired.
    if time.time() > client.token_expires_at:
        refresh_response = client.refresh_access_token(client_id=CLIENT_ID, client_secret=CLIENT_SECRET, refresh_token=client.refresh_token)
        access_token = refresh_response['access_token']
        refresh_token = refresh_response['refresh_token']
        expires_at = refresh_response['expires_at']
        client.access_token = access_token
        client.refresh_token = refresh_token
        client.token_expires_at = expires_at
@app.get("/")
def read_root():
    # Send the user to Strava to authorize this application.
    authorize_url = client.authorization_url(client_id=CLIENT_ID, redirect_uri=REDIRECT_URL)
    return RedirectResponse(authorize_url)
@app.get("/authorized/")
def get_code(state=None, code=None, scope=None):
    # Strava redirects back here with a code that is exchanged for tokens.
    token_response = client.exchange_code_for_token(client_id=CLIENT_ID, client_secret=CLIENT_SECRET, code=code)
    access_token = token_response['access_token']
    refresh_token = token_response['refresh_token']
    expires_at = token_response['expires_at']
    client.access_token = access_token
    client.refresh_token = refresh_token
    client.token_expires_at = expires_at
    save_object(client, 'client.pkl')
    return {"state": state, "code": code, "scope": scope}
try:
    client = load_object('client.pkl')
    check_token()
    athlete = client.get_athlete()
    print("For {id}, I now have an access token {token}".format(id=athlete.id, token=client.access_token))
    # To upload an activity
    # client.upload_activity(activity_file, data_type, name=None, description=None, activity_type=None, private=None, external_id=None)
except FileNotFoundError:
    print("No access token stored yet, visit http://localhost:8000/ to get it")
    print("After visiting that url, a pickle file is stored, run this file again to upload your activity")
Download that file, install the requirements, and run it (assuming the filename is main):
pip install stravalib
pip install fastapi
pip install uvicorn
uvicorn main:app --reload
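For an unattended weekly run, the expiry test inside check_token is the part that decides whether a refresh is needed. A minimal sketch of that decision as a standalone function follows; the name `token_needs_refresh` and the 60-second safety margin are assumptions added for illustration and are not part of stravalib.

```python
import time


def token_needs_refresh(expires_at, now=None, margin_seconds=60):
    """Return True when the token is expired or about to expire.

    expires_at is a Unix timestamp, like the 'expires_at' value in the
    token response. The margin (an assumption) avoids starting an upload
    with a token that expires mid-request.
    """
    if now is None:
        now = time.time()
    return now >= expires_at - margin_seconds


# Fixed timestamps make the behaviour easy to follow.
print(token_needs_refresh(expires_at=1000, now=900))  # False
print(token_needs_refresh(expires_at=1000, now=950))  # True
```

In the script above, `check_token()` could call this with `client.token_expires_at` before deciding to refresh.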
I believe you need to authenticate using OAuth in order to upload your activity, which pretty much requires you to have a web server setup that Strava can post back to after you "Authorize". I just set the authentication piece up using Rails & Heroku.
This link has a pretty good flowchart of what needs to happen.
https://developers.strava.com/docs/authentication/
Actually it looks like if you go to API Settings you can get your access token and refresh token there. I would also check out the Python Strava Library but it looks like you could do something like:
from stravalib.client import Client
access_token = 'your_access_token_from_your_api_application_settings_page'
refresh_token = 'your_refresh_token_from_your_api_application_settings_page'
client = Client(access_token=access_token)  # pass the token, otherwise requests are unauthenticated
athlete = client.get_athlete()
You may need to dig in a little more to that library to figure out the upload piece.

Google Cloud Storage create_upload_url -- App Engine Flexible Python

On a regular (non-flexible) instance of Google App Engine, you can use the Blobstore API and create a URL to allow a user to upload a file directly into your Blobstore. When it is uploaded, your app engine application is notified of the location of the file and can process it. An example of the python code is:
from google.appengine.ext import blobstore
upload_url = blobstore.create_upload_url('/upload_photo')
See the Blobstore docs.
Switching to Google App Engine Flexible Environment, usage of the Blobstore has been largely replaced by Cloud Storage. In such a case, is there an equivalent of create_upload_url?
My current implementation takes a standard file upload to a python Flask application. Then proceeds with something like:
from flask import request
from google.cloud import storage
uploaded_file = request.files.get('file')
gcs = storage.Client()
bucket = gcs.get_bucket(bucket_name)
blob = bucket.blob(blob_name)
blob.upload_from_string(
    uploaded_file.read(),
    content_type=uploaded_file.content_type
)
This seems like it is doubling the network load compared with create_upload_url because the file is coming into my app engine instance and then immediately being copied out. So the uploader will be made to wait extra time whilst this is happening. Presumably I will also incur extra App Engine charges for this. Is there a better way?
I have workers that later process the uploaded file, but I tend to download the file from Cloud Storage again in their code because I don't think you can assume that the worker will still have access to a file stored in the instance file system. Therefore I don't get any benefit from having the file uploaded to my instance rather than directly to its storage location.
I have started using create_resumable_upload_session to create a resumable upload session URL that our client-side application can upload a file to. Something like:
gcs = storage.Client()
bucket = gcs.get_bucket(BUCKET)
blob = bucket.blob(blob_name)
upload_url = blob.create_resumable_upload_session(content_type=content_type)
Then, once the client has successfully uploaded a file to our storage, I receive a Pub/Sub notification of the object's creation through a subscription, using Cloud Pub/Sub Notifications for Cloud Storage.
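On the receiving side, each notification message carries attributes that identify the event and the object. The following is a rough sketch of how a subscriber might filter for newly created objects; it assumes the `eventType`, `bucketId`, and `objectId` message attributes that Cloud Storage attaches to its Pub/Sub notifications. The function name `describe_notification` and the sample values are hypothetical.

```python
def describe_notification(attributes):
    """Return a gs:// path for a newly finalized object, else None.

    `attributes` mirrors the message attributes of a Cloud Storage
    Pub/Sub notification (eventType, bucketId, objectId).
    """
    # OBJECT_FINALIZE is sent when an object has been fully written.
    if attributes.get("eventType") != "OBJECT_FINALIZE":
        return None
    return "gs://{}/{}".format(attributes["bucketId"], attributes["objectId"])


# Hypothetical sample attributes, standing in for a received message.
sample = {
    "eventType": "OBJECT_FINALIZE",
    "bucketId": "my-bucket",
    "objectId": "uploads/photo.jpg",
}
print(describe_notification(sample))  # gs://my-bucket/uploads/photo.jpg
```

In a real subscriber callback the dictionary would come from the received message's attributes, and the returned path is what the workers would download and process.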
Each blob created with the new Google Cloud Storage Client has a public_url property:
from flask import request
from google.cloud import storage
uploaded_file = request.files.get('file')
gcs = storage.Client()
bucket = gcs.get_bucket(bucket_name)
blob = bucket.blob('blob_name')
blob.upload_from_string(
    uploaded_file.read(),
    content_type=uploaded_file.content_type
)
url = blob.public_url
--
With the Blobstore, a GAE system handler in your instance takes care of the uploaded file you pass to the upload url created. I'm not sure if it's an issue handling it yourself in your code. If your current approach is problematic, you might want to consider doing the upload client side and not pass the file through App Engine at all. GCS has a REST API and the cloud storage client uses it underneath, so you can read and upload the file directly to GCS on the client side if it's more convenient. There's firebase.google.com/docs/storage/web/upload-files to ease you through the process

How to Define Google Endpoints API File Download Message Endpoint

All the examples I can find on google endpoint api (e.g., tic-tac-toe sample) show strings, integers, enums, etc fields. None of the examples say anything about how to specify document (e.g., image or zip files) uploads or downloads using the API. Is this not possible?
If this is possible, can anyone share a code snippet on how to define google endpoint api on the server to allow downloads and uploads of files? For example, is there a way to set HTTPResponse headers to specify that an endpoint response will serve a zip file? How do we include the zip file in the response?
An example with python or php would be appreciated. If anyone from the endpoints-proto-datastore team is watching this discussion, please say whether or not file downloads are supported in endpoints at the moment. We hate to waste our time trying to figure this out if it is simply impossible. Thanks.
We are seeking a complete example for upload and download. We need to store the key for the uploaded file in our database during upload and retrieve it for download. The client app sends a token that the API needs to use to figure out what file to download. Hence, we would need to store the blob key generated during the upload process in our database. Our database would have the mapping between the token and the blob file's key.
class BlobDataFile(models.Model):
data_code = models.CharField(max_length=10) # Key used by client app to request file
blob_key = models.CharField()
By the way, our app is written in Django 1.7 with a mysql (modeled with models.Model) database. It is infuriating that all the examples for Google App Engine upload I can find is written for a standalone webapp Handlers (no urls.py/views.py solutions could be found anywhere). Hence, building a standalone uploader is as much of a challenge as writing the API code. If your solution has full urls.py/views.py example for uploading files and saving the blob_key in our BlobDataFile, it would be good enough for us.
If you use the Blobstore, use the get_serving_url function so the client can read images from a URL, or use messages.BytesField in the response message and decode the base64-encoded image on the client with base64.b64decode:
import endpoints
from google.appengine.ext import blobstore
from protorpc import message_types, messages, remote
# The returned class
class Img(messages.Message):
    message = messages.BytesField(1)
# The API class
@endpoints.api(name='helloImg', version='v1')
class HelloImgApi(remote.Service):
    ID_RESOURCE = endpoints.ResourceContainer(
        message_types.VoidMessage,
        id=messages.StringField(1, variant=messages.Variant.STRING))
    @endpoints.method(ID_RESOURCE, Img,
                      path='serveimage/{id}', http_method='GET',  # id is the Blobstore key
                      name='greetings.getImage')
    def image_get(self, request):
        try:
            # Read the blob identified by the id from the request path.
            blob_reader = blobstore.BlobReader(request.id)
            value = blob_reader.read()
            return Img(message=value)
        except Exception:
            raise endpoints.NotFoundException('image %s not found.' %
                                              (request.id,))
APPLICATION = endpoints.api_server([HelloImgApi])
And this is the response (save it in the client with the proper format)
{
"message": ""
}
In the client you can do this (in Python for continuity):
import base64
# value is the base64 string returned in the "message" field
img = base64.b64decode(value)
with open("mock.jpg", "wb") as my_file:
    my_file.write(img)
Did you try converting the image to a base64 string and sending it as an argument of your request on the client side?
Then you can do this on the server side:
import base64
# strArg is the Base64 string sent from the client
img = base64.b64decode(strArg)
filename = 'someFileName.jpg'
with open(filename, 'wb') as f:
    f.write(img)
# then you can save the file to your Blobstore
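Both answers rely on the same base64 round trip, which can be checked in isolation with the standard library. This is a small sketch with hypothetical helper names (`encode_file_bytes`, `decode_file_string`) and made-up sample bytes; it only demonstrates that the bytes survive the trip through a JSON-safe string.

```python
import base64


def encode_file_bytes(raw):
    """Sender side: turn raw file bytes into an ASCII-safe string."""
    return base64.b64encode(raw).decode("ascii")


def decode_file_string(encoded):
    """Receiver side: recover the original bytes from the string."""
    return base64.b64decode(encoded)


# Made-up bytes standing in for the contents of an image or zip file.
payload = b"\x89PNG\r\n\x1a\n fake image bytes"
wire_value = encode_file_bytes(payload)   # safe to place in a JSON field
restored = decode_file_string(wire_value)
print(restored == payload)  # True
```

Keep in mind that base64 encoding makes the payload roughly one third larger, so for large files serving from a URL may be the more practical of the two options.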
