Can I upload a file to GCS from Google Endpoints? - python

I'm trying to upload a file from a REST API (Google Cloud Endpoints) to GCS, but I keep getting errors. I don't know whether I'm going about it the wrong way or whether Google Endpoints simply can't upload files.
I want my customers to upload files to my project's bucket.
I read that "Endpoints doesn't accept the multipart/form-data encoding so you can't upload the image directly to Endpoints".
Mike answered me in this post, but I don't know how to implement that in my project.
I'm using this library (Python):
https://cloud.google.com/appengine/docs/python/googlecloudstorageclient/
If it's possible, what's the best way? Any example?
Thanks so much.

I think what Mike means in the previous post is that you should use the Blobstore API to upload the file to GCS instead of sending it through Endpoints, and then read the data back from the Blobstore when you need it.
But that depends on what platform your client runs on. If you use a web-based client, you should do it the ordinary way, just as Mike explained (an HTML form and an upload handler); a rough sketch of that path is below. But if you use an Android or other mobile client, you can use the GCS client library or the GCS REST API.
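The following is only a minimal sketch of the form-and-handler path, assuming App Engine standard (Python 2) with webapp2; the bucket name and routes are hypothetical placeholders.

```python
# A minimal sketch: Blobstore upload URL that writes straight into GCS.
# Assumes App Engine standard (Python 2) + webapp2; "my-bucket" is hypothetical.
import webapp2
from google.appengine.ext import blobstore
from google.appengine.ext.webapp import blobstore_handlers


class UploadFormHandler(webapp2.RequestHandler):
    def get(self):
        # One-time upload URL that stores the POSTed file in the GCS bucket.
        upload_url = blobstore.create_upload_url(
            '/upload', gs_bucket_name='my-bucket')
        self.response.write(
            '<form action="%s" method="POST" enctype="multipart/form-data">'
            '<input type="file" name="file"><input type="submit">'
            '</form>' % upload_url)


class UploadHandler(blobstore_handlers.BlobstoreUploadHandler):
    def post(self):
        # The file is already in GCS by the time this runs; we only see metadata.
        file_info = self.get_file_infos('file')[0]
        self.response.write('Stored as %s' % file_info.gs_object_name)


app = webapp2.WSGIApplication([
    ('/', UploadFormHandler),
    ('/upload', UploadHandler),
])
```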

Related

Is it possible to upload content to Cloud Storage automatically from Google Drive?

I need to automatically load CSV files from Google Drive into BigQuery, and I was wondering whether it's possible to do it this way:
Google Drive folder
Pub/Sub, Cloud Functions, Drive API... ??
Cloud Storage bucket
BigQuery
I have developed a Python script that automatically loads the CSV files stored in Cloud Storage into BigQuery; now I need to build the workflow between Google Drive and Cloud Storage.
I've been researching but really don't know how to proceed.
Any hints?
You will need to develop an app that listens for changes; Google App Engine or Cloud Functions works well here.
The app will need to implement the retrieve-changes logic that makes sense for your use case.
See the Google Drive API docs: https://developers.google.com/drive/api/v3/manage-changes
With Drive, I recommend asking whether the OAuth flow is worth it for any app. Asking your users to submit files through a lightweight frontend might be easier and faster to develop.
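As a rough sketch of that retrieve-changes loop (assuming google-api-python-client and already-obtained credentials; the CSV filter and the print call stand in for your actual copy-to-GCS step):

```python
# A rough sketch of polling Drive for changes (Drive API v3).
# Assumes google-api-python-client and valid credentials; everything
# beyond the changes().list() flow is a placeholder.
from googleapiclient.discovery import build


def poll_drive_changes(creds, saved_page_token=None):
    drive = build('drive', 'v3', credentials=creds)
    if saved_page_token is None:
        # First run: remember "now" so only future changes are reported.
        saved_page_token = drive.changes().getStartPageToken().execute()[
            'startPageToken']

    page_token = saved_page_token
    while page_token is not None:
        response = drive.changes().list(pageToken=page_token,
                                        fields='*').execute()
        for change in response.get('changes', []):
            f = change.get('file') or {}
            if f.get('mimeType') == 'text/csv':
                # Placeholder: download the CSV and copy it to the GCS bucket.
                print('CSV changed: %s (%s)' % (f.get('name'),
                                                change.get('fileId')))
        if 'newStartPageToken' in response:
            # Persist this token and resume from it on the next poll.
            saved_page_token = response['newStartPageToken']
        page_token = response.get('nextPageToken')
    return saved_page_token
```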
Try using the Google Drive API to pull data from Google Drive and load it into whichever destination you want, i.e. GCS, a BigQuery table, and so on.
You can adapt something like the sketch below to achieve the same.
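A rough sketch of that transfer, assuming google-api-python-client and google-cloud-storage are installed and credentials are available; the file ID, bucket, and object names are hypothetical:

```python
# A rough sketch: download a CSV from Drive, then upload it to GCS.
# The file ID, bucket, and blob names are hypothetical placeholders.
import io

from google.cloud import storage
from googleapiclient.discovery import build
from googleapiclient.http import MediaIoBaseDownload


def drive_csv_to_gcs(creds, file_id, bucket_name, blob_name):
    drive = build('drive', 'v3', credentials=creds)
    # Stream the file contents out of Drive into memory.
    request = drive.files().get_media(fileId=file_id)
    buf = io.BytesIO()
    downloader = MediaIoBaseDownload(buf, request)
    done = False
    while not done:
        _, done = downloader.next_chunk()
    buf.seek(0)
    # Write the same bytes to Cloud Storage; BigQuery can then load from GCS.
    bucket = storage.Client().bucket(bucket_name)
    bucket.blob(blob_name).upload_from_file(buf, content_type='text/csv')
```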

PDF/TIFF Document Text Detection

I am currently trying to use Google's Cloud Vision API in my project. The problem is that the Cloud Vision API for document text detection accepts only a Google Cloud Storage URI as the input and output destination, but all my project data lives on Amazon S3 and can't be used directly with this API.
Points to be noted:
All data has to be kept in S3 only.
I can't change my cloud storage to GCS now.
I can't download files from S3 and upload them to GCS manually. The number of incoming files per day is between 1,000 and 100,000.
Even if I could automate downloading and uploading the PDFs, this would be a bottleneck for the entire project, since I would have to deal with concurrency issues and memory management.
Is there any workaround to make this API work with S3 URIs? I am in need of your help.
Thank You
Currently, the Vision API doesn't work with URLs other than Google Cloud Storage ones. There is an existing feature request about using the API with arbitrary URLs for image search, where you could ask for this to be considered for PDF/TIFF documents too, or you can raise a new feature request for this scenario.
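To illustrate the GCS-only constraint, here is a rough sketch of an asynchronous PDF text-detection request with the Python client (assuming a pre-2.0 google-cloud-vision release; the gs:// paths are hypothetical):

```python
# A rough sketch of async DOCUMENT_TEXT_DETECTION on a PDF.
# Assumes a pre-2.0 google-cloud-vision client; gs:// paths are hypothetical.
from google.cloud import vision

client = vision.ImageAnnotatorClient()
request = vision.types.AsyncAnnotateFileRequest(
    features=[vision.types.Feature(
        type=vision.enums.Feature.Type.DOCUMENT_TEXT_DETECTION)],
    # Both input and output must be Cloud Storage URIs; S3 URLs are rejected.
    input_config=vision.types.InputConfig(
        gcs_source=vision.types.GcsSource(uri='gs://my-bucket/input.pdf'),
        mime_type='application/pdf'),
    output_config=vision.types.OutputConfig(
        gcs_destination=vision.types.GcsDestination(uri='gs://my-bucket/out/'),
        batch_size=20))
operation = client.async_batch_annotate_files(requests=[request])
operation.result(timeout=300)  # JSON results are written to gcs_destination
```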

How to download all files from a Google Cloud Storage bucket directory to a local directory using Google OAuth

Is there any way, using OAuth, to download all the contents of a Google Cloud Storage bucket directory to a local directory?
I found two ways: using the GET object request from the Storage API, and using gsutil. But since the API downloads by object name, I have to first list all the bucket contents, then send a GET request for each object and download it. I find gsutil more convenient, but for that I have to hard-code the credential details.
Basically, I am developing a client-related application where I have to download the BigQuery table data to the client's local server.
Can anyone help me with this?
Unless your application knows ahead of time the object names that you want to download, you'll need to perform a list operation followed by a GET for each object.
You can use the gcloud-python client library to do this. You can configure your client application with the OAuth2 credentials, and the library should handle the rest of the necessary authentication for you. See the documentation here for the basics of authentication, and [here](https://googlecloudplatform.github.io/google-cloud-python/stable/storage-blobs.html) for interacting with Google Cloud Storage objects.
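A minimal sketch with that library (the bucket name, prefix, and service-account key path are hypothetical; substitute whichever OAuth2 credentials your client application holds):

```python
# A minimal sketch using the google-cloud-storage client library.
# Bucket, prefix, and key-file path are hypothetical placeholders.
import os

from google.cloud import storage


def download_prefix(bucket_name, prefix, local_dir, key_path=None):
    # Explicit service-account credentials, or fall back to the defaults.
    client = (storage.Client.from_service_account_json(key_path)
              if key_path else storage.Client())
    # List everything under the "directory" prefix, then GET each object.
    for blob in client.bucket(bucket_name).list_blobs(prefix=prefix):
        if blob.name.endswith('/'):  # skip zero-byte "folder" placeholders
            continue
        dest = os.path.join(local_dir, blob.name)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        blob.download_to_filename(dest)


# Example: download_prefix('my-bucket', 'exports/', '/tmp/exports',
#                          key_path='client-key.json')
```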

How to do chunk deduplication-based upload in Dropbox API?

I am using the Dropbox API (Python version), and I want to replicate one piece of functionality from the Dropbox client software.
With the Dropbox API, I can call a function like put_file() to upload a file to my Dropbox account.
Dropbox actually implements a per-user deduplication mechanism, which means that you transmit the chunk/file hash to the server before transmitting the chunk/file itself.
If you uploaded a file F before and the server now finds a hash match, you don't need to transmit the chunk/file again.
put_file() seems to upload the file every time and does not do any chunking.
I also found upload_chunk(), which looked promising, but it doesn't seem to help here.
I am wondering how I can do chunk-based deduplication with the Dropbox API?
(For example, could I upload the hash of a particular chunk and have the server reply whether there is a hash match?)
According to this announcement, the purpose of chunked upload is to make it possible to deal with spotty connections by letting you upload a large file in chunks. It's not about deduplication.
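For reference, in the current v2 Python SDK chunked upload is done with upload sessions (the legacy upload_chunk() belongs to the old Core API); a rough sketch, with the access token, paths, and chunk size as placeholders:

```python
# A rough sketch of chunked upload with the v2 Dropbox Python SDK
# (upload sessions replace the legacy upload_chunk()). Token, paths,
# and chunk size are hypothetical placeholders; there is no dedup here.
import os

import dropbox

CHUNK = 4 * 1024 * 1024  # 4 MB per request


def chunked_upload(token, local_path, dest_path):
    dbx = dropbox.Dropbox(token)
    size = os.path.getsize(local_path)
    with open(local_path, 'rb') as f:
        if size <= CHUNK:
            return dbx.files_upload(f.read(), dest_path)
        session = dbx.files_upload_session_start(f.read(CHUNK))
        cursor = dropbox.files.UploadSessionCursor(
            session_id=session.session_id, offset=f.tell())
        commit = dropbox.files.CommitInfo(path=dest_path)
        while f.tell() < size:
            if size - f.tell() <= CHUNK:
                # Last chunk: finish the session and commit the file.
                return dbx.files_upload_session_finish(
                    f.read(CHUNK), cursor, commit)
            dbx.files_upload_session_append_v2(f.read(CHUNK), cursor)
            cursor.offset = f.tell()
```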
If you take a look through the Core API documentation (not that much to read, really), there is no mention anywhere of deduplication being offered through the API. Whether you use Python or any other language or library, without the published API supporting deduplication, there is no way you can access this functionality.

Uploading to Blobstore without using blobstore.create_upload_url

I would like to create an app in Python on Google App Engine to handle file uploads and store them in the Blobstore.
However, the Blobstore currently requires the use of blobstore.create_upload_url to create the URL that the file upload form POSTs to. Since I am uploading from another server, my question is: is it possible to upload a file to the GAE Blobstore without using that dynamic URL from blobstore.create_upload_url?
FYI, it is OK for me to request an upload URL from the Python script before I upload from the other server, but this adds extra latency, which is not what I want. I also read about the so-called "file-like API" at http://code.google.com/intl/en/appengine/docs/python/blobstore/overview.html#Writing_Files_to_the_Blobstore, but the documentation doesn't seem to cover the uploading part.
Also, I previously tried to use the datastore for file uploads, but the maximum entity size is 1MB, which is not enough for my case. Please kindly advise, thanks.
There are exactly two ways to write files to the Blobstore: 1) using create_upload_url and posting a form with a file attachment to it, or 2) writing to the Blobstore directly using an API (with a solution here for large files).
If you want a remote server to be able to upload, you have the same two choices:
1) The remote server first requests a URL. You have a handler whose only job is to return such a URL. With this URL, the remote server crafts a properly encoded POST message and posts it to the URL.
2) You send the file data as an HTTP parameter to a handler on your App Engine server, which uses the Blobstore write API to write the file directly. Request size is limited to 32MB, so the maximum file size you can write using this method will be slightly under that.
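A rough sketch of option 2, using the GCS client library (the successor to the deprecated Blobstore Files API); the bucket name and route are hypothetical:

```python
# A rough sketch: accept raw POSTed bytes and write them to GCS directly.
# Uses the GCS client library (successor to the deprecated Files API);
# the bucket name and route are hypothetical placeholders.
import webapp2
import cloudstorage as gcs


class DirectWriteHandler(webapp2.RequestHandler):
    def post(self):
        # The remote server POSTs the raw file body (capped at 32MB).
        name = self.request.get('name', 'upload.bin')
        filename = '/my-bucket/' + name
        content_type = self.request.headers.get(
            'Content-Type', 'application/octet-stream')
        with gcs.open(filename, 'w', content_type=content_type) as f:
            f.write(self.request.body)
        self.response.write('Wrote %s' % filename)


app = webapp2.WSGIApplication([('/direct-upload', DirectWriteHandler)])
```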
