I am trying to populate a bucket on Google Cloud Storage, but the Python API client library that I've been using, and that has otherwise been treating me very well, doesn't seem to work for uploading files.
The API is here and is documented here
When I try to use storage.objects.insert, I get a response with an error that says:
"reason": "wrongUrlForUpload"
"message": "Upload requests must include an uploadType URL parameter and a URL path beginning with /upload/"
I did some experimenting. I found out that I need to pass in a file name, but my data is in a string, not on my file system. I'm creating a file name with tempfile.NamedTemporaryFile, and passing that name in. tempfile creates a file like /tmp/tmpOGyQY1, but the library doesn't like it unless it can determine a file type. If the file name has an extension, then it works. Is there a way I can create a tempfile with an extension? Or is there some other way I can inform the library of the file type?
It turns out that the piece I was missing was setting suffix=".csv" when I create the tempfile.
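A minimal sketch of that fix, using only the standard library. The helper name write_temp_with_suffix and the sample data are made up for illustration, and the mimetypes.guess_type call is only there to show an extension-based guess of the kind the client library presumably relies on (that part is an assumption, not something confirmed above).

```python
import mimetypes
import os
import tempfile


def write_temp_with_suffix(data, suffix=".csv"):
    """Write a string to a named temporary file whose name ends in `suffix`.

    Returns the file path; the caller is responsible for deleting the file.
    """
    # delete=False keeps the file on disk after the handle is closed,
    # so the path can be handed to an upload call afterwards.
    with tempfile.NamedTemporaryFile(mode="w", suffix=suffix, delete=False) as tmp:
        tmp.write(data)
        return tmp.name


path = write_temp_with_suffix("a,b\n1,2\n")
# Extension-based guess; without a suffix this would return (None, None).
guessed_type, _ = mimetypes.guess_type(path)
# ... pass `path` to the upload call here ...
os.remove(path)
```

Passing the suffix at creation time avoids renaming the file afterwards, and the random part of the name is still generated by tempfile.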
Related
I'm trying to set up a cloud function that performs user authentication. I need it to open an object stored in a Cloud Storage bucket, read its content and verify if username and password match those coming from the HTTP request.
In some cases the function needs to add a user: it should retrieve the content of the .json file stored in the bucket, add a username:password pair and save the content in the same object.
Basically, it has to modify the content of the object.
I can't find a way to do it using the Cloud Storage Python client library. None of the tutorials listed in the GitHub pages mentions anything like "modify a file" or similar concepts (at least in their short descriptions).
I also looked for a method to perform this operation in the Blob class source code, but I couldn't find it.
Am I missing something? This looks to me like a very common operation, one that should have a very straightforward method, like blob.modify(new_content).
I have to confess that I am completely new to GCP, so there is probably an obvious reason behind this (or maybe I just missed it).
Thank you in advance!
Cloud Storage is blob storage: you can only read, write and delete an object. You can't update its content in place (only its metadata), and you can't move or rename a file. A move or rename performs a copy (creating a new object) followed by a delete of the old object. To "modify" an object, you read it, change the content and write the whole object again.
In addition, directories don't exist; all the files are stored at the root level of the bucket. The file name contains the full path from the root to the leaf. The / is only a human-friendly representation of folders (and the UI uses that representation), but the directories are purely virtual.
Finally, you can't search by file suffix, only by prefix of the file name (including the full path from the root path /).
In summary, it's not a file system, it's blob storage. Change your design or your file storage option.
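Since there is no in-place update, the usual workaround is read, modify, write back. Here is a rough sketch of that pattern; add_user is a hypothetical helper, and it assumes a blob object exposing download_as_text() and upload_from_string(), as Blob does in recent versions of the google-cloud-storage package. The bucket and object names in the comment are placeholders.

```python
import json


def add_user(blob, username, password):
    """Read a JSON object from `blob`, add one entry, and write it all back.

    `blob` is assumed to behave like google.cloud.storage.Blob.
    """
    users = json.loads(blob.download_as_text())   # read the whole object
    users[username] = password                    # modify it in memory
    blob.upload_from_string(                      # overwrite the whole object
        json.dumps(users), content_type="application/json"
    )
    return users


# Hypothetical usage (placeholder names, requires google-cloud-storage):
# from google.cloud import storage
# blob = storage.Client().bucket("my-bucket").blob("users.json")
# add_user(blob, "new_user", "new_password")
```

Note that two concurrent calls can overwrite each other's changes, because each one replaces the entire object.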
The App Engine documentation for the Blobstore gives a pretty thorough explanation of how to upload a file using the BlobstoreUploadHandler provided by the webapp framework.
However, I have a cgi.FieldStorage instance that I would like to store directly into the Blobstore. In other words, I don't need to upload the file since this is taken care of by other means; I just need to store it.
I've been looking through the blobstore module source to try to understand how the upload handler creates/generates blobstore keys and ultimately writes files to the blobstore itself, but I'm getting lost. It seems like the CreateUploadURLResponse in blobstore_service_pb is where the actual write would occur, but I'm not seeing the component that actually implements that functionality.
Update
There is also an implementation for storing files directly into the filesystem, which I think is what the upload handler does in the end. I am not entirely sure about this, so an explanation as to whether or not using the FileBlobStorage is the correct way to go would be appreciated.
After the deprecation of the files API you can no longer write directly to blobstore.
You should write to Google Cloud Storage instead. For that you can use the App Engine GCS client library.
Files written to Google Cloud Storage could be served by the Blobstore API by creating a blob key.
So I am trying to port a Python webapp written with Flask to Google App Engine. The app hosts user-uploaded files up to 200 MB in size, and for non-image files the original name of the file needs to be retained. To prevent filename conflicts, e.g. two people uploading stuff.zip, each containing completely different and unrelated contents, the app creates a UUID folder on the filesystem, stores the file within that, and serves it to users. I was planning to store the user files in a Google Cloud Storage bucket, but according to the documentation Cloud Storage has "no notion of folders". What is the best way to go about getting this same functionality with their system?
The current method, just for demonstration:
    # generates a new folder with a shortened UUID name to save files
    # other than images to avoid filename conflicts
    else:
        # if there is a better way of doing this i'm not clever enough
        # to figure it out
        new_folder_name = shortuuid.uuid()[:9]
        new_folder_path = os.path.join(
            app.config['FILE_FOLDER'], new_folder_name)
        os.mkdir(new_folder_path)
        file.save(os.path.join(new_folder_path, filename))
        return url_for('uploaded_file', new_folder_name=new_folder_name)
From the Google Cloud Storage Client Library Overview documentation:
GCS and "subdirectories"
Google Cloud Storage documentation refers to "subdirectories" and the GCS client library allows you to supply subdirectory delimiters when you create an object. However, GCS does not actually store the objects into any real subdirectory. Instead, the subdirectories are simply part of the object filename. For example, if I have a bucket my_bucket and store the file somewhere/over/the/rainbow.mp3, the file rainbow.mp3 is not really stored in the subdirectory somewhere/over/the/. It is actually a file named somewhere/over/the/rainbow.mp3. Understanding this is important for using listbucket filtering.
While Cloud Storage does not support subdirectories per se, it allows you to use subdirectory delimiters inside filenames. This basically means that the path to your file will still look exactly as if it was inside a subdirectory, even though it is not. This apparently should concern you only when you're iterating over the entire contents of the bucket.
From the Request URIs documentation:
URIs for Standard Requests
For most operations you can use either of the following URLs to access objects:
storage.googleapis.com/<bucket>/<object>
<bucket>.storage.googleapis.com/<object>
This means that the public URL for their example would be http://storage.googleapis.com/my_bucket/somewhere/over/the/rainbow.mp3. Their service would interpret this as bucket=my_bucket and object=somewhere/over/the/rainbow.mp3 (i.e. no notion of subdirectories, just an object name with embedded slashes in it); the browser however will just see the path /my_bucket/somewhere/over/the/rainbow.mp3 and will interpret it as if the filename is rainbow.mp3.
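To illustrate, the UUID-folder scheme from the question can be kept by putting the random part into the object name itself. This is only a sketch: make_object_name and public_url are made-up helpers, and uuid.uuid4().hex[:9] stands in for the shortuuid call used in the question.

```python
import uuid
from urllib.parse import quote


def make_object_name(filename):
    """Prefix the original filename with a short random 'folder' name."""
    prefix = uuid.uuid4().hex[:9]  # stand-in for shortuuid.uuid()[:9]
    return f"{prefix}/{filename}"


def public_url(bucket, object_name):
    """Build a storage.googleapis.com/<bucket>/<object> style URL."""
    # quote() leaves '/' unescaped by default, so the embedded slashes
    # survive and the browser sees the last segment as the file name.
    return f"https://storage.googleapis.com/{bucket}/{quote(object_name)}"


name = make_object_name("stuff.zip")
url = public_url("my_bucket", name)
```

Two uploads of stuff.zip then get different object names, while the part after the last slash is still the original filename. The URL is only reachable if the object is publicly readable.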
I know that I can't write files to the Google App Engine file system, but I wonder if I could programmatically build a txt file from the datastore and serve it directly as a download to the user of the application. I am not storing the file. I just want to serve it.
Any idea if this is possible?
Yes, it's possible.
You need to set the header to indicate that the file must be an attachment.
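    import webapp2

    class MainHandler(webapp2.RequestHandler):
        # Needs an explicit route (handler_method='test_download'), or rename it to get.
        def test_download(self):
            # Mark the response as an attachment so the browser downloads it.
            self.response.headers.add_header('content-disposition', 'attachment', filename='text.txt')
            self.response.write("hello world")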
You can find more information by looking at the source for webapp2.
Regarding "can't write files into the google app engine system", you can write to the blobstore instead. So if you need to generate a large file, you write it to the blobstore and serve it from there.
I want to save some data fetched from the web to Blobstore, but the Google documentation says:
Deprecated: The Files API feature used here to write files to Blobstore is going to be removed at some time in the future, in favor of writing files to Google Cloud Storage and using Blobstore to serve them.
The code in Python is as follows:
    from __future__ import with_statement
    from google.appengine.api import files

    # Create the file
    file_name = files.blobstore.create(mime_type='application/octet-stream')

    # Open the file and write to it
    with files.open(file_name, 'a') as f:
        f.write('data')

    # Finalize the file. Do this before attempting to read it.
    files.finalize(file_name)

    # Get the file's blob key
    blob_key = files.blobstore.get_blob_key(file_name)
I am wondering if there is another way to write to blobstore instead of the official upload method.
If you want to use a file-like API, you have to go with GCS.
Blobstore is for uploading more-or-less static images and serving them.
If you want to write using a file-like API and then serve from Blobstore, you can write to GCS and get a BlobKey for the file.
https://cloud.google.com/appengine/docs/python/blobstore/#Python_Using_the_Blobstore_API_with_Google_Cloud_Storage
But writing to BlobStore like you want is deprecated. Stop trying to do it that way.
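The "write to GCS, then get a BlobKey" route above might look roughly like this. It is a sketch under assumptions: gs_object_path and blob_key_for_gcs_object are made-up helper names, the bucket and object names are placeholders, and the blobstore.create_gs_key call only works inside the legacy App Engine Python runtime, where it expects a path of the form /gs/<bucket>/<object>.

```python
def gs_object_path(bucket, object_name):
    """Build the '/gs/<bucket>/<object>' path form for a GCS object."""
    return "/gs/{}/{}".format(bucket, object_name)


def blob_key_for_gcs_object(bucket, object_name):
    """Return a blob key for an object that was already written to GCS."""
    # Imported lazily: this module only exists in the App Engine runtime.
    from google.appengine.ext import blobstore
    return blobstore.create_gs_key(gs_object_path(bucket, object_name))


# Placeholder names, for illustration only.
example_path = gs_object_path("my_bucket", "fetched/data.bin")
```

The object itself has to be written to the bucket first, for example with the GCS client library; the helper only covers the key-creation step.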
Another option may be to put the data in the datastore using a TextProperty.