I am trying to delete files (not folders) from multiple folders within an S3 bucket.
My code:
for archive in src.objects.filter(Prefix="Nightly/{folder}"):
    s3.Object(BUCKET, archive.key).delete()
When I do this, it deletes only the files in some directories (which works fine),
but for the other 2 folders it deletes the folder itself.
If you look at the picture, I am listing the files in each folder:
folders account, user, Archive printing an extra archive (highlighted),
but folders opportunity and opphistory are not printing a key for the folder. I would like to know why this key is not printed for these 2 folders. Thanks.
There are no folders and files in S3. Everything is an object. So Nightly/user/ is an object, just like Nightly/opportunity/opportunity1.txt is an object.
The "folders" are only a visual representation created by the AWS console:
The console uses the key name prefixes (Development/, Finance/, and Private/) and delimiter ('/') to present a folder structure.
So your "folders account, user, Archive printing an extra archive (highlighted)" are just objects called Nightly/user/, Nightly/account/ and Nightly/Archive/. Such objects are created when you click "New folder" in the AWS console (you can also use the AWS SDK or CLI to create them). Your other "files" don't have such folder objects, because these "files" weren't created like this. Instead they were uploaded to S3 under their full name, e.g. Nightly/opportunity/opportunity1.txt.
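If the goal is to delete the "files" but keep the console-created folder objects, one option is to skip every key that ends with the delimiter. The following is a minimal sketch, not the asker's code: the function name delete_files_only is hypothetical, and it assumes a boto3 Bucket resource (such as src in the question) whose object summaries expose key and delete().

```python
def delete_files_only(bucket, prefix):
    """Delete every object under `prefix` except folder placeholder keys.

    `bucket` is assumed to be a boto3 Bucket resource (or any object with
    the same `objects.filter(Prefix=...)` interface).
    Returns the list of keys that were deleted.
    """
    deleted = []
    for obj in bucket.objects.filter(Prefix=prefix):
        # Keys ending in "/" are the placeholder objects that the console
        # creates for "New folder"; leave them in place.
        if obj.key.endswith("/"):
            continue
        obj.delete()
        deleted.append(obj.key)
    return deleted
```

Called as delete_files_only(src, "Nightly/user/"), this would remove the files under that prefix while leaving the Nightly/user/ object itself, so the "folder" stays visible in the console. Prefixes without a placeholder object, such as Nightly/opportunity/, will still disappear from the console once their last object is deleted.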
I have an Excel sheet that contains S3 file names and S3 static URLs, and I want to delete these objects all at once. There are more than 1000 files, so any script I can use to delete them would be helpful.
You could use the AWS Command-Line Interface (CLI) to delete the files, if you have AWS credentials with permission to delete the objects.
I would recommend that you add another column to the spreadsheet with a formula that inserts the file details to generate this:
aws s3 rm s3://BUCKET-NAME/path/file.txt
You could then copy the command from Excel and paste it into the command line to tell the AWS CLI to delete the file. Test it on a few first, to make sure it works.
Then, use Fill Down and copy/paste all the commands into the command line.
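As an alternative to pasting CLI commands, the same deletion could be scripted with the S3 DeleteObjects API, which accepts up to 1000 keys per request. This is a sketch of a different technique from the one described above, under some assumptions: the function names are hypothetical, the keys have already been extracted from the spreadsheet into a Python list, and client is a boto3 S3 client (for example boto3.client("s3")).

```python
def chunked(items, size=1000):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def delete_keys(client, bucket, keys):
    """Delete `keys` from `bucket` in batches of up to 1000.

    `client` is assumed to be a boto3 S3 client.
    Returns the list of per-key errors reported by S3 (empty if none).
    """
    errors = []
    for batch in chunked(list(keys)):
        response = client.delete_objects(
            Bucket=bucket,
            Delete={"Objects": [{"Key": key} for key in batch], "Quiet": True},
        )
        # In quiet mode the response only lists keys that failed.
        errors.extend(response.get("Errors", []))
    return errors
```

As with the CLI approach, it is worth trying this on a few keys first. Note that the keys must be object keys (path/file.txt), not full URLs, so the bucket host part would need to be stripped from the static URL column.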
I am trying to upload a file to a subfolder in S3 from a Lambda function.
Any suggestions for achieving this? Currently I am able to upload only to the root of the S3 bucket.
s3_resource.Bucket("bucketname").upload_file("/tmp/file.csv", "file.csv")
However, my goal is to upload the file to bucketname/subfolder1/file.csv
Thanks in advance
Amazon has this to say about S3 paths:
In Amazon S3, buckets and objects are the primary resources, and objects are stored in buckets. Amazon S3 has a flat structure instead of a hierarchy like you would see in a file system. However, for the sake of organizational simplicity, the Amazon S3 console supports the folder concept as a means of grouping objects. It does this by using a shared name prefix for objects (that is, objects have names that begin with a common string). Object names are also referred to as key names.
In other words, you just need to specify the path you want to use for the upload. The directory concept only affects how objects are enumerated and displayed; there isn't a directory you need to create, as there would be on a traditional filesystem:
s3_resource.Bucket("bucketname").upload_file("/tmp/file.csv", "subfolder1/file.csv")
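When the "folder" and file name come from variables, the key can be assembled with a small helper. This is only an illustrative sketch: build_key is a hypothetical name, and stripping stray slashes is a design choice made here so that the key does not begin with "/" or contain an empty path segment.

```python
import posixpath


def build_key(prefix, filename):
    """Join a "folder" prefix and a file name into one S3 object key."""
    name = filename.strip("/")
    if not name:
        raise ValueError("filename must not be empty")
    folder = prefix.strip("/")
    # S3 keys use "/" regardless of the local OS, hence posixpath.
    return posixpath.join(folder, name) if folder else name
```

The upload from the answer would then read s3_resource.Bucket("bucketname").upload_file("/tmp/file.csv", build_key("subfolder1", "file.csv")).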
So I am trying to port a Python webapp written with Flask to Google App Engine. The app hosts user-uploaded files up to 200 MB in size, and for non-image files the original name of the file needs to be retained. To prevent filename conflicts, e.g. two people uploading stuff.zip, each containing completely different and unrelated contents, the app creates a UUID folder on the filesystem, stores the file within it, and serves it to users. Google App Engine's Cloud Storage, which I was planning on using to store the user files in a bucket, has "no notion of folders" according to its documentation. What is the best way to get this same functionality with their system?
The current method, just for demonstration:
# generates a new folder with a shortened UUID name to save files
# other than images to avoid filename conflicts
else:
    # if there is a better way of doing this i'm not clever enough
    # to figure it out
    new_folder_name = shortuuid.uuid()[:9]
    os.mkdir(
        os.path.join(app.config['FILE_FOLDER'], new_folder_name))
    file.save(
        os.path.join(os.path.join(app.config['FILE_FOLDER'], new_folder_name), filename))
    new_folder_path = os.path.join(
        app.config['FILE_FOLDER'], new_folder_name)
    return url_for('uploaded_file', new_folder_name=new_folder_name)
From the Google Cloud Storage Client Library Overview documentation:
GCS and "subdirectories"
Google Cloud Storage documentation refers to "subdirectories" and the GCS client library allows you to supply subdirectory delimiters when you create an object. However, GCS does not actually store the objects into any real subdirectory. Instead, the subdirectories are simply part of the object filename. For example, if I have a bucket my_bucket and store the file somewhere/over/the/rainbow.mp3, the file rainbow.mp3 is not really stored in the subdirectory somewhere/over/the/. It is actually a file named somewhere/over/the/rainbow.mp3. Understanding this is important for using listbucket filtering.
While Cloud Storage does not support subdirectories per se, it allows you to use subdirectory delimiters inside filenames. This basically means that the path to your file will still look exactly as if it was inside a subdirectory, even though it is not. This apparently should concern you only when you're iterating over the entire contents of the bucket.
From the Request URIs documentation:
URIs for Standard Requests
For most operations you can use either of the following URLs to access objects:
storage.googleapis.com/<bucket>/<object>
<bucket>.storage.googleapis.com/<object>
This means that the public URL for their example would be http://storage.googleapis.com/my_bucket/somewhere/over/the/rainbow.mp3. Their service would interpret this as bucket=my_bucket and object=somewhere/over/the/rainbow.mp3 (i.e. no notion of subdirectories, just an object name with embedded slashes in it); the browser however will just see the path /my_bucket/somewhere/over/the/rainbow.mp3 and will interpret it as if the filename is rainbow.mp3.
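Applied to the question, the UUID "folder" can simply become the first part of the object name. The sketch below assumes nothing about the upload call itself (that depends on the client library in use); make_object_name and public_url are hypothetical helpers, and the standard library's uuid stands in for the shortuuid package used in the question.

```python
import uuid
from urllib.parse import quote


def make_object_name(filename):
    """Return '<9-char random prefix>/<filename>' to avoid name clashes."""
    # Nine hex characters of a random UUID play the role of the old folder.
    prefix = uuid.uuid4().hex[:9]
    return f"{prefix}/{filename}"


def public_url(bucket, object_name):
    """Build a URL of the form storage.googleapis.com/<bucket>/<object>."""
    # quote() leaves "/" unescaped by default, so the slashes stay in the path.
    return f"https://storage.googleapis.com/{bucket}/{quote(object_name)}"
```

Two uploads of stuff.zip would then get different object names, while a browser following the URL would still see stuff.zip as the file name.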
I'm faced with the following problem:
The users have some files that need syncing, so I'm writing a script that copies the encrypted files from a user's directory to a temporary directory on the server before they get distributed to the other 5 servers.
The initial copy is done by creating a folder with the user's name and putting the files there.
The users are free to change usernames, so if someone changes their username to something nasty, the servers are owned.
I have to use the usernames for folder names because the script that does the syncing uses the folder name (the username) as metadata of some sort.
So, is there any way to escape the usernames and make sure that everything is created under the master folder?
As nrathaus suggested, you could use os.path.normpath to get the "normalized" path and then check that the result still lies under the master folder.
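One way that check might look is sketched below. The function name safe_user_dir and the choice to raise ValueError are assumptions made for illustration; the idea is to normalize the joined path and reject it unless the master folder is still its parent.

```python
import os


def safe_user_dir(master, username):
    """Return the normalized path for `username` under `master`.

    Raises ValueError if the resulting path would fall outside `master`
    (or would be `master` itself).
    """
    master = os.path.abspath(master)
    # normpath collapses "..", "." and duplicate separators.
    candidate = os.path.normpath(os.path.join(master, username))
    if candidate == master or os.path.commonpath([master, candidate]) != master:
        raise ValueError(f"unsafe username: {username!r}")
    return candidate
```

Usernames such as "../evil" or "/etc" are rejected here, while a plain name like "alice" is accepted. Note that normpath works on the string only and does not resolve symbolic links, and that a name containing a separator (for example "a/b") would still be accepted as a nested folder; rejecting separators outright may be a stricter and simpler rule.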
I zip a folder that has multiple subdirectories. When I upload it to S3 using boto,
reading it like this,
zipdata = open(os.path.join(os.curdir, zip_file), 'rb').read()
all files from all subdirectories are copied to the root directory. That is, no subdirectories exist in S3.
How do I upload a zip file of a folder to S3?
After running the command you show above, zipdata will contain the bytes contained in the zip file. If you then write that data to S3, you will get a single object (key) in S3 that contains that data. Is that what you want?
It sounds like you want the zip file to be expanded and all of the individual files and directories inside it to be stored in S3 as individual objects. If that is the case, you need to expand the zip file locally and then walk through the hierarchy and store each individual file in S3. You could use the s3put command line tool in boto to do this for you.
There is no way to get S3 itself to unpack the contents of a zip file for you automatically.
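The "expand locally and store each file" approach could be sketched as follows. This is not the s3put tool; it is an illustration that reads members straight from the archive with the standard zipfile module. The function name upload_zip_members is hypothetical, and client is assumed to be a boto3-style S3 client with a put_object method, which differs from the older boto library mentioned in the question.

```python
import zipfile


def upload_zip_members(zip_path, client, bucket, prefix=""):
    """Store every file inside the zip archive as its own S3 object.

    The path of each member inside the archive becomes part of the key,
    so the subdirectory layout is preserved in the key names.
    Returns the list of keys that were written.
    """
    uploaded = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            # Directory entries carry no data; S3 needs no folders anyway.
            if info.is_dir():
                continue
            key = prefix + info.filename
            # read() loads the whole member into memory.
            client.put_object(Bucket=bucket, Key=key, Body=archive.read(info))
            uploaded.append(key)
    return uploaded
```

Because each member is read fully into memory before it is sent, this sketch suits archives with moderately sized files; very large members would call for a streaming upload instead.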