I zip a folder that has multiple subdirectories. When I upload it to S3 using boto by reading it like this,
zipdata = open(os.path.join(os.curdir, zip_file), 'rb').read()
then all the files from all subdirectories are copied into the root directory; that is, no subdirectories exist in S3.
How do I upload a zip file of a folder to S3?
After running the command you show above, zipdata will contain the bytes of the zip file. If you then write that data to S3, you will get a single object (key) in S3 that contains that data. Is that what you want?
It sounds like you want the zip file to be expanded and all of the individual files and directories inside it to be stored in S3 as individual objects. If that is the case, you need to expand the zip file locally and then walk through the hierarchy and store each individual file in S3. You could use the s3put command line tool in boto to do this for you.
There is no way to get S3 itself to unpack the contents of a zip file for you automatically.
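For illustration, here is a rough sketch of that approach using boto3 (rather than the older boto/s3put tool); the bucket name and the "unzipped" scratch directory are assumptions, and zip_file is the variable from the question:

import os
import zipfile
import boto3

s3 = boto3.client("s3")
bucket = "my-bucket"  # assumption: replace with your bucket name

# Expand the zip into a scratch directory, then upload each file under its relative path
with zipfile.ZipFile(zip_file) as zf:
    zf.extractall("unzipped")

for root, dirs, files in os.walk("unzipped"):
    for name in files:
        local_path = os.path.join(root, name)
        key = os.path.relpath(local_path, "unzipped").replace(os.sep, "/")
        s3.upload_file(local_path, bucket, key)

This preserves the subdirectory structure in the object keys, which is what makes the "folders" appear in the S3 console.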
Related
I am able to read the files from a GCS bucket, but I am not able to merge the sample files and write the result to another GCS location.
I have to merge all the files in one folder of a GCS bucket and move that merged file to another GCS bucket location using Python.
Example: gs://hello/test
test contains 5 files, and the content of all 5 files is to be merged and moved to another folder, say test1.
I tried moving one file from one GCS location to another, but the use case is to merge all the files in one folder and move the merged file to the other GCS bucket location.
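For what it's worth, one possible approach is the google-cloud-storage client's compose(), which concatenates objects server-side; this is only a sketch, and the bucket name hello, the prefixes test/ and test1/, and the merged object name come from the example above or are assumed:

from google.cloud import storage

client = storage.Client()
bucket = client.bucket("hello")  # bucket name from the example above

# Gather the objects under test/ and compose them server-side into one object under test1/
sources = [b for b in bucket.list_blobs(prefix="test/") if not b.name.endswith("/")]
merged = bucket.blob("test1/merged.txt")  # hypothetical name for the merged object
merged.compose(sources)  # compose accepts at most 32 source objects per call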
I am trying to delete files (not folders) from multiple folders within an S3 bucket.
My code:
for archive in src.objects.filter(Prefix=f"Nightly/{folder}"):
    s3.Object(BUCKET, archive.key).delete()
When I do this, it deletes only the files from some folders (works fine), but for the other 2 folders it deletes the folder itself.
As you can see in the picture, I am listing the files in each folder.
The folders account, user, and Archive print an extra archive key (highlighted), but the folders opportunity and opphistory do not print a key for the folder itself. I would like to know why this key is not printed for these 2 folders. Thanks.
There are no folders and files in S3. Everything is an object. So Nightly/user/ is an object, just like Nightly/opportunity/opportunity1.txt is an object.
The "folders" are only visual representation made by AWS console:
The console uses the key name prefixes (Development/, Finance/, and Private/) and delimiter ('/') to present a folder structure.
So your "folders account, user, Archive printing an extra archive (highlighted)" are just objects called Nightly/user/, Nightly/account/ and Nightly/Archive/. Such objects are created when you click "New folder" in the AWS console (you can use also AWS SDK or CLI to create them). Your other "files" don't have such folders, because these "files" weren't created like this. Instead they where uploaded to S3 under their full name, e.g. Nightly/opportunity/opportunity1.txt.
I have an Excel sheet which contains the S3 file name & S3 static URL, and I want to delete these objects at once. The total number of files is 1000+, so any script I can use to delete these files would be helpful.
You could use the AWS Command-Line Interface (CLI) to delete the files, if you have AWS credentials with permission to delete the objects.
I would recommend that you add another column to the spreadsheet that inserts the file details into a formula that generates a command like this:
aws s3 rm s3://BUCKET-NAME/path/file.txt
You could then copy the command from Excel and paste it into the command line to tell the AWS CLI to delete the file. Test it on a few first, to make sure it works.
Then, use Fill Down and copy/paste all the commands into the command line.
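If a script is preferred over the spreadsheet-plus-CLI route, a hedged boto3 alternative is to export the sheet to CSV and delete the keys in batches; the file name files.csv, the "key" column name, and BUCKET-NAME are assumptions to adjust to your sheet:

import csv
import boto3

s3 = boto3.client("s3")
bucket = "BUCKET-NAME"  # assumption: replace with your bucket

# Assumes the Excel sheet was exported to files.csv with the object key in a "key" column
with open("files.csv", newline="") as f:
    keys = [{"Key": row["key"]} for row in csv.DictReader(f)]

# delete_objects removes up to 1000 keys per request
for i in range(0, len(keys), 1000):
    s3.delete_objects(Bucket=bucket, Delete={"Objects": keys[i:i + 1000]})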
I am trying to upload a file to a subfolder in S3 in a Lambda function.
Any suggestions for achieving this task? Currently I am able to upload only to the main S3 bucket folder:
s3_resource.Bucket("bucketname").upload_file("/tmp/file.csv", "file.csv")
However, my goal is to upload to bucketname/subfolder1/file.csv.
Thanks in advance
Amazon has this to say about S3 paths:
In Amazon S3, buckets and objects are the primary resources, and objects are stored in buckets. Amazon S3 has a flat structure instead of a hierarchy like you would see in a file system. However, for the sake of organizational simplicity, the Amazon S3 console supports the folder concept as a means of grouping objects. It does this by using a shared name prefix for objects (that is, objects have names that begin with a common string). Object names are also referred to as key names.
In other words, you just need to specify the path you want to use for the upload. The directory concept only affects how objects are enumerated and displayed; there isn't a directory you need to create as in a traditional filesystem:
s3_resource.Bucket("bucketname").upload_file("/tmp/file.csv", "subfolder1/file.csv")
I want to upload a HDF5 file created with h5py to S3 bucket without saving locally using boto3.
This solution uses pickle.dumps and pickle.loads, and the other solutions I have found store the file locally, which I would like to avoid.
You can use io.BytesIO() together with put_object, as illustrated here. Hope this helps. Even in this case, you'd have to 'store' the data locally (though 'in memory'). You could also create a tempfile.TemporaryFile and then upload your file with put_object. I don't think you can stream to an S3 bucket in the sense that the local data would be discarded as it is uploaded to the bucket.
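For example, a minimal sketch of the in-memory route (the bucket and key names are placeholders, and it assumes an h5py version that accepts Python file-like objects):

import io
import boto3
import h5py

buf = io.BytesIO()
# Write the HDF5 file into the in-memory buffer instead of to disk
with h5py.File(buf, "w") as f:
    f.create_dataset("data", data=[1, 2, 3])

buf.seek(0)
boto3.client("s3").put_object(Bucket="my-bucket", Key="file.h5", Body=buf.getvalue())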