Flask - Sending zipfile contains absolute path - python

My Flask app has a function that compresses the log files in a directory into a zip file and then sends the file to the user to download. The compression works, except that when the client receives the zipfile, it contains a series of folders matching the absolute path of the original files that were zipped on the server. However, the zipfile that was created in the server's static folder does not.
Zipfile contents in the static folder: "log1.bin, log2.bin"
Zipfile contents sent to the user: "/home/user/server/data/log1.bin, /home/user/server/data/log2.bin"
I don't understand why using "send_file" seems to change the zip file contents and fill the received zip file with subfolders. The actual contents of the received zip file do in fact match the sent zip file in terms of data, but the user has to click through several directories to get to the files. What am I doing wrong?
@app.route("/download")
def download():
    os.chdir(data_dir)
    if os.path.isfile("logs.zip"):
        os.remove("logs.zip")
    log_dir = os.listdir('.')
    log_zip = zipfile.ZipFile('logs.zip', 'w')
    for log in log_dir:
        log_zip.write(log)
    log_zip.close()
    return send_file("logs.zip", as_attachment=True)

Using send_from_directory(directory, "logs.zip", as_attachment=True) fixed everything. It appears this call is better suited for serving static files.
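For reference, a minimal sketch of the fixed route, assuming data_dir is the data directory from the question. Passing arcname to ZipFile.write() also keeps the archive entries relative, so the zip never contains absolute paths in the first place:

import os
import zipfile
from flask import Flask, send_from_directory

app = Flask(__name__)
data_dir = "/home/user/server/data"  # assumption: the data directory from the question

@app.route("/download")
def download():
    zip_path = os.path.join(data_dir, "logs.zip")
    if os.path.isfile(zip_path):
        os.remove(zip_path)
    with zipfile.ZipFile(zip_path, "w") as log_zip:
        for log in os.listdir(data_dir):
            if log == "logs.zip":
                continue  # don't add the archive to itself
            # arcname stores the entry under its bare filename,
            # no matter where the source file lives on disk
            log_zip.write(os.path.join(data_dir, log), arcname=log)
    return send_from_directory(data_dir, "logs.zip", as_attachment=True)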

Deleting a file with Python

My code takes a list of file and folder paths, loops through them, and uploads them to Google Drive. If a path is a directory, the code creates a .zip file before uploading. Once the upload is complete, I need the code to delete the .zip file it created, but the deletion throws an error: PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\Temp\Newfolder.zip'. The file_path being given is C:\Temp\Newfolder. From what I can tell, the only process using the file is this script, but with open(...) should be closing the file when the processing is complete. I'm looking for suggestions on what could be done differently.
import os
import zipfile
import time

def add_list(filePathList, filePath):
    filePathList.append(filePath)
    return filePathList

def deleteFiles(filePaths):
    for path in filePaths:
        os.remove(path)

for file_path in file_paths:
    if os.path.isdir(file_path) and not os.path.splitext(file_path)[1]:
        # Compress the folder into a .zip file
        folder_name = os.path.basename(file_path)
        zip_file_path = os.path.join('C:\\Temp', folder_name + '.zip')
        with zipfile.ZipFile(zip_file_path, 'w') as zip_file:
            for root, dirs, files in os.walk(file_path):
                for filename in files:
                    file_to_zip = os.path.join(root, filename)
                    zip_file.write(file_to_zip, os.path.relpath(file_to_zip, file_path))
        # Update the file path to the zipped file
        file_path = zip_file_path
    # Create request body
    request_body = {
        'name': os.path.basename(file_path),
        'mimeType': 'application/zip' if file_path.endswith('.zip') else 'text/plain',
        'parents': [parent_id],
        'supportsAllDrives': True
    }
    # Open the file and execute the request
    with open(file_path, "rb") as f:
        media_file = MediaFileUpload(file_path, mimetype='application/zip' if file_path.endswith('.zip') else 'text/plain')
        upload_file = service.files().create(body=request_body, media_body=media_file, supportsAllDrives=True).execute()
        # Print the response
        print(upload_file)
    if file_path.endswith('.zip'):
        add_list(filePathList, file_path)

time.sleep(10)
deleteFiles(filePathList)
You're using with statements to properly close the files you open, but you're not actually using those open files; you're just passing the path to another API that is presumably opening the file for you under the hood. Check the documentation for the APIs here:
media_file = MediaFileUpload(file_path, mimetype='application/zip' if file_path.endswith('.zip') else 'text/plain')
upload_file = service.files().create(body=request_body, media_body=media_file, supportsAllDrives=True).execute()
to figure out whether MediaFileUpload objects, or the various things done with .create/.execute that use them, provide some mechanism for ensuring deterministic resource cleanup. As is, if MediaFileUpload opens the file and nothing inside .create/.execute explicitly closes it, at least one such file is guaranteed to still be open when you try to remove them at the end, and possibly more than one if reference cycles are involved or you're on an alternate Python interpreter, which would cause exactly the problem you see on Windows. I can't say what might or might not be required, because you don't show the implementation or specify the package it comes from.
Even if you are careful not to hold open handles yourself, there are cases where other processes can lock the file and you need to retry a few times. Virus scanners can cause this issue by attempting to scan the file as you're deleting it; properly implemented there'd be no problem, but they're software written by humans, and therefore some of them will misbehave. It's particularly common for freshly created files (virus scanners are innately suspicious of any new file, especially compressed ones), and as it happens, you're creating fresh zip files and deleting them in short order, so you may be stuck retrying a few times.
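If the upload comes from google-api-python-client (an assumption; the question doesn't name the package), here is a sketch of both workarounds: wrapping a handle you open yourself in MediaIoBaseUpload so the with statement really controls its lifetime, and retrying the delete to ride out transient locks:

import os
import time

# Assumption: the Drive client is google-api-python-client
from googleapiclient.http import MediaIoBaseUpload

def upload_with_managed_handle(service, request_body, file_path):
    # We open the file ourselves, so leaving the with block
    # deterministically releases the handle before any delete.
    mimetype = 'application/zip' if file_path.endswith('.zip') else 'text/plain'
    with open(file_path, "rb") as f:
        media_file = MediaIoBaseUpload(f, mimetype=mimetype)
        return service.files().create(
            body=request_body, media_body=media_file, supportsAllDrives=True
        ).execute()

def remove_with_retries(path, attempts=5, delay=2.0):
    # Retry deletion in case a virus scanner briefly locks the fresh file.
    for attempt in range(attempts):
        try:
            os.remove(path)
            return
        except PermissionError:
            if attempt == attempts - 1:
                raise
            time.sleep(delay)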

Retrieve file from FTP, unzip file, save extract file to Amazon S3 bucket

I am attempting to retrieve a file from FTP and save it to an S3 bucket within a Lambda function.
I can confirm the first part of the code works, as I can see the list of files printed to CloudWatch logs.
import ftplib
from ftplib import FTP
import zipfile
import boto3
s3 = boto3.client('s3')
S3_OUTPUT_BUCKETNAME = 'my-s3bucket'
ftp = FTP('ftp.godaddy.com')
ftp.login(user='auctions', passwd='')
ftp.retrlines('LIST')
The next part was resulting in the following error:
module initialization error: [Errno 30] Read-only file system: 'tdnam_all_listings.csv.zip'
However, I managed to overcome this by adding '/tmp' to the file location, as per the following code:
fileName = 'all_expiring_auctions.json.zip'
with open('/tmp/fileName', 'wb') as file:
    ftp.retrbinary('RETR ' + fileName, file.write)
Next, I am attempting to unzip the file from the temporary location:
with zipfile.ZipFile('/tmp/fileName', 'r') as zip_ref:
    zip_ref.extractall('')
Finally, I am attempting to save the file to a particular 'folder' in the S3 bucket, as follows:
data = open('/tmp/all_expiring_auctions.json')
s3.Bucket('brnddmn-s3').upload_fileobj('data','my-s3bucket/folder/')
The code produces no errors that I can see in the log; however, the unzipped file is not reaching the destination despite my efforts.
Any help greatly appreciated.
Firstly, you have to use the /tmp directory for working with files in Lambda; it is the only writable location. ZipFile.extractall('') will create the extract in your current working directory, though, assuming the zip content is a simple plain text file with no relative path. To create the extract in the /tmp directory, use
zip_ref.extractall('/tmp')
I'm not sure why there are no errors logged. data = open(...) should throw an error if no file is found. If required, you can explicitly print whether the file exists:
import os
print(os.path.exists('/tmp/all_expiring_auctions.json'))  # True/False
Finally, once you have ensured the file exists, the argument to Bucket() should be the bucket name; it's not clear whether yours is 'brnddmn-s3' or 'my-s3bucket'. Note that Bucket() is available on an S3 resource (boto3.resource('s3')), not on the client created at the top of your code. Also, the first argument to upload_fileobj() should be a file object opened in binary mode, i.e., data rather than the string 'data', and the second argument should be the object key (the filename in S3), not a folder name.
Putting it together, the last lines should look like this:
S3_OUTPUT_BUCKETNAME = 'my-s3bucket'  # Replace with your S3 bucket name
s3 = boto3.resource('s3')
data = open('/tmp/all_expiring_auctions.json', 'rb')
s3.Bucket(S3_OUTPUT_BUCKETNAME).upload_fileobj(data, 'folder/all_expiring_auctions.json')
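For completeness, a sketch of the whole flow with these fixes applied. Note that the question's code opens the literal path '/tmp/fileName' rather than using the fileName variable; the sketch assumes the variable was intended:

from ftplib import FTP
import zipfile
import boto3

S3_OUTPUT_BUCKETNAME = 'my-s3bucket'  # assumption: your real bucket name
fileName = 'all_expiring_auctions.json.zip'

def lambda_handler(event, context):
    ftp = FTP('ftp.godaddy.com')
    ftp.login(user='auctions', passwd='')
    # /tmp is the only writable filesystem in Lambda
    with open('/tmp/' + fileName, 'wb') as f:
        ftp.retrbinary('RETR ' + fileName, f.write)
    with zipfile.ZipFile('/tmp/' + fileName, 'r') as zip_ref:
        zip_ref.extractall('/tmp')
    s3 = boto3.resource('s3')
    with open('/tmp/all_expiring_auctions.json', 'rb') as data:
        s3.Bucket(S3_OUTPUT_BUCKETNAME).upload_fileobj(
            data, 'folder/all_expiring_auctions.json')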

python unable to extract zip file uploaded to aws s3 bucket

medias = ['https://baby-staging-bucket.s3.us-east-2.amazonaws.com/asset/0002.jpg',
          'https://baby-staging-bucket.s3.us-east-2.amazonaws.com/asset/2.png',
          'https://baby-staging-bucket.s3.us-east-2.amazonaws.com/asset/02.png'
          ]
for i in medias:
    file_name = i.split("/")[-1]
    urllib.urlretrieve(i, "media/" + file_name)

# writing files to a zipfile
local_os_path = f'media/{title}.zip'
with ZipFile(local_os_path, 'w') as zip:
    # writing each file one by one
    for file in medias:
        file_name = file.split("/")[-1]
        zip.write("media/" + file_name)
        os.remove("media/" + file_name)
    s3 = session.resource('s3')
    storage_path = f'asset/nfts/zip/{title}.zip'
    s3.meta.client.upload_file(Filename=local_os_path, Bucket=AWS_STORAGE_BUCKET_NAME, Key=storage_path)
    # os.remove(local_os_path)
    DesignerProduct.objects.filter(id=instance.id).update(
        zip_file_path=S3_BUCKET_URL + storage_path,
    )
I am using this code to create a zip file and save it to an S3 bucket.
First I download the files to the local system, then zip them all and save the zip file to the S3 bucket.
On my local system I am able to extract the zip file, but when I download it from the S3 bucket I am not able to extract it.
https://baby-staging-bucket.s3.us-east-2.amazonaws.com/asset/nfts/zip/ant.zip
This is the S3 path where the zip file was uploaded.
What could be the reason? Please take a look.
Move the upload after the with block.
You are uploading your zipfile before the archive is closed.
See ZipFile.close():
Close the archive file. You must call close() before exiting your program or essential records will not be written.
close() is called automatically by the with statement.
You open your local copy after the program exits - which means after the zipfile has been closed - so your local version is not corrupted.
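Applied to the code above, that means dedenting the upload so it runs only once the archive has been flushed to disk (a sketch using the question's names):

with ZipFile(local_os_path, 'w') as zip:
    for file in medias:
        file_name = file.split("/")[-1]
        zip.write("media/" + file_name)
        os.remove("media/" + file_name)

# The archive is closed here, so its central directory has been written.
s3 = session.resource('s3')
storage_path = f'asset/nfts/zip/{title}.zip'
s3.meta.client.upload_file(Filename=local_os_path, Bucket=AWS_STORAGE_BUCKET_NAME, Key=storage_path)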

Is it possible to search in files on FTP in Python?

right now this is all I have:
import ftputil

a_host = ftputil.FTPHost("ftp_host", "username", "pass")  # log in to FTP
for (dirname, subdirs, files) in a_host.walk("/"):  # walk the directory tree
    for f in files:
        fullpath = a_host.path.join(dirname, f)
        if fullpath.endswith('html'):
            pass  # stuck here
So I can log in to my FTP and .walk through my files. What I am not able to manage: when .walk finds an HTML file, I also want to search it for a string.
For example: on my FTP there are an index.html and a something.txt file. I want .walk to find the index.html file, and then to search index.html for 'my string'.
Thanks.
FTP is a protocol for file transfer only. By itself it has no ability to execute remote commands, which would be needed to search files on the remote server (there is a SITE command, but it usually cannot be used for such a purpose because it is either not implemented or restricted to a few commands).
This means your only option with FTP is to download the file and search it locally, i.e., transfer the file to the local system, open it there, and look for the string.
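With ftputil you don't have to manage a temporary file yourself: FTPHost.open() returns a file-like object whose contents are transferred as you read them. A sketch, assuming the HTML files are text and a plain substring match is enough:

import ftputil

search_string = "my string"  # assumption: the text you are looking for

with ftputil.FTPHost("ftp_host", "username", "pass") as a_host:
    for dirname, subdirs, files in a_host.walk("/"):
        for f in files:
            fullpath = a_host.path.join(dirname, f)
            if fullpath.endswith(".html"):
                # Reading the remote file downloads its contents on the fly.
                with a_host.open(fullpath, "r") as remote_file:
                    if search_string in remote_file.read():
                        print("found in", fullpath)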

How to get the path of the posted file in Python

I am getting a file posted from a form:
file = request.post['ufile']
I want to get its path. How can I get it?
You have to use the request.FILES dictionary.
Check the official documentation about the UploadedFile object: you can use the UploadedFile.temporary_file_path() method, but beware that only files uploaded to disk expose it (that is, normally, when the TemporaryFileUploadHandler upload handler is used).
upload = request.FILES['ufile']
path = upload.temporary_file_path()
In the normal case, though, you would want to use the file handle directly:
upload = request.FILES['ufile']
content = upload.read()  # For small files
# ... or ...
for chunk in upload.chunks():
    do_something_with_chunk(chunk)  # For bigger files
You should use request.FILES['ufile'].file.name.
You will get something like /var/folders/v7/1dtcydw51_s1ydkmypx1fggh0000gn/T/tmpKGp4mX.upload.
Note that file.name only works when the uploaded file is bigger than 2.5 MB; smaller uploads are kept in memory rather than on disk. If you want to change this, see File Upload Settings.
We cannot get the file path from the POST request, only the filename, because the server has no access to the client's file system. If you need to get the file and perform some operations on it, you can create a temp directory and save the file there; that also gives you a path.
import tempfile
import shutil

dirpath = tempfile.mkdtemp()
# perform some operations if needed
shutil.rmtree(dirpath)  # remove the temp directory
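A sketch of that approach in a Django view, assuming the field is named ufile as above (handle_upload and its return value are illustrative, not a fixed API):

import os
import tempfile

def handle_upload(request):
    upload = request.FILES['ufile']
    dirpath = tempfile.mkdtemp()
    saved_path = os.path.join(dirpath, upload.name)
    # Stream the upload to disk chunk by chunk so large files
    # don't have to fit in memory.
    with open(saved_path, 'wb') as destination:
        for chunk in upload.chunks():
            destination.write(chunk)
    return saved_path  # a real filesystem path you can now operate on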
