Creating a zip file in stream without directory structure - python

I'm working on a Flask app that returns a zip file of a directory (a bunch of photos) to the user, but I don't want to include my server's directory structure in the returned zip. Currently, I have this:
def return_zip():
    dir_to_send = '/dir/to/the/files'
    base_path = pathlib.Path(dir_to_send)
    data = io.BytesIO()
    with zipfile.ZipFile(data, mode='w') as z:
        for f_name in base_path.iterdir():
            z.write(f_name)
    data.seek(0)
    return send_file(data, mimetype='application/zip', as_attachment=True, attachment_filename='data.zip')
This works great for creating and returning the zip but the file includes the structure of my server i.e.
/dir/to/the/files/image.jpg, image1.jpg etc...
In the zip I just want the files, not their associated directory. How could I go about this?
Thanks!

You can use the arcname parameter of ZipFile.write to set the name a file gets inside the zip. Passing only the file's base name (f_name.name) as arcname drops the directory structure:
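import io
import pathlib
import zipfile
from flask import send_file

def return_zip():
    dir_to_send = '/dir/to/the/files'
    base_path = pathlib.Path(dir_to_send)
    data = io.BytesIO()
    with zipfile.ZipFile(data, mode='w') as z:
        for f_name in base_path.iterdir():
            # arcname is the name stored in the zip; f_name.name is the bare file name
            z.write(f_name, arcname=f_name.name)
    data.seek(0)
    # note: Flask 2.0 renamed attachment_filename to download_name
    return send_file(data, mimetype='application/zip', as_attachment=True, attachment_filename='data.zip')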
You can read more about the details of zipfile write function here: https://docs.python.org/3/library/zipfile.html#zipfile.ZipFile.write
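To try the arcname behaviour without a Flask app, here is a minimal standard-library sketch. The helper name zip_dir_flat is made up for illustration, and it assumes you only want the regular files directly inside the directory (subdirectories are skipped):

```python
import io
import pathlib
import zipfile


def zip_dir_flat(directory):
    """Return an in-memory zip of the regular files directly inside `directory`."""
    data = io.BytesIO()
    with zipfile.ZipFile(data, mode="w") as z:
        for path in sorted(pathlib.Path(directory).iterdir()):
            if path.is_file():  # skip subdirectories
                # arcname is the name stored in the archive; using only the
                # base name keeps the server's directory layout out of the zip
                z.write(path, arcname=path.name)
    data.seek(0)  # rewind so the caller can read from the start
    return data
```

The returned buffer could then be handed to send_file in the same way as data in the answer above.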

Related

How to make this code to take multiple yaml input files and convert them to .json and save all of them to specified path

def convert(): #function to convert .yaml to .json
in_file = filepath #assigning input file from GUI to variable
out_file = savepath #assigning output path from GUI to variable
yaml = ruamel.yaml.YAML(typ='safe')
with open(in_file ) as i: #opening yaml file
data = yaml.load(i)
with open(out_file, 'w') as o: #writing json file
json.dump(data, o, indent=2)
Your code is not formatted properly. But let's say your code looks like:
def convert():
    in_file = filepath  # assigning input file from GUI to variable
    out_file = savepath  # assigning output path from GUI to variable
    yaml = ruamel.yaml.YAML(typ='safe')
    with open(in_file) as i:  # opening yaml file
        data = yaml.load(i)
    with open(out_file, 'w') as o:  # writing json file
        json.dump(data, o, indent=2)
You need to change some of the logic here:
You need a collection of file names instead of a single file name. This can be a list of paths obtained from glob; you then loop over it to get each individual file name.
Since you have multiple files, you cannot give one output file name and save every result under it. The simplest approach (assuming you are converting yaml files to json files) is to replace each file's extension and keep the rest of its name.
Getting file names:
The glob module finds all the pathnames matching a specified pattern
according to the rules used by the Unix shell, although results are
returned in arbitrary order.
See the documentation of the glob module.
Now let's use it:
from glob import glob
files = glob("/your/path/*.yaml")
for file in files:
    print(file)
Here file is the path of each yaml file. You can now open or use that path inside the loop and do whatever you want with it.
Saving files
To save the file as json you can:
Add .json to the file name
Basically use + or fstrings to add .json.
json_file_name = file + ".json"
or
json_file_name = f"{file}.json"
Be aware that the resulting names may look odd: my_file.yaml becomes my_file.yaml.json.
Replace yaml with json
Since file paths are strings, you can use the replace method to swap yaml for json.
json_file_name = file.replace("yaml", "json")
The resulting name looks better, but replace changes every occurrence of yaml in the path, not just the extension. So if the file name itself contains yaml, that part is changed as well.
For example, my_yaml_file.yaml becomes my_json_file.json.
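A third option that avoids both pitfalls is pathlib's with_suffix, which swaps only the final extension. The sketch below is one possible way to pair each input with an output path; the helper names json_path_for and plan_conversions are hypothetical, as is the idea of a separate output directory:

```python
from pathlib import Path


def json_path_for(yaml_path, out_dir):
    """Map a YAML file path to a .json path with the same stem inside out_dir."""
    # with_suffix swaps only the final extension, so "yaml" elsewhere in
    # the name is left untouched
    return Path(out_dir) / Path(yaml_path).with_suffix(".json").name


def plan_conversions(in_dir, out_dir):
    """Pair every *.yaml file in in_dir with its target .json path."""
    sources = sorted(Path(in_dir).glob("*.yaml"))
    return [(src, json_path_for(src, out_dir)) for src in sources]
```

Each (source, target) pair can then be fed to the same load and dump calls used in convert().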

How to avoid saving subfolders when saving a file?

The function creates a folder and saves a file into it. Then the folder is packed into a rar archive and sent to the user, and the newly created folder and archive are deleted from the server after.
code.py
new_file_name = self.generate_file_name(rfi, vendor, current_scoring_round)
path_to_temp_folder = os.path.dirname(BASE_DIR)
if not os.path.exists(f'{path_to_temp_folder}/temp_folder'):
    pathlib.Path(f'{path_to_temp_folder}/temp_folder').mkdir(parents=True, exist_ok=True)
wb.save(f'{path_to_temp_folder}/temp_folder/{new_file_name}') #save xlsx file from openpyxl library
archive = self.generate_zip_name(rfi) # generate name for archive
to_rar = f'{path_to_temp_folder}/temp_folder'
patoolib.create_archive(archive, (to_rar,)) # patoolib - to .rar converter
to_download = f'{path_to_temp_folder}/{archive}'
if os.path.exists(to_download):
    try:
        with open(to_download, 'rb') as fh:
            response = HttpResponse(fh.read(),
                                    content_type="content_type='application/vnd.rar'")
            response['Content-Disposition'] = 'attachment; filename= "{}"'.format(archive)
            return response
    finally:
        shutil.rmtree(to_rar, ignore_errors=True)
        default_storage.delete(to_download)
Everything works, but the problem is that the downloaded archive contains subfolders: the full path to the saved file.
Expected result:
folder.rar
    file.xlsx
Actual result:
folder.rar
    /home
        /y700
            /projects
                file.xlsx
The patool documentation is minimal. It seems to suggest that this should be possible by passing the path to the file in the create_archive call, but I tried that and it does not appear to work.
So probably the only option is to change the working directory to the folder containing the xlsx file before creating the archive:
import patoolib
import os
new_file_name = self.generate_file_name(rfi, vendor, current_scoring_round)
path_to_temp_folder = os.path.dirname(BASE_DIR)
if not os.path.exists(f'{path_to_temp_folder}/temp_folder'):
    pathlib.Path(f'{path_to_temp_folder}/temp_folder').mkdir(parents=True, exist_ok=True)
wb.save(f'{path_to_temp_folder}/temp_folder/{new_file_name}') #save xlsx file from openpyxl library
archive = self.generate_zip_name(rfi) # generate name for archive
to_rar = f'{path_to_temp_folder}/temp_folder'
cwd = os.getcwd()
os.chdir(to_rar) # work inside the folder so the file is added by its bare name
patoolib.create_archive(os.path.join(cwd, archive), (new_file_name,)) # patoolib - to .rar converter
os.chdir(cwd) # go back to the original working directory
to_download = f'{path_to_temp_folder}/{archive}'
if os.path.exists(to_download):
    try:
        with open(to_download, 'rb') as fh:
            response = HttpResponse(fh.read(),
                                    content_type='application/vnd.rar')
            response['Content-Disposition'] = 'attachment; filename= "{}"'.format(archive)
            return response
    finally:
        shutil.rmtree(to_rar, ignore_errors=True)
        default_storage.delete(to_download)
This works on my system, for example, and I get a single file in the archive (using tar, because I don't have rar installed):
import patoolib
import os
cwd=os.getcwd()
os.chdir('foo/bar/baz/qux/')
patoolib.create_archive(cwd+'/foo.tar.gz',('test.txt',))
os.chdir(cwd)
Note that you should really use os.path.join rather than concatenating strings, but this was just a quick & dirty test.
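One weakness of a bare os.chdir pair is that an exception in between leaves the process in the wrong directory. A small context manager is one way to guard against that; the name working_directory is made up here, and on Python 3.11+ the standard library's contextlib.chdir does the same job:

```python
import contextlib
import os


@contextlib.contextmanager
def working_directory(path):
    """Temporarily switch the process working directory to `path`."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # runs even if the body raises, so the caller's cwd is always restored
        os.chdir(previous)
```

The archive step would then sit inside a "with working_directory(to_rar):" block, with the archive path computed beforehand so it does not depend on the temporary directory.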

Extracting Zip From HTTP Response in Python

I'm able to get the zip from an HTTPS response and extract it into a specific folder using the code snippet below:
z = zipfile.ZipFile(io.BytesIO(statement_resp.content))
z.extractall("/pathtostore")
However, in /pathtostore the zip gets extracted under some random name. Is there a way to control the name of the extracted zip folder during extraction itself?
Currently, after zip extraction, below is the directory structure:
/pathtostore/ZaXyzzz
--> ZaXyzzz is the zip name.
I'm looking for something as below:
/pathtostore/1234_2020_03_02
--> 1234_2020_03_02 (cid_curdate) is the zip name which I want.
PS: I cannot read the zip and rename it as there could be multiple zip present inside /pathtostore
You can get the member names with z.namelist(), read each file separately with z.read(), and write it under a new name using the standard open(), write(), close().
Minimal example:
It may need more code if the zip file contains folders.
import zipfile
import datetime
import os
z = zipfile.ZipFile('input.zip')
folder = '/pathtostore'
os.makedirs(folder, exist_ok=True)
today = datetime.date.today().strftime('%Y_%m_%d')
cid = 0
for old_name in z.namelist():
    cid += 1
    new_name = os.path.join(folder, '{:04}_{}'.format(cid, today))
    print(old_name, '->', new_name)
    data = z.read(old_name)
    with open(new_name, 'wb') as fh:
        fh.write(data)
Alternatively, you can iterate over the zip file's ZipInfo entries and modify each entry's filename attribute before extracting it:
import io
import zipfile
from pathlib import Path
z = zipfile.ZipFile(io.BytesIO(statement_resp.content))
for info in z.infolist():
    if info.is_dir():
        continue  # folders are created automatically when their files are extracted
    # implement your extraction policy here. Remove root
    # path name and add our own
    zip_path = Path(info.filename)
    info.filename = str(Path("1234_2020_03_02").joinpath(*zip_path.parts[1:]))
    z.extract(info, "/pathtostore")
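The ZipInfo approach can be wrapped in a self-contained function that runs without an HTTP response. This is only a sketch: the function name extract_with_root and its parameters are hypothetical, and it assumes every member sits under a single top-level folder (members at the top level are simply placed inside the new root):

```python
import io
import zipfile
from pathlib import PurePosixPath


def extract_with_root(zip_bytes, dest_dir, new_root):
    """Extract an in-memory zip into dest_dir, replacing its top-level folder name."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as z:
        for info in z.infolist():
            if info.is_dir():
                continue  # parent folders are created when files are extracted
            parts = PurePosixPath(info.filename).parts  # zip names use "/"
            # drop the archive's own root folder, if the member has one
            tail = parts[1:] if len(parts) > 1 else parts
            info.filename = str(PurePosixPath(new_root, *tail))
            z.extract(info, dest_dir)
```

With the question's names, the call would look like extract_with_root(statement_resp.content, "/pathtostore", "1234_2020_03_02").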

Cant change string value to variable with string value

I upload a file through the Dropbox API, but it is stored on Dropbox under the full directory structure of my computer, starting from the root folder: the home folder, then the user folder, and so on down to the folder that contains the file. If I trim that structure, the library no longer treats the value as a file, only as a string, and returns an error.
My code is:
def upload_file(project_id, filename, dropbox_token):
    dbx = dropbox.Dropbox(dropbox_token)
    file_path = os.path.abspath(filename)
    with open(filename, "rb") as f:
        dbx.files_upload(f.read(), file_path, mute=True)
    link = dbx.files_get_temporary_link(path=file_path).link
    return link
It works, but I need something like:
file_path = os.path.abspath(filename)
chunks = file_path.split("/")
name, dir = chunks[-1], chunks[-2]
which gives me an error like:
dropbox.exceptions.ApiError: ApiError('433249b1617c031b29c3a7f4f3bf3847', GetTemporaryLinkError('path', LookupError('not_found', None)))
How can I keep only the parent folder and the file name in the path?
For example if I have
/home/user/project/file.txt
I need
/project/file.txt
You have /home/user/project/file.txt and you need /project/file.txt.
I would split on the OS default separator (so it also works with Windows paths), keep only the last two parts, prefix each with the separator, and join the result.
import os
#os.sep = "/" # if you want to test that on Windows
s = "/home/user/project/file.txt"
path_end = "".join(["{}{}".format(os.sep,x) for x in s.split(os.sep)[-2:]])
result:
/project/file.txt
I assume the following code should work:
def upload_file(project_id, filename, dropbox_token):
    dbx = dropbox.Dropbox(dropbox_token)
    abs_path = os.path.abspath(filename)
    directory, file = os.path.split(abs_path)
    _, directory = os.path.split(directory)
    # Dropbox paths start with "/" and use forward slashes
    dropbox_path = "/{}/{}".format(directory, file)
    with open(abs_path, "rb") as f:
        dbx.files_upload(f.read(), dropbox_path, mute=True)
    link = dbx.files_get_temporary_link(path=dropbox_path).link
    return link
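The path trimming on its own can also be written with pathlib. This is a small sketch, assuming a POSIX-style input path that has at least one parent folder; the function name parent_and_name is made up for illustration:

```python
from pathlib import PurePosixPath


def parent_and_name(path):
    """Return "/<parent folder>/<file name>" for a POSIX-style path."""
    p = PurePosixPath(path)
    # p.parent.name is the last folder component, p.name is the file name
    return "/{}/{}".format(p.parent.name, p.name)
```

For the example in the question, parent_and_name("/home/user/project/file.txt") gives "/project/file.txt".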

how to zip all pdf files under a static folder? django

I have a folder named pdfs under static folder.
I am trying to have a returned zip which contains all the pdf files in the pdfs folder.
I have read a few threads and used their code, but I couldn't solve the last part: I get a message saying there is no such file or directory.
I know static folders are a bit different than usual folders.
can someone please give me a hand and see what I have missed?
Thanks in advance
from StringIO import StringIO
import zipfile
pdf_list = os.listdir(pdf_path)
print('###pdf list################################')
print(pdf_path) # this does show me the whole path up to the pdfs folder
print(pdf_list) # returns ['abc.pdf', 'efd.pdf']
zip_subdir = "somefiles"
zip_filename = "%s.zip" % zip_subdir
# Open StringIO to grab in-memory ZIP contents
s = StringIO()
# Grab ZIP file from in-memory, make response with correct MIME-type
resp = HttpResponse(content_type='application/zip')
# ..and correct content-disposition
resp['Content-Disposition'] = 'attachment; filename=%s' % zip_filename
# The zip compressor
zf = zipfile.ZipFile(s, "w")
for pdf_file in pdf_list:
    print(pdf_file)
    zf.write(pdf_file, pdf_path + pdf_file)
zf.writestr('file_name.zip', pdf_file.getvalue())
zf.close()
return resp
Here I get an error saying the file or directory 'abc.pdf' cannot be found.
P.S. I don't really need any sub folders zipped into the zip file. As long as all files are inside the zip, it'll be all good. (There won't be any sub folders in the pdfs folder)
I solved it myself and made it into a function with comments.
I had been overcomplicating things earlier.
# two params
# 1. the directory containing the files to be zipped
#    e.g. /et/ubuntu/vanfruits/vanfruits/static/pdfs/
# 2. filename of the zip file
def render_respond_zip(self, file_directory, zip_file_name):
    response = HttpResponse(content_type='application/zip')
    response['Content-Disposition'] = 'attachment; filename=' + zip_file_name
    # open the zip for writing, using the response as the output file
    zip_file = ZipFile(response, 'w')
    # loop through the directory provided
    for single_file in os.listdir(file_directory):
        # the full path, including file name and extension, is needed to open the file;
        # PDFs are binary, so read them in 'rb' mode
        with open(os.path.join(file_directory, single_file), 'rb') as f:
            # first param is the name of the file inside the zip
            # second param is the file's contents
            zip_file.writestr(single_file, f.read())
    zip_file.close()
    return response
