how to zip all pdf files under a static folder? django

how to zip all pdf files under a static folder? django - python

I have a folder named pdfs under static folder.
I am trying to have a returned zip which contains all the pdf files in the pdfs folder.
I have tried a few threads and used their codes, but I tried to workout things but then couldn't solve the last part that I get a message saying no file / directory
I know static folders are a bit different than usual folders.
can someone please give me a hand and see what I have missed?
Thanks in advance
from StringIO import StringIO
import zipfile
pdf_list = os.listdir(pdf_path)
print('###pdf list################################')
print(pdf_path) # this does show me the whole path up to the pdfs folder
print(pdf_list) # returns ['abc.pdf', 'efd.pdf']
zip_subdir = "somefiles"
zip_filename = "%s.zip" % zip_subdir
# Open StringIO to grab in-memory ZIP contents
s = StringIO()
# Grab ZIP file from in-memory, make response with correct MIME-type
resp = HttpResponse(content_type='application/zip')
# ..and correct content-disposition
resp['Content-Disposition'] = 'attachment; filename=%s' % zip_filename
# The zip compressor
zf = zipfile.ZipFile(s, "w")
for pdf_file in pdf_list:
print(pdf_file)
zf.write(pdf_file, pdf_path + pdf_file)
zf.writestr('file_name.zip', pdf_file.getvalue())
zf.close()
return resp
here I am getting errors for not able to find file / directory for 'abc.pdf'
P.S. I don't really need any sub folders zipped into the zip file. As long as all files are inside the zip, it'll be all good. (There won't be any sub folders in the pdfs folder)

I solved it myself and made it into a function with comments.
complicated things myself earlier
# two params
# 1. the directory where files want to be zipped
# e.g. of file directory is /et/ubuntu/vanfruits/vanfruits/static/pdfs/
# 2. filename of the zip file
def render_respond_zip(self, file_directory, zip_file_name):
response = HttpResponse(content_type='application/zip')
response['Content-Disposition'] = 'attachment; filename=' + zip_file_name
# open a file, writable
zip = ZipFile(response, 'w')
# loop through the directory provided
for single_file in os.listdir(file_directory):
# open the file, full path to the file including file name and extension is needed as first param
f = open(file_directory + single_file, 'r')
# write the file into the zip with
# first param is the name of the file inside the zip
# second param is read the file
zip.writestr(single_file, f.read())
zip.close()
return response

Related

Creating a zip file in stream without directory structure

I'm working on a Flask app to return a zip file to a user of a directory (a bunch of photos) but I don't want to include my server directory structure in the returned zip. Currently, I have this :
def return_zip():
dir_to_send = '/dir/to/the/files'
base_path = pathlib.Path(dir_to_send)
data = io.BytesIO()
with zipfile.ZipFile(data, mode='w') as z:
for f_name in base_path.iterdir():
z.write(f_name)
data.seek(0)
return send_file(data, mimetype='application/zip', as_attachment=True, attachment_filename='data.zip')
This works great for creating and returning the zip but the file includes the structure of my server i.e.
/dir/to/the/files/image.jpg, image1.jpg etc...
In the zip I just want the files, not their associated directory. How could I go about this?
Thanks!

We can use the arcname parameter to rename the file in the zip file. We can remove the directory structure by simply passing the write function the name of the file you want to add, this can be accomplished as follows:
def return_zip():
dir_to_send = '/dir/to/the/files'
base_path = pathlib.Path(dir_to_send)
data = io.BytesIO()
with zipfile.ZipFile(data, mode='w') as z:
for f_name in base_path.iterdir():
z.write(f_name, arcname=f_name.name)
data.seek(0)
return send_file(data, mimetype='application/zip', as_attachment=True, attachment_filename='data.zip')
You can read more about the details of zipfile write function here: https://docs.python.org/3/library/zipfile.html#zipfile.ZipFile.write

How to avoid saving subfolders when saving a file?

The function creates a folder and saves a file into it. Then the folder is packed into a rar archive and sent to the user, and the newly created folder and archive are deleted from the server after.
code.py
new_file_name = self.generate_file_name(rfi, vendor, current_scoring_round)
path_to_temp_folder = os.path.dirname(BASE_DIR)
if not os.path.exists(f'{path_to_temp_folder}/temp_folder'):
pathlib.Path(f'{path_to_temp_folder}/temp_folder').mkdir(parents=True, exist_ok=True)
wb.save(f'{path_to_temp_folder}/temp_folder/{new_file_name}') #save xlsx file from openpyxl library
archive = self.generate_zip_name(rfi) # generate name for archive
to_rar = f'{path_to_temp_folder}/temp_folder'
patoolib.create_archive(archive, (to_rar,)) # patoolib - to .rar converter
to_download = f'{path_to_temp_folder}/{archive}'
if os.path.exists(to_download):
try:
with open(to_download, 'rb') as fh:
response = HttpResponse(fh.read(),
content_type="content_type='application/vnd.rar'")
response['Content-Disposition'] = 'attachment; filename= "{}"'.format(archive)
return response
finally:
shutil.rmtree(to_rar, ignore_errors=True)
default_storage.delete(to_download)
Everything work, but the problem is that the downloaded archive contains subfolders - paths to the saved file.
Expected result:
folder.rar
file.xlsx
Actual result:
folder.rar
/home
/y700
/projects
file.xlsx

The documentation to patool is minimal. It certainly seems to suggest that this should be possible by passing the path to the file in the create-archive command. I've tried this though, and it appears not.
So the only option, probably, is to change the working directory to the location of the test.xlsx file:
import patoolib
import os
new_file_name = self.generate_file_name(rfi, vendor, current_scoring_round)
path_to_temp_folder = os.path.dirname(BASE_DIR)
if not os.path.exists(f'{path_to_temp_folder}/temp_folder'):
pathlib.Path(f'{path_to_temp_folder}/temp_folder').mkdir(parents=True, exist_ok=True)
wb.save(f'{path_to_temp_folder}/temp_folder/{new_file_name}') #save xlsx file from openpyxl library
archive = self.generate_zip_name(rfi) # generate name for archive
to_rar = f'{path_to_temp_folder}/temp_folder'
cwd=os.getcwd()
os.chdir('to_rar')
patoolib.create_archive(cwd+archive, ({new_file_name},)) # patoolib - to .rar converter
os.chdir('cwd')
to_download = f'{path_to_temp_folder}/{archive}'
if os.path.exists(to_download):
try:
with open(to_download, 'rb') as fh:
response = HttpResponse(fh.read(),
content_type="content_type='application/vnd.rar'")
response['Content-Disposition'] = 'attachment; filename= "{}"'.format(archive)
return response
finally:
shutil.rmtree(to_rar, ignore_errors=True)
default_storage.delete(to_download)
This works on my system, for example, and I get a single file in the archive (using tar, because I don't have rar installed):
import patoolib
import os
cwd=os.getcwd()
os.chdir('foo/bar/baz/qux/')
patoolib.create_archive(cwd+'/foo.tar.gz',('test.txt',))
os.chdir(cwd)
Note that you should really use os.path.join rather than concatenating strings, but this was just a quick & dirty test.

gzip multiple files in python

I have to compress a lot of XML files into and split them by the data in the file name, just for clarification's sake, there is a parser which collects information from XML file and then moves it to a backup folder. My code needs to gzip it according to the date in the filename and group those files in a compressed .gz file.
Please find the code bellow:
import os
import re
import gzip
import shutil
import sys
import time
#
timestr = time.strftime("%Y%m%d%H%M")
logfile = 'D:\\Coleta\\log_compactador_xml_tar'+timestr+'.log'
ptm_dir = "D:\\PTM\\monitored_programs\\"
count_files_mdc = 0
count_files_3gpp = 0
count_tar = 0
#
for subdir, dir, files in os.walk(ptm_dir):
for file in files:
path = os.path.join(subdir, file)
try:
backup_files_dir = path.split(sep='\\')[4]
parser_id = path.split(sep='\\')[3]
if re.match('backup_files_*', backup_files_dir):
if file.endswith('xml'):
# print(time.strftime("%Y-%m-%d %H:%M:%S"), path)
data_arq = file[1:14]
if parser_id in ('parser-924'):
gzip_filename_mdc = os.path.join(subdir,'E4G_PM_MDC_IP51_'+timestr+'_'+data_arq)
with open(path, 'r')as f_in, gzip.open(gzip_filename_mdc + ".gz", 'at') as f_out_mdc:
shutil.copyfileobj(f_in, f_out_mdc)
count_files_mdc += 1
f_out_mdc.close()
f_in.close()
print(time.strftime("%Y-%m-%d %H:%M:%S"), "Compressing file MDC: ",path)
os.remove(path)
except PermissionError:
print(time.strftime("%Y-%m-%d %H:%M:%S"), "Permission error on file:", fullpath, file=logfile)
pass
except IndexError:
print(time.strftime("%Y-%m-%d %H:%M:%S"), "IndexError: ", path, file=logfile)
pass
As long as I seem it creates a stream of data, then compress and write it to a new file with the specified filename. However, instead of grouping each XML file independently inside a ".gz" file, it does creates inside the "gzip" file, a big file (big stream of data?) with the same name of the output "gzip" file, but without any extension. After the files are totally compressed, it's not possible to uncompress the big file generated inside the "gzip" output file. Does someone know where is the problem with my code?
PS: I have edited the code for readability purposes.

Not sure whether the solution is still needed, but I will just leave it here for anyone who faces the same issue.
There is a way to create a gzip archive in python using tarfile, the code is quite simple:
with tarfile.open(filename, mode="w:gz") as archive:
archive.add(name=name_of_file_to_add, recursive=True)
in this case name_of_file_to_add can be a directory, in which case tarfile will add it recursively with all its contents. Obviously you will need to import the tarfile module.
If you need to add files without a directory a simple for with calls to add will do (recursive flag is not required in this case).

Cant change string value to variable with string value

I upload file to dropbox api, but it post on dropbox all directories from my computer since root folder. I mean you have folder of your project inside folder home, than user until you go to file sours folder. If I cut that structure library can't see that it is file, not string and give mistake message.
My code is:
def upload_file(project_id, filename, dropbox_token):
dbx = dropbox.Dropbox(dropbox_token)
file_path = os.path.abspath(filename)
with open(filename, "rb") as f:
dbx.files_upload(f.read(), file_path, mute=True)
link = dbx.files_get_temporary_link(path=file_path).link
return link
It works, but I need something like:
file_path = os.path.abspath(filename)
chunks = file_path.split("/")
name, dir = chunks[-1], chunks[-2]
which gives me mistake like:
dropbox.exceptions.ApiError: ApiError('433249b1617c031b29c3a7f4f3bf3847', GetTemporaryLinkError('path', LookupError('not_found', None)))
How could I make only parent folder and filename in the path?
For example if I have
/home/user/project/file.txt
I need
/project/file.txt

you have /home/user/project/file.txt and you need /project/file.txt
I would split according to os default separator (so it would work with windows paths as well), then reformat only the 2 last parts with the proper format (sep+path) and join that.
import os
#os.sep = "/" # if you want to test that on Windows
s = "/home/user/project/file.txt"
path_end = "".join(["{}{}".format(os.sep,x) for x in s.split(os.sep)[-2:]])
result:
/project/file.txt

I assume the following code should works:
def upload_file(project_id, filename, dropbox_token):
dbx = dropbox.Dropbox(dropbox_token)
abs_path = os.path.abspath(filename)
directory, file = os.path.split(abs_path)
_, directory = os.path.split(directory)
dropbox_path = os.path.join(directory, file)
with open(abs_path, "rb") as f:
dbx.files_upload(f.read(), dropbox_path, mute=True)
link = dbx.files_get_temporary_link(path=dropbox_path).link
return link

Python create zip file

I use the following script to create ZIP files:
import zipfile
import os
def zip_file_generator(filenames, size):
# filenames = path to the files
zip_subdir = "SubDirName"
zip_filename = "SomeName.zip"
# Open BytesIO to grab in-memory ZIP contents
s = io.BytesIO()
# The zip compressor
zf = zipfile.ZipFile(s, "w")
for fpath in filenames:
# Calculate path for file in zip
fdir, fname = os.path.split(fpath)
zip_path = os.path.join(zip_subdir, fname)
# Add file, at correct path
zf.write(fpath, zip_path)
# Must close zip for all contents to be written
zf.close()
# Grab ZIP file from in-memory, make response with correct MIME-type
resp = HttpResponse(s.getvalue(), content_type = "application/x-zip-compressed")
# ..and correct content-disposition
resp['Content-Disposition'] = 'attachment; filename=%s' % zip_filename
resp['Content-length'] = size
return resp
I got it from here.
But I changed s = StringIO.StringIO() to s = io.BytesIO() because I am using Python 3.x.
The zip file does get created with the right size etc. But I can't open it. It is invalid. If I write the zip file to disk the zip file is valid.

I got it working. Just change size in resp['Content-length'] = size to s.tell()

I use shutil to zip like so:
import shutil
shutil.make_archive(archive_name, '.zip', folder_name)

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

how to zip all pdf files under a static folder? django - python

Related

Creating a zip file in stream without directory structure

How to avoid saving subfolders when saving a file?

gzip multiple files in python

Cant change string value to variable with string value

Python create zip file

Categories

Resources