My code takes a list of file and folder paths, loops through them, and uploads each one to Google Drive. If a given path is a directory, the code creates a .zip file of it before uploading. Once the upload is complete, I need the code to delete the .zip file it created, but the deletion throws an error: PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\Temp\Newfolder.zip'. The file_path being given is C:\Temp\Newfolder. From what I can tell, the only process using the file is this script itself, but with open(...) should be closing the file when processing is complete. Looking for suggestions on what could be done differently.
import os
import zipfile
import time

def add_list(filePathList, filePath):
    filePathList.append(filePath)
    return filePathList

def deleteFiles(filePaths):
    for path in filePaths:
        os.remove(path)
for file_path in file_paths:
    if os.path.isdir(file_path) and not os.path.splitext(file_path)[1]:
        # Compress the folder into a .zip file
        folder_name = os.path.basename(file_path)
        zip_file_path = os.path.join('C:\\Temp', folder_name + '.zip')
        with zipfile.ZipFile(zip_file_path, 'w') as zip_file:
            for root, dirs, files in os.walk(file_path):
                for filename in files:
                    file_to_zip = os.path.join(root, filename)
                    zip_file.write(file_to_zip, os.path.relpath(file_to_zip, file_path))
        zip_file.close()
        # Update the file path to the zipped file
        file_path = zip_file_path
    # Create request body
    request_body = {
        'name': os.path.basename(file_path),
        'mimeType': 'application/zip' if file_path.endswith('.zip') else 'text/plain',
        'parents': [parent_id],
        'supportsAllDrives': True
    }
    # Open the file and execute the request
    with open(file_path, "rb") as f:
        media_file = MediaFileUpload(file_path, mimetype='application/zip' if file_path.endswith('.zip') else 'text/plain')
        upload_file = service.files().create(body=request_body, media_body=media_file, supportsAllDrives=True).execute()
    # Print the response
    print(upload_file)
    if file_path.endswith('.zip'):
        add_list(filePathList, file_path)
time.sleep(10)
deleteFiles(filePathList)
You're using with statements to close the files you open, but you never actually use those open files; you just pass the path to another API that is presumably opening the file again for you under the hood. Check the documentation for the APIs used here:
media_file = MediaFileUpload(file_path, mimetype='application/zip' if file_path.endswith('.zip') else 'text/plain')
upload_file = service.files().create(body=request_body, media_body=media_file, supportsAllDrives=True).execute()
to figure out whether MediaFileUpload objects, or the .create/.execute calls that consume them, provide some mechanism for deterministic resource cleanup. As written, if MediaFileUpload opens the file and nothing inside .create/.execute explicitly closes it, at least one handle to each zip is guaranteed to still be open when you try to remove the files at the end, and possibly more than one if reference cycles are involved or you're on an alternate Python interpreter. That is exactly the problem you see on Windows, which refuses to delete a file that is still open. I can't say what is or isn't required, because you don't show the implementation or name the package it comes from.
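If this is the MediaFileUpload from google-api-python-client (an assumption; the question never names the package), that library also provides MediaIoBaseUpload, which wraps a file object you open and close yourself, so the existing with block would actually control the handle's lifetime. A sketch:

from googleapiclient.http import MediaIoBaseUpload

mimetype = 'application/zip' if file_path.endswith('.zip') else 'text/plain'
with open(file_path, 'rb') as f:
    media_file = MediaIoBaseUpload(f, mimetype=mimetype)
    upload_file = service.files().create(
        body=request_body, media_body=media_file, supportsAllDrives=True
    ).execute()
# f is closed here, so this script no longer holds the .zip open at delete time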
Even if you are careful not to hold open handles yourself, there are cases where other processes can lock the file, and you need to retry a few times. Virus scanners are a common culprit: they may try to scan the file just as you're deleting it. A properly implemented scanner shouldn't cause problems, but scanners are software written by humans, and therefore some of them will misbehave. It's particularly common for freshly created files (scanners are innately suspicious of any new file, especially a compressed one), and as it happens, you're creating fresh zip files and deleting them in short order, so you may be stuck retrying a few times.
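When you do have to retry, a small helper is usually enough. A minimal sketch (the attempt count and delay are arbitrary choices):

import os
import time

def remove_with_retry(path, attempts=5, delay=1.0):
    # Retry deletion to ride out transient locks (e.g. a scanner touching the file).
    for attempt in range(attempts):
        try:
            os.remove(path)
            return
        except PermissionError:
            if attempt == attempts - 1:
                raise  # still locked after all attempts; surface the error
            time.sleep(delay)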
Related
My Flask app has a function that compresses log files in a directory into a zip file and then sends the file to the user to download. The compression works, except that when the client receives the zip file, it contains a series of folders matching the absolute paths of the original files that were zipped on the server. The zip file that was made in the server's static folder, however, does not.
Zipfile contents in static folder: "log1.bin, log2.bin"
Zipfile contents that was sent to user: "/home/user/server/data/log1.bin, /home/user/server/data/log2.bin"
I don't understand why using "send_file" seems to make this change to the zip file contents and fill the received zip file with subfolders. The actual contents of the received zip file do in fact match the contents of the sent zip file in terms of data, but the user has to click through several directories to get to the files. What am I doing wrong?
#app.route("/download")
def download():
os.chdir(data_dir)
if(os.path.isfile("logs.zip")):
os.remove("logs.zip")
log_dir = os.listdir('.')
log_zip = zipfile.ZipFile('logs.zip', 'w')
for log in log_dir:
log_zip.write(log)
log_zip.close()
return send_file("logs.zip", as_attachment=True)
Using send_from_directory(directory, "logs.zip", as_attachment=True) fixed everything. It appears this call is better for serving up static files.
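For reference, the fixed route would look something like this (a sketch; it assumes the same data_dir and the unchanged zip-building code from the question):

from flask import send_from_directory

@app.route("/download")
def download():
    # ... build logs.zip inside data_dir exactly as before ...
    return send_from_directory(data_dir, "logs.zip", as_attachment=True)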
I have to compress a lot of XML files and split them by the date in the file name. Just for clarification's sake: there is a parser which collects information from each XML file and then moves it to a backup folder. My code needs to gzip each file according to the date in the filename and group those files into a compressed .gz file.
Please find the code below:
import os
import re
import gzip
import shutil
import sys
import time
#
timestr = time.strftime("%Y%m%d%H%M")
# print(..., file=...) needs a file object, not a path string
logfile = open('D:\\Coleta\\log_compactador_xml_tar' + timestr + '.log', 'a')
ptm_dir = "D:\\PTM\\monitored_programs\\"
count_files_mdc = 0
count_files_3gpp = 0
count_tar = 0
#
for subdir, dirs, files in os.walk(ptm_dir):
    for file in files:
        path = os.path.join(subdir, file)
        try:
            backup_files_dir = path.split(sep='\\')[4]
            parser_id = path.split(sep='\\')[3]
            if re.match('backup_files_*', backup_files_dir):
                if file.endswith('xml'):
                    # print(time.strftime("%Y-%m-%d %H:%M:%S"), path)
                    data_arq = file[1:14]
                    if parser_id in ('parser-924',):
                        gzip_filename_mdc = os.path.join(subdir, 'E4G_PM_MDC_IP51_' + timestr + '_' + data_arq)
                        with open(path, 'r') as f_in, gzip.open(gzip_filename_mdc + ".gz", 'at') as f_out_mdc:
                            shutil.copyfileobj(f_in, f_out_mdc)
                        count_files_mdc += 1
                        print(time.strftime("%Y-%m-%d %H:%M:%S"), "Compressing file MDC: ", path)
                        os.remove(path)
        except PermissionError:
            print(time.strftime("%Y-%m-%d %H:%M:%S"), "Permission error on file:", path, file=logfile)
        except IndexError:
            print(time.strftime("%Y-%m-%d %H:%M:%S"), "IndexError: ", path, file=logfile)
As far as I can tell, it creates a stream of data, then compresses it and writes it to a new file with the specified filename. However, instead of grouping each XML file independently inside the .gz file, it creates inside the gzip file one big file (one big stream of data?) with the same name as the output gzip file, but without any extension. After the files are fully compressed, it's not possible to uncompress the big file generated inside the gzip output file. Does someone know where the problem in my code is?
PS: I have edited the code for readability purposes.
Not sure whether the solution is still needed, but I will just leave it here for anyone who faces the same issue.
There is a way to create a gzip-compressed archive in Python using tarfile; the code is quite simple:
with tarfile.open(filename, mode="w:gz") as archive:
    archive.add(name=name_of_file_to_add, recursive=True)
in this case name_of_file_to_add can be a directory, in which case tarfile will add it recursively with all its contents. Obviously you will need to import the tarfile module.
If you need to add files without a containing directory, a simple for loop with calls to add will do (the recursive flag is not required in this case).
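Applied to the date-grouping case from the question, a sketch (the file names and the grouping dict are hypothetical) could look like:

import os
import tarfile

# hypothetical mapping: date string from the filename -> files for that date
groups = {
    '20230101': ['a_20230101.xml', 'b_20230101.xml'],
    '20230102': ['c_20230102.xml'],
}

for date, paths in groups.items():
    with tarfile.open('xml_%s.tar.gz' % date, mode='w:gz') as archive:
        for p in paths:
            # arcname keeps the entries flat instead of embedding full paths
            archive.add(p, arcname=os.path.basename(p))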
I am getting a file posted from a form:
file = request.post['ufile']
I want to get the path. How can I get it?
You have to use the request.FILES dictionary.
Check out the official documentation about the UploadedFile object: you can use the UploadedFile.temporary_file_path() method, but beware that only files uploaded to disk expose it (that is, normally, when the TemporaryFileUploadHandler upload handler is in use).
upload = request.FILES['ufile']
path = upload.temporary_file_path()
In the normal case, though, you would want to use the uploaded file object directly:
upload = request.FILES['ufile']
content = upload.read() # For small files
# ... or ...
for chunk in upload.chunks():
    do_something_with_chunk(chunk)  # For bigger files
You should use request.FILES['ufile'].file.name.
You will get something like /var/folders/v7/1dtcydw51_s1ydkmypx1fggh0000gn/T/tmpKGp4mX.upload.
Note that this only works when the upload is written to disk, which by default requires the file to be bigger than 2.5 MB.
If you want to change this, see the File Upload Settings documentation.
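For reference, the relevant knobs look like this (the setting names are Django's; the values are just an example):

# settings.py
FILE_UPLOAD_MAX_MEMORY_SIZE = 0  # write every upload to disk, regardless of size
FILE_UPLOAD_HANDLERS = [
    'django.core.files.uploadhandler.TemporaryFileUploadHandler',
]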
We cannot get the file path from the POST request, only the filename, because Flask does not have access to the client's file system. If you need to get the file and perform some operations on it, you can create a temp directory and save the file there; that also gives you a real path to work with.
import tempfile
import shutil

dirpath = tempfile.mkdtemp()
# perform some operations if needed
shutil.rmtree(dirpath)  # remove the temp directory
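If the upload arrives as a Flask FileStorage object (an assumption, going by the Flask mention above), saving it into that temp directory might look like:

import os
import tempfile
from flask import request
from werkzeug.utils import secure_filename

dirpath = tempfile.mkdtemp()
upload = request.files['ufile']
saved_path = os.path.join(dirpath, secure_filename(upload.filename))
upload.save(saved_path)  # saved_path is now a real path you can operate on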
I made a small script, shown below, that reads a group of files and tars them. It's all working fine except that the compressed file contains the full paths of the files when uncompressed. Is there a way to do it without the directory structure?
compressor = tarfile.open(PATH_TO_ARCHIVE + re.sub('[\s.:"-]+', '',
                          str(datetime.datetime.now())) + '.tar.gz', 'w:gz')
for file in os.listdir(os.path.join(settings.MEDIA_ROOT, PATH_CSS_DB_OUT)):
    compressor.add(os.path.join(settings.MEDIA_ROOT, PATH_CSS_DB_OUT) + file)
compressor.close()
Take a look at the TarFile.add signature:
... If given, arcname specifies an alternative name for the file in the archive.
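Applied to the loop from the question, that would look something like:

base = os.path.join(settings.MEDIA_ROOT, PATH_CSS_DB_OUT)
for file in os.listdir(base):
    # arcname stores each entry under its bare name, with no leading directories
    compressor.add(os.path.join(base, file), arcname=file)
compressor.close()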
I created a context manager that changes the current working directory, for handling this with tar files.
import contextlib
import os

@contextlib.contextmanager
def cd_change(tmp_location):
    cd = os.getcwd()
    os.chdir(tmp_location)
    try:
        yield
    finally:
        os.chdir(cd)
Then, to package everything up in your case:
with cd_change(os.path.join(settings.MEDIA_ROOT, PATH_CSS_DB_OUT)):
    for file in os.listdir('.'):
        compressor.add(file)
Have you ever tried making this feedback.py call an external zip.py script? My CGITB does not show any error messages; the external .py script is simply never invoked, and execution skips right over it. I should be grateful if you can assist me in making this zip.py callable from feedback.py.
Regards. David
#**********************************************************************
# Description:
#  Zips the contents of a folder.
# Parameters:
#  0 - Input folder.
#  1 - Output zip file. It is assumed that the user added the .zip
#      extension.
#**********************************************************************

# Import modules and create the geoprocessor
#
import sys, zipfile, arcgisscripting, os, traceback
gp = arcgisscripting.create()

# Function for zipping files. If keep is true, the folder, along with
# all its contents, will be written to the zip file. If false, only
# the contents of the input folder will be written to the zip file -
# the input folder name will not appear in the zip file.
#
def zipws(path, zip, keep):
    path = os.path.normpath(path)
    # os.walk visits every subdirectory, returning a 3-tuple
    # of directory name, subdirectories in it, and filenames
    # in it.
    #
    for (dirpath, dirnames, filenames) in os.walk(path):
        # Iterate over every filename
        #
        for file in filenames:
            # Ignore .lock files
            #
            if not file.endswith('.lock'):
                gp.AddMessage("Adding %s..." % os.path.join(path, dirpath, file))
                try:
                    if keep:
                        zip.write(os.path.join(dirpath, file),
                                  os.path.join(os.path.basename(path),
                                               os.path.join(dirpath, file)[len(path) + len(os.sep):]))
                    else:
                        zip.write(os.path.join(dirpath, file),
                                  os.path.join(dirpath[len(path):], file))
                except Exception, e:
                    gp.AddWarning("    Error adding %s: %s" % (file, e))
    return None

if __name__ == '__main__':
    try:
        # Get the tool parameter values
        #
        infolder = gp.GetParameterAsText(0)
        outfile = gp.GetParameterAsText(1)

        # Create the zip file for writing compressed data. In some rare
        # instances, the ZIP_DEFLATED constant may be unavailable and
        # the ZIP_STORED constant is used instead. When ZIP_STORED is
        # used, the zip file does not contain compressed data, resulting
        # in large zip files.
        #
        try:
            zip = zipfile.ZipFile(outfile, 'w', zipfile.ZIP_DEFLATED)
            zipws(infolder, zip, True)
            zip.close()
        except RuntimeError:
            # Delete zip file if it exists
            #
            if os.path.exists(outfile):
                os.unlink(outfile)
            zip = zipfile.ZipFile(outfile, 'w', zipfile.ZIP_STORED)
            zipws(infolder, zip, True)
            zip.close()
            gp.AddWarning("    Unable to compress zip file contents.")
        gp.AddMessage("Zip file created successfully")
    except:
        # Return any Python-specific errors as well as any errors from the geoprocessor
        #
        tb = sys.exc_info()[2]
        tbinfo = traceback.format_tb(tb)[0]
        pymsg = "PYTHON ERRORS:\nTraceback Info:\n" + tbinfo + \
                "\nError Info:\n    " + str(sys.exc_type) + \
                ": " + str(sys.exc_value) + "\n"
        gp.AddError(pymsg)
        msgs = "GP ERRORS:\n" + gp.GetMessages(2) + "\n"
        gp.AddError(msgs)
zip() is a built-in function in Python, so it is bad practice to use zip as a variable name; zip_ can be used instead.
The execfile() function reads and executes a Python script.
It is probable that you actually just need import zip_ in feedback.py instead of execfile().
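A sketch of the import-based approach (this assumes zip.py is renamed to zip_.py and sits on the import path; the paths are placeholders, and the call reuses the zipws function from the script above):

# feedback.py (Python 2, matching the script above)
import zipfile
import zip_  # the renamed zip.py

zf = zipfile.ZipFile('C:\\Temp\\out.zip', 'w', zipfile.ZIP_DEFLATED)
zip_.zipws('C:\\Temp\\folder_to_zip', zf, True)  # placeholder paths
zf.close()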
Yay ArcGIS.
Just to clarify: how are you trying to call this script? Using Popen? Can you post some code?
If you're invoking this script via another script in the ArcGIS environment, then the thing is, when you use Popen the script won't be invoked within the ArcGIS environment; instead it will be invoked within Windows, so you will lose all real control over it.
Also, another ArcGIS comment: you never initialize a license for the geoprocessor.
My suggestion: refactor your code into a module function that simply attempts to zip the files and, if it fails, prints the message out to ArcGIS.
If you want, post how you are calling it and how this is being run.