Python - Can't locate downloaded file to unzip

Using Selenium, I automated the download of a zip file and saved it to a specified directory. When I try to unzip the file, however, I can't locate the recently downloaded file. This is the block of code that handles the download and the lookup:
# Click on Map Link
driver.find_element_by_css_selector("input.linksubmit[value=\"▸ Map\"]").click()
# Download Data
driver.find_element_by_xpath('//*[@id="buttons"]/a[4]/img').click()
# Locate recently downloaded file
path = 'C:/.../Download'
list = os.listdir(path)
time_sorted_list = sorted(list, key=os.path.getmtime)
file_name = time_sorted_list[len(time_sorted_list)-1]
Specifically, this is my error:
Traceback (most recent call last):
File "C:\Users\...\AppData\Local\Continuum\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-89-3f1d00dac284>", line 3, in <module>
time_sorted_list = sorted(list, key=os.path.getmtime)
File "C:\Users\...\AppData\Local\Continuum\Anaconda3\lib\genericpath.py", line 55, in getmtime
return os.stat(filename).st_mtime
FileNotFoundError: [WinError 2] The system cannot find the file specified: 'grid-m1b566d31a87cba1379e113bb93fdb61d5be5b128.zip'
I tried troubleshooting by deleting the downloaded file and placing another file in the directory. The code found that other file, but not the recently downloaded one. Can anyone tell me what's going on here?

First of all, do not use list as a variable name: it shadows the built-in list constructor, so you can no longer call it elsewhere in your program. Second, os.listdir returns only the file names, not their full paths, so os.path.getmtime looks for each name relative to the current working directory and fails. If you want the full path, there are two things you can do:
You can use os.path.join:
import os
import zipfile
path = 'C:/.../Download'
# build full paths so os.path.getmtime can find each file
file_list = [os.path.join(path, f) for f in os.listdir(path)]
time_sorted_list = sorted(file_list, key=os.path.getmtime)
file_name = time_sorted_list[-1]  # most recently modified file
myzip = zipfile.ZipFile(file_name)
for contained_file in myzip.namelist():
    if all(n in contained_file.lower() for n in ('corn', 'irrigation', 'high', 'brazil')):
        with myzip.open(contained_file) as f:
            pass  # save data to a CSV file
You can also use the glob function from the glob module:
import os
from glob import glob
import zipfile
path = 'C:/.../Download'
# glob returns each match with the directory prefix already attached
file_list = glob(path+"/*")
time_sorted_list = sorted(file_list, key=os.path.getmtime)
file_name = time_sorted_list[-1]  # most recently modified file
myzip = zipfile.ZipFile(file_name)
for contained_file in myzip.namelist():
    if all(n in contained_file.lower() for n in ('corn', 'irrigation', 'high', 'brazil')):
        with myzip.open(contained_file) as f:
            pass  # save data in a CSV file
Either should work.
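The lookup step in both variants can also be wrapped in a small helper. This is only a sketch: the name newest_file and its suffix parameter are hypothetical, and filtering by suffix is an added assumption, meant to skip unrelated files or temporary files that a browser may leave while a download is still in progress.

```python
import os


def newest_file(directory, suffix=".zip"):
    """Return the full path of the most recently modified file with the given suffix.

    Returns None if the directory holds no matching file.
    """
    candidates = [
        os.path.join(directory, name)  # os.listdir yields bare names only
        for name in os.listdir(directory)
        if name.endswith(suffix)
    ]
    # max() with a key avoids sorting the whole list just to take the last item.
    return max(candidates, key=os.path.getmtime, default=None)
```

It would be called as file_name = newest_file(path), with a check for None before the result is passed to zipfile.ZipFile.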

Related

Unzip all zipped files in another working directory

I would like to write a script that extracts all zip files in a folder into another folder. I found a helpful answer here: Unzip all zipped files in a folder to that same folder using Python 2.7.5
I slightly edited the code from the answers there, as listed below:
import zipfile36 as zipfile
working_dir = r"C:\Users\Tim\Desktop\Tim\Faks\NMG\Python\Python Scripting for Geoprocessing Workflows\PythonGP\PythonGP\Scripts" ###### YOU CAN CHANGE THE PATH ######
goal_dir= r"C:\Users\Tim\Desktop\Tim\Faks\NMG\Python\Python Scripting for Geoprocessing Workflows\PythonGP\PythonGP\Scripts\Ekstrahirano"
extension = ".zip"
for item in os.listdir(working_dir): # loop through items in dir
    if item.endswith(extension): # check for ".zip" extension
        file_name = os.path.abspath(item) # get full path of files
        zip_ref = zipfile.ZipFile(file_name) # create zipfile object
        zip_ref.extractall(goal_dir) # extract file to dir
        zip_ref.close() # close file
        os.remove(file_name) # delete zipped file
When I run it, I keep getting this error:
Traceback (most recent call last):
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\IPython\core\interactiveshell.py", line 3437, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-98-d46cd220b77f>", line 8, in <module>
zip_ref = zipfile.ZipFile(file_name) # create zipfile object
File "C:\Users\Tim\AppData\Roaming\Python\Python39\site-packages\zipfile36.py", line 1100, in __init__
self._RealGetContents()
File "C:\Users\Tim\AppData\Roaming\Python\Python39\site-packages\zipfile36.py", line 1167, in _RealGetContents
raise BadZipFile("File is not a zip file")
zipfile36.BadZipFile: File is not a zip file
What am I missing?
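One possible cause, in line with the other questions on this page, is that os.path.abspath(item) resolves the bare name returned by os.listdir against the current working directory rather than working_dir, so a different file may end up being opened. The sketch below works under that assumption and is not a confirmed diagnosis. The function name extract_all_zips is hypothetical, the standard-library zipfile module is swapped in for zipfile36, and zipfile.is_zipfile is used to skip files that are not valid archives instead of letting BadZipFile propagate.

```python
import os
import zipfile


def extract_all_zips(working_dir, goal_dir, extension=".zip"):
    """Extract every valid zip archive in working_dir into goal_dir.

    Returns the list of archive paths that were extracted.
    """
    os.makedirs(goal_dir, exist_ok=True)
    extracted = []
    for item in os.listdir(working_dir):
        if not item.endswith(extension):
            continue
        # Join with working_dir: os.listdir returns bare file names only.
        file_name = os.path.join(working_dir, item)
        if not zipfile.is_zipfile(file_name):
            continue  # not a real zip archive (e.g. a partial download)
        with zipfile.ZipFile(file_name) as zip_ref:
            zip_ref.extractall(goal_dir)
        extracted.append(file_name)
    return extracted
```

The os.remove call from the question is deliberately left out here, so that no archive is deleted before the extracted files have been checked.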

Decompressing .bz2 files in a directory in python

I would like to decompress a bunch of .bz2 files contained in a folder (where there are also .zst files). What I am doing is the following:
destination_folder = "/destination_folder_path/"
compressed_files_path="/compressedfiles_folder_path/"
dirListing = os.listdir(compressed_files_path)
for file in dirListing:
    if ".bz2" in file:
        unpackedfile = bz2.BZ2File(file)
        data = unpackedfile.read()
        open(destination_folder, 'wb').write(data)
But I keep on getting the following error message:
Traceback (most recent call last):
File "mycode.py", line 34, in <module>
unpackedfile = bz2.BZ2File(file)
File ".../miniconda3/lib/python3.9/bz2.py", line 85, in __init__
self._fp = _builtin_open(filename, mode)
FileNotFoundError: [Errno 2] No such file or directory: 'filename.bz2'
Why do I receive this error?
os.listdir returns only file names, so bz2.BZ2File(file) looks for the file in the current working directory. Make sure every path you use exists, and pass the full path of the file being opened:
import os
import bz2
# this path must exist
destination_folder = "/full_path_to/folder/"
compressed_files_path = "/full_path_to_other/folder/"
# get list with filenames (strings)
dirListing = os.listdir(compressed_files_path)
for file in dirListing:
    # ^ this is only filename.ext
    if file.endswith(".bz2"):
        # join directory path and filename.bz2
        existing_file_path = os.path.join(compressed_files_path, file)
        # read and decompress the file
        with bz2.BZ2File(existing_file_path) as unpackedfile:
            data = unpackedfile.read()
        # drop the ".bz2" suffix; plain open() writes the decompressed bytes as-is
        new_file_path = os.path.join(destination_folder, file[:-len(".bz2")])
        with open(new_file_path, 'wb') as f:
            f.write(data)
You can also use the shutil module to copy or move files.
See also: os.path.exists, os.path.join, shutil, bz2 examples
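For large archives, reading the whole file into memory may not be desirable. A minimal sketch of a streaming variant follows, using shutil.copyfileobj to copy the decompressed data in chunks. The function name decompress_bz2_dir is hypothetical, and creating the destination folder with os.makedirs is an added assumption.

```python
import bz2
import os
import shutil


def decompress_bz2_dir(src_dir, dst_dir):
    """Decompress every *.bz2 file in src_dir into dst_dir.

    Returns the list of output paths that were written.
    """
    os.makedirs(dst_dir, exist_ok=True)
    written = []
    for name in sorted(os.listdir(src_dir)):
        if not name.endswith(".bz2"):
            continue  # skip .zst and any other files
        src_path = os.path.join(src_dir, name)
        dst_path = os.path.join(dst_dir, name[: -len(".bz2")])
        # Copy in chunks so the whole file is never held in memory.
        with bz2.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        written.append(dst_path)
    return written
```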

Iterating over multiple files in a folder

I'm trying to develop a program that can iterate over different files in the same folder. The files all have the same format but different names. Right now, if there is only one file in the folder, the code executes with no problems, but with several files I get this error:
Traceback (most recent call last):
File "D:/Downloads/FYP/Feedback draft.py", line 24, in <module>
wb = openpyxl.load_workbook(filename)
File "C:\Users\shomi\AppData\Local\Programs\Python\Python38-32\lib\site-packages\openpyxl\reader\excel.py", line 315, in load_workbook
reader = ExcelReader(filename, read_only, keep_vba,
File "C:\Users\shomi\AppData\Local\Programs\Python\Python38-32\lib\site-packages\openpyxl\reader\excel.py", line 124, in __init__
self.archive = _validate_archive(fn)
File "C:\Users\shomi\AppData\Local\Programs\Python\Python38-32\lib\site-packages\openpyxl\reader\excel.py", line 96, in _validate_archive
archive = ZipFile(filename, 'r')
File "C:\Users\shomi\AppData\Local\Programs\Python\Python38-32\lib\zipfile.py", line 1251, in __init__
self.fp = io.open(file, filemode)
FileNotFoundError: [Errno 2] No such file or directory: 'tester2.xlsx'
The code I'm using is:
directory = r'D:\Downloads\FYP\TEST'
for filename in os.listdir(directory):
    if filename.endswith(".xlsx"):
        wb = openpyxl.load_workbook(filename)
        sh1=wb['test']
        doc = DocxTemplate('Assignment1feedback.docx')
        context = {
            'acc': acceleration
        }
        doc.render(context)
        doc.save('D:\\Downloads\\FYP\\TEST\\' + filename + '.docx')
This is incomplete code, as the full thing would be quite long, but overall I want to access these Excel files and then create a corresponding docx for each.
os.listdir only returns the base names of the files in the directory, which causes problems if your working directory does not match the value of directory. If your working directory is D:\Downloads, .\file.xlsx does not exist, but D:\Downloads\FYP\TEST\file.xlsx does.
You will want to use the full path to the file, and you have two options here. You could follow @IronMan's suggestion in their comment and build the file path from the directory path and the file's base name:
import os
import openpyxl
directory = r'D:\Downloads\FYP\TEST'
for filename in os.listdir(directory):
    if filename.endswith(".xlsx"):
        wb = openpyxl.load_workbook(os.path.join(directory, filename))
This is a simple and useful approach; however, it is somewhat limited and may make future changes harder. The alternative is to use os.scandir (pathlib offers a similar Path.iterdir) and read the path directly from each entry:
import os
import openpyxl
directory = r'D:\Downloads\FYP\TEST'
for entry in os.scandir(directory):
    if entry.name.endswith(".xlsx"):
        wb = openpyxl.load_workbook(entry.path)
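With pathlib, the same idea can be sketched as follows. The helper name xlsx_jobs is hypothetical, and pairing each workbook with a .docx path of the same stem is an assumption about the intended output naming (the question's code appends '.docx' to the full file name instead).

```python
from pathlib import Path


def xlsx_jobs(directory):
    """Return (input_path, output_path) pairs for each .xlsx file in directory."""
    jobs = []
    for xlsx_path in sorted(Path(directory).glob("*.xlsx")):
        # xlsx_path already includes the directory, so it can be opened
        # regardless of the current working directory.
        docx_path = xlsx_path.with_suffix(".docx")
        jobs.append((xlsx_path, docx_path))
    return jobs
```

Each pair could then be used as openpyxl.load_workbook(xlsx_path) followed by doc.save(docx_path).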

Python File Not Found Error even though file is in same directory

I'm running a Python script (images.py) that reads:
import gzip
f = gzip.open('i1.gz','r')
But it raises FileNotFoundError.
My folder containing images.py looks like this:
New Folder/
    images.py
    i1.gz
    (...Some other files...)
The problem is that you are not running the script from within the New Folder.
You can easily solve it by using the absolute path without hard-coding it:
import gzip
from os import path
file_path = path.abspath(__file__) # full path of your script
dir_path = path.dirname(file_path) # full path of the directory of your script
zip_file_path = path.join(dir_path,'i1.gz') # absolute zip file path
# and now you can open it
f = gzip.open(zip_file_path,'r')
Check the current working directory of the script by doing:
import os
print(os.getcwd())
Then compare this with the absolute path of i1.gz. If the two directories differ, the relative name 'i1.gz' is being looked up in the wrong place.
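That comparison can be sketched as a small diagnostic. The function name lookup_report and the keys of the dictionary it returns are hypothetical, chosen only for illustration.

```python
import os


def lookup_report(relative_name, script_dir):
    """Show where a relative file name resolves from the cwd versus script_dir."""
    cwd_path = os.path.join(os.getcwd(), relative_name)
    script_path = os.path.join(script_dir, relative_name)
    return {
        "cwd_path": cwd_path,
        "cwd_exists": os.path.exists(cwd_path),
        "script_path": script_path,
        "script_exists": os.path.exists(script_path),
    }
```

Inside images.py it might be called as lookup_report('i1.gz', os.path.dirname(os.path.abspath(__file__))). If cwd_exists is False while script_exists is True, the script is being run from a different directory.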
Are you running the script from the New Folder?
If you are in the folder, it should work:
c:\Data\Python\Projekty\Random\gzip_example>python load_gzip.py
but if you run the script from the parent folder, prefixing the script with the folder name, it raises the error:
c:\Data\Python\Projekty\Random>python gzip_example\load_gzip.py
Traceback (most recent call last):
File "C:\Data\Python\Projekty\Random\gzip_example\load_gzip.py", line 2, in <module>
f = gzip.open('file.gz', 'r')
File "C:\Python\Python 3.8\lib\gzip.py", line 58, in open
binary_file = GzipFile(filename, gz_mode, compresslevel)
File "C:\Python\Python 3.8\lib\gzip.py", line 173, in __init__
fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'file.gz'
The way I usually set the working directory and work with files is as follows:
import os
pwd_path= os.path.dirname(os.path.abspath(__file__))
myfile = os.path.join(pwd_path, 'i1.gz')

File exists but open says it doesn't

dir_path = os.path.dirname(os.path.realpath(__file__))
from os.path import isfile, join
onlyfiles = [f for f in listdir(dir_path) if isfile(join(dir_path, f))]
print(onlyfiles);
with open("config.json", 'r') as jsondata:
    config = json.loads(jsondata.read())
Running this code somehow raises a file-not-found error, even though the file is listed by
print(onlyfiles);
Here is the full output log from the console.
Traceback (most recent call last):
['builder.py', 'builder.spec', 'builderloader2.rb', 'config.json',
'Run.bat', 'Run.bat.lnk', 'test.json']
File "C:/Users/cody.jones/Desktop/Builder Generator Release/builder.py",
line 26, in <module>
with open("config.json", 'r') as jsondata:
FileNotFoundError: [Errno 2] No such file or directory: 'config.json'
Process finished with exit code 1
Provide the full path to open() instead of just the file name. By default it looks for the file in the current working directory, which is not necessarily the directory the script lives in.
Try:
open(r"C:/Users/cody.jones/Desktop/Builder Generator Release/config.json", "r")
The script will look for config.json in the current working directory - which presumably is not the same as the folder that the script resides in.
Update your open call to include the path you've already generated.
with open(os.path.join(dir_path, "config.json"), 'r') as jsondata:
    config = json.loads(jsondata.read())
By doing it this way (rather than just including the absolute path) this script will still work if you move it to a different directory or computer so long as you keep the script and config together.
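Put together, this could look like the sketch below. The helper name load_config is hypothetical, and it takes the script path as a parameter so that it can be exercised outside the original script.

```python
import json
import os


def load_config(script_file, name="config.json"):
    """Load a JSON file that sits in the same directory as script_file."""
    dir_path = os.path.dirname(os.path.realpath(script_file))
    with open(os.path.join(dir_path, name), "r") as jsondata:
        return json.load(jsondata)
```

In builder.py it would be called as config = load_config(__file__).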
