Is it possible to continue when you hit a BadZipFile? - python

This zip-file contains 130.000 images.
Is it possible to continue when you run into a BadZipFile?
I mean ignore the bad image and move on to the next.
import zipfile
with zipfile.ZipFile("/content/train.zip", 'r') as zip_ref:
    zip_ref.extractall("/content/train/")
The error:
BadZipFile: Bad CRC-32 for file 'calvary-andrea-mantegna.jpg'
I want something like this to work.
import zipfile
from zipfile import BadZipfile
try:
    with zipfile.ZipFile("/content/train.zip", 'r') as zip_ref:
        zip_ref.extractall("/content/train/")
except BadZipfile:
    continue
But I know I can't use continue in a try-except.
Is there a way to solve this?

You can extract the files one by one. This may not work, depending on what corrupted the archive. For instance, if a block was lost somewhere in the middle of the file, every extraction after that point will fail.
import zipfile
try:
    with zipfile.ZipFile("/content/train.zip", 'r') as zip_ref:
        for info in zip_ref.infolist():
            try:
                zip_ref.extract(info, path="/content/train/")
            except zipfile.BadZipFile as e:
                print(f"{e} - offset {info.header_offset}")
except zipfile.BadZipFile as e:
    print(f"could not read zipfile: {e}")
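If you also want to know afterwards which images were skipped, the same per-member loop can collect the failing names. This is only a minimal sketch: the helper name extract_skipping_bad is hypothetical, and it assumes the archive's directory is still readable so that only individual members fail.

```python
import zipfile


def extract_skipping_bad(zip_path, dest_dir):
    """Extract members one at a time and return the names that failed.

    Hypothetical helper for illustration; not part of the zipfile module.
    """
    failed = []
    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        for info in zip_ref.infolist():
            try:
                zip_ref.extract(info, path=dest_dir)
            except zipfile.BadZipFile:
                # Remember the corrupt member and move on to the next one.
                failed.append(info.filename)
    return failed
```

A call such as extract_skipping_bad("/content/train.zip", "/content/train/") would return the list of skipped members. A partially written file may be left on disk for a member that failed, so you may want to delete those afterwards.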

Related

what is an exception handler for

I have a script which wants to load integers from a text file. If the file does not exist I want the user to be able to browse for a different file (or the same file in a different location, I have UI implementation for that).
What I don't get is the purpose of exception handling, or catching exceptions. From what I have read, it seems to be something you can use to log errors, but if an input is needed, catching the exception won't fix that. I am wondering whether a while loop in the except block is the approach to use (or whether I should not use try/except for loading a file at all)?
try:
    with open(myfile, 'r') as f:
        contents = f.read()
        print("From text file : ", contents)
except FileNotFoundError as Ex:
    print(Ex)
You need to use a while loop with a flag variable that records whether the file was found. If it was not found, read a new file name from input and try again, and so on:
filenotfound = True
file_path = myfile
while filenotfound:
    try:
        with open(file_path, 'r') as f:
            contents = f.read()
            print("From text file : ", contents)
            filenotfound = False
    except FileNotFoundError as Ex:
        file_path = str(input())
        filenotfound = True
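The same retry idea can be written without the flag variable by returning from inside the loop. This is a sketch under assumptions: read_with_retry and its ask_for_path parameter are hypothetical names, and ask_for_path stands in for whatever prompt or file-browser UI you already have (it defaults to the built-in input).

```python
def read_with_retry(path, ask_for_path=input):
    """Keep asking for a new path until the file can be opened.

    ask_for_path is any callable that takes a prompt string and returns
    a path; swap in your own file-browser function here.
    """
    while True:
        try:
            with open(path, "r") as f:
                return f.read()
        except FileNotFoundError as exc:
            # Report the problem, then ask the user for another location.
            print(exc)
            path = ask_for_path("Enter another file path: ")
```

So the handler here is not only for logging: it is the place where the program recovers, by obtaining a new input and trying again.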

Remove a JSON file if an exception occurs

I am writing a program which stores some JSON-encoded data in a file, but sometimes the resulting file is blank (because no new data was found). When the program finds data and stores it, I do this:
with open('data.tmp') as f:
    data = json.load(f)
os.remove('data.tmp')
Of course, if the file is blank this will raise an exception, which I can catch, but then I am not able to remove the file. I have tried:
try:
    with open('data.tmp') as f:
        data = json.load(f)
except:
    os.remove('data.tmp')
And I get this error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "MyScript.py", line 50, in run
    os.remove('data.tmp')
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process
How could I delete the file when the exception occurs?
How about separating the file reading from the JSON loading? json.loads behaves exactly the same as json.load, but takes a string instead of a file object.
with open('data.tmp') as f:
    dataread = f.read()
os.remove('data.tmp')
# handle exceptions as needed here...
data = json.loads(dataread)
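To make the "handle exceptions as needed" step concrete, here is a minimal sketch. The function name load_and_remove is hypothetical, and returning None for a blank or invalid file is just one possible choice.

```python
import json
import os


def load_and_remove(path):
    """Read the file, delete it, then parse the text as JSON.

    Returns None when the content is blank or not valid JSON
    (an assumption made for this sketch).
    """
    with open(path) as f:
        text = f.read()
    # The file is closed at this point, so removing it is safe.
    os.remove(path)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None
```

Because the file is read and closed before parsing, it is removed whether or not the JSON turns out to be valid.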
I am late to the party, but the json dump and load functions seem to keep using files even after writing or reading data from them. What you can do is use the dumps or loads functions to work with the string representation, and then use the normal file.write() or file.read() for the file itself.
For example:
with open('file_path.json', 'w') as file:
    file.write(json.dumps(json_data))
os.remove('file_path.json')
Not the best alternative but it saves me a lot especially when using temp dir.
You need to edit the remove part so that it handles the case of a non-existent file gracefully.
import errno
import json
import os
try:
    fn = 'data.tmp'
    with open(fn) as f:
        data = json.load(f)
except:
    try:
        if os.stat(fn).st_size > 0:
            os.remove(fn) if os.path.exists(fn) else None
    except OSError as e: # this would be "except OSError, e:" before Python 2.6
        if e.errno != errno.ENOENT:
            raise
See also "Most pythonic way to delete a file which may not exist".
You could extract the silent removal into a separate function.
Also, from that same SO question:
# python3.4 and above
import contextlib, json, os
try:
    fn = 'data.tmp'
    with open(fn) as f:
        data = json.load(f)
except:
    with contextlib.suppress(FileNotFoundError):
        if os.stat(fn).st_size > 0:
            os.remove(fn)
I personally like the latter approach better - it's explicit.

exception stop read files python

I'm trying to control exceptions when reading files, but I have a problem. I'm new to Python, and I am not yet able to control how I can catch an exception and still continue reading text from the files I am accessing. This is my code:
import errno
import sys
class Read:
    # FIXME: make these two constants immutable
    ROUTE = "d:\\Profiles\\user\\Desktop\\"
    EXT = ".txt"
    def setFileReaded(self, fileToRead):
        content = ""
        try:
            infile = open(self.ROUTE+fileToRead+self.EXT)
        except FileNotFoundError as error:
            if error.errno == errno.ENOENT:
                print ("File not found, please check the name and try again")
            else:
                raise
            sys.exit()
        with infile:
            content = infile.read()
        infile.close()
        return content
And from another class I call it:
read = Read()
print(read.setFileReaded("verbs"))
print(read.setFileReaded("object"))
print(read.setFileReaded("sites"))
print(read.setFileReaded("texts"))
But it only prints this:
turn on
connect
plug
File not found, please check the name and try again
And it does not continue with the next files. How can the program keep reading all the files?
It's a little difficult to understand exactly what you're asking here, but I'll try and provide some pointers.
sys.exit() will terminate the Python script gracefully. In your code, this is called when the FileNotFoundError exception is caught. Nothing further will be run after this, because your script will terminate. So none of the other files will be read.
Another thing to point out is that you close the file after reading it, which is not needed when you open it like this:
with open('myfile.txt') as f:
    content = f.read()
The file will be closed automatically after the with block.
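Putting both points together, a loop that skips missing files instead of exiting might look like the sketch below. The function name read_existing is hypothetical, and returning a dict keyed by path is just one way to keep the results.

```python
def read_existing(paths):
    """Read every file that exists; report and skip the ones that do not."""
    contents = {}
    for path in paths:
        try:
            # The with block closes the file, so no explicit close() is needed.
            with open(path) as infile:
                contents[path] = infile.read()
        except FileNotFoundError:
            # No sys.exit() here, so the loop carries on with the next file.
            print(f"File not found, skipping: {path}")
    return contents
```

With this shape, a missing "object" file would produce one message and the remaining files would still be read.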

Python cant handle exceptions from zipfile.BadZipFile

I need to handle the case where a zip file is corrupt, so that the script just skips that file and goes on to the next.
In the code example underneath I'm trying to catch the exception so I can skip the file. But my script fails when the zipfile is corrupt and gives me the "normal" traceback errors instead of printing my error message, although it runs fine if the zipfile is OK.
This is a minimal example of the code I'm dealing with.
path = "path to zipfile"
from zipfile import ZipFile
with ZipFile(path) as zf:
    try:
        print "zipfile is OK"
    except BadZipfile:
        print "Does not work "
        pass
Part of the traceback tells me: raise BadZipfile, "File is not a zip file"
You need to put your context manager inside the try-except block:
try:
    with ZipFile(path) as zf:
        print "zipfile is OK"
except BadZipfile:
    print "Does not work "
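The error is raised by the ZipFile(path) call itself, so if that call sits outside the try block, no handler can be found for the raised exception. In addition, make sure you import the exception from zipfile: it is spelled BadZipfile in Python 2, and BadZipFile in Python 3.2 and later (where the old spelling remains as an alias).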
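For Python 3, the same pattern applied to several archives could be sketched as follows. The function name readable_zips is hypothetical, and the sketch only handles BadZipFile, so a path that does not exist would still raise FileNotFoundError.

```python
import zipfile


def readable_zips(paths):
    """Return the paths that open as zip archives, skipping corrupt ones."""
    good = []
    for path in paths:
        try:
            # ZipFile(...) is inside the try block, because the constructor
            # is what raises BadZipFile for a file that is not a zip file.
            with zipfile.ZipFile(path) as zf:
                zf.namelist()
        except zipfile.BadZipFile as exc:
            print(f"skipping {path}: {exc}")
            continue
        good.append(path)
    return good
```

Here continue is valid because the try-except sits inside a for loop.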

python fnmatch unable to find the file

I have a directory with a bunch of subdirectories. Each subdirectory has many CSV files, but I am only interested in certain ones. So I wrote the following Python method, but it fails to pick up the file name. If I use *.csv it finds all the files, but I don't want all of them to be read in:
def gatherStats(template_file, csv_file):
    for lang in getLanguageCodes(csv_file):
        lang_dir = os.path.join(template_file, lang)
        try:
            for file in os.listdir(lang_dir):
                if fnmatch.fnmatch(file, '*-*-template-users-data.csv'):
                    t_file = open(file, 'rb').read()
                    reader = csv.reader()
                    for row in reader:
                        print row
                else:
                    print "didn't find the file"
        except Exception, e:
            logging.exception(e)
What am I doing wrong here? Is it a regular expression issue? Can we use regular expressions with fnmatch?
There are several problems with your code. Fix them first, then we might get to the bottom of what your issue really is.
First of all, don't use built-in names as variables, such as file. Rather replace it with filename.
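Then join the directory and the file name with os.path.join(lang_dir, filename) before opening the file, because os.listdir returns bare file names without the directory. Meaning: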
t_file = open(os.path.join(lang_dir, filename), 'rb').read()
How do you expect reader = csv.reader() to read your file if you don't reference your open file object in this line?
Your try/except block is a bit too wide for my taste. Take your time and narrow down the errors that actually can happen. Then decide which of them you want to ignore and which should crash your program. Take a close look at the exceptions actually thrown in this block. You'll probably find your issue there.
With the help provided by another user, I managed to fix the problem. I am putting this answer here for future reference for the community.
import csv
import logging
import os
import re
def gatherStats(template_file, csv_file):
    for lang in getLanguageCodes(csv_file):
        lang_dir = os.path.join(template_file, lang)
        try:
            for filename in os.listdir(lang_dir):
                path = os.path.join(lang_dir, filename)
                if re.search(r'-.+-template-users-data.csv$',filename):
                    with open(path, 'rb') as template_user_data_file:
                        reader = csv.reader(template_user_data_file)
                        try:
                            for row in reader:
                                print row
                        except csv.Error as e:
                            logging.error(e)
                else:
                    print "didn't find the file"
        except Exception, e:
            logging.exception(e)
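On the original question about patterns: fnmatch uses shell-style wildcards rather than regular expressions, and the original pattern works once the full path is opened. Below is a Python 3 sketch of that variant. The helper names find_matching_csvs and read_rows are hypothetical, and the file names in the usage note are made up for illustration.

```python
import csv
import fnmatch
import os

PATTERN = "*-*-template-users-data.csv"


def find_matching_csvs(lang_dir, pattern=PATTERN):
    """Return full paths of the files in lang_dir whose names match pattern."""
    return sorted(
        os.path.join(lang_dir, name)  # os.listdir gives bare names only
        for name in os.listdir(lang_dir)
        if fnmatch.fnmatch(name, pattern)
    )


def read_rows(path):
    """Read all rows of one CSV file; Python 3 wants text mode, newline=''."""
    with open(path, newline="") as handle:
        return list(csv.reader(handle))
```

For a directory holding en-US-template-users-data.csv and notes.csv, find_matching_csvs would return only the path of the first file, which can then be passed to read_rows.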
