python fnmatch unable to find the file

I have a directory with a bunch of subdirectories, and each subdirectory has many CSV files, but I am only interested in certain CSV files. So I wrote the following Python method, but I am unable to match the file name. If I use *.csv it will find all the files, but I don't want all the files to be read in:
def gatherStats(template_file, csv_file):
    for lang in getLanguageCodes(csv_file):
        lang_dir = os.path.join(template_file, lang)
        try:
            for file in os.listdir(lang_dir):
                if fnmatch.fnmatch(file, '*-*-template-users-data.csv'):
                    t_file = open(file, 'rb').read()
                    reader = csv.reader()
                    for row in reader:
                        print row
                else:
                    print "didn't find the file"
        except Exception, e:
            logging.exception(e)
What am I doing wrong here? Is it a regular expression issue? Can we use regular expressions with fnmatch?

There are several problems with your code. Fix them first, then we might get to the bottom of what your issue really is.
First of all, don't use built-in names such as file for your variables. Replace it with filename instead.
Then join the directory onto the name with os.path.join(lang_dir, filename) before opening, because os.listdir() returns bare file names without their directory. Meaning:
t_file = open(os.path.join(lang_dir, filename), 'rb').read()
How do you expect reader = csv.reader() to read your file when you never pass it the open file object? It needs the file object itself, not the string returned by .read().
Your try/except block is a bit too wide for my taste. Take your time and narrow down the errors that actually can happen. Then decide which of them you want to ignore and which should crash your program. Take a close look at the exceptions actually thrown in this block. You'll probably find your issue there.
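Putting those fixes together, the matching loop might look like this (a sketch; lang_dir comes from your surrounding loop, and the Python 2 print statement matches your code):

import csv
import fnmatch
import os

for filename in os.listdir(lang_dir):
    if fnmatch.fnmatch(filename, '*-*-template-users-data.csv'):
        with open(os.path.join(lang_dir, filename), 'rb') as t_file:
            reader = csv.reader(t_file)  # pass the open file object, not its contents
            for row in reader:
                print row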

With the help provided by another user, I managed to fix the problem. I am putting this answer here for future reference for the community.
import csv
import logging
import os
import re

def gatherStats(template_file, csv_file):
    for lang in getLanguageCodes(csv_file):
        lang_dir = os.path.join(template_file, lang)
        try:
            for filename in os.listdir(lang_dir):
                path = os.path.join(lang_dir, filename)
                if re.search(r'-.+-template-users-data\.csv$', filename):
                    with open(path, 'rb') as template_user_data_file:
                        reader = csv.reader(template_user_data_file)
                        try:
                            for row in reader:
                                print row
                        except csv.Error as e:  # note: csv.Error, not csv.ERROR
                            logging.error(e)
                else:
                    print "didn't find the file"
        except Exception, e:
            logging.exception(e)
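As for the regular-expression part of the original question: fnmatch patterns are shell-style globs, not regular expressions, but fnmatch.translate() converts a glob into an equivalent regex string for use with the re module. A small illustration (the sample file name is made up):

import fnmatch
import re

regex = fnmatch.translate('*-*-template-users-data.csv')  # glob -> regex string
print(re.match(regex, 'en-us-template-users-data.csv') is not None)  # True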

Related

what is an exception handler for

I have a script which wants to load integers from a text file. If the file does not exist I want the user to be able to browse for a different file (or the same file in a different location, I have UI implementation for that).
What I don't get is the purpose of exception handling, or of catching exceptions. From what I have read, it seems to be something you can use to log errors, but if an input is needed, catching the exception won't fix that. I am wondering if a while loop in the except block is the approach to use (or whether to skip try/except for loading a file entirely)?
try:
    with open(myfile, 'r') as f:
        contents = f.read()
        print("From text file : ", contents)
except FileNotFoundError as Ex:
    print(Ex)
You need a while loop and a flag variable to verify whether the file has been found; if not found, take a new file name from input and read again, and so on:
filenotfound = True
file_path = myfile
while filenotfound:
    try:
        with open(file_path, 'r') as f:
            contents = f.read()
            print("From text file : ", contents)
        filenotfound = False
    except FileNotFoundError as Ex:
        file_path = str(input())
        filenotfound = True
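That loop is also the answer to the broader question: the point of catching the exception is that the program gets a chance to recover (here, by asking for another path) instead of crashing. A variant sketch with a retry limit, so the loop cannot prompt forever (the limit of 3 and the prompt text are illustrative additions, not from the question):

file_path = myfile
for _ in range(3):
    try:
        with open(file_path, 'r') as f:
            contents = f.read()
        print("From text file : ", contents)
        break  # success, stop retrying
    except FileNotFoundError as ex:
        print(ex)
        file_path = input("Enter a different file path: ")
else:
    print("No readable file after 3 attempts.")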

Remove a JSON file if an exception occurs

I am writing a program which stores some JSON-encoded data in a file, but sometimes the resulting file is blank (because no new data was found). When the program finds data and stores it, I do this:
with open('data.tmp') as f:
    data = json.load(f)
    os.remove('data.tmp')
Of course, if the file is blank this will raise an exception, which I can catch, but it still does not let me remove the file. I have tried:
try:
    with open('data.tmp') as f:
        data = json.load(f)
except:
    os.remove('data.tmp')
And I get this error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "MyScript.py", line 50, in run
os.remove('data.tmp')
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process
How could I delete the file when the exception occurs?
How about separating out the file reading and the JSON loading? json.loads behaves exactly the same as json.load but takes a string instead of a file object.
with open('data.tmp') as f:
    dataread = f.read()
os.remove('data.tmp')
# handle exceptions as needed here...
data = json.loads(dataread)
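For instance, the exception handling could look like this (a sketch assuming Python 3.5+, where json.JSONDecodeError exists; catch ValueError on older versions):

import json
import os

with open('data.tmp') as f:
    dataread = f.read()
os.remove('data.tmp')  # the with block has closed the file, so the delete succeeds

try:
    data = json.loads(dataread)
except json.JSONDecodeError:  # the file was blank or held malformed JSON
    data = None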
I am late to the party, but the json dump and load functions seem to keep a hold on files even after writing or reading data from them. What you can do is use dumps or loads to get the string representation and then use a normal file.write() or file.read() on the result.
For example:
with open('file_path.json', 'w') as file:
    file.write(json.dumps(json_data))
os.remove('file_path.json')
Not the best alternative, but it saves me a lot of trouble, especially when using a temp dir.
You need to edit the remove part so it handles the non-existing-file case gracefully.
import errno
import json
import os

try:
    fn = 'data.tmp'
    with open(fn) as f:
        data = json.load(f)
except Exception:
    try:
        if os.stat(fn).st_size > 0:
            os.remove(fn)
    except OSError as e: # this would be "except OSError, e:" before Python 2.6
        if e.errno != errno.ENOENT:
            raise
See also Most pythonic way to delete a file which may not exist.
You could extract the silent removal into a separate function.
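A sketch of such a helper (silent_remove is an illustrative name, not from the answer above):

import errno
import os

def silent_remove(filename):
    try:
        os.remove(filename)
    except OSError as e:
        if e.errno != errno.ENOENT:  # ENOENT means the file was already gone
            raise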
Also, from the same other SO question:
# python3.4 and above
import contextlib, os

try:
    fn = 'data.tmp'
    with open(fn) as f:
        data = json.load(f)
except Exception:
    with contextlib.suppress(FileNotFoundError):
        if os.stat(fn).st_size > 0:
            os.remove(fn)
I personally like the latter approach better - it's explicit.

pickle.dump dumps nothing when appending to file

The user may give a bunch of URLs as command-line args. All URLs given in the past are serialized with pickle. The script checks all given URLs; if they are unique, they are serialized and appended to a file. At least, that's what should be happening. Nothing is being appended. However, when I open the file in write mode, the new, unique URL is written. So what gives? Code is:
def get_new_urls():
    if(len(urls.URLs) != 0): # check if empty
        with open(urlFile, 'rb') as f:
            try:
                cereal = pickle.load(f)
                print(cereal)
                toDump = []
                for arg in urls.URLs:
                    if (arg in cereal):
                        print("Duplicate URL {0} given, ignoring it.".format(arg))
                    else:
                        toDump.append(arg)
            except Exception as e:
                print("Holy bleep something went wrong: {0}".format(e))
    return(toDump)

urlsToDump = get_new_urls()
print(urlsToDump)

# TODO: append new URLs
if(urlsToDump):
    with open(urlFile, 'ab') as f:
        pickle.dump(urlsToDump, f)

# TODO check HTML of each page against the serialized copy
with open(urlFile, 'rb') as f:
    try:
        cereal = pickle.load(f)
        print(cereal)
    except EOFError: # your URL file is empty, bruh
        pass
Pickle writes the data you give it in a special format; every call to dump writes its own header/metadata/etc. to the file.
Appending is not intended to work this way: concatenating pickle streams doesn't really make sense, because a single pickle.load only reads the first stream in the file, so your appended data is never seen. To achieve a concatenation of your data, you'd need to first read whatever is in the file into your urlsToDump, then update urlsToDump with any new data, and finally dump it out again (overwriting the whole file, not appending).
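A sketch of that read-modify-overwrite pattern (update_urls is an illustrative helper name; url_file plays the role of the question's urlFile):

import pickle

def update_urls(url_file, new_urls):
    # Read the existing list; a missing or empty file yields an empty list.
    try:
        with open(url_file, 'rb') as f:
            known = pickle.load(f)
    except (FileNotFoundError, EOFError):
        known = []
    # Keep only URLs we have not seen before.
    known.extend(u for u in new_urls if u not in known)
    # Overwrite the whole file with one fresh pickle stream.
    with open(url_file, 'wb') as f:
        pickle.dump(known, f)
    return known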
After
with open(urlFile, 'rb') as f:
you need a while loop, to repeatedly unpickle (repeatedly read) from the file until hitting EOF.
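A sketch of that loop, keeping the question's urlFile name: each pickle.dump that was appended needs its own pickle.load, and EOFError signals that the last stream has been read:

import pickle

all_urls = []
with open(urlFile, 'rb') as f:
    while True:
        try:
            all_urls.extend(pickle.load(f))  # one load per appended dump
        except EOFError:
            break  # past the last pickle stream
print(all_urls)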

exception stop read files python

I'm trying to handle exceptions when reading files, but I have a problem. I'm new to Python and have not yet worked out how to catch an exception and still continue reading from the remaining files I am accessing. This is my code:
import errno
import sys

class Read:
    #FIXME do immutables this 2 const
    ROUTE = "d:\Profiles\user\Desktop\\"
    EXT = ".txt"

    def setFileReaded(self, fileToRead):
        content = ""
        try:
            infile = open(self.ROUTE+fileToRead+self.EXT)
        except FileNotFoundError as error:
            if error.errno == errno.ENOENT:
                print ("File not found, please check the name and try again")
            else:
                raise
            sys.exit()
        with infile:
            content = infile.read()
            infile.close()
        return content
And from another class I call it:
read = Read()
print(read.setFileReaded("verbs"))
print(read.setFileReaded("object"))
print(read.setFileReaded("sites"))
print(read.setFileReaded("texts"))
But it only prints this:
turn on
connect
plug
File not found, please check the name and try again
And it does not continue with the next files. How can the program keep reading all the files?
It's a little difficult to understand exactly what you're asking here, but I'll try and provide some pointers.
sys.exit() will terminate the Python script gracefully. In your code, it is called when the FileNotFoundError exception is caught. Nothing further will be run after this, because your script will have terminated. So none of the other files will be read.
Another thing to point out is that you close the file after reading it, which is not needed when you open it like this:
with open('myfile.txt') as f:
    content = f.read()
The file will be closed automatically after the with block.
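Putting both points together, a sketch of the method that reports the problem and keeps going; note the ROUTE string also needs escaped backslashes (or a raw string), since "\u" in a normal string literal is an escape sequence in Python 3:

class Read:
    ROUTE = "d:\\Profiles\\user\\Desktop\\"
    EXT = ".txt"

    def setFileReaded(self, fileToRead):
        try:
            with open(self.ROUTE + fileToRead + self.EXT) as infile:
                return infile.read()  # the with block closes the file for us
        except FileNotFoundError:
            print("File not found, please check the name and try again")
            return ""  # return instead of sys.exit(), so later calls still run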

Python shell freezes on reading (fasta) file

I am going to start of by showing the code I have thus far:
def err(em):
    print(em)
    exit

def rF(f):
    s = ""
    try:
        fh = open(f, 'r')
    except IOError:
        e = "Could not open the file: " + f
        err(e)
    try:
        with fh as ff:
            next(ff)
            for l in ff:
                if ">" in l:
                    next(ff)
                else:
                    s += l.replace('\n','').replace('\t','').replace('\r','')
    except:
        e = "Unknown Exception"
        err(e)
    fh.close()
    return s
For some reason the Python shell (I am using 3.2.2) freezes up whenever I try to read a file by typing:
rF("mycobacterium_bovis.fasta")
The conditionals in the rF function are to prevent reading each line that starts with a ">" token. These lines aren't DNA/RNA code (which is what I am trying to read from these files) and should be ignored.
I hope someone can help me out with this; I don't see my error.
As per the usual, MANY thanks in advance!
EDIT:
*The problem persists!*
This is the code I now use. I removed the error handling, which was a fancy addition anyway, but the shell still freezes whenever it attempts to read a file:
def rF(f):
    s = ""
    try:
        fh = open(f, 'r')
    except IOError:
        print("Err")
    try:
        with fh as ff:
            next(ff)
            for l in ff:
                if ">" in l:
                    next(ff)
                else:
                    s += l.replace('\n','').replace('\t','').replace('\r','')
    except:
        print("Err")
    fh.close()
    return s
You didn't ever define e.
So you'll get a NameError that is being hidden by the naked except:.
This is why it is good and healthy to specify the exception, e.g.:
try:
    print(e)
except NameError as e:
    print(e)
In cases like yours, though, when you don't necessarily know what the exception will be, you should at least use this method of displaying information about the error:
import sys

try:
    print(e)
except: # catch *all* exceptions
    e = sys.exc_info()[1]
    print(e)
Which, using the original code you posted, would have printed the following:
name 'e' is not defined
Edit based on updated information:
Concatenating a string like that is going to be quite slow if you have a large file.
Consider instead writing the filtered information to another file, e.g.:
def rF(f):
    with open(f,'r') as fin, open('outfile','w') as fou:
        next(fin)
        for l in fin:
            if ">" in l:
                next(fin)
            else:
                fou.write(l.replace('\n','').replace('\t','').replace('\r',''))
I have tested that the above code works on a FASTA file based on the format specification listed here: http://en.wikipedia.org/wiki/FASTA_format using Python 3.2.2 [GCC 4.6.1] on linux2.
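If you do need the sequence as one string in memory rather than in a second file, collecting the lines in a list and joining once at the end avoids the slow repeated concatenation. A sketch; it skips header lines with startswith('>') instead of calling next() inside the loop, which sidesteps running next() past the end of the file:

def rF(f):
    parts = []
    with open(f, 'r') as fin:
        for l in fin:
            if l.startswith(">"):
                continue  # skip FASTA header lines
            parts.append(l.replace('\n', '').replace('\t', '').replace('\r', ''))
    return ''.join(parts)  # one join instead of repeated += on a string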
A couple of recommendations:
Start small. Get a simple piece working then add a step.
Add print() statements at trouble spots.
Also, consider including more information about the contents of the file you're attempting to parse. That may make it easier for us to help.
