Python 'with' command - python

Is this code
with open(myfile) as f:
    data = f.read()
    process(data)
equivalent to this one
try:
    f = open(myfile)
    data = f.read()
    process(f)
finally:
    f.close()
or the following one?
f = open(myfile)
try:
    data = f.read()
    process(f)
finally:
    f.close()
This article: http://effbot.org/zone/python-with-statement.htm suggests (if I understand it correctly) that the latter is true. However, the former would make more sense to me. If I am wrong, what am I missing?

According to the documentation:
A new statement is proposed with the syntax:
with EXPR as VAR:
    BLOCK
The translation of the above statement is:
mgr = (EXPR)
exit = type(mgr).__exit__  # Not calling it yet
value = type(mgr).__enter__(mgr)
exc = True
try:
    try:
        VAR = value  # Only if "as VAR" is present
        BLOCK
    except:
        # The exceptional case is handled here
        exc = False
        if not exit(mgr, *sys.exc_info()):
            raise
        # The exception is swallowed if exit() returns true
finally:
    # The normal and non-local-goto cases are handled here
    if exc:
        exit(mgr, None, None, None)
This is an extended version of your second code snippet: the initialization happens before the try ... finally block.

It's equivalent to the latter one, because until open() successfully returns, f has no value, and should not be closed.
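The difference between the two try/finally variants shows up when both are pointed at a file that does not exist. The following is a minimal sketch: the names first_variant, second_variant and error_name, and the temporary path, are made up for illustration and are not part of the question.

```python
import os
import tempfile


def first_variant(path):
    # open() sits inside the try block, so the finally clause runs
    # even when open() itself fails and f was never assigned.
    try:
        f = open(path)
        return f.read()
    finally:
        f.close()


def second_variant(path):
    # open() sits before the try block, which matches what `with` does:
    # if open() fails there is nothing to close and no cleanup runs.
    f = open(path)
    try:
        return f.read()
    finally:
        f.close()


def error_name(func, path):
    # Return the class name of the exception func(path) raises, or None.
    try:
        func(path)
    except Exception as exc:
        return type(exc).__name__
    return None


# Hypothetical path inside a fresh temporary directory, so it cannot exist.
missing = os.path.join(tempfile.mkdtemp(), "missing.txt")

first_error = error_name(first_variant, missing)
second_error = error_name(second_variant, missing)
print(first_error)   # the failed open() is masked by the failing f.close()
print(second_error)  # the original "file not found" error propagates
```

In the first variant the cleanup code itself fails because f has no value, which hides the real cause of the failure. The second variant, like the with statement, only registers the cleanup once the resource exists.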

Related

python 2.6 linecache.getline() and stdin. How does it work?

I have a script which runs through lines of input to find the occurrence of an ID string while keeping track of the linenumber.
Then it runs backwards up the input to trace parentID/childID relationships. The script accepts either a logfile using a '-f' flag as an argument or the contents of stdin from a pipe.
The logfile as input portion works just fine, but reading from stdin seems not to work.
For the sake of reasonable clarity I've included the portion of the script that this concerns, but don't expect to be able to run it. It's just to show you sorta whats going on (anyone who works in financial services around FIX protocol would recognize a few things):
import os
import sys
import linecache
from types import *
from ____ import FixMessage # custom message class that is used throughout
# Feel free to ignore all the getArgs and validation crap
def getArgs():
    import argparse
    parser = argparse.ArgumentParser(
        description='Get amendment history.')
    parser.add_argument('-f', '--file',
        help="input logfile.'")
    args = parser.parse_args()
    return validateArgs(args)
def validateArgs(args):
    try:
        if sys.stdin.isatty():
            if args.file:
                assert os.path.isfile(args.file.strip('\n')), \
                    'File "{0}" does not exist'.format(args.file)
                args.file = open(args.file, 'r')
        else:
            args.file = sys.stdin
        assert args.file, \
            "Please either include a file with '-f' or pipe some text in"
    except AssertionError as err:
        print err
        exit(1)
    return args
def getMessageTrail(logfile, orderId):
    # some input validation
    if isinstance(logfile, StringType):
        try: logfile = open(logfile, 'r')
        except IOError as err: exit(1)
    elif not isinstance(logfile, FileType):
        raise TypeError(
            'Expected FileType and got {0}'.format(type(logfile)))
    linenum = 0
    # This retrieves the message containing the orderID as well as the linenum
    for line in logfile:
        linenum += 1
        if orderId in line:
            # FixMessage is a custom class that is treated here like
            # a dictionary with some metadata
            # Missing dict keys return 'None'
            # .isvalid is bool results of some text validation
            # .direction is either incoming or outgoing
            # thats all you really need to know
            msg = FixMessage(line)
            if msg.isvalid and msg.direction == 'Incoming':
                yield msg
                break
    # If there is a message parentID, it would be in msg['41']
    if msg['41']:
        messages = findParentMessages(logfile, linenum, msg['41'])
        for msg in messages: yield msg
def findParentMessages(logfile, startline, targetId):
    # Some more input validation
    assert isinstance(logfile, FileType)
    assert isinstance(startline, IntType)
    assert isinstance(targetId, StringType)
    # should just make a integer decrementing generator,
    # but this is fine for the example
    for linenum in range(startline)[::-1]:
        # *** This is where the question lies... ***
        # print(logfile.name) # returns "<stdin>"
        line = linecache.getline(logfile.name, linenum)
        if 'Incoming' in line and '11=' + targetId in line:
            msg = FixMessage(line)
            yield msg
            if msg['41']: findParentMessages(logfile, linenum, msg['41'])
            else: break
def main():
    log = getArgs().file
    trail = getMessageTrail(log, 'ORDER123')
if __name__ == '__main__': main()
The question is: how does linecache.getline work when it comes to reading stdin as a file? Is it different from how it would work if given a regular filename?
linecache.getline() accepts a file name, not a file object. It is not designed to work with stdin, because the filename is passed to calls like open() and os.stat(), and "<stdin>" does not name a real file on disk.
For reference: https://github.com/python/cpython/blob/2.6/Lib/linecache.py
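Since linecache needs a real file on disk, one way to walk backwards through piped input is to read it into a list first and index that list. This is only a sketch, not the asker's code: find_backwards is a hypothetical helper, the sample lines are invented, and io.StringIO stands in for sys.stdin.

```python
import io


def find_backwards(lines, start, needle):
    """Search upwards from the line just above `start` (1-based).

    Returns (line_number, line) for the nearest earlier line that
    contains `needle`, or None if no such line exists.
    """
    for linenum in range(start - 1, 0, -1):
        line = lines[linenum - 1]  # the list is 0-based, line numbers are 1-based
        if needle in line:
            return linenum, line
    return None


# Invented sample input; a real script would use sys.stdin instead.
stream = io.StringIO(u"11=A\n11=B 41=A\n11=C 41=B\n")

# Buffer the whole stream once so any line can be revisited later.
lines = stream.readlines()

found = find_backwards(lines, 3, "11=B")
not_found = find_backwards(lines, 1, "11=A")
print(found)
print(not_found)
```

With real piped input, lines = sys.stdin.readlines() would replace the StringIO stand-in. This keeps the whole input in memory, which may not suit very large logs.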

How to catch unhandled error in Deferred originating from threads.deferToThread()

I'm getting the error "Unhandled error in Deferred:".
Can anybody help me handle this?
@inlineCallbacks
def start(self):
    # First we try Avahi, if it fails we fallback to Bluetooth because
    # the receiver may be able to use only one of them
    log.info("Trying to use this code with Avahi: %s", self.userdata)
    key_data = yield threads.deferToThread(self.discovery.find_key, self.userdata)
    if key_data and not self.stopped:
        success = True
        message = ""
        returnValue((key_data, success, message))
    if self.bt_code and not self.stopped:
        # We try Bluetooth, if we have it
        log.info("Trying to connect to %s with Bluetooth", self.bt_code)
        self.bt = BluetoothReceive(self.bt_port)
        msg_tuple = yield self.bt.find_key(self.bt_code, self.mac)
        key_data, success, message = msg_tuple
        if key_data:
            # If we found the key
            returnValue((key_data, success, message))
The error is thrown at this line:
key_data = yield threads.deferToThread(self.discovery.find_key, self.userdata)
This is the approach that makes sense for most developers using inlineCallbacks: wrap the yield in a try/except block.
try:
    key_data = yield threads.deferToThread(self.discovery.find_key, self.userdata)
except Exception as e:
    log.exception('Unable to get key_data')
    returnValue(e)
Another way would be to chain callbacks using addCallback (success) and addErrback (failure). You should be able to do something like this:
d = threads.deferToThread(self.discovery.find_key, self.userdata) # notice there's no yield
d.addCallback(success_callback)
d.addErrback(failure_callback)
key_data = yield d
Helpful Links
http://twistedmatrix.com/documents/current/core/howto/defer-intro.html#simple-failure-handling
http://twistedmatrix.com/documents/current/core/howto/threading.html
Per the inlineCallbacks documentation, you can handle this case with a try/except statement. For example:
@inlineCallbacks
def getUsers(self):
    try:
        responseBody = yield makeRequest("GET", "/users")
    except ConnectionError:
        log.failure("makeRequest failed due to connection error")
        returnValue([])
    returnValue(json.loads(responseBody))
Therefore, replace your key_data = yield ... line with:
try:
    key_data = yield threads.deferToThread(self.discovery.find_key, self.userdata)
except SomeExceptionYouCanHandle:
    # Some exception handling code

Exception not caught in multiprocessing

I'm using the multiprocessing module to process files in parallel, which works perfectly fine almost every time.
I've also wrapped the processing in a try/except block to catch any exception.
I've come across a situation where the except block doesn't catch the exception.
Since the code is huge, I'm only posting the relevant block that causes the problem.
def reader(que, ip, start, end, filename):
    """ Reader function checks each line of the file
    and if the line contains any of the ip addresses which are
    being scanned, then it writes to its buffer.
    If the line field doesn't match date string it skips the line.
    """
    logging.info("Processing : %s" % os.path.basename(filename))
    ip_pat = re.compile("(\d+\.\d+\.\d+\.\d+\:\d+)")
    chunk = 10000000 # taking chunk of 10MB data
    buff = ""
    with bz2.BZ2File(filename,"rb", chunk) as fh: # open the compressed file
        for line in fh:
            output = []
            fields = line.split()
            try:
                ts = fields[1].strip() + "/" +fields[0]+"/"+fields[3].split("-")[0]+" "+fields[2]
                times = da.datetime.strptime(ts,"%d/%b/%Y %H:%M:%S")
                if times < start:
                    continue
                if times > end:
                    break
                ips = re.findall(ip_pat,line)
                if len(ips) < 3:
                    continue
                if ips[0].split(":")[0] == ip:
                    output.append(times.strftime("%d/%m/%Y %H:%M:%S"))
                    status = "SESSION_OPEN" if "SESSION_OPEN" in line or "CREATE" in line else "SESSION_CLOSE"
                    protocol = "TCP" if "TCP" in line else "UDP"
                    output.append(status)
                    output.append(protocol)
                    ips[1], ips[2] = ips[2], ips[1]
                    output.extend(ips)
                    res = "|".join(output)
                    buff += res + "\n"
            except IndexError, ValueError:
                continue
    logging.info("Processed : %s of size [ %d ]" % (os.path.basename(filename), os.path.getsize(filename)))
    if buff:
        que.put((ip,buff))
    return buff
This is the error I get:
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 554, in get
    raise self._value
ValueError: time data '2/Dec/20 10:59:59' does not match format '%d/%b/%Y %H:%M:%S'
What I don't understand is why the exception is not caught, even though I've listed ValueError in the except block.
What's the best way to get around this problem?
Provide the multiple exceptions as a tuple:
except (IndexError, ValueError):
    continue
The relevant doc is https://docs.python.org/2/tutorial/errors.html#handling-exceptions
Snippet from the page:
Note that the parentheses around this tuple are required, because except ValueError, e: was the syntax used for what is normally written as except ValueError as e: in modern Python (described below). The old syntax is still supported for backwards compatibility. This means except RuntimeError, TypeError is not equivalent to except (RuntimeError, TypeError): but to except RuntimeError as TypeError: which is not what you want.
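To illustrate the tuple form, here is a small sketch that parses timestamps in the same format as the question and skips the ones that fail. The function name parse_all and the sample timestamps are assumptions made for the example. The code uses only the parenthesized form, which is valid in both Python 2 and Python 3 (the unparenthesized form is a syntax error in Python 3).

```python
from datetime import datetime

FORMAT = "%d/%b/%Y %H:%M:%S"


def parse_all(stamps):
    """Parse each timestamp, counting the ones that do not match FORMAT."""
    parsed = []
    skipped = 0
    for ts in stamps:
        try:
            parsed.append(datetime.strptime(ts, FORMAT))
        except (IndexError, ValueError):
            # The tuple makes this clause catch either exception type.
            skipped += 1
            continue
    return parsed, skipped


# Invented sample data: the second entry has a two-digit year,
# like the value in the traceback, so strptime raises ValueError.
parsed, skipped = parse_all(["02/Dec/2015 10:59:59", "2/Dec/20 10:59:59"])
print(len(parsed), skipped)
```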

Python Exception Handling within a loop

An exception occurs when my program can't find the element it's looking for. I want to log the event in the CSV, display a message that the error occurred, and continue. I have successfully logged the event in the CSV and displayed the message, but then my program jumps out of the loop and stops. How can I instruct Python to continue? Please check out my code.
sites = ['TCF00670','TCF00671','TCF00672','TCF00674','TCF00675','TCF00676','TCF00677']
with open('list4.csv','wb') as f:
    writer = csv.writer(f)
    try:
        for s in sites:
            adrs = "http://turnpikeshoes.com/shop/" + str(s)
            driver = webdriver.PhantomJS()
            driver.get(adrs)
            time.sleep(5)
            LongDsc = driver.find_element_by_class_name("productLongDescription").text
            print "Working.." + str(s)
            writer.writerows([[LongDsc]])
    except:
        writer.writerows(['Error'])
        print ("Error Logged..")
        pass
    driver.quit()
print "Complete."
Just put the try/except block inside the loop, so that a failure only aborts the current iteration. There is also no need for the pass statement at the end of the except block.
with open('list4.csv','wb') as f:
    writer = csv.writer(f)
    for s in sites:
        try:
            adrs = "http://turnpikeshoes.com/shop/" + str(s)
            driver = webdriver.PhantomJS()
            driver.get(adrs)
            time.sleep(5)
            LongDsc = driver.find_element_by_class_name("productLongDescription").text
            print "Working.." + str(s)
            writer.writerows([[LongDsc]])
        except:
            writer.writerows([['Error']])
            print ("Error Logged..")
NOTE: A bare except is generally bad practice, because it also catches KeyboardInterrupt and SystemExit. Catch a particular exception class instead, e.g. except Exception:...
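The effect of moving the try/except inside the loop can be shown without Selenium. In this sketch, collect_rows and fake_fetch are hypothetical names, and fake_fetch merely simulates a page lookup that fails for one site.

```python
def collect_rows(items, fetch):
    """Call fetch(item) for every item, recording 'Error' on failure."""
    rows = []
    for item in items:
        try:
            rows.append([fetch(item)])
        except Exception:
            # Only this iteration is abandoned; the loop carries on.
            rows.append(["Error"])
    return rows


def fake_fetch(site):
    # Hypothetical stand-in for the Selenium lookup; fails for one site.
    if site == "TCF00671":
        raise LookupError("element not found")
    return "description of " + site


rows = collect_rows(["TCF00670", "TCF00671", "TCF00672"], fake_fetch)
print(rows)
```

Every site produces exactly one row, so the output stays aligned with the input list even when some lookups fail.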

Inputting a String Argument as a Variable in function giving a NameError

I have this code block that should return the CIK number when a stock ticker is supplied:
def lookup_cik(ticker, name=None):
    good_read = False
    ticker = ticker.strip().upper()
    url = 'http://www.sec.gov/cgi-bin/browse-edgar?action+getcompany&CIK=(cik)&count=10&output=xml'.format(cik=ticker)
    try:
        xmlFile = urlopen ( url )
        try:
            xmlData = xmlFile.read()
            good_read = True
        finally:
            xmlFile.close()
    except HTTPError as e:
        print( "HTTP Error:", e.code )
    except URLError as e:
        print( "URL Error:", e.reason )
    except TimeoutError as e:
        print( "Timeout Error:", e.reason )
    except socket.timeout:
        print( "Socket Timeout Error" )
    if not good_read:
        print( "Unable to lookup CIK for ticker:", ticker )
        return
    try:
        root = ET.fromstring(xmlData)
    except ET.ParseError as perr:
        print( "XML Parser Error:", perr )
    try:
        cikElement = list(root.iter( "CIK" ))[0]
        return int(cikElement.text)
    except StopIteration:
        pass
However, when I try to pass in a stock ticker I get:
>>> lookup_cik(BDX)
Traceback (most recent call last):
  File "<pyshell#34>", line 1, in <module>
    lookup_cik(BDX)
NameError: name 'BDX' is not defined
I know that it is a NameError, but I have never run into a case where a function does not recognize the argument passed to it, here the stock ticker BDX.
Your function expects a string, so pass in one:
lookup_cik("BDX")
Without the quotes Python parses that as a name, but you never bound anything to that name (assigned to it).
Note that you'll also get an UnboundLocalError: local variable 'root' referenced before assignment exception if there was a parse error. You probably want to exit the function at that point:
try:
    root = ET.fromstring(xmlData)
except ET.ParseError as perr:
    print( "XML Parser Error:", perr )
    return
You'll most likely get a parse error, because you never actually interpolate the ticker anywhere in the string; you are missing a {cik} placeholder:
url = 'http://www.sec.gov/cgi-bin/browse-edgar?action+getcompany&CIK=(cik)&count=10&output=xml'.format(cik=ticker)
You probably meant to use CIK={cik} there. A quick experiment directly calling the site also shows you need to use action=getcompany (= instead of +):
url = 'http://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK={cik}&count=10&output=xml'.format(cik=ticker)
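Because you use list() on root.iter(), the whole expression will never raise StopIteration (list() consumes it). If there is no CIK element, indexing the empty list raises an IndexError instead.
I'd use next() there. Test the result against None explicitly, because an Element without children is falsy, so cikElement and ... would return the element itself:
cikElement = next(root.iter("CIK"), None)
return int(cikElement.text) if cikElement is not None else None
or better still, just use Element.find(). The .// prefix searches all descendants, as root.iter() does, whereas a bare "CIK" would only match direct children of the root:
cikElement = root.find(".//CIK")
return int(cikElement.text) if cikElement is not None else None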
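Putting the parsing advice together, a self-contained sketch might look like the following. The function name extract_cik and the SAMPLE document are assumptions for illustration; the real EDGAR response may be structured differently, and no network request is made here.

```python
import xml.etree.ElementTree as ET

# Invented sample document; the element layout is an assumption.
SAMPLE = (
    "<companyFilings><companyInfo>"
    "<CIK>0000010795</CIK>"
    "</companyInfo></companyFilings>"
)


def extract_cik(xml_text):
    """Return the first CIK in xml_text as an int, or None if unavailable."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Leave early so `root` is never used while unassigned.
        return None
    # ".//CIK" searches all descendants, not just direct children.
    element = root.find(".//CIK")
    # Compare against None: an Element without children is falsy.
    if element is None or element.text is None:
        return None
    return int(element.text)


cik = extract_cik(SAMPLE)
missing = extract_cik("<companyFilings/>")
broken = extract_cik("this is not XML")
print(cik, missing, broken)
```

Returning None in every failure case gives the caller a single value to check, instead of a mix of exceptions and implicit returns.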
