How to intercept multiple warnings from a Python function? - python

Here's a bit of Python pseudocode to illustrate a problem (I'm working in Python 3.8, although I doubt that matters much):
import warnings
from some_library import function, LibraryWarning  # (a library that I do not control)
warning_log = []
with warnings.catch_warnings():
    warnings.simplefilter("error")
    try:
        result = function()
    except LibraryWarning as w:
        warning_log.append(w)
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
            result = function()
The main purpose of this code is to produce result without any warnings cluttering up the screen. It accomplishes that. However, I also want to document any warnings that occur for later review. This code accomplishes that too, provided that function() only generates one warning. However, sometimes function() produces TWO warnings. Since I have to switch to ignoring warnings to get result, I never see the second warning and I can't append it to my log.
How can I perform a full accounting of all warnings generated, while still obtaining my final result?
Thanks for your advice.

def a_warning_function():
    warning_log = []
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        try:
            result = function()
            return result
        except LibraryWarning as w:
            warning_log.append(w)
            with warnings.catch_warnings():
                warnings.simplefilter("ignore")
                result = function()
                return result
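Another option, sketched here as an assumption rather than as part of the answer above, is to let the standard library record the warnings instead of turning them into exceptions. warnings.catch_warnings(record=True) collects every warning raised inside the block in a list, so the function only has to run once and nothing is printed. The helper name call_and_log_warnings and the function noisy are hypothetical stand-ins for the code in the question.

```python
import warnings


def call_and_log_warnings(func, *args, **kwargs):
    """Run func once; return its result plus every warning it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        # "always" disables the once-per-location suppression, so
        # repeated warnings are recorded too.
        warnings.simplefilter("always")
        result = func(*args, **kwargs)
    return result, list(caught)


def noisy():
    """Hypothetical stand-in for the library function in the question."""
    warnings.warn("first problem", UserWarning)
    warnings.warn("second problem", RuntimeWarning)
    return 42


result, log = call_and_log_warnings(noisy)
for w in log:
    # Each entry is a warnings.WarningMessage with category, message,
    # filename and lineno attributes.
    print(w.category.__name__, w.message)
```

Because the warnings are recorded rather than raised, function() is not interrupted after the first one, which is what made the second warning invisible in the original approach.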

Related

Try/Except not working with BeautifulSoup

I am trying to loop over a series of pages and extract some info. However, in certain pages some exceptions occur and I need to deal with them. I created the following function to try to deal with them. See below:
def iferr(x):
    try:
        x
    except (Exception, TypeError, AttributeError) as e:
        pass
I intend to use as part of code like this:
articles = [[iferr(dp[0].find('span', class_='citation')),\
             iferr(dp[0].find('div', class_='abstract')),\
             iferr(dp[0].find('a', rel='nofollow')['href'])] for dp in data]
The idea is that if, for example, dp[0].find('a', rel='nofollow')['href'] leads to an error (fails), it will simply ignore it (fill it with a blank or a None).
However, whenever an error/exception occurs in one of the three elements, it does not 'pass'. It just tells me that the error has occurred. The errors it displays are those I listed in the 'except' clause, which I assumed would be dealt with.
EDIT:
Per Michael's suggestion, I was able to see that, because of the order in which iferr is processed, the error would always be raised before the try. So I worked on a workaround:
def fndwoerr(d,x,y,z,h):
    try:
        if not h:
            d.find('x',y = 'z')
        else:
            d.find('x',y = 'z')['h']
    except (Exception, TypeError, AttributeError) as e:
        pass
...
articles = [[fndwoerr(dp[0],'span','class_','citation',None),\
             fndwoerr(dp[0],'div','class_','abstract',None),\
             fndwoerr(dp[0], 'a', 'rel','nofollow','href')] for dp in data]
Now it runs without prompting an error. However, everything returned becomes None. I am pretty sure it has to do with the way the parameters are entered. y should not be passed as a string in the find function, whereas z should. However, I input both as strings when I call the function. How can I go about this?
The example looks a bit strange, so it would be a good idea to improve the question so that we can reproduce your issue easily. You may want to read how to create a minimal, reproducible example.
The idea is that if, for example, dp[0].find('a',
rel='nofollow')['href'] leads to an error (fails), it will simply
ignore it (fill it with a blank or a None).
What about checking if element is available with an if-statement?
dp[0].find('a', rel='nofollow').get('href') if dp[0].find('a', rel='nofollow') else None
or with the walrus operator from Python 3.8:
l.get('href') if (l:=dp[0].find('a', rel='nofollow')) else None
Example
from bs4 import BeautifulSoup
soup = BeautifulSoup('<h1>This is a Heading</h1>', 'html.parser')
for e in soup.select('h1'):
    print(e.find('a').get('href') if e.find('a') else None)
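If a try/except wrapper is still preferred, the lookup has to be deferred so that it runs inside the try. The original iferr(x) fails because Python evaluates the argument before the function is ever called. A minimal sketch, assuming a hypothetical helper named safe and using a plain dict and None as stand-ins for what find() returns:

```python
def safe(fn, default=None):
    """Call fn() and return default if the lookup raises."""
    try:
        return fn()
    except (AttributeError, TypeError, KeyError):
        return default


# Stand-ins for find() results: a tag-like mapping with an href,
# and None, which is what find() returns when nothing matches.
present = {"href": "/article/1"}
missing = None

# The lambda delays evaluation until safe() calls it inside the try.
links = [safe(lambda: tag["href"]) for tag in (present, missing)]
print(links)
```

With BeautifulSoup the call would look like safe(lambda: dp[0].find('a', rel='nofollow')['href']), so a missing element yields None instead of an exception.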

Python: why is warning printed twice following temporary suppression?

It seems that temporarily suppressing warnings makes repeat warnings outside of the context manager display repeatedly.
Example:
import warnings
def f():
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=Warning)
        print("A")
    print("B")
    warnings.warn("My warning")
f()
f()
Output:
A
B
tmp2.py:10: UserWarning: My warning
warnings.warn("My warning")
A
B
tmp2.py:10: UserWarning: My warning
warnings.warn("My warning")
Also, it does not seem to matter what action and category I give to simplefilter.
On the other hand, if I comment out the context catch_warnings block,
then the warning only displays once (as intended).
Why? Is it a bug? What am I missing?
This is a known bug
also asked about here
with a PR that seems pretty stale

Run a dynamic multi-line string of code in python at runtime

How can I run a multi-line string of code from inside Python? The string is generated at runtime and stored in either a string or an array.
Background
I have a function which dynamically generates random code snippets in high volume.
I wish to pass this code to a variable/string and evaluate syntax/return code.
Code can not be imported from file/api request etc as focus is on high throughput and speed.
Example
# my function returns random code snippets
myCode = myRandomCodeGenerator()
# I wish to perform some in-flight validation (ideally RC or syntax)
someTestingFunction(myCode)
My attempts so far
I have seen solutions such as the one below, however since my code is generated dynamically I have issues formatting. I have tried generating \n and \r or adding """ in the string to compensate with no luck.
code = """
def multiply(x,y):
    return x*y
print('Multiply of 2 and 3 is: ',multiply(2,3))
"""
exec(code)
I have also tried using the call function on a string, which yields similar formatting issues.
So far the best I can do is to perform a line by line syntax check and feed it a line at a time as shown below.
import codeop
def is_valid_code(line):
    try:
        codeop.compile_command(line)
    except SyntaxError:
        return False
    else:
        return True
I suspect there may be some syntax tricks I could apply to preserve formatting of indentation and returns. I am also aware of the risk of generating dynamic code in runtime, and have a filter of allowed terms in the function.
Another way, instead of calling eval or exec on the raw string, is to compile the code first and then exec the compiled object:
Example:
import contextlib,sys
from io import StringIO
@contextlib.contextmanager
def stdoutIO(stdout=None):
    # Temporarily redirect sys.stdout so printed output can be captured
    old = sys.stdout
    if stdout is None:
        stdout = StringIO()
    sys.stdout = stdout
    try:
        yield stdout
    finally:
        sys.stdout = old
def run_code(override_kale_blocks):
    # Compile every block first so syntax errors surface before anything runs
    compiled123 = []
    for b123 in override_kale_blocks:
        compiled123.append(compile(b123,"<string>","exec"))
    # One shared namespace so names defined in one block are visible to the next
    namespace = {}
    with stdoutIO() as s:
        for c123 in compiled123:
            exec(c123, namespace)
    return s.getvalue()
block0='''
import time
a=5
b=6
b=a+b
'''
block1='''
b=a+b
'''
block2="print(b)"
blocksleep='''
print('startsleep')
time.sleep(1)
print('donesleep')
'''
pc = (block0,blocksleep,block1,block2)
cb = []
print('before')
output= run_code(pc)
print(output)
print('after')
print("Hello World!\n")
source:
https://stackoverflow.com/a/3906390/1211174
The question was answered by Joran Beasley.
Essentially, my issue was that I was adding a space after \n, as other compilers get confused otherwise; Python, however, takes this space into account.
Thus, if you add a space after a newline in a def statement, it will produce a syntax error.
I have kept this in case others encounter similar issues.
Try using '\t' instead of placing tabs and spaces for the indentation, that may preserve your formatting.
code = """
def multiply(x,y):
\t return x*y
print('Multiply of 2 and 3 is: ',multiply(2,3))
"""
exec(code)
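For the validation part of the question, the line-by-line check can be replaced by compiling the whole snippet at once, since the built-in compile() parses a multi-line string without executing it. The sketch below is an assumption about what "syntax validation" should mean here; the name is_valid_source and the two sample strings are hypothetical. The second sample has a stray leading space on its last line, the kind of whitespace problem described above.

```python
def is_valid_source(source):
    """Return True if source parses as a complete Python module."""
    try:
        # compile() only parses and builds a code object; nothing is run.
        compile(source, "<generated>", "exec")
    except (SyntaxError, ValueError):
        # IndentationError is a subclass of SyntaxError; ValueError covers
        # sources containing null bytes on some Python versions.
        return False
    return True


good = "def multiply(x, y):\n    return x * y\nprint(multiply(2, 3))\n"
# Same code, but the last line starts with a single stray space.
bad = "def multiply(x, y):\n    return x * y\n print(multiply(2, 3))\n"

print(is_valid_source(good), is_valid_source(bad))
```

The code object returned by compile() can be kept and passed to exec() later, which avoids parsing the same snippet twice.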

Catch OptimizeWarning as an exception

I was just trying to catch an OptimizeWarning thrown by the scipy.optimize.curve_fit function, but I realized it was not recognized as a valid exception.
This is a non-working simple idea of what I'm doing:
from scipy.optimize import curve_fit
try:
    popt, pcov = curve_fit(some parameters)
except OptimizeWarning:
    print 'Maxed out calls.'
    # do something
I looked around the docs but there was nothing there.
Am I missing something obvious or is it simply not defined for some reason?
BTW, this is the full warning I get and that I want to catch:
/usr/local/lib/python2.7/dist-packages/scipy/optimize/minpack.py:604: OptimizeWarning: Covariance of the parameters could not be estimated
category=OptimizeWarning)
You can require that Python raise this warning as an exception using the following code:
import warnings
from scipy.optimize import OptimizeWarning
warnings.simplefilter("error", OptimizeWarning)
# Your code here
Issues with warnings
Unfortunately, warnings in Python have a few issues you need to be aware of.
Multiple filters
First, there can be multiple filters, so your warning filter can be overridden by something else. This is not too bad and can be worked around with the catch_warnings context manager:
import warnings
from scipy.optimize import OptimizeWarning
with warnings.catch_warnings():
    warnings.simplefilter("error", OptimizeWarning)
    try:
        # Do your thing
        pass
    except OptimizeWarning:
        # Do your other thing
        pass
Raised Once
Second, warnings are only raised once by default. If your warning has already been raised before you set the filter, changing the filter afterwards won't raise the warning again.
To my knowledge, there unfortunately is not much you can do about this. You'll want to make sure you run the warnings.simplefilter("error", OptimizeWarning) as early as possible.
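The scoped "error" filter can be tried without SciPy installed. In this sketch FitWarning and fit are hypothetical stand-ins for OptimizeWarning and curve_fit; only the warnings calls come from the standard library.

```python
import warnings


class FitWarning(UserWarning):
    """Hypothetical stand-in for scipy.optimize.OptimizeWarning."""


def fit():
    """Hypothetical stand-in for curve_fit: warns, then returns a result."""
    warnings.warn("Covariance of the parameters could not be estimated",
                  FitWarning)
    return "popt", "pcov"


def fit_or_none():
    """Return the fit result, or None if the fit emitted a FitWarning."""
    with warnings.catch_warnings():
        # Inside this block only, FitWarning is raised as an exception.
        warnings.simplefilter("error", FitWarning)
        try:
            return fit()
        except FitWarning:
            return None


outcome = fit_or_none()
print(outcome)
```

The filter change is undone when the with block exits, so code elsewhere keeps seeing FitWarning as an ordinary warning.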

Tracing file path and line number

I'm using python's trace module to trace some code. When I trace code this way, I can get one of the following two results:
Call:
tracer = trace.Trace(count=False, trace=True, ignoredirs=[sys.prefix, sys.exec_prefix])
r = tracer.run('run()')
tracer.results().write_results(show_missing=True)
Result:
<filename>(<line number>): <line of code>
Call [citation]:
tracer = trace.Trace(count=False, trace=True, ignoredirs=[sys.prefix, sys.exec_prefix], countfuncs=True)
r = tracer.run('run()')
tracer.results().write_results(show_missing=True)
Result:
filename:<filepath>, modulename:<module name>, funcname: <function name>
What I really need is a trace that gives me this:
<filepath> <line number>
It would seem that I could use the above information and interleave them to get what I need, but such an attempt would fail in the following use case:
sys.path contains directory A and directory B.
There are two files A/foo.py and B/foo.py
Both A/foo.py and B/foo.py contain the function bar, defined on lines 100 - 120
There are some minor differences between A/foo.py and B/foo.py
In this scenario, using such interleaving to correctly identifying which bar is called is impossible (please correct me if I'm wrong) without statically analyzing the code within each bar, which itself is very difficult for non-trivial functions.
So, how can I get the correct trace output that I need?
With a little monkey-patching, this is actually quite easy. Digging around in the source code of the trace module it seems that callbacks are used to report on each execution step. The basic functionality of Trace.run, greatly simplified, is:
sys.settrace(globaltrace) # Set the trace callback function
exec function # Execute the function, invoking the callback as necessary
sys.settrace(None) # Reset the trace
globaltrace is defined in Trace.__init__ depending on the arguments passed. Specifically, with the arguments in your first example, Trace.globaltrace_lt is used as the global callback, which calls Trace.localtrace_trace for each line of execution. Changing it is simply a case of modifying Trace.localtrace, so to get the result you want:
import trace
import sys
import time
import linecache
class Trace(trace.Trace):
    def localtrace_trace(self, frame, why, arg):
        if why == "line":
            # record the file name and line number of every trace
            filename = frame.f_code.co_filename
            lineno = frame.f_lineno
            if self.start_time:
                # Python 3's trace module takes start_time from time.monotonic()
                print('%.2f' % (time.monotonic() - self.start_time), end=' ')
            # print the full path instead of the base name
            print("%s (%d): %s" % (filename, lineno,
                                   linecache.getline(filename, lineno)), end='')
        return self.localtrace
tracer = Trace(count=False, trace=True, ignoredirs=[sys.prefix, sys.exec_prefix])
r = tracer.run('run()')
There is a difference between the two examples you give: in the first, the output is printed during the Trace.run call; in the second, it is printed during write_results. The code I've given above follows the pattern of the former, so tracer.results().write_results() is not necessary. However, if you want to manipulate this output instead, it can be achieved by patching the trace.CoverageResults.write_results method in a similar manner.
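The same callback mechanism can also be used directly, without subclassing trace.Trace. The sketch below is an alternative to the patch above, not a replacement for it: it calls sys.settrace itself and collects (file path, line number) pairs. The names trace_lines and demo are hypothetical.

```python
import os
import sys


def trace_lines(func, *args, **kwargs):
    """Run func; return (result, [(absolute_path, lineno), ...])."""
    records = []

    def local_trace(frame, event, arg):
        # Called for events inside a traced frame; keep only line events.
        if event == "line":
            path = os.path.abspath(frame.f_code.co_filename)
            records.append((path, frame.f_lineno))
        return local_trace

    def global_trace(frame, event, arg):
        # Called when a new frame is entered; returning local_trace
        # switches on line events for that frame.
        return local_trace

    previous = sys.gettrace()
    sys.settrace(global_trace)
    try:
        result = func(*args, **kwargs)
    finally:
        # Restore whatever tracer (if any) was active before.
        sys.settrace(previous)
    return result, records


def demo():
    x = 1
    y = 2
    return x + y


value, lines = trace_lines(demo)
for path, lineno in lines:
    print(path, lineno)
```

Because the path comes from frame.f_code.co_filename, two modules with the same name in different directories, such as A/foo.py and B/foo.py, are reported separately.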
