How to perform assert introspection in Python

I'm looking into how to perform assert introspection in Python, in the same way that py.test does. For example...
>>> a = 1
>>> b = 2
>>> assert a == b
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AssertionError # <--- I want more information here, eg 'AssertionError: 1 != 2'
I see that the py.code library has some functionality around this, and I've also seen this answer, which notes that sys.excepthook allows you to plug in whatever behavior you want for uncaught exceptions, but it's not clear to me how to put it all together.

You can do something like this if you want to show a more detailed error message:
def assertion(a,b):
    try:
        assert a==b
    except AssertionError as e:
        e.args += ('some other', 'information',)
        raise

a=1
b=2
assertion(a,b)
This code will give this output:
Traceback (most recent call last):
  File "tp.py", line 11, in <module>
    assertion(a,b)
  File "tp.py", line 4, in assertion
    assert a==b
AssertionError: ('some other', 'information')
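If you want the operand values themselves in the message, which is closer to what the question asks for, the same wrapper can format them into the exception (a small variation of the code above):
def assertion(a, b):
    try:
        assert a == b
    except AssertionError as e:
        e.args += ('%r != %r' % (a, b),)
        raise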

The unittest assertions give extra information (possibly more than you need). Inspired by Raymond Hettinger's talk.
This is a partial answer: it only gives the values of a and b (last line of the output), not the additional introspection you are also seeking that is unique to pytest.
import unittest

class EqualTest(unittest.TestCase):

    def testEqual(self, a, b):
        self.assertEqual(a, b)


a, b = 1, 2
assert_ = EqualTest().testEqual
assert_(a, b)
Output
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-4-851ce0f1f668> in <module>()
9 a, b = 1, 2
10 assert_ = EqualTest().testEqual
---> 11 assert_(a, b)
<ipython-input-4-851ce0f1f668> in testEqual(self, a, b)
4
5 def testEqual(self, a, b):
----> 6 self.assertEqual(a, b)
7
8
C:\Anaconda3\lib\unittest\case.py in assertEqual(self, first, second, msg)
818 """
819 assertion_func = self._getAssertEqualityFunc(first, second)
--> 820 assertion_func(first, second, msg=msg)
821
822 def assertNotEqual(self, first, second, msg=None):
C:\Anaconda3\lib\unittest\case.py in _baseAssertEqual(self, first, second, msg)
811 standardMsg = '%s != %s' % _common_shorten_repr(first, second)
812 msg = self._formatMessage(msg, standardMsg)
--> 813 raise self.failureException(msg)
814
815 def assertEqual(self, first, second, msg=None):
AssertionError: 1 != 2

I don't think it is straightforward to reproduce pytest's assert introspection in a standalone context. The docs contain a few more details on how it works:
pytest rewrites test modules on import. It does this by using an import hook to write new pyc files. Most of the time this works transparently. However, if you are messing with import yourself, the import hook may interfere. If this is the case, simply use --assert=reinterp or --assert=plain. Additionally, rewriting will fail silently if it cannot write new pycs, i.e. in a read-only filesystem or a zipfile.
It looks like it would require quite some hacks to make that work in arbitrary modules, so you're probably better off using a solution suggested in the other answers.
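Since the question mentions sys.excepthook, here is a minimal sketch of that route; it only reports the failing assert's source line and the local variables of the frame it failed in (nothing like pytest's per-sub-expression display, and the source line is unavailable for code typed at the interactive prompt):
import sys
import traceback

def assert_hook(exc_type, exc, tb):
    if exc_type is AssertionError:
        # walk to the innermost frame, where the assert actually failed
        inner = tb
        while inner.tb_next is not None:
            inner = inner.tb_next
        # the source text of the failing line (None for code typed at >>>)
        failing_line = traceback.extract_tb(tb)[-1].line
        print("Failed assert: %s" % failing_line)
        print("Locals: %r" % inner.tb_frame.f_locals)
    traceback.print_exception(exc_type, exc, tb)

sys.excepthook = assert_hook

a, b = 1, 2
assert a == b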

Related

gensim lemmatize error: generator raised StopIteration

I'm trying to execute simple code to lemmatize a string, but there's an error about iteration.
I found some solutions that suggest reinstalling web.py, but that did not work for me.
Python code:
from gensim.utils import lemmatize
lemmatize("gone")
The error is:
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
I:\Anaconda\lib\site-packages\pattern\text\__init__.py in _read(path, encoding, comment)
608 yield line
--> 609 raise StopIteration
610
StopIteration:
The above exception was the direct cause of the following exception:
RuntimeError Traceback (most recent call last)
<ipython-input-4-9daceee1900f> in <module>
1 from gensim.utils import lemmatize
----> 2 lemmatize("gone")
-------------------------------------------------------------------------------------
I:\Anaconda\lib\site-packages\pattern\text\__init__.py in <genexpr>(.0)
623 def load(self):
624 # Arnold NNP x
--> 625 dict.update(self, (x.split(" ")[:2] for x in _read(self._path) if len(x.split(" ")) > 1))
626
627 #--- FREQUENCY -------------------------------------------------------------------------------------
RuntimeError: generator raised StopIteration
The error message is misleading – it occurs when there's nothing to properly lemmatize.
By default, lemmatize() only accepts word tags NN|VB|JJ|RB. Pass in a regexp that matches any string to change this:
>>> import re
>>> lemmatize("gone", allowed_tags=re.compile('.*'))
[b'go/VB']
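As background on why the message is RuntimeError: generator raised StopIteration at all: since PEP 479 (the default behaviour from Python 3.7 on), a StopIteration raised inside a generator, which is what pattern's _read() does when it runs out of input, is converted into a RuntimeError. A minimal, gensim-free illustration:
def gen():
    yield 1
    raise StopIteration  # before Python 3.7 this silently ended the generator

try:
    list(gen())
except RuntimeError as e:
    print(e)  # generator raised StopIteration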

TypeError for MultiComparison.tukeyhsd()

I have a data set with 13 columns, and I want to perform a multiple-comparison ANOVA on one of those columns using the statsmodels.stats.multicomp.MultiComparison module, but I'm getting a TypeError that I don't understand.
The same code works on other machines. I tried updating all of my Python and conda packages and executing the code again, but I ended up with the same result.
import statsmodels.api as sm

multi_comp = sm.stats.multicomp.MultiComparison(data=data['col1'], groups=data['category'])
result_string = multi_comp.tukeyhsd(alpha=0.05)
print(result_string)
I also tried other methods like pairwise_tukeyhsd(); the same error persists for that method as well. Apart from this multicomp module, the rest of my code works fine.
TypeError                                 Traceback (most recent call last)
in <module>
1 multi_comp = sm.stats.multicomp.MultiComparison(data=data['col1'], groups=data['category'])
----> 2 result_string = multi_comp.tukeyhsd()
3 result_string
~\Anaconda3\lib\site-packages\statsmodels\sandbox\stats\multicomp.py in tukeyhsd(self, alpha)
1011 np.round(res[4][:, 0], 4),
1012 np.round(res[4][:, 1], 4),
-> 1013 res[1]),
1014 dtype=[('group1', object),
1015 ('group2', object),
~\Anaconda3\lib\site-packages\statsmodels\compat\python.py in lzip(*args, **kwargs)
60
61 def lzip(*args, **kwargs):
---> 62 return list(zip(*args, **kwargs))
63
64 def lmap(*args, **kwargs):
TypeError: zip argument #4 must support iteration
That can happen when there are too few distinct values of category in your data. If there are only 2 possible categories, MultiComparison won't work.
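If you want to check whether that is what is happening, counting the distinct groups is enough. The column names below are taken from the question; the data frame itself is made up for illustration:
import pandas as pd

# hypothetical data mirroring the column names in the question
data = pd.DataFrame({
    "col1":     [1.2, 2.4, 3.1, 4.8, 5.0, 6.3],
    "category": ["a", "a", "b", "b", "c", "c"],
})

# if this prints 2, the situation described above applies
print(data["category"].nunique())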

TypeError: 'module' object is not callable in python and pandas

I imported the module and called it (correctly, I believe) using import main and then main.main(data, 1, 10, 2.5), but I am getting an error:
TypeError Traceback (most recent call last)
<ipython-input-52-e9913b227737> in <module>()
----> 1 main.main(data, 1, 10, 2.5)
38 dat_sh = data.shape[0]
39 #Z = random.sample(range(0,U),k_max)
---> 40 Z = cf.centroid_finder(data,sp_atr,k_max)
41
42 prototypes = {i: data[j:j+1].values.tolist()[0] for i,j in enumerate(Z)}
11 for i in range(dat_sh):
12 for j in range(dat_sh):
---> 13 D[i][j] = ed(sub_atr[i],sub_atr[j])
14
15
TypeError: 'module' object is not callable
ed is the Euclidean distance:
def ed(X2, X1):
    return sqrt(sum(np.subtract(X1,X2)**2))
There may be some confusion regarding the importing of modules vs functions.
It's possible to recreate your error with a simple example, assuming that module/function importing has gotten mixed up somewhere along the way. Consider a case where we pass initial values a and b through a centroid_finder() function, which lives in cf.py:
# import cf.py as cf
import cf
a, b = ([1,2,3], [2,3,4])

cf.centroid_finder(a, b)
But centroid_finder() calls ed(), which lives in ed.py:
## cf.py
# import ed.py as ed
import ed

def centroid_finder(a, b):
    print(ed(a, b))

## ed.py
from numpy import sqrt, sum
import numpy as np

def ed(X2, X1):
    return sqrt(sum(np.subtract(X1,X2)**2))
Here, calling centroid_finder() will give the error you observed:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-1-e76ec6a0496c> in <module>()
3 a, b = ([1,2,3], [2,3,4])
4
----> 5 cf.centroid_finder(a, b)
cf.py in centroid_finder(a, b)
2
3 def centroid_finder(a, b):
----> 4 print(ed(a, b))
TypeError: 'module' object is not callable
That's because you imported a module, ed.py as ed...but what you wanted was to call the ed() function that lives inside of ed.py. That's ed.ed()!
Changing centroid_finder() to call ed.ed() produces the desired result:
# cf.py
import ed

def centroid_finder(a, b):
    print(ed.ed(a, b))
Now, from the main script:
cf.centroid_finder(a, b)
# 1.73205080757
There's at least one undisclosed import shorthand in your example code, where you call sqrt in ed(). There is no built-in sqrt() in Python; it is most likely imported from either math or numpy, e.g. from numpy import sqrt. That's not a problem per se, but given the error you're getting about modules being non-callable, you might benefit from explicitly calling functions from the modules they live in.
For example, using import numpy as np, call np.sqrt() instead of importing sqrt() directly. This is a defensive programming posture that will prevent similar confusion in the future. (You do this already with np.subtract() in ed(); it's unclear why sqrt() doesn't get the same treatment.)
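For example, ed() rewritten so that every NumPy call goes through the np namespace might look like this:
## ed.py, with all NumPy calls made explicit
import numpy as np

def ed(X2, X1):
    # Euclidean distance between two equal-length sequences
    return np.sqrt(np.sum(np.subtract(X1, X2) ** 2))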

Dynamic Semantic errors in Python

I came across this as an interview question. It seemed interesting, so I am posting it here.
Consider an operation that gives a semantic error, like division by zero. By default, the Python interpreter gives output like "Invalid Operation" or something similar. Can we control the output that is given out by the Python interpreter, like print some other error message, skip that division by zero operation, and carry on with the rest of the instructions?
And also, how can I evaluate the cost of run-time semantic checks?
There are many Python experts here, and I am hoping someone will throw some light on this. Thanks in advance.
Can we control the output that is given out by the Python interpreter, like print some other error message, skip that division by zero operation, and carry on with the rest of the instructions?
No, you cannot. You can manually wrap every dangerous statement in a try...except block, but I'm assuming you're talking about recovery that jumps back to a specific line inside a try...except block, or even recovery that is completely automatic.
By the time the error has propagated far enough for sys.excepthook to be called, or for whatever outer scope you catch it in, the inner scopes are gone. You can change line numbers with sys.settrace in CPython, although that is only an implementation detail, and since those scopes have already been torn down there is no reliable recovery mechanism.
If you try to use the humorous goto April Fools' module (which uses the method I just described) to jump between blocks even within a single file:
from goto import goto, label

try:
    1 / 0

    label .foo
    print("recovered")
except:
    goto .foo
you get an error:
Traceback (most recent call last):
  File "rcv.py", line 9, in <module>
    goto .foo
  File "rcv.py", line 9, in <module>
    goto .foo
  File "/home/joshua/src/goto-1.0/goto.py", line 272, in _trace
    frame.f_lineno = targetLine
ValueError: can't jump into the middle of a block
so I'm pretty certain it's impossible.
And also, how can I evaluate the cost of run-time semantic checks?
I don't know what that is, but you're probably looking for line_profiler:
import random
from line_profiler import LineProfiler

profiler = LineProfiler()


def profile(function):
    profiler.add_function(function)
    return function

@profile
def foo(a, b, c):
    if not isinstance(a, int):
        raise TypeError("Is this what you mean by a 'run-time semantic check'?")

    d = b * c
    d /= a

    return d**a

profiler.enable()

for _ in range(10000):
    try:
        foo(random.choice([2, 4, 2, 5, 2, 3, "dsd"]), 4, 2)
    except TypeError:
        pass

profiler.print_stats()
output:
Timer unit: 1e-06 s
File: rcv.py
Function: foo at line 11
Total time: 0.095197 s
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    11                                           @profile
    12                                           def foo(a, b, c):
    13     10000        29767      3.0     31.3      if not isinstance(a, int):
    14      1361         4891      3.6      5.1          raise TypeError("Is this what you mean by a 'run-time semantic check'?")
    15
    16      8639        20192      2.3     21.2      d = b * c
    17      8639        20351      2.4     21.4      d /= a
    18
    19      8639        19996      2.3     21.0      return d**a
So the "run-time semantic check", in this case would be taking 36.4% of the time of running foo.
If you want to time specific blocks manually that are larger than you'd use timeit on but smaller than you'd want for a profiler, instead of using two time.time() calls (which is quite an inaccurate method) I suggest Steven D'Aprano's Stopwatch context manager.
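If you don't have that recipe handy, a minimal context-manager timer in the same spirit (this is just a sketch, not D'Aprano's actual Stopwatch) could look like:
import time
from contextlib import contextmanager

@contextmanager
def stopwatch(label="block"):
    start = time.perf_counter()
    try:
        yield
    finally:
        print("%s took %.6f s" % (label, time.perf_counter() - start))

with stopwatch("checked division"):
    for _ in range(10000):
        try:
            1 / 0
        except ZeroDivisionError:
            pass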
I would just use an exception. This example uses Python 3; for Python 2, simply remove the annotations after the function parameters, so your function signature would look like f(a, b):
def f(a: int, b: int):
    """
    #param a:
    #param b:
    """
    try:
        c = a / b
        print(c)
    except ZeroDivisionError:
        print("You idiot, you can't do that ! :P")

if __name__ == '__main__':
    f(1, 0)
>>> from cheese import f
>>> f(0, 0)
You idiot, you can't do that ! :P
>>> f(0, 1)
0.0
>>> f(1, 0)
You idiot, you can't do that ! :P
>>> f(1, 1)
1.0
This is an example of how you could catch division by zero by handling the ZeroDivisionError exception.
I won't go into any specific tools for making loggers, but you can indeed understand the costs associated with this kind of checking. You can put a start = time.time() at the start of the function and end = time.time() at the end. If you take the difference, you will get the execution time in seconds.
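For instance, instrumenting the f() above with the suggested time.time() calls (just a sketch of the idea, not a precise benchmark) would look like:
import time

def f(a, b):
    start = time.time()
    try:
        c = a / b
        print(c)
    except ZeroDivisionError:
        print("You idiot, you can't do that ! :P")
    end = time.time()
    print("f took %.6f seconds" % (end - start))

f(1, 0)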
I hope that helps.

python doctest: expected result is the same as the "got" result but the test failed

I am at the learning stage of using Python as a tool for software QA.
I wrote the following simple test in order to find the letter 'a' in a number matrix stored in a text file.
The problem is that the test fails even though the expected output equals what I got.
Why is that? Can you tell me what I am doing wrong?
test script:
fin = open("abc.txt", "r")
arr_fin = []
for line in fin:
arr_fin.append(line.split())
print arr_fin
for row in arr_fin:
arr_fin_1 = " ".join('{0:4}'.format(i or " ") for i in row)
print arr_fin_1
def find_letter(x, arr_fin_1):
"""
>>> find_letter('a', arr_fin_1)
97
"""
t=ord(x) #exchange to letter's ASCII value
for i in arr_fin_1:
if i==x:
print t
return;
def _test():
import doctest
doctest.testmod()
if __name__ == "__main__":
_test()
error message:
Expected:
97
Got:
97
**********************************************************************
1 items had failures:
1 of 1 in __main__.find_letter
***Test Failed*** 1 failures.
You've got an extra space after the 97 - if you remove it, your test should run fine.
This:
    return;
makes your function return None. Did you mean return t?
Besides that, IMHO doctest tests are meant to be self-contained: they are something the user should see in your documentation and understand without extra context. In your example, you're using a module-local arr_fin_1 object which is completely opaque to the reader. It's better to define the data inside the doctest, before the find_letter call, to provide a self-contained example.
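A self-contained version of the docstring (sketched here in Python 3 syntax, with the data defined inside the doctest instead of being taken from the module) could look like this:
def find_letter(x, text):
    """
    Print the ASCII value of x if it occurs in text.

    >>> find_letter('a', "1  2  a  4")
    97
    """
    t = ord(x)
    for i in text:
        if i == x:
            print(t)
            return

if __name__ == "__main__":
    import doctest
    doctest.testmod()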
