Learn Python the Hard Way, Ex 49 : Comparing objects using assert_equal - python

Is it possible to use assert_equal to compare objects? I keep seeing this error:
AssertionError: <ex49.parser.Sentence object at 0x01F1BAF0> !=
<ex49.parser.Sentence object at 0x01F1BB10>
The relevant code fragment:
def test_parse_subject():
    testsentence = "princess go east"
    result = lexicon.scan(testsentence)
    Sent = parse_sentence(result)
    ResultSent = Sentence(('subject', 'princess'),
                          ('verb', 'go'),
                          ('object', 'east'))
    print ResultSent.subject
    print ResultSent.verb
    print ResultSent.object
    print Sent.subject
    print Sent.verb
    print Sent.object
    assert_equal(Sent, ResultSent)
The printed output suggests that the two objects have the same contents, yet the assertion error still shows up. Why is this? Is there some way to make assert_equal treat them as equal?

I believe you need to implement the __eq__ method on the Sentence class.
assertEqual(first, second, msg=None)
Test that first and second are equal. If the values do not compare equal, the test will fail.
In addition, if first and second are the exact same type and one of list, tuple, dict, set, frozenset or unicode or any type that a subclass registers with addTypeEqualityFunc() the type-specific equality function will be called in order to generate a more useful default error message (see also the list of type-specific methods).
Python unittest documentation
The correspondence between operator symbols and method names is as follows: x<y calls x.__lt__(y), x<=y calls x.__le__(y), x==y calls x.__eq__(y), x!=y and x<>y call x.__ne__(y), x>y calls x.__gt__(y), and x>=y calls x.__ge__(y).
Python data model documentation
An example:
import unittest
class A:
    def __init__(self, num):
        self.num = num
    def __eq__(self, other):
        # Two A instances are equal when their num attributes are equal
        return self.num == other.num
class Test(unittest.TestCase):
    def test(self):
        a1 = A(1)
        a12 = A(1)
        a2 = A(2)
        self.assertEqual(a1, a1, 'a1 != a1')
        self.assertEqual(a1, a12, 'a1 != a12')
        # This last assertion fails on purpose: 1 != 2
        self.assertEqual(a1, a2, 'a1 != a2')
if __name__ == '__main__':
    unittest.main()
Now comment out the __eq__ method and see the difference.

This is good info. I was too lazy to search, so I just compared the attributes of the two objects as below:
def test_parse_subject():
    word_list_a = lexicon.scan("eat the bear")
    Sentence1 = Sentence(('noun', 'player'), ('verb', 'eat'), ('noun', 'bear'))
    Sentence2 = parse_subject(word_list_a, ('noun', 'player'))
    assert_equal(Sentence2.subject, Sentence1.subject)
    assert_equal(Sentence2.verb, Sentence1.verb)
    assert_equal(Sentence2.object, Sentence1.object)

I too am working through LPTHW ex49. Specifically for the context of this example, I was able to get it to work by adding the __eq__() method to the Sentence class, as follows:
class Sentence(object):
    def __init__(self, subject, verb, object_):
        ...
    def __eq__(self, other):
        return (self.subject == other.subject and
                self.verb == other.verb and
                self.object_ == other.object_)
Then, in the test file, I did:
# where LIST5 is defined above to give a list of two tuples, [('verb', 'go'), ('direction', 'east')]
def test_parse_subject():
    wordlist = list(LIST5)
    sent = parse.Sentence(('noun', 'person'), ('verb', 'go'), ('direction', 'east'))
    newsent = parse.parse_subject(wordlist, ('noun', 'person'))
    assert_equal(newsent, sent)
As far as I can tell (new to this), assert_equal with nose and unittest will call the __eq__() method if it exists. In this case, the test is OK as long as the two objects have the same three values for subject, verb, object_. However, this took me a while to figure out, because I had a bug in my code, and the only thing nose would provide is the same error message that you received, even when I had the __eq__() method. That is, it provided "AssertionError: ...object at 0x... != ... object at 0x..." This misled me into thinking that the __eq__() method was not working, since it looked like it was comparing addresses. Not sure if there's a better way to do this.
NOTE: I renamed object to object_ because gedit was highlighting object as a Python keyword. Not sure whether using a trailing underscore is the recommended way to do this.
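The unhelpful "object at 0x..." text in the failure message comes from the default repr, not from the comparison itself. Below is a minimal sketch, written for Python 3 and assuming a simplified stand-in for the ex49 Sentence class (the constructor body, __ne__ and __repr__ here are illustrative assumptions, not the book's code). It shows how adding __repr__ alongside __eq__ can make a failing assert_equal print the actual contents:

```python
class Sentence(object):
    """Simplified stand-in for the ex49 Sentence class (an assumption)."""

    def __init__(self, subject, verb, object_):
        # Each argument is a (type, word) tuple; only the word is kept.
        self.subject = subject[1]
        self.verb = verb[1]
        self.object_ = object_[1]

    def __eq__(self, other):
        if not isinstance(other, Sentence):
            return NotImplemented
        return (self.subject, self.verb, self.object_) == (
            other.subject, other.verb, other.object_)

    def __ne__(self, other):
        # Needed on Python 2; Python 3 derives != from __eq__ automatically.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __repr__(self):
        # Used in assertion failure messages instead of the memory address.
        return "Sentence(%r, %r, %r)" % (self.subject, self.verb, self.object_)


a = Sentence(('noun', 'princess'), ('verb', 'go'), ('direction', 'east'))
b = Sentence(('noun', 'princess'), ('verb', 'go'), ('direction', 'east'))
c = Sentence(('noun', 'bear'), ('verb', 'go'), ('direction', 'east'))
print(a == b)
print(a != c)
print(repr(c))
```

With a repr like this, a mismatch shows which field differs, so a bug in the parser is less likely to be mistaken for a broken __eq__.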

Related

python set() membership and hashable objects

I wanted to store instances of a class in a set, so I could use the set methods to find intersections, etc. My class has a __hash__() function, along with an __eq__ and a __lt__, and is decorated with functools.total_ordering
When I create two sets, each containing the same two objects, and do a set_a.difference(set_b), I get a result with a single object, and I have no idea why. I was expecting none, or at the least, 2, indicating a complete failure in my understanding of how sets work. But one?
for a in set_a:
    print(a, a.__hash__())
for b in set_b:
    print(b, b.__hash__(), b in set_a)
(<foo>, -5267863171333807568)
(<bar>, -8020339072063373731)
(<foo>, -5267863171333807568, False)
(<bar>, -8020339072063373731, True)
Why is the <foo> object in set_b not considered to be in set_a? What other properties does an object require in order to be considered a member of a set? And why is bar considered to be a part of set_a, but not foo?
edit: updating with some more info. I figured that simply showing that the two objects' hash() results were the same meant that they were indeed the same, so I guess that's where my mistake probably comes from.
from functools import total_ordering
@total_ordering
class Thing(object):
    def __init__(self, i):
        self.i = i
    def __eq__(self, other):
        return self.i == other.i
    def __lt__(self, other):
        return self.i < other.i
    def __repr__(self):
        return "<Thing {}>".format(self.i)
    def __hash__(self):
        return hash(self.i)
I figured it out thanks to some of the questions in the comments- the problem was due to the fact that I had believed that ultimately, the hash function decides if two objects are the same, or not. The __eq__ also needs to match, which it always did in my tests and attempts to create a minimal example here.
However, when pulling data from a DB in prod, a certain float was being rounded down, and thus, the x == y was failing in prod. Argh.
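To illustrate the conclusion above, here is a small sketch (Python 3) in which two objects have the same hash but do not compare equal, so set membership fails. The price attribute and the float values are hypothetical, chosen only to mimic a rounding difference; they are not from the original code:

```python
class Thing(object):
    def __init__(self, i, price):
        self.i = i
        self.price = price  # hypothetical float field, e.g. loaded from a DB

    def __eq__(self, other):
        return (self.i, self.price) == (other.i, other.price)

    def __hash__(self):
        # Deliberately hashes only part of what __eq__ compares.
        return hash(self.i)


stored = Thing(1, 0.1 + 0.2)   # the float is 0.30000000000000004
looked_up = Thing(1, 0.3)

print(hash(stored) == hash(looked_up))   # same hash bucket
print(stored == looked_up)               # but not equal
print(looked_up in {stored})             # so not a member
```

A matching hash only tells the set where to look; membership still requires == to return True for the candidate it finds there.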

how to use an object in a function without knowing what the object is

I want to use an object even when I don't know the name of the object. I am trying to use a function where it compares two objects and see which one has the biggest number, but I want to be able to type the objects into the argument of the function and then the function does the comparison, so I don't have to keep repeating the same code over and over again. The issue is that I don't know how to have an argument in a function say what object to compare.
class tester:
    myVar = None
    def __init__(self, myVar):
        self.myVar = myVar
# I am not going to make everything legitimate here
def compare(first, second):
    # I want to make first = the first object I am comparing
    # second = second object I am comparing
    # I would then use it in a conditional
    pass
This is probably not the best way of going about this, and if there is a better way I would love to know.
A cleaner way to do this is to define a __cmp__() method in your class. That way, you can use the standard comparison operators < == != >, etc, and the built-in cmp() function on your class instances. Also, if an object defines __cmp__() it will behave properly when passed to functions like max() and sort(). (Thanks to EOL for reminding me to mention that).
Eg,
class tester(object):
    def __init__(self, myVar):
        self.myVar = myVar
    def __cmp__(self, other):
        return cmp(self.myVar, other.myVar)
print tester(5) < tester(7)
print tester(6) == tester(6)
print tester(9) > tester(6)
print tester('z') < tester('a')
print cmp(tester((1, 2)), tester((1, 3)))
output
True
True
True
False
-1
Note that I've made tester inherit from object, which makes it a new-style class. That's not strictly necessary, but it does have various benefits.
I've also removed the myVar = None class attribute, which as EOL points out in the comments is unnecessary clutter.
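Note that __cmp__() and the built-in cmp() exist only in Python 2; both were removed in Python 3. A rough Python 3 equivalent, sketched here with functools.total_ordering (the Tester and my_var names are illustrative renamings), defines __eq__ and __lt__ and lets the decorator supply the remaining comparison operators:

```python
from functools import total_ordering


@total_ordering
class Tester(object):
    def __init__(self, my_var):
        self.my_var = my_var

    def __eq__(self, other):
        if not isinstance(other, Tester):
            return NotImplemented
        return self.my_var == other.my_var

    def __lt__(self, other):
        if not isinstance(other, Tester):
            return NotImplemented
        return self.my_var < other.my_var

    def __hash__(self):
        # Defining __eq__ disables the default hash, so restore one.
        return hash(self.my_var)


print(Tester(5) < Tester(7))
print(Tester(9) >= Tester(6))
print(max(Tester(3), Tester(8)).my_var)
```

Instances defined this way also work with max() and sorted(), as the __cmp__ version does on Python 2.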
Do you mean that you want to pass in two instances of a class and then compare their values? If so, you can simply do it as follows:
class tester:
    myVar = None
    def __init__(self, myVar):
        self.myVar = myVar
def compare(first, second):
    if first.myVar > second.myVar:
        return "First object has a greater value"
    elif first.myVar < second.myVar:
        return "Second object has a greater value"
    else:
        return "Both objects have the same value"
obj1 = tester(5)
obj2 = tester(7)
print(compare(obj1, obj2))
# Output: Second object has a greater value

What does "bound method" error mean when I call a function?

I am creating a word parsing class and I keep getting a
bound method Word_Parser.sort_word_list of <__main__.Word_Parser instance at 0x1037dd3b0>
error when I run this:
class Word_Parser:
    """docstring for Word_Parser"""
    def __init__(self, sentences):
        self.sentences = sentences
    def parser(self):
        self.word_list = self.sentences.split()
    def sort_word_list(self):
        self.sorted_word_list = self.word_list.sort()
    def num_words(self):
        self.num_words = len(self.word_list)
test = Word_Parser("mary had a little lamb")
test.parser()
test.sort_word_list()
test.num_words()
print test.word_list
print test.sort_word_list
print test.num_words
There's no error here. You're printing a function, and that's what functions look like.
To actually call the function, you have to put parens after that. You're already doing that above. If you want to print the result of calling the function, just have the function return the value, and put the print there. For example:
print test.sort_word_list()
On the other hand, if you want the function to mutate the object's state, and then print the state some other way, that's fine too.
Now, your code seems to work in some places, but not others; let's look at why:
parser sets a variable called word_list, and you later print test.word_list, so that works.
sort_word_list sets a variable called sorted_word_list, and you later print test.sort_word_list—that is, the function, not the variable. So, you see the bound method. (Also, as Jon Clements points out, even if you fix this, you're going to print None, because that's what sort returns.)
num_words sets a variable called num_words, and you again print the function—but in this case, the variable has the same name as the function, meaning that you're actually replacing the function with its output, so it works. This is probably not what you want to do, however.
(There are cases where, at first glance, that seems like it might be a good idea: you only want to compute something once, and then access it over and over again without constantly recomputing it. But this isn't the way to do it. Either use a @property, or use a memoization decorator.)
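As a rough illustration of the "compute once, then reuse" idea, here is a sketch for Python 3.8 or later using functools.cached_property together with a plain property. The WordParser class and the split_calls counter are assumptions made for this example and are not part of the question's code:

```python
from functools import cached_property


class WordParser(object):
    def __init__(self, sentences):
        self.sentences = sentences
        self.split_calls = 0  # only here to show how often the work is done

    @cached_property
    def word_list(self):
        # Runs on first access; the result is then stored on the instance.
        self.split_calls += 1
        return self.sentences.split()

    @property
    def num_words(self):
        # Recomputed on every access, but cheap.
        return len(self.word_list)


parser = WordParser("mary had a little lamb")
print(parser.num_words)
print(parser.word_list)
print(parser.split_calls)
```

Both attributes are read without parentheses, and the method names are never overwritten by their own results.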
This problem happens as a result of calling a method without brackets. Take a look at the example below:
class SomeClass(object):
    def __init__(self):
        print 'I am starting'
    def some_meth(self):
        print 'I am a method()'
x = SomeClass()
''' Not adding the bracket after the method call would result in method bound error '''
print x.some_meth
''' However this is how it should be called and it does solve it '''
x.some_meth()
You have an instance method called num_words, but you also have a variable called num_words. They have the same name. When you run num_words(), the function replaces itself with its own output, which probably isn't what you want to do. Consider returning your values.
To fix your problem, change def num_words to something like def get_num_words and your code should work fine. Also, change print test.sort_word_list to print test.sorted_word_list.
For this you can use @property as a decorator, so that instance methods can be read like attributes. Each property should return its value rather than assign to an attribute of the same name. For example:
class Word_Parser(object):
    def __init__(self, sentences):
        self.sentences = sentences
    @property
    def word_list(self):
        return self.sentences.split()
    @property
    def sorted_word_list(self):
        return sorted(self.word_list)
    @property
    def num_words(self):
        return len(self.word_list)
test = Word_Parser("mary had a little lamb")
print test.word_list
print test.sorted_word_list
print test.num_words
so you can access the attributes without calling them (i.e., without the ()).
I think you meant print test.sorted_word_list instead of print test.sort_word_list.
In addition list.sort() sorts a list in place and returns None, so you probably want to change sort_word_list() to do the following:
self.sorted_word_list = sorted(self.word_list)
You should also consider either renaming your num_words() function, or changing the attribute that the function assigns to, because currently you overwrite the function with an integer on the first call.
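The overwriting described above can be seen in isolation with a short sketch (Python 3; the Shadowed class is a hypothetical reduction of the question's code):

```python
class Shadowed(object):
    def __init__(self, words):
        self.words = words

    def num_words(self):
        # Assigning to self.num_words hides the method on this instance.
        self.num_words = len(self.words)


c = Shadowed(["mary", "had", "a", "little", "lamb"])
print(callable(c.num_words))   # still the bound method
c.num_words()
print(c.num_words)             # now the integer stored by the first call
print(callable(c.num_words))   # a second c.num_words() would raise TypeError
```

Giving the method and the attribute different names avoids the problem entirely.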
The problem is that method names and attribute names shadow each other. In the current version sort_word_list() is a method and sorted_word_list is an attribute, whereas num_words is both. Also, list.sort() sorts the list in place and returns None; the sorted(list) function returns a new sorted list.
But I suspect this indicates a design problem. What's the point of calls like
test.parser()
test.sort_word_list()
test.num_words()
which don't do anything? You should probably just have the methods figure out whether the appropriate counting and/or sorting has been done, and, if appropriate, do the count or sort and otherwise just return something.
E.g.,
def sort_word_list(self):
    # Assumes __init__ has set self.sorted_word_list = None
    if self.sorted_word_list is None:
        self.sorted_word_list = sorted(self.word_list)
    return self.sorted_word_list
(Alternately, you could use properties.)
Your helpful comments led me to the following solution:
class Word_Parser:
    """docstring for Word_Parser"""
    def __init__(self, sentences):
        self.sentences = sentences
    def parser(self):
        self.word_list = self.sentences.split()
        word_list = []
        word_list = self.word_list
        return word_list
    def sort_word_list(self):
        self.sorted_word_list = sorted(self.sentences.split())
        sorted_word_list = self.sorted_word_list
        return sorted_word_list
    def get_num_words(self):
        self.num_words = len(self.word_list)
        num_words = self.num_words
        return num_words
test = Word_Parser("mary had a little lamb")
test.parser()
test.sort_word_list()
test.get_num_words()
print test.word_list
print test.sorted_word_list
print test.num_words
and returns:
['mary', 'had', 'a', 'little', 'lamb']
['a', 'had', 'lamb', 'little', 'mary']
5
Thank you all.
A bound method error also occurs (in a Django app, for instance) if you give a method the same name as a field, as below:
class Products(models.Model):
    product_category = models.ForeignKey(ProductCategory, on_delete=models.PROTECT)
    def product_category(self):
        return self.product_category

Python - __eq__ method not being called

I have a set of objects, and am interested in getting a specific object from the set. After some research, I decided to use the solution provided here: http://code.activestate.com/recipes/499299/
The problem is that it doesn't appear to be working.
I have two classes defined as such:
class Foo(object):
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c
    def __key(self):
        return (self.a, self.b, self.c)
    def __eq__(self, other):
        return self.__key() == other.__key()
    def __hash__(self):
        return hash(self.__key())
class Bar(Foo):
    def __init__(self, a, b, c, d, e):
        self.a = a
        self.b = b
        self.c = c
        self.d = d
        self.e = e
Note: equality of these two classes should only be defined on the attributes a, b, c.
The wrapper _CaptureEq in http://code.activestate.com/recipes/499299/ also defines its own __eq__ method. The problem is that this method never gets called (I think). Consider,
bar_1 = Bar(1,2,3,4,5)
bar_2 = Bar(1,2,3,10,11)
summary = set((bar_1,))
assert(bar_1 == bar_2)
bar_equiv = get_equivalent(summary, bar_2)
bar_equiv.d should equal 4 and likewise bar_equiv.e should equal 5, but they do not. Like I mentioned, it looks like the _CaptureEq.__eq__ method does not get called when the statement bar_2 in summary is executed.
Is there some reason why the _CaptureEq.__eq__ method is not being called? Hopefully this is not too obscure a question.
Brandon's answer is informative, but incorrect. There are actually two problems, one with the recipe relying on _CaptureEq being written as an old-style class (so it won't work properly if you try it on Python 3 with a hash-based container), and one with your own Foo.__eq__ definition claiming definitively that the two objects are not equal when it should be saying "I don't know, ask the other object if we're equal".
The recipe problem is trivial to fix: just define __hash__ on the comparison wrapper class:
class _CaptureEq:
    'Object wrapper that remembers "other" for successful equality tests.'
    def __init__(self, obj):
        self.obj = obj
        self.match = obj
    # If running on Python 3, this will be a new-style class, and
    # new-style classes must delegate hash explicitly in order to populate
    # the underlying special method slot correctly.
    # On Python 2, it will be an old-style class, so the explicit delegation
    # isn't needed (__getattr__ will cover it), but it also won't do any harm.
    def __hash__(self):
        return hash(self.obj)
    def __eq__(self, other):
        result = (self.obj == other)
        if result:
            self.match = other
        return result
    def __getattr__(self, name):  # support anything else needed by __contains__
        return getattr(self.obj, name)
The problem with your own __eq__ definition is also easy to fix: return NotImplemented when appropriate so you aren't claiming to provide a definitive answer for comparisons with unknown objects:
class Foo(object):
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c
    def __key(self):
        return (self.a, self.b, self.c)
    def __eq__(self, other):
        if not isinstance(other, Foo):
            # Don't recognise "other", so let *it* decide if we're equal
            return NotImplemented
        return self.__key() == other.__key()
    def __hash__(self):
        return hash(self.__key())
With those two fixes, you will find that Raymond's get_equivalent recipe works exactly as it should:
>>> from capture_eq import *
>>> bar_1 = Bar(1,2,3,4,5)
>>> bar_2 = Bar(1,2,3,10,11)
>>> summary = set((bar_1,))
>>> assert(bar_1 == bar_2)
>>> bar_equiv = get_equivalent(summary, bar_2)
>>> bar_equiv.d
4
>>> bar_equiv.e
5
Update: Clarified that the explicit __hash__ override is only needed in order to correctly handle the Python 3 case.
The problem is that the set compares two objects the “wrong way around” for this pattern to intercept the call to __eq__(). The recipe from 2006 evidently was written against containers that, when asked if x was present, went through the candidate y values already present in the container doing:
x == y
comparisons, in which case an __eq__() on x could do special actions during the search. But the set object is doing the comparison the other way around:
y == x
for each y in the set. Therefore this pattern might simply not be usable in this form when your data type is a set. You can confirm this by instrumenting Foo.__eq__() like this:
    def __eq__(self, other):
        print '__eq__: I am', self.d, self.e, 'and he is', other.d, other.e
        return self.__key() == other.__key()
You will then see a message like:
__eq__: I am 4 5 and he is 10 11
confirming that the equality comparison is posing the equality question to the object already in the set — which is, alas, not the object wrapped with Hettinger's _CaptureEq object.
Update:
And I forgot to suggest a way forward: have you thought about using a dictionary? Since you have an idea here of a key that is a subset of the data inside the object, you might find that splitting out the idea of the key from the idea of the object itself might alleviate the need to attempt this kind of convoluted object interception. Just write a new function that, given an object and your dictionary, computes the key and looks in the dictionary and returns the object already in the dictionary if the key is present else inserts the new object at the key.
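The dictionary approach described above might look roughly like the following sketch (Python 3). The key_of and get_or_insert names are hypothetical helpers invented for this illustration, and Foo is reduced to a plain attribute holder:

```python
class Foo(object):
    def __init__(self, a, b, c, d=None, e=None):
        self.a, self.b, self.c, self.d, self.e = a, b, c, d, e


def key_of(obj):
    # Hypothetical helper: the subset of attributes that defines identity.
    return (obj.a, obj.b, obj.c)


def get_or_insert(registry, obj):
    """Return the object already stored under obj's key, else store obj."""
    return registry.setdefault(key_of(obj), obj)


registry = {}
bar_1 = Foo(1, 2, 3, 4, 5)
bar_2 = Foo(1, 2, 3, 10, 11)

first = get_or_insert(registry, bar_1)
second = get_or_insert(registry, bar_2)
print(first is bar_1, second is bar_1, second.d, second.e)
```

Because the key lives outside the object, no __eq__ or __hash__ override and no comparison wrapper is needed to recover the stored instance.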
Update 2: well, look at that — Nick's answer uses a NotImplemented in one direction to force the set to do the comparison in the other direction. Give the guy a few +1's!
There are two issues here. The first is that:
    t = _CaptureEq(item)
    if t in container:
        return t.match
    return default
Doesn't do what you think. In particular, t will never be in container, since _CaptureEq doesn't define __hash__. This becomes more obvious in Python 3, since it will point this out to you rather than providing a default __hash__. The code for _CaptureEq seems to believe that providing an __getattr__ will solve this - it won't, since Python's special method lookups are not guaranteed to go through all the same steps as normal attribute lookups - this is the same reason __hash__ (and various others) need to be defined on a class and can't be monkeypatched onto an instance. So, the most direct way around this is to define _CaptureEq.__hash__ like so:
    def __hash__(self):
        return hash(self.obj)
But that still isn't guaranteed to work, because of the second issue: set lookup is not guaranteed to test equality. sets are based on hashtables, and only do an equality test if there's more than one item in a hash bucket. You can't (and don't want to) force items that hash differently into the same bucket, since that's all an implementation detail of set. The easiest way around this issue, and to neatly sidestep the first one, is to use a list instead:
summary = [bar_1]
assert(bar_1 == bar_2)
bar_equiv = get_equivalent(summary, bar_2)
assert(bar_equiv is bar_1)

sharing a string between two objects

I want two objects to share a single string object. How do I pass the string object from the first to the second such that any changes applied by one will be visible to the other? I am guessing that I would have to wrap the string in a sort of buffer object and do all sorts of complexity to get it to work.
However, I have a tendency to overthink problems, so undoubtedly there is an easier way. Or maybe sharing the string is the wrong way to go? Keep in mind that I want both objects to be able to edit the string. Any ideas?
Here is an example of a solution I could use:
class Buffer(object):
    def __init__(self):
        self.data = ""
    def assign(self, value):
        self.data = str(value)
    def __getattr__(self, name):
        return getattr(self.data, name)
class Descriptor(object):
    def __get__(self, instance, owner):
        return instance._buffer.data
    def __set__(self, instance, value):
        if not hasattr(instance, "_buffer"):
            if isinstance(value, Buffer):
                instance._buffer = value
                return
            instance._buffer = Buffer()
        instance._buffer.assign(value)
class First(object):
    data = Descriptor()
    def __init__(self, data):
        self.data = data
    def read(self, size=-1):
        if size < 0:
            size = len(self.data)
        data = self.data[:size]
        self.data = self.data[size:]
        return data
class Second(object):
    data = Descriptor()
    def __init__(self, data):
        self.data = data
    def add(self, newdata):
        self.data += newdata
    def reset(self):
        self.data = ""
    def spawn(self):
        return First(self._buffer)
s = Second("stuff")
f = s.spawn()
f.data == s.data
# True
f.read(2)
# "st"
f.data
# "uff"
f.data == s.data
# True
s.data
# "uff"
s._buffer == f._buffer
# True
Again, this seems like absolute overkill for what seems like a simple problem. As well, it requires the use of the Buffer class, a descriptor, and the descriptor's impositional _buffer variable.
An alternative is to put one of the objects in charge of the string and then have it expose an interface for making changes to the string. Simpler, but not quite the same effect.
"I want two objects to share a single string object."
They will, if you simply pass the string -- Python doesn't copy unless you tell it to copy.
"How do I pass the string object from the first to the second such that any changes applied by one will be visible to the other?"
There can never be any change made to a string object (it's immutable!), so your requirement is trivially met (since a false precondition implies anything).
"I am guessing that I would have to wrap the string in a sort of buffer object and do all sorts of complexity to get it to work."
You could use (assuming this is Python 2 and you want a string of bytes) an array.array with a typecode of c. Arrays are mutable, so you can indeed alter them (with mutating methods -- and some operators, which are a special case of methods since they invoke special methods on the object). They don't have the myriad non-mutating methods of strings, so, if you need those, you'll indeed need a simple wrapper (delegating said methods to the str(...) of the array that the wrapper also holds).
It doesn't seem there should be any special complexity, unless of course you want to do something truly weird as you seem to given your example code (have an assignment, i.e., a rebinding of a name, magically affect a different name -- that has absolutely nothing to do with whatever object was previously bound to the name you're rebinding, nor does it change that object in any way -- the only object it "changes" is the one holding the attribute, so it's obvious that you need descriptors or other magic on said object).
You appear to come from some language where variables (and particularly strings) are "containers of data" (like C, Fortran, or C++). In Python (like, say, in Java), names (the preferred way to call what others call "variables") always just refer to objects, they don't contain anything except exactly such a reference. Some objects can be changed, some can't, but that has absolutely nothing to do with the assignment statement (see note 1) (which doesn't change objects: it rebinds names).
(note 1): except of course that rebinding an attribute or item does alter the object that "contains" that item or attribute -- objects can and do contain, it's names that don't.
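The difference between rebinding a name and mutating an object can be shown in a few lines. This is only an illustrative sketch in Python 3, with made-up variable names:

```python
a = "hello"
b = a            # both names refer to the same str object
a = a + "!"      # builds a new str and rebinds a; b is unaffected
print(a, b)

x = ["hello"]
y = x            # both names refer to the same list object
x[0] = "HELLO"   # mutates the list itself; the change is visible through y
print(x, y, x is y)
```

Assignment to a bare name never changes the object that name used to refer to, whereas item assignment changes the container that holds the item.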
Just put your value to be shared in a list, and assign the list to both objects.
class A(object):
    def __init__(self, strcontainer):
        self.strcontainer = strcontainer
    def upcase(self):
        self.strcontainer[0] = self.strcontainer[0].upper()
    def __str__(self):
        return self.strcontainer[0]
# create a string, inside a shareable list
shared = ['Hello, World!']
x = A(shared)
y = A(shared)
# both objects have the same list
print id(x.strcontainer)
print id(y.strcontainer)
# change value in x
x.upcase()
# show how value is changed in both x and y
print str(x)
print str(y)
Prints:
10534024
10534024
HELLO, WORLD!
HELLO, WORLD!
I am not a great expert in Python, but I think that if you declare a variable in a module and add a getter and setter to the module for this variable, you will be able to share it that way.
