Overriding db.Model.all() in Python Google App Engine

I'm trying to start using memcache in my Google App Engine app. Instead of creating a function that checks memcache and then maybe queries the database, I decided to just override my Model's all() class method. Here's my code so far:
def all(cls, order=None):
    result = memcache.get("allitems")
    if not result or not memcache.get("updateitems"):
        logging.info(list(super(Item, cls.all())))
        result = list(super(Item, cls).all()).sort(key=lambda x: getattr(x, order) if order else str(x))
        memcache.set("allitems", result)
        memcache.set("updateitems", True)
        logging.info("DB Query for items")
    return result
I had figured this would work. But instead I get a RuntimeError saying that recursion depth was exceeded. I think this comes from a misunderstanding of the super() method. Sorry for cluttering the code up with the ordering thing. But maybe the problem lies somewhere in there too. One place I found said that the super method should be called like this:
super(supercls, cls_or_self)
But this wouldn't work with GAE's API:
super(db.Model, cls)
This wouldn't know which model to query. Someone please tell me what I'm doing wrong, and maybe give me a better understanding of super().
EDIT: Thanks to @Matthew, the problem turned out to be a misplaced parenthesis in the first logging.info() call. Now I have another problem: the method is just returning None. I don't know if that means the super implementation of all() returns None (maybe it doesn't know which entity is calling it?) or whether there is some other bug in my code.

I think the error might be here:
logging.info(list(super(Item, cls.all())))
The problem is that you call cls.all() inside the super() call itself, rather than calling all() on the super object:
logging.info(list(super(Item, cls).all()))
So all() calls cls.all(), which still meets the branch conditions, which calls all() again, and so on, until you hit the recursion limit.
The other possible problem is that Model.all() returns a Query object, and I'm not sure if list(query) works. It also provides its own sorting, so you might be able to use this instead:
query = super(Item, cls).all()
query.order(order)
...
return list(query)
Or just return query, as it's already iterable.

Mocking a function call within a function in Python

This is my first time building out unit tests, and I'm not quite sure how to proceed here. Here's the function I'd like to test; it's a method in a class that accepts one argument, url, and returns one string, task_id:
def url_request(self, url):
    conn = self.endpoint_request()
    authorization = conn.authorization
    response = requests.get(url, authorization)
    return response["task_id"]
The method starts out by calling another method within the same class to obtain a token to connect to an API endpoint. Should I be mocking the output of that call (self.endpoint_request())?
If I do have to mock it, and my test function looks like this, how do I pass a fake token/auth endpoint_request response?
@patch("common.DataGetter.endpoint_request")
def test_url_request(mock_endpoint_request):
    mock_endpoint_request.return_value = {"Auth": "123456"}
    # How do I pass the fake token/auth to this?
    task_id = DataGetter.url_request(url)
The code you have shown is strongly dominated by interactions, which means there will most likely be no bugs to find with unit-testing. The potential bugs are on the interaction level: You access conn.authorization - but is this the proper member? And does it already have the proper representation in the way you need it further on? Is requests.get the right method for the job? Is the argument order as you expect it? Is the return value as you expect it? Is task_id spelled correctly?
These are (some of) the potential bugs in your code. But with unit-testing you will not be able to find them: When you replace the depended-on components with mocks (which you create or configure), your unit-tests will just succeed. Let's assume you have a misconception about the return value of requests.get, namely that task_id is spelled wrongly and should rather be spelled taskId. If you mock requests.get, you would implement the mock based on your own misconception. That is, your mock would return a map with the (misspelled) key task_id. The unit-test would then succeed despite the bug.
You will only find that bug with integration testing, where you bring your component and depended-on components together. Only then you can test the assumptions made in your component against the reality of the other components.
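To make the point concrete, here is a hedged sketch (with a simplified, hypothetical url_request_buggy that takes its HTTP getter as a parameter rather than patching a module) showing how a mock built from the same misconception lets the unit test pass despite the misspelled key:

```python
from unittest import mock


def url_request_buggy(get):
    """Hypothetical code under test."""
    response = get("http://example.com/task")
    return response["task_id"]  # a bug if the real API actually returns "taskId"


# The mock encodes the same misconception as the code under test...
fake_get = mock.Mock(return_value={"task_id": "42"})
assert url_request_buggy(fake_get) == "42"  # ...so the unit test passes.

# Only an integration test against the real service, which might return
# {"taskId": "42"}, would expose the spelling bug.
```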

Mocking inherited methods

Please forgive my noob status, but I have come across a construct I don't really understand and hope someone can explain it for me.
class Base(object):
    def mogrify(self, column):
        return self.mogrifiers.get(column.lower().strip()) or (lambda x: x)
    ...

class MyClass(some.package.Base):
    def mogrifiers(self):
        return {
            'column1': (lambda x: datetime.datetime.fromtimestamp(int(x)))
        }
    ...

class MyOtherClass(object):
    def convert_columns(self):
        ...
        new_row[colkey] = self.myclass.mogrify(colkey)(value)
This all works, but I'm trying to write a unit test and mock out MyClass.
As far as I can tell, mogrifiers returns a dictionary of all the columns and any transformations that are required.
The code I am testing calls mogrify (inherited from the Base class) with a specific column name in a string.
This tries to extract the column from the dictionary and returns the lambda function? Or, if it doesn't exist in the dictionary, it returns a lambda that just gives the string back?
So that just leaves me with the (value) bit in the code I'm trying to test. It's not clear what it does.
If I don't want to test the underlying conversion/transformation my mock could just return the simple lambda.
So I've done that, but it throws an exception on the call to mogrify saying:
E TypeError: 'str' object is not callable
Can anyone provide some clues what I'm missing here?
As far as I can tell, mogrifiers returns a dictionary of all the columns and any transformations that are required.
That is correct, though as you've shown it, it will create a fresh dictionary each time, which seems unnecessary.
The code I am testing calls mogrify (inherited from the Base class) with a specific column name in a string.
This tries to extract the column from the dictionary and returns the lambda function ? or if it doesn't exist in the dictionary, it returns a lambada that just gives the string back ?
Yes, that is also correct (except that a lambada is a dance, but I think you meant lambda again).
So that just leaves me with the (value) bit in the code I'm trying to test. It's no clear what it does.
The call self.myclass.mogrify(colkey) returns a callable, the (value) simply calls it. It may be clearer if I rewrite like this:
fn = self.myclass.mogrify(colkey)
new_row[colkey] = fn(value)
Splitting it into two lines will also make it clearer whether the problem is with the call self.myclass.mogrify(colkey) or with fn(value). If, as seems likely, it is the fn(value) call, it means your mocked mogrify is returning a str instead of a callable; it could, however, be that you got the mock wrong and the mocked mogrify method itself is actually a string.
I would suggest you rewrite as shown and also insert a print between the two lines and see what is actually being returned.
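As a sketch (using unittest.mock, with names simplified from the question), this is how the TypeError can arise and how to configure the mock so mogrify returns a callable instead:

```python
from unittest import mock

myclass = mock.Mock()

# Wrong: mogrify(colkey) returns a plain string, so the trailing (value)
# call fails with: TypeError: 'str' object is not callable
myclass.mogrify.return_value = "column1"
try:
    myclass.mogrify("column1")("some value")
except TypeError as exc:
    print(exc)

# Right: mogrify(colkey) must return a callable, e.g. the pass-through lambda.
myclass.mogrify.return_value = lambda x: x
fn = myclass.mogrify("column1")
print(fn("some value"))  # some value
```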

What is a _curried method in python and how are they generated?

Apologies upfront if this is a dupe; I searched for "_curried python" and got 14 results, then simply "_curried", which only bumped it up to 33 results, and none of them seemed to help...
The problem: I came across what I originally thought was a typo in our codebase today, here is the suspect:
student.recalculate_gpa()
Now, I suspect it to be a typo because student is an instance of a Student class that has no recalculate_gpa method. However, it does have a calculate_gpa method:
class Student(User):
    ...
    def calculate_gpa(self):
        # some calculations...
(Where User is the standard django user class.) But, the code wasn't erroring, which made no sense to me. So I did an inspect and found this:
... (a bunch of methods)
('calculate_gpa', <unbound method Student.calculate_gpa>),
... (some more methods)
('recalculate_gpa', <unbound method Student._curried>),
Strange, recalculate_gpa is in fact a method. But where on earth does it come from? I grep'd for "_curried" in our codebase and found nothing, so this must be some Django-related behavior. Certainly I would expect that somewhere in our project we've described how dynamically named functions are formed since recalculate seems like a plausible derivative of calculate, but I honestly have no idea where to start looking.
Thus, my question: how are curried methods like the one above generated, and where should I start looking to discover how our own codebase is curry-ing?
Thanks a ton in advance!
A curried method is one where some of the arguments are bound in advance of the actual call.
For example:
from functools import partial

def my_pow(x, y):
    return x ** y

curried_pow2n = partial(my_pow, 2)  # bind x=2 positionally
for i in range(5):  # print 2**i for the first few i
    print(curried_pow2n(i))
You could also easily implement it with a lambda:
curried_pow2n = lambda y: my_pow(2, y)
although I'm not sure this has anything to do with your actual question...
Django also provides a curry function that is pretty similar to functools.partial:
from django.utils.functional import curry

lols = {'lols': 'lols'}
formset = modelformset_factory(MyModel, form=MyForm, extra=0)
formset.form = staticmethod(curry(MyForm, lols=lols))
(from https://stackoverflow.com/a/25915489/541038)
So you may want to look for Student.recalculate_gpa = , or perhaps in the Student.__init__ method for self.recalculate_gpa = ; you likely would not find it by looking for def recalculate_gpa.
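As an illustration (hypothetical names, not your actual codebase): a class can grow a dynamically named method via setattr, bound to a generic inner function whose name is what then shows up in inspect output:

```python
class Student(object):
    def calculate_gpa(self, force=False):
        return "gpa(force=%s)" % force


def curry_method(method, **fixed):
    """Return a wrapper with some keyword arguments pre-bound,
    similar in spirit to django.utils.functional.curry."""
    def _curried(self, *args, **kw):
        kw.update(fixed)
        return method(self, *args, **kw)
    return _curried


# Done somewhere far from the class body, e.g. in a loop or metaclass:
setattr(Student, "recalculate_gpa", curry_method(Student.calculate_gpa, force=True))

s = Student()
print(s.recalculate_gpa())               # gpa(force=True)
print(Student.recalculate_gpa.__name__)  # _curried  <- what inspect reports
```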

SQLAlchemy: return existing object instead of creating a new one when calling constructor

I want to use sqlalchemy in a way like this:
email1 = EmailModel(email="user@domain.com", account=AccountModel(name="username"))
email2 = EmailModel(email="otheruser@domain.com", account=AccountModel(name="username"))
Usually SQLAlchemy will create two entries for the account and link each email address to one of them. If I set the account name as unique, SQLAlchemy throws an exception telling me that an entry with the same value already exists. This all makes sense and works as expected.
Now I figured out a way of my own that allows the code above and creates the account only once, by overriding the __new__ constructor of the AccountModel class:
def __new__(*cls, **kw):
    if len(kw) and "name" in kw:
        x = session.query(cls.__class__).filter(cls[0].name == kw["name"]).first()
        if x:
            return x
    return object.__new__(*cls, **kw)
This is working perfectly for me. But the question is:
Is this the correct way?
Is there an sqlalchemy way of achieving the same?
I'm using the latest 0.8.x SQLAlchemy Version and Python 2.7.x
Thanks for any help :)
There is exactly this example on the wiki at http://www.sqlalchemy.org/trac/wiki/UsageRecipes/UniqueObject.
Though, more recently I've preferred to use a @classmethod for this instead of redefining the constructor, as explicit is better than implicit, and also simpler:
user.email = Email.as_unique('foo@bar.com')
(I'm actually going to update the wiki now to more fully represent the usage options here.)
I think it's not the best way to achieve this, since it creates a dependency of your constructor on the global variable session, and it also modifies the behaviour of the constructor in an unexpected way (__new__ is expected to return a new object, after all). If, for example, someone uses your classes with two sessions in parallel, the code will fail and they will have to dig into it to find the error.
I'm not aware of any "sqlalchemy" way of achieving this, but I would suggest to create a function createOrGetAccountModel like
def createOrGetAccountModel(**kw):
    if len(kw) and "name" in kw:
        x = session.query(AccountModel).filter_by(name=kw["name"]).first()
        if x:
            return x
    return AccountModel(**kw)
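A sketch of what the explicit get-or-create classmethod mentioned above could look like. To keep the example self-contained, a plain dict stands in for the session lookup; the real recipe would query the database by the unique column instead:

```python
class AccountModel(object):
    _registry = {}  # stands in for a session query keyed on the unique name

    def __init__(self, name):
        self.name = name

    @classmethod
    def as_unique(cls, name):
        """Return the existing instance for this name, or create one."""
        existing = cls._registry.get(name)
        if existing is not None:
            return existing
        obj = cls(name)
        cls._registry[name] = obj
        return obj


a = AccountModel.as_unique("username")
b = AccountModel.as_unique("username")
assert a is b  # one shared account instead of a duplicate
```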

Creating an asynchronous method with Google App Engine's NDB

I want to make sure I understand how to create tasklets and asynchronous methods. What I have is a method that returns a list. I want it to be called from somewhere and immediately allow other calls to be made. So I have this:
future_1 = get_updates_for_user(userKey, aDate)
future_2 = get_updates_for_user(anotherUserKey, aDate)
somelist.extend(future_1)
somelist.extend(future_2)
....
@ndb.tasklet
def get_updates_for_user(userKey, lastSyncDate):
    noteQuery = ndb.GqlQuery('SELECT * FROM Comments WHERE ANCESTOR IS :1 AND modifiedDate > :2', userKey, lastSyncDate)
    note_list = list()
    qit = noteQuery.iter()
    while (yield qit.has_next_async()):
        note = qit.next()
        noteDic = note.to_dict()
        note_list.append(noteDic)
    raise ndb.Return(note_list)
Is this code doing what I'd expect it to do? Namely, will the two calls run asynchronously? Am I using futures correctly?
Edit: Well after testing, the code does produce the desired results. I'm a newbie to Python - what are some ways to test to see if the methods are running async?
It's pretty hard to verify for yourself that the methods are running concurrently -- you'd have to put copious logging in. Also in the dev appserver it'll be even harder as it doesn't really run RPCs in parallel.
Your code looks okay, it uses yield in the right place.
My only recommendation is to name your function get_updates_for_user_async() -- that matches the convention NDB itself uses and is a hint to the reader of your code that the function returns a Future and should be yielded to get the actual result.
An alternative way to do this is to use the map_async() method on the Query object; it would let you write a callback that just contains the to_dict() call:
@ndb.tasklet
def get_updates_for_user_async(userKey, lastSyncDate):
    noteQuery = ndb.gql('...')
    note_list = yield noteQuery.map_async(lambda note: note.to_dict())
    raise ndb.Return(note_list)
Advanced tip: you can simplify this even more by dropping the @ndb.tasklet decorator and just returning the Future returned by map_async():
def get_updates_for_user_async(userKey, lastSyncDate):
    noteQuery = ndb.gql('...')
    return noteQuery.map_async(lambda note: note.to_dict())
This is a general slight optimization for async functions that contain only one yield and immediately return the value yielded. (If you don't immediately get this, you're in good company, and it runs the risk of being broken by a future maintainer who doesn't either. :-)
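The same one-yield optimization can be illustrated in modern asyncio terms (an analogy only, not NDB code): a coroutine that awaits exactly one thing and returns its result can instead just return the awaitable to its caller.

```python
import asyncio


async def fetch_notes():
    # stands in for noteQuery.map_async(...)
    return ["note1", "note2"]


async def get_updates_wrapped():
    result = await fetch_notes()  # analogous to: note_list = yield map_async(...)
    return result


def get_updates_direct():
    return fetch_notes()          # analogous to: return map_async(...)


# Both give the caller the same result when awaited/run:
assert asyncio.run(get_updates_wrapped()) == ["note1", "note2"]
assert asyncio.run(get_updates_direct()) == ["note1", "note2"]
```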
