I have been working on a project using Google App Engine. I have been setting up users and need to check whether a username is already taken.
I used the following code to test whether it is taken or not:
usernames = db.GqlQuery('select username from User')
taken = username in usernames
This never caught duplicate usernames. I tried a few variants of this on the GQL query line. I tried using .get(), which caused an error because it returned something that wasn't iterable. I also tried putting list() around the request, which raised the same error. I tried writing the value of usernames but never got any response. If it returns a query instance, is there any way to turn it into a list or tuple?
For starters, you should revisit the docs: https://cloud.google.com/appengine/docs/python/datastore/gqlqueryclass?hl=en
db.GqlQuery('select username from User') is calling a constructor, not a function, so it returns an instance of a GqlQuery object. See the docs referred to above.
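For example, a rough sketch of materialising the results yourself (SELECT * is used here so the returned entities carry a username attribute; the limit value is arbitrary):
query = db.GqlQuery('SELECT * FROM User')
users = query.fetch(limit=1000)  # fetch() returns a plain Python list of entities
taken = any(u.username == username for u in users)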
Secondly, what you are doing will never work reliably due to eventual consistency. Please read https://cloud.google.com/appengine/docs/python/datastore/structuring_for_strong_consistency to understand why.
Lastly, since you are starting out with App Engine, move away from db and use ndb unless you have a significant existing code base.
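With ndb, a check along these lines is a reasonable sketch (the model definition here is an assumption, and the consistency caveat above still applies to this non-ancestor query):
from google.appengine.ext import ndb

class User(ndb.Model):
    username = ndb.StringProperty(required=True)

def username_taken(username):
    # keys-only: we only care whether any matching entity exists
    return User.query(User.username == username).get(keys_only=True) is not None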
I have many models with relational links to each other which I have to use. My code is very complicated, so I cannot keep the session alive after a query. Instead, I try to preload all the objects:
def db_get_structure():
    with Session(my_engine) as session:
        deps = {x.id: x for x in session.query(Department).all()}
        ...
    return (deps, ...)

def some_logic(id):
    struct = db_get_structure()
    return some_other_logic(struct.deps[id].owner)
However, I get the following error anyway, even though all the objects are already loaded:
sqlalchemy.orm.exc.DetachedInstanceError: Parent instance <Department at 0x10476e780> is not bound to a Session; lazy load operation of attribute 'owner' cannot proceed
Is it possible to link the preloaded objects with each other so that the relations still work after the session is closed?
I know about joined queries (.options(joinedload(...))), but this approach leads to more lines of code and bigger DB requests, and I think this should have a simpler solution, because all the objects are already loaded into Python objects.
It's even possible now to look up the related objects like struct.deps[struct.deps[id].owner_id], but I think the ORM should do this for me and provide the shorter notation struct.deps[id].owner using some kind of "cached load".
Whenever you access an attribute on a DB entity that has not yet been loaded from the DB, SQLAlchemy will issue an implicit SQL statement to the DB to fetch that data. My guess is that this is what happens when you access struct.deps[id].owner.
If the object in question has been removed from the session it is in a "detached" state and SQLAlchemy protects you from accidentally running into inconsistent data. In order to work with that object again it needs to be "re-attached".
I've done this already fairly often with session.merge:
attached_object = new_session.merge(detached_object)
But this will reconcile the object instance with the DB and potentially issue updates to the DB if necessary. The detached_object is taken as "truth".
I believe you can do the reverse (attaching it by reading from the DB instead of writing to it) by using session.refresh(detached_object), but I need to verify this. I'll update the post if I find something.
Both ways have to talk to the DB with at least a select to ensure the data is consistent.
In order to avoid loading, issue session.merge(..., load=False). But this has some very important caveats. Have a look at the docs of session.merge() for details.
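As a minimal sketch (assuming the preloaded object was not modified while detached, which load=False requires):
from sqlalchemy.orm import Session

# department: one of the Department objects preloaded in db_get_structure()
with Session(my_engine) as session:
    attached = session.merge(department, load=False)  # re-attach without emitting a SELECT
    owner = attached.owner  # the lazy load now runs against this session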
I will need to read up on your link you added concerning your "complicated code". I would like to understand why you need to throw away your session the way you do it. Maybe there is an easier way?
From what I have been reading in the Google Docs and other SO questions, keys_only queries should return strongly consistent results (here and here, for example).
My code looks something like this:
class ClientsPage(SomeHandler):
    def get(self):
        query = Client.query()
        clients = query.fetch(keys_only=True)
        self.write(len(clients))
Even though I am fetching the results with the keys_only=True parameter, I am getting stale results right after the creation of a new Client object (which is a root entity). If there were 2 client objects before the insertion, it keeps showing 2 after inserting and redirecting. I have to manually refresh the page in order to see the number change to 3.
I understand I could use ancestor queries, but I am testing some things first and I was surprised to see that a keys_only query returned stale results. Can anyone please explain to me what's going on?
EDIT 1:
This happened on the development server; I have not tested it in production.
Eventual consistency exists because the Datastore needs time to update all indexes. A keys-only query is the same as all other queries, except that it tells the Datastore: I don't need the entire entity, just return me the key. The query still looks at the indexes to get the list of results.
In contrast, getting an entity by key does not need to look at the indexes, so it is always strongly consistent.
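For example, a short sketch contrasting the access paths (client_key and parent_key are hypothetical, and the ancestor query assumes all Client entities share a common parent):
# get by key: no index lookup, always strongly consistent
client = client_key.get()

# ancestor query: also strongly consistent, but requires an entity-group parent
keys = Client.query(ancestor=parent_key).fetch(keys_only=True)

# plain keys-only query: still served from the indexes,
# so it can be stale right after a write
keys = Client.query().fetch(keys_only=True)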
I apologize if my question turns out to be silly, but I'm rather new to Django, and I could not find an answer anywhere.
I have the following model:
class BlackListEntry(models.Model):
    user_banned = models.ForeignKey(auth.models.User, related_name="user_banned")
    user_banning = models.ForeignKey(auth.models.User, related_name="user_banning")
Now, when I try to create an object like this:
BlackListEntry.objects.create(user_banned=int(user_id),user_banning=int(banning_id))
I get the following error:
Cannot assign "1": "BlackListEntry.user_banned" must be a "User" instance.
Of course, if I replace it with something like this:
user_banned = User.objects.get(pk=user_id)
user_banning = User.objects.get(pk=banning_id)
BlackListEntry.objects.create(user_banned=user_banned,user_banning=user_banning)
everything works fine. The question is:
Does my solution hit the database to retrieve both users, and if yes, is it possible to avoid it, just passing ids?
The answer to your question is: YES.
Django will hit the database (at least) 3 times: 2 queries to retrieve the two User objects and a third one to commit your desired information. This causes absolutely unnecessary overhead.
Just try:
BlackListEntry.objects.create(user_banned_id=int(user_id),user_banning_id=int(banning_id))
This is the default name pattern for the FK fields generated by the Django ORM. This way you can set the information directly and avoid the extra queries.
If you wanted to query for the already saved BlackListEntry objects, you can navigate the attributes with a double underscore, like this:
BlackListEntry.objects.filter(user_banned__id=int(user_id),user_banning__id=int(banning_id))
This is how you access related properties in Django querysets: with a double underscore. Then you can compare against the value of the attribute.
Though they look very similar, the two work completely differently. The first one sets an attribute directly, while the second one is parsed by Django, which splits it at the '__' and queries the database the right way, the second part being the name of an attribute.
You can always compare user_banned and user_banning with the actual User objects, instead of their ids. But there is no use for this if you don't already have those objects with you.
Hope it helps.
I do believe that when you fetch the users, it is going to hit the db...
To avoid it, you would have to write the raw SQL to do the insert, using the method described here:
https://docs.djangoproject.com/en/dev/topics/db/sql/
If you decide to go that route, keep in mind that you are responsible for protecting yourself from SQL injection attacks.
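For instance, a rough sketch of a parameterized raw insert (the table name app_blacklistentry is an assumption based on Django's default <app_label>_<model> naming):
from django.db import connection

def create_blacklist_entry_raw(user_id, banning_id):
    cursor = connection.cursor()
    # passing parameters separately lets the driver escape them,
    # which is what protects against SQL injection
    cursor.execute(
        "INSERT INTO app_blacklistentry (user_banned_id, user_banning_id) "
        "VALUES (%s, %s)",
        [int(user_id), int(banning_id)],
    )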
Another alternative would be to cache the user_banned and user_banning objects.
But in all likelihood, simply grabbing the users and creating the BlackListEntry won't cause you any noticeable performance problems. Caching or executing raw SQL will only provide a small benefit. You're probably going to run into other issues before this becomes a problem.
I made a simple login system with gae-sessions, and I want to show a logged in user how many users are logged in and who they are.
To count the number of people logged in, when I log a user in I immediately save the session to the datastore with save(persist_even_if_using_cookie=True). Then I use SessionModel.all().count() to retrieve the number of logged in accounts.
I'm having trouble retrieving information on other sessions though. I'm not sure how to do it. I tried this:
logged_in = []
for activesession in SessionModel.all():
    logged_in.append(activesession['user'])
But I'm getting this error:
TypeError: 'SessionModel' object is unsubscriptable
I also tried activesession.get('user'), but it results in another error:
BadKeyError: Invalid string key user.
How can I do this?
The Session object and the SessionModel are separate from each other. SessionModel only stores the contents of the session; it can't be read from like a Session object.
I have a feeling that this is a bad idea, and you should find another way to store/retrieve the list of logged in users. This method may return expired sessions that haven't been deleted yet, and will probably be really slow.
The method you want to call is __decode_data. I think something like this will work:
logged_in = []
for activesession in SessionModel.all():
    # Session.__decode_data is name-mangled, hence the _Session prefix
    data = Session._Session__decode_data(activesession.pdump)
    logged_in.append(data['user'])
I have a super simple django model here:
class Notification(models.Model):
    message = models.TextField()
    user = models.ForeignKey(User)
    timestamp = models.DateTimeField(default=datetime.datetime.now)
Using ajax, I check for new messages every minute. I only show the five most recent notifications to the user at any time. What I'm trying to avoid, is the following scenario.
User logs in and has no notifications. While the user's window is up, he receives 10 new messages. Since I'm only showing him five, no big deal. The problem happens when the user starts to delete his notifications. If he deletes the five that are displayed, the five older ones will be displayed on the next ajax call or refresh.
I'd like to have my model's save method delete everything but the 5 most recent objects whenever a new one is saved. Unfortunately, you can't use [5:] to do this. Help?
EDIT
I tried this, which didn't work as expected (in the model's save method):
notes = Notification.objects.filter(user=self.user)[:4]
Notification.objects.exclude(pk__in=notes).delete()
I couldn't find a pattern in the strange behavior, but after a while of testing, it would only delete the most recent one when a new one was created. I have NO idea why this would be. The ordering is taken care of in the model's Meta class (by timestamp, descending). Thanks for the help, but my way seems to be the only one that works consistently.
This is a bit old, but I believe you can do the following:
notes = Notification.objects.filter(user=self.user)[:4]
Notification.objects.exclude(pk__in=list(notes)).delete() # list() forces a database hit.
It costs two hits, but avoids using the for loop with transactions middleware.
The reason for using list(notes) is that Django creates a single query without it, and in MySQL 5.1 this raises the error
(1235, "This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'")
By using list(notes), we force a query of notes, avoiding this.
This can be further optimized to:
notes = Notification.objects.filter(user=self.user)[:4].values_list("id", flat=True) # only retrieve ids.
Notification.objects.exclude(pk__in=list(notes)).delete()
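If you want this cleanup to happen automatically, a sketch of wiring it into the model's save method (assuming, as the asker says, that Meta orders notifications newest-first) might look like:
class Notification(models.Model):
    # ... fields as in the question ...

    def save(self, *args, **kwargs):
        super(Notification, self).save(*args, **kwargs)
        # keep the five newest notifications for this user, drop the rest
        keep = Notification.objects.filter(user=self.user).values_list("id", flat=True)[:5]
        Notification.objects.filter(user=self.user).exclude(pk__in=list(keep)).delete()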
Use an inner query to get the set of items you want to keep and then filter on them.
objects_to_keep = Notification.objects.filter(user=user).order_by('-created_at')[:5]
Notification.objects.exclude(pk__in=objects_to_keep).delete()
Double check this before you use it. I have found that simpler inner queries do not always behave as expected. The strange behavior I have experienced has been limited to querysets that are just an order_by and a slice. Since you will have to filter on user, you should be fine.
This is how I ended up doing it:
notes = Notification.objects.filter(user=self.user)
for note in notes[4:]:
    note.delete()
Because I'm doing this in the save method, the only way the loop would ever have to run more than once would be if the user got multiple notifications at once. I'm not worried about that happening (while it may happen, it's not likely to be enough to cause a problem).