taskqueue and non-idempotent tasks - python

I'm working on a voting app, where the user can upload a list of email addresses for all of the voters. After doing some error checking, I create a Voter entity for each voter. Since there can be a large number of voters, I create the Voter entities in a taskqueue to avoid the 30 second limit and the task looks like this:
put_list = []
for email, id in itertools.izip(voter_emails, uuids):
    put_list.append(Voter(election=election,
                          email=email,
                          uuid=id))
election.txt_voters = ""
put_list.append(election)
db.put(put_list)
This task, however, isn't idempotent. Is there a way to make this task idempotent? Or is there a better way to do this?

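Use a key_name rather than a uuid property to prevent creating duplicate Voter entities. If each Voter is created with its UUID as the key_name, re-running the task writes to the same keys and overwrites the existing entities instead of adding new ones.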
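As a rough illustration of why keyed writes make the task safe to retry, here is a minimal sketch that uses a plain dict as a stand-in for the datastore. The names fake_datastore and put_voters are hypothetical; in the real task the equivalent step would be passing key_name=id when constructing each Voter.

```python
# In-memory stand-in for the datastore (an assumption for illustration):
# maps a key name to the stored entity.
fake_datastore = {}

def put_voters(election_id, voter_emails, uuids):
    """Store one voter per (email, uuid) pair, keyed by the uuid."""
    for email, voter_uuid in zip(voter_emails, uuids):
        # Writing under a fixed key overwrites any earlier copy,
        # so a retried task cannot create duplicates.
        fake_datastore[voter_uuid] = {"election": election_id, "email": email}

emails = ["a@example.com", "b@example.com"]
ids = ["uuid-1", "uuid-2"]

put_voters("election-1", emails, ids)
put_voters("election-1", emails, ids)  # simulated task retry

print(len(fake_datastore))
```

Running the function twice leaves the same two records in place, which is the property the task needs.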


How can I automatically let API keys expire?

I am building an application which uses API keys during sessions. I have so far successfully generated the API keys and I can check them for validity and whether they go with the correct account and I've also added brute force protection.
My problem is that I would like to automatically let them expire after 24 hours. Right now I remove old keys when a user requests a new one to lessen the chance of someone guessing the right key, but this doesn't work for users who don't use the application again.
I was going to achieve this by scheduling a cron job, as I have read other people advise. However, the server that will host the application isn't mine, and its owner doesn't see the need for automatic expiry in the first place. So I would like either to build the expiry into the code itself or to have a good argument for why he should let me schedule a cron job (or schedule one himself).
The table containing the API keys looks as follows:
class DBAuth(db.Model):
    __tablename__ = 'auth'
    id = db.Column(db.Integer, primary_key=True)
    user_id = db.Column(db.Integer, index=True)
    api_key = db.Column(db.String(256))
    begin_date = db.Column(db.DateTime, nullable=False)
And the API key generator is called as follows:
auth = DBAuth()
key = DBAuth.query.filter_by(user_id=user.id).first()
if key is not None:
    db.session.delete(key)
    db.session.commit()
api_key = auth.generate_key(user.id)
db.session.add(auth)
db.session.commit()
With the generator function like this:
def generate_key(self, user_id):
    self.user_id = user_id
    self.api_key = #redacted#
    self.begin_date = datetime.datetime.now()
    return self.api_key
My question is really two part:
1: Is my colleague right in saying that automatic expiry isn't necessary? 2: Is there a way to add automatic expiry to the code instead of scheduling a cron job?
Sorry, I don't have enough rep to comment. A simple approach would be the following:
Since you already have DateTime columns in your schema, you could add another one, say "key_expiry_date", that holds the time the key was issued plus 24 hours.
You can then check "key_expiry_date" when validating each request and reject any key whose expiry time has passed. That way the expiry is enforced in the code itself.
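A minimal sketch of that check, using only the standard library. The names KEY_LIFETIME, expiry_for and key_is_valid are hypothetical, and the column name key_expiry_date is the one suggested above, not something in the original schema.

```python
import datetime

# Assumed lifetime of a key, per the 24-hour requirement in the question.
KEY_LIFETIME = datetime.timedelta(hours=24)

def expiry_for(begin_date):
    """Return the value to store in key_expiry_date for a key issued at begin_date."""
    return begin_date + KEY_LIFETIME

def key_is_valid(key_expiry_date, now=None):
    """Return True while the key has not yet reached its expiry time."""
    if now is None:
        now = datetime.datetime.now()
    return now < key_expiry_date

issued = datetime.datetime(2024, 1, 1, 12, 0, 0)
expires = expiry_for(issued)

print(key_is_valid(expires, now=issued + datetime.timedelta(hours=23)))
print(key_is_valid(expires, now=issued + datetime.timedelta(hours=25)))
```

With this approach expired rows stay in the table until something deletes them, but they are rejected at validation time, so they can no longer be used.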

sqlalchemy different children must exist

I'm trying to build a query which returns all objects which have children matching specified criteria. The trick is that there are multiple criteria which are mutually exclusive, so there must be multiple children. I'm not sure how to express this.
Model:
class Message(Base):
    __tablename__ = 'Message'
class MessageRecipient(Base):
    __tablename__ = 'MessageRecipient'
    recipient_id = Column(Integer, ForeignKey('User.uid'))
    message_id = Column(Integer, ForeignKey('Message.uid'))
    user = relationship('User', backref="messages_received")
    message = relationship('Message', backref="recipients")
I want to get all messages which are being sent to a defined set of users. For example, I want to return all messages which were sent to users 1 and 2, but not messages only sent to user 1 or messages only sent to user 2. It must have been sent to both users!
I was trying a query like the following:
query = Message.query.filter(Message.recipients.any(MessageRecipient.recipient_id.in_([1,2])))
The above doesn't work because in_ is disjunctive. It does return the messages I want, but it also returns messages I don't want.
Does anyone have an idea of how I can build a query which requires that a Message have MessageRecipients with an arbitrary set of ids?
I solved this by iterating over all objects in the collection and creating a new subquery for each using exists(). Not sure if this is the most efficient way, but it works.
for recipient in [1, 2]:
    query = query.filter(MessageRecipient.query.filter(and_(MessageRecipient.recipient_id == recipient,
                                                            MessageRecipient.message_id == Message.uid)).exists())
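To show the shape of the query this loop builds (one EXISTS condition per required recipient, all joined with AND), here is a sketch in plain SQL run through the standard-library sqlite3 module rather than SQLAlchemy. The table contents and the helper name messages_sent_to_all are made up for illustration, and the tables are reduced to the columns the query needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Message (uid INTEGER PRIMARY KEY);
    CREATE TABLE MessageRecipient (message_id INTEGER, recipient_id INTEGER);
    INSERT INTO Message (uid) VALUES (10), (11), (12);
    -- Message 10 went to users 1 and 2; 11 only to user 1; 12 only to user 2.
    INSERT INTO MessageRecipient VALUES (10, 1), (10, 2), (11, 1), (12, 2);
""")

def messages_sent_to_all(conn, recipient_ids):
    """Return uids of messages that have a recipient row for every given id."""
    if not recipient_ids:
        raise ValueError("at least one recipient id is required")
    # One EXISTS subquery per required recipient, combined with AND.
    clause = ("EXISTS (SELECT 1 FROM MessageRecipient mr "
              "WHERE mr.message_id = Message.uid AND mr.recipient_id = ?)")
    where = " AND ".join([clause] * len(recipient_ids))
    sql = "SELECT uid FROM Message WHERE " + where + " ORDER BY uid"
    return [row[0] for row in conn.execute(sql, list(recipient_ids))]

print(messages_sent_to_all(conn, [1, 2]))
```

Only message 10 satisfies both conditions, whereas a single IN filter would also match messages 11 and 12.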

Many to many relationship with NDB on Google App Engine

I've got the following models...
class User(ndb.Model):
    email = ndb.StringProperty()
    username = ndb.StringProperty(indexed=True)
    password = ndb.StringProperty()
class Rel(ndb.Model):
    user = ndb.KeyProperty(kind=User, indexed=True)
    follows = ndb.KeyProperty(kind=User, indexed=True)
    blocks = ndb.KeyProperty(kind=User)
I'm trying to make it so a user can follow or block any number of other users.
Using the above setup I'm finding it hard to perform tasks that would have been easy with a traditional DBMS.
As a simple example, how would I find all of a given user's followers AND order by username-- keeping in mind when I perform a query on Rel, I'm getting back keys and not user objects?
Am I going about this the wrong way?
You have to do a fetch either way, but you can design the model in a better way.
The follows and blocks fields can be lists of keys instead of a single key:
follows = ndb.KeyProperty(kind=User, repeated=True)
blocks = ndb.KeyProperty(kind=User, repeated=True)
After this, when you need the users that this user follows, you can read the keys and call ndb.get_multi(Keys_list) to load the corresponding entities. The same applies to blocks.
OR
A better way of doing this:
If you care about the order and want to paginate, you will have to store each follow/block relationship as a separate entity.
For example, for a user 'a', the follow kind will have one record for each person 'a' follows:
class FollowEntity(ndb.Model):
    user = ndb.KeyProperty(kind=User)
    follow = ndb.KeyProperty(kind=User)
    follow_username = ndb.StringProperty()
Assuming user is an entity of your User kind, a query can be:
query = FollowEntity.query(FollowEntity.user == user.key).order(FollowEntity.follow_username)
You can run this query to get the results sorted by username. It works well with fetch_page if you display the results in batches.
Do the same for a BlockEntity kind too.
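The idea behind the separate follow kind can be sketched without NDB: because each record carries a copy of the followed user's username, filtering, sorting and paging need nothing but the follow records. The namedtuple, the sample data and the helper follows_page below are hypothetical stand-ins, not NDB calls.

```python
from collections import namedtuple

# Stand-in for the FollowEntity kind: one record per (user, followed user) pair,
# with the followed user's username copied onto the record.
FollowRecord = namedtuple("FollowRecord", ["user", "follow", "follow_username"])

records = [
    FollowRecord("a", "u3", "zoe"),
    FollowRecord("a", "u1", "alice"),
    FollowRecord("b", "u1", "alice"),
    FollowRecord("a", "u2", "mallory"),
]

def follows_page(records, user, page_size, offset=0):
    """Return one page of usernames followed by `user`, sorted by username."""
    matching = sorted(
        (r for r in records if r.user == user),
        key=lambda r: r.follow_username,
    )
    return [r.follow_username for r in matching[offset:offset + page_size]]

print(follows_page(records, "a", 2))
print(follows_page(records, "a", 2, offset=2))
```

The cost of this design is that the copied username has to be updated on the follow records if a user ever changes their username.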

How to access items in a model with ReferenceProperty?

This is a follow up on my previous question.
I set up the models with ReferenceProperty:
class User(db.Model):
    userEmail = db.StringProperty()
class Comment(db.Model):
    user = db.ReferenceProperty(User, collection_name="comments")
    comment = db.StringProperty(multiline=True)
class Venue(db.Model):
    user = db.ReferenceProperty(User, collection_name="venues")
    venue = db.StringProperty()
In datastore I have this entry under User:
Entity Kind: User
Entity Key: ag1oZWxsby0xLXdvcmxkcgoLEgRVc2VyGBUM
ID: 21
userEmail: az#example.com
And under Venue:
Entity Kind: Venue
Entity Key: ag1oZWxsby0xLXdvcmxkcgsLEgVWZW51ZRhVDA
ID: 85
venue (string): 5 star venue
user (Key): ag1oZWxsby0xLXdvcmxkcgoLEgRVc2VyGBUM
User: id=21
I am trying to display the item in hw.py like this
query = User.all()
results = query.fetch(10)
self.response.out.write("<html><body><ol>")
for result in results:
    self.response.out.write("<li>")
    self.response.out.write(result.userEmail)
    self.response.out.write(result.venues)
    self.response.out.write("</li>")
self.response.out.write("</ol></body></html>")
This line works:
self.response.out.write(result.userEmail)
But this line does not work:
self.response.out.write(result.venues)
But as per vonPetrushev's answer result.venues should grab the venue associated with this userEmail.
Sorry if this is confusing: I am just trying to access the tables linked to the userEmail with ReferenceProperty. The linked tables are Venue and Comment. How do I access an item in Venue or in Comment? Thanks.
venues is actually a query object. So you'll need to fetch or iterate over it.
Try replacing the self.response.out.write(result.venues) line with a loop:
query = User.all()
users = query.fetch(10)
self.response.out.write("<html><body><ol>")
for user in users:
    self.response.out.write("<li>")
    self.response.out.write(user.userEmail)
    self.response.out.write("<ul>")
    # Iterate over the venues
    for venue in user.venues:
        self.response.out.write("<li>%s</li>" % venue.venue)
    self.response.out.write("</ul></li>")
self.response.out.write("</ol></body></html>")
However, this will not scale very well. If there are 10 users, it will run 11 queries: one for the users, plus one per user for that user's venues. Make sure you are using Appstats to look for performance issues like these, and try to minimize the number of RPCs you make.
A better solution might be to denormalize and store the user's email on the Venue kind. That way you will only need one query to print the venue information and the user's email.
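A small sketch of the denormalized layout, using plain dicts in place of datastore entities. The field name user_email and the helpers make_venue and render_venues are assumptions for illustration only.

```python
# Hypothetical in-memory rows standing in for User entities (id -> email).
user_emails = {21: "first@example.com", 22: "second@example.com"}

def make_venue(venue_name, user_id):
    """Build a venue record, copying the owner's email onto it at write time."""
    return {
        "venue": venue_name,
        "user_id": user_id,
        "user_email": user_emails[user_id],  # denormalized copy
    }

venues = [make_venue("5 star venue", 21), make_venue("Corner cafe", 22)]

def render_venues(venues):
    """Format every venue using only the venue records themselves."""
    return ["%s (%s)" % (v["venue"], v["user_email"]) for v in venues]

for line in render_venues(venues):
    print(line)
```

Rendering touches only the venue records, so a single query over venues would be enough. The trade-off is that the copied email must be rewritten on the venues if the user's email changes.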

GQL with two tables

Hello, I am building a very small application on Google App Engine using Python.
My problem is that I have two tables defined with db.Model ("clients" and "requests"). The "clients" table has email and name fields, and the "requests" table has email and issue fields. I want a query that returns, for each request, the email, the issue and the client name when the email is the same in both tables. Can anyone help, please?
The app engine datastore does not support joins, so you will not be able to solve this problem with GQL. You can use two gets, one for client and one for request, or you can use a ReferenceProperty to establish a relationship between the two entities.
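The "two gets" option amounts to doing the join in application code. A minimal sketch with plain dicts standing in for the fetched entities (the sample rows and the helper join_requests_with_clients are hypothetical):

```python
# Hypothetical rows standing in for the fetched client and request entities.
clients = [
    {"email": "ann@example.com", "name": "Ann"},
    {"email": "bob@example.com", "name": "Bob"},
]
requests = [
    {"email": "ann@example.com", "issue": "Login fails"},
    {"email": "bob@example.com", "issue": "Page is slow"},
    {"email": "nobody@example.com", "issue": "No matching client"},
]

def join_requests_with_clients(clients, requests):
    """Return (email, issue, client name) for requests whose email matches a client."""
    # Index clients by email once, so each request is matched with a dict lookup.
    name_by_email = dict((c["email"], c["name"]) for c in clients)
    return [
        (r["email"], r["issue"], name_by_email[r["email"]])
        for r in requests
        if r["email"] in name_by_email
    ]

for row in join_requests_with_clients(clients, requests):
    print(row)
```

Requests without a matching client are dropped, which mirrors the "if the email is the same in the two tables" condition in the question.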
If you need to model a one-to-many relationship, you can do it with a reference property. For your case, it would look something like this:
class Client(db.Model):
    email = db.UserProperty()
    name = db.StringProperty()
class Request(db.Model):
    client = db.ReferenceProperty(Client, collection_name='requests')
    issue = db.StringProperty()
Any Client entity that has a Request associated with it will automatically get a property called requests which is a Query object that will return all Request entities that have a client field set to the particular Client entity you are dealing with.
You might also want to make sure that the code that creates Request entities sets each new entity's ancestor to the Client entity for that particular user. Keeping these associated items in the same entity group could be helpful for performance reasons and for transactions.
Using these models:
class Client(db.Model):
    email = db.StringProperty()
    name = db.StringProperty()
class Request(db.Model):
    client = db.ReferenceProperty(Client, collection_name='requests')
    issue = db.StringProperty()
you can query the data with this code:
from modelos import Client, Request
ctes = Client.all().filter("email =", "somemail#mailbox.com.mx")
for ct in ctes:
    allRequest4ThisUser = Request.all().filter("client =", ct)
    for req in allRequest4ThisUser:
        print req.issue
