Python: Store a num in Google App Engine - python

I am using Google App Engine Python.
I would like to store a simple variable number somewhere, so I can add, deduct and use this number.
examplenum = 5
examplenum += 1
The next time I use examplenum, it should be 6.
I know I can build a Model and add an IntegerProperty variable, but I guess that would be too much hassle because I only need one simple number, not a model.
I have also read the question "Simplest way to store a value in Google App Engine Python?", but the memcache solution is not good for me. I want the number to stay forever, not just temporarily in memcache.
Thanks a lot.

All the answers to the question you linked are relevant to you. If you want persistence, you will have to create a model. Look at using get_or_insert to fetch or initialize an entity, and give the entity a key_name so that you can easily fetch it again whenever you need to read or store your value.
from google.appengine.ext import db

class MyAppData(db.Model):
    my_number = db.IntegerProperty()

# fetch the entity (created with my_number=1 on first use) whenever you need to store your value
data = MyAppData.get_or_insert(key_name='mydata', my_number=1)
data.my_number += 1
data.put()
Your code does look suspiciously like a counter, so you might want to look at the sharded counters article and the problems it solves, in case they are relevant.
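For reference, the idea behind a sharded counter can be sketched in plain Python. This is only an illustration under stated assumptions: a dict stands in for the datastore, and the names NUM_SHARDS, shards, increment and get_count are made up for this sketch rather than taken from the article. Each write touches one randomly chosen shard, and a read sums all shards.

```python
import random

NUM_SHARDS = 5  # hypothetical shard count; more shards spread writes further

# A plain dict stands in for the datastore: shard name -> partial count.
# In a real app each entry would be its own small entity.
shards = {}

def increment(name, delta=1):
    """Add delta to one randomly chosen shard of the counter `name`."""
    shard_key = "%s-%d" % (name, random.randint(0, NUM_SHARDS - 1))
    shards[shard_key] = shards.get(shard_key, 0) + delta

def get_count(name):
    """Sum every shard to obtain the counter's current value."""
    return sum(shards.get("%s-%d" % (name, i), 0) for i in range(NUM_SHARDS))

for _ in range(100):
    increment("examplenum")
increment("examplenum", -1)  # deducting works the same way
total = get_count("examplenum")
```

If the number is only changed occasionally, the single-entity version is simpler; sharding mainly helps when many requests update the same number at once.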

Related

How to implement composition/aggregation with NDB on GAE

How do we implement aggregation or composition with NDB on Google App Engine? What is the best way to proceed depending on the use case?
Thanks!
I've tried to use a repeated property. In this very simple example, a Project has a list of Tag keys (I chose to code it this way instead of using StructuredProperty because many Project objects can share Tag objects).
class Project(ndb.Model):
    name = ndb.StringProperty()
    tags = ndb.KeyProperty(kind=Tag, repeated=True)
    budget = ndb.FloatProperty()
    date_begin = ndb.DateProperty(auto_now_add=True)
    date_end = ndb.DateProperty(auto_now_add=True)

    @classmethod
    def all(cls):
        return cls.query()

    @classmethod
    def addTags(cls, from_str):
        tagname_list = from_str.split(',')
        tag_list = []
        for tag in tagname_list:
            tag_list.append(Tag.addTag(tag))
        cls.tags = tag_list
--
Edited (2) :
Thanks. In the end I chose to create a new Model class, 'Relation', representing a relation between two entities. It is more of an association; I admit that my first design was not well suited.
An alternative would be to use BigQuery. At first we used NDB, with a RawModel which stores individual, non-aggregated records, and an AggregateModel, which stores the aggregate values.
The AggregateModel was updated every time a RawModel was created, which caused some inconsistency issues. In hindsight, properly using parent/ancestor keys as Tim suggested would've worked, but in the end we found BigQuery much more pleasant and intuitive to work with.
We just have cronjobs that run everyday to push RawModel to BigQuery and another to create the AggregateModel records with data fetched from BigQuery.
(Of course, this is only effective if you have lots of data to aggregate)
It really does depend on the use case. For small numbers of items StructuredProperty and repeated properties may well be the best fit.
For large numbers of entities you will then look at setting the parent/ancestor in the Key for composition, and have a KeyProperty pointing to the primary entity in a many to one aggregation.
However the choice will also depend heavily on the actual use pattern as well. Then considerations of efficiency kick in.
The best I can suggest is to consider carefully how you plan to use these relationships: how active they are (i.e. are they constantly changing, with members added and deleted), and whether you need to see all members of the relation most of the time or just subsets. These considerations may well require adjustments to the approach.

Is querying NDB JsonProperty in Google App Engine possible? If not, any alternatives?

Is there any way of using JsonProperties in queries in NDB/GAE? I can't seem to find any information about this.
Person.query(Person.custom.eye_color == "blue").fetch()
With a model looking something like this:
class Person(ndb.Model):
    height = ndb.IntegerProperty(default=-1)
    #...
    #...
    custom = ndb.JsonProperty(indexed=False, compressed=False)
The use case is this: I'm storing data about customers, where at first we only needed to query specific data. Now we want to be able to query for any type of registered data about the persons, for example eye color, which some may have put into the system, or any other custom key/value pair in our JsonProperty.
I know about the Expando class, but to me it seems a lot easier to be able to query the JsonProperty and to keep all the custom properties under the same name, custom. That means the front end can just loop over the properties in custom. If an Expando class were used, it would be harder to tell them apart.
Rather than using a JsonProperty, have you considered using a StructuredProperty? You maintain the same structure, just stored differently, and you can filter by subcomponents of the StructuredProperty. There are some restrictions, but that may be sufficient.
See https://developers.google.com/appengine/docs/python/ndb/queries#filtering_structured_properties
for querying StructuredProperties.

Hosting a small database aside from Google App Engine?

I asked another question about doing large queries in GAE, to which the answer was pretty much "not possible".
What I want to do is this: from an iOS device, I get the phone numbers of all the user's contacts. So now I have a list of, say, 250 phone numbers. I want to send these phone numbers back to the server and check which of them belong to a User account.
So I need to do a query: query = User.query(User.phone.IN(phones_list))
However, with GAE, this is quite an expensive query. It will cost 250 reads for just this one query, and I expect to do this type of query often.
So I came up with a crazy idea. Why don't I host the phone numbers on another host, in another database, where this type of query is cheaper? Then I can have GAE send an HTTP request to my other server to get the desired info.
So I have two questions:
1. Are there any databases better suited to handling these kinds of queries, where they would be cheaper to run? Or will it all be the same as GAE?
2. Is this overkill? Is it a good idea? Should I suck it up and pay the cost?
GAE's datastore should be good enough for your service, since your application looks like it could be parallelized very well.
1. Use the phone number as the key_name of User.
Once the number is the key_name of User, code along the following lines will speed up the lookup and reduce the number of read operations.
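from google.appengine.api import memcache
from google.appengine.ext import db
cached = memcache.get_multi(phones_list)  # users already in memcache, keyed by phone number
missing = [p for p in phones_list if p not in cached]
users = db.get([db.Key.from_path('User', p) for p in missing])  # None for numbers with no account
memcache.set_multi(dict((u.key().name(), u) for u in users if u is not None))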
2. Store multiple numbers in one datastore entity.
The operation cost on GAE is not directly related to the entity's size, so a large entity that stores many numbers is another way to reduce the operation cost.
For example, store all phone numbers that share the same number_prefix together.
class Number(db.Model):
    # assumes each entity is saved with key_name equal to its number_prefix
    number_prefix = db.StringProperty()
    numbers = db.StringListProperty(indexed=False)

# check numbers 01234567 and 032123124
buckets = Number.get_by_key_name(["01", "03"])
# is "01234567" in buckets[0].numbers ?
# is "032123124" in buckets[1].numbers ?
This method could be further improved with memcache.
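The first suggestion (look numbers up by key, memcache first, datastore only for the misses) can be sketched without any App Engine imports. This is a simplified illustration: plain dicts stand in for memcache and the datastore, and lookup_users, cache and store are hypothetical names used only here.

```python
def lookup_users(phones, cache, store):
    """Return {phone: user} for the numbers that belong to an account.

    `cache` and `store` are plain dicts standing in for memcache and the
    datastore (both keyed by phone number); they are assumptions of this sketch.
    """
    # Step 1: one batched cache read (memcache.get_multi in the answer).
    found = {p: cache[p] for p in phones if p in cache}
    # Step 2: one batched fetch by key, for the cache misses only.
    missing = [p for p in phones if p not in found]
    from_store = {p: store[p] for p in missing if p in store}
    # Step 3: write what the store returned back to the cache in one call.
    cache.update(from_store)
    found.update(from_store)
    return found

store = {"01234567": "alice", "032123124": "bob"}
cache = {"01234567": "alice"}
result = lookup_users(["01234567", "032123124", "099999999"], cache, store)
```

Numbers with no account are simply absent from the result. A real version might also cache those misses for a short time so that repeated lookups of unknown numbers do not keep hitting the datastore.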
Generalizing slightly on other ideas offered... assuming that all your search keys are unique to a single User (e.g. email, phone, twitter handle, etc.)
At User write time, you can generate a set of SearchIndex(...) and persist that. Each SearchIndex has the key of the User.
Then at search time you can construct the keys for any SearchIndex and do two ndb.get_multi_async calls. The first to get matching SearchIndex entities, and the second to get the Users associated with those index entities.

Is it safe to pass Google App Engine Entity Keys into web pages to maintain context?

I have a simple GAE system that contains models for Account, Project and Transaction.
I am using Django to generate a web page that has a list of Projects in a table that belong to a given Account and I want to create a link to each project's details page. I am generating a link that converts the Project's key to string and includes that in the link to make it easy to lookup the Project object. This gives a link that looks like this:
My Project Name
Is it secure to create links like this? Is there a better way? It feels like a bad way to keep context.
The key string shows up in the linked page and is ugly. Is there a way to avoid showing it?
Thanks.
There are a few examples in the GAE docs that use the same approach, and keys use characters that are safe to include in URLs, so there is probably no problem.
BTW, I prefer to use the numeric ID (obj_key.id()) when my model uses a number as its identifier, just because it looks less ugly.
Whether or not this is 'secure' depends on what you mean by that, and how you implement your app. Let's back off a bit and see exactly what's stored in a Key object. Take your key, go to shell.appspot.com, and enter the following:
db.Key(your_key)
this returns something like the following:
datastore_types.Key.from_path(u'TestKind', 1234, _app=u'shell')
As you can see, the key contains the App ID, the kind name, and the ID or name (along with the kind/id pairs of any parent entities - in this case, none). Nothing here you should be particularly concerned about concealing, so there shouldn't be any significant risk of information leakage here.
You mention as a concern that users could guess other URLs - that's certainly possible, since they could decode the key, modify the ID or name, and re-encode the key. If your security model relies on them not guessing other URLs, though, you might want to do one of a couple of things:
Reconsider your app's security model. You shouldn't rely on 'secret URLs' for any degree of real security if you can avoid it.
Use a key name, and set it to a long, random string that users will not be able to guess.
A final concern is what else users could modify. If you handle keys by passing them to db.get, the user could change the kind name, and cause you to fetch a different entity kind to that which you intended. If that entity kind happens to have similarly named fields, you might do things to the entity (such as revealing data from it) that you did not intend. You can avoid this by passing the key to YourModel.get instead, which will check the key is of the correct kind before fetching it.
All this said, though, a better approach is to pass the key ID or name around. You can extract this by calling .id() on the key object (for an ID - .name() if you're using key names), and you can reconstruct the original key with db.Key.from_path('kind_name', id) - or just fetch the entity directly with YourModel.get_by_id.
After doing some more research, I think I can now answer my own question. I wanted to know if using GAE keys or ids was inherently unsafe.
It is, in fact, unsafe without some additional code, since a user could modify URLs in the returned web page or visit URLs that they build manually. This would potentially let an authenticated user edit another user's data just by changing a key ID in a URL.
So for every resource that you allow access to, you need to ensure that the currently authenticated user has the right to be accessing it in the way they are attempting.
This involves writing extra queries for each operation, since it seems there is no built-in way to just say "Users only have access to objects that are owned by them".
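A minimal sketch of such a per-request ownership check is shown here, in plain Python with a dict standing in for the datastore. The names PROJECTS, Forbidden and get_project_for, and the "owner" field, are assumptions made for this illustration and are not part of any GAE API.

```python
class Forbidden(Exception):
    """Raised when the current user may not access the requested entity."""

# Hypothetical in-memory stand-in for stored projects: id -> record.
PROJECTS = {
    1: {"name": "My Project Name", "owner": "alice"},
    2: {"name": "Other Project", "owner": "bob"},
}

def get_project_for(user, project_id):
    """Load a project by the ID taken from the URL, then verify ownership."""
    project = PROJECTS.get(project_id)
    # Treat "missing" and "not yours" the same way so IDs cannot be probed.
    if project is None or project["owner"] != user:
        raise Forbidden("project %r is not accessible" % project_id)
    return project

own = get_project_for("alice", 1)
try:
    get_project_for("alice", 2)  # alice edits the ID in the URL
    blocked = False
except Forbidden:
    blocked = True
```

Routing every handler through one helper like this keeps the check in a single place instead of repeating it in each view.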
I know this is an old post, but I want to clarify one thing. Sometimes you NEED to work with keys.
When you have an entity with a @Parent relationship, you can't get it by its ID alone; you need the whole key to get it back from the Datastore. In these cases you need to work with the key all the time if you want to retrieve your entity.
The IDs aren't simply increasing; I only have 10 entries in my Datastore and I've already reached 7001.
As long as there is some form of protection so users can't simply guess them, there is no reason not to do it.

Need a way to count entities in GAE datastore that meet a certain condition? (over 1000 entities)

I'm building an app on GAE that needs to report on events occurring. An event has a type and I also need to report by event type.
For example, say there is an event A, B and C. They occur periodically at random. User logs in and creates a set of entities to which those events can be attributed. When the user comes back to check the status, I need to be able to tell how many events of A, B and/or C occurred during a specific time range, say a day or a month.
The 1000 limit is throwing a wrench into how I would normally do it. I don't need to retrieve all of the entities and present them to the user, but I do need to show the total count for a specific date range. Any suggestions?
I'm a bit of python/GAE noob...
App Engine is not a relational database and you won't be able to quickly do counts on the fly like this. The best approach is to update the counts at write time, not generate them at read time.
When generating counts, there are two general approaches that scale well with App Engine to minimize write contention:
Store the count in Memcache or local memory and periodically flush. This is the simplest solution, but it can be volatile and data loss is probable.
Use a Sharded Counter. This approach is a bit more reliable but more complex. You won't be able to sort easily by count, but you could periodically flush to another indexed count field and sort by that.
Results of datastore count() queries and offsets for all datastore queries are no longer capped at 1000 (since version 1.3.6).
My approach would be to have an aggregate model or models to keep track of event types, dates and counts. I'm not 100% sure how you should model this given your requirements, though.
Then, I'd fire off deferred tasks to asynchronously update the appropriate aggregate models whenever a user does something that triggers an event.
Nick Johnson's Background work with the deferred library article has much more information, and provides a framework that you might find useful for doing the kind of aggregation you're talking about.
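To make the aggregate idea concrete, here is a small plain-Python sketch of counting at write time, with one bucket per event type and day. A dict stands in for the aggregate entities, and daily_counts, record_event and count_events are hypothetical names for this illustration; in the approach described above, the update would run inside a deferred task.

```python
import datetime
from collections import defaultdict

# (event_type, day) -> count; a dict stands in for the aggregate entities.
daily_counts = defaultdict(int)

def record_event(event_type, when):
    """Called at write time: bump the aggregate bucket for that day."""
    daily_counts[(event_type, when.date())] += 1

def count_events(event_type, start, end):
    """Sum the daily buckets for event_type from start to end, inclusive."""
    total = 0
    day = start
    while day <= end:
        total += daily_counts.get((event_type, day), 0)
        day += datetime.timedelta(days=1)
    return total

record_event("A", datetime.datetime(2010, 8, 1, 9, 30))
record_event("A", datetime.datetime(2010, 8, 1, 17, 0))
record_event("B", datetime.datetime(2010, 8, 2, 12, 0))
record_event("A", datetime.datetime(2010, 8, 3, 8, 15))

a_in_range = count_events("A", datetime.date(2010, 8, 1), datetime.date(2010, 8, 2))
```

Reporting on a day or a month then reads at most one bucket per day instead of scanning every event entity.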
Would a solution using cursors (like the one below) work for you? I personally use this method to count the number of entries in a scenario similar to yours, and haven't yet seen any problems with it (although I run it on a schedule, since constant querying of the datastore is pretty taxing on the CPU quota).
def count(query):
    i = 0
    while True:
        result = query.fetch(1000)
        i = i + len(result)
        if len(result) < 1000:
            break
        cursor = query.cursor()
        query.with_cursor(cursor)
    return i
This post is quite old, but I would like to provide a useful reference. App Engine now offers a built-in API to access datastore statistics:
For Python,
from google.appengine.ext.db import stats
global_stat = stats.GlobalStat.all().get()
print 'Total bytes stored: %d' % global_stat.bytes
print 'Total entities stored: %d' % global_stat.count
For Java,
import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.Query;
// ...
DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
Entity globalStat = datastore.prepare(new Query("__Stat_Total__")).asSingleEntity();
Long totalBytes = (Long) globalStat.getProperty("bytes");
Long totalEntities = (Long) globalStat.getProperty("count");
It is also possible to get the entity count for a particular kind only. Take a look at these references:
https://developers.google.com/appengine/docs/python/datastore/stats
https://developers.google.com/appengine/docs/java/datastore/stats
This sounds very similar to a question that I posed on StackOverflow: "How to get the distinct value of one of my models in Google App Engine".
I needed to know how to get the distinct values for entities within my models, and there are going to be over 1000 entities for that model.
Hope that helps.
