Concurrent access to the datastore in App Engine

Concurrent access to the datastore in App Engine - Python

I want to know whether db.run_in_transaction() acts as a lock for datastore operations and helps with concurrent access to the same entity.
Does the following code guarantee that concurrent access will not cause a race, so that an existing entity is never overwritten when a new one should be created?
Is db.run_in_transaction() the correct (or best) way to do this?
I am trying to create a new, unique entity with the following code:
def txn(charmer=None):
    new = None
    key = my_magic() + random_part()
    # db.Model looks entities up by key name with get_by_key_name()
    sk = Snake.get_by_key_name(key)
    if not sk:
        new = Snake(key_name=key, charmer=charmer)
        new.put()
    return new

db.run_in_transaction(txn, charmer)

That is a safe method. Should the same name get generated twice, only one entity would be created.
It sounds like you have already looked at the transactions documentation. There is also a more detailed description.
Check out the docs for Model.get_or_insert (specifically the equivalent code); they answer exactly the question you are asking:
The get and subsequent (possible) put are wrapped in a transaction to ensure atomicity. This means that get_or_insert() will never overwrite an existing entity, and will insert a new entity if and only if no entity with the given kind and name exists.

What you've done is right and more or less duplicates Model.get_or_insert, as Robert already explained.
I don't know if this can be called a 'lock'. The way this works is optimistic concurrency: the operation executes on the assumption that no one else is trying to do the same thing at the same time, and if someone is, the transaction fails and you get an exception. You'll need to decide what you want to do in that case. Maybe ask the user to choose a new name?
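To make the optimistic-concurrency idea concrete, here is a minimal sketch in plain Python. It does not use the App Engine API: TinyStore, ConflictError and create_if_absent are hypothetical stand-ins, and the version counter is an assumption about how a conflict could be detected. The point is only that nothing is locked while reading; the conflict is noticed at commit time and the caller re-reads and tries again.

```python
import threading


class ConflictError(Exception):
    """Raised when a commit notices that another writer got there first."""


class TinyStore(object):
    """Hypothetical in-memory stand-in for a datastore (not the App Engine API)."""

    def __init__(self):
        self._data = {}       # key -> value
        self._versions = {}   # key -> number of committed writes
        self._commit_lock = threading.Lock()  # guards only the commit step

    def read(self, key):
        """Return (value, version); a missing key reads as (None, 0)."""
        return self._data.get(key), self._versions.get(key, 0)

    def commit(self, key, value, expected_version):
        """Write only if nobody committed to this key since it was read."""
        with self._commit_lock:
            if self._versions.get(key, 0) != expected_version:
                raise ConflictError(key)
            self._data[key] = value
            self._versions[key] = expected_version + 1


def create_if_absent(store, key, value, retries=3):
    """Return True if this call created the entry, False if it already existed."""
    for _ in range(retries):
        existing, version = store.read(key)
        if existing is not None:
            return False          # never overwrite an existing entry
        try:
            store.commit(key, value, version)
            return True
        except ConflictError:
            continue              # someone else committed first: re-read and retry
    raise ConflictError(key)


store = TinyStore()
results = []


def worker(n):
    results.append(create_if_absent(store, "cobra", "charmer-%d" % n))


threads = [threading.Thread(target=worker, args=(n,)) for n in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

created_count = results.count(True)
```

However the eight threads interleave, only one of them reports that it created the entry, and the stored value is written exactly once.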

Related

Solution needed to a scenario

I am trying to use a column's values as a radio button's choices with the code below.
Forms.py
# retrieving data from the database and assigning it to the diction list
diction = polls_datum.objects.values_list('poll_choices', flat=True)
# initializing list and dictionary
OPTIONS1 = {}
OPTIONS = []
# creating the dictionary with keys 0 to the number of options in the list
for i in range(len(diction)):
    OPTIONS1[i] = diction[i]
# creating tuples from the dictionary above
# OPTIONS = zip(OPTIONS1.keys(), OPTIONS1.values())
for i in OPTIONS1:
    k = (i, OPTIONS1[i])
    OPTIONS.append(k)

class polls_form(forms.ModelForm):
    # retrieving data from the database and assigning it to the diction list
    options = forms.ChoiceField(choices=OPTIONS, widget=forms.RadioSelect())

    class Meta:
        model = polls_model
        fields = ['options']
Using a form, I save the choices in a field (poll_choices), but when I try to display them on the index page the new values do not show up until the server is restarted.
Can someone help with this, please?

Of course "it is not reflecting until a server restart". That's obvious when you remember that Django server processes are long-running processes (it's not like PHP, where each script is executed afresh on each request), and that top-level code (code at the module's top level, not in a function) is only executed once per process, when the module is first imported. As a general rule: don't do ANY db query at a module's top level or at the top level of a class statement. At best you'll get stale data; at worst it will crash your server process (if you're querying before everything has been properly set up by Django, or if your query depends on a schema change whose migration has not been applied yet).
The possible solutions are either to wait until the form's initialisation to set up your field's choices, or to pass a callable as the form field's choices option, cf. https://docs.djangoproject.com/en/2.1/ref/forms/fields/#django.forms.ChoiceField.choices
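The difference between a list built at import time and a callable can be shown without Django at all. The sketch below is plain Python: poll_choices_table stands in for the database table, and FakeChoiceField is a hypothetical class that only mimics the "list or callable" idea, not Django's actual ChoiceField.

```python
# Stand-in for the database table; in the real app this would be a queryset.
poll_choices_table = ["red", "green"]


def load_choices():
    """Build (value, label) pairs from the current table contents."""
    return [(i, label) for i, label in enumerate(poll_choices_table)]


# Evaluated once, when the module is first imported: a frozen snapshot.
OPTIONS_AT_IMPORT = load_choices()


class FakeChoiceField(object):
    """Hypothetical field that accepts either a list or a callable for choices."""

    def __init__(self, choices):
        self._choices = choices

    @property
    def choices(self):
        # A callable is re-evaluated on every access; a list is returned as-is.
        return self._choices() if callable(self._choices) else self._choices


stale_field = FakeChoiceField(choices=OPTIONS_AT_IMPORT)
fresh_field = FakeChoiceField(choices=load_choices)

# A new row is saved while the process keeps running.
poll_choices_table.append("blue")

stale = stale_field.choices
fresh = fresh_field.choices
```

The field that was given the snapshot never sees "blue", which is the "not reflecting until a server restart" symptom; the field that was given the function picks it up.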
Also, the way you're building your choices list is needlessly complicated. You could do it as a one-liner:
OPTIONS = list(enumerate(polls_datum.objects.values_list('poll_choices', flat=True)))
but it's also very brittle: you're relying on the current db content and ordering for the choice value when you should use the polls_datum's pk instead (which is guaranteed to be stable).
And finally: since you're working with what seems to be a related model, you may want to use a ModelChoiceField instead.
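Why position-based values are brittle can be seen with a small plain-Python sketch. The rows list is made-up sample data standing in for the polls_datum table, and both helper functions are hypothetical.

```python
# Made-up rows standing in for the polls_datum table.
rows = [
    {"pk": 10, "poll_choices": "red"},
    {"pk": 11, "poll_choices": "green"},
    {"pk": 12, "poll_choices": "blue"},
]


def choices_by_position(rows):
    """(index, label) pairs: the value depends on current content and order."""
    return list(enumerate(r["poll_choices"] for r in rows))


def choices_by_pk(rows):
    """(pk, label) pairs: the value stays attached to its row."""
    return [(r["pk"], r["poll_choices"]) for r in rows]


# A user picks "blue" and the submitted value is stored.
picked_position = dict((label, v) for v, label in choices_by_position(rows))["blue"]
picked_pk = dict((label, v) for v, label in choices_by_pk(rows))["blue"]

# Later the "red" row is deleted.
rows = [r for r in rows if r["poll_choices"] != "red"]

label_for_position = dict(choices_by_position(rows)).get(picked_position)
label_for_pk = dict(choices_by_pk(rows)).get(picked_pk)
```

After the deletion the stored position no longer maps to anything (and in other cases it could silently map to a different label), while the stored pk still resolves to "blue".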

For future reference:
What version of Django are you using?
Have you read up on the documentation of ModelForms? https://docs.djangoproject.com/en/2.1/topics/forms/modelforms/
I'm not sure what you're trying to do with diction to dictionary to tuple. I think you could skip a step there and your future self will thank you for that.
Try to follow some tutorials and understand why certain steps are being taken. I can see from your code that you're rather new to coding or Python and there's room for improvement. Not trying to talk you down, but I'm trying to push you into the direction of becoming a better developer ;-)
REAL ANSWER:
That being said, I think the solution is to write the loading of the data somewhere in your form model, rather than 'loose' in forms.py. See bruno's answer for more information on this.
If you want to reload the data on each request that loads the form, you should create a function that gets called every time the form is loaded (for example in the form's __init__ function).

Overwriting an entity by reusing its id

If we add a second entity of the same model (NDB) with the same id, would the first entity be replaced by the second one?
Is this the right way to do it? Could it cause any problems in the future?
I use GAE Python with NDB.
E.g.,
class X(ndb.Model):
    command = ndb.StringProperty()

x_record = X(id="id_value", command="c1")
x_record.put()
# After some time
x_record = X(id="id_value", command="c2")
x_record.put()
I did find a mention of this in official Google docs.
CONTEXT
I intend to use it to reduce code steps. Presently, first the code checks if an entity with key X already exists. If it exists, it updates its properties. Else, it creates a new one with that key(X). New approach would be to just blindly create a new entity with key X.

Yes, you would simply replace the model.
Would it cause any problems? Only if you wanted the original model back...
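One way to picture what "wanting the original back" can mean: a blind put replaces the whole stored record, so any property the new entity does not set is gone, whereas the check-then-update approach keeps it. The sketch below uses a plain dict as a stand-in for the datastore; put, update and the note property are hypothetical and are not NDB calls.

```python
store = {}  # entity id -> dict of properties; stand-in for the datastore


def put(entity_id, **props):
    """Blind put: whatever was stored under entity_id is replaced wholesale."""
    store[entity_id] = dict(props)


def update(entity_id, **props):
    """Read-modify-write: keeps properties that the caller does not mention."""
    current = dict(store.get(entity_id, {}))
    current.update(props)
    store[entity_id] = current


put("id_value", command="c1", note="created by hand")

update("id_value", command="c2")
after_update = dict(store["id_value"])

put("id_value", command="c3")
after_put = dict(store["id_value"])
```

If every property is always set on the new entity, the two approaches end up with the same stored data and the blind put is simply shorter.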

Update a field in a django model only if it needs updating

Suppose I have some django model and I'm updating an instance
def modify_thing(id, new_blah):
    mything = MyModel.objects.get(pk=id)
    mything.blah = new_blah
    mything.save()
My question is, if it happened that it was already the case that mything.blah == new_blah, does django somehow know this and not bother to save this [non-]modification again? Or will it always go into the db (MySQL in my case) and update data?
If I want to avoid an unnecessary write, does it make any sense to do something like:
if mything.blah != new_blah:
    mything.blah = new_blah
    mything.save()
Given that the record would have to be read from db anyway in order to do the comparison in the first place? Is there any efficiency to be gained from this sort of construction - and if so, is there a less ugly way of doing that than with the if statement in python?

You can use Django signals to make sure that code like what you posted doesn't write to the db. Take a look at pre_save; that's the signal you're looking for.

Given that Django does not cache the values, a trip to the DB is inevitable: you have to fetch the record to compare the value. There are other ways to write the check, for example:
if mything.blah == new_blah:
    pass  # already up to date, nothing to write
else:
    mything.blah = new_blah
    mything.save()
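Another option is to let the database do the comparison, so that no separate read is needed. The sketch below shows the idea with the standard-library sqlite3 module rather than Django or MySQL; the table name mymodel and its columns are assumptions made for the example. In Django the same idea would usually be written as a filtered queryset update, roughly MyModel.objects.filter(pk=id).exclude(blah=new_blah).update(blah=new_blah).

```python
import sqlite3

# In-memory database standing in for the real MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mymodel (id INTEGER PRIMARY KEY, blah TEXT)")
conn.execute("INSERT INTO mymodel (id, blah) VALUES (1, 'old')")
conn.commit()


def modify_thing(conn, pk, new_blah):
    """Update blah only when it differs; return the number of rows written."""
    cur = conn.execute(
        "UPDATE mymodel SET blah = ? WHERE id = ? AND blah <> ?",
        (new_blah, pk, new_blah),
    )
    conn.commit()
    return cur.rowcount


first = modify_thing(conn, 1, "new")    # value differs, so the row is written
second = modify_thing(conn, 1, "new")   # value already matches, nothing written
stored = conn.execute("SELECT blah FROM mymodel WHERE id = 1").fetchone()[0]
```

One caveat of this sketch: a row whose blah is NULL would not match the <> comparison, so nullable columns need an extra condition.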

How to check the existence of a single entity? Google App Engine, Python

Sorry for another beginner question.
I'm trying to do something very simple here, and I don't know how. The documentation gives me hints that do not work or do not apply.
I receive a POST request and grab a variable out of it, called "name".
I have to search all my entities of kind Object (for example) and find out whether one has the same name. If there is none, I must create a new entity with this name. It may look easy, but I keep failing.
I would really appreciate any help.
My code is currently this:
objects_qry = Object.query(Object.name == data["name"])
if not objects_qry:
    obj = Object()
    obj.name = data["name"]
    obj.put()

class Object(ndb.Model):
    name = ndb.StringProperty()

Using a query to perform this operation is really inefficient.
In addition, your code is potentially unreliable: if the name doesn't exist and two requests for that name arrive at the same time, you could end up with two records. And you can't tell, because your query only returns the first entity whose name property equals the given value.
Because you expect only one entity per name, a query is expensive and inefficient.
So you have two choices: use get_or_insert, or just do a get and, if you get no value, create a new entity.
Anyway, here are a couple of code samples using the name as part of the key.
name = data['name']
entity = Object.get_or_insert(name)
or
entity = Object.get_by_id(name)
if not entity:
    entity = Object(id=name)
    entity.put()

Calling .query just creates a query object; it doesn't execute it, so trying to evaluate it as a boolean is wrong. Query objects have methods, fetch and get, that return a list of matching entities or just one entity, respectively.
So your code could be re-written:
objects_qry = Object.query(Object.name == data["name"])
existing_object = objects_qry.get()
if not existing_object:
    obj = Object()
    obj.name = data["name"]
    obj.put()
That said, Tim's point in the comments about using the ID instead of a property makes sense if you really care about names being unique - the code above wouldn't stop two simultaneous requests from creating entities with the same name.
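The reason the original `if not objects_qry` never fires can be shown with a plain-Python stand-in. FakeQuery below is hypothetical and is not the NDB Query class; it only imitates the fact that a query object describes a search and holds no results until it is asked for them.

```python
class FakeQuery(object):
    """Hypothetical stand-in for a query object: a description of a search."""

    def __init__(self, rows, name):
        self._rows = rows
        self._name = name

    def fetch(self):
        """Run the search and return every matching row."""
        return [r for r in self._rows if r["name"] == self._name]

    def get(self):
        """Run the search and return the first match, or None."""
        matches = self.fetch()
        return matches[0] if matches else None


rows = [{"name": "alice"}]
qry = FakeQuery(rows, "bob")

query_is_truthy = bool(qry)   # tests the object itself, not its results
first_match = qry.get()       # actually runs the search
```

An ordinary Python object is truthy by default, so `not qry` is False even though nothing matches; only the result of get (or fetch) says whether an entity exists.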

Recursive delete in google app engine

I'm using google app engine with django 1.0.2 (and the django-helper) and wonder how people go about doing recursive delete.
Suppose you have a model that's something like this:
class Top(BaseModel):
    pass

class Bottom(BaseModel):
    daddy = db.ReferenceProperty(Top)
Now, when I delete an object of type 'Top', I want all the associated 'Bottom' objects to be deleted as well.
As things are now, when I delete a 'Top' object, the 'Bottom' objects stay and then I get data that doesn't belong anywhere. When accessing the datastore in a view, I end up with:
Caught an exception while rendering: ReferenceProperty failed to be resolved.
I could of course find all objects and delete them, but since my real model is at least 5 levels deep, I'm hoping there's a way to make sure this can be done automatically.
I've found this article about how it works with Java and that seems to be pretty much what I want as well.
Anyone know how I could get that behavior in django as well?

You need to implement this manually, by looking up affected records and deleting them at the same time as you delete the parent record. You can simplify this, if you wish, by overriding the .delete() method on your parent class to automatically delete all related records.
For performance reasons, you almost certainly want to use key-only queries (allowing you to get the keys of entities to be deleted without having to fetch and decode the actual entities), and batch deletes. For example:
db.delete(Bottom.all(keys_only=True).filter("daddy =", top).fetch(1000))
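For a hierarchy several levels deep, the manual approach amounts to collecting keys level by level and then deleting them in one batch. The sketch below shows that walk in plain Python: the references dict, the entities set and both functions are hypothetical stand-ins for keys-only queries and a batch delete, not datastore calls.

```python
# child key -> parent key; stand-in for ReferenceProperty links.
references = {
    "m1": "top", "m2": "top",
    "b1": "m1", "b2": "m1", "b3": "m2",
    "other": "elsewhere",
}
entities = set(references) | {"top", "elsewhere"}


def child_keys(parent_key):
    """Stand-in for a keys-only query filtered on the reference property."""
    return [k for k, p in references.items() if p == parent_key]


def delete_recursive(root_key):
    """Collect root_key and all of its descendants, then delete them together."""
    to_delete = []
    frontier = [root_key]
    while frontier:
        key = frontier.pop()
        to_delete.append(key)
        frontier.extend(child_keys(key))
    # Stand-in for a single batch delete of all collected keys.
    for key in to_delete:
        entities.discard(key)
        references.pop(key, None)
    return sorted(to_delete)


deleted = delete_recursive("top")
```

In a real model the same walk could live in an overridden delete() on the parent class, with one keys-only query per level.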

Actually that behavior is GAE-specific. Django's ORM simulates "ON DELETE CASCADE" on .delete().
I know that this is not an answer to your question, but maybe it can help you from looking in the wrong places.

Reconsider the data structure. If the relationship will never change during the record's lifetime, you could use the "ancestors" feature of GAE:
class Top(db.Model): pass
class Middle(db.Model): pass
class Bottom(db.Model): pass
top = Top()
middles = [Middle(parent=top) for i in range(0,10)]
bottoms = [Bottom(parent=middle) for i in range(0,10) for middle in middles]
Then querying for ancestor=top will find all the records from all levels. So it will be easy to delete them.
descendants = list(db.Query().ancestor(top))
# should return [top] + middles + bottoms

If your hierarchy is only a small number of levels deep, then you might be able to do something with a field that looks like a file path:
daddy.ancestry = "greatgranddaddy/granddaddy/daddy/"
me.ancestry = daddy.ancestry + me.uniquename + "/"
sort of thing. You do need unique names, at least unique among siblings.
The path in object IDs sort of does this already, but IIRC that's bound up with entity groups, which you're advised not to use to express relationships in the data domain.
Then you can construct a query to return all of granddaddy's descendants using the initial substring trick, like this:
query = Person.all()
query.filter("ancestry >", gdaddy.ancestry + u"\u0001")
query.filter("ancestry <", gdaddy.ancestry + u"\uffff")
Obviously this is no use if you can't fit the ancestry into a 500 byte StringProperty.
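The initial-substring trick relies only on string ordering, so it can be checked in plain Python. The sketch below uses a made-up dict of ancestry paths in place of stored entities; descendants_of is a hypothetical helper that applies the same two inequality bounds as the query above.

```python
# Made-up ancestry paths standing in for the stored ancestry property.
ancestries = {
    "gg": u"greatgranddaddy/",
    "g": u"greatgranddaddy/granddaddy/",
    "d": u"greatgranddaddy/granddaddy/daddy/",
    "me": u"greatgranddaddy/granddaddy/daddy/me/",
    "uncle": u"greatgranddaddy/granddaddy/uncle/",
    "stranger": u"greatgranddaddy/granduncle/",
}


def descendants_of(prefix):
    """Names whose path sorts strictly between prefix+U+0001 and prefix+U+FFFF."""
    low = prefix + u"\u0001"
    high = prefix + u"\uffff"
    return sorted(name for name, path in ancestries.items() if low < path < high)


result = descendants_of(ancestries["g"])
```

The bounds exclude granddaddy himself (his path is a strict prefix of the lower bound) and anyone on a different branch, and keep every path that starts with his.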
