Mongokit add objects to collection - python

How do you insert objects into a collection using MongoKit in Python?
I understand you can specify the 'collection' field in a model and that you can create models in the db like
user = db.Users()
Then save the object like
user.save()
However I can't seem to find any mention of an insert function. If I create a User object and now want to insert it into a specific collection, say "online_users", what do I do?

After completely guessing it appears I can successfully just call
db.online_users.insert(user)

Create a new Document called OnlineUser with the __collection__ field set to online_users, and then relate User and OnlineUser through either an ObjectId or a DBRef. MongoKit supports both through:
pymongo.objectid.ObjectId (bson.objectid.ObjectId in newer PyMongo releases)
bson.dbref.DBRef
You can also use a list of any other field type as a field.

I suppose your user object is a dict like
user = {
    "one": "balabala",
    "two": "lalala",
    "more": "I am happy always!"
}
And here is my solution; it is not pretty, but it works :)
online_users = db.Online_users()  # get a document bound to your collection
for item in user:
    if item == "item you don't want":
        continue  # skip any item you don't want
    online_users[item] = user[item]
online_users.save()
db.close()  # close the db connection
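The key-by-key copy in the loop above can also be written as a dict comprehension. This is a minimal sketch in plain Python with no database involved; the field names and the `skip_fields` set are made up for illustration.

```python
# Hypothetical data mirroring the example above; the field names are made up.
user = {
    "one": "balabala",
    "two": "lalala",
    "more": "I am happy always!",
}

# Keys to leave out of the copy (an assumption for illustration).
skip_fields = {"more"}

# Build the filtered copy in one pass instead of assigning key by key.
online_user_fields = {
    key: value for key, value in user.items() if key not in skip_fields
}

print(online_user_fields)
```

Assuming the document object behaves like a dict, as the loop above already relies on, the filtered dict could then be merged into it with `update()` before calling `save()`.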

Related

Django "update_or_create" API: how to filter objects by created or updated?

So, I'm using the Django update_or_create API to build my form data. It works fine, but once the objects are built I need a way to check which profiles were updated and which were created for the first time.
Just an example:
for people in peoples:
    people, updated = People.objects.update_or_create(
        id=people.person_id,
        defaults={
            'first_name': people.first_name,
        }
    )
Filtering queryset:
people = People.objects.filter(
    id__in=whatever,
)
But now I'm trying to filter the queryset by created or updated, and I don't see an API for that (e.g., a filter of some sort).
So, I would like to do something like:
updated = Person.objects.filter(updated=True, created_for_first_time=False)
and then I can write something like
if updated:
    do this
else:
    do this
Basically, I just want to check if a profile was updated or created for the first time.
As you have shown in your question, the update_or_create method returns a tuple (obj, created), where obj is the object and created is a boolean showing whether a new object was created.
You could check the value of the boolean field, and create a list to store the ids of the newly created objects
new_objects = []
for people in peoples:
    obj, created = People.objects.update_or_create(...)
    if created:
        new_objects.append(obj.id)
You can then filter using that list:
new_people = People.objects.filter(id__in=new_objects)
existing_people = People.objects.exclude(id__in=new_objects)
When you call update_or_create:
person, created = People.objects.update_or_create(...)
the created return value will be True if the record was freshly created, or False if it was an existing record that got updated. If you need to act on this bit of information, it would be best to do it right here, while you have access to it.
If you need to do it later, the only way I can think of is to design a schema that supports it, i.e. have create_date and last_modify_date fields, and if those two fields are equal, you know the record has not been modified since it was created.
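The two-timestamp idea can be sketched without a database. The `ProfileRecord` class and its `modified_since_creation` property below are hypothetical stand-ins for a model with create_date and last_modify_date fields, not Django code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ProfileRecord:
    """Hypothetical stand-in for a row with two timestamp columns."""
    name: str
    create_date: datetime
    last_modify_date: datetime

    @property
    def modified_since_creation(self) -> bool:
        # Equal timestamps mean the row was never touched after creation.
        return self.last_modify_date != self.create_date


created_at = datetime(2020, 1, 1, 12, 0, 0)
fresh = ProfileRecord("Ann", created_at, created_at)
edited = ProfileRecord("Bob", created_at, created_at + timedelta(hours=3))

records = [fresh, edited]
untouched = [r.name for r in records if not r.modified_since_creation]
updated = [r.name for r in records if r.modified_since_creation]
print(untouched, updated)
```

The comparison only works if both fields receive exactly the same value when the row is created. If the two timestamps are generated by separate clock reads, they may differ by a few microseconds, so it is safer to set both from a single value at creation time.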

Issue with multiple records of the same subdocument returned for MongoDb and Python

I'm getting a strange issue when reading back an object with subdocuments from a Mongo database using Python and pymongo.
I have a document with a list of sub documents, e.g.
User: {
    "_id": ....
    hats: [{"colour": "blue" }]
}
I query this using find_one(). It returns the details of the document and that one subdocument record. However, the next time I run the query I get back two "hats", the second being a duplicate of the first. The time after that I get back three "hats", and it continues like that.
If I restart the application, the count is reset, so the find_one() query returns one subdocument again.
There is definitely only one subdocument record in the database, so that isn't the issue. It must be doing something weird in memory.
I am using the Python "Tornado" framework; the application is a tornado.wsgi.WSGIApplication. Every time a new request comes in, it should be opening a new connection.
The request handler does something along the lines of
class Handler(RequestHandler):
    def initialize(self):
        self.db = MongoClient("localhost", 27017)
I'm really baffled about what it could be.
Ultimately it was nothing to do with MongoDb.
For each collection type in the database I had a model class. With this I could decouple the exact details of how the data is stored from how it is returned from the server.
class User(ModelBase):
    name = None
    hats = []
    @staticmethod
    def from_db(document):
        model = User()
        model.name = document.get("name")
        hats = document.get("hats", list())
        for hat_document in hats:
            hat = Hat.from_db(hat_document)
            model.hats.append(hat)
        return model
The problem is that I defined the attributes directly on the model class instead of setting them up in the __init__ method.
That is
class User(ModelBase):
    name = None
    hats = []
instead of
class User(ModelBase):
    def __init__(self):
        self.name = None
        self.hats = []
I thought they were equivalent, but attributes assigned directly in the class body are class attributes shared by all instances. Because hats was one shared list, appending a Hat for one User object left it in place for the next User object, which then added another one.
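The difference is easy to reproduce in isolation. The two class names below are made up for this sketch and stand in for the two versions of User.

```python
class SharedHats:
    hats = []  # class attribute: one list shared by every instance


class OwnHats:
    def __init__(self):
        self.hats = []  # instance attribute: a new list per object


a, b = SharedHats(), SharedHats()
a.hats.append("blue")
print(b.hats)            # ['blue']
print(a.hats is b.hats)  # True

c, d = OwnHats(), OwnHats()
c.hats.append("blue")
print(d.hats)            # []
```

Only in-place mutation of a shared mutable object causes the leak. An assignment such as model.name = ... creates an attribute on the instance, which is why name = None at class level did not show the same symptom.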

How to get dictionary object in Mongoengine Python?

Querying with pymongo gives me a dictionary that can be sent directly as the response to an API request, whereas mongoengine returns Document objects. So I have to convert every object before it can be sent as the API response.
This is how I have to query in mongoengine:
users = User.objects(location='US')
This returns a BaseQueryList object containing User model objects. Instead, I need a list of dictionaries for the users.
BaseQueryList has a method called as_pymongo, which returns the rows as dicts, just as pymongo would. For example:
users = User.objects(location='US').as_pymongo()
OR
The BaseQueryList contains a list of User objects.
Each User object has an attribute called _data, which holds its data as a dict.
So you can try the following:
users = [user._data for user in users._iter_results()]
I hope this helps.
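Mongoengine documents have a to_mongo() method that gives you a dict-like object (a SON instance). It is defined on each document, not on the queryset, so call it per object:
users = User.objects(location='US')
user_dicts = [user.to_mongo().to_dict() for user in users]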

Mongodb query return type

When I make a query in Mongodb using Mongokit in Python, it returns a json document object. However I need to use the return value as a model type that I have defined. For example, if I have the class:
class User(Document):
    structure = {
        'name': basestring
    }
and make the query
user = db.users.find_one({'name':'Mike'})
I want user to be an object of type User, so that I can embed it into other objects that have fields of type User. However it just returns a json document. Is there a way to cast it or something? This seems like something that should be very intuitive and easy to do.
From what I can see, MongoKit is built on top of pymongo, and pymongo's find has an argument called as_class:
as_class (optional): class to use for documents in the query result (default is document_class)
http://api.mongodb.org/python/current/api/pymongo/collection.html#pymongo.collection.Collection.find
Note that as_class belongs to the PyMongo 2.x API; PyMongo 3 removed it in favor of the document_class codec option.

Create Django model or update if exists

I want to create a model object, like Person, if the person's id does not exist; otherwise I want to get the existing person object.
The code to create a new person is as follows:
class PersonManager(models.Manager):
    def create_person(self, identifier):
        person = self.create(identifier=identifier)
        return person
# PersonManager must be defined before Person refers to it
class Person(models.Model):
    identifier = models.CharField(max_length=10)
    name = models.CharField(max_length=20)
    objects = PersonManager()
But I don't know where to check and get the existing person object.
It's unclear whether your question is asking for the get_or_create method (available from at least Django 1.3) or the update_or_create method (new in Django 1.7). It depends on how you want to update the user object.
Sample use is as follows:
# In both cases, the call will get a person object with matching
# identifier or create one if none exists; if a person is created,
# it will be created with name equal to the value in `name`.

# In this case, if the Person already exists, its existing name is preserved
person, created = Person.objects.get_or_create(
    identifier=identifier, defaults={"name": name}
)

# In this case, if the Person already exists, its name is updated
person, created = Person.objects.update_or_create(
    identifier=identifier, defaults={"name": name}
)
If you're looking for the "update if exists, else create" use case, please refer to @Zags' excellent answer.
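The difference between the two calls can be illustrated with an in-memory stand-in. This is plain Python, not Django: the `people` dict and the two helper functions are hypothetical and only imitate the (obj, created) return shape and the way `defaults` is applied.

```python
# In-memory stand-in for a table keyed by identifier; NOT Django code.
people = {}


def get_or_create(identifier, defaults):
    """Return (record, created); an existing record is left untouched."""
    if identifier in people:
        return people[identifier], False
    people[identifier] = {"identifier": identifier, **defaults}
    return people[identifier], True


def update_or_create(identifier, defaults):
    """Return (record, created); an existing record gets `defaults` applied."""
    if identifier in people:
        people[identifier].update(defaults)
        return people[identifier], False
    people[identifier] = {"identifier": identifier, **defaults}
    return people[identifier], True


_, created_first = get_or_create("p1", {"name": "Alice"})
person, created_again = get_or_create("p1", {"name": "Bob"})
name_after_get = person["name"]       # existing name is preserved
person, created_third = update_or_create("p1", {"name": "Carol"})
name_after_update = person["name"]    # name is overwritten
print(created_first, created_again, name_after_get, name_after_update)
```

In both helpers `defaults` is used when a record is created; only the second one also applies it to a record that already exists.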
Django already has get_or_create: https://docs.djangoproject.com/en/dev/ref/models/querysets/#get-or-create
For you it could be:
identifier = 'some identifier'
person, created = Person.objects.get_or_create(identifier=identifier)
if created:
    pass  # a new person was created
else:
    pass  # person refers to the existing one
Django has support for this; check get_or_create:
person, created = Person.objects.get_or_create(name='abc')
if created:
    pass  # a new person object was created
else:
    pass  # the person object already exists
For a small number of objects update_or_create works well, but it won't scale well over a large collection. update_or_create always runs a SELECT first and then an UPDATE.
for the_bar in bars:
    updated_rows = SomeModel.objects.filter(bar=the_bar).update(foo=100)
    if not updated_rows:
        # if not exists, create new
        SomeModel.objects.create(bar=the_bar, foo=100)
At best this runs only the UPDATE query, and it runs an additional INSERT query only when the update matched zero rows. This greatly improves performance if you expect most of the rows to exist already.
It all comes down to your use case though. If you are expecting mostly inserts then perhaps the bulk_create() command could be an option.
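The update-first pattern above can be sketched at the SQL level with the standard-library sqlite3 module. The table name, column names, and the `set_foo` helper are assumptions made for this example; it shows the idea, not what the Django ORM emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_model (bar TEXT PRIMARY KEY, foo INTEGER)")
conn.execute("INSERT INTO some_model (bar, foo) VALUES ('existing', 1)")


def set_foo(conn, bar, foo):
    """Try UPDATE first; INSERT only when no row matched.

    Returns True if a new row was inserted, False if one was updated.
    """
    cursor = conn.execute(
        "UPDATE some_model SET foo = ? WHERE bar = ?", (foo, bar)
    )
    if cursor.rowcount == 0:
        conn.execute(
            "INSERT INTO some_model (bar, foo) VALUES (?, ?)", (bar, foo)
        )
        return True
    return False


inserted_flags = [set_foo(conn, bar, 100) for bar in ("existing", "new")]
conn.commit()
rows = conn.execute("SELECT bar, foo FROM some_model ORDER BY bar").fetchall()
print(inserted_flags, rows)
```

With concurrent writers, two processes can both see zero updated rows and both attempt the INSERT, so a unique constraint (as on `bar` here) or a surrounding transaction is still needed to avoid duplicates.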
Thought I'd add an answer since your question title looks like it is asking how to create or update, rather than get or create as described in the question body.
If you did want to create or update an object, the .save() method already has this behaviour by default, from the docs:
Django abstracts the need to use INSERT or UPDATE SQL statements.
Specifically, when you call save(), Django follows this algorithm:
If the object’s primary key attribute is set to a value that evaluates
to True (i.e., a value other than None or the empty string), Django
executes an UPDATE. If the object’s primary key attribute is not set
or if the UPDATE didn’t update anything, Django executes an INSERT.
It's worth noting that when they say 'if the UPDATE didn't update anything' they are essentially referring to the case where the id you gave the object doesn't already exist in the database.
You can also use update_or_create just like get_or_create. Here is the pattern I follow for update_or_create, assuming a model Person with id (key), name, age, and is_manager as attributes:
update_values = {"is_manager": False}
new_values = {"name": "Bob", "age": 25, "is_manager": True}
obj, created = Person.objects.update_or_create(
    identifier='id', defaults=update_values
)
if created:
    # Model instances have no update() method; set the fields and save instead.
    for field, value in new_values.items():
        setattr(obj, field, value)
    obj.save()
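If one of the inputs when you create is the primary key, this will be enough:
Person.objects.get_or_create(id=1)
Because two rows with the same primary key are not allowed, this returns the existing object if there is one instead of creating a duplicate. Note that get_or_create does not modify an existing row; use update_or_create if fields need to change.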
This should be the answer you are looking for:
EmployeeInfo.objects.update_or_create(
    # id or any primary key: value to search for
    identifier=your_id,
    # if found, update with the following; otherwise create
    defaults={'name': 'your_name'}
)
