mongokit index does not work - python

I am developing a web application using Flask and MongoDB, and I use (Flask-)MongoKit to define a schema to validate my data.
In my database there is a collection called "users" (see below) that contains an "email" field. I tried to create a unique index on that field as specified in the MongoKit documentation (http://namlook.github.com/mongokit/indexes.html). However, when I check the collection's indexes via the MongoDB client shell, there is no index on "email" at all.
I found a similar issue on the net: "unique index does not work" (https://github.com/namlook/mongokit/issues/98)
Does anyone have any idea why it does not work?
User collection:
@db.register
class User(Model):
    __collection__ = 'users'
    structure = {
        'first_name': basestring,
        'last_name': basestring,
        'email': basestring,
        'password': unicode,
        'registration_date': datetime,
    }
    required_fields = ['first_name', 'last_name', 'email', 'password', 'registration_date']
    default_values = {
        'registration_date': datetime.utcnow,
    }
    # Create a unique index on the "email" field
    indexes = [
        {
            'fields': 'email',  # note: this may be an array
            'unique': True,     # only unique values are allowed
            'ttl': 0,           # create index immediately
        },
    ]
db.users.getIndexes() output:
[
    {
        "v" : 1,
        "key" : {
            "_id" : 1
        },
        "ns" : "youthmind.users",
        "name" : "_id_"
    },
]
Note that I also tried without 'ttl': 0, and that I was able to create an index using the following piece of code:
db.users.create_index('email', unique=True)
I think this uses the pymongo Connection object directly.
Thanks in advance for your help.

You are doing it exactly how you should be doing it. Automatic index creation was removed from MongoKit as of version 0.7.1 (or maybe version 0.8?). Here is an issue for it.
The reason behind it is that it would have to call ensureIndex on the collection. The "ensure" part of the name makes it seem like it would check and then create the index only if it doesn't exist, but a developer from Mongo said that it might still wind up (re-)creating the entire index, which could be terribly expensive. The developer also said it should be considered an administrative task rather than a development task.
The workaround is to call create_index yourself for each index in the list you've defined, as part of an upgrade/create script.
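The workaround above can be sketched as a small helper for such an upgrade/create script. This is a minimal sketch, not MongoKit's own code: the helper names `normalize_fields` and `create_declared_indexes` are made up for illustration, and it assumes `collection` is a pymongo collection (or anything with a compatible `create_index` method) and that each spec follows the `fields`/`unique` layout used in the question.

```python
ASCENDING = 1  # same value as pymongo.ASCENDING


def normalize_fields(fields):
    """Turn a 'fields' entry (a name, or a list of names or pairs)
    into a list of (name, direction) pairs."""
    if isinstance(fields, str):
        fields = [fields]
    pairs = []
    for field in fields:
        if isinstance(field, str):
            pairs.append((field, ASCENDING))
        else:
            name, direction = field
            pairs.append((name, direction))
    return pairs


def create_declared_indexes(collection, index_specs):
    """Call create_index once per declared index (hypothetical helper)."""
    created = []
    for spec in index_specs:
        keys = normalize_fields(spec['fields'])
        # The 'ttl' entry from the question is deliberately not forwarded.
        name = collection.create_index(keys, unique=spec.get('unique', False))
        created.append(name)
    return created
```

In the setup from the question this would presumably be called once from a maintenance script, for example as `create_declared_indexes(db.users, User.indexes)`, rather than on every application start.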

Right, you need to use a separate script to recreate the DB with its indexes. It is called only when needed, not each time the server runs. Example:
def recreatedb(uri, database_name):
    connection = Connection(uri)
    connection.drop_database(database_name)
    #noinspection PyStatementEffect
    connection[database_name]
    connection.register(_DOCUMENTS)
    for document_name, obj in connection._registered_documents.iteritems():
        obj.generate_index(connection[database_name][obj._obj_class.__collection__])
To prevent using a database without indexes:
def init_engine(uri, database_name):
    global db
    connection = Connection(uri)
    if database_name not in connection.database_names():
        recreatedb(uri, database_name)
    connection.register(_DOCUMENTS)
    db = connection[database_name]

I use Flask-Script so it was easy to add Marboni's answer as a command to my manage script which is easy to run.
@manager.command
def setup_indexes():
    """
    create index for all the registered_documents
    """
    for doc in application.db.registered_documents:
        collection = application.db[doc.__collection__]
        doc.generate_index(collection)
I keep my database as a member of the app (application.db) for various admin tasks. Now whenever I add a few indexes or change anything, I run my manager command:
./manage.py setup_indexes
You can read more about the manager module here:
http://flask-script.readthedocs.org/en/latest/

Related

Can I connect an application to 2 databases in Django?

I have a web application in Python Django. I need to import users and display data about them from another database that belongs to another, existing application. All I need is for the user to be able to log in and see information about themselves. What solutions are there?
You can set 2 DATABASES in settings.py.
DATABASES = {
    'default': {
        ...
    },
    'user_data': {
        ...
    }
}
Then store the User models, with authentication and so on, in one database and the rest of the information in the other. You can link information to a specific User with a field that stores the id of the User from the other database.
If you have multiple databases and create a model, you should declare which database it is going to be stored in. If you don't, it will be stored in the default one (if you have one declared).
class UserModel(models.Model):
    class Meta:
        db_table = 'default'
class UserDataModel(models.Model):
    class Meta:
        db_table = 'user_data'
The answer from @NixonSparrow is wrong.
_meta.db_table defines only the table name within the database, not the database itself.
To switch databases you can use manager.using('database_name') for each model; this is explained well here: https://docs.djangoproject.com/en/4.0/topics/db/multi-db/#topics-db-multi-db-routing
In my project I use multiple routers (described on the same page), which saves me from having to override every manager with using(). But in your case:
DATABASES = {
    'default': {
        ...
    },
    'other_users_data': {
        ...
    }
}
and somewhere in your views:
other_users = otherUserModel.objects.using('other_users_data')
Probably, otherUserModel should define in its Meta which table to use (db_table = 'other_users_table_name'), and it should probably also set managed = False to hide the model from the migration manager.
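The router approach mentioned in this answer could look roughly like the sketch below. It follows the method names of Django's database router interface (`db_for_read`, `db_for_write`, `allow_migrate`), but the class name and the app label `other_users` are assumptions made up for illustration; only the alias `other_users_data` comes from the settings above.

```python
class OtherUsersRouter:
    """Send every model of one app to the 'other_users_data' database."""

    # Hypothetical app label: use the label of the app holding otherUserModel.
    app_label = 'other_users'
    db_alias = 'other_users_data'

    def db_for_read(self, model, **hints):
        if model._meta.app_label == self.app_label:
            return self.db_alias
        return None  # no opinion: other routers or 'default' decide

    def db_for_write(self, model, **hints):
        if model._meta.app_label == self.app_label:
            return self.db_alias
        return None

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Never migrate the external app, and never migrate anything
        # into the external database.
        if app_label == self.app_label or db == self.db_alias:
            return False
        return None
```

The router would then be listed in DATABASE_ROUTERS in settings.py by its dotted path, after which queries on models of that app no longer need an explicit using('other_users_data').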

serialize children in marshmallow-sqlalchemy

Marshmallow normally serializes nested children (assuming nested schema are defined). For example:
{
    'id': 2,
    'messages' : [
        {
            'id': 1,
            'message': 'foo'
        },
        {
            'id': 2,
            'message': 'bar'
        }
    ]
}
However, marshmallow-sqlalchemy causes children to simply be represented by their primary key. For example:
{
    'id': 2,
    'messages' : [
        1,
        2
    ]
}
How can I get marshmallow-sqlalchemy to serialize the child objects? Preferably, I should be able to specify a depth. For example: serialize to 4 layers deep, then use the uid behavior.
Ideally, this should be configurable via schema.load() because it should be dynamic based on where the serialization began. In other words, coupling this to the schema itself doesn't make sense. However, if that's the only way to do it, I'm curious to hear a solution like that as well.
You can use a Nested field to do this.
Assuming you have a schema that looks like this:
from marshmallow import Schema, fields
class MessageSchema(Schema):
    msg_id = fields.Integer()
    message = fields.String()
    something = fields.String()
class ListSchema(Schema):
    list_id = fields.Integer()
    # many=True because 'messages' holds a list of child objects
    messages = fields.Nested(MessageSchema, only=['msg_id', 'message'], many=True)
To include only certain fields, use the only parameter when calling Nested (see example above for usage, where only msg_id and message fields are included in the serialized output).
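The depth cut-off asked about in the question can be illustrated independently of marshmallow. The sketch below is plain Python over nested dicts, not marshmallow or marshmallow-sqlalchemy API; the function name `dump_limited` and the `id_key` parameter are made up for illustration. It serializes children fully down to a given depth and falls back to the child's id below that.

```python
def dump_limited(node, depth, id_key='id'):
    """Serialize nested dicts fully for `depth` levels, then fall back to ids."""
    if depth <= 0:
        return node[id_key]
    result = {}
    for key, value in node.items():
        if isinstance(value, dict):
            result[key] = dump_limited(value, depth - 1, id_key)
        elif isinstance(value, list):
            result[key] = [
                dump_limited(item, depth - 1, id_key) if isinstance(item, dict) else item
                for item in value
            ]
        else:
            result[key] = value
    return result


data = {
    'id': 2,
    'messages': [
        {'id': 1, 'message': 'foo'},
        {'id': 2, 'message': 'bar'},
    ],
}

shallow = dump_limited(data, 1)  # {'id': 2, 'messages': [1, 2]}
deep = dump_limited(data, 2)     # children serialized in full
```

Because the depth is an argument of the call rather than part of a schema, the cut-off depends on where serialization starts, which is the dynamic behavior the question asks for.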
The Marshmallow docs have a more detailed + complete example:

django with multiple databases and foreignkeys for User

Suppose I have a django app on my server, but I wish to do authentication using django.contrib.auth.models where the User and Group models/data are on another server in another database. In Django, my DATABASES setting would be something like this:
DATABASES = {
    'default': {},
    'auth_db': {
        'NAME' : 'my_auth_db',
        'ENGINE' : 'django.db.backends.mysql',
        'USER' : 'someuser',
        'PASSWORD' : 'somepassword',
        'HOST' : 'some.host.com',
        'PORT' : '3306',
    },
    'myapp': {
        'NAME': 'myapp_db',
        'ENGINE': 'django.db.backends.mysql',
        'USER': 'localuser',
        'PASSWORD': 'localpass',
    }
}
DATABASE_ROUTERS = ['pathto.dbrouters.AuthRouter', 'pathto.dbrouters.MyAppRouter']
First question: will this work, i.e. will it allow me to log in to my Django app using users that are stored in the remote DB 'my_auth_db'?
Assuming the answer to the above is yes, what happens if in my local DB (app 'myapp') I have models that have a ForeignKey to User? In other words, my model SomeModel is defined in myapp and should exist in myapp_db, but it has a ForeignKey to a User in my_auth_db:
class SomeModel(models.Model):
    user = models.ForeignKey(User, unique=False, null=False)
    description = models.CharField(max_length=255, null=True)
    dummy = models.CharField(max_length=32, null=True)
    # etc.
Second question: Is this possible or is it simply not possible for one DB table to have a ForeignKey to a table in another DB?
If I really wanted to make this work, could I replace the ForeignKey field 'user' with an IntegerField 'user_id' and then if I needed somemodel.user I would instead get somemodel.user_id and use models.User.objects.get(pk=somemodel.user_id), where the router knows to query auth_db for the User? Is this a viable approach?
The answer to question 1 is: Yes.
What you will need in any case is a database router (The example in the Django docs is exactly about the auth app, so there's no need to copy this code here).
The answer to question 2 is: Maybe. Not officially. It depends on how you have set up MySQL:
https://docs.djangoproject.com/en/dev/topics/db/multi-db/#limitations-of-multiple-databases
Django doesn’t currently provide any support for foreign key or many-to-many relationships spanning multiple databases.
This is because of referential integrity.
However, if you’re using SQLite or MySQL with MyISAM tables, there is no enforced referential integrity; as a result, you may be able to ‘fake’ cross database foreign keys. However, this configuration is not officially supported by Django.
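The manual lookup proposed in the question (an integer user_id instead of a ForeignKey) could be sketched as follows. This is only an illustration under assumptions: the helper name `get_remote_user` is made up, and `user_manager` stands for a Django manager such as User.objects, of which only the `using()` and `get()` calls are relied on.

```python
def get_remote_user(user_manager, user_id, alias='auth_db'):
    """Fetch the user with the given primary key from the auth database.

    `user_manager` is expected to behave like a Django manager
    (for example User.objects); `alias` is the database alias
    from the DATABASES setting in the question.
    """
    return user_manager.using(alias).get(pk=user_id)
```

With this approach nothing at the database level guarantees that the stored user_id still points at an existing user, so callers would need to be prepared for the lookup to fail.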
I have a setup with several legacy MySQL DBs (readonly). This answer shows How to use django models with foreign keys in different DBs?
I later ran into troubles with Django ManyToMany through with multiple databases and the solution (as stated in the accepted answer there) is to set the table name with quotes:
class Meta:
    db_table = '`%s`.`table2`' % db2_name
Related questions that might provide some additional information:
How to work around lack of support for foreign keys across databases in Django
How to use django models with foreign keys in different DBs?
It would be nice if somebody would take all this information and put it into the official Django docs :-)

Pymodm - Mongodb, How to create an index in a collection

I'm trying to create a server-side Flask session extension whose sessions expire after some amount of time. I found the MongoDB shell command below in the documentation.
db.log_events.createIndex( { "createdAt": 1 }, { expireAfterSeconds: 3600 } )
But how can I do it using pymodm?
Take a look at the model definition: http://pymodm.readthedocs.io/en/stable/api/index.html?highlight=indexes#defining-models. There is a Meta attribute called "indexes", which is responsible for creating indexes. Here is an example:
import pymodm
import pymongo
class SomeModel(pymodm.MongoModel):
    ...
    class Meta:
        indexes = [pymongo.IndexModel([('field_name', <direction>)])]
From the docs:
indexes: This is a list of IndexModel instances that describe the
indexes that should be created for this model. Indexes are created
when the class definition is evaluated.
IndexModel is explained in this page.
Then add the following Meta class to your MongoModel class:
class Meta:
    indexes = [
        pymongo.IndexModel([('createdAt', pymongo.ASCENDING)], expireAfterSeconds=3600)
    ]

Refresh mongodb collection structure through python mongoengine

I'm writing a simple Flask app, with the sole purpose to learn Python and MongoDB.
I've managed to reach to the point where all the collections are defined, and CRUD operations work in general. Now, one thing that I really want to understand, is how to refresh the collection, after updating its structure. For example, say that I have the following model:
user.py
class User(db.Document, UserMixin):
    email = db.StringField(required=True, unique=True)
    password = db.StringField(required=True)
    active = db.BooleanField()
    first_name = db.StringField(max_length=64, required=True)
    last_name = db.StringField(max_length=64, required=True)
    registered_at = db.DateTimeField(default=datetime.datetime.utcnow())
    confirmed = db.BooleanField()
    confirmed_at = db.DateTimeField()
    last_login_at = db.DateTimeField()
    current_login_at = db.DateTimeField()
    last_login_ip = db.StringField(max_length=45)
    current_login_ip = db.StringField(max_length=45)
    login_count = db.IntField()
    companies = db.ListField(db.ReferenceField('Company'), default=[])
    roles = db.ListField(db.ReferenceField(Role), default=[])
    meta = {
        'indexes': [
            {'fields': ['email'], 'unique': True}
        ]
    }
Now, I already have entries in my user collection, but I want to change companies to:
company = db.ReferenceField('Company')
How can I refresh the collection's structure, without having to bring the whole database down?
I do have a manage.py script that helps me and also provides a shell:
#!/usr/bin/python
from flask.ext.script import Manager
from flask.ext.script.commands import Shell
from app import factory
app = factory.create_app()
manager = Manager(app)
manager.add_command("shell", Shell(use_ipython=True))
# manager.add_command('run_tests', RunTests())
if __name__ == "__main__":
    manager.run()
and I have tried a couple of commands, based on the information I could gather and my basic knowledge:
>>> from app.models import db, User
>>> import mongoengine
>>> mongoengine.Document(User)
field = iter(self._fields_ordered)
AttributeError: 'Document' object has no attribute '_fields_ordered'
>>> mongoengine.Document(User).modify() # well, same result as above
Any pointers on how to achieve this?
Update
I am asking all of this because I have updated my user.py to match my new requirements, but any time I interact with the db itself, since the table's structure was not refreshed, I get the following error:
FieldDoesNotExist: The field 'companies' does not exist on the
document 'User', referer: http://local.faqcolab.com/company
The solution is easier than I expected:
db.getCollection('user').update(
    // query
    {},
    // update
    {
        $rename: {
            'companies': 'company'
        }
    },
    // options
    {
        "multi" : true,   // update all documents
        "upsert" : false  // do not insert a new document if none matches the query
    }
);
Explanation for each of the {}:
The first is empty because I want to update all documents in the user collection.
The second contains $rename, the operator that renames the fields I want.
The last contains additional settings for the query to be executed.
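The same rename can also be issued from Python. The sketch below assumes `collection` is a pymongo collection object (how it is obtained depends on your setup) and uses pymongo's `update_many`, which covers the "multi" option of the shell command; the helper name `rename_field` is made up for illustration.

```python
def rename_field(collection, old_name, new_name):
    """Rename a field in every document of the collection.

    Returns the number of documents that were modified.
    """
    result = collection.update_many(
        {},                                 # empty filter: match all documents
        {'$rename': {old_name: new_name}},  # same operator as in the shell command
        upsert=False,                       # never insert new documents
    )
    return result.modified_count
```

For the case above this would be called as rename_field(collection, 'companies', 'company'). Note that $rename only changes the field name: values that were lists stay lists, so turning them into a single reference would need a further step.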
I have updated my user.py to match my new requirements, but any time I interact with the db itself, since the table's structure was not refreshed, I get the following error
MongoDB does not have a "table structure" like relational databases do. After a document has been inserted, you can't change its schema by changing the document model.
I don't want to sound like I'm telling you that the answer is to use different tools, but seeing things like db.ListField(db.ReferenceField('Company')) makes me think you'd be much better off with a relational database (Postgres is well supported in the Flask ecosystem).
Mongo works best for storing schema-less documents (you don't know beforehand how your data is structured, or it varies significantly between documents). Unless you have data like that, it's worth looking at other options. Especially since you're just getting started with Python and Flask, there's no point in making things harder than they are.
