Python MongoDB dynamic query building with multiple conditions

I am building an open source project, a Python MongoDB ORM (especially for Flask) using flask_pymongo, and I am stuck at building dynamic conditions.
Below is the code I have written in the corresponding files.
Model.py
from app.database import Database


class Model:
    conditions = {"and": [], "or": [], "in": []}
    operators = {
        "!=": "$ne",
        "<": "$lt",
        ">": "$gt",
        "<=": "$lte",
        ">=": "$gte",
        "in": "$in",
        "not in": "$nin",
        "and": "$and",
        "or": "$or"
    }

    def __init__(self):
        # collection property from User class
        # Database class takes collection to fire MongoDB queries
        self.db = Database(self.collection)

    def where(self, field, operator, value=None):
        if value is None:
            # to enable Model.where("first_name", "John")
            value = operator
            operator = "="
        self._handle_condition("and", field, operator, value)
        # to enable Model.where().where_or() and etc
        return self

    def where_or(self, field, operator, value=None):
        if value is None:
            # to enable Model.where("first_name", "John")
            value = operator
            operator = "="
        self._handle_condition("or", field, operator, value)
        # to enable Model.where().where_or() and etc
        return self

    def _handle_condition(self, type, field, operator, value):
        self.conditions[type].append({"field": field, "operator": operator, value: value})

    def get(self):
        filetrs = {}
        for type in self.conditions:
            filetrs[self.operators[type]] = []
            for condition in self.conditions[type]:
                if condition["operator"] == "=":
                    filter = {condition["field"]: condition["value"]}
                else:
                    filter = {condition["field"]: {self.operators[condition["operator"]]: condition["value"]}}
                filetrs[self.operators[type]].append(filter)
        return self.db.find(filters)
User.py
from app.Model import Model


class UserModel(Model):
    # MongoDB collection name
    collection = "users"

    def __init__(self):
        Model.__init__(self)


User = UserModel()
What I want to achieve is to import User.py into UserController.py and use it as shown below: multiple conditions are added with the where and where_or Model methods, and get() parses all of the accumulated conditions and passes them as the filter to find().
UserController.py
from app.User import User


class UserController:
    def index(self):
        # Should return all the users where _id is not blank or their first_name is equal to John
        return User.where("_id", "!=", "").where_or("first_name", "John").get()
The problem is that this is not working as it should. It seems to work fine for any single condition, where or where_or, but when I try to add multiple where and where_or conditions together it does not work.
Your help is really appreciated.
PS: This question has a lot of code, but I had to include it so you can understand the complete scenario. Please feel free to comment if you still need any clarification.
Eagerly looking forward.
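For what it's worth, a few things stand out in the code above: _handle_condition stores the value under the variable value rather than the string key "value", get() builds a dict named filetrs but passes filters to find(), and MongoDB rejects an empty $and/$or array. A minimal sketch of how get() could assemble the filter once those are addressed (assuming the same Database.find() interface; it also resets conditions so the next chained query starts clean, since conditions is currently a class attribute shared by every query):

def get(self):
    filters = {}
    for cond_type, conditions in self.conditions.items():
        if not conditions:
            continue  # skip empty $and/$or/$in groups; MongoDB rejects empty arrays
        clauses = []
        for condition in conditions:
            if condition["operator"] == "=":
                clauses.append({condition["field"]: condition["value"]})
            else:
                clauses.append({condition["field"]: {self.operators[condition["operator"]]: condition["value"]}})
        filters[self.operators[cond_type]] = clauses
    # reset so these conditions don't leak into the next query
    self.conditions = {"and": [], "or": [], "in": []}
    return self.db.find(filters)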

Related

How can I intercept database requests (Peewee ORM) in pytest?

I have a main database and a replica (they are the same in the test environment):
core_db = PooledPostgresqlExtDatabase(**DB_COFIG)
replica_db = PooledPostgresqlExtDatabase(**DB_REPLICA_COFIG)
A controller executes the query against a different database, depending on the model:
class BaseController:
    def _get_logs(self):
        query = self.model.select()
        if is_instance(self.model, ModelToReplica):
            query = query.bind(replica_db)
        return list(query)


class ReplicaExampleController(BaseLogsController):
    model = ModelToReplica

    def process(self):
        return self._get_logs()


class BaseExampleController(BaseLogsController):
    model = BaseModel

    def process(self):
        return self._get_logs()
Controllers are linked to two urls:
/get_core_result/ # Returns the result from BaseExampleController (core_db)
/get_replica_result/ # Returns the result from ReplicaExampleController (replica_db)
I want to check that each of the endpoints accesses the right database, and I know that the reference to the database object is stored in the request object. How do I get at it from the test? I'm using pytest. I understand that I probably need to use mock, but I don't understand how.
Unfortunately, this is all I have so far:
class TestSwitchDB:
    def test_switch_db_to_replica(self):
        url_core = url_for('core_db_controller')
        core_result = self.client.get(url_core)
        url_replica = url_for('replica_db_controller')
        replica_result = self.client.get(url_replica)
In test_switch_db_to_replica you can mock the db query and patch it, something like:

with patch.object(BaseController, '_get_logs') as mock_get_logs:
    mock_get_logs.return_value = "your expected return"
As a reference: https://docs.python.org/3/library/unittest.mock-examples.html
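Expanding that hint into a rough sketch (the controller class names come from the question, and I'm assuming the test client and URL routing are already set up; this only verifies that each URL reaches the intended controller's query path, not which database was bound):

from unittest.mock import patch

class TestSwitchDB:
    def test_switch_db_to_replica(self):
        # Stub the query method per controller class so no real database is hit,
        # then check that each endpoint exercised the expected controller.
        with patch.object(ReplicaExampleController, '_get_logs', return_value=['replica']) as mock_replica, \
             patch.object(BaseExampleController, '_get_logs', return_value=['core']) as mock_core:
            self.client.get(url_for('replica_db_controller'))
            mock_replica.assert_called_once()
            self.client.get(url_for('core_db_controller'))
            mock_core.assert_called_once()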

Return missing values in google app engine query

In google app engine, say I have a Parent and a Child Entity:
class Parent(ndb.Model):
    pass


class Child(ndb.Model):
    parent_key = ndb.KeyProperty(indexed=True)
    ... other properties I don't need to fetch ...
I have a list of parents' keys, say parents_list, and I'm trying to answer efficiently: which parents in parents_list have a child?
Ideally, I would run this query:
children_list = Child.query().filter(Child.parent_key.IN(parents_list)).fetch(projection=['parent_key'])
It does not work because of the projection property (parent_key) being in the equality filter. So I would have to retrieve all properties, which seems inefficient.
Is there a way to efficiently solve this?
Your child model should actually be
class Child(ndb.Model):
    parent_key = ndb.KeyProperty(kind="Parent", indexed=True)
If you were doing this in Python 2, you could use an ndb tasklet (see the code below; note that I haven't executed this code myself so it's not guaranteed to work, it's just here to serve as a guide, but I have used tasklets in the past). In Python 3, try to create async queries instead.
class Parent(ndb.Model):
    @classmethod
    def parentsWithChildren(cls, parents_list):
        @ndb.tasklet
        def child_callback(parent_key):
            q = Child.query(Child.parent_key == parent_key)
            output = yield q.fetch_async(projection=['parent_key'])
            raise ndb.Return((parent_key, output))

        # For each key in parents_list, invoke child_callback, which yields a result
        # only for children whose parent_key matches.
        # ndb.tasklet is asynchronous, so the queries run in parallel.
        final_output = [child_callback(a) for a in parents_list]
        return final_output
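For the Python 3 route, a rough sketch of the same idea with async queries (assuming the google.cloud.ndb client with an active context and the Child model above) could be:

def parents_with_children(parents_list):
    # Fire one keys-only query per parent in parallel, then keep the parents
    # whose query found at least one child.
    futures = [
        (parent_key, Child.query(Child.parent_key == parent_key).fetch_async(limit=1, keys_only=True))
        for parent_key in parents_list
    ]
    return [parent_key for parent_key, future in futures if future.get_result()]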

Conditionally adding several filters to a SQLAlchemy query without duplicating code

I have a SQLAlchemy model:
class Ticket(db.Model):
    __tablename__ = 'ticket'
    id = db.Column(INTEGER(unsigned=True), primary_key=True, nullable=False,
                   autoincrement=True)
    cluster = db.Column(db.VARCHAR(128))

    @classmethod
    def get(cls, cluster=None):
        query = db.session.query(Ticket)
        if cluster is not None:
            query = query.filter(Ticket.cluster == cluster)
        return query.one()
If I add a new column and want to extend the get method, I have to add another "if xxx is not None" check, like this:
@classmethod
def get(cls, cluster=None, user=None):
    query = db.session.query(Ticket)
    if cluster is not None:
        query = query.filter(Ticket.cluster == cluster)
    if user is not None:
        query = query.filter(Ticket.user == user)
    return query.one()
Is there any way I could make this more efficient? If I have too many columns, the get method would become so ugly.
As always, if you don't want to write something repetitive, use a loop:
@classmethod
def get(cls, **kwargs):
    query = db.session.query(cls)
    for k, v in kwargs.items():
        query = query.filter(getattr(cls, k) == v)
    return query.one()
Because we're no longer setting cluster=None/user=None as defaults (but instead relying on anything the caller didn't specify simply never being added to kwargs), we no longer need to prevent filters for null values from being added: the only way a null value can end up in the argument list is if the caller actually asked to search for a value of None, so this new code is able to honor that request should it ever take place.
If you prefer to retain the calling convention where cluster and user can be passed positionally (but the user can't search for a value of None), see the initial version of this answer.
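For illustration, calls would then look something like this (column names taken from the question; query.one() still raises unless exactly one row matches):

ticket = Ticket.get(cluster="cluster-1")                # filter on cluster only
ticket = Ticket.get(cluster="cluster-1", user="alice")  # filter on both columns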

How to extend peewee to use logical deletes?

I'm using peewee as ORM for a project and want to extend it to handle logical deletes.
I've added "deleted" field to my base model and have extended the delete operations as follows:
@classmethod
def delete(cls, permanently=False):
    if permanently:
        return super(BaseModel, cls).delete()
    else:
        return super(BaseModel, cls).update(deleted=True, modified_at=datetime.datetime.now())

def delete_instance(self, permanently=False, recursive=False, delete_nullable=False):
    if permanently:
        return self.delete(permanently).where(self.pk_expr()).execute()
    else:
        self.deleted = True
        return self.save()
This works great. However, when I'm overriding select I get some problems.
@classmethod
def select(cls, *selection):
    print selection
    return super(BaseModel, cls).select(cls, *selection).where(cls.deleted == False)
This works in most cases, but in certain selects it breaks when the resulting query ends up using a join with the keyword "IN", giving the following error: 1241, "Operand should contain 1 column(s)".
Any suggestion on how to properly override select or work around this problem?
I always use a field on my models to indicate whether the model is deleted. I do not recommend overriding methods like delete, delete_instance and especially select. Rather create a new API and use that. Here's how I typically do it:
class StatusModel(Model):
    status = IntegerField(
        choices=(
            (1, 'Public'),
            (2, 'Private'),
            (3, 'Deleted')),
        default=1)

    @classmethod
    def public(cls):
        return cls.select().where(cls.status == 1)
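Following that pattern, the delete side of the new API might look roughly like this (these method names are my own, not part of the answer; they would sit on StatusModel):

    def soft_delete(self):
        # Mark the row as deleted instead of issuing a SQL DELETE.
        self.status = 3
        return self.save()

    @classmethod
    def deleted(cls):
        # Companion query for rows that have been soft-deleted.
        return cls.select().where(cls.status == 3)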

What is the proper way to delineate modules and classes in Python?

I am new to Python, and I'm starting to learn the basics of the code structure. I've got a basic app that I'm working on up on my Github.
For my simple app, I'm creating a basic "Evernote-like" service which allows the user to create and edit a list of notes. In the early design, I have a Note object and a Notepad object, which is effectively a list of notes. Presently, I have the following file structure:
Notes.py
|
|------ Notepad (class)
|------ Note (class)
From my current understanding and implementation, this translates into the "Notes" module having a Notepad class and Note class, so when I do an import, I'm saying "from Notes import Notepad / from Notes import Note".
Is this the right approach? I feel, out of Java habit, that I should have a folder for Notes and the two classes as individual files.
My goal here is to understand what the best practice is.
As long as the classes are rather small, put them into one file.
You can still move them later, if necessary.
Actually, it is rather common for larger projects to have a fairly deep hierarchy internally but expose a flatter one to the user. So if you move things around later but would still like to offer notes.Note even though the class Note now lives deeper, it is simple to import it into the notes package and let users get it from there. You don't have to do that, but you can. So even if you change your mind later and want to keep the API, it's no problem.
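A tiny sketch of that re-export idea (the deeper module paths are hypothetical):

# notes/__init__.py
# Re-export classes that moved deeper so callers can keep writing `from notes import Note`.
from notes.storage.note import Note        # hypothetical new location
from notes.storage.notepad import Notepad

__all__ = ["Note", "Notepad"]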
I've been working on a similar application myself. I can't say this is the best possible approach, but it has served me well. The classes are meant to interact with the database (context) when the user makes a request (an HTTP request; this is a webapp).
# -*- coding: utf-8 -*-
import json
import datetime


class Note():
    """A note. This class is part of the data model and is instantiated every
    time there is access to the database"""

    def __init__(self, noteid=0, note="", date=datetime.datetime.now(), context=None):
        self.id = noteid
        self.note = note
        self.date = date
        self.ctx = context  # context holds the db connection and some globals

    def get(self):
        """Get the current object from the database. This function needs the
        instance to have an id"""
        if self.id == 0:
            raise self.ctx.ApplicationError(404, "No note with id 0 exists")
        cursor = self.ctx.db.conn.cursor()
        cursor.execute("select note, date from %s.notes where id=%s" %
                       (self.ctx.db.DB_NAME, str(self.id)))
        data = cursor.fetchone()
        if not data:
            raise self.ctx.ApplicationError(404, ("No note with id "
                                                  + str(self.id) + " was found"))
        self.note = data[0]
        self.date = data[1]
        return self

    def insert(self, user):
        """This function inserts the object into the database. It can be an empty
        note. User must be authenticated to add notes (authentication handled
        elsewhere)"""
        cursor = self.ctx.db.conn.cursor()
        query = ("insert into %s.notes (note, owner) values ('%s', '%s')" %
                 (self.ctx.db.DB_NAME, str(self.note), str(user['id'])))
        cursor.execute(query)
        return self

    def put(self):
        """Modify the current note in the database"""
        cursor = self.ctx.db.conn.cursor()
        query = ("update %s.notes set note = '%s' where id = %s" %
                 (self.ctx.db.DB_NAME, str(self.note), str(self.id)))
        cursor.execute(query)
        return self

    def delete(self):
        """Delete the current note, by id"""
        if self.id == 0:
            raise self.ctx.ApplicationError(404, "No note with id 0 exists")
        cursor = self.ctx.db.conn.cursor()
        query = ("delete from %s.notes where id = %s" %
                 (self.ctx.db.DB_NAME, str(self.id)))
        cursor.execute(query)

    def toJson(self):
        """Returns a json string of the note object's data attributes"""
        return json.dumps(self.toDict())

    def toDict(self):
        """Returns a dict of the note object's data attributes"""
        return {
            "id": self.id,
            "note": self.note,
            "date": self.date.strftime("%Y-%m-%d %H:%M:%S")
        }


class NotesCollection():
    """This class handles the notes as a collection"""
    collection = []

    def get(self, user, context):
        """Populate the collection object and return it"""
        cursor = context.db.conn.cursor()
        cursor.execute("select id, note, date from %s.notes where owner=%s" %
                       (context.db.DB_NAME, str(user["id"])))
        note = cursor.fetchone()
        while note:
            self.collection.append(Note(note[0], note[1], note[2]))
            note = cursor.fetchone()
        return self

    def toJson(self):
        """Return a json string of the current collection"""
        return json.dumps([note.toDict() for note in self.collection])
I personally use Python as a "get it done" language and don't bother myself with details; this shows in the code above. However, one piece of advice: there are no private variables or methods in Python, so don't bother trying to create them. Make your life easier, code fast, get it done.
Usage example:
class NotesCollection(BaseHandler):
    @tornado.web.authenticated
    def get(self):
        """Retrieve all notes from the current user and return a json object"""
        allNotes = Note.NotesCollection().get(self.get_current_user(), settings["context"])
        json = allNotes.toJson()
        self.write(json)

    @protected
    @tornado.web.authenticated
    def post(self):
        """Handles all post requests to /notes"""
        requestType = self.get_argument("type", "POST")
        ctx = settings["context"]
        if requestType == "POST":
            Note.Note(note=self.get_argument("note", ""),
                      context=ctx).insert(self.get_current_user())
        elif requestType == "DELETE":
            Note.Note(noteid=self.get_argument("id"), context=ctx).delete()
        elif requestType == "PUT":
            Note.Note(noteid=self.get_argument("id"),
                      note=self.get_argument("note"),
                      context=ctx).put()
        else:
            raise ApplicationError(405, "Method not allowed")
By using decorators I'm keeping user authentication and error handling out of the main code. This makes it clearer and easier to maintain.
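As context, a decorator like @protected is not shown above; a rough sketch of what it might do, assuming ApplicationError carries a status code attribute, could be:

import functools

def protected(method):
    """Wrap a handler method and turn ApplicationError into an HTTP error response."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        try:
            return method(self, *args, **kwargs)
        except ApplicationError as error:
            # status_code is an assumed attribute of ApplicationError
            self.send_error(error.status_code)
    return wrapper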
