I have two models, Storage and Drawer:
class Storage(BaseModel):
    id = PrimaryKeyField()
    name = CharField()
    description = CharField(null=True)
class Drawer(BaseModel):
    id = PrimaryKeyField()
    name = CharField()
    storage = ForeignKeyField(Storage, related_name="drawers")
At the moment I'm producing JSON from a select query:
storages = Storage.select()
As a result I get a JSON array that looks like this:
[{
    description: null,
    id: 1,
    name: "Storage"
},
{
    description: null,
    id: 2,
    name: "Storage 2"
}]
I know that peewee lets me query all drawers of a storage with storage.drawers, but I'm struggling to include in every storage a JSON array containing all of that storage's drawers. I tried to use a join:
storages = (Storage
            .select(Storage, Drawer)
            .join(Drawer)
            .where(Drawer.storage == Storage.id)
            .group_by(Storage.id))
But this returns only the second storage, the one that has drawers, and the array of drawers is not included. Is this even possible with joins? Or do I need to iterate over every storage, retrieve its drawers, and append them to the storage?
This is the classic N+1 query problem for ORMs: one query for the list, plus one more per row, so O(n) queries in total. The documentation goes into some detail on various ways to approach the problem.
For this case, you will probably want prefetch(). Instead of O(n) queries, it will execute O(k) queries, one for each table involved (so 2 in your case).
storages = Storage.select().order_by(Storage.name)
drawers = Drawer.select().order_by(Drawer.name)
query = prefetch(storages, drawers)
To serialize this, we'll iterate through the Storage objects returned by prefetch. The associated drawers will have been pre-populated using the Drawer.storage foreign key's related_name + '_prefetch' (drawers_prefetch):
accum = []
for storage in query:
    data = {'name': storage.name, 'description': storage.description}
    data['drawers'] = [{'name': drawer.name}
                       for drawer in storage.drawers_prefetch]
    accum.append(data)
To make this even easier you can use the playhouse.shortcuts.model_to_dict helper:
from playhouse.shortcuts import model_to_dict
accum = []
for storage in query:
    accum.append(model_to_dict(storage, backrefs=True, recurse=True))
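To make the "one query per table" idea concrete, here is a minimal sketch using only the standard-library sqlite3 module. It is not peewee's implementation; the table layout, the sample rows, and the helper name storages_with_drawers are assumptions made for illustration. It runs exactly two queries and groups the drawers in Python by their foreign key.

```python
import sqlite3
from collections import defaultdict

# In-memory database with an assumed schema mirroring the two models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE storage (id INTEGER PRIMARY KEY, name TEXT, description TEXT);
    CREATE TABLE drawer (id INTEGER PRIMARY KEY, name TEXT,
                         storage_id INTEGER REFERENCES storage(id));
    INSERT INTO storage (id, name) VALUES (1, 'Storage'), (2, 'Storage 2');
    INSERT INTO drawer (name, storage_id) VALUES ('Left', 2), ('Right', 2);
""")


def storages_with_drawers(conn):
    # Query 1: every storage.
    storages = conn.execute(
        "SELECT id, name, description FROM storage ORDER BY name").fetchall()
    # Query 2: every drawer, grouped in Python by its foreign key.
    by_storage = defaultdict(list)
    for name, storage_id in conn.execute(
            "SELECT name, storage_id FROM drawer ORDER BY name"):
        by_storage[storage_id].append({"name": name})
    # Attach each group to its storage; storages without drawers get [].
    return [
        {"id": sid, "name": name, "description": desc,
         "drawers": by_storage.get(sid, [])}
        for sid, name, desc in storages
    ]


result = storages_with_drawers(conn)
```

Storages without drawers still appear in the result with an empty list, which is the behaviour the inner join in the question could not provide.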
I have an object stored in mongo that has a list of reference fields. In a restplus app I need to parse this list of objects and map them into a JSON doc to return for a client.
# Classes I have saved in Mongo
class ThingWithList(Document):
    list_of_objects = ListField(ReferenceField(InfoHolder))
class InfoHolder(Document):
    thing_id = StringField()
    thing_i_care_about = ReferenceField(Info)
class Info(Document):
    name = StringField()
    foo = StringField()
    bar = StringField()
I am finding iterating through the list to be very slow. I guess because I am having to do another database query every time I dereference children of objects in the list.
Simple (but rubbish) method:
info_to_return = []
thing = ThingWithList.get_from_id('thingsId')
# Each access to o.thing_i_care_about may trigger its own dereference query
for o in thing.list_of_objects:
    info = {
        'id': o.id,
        'name': o.thing_i_care_about.name,
        'foo': o.thing_i_care_about.foo,
        'bar': o.thing_i_care_about.bar
    }
    info_to_return.append(info)
return info_to_return
I thought I would be able to solve this by using select_related which sounds like it should do the dereferencing for me N levels deep so that I only do one big mongo call rather than several per iteration. When I add
thing.select_related(3)
it seems to have no effect. Have I just misunderstood what this function is for? How else could I speed up my query?
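One general way to avoid a query per list item is to collect all referenced ids first and resolve them in a single batched lookup (with MongoDB, a single $in query). The sketch below shows only the pattern, using plain dicts in place of a database; fetch_infos_by_id, build_info, and the sample data are hypothetical stand-ins, not part of any library.

```python
def fetch_infos_by_id(ids, info_collection, counter):
    """Hypothetical stand-in for ONE batched database lookup."""
    counter["queries"] += 1
    return {i: info_collection[i] for i in ids if i in info_collection}


def build_info(holders, info_collection, counter):
    # Collect every referenced Info id first ...
    wanted = {h["thing_i_care_about"] for h in holders}
    # ... then resolve them all in a single round trip.
    infos = fetch_infos_by_id(wanted, info_collection, counter)
    out = []
    for h in holders:
        info = infos.get(h["thing_i_care_about"])
        if info is None:
            continue  # dangling reference: skip it
        out.append({"id": h["id"], "name": info["name"],
                    "foo": info["foo"], "bar": info["bar"]})
    return out


# Assumed sample data: holders store the id of the Info they reference.
info_collection = {
    "i1": {"name": "first", "foo": "f1", "bar": "b1"},
    "i2": {"name": "second", "foo": "f2", "bar": "b2"},
}
holders = [
    {"id": "h1", "thing_i_care_about": "i1"},
    {"id": "h2", "thing_i_care_about": "i2"},
    {"id": "h3", "thing_i_care_about": "i1"},
]
counter = {"queries": 0}
info_to_return = build_info(holders, info_collection, counter)
```

However many holders are in the list, the counter ends at one lookup, whereas the loop in the question may issue one per item.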
A simple query looks like this
User.query.filter(User.name == 'admin')
In my code, I need to check the parameters that are being passed and then filter the results from the database based on the parameter.
For example, if the User table contains columns like username, location and email, the request parameter can contain either one of them or can have combination of columns. Instead of checking each parameter as shown below and chaining the filter, I'd like to create one dynamic query string which can be passed to one filter and can get the results back. I'd like to create a separate function which will evaluate all parameters and will generate a query string. Once the query string is generated, I can pass that query string object and get the desired result. I want to avoid using RAW SQL query as it defeats the purpose of using ORM.
if location:
    User.query.filter(User.name == 'admin', User.location == location)
elif email:
    User.query.filter(User.email == email)
You can apply filter to the query repeatedly:
query = User.query
if location:
    query = query.filter(User.location == location)
if email:
    query = query.filter(User.email == email)
If you only need exact matches, there’s also filter_by:
criteria = {}
# If you already have a dict, there are easier ways to get a subset of its keys
if location: criteria['location'] = location
if email: criteria['email'] = email
query = User.query.filter_by(**criteria)
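As the comment above hints, when the request parameters already arrive as a dict, the criteria can be built in one step. This is a small sketch in plain Python; the helper name build_criteria and the ALLOWED_FILTERS whitelist are assumptions for illustration, with the column names taken from the question.

```python
# Assumed whitelist of filterable columns (names from the question).
ALLOWED_FILTERS = ("username", "location", "email")


def build_criteria(params, allowed=ALLOWED_FILTERS):
    """Keep only whitelisted keys whose value is present and non-empty."""
    return {key: params[key] for key in allowed if params.get(key)}


# Empty values and unknown keys (such as 'page') are dropped.
criteria = build_criteria({"location": "Berlin", "email": "", "page": "2"})
```

The resulting dict can be passed on as User.query.filter_by(**criteria). Whitelisting the keys keeps arbitrary request parameters from being turned into column lookups.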
If you don’t like those for some reason, the best I can offer is this:
from sqlalchemy.sql.expression import and_
def get_query(table, lookups, form_data):
    # One equality condition per lookup field that has a non-empty value
    conditions = [
        getattr(table, field_name) == form_data[field_name]
        for field_name in lookups if form_data.get(field_name)
    ]
    return table.query.filter(and_(*conditions))
get_query(User, ['location', 'email', ...], form_data)
This answer is late, but if anyone is still looking, sqlalchemy-json-querybuilder can be useful. It can be installed with:
pip install sqlalchemy-json-querybuilder
For example:
filter_by = [{
    "field_name": "SomeModel.field1",
    "field_value": "somevalue",
    "operator": "contains"
}]
order_by = ['-SomeModel.field2']
results = Search(session, "pkg.models", (SomeModel,), filter_by=filter_by, order_by=order_by, page=1, per_page=5).results
https://github.com/kolypto/py-mongosql/
MongoSQL is a query builder that uses JSON as the input.
Capable of:
Choosing which columns to load
Loading relationships
Filtering using complex conditions
Ordering
Pagination
Example:
{
    project: ['id', 'name'],  // Only fetch these columns
    sort: ['age+'],  // Sort by age, ascending
    filter: {
        // Filter condition
        sex: 'female',  // Girls
        age: { $gte: 18 },  // Age >= 18
    },
    join: ['user_profile'],  // Load the 'user_profile' relationship
    limit: 100,  // Display 100 per page
    skip: 10,  // Skip first 10 rows
}
I would like to be able to check if a related object has already been fetched by using either select_related or prefetch_related, so that I can serialize the data accordingly. Here is an example:
class Address(models.Model):
    street = models.CharField(max_length=100)
    zip = models.CharField(max_length=10)
class Person(models.Model):
    name = models.CharField(max_length=20)
    address = models.ForeignKey(Address)
def serialize_address(address):
    return {
        "id": address.id,
        "street": address.street,
        "zip": address.zip
    }
def serialize_person(person):
    result = {
        "id": person.id,
        "name": person.name
    }
    if is_fetched(person.address):
        result["address"] = serialize_address(person.address)
    else:
        result["address"] = None
    return result
######
person_a = Person.objects.select_related("address").get(id=1)
person_b = Person.objects.get(id=2)
serialize_person(person_a)  # should be object with id, name and address
serialize_person(person_b)  # should be object with only id and name
In this example, the function is_fetched is what I am looking for. I would like to determine whether the person object already has a resolved address, and serialize the address only if it does. If it doesn't, no further database query should be executed.
So is there a way to achieve this in Django?
Since Django 2.0 you can check which relations have already been fetched via:
obj._state.fields_cache
ModelStateFieldsCacheDescriptor is responsible for storing your cached relations.
>>> Person.objects.first()._state.fields_cache
{}
>>> Person.objects.select_related('address').first()._state.fields_cache
{'address': <Address: Your Address>}
If the address relation has been fetched, then the Person object will have a populated attribute called _address_cache; you can check this.
def is_fetched(obj, relation_name):
    cache_name = '_{}_cache'.format(relation_name)
    return getattr(obj, cache_name, False)
Note you'd need to call this with the object and the name of the relation:
is_fetched(person, 'address')
since doing person.address would trigger the fetch immediately.
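To see why the check has to go through the cache attribute rather than the relation itself, here is a toy model in plain Python. It is not Django's implementation; LazyRelation, load_address, and the sample values are hypothetical names chosen for this sketch. The descriptor loads the related object on first access and stores it under _<name>_cache, so merely reading person.address counts as a fetch.

```python
class LazyRelation:
    """Toy descriptor: loads the related object on first access and caches it."""

    def __init__(self, name, loader):
        self.cache_name = "_{}_cache".format(name)
        self.loader = loader

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if not hasattr(obj, self.cache_name):
            # Stands in for the database query a real ORM would issue here.
            setattr(obj, self.cache_name, self.loader(obj))
        return getattr(obj, self.cache_name)


def is_fetched(obj, relation_name):
    # Inspect the cache attribute only; never touch obj.<relation_name>.
    return hasattr(obj, "_{}_cache".format(relation_name))


loads = []  # records every simulated query


def load_address(person):
    loads.append(person.name)
    return {"street": "Main St", "zip": "12345"}


class Person:
    address = LazyRelation("address", load_address)

    def __init__(self, name):
        self.name = name


person = Person("Ann")
before = is_fetched(person, "address")  # no load is triggered by the check
_ = person.address                      # this access triggers the simulated query
after = is_fetched(person, "address")
```

Checking before the access reports that nothing is cached and records no load; only the attribute access itself adds an entry to loads.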
Edit: reverse or many-to-many relations can only be fetched by prefetch_related; that populates a single attribute, _prefetched_objects_cache, which is a dict of lists where the key is the name of the related model. E.g. if you do:
addresses = Address.objects.prefetch_related('person_set')
then each item in addresses will have a _prefetched_objects_cache dict containing a "person" key.
Note, both of these are single-underscore attributes which means they are part of the private API; you're free to use them, but Django is also free to change them in future releases.
Per this comment on the ticket linked in the comment by @jaap3 above, the recommended way to do this for Django 3+ (perhaps 2+?) is to use the undocumented is_cached method on the model's field, which comes from this internal mixin:
>>> person1 = Person.objects.first()
>>> Person.address.is_cached(person1)
False
>>> person2 = Person.objects.select_related('address').last()
>>> Person.address.is_cached(person2)
True
I'm new to MongoDB and am trying to design a simple schema for a set of python objects. I'm having a tough time working with the concept of polymorphism.
Below is some pseudo-code. How would you represent this inheritance hierarchy in a MongoDB schema?
class A:
    content = 'video' or 'image' or 'music'
    data = contentData  # where contentData may be videoData, imageData or musicData depending on content
class videoData:
    length = *
    director = *
    actors = *
class imageData:
    dimensions = *
class musicData:
    genre = *
The problem I'm facing is that the schema of A.data depends on A.content. How can A be represented in a mongodb schema?
Your documents could look like this:
{
    _type: "video",
    data: {
        length: 120,
        director: "Smith",
        actors: ["Jones", "Lee"]
    }
}
So, basically, "data" points to an embedded document with the document's type-specified fields.
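On the Python side, a document shaped like this can be mapped back to the right class by dispatching on _type. The following is a minimal sketch using only the standard library; the dataclass names, the field types, and the load_data helper are assumptions made for illustration (the question leaves the field types open).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VideoData:
    length: int
    director: str
    actors: List[str]


@dataclass
class ImageData:
    dimensions: str  # assumed type; the question does not specify it


@dataclass
class MusicData:
    genre: str


# Maps the discriminator stored in the document to the matching class.
DATA_CLASSES = {"video": VideoData, "image": ImageData, "music": MusicData}


def load_data(document):
    """Build the typed payload for a document shaped like {_type, data}."""
    cls = DATA_CLASSES[document["_type"]]
    return cls(**document["data"])


doc = {"_type": "video",
       "data": {"length": 120, "director": "Smith", "actors": ["Jones", "Lee"]}}
video = load_data(doc)
```

An unknown _type raises KeyError here, which surfaces schema drift early instead of silently producing a half-filled object.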
This doesn't particularly answer your question, but you might check out Ming. It does polymorphism for you when it maps the document to the object.
http://merciless.sourceforge.net/tour.html
I'm a newcomer to SQLAlchemy ORM and I'm struggling to accomplish complex-ish queries on multiple tables - queries which I find relatively straightforward to do in Doctrine DQL.
I have data objects of Cities, which belong to Countries. Some Cities also have a County ID set, but not all. As well as the necessary primary and foreign keys, each record also has a text_string_id, which links to a TextStrings table which stores the name of the City/County/Country in different languages. The TextStrings MySQL table looks like this:
CREATE TABLE IF NOT EXISTS `text_strings` (
    `id` INT UNSIGNED NOT NULL,
    `language` VARCHAR(2) NOT NULL,
    `text_string` VARCHAR(255) NOT NULL,
    PRIMARY KEY (`id`, `language`)
)
I want to construct a breadcrumb for each city, of the form:
country_en_name > city_en_name OR
country_en_name > county_en_name > city_en_name,
depending on whether or not a County attribute is set for this city. In Doctrine this would be relatively straightforward:
$query = Doctrine_Query::create()
    ->select('ci.id, CONCAT(cyts.text_string, \'> \', IF(cots.text_string is not null, CONCAT(cots.text_string, \'> \'), \'\'), cits.text_string) as city_breadcrumb')
    ->from('City ci')
    ->leftJoin('ci.TextString cits')
    ->leftJoin('ci.Country cy')
    ->leftJoin('cy.TextString cyts')
    ->leftJoin('ci.County co')
    ->leftJoin('co.TextString cots')
    ->where('cits.language = ?', 'en')
    ->andWhere('cyts.language = ?', 'en')
    ->andWhere('(cots.language = ? OR cots.language is null)', 'en');
With SQLAlchemy ORM, I'm struggling to achieve the same thing. I believe I've setup the objects correctly - in the form eg:
class City(Base):
    __tablename__ = "cities"
    id = Column(Integer, primary_key=True)
    country_id = Column(Integer, ForeignKey('countries.id'))
    text_string_id = Column(Integer, ForeignKey('text_strings.id'))
    county_id = Column(Integer, ForeignKey('counties.id'))
    text_strings = relation(TextString, backref=backref('cards', order_by=id))
    country = relation(Country, backref=backref('countries', order_by=id))
    county = relation(County, backref=backref('counties', order_by=id))
My problem is in the querying - I've tried various approaches to generating the breadcrumb but nothing seems to work. Some observations:
Perhaps using things like CONCAT and IF inline in the query is not very pythonic (is it even possible with the ORM?) - so I've tried performing these operations outside SQLAlchemy, in a Python loop of the records. However here I've struggled to access the individual fields - for example the model accessors don't seem to go n-levels deep, e.g. City.counties.text_strings.language doesn't exist.
I've also experimented with using tuples - the closest I've got to it working was by splitting it out into two queries:
# For cities without a county
for city, country in session.query(City, Country).\
        filter(Country.id == City.country_id).\
        filter(City.county_id == None).all():
    if city.text_strings.language == 'en':
        # etc
# For cities with a county
for city, county, country in session.query(City, County, Country).\
        filter(and_(City.county_id == County.id, City.country_id == Country.id)).all():
    if city.text_strings.language == 'en':
        # etc
I split it out into two queries because I couldn't figure out how to make the Suit join optional in just the one query. But this approach is of course terrible, and worse, the second query didn't work 100%: it wasn't joining all of the different city.text_strings for subsequent filtering.
So I'm stumped! Any help you can give me setting me on the right path for performing these sorts of complex-ish queries in SQLAlchemy ORM would be much appreciated.
The mapping for Suit is not present but based on the propel query I would assume it has a text_strings attribute.
The relevant portion of SQLAlchemy documentation describing aliases with joins is at:
http://www.sqlalchemy.org/docs/orm/tutorial.html#using-aliases
generation of functions is at:
http://www.sqlalchemy.org/docs/core/tutorial.html#functions
cyts = aliased(TextString)
cits = aliased(TextString)
cots = aliased(TextString)
cy = aliased(Suit)
co = aliased(Suit)
session.query(
    City.id,
    (
        cyts.text_string +
        '> ' +
        # county segment only when a county is set, then always the city
        func.if_(cots.text_string != None, cots.text_string + '> ', '') +
        cits.text_string
    ).label('city_breadcrumb')
).\
    outerjoin((cits, City.text_strings)).\
    outerjoin((cy, City.country)).\
    outerjoin((cyts, cy.text_strings)).\
    outerjoin((co, City.county)).\
    outerjoin((cots, co.text_strings)).\
    filter(cits.language == 'en').\
    filter(cyts.language == 'en').\
    filter(or_(cots.language == 'en', cots.language == None))
though I would think it's a heck of a lot simpler to just say:
city.text_strings.text_string + " > " + city.country.text_strings.text_string + " > " + city.county.text_strings.text_string
If you put a descriptor on City, Suit:
class City(object):
    # ...
    @property
    def text_string(self):
        return self.text_strings.text_string
then you could say city.text_string.
Just for the record, here is the code I ended up using. Mike (zzzeek)'s answer stays as the correct and definitive answer because this is just an adaptation of his, which was the breakthrough for me.
cits = aliased(TextString)
cyts = aliased(TextString)
cots = aliased(TextString)
for (city_id, country_text, county_text, city_text) in \
        session.query(City.id, cyts.text_string, cots.text_string, cits.text_string).\
        outerjoin((cits, and_(cits.id==City.text_string_id, cits.language=='en'))).\
        outerjoin((County, City.county)).\
        outerjoin((cots, and_(cots.id==County.text_string_id, cots.language=='en'))).\
        outerjoin((Country, City.country)).\
        outerjoin((cyts, and_(cyts.id==Country.text_string_id, cyts.language=='en'))):
    # Python to construct the breadcrumb, checking county_text for None-ness
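The breadcrumb step left as a comment above could look something like the following. This is only a sketch: the function name build_breadcrumb, the separator default, and the sample place names are assumptions, not the code the poster actually used. It drops any segment that came back as None from the outer joins.

```python
def build_breadcrumb(country_text, county_text, city_text, sep=" > "):
    """Join the name segments, skipping any that are None (e.g. no county)."""
    parts = [country_text, county_text, city_text]
    return sep.join(part for part in parts if part is not None)


with_county = build_breadcrumb("United Kingdom", "Kent", "Dover")
without_county = build_breadcrumb("France", None, "Paris")
```

Doing this in Python rather than in SQL keeps the query free of database-specific IF/CONCAT functions.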