I am studying the graphene library (https://github.com/graphql-python/graphene) and I am trying to understand how to resolve / return nested JSON through graphene and perform the query correctly.
The code below follows the example linked in the repository (the link is at the end of the question).
import graphene
from graphene.types.resolver import dict_resolver

class User(graphene.ObjectType):
    id = graphene.ID()

    class Meta:
        default_resolver = dict_resolver

class Patron(graphene.ObjectType):
    id = graphene.ID()
    name = graphene.String()
    age = graphene.Int()
    user = User

    class Meta:
        default_resolver = dict_resolver

class Query(graphene.ObjectType):
    patron = graphene.Field(Patron)

    @staticmethod
    def resolve_patron(root, info):
        return Patron(**{"id": 1, "name": "Syrus", "age": 27, "user": {"id": 2}})
schema = graphene.Schema(query=Query)
query = """
query something{
patron {
id
}
}
"""
if __name__ == "__main__":
result = schema.execute(query)
print(result.data)
The idea is basically to be able to use a multi-level JSON object to "resolve" with GraphQL. This example is very simple; in the actual use case I have in mind, the JSON will have several levels.
I think that if you use setattr at the lowest level of the JSON and work your way up it works, but I would like to know if someone has already implemented or found a more practical way of doing it.
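For what it's worth, here is a minimal sketch of one practical approach, assuming graphene's dict_resolver behaves as documented: return the raw nested dict from the resolver and mount the nested type with graphene.Field(User), so the dict_resolver can walk each level of the dict for you. This is an illustration of the idea, not the only way to do it:

import graphene
from graphene.types.resolver import dict_resolver

class User(graphene.ObjectType):
    id = graphene.ID()

    class Meta:
        default_resolver = dict_resolver

class Patron(graphene.ObjectType):
    id = graphene.ID()
    name = graphene.String()
    age = graphene.Int()
    user = graphene.Field(User)  # mount the nested type as a Field

    class Meta:
        default_resolver = dict_resolver

class Query(graphene.ObjectType):
    patron = graphene.Field(Patron)

    @staticmethod
    def resolve_patron(root, info):
        # return the plain nested dict; dict_resolver reads one key per level
        return {"id": 1, "name": "Syrus", "age": 27, "user": {"id": 2}}

schema = graphene.Schema(query=Query)
result = schema.execute("{ patron { id name age user { id } } }")
print(result.data)  # e.g. {'patron': {'id': '1', 'name': 'Syrus', 'age': 27, 'user': {'id': '2'}}}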
original example:
https://github.com/graphql-python/graphene/blob/master/examples/simple_example.py
Related
I want to extract the line 'Unique protein chains: 1' from this entry, using a GraphQL query.
I know this is the query I want to use:
{
  entry(entry_id: "5O6C") {
    rcsb_entry_info {
      polymer_entity_count_protein
    }
  }
}
and I can see the output if I use the GraphQL interface here:
{
  "data": {
    "entry": {
      "rcsb_entry_info": {
        "polymer_entity_count_protein": 1
      }
    }
  }
}
This has the information I want: "polymer_entity_count_protein": 1.
I want to run this query through Python so it can be fed into other pipelines (and also process multiple IDs).
I found graphene to be one library that will do GraphQL queries, and this is the hello world example, which I can get to work on my machine:
import graphene

class Query(graphene.ObjectType):
    hello = graphene.String(name=graphene.String(default_value="world"))

    def resolve_hello(self, info, name):
        return name

schema = graphene.Schema(query=Query)
result = schema.execute('{ hello }')
print(result.data['hello'])  # "world"
I don't understand how to combine the two. Can someone show me how to edit my Python code with the query of interest, so that what's printed at the end is:
'5O6C 1'
I have seen some other examples/queries about graphene/GraphQL, e.g. here, but I can't understand how to make my specific example work.
Based on the answer below, I ran:
import graphene

class Query(graphene.ObjectType):
    # ResponseType needs to be the type of your response
    # the following line defines the return value of your query (ResponseType)
    # and the inputType (graphene.String())
    entry = graphene.String(entry_id=graphene.String(default_value=''))

    def resolve_entry(self, info, **kwargs):
        id = kwargs.get('entry_id')
        # as you already have a working query you should enter the logic here

schema = graphene.Schema(query=Query)

# not totally sure if the query needs to look like this, it also depends heavily on your response type
query = '{ entry(entry_id="506C"){rcsb_entry_info}'

result = schema.execute(query)
print("506C" + str(result.data.entry.rcsb_entry_info.polymer_entity_count_protein))
However, I get:
Traceback (most recent call last):
  File "graphene_query_for_rcsb.py", line 18, in <module>
    print("506C" + str(result.data.entry.rcsb_entry_info.polymer_entity_count_protein))
AttributeError: 'NoneType' object has no attribute 'entry'
Did you write the logic of the already-working query you have in your question? Is it not using Python/graphene?
I'm not sure if I understood the question correctly, but here's a general idea:
import graphene

class Query(graphene.ObjectType):
    # ResponseType needs to be the type of your response
    # the following line defines the return value of your query (ResponseType)
    # and the input type (graphene.String())
    entry = graphene.Field(ResponseType, entry_id=graphene.String())

    def resolve_entry(self, info, **kwargs):
        id = kwargs.get('entry_id')
        # as you already have a working query you should enter the logic here

schema = graphene.Schema(query=Query)

# not totally sure if the query needs to look like this, it also depends heavily on your response type
query = '{ entry(entry_id: "5O6C") { rcsb_entry_info } }'

result = schema.execute(query)
print("5O6C " + str(result.data.entry.rcsb_entry_info.polymer_entity_count_protein))
Here is an example of a response type:
If you have the query
# here TeamType is my ResponseType
get_team = graphene.Field(TeamType, id=graphene.Int())

def resolve_get_team(self, info, **kwargs):
    id = kwargs.get('id')
    if id is not None:
        return Team.objects.get(pk=id)
    else:
        raise Exception()
then the response type is defined as:
class TeamType(DjangoObjectType):
    class Meta:
        model = Team
but you can also define a response type that is not based on a model:
class DeleteResponse(graphene.ObjectType):
    numberOfDeletedObject = graphene.Int(required=True)
    numberOfDeletedTeams = graphene.Int(required=False)
And your response type should look something like this:
class Polymer(graphene.ObjectType):
    polymer_entity_count_protein = graphene.Int()

class myResponse(graphene.ObjectType):
    rcsb_entry_info = graphene.Field(Polymer)
Again, this is not tested or anything, and I don't really know what your response really is.
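To make that a bit more concrete, here is a rough, hedged sketch that wires these response types to a Query and prints the value from the question; fetch_entry is a hypothetical placeholder for whatever logic already produces the nested data, so treat this as untested scaffolding rather than a definitive implementation:

import graphene

class Polymer(graphene.ObjectType):
    polymer_entity_count_protein = graphene.Int()

class myResponse(graphene.ObjectType):
    rcsb_entry_info = graphene.Field(Polymer)

class Query(graphene.ObjectType):
    entry = graphene.Field(myResponse, entry_id=graphene.String())

    def resolve_entry(self, info, entry_id):
        # fetch_entry is hypothetical: it should return a nested dict such as
        # {"rcsb_entry_info": {"polymer_entity_count_protein": 1}}
        data = fetch_entry(entry_id)
        return myResponse(
            rcsb_entry_info=Polymer(
                polymer_entity_count_protein=data["rcsb_entry_info"]["polymer_entity_count_protein"]
            )
        )

schema = graphene.Schema(query=Query)
result = schema.execute('{ entry(entry_id: "5O6C") { rcsb_entry_info { polymer_entity_count_protein } } }')
print("5O6C " + str(result.data["entry"]["rcsb_entry_info"]["polymer_entity_count_protein"]))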
I have a specific use case, but my question pertains to the best way of doing this in general.
I have three tables
Order - primary key order_id
OrderLine - Linking table with order_id, product_id and quantity. An order has 1 or more order lines
Product - primary key product_id, each order line has one product
In SQLAlchemy / Python, how do I generate nested JSON along the lines of:
{
  "orders": [
    {
      "order_id": 1,
      "some_order_level_detail": "Kansas",
      "order_lines": [
        {
          "product_id": 1,
          "product_name": "Clawhammer",
          "quantity": 5
        },
        ...
      ]
    },
    ...
  ]
}
Potential Ideas
Hack away doing successive queries
The first idea, which I want to get away from if possible, is using list comprehensions and a brute-force approach.
def get_json():
    answer = {
        "orders": [
            {
                "order_id": o.order_id,
                "some_order_level_detail": o.some_order_level_detail,
                "order_lines": [
                    {
                        "product_id": o_line.product_id,
                        "product_name": Product.query.get(o_line.product_id).product_name,
                        "quantity": o_line.quantity
                    }
                    for o_line in OrderLine.query.filter_by(order_id=o.order_id).all()
                ]
            }
            for o in Order.query.all()
        ]
    }
    return answer
This gets hard to maintain, mixing the queries with the JSON. Ideally I'd like to do a query first...
Get joined results first, somehow manipulate later
The second idea is to do a join query across the three tables, so that each OrderLine row also carries the order and product details.
My question to the Pythonistas out there: is there a nice way to convert such a joined result to nested JSON?
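As a sketch of that second idea (assuming flask-sqlalchemy's db.session and the column names implied by the JSON above), one way to nest a joined result is itertools.groupby over rows ordered by order:

from itertools import groupby

rows = (
    db.session.query(Order, OrderLine, Product)
    .join(OrderLine, OrderLine.order_id == Order.order_id)
    .join(Product, Product.product_id == OrderLine.product_id)
    .order_by(Order.order_id)
    .all()
)

orders = []
for order, group in groupby(rows, key=lambda row: row[0]):
    orders.append({
        "order_id": order.order_id,
        "some_order_level_detail": order.some_order_level_detail,
        "order_lines": [
            {
                "product_id": line.product_id,
                "product_name": product.product_name,
                "quantity": line.quantity,
            }
            for _, line, product in group
        ],
    })

result = {"orders": orders}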
Another way?
This really seems like such a common requirement that I wonder whether there is a textbook method for this sort of thing.
Is there an SQLAlchemy version of this?
Look into marshmallow-sqlalchemy, as it does exactly what you're looking for.
I strongly advise against baking your serialization directly into your model, because you will eventually have two services requesting the same data but serialized in different ways (including fewer or more nested relationships for performance, for instance). You will then end up with either (1) a lot of bugs that your test suite will miss unless you're checking literally every field, or (2) more data serialized than you need, and you'll run into performance issues as the complexity of your application scales.
With marshmallow-sqlalchemy, you'll need to define a schema for each model you'd like to serialize. Yes, it's a bit of extra boilerplate, but believe me - you will be much happier in the end.
We build applications using flask-sqlalchemy and marshmallow-sqlalchemy like this (I also highly recommend factory_boy so that you can mock your service and write unit tests in place of integration tests that need to touch the database):
# models
class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child", back_populates="parent")

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))
    parent = relationship('Parent', back_populates='children',
                          foreign_keys=[parent_id])
# schemas. Don't put these in your models. Avoid tight coupling here
from marshmallow_sqlalchemy import ModelSchema
import marshmallow as ma

class ParentSchema(ModelSchema):
    children = ma.fields.Nested(
        'myapp.schemas.child.Child', exclude=('parent',), many=True)

    class Meta(ModelSchema.Meta):
        model = Parent
        strict = True
        dump_only = ('id',)

class ChildSchema(ModelSchema):
    parent = ma.fields.Nested(
        'myapp.schemas.parent.Parent', exclude=('children',))

    class Meta(ModelSchema.Meta):
        model = Child
        strict = True
        dump_only = ('id',)
# services
class ParentService:
    '''
    This service is intended for use exclusively by /api/parent
    '''
    def __init__(self, params, _session=None):
        # your unit tests can pass in _session=MagicMock()
        self.session = _session or db.session
        self.params = params

    def _parents(self) -> typing.List[Parent]:
        return self.session.query(Parent).options(
            joinedload(Parent.children)
        ).all()

    def get(self):
        schema = ParentSchema(
            many=True,  # we serialize a list of parents
            only=(
                # highly recommend specifying every field explicitly
                # rather than implicit
                'id',
                'children.id',
            ))
        return schema.dump(self._parents()).data
# views
@app.route('/api/parent')
def get_parents():
    service = ParentService(params=request.get_json())
    return jsonify(data=service.get())
# test factories
class ModelFactory(SQLAlchemyModelFactory):
    class Meta:
        abstract = True
        sqlalchemy_session = db.session

class ParentFactory(ModelFactory):
    id = factory.Sequence(lambda n: n + 1)
    children = factory.SubFactory('tests.factory.children.ChildFactory')

class ChildFactory(ModelFactory):
    id = factory.Sequence(lambda n: n + 1)
    parent = factory.SubFactory('tests.factory.parent.ParentFactory')
# tests
from unittest.mock import MagicMock, patch

def test_can_serialize_parents():
    parents = ParentFactory.build_batch(4)
    session = MagicMock()
    service = ParentService(params={}, _session=session)
    assert service.session is session

    with patch.object(service, '_parents') as _parents:
        _parents.return_value = parents
        assert service.get()[0]['id'] == parents[0].id
        assert service.get()[1]['id'] == parents[1].id
        assert service.get()[2]['id'] == parents[2].id
        assert service.get()[3]['id'] == parents[3].id
I would add a .json() method to each model, so that they call each other. It's essentially your "hacked" solution but a bit more readable/maintainable. Your Order model could have:
def json(self):
    return {
        "id": self.id,
        "order_lines": [line.json() for line in self.order_lines]
    }
Your OrderLine model could have:
def json(self):
    return {
        "product_id": self.product_id,
        "product_name": self.product.name,
        "quantity": self.quantity
    }
Your resource at the top level (where you're making the request for orders) could then do:
...
orders = Order.query.all()
return {"orders": [order.json() for order in orders]}
...
This is how I normally structure this JSON requirement.
Check my answer in this thread: Flask SQLAlchemy - Marshmallow Nested Schema fails for joins with filter (where) conditions. Using the Marshmallow package, you include something like this in your schema:
name = fields.Nested(Schema, many=True)
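For the Order / OrderLine case above, a hedged sketch of what that nesting could look like with plain Marshmallow (the schema and field names here are assumptions based on the question, not tested code):

from marshmallow import Schema, fields

class OrderLineSchema(Schema):
    product_id = fields.Int()
    # dotted attribute reaches through the assumed OrderLine -> Product relationship
    product_name = fields.Str(attribute="product.product_name")
    quantity = fields.Int()

class OrderSchema(Schema):
    order_id = fields.Int()
    some_order_level_detail = fields.Str()
    order_lines = fields.Nested(OrderLineSchema, many=True)

# orders = Order.query.all()
# nested = {"orders": OrderSchema(many=True).dump(orders)}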
I would like to be able to check if a related object has already been fetched by using either select_related or prefetch_related, so that I can serialize the data accordingly. Here is an example:
class Address(models.Model):
    street = models.CharField(max_length=100)
    zip = models.CharField(max_length=10)

class Person(models.Model):
    name = models.CharField(max_length=20)
    address = models.ForeignKey(Address)

def serialize_address(address):
    return {
        "id": address.id,
        "street": address.street,
        "zip": address.zip
    }

def serialize_person(person):
    result = {
        "id": person.id,
        "name": person.name
    }
    if is_fetched(person.address):
        result["address"] = serialize_address(person.address)
    else:
        result["address"] = None
    return result
######
person_a = Person.objects.select_related("address").get(id=1)
person_b = Person.objects.get(id=2)

serialize_person(person_a)  # should be an object with id, name and address
serialize_person(person_b)  # should be an object with only id and name
In this example, the function is_fetched is what I am looking for. I would like to determine whether the person object already has a resolved address, and only if it has should the address be serialized as well. If it doesn't, no further database query should be executed.
So is there a way to achieve this in Django?
Since Django 2.0 you can easily check all fetched relations via:
obj._state.fields_cache
ModelStateFieldsCacheDescriptor is responsible for storing your cached relations.
>>> Person.objects.first()._state.fields_cache
{}
>>> Person.objects.select_related('address').first()._state.fields_cache
{'address': <Address: Your Address>}
If the address relation has been fetched, then the Person object will have a populated attribute called _address_cache; you can check this.
def is_fetched(obj, relation_name):
    cache_name = '_{}_cache'.format(relation_name)
    return getattr(obj, cache_name, False)
Note you'd need to call this with the object and the name of the relation:
is_fetched(person, 'address')
since doing person.address would trigger the fetch immediately.
Edit: reverse or many-to-many relations can only be fetched by prefetch_related; that populates a single attribute, _prefetched_objects_cache, which is a dict of lists where the key is the name of the related model. E.g. if you do:
addresses = Address.objects.prefetch_related('person_set')
then each item in addresses will have a _prefetched_objects_cache dict containing a 'person' key.
Note, both of these are single-underscore attributes which means they are part of the private API; you're free to use them, but Django is also free to change them in future releases.
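Putting that together, a hedged helper for the prefetch case described above might look like this (again relying on a private attribute, so treat it as a sketch):

def is_prefetched(obj, related_name):
    # _prefetched_objects_cache only exists after prefetch_related has run
    return related_name in getattr(obj, '_prefetched_objects_cache', {})

addresses = Address.objects.prefetch_related('person_set')
is_prefetched(addresses[0], 'person')  # key as described in the explanation above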
Per this comment on the ticket linked in the comment by @jaap3 above, the recommended way to do this for Django 3+ (perhaps 2+?) is to use the undocumented is_cached method on the model's field, which comes from this internal mixin:
>>> person1 = Person.objects.first()
>>> Person.address.is_cached(person1)
False
>>> person2 = Person.objects.select_related('address').last()
>>> Person.address.is_cached(person2)
True
I have a fairly basic CRUDMixin
class CRUDMixin(object):
    """ create, read, update and delete methods for SQLAlchemy """
    id = db.Column(db.Integer, primary_key=True)

    @property
    def columns(self):
        return [c.name for c in self.__table__.columns]

    def read(self):
        """ return json of this current model """
        return dict([(c, getattr(self, c)) for c in self.columns])

    # ...
For something like an Article class which will subclass this, it might have a relationship with another class, like so:
author_id = db.Column(db.Integer, db.ForeignKey('users.id'))
The only real problem is that it will not return any user details in the JSON. Ideally, the JSON should look like this:
{
  'id': 1234,
  'title': 'this is an article',
  'body': 'Many words go here. Many shall be unread. Roman Proverb.',
  'author': {
    'id': 14,
    'name': 'Thor',
    'joined': 'October 1st, 1994'
  }
}
As it is right now, it will just give author_id: 14.
Can I detect if a column is a relationship and load it as json as well in this way?
You have to set up the entire relationship by adding something like
author = db.relationship("Author") # I assume that you have an Author model
Then, to serialize your result to JSON, you have different ways to handle relations.
Take a look at these two answers:
jsonify a SQLAlchemy result set in Flask
How to serialize SqlAlchemy result to JSON?
You can also take a look at flask-restful, which provides a decorator (marshal_with) to marshal your results nicely with nested objects (relations).
http://flask-restful.readthedocs.org/en/latest/fields.html#advanced-nested-field
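A hedged sketch of that flask-restful approach for the Article/author case (the author field names are assumptions taken from the desired JSON above, not your actual models):

from flask_restful import fields, marshal

author_fields = {
    'id': fields.Integer,
    'name': fields.String,
    'joined': fields.String,
}

article_fields = {
    'id': fields.Integer,
    'title': fields.String,
    'body': fields.String,
    'author': fields.Nested(author_fields),
}

article = Article.query.get(1234)
print(marshal(article, article_fields))  # nested dict, including the author block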
I am using the following code to serialize my App Engine datastore models to JSON:
class DictModel(db.Model):
    def to_dict(self):
        return dict([(p, unicode(getattr(self, p))) for p in self.properties()])

class Venue(db.Model):
    id = db.StringProperty(required=True)
    fourSqid = db.StringProperty(required=False)
    name = db.StringProperty(required=True)
    twitter_ID = db.StringProperty(required=True)

class commonWordTweets(DictModel):
    commonWords = db.StringListProperty(required=True)
    venue = db.ReferenceProperty(Venue, required=True, collection_name='commonWords')
This returns the following JSON response
[
  {
    "commonWords": "[u'storehouse', u'guinness', u'badge', u'2011"', u'"new', u'mayor', u'dublin)']",
    "venue": "<__main__.Venue object at 0x1028ad190>"
  }
]
How can I get the actual venue name to appear?
Firstly, although it's not exactly your question, it's strongly recommended to use simplejson to produce JSON, rather than trying to turn structures into JSON strings yourself.
To answer your question, the ReferenceProperty just acts as a reference to your Venue object. So you just use its attributes as per normal.
Try something like:
cwt = commonWordTweets()  # Replace with code to get the item from your datastore
d = {"commonWords": cwt.commonWords, "venue": cwt.venue.name}
jsonout = simplejson.dumps(d)
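If you would rather keep the generic to_dict() approach, a hedged variant that expands any ReferenceProperty into its own dict could look like this (an untested sketch against the old db API):

class DictModel(db.Model):
    def to_dict(self):
        out = {}
        for p in self.properties():
            value = getattr(self, p)
            if isinstance(value, db.Model):
                # ReferenceProperty: serialize the referenced entity's own properties
                out[p] = dict((q, unicode(getattr(value, q))) for q in value.properties())
            else:
                out[p] = unicode(value)
        return out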