Trying to use Embedded Documents Fields in MongoDB - python

I'm following freecodecamp's video on MongoDB using mongoengine (as db). I'm trying to use the embedded document list field to add information to my main document. Also using a Streamlit webapp as my input source
My class's are:
class Contest(db.Document):
date_created = db.DateTimeField(default=datetime.today)
name = db.StringField(required=True)
format = db.EmbeddedDocumentField(Format)
class Format(db.EmbeddedDocument):
contest_id = db.ObjectIdField()
name = db.StringField()
Then I've tried a few different ways to to add the format to a specific contest instance.
Try #1
def set_format(active_contest):
format : Format = None
name = st.text_input('Name of Format:')
submit = st.button('Set Format Name')
if submit == True:
format.contest_id = active_contest.id
format.name = name
active_contest.save()
setting Format to None is the way the freecodecamp video shows... but i get this error: AttributeError: 'NoneType' object has no attribute 'contest_id'.
So I tried switching it to: format = Format()... this way it doesn't give me an error, but also doesn't update the Contest document to include the format information.
I also tried switching active_contest.save() to format.save() but then i get a: AttributeError: 'Format' object has no attribute 'save'
I've also tried the update function instead of save... but i get similar errors every-which way.
New to mongoDB and programming in general. Thanks in advance!

First of all, if you want to store Format as embedded document, the contest_id is not necessary in Format class. With this approach you will end with something like this in your MongoDB collection:
{
"date_created":ISODate(...),
"name": "...",
"format": {
"name": "..."
}
}
Another approach could be something like:
class Contest(db.Document):
date_created = db.DateTimeField(default=datetime.today)
name = db.StringField(required=True)
format = db.ReferenceField('Format') # <- Replaced by ReferenceField
class Format(db.Document): # <- EmbeddedDocument replaced by Document
name = db.StringField()
In that case each instance of "Format" will be stored in a separate collection. So you will end with something like this in MongoDB:
Collection Contest:
{
"date_created":ISODate(...),
"name": "...",
"format": :ObjectId("...") // <-- here's the relation field
}
Collection Format:
{
"_id":"...",
"name":"..",
}
Both approaches shares the same code:
def set_format(active_contest): # <-- here's the instance of 'Contest'
format : Format = Format() # <-- create a new Format instance
name = st.text_input('Name of Format:')
submit = st.button('Set Format Name')
if submit == True:
format.name = name
active_contest.format = format # <-- assigns the format to contest
active_contest.save() <- stores both because you are saving the 'parent' object

Related

how to serialize a nested json inside a graphene resolve?

I am studying the library graphene, (https://github.com/graphql-python/graphene) and I was trying to understand how I can serialize / return a nested json into the graphene and perform the query in the correct way.
The code that I will insert below follows the example of the link available in the repository (it is at the end of the question).
import graphene
from graphene.types.resolver import dict_resolver
class User(graphene.ObjectType):
id = graphene.ID()
class Meta:
default_resolver = dict_resolver
class Patron(graphene.ObjectType):
id = graphene.ID()
name = graphene.String()
age = graphene.Int()
user = User
class Meta:
default_resolver = dict_resolver
class Query(graphene.ObjectType):
patron = graphene.Field(Patron)
#staticmethod
def resolve_patron(root, info):
return Patron(**{"id":1, "name": "Syrus", "age": 27, "user": {"id": 2}})
schema = graphene.Schema(query=Query)
query = """
query something{
patron {
id
}
}
"""
if __name__ == "__main__":
result = schema.execute(query)
print(result.data)
The idea is basically to be able to use a multi-level json to "resolve" with graphql. This example is very simple, in the actual use case I plan, there will be several levels in json.
I think that if you use the setattr at the lowest level of json and go up, it works, but I would like to know if someone has already implemented or found a more practical way of doing it.
original example:
https://github.com/graphql-python/graphene/blob/master/examples/simple_example.py

graphQL and graphene: How to extract information from a graphQL query using graphene?

I want to extract the line 'Unique protein chains: 1' from this entry, using a graphQL query.
I know this is the query I want to use:
{
entry(entry_id: "5O6C") {
rcsb_entry_info {
polymer_entity_count_protein
}
}
}
and I can see the output if I use the graphQL interface here:
{
"data": {
"entry": {
"rcsb_entry_info": {
"polymer_entity_count_protein": 1
}
}
}
}
Has the information I want : "polymer_entity_count_protein": 1
I want to run this query through python so it can be fed into other pipelines (and also process multiple IDs).
I found graphene to be one library that will do graphQL queries, and this is the hello world example, which I can get to work on my machine:
import graphene
class Query(graphene.ObjectType):
hello = graphene.String(name=graphene.String(default_value="world"))
def resolve_hello(self, info, name):
return name
schema = graphene.Schema(query=Query)
result = schema.execute('{ hello }')
print(result.data['hello']) # "Hello World"
I don't understand how to combine the two. Can someone show me how I edit my python code with the query of interest, so what's printed at the end is:
'506C 1'
I have seen some other examples/queries about graphene/graphQL: e.g. here; except I can't understand how to make my specific example work.
Based on answer below, I ran:
import graphene
class Query(graphene.ObjectType):
# ResponseType needs to be the type of your response
# the following line defines the return value of your query (ResponseType)
# and the inputType (graphene.String())
entry = graphene.String(entry_id=graphene.String(default_value=''))
def resolve_entry(self, info, **kwargs):
id = kwargs.get('entry_id')
# as you already have a working query you should enter the logic here
schema = graphene.Schema(query=Query)
# not totally sure if the query needs to look like this, it also depends heavily on your response type
query = '{ entry(entry_id="506C"){rcsb_entry_info}'
result = schema.execute(query)
print("506C" + str(result.data.entry.rcsb_entry_info.polymer_entity_count_protein))
However, I get:
Traceback (most recent call last):
File "graphene_query_for_rcsb.py", line 18, in <module>
print("506C" + str(result.data.entry.rcsb_entry_info.polymer_entity_count_protein))
AttributeError: 'NoneType' object has no attribute 'entry'
Did you write the logic of the already working query you have in your question? Is it not using python/ graphene?
I'm not sure if I understood the question correctly but here's a general idea:
import graphene
class Query(graphene.ObjectType):
# ResponseType needs to be the type of your response
# the following line defines the return value of your query (ResponseType)
# and the inputType (graphene.String())
entry = graphene.Field(ResponseType, entry_id=graphene.String()
def resolve_entry(self, info, **kwargs):
id = kwargs.get('entry_id')
# as you already have a working query you should enter the logic here
schema = graphene.Schema(query=Query)
# not totally sure if the query needs to look like this, it also depends heavily on your response type
query = '{ entry(entry_id="506C"){rcsb_entry_info}}'
result = schema.execute(query)
print("506C" + str(result.data.entry.rcsb_entry_info.polymer_entity_count_protein)
Here an example for a response type:
if you have the query
# here TeamType is my ResponseType
get_team = graphene.Field(TeamType, id=graphene.Int())
def resolve_get_team(self, info, **kwargs):
id = kwargs.get('id')
if id is not None:
return Team.objects.get(pk=id)
else:
raise Exception();
the responseType is defined as:
class TeamType(DjangoObjectType):
class Meta:
model = Team
but you can also define a response type that is not based on a model:
class DeleteResponse(graphene.ObjectType):
numberOfDeletedObject = graphene.Int(required=True)
numberOfDeletedTeams = graphene.Int(required=False)
And your response type should look something like this:
class myResponse(graphene.ObjectType):
rcsb_entry_info = graphne.Field(Polymer)
class Polymer(graphene.ObjectType):
polymer_entity_count_protein = graphene.Int()
again this is not testet or anything and I don't really know what your response really is.

Update Embedded Mongo Documents - Mongoengine

I've a model -
class Comment(EmbeddedDocument):
c_id = Integer()
content = StringField()
class Page(DynamicDocument):
comments = ListField(EmbeddedDocumentField(Comment))
I insert the following data
comment1 = Comment(c_id=1,content='Good work!')
comment2 = Comment(c_id=2,content='Nice article!')
page = Page(comments=[comment1, comment2])
Now I want to update the comment whose id was 1 to Great work!. How can I do it?
I read on some SO thread that it can be done in following way:-
p_obj = Page.objects.get(comments__c_id=1).update(set__comments__S__content='Great work')
However, the above update throws an error saying:-
Update failed (Cannot apply the positional operator without a corresponding query field containing an array.)
Following is the document structure:-
{
"comments": [{
"content": "Good Work",
"c_id": "1"
},
{
"content": "Nice article",
"c_id": "2"
}],
}
You need to modify your Comment model slightly. c_id field in the Comment model should be an mongoengine IntField() instead of Integer() you are using.
class Comment(EmbeddedDocument):
c_id = IntField() # use IntField here
content = StringField()
class Page(DynamicDocument):
comments = ListField(EmbeddedDocumentField(Comment))
Secondly, When performing an .update() operation using set, you need to use .filter() instead of the .get() you are using as .update() will try to update documents.
Updating embedded documents:
We will try to update embedded documents in the Python shell. We will first import the models and then create 2 comment instances comment1 and comment2. Then we create a Product instance and attach these comment instances to it.
To perform an update, we will first filter Product objects having c_id as 1 in the comments embedded document. After getting the filtered results we will call update() on it using set__.
For example:
In [1]: from my_app.models import Product, Comment # import models
In [2]: comment1 = Comment(c_id=1,content='Good work!') # create 1st comment instance
In [3]: comment2 = Comment(c_id=2,content='Nice article!') # create 2nd comment instance
In [4]: page = Page(comments=[comment1, comment2]) # attach coments to `Page` instance
In [5]: page.save() # save the page object
Out[5]: <Page: Page object>
# perform update operation
In [6]: Page.objects.filter(comments__c_id=1).update(set__comments__S__content='Great work')
Out[6]: 1 # Returns the no. of entries updated
Check that value has updated:
We can check that the value has updated by going inside the mongo shell and performing .find() on the page_collection.
> db.page_collection.find()
{ "_id" : ObjectId("5605aad6e8e4351af1191d3f"), "comments" : [ { "c_id" : 1, "content" : "Great work" }, { "c_id" : 2, "content" : "Nice article!" } ] }

Mongoengine, retriving only some of a MapField

For Example.. In Mongodb..
> db.test.findOne({}, {'mapField.FREE':1})
{
"_id" : ObjectId("4fb7b248c450190a2000006a"),
"mapField" : {
"BOXFLUX" : {
"a" : "f",
}
}
}
The 'mapField' field is made of MapField of Mongoengine.
and 'mapField' field has a log of key and data.. but I just retrieved only 'BOXFLUX'..
this query is not working in MongoEngine....
for example..
BoxfluxDocument.objects( ~~ querying ~~ ).only('mapField.BOXFLUX')
AS you can see..
only('mapField.BOXFLUX') or only only('mapField__BOXFLUX') does not work.
it retrieves all 'mapField' data, including 'BOXFLUX' one..
How can I retrieve only a field of MapField???
I see there is a ticket for this: https://github.com/hmarr/mongoengine/issues/508
Works for me heres an example test case:
def test_only_with_mapfields(self):
class BlogPost(Document):
content = StringField()
author = MapField(field=StringField())
BlogPost.drop_collection()
post = BlogPost(content='Had a good coffee today...',
author={'name': "Ross", "age": "20"}).save()
obj = BlogPost.objects.only('author__name',).get()
self.assertEquals(obj.author['name'], "Ross")
self.assertEquals(obj.author.get("age", None), None)
Try this:
query = BlogPost.objects({your: query})
if name:
query = query.only('author__'+name)
else:
query = query.only('author')
I found my fault! I used only twice.
For example:
BlogPost.objects.only('author').only('author__name')
I spent a whole day finding out what is wrong with Mongoengine.
So my wrong conclusion was:
BlogPost.objects()._collection.find_one(~~ filtering query ~~, {'author.'+ name:1})
But as you know it's a just raw data not a mongoengine query.
After this code, I cannot run any mongoengine methods.
In my case, I should have to query depending on some conditions.
so it will be great that 'only' method overwrites 'only' methods written before.. In my humble opinion.
I hope this feature would be integrated with next version. Right now, I have to code duplicate code:
not this code:
query = BlogPost.objects()
query( query~~).only('author')
if name:
query = query.only('author__'+name)
This code:
query = BlogPost.objects()
query( query~~).only('author')
if name:
query = BlogPost.objects().only('author__'+name)
So I think the second one looks dirtier than first one.
of course, the first code shows you all the data
using only('author') not only('author__name')

Serializing ReferenceProperty in Appengine Datastore to JSON

I am using the following code to serialize my appengine datastore to JSON
class DictModel(db.Model):
def to_dict(self):
return dict([(p, unicode(getattr(self, p))) for p in self.properties()])
class commonWordTweets(DictModel):
commonWords = db.StringListProperty(required=True)
venue = db.ReferenceProperty(Venue, required=True, collection_name='commonWords')
class Venue(db.Model):
id = db.StringProperty(required=True)
fourSqid = db.StringProperty(required=False)
name = db.StringProperty(required=True)
twitter_ID = db.StringProperty(required=True)
This returns the following JSON response
[
{
"commonWords": "[u'storehouse', u'guinness', u'badge', u'2011"', u'"new', u'mayor', u'dublin)']",
"venue": "<__main__.Venue object at 0x1028ad190>"
}
]
How can I return the actual venue name to appear?
Firstly, although it's not exactly your question, it's strongly recommended to use simplejson to produce json, rather than trying to turn structures into json strings yourself.
To answer your question, the ReferenceProperty just acts as a reference to your Venue object. So you just use its attributes as per normal.
Try something like:
cwt = commonWordTweets() # Replace with code to get the item from your datastore
d = {"commonWords":cwt.commonWords, "venue": cwt.venue.name}
jsonout = simplejson.dumps(d)

Categories