Return MongoEngine Documents as JSON - python

I'm not sure if this is really simple or not, but I can't really find anything on the topic. Using either the regular MongoEngine library or Flask-MongoEngine for my Flask-based website, would it be possible to return a MongoEngine document as straight JSON?
Thanks!

In 0.8 there are helpers; see https://github.com/MongoEngine/mongoengine/issues/1
In the meantime you have to use pymongo's json_util directly:
from bson import json_util
json_util.dumps(MyDoc._collection_obj.find(MyDoc.objects()._query))

Ross's and Jellyflower's workarounds don't work when field projection or ordering is used.
A more general workaround:
from bson import json_util
json = json_util.dumps(query._cursor)

The correct workaround should probably be:
from bson import json_util
objects = MyDoc.objects()
json_util.dumps(objects._collection_obj.find(objects._query))

Update: thanks to Lo-Tan for the to_mongo() method usage suggestion.
Eventually I came up with the following solution:
import json
from json import JSONEncoder
from mongoengine.base import BaseDocument

class MongoEncoder(JSONEncoder):
    def default(self, o):
        if isinstance(o, BaseDocument):
            data = o.to_mongo()
            # might not be present if EmbeddedDocument
            o_id = data.pop('_id', None)
            if o_id:
                data['id'] = str(o_id)
            data.pop('_cls', None)
            return data
        else:
            return JSONEncoder.default(self, o)

# consider `obj` to be a MongoEngine document instance
json_data = json.dumps(obj, cls=MongoEncoder)
Alternatively, you can use the to_json() method, which was added in response to the aforementioned issue.
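For completeness, a minimal sketch of that helper, assuming MongoEngine 0.8 or later and the same MyDoc document class as above:

# QuerySet.to_json() and Document.to_json() are available from MongoEngine 0.8 on
all_json = MyDoc.objects.to_json()          # whole queryset as a JSON array string
one_json = MyDoc.objects.first().to_json()  # a single document as a JSON string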

Related

How to compare sql vs json in python

I have the following problem.
I have a class User; a simplified example:
class User:
    def __init__(self, name, lastname, status, id=None):
        self.id = id
        self.name = name
        self.lastname = lastname
        self.status = status

    def set_status(self, status):
        # call to the API to change status
        ...

    def get_data_from_db_by_id(self):
        # select data from db where id = self.id
        ...

    def __eq__(self, other):
        if not isinstance(other, User):
            return NotImplemented
        return (self.id, self.name, self.lastname, self.status) == \
               (other.id, other.name, other.lastname, other.status)
And I have a database structure like:
id, name, lastname, status
1, Alex, Brown, free
And json response from an API:
{
"id": 1,
"name": "Alex",
"lastname": "Brown",
"status": "Sleeping"
}
My question is:
What is the best way to compare JSON vs SQL responses?
What for? It's only for testing purposes - I have to check that the API has changed the DB correctly.
How can I deserialize the JSON and the DB result to the same class? Are there any common best practices?
For now, I'm trying to use marshmallow for the JSON and SQLAlchemy for the DB, but have had no luck with it.
Convert the database row to a dictionary:

def row2dict(row):
    d = {}
    for column in row.__table__.columns:
        d[column.name] = str(getattr(row, column.name))
    return d

Then convert the JSON string to a dictionary:

d2 = json.loads(json_response)

And finally compare:

d2 == d
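Put together, a minimal end-to-end sketch (hypothetical names: it assumes a SQLAlchemy session and a User model mapped to the table above, and that json_response holds the API payload). Since row2dict casts every column value to str, the JSON values are cast the same way before comparing:

import json

def row2dict(row):
    # convert a SQLAlchemy model instance into a plain dict of strings
    return {c.name: str(getattr(row, c.name)) for c in row.__table__.columns}

db_user = session.query(User).get(1)   # hypothetical SQLAlchemy session
d = row2dict(db_user)

d2 = {k: str(v) for k, v in json.loads(json_response).items()}

print(d == d2)  # True only if the API changed the DB as expected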
If you are using SQLAlchemy for the database, then I would recommend using SQLAthanor (full disclosure: I am the library’s author).
SQLAthanor is a serialization and de-serialization library for SQLAlchemy that lets you configure robust rules for how to serialize / de-serialize your model instances to JSON. One way of checking your instance and JSON for equivalence is to execute the following logic in your Python code:
First, serialize your DB instance to JSON. Using SQLAthanor you can do that as simply as:
instance_as_json = my_instance.dump_to_json()
This will take your instance and dump all of its attributes to a JSON string. If you want more fine-grained control over which model attributes end up on your JSON, you can also use my_instance.to_json() which respects the configuration rules applied to your model.
Once you have your serialized JSON string, you can use the Validator-Collection to convert your JSON strings to dicts, and then check if your instance dict (from your instance JSON string) is equivalent to the JSON from the API (full disclosure: I’m also the author of the Validator-Collection library):
from validator_collection import checkers, validators
api_json_as_dict = validators.dict(api_json_as_string)
instance_json_as_dict = validators.dict(instance_as_json)
are_equivalent = checkers.are_dicts_equivalent(instance_json_as_dict, api_json_as_dict)
Depending on your specific situation and objectives, you can construct even more elaborate checks and validations as well, using SQLAthanor’s rich serialization and deserialization options.
Here are some links that you might find helpful:
SQLAthanor Documentation on ReadTheDocs
SQLAthanor on Github
.dump_to_json() documentation
.to_json() documentation
Validator-Collection Documentation
validators.dict() documentation
checkers.are_dicts_equivalent() documentation
Hope this helps!

Flask: Peewee model_to_dict helper not working

I'm developing a little app for a university project and I need to JSON-encode the result of a query to pass it to a JS library. I've read elsewhere that I can use model_to_dict to accomplish that, but I'm getting this error:
AttributeError: 'SelectQuery' object has no attribute '_meta'
and I don't know why or what to do. Does anyone know how to solve this?
I'm using Python 2.7 and the latest version of peewee.
@app.route('/ormt')
def orm():
    doitch = Player.select().join(Nationality).where(Nationality.nation % 'Germany')
    return model_to_dict(doitch)
This is because doitch is a SelectQuery instance, not a model instance; you have to call get():
from flask import jsonify

@app.route('/ormt')
def orm():
    doitch = Player.select().join(Nationality).where(Nationality.nation % 'Germany')
    return jsonify(model_to_dict(doitch.get()))
You could also use the dicts() method to get the data as a dict. This skips building the model instances altogether.
from flask import jsonify

@app.route('/ormt')
def orm():
    doitch = Player.select().join(Nationality).where(Nationality.nation % 'Germany')
    return jsonify(doitch.dicts().get())
edit
As @lord63 pointed out, you cannot simply return a dict; it must be a Flask response, so wrap it in jsonify().
edit 2

@app.route('/ormt')
def orm():
    doitch = Player.select().join(Nationality).where(Nationality.nation % 'Germany')
    # another query
    sth = Something.select()
    return jsonify({
        'doitch': doitch.dicts().get(),
        'something': sth.dicts().get()
    })
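One caveat, as a sketch under the same assumptions: .get() returns only the first matching row. If the query can match several players, you would materialize the whole result set instead (assuming a Flask version whose jsonify accepts lists):

@app.route('/ormt')
def orm():
    doitch = Player.select().join(Nationality).where(Nationality.nation % 'Germany')
    # dicts() yields one plain dict per row; list() materializes them all
    return jsonify(list(doitch.dicts()))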

Elasticsearch fails in parsing datetime field coming from pymongo as object

I am trying to stream data from a mongoDB to Elasticsearch using both pymongo and the Python client elasticsearch.
I have set a mapping, I report here the snippet related to the field of interest:
"updated_at": {
"type": "date",
"format": "dateOptionalTime"
}
My script grabs each document from MongoDB using pymongo and tries indexing it into Elasticsearch as follows:
import json

from bson import json_util
from elasticsearch import Elasticsearch
from pymongo import MongoClient

mongo_client = MongoClient('localhost', 27017)
es_client = Elasticsearch(hosts=[{"host": "localhost", "port": 9200}])
db = mongo_client['my_db']
collection = db['my_collection']

for doc in collection.find():
    es_client.index(
        index='index_name',
        doc_type='my_type',
        id=str(doc['_id']),
        body=json.dumps(doc, default=json_util.default)
    )
The problem I have in running it is:
elasticsearch.exceptions.RequestError: TransportError(400, u'MapperParsingException[failed to parse [updated_at]]; nested: ElasticsearchIllegalArgumentException[unknown property [$date]]; ')
I believe the source of the problem is in the fact that pymongo serializes the field updated_at as a datetime.datetime object, as I can see if I print the doc in the for loop:
u'updated_at': datetime.datetime(2014, 8, 31, 17, 18, 13, 17000)
This conflicts with Elasticsearch looking for an object of type date as specified in the mapping.
Any ideas how to solve this?
You're on the right path: your Python datetime needs to be serialized as an ISO 8601-compliant date string. So you need to pass a custom encoder in your json.dumps() call. First, declare CustomEncoder as a subclass of JSONEncoder that handles the transformation of datetime and time values itself and delegates BSON types such as ObjectId to pymongo's json_util:

import json
from datetime import datetime, time

from bson import json_util

class CustomEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.strftime('%Y-%m-%dT%H:%M:%S%z')
        if isinstance(obj, time):
            return obj.strftime('%H:%M:%S')
        if hasattr(obj, 'to_json'):
            return obj.to_json()
        # fall back to bson's handler for BSON types such as ObjectId
        return json_util.default(obj)

And then you can use it in your json.dumps call, like this (note that passing default= alongside cls= would override the encoder's default method, so only cls is given):

...
body=json.dumps(doc, cls=CustomEncoder)
...
I guess your problem is that you're using
body=json.dumps(doc, default=json_util.default)
but you should be using
body=doc
Doing that works for me, since it seems elasticsearch takes care of serializing the dictionary into a JSON document (of course, assuming doc is a dictionary, which I guess it is).
At least in the version of elasticsearch I'm using (2.x), datetime.datetime is correctly serialized, with no need for a mapping. For example, this works for me:

from datetime import datetime, timezone

doc = {"updated_on": datetime.now(timezone.utc)}
res = es.index(index=es_index, doc_type='my_type', id=1, body=doc)

And it is recognized by Kibana as a date.
You can use:

from elasticsearch_dsl.serializer import serializer
serializer.dumps(your_dict)

Replace your_dict with your Document().prepare() or document.to_dict().
Making sure I timestamp documents for Elasticsearch with datetime.now(timezone.utc):

from datetime import datetime, timezone

doc = {
    "timestamp": datetime.now(timezone.utc),
    # the rest of your data
}

solved the problem of the time having a strange drift in Elasticsearch.

TypeError: ObjectId('') is not JSON serializable

My response back from MongoDB, after running an aggregation on a collection using Python, is valid: I can print it, but I cannot return it.
Error:
TypeError: ObjectId('51948e86c25f4b1d1c0d303c') is not JSON serializable
Print:
{'result': [{'_id': ObjectId('51948e86c25f4b1d1c0d303c'), 'api_calls_with_key': 4, 'api_calls_per_day': 0.375, 'api_calls_total': 6, 'api_calls_without_key': 2}], 'ok': 1.0}
But When i try to return:
TypeError: ObjectId('51948e86c25f4b1d1c0d303c') is not JSON serializable
It is a RESTful call:
@appv1.route('/v1/analytics')
def get_api_analytics():
    # get handle to collections in MongoDB
    statistics = sldb.statistics
    objectid = ObjectId("51948e86c25f4b1d1c0d303c")
    analytics = statistics.aggregate([
        {'$match': {'owner': objectid}},
        {'$project': {'owner': "$owner",
                      'api_calls_with_key': {'$cond': [{'$eq': ["$apikey", None]}, 0, 1]},
                      'api_calls_without_key': {'$cond': [{'$ne': ["$apikey", None]}, 0, 1]}
                      }},
        {'$group': {'_id': "$owner",
                    'api_calls_with_key': {'$sum': "$api_calls_with_key"},
                    'api_calls_without_key': {'$sum': "$api_calls_without_key"}
                    }},
        {'$project': {'api_calls_with_key': "$api_calls_with_key",
                      'api_calls_without_key': "$api_calls_without_key",
                      'api_calls_total': {'$add': ["$api_calls_with_key", "$api_calls_without_key"]},
                      'api_calls_per_day': {'$divide': [{'$add': ["$api_calls_with_key", "$api_calls_without_key"]}, {'$dayOfMonth': datetime.now()}]},
                      }}
    ])
    print(analytics)
    return analytics
The db is connected fine and the collection is there too, and I get back the valid expected result, but when I try to return it, it gives me the JSON error. Any idea how to convert the response back into JSON? Thanks.
Pymongo provides json_util - you can use that one instead to handle BSON types:

import json
from bson import json_util

def parse_json(data):
    return json.loads(json_util.dumps(data))
You should define your own JSONEncoder and use it:

import json
from bson import ObjectId

class JSONEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, ObjectId):
            return str(o)
        return json.JSONEncoder.default(self, o)

JSONEncoder().encode(analytics)
It's also possible to use it in the following way:
json.dumps(analytics, cls=JSONEncoder)
>>> from bson import Binary, Code
>>> from bson.json_util import dumps
>>> dumps([{'foo': [1, 2]},
...        {'bar': {'hello': 'world'}},
...        {'code': Code("function x() { return 1; }")},
...        {'bin': Binary(b"\x01\x02\x03\x04")}])
'[{"foo": [1, 2]}, {"bar": {"hello": "world"}}, {"code": {"$code": "function x() { return 1; }", "$scope": {}}}, {"bin": {"$binary": "AQIDBA==", "$type": "00"}}]'
Actual example from the json_util documentation.
Unlike Flask's jsonify, json_util.dumps() returns a string, so it cannot be used as a 1:1 replacement for jsonify.
But this question shows that we can serialize with json_util.dumps(), convert back to a dict with json.loads(), and finally call Flask's jsonify on the result.
Example (derived from the previous question's answer):

import json
from bson import json_util, ObjectId
from flask import jsonify

# Let's create a dummy document to prove it will work
page = {'foo': ObjectId(), 'bar': [ObjectId(), ObjectId()]}

# Dump loaded BSON to a valid JSON string and reload it as a dict
page_sanitized = json.loads(json_util.dumps(page))
return jsonify(page_sanitized)
This solution will convert ObjectId and others (i.e. Binary, Code, etc.) to their extended-JSON equivalents such as "$oid".
The JSON output would look like this:

{
    "_id": {
        "$oid": "abc123"
    }
}
Most users who receive the "not JSON serializable" error simply need to specify default=str when using json.dumps. For example:
json.dumps(my_obj, default=str)
This will force a conversion to str, preventing the error. Of course then look at the generated output to confirm that it is what you need.
from bson import json_util
import json

@app.route('/')
def index():
    for doc in collection_name.find():
        # returns the first document only
        return json.dumps(doc, indent=4, default=json_util.default)

This is a sample example of converting BSON into a JSON object. You can try it.
As a quick replacement, you can change {'owner': objectid} to {'owner': str(objectid)}.
But defining your own JSONEncoder is a better solution; it depends on your requirements.
Posting here as I think it may be useful for people using Flask with pymongo. This is my current "best practice" setup for allowing Flask to marshal pymongo BSON data types.
mongoflask.py
from datetime import datetime, date

import isodate as iso
from bson import ObjectId
from flask.json import JSONEncoder
from werkzeug.routing import BaseConverter

class MongoJSONEncoder(JSONEncoder):
    def default(self, o):
        if isinstance(o, (datetime, date)):
            return iso.datetime_isoformat(o)
        if isinstance(o, ObjectId):
            return str(o)
        else:
            return super().default(o)

class ObjectIdConverter(BaseConverter):
    def to_python(self, value):
        return ObjectId(value)

    def to_url(self, value):
        return str(value)
app.py
from flask import Flask, jsonify

from .mongoflask import MongoJSONEncoder, ObjectIdConverter

def create_app():
    app = Flask(__name__)
    app.json_encoder = MongoJSONEncoder
    app.url_map.converters['objectid'] = ObjectIdConverter

    # Client sends their string, we interpret it as an ObjectId
    @app.route('/users/<objectid:user_id>')
    def show_user(user_id):
        # setup not shown, pretend this gets us a pymongo db object
        db = get_db()
        # user_id is a bson.ObjectId ready to use with pymongo!
        result = db.users.find_one({'_id': user_id})
        # And jsonify returns normal looking json!
        # {"_id": "5b6b6959828619572d48a9da",
        #  "name": "Will",
        #  "birthday": "1990-03-17T00:00:00Z"}
        return jsonify(result)

    return app
Why do this instead of serving BSON or mongod extended JSON?
I think serving mongo special JSON puts a burden on client applications. Most client apps will not care using mongo objects in any complex way. If I serve extended json, now I have to use it server side, and the client side. ObjectId and Timestamp are easier to work with as strings and this keeps all this mongo marshalling madness quarantined to the server.
{
    "_id": "5b6b6959828619572d48a9da",
    "created_at": "2018-08-08T22:06:17Z"
}

I think this is less onerous to work with for most applications than:

{
    "_id": {"$oid": "5b6b6959828619572d48a9da"},
    "created_at": {"$date": 1533837843000}
}
For those who need to return the data through jsonify with Flask:

cursor = db.collection.find()
data = []
for doc in cursor:
    doc['_id'] = str(doc['_id'])  # This does the trick!
    data.append(doc)
return jsonify(data)
You could try:
objectid = str(ObjectId("51948e86c25f4b1d1c0d303c"))
In my case I needed something like this:

class JsonEncoder:
    def encode(self, o):
        if '_id' in o:
            o['_id'] = str(o['_id'])
        return o
This is how I recently fixed the error:

@app.route('/')
def home():
    docs = []
    for doc in db.person.find():
        doc.pop('_id')
        docs.append(doc)
    return jsonify(docs)
I know I'm posting late but thought it would help at least a few folks!
Both the examples mentioned by tim and defuz (which are top voted) work perfectly fine. However, there is a minute difference which could be significant at times.
The following method adds one extra wrapper field, which is redundant and may not be ideal in all cases:
Pymongo provides json_util - you can use that one instead to handle BSON types
Output:
{
    "_id": {
        "$oid": "abc123"
    }
}
Whereas the JSONEncoder class gives the output as a string, so we additionally need to call json.loads(output), but it leads to:
Output:
{
    "_id": "abc123"
}
Even though the first method looks simple, both methods need very minimal effort.
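A minimal sketch of that difference, reusing the JSONEncoder class from defuz's answer above on a document with a single ObjectId field:

import json
from bson import ObjectId, json_util

doc = {'_id': ObjectId('51948e86c25f4b1d1c0d303c')}

# json_util keeps the extended-JSON wrapper:
print(json_util.dumps(doc))               # {"_id": {"$oid": "51948e86c25f4b1d1c0d303c"}}

# the custom encoder flattens the id to a plain string:
print(json.dumps(doc, cls=JSONEncoder))   # {"_id": "51948e86c25f4b1d1c0d303c"}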
I would like to provide an additional solution that improves on the accepted answer. I previously provided this answer in another thread here.
from flask import Flask
from flask.json import JSONEncoder
from bson import json_util

from . import resources

# define a custom encoder pointing to the json_util provided by pymongo (or its dependency bson)
class CustomJSONEncoder(JSONEncoder):
    def default(self, obj):
        return json_util.default(obj)

application = Flask(__name__)
application.json_encoder = CustomJSONEncoder

if __name__ == "__main__":
    application.run()
If you will not be needing the _id of the records, I recommend unsetting it when querying the DB, which lets you print the returned records directly. To unset the _id when querying and then print the data in a loop, you write something like this:

records = mycollection.find(query, {'_id': 0})  # second argument {'_id': 0} unsets the id from the query
for record in records:
    print(record)
If you want to send it as a JSON response, you need to format it in two steps:
1. Use json_util.dumps() from bson to convert the ObjectId in the BSON response to a JSON-compatible format, i.e. "_id": {"$oid": "123456789"}
2. The JSON string obtained from json_util.dumps() will have backslashes and quotes; to remove these, use json.loads() from json

from bson import json_util, ObjectId
import json

# example 24-character hex ObjectIds (hypothetical values)
bson_data = [{'_id': ObjectId('5c78f28f06f75f1b9a8852cd'), 'field': 'somedata'},
             {'_id': ObjectId('5c78f28f06f75f1b9a8852ce'), 'field': 'someMoredata'}]

json_data_with_backslashes = json_util.dumps(bson_data)
# output will look like this:
# '[{"_id": {"$oid": "5c78f28f06f75f1b9a8852cd"}, "field": "somedata"}, {"_id": {"$oid": "5c78f28f06f75f1b9a8852ce"}, "field": "someMoredata"}]'

json_data = json.loads(json_data_with_backslashes)
# output will look like this:
# [{'_id': {'$oid': '5c78f28f06f75f1b9a8852cd'}, 'field': 'somedata'}, {'_id': {'$oid': '5c78f28f06f75f1b9a8852ce'}, 'field': 'someMoredata'}]
Flask's jsonify provides a security enhancement, as described in JSON Security. If a custom encoder is used with Flask, it's better to consider the points discussed in JSON Security.
If you don't want _id in the response, you can refactor your code something like this:

jsonResponse = getResponse(mock_data)
del jsonResponse['_id']  # removes '_id' from the final response
return jsonResponse

This will remove the TypeError: ObjectId('') is not JSON serializable error.
from bson.objectid import ObjectId
from core.services.db_connection import DbConnectionService

class DbExecutionService:
    def __init__(self):
        self.db = DbConnectionService()

    def list(self, collection, search):
        session = self.db.create_connection(collection)
        return list(map(lambda row: {i: str(row[i]) if isinstance(row[i], ObjectId) else row[i] for i in row}, session.find(search)))
SOLUTION for: mongoengine + marshmallow
If you use mongoengine and marshmallow, then this solution might be applicable for you.
Basically, I imported the String field from marshmallow and overrode the default Schema id to be String encoded:

from marshmallow import Schema
from marshmallow.fields import String

class FrontendUserSchema(Schema):
    id = String()

    class Meta:
        fields = ("id", "email")
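A minimal usage sketch (FrontendUser is a hypothetical MongoEngine document with id and email fields; note that dump() returns a dict directly in marshmallow 3.x, while 2.x returns a result object with a .data attribute):

user = FrontendUser.objects.first()       # hypothetical document instance
result = FrontendUserSchema().dump(user)  # marshmallow 3.x: a plain dict
print(result)                             # e.g. {'id': '5b6b6959828619572d48a9da', 'email': 'a@b.c'}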

Google App Engine Datastore Query to JSON with Python

How can I get a JSON object in Python from data queried via the Google App Engine Datastore?
I've got a model in the datastore with the following fields:
id
key_name
object
userid
created
Now I want to get all objects for one user:
query = Model.all().filter('userid', user.user_id())
How can I create a JSON object from the query so that I can write it out?
I want to get the data via an AJAX call.
Not sure if you got the answer you were looking for, but did you mean how to parse the model (entry) data in the Query object directly into a JSON object? (At least that's what I've been searching for.)
I wrote this to parse the entries from a Query object into a list of JSON-serializable objects:

def gql_json_parser(query_obj):
    result = []
    for entry in query_obj:
        result.append(dict([(p, unicode(getattr(entry, p))) for p in entry.properties()]))
    return result
You can have your app respond to AJAX requests by encoding the result with simplejson, e.g.:

query_data = MyModel.all()
json_query_data = gql_json_parser(query_data)
self.response.headers['Content-Type'] = 'application/json'
self.response.out.write(simplejson.dumps(json_query_data))
Your app will return something like this:
[{'property1': 'value1', 'property2': 'value2'}, ...]
Let me know if this helps!
If I understood you correctly, I have implemented a system that works something like this. It sounds like you want to store an arbitrary JSON object in a GAE datastore model. To do this you need to encode the JSON into a string of some sort on the way into the database, and decode it from a string into a Python data structure on the way out. You will need a JSON encoder/decoder to do this (the standard json module works, and the GAE infrastructure includes one as well). For example, you could use a "wrapper class" to handle the encoding/decoding, something along these lines:
import json

from google.appengine.ext import db

class InnerClass(db.Model):
    jsonText = db.TextProperty()

    def parse(self):
        return Wrapper(self)

class Wrapper:
    def __init__(self, storage=None):
        self.storage = storage
        self.json = None
        if storage is not None:
            self.json = json.loads(storage.jsonText)

    def put(self):
        jsonText = json.dumps(self.json)
        if self.storage is None:
            self.storage = InnerClass()
        self.storage.jsonText = jsonText
        self.storage.put()
Then always operate on parsed Wrapper objects instead of the inner class:

def getall():
    all = db.GqlQuery("SELECT * FROM InnerClass")
    for x in all:
        yield x.parse()

(Untested.) See datastoreview.py for some model implementations that work like this.
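A short usage sketch under the same assumptions, with the json attribute holding any JSON-serializable structure:

w = Wrapper()
w.json = {'userid': 42, 'tags': ['a', 'b']}
w.put()                  # encoded to a string and stored in the datastore

for wrapper in getall():
    print(wrapper.json)  # decoded back into plain Python dicts/lists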
I did the following to convert the Google query object to JSON. I used the logic in gql_json_parser above as well, except for the part where everything is converted to unicode: I want to preserve data types like integers, floats and nulls.
import json
import webapp2

class JSONEncoder(json.JSONEncoder):
    def default(self, obj):
        if hasattr(obj, 'isoformat'):  # handles both date and datetime objects
            return obj.isoformat()
        else:
            return json.JSONEncoder.default(self, obj)

class BaseResource(webapp2.RequestHandler):
    def to_json(self, gql_object):
        result = []
        for item in gql_object:
            result.append(dict([(p, getattr(item, p)) for p in item.properties()]))
        return json.dumps(result, cls=JSONEncoder)
Now you can subclass BaseResource and call self.to_json on the gql_object:
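For example, a minimal handler sketch under the same assumptions (MyModel and the route wiring are illustrative):

class MyModelResource(BaseResource):
    def get(self):
        query_data = MyModel.all()
        self.response.headers['Content-Type'] = 'application/json'
        self.response.out.write(self.to_json(query_data))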
