Serialize list of objects to JSON - python

I am trying to serialize a list of objects.
I am making an HTTP API call. The call returns a list of objects (e.g. class A). I do not have access to the definition of class A.
I tried using dumps
print ("Result is: %s", json.dumps(result_list.__dict__))
This prints an empty result. However, if I print result_list itself, I get the output below:
{
"ResultList": [{
"fieldA": 0,
"fieldB": 1.46903594E9,
"fieldC": "builder",
"fieldD": "StringA/StringB-Test-124.35.4.24"
}]
}
Is there a way I can convert the object, with whichever fields it returns, to JSON?

Please say more about the class that result_list is an instance of (e.g. post the class code).
json.dumps(result_list) probably does not work because result_list is not a plain dictionary but an instance of a class. You need to dump the variable that holds the data structure (i.e. the one that is displayed by the print call).
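When the class definition is not available, one possible approach is to hand json.dumps a default callback that falls back to each object's instance attributes. The sketch below is only an illustration: the class A and its field values are stand-ins for whatever the API actually returns, and it assumes the objects keep their fields in __dict__.

```python
import json


class A:
    """Hypothetical stand-in for the class returned by the API."""

    def __init__(self, fieldA, fieldC):
        self.fieldA = fieldA
        self.fieldC = fieldC


def to_serializable(obj):
    """Fallback for json.dumps: use the instance attributes of unknown objects."""
    try:
        return vars(obj)
    except TypeError:
        # The object has no __dict__ (e.g. it is slotted): fall back to str().
        return str(obj)


# Assumed sample data; the real list would come from the HTTP call.
result_list = [A(0, "builder"), A(1, "tester")]
serialized = json.dumps(result_list, default=to_serializable)
print("Result is: %s" % serialized)
```

json.dumps calls the default function only for values it cannot serialize itself, so nested objects are handled one level at a time.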

Related

Pymongo is it possible to insert one with insert_many?

This may sound like a dumb question but I was working with pymongo and wrote the following function to insert documents and was wondering if the insert_many method would also work for one record inserts, that way I wouldn't need another function in case I was just inserting one record.
This is my function:
def insert_records(list_of_documents: list, collection):
    i = collection.insert_many(list_of_documents)
    print(len(i.inserted_ids), " documents inserted!")
When I insert one it throws an error:
post1 = {"_id":0, "user_name":"Jack"}
insert_records(list(post1), stackoverflow)
TypeError: document must be an instance of dict, bson.son.SON, bson.raw_bson.RawBSONDocument, or a type that inherits from collections.MutableMapping
I know I can use insert_one() for this purpose, I was just wondering if it was possible to do everything with insert_many(), as the original insert() method is deprecated. Thanks!
As your post1 is a dict, list(post1) gives you a list of its keys:
>>> list(post1)
['_id', 'user_name']
Use instead:
>>> [post1]
[{'_id': 0, 'user_name': 'Jack'}]
So:
insert_records([post1], stackoverflow)
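If the goal is a single function that accepts either one document or many, a small normalizing step in front of insert_many is one option. This is a sketch, not PyMongo's own behaviour: as_document_list and FakeCollection are hypothetical names introduced here, and FakeCollection only imitates the inserted_ids attribute so the example runs without a MongoDB server.

```python
from types import SimpleNamespace


def as_document_list(documents):
    """Return a list of documents whether given one dict or an iterable of dicts."""
    if isinstance(documents, dict):
        return [documents]
    return list(documents)


def insert_records(documents, collection):
    """Insert one or many documents through insert_many."""
    result = collection.insert_many(as_document_list(documents))
    print(len(result.inserted_ids), "documents inserted!")
    return result


class FakeCollection:
    """Hypothetical stand-in for a pymongo collection (no server needed)."""

    def __init__(self):
        self.stored = []

    def insert_many(self, documents):
        self.stored.extend(documents)
        # Mimic only the attribute of the result object that is used above.
        return SimpleNamespace(inserted_ids=[d.get("_id") for d in documents])


stackoverflow = FakeCollection()
post1 = {"_id": 0, "user_name": "Jack"}
insert_records(post1, stackoverflow)  # a single dict
insert_records(
    [{"_id": 1, "user_name": "Jill"}, {"_id": 2, "user_name": "Joe"}],
    stackoverflow,
)  # a list of dicts
```

With a real collection, the same insert_records function would be called with the collection object in place of FakeCollection.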

How do you save a Python object into pymongo? (i.e. what method/return value do I need to override)

Let's say I have a class: TestClass. This class is slotted.
class TestClass(object):
    __slots__ = ('_id', 'key1', 'key2',)
So we create an object.
test = TestClass()
test.key1 = 'val1'
test.key2 = 'val2'
Great! Now what I would like to do is insert test into a MongoDB instance.
db.test_collection.insert(test)
Uh oh.
TypeError: 'TestClass' object is not iterable
Ok, let's make that iterable.
class TestClass(object):
    __slots__ = ('_id', 'key1', 'key2',)
    def __iter__(self):
        # Yield (slot, value) pairs, skipping slots that were never assigned.
        yield from dict(
            (slot, getattr(self, slot))
            for slot in self.__slots__ if hasattr(self, slot)).items()
test = TestClass()
test.key1 = 'val1'
test.key2 = 'val2'
for i in test:
    print(i)
# result
# ('key1', 'val1')
# ('key2', 'val2')
db.test_collection.insert(test)
This gives me: doc['_id'] = ObjectId() // TypeError: 'tuple' object does not support item assignment.
Further, let's say I have a composition of objects...
test = TestClass()
test.key1 = 'val1'
test.key2 = TestClass()
Would the pymongo encoder be able to encode test.key2 when saving test?
EDIT: I'm okay not saving the object directly and calling a function on the object like test.to_document(), but the goal is to have composite fields (e.g. test.key2) become a dict so that it can be saved.
As you are trying to insert an individual object rather than a series of them, the first error message you got, suggesting that PyMongo is expecting an iterable, is perhaps misleading. insert can take either a single document (dictionary) or an iterable of them.
You can convert the object to a dict with a function like this:
def slotted_to_dict(obj):
    return {s: getattr(obj, s) for s in obj.__slots__ if hasattr(obj, s)}
Then
db.test_collection.insert(slotted_to_dict(test))
should work, but is deprecated. So
db.test_collection.insert_one(slotted_to_dict(test))
is probably better.
Note that if you are frequently converting the slotted object to a dict, then you might be losing any performance benefit from using slots. Also, the "_id" attribute will not be set in the test object, and you might need an awkward solution like this to save the ID (insert_one returns a result object, and the new ID is its inserted_id attribute):
test._id = db.test_collection.insert_one(slotted_to_dict(test)).inserted_id
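For the composite case from the question (test.key2 holding another TestClass), the conversion has to recurse. The following is a minimal sketch under a few assumptions: slotted_to_document is a hypothetical helper name, every nested object declares its own __slots__ as a tuple, and inherited slots are not considered.

```python
class TestClass(object):
    __slots__ = ('_id', 'key1', 'key2')


def slotted_to_document(obj):
    """Recursively convert slotted objects (and lists of them) to plain dicts."""
    if hasattr(obj, '__slots__'):
        return {
            slot: slotted_to_document(getattr(obj, slot))
            for slot in obj.__slots__
            if hasattr(obj, slot)  # skip slots that were never assigned
        }
    if isinstance(obj, (list, tuple)):
        return [slotted_to_document(item) for item in obj]
    # Plain values (str, int, ...) are returned unchanged.
    return obj


inner = TestClass()
inner.key1 = 'nested'

test = TestClass()
test.key1 = 'val1'
test.key2 = inner

document = slotted_to_document(test)
print(document)  # {'key1': 'val1', 'key2': {'key1': 'nested'}}
```

The resulting dict contains only plain values, so it could then be passed to insert_one in place of the object itself.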
First, I would ask myself why it is important to store the object as-is in MongoDB. What is it that I want to achieve here?
If there is a valid reason for it, the way to store objects is to first serialize them and then store that serialization.
The pickle module does that for you.
import pickle
test = TestClass()
dumped = pickle.dumps(test)
db.objects.insert_one({"data": dumped, "name": 'test'})
It's worth noting that unpickling can execute arbitrary code: if a user has any way to insert a pickled object into the database and you later unpickle that object, it poses a security threat to you.

Parse JSON MSG in Python

I am trying to parse a json MSG into a python dict.
For reference, the message is received from the Things Network with the python MQTT handler.
Here is the format I am receiving when I print the object
msg = MSG(variable_group=MSG(data0=0, data1=0, data2=0), variable2='name', variable3='num')
In its default state, I can access individual fields by msg.variable2 for example which provides 'name' but does not provide the variable name itself.
This is fine for a scenario in which I hardcode everything into my system, but I would like it to be a bit more adaptable and create new entries for variables as they come in.
Is there any way to parse this in such a way that I get both the data and the variable name?
Thanks!
EDIT:
From the input above, I would like to get a python dict containing the variable name and data.
dict =
{
variable_group : MSG(data0=0, data1=0, data2=0),
variable2 : 'name',
variable3 : 'num'
}
Currently, I can access the data via a for loop and can print the variable names if I print the entire structure, but cannot access the variable names through a looping mechanism
EDIT 2:
After doing some digging on the wrapper found the following:
def _json_object_hook(d):
    return namedtuple("MSG", d.keys())(*d.values())
def json2obj(data):
    return json.loads(data, object_hook=_json_object_hook)
Where the input shown above is created by passing it as 'data' to json2obj.
I am still unsure how to get a dict out of this format, haven't used object_hooks before.
From discussion in the comments below, it appears that the MSG object is a namedtuple created on the fly out of the json object.
In a case like that you can get the field names from the object's _fields attribute. You can dict-ify a namedtuple like this:
def nt_to_dict(nt):
    return {field: getattr(nt, field) for field in nt._fields}
or you could just inspect the object by looping over _fields in code and using getattr as needed.
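Because the object hook is applied to every JSON object, nested groups such as variable_group are namedtuples too. A recursive variant of the conversion might look like the sketch below; the payload string is an assumed example shaped like the message in the question, and the hook functions are copied from the wrapper code quoted there.

```python
import json
from collections import namedtuple


def _json_object_hook(d):
    return namedtuple("MSG", d.keys())(*d.values())


def json2obj(data):
    return json.loads(data, object_hook=_json_object_hook)


def nt_to_dict(value):
    """Recursively turn namedtuples (and lists of them) back into dicts."""
    if isinstance(value, tuple) and hasattr(value, "_fields"):
        return {field: nt_to_dict(getattr(value, field)) for field in value._fields}
    if isinstance(value, list):
        return [nt_to_dict(item) for item in value]
    return value


# Assumed sample payload, modelled on the message shown in the question.
payload = (
    '{"variable_group": {"data0": 0, "data1": 0, "data2": 0},'
    ' "variable2": "name", "variable3": "num"}'
)
msg = json2obj(payload)
as_dict = nt_to_dict(msg)

# Both the variable names and the data are now available in a loop.
for name, value in as_dict.items():
    print(name, value)
```

For a single, non-nested level, the built-in msg._asdict() method of namedtuples gives the same mapping without a helper.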

Django validate a python dictionary via DictField

I'm new to the django rest framework and have two basic questions about something I don't fully understand. I have a python function which takes as an input a dictionary and would like to make it available via an API.
In [24]: def f(d):
...: pass
...:
I'm testing a post request via postman sending a json file which should be translated to my python dictionary. The input I'm sending looks like
{
"test": {
"name1": {
"list": ["a",
"b",
"c"
],
"numbers": [0.0, 0.1]
}
}
}
As you can see this is a dict currently with one key name1 which itself is a dict with two keys, list and numbers.
The view set I wrote for this example looks like this:
class TestViewSet(viewsets.ViewSet):
    def create(self, request):
        serializer = TestSerializer(data=request.data)
        print(type(request.data.get("test")))
        if serializer.is_valid():
            print(serializer.validated_data.get("test"))
            f(serializer.validated_data.get("test"))
and my serializer looks like this:
class TestSerializer(serializers.Serializer):
    test = serializers.DictField()
If I send the JSON input above, the following is printed:
<type 'dict'>
None
So I have the following two questions:
As we can see, request.data.get("test") is already the desired type (dict). Why do we want to call the serializer and do some validation? This might cast certain inputs to non-standard Python types. E.g. validating a decimal via the DecimalField returns an object of type Decimal, which is a Django type, not a standard python type. Do I need to recast this after calling the serializer, or how do I know that this won't cause any trouble with a function expecting, for example, the native python float64 type?
How can I define the serializer in the above example for the dictionary so that I get returned the correct object and not None? This dictionary consists of a dictionary consisting of two keys, where one value is a list of decimals and the other a list of strings. How can I write a validation for this within the DictField?

Why can't I call __dict__ on object in list that happens to be within another object within a list within a dictionary?

Here's my setup: dictD contains a key users paired with value = list of UserObjects. Each UserObject has an attribute username plus two arrays, threads and comments.
I was able to convert dictD's array of user objects into a dictionary style with this call:
dictD["users"] = [user.__dict__ for user in dictD["users"]]
If I dump out dictD, here's the relevant part before I try to do my manipulation:
{
'users':[
{
'username': Redditor(user_name='$$$$$$$$$$'),
'threads':[
<__main__.redditThread instance at 0x7f05db28b320>
],
'comments':[
<__main__.comment instance at 0x7f05db278e60>
]
},
{
'username': Redditor(user_name='##########e\ gone'),
'threads':[
<__main__.redditThread instance at 0x7f05db2a4a70>
],
'comments':[
<__main__.comment instance at 0x7f05db298e18>
]
}
As you can see the comments contain comment objects and the threads list contains thread objects. So I'd like to do the same call for them that I did for the users array. But when I try to do this:
for user in dictD["users"]:
    user.threads = [thread.__dict__ for thread in user.threads]
    user.comments = [comment.__dict__ for comment in user.comments]
I run into this error:
AttributeError: 'dict' object has no attribute 'threads'
I also tried
users = dictD["users"]
for user in users...
but this triggers the same error message. How can I turn objects in lists into dictionary form when those objects' lists are themselves held within objects within lists within a dictionary?
Incidentally, I am doing all this so I can insert these objects into MongoDB, so if there is an easier way to serialize a complex object, please let me into the secret. Thank you.
Promoting my comment to an answer since it seems reasonable and nobody else is posting: it looks at a glance like you're confusing Python with Javascript. A dict with a key 'threads' is not an object you can reference with .threads, only with ["threads"], i.e. user.threads should be user["threads"]. A dict only has the standard dict methods and attributes (see: https://docs.python.org/2/library/stdtypes.html#typesmapping or https://docs.python.org/3/library/stdtypes.html#mapping-types-dict for Python 3). The problem isn't that you're trying to call __dict__ on an object; it's that, later in that same line of code, you're trying to read an attribute that doesn't exist on the dict.
If you want to recreate complex objects from MongoDB rather than just nested dicts and lists then that is basically a process of deserialization; you can either handle that manually, or maybe use some sort of object mapping library to do it for you (eg. something like Mongoobject might work, though I've not tested it myself)
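The key-access fix described in this answer, plus a generic recursive variant for deeper nesting, can be sketched as follows. The three classes are simplified stand-ins for the ones in the question, and to_document is a hypothetical helper that assumes every custom object keeps its fields in __dict__.

```python
class Comment:
    def __init__(self, body):
        self.body = body


class RedditThread:
    def __init__(self, title):
        self.title = title


class UserObject:
    def __init__(self, username, threads, comments):
        self.username = username
        self.threads = threads
        self.comments = comments


def to_document(value):
    """Recursively replace objects with their attribute dicts."""
    if isinstance(value, dict):
        return {key: to_document(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_document(item) for item in value]
    if hasattr(value, "__dict__"):
        return to_document(vars(value))
    return value


def make_data():
    return {"users": [UserObject("alice", [RedditThread("t1")], [Comment("c1")])]}


# Step-by-step fix: after the first conversion each user is a dict,
# so its lists are reached with ["threads"], not .threads.
dictD = make_data()
dictD["users"] = [user.__dict__ for user in dictD["users"]]
for user in dictD["users"]:
    user["threads"] = [thread.__dict__ for thread in user["threads"]]
    user["comments"] = [comment.__dict__ for comment in user["comments"]]

# Generic alternative: one recursive call over the whole structure.
document = to_document(make_data())
print(document == dictD)
```

Both routes end with nested dicts and lists only, which is the shape a MongoDB insert expects.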
