How to maintain dictionary element order in JSON dump in Python - python

Thanks to a very helpful person on this site, I have been able to put together a script which takes user names (entered by the user) and then uses a loop to append each of those user names to a specific JSON structure. However, I have noticed that the order within my JSON dump is not maintained. For example, here is what my loop and JSON dump look like:
import json

list_users = []
for user in users:
    list_users.append({"name": user,
                       "member": 123,
                       "comment": "this is a comment"})
json_user_list_encoding = json.dumps(list_users, indent=2)
My print looks like this:
({"member": 123,
  "comment": "this is a comment",
  "name": user
})
I was wondering if it is possible to maintain the same order as in "list_users" when I use the JSON dump. I have looked around this site a bit and, from what I have read, I will have to assign keys in order to maintain a specific order. However, I am not sure how to do this, given that I only have one JSON object. I hope this makes sense. Thanks for the help.

If you just want them ordered reliably/repeatably, but don't care about the order itself, then json.dumps accepts a sort_keys keyword argument that can help:
>>> json.dumps({'z': 3, 'b': 2, 'a': 1})
'{"a": 1, "z": 3, "b": 2}'
>>> json.dumps({'z': 3, 'b': 2, 'a': 1}, sort_keys=True)
'{"a": 1, "b": 2, "z": 3}'
Otherwise, put the users in an OrderedDict instead of a plain dict.
from collections import OrderedDict
list_users.append(OrderedDict([
    ("name", user),
    ("member", 123),
    ("comment", "this is a comment"),
]))
If you need them to deserialize to the same order too, use the object_pairs_hook when loading.
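As a rough illustration of that last point (the user name "alice" is made up for the example): object_pairs_hook receives each object's key/value pairs in document order, so passing OrderedDict keeps the order of the JSON text when loading. On Python 3.7 and later a plain dict also keeps insertion order, and json.dumps writes keys in that order.

```python
import json
from collections import OrderedDict

# Hypothetical sample record; "alice" is a made-up user name.
record = OrderedDict([
    ("name", "alice"),
    ("member", 123),
    ("comment", "this is a comment"),
])

# Keys are written in the order they were inserted.
encoded = json.dumps(record)
print(encoded)

# object_pairs_hook gets the pairs in document order, so the decoded
# mapping has the same key order as the JSON text.
decoded = json.loads(encoded, object_pairs_hook=OrderedDict)
print(list(decoded.keys()))
```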

Related

Python JSON parsing: ignore sub-objects

I only want to parse the root object of a JSON string. If this object contains any key whose value is also an object, the value should be kept as a string and should not be treated as a Python dictionary.
input = '{ "a": 1, "b": { "c": 2 } }'
Needed outcome:
result = {
    'a': 1,
    'b': '{ "c": 2 }'
}
The reason for doing so is that the sub-objects are large and we won't process them here, so parsing and storing them as typed values is not useful. Some parsing surely has to be done, but at least the objects are not created and the deep processing of the tokens can be skipped.
After using json.loads(input), I would be able to convert the value back via json.dumps(result['b']). Is there a better way to do this? Maybe a pre-built JSONDecoder which yields all sub-object tokens as strings?
Definitely not the best solution, and maybe something you have thought of already, but here is a solution which converts all values that are dicts back to JSON strings after the fact.
import json
input = '{ "a": 1, "b": { "c": 2 } }'
data = json.loads(input)
for k, v in data.items():
    if isinstance(v, dict):
        # json.dumps (rather than str) keeps the value valid JSON
        data[k] = json.dumps(v)

Python throws KeyError: 1 when using a Dictionary restored from a file

In my program, I have certain settings that can be modified by the user, saved to disk, and then loaded when the application is restarted. Some of these settings are stored as dictionaries. While trying to implement this, I noticed that after a dictionary is restored, its values cannot be used to access values of another dictionary, because it throws a KeyError: 1 exception.
This is a minimal code example that illustrates the issue:
import json
motorRemap = {
    1: 3,
    2: 1,
    3: 6,
    4: 4,
    5: 5,
    6: 2,
}
motorPins = {
    1: 6,
    2: 9,
    3: 10,
    4: 11,
    5: 13,
    6: 22
}
print(motorPins[motorRemap[1]])  # works correctly
with open('motorRemap.json', 'w') as fp:
    json.dump(motorRemap, fp)
with open('motorRemap.json', 'r') as fp:
    motorRemap = json.load(fp)
print(motorPins[motorRemap[1]])  # throws KeyError: 1
You can run this code as it is. The first print statement works fine, but after the first dictionary is saved and restored, it doesn't work anymore. Apparently, saving/restoring somehow breaks that dictionary.
I have tried saving and restoring with the json and pickle libraries, and both produce the same error. I tried printing values of the first dictionary directly after it is restored (print(motorRemap[1])), and it prints out correct values without any added spaces or anything. KeyError usually means that the specified key doesn't exist in the dictionary, but in this instance the print statement shows that it does exist, unless some underlying data types have changed or something. So I am really puzzled as to why this is happening.
Can anyone help me understand what is causing this issue, and how to solve it?
What happens becomes clear when you look at what json.dump wrote into motorRemap.json:
{"1": 3, "2": 1, "3": 6, "4": 4, "5": 5, "6": 2}
Unlike Python, json can only use strings as keys. Python, on the other hand, allows many different types for dictionary keys, including booleans, floats and even tuples:
my_dict = {False: 1,
3.14: 2,
(1, 2): 3}
print(my_dict[False], my_dict[3.14], my_dict[(1, 2)])
# Outputs '1 2 3'
The json.dump function automatically converts some of these types to string when you try to save the dictionary to a json file. False becomes "false", 3.14 becomes "3.14" and, in your example, 1 becomes "1". (This doesn't work for the more complex types such as a tuple. You will get a TypeError if you try to json.dump the above dictionary where one of the keys is (1, 2).)
Note how the keys change when you dump and load a dictionary with some of the Python-specific keys:
import json
my_dict = {False: 1,
           3.14: 2}
print(my_dict[False], my_dict[3.14])
with open('my_dict.json', 'w') as fp:
    json.dump(my_dict, fp)
# Writes {"false": 1, "3.14": 2} into the json file
with open('my_dict.json', 'r') as fp:
    my_dict = json.load(fp)
print(my_dict["false"], my_dict["3.14"])
# And not my_dict[False] or my_dict[3.14] which raise a KeyError
Thus, the solution to your issue is to access the values using strings rather than integers after you load the dictionary from the json file.
print(motorPins[motorRemap["1"]]) instead of your last line will fix your code.
From a more general perspective, it might be worth keeping the keys as strings from the beginning if you know you will be saving the dictionary to a JSON file. You could also convert the keys back to integers after loading, as discussed here; however, that can lead to bugs if not all the keys are integers, and it is not a very good idea at a larger scale.
Check out pickle if you want to save the dictionary while keeping the Python types. It is, however, not human-readable, unlike json, and it is also Python-specific, so it cannot be used to transfer data to other languages, which loses virtually all the main benefits of json.
If you want to save and load the dictionary using pickle, this is how you would do it:
import pickle
...
with open('motorRemap.b', 'wb') as fp:
    pickle.dump(motorRemap, fp)
with open('motorRemap.b', 'rb') as fp:
    motorRemap = pickle.load(fp)
...
Since the integer keys of a dict are written to the JSON file as strings, we can modify how the JSON file is read. A dict comprehension restores the original integer keys:
...
with open('motorRemap.json', 'r') as fp:
    motorRemap = {int(key): value for key, value in json.load(fp).items()}
...
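A variation on the same idea, offered only as a sketch: json.load and json.loads accept an object_hook callable that is applied to every decoded JSON object, so nested dictionaries get converted too. The helper name int_keys is hypothetical, and it assumes that any key made up only of decimal digits was originally an integer, which will not hold for every data set.

```python
import json

def int_keys(obj):
    """Hypothetical helper: turn digit-only string keys back into ints."""
    return {int(k) if k.isdecimal() else k: v for k, v in obj.items()}

motorRemap = {1: 3, 2: 1, 3: 6, 4: 4, 5: 5, 6: 2}
motorPins = {1: 6, 2: 9, 3: 10, 4: 11, 5: 13, 6: 22}

# The integer keys are written as the strings "1", "2", ...
text = json.dumps(motorRemap)

# object_hook is called for each decoded JSON object, nested ones included.
restored = json.loads(text, object_hook=int_keys)

print(restored == motorRemap)
print(motorPins[restored[1]])
```

Keys such as "-1" or "1.5" are left as strings by this helper, so it only covers non-negative integer keys.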

More consistent hashing in dictionary with python objects?

So, I saw Hashing a dictionary?, and I was trying to figure out a way to handle python native objects better and produce stable results.
After looking at all the answers + comments this is what I came to and everything seems to work properly, but am I maybe missing something that would make my hashing inconsistent (besides hash algorithm collisions)?
md5(repr(nested_dict).encode()).hexdigest()
tl;dr: it creates a string with the repr and then hashes the string.
Generated my testing nested dict with this:
nested_dict = {}
for i in range(100):
    for j in range(100):
        if not nested_dict.get(i, None):
            nested_dict[i] = {}
        nested_dict[i][j] = ''
I'd imagine repr should be able to support any Python object, since most have to support __repr__ in general, but I'm still pretty new to Python programming. One thing I've heard is that using from reprlib import repr instead of the built-in repr will truncate large sequences. So that's one potential downside, but it seems like the native list and set types don't do that.
Other notes:
I'm not able to use https://stackoverflow.com/a/5884123, because I'm going to have nested dictionaries.
I used python 3.9.7 when testing this out.
Not able to use https://stackoverflow.com/a/22003440, because at the time of hashing it still has IPv4 address objects as keys. (json.dumps didn't like that too much 😅)
Python dicts are insertion-ordered, and repr respects that. Your hexdigest of {"A":1,"B":2} will differ from that of {"B":2,"A":1}, whereas ==-wise those dicts are the same.
Yours won't work out:
from hashlib import md5
def yourHash(d):
    return md5(repr(d).encode()).hexdigest()
a = {"A":1,"B":2}
b = {"B":2,"A":1}
print(repr(a))
print(repr(b))
print(a == b)
print(yourHash(a) == yourHash(b))
gives
{'A': 1, 'B': 2} # repr a
{'B': 2, 'A': 1} # repr b
True # a == b
False # yourHash(a) == yourHash(b)
I really do not see the sense in hashing dicts at all, and the ones here are not even nested.
You could use json.dumps() with sort_keys=True, which sorts the keys all the way down to the innermost nested dict, and hash the resulting string. I still don't see the sense in it, and it will add considerable computational overhead:
import json
a = {"A":1,"B":2, "C":{2:1,3:2}}
b = {"B":2,"A":1, "C":{3:2,2:1}}
for di in (a, b):
    print(json.dumps(di, sort_keys=True))
gives
{"A": 1, "B": 2, "C": {"2": 1, "3": 2}} # thoroughly sorted
{"A": 1, "B": 2, "C": {"2": 1, "3": 2}} # recursively...
which is exactly what this answer in Hashing a dictionary? proposes ... why stray from it?
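Putting the two pieces together, a minimal sketch of an order-independent hash could look like the following. The name stable_hash is hypothetical. It assumes every key is a type json.dumps accepts (str, int, float, bool or None) and that the keys within one dict can be sorted against each other, so it would not cover the IPv4 address keys mentioned in the question. Note also that the int key 2 and the str key "2" serialize to the same text.

```python
import json
from hashlib import md5

def stable_hash(d):
    """Hypothetical helper: hash the key-sorted JSON text of a dict."""
    canonical = json.dumps(d, sort_keys=True)
    return md5(canonical.encode()).hexdigest()

a = {"A": 1, "B": 2, "C": {2: 1, 3: 2}}
b = {"B": 2, "A": 1, "C": {3: 2, 2: 1}}

print(a == b)
print(stable_hash(a) == stable_hash(b))
```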

Making pretty json string with python's json

I am using python's json to produce a json file and cannot manage to format it the way I want.
What I have is a dictionary with some keys, and each key has a list of numbers attached to it:
out = {"a": [1,2,3], "b": [4,5,6]}
What I want to do is produce a JSON string where each list is in its own line, like so:
{
  "a": [1,2,3],
  "b": [4,5,6]
}
However, I can only get
>>> json.dumps(out)
'{"a": [1, 2, 3], "b": [4, 5, 6]}'
which has no new lines, or
>>> print json.dumps(out, indent=2)
{
  "a": [
    1,
    2,
    3
  ],
  "b": [
    4,
    5,
    6
  ]
}
which has way too many. Is there a simple way to produce the string I want? I can do it manually, of course, but I am wondering if it is possible with json alone...
You can't do that with the json module, no. It was never the goal for the module to allow this much control over the output.
The indent option is only meant to aid debugging. JSON parsers don't care about how much whitespace is used in-between elements.
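If doing it manually is acceptable, one possible sketch is to let json.dumps encode each key and each list separately and join the pieces yourself. This assumes a flat dict with string keys, as in the question, and it is not a feature of the json module.

```python
import json

out = {"a": [1, 2, 3], "b": [4, 5, 6]}

# Encode every key and value on its own; the compact separators
# remove the spaces inside each list.
lines = [
    "  {}: {}".format(json.dumps(key), json.dumps(value, separators=(",", ":")))
    for key, value in out.items()
]
pretty = "{\n" + ",\n".join(lines) + "\n}"
print(pretty)
```

The result is still valid JSON, so json.loads(pretty) gives back the original dictionary.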

How does JSON work in my python program?

So I use the Java Debugger JSON in my python program because a few months ago I was told that this was the best way of opening a text file and making it into a dictionary and also saving the dictionary to a text file. However I am not sure how it works.
Below is how I am using it within my program:
with open("totals.txt", 'r') as f30:
    totaldict = json.load(f30)
and
with open("totals.txt", 'w') as f29:
    json.dump(totaldict, f29)
I need to explain how it works for my project so could anyone explain for me how exactly json works when loading a text file into dictionary format and when dumping contents into the text file?
Thanks.
Edit: please don't just post links to other articles. I have tried looking at these and they have not helped me much, as they are not in my context of using JSON for dictionaries and are a bit overwhelming, as I am only a beginner.
JSON is JavaScript Object Notation. It works in Python like it does anywhere else, by giving you a syntax for describing arbitrary things as objects.
Most JSON is primarily composed of JavaScript arrays, which look like this:
[1, 2, 3, 4, 5]
Or lists of key-value pairs describing an object, which look like this:
{"key1": "value1", "key2": "value2"}
These can also be nested in either direction:
[{"object1": "data1"}, {"object2": "data2"}]
{"object1": ["list", "of", "data"]}
Naturally, Python can very easily treat these types as lists and dicts, which is exactly what the json module tries to do.
>>> import json
>>> json.loads('[{"object1": "data1"}, {"object2": "data2"}]')
[{'object1': 'data1'}, {'object2': 'data2'}]
>>> json.dumps(_)
'[{"object1": "data1"}, {"object2": "data2"}]'
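The same conversion happens when a file is involved: json.dump writes the JSON text to the file object, and json.load reads that text and parses it back. Below is a small sketch of the round trip. The contents of totaldict are made-up sample data, and a temporary directory is used so that no real totals.txt is touched.

```python
import json
import os
import tempfile

totaldict = {"apples": 3, "pears": 5}  # hypothetical sample data

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "totals.txt")

    # json.dump turns the dict into JSON text and writes it to the file.
    with open(path, "w") as f:
        json.dump(totaldict, f)

    # The file now contains ordinary text.
    with open(path) as f:
        raw_text = f.read()

    # json.load reads that text and parses it back into a dict.
    with open(path) as f:
        loaded = json.load(f)

print(raw_text)
print(loaded == totaldict)
```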
Try this: Python Module of the Week
The json module provides an API similar to pickle for converting in-memory Python objects to a serialized representation known as JavaScript Object Notation (JSON). Unlike pickle, JSON has the benefit of having implementations in many languages (especially JavaScript).
Encoding and Decoding Simple Data Types
The encoder understands Python’s native types by default (string, unicode, int, float, list, tuple, dict).
import json
data = [ { 'a':'A', 'b':(2, 4), 'c':3.0 } ]
print 'DATA:', repr(data)
data_string = json.dumps(data)
print 'JSON:', data_string
Values are encoded in a manner very similar to Python’s repr() output.
$ python json_simple_types.py
DATA: [{'a': 'A', 'c': 3.0, 'b': (2, 4)}]
JSON: [{"a": "A", "c": 3.0, "b": [2, 4]}]
Encoding, then re-decoding may not give exactly the same type of object.
import json
data = [ { 'a':'A', 'b':(2, 4), 'c':3.0 } ]
data_string = json.dumps(data)
print 'ENCODED:', data_string
decoded = json.loads(data_string)
print 'DECODED:', decoded
print 'ORIGINAL:', type(data[0]['b'])
print 'DECODED :', type(decoded[0]['b'])
In particular, strings are converted to unicode and tuples become lists.
$ python json_simple_types_decode.py
ENCODED: [{"a": "A", "c": 3.0, "b": [2, 4]}]
DECODED: [{u'a': u'A', u'c': 3.0, u'b': [2, 4]}]
ORIGINAL: <type 'tuple'>
DECODED : <type 'list'>
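The excerpt above is written for Python 2 (print statements and u'' strings). A rough Python 3 equivalent, given only as a sketch, shows that the tuple still comes back as a list, while strings are plain str on both sides.

```python
import json

data = [{'a': 'A', 'b': (2, 4), 'c': 3.0}]

data_string = json.dumps(data)
decoded = json.loads(data_string)

print('ENCODED:', data_string)
print('DECODED:', decoded)
# The tuple was written as a JSON array, which decodes to a list.
print('ORIGINAL:', type(data[0]['b']))
print('DECODED :', type(decoded[0]['b']))
```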
