json.dumps with indent prints different output [duplicate] - python

I've noticed the order of elements in a JSON object not being the original order.
What about the elements of JSON lists? Is their order maintained?

Yes, the order of elements in JSON arrays is preserved. From RFC 7159, The JavaScript Object Notation (JSON) Data Interchange Format (emphasis mine):
An object is an unordered collection of zero or more name/value
pairs, where a name is a string and a value is a string, number,
boolean, null, object, or array.
An array is an ordered sequence of zero or more values.
The terms "object" and "array" come from the conventions of
JavaScript.
Some implementations also preserve the order of members in JSON objects, but this is not guaranteed.

The order of elements in an array ([]) is maintained. The order of elements (name:value pairs) in an "object" ({}) is not, and it's usual for them to be "jumbled", if not by the JSON formatter/parser itself then by the language-specific objects (Dictionary, NSDictionary, Hashtable, etc) that are used as an internal representation.
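As a rough illustration of the array guarantee in Python (the sample document below is made up for this sketch), a round trip through json.loads and json.dumps leaves list elements in their original positions:

```python
import json

# Hypothetical sample document; the arrays are deliberately not sorted.
text = '{"letters": ["c", "a", "b"], "numbers": [3, 1, 2]}'

# Parse, serialize again, and parse once more.
parsed = json.loads(text)
round_tripped = json.loads(json.dumps(parsed))

# JSON arrays become Python lists, which are ordered sequences.
letters = round_tripped["letters"]
numbers = round_tripped["numbers"]
print(letters, numbers)
```

Only the arrays are checked here; what happens to the order of the object's keys depends on the mapping type the parser builds.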

Practically speaking, if the keys are not numeric (that is, they would parse to NaN), the browser will not change the order. Integer-like keys, on the other hand, are reordered.
The following script will output "One", "Two", "Three":
var foo = {"3": "Three", "1": "One", "2": "Two"};
for (var bar in foo) {
    alert(foo[bar]);
}
Whereas the following script will output "Three", "One", "Two":
var foo = {"#3": "Three", "#1": "One", "#2": "Two"};
for (var bar in foo) {
    alert(foo[bar]);
}

Some JavaScript engines keep keys in insertion order. V8, for instance, keeps all keys in insertion order except for keys that can be parsed as unsigned 32-bit integers.
This means that if you run either of the following:
var animals = {};
animals['dog'] = true;
animals['bear'] = true;
animals['monkey'] = true;
for (var animal in animals) {
    if (animals.hasOwnProperty(animal)) {
        $('<li>').text(animal).appendTo('#animals');
    }
}
// or, building the same object by parsing JSON:
var animals = JSON.parse('{ "dog": true, "bear": true, "monkey": true }');
for (var animal in animals) {
    $('<li>').text(animal).appendTo('#animals');
}
You'll consistently get dog, bear, and monkey in that order, on Chrome, which uses V8. Node.js also uses V8. This will hold true even if you have thousands of items. YMMV with other JavaScript engines.

"Is the order of elements in a JSON list maintained?" is not a good question. You need to ask "Is the order of elements in a JSON list maintained when doing [...] ?"
As Felix King pointed out, JSON is a textual data format. It doesn't mutate without a reason. Do not confuse a JSON string with a (JavaScript) object.
You're probably talking about operations like JSON.stringify(JSON.parse(...)). Now the answer is: It depends on the implementation. 99%* of JSON parsers do not maintain the order of objects, and do maintain the order of arrays, but you might as well use JSON to store something like
{
"son": "David",
"daughter": "Julia",
"son": "Tom",
"daughter": "Clara"
}
and use a parser that maintains order of objects.
*probably even more :)
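To make the duplicate-key point concrete, here is a small Python sketch. It assumes the standard json module and uses its object_pairs_hook parameter to receive the raw name/value pairs instead of a dict; the variable names are only illustrative.

```python
import json

text = '{"son": "David", "daughter": "Julia", "son": "Tom", "daughter": "Clara"}'

# Default behaviour: a dict is built, so later duplicates overwrite earlier ones.
as_dict = json.loads(text)

# object_pairs_hook gets the (name, value) pairs in document order;
# returning them unchanged keeps both the order and the duplicates.
as_pairs = json.loads(text, object_pairs_hook=lambda pairs: pairs)

print(as_dict)
print(as_pairs)
```

The default parse keeps only the last "son" and "daughter", while the pairs version keeps all four entries in the order they were written.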

Related

Parse an embedded object (JSON) into an ordered dictionary in Python

I am looking to parse some JSON into a dictionary but need to preserve order for one particular part of the dictionary.
I know that I can parse the entire JSON file into an ordered dictionary (ex. Can I get JSON to load into an OrderedDict?) but this is not quite what I'm looking for.
{
"foo": "bar",
"columns":
{
"col_1": [],
"col_2": []
}
}
In this example, I would want to parse the entire file in as a dictionary with the "columns" portion being an OrderedDict. Is it possible to get that granular with the JSON parsing tools while guaranteeing that order is preserved throughout? Thank you!
From the comments I gather that a complete, nested OrderedDict is fine as well, but this could be a solution too, if you don't mind relying on the names of the columns. Note that it uses object_pairs_hook rather than object_hook: object_hook receives an already-built dict, so the original order may be lost before the hook runs.
import json
from collections import OrderedDict
def hook(pairs):
    # pairs is a list of (key, value) tuples in document order
    if any(key == "col_1" for key, _ in pairs):
        return OrderedDict(pairs)
    return dict(pairs)
result = json.loads("YOUR JSON STRING", object_pairs_hook=hook)
Hope this helps!
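Another way to get roughly what the question asks for, sketched here under the assumption that only the top level needs to be a plain dict: load everything as OrderedDict, then convert the outer mapping. The sample JSON mirrors the question but with the columns deliberately out of alphabetical order.

```python
import json
from collections import OrderedDict

text = '{"foo": "bar", "columns": {"col_2": [], "col_1": []}}'

# Parse every JSON object as an OrderedDict so no ordering is lost.
ordered = json.loads(text, object_pairs_hook=OrderedDict)

# Convert only the top level to a plain dict; nested values are left untouched,
# so result["columns"] is still an OrderedDict.
result = dict(ordered)

print(type(result).__name__, list(result["columns"]))
```

Any other nested objects would also stay OrderedDict with this approach, so it is less selective than a hook that checks key names.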

str.format a list by joining its values

Say I have a dictionary:
data = {
"user" : {
"properties" : ["i1", "i2"]
}
}
And the following string:
txt = "The user has properties {user[properties]}"
I want to have:
txt.format(**data)
to equal:
The user has properties i1, i2
I believe to achieve this, I could subclass the formatter used by str.format but I am unfortunately unsure how to proceed. I rarely subclass standard Python classes. Note that writing {user[properties][0]}, {user[properties][1]} is not an ideal option for me here. I don't know how many items are in the list so I would need to do a regex to identify matches, then find the relevant value in data and replace the matched text with {user[properties][0]}, {user[properties][1]}. str.format takes care of all the indexing from the string's value so it is very practical.
Just join the items in data["user"]["properties"]
txt = "The user has properties {properties}"
txt.format(properties = ", ".join(data["user"]["properties"]))
I ended up using the jinja2 package for all of my formatting needs. It's extremely powerful and I really recommend it!
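For completeness, the subclassing route mentioned in the question could look something like the sketch below. It assumes Python 3 and the standard string.Formatter class; JoinFormatter is a made-up name, and joining with ", " is just the behaviour asked for in the question.

```python
import string

class JoinFormatter(string.Formatter):
    """Hypothetical formatter that renders lists and tuples as comma-separated text."""

    def format_field(self, value, format_spec):
        # Join sequences before the normal formatting step runs.
        if isinstance(value, (list, tuple)):
            value = ", ".join(str(item) for item in value)
        return super().format_field(value, format_spec)

data = {"user": {"properties": ["i1", "i2"]}}
txt = "The user has properties {user[properties]}"

# The {user[properties]} lookup is still resolved by the base class.
result = JoinFormatter().format(txt, **data)
print(result)
```

This keeps the original template string unchanged, at the cost of calling a formatter object instead of str.format directly.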

Python json.loads changes the order of the object

I've got a file that contains a JSON object. It's been loaded the following way:
with open('data.json', 'r') as input_file:
    input_data = input_file.read()
At this point input_data contains just a string, and now I proceed to parse it into JSON:
data_content = json.loads(input_data.decode('utf-8'))
data_content now holds the parsed data, which is what I need, but for some reason that is not clear to me, json.loads alters the original order of the keys. For instance, if my file contained something like:
{ "z_id": 312312,
"fname": "test",
"program": "none",
"org": null
}
After json.loads the order is altered to let's say something like:
{ "fname": "test",
"program": "none",
"z_id": 312312,
"org": None
}
Why is this happening? Is there a way to preserve the order? I'm using Python 2.7.
Dictionaries in Python 2.7 have no guaranteed order, so when a JSON object is parsed into a dict, the original key order is lost. (Since Python 3.7, plain dicts preserve insertion order.)
If the order is important for some reason, you can have json.loads use an OrderedDict instead, which is like a dict but remembers the order in which keys were inserted.
import json
from collections import OrderedDict
data_content = json.loads(input_data.decode('utf-8'), object_pairs_hook=OrderedDict)
This is not an issue with json.loads. Dictionaries in Python 2.7 do not enforce any order, so the keys may come back in a different order; generally speaking, it doesn't matter, because you access elements by key, like "z_id".
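A quick way to check the OrderedDict suggestion is sketched below. It assumes Python 3, where the input is already text and no decode call is needed, and it inlines the sample document from the question instead of reading data.json.

```python
import json
from collections import OrderedDict

# Same content as the question's file, inlined as a string for the sketch.
input_data = '{"z_id": 312312, "fname": "test", "program": "none", "org": null}'

# object_pairs_hook receives the pairs in document order, so OrderedDict keeps them.
data_content = json.loads(input_data, object_pairs_hook=OrderedDict)

keys = list(data_content.keys())
print(keys)
```

The keys come back in the order they appear in the document, and the JSON null is mapped to Python's None.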

Python json dump: how to make a specific key come first?

I want to dump this json to a file:
json.dumps(data)
This is the data:
{
"list":{
"one": { "id": "12","desc":"its 12","name":"pop"},
"two": {"id": "13","desc":"its 13","name":"kindle"}
}
}
I want id to be the first property after I dump it to file, but it is not. How can I fix this?
My guess is that it's because you're using a dictionary (hash map), which is unordered.
What you could do is:
from collections import OrderedDict
data = OrderedDict()
data['list'] = OrderedDict()
data['list']['one'] = OrderedDict()
data['list']['one']['id'] = '12'
data['list']['one']['desc'] = ...
data['list']['two'] = ...
This keeps the keys in the order in which they were inserted.
It's "impossible" to predict the iteration order of a dict/hashmap, because the nature (and speed) of a traditional dictionary makes the order vary depending on usage, the items in the dictionary and a lot of other factors.
So you need to either sort the keys before serializing (json.dumps accepts sort_keys=True, which gives alphabetical order) or use a slower version of the dictionary called OrderedDict (see above).
Many thanks go out to @MarcoNawijn for checking the source of the json module. Note that json.dumps writes keys in whatever order the dictionary iterates over them, so an OrderedDict is dumped in insertion order and you do not have to build the JSON string yourself.
Whether the parser on the other end of your JSON string honors that order (which I doubt) is a separate matter, since the JSON RFC describes objects as unordered.
You shouldn't worry about the order in which JSON is saved. The order may change when dumping. Have a look at these too: JSON order mixed up
and
Is the order of elements in a JSON list maintained?
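Putting the OrderedDict suggestion above into practice, a minimal sketch might look like this. It assumes the intended structure is an object under "list", as in the answer's code, and relies on json.dumps writing keys in the mapping's iteration order.

```python
import json
from collections import OrderedDict

# Build each record with "id" inserted first.
one = OrderedDict([("id", "12"), ("desc", "its 12"), ("name", "pop")])
two = OrderedDict([("id", "13"), ("desc", "its 13"), ("name", "kindle")])
data = OrderedDict([("list", OrderedDict([("one", one), ("two", two)]))])

# json.dumps iterates the mapping, so the insertion order is what gets written.
dumped = json.dumps(data)
print(dumped)
```

If alphabetical order is acceptable instead, json.dumps(data, sort_keys=True) avoids the OrderedDict entirely, but then "desc" would come before "id".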

Access a variable within a dictionary with unknown nested location

I have a JSON file and I want to query it using Python. However, I do not know the nested location of a variable beforehand. E.g. to query the JSON object below, loaded into Python and called 'data', I could do the following:
data['experiments'][0]['initial_ns']['icdat']
However, this assumes that I know that the icdat variable is located below initial_ns, which is located under experiments. Unfortunately I do not have this information, and the JSON structure could also change in the future. Is there a simpler way to access variables within a JSON string without explicitly specifying the entire structure?
thanks!!!
{
"experiments": [
{
"management": {
"events": [
{
"date": "19122",
"timp": "TI3",
"eve": "tage"
}
]
},
"initial_ns": {
"icpcr": "MZ",
"icdat": "1922"
},
"observed": {
"mdat": "19403",
"time_series": [
{
"date": "198423",
"etac": "0"
}
],
"adat": "190218"
},
"local_name": "lhi",
"exname": "SE",
"exp_dur": "1"
}
]
}
Have a look at the jsonpath module. http://goessner.net/articles/JsonPath/. I think the search string $..icdat will match your needs.
"...without explicitly specifying the entire structure?"
Yes, there are many ways. Unfortunately you have not specified which answer you are looking for.
To be "unique in terms of the schema" (my terminology) is as follows: If you have for example multiple Foo dictionaries with the key Foo.bar, then that is still unique. What is not unique is if you have Foo objects with Foo.bar, and Baz objects with Baz.bar: searching for {... bar:...} will return different kinds of objects.
If the key is unique in terms of the schema, you can search the entire tree. You can make this go faster by caching all key-value pairs in a dictionary for later use (therefore the operation is O(1) "instant" amortized cost, since you needed to go through the entire data structure anyway to parse it!). This even works if you would like to return sets of objects: use a cache = collections.defaultdict(set) and when you preprocess items to cache, do cache[key].add(value).
If the key is not unique in terms of the schema, you will want to make a reasonable guess about the path and provide some partial information, per Hans Then's answer using JsonPath: https://stackoverflow.com/a/12291240/711085 (alternatively, change the schema)
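The caching idea described above could be sketched roughly as follows. build_index is a hypothetical helper name, the data is a trimmed-down version of the question's document, and the sketch collects values in a list rather than a set because JSON values such as dicts and lists are not hashable.

```python
from collections import defaultdict

def build_index(node, index):
    """Walk nested dicts/lists once and record every value under its key."""
    if isinstance(node, dict):
        for key, value in node.items():
            index[key].append(value)
            build_index(value, index)
    elif isinstance(node, list):
        for item in node:
            build_index(item, index)

# Trimmed-down version of the question's document.
data = {
    "experiments": [
        {
            "management": {"events": [{"date": "19122"}]},
            "initial_ns": {"icpcr": "MZ", "icdat": "1922"},
            "observed": {"time_series": [{"date": "198423"}]},
        }
    ]
}

index = defaultdict(list)
build_index(data, index)
print(index["icdat"])
```

After the single pass, every later lookup such as index["date"] is a plain dictionary access.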
No. You need to know the format, or you'll have to manually loop over everything in it.
You can write a function to recursively search nested containers for a given key, similar to getElementById() in an XML DOM parser.
def find_key(obj, key):
    """Yield every value stored under key anywhere in nested dicts/lists."""
    if isinstance(obj, dict):
        if key in obj:
            yield obj[key]
        children = obj.values()
    elif isinstance(obj, list):
        children = obj
    else:
        return
    # Recurse into every child container; scalars yield nothing.
    for child in children:
        for item in find_key(child, key):
            yield item
>>> next(find_key(data, "icdat"))
'1922'
Since the same key may be found in multiple places in the document, this is actually written as a generator. You can iterate over the results to get all the values or, if you just want the first one (or know it's the only one), use next() around it as I've shown above. You could also convert it to a list() if desired.
