If I have a dictionary like:
{
    "cats": {
        "sphinx": 3,
        "british": 2
    },
    "dogs": {}
}
and try to save it to a text file, I get something like this:
{"cats": {"sphinx": 3}, {"british": 2}, "dogs": {}}
How can I save a dictionary in pretty format, so it will be easy to read by human eye?
You can import json and specify an indent level:
import json
d = {
    "cats": {
        "sphinx": 3,
        "british": 2
    },
    "dogs": {}
}
j = json.dumps(d, indent=4)
print(j)
{
    "cats": {
        "sphinx": 3,
        "british": 2
    },
    "dogs": {}
}
Note that this is a string, however:
>>> j
'{\n "cats": {\n "sphinx": 3, \n "british": 2\n }, \n "dogs": {}\n}'
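Since the goal here is a text file, you can also skip the intermediate string and pass an open file handle to json.dump; a minimal sketch (the filename is arbitrary):

```python
import json

d = {"cats": {"sphinx": 3, "british": 2}, "dogs": {}}

# json.dump writes the indented JSON straight to an open file handle
with open("pets.json", "w") as f:
    json.dump(d, f, indent=4)

# reading it back restores the original dictionary
with open("pets.json") as f:
    assert json.load(f) == d
```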
You can use pprint for that:
import pprint
pprint.pformat(thedict)
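Note that pprint.pformat returns a string rather than printing anything, so to save the result you write it to the file yourself. A minimal sketch (the dictionary and filename are placeholders):

```python
import ast
import pprint

thedict = {"cats": {"sphinx": 3, "british": 2}, "dogs": {}}

# pformat returns the pretty-printed representation as a string
text = pprint.pformat(thedict)
with open("dictionary.txt", "w") as f:
    f.write(text)

# the output is a valid Python literal, so it can be parsed back safely
assert ast.literal_eval(text) == thedict
```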
If you want to save it in a more standard format, you can also use a YAML file (via the PyYAML package, http://pyyaml.org/wiki/PyYAMLDocumentation), and the code would look like:
import yaml
dictionary = {"cats": {"sphinx": 3, "british": 2}, "dogs": {}}
with open('dictionary_file.yml', 'w') as yaml_file:
    yaml.dump(dictionary, stream=yaml_file, default_flow_style=False)
dump creates a string in YAML format. By specifying the stream argument, the content is written immediately to the file; if you need the string itself before writing for some reason, just omit stream and write the result to the file yourself with write().
Note also that the default_flow_style=False parameter gives a nicer block format; for this example the file looks like:
cats:
  british: 2
  sphinx: 3
dogs: {}
To load the YAML file back into a dictionary:
import yaml
with open('dictionary_file.yml', 'r') as yaml_file:
    dictionary = yaml.safe_load(yaml_file)
(safe_load is preferable to plain load, which can execute arbitrary tags and requires an explicit Loader in newer PyYAML versions.)
You can dump it using the Python Object Notation module (pon; disclaimer: I am the author of that module):
from pon import PON, loads
data = {
    "cats": {
        "sphinx": 3,
        "british": 2
    },
    "dogs": {}
}
pon = PON(obj=data)
pon.dump()
which gives:
dict(
    cats=dict(
        sphinx=3,
        british=2,
    ),
    dogs=dict(),
)
which is again valid Python, but trades the quoted strings needed for keys for dict() calls with keyword arguments.
You can load this again with:
read_back = loads(open('file_name.pon').read())
print(read_back)
giving:
{'cats': {'sphinx': 3, 'british': 2}, 'dogs': {}}
Please note that loads() does not evaluate the string, it actually parses it safely using python's built-in parser.
PON also allows you to load Python dictionaries from files that have commented entries, and dump them while preserving the comments. This is where its real usefulness comes into play.
Alternatively, if you would like something more readable like the YAML format, you can use ruamel.yaml and do:
import ruamel.yaml
with open('file_name.yaml', 'w') as yaml_file:
    ruamel.yaml.round_trip_dump(data, stream=yaml_file, indent=4)
which gives you a file file_name.yaml with contents:
cats:
    sphinx: 3
    british: 2
dogs: {}
which uses the indent you seem to prefer (and is more efficient than @alberto's version).
Content of a Sample Input Text
{'key1':'value1','msg1':"content1"} //line 1
{'key2':'value2','msg2':"content2"} //line 2
{'key3':'value3','msg3':"content3"} //line 3
Some notable characteristics of the input text:
Lacks a proper delimiter, currently each object {...} takes a new line "\n"
Contains single quotes, which can be an issue since JSON (the expected output) accepts only double quotes
Does not have the opening and closing curly brackets required by JSON
Expected Output JSON
{
    {
        "key1":"value1",
        "msg1":"content1"
    },
    {
        "key2":"value2",
        "msg2":"content2"
    },
    {
        "key3":"value3",
        "msg3":"content3"
    }
}
What I have tried, but failed
json.dumps(input_text), but it cannot identify "\n" as the "delimiter"
Appending a comma at the end of each object {...}, but encountered the issue of extra comma when it comes to the last object
If you have one dictionary per line, you can replace the newlines with commas and enclose the whole thing in square brackets [ ], giving you a list of dictionaries.
You can use ast.literal_eval to import your file as list of dictionaries.
Finally export it to json:
import json
import ast
with open("file.txt", "r") as f:
    dic_list = ast.literal_eval("[" + f.read().replace('\n', ',') + "]")

print(json.dumps(dic_list, indent=4))
Output:
[
    {
        "key1": "value1",
        "msg1": "content1"
    },
    {
        "key2": "value2",
        "msg2": "content2"
    },
    {
        "key3": "value3",
        "msg3": "content3"
    }
]
Just use ast
import ast
with open('test.txt') as f:
    data = [ast.literal_eval(line.strip()) for line in f]

print(data)
output
[{'key1': 'value1', 'msg1': 'content1'}, {'key2': 'value2', 'msg2': 'content2'}, {'key3': 'value3', 'msg3': 'content3'}]
I am using the csv module to convert JSON to CSV and store it in a file or print it to stdout.
import csv
import sys

def write_csv(data: list, header: list, path: str = None):
    # data is the parsed JSON as a list of dictionaries
    output_file = open(path, 'w') if path else sys.stdout
    out = csv.writer(output_file)
    out.writerow(header)
    for row in data:
        out.writerow([row[attr] for attr in header])
    if path:
        output_file.close()
I want to store the converted csv to a variable instead of sending it to a file or stdout.
say I want to create a function like this:
def json_to_csv(data: list, header: list):
    # convert the JSON data into a CSV string
    return string_csv
NOTE: the format of data is simple: a list of dictionaries mapping strings to strings
[
    {
        "username":"srbcheema",
        "name":"Sarbjit Singh"
    },
    {
        "username":"testing",
        "name":"Test, user"
    }
]
I want csv output to look like:
username,name
srbcheema,Sarbjit Singh
testing,"Test, user"
Converting JSON to CSV is not a trivial operation. There is also no standardized way to translate between them...
For example
my_json = {
    "one": 1,
    "two": 2,
    "three": {
        "nested": "structure"
    }
}
Could be represented in a number of ways...
These are all (to my knowledge) valid CSVs that contain all the information from the JSON structure.
data
'{"one": 1, "two": 2, "three": {"nested": "structure"}}'

one,two,three
1,2,'{"nested": "structure"}'

one,two,three__nested
1,2,structure
In essence, you will have to figure out the best translation between the two based on your knowledge of the data. There is no right answer on how to go about this.
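That said, if you do settle on a translation, the key-flattening style (three__nested above) is straightforward to sketch; the __ separator is an arbitrary choice:

```python
def flatten(d, parent="", sep="__"):
    """Flatten nested dictionaries into one level, joining keys with sep."""
    out = {}
    for key, value in d.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            out.update(flatten(value, name, sep))
        else:
            out[name] = value
    return out

my_json = {"one": 1, "two": 2, "three": {"nested": "structure"}}
print(flatten(my_json))
# {'one': 1, 'two': 2, 'three__nested': 'structure'}
```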
I'm relatively new to Python, so there's probably a better way, but this works:
def get_safe_string(string):
    return '"' + string + '"' if "," in string else string

def json_to_csv(data):
    csv_keys = data[0].keys()
    header = ",".join(csv_keys)
    res = [",".join(get_safe_string(row.get(k)) for k in csv_keys) for row in data]
    res.insert(0, header)
    return "\n".join(res)
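A more robust variant is to let the csv module do the quoting itself (it also handles embedded double quotes and newlines, which manual quote-wrapping does not) and capture the output in an io.StringIO buffer:

```python
import csv
import io

def json_to_csv(data, header):
    """Render a list of dicts as a CSV string, using csv's own quoting rules."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=header, lineterminator="\n")
    writer.writeheader()
    writer.writerows(data)
    return buf.getvalue()

data = [
    {"username": "srbcheema", "name": "Sarbjit Singh"},
    {"username": "testing", "name": "Test, user"},
]
print(json_to_csv(data, ["username", "name"]))
# username,name
# srbcheema,Sarbjit Singh
# testing,"Test, user"
```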
My generated JSON output is not valid JSON when checked with JSLint; I get an EOF error.
Here I am using if len(data) != 0: to avoid inserting [] into the final output.json file (it works, but I don't know any better way to avoid writing []).
with open('output.json', 'a') as jsonFile:
    print(data)
    if len(data) != 0:
        json.dump(data, jsonFile, indent=2)
My input data comes one item at a time from another function, generated inside a for loop.
Sample "data" coming from another function using loop :
print(data)
[{'product': 'food'}, {'price': '$100'}]
[{'product': 'clothing'}, {'price': '$40'}]
...
Can I append these data and make a JSON file under a "Store" key? What would be the proper practice? Please suggest.
Sample output generated from output.json file :
[
  {
    "product": "food"
  },
  {
    "price": "$100"
  }
][
  {
    "product": "clothing"
  },
  {
    "price": "$40"
  }
]
Try the jsonlines package; you can install it with pip install jsonlines.
JSON Lines does not put a comma at the end of each line, so you can read and write exactly the structure you have and you would not need to do any additional merging or formatting.
import jsonlines
with jsonlines.open('output.json') as reader:
    for obj in reader:
        print(obj)  # do something with obj
Similarly, you can dump with this module's write method:
with jsonlines.open('output.json', mode='w') as writer:
    writer.write(...)
output.jsonl would look like this
[{"product": "food"}, {"price": "$100"}]
[{"product": "clothing"}, {"price": "$40"}]
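If you would rather avoid a third-party dependency, the same one-JSON-document-per-line pattern works with the standard json module:

```python
import json

records = [
    [{"product": "food"}, {"price": "$100"}],
    [{"product": "clothing"}, {"price": "$40"}],
]

# write one JSON document per line (the JSON Lines convention)
with open("output.jsonl", "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")

# read them back line by line
with open("output.jsonl") as f:
    loaded = [json.loads(line) for line in f]

assert loaded == records
```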
Yes, you can always club them all together under a key named Store, which makes sense as they are all products in the store.
But I think the format below would be much better, since each product in the store has a defined product name along with its price:
{
    "Store": [
        {
            "product": "food",
            "price": "$100"
        },
        {
            "product": "clothing",
            "price": "$40"
        }
    ]
}
If you do it this way you don't have to insert each key-value pair into the JSON separately; instead you can add each product's name and price as a single object and keep appending it to the Store list.
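A sketch of that approach; merging each [{'product': ...}, {'price': ...}] pair into one object is an assumption about what your producing loop yields:

```python
import json

store = {"Store": []}

# stand-in for the pairs your other function yields inside its loop
for data in ([{"product": "food"}, {"price": "$100"}],
             [{"product": "clothing"}, {"price": "$40"}]):
    merged = {}
    for part in data:  # fold the pair of one-key dicts into a single product
        merged.update(part)
    store["Store"].append(merged)

# dump once, after the loop, so the file holds a single valid JSON document
with open("output.json", "w") as f:
    json.dump(store, f, indent=2)
```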
I got a JSON file in a format like this, where each record spans multiple lines:
{
    "A":0,
    "B":2
}{
    "A":3,
    "B":4
}
How can I read it into a list?
If your data is exactly in that format, we can edit it into valid JSON.
import json
source = '''\
{
    "A":0,
    "B":2
}{
    "A":3,
    "B":4
}{
    "C":5,
    "D":6
}
'''
fixed = '[' + source.replace('}{', '},{') + ']'
lst = json.loads(fixed)
print(lst)
output
[{'A': 0, 'B': 2}, {'A': 3, 'B': 4}, {'C': 5, 'D': 6}]
This relies on each record being separated by '}{'. If that's not the case, we can use regex to do the search & replace operation.
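For example, a sketch using re.sub that tolerates whitespace or newlines between a closing and an opening brace (beware: it would also match a literal '}{' inside a string value, so it is not bulletproof):

```python
import json
import re

source = '{\n"A":0,\n"B":2\n}\n{\n"A":3,\n"B":4\n}'

# insert commas between back-to-back objects, allowing whitespace between } and {
fixed = "[" + re.sub(r"\}\s*\{", "},{", source) + "]"
print(json.loads(fixed))
# [{'A': 0, 'B': 2}, {'A': 3, 'B': 4}]
```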
Add [ and ] around your input and try this:
import json
with open('data.json') as data_file:
    data = json.load(data_file)
print(data)
This code prints the following line
[{'A': 0, 'B': 2}, {'A': 3, 'B': 4}]
when I put this data into the file:
[
    {
        "A":0,
        "B":2
    },{
        "A":3,
        "B":4
    }
]
If you can't edit the file data.json, you can read string from this file, add [ and ] around this string, and call json.loads().
Update: Oh, I see that I added a comma separator between the JSON objects, so my code doesn't work for the original input as-is. But maybe it is better to modify the generator of this file to add the comma separator?
Untested
import pandas as pd
s = '{"A":0,"B":2}{"A":3,"B":4}'
list(pd.read_json(s))
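As written this is unlikely to work, since pd.read_json expects one valid JSON document. A dependency-free alternative that genuinely handles back-to-back objects is json.JSONDecoder.raw_decode, which parses one document at a time and reports where it stopped:

```python
import json

def parse_concatenated(text):
    """Parse back-to-back JSON documents ('{...}{...}') into a list."""
    decoder = json.JSONDecoder()
    pos, results = 0, []
    text = text.strip()
    while pos < len(text):
        obj, pos = decoder.raw_decode(text, pos)
        results.append(obj)
        while pos < len(text) and text[pos].isspace():
            pos += 1  # skip whitespace between documents
    return results

print(parse_concatenated('{"A":0,"B":2}{"A":3,"B":4}'))
# [{'A': 0, 'B': 2}, {'A': 3, 'B': 4}]
```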
I have a txt file that contains a dictionary in Python and I have opened it in the following manner:
with open('file') as f:
    test = list(f)
The result when I look at test is a list of one element. This first element is a string of the dictionary (which also contains other dictionaries), so it looks like:
["ID": 1, date: "2016-01-01", "A": {name: "Steve", id: "534", players:{last: "Smith", first: "Joe", job: "IT"}}
Is there any way to store this as the dictionary without having to find a way to determine the indices of the characters where the different keys and corresponding values begin/end? Or is it possible to read in the file in a way that recognizes the data as a dictionary?
If you are reading a json file then you can use the json module.
import json
with open('data.json') as f:
    data = json.load(f)
If you are sure that the file you are reading contains python dictionaries, then you can use the built-in ast.literal_eval to convert those strings to a Python dictionary:
>>> import ast
>>> a = ast.literal_eval("{'a' : '1', 'b' : '2'}")
>>> a
{'a': '1', 'b': '2'}
>>> type(a)
<type 'dict'>
There is an alternative method, eval. But using ast.literal_eval would be better. This answer will explain why.
You can use the json module
for Writing
import json
data = {"ID": 1, "date": "2016-01-01", "A": {"name": "Steve", "id": "534", "players": {"last": "Smith", "first": "Joe", "job": "IT"}}}
with open("out.json", "w") as f:
    json.dump(data, f)
for Reading
import json
with open("out.json", "r") as f:
    data = json.load(f)
print(data)
Just use eval() when you read it from your file (though note that ast.literal_eval, mentioned above, is safer for untrusted input). For example:
>>> f = open('file.txt', 'r').read()
>>> mydict = eval(f)
>>> type(f)
<class 'str'>
>>> type(mydict)
<class 'dict'>
The Python interpreter thinks you are just trying to read an external file as text; it does not know that your file contains formatted content. One way to import it easily as a dictionary is to write a second Python file that contains the dictionary:
# mydict.py
myImportedDict = {
"ID": 1,
"date": "2016-01-01",
"A": {
"name": "Steve",
"id": "534",
"players": {
"last": "Smith",
"first": "Joe",
"job": "IT"
}
}
}
Then, you can import the dictionary and use it in another file:
# import_test.py
from mydict import myImportedDict

print(type(myImportedDict))
print(myImportedDict)
Note that older versions of Python required folders containing imported modules to contain a (possibly empty) file called __init__.py; this is generally not needed when both files are in the same directory.
If your source file is meant to be in JSON format, you can use the json library instead, which comes packaged with Python: https://docs.python.org/2/library/json.html