String transformation for JSON purposes [duplicate] - python

How do I write JSON data stored in the dictionary data to a file?
f = open('data.json', 'wb')
f.write(data)
This gives the error:
TypeError: must be string or buffer, not dict

data is a Python dictionary. It needs to be encoded as JSON before writing.
Use this for maximum compatibility (Python 2 and 3):
import json
with open('data.json', 'w') as f:
json.dump(data, f)
On a modern system (i.e. Python 3 and UTF-8 support), you can write a nicer file using:
import json
with open('data.json', 'w', encoding='utf-8') as f:
json.dump(data, f, ensure_ascii=False, indent=4)
See json documentation.

To get utf8-encoded file as opposed to ascii-encoded in the accepted answer for Python 2 use:
import io, json
with io.open('data.txt', 'w', encoding='utf-8') as f:
f.write(json.dumps(data, ensure_ascii=False))
The code is simpler in Python 3:
import json
with open('data.txt', 'w') as f:
json.dump(data, f, ensure_ascii=False)
On Windows, the encoding='utf-8' argument to open is still necessary.
To avoid storing an encoded copy of the data in memory (result of dumps) and to output utf8-encoded bytestrings in both Python 2 and 3, use:
import json, codecs
with open('data.txt', 'wb') as f:
json.dump(data, codecs.getwriter('utf-8')(f), ensure_ascii=False)
The codecs.getwriter call is redundant in Python 3 but required for Python 2
Readability and size:
The use of ensure_ascii=False gives better readability and smaller size:
>>> json.dumps({'price': '€10'})
'{"price": "\\u20ac10"}'
>>> json.dumps({'price': '€10'}, ensure_ascii=False)
'{"price": "€10"}'
>>> len(json.dumps({'абвгд': 1}))
37
>>> len(json.dumps({'абвгд': 1}, ensure_ascii=False).encode('utf8'))
17
Further improve readability by adding flags indent=4, sort_keys=True (as suggested by dinos66) to arguments of dump or dumps. This way you'll get a nicely indented sorted structure in the json file at the cost of a slightly larger file size.

I would answer with slight modification with aforementioned answers and that is to write a prettified JSON file which human eyes can read better. For this, pass sort_keys as True and indent with 4 space characters and you are good to go. Also take care of ensuring that the ascii codes will not be written in your JSON file:
with open('data.txt', 'w') as out_file:
json.dump(json_data, out_file, sort_keys = True, indent = 4,
ensure_ascii = False)

Read and write JSON files with Python 2+3; works with unicode
# -*- coding: utf-8 -*-
import json
# Make it work for Python 2+3 and with Unicode
import io
try:
to_unicode = unicode
except NameError:
to_unicode = str
# Define data
data = {'a list': [1, 42, 3.141, 1337, 'help', u'€'],
'a string': 'bla',
'another dict': {'foo': 'bar',
'key': 'value',
'the answer': 42}}
# Write JSON file
with io.open('data.json', 'w', encoding='utf8') as outfile:
str_ = json.dumps(data,
indent=4, sort_keys=True,
separators=(',', ': '), ensure_ascii=False)
outfile.write(to_unicode(str_))
# Read JSON file
with open('data.json') as data_file:
data_loaded = json.load(data_file)
print(data == data_loaded)
Explanation of the parameters of json.dump:
indent: Use 4 spaces to indent each entry, e.g. when a new dict is started (otherwise all will be in one line),
sort_keys: sort the keys of dictionaries. This is useful if you want to compare json files with a diff tool / put them under version control.
separators: To prevent Python from adding trailing whitespaces
With a package
Have a look at my utility package mpu for a super simple and easy to remember one:
import mpu.io
data = mpu.io.read('example.json')
mpu.io.write('example.json', data)
Created JSON file
{
"a list":[
1,
42,
3.141,
1337,
"help",
"€"
],
"a string":"bla",
"another dict":{
"foo":"bar",
"key":"value",
"the answer":42
}
}
Common file endings
.json
Alternatives
CSV: Super simple format (read & write)
JSON: Nice for writing human-readable data; VERY commonly used (read & write)
YAML: YAML is a superset of JSON, but easier to read (read & write, comparison of JSON and YAML)
pickle: A Python serialization format (read & write)
MessagePack (Python package): More compact representation (read & write)
HDF5 (Python package): Nice for matrices (read & write)
XML: exists too *sigh* (read & write)
For your application, the following might be important:
Support by other programming languages
Reading / writing performance
Compactness (file size)
See also: Comparison of data serialization formats
In case you are rather looking for a way to make configuration files, you might want to read my short article Configuration files in Python

For those of you who are trying to dump greek or other "exotic" languages such as me but are also having problems (unicode errors) with weird characters such as the peace symbol (\u262E) or others which are often contained in json formated data such as Twitter's, the solution could be as follows (sort_keys is obviously optional):
import codecs, json
with codecs.open('data.json', 'w', 'utf8') as f:
f.write(json.dumps(data, sort_keys = True, ensure_ascii=False))

I don't have enough reputation to add in comments, so I just write some of my findings of this annoying TypeError here:
Basically, I think it's a bug in the json.dump() function in Python 2 only - It can't dump a Python (dictionary / list) data containing non-ASCII characters, even you open the file with the encoding = 'utf-8' parameter. (i.e. No matter what you do). But, json.dumps() works on both Python 2 and 3.
To illustrate this, following up phihag's answer: the code in his answer breaks in Python 2 with exception TypeError: must be unicode, not str, if data contains non-ASCII characters. (Python 2.7.6, Debian):
import json
data = {u'\u0430\u0431\u0432\u0433\u0434': 1} #{u'абвгд': 1}
with open('data.txt', 'w') as outfile:
json.dump(data, outfile)
It however works fine in Python 3.

Write a data in file using JSON use json.dump() or json.dumps() used.
write like this to store data in file.
import json
data = [1,2,3,4,5]
with open('no.txt', 'w') as txtfile:
json.dump(data, txtfile)
this example in list is store to a file.

json.dump(data, open('data.txt', 'wb'))

To write the JSON with indentation, "pretty print":
import json
outfile = open('data.json')
json.dump(data, outfile, indent=4)
Also, if you need to debug improperly formatted JSON, and want a helpful error message, use import simplejson library, instead of import json (functions should be the same)

All previous answers are correct here is a very simple example:
#! /usr/bin/env python
import json
def write_json():
# create a dictionary
student_data = {"students":[]}
#create a list
data_holder = student_data["students"]
# just a counter
counter = 0
#loop through if you have multiple items..
while counter < 3:
data_holder.append({'id':counter})
data_holder.append({'room':counter})
counter += 1
#write the file
file_path='/tmp/student_data.json'
with open(file_path, 'w') as outfile:
print("writing file to: ",file_path)
# HERE IS WHERE THE MAGIC HAPPENS
json.dump(student_data, outfile)
outfile.close()
print("done")
write_json()

if you are trying to write a pandas dataframe into a file using a json format i'd recommend this
destination='filepath'
saveFile = open(destination, 'w')
saveFile.write(df.to_json())
saveFile.close()

The JSON data can be written to a file as follows
hist1 = [{'val_loss': [0.5139984398465246],
'val_acc': [0.8002029867684085],
'loss': [0.593220705309384],
'acc': [0.7687131817929321]},
{'val_loss': [0.46456472964199463],
'val_acc': [0.8173602046780344],
'loss': [0.4932038113037539],
'acc': [0.8063946213802453]}]
Write to a file:
with open('text1.json', 'w') as f:
json.dump(hist1, f)

The accepted answer is fine. However, I ran into "is not json serializable" error using that.
Here's how I fixed it
with open("file-name.json", 'w') as output:
output.write(str(response))
Although it is not a good fix as the json file it creates will not have double quotes, however it is great if you are looking for quick and dirty.

Before write a dictionary into a file as a json, you have to turn that dict onto json string using json library.
import json
data = {
"field1":{
"a": 10,
"b": 20,
},
"field2":{
"c": 30,
"d": 40,
},
}
json_data = json.dumps(json_data)
And also you can add indent to json data to look prettier.
json_data = json.dumps(json_data, indent=4)
If you want to sort keys before turning into json,
json_data = json.dumps(json_data, sort_keys=True)
You can use the combination of these two also.
Refer the json documentation here for much more features
Finally you can write into a json file
f = open('data.json', 'wb')
f.write(json_data)

This is just an extra hint at the usage of json.dumps (this is not an answer to the problem of the question, but a trick for those who have to dump numpy data types):
If there are NumPy data types in the dictionary, json.dumps() needs an additional parameter, credits go to TypeError: Object of type 'ndarray' is not JSON serializable, and it will also fix errors like TypeError: Object of type int64 is not JSON serializable and so on:
class NumpyEncoder(json.JSONEncoder):
""" Special json encoder for np types """
def default(self, obj):
if isinstance(obj, (np.int_, np.intc, np.intp, np.int8,
np.int16, np.int32, np.int64, np.uint8,
np.uint16, np.uint32, np.uint64)):
return int(obj)
elif isinstance(obj, (np.float_, np.float16, np.float32,
np.float64)):
return float(obj)
elif isinstance(obj, (np.ndarray,)):
return obj.tolist()
return json.JSONEncoder.default(self, obj)
And then run:
import json
#print(json.dumps(my_data[:2], indent=4, cls=NumpyEncoder)))
with open(my_dir+'/my_filename.json', 'w') as f:
json.dumps(my_data, indent=4, cls=NumpyEncoder)))
You may also want to return a string instead of a list in case of a np.array() since arrays are printed as lists that are spread over rows which will blow up the output if you have large or many arrays. The caveat: it is more difficult to access the items from the dumped dictionary later to get them back as the original array. Yet, if you do not mind having just a string of an array, this makes the dictionary more readable. Then exchange:
elif isinstance(obj, (np.ndarray,)):
return obj.tolist()
with:
elif isinstance(obj, (np.ndarray,)):
return str(obj)
or just:
else:
return str(obj)

For people liking oneliners (hence with statement is not an option), a cleaner method than leaving a dangling opened file descriptor behind can be to use write_text from pathlib and do something like below:
pathlib.Path("data.txt").write_text(json.dumps(data))
This can be handy in some cases in contexts where statements are not allowed like:
[pathlib.Path(f"data_{x}.json").write_text(json.dumps(x)) for x in [1, 2, 3]]
I'm not claiming it should be preferred to with (and it's likely slower), just another option.

Related

Writing a JSON file from dictionary, correcting the output

So I am working on a conversion file that is taking a dictionary and converting it to a JSON file. Current code looks like:
data = {json_object}
json_string = jsonpickle.encode(data)
with open('/Users/machd/Mac/Documents/VISUAL CODE/CSV_to_JSON/JSON FILES/test.json', 'w') as outfile:
json.dump(json_string, outfile)
But when I go to open that rendered file, it is adding three \ on the front and back of each string.
ps: sorry if I am using the wrong terminology, I am still new to python and don't know the vocabulary that well yet.
Try this
import json
data = {"k": "v"}
with open( 'path_to_file.json', 'w') as f:
json.dump(data, f)
You don't need to use jsonpickle to encode dict data.
The json.dump is a wrapper function that convert data to json format firstly, then write these string data to your file.
The reason why you found \\ exist between each string is that, jsonpickle have took your data to string, after which the quote(") would convert to Escape character when json.dump interact.
Just use the following code to write dict data to json
with open('/Users/machd/Mac/Documents/VISUAL CODE/CSV_to_JSON/JSON FILES/test.json', 'w') as outfile:
json.dump(data, outfile)

How to write or append at a specific key/value of an object in json via python

The question is very self explanatory.
I need to write or append at a specific key/value of an object in json via python.
I'm not sure how to do it because I'm not good with JSON but here is an example of how I tried to do it (I know it is wrong).
with open('info.json', 'a') as f:
json.dumps(data, ['key1'])
this is the json file:
{"key0":"xxxxx#gmail.com","key1":"12345678"}
A typical usage pattern for JSONs in Python is to load the JSON object into Python, edit that object, and then write the resulting object back out to file.
import json
with open('info.json', 'r') as infile:
my_data = json.load(infile)
my_data['key1'] = my_data['key1'] + 'random string'
# perform other alterations to my_data here, as appropriate ...
with open('scratch.json', 'w') as outfile:
json.dump(my_data, outfile)
Contents of 'info.json' are now
{"key0": "xxxxx#gmail.com", "key1": "12345678random string"}
The key operations were json.load(fp), which deserialized the file into a Python object in memory, and json.dump(obj, fp), which reserialized the edited object to the file being written out.
This may be unsuitable if you're editing very large JSON objects and cannot easily pull the entire object into memory at once, but if you're just trying to learn the basics of Python's JSON library it should help you get started.
An example for appending data to a json file using json library:
import json
raw = '{ "color": "green", "type": "car" }'
data_to_add = { "gear": "manual" }
parsed = json.loads(raw)
parsed.update(data_to_add)
You can then save your changes with json.dumps.

Edit dictionary to be a valid json [duplicate]

How do I write JSON data stored in the dictionary data to a file?
f = open('data.json', 'wb')
f.write(data)
This gives the error:
TypeError: must be string or buffer, not dict
data is a Python dictionary. It needs to be encoded as JSON before writing.
Use this for maximum compatibility (Python 2 and 3):
import json
with open('data.json', 'w') as f:
json.dump(data, f)
On a modern system (i.e. Python 3 and UTF-8 support), you can write a nicer file using:
import json
with open('data.json', 'w', encoding='utf-8') as f:
json.dump(data, f, ensure_ascii=False, indent=4)
See json documentation.
To get utf8-encoded file as opposed to ascii-encoded in the accepted answer for Python 2 use:
import io, json
with io.open('data.txt', 'w', encoding='utf-8') as f:
f.write(json.dumps(data, ensure_ascii=False))
The code is simpler in Python 3:
import json
with open('data.txt', 'w') as f:
json.dump(data, f, ensure_ascii=False)
On Windows, the encoding='utf-8' argument to open is still necessary.
To avoid storing an encoded copy of the data in memory (result of dumps) and to output utf8-encoded bytestrings in both Python 2 and 3, use:
import json, codecs
with open('data.txt', 'wb') as f:
json.dump(data, codecs.getwriter('utf-8')(f), ensure_ascii=False)
The codecs.getwriter call is redundant in Python 3 but required for Python 2
Readability and size:
The use of ensure_ascii=False gives better readability and smaller size:
>>> json.dumps({'price': '€10'})
'{"price": "\\u20ac10"}'
>>> json.dumps({'price': '€10'}, ensure_ascii=False)
'{"price": "€10"}'
>>> len(json.dumps({'абвгд': 1}))
37
>>> len(json.dumps({'абвгд': 1}, ensure_ascii=False).encode('utf8'))
17
Further improve readability by adding flags indent=4, sort_keys=True (as suggested by dinos66) to arguments of dump or dumps. This way you'll get a nicely indented sorted structure in the json file at the cost of a slightly larger file size.
I would answer with slight modification with aforementioned answers and that is to write a prettified JSON file which human eyes can read better. For this, pass sort_keys as True and indent with 4 space characters and you are good to go. Also take care of ensuring that the ascii codes will not be written in your JSON file:
with open('data.txt', 'w') as out_file:
json.dump(json_data, out_file, sort_keys = True, indent = 4,
ensure_ascii = False)
Read and write JSON files with Python 2+3; works with unicode
# -*- coding: utf-8 -*-
import json
# Make it work for Python 2+3 and with Unicode
import io
try:
to_unicode = unicode
except NameError:
to_unicode = str
# Define data
data = {'a list': [1, 42, 3.141, 1337, 'help', u'€'],
'a string': 'bla',
'another dict': {'foo': 'bar',
'key': 'value',
'the answer': 42}}
# Write JSON file
with io.open('data.json', 'w', encoding='utf8') as outfile:
str_ = json.dumps(data,
indent=4, sort_keys=True,
separators=(',', ': '), ensure_ascii=False)
outfile.write(to_unicode(str_))
# Read JSON file
with open('data.json') as data_file:
data_loaded = json.load(data_file)
print(data == data_loaded)
Explanation of the parameters of json.dump:
indent: Use 4 spaces to indent each entry, e.g. when a new dict is started (otherwise all will be in one line),
sort_keys: sort the keys of dictionaries. This is useful if you want to compare json files with a diff tool / put them under version control.
separators: To prevent Python from adding trailing whitespaces
With a package
Have a look at my utility package mpu for a super simple and easy to remember one:
import mpu.io
data = mpu.io.read('example.json')
mpu.io.write('example.json', data)
Created JSON file
{
"a list":[
1,
42,
3.141,
1337,
"help",
"€"
],
"a string":"bla",
"another dict":{
"foo":"bar",
"key":"value",
"the answer":42
}
}
Common file endings
.json
Alternatives
CSV: Super simple format (read & write)
JSON: Nice for writing human-readable data; VERY commonly used (read & write)
YAML: YAML is a superset of JSON, but easier to read (read & write, comparison of JSON and YAML)
pickle: A Python serialization format (read & write)
MessagePack (Python package): More compact representation (read & write)
HDF5 (Python package): Nice for matrices (read & write)
XML: exists too *sigh* (read & write)
For your application, the following might be important:
Support by other programming languages
Reading / writing performance
Compactness (file size)
See also: Comparison of data serialization formats
In case you are rather looking for a way to make configuration files, you might want to read my short article Configuration files in Python
For those of you who are trying to dump greek or other "exotic" languages such as me but are also having problems (unicode errors) with weird characters such as the peace symbol (\u262E) or others which are often contained in json formated data such as Twitter's, the solution could be as follows (sort_keys is obviously optional):
import codecs, json
with codecs.open('data.json', 'w', 'utf8') as f:
f.write(json.dumps(data, sort_keys = True, ensure_ascii=False))
I don't have enough reputation to add in comments, so I just write some of my findings of this annoying TypeError here:
Basically, I think it's a bug in the json.dump() function in Python 2 only - It can't dump a Python (dictionary / list) data containing non-ASCII characters, even you open the file with the encoding = 'utf-8' parameter. (i.e. No matter what you do). But, json.dumps() works on both Python 2 and 3.
To illustrate this, following up phihag's answer: the code in his answer breaks in Python 2 with exception TypeError: must be unicode, not str, if data contains non-ASCII characters. (Python 2.7.6, Debian):
import json
data = {u'\u0430\u0431\u0432\u0433\u0434': 1} #{u'абвгд': 1}
with open('data.txt', 'w') as outfile:
json.dump(data, outfile)
It however works fine in Python 3.
Write a data in file using JSON use json.dump() or json.dumps() used.
write like this to store data in file.
import json
data = [1,2,3,4,5]
with open('no.txt', 'w') as txtfile:
json.dump(data, txtfile)
this example in list is store to a file.
json.dump(data, open('data.txt', 'wb'))
To write the JSON with indentation, "pretty print":
import json
outfile = open('data.json')
json.dump(data, outfile, indent=4)
Also, if you need to debug improperly formatted JSON, and want a helpful error message, use import simplejson library, instead of import json (functions should be the same)
All previous answers are correct here is a very simple example:
#! /usr/bin/env python
import json
def write_json():
# create a dictionary
student_data = {"students":[]}
#create a list
data_holder = student_data["students"]
# just a counter
counter = 0
#loop through if you have multiple items..
while counter < 3:
data_holder.append({'id':counter})
data_holder.append({'room':counter})
counter += 1
#write the file
file_path='/tmp/student_data.json'
with open(file_path, 'w') as outfile:
print("writing file to: ",file_path)
# HERE IS WHERE THE MAGIC HAPPENS
json.dump(student_data, outfile)
outfile.close()
print("done")
write_json()
if you are trying to write a pandas dataframe into a file using a json format i'd recommend this
destination='filepath'
saveFile = open(destination, 'w')
saveFile.write(df.to_json())
saveFile.close()
The JSON data can be written to a file as follows
hist1 = [{'val_loss': [0.5139984398465246],
'val_acc': [0.8002029867684085],
'loss': [0.593220705309384],
'acc': [0.7687131817929321]},
{'val_loss': [0.46456472964199463],
'val_acc': [0.8173602046780344],
'loss': [0.4932038113037539],
'acc': [0.8063946213802453]}]
Write to a file:
with open('text1.json', 'w') as f:
json.dump(hist1, f)
The accepted answer is fine. However, I ran into "is not json serializable" error using that.
Here's how I fixed it
with open("file-name.json", 'w') as output:
output.write(str(response))
Although it is not a good fix as the json file it creates will not have double quotes, however it is great if you are looking for quick and dirty.
Before write a dictionary into a file as a json, you have to turn that dict onto json string using json library.
import json
data = {
"field1":{
"a": 10,
"b": 20,
},
"field2":{
"c": 30,
"d": 40,
},
}
json_data = json.dumps(json_data)
And also you can add indent to json data to look prettier.
json_data = json.dumps(json_data, indent=4)
If you want to sort keys before turning into json,
json_data = json.dumps(json_data, sort_keys=True)
You can use the combination of these two also.
Refer the json documentation here for much more features
Finally you can write into a json file
f = open('data.json', 'wb')
f.write(json_data)
This is just an extra hint at the usage of json.dumps (this is not an answer to the problem of the question, but a trick for those who have to dump numpy data types):
If there are NumPy data types in the dictionary, json.dumps() needs an additional parameter, credits go to TypeError: Object of type 'ndarray' is not JSON serializable, and it will also fix errors like TypeError: Object of type int64 is not JSON serializable and so on:
class NumpyEncoder(json.JSONEncoder):
""" Special json encoder for np types """
def default(self, obj):
if isinstance(obj, (np.int_, np.intc, np.intp, np.int8,
np.int16, np.int32, np.int64, np.uint8,
np.uint16, np.uint32, np.uint64)):
return int(obj)
elif isinstance(obj, (np.float_, np.float16, np.float32,
np.float64)):
return float(obj)
elif isinstance(obj, (np.ndarray,)):
return obj.tolist()
return json.JSONEncoder.default(self, obj)
And then run:
import json
#print(json.dumps(my_data[:2], indent=4, cls=NumpyEncoder)))
with open(my_dir+'/my_filename.json', 'w') as f:
json.dumps(my_data, indent=4, cls=NumpyEncoder)))
You may also want to return a string instead of a list in case of a np.array() since arrays are printed as lists that are spread over rows which will blow up the output if you have large or many arrays. The caveat: it is more difficult to access the items from the dumped dictionary later to get them back as the original array. Yet, if you do not mind having just a string of an array, this makes the dictionary more readable. Then exchange:
elif isinstance(obj, (np.ndarray,)):
return obj.tolist()
with:
elif isinstance(obj, (np.ndarray,)):
return str(obj)
or just:
else:
return str(obj)
For people liking oneliners (hence with statement is not an option), a cleaner method than leaving a dangling opened file descriptor behind can be to use write_text from pathlib and do something like below:
pathlib.Path("data.txt").write_text(json.dumps(data))
This can be handy in some cases in contexts where statements are not allowed like:
[pathlib.Path(f"data_{x}.json").write_text(json.dumps(x)) for x in [1, 2, 3]]
I'm not claiming it should be preferred to with (and it's likely slower), just another option.

json.dump failing to alter file [duplicate]

How do I write JSON data stored in the dictionary data to a file?
f = open('data.json', 'wb')
f.write(data)
This gives the error:
TypeError: must be string or buffer, not dict
data is a Python dictionary. It needs to be encoded as JSON before writing.
Use this for maximum compatibility (Python 2 and 3):
import json
with open('data.json', 'w') as f:
json.dump(data, f)
On a modern system (i.e. Python 3 and UTF-8 support), you can write a nicer file using:
import json
with open('data.json', 'w', encoding='utf-8') as f:
json.dump(data, f, ensure_ascii=False, indent=4)
See json documentation.
To get utf8-encoded file as opposed to ascii-encoded in the accepted answer for Python 2 use:
import io, json
with io.open('data.txt', 'w', encoding='utf-8') as f:
f.write(json.dumps(data, ensure_ascii=False))
The code is simpler in Python 3:
import json
with open('data.txt', 'w') as f:
json.dump(data, f, ensure_ascii=False)
On Windows, the encoding='utf-8' argument to open is still necessary.
To avoid storing an encoded copy of the data in memory (result of dumps) and to output utf8-encoded bytestrings in both Python 2 and 3, use:
import json, codecs
with open('data.txt', 'wb') as f:
json.dump(data, codecs.getwriter('utf-8')(f), ensure_ascii=False)
The codecs.getwriter call is redundant in Python 3 but required for Python 2
Readability and size:
The use of ensure_ascii=False gives better readability and smaller size:
>>> json.dumps({'price': '€10'})
'{"price": "\\u20ac10"}'
>>> json.dumps({'price': '€10'}, ensure_ascii=False)
'{"price": "€10"}'
>>> len(json.dumps({'абвгд': 1}))
37
>>> len(json.dumps({'абвгд': 1}, ensure_ascii=False).encode('utf8'))
17
Further improve readability by adding flags indent=4, sort_keys=True (as suggested by dinos66) to arguments of dump or dumps. This way you'll get a nicely indented sorted structure in the json file at the cost of a slightly larger file size.
I would answer with slight modification with aforementioned answers and that is to write a prettified JSON file which human eyes can read better. For this, pass sort_keys as True and indent with 4 space characters and you are good to go. Also take care of ensuring that the ascii codes will not be written in your JSON file:
with open('data.txt', 'w') as out_file:
json.dump(json_data, out_file, sort_keys = True, indent = 4,
ensure_ascii = False)
Read and write JSON files with Python 2+3; works with unicode
# -*- coding: utf-8 -*-
import json
# Make it work for Python 2+3 and with Unicode
import io
try:
to_unicode = unicode
except NameError:
to_unicode = str
# Define data
data = {'a list': [1, 42, 3.141, 1337, 'help', u'€'],
'a string': 'bla',
'another dict': {'foo': 'bar',
'key': 'value',
'the answer': 42}}
# Write JSON file
with io.open('data.json', 'w', encoding='utf8') as outfile:
str_ = json.dumps(data,
indent=4, sort_keys=True,
separators=(',', ': '), ensure_ascii=False)
outfile.write(to_unicode(str_))
# Read JSON file
with open('data.json') as data_file:
data_loaded = json.load(data_file)
print(data == data_loaded)
Explanation of the parameters of json.dump:
indent: Use 4 spaces to indent each entry, e.g. when a new dict is started (otherwise all will be in one line),
sort_keys: sort the keys of dictionaries. This is useful if you want to compare json files with a diff tool / put them under version control.
separators: To prevent Python from adding trailing whitespaces
With a package
Have a look at my utility package mpu for a super simple and easy to remember one:
import mpu.io
data = mpu.io.read('example.json')
mpu.io.write('example.json', data)
Created JSON file
{
"a list":[
1,
42,
3.141,
1337,
"help",
"€"
],
"a string":"bla",
"another dict":{
"foo":"bar",
"key":"value",
"the answer":42
}
}
Common file endings
.json
Alternatives
CSV: Super simple format (read & write)
JSON: Nice for writing human-readable data; VERY commonly used (read & write)
YAML: YAML is a superset of JSON, but easier to read (read & write, comparison of JSON and YAML)
pickle: A Python serialization format (read & write)
MessagePack (Python package): More compact representation (read & write)
HDF5 (Python package): Nice for matrices (read & write)
XML: exists too *sigh* (read & write)
For your application, the following might be important:
Support by other programming languages
Reading / writing performance
Compactness (file size)
See also: Comparison of data serialization formats
In case you are rather looking for a way to make configuration files, you might want to read my short article Configuration files in Python
For those of you who are trying to dump greek or other "exotic" languages such as me but are also having problems (unicode errors) with weird characters such as the peace symbol (\u262E) or others which are often contained in json formated data such as Twitter's, the solution could be as follows (sort_keys is obviously optional):
import codecs, json
with codecs.open('data.json', 'w', 'utf8') as f:
f.write(json.dumps(data, sort_keys = True, ensure_ascii=False))
I don't have enough reputation to add in comments, so I just write some of my findings of this annoying TypeError here:
Basically, I think it's a bug in the json.dump() function in Python 2 only - It can't dump a Python (dictionary / list) data containing non-ASCII characters, even you open the file with the encoding = 'utf-8' parameter. (i.e. No matter what you do). But, json.dumps() works on both Python 2 and 3.
To illustrate this, following up phihag's answer: the code in his answer breaks in Python 2 with exception TypeError: must be unicode, not str, if data contains non-ASCII characters. (Python 2.7.6, Debian):
import json
data = {u'\u0430\u0431\u0432\u0433\u0434': 1} #{u'абвгд': 1}
with open('data.txt', 'w') as outfile:
json.dump(data, outfile)
It however works fine in Python 3.
Write a data in file using JSON use json.dump() or json.dumps() used.
write like this to store data in file.
import json
data = [1,2,3,4,5]
with open('no.txt', 'w') as txtfile:
json.dump(data, txtfile)
this example in list is store to a file.
json.dump(data, open('data.txt', 'wb'))
To write the JSON with indentation, "pretty print":
import json
outfile = open('data.json')
json.dump(data, outfile, indent=4)
Also, if you need to debug improperly formatted JSON, and want a helpful error message, use import simplejson library, instead of import json (functions should be the same)
All previous answers are correct here is a very simple example:
#! /usr/bin/env python
import json
def write_json():
# create a dictionary
student_data = {"students":[]}
#create a list
data_holder = student_data["students"]
# just a counter
counter = 0
#loop through if you have multiple items..
while counter < 3:
data_holder.append({'id':counter})
data_holder.append({'room':counter})
counter += 1
#write the file
file_path='/tmp/student_data.json'
with open(file_path, 'w') as outfile:
print("writing file to: ",file_path)
# HERE IS WHERE THE MAGIC HAPPENS
json.dump(student_data, outfile)
outfile.close()
print("done")
write_json()
if you are trying to write a pandas dataframe into a file using a json format i'd recommend this
destination='filepath'
saveFile = open(destination, 'w')
saveFile.write(df.to_json())
saveFile.close()
The JSON data can be written to a file as follows
hist1 = [{'val_loss': [0.5139984398465246],
'val_acc': [0.8002029867684085],
'loss': [0.593220705309384],
'acc': [0.7687131817929321]},
{'val_loss': [0.46456472964199463],
'val_acc': [0.8173602046780344],
'loss': [0.4932038113037539],
'acc': [0.8063946213802453]}]
Write to a file:
with open('text1.json', 'w') as f:
json.dump(hist1, f)
The accepted answer is fine. However, I ran into "is not json serializable" error using that.
Here's how I fixed it
with open("file-name.json", 'w') as output:
output.write(str(response))
Although it is not a good fix as the json file it creates will not have double quotes, however it is great if you are looking for quick and dirty.
Before write a dictionary into a file as a json, you have to turn that dict onto json string using json library.
import json
data = {
"field1":{
"a": 10,
"b": 20,
},
"field2":{
"c": 30,
"d": 40,
},
}
json_data = json.dumps(json_data)
And also you can add indent to json data to look prettier.
json_data = json.dumps(json_data, indent=4)
If you want to sort keys before turning into json,
json_data = json.dumps(json_data, sort_keys=True)
You can use the combination of these two also.
Refer the json documentation here for much more features
Finally you can write into a json file
f = open('data.json', 'wb')
f.write(json_data)
This is just an extra hint at the usage of json.dumps (this is not an answer to the problem of the question, but a trick for those who have to dump numpy data types):
If there are NumPy data types in the dictionary, json.dumps() needs an additional parameter, credits go to TypeError: Object of type 'ndarray' is not JSON serializable, and it will also fix errors like TypeError: Object of type int64 is not JSON serializable and so on:
class NumpyEncoder(json.JSONEncoder):
""" Special json encoder for np types """
def default(self, obj):
if isinstance(obj, (np.int_, np.intc, np.intp, np.int8,
np.int16, np.int32, np.int64, np.uint8,
np.uint16, np.uint32, np.uint64)):
return int(obj)
elif isinstance(obj, (np.float_, np.float16, np.float32,
np.float64)):
return float(obj)
elif isinstance(obj, (np.ndarray,)):
return obj.tolist()
return json.JSONEncoder.default(self, obj)
And then run:
import json
#print(json.dumps(my_data[:2], indent=4, cls=NumpyEncoder)))
with open(my_dir+'/my_filename.json', 'w') as f:
json.dumps(my_data, indent=4, cls=NumpyEncoder)))
You may also want to return a string instead of a list in case of a np.array() since arrays are printed as lists that are spread over rows which will blow up the output if you have large or many arrays. The caveat: it is more difficult to access the items from the dumped dictionary later to get them back as the original array. Yet, if you do not mind having just a string of an array, this makes the dictionary more readable. Then exchange:
elif isinstance(obj, (np.ndarray,)):
return obj.tolist()
with:
elif isinstance(obj, (np.ndarray,)):
return str(obj)
or just:
else:
return str(obj)
For people liking oneliners (hence with statement is not an option), a cleaner method than leaving a dangling opened file descriptor behind can be to use write_text from pathlib and do something like below:
pathlib.Path("data.txt").write_text(json.dumps(data))
This can be handy in some cases in contexts where statements are not allowed like:
[pathlib.Path(f"data_{x}.json").write_text(json.dumps(x)) for x in [1, 2, 3]]
I'm not claiming it should be preferred to with (and it's likely slower), just another option.

Pretty-Print JSON Data to a File using Python

A project for class involves parsing Twitter JSON data. I'm getting the data and setting it to the file without much trouble, but it's all in one line. This is fine for the data manipulation I'm trying to do, but the file is ridiculously hard to read and I can't examine it very well, making the code writing for the data manipulation part very difficult.
Does anyone know how to do that from within Python (i.e. not using the command line tool, which I can't get to work)? Here's my code so far:
header, output = client.request(twitterRequest, method="GET", body=None,
headers=None, force_auth_header=True)
# now write output to a file
twitterDataFile = open("twitterData.json", "wb")
# magic happens here to make it pretty-printed
twitterDataFile.write(output)
twitterDataFile.close()
Note I appreciate people pointing me to simplejson documentation and such, but as I have stated, I have already looked at that and continue to need assistance. A truly helpful reply will be more detailed and explanatory than the examples found there. Thanks
Also:
Trying this in the windows command line:
more twitterData.json | python -mjson.tool > twitterData-pretty.json
results in this:
Invalid control character at: line 1 column 65535 (char 65535)
I'd give you the data I'm using, but it's very large and you've already seen the code I used to make the file.
You should use the optional argument indent.
header, output = client.request(twitterRequest, method="GET", body=None,
headers=None, force_auth_header=True)
# now write output to a file
twitterDataFile = open("twitterData.json", "w")
# magic happens here to make it pretty-printed
twitterDataFile.write(simplejson.dumps(simplejson.loads(output), indent=4, sort_keys=True))
twitterDataFile.close()
You can parse the JSON, then output it again with indents like this:
import json
mydata = json.loads(output)
print json.dumps(mydata, indent=4)
See http://docs.python.org/library/json.html for more info.
import json
with open("twitterdata.json", "w") as twitter_data_file:
json.dump(output, twitter_data_file, indent=4, sort_keys=True)
You don't need json.dumps() if you don't want to parse the string later, just simply use json.dump(). It's faster too.
You can use json module of python to pretty print.
>>> import json
>>> print json.dumps({'4': 5, '6': 7}, sort_keys=True, indent=4)
{
"4": 5,
"6": 7
}
So, in your case
>>> print json.dumps(json_output, indent=4)
If you are generating new *.json or modifying existing josn file the use "indent" parameter for pretty view json format.
import json
responseData = json.loads(output)
with open('twitterData.json','w') as twitterDataFile:
json.dump(responseData, twitterDataFile, indent=4)
If you already have existing JSON files which you want to pretty format you could use this:
with open('twitterdata.json', 'r+') as f:
data = json.load(f)
f.seek(0)
json.dump(data, f, indent=4)
f.truncate()
import json
def writeToFile(logData, fileName, openOption="w"):
file = open(fileName, openOption)
file.write(json.dumps(json.loads(logData), indent=4))
file.close()
You could redirect a file to python and open using the tool and to read it use more.
The sample code will be,
cat filename.json | python -m json.tool | more

Categories