How to write human-readable data to a JSON file - python

When I export data from Python to a JSON file, it contains characters like:
{"-": "text", "menu": {"-": "node", "id": 2244676, "prev": "[2/40] \u0d2a\u0d4d\u0d30\u0d2f\u0d4b\u0d1c\u0d15 \u0d15\u0d4d\u0d30\u0d3f\u0d2f
I used
with open('messages.json', 'w') as outfile:
    json.dump(all_messages, outfile, cls=DateTimeEncoder)
in Python. How do I convert it to normal Unicode text?

If you want the output JSON to be human-readable, use UTF-8 encoding and the ensure_ascii=False parameter:
with open('messages.json', 'w', encoding='utf8') as outfile:
    json.dump(all_messages, outfile, cls=DateTimeEncoder, ensure_ascii=False)
If you just want to read the data back in again, json.load will convert it back to Unicode:
with open('messages.json', encoding='utf8') as infile:
    data = json.load(infile)
Examples with simple strings:
>>> s = '[2/40] പ്രയോജക ക്രിയ'
>>> print(json.dumps(s))
"[2/40] \u0d2a\u0d4d\u0d30\u0d2f\u0d4b\u0d1c\u0d15 \u0d15\u0d4d\u0d30\u0d3f\u0d2f"
>>> print(json.dumps(s,ensure_ascii=False))
"[2/40] പ്രയോജക ക്രിയ"
>>> out = json.dumps(s)
>>> out
'"[2/40] \\u0d2a\\u0d4d\\u0d30\\u0d2f\\u0d4b\\u0d1c\\u0d15 \\u0d15\\u0d4d\\u0d30\\u0d3f\\u0d2f"'
>>> json.loads(out)
'[2/40] പ്രയോജക ക്രിയ'
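The same round trip can be tried end to end with a file. This is only a sketch: the question never shows DateTimeEncoder, so the class below is a hypothetical stand-in that turns datetime values into ISO strings, and the sample data and temporary path are made up for illustration.

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path


class DateTimeEncoder(json.JSONEncoder):
    """Hypothetical stand-in for the question's encoder: datetimes become ISO strings."""

    def default(self, o):
        if isinstance(o, datetime):
            return o.isoformat()
        return super().default(o)


# Sample data (assumed shape, not the asker's real messages).
all_messages = {"prev": "[2/40] പ്രയോജക ക്രിയ", "date": datetime(2020, 1, 2, 3, 4, 5)}

path = Path(tempfile.mkdtemp()) / "messages.json"

# ensure_ascii=False keeps the Malayalam text as-is instead of \uXXXX escapes.
with open(path, "w", encoding="utf8") as outfile:
    json.dump(all_messages, outfile, cls=DateTimeEncoder, ensure_ascii=False)

raw_text = path.read_text(encoding="utf8")

# Reading back yields the same string either way.
with open(path, encoding="utf8") as infile:
    loaded = json.load(infile)
```

Opening the resulting file in a UTF-8 aware editor should show the readable text rather than escape sequences.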

Related

Loading json file and decoding in UTF-16

This is a better way of wording my question:
I'm trying to read UTF-16 characters (English and Arabic) from a .json.gz file in Python 2.7.
The code I have written reads UTF-8 characters:
import glob
import json
import gzip
print("Reading input JSON files")
for filename in glob.glob("*api*.json.gz"):
    with gzip.open(filename,'r') as f:
        data = json.loads(f.read().decode('utf-8'))
I tried simply replacing utf-8 with utf-16, but I got this error:
ValueError: No JSON object could be decoded
Any help would be appreciated.
Specify the encoding as a part of open(). Here is a "round-trip demo":
>>> import json
>>> data = {
... "title": "قالت وزارة الداخلية المصرية إن كمية من المتفجرات في سيارة كانت معدة لتنفيذ عملية إرهابية أدت إلى الانفجار الذي وقع وسط القاهرة وأودى بحياة نحو 20 شخصا."
... }
>>> with open("/tmp/utf16demo.json", "w", encoding="utf-16") as f:
...     json.dump(data, f)
...
>>> with open("/tmp/utf16demo.json", encoding="utf-16") as f:
...     newdata = json.load(f)
...
>>> next(iter(newdata.values())) == next(iter(data.values()))
True
As mentioned in the comments, just because the data is originally UTF-16 encoded does not mean you need to write it back out in the same encoding. You are perfectly free to load and decode using UTF-16, but then write out using UTF-8.
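For the gzip case from the question, a sketch along these lines may help. It assumes Python 3 (where gzip.open accepts a text mode and an encoding), and the file names and sample data are invented for the demo.

```python
import gzip
import json
import tempfile
from pathlib import Path

workdir = Path(tempfile.mkdtemp())
src = workdir / "demo_api.json.gz"    # hypothetical input file name
dst = workdir / "demo_api_utf8.json"  # hypothetical output file name

data = {"title": "قالت وزارة الداخلية", "lang": "ar"}

# Create a UTF-16 encoded, gzip-compressed JSON file to work with.
with gzip.open(src, "wt", encoding="utf-16") as f:
    json.dump(data, f, ensure_ascii=False)

# Text mode ("rt") makes gzip do the decoding, so json.load receives str.
with gzip.open(src, "rt", encoding="utf-16") as f:
    loaded = json.load(f)

# Write the same data back out as plain UTF-8.
with open(dst, "w", encoding="utf-8") as f:
    json.dump(loaded, f, ensure_ascii=False)
```

The encoding passed when reading has to match how the file was actually written; the output encoding is an independent choice.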
If the JSON file itself is UTF-8 encoded, for example an intents.json containing:
{"intents": [
{"tag": "greeting",
"patterns": ["هاي","عامل إيه","ايه اخبارك","ازيك"],
"responses": ["هاي!","كويس","حمدالله","ماشي الحال وإنت ??"],
"context_set": ""
}
]
}
pass the matching encoding when opening it:
import json
with open("intents.json", encoding="utf-8") as f:
    intents = json.load(f)

Python: print to a file instead of the screen

I have the following Python script (it converts XML to, for example, JSON):
import xmltodict
import pprint
import json
with open('file.xml') as fd:
    doc = xmltodict.parse(fd.read())
pp = pprint.PrettyPrinter(indent=4)
pp.pprint(json.dumps(doc))
When I run this code it prints the JSON to the screen. How can I write the output to output.json instead?
Thanks!
To format the JSON with indentation, use the indent argument of json.dumps:
with open('file.xml', 'r') as src_file, open('file.json', 'w') as dst_file:
    doc = xmltodict.parse(src_file.read())  # read and parse the XML file
    dst_file.write(json.dumps(doc, indent=4))  # write the formatted JSON
To write JSON to a file:
with open("your_output_file.json", "w") as f:
    f.write(json.dumps(doc))
To read JSON from a file:
with open("your_output_file.json") as f:
    d = json.load(f)
To write your dictionary dct to a file, use json.dump:
with open("output.json", "w") as f:
    json.dump(dct, f)
To read your dictionary from a file, use json.load (open the file in read mode; "w+" would truncate it first):
with open("output.json", "r") as f:
    dct = json.load(f)
Combining both examples:
In [8]: import json
In [9]: dct = {'a':'b','c':'d'}
In [10]: with open("output.json", "w") as f:
    ...:     json.dump(dct,f)
    ...:
In [11]: with open("output.json", "r") as f:
    ...:     print(json.load(f))
    ...:
{'a': 'b', 'c': 'd'}
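One detail worth checking when mixing these snippets is the mode passed to open. The sketch below uses a plain nested dict as a stand-in for the parsed XML document (xmltodict is left out so the example only needs the standard library) and a temporary path chosen for the demo.

```python
import json
import tempfile
from pathlib import Path

path = Path(tempfile.mkdtemp()) / "output.json"
dct = {"a": "b", "c": {"d": [1, 2, 3]}}  # stand-in for a parsed XML document

with open(path, "w") as f:
    json.dump(dct, f, indent=4)  # indent=4 pretty-prints the file

with open(path, "r") as f:  # "r" leaves the contents alone
    loaded = json.load(f)

# Opening with "w+" empties the file before anything can be read from it.
with open(path, "w+") as f:
    contents_after_w_plus = f.read()
```

After the last block the file is empty, which is why reading is done with "r" (the default mode).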

Converting a dictionary to json having persian characters

I'm trying to convert a dictionary containing Persian characters to JSON, but I get question marks instead of the characters. My dictionary looks like this:
bycommunity = {"0": [{"60357": "این یک پیام است"}]}
with open('data.json', 'wb') as f:
    f.write(json.dumps(bycommunity).encode("utf-8"))
The result is:
{"0": [{"60357": "?????? ??? ??? ???? ???????? ??????"}]}
Open the file as UTF-8 text and pass ensure_ascii=False:
data = {"0": [{"60357": "این یک پیام است"}]}
with open('data.json', 'w', encoding='utf-8') as f:
    json.dump(data, f, ensure_ascii=False)
The same approach with indentation for readability:
with open(jsonFilePath, 'w', encoding='utf-8') as jsonf:
    jsonf.write(json.dumps(data, ensure_ascii=False, indent=4))
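To see what actually lands in the file, the bytes can be read back and decoded. This is a sketch with a temporary path chosen for the demo; where the question marks in the question came from is not clear from the post, so the example only shows that neither the escaped nor the unescaped form loses the characters.

```python
import json
import tempfile
from pathlib import Path

bycommunity = {"0": [{"60357": "این یک پیام است"}]}
path = Path(tempfile.mkdtemp()) / "data.json"

# Text mode with an explicit encoding, Persian text written unescaped.
with open(path, "w", encoding="utf-8") as f:
    json.dump(bycommunity, f, ensure_ascii=False)

raw_bytes = path.read_bytes()
text = raw_bytes.decode("utf-8")

# With the default ensure_ascii=True the output is pure ASCII (\uXXXX escapes),
# which still loads back to the same dictionary.
escaped = json.dumps(bycommunity)
```

If question marks still appear, the viewer or terminal used to inspect the file is a likely place to look next.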

Writing to a JSON file and updating said file

I have the following code that will write to a JSON file:
import json
def write_data_to_table(word, hash):
    data = {word: hash}
    with open("rainbow_table\\rainbow.json", "a+") as table:
        table.write(json.dumps(data))
What I want to do is open the JSON file, add another line to it, and close it. How can I do this without messing with the file?
As of right now when I run the code I get the following:
write_data_to_table("test1", "0123456789")
write_data_to_table("test2", "00123456789")
write_data_to_table("test3", "000123456789")
#<= {"test1": "0123456789"}{"test2": "00123456789"}{"test3": "000123456789"}
How can I update the file without completely screwing with it?
My expected output would probably be something along the lines of:
{
"test1": "0123456789",
"test2": "00123456789",
"test3": "000123456789",
}
You can read the JSON data with:
parsed_json = json.loads(json_string)
You are now working with an ordinary dictionary. You can add data with:
parsed_json.update({'test4': '0000123456789'})
Then you can write the data to a file using:
with open('data.txt', 'w') as outfile:
    json.dump(parsed_json, outfile)
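Put together, the read, update and write steps could look like the sketch below. The extra path parameter, the hash_value name and the temporary file are choices made for this example, not part of the question's code.

```python
import json
import os
import tempfile


def write_data_to_table(word, hash_value, path):
    """Add one word/hash pair to the JSON object stored at path."""
    try:
        with open(path, "r", encoding="utf-8") as table:
            data = json.load(table)
    except FileNotFoundError:
        data = {}  # first call: start with an empty table

    data[word] = hash_value

    # Rewrite the whole file so it always holds one valid JSON object.
    with open(path, "w", encoding="utf-8") as table:
        json.dump(data, table, indent=4)


table_path = os.path.join(tempfile.mkdtemp(), "rainbow.json")
write_data_to_table("test1", "0123456789", table_path)
write_data_to_table("test2", "00123456789", table_path)
write_data_to_table("test3", "000123456789", table_path)

with open(table_path, encoding="utf-8") as table:
    result = json.load(table)
```

Rewriting the whole file on every call is simple but gets slower as the table grows, which is the trade-off against append-only formats.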
If you are sure the closing "}" is the last byte in the file you can do this:
>>> import json
>>> with open('test.json', 'w') as f:  # create the file
...     json.dump({"foo": "bar"}, f)
...
>>> f = open('test.json', 'rb+')  # binary mode allows seeking relative to the end
>>> f.seek(-1, 2)  # move onto the closing "}"
13
>>> f.write((',\n' + json.dumps({"spam": "bacon"})[1:]).encode('utf-8'))
18
>>> f.close()
>>> print(open('test.json').read())
{"foo": "bar",
"spam": "bacon"}
Since your data is not hierarchical, you should consider a flat format like "TSV".
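A flat, append-friendly variant might look like this sketch, which uses the standard csv module with a tab delimiter. The function names and the temporary file are made up for illustration.

```python
import csv
import os
import tempfile


def append_row(path, word, hash_value):
    """Append one word/hash pair as a tab-separated line."""
    with open(path, "a", newline="", encoding="utf-8") as table:
        csv.writer(table, delimiter="\t").writerow([word, hash_value])


def load_table(path):
    """Read the whole file back into a dictionary."""
    with open(path, newline="", encoding="utf-8") as table:
        return {word: hash_value for word, hash_value in csv.reader(table, delimiter="\t")}


tsv_path = os.path.join(tempfile.mkdtemp(), "rainbow.tsv")
append_row(tsv_path, "test1", "0123456789")
append_row(tsv_path, "test2", "00123456789")
append_row(tsv_path, "test3", "000123456789")

table = load_table(tsv_path)
```

Each call only appends a line, so the file never has to be parsed or rewritten just to add an entry.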

Python file input (write mode) issue with JSON

I'm learning Python and I'm following official documentation from:
Section: 7.2.2. Saving structured data with json for Python 3
I'm testing the json.dump() function to dump my Python dict into a file object:
>>> response = {"success": True, "data": ["test", "array", "response"]}
>>> response
{'success': True, 'data': ['test', 'array', 'response']}
>>> import json
>>> json.dumps(response)
'{"success": true, "data": ["test", "array", "response"]}'
>>> f = open('testfile.txt', 'w', encoding='UTF-8')
>>> f
<_io.TextIOWrapper name='testfile.txt' mode='w' encoding='UTF-8'>
>>> json.dump(response, f)
The file testfile.txt already exists in my working directory, and even if it didn't, the statement f = open('testfile.txt', 'w', encoding='UTF-8') would create it (or truncate it if it exists).
json.dumps(response) converts my response dict into a valid JSON string, so that's fine.
The problem is when I use json.dump(response, f), which does update my testfile.txt, but the file ends up truncated.
I've managed to do a reverse workaround like:
>>> f = open('testfile.txt', 'w', encoding='UTF-8')
>>> f.write(json.dumps(response));
56
>>>
After which the contents of my testfile.txt become as expected:
{"success": true, "data": ["test", "array", "response"]}
This approach works too:
>>> json.dump(response, open('testfile.txt', 'w', encoding='UTF-8'))
Why does this approach fail?:
>>> f = open('testfile.txt', 'w', encoding='UTF-8')
>>> json.dump(response, f)
Note that I don't get any errors from the console; just a truncated file.
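It looks like you are checking the file without exiting the interactive prompt. json.dump writes into a buffer, and the data only reaches the disk when the file is flushed or closed. Close the file to flush it:
f.close()
The file will also be closed when you exit the interactive prompt.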
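The buffering can be observed by checking the file size before and after closing. This is a sketch using a temporary path; the size before closing is usually 0 for such a small write, but that depends on buffering details, so treat it as typical rather than guaranteed.

```python
import json
import os
import tempfile

response = {"success": True, "data": ["test", "array", "response"]}
path = os.path.join(tempfile.mkdtemp(), "testfile.txt")

f = open(path, "w", encoding="UTF-8")
json.dump(response, f)
size_before_close = os.path.getsize(path)  # data is usually still in the buffer
f.close()                                  # closing flushes the buffer to disk
size_after_close = os.path.getsize(path)

# A with block closes (and therefore flushes) the file automatically.
with open(path, "w", encoding="UTF-8") as f:
    json.dump(response, f)

with open(path, encoding="UTF-8") as f:
    loaded = json.load(f)
```

Using the with form avoids having to remember f.close() in the first place.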
