JSON dump breaks my dictionary when using certain characters - python

I'm dumping data into a file with json.dump.
The data looks like this:
{"hello": {"this": 1, "a": 1, "is": 1, "test": 1}}
The code I use to achieve this is as follows (worddict is the dictionary; words is the file path, something like file.json):
with open(words, 'w') as fp:
    json.dump(worddict, fp)
fp.close()
I'd like to have the data in this format:
{
    "hello": {
        "a": 1,
        "is": 1,
        "test": 1,
        "this": 1
    }
}
I changed the code to this:
with open(words, 'w') as fp:
    json.dump(worddict, fp, sort_keys=True, indent=4, separators=(',', ': '))
fp.close()
And it works, until I try to dump characters such as "Á", "É", "Ű"...
These characters break the worddict file, and when I cat the file it looks like this:
{
Any idea why?

Replace
json.dump(worddict, fp, sort_keys=True, indent=4, separators=(',', ': '))
with
json.dump(worddict, fp, sort_keys=True, indent=4, separators=(',', ': '), ensure_ascii=False)
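For illustration, a minimal sketch of what ensure_ascii changes, using a small sample dict: by default the json module escapes every non-ASCII character, while ensure_ascii=False writes the characters through as-is.

```python
import json

# Sample dict containing a non-ASCII value
worddict = {"hello": {"test": "Á"}}

print(json.dumps(worddict))                      # {"hello": {"test": "\u00c1"}}
print(json.dumps(worddict, ensure_ascii=False))  # {"hello": {"test": "Á"}}
```

Note that with ensure_ascii=False the output is a Unicode string, so the file must be opened with an encoding that can represent it (UTF-8 is the Python 3 default on most platforms).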

I ran your code snippet in both Python 2 and Python 3.
In Python 3, it gave me no errors.
I ran the following code:
import json

words = 'a.txt'
worddict = {"hello": {"this": 1, "a": 1, "is": 1, "test": "Á"}}
with open(words, 'w') as fp:
    json.dump(worddict, fp, sort_keys=True, indent=4, separators=(',', ': '))
and got as output:
{
    "hello": {
        "a": 1,
        "is": 1,
        "test": "\u00c1",
        "this": 1
    }
}
But in Python 2, I ran into errors, along with a link describing the problem:
http://www.python.org/peps/pep-0263.html
The problem occurs because Python 2 strings are not Unicode by default, so you have to declare the encoding at the top of the source file.
I added "# coding=UTF-8" at the top of the file, before the source code starts, to let the interpreter know the file's encoding. Once I did that, the code ran in Python 2 as well as Python 3 with no errors.
I got the following as output:
{
    "hello": {
        "a": 1,
        "is": 1,
        "test": "\u00c1",
        "this": 1
    }
}
Here is my full final source code that I used.
# coding=UTF-8
import json

words = 'a.txt'
worddict = {"hello": {"this": 1, "a": 1, "is": 1, "test": "Á"}}
with open(words, 'w') as fp:
    json.dump(worddict, fp, sort_keys=True, indent=4, separators=(',', ': '))

Related

Writing to text file while reading one. (ValueError: I/O operation on closed file.)

I'm currently working on a script that rearranges JSON data in a more basic manner so I can run it through another YOLO box plot script. So far I've managed to make the script print the data in the exact format that I wish for. However, I would like to save it in a text file so I don't have to copy/paste it every time. Doing this seemed to be more difficult than first anticipated.
So here is the code that currently "works":
import sys

data = open(sys.argv[1], 'r')
with data as file:
    for line in file:
        split = line.split()
        if split[0] == '"x":':
            print("0", split[1][0:8], end=' ')
        if split[0] == '"y":':
            print(split[1][0:8], end=' ')
        if split[0] == '"w":':
            print(split[1][0:8], end=' ')
        if split[0] == '"h":':
            print(split[1][0:8])
And here is an example of the dataset that will be run through this script:
{
    "car": {
        "count": 7,
        "instances": [
            {
                "bbox": {
                    "x": 0.03839285671710968,
                    "y": 0.8041666746139526,
                    "w": 0.07678571343421936,
                    "h": 0.16388888657093048
                },
                "confidence": 0.41205787658691406
            },
            {
                "bbox": {
                    "x": 0.9330357313156128,
                    "y": 0.8805555701255798,
                    "w": 0.1339285671710968,
                    "h": 0.2222222238779068
                },
                "confidence": 0.8200334906578064
            },
            {
                "bbox": {
                    "x": 0.15803571045398712,
                    "y": 0.8111110925674438,
                    "w": 0.22678571939468384,
                    "h": 0.21111111342906952
                },
                "confidence": 0.8632314801216125
            },
            {
                "bbox": {
                    "x": 0.762499988079071,
                    "y": 0.8916666507720947,
                    "w": 0.1428571492433548,
                    "h": 0.20555555820465088
                },
                "confidence": 0.8819259405136108
            },
            {
                "bbox": {
                    "x": 0.4178571403026581,
                    "y": 0.8902778029441833,
                    "w": 0.17499999701976776,
                    "h": 0.17499999701976776
                },
                "confidence": 0.8824222087860107
            },
            {
                "bbox": {
                    "x": 0.5919643044471741,
                    "y": 0.8722222447395325,
                    "w": 0.16607142984867096,
                    "h": 0.25
                },
                "confidence": 0.8865317106246948
            },
            {
                "bbox": {
                    "x": 0.27767857909202576,
                    "y": 0.8541666865348816,
                    "w": 0.2053571492433548,
                    "h": 0.1805555522441864
                },
                "confidence": 0.8922017216682434
            }
        ]
    }
}
The output looks like this:
0 0.038392 0.804166 0.076785 0.163888
0 0.933035 0.880555 0.133928 0.222222
0 0.158035 0.811111 0.226785 0.211111
0 0.762499 0.891666 0.142857 0.205555
0 0.417857 0.890277 0.174999 0.174999
0 0.591964 0.872222 0.166071 0.25
0 0.277678 0.854166 0.205357 0.180555
Instead of printing these lines, I've tried writing them to a new text file; however, I keep getting a "ValueError: I/O operation on closed file." error. I would guess this is because I already have one file open and opening a new one closes the first? Is there an easy way to work around this, or is it too much hassle and copy/pasting the printed result is the "easiest" way?
Why don't you use the json and csv packages?
import csv
import json

# import sys
# file = sys.argv[1]
file = "input.json"
output_file = "output.csv"

with open(file, "r") as data_file:
    data = json.load(data_file)

with open(output_file, "w") as csv_file:
    writer = csv.writer(csv_file, delimiter=' ')
    for value in data.values():
        instances = value.get("instances")
        bboxes = [instance.get("bbox") for instance in instances]
        for bbox in bboxes:
            writer.writerow([
                0,
                f"{bbox['x']:.6f}",
                f"{bbox['y']:.6f}",
                f"{bbox['w']:.6f}",
                f"{bbox['h']:.6f}",
            ])
Output:
0 0.038393 0.804167 0.076786 0.163889
0 0.933036 0.880556 0.133929 0.222222
0 0.158036 0.811111 0.226786 0.211111
0 0.762500 0.891667 0.142857 0.205556
0 0.417857 0.890278 0.175000 0.175000
0 0.591964 0.872222 0.166071 0.250000
0 0.277679 0.854167 0.205357 0.180556
Notes:
It's important to understand the input file format you are working with. Read about JSON here.
I round the values to 6 digits in both examples (I'm not sure what the requirements are, but simply adjust f"{bbox['x']:.6f}" and the 3 lines following it to your use case).
Or, if you want to use jmespath along with csv and json:
import csv
import json
import jmespath  # pip install jmespath

# import sys
# file = sys.argv[1]
file = "input.json"
output_file = "output.csv"

with open(file, "r") as data_file:
    data = json.load(data_file)

bboxes = jmespath.search("*.instances[*].bbox", data)
with open(output_file, "w") as csv_file:
    writer = csv.writer(csv_file, delimiter=' ')
    for bbox in bboxes[0]:
        writer.writerow([
            0,
            f"{bbox['x']:.6f}",
            f"{bbox['y']:.6f}",
            f"{bbox['w']:.6f}",
            f"{bbox['h']:.6f}",
        ])
I suggest parsing the file as JSON rather than as raw text. If the file is JSON, treat it as JSON: otherwise you risk the unfortunate case where it is valid but minified JSON, and the lack of line breaks turns string processing into a nightmare of fragile regexes. Or, possibly worse, the file is invalid JSON.
import json
import sys

with open(sys.argv[1], 'r') as f:
    raw = f.read()
obj = json.loads(raw)

print("\n".join(
    f"0 {i['bbox']['x']:.6f} {i['bbox']['y']:.6f} {i['bbox']['w']:.6f} {i['bbox']['h']:.6f}"
    for i in obj["car"]["instances"])
)

Parse JSON starting with bracket in python

I have a .json file structured as follows:
"[{\"dataset\": \"x0\", \"test\": \"Test 3 \", \"results\": {\"TP\": 0, \"FP\": 0, \"FN\": 0, \"TN\": 17536}, \"dir\": \"/Users//Test_3\"}]"
When I try to read it with the following code:
with open(dir, 'r+') as f:
    data = json.load(f)
    print(data[0])
I get [ as output, which means it is reading the json object as a string.
I do not understand if the problem is how I'm saving it. Since I populate it in a loop, the code which creates this object is the following one:
json_obj = []
for i in range(len(dictionary)):
    dataset, test, dir = retrieve_data()
    tp, fp, tn, fn = calculate_score()
    json_obj.append({'dataset': dataset,
                     'test': test,
                     'results': {'TP': tp, 'FP': fp, 'FN': fn, 'TN': tn},
                     'dir': dir})
json_dump = json.dumps(json_obj)
with open(save_folder, 'w') as outfile:
    json.dump(json_dump, outfile)
The structure I tried to create is the following one:
{
    "dataset": "1",
    "test": "trial1",
    "results": {
        "TP": 5,
        "FP": 3,
        "FN": 2,
        "TN": 5
    },
    "dir": dir
}
How can I read it correctly to make it parsable?
You are converting json_obj to a string and then dumping the string to a file. Dump json_obj directly to the file:
# json_dump = json.dumps(json_obj)
with open(save_folder, 'w') as outfile:
    json.dump(json_obj, outfile)
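A quick round-trip sketch (with a shortened sample record and a hypothetical file name) showing that dumping the list itself, rather than a pre-serialized string, makes the file parse back into a list of dicts:

```python
import json

# Sample record in the same shape as the question's data
records = [{'dataset': 'x0', 'results': {'TP': 0, 'TN': 17536}}]

with open('results.json', 'w') as outfile:  # hypothetical path
    json.dump(records, outfile)             # dump the list, not json.dumps(...) output

with open('results.json') as f:
    data = json.load(f)
print(data[0]['dataset'])  # x0 -- a dict field now, not the character '['
```

Dumping json.dumps(json_obj) instead would serialize the string itself, so json.load would return a plain string and data[0] would be its first character, "[".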

How can I convert a CSV file to a JSON file with a nested json object using Python?

I am stuck with a problem where I don't know how I can convert a "nested JSON object" inside a CSV file into a JSON object.
So I have a CSV file with the following value:
data.csv
1, 12385, {'message': 'test 1', 'EngineId': 3, 'PersonId': 1, 'GUID': '0ace2-02d8-4eb6-b2f0-63bb10829cd4s56'}, 6486D, TestSender1
2, 12347, {'message': 'test 2', 'EngineId': 3, 'PersonId': 2, 'GUID': 'c6d25672-cb17-45e8-87be-46a6cf14e76b'}, 8743F, TestSender2
I wrote a python script that converts this CSV file into a JSON file inside an array.
This I did with the following python script
csvToJson.py
import json
import csv

with open("data.csv", "r") as f:
    reader = csv.reader(f)
    data = []
    for row in reader:
        data.append({"id": row[0],
                     "receiver": row[1],
                     "payload": row[2],
                     "operator": row[3],
                     "sender": row[4]})

with open("data.json", "w") as f:
    json.dump(data, f, indent=4)
The problem I'm facing is that I'm not getting the right values inside "payload", which I would like to be a nested JSON object.
The result I get is the following:
data.json
[
    {
        "id": "1",
        "receiver": " 12385",
        "payload": " {'message': 'test 1'",
        "operator": " 'EngineId': 3",
        "sender": " 'PersonId': 1"
    },
    {
        "id": "2",
        "receiver": " 12347",
        "payload": " {'message': 'test 2'",
        "operator": " 'EngineId': 3",
        "sender": " 'PersonId': 2"
    }
]
So my question is, how can I create a nested JSON object for the "payload" while I'm doing the conversion from CSV to JSON?
I think the main problem is that it is seen as a string and not as an object.
Try the following. You can do everything as before, but merge back all the elements that were split out of the 3rd column and load the result via ast.literal_eval.
import json
import csv
import ast

with open("data.csv", "r") as f:
    reader = csv.reader(f, skipinitialspace=True)
    data = [{"id": ident,
             "receiver": rcv,
             "payload": ast.literal_eval(','.join(payload)),
             "operator": op,
             "sender": snd}
            for ident, rcv, *payload, op, snd in reader]

with open("data.json", "w") as f:
    json.dump(data, f, indent=4)
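For context on why ast.literal_eval works here where json.loads would not: the payload column uses single quotes, which is valid Python literal syntax but not valid JSON. A minimal sketch with a sample payload string:

```python
import ast

# Sample payload as it appears in the CSV: single-quoted, so not valid JSON,
# but a perfectly good Python dict literal.
payload = "{'message': 'test 1', 'EngineId': 3, 'PersonId': 1}"

obj = ast.literal_eval(payload)  # safely evaluates literals only, unlike eval()
print(obj['EngineId'])  # 3
```

Unlike eval(), literal_eval only accepts Python literal structures (strings, numbers, tuples, lists, dicts, booleans, None), so it cannot execute arbitrary code from the file.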

How to combine these 2 json-like files?

I have file1.txt with following contents;
[
    {
        "SERIAL": "124584",
        "X": "30024.1",
    },
    {
        "SERIAL": "114025",
        "X": "14006.2",
    }
]
I have file2.txt with following contents;
[
    {
        "SERIAL": "344588",
        "X": "48024.1",
    },
    {
        "SERIAL": "255488",
        "X": "56006.2",
    }
]
I want to combine the 2 files into single file output.txt that looks like this;
[
    {
        "SERIAL": "124584",
        "X": "30024.1",
    },
    {
        "SERIAL": "114025",
        "X": "14006.2",
    },
    {
        "SERIAL": "344588",
        "X": "48024.1",
    },
    {
        "SERIAL": "255488",
        "X": "56006.2",
    },
]
The tricky part is the [] at the end of each individual file.
I am using python v3.7
First, to be JSON-compliant, remove all the trailing commas (ref: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Trailing_commas).
Then you can use the following code:
import json

with open("file1.txt") as f1:
    d1 = json.load(f1)
with open("file2.txt") as f2:
    d2 = json.load(f2)

d3 = d1 + d2
with open("output.txt", "w") as out:
    json.dump(d3, out)
Here is a solution that reads the content from each file and then appends them.
from ast import literal_eval

with open("/home/umesh/Documents/text1.txt", "r") as data:
    first_file_data = data.read()
with open("/home/umesh/Documents/text2.txt", "r") as data:
    second_file_data = data.read()

first_file_data = literal_eval(first_file_data)
second_file_data = literal_eval(second_file_data)
for item in second_file_data:
    first_file_data.append(item)
print(first_file_data)
OUTPUT
[{'SERIAL': '124584', 'X': '30024.1'},{'SERIAL': '114025', 'X': '14006.2'},{'SERIAL': '344588', 'X': '48024.1'},{'SERIAL': '255488', 'X': '56006.2'}]
This solves your problem
import ast
import json

with open('file1.txt') as f:
    data = ast.literal_eval(f.read())
with open('file2.txt') as f:
    data2 = ast.literal_eval(f.read())

data.extend(data2)
print(data)

with open('outputfile', 'w') as fout:  # write to a file
    json.dump(data, fout)
OUTPUT:
[{'SERIAL': '124584', 'X': '30024.1'}, {'SERIAL': '114025', 'X': '14006.2'}, {'SERIAL': '344588', 'X': '48024.1'}, {'SERIAL': '255488', 'X': '56006.2'}]
Since the contents of both files are lists, you can concatenate them as follows:
file1 = [{'SERIAL': '124584', 'X': '30024.1'}, {'SERIAL': '114025', 'X': '14006.2'}]
file2 = [{'SERIAL': '344588', 'X': '48024.1'}, {'SERIAL': '255488', 'X': '56006.2'}]
totals = file1 + file2
Result
[{'SERIAL': '124584', 'X': '30024.1'},
{'SERIAL': '114025', 'X': '14006.2'},
{'SERIAL': '344588', 'X': '48024.1'},
{'SERIAL': '255488', 'X': '56006.2'}]

How to append data properly to a JSON using python

Update:
The only issue I have now is that running the command to add a user creates a complete duplicate key.
Question:
json.dump() simply appends the entry to the end of the JSON file; I want it to overwrite the entire file with the new, updated entry.
Setup: (Create blank "Banks" Field)
with open(DATA_FILENAME, mode='w', encoding='utf-8') as f:
    data = {"banks": []}
    json.dump(data, f)
Set User: (Create a User Key inside "Banks")
member = ctx.message.author
entry = {'name': member.name, 'id': member.id, 'balance': 0}
with open(DATA_FILENAME, 'r+') as outfile:
    data = json.load(outfile)
    data['banks'].append(entry)
    json.dump(data, outfile, indent=4)
Output of first use:
{"banks": []}{
    "banks": [
        {
            "name": "ViperZ-14",
            "id": 367151547575959562,
            "balance": 0
        }
    ]
}
What I need:
{
    "banks": [
        {
            "name": "ViperZ-14",
            "id": 367151547575959562,
            "balance": 0
        }
    ]
}
file_path = '/home/vishnudev/Downloads/new.json'

import json

def load(file, mode, data=[]):
    with open(file, mode) as f:
        if mode == 'r':
            return json.load(f)
        elif mode == 'w':
            json.dump(data, f)

def get_data_func():
    return {
        'name': 'vishnu',
        'data': 'dev'
    }

d = load(file_path, 'r')
print(d)
d.append(get_data_func())
load(file_path, 'w', d)
d = load(file_path, 'r')
print(d)
Output:
On running the above twice I get
[{'name': 'vishnu', 'data': 'dev'}]
[{'name': 'vishnu', 'data': 'dev'}, {'name': 'vishnu', 'data': 'dev'}]
I have found that the solution was simply to seek to the beginning of the file. json.dump() does overwrite, but only what is in its way; placing the cursor at the top of the file before dumping makes the new entry overwrite the entire old document.
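A sketch of that fix applied to the snippet from the question (DATA_FILENAME is a hypothetical path, and the file is seeded with an empty "banks" list first). The truncate() call matters when the new content is shorter than the old, so no stale bytes are left behind:

```python
import json

DATA_FILENAME = 'bank.json'  # hypothetical path

# Seed the file once with an empty "banks" list
with open(DATA_FILENAME, 'w') as f:
    json.dump({'banks': []}, f)

entry = {'name': 'ViperZ-14', 'id': 367151547575959562, 'balance': 0}
with open(DATA_FILENAME, 'r+') as outfile:
    data = json.load(outfile)            # read position is now at end-of-file
    data['banks'].append(entry)
    outfile.seek(0)                      # rewind so the dump starts at the top
    json.dump(data, outfile, indent=4)
    outfile.truncate()                   # discard any stale bytes past the new content
```

Without the seek(0), the dump would be appended after the old content, producing exactly the duplicated output shown in the question.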