I'm looking to output a Python dictionary to a file using the json library with formatting such that lists are represented on the same line.
I have tried making a custom encoder and using the ones I have found online as well such as the suggestion here:
https://stackoverflow.com/a/26512016/1411362
so that the final code looks like:

import json

data = {'a': 1, 'b': [1, 2, 3, 4]}
with open("data.json", 'w') as f:
    json.dump(data, f, indent=4, cls=CustomEncoderClass)

However, this fails when I use json.dump to export to a file instead of json.dumps (as in the link above) to produce a string. Is there a way to use the custom encoder so that it works when I export the data to a file?
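One workaround (a sketch, not from the original thread, assuming CustomEncoderClass is the encoder defined in the linked answer): build the formatted string with json.dumps and write it to the file yourself, since json.dump and json.dumps take different paths through the encoder.

import json

# Sketch: CustomEncoderClass is assumed to be the custom encoder from the
# linked answer. json.dumps returns the formatted string, which is then
# written manually, sidestepping whatever json.dump does differently.
data = {'a': 1, 'b': [1, 2, 3, 4]}
with open("data.json", 'w') as f:
    f.write(json.dumps(data, indent=4, cls=CustomEncoderClass))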
So I am working on a conversion file that is taking a dictionary and converting it to a JSON file. Current code looks like:
data = {json_object}
json_string = jsonpickle.encode(data)
with open('/Users/machd/Mac/Documents/VISUAL CODE/CSV_to_JSON/JSON FILES/test.json', 'w') as outfile:
    json.dump(json_string, outfile)
But when I open that rendered file, it has added three backslashes before and after each string.
PS: sorry if I am using the wrong terminology; I am still new to Python and don't know the vocabulary that well yet.
Try this
import json
data = {"k": "v"}
with open('path_to_file.json', 'w') as f:
    json.dump(data, f)
You don't need to use jsonpickle to encode dict data.
json.dump is a wrapper function that first converts the data to JSON text and then writes that text to your file.
The reason you see backslashes around each string is that jsonpickle has already turned your data into a JSON string; when json.dump serializes that string again, the quote characters (") inside it get escaped.
Just use the following code to write the dict directly to JSON:
with open('/Users/machd/Mac/Documents/VISUAL CODE/CSV_to_JSON/JSON FILES/test.json', 'w') as outfile:
    json.dump(data, outfile)
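A minimal sketch (not part of the original answer) showing how the backslashes appear when an already-encoded string is encoded again:

import json

# Encoding a dict once produces normal JSON; encoding the resulting
# string a second time escapes its quote characters with backslashes.
data = {"k": "v"}
once = json.dumps(data)    # '{"k": "v"}'
twice = json.dumps(once)   # '"{\\"k\\": \\"v\\"}"'  <- escaped quotes appear
print(once)
print(twice)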
I've found a couple of others asking for help with this, but not specifically what I'm trying to do. I have a dictionary full of various formats (int, str, bool, etc) and I'm trying to save it so I can load it at a later time. Here is a basic version of the code without all the extra trappings that are irrelevant for this.
petStats = {'name': "", 'int': 1, 'bool': False}

def petSave(pet):
    with open(pet['name'] + ".txt", "w+") as file:
        for k, v in pet.items():
            file.write(str(k) + ':' + str(v) + "\n")

def digimonLoad(petName):
    dStat = {}
    with open(petName + ".txt", "r") as file:
        for line in file:
            (key, val) = line.split(":")
            dStat[str(key)] = val
    print(petName, "found. Loading", petName + ".")
    return dStat
In short, I'm just brute-forcing it by saving a text file with Key:Value on each line, then splitting them all back up on load. Unfortunately this turns all of my int and bool values into strings. Is there a file format I could use to save a dictionary (I don't need to be able to read it, but the convenience would be nice) that I could easily load back in?
This works for a basic dictionary, but if I start adding things like arrays it is going to get out of hand.
Use the json module.

import json

def save_pet(pet):
    filename = <Whatever filename you want>
    with open(filename, 'w') as f:
        f.write(json.dumps(pet))

def load_pet(filename):
    with open(filename) as f:
        pet = json.loads(f.read())
    return pet
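A quick usage sketch (the names below are hypothetical; 'pet.json' just stands in for the filename placeholder above):

# Hypothetical usage, assuming save_pet fills in its placeholder with 'pet.json'.
pet = {'name': 'Agumon', 'int': 1, 'bool': False}
save_pet(pet)
loaded = load_pet('pet.json')
print(loaded['int'] + 1)   # ints round-trip as ints, unlike the text format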
Use pickle. This is part of the standard library, so you can just import it.
import pickle

pet_stats = {'name': "", 'int': 1, 'bool': False}

def pet_save(pet):
    with open(pet['name'] + '.pickle', 'wb') as f:
        pickle.dump(pet, f, pickle.HIGHEST_PROTOCOL)

def digimon_load(pet_name):
    with open(pet_name + '.pickle', 'rb') as f:
        return pickle.load(f)
Pickle works on more data types than JSON, and automatically loads them as the right Python type. (There are ways to save more types with JSON, but it takes more work.) JSON (or XML) is better if you need the output to be human-readable, or need to share it with non-Python programs, but neither appears to be necessary for your use case. Pickle will be easiest.
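One such "more work" approach with JSON (a sketch, not from the original answer) is to supply your own conversion hooks for types the json module cannot handle, such as sets:

import json

# Sketch: encode sets as a tagged dict on the way out, and rebuild them
# on the way back in via object_hook.
def encode_extra(obj):
    if isinstance(obj, set):
        return {"__set__": list(obj)}
    raise TypeError("Cannot serialize %r" % type(obj))

def decode_extra(d):
    if "__set__" in d:
        return set(d["__set__"])
    return d

text = json.dumps({"tags": {"fire", "water"}}, default=encode_extra)
data = json.loads(text, object_hook=decode_extra)   # {'tags': {'fire', 'water'}}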
If you need to see what's in the file, just load it using Python or
python -m pickle foo.pickle
instead of a text editor. (Only do this with pickle files from sources you trust; pickle is not secure against maliciously constructed data.)
Q: Is there a file format I could use to save a dictionary to load back in?
A: Yes, there are many. XML and JSON come immediately to mind.
For example:
jsonfile.txt
{
    "brand": "Ford",
    "model": "Mustang",
    "year": 1964
}
Here's an example reading the file into a dictionary:
import json
with open('data.txt', 'r') as json_file:
    data = json.load(json_file)
... and an example writing the dictionary to JSON:
import json
with open('data.txt', 'w') as fp:
    fp.write(json.dumps(data))
If you prefer XML, there are many libraries, including xmltodict:
import xmltodict
with open('path/to/file.xml') as fd:
    doc = xmltodict.parse(fd.read())
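And for the writing direction, a sketch (not part of the original answer; note that xmltodict.unparse requires a single root element, and XML stores everything as text, so numbers and booleans come back as strings when re-parsed):

import xmltodict

# Sketch: wrap the dictionary in a single root key ('pet' here is
# hypothetical) and write it out as indented XML.
doc = {'pet': {'name': 'Agumon', 'level': 1}}
with open('path/to/file.xml', 'w') as fd:
    fd.write(xmltodict.unparse(doc, pretty=True))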
There are two useful words that you may not know about yet: serialization and pickle.
Serialization refers to the process of converting a data structure (like your dictionary) into a stream of bytes that can be written to storage and later retrieved from storage to recreate that data structure. This is a common task, and your intuition is correct: trying to do this all by yourself will quickly get out of hand.
Pickle is the standard Python module for implementing serialization. It's easy to use, mature, and works with a large set of Python data types. You can read more about pickle here: https://docs.python.org/3/library/pickle.html
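A minimal sketch of that round trip (serialize to bytes, then restore), done in memory rather than to a file:

import pickle

# Serialize a data structure to bytes, then rebuild it with its types intact.
pet_stats = {'name': '', 'int': 1, 'bool': False}
blob = pickle.dumps(pet_stats)      # bytes suitable for writing to storage
restored = pickle.loads(blob)       # original Python types are preserved
assert restored == pet_stats and isinstance(restored['bool'], bool)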
I have written the following Python code to populate a JSON file.
import json

data = {}
data['people'] = []
for i in range(0, 3):
    data['people'].append({
        'name': 'C%d' % (i),
        'div': i,
        'from': 'City%d' % (i)
    })

with open('data.txt', 'w') as outfile:
    json.dump(data, outfile)
However, my JSON file looks something like this:
{"people": [{"div":0,"from":,"City0":"name":"C0"},{"div":0,"from":,"City0":"name":"C0"}]}
My order of input is different from the output's. What is the reason and how do I rectify this?
What Python version are you using? You are creating a dict, but before Python 3.6 the insertion order was not preserved. In Python 3.6 insertion order is preserved, but it is considered an implementation detail and should not be relied upon. In Python 3.7 the insertion-order preservation of dict objects was declared an official part of the Python language spec.
If you are using a Python version lower than 3.7, use OrderedDict from collections.
import json
from collections import OrderedDict

data = {}
data['people'] = []
for i in range(0, 3):
    data['people'].append(OrderedDict((
        ('name', 'C%d' % (i)),
        ('div', i),
        ('from', 'City%d' % (i))
    )))

with open('data.json', 'w') as outfile:
    json.dump(data, outfile)
By the way, why is the file extension .txt and not .json? It doesn't matter and is not related to your problem, but I am curious.
The reason your output looks like that is that JSON files don't really care what order their keys are in; they hold data, and consumers look values up by key. As long as you can get to the file and it parses, it's all good. If you want the output to be exactly how you typed it, json.dumps won't necessarily give you that (depending on your Python version); if you absolutely need it that way, I'd just build a string like
string = '''{"people": [{...arranged in the order you want}]}'''
and save it how you would any other file.
If you're looking to sort your JSON, try something I found here: Sorting Json
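For completeness, a sketch (not from the original answer) of the built-in way to get a deterministic key order with the json module:

import json

# sort_keys=True writes keys alphabetically, which makes the output order
# predictable even if it is not the insertion order.
data = {'people': [{'name': 'C0', 'div': 0, 'from': 'City0'}]}
with open('data.json', 'w') as outfile:
    json.dump(data, outfile, indent=4, sort_keys=True)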
A project for class involves parsing Twitter JSON data. I'm getting the data and writing it to the file without much trouble, but it's all in one line.
Does anyone know how to pretty-print the JSON from within Python (i.e. not using the command-line tool, which I can't get to work)? Here's my code so far:
header, output = client.request(twitterRequest, method="GET", body=None,
headers=None, force_auth_header=True)
# now write output to a file
twitterDataFile = open("twitterData.json", "wb")
# magic happens here to make it pretty-printed
twitterDataFile.write(output)
twitterDataFile.close()
Note I appreciate people pointing me to simplejson documentation and such, but as I have stated, I have already looked at that and continue to need assistance. A truly helpful reply will be more detailed and explanatory than the examples found there. Thanks
Also:
Trying this in the windows command line:
more twitterData.json | python -mjson.tool > twitterData-pretty.json
results in this:
Invalid control character at: line 1 column 65535 (char 65535)
I'd give you the data I'm using, but it's very large and you've already seen the code I used to make the file.
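One possible workaround for the "Invalid control character" error from within Python (a sketch, not from the original thread, assuming the offending characters are literal newlines or tabs inside the tweet text): pass strict=False so the decoder tolerates them.

import json

# Sketch: strict=False lets json.loads accept control characters inside
# strings, which otherwise raise "Invalid control character" errors.
with open("twitterData.json", "r") as f:
    data = json.loads(f.read(), strict=False)

with open("twitterData-pretty.json", "w") as f:
    json.dump(data, f, indent=4, sort_keys=True)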
You should use the optional indent argument.
import simplejson

header, output = client.request(twitterRequest, method="GET", body=None,
                                headers=None, force_auth_header=True)

# now write output to a file
twitterDataFile = open("twitterData.json", "w")
# magic happens here to make it pretty-printed
twitterDataFile.write(simplejson.dumps(simplejson.loads(output), indent=4, sort_keys=True))
twitterDataFile.close()
You can parse the JSON, then output it again with indents like this:
import json
mydata = json.loads(output)
print json.dumps(mydata, indent=4)
See http://docs.python.org/library/json.html for more info.
import json

with open("twitterdata.json", "w") as twitter_data_file:
    json.dump(output, twitter_data_file, indent=4, sort_keys=True)
You don't need json.dumps() if you don't need the string later; just use json.dump() to write straight to the file. It's faster, too.
You can use Python's json module to pretty-print:
>>> import json
>>> print json.dumps({'4': 5, '6': 7}, sort_keys=True, indent=4)
{
"4": 5,
"6": 7
}
So, in your case
>>> print json.dumps(json_output, indent=4)
If you are generating a new *.json file or modifying an existing JSON file, use the indent parameter to get a pretty-printed JSON format.

import json

responseData = json.loads(output)
with open('twitterData.json', 'w') as twitterDataFile:
    json.dump(responseData, twitterDataFile, indent=4)
If you already have existing JSON files which you want to pretty format you could use this:
with open('twitterdata.json', 'r+') as f:
    data = json.load(f)
    f.seek(0)
    json.dump(data, f, indent=4)
    f.truncate()
import json

def writeToFile(logData, fileName, openOption="w"):
    file = open(fileName, openOption)
    file.write(json.dumps(json.loads(logData), indent=4))
    file.close()
You can also pipe the file through Python's json.tool module and page through the result with more. For example:

cat filename.json | python -m json.tool | more
I sometimes use json and jsonpickle to serialize objects to files, using the following function:
import simplejson

def json_serialize(obj, filename, use_jsonpickle=True):
    f = open(filename, 'w')
    if use_jsonpickle:
        import jsonpickle
        json_obj = jsonpickle.encode(obj)
        f.write(json_obj)
    else:
        simplejson.dump(obj, f)
    f.close()
The problem is that if I serialize a dictionary, for example with json_serialize(mydict, myfilename), then the entire serialization gets put on one line. This means that I can't grep the file for entries to inspect by hand, like I would a CSV file. Is there a way to make it so that each element of an object (e.g. each entry in a dict, or each element in a list) is placed on a separate line in the JSON output file?
Thanks.
(simple)json.dump() has the indent argument. jsonpickle probably has something similar, or in the worst case you can decode it and encode it again.
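A sketch of the "decode it and encode it again" fallback mentioned above (pretty_encode is a hypothetical helper, not from the original answer):

import json
import jsonpickle

def pretty_encode(obj):
    # jsonpickle produces the compact string; json re-parses it and
    # re-dumps it with indentation.
    return json.dumps(json.loads(jsonpickle.encode(obj)), indent=4)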
jsonpickle uses one of the JSON backends, so you can try adding this to your code:
jsonpickle.set_encoder_options('simplejson', sort_keys=True, indent=4)
Update: simplejson has since been incorporated into the Python standard library as json; just replace 'simplejson' with 'json' and you'll get pretty-printed (non-minified) JSON:
jsonpickle.set_encoder_options('json', sort_keys=True, indent=4)