How to append a json line to a loaded json file? - python

I am writing Python code.
I loaded a json file.
import json

with open(r'..\config_4099.json', "r") as fid:
    jaySon = json.load(fid)
It's a flat json structure, so no internal elements to append to. Just need to tack onto the bottom the piece in curlies:
jaySon.append({'pluginInputs': "PluginInputs"})
It's complaining about dictionaries.
What's the best way to do this?

json.load returns a Python dict for a JSON object, and dicts have no append method. With dicts, use update:
jaySon.update({'pluginInputs': "PluginInputs"})
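Note that update only changes the dict in memory. Here is a minimal sketch of the full round trip, assuming the file holds a single JSON object. The file name and its starting contents are made up for illustration and live in a temporary directory:

```python
import json
import os
import tempfile

# Hypothetical stand-in for the config file, created in a temp directory.
path = os.path.join(tempfile.mkdtemp(), "config_example.json")
with open(path, "w") as fid:
    json.dump({"name": "demo"}, fid)

# json.load returns a dict for a JSON object, so there is no .append().
with open(path, "r") as fid:
    jaySon = json.load(fid)

# Add (or overwrite) a key; plain item assignment would do the same thing.
jaySon.update({"pluginInputs": "PluginInputs"})

# The change only exists in memory until the dict is written back.
with open(path, "w") as fid:
    json.dump(jaySon, fid, indent=2)

with open(path, "r") as fid:
    reloaded = json.load(fid)
```

If the key should not overwrite an existing value, jaySon.setdefault('pluginInputs', "PluginInputs") is an alternative worth considering.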

Related

How to read Json files in a directory separately with a for loop and performing a calculation

Update: Sorry, it seems my question wasn't asked properly. I am analyzing a transportation network consisting of more than 5000 links. All the data is included in one big CSV file. I have several JSON files, each of which contains a subset of this network. I am trying to loop through all the JSON files INDIVIDUALLY (i.e. not trying to concatenate them or anything), read each JSON file, extract the matching information from the CSV file, perform a calculation, and save the result along with the name of the file in a new dataframe. Something like this:
[image of the desired output, not preserved]
This is the code I wrote, but not sure if it's efficient enough.
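import glob
import json
import os

import pandas as pd

name = []
percent_of_truck = []
path_to_json = r'\\directory'  # placeholder path, as in the original post
# `final` is the DataFrame loaded from the main CSV file (not shown here)
z = glob.glob(os.path.join(path_to_json, '*.json'))
for i in z:
    with open(i, 'r') as myfile:
        l = json.load(myfile)
    name.append(i)
    d_2019 = final.loc[final['LINK_ID'].isin(l)]  # retrieve data from main CSV file
    # length-weighted average of AADTT16 / AADT16 over the selected links
    avg_m = (d_2019['AADTT16'] / d_2019['AADT16'] * d_2019['Length']).sum() / d_2019['Length'].sum()
    percent_of_truck.append(avg_m)
f = pd.DataFrame()
f['Name'] = name
f['% of truck'] = percent_of_truck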
I'm assuming here you just want a dictionary of all the JSON. If so, use the json library (import json), and this code may be of use:
import json

def importSomeJSONFile(f):
    # open the file in a with block so it is closed after parsing
    with open(f) as fp:
        return json.load(fp)

# make sure the file exists in the same directory
example = importSomeJSONFile("example.json")
print(example)
# access a value within the result, replacing key with the one you want, like "name"
key = "name"
print(example[key])
Since you haven't added a schema or any other specific requirements, you can follow this approach to solve your problem, in any language you prefer:
1. Get the directory of the JSON files that need to be read.
2. Get a list of all files present in that directory.
3. For each file name returned in step 2:
   - Read the file.
   - Parse the JSON from the string.
   - Perform the required calculation.
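The steps above can be sketched in Python as follows. This is only an illustration using the standard library: the function name process_json_dir and its calculate parameter are hypothetical, and the real calculation (the lookup against the CSV data) would be passed in as calculate.

```python
import json
from pathlib import Path


def process_json_dir(directory, calculate):
    """Return {file name: calculate(parsed JSON)} for every *.json file."""
    results = {}
    # Steps 1-2: take the directory and list the JSON files in it.
    for json_path in sorted(Path(directory).glob("*.json")):
        # Step 3: read the file and parse its JSON content...
        with open(json_path, "r") as handle:
            data = json.load(handle)
        # ...then run the required calculation on the parsed data.
        results[json_path.name] = calculate(data)
    return results
```

For example, process_json_dir(some_directory, len) would map each file name to the number of top-level entries in that file. Each file is handled on its own, so nothing is concatenated.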

Best way to store dictionary in a file and load it partially?

What is the best way to store a dictionary of strings in a file (the dictionaries are big) and load it partially in Python? "Dictionary of strings" here means that each key is a string and each value is a list of strings.
I want to store the dictionary in appended form so that I can check the keys: if a key is already present, don't update it; otherwise, add it. The keys are then used for post-processing.
Usually a dictionary is stored in JSON.
I'll leave here a link:
Convert Python dictionary to JSON array
You could simply write the dictionary to a text file, and then create a new dictionary that only pulls certain keys and values from that text file.
But you're probably best off exploring the json module.
Here's a straightforward way to write a dict called "sample" to a file with the json module:
import json

with open('result.json', 'w') as fp:
    json.dump(sample, fp)
On the loading side, we'd need to know more about how you want to choose which keys to load from the JSON file.
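As one possible answer to the loading side, here is a sketch that swaps in a different layout from the single json.dump above: one JSON object per line ("JSON Lines"), so the file can be appended to and scanned line by line without holding everything in memory. The function names and the "key"/"values" field names are assumptions made for this example, not anything from the question.

```python
import json


def stored_keys(path):
    """Return the set of keys in the file, reading one line at a time."""
    try:
        with open(path, "r") as fp:
            return {json.loads(line)["key"] for line in fp if line.strip()}
    except FileNotFoundError:
        return set()


def append_if_new(path, key, values):
    """Append key/values as one JSON line unless the key is already stored."""
    if key in stored_keys(path):
        return False
    with open(path, "a") as fp:
        fp.write(json.dumps({"key": key, "values": values}) + "\n")
    return True


def load_partial(path, wanted):
    """Load only the entries whose key is in `wanted`."""
    result = {}
    with open(path, "r") as fp:
        for line in fp:
            if not line.strip():
                continue
            record = json.loads(line)
            if record["key"] in wanted:
                result[record["key"]] = record["values"]
    return result
```

Rescanning the file on every append is simple but slow for many inserts. Keeping the set of known keys in memory between calls would likely be the first thing to change.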
The above answers are great, but I hate using JSON, and I have had issues with pickle corrupting my data before, so what I do is use NumPy's save and load. (Note that np.save stores a dict as a pickled object array, so recent NumPy versions need allow_pickle=True when loading.)
To save: np.save(filename, data)
To load: data = np.load(filename, allow_pickle=True).item()
It's really simple and works well. As far as loading partially goes, you could always split the dictionary into multiple smaller dictionaries and save them as individual files. It may not be a very concrete solution, but it could work.
To split the dictionary, you could do something like this:
import numpy as np

temp_dict = {}
for i, k in enumerate(data, start=1):
    temp_dict[k] = data[k]
    if i % 1000 == 0:
        # save each full chunk of 1000 keys and start a new one
        np.save("records-" + str(i - 1000) + "-" + str(i) + ".npy", temp_dict)
        temp_dict = {}
if temp_dict:
    # save the final, partially filled chunk
    np.save("records-" + str(i - len(temp_dict)) + "-" + str(i) + ".npy", temp_dict)
Then for loading, just do something like:
import glob

my_dict = {}
all_files = glob.glob("*.npy")
for f in all_files:
    chunk = np.load(f, allow_pickle=True).item()
    my_dict.update(chunk)
If this is for some sort of database-type use, then save yourself the headache and use TinyDB. It uses the JSON format when saving to disk and will provide you with the "partial" loading that you're looking for.
I only recommend TinyDB because it seems to be the closest to what you're looking to achieve. If it isn't to your taste, try searching for other databases; there are tons of them out there!

storing and retrieving lists from files

I have a very big list of lists. One of my programs does this:
power_time_array = [[1,2,3],[1,2,3]] # In a short form
with open (file_name,'w') as out:
    out.write (str(power_time_array))
Now another independent script needs to read this list of lists back.
How do I do this?
What I have tried:
with open (file_name,'r') as app_trc_file :
    power_trace_of_application.append (app_trc_file.read())
Note: power_trace_of_application is a list of lists of lists.
This stores it as a list with one element as a huge string.
How does one efficiently store and retrieve big lists or list of lists from files in python?
You can serialize your list to JSON and deserialize it back. This doesn't really change the representation, since your list is already valid JSON:
import json

power_time_array = [[1,2,3],[1,2,3]] # In a short form
with open (file_name,'w') as out:
    json.dump(power_time_array, out)
and then just read it back:
with open (file_name,'r') as app_trc_file :
    power_trace_of_application = json.load(app_trc_file)
For speed, you can use a JSON library with a C backend (like ujson). This approach can work with custom objects too, if you supply a custom encoder (for example via the default argument of json.dump).
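The standard json module does not serialize arbitrary objects by itself. One way to handle custom objects, sketched here with the standard library, is the default hook of json.dumps together with the object_hook of json.loads. The Sample class and the "__sample__" marker key are hypothetical names invented for this example.

```python
import json


class Sample:
    """Hypothetical custom object: a timestamp and a power reading."""

    def __init__(self, time, power):
        self.time = time
        self.power = power


def encode_sample(obj):
    # Called by json.dumps for objects it cannot serialize itself.
    if isinstance(obj, Sample):
        return {"__sample__": True, "time": obj.time, "power": obj.power}
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


def decode_sample(d):
    # Called by json.loads for every decoded JSON object (dict).
    if d.get("__sample__"):
        return Sample(d["time"], d["power"])
    return d


# A list of lists of custom objects survives the round trip.
text = json.dumps([[Sample(0, 1.5), Sample(1, 2.5)]], default=encode_sample)
restored = json.loads(text, object_hook=decode_sample)
```

The same default and object_hook arguments are accepted by json.dump and json.load when working with files.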
Use the json library to efficiently read and write structured information (in the form of JSON) to a text file:
- To write data to the file, use json.dump().
- To retrieve JSON data from the file, use json.load().
This may be faster, though it is worth measuring on your own data; it also keeps your existing text format:
from ast import literal_eval

power_time_array = [[1,2,3],[1,2,3]]
with open(file_name, 'w') as out:
    out.write(repr(power_time_array))

power_trace_of_application = []
with open(file_name,'r') as app_trc_file:
    power_trace_of_application.append(literal_eval(app_trc_file.read()))

Python: Converting Entire Directory of JSON to Python Dictionaries to send to MongoDB

I'm relatively new to Python, and extremely new to MongoDB (as such, I'll only be concerned with taking the text files and converting them). I'm currently trying to take a bunch of .txt files that are in JSON to move them into MongoDB. So, my approach is to open each file in the directory, read each line, convert it from JSON to a dictionary, and then over-write that line that was JSON as a dictionary. Then it'll be in a format to send to MongoDB
(If there's any flaw in my reasoning, please point it out)
At the moment, I've written this:
"""
Kalil's step by step iteration / write.
JSON dumps takes a python object and serializes it to JSON.
Loads takes a JSON string and turns it into a python dictionary.
So we return json.loads so that we can take that JSON string from the tweet and save it as a dictionary for Pymongo
"""
import os
import json
import pymongo

rootdir='~/Tweets'

def convert(line):
    line = file.readline()
    d = json.loads(lines)
    return d

for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        f=open(file, 'r')
        lines = f.readlines()
        f.close()
        f=open(file, 'w')
        for line in lines:
            newline = convert(line)
            f.write(newline)
        f.close()
But it isn't writing.
Which... As a rule of thumb, if you're not getting the effect that you're wanting, you're making a mistake somewhere.
Does anyone have any suggestions?
When you decode a json file you don't need to convert line by line as the parser will iterate over the file for you (that is unless you have one json document per line).
Once you've loaded the json document you'll have a dictionary which is a data structure and cannot be directly written back to file without first serializing it into a certain format such as json, yaml or many others (the format mongodb uses is called bson but your driver will handle the encoding for you).
The overall process to load a json file and dump it into mongo is actually pretty simple and looks something like this:
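import json
import os
from glob import glob
from pymongo import MongoClient  # Connection was removed in PyMongo 3

db = MongoClient().test
# glob does not expand "~", so expand it explicitly
for filename in glob(os.path.expanduser('~/Tweets/*.txt')):
    with open(filename) as fp:
        doc = json.load(fp)
    db.tweets.insert_one(doc)  # save() is no longer available in PyMongo 4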
A dictionary in Python is an object that lives within the program; you can't save the dictionary directly to a file unless you serialize it, for example by pickling it (pickling is a way to save objects in files so you can retrieve them later). I think a better approach would be to read the lines from the file, load the JSON (which converts it to a dictionary), and save that into MongoDB right away. There is no need to write it back to a file.
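If the files do hold one JSON document per line, the parsing step could look roughly like this. It is a sketch that only covers the JSON side and leaves the database call out; parse_json_lines and the sample lines are made up for illustration.

```python
import json


def parse_json_lines(lines):
    """Turn an iterable of JSON text lines into a list of dictionaries."""
    docs = []
    for line in lines:
        line = line.strip()
        if not line:  # skip blank lines between documents
            continue
        docs.append(json.loads(line))
    return docs


# Made-up sample input; an open file object can be passed in the same way,
# because iterating over a file yields its lines.
sample_lines = ['{"id": 1, "text": "hello"}\n', "\n", '{"id": 2, "text": "world"}\n']
docs = parse_json_lines(sample_lines)
```

Each resulting dictionary can then be handed to the MongoDB driver as it is, with no need to write anything back to the text file.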

Editing Pickled Data

I need to save a complex piece of data:
list = ["Animals", {"Cats":4, "Dogs":5}, {"x":[], "y":[]}]
I was planning on saving several of these lists within the same file, and I was also planning on using the pickle module to save this data. I also want to be able to access the pickled data and add items to the lists in the 2nd dictionary. So after I unpickle the data and edit, the list might look like this:
list = ["Animals", {"Cats":4, "Dogs":5}, {"x":[1, 2, 3], "y":[]}]
Preferably, I want to be able to save this list (using pickle) in the same file I took that piece of data from. However, if I simply re-pickle the data to the same file (let's say I originally saved it to "File"), I'll end up with two copies of the same list in that file:
a = open("File", "ab")
pickle.dump(list, a)
a.close()
Is there a way to replace the edited list in the file using pickle rather than adding a second (updated) copy? Or, is there another method I should consider for saving this data?
I think you want the shelve module. It creates a file (uses pickle under the hood) that contains the contents of a variable accessible by key (think persistent dictionary).
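A minimal sketch of how that might look with shelve, assuming each list is stored under a key of your choosing. The "animals" key and the temporary shelf path are made up for this example.

```python
import os
import shelve
import tempfile

# Hypothetical shelf location in a temp directory.
shelf_path = os.path.join(tempfile.mkdtemp(), "records")

# Store the record under a key, like an entry in a persistent dictionary.
with shelve.open(shelf_path) as db:
    db["animals"] = ["Animals", {"Cats": 4, "Dogs": 5}, {"x": [], "y": []}]

# Later: fetch one record, edit it, and store it back under the same key.
with shelve.open(shelf_path) as db:
    record = db["animals"]            # this is a copy loaded from disk
    record[2]["x"].extend([1, 2, 3])  # edit the copy in memory
    db["animals"] = record            # reassign so the change is persisted

with shelve.open(shelf_path) as db:
    updated = db["animals"]
```

Reassigning the key replaces the old value, so the file does not accumulate a second copy. Mutating db["animals"] in place without reassigning would not be saved unless the shelf is opened with writeback=True.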
You could open the file for writing instead of appending -- then the changes would overwrite previous data. This is however a problem if there is more data stored in that file. If what you want really is to selectively replace data in a pickled file, I'm afraid this won't work with pickle. If this is a common operation, check if something like a sqlite database helps you to this end.
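If the file is small enough to rewrite in full, the "open for writing" idea can be sketched as read everything, edit in memory, write everything back. The helper names load_all and save_all, the temporary path, and the second "Plants" record are invented for this example.

```python
import os
import pickle
import tempfile


def load_all(path):
    """Read every object that was pickled back-to-back into the file."""
    items = []
    with open(path, "rb") as fh:
        while True:
            try:
                items.append(pickle.load(fh))
            except EOFError:  # no more pickled objects in the file
                break
    return items


def save_all(path, items):
    """Overwrite the file ("wb") with one pickle per item."""
    with open(path, "wb") as fh:
        for item in items:
            pickle.dump(item, fh)


path = os.path.join(tempfile.mkdtemp(), "File")  # hypothetical location
save_all(path, [["Animals", {"Cats": 4, "Dogs": 5}, {"x": [], "y": []}],
                ["Plants", {"Ferns": 2}, {"x": [], "y": []}]])

records = load_all(path)
records[0][2]["x"].extend([1, 2, 3])  # edit the first record in memory
save_all(path, records)               # rewrite the whole file

reloaded = load_all(path)
```

This keeps a single copy of each list, at the cost of rewriting the entire file on every change, which is why a keyed store or a database is the better fit when edits are frequent.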
