I want to write a script that checks a JSON file from time to time inside a while loop. The file contains JSON that looks like this:
{
"names":[
{
"name":"hello",
"numbers":0
},
{
"name":"stack",
"numbers":1
},
{
"name":"over",
"numbers":2
},
{
"name":"flow",
"numbers":12
},
{
"name":"how",
"numbers":17
},
{
"name":"are",
"numbers":11
},
{
"name":"you",
"numbers":18
},
{
"name":"today",
"numbers":6
},
{
"name":"merry",
"numbers":4
},
{
"name":"x",
"numbers":1
},
{
"name":"mass",
"numbers":0
},
{
"name":"santa",
"numbers":4
},
{
"name":"hohoho",
"numbers":1
}
]
}
What I want to do is check, for each name, whether its numbers value has increased since the previous read of the JSON file.
def script():
    with open('data.json') as f:
        old_data = json.load(f)
    while True:
        with open('data.json') as f:
            new_data = json.load(f)
        if old_data < new_data:
            print("Bigger!!" + new_data['name'])
            old_data = new_data
        else:
            randomtime = random.randint(5, 15)
            print("Nothing increased")
            old_data = new_data
            time.sleep(randomtime)
I know that I have done this wrong, and that is why I am here. At the moment I have no idea how to write a function that checks the numbers one by one to see whether each has gotten bigger.
My question is:
How can I make it check object by object to see whether the numbers have gotten bigger since the previous loop? If a number has not gotten bigger but lower, the script should update the value in old_data and keep looping until the number is bigger than in the previous loop.
EDIT:
Format recommended by @Karl:
{
'names': {
'hello': 0,
'stack': 0,
'over': 2,
'flow': 12,
'how': 17,
'are': 11,
'you': 18,
'today': 6,
'merry': 4,
'x': 1,
'mass': 0,
'santa': 4,
'hohoho': 1
}
}
Assuming your json is in this format:
{
"names": {
"hello": 0,
"stack": 1,
"over": 2,
"flow": 13,
"how": 17,
"are": 12,
"you": 18,
"today": 6,
"merry": 4,
"x": 1,
"mass": 0,
"santa": 4,
"hohoho": 1
}
}
I would do something along the following lines:
import json
import time

with open("data.json") as f:
    old_data = json.load(f)["names"]

while True:
    with open("data.json") as f:
        new_data = json.load(f)["names"]
    for name, number in new_data.items():
        if number > old_data[name]:
            print("Entry '{0}' has increased from {1} to {2}".format(name, old_data[name], number))
    old_data = new_data
    print("sleeping for 5 seconds")
    time.sleep(5)
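The comparison step in the loop above can also be pulled out into a small function, which makes it easier to try out without touching a file. This is only a sketch: the name find_increases is made up for illustration, and skipping names that are missing from the old snapshot is an assumption (the loop above would raise a KeyError for them instead).

```python
def find_increases(old, new):
    """Return (name, old_value, new_value) for every entry whose number went up."""
    increases = []
    for name, number in new.items():
        # Assumption: a name missing from the old snapshot counts as new,
        # not as an increase, so it is skipped.
        previous = old.get(name)
        if previous is not None and number > previous:
            increases.append((name, previous, number))
    return increases


old_snapshot = {"hello": 0, "stack": 1, "flow": 12}
new_snapshot = {"hello": 0, "stack": 3, "flow": 11, "santa": 4}
print(find_increases(old_snapshot, new_snapshot))  # [('stack', 1, 3)]
```

Inside the while loop you would call it with the two "names" dictionaries and print whatever it returns.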
EDIT to answer the question posted in a comment: "Just curious, let's say I want to add another value beside the numbers, e.g. "stack": 1, yes (a yes or no for each entry). What would be needed in that case? (Just a script that I want to develop from this.)"
In that case you should design your json input as follows:
{
"names": {
"hello": {
"number": 0,
"status": true
},
"stack": {
"number": 1,
"status": true
},
"over": {
"number": 2,
"status": false
},
...
}
}
You would need to change the lookups in the comparison script as follows:
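for name, values in new_data.items():
    if values["number"] > old_data[name]["number"]:
        print("Entry '{0}' has increased from {1} to {2}".format(name, old_data[name]["number"], values["number"]))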
(Note that for status you could also just use "yes" or "no" as inputs, but booleans are much more useful when you have to represent a binary choice like this.)
By the way, unless you aim to have objects other than names in this json, you can leave out that level and just make it:
{
"hello": {
"number": 0,
"status": true
},
"stack": {
"number": 1,
"status": true
},
"over": {
"number": 2,
"status": false
},
...
}
In that case, replace old_data = json.load(f)["names"] with old_data = json.load(f), and new_data = json.load(f)["names"] with new_data = json.load(f).
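For the flattened layout, the comparison could look roughly like the sketch below. The function name increased_entries and the only_active switch are hypothetical, added only to show how the boolean status field might be used; ignoring names that did not exist in the old snapshot is also an assumption.

```python
def increased_entries(old, new, only_active=False):
    """Yield names whose "number" grew between two snapshots of the nested layout."""
    for name, values in new.items():
        previous = old.get(name)
        if previous is None:
            continue  # assumption: names that just appeared are ignored
        if only_active and not values.get("status", False):
            continue  # hypothetical filter on the boolean "status" field
        if values["number"] > previous["number"]:
            yield name


old = {"hello": {"number": 0, "status": True}, "over": {"number": 2, "status": False}}
new = {"hello": {"number": 1, "status": True}, "over": {"number": 5, "status": False}}
print(list(increased_entries(old, new)))                    # ['hello', 'over']
print(list(increased_entries(old, new, only_active=True)))  # ['hello']
```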
I took your original .json which you edited and presented in your question and re-factored your code to the below example. It appears to be working.
import time
import random
import json

path_to_file = r"C:\path\to\.json"

def script():
    with open(path_to_file) as f:
        d = json.load(f)

    old_data = 0
    for a_list in d.values():
        for i in a_list:
            print()
            for d_keys, d_values in i.items():
                print(d_keys, d_values)
                if type(d_values) == int and d_values > old_data:
                    print("Bigger!!" + i['name'])
                    old_data = d_values
                elif type(d_values) == int and d_values < old_data:
                    print("Nothing increased")
                    old_data = d_values
                    randomtime = random.randint(5, 15)
                    time.sleep(randomtime)

script()
This is the output I receive:
name hello numbers 0
name stack numbers 1 Bigger!!stack
name over numbers 2 Bigger!!over
name flow numbers 12 Bigger!!flow
name how numbers 17 Bigger!!how
name are numbers 11 Nothing increased
name you numbers 18 Bigger!!you
name today numbers 6 Nothing increased
name merry numbers 4 Nothing increased
name x numbers 1 Nothing increased
name mass numbers 0 Nothing increased
name santa numbers 4 Bigger!!santa
name hohoho numbers 1 Nothing increased
I have a large script that parses js with a dataframe entry, but to shorten the question, I put what I need in a separate variable.
My variable contains the following value
value = "{from:[3,4],to:[7,4],color:2},{from:[3,6],to:[10,6],color:3}"
I apply the following script and get data like this
import json
import re

value = "{from:[3,4],to:[7,4],color:2},{from:[3,6],to:[10,6],color:3}"

def parse_json(value):
    arr = value.split("},")
    arr = [x+"}" for x in arr]
    arr[-1] = arr[-1][:-1]
    return json.dumps({str(i):add_quotation_marks(x) for i, x in enumerate(arr)})

def add_quotation_marks(value):
    words = re.findall(r'(\w+:)', value)
    for word in words:
        value = value.replace(word[:-1], f'"{word[:-1]}"')
    return json.loads(value)

print(parse_json(value))
{"0": {"from": [3, 4], "to": [7, 4], "color": 2}, "1": {"from": [3, 6], "to": [10, 6], "color": 3}}
The script executes correctly, but I need to get a slightly different result.
This is what the result I want to get looks like:
{
"0": {
"from": {
"0": "3",
"1": "4"
},
"to": {
"0": "7",
"1": "4"
},
"color": "2"
},
"1": {
"from": {
"0": "3",
"1": "6"
},
"to": {
"0": "10",
"1": "6"
},
"color": "3"
}
}
This is valid JSON and valid YAML. How can I produce it?
I'd suggest a regex approach in this case:
import re

res = []
# iterates over each "{from:...,to:...,color:...}" group separately
for obj in re.findall(r'\{([^}]+)}', value):
    item = {}
    # iterates over each "...:..." key-value separately
    for k, v in re.findall(r'(\w+):(\[[^]]+]|\d+)', obj):
        if v.startswith('['):
            v = v.strip('[]').split(',')
        item[k] = v
    res.append(item)
This produces this output in res:
[{'from': ['3', '4'], 'to': ['7', '4'], 'color': '2'}, {'from': ['3', '6'], 'to': ['10', '6'], 'color': '3'}]
Since your values can contain commas, trying to split on commas or other markers is fairly tricky, and using these regexes to match your desired values instead is more stable.
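The list in res still has to be turned into the index-keyed shape asked for in the question. One possible way, as a sketch only, is a small recursive helper that replaces every list with a dictionary keyed by the stringified position. The helper name index_keyed is made up for this example; the regexes are the same two as above.

```python
import json
import re

value = "{from:[3,4],to:[7,4],color:2},{from:[3,6],to:[10,6],color:3}"

# Same two regexes as above: one per "{...}" group, one per key-value pair.
res = []
for obj in re.findall(r'\{([^}]+)}', value):
    item = {}
    for k, v in re.findall(r'(\w+):(\[[^]]+]|\d+)', obj):
        if v.startswith('['):
            v = v.strip('[]').split(',')
        item[k] = v
    res.append(item)


def index_keyed(items):
    """Replace every list with a dict keyed by the stringified position."""
    if isinstance(items, list):
        return {str(i): index_keyed(x) for i, x in enumerate(items)}
    if isinstance(items, dict):
        return {k: index_keyed(v) for k, v in items.items()}
    return items


result = index_keyed(res)
print(json.dumps(result, indent=2))
```

Because the regex captures every value as a string, the numbers come out quoted ("3", "4", ...), which matches the desired result in the question.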
Here's the code that converts the value to your desired output.
import json5  # pip install json5

value = "{from:[3,4],to:[7,4],color:2},{from:[3,6],to:[10,6],color:3}"

def convert(str_value):
    str_value = f"[{str_value}]"  # added [] to make it a valid json
    parsed_value = json5.loads(str_value)  # convert to python object
    output = {}  # create empty dict
    # Loop through the list of dicts. For each item, create a new dict
    # with the index as the key and the value as the value. If the value
    # is a list, convert it to a dict with the index as the key and the
    # value as the value. If the value is not a list, just add it to the dict.
    for i, d in enumerate(parsed_value):
        output[i] = {}
        for k, v in d.items():
            output[i][k] = {j: v[j] for j in range(len(v))} if isinstance(v, list) else v
    return output

print(json5.dumps(convert(value)))
Output
{
"0": {
"from": {
"0": 3,
"1": 4
},
"to": {
"0": 7,
"1": 4
},
"color": 2
},
"1": {
"from": {
"0": 3,
"1": 6
},
"to": {
"0": 10,
"1": 6
},
"color": 3
}
}
The json5 package lets you parse a JavaScript object literal into Python objects, so you don't have to split on "},{" yourself.
I added [ and ] around the string to make it a valid array.
Then the string is loaded with json5.loads().
Now you can loop through the resulting list of dictionaries and convert it to the desired output format.
I want to get the data from a json. I have the idea of a loop to access all levels.
I have only been able to pull data from a single block.
print(output['body']['data'][0]['list'][0]['outUcastPkts'])
How do I get the other data?
import json,urllib.request
data = urllib.request.urlopen("http://172.0.0.0/statistic").read()
output = json.loads(data)
for elt in output['body']['data']:
    print(output['body']['data'][0]['inUcastPktsAll'])
    for elt in output['list']:
        print(output['body']['data'][0]['list'][0]['outUcastPkts'])
{
"body": {
"data": [
{
"inUcastPktsAll": 3100617019,
"inMcastPktsAll": 7567,
"inBcastPktsAll": 8872,
"outPktsAll": 8585575441,
"outUcastPktsAll": 8220240108,
"outMcastPktsAll": 286184143,
"outBcastPktsAll": 79151190,
"list": [
{
"outUcastPkts": 117427359,
"outMcastPkts": 1990586,
"outBcastPkts": 246120
},
{
"outUcastPkts": 0,
"outMcastPkts": 0,
"outBcastPkts": 0
}
]
},
{
"inUcastPktsAll": 8269483865,
"inMcastPktsAll": 2405765,
"inBcastPktsAll": 124466,
"outPktsAll": 3101194852,
"outUcastPktsAll": 3101012296,
"outMcastPktsAll": 173409,
"outBcastPktsAll": 9147,
"list": [
{
"outUcastPkts": 3101012296,
"outMcastPkts": 90488,
"outBcastPkts": 9147
},
{
"outUcastPkts": 0,
"outMcastPkts": 0,
"outBcastPkts": 0
}
]
}
],
"msgs": [ "successful" ]
},
"header": {
"opCode": "1",
"token": "",
"state": "",
"version": 1
}
}
output = json.loads(data) #Type of output is a dictionary.
#Try to use ".get()" method.
print(output.get('body')) #Get values of key 'body'
print(output.get('body').get('data')) #Get a list of key 'data'
If a key doesn't exist, the '.get()' method will return None.
https://docs.python.org/3/library/stdtypes.html#dict.get
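One caveat when chaining .get() calls: if an intermediate key is missing, the first .get() returns None and the next .get() raises an AttributeError. A small helper can walk the keys one at a time and stop early. This is a sketch only; get_path is a hypothetical name, not something from the standard library, and the sample dictionary is a cut-down version of the response in the question.

```python
def get_path(data, *keys, default=None):
    """Walk nested dicts key by key, returning default as soon as one is missing."""
    current = data
    for key in keys:
        if not isinstance(current, dict):
            return default
        current = current.get(key)
        if current is None:
            return default
    return current


output = {"body": {"data": [{"inUcastPktsAll": 3100617019}], "msgs": ["successful"]}}
print(get_path(output, "body", "msgs"))                 # ['successful']
print(get_path(output, "header", "version"))            # None
print(get_path(output, "body", "missing", default=[]))  # []
```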
In python you can easily iterate over the objects of a list like so:
>>> l = [1, 2, 3, 7]
>>> for elem in l:
... print(elem)
...
1
2
3
7
This works regardless of what kind of object you have in the list (integers, tuples, dictionaries). With that in mind, your solution was not far off; you only need to make the following changes:
for entry in output['body']['data']:
    print(entry['inUcastPktsAll'])
    for list_element in entry['list']:
        print(list_element['outUcastPkts'])
This will give you the following for the json object you have provided:
3100617019
117427359
0
8269483865
3101012296
0
I have a Python dictionary with nested keys and values.
How do I find an object's index number, given one of its values?
For now, I can get the values of the keys in a specific object when I know the index of that object.
That is, if I know the object's index number, I can read the keys and values of that specific object.
my_dict = {
"ftml": {
"people": {
"person": [
{
"#id": "Terach",
"#sex": "male",
"Death": {
"#year": ""
},
"Midrash": {
"#midrah": ""
},
"Old": {
"#age": ""
},
"Role": {
"#role": ""
},
"birth": {
"#year": ""
},
"father": {
"#id": "Nachor"
},
"mother": {
"#id": ""
},
"spouse": ""
},
{
"#id": "Avraham",
"#sex": "male",
"Death": {
"#year": "2123"
},
"Grandson": {
"#son1": "Esav",
"#son2": "Yaakov"
},
"Midrash": {
"#midrah": ""
},
"Old": {
"#age": "175"
},
"Role": {
"#role": ""
},
"birth": {
"#year": "1948"
},
"father": {
"#id": "Terach"
},
"mother": {
"#id": ""
},
"spouse": {
"#wife1": "Sara"
}
},
{
"#husband": "Avraham",
"#id": "Sara",
"#sex": "female"
},
{
"#id": "Nachor",
"#sex": "male",
"Death": {
"#year": ""
},
"Midrash": {
"#midrah": ""
},
"Old": {
"#age": ""
},
"Role": {
"#role": ""
},
"birth": {
"#year": ""
},
"father": {
"#id": "Terach"
},
"mother": {
"#id": ""
},
"spouse": ""
},
]
}
}
}
x = int(input("Type the chronological person number. (i.e 1 is Avraham): "))
print("First Name: ",my_dict['ftml']['people']['person'][x]["#id"]) #1 = avraham
I expect to ask the user for the #id and return the object's index number.
For example, if the user sends the program "Avraham" the program will return 1.
If the user is looking for Nachor the program will return 0.
I don't think revising the dict is a good idea.
Here is my solution:
First get the "position" of your list, i.e. what to find from.
list_to_find = my_dict['ftml']['people']['person']
The list_to_find is a list of dict (people), like
[{"#id": "Terach", '#sex': ...}, {"#id": 'Avraham', ...} ...]
Then what you want to do is to search in all the #id, so you can get all the #id by:
ids = [person['#id'] for person in list_to_find]
And then use index to get the index:
index = ids.index('Avraham')
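Note that ids.index(...) raises a ValueError when the name is not in the list. If a missing name should simply give None, one alternative is next() over enumerate(), sketched below. The function name find_person_index is made up for illustration, and the short people list is a trimmed-down stand-in for my_dict['ftml']['people']['person'].

```python
def find_person_index(people, person_id):
    """Return the position of the first person whose "#id" matches, or None."""
    return next(
        (i for i, person in enumerate(people) if person.get("#id") == person_id),
        None,
    )


# Trimmed-down stand-in for my_dict['ftml']['people']['person']
people = [{"#id": "Terach"}, {"#id": "Avraham"}, {"#id": "Sara"}, {"#id": "Nachor"}]
print(find_person_index(people, "Avraham"))  # 1
print(find_person_index(people, "Lot"))      # None
```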
Here I used a dict comprehension together with Python's built-in enumerate() function. It may look a little confusing at first, but you already know how dictionaries work. I have not repeated your my_dict dictionary in this example because it is too large.
>>> obj = {y["#id"]:x for x,y in list(enumerate(my_dict["ftml"]["people"]["person"]))}
>>> obj
{'Terach': 0, 'Avraham': 1, 'Sara': 2, 'Nachor': 3}
The resulting obj is effectively a summary of the list my_dict["ftml"]["people"]["person"]: it maps each #id to its index. For your question this obj is enough; there is no need to walk the long dictionary again, and lookups are fast. If the dict comprehension is confusing, this equivalent loop may be easier to follow.
>>> obj = {}
>>> for x,y in list(enumerate(my_dict["ftml"]["people"]["person"])):
...     obj[y["#id"]] = x
...
>>> obj
{'Terach': 0, 'Avraham': 1, 'Sara': 2, 'Nachor': 3}
If it is not clear what enumerate() does here, check this small example, taken directly from the official documentation.
Return an enumerate object. iterable must be a sequence, an iterator, or some other object which supports iteration. The __next__() method of the iterator returned by enumerate() returns a tuple containing a count (from start which defaults to 0) and the values obtained from iterating over iterable.
>>> seasons = ['Spring', 'Summer', 'Fall', 'Winter']
>>> list(enumerate(seasons))
[(0, 'Spring'), (1, 'Summer'), (2, 'Fall'), (3, 'Winter')]
>>> list(enumerate(seasons, start=1))
[(1, 'Spring'), (2, 'Summer'), (3, 'Fall'), (4, 'Winter')]
In the example above, enumerate() numbers the seasons. I saw your comment on @Joery's answer: you now want to enter a name and get its index.
>>> x = input("Type the chronological person number. (i.e 1 is Avraham): ")
Type the chronological person id. (i.e 1 is Avraham): Avraham
>>> print(obj.get(x, None)) # if not match anything, will return None
1
This 1 means the second element of the my_dict["ftml"]["people"]["person"] list, so you can now easily access it. This is what the get() function does:
Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
In your obj dictionary there are four keys available: Terach, Avraham, Sara and Nachor. When you enter one of these as input, you get 0, 1, 2 or 3 as output. When you enter a value that is not in the obj dictionary, you get the default value of get(), which is None.
Just reverse the dictionary like so:
reversed_dict = {}
for i in range(0,len(my_dict['ftml']['people']['person'])):
    reversed_dict[my_dict['ftml']['people']['person'][i]['#id']] = {'index':i}
print(reversed_dict)
print(reversed_dict['Avraham']['index'])
This should give you the output:
{'Terach': {'index': 0}, 'Avraham': {'index': 1}, 'Sara': {'index': 2}, 'Nachor': {'index': 3}}
1
The simplest thing would probably be:
my_dict = {}  # Defined in the question
x = input("Type the person's #id (e.g. Avraham): ")
persons = my_dict['ftml']['people']['person']
for i, v in enumerate(persons):
    if v['#id'] == x:
        break
# i now holds the index of the matching person (or the last index if nothing matched)
print(i)
Your intention: "For example, if the user sends the program "Avraham" the program will return 1. If the user is looking for Nachor the program will return 0." is implemented by this. However, the above would work in reverse, as the iteration goes from top to bottom in this representation...
reversed(persons)... :)
The input data looks like the sample below, but the real list contains thousands of dictionaries, and serial_ids are repeated throughout the list.
[{
"serial_id": 1,
"name": "ABC"
},
{
"serial_id": 6,
"name": "DEF"
},
{
"serial_id": 8,
"name": "GHI"
},
{
"serial_id": 0,
"name": "JKL"
},
{
"serial_id": 6,
"name": "VVV"
}]
Now, I know the range of serial_id but I don't want to hardcode it.
My task is to find the total number of users (i.e. name_count, basically) per serial_id. Ideally I would get a table-like structure, sorted in descending order, with the columns serial_id and user_count.
Questions are:
Can we make use of the DataFrame concept? If possible, I would like to.
I am unable to find any method that achieves the required output.
Thanks in advance!
Since the JSON data is pulled from an API, below is the code I tried, which failed badly.
#Python libraries
import numpy as np
import pandas as pd
from pandas import DataFrame, Series
from collections import Counter
url1 = 'INPUT URL'
#print ('Retrieving',url1)
#uh = urllib2.urlopen(url1)
r = requests.get(url1)
r = r.text
#print r
#print ('Retrieved', len(r), 'characters')
try:js = json.loads(r) # js -> Native Python list
except:js = None
#print js
info = json.dumps(js , indent =4) #Prints out the JSON data in a nice format which we call as "Pretty Print"
#print (info)
'''
#print ('User Count:' , len(info))
for item in (js):
print ('Name' , item["name"])
'''
'''
user_count = 0
for item in (js):
#df = {'serial_id': Series[item["affiliate_id"]]} //ERROR
df = DataFrame({'serial_id': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]})
#Hard-coded the serial_id since we know the range of the affiliate_id
print(df)
Let's use Pandas dataframes:
from io import StringIO
import pandas as pd
jstring = StringIO("""[{
"serial_id": 1,
"name": "ABC"
},
{
"serial_id": 6,
"name": "DEF"
},
{
"serial_id": 8,
"name": "GHI"
},
{
"serial_id": 0,
"name": "JKL"
},
{
"serial_id": 6,
"name": "VVV"
}]""")
df = pd.read_json(jstring)
df_out = df.groupby('serial_id')['name'].count().reset_index(name='name_count')
print(df_out)
Output:
serial_id name_count
0 0 1
1 1 1
2 6 2
3 8 1
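The question also asked for the counts in descending order. With the DataFrame above, df_out.sort_values('name_count', ascending=False) should give that ordering. If pandas is not a hard requirement, a standard-library sketch with collections.Counter does the same job; the variable names and the column header user_count below are only illustrative.

```python
from collections import Counter

records = [
    {"serial_id": 1, "name": "ABC"},
    {"serial_id": 6, "name": "DEF"},
    {"serial_id": 8, "name": "GHI"},
    {"serial_id": 0, "name": "JKL"},
    {"serial_id": 6, "name": "VVV"},
]

# Count how many records share each serial_id.
counts = Counter(record["serial_id"] for record in records)

# most_common() orders the pairs from the highest count to the lowest.
print("serial_id  user_count")
for serial_id, user_count in counts.most_common():
    print(f"{serial_id:>9}  {user_count:>10}")
```

No range of serial_id values has to be hard-coded; the counter only contains the ids that actually occur in the data.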
As the question explains the problem, I've been trying to generate nested JSON object. In this case I have for loops getting the data out of dictionary dic. Below is the code:
f = open("test_json.txt", 'w')
flag = False
temp = ""
start = "{\n\t\"filename\"" + " : \"" +initial_filename+"\",\n\t\"data\"" +" : " +" [\n"
end = "\n\t]" +"\n}"
f.write(start)
for i, (key,value) in enumerate(dic.iteritems()):
f.write("{\n\t\"keyword\":"+"\""+str(key)+"\""+",\n")
f.write("\"term_freq\":"+str(len(value))+",\n")
f.write("\"lists\":[\n\t")
for item in value:
f.write("{\n")
f.write("\t\t\"occurance\" :"+str(item)+"\n")
#Check last object
if value.index(item)+1 == len(value):
f.write("}\n")
f.write("]\n")
else:
f.write("},") # close occurrence object
# Check last item in dic
if i == len(dic)-1:
flag = True
if(flag):
f.write("}")
else:
f.write("},") #close lists object
flag = False
#check for flag
f.write("]") #close lists array
f.write("}")
Expected output is:
{
"filename": "abc.pdf",
"data": [{
"keyword": "irritation",
"term_freq": 5,
"lists": [{
"occurance": 1
}, {
"occurance": 1
}, {
"occurance": 1
}, {
"occurance": 1
}, {
"occurance": 2
}]
}, {
"keyword": "bomber",
"lists": [{
"occurance": 1
}, {
"occurance": 1
}, {
"occurance": 1
}, {
"occurance": 1
}, {
"occurance": 2
}],
"term_freq": 5
}]
}
But currently I'm getting an output like below:
{
"filename": "abc.pdf",
"data": [{
"keyword": "irritation",
"term_freq": 5,
"lists": [{
"occurance": 1
}, {
"occurance": 1
}, {
"occurance": 1
}, {
"occurance": 1
}, {
"occurance": 2
},] // Here lies the problem "," before array(last element)
}, {
"keyword": "bomber",
"lists": [{
"occurance": 1
}, {
"occurance": 1
}, {
"occurance": 1
}, {
"occurance": 1
}, {
"occurance": 2
},], // Here lies the problem "," before array(last element)
"term_freq": 5
}]
}
Please help. I've been trying to solve this but have failed. Please don't mark it as a duplicate, since I have already checked other answers and they didn't help at all.
Edit 1:
Input is basically taken from a dictionary dic whose mapping type is <String, List>
for example: "irritation" => [1,3,5,7,8]
where irritation is the key, and mapped to a list of page numbers.
This is basically read in the outer for loop where key is the keyword and value is a list of pages of occurrence of that keyword.
Edit 2:
dic = collections.defaultdict(list) # declaring the variable dictionary
dic[key].append(value) # inserting the values - useless to tell here
for key in dic:
    # Here dic[key] is the list stored for each key
    print key,":",dic[key],"\n" #prints the data in dictionary
What @andrea-f suggests looks good to me; here is another solution.
Feel free to pick from both :)
import json

dic = {
    "bomber": [1, 2, 3, 4, 5],
    "irritation": [1, 3, 5, 7, 8]
}
filename = "abc.pdf"
json_dict = {}
data = []
for k, v in dic.items():
    tmp_dict = {}
    tmp_dict["keyword"] = k
    tmp_dict["term_freq"] = len(v)
    tmp_dict["lists"] = [{"occurrance": i} for i in v]
    data.append(tmp_dict)
json_dict["filename"] = filename
json_dict["data"] = data
with open("abc.json", "w") as outfile:
    json.dump(json_dict, outfile, indent=4, sort_keys=True)
It's the same idea: I first build one big json_dict that can be saved directly as JSON. I use the with statement to write the file so that it is closed properly without an explicit try/except.
Also, have a look at the documentation of json.dumps() if you need further improvements to your JSON output.
EDIT
And just for fun, if you don't like temporary variables, you can build the whole data list in a one-liner :)
json_dict["data"] = [{"keyword": k, "term_freq": len(v), "lists": [{"occurrance": i} for i in v]} for k, v in dic.items()]
The final solution could then look like this, which is not entirely readable:
import json

json_dict = {
    "filename": "abc.pdf",
    "data": [{
        "keyword": k,
        "term_freq": len(v),
        "lists": [{"occurrance": i} for i in v]
    } for k, v in dic.items()]
}
with open("abc.json", "w") as outfile:
    json.dump(json_dict, outfile, indent=4, sort_keys=True)
EDIT 2
It looks like you don't want to save your JSON in the desired format, but to be able to read it.
In fact, you can also use json.dumps() to print your JSON.
with open('abc.json', 'r') as handle:
    new_json_dict = json.load(handle)
print(json.dumps(new_json_dict, indent=4, sort_keys=True))
There is still one problem here though, "filename": is printed at the end of the list because the d of data comes before the f.
To force the order, you will have to use an OrderedDict in the generation of the dict. Be careful the syntax is ugly (imo) with python 2.X
Here is the new complete solution ;)
import json
from collections import OrderedDict

dic = {
    'bomber': [1, 2, 3, 4, 5],
    'irritation': [1, 3, 5, 7, 8]
}
json_dict = OrderedDict([
    ('filename', 'abc.pdf'),
    ('data', [ OrderedDict([
        ('keyword', k),
        ('term_freq', len(v)),
        ('lists', [{'occurrance': i} for i in v])
    ]) for k, v in dic.items()])
])
with open('abc.json', 'w') as outfile:
    json.dump(json_dict, outfile)

# Now read the ordered JSON file back
with open('abc.json', 'r') as handle:
    new_json_dict = json.load(handle, object_pairs_hook=OrderedDict)
print(json.dumps(new_json_dict, indent=4))
Will output:
{
"filename": "abc.pdf",
"data": [
{
"keyword": "bomber",
"term_freq": 5,
"lists": [
{
"occurrance": 1
},
{
"occurrance": 2
},
{
"occurrance": 3
},
{
"occurrance": 4
},
{
"occurrance": 5
}
]
},
{
"keyword": "irritation",
"term_freq": 5,
"lists": [
{
"occurrance": 1
},
{
"occurrance": 3
},
{
"occurrance": 5
},
{
"occurrance": 7
},
{
"occurrance": 8
}
]
}
]
}
But be careful: most of the time it is better to save a regular .json file so that it stays portable across languages.
Your current code is not working because the loop processes the second-to-last item and writes "},"; when the loop runs again it sets the flag to False, but on the previous pass it had already added a "," because it assumed another element would follow.
If this is your dict: a = {"bomber":[1,2,3,4,5]} then you can do:
import json

file_name = "a_file.json"
file_name_input = "abc.pdf"
new_output = {}
new_output["filename"] = file_name_input
new_data = []
i = 0
for key, val in a.items():
    new_data.append({"keyword":key, "lists":[], "term_freq":len(val)})
    for p in val:
        new_data[i]["lists"].append({"occurrance":p})
    i += 1
new_output['data'] = new_data
Then save the data by:
f = open(file_name, 'w+')
f.write(json.dumps(new_output, indent=4, sort_keys=True, default=unicode))
f.close()
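For reference, the same approach can be sketched for Python 3, where dictionaries keep insertion order (3.7+), so "filename" stays first without an OrderedDict. The function name build_report is made up for this example; the key is spelled "occurance" to match the expected output in the question.

```python
import json


def build_report(filename, dic):
    """Build the nested structure as plain Python objects, then let json serialize it."""
    return {
        "filename": filename,
        "data": [
            {
                "keyword": keyword,
                "term_freq": len(pages),
                "lists": [{"occurance": page} for page in pages],
            }
            for keyword, pages in dic.items()
        ],
    }


report = build_report("abc.pdf", {"irritation": [1, 3, 5, 7, 8]})
text = json.dumps(report, indent=4)
print(text)

# Round-trip check: the serializer never emits a trailing comma.
assert json.loads(text) == report
```

Since the commas and brackets are produced by json.dumps rather than by hand, the trailing-comma problem from the question cannot occur.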