I have some CSV data that needs to be converted to a specific JSON format.
I have written code that works for some nesting levels, but not as required.
This is my CSV data:
title context answers question id
tit1 con1 text1 que1 id1
tit1 con1 text2 que2 id2
tit2 con2 text3 que3 id3
tit2 con2 text4 que4 id4
tit2 con3 text5 que5 id5
my code:
import pandas as pd

df = pd.read_csv('processedOutput.csv')

finalList = []
finalDict = {}
grouped = df.groupby(['context'])
for key, value in grouped:
    dictionary = {}
    j = grouped.get_group(key).reset_index(drop=True)
    dictionary['context'] = j.at[0, 'context']

    dictList = []
    anotherDict = {}
    for i in j.index:
        anotherDict['answers'] = j.at[i, 'answers']
        anotherDict['question'] = j.at[i, 'question']
        anotherDict['id'] = j.at[i, 'id']
        dictList.append(anotherDict)
    dictionary['qas'] = dictList
    finalList.append(dictionary)

import json
data = json.dumps(finalList)
The output structure is fine, but each group repeats only its last element:
[{"context": "con1",
"qas": [
{"answers": "text2", "question": "que2", "id": "id2"},
{"answers": "text2", "question": "que2", "id": "id2"}
]
},
{"context": "con2",
"qas": [
{"answers": "text4", "question": "que4", "id": "id4"},
{"answers": "text4", "question": "que4", "id": "id4"}
]
},
{"context": "con3",
"qas": [
{"answers": "text5", "question": "que5", "id": "id5"}
]
}
]
I want the data nested one more level, with all fields, as below:
[
  {
    "title": "tit1",
    "paragraph": [
      {
        "context": "con1",
        "qas": [
          {"answers": "text1", "question": "que1", "id": "id1"},
          {"answers": "text2", "question": "que2", "id": "id2"}
        ]
      }
    ]
  },
  {
    "title": "tit2",
    "paragraph": [
      {
        "context": "con2",
        "qas": [
          {"answers": "text3", "question": "que3", "id": "id3"},
          {"answers": "text4", "question": "que4", "id": "id4"}
        ]
      },
      {
        "context": "con3",
        "qas": [
          {"answers": "text5", "question": "que5", "id": "id5"}
        ]
      }
    ]
  }
]
I've been stuck on this for a long time; any suggestions would be great.
Your output data needs three levels of grouping: title, paragraph, and q&a's. I would recommend using df.groupby(['title', 'context', 'answers']) to drive the loop.
Then, within the loop, each group constitutes one q&a dictionary (assuming
the id column contains unique values only). To build the higher-level structure,
all it takes is some bookkeeping to detect level changes and add to the appropriate list and dictionary. We'll use nested groupby levels to do this:
...
g1 = df.groupby('title')
for k1, v1 in g1:
    l2_para_list = []
    l4_qas_list = []
    g2 = v1.groupby('context')
    for k2, v2 in g2:
        g3 = v2.groupby('answers')
        for _, v3 in g3:
            qas_dict = {}
            qas_dict['answers'] = v3.answers.item()
            qas_dict['question'] = v3.question.item()
            qas_dict['id'] = v3.id.item()
            l4_qas_list.append(qas_dict)
        l3_para_dict = {}
        l3_para_dict['context'] = k2
        l3_para_dict['qas'] = l4_qas_list
        l4_qas_list = []
        l2_para_list.append(l3_para_dict)
        l3_para_dict = {}
    l1_title_dict = {}
    l1_title_dict['title'] = k1
    l1_title_dict['paragraph'] = l2_para_list
    finalList.append(l1_title_dict)
    l1_title_dict = {}
    l2_para_list = []
print(json.dumps(finalList))
...
Output (formatted for presentation)
[{"title": "tit1", "paragraph":
[{"context": "con1",
"qas": [{"answers": "text1", "question": "que1", "id": "id1"},
{"answers": "text2", "question": "que2", "id": "id2"}]}]},
{"title": "tit2", "paragraph":
[{"context": "con2",
"qas": [{"answers": "text3", "question": "que3", "id": "id3"},
{"answers": "text4", "question": "que4", "id": "id4"}]},
{"context": "con3",
"qas": [{"answers": "text5", "question": "que5", "id": "id5"}]}]}]
Related
I am fetching an API and trying to write the response into a CSV, but the catch is that the response is a multilevel dict or JSON; when I convert it into a CSV, most of the columns end up looking like lists of dicts, or dicts.
I am trying this:
def expand(data):
    d = pd.Series(data)
    t = d.index
    for i in t:
        if type(d[i]) in (list, dict):
            expend_s = pd.Series(d[i])
            t.append(expend_s.index)
            d = d.append(expend_s)
            d = d.drop([i])
    return d

df['person'].apply(expand)
but this solution is not working. If we look at the person column, there are multiple dicts or lists of dicts, like:
"birthDate": "0000-00-00",
"genderCode": {
"codeValue": "M",
"shortName": "Male",
"longName": "Male"
},
"maritalStatusCode": {
"codeValue": "M",
"shortName": "Married"
},
"disabledIndicator": False,
"preferredName": {},
"ethnicityCode": {
"codeValue": "4",
"shortName": "4",
"longName": "Not Hispanic or Latino"
},
"raceCode": {
"identificationMethodCode": {},
"codeValue": "1",
"shortName": "White",
"longName": "White"
},
"militaryClassificationCodes": [],
"governmentIDs": [
{
"itemID": "9200037107708_4385",
"idValue": "XXX-XX-XXXX",
"nameCode": {
"codeValue": "SSN",
"longName": "Social Security Number"
},
"countryCode": "US"
}
],
"legalName": {
"givenName": "Jack",
"middleName": "C",
"familyName1": "Abele",
"formattedName": "Abele, Jack C"
},
"legalAddress": {
"nameCode": {
"codeValue": "Personal Address 1",
"shortName": "Personal Address 1",
"longName": "Personal Address 1"
},
"lineOne": "1932 Keswick Lane",
"cityName": "Concord",
"countrySubdivisionLevel1": {
"subdivisionType": "StateTerritory",
"codeValue": "CA",
"shortName": "California"
},
"countryCode": "US",
"postalCode": "94518"
},
"communication": {
"mobiles": [
{
"itemID": "9200037107708_4389",
"nameCode": {
"codeValue": "Personal Cell",
"shortName": "Personal Cell"
},
"countryDialing": "1",
"areaDialing": "925",
"dialNumber": "6860589",
"access": "1",
"formattedNumber": "(925) 686-0589"
}
]
}
}
Your suggestions and advice would be very helpful.
I think we can handle the plain-dict columns with pd.json_normalize, and the list-of-dict columns with the function below:
import ast
import pandas as pd

def df_list_and_dict_col(explode_df: pd.DataFrame, primary_key: str,
                         col_name: str, folder: str):
    """Convert a column of lists of dicts into a clean, flattened CSV.

    Keyword arguments:
    -----------------
    explode_df -- dataframe containing the column to expand
    primary_key -- column used to keep rows linkable across output files
    col_name -- name of the column to expand
    folder -- output folder for the generated CSV
    """
    explode_df[col_name] = explode_df[col_name].replace('', '[]', regex=True)
    explode_df[col_name] = explode_df[col_name].fillna('[]')
    # make sure the entire column is string before literal_eval
    explode_df[col_name] = explode_df[col_name].astype('string')
    explode_df[col_name] = explode_df[col_name].apply(ast.literal_eval)
    explode_df = explode_df.explode(col_name)
    explode_df = explode_df.reset_index(drop=True)
    normalized_df = pd.json_normalize(explode_df[col_name])
    explode_df = explode_df.join(
        other=normalized_df,
        lsuffix="_left",
        rsuffix="_right"
    )
    explode_df = explode_df.drop(columns=col_name)
    # recurse into any columns that still contain lists
    type_df = explode_df.applymap(type)
    col_list = []
    for col in type_df.columns:
        if (type_df[col] == type([])).any():
            col_list.append(col)
    # print(col_list, explode_df.columns)
    if len(col_list) != 0:
        for col in col_list:
            df_list_and_dict_col(explode_df[[primary_key, col]], primary_key,
                                 col, folder)
            explode_df.drop(columns=col, inplace=True)
            print(f'{col}.csv is done')
    explode_df.to_csv(f'{folder}/{col_name}.csv')
First we find the list columns and pass them to the function one by one; the function then checks whether there is any list left inside the expanded columns, recurses, and saves everything to CSV:
type_df = df.applymap(type)
col_list = []
for col in type_df.columns:
    if (type_df[col] == type([])).any():
        col_list.append(col)

for col in col_list:
    # print(col, df[['associateOID', col]])
    df_list_and_dict_col(df[['primary_key', col]].copy(), 'primary_key', col, folder='worker')
    df.drop(columns=col, inplace=True)
Now you have multiple CSVs in normalized form.
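For the plain-dict columns mentioned above, pd.json_normalize alone does the flattening; a minimal sketch on a trimmed-down record shaped like the person sample (field names taken from the question):

import pandas as pd

person = {
    "birthDate": "0000-00-00",
    "genderCode": {"codeValue": "M", "shortName": "Male"},
    "legalName": {"givenName": "Jack", "familyName1": "Abele"},
}

flat = pd.json_normalize(person)  # nested dicts become dotted columns
print(flat.columns.tolist())
# ['birthDate', 'genderCode.codeValue', 'genderCode.shortName',
#  'legalName.givenName', 'legalName.familyName1']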
This question was edited. Please see the edit at the bottom first.
This question is going to be a bit long so I'm sorry in advance. Please consider two different types of data:
Data A:
{
"files": [
{
"name": "abc",
"valid": [
"func4",
"func1",
"func3"
],
"invalid": [
"func2",
"func8"
]
}
]
}
Data B:
{
"files": [
{
"methods": {
"invalid": [
"func2",
"func8"
],
"valid": [
"func4",
"func1",
"func3"
]
},
"classes": [
{
"invalid": [
"class1",
"class2"
],
"valid": [
"class8",
"class5"
],
"name": "class1"
}
],
"name": "abc"
}
]
}
I'm trying to merge each file (A files with A and B files with B). A previous question helped me figure out how to do it, but I got stuck again.
As I said in the previous question, there is a rule for merging the files. I'll explain it again:
Consider two dictionaries A1 and A2. I want to merge the invalid lists of A1 and A2, and the valid lists of A1 and A2. The merge itself would be easy enough, but the invalid and valid data depend on each other.
The dependency rule: if an item x is valid in A1 and invalid in A2, then it is valid in the merged report.
The only way for an item to be invalid is to be in the invalid list of both A1 and A2 (or to be invalid in one of them while not existing in the other).
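In set terms (with a1 and a2 being two blocks that each carry "valid"/"invalid" lists), the rule boils down to something like this sketch:

# "valid anywhere" wins; "invalid" only survives if never seen as valid
merged_valid = set(a1["valid"]) | set(a2["valid"])
merged_invalid = (set(a1["invalid"]) | set(a2["invalid"])) - merged_valid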
In order to merge the A files I wrote the following code:
def merge_A_files(self, src_report):
    for current_file in src_report["files"]:
        filename_index = next((index for (index, d) in enumerate(self.A_report["files"])
                               if d["name"] == current_file["name"]), None)
        if filename_index is None:
            new_block = {}
            new_block['valid'] = current_file['valid']
            new_block['invalid'] = current_file['invalid']
            new_block['name'] = current_file['name']
            self.A_report["files"].append(new_block)
        else:
            block_to_merge = self.A_report["files"][filename_index]
            merged_block = {'valid': [], 'invalid': []}
            merged_block['valid'] = list(set(block_to_merge['valid'] + current_file['valid']))
            merged_block['invalid'] = list({i for l in [block_to_merge['invalid'], current_file['invalid']]
                                            for i in l if i not in merged_block['valid']})
            merged_block['name'] = current_file['name']
            self.A_report["files"][filename_index] = merged_block
For merging B files I wrote:
def _merge_functional_files(self, src_report):
    for current_file in src_report["files"]:
        filename_index = next((index for (index, d) in enumerate(self.B_report["files"])
                               if d["name"] == current_file["name"]), None)
        if filename_index is None:
            new_block = {'methods': {}, 'classes': []}
            new_block['methods']['valid'] = current_file['methods']['valid']
            new_block['methods']['invalid'] = current_file['methods']['invalid']
            new_block['classes'] += [{'valid': c['valid'], 'invalid': c['invalid'], 'name': c['name']} for c in current_file['classes']]
            new_block['name'] = current_file['name']
            self.B_report["files"].append(new_block)
        else:
            block_to_merge = self.B_report["files"][filename_index]
            merged_block = {'methods': {}, 'classes': []}
            for current_class in block_to_merge["classes"]:
                current_classname = current_class.get("name")
                class_index = next((index for (index, d) in enumerate(merged_block["classes"])
                                    if d["name"] == current_classname), None)
                if class_index is None:
                    merged_block['classes'] += [{'valid': c['valid'], 'invalid': c['invalid'], 'name': c['name']} for c in current_file['classes']]
                else:
                    class_block_to_merge = merged_block["classes"][class_index]
                    class_merged_block = {'valid': [], 'invalid': []}
                    class_merged_block['valid'] = list(set(class_block_to_merge['valid'] + current_class['valid']))
                    class_merged_block['invalid'] = list({i for l in [class_block_to_merge['invalid'], current_class['invalid']]
                                                          for i in l if i not in class_merged_block['valid']})
                    class_merged_block['name'] = current_classname
                    merged_block["classes"][filename_index] = class_merged_block
            merged_block['methods']['valid'] = list(set(block_to_merge['methods']['valid'] + current_file['methods']['valid']))
            merged_block['methods']['invalid'] = list({i for l in [block_to_merge['methods']['invalid'], current_file['methods']['invalid']]
                                                       for i in l if i not in merged_block['methods']['valid']})
            merged_block['name'] = current_file['name']
            self.B_report["files"][filename_index] = merged_block
The code for A appears valid and works as expected, but I have a problem with B, especially with merging classes. The example I have a problem with:
First file:
{
"files": [
{
"name": "some_file1",
"methods": {
"valid": [
"func4",
"func1"
],
"invalid": [
"func3"
]
},
"classes": [
{
"name": "class1",
"valid": [
"class1",
"class2"
],
"invalid": [
"class3",
"class5"
]
}
]
}
]
}
Second file:
{
"files": [
{
"name": "some_file1",
"methods": {
"valid": [
"func4",
"func1",
"func3"
],
"invalid": [
"func2",
"func8"
]
},
"classes": [
{
"name": "class1",
"valid": [
"class8",
"class5"
],
"invalid": [
"class1",
"class2"
]
}
]
}
]
}
I get:
{
"files": [
{
"methods": {
"invalid": [
"func2",
"func8"
],
"valid": [
"func3",
"func1",
"func4"
]
},
"classes": [
{
"invalid": [
"class5",
"class3"
],
"valid": [
"class2",
"class1"
],
"name": "class1"
}
],
"name": "some_file1"
}
]
}
But this is wrong because, for example, class5 should be valid.
So my questions are:
I would love another set of eyes to check my code and help me find the reason for this issue.
These two methods have become so complicated that they are hard to debug. I would love to see an alternative, less complicated way to achieve this. Maybe a generic solution?
Edit: My first explanation was too complicated. I'll try to explain what I'm trying to achieve. For those of you who read the whole topic (I appreciate it!), please forget about data type A (for simplicity). Consider data files of type B (shown at the start). I'm trying to merge a bunch of B files. As I understand it, the algorithm is:
Iterate over files.
Check if the file already exists in the merged result.
If not, add the file block to the files array.
If it does:
Merge the methods dictionary.
Merge the classes array.
To merge methods: a method is invalid only if it is invalid in both blocks. Otherwise, it is valid.
To merge classes: this gets more complicated because it's an array. I want to follow the same rule as for methods, but I need to find the index of each class block in the array first.
The main problem is with merging classes. Can you please suggest a non-complicated way to merge B-type files?
It would be great if you could provide the expected output for the example you're showing. Based on my understanding, what you're trying to achieve is:
You're given multiple JSON files, each containing a "files" entry, which is a list of dictionaries with the structure:
{
"name": "file_name",
"methods": {
"invalid": ["list", "of", "names"],
"valid": ["list", "of", "names"]
},
"classes": [
{
"name": "class_name",
"invalid": ["list", "of", "names"],
"valid": ["list", "of", "names"]
}
]
}
You wish to merge structures from multiple files, so that file entries with the same "name" are merged together, according to the following rule:
For each name inside "methods": it goes into "valid" if it is in the "valid" array of at least one file entry; otherwise it goes into "invalid".
Classes with the same "name" are also merged together, and names inside the "valid" and "invalid" arrays are merged according to the above rule.
The following analysis of your code assumes my understanding as stated above. Let's look at the code snippet for merging classes:
block_to_merge = self.B_report["files"][filename_index]
merged_block = {'methods': {}, 'classes': []}
for current_class in block_to_merge["classes"]:
    current_classname = current_class.get("name")
    class_index = next((index for (index, d) in enumerate(merged_block["classes"])
                        if d["name"] == current_classname), None)
    if class_index is None:
        merged_block['classes'] += [{'valid': c['valid'], 'invalid': c['invalid'], 'name': c['name']} for c in current_file['classes']]
    else:
        class_block_to_merge = merged_block["classes"][class_index]
        class_merged_block = {'valid': [], 'invalid': []}
        class_merged_block['valid'] = list(set(class_block_to_merge['valid'] + current_class['valid']))
        class_merged_block['invalid'] = list({i for l in [class_block_to_merge['invalid'], current_class['invalid']]
                                              for i in l if i not in class_merged_block['valid']})
        class_merged_block['name'] = current_classname
        merged_block["classes"][filename_index] = class_merged_block
The code is logically incorrect because:
You're iterating over each class dictionary from block_to_merge["classes"], which is the previous merged block.
The new merged block (merged_block) is initialized to an empty dictionary.
In the case where class_index is None, the class dictionaries appended to merged_block are copied wholesale from the current file entry.
If you think about it, class_index will always be None, because current_class is enumerated from block_to_merge["classes"] while merged_block starts out empty. Thus the actual merge branch never runs, and what gets written into merged_block is just a wholesale copy of one file entry's "classes". In your example, you can verify that the "classes" entry is exactly the same as that in the first file.
That said, your overall idea of how to merge the files is correct, but implementation-wise it could be a lot simpler (and more efficient). I'll first point out the non-optimal parts of your code, and then provide a simpler solution.
You're directly storing the data in its output form, however, it's not a form that is efficient for your task. It's perfectly fine to store them in a form that is efficient, and then apply post-processing to transform it into the output form. For instance:
You're using next to find an existing entry in the list with the same "name", which takes linear time. Instead, you can store these entries in a dictionary, with "name" as the key.
You're also storing valid & invalid names as lists. While merging, each list is converted into a set and then back into a list, which produces a large number of redundant copies. Instead, you can just keep them as sets.
You have some duplicate routines that could have been extracted into functions, but instead you rewrote them wherever needed. This violates the DRY principle and increases your chances of introducing bugs.
A revised version of the code is as follows:
class Merger:
    def __init__(self):
        # A structure optimized for efficiency:
        # dict (file_name) -> {
        #     "methods": {
        #         "valid": set(names),
        #         "invalid": set(names),
        #     }
        #     "classes": dict (class_name) -> {
        #         "valid": set(names),
        #         "invalid": set(names),
        #     }
        # }
        self.file_dict = {}

    def _create_entry(self, new_entry):
        return {
            "valid": set(new_entry["valid"]),
            "invalid": set(new_entry["invalid"]),
        }

    def _merge_entry(self, merged_entry, new_entry):
        merged_entry["valid"].update(new_entry["valid"])
        merged_entry["invalid"].difference_update(new_entry["valid"])
        for name in new_entry["invalid"]:
            if name not in merged_entry["valid"]:
                merged_entry["invalid"].add(name)

    def merge_file(self, src_report):
        # Method called to merge one file.
        for current_file in src_report["files"]:
            file_name = current_file["name"]
            # Merge methods.
            if file_name not in self.file_dict:
                self.file_dict[file_name] = {
                    "methods": self._create_entry(current_file["methods"]),
                    "classes": {},
                }
            else:
                self._merge_entry(self.file_dict[file_name]["methods"], current_file["methods"])
            # Merge classes.
            file_class_entry = self.file_dict[file_name]["classes"]
            for class_entry in current_file["classes"]:
                class_name = class_entry["name"]
                if class_name not in file_class_entry:
                    file_class_entry[class_name] = self._create_entry(class_entry)
                else:
                    self._merge_entry(file_class_entry[class_name], class_entry)

    def post_process(self):
        # Method called after all files are merged; returns the data in its output form.
        return [
            {
                "name": file_name,
                "methods": {
                    "valid": list(file_entry["methods"]["valid"]),
                    "invalid": list(file_entry["methods"]["invalid"]),
                },
                "classes": [
                    {
                        "name": class_name,
                        "valid": list(class_entry["valid"]),
                        "invalid": list(class_entry["invalid"]),
                    }
                    for class_name, class_entry in file_entry["classes"].items()
                ],
            }
            for file_name, file_entry in self.file_dict.items()
        ]
We can test the implementation by:
def main():
    a = {
        "files": [
            {
                "name": "some_file1",
                "methods": {
                    "valid": ["func4", "func1"],
                    "invalid": ["func3"]
                },
                "classes": [
                    {
                        "name": "class1",
                        "valid": ["class1", "class2"],
                        "invalid": ["class3", "class5"]
                    }
                ]
            }
        ]
    }
    b = {
        "files": [
            {
                "name": "some_file1",
                "methods": {
                    "valid": ["func4", "func1", "func3"],
                    "invalid": ["func2", "func8"]
                },
                "classes": [
                    {
                        "name": "class1",
                        "valid": ["class8", "class5"],
                        "invalid": ["class1", "class2"]
                    }
                ]
            }
        ]
    }

    import pprint
    merge = Merger()
    merge.merge_file(a)
    merge.merge_file(b)
    output = merge.post_process()
    pprint.pprint(output)


if __name__ == '__main__':
    main()
The output is:
[{'classes': [{'invalid': ['class3'],
'name': 'class1',
'valid': ['class2', 'class5', 'class8', 'class1']}],
'methods': {'invalid': ['func2', 'func8'],
'valid': ['func1', 'func4', 'func3']},
'name': 'some_file1'}]
I am trying to create a particular nested dictionary from a Pandas DataFrame, in order to then visualize it.
dat = pd.DataFrame({'cat_1': ['marketing', 'marketing', 'marketing', 'communications'],
                    'child_cat': ['marketing', 'social media', 'marketing', 'communications'],
                    'skill': ['digital marketing', 'media marketing', 'research', 'seo'],
                    'value': ['80', '101', '35', '31']})
and I would like to turn this into a dictionary that looks a bit like this:
{
"name": "general skills",
"children": [
{
"name": "marketing",
"children": [
{
"name": "marketing",
"children": [
{
"name": "digital marketing",
"value": 80
},
{
"name": "research",
"value": 35
}
]
},
{
"name": "social media", // notice that this is a sibling of the parent marketing
"children": [
{
"name": "media marketing",
"value": 101
}
]
}
]
},
{
"name": "communications",
"children": [
{
"name": "communications",
"children": [
{
"name": "seo",
"value": 31
}
]
}
]
}
]
}
So cat_1 is the parent node, child_cat is its child, and skill is a child of child_cat. I am having trouble creating the additional children lists. Any help?
With a lot of inefficiency I came up with the solution below. It is probably highly sub-optimal:
final = {}
# control dict to get only one broad category
contrl_dict = {}
contrl_dict['dummy'] = None
final['name'] = 'variants'
final['children'] = []

# line is the values of each row
for idx, line in enumerate(df_dict.values):
    # parent categories dict
    broad_dict_1 = {}
    print(line)
    # this takes every value of the row minus the value at the end
    for jdx, col in enumerate(line[:-1]):
        # look into the broad category first
        if jdx == 0:
            # check in our control dict - does this category exist? if not, add it and continue
            if col not in contrl_dict:
                # if it doesn't, append it
                contrl_dict[col] = 'added'
                # then the broad dict parent takes the name
                broad_dict_1['name'] = col
                # the children are the child broad categories, populated below
                broad_dict_1['children'] = []
                # go to broad categories 2
                for ydx, broad_2 in enumerate(list(df_dict[df_dict.broad_categories == col].broad_2.unique())):
                    # sub categories dict
                    prov_dict = {}
                    prov_dict['name'] = broad_2
                    # children is again a list
                    prov_dict['children'] = []
                    # now isolate the skills and values of each broad_2 category and append them
                    for row in df_dict[df_dict.broad_2 == broad_2].values:
                        prov_d_3 = {}
                        # in each row, values 2 and 3 are name and value respectively
                        for xdx, direct in enumerate(row):
                            if xdx == 2:
                                prov_d_3['name'] = direct
                            if xdx == 3:
                                prov_d_3['size'] = direct
                        prov_dict['children'].append(prov_d_3)
                    broad_dict_1['children'].append(prov_dict)
            # if it already exists in the control dict then move on
            else:
                continue
    final['children'].append(broad_dict_1)
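For reference, a more direct sketch of the same tree using nested groupby, assuming the column names from the question (cat_1, child_cat, skill, value) and integer values so they serialize as numbers:

import pandas as pd

dat = pd.DataFrame({'cat_1': ['marketing', 'marketing', 'marketing', 'communications'],
                    'child_cat': ['marketing', 'social media', 'marketing', 'communications'],
                    'skill': ['digital marketing', 'media marketing', 'research', 'seo'],
                    'value': [80, 101, 35, 31]})

tree = {
    'name': 'general skills',
    'children': [
        {
            'name': cat,
            'children': [
                {
                    'name': child,
                    # leaf level: one {"name", "value"} dict per skill row
                    'children': [{'name': row.skill, 'value': row.value}
                                 for row in grp.itertuples()],
                }
                for child, grp in cat_grp.groupby('child_cat')
            ],
        }
        for cat, cat_grp in dat.groupby('cat_1')
    ],
}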
I used this answer, Create array of json objects from for loops, and it works really well for "plain" JSON objects, but with a nested element it does not work correctly. This is my code:
import json

story_project = open('json_jira/stories/stories_to_jira_TESTEST.json', 'w+')
#######Projects############
json_projects = []
p_name_a, p_key_a, p_type_a = [], [], []
#######Issues##############
json_issues = []
i_summary_a, i_created_a, i_reporter_a, i_status_a, i_issue_type_a = [], [], [], [], []
#######Custom Fields########
json_custom_field_values = []
cf_field_name_a, cf_field_type_a, cf_value_a = [], [], []
#########The Values################
p_name_a.append("ClubHouseDEV")
p_key_a.append("CLUB")
p_type_a.append("software")
i_summary_a.append("This summary doesn not exist")
i_created_a.append("2017-07-17T02:35:16Z")
i_reporter_a.append("5a02285487c3eb1913c44a80")
i_status_a.append("Open")
i_issue_type_a.append("Milestones")
cf_field_name_a.append("external_id")
cf_field_type_a.append("com.atlassian.jira.plugin.system.customfieldtypes:float")
cf_value_a.append(3)
cf_field_name_a.append("Story Points")
cf_field_type_a.append("com.atlassian.jira.plugin.system.customfieldtypes:float")
cf_value_a.append(5)
###########Build The JSON##############
json_custom_field_values = [{"fieldName": cf_field_name, "fieldType": cf_field_type, "value": cf_value} for cf_field_name, cf_field_type, cf_value in zip(cf_field_name_a, cf_field_type_a, cf_value_a)]
json_issues = [{"sumamry": i_summary, "created": i_created, "reporter": i_reporter, "status": i_status, "issueType": i_issue_type, "customFieldValues" : json_custom_field_value} for i_summary, i_created, i_reporter, i_status, i_issue_type, json_custom_field_value in zip(i_summary_a, i_created_a, i_reporter_a, i_status_a, i_issue_type_a, json_custom_field_values)]
json_projects = [{"name": p_name, "key": p_key, "type": p_type, "issues": json_issue} for p_name, p_key, p_type,json_issue in zip(p_name_a, p_key_a, p_type_a,json_issues)]
json_file = [{"projects": json_project} for json_project in zip(json_projects)]
json.dump(json_file, story_project)
The output should be:
{ "projects": [
{
"name": "ClubHouseDEV",
"key": "CLUB",
"type":"software",
"issues":
[
{
"summary":"This summary doesn not exist",
"created":"2017-07-17T02:35:16Z",
"reporter":"5a02285487c3eb1913c44a80",
"status":"Open",
"issueType":"Milestones",
"customFieldValues":
[
{
"fieldName": "external_id",
"fieldType": "com.atlassian.jira.plugin.system.customfieldtypes:float",
"value": 3
},
{
"fieldName": "Story Points",
"fieldType": "com.atlassian.jira.plugin.system.customfieldtypes:float",
"value": 5
}
],
"labels" : ["ch_epics"],
"updated": "2017-07-17T02:35:16Z"
}
]
}
]
}
But it is:
[{"projects": [{"name": "ClubHouseDEV", "key": "CLUB", "type": "software", "issues": {"sumamry": "This summary doesn not exist", "created": "2017-07-17T02:35:16Z", "reporter": "5a02285487c3eb1913c44a80", "status": "Open", "issueType": "Milestones", "customFieldValues": {"fieldName": "external_id", "fieldType": "com.atlassian.jira.plugin.system.customfieldtypes:float", "value": 3}}}]}]
As you can see, it only added one value to the nested "customFieldValues". How can I add all the values?
This is how I solved it: build the deepest level first, then integrate it with the level above, and so on; the output was as expected. (The original comprehension zipped the issue lists against json_custom_field_values, pairing each issue with a single custom-field dict instead of the whole list.)
###########Build The JSON##############
# accumulators for each level
cf_data, issues_data, projects_data = [], [], []
json_dict = {}

for cf_field_name, cf_field_type, cf_value in zip(cf_field_name_a, cf_field_type_a, cf_value_a):
    json_custom_field_values = {}
    json_custom_field_values["fieldName"] = cf_field_name
    json_custom_field_values["fieldType"] = cf_field_type
    json_custom_field_values["value"] = cf_value
    cf_data.append(json_custom_field_values)

for i_summary, i_created, i_reporter, i_status, i_issue_type in zip(i_summary_a, i_created_a, i_reporter_a, i_status_a, i_issue_type_a):
    json_issues = {}
    json_issues["summary"] = i_summary
    json_issues["created"] = i_created
    json_issues["reporter"] = i_reporter
    json_issues["status"] = i_status
    json_issues["issueType"] = i_issue_type
    json_issues["customFieldValues"] = cf_data
    issues_data.append(json_issues)

for p_name, p_key, p_type in zip(p_name_a, p_key_a, p_type_a):
    json_projects = {}
    json_projects["name"] = p_name
    json_projects["key"] = p_key
    json_projects["type"] = p_type
    json_projects["issues"] = issues_data
    projects_data.append(json_projects)

json_dict["projects"] = projects_data
json.dump(json_dict, story_project)
Re-edited to make it clearer and simpler.
For the data below:
[
{
"name": "name1",
"a_id": "12345",
"b_id": "0d687c94c5f4"
},
{
"name": "name2",
"a_id": "67890",
"b_id": "0d687c94c5f4"
},
{
"name": "name3",
"a_id": "23857",
"b_id": "9ec34be3d535"
},
{
"name": "name4",
"a_id": "84596",
"b_id": "9ec34be3d535"
},
{
"name": "name5",
"a_id": "d82ebe9815cc",
"b_id": null
}
]
How do I get the following?
based on "b_id" "0d687c94c5f4":
id1 = 12345
id2 = 67890
based on "b_id" "9ec34be3d535":
id3 = 23857
id4 = 84596
import collections

result = collections.defaultdict(list)
for res in response:
    result[res['b_id']].append(res['a_id'])
result:
defaultdict(list,
{'0d687c94c5f4': ['12345', '67890'],
'9ec34be3d535': ['23857', '84596'],
None: ['d82ebe9815cc']})
result = {
item['b_id']: [
subitem['a_id']
for subitem in response
if subitem['b_id'] == item['b_id']
]
for item in response
}
print(result)
>>> {'9ec34be3d535': ['23857', '84596'], '0d687c94c5f4': ['12345', '67890'], None: ['d82ebe9815cc']}
Your request is not very clear, but I think you mean you want to regroup the list of JSON objects by a different key. You can use itertools for that (see the caveat after the examples below).
Try this:
import itertools

# r is the list of records shown above
for key, group in itertools.groupby(r, lambda item: item['b_id']):
    print('b_id', key, [x['a_id'] for x in group])
b_id 0d687c94c5f4 ['12345', '67890']
b_id 9ec34be3d535 ['23857', '84596']
b_id None ['d82ebe9815cc']
or in dictionary form:
for key, group in itertools.groupby(r, lambda item: item['b_id']):
    print({key: [x['a_id'] for x in group]})
{'0d687c94c5f4': ['12345', '67890']}
{'9ec34be3d535': ['23857', '84596']}
{None: ['d82ebe9815cc']}
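One caveat worth adding: itertools.groupby only merges consecutive runs, so if the list isn't already ordered by b_id (the sample above happens to be), sort it first. A minimal sketch; the `or ''` guard keeps the None key sortable under Python 3:

import itertools

r_sorted = sorted(r, key=lambda item: item['b_id'] or '')
grouped = {key: [x['a_id'] for x in group]
           for key, group in itertools.groupby(r_sorted, lambda item: item['b_id'])}
print(grouped)
# {None: ['d82ebe9815cc'], '0d687c94c5f4': ['12345', '67890'], '9ec34be3d535': ['23857', '84596']}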