Multiple Nested Dictionaries with Pandas - python

Is there a way to import this kind of JSON response into pandas? I've been trying to make it usable with json_normalize, but I can't get more than one level to work at a time (I can get notes but can't bring in custom_fields). I also cannot figure out how to reference something like ['reporter']['name'] (which should be jdoe). This is from Mantis, and it's the JSON output of a requests response. I'm now wondering whether it needs to be broken up into multiple frames and put back together, or whether I should use a for loop and put the data I want into a better format for pandas to import.
In my head, each item should be a column, all tied to the id column, like this:
id | summary | | ..|.. custom.fields.Project_Stage | ... | notes1.text ... | notes2.text
{
  "issues": [
    {
      "id": 1234,
      "summary": "Some text",
      "project": {
        "id": 1,
        "name": "North America"
      },
      "category": {
        "id": 11,
        "name": "Retail"
      },
      "reporter": {
        "id": 1099,
        "name": "jdoe"
      },
      "custom_fields": [
        {
          "field": {
            "id": 107,
            "name": "Product Escalations"
          },
          "value": ""
        },
        {
          "field": {
            "id": 1,
            "name": "Project_Stage"
          },
          "value": "Pending"
        }
      ],
      "notes": [
        {
          "id": 214288,
          "reporter": {
            "id": 9999,
            "name": "jdoe"
          },
          "text": "Worked with Mark over e-mail",
          "view_state": {
            "id": 10,
            "name": "public",
            "label": "public"
          },
          "type": "note",
          "created_at": "2020-12-04T15:55:02-08:00",
          "updated_at": "2020-12-04T15:55:02-08:00"
        },
        {
          "id": 214289,
          "reporter": {
            "id": 9999,
            "name": "jdoe"
          },
          "text": "I attempted on numerous occasions to setup a meeting with him to set it up for him.",
          "view_state": {
            "id": 10,
            "name": "public",
            "label": "public"
          },
          "type": "note",
          "created_at": "2020-12-04T15:57:02-08:00",
          "updated_at": "2020-12-04T15:57:02-08:00"
        }
      ]
    }
  ]
}
Here is what the DF would look like in my head. All the data for one ticket on one line/series.

Structures with several lists at the same level are tricky. Try flatten_json.
If your response is called 'dic', you can use this:
import pandas as pd
from flatten_json import flatten

# Flatten each issue into a single-level dict, joining nested keys with '.'
dic_flattened = (flatten(d, '.') for d in dic['issues'])
df = pd.DataFrame(dic_flattened)
id summary ... notes.1.view_state.label notes.1.type notes.1.created_at notes.1.updated_at
0 1234 Some text 1 North America 11 Retail ... 10 public public note 2020-12-04T15:57:02-08:00 2020-12-04T15:57:02-08:00
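In [101]: df.columns
Index(['id', 'summary', 'project.id', 'project.name', 'category.id',
       'category.name', 'reporter.id', 'reporter.name',
       'custom_fields.0.field.id', 'custom_fields.0.field.name',
       'custom_fields.0.value', 'custom_fields.1.field.id',
       'custom_fields.1.field.name', 'custom_fields.1.value', 'notes.0.id',
       'notes.0.reporter.id', 'notes.0.reporter.name', 'notes.0.text',
       'notes.0.view_state.id', 'notes.0.view_state.name',
       'notes.0.view_state.label', 'notes.0.type', 'notes.0.created_at',
       'notes.0.updated_at', 'notes.1.id', 'notes.1.reporter.id',
       'notes.1.reporter.name', 'notes.1.text', 'notes.1.view_state.id',
       'notes.1.view_state.name', 'notes.1.view_state.label', 'notes.1.type',
       'notes.1.created_at', 'notes.1.updated_at'],
      dtype='object')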
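If adding a dependency is not desirable, pandas' own json_normalize can get part of the way there. The following is only a sketch under some assumptions: the trimmed `dic` stands in for the real Mantis response, and the layout is one row per note (not one row per ticket), with ticket-level fields carried along through `meta`. It also shows one way to reach ['reporter']['name'].

```python
import pandas as pd

# Trimmed stand-in for the Mantis response (shape assumed from the question).
dic = {
    "issues": [
        {
            "id": 1234,
            "summary": "Some text",
            "reporter": {"id": 1099, "name": "jdoe"},
            "notes": [
                {"id": 214288, "reporter": {"id": 9999, "name": "jdoe"},
                 "text": "Worked with Mark over e-mail"},
                {"id": 214289, "reporter": {"id": 9999, "name": "jdoe"},
                 "text": "Second note"},
            ],
        }
    ]
}

# One row per note; ticket-level values are repeated on each row via `meta`.
# A nested meta path such as ["reporter", "name"] becomes the column "reporter.name".
notes_df = pd.json_normalize(
    dic["issues"],
    record_path="notes",
    meta=["id", "summary", ["reporter", "name"]],
    record_prefix="note.",  # avoids a clash between the note "id" and the ticket "id"
)

print(notes_df.columns.tolist())
```

A second call with record_path="custom_fields" would give a similar frame for the custom fields. Whether to merge or pivot the two frames on the ticket id depends on the final layout you need.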


Using pandas to convert csv into nested json with dynamic structure

I am new to Python and want to convert a CSV file into a JSON file. The JSON file is nested with a dynamic structure, and that structure is defined by the CSV header.
From csv input:
ID, Name, person_id/id_type, person_id/id_value,person_id_expiry_date,additional_info/0/name,additional_info/0/value,additional_info/1/name,additional_info/1/value,salary_info/details/0/grade,salary_info/details/0/payment,salary_info/details/0/amount,salary_info/details/1/next_promotion
2,Jane,PASSPORT,B859804,2-01-2035,Age,38,Gender,F,Worker, Monthly,125980.1,unknown
To json output:
[
  {
    "ID": 1,
    "Name": "Peter",
    "person_id": {
      "id_type": "PASSPORT",
      "id_value": "A452817"
    },
    "person_id_expiry_date": "1-01-2055",
    "additional_info": [
      {
        "name": "Age",
        "value": 19
      },
      {
        "name": "Gender",
        "value": "M"
      }
    ],
    "salary_info": {
      "details": [
        {
          "grade": "Manager",
          "payment": "Monthly",
          "amount": 8956.23
        },
        {
          "next_promotion": "unknown"
        }
      ]
    }
  },
  {
    "ID": 2,
    "Name": "Jane",
    "person_id": {
      "id_type": "PASSPORT",
      "id_value": "B859804"
    },
    "person_id_expiry_date": "2-01-2035",
    "additional_info": [
      {
        "name": "Age",
        "value": 38
      },
      {
        "name": "Gender",
        "value": "F"
      }
    ],
    "salary_info": {
      "details": [
        {
          "grade": "Worker",
          "payment": " Monthly",
          "amount": 125980.1
        },
        {
          "next_promotion": "unknown"
        }
      ]
    }
  }
]
Is this something that can be done with the existing pandas API, or do I have to write a lot of complex code to construct the JSON object dynamically? Thanks.
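One possible approach, sketched with the standard library only, is to treat each header as a "/"-separated path, build nested dicts from it, and then turn any dict whose keys are all digits into a list. The helper names `nest` and `listify` are made up for this illustration, and the CSV text is a shortened version of the sample above. Values stay strings here; converting "38" to a number would be a separate step.

```python
import csv
import io
import json


def nest(flat_row):
    """Turn {"a/b/0/c": value} style keys into nested dicts and lists."""
    root = {}
    for path, value in flat_row.items():
        *parents, leaf = path.strip().split("/")
        node = root
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return listify(root)


def listify(node):
    """Convert dicts whose keys are all digits ("0", "1", ...) into lists."""
    if not isinstance(node, dict):
        return node
    converted = {key: listify(value) for key, value in node.items()}
    if converted and all(key.isdigit() for key in converted):
        return [converted[key] for key in sorted(converted, key=int)]
    return converted


# Shortened sample; the header still has spaces after some commas.
csv_text = (
    "ID, Name, person_id/id_type, person_id/id_value,"
    "additional_info/0/name,additional_info/0/value,"
    "additional_info/1/name,additional_info/1/value\n"
    "2,Jane,PASSPORT,B859804,Age,38,Gender,F\n"
)

# skipinitialspace drops the blank that follows a delimiter.
reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
records = [nest(row) for row in reader]
print(json.dumps(records, indent=2))
```

The same `nest` function could be applied to the dicts produced by a DataFrame's to_dict('records') if the file is read with pandas instead.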

3 levels json count in python

I am new to Python; I have worked with other languages. I wrote this code in Java and it works, but now I must do it in Python. I have a JSON document with three levels; the first two are usages and resources, and I want to count the names on the third level. I have seen several examples, but I can't get it done.
import json
data = {
    "startDate": "2019-06-23T16:07:21.205Z",
    "endDate": "2019-07-24T16:07:21.205Z",
    "status": "Complete",
    "usages": [
        {
            "name": "PureCloud Edge Virtual Usage",
            "resources": [
                {"name": "Edge01-VM-GNS-DemoSite01 (1f279086-a6be-4a21-ab7a-2bb1ae703fa0)", "date": "2019-07-24T09:00:28.034Z"},
                {"name": "329ad5ae-e3a3-4371-9684-13dcb6542e11", "date": "2019-07-24T09:00:28.034Z"},
                {"name": "e5796741-bd63-4b8e-9837-4afb95bb0c09", "date": "2019-07-24T09:00:28.034Z"}
            ]
        },
        {
            "name": "PureCloud for SmartVideo Add-On Concurrent",
            "resources": [
                {"name": "", "date": "2019-06-25T04:54:17.662Z"},
                {"name": "", "date": "2019-06-25T04:54:17.662Z"},
                {"name": "", "date": "2019-07-15T15:06:09.203Z"}
            ]
        },
        {
            "name": "PureCloud 3 Concurrent User Usage",
            "resources": [
                {"name": "", "date": "2019-06-25T04:54:17.662Z"},
                {"name": "", "date": "2019-06-25T04:54:17.662Z"},
                {"name": "", "date": "2019-07-15T15:06:09.203Z"}
            ]
        },
        {
            "name": "PureCloud Skype for Business WebSDK",
            "resources": [
                {"name": "", "date": "2019-06-25T04:54:17.662Z"},
                {"name": "", "date": "2019-06-25T04:54:17.662Z"},
                {"name": "", "date": "2019-07-15T15:06:09.203Z"}
            ]
        }
    ],
    "selfUri": "/api/v2/billing/reports/billableusage"
}
cantidadDeLicencias = 0
cantidadDeUsages = len(data['usages'])
for x in range(cantidadDeUsages):
temporal = data[x]
cantidadDeResources = len(temporal['resource'])
for z in range(cantidadDeResources):
What changes do I have to make? Or should I take a different approach? Thanks in advance.
Code that works
cantidadDeLicencias = 0
for usage in data['usages']:
    cantidadDeLicencias = cantidadDeLicencias + len(usage['resources'])
You can do this :
for usage in data['usages']:
If you want to know the number of names at the resources level, counting duplicated names (e.g. "" appears more than once in your data), then just iterate over the first level (usages) and sum the size of each array:
cantidadDeLicencias = 0
for usage in data['usages']:
    cantidadDeLicencias += len(usage['resources'])
If you don't want to count duplicates, then use a set and iterate over each resources array:
cantidadDeLicencias_set = set()  # {} would create a dict, not a set
for usage in data['usages']:
    for resource in usage['resources']:
        cantidadDeLicencias_set.add(resource['name'])
print(len(cantidadDeLicencias_set))
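For comparison, both counts can be written as single expressions. The attempt in the question fails because `data[x]` indexes the top-level dict with an integer (the list lives under `data['usages']`) and because the key is 'resources', not 'resource'. The small `data` below is a made-up stand-in with the same shape as the billing report, so the numbers are only illustrative.

```python
# Small stand-in with the same shape as the billing report in the question.
data = {
    "usages": [
        {"name": "Usage A", "resources": [{"name": "vm-1"}, {"name": ""}]},
        {"name": "Usage B", "resources": [{"name": ""}, {"name": "vm-2"}, {"name": "vm-1"}]},
    ]
}

# Total number of resources, duplicates included.
total = sum(len(usage["resources"]) for usage in data["usages"])

# Distinct resource names across all usages (the empty name counts once).
unique_names = {res["name"] for usage in data["usages"] for res in usage["resources"]}

# Number of resources per usage, keyed by the usage name.
per_usage = {usage["name"]: len(usage["resources"]) for usage in data["usages"]}

print(total, len(unique_names), per_usage)
```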

Convert complex Json to CSV

The file comes from a Slack server export, so the structure varies every time (depending on whether people responded to a thread with text or reactions).
I have tried the answers to several similar SO questions, but my problem is different. This one, this one too, this one as well.
Sample JSON file:
"client_msg_id": "f347abdc-9e2a-4cad-a37d-8daaecc5ad51",
"type": "message",
"text": "I came here just to check <#U3QSFG5A4> This is a sample :slightly_smiling_face:",
"user": "U51N464MN",
"ts": "1550511445.321100",
"team": "T1559JB9V",
"user_team": "T1559JB9V",
"source_team": "T1559JB9V",
"user_profile": {
"avatar_hash": "gcc8ae3d55bb",
"image_72": "https:\/\/\/avatar\/fcc8ae3d55bb91cb750438657694f8a0.jpg?s=72&",
"first_name": "A",
"real_name": "a name",
"display_name": "user",
"team": "T1559JB9V",
"name": "name",
"is_restricted": false,
"is_ultra_restricted": false
"thread_ts": "1550511445.321100",
"reply_count": 3,
"reply_users_count": 3,
"latest_reply": "1550515952.338000",
"reply_users": [
"replies": [
"user": "U51N464MN",
"ts": "1550511485.321200"
"user": "U8DUH4U2V",
"ts": "1550515191.337300"
"user": "U3QSFG5A4",
"ts": "1550515952.338000"
"subscribed": false,
"reactions": [
"name": "trolldance",
"users": [
"count": 3
"name": "trollface",
"users": [
"count": 1
The issue is that several keys vary, so the structure changes within the same JSON file from message to message, depending on how other users interact with a given message.
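import json
import pandas as pd

with open("file.json") as file:
    d = json.load(file)
# The call on the next line was lost from the post; pd.json_normalize is an assumption.
df = pd.json_normalize(d)
# Keep only the last component of each dotted column name.
df.columns = df.columns.map(lambda x: x.split(".")[-1])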
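Because the keys differ from message to message, one option is to flatten every message first and then write the CSV using the union of all keys, leaving blanks where a message has no value. The sketch below uses only the standard library. The two messages are made-up, trimmed stand-ins for the export, and serializing lists (replies, reactions) as JSON strings in a single cell is an assumption about what is acceptable in the output.

```python
import csv
import io
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts into dotted keys; lists are kept as JSON strings."""
    row = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            row.update(flatten(value, prefix=f"{name}."))
        elif isinstance(value, list):
            row[name] = json.dumps(value)
        else:
            row[name] = value
    return row


# Two trimmed, made-up messages with different sets of keys.
messages = [
    {"type": "message", "text": "first", "user": "U51N464MN",
     "user_profile": {"real_name": "a name"},
     "reactions": [{"name": "trollface", "count": 1}]},
    {"type": "message", "text": "second", "user": "U8DUH4U2V", "reply_count": 3},
]

rows = [flatten(m) for m in messages]

# Union of all keys, in first-seen order, so every row fits the header.
fieldnames = list(dict.fromkeys(key for row in rows for key in row))

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

For a real export, `buffer` would be replaced by a file opened with newline="" as the csv module documentation recommends.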

How to reshape DataFrame for nested JSON output

I am starting to work on some data manipulation, and I need to create a new file (with new features) out of an old one. However, I could not figure out how to customize my DataFrame before using the ".to_json" method.
For example, I have a .csv as:
seller, customer, product, price
Roger, Will, 8129, 30
Roger, Markus, 1234, 100
Roger, Will, 2334, 50
Mike, Markus, 2295, 20
Mike, Albert, 1234, 100
...and I want to generate a .json file to support me in visualizing a network out of it. This should be more or less like:
"node": [
{"id":"Roger", "group": "seller" },
{"id":"Mike", "group": "seller" },
{"id":"Will", "group": "customer" },
{"id":"Markus", "group": "customer" },
{"id":"Albert", "group": "customer" }
#...and so on
I tried to do something like:
df1 = pd.read_csv('file.csv')
seller_list = df1.seller.unique()
customer_list = df1.customer.unique()
...and I could indeed get lists of unique items. However, I could not find out how to add them to a DataFrame in order to create a structure such as:
{"id":"Mike", "group": "seller" },
{"id":"Markus", "group": "customer" },
]...#see above
Any support or hint on this is appreciated.
This will be a two-step process. First, create the nodes using melt + drop_duplicates + to_dict:
nodes = df[['customer', 'seller']]\
    .melt(var_name='group', value_name='id')\
    .drop_duplicates()\
    .to_dict('records')
Now, create the links using rename + to_dict:
links = df.rename(columns={'seller': 'source', 'customer': 'target'}).to_dict('records')
Now, combine the data into one dictionary and dump it as JSON to a file:
import json

data = {'nodes': nodes, 'links': links}
with open('data.json', 'w') as f:
    json.dump(data, f, indent=4)
Your data.json file should look like this -
{
    "nodes": [
        {
            "id": "Will",
            "group": "customer"
        },
        {
            "id": "Markus",
            "group": "customer"
        },
        {
            "id": "Albert",
            "group": "customer"
        },
        {
            "id": "Roger",
            "group": "seller"
        },
        {
            "id": "Mike",
            "group": "seller"
        }
    ],
    "links": [
        {
            "product": 8129,
            "target": "Will",
            "source": "Roger",
            "price": 30
        },
        {
            "product": 1234,
            "target": "Markus",
            "source": "Roger",
            "price": 100
        },
        {
            "product": 2334,
            "target": "Will",
            "source": "Roger",
            "price": 50
        },
        {
            "product": 2295,
            "target": "Markus",
            "source": "Mike",
            "price": 20
        },
        {
            "product": 1234,
            "target": "Albert",
            "source": "Mike",
            "price": 100
        }
    ]
}

Unable to pull data from json using python

I have the following json
"response": {
"message": null,
"exception": null,
"context": [
"headers": null,
"name": "aname",
"children": [
"type": "cluster-connectivity",
"name": "cluster-connectivity"
"type": "consistency-groups",
"name": "consistency-groups"
"type": "devices",
"name": "devices"
"type": "exports",
"name": "exports"
"type": "storage-elements",
"name": "storage-elements"
"type": "system-volumes",
"name": "system-volumes"
"type": "uninterruptible-power-supplies",
"name": "uninterruptible-power-supplies"
"type": "virtual-volumes",
"name": "virtual-volumes"
"parent": "/clusters",
"attributes": [
"value": "true",
"name": "allow-auto-join"
"value": "0",
"name": "auto-expel-count"
"value": "0",
"name": "auto-expel-period"
"value": "0",
"name": "auto-join-delay"
"value": "1",
"name": "cluster-id"
"value": "true",
"name": "connected"
"value": "synchronous",
"name": "default-cache-mode"
"value": "true",
"name": "default-caw-template"
"value": "blah",
"name": "default-director"
"value": [
"name": "director-names"
"value": [
"name": "health-indications"
"value": "ok",
"name": "health-state"
"value": "1",
"name": "island-id"
"value": "blah",
"name": "name"
"value": "ok",
"name": "operational-status"
"value": [
"name": "transition-indications"
"value": [
"name": "transition-progress"
"type": "cluster"
"custom-data": null
which I'm trying to parse using the json module in Python. I am only interested in getting the following information out of it.
Name Value
operational-status Value
health-state Value
Here is what I have tried.
In the script below, data is the JSON returned from a web page.
json = json.loads(data)
healthstate= json['response']['context']['operational-status']
operationalstatus = json['response']['context']['health-status']
Unfortunately, I think I must be missing something, as the above results in an error saying that indexes must be integers, not strings.
If I try
healthstate= json['response'][0]
it errors saying index 0 is out of range.
Any help would be gratefully received.
json['response']['context'] is a list, so that object requires you to use integer indices.
Each item in that list is itself a dictionary again. In this case there is only one such item.
To get all "name": "health-state" dictionaries out of that structure you'd need to do a little more processing:
[attr['value'] for attr in json['response']['context'][0]['attributes'] if attr['name'] == 'health-state']
would give you a list of matching values for health-state in the first context.
>>> [attr['value'] for attr in json['response']['context'][0]['attributes'] if attr['name'] == 'health-state']
You have to follow the data structure. It's best to manipulate the data interactively and check what every item is. If it's a list, you'll have to index it positionally or iterate through it and check the values. If it's a dict, you'll have to index it by its keys. For example, here is a function that gets the context and then iterates through its attributes, checking for a particular name.
def get_attribute(data, attribute):
    for attrib in data['response']['context'][0]['attributes']:
        if attrib['name'] == attribute:
            return attrib['value']
    return 'Not Found'
>>> data = json.loads(s)
>>> get_attribute(data, 'operational-status')
>>> get_attribute(data, 'health-state')
json['response']['context'] is a list, not a dict. The structure is not exactly what you think it is.
For example, the only "operational status" I see in there can be read with the following:
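A minimal sketch of such a lookup follows. It is an illustration under assumptions, not the poster's exact expression: the `raw` string is a trimmed stand-in for the response, and building a name-to-value dictionary is just one convenient option when several attributes are needed.

```python
import json

# Trimmed stand-in for the response body shown in the question.
raw = """
{"response": {"context": [{"attributes": [
    {"value": "ok", "name": "health-state"},
    {"value": "ok", "name": "operational-status"},
    {"value": "1", "name": "cluster-id"}
], "type": "cluster"}]}}
"""

data = json.loads(raw)

# "context" is a list, so take its first (and here only) element by position.
context = data["response"]["context"][0]

# Each attribute is a {"name": ..., "value": ...} dict; index them by name once.
attributes = {attr["name"]: attr["value"] for attr in context["attributes"]}

operational_status = attributes.get("operational-status", "Not Found")
health_state = attributes.get("health-state", "Not Found")
print(operational_status, health_state)
```

Note that the variable holding the parsed data is called `data` here; naming it `json`, as in the question, would shadow the module after the first call.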
