Dataframe to PowerBI's json format


I am trying to convert DataFrame data into Power BI's JSON format, but have had no luck so far.
DataFrame:
   ProductID  Name                        Category    IsCompete  ManufacturedOn
0  1          Adjustable Race             Components  true       07/30/2014
1  2          LL Crankarm                 Components  false      07/30/2014
2  3          HL Mountain Frame - Silver  Bikes       true       07/30/2019
Expected JSON Format:
{
"rows": [
{
"ProductID": 1,
"Name": "Adjustable Race",
"Category": "Components",
"IsCompete": true,
"ManufacturedOn": "07/30/2014"
},
{
"ProductID": 2,
"Name": "LL Crankarm",
"Category": "Components",
"IsCompete": false,
"ManufacturedOn": "07/30/2014"
},
{
"ProductID": 3,
"Name": "HL Mountain Frame - Silver",
"Category": "Bikes",
"IsCompete": true,
"ManufacturedOn": "07/30/2019"
}
]
}

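Use the pandas to_dict method with the 'records' orientation, which returns one dictionary per row:
# named payload rather than json so the standard json module is not shadowed
payload = {'rows': df.to_dict('records')}
print(payload)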
{'rows': [{'ProductID': 1,
'Name': 'Adjustable Race',
'Category': 'Components',
'IsCompete': True,
'ManufacturedOn': '07/30/2014'},
{'ProductID': 2,
'Name': 'LL Crankarm',
'Category': 'Components',
'IsCompete': False,
'ManufacturedOn': '07/30/2014'},
{'ProductID': 3,
'Name': 'HL Mountain Frame - Silver',
'Category': 'Bikes',
'IsCompete': True,
'ManufacturedOn': '07/30/2019'}]}
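Note that the printed value above is a Python dictionary (single quotes, True/False), not JSON text. A minimal sketch of the remaining step, assuming the date column already holds strings; the names payload and body are only illustrative:

```python
import json
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ProductID": [1, 2, 3],
    "Name": ["Adjustable Race", "LL Crankarm", "HL Mountain Frame - Silver"],
    "Category": ["Components", "Components", "Bikes"],
    "IsCompete": [True, False, True],
    "ManufacturedOn": ["07/30/2014", "07/30/2014", "07/30/2019"],
})

# One dictionary per row, wrapped under the "rows" key.
payload = {"rows": df.to_dict("records")}

# json.dumps serializes Python's True/False as JSON true/false.
body = json.dumps(payload, indent=2)
print(body)
```

If the column held real datetime values instead of strings, it would probably need converting (for example with astype(str)) before calling json.dumps.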

Related

represent parent and child relation as dictionary

I have a list of dictionaries. I want to convert this list into a nested structure using the parent-child relation. I have tried many times, but it is difficult for me.
Thanks in advance for solving the problem.
Input =
data = [
{
"_id": 1,
"label": "Property",
"index": 1
},
{
"_id": 2,
"label": "Find Property",
"index": 1,
"parent_id": 1
},
{
"_id": 3,
"label": "Add Property",
"index": 2,
"parent_id": 1
},
{
"_id": 4,
"label": "Offer",
"index": 2
},
{
"_id": 5,
"label": "My Offer",
"index": 1,
"parent_id": 4
},
{
"_id": 6,
"label": "Accept",
"index": 1,
"parent_id": 5
}
]
Expected Output:
[
{
"_id": 1,
"label": "Property",
"index": 1,
"children" : [
{
"_id": 2,
"label": "Find Property",
"index": 1
},
{
"_id": 3,
"label": "Add Property",
"index": 2
}
]
},
{
"_id": 4,
"label": "Offer",
"index": 2,
"children" : [
{
"_id": 5,
"label": "My Offer",
"index": 1,
"children" : [
{
"_id": 6,
"label": "Accept",
"index": 1
}
]
}
]
}
]

I would do it like this. Keep in mind that this solution also modifies the dictionaries in the original data list, because the children lists are added to them in place.
parents = list()
# First, create a new dict where the key is property id and the value
# is the property itself.
indexed = {d["_id"]: d for d in data}
for id_, item in indexed.items():
    # If a property doesn't have "parent_id" key it means that
    # this is the root property, appending it to the result list.
    if "parent_id" not in item:
        parents.append(item)
        continue
    # Saving parent id for convenience.
    p_id = item["parent_id"]
    # Adding a children list if a parent doesn't have it yet.
    if "children" not in indexed[p_id]:
        indexed[p_id]["children"] = list()
    indexed[p_id]["children"].append(item)
And the result is:
import pprint
pprint.pprint(parents)
[{'_id': 1,
'children': [{'_id': 2, 'index': 1, 'label': 'Find Property', 'parent_id': 1},
{'_id': 3, 'index': 2, 'label': 'Add Property', 'parent_id': 1}],
'index': 1,
'label': 'Property'},
{'_id': 4,
'children': [{'_id': 5,
'children': [{'_id': 6,
'index': 1,
'label': 'Accept',
'parent_id': 5}],
'index': 1,
'label': 'My Offer',
'parent_id': 4}],
'index': 2,
'label': 'Offer'}]
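The result above still carries the parent_id keys and changes the input list. If that matters, one possible variant works on copies and drops parent_id, which matches the expected output in the question. This is only a sketch: build_tree is a hypothetical helper name, and it assumes every parent_id refers to an _id that exists in the list (otherwise it raises KeyError).

```python
def build_tree(items):
    """Nest items under their parents without modifying the input dictionaries."""
    # Copy every item, leaving out "parent_id" so the output has the expected shape.
    nodes = {
        item["_id"]: {k: v for k, v in item.items() if k != "parent_id"}
        for item in items
    }
    roots = []
    for item in items:
        node = nodes[item["_id"]]
        parent_id = item.get("parent_id")
        if parent_id is None:
            # No parent: this is a top-level entry.
            roots.append(node)
        else:
            # Create the parent's "children" list on first use, then append.
            nodes[parent_id].setdefault("children", []).append(node)
    return roots


data = [
    {"_id": 1, "label": "Property", "index": 1},
    {"_id": 2, "label": "Find Property", "index": 1, "parent_id": 1},
    {"_id": 3, "label": "Add Property", "index": 2, "parent_id": 1},
    {"_id": 4, "label": "Offer", "index": 2},
    {"_id": 5, "label": "My Offer", "index": 1, "parent_id": 4},
    {"_id": 6, "label": "Accept", "index": 1, "parent_id": 5},
]

tree = build_tree(data)
```

Because all copies are created before any linking happens, the order of the input list does not matter: a child may appear before its parent.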

ordering a dictionary by count of items across a number of key value lists

Hopefully the title is not too confusing. I have a dictionary (sample below), and I'm trying to sort it by the number of list items (item dictionaries) across all the keys beneath each parent. Hopefully the example makes more sense than my description.
{
"data": {
"London": {
"SHOP 1": [
{
"kittens": 10,
"type": "fluffy"
},
{
"puppies": 11,
"type": "squidgy"
}
],
"SHOP 2": [
{
"kittens": 15,
"type": "fluffy"
},
{
"puppies": 3,
"type": "squidgy"
},
{
"fishes": 132,
"type": "floaty"
}
]
},
"Manchester": {
"SHOP 1": [
{
"kittens": 10,
"type": "fluffy"
},
{
"puppies": 11,
"type": "squidgy"
}
],
"SHOP 2": [
{
"kittens": 15,
"type": "fluffy"
},
{
"puppies": 3,
"type": "squidgy"
},
{
"fishes": 132,
"type": "floaty"
}
],
"SHOP 3": [
{
"kittens": 15,
"type": "fluffy"
},
{
"puppies": 3,
"type": "squidgy"
},
]
},
"Edinburgh": {
"SHOP 1": [
{
"kittens": 10,
"type": "fluffy"
},
{
"puppies": 11,
"type": "squidgy"
}
],
"SHOP 2": [
{
"kittens": 15,
"type": "fluffy"
},
],
"SHOP 3": [
{
"puppies": 3,
"type": "squidgy"
},
]
}
}
}
Summary
# London 2 shops, 5 item dictionaries total
# Manchester 3 shops, 7 item dictionaries total
# Edinburgh 3 shops, 4 item dictionaries total
Desired sorting would be by total items across the shops, so ordered Manchester, London, Edinburgh
I'd usually use something like the below to sort, but I'm not sure how to do this one, since it involves counting the number of items across a number of keys.
{k: v for k, v in sorted(x.items(), key=lambda item: item[1])}

You need to reverse sort based on the total number of items for each location, which you can generate as:
sum(len(i) for i in s.values())
where s is the shop dictionary for each location.
Putting this into a sorted expression:
dict(sorted(d['data'].items(), key=lambda t:sum(len(i) for i in t[1].values()), reverse=True))
gives:
{
'Manchester': {
'SHOP 1': [{'kittens': 10, 'type': 'fluffy'}, {'puppies': 11, 'type': 'squidgy'}],
'SHOP 2': [{'kittens': 15, 'type': 'fluffy'}, {'puppies': 3, 'type': 'squidgy'}, {'fishes': 132, 'type': 'floaty'}],
'SHOP 3': [{'kittens': 15, 'type': 'fluffy'}, {'puppies': 3, 'type': 'squidgy'}]
},
'London': {
'SHOP 1': [{'kittens': 10, 'type': 'fluffy'}, {'puppies': 11, 'type': 'squidgy'}],
'SHOP 2': [{'kittens': 15, 'type': 'fluffy'}, {'puppies': 3, 'type': 'squidgy'}, {'fishes': 132, 'type': 'floaty'}]
},
'Edinburgh': {
'SHOP 1': [{'kittens': 10, 'type': 'fluffy'}, {'puppies': 11, 'type': 'squidgy'}],
'SHOP 2': [{'kittens': 15, 'type': 'fluffy'}], 'SHOP 3': [{'puppies': 3, 'type': 'squidgy'}]
}
}
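The same idea can be tried on a self-contained sample. This is a sketch with trimmed-down data: the item dictionaries are replaced by empty placeholders, since only their count affects the ordering, and total_items is a hypothetical helper name. It relies on dictionaries keeping insertion order, which is guaranteed from Python 3.7 on.

```python
# Trimmed-down sample: only the number of items per shop matters for the sort.
d = {
    "data": {
        "London": {"SHOP 1": [{}, {}], "SHOP 2": [{}, {}, {}]},
        "Manchester": {"SHOP 1": [{}, {}], "SHOP 2": [{}, {}, {}], "SHOP 3": [{}, {}]},
        "Edinburgh": {"SHOP 1": [{}, {}], "SHOP 2": [{}], "SHOP 3": [{}]},
    }
}


def total_items(shops):
    """Count the item dictionaries across every shop of one location."""
    return sum(len(items) for items in shops.values())


# Sort the (location, shops) pairs by total item count, largest first.
ordered = dict(
    sorted(d["data"].items(), key=lambda kv: total_items(kv[1]), reverse=True)
)
print(list(ordered))
```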

No need to make things complex:
adict = adict['data']
result = []
for capital, value in adict.items():
    shop_count = len(value)
    items = sum([len(obj) for obj in value.values()])
    result.append((capital, shop_count, items))
for capital, shop_count, items in sorted(result, key=lambda x: x[2], reverse=True):
    print(f'{capital} {shop_count} shops, {items} item dictionaries total')
Output:
Manchester 3 shops, 7 item dictionaries total
London 2 shops, 5 item dictionaries total
Edinburgh 3 shops, 4 item dictionaries total

Python Pandas - Convert dataframe into json

I have this pandas.dataframe:
   date                 pid  value   interval
0  2021-09-05 00:04:24  1    5.554   2021-09-05 00:00:00
1  2021-09-05 00:06:38  1    4.359   2021-09-05 00:05:00
2  2021-09-05 00:06:46  1    18.364  2021-09-05 00:05:00
3  2021-09-05 00:04:24  2    15.554  2021-09-05 00:00:00
4  2021-09-05 00:06:38  2    3.359   2021-09-05 00:05:00
5  2021-09-05 00:06:46  2    10.364  2021-09-05 00:05:00
which I want to turn into JSON like this:
{
"2021-09-05 00:00:00": {
"pid1": [
{
"date": "2021-09-05 00:04:24",
"pid": 1,
"value": 5.554
}
],
"pid2": [
{
"date": "2021-09-05 00:04:24",
"pid": 2,
"value": 15.554
}
]
},
"2021-09-05 00:05:00": {
"pid1": [
{
"date": "2021-09-05 00:06:38",
"pid": 1,
"value": 4.359
},
{
"date": "2021-09-05 00:06:46",
"pid": 1,
"value": 18.364
}
],
"pid2": [
{
"date": "2021-09-05 00:06:38",
"pid": 2,
"value": 3.359
},
{
"date": "2021-09-05 00:06:46",
"pid": 2,
"value": 10.364
}
]
}
}
Basically, I want to group the data by the interval value.
Is there a quick way to format this?

Create a helper column from pid, convert to a MultiIndex Series, and finally create the nested dictionary:
s = (df.assign(new='pid' + df['pid'].astype(str))
       .groupby(['interval', 'new'])[['date', 'pid', 'value']]
       .apply(lambda x: x.to_dict(orient='records')))
d = {level: s.xs(level).to_dict() for level in s.index.levels[0]}
print(d)
{
'2021-09-05 00:00:00': {
'pid1': [{
'date': '2021-09-05 00:04:24',
'pid': 1,
'value': 5.554
}],
'pid2': [{
'date': '2021-09-05 00:04:24',
'pid': 2,
'value': 15.554
}]
},
'2021-09-05 00:05:00': {
'pid1': [{
'date': '2021-09-05 00:06:38',
'pid': 1,
'value': 4.359
},
{
'date': '2021-09-05 00:06:46',
'pid': 1,
'value': 18.364
}
],
'pid2': [{
'date': '2021-09-05 00:06:38',
'pid': 2,
'value': 3.359
},
{
'date': '2021-09-05 00:06:46',
'pid': 2,
'value': 10.364
}
]
}
}
Finally, to get a JSON string, use:
import json
# use a separate name so the json module is not overwritten
json_str = json.dumps(d)
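For comparison, the same grouping can also be written as a plain loop over the groups. This is only a sketch of an alternative, not the method used above; it assumes the date and interval columns hold strings (datetime columns would need converting, for example with astype(str), before json.dumps), and the names nested and json_str are illustrative.

```python
import json
import pandas as pd

# Sample data copied from the question, with dates kept as plain strings.
df = pd.DataFrame({
    "date": [
        "2021-09-05 00:04:24", "2021-09-05 00:06:38", "2021-09-05 00:06:46",
        "2021-09-05 00:04:24", "2021-09-05 00:06:38", "2021-09-05 00:06:46",
    ],
    "pid": [1, 1, 1, 2, 2, 2],
    "value": [5.554, 4.359, 18.364, 15.554, 3.359, 10.364],
    "interval": [
        "2021-09-05 00:00:00", "2021-09-05 00:05:00", "2021-09-05 00:05:00",
        "2021-09-05 00:00:00", "2021-09-05 00:05:00", "2021-09-05 00:05:00",
    ],
})

nested = {}
# Each group is the set of rows sharing one (interval, pid) pair.
for (interval, pid), group in df.groupby(["interval", "pid"]):
    records = group[["date", "pid", "value"]].to_dict("records")
    # Create the per-interval dictionary on first use, then store the records.
    nested.setdefault(interval, {})[f"pid{pid}"] = records

json_str = json.dumps(nested, indent=2)
print(json_str)
```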

Python Dictionary transpose rows as column

I have a CSV file that will be imported and converted into a dictionary.
import csv
with open(r"DictionaryQuestion.csv", encoding='utf-8-sig') as csvfile:
    csvReader = csv.DictReader(csvfile)
    for row in map(dict, csvReader):
        print(row)
Example Input
I want to be able to transpose the data so that the Discount & NonDiscount rows will be added as columns with their associated amount as well as getting rid of duplicates. Essentially, I want a new dictionary so that I can zip through it.
This is the desired output.
Desired Output as Dictionary

You can use itertools.groupby() to group records by ProductId and then update your data.
Below I've created a list with records like yours and built a new list with the data in the expected shape.
data = [
{
"ProductId": "1", "Brand": "Brand1", "rateamount": 1, "rate_type": "Discount"
},
{
"ProductId": "1", "Brand": "Brand1", "rateamount": 2, "rate_type": "NonDiscount"
},
{
"ProductId": "2", "Brand": "Brand2", "rateamount": 3, "rate_type": "Discount"
},
{
"ProductId": "2", "Brand": "Brand2", "rateamount": 4, "rate_type": "NonDiscount"
},
{
"ProductId": "3", "Brand": "Brand3", "rateamount": 5, "rate_type": "Discount"
},
{
"ProductId": "3", "Brand": "Brand3", "rateamount": 6, "rate_type": "NonDiscount"
},
{
"ProductId": "4", "Brand": "Brand4", "rateamount": 7, "rate_type": "Discount"
},
{
"ProductId": "4", "Brand": "Brand4", "rateamount": 2, "rate_type": "NonDiscount"
},
]
Solution
This assumes your data is ordered by ProductId; otherwise you'll need to sort it before grouping, because itertools.groupby() only groups consecutive records.
import itertools
groups = itertools.groupby(data, lambda e: {"ProductId": e["ProductId"], "Brand": e["Brand"]})
output = []
for group, items in groups:
    el = dict(group)
    for item in items:
        if item["rate_type"] == "Discount":
            el["Discount"] = item["rateamount"]
        else:
            el["NonDiscount"] = item["rateamount"]
    output.append(el)
print(output)
The for loop above can be converted to a map:
import itertools
groups = itertools.groupby(data, lambda e: {"ProductId": e["ProductId"], "Brand": e["Brand"]})
output = map(
    lambda group: dict(
        **group[0],
        **{
            item["rate_type"]: item["rateamount"] for item in group[1]
        }),
    groups
)
print(list(output))
Both print:
[
{'ProductId': '1', 'Brand': 'Brand1', 'Discount': 1, 'NonDiscount': 2},
{'ProductId': '2', 'Brand': 'Brand2', 'Discount': 3, 'NonDiscount': 4},
{'ProductId': '3', 'Brand': 'Brand3', 'Discount': 5, 'NonDiscount': 6},
{'ProductId': '4', 'Brand': 'Brand4', 'Discount': 7, 'NonDiscount': 2}
]
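When the records come straight from csv.DictReader and are not already ordered, sorting on the grouping key first keeps the groups intact. The following is a sketch under assumptions: the column names are taken from the sample data above (the question's original input is not shown), the CSV text is made up for illustration, and every value stays a string because that is how DictReader reads it.

```python
import csv
import io
from itertools import groupby
from operator import itemgetter

# Illustrative CSV content with the rows deliberately out of order.
csv_text = """ProductId,Brand,rateamount,rate_type
2,Brand2,3,Discount
1,Brand1,1,Discount
2,Brand2,4,NonDiscount
1,Brand1,2,NonDiscount
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))

# groupby only merges consecutive rows, so sort on the same key first.
key = itemgetter("ProductId", "Brand")
rows.sort(key=key)

output = []
for (product_id, brand), items in groupby(rows, key=key):
    record = {"ProductId": product_id, "Brand": brand}
    for item in items:
        # The rate_type value becomes the new column name.
        record[item["rate_type"]] = item["rateamount"]
    output.append(record)

print(output)
```

Using a tuple key from itemgetter instead of a dictionary keeps the key sortable, which a dictionary key is not.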

Convert Pandas Dataframe to nested dictionary

I am trying to convert a dataframe to a nested dictionary, but have had no success so far.
Dataframe: clean_data['Model', 'Problem', 'Size']
Here's what my data looks like:
Model            Problem        Size
lenovo a6020     screen broken  1
lenovo a6020a40  battery        60
                 bluetooth      60
                 buttons        60
lenovo k4        wi-fi          3
                 bluetooth      3
My desired output:
{
"name": "Brand",
"children": [
{
"name": "Lenovo",
"children": [
{
"name": "lenovo a6020",
"children": {
"name": "screen broken",
"size": 1
}
},
{
"name": "lenovo a6020a40",
"children": [
{
"name": "battery",
"size": 60
},
{
"name": "bluetooth",
"size": 60
},
{
"name": "buttons",
"size": 60
}
]
},
{
"name": "lenovo k4",
"children": [
{
"name": "wi-fi",
"size": 3
},
{
"name": "bluetooth",
"size": 3
}
]
}
]
}
]
}
I have tried the pandas.DataFrame.to_dict method, but it returns a simple dictionary, and I want a nested one like the structure shown above.

Use:
print(df)
   Model            Problem        size
0  lenovo a6020     screen broken  1
1  lenovo a6020a40  battery        60
2  NaN              bluetooth      60
3  NaN              buttons        60
4  lenovo k4        wi-fi          3
5  NaN              bluetooth      3
# replace missing values by forward filling
df = df.ffill()
# split the Model column on the first whitespace into 2 columns
df[['a','b']] = df['Model'].str.split(n=1, expand=True)
# convert each level to a list of dictionaries
# use rename to get the correct keys
# note: columns are selected with a list ([['name','size']]) and the orient is
# spelled out as 'records'; recent pandas versions reject the tuple form and 'r'
L = (df.rename(columns={'Problem':'name'})
       .groupby(['a','b'])[['name','size']]
       .apply(lambda x: x.to_dict('records'))
       .rename('children')
       .reset_index()
       .rename(columns={'b':'name'})
       .groupby('a')[['name','children']]
       .apply(lambda x: x.to_dict('records'))
       .rename('children')
       .reset_index()
       .rename(columns={'a':'name'})
       .to_dict('records')
    )
#print (L)
# create the outer level with a dict literal
d = {"name": "Brand", "children": L}
print(d)
{
'name': 'Brand',
'children': [{
'name': 'lenovo',
'children': [{
'name': 'a6020',
'children': [{
'name': 'screen broken',
'size': 1
}]
}, {
'name': 'a6020a40',
'children': [{
'name': 'battery',
'size': 60
}, {
'name': 'bluetooth',
'size': 60
}, {
'name': 'buttons',
'size': 60
}]
}, {
'name': 'k4',
'children': [{
'name': 'wi-fi',
'size': 3
}, {
'name': 'bluetooth',
'size': 3
}]
}]
}]
}
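The same nesting can also be built with two explicit loops, which some readers may find easier to follow than the chained version. This is a sketch of an alternative, not the approach used above: the helper column names brand and model are hypothetical, and sort=False is used so groups keep the order in which they first appear.

```python
import pandas as pd

# Sample data from the answer; None marks the blank Model cells.
df = pd.DataFrame({
    "Model": ["lenovo a6020", "lenovo a6020a40", None, None, "lenovo k4", None],
    "Problem": ["screen broken", "battery", "bluetooth", "buttons", "wi-fi", "bluetooth"],
    "size": [1, 60, 60, 60, 3, 3],
})

# Fill the blank Model cells from the row above, then split brand from model.
df["Model"] = df["Model"].ffill()
df[["brand", "model"]] = df["Model"].str.split(n=1, expand=True)

children = []
for brand, brand_rows in df.groupby("brand", sort=False):
    models = []
    for model, model_rows in brand_rows.groupby("model", sort=False):
        # Innermost level: one {"name", "size"} dictionary per problem row.
        leaves = (
            model_rows.rename(columns={"Problem": "name"})[["name", "size"]]
            .to_dict("records")
        )
        models.append({"name": model, "children": leaves})
    children.append({"name": brand, "children": models})

tree = {"name": "Brand", "children": children}
print(tree)
```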
