How to convert this DataFrame into Json - python

I have this DataFrame with 2 columns
print(df)
a b
10 {'A': 'foo', ...}
20 {'B': 'faa', ...}
30 {'C': 'fee', ...}
40 {'D': 'fii', ...}
50 {'E': 'foo', ...}
When I try to convert it to JSON, the result is not what I want:
df.to_json("test.json")
# Output:
{
"a":{10, 20, 30, 40, 50},
"b":{
"1":{
"A":"foo",
...
},
"2":{
"B":"faa",
...
},
"3":{
"B":"faa",
...
},
...
"5":{
"E":"foo",
...
}
}
I don't even know where the numbers come from.
My desired JSON:
[{
'a': 10,
'b': {
'A': 'foo',
...
},
...
'a': 50,
'b': {
'E': 'foo',
...
}
}
]

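You could try the following, which builds one dictionary per row:
data = []
# Iterating over df directly yields column names, so walk both columns in parallel instead.
for a, b in zip(df['a'].tolist(), df['b'].tolist()):
    data.append({'a': a, 'b': b})
This should give you your desired output.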
If you want to write this to a JSON file, you can do the following:
import json
with open("myjson.json", "w") as f:
    json.dump(data, f, indent=4)
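As for where the numbers come from: they are the DataFrame's index labels. By default, DataFrame.to_json uses orient='columns', which nests each column as {index label: value}. As an alternative to the loop, here is a minimal sketch using orient='records'; the two-row frame is an assumption standing in for the elided data in the question.

```python
import json
import pandas as pd

# Hypothetical two-row stand-in for the question's DataFrame.
df = pd.DataFrame({'a': [10, 20], 'b': [{'A': 'foo'}, {'B': 'faa'}]})

# Default orient='columns': each column maps index label -> value.
by_column = json.loads(df.to_json())

# orient='records': one JSON object per row, collected in a list.
records = json.loads(df.to_json(orient='records'))

print(json.dumps(records, indent=4))
```

Passing a path as the first argument, as in df.to_json("test.json", orient="records"), should write the same structure to a file.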

Related

converting pandas dataframe to a custom JSON

This is my dataframe:
df = pd.DataFrame(
{
'a': ['x', 'x', 'y', 'y'],
'b': ['xs', 'sx', 'rrx', 'ywer'],
'c': ['aaa', 'bbb', 'rrsdrx', 'yz'],
}
)
And this is the JSON output that I want:
{
'x':{
'links':[
{
'b': 'xs',
'c': 'aaa'
},
{
'b': 'sx',
'c': 'bbb'
}
]
},
'y':{
'links':[
{
'b': 'rrx',
'c': 'rrsdrx'
},
{
'b': 'ywer',
'c': 'yz'
}
]
},
}
I have tried the accepted answer of this post. And the following code was my other try:
x = df.groupby('a')['b'].apply(list).reset_index()
y = x.to_json(orient='records')
parsed = json.loads(y)
z = json.dumps(parsed, indent=4)
but the output was not what I needed.
Group the DataFrame by column a. For each group, drop a and convert the remaining rows to a list of records, then store that list under 'links' with the group value as the key:
{k:{'links': d.drop(columns=['a']).to_dict('records')} for k,d in df.groupby('a')}
OUTPUT
{
"x": {
"links": [
{
"b": "xs",
"c": "aaa"
},
{
"b": "sx",
"c": "bbb"
}
]
},
"y": {
"links": [
{
"b": "rrx",
"c": "rrsdrx"
},
{
"b": "ywer",
"c": "yz"
}
]
}
}
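The comprehension returns a Python dict, not a JSON string. A minimal end-to-end sketch, assuming the DataFrame from the question, that adds json.dumps to produce the printed text (the names result and text are only illustrative):

```python
import json
import pandas as pd

df = pd.DataFrame(
    {
        'a': ['x', 'x', 'y', 'y'],
        'b': ['xs', 'sx', 'rrx', 'ywer'],
        'c': ['aaa', 'bbb', 'rrsdrx', 'yz'],
    }
)

# One entry per distinct value of 'a'; the remaining columns become records.
result = {
    key: {'links': group.drop(columns=['a']).to_dict('records')}
    for key, group in df.groupby('a')
}

# Serialize the dict to an indented JSON string.
text = json.dumps(result, indent=4)
print(text)
```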

Python: formatting json output

I need to convert this DataFrame to a JSON file.
Code:
def new_json(df):
    drec = dict()
    ncols = df.values.shape[1]
    for line in df.values:
        d = drec
        # Every column except the last is a nesting level; the last is the leaf value.
        for j, col in enumerate(line[:-1]):
            if col not in d:
                if j != ncols - 2:
                    d[col] = {}
                    d = d[col]
                else:
                    d[col] = line[-1]
            else:
                if j != ncols - 2:
                    d = d[col]
    return drec
a = new_json(df)
print(a)
result:
{'a': {'a2': {'a21': 'new', 'a22': 'old'}, 'a3': {'a31': 'content'}, 'a4': {'a41': 'old'}}, 'b': {'b1': {'b11': 'content', 'b12': 'new', 'b13': 'new'}}, 'c': {'c1': {'c11': 'content'}, 'c2': {'c21': 'content'}, 'c3': {'c31': 'old'}}}
Is it possible to modify the result in this json format?
{
'a': {
'a2': {
'a21': 'new',
'a22': 'old'
},
'a3': {
'a31': 'content'
},
'a4': {
'a41': 'old'
}
},
'b': {
'b1': {
'b11': 'content',
'b12': 'new',
'b13': 'new'
}
},
'c': {
'c1': {
'c11': 'content'
},
'c2': {
'c21': 'content'
},
'c3': {
'c31': 'old'
}
}
}
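No answer is included here for this question. One common approach, sketched on the assumption that the goal is only an indented dump of the dictionary already built, is json.dumps with indent. The dictionary a below is copied from the printed result above. Note that JSON output uses double quotes, not the single quotes shown in the desired format.

```python
import json

# The nested dictionary returned by new_json(df), copied from the printed result.
a = {'a': {'a2': {'a21': 'new', 'a22': 'old'}, 'a3': {'a31': 'content'}, 'a4': {'a41': 'old'}},
     'b': {'b1': {'b11': 'content', 'b12': 'new', 'b13': 'new'}},
     'c': {'c1': {'c11': 'content'}, 'c2': {'c21': 'content'}, 'c3': {'c31': 'old'}}}

# indent=4 puts every key on its own line, nested by four spaces per level.
pretty = json.dumps(a, indent=4)
print(pretty)

# To write the same text to a file instead (file name is an assumption):
# with open("result.json", "w") as f:
#     json.dump(a, f, indent=4)
```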

convert hierarchical data to a specific json format in python

I have a dataframe like below. Each topic has several sub-topics.
pd.DataFrame({'topic': ['A', 'A', 'A', 'B', 'B'],
'sub-topic': ['A1', 'A2', 'A3', 'B1', 'B3' ],
'value': [2,12,44,21,1]})
topic sub-topic value
0 A A1 2
1 A A2 12
2 A A3 44
3 B B1 21
4 B B3 1
I need to convert it to Json format like below. Within first layer, for example topic A, the value is the sum of all its sub-topics.
{'A': {
'value': 58,
'children': {
'A1': {'value': 2},
'A2': {'value': 12},
'A3': {'value': 44}
},
},
'B': {
'value': 22,
'children': {
'B1': {'value': 21},
'B3': {'value': 1}
}
}
}
Does anyone know how I can convert the data to this specific json? I have no clue how I should approach that. Thanks a lot in advance.
Use a custom function in GroupBy.apply, then finish with Series.to_dict or Series.to_json:
def f(x):
    d = {'value': x['value'].sum(),
         'children': x.set_index('sub-topic')[['value']].to_dict('index')}
    return d
# for a dictionary
out = df.groupby('topic').apply(f).to_dict()
# for JSON
#out = df.groupby('topic').apply(f).to_json()
print(out)
{
'A': {
'value': 58,
'children': {
'A1': {
'value': 2
},
'A2': {
'value': 12
},
'A3': {
'value': 44
}
}
},
'B': {
'value': 22,
'children': {
'B1': {
'value': 21
},
'B3': {
'value': 1
}
}
}
}
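If the dictionary is later passed to json.dumps rather than Series.to_json, the NumPy integers that pandas returns from sum() may not serialize. A hedged alternative sketch, not the answer's method, that loops over the groups directly and converts values with int():

```python
import json
import pandas as pd

df = pd.DataFrame({'topic': ['A', 'A', 'A', 'B', 'B'],
                   'sub-topic': ['A1', 'A2', 'A3', 'B1', 'B3'],
                   'value': [2, 12, 44, 21, 1]})

out = {}
for topic, group in df.groupby('topic'):
    out[topic] = {
        # int() turns the NumPy integer into a plain Python int for json.dumps.
        'value': int(group['value'].sum()),
        'children': {
            sub: {'value': int(val)}
            for sub, val in zip(group['sub-topic'], group['value'])
        },
    }

text = json.dumps(out, indent=4)
print(text)
```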

turn a dict that may contain a pandas dataframe to several dicts

I have a dict that may be 'infinitely' nested and contain several pandas DataFrames (all the DataFrames have the same number of rows).
I want to create a new dict for each row in the DataFrames, with the row transformed into a dict (the keys are the column names) and the rest of the dictionary staying the same.
Note: I am not making a Cartesian product between the rows of the different DataFrames.
What would be the best and most Pythonic way to do it?
Example:
the original dict:
d = {'a': 1,
'inner': {
'b': 'string',
'c': pd.DataFrame({'c_col1': range(1,3), 'c_col2': range(2,4)})
},
'd': pd.DataFrame({'d_col1': range(4,6), 'd_col2': range(7,9)})
}
the desired result:
lst_of_dicts = [
{'a': 1,
'inner': {
'b': 'string',
'c': {
'c_col1': 1, 'c_col2':2
}
},
'd': {
'd_col1': 4, 'd_col2': 7
}
},
{'a': 1,
'inner': {
'b': 'string',
'c': {
'c_col1': 2, 'c_col2': 3
}
},
'd': {
'd_col1': 5, 'd_col2': 8
}
}
]
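No answer is included here for this question. One possible approach is a recursive walk that replaces each DataFrame with its i-th row, repeated once per row. The sketch below assumes that DataFrames appear only as dict values (not inside lists) and that they all have the same length; the helper names pick_row, row_count and split_rows are hypothetical.

```python
import pandas as pd

def pick_row(obj, i):
    """Copy obj, replacing every DataFrame with its i-th row as a dict."""
    if isinstance(obj, pd.DataFrame):
        return obj.iloc[i].to_dict()
    if isinstance(obj, dict):
        return {key: pick_row(value, i) for key, value in obj.items()}
    return obj  # scalars and other objects are reused as-is

def row_count(obj):
    """Return the length of the first DataFrame found, or None if there is none."""
    if isinstance(obj, pd.DataFrame):
        return len(obj)
    if isinstance(obj, dict):
        for value in obj.values():
            n = row_count(value)
            if n is not None:
                return n
    return None

def split_rows(d):
    """Build one dict per DataFrame row; no Cartesian product is taken."""
    n = row_count(d)
    if n is None:
        return [d]
    return [pick_row(d, i) for i in range(n)]

d = {'a': 1,
     'inner': {
         'b': 'string',
         'c': pd.DataFrame({'c_col1': range(1, 3), 'c_col2': range(2, 4)})
     },
     'd': pd.DataFrame({'d_col1': range(4, 6), 'd_col2': range(7, 9)})
     }

lst_of_dicts = split_rows(d)
print(lst_of_dicts)
```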

Nested dictionary with lists to many dictionaries

I have a nested dictionary with lists like this:
{
'a': 1,
'x':[
{'b': 1,
'c': [
{'z': 12},
{'z': 22},
]
},
{'b': 2,
'c': [
{'z': 10},
{'z': 33},
]
}
]
}
And I want to convert it to a list of flat dictionaries in a form like this:
[
{'a': 1, 'b': 1, 'z': 12},
{'a': 1, 'b': 1, 'z': 22},
{'a': 1, 'b': 2, 'z': 10},
{'a': 1, 'b': 2, 'z': 33},
]
Any idea how to achieve that?
The following produces the requested result:
[{'a': 1, 'b': 1, 'z': 12}, {'a': 1, 'b': 2, 'z': 10}]
Use at your own risk. The following was only tested on your example.
from itertools import product
def flatten(D):
    if not isinstance(D, dict):
        return D
    base = [(k, v) for k, v in D.items() if not isinstance(v, list)]
    lists = [[flatten(x) for x in v] for k, v in D.items() if isinstance(v, list)]
    l = []
    for p in product(*lists):
        r = dict(base)
        for a in p:
            for d in a:
                r.update(d)
        l.append(r)
    return l
The following tests the function above.
d = {
'a': 1,
'x':[
{'b': 1,
'c': [
{'z': 12}
]
},
{'b': 2,
'c': [
{'z': 10}
]
}
]
}
print(flatten(d))
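On the question's full example, where each c list holds two items, the function above merges the items of a sub-list into a single dict, so it returns two rows instead of four. A hedged variant that keeps every item as its own alternative is sketched below; the name flatten_rows is hypothetical, and the sketch assumes every list element is a dict.

```python
from itertools import product

def flatten_rows(node):
    """Return a list of flat dicts, one per combination of nested list items."""
    scalars = {k: v for k, v in node.items() if not isinstance(v, list)}
    # Each list-valued key contributes a list of alternative flat dicts.
    alternatives = [
        [flat for item in value for flat in flatten_rows(item)]
        for value in node.values() if isinstance(value, list)
    ]
    rows = []
    # Pick one alternative per list-valued key and merge it with the scalars.
    for combo in product(*alternatives):
        row = dict(scalars)
        for part in combo:
            row.update(part)
        rows.append(row)
    return rows

full = {
    'a': 1,
    'x': [
        {'b': 1, 'c': [{'z': 12}, {'z': 22}]},
        {'b': 2, 'c': [{'z': 10}, {'z': 33}]},
    ],
}
rows = flatten_rows(full)
print(rows)
```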
A possible solution is:
#!/usr/bin/env python3
d = {
'a': 1,
'x': [
{
'b': 1,
'c': [
{'z': 12}
]
},
{
'b': 2,
'c': [
{'z': 10}
]
}
]
}
res = [{"a": 1, "b": x["b"], "z": x["c"][0]["z"]} for x in d["x"]]
print(res)
This assumes there is a single top-level a, whose value of 1 is written into the comprehension by hand, and that each c list holds exactly one element, since only x["c"][0] is read.
The other two values (b and z) are taken from the x list by the list comprehension.
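For the question's full example, where each c list has two items, a second for clause in the comprehension yields one row per z. This is a minimal sketch that assumes the fixed a / x / b / c / z layout from the question; the variable name full is only illustrative.

```python
full = {
    'a': 1,
    'x': [
        {'b': 1, 'c': [{'z': 12}, {'z': 22}]},
        {'b': 2, 'c': [{'z': 10}, {'z': 33}]},
    ],
}

# Outer loop walks the items of 'x'; inner loop walks each item's 'c' list.
res = [
    {'a': full['a'], 'b': x['b'], 'z': c['z']}
    for x in full['x']
    for c in x['c']
]
print(res)
```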
To learn more about how comprehensions work read the following:
Python Documentation - 5.1.4. List Comprehensions
Python: List Comprehensions
PS. You are supposed to first show what you have tried so far and get help on that. Take a look at SO rules before posting your next question.
