Is there any way to convert a pandas DataFrame to a list of dicts, with a list of values for each row?
                       Open  High   Low   Close
2021-12-15 12:30:00  1.9000  1.91  1.86  1.8850
2021-12-15 13:30:00  1.8881  1.95  1.88  1.9400
2021-12-15 14:30:00  1.9350  1.95  1.86  1.8956
The output I want:
{x:2021-12-15 12:30:00, y:[1.9000,1.91,1.86,1.8850]}
{x:2021-12-15 13:30:00, y:[1.8881,1.95,1.88,1.9400]}
{x:2021-12-15 14:30:00, y:[1.9350,1.95,1.86,1.8956]}
You can use:
dictt=list(zip(df.index,df[['Open','High','Low','Close']].values.tolist()))
final =[{'x':i[0], 'y':i[1]} for i in dictt]
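Or without an explicit loop (this assumes the index is unnamed, so that reset_index() produces a column called 'index'):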
df['y']=df[['Open','High','Low','Close']].values.tolist()
final = df.reset_index().rename(columns={'index':'x'})[['x','y']].to_dict('records')
Output:
[
  {
    "x":"2021-12-15 12:30:00",
    "y":[
      1.9,
      1.91,
      1.86,
      1.885
    ]
  },
  {
    "x":"2021-12-15 13:30:00",
    "y":[
      1.8881,
      1.95,
      1.88,
      1.94
    ]
  },
  {
    "x":"2021-12-15 14:30:00",
    "y":[
      1.935,
      1.95,
      1.86,
      1.8956
    ]
  }
]
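For reference, here is a minimal self-contained sketch of the zip approach above. It rebuilds the frame from the question and assumes the index is a DatetimeIndex; the `strftime` call is an addition (not part of the answer) so that `x` comes out as a plain string, as in the output shown.

```python
import pandas as pd

# Rebuild the example frame; a DatetimeIndex is an assumption about the asker's data.
df = pd.DataFrame(
    {
        "Open": [1.9000, 1.8881, 1.9350],
        "High": [1.91, 1.95, 1.95],
        "Low": [1.86, 1.88, 1.86],
        "Close": [1.8850, 1.9400, 1.8956],
    },
    index=pd.to_datetime(
        ["2021-12-15 12:30:00", "2021-12-15 13:30:00", "2021-12-15 14:30:00"]
    ),
)

cols = ["Open", "High", "Low", "Close"]

# Pair each index label with that row's values as a plain Python list.
final = [
    {"x": ts.strftime("%Y-%m-%d %H:%M:%S"), "y": row}
    for ts, row in zip(df.index, df[cols].to_numpy().tolist())
]

print(final[0])
```

If the timestamps should stay as pandas Timestamp objects, drop the `strftime` call and use `ts` directly.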
If you want one dict per row, you can start from to_dict with orient='index', which returns a dict of row dicts keyed by index label ... So in your case, if:
df = pd.DataFrame({'o':[1,2,3],'l':[4,5,6],'x':[7,8,9]},index=['t1','t2','t3'])
then you can do:
[{'x':k,'y':list(v.values())} for k,v in df.to_dict(orient='index').items()]
or also:
df2 = pd.DataFrame(df.apply(lambda x:list(x[df.columns]), axis=1))
list(df2.reset_index().rename(columns={'index':'x',0:'y'}).to_dict(orient='index').values())
Either approach results in:
[{'x': 't1', 'y': [1, 4, 7]},
{'x': 't2', 'y': [2, 5, 8]},
{'x': 't3', 'y': [3, 6, 9]}]
I have this dataframe:
id value
0 10.2
1 5.7
2 7.4
Here id is the index. I want output like this:
{'0': 10.2, '1': 5.7, '2': 7.4}
How can I do this in Python?
Use to_dict on the column:
>>> df['value'].to_dict()
{0: 10.2, 1: 5.7, 2: 7.4}
If you need the keys as strings:
>>> df.set_index(df.index.astype(str))['value'].to_dict()
{'0': 10.2, '1': 5.7, '2': 7.4}
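As a self-contained variant of the above, the string keys can also be produced with a dict comprehension over `Series.items()`. This is only a sketch; the frame is rebuilt from the question, and the `float()` call is an addition that turns NumPy scalars into plain Python floats.

```python
import pandas as pd

# Rebuild the example frame with "id" as the index name.
df = pd.DataFrame({"value": [10.2, 5.7, 7.4]}, index=pd.Index([0, 1, 2], name="id"))

# Convert each index label to a string and each value to a plain float.
result = {str(key): float(value) for key, value in df["value"].items()}

print(result)  # {'0': 10.2, '1': 5.7, '2': 7.4}
```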
I have the following dataframe:
import pandas as pd
df = pd.DataFrame({
"color": ["blue", "blue", "blue", "blue", "green", "green", "green", "green"],
"object": ["hat", "hat", "coat", "coat", "hat", "hat", "coat", "coat"],
"group": [1, 2, 1, 2, 1, 2, 1, 2],
"value": [1.2 , 3.5, 5.4, 7.1, 6.4, 1.8, 3.5, 5.6]
})
I want to create a nested dict in which the first-level key combines the columns "color" and "object" into a string, e.g. "(blue, hat)" (note: this is deliberately not a valid tuple; it should be a plain string), the second-level key is the group, and the second-level value is the value. I.e. my desired output is:
{
    "(blue, hat)": {
        1: 1.2,
        2: 3.5
    },
    "(blue, coat)": {
        1: 5.4,
        2: 7.1
    },
    "(green, hat)": {
        1: 6.4,
        2: 1.8
    },
    "(green, coat)": {
        1: 3.5,
        2: 5.6
    }
}
My approach would be to loop through the unique values of color, object and group, but that seems cumbersome to me. Is there a more Pythonic approach?
Use a dictionary comprehension with DataFrame.groupby. If you need the tuples represented as strings, use:
d = {str(k): v.set_index('group')['value'].to_dict()
for k, v in df.groupby(['color','object'])}
print (d)
{
    "('blue', 'coat')": {
        1: 5.4,
        2: 7.1
    },
    "('blue', 'hat')": {
        1: 1.2,
        2: 3.5
    },
    "('green', 'coat')": {
        1: 3.5,
        2: 5.6
    },
    "('green', 'hat')": {
        1: 6.4,
        2: 1.8
    }
}
Or, if you need the keys formatted like tuples but without the quotes, use f-strings:
d = {f'({k[0]}, {k[1]})': v.set_index('group')['value'].to_dict()
for k, v in df.groupby(['color','object'])}
Alternative with join:
d = {f'({", ".join(k)})': v.set_index('group')['value'].to_dict()
for k, v in df.groupby(['color','object'])}
print (d)
{
    '(blue, coat)': {
        1: 5.4,
        2: 7.1
    },
    '(blue, hat)': {
        1: 1.2,
        2: 3.5
    },
    '(green, coat)': {
        1: 3.5,
        2: 5.6
    },
    '(green, hat)': {
        1: 6.4,
        2: 1.8
    }
}
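Note that groupby sorts the group keys by default, which is why "coat" comes before "hat" in the outputs above. If the order of first appearance matters, as in the desired output, one option is to pass `sort=False`. A minimal sketch using the frame from the question:

```python
import pandas as pd

df = pd.DataFrame({
    "color": ["blue", "blue", "blue", "blue", "green", "green", "green", "green"],
    "object": ["hat", "hat", "coat", "coat", "hat", "hat", "coat", "coat"],
    "group": [1, 2, 1, 2, 1, 2, 1, 2],
    "value": [1.2, 3.5, 5.4, 7.1, 6.4, 1.8, 3.5, 5.6],
})

# sort=False keeps the groups in order of first appearance instead of sorted order.
d = {
    f"({color}, {obj})": g.set_index("group")["value"].to_dict()
    for (color, obj), g in df.groupby(["color", "object"], sort=False)
}

print(list(d))
```

Since dicts preserve insertion order, the keys then come out as (blue, hat), (blue, coat), (green, hat), (green, coat).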
I have a nested dictionary, such as:
{'A1': {'T1': [1, 3.0, 3, 4.0], 'T2': [2, 2.0]}, 'A2': {'T1': [1, 0.0, 3, 5.0], 'T2': [2, 3.0]}}
What I want to do is sum each sub dictionary, to obtain this:
A1['T1'] + A2['T1'] and A1['T2'] + A2['T2'] (ignore the first entry of each list)
[3.0, 5.0, 9.0] <<<< output
  1    2    3
i.e. 3.0 + 0.0 = 3.0, 2.0 + 3.0 = 5.0, and 5.0 + 4.0 = 9.0
How can I do this? I tried a for loop, but ended up with a big mess.
One way is to build a collections.Counter from each inner dict and sum the resulting Counter objects:
from collections import Counter
d = {'A1': {'T1': 3.0, 'T2': 2.0}, 'A2': {'T1': 0.0, 'T2': 3.0}}
# use a list rather than a generator so that l can be summed more than once
l = [Counter(i) for i in d.values()]
sum(l, Counter())
# Counter({'T2': 5.0, 'T1': 3.0})
For sum to work here, I've passed an empty Counter() as the start argument; otherwise sum would start from the integer 0, which cannot be added to a Counter.
To get only the values, you can do:
sum(l, Counter()).values()
# dict_values([3.0, 5.0])
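One caveat with this approach: adding Counter objects with `+` discards any key whose total is zero or negative. That does not affect the data above, but it would if, say, both T1 values were 0.0. The sketch below uses a modified version of `d` (an assumption for illustration, not the asker's data) and shows a `defaultdict` alternative that keeps such keys.

```python
from collections import Counter, defaultdict

# Hypothetical data in which T1 sums to zero.
d = {"A1": {"T1": 0.0, "T2": 2.0}, "A2": {"T1": 0.0, "T2": 3.0}}

# Counter addition keeps only keys with a positive total, so T1 disappears.
counter_total = sum((Counter(inner) for inner in d.values()), Counter())

# A defaultdict accumulates every key, including zero or negative totals.
totals = defaultdict(float)
for inner in d.values():
    for key, value in inner.items():
        totals[key] += value

print(counter_total)
print(dict(totals))
```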
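You could use a list comprehension with zip:
d = {'A1': {'T1': 3.0, 'T2': 2.0}, 'A2': {'T1': 0.0, 'T2': 3.0}}
[sum(e) for e in zip(*(e.values() for e in d.values()))]
output:
[3.0, 5.0]
This relies on dicts preserving insertion order, which is guaranteed from Python 3.7 (and is an implementation detail of CPython 3.6), and on every inner dict listing its keys in the same order.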
Also, you can use two for loops:
r = {}
for dv in d.values():
    for k, v in dv.items():
        r.setdefault(k, []).append(v)
result = [sum(v) for v in r.values()]
print(result)
output:
[3.0, 5.0]
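After your edit, with d being the nested dictionary from the question, you could use:
from itertools import zip_longest
d = {'A1': {'T1': [1, 3.0, 3, 4.0], 'T2': [2, 2.0]}, 'A2': {'T1': [1, 0.0, 3, 5.0], 'T2': [2, 3.0]}}
sum_t1, sum_t2 = list(list(map(sum, zip(*t))) for t in zip(*[e.values() for e in d.values()]))
[i for t in zip_longest(sum_t1[1:], sum_t2[1:]) for i in t if i is not None]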
output:
[3.0, 5.0, 6, 9.0]
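The output above still contains the summed label 6, whereas the question asks for [3.0, 5.0, 9.0]. One possible reading of the question, and it is only an assumption, is that each list is a flattened sequence of (position, value) pairs, so that [1, 3.0, 3, 4.0] means position 1 holds 3.0 and position 3 holds 4.0. Under that reading, a sketch like the following reproduces the desired output:

```python
from collections import defaultdict

d = {
    "A1": {"T1": [1, 3.0, 3, 4.0], "T2": [2, 2.0]},
    "A2": {"T1": [1, 0.0, 3, 5.0], "T2": [2, 3.0]},
}

totals = defaultdict(float)
for inner in d.values():
    for pairs in inner.values():
        # Assumption: even offsets hold position labels, odd offsets hold values.
        for label, value in zip(pairs[0::2], pairs[1::2]):
            totals[label] += value

# Order the sums by position label (1, 2, 3).
result = [totals[label] for label in sorted(totals)]
print(result)  # [3.0, 5.0, 9.0]
```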
I have a data frame which looks like the following:
Date        Top
            A    B
2018-09-30  1.2  2.3
2018-10-01  1.5  1.7
2018-10-02  2.3  2.8
2018-10-03  7.7  7.5
2018-10-04  1.1  0.9
2018-10-05  2.1  6.5
So the columns have a MultiIndex: there are only two top-level columns, 'Date' and 'Top', and 'Top' has two second-level columns, 'A' and 'B'.
I am trying to convert the data frame into a Python dictionary.
When I use
df_dict = df.to_dict(orient = 'index')
I get this output:
{0: {('Top', 'A'): 1.2, ('Top', 'B'): 2.3, ('date', ''): '2018-09-30'},
1: {('Top', 'A'): 1.5, ('Top', 'B'): 1.7, ('date', ''): '2018-10-01'},
2: {('Top', 'A'): 2.3, ('Top', 'B'): 2.8, ('date', ''): '2018-10-02'},
3: {('Top', 'A'): 7.7, ('Top', 'B'): 7.5, ('date', ''): '2018-10-03'},
4: {('Top', 'A'): 1.1, ('Top', 'B'): 0.9, ('date', ''): '2018-10-04'},
5: {('Top', 'A'): 2.1, ('Top', 'B'): 6.5, ('date', ''): '2018-10-05'}}
Now I can access a single value in df_dict with the following script, which gives me an output of 1.2:
df_dict[0]['Top', 'A']
But I am looking for this output from the following script:
df_dict[0]['Top']
Output: A:1.2, B:2.3
This does not work, since 'Top' is not a key inside the dict for row 0. I want this so that I can access all 'Top' values easily for a date.
Thanks for all the help
You can use a dict comprehension that filters on the first level, Top:
df_dict = df.to_dict(orient = 'index')
out = {k2: v for (k1, k2), v in df_dict[0].items() if k1 == 'Top'}
print (out)
{'A': 1.2, 'B': 2.3}
A simpler option is to use pandas to select by index value and the first level of the MultiIndex, and then create the dict:
print (df.loc[0, 'Top'])
A 1.2
B 2.3
Name: 0, dtype: object
out = df.loc[0, 'Top'].to_dict()
print (out)
{'A': 1.2, 'B': 2.3}
EDIT:
print (df)
A B
2018-09-30 1.2 2.3
2018-10-01 1.5 1.7
2018-10-02 2.3 2.8
2018-10-03 7.7 7.5
2018-10-04 1.1 0.9
2018-10-05 2.1 6.5
df.index.name = 'date'
df = df.reset_index()
# set a MultiIndex on the columns to avoid empty-string keys
df.columns = [['d','Top', 'Top'], df.columns]
# create a dictionary for each first-level value of the MultiIndex,
# which also adds a new outer level to the dict
out = {x:df[x].to_dict(orient = 'index') for x in df.columns.levels[0]}
print (out)
{'Top': {0: {'A': 1.2, 'B': 2.3}, 1: {'A': 1.5, 'B': 1.7}, 2: {'A': 2.3, 'B': 2.8},
3: {'A': 7.7, 'B': 7.5}, 4: {'A': 1.1, 'B': 0.9}, 5: {'A': 2.1, 'B': 6.5}},
'd': {0: {'date': '2018-09-30'}, 1: {'date': '2018-10-01'},
2: {'date': '2018-10-02'}, 3: {'date': '2018-10-03'},
4: {'date': '2018-10-04'}, 5: {'date': '2018-10-05'}}}
print (out['Top'][0])
{'A': 1.2, 'B': 2.3}
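If the goal is to keep the row label as the outer key, so that `df_dict[0]['Top']` works as written in the question, another option is to regroup the tuple keys after calling `to_dict`. This is a sketch rather than part of the answer above: the two-row frame is a shortened stand-in for the asker's data, and `nested` is a name introduced here.

```python
import pandas as pd

# Shortened stand-in for the asker's frame, with MultiIndex columns.
df = pd.DataFrame({
    ("date", ""): ["2018-09-30", "2018-10-01"],
    ("Top", "A"): [1.2, 1.5],
    ("Top", "B"): [2.3, 1.7],
})

nested = {}
for row_label, row in df.to_dict(orient="index").items():
    inner = {}
    for (level0, level1), value in row.items():
        if level1 == "":
            # No second level: store the value directly under the first-level name.
            inner[level0] = value
        else:
            # Group second-level keys under their first-level name.
            inner.setdefault(level0, {})[level1] = value
    nested[row_label] = inner

print(nested[0]["Top"])
```

With this layout, `nested[0]["date"]` returns the date for the row and `nested[0]["Top"]` returns the dict of 'A' and 'B' values.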