I have 2 dataframes
df1 = pd.DataFrame(data={'ID': ['0','1'], 'col1': [0.73, 0.58], 'col2': [0.51, 0.93], 'Type': ['mean', 'mean'] })
df2 = pd.DataFrame(data={'ID': ['0','1'], 'col1': [0.44, 0.49], 'col2': [0.50, 0.24], 'Type': ['std', 'std'] })
print(df1)
print(df2)
I need to convert them to a nested dictionary like
mydict = {0: {'col1': {'mean': 0.73, 'std': 0.44}, 'col2': {'mean': 0.51, 'std': 0.5}},
1: {'col1': {'mean': 0.58, 'std': 0.49}, 'col2': {'mean': 0.93, 'std': 0.24}}}
where 'ID' is the outer key, the column names are the nested keys, the 'Type' values are the innermost keys, and the column values are the values.
Use concat with DataFrame.pivot to build a MultiIndex DataFrame, then convert it to a nested dict (pivot takes keyword arguments in recent pandas versions):
df = pd.concat([df1, df2]).pivot(index='Type', columns='ID')
d = {level: df.xs(level, axis=1, level=1).to_dict() for level in df.columns.levels[1]}
print (d)
{'0': {'col1': {'mean': 0.73, 'std': 0.44},
'col2': {'mean': 0.51, 'std': 0.5}},
'1': {'col1': {'mean': 0.58, 'std': 0.49},
'col2': {'mean': 0.93, 'std': 0.24}}}
Alternatively, melt both frames, merge them, and pivot the per-row records:
(df1.drop(columns='Type').melt('ID', value_name='mean')
    .merge(df2.drop(columns='Type').melt('ID', value_name='std'))
    .assign(c=lambda x: x[['mean', 'std']].to_dict('records'))
    .pivot(index='variable', columns='ID', values='c').to_dict())
{'0': {'col1': {'mean': 0.73, 'std': 0.44},
'col2': {'mean': 0.51, 'std': 0.5}},
'1': {'col1': {'mean': 0.58, 'std': 0.49},
'col2': {'mean': 0.93, 'std': 0.24}}}
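The question asks for integer keys (0, 1), while both answers above return the string IDs. A minimal sketch of one way to get integer keys, assuming every ID string is a valid integer; the names `stacked` and `nested` are made up for illustration:

```python
import pandas as pd

df1 = pd.DataFrame({'ID': ['0', '1'], 'col1': [0.73, 0.58],
                    'col2': [0.51, 0.93], 'Type': ['mean', 'mean']})
df2 = pd.DataFrame({'ID': ['0', '1'], 'col1': [0.44, 0.49],
                    'col2': [0.50, 0.24], 'Type': ['std', 'std']})

# Stack both frames and index by (ID, Type): one row per statistic per ID.
stacked = pd.concat([df1, df2]).set_index(['ID', 'Type'])

# Each group is indexed by Type once the ID level is dropped, so
# to_dict() yields {column: {Type: value}} for that ID.
nested = {
    int(id_): sub.droplevel('ID').to_dict()  # assumes IDs are integer-like strings
    for id_, sub in stacked.groupby(level='ID')
}
print(nested)
```

If the IDs are not guaranteed to be numeric, dropping the `int(...)` call keeps the original string keys.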
Related
This is my dictionary, called "reviews":
reviews= {1: {'like', 'the', 'acting'},
2: {'hate', 'plot', 'story'}}
And this is my "lexicon" dataFrame:
import pandas as pd
lexicon = {'word': ['like', 'movie', 'hate'],
'neg': [0.0005, 0.0014, 0.0029],
'pos': [0.0025, 0.0019, 0.0002]
}
lexicon = pd.DataFrame(lexicon, columns = ['word', 'neg','pos'])
print (lexicon)
I need to fill my "reviews" dictionary with the neg and pos values from the "lexicon" dataFrame.
If there is no value in the lexicon, then I want to put 0.5
To finally get this outcome:
reviews= {1: {'like': [0.0005, 0.0025], 'the': [0.5, 0.5], 'acting': [0.5, 0.5]},
2: {'plot': [0.5, 0.5], 'hate': [0.0029, 0.0002], 'story': [0.5, 0.5]}}
You can use df.reindex here.
df_ = lexicon.set_index("word").agg(list, axis=1)
out = {k: df_.reindex(v, fill_value=[0.5, 0.5]).to_dict() for k, v in reviews.items()}
# {1: {'the': [0.5, 0.5], 'like': [0.0005, 0.0025], 'acting': [0.5, 0.5]},
# 2: {'story': [0.5, 0.5], 'hate': [0.0029, 0.0002], 'plot': [0.5, 0.5]}}
Create a dictionary from lexicon, then use a nested dictionary comprehension that looks each word up with dict.get, which supplies a default value when there is no match:
d = lexicon.set_index('word').agg(list, axis=1).to_dict()
print (d)
{'like': [0.0005, 0.0025], 'movie': [0.0014, 0.0019], 'hate': [0.0029, 0.0002]}
out = {k: {x: d.get(x, [0.5,0.5]) for x in v} for k, v in reviews.items()}
print (out)
{1: {'like': [0.0005, 0.0025], 'the': [0.5, 0.5], 'acting': [0.5, 0.5]},
2: {'story': [0.5, 0.5], 'hate': [0.0029, 0.0002], 'plot': [0.5, 0.5]}}
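For reference, a self-contained sketch of the same lookup idea that builds the word-to-scores mapping with `zip` instead of `agg`. The names `scores`, `DEFAULT`, and `filled` are illustrative, not from the original posts:

```python
import pandas as pd

reviews = {1: {'like', 'the', 'acting'},
           2: {'hate', 'plot', 'story'}}
lexicon = pd.DataFrame({'word': ['like', 'movie', 'hate'],
                        'neg': [0.0005, 0.0014, 0.0029],
                        'pos': [0.0025, 0.0019, 0.0002]})

# Map each word to its [neg, pos] pair.
scores = dict(zip(lexicon['word'], lexicon[['neg', 'pos']].values.tolist()))

DEFAULT = [0.5, 0.5]

# Copy the default for every missing word so the lists are not shared.
filled = {
    review_id: {word: scores.get(word, list(DEFAULT)) for word in words}
    for review_id, words in reviews.items()
}
print(filled)
```

Because each review is a set, the order of the words inside each inner dictionary is not guaranteed.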
I need to convert a pandas Series to a dictionary without the index (like pandas.DataFrame.to_dict('records')). My code is below:
grouped_df = df.groupby(index_column)
for key, val in tqdm(grouped_df):
    json_dict[key] = val.apply(lambda x: x.to_dict(), axis=1).to_dict()
Currently, I get output like so:
{
"15717":{
"col1":1.61,
"col2":1.53,
"col3":1.0
},
"15718":{
"col1":10.97,
"col2":5.79,
"col3":2.0
},
"15719":{
"col1":15.38,
"col2":12.81,
"col3":1.0
}
}
But I need output like:
[
{
"col1":1.61,
"col2":1.53,
"col3":1.0
},
{
"col1":10.97,
"col2":5.79,
"col3":2.0
},
{
"col1":15.38,
"col2":12.81,
"col3":1.0
}
]
Thanks for your help!
Edit: Here is the original dataframe:
col1 col2 col3
2751 5.46 1.0 1.11
2752 16.47 0.0 6.54
2753 26.51 0.0 18.25
2754 31.04 1.0 28.95
2755 36.45 0.0 32.91
Two ways of doing that:
[v for _, v in df.to_dict(orient="index").items()]
Another one:
df.to_dict(orient="records")
The output, either way, is:
[{'col1': 1.61, 'col2': 1.53, 'col3': 1.0},
{'col1': 10.97, 'col2': 5.79, 'col3': 2.0},
{'col1': 15.38, 'col2': 12.81, 'col3': 1.0}]
You can try:
df.T.to_dict('records')  # 'r' was shorthand for 'records' in older pandas
Output:
[{'col1': 1.61, 'col2': 1.53, 'col3': 1.0},
{'col1': 10.97, 'col2': 5.79, 'col3': 2.0},
{'col1': 15.38, 'col2': 12.81, 'col3': 1.0}]
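Applied to the grouped loop from the question, the `records` orientation can be used per group. This is only a sketch: the `group` column stands in for the question's unspecified `index_column`, and the group labels are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {'group': ['a', 'a', 'b'],          # hypothetical grouping column
     'col1': [1.61, 10.97, 15.38],
     'col2': [1.53, 5.79, 12.81],
     'col3': [1.0, 2.0, 1.0]},
    index=[15717, 15718, 15719],
)

# One list of row dictionaries per group; the row index is discarded.
json_dict = {
    key: val.drop(columns='group').to_dict(orient='records')
    for key, val in df.groupby('group')
}
print(json_dict)
```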
I have a dictionary as
ex_dict_tot={'recency': 12, 'frequency': 12, 'money': 12}
another count dictionary as
ex_dict_count= {'recency': {'current': 4, 'savings': 2, 'fixed': 6},
'frequency': {'freq': 10, 'infreq': 2},
'money': {'med': 2, 'high': 8, 'low': 1, 'md': 1}}
I would like to calculate the proportion of each value within its key, as follows.
In key - recency,
current=4/12,
savings=2/12,
fixed=6/12
Similarly - in key - frequency,
freq=10/12
infreq=2/12
And the required output would be,
{'recency': {'current': 0.3, 'savings': 0.16, 'fixed': 0.5},
'frequency': {'freq': 0.83, 'infreq': 0.16},
'money': {'med': 0.16, 'high': 0.6, 'low': 0.08, 'md': 0.08}}
Could you please write your suggestions/inputs on it?
You can do this with dict comprehension.
out = {key:{k:v/ex_dict_tot[key] for k,v in val.items()} for key,val in ex_dict_count.items()}
out
{'recency': {'current': 0.3333333333333333, 'savings': 0.16666666666666666, 'fixed': 0.5},
'frequency': {'freq': 0.8333333333333334, 'infreq': 0.16666666666666666},
'money': {'med': 0.16666666666666666, 'high': 0.6666666666666666, 'low': 0.08333333333333333, 'md': 0.08333333333333333}}
Use round to round the values to 2 decimal places.
out = {key:{k:round(v/ex_dict_tot[key],2) for k,v in val.items()} for key,val in ex_dict_count.items()}
out
{'recency': {'current': 0.33, 'savings': 0.17, 'fixed': 0.5},
'frequency': {'freq': 0.83, 'infreq': 0.17},
'money': {'med': 0.17, 'high': 0.67, 'low': 0.08, 'md': 0.08}}
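In this example every total equals the sum of its counts, so the totals could also be derived rather than stored separately. A small sketch under that assumption (it would not apply if the totals came from another source); `totals` and `proportions` are illustrative names:

```python
ex_dict_count = {'recency': {'current': 4, 'savings': 2, 'fixed': 6},
                 'frequency': {'freq': 10, 'infreq': 2},
                 'money': {'med': 2, 'high': 8, 'low': 1, 'md': 1}}

# Derive each total from its own counts (assumption: total == sum of counts).
totals = {key: sum(counts.values()) for key, counts in ex_dict_count.items()}

# Divide every count by the total of its outer key, rounded to 2 decimals.
proportions = {
    key: {k: round(v / totals[key], 2) for k, v in counts.items()}
    for key, counts in ex_dict_count.items()
}
print(proportions)
```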
I want to replace nan values in my dictionary.
For example, sometimes my dictionary looks like:
{'mean': nan, 'std': nan, 'median': nan, 'sum': 0, 'average_per_day': 0.0, 'freq': 0}
Now I'm doing it like this:
for k, v in stats_record.items():
    if math.isnan(v):
        stats_record[k] = 0
Is there a more Pythonic way to replace nan values in a dictionary?
Dict-comprehension can be handy here.
import numpy as np
e = {'mean': np.nan, 'std': np.nan, 'median': np.nan, 'sum': 0, 'average_per_day': 0.0, 'freq': 0}
e = {k:v if not np.isnan(v) else 0 for k,v in e.items() }
print(e)
Output:
{'average_per_day': 0.0, 'sum': 0, 'freq': 0, 'median': 0, 'std': 0, 'mean': 0}
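Both `math.isnan` and `np.isnan` raise a TypeError on non-numeric values such as strings. If the dictionary might hold such values, a guarded variant along these lines may help; the `replace_nan` helper and the `'label'` entry are hypothetical additions for illustration:

```python
import math


def replace_nan(record, default=0):
    """Return a copy of record with float NaN values replaced by default."""
    return {
        # Only floats can be NaN; everything else passes through untouched.
        k: default if isinstance(v, float) and math.isnan(v) else v
        for k, v in record.items()
    }


stats_record = {'mean': float('nan'), 'std': float('nan'), 'sum': 0,
                'average_per_day': 0.0, 'label': 'daily'}
cleaned = replace_nan(stats_record)
print(cleaned)
```

`np.nan` is an ordinary Python float, so the same check covers values that came from NumPy.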
I have the data in tabular format (rows and columns) which I read into a dataframe (Data1) :
Name D Score
0 Angelica D1 3.5
1 Angelica D2 2.0
2 Bill D1 2.0
3 Chan D3 1.0
......
I am able to convert it into a list using:
Data2 = Data1.values.tolist()
and get the below output:
[
['Angelica', 'D1', 3.5], ['Angelica', 'D2', 2.0],
['Bill', 'D1', 2.0], ['Bill', 'D2', 3.5],
['Chan', 'D8', 1.0], ['Chan', 'D3', 3.0], ['Chan', 'D4', 5.0],
['Dan', 'D4', 3.0], ['Dan', 'D5', 4.5], ['Dan', 'D6', 4.0]
]
What I want is output like this:
{
'Angelica': {'D1': 3.5, 'D2': 2.0},
'Bill': {'D1': 2.0, 'D2': 3.5},
'Chan': {'D8': 1.0, 'D3': 3.0, 'D4': 5.0},
'Dan': {'D4': 3.0, 'D5': 4.5, 'D6': 4.0}
}
How can I achieve this in Python?
You can use a dictionary comprehension after grouping the df by the Name column:
>>> df = pd.DataFrame([{'Name': 'Angela', 'Score': 3.5, 'D': 'D1'}, {'Name': 'Angela', 'Score': 2.0, 'D': 'D2'}, {'Name': 'Bill', 'Score': 2.0, 'D': 'D1'}, {'Name': 'Chan', 'Score': 1.0, 'D': 'D3'}])
>>> df
D Name Score
0 D1 Angela 3.5
1 D2 Angela 2.0
2 D1 Bill 2.0
3 D3 Chan 1.0
>>> data2 = {name: {df.loc[v, 'D']: df.loc[v, 'Score'] for v in val} for name, val in df.groupby('Name').groups.items()}
>>> data2
{'Chan': {'D3': 1.0}, 'Angela': {'D1': 3.5, 'D2': 2.0}, 'Bill': {'D1': 2.0}}
You can zip up the values from each group after grouping by Name:
In [4]: l = [
...: ['Angelica', 'D1', 3.5], ['Angelica', 'D2', 2.0],
...: ['Bill', 'D1', 2.0], ['Bill', 'D2', 3.5],
...: ['Chan', 'D8', 1.0], ['Chan', 'D3', 3.0], ['Chan', 'D4', 5.0],
...: ['Dan', 'D4', 3.0], ['Dan', 'D5', 4.5], ['Dan', 'D6', 4.0]
...: ]
...: columns=["Name", "D", "Score"]
...: df = pd.DataFrame(l, columns=columns)
...:
In [5]: data = {name: dict(zip(v["D"], v["Score"])) for name, v in df.groupby("Name")}
In [6]: data
Out[6]:
{'Angelica': {'D1': 3.5, 'D2': 2.0},
'Bill': {'D1': 2.0, 'D2': 3.5},
'Chan': {'D3': 3.0, 'D4': 5.0, 'D8': 1.0},
'Dan': {'D4': 3.0, 'D5': 4.5, 'D6': 4.0}}
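Without pandas, you can build the result directly from the list of rows (Data2) with a defaultdict:
from collections import defaultdict
result = defaultdict(dict)
for item in Data2:
    # item is [Name, D, Score]; item[1:] becomes a one-entry {D: Score} dict
    result[item[0]].update(dict([item[1:]]))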
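A self-contained run of the defaultdict approach on the rows from the question, written here with tuple unpacking instead of slicing. This is a sketch; the variable names `rows`, `name`, `d`, and `score` are illustrative:

```python
from collections import defaultdict

rows = [
    ['Angelica', 'D1', 3.5], ['Angelica', 'D2', 2.0],
    ['Bill', 'D1', 2.0], ['Bill', 'D2', 3.5],
    ['Chan', 'D8', 1.0], ['Chan', 'D3', 3.0], ['Chan', 'D4', 5.0],
    ['Dan', 'D4', 3.0], ['Dan', 'D5', 4.5], ['Dan', 'D6', 4.0],
]

grouped = defaultdict(dict)
for name, d, score in rows:
    # The inner dict for a new name is created on first access.
    grouped[name][d] = score

result = dict(grouped)  # plain dict, e.g. for printing or JSON export
print(result)
```

If the same (Name, D) pair appears more than once, the last score wins.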