Get dictionary from a dataframe

Get dictionary from a dataframe - python

I have a dataframe like below:
df = pd.DataFrame({
'Aapl': [12, 5, 8],
'Fs': [18, 12, 8],
'Bmw': [6, 18, 12],
'Year': ['2020', '2025', '2030']
})
I want a dictionary like:
d = {'2020': [12, 18, 6],
     '2025': [5, 12, 18],
     '2030': [8, 8, 12]
}
I am not able to develop the whole logic:
lst = [list(item.values()) for item in df.to_dict().values()]
dic = {}
for items in lst:
    for i in items[-1]:
        dic[i] = ...  # '2020' should hold all the 0th values of the other lists, and so on
Is there an easier way using pandas?

Set Year as the index, transpose, and then build the lists in a dict comprehension:
d = {k: list(v) for k, v in df.set_index('Year').T.items()}
print (d)
{'2020': [12, 18, 6], '2025': [5, 12, 18], '2030': [8, 8, 12]}
Or use DataFrame.agg:
d = df.set_index('Year').agg(list, axis=1).to_dict()
print (d)
{'2020': [12, 18, 6], '2025': [5, 12, 18], '2030': [8, 8, 12]}
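To make the reshaping explicit, here is a minimal plain-Python sketch of what "set Year as the index and transpose" amounts to. It is not the pandas approach itself: it assumes the question's data is held as a dict of columns (the name `columns` is made up for illustration) and pairs each year with the values at the same position in the other columns.

```python
# The question's data as a plain dict of columns (no pandas needed for this sketch).
columns = {
    'Aapl': [12, 5, 8],
    'Fs': [18, 12, 8],
    'Bmw': [6, 18, 12],
    'Year': ['2020', '2025', '2030'],
}

# Every column except 'Year', in their original order.
value_columns = [values for name, values in columns.items() if name != 'Year']

# zip(*value_columns) turns columns into rows; each row is paired with its year.
d = {year: list(row) for year, row in zip(columns['Year'], zip(*value_columns))}
print(d)
```

Each year ends up mapped to one value per remaining column, in column order, which matches the output shown above.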

Try this:
import pandas as pd
data = {'Name': ['Ankit', 'Amit',
'Aishwarya', 'Priyanka'],
'Age': [21, 19, 20, 18],
'Stream': ['Math', 'Commerce',
'Arts', 'Biology'],
'Percentage': [88, 92, 95, 70]}
# Convert the dictionary into DataFrame
df = pd.DataFrame(data, columns=['Name', 'Age',
'Stream', 'Percentage'])

Related

itertool group dynamic element in list python

I have a list of lists with multiple columns:
column = [id, date,col1, col2...coln]
list_OfRows = [[1,date1, 10,20 ...23],
[1,date1, 1,10 ...33],
[2,date2, 3,7...8],
[2,date2, 21,9...23],
[2,date3, 10,56 ...20],
[2,date4, 10,20 ...42]]
I want to group by id and date and sum the remaining columns, WITHOUT USING PANDAS:
RESULT = [[1,date1, 11,30 ...56],
[2,date2, 24,16...31],
[2,date3, 10,56 ...20],
[2,date4, 10,20 ...42]]

You can do it like this:
from itertools import groupby
list_OfRows.sort(key=lambda x: x[:2])
res = []
for k, g in groupby(list_OfRows, key=lambda x: x[:2]):
    res.append(k + list(map(sum, zip(*[c[2:] for c in g]))))
which produces:
[[1, 'date1', 11, 30, 56],
[2, 'date2', 24, 16, 31],
[2, 'date3', 10, 56, 20],
[2, 'date4', 10, 20, 42]]
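For reference, a self-contained sketch of the same groupby approach. It assumes three value columns and string dates in place of the elided `...` in the question, and the function name `sum_by_id_and_date` is made up here. Unlike the snippet above, it sorts a copy rather than the input list.

```python
from itertools import groupby

# Sample rows: [id, date, col1, col2, col3]; the three value columns are an assumption.
rows = [
    [1, 'date1', 10, 20, 23],
    [1, 'date1', 1, 10, 33],
    [2, 'date2', 3, 7, 8],
    [2, 'date2', 21, 9, 23],
    [2, 'date3', 10, 56, 20],
    [2, 'date4', 10, 20, 42],
]

def sum_by_id_and_date(rows):
    """Group rows on their first two fields and sum the remaining columns."""
    key = lambda row: row[:2]
    result = []
    # groupby only merges adjacent rows, so the input must be sorted by the same key.
    for group_key, group in groupby(sorted(rows, key=key), key=key):
        totals = [sum(column) for column in zip(*(row[2:] for row in group))]
        result.append(group_key + totals)
    return result

res = sum_by_id_and_date(rows)
print(res)
```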

Aggregation in array element - python

I have this output (OP): {'2017-05-06': [3, 7, 8],'2017-05-07': [3, 9, 10],'2017-05-08': [4]}
From it I want to derive another output:
{'2017-05-06': [15, 11, 10],'2017-05-07': [19, 13, 12],'2017-05-08': [4]}
That is, within each key every element is replaced by the list total minus that element.
For example, for the key 2017-05-06
the element total is 18, so '2017-05-06': [18-3, 18-7, 18-8] = '2017-05-06': [15, 11, 10]
The same applies to every other key; a list with a single element is left unchanged.
So the final output is {'2017-05-06': [15, 11, 10],'2017-05-07': [19, 13, 12],'2017-05-08': [4]}
How do I do this?
Note : I am using python 3.6.2 and pandas 0.22.0
code so far :
import pandas as pd
dfs = pd.read_excel('ff2.xlsx', sheet_name=None)
dfs1 = {i:x.groupby(pd.to_datetime(x['date']).dt.strftime('%Y-%m-%d'))['duration'].sum() for i, x in dfs.items()}
d = pd.concat(dfs1).groupby(level=1).apply(list).to_dict()
actuald = pd.concat(dfs1).div(80).astype(int)
sum1 = actuald.groupby(level=1).transform('sum')
m = actuald.groupby(level=1).transform('size') > 1
cleand = sum1.sub(actuald).where(m, actuald).groupby(level=1).apply(list).to_dict()
print (cleand)
How do I get this from cleand?

In a compact (but somewhat inefficient) way:
>>> op = {'2017-05-06': [3, 7, 8],'2017-05-07': [3, 9, 10],'2017-05-08': [4]}
>>> { x:[sum(y)-i for i in y] if len(y)>1 else y for x,y in op.items() }
#output:
{'2017-05-06': [15, 11, 10], '2017-05-07': [19, 13, 12], '2017-05-08': [4]}
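The comprehension above calls `sum(y)` once per element, which is presumably the inefficiency the answer has in mind. A minimal sketch that computes each total only once; the helper name `complement_totals` is made up for illustration:

```python
def complement_totals(op):
    """Replace each element by (list total - element); keep 1-element lists as they are."""
    result = {}
    for key, values in op.items():
        if len(values) > 1:
            total = sum(values)  # computed once per list
            result[key] = [total - value for value in values]
        else:
            result[key] = list(values)
    return result

op = {'2017-05-06': [3, 7, 8], '2017-05-07': [3, 9, 10], '2017-05-08': [4]}
cleaned = complement_totals(op)
print(cleaned)
```

For lists as short as these the difference is negligible; it only matters when the lists are long.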

def get_list_manipulation(list_):
    # A single-element list is returned unchanged.
    subtraction = list_
    if len(list_) != 1:
        total = sum(list_)
        subtraction = [total - val for val in list_]
    return subtraction
# data is the input dictionary from the question.
for key, values in data.items():
    data[key] = get_list_manipulation(values)
# output:
{'2017-05-06': [15, 11, 10], '2017-05-07': [19, 13, 12], '2017-05-08': [4]}

Merge two list contained dictionary based on its index in python

I have two lists, each containing several dictionaries, and every dictionary has lists as its values. These are my lists:
list1 = [{'a':[12,22,61],'b':[21,12,50]},{'c':[10,11,47],'d':[13,20,45],'e':[11,24,42]},{'a':[12,22,61],'b':[21,12,50]}]
list2 = [{'f':[21,23,51],'g':[11,12,44]},{'h':[22,26,68],'i':[12,9,65],'j':[10,12,50]},{'f':[21,23,51],'g':[11,12,44]}]
I need to merge these lists according to these rules:
A dictionary from the first list (list1) can only be merged with
the dictionary from the second list (list2) at the same index.
After the lists are merged, each dictionary has to be sorted by the third number of its values, in descending order.
This is the expected result based on the two rules above:
result = [
{'a':[12,22,61],'f':[21,23,51],'b':[21,12,50],'g':[11,12,44]},
{'h':[22,26,68],'i':[12,9,65],'j':[10,12,50],'c':[10,11,47],'d':[13,20,45],'e':[11,24,42]},
{'a':[12,22,61],'f':[21,23,51],'b':[21,12,50],'g':[11,12,44]}
]
How can I do that? Is it possible to do this in Python with an inline loop?

Try:
[dict(a, **b) for a,b in zip(list1, list2)]

In one line (not counting the import):
from collections import OrderedDict
[OrderedDict(sorted(dict(d1.items() + d2.items()).items(), key=lambda x: x[1][-1],
reverse=True)) for d1, d2 in zip(list1, list2)]
[OrderedDict([('a', [12, 22, 61]),
('f', [21, 23, 51]),
('b', [21, 12, 50]),
('g', [11, 12, 44])]),
OrderedDict([('h', [22, 26, 68]),
('i', [12, 9, 65]),
('j', [10, 12, 50]),
('c', [10, 11, 47]),
('d', [13, 20, 45]),
('e', [11, 24, 42])]),
OrderedDict([('a', [12, 22, 61]),
('f', [21, 23, 51]),
('b', [21, 12, 50]),
('g', [11, 12, 44])])]
This works in Python 2.7. In Python 3, dict.items() returns views, which cannot be concatenated with +.

Dictionaries are not sorted by nature, so if you don't need them sorted you can merge them in a simple one-liner.
result = [ {**d1, **d2} for d1, d2 in zip(list1, list2) ] # python 3.5+
If you are using a lower version then define a merge function.
def merge(d1, d2):
    result = d1.copy()
    result.update(d2)
    return result
And then have
result = [ merge(d1, d2) for d1, d2 in zip(list1, list2) ]
If you do need them sorted, use an OrderedDict (since Python 3.7 a plain dict also preserves insertion order):
from collections import OrderedDict
def merge(d1, d2):
    tempD = d1.copy()
    tempD.update(d2)
    return OrderedDict(sorted(tempD.items(), key = lambda t: t[1][2], reverse = True))
result = [ merge(d1, d2) for d1, d2 in zip(list1, list2) ]
Or even shorter for python 3.5+ is
result = [ OrderedDict(sorted(({**d1, **d2}).items(), key = lambda t: t[1][2], reverse = True)) for d1, d2 in zip(list1, list2) ]
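On Python 3.7 or later, where plain dicts keep insertion order, the same merge-then-sort idea can be sketched without OrderedDict. This assumes every value has at least three elements; the helper name `merge_sorted` is made up for illustration.

```python
list1 = [{'a': [12, 22, 61], 'b': [21, 12, 50]},
         {'c': [10, 11, 47], 'd': [13, 20, 45], 'e': [11, 24, 42]},
         {'a': [12, 22, 61], 'b': [21, 12, 50]}]
list2 = [{'f': [21, 23, 51], 'g': [11, 12, 44]},
         {'h': [22, 26, 68], 'i': [12, 9, 65], 'j': [10, 12, 50]},
         {'f': [21, 23, 51], 'g': [11, 12, 44]}]

def merge_sorted(d1, d2):
    """Merge two dicts, then order the items by the third number of each value, descending."""
    merged = {**d1, **d2}  # on duplicate keys, d2's value wins
    ordered_items = sorted(merged.items(), key=lambda item: item[1][2], reverse=True)
    return dict(ordered_items)  # insertion order is preserved on Python 3.7+

result = [merge_sorted(d1, d2) for d1, d2 in zip(list1, list2)]
for merged in result:
    print(merged)
```

Note that `zip` stops at the shorter list, so dictionaries without a partner at the same index are dropped.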

You can get your result like this (Python 2, where items() returns lists; the merged dictionaries are not sorted):
r = map(lambda x,y:dict(x.items() + y.items()), list1, list2)
Result :
[{'a': [12, 22, 61], 'b': [21, 12, 50], 'g': [11, 12, 44], 'f': [21, 23, 51]},
{'c': [10, 11, 47], 'e': [11, 24, 42], 'd': [13, 20, 45], 'i': [12, 9, 65], 'h': [22, 26, 68], 'j': [10, 12, 50]},
{'a': [12, 22, 61], 'b': [21, 12, 50], 'g': [11, 12, 44], 'f': [21, 23, 51]}]

Dictionary with Keys inside Keys

I need to find a way to return the 1st value inside the key 'hw' for keys 1 and 2 and sum them, but I cannot think of a way. It needs to work for any number of keys, not just 1 and 2, even if there were 10 or so. The keys 1 and 2 are the students, and 'hw', 'qz', etc. are the categories of assignments. Every student will have hw, qz, ex and pr, and there will be 3 qz, 3 hw, 3 pr and 3 ex for each student. I need to return the sum of all the students' 1st hw grade, 1st quiz grade, 2nd hw grade, etc.
grades = {
1: {
'pr': [18, 15],
'hw': [16, 27, 25],
'qz': [8, 10, 5],
'ex': [83, 93],
},
2: {
'pr': [20, 18],
'hw': [17, 23, 28],
'qz': [9, 9, 8],
'ex': [84, 98],
},
}

More succinctly (and Pythonically):
hw_sum = sum([grades[key]['hw'][0] for key in grades])
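Since the question also asks for the 1st quiz grade, the 2nd hw grade and so on, the one-liner can be generalized. The following is only a sketch: the function names `sum_at` and `totals_by_category` are made up, and it assumes every student has the same categories with the same number of grades in each.

```python
grades = {
    1: {'pr': [18, 15], 'hw': [16, 27, 25], 'qz': [8, 10, 5], 'ex': [83, 93]},
    2: {'pr': [20, 18], 'hw': [17, 23, 28], 'qz': [9, 9, 8], 'ex': [84, 98]},
}

def sum_at(grades, category, index):
    """Sum one grade position (0-based) of one category across all students."""
    return sum(student[category][index] for student in grades.values())

def totals_by_category(grades):
    """For every category, sum each grade position across all students."""
    categories = next(iter(grades.values())).keys()
    return {
        category: [sum(column)
                   for column in zip(*(student[category] for student in grades.values()))]
        for category in categories
    }

print(sum_at(grades, 'hw', 0))
print(totals_by_category(grades))
```

`zip` silently truncates to the shortest list, so a student with fewer grades in a category would shorten that category's totals.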

return the 1st value inside the key 'hw' for keys 1 and 2 and sum them
It helps to use explanatory names so we know what the moving parts are meant to be. I've chosen descriptive names somewhat arbitrarily by guessing at the meaning.
Succinct solution
grades_by_year_and_subject = {
1: {
'pr': [18, 15],
'hw': [16, 27, 25],
'qz': [8, 10, 5],
'ex': [83, 93],
},
2: {
'pr': [20, 18],
'hw': [17, 23, 28],
'qz': [9, 9, 8],
'ex': [84, 98],
},
}
sum_of_grades_for_year_1_and_2_for_subject_hw = sum(
grades[0] for grades in (
grades_by_subject['hw']
for (year, grades_by_subject) in
grades_by_year_and_subject.items()
if year in [1, 2]
)
)
Breaking this into several smaller problems:
Sum a collection of values
sum_of_grades = sum(values)
Get the first value from each list in a collection (a generator rather than a set, so equal grades are not collapsed before summing)
first_grades = (
values[0] for values in collection)
Make a generator of the values for key 'hw' in each dict from a collection
generator_of_hw_value_lists = (
values_dict['hw'] for values_dict in
collection_of_dicts.values())
Reduce a dictionary only to those items with keys 1 or 2
mapping_of_values_for_key_1_and_2 = {
key: value
for (key, value) in values_dict.items()
if key in [1, 2]}
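As a rough sketch, the smaller pieces can be chained on the question's data as follows. The intermediate names are illustrative, and the first grades are collected in a list rather than a set so that two students with the same grade are both counted.

```python
grades_by_year_and_subject = {
    1: {'pr': [18, 15], 'hw': [16, 27, 25], 'qz': [8, 10, 5], 'ex': [83, 93]},
    2: {'pr': [20, 18], 'hw': [17, 23, 28], 'qz': [9, 9, 8], 'ex': [84, 98]},
}

# Step 1: keep only the items with keys 1 or 2.
selected = {
    year: grades_by_subject
    for year, grades_by_subject in grades_by_year_and_subject.items()
    if year in (1, 2)
}

# Step 2: lazily pull out the 'hw' list of each remaining entry.
hw_lists = (grades_by_subject['hw'] for grades_by_subject in selected.values())

# Step 3: take the first grade of each list (a list keeps duplicate grades).
first_hw_grades = [hw_grades[0] for hw_grades in hw_lists]

# Step 4: sum them.
total = sum(first_hw_grades)
print(first_hw_grades, total)
```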
Verbose solution
grades_by_year_and_subject = {
1: {
'pr': [18, 15],
'hw': [16, 27, 25],
'qz': [8, 10, 5],
'ex': [83, 93],
},
2: {
'pr': [20, 18],
'hw': [17, 23, 28],
'qz': [9, 9, 8],
'ex': [84, 98],
},
}
grades_for_year_1_and_2_by_subject = {
year: grades_by_subject
for (year, grades_by_subject) in
grades_by_year_and_subject.items()
if year in [1, 2]}
grades_for_year_1_and_2_for_subject_hw = (
grades_by_subject['hw']
for grades_by_subject in
grades_for_year_1_and_2_by_subject.values())
sum_of_grades_for_year_1_and_2_for_subject_hw = sum(
grades[0] for grades in grades_for_year_1_and_2_for_subject_hw)

filtering lists of lists python, how to create final list?

I have code that produces a data structure that looks like this:
{'AttributeId': '4192',
'AttributeList': '',
'ClassId': '1014 (AP)',
'InstanceId': '0',
'MessageType': '81 (GetAttributesResponse)',
'ObjectInstance': '',
'Protocol': 'BSMIS Rx',
'RDN': '',
'TransactionId': '66',
'Sequences': [[],
[1,'2013-02-26T15:01:11Z'],
[],
[10564,13,388,0,-321,83,'272','05',67,67,708,896,31,128,-12,-109,0,-20,-111,-1,-1,0],
[10564,13,108,0,-11,83,'272','05',67,67,708,1796,31,128,-12,-109,0,-20,-111,-1,-1,0],
[10589,16,388,0,-15,79,'272','05',67,67,708,8680,31,125,-16,-110,0,-20,-111,-1,-1,0],
[10589,15,108,0,-16,81,'272','05',67,67,708,8105,31,126,-14,-109,0,-20,-111,-1,-1,0],
[10637,40,233,0,-11,89,'272','03',30052,1,5,54013,33,103,-6,-76,1,-20,-111,-1,-1,0],
[10662,46,234,0,-15,85,'272','03',30052,1,5,54016,33,97,-10,-74,1,-20,-111,-1,-1,0],
[10712,51,12,0,-24,91,'272','01',4013,254,200,2973,3,62,-4,-63,0,-20,-111,-1,-1,0],
[10737,15,224,0,-16,82,'272','01',3020,21,21,40770,33,128,-13,-108,0,-20,-111,-1,-1,0],
[10762,14,450,0,-7,78,'272','01',3020,21,21,53215,29,125,-17,-113,0,-20,-111,-1,-1,0],
[10762,15,224,0,-7,85,'272','01',3020,21,21,50770,33,128,-10,-105,0,-20,-111,-1,-1,0],
[10762,14,124,0,-7,78,'272','01',3020,10,10,56880,32,128,-17,-113,0,-20,-111,-1,-1,0],
[10812,11,135,0,-14,81,'272','02',36002,1,11,43159,31,130,-14,-113,1,-20,-111,-1,-1,0],
[10837,42,23,0,-9,89,'272','02',36002,1,11,53529,31,99,-6,-74,1,-20,-111,-1,-1,0,54],
[13,'2013-02-26T15:02:09Z'],
[],
[2,12,7,0,9,70,'272','02',20003,0,0,15535,0,0,0,0,1,100,100,-1,-1,0],
[5,15,44,0,-205,77,'272','02',20003,0,0,15632,0,0,0,0,1,100,100,-1,-1,0],
[7,25,9,0,0,84,'272','02',20002,0,0,50883,0,0,0,0,1,100,100,-1,-1,0]]
}
I then filtered this down to a list of relevant values: I only wanted the first 2 elements of each entry in Sequences whose length is >= 22. I did this as follows:
len22seqs = list(filter(lambda s: len(s) >= 22, data['Sequences']))  # list() so that len() also works on Python 3
UARFCNRSSI = []
for i in range(len(len22seqs)):
    UARFCNRSSI.append([len22seqs[i][0], len22seqs[i][1]])
An example of the filtered list is:
[[10564, 15], [10564, 13], [10589, 18], [10637, 39], [10662, 38], [10712, 50], [10737, 15], [10762, 14], [10787, 9], [10812, 12], [10837, 45], [3, 17], [7, 21], [46, 26], [48, 12], [49, 24], [64, 14], [66, 17], [976, 27], [981, 22], [982, 22], [983, 17], [985, 13], [517, 9], [521, 15], [525, 11], [526, 13], [528, 14], [698, 14], [788, 24], [792, 19]]
However, I now realize that I need a third element in each of these sub-lists.
It comes from entries like this one:
[1,'2013-02-26T15:01:11Z'],
The first element of each list of length 2 should be appended, as a third element, to every filtered entry that follows it. When another list of length 2 appears, its first element should be used for the subsequent entries instead.
So my final list could look like this (note the change to 13 for the third element once another list of length 2 is found):
[[10564, 15, 1], [10564, 13, 1], [10589, 18, 1], [10637, 39, 1], [10662, 38, 1], [10837, 45, 1], [3, 17, 13], [7, 21, 13], [46, 26, 13], etc]
How do I do this? Do I have to filter twice, once with len >= 22 or len == 2 and once with just len >= 22? I don't want to append element 0 or 1 of the length-2 lists to my final list as entries of their own.

I would try to make it readable:
UARFCNRSSI = []
x = None # future "third element"; please choose a better name
for item in data["Sequences"]:
    if len(item) == 2:
        x = item[0]
    elif len(item) >= 22:
        UARFCNRSSI.append([item[0], item[1], x])

I'd go with a generator to filter your data:
def filterdata(sequences):
    add = []
    for item in sequences:
        if len(item) == 2:
            add = [item[0]]
        elif len(item) >= 22:
            yield [item[0], item[1]] + add
You can access it like data = list(filterdata(data['Sequences']))
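To try this out without the full dump, here is a sketch on a shortened input. The long rows are padded with made-up zeros to reach length 22, and the names `tag_with_marker` and `min_len` are illustrative, not taken from the posts above.

```python
def tag_with_marker(sequences, min_len=22):
    """Yield [first, second, marker] for each long row.

    marker is the first element of the most recent length-2 row seen so far.
    Long rows that come before any length-2 row are yielded without a marker.
    """
    marker = []
    for item in sequences:
        if len(item) == 2:
            marker = [item[0]]
        elif len(item) >= min_len:
            yield [item[0], item[1]] + marker

# Shortened sample: only the first two values of each long row come from the question.
padding = [0] * 20
sequences = [
    [],
    [1, '2013-02-26T15:01:11Z'],
    [],
    [10564, 13] + padding,
    [10589, 16] + padding,
    [13, '2013-02-26T15:02:09Z'],
    [],
    [2, 12] + padding,
]

tagged = list(tag_with_marker(sequences))
print(tagged)
```

Empty rows and any other row shorter than `min_len` (apart from the length-2 markers) are skipped.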
