Dictionary with Keys inside Keys - python

I need to return the first value under the key 'hw' for keys 1 and 2 and sum them, but I cannot think of a way. It needs to work for any number of keys, not just 1 and 2 but even if there were 10 or so. The keys 1 and 2 are the students; 'hw', 'qz', etc. are the categories of assignments. Every student will have hw, qz, ex and pr, and there will be 3 qz, 3 hw, 3 pr and 3 ex for each student. I need to return, across all the students, the 1st hw grade, 1st quiz grade, 2nd hw grade, and so on.
grades = {
1: {
'pr': [18, 15],
'hw': [16, 27, 25],
'qz': [8, 10, 5],
'ex': [83, 93],
},
2: {
'pr': [20, 18],
'hw': [17, 23, 28],
'qz': [9, 9, 8],
'ex': [84, 98],
},
}

More succinctly (and more Pythonically):
hw_sum = sum([grades[key]['hw'][0] for key in grades])
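The same idea can be generalized to any category and any position, for any number of students. The following is only a sketch: the helper name `sum_nth_grade` is made up for illustration, and it assumes every student has at least `index + 1` grades in the requested category.

```python
grades = {
    1: {'pr': [18, 15], 'hw': [16, 27, 25], 'qz': [8, 10, 5], 'ex': [83, 93]},
    2: {'pr': [20, 18], 'hw': [17, 23, 28], 'qz': [9, 9, 8], 'ex': [84, 98]},
}


def sum_nth_grade(grades, category, index):
    """Sum the grade at position `index` of `category` over every student."""
    # Hypothetical helper: raises IndexError if a student has too few grades.
    return sum(student[category][index] for student in grades.values())


first_hw_total = sum_nth_grade(grades, 'hw', 0)   # 16 + 17
first_qz_total = sum_nth_grade(grades, 'qz', 0)   # 8 + 9
second_hw_total = sum_nth_grade(grades, 'hw', 1)  # 27 + 23
print(first_hw_total, first_qz_total, second_hw_total)
```

Iterating over `grades.values()` means the student keys never have to be listed, so it works the same for 2 or 10 students.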

return the 1st value inside the key 'hw' for keys 1 and 2 and sum them
It helps to use explanatory names so we know what the moving parts are meant to be. I've chosen descriptive names somewhat arbitrarily by guessing at the meaning.
Succinct solution
grades_by_year_and_subject = {
1: {
'pr': [18, 15],
'hw': [16, 27, 25],
'qz': [8, 10, 5],
'ex': [83, 93],
},
2: {
'pr': [20, 18],
'hw': [17, 23, 28],
'qz': [9, 9, 8],
'ex': [84, 98],
},
}
sum_of_grades_for_year_1_and_2_for_subject_hw = sum(
grades[0] for grades in (
grades_by_subject['hw']
for (year, grades_by_subject) in
grades_by_year_and_subject.items()
if year in [1, 2]
)
)
Breaking this into several smaller problems:
Sum a collection of values
sum_of_grades = sum(values)
Get the set of first values from a collection of lists
set_of_first_grades = {
values[0] for values in collection}
Make a generator of the values for key 'hw' in each dict from a collection
generator_of_hw_value_lists = (
values_dict['hw'] for values_dict in
collection_of_dicts.values())
Reduce a dictionary only to those items with keys 1 or 2
mapping_of_values_for_key_1_and_2 = {
key: value
for (key, value) in values_dict.items()
if key in [1, 2]}
Verbose solution
grades_by_year_and_subject = {
1: {
'pr': [18, 15],
'hw': [16, 27, 25],
'qz': [8, 10, 5],
'ex': [83, 93],
},
2: {
'pr': [20, 18],
'hw': [17, 23, 28],
'qz': [9, 9, 8],
'ex': [84, 98],
},
}
grades_for_year_1_and_2_by_subject = {
year: grades_by_subject
for (year, grades_by_subject) in
grades_by_year_and_subject.items()
if year in [1, 2]}
grades_for_year_1_and_2_for_subject_hw = (
grades_by_subject['hw']
for grades_by_subject in
grades_for_year_1_and_2_by_subject.values())
sum_of_grades_for_year_1_and_2_for_subject_hw = sum(
grades[0] for grades in grades_for_year_1_and_2_for_subject_hw)

Related

Get dictionary from a dataframe

I have a dataframe like below:
df = pd.DataFrame({
'Aapl': [12, 5, 8],
'Fs': [18, 12, 8],
'Bmw': [6, 18, 12],
'Year': ['2020', '2025', '2030']
})
I want a dictionary like:
d={'2020':[12,18,6],
'2025':[5,12,18],
'2030':[8,8,12]
}
I am not able to develop the whole logic:
lst = [list(item.values()) for item in df.to_dict().values()]
dic={}
for items in lst:
    for i in items[-1]:
        dic[i] = ...  # '2020' should hold all the 0th values of the other lists, and so on
Is there any easier way using pandas ?
Convert Year to index, transpose and then in dict comprehension create lists:
d = {k: list(v) for k, v in df.set_index('Year').T.items()}
print (d)
{'2020': [12, 18, 6], '2025': [5, 12, 18], '2030': [8, 8, 12]}
Or use DataFrame.agg:
d = df.set_index('Year').agg(list, axis=1).to_dict()
print (d)
{'2020': [12, 18, 6], '2025': [5, 12, 18], '2030': [8, 8, 12]}
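For comparison, the same reshaping can be sketched without pandas. This assumes the data starts as a plain dict of columns; the names `columns` and `value_columns` are hypothetical and not part of the original question.

```python
columns = {
    'Aapl': [12, 5, 8],
    'Fs': [18, 12, 8],
    'Bmw': [6, 18, 12],
    'Year': ['2020', '2025', '2030'],
}

# Every column except 'Year' contributes one value per row.
value_columns = [name for name in columns if name != 'Year']

# zip(*...) turns the column lists into row tuples: (12, 18, 6), (5, 12, 18), ...
rows = zip(*(columns[name] for name in value_columns))

# Pair each year with its row.
d = {year: list(row) for year, row in zip(columns['Year'], rows)}
print(d)
```

This is the same transpose that `set_index('Year').T` performs, just spelled out with `zip`.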
Try this:
import pandas as pd
data = {'Name': ['Ankit', 'Amit',
'Aishwarya', 'Priyanka'],
'Age': [21, 19, 20, 18],
'Stream': ['Math', 'Commerce',
'Arts', 'Biology'],
'Percentage': [88, 92, 95, 70]}
# Convert the dictionary into DataFrame
df = pd.DataFrame(data, columns=['Name', 'Age',
'Stream', 'Percentage'])

Deriving len/min/sum of dictionary values

I have the following dictionary for which I’d like to produce a new dictionary that gives the original keys but the values are computed using len, min and sum.
scores = {
"Monday" : [21, 23, 24, 19],
"Tuesday" : [16, 15, 12, 19],
"Wednesday" : [23, 22, 23],
"Thursday": [ 18, 20, 26, 24],
"Friday": [17, 22],
"Saturday" : [22, 24],
"Sunday" : [21, 21, 28, 25]
}
I initially tried to use the following but it only returns a single value:
for k, v in scores.items():
    stat = {
        k : [len(v), min(v), sum(v)]
    }
This returns: {'Sunday': [4, 21, 95]}
Doing some research, I managed to use the following comprehension to achieve the outcome:
stats = {k : [len(v), min(v), sum(v)] for k, v in scores.items()}
This returns:
{'Monday': [4, 19, 87], 'Tuesday': [4, 12, 62], 'Wednesday': [3, 22, 68], 'Thursday': [4, 18, 88], 'Friday': [2, 17, 39], 'Saturday': [2, 22, 46], 'Sunday': [4, 21, 95]}
I would really like to understand why my first attempt only produced a single value and didn’t iterate through the entire dictionary?
I’m new to learning Python so very keen to understand the difference in methods and what I was doing incorrect with the first method.
Many thanks!
JJ
This code:
for k, v in scores.items():
    stat = {
        k : [len(v), min(v), sum(v)]
    }
is the equivalent of doing:
stat = {"Monday" : [len(scores["Monday"]), min(scores["Monday"]), sum(scores["Monday"])]}
stat = {"Tuesday" : [len(scores["Tuesday"]), min(scores["Tuesday"]), sum(scores["Tuesday"])]}
...
stat = {"Sunday" : [len(scores["Sunday"]), min(scores["Sunday"]), sum(scores["Sunday"])]}
On each iteration of the loop, you're re-assigning stat to a new single-item dictionary. The last one is Sunday, so that's the final value that stat has after the loop is finished.
If you did something like:
stat = {}
for k, v in scores.items():
    stat[k] = [len(v), min(v), sum(v)]
your end result would be an accumulation of all the values, similar to the result you get from the dictionary comprehension (which is generally the preferred way of building a dictionary through iteration).
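The difference between rebinding and accumulating can be seen side by side in the short sketch below. It uses a shortened copy of `scores`; the variable names `rebound`, `accumulated` and `comprehension` are chosen here for illustration only.

```python
scores = {
    "Monday": [21, 23, 24, 19],
    "Tuesday": [16, 15, 12, 19],
    "Sunday": [21, 21, 28, 25],
}

# Rebinding: every pass throws away the previous one-item dict.
for k, v in scores.items():
    rebound = {k: [len(v), min(v), sum(v)]}

# Accumulating: one dict is created once and gains a key per pass.
accumulated = {}
for k, v in scores.items():
    accumulated[k] = [len(v), min(v), sum(v)]

# The comprehension builds the same dict as the accumulating loop.
comprehension = {k: [len(v), min(v), sum(v)] for k, v in scores.items()}

print(rebound)                       # only the last key survives
print(accumulated == comprehension)  # True
```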
Your script does iterate over the whole dictionary,
but you are overwriting the result dictionary on each pass.
Try something like:
stats = []
for k, v in scores.items():
    stat = {
        k : [len(v), min(v), sum(v)]
    }
    stats.append(stat)
print(stats)

How can I create a new nested dictionary from an old nested dictionary with updated keys and value?

Currently, I am trying to create a new dictionary from an unusable dictionary but am having difficulty updating/creating new keys based on the number of tuples in the 'Acceptable' key.
dirty_dictionary = {'DOG': {'Acceptable': ([[35, 38]], 'DOG_GROUP'),
'Unacceptable': ([[2], [29], [44], [50], [54], [60]], 'DOG_GROUP')},
'CAT': {'Acceptable': ([[3, 6], [100, 101]], 'CAT_GROUP'), 'Unacceptable': ([[12], [18], [45], [51], [61]], 'CAT_GROUP')},
'FISH': {'Acceptable': ([], 'FISH_GROUP'), 'Unacceptable': ([[13], [19], [22], [28], [34]], 'FISH_GROUP')},
'COW': {'Acceptable': ([[87, 69]], 'COW_GROUP'), 'Unacceptable': ([], 'COW_GROUP')}}
new_dict = {}
for key, values in dirty_dictionary.items():
    if len(values['Acceptable'][0]) == 0:
        new_dict[dirty_dictionary[key]['Acceptable'][-1]] = {'Acceptable': []}
    for oids in values['Acceptable'][0]:
        if len(values['Acceptable'][0]) == 1:
            new_dict[dirty_dictionary[key]['Acceptable'][-1]] = {'Acceptable': oids}
        if len(values['Acceptable'][0]) > 1:
            for i in range(len(values['Acceptable'][0])):
                new_dict[dirty_dictionary[key]['Acceptable'][-1] + F'_{i}'] = {'Acceptable': values['Acceptable'][0][i]}
    # for oids in values['Unacceptable'][0]:
    #     if len(values['Unacceptable'][0]) == 1:
    #         new_dict[dirty_dictionary[key]['Unacceptable'][-1]].update({'Unacceptable': oids})
    #     if len(values['Unacceptable'][0]) > 1:
    #         for i in range(len(values['Unacceptable'][0])):
    #             new_dict[dirty_dictionary[key]['Unacceptable'][-1] + F'_{i}'].update({'Unacceptable': values['Unacceptable'][0][i]})
print(new_dict)
I can create a new dictionary with all 'Acceptable' Keys/Values, but I am stuck on updating the dictionary with the 'Unacceptable' since new groups need to be created if the len of values['Acceptable'][0] > 1.
The goal is to get the final dictionary to look like:
final = {'DOG_GROUP': {'Acceptable': [35, 38], 'Unacceptable': [2, 29, 44, 50, 54, 60]},
'CAT_GROUP_0': {'Acceptable': [3, 6], 'Unacceptable': []},
'CAT_GROUP_1': {'Acceptable': [100, 101], 'Unacceptable': [12, 18, 45, 51, 61]},
'FISH_GROUP': {'Acceptable': [], 'Unacceptable': [13, 19, 22, 28, 34]},
'COW_GROUP': {'Acceptable': [87, 69], 'Unacceptable': []}}
Try this one.
dirty_dictionary = {'DOG': {'Acceptable': ([[35, 38]], 'DOG_GROUP'), 'Unacceptable': ([[2], [29], [44], [50], [54], [60]], 'DOG_GROUP')}, 'CAT': {'Acceptable': ([[3, 6], [100, 101]], 'CAT_GROUP'), 'Unacceptable': ([[12], [18], [45], [51], [61]], 'CAT_GROUP')}, 'FISH': {'Acceptable': ([], 'FISH_GROUP'), 'Unacceptable': ([[13], [19], [22], [28], [34]], 'FISH_GROUP')}}
new_dict = {}
# assign text strings in variable to get suggestion in IDE
acceptable = 'Acceptable'
unacceptable = 'Unacceptable'
for key, values in dirty_dictionary.items():
    group = values[acceptable][1]
    if len(values[acceptable][0]) <= 1:
        new_dict[group] = {}
        new_dict[group][acceptable] = [y for x in values[acceptable][0] for y in x]
        new_dict[group][unacceptable] = [y for x in values[unacceptable][0] for y in x]
    else:
        for idx, item in enumerate(values[acceptable][0]):
            # 0-based suffix, matching the desired CAT_GROUP_0 / CAT_GROUP_1 keys
            group_temp = group + '_' + str(idx)
            new_dict[group_temp] = {}
            new_dict[group_temp][acceptable] = item
            # if last item then give all unacceptable as a single array
            if idx == len(values[acceptable][0]) - 1:
                new_dict[group_temp][unacceptable] = [y for x in values[unacceptable][0] for y in x]
            else: # else empty array
                new_dict[group_temp][unacceptable] = []
print(new_dict)
This should work as described:
dirty_dictionary = {'DOG': {'Acceptable': ([[35, 38]], 'DOG_GROUP'),
'Unacceptable': ([[2], [29], [44], [50], [54], [60]], 'DOG_GROUP')},
'CAT': {'Acceptable': ([[3, 6], [100, 101]], 'CAT_GROUP'), 'Unacceptable': ([[12], [18], [45], [51], [61]], 'CAT_GROUP')},
'FISH': {'Acceptable': ([], 'FISH_GROUP'), 'Unacceptable': ([[13], [19], [22], [28], [34]], 'FISH_GROUP')}}
new_dict = {}
for _, next_group in dirty_dictionary.items():
    next_group_name = next_group['Acceptable'][1]
    if len(next_group['Acceptable'][0]) > 1: # Split the Acceptables across several keys
        for i, next_acceptable in enumerate(next_group['Acceptable'][0]):
            new_dict[f"{next_group_name}_{i}"] = {'Acceptable': next_acceptable, 'Unacceptable': []}
        # After the loop, i is the last index, so only the last split group gets the Unacceptables
        new_dict[f'{next_group_name}_{i}']['Unacceptable'] = [next_entry for unacceptable in next_group['Unacceptable'][0] for next_entry in unacceptable]
    else: # Nothing else to consider
        new_dict[f'{next_group_name}'] = {
            'Acceptable': [next_entry for acceptable in next_group['Acceptable'][0] for next_entry in acceptable],
            'Unacceptable': [next_entry for unacceptable in next_group['Unacceptable'][0] for next_entry in unacceptable]
        }
for k, v in new_dict.items():
    print(f'{k}: {v}')
Returns the following result:
DOG_GROUP: {'Acceptable': [35, 38], 'Unacceptable': [2, 29, 44, 50, 54, 60]}
CAT_GROUP_0: {'Acceptable': [3, 6], 'Unacceptable': []}
CAT_GROUP_1: {'Acceptable': [100, 101], 'Unacceptable': [12, 18, 45, 51, 61]}
FISH_GROUP: {'Acceptable': [], 'Unacceptable': [13, 19, 22, 28, 34]}
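The same splitting rule can also be wrapped in a function and checked against the goal from the question, including the COW entry. This is a sketch, not a definitive implementation: `split_groups` is a hypothetical name, and it assumes (as the desired output suggests) that when a group is split, all 'Unacceptable' values go to the last split key.

```python
from itertools import chain

dirty_dictionary = {
    'DOG': {'Acceptable': ([[35, 38]], 'DOG_GROUP'),
            'Unacceptable': ([[2], [29], [44], [50], [54], [60]], 'DOG_GROUP')},
    'CAT': {'Acceptable': ([[3, 6], [100, 101]], 'CAT_GROUP'),
            'Unacceptable': ([[12], [18], [45], [51], [61]], 'CAT_GROUP')},
    'FISH': {'Acceptable': ([], 'FISH_GROUP'),
             'Unacceptable': ([[13], [19], [22], [28], [34]], 'FISH_GROUP')},
    'COW': {'Acceptable': ([[87, 69]], 'COW_GROUP'),
            'Unacceptable': ([], 'COW_GROUP')},
}


def split_groups(dirty):
    """Hypothetical helper: one output key per 'Acceptable' pair."""
    result = {}
    for entry in dirty.values():
        pairs, name = entry['Acceptable']
        # Flatten [[2], [29], ...] into [2, 29, ...]
        unacceptable = list(chain.from_iterable(entry['Unacceptable'][0]))
        if len(pairs) <= 1:
            # Zero or one pair: keep the plain group name.
            result[name] = {
                'Acceptable': list(chain.from_iterable(pairs)),
                'Unacceptable': unacceptable,
            }
            continue
        last = len(pairs) - 1
        for i, pair in enumerate(pairs):
            # Assumption: only the last split key receives the Unacceptables.
            result[f'{name}_{i}'] = {
                'Acceptable': list(pair),
                'Unacceptable': unacceptable if i == last else [],
            }
    return result


new_dict = split_groups(dirty_dictionary)
for k, v in new_dict.items():
    print(f'{k}: {v}')
```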
Test data:
dirty_dictionary = {
"DOG": {
"Acceptable": ([[35, 38]], "DOG_GROUP"),
"Unacceptable": ([[2], [29], [44], [50], [54], [60]], "DOG_GROUP"),
},
"CAT": {
"Acceptable": ([[3, 6], [100, 101]], "CAT_GROUP"),
"Unacceptable": ([[12], [18], [45], [51], [61]], "CAT_GROUP"),
},
"FISH": {
"Acceptable": ([], "FISH_GROUP"),
"Unacceptable": ([[13], [19], [22], [28], [34]], "FISH_GROUP"),
},
"COW": {"Acceptable": ([[87, 69]], "COW_GROUP"), "Unacceptable": ([], "COW_GROUP")},
}
Solution:
import itertools
flatten = itertools.chain.from_iterable
new_dict = {}
for key, value in dirty_dictionary.items():
    accept_status = {"Acceptable": [], "Unacceptable": []}
    if value["Acceptable"][0]:
        accept_status["Acceptable"] = list(flatten(value["Acceptable"][0]))
    if value["Unacceptable"][0]:
        accept_status["Unacceptable"] = list(flatten(value["Unacceptable"][0]))
    new_dict[key] = accept_status
Output:
{'DOG': {'Acceptable': [35, 38], 'Unacceptable': [2, 29, 44, 50, 54, 60]},
'CAT': {'Acceptable': [3, 6, 100, 101], 'Unacceptable': [12, 18, 45, 51, 61]},
'FISH': {'Acceptable': [], 'Unacceptable': [13, 19, 22, 28, 34]},
'COW': {'Acceptable': [87, 69], 'Unacceptable': []}}
Because you create the keys from 'Acceptable', try this code:
new_dict = {}
for key, values in dirty_dictionary.items():
    if len(values['Acceptable'][0]) == 0:
        new_dict[dirty_dictionary[key]['Acceptable'][-1]] = {'Acceptable': []}
    for oids in values['Acceptable'][0]:
        if len(values['Acceptable'][0]) == 1:
            new_dict[dirty_dictionary[key]['Acceptable'][-1]] = {'Acceptable': oids}
        if len(values['Acceptable'][0]) > 1:
            for i in range(len(values['Acceptable'][0])):
                new_dict[dirty_dictionary[key]['Acceptable'][-1] + F'_{i}'] = {'Acceptable': values['Acceptable'][0][i]}
    for oids in values['Unacceptable'][0]:
        if len(values['Unacceptable'][0]) == 1:
            new_dict[dirty_dictionary[key]['Unacceptable'][-1]].update({'Unacceptable': oids})
        if len(values['Unacceptable'][0]) > 1:
            # Flatten all Unacceptable sub-lists into one list
            Unacceptable = []
            for i in range(len(values['Unacceptable'][0])):
                for j in range(len(values['Unacceptable'][0][i])):
                    Unacceptable.append(values['Unacceptable'][0][i][j])
            if len(values['Acceptable'][0]) > 1:
                # Every split group receives the full Unacceptable list
                for k in range(len(values['Acceptable'][0])):
                    new_dict[dirty_dictionary[key]['Acceptable'][-1] + F'_{k}'].update({'Unacceptable': Unacceptable})
            else:
                new_dict[dirty_dictionary[key]['Unacceptable'][-1]].update({'Unacceptable': Unacceptable})
print(new_dict)
and the output is:
{'DOG_GROUP': {'Acceptable': [35, 38], 'Unacceptable': [2, 29, 44, 50, 54, 60]},
'CAT_GROUP_0': {'Acceptable': [3, 6], 'Unacceptable': [12, 18, 45, 51, 61]},
'CAT_GROUP_1': {'Acceptable': [100, 101], 'Unacceptable': [12, 18, 45, 51, 61]},
'FISH_GROUP': {'Acceptable': [], 'Unacceptable': [13, 19, 22, 28, 34]}}

Aggregation in array element - python

I have this output (OP): {'2017-05-06': [3, 7, 8],'2017-05-07': [3, 9, 10],'2017-05-08': [4]}
From it I want to derive another output:
{'2017-05-06': [15, 11, 10],'2017-05-07': [19, 13, 12],'2017-05-08': [4]}
which means:
For example, for the key 2017-05-06 in cleand,
the element total is 18, so each element becomes the total minus itself: '2017-05-06': [18-3, 18-7, 18-8] = '2017-05-06': [15, 11, 10]
and likewise for the data under every other key. A single-element list such as [4] stays unchanged.
So the final output is {'2017-05-06': [15, 11, 10],'2017-05-07': [19, 13, 12],'2017-05-08': [4]}
How to do this?
Note : I am using python 3.6.2 and pandas 0.22.0
code so far :
import pandas as pd
dfs = pd.read_excel('ff2.xlsx', sheet_name=None)
dfs1 = {i:x.groupby(pd.to_datetime(x['date']).dt.strftime('%Y-%m-%d'))['duration'].sum() for i, x in dfs.items()}
d = pd.concat(dfs1).groupby(level=1).apply(list).to_dict()
actuald = pd.concat(dfs1).div(80).astype(int)
sum1 = actuald.groupby(level=1).transform('sum')
m = actuald.groupby(level=1).transform('size') > 1
cleand = sum1.sub(actuald).where(m, actuald).groupby(level=1).apply(list).to_dict()
print (cleand)
Starting from cleand, how can I do this?
In a compact (but somewhat inefficient) way:
>>> op = {'2017-05-06': [3, 7, 8],'2017-05-07': [3, 9, 10],'2017-05-08': [4]}
>>> { x:[sum(y)-i for i in y] if len(y)>1 else y for x,y in op.items() }
#output:
{'2017-05-06': [15, 11, 10], '2017-05-07': [19, 13, 12], '2017-05-08': [4]}
def get_list_manipulation(list_):
    subtraction = list_
    if len(list_) != 1:
        total = sum(list_)
        subtraction = [total-val for val in list_]
    return subtraction
for key, values in data.items():
    data[key] = get_list_manipulation(values)
>>>{'2017-05-06': [15, 11, 10], '2017-05-07': [19, 13, 12], '2017-05-08': [4]}
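A self-contained variant of the same idea is sketched below, with the input dict defined inline. It builds a new dict instead of modifying the input in place. The function name `others_total` is hypothetical.

```python
def others_total(values):
    """Replace each element with the sum of all the other elements."""
    if len(values) <= 1:
        # A single value has no "others", so it is kept as it is.
        return list(values)
    total = sum(values)  # computed once per list, not once per element
    return [total - value for value in values]


op = {'2017-05-06': [3, 7, 8], '2017-05-07': [3, 9, 10], '2017-05-08': [4]}
result = {day: others_total(values) for day, values in op.items()}
print(result)
```

Computing `sum(values)` once per list avoids the repeated summing that makes the one-line comprehension less efficient on long lists.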

filtering lists of lists python, how to create final list?

I have code that produces a data structure that looks like this:
{'AttributeId': '4192',
'AttributeList': '',
'ClassId': '1014 (AP)',
'InstanceId': '0',
'MessageType': '81 (GetAttributesResponse)',
'ObjectInstance': '',
'Protocol': 'BSMIS Rx',
'RDN': '',
'TransactionId': '66',
'Sequences': [[],
[1,'2013-02-26T15:01:11Z'],
[],
[10564,13,388,0,-321,83,'272','05',67,67,708,896,31,128,-12,-109,0,-20,-111,-1,-1,0],
[10564,13,108,0,-11,83,'272','05',67,67,708,1796,31,128,-12,-109,0,-20,-111,-1,-1,0],
[10589,16,388,0,-15,79,'272','05',67,67,708,8680,31,125,-16,-110,0,-20,-111,-1,-1,0],
[10589,15,108,0,-16,81,'272','05',67,67,708,8105,31,126,-14,-109,0,-20,-111,-1,-1,0],
[10637,40,233,0,-11,89,'272','03',30052,1,5,54013,33,103,-6,-76,1,-20,-111,-1,-1,0],
[10662,46,234,0,-15,85,'272','03',30052,1,5,54016,33,97,-10,-74,1,-20,-111,-1,-1,0],
[10712,51,12,0,-24,91,'272','01',4013,254,200,2973,3,62,-4,-63,0,-20,-111,-1,-1,0],
[10737,15,224,0,-16,82,'272','01',3020,21,21,40770,33,128,-13,-108,0,-20,-111,-1,-1,0],
[10762,14,450,0,-7,78,'272','01',3020,21,21,53215,29,125,-17,-113,0,-20,-111,-1,-1,0],
[10762,15,224,0,-7,85,'272','01',3020,21,21,50770,33,128,-10,-105,0,-20,-111,-1,-1,0],
[10762,14,124,0,-7,78,'272','01',3020,10,10,56880,32,128,-17,-113,0,-20,-111,-1,-1,0],
[10812,11,135,0,-14,81,'272','02',36002,1,11,43159,31,130,-14,-113,1,-20,-111,-1,-1,0],
[10837,42,23,0,-9,89,'272','02',36002,1,11,53529,31,99,-6,-74,1,-20,-111,-1,-1,0,54],
[13,'2013-02-26T15:02:09Z'],
[],
[2,12,7,0,9,70,'272','02',20003,0,0,15535,0,0,0,0,1,100,100,-1,-1,0],
[5,15,44,0,-205,77,'272','02',20003,0,0,15632,0,0,0,0,1,100,100,-1,-1,0],
[7,25,9,0,0,84,'272','02',20002,0,0,50883,0,0,0,0,1,100,100,-1,-1,0]]
}
I then filtered this down to make a list of relevant values, I only wanted the first 2 elements of Sequences if the length was >=22. I did this as follows:
# list() keeps len() and indexing working on Python 3, where filter() returns an iterator
len22seqs = list(filter(lambda s: len(s)>=22, data['Sequences']))
UARFCNRSSI = []
for i in range(len(len22seqs)):
    UARFCNRSSI.append([len22seqs[i][0], len22seqs[i][1]])
An example of the filtered list is:
[[10564, 15], [10564, 13], [10589, 18], [10637, 39], [10662, 38], [10712, 50], [10737, 15], [10762, 14], [10787, 9], [10812, 12], [10837, 45], [3, 17], [7, 21], [46, 26], [48, 12], [49, 24], [64, 14], [66, 17], [976, 27], [981, 22], [982, 22], [983, 17], [985, 13], [517, 9], [521, 15], [525, 11], [526, 13], [528, 14], [698, 14], [788, 24], [792, 19]]
However I now note that I need a third element in each of these sub-lists.
That is this:
[1,'2013-02-26T15:01:11Z'],
I need the first element of every list with length of 2 to be appended to this filtered list as a third element, for the elements that follow. But when there is a new list with length 2 then I need that new value to be appended to the subsequent entries.
So my final list example could look like, note the change to 13 for the third element upon finding another list with length 2:
[[10564, 15, 1], [10564, 13, 1], [10589, 18, 1], [10637, 39, 1], [10662, 38, 1], [10837, 45, 1], [3, 17, 13], [7, 21, 13], [46, 26, 13], etc]
How do I do this? Do I have to filter twice, once with len >= 22 or len == 2 and once with just len >= 22? I wouldn't want to append element 0 or 1 of the length-2 lists to my final list.
I would try to make it readable:
UARFCNRSSI = []
x = None # future "third element"; please choose a better name
for item in data["Sequences"]:
    if len(item) == 2:
        x = item[0]
    elif len(item) >= 22:
        UARFCNRSSI.append([item[0], item[1], x])
I'd go with a generator to filter your data:
def filterdata(sequences):
    add = []
    for item in sequences:
        if len(item) == 2:
            add = [item[0]]
        elif len(item) >= 22:
            yield [item[0], item[1]] + add
You can access it like data = list(filterdata(data['Sequences']))
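To see the generator in action, here is a small run on shortened, made-up sample data. The padded rows only mimic the shape of the real 'Sequences' entries (a length-2 marker row followed by rows of 22 items); the values after the first two are placeholders.

```python
def filterdata(sequences):
    add = []  # holds the first element of the most recent length-2 row
    for item in sequences:
        if len(item) == 2:
            add = [item[0]]
        elif len(item) >= 22:
            yield [item[0], item[1]] + add


# Made-up sample: each long row is padded with zeros to 22 items.
sample = [
    [],
    [1, '2013-02-26T15:01:11Z'],
    [],
    [10564, 13] + [0] * 20,
    [10589, 16] + [0] * 20,
    [13, '2013-02-26T15:02:09Z'],
    [],
    [2, 12] + [0] * 20,
]

result = list(filterdata(sample))
print(result)
```

Empty rows match neither branch and are skipped, and the third element switches from 1 to 13 as soon as the second length-2 row is seen.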
