Average values in multiple dictionaries? - python

I have four dictionaries, each with a symbol as the key and its LTP as the value. I want to create a new dictionary with the symbol as the key and the average of its LTP across the four dictionaries as the value.
first = {"MRF":40000,"RELIANCE":1000}
second = {"MRF":50000,"RELIANCE":2000}
third = {"MRF":30000,"RELIANCE":500}
fourth = {"MRF":60000,"RELIANCE":4000}
new = {"MRF":45000,"RELIANCE":1875} # this is the average of ltp
Could someone suggest a way to do this?

We can get this using the mean function from the statistics module and a list comprehension.
Note: this assumes the keys in all the dictionaries are the same.
Note: the code below is written for Python 3.x.
Here is the code:
from statistics import mean
first = {"MRF":40000,"RELIANCE":1000}
second = {"MRF":50000,"RELIANCE":2000}
third = {"MRF":30000,"RELIANCE":500}
fourth = {"MRF":60000,"RELIANCE":4000}
dictionaryList = [first,second,third,fourth]
new = {}
for key in first.keys():
    new[key] = mean([d[key] for d in dictionaryList])
print(new)
It produces exactly the result you asked for:
{'MRF': 45000, 'RELIANCE': 1875}
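The code above assumes every dictionary has the same keys. As a minimal sketch of one way to relax that assumption, the hypothetical helper below (average_by_key is an illustrative name, not part of the original answer) averages each symbol over only the dictionaries that contain it. The fifth dictionary, with an extra symbol, is made-up sample data:

```python
from statistics import mean

def average_by_key(dicts):
    """Average each key over the dictionaries that actually contain it.

    Hypothetical helper for illustration; a key missing from a dictionary
    is skipped rather than counted as zero.
    """
    all_keys = set().union(*dicts)  # every key seen in any dictionary
    return {k: mean(d[k] for d in dicts if k in d) for k in all_keys}

first = {"MRF": 40000, "RELIANCE": 1000}
second = {"MRF": 50000, "RELIANCE": 2000}
third = {"MRF": 30000, "RELIANCE": 500}
fourth = {"MRF": 60000, "RELIANCE": 4000}
extra = {"MRF": 45000, "TCS": 3000}  # made-up data: lacks RELIANCE, adds TCS

averages = average_by_key([first, second, third, fourth, extra])
print(averages)
```

Whether a missing symbol should be skipped or treated as zero depends on what the average is meant to represent, so that choice is worth making explicitly.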

first = {"MRF":40000,"RELIANCE":1000}
second = {"MRF":50000,"RELIANCE":2000}
third = {"MRF":30000,"RELIANCE":500}
fourth = {"MRF":60000,"RELIANCE":4000}
dicts = [first, second, third, fourth]
keys = first.keys()
new = {k: sum(d[k] for d in dicts) / len(dicts) for k in keys}
print(new)  # {'MRF': 45000.0, 'RELIANCE': 1875.0}

Related

Find max value of a column based on another in python

I have a 2D list, main_record, as follows. It shows the number of times each student topped the exams:
['student1',1]
['student2',1]
['student2',2]
['student1',5]
['student3',3]
I have another list, students_enrolled, of unique students:
['student1','student2','student3']
I want to display the student ranking based on their distinctions, as a list student_ranking:
['student1','student3','student2']
Which built-in functions would be useful? I could not work out how to phrase this as a search query. In other words, I need the Python equivalent of the following queries:
select max(main_record[1]) where name = student1 >>> result = 5
select max(main_record[1]) where name = student2 >>> result = 2
select max(main_record[1]) where name = student3 >>> result = 3
Build a dict keyed by student name that stores the maximum value seen for each student, then sort students_enrolled by each student's maximum value.
from collections import defaultdict
main_record = [['student1',1], ['student2',1], ['student2',2], ['student1',5], ['student3',3]]
students_enrolled = ['student1','student2','student3']
# define a dict that defaults to negative infinity and keep the max seen for each student
tmp_dct = defaultdict(lambda: float('-inf'))
for lst in main_record:
    k, v = lst
    tmp_dct[k] = max(tmp_dct[k], v)
print(tmp_dct)
students_enrolled.sort(key = lambda x: tmp_dct[x], reverse=True)
print(students_enrolled)
Output:
# tmp_dct =>
defaultdict(<function <lambda> at 0x7fd81044b1f0>,
{'student1': 5, 'student2': 2, 'student3': 3})
# students_enrolled after sorting
['student1', 'student3', 'student2']
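For comparison, the SQL-like queries in the question map fairly directly onto max() with a generator expression. This is only a sketch: best_score is a hypothetical helper name, and it rescans main_record once per student, which is fine for small lists but slower than the single pass above:

```python
main_record = [['student1', 1], ['student2', 1], ['student2', 2],
               ['student1', 5], ['student3', 3]]
students_enrolled = ['student1', 'student2', 'student3']

def best_score(name):
    """Rough equivalent of: select max(score) where name = <name>."""
    # default=0 is an assumption for students with no rows in main_record
    return max((score for n, score in main_record if n == name), default=0)

# sorted() returns a new list, so students_enrolled is left unchanged
student_ranking = sorted(students_enrolled, key=best_score, reverse=True)
print(student_ranking)
```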
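If it is a 2D list it should look like this: l = [["student1", 2], ["student2", 3], ["student3", 4]]. To go through the rows from the highest numeric value in the 2nd column to the lowest, you can use a loop like this:
l = [["student1", 2], ["student2", 3], ["student3", 4]]
numbers = []
for student in l:
    numbers.append(student[1])
# visit the row indices from the highest number to the lowest,
# without removing anything from a list while looping over it
order = sorted(range(len(numbers)), key=lambda i: numbers[i], reverse=True)
for student_index in order:
    print(l[student_index], numbers[student_index])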

how to store all values of loop in list of dictionary python

dic = {}
list =[]
def bathSum():
    for ZPRODHDR in root.iter('ZPRODHDR'):
        for ZPRODITM in ZPRODHDR.iter('ZPRODITM'):
            component = ZPRODITM.findtext('COMPONENT')
            quantity = ZPRODITM.findtext('QUANTITY')
            component = int(component)
            quantity = float(quantity)
            dic[component] = quantity
        list.append(dic)
        print('dictionary print', dic)
    return(list)
print('list of dictionaries', bathSum())
How can I get a list of dictionaries that matches what 'dictionary print' shows? It seems to overwrite the values in every dictionary with those from the last loop iteration.
dictionary print: {60943240: 814.0, 60943245: 557.0}
dictionary print: {60943240: 793.0, 60943245: 482.0}
list of dictionaries: [{60943240: 793.0, 60943245: 482.0}, {60943240: 793.0, 60943245: 482.0}]
The problem is that you are not resetting your dictionary dic on each iteration of the ZPRODHDR loop. Every iteration updates the same original dictionary, overwriting the old values, and the list ends up holding several references to that one dictionary.
dic is an intermediate variable that should start out empty on each iteration.
You need to move the dic declaration inside your first for loop.
list =[]
def bathSum():
    for ZPRODHDR in root.iter('ZPRODHDR'):
        dic = {}
        for ZPRODITM in ZPRODHDR.iter('ZPRODITM'):
            component = ZPRODITM.findtext('COMPONENT')
            quantity = ZPRODITM.findtext('QUANTITY')
            component = int(component)
            quantity = float(quantity)
            dic[component] = quantity
        list.append(dic)
        print('dictionary print', dic)
    return(list)
print('list of dictionaries', bathSum())
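The underlying behaviour can be reproduced without any XML. The sketch below uses made-up rows in place of the parsed ZPRODHDR/ZPRODITM elements and shows that appending the same dictionary object twice stores two references to one dictionary, while creating a new dictionary per outer iteration keeps the results separate:

```python
# Made-up stand-in for the parsed XML: one inner list per ZPRODHDR
headers = [
    [(60943240, 814.0), (60943245, 557.0)],
    [(60943240, 793.0), (60943245, 482.0)],
]

# Shared dictionary: every append stores a reference to the same object
shared = {}
shared_result = []
for items in headers:
    for component, quantity in items:
        shared[component] = quantity
    shared_result.append(shared)

# Fresh dictionary per outer iteration: each append stores a distinct object
fresh_result = []
for items in headers:
    current = {}
    for component, quantity in items:
        current[component] = quantity
    fresh_result.append(current)

print(shared_result[0] is shared_result[1])  # True: one object, listed twice
print(shared_result)
print(fresh_result)
```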

Creating python dictionaries using for loop

I make a bunch of matrices that I want to store in python dictionaries and I always find myself typing the same thing for every state that I want to build, i.e.
Ne21_1st_state = {}
Ne21_2nd_state = {}
Ne21_3rd_state = {}
Ne21_4th_state = {}
Ne21_5th_state = {}
Ne21_6th_state = {}
...
Ne21_29th_state = {}
Ne21_30th_state = {}
Can somebody help me automate this using python for loops?
Thanks in advance!
I want something like this:
for i in range(3, 11):
    states = f'Ar36_{i}th_state'
    print(states)
where the output would be:
Ar36_3th_state
Ar36_4th_state
Ar36_5th_state
Ar36_6th_state
Ar36_7th_state
Ar36_8th_state
Ar36_9th_state
Ar36_10th_state
but instead of printing the names, it would create individual dictionaries named Ar36_3th_state, Ar36_4th_state, Ar36_5th_state, ...
Couldn't we make a list of dictionaries instead?
A list of 30 (or any N) elements, where each element is a dictionary with key = "Ar36_{i}th_state" and value = {whatever value you want}.
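As a sketch of a closely related idea, the states can live in one outer dictionary keyed by the generated name instead of in separately named variables. The ordinal helper below is hypothetical (not from the question) and only exists to produce suffixes such as 1st, 2nd, 3rd and 11th; the Ne21 prefix and the count of 30 states are taken from the question:

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1st, 2nd, 3rd, 11th."""
    if 10 <= n % 100 <= 20:
        suffix = "th"  # 11th, 12th, 13th, ...
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

# One outer dictionary; each value is a new, independent empty dictionary
states = {f"Ne21_{ordinal(i)}_state": {} for i in range(1, 31)}

# Example with a placeholder matrix (made-up values)
states["Ne21_3rd_state"]["matrix"] = [[1, 0], [0, 1]]
print(list(states)[:3])
print(states["Ne21_3rd_state"])
```

Looking a state up by its name, as in states["Ne21_3rd_state"], then takes the place of a separately named variable.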
You can build the "name" of a pseudo-variable and use it as a key in a dictionary, like this:
my_dic = {1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e'}
solution = {}
for i in range(1, 31):
    name = 'Ne21_'+str(i)+'st_state'
    #solution[name] = dict(my_dic)
    solution[name] = {}  # a fresh dict per name, so the entries do not share one object
for pseudo_variable in solution:
    print(pseudo_variable, solution[pseudo_variable])
print(solution['Ne21_16st_state'])
for pseudo_variable in solution:
    if '_16st' in pseudo_variable:
        print(pseudo_variable, solution[pseudo_variable])
One way I've done this is using list comprehension.
key = list(
    str(input(f"Please enter a Key for value {x + 1}: "))
    if x == 0
    else str(input(f"\nPlease enter a Key for value {x + 1}: "))
    for x in range(3))
value = list(str(input(f"\nPlease enter a Bool for value {x + 1}: "))
             for x in range(3))
BoolValues = dict(zip(key, value))
I first create a list of keys, followed by a list of the values to be stored under those keys. Then I just zip them together into a dictionary. The conditional in the first list is only there for a slightly better user experience: \n is added to every prompt after the first input.
Actually, now that I look back at the question, it may be slightly different from what I was thinking. Are you trying to create a new dictionary for every matrix? If so, is it something similar to this: How do you create different variable names while in a loop?

if two people have same score how to return both names connected by "and"

I'm calculating the average score of each person from a two-dimensional list, and I want to know how to return two people with the same score joined by "and", e.g. name and name.
My code:
def bestAverage(inputDict):
    dic = {}
    for i in inputDict:
        if i[0] in dic.keys():
            dic[i[0]].append(int(i[1]))
        else:
            dic[i[0]] = [int(i[1])]
    totle_score = 0
    print(dic)
    for key, value, in dic.items():
        for c in value:
            totle_score += int(c)
        Q = len(value)
        avrage = totle_score / Q
        dic[key]= [avrage]
    print(dic)
My input:
inputDict = [ ["Diane", 20],["Bion",25],["Jack","30"],["Diane","50"] ]
result = bestAverage(inputDict)
OUTCOME:
{'Diane': [35.0], 'Bion': [95.0], 'Jack': [125.0]}
Using the sorted dictionary, you can get the dictionary you want.
Sorry, I think my code is a bit complicated.
dic = {'Diane': [35.0],
       'Bion': [95.0],
       'Jack': [125.0],
       'Diane_2': [35.0],
       'Bion_2':[95],
       'Diane_3':[35.0],
       'John':[10]}
import operator
# Sorting by name puts equal scores next to each other only because the similar
# names in this sample sort together; for general input, sort by score instead
# with key=operator.itemgetter(1).
sorted_dic = sorted(dic.items(), key=operator.itemgetter(0))
new_dic = dict()
preKey = sorted_dic[0][0]
preValue = sorted_dic[0][1]
nms = preKey
for key,value in sorted_dic[1:]:
    if(value == preValue):
        nms += ' and ' + key
    else:
        new_dic[nms] = preValue
        preKey = key
        preValue = value
        nms = preKey
new_dic[nms] = preValue
print(new_dic)
OUTCOME:
{'Jack': [125.0], 'John': [10], 'Diane and Diane_2 and Diane_3':
[35.0], 'Bion and Bion_2': [95.0]}
Per the OP's question in the comments, this example now produces a final structure containing entries only for those scores that several people share.
data = {'Diane': [35.0], 'Bion': [95.0], 'Jack': [125.0], 'Sam': [95.0]}
# Here, we create a dict of lists, where the keys are the scores, and the values
# are the names of each person who has that score. This will produce:
#
# {
#     35.0: ['Diane'],
#     95.0: ['Bion', 'Sam'],
#     125.0: ['Jack']
# }
collected = {}
# For each key (name) in the input dict...
for name in data:
    # Get the score value out of the array for this name
    val = data[name][0]
    # If we don't have an entry in our new dict for this score (no key in the dict of that
    # score value) then add that entry as the score for the key and an empty array for the value
    if val not in collected:
        collected[val] = []
    # Now that we're sure we have an entry for the score of the name we're processing, add
    # the name to the array for that score in the new dict
    collected[val].append(name)
# Now we just "flip" each entry in the 'collected' map to create a new dict. We create
# one entry in this dict for each entry in the 'collected' map, where each key is a
# single string where we've combined all of the names with the same score, separated
# by 'and', and each value is the score that those names had.
result = {}
# Now iterate over each of our keys, the unique scores, in our new 'collected' dict...
for val in collected:
    # We only want to create an entry in the new dict if the entry we're processing has more than
    # just one name in the list of names. So here, we check for that, and skip adding an entry to
    # the new dict if there is only one name in the list
    if len(collected[val]) == 1:
        continue
    # Combine the value of this entry, the list of names with a particular score, into a single string
    combinedNames = " and ".join(collected[val])
    # Add an entry to our 'result' dict with this combined name as the key and the score as the value
    result[combinedNames] = val
# Print each combined name string from the resulting structure
for names in result:
    print(names)
Output:
Bion and Sam
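Putting the pieces together, here is a minimal end-to-end sketch that starts from the question's input list. It rests on an assumption about what the asker ultimately wants (the name or names with the highest average), and best_average is a hypothetical function name. It keeps a separate list of scores per person, whereas the question's loop never resets totle_score between people, which is why Bion and Jack came out as 95.0 and 125.0 there:

```python
def best_average(records):
    """Return the name(s) with the highest average score, joined by ' and '."""
    scores = {}
    for name, score in records:
        # Scores may arrive as strings such as "30", so convert first
        scores.setdefault(name, []).append(int(score))

    # Average each person's own scores only
    averages = {name: sum(vals) / len(vals) for name, vals in scores.items()}

    top = max(averages.values())
    winners = [name for name, avg in averages.items() if avg == top]
    return " and ".join(winners)

inputDict = [["Diane", 20], ["Bion", 25], ["Jack", "30"], ["Diane", "50"]]
print(best_average(inputDict))                   # Diane
print(best_average(inputDict + [["Sam", 35]]))   # Diane and Sam
```

The sketch assumes a non-empty input list; max() raises ValueError on an empty one.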

Summing keys and values in a list of dictionaries python

I have a list of dictionaries called "timebucket" :
[{0.9711533363722904: 0.008296776727415599},
{0.97163564816067838: 0.008153794130319884},
{0.99212783984967068: 0.0022392112909864364},
{0.98955473263127025: 0.0029843621053514003}]
I would like to take the two largest keys (.99 and .98) and average them, and also take their two values and average those as well.
The expected output would look something like:
{ (avg. two largest keys) : (avg. values of two largest keys) }
I've tried:
import numpy as np
import heapq
[np.mean(heapq.nlargest(2, i.keys())) for i in timebucket]
but heapq doesn't work in this scenario, and I'm not sure how to keep the keys and values linked.
Doing this with numpy:
In []:
a = np.array([e for i in timebucket for e in i.items()])
# sort the rows by key (column 0) and average the two rows with the largest keys
a[a[:,0].argsort()][-2:].mean(axis=0)
Out[]:
array([ 0.99084129, 0.00261179])
Though I suspect building a better data structure up front would be the better approach.
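One possible reading of "a better data structure" is a flat list of (key, value) pairs, which also lets heapq.nlargest keep each key linked to its value. This is a sketch under that assumption, not the answerer's own code; the names pairs and top_two are illustrative:

```python
import heapq

timebucket = [{0.9711533363722904: 0.008296776727415599},
              {0.97163564816067838: 0.008153794130319884},
              {0.99212783984967068: 0.0022392112909864364},
              {0.98955473263127025: 0.0029843621053514003}]

# Flatten the single-entry dictionaries into (key, value) tuples
pairs = [item for bucket in timebucket for item in bucket.items()]

# The two pairs with the largest keys; each value stays attached to its key
top_two = heapq.nlargest(2, pairs, key=lambda kv: kv[0])

key_avg = sum(k for k, _ in top_two) / len(top_two)
val_avg = sum(v for _, v in top_two) / len(top_two)
result = {key_avg: val_avg}
print(result)
```

Passing a key function to heapq.nlargest is what makes it usable here: the ranking is done on the first element of each pair, while the whole pair is returned.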
This gives you the average of 2 largest keys (keyave) and the average of the two corresponding values (valave).
The keys and values are put into a dictionary called newdict.
timebucket = [{0.9711533363722904: 0.008296776727415599},
{0.97163564816067838: 0.008153794130319884},
{0.99212783984967068: 0.0022392112909864364},
{0.98955473263127025: 0.0029843621053514003}]
keys = []
for time in timebucket:
    for x in time:
        keys.append(x)
result = {}
for d in timebucket:
    result.update(d)
largestkey = (sorted(keys)[-1])
ndlargestkey = (sorted(keys)[-2])
keyave = (float((largestkey)+(ndlargestkey))/2)
largestvalue = (result[(largestkey)])
ndlargestvalue = (result[(ndlargestkey)])
valave = (float((largestvalue)+(ndlargestvalue))/2)
newdict = {}
newdict[keyave] = valave
print(newdict)
#print(keyave)
#print(valave)
Output
{0.9908412862404705: 0.002611786698168918}
Here is a solution to your problem:
def dothisthing(mydict):  # define the function with a dictionary as the only parameter
    keylist = []  # create an empty list
    for key in mydict:  # iterate the input dictionary
        keylist.append(key)  # add the key from the dictionary to a list
    keylist.sort(reverse = True)  # sort the list from highest to lowest numbers
    toptwokeys = 0  # running sum of the two largest keys
    toptwovals = 0  # running sum of their values
    count = 0  # number of keys processed so far
    for item in keylist:  # iterate the list we created above
        if count <2:  # this limits the iterations to the first 2
            toptwokeys += item  # add the key
            toptwovals += (mydict[item])  # add the value
            count += 1
    finaldict = {(toptwokeys/2):(toptwovals/2)}  # create a dictionary where the key and val are the average of the 2 from the input dict with the greatest keys
    return finaldict  # return the output dictionary
dothisthing({0.9711533363722904: 0.008296776727415599, 0.97163564816067838: 0.008153794130319884, 0.99212783984967068: 0.0022392112909864364, 0.98955473263127025: 0.0029843621053514003})
# call the function with your data merged into a single dictionary as the parameter
I hope it helps
You can do it in just four lines without importing numpy:
One line solution
For the average of the two largest keys:
max_keys_average=sorted([keys for item in timebucket for keys,values in item.items()])[::-1][:2]
print(sum(max_keys_average)/len(max_keys_average))
output:
0.9908412862404705
For the average of their values:
max_values_average=[values for item in max_keys_average for item_1 in timebucket for keys,values in item_1.items() if item==keys]
print(sum(max_values_average)/len(max_values_average))
output:
0.002611786698168918
If you have trouble following the list comprehensions, here is a detailed solution:
Detailed Solution
First step:
get all the keys of the dicts into one list.
Here is your timebucket list:
timebucket=[{0.9711533363722904: 0.008296776727415599},
{0.97163564816067838: 0.008153794130319884},
{0.99212783984967068: 0.0022392112909864364},
{0.98955473263127025: 0.0029843621053514003}]
Now let's store all the keys in one list:
keys_list=[]
for d in timebucket:
    for key,value in d.items():
        keys_list.append(key)
The next step is to sort this list in descending order and take its first two values:
max_keys=sorted(keys_list)[::-1][:2]
The next step is to take the sum of this new list and divide it by its length:
print(sum(max_keys)/len(max_keys))
output:
0.9908412862404705
Now iterate over max_keys and over the keys in timebucket, and whenever the two match, append that item's value to a list.
max_values=[]
for item in max_keys:
    for d in timebucket:
        for key, value in d.items():
            if item==key:
                max_values.append(value)
print(max_values)
For the last part, take the sum and divide it by the length of max_values:
print(sum(max_values)/len(max_values))
Gives the output:
0.002611786698168918
This is an alternative solution to the problem:
In []:
import numpy as np
import time
def AverageTB(time_bucket):
    # list() makes each bucket's items comparable with max() on Python 3 as well
    tuples = [list(tb.items()) for tb in time_bucket]
    largest_keys = []
    largest_keys.append(max(tuples))
    tuples.remove(max(tuples))
    largest_keys.append(max(tuples))
    keys = [i[0][0] for i in largest_keys]
    values = [i[0][1] for i in largest_keys]
    return np.average(keys), np.average(values)
time_bucket = [{0.9711533363722904: 0.008296776727415599},
{0.97163564816067838: 0.008153794130319884},
{0.99212783984967068: 0.0022392112909864364},
{0.98955473263127025: 0.0029843621053514003}]
time_exe = time.time()
print('avg. (keys, values): {}'.format(AverageTB(time_bucket)))
print('time: {}'.format(time.time() - time_exe))
Out[]:
avg. (keys, values): (0.99084128624047052, 0.0026117866981689181)
time: 0.00037789344787
