Get the average value from a list of dictionaries - Python

I have a list of dictionaries, for example:
total = [{"date": "2014-03-01", "value": 200}, {"date": "2014-03-02", "value": 100}, {"date": "2014-03-03", "value": 400}]
I need to get the maximum, minimum, and average value from it. I can get the max and min values with the code below:
print(min(d['value'] for d in total))
print(max(d['value'] for d in total))
But now I need to get the average value as well. How can I do that?

Just divide the sum of the values by the length of the list:
print(sum(d['value'] for d in total) / len(total))
Note that in Python 2, dividing two integers with / returns an integer. This means that the average of [5, 5, 0, 0] will be 2 instead of 2.5. If you need a more precise result, convert the sum with float() first (in Python 3, / already performs true division, so this is not needed):
print(float(sum(d['value'] for d in total)) / len(total))

I needed a more general implementation of the same thing to work on the whole dictionary. So here is one simple option:
def dict_mean(dict_list):
    mean_dict = {}
    for key in dict_list[0].keys():
        mean_dict[key] = sum(d[key] for d in dict_list) / len(dict_list)
    return mean_dict
Testing:
dicts = [{"X": 5, "value": 200}, {"X": -2, "value": 100}, {"X": 3, "value": 400}]
dict_mean(dicts)
{'X': 2.0, 'value': 233.33333333333334}
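One caveat: applied to the question's total list, this would fail on the "date" key, because strings cannot be summed. A possible variant that averages only the numeric keys is sketched below; the name dict_mean_numeric is hypothetical, and it assumes Python 3 and that every dictionary has the same keys as the first one.

```python
from numbers import Number

def dict_mean_numeric(dict_list):
    """Average every key whose values are all numbers; skip the others."""
    mean_dict = {}
    for key in dict_list[0]:
        values = [d[key] for d in dict_list]
        # bool is a subclass of int, so exclude it explicitly
        if all(isinstance(v, Number) and not isinstance(v, bool) for v in values):
            mean_dict[key] = sum(values) / len(values)
    return mean_dict

total = [
    {"date": "2014-03-01", "value": 200},
    {"date": "2014-03-02", "value": 100},
    {"date": "2014-03-03", "value": 400},
]
result = dict_mean_numeric(total)
print(result)
```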

from functools import reduce  # builtin in Python 2, must be imported in Python 3
reduce(lambda x, y: x + y, [d['value'] for d in total]) / len(total)
catavaran's answer is simpler, though, since you don't need a lambda.

An improvement on dsalaj's answer if the values are numeric lists instead:
import numpy as np

def dict_mean(dict_list):
    mean_dict = {}
    for key in dict_list[0].keys():
        # axis=0 averages element-wise across the lists stored under this key
        mean_dict[key] = np.mean([d[key] for d in dict_list], axis=0)
    return mean_dict

Related

Convert a list of values to use as a dictionary key

I am trying to use a list of values to look up a particular nested value in a dictionary.
I am not able to figure out a Pythonic way to do it.
I tried converting the list to a string and passing it as a key to the dictionary, but that does not work because the list also contains integer values.
l = ['tsGroups', 0, 'testCases', 0, 'parameters', 'GnbControlAddr', 'ip']
d = {
    "tsGroups": [{"tsId": 19,
                  "testCases": [{"name": "abcd",
                                 "type": "xyz",
                                 "parameters": {"GnbControlAddr":
                                                    {"ip": "192.1.1.1",
                                                     "mac": "",
                                                     "mtu": 1500,
                                                     "phy": "eth2",
                                                     }
                                                }
                                 }]
                  }]
}
print(d["tsGroups"][0]["testCases"][0]["parameters"]["GnbControlAddr"]["ip"])
I need to convert the input list 'l' into a form that can be used as
d["tsGroups"][0]["testCases"][0]["parameters"]["GnbControlAddr"]["ip"]
In [5]: d={
...: "tsGroups": [{"tsId": 19,"testCases": [{"name": "abcd","type": "xyz",
...: "parameters": {"GnbControlAddr": {
...: "ip": "192.1.1.1",
...: "mac": "",
...: "mtu": 1500,
...: "phy": "eth2",
...: }
...: }}]}]}
In [6]: L = ['tsGroups', 0, 'testCases', 0, 'parameters', 'GnbControlAddr', 'ip']
In [7]: functools.reduce?
Docstring:
reduce(function, sequence[, initial]) -> value
Apply a function of two arguments cumulatively to the items of a sequence,
from left to right, so as to reduce the sequence to a single value.
For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates
((((1+2)+3)+4)+5). If initial is present, it is placed before the items
of the sequence in the calculation, and serves as a default when the
sequence is empty.
Type: builtin_function_or_method
In [8]: t = d
In [9]: for e in L: t = t[e]
In [10]: t
Out[10]: '192.1.1.1'
Can't speak to how pythonic this is, but looping through the list and updating a reference to a new data structure appears to work:
current = d
for key in l:
    current = current[key]
print(current)
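The functools.reduce docstring quoted above hints at a more compact form of the same loop. A minimal sketch, assuming the path only contains valid keys and indices; the function name get_by_path is made up for illustration:

```python
from functools import reduce
from operator import getitem

def get_by_path(nested, path):
    """Follow each key or index in path, starting from nested."""
    # reduce(getitem, [k1, k2], d) evaluates getitem(getitem(d, k1), k2)
    return reduce(getitem, path, nested)

d = {"tsGroups": [{"tsId": 19,
                   "testCases": [{"name": "abcd",
                                  "type": "xyz",
                                  "parameters": {"GnbControlAddr": {
                                      "ip": "192.1.1.1",
                                      "mac": "",
                                      "mtu": 1500,
                                      "phy": "eth2"}}}]}]}
path = ['tsGroups', 0, 'testCases', 0, 'parameters', 'GnbControlAddr', 'ip']
print(get_by_path(d, path))  # 192.1.1.1
```

Passing the dictionary as reduce's initial value means an empty path simply returns the dictionary itself. A wrong key or index raises KeyError or IndexError, just as the explicit loop would.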

How can I sort or change keys (consisting of numbers) in dictionary, based on other key names?

I have a dictionary, loaded from JSON, like so:
data = {
    "1": [
        "data",
        "data"
    ],
    "2": [
        "data",
        "data"
    ],
    "3": [
        "data",
        "data"
    ],
    "5": [
        "data",
        "data"
    ]
}
In this instance, "4" is missing, since it has been deleted through another function.
I am trying to write a function that will re-organize/sort the dictionary. This involves fixing holes such as this. In this instance, the numbers should go from 1 to 4 with no holes, this means changing the key-name 5 to 4.
I have written some code to find out which number is missing:
nums = [1, 2, 3, 5]
missing = 0
# check every number from 1 up to the largest key
for x in range(1, max(nums) + 1):
    if x not in nums:
        missing += x
Greatly appreciate some help. I am simply stuck on how to proceed.
PS: I realize it may not be an optimal data structure. It is like this so I can easily match integers given as system arguments to keys and thus finding corresponding values.
So I just figured out a very easy way of doing it... was a lot simpler than I thought it would be.
for x in range(len(data)):
    for k, v in data.items():
        data[x] = data.pop(k)
        break
You could use enumerate to get the new indices for the existing (sorted) keys:
>>> data = {"1": ["data11", "data12"],
... "2": ["data21", "data22"],
... "3": ["data31", "data32"],
... "5": ["data51", "data52"]}
...
>>> {i: data[k] for i, k in enumerate(sorted(data), start=1)}
{1: ['data11', 'data12'],
2: ['data21', 'data22'],
3: ['data31', 'data32'],
4: ['data51', 'data52']}
Or the same, in-place (the if i != k: is not really needed, but might be faster):
>>> for i, k in enumerate(sorted(data), start=1):
...     if i != k:
...         data[i] = data.pop(k)
...
>>> data
{1: ['data11', 'data12'],
2: ['data21', 'data22'],
3: ['data31', 'data32'],
4: ['data51', 'data52']}
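One thing to watch with string keys loaded from JSON: a plain sorted() orders them lexicographically, so "10" would come before "2". A possible variant that sorts numerically and keeps string keys is sketched here; the name renumber is hypothetical, and it assumes every key is a string of digits.

```python
def renumber(data):
    """Return a new dict whose keys run "1".."n", in numeric key order."""
    # key=int sorts "10" after "5"; a plain sorted() on strings would not
    ordered = sorted(data, key=int)
    return {str(i): data[k] for i, k in enumerate(ordered, start=1)}

data = {"1": ["a"], "2": ["b"], "10": ["d"], "5": ["c"]}
print(renumber(data))
# {'1': ['a'], '2': ['b'], '3': ['c'], '4': ['d']}
```

Keeping the keys as strings is a guess at what suits the question, since system arguments also arrive as strings; drop the str() call to get integer keys instead.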

Sum of specific key-value in python dictionary

I have a school dictionary as follows:
{
    ID('6a15ce'): {
        'count': 5,
        'amount': 0,
        'r_amount': None,
        'sub': < subobj >
    }, ID('464ba1'): {
        'count': 2,
        'amount': 120,
        'r_amount': None,
        'sub': < subobj2 >
    }
}
I want to find the sum of amount, and tried the following:
{k:sum(v['amount']) for k,v in school.items()}
but I get the error TypeError: 'int' object is not iterable. What would be an efficient way to achieve this?
You can do:
result = sum(v["amount"] for v in school.values())
You can also do it using the map function:
result = sum(map(lambda i: i['amount'], school.values()))
print(result)
Output:
120
This is a functional solution:
from operator import itemgetter
res = sum(map(itemgetter('amount'), school.values()))
sum(map(lambda entry: entry['amount'], school.values()))
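The sample data also contains None under 'r_amount', and summing that field directly would raise a TypeError. A small sketch that treats None as 0 follows. The helper name sum_field is made up, and plain strings stand in for the ID(...) keys and the 'sub' objects are omitted, since their types are not shown in the question.

```python
def sum_field(records, field):
    """Sum records[...][field] over all entries, treating None as 0."""
    return sum(v[field] or 0 for v in records.values())

# Simplified stand-in for the school dictionary from the question
school = {
    "6a15ce": {"count": 5, "amount": 0, "r_amount": None},
    "464ba1": {"count": 2, "amount": 120, "r_amount": None},
}
print(sum_field(school, "amount"))    # 120
print(sum_field(school, "r_amount"))  # 0
```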

Matrix weight algorithm

I'm trying to work out how to write an algorithm that calculates the weights across different lists in the most efficient way. I have a dict which contains various ids:
x["Y"]=[id1,id2,id3...]
x["X"]=[id2,id3....]
x["Z"]=[id3]
.
.
I have an associated weight for each of the elements:
w["Y"]=10
w["X"]=10
w["Z"]=5
Given an input, e.g. "Y","Z", I want to get this output:
(id1,10),(id2,10),(id3,15)
id3 gets 15 because it's in both x["Y"] and x["Z"].
Is there a way I can do this with vectors and matrices?
You can use the itertools library to group together common terms in a list:
import itertools
import operator

a = {'x': [2,3], 'y': [1,2,3], 'z': [3]}
b = {'x': 10, 'y': 10, 'z': 5}

def matrix_weight(letter1, letter2):
    final_list = []
    for i in a[letter1]:
        final_list.append((i, b[letter1]))
    for i in a[letter2]:
        final_list.append((i, b[letter2]))
    # final_list = [(1,10), (2,10), (3,10), (3,5)]
    # groupby only merges adjacent items, so sort by id first
    final_list.sort(key=operator.itemgetter(0))
    it = itertools.groupby(final_list, operator.itemgetter(0))
    for key, subiter in it:
        yield key, sum(item[1] for item in subiter)

print(list(matrix_weight('y', 'z')))
I'll use string ids as in your example, but integer ids work the same way.
def id_weights(x, w, keys):
    result = {}
    for key in keys:
        for id in x[key]:
            if id not in result:
                result[id] = 0
            result[id] += w[key]
    return [(id, result[id]) for id in sorted(result.keys())]

x = {"Y": ["id1","id2","id3"],
     "X": ["id2", "id3"],
     "Z": ["id3"]}
w = {"Y": 10, "X": 10, "Z": 5}

if __name__ == "__main__":
    keys = ["Y", "Z"]
    print(id_weights(x, w, keys))
gives
[('id1', 10), ('id2', 10), ('id3', 15)]
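On the question of vectors and matrices: the same result can be read as a membership matrix multiplied by a weight vector. The sketch below spells that product out in plain Python so it runs without extra libraries. The function name id_weights_matrix is hypothetical, and it assumes all weights are positive, so a total of 0 means the id was not in any selected list.

```python
x = {"Y": ["id1", "id2", "id3"], "X": ["id2", "id3"], "Z": ["id3"]}
w = {"Y": 10, "X": 10, "Z": 5}

def id_weights_matrix(x, w, selected):
    groups = sorted(x)  # column order: ['X', 'Y', 'Z']
    ids = sorted({i for members in x.values() for i in members})  # row order
    # membership[r][c] is 1 when ids[r] appears in x[groups[c]], else 0
    membership = [[1 if i in x[g] else 0 for g in groups] for i in ids]
    # weight vector with the unselected groups zeroed out
    v = [w[g] if g in selected else 0 for g in groups]
    # matrix-vector product: one dot product per id
    totals = [sum(m * vc for m, vc in zip(row, v)) for row in membership]
    # assumes positive weights, so a zero total means "not selected"
    return [(i, t) for i, t in zip(ids, totals) if t > 0]

print(id_weights_matrix(x, w, ["Y", "Z"]))
# [('id1', 10), ('id2', 10), ('id3', 15)]
```

The membership matrix only depends on x, so for repeated queries it could be built once and reused, with only the weight vector changing per query.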

Comparing elements of one dictionary to ranges of values in another dictionary

valid = {'Temp': [10, 55], 'rain_percent': [49, 100], 'humidity': [30,50]}
data = {'Temp': 30.45, 'rain_percent': 80.56 }
min_temp, max_temp = valid['Temp']
if not (min_temp <= data['Temp'] <= max_temp):
    print("Bad Temp")
min_rain, max_rain = valid['rain_percent']
if not (min_rain <= data['rain_percent'] <= max_rain):
    print("It's not going to rain")
This is what I'm doing with the two dictionaries I have. I know this check can be improved. Since both dictionaries, valid and data, have the same keys, there must be a better way of implementing this check. Can anyone help me with this?
Thanks a lot.
If I understand the question correctly, you're trying to check if each value data[k] is in the range defined by the 2-element list/tuple valid[k].
Try using a for loop and dict.items() to iterate through data and compare each value to the corresponding range in valid:
valid = {'Temp': [10, 55], 'rain_percent': [49, 100], 'humidity': [30,50]}
data = {'Temp': 30.45, 'rain_percent': 80.56, 'humidity': 70 }
for key, val in data.items():
    low, high = valid[key]  # avoid shadowing the builtins min and max
    if not (low <= val <= high):
        print("%s=%g is out of valid range (%g-%g)" % (key, val, low, high))
    else:
        print("%s=%g is in the valid range (%g-%g)" % (key, val, low, high))
With the example data values I gave, it will print this (the order of the lines depends on the dictionary's iteration order):
rain_percent=80.56 is in the valid range (49-100)
Temp=30.45 is in the valid range (10-55)
humidity=70 is out of valid range (30-50)
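If the result is needed by other code rather than printed, the same check can collect the failing readings instead. A minimal sketch, with the made-up name out_of_range, assuming every key in data also exists in valid:

```python
def out_of_range(data, valid):
    """Return {key: value} for every reading outside its [low, high] range."""
    bad = {}
    for key, value in data.items():
        low, high = valid[key]
        if not (low <= value <= high):
            bad[key] = value
    return bad

valid = {'Temp': [10, 55], 'rain_percent': [49, 100], 'humidity': [30, 50]}
data = {'Temp': 30.45, 'rain_percent': 80.56, 'humidity': 70}
print(out_of_range(data, valid))  # {'humidity': 70}
```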
This answer builds on @Dan's.
You may want to add other parameters to your 'valid' dictionary, such as the average or standard deviation, and many more data points, such as air_pressure, wind_speed, or visibility.
Especially when you have many more data points (temp, humidity, etc.) and many more parameters and labels (min, max, 'high temp', 'low temp', etc.), you would want your 'valid' dictionary to be more descriptive. You can then write general functions that are more flexible and descriptive, depending on the depth of your 'valid' dictionary.
Here's an example. Let's now call the 'valid' dictionary 'parameters'.
parameters = {
    'temp': {
        'min': 10,
        'max': 55,
        'avg': 40,
        'stddev': 10,
        'in_range_label': "Good Temp",
        'out_range_label': "Bad Temp",
        'above_average_label': "Above average temp",
        'below_average_label': "Below average temp",
    },
    'rain_percent': {
        'min': 49,
        'max': 100,
        'avg': 75,
        'in_range_label': "Going to rain",
        'out_range_label': "Not going to rain",
        'above_average_label': "Above average rain",
        'below_average_label': "Below average rain",
    },
    'humidity': {
        'min': 30,
        'max': 50,
        'avg': 45,
        'in_range_label': "Humid",
        'out_range_label': "Not humid",
        'above_average_label': "Above average humidity",
        'below_average_label': "Below average humidity",
    }
}
data = {'temp': 30.45, 'rain_percent': 80.56 }

def check_min_max(data, parameters):
    for k, v in data.items():
        low = parameters[k]['min']
        high = parameters[k]['max']
        if low <= v <= high:
            print('{}={}, {}'.format(k, v, parameters[k]['in_range_label']))
        else:
            print('{}={}, {}'.format(k, v, parameters[k]['out_range_label']))

def check_avg(data, parameters):
    for k, v in data.items():
        avg = parameters[k]['avg']
        if v > avg:
            print('{}={}, {}'.format(k, v, parameters[k]['above_average_label']))
        else:
            print('{}={}, {}'.format(k, v, parameters[k]['below_average_label']))

check_min_max(data, parameters)
check_avg(data, parameters)
>>>
rain_percent=80.56, Going to rain
temp=30.45, Good Temp
rain_percent=80.56, Above average rain
temp=30.45, Below average temp
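The 'stddev' entry is defined for 'temp' but not used by either function. One way it might be used, sketched under the assumption that only some entries define it, is to express each reading as a number of standard deviations from its average. The name deviations is hypothetical, and the dictionary is trimmed to the fields the sketch needs.

```python
def deviations(data, parameters):
    """Return {key: (value - avg) / stddev} for keys that define both."""
    result = {}
    for k, v in data.items():
        p = parameters[k]
        # skip entries such as rain_percent that have no stddev
        if 'avg' in p and 'stddev' in p:
            result[k] = (v - p['avg']) / p['stddev']
    return result

# Trimmed copy of the parameters dictionary
parameters = {
    'temp': {'avg': 40, 'stddev': 10},
    'rain_percent': {'avg': 75},
}
data = {'temp': 30.45, 'rain_percent': 80.56}
scores = deviations(data, parameters)
print(scores)
```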
