I have a dictionary.
{1 : [1.2, 2.3, 4.9, 2.0], 2 : [4.1, 5.1, 6.3], 3 : [4.9, 6.8, 9.5, 1.1, 7.1]}
I want to pass each key:value pair to an instance of matplotlib.pyplot as two lists: x values and y values.
Each key is an x value associated with each item in its value.
So I want two lists for each key:
[1,1,1,1] [1.2,2.3,4.9,2.0]
[2,2,2] [4.1,5.1,6.3]
[3,3,3,3,3] [4.9,6.8,9.5,1.1,7.1]
Is there an elegant way to do this?
Or perhaps there is a way to pass a dict to matplotlib.pyplot?
for k, v in dictionary.iteritems():
    x = [k] * len(v)
    y = v
    pyplot.plot(x, y)
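On Python 3, where iteritems() no longer exists, a minimal sketch of the same idea could look like this. The helper name dict_to_xy is made up for illustration. It flattens the whole dictionary into one x list and one y list, which could then be handed to a single pyplot.scatter(xs, ys) call instead of one plot call per key.

```python
def dict_to_xy(d):
    """Flatten {x: [y1, y2, ...]} into two parallel lists of x and y values."""
    xs, ys = [], []
    for key, values in d.items():  # items() works on both Python 2 and 3
        xs.extend([key] * len(values))  # repeat the key once per y value
        ys.extend(values)
    return xs, ys


d = {1: [1.2, 2.3, 4.9, 2.0], 2: [4.1, 5.1, 6.3], 3: [4.9, 6.8, 9.5, 1.1, 7.1]}
xs, ys = dict_to_xy(d)
print(xs)
print(ys)
```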
d = {1 : [1.2, 2.3, 4.9, 2.0], 2 : [4.1, 5.1, 6.3], 3 : [4.9, 6.8, 9.5, 1.1, 7.1]}
res = [([x]*len(y), y) for x, y in d.iteritems()]
res will be a list of tuples, where the first element in each tuple is your list of x-values and the second element is your list of y-values.
Maybe something like:
d = {1 : [1.2, 2.3, 4.9, 2.0], 2 : [4.1, 5.1, 6.3], 3 : [4.9, 6.8, 9.5, 1.1, 7.1]}
result = []
for key, values in d.items():
    result.append(([key]*len(values), values))
Use this list comprehension:
[([k]*len(v), v) for k, v in D.iteritems()]
Here's an example of it being used:
>>> from pprint import pprint
>>> D = {1: [1.2, 2.3, 4.9, 2.0], 2: [4.1, 5.1, 6.3], 3: [4.9, 6.8, 9.5, 1.1, 7.1]}
>>> LL = [([k]*len(v), v) for k, v in D.iteritems()]
>>> pprint(LL)
[([1, 1, 1, 1], [1.2, 2.2999999999999998, 4.9000000000000004, 2.0]),
([2, 2, 2], [4.0999999999999996, 5.0999999999999996, 6.2999999999999998]),
([3, 3, 3, 3, 3],
[4.9000000000000004,
6.7999999999999998,
9.5,
1.1000000000000001,
7.0999999999999996])]
As a list comprehension:
r = [([k]*len(v), v) for k,v in d.items()]
If your dictionary is very large, you'd want to use a generator expression:
from itertools import repeat
r = ((repeat(k, len(v)), v) for k,v in d.iteritems())
...though note that using repeat means that the first item in each tuple the generator yields is itself a lazy iterator (an itertools.repeat object) rather than a list. That's unnecessary if the dictionary's values don't themselves have many items.
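To make that caveat concrete, here is a small sketch written for Python 3 (so items() stands in for iteritems()). It shows that the x part has to be turned into a list before it can be used like one.

```python
from itertools import repeat

d = {1: [1.2, 2.3, 4.9, 2.0], 2: [4.1, 5.1, 6.3]}

# Lazy pairs: nothing is built until the generator is consumed.
pairs = ((repeat(k, len(v)), v) for k, v in d.items())

# repeat(...) is a one-shot iterator, so materialise it with list() when needed.
materialised = [(list(xs), ys) for xs, ys in pairs]
print(materialised)
```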
>>> d = {1 : [1.2, 2.3, 4.9, 2.0], 2 : [4.1, 5.1, 6.3], 3 : [4.9, 6.8, 9.5, 1.1, 7.1]}
>>> result = [ ([k] * len(d[k]), d[k]) for k in d.keys() ]
>>> print result
[([1, 1, 1, 1], [1.2, 2.2999999999999998, 4.9000000000000004, 2.0]), ([2, 2, 2],
[4.0999999999999996, 5.0999999999999996, 6.2999999999999998]), ([3, 3, 3, 3, 3],
[4.9000000000000004, 6.7999999999999998, 9.5, 1.1000000000000001, 7.0999999999999996])]
I guess that a wizard will put something nicer, but I would do something like:
map(lambda x: ([x]*len(a[x]),a[x]),a)
for a tuple, or
map(lambda x: [[x]*len(a[x]),a[x]],a)
for a list.
btw: a is the dictionary, of course!
I assume that you work with the 2.x series...
The map function in Python will allow this:
x = [1,2,4]
y = [1,24,2]
c = zip(x,y)
print c
d = map(None,x,y)
print d
Try it out. Both print statements will give you
[(1, 1), (2, 24), (4, 2)]
In the case of zip(), if one of the lists is shorter than the others, the extra values are dropped, whereas map(None, ...) pads the shorter list with None:
x = [1,2,4]
a = [1,2,3,4,5]
c = zip(x,a)
print c
d = map(None,x,a)
print d
[(1, 1), (2, 2), (4, 3)]
[(1, 1), (2, 2), (4, 3), (None, 4), (None, 5)]
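Note that map(None, ...) only works in Python 2; in Python 3 it raises a TypeError. A sketch of the Python 3 counterpart, using itertools.zip_longest to pad the shorter input with None:

```python
from itertools import zip_longest

x = [1, 2, 4]
a = [1, 2, 3, 4, 5]

truncated = list(zip(x, a))       # stops at the end of the shorter list
padded = list(zip_longest(x, a))  # pads the shorter list with None

print(truncated)
print(padded)
```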
Related
I have a huge array of data and I would like to do subgroups for the values for same integers and then take their average.
For example:
a = [0, 0.5, 1, 1.5, 2, 2.5]
I want to take sub groups as follows:
[0, 0.5] [1, 1.5] [2, 2.5]
... and then take the average and put all the averages in a new array.
Assuming you want to group by the number's integer value (so the number rounded down), something like this could work:
>>> import itertools
>>> a = [0, 0.5, 1, 1.5, 2, 2.5]
>>> groups = [list(g) for _, g in itertools.groupby(a, int)]
>>> groups
[[0, 0.5], [1, 1.5], [2, 2.5]]
Then averaging becomes:
>>> [sum(grp) / len(grp) for grp in groups]
[0.25, 1.25, 2.25]
This assumes a is already sorted, as in your example.
Ref: itertools.groupby, list comprehensions.
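If the input is not sorted, groupby starts a new group every time the key changes, so the same integer can show up in several groups. A minimal sketch that sorts by the grouping key first (the function name group_means is only for illustration, and Python 3 division is assumed):

```python
from itertools import groupby


def group_means(values):
    """Average the values that share the same integer part."""
    means = []
    # groupby only merges adjacent items, so sort by the grouping key first.
    for _, grp in groupby(sorted(values, key=int), key=int):
        grp = list(grp)
        means.append(sum(grp) / len(grp))
    return means


print(group_means([2.5, 0, 1.5, 0.5, 2, 1]))
```

Keep in mind that int() truncates toward zero, so for negative inputs this is not the same as rounding down.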
If you have no problem using additional libraries:
import pandas as pd
import numpy as np
a = [0, 0.5, 1, 1.5, 2, 2.5]
print(pd.Series(a).groupby(np.array(a, dtype=np.int32)).mean())
Gives:
0 0.25
1 1.25
2 2.25
dtype: float64
If you want an approach with dictionary, you can go ahead like this:
dic={}
a = [0, 0.5, 1, 1.5, 2, 2.5]
for items in a:
    if int(items) not in dic:
        dic[int(items)]=[]
    dic[int(items)].append(items)
print(dic)
for items in dic:
    dic[items]=sum(dic[items])/len(dic[items])
print(dic)
You can use groupby to easily get that (you might need to sort the list first):
from itertools import groupby
from statistics import mean
a = [0, 0.5, 1, 1.5, 2, 2.5]
for k, group in groupby(a, key=int):
    print(mean(group))
Will give:
0.25
1.25
2.25
I have a dictionary:
d = {
'inds': [0, 3, 7, 3, 3, 5, 1],
'vals': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
}
I want to sum over the inds where it sums the repeated inds and outputs the following:
ind: 0 1 2 3* 4 5 6 7
x == [1.0, 7.0, 0.0, 11.0, 0.0, 6.0, 0.0, 3.0]
I've tried various loops but can't seem to figure it out, and I have no idea where to begin otherwise.
>>> from collections import defaultdict
>>> indices = [0,3,7,3,3,5,1]
>>> vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
>>> d = defaultdict(float)
>>> for i, idx in enumerate(indices):
...     d[idx] += vals[i]
...
>>> print(d)
defaultdict(<class 'float'>, {0: 1.0, 3: 11.0, 7: 3.0, 5: 6.0, 1: 7.0})
>>> x = []
>>> for i in range(max(indices)+1):
...     x.append(d[i])
...
>>> x
[1.0, 7.0, 0.0, 11.0, 0.0, 6.0, 0.0, 3.0]
Using itertools.groupby
>>> import itertools
>>> z = sorted(zip(indices, vals), key=lambda x:x[0])
>>> z
[(0, 1.0), (1, 7.0), (3, 2.0), (3, 4.0), (3, 5.0), (5, 6.0), (7, 3.0)]
>>> for k, g in itertools.groupby(z, key=lambda x:x[0]):
...     print(k, sum([t[1] for t in g]))
...
0 1.0
1 7.0
3 11.0
5 6.0
7 3.0
You need x to be a list with one entry for every value i in the range of d['inds'] (from min to max), where each entry is the sum of those d['vals'] elements whose d['inds'] element at the same position equals i.
d = {
'inds': [0, 3, 7, 3, 3, 5, 1],
'vals': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
}
result = [sum([val for ind, val in zip(d['inds'], d['vals']) if ind == i])
          for i in range(min(d['inds']), max(d['inds']) + 1)]
print(result)
The output:
[1.0, 7.0, 0, 11.0, 0, 6.0, 0, 3.0]
No libraries required. The list comprehension isn't exactly easy to read, and it rescans both lists once for every index, so it is best suited to short inputs, but it matches the description.
A breakdown of the list comprehension into its parts:
for i in range(min(d['inds']), max(d['inds']) + 1) just gets i to range from the smallest value found in d['inds'] to the largest; the + 1 takes into account that range goes up to (but not including) the second argument passed to it.
zip(d['inds'], d['vals']) pairs up elements from d['inds'] and d['vals'] and the surrounding for ind, val in .. makes these pairs available as ind, val.
[val for ind, val in .. if ind == i] generates a list of val where ind matches the current i
So, all put together, it creates a list that has the sums of those values that have an index that matches some i for each i in the range of the minimum d['inds'] to the maximum d['inds'].
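The comprehension above walks both lists once per index, so for long inputs a single pass may be preferable. A sketch, assuming all indices are non-negative integers and that the result should start at index 0 as in the question (the name sum_by_index is hypothetical):

```python
def sum_by_index(inds, vals):
    """Sum vals into the slots named by inds; unused slots stay 0.0."""
    totals = [0.0] * (max(inds) + 1)  # one slot per index from 0 to max(inds)
    for ind, val in zip(inds, vals):
        totals[ind] += val
    return totals


d = {
    'inds': [0, 3, 7, 3, 3, 5, 1],
    'vals': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
}
print(sum_by_index(d['inds'], d['vals']))
# [1.0, 7.0, 0.0, 11.0, 0.0, 6.0, 0.0, 3.0]
```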
I have the following function which will take in a list of 2D lists of size NxN, for example:
print(matrix)
[
[ [1.0, 2.0, 3.0, 4.0],
[5.0, 6.0, 7.0, 8.0],
[1.0, 2.0, 3.0, 4.0],
[5.0, 6.0, 7.0, 8.0] ],
[ [2.0, 3.0, 4.0, 5.0],
[7.0, 8.0, 9.0, 1.0],
[8.0, 0.0, 2.0, 4.0],
[1.0, 9.0, 5.0, 8.0] ]
]
Each entry of 'matrix' is a 2D list of dimension 4, making 'matrix' a 3D list with two 2D list entries. The function below will take in the dimension of the 2D lists, some number of time periods (say 3), age_classes (again suppose 3), and 'values', which would be the 3D list from above.
def initial_values_ext(dimension,periods,age_classes,values):
    dicts = {}
    dict_keys = range(dimension)
    time_keys = range(periods)
    age_keys = range(age_classes)
    for i in dict_keys:
        for j in dict_keys:
            for t in time_keys:
                for k in age_keys:
                    if t == 0:
                        dicts[i+1,j+1,t+1,k+1] = values[k][i][j]
                    else:
                        dicts[i+1,j+1,t+1,k+1] = 1
    return dicts
The function 'initial_values_ext' takes those 2D lists and generates a dictionary. Each 2D list corresponds to an age class, so the first 2D list would be age_classes = 1 and the second 2D list would be age_classes = 2, and if there were an additional 2D list it would correspond to age_classes = 3, and so on. So if we were to call the function, a couple of the outputs might look like the following:
initial_values_ext(dimension=4, periods=3, age_classes=2,values=matrix)
(1,1,1,1):1.0
(1,1,1,2):2.0
(1,1,2,2):1.0
(3,4,1,1):4.0
(3,4,1,2):4.0
(3,4,2,1):1.0
So the final output would be a full dictionary of values that starts at (1,1,1,age_class=1):1.0 and ends at (4,4,2,age_class=2):8.0. Importantly, the resulting dictionary will pull from the first 2D list of 'matrix' when age_class=1 and from the second 2D list of 'matrix' when age_class=2.
Edit: Below I have included the code that I have made for when the input matrix is only a list of lists and when there is no fourth entry of the dictionary.
matrix = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], [1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
def initial_values(dimension,periods,values):
    dicts = {}
    dict_keys = range(dimension)
    time_keys = range(periods)
    for i in dict_keys:
        for j in dict_keys:
            for t in time_keys:
                if t == 0:
                    dicts[i+1,j+1,t+1] = values[i][j]
                else:
                    dicts[i+1,j+1,t+1] = 1
    return dicts
Output:
initial_values(4,2,matrix)
{(1, 1, 1): 1.0,
(1, 1, 2): 1,
(1, 2, 1): 2.0,
(1, 2, 2): 1,
(1, 3, 1): 3.0,
(1, 3, 2): 1,
(1, 4, 1): 4.0,
(1, 4, 2): 1,
(2, 1, 1): 5.0,
(2, 1, 2): 1,
(2, 2, 1): 6.0,
(2, 2, 2): 1,
(2, 3, 1): 7.0,
(2, 3, 2): 1,
(2, 4, 1): 8.0,
(2, 4, 2): 1,
(3, 1, 1): 1.0,
(3, 1, 2): 1,
(3, 2, 1): 2.0,
(3, 2, 2): 1,
(3, 3, 1): 3.0,
(3, 3, 2): 1,
(3, 4, 1): 4.0,
(3, 4, 2): 1,
(4, 1, 1): 5.0,
(4, 1, 2): 1,
(4, 2, 1): 6.0,
(4, 2, 2): 1,
(4, 3, 1): 7.0,
(4, 3, 2): 1,
(4, 4, 1): 8.0,
(4, 4, 2): 1}
I made some modifications to make your approach more pythonic.
def initial_values_ext(dimension, periods, age_classes, values):
    x = list(map(range, [dimension, periods, age_classes]))
    dicts = {(i+1, j+1, t+1, k+1): values[k][i][j] if t == 0 else 1
             for i in x[0] for j in x[0] for t in x[1] for k in x[2]}
    return dicts
The function was missing an additional looping index where 'values' is indexed:
def initial_values_ext(dimension,periods,age_classes,values):
    dicts = {}
    dict_keys = range(dimension)
    time_keys = range(periods)
    age_keys = range(age_classes)
    for i in dict_keys:
        for j in dict_keys:
            for t in time_keys:
                for k in age_keys:
                    if t == 0:
                        dicts[i+1,j+1,t+1,k+1] = values[k][i][j]
                    else:
                        dicts[i+1,j+1,t+1,k+1] = 1
    return dicts
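The four nested loops can also be flattened with itertools.product. This is only a sketch of the same logic, not a different algorithm; it assumes values is indexed as values[age_class][row][column], as in the code above.

```python
from itertools import product


def initial_values_ext(dimension, periods, age_classes, values):
    """Build {(i, j, t, k): value} with 1-based keys."""
    dicts = {}
    for i, j, t, k in product(range(dimension), range(dimension),
                              range(periods), range(age_classes)):
        # Only the first period takes its value from the matrices.
        dicts[i + 1, j + 1, t + 1, k + 1] = values[k][i][j] if t == 0 else 1
    return dicts


matrix = [
    [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0],
     [1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]],
    [[2.0, 3.0, 4.0, 5.0], [7.0, 8.0, 9.0, 1.0],
     [8.0, 0.0, 2.0, 4.0], [1.0, 9.0, 5.0, 8.0]],
]
result = initial_values_ext(dimension=4, periods=3, age_classes=2, values=matrix)
print(result[1, 1, 1, 1], result[1, 1, 1, 2], result[1, 1, 2, 2])
```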
I have a list of values: [0,2,3,5,6,7,9] and want to get a list of the numbers in the middle in between each number: [1, 2.5, 4, 5.5, 6.5, 8]. Is there a neat way in python to do that?
It's a simple list comprehension (note I'm assuming you want all your values as floats rather than a mixture of ints and floats):
>>> lst = [0,2,3,5,6,7,9]
>>> [(a + b) / 2.0 for a,b in zip(lst, lst[1:])]
[1.0, 2.5, 4.0, 5.5, 6.5, 8.0]
(Dividing by 2.0 ensures that integer (floor) division is not applied in Python 2.)
Use a for loop:
>>> a = [0,2,3,5,6,7,9]
>>> [(a[x] + a[x + 1])/2 for x in range(len(a)-1)]
[1.0, 2.5, 4.0, 5.5, 6.5, 8.0]
However, using zip as @Chris_Rands said is better... (and more readable ¬¬)
Obligatory itertools solution:
>>> import itertools
>>> values = [0,2,3,5,6,7,9]
>>> [(a+b)/2.0 for a,b in itertools.izip(values, itertools.islice(values, 1, None))]
[1.0, 2.5, 4.0, 5.5, 6.5, 8.0]
values = [0,2,3,5,6,7,9]
middle_values = [(values[i] + values[i + 1]) / 2.0 for i in range(len(values) - 1)]
Dividing by 2.0 rather than 2 is unnecessary in Python 3, or if you use from __future__ import division to change the integer division behavior.
The zip or itertools.izip answers are more idiomatic.
Simple for loop:
nums = [0,2,3,5,6,7,9]
betweens = []
for i in range(1, len(nums)):
    if nums[i] - nums[i-1] > 1:
        betweens.extend([item for item in range(nums[i-1]+1, nums[i])])
    else:
        betweens.append((nums[i] + nums[i-1]) / 2)
Output is as desired, which doesn't need further conversion (in Python3.x):
[1, 2.5, 4, 5.5, 6.5, 8]
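Note that this loop only reproduces the midpoint when neighbouring values differ by at most 2; for larger gaps it emits every integer in between. A small sketch comparing it with the zip-based midpoint on a made-up input (the function names are mine, and Python 3 division is assumed):

```python
def betweens_loop(nums):
    """The loop from above, wrapped in a function."""
    betweens = []
    for i in range(1, len(nums)):
        if nums[i] - nums[i - 1] > 1:
            betweens.extend(range(nums[i - 1] + 1, nums[i]))
        else:
            betweens.append((nums[i] + nums[i - 1]) / 2)
    return betweens


def midpoints(nums):
    """Midpoint of each adjacent pair."""
    return [(a + b) / 2 for a, b in zip(nums, nums[1:])]


print(betweens_loop([0, 4]))  # every integer strictly between 0 and 4
print(midpoints([0, 4]))      # the single midpoint
```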
[(l[i]+l[i+1])/2 for i in range(len(l)-1)]
I have one list
a = [1.0, 2.0, 2.1, 3.0, 3.1, 4.2, 5.1, 7.2, 9.2]
I want to compare this list with other lists, but I also want to extract the position of each matching element in numeric order. All the other lists contain only elements that also appear in a.
So I have tried this
a = [1.0, 2.0, 2.1, 3.0, 3.1, 4.2, 5.1, 7.2, 9.2]
b = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print dict(zip(a,b))
a1=[2.1, 3.1, 4.2, 7.2]
I want to compare a1 with a and extract dict values [3, 5, 6, 8].
Just loop through a1 and see if there is a matching key in the dictionary you created:
mapping = dict(zip(a, b))
matches = [mapping[value] for value in a1 if value in mapping]
Demo:
>>> a = [1.0, 2.0, 2.1, 3.0, 3.1, 4.2, 5.1, 7.2, 9.2]
>>> b = [1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> a1 = [2.1, 3.1, 4.2, 7.2]
>>> mapping = dict(zip(a, b))
>>> [mapping[value] for value in a1 if value in mapping]
[3, 5, 6, 8]
However, take into account that you are using floating point numbers. You may not be able to match values exactly, since floating point numbers are binary approximations to decimal values; the value 2.999999999999999 (15 nines), for example, may be presented by the Python str() function as 3.0 (as it is in Python 2), but is not equal to 3.0:
>>> 2.999999999999999
2.999999999999999
>>> str(2.999999999999999)
'3.0'
>>> 2.999999999999999 == 3.0
False
>>> 2.999999999999999 in mapping
False
If your input list a is sorted, you could use the math.isclose() function (or a backport of it), together with the bisect module to keep matching efficient:
import bisect
try:
    from math import isclose
except ImportError:
    def isclose(a, b, rel_tol=1e-09, abs_tol=0.0):
        # simplified backport, doesn't handle NaN or infinity.
        if a == b: return True
        return abs(a-b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)
result = []
for value in a1:
    index = bisect.bisect(a, value)
    if index and isclose(a[index - 1], value):
        result.append(b[index - 1])
    elif index < len(a) and isclose(a[index], value):
        result.append(b[index])
This tests up to two values from a per input value: one that is guaranteed to be equal or lower (at index - 1) and the next, higher value (at index). For your sample a, the value 2.999999999999999 is bisected to index 3, between 2.1 and 3.0. Since isclose(3.0, 2.999999999999999) is true, that would still let you map that value to 4 in b.
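As a usage sketch, the same loop can be wrapped in a function and run on the sample data plus the near-3.0 value. The name match_close is made up for illustration, and math.isclose is imported directly, which assumes Python 3.5 or newer.

```python
import bisect
from math import isclose


def match_close(a, b, queries):
    """For each query, return b's entry at the position of the close value in sorted a."""
    result = []
    for value in queries:
        index = bisect.bisect(a, value)
        if index and isclose(a[index - 1], value):
            result.append(b[index - 1])  # equal-or-lower neighbour matched
        elif index < len(a) and isclose(a[index], value):
            result.append(b[index])      # next-higher neighbour matched
    return result


a = [1.0, 2.0, 2.1, 3.0, 3.1, 4.2, 5.1, 7.2, 9.2]
b = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(match_close(a, b, [2.1, 3.1, 4.2, 7.2, 2.999999999999999]))
# [3, 5, 6, 8, 4]
```

Queries with no close match in a are skipped, so the result can be shorter than the input.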