I'm looking to convert lists like:
idx = ['id','m','x','y','z']
a = ['1, 1.0, 1.11, 1.11, 1.11']
b = ['2, 2.0, 2.22, 2.22, 2.22']
c = ['3, 3.0, 3.33, 3.33, 3.33']
d = ['4, 4.0, 4.44, 4.44, 4.44']
e = ['5, 5.0, 5.55, 5.55, 5.55']
Into a dictionary where:
dictlist = {
'id':[1,2,3,4,5],
'm':[1.0,2.0,3.0,4.0,5.0],
'x':[1.11,2.22,3.33,4.44,5.55],
'y':[1.11,2.22,3.33,4.44,5.55],
'z':[1.11,2.22,3.33,4.44,5.55]
}
But I would like to be able to do this for longer lists (many more than 6 elements per list), so I assume a function would be best, one that builds the dict based on the number of elements in the idx list.
Edit:
in response to g.d.d.c:
I had tried something like:
def make_dict(indx):
    data = dict()
    for item in range(0, len(indx)):
        # use each name in indx as a key with an empty placeholder value
        data.update({indx[item]: ''})
    return data
data = make_dict(idx)
Which worked for making:
{'id': '', 'm': '', 'x': '', 'y': '', 'z': ''}
but then adding each value to the dictionary became an issue.
result = {}
keys = idx
lists = [a, b, c, d, e]
for index, key in enumerate(keys):
    result[key] = []
    for l in lists:
        result[key].append(l[index])
As a single comprehension
Start by grouping your lists {a,b,c,d,e,...} into a list of lists
dataset = [a,b,c,d,e]
idx = ['id','m','x','y','z']
d = { k: [v[i] for v in dataset] for i,k in enumerate(idx) }
The last line builds the dictionary by enumerating idx: each value becomes a dict key, and its index picks the matching column out of every data sample.
The comprehension works regardless of the number of fields, as long as each list has the same length as idx.
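The same-length condition can be checked before the dictionary is built. A minimal sketch, assuming Python 3; build_columns is a hypothetical helper name, not something from the answer above:

```python
def build_columns(keys, rows):
    """Map each key to the list of values found at its position in every row."""
    for row in rows:
        # Fail early instead of hitting an IndexError inside the comprehension.
        if len(row) != len(keys):
            raise ValueError("row %r does not have %d fields" % (row, len(keys)))
    return {key: [row[i] for row in rows] for i, key in enumerate(keys)}


idx = ['id', 'm', 'x']
dataset = [[1, 1.0, 1.11], [2, 2.0, 2.22], [3, 3.0, 3.33]]
columns = build_columns(idx, dataset)
print(columns)
```

Raising on a mismatched row is a design choice for this sketch; silently truncating or padding would be equally possible.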
You can try this:
idx = ['id','m','x','y','z']
a = [1, 1.0, 1.11, 1.11, 1.11]
b = [2, 2.0, 2.22, 2.22, 2.22]
c = [3, 3.0, 3.33, 3.33, 3.33]
d = [4, 4.0, 4.44, 4.44, 4.44]
e = [5, 5.0, 5.55, 5.55, 5.55]
dictlist = {x[0] : list(x[1:]) for x in zip(idx,a,b,c,d,e)}
print(dictlist)
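# each of a..e holds a single comma-separated string; parse it into a row of floats
rows = [[float(i) for i in s[0].split(',')] for s in [a, b, c, d, e]]
answer = {}
for key, column in zip(idx, zip(*rows)):  # zip(*rows) turns rows into columns
    answer[key] = list(column)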
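If the lists really do hold one comma-separated string each, as written in the question, the fields need parsing before they can be grouped. A sketch under that assumption; parse_field and rows_to_dict are hypothetical names, and whole numbers are kept as int while everything else becomes float:

```python
def parse_field(text):
    """Return an int when the text is a whole number, otherwise a float."""
    text = text.strip()
    try:
        return int(text)
    except ValueError:
        return float(text)


def rows_to_dict(keys, rows):
    """rows is a list of one-element lists, each holding a comma-separated string."""
    parsed = [[parse_field(field) for field in row[0].split(',')] for row in rows]
    # zip(*parsed) transposes the parsed rows into columns, one per key.
    return {key: list(column) for key, column in zip(keys, zip(*parsed))}


idx = ['id', 'm', 'x', 'y', 'z']
a = ['1, 1.0, 1.11, 1.11, 1.11']
b = ['2, 2.0, 2.22, 2.22, 2.22']
dictlist = rows_to_dict(idx, [a, b])
print(dictlist)
```

Because zip stops at the shortest input, a row with too few or too many fields is silently truncated here rather than reported.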
I'm trying to understand how to loop through a list of dictionaries to produce a single list of data.
The current format of the data:
data = [{'q_rounded': 100, 'title': 'Product Evaluation', 'final_score': 5.0, 'project': < Project: C > },
{'q_rounded': 100, 'title': 'Community', 'final_score': 5.0, 'project': < Project: C > },
{'q_rounded': 100, 'title': 'Marketing', 'final_score': 5.0, 'project': < Project: C >},
{'q_rounded': 0, 'title': 'Product Evaluation', 'project': < Project: D > }]
I'm hoping to be able to end up with the final score of each title in a single list
[project,value,value2,value3]
I think I need to iterate through the original list using something like?
for item in data:
    for key, value in item.items():
        print(key, value)
but I'm not sure if this is the correct way to approach this?
Thanks
By adding each value to a list, then inserting the project at the front:
score = []
for title in data:
    x = [v for k, v in title.items() if k != "project"]
    x.insert(0, title["project"])
    score.append(x)
# score is saved in the form of
[[< Project: C >, 100, 'Product Evaluation', 5.0], ... ]
Edit
With no title:
x = [v for k, v in title.items() if k not in ["project", "title"]]
>>> score[0]
[< Project: C >, 100, 5.0]
Edit 2 (all scores)
all_scores = []
for title in data:
    if "final_score" in title:
        all_scores.append(title["final_score"])
>>> all_scores
[5.0, 5.0, 5.0]
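To get the [project, value, value2, value3] shape from the question, the scores can be grouped per project first. A rough sketch, assuming the Project objects are hashable; plain strings stand in for them here, and scores_by_project is a name made up for illustration:

```python
from collections import defaultdict

# Plain strings stand in for the Project objects from the question.
data = [
    {'q_rounded': 100, 'title': 'Product Evaluation', 'final_score': 5.0, 'project': 'C'},
    {'q_rounded': 100, 'title': 'Community', 'final_score': 5.0, 'project': 'C'},
    {'q_rounded': 100, 'title': 'Marketing', 'final_score': 5.0, 'project': 'C'},
    {'q_rounded': 0, 'title': 'Product Evaluation', 'project': 'D'},
]

scores_by_project = defaultdict(list)
for item in data:
    # .get returns None when an entry has no final_score (as for project D).
    scores_by_project[item['project']].append(item.get('final_score'))

# One row per project: the project first, then its scores in input order.
rows = [[project] + scores for project, scores in scores_by_project.items()]
print(rows)
```

Keeping None for a missing final_score preserves the position of each title; skipping such entries instead would also be reasonable.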
I have a list of dictionaries. I would like to create a new dictionary where the first key, 'value', holds the second element of the 'b' tuple of each dictionary in the list, and the second key, 'number', holds the third (and therefore last) element of the 'b' tuple of each dictionary.
my_list = [
{
'a': (2.6, 0.08, 47.0, 1),
'b': (5.7, 0.05, 1)
},
{
'a': (2.6, 0.08, 47.0, 2),
'b': (5.7, 0.06, 2)
}
]
Expected output:
new_dic = {'value': (0.05, 0.06), 'number': (1, 2)}
You can use a comprehension as follows:
new_dict = {}
new_dict['value'] = tuple(val['b'][1] for val in my_list)
new_dict['number'] = tuple(val['b'][2] for val in my_list)
Note that you need to call the tuple constructor, because (val['b'][2] for val in my_list) alone returns a generator object.
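Both tuples can also be produced in one pass by transposing with zip. A small sketch, assuming my_list is non-empty and every 'b' tuple has exactly three items:

```python
my_list = [
    {'a': (2.6, 0.08, 47.0, 1), 'b': (5.7, 0.05, 1)},
    {'a': (2.6, 0.08, 47.0, 2), 'b': (5.7, 0.06, 2)},
]

# Take the last two items of every 'b' tuple, then transpose the pairs.
values, numbers = zip(*(entry['b'][1:] for entry in my_list))
new_dict = {'value': values, 'number': numbers}
print(new_dict)
```

With an empty my_list the unpacking fails, so the two explicit tuple(...) calls above are the safer choice when that can happen.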
I have a list from an MSSQL query that includes Decimals, such as:
[(1, Decimal('33.00'), Decimal('5.30'), Decimal('50.00')),
(2, Decimal('17.00'), Decimal('0.50'), Decimal('10.00'))]
I want to transform that into a dict with float numbers, like this:
{1: [33.00, 5.30, 50.00],
2: [17.00, 0.50, 10.00]}
I wrote the line below:
load_dict = {key: values for key, *values in dataRead}
which results in:
{1: [Decimal('33.00'), Decimal('5.30'), Decimal('50.00')],
2: [Decimal('17.00'), Decimal('0.50'), Decimal('10.00')]}
Is there any way to make this transformation with a list/dict comprehension?
You could use a dict comprehension with a cast to float, like this:
from decimal import Decimal
lst = [(1, Decimal('33.00'), Decimal('5.30'), Decimal('50.00')),
(2, Decimal('17.00'), Decimal('0.50'), Decimal('10.00'))]
ret = {key: [float(f) for f in values] for key, *values in lst}
print(ret)
# {1: [33.0, 5.3, 50.0], 2: [17.0, 0.5, 10.0]}
Apply float to values:
from decimal import Decimal
data = [(1, Decimal('33.00'), Decimal('5.30'), Decimal('50.00')),
(2, Decimal('17.00'), Decimal('0.50'), Decimal('10.00'))]
load_dict = {key: list(map(float, values)) for key, *values in data}
print(load_dict)
Output
{1: [33.0, 5.3, 50.0], 2: [17.0, 0.5, 10.0]}
I have two multi-index dataframes: mean and std
arrays = [['A', 'A', 'B', 'B'], ['Z', 'Y', 'X', 'W']]
mean=pd.DataFrame(data={0.0:[np.nan,2.0,3.0,4.0], 60.0: [5.0,np.nan,7.0,8.0], 120.0:[9.0,10.0,np.nan,12.0]},
index=pd.MultiIndex.from_arrays(arrays, names=('id', 'comp')))
mean.columns.name='Times'
std=pd.DataFrame(data={0.0:[10.0,10.0,10.0,10.0], 60.0: [10.0,10.0,10.0,10.0], 120.0:[10.0,10.0,10.0,10.0]},
index=pd.MultiIndex.from_arrays(arrays, names=('id', 'comp')))
std.columns.name='Times'
My task is to combine them into a dictionary with id as the first level, followed by a second-level dictionary keyed by comp, and then, for each comp, a list of tuples combining (time point, mean, std). The result should look like this:
{'A': {
'Z': [(60.0,5.0,10.0),
(120.0,9.0,10.0)],
'Y': [(0.0,2.0,10.0),
(120.0,10.0,10.0)]
},
'B': {
'X': [(0.0,3.0,10.0),
(60.0,7.0,10.0)],
'W': [(0.0,4.0,10.0),
(60.0,8.0,10.0),
(120.0,12.0,10.0)]
}
}
Additionally, when there is a NaN in the data, the triplet is left out: here that means A,Z at time 0, A,Y at time 60, and B,X at time 120.
How do I get there? I have already constructed a dict of dict of list of tuples for a single row:
iter=0
{mean.index[iter][0]:{mean.index[iter][1]:list(zip(mean.columns, mean.iloc[iter], std.iloc[iter]))}}
>{'A': {'Z': [(0.0, nan, 10.0), (60.0, 5.0, 10.0), (120.0, 9.0, 10.0)]}}
Now I need to extend this to a dictionary with a loop over each row (inner dict), adding the ids (outer dict). I started with iterrows and a dict comprehension, but I have problems indexing with the iter ('A','Z') that I get from iterrows(), and building the whole dict iteratively.
{mean.index[iter[1]]:list(zip(mean.columns, mean.loc[iter[1]], std.loc[iter[1]])) for (iter,row) in mean.iterrows()}
creates errors, and I would only have the inner loop
KeyError: 'the label [Z] is not in the [index]'
Thanks!
EDIT: I exchanged the numbers to float in this example, because here integers were generated before which was not consistent with my real data, and which would fail in following json dump.
Here is a solution using a defaultdict:
from collections import defaultdict
import numpy as np
mean_as_dict = mean.to_dict(orient='index')
std_as_dict = std.to_dict(orient='index')
mean_clean_sorted = {k: sorted([(i, j) for i, j in v.items()]) for k, v in mean_as_dict.items()}
std_clean_sorted = {k: sorted([(i, j) for i, j in v.items()]) for k, v in std_as_dict.items()}
sol = {k: [j + (std_clean_sorted[k][i][1],) for i, j in enumerate(v) if not np.isnan(j[1])] for k, v in mean_clean_sorted.items()}
solution = defaultdict(dict)
for k, v in sol.items():
    solution[k[0]][k[1]] = v
The resulting dict will be a defaultdict object, which you can easily convert to a dict:
solution = dict(solution)
con = pd.concat([mean, std])
primary = dict()
for i in set(con.index.values):
    if i[0] not in primary.keys():
        primary[i[0]] = dict()
    primary[i[0]][i[1]] = list()
    for x in con.columns:
        # the two rows selected for (id, comp) are the mean row and the std row
        primary[i[0]][i[1]].append((x, tuple(con.loc[i[0]].loc[i[1]][x].values)))
I found a very compact, comprehension-based way of building this nested dict:
mean_dict_items=mean.to_dict(orient='index').items()
{k[0]:{u[1]:list(zip(mean.columns, mean.loc[u], std.loc[u]))
for u,v in mean_dict_items if (k[0],u[1]) == u} for k,l in mean_dict_items}
creates:
{'A': {'Y': [(0.0, 2.0, 10.0), (60.0, nan, 10.0), (120.0, 10.0, 10.0)],
'Z': [(0.0, nan, 10.0), (60.0, 5.0, 10.0), (120.0, 9.0, 10.0)]},
'B': {'W': [(0.0, 4.0, 10.0), (60.0, 8.0, 10.0), (120.0, 12.0, 10.0)],
'X': [(0.0, 3.0, 10.0), (60.0, 7.0, 10.0), (120.0, nan, 10.0)]}}
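The output above still contains the NaN triplets. A minimal sketch of one way to drop them, assuming mean and std share the same row order and columns (true for the frames in the question); nested is a name chosen here for illustration:

```python
import numpy as np
import pandas as pd

arrays = [['A', 'A', 'B', 'B'], ['Z', 'Y', 'X', 'W']]
index = pd.MultiIndex.from_arrays(arrays, names=('id', 'comp'))
mean = pd.DataFrame({0.0: [np.nan, 2.0, 3.0, 4.0],
                     60.0: [5.0, np.nan, 7.0, 8.0],
                     120.0: [9.0, 10.0, np.nan, 12.0]}, index=index)
std = pd.DataFrame({0.0: [10.0] * 4, 60.0: [10.0] * 4, 120.0: [10.0] * 4},
                   index=index)

nested = {}
# Iterating a MultiIndex yields (id, comp) tuples; rows are paired by position.
for (outer, inner), mean_vals, std_vals in zip(mean.index, mean.to_numpy(), std.to_numpy()):
    triplets = [(float(t), float(m), float(s))
                for t, m, s in zip(mean.columns, mean_vals, std_vals)
                if not np.isnan(m)]  # leave out time points without a mean
    nested.setdefault(outer, {})[inner] = triplets

print(nested)
```

The values are converted to plain Python floats so that the result can be passed to json.dumps without NumPy types getting in the way.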
I have a dict like below:
dict={idx1:{tokenA: 0.1,
tokenB: 1.3,
tokenD: 2.3},
idx2:{tokenC: 0.9,
tokenE: 3.4},
...
idxn:{tokenA: 0.3,
tokenF: 0.4,
...
tokenZ: 7.4}
}
Each index may have different tokens/values. Now I want to get the average of each token, simply as below:
{tokenA: average_value, tokenB: average_value, ... tokenZ: average_value}
Is there any efficient way to do this? Thanks in advance!
d ={'idx1':{'tokenA': 0.1,
'tokenB': 1.3,
'tokenD': 2.3},
'idx2':{'tokenC': 0.9,
'tokenE': 3.4},
'idxn':{'tokenA': 0.3,
'tokenF': 0.4,
'tokenZ': 7.4}
}
from collections import Counter
# adding Counters sums the values per token
token_sums = sum((Counter(v) for v in d.values()), Counter())
# counting the keys gives the number of occurrences per token
token_counts = sum((Counter(v.keys()) for v in d.values()), Counter())
token_mean = {k: token_sums[k] / token_counts[k] for k in token_sums}
print(token_mean)
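from collections import defaultdict
my_lists = defaultdict(list)
for key, val in my_dict.items():
    for key2, val2 in val.items():
        my_lists[key2].append(val2)
def average(key_val):
    key, val = key_val
    return (key, sum(val) * 1.0 / len(val))
# map over the (token, values) pairs, not over the keys alone
print(dict(map(average, my_lists.items())))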
Using pandas:
import pandas
d = {'a': {'t1': 0.1,
't2': 0.2},
'b': {'t1': 0.1,
't3': 0.2}}
data = pandas.DataFrame(d)
data.T.mean()
=>
t1 0.1
t2 0.2
t3 0.2
dtype: float64
import collections
d ={'idx1':{'tokenA' : 0.1,
'tokenB': 1.3,
'tokenD': 2.3},
'idx2':{'tokenC': 0.9,
'tokenE': 3.4},
'idxn':{'tokenA': 0.3,
'tokenF': 0.4,
'tokenZ': 7.4}
}
avg = collections.defaultdict(float)
count = collections.Counter()
for dat in d.values():
    for k, v in dat.items():
        avg[k] += v
        count[k] += 1
for k, v in count.items():
    avg[k] /= count[k]
print(avg)
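For Python 3, the same idea can lean on the standard library's statistics.mean. A short sketch using the sample data from the answers above; values_by_token is a name made up for this example:

```python
from collections import defaultdict
from statistics import mean

d = {'idx1': {'tokenA': 0.1, 'tokenB': 1.3, 'tokenD': 2.3},
     'idx2': {'tokenC': 0.9, 'tokenE': 3.4},
     'idxn': {'tokenA': 0.3, 'tokenF': 0.4, 'tokenZ': 7.4}}

# Collect every value seen for a token, whichever index it came from.
values_by_token = defaultdict(list)
for tokens in d.values():
    for token, value in tokens.items():
        values_by_token[token].append(value)

token_mean = {token: mean(values) for token, values in values_by_token.items()}
print(token_mean)
```

Storing the full list per token costs more memory than the running sum and count used above, but it makes other statistics (median, stdev) a one-line change.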