Shuffle a dictionary of lists aggregating by rows - python

I have a defaultdict(list) that might look like this:
d = {0: [2, 4, 5], 1: [5, 6, 1]}
I need to shuffle the first elements of all the lists together, and then do the same for the second and third elements. So in this example I need to take [2, 5], [4, 6], [5, 1], shuffle each of them, and then put them back. At the end my dictionary might look like this:
d = {0: [5, 4, 1], 1: [2, 6, 5]}
Is there a pythonic way of doing this that avoids loops?
What I have until now is a way to extract and aggregate all the first, second, etc., elements of the lists and shuffle them using this
[random.sample([tmp_list[tmp_index] for tmp_list in d.values()], 2) for tmp_index in range(3)]
that will create the following
[[2, 5], [4, 6], [5, 1]]
and then in order to create my final shuffled-by-rows dictionary I use simple for loops.

Get a transposed version of the dict values:
>>> data = [list(v) for v in zip(*d.values())]
>>> data
[[2, 5], [4, 6], [5, 1]]
Shuffle them in-place
>>> for x in data:
...     random.shuffle(x)
...
>>> data
[[5, 2], [4, 6], [5, 1]]
Transpose the data again
>>> data = zip(*data)
Assign the new values to the dict
>>> for x, k in zip(data, d):
...     d[k][:] = x  # Could also be written as d[k] = list(x)
...
>>> d
{0: [5, 4, 5], 1: [2, 6, 1]}
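The steps above can be collected into one helper. This is only a sketch: the name shuffle_rows and the optional rng parameter are made up for illustration, and it assumes every list in the dict has the same length (zip silently stops at the shortest one).

```python
import random

def shuffle_rows(d, rng=random):
    """Shuffle the values that share a list position across all keys, in place."""
    # Transpose: one list per position, holding that position's value from every key.
    columns = [list(col) for col in zip(*d.values())]
    for col in columns:
        rng.shuffle(col)
    # Transpose back and write each shuffled row into the matching key.
    for key, row in zip(d, zip(*columns)):
        d[key][:] = row
    return d

d = {0: [2, 4, 5], 1: [5, 6, 1]}
shuffle_rows(d)
print(d)  # the values at each position are the same as before, in a random key order
```

Passing rng=random.Random(some_seed) makes the shuffle reproducible, which can help when testing.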

Related

Mapping a list right in Python

I have a question on how to map my list correctly.
I have the following code:
class Bar():
    def __init__(self, i, j):
        self.i = i-1
        self.j = j-1
For the following list:
bars = [Bar(1,2), Bar(2,3), Bar(3,4), Bar(4,5), Bar(5,1),Bar(1,4), Bar(2,4), Bar(4,6), Bar(6,5)]
But for my problem, I have an array like this:
elementsmat=[[1, 1, 2], [2, 2, 3], [3, 3, 4], [4, 4, 5], [5, 5, 1], [6, 1, 4], [7, 2, 4], [8, 4, 6], [9, 6, 5]]
I used the following code to obtain an array where I removed the first element of each list of the list and then transformed it into a list.
s= np.delete(elementsmat, 0, 1)
r = s.tolist()
Output: [[1, 2], [2, 3], [3, 4], [4, 5], [5, 1], [1, 4], [2, 4], [4, 6], [6, 5]]
So, how can I apply the Bar function to all the elements of my new array correctly? I did this but I got the following error.
bars = map(Bar,r)
__init__() missing 1 required positional argument: 'j'
I thought it could be because in the first one the list has () and in my list I have [], but I am not sure.
You can use itertools.starmap instead of map (after importing itertools). Your current way calls Bar([1, 2]). starmap unpacks the lists into arguments. A generator/list comprehension is also an option.
(Bar(*x) for x in r)
Now you see why it's called starmap.
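For reference, a minimal sketch of the starmap call itself, using the Bar class from the question and a shortened version of r:

```python
from itertools import starmap

class Bar:
    def __init__(self, i, j):
        self.i = i - 1
        self.j = j - 1

r = [[1, 2], [2, 3], [3, 4]]  # shortened sample of the list from the question

# starmap calls Bar(*pair) for each pair, i.e. Bar(1, 2), Bar(2, 3), ...
bars = list(starmap(Bar, r))
print([(b.i, b.j) for b in bars])  # [(0, 1), (1, 2), (2, 3)]
```

starmap returns a lazy iterator, so list() is only needed if you want all the objects up front.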
You need to unpack the nested lists into the call to Bar():
l = list(map(lambda x: Bar(*x), r))
itertools.starmap does the same thing.
Or, you can use a list-comprehension:
l = [Bar(i, j) for i, j in r]
A built-in functional approach
lst = [[1, 2], [2, 3], [3, 4], [4, 5], [5, 1], [1, 4], [2, 4], [4, 6], [6, 5]]
map(Bar, *zip(*lst))
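To see why this works, it may help to look at what zip(*lst) produces on its own. A small sketch with a shortened list (the names firsts and seconds are just for illustration):

```python
class Bar:
    def __init__(self, i, j):
        self.i = i - 1
        self.j = j - 1

lst = [[1, 2], [2, 3], [3, 4]]

# zip(*lst) transposes the pairs into one tuple of first items and one of second items.
firsts, seconds = zip(*lst)
print(firsts)   # (1, 2, 3)
print(seconds)  # (2, 3, 4)

# map with two iterables calls Bar(firsts[0], seconds[0]), Bar(firsts[1], seconds[1]), ...
bars = list(map(Bar, firsts, seconds))
print([(b.i, b.j) for b in bars])  # [(0, 1), (1, 2), (2, 3)]
```

In Python 3, map is lazy, so wrap it in list() if you need an actual list of Bar objects.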

How can I sum up all values with the same index in a dictionary in which each key has a nested list as a value?

I have a dictionary in which each key has a list of lists (a nested list) as its value. For example, imagine we have:
x = {1: [[1, 2], [3, 5]], 2: [[2, 1], [2, 6]], 3: [[1, 5], [5, 4]]}
My question is how can I access each element of the dictionary and concatenate those with same index: for example first list from all keys:
[1,2] from the first key +
[2,1] from second and
[1,5] from third one
How can I do this?
You can access the nested lists while iterating through your dictionary, append their items to a new list, and then apply the sum function.
Code:
x={1: [[1,2],[3,5]] , 2:[[2,1],[2,6]], 3:[[1,5],[5,4]]}
ans=[]
for key in x:
    ans += x[key][0]
print(sum(ans))
Output:
12
Assuming you want a list of the first elements, you can do:
>>> x={1: [[1,2],[3,5]] , 2:[[2,1],[2,6]], 3:[[1,5],[5,4]]}
>>> y = [a[0] for a in x.values()]
>>> y
[[1, 2], [2, 1], [1, 5]]
If you want the second element, you can use a[1], etc.
The output you expect is not entirely clear (do you want to sum? concatenate?), but what seems clear is that you want to handle the values as matrices.
You can use numpy for that:
summing the values
import numpy as np
sum(map(np.array, x.values())).tolist()
output:
[[4, 8], [10, 15]] # [[1+2+1, 2+1+5], [3+2+5, 5+6+4]]
concatenating the matrices (horizontally)
import numpy as np
np.hstack(list(map(np.array, x.values()))).tolist()
output:
[[1, 2, 2, 1, 1, 5], [3, 5, 2, 6, 5, 4]]
As explained in How to iterate through two lists in parallel?, zip does exactly that: iterates over a few iterables at the same time and generates tuples of matching-index items from all iterables.
In your case, the iterables are the values of the dict. So just unpack the values to zip:
x = {1: [[1, 2], [3, 5]], 2: [[2, 1], [2, 6]], 3: [[1, 5], [5, 4]]}
for y in zip(*x.values()):
    print(y)
Gives:
([1, 2], [2, 1], [1, 5])
([3, 5], [2, 6], [5, 4])
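Once the lists are grouped by index like this, either reading of the question (concatenate or sum) can be handled without numpy. A sketch, where the names concatenated and summed are only illustrative:

```python
from itertools import chain

x = {1: [[1, 2], [3, 5]], 2: [[2, 1], [2, 6]], 3: [[1, 5], [5, 4]]}

# Concatenate the lists that share an index across all keys.
concatenated = [list(chain.from_iterable(group)) for group in zip(*x.values())]
print(concatenated)  # [[1, 2, 2, 1, 1, 5], [3, 5, 2, 6, 5, 4]]

# Or add them element by element instead.
summed = [[sum(vals) for vals in zip(*group)] for group in zip(*x.values())]
print(summed)  # [[4, 8], [10, 15]]
```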

How to extend a list inside a pandas dataframe

I have a pandas data frame, and each element of one of its columns is a list.
Then I have a list with the same amount of elements as rows in the pandas data frame; I want to extend the list inside pandas with this new list.
So, for example, if this is the data frame.
my_column
[1, 2]
[3, 4]
df = pd.DataFrame({'my_column':[[1, 2], [3, 4]]})
and this is the external list
external_list = [[5, 6], [7, 8, 9]]
I want to extend each of the lists of the data frame, so the final result is:
my_column
[1, 2, 5, 6]
[3, 4, 7, 8, 9]
For now, what I have is:
for index, row in df.iterrows():
    row["my_column"].extend(external_list[index])  # extend() works in place and returns None
Is there a more pythonic way?
df = pd.DataFrame({'my_column':[[1, 2], [3, 4]]})
lst = [[5, 6], [7, 8, 9]]
One way:
df['my_column'] += pd.Series(lst)
Another way: You can zip the column values with list values and use list comprehension:
df['my_column'] = [l1 + l2 for l1, l2 in zip(df['my_column'].tolist(), lst)]
Output:
my_column
0 [1, 2, 5, 6]
1 [3, 4, 7, 8, 9]
I am not sure whether or not it's pythonic enough but you can do it this way:
data = {"my_column":[[1, 2], [3, 4]]}
df = pd.DataFrame(data)
list2 = [[5, 6], [7, 8, 9]]
df["my_column"] = [list1 + list2[i] for i, list1 in df["my_column"].items()]  # .iteritems() was removed in pandas 2.0
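Note that these answers build new lists with + rather than calling extend. A plain-Python sketch of the difference (no pandas involved, variable names are only for illustration):

```python
a = [1, 2]
b = [5, 6]

# extend() changes the list in place and returns None,
# so assigning its result somewhere stores None, not the list.
result = a.extend(b)
print(result)  # None
print(a)       # [1, 2, 5, 6]

# + leaves both operands untouched and returns a new list.
c = [3, 4]
combined = c + [7, 8, 9]
print(combined)  # [3, 4, 7, 8, 9]
print(c)         # [3, 4]
```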

Combine values from 2 dicts into a np.array python

I have two dicts
a = {0:[1,2,3,4], 1:[5,6,7,8],...}
b = {0:[4,3,2,1], 1:[8,7,6,5],...}
I would like to create an np.array c for each key-value pair, as follows:
c1 = array([[1,4],[2,3],[3,2],[4,1]])
c2 = array([[5,8],[6,7],[7,6],[8,5]])
How can I do this? Is it possible to store an np.array in a Python dict, so that I can create a single dict c instead of multiple arrays?
Yes, you can put np.array into a Python dictionary. Just use a dict comprehension and zip the lists from a and b together.
>>> a = {0:[1,2,3,4], 1:[5,6,7,8]}
>>> b = {0:[4,3,2,1], 1:[8,7,6,5]}
>>> c = {i: np.array(list(zip(a[i], b[i]))) for i in set(a) & set(b)}
>>> c
{0: array([[1, 4], [2, 3], [3, 2], [4, 1]]),
1: array([[5, 8], [6, 7], [7, 6], [8, 5]])}
You can also use column_stack with a list comprehension:
import numpy as np
[np.column_stack((a[k], b[k])) for k in b.keys()]
Out[30]:
[array([[1, 4],
        [2, 3],
        [3, 2],
        [4, 1]]),
 array([[5, 8],
        [6, 7],
        [7, 6],
        [8, 5]])]
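Since the question asks for a single dict rather than a list, column_stack can also go inside a dict comprehension. A sketch, assuming you only want keys that appear in both a and b:

```python
import numpy as np

a = {0: [1, 2, 3, 4], 1: [5, 6, 7, 8]}
b = {0: [4, 3, 2, 1], 1: [8, 7, 6, 5]}

# a.keys() & b.keys() keeps only the keys present in both dicts.
c = {k: np.column_stack((a[k], b[k])) for k in a.keys() & b.keys()}

print(c[0].tolist())  # [[1, 4], [2, 3], [3, 2], [4, 1]]
print(c[1].tolist())  # [[5, 8], [6, 7], [7, 6], [8, 5]]
```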

Return min/max of a multidimensional list in Python?

I have a list in the form of
[ [[a,b,c],[d,e,f]] , [[a,b,c],[d,e,f]] , [[a,b,c],[d,e,f]] ... ] etc.
I want to return the minimal c value and the maximal c+f value. Is this possible?
For the minimum c:
min(c for (a,b,c),(d,e,f) in your_list)
For the maximum c+f
max(c+f for (a,b,c),(d,e,f) in your_list)
Example:
>>> your_list = [[[1,2,3],[4,5,6]], [[0,1,2],[3,4,5]], [[2,3,4],[5,6,7]]]
>>> min(c for (a,b,c),(d,e,f) in your_list)
2
>>> max(c+f for (a,b,c),(d,e,f) in your_list)
11
List comprehension to the rescue
>>> a = [[[1,2,3],[4,5,6]], [[2,3,4],[4,5,6]]]
>>> min([x[0][2] for x in a])
3
>>> max([x[0][2]+ x[1][2] for x in a])
10
You have to map your list to one containing just the items you care about.
Here is one possible way of doing this:
x = [[[5, 5, 3], [6, 9, 7]], [[6, 2, 4], [0, 7, 5]], [[2, 5, 6], [6, 6, 9]], [[7, 3, 5], [6, 3, 2]], [[3, 10, 1], [6, 8, 2]], [[1, 2, 2], [0, 9, 7]], [[9, 5, 2], [7, 9, 9]], [[4, 0, 0], [1, 10, 6]], [[1, 5, 6], [1, 7, 3]], [[6, 1, 4], [1, 2, 0]]]
minc = min(l[0][2] for l in x)
maxcf = max(l[0][2]+l[1][2] for l in x)
The argument passed to each of the min and max calls is what is called a "generator expression": it lazily maps each item of the original data to just the value you care about.
Of course it's possible. You've got a list of pairs, where each element of a pair is itself a three-element list. Your basic algorithm is
for each of the pairs
    if c is less than minimum c so far
        make minimum c so far be c
    if (c+f) is greater than max c+f so far
        make max c+f so far be (c+f)
Suppose your list is stored in my_list:
min_c = min(e[0][2] for e in my_list)
max_c_plus_f = max(map(lambda e : e[0][2] + e[1][2], my_list))
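The basic algorithm described in this answer can also be written out as an explicit single pass. A sketch only: the function name is made up, and it returns (None, None) for an empty list, which is a choice not covered by the original answers.

```python
def min_c_and_max_c_plus_f(pairs):
    """Return (smallest c, largest c + f) over pairs shaped like [[a, b, c], [d, e, f]]."""
    min_c = None
    max_cf = None
    for (_, _, c), (_, _, f) in pairs:
        if min_c is None or c < min_c:
            min_c = c
        if max_cf is None or c + f > max_cf:
            max_cf = c + f
    return min_c, max_cf

my_list = [[[1, 2, 3], [4, 5, 6]], [[0, 1, 2], [3, 4, 5]], [[2, 3, 4], [5, 6, 7]]]
print(min_c_and_max_c_plus_f(my_list))  # (2, 11)
```

This walks the list once instead of twice, which only matters for very large inputs or for iterables that can be consumed a single time.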
