Python permutations with more randomization

I'm trying to generate permutations from a list of indices, and I'm currently using itertools.permutations. It works, but I need the order to be genuinely random: I cannot go through all the permutations, only a very short subset of the full set (the first ones emitted) for a simulation.
From the itertools.permutations documentation:
The permutation tuples are emitted in lexicographic ordering according to the order of the input iterable. So, if the input iterable is sorted, the permutation tuples will be produced in sorted order.
import itertools
for ind, idxs in enumerate(itertools.permutations(range(5))):
    print(ind)
    print(idxs)
    print('--------')
0
(0, 1, 2, 3, 4)
--------
1
(0, 1, 2, 4, 3)
--------
2
(0, 1, 3, 2, 4)
--------
3
(0, 1, 3, 4, 2)
--------
4
(0, 1, 4, 2, 3)
--------
5
(0, 1, 4, 3, 2)
--------
6
(0, 2, 1, 3, 4)
--------
7
(0, 2, 1, 4, 3)
--------
8
(0, 2, 3, 1, 4)
--------
9
(0, 2, 3, 4, 1)
--------
10
(0, 2, 4, 1, 3)
--------
11
(0, 2, 4, 3, 1)
--------
12
(0, 3, 1, 2, 4)
--------
13
(0, 3, 1, 4, 2)
--------
One solution that comes to mind is to shuffle the list each time to get a random order, but that defeats the purpose of using permutations, because the same sample could be generated more than once. The permutations should also be generated iteratively, so I cannot just call list(itertools.permutations(...)), as that would build an unnecessarily long list.

One way would be to shuffle before and/or after you generate the permutations.
For reference:
import itertools
import random
a = list(range(3))
print("original =",a)
random.shuffle(a)
print("shuffled =",a)
permutations = list(itertools.permutations(a))
print("permutations of shuffled array =",permutations)
random.shuffle(permutations)
print("shuffled permutations of shuffled array =",permutations)
original = [0, 1, 2]
shuffled = [1, 0, 2]
permutations of shuffled array = [(1, 0, 2), (1, 2, 0), (0, 1, 2), (0, 2, 1), (2, 1, 0), (2, 0, 1)]
shuffled permutations of shuffled array = [(0, 1, 2), (2, 0, 1), (2, 1, 0), (1, 0, 2), (1, 2, 0), (0, 2, 1)]

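Generate random permutations directly: if you only use a small number k of them out of the n! possible, the chance that a new draw repeats one of the k already taken is k/n!, and the chance of any repeat among k draws is roughly k(k-1)/(2*n!), which is negligible when k is much smaller than n!.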
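A minimal sketch of that idea, assuming the items are distinct and hashable: draw random permutations one at a time and skip any that were already produced. The name random_unique_permutations is made up for this illustration.

```python
import math
import random


def random_unique_permutations(items, k):
    """Yield up to k distinct random permutations of items, lazily.

    Assumes the items are distinct and hashable; with repeated items the
    number of distinct permutations is below n! and the loop may not end.
    """
    items = list(items)
    # Never ask for more permutations than exist.
    k = min(k, math.factorial(len(items)))
    seen = set()
    while len(seen) < k:
        # random.sample with the full length returns a shuffled copy.
        perm = tuple(random.sample(items, len(items)))
        if perm not in seen:
            seen.add(perm)
            yield perm


for perm in random_unique_permutations(range(5), 4):
    print(perm)
```

Memory grows with k (the set of permutations already seen), not with n!, so this stays cheap as long as only a short subset is needed.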

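Use random.sample (note that this builds the full list of permutations first, so it only suits small inputs):
import itertools
import random
permutations = list(itertools.permutations(range(5)))
permutation = random.sample(permutations, k=4)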
# run 1
>>> random.sample(permutations, k=4)
[(0, 4, 1, 2, 3), (4, 0, 1, 3, 2), (3, 2, 0, 4, 1), (1, 2, 3, 4, 0)]
# run 2
>>> random.sample(permutations, k=4)
[(2, 1, 4, 0, 3), (0, 3, 4, 1, 2), (3, 1, 4, 0, 2), (0, 3, 4, 2, 1)]
# run 3
>>> random.sample(permutations, k=4)
[(3, 4, 1, 0, 2), (3, 0, 1, 2, 4), (0, 4, 1, 2, 3), (3, 4, 2, 0, 1)]
# and so on
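If building the full list is too expensive, one possible alternative (a sketch, not taken from the answers above) is to sample k distinct indices from range(n!) and convert each index into the permutation at that position in itertools.permutations order. The helper names nth_permutation and sample_permutations are hypothetical, and the approach assumes n! does not exceed sys.maxsize (roughly n <= 20 on a 64-bit build), because random.sample needs the length of the range.

```python
import math
import random


def nth_permutation(items, index):
    """Return the permutation at position index, in itertools.permutations order."""
    pool = list(items)
    result = []
    for remaining in range(len(pool), 0, -1):
        # Each choice of the next element covers (remaining - 1)! permutations.
        block = math.factorial(remaining - 1)
        pos, index = divmod(index, block)
        result.append(pool.pop(pos))
    return tuple(result)


def sample_permutations(items, k):
    """Yield k distinct random permutations without listing all of them."""
    items = list(items)
    total = math.factorial(len(items))
    # Sampling from a range is lazy: no list of n! indices is built.
    for index in random.sample(range(total), k):
        yield nth_permutation(items, index)


print(list(sample_permutations(range(5), 4)))
```

Because the indices are sampled without replacement, no permutation can appear twice.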

Related

I have a list of tuples of different lengths, so I need to iterate over the list

A tuple whose length is 6 is correct, while the shorter tuples are artefacts that should be joined together to give tuples of length 6.
For example:
I have a list of tuples as below:
foo = [(3, 1, 0, 1, 1, 1), (3, 1), (1, 1), (3, 1), (3, 1, 0, 1), (1, 2), (3, 3, 3, 1, 2, 2)]
len(foo[0]) = 6
len(foo[1]) = 2
len(foo[2]) = 2
len(foo[3]) = 2
len(foo[4]) = 4
len(foo[5]) = 2
len(foo[6]) = 6
So it means that I want to have a list with the following output:
foo_1 = [(3, 1, 0, 1, 1, 1), (3, 1, 1, 1, 3, 1), (3, 1, 0, 1, 1, 2), (3, 3, 3, 1, 2, 2)]
where:
foo_1[1] = foo[1] + foo[2] + foo[3],
foo_1[2] = foo[4] + foo[5]
Basically, I need to iterate over the list of tuples and compare the length of each with 6. If the length of a tuple is not equal to 6, I have to join consecutive tuples until their combined length is 6.
You can create a function that flattens the list of tuples, and then use generators and zip to group the elements into tuples of the proper length.
>>> def flatten(lst):
...     for tup in lst:
...         yield from tup
...
>>> list(zip(*[flatten(foo)]*6))
[(3, 1, 0, 1, 1, 1),
(3, 1, 1, 1, 3, 1),
(3, 1, 0, 1, 1, 2),
(3, 3, 3, 1, 2, 2)]
You can find more about how zip(*[iter(iterable)]*n) works here.
Or you can use the itertools.chain.from_iterable function to accomplish the flattening part:
>>> from itertools import chain
>>> flat = chain.from_iterable(foo)
>>> list(zip(*[flat]*6))
[(3, 1, 0, 1, 1, 1),
(3, 1, 1, 1, 3, 1),
(3, 1, 0, 1, 1, 2),
(3, 3, 3, 1, 2, 2)]
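The zip(*[iterator]*n) grouping used above works because the list holds n references to the same iterator, so each slot of a tuple consumes the next element. A small illustration (the variable names are only for this example):

```python
# One shared iterator: every zip slot pulls the *next* element from it.
it = iter(range(6))
shared = list(zip(it, it))  # same as zip(*[it] * 2)

# Two independent iterators: every slot starts from the beginning.
independent = list(zip(iter(range(6)), iter(range(6))))

# A trailing group that cannot be filled is silently dropped by zip.
truncated = list(zip(*[iter(range(7))] * 3))

print(shared)       # [(0, 1), (2, 3), (4, 5)]
print(independent)  # [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]
print(truncated)    # [(0, 1, 2), (3, 4, 5)]
```

The last case matters here: if the flattened data is not a multiple of 6 long, plain zip discards the leftover elements without warning.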
import itertools
foo=[(3, 1, 0, 1, 1, 1), (3, 1), (1, 1), (3, 1), (3, 1, 0, 1), (1, 2), (3, 3, 3, 1, 2, 2)]
foo1=[i for t in foo for i in t]
list(itertools.zip_longest(*[iter(foo1)]*6))
Output:
[(3, 1, 0, 1, 1, 1),
(3, 1, 1, 1, 3, 1),
(3, 1, 0, 1, 1, 2),
(3, 3, 3, 1, 2, 2)]
Or just iterate over the flat list and slice it (this gives lists rather than tuples):
[foo1[i:i+6] for i in range(0,len(foo1),6)]
foo1 is a flat list of all the elements; after that we can use slicing or zip_longest from itertools to get the desired result.
From the itertools.zip_longest documentation:
Make an iterator that aggregates elements from each of the iterables.
If the iterables are of uneven length, missing values are filled-in
with fillvalue. Iteration continues until the longest iterable is
exhausted.
If the total length is not a multiple of 6, for example after adding an extra (1,):
foo=[(3, 1, 0, 1, 1, 1), (3, 1), (1, 1), (3, 1), (3, 1, 0, 1), (1, 2), (3, 3, 3, 1, 2, 2),(1,)]
foo1=[i for t in foo for i in t]
list(itertools.zip_longest(*[iter(foo1)]*6))
the last tuple of the result is padded with None:
(1, None, None, None, None, None)
If we need some fill value instead of None, then
list(itertools.zip_longest(*[iter(foo1)]*6,fillvalue=2))
gives as its last tuple:
(1, 2, 2, 2, 2, 2)
An easy way to get the result is to use more_itertools.chunked:
from more_itertools import chunked
from itertools import chain
foo = [(3, 1, 0, 1, 1, 1), (3, 1), (1, 1), (3, 1),
       (3, 1, 0, 1), (1, 2), (3, 3, 3, 1, 2, 2)]
for chunk in chunked(chain.from_iterable(foo), 6):
    print(chunk)
Prints:
[3, 1, 0, 1, 1, 1]
[3, 1, 1, 1, 3, 1]
[3, 1, 0, 1, 1, 2]
[3, 3, 3, 1, 2, 2]
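The joining rule described in the question (keep adding consecutive tuples until the combined length is 6) can also be sketched directly. Unlike the flatten-and-regroup answers, this version complains when fragments do not line up with a group boundary, which may or may not be wanted. The function name join_to_length is an assumption made for this example.

```python
def join_to_length(tuples, size=6):
    """Join consecutive tuples until each group has exactly `size` items."""
    joined = []
    current = ()
    for tup in tuples:
        current += tup
        if len(current) == size:
            joined.append(current)
            current = ()
        elif len(current) > size:
            # A fragment spilled past the group boundary.
            raise ValueError(f"fragments overrun a group boundary: {current}")
    if current:
        raise ValueError(f"incomplete trailing group: {current}")
    return joined


foo = [(3, 1, 0, 1, 1, 1), (3, 1), (1, 1), (3, 1),
       (3, 1, 0, 1), (1, 2), (3, 3, 3, 1, 2, 2)]
print(join_to_length(foo))
```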

Combination of List and Nested List by Index

The output of my script is a list and a nested list. I would like to get the combinations of the two lists by index. In this instance, I have the following two lists:
x = [0, 1, 2, 3]
y = [[0, 1, 2, 3],
[0, 1, 2, 3, 4, 5, 6, 7, 8],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4, 5, 6, 7, 8]]
The desired output should look something like this.
[(0, 0), (0, 1), (0, 2), (0, 3), (1, 0), (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6),
 (1, 7), (1, 8), (2, 0), (2, 1), (2, 2), (2, 3), (2, 4), (3, 1), (3, 2), (3, 3), (3, 4),
 (3, 5), (3, 6), (3, 7), (3, 8)]
I've looked at many posts about itertools.combinations and itertools.product, but I cannot find anything about looping and combining at the same time, which I think would be the approach to the problem. I want to get all combinations x[0] and y[0], then x[1] and y[1], etc.
You can do this with a list comprehension.
x = [0, 1, 2, 3]
y = [[0, 1, 2, 3],
[0, 1, 2, 3, 4, 5, 6, 7, 8],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4, 5, 6, 7, 8]]
final = [(i,j) for i in x for j in y[i]]
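The comprehension above uses the values of x as indices into y, which works here only because x happens to be 0, 1, 2, 3. A variant that pairs the two lists by position with zip, sketched under the assumption that x and y have the same length (pair_by_position is a made-up name):

```python
def pair_by_position(x, y):
    """Pair each element of x with every element of the sublist of y
    found at the same position."""
    return [(xi, yj) for xi, ys in zip(x, y) for yj in ys]


x = [0, 1, 2, 3]
y = [[0, 1, 2, 3],
     [0, 1, 2, 3, 4, 5, 6, 7, 8],
     [0, 1, 2, 3, 4],
     [0, 1, 2, 3, 4, 5, 6, 7, 8]]
final = pair_by_position(x, y)

# The values in x no longer need to be valid indices.
labels = pair_by_position(['a', 'b'], [[1, 2], [3]])
```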
It seems you are looking for the Cartesian product of two arrays. Here is a reference; check it and let me know if it works for you:
Cartesian product of x and y array points into single array of 2D points

Sliding window of width n over the given iterable

I have a sequence, window size and step:
seq = [0,1,2,3,4]
n=4
step=2
from more_itertools import windowed
list(windowed([0,1,2,3,4], n, fillvalue=0, step=step))
result:
[(0, 1, 2, 3), (2, 3, 4, 0)]
but I need:
[(0, 1, 2, 3), (2, 3, 4, 0), (4, 0, 0, 0)]
Please help me find a solution
Just write your own windowed function (this version needs a list, since it uses len and slicing):
def windowed(iterable, size, fillvalue=None, step=1):
    for i in range(0, len(iterable), step):
        window = iterable[i:i+size]
        # pad the trailing windows up to the full size
        window += [fillvalue] * (size - len(window))
        yield window
>>> list(windowed([0,1,2,3,4], 4, fillvalue=0, step=2))
[[0, 1, 2, 3], [2, 3, 4, 0], [4, 0, 0, 0]]
This should also work with arbitrary iterables and not just sequences (it assumes step <= n):
from itertools import islice
def sliding_window(seq, n, step, fillvalue=None):
    it = iter(seq)
    values = tuple(islice(it, n))
    while values:
        # pad a short trailing window with fillvalue
        yield values + (n-len(values)) * (fillvalue, )
        # drop the first `step` items and pull in up to `step` new ones
        values = values[step:] + tuple(islice(it, step))
The function outputs:
print(list(sliding_window(seq, n, step, fillvalue=0)))
# [(0, 1, 2, 3), (2, 3, 4, 0), (4, 0, 0, 0)]
Most of it is borrowed from the original itertools recipe for a sliding window.
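The islice-based generator above behaves as expected only while step does not exceed the window size. A possible variant that also handles larger steps by skipping the items between windows is sketched below; the name sliding_window_any_step is hypothetical.

```python
from itertools import islice


def sliding_window_any_step(iterable, n, step=1, fillvalue=None):
    """Yield padded windows of width n starting at positions 0, step, 2*step, ..."""
    it = iter(iterable)
    window = tuple(islice(it, n))
    while window:
        # Pad a short trailing window up to width n.
        yield window + (fillvalue,) * (n - len(window))
        if step < n:
            # Overlapping windows: keep the tail and pull in new items.
            window = window[step:] + tuple(islice(it, step))
        else:
            # Disjoint windows: discard the gap, then read a fresh window.
            for _ in islice(it, step - n):
                pass
            window = tuple(islice(it, n))


print(list(sliding_window_any_step([0, 1, 2, 3, 4], 4, step=2, fillvalue=0)))
# [(0, 1, 2, 3), (2, 3, 4, 0), (4, 0, 0, 0)]
```

Since it only calls iter and islice, it accepts generators as well as lists.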
How about using padded?
seq = [0,1,2,3,4]
n=4
step=2
from more_itertools import windowed, padded
list(windowed(padded(seq, 0, n=n, next_multiple=True), n, step=step))
Consider more_itertools.stagger:
Given
import itertools as it
import more_itertools as mit
iterable = [0, 1, 2, 3, 5]
Code
Get all results from sliding windows:
windows = list(mit.stagger(iterable, offsets=(0, 1, 2, 3), longest=True, fillvalue=0))
windows
# [(0, 1, 2, 3), (1, 2, 3, 5), (2, 3, 5, 0), (3, 5, 0, 0), (5, 0, 0, 0)]
Next, filter out the desired results:
[w for i, w in enumerate(windows) if not (i % 2)]
# [(0, 1, 2, 3), (2, 3, 5, 0), (5, 0, 0, 0)]
or slice the iterable:
list(it.islice(windows, 0, None, 2))
# [(0, 1, 2, 3), (2, 3, 5, 0), (5, 0, 0, 0)]

Generate all permutations with separate limits for each index

Let's say we have a list, e.g., [3, 2, 1]. I would like to generate all permutations of that list in the form:
[1, 1, 1], [2, 1, 1], [3, 1, 1], [1, 2, 1], [2, 2, 1] , [3, 2, 1]
for any list of length n. This way, the value of the ith element of the original list is the upper limit for the value of the ith element of all the permutations.
I would also like to use a generator using yield, since the input list may be rather large (e.g., n = 30).
So far, I have been using something like this:
itertools.product(range(1, 5), repeat=5)
Which has the following output when used in a for loop:
(1, 1, 1, 1, 1), (1, 1, 1, 1, 2), (1, 1, 1, 1, 3), (1, 1, 1, 1, 4), (1, 1, 1, 2, 1), (1, 1, 1, 2, 2), (1, 1, 1, 2, 3), ...
However, I don't think it allows specifying custom limits for each element of the permutations.
Also, please note that the elements of the input list do not necessarily need to be consecutive numbers, so [25, 17, 10, 4] is a valid input.
This recursive function returns a generator in the desired order:
def f(limits):
    if not limits:
        yield ()
        return
    for l in f(limits[1:]):
        for i in range(1, limits[0]+1):
            yield (i,) + l
>>> print(list(f([3, 2, 1])))
[(1, 1, 1), (2, 1, 1), (3, 1, 1), (1, 2, 1), (2, 2, 1), (3, 2, 1)]
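You can filter the results of itertools.product (note that product(l, repeat=len(l)) only draws values from l itself, so this is correct only when l contains every integer from 1 to max(l), as [3, 2, 1] does):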
>>> from itertools import product
>>> l = [3,2,1]
>>> list(filter(lambda t: all(x<=y for x,y in zip(t,l)), product(l, repeat=len(l))))
[(3, 2, 1), (3, 1, 1), (2, 2, 1), (2, 1, 1), (1, 2, 1), (1, 1, 1)]
You're looking for the Cartesian product of a bunch of ranges whose upper limits are defined by your input list:
from itertools import product
lims = [3, 2, 1]
gen = product(*(range(1,lim+1) for lim in lims))
print(list(gen))
The result is
[(1, 1, 1), (1, 2, 1), (2, 1, 1), (2, 2, 1), (3, 1, 1), (3, 2, 1)]
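itertools.product varies the last position fastest, whereas the question lists the tuples with the first position varying fastest. One way to get that order, sketched here under the name bounded_tuples (an assumption for this example), is to build the product over the reversed limits and reverse each tuple:

```python
from itertools import product


def bounded_tuples(limits):
    """Lazily yield every tuple t with 1 <= t[i] <= limits[i],
    with the first position varying fastest."""
    ranges = (range(1, lim + 1) for lim in reversed(limits))
    for t in product(*ranges):
        yield t[::-1]


print(list(bounded_tuples([3, 2, 1])))
# [(1, 1, 1), (2, 1, 1), (3, 1, 1), (1, 2, 1), (2, 2, 1), (3, 2, 1)]
```

The result is still a generator, so limits such as [25, 17, 10, 4] can be iterated without storing all the tuples.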

numpy.searchsorted with more than one source

Let's say that I have two arrays in the form
a = [0, 0, 1, 1, 2, 3, 3, 3, 4, 4, 5, 6]
b = [1, 2, 1, 2, 1, 4, 7, 9, 4, 8, 1, 1]
As you can see, the arrays above are sorted when a and b are considered as columns of a combined array.
Now, I want to do a searchsorted on this combined array. For instance, if I search for (3, 7) (a = 3 and b = 7), I should get 6.
Whenever there are duplicate values in a, the search should continue with the values in b.
Is there a built-in numpy method to do this? If not, what would be an efficient way to do it, assuming that I have a million entries in my arrays?
I tried with numpy.recarray, to create one recarray with a and b and tried searching in it, but I am getting the following error.
TypeError: expected a readable buffer object
Any help is much appreciated.
You're almost there. It's just that numpy.record (which is what I assume you used, given the error message you received) isn't really what you want; just create a one-item record array:
>>> a_b = numpy.rec.fromarrays((a, b))
>>> a_b
rec.array([(0, 1), (0, 2), (1, 1), (1, 2), (2, 1), (3, 4), (3, 7), (3, 9),
(4, 4), (4, 8), (5, 1), (6, 1)],
dtype=[('f0', '<i8'), ('f1', '<i8')])
>>> numpy.searchsorted(a_b, numpy.array((3, 7), dtype=a_b.dtype))
6
It might also be useful to know that sort and argsort sort record arrays lexically, and there is also lexsort. An example using lexsort:
>>> random_idx = numpy.random.permutation(range(12))
>>> a = numpy.array(a)[random_idx]
>>> b = numpy.array(b)[random_idx]
>>> sorted_idx = numpy.lexsort((b, a))
>>> a[sorted_idx]
array([0, 0, 1, 1, 2, 3, 3, 3, 4, 4, 5, 6])
>>> b[sorted_idx]
array([1, 2, 1, 2, 1, 4, 7, 9, 4, 8, 1, 1])
Sorting record arrays:
>>> a_b = numpy.rec.fromarrays((a, b))
>>> a_b[a_b.argsort()]
rec.array([(0, 1), (0, 2), (1, 1), (1, 2), (2, 1), (3, 4), (3, 7), (3, 9),
(4, 4), (4, 8), (5, 1), (6, 1)],
dtype=[('f0', '<i8'), ('f1', '<i8')])
>>> a_b.sort()
>>> a_b
rec.array([(0, 1), (0, 2), (1, 1), (1, 2), (2, 1), (3, 4), (3, 7), (3, 9),
(4, 4), (4, 8), (5, 1), (6, 1)],
dtype=[('f0', '<i8'), ('f1', '<i8')])
You could use a repeated searchsorted from left and right:
left, right = np.searchsorted(a, 3, side='left'), np.searchsorted(a, 3, side='right')
index = left + np.searchsorted(b[left:right], 7)
This works for me (on Python 3, zip must be wrapped in list before it is passed to numpy.array):
>>> a = [0, 0, 1, 1, 2, 3, 3, 3, 4, 4, 5, 6]
>>> b = [1, 2, 1, 2, 1, 4, 7, 9, 4, 8, 1, 1]
>>> Z = numpy.array(list(zip(a, b)), dtype=[('a','int'), ('b','int')])
>>> Z.searchsorted(numpy.asarray((3,7), dtype=Z.dtype))
6
I think the trick might be to make sure the argument to searchsorted has the same dtype as the array. When I try Z.searchsorted((3, 7)) I get a segfault.
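Here's an interesting way to do it (though it's not the most efficient way, as I believe it's O(n) rather than O(log(n)) as ecatmur's answer would be; it is, however, more compact). It relies on NumPy ordering complex numbers lexicographically, by real part first and then by imaginary part: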
np.searchsorted(a + 1j*b, a_val + 1j*b_val)
Example:
>>> a = np.array([0, 0, 1, 1, 2, 3, 3, 3, 4, 4, 5, 6])
>>> b = np.array([1, 2, 1, 2, 1, 4, 7, 9, 4, 8, 1, 1])
>>> np.searchsorted(a + 1j*b, 4 + 1j*8)
9
Extension to n arrays:
import numpy as np
def searchsorted_multi(*args):
    # the last argument is the tuple of values to look up, one per array
    *arrays, v = args
    if len(v) != len(arrays):
        raise ValueError("need exactly one search value per array")
    l, r = 0, len(arrays[0])
    for vi, ai in zip(v, arrays):
        # search only inside the current run [l, r) and keep the offsets absolute
        lo = l + np.searchsorted(ai[l:r], vi, side='left')
        hi = l + np.searchsorted(ai[l:r], vi, side='right')
        l, r = lo, hi
    return l
if __name__ == "__main__":
    a = [0, 0, 1, 1, 2, 3, 3, 3, 4, 4, 5, 6]
    b = [1, 2, 1, 2, 1, 4, 7, 9, 4, 8, 1, 1]
    c = [1, 2, 1, 2, 1, 4, 7, 9, 4, 8, 1, 2]
    assert(searchsorted_multi(a, b, (3, 7)) == 6)
    assert(searchsorted_multi(a, b, (3, 0)) == 5)
    # (6, 1, 2) is the last row, at index 11
    assert(searchsorted_multi(a, b, c, (6, 1, 2)) == 11)
Or without numpy (on Python 3, zip returns an iterator, so convert it to a list first):
>>> import bisect
>>> a = [0, 0, 1, 1, 2, 3, 3, 3, 4, 4, 5, 6]
>>> b = [1, 2, 1, 2, 1, 4, 7, 9, 4, 8, 1, 1]
>>> bisect.bisect_left(list(zip(a,b)), (3,7))
6
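Building the zipped list costs O(n) time and memory on every call. A sketch that avoids it by bisecting a first and then bisecting b only inside the matching run, using the lo and hi arguments of the bisect functions (the name bisect_pair is an assumption for this example):

```python
import bisect


def bisect_pair(a, b, key):
    """Leftmost insertion index of key = (ka, kb) in the rows (a[i], b[i]),
    assuming the rows are sorted lexicographically."""
    ka, kb = key
    # Run of rows whose first column equals ka.
    lo = bisect.bisect_left(a, ka)
    hi = bisect.bisect_right(a, ka)
    # Search the second column only within that run; no copy is made.
    return bisect.bisect_left(b, kb, lo, hi)


a = [0, 0, 1, 1, 2, 3, 3, 3, 4, 4, 5, 6]
b = [1, 2, 1, 2, 1, 4, 7, 9, 4, 8, 1, 1]
print(bisect_pair(a, b, (3, 7)))  # 6
```

Both steps are binary searches, so the lookup stays O(log n) even with a million rows.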
