NumPy Argument Array Split

NumPy Argument Array Split - python

The function np.array_split(x, n) splits the array x into n roughly equally sized chunks. I am wondering what the most convenient form of this if one wants to obtain the indices of where the array is to be split. So an array of
ix = [(start1, end1), (start2, end2), ... (startn, endn)]
such that
np.array_split(x, n)[i] == x[ix[i][0]:ix[i][1]]
I can think of a few awkward ways of obtaining this but nothing simple.

You know the lengths of the sub arrays. Just use them to find start and end indices:
a = np.arange(10)
res = np.array_split(a, 3)
end = list(np.add.accumulate([len(x) for x in res]))
start = [0] + end[:-1]
ix = list(zip(start, end))
Now, the indices are:
>>> ix
[(0, 4), (4, 7), (7, 10)]
for this result:
>>> res
[array([0, 1, 2, 3]), array([4, 5, 6]), array([7, 8, 9])]
or:
for i in range(3):
assert np.all(np.array_split(a, 3)[i] == a[ix[i][0]:ix[i][1]])

Related

How can I convert a 2D array into a tuple in python 3?

type conversion
I have a numpy array of dimensions 667000 * 3 and I want to convert it to a 667000*3 tuple.
In smaller dimensions it would be like converting arr to t, where:
arr= [[1,2,3],[4,5,6],[7,8,9],[10,11,12]]
t= ((1,2,3),(4,5,6),(7,8,9),(10,11,12))
I have tried :
t = tuple((map(tuple, sub)) for sub in arr)
but didn't work.
Can you help me how can I do that in python 3?

You do not need to iterate over the sub, just first wrap every sublist in a tuple, and then wrap that result in a tuple, like:
tuple(map(tuple, arr))
For example:
>>> arr = [[1,2,3],[4,5,6],[7,8,9],[10,11,12]]
>>> tuple(map(tuple, arr))
((1, 2, 3), (4, 5, 6), (7, 8, 9), (10, 11, 12))
Here map will thus produce an generator that for each sublist (like [1, 2, 3]) will convert it to a tuple (like (1, 2, 3)). The outer tuple(..) constructor than wraps the elements of this generator in a tuple.
Based on an experiment, converting a 667000×3 matrix is feasible. When I run this for an np.arange(667000*3) and np.random.rand(667000, 3) it requires 0.512 seconds:
>>> arr = np.random.rand(667000,3)
>>> timeit.timeit(lambda: tuple(map(tuple, arr)), number=10)
5.120870679005748
>>> arr = np.arange(667000*3).reshape(-1, 3)
>>> timeit.timeit(lambda: tuple(map(tuple, arr)), number=10)
5.109966446005274

A simple iterative solution to your problem would be to use a generator expression:
tuple(tuple(i) for i in arr)
# ((1, 2, 3), (4, 5, 6), (7, 8, 9), (10, 11, 12))

Extract a larger slice than the numpy array's size

I want to extract a slice of length 10, beginning at index 2, of a numpy array A:
import numpy
A = numpy.array([1,3,5,3,9])
def bigslice(A, begin_at, length):
a = A[begin_at:begin_at + length]
while len(a) + len(A) < length:
a = numpy.concatenate((a,A))
return numpy.concatenate((a, A[:length-len(a)]))
print bigslice(A, begin_at = 2, length = 10)
#[5,3,9,1,3,5,3,9,1,3]
This is correct. But I'm looking for a more efficient way to do this (especially when I'll have arrays of thousands of elements at the end) : I suspect the concatenate used here to recreate lots of new temporary arrays, and that would be un-efficient.
How to do the same thing more efficiently ?

Since the middle part of the array is already known to you (i.e. n repetitions of the full array), you can simply construct the middle portion using np.tile:
def cyclical_slice(A, start, length):
arr_l = len(A)
middle = np.tile(A, length // arr_l)
return np.array([A[start:], middle, A[0:length - len(middle)]])

Your code doesn't seem to guarantee that you get a slice of length length, e.g.
>>> A = numpy.array([1,3,5,3,9])
>>> bigslice(A, 0, 3)
array([1, 3, 5, 3, 9, 1, 3, 5])
Assuming that this is an oversight, maybe you could use np.pad, e.g.
def wpad(A, begin_at, length):
to_pad = max(length + begin_at - len(A), 0)
return np.pad(A, (0, to_pad), mode='wrap')[begin_at:begin_at+length]
which gives
>>> wpad(A, 0, 3)
array([1, 3, 5])
>>> wpad(A, 0, 10)
array([1, 3, 5, 3, 9, 1, 3, 5, 3, 9])
>>> wpad(A, 2, 10)
array([5, 3, 9, 1, 3, 5, 3, 9, 1, 3])
and so on.

merge two numpy.array without a loop

I have a two numpy.arrays, I want to get following result efficiently
1.add the element's of b to a's sub-array
a=numpy.array([(1,2,3),(1,2,3)])
b=numpy.array([0,0])
->
c=[(0,1,2,3),(0,1,2,3)]
code in a loop
a=numpy.array([(1,2,3),(1,2,3)])
b=numpy.array([(0,0)])
c=numpy.zeros(2 , 4)
idx=0
for x in a:
c[idx]=(a[idx][0],a[idx][1],a[idx][2], b[idx])
idx = idx+1
and
2. Get an 2-D array with dimension(a.dim*b.dim, 2) from two 1-D arrays
a=numpy.array([(1,2)])
b=numpy.array([(3,4)])
->
c=[(1,3),(1,4),(2,3),(2,4)]
code in a loop
a=numpy.array([(1,2)])
b=numpy.array([(3,4)])
c=numpy.zeros(a.size*b.size , 2)
idx=0
for x in a:
for y in b:
c[idx]=(x,y)
idx = idx+1

For the first problem, you can define b differently and use numpy.hstack:
a = numpy.array([(1,2,3),(1,2,3)])
b = numpy.array([[0],[0]])
numpy.hstack((b,a))
Regarding the second problem, I would probably use sza's answer and create the numpy array from that result, if necessary. That technique was suggested in an old Stack Overflow question.

For the first one, you can do
>>> a=numpy.array([(1,2,3),(1,2,3)])
>>> b=numpy.array([0,0])
>>> [tuple(numpy.insert(x, 0, y)) for (x,y) in zip(a,b)]
[(0, 1, 2, 3), (0, 1, 2, 3)]
For the 2nd one, you can get the 2-D array like this
>>> a=numpy.array([(1,2)])
>>> b=numpy.array([(3,4)])
>>> import itertools
>>> c = list(itertools.product(a.tolist()[0], b.tolist()[0]))
[(1, 3), (1, 4), (2, 3), (2, 4)]

Python shuffle such that position will never repeat

I'd like to do a random shuffle of a list but with one condition: an element can never be in the same original position after the shuffle.
Is there a one line way to do such in python for a list?
Example:
list_ex = [1,2,3]
each of the following shuffled lists should have the same probability of being sampled after the shuffle:
list_ex_shuffled = [2,3,1]
list_ex_shuffled = [3,1,2]
but the permutations [1,2,3], [1,3,2], [2,1,3] and [3,2,1] are not allowed since all of them repeat one of the elements positions.
NOTE: Each element in the list_ex is a unique id. No repetition of the same element is allowed.

Randomize in a loop and keep rejecting the results until your condition is satisfied:
import random
def shuffle_list(some_list):
randomized_list = some_list[:]
while True:
random.shuffle(randomized_list)
for a, b in zip(some_list, randomized_list):
if a == b:
break
else:
return randomized_list

I'd describe such shuffles as 'permutations with no fixed points'. They're also known as derangements.
The probability that a random permutation is a derangement is approximately 1/e (fun to prove). This is true however long the list. Thus an obvious algorithm to give a random derangement is to shuffle the cards normally, and keep shuffling until you have a derangement. The expected number of necessary shuffles is about 3, and it's rare you'll have to shuffle more than ten times.
(1-1/e)**11 < 1%
Suppose there are n people at a party, each of whom brought an umbrella. At the end of the party, each person takes an umbrella at random from the basket. What is the probability that no-one holds their own umbrella?

You could generate all possible valid shufflings:
>>> list_ex = [1,2,3]
>>> import itertools
>>> list(itertools.ifilter(lambda p: not any(i1==i2 for i1,i2 in zip(list_ex, p)),
... itertools.permutations(list_ex, len(list_ex))))
[(2, 3, 1), (3, 1, 2)]
For some other sequence:
>>> list_ex = [7,8,9,0]
>>> list(itertools.ifilter(lambda p: not any(i1==i2 for i1,i2 in zip(list_ex, p)),
... itertools.permutations(list_ex, len(list_ex))))
[(8, 7, 0, 9), (8, 9, 0, 7), (8, 0, 7, 9), (9, 7, 0, 8), (9, 0, 7, 8), (9, 0, 8, 7), (0, 7, 8, 9), (0, 9, 7, 8), (0, 9, 8, 7)]
You could also make this a bit more efficient by short-circuiting the iterator if you just want one result:
>>> list_ex = [1,2,3]
>>> i = itertools.ifilter(lambda p: not any(i1==i2 for i1,i2 in zip(list_ex, p)),
... itertools.permutations(list_ex, len(list_ex)))
>>> next(i)
(2, 3, 1)
But, it would not be a random choice. You'd have to generate all of them and choose one for it to be an actual random result:
>>> list_ex = [1,2,3]
>>> i = itertools.ifilter(lambda p: not any(i1==i2 for i1,i2 in zip(list_ex, p)),
... itertools.permutations(list_ex, len(list_ex)))
>>> import random
>>> random.choice(list(i))
(2, 3, 1)

Here is another take on this. You can pick one solution or another depending on your needs. This is not a one liner but shuffles the indices of elements instead of the elements themselves. Thus, the original list may have duplicate values or values of types that cannot be compared or may be expensive to compare.
#! /usr/bin/env python
import random
def shuffled_values(data):
list_length = len(data)
candidate = range(list_length)
while True:
random.shuffle(candidate)
if not any(i==j for i,j in zip(candidate, range(list_length))):
yield [data[i] for i in candidate]
list_ex = [1, 2, 3]
list_gen = shuffled_values(list_ex)
for i in range(0, 10):
print list_gen.next()
This gives:
[2, 3, 1]
[3, 1, 2]
[3, 1, 2]
[2, 3, 1]
[3, 1, 2]
[3, 1, 2]
[2, 3, 1]
[2, 3, 1]
[3, 1, 2]
[2, 3, 1]
If list_ex is [2, 2, 2], this method will keep yielding [2, 2, 2] over and over. The other solutions will give you empty lists. I am not sure what you want in this case.

Use Knuth-Durstenfeld to shuffle the list. As long as it is found to be in the original position during the shuffling process, a new shuffling process is started from the beginning until it returns to a qualified arrangement. The time complexity of this algorithm is the smallest constant term:
def _random_derangement(x: list, randint: Callable[[int, int], int]) -> None:
'''
Random derangement list x in place, and return None.
An element can never be in the same original position after the shuffle. provides uniform distribution over permutations.
The formal parameter randint requires a callable object such as rand_int(b, a) that generates a random integer within the specified closed interval.
'''
from collections import namedtuple
sequence_type = namedtuple('sequence_type', ('sequence_number', 'elem'))
x_length = len(x)
if x_length > 1:
for i in range(x_length):
x[i] = sequence_type(sequence_number = i, elem = x[i])
end_label = x_length - 1
while True:
for i in range(end_label, 0, -1):
random_location = randint(i, 0)
if x[random_location].sequence_number != i:
x[i], x[random_location] = x[random_location], x[i]
else:
break
else:
if x[0].sequence_number != 0: break
for i in range(x_length):
x[i] = x[i].elem
complete_shuffle

Here's another algorithm. Take cards at random. If your ith card is card i, put it back and try again. Only problem, what if when you get to the last card it's the one you don't want. Swap it with one of the others.
I think this is fair (uniformally random).
import random
def permutation_without_fixed_points(n):
if n == 1:
raise ArgumentError, "n must be greater than 1"
result = []
remaining = range(n)
i = 0
while remaining:
if remaining == [n-1]:
break
x = i
while x == i:
j = random.randrange(len(remaining))
x = remaining[j]
remaining.pop(j)
result.append(x)
i += 1
if remaining == [n-1]:
j = random.randrange(n-1)
result.append(result[j])
result[j] = n
return result

Creating dynamic sublists from a list /sequence in python

Im trying to write a function that creates set of dynamic sublists each containing 5 elements from a list passed to it.Here's my attempt at the code
def sublists(seq):
i=0
x=[]
while i<len(seq)-1:
j=0
while j<5:
X.append(seq[i]) # How do I change X after it reaches size 5?
#return set of sublists
EDIT:
Sample input: [1,2,3,4,5,6,7,8,9,10]
Expected output: [[1,2,3,4,5],[6,7,8,9,10]]

Well, for starters, you'll need to (or at least should) have two lists, a temporary one and a permanent one that you return (Also you will need to increase j and i or, more practically, use a for loop, but I assume you just forgot to post that).
EDIT removed first code as the style given doesn't match easily with the expected results, see other two possibilities.
Or, more sensibly:
def sublists(seq):
x=[]
for i in range(0,len(seq),5):
x.append(seq[i:i+5])
return x
Or, more sensibly again, a simple list comprehension:
def sublists(seq):
return [seq[i:i+5] for i in range(0,len(seq),5)]
When given the list:
l = [1,2,3,4,5,6,7,8,9,10]
They will return
[[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]

Have you considered using itertools.combinations(...)?
For example:
>>> from itertools import combinations
>>> l = [1,2,3,4,5,6]
>>> list(combinations(l, 5))
[(1, 2, 3, 4, 5), (1, 2, 3, 4, 6), (1, 2, 3, 5, 6), (1, 2, 4, 5, 6), (1, 3, 4, 5, 6), (2, 3, 4, 5, 6)]

By "dynamic sublists", do you mean break up the list into groups of five elements? This is similar to your approach:
def sublists(lst, n):
ret = []
i = 0
while i < len(lst):
ret.append(seq[i:i+n])
i += n
return ret
Or, using iterators:
def sublists(seq, n):
it = iter(seq)
while True:
r = list(itertools.islice(it, 5))
if not r:
break
yield r
which will return an iterator of lists over list of length up to five. (If you took out the list call, weird things would happen if you didn't access the iterators in the same order.)

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

NumPy Argument Array Split - python

Related

How can I convert a 2D array into a tuple in python 3?

Extract a larger slice than the numpy array's size

merge two numpy.array without a loop

Python shuffle such that position will never repeat

Creating dynamic sublists from a list /sequence in python

Categories

Resources