How to find the indices of a vectorised matrix numpy - python

I have an ndmatrix in numpy (n x n x n), which I vectorise in order to do some sampling of my data in a particular way, giving me (1 x n^3).
I would like to take the individual vectorised indices and convert them back to n-dimensional indices in the form (n x n x n). I'm not sure how NumPy actually vectorises matrices.
Can anyone advise?

Numpy has a function unravel_index which does pretty much that: given a set of 'flat' indices, it will return a tuple of arrays of indices in each dimension:
>>> indices = np.arange(25, dtype=int)
>>> np.unravel_index(indices, (5, 5))
(array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4,
4, 4], dtype=int64),
array([0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2,
3, 4], dtype=int64))
You can then zip them to get your original indices.
Be aware, however, that matrices can be stored as 'sequences of rows' (C convention, 'C') or 'sequences of columns' (Fortran convention, 'F'), with the corresponding conventions in higher dimensions. So [[1, 2], [3, 4]] is flattened into [1, 2, 3, 4] in 'C' order or [1, 3, 2, 4] in 'F' order. Note that ravel and flatten use 'C' order by default whatever the memory layout, unless you pass order='F' (or 'A' or 'K' to follow the layout). unravel_index takes an optional order parameter if you want to change the default (which is 'C'); use the same order that was used for flattening, for example:
>>> # Typically, transposition will change the order for
>>> # efficiency reasons: no need to change the data !
>>> n = np.random.random((2, 2, 2)).transpose()
>>> n.flags.f_contiguous
True
>>> n.flags.c_contiguous
False
>>> x, y, z = np.unravel_index([1,2,3,7], (2, 2, 2), order='F')
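For the n x n x n case in the question, the round trip might look like the sketch below. The array name `data` and the sampled flat positions are made up for illustration, and the default 'C' order is assumed for flattening. np.ravel_multi_index is the inverse of np.unravel_index.

```python
import numpy as np

n = 3
data = np.arange(n ** 3).reshape(n, n, n)  # illustrative n x n x n array
flat = data.ravel()                        # default 'C' (row-major) order

flat_idx = np.array([0, 5, 13, 26])        # arbitrary sampled flat positions
i, j, k = np.unravel_index(flat_idx, data.shape)

# The 3-D indices address the same elements as the flat indices
same = np.array_equal(data[i, j, k], flat[flat_idx])

# One (i, j, k) triple per sampled element
triples = list(zip(i.tolist(), j.tolist(), k.tolist()))

# ravel_multi_index converts the triples back to flat indices
back = np.ravel_multi_index((i, j, k), data.shape)
```

If the data was flattened with order='F', pass order='F' to both functions as well.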

Related

How do I extract rows of a 2D NumPy array by condition?

I have a 4*5 NumPy array and I want to retrieve rows in which all elements are less than 5.
arr = np.array([[0,2,3,4,5],[1,2,4,1,3], [2,2,5,4,6], [0,2,3,4,3]])
arr[np.where(arr[:,:] <= 4)]
expected output:
[[1,2,4,1,3],[0,2,3,4,3]]
actual output:
array([0, 2, 3, 4, 1, 2, 4, 1, 3, 2, 2, 4, 0, 2, 3, 4, 3])
Any help is appreciated!
This is actually quite simple. Just convert the entire array to booleans (each value is True if it's less than 5, False otherwise), and use np.all with axis=1 to return True for each row where all items are True:
>>> arr[np.all(arr < 5, axis=1)]
array([[1, 2, 4, 1, 3],
[0, 2, 3, 4, 3]])
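To see why this works, and why the np.where attempt returned a flat array, it can help to look at the intermediate masks. This is a small sketch using the array from the question; the names `elementwise`, `mask`, `rows` and `flat` are only illustrative.

```python
import numpy as np

arr = np.array([[0, 2, 3, 4, 5],
                [1, 2, 4, 1, 3],
                [2, 2, 5, 4, 6],
                [0, 2, 3, 4, 3]])

elementwise = arr < 5             # 2-D boolean array, same shape as arr
mask = elementwise.all(axis=1)    # one boolean per row
rows = arr[mask]                  # a 1-D mask on axis 0 keeps whole rows

# Indexing with the 2-D mask selects individual elements instead,
# which is why the attempt in the question produced a flat array
flat = arr[elementwise]
```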

Replacing values in Numpy array based on permutations

I have a large np array called X (size:32000) filled with duplicate values of 0, 1, 2, 3.
I want to replace each of the values(0, 1, 2, 3) with permutations of the following numbers: 0, 1, 2, 3, 4, 5
For example, 0, 1, 2, 3 can be replaced with following:
1, 5, 3, 4
5, 2, 4, 3
0, 5, 1, 4
and so on (there are 360 such permutations in total).
How can I take each of the 360 permutations and replace the 32000 values in X accordingly such that finally I have 360 versions of X for each permutation?
You can try the method numpy.choose:
import numpy as np
x = np.array([0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3,])
perm = [1, 5, 3, 4,]
x = np.choose(x, perm)
np.choose(x, perm) will choose a value from perm for each value of x, taking x as a list of indices. I recommend looking at the documentation since this function can lead to confusion.
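For the full task, one possible sketch (not necessarily the fastest) is to enumerate the 360 ordered selections with itertools.permutations and apply them all at once with integer-array indexing, which for this case performs the same lookup as np.choose. The short `x` below stands in for the 32000-element array, and `perms` and `all_versions` are illustrative names.

```python
import numpy as np
from itertools import permutations

x = np.array([0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3])  # stand-in for the real X

# Each row is one ordered choice of 4 values out of 0..5: shape (360, 4)
perms = np.array(list(permutations(range(6), 4)))

# Row p of the result is x with every value v replaced by perms[p, v]
all_versions = perms[:, x]        # shape (360, len(x))
```

With the real X this builds a 360 x 32000 array in one go; if that is too much memory, looping over the rows of `perms` and computing `perm[x]` one at a time gives the same results.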

Split a list into increasing sequences using itertools

I have a list with mixed sequences like
[1,2,3,4,5,2,3,4,1,2]
I want to know how I can use itertools to split the list into increasing sequences cutting the list at decreasing points. For instance the above would output
[[1, 2, 3, 4, 5], [2, 3, 4], [1, 2]]
This is obtained by noting that the sequence first decreases at 2, so we cut the first piece there; the next decrease is at 1, so we cut again there.
Another example is with the sequence
[3,2,1]
the output should be
[[3], [2], [1]]
In the event that the given sequence is increasing we return the same sequence. For example
[1,2,3]
returns the same result. i.e
[[1, 2, 3]]
For a repeating list like
[ 1, 2,2,2, 1, 2, 3, 3, 1,1,1, 2, 3, 4, 1, 2, 3, 4, 5, 6]
the output should be
[[1, 2, 2, 2], [1, 2, 3, 3], [1, 1, 1, 2, 3, 4], [1, 2, 3, 4, 5, 6]]
What I did to achieve this is define the following function
def splitter(L):
    result = []
    tmp = 0
    initialPoint = 0
    for i in range(len(L)):
        if L[i] < tmp:
            tmpp = L[initialPoint:i]
            result.append(tmpp)
            initialPoint = i
        tmp = L[i]
    result.append(L[initialPoint:])
    return result
The function is working 100% but what I need is to do the same with itertools so that I can improve efficiency of my code. Is there a way to do this with itertools package to avoid the explicit looping?
With numpy, you can use numpy.split, which takes the indices at which to split. Since you want to split where the value decreases, use numpy.diff to compute the differences, check where they are smaller than zero, and use numpy.where to retrieve the corresponding indices (adding 1 so that each new piece starts at the smaller value). An example with the last case in the question:
import numpy as np
lst = [ 1, 2,2,2, 1, 2, 3, 3, 1,1,1, 2, 3, 4, 1, 2, 3, 4, 5, 6]
np.split(lst, np.where(np.diff(lst) < 0)[0] + 1)
# [array([1, 2, 2, 2]),
# array([1, 2, 3, 3]),
# array([1, 1, 1, 2, 3, 4]),
# array([1, 2, 3, 4, 5, 6])]
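Psidom already has you covered with a good answer, but another solution would be to use scipy.signal.argrelmax to find the local maxima, then np.split. Note that argrelmax compares strictly by default, so it does not report plateaus of equal values or the first and last elements; it only matches the np.diff approach when every drop follows a strict increase.
import numpy as np
from scipy.signal import argrelmax
arr = np.random.randint(1000, size=10**6)
splits = np.split(arr, argrelmax(arr)[0]+1)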
Assume your original input array:
a = [1, 2, 3, 4, 5, 2, 3, 4, 1, 2]
First find the places where the splits shall occur:
p = [ i+1 for i, (x, y) in enumerate(zip(a, a[1:])) if x > y ]
Then create slices for each such split:
print([ a[m:n] for m, n in zip([ 0 ] + p, p + [ None ]) ])
This will print this:
[[1, 2, 3, 4, 5], [2, 3, 4], [1, 2]]
I suggest using more descriptive names than p, n, m, etc. ;-)
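Since the question asked specifically about itertools, here is one possible sketch. The function name `split_increasing` is made up for illustration. It numbers the runs with itertools.accumulate and groups on that number with itertools.groupby; it still visits every element, so it is not guaranteed to be faster than the explicit loop.

```python
from itertools import accumulate, chain, groupby


def split_increasing(seq):
    """Split seq into runs, starting a new run wherever a value drops."""
    seq = list(seq)
    # 1 marks a position whose value is smaller than its predecessor
    drops = chain([0], (int(b < a) for a, b in zip(seq, seq[1:])))
    # The running total of drops gives each element the number of its run
    run_ids = accumulate(drops)
    return [[value for _, value in group]
            for _, group in groupby(zip(run_ids, seq), key=lambda pair: pair[0])]


result = split_increasing([1, 2, 3, 4, 5, 2, 3, 4, 1, 2])
```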

Finding differences between all values in a List

I want to find the differences between all values in a numpy array and append them to a new list.
Example: a = [1,4,2,6]
result : newlist= [3,1,5,3,2,2,1,2,4,5,2,4]
i.e. for each value i of a, determine the difference between i and each of the other values in the list.
At this point I have been unable to find a solution.
You can do this:
a = [1,4,2,6]
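# Compare positions rather than values, so duplicate values in a are not skipped
newlist = [abs(x - y) for i, x in enumerate(a) for j, y in enumerate(a) if i != j]
Output:
print(newlist)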
[3, 1, 5, 3, 2, 2, 1, 2, 4, 5, 2, 4]
I believe what you are trying to do is to calculate absolute differences between elements of the input list, but excluding the self-differences. So, with that idea, this could be one vectorized approach also known as array programming -
import numpy as np
# Input list
a = [1,4,2,6]
# Convert input list to a numpy array
arr = np.array(a)
# Calculate absolute differences between each element
# against all elements to give us a 2D array
sub_arr = np.abs(arr[:,None] - arr)
# Get diagonal indices for the 2D array
N = arr.size
rem_idx = np.arange(N)*(N+1)
# Remove the diagonal elements for the final output
out = np.delete(sub_arr,rem_idx)
Sample run to show the outputs at each step -
In [60]: a
Out[60]: [1, 4, 2, 6]
In [61]: arr
Out[61]: array([1, 4, 2, 6])
In [62]: sub_arr
Out[62]:
array([[0, 3, 1, 5],
[3, 0, 2, 2],
[1, 2, 0, 4],
[5, 2, 4, 0]])
In [63]: rem_idx
Out[63]: array([ 0, 5, 10, 15])
In [64]: out
Out[64]: array([3, 1, 5, 3, 2, 2, 1, 2, 4, 5, 2, 4])
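As a variation on the same idea, the diagonal can also be dropped with a boolean mask instead of computing flat indices for np.delete. This is only a sketch; `diffs` and `off_diagonal` are illustrative names.

```python
import numpy as np

a = [1, 4, 2, 6]
arr = np.array(a)

# Pairwise absolute differences, shape (N, N)
diffs = np.abs(arr[:, None] - arr)

# True everywhere except on the diagonal (the self-differences)
off_diagonal = ~np.eye(arr.size, dtype=bool)

# Boolean indexing returns the selected entries in row-major order
out = diffs[off_diagonal]
```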

Add arrays to the numpy array

I am trying to add newly created arrays to another numpy array, but I'm doing something wrong. What I want is to add multiple arrays like numpy.array([0, 1, 2, 3]) to an already created array, so I could get something like this:
x = numpy.array([])
for i in numpy.arange(5):
    y = numpy.array([0, 1, 2, 3])
    x = numpy.append(x, y)
result:
x = [0, 1, 2, 3],
[0, 1, 2, 3],
[0, 1, 2, 3],
[0, 1, 2, 3],
[0, 1, 2, 3]
However, with the loop shown above I get this:
x = [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]
Try this:
x = []
for i in range(5):
    y = numpy.array([0, 1, 2, 3])
    x.append(y)
x = numpy.array(x)
or:
N = 5
x = numpy.zeros((N, 4))
for i in range(N):
    x[i] = numpy.array([0, 1, 2, 3])
Here I avoid numpy.append and numpy.vstack inside the loop because they can be quite slow. Every call to numpy.append or numpy.vstack allocates a new array and copies both x and y into it. If you use a list to hold the rows of the array until the loop is over, the data just gets copied once at the end.
If neither of the above works for you, you could do something like this (but it'll be slower):
x = numpy.zeros((0, 4))
for i in range(5):
    y = numpy.array([0, 1, 2, 3])
    # vstack takes a single sequence of arrays, hence the extra parentheses
    x = numpy.vstack((x, y))
append adds to the end of the array. Since x only has one dimension (it has shape (0,) to begin with) it can grow only in the way you observe.
It's not generally the right tool to use to build multi-dimensional arrays incrementally as you're doing. You can append along a specific axis (and so stack arrays), but you need to ensure that both arrays have the same number of dimensions and matching sizes along every other axis. On top of this, the array you're appending to must be copied each time.
A more succinct way to build your required array could be to use np.tile instead:
>>> np.tile([1, 2, 3, 4, 5], (5, 1)) # (5,1) means 5/1 copies along axis 0/1
array([[1, 2, 3, 4, 5],
[1, 2, 3, 4, 5],
[1, 2, 3, 4, 5],
[1, 2, 3, 4, 5],
[1, 2, 3, 4, 5]])
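If the rows really are produced one at a time, two other options can be sketched as follows: collect the rows in a list and call np.stack once, or use np.append with axis=0 on a 2-D array (slower, since the array is copied on every call). The names `rows`, `stacked` and `grown` are illustrative.

```python
import numpy as np

rows = [np.array([0, 1, 2, 3]) for _ in range(5)]

# Stack once at the end: creates a new leading axis, shape (5, 4)
stacked = np.stack(rows)

# np.append only keeps the 2-D shape when axis is given and the
# dimensions match, so each row is reshaped to (1, 4) first
grown = np.empty((0, 4), dtype=int)
for row in rows:
    grown = np.append(grown, row[None, :], axis=0)
```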
