Unravel Index numpy - own implementation - python

I am trying to implement np.unravel_index and np.ravel_multi_index on my own.
For np.ravel_multi_index I could write this short function:
def coord2index(coord, shape):
    return np.concatenate((np.asarray(shape[1:])[::-1].cumprod()[::-1], [1])).dot(coord)
But I am struggling to find a similarly short (one-liner) function for np.unravel_index. Does anybody have an idea?

This is one possible implementation:
import numpy as np
def index2coord(index, shape):
    return ((np.expand_dims(index, 1) // np.r_[1, shape[:0:-1]].cumprod()[::-1]) % shape).T
shape = (2, 3, 4)
coord = [[0, 1], [2, 0], [1, 3]]
print(index2coord(coord2index(coord, shape), shape))
# [[0 1]
#  [2 0]
#  [1 3]]
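To see why the one-liners work, it can help to spell out the row-major stride arithmetic in plain Python. The following is a minimal sketch and not part of the original answer: the helper names strides_for, ravel_one and unravel_one are made up for illustration, and it handles a single coordinate at a time rather than arrays of coordinates.

```python
def strides_for(shape):
    """Row-major strides in elements: for (2, 3, 4) this is [12, 4, 1]."""
    strides = [1] * len(shape)
    for axis in range(len(shape) - 2, -1, -1):
        strides[axis] = strides[axis + 1] * shape[axis + 1]
    return strides


def ravel_one(coord, shape):
    """Flat index = dot product of the coordinate with the strides."""
    return sum(c * s for c, s in zip(coord, strides_for(shape)))


def unravel_one(index, shape):
    """Invert ravel_one: integer-divide by each stride, then wrap by the axis size."""
    return tuple((index // s) % n for s, n in zip(strides_for(shape), shape))


shape = (2, 3, 4)
print(ravel_one((1, 0, 3), shape))   # 15
print(unravel_one(15, shape))        # (1, 0, 3)
```

The NumPy one-liners do the same thing for many coordinates at once: the reversed cumprod builds the strides, dot performs the weighted sum, and // followed by % performs the inversion.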


Numpy double-slice assignment with integer indexing followed by boolean indexing

I already know that NumPy "double-slice" indexing with fancy indexing creates copies instead of views, and that the solution seems to be to convert it into a single indexing operation. However, I am facing a particular problem where I need to deal with integer indexing followed by boolean indexing, and I am at a loss as to what to do. The problem (simplified) is as follows:
a = np.random.randn(2, 3, 4, 4)
idx_x = np.array([[1, 2], [1, 2], [1, 2]])
idx_y = np.array([[0, 0], [1, 1], [2, 2]])
print(a[..., idx_y, idx_x].shape) # (2, 3, 3, 2)
mask = (np.random.randn(2, 3, 3, 2) > 0)
a[..., idx_y, idx_x][mask] = 1 # assignment doesn't work
How can I make the assignment work?
Not sure, but one idea is to do the broadcasting manually and apply the mask to the broadcast indices, just like Tim suggests. idx_x and idx_y both have the same shape (3, 2); once raveled, each is broadcast to shape (6, 6), i.e. the (3*2)^2 entries of the Cartesian product.
x = np.broadcast_to(idx_x.ravel(), (6,6))
y = np.broadcast_to(idx_y.ravel(), (6,6))
# this should be the same as
x,y = np.meshgrid(idx_x, idx_y)
Now reshape the mask to the broadcasted indices and use it to select
mask = mask.reshape(6,6)
a[..., x[mask], y[mask]] = 1
The assignment now works, but I am not sure if this is the exact assignment you wanted.
Ok apparently I am making things complicated. No need to combine the indexing. The following code solves the problem elegantly:
b = a[..., idx_y, idx_x]
b[mask] = 1
a[..., idx_y, idx_x] = b
print(a[..., idx_y, idx_x][mask]) # all 1s
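A small self-contained check can make the copy-versus-view behaviour concrete. This is only a sketch: the all-zeros array and the deterministic mask are arbitrary choices made here so the outcome is reproducible, and the np.nonzero variant is an alternative I am adding for comparison, not something taken from the answers.

```python
import numpy as np

a = np.zeros((2, 3, 4, 4))
idx_x = np.array([[1, 2], [1, 2], [1, 2]])
idx_y = np.array([[0, 0], [1, 1], [2, 2]])

# Deterministic mask (arbitrary for this sketch): True at even flat positions.
mask = (np.arange(2 * 3 * 3 * 2) % 2 == 0).reshape(2, 3, 3, 2)

# Chained indexing: the fancy index returns a copy, so `a` stays untouched.
a[..., idx_y, idx_x][mask] = 1
unchanged = bool(a.sum() == 0)

# Read-modify-write: edit the copy, then write it back with one indexing step.
b = a[..., idx_y, idx_x]
b[mask] = 1
a[..., idx_y, idx_x] = b

# Alternative: turn the mask into explicit coordinates and index once.
a2 = np.zeros_like(a)
i, j, p, q = np.nonzero(mask)
a2[i, j, idx_y[p, q], idx_x[p, q]] = 1

print(unchanged, np.array_equal(a, a2), int(a.sum()) == int(mask.sum()))
```

Both variants rely on the (y, x) pairs being unique; if the index arrays contained repeated pairs, later writes would overwrite earlier ones.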
EDIT: Use @Kevin's solution, which actually gets the dimensions correct!
I haven't tried it specifically on your sample code but I had a similar issue before. I think I solved it by applying the mask to the indices instead, something like:
a[..., idx_y[mask], idx_x[mask]] = 1
That way, NumPy can assign the values to the array a correctly.
EDIT2: Posting some test code here, since comments remove formatting.
a = np.arange(27).reshape([3, 3, 3])
ind_x = np.array([[0, 0], [1, 2]])
ind_y = np.array([[1, 2], [1, 1]])
x = np.broadcast_to(ind_x.ravel(), (4, 4))
y = np.broadcast_to(ind_y.ravel(), (4, 4)).T
# x1, y2 = np.meshgrid(ind_x, ind_y) # above should be the same as this
mask = a[:, ind_y, ind_x] % 2 == 0 # what should this reshape to?
# a[..., x[mask], y[mask]] = 1 # Then you can mask away (may also need to reshape a or the masked x or y)

Numpy reshape - automatic filling or removal

I would like to find a reshape function that can transform arrays of different shapes into arrays of the same shape. Let me explain:
import numpy as np
a = np.array([[[1,2,3,3],[1,2,3,3]],[[1,2,3,3],[1,2,3,3]]])
b = np.array([[[1,2,3,3],[1,2,3,3]],[[1,2,3,3],[1,2,3,3]],[[1,2,3,3],[1,2,3,4]]])
c = np.array([[[1,2,3,3],[1,2,3,3]]])
I would like to be able to make the shapes of b and c equal to the shape of a. However, np.reshape throws an error because, as explained here (Numpy resize or Numpy reshape), it requires the total number of elements to stay the same.
I would like some version of that function that adds zeros at the start of the first dimension if the array is smaller, or removes the start if the array is bigger. For my example, the result would look like this:
b = np.array([[[1,2,3,3],[1,2,3,3]],[[1,2,3,3],[1,2,3,4]]])
c = np.array([[[0,0,0,0],[0,0,0,0]],[[1,2,3,3],[1,2,3,3]]])
Do I need to write my own function to do that?
This is similar to the other solution but will also work if the lower dimensions don't match:
def custom_reshape(a, b):
    result = np.zeros_like(a).ravel()
    result[-min(a.size, b.size):] = b.ravel()[-min(a.size, b.size):]
    return result.reshape(a.shape)
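As a quick illustration of the "lower dimensions don't match" case, here is a small sketch. The input arrays are made up for the example; the function body is the same logic as above, with the repeated min() stored in a local variable.

```python
import numpy as np


def custom_reshape(a, b):
    # Fill a flat, zeroed buffer of a's size from the end with the tail of b.
    result = np.zeros_like(a).ravel()
    n = min(a.size, b.size)
    result[-n:] = b.ravel()[-n:]
    return result.reshape(a.shape)


a = np.zeros((2, 2, 4), dtype=int)
b = np.arange(1, 13).reshape(2, 6)   # different number of axes and sizes
out = custom_reshape(a, b)

print(out.shape)     # (2, 2, 4)
print(out.ravel())   # four leading zeros, then 1..12
```

Because everything happens on the flattened data, elements of b do not keep their original row and column positions when the trailing dimensions differ; whether that is acceptable depends on the use case.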
I would write a function like this:
def align(a, b):
    out = np.zeros_like(a)
    x = min(a.shape[0], b.shape[0])
    out[-x:] = b[-x:]
    return out
align(a, b)
# array([[[1, 2, 3, 3],
#         [1, 2, 3, 3]],
#        [[1, 2, 3, 3],
#         [1, 2, 3, 4]]])
align(a, c)
# array([[[0, 0, 0, 0],
#         [0, 0, 0, 0]],
#        [[1, 2, 3, 3],
#         [1, 2, 3, 3]]])

How to use np.where to return a list of tuples with all positions in a 2D numpy array equal to zero?

I am learning about NumPy, and as an exercise I have to create a function possibilities that takes a 2D NumPy array of integers as input and returns a list of tuples with the positions where the values are zero. I've been told that NumPy has a function, where, that can help. I read the docs but couldn't understand how to use it for this task, so I did it with for loops like this:
def possibilities(board):
    not_occupied = []
    for i in range(len(board)):
        for j in range(len(board[0])):
            if board[i][j] == 0:
                not_occupied.append((i, j))
    return not_occupied
The board is something like this:
board = [[1,2,0],[0,0,1],[2,0,1]]
How could I use numpy where to do that instead?
You could use argwhere:
import numpy as np
board = [[1, 2, 0],
         [0, 0, 1],
         [2, 0, 1]]
result = np.argwhere(np.array(board) == 0).tolist()
# [[0, 2], [1, 0], [1, 1], [2, 1]]
If the coordinates must be tuples you could do:
result = [tuple(coord) for coord in np.argwhere(np.array(board) == 0).tolist()]
# [(0, 2), (1, 0), (1, 1), (2, 1)]
You could use the zip function to re-pair the result of where (board must be an array for the comparison to work element-wise):
list(zip(*np.where(np.array(board) == 0)))
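Called with only a condition, np.where returns one index array per axis, which is why zip can pair them back up into coordinates. A minimal sketch of the whole exercise follows; converting with tolist() first is an optional extra step chosen here so the tuples hold plain Python ints rather than NumPy integers.

```python
import numpy as np

board = np.array([[1, 2, 0],
                  [0, 0, 1],
                  [2, 0, 1]])

rows, cols = np.where(board == 0)   # one index array per axis
positions = list(zip(rows.tolist(), cols.tolist()))

print(positions)   # [(0, 2), (1, 0), (1, 1), (2, 1)]
```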

count overlap between two numpy arrays [duplicate]

I want to get the intersecting (common) rows across two 2D numpy arrays. E.g., if the following arrays are passed as inputs:
array([[1, 4],
       [2, 5],
       [3, 6]])
array([[1, 4],
       [3, 6],
       [7, 8]])
the output should be:
array([[1, 4],
       [3, 6]])
I know how to do this with loops. I'm looking at a Pythonic/Numpy way to do this.
For short arrays, using sets is probably the clearest and most readable way to do it.
Another way is to use numpy.intersect1d. You'll have to trick it into treating the rows as a single value, though... This makes things a bit less readable...
import numpy as np
A = np.array([[1,4],[2,5],[3,6]])
B = np.array([[1,4],[3,6],[7,8]])
nrows, ncols = A.shape
dtype = {'names': ['f{}'.format(i) for i in range(ncols)],
         'formats': ncols * [A.dtype]}
C = np.intersect1d(A.view(dtype), B.view(dtype))
# This last bit is optional if you're okay with "C" being a structured array...
C = C.view(A.dtype).reshape(-1, ncols)
For large arrays, this should be considerably faster than using sets.
You could use Python's sets:
>>> import numpy as np
>>> A = np.array([[1,4],[2,5],[3,6]])
>>> B = np.array([[1,4],[3,6],[7,8]])
>>> aset = set([tuple(x) for x in A])
>>> bset = set([tuple(x) for x in B])
>>> np.array([x for x in aset & bset])
array([[1, 4],
[3, 6]])
As Rob Cowie points out, this can be done more concisely as
np.array([x for x in set(tuple(x) for x in A) & set(tuple(x) for x in B)])
There's probably a way to do this without all the going back and forth from arrays to tuples, but it's not coming to me right now.
I could not understand why there is no suggested pure numpy way to get this working. So I found one, that uses numpy broadcast. The basic idea is to transform one of the arrays to 3d by axes swapping. Let's construct 2 arrays:
a = np.random.randint(10, size=(5, 3))
b = np.zeros_like(a)
b[:4, :] = a[np.random.randint(a.shape[0], size=4), :]
With my run it gave:
a=array([[5, 6, 3],
[8, 1, 0],
[2, 1, 4],
[8, 0, 6],
[6, 7, 6]])
b=array([[2, 1, 4],
[2, 1, 4],
[6, 7, 6],
[5, 6, 3],
[0, 0, 0]])
The steps are (arrays can be interchanged) :
# a is n x m and b is k x m
c = np.swapaxes(a[:, :, None], 1, 2) == b  # transform a to n x 1 x m
# c has n x k x m dimensions due to comparison broadcasting
# c[i, j, :] holds the element-wise comparison between a[i, :] and b[j, :]
# Decrease dimension to n x k with product:
c = np.prod(c, axis=2)
# To get around duplicates: //
# Calculate cumulative sum along the first (n) dimension
c = c * np.cumsum(c, axis=0)
# compare with 1, so that only the first occurrence of a duplicated row stays True
c = c == 1
# //
# sum along the k dimension, so that a vector of length n is produced
c = np.sum(c, axis=1).astype(bool)
# The intersection between a and b is a[c]
result = a[c]
As a two-line function, to reduce memory use (correct me if I'm wrong):
def array_row_intersection(a, b):
    tmp = np.prod(np.swapaxes(a[:, :, None], 1, 2) == b, axis=2)
    return a[np.sum(np.cumsum(tmp, axis=0) * tmp == 1, axis=1).astype(bool)]
which gave result for my example:
result=array([[5, 6, 3],
[2, 1, 4],
[6, 7, 6]])
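To see what the cumulative-sum step does with duplicates, here is a small sketch on made-up input where one row of a is repeated. It assumes that tmp is the n x k match matrix obtained from the product step described above.

```python
import numpy as np


def array_row_intersection(a, b):
    # tmp[i, j] is 1 when row a[i] equals row b[j] (assumed from the steps above).
    tmp = np.prod(np.swapaxes(a[:, :, None], 1, 2) == b, axis=2)
    # cumsum * tmp equals 1 only at the first matching row of `a` in each column.
    return a[np.sum(np.cumsum(tmp, axis=0) * tmp == 1, axis=1).astype(bool)]


a = np.array([[1, 4], [1, 4], [3, 6]])   # the row [1, 4] appears twice
b = np.array([[1, 4], [7, 8], [3, 6]])

result = array_row_intersection(a, b)
print(result)
```

The repeated [1, 4] row is reported once, and the rows come back in the order in which they first appear in a.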
This is faster than the set solutions, as it uses only simple NumPy operations while steadily reducing the number of dimensions, and it is ideal for two big matrices. I may have made mistakes in my comments, as I got the answer by experimentation and instinct. The equivalent for column intersection can be found either by transposing the arrays or by changing the steps a little. Also, if duplicates are wanted, the steps between the "//" markers have to be skipped. The function can be edited to return only the boolean array of indices, which came in handy for me while trying to index different arrays with the same vector. Here is a benchmark of the voted answer against mine (the number of elements in each dimension plays a role in which one to choose):
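def voted_answer(A, B):
    nrows, ncols = A.shape
    dtype = {'names': ['f{}'.format(i) for i in range(ncols)],
             'formats': ncols * [A.dtype]}
    C = np.intersect1d(A.view(dtype), B.view(dtype))
    return C.view(A.dtype).reshape(-1, ncols)
a_small = np.random.randint(10, size=(10, 10))
a_big_row = np.random.randint(10, size=(10, 1000))
a_big_col = np.random.randint(10, size=(1000, 10))
a_big_all = np.random.randint(10, size=(100, 100))
# Assumption: the b_* arrays are not shown in the post; here each one samples rows of its a_* array.
def sample_rows(x):
    return x[np.random.randint(x.shape[0], size=x.shape[0]), :]
b_small, b_big_row, b_big_col, b_big_all = map(sample_rows, (a_small, a_big_row, a_big_col, a_big_all))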
import timeit
print('Small arrays:')
print('\t Voted answer:', timeit.timeit(lambda: voted_answer(a_small, b_small), number=100) / 100)
print('\t Proposed answer:', timeit.timeit(lambda: array_row_intersection(a_small, b_small), number=100) / 100)
print('Big column arrays:')
print('\t Voted answer:', timeit.timeit(lambda: voted_answer(a_big_col, b_big_col), number=100) / 100)
print('\t Proposed answer:', timeit.timeit(lambda: array_row_intersection(a_big_col, b_big_col), number=100) / 100)
print('Big row arrays:')
print('\t Voted answer:', timeit.timeit(lambda: voted_answer(a_big_row, b_big_row), number=100) / 100)
print('\t Proposed answer:', timeit.timeit(lambda: array_row_intersection(a_big_row, b_big_row), number=100) / 100)
print('Big arrays:')
print('\t Voted answer:', timeit.timeit(lambda: voted_answer(a_big_all, b_big_all), number=100) / 100)
print('\t Proposed answer:', timeit.timeit(lambda: array_row_intersection(a_big_all, b_big_all), number=100) / 100)
with results:
Small arrays:
Voted answer: 7.47108459473e-05
Proposed answer: 2.47001647949e-05
Big column arrays:
Voted answer: 0.00198730945587
Proposed answer: 0.0560171294212
Big row arrays:
Voted answer: 0.00500325918198
Proposed answer: 0.000308241844177
Big arrays:
Voted answer: 0.000864889621735
Proposed answer: 0.00257176160812
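The verdict: if you have to compare two big 2D arrays with many rows and few columns (e.g. arrays of 2D points), use the voted answer. If the arrays are big in both dimensions, the voted answer is also the better one. In the timings above, the proposed answer only wins for small arrays and for arrays with few rows and many columns. So the right choice depends on the shape of your data.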
Numpy broadcasting
We can create a boolean mask using broadcasting, which can then be used to filter the rows in array A that are also present in array B:
A = np.array([[1,4],[2,5],[3,6]])
B = np.array([[1,4],[3,6],[7,8]])
m = (A[:, None] == B).all(-1).any(1)
>>> A[m]
array([[1, 4],
[3, 6]])
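Breaking the one-liner into its intermediate steps shows what broadcasting does at each stage. This is just an annotated sketch of the same expression; the names eq and row_eq are introduced here for illustration.

```python
import numpy as np

A = np.array([[1, 4], [2, 5], [3, 6]])
B = np.array([[1, 4], [3, 6], [7, 8]])

eq = A[:, None] == B     # (3, 1, 2) against (3, 2) broadcasts to (3, 3, 2)
row_eq = eq.all(-1)      # (3, 3): row_eq[i, j] is True when A[i] equals B[j]
m = row_eq.any(1)        # (3,): True when A[i] matches at least one row of B

print(eq.shape, row_eq.shape, m.shape)
print(m)
```

Note that the intermediate array holds one boolean per (row of A, row of B, column) triple, so memory use grows with the product of the two row counts.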
Another way to achieve this using structured array:
>>> a = np.array([[3, 1, 2], [5, 8, 9], [7, 4, 3]])
>>> b = np.array([[2, 3, 0], [3, 1, 2], [7, 4, 3]])
>>> av = a.view([('', a.dtype)] * a.shape[1]).ravel()
>>> bv = b.view([('', b.dtype)] * b.shape[1]).ravel()
>>> np.intersect1d(av, bv).view(a.dtype).reshape(-1, a.shape[1])
array([[3, 1, 2],
[7, 4, 3]])
Just for clarity, the structured view looks like this:
>>> a.view([('', a.dtype)] * a.shape[1])
array([[(3, 1, 2)],
[(5, 8, 9)],
[(7, 4, 3)]],
dtype=[('f0', '<i8'), ('f1', '<i8'), ('f2', '<i8')])
np.array(list(set(map(tuple, b)).intersection(map(tuple, a))))
This could also work.
Without Index
Visit https://gist.github.com/RashidLadj/971c7235ce796836853fcf55b4876f3c
def intersect2D(Array_A, Array_B):
    """
    Find row intersection between 2D numpy arrays, a and b.
    """
    # ''' Using Tuple ''' #
    intersectionList = list(set([tuple(x) for x in Array_A for y in Array_B if(tuple(x) == tuple(y))]))
    print ("intersectionList = \n",intersectionList)
    # ''' Using Numpy function "array_equal" ''' #
    """ This method is valid for an ndarray """
    intersectionList = list(set([tuple(x) for x in Array_A for y in Array_B if(np.array_equal(x, y))]))
    print ("intersectionList = \n",intersectionList)
    # ''' Using set and bitwise and '''
    intersectionList = [list(y) for y in (set([tuple(x) for x in Array_A]) & set([tuple(x) for x in Array_B]))]
    print ("intersectionList = \n",intersectionList)
    return intersectionList
With Index
Visit https://gist.github.com/RashidLadj/bac71f3d3380064de2f9abe0ae43c19e
def intersect2D(Array_A, Array_B):
    """
    Find row intersection between 2D numpy arrays, a and b.
    Returns another numpy array with shared rows and index of items in A & B arrays
    """
    # [[IDX], [IDY], [value]] where Equal
    # dtype=object because each entry mixes two ints with a row array
    # ''' Using Tuple ''' #
    IndexEqual = np.asarray([(i, j, x) for i,x in enumerate(Array_A) for j, y in enumerate (Array_B) if(tuple(x) == tuple(y))], dtype=object).T
    # ''' Using Numpy array_equal ''' #
    IndexEqual = np.asarray([(i, j, x) for i,x in enumerate(Array_A) for j, y in enumerate (Array_B) if(np.array_equal(x, y))], dtype=object).T
    idx, idy, intersectionList = (IndexEqual[0], IndexEqual[1], IndexEqual[2]) if len(IndexEqual) != 0 else ([], [], [])
    return intersectionList, idx, idy
A = np.array([[1,4],[2,5],[3,6]])
B = np.array([[1,4],[3,6],[7,8]])
def matching_rows(A, B):
    matches = [i for i in range(B.shape[0]) if np.any(np.all(A == B[i], axis=1))]
    if len(matches) == 0:
        return B[matches]
    return np.unique(B[matches], axis=0)
>>> matching_rows(A,B)
array([[1, 4],
       [3, 6]])
This of course assumes the rows are all the same length.
import numpy as np
A=np.array([[1, 4],
[2, 5],
[3, 6]])
B=np.array([[1, 4],
[3, 6],
[7, 8]])
intersetingRows=[(B==irow).all(axis=1).any() for irow in A]

Tensorflow - pick values from indices, what is the operation called?

An example
Suppose I have a tensor values with shape (2,2,2)
values = [[[0, 1],[2, 3]],[[4, 5],[6, 7]]]
And a tensor indices with shape (2,2) which describes which value to select in the innermost dimension
indices = [[1,0],[0,0]]
Then the result will be a (2,2) matrix with these values
result = [[1,2],[4,6]]
What is this operation called in TensorFlow, and how can I do it?
Note that the above shape (2,2,2) is only an example; it can be any shape. Some conditions for this operation:
ndim(values) - 1 == ndim(indices)
values.shape[:-1] == indices.shape == result.shape
indices.max() <= values.shape[-1] - 1
I think you can emulate this with tf.gather_nd. You will just have to convert "your" indices to a representation that is suitable for tf.gather_nd. The following example here is tied to your specific example, i.e. input tensors of shape (2, 2, 2) but I think this gives you an idea how you could write the conversion for input tensors with arbitrary shape, although I am not sure how easy it would be to implement this (haven't thought about it too long). Also, I'm not claiming that this is the easiest possible solution.
import tensorflow as tf
import numpy as np
values = np.array([[[0, 1], [2, 3]], [[4, 5], [6, 7]]])
values_tf = tf.constant(values)
indices = np.array([[1, 0], [0, 0]])
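converted_idx = []
for k in range(values.shape[0]):
    outer = []
    for l in range(values.shape[1]):
        # full index into values: the two outer positions plus the chosen inner one
        inds = [k, l, indices[k][l]]
        outer.append(inds)
    converted_idx.append(outer)
with tf.Session() as sess:  # TensorFlow 1.x session API
    result = tf.gather_nd(values_tf, converted_idx)
    print(sess.run(result))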
This prints
[[1 2]
[4 6]]
Edit: To handle arbitrary shapes here is a recursive solution that should work (only tested on your example):
def convert_idx(last_dim_vals, ori_indices, access_to_ori, depth):
    if depth == len(last_dim_vals.shape) - 1:
        # innermost level reached: append the selected index for this position
        inds = access_to_ori + [ori_indices[tuple(access_to_ori)]]
        return inds
    outer = []
    for k in range(ori_indices.shape[depth]):
        inds = convert_idx(last_dim_vals, ori_indices, access_to_ori + [k], depth + 1)
        outer.append(inds)
    return outer
You can use this together with the original code I posted like so:
converted_idx = convert_idx(values, indices, [], 0)
with tf.Session() as sess:
    result = tf.gather_nd(values_tf, converted_idx)
    print(sess.run(result))
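For comparison, the same selection can be sketched in plain NumPy, which may help when checking the converted indices for arbitrary shapes. This is an analogue I am adding, not TensorFlow code: np.take_along_axis performs the pick directly, and np.indices builds the same [k, l, ..., index] entries without recursion. Passing the resulting array to tf.gather_nd is an untested assumption here.

```python
import numpy as np

values = np.array([[[0, 1], [2, 3]], [[4, 5], [6, 7]]])
indices = np.array([[1, 0], [0, 0]])

# NumPy analogue of the requested operation: pick along the last axis.
picked = np.take_along_axis(values, indices[..., None], axis=-1)[..., 0]

# Vectorized construction of the full index entries [k, l, indices[k, l]].
converted_idx = np.stack([*np.indices(indices.shape), indices], axis=-1)

# Sanity check: indexing with the converted entries gives the same values.
via_full_index = values[tuple(np.moveaxis(converted_idx, -1, 0))]

print(picked)
print(converted_idx.shape)   # (2, 2, 3)
```

Nothing in this sketch depends on the (2, 2, 2) shape, so it should carry over to any values and indices that satisfy the conditions listed in the question.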
