Multiple Multi Dimensional Indexing - python

I want to look up values in a NumPy multidimensional array (2D example below) by passing in an array of indices.
It appears that I can only pass in up to 2 indices without getting an error:
V2 = [[1,2],[2,1]]
V3 = [[1,2],[2,1],[0,0]]
lookup = np.random.rand(3,3)
lookup[V2] #OK
lookup[V3] #IndexError: too many indices for array

The way you are using them, each index list corresponds to one dimension of the array, so you cannot pass more lists than the array has dimensions.
I think you are assuming that every sub-element of the list is one point, while actually the syntax:
V2 = [[a1,a2,a3],[b1,b2,b3]]
lookup[V2]
is equivalent to accessing:
[lookup[a1,b1],
lookup[a2,b2],
lookup[a3,b3]]
Using a third list gives you an error since your array has only 2 dimensions.
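A minimal sketch of the point-wise lookup the question seems to be after, assuming each sub-list of V3 is meant as one (row, column) pair. The names points, rows and cols are illustrative and not from the original post, and lookup is replaced by a deterministic array so the result is easy to check. Because newer NumPy versions may interpret a bare nested list differently, the sketch passes one index array per dimension explicitly.

```python
import numpy as np

lookup = np.arange(9).reshape(3, 3)   # deterministic stand-in for np.random.rand(3, 3)
V3 = [[1, 2], [2, 1], [0, 0]]         # three (row, col) points

points = np.array(V3)                 # shape (3, 2): one point per row
rows, cols = points[:, 0], points[:, 1]

values = lookup[rows, cols]           # one index array per dimension
print(values)                         # [5 7 0]
```

The same lookup can also be written as lookup[tuple(points.T)], which builds the per-dimension index arrays in one step.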

Related

Python indexing question - 'IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed'

Please can someone tell me why the following code does not work, and what the best workarounds for this are?
Choices # variable containing True or False in each element.
Choices.shape = (18978,)
BestOption # variable containing 1 or 2 in each element.
BestOption.shape = (18978, 1)
Choices[BestOption==1] # I want to look up the values in choices for all instances where BestOption is 1.
I get the following error:
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
BestOption has shape (18978, 1), so it is a 2-D "column vector" (many rows, one column). The mask BestOption==1 is therefore 2-D as well, while Choices is only 1-D. You can simply reshape BestOption to 1-D:
Choices[BestOption.reshape(-1)==1]
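A small sketch of the shape mismatch and the fix, using made-up four-element arrays in place of the 18978-element ones (the lowercase names choices and best_option are stand-ins for the poster's variables):

```python
import numpy as np

choices = np.array([True, False, True, False])    # shape (4,)
best_option = np.array([[1], [2], [2], [1]])      # shape (4, 1), a column

# (best_option == 1) has shape (4, 1), which cannot index a 1-D array.
# Flattening it first gives a mask of shape (4,), matching choices.
mask = best_option.ravel() == 1
selected = choices[mask]                          # elements 0 and 3 of choices
```

Here ravel() does the same job as reshape(-1); either one produces a 1-D mask.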

Mask certain indices for every entry in a batch, when using torch.max()

I am incrementally sampling a batch of size torch.Size([n, 8]).
I also have a list valid_indices of length n which contains tuples of indices that are valid for each entry in the batch.
For instance valid_indices[0] may look like this: (0,1,3,4,5,7) , which suggests that indices 2 and 6 should be excluded from the first entry in batch along dim 1.
In particular, I need to exclude these values when I use torch.max(batch, dim=1, keepdim=True).
Indices to be excluded (if any) may differ from entry to entry within the batch.
Any ideas? Thanks in advance.
I assume that you are getting the good old
IndexError: too many indices for tensor of dimension 1
error when you use your tuple indices directly on the tensor.
At least that was the error that I was able to reproduce when I execute the following line
t[0][valid_idx0]
Where t is a random tensor with size (10,8) and valid_idx0 is a tuple with 4 elements.
However, the same line works just fine when you convert your tuple to a list, as follows:
t[0][list(valid_idx0)]
>>> tensor([0.1847, 0.1028, 0.7130, 0.5093])
But when it comes to applying these indices to 2D tensors, things get a bit different, since we need to preserve the structure of our tensor for batch processing.
Therefore, it would be reasonable to convert our indices to mask arrays.
Let's say we have a list of tuples valid_indices at hand. First thing will be converting it to a list of lists.
valid_idx_list = [list(tup) for tup in valid_indices]
Second thing will be converting them to mask arrays.
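masks = torch.zeros(t.size())
for i, indices in enumerate(valid_idx_list):
    masks[i][indices] = 1
Done. Now we can apply our mask and use torch.max on the masked tensor. Note that multiplying by the mask sets the excluded entries to 0 rather than removing them, so this only gives the right result when the largest valid value in every row is at least 0.
torch.max(t*masks, dim=1, keepdim=True)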
Kindly see the colab notebook that I've used to reproduce the problem.
https://colab.research.google.com/drive/1BhKKgxk3gRwUjM8ilmiqgFvo0sfXMGiK?usp=sharing
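As a hedged alternative to multiplying by a 0/1 mask, the excluded positions can be filled with negative infinity so they can never win the maximum, even when all valid values are negative. This is a sketch under assumptions, not the answerer's code: the tensor t and valid_indices below are small made-up examples (4 columns instead of 8), and mask and masked are illustrative names.

```python
import torch

t = torch.tensor([[0.5, -1.0, 9.0, 0.2],
                  [-3.0, -2.0, -1.0, -4.0]])
valid_indices = [(0, 1, 3), (0, 1, 3)]   # column 2 is excluded in both rows

# Boolean mask with True at the valid positions of each row.
mask = torch.zeros_like(t, dtype=torch.bool)
for i, idx in enumerate(valid_indices):
    mask[i, list(idx)] = True

# Excluded entries become -inf, so torch.max ignores them.
masked = t.masked_fill(~mask, float("-inf"))
values, indices = torch.max(masked, dim=1, keepdim=True)
```

With this input, the second row's maximum is -2.0 at column 1, whereas a multiplicative mask would have reported 0 from the excluded column.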

Random array from list of arrays by numpy.random.choice()

I have list of arrays similar to lstB and want to pick random collection of 2D arrays. The problem is that numpy somehow does not treat objects in lists equally:
lstA = [numpy.array(0), numpy.array(1)]
lstB = [numpy.array([0,1]), numpy.array([1,0])]
print(numpy.random.choice(lstA)) # returns 0 or 1
print(numpy.random.choice(lstB)) # returns ValueError: must be 1-dimensional
Is there an elegant fix to this?
Let's call it semi-elegant...
# force 1d object array
swap = lstB[0]
lstB[0] = None
arrB = numpy.array(lstB, dtype=object)
# reinsert value
arrB[0] = swap
# and clean up
lstB[0] = swap
# draw
numpy.random.choice(arrB)
# array([1, 0])
Explanation: The problem you encountered appears to be that numpy when converting the input list to an array will make as deep an array as it can. Since all your list elements are sequences of the same length this will be 2d. The hack shown here forces it to make a 1d array of object dtype instead by temporarily inserting an incompatible element.
However, I personally would not use this, because if you draw multiple subarrays with this method you'll get a 1d array of arrays, which is probably not what you want and is tedious to convert.
So I'd actually second what one of the comments recommends, i.e. draw ints and then use advanced indexing into np.array(lstB).
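The recommended approach (draw integers, then index) might look like the sketch below. The names rng, picks and sample are illustrative, and the seed is only there to make the example repeatable.

```python
import numpy as np

lstB = [np.array([0, 1]), np.array([1, 0])]
arrB = np.array(lstB)                     # regular 2-D array, shape (2, 2)

rng = np.random.default_rng(0)            # seeded only for repeatability
picks = rng.integers(len(arrB), size=5)   # random row numbers in [0, len(arrB))
sample = arrB[picks]                      # advanced indexing, shape (5, 2)
```

The result is an ordinary 2-D array whose rows are the drawn subarrays, so no conversion from an object array is needed.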

Python: return the row index of the minimum in a matrix

I want to print the index of the row containing the minimum element of the matrix.
My matrix is matrix = [[22,33,44,55],[22,3,4,12],[34,6,4,5,8,2]]
and the code is:
matrix = [[22,33,44,55],[22,3,4,12],[34,6,4,5,8,2]]
a = np.array(matrix)
buff_min = matrix.argmin(axis = 0)
print(buff_min) #index of the row containing the minimum element
min = np.array(matrix[buff_min])
print(str(min.min(axis=0))) #print the minimum of that row
print(min.argmin(axis = 0)) #index of the minimum
print(matrix[buff_min]) # print all row containing the minimum
after running, my result is
1
3
1
[22, 3, 4, 12]
the first number should be 2, because the minimum is 2 in the third list ([34,6,4,5,8,2]), but it returns 1. It returns 3 as minimum of the matrix.
What's the error?
I am not sure which version of Python you are using; I tested it with Python 2.7 and 3.2. As mentioned, your syntax for argmin is not correct (matrix is a plain list, which has no argmin method). It should be in the format
import numpy as np
np.argmin(array_name,axis)
Next, while NumPy knows about arrays of arbitrary objects, it's optimized for homogeneous arrays of numbers with fixed dimensions. If you really need arrays of arrays, better use a nested list. But depending on the intended use of your data, different data structures might be even better, e.g. a masked array if you have some invalid data points.
If you really want flexible Numpy arrays, use something like this:
np.array([[22,33,44,55],[22,3,4,12],[34,6,4,5,8,2]], dtype=object)
However this will create a one-dimensional array that stores references to lists, which means that you will lose most of the benefits of Numpy (vector processing, locality, slicing, etc.).
Also, if you can resize your NumPy array, things might work; I haven't tested it, but in concept that should be an easy solution. Still, I would prefer to use a nested list for this input matrix.
Does this work?
np.where(a == a.min())[0][0]
Note that all rows of the matrix need to contain the same number of elements.
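A short sketch of both cases, under stated assumptions: a is a rectangular variant of the matrix (the last row trimmed to four elements so NumPy can hold it), while ragged is the original nested list handled in plain Python. The names row, col and ragged_row are illustrative.

```python
import numpy as np

# Rectangular case: argmin on the flattened array, then convert back to (row, col).
a = np.array([[22, 33, 44, 55],
              [22, 3, 4, 12],
              [34, 6, 4, 2]])
row, col = np.unravel_index(a.argmin(), a.shape)

# Ragged case: rows of different lengths, so stay with nested lists.
ragged = [[22, 33, 44, 55], [22, 3, 4, 12], [34, 6, 4, 5, 8, 2]]
ragged_row = min(range(len(ragged)), key=lambda i: min(ragged[i]))
```

Both variants identify the third row (index 2) as the one containing the minimum value 2.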

Logical indices in numpy throwing exception [duplicate]

I am trying to write some code that uses logical numpy arrays to index a larger array, similar to how MATLAB allows array indexing with logical arrays.
import numpy as np
m = 4
n = 4
unCov = np.random.randint(10, size = (m,n) )
rowCov = np.zeros( m, dtype = bool )
colCov = np.ones( n, dtype = bool )
>>> unCov[rowCov, rowCov]
[] # as expected
>>> unCov[colCov, colCov]
[0 8 3 3] # diagonal values of unCov, as expected
>>> unCov[rowCov, colCov]
ValueError: shape mismatch: objects cannot be broadcast to a single shape
For this last evaluation, I expected an empty array, similar to what MATLAB returns. I'd rather not have to check rowCov/colCov for True elements prior to indexing. Why is this happening, and is there a better way to do this?
As I understand it, numpy will translate your 2d logical indices to actual index vectors: arr[[True,False],[False,True]] would become arr[0,1] for an ndarray of shape (2,2). However, in your last case the second index array is full False, hence it corresponds to an index array of length 0. This is paired with the other full True index vector, corresponding to an index array of length 4.
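The translation from boolean masks to integer index arrays can be made visible with np.nonzero. This is only an illustration of the idea described above; arr, row_mask and col_mask are made-up names.

```python
import numpy as np

arr = np.array([[10, 11],
                [20, 21]])
row_mask = np.array([True, False])
col_mask = np.array([False, True])

# Each boolean mask corresponds to the positions of its True entries.
(row_idx,) = np.nonzero(row_mask)    # positions of True in row_mask
(col_idx,) = np.nonzero(col_mask)    # positions of True in col_mask

by_mask = arr[row_mask, col_mask]    # boolean indexing
by_index = arr[row_idx, col_idx]     # equivalent integer indexing

# An all-False mask of length 4 turns into an index array of length 0.
(empty_idx,) = np.nonzero(np.zeros(4, dtype=bool))
```

An index array of length 0 cannot be paired element by element with one of length 4, which is where the shape mismatch comes from.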
From the numpy manual:
If the index arrays do not have the same shape, there is an attempt to
broadcast them to the same shape. If they cannot be broadcast to the
same shape, an exception is raised:
In your case, the error is exactly due to this:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-1411-28e41e233472> in <module>()
----> 1 unCov[colCov,rowCov]
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (4,) (0,)
MATLAB, on the other hand, automatically returns an empty array if the index array is empty along any given dimension.
This actually highlights a fundamental difference between the logical indexing in MATLAB and numpy. In MATLAB, vectors in subscript indexing always slice out a subarray. That is, both
arr([1,2],[1,2])
and
arr([true,true],[true,true])
will return the 2 x 2 submatrix of the matrix arr. If the logical index vectors are shorter than the given dimension of the array, the missing indexing elements are assumed to be false. Fun fact: the index vector can also be longer than the given dimension, as long as the excess elements are all false. So the above is also equivalent to
arr([true,true,false,false],[true,true])
and
arr([true,true,false,false,false,false,false],[true,true])
for a 4 x 4 array (for the sake of argument).
In numpy, however, indexing with boolean-valued numpy arrays in this way will try to extract a vector. Furthermore, the boolean index vectors should be the same length as the dimension they are indexing into. In your 4 x 4 example,
unCov[np.array([True,True]),np.array([True,True])]
and
unCov[np.array([True,True,False,False,False]),np.array([True,True,False,False,False])]
both return the first two diagonal elements, so not a submatrix but rather a vector. Furthermore, on the NumPy version I used they also give the less-than-encouraging warning below (newer NumPy releases raise an IndexError for such a length mismatch instead):
/usr/bin/ipython:1: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 4 but corresponding boolean dimension is 5
So, in numpy, your logical indexing vectors should be the same length as the corresponding dimensions of the ndarray. And then what I wrote above holds true: the logical values are translated into indices, and the result is expected to be a vector. The length of this vector is the number of True elements in every index vector, so if your boolean index vectors have a different number of True elements, then the referencing doesn't make sense, and you get the error that you get.
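If the MATLAB-style submatrix is what is wanted, one option is np.ix_, which builds an open mesh from the masks so that each one selects along its own axis. This is a sketch rather than part of the original answer; unCov is replaced by a deterministic array, and empty and full are illustrative names.

```python
import numpy as np

unCov = np.arange(16).reshape(4, 4)      # stand-in for the random matrix
rowCov = np.zeros(4, dtype=bool)         # no rows selected
colCov = np.ones(4, dtype=bool)          # all columns selected

empty = unCov[np.ix_(rowCov, colCov)]    # 0 rows, 4 columns: no exception
full = unCov[np.ix_(colCov, colCov)]     # the whole 4 x 4 matrix
```

Chained indexing, unCov[rowCov][:, colCov], should give the same submatrices at the cost of an intermediate copy.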
