Why does indexing a numpy array using an array change the shape? - python

I'm trying to index a 2-dimensional array to certain values using numpy.where(), but unless I am indexing in the first index without a : slice it always increases the dimension. I can't seem to find an explanation for this in the documentation.
For example, say I have an array a:
a = np.arange(20)
a = np.reshape(a,(4,5))
print("a = ",a)
print("a shape = ", a.shape)
Output:
a = [[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]
[15 16 17 18 19]]
a shape = (4, 5)
If I have two indexing arrays, one in the 'x' direction and one in the 'y' direction:
x = np.arange(5)
y = np.arange(4)
xindx = np.where((x>=2)&(x<=4))
yindx = np.where((y>=1)&(y<=2))
and then index a using only the 'y' index, there's no problem:
print(a[yindx])
print(a[yindx].shape)
Output:
[[ 5 6 7 8 9]
[10 11 12 13 14]]
(2, 5)
But if I have : in one of the indices then I have an extra dimension of size 1:
print(a[yindx,:])
print(a[yindx,:].shape)
print(a[:,xindx])
print(a[:,xindx].shape)
Output:
[[[ 5 6 7 8 9]
[10 11 12 13 14]]]
(1, 2, 5)
[[[ 2 3 4]]
[[ 7 8 9]]
[[12 13 14]]
[[17 18 19]]]
(4, 1, 3)
I run into this issue with one-dimensional arrays, too. How do I fix this?

If xindx and yindx were numpy arrays, the result would be as expected. However, they are tuples, each containing a single index array.
The easiest (if rather crude) fix is to take the first element of each tuple:
xindx = np.where((x>=2)&(x<=4))[0]
yindx = np.where((y>=1)&(y<=2))[0]
With only the condition given, np.where returns the indices of the matching elements as a tuple of arrays, one array per dimension. The documentation recommends calling nonzero directly for this use.
More realistically, you probably need something like:
xindx = np.arange(2, 5)
yindx = np.arange(1, 3)
... but it really depends on the context we don't see
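To make the shape difference concrete, here is a minimal sketch that reuses the array a from the question. The variable names (mask, idx_tuple, idx_array, block) are only illustrative. It compares indexing with the tuple returned by np.where, with the array inside that tuple, and with the boolean mask itself, and it uses np.ix_ to select rows and columns at the same time.

```python
import numpy as np

a = np.arange(20).reshape(4, 5)
x = np.arange(5)
y = np.arange(4)
mask = (x >= 2) & (x <= 4)

idx_tuple = np.where(mask)   # a tuple holding one index array
idx_array = idx_tuple[0]     # the index array itself

# The tuple is converted to an index array of shape (1, 3),
# which is where the extra axis of length 1 comes from.
shape_with_tuple = a[:, idx_tuple].shape
shape_with_array = a[:, idx_array].shape   # plain 1-D index array
shape_with_mask = a[:, mask].shape         # boolean mask, no np.where needed

# Select rows 1..2 and columns 2..4 in a single step.
y_idx = np.where((y >= 1) & (y <= 2))[0]
block = a[np.ix_(y_idx, idx_array)]

print(shape_with_tuple, shape_with_array, shape_with_mask)
print(block)
```

Indexing with the boolean mask directly is often the simplest option when the indices themselves are not needed elsewhere.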

Related

How to fill a zeros matrix using for loop?

I have an array full of zeros created as A = np.zeros((m, m)) with m = 6.
I want to fill this array so that each element equals the sum of its row and column numbers, i.e. A(x,y) = x+y.
How can I do this using a for loop and a while loop?
A method that avoids a loop, with a significant performance improvement on a large ndarray:
A = np.zeros((6,6))
m = A.shape[0]
n = A.shape[1]
x = np.transpose(np.array([*range(1,m+1)]*n).reshape(n,m))
y = np.array([*range(1,n+1)]*m).reshape(m,n)
A = x+y
print(A)
[[ 2 3 4 5 6 7]
[ 3 4 5 6 7 8]
[ 4 5 6 7 8 9]
[ 5 6 7 8 9 10]
[ 6 7 8 9 10 11]
[ 7 8 9 10 11 12]]
A = np.zeros((6,6))
for i in range(0, A.shape[0]):
    for j in range(0, A.shape[1]):
        A[i][j] = i+j+2
If you want the row and column numbers to start from 1, you can use this code directly; if you want them to start from 0, remove the "+2" in the last line.
Explanation:
The outer loop traverses the rows and the inner loop traverses the columns. Each cell is accessed as A[i][j] and assigned i+j+2 (or just i+j), so the original array is filled in place with the new values.
Have you tried this?
for y in range(len(A)):
    for x in range(len(A[y])):
        A[y][x] = x + y
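For comparison, here is a short sketch of the same fill done three ways: with broadcasting, with a for loop, and with the while loop the question also asked about. It assumes the 1-based numbering used in the answers above (so the top-left value is 2); the names rows, cols, B and C are illustrative.

```python
import numpy as np

m, n = 6, 6

# Broadcasting: a column of row numbers plus a row of column numbers.
rows = np.arange(1, m + 1)[:, None]   # shape (m, 1)
cols = np.arange(1, n + 1)            # shape (n,)
A = rows + cols                       # broadcasts to shape (m, n)

# for loop version
B = np.zeros((m, n))
for i in range(m):
    for j in range(n):
        B[i, j] = (i + 1) + (j + 1)

# while loop version
C = np.zeros((m, n))
i = 0
while i < m:
    j = 0
    while j < n:
        C[i, j] = (i + 1) + (j + 1)
        j += 1
    i += 1

print(A)
```

All three produce the same values; the broadcasting form yields an integer array, while the loop versions keep the float dtype of np.zeros.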

How to manually select values from an array

For example I have a matrix array
a=np.arange(25).reshape(5,5)
[[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]
[15 16 17 18 19]
[20 21 22 23 24]]
How do I make an 1D array of elements that I would like to choose manually? For example [2,3], [4,1], [1,0] and [2,2], so I get the following:
b=[13, 21, 5, 12]
The array should be a reference rather than a copy.
You can make a function for this.
# defining the function
def get_value(matrix, row_list, col_list):
    # index with parallel lists of row and column positions
    return matrix[row_list, col_list]
# initializing the array
a = np.arange(0, 25, 1).reshape(5, 5)
# getting the required values and printing
b = get_value(a, [2, 4, 1, 2], [3, 1, 0, 2])
# output
print(b)
Edit
I'll leave the previous answer as is, in case anyone else stumbles upon it and needs it.
What the question wants is to take a value from b (e.g. b[0], which is 13) and change the corresponding value in the original matrix a, based on the position of that value in a.
def change_the_value(old_mat, val_to_change, new_val):
    mat_coor = np.array(np.matrix(np.where(old_mat == val_to_change)).T)[0]
    old_mat[mat_coor[0], mat_coor[1]] = new_val
a = np.arange(0, 25, 1).reshape(5,5)
b = [13, 16, 5, 12]
change_the_value(a, b[0], 0)
a=np.arange(25).reshape(5,5)
search=[[2,3], [4,1], [1,0], [2,2]]
for row,col in search:
    print(row,col, a[row][col])
output:
r c result
2 3 13
4 1 21
1 0 5
2 2 12
First of all, I've found that NumPy cannot construct a view onto an arbitrary, irregular selection of elements: a view has to be describable by a fixed offset and strides into the underlying memory, which is also part of what makes NumPy fast.
Here's the solution that has worked best for me so far:
Instead of holding a view of the array, I keep a collection of the indices I would like to process: [2,3], [4,1], [1,0], [2,2].
The collection type I chose is a set, because it excludes duplicates and its add and discard methods do not require a search. Keeping order was not necessary.
To use them for indexing an array, they have to be converted from a set of tuples {(2,3),(4,1),(1,0),(2,2)} to a tuple of arrays (array([2,4,1,2]), array([3,1,0,2])).
This can be achieved by unzipping the set and constructing a tuple of arrays:
import numpy as np
a=np.arange(25).reshape(5,5)
>>>[[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]
[15 16 17 18 19]
[20 21 22 23 24]]
my_set = {(2,3),(4,1),(1,0),(2,2)}
uzip_set = list(zip(*my_set))
seq_from_set = (np.asarray(uzip_set[0]),np.asarray(uzip_set[1]))
print(seq_from_set)
>>>(array([2, 4, 1, 2]), array([3, 1, 0, 2]))  # element order may differ, since sets are unordered
And array a can be manipulated by providing such a sequence of indices:
b = a[seq_from_set]
print(b)
>>>array[13,21,5,12]
a[seq_from_set] = 0
print(a)
>>>[[ 0 1 2 3 4]
[ 0 6 7 8 9]
[10 11 0 0 14]
[15 16 17 18 19]
[20 0 22 23 24]]
The solution is a bit more involved than something native, but it works surprisingly fast. It makes the collection of indices easy to manage and supports quick conversion to an index tuple on demand.
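The distinction between reading and writing through an index tuple is worth spelling out, since the question asked for a reference rather than a copy. The sketch below uses explicit index arrays (rows and cols are illustrative names) instead of a set so that the order is fixed. It shows that b = a[rows, cols] is a copy, while assigning to a[rows, cols] writes into a itself.

```python
import numpy as np

a = np.arange(25).reshape(5, 5)
rows = np.array([2, 4, 1, 2])
cols = np.array([3, 1, 0, 2])

b = a[rows, cols]            # fancy indexing returns a copy: [13, 21, 5, 12]
b[0] = -1                    # changes only the copy
still_original = a[2, 3]     # a is untouched, so this is still 13

a[rows, cols] *= 10          # assignment through the index writes into a

print(b)
print(a)
```

So the index tuple, rather than the extracted array, plays the role of the "reference": keep the indices around and apply them to a whenever the values need to be read or updated.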

what is the difference between resize and reshape [duplicate]

I was writing a line of code and got some strange output from it.
a = np.arange(2,11).resize((3,3))
print(a)
a = np.arange(2,11).reshape((3,3))
print(a)
The first one gives me None, but the second one gives me a 3x3 matrix.
But when I write the first snippet on separate lines, it doesn't give me None:
a = np.arange(2,11)
a.resize((3,3))
print(a)
What is the difference between resize and reshape in this case, and what are the differences in general?
That is because ndarray.resize modifies the array in place and returns None, so assigning its result back to a leaves a set to None. reshape instead returns a new array object (a view of the same data when possible) and leaves the original untouched:
a = np.arange(2,11)
a.shape
# (9,)
a.resize((3,3))
a.shape
# (3, 3)
np.arange(2,11).reshape((3,3)).shape
# (3, 3)
Both reshape and resize change the shape of a NumPy array; the difference is that resize modifies the original array in place, while reshape returns a new array object with the requested shape and leaves the original array's shape unchanged.
Reshape:
import numpy as np
r = np.arange(16)
print('original r: \n',r)
print('\nreshaped array: \n',r.reshape((4,4)))
print('\narray r after reshape was applied: \n',r)
output
original r:
[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15]
reshaped array:
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]
[12 13 14 15]]
array r after reshape was applied:
[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15]
Resize:
import numpy as np
r = np.arange(16)
print('original r: \n',r)
print('\nresized array: \n',r.resize((4,4)))
print('\narray r after resize was applied: \n',r)
output:
original r:
[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15]
resized array:
None
array r after resize was applied:
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]
[12 13 14 15]]
As you can see, reshape returned a new, reshaped array while the original r stayed unchanged. resize, on the other hand, did not return a new array; the change was applied directly to r.
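One detail the outputs above do not show is whether the reshaped array shares data with the original. The following sketch (variable names are illustrative) checks this with np.shares_memory, confirms that ndarray.resize returns None, and contrasts both with the separate function np.resize, which returns a new array and repeats the data when the new shape is larger.

```python
import numpy as np

r = np.arange(16)
v = r.reshape(4, 4)              # new array object; here it is a view of r's data
shares = np.shares_memory(r, v)
v[0, 0] = 99                     # visible through r as well, since the data is shared

c = np.arange(9)
result = c.resize((3, 3))        # in place; the method returns None

# np.resize (the function, not the method) returns a new array and
# fills a larger shape with repeated copies of the input.
tiled = np.resize(np.arange(4), (2, 3))

print(shares, r[0], r.shape)
print(result, c.shape)
print(tiled)
```

So "unchanged" in the reshape case refers to the shape of r; the elements can still be modified through the reshaped view.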

How can I select a small matrix from a larger matrix by its index

I have a larger 2 dimensional matrix which is 36*72 and I want to select a small matrix from it by using indexes.
The matrix looks like this:
[ [312, 113, 525, 543, ...] ,
[...],
[...],
... ].
And I print the shape like this:
print(array(matrix).shape)
(36, 72)
But when I try to print out the small matrix like this
print(matrix[6:9][9])
The error is "IndexError: list index out of range"
Then I tried
print(matrix[6:9,9])
It showed "TypeError: list indices must be integers, not tuple"
Then I tried
print(matrix[6:9][8:9])
I get the empty list. But when I tried
print(matrix[9][9])
It did give out some number.
With NumPy arrays you can use quite convenient indexing methods; some of them are referred to as fancy indexing.
Let's try that with a small example 2D-array:
import numpy as np
a=np.arange(48).reshape(6, 8)
print(a)
#[[ 0 1 2 3 4 5 6 7]
# [ 8 9 10 11 12 13 14 15]
# [16 17 18 19 20 21 22 23]
# [24 25 26 27 28 29 30 31]
# [32 33 34 35 36 37 38 39]
# [40 41 42 43 44 45 46 47]]
If you now want to index e.g. rows 2 and 3 and columns 3 to 6, you can simply write that down in slices, no matter if by constants or variables:
r1 = 2; r2 = 4
print(a[r1:r2, 3:7])
#[[19 20 21 22]
# [27 28 29 30]]
You might want to read further here: https://docs.scipy.org/doc/numpy/user/basics.indexing.html
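The errors in the question come from matrix being a plain list of lists rather than an array. Below is a small sketch under that assumption; the matrix contents are a made-up stand-in for the 36 x 72 data, and the block being selected (rows 6 to 8, columns 8 to 9) is only an example. matrix[6:9] is a list of three rows, so matrix[6:9][9] is out of range and matrix[6:9][8:9] is empty.

```python
import numpy as np

# Hypothetical stand-in for the 36 x 72 list of lists from the question.
matrix = [[r * 72 + c for c in range(72)] for r in range(36)]

three_rows = matrix[6:9]           # a list with 3 rows, so index 9 does not exist
empty = matrix[6:9][8:9]           # slicing that 3-element list at 8:9 gives []

# Pure-Python way to cut out rows 6..8 and columns 8..9.
sub_list = [row[8:10] for row in matrix[6:9]]

# NumPy way: convert once, then slice both axes in one step.
arr = np.asarray(matrix)
sub_arr = arr[6:9, 8:10]
column = arr[6:9, 9]               # rows 6..8 of column 9

print(sub_arr)
print(column)
```

Converting with np.asarray once up front is usually the more convenient route if several blocks need to be cut out.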
Here's an example. I have a 3x3 matrix, named 'a' and I want to select the top left 2x2 matrix named 'c'.
>>> import numpy as np # importing numpy
>>> a=np.matrix('1 2 3;4 5 6;7 8 9') # creating an example matrix, named a
>>> a
matrix([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
>>> b=[[a.item(0,0),a.item(0,1)],[a.item(1,0),a.item(1,1)]] # creating a list, with 1,1 1,2 2,1 and 2,2 indices of a. remember, in math indexing starts from 1 but in most programming languages, it starts from 0
>>> b
[[1, 2], [4, 5]]
>>> c=np.matrix(b) # creating an numpy matrix object from b which is a part of a
>>> c
matrix([[1, 2],
[4, 5]])
noe = Mx1[2:4, 2:4] # This is the solution. Index 2 selects the third row/column, because indexing starts at 0.
# The pattern is Mx1[row_start:row_stop, col_start:col_stop]; be careful with the order.
# It is confusing, but it works.
# With
# Mx1 = [[ 8750. 8750. -8750. -8750.]
# [ 8750. 8750. -8750. -8750.]
# [-8750. -8750. 8750. 8750.]
# [-8750. -8750. 8750. 70000.]]
# the result is
# noe = [[ 8750. 8750.]
# [ 8750. 70000.]]

Numpy fancy indexing in multiple dimensions

Let's say I have a NumPy array A of size n x m x k and another array B of size n x m that has indices from 1 to k.
I want to access each n x m slice of A using the index given at this place in B,
giving me an array of size n x m.
Edit: that is apparently not what I want!
[[ I can achieve this using take like this:
A.take(B)
]] end edit
Can this be achieved using fancy indexing?
I would have thought A[B] would give the same result, but that results
in an array of size n x m x m x k (which I don't really understand).
The reason I don't want to use take is that I want to be able to assign this portion something, like
A[B] = 1
The only working solution that I have so far is
A.reshape(-1, k)[np.arange(n * m), B.ravel()].reshape(n, m)
but surely there has to be an easier way?
Suppose
import numpy as np
np.random.seed(0)
n,m,k = 2,3,5
A = np.arange(n*m*k,0,-1).reshape((n,m,k))
print(A)
# [[[30 29 28 27 26]
# [25 24 23 22 21]
# [20 19 18 17 16]]
# [[15 14 13 12 11]
# [10 9 8 7 6]
# [ 5 4 3 2 1]]]
B = np.random.randint(k, size=(n,m))
print(B)
# [[4 0 3]
# [3 3 1]]
The approach from the question produces the selected values as a flat array:
print(A.reshape(-1, k)[np.arange(n * m), B.ravel()])
# [26 25 17 12 7 4]
To get the same values as an n x m array using fancy indexing:
i,j = np.ogrid[0:n, 0:m]
print(A[i, j, B])
# [[26 25 17]
# [12 7 4]]
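Since the question's goal was assignment, here is a brief sketch showing that the same index expression also works on the left-hand side. It additionally shows np.take_along_axis as an alternative way to read the values. B is hard-coded to the values printed above so the example does not depend on the random seed; picked is an illustrative name.

```python
import numpy as np

n, m, k = 2, 3, 5
A = np.arange(n * m * k, 0, -1).reshape((n, m, k))
B = np.array([[4, 0, 3],
              [3, 3, 1]])          # same index values as printed above

# Reading: take_along_axis needs B to have the same number of dimensions as A.
picked = np.take_along_axis(A, B[..., None], axis=2)[..., 0]

# Writing: the open grids i and j broadcast against B to address one element per (n, m) position.
i, j = np.ogrid[0:n, 0:m]
A[i, j, B] = 1

print(picked)
print(A)
```

np.put_along_axis is the write counterpart of np.take_along_axis, if that style is preferred over np.ogrid.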
