Get part of np array with parameters - python

I am using Python and NumPy with an n-dimensional array.
I want to select all elements along one axis, with an index like
arr[a,b,:,c]
but I want the position of the slice to be given by a parameter. For example:
#pos =2
arr[a,b,:,c]
#pos =1
arr[a,:,b,c]

I would move the axis of interest (at pos) to the front with numpy.moveaxis(array, pos, 0) and then simply slice with [:, a, b, c].
There is also numpy.take, but in your case you would still need to loop over each of a, b, c, so I think moveaxis is more convenient. There may be an even more direct way to do this.
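The moveaxis suggestion above can be sketched as follows. This is a minimal sketch, not a definitive implementation: the helper name slice_along and its fixed argument are hypothetical, and it assumes a, b, c are scalar indices.

```python
import numpy as np

def slice_along(arr, pos, fixed):
    """Hypothetical helper: 1-D slice along axis `pos`.

    `fixed` holds one scalar index for each remaining axis,
    in the original axis order.
    """
    moved = np.moveaxis(arr, pos, 0)      # axis `pos` becomes axis 0
    return moved[(slice(None), *fixed)]   # same as moved[:, a, b, c]

arr = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)
a, b, c = 1, 2, 3

print(slice_along(arr, 2, (a, b, c)).shape)  # same selection as arr[a, b, :, c]
print(slice_along(arr, 1, (a, b, c)).shape)  # same selection as arr[a, :, b, c]
```

Since moveaxis returns a view, no data is copied before the slice is taken.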

The idea of moving the slicing axis to one end is a good one. Various numpy functions use that idea.
In [171]: arr = np.ones((2,3,4,5),int)
In [172]: arr[0,0,:,0].shape
Out[172]: (4,)
In [173]: arr[0,:,0,0].shape
Out[173]: (3,)
Another idea is to build an indexing tuple:
In [176]: idx = (0,0,slice(None),0)
In [177]: arr[idx].shape
Out[177]: (4,)
In [178]: idx = (0,slice(None),0,0)
In [179]: arr[idx].shape
Out[179]: (3,)
To do this programmatically it may be easier to start with a list or array that can be modified, and then convert it to a tuple for indexing. Details will vary depending on how you prefer to specify the axis and variables.
If any of a,b,c are arrays (or lists), you may get some shape surprises, since it's a case of mixing advanced and basic indexing. But as long as they are scalars, that's not an issue.
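Building the indexing tuple programmatically might look like the sketch below. The function name index_with_slice and the way the arguments are passed are assumptions made for illustration; it starts from a list, inserts slice(None) at pos, and converts to a tuple, as described above.

```python
import numpy as np

def index_with_slice(arr, pos, fixed):
    """Hypothetical helper: put a full slice at position `pos`,
    with the scalar indices in `fixed` filling the other axes."""
    idx = list(fixed)              # a list can be modified
    idx.insert(pos, slice(None))   # slice(None) is what ':' means
    return arr[tuple(idx)]         # index with a tuple, not a list

arr = np.ones((2, 3, 4, 5), int)

print(index_with_slice(arr, 2, (0, 0, 0)).shape)  # like arr[0, 0, :, 0]
print(index_with_slice(arr, 1, (0, 0, 0)).shape)  # like arr[0, :, 0, 0]
```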

You could np.transpose the array arr before slicing it, so that your axis of interest (i.e. the :) is moved "to the back". This way, arr is rearranged such that you can always call arr[a,b,c].
Example with only a and b:
import numpy as np
a = 0
b = 2
target_axis = 1
# Generate some random data
arr = np.random.randint(10, size=[3, 3, 3], dtype=int)
print(arr)
#[[[0 8 2]
# [3 9 4]
# [0 3 6]]
#
# [[8 5 4]
# [9 8 5]
# [8 6 1]]
#
# [[2 2 5]
# [5 3 3]
# [9 1 8]]]
# Define transpose s.t. target_axis is the last axis
transposed_shape = np.arange(arr.ndim)
transposed_shape = np.delete(transposed_shape, target_axis)
transposed_shape = np.append(transposed_shape, target_axis)
print(transposed_shape)
#[0 2 1]
# Caution! These 0 and 2 above do not come from a or b.
# Instead they are the indices of the axes.
# Transpose arr
arr_T = np.transpose(arr, transposed_shape)
print(arr_T)
#[[[0 3 0]
# [8 9 3]
# [2 4 6]]
#
# [[8 9 8]
# [5 8 6]
# [4 5 1]]
#
# [[2 5 9]
# [2 3 1]
# [5 3 8]]]
print(arr_T[a,b])
#[2 4 6]

Related

Generalized version of np.roll

I have a 2D array
a = np.array([[0,1,2,3],[4,5,6,7]])
that is, a 2x4 array. I need to shift the elements of each of the two rows (the sub-arrays along axis 0), but with a different step for each, say 1 for the first and 2 for the second, so that the output will be
np.array([[1,2,3,0],[6,7,4,5]])
With np.roll this doesn't seem possible; at least, looking at the documentation, I don't see any useful hint. Is there another function that does this?
This is an attempt at a generalized version of numpy.roll.
import numpy as np
a = np.array([[0,1,2,3],[4,5,6,7]])
def roll(a, shifts, axis):
    assert a.shape[axis] == len(shifts)
    return np.stack([
        np.roll(np.take(a, i, axis), shifts[i]) for i in range(len(shifts))
    ], axis)
print(a)
print(roll(a, [-1, -2], 0))
print(roll(a, [1, 2, 1, 0], 1))
prints
[[0 1 2 3]
[4 5 6 7]]
[[1 2 3 0]
[6 7 4 5]]
[[4 1 6 3]
[0 5 2 7]]
Here, the parameter a is a numpy.array, shifts is an iterable containing one shift amount per index along axis, and axis is the axis whose sub-arrays are rolled individually (for axis=0 on a 2D array, each row is rolled). Note that this was only tested on two-dimensional arrays.
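For the 2D, row-wise case from the question, the Python-level loop can be avoided by computing the source column of every output element. This is a sketch under that assumption only; the name roll_rows is hypothetical and this is a swapped-in technique based on np.take_along_axis, not the approach of the answer above.

```python
import numpy as np

def roll_rows(a, shifts):
    """Hypothetical helper: roll each row of a 2-D array by its own shift,
    using the same sign convention as np.roll."""
    a = np.asarray(a)
    shifts = np.asarray(shifts)
    n_cols = a.shape[1]
    # Output column j of row r comes from input column (j - shifts[r]) % n_cols.
    cols = (np.arange(n_cols) - shifts[:, None]) % n_cols
    return np.take_along_axis(a, cols, axis=1)

a = np.array([[0, 1, 2, 3], [4, 5, 6, 7]])
print(roll_rows(a, [-1, -2]))
# [[1 2 3 0]
#  [6 7 4 5]]
```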

Change Numpy array values in-place

Say we have a randomly generated 2D 3x2 NumPy array, a = np.random.rand(3, 2), and I want to change the value of the element in the first row and column (i.e. a[0,0]) to 10. If I do
a[0][0] = 10
then it works and a[0,0] is changed to 10. But if I do
a[np.arange(1)][0] = 10
then nothing is changed. Why is this?
I want to change the values of some columns in a selected list of rows (indicated by a NumPy array) to other values (like a[row_indices][:,0] = 10), but it doesn't work because I'm passing in an array (or list) to select the rows.
a[x][y] is the wrong way to index. It happens to work in the first case, a[0][0] = 10, because a[0] returns a view, so doing result[y] = whatever modifies the original array. However, in the second case, a[np.arange(1)][0] = 10, a[np.arange(1)] returns a copy (because you are using array indexing), so the assignment changes that temporary copy and the original array is left untouched.
You should be using a[0, 0] = 10 or a[np.arange(1), 0] = 10
Advanced indexing always returns a copy, as a view cannot be guaranteed. In the words of the NumPy documentation:
Advanced indexing always returns a copy of the data (contrast with basic slicing that returns a view).
If you replace np.arange(1) with something that returns a view (or equivalent slicing) then you get back to basic indexing, and hence when you chain two views, the change is reflected into the original array.
For example:
import numpy as np
arr = np.arange(2 * 3).reshape((2, 3))
arr[0][0] = 10
print(arr)
# [[10 1 2]
# [ 3 4 5]]
arr = np.arange(2 * 3).reshape((2, 3))
arr[:1][0] = 10
print(arr)
# [[10 10 10]
# [ 3 4 5]]
arr = np.arange(2 * 3).reshape((2, 3))
arr[0][:1] = 10
print(arr)
# [[10 1 2]
# [ 3 4 5]]
etc.
If you have some row indices you want to use to modify the array, you can use them directly, but you cannot chain the indexing, e.g.:
arr = np.arange(5 * 3).reshape((5, 3))
row_indices = (0, 2)
arr[row_indices, 0] = 10
print(arr)
# [[10 1 2]
# [ 3 4 5]
# [10 7 8]
# [ 9 10 11]
# [12 13 14]]
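The view-versus-copy distinction described in this thread can be checked directly with np.shares_memory. The following is a small sketch with made-up array contents; the names chained and direct are only for illustration.

```python
import numpy as np

a = np.zeros((3, 2))

# Basic indexing gives a view onto the same memory.
print(np.shares_memory(a, a[0]))              # True
# Advanced (array) indexing gives an independent copy.
print(np.shares_memory(a, a[np.arange(1)]))   # False

row_indices = np.array([0, 2])

# Chained indexing assigns into the temporary copy, so `chained` is unchanged.
chained = np.zeros((3, 2))
chained[row_indices][:, 0] = 10

# A single indexing operation assigns into the original array.
direct = np.zeros((3, 2))
direct[row_indices, 0] = 10

print(chained[:, 0])  # still all zeros
print(direct[:, 0])   # rows 0 and 2 are now 10
```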

Numpy sorted array for loop error but original works fine

After I sort this NumPy array and remove all duplicate (y) values along with the corresponding (x) value for each duplicate, I use a for loop to draw rectangles at the remaining coordinates. Yet I get the error ValueError: too many values to unpack (expected 2), even though the array has the same layout as the original, just with the duplicates removed.
from graphics import *
import numpy as np
def main():
    win = GraphWin("A Window", 500, 500)
    # starting array
    startArray = np.array([[2, 1, 2, 3, 4, 7],
                           [5, 4, 8, 3, 7, 8]])
    # the following reshapes the array from all x's in one row and y's in second row
    # to x,y rows pairing the x with corresponding y value.
    # then it searches for duplicate (y) values and removes both the duplicate (y) and
    # its corresponding (x) value by removing the row.
    # then the unique [x,y]'s array is reshaped back to a [[x,....],[y,....]] array to be used to draw rectangles.
    d = startArray.reshape((-1), order='F')
    # reshape to [x,y] matching the proper x&y's together
    e = d.reshape((-1, 2), order='C')
    # searching for duplicate (y) values and removing that row so the corresponding (x) is removed too.
    f = e[np.unique(e[:, 1], return_index=True)[1]]
    # converting unique array back to original shape
    almostdone = f.reshape((-1), order='C')
    # final reshape to return to original starting shape but is only unique values
    done = almostdone.reshape((2, -1), order='F')
    # print all the shapes and elements
    print("this is d reshape of original/start array:", d)
    print("this is e reshape of d:\n", e)
    print("this is f unique of e:\n", f)
    print("this is almost done:\n", almostdone)
    print("this is done:\n", done)
    print("this is original array:\n",startArray)
    # loop to draw a rectangle with each x,y value being pulled from the x and y rows
    # says too many values to unpack?
    for x,y in np.nditer(done,flags = ['external_loop'], order = 'F'):
        print("this is x,y:", x,y)
        print("this is y:", y)
        rect = Rectangle(Point(x,y),Point(x+4,y+4))
        rect.draw(win)
    win.getMouse()
    win.close()
main()
here is the output:
line 42, in main
for x,y in np.nditer(done,flags = ['external_loop'], order = 'F'):
ValueError: too many values to unpack (expected 2)
this is d reshape of original/start array: [2 5 1 4 2 8 3 3 4 7 7 8]
this is e reshape of d:
[[2 5]
[1 4]
[2 8]
[3 3]
[4 7]
[7 8]]
this is f unique of e:
[[3 3]
[1 4]
[2 5]
[4 7]
[2 8]]
this is almost done:
[3 3 1 4 2 5 4 7 2 8]
this is done:
[[3 1 2 4 2]
[3 4 5 7 8]]
this is original array:
[[2 1 2 3 4 7]
[5 4 8 3 7 8]]
Why would the for loop work for the original array but not for this sorted one?
Or what loop could I use to iterate over (f) directly, since it is sorted but has shape (-1, 2)?
I also tried a different loop:
for x,y in done[np.nditer(done,flags = ['external_loop'], order = 'F')]:
Which seems to fix the too many values error but I get:
IndexError: index 3 is out of bounds for axis 0 with size 2
and
FutureWarning: Using a non-tuple sequence for multidimensional indexing is
deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this
will be interpreted as an array index, `arr[np.array(seq)]`, which will
result either in an error or a different result.
for x,y in done[np.nditer(done,flags = ['external_loop'], order = 'F')]:
I've looked up how to fix this on Stack Exchange, but I keep getting the error regardless of how I write the syntax.
Any help would be great, thanks!
I don't have the graphics package (it might be a windows specific thing?), but I do know that you're making this waaaaay too complicated. Here is a much simpler version that produces the same done array:
from graphics import *
import numpy as np
# window to draw into (same call as in the question)
win = GraphWin("A Window", 500, 500)
# starting array
startArray = np.array([[2, 1, 2, 3, 4, 7],
                       [5, 4, 8, 3, 7, 8]])
# searching for duplicate (y) values and removing that row so the corresponding (x) is removed too.
done = startArray.T[np.unique(startArray[1,:], return_index=True)[1]]
# each row of done is one (x, y) pair
for x,y in done:
    print("this is x,y:", x, y)
    print("this is y:", y)
    rect = Rectangle(Point(x,y),Point(x+4,y+4))
    rect.draw(win)
Note that in the above version done.shape == (5, 2) instead of (2, 5), but you can always change that back after the for loop with done = done.T.
Here are some notes on your original code for future reference:
The order flag in reshape is completely superfluous to what your code is trying to do, and just makes it more confusing/potentially more buggy. You can do all of the reshapes you wanted to do without it.
The use case for nditer is to iterate over the individual elements of one (or more) arrays, one at a time. It cannot in general be used to iterate over the rows or columns of a 2D array. If you try to use it this way, you're likely to get buggy results that are highly dependent on the layout of the array in memory (as you saw).
To iterate over rows or columns of a 2D array, just use simple iteration. If you just iterate over an array (e.g. for row in arr:), you get each row, one at a time. If you want the columns instead, you can transpose the array first (like I did in the code above with .T).
Note about .T
.T takes the transpose of an array. For example, if you start with
arr = np.array([[0, 1, 2, 3],
                [4, 5, 6, 7]])
then the transpose is:
arr.T==np.array([[0, 4],
                 [1, 5],
                 [2, 6],
                 [3, 7]])
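The de-duplication and the plain row iteration recommended above can be tried without the graphics package. This is a sketch only: the names start, first_idx, points and corners are illustrative, and the rectangle corners are collected as tuples instead of being drawn.

```python
import numpy as np

start = np.array([[2, 1, 2, 3, 4, 7],
                  [5, 4, 8, 3, 7, 8]])

# Index of the first occurrence of each unique y value (sorted by y).
_, first_idx = np.unique(start[1], return_index=True)

# Transpose so each row is an (x, y) pair, then keep one row per unique y.
points = start.T[first_idx]

# Plain iteration over a 2-D array yields one row at a time.
corners = [(int(x), int(y), int(x) + 4, int(y) + 4) for x, y in points]

print(points)
print(corners[0])
```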

Iterating through matrix in python using numpy

I want to generate a resultant matrix by iterating through 5 different matrices. First, I want to take the first value of each matrix, average these values, and store the result as the first value of the resultant matrix, and so on for the remaining positions. Can anyone tell me how to do this in Python using the NumPy library?
In general you want to avoid (potentially slow) Python-based looping and let NumPy do (faster) C-based looping (or no looping at all).
Most people would call this approach of removing explicit loops (NumPy) vectorization, which is usually very important when performance matters.
The following example creates 5 NumPy arrays of shape (3,3) and calculates a new array of the same shape containing the element-wise mean over the five inputs (we are interpreting the 2D arrays as matrices). The np.matrix type also exists, but it is more or less deprecated and not used here; most NumPy users should use arrays instead.
Code:
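import numpy as np
a, b, c, d, e = [np.random.randint(0, 5, size=(3,3)) for _ in range(5)]
# named stacked rather than all, to avoid shadowing the built-in all()
stacked = np.stack((a, b, c, d, e), axis=0)
print(stacked.shape)
# average over the new leading axis, i.e. across the five matrices
x = np.mean(stacked, axis=0)
print(a)
print(b)
print(c)
print(d)
print(e)
print(x)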
Out:
(5, 3, 3)
[[0 0 0]
[0 1 0]
[2 4 0]]
[[4 2 0]
[3 3 4]
[0 4 0]]
[[3 4 0]
[2 2 1]
[0 0 4]]
[[3 1 2]
[4 3 4]
[2 0 3]]
[[3 4 2]
[3 1 0]
[1 0 0]]
[[ 2.6 2.2 0.8]
[ 2.4 2. 1.8]
[ 1. 1.6 1.4]]
If you still want to loop, you can just use a nested loop like:
for row in range(array.shape[0]):
    for col in range(array.shape[1]):
        cell_value = array[row, col]
        ...
given an array of 2 dimensions.
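To see that the vectorized mean and the explicit nested loop agree, one could compare them as in the sketch below. It assumes five 3x3 integer matrices generated with a seeded np.random.default_rng; the names matrices, vectorized and looped are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded so the run is repeatable
matrices = [rng.integers(0, 5, size=(3, 3)) for _ in range(5)]

# Vectorized: stack into shape (5, 3, 3) and average over the first axis.
vectorized = np.stack(matrices, axis=0).mean(axis=0)

# Explicit loops: average the five values at each (row, col) position.
looped = np.empty((3, 3))
for row in range(3):
    for col in range(3):
        looped[row, col] = sum(m[row, col] for m in matrices) / len(matrices)

print(np.allclose(vectorized, looped))
```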

Iterate over columns of a NumPy array and elements of another one?

I am trying to replicate the behaviour of zip(a, b) in order to be able to loop simultaneously along two NumPy arrays. In particular, I have two arrays a and b:
a.shape=(n,m)
b.shape=(m,)
On every iteration I would like to get a column of a and an element of b.
So far, I have tried the following:
for a_column, b_element in np.nditer([a, b]):
    print(a_column)
However, this prints the element a[0,0] rather than the column a[:,0], which is what I want.
How can I solve this?
You can still use zip on numpy arrays, because they are iterables.
In your case, you'd need to transpose a first, to make it an array of shape (m,n), i.e. an iterable of length m:
for a_column, b_element in zip(a.T, b):
    ...
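A runnable version of the zip approach, using small example arrays chosen here for illustration (n=3, m=4), might look like this:

```python
import numpy as np

n, m = 3, 4
a = np.arange(n * m).reshape(n, m)  # shape (n, m)
b = np.arange(m)                    # shape (m,)

# a.T has shape (m, n), so iterating over it yields the columns of a.
pairs = [(a_column.tolist(), int(b_element))
         for a_column, b_element in zip(a.T, b)]

for a_column, b_element in pairs:
    print(a_column, b_element)
# [0, 4, 8] 0
# [1, 5, 9] 1
# [2, 6, 10] 2
# [3, 7, 11] 3
```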
Adapting my answer in shallow iteration with nditer,
nditer and ndindex can be used to iterate over rows or columns by generating indexes.
In [19]: n,m=3,4
In [20]: a=np.arange(n*m).reshape(n,m)
In [21]: b=np.arange(m)
In [22]: it=np.nditer(b)
In [23]: for i in it: print(a[:,i], b[i])
[0 4 8] 0
[1 5 9] 1
[ 2 6 10] 2
[ 3 7 11] 3
In [24]: for i in np.ndindex(m): print(a[:,i], b[i])
[[0]
[4]
[8]] 0
[[1]
[5]
[9]] 1
[[ 2]
[ 6]
[10]] 2
[[ 3]
[ 7]
[11]] 3
ndindex uses an iterator like: it = np.nditer(b, flags=['multi_index']).
For iteration over a single dimension like this, for i in range(m): works just as well.
Also from the other thread, here's a trick using order to iterate without the indexes:
In [28]: for i,j in np.nditer([a,b],order='F',flags=['external_loop']):
    ...:     print(i, j)
[0 4 8] [0 0 0]
[1 5 9] [1 1 1]
[ 2 6 10] [2 2 2]
[ 3 7 11] [3 3 3]
Usually, because of NumPy's ability to broadcast arrays, it is not necessary to iterate over the columns of an array one-by-one. For example, if a has shape (n,m) and b has shape (m,) then you can add a+b and b will broadcast itself to shape (n, m) automatically.
Moreover, your calculation will complete much faster if it can be expressed through operations on the whole array, a, rather than through operations on pieces of a (such as on columns) using a Python for-loop.
Having said that, the easiest way to loop through the columns of a is to iterate over the index:
for i in np.arange(b.shape[0]):
    a_column, b_element = a[:, i], b[i]
    print(a_column)
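The broadcasting point made above can be sketched by comparing a + b with the column-by-column loop. The array contents and the names broadcast_sum and looped are made up for illustration.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)   # shape (n, m) = (3, 4)
b = np.array([10, 20, 30, 40])    # shape (m,)

# Broadcasting: b is treated as if it were repeated for every row of a.
broadcast_sum = a + b

# The same result, built one column at a time with a Python loop.
looped = np.empty_like(a)
for i in range(b.shape[0]):
    looped[:, i] = a[:, i] + b[i]

print(np.array_equal(broadcast_sum, looped))
```

When the per-column work can be written as a whole-array expression like this, the loop is usually unnecessary.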
