Change Numpy array values in-place - python

Say we have a randomly generated 3x2 NumPy array, a = np.random.rand(3, 2), and I want to change the value of the element in the first row and column (i.e. a[0,0]) to 10. If I do
a[0][0] = 10
then it works and a[0,0] is changed to 10. But if I do
a[np.arange(1)][0] = 10
then nothing is changed. Why is this?
I want to change the values in some columns for a selected list of rows (given as a NumPy array) to other values, e.g. a[row_indices][:,0] = 10, but this doesn't work because I'm passing in an array (or list) of row indices.

a[x][y] is wrong. It happens to work in the first case, a[0][0] = 10, because a[0] returns a view, so doing result[y] = whatever modifies the original array. In the second case, a[np.arange(1)][0] = 10, a[np.arange(1)] returns a copy (because you are using array indexing), so the assignment only changes that temporary copy and the original array is left untouched.
You should be using a[0, 0] = 10 or a[np.arange(1), 0] = 10
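A minimal sketch of the difference, using a small made-up array (the names before, after_chained and after_single are illustrative only):

```python
import numpy as np

a = np.zeros((3, 2))
before = a.copy()

# Chained indexing: a[np.arange(1)] is a copy, so the assignment
# lands in a temporary and `a` itself is left untouched.
a[np.arange(1)][0] = 10
after_chained = a.copy()

# Single indexing operation: the assignment goes into `a` itself.
a[np.arange(1), 0] = 10
after_single = a.copy()

print(np.array_equal(before, after_chained))  # True
print(after_single[0, 0])                     # 10.0
```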

Advanced indexing always returns a copy, as a view cannot be guaranteed. The NumPy documentation puts it this way:
Advanced indexing always returns a copy of the data (contrast with basic slicing that returns a view).
If you replace np.arange(1) with a basic index that returns a view (such as an equivalent slice), you are back to basic indexing, and when you chain two views the change is reflected in the original array.
For example:
import numpy as np
arr = np.arange(2 * 3).reshape((2, 3))
arr[0][0] = 10
print(arr)
# [[10 1 2]
# [ 3 4 5]]
arr = np.arange(2 * 3).reshape((2, 3))
arr[:1][0] = 10
print(arr)
# [[10 10 10]
# [ 3 4 5]]
arr = np.arange(2 * 3).reshape((2, 3))
arr[0][:1] = 10
print(arr)
# [[10 1 2]
# [ 3 4 5]]
etc.
If you have row indices that you want to use to modify the array, you can use them directly, but you cannot chain the indexing, e.g.:
arr = np.arange(5 * 3).reshape((5, 3))
row_indices = (0, 2)
arr[row_indices, 0] = 10
print(arr)
# [[10 1 2]
# [ 3 4 5]
# [10 7 8]
# [ 9 10 11]
# [12 13 14]]
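If several columns need to change for the selected rows, one option is np.ix_, which pairs every selected row with every selected column in a single indexing operation. This is only a sketch; col_indices is a made-up name for illustration:

```python
import numpy as np

arr = np.arange(5 * 3).reshape((5, 3))
row_indices = [0, 2]
col_indices = [0, 1]  # hypothetical columns to overwrite

# np.ix_ builds an open mesh, so the assignment touches the
# 2x2 block formed by rows (0, 2) and columns (0, 1).
arr[np.ix_(row_indices, col_indices)] = 10
print(arr)
# [[10 10  2]
#  [ 3  4  5]
#  [10 10  8]
#  [ 9 10 11]
#  [12 13 14]]
```

When the columns form a contiguous range, arr[row_indices, :2] = 10 should have the same effect.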

Related

Copy data to an array with 2 different way of slicing produces different results

Consider the following scenario:
arr = np.zeros(10)
subset1 = arr[:5] # the first 5 elements of arr
subset2 = subset1[:2] # the first 2 elements of arr
Now, assigning data to subset2 (subset2[:] = 1) will have the expected effect on arr: its first two elements will be 1. However, doing the same thing with a different way of slicing does not work:
subset1 = arr[[0, 1, 2, 3, 4]] # first 5 elements of arr
subset2 = subset1[[0, 1]] # first 2 elements of arr
Now assigning subset2[:] = 1 has no effect on arr. Why?
Note: doing arr[:5][[0, 1]] = 1 works.
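One way to see what is going on is to check which results share memory with arr. The sketch below uses np.shares_memory; the names view and picked are made up for illustration:

```python
import numpy as np

arr = np.zeros(10)

view = arr[:5]                  # basic slicing: a view of arr
picked = arr[[0, 1, 2, 3, 4]]   # list index: an independent copy

print(np.shares_memory(arr, view))    # True
print(np.shares_memory(arr, picked))  # False

# An indexed assignment writes into the array being indexed.
# Here that array is the view arr[:5], so arr itself changes.
arr[:5][[0, 1]] = 1
print(arr[:3])  # [1. 1. 0.]
```

This also suggests why the note above holds: the list index appears on the left-hand side of an assignment applied to a view, so no intermediate copy is written to.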

Get part of np array with parameters

I am using Python and NumPy, with an n-dimensional array.
I want to select all elements with index like
arr[a,b,:,c]
but I want to be able to pass the position of the slice as a parameter. For example:
#pos =2
arr[a,b,:,c]
#pos =1
arr[a,:,b,c]
I would move the axis of interest (at pos) to the front with numpy.moveaxis(array, pos, 0) and then simply slice with [:,a,b,c].
There is also numpy.take, but in your case you would still need to loop over each dimension a,b,c, so I think moveaxis is more convenient. Maybe there is an even more direct way to do this.
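A rough sketch of the moveaxis idea follows. The helper name slice_along is made up here, and it assumes a, b and c are scalars:

```python
import numpy as np

def slice_along(arr, pos, index):
    """Return the 1-D slice along axis `pos`, fixing the other axes at `index`."""
    moved = np.moveaxis(arr, pos, 0)  # axis `pos` becomes axis 0
    return moved[(slice(None),) + tuple(index)]

arr = np.arange(2 * 3 * 4 * 5).reshape((2, 3, 4, 5))
a, b, c = 1, 2, 3

print(np.array_equal(slice_along(arr, 2, (a, b, c)), arr[a, b, :, c]))  # True
print(np.array_equal(slice_along(arr, 1, (a, b, c)), arr[a, :, b, c]))  # True
```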
The idea of moving the slicing axis to one end is a good one. Various numpy functions use that idea.
In [171]: arr = np.ones((2,3,4,5),int)
In [172]: arr[0,0,:,0].shape
Out[172]: (4,)
In [173]: arr[0,:,0,0].shape
Out[173]: (3,)
Another idea is to build an indexing tuple:
In [176]: idx = (0,0,slice(None),0)
In [177]: arr[idx].shape
Out[177]: (4,)
In [178]: idx = (0,slice(None),0,0)
In [179]: arr[idx].shape
Out[179]: (3,)
To do this programmatically it may be easier to start with a list or array that can be modified, and then convert it to a tuple for indexing. Details will vary depending on how you prefer to specify the axis and variables.
If any of a,b,c are arrays (or lists), you may get some shape surprises, since it's a case of mixing advanced and basic indexing. But as long as they are scalars, that's not an issue.
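As one possible way to build that tuple programmatically, here is a small sketch. The helper index_with_slice is hypothetical, and it assumes the fixed indices are scalars:

```python
import numpy as np

def index_with_slice(pos, values):
    """Build an index tuple with a full slice at `pos` and `values` elsewhere."""
    idx = list(values)            # start from a mutable list
    idx.insert(pos, slice(None))  # slice(None) is the same as ':'
    return tuple(idx)             # convert to a tuple for indexing

arr = np.ones((2, 3, 4, 5), int)
print(arr[index_with_slice(2, (0, 0, 0))].shape)  # (4,)
print(arr[index_with_slice(1, (0, 0, 0))].shape)  # (3,)
```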
You could np.transpose the array arr before slicing it, so that your axis of interest (i.e. the :) is moved "to the back". This way you can rearrange arr so that you can always call arr[a,b,c].
Example with only a and b:
import numpy as np
a = 0
b = 2
target_axis = 1
# Generate some random data
arr = np.random.randint(10, size=[3, 3, 3], dtype=int)
print(arr)
#[[[0 8 2]
# [3 9 4]
# [0 3 6]]
#
# [[8 5 4]
# [9 8 5]
# [8 6 1]]
#
# [[2 2 5]
# [5 3 3]
# [9 1 8]]]
# Define transpose s.t. target_axis is the last axis
transposed_shape = np.arange(arr.ndim)
transposed_shape = np.delete(transposed_shape, target_axis)
transposed_shape = np.append(transposed_shape, target_axis)
print(transposed_shape)
#[0 2 1]
# Caution! These 0 and 2 above do not come from a or b.
# Instead they are the indices of the axes.
# Transpose arr
arr_T = np.transpose(arr, transposed_shape)
print(arr_T)
#[[[0 3 0]
# [8 9 3]
# [2 4 6]]
#
# [[8 9 8]
# [5 8 6]
# [4 5 1]]
#
# [[2 5 9]
# [2 3 1]
# [5 3 8]]]
print(arr_T[a,b])
#[2 4 6]

Numpy array add inplace, values not summed when select same row multiple times

Suppose I have a matrix and I want to select a few rows and add another array of the correct shape to them in place. The problem is that when a row is selected multiple times, the values from the other array are not summed:
Example:
I have a 5x3 matrix:
>>> import numpy as np
>>> x = np.arange(15).reshape((5,3))
>>> print(x)
[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]
[ 9 10 11]
[12 13 14]]
I want to select a few rows, and add values:
>>> x[np.array([[1,1],[2,3]])] # row 1 is selected twice
[[[ 3 4 5]
[ 3 4 5]]
[[ 6 7 8]
[ 9 10 11]]]
>>> add_value = np.random.randint(0,10,(2,2,3))
[[[6 1 2] # add to row 1
[9 8 5]] # add to row 1 again!
[[5 0 5] # add to row 2
[1 9 3]]] # add to row 3
>>> x[np.array([[1,1],[2,3]])] += add_value
>>> print(x)
[[ 0 1 2]
[12 12 10] # [12,12,10]=[3,4,5]+[9,8,5]
[11 7 13]
[10 19 14]
[12 13 14]]
As shown above, row 1 is [12,12,10], which means [6,1,2] and [9,8,5] were not both added to row 1; only [9,8,5] was. Are there any solutions? Thanks!
This behavior is described in the numpy documentation, near the bottom of this page, under "assigning values to indexed arrays":
https://numpy.org/doc/stable/user/basics.indexing.html#basics-indexing
Quoting:
Unlike some of the references (such as array and mask indices) assignments are always made to the original data in the array (indeed, nothing else would make sense!). Note though, that some actions may not work as one may naively expect. This particular example is often surprising to people:
>>> x = np.arange(0, 50, 10)
>>> x
array([ 0, 10, 20, 30, 40])
>>> x[np.array([1, 1, 3, 1])] += 1
>>> x
array([ 0, 11, 20, 31, 40])
Where people expect that the 1st location will be incremented by 3. In fact, it will only be incremented by 1. The reason is that a new array is extracted from the original (as a temporary) containing the values at 1, 1, 3, 1, then the value 1 is added to the temporary, and then the temporary is assigned back to the original array. Thus the value of the array at x[1] + 1 is assigned to x[1] three times, rather than being incremented 3 times.
Just want to share what @hpaulj suggests, which uses np.add.at:
>>> import numpy as np
>>> x = np.arange(15).reshape((5,3))
>>> select = np.array([[1,1],[2,3]])
>>> add_value = np.array([[[6,1,2],[9,8,5]],[[5,0,5],[1,9,3]]])
>>> np.add.at(x, select.flatten(), add_value.reshape(-1, add_value.shape[-1]))
>>> print(x)
[[ 0 1 2]
[18 13 12]
[11 7 13]
[10 19 14]
[12 13 14]]
Now row 1 is [18,13,12], which is the sum of [3,4,5], [6,1,2] and [9,8,5].
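For comparison, here is the same np.add.at call applied to the 1-D example quoted from the documentation. This is just a sketch to show the unbuffered behaviour:

```python
import numpy as np

x = np.arange(0, 50, 10)

# np.add.at applies the addition once per index occurrence,
# so index 1 (listed three times) is incremented three times.
np.add.at(x, [1, 1, 3, 1], 1)
print(x)  # [ 0 13 20 31 40]
```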

Pad list of arrays with zeros in order all arrays to have the same size

I have created this array (or I think it's a list) that consists of many arrays of different sizes, which is why I set dtype=object.
m = [data[a:b] for a, b in zip(z[0:-1:2], z[1:-1:2])]
array = np.array(m, dtype=object)
I need to pad each array with zeros so that they all have the same size (let's say size=smax) and become a "proper" array. My terminology is a little off, and I am sorry in advance.
You can do this using np.pad on each row. For example:
import numpy as np
data = np.arange(10)
z = [0, 2, 1, 4, 6, 10, 8, 9]
m = [data[a:b] for a, b in zip(z[0:-1:2], z[1:-1:2])]
max_length = max(len(row) for row in m)
result = np.array([np.pad(row, (0, max_length-len(row))) for row in m])
print(result)
# [[0 1 0 0]
# [1 2 3 0]
# [6 7 8 9]]
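An alternative sketch, in case calling np.pad once per row is not convenient: preallocate a zero-filled array and copy each row into its left part. The rows list below is made-up sample data matching the output above, and smax is the name used in the question:

```python
import numpy as np

rows = [np.array([0, 1]), np.array([1, 2, 3]), np.array([6, 7, 8, 9])]
smax = max(len(row) for row in rows)

# Start from zeros, then overwrite the first len(row) entries of each row.
result = np.zeros((len(rows), smax), dtype=rows[0].dtype)
for i, row in enumerate(rows):
    result[i, :len(row)] = row

print(result)
# [[0 1 0 0]
#  [1 2 3 0]
#  [6 7 8 9]]
```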

Numpy sorted array for loop error but original works fine

After I sort this NumPy array and remove all duplicate (y) values, along with the corresponding (x) value for each duplicate, I use a for loop to draw rectangles at the remaining coordinates. Yet I get the error ValueError: too many values to unpack (expected 2), even though the array has the same shape as the original, just with the duplicates removed.
from graphics import *
import numpy as np
def main():
    win = GraphWin("A Window", 500, 500)
    # starting array
    startArray = np.array([[2, 1, 2, 3, 4, 7],
                           [5, 4, 8, 3, 7, 8]])
    # the following reshapes the array from all x's in one row and y's in the second row
    # to x,y rows, pairing each x with its corresponding y value.
    # then it searches for duplicate (y) values and removes both the duplicate (y) and
    # its corresponding (x) value by removing the row.
    # then the unique [x,y]'s array is reshaped back to a [[x,....],[y,....]] array to be used to draw rectangles.
    d = startArray.reshape((-1), order='F')
    # reshape to [x,y] matching the proper x&y's together
    e = d.reshape((-1, 2), order='C')
    # searching for duplicate (y) values and removing that row so the corresponding (x) is removed too.
    f = e[np.unique(e[:, 1], return_index=True)[1]]
    # converting unique array back to original shape
    almostdone = f.reshape((-1), order='C')
    # final reshape to return to original starting shape but is only unique values
    done = almostdone.reshape((2, -1), order='F')
    # print all the shapes and elements
    print("this is d reshape of original/start array:", d)
    print("this is e reshape of d:\n", e)
    print("this is f unique of e:\n", f)
    print("this is almost done:\n", almostdone)
    print("this is done:\n", done)
    print("this is original array:\n",startArray)
    # loop to draw a rectangle with each x,y value being pulled from the x and y rows
    # says too many values to unpack?
    for x,y in np.nditer(done,flags = ['external_loop'], order = 'F'):
        print("this is x,y:", x,y)
        print("this is y:", y)
        rect = Rectangle(Point(x,y),Point(x+4,y+4))
        rect.draw(win)
    win.getMouse()
    win.close()
main()
here is the output:
line 42, in main
for x,y in np.nditer(done,flags = ['external_loop'], order = 'F'):
ValueError: too many values to unpack (expected 2)
this is d reshape of original/start array: [2 5 1 4 2 8 3 3 4 7 7 8]
this is e reshape of d:
[[2 5]
[1 4]
[2 8]
[3 3]
[4 7]
[7 8]]
this is f unique of e:
[[3 3]
[1 4]
[2 5]
[4 7]
[2 8]]
this is almost done:
[3 3 1 4 2 5 4 7 2 8]
this is done:
[[3 1 2 4 2]
[3 4 5 7 8]]
this is original array:
[[2 1 2 3 4 7]
[5 4 8 3 7 8]]
Why would the for loop work for the original array but not this sorted one?
Or what loop could I use to iterate over (f) directly, since it is sorted but has shape (-1,2)?
I also tried a different loop:
for x,y in done[np.nditer(done,flags = ['external_loop'], order = 'F')]:
Which seems to fix the too many values error but I get:
IndexError: index 3 is out of bounds for axis 0 with size 2
and
FutureWarning: Using a non-tuple sequence for multidimensional indexing is
deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this
will be interpreted as an array index, `arr[np.array(seq)]`, which will
result either in an error or a different result.
for x,y in done[np.nditer(done,flags = ['external_loop'], order = 'F')]:
I've looked this up on Stack Exchange, but I keep getting the error regardless of how I write the syntax.
Any help would be great, thanks!
I don't have the graphics package (it might be a windows specific thing?), but I do know that you're making this waaaaay too complicated. Here is a much simpler version that produces the same done array:
from graphics import *
import numpy as np
# window setup copied from the question, so that rect.draw(win) has a window to draw on
win = GraphWin("A Window", 500, 500)
# starting array
startArray = np.array([[2, 1, 2, 3, 4, 7],
                       [5, 4, 8, 3, 7, 8]])
# searching for duplicate (y) values and removing that row so the corresponding (x) is removed too.
done = startArray.T[np.unique(startArray[1,:], return_index=True)[1]]
# each row of done is one (x, y) pair
for x,y in done:
    print("this is x,y:", x, y)
    print("this is y:", y)
    rect = Rectangle(Point(x,y),Point(x+4,y+4))
    rect.draw(win)
To note, in the above version done.shape==(5, 2) instead of (2, 5), but you can always change that back after the for loop with done = done.T.
Here are some notes on your original code for future reference:
The order flag in reshape is completely superfluous to what your code is trying to do, and just makes it more confusing/potentially more buggy. You can do all of the reshapes you wanted to do without it.
The use case for nditer is to iterate over the individual elements of one (or more) array one at a time. It cannot in general be used to iterate over the rows or columns of a 2D array. If you try to use it this way, you're likely to get buggy results that are highly dependent on the layout of the array in memory (as you saw).
To iterate over rows or columns of a 2D array, just use simple iteration. If you just iterate over an array (eg for row in arr:), you get each row, one at a time. If you want the columns instead you can transpose the array first (like I did in the code above with .T).
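To illustrate the layout dependence mentioned in the notes above, here is a small sketch with made-up arrays (it does not use the graphics package, and chunk_lengths is a hypothetical helper). With external_loop and order='F', a C-contiguous 2-row array is handed back in chunks of length 2, while an F-contiguous array of the same shape comes back as one long chunk, which cannot be unpacked into x, y:

```python
import numpy as np

c_order = np.arange(10).reshape((2, 5))             # C-contiguous
f_order = np.arange(10).reshape((2, 5), order='F')  # F-contiguous

def chunk_lengths(arr):
    """Lengths of the 1-D chunks nditer yields in Fortran order."""
    it = np.nditer(arr, flags=['external_loop'], order='F')
    return [len(chunk) for chunk in it]

print(chunk_lengths(c_order))  # [2, 2, 2, 2, 2]
print(chunk_lengths(f_order))  # [10]

# Plain iteration over the transpose gives (x, y) pairs for either layout.
pairs = [(int(x), int(y)) for x, y in f_order.T]
print(pairs[:2])  # [(0, 1), (2, 3)]
```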
Note about .T
.T takes the transpose of an array. For example, if you start with
arr = np.array([[0, 1, 2, 3],
[4, 5, 6, 7]])
then the transpose is:
arr.T==np.array([[0, 4],
[1, 5],
[2, 6],
[3, 7]])
