I need a good explanation (reference) to explain NumPy slicing within (for) loops. I have three cases.
def example1(array):
    for row in array:
        row = row + 1
    return array
def example2(array):
    for row in array:
        row += 1
    return array
def example3(array):
    for row in array:
        row[:] = row + 1
    return array
A simple case:
ex1 = np.arange(9).reshape(3, 3)
ex2 = ex1.copy()
ex3 = ex1.copy()
returns:
>>> example1(ex1)
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> example2(ex2)
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
>>> example3(ex3)
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
It can be seen that the first result differs from the second and third.
First example:
You extract a row and add 1 to it, which creates a new array. Then you rebind the name row to that new array, but you don't change what the original array contains! So it will not affect the original array.
Second example:
You perform an in-place operation on row, which is a view into the original array, so this affects the original array - as long as row is an array.
If you were doing a double loop it wouldn't work anymore:
def example4(array):
    for row in array:
        for column in row:
            column += 1
    return array
example4(np.arange(9).reshape(3,3))
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
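This doesn't work because iterating over a row yields individual NumPy scalars, which are immutable just like Python ints. column += 1 therefore creates a new value and rebinds the name column, instead of calling np.ndarray's __iadd__ (which modifies the data the array points to). So the second example only works because your rows are NumPy arrays.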
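A minimal sketch of the double-loop case, assuming standard NumPy. The names grid, unchanged and vectorized are only illustrative; the point is that the loop variable is a scalar copy, while writing through an index (or using the whole-array +=) reaches the stored data.

```python
import numpy as np

grid = np.arange(9).reshape(3, 3)

# Iterating over a row yields immutable NumPy scalars, so += only rebinds `value`.
for row in grid:
    for value in row:
        value += 1
unchanged = grid.copy()          # snapshot: still 0..8

# Writing through an index does modify the array.
for i in range(grid.shape[0]):
    for j in range(grid.shape[1]):
        grid[i, j] += 1

# The vectorized form gives the same result without Python-level loops.
vectorized = np.arange(9).reshape(3, 3)
vectorized += 1

print(np.array_equal(unchanged, np.arange(9).reshape(3, 3)))  # True
print(np.array_equal(grid, vectorized))                       # True
```

In practice the vectorized form is usually preferable; the indexed loop is shown only to make the contrast with the scalar loop variable explicit.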
Third example:
row[:] = row + 1 computes a new array and then copies its values into the existing row, something like row[0] = row[0]+1, row[1] = row[1]+1, ... Again this works in place, so it affects the original array.
Bottom Line
If you are operating on mutable objects, like lists or np.ndarray, you need to be careful about what you change. A name such as row only refers to where the actual data is stored in memory, so rebinding the name (example1) doesn't affect the stored data. You need to write through the reference, either directly with [:] (example3) or indirectly with np.ndarray.__iadd__ (example2), to change the stored data.
In the first code, you don't do anything with the new computed row; you rebind the name row, and there is no connection to the array anymore.
In the second and the third, you don't rebind, but assign values to the old variable. With +=, an internal method is called, which varies depending on the type of the object you let it act upon. See links below.
If you write row + 1 on the right hand side, a new array is computed. In the first case, you tell python to give it the name row (and forget the original object which was called row before). And in the third, the new array is written to the slice of the old row.
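The three cases can be checked side by side. This is a small sketch, assuming standard NumPy; the names base, rebound, in_place and sliced are hypothetical and not from the question.

```python
import numpy as np

base = np.arange(9).reshape(3, 3)

# Case 1: rebinding the loop name leaves the array untouched.
rebound = base.copy()
for row in rebound:
    row = row + 1          # new array, bound to the name `row` only

# Case 2: augmented assignment modifies the row view in place.
in_place = base.copy()
for row in in_place:
    row += 1               # writes into the memory shared with in_place

# Case 3: slice assignment copies the new values into the existing row.
sliced = base.copy()
for row in sliced:
    row[:] = row + 1

# A row obtained by indexing or iteration is a view onto the parent's memory.
shares = np.shares_memory(in_place[0], in_place)

print(np.array_equal(rebound, base))       # True
print(np.array_equal(in_place, base + 1))  # True
print(np.array_equal(sliced, base + 1))    # True
print(shares)                              # True
```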
For further reading, follow the link in the comment on the question by @Thiru above, or read about assignment and rebinding in general.
I was wondering whether anyone could explain to me the syntax of this line of code
a[[1, 6, 7]] = 10
where a is an array.
The need for the double brackets is what is confusing me the most
Indexing with 1, 6, 7 (i.e. a[1, 6, 7]) would suggest you are indexing along three separate axes of a which is not what you are looking for. This corresponds to the following assignment:
a[1][6][7] = 10
Indexing with [1, 6, 7] (i.e. a[[1, 6, 7]]) means you are indexing a single axis with a list, which selects several different elements of array a at once. This corresponds to the following assignments:
a[1] = 10
a[6] = 10
a[7] = 10
If a is a numpy.array, the syntax a[x, y, z] has another use, which is to select from a 3-dimensional array the element at that position. In your case, a seems to be a 1-dimensional array, so you can only select by x (that is a[x]; a[x,y], for example, would be incorrect since there is no second dimension).
That x value can be either a single index (for instance a[0], where x is an integer) or several indices (for instance a[[0,1]], where x is a list of integers).
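A short sketch of the difference, assuming a is a 1-dimensional array of length 10 (the question does not give its actual shape). The 3-dimensional array cube is hypothetical and only illustrates where a[1, 6, 7] would be valid.

```python
import numpy as np

a = np.zeros(10, dtype=int)
a[[1, 6, 7]] = 10            # one axis, indexed with a list of positions
print(a)                     # [ 0 10  0  0  0  0 10 10  0  0]

# On a 1-D array, a[1, 6, 7] asks for three axes and fails.
too_many = False
try:
    a[1, 6, 7] = 10
except IndexError:
    too_many = True

# On a 3-D array the same syntax addresses a single element.
cube = np.zeros((2, 7, 8), dtype=int)
cube[1, 6, 7] = 10
print(cube[1][6][7])         # 10
```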
I am trying to access column values while indexing a 2-D array in NumPy, but I get the row values instead.
The general format is arr_2d[row][col] or arr_2d[row,col]. Comma notation is recommended for clarity.
arr_2d = np.arange(0,9).reshape((3,3))
# sub array
arr_2d[0:2,1:]
arr_2d[:,0] # returned as a 1-D array, but the data is the first column
When I try to access the column data this way, I get row values instead:
arr_2d[:][0] # gives the first row
What is the difference between comma notation and bracket notation?
arr_2d[row][col] only works as you intended, i.e. like arr_2d[row,col], if you pass an integer as the row index, not a slice.
For e.g.:
>>> arr_2d = np.arange(0,9).reshape((3,3))
>>> arr_2d[1][2]
5
>>> arr_2d[1,2]
5
But:
>>> arr_2d[:][2]
array([6, 7, 8])
This is because arr_2d[:] selects everything along the first axis, so it is essentially the original array again (a view of it, not a copy):
>>> arr_2d[:]
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> arr_2d
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
and:
>>> arr_2d[2]
array([6, 7, 8])
# so no surprises here:
>>> arr_2d[:][2]
array([6, 7, 8])
The notation arr_2d[:,0] translates to "select all items in dimension 0, and the first item in dimension 1", which amounts to the entire first column (item 0 of all rows).
The notation arr_2d[:][0] is chaining two operations:
arr_2d[:] means select all items in dimension 0 - basically referring to the entire matrix.
[0] simply selects the first item in the matrix returned by the first operation - returning the first row.
In order to select the first row, you can use either arr_2d[0] or more 'verbosely' arr_2d[0, :] (which translates to "all columns of the first row").
You could access the same items using both notations, but in different ways. For example -
In order to select the 3rd item in the 2nd row you could use:
Comma notation - arr_2d[1, 2]
Bracket notation - arr_2d[1][2]
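The comparison above can be reproduced with a few lines. This is a sketch assuming standard NumPy; the variable names other than arr_2d are illustrative.

```python
import numpy as np

arr_2d = np.arange(0, 9).reshape((3, 3))

first_column = arr_2d[:, 0]   # one indexing operation over both axes
first_row = arr_2d[:][0]      # arr_2d[:] is the whole array, then [0] picks row 0

# arr_2d[:] is a view of the same data, not a copy.
is_view = np.shares_memory(arr_2d[:], arr_2d)

print(first_column)           # [0 3 6]
print(first_row)              # [0 1 2]
print(is_view)                # True
print(arr_2d[1, 2], arr_2d[1][2])   # 5 5
```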
I have a rather basic question about the NumPy module in Python 2, particularly the version on trinket.io. I do not see how to replace values in a multidimensional array several layers in, regardless of the method. Here is an example:
a = numpy.array([1,2,3])
a[0] = 0
print a
a = numpy.array([[1,2,3],[1,2,3]])
a[0][0] = a[1][0] = 0
print a
Result:
array([0, 2, 3], '<class 'int'>')
array([[1, 2, 3], [1, 2, 3]], '<class 'int'>')
I need the ability to change individual values, my specific code being:
a = numpy.empty(shape = (8,8,2),dtype = str)
for row in range(a.shape[0]):
    for column in range(a.shape[1]):
        a[row][column][1] = 'a'
Thank you for your time and any help provided.
To change individual values you can simply do something like:
a[1,2] = 'b'
If you want to change all the array, you can do:
a[:,:] = 'c'
Use commas (array[a,b]) instead of (array[a][b])
With numpy version 1.11.0, I get
[[0 2 3]
 [0 2 3]]
when I run your code. I guess your numpy version is newer and better.
As user3408085 said, the correct thing is to use a[0,0] = 0 to change one element, or a[:,0] = 0 if you actually want to zero the entire first column.
The reason a[0][0]=0 does not modify a (at least in your version of numpy) is that a[0] is a new array there. If you break down your command a[0][0]=0 into 2 lines:
b=a[0]
b[0]=0
then the fact that this modifies a is arguably counterintuitive. In standard NumPy it does modify a, because a[0] is a view of a rather than a copy.
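A sketch of the comma-notation approach, assuming standard NumPy on Python 3 rather than the trinket.io implementation from the question. The names b, row and board are hypothetical. Note that dtype=str with no length gives one-character strings, so longer values would be truncated.

```python
import numpy as np

a = np.array([[1, 2, 3], [1, 2, 3]])
a[0, 0] = 0                   # comma notation: one indexing operation
a[1, 0] = 0
print(a.tolist())             # [[0, 2, 3], [0, 2, 3]]

# In standard NumPy, a[0] is a view, so the two-step form also writes through.
b = np.array([[1, 2, 3], [1, 2, 3]])
row = b[0]
row[0] = 0
print(b[0, 0])                # 0

# The 8x8x2 string array from the question, filled without explicit loops.
board = np.empty(shape=(8, 8, 2), dtype=str)
board[:, :, 1] = 'a'
print(board[3, 5, 1])         # a
```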
I feel silly, because this is such a simple thing, but I haven't found the answer either here or anywhere else.
Is there no straightforward way of indexing a numpy array with another?
Say I have a 2D array
>> A = np.asarray([[1, 2], [3, 4], [5, 6], [7, 8]])
array([[1, 2],
[3, 4],
[5, 6],
[7, 8]])
if I want to access element [3,1] I type
>> A[3,1]
8
Now, say I store this index in an array
>> ind = np.array([3,1])
and try using the index this time:
>> A[ind]
array([[7, 8],
[3, 4]])
the result is not A[3,1]
The question is: having arrays A and ind, what is the simplest way to obtain A[3,1]?
Just use a tuple:
>>> A[(3, 1)]
8
>>> A[tuple(ind)]
8
The A[] actually calls the special method __getitem__:
>>> A.__getitem__((3, 1))
8
and using a comma creates a tuple:
>>> 3, 1
(3, 1)
Putting these two basic Python principles together solves your problem.
You can store your index in a tuple in the first place, if you don't need NumPy array features for it.
That is because by passing an array you are actually asking for
A[[3,1]]
which gives the rows at index 3 and index 1 of the 2D array, instead of the element at column 1 of row 3 as you want.
You can use
A[ind[0],ind[1]]
You can also use (if you want several indexes at the same time):
A[indx,indy]
where indx and indy are numpy arrays of indexes for the first and second dimension respectively.
See here for all possible indexing methods for numpy arrays: http://docs.scipy.org/doc/numpy-1.10.1/user/basics.indexing.html
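The options from the answers above, put together in one sketch. A and ind are from the question; the values chosen for indx and indy are only illustrative.

```python
import numpy as np

A = np.asarray([[1, 2], [3, 4], [5, 6], [7, 8]])
ind = np.array([3, 1])

rows = A[ind]                 # array index on axis 0: rows 3 and 1
element = A[tuple(ind)]       # tuple index: same as A[3, 1]
same = A[ind[0], ind[1]]

# Paired index arrays pick A[0, 1] and A[3, 0].
indx = np.array([0, 3])
indy = np.array([1, 0])
pairs = A[indx, indy]

print(rows.tolist())          # [[7, 8], [3, 4]]
print(element, same)          # 8 8
print(pairs.tolist())         # [2, 7]
```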
I want to initialize an empty list and keep on adding new rows to it. For example.
myarray=[]
now at each iteration I want to add new row which I compute during iteration. For example
for i in range(5):
    calc=[i,i+1,i+4,i+5]
After calc I want to add this row to myarray. Therefore after the 1st iteration myarray would be 1x4, after the 2nd iteration it would be 2x4, etc. I tried numpy.concatenate. It simply adds to the same row, i.e. I get 1x4 and then 1x8. I tried vstack as well, but since myarray is initially [] it gives the error "all the input array dimensions except for the concatenation axis must match exactly".
It looks like you need a multi dimensional array
calc = [[0, 1, 4, 5]]
for i in range(1, 5):
    calc.append([i, i+1, i+4, i+5])
Will yield you the following array
calc = [[0, 1, 4, 5], [1, 2, 5, 6], [2, 3, 6, 7], [3, 4, 7, 8], [4, 5, 8, 9]]
To access the various elements of calc, you can index it like this:
calc[0] returns [0, 1, 4, 5]
calc[1] returns [1, 2, 5, 6]
I'm pretty sure this works, unless I'm misunderstanding:
mylist = [] #I'm using a list, not an array
for i in range(5):
    calc=[i,i+1,i+4,i+5]
    mylist.append(calc) #You're appending a list into another list, making a nested list
Now, a little more general knowledge. Append vs. Concatenate.
You want to append if you want to add into a list. In this case, you're adding a list into another list. You want to concatenate if you want to 'merge' two lists together to make a single list - which is why your implementation was not making a nested list.
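If a NumPy array is needed at the end, one option is to collect the rows in a list and convert once. The sketch below also shows a vstack variant that starts from an empty array with shape (0, 4) instead of [], which is an assumption about what the question was trying to do; the names rows and stacked are hypothetical.

```python
import numpy as np

# Option 1: append rows to a plain list, then convert once.
rows = []
for i in range(5):
    rows.append([i, i + 1, i + 4, i + 5])
myarray = np.array(rows)
print(myarray.shape)          # (5, 4)

# Option 2: vstack onto an empty array that already has 4 columns.
stacked = np.empty((0, 4), dtype=int)
for i in range(5):
    stacked = np.vstack([stacked, [i, i + 1, i + 4, i + 5]])
print(np.array_equal(stacked, myarray))   # True
```

Option 1 is usually the cheaper of the two, since vstack allocates a new array on every iteration.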