How does adding two axes get calculated? - python

Update: Apologies. I have updated the code to match the output now.
import numpy as np
x = np.arange(0,4,1)
matrix = x[:6,np.newaxis] + x[np.newaxis,:]
That produces a table that looks like this in rows and columns:
0 1 2 3
1 2 3 4
2 3 4 5
3 4 5 6
which is this array:
array([[0, 1, 2, 3],[1, 2, 3, 4],[2, 3, 4, 5],[3, 4, 5, 6]])
My question is how does numpy add up each cell to produce this result?

In my environment your code doesn't give that result. The matrix I get using your code is this:
[[0, 2],
[2, 4]]
To understand how this result is calculated, let's split up the process:
#--- 1
x = np.arange(0,4,2)
The first line gets an array of numbers between 0 and 4 with a step of 2 (not including the 4). The output of this line is the following array:
[0, 2]
Next the two parts of the equation:
#--- 2
part1 = x[:4,np.newaxis]
part2 = x[np.newaxis,:]
The first part takes up to the first 4 elements of array x (here x has only 2) and adds a new axis at index 1, so the elements of our original array now lie along axis 0 (rows). The output of this part has shape (2, 1):
[[0],
[2]]
The second part takes the array x and adds a new axis at index 0, so the elements of the original array now lie along axis 1 (columns). The output of this part has shape (1, 2):
[[0, 2]]
Finally adding the two parts:
matrix = part1 + part2
The result of this is the following matrix:
[[0, 2],
[2, 4]]
At each index combination [i, j], the result is the element in row i of part1 plus the element in column j of part2. For example, at [0, 1] the result is part1[0, 0] + part2[0, 1] = 0 + 2 = 2.
Similarly:
matrix[0, 0] = part1[0, 0] + part2[0, 0] = 0 + 0 = 0
matrix[0, 1] = part1[0, 0] + part2[0, 1] = 0 + 2 = 2
matrix[1, 0] = part1[1, 0] + part2[0, 0] = 2 + 0 = 2
matrix[1, 1] = part1[1, 0] + part2[0, 1] = 2 + 2 = 4
Edit: Yes, the updated code gives the result you mention. The same logic applies.
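To see the same logic on the updated 4-element array, here is a minimal sketch. The names col, row, col_full and row_full are only illustrative; np.broadcast_to is used to show the stretched operands that the addition conceptually works on.

```python
import numpy as np

x = np.arange(0, 4, 1)        # [0 1 2 3]
col = x[:, np.newaxis]        # shape (4, 1): values run down the rows
row = x[np.newaxis, :]        # shape (1, 4): values run across the columns

# Broadcasting conceptually repeats each operand until both have shape (4, 4).
col_full = np.broadcast_to(col, (4, 4))
row_full = np.broadcast_to(row, (4, 4))

matrix = col + row            # same values as col_full + row_full

print(col_full)
print(row_full)
print(matrix)
```

Each cell is therefore matrix[i, j] = x[i] + x[j]; no copies of the stretched arrays are actually made.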

Related

Is np.argpartition giving me the wrong results?

Take the following code:
import numpy as np
one_dim = np.array([2, 3, 1, 5, 4])
partitioned = np.argpartition(one_dim, 0)
print(f'Unpartitioned array: {one_dim}')
print(f'Partitioned array index: {partitioned}')
print(f'Partitioned array: {one_dim[partitioned]}')
The following output results:
Unpartitioned array: [2 3 1 5 4]
Partitioned array index: [2 1 0 3 4]
Partitioned array: [1 3 2 5 4]
The output for the partitioned array should be [1 2 3 5 4]. How is three on the left side of two? It seems to me the function is making an error or am I missing something?
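The second argument (kth) is the index whose element will be in its sorted position after partitioning. With kth=0, only index 0 of the partitioned array (value 1) is guaranteed to be in its sorted position. Everything to its right is greater than or equal to it, but in no particular order, so [1 3 2 5 4] is a correct result.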
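A small sketch of what is and is not guaranteed. The value kth = 2 is just an illustrative choice, and np.argsort is shown only as the option to use if a fully sorted order is what you want.

```python
import numpy as np

one_dim = np.array([2, 3, 1, 5, 4])

kth = 2  # illustrative choice, not from the question
idx = np.argpartition(one_dim, kth)
part = one_dim[idx]

# Guaranteed: position kth holds the value it would hold in a full sort,
# smaller-or-equal values come before it, larger-or-equal values after it.
# The order inside each side is not specified.
print(part[kth])                      # the kth smallest value
print((part[:kth] <= part[kth]).all())
print((part[kth:] >= part[kth]).all())

# For a fully sorted result, use argsort instead.
full = one_dim[np.argsort(one_dim)]
print(full)
```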

I have a 2d array. Need to make a loop to replace first 2 rows, then next 2 only rows and so on by ones and print till the loop ends

b = np.random.randint(0,10, (6,3))
I tried this code, but it raises `ValueError: operands could not be broadcast together with shapes (2,3) () (6,3)`
step = 2
r1 = 0
r2 = 2
while r2 <= len(b):
    c = np.where(b[r1:r2] >= 0, 1, b)
    print(c)
    r1+ = step
    r2+ = step
I think the problem is in the condition passed to np.where. It creates an array with a shape that is incompatible with the b array.
What I need is for the code to receive array b and return 3 arrays of the same size as b, each with two rows replaced by 1s. Like this:
[[1 1 1]
[1 1 1]
[6 3 4]
[2 9 3]
[6 9 2]
[8 1 0]]
[[3 2 8]
[3 8 5]
[1 1 1]
[1 1 1]
[6 9 2]
[8 1 0]]
[[3 2 8]
[3 8 5]
[6 3 4]
[2 9 3]
[1 1 1]
[1 1 1]]
My tutor told me to try it with the 'np.where' function. But it seems that this function doesn't support the type of condition I'm trying to feed to it. Maybe there is another way to get the desired output. All the examples I googled work with arbitrary values of the array and not specifically with rows. In pandas it is easier, but I need numpy code to feed the output to the neural network. The ones will be treated by it as empty values, but the size of the array will always be the same, thus not producing errors.
You are getting a ValueError because the shape of b[0:2] is not the same as the shape of b.
print(b.shape)
# (6, 3)
print(b[0:2].shape)
# (2, 3)
The documentation for numpy.where states that the way the condition works is "Where True, yield x, otherwise yield y." Thus, you need to be able to broadcast x and y onto the size of your condition. In your example, you can't broadcast (6,3) onto (2,3) and hence the error.
You need things to be the same size. For example, c = np.where(b[0:2] >= 0, 1, b[0:2]) would not give you an error.
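A minimal sketch of the mismatch and the matching-shape call. Here b is a fixed stand-in for the random array, and the names cond, raised and ok are only illustrative.

```python
import numpy as np

b = np.arange(18).reshape(6, 3)   # stand-in for the random (6, 3) array
cond = b[0:2] >= 0                # condition has shape (2, 3)

try:
    np.where(cond, 1, b)          # (2, 3) and (6, 3) cannot be broadcast together
    raised = False
except ValueError as err:
    raised = True
    print(err)

# With y sliced to the same rows, all three arguments broadcast to (2, 3).
ok = np.where(cond, 1, b[0:2])
print(ok.shape)
```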
However, if you want to step through your array b, then you need something other than b[0:2]. Otherwise it will just keep repeating that first part of your array. I think you probably want b[r1:r2].
Also, I notice that you have r1+ = step instead of r1 += step, which will also spit out an error. Note that you don't actually need both r1 and r2 since their offset is step.
Putting all this together, we can adjust your code to give you something that works:
import numpy as np
b = np.random.randint(0,5, (6,3))
step = 2
r1 = 0
while r1 <= len(b) - step:
    c = np.copy(b)
    c[r1:r1+step] = np.where(b[r1:r1+step] >= 0, 1, b[r1:r1+step])
    print(c)
    r1 += step
Or you could use a for loop instead of a while loop:
import numpy as np
b = np.random.randint(0,5, (6,3))
step = 2
for r1 in range(0, len(b), step):
    c = np.copy(b)
    c[r1:r1+step] = np.where(b[r1:r1+step] >= 0, 1, b[r1:r1+step])
    print(c)
Resulting output:
[[1 1 1]
[1 1 1]
[3 2 2]
[1 1 2]
[3 3 0]
[3 2 2]]
[[4 0 2]
[4 0 0]
[1 1 1]
[1 1 1]
[3 3 0]
[3 2 2]]
[[4 0 2]
[4 0 0]
[3 2 2]
[1 1 2]
[1 1 1]
[1 1 1]]
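If np.where should operate on the whole array at once, one possible variant is to build the condition from row numbers instead of from a slice of b. This is only a sketch: rows, in_block and results are illustrative names, not part of the code above. The (6, 1) condition broadcasts against the (6, 3) array, so every output has the shape of b.

```python
import numpy as np

b = np.random.randint(0, 10, (6, 3))
step = 2

# Column vector of row numbers, shape (6, 1).
rows = np.arange(len(b))[:, np.newaxis]

results = []
for r1 in range(0, len(b), step):
    # True for the rows r1 .. r1 + step - 1, False elsewhere.
    in_block = (rows >= r1) & (rows < r1 + step)
    # Where True take 1, otherwise keep the value from b.
    c = np.where(in_block, 1, b)
    results.append(c)
    print(c)
```

Since every selected row is overwritten unconditionally, assigning to a copy (c = b.copy(); c[r1:r1+step] = 1) would give the same arrays without np.where.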

The use of numpy.argmax

I'm here to inquire about the use of numpy.argmax
For instance, consider this array:
import numpy as np
a = np.arange(6).reshape(2,3)
b = np.argmax(a, axis = 0)
c = np.argmax(a, axis = 1)
print(a)
print(b)
print(c)
Here's the output:
[[0 1 2]
[3 4 5]]
[1 1 1]
[2 2]
I'm confused about the use of the parameter axis for numpy.argmax. What does it do? Why does it return [1 1 1] if axis = 0 and [2 2] if the value of axis = 1?
numpy.argmax() returns the position of the largest element in an array, optionally along rows or columns (the axis argument). So in the first case, [1 1 1], you get the row index of the largest element in each column. Since the elements in row 1 are all larger than the elements in row 0, you get your array of three ones. Analogously for axis=1, where you get the column index of the largest element in each row.
argmax returns to you the index of the max value along the axis you specified.
The exact comparisons it did to get there:
3 > 0, 4 > 1, 5 > 2 : [1 1 1]
2 is the largest of set [0 1 2]
5 is the largest of set [3 4 5]:
[2 2]
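The comparisons above can be written out as explicit loops over columns and rows. This is only a sketch for illustration; by_col, by_row and flat are made-up names.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# axis=0: walk down each column and report the row index of its maximum.
by_col = [int(np.argmax(a[:, j])) for j in range(a.shape[1])]

# axis=1: walk across each row and report the column index of its maximum.
by_row = [int(np.argmax(a[i, :])) for i in range(a.shape[0])]

# No axis: the array is flattened first, so the index refers to a.ravel().
flat = int(np.argmax(a))

print(by_col, np.argmax(a, axis=0))
print(by_row, np.argmax(a, axis=1))
print(flat)
```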

How to find numpy array shape in a larger array?

big_array = np.array((
[0,1,0,0,1,0,0,1],
[0,1,0,0,0,0,0,0],
[0,1,0,0,1,0,0,0],
[0,0,0,0,1,0,0,0],
[1,0,0,0,1,0,0,0]))
print(big_array)
[[0 1 0 0 1 0 0 1]
[0 1 0 0 0 0 0 0]
[0 1 0 0 1 0 0 0]
[0 0 0 0 1 0 0 0]
[1 0 0 0 1 0 0 0]]
Is there a way to iterate over this numpy array and for each 2x2 cluster of 0s, set all values within that cluster = 5? This is what the output would look like.
[[0 1 5 5 1 5 5 1]
[0 1 5 5 0 5 5 0]
[0 1 5 5 1 5 5 0]
[0 0 5 5 1 5 5 0]
[1 0 5 5 1 5 5 0]]
My thoughts are to use advanced indexing to set the 2x2 shape = to 5, but I think it would be really slow to simply iterate like:
1) check if array[x][y] is 0
2) check if adjacent array elements are 0
3) if all elements are 0, set all those values to 5.
big_array = [1, 7, 0, 0, 3]
i = 0
p = 0
while i <= len(big_array) - 1 and p <= len(big_array) - 2:
    if big_array[i] == big_array[p + 1]:
        big_array[i] = 5
        big_array[p + 1] = 5
        print(big_array)
    i = i + 1
    p = p + 1
Output:
[1, 7, 5, 5, 3]
This is just an example, not complete, correct code.
Here's a solution by viewing the array as blocks.
First you need to define this function rolling_window from here https://gist.github.com/seberg/3866040/revisions
Then break the array big, your starting array, into 2x2 blocks using this function.
Also generate an array which has indices of every element in big and break it similarly into 2x2 blocks.
Then generate a boolean mask where the 2x2 blocks of big are all zero, and use the index array to get those elements.
blks = rolling_window(big,window=(2,2)) # 2x2 blocks of original array
inds = np.indices(big.shape).transpose(1,2,0) # array of indices into big
blkinds = rolling_window(inds,window=(2,2,0)).transpose(0,1,4,3,2) # 2x2 blocks of indices into big
mask = blks == np.zeros((2,2)) # generate a mask of every 2x2 block which is all zero
mask = mask.reshape(*mask.shape[:-2],-1).all(-1) # still generating the mask
# now blks[mask] is every block which is zero..
# but you actually want the original indices in the array 'big' instead
inds = blkinds[mask].reshape(-1,2).T # indices into big where elements need replacing
big[inds[0],inds[1]] = 5 #reassign
You need to test this: I did not. But the idea is to break the array into blocks, and an array of indices into blocks, then develop a boolean condition on the blocks, use those to get the indices, and then reassign.
An alternative would be to iterate through blkinds as defined here, then test the 2x2 block obtained from big at each blkinds element and reassign if necessary.
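The same block idea can be sketched without the external gist, assuming NumPy 1.20 or newer, where numpy.lib.stride_tricks.sliding_window_view is available. This is a swapped-in technique, not the rolling_window function referenced above, and the names windows, all_zero, mask and result are illustrative. It marks every cell that belongs to at least one all-zero 2x2 window, including overlapping ones.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

big = np.array([
    [0, 1, 0, 0, 1, 0, 0, 1],
    [0, 1, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0],
    [1, 0, 0, 0, 1, 0, 0, 0],
])

# Every overlapping 2x2 window: shape (rows - 1, cols - 1, 2, 2).
windows = sliding_window_view(big, (2, 2))

# True where the window whose top-left corner is (r, c) contains only zeros.
all_zero = ~windows.any(axis=(2, 3))

# Spread each hit from its top-left corner to the four cells it covers.
mask = np.zeros(big.shape, dtype=bool)
mask[:-1, :-1] |= all_zero   # top-left cell of the window
mask[:-1, 1:] |= all_zero    # top-right cell
mask[1:, :-1] |= all_zero    # bottom-left cell
mask[1:, 1:] |= all_zero     # bottom-right cell

result = np.where(mask, 5, big)
print(result)
```

Because the windows are found on the unmodified array, earlier replacements cannot hide later all-zero blocks.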
This is my attempt to help you solve your problem. My solution may be subject to fair criticism.
import numpy as np
from itertools import product
m = np.array((
[0,1,0,0,1,0,0,1],
[0,1,0,0,0,0,0,0],
[0,1,0,0,1,0,0,0],
[0,0,0,0,1,0,0,0],
[1,0,0,0,1,0,0,0]))
h = 2
w = 2
rr, cc = tuple(d + 1 - q for d, q in zip(m.shape, (h, w)))
slices = [(slice(r, r + h), slice(c, c + w))
          for r, c in product(range(rr), range(cc))
          if not m[r:r + h, c:c + w].any()]
for s in slices:
    m[s] = 5
print(m)
[[0 1 5 5 1 5 5 1]
[0 1 5 5 0 5 5 5]
[0 1 5 5 1 5 5 5]
[0 5 5 5 1 5 5 5]
[1 5 5 5 1 5 5 5]]

stacking numpy arrays?

I am trying to stack arrays horizontally, using numpy hstack, but can't get it to work. Instead, it all comes out in one list, instead of a 'matrix-looking' 2D array.
import numpy as np
y = np.array([0,2,-6,4,1])
y_bool = y > 0
y_bool = [1 if l == True else 0 for l in y_bool] #convert to decimals for classification
y_range = range(0,len(y))
print y
print y_bool
print y_range
print np.hstack((y,y_bool,y_range))
Prints this:
[ 0 2 -6 4 1]
[0, 1, 0, 1, 1]
[0, 1, 2, 3, 4]
[ 0 2 -6 4 1 0 1 0 1 1 0 1 2 3 4]
How do I instead get the last line to look like this:
[0 0 0
2 1 1
-6 0 2
4 1 3]
If you want to create a 2D array, do:
print np.transpose(np.array((y, y_bool, y_range)))
# [[ 0 0 0]
# [ 2 1 1]
# [-6 0 2]
# [ 4 1 3]
# [ 1 1 4]]
Well, close enough. The h is for horizontal (column-wise). If you check its help, you will see under See Also:
vstack : Stack arrays in sequence vertically (row wise).
dstack : Stack arrays in sequence depth wise (along third axis).
concatenate : Join a sequence of arrays together.
Edit: My first thought was that vstack does it, but that would need np.vstack(...).T or np.dstack(...).squeeze(). Other than that, the "problem" is that the arrays are 1D and you want them to act like 2D, so you could do:
print np.hstack([np.asarray(a)[:,np.newaxis] for a in (y,y_bool,y_range)])
The np.asarray is there just in case one of the variables is a list. The np.newaxis makes each array 2D (a single column), which makes it clearer what happens when concatenating.
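For reference, np.column_stack treats 1D inputs as columns directly. The sketch below is written for Python 3 (print as a function), and replaces the list comprehension and range with astype(int) and np.arange as illustrative choices.

```python
import numpy as np

y = np.array([0, 2, -6, 4, 1])
y_bool = (y > 0).astype(int)   # 1 where positive, 0 otherwise
y_range = np.arange(len(y))

# Each 1D array becomes one column of the (5, 3) result.
stacked = np.column_stack((y, y_bool, y_range))
print(stacked)
```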
