Related
I have an array a of ones and zeroes (it might be rather big)
a = np.array([[1, 0, 0, 1, 0, 0],
[1, 1, 0, 0, 1, 0],
[0, 1, 1, 0, 0, 1],
[0, 0, 0, 1, 1, 1])
in which the "upper" rows are more "important" in the sense that if there is 1 in any column of the i-th row, then all ones in that columns in the following rows must be zeroed.
So, the desired output should be:
array([[1, 0, 0, 1, 0, 0],
[0, 1, 0, 0, 1, 0],
[0, 0, 1, 0, 0, 1],
[0, 0, 0, 0, 0, 0]])
In other words, there should only be single 1 per column.
I'm looking for a more numpy way to do this (i.e. minimising or, better, avoiding the loops).
Your array:
[[1, 0, 0, 1, 0, 0],
[1, 1, 0, 0, 1, 0],
[0, 1, 1, 0, 0, 1],
[0, 0, 0, 1, 1, 1]]
Transpose it with numpy:
a = np.transpose(your_array)
Now it looks like this:
[[1, 1, 0, 0],
[0, 1, 1, 0],
[0, 0, 1, 0],
[1, 0, 0, 1],
[0, 1, 0, 1],
[0, 0, 1, 1]]
Zero all the non-zero (and "not upper") elements row wise:
res = np.zeros(a.shape, dtype="int64")
idx = np.arange(res.shape[0])
args = a.astype(bool).argmax(1)
res[idx, args] = a[idx, args]
The output of res is this:
#### Output
[[1, 0, 0, 0],
[0, 1, 0, 0],
[0, 0, 1, 0],
[1, 0, 0, 0],
[0, 1, 0, 0],
[0, 0, 1, 0]]
Re-transpose your array:
a = np.transpose(res)
[[1, 0, 0, 1, 0, 0],
[0, 1, 0, 0, 1, 0],
[0, 0, 1, 0, 0, 1],
[0, 0, 0, 0, 0, 0]])
EDIT: Thanks to #The.B for the tip
An alternative solution is to do a forward fill followed by the cumulative sum and then replace all values which are not 1 with 0:
a = np.array([[1, 0, 0, 1, 0, 0],
[1, 1, 0, 0, 1, 0],
[0, 1, 1, 0, 0, 1],
[0, 0, 0, 1, 1, 1]])
ff = np.maximum.accumulate(a, axis=0)
cs = np.cumsum(ff, axis=0)
cs[cs > 1] = 0
Output in cs:
array([[1, 0, 0, 1, 0, 0],
[0, 1, 0, 0, 1, 0],
[0, 0, 1, 0, 0, 1],
[0, 0, 0, 0, 0, 0]])
EDIT
This will do the same thing and should be slightly more efficient:
ff = np.maximum.accumulate(a, axis=0)
ff ^ np.pad(ff, ((1,0), (0,0)))[:-1]
Output:
array([[1, 0, 0, 1, 0, 0],
[0, 1, 0, 0, 1, 0],
[0, 0, 1, 0, 0, 1],
[0, 0, 0, 0, 0, 0]])
And if you want to do the operations in-place to avoid temporary memory allocation:
out = np.zeros((a.shape[0]+1, a.shape[1]), dtype=a.dtype)
np.maximum.accumulate(a, axis=0, out=out[1:])
out[:-1] ^ out[1:]
Output:
array([[1, 0, 0, 1, 0, 0],
[0, 1, 0, 0, 1, 0],
[0, 0, 1, 0, 0, 1],
[0, 0, 0, 0, 0, 0]])
You can traverse through each column of array and check if it is the first one -
If Not: Make it 0
for col in a.T:
f=0
for x in col:
if(x==1 and f==0):
f=1
else:
x=0
I have initialized a numpy nd array like the following
arr = np.zeros((6, 6))
This empty array is passed as an input argument to a function,
def fun(arr):
arr.append(1) # this works for arr = [] initialization
return arr
for i in range(0,12):
fun(arr)
But append doesn't work for nd array. I want to fill up the elements of the nd array row-wise.
Is there any way to use a python scalar index for the nd array? I could increment this index every time fun is called and append elements to arr
Any suggestions?
In [523]: arr = np.zeros((6,6),int)
In [524]: arr
Out[524]:
array([[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]])
In [525]: arr[0] = 1
In [526]: arr
Out[526]:
array([[1, 1, 1, 1, 1, 1],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]])
In [527]: arr[1] = [1,2,3,4,5,6]
In [528]: arr[2,3:] = 2
In [529]: arr
Out[529]:
array([[1, 1, 1, 1, 1, 1],
[1, 2, 3, 4, 5, 6],
[0, 0, 0, 2, 2, 2],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]])
I have a tensor with three dimensions and three classes (0: background, 1: first class, 2: second class). I would like to find connected clusters and assign outlier's labels by performing a majority vote. A 2D example:
import numpy as np
data = np.array([[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 1, 2],
[1, 2, 0, 0, 2, 2, 2],
[0, 1, 0, 0, 0, 2, 0],
[0, 0, 0, 0, 0, 0, 0],])
should be changed to
data = np.array([[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 2, 2],
[1, 1, 0, 0, 2, 2, 2],
[0, 1, 0, 0, 0, 2, 0],
[0, 0, 0, 0, 0, 0, 0],])
It is enough to see connected regions as one cluster an count the appearence of the labels. I am not looking for any machine learning method.
You can use scipy.ndimage.measurements.label to find the connected components and then use np.bincount for the counting
from scipy.ndimage import measurements
lbl,ncl = measurements.label(data)
lut = np.bincount((data+2*lbl).ravel(),None,2*ncl+3)[1:].reshape(-1,2).argmax(1)+1
lut[0] = 0
lut[lbl]
# array([[0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0],
# [0, 1, 0, 0, 0, 0, 0],
# [1, 1, 1, 0, 0, 2, 2],
# [1, 1, 0, 0, 2, 2, 2],
# [0, 1, 0, 0, 0, 2, 0],
# [0, 0, 0, 0, 0, 0, 0]])
Trying to make my code more efficient and readable and i'm stuck. Assume I want to build something like a chess board, with alternating black and white colors on an 8x8 grid. So, using numpy, I have done this:
import numpy as np
board = np.zeros((8,8), np.int32)
for ri in range(8):
for ci in range(8):
if (ci + ri) % 2 == 0:
board[ri,ci] = 1
Which nicely outputs:
array([[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1]], dtype=int32)
That I can then parse as white squares or black squares. However, in practice my array is much larger, and this way is very inefficient and unreadable. I assumed numpy already has this figured out, so I tried this:
board = np.zeros(64, np.int32)
board[::2] = 1
board = board.reshape(8,8)
But that output is wrong, and looks like this:
array([[1, 0, 1, 0, 1, 0, 1, 0],
[1, 0, 1, 0, 1, 0, 1, 0],
[1, 0, 1, 0, 1, 0, 1, 0],
[1, 0, 1, 0, 1, 0, 1, 0],
[1, 0, 1, 0, 1, 0, 1, 0],
[1, 0, 1, 0, 1, 0, 1, 0],
[1, 0, 1, 0, 1, 0, 1, 0],
[1, 0, 1, 0, 1, 0, 1, 0]], dtype=int32)
Is there a better way to achieve what I want that works efficiently (and preferably, is readable)?
Note: i'm not attached to 1's and 0's, this can easily be done with other types of values, even True/False or strings of 2 kinds, as long as it works
Here's one approach using slicing with proper starts and stepsize of 2 in two steps -
board = np.zeros((8,8), np.int32)
board[::2,::2] = 1
board[1::2,1::2] = 1
Sample run -
In [229]: board = np.zeros((8,8), np.int32)
...: board[::2,::2] = 1
...: board[1::2,1::2] = 1
...:
In [230]: board
Out[230]:
array([[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1]], dtype=int32)
Other tricky ways -
1) Broadcasted comparison :
In [254]: r = np.arange(8)%2
In [255]: (r[:,None] == r)*1
Out[255]:
array([[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1]])
2) Broadcasted addition :
In [279]: r = np.arange(8)
In [280]: 1-(r[:,None] + r)%2
Out[280]:
array([[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1]])
Just found out an alternative answer by myself, so posting it here for future reference to anyone who's interested:
a = np.array([[1,0],[0,1]])
b = np.tile(a, (4,4))
Results:
array([[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0],
[0, 1, 0, 1, 0, 1, 0, 1]])
I think the following is also a good way of doing it for a variable input
import sys
lines = sys.stdin.readlines()
n = int(lines[0])
import numpy as np
a = np.array([[1,0], [0,1]],dtype=np.int)
outputData= np.tile(a,(n//2,n//2))
print(outputData)
You can achieve this for single even input number n
import numpy as np
i = np.eye(2)
i = i[::-1]
k = np.array(i, dtype = np.int)
print(np.tile(k,(n//2,n//2)))
I tried and found this to be shorter one for any giver number:
n = int(input())
import numpy as np
c = np.array([[0,1], [1, 0]])
print(np.tile(c, reps=(n//2, n//2)))
I'm trying to find an elegant way to find the max value in a two-dimensional array.
for example for this array:
[0, 0, 1, 0, 0, 1] [0, 1, 0, 2, 0, 0][0, 0, 2, 0, 0, 1][0, 1, 0, 3, 0, 0][0, 0, 0, 0, 4, 0]
I would like to extract the value '4'.
I thought of doing a max within max but I'm struggling in executing it.
Another way to solve this problem is by using function numpy.amax()
>>> import numpy as np
>>> arr = [0, 0, 1, 0, 0, 1] , [0, 1, 0, 2, 0, 0] , [0, 0, 2, 0, 0, 1] , [0, 1, 0, 3, 0, 0] , [0, 0, 0, 0, 4, 0]
>>> np.amax(arr)
Max of max numbers (map(max, numbers) yields 1, 2, 2, 3, 4):
>>> numbers = [0, 0, 1, 0, 0, 1], [0, 1, 0, 2, 0, 0], [0, 0, 2, 0, 0, 1], [0, 1, 0, 3, 0, 0], [0, 0, 0, 0, 4, 0]
>>> map(max, numbers)
<map object at 0x0000018E8FA237F0>
>>> list(map(max, numbers)) # max numbers from each sublist
[1, 2, 2, 3, 4]
>>> max(map(max, numbers)) # max of those max-numbers
4
Not quite as short as falsetru's answer but this is probably what you had in mind:
>>> numbers = [0, 0, 1, 0, 0, 1], [0, 1, 0, 2, 0, 0], [0, 0, 2, 0, 0, 1], [0, 1, 0, 3, 0, 0], [0, 0, 0, 0, 4, 0]
>>> max(max(x) for x in numbers)
4
How about this?
import numpy as np
numbers = np.array([[0, 0, 1, 0, 0, 1], [0, 1, 0, 2, 0, 0], [0, 0, 2, 0, 0, 1], [0, 1, 0, 3, 0, 0], [0, 0, 0, 0, 4, 0]])
print(numbers.max())
4
>>> numbers = [0, 0, 1, 0, 0, 1], [0, 1, 0, 2, 0, 0], [0, 0, 2, 0, 0, 1], [0, 1, 0, 3, 0, 0], [0, 0, 0, 0, 4, 0]
You may add key parameter to max as below to find Max value in a 2-D Array/List
>>> max(max(numbers, key=max))
4
One very easy solution to get both the index of your maximum and your maximum could be :
numbers = np.array([[0,0,1,0,0,1],[0,1,0,2,0,0],[0,0,2,0,0,1],[0,1,0,3,0,0],[0,0,0,0,4,0]])
ind = np.argwhere(numbers == numbers.max()) # In this case you can also get the index of your max
numbers[ind[0,0],ind[0,1]]
This approach is not as intuitive as others but here goes,
numbers = [0, 0, 1, 0, 0, 1], [0, 1, 0, 2, 0, 0], [0, 0, 2, 0, 0, 1], [0, 1, 0, 3, 0, 0], [0, 0, 0, 0, 4, 0]
maximum = -9999
for i in numbers:
maximum = max(maximum,max(i))
return maximum"