I have three puzzle pieces, each defined as a 7x7 array, in the following manner:
R3LRU = pd.DataFrame([
[1, 1, 1, 1, 1, 1, 1],
[1, 0, 0, 0, 0, 0, 1],
[1, 0, 0, 0, 0, 0, 1],
[1, 0, 0, 0, 0, 0, 1],
[1, 0, 0, 0, 0, 0, 1],
[1, 0, 0, 0, 0, 0, 1],
[1, 0, 0, 0, 0, 0, 1]
])
I am trying to join them by the following rules: 1111111 can be joined with 1000001, 1000001 can be joined with 1000001, but 1111111 cannot be joined with 1111111. A better illustration would be the following:
I have tried using the pd.concat function, but it just glues the pieces together instead of joining them by their sides, like this:
Or, in terms of code output, like this:
0 1 2 3 4 5 6 0 1 2 3 4 5 6 0 1 2 3 4 5 6
0 1 1 1 1 1 1 1 1 0 0 0 0 0 1 1 1 1 1 1 1 1
1 1 0 0 0 0 0 1 1 0 0 0 0 0 1 1 0 0 0 0 0 0
2 1 0 0 0 0 0 1 1 0 0 0 0 0 1 1 0 0 0 0 0 0
3 1 0 0 0 0 0 1 1 0 0 0 0 0 1 1 0 0 0 0 0 0
4 1 0 0 0 0 0 1 1 0 0 0 0 0 1 1 0 0 0 0 0 0
5 1 0 0 0 0 0 1 1 0 0 0 0 0 1 1 0 0 0 0 0 0
6 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
I suppose I would like to join on columns 6 and 0, or on rows 6 and 0.
How can I define "joining" sides, so that the pieces would join through the proposed rules?
I take it you want to concatenate two pieces if the last column of one matches the first column of the other, and then overlap the two. I don't think pandas is a good fit for this problem, since you only need the values, not column labels or any of the other features you would use pandas for.
I would recommend simple numpy arrays. Then you could do something like:
In [1]: import numpy as np
In [2]: R3LRU = np.array([
...: [1, 1, 1, 1, 1, 1, 1],
...: [1, 0, 0, 0, 0, 0, 1],
...: [1, 0, 0, 0, 0, 0, 1],
...: [1, 0, 0, 0, 0, 0, 1],
...: [1, 0, 0, 0, 0, 0, 1],
...: [1, 0, 0, 0, 0, 0, 1],
...: [1, 0, 0, 0, 0, 0, 1]
...: ])
In [3]: R3LRU
Out[3]:
array([[1, 1, 1, 1, 1, 1, 1],
[1, 0, 0, 0, 0, 0, 1],
[1, 0, 0, 0, 0, 0, 1],
[1, 0, 0, 0, 0, 0, 1],
[1, 0, 0, 0, 0, 0, 1],
[1, 0, 0, 0, 0, 0, 1],
[1, 0, 0, 0, 0, 0, 1]])
Get the last column of the first part and the first column of the second part
In [4]: R3LRU[:,0]
Out[4]: array([1, 1, 1, 1, 1, 1, 1])
In [5]: R3LRU[:,-1]
Out[5]: array([1, 1, 1, 1, 1, 1, 1])
Compare them
In [6]: R3LRU[:,0] == R3LRU[:,-1]
Out[6]: array([ True, True, True, True, True, True, True])
In [7]: np.all(R3LRU[:,0] == R3LRU[:,-1])
Out[7]: True
If they are equal, combine them
In [8]: if np.all(R3LRU[:,0] == R3LRU[:,-1]):
...:     combined = np.hstack([R3LRU[:,:-1], R3LRU])
In [9]: combined
Out[9]:
array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1],
[1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1],
[1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1],
[1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1],
[1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1],
[1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1]])
Maybe your rules are a bit more complicated than a simple == comparison, but you can just make that if statement more complicated to reflect all rules you have ;)
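As a rough sketch of what a more complicated rule could look like, the snippet below reads the question's rule as "two edges may touch unless both are solid walls (all ones)". That reading, the helper names can_join and join_columns, the second piece OPEN_LEFT, and the choice to keep the right piece's edge in the shared column are all assumptions made for illustration, not something stated in the question.

```python
import numpy as np

# Piece from the question: solid top, left and right edges, open bottom.
R3LRU = np.array([[1, 1, 1, 1, 1, 1, 1]] + [[1, 0, 0, 0, 0, 0, 1]] * 6)

# Hypothetical second piece whose left edge is the "open" pattern 1000001.
OPEN_LEFT = np.array(
    [[1, 1, 1, 1, 1, 1, 1]]
    + [[0, 0, 0, 0, 0, 0, 1]] * 5
    + [[1, 1, 1, 1, 1, 1, 1]]
)


def can_join(left_edge, right_edge):
    # Assumed rule: joining is forbidden only when both edges are solid walls.
    return not (np.all(left_edge == 1) and np.all(right_edge == 1))


def join_columns(left, right):
    """Overlap the last column of `left` with the first column of `right`.

    Returns None when the assumed rule forbids the join.
    """
    if not can_join(left[:, -1], right[:, 0]):
        return None
    # Keep the right piece's edge in the shared column (an arbitrary choice).
    return np.hstack([left[:, :-1], right])


print(join_columns(R3LRU, R3LRU))            # wall against wall: rejected
print(join_columns(R3LRU, OPEN_LEFT).shape)  # wall against open edge: joined
```

Joining by rows would follow the same pattern with left[-1, :], right[0, :] and np.vstack.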
Related
I have a list of numpy arrays (one-hot representation) like the example below. I want to count the number of occurrences of each one-hot code.
[0 0 1 0 0 0 0 0 0 0]
[0 0 1 0 0 0 0 0 0 0]
[0 1 0 0 0 0 0 0 0 0]
[0 0 0 0 0 1 0 0 0 0]
[0 1 0 0 0 0 0 0 0 0]
[0 0 0 0 1 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 1]
[0 0 0 0 1 0 0 0 0 0]
[1 0 0 0 0 0 0 0 0 0]
[0 0 0 1 0 0 0 0 0 0]
[0 1 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 1]
Edit :
Expected output :
[1 0 0 0 0 0 0 0 0 0] ==> 1 occurrence
[0 0 1 0 0 0 0 0 0 0] ==> 2 occurrences
[0 1 0 0 0 0 0 0 0 0] ==> 3 occurrences
[0 0 0 0 0 1 0 0 0 0] ==> 1 occurrence
[0 0 0 0 1 0 0 0 0 0] ==> 2 occurrences
[0 0 0 0 0 0 0 0 0 1] ==> 2 occurrences
I think you can get the result you seek:
[1 3 2 1 2 1 0 0 0 2]
indicating the count of occurrences of one hot in that position via a simple column-wise sum using ndarray.sum():
import numpy
data = numpy.array([
[0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
[0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
])
print(numpy.ndarray.sum(data, axis=0))
or more compactly as just:
print(data.sum(axis=0))
both should give you:
[1 3 2 1 2 1 0 0 0 2]
Using the fact that each row is one-hot, you can do the following:
import numpy as np
temp = np.array([[0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
[0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 1]])
converting the one-hot to indices can be done as follows:
temp2 = np.argmax(temp, axis=1) # array([2, 2, 1, 5, 1, 4, 9, 4, 0, 3, 1, 9])
and then the occurrences can be counted using np.histogram. We know that you have 10 possible values, so we use 10 bins as follows:
temp3 = np.histogram(temp2, bins=10, range=(-0.5,9.5))
np.histogram returns a tuple where index [0] holds the histogram values and index [1] holds the bin edges. In your case:
(array([1, 3, 2, 1, 2, 1, 0, 0, 0, 2]),
array([-0.5, 0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5]))
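For integer labels, np.bincount is a possible shortcut that avoids choosing bin edges. The sketch below reuses the data from the question; the variable names are illustrative, and printing a line per code is just one way to get close to the "expected output" format.

```python
import numpy as np

data = np.array([
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
])

# Position of the single 1 in every row.
labels = np.argmax(data, axis=1)

# One counter per possible position; minlength keeps unused positions as 0.
counts = np.bincount(labels, minlength=data.shape[1])

# Report only the codes that actually occur.
identity = np.eye(data.shape[1], dtype=int)
for position in np.flatnonzero(counts):
    print(identity[position], "==>", counts[position], "occurrence(s)")
```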
I want to make a matrix like the one below using numpy:
matrix_example = [[1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 0, 0, 0, 0, 0, 0, 0, 1],
[1, 0, 1, 1, 1, 1, 1, 0, 1],
[1, 0, 1, 0, 0, 0, 1, 0, 1],
[1, 0, 1, 0, 1, 0, 1, 0, 1],
[1, 0, 1, 0, 0, 0, 1, 0, 1],
[1, 0, 1, 1, 1, 1, 1, 0, 1],
[1, 0, 0, 0, 0, 0, 0, 0, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1]]
My idea was to use np.where, but it doesn't work well.
I would like a hint on how to generate a matrix like that.
My second idea is:
make a 9 by 9 matrix filled with zeros using numpy.zeros([9, 9])
change 0 to 1 where the index includes 0, 2 or 4.
a2D = np.array([[1, 1, 1, 1, 1, 1, 1, 1, 1],[1, 0, 0, 0, 0, 0, 0, 0, 1],[1, 0, 1, 1, 1, 1, 1, 0, 1],[1, 0, 1, 0, 0, 0, 1, 0, 1],[1, 0, 1, 0, 1, 0, 1, 0, 1],[1, 0, 1, 0, 0, 0, 1, 0, 1],[1, 0, 1, 1, 1, 1, 1, 0, 1],[1, 0, 0, 0, 0, 0, 0, 0, 1],[1, 1, 1, 1, 1, 1, 1, 1, 1]])
try this
you can use np.ones and np.zeros to do it like:
first_mat = np.ones([9, 9])
second_mat = np.zeros([7, 7])
third_mat = np.ones([5, 5])
forth_mat = np.zeros([3, 3])
first_mat[1:-1, 1:-1] = second_mat
first_mat[2:-2, 2:-2] = third_mat
first_mat[3:-3, 3:-3] = forth_mat
first_mat[4:-4, 4:-4] = 1
This will give you your output. It may not be the easiest way, but I hope it helps; first_mat is the matrix you need.
There's already an np.matrix function that does what you probably want.
For your example, it should be as easy as
my_matrix = np.matrix(matrix_example)
Have a look at the official documentation for further info :)
https://numpy.org/doc/stable/reference/generated/numpy.matrix.html
Inspired by Mohamed Yahya's answer and generalizing it to any number of "squares":
import numpy as np
def cool_matrix(squares):
    # The outermost square has side length 2 * squares - 1.
    # np.int was removed in NumPy 1.24; the builtin int is used instead.
    final_matrix = np.zeros((squares * 2 - 1, squares * 2 - 1), dtype=int)
    for square in range(squares, 0, -1):
        square_dimensions = (square * 2 - 1, square * 2 - 1)
        # Alternate between zero-filled and one-filled squares.
        if square % 2 == 0:
            curr_square = np.zeros(square_dimensions, dtype=int)
        else:
            curr_square = np.ones(square_dimensions, dtype=int)
        # Centre the current square inside the final matrix.
        offset = squares + square - 1
        final_matrix[-offset:offset, -offset:offset] = curr_square
    return final_matrix
print(cool_matrix(5))
output is:
[[1 1 1 1 1 1 1 1 1]
[1 0 0 0 0 0 0 0 1]
[1 0 1 1 1 1 1 0 1]
[1 0 1 0 0 0 1 0 1]
[1 0 1 0 1 0 1 0 1]
[1 0 1 0 0 0 1 0 1]
[1 0 1 1 1 1 1 0 1]
[1 0 0 0 0 0 0 0 1]
[1 1 1 1 1 1 1 1 1]]
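A loop-free variant, offered as a sketch rather than as any answerer's method: every cell belongs to a "ring" given by its Chebyshev distance from the centre, and the pattern is simply whether that distance is even. The function name concentric_squares is hypothetical.

```python
import numpy as np


def concentric_squares(squares):
    """Return a (2*squares-1) x (2*squares-1) array of alternating rings."""
    size = 2 * squares - 1
    center = squares - 1
    rows, cols = np.indices((size, size))
    # Chebyshev distance from the centre: 0 for the centre cell,
    # 1 for the ring around it, and so on.
    ring = np.maximum(np.abs(rows - center), np.abs(cols - center))
    # Even rings (including the centre) become 1, odd rings become 0.
    return (ring % 2 == 0).astype(int)


print(concentric_squares(5))
```

With this definition the centre cell is always 1, so the border is 1 when squares is odd and 0 when it is even, which matches cool_matrix.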
Excluding the boundary zero values, is it possible to group the coordinates (as tuples) of remaining zero values into different lists in this numpy array?
[[ 0 0 0 0 0 0 0 0 0 0 0]
[ 0 1 1 1 0 0 0 1 10 2 0]
[ 0 2 10 2 1 0 0 1 2 10 0]
[ 0 10 3 10 1 0 0 0 1 1 0]
[ 0 1 2 1 1 0 0 0 0 0 0]
[ 0 1 2 1 2 2 2 1 0 0 0]
[ 0 10 2 10 2 10 10 1 0 0 0]
[ 0 1 2 1 2 2 2 1 1 1 0]
[ 0 0 0 0 0 0 0 0 1 10 0]
[ 0 0 0 0 0 0 0 0 1 1 0]
[ 0 0 0 0 0 0 0 0 0 0 0]]
For example, in the above grid there are two 'groups' of zeros, one in the lower left corner and the other in the upper right corner. Can these be put into separate lists for every such matrix generated? Below is the code for creating the matrix 'sol_mat':
import numpy as np
import random
bomb_mat = np.zeros((11, 11), dtype=int)
for i in range(10):
    a = random.randint(1, 9)
    b = random.randint(1, 9)
    bomb_mat[a, b] = 1
sol_mat = np.zeros((11, 11), dtype=int)
for j in range(1, 10):
    for k in range(1, 10):  # the original used an undefined y; y = 11 is assumed
        if bomb_mat[j, k] == 1:
            sol_mat[j, k] = 10  # 10 marks a bomb
        else:
            # count the bombs in the eight neighbouring cells
            sol_mat[j, k] = (bomb_mat[j-1, k-1] + bomb_mat[j, k-1] + bomb_mat[j+1, k-1]
                             + bomb_mat[j-1, k] + bomb_mat[j+1, k]
                             + bomb_mat[j-1, k+1] + bomb_mat[j, k+1] + bomb_mat[j+1, k+1])
Trying to create minesweeper
I made some adjustments to your code. Mainly I tried to avoid for loops, and I used SciPy's convolve2d() to create sol_mat. The main advantage of this method is that you don't have to worry about the edge cases of the image. Using a 3x3 kernel of ones on the boolean array of bombs gives you exactly the number of neighbouring bombs (the flags in minesweeper).
import numpy as np
from scipy.signal import convolve2d
grid_size = (7, 7)
n_bombs = 5
bomb_mat = np.zeros(grid_size, dtype=int)
bomb_mat[np.random.randint(low=1, high=grid_size[0]-1, size=n_bombs),
np.random.randint(low=1, high=grid_size[1]-1, size=n_bombs)] = 1
# array([[0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 1, 0, 0],
# [0, 0, 1, 0, 0, 0, 0],
# [0, 1, 1, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0],
# [0, 0, 1, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0]])
sol_mat = convolve2d(bomb_mat, np.ones((3, 3)), mode='same').astype(int)
sol_mat[bomb_mat.astype(bool)] = 10
# array([[ 0, 0, 0, 1, 1, 1, 0],
# [ 0, 1, 1, 2, 10, 1, 0],
# [ 1, 3, 10, 3, 1, 1, 0],
# [ 1, 10, 10, 2, 0, 0, 0],
# [ 1, 3, 3, 2, 0, 0, 0],
# [ 0, 1, 10, 1, 0, 0, 0],
# [ 0, 1, 1, 1, 0, 0, 0]])
You can use np.tril() and np.triu() to get the lower and upper triangle of an array. By building the intersection of boolean triangles with the condition sol_mat == 0 you get the wanted indices:
lower0 = np.logical_and(np.tril(np.ones(grid_size)), sol_mat == 0)
# lower0.astype(int)
# array([[1, 0, 0, 0, 0, 0, 0],
# [1, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 1, 0, 0],
# [1, 0, 0, 0, 1, 1, 0],
# [1, 0, 0, 0, 1, 1, 1]])
upper0 = np.logical_and(np.triu(np.ones(grid_size)), sol_mat == 0)
# upper0.astype(int)
# array([[1, 1, 1, 0, 0, 0, 1],
# [0, 0, 0, 0, 0, 0, 1],
# [0, 0, 0, 0, 0, 0, 1],
# [0, 0, 0, 0, 1, 1, 1],
# [0, 0, 0, 0, 1, 1, 1],
# [0, 0, 0, 0, 0, 1, 1],
# [0, 0, 0, 0, 0, 0, 1]])
You can get the indices of these arrays via np.nonzero():
lower0_idx = np.array(np.nonzero(lower0))
# array([[0, 1, 4, 5, 5, 5, 6, 6, 6, 6],
# [0, 0, 4, 0, 4, 5, 0, 4, 5, 6]])
upper0_idx = np.array(np.nonzero(upper0))
# array([[0, 0, 0, 0, 1, 2, 3, 3, 3, 4, 4, 4, 5, 5, 6],
# [0, 1, 2, 6, 6, 6, 4, 5, 6, 4, 5, 6, 5, 6, 6]])
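The triangle split above depends on where the zeros happen to lie. A more general sketch, which is not part of the answer above, is a flood fill that collects connected zero cells while ignoring the boundary ring. The function name group_interior_zeros, the toy grid, and the choice of 8-connectivity (diagonal neighbours count, as in minesweeper) are assumptions made for illustration.

```python
from collections import deque

import numpy as np


def group_interior_zeros(grid):
    """Return a list of groups; each group is a list of (row, col) tuples."""
    n_rows, n_cols = grid.shape
    seen = np.zeros(grid.shape, dtype=bool)
    groups = []
    # Only interior cells are visited, so boundary zeros never join a group.
    for r in range(1, n_rows - 1):
        for c in range(1, n_cols - 1):
            if grid[r, c] != 0 or seen[r, c]:
                continue
            seen[r, c] = True
            queue = deque([(r, c)])
            group = []
            while queue:
                cr, cc = queue.popleft()
                group.append((cr, cc))
                # Visit the eight neighbours (8-connectivity is an assumption).
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        inside = 1 <= nr < n_rows - 1 and 1 <= nc < n_cols - 1
                        if inside and grid[nr, nc] == 0 and not seen[nr, nc]:
                            seen[nr, nc] = True
                            queue.append((nr, nc))
            groups.append(group)
    return groups


# Toy grid: a column of non-zero cells separates two interior zero regions.
toy = np.array([
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
])
groups = group_interior_zeros(toy)
for group in groups:
    print(sorted(group))
```

If SciPy is already a dependency, a connected-component labelling routine would likely do the same job with less code; the pure-Python loop is used here only to keep the sketch self-contained.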
I have a pandas DataFrame with multiple columns.
2u 2s 4r 4n 4m 7h 7v
0 1 1 0 0 0 1
0 1 0 1 0 0 1
1 0 0 1 0 1 0
1 0 0 0 1 1 0
1 0 1 0 0 1 0
0 1 1 0 0 0 1
What I want to do is convert this pandas.DataFrame into a list like the following:
X = [
[0, 0, 1, 1, 1, 0],
[1, 1, 0, 0, 0, 1],
[1, 0, 0, 0, 1, 1],
[0, 1, 1, 0, 0, 0],
[0, 0, 0, 1, 0, 0],
[0, 0, 1, 1, 1, 0],
[1, 1, 0, 0, 0, 1]
]
2u 2s 4r 4n 4m 7h 7v are column headings. They will change in different situations, so don't worry about them.
It looks like a transposed matrix:
df.values.T.tolist()
[list(l) for l in zip(*df.values)]
[[0, 0, 1, 1, 1, 0],
[1, 1, 0, 0, 0, 1],
[1, 0, 0, 0, 1, 1],
[0, 1, 1, 0, 0, 0],
[0, 0, 0, 1, 0, 0],
[0, 0, 1, 1, 1, 0],
[1, 1, 0, 0, 0, 1]]
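For completeness, here is a small self-contained sketch that rebuilds the DataFrame from the question and applies the transpose. It uses df.to_numpy(), which behaves like df.values here; the variable names are illustrative.

```python
import pandas as pd

# DataFrame from the question; the column labels are arbitrary.
df = pd.DataFrame(
    [
        [0, 1, 1, 0, 0, 0, 1],
        [0, 1, 0, 1, 0, 0, 1],
        [1, 0, 0, 1, 0, 1, 0],
        [1, 0, 0, 0, 1, 1, 0],
        [1, 0, 1, 0, 0, 1, 0],
        [0, 1, 1, 0, 0, 0, 1],
    ],
    columns=["2u", "2s", "4r", "4n", "4m", "7h", "7v"],
)

# Transpose so that every inner list holds one column of the DataFrame.
X = df.to_numpy().T.tolist()
print(X)
```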
To convert a DataFrame into a list, use the tolist() function.
Say I have a DataFrame df; to turn it into a list of rows you can simply use:
df.values.tolist()
You can also convert a particular column into a list by using
df['column name'].values.tolist()
I have an issue with numpy that I can't solve.
I have 3D arrays (x,y,z) filled with 0 and 1.
For instance, one slice in the z axis :
array([[1, 0, 1, 0, 1, 1, 0, 0],
[0, 0, 1, 1, 0, 1, 1, 0],
[1, 0, 1, 1, 0, 0, 0, 1],
[0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 0, 1, 0, 0, 1],
[1, 0, 0, 0, 0, 1, 0, 1],
[0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 1, 0, 1, 1, 0, 1]])
And I want this result :
array([[1, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 1, 1, 1, 1, 1, 0],
[1, 1, 1, 1, 1, 1, 1, 1],
[0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1],
[0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 1, 1, 1, 1, 1, 1]])
That is, for each z slice I want to scan every line along the x axis from both ends, left to right and right to left, and fill the line with ones between the first 1 found from each side.
Is there an efficient way to compute that ?
Thanks a lot.
Nico !
Accessing NumPy array elements one by one is not very efficient. You may do better with just plain Python lists. They also have an index method which can search for the first entry of the value in the list.
from numpy import *
a = array([[1, 0, 1, 0, 1, 1, 0, 0],
[0, 0, 1, 1, 0, 1, 1, 0],
[1, 0, 1, 1, 0, 0, 0, 1],
[0, 1, 0, 0, 1, 0, 1, 0],
[1, 1, 1, 0, 1, 0, 0, 1],
[1, 0, 0, 0, 0, 1, 0, 1],
[0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 1, 1, 0, 1]])
def idx_front(ln):
    try:
        return list(ln).index(1)
    except ValueError:
        return len(ln)  # an index beyond line end
def idx_back(ln):
    try:
        return len(ln) - list(reversed(ln)).index(1) - 1
    except ValueError:
        return len(ln)  # an index beyond line end
ranges = [(idx_front(ln), idx_back(ln)) for ln in a]
for ln, (lo, hi) in zip(a, ranges):
    ln[lo:hi] = 1  # attention: destructive update in-place
print("ranges =", ranges)
print(a)
Output:
ranges = [(0, 5), (2, 6), (0, 7), (1, 6), (0, 7), (0, 7), (4, 4), (8, 8), (2, 7)]
[[1 1 1 1 1 1 0 0]
[0 0 1 1 1 1 1 0]
[1 1 1 1 1 1 1 1]
[0 1 1 1 1 1 1 0]
[1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1]
[0 0 0 0 1 0 0 0]
[0 0 0 0 0 0 0 0]
[0 0 1 1 1 1 1 1]]
Actually, this is a basic binary image morphology operation.
You can do it in one step for the entire 3D array using scipy.ndimage.binary_fill_holes (older SciPy versions also exposed it as scipy.ndimage.morphology.binary_fill_holes).
You just need a slightly different structuring element. In a nutshell, you want a structuring element that looks like this for the 2D case:
[[0, 0, 0],
[1, 1, 1],
[0, 0, 0]]
Here's a quick example:
import numpy as np
import scipy.ndimage as ndimage
a = np.array( [[1, 0, 1, 0, 1, 1, 0, 0],
[0, 0, 1, 1, 0, 1, 1, 0],
[1, 0, 1, 1, 0, 0, 0, 1],
[0, 1, 0, 0, 1, 0, 1, 0],
[1, 1, 1, 0, 1, 0, 0, 1],
[1, 0, 0, 0, 0, 1, 0, 1],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 1, 0, 1, 1, 0, 1]])
structure = np.zeros((3,3), dtype=int)
structure[1,:] = 1
filled = ndimage.binary_fill_holes(a, structure)
print(filled.astype(int))
This yields:
[[1 1 1 1 1 1 0 0]
[0 0 1 1 1 1 1 0]
[1 1 1 1 1 1 1 1]
[0 1 1 1 1 1 1 0]
[1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1]
[0 0 0 0 0 0 0 0]
[0 0 0 0 1 0 0 0]
[0 0 1 1 1 1 1 1]]
The real advantage to this (Other than speed... It will be much faster and more memory efficient than using lists!) is that it will work just as well for 3D, 4D, 5D, etc arrays.
We just need to adjust the structuring element to match the number of dimensions.
import numpy as np
import scipy.ndimage as ndimage
# Generate some random 3D data to match what we want...
x = (np.random.random((10,10,20)) + 0.5).astype(int)
# Make the structure (I'm assuming that "z" is the _last_ dimension!)
structure = np.zeros((3,3,3))
structure[1,:,1] = 1
filled = ndimage.binary_fill_holes(x, structure)
print(x[:,:,5])
print(filled[:,:,5].astype(int))
Here's a slice from the random input 3D array:
[[1 0 1 0 1 1 0 1 0 0]
[1 0 1 1 0 1 0 1 0 0]
[1 0 0 1 0 1 1 1 1 0]
[0 0 0 1 1 0 1 0 0 0]
[1 0 1 0 1 0 0 1 1 0]
[1 0 1 1 0 1 0 0 0 1]
[0 1 0 1 0 0 1 0 1 0]
[0 1 1 0 1 0 0 0 0 1]
[0 0 0 1 1 1 1 1 0 1]
[1 0 1 1 1 1 0 0 0 1]]
And here's the filled version:
[[1 1 1 1 1 1 1 1 0 0]
[1 1 1 1 1 1 1 1 0 0]
[1 1 1 1 1 1 1 1 1 0]
[0 0 0 1 1 1 1 0 0 0]
[1 1 1 1 1 1 1 1 1 0]
[1 1 1 1 1 1 1 1 1 1]
[0 1 1 1 1 1 1 1 1 0]
[0 1 1 1 1 1 1 1 1 1]
[0 0 0 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1 1 1]]
The key difference here is that we did this for every slice of the entire 3D array in one step.
After a moment's thought, following your description and the corner case with all-zero rows, this is still quite straightforward with numpy, like:
In []: A
Out[]:
array([[1, 0, 1, 0, 1, 1, 0, 0],
[0, 0, 1, 1, 0, 1, 1, 0],
[1, 0, 1, 1, 0, 0, 0, 1],
[0, 1, 0, 0, 1, 0, 1, 0],
[1, 1, 1, 0, 1, 0, 0, 1],
[1, 0, 0, 0, 0, 1, 0, 1],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 1, 0, 1, 1, 0, 1]])
In []: v= 0< A.sum(1) # work only with rows at least one 1
In []: A_v= A[v, :]
In []: (r, s), a= A_v.nonzero(), arange(v.sum())
In []: se= c_[searchsorted(r, a), searchsorted(r, a, side= 'right')- 1]
In []: for k in a: A_v[k, s[se[k, 0]]: s[se[k, 1]]]= 1
..:
In []: A[v, :]= A_v
In []: A
Out[]:
array([[1, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 1, 1, 1, 1, 1, 0],
[1, 1, 1, 1, 1, 1, 1, 1],
[0, 1, 1, 1, 1, 1, 1, 0],
[1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 1, 1, 1, 1, 1, 1]])
Update:
After some more tinkering, here is a more 'pythonic' implementation that is much simpler than the one above. The following lines:
for k in range(A.shape[0]):
    m = A[k].nonzero()[0]
    try:
        A[k, m[0]:m[-1]] = 1
    except IndexError:
        continue
are quite straightforward, and they should perform very well.
I can't think of a more efficient way than what you describe:
For every line:
1. Scan the line from the left until you find a 1.
2. If no 1 is found, continue with the next line.
3. Otherwise, scan from the right to find the last 1 in the line.
4. Fill everything in the current line between the positions from 1. and 3. with 1s.
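The steps above can also be expressed without an explicit loop, as a sketch: a cell ends up as 1 exactly when a 1 has been seen at or before it scanning from the left and at or after it scanning from the right. The function name fill_between_ones and the axis argument are assumptions; which axis corresponds to "x" depends on how the 3D array is laid out.

```python
import numpy as np


def fill_between_ones(a, axis=-1):
    """Fill each line along `axis` with ones between its first and last 1."""
    mask = np.asarray(a).astype(bool)
    # True from the first 1 onwards when scanning in the forward direction.
    seen_from_left = np.logical_or.accumulate(mask, axis=axis)
    # Same scan on the reversed lines, flipped back afterwards.
    seen_from_right = np.flip(
        np.logical_or.accumulate(np.flip(mask, axis=axis), axis=axis),
        axis=axis,
    )
    return (seen_from_left & seen_from_right).astype(int)


# The z slice from the question.
z_slice = np.array([
    [1, 0, 1, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 1, 1, 0],
    [1, 0, 1, 1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 0, 0, 1],
    [1, 0, 0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 1, 1, 0, 1],
])
filled = fill_between_ones(z_slice)
print(filled)
```

Because the scan runs along a chosen axis, the same call should work on the full 3D array, for example fill_between_ones(volume, axis=1) if the lines to fill run along the second dimension.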