Problem Statement
I am trying to write a function that would sparsify a matrix given a target sparsity and an argument called block_shape which defines the minimum size of zeros block in the matrix. The target doesn't have to be met perfectly, but as close as possible.
For example, given the following arguments,
>>> matrix = [
[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1]
]
>>> target = 0.5
>>> block_shape = (2, 2)
valid outputs of 50% sparsity could be
>>> sparse_matrix = sparsify(matrix, target, block_shape)
>>> sparse_matrix
[
[1, 1, 0, 0],
[1, 1, 0, 0],
[0, 0, 1, 1],
[0, 0, 1, 1]
]
>>> sparse_matrix = sparsify(matrix, target, block_shape)
>>> sparse_matrix
[
[1, 0, 0, 1],
[1, 0, 0, 1],
[0, 0, 1, 1],
[0, 0, 1, 1]
]
Note that there could be multiple valid sparsified versions of the input. The only criterion is to get as close to the target as possible. One of the constraints is that only zeros forming blocks of shape block_shape are considered sparse.
For example, the matrix below has a sparsity level of 0%, given the arguments
>>> sparse_matrix = sparsify(matrix, target, block_shape)
>>> sparse_matrix
[
[1, 0, 0, 1],
[1, 1, 0, 0],
[0, 1, 1, 1],
[0, 0, 0, 0]
]
What I have so far
Currently, I have the following piece of code
import numpy as np
def sparsify(matrix, target, block_shape=None):
    if block_shape is None or block_shape == 1 or block_shape == (1,) or block_shape == (1, 1):
        # 1x1 is just bernoulli with p=target
        probs = np.random.uniform(size=matrix.shape)
        mask = np.zeros(matrix.shape)
        mask[probs >= target] = 1.0
    else:
        if isinstance(block_shape, int):
            block_shape = (block_shape, block_shape)
        if len(block_shape) == 1:
            block_shape = (block_shape[0], block_shape[0])
        mask = np.ones(matrix.shape)
        rows, cols = matrix.shape
        for row in range(rows):
            for col in range(cols):
                submask = mask[row:row+block_shape[0], col:col+block_shape[1]]
                if submask.shape != block_shape:
                    # we don't care about the edges, cannot partially sparsify
                    continue
                if (submask == 0).any():
                    # If current (row, col) is already in the sparsified area, skip
                    continue
                prob = np.random.random()
                if prob < target:
                    submask[:, :] = np.zeros(submask.shape)
    return matrix * mask, mask
The problem with the code above is that it does not match the target if the block size is not (1, 1)
>>> matrix = np.random.randn(100, 100)
>>> matrix, mask = sparsify(matrix, target=0.5, block_shape=(2, 2))
>>> print((matrix == 0).mean())
0.73
>>> print((mask == 0).mean())
0.73
Reason for discrepancy (I think)
I am not sure why I am not getting the target I expect, but I think it has something to do with the fact that I check the probability of every element, instead of the block as a whole. However, I have skipping conditions in my code, so I thought that should cover it
Edits
Edit 1 -- additional examples
Just giving some more examples.
Example 1: Given different block size
>>> sparse_matrix = sparsify(matrix, 0.25, (3, 3))
>>> sparse_matrix
[
[0, 0, 0, 1],
[0, 0, 0, 1],
[0, 0, 0, 1],
[1, 1, 1, 1]
]
The example above is a valid sparse matrix, although the level of sparsity is not 25%, another valid result could be a matrix of all 1's.
Example 2: Given a different block size and target
>>> sparse_matrix = sparsify(matrix, 0.6, (1, 2))
>>> sparse_matrix
[
[0, 0, 0, 0],
[1, 0, 0, 1],
[0, 0, 1, 1],
[1, 1, 0, 0]
]
Notice that all zeroes can be put in blocks of (1, 2), and the sparsity level = 60%
Edit 2 -- forgot a constraint
Another constraint that I forgot to mention, but tried incorporating into my code is that the zero blocks must be non-overlapping.
Example 1: The result below is NOT valid
>>> sparse_matrix = sparsify(matrix, 0.5, (2, 2))
>>> sparse_matrix
[
[0, 0, 1, 1],
[0, 0, 0, 1],
[1, 0, 0, 1],
[1, 1, 1, 1]
]
Although the blocks starting at index (0, 0) and (1, 1) have valid zero-shapes, the result does not meet the requirements. The reason is that only one of those blocks can be considered valid. If we label the zero blocks as z0 and z1, here is what this matrix is:
[
[z0, z0, 1, 1],
[z0, z0, z1, 1],
[ 1, z1, z1, 1],
[ 1, 1, 1, 1]
]
The element at (1, 1) can be treated as belonging to either z0 or z1, but not both. That means there is only one sparse block, which puts the level of sparsity at 25% (not ~44%).
The probability of becoming 0 is not all equal.
For example, with block_shape (2, 2), matrix(0, 0) becomes 0 with probability exactly target, since the loop only passes through it once. matrix(1, 0) has a probability higher than target since the loop passes it twice. Similarly, matrix(1, 1) has a higher probability than (1, 0) because the loop sees it four times, at (0, 0), (1, 0), (0, 1), (1, 1).
This also happens in the middle of the matrix due to prior loop operations.
So the main variable affecting the result is the block_shape.
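To make the uneven probabilities concrete, here is a small simulation sketch. It reimplements only the scanning loop from the question (the helper name greedy_block_mask, the 6x6 size, and the fixed seed are illustrative assumptions) and estimates how often each cell ends up zero.

```python
import numpy as np

def greedy_block_mask(shape, target, block_shape, rng):
    """Row-major scan from the question: zero a block with probability `target`
    unless it would overlap an already zeroed cell."""
    mask = np.ones(shape)
    bh, bw = block_shape
    # Only visit positions where a full block fits (edges are skipped).
    for row in range(shape[0] - bh + 1):
        for col in range(shape[1] - bw + 1):
            sub = mask[row:row + bh, col:col + bw]  # view into mask
            if (sub == 0).any():
                continue
            if rng.random() < target:
                sub[:] = 0
    return mask

rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability
trials = 2000
# Fraction of trials in which each cell was zeroed.
zero_freq = sum(greedy_block_mask((6, 6), 0.5, (2, 2), rng) == 0
                for _ in range(trials)) / trials
print(zero_freq.round(2))
```

Cell (0, 0) can only be zeroed by the very first block, so its frequency should sit near target, while (0, 1) gets a second chance from the block starting at (0, 1) and should land noticeably higher.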
I've been fiddling around for a bit and here's an alternative way using a while loop instead of a for loop. It simulates until you reach the target probability within err. You just need to watch out for an infinite loop if err is too small.
import numpy as np
def sparsify(matrix, target, block_shape=None):
    if block_shape is None or block_shape == 1 or block_shape == (1,) or block_shape == (1, 1):
        # 1x1 is just bernoulli with p=target
        probs = np.random.uniform(size=matrix.shape)
        mask = np.zeros(matrix.shape)
        mask[probs >= target] = 1.0
    else:
        if isinstance(block_shape, int):
            block_shape = (block_shape, block_shape)
        if len(block_shape) == 1:
            block_shape = (block_shape[0], block_shape[0])
        mask = np.ones(matrix.shape)
        rows, cols = matrix.shape
        # vars for probability check
        total = float(rows * cols)
        zero_cnt = total - np.count_nonzero(matrix)
        err = 0.005 # .5%
        # simulate until we reach target probability range
        while not target - err < (zero_cnt / total) < target + err:
            # pick a random point in the matrix
            row = np.random.randint(rows)
            col = np.random.randint(cols)
            # submask = mask[row:row + block_shape[0], col:col + block_shape[1]]
            submask = matrix[row:row + block_shape[0], col:col + block_shape[1]]
            if submask.shape != block_shape:
                # we don't care about the edges, cannot partially sparsify
                continue
            if (submask == 0).any():
                # If current (row, col) is already in the sparsified area, skip
                continue
            # need more 0s to reach target probability range
            if zero_cnt / total < target - err:
                matrix[row:row + block_shape[0], col:col + block_shape[1]] = 0
            # need more 1s to reach target probability range
            else:
                matrix[row:row + block_shape[0], col:col + block_shape[1]] = 1
            # update 0 count
            zero_cnt = total - np.count_nonzero(matrix)
    return matrix * mask, mask
note.
Didn't check for any optimization or code refactoring.
Didn't use the mask var. Worked on the matrix directly.
matrix = np.ones((100, 100))
matrix, mask = sparsify(matrix, target=0.5, block_shape=(2, 2))
print((matrix == 0).mean())
# prints somewhere between target - err and target + err
# likely to see a lower value in the range since we're counting up (0s)
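A different way to land on the target without a retry loop is to restrict zero blocks to a fixed grid and choose exactly as many grid blocks as the target calls for. This is only a sketch under that extra assumption (the question does not require grid alignment), and the names sparsify_grid and rng are made up for illustration.

```python
import numpy as np

def sparsify_grid(matrix, target, block_shape, rng=None):
    """Zero out randomly chosen, grid-aligned blocks so the zero fraction
    is as close to `target` as whole blocks allow."""
    rng = np.random.default_rng() if rng is None else rng
    matrix = np.asarray(matrix)
    rows, cols = matrix.shape
    bh, bw = block_shape
    n_r, n_c = rows // bh, cols // bw      # whole blocks per axis
    n_blocks = n_r * n_c
    # Number of blocks whose combined area is closest to target * total area.
    k = int(round(target * rows * cols / (bh * bw)))
    k = max(0, min(k, n_blocks))
    mask = np.ones((rows, cols))
    # Grid blocks never overlap, so any k distinct blocks are valid.
    for idx in rng.permutation(n_blocks)[:k]:
        r, c = divmod(int(idx), n_c)
        mask[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw] = 0
    return matrix * mask, mask

matrix, mask = sparsify_grid(np.ones((100, 100)), 0.5, (2, 2))
print((mask == 0).mean())
```

The trade-off is that patterns with blocks at odd offsets (like the second example in the question) can never be produced, in exchange for a deterministic zero count.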
Related
I'm trying to automate a trading strategy which should enter/exit a long position when the current price is the minimum/maximum among the previous k prices.
The result should contain 1 if the current number is maximum among previous k numbers, -1 if it is the minimum and 0 if none of the conditions are true.
For example, if k = 3 and the NumPy array is [1, 2, 3, 2, 1, 6], the result should be an array like:
[0, 0, 1, 0, -1, 1].
I tried NumPy's max function, but I don't know how to take into account the previous k numbers instead of a fixed index, or how to fall back to the default for the first k - 1 numbers, which should be 0 since there are not yet k numbers available to compare them with.
I will use Pandas
import pandas as pd
array = [1, 2, 3, 2, 1, 6]
df = pd.DataFrame(array)
df['rolling_max'] = df[0].rolling(3).max()
df['rolling_min'] = df[0].rolling(3).min()
df['result'] = df.apply(lambda row: 1 if row[0] == row['rolling_max'] else (-1 if row[0] == row['rolling_min'] else 0), axis=1)
Here is a solution with numpy using numpy.lib.stride_tricks.sliding_window_view, which was introduced in version 1.20.0.
Note that this solution (like the one proposed by @Hanwei Tang) does not yield exactly the result you were looking for, because in the second window ([2, 3, 2]) the current value 2 is the minimum and thus a -1 is returned instead of the zero you requested. But maybe you should rethink whether you really want a zero or a -1 for the second window.
EDIT: If a window contains only identical numbers, i.e. the minimum and maximum are the same, this method returns a zero.
import numpy as np
def rolling_max(a, wsize):
    windows = np.lib.stride_tricks.sliding_window_view(a, wsize)
    return np.max(windows, axis=-1)
def rolling_min(a, wsize):
    windows = np.lib.stride_tricks.sliding_window_view(a, wsize)
    return np.min(windows, axis=-1)
def check_prize(a, wsize):
    rmax = rolling_max(a, wsize)
    rmin = rolling_min(a, wsize)
    ismax = np.where(a[wsize-1:] == rmax, 1, 0)
    ismin = np.where(a[wsize-1:] == rmin, -1, 0)
    result = np.zeros_like(a)
    result[wsize-1:] = ismax + ismin
    return result
a = np.array([1, 2, 3, 2, 1, 6])
check_prize(a, wsize=3)
# Output:
# array([ 0, 0, 1, -1, -1, 1])
b = np.array([1, 2, 4, 3, 1, 6])
check_prize(b, wsize=3)
# Output:
# array([ 0, 0, 1, 0, -1, 1])
c = np.array([1, 2, 2, 2, 1, 6])
check_prize(c, wsize=3)
# Output:
# array([ 0, 0, 1, 0, -1, 1])
Another approach using sliding_window_view with pad:
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view as swv
k = 3
a = np.array([1, 2, 3, 2, 1, 6])
# create sliding window
v = swv(np.pad(a.astype(float), (k-1, 0), constant_values=np.nan), k)
# compare each element to min/max of sliding window
out = np.select([np.max(v, 1)==a, np.min(v, 1)==a], [1, -1], 0)
Output: array([ 0, 0, 1, -1, -1, 1])
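For comparison, the same rule can be written without NumPy. The sketch below is a plain-Python version (the name extreme_flags is hypothetical) that takes O(n*k) time and returns 0 for a constant window, mirroring the EDIT in the sliding_window_view answer above.

```python
def extreme_flags(values, k):
    """Return 1 where values[i] is the max of the last k values, -1 where it is
    the min, and 0 otherwise (including the first k - 1 positions)."""
    flags = []
    for i, v in enumerate(values):
        if i < k - 1:
            flags.append(0)           # not enough history yet
            continue
        window = values[i - k + 1:i + 1]
        hi, lo = max(window), min(window)
        if hi == lo:
            flags.append(0)           # constant window: neither max nor min
        elif v == hi:
            flags.append(1)
        elif v == lo:
            flags.append(-1)
        else:
            flags.append(0)
    return flags

print(extreme_flags([1, 2, 3, 2, 1, 6], 3))
# [0, 0, 1, -1, -1, 1]
```

As with the other answers, index 3 yields -1 rather than the 0 shown in the question, because 2 ties for the minimum of [2, 3, 2].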
I have a numpy array where 0 denotes empty space and 1 denotes that a location is filled. I am trying to find a quick method of scanning the numpy array for where there are multiple values of zero adjacent to each other and return the location of the central zero.
For Example if I had the following array
[0 1 0 1]
[0 0 0 1]
[0 1 0 1]
[1 1 1 1]
I want to return the locations for which there is an adjacent zero on either side of a central zero
e.g
[1,1]
as this is the center of 3 zeros, i.e. there is a zero on either side of the zero at this location.
I'm aware that this can be calculated using if statements, but wondered if there was a more Pythonic way of doing this.
Any help is greatly appreciated
The desired output here for arbitrary inputs is not exhaustively specified in the question, but here is a possible approach that might be useful for this kind of problem, and adapted to the details of the desired output. It uses np.cumsum, np.bincount, np.where, and np.median to find the middle index for groups of consecutive zeros along rows of a 2D array:
import numpy as np
def find_groups(x, min_size=3, value=0):
    # Compute a sequential label for groups in each row.
    xc = (x != value).cumsum(1)
    # Count the number of occurrences per group in each row.
    counts = np.apply_along_axis(
        lambda x: np.bincount(x, minlength=1 + xc.max()),
        axis=1, arr=xc)
    # Filter by minimum number of occurrences.
    i, j = np.where(counts >= min_size)
    # Compute the median index of each group.
    return [
        (ii, int(np.ceil(np.median(np.where(xc[ii] == jj)[0]))))
        for ii, jj in zip(i, j)
    ]
x = np.array([[0, 1, 0, 1],
[0, 0, 0, 1],
[0, 1, 0, 1],
[1, 1, 1, 1]])
print(find_groups(x))
# [(1, 1)]
It should work properly even for multiple rows with groups of varying sizes, and even multiple groups per row:
x2 = np.array([[0, 1, 0, 1, 1, 1, 1],
[0, 0, 0, 1, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 1],
[0, 0, 0, 0, 0, 0, 0]])
print(find_groups(x2))
# [(1, 1), (1, 5), (2, 3), (3, 3)]
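If the goal is read literally as "a zero with a zero directly to its left and right", shifted slices are enough. The following is a sketch under that reading (the function name zero_triple_centers is an assumption), and it reports every such center, so a run of four zeros produces two hits rather than one.

```python
import numpy as np

def zero_triple_centers(x):
    """Return (row, col) of every zero whose left and right neighbors in the
    same row are also zero."""
    x = np.asarray(x)
    # Compare each interior column with its left and right neighbor.
    hit = (x[:, :-2] == 0) & (x[:, 1:-1] == 0) & (x[:, 2:] == 0)
    rows, cols = np.nonzero(hit)
    # hit[:, c] describes the triple centered at column c + 1.
    return [(int(r), int(c) + 1) for r, c in zip(rows, cols)]

x = np.array([[0, 1, 0, 1],
              [0, 0, 0, 1],
              [0, 1, 0, 1],
              [1, 1, 1, 1]])
print(zero_triple_centers(x))
# [(1, 1)]
```

The same comparison along axis 0 would find vertical triples, if those matter for the use case.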
Let's say I have a NumPy array:
x = np.array([0, 1, 2, 0, 4, 5, 6, 7, 0, 0])
At each index, I want to find the distance to nearest zero value. If the position is a zero itself then return zero as a distance. Afterward, we are only interested in distances to the nearest zero that is to the right of the current position. The super naive approach would be something like:
out = np.full(x.shape[0], x.shape[0]-1)
for i in range(x.shape[0]):
    j = 0
    while i + j < x.shape[0]:
        if x[i+j] == 0:
            break
        j += 1
    out[i] = j
And the output would be:
array([0, 2, 1, 0, 4, 3, 2, 1, 0, 0])
I'm noticing a countdown/decrement pattern in the output in between the zeros. So, I might be able to use the locations of the zeros (i.e., zero_indices = np.argwhere(x == 0).flatten()).
What is the fastest way to get the desired output in linear time?
Approach #1 : Searchsorted to the rescue for linear-time in a vectorized manner (before numba guys come in)!
mask_z = x==0
idx_z = np.flatnonzero(mask_z)
idx_nz = np.flatnonzero(~mask_z)
# Cover for the case when there's no 0 left to the right
# (for same results as with posted loop-based solution)
if x[-1]!=0:
    idx_z = np.r_[idx_z,len(x)]
out = np.zeros(len(x), dtype=int)
idx = np.searchsorted(idx_z, idx_nz)
out[~mask_z] = idx_z[idx] - idx_nz
Approach #2 : Another with some cumsum -
mask_z = x==0
idx_z = np.flatnonzero(mask_z)
# Cover for the case when there's no 0 left to the right
if x[-1]!=0:
    idx_z = np.r_[idx_z,len(x)]
out = idx_z[np.r_[False,mask_z[:-1]].cumsum()] - np.arange(len(x))
Alternatively, last step of cumsum could be replaced by repeat functionality -
r = np.r_[idx_z[0]+1,np.diff(idx_z)]
out = np.repeat(idx_z,r)[:len(x)] - np.arange(len(x))
Approach #3 : Another with mostly just cumsum -
mask_z = x==0
idx_z = np.flatnonzero(mask_z)
pp = np.full(len(x), -1)
pp[idx_z[:-1]] = np.diff(idx_z) - 1
if idx_z[0]==0:
    pp[0] = idx_z[1]
else:
    pp[0] = idx_z[0]
out = pp.cumsum()
# Handle boundary case and assigns 0s at original 0s places
out[idx_z[-1]:] = np.arange(len(x)-idx_z[-1],0,-1)
out[mask_z] = 0
You could work from the other side. Keep a counter on how many non zero digits have passed and assign it to the element in the array. If you see 0, reset the counter to 0
Edit: if there is no zero on the right, then you need another check
x = np.array([0, 1, 2, 0, 4, 5, 6, 7, 0, 0])
out = x.copy()  # copy so the input array is not overwritten
count = 0
hasZero = False
for i in range(x.shape[0]-1,-1,-1):
    if out[i] != 0:
        if not hasZero:
            out[i] = x.shape[0]-1
        else:
            count += 1
            out[i] = count
    else:
        hasZero = True
        count = 0
print(out)
You can use the difference between the indices of each position and the cumulative max of zero positions to determine the distance to the preceding zero. This can be done forward and backward. The minimum between forward and backward distance to the preceding (or next) zero will be the nearest:
import numpy as np
indices = np.arange(x.size)
zeroes = x==0
forward = indices - np.maximum.accumulate(indices*zeroes) # forward distance
forward[np.cumsum(zeroes)==0] = x.size-1 # handle absence of zero from edge
forward = forward * (x!=0) # set zero positions to zero
zeroes = zeroes[::-1]
backward = indices - np.maximum.accumulate(indices*zeroes) # backward distance
backward[np.cumsum(zeroes)==0] = x.size-1 # handle absence of zero from edge
backward = backward[::-1] * (x!=0) # set zero positions to zero
distZero = np.minimum(forward,backward) # closest distance (minimum)
results:
distZero
# [0, 1, 1, 0, 1, 2, 2, 1, 0, 0]
forward
# [0, 1, 2, 0, 1, 2, 3, 4, 0, 0]
backward
# [0, 2, 1, 0, 4, 3, 2, 1, 0, 0]
Special case where no zeroes are present on outer edges:
x = np.array([3, 1, 2, 0, 4, 5, 6, 0,8,8])
forward: [9 9 9 0 1 2 3 0 1 2]
backward: [3 2 1 0 3 2 1 0 9 9]
distZero: [3 2 1 0 1 2 1 0 1 2]
also works with no zeroes at all
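To see why the cumulative maximum gives the forward distance, it can help to look at the intermediate arrays for the original x. The variable names zero_pos and last_zero below are only for illustration.

```python
import numpy as np

x = np.array([0, 1, 2, 0, 4, 5, 6, 7, 0, 0])
indices = np.arange(x.size)

# Keep the index where x is zero, 0 elsewhere.
zero_pos = indices * (x == 0)
# Running maximum = index of the most recent zero seen so far.
last_zero = np.maximum.accumulate(zero_pos)
# Distance back to that zero.
forward = indices - last_zero

print(zero_pos)   # [0 0 0 3 0 0 0 0 8 9]
print(last_zero)  # [0 0 0 3 3 3 3 3 8 9]
print(forward)    # [0 1 2 0 1 2 3 4 0 0]
```

Positions before the first zero would wrongly read as distance-from-index-0 here, which is what the cumsum(zeroes)==0 correction in the answer takes care of.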
[EDIT] non-numpy solutions ...
if you're looking for an O(N) solution that doesn't require numpy, you can apply this strategy using the accumulate function from itertools:
x = [0, 1, 2, 0, 4, 5, 6, 7, 0, 0]
from itertools import accumulate
maxDist = len(x) - 1
zeroes = [maxDist*(v!=0) for v in x]
forward = [*accumulate(zeroes,lambda d,v:min(maxDist,(d+1)*(v!=0)))]
backward = accumulate(zeroes[::-1],lambda d,v:min(maxDist,(d+1)*(v!=0)))
backward = [*backward][::-1]
distZero = [min(f,b) for f,b in zip(forward,backward)]
print("x",x)
print("f",forward)
print("b",backward)
print("d",distZero)
output:
x [0, 1, 2, 0, 4, 5, 6, 7, 0, 0]
f [0, 1, 2, 0, 1, 2, 3, 4, 0, 0]
b [0, 2, 1, 0, 4, 3, 2, 1, 0, 0]
d [0, 1, 1, 0, 1, 2, 2, 1, 0, 0]
If you don't want to use any library, you can accumulate the distances manually in a loop:
x = [0, 1, 2, 0, 4, 5, 6, 7, 0, 0]
forward,backward = [],[]
fDist = bDist = maxDist = len(x)-1
for f,b in zip(x,reversed(x)):
    fDist = min(maxDist,(fDist+1)*(f!=0))
    forward.append(fDist)
    bDist = min(maxDist,(bDist+1)*(b!=0))
    backward.append(bDist)
backward = backward[::-1]
distZero = [min(f,b) for f,b in zip(forward,backward)]
print("x",x)
print("f",forward)
print("b",backward)
print("d",distZero)
output:
x [0, 1, 2, 0, 4, 5, 6, 7, 0, 0]
f [0, 1, 2, 0, 1, 2, 3, 4, 0, 0]
b [0, 2, 1, 0, 4, 3, 2, 1, 0, 0]
d [0, 1, 1, 0, 1, 2, 2, 1, 0, 0]
My first intuition would be to use slicing. If x can be a normal list instead of a numpy array, then you could use
out = [x[i:].index(0) for i,_ in enumerate(x)]
if numpy is necessary then you can use
out = [np.where(x[i:]==0)[0][0] for i,_ in enumerate(x)]
but this is less efficient because you are finding all zero locations to the right of the value and then pulling out just the first. Almost definitely a better way to do this in numpy.
Edit: I am sorry, I misunderstood. This will give you the distance to the nearest zeros - may it be at left or right. But you can use d_right as intermediate result. This does not cover the edge case of not having any zero to the right though.
import numpy as np
x = np.array([0, 1, 2, 0, 4, 5, 6, 7, 0, 0])
# Get the distance to the closest zero from the left:
zeros = x == 0
zero_locations = np.argwhere(x == 0).flatten()
zero_distances = np.diff(np.insert(zero_locations, 0, 0))
temp = x.copy()
temp[~zeros] = 1
temp[zeros] = -(zero_distances-1)
d_left = np.cumsum(temp) - 1
# Get the distance to the closest zero from the right:
zeros = x[::-1] == 0
zero_locations = np.argwhere(x[::-1] == 0).flatten()
zero_distances = np.diff(np.insert(zero_locations, 0, 0))
temp = x.copy()
temp[~zeros] = 1
temp[zeros] = -(zero_distances-1)
d_right = np.cumsum(temp) - 1
d_right = d_right[::-1]
# Get the smallest distance from both sides:
smallest_distances = np.min(np.stack([d_left, d_right]), axis=0)
# np.array([0, 1, 1, 0, 1, 2, 2, 1, 0, 0])
Consider a sequence of coin tosses: 1, 0, 0, 1, 0, 1 where tail = 0 and head = 1.
The desired output is the sequence: 0, 1, 2, 0, 1, 0
Each element of the output sequence counts the number of tails since the last head.
I have tried a naive method:
def timer(seq):
    if seq[0] == 1: time = [0]
    if seq[0] == 0: time = [1]
    for x in seq[1:]:
        if x == 0: time.append(time[-1] + 1)
        if x == 1: time.append(0)
    return time
Question: Is there a better method?
Using NumPy:
import numpy as np
seq = np.array([1,0,0,1,0,1,0,0,0,0,1,0])
arr = np.arange(len(seq))
result = arr - np.maximum.accumulate(arr * seq)
print(result)
yields
[0 1 2 0 1 0 1 2 3 4 0 1]
Why arr - np.maximum.accumulate(arr * seq)? The desired output seemed related to a simple progression of integers:
arr = np.arange(len(seq))
So the natural question is, if seq = np.array([1, 0, 0, 1, 0, 1]) and the expected result is expected = np.array([0, 1, 2, 0, 1, 0]), then what value of x makes
arr + x = expected
Since
In [220]: expected - arr
Out[220]: array([ 0, 0, 0, -3, -3, -5])
it looks like x should be the cumulative max of arr * seq:
In [234]: arr * seq
Out[234]: array([0, 0, 0, 3, 0, 5])
In [235]: np.maximum.accumulate(arr * seq)
Out[235]: array([0, 0, 0, 3, 3, 5])
Step 1: Invert l:
In [311]: l = [1, 0, 0, 1, 0, 1]
In [312]: out = [int(not i) for i in l]; out
Out[312]: [0, 1, 1, 0, 1, 0]
Step 2: List comp; add previous value to current value if current value is 1.
In [319]: [out[0]] + [x + y if y else y for x, y in zip(out[:-1], out[1:])]
Out[319]: [0, 1, 2, 0, 1, 0]
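This gets rid of windy ifs by zipping adjacent elements. Note that it only looks one element back in the inverted list, so it matches the desired output only when no run of tails is longer than two (as in this example).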
Using itertools.accumulate:
>>> from itertools import accumulate
>>> a = [1, 0, 0, 1, 0, 1]
>>> b = [1 - x for x in a]
>>> list(accumulate(b, lambda total,e: total+1 if e==1 else 0))
[0, 1, 2, 0, 1, 0]
accumulate is only defined in Python 3. There's the equivalent Python code in the above documentation, though, if you want to use it in Python 2.
It's required to invert a because the first element returned by accumulate is the first list element, independently from the accumulator function:
>>> list(accumulate(a, lambda total,e: 0))
[1, 0, 0, 0, 0, 0]
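On Python 3.8 or later, the inversion step can be avoided by seeding accumulate with its initial keyword, so that the first toss also passes through the function. This is a sketch with a hypothetical helper name, tails_since_head.

```python
from itertools import accumulate

def tails_since_head(seq):
    """Count tails (0) since the last head (1) at every position."""
    # initial=0 seeds the running count, so seq[0] goes through the lambda too.
    runs = accumulate(seq, lambda run, toss: 0 if toss == 1 else run + 1, initial=0)
    return list(runs)[1:]  # drop the seed value

print(tails_since_head([1, 0, 0, 1, 0, 1]))
# [0, 1, 2, 0, 1, 0]
```

With initial supplied, accumulate emits the seed first, which is why the first element is sliced off.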
The required output is an array with the same length as the input and none of the values are equal to the input. Therefore, the algorithm must be at least O(n) to form the new output array. Furthermore for this specific problem, you would also need to scan all the values for the input array. All these operations are O(n) and it will not get any more efficient. Constants may differ but your method is already in O(n) and will not go any lower.
Using reduce (in Python 3 it has to be imported with from functools import reduce):
time = reduce(lambda l, r: l + [(l[-1]+1)*(not r)], seq, [0])[1:]
I try to be clear in the following code and differ from the original in using an explicit accumulator.
>>> s = [1,0,0,1,0,1,0,0,0,0,1,0]
>>> def zero_run_length_or_zero(seq):
...     "Return the run length of zeroes so far in the sequence or zero"
...     accumulator, answer = 0, []
...     for item in seq:
...         accumulator = 0 if item == 1 else accumulator + 1
...         answer.append(accumulator)
...     return answer
...
>>> zero_run_length_or_zero(s)
[0, 1, 2, 0, 1, 0, 1, 2, 3, 4, 0, 1]
>>>
I've got an n*n binary matrix (only 1 and 0), how can I go about counting 2*2 squares (squares are made by 1)
for example A=[[1,1],[1,1]] is considered to make one 2*2 square. or
A = [[1, 1, 0, 1],
[1, 1, 1, 1],
[1, 1, 1, 0],
[0, 1, 1, 1]]
is considered to make four 2*2 squares.
Here's my code for this, but I don't know why it doesn't work.
A = [[1, 1, 0, 1] , [1, 1, 1, 1], [1, 1, 1, 0], [0, 1, 1, 1]]
result=[]
for x in range(len(A)-1):
    for y in range(len(A)-1):
        if A[x][y]==1:
            if A[x+1][y]==1:
                if A[x][y+1]==1 or A[x][y-1]==1 and A[x+1][y] or A[x+1][y-1]==1:
                    result.append(1)
            if A[x-1][y]==1:
                if A[x][y+1]==1 or A[x][y-1]==1 and A[x-1][y] or A[x-1][y-1]==1:
                    result.append(1)
print(len(result))
Generate indices for width - 1 by height - 1; itertools.product() can do this for us.
Test 4 coordinates for each generated index using all() to only test as many as needed to disprove a square exists.
Use sum() with a generator to count the number of squares found; faster than manually counting with a list or a counter.
Together with lambda to test for squares, this then becomes:
from itertools import product
def count_squares(A):
    height, width = len(A), len(A[0])
    # x indexes rows and y indexes columns, matching A[x][y] below
    indices = product(range(height - 1), range(width - 1))
    is_square = lambda x, y: all(A[a][b] == 1 for a, b in product((x, x + 1), (y, y + 1)))
    return sum(1 for x, y in indices if is_square(x, y))
Demo:
>>> from itertools import product
>>> def count_squares(A):
...     height, width = len(A), len(A[0])
...     indices = product(range(height - 1), range(width - 1))
...     is_square = lambda x, y: all(A[a][b] == 1 for a, b in product((x, x + 1), (y, y + 1)))
...     return sum(1 for x, y in indices if is_square(x, y))
...
>>> count_squares([[1,1],[1,1]])
1
>>> count_squares([[1, 1, 0, 1] , [1, 1, 1, 1], [1, 1, 1, 0], [0, 1, 1, 1]])
4
To get the column count use len(A[x]) so
for y in range(len(A)-1)
becomes
for y in range(len(A[x])-1)
Change
if A[x][y]==1:
    if A[x+1][y]==1:
        if A[x][y+1]==1 or A[x][y-1]==1 and A[x+1][y] or A[x+1][y-1]==1:
            result.append(1)
    if A[x-1][y]==1:
        if A[x][y+1]==1 or A[x][y-1]==1 and A[x-1][y] or A[x-1][y-1]==1:
            result.append(1)
To
if A[x][y]==1 and A[x+1][y]==1 and A[x+1][y+1]==1 and A[x][y+1]==1:
    result.append(1)
Unless you want to count squares multiple times.
Using scipy.signal there is a simple solution that finds the correlation between your target and the input. This is nice since it generalizes to "almost matches" and arbitrary shapes!
import numpy as np
from scipy import signal
A = np.array([[1,1,0,1] ,[1,1,1,1],[1,1,1,0],[0,1,1,1]],dtype=int)
b = np.ones((2,2),dtype=int)
c = signal.correlate(A, b, 'valid')
idx = np.where(c==4)
count = len(idx[0])  # number of matching positions (summing the row indices only gives 4 here by coincidence)
print(count)
This gives 4 as expected. If you find this interesting, there is a (longer) answer that uses this same idea:
Finding matching submatrices inside a matrix
I multiply the values of every 2*2 submatrix and sum up:
A = [[1, 1, 0, 1],
[1, 1, 1, 1],
[1, 1, 1, 0],
[0, 1, 1, 1]]
sum( A[x][y]*A[x+1][y]*A[x][y+1]*A[x+1][y+1]
for y in range(len(A)-1)
for x in range(len(A[y])-1)
)
Out[79]: 4
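The same product-of-four-corners idea can be expressed with shifted NumPy slices, which also handles non-square matrices. This is a sketch; the function name count_2x2_squares is made up for illustration.

```python
import numpy as np

def count_2x2_squares(A):
    """Count the 2x2 windows of a 2D binary matrix that contain only ones."""
    A = np.asarray(A) == 1
    # A window is full when its four corners (top-left, bottom-left,
    # top-right, bottom-right) are all True.
    full = A[:-1, :-1] & A[1:, :-1] & A[:-1, 1:] & A[1:, 1:]
    return int(full.sum())

A = [[1, 1, 0, 1],
     [1, 1, 1, 1],
     [1, 1, 1, 0],
     [0, 1, 1, 1]]
print(count_2x2_squares(A))
# 4
```

Overlapping squares are counted separately, which matches the expected answer of four in the question.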