Finding a Pattern in a Grid Python [duplicate] - python

I have randomly generated grid containing 0 and 1:
1 1 0 0 0 1 0 1
1 1 1 0 1 1 1 1
1 0 0 0 1 0 1 1
0 0 1 0 1 0 1 1
1 1 1 1 0 0 1 1
0 0 1 1 1 1 1 0
0 1 0 0 1 0 1 1
How can I iterate through the grid to find the largest cluster of 1s that contains 4 or more items (across rows and columns)?
I assume I need to keep a count of each cluster found while iterating and, once it has more than 4 items, record the count in a list and then find the largest number.
The problem is that I cannot figure out how to do so across both rows and columns and record the count. I have iterated through the grid but not sure how to move further than two rows.
For example in the above example, the largest cluster is 8. There are some other clusters in the grid, but they have 4 elements:
A A 0 0 0 1 0 1
A A 1 0 1 1 1 1
1 0 0 0 1 0 1 1
0 0 1 0 1 0 1 1
1 1 B B 0 0 1 1
0 0 B B 1 1 1 0
0 1 0 0 1 0 1 1
The code I tried:
rectcount = []
for row in range(len(grid)):
    for num in range(len(grid[row])):
        # count = 0
        try:
            # if grid[row][num] == 1:
            # if grid[row][num] == grid[row][num + 1] == grid[row + 1][num] == grid[row + 1][num + 1]:
            # count += 1
            if grid[row][num] == grid[row][num + 1]:
                if grid[row + 1][num] == grid[row][num + 1]:
                    count += 1
                # if grid[row][num] == grid[row][num + 1] and grid[row][num] == grid[row + 1][num]:
                # count += 1
                else:
                    count = 0
            if grid[row][num] == grid[row + 1][num]:
                count += 1
        except:
            pass

I've implemented three algorithms.
First algorithm is Simple, using easiest approach of nested loops, it has O(N^5) time complexity (where N is one side of input grid, 10 for our case), for our inputs of size 10x10 time of O(10^5) is quite alright. Algo id in code is algo = 0. If you just want to see this algorithm jump to line ------ Simple Algorithm inside code.
Second algorithm is Advanced, using Dynamic Programming approach, its complexity is O(N^3) which is much faster than first algorithm. Algo id in code is algo = 1. Jump to line ------- Advanced Algorithm inside code.
The third algorithm, Simple-ListComp, I implemented just for fun. It is almost the same as Simple, with the same O(N^5) complexity, but uses Python's list comprehensions instead of regular loops. That makes it shorter, but also a bit slower because it skips some optimizations. Algo id in code is algo = 2. Jump to line ------- Simple-ListComp Algorithm inside code to see algo.
The rest of code, besides algorithms, implements checking correctness of results (double-checking between algorithms), printing results, producing text inputs. Code is split into solving-task function solve() and testing function test(). solve() function has many arguments to allow configuring behavior of function.
All main code lines are documented by comments, read them to learn how to use the code. Basically, if the s variable contains multi-line text with grid elements, same as in your question, you just run solve(s, text = True) and it will solve the task and print results. You may also choose one of the three algorithms (0 (Simple), 1 (Advanced) or 2 (Simple-ListComp)) by passing the arguments algo = 0, check = False to the solve function (here 0 selects algo 0). Look at the test() function body to see the simplest example of usage.
Algorithms output to console by default all clusters, from largest to smallest, largest is signified by . symbol, the rest by B, C, D, ..., Z symbols. You may set argument show_non_max = False in solve function if you want only first (largest) cluster to be shown.
I'll explain Simple algorithm:
Basically what algorithm does - it searches through all possible angled 1s rectangles and stores info about maximal of them into ma 2D array. Top-left point of such rectangle is (i, j), top-right - (i, k), bottom-left - (l, j + angle_offset), bottom-right - (l, k + angle_offset), all 4 corners, that's why we have so many loops.
In outer two i (row) , j (column) loops we iterate over whole grid, this (i, j) position will be top-left point of 1s rectangle, we need to iterate whole grid because all possible 1s rectangles may have top-left at any (row, col) point of whole grid. At start of j loop we check that grid at (i, j) position should always contain 1 because inside loops we search for all rectangle with 1s only.
k loop iterates through all possible top-right positions (i, k) of 1s rectangle. We should break out of loop if (i, k) equals to 0 because there is no point to extend k further to right because such rectangle will always contain 0.
In previous loops we fixed top-left and top-right corners of rectangle. Now we need to search for two bottom corners. For that we need to extend rectangle downwards at different angles till we reach first 0.
off loop tries extending rectangle downwards at all possible angles (0 (straight vertical), +1 (45 degrees shifted to the right from top to bottom), -1 (-45 degrees)). off is basically such a number that grid[y + 1][x + off] is the cell directly "below" grid[y][x] along the slanted side of the rectangle.
l tries to extend rectangle downwards (in Y direction) at different angles off. It is extended till first 0 because it can't be extended further then (because each such rectangle will already contain 0).
Inside l loop there is if grid[l][max(0, j + off * (l - i)) : min(k + 1 + off * (l - i), c)] != ones[:k - j + 1]: condition, basically this if is meant to check that last row of rectangle contains all 1 if not this if breaks out of loop. This condition compares two list slices for non-equality. Last row of rectangle spans from point (l, j + angle_offset) (expression max(0, j + off * (l - i)), max-limited to be 0 <= X) to point (l, k + angle_offset) (expression min(k + 1 + off * (l - i), c), min-limited to be X < c).
Inside l loop there are other lines, ry, rx = l, k + off * (l - i) computes bottom-right point of rectangle (ry, rx) which is (l, k + angle_offset), this (ry, rx) position is used to store found maximum inside ma array, this array stores all maximal found rectangles, ma[ry][rx] contains info about rectangle that has bottom-right at point (ry, rx).
rv = (l + 1 - i, k + 1 - j, off) line computes new possible candidate for ma[ry][rx] array entry, possible because ma[ry][rx] is updated only if new candidate has larger area of 1s. Here rv[0] value inside rv tuple contains height of such rectangle, rv[1] contains width of such rectangle (width equals to the length of bottom row of rectangle), rv[2] contains angle of such rectangle.
Condition if rv[0] * rv[1] > ma[ry][rx][0] * ma[ry][rx][1]: and its body just checks if rv area is larger than current maximum inside array ma[ry][rx] and if it is larger then this array entry is updated (ma[ry][rx] = rv). I'll remind that ma[ry][rx] contains info (height, width, angle) about current found maximal-area rectangle that has bottom-right point at (ry, rx) and that has these height, width and angle.
Done! After algorithm run array ma contains information about all maximal-area angled rectangles (clusters) of 1s so that all clusters can be restored and printed later to console. Largest of all such 1s-clusters is equal to some rv0 = ma[ry0][rx0], just iterate once through all elements of ma and find such point (ry0, rx0) so that ma[ry0][rx0][0] * ma[ry0][rx0][1] (area) is maximal. Then largest cluster will have bottom-right point (ry0, rx0), bottom-left point (ry0, rx0 - rv0[1] + 1), top-right point (ry0 - rv0[0] + 1, rx0 - rv0[2] * (rv0[0] - 1)), top-left point (ry0 - rv0[0] + 1, rx0 - rv0[1] + 1 - rv0[2] * (rv0[0] - 1)) (here rv0[2] * (rv0[0] - 1) is just angle offset, i.e. how much shifted is first row along X compared to last row of rectangle).
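As a rough illustration of the loops described above, here is a minimal sketch restricted to the straight case (angle 0 only). The function name largest_straight_rectangle and its return format are assumptions made for this example, not part of the full solution below; it keeps the same "at least 2 ones along X and along Y" rule.

```python
def largest_straight_rectangle(grid, one=1):
    """Brute-force search for the largest axis-aligned rectangle of ones.

    Returns (area, (top_row, left_col, height, width)), or (0, None)
    when no rectangle of at least 2x2 exists.
    """
    r, c = len(grid), len(grid[0])
    best_area, best_rect = 0, None
    for i in range(r):                      # top row
        for j in range(c):                  # left column
            if grid[i][j] != one:
                continue
            for k in range(j + 1, c):       # right column, at least 2 ones along X
                if grid[i][k] != one:
                    break
                width = k - j + 1
                for l in range(i + 1, r):   # bottom row, at least 2 ones along Y
                    if any(v != one for v in grid[l][j:k + 1]):
                        break               # a zero ends the downward extension
                    height = l - i + 1
                    if height * width > best_area:
                        best_area = height * width
                        best_rect = (i, j, height, width)
    return best_area, best_rect


question_grid = [
    [1, 1, 0, 0, 0, 1, 0, 1],
    [1, 1, 1, 0, 1, 1, 1, 1],
    [1, 0, 0, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 0, 1, 1],
    [1, 1, 1, 1, 0, 0, 1, 1],
    [0, 0, 1, 1, 1, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1, 1],
]
result = largest_straight_rectangle(question_grid)
print(result)  # (8, (1, 6, 4, 2))
```

For the grid from the question this finds the 4 x 2 block in the last two columns, whose lower-right corner is at row 4, column 7.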
# ----------------- Main function solving task -----------------
def solve(
    grid, *,
    algo = 1, # Choose algorithm, 0 - Simple, 1 - Advanced, 2 - Simple-ListComp
    check = True, # If True run all algorithms and check that they produce same results, otherwise run just chosen algorithm without checking
    text = False, # If true then grid is a multi-line text (string) having grid elements separated by spaces
    print_ = True, # Print results to console
    show_non_max = True, # When printing if to show all clusters, not just largest, as B, C, D, E... (chars from "cchars")
    cchars = ['.'] + [chr(ii) for ii in range(ord('B'), ord('Z') + 1)], # Clusters-chars, these chars are used to show clusters from largest to smallest
    one = None, # Value of "one" inside grid array, e.g. if you have grid with chars then one may be equal to "1" string. Defaults to 1 (for non-text) or "1" (for text).
    offs = [0, +1, -1], # All offsets (angles) that need to be checked, "off" is such that grid[i + 1][j + off] corresponds to next row of grid[i][j]
    debug = False, # If True, extra debug info is printed
):
    # Preparing
    assert algo in [0, 1, 2], algo
    if text:
        grid = [l.strip().split() for l in grid.splitlines() if l.strip()]
    if one is None:
        one = 1 if not text else '1'
    r, c = len(grid), len(grid[0])
    sgrid = '\n'.join([''.join([str(grid[ii][jj]) for jj in range(c)]) for ii in range(r)])
    mas, ones = [], [one] * max(c, r)
    # ----------------- Simple Algorithm, O(N^5) Complexity -----------------
    if algo == 0 or check:
        ma = [[(0, 0, 0) for jj in range(c)] for ii in range(r)] # Array containing maximal answers, Lower-Right corners
        for i in range(r):
            for j in range(c):
                if grid[i][j] != one:
                    continue
                for k in range(j + 1, c): # Ensure at least 2 ones along X
                    if grid[i][k] != one:
                        break
                    for off in offs:
                        for l in range(i + 1, r): # Ensure at least 2 ones along Y
                            if grid[l][max(0, j + off * (l - i)) : min(k + 1 + off * (l - i), c)] != ones[:k - j + 1]:
                                l -= 1
                                break
                            ry, rx = l, k + off * (l - i)
                            rv = (l + 1 - i, k + 1 - j, off)
                            if rv[0] * rv[1] > ma[ry][rx][0] * ma[ry][rx][1]:
                                ma[ry][rx] = rv
        mas.append(ma)
        ma = None
    # ----------------- Advanced Algorithm using Dynamic Programming, O(N^3) Complexity -----------------
    if algo == 1 or check:
        ma = [[(0, 0, 0) for jj in range(c)] for ii in range(r)] # Array containing maximal answers, Lower-Right corners
        for off in offs:
            d = [[(0, 0, 0) for jj in range(c)] for ii in range(c)]
            for i in range(r):
                f, d_ = 0, [[(0, 0, 0) for jj in range(c)] for ii in range(c)]
                for j in range(c):
                    if grid[i][j] != one:
                        f = j + 1
                        continue
                    if f >= j:
                        # Check that we have at least 2 ones along X
                        continue
                    df = [(0, 0, 0) for ii in range(c)]
                    for k in range(j, -1, -1):
                        t0 = d[j - off][max(0, k - off)] if 0 <= j - off < c and k - off < c else (0, 0, 0)
                        if k >= f:
                            t1 = (t0[0] + 1, t0[1], off) if t0 != (0, 0, 0) else (0, 0, 0)
                            t2 = (1, j - k + 1, off)
                            t0 = t1 if t1[0] * t1[1] >= t2[0] * t2[1] else t2
                            # Ensure that we have at least 2 ones along Y
                            t3 = t1 if t1[0] > 1 else (0, 0, 0)
                            if k < j and t3[0] * t3[1] < df[k + 1][0] * df[k + 1][1]:
                                t3 = df[k + 1]
                            df[k] = t3
                        else:
                            t0 = d_[j][k + 1]
                        if k < j and t0[0] * t0[1] < d_[j][k + 1][0] * d_[j][k + 1][1]:
                            t0 = d_[j][k + 1]
                        d_[j][k] = t0
                    if ma[i][j][0] * ma[i][j][1] < df[f][0] * df[f][1]:
                        ma[i][j] = df[f]
                d = d_
        mas.append(ma)
        ma = None
    # ----------------- Simple-ListComp Algorithm using List Comprehension, O(N^5) Complexity -----------------
    if algo == 2 or check:
        ma = [
            [
                max([(0, 0, 0)] + [
                    (h, w, off)
                    for h in range(2, i + 2)
                    for w in range(2, j + 2)
                    for off in offs
                    if all(
                        cr[
                            max(0, j + 1 - w - off * (h - 1 - icr)) :
                            max(0, j + 1 - off * (h - 1 - icr))
                        ] == ones[:w]
                        for icr, cr in enumerate(grid[max(0, i + 1 - h) : i + 1])
                    )
                ], key = lambda e: e[0] * e[1])
                for j in range(c)
            ]
            for i in range(r)
        ]
        mas.append(ma)
        ma = None
    # ----------------- Checking Correctness and Printing Results -----------------
    if check:
        # Check that we have same answers for all algorithms
        masx = [[[cma[ii][jj][0] * cma[ii][jj][1] for jj in range(c)] for ii in range(r)] for cma in mas]
        assert all([masx[0] == e for e in masx[1:]]), 'Maximums of algorithms differ!\n\n' + sgrid + '\n\n' + (
            '\n\n'.join(['\n'.join([' '.join([str(e1).rjust(2) for e1 in e0]) for e0 in cma]) for cma in masx])
        )
    ma = mas[0 if not check else algo]
    if print_:
        # Clusters are drawn with the chars from the "cchars" argument, from largest to smallest
        res = [[grid[ii][jj] for jj in range(c)] for ii in range(r)]
        mac = [[ma[ii][jj] for jj in range(c)] for ii in range(r)]
        processed = set()
        sid = 0
        for it in range(r * c):
            sma = sorted(
                [(mac[ii][jj] or (0, 0, 0)) + (ii, jj) for ii in range(r) for jj in range(c) if (ii, jj) not in processed],
                key = lambda e: e[0] * e[1], reverse = True
            )
            if len(sma) == 0 or sma[0][0] * sma[0][1] <= 0:
                break
            maxv = sma[0]
            if it == 0:
                maxvf = maxv
            processed.add((maxv[3], maxv[4]))
            show = True
            for trial in [True, False]:
                for i in range(maxv[3] - maxv[0] + 1, maxv[3] + 1):
                    for j in range(maxv[4] - maxv[1] + 1 - (maxv[3] - i) * maxv[2], maxv[4] + 1 - (maxv[3] - i) * maxv[2]):
                        if trial:
                            if mac[i][j] is None:
                                show = False
                                break
                        elif show:
                            res[i][j] = cchars[sid]
                            mac[i][j] = None
            if show:
                sid += 1
            if not show_non_max and it == 0:
                break
        res = '\n'.join([''.join([str(res[ii][jj]) for jj in range(c)]) for ii in range(r)])
        print(
            'Max:\nArea: ', maxvf[0] * maxvf[1], '\nSize Row,Col: ', (maxvf[0], maxvf[1]),
            '\nLowerRight Row,Col: ', (maxvf[3], maxvf[4]), '\nAngle: ', ("-1", " 0", "+1")[maxvf[2] + 1], '\n', sep = ''
        )
        print(res)
        if debug:
            # Print all computed maximums, for debug purposes
            for cma in [ma, mac]:
                print('\n' + '\n'.join([' '.join([f'({e0[0]}, {e0[1]}, {("-1", " 0", "+1")[e0[2] + 1]})' for e0_ in e for e0 in (e0_ or ('-', '-', 0),)]) for e in cma]))
        print(end = '-' * 28 + '\n')
    return ma
# ----------------- Testing -----------------
def test():
    # Iterating over text inputs or other ways of producing inputs
    for s in [
        """
        1 1 0 0 0 1 0 1
        1 1 1 0 1 1 1 1
        1 0 0 0 1 0 1 1
        0 0 1 0 1 0 1 1
        1 1 1 1 0 0 1 1
        0 0 1 1 1 1 1 0
        0 1 0 0 1 0 1 1
        """,
        """
        1 0 1 1 0 1 0 0
        0 1 1 0 1 0 0 1
        1 1 0 0 0 0 0 1
        0 1 1 1 0 1 0 1
        0 1 1 1 1 0 1 1
        1 1 0 0 0 1 0 0
        0 1 1 1 0 1 0 1
        """,
        """
        0 1 1 0 1 0 1 1
        0 0 1 1 0 0 0 1
        0 0 0 1 1 0 1 0
        1 1 0 0 1 1 1 0
        0 1 1 0 0 1 1 0
        0 0 1 0 1 0 1 1
        1 0 0 1 0 0 0 0
        0 1 1 0 1 1 0 0
        """
    ]:
        solve(s, text = True)
if __name__ == '__main__':
    test()
Output:
Max:
Area: 8
Size Row,Col: (4, 2)
LowerRight Row,Col: (4, 7)
Angle: 0
CC000101
CC1011..
100010..
001010..
1BBB00..
00BBBDD0
010010DD
----------------------------
Max:
Area: 6
Size Row,Col: (3, 2)
LowerRight Row,Col: (2, 1)
Angle: -1
10..0100
0..01001
..000001
0BBB0101
0BBB1011
CC000100
0CC10101
----------------------------
Max:
Area: 12
Size Row,Col: (6, 2)
LowerRight Row,Col: (5, 7)
Angle: +1
0..01011
00..0001
000..010
BB00..10
0BB00..0
001010..
10010000
01101100
----------------------------

Related

Anybody knows how to do a binary search with a 2d array of strings? [closed]

This is my array:
import numpy as np
Arr = np.array( [
["","A","B","C","D","E","F"],
["1","0","0","0","0","0","0"],
["2","0","0","0","0","0","0"],
["3","0","X","0","0","0","0"],
["4","0","0","0","0","0","0"],
["5","0","0","0","X","0","0"],
["6","X","0","0","0","0","0"],
["7","0","0","0","0","0","0"],
["8","0","0","0","0","0","0"]
])
I want to do a binary search but I don't know how to do it with an array of strings. Basically I want to look at the position in where all my "X" are.
def findRow(a, n, m, k):
    #a is the 2d array
    #n is the number of rows
    #m is the number of columns
    #k is the "X"
    l = 0
    r = n - 1
    mid = 0
    while (l <= r) :
        mid = int((l + r) / 2)
        # we'll check the left and
        # right most elements
        # of the row here itself
        # for efficiency
        if(k == a[mid][0]): #checking leftmost element
            print("Found at (" , mid , ",", "0)", sep = "")
            return
        if(k == a[mid][m - 1]): # checking rightmost element
            t = m - 1
            print("Found at (" , mid , ",", t , ")", sep = "")
            return
        if(k > a[mid][0] and k < a[mid][m - 1]): # this means the element
            # must be within this row
            binarySearch(a, n, m, k, mid) # we'll apply binary
            # search on this row
            return
        if (k < a[mid][0]):
            r = mid - 1
        if (k > a[mid][m - 1]):
            l = mid + 1
def binarySearch(a, n, m, k, x): #x is the row number
    # now we simply have to apply binary search as we
    # did in a 1-D array, for the elements in row
    # number
    # x
    l = 0
    r = m - 1
    mid = 0
    while (l <= r):
        mid = int((l + r) / 2)
        if (a[x][mid] == k):
            print("Found at (" , x , ",", mid , ")", sep = "")
            return
        if (a[x][mid] > k):
            r = mid - 1
        if (a[x][mid] < k):
            l = mid + 1
    print("Element not found")
This is what I have tried, but this is for int 2D arrays. Now I have a 2D array of strings and I'm trying to find the location of all my "X"s.
I want to output to be: found in (A,6), (B,3), (D,5)
Basically I want to look at the position in where all my "X" are.
You can use np.where to get the indices for each axis, then zip them to get index tuples for all the locations:
>>> list(zip(*np.where(Arr == "X")))
[(3, 2), (5, 4), (6, 1)]
If you want the (row, column) "locations", you could do this:
>>> [(Arr[row, 0], Arr[0, col]) for row, col in zip(*np.where(Arr == "X"))]
[('3', 'B'), ('5', 'D'), ('6', 'A')]
However, you seem to be treating an array as a table. You should consider using Pandas:
>>> import pandas as pd
>>> df = pd.DataFrame(Arr[1:, 1:], columns=Arr[0, 1:], index=range(1, len(Arr[1:]) + 1))
>>> df
   A  B  C  D  E  F
1  0  0  0  0  0  0
2  0  0  0  0  0  0
3  0  X  0  0  0  0
4  0  0  0  0  0  0
5  0  0  0  X  0  0
6  X  0  0  0  0  0
7  0  0  0  0  0  0
8  0  0  0  0  0  0
>>> rows, cols = np.where(df == "X")
>>> [*zip(df.index[rows], df.columns[cols])]
[(3, 'B'), (5, 'D'), (6, 'A')]
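Binary search needs sorted data, and the "X" cells here are not in any order, so a full scan is what actually answers the question. As a hedged sketch without NumPy, assuming the same table is held as a plain list of lists (the names table, hits and message are made up for this example), the requested "(column, row)" output could be produced like this:

```python
table = [
    ["", "A", "B", "C", "D", "E", "F"],
    ["1", "0", "0", "0", "0", "0", "0"],
    ["2", "0", "0", "0", "0", "0", "0"],
    ["3", "0", "X", "0", "0", "0", "0"],
    ["4", "0", "0", "0", "0", "0", "0"],
    ["5", "0", "0", "0", "X", "0", "0"],
    ["6", "X", "0", "0", "0", "0", "0"],
    ["7", "0", "0", "0", "0", "0", "0"],
    ["8", "0", "0", "0", "0", "0", "0"],
]

header = table[0]
# Scan every data cell; skip column 0 because it holds the row labels.
hits = [
    (header[c], int(row[0]))
    for row in table[1:]
    for c, cell in enumerate(row)
    if c > 0 and cell == "X"
]
hits.sort()  # order by column letter, then by row number

message = "found in " + ", ".join(f"({col},{row})" for col, row in hits)
print(message)  # found in (A,6), (B,3), (D,5)
```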

Printing pattern without importing modules

Please help me print the pattern below as it is, if the input entered is 7:
1 1 1 1 1 1 1
1 2 2 2 2 2 1
1 2 3 3 3 2 1
1 2 3 4 3 2 1
1 2 3 3 3 2 1
1 2 2 2 2 2 1
1 1 1 1 1 1 1
I figured out to find the middle element of the pattern with any input:
rows=int(input("Enter the number of rows:"))
l=[]
for x in range(1,rows+1):
    if x%2!=0:
        l.append(x)
mid_value=len(l)
Please help me complete the above pattern......
Thanks in advance!
If you use a list-of-lists to store the values, the value for any specific cell can be determined by doing some basic math involving the cell indexes and the number of rows.
An illustration:
def cell_value(i, j, n_rows):
    # The value of any cell is the minimum distance
    # from its own coordinates (i, j) to the "outside" (ie,
    # an i or j less than 0 or equal to n_rows). Imagine an
    # ant in the grid. How many steps would it have to take
    # to escape the grid, using the shortest route?
    return min(
        abs(i - -1),
        abs(i - n_rows),
        abs(j - -1),
        abs(j - n_rows),
    )
N_ROWS = 7
rows = [
    [
        cell_value(i, j, N_ROWS)
        for j in range(N_ROWS)
    ]
    for i in range(N_ROWS)
]
for r in rows:
    print(*r)
Output:
1 1 1 1 1 1 1
1 2 2 2 2 2 1
1 2 3 3 3 2 1
1 2 3 4 3 2 1
1 2 3 3 3 2 1
1 2 2 2 2 2 1
1 1 1 1 1 1 1
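The same distance-to-the-edge idea can be written without abs(): a cell's value is one more than the smallest of its four distances to the nearest border cell. This is only a compact restatement of the illustration above; the function name ring_pattern is an assumption for this sketch.

```python
def ring_pattern(n):
    """Build the n x n pattern of nested rings as a list of lists."""
    return [
        [min(i, j, n - 1 - i, n - 1 - j) + 1 for j in range(n)]
        for i in range(n)
    ]


for row in ring_pattern(7):
    print(*row)
```

It also works for even sizes, where the innermost ring is a 2 x 2 block.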
This looks like a homework question, so I'm going to try and explain how to approach it rather than just provide code.
A few things worth noting to start:
- The pattern's symmetrical in both directions, so we can save some effort and logic by only solving the top-left quarter, and copying it to the rest.
- Each row is similar to the one before, with one added at the point where the row and column indices (i and j) are equal. Rather than recalculate every row from scratch, we can take the one before as a base.
So, for the first row, make a list of 1s the length of your input (7, in this case).
Copy this for the seventh row (note: row6 = row0 won't create a copy; you'll need row6 = list(row0) )
For the second and sixth rows, take a copy of the first row. If i is equal to or greater than j and is in the first half of the row, add 1 to it. You'll need to copy that in reverse for the back half of the row. (Alternative - set the value to j+1 rather than just adding 1)
Repeat until the fourth row, and you should be done.
EDIT: code included, because it was an interesting problem
numberOfRows = int(input("Enter the number of rows:"))
listOut = [[1]*numberOfRows] * numberOfRows #grid of 1s of appropriate size
for j in range(int((numberOfRows+1)/2)): #symmetrical, so only look to the middle
    if j > 0:
        listOut[j] = list(listOut[j-1]) #copy previous row
    for i in range(int((numberOfRows+1)/2)):
        if i>=j:
            listOut[j][i] = j+1
            listOut[j][numberOfRows-(i+1)] = j+1
    #copy current row to appropriate distance from the end
    listOut[numberOfRows-(j+1)] = list(listOut[j])
for row in listOut:
    # * for sequence unpacking, printing lists as strings w/o commas
    print(*row)
It might not be the most elegant solution but something like this should work:
n = int(input('Enter the number of rows:'))
table = [[1 for _ in range(n)] for _ in range(n)]
start = 0
end = n
while start < end:
    start += 1
    end -= 1
    for i in range(start, end):
        for j in range(start, end):
            table[i][j] += 1
for row in table:
    print(' '.join(str(ele) for ele in row))
Simple implementation without using any list
n = int(input())
x = n
for i in range((n // 2 + 1) if n % 2 and n > 1 else n //2):
    for l in range(1, i + 1):
        print(l, end=' ')
    print((str(i + 1) + ' ') * x, end='')
    for r in range(i, 0, -1):
        print(r, end=' ')
    print()
    x -= 2
y = 1
for j in range(n // 2, 0, -1):
    for l in range(1, j):
        print(l, end=' ')
    print((str(j) + ' ') * (2 * y + 1 if n % 2 else 2 * y), end='')
    for r in range(j-1, 0, -1):
        print(r, end=' ')
    print()
    y += 1

Generate binary strings that are at least in d hamming distance using ECC

I want to generate binary strings of length n=128 with the property that any pair of such strings are at least in d=10 hamming distance.
For this I am trying to use an Error Correcting Code (ECC) with minimum distance d=10. However, I cannot find any ecc that has code words of 128 bit length. If the code word length (n) and d are a little bit smaller/greater than 128 and 10, that still works for me.
Is there any ecc with this (similar) properties? Is there any python implementation of this?
Reed-Muller codes RM(3,7) have:
a block size of 128 bits
a minimum distance of 16
a message size of 64 bits
First construct a basis like this:
def popcnt(x):
    return bin(x).count("1")
basis = []
by_ones = list(range(128))
by_ones.sort(key=popcnt)
for i in by_ones:
    count = popcnt(i)
    if count > 3:
        break
    if count <= 1:
        basis.append(((1 << 128) - 1) // ((1 << i) | 1))
    else:
        p = ((1 << 128) - 1)
        for b in [basis[k + 1] for k in range(7) if ((i >> k) & 1) != 0]:
            p = p & b
        basis.append(p)
Then you can use any linear combination of them, which are created by XORing subsets of rows of the basis, for example:
def encode(x, basis):
    # requires x < (1 << 64)
    r = 0
    for i in range(len(basis)):
        if ((x >> i) & 1) != 0:
            r = r ^ basis[i]
    return r
In some other implementation I found this was done by taking dot products with columns of the basis matrix and then reducing modulo 2. I don't know why they do that, it seems much easier to do it more directly by summing a subset of rows.
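To sanity-check the construction, the sketch below rebuilds the same basis in a self-contained form and measures the smallest pairwise Hamming distance over a sample of random codewords. The helper name build_basis, the sample size of 200 and the fixed seed are assumptions for this example. The sample can only confirm the bound of 16 on the pairs it sees; the guarantee for all pairs comes from the code's minimum distance, not from this test.

```python
import random


def popcnt(x):
    return bin(x).count("1")


def build_basis():
    """Rows of a generator matrix for RM(3,7), each stored as a 128-bit int."""
    full = (1 << 128) - 1
    basis = []
    # Process indices by popcount so the 7 single-variable rows come first.
    for i in sorted(range(128), key=popcnt):
        count = popcnt(i)
        if count > 3:
            break
        if count <= 1:
            basis.append(full // ((1 << i) | 1))
        else:
            p = full
            for k in range(7):
                if (i >> k) & 1:
                    p &= basis[k + 1]   # AND of the variables in this monomial
            basis.append(p)
    return basis


def encode(x, basis):
    """XOR together the basis rows selected by the bits of x (x < 2**64)."""
    r = 0
    for i, row in enumerate(basis):
        if (x >> i) & 1:
            r ^= row
    return r


basis = build_basis()
rng = random.Random(0)          # fixed seed, only to make the run repeatable
messages = set()
while len(messages) < 200:      # 200 distinct 64-bit messages
    messages.add(rng.getrandbits(64))
codewords = [encode(m, basis) for m in messages]

min_dist = min(
    popcnt(a ^ b)
    for idx, a in enumerate(codewords)
    for b in codewords[idx + 1:]
)
print(len(basis), min_dist)
```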
I needed the exact same thing. For me the naive approach worked very well! Simply generate random bit strings and check hamming distance between them, gradually building a list of strings that fulfills the requirement:
import numpy as np

def random_binary_array(width):
    """Generate random binary array of specific width"""
    # You can enforce additional array level constraints here
    return np.random.randint(2, size=width)
def hamming2(s1, s2):
    """Calculate the Hamming distance between two bit arrays"""
    assert len(s1) == len(s2)
    # return sum(c1 != c2 for c1, c2 in zip(s1, s2)) # Wikipedia solution
    return np.count_nonzero(s1 != s2) # a faster solution
def generate_hamm_arrays(n_values, size, min_hamming_dist=5):
    """
    Generate a list of binary arrays ensuring minimal hamming distance between the arrays.
    """
    hamm_list = []
    while len(hamm_list) < size:
        test_candidate = random_binary_array(n_values)
        valid = True
        for word in hamm_list:
            # Note: "<=" also rejects a distance equal to min_hamming_dist,
            # so accepted arrays are strictly farther apart than that value
            if (word == test_candidate).all() or hamming2(word, test_candidate) <= min_hamming_dist:
                valid = False
                break
        if valid:
            hamm_list.append(test_candidate)
    return np.array(hamm_list)
print(generate_hamm_arrays(16, 10))
Output:
[[0 0 1 1 0 1 1 1 0 1 0 1 1 1 1 1]
[1 0 1 0 0 1 0 0 0 1 0 0 1 0 1 1]
[1 1 0 0 0 0 1 0 0 0 1 1 1 1 0 0]
[1 0 0 1 1 0 0 1 1 0 0 1 1 1 0 1]
[0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 1]
[1 1 0 0 0 0 0 1 0 1 1 1 0 1 1 1]
[1 1 0 1 0 1 0 1 1 1 1 0 0 1 0 0]
[0 1 1 1 1 1 1 0 0 0 1 1 0 0 0 0]
[1 1 0 0 0 0 1 1 1 0 0 1 0 0 0 1]
[0 1 0 1 1 0 1 1 1 1 1 1 1 1 1 0]]
And it's not too slow as long as you don't want a very dense list of strings (a small number of bits in a string combined with a large Hamming distance). Your specification (128-bit strings with Hamming distance 10) is no problem: generating 100 strings, repeated 10 times (1000 strings in total), takes under 0.2 seconds on a really weak CPU:
import timeit
timeit.timeit(lambda: generate_hamm_arrays(n_values=128, size=100, min_hamming_dist=10), number=10)
>> 0.19202665984630585
Hope this solution is sufficient for you too.
My O(n*n!) solution (works in a reasonable time for N<14)
import numpy as np
from tqdm import tqdm

def hammingDistance(n1, n2):
    return bin(np.bitwise_xor(n1, n2)).count("1")
N = 10 # binary code of length N
D = 6 # with minimum distance D
M = 2**N # number of unique codes in general
# construct hamming distance matrix
A = np.zeros((M, M), dtype=int)
for i in range(M):
    for j in range(i+1, M):
        A[i, j] = hammingDistance(i, j)
A += A.T
def recursivly_find_legit_numbers(nums, codes=set()):
    codes_to_probe = nums
    for num1 in nums:
        codes.add(num1)
        codes_to_probe = codes_to_probe - {num1}
        for num2 in nums - {num1}:
            if A[num1, num2] < D:
                "Distance isn't sufficient, remove this number from set"
                codes_to_probe = codes_to_probe - {num2}
        if len(codes_to_probe):
            recursivly_find_legit_numbers(codes_to_probe, codes)
        return codes
group_of_codes = {}
for i in tqdm(range(M)):
    satisfying_numbers = np.where(A[i] >= D)[0]
    satisfying_numbers = satisfying_numbers[satisfying_numbers > i]
    nums = set(satisfying_numbers)
    if len(nums) == 0:
        continue
    group_of_codes[i] = recursivly_find_legit_numbers(nums, set())
    group_of_codes[i].add(i)
largest_group = 0
for i, nums in group_of_codes.items():
    if len(nums) > largest_group:
        largest_group = len(nums)
        ind = i
print(f"largest group for N={N} and D={D}: {largest_group}")
print("Number of unique groups:", len(group_of_codes))
largest group for N=10 and D=6: 6
Number of unique groups: 992
# generate largest group of codes
[format(num, f"0{N}b") for num in group_of_codes[ind]]
['0110100001',
'0001000010',
'1100001100',
'1010010111',
'1111111010',
'0001111101']

How do I add to a grid coordinate in python?

What I'm trying to do is have a 2D array and for every coordinate in the array, ask all the other 8 coordinates around it if they have stored a 1 or a 0. Similar to a minesweeper looking for mines.
I used to have this:
grid = []
for fila in range(10):
    grid.append([])
    for columna in range(10):
        grid[fila].append(0)
#edited
for fila in range (10):
    for columna in range (10):
        neighbour = 0
        for i in range 10:
            for j in range 10:
                if gird[fila + i][columna + j] == 1
                    neighbour += 1
But something didn't work well. I also had print statements to try to find the error that way, but I still didn't understand why it only ran half of the for loop. So I changed the second for loop to this:
#edited
for fila in range (10):
    for columna in range (10):
        neighbour = 0
        if grid[fila - 1][columna - 1] == 1:
            neighbour += 1
        if grid[fila - 1][columna] == 1:
            neighbour += 1
        if grid[fila - 1][columna + 1] == 1:
            neighbour += 1
        if grid[fila][columna - 1] == 1:
            neighbour += 1
        if grid[fila][columna + 1] == 1:
            neighbour += 1
        if grid[fila + 1][columna - 1] == 1:
            neighbour += 1
        if grid[fila + 1][columna] == 1:
            neighbour += 1
        if grid[fila + 1][columna + 1] == 1:
            neighbour += 1
And got this error:
if grid[fila - 1][columna + 1] == 1:
IndexError: list index out of range
It seems like I can't add on the grid coordinates but I can subtract. Why is that?
Valid indices in Python are -len(grid) to len(grid)-1. The positive indices access elements by offset from the front, the negative ones from the rear. Adding gives a range error if the index is greater than len(grid)-1, and that is what you see. Subtracting does not give you a range error unless you get an index value less than -len(grid). Although you do not check for the lower bound, which is 0 (zero), it seems to work for you because small negative indices return values from the rear end. This is a silent error that leads to wrong neighborhood results.
If you are computing offsets, you need to make sure your offsets are within the bounds of the lists you have. So if you have 10 elements, don't try to access the 11th element.
import collections
grid_offset = collections.namedtuple('grid_offset', 'dr dc')
Grid = [[0 for c in range(10)] for r in range(10)]
Grid_height = len(Grid)
Grid_width = len(Grid[0])
Neighbors = [
    grid_offset(dr, dc)
    for dr in range(-1, 2)
    for dc in range(-1, 2)
    if not dr == dc == 0
]
def count_neighbors(row, col):
    count = 0
    for nb in Neighbors:
        r = row + nb.dr
        c = col + nb.dc
        if 0 <= r < Grid_height and 0 <= c < Grid_width:
            # Add the value, or just add one?
            count += Grid[r][c]
    return count
Grid[4][6] = 1
Grid[5][4] = 1
Grid[5][5] = 1
for row in range(10):
    for col in range(10):
        print(count_neighbors(row, col), "", end='')
    print()
Prints:
$ python test.py
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 1 1 1 0 0
0 0 0 1 2 3 1 1 0 0
0 0 0 1 1 2 2 1 0 0
0 0 0 1 2 2 1 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
The error is exactly what it says, you need to check if the coordinates fit within the grid:
0 <= i < 10 and 0 <= j < 10
Otherwise you're trying to access an element that doesn't exist in memory, or an element that's not the one you're actually thinking about - Python handles negative indexes, they're counted from the end.
E.g. a[-1] is the last element, exactly the same as a[len(a) - 1].
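A small sketch of both behaviours on a 3 x 3 grid: a negative index silently wraps around to the other side, an index past the end raises IndexError, and a bounds-checked lookup avoids both. The helper name cell and its default argument are assumptions made for this example.

```python
def cell(grid, r, c, default=0):
    """Return grid[r][c], or default when (r, c) lies outside the grid."""
    if 0 <= r < len(grid) and 0 <= c < len(grid[r]):
        return grid[r][c]
    return default


grid = [[0] * 3 for _ in range(3)]
grid[2][2] = 1                  # a single 1 in the bottom-right corner

wrapped = grid[-1][-1]          # negative indices wrap: reads grid[2][2]
guarded = cell(grid, -1, -1)    # bounds check treats it as outside the grid

try:
    grid[0][3]                  # one past the last column
    raised = False
except IndexError:
    raised = True

print(wrapped, guarded, raised)  # 1 0 True
```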

Optimization of minimum cost path not working

I'm trying to write an algorithm that will find the path with minimum cost in an n*n matrix (every coordinate has a pre-defined cost). The cost of a path is defined as the sum of all coordinate costs. The first line of input contains the size of the matrix and the following n lines are table rows. The last two lines of input are 1. begin coordinates 2. end coordinates. Output is the minimum path cost.
Example input :
5
0 1 2 1 1
0 0 1 5 1
1 0 0 1 1
1 1 0 7 0
1 8 0 0 0
0 0
4 4
Output should be 0
This is code with memoization (it works without memoization but it's slow)
import copy
import sys
sys.setrecursionlimit(9000)
INF = 100000
n = int(input())
memo = {}
def input_matrix(n) :
    p = []
    for i in range(n) :
        p.append(list(map(int, input().split())))
    return p
def min_cost(matrix, x, y, end_x, end_y) :
    if x == end_x and y == end_y :
        return 0
    if (x, y) in memo :
        return memo[(x, y)]
    if x == len(matrix) or y == len(matrix) or x == -1 or y == -1 or matrix[y][x] == -1:
        return INF
    z = copy.deepcopy(matrix)
    z[y][x] = -1
    memo[(x, y)] = min(
        min_cost(z, x+1, y, end_x, end_y)+matrix[y][x],
        min_cost(z, x-1, y, end_x, end_y)+matrix[y][x],
        min_cost(z, x, y+1, end_x, end_y)+matrix[y][x],
        min_cost(z, x, y-1, end_x, end_y)+matrix[y][x]
    )
    return memo[(x, y)]
matrix = input_matrix(n)
begin_coords = list(map(int, input().split()))
end_coords = list(map(int, input().split()))
print(min_cost(matrix, begin_coords[0], begin_coords[1], end_coords[0], end_coords[1]))
The problem is that your use of the cache is not correct. Consider the following example, in which your code returns 1 instead of 0:
3
0 0 1
1 0 0
1 1 0
0 0
2 2
If you try to follow the code flow you'll see that your algorithms searches the matrix in the following way:
0 -> 0 -> 1 -> x
|
1 <- 0 <- 0 -> x
|
1 -> 1 -> 0
Moreover you are setting the value in the matrix at -1 when you perform the recursive call, so when you finally reach the goal the matrix is:
-1 -1 -1
-1 -1 -1
-1 -1 0
Sure, you are copying the matrices, but during a recursive call the whole path followed to reach that point will still be -1.
I.e. when your code finds 2, 2 it returns 0. The call on 1, 2 tries to compute the value for 0, 2 but returns inf because the bottom-left corner is -1, the call on 1, 3 and 1, 1 return +inf too. So for x=1, y=2 we get the correct value 1. The code backtracks, obtaining the matrix:
-1 -1 -1
-1 -1 -1
-1 1 0
And we have 1,2 -> 1 in our memo. We have to finish the call for 0, 2, which again tries -1, 2, 0, 3 and 0, 1 all of these return +inf and hence we compute 0 2 -> 2 which is correct.
Now however things start to go wrong. The call at 0, 1 has already tried to go 1, 1 but that returns +inf since the value is set to -1, the same holds for all other recursive calls. Hence we set 0, 1 -> 3 which is wrong.
Basically, by setting the value in the matrix to -1 during recursive calls you prevent the recursive call for 0, 1 from going right and getting the correct value of 1.
The issue appears in the cached version because now every time we return to 0 1 we get the wrong value. Without cache the code is able to reach 0 1 by a path not coming from 1 1 and hence discover that 0 1 -> 1.
Instead of caching I would use a dynamic programming approach. Fill the matrix with +inf values. Start at the goal position and put a 0 there, then compute the neighbouring values by row/column:
def min_cost(matrix, x, y, end_x, end_y):
    n = len(matrix)
    start_x, start_y = x, y  # keep the start, the loops below reuse x and y
    memo = [[float('+inf') for _ in range(n)] for _ in range(n)]
    memo[end_y][end_x] = 0
    changed = True
    while changed:
        changed = False
        for x in range(n):
            for y in range(n):
                m = matrix[y][x]
                old_v = memo[y][x]
                # values of the four orthogonal neighbours inside the grid
                neighbours = [
                    memo[h][k]
                    for h, k in ((y-1, x), (y+1, x), (y, x-1), (y, x+1))
                    if 0 <= h < n and 0 <= k < n
                ]
                if neighbours:
                    memo[y][x] = min(memo[y][x], m + min(neighbours))
                if memo[y][x] != old_v:
                    changed = True
    return memo[start_y][start_x]
However this is still not as efficient as it could be. If you apply dynamic programming correctly you will end up with the Bellman-Ford Algorithm. Your grid is just a graph where each vertex x, y has four outgoing edges (except those on the border).
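When all cell costs are non-negative, as in the example input, Dijkstra's algorithm is a common alternative to Bellman-Ford for this kind of grid graph. The sketch below is an assumption-laden illustration, not the method from the answer: the function name min_cost_dijkstra is made up, and it follows the cost convention of the question's code, where every cell on the path is counted except the end cell.

```python
import heapq


def min_cost_dijkstra(matrix, x, y, end_x, end_y):
    """Cheapest path cost from (x, y) to (end_x, end_y) on an n*n grid.

    Assumes non-negative costs. Leaving a cell costs matrix[row][col],
    so the end cell itself is not counted.
    """
    n = len(matrix)
    dist = [[float('inf')] * n for _ in range(n)]
    dist[y][x] = 0
    heap = [(0, x, y)]                       # (cost so far, column, row)
    while heap:
        d, cx, cy = heapq.heappop(heap)
        if (cx, cy) == (end_x, end_y):
            return d
        if d > dist[cy][cx]:
            continue                         # stale queue entry
        nd = d + matrix[cy][cx]              # cost of stepping out of this cell
        for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
            if 0 <= nx < n and 0 <= ny < n and nd < dist[ny][nx]:
                dist[ny][nx] = nd
                heapq.heappush(heap, (nd, nx, ny))
    return float('inf')


example = [
    [0, 1, 2, 1, 1],
    [0, 0, 1, 5, 1],
    [1, 0, 0, 1, 1],
    [1, 1, 0, 7, 0],
    [1, 8, 0, 0, 0],
]
print(min_cost_dijkstra(example, 0, 0, 4, 4))  # 0
```

Each cell is settled at most once, so no repeated sweeps over the whole matrix are needed.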
