Optimization of minimum cost path not working - python

I'm trying to write an algorithm that finds the minimum-cost path in an n*n matrix (every coordinate has a pre-defined cost). The cost of a path is defined as the sum of the costs of all coordinates on it. The first line of input contains the size of the matrix and the following n lines are the table rows. The last two lines of input are 1. begin coordinates 2. end coordinates. Output is the minimum path cost.
Example input :
5
0 1 2 1 1
0 0 1 5 1
1 0 0 1 1
1 1 0 7 0
1 8 0 0 0
0 0
4 4
Output should be 0
This is code with memoization (it works without memoization but it's slow)
import copy
import sys
sys.setrecursionlimit(9000)
INF = 100000
n = int(input())
memo = {}
def input_matrix(n) :
    p = []
    for i in range(n) :
        p.append(list(map(int, input().split())))
    return p
def min_cost(matrix, x, y, end_x, end_y) :
    if x == end_x and y == end_y :
        return 0
    if (x, y) in memo :
        return memo[(x, y)]
    if x == len(matrix) or y == len(matrix) or x == -1 or y == -1 or matrix[y][x] == -1:
        return INF
    z = copy.deepcopy(matrix)
    z[y][x] = -1
    memo[(x, y)] = min(
        min_cost(z, x+1, y, end_x, end_y)+matrix[y][x],
        min_cost(z, x-1, y, end_x, end_y)+matrix[y][x],
        min_cost(z, x, y+1, end_x, end_y)+matrix[y][x],
        min_cost(z, x, y-1, end_x, end_y)+matrix[y][x]
    )
    return memo[(x, y)]
matrix = input_matrix(n)
begin_coords = list(map(int, input().split()))
end_coords = list(map(int, input().split()))
print(min_cost(matrix, begin_coords[0], begin_coords[1], end_coords[0], end_coords[1]))

The problem is that your use of the cache is not correct. Consider the following example, in which your code returns 1 instead of 0:
3
0 0 1
1 0 0
1 1 0
0 0
2 2
If you try to follow the code flow you'll see that your algorithm searches the matrix in the following way:
0 -> 0 -> 1 -> x
|
1 <- 0 <- 0 -> x
|
1 -> 1 -> 0
Moreover, you are setting the value in the matrix to -1 when you perform the recursive call, so when you finally reach the goal the matrix is:
-1 -1 -1
-1 -1 -1
-1 -1 0
Sure, you are copying the matrices, but during a recursive call the whole path followed to reach that point will still be -1.
I.e. when your code finds 2, 2 it returns 0. The call on 1, 2 tries to compute the value for 0, 2 but gets +inf because the bottom-left corner is -1; the calls on 1, 3 and 1, 1 return +inf too. So for x=1, y=2 we get the correct value 1. The code backtracks, obtaining the matrix:
-1 -1 -1
-1 -1 -1
-1 1 0
Now we have 1, 2 -> 1 in our memo. We still have to finish the call for 0, 2, which tries -1, 2, then 0, 3 and 0, 1; all of these return +inf, and hence we compute 0, 2 -> 2, which is correct.
Now, however, things start to go wrong. The call at 0, 1 has already tried to go to 1, 1, but that returns +inf since the value is set to -1; the same holds for all other recursive calls. Hence we set 0, 1 -> 3, which is wrong.
Basically, by setting the value in the matrix to -1 during recursive calls you have prevented the recursive call for 0, 1 from going right and getting the correct value of 1.
The issue appears in the cached version because now every time we return to 0, 1 we get the wrong value. Without the cache the code is able to reach 0, 1 by a path not coming from 1, 1 and hence discover that 0, 1 -> 1.
Instead of caching I would use a dynamic programming approach. Fill the memo matrix with +inf values. Start at the goal position and put a 0 there, then compute the neighbouring values by row/column:
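def min_cost(matrix, x, y, end_x, end_y):
    n = len(matrix)
    inf = float('+inf')
    memo = [[inf for _ in range(n)] for _ in range(n)]
    memo[end_y][end_x] = 0  # nested lists are indexed as [row][column]
    changed = True
    while changed:
        changed = False
        # cx, cy are loop variables so the start coordinates x, y are not overwritten
        for cx in range(n):
            for cy in range(n):
                m = matrix[cy][cx]
                old_v = memo[cy][cx]
                # only the four orthogonal neighbours can be reached in one step
                neighbours = ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1))
                memo[cy][cx] = min(
                    memo[cy][cx],
                    m + min((memo[h][k] for h, k in neighbours if 0 <= h < n and 0 <= k < n), default=inf)
                )
                if memo[cy][cx] != old_v:
                    changed = True
    return memo[y][x]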
However this is still not as efficient as it could be. If you apply dynamic programming correctly you will end up with the Bellman-Ford Algorithm. Your grid is just a graph where each vertex x, y has four outgoing edges (except those on the border).
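Because every cell cost in the examples is non-negative, Dijkstra's algorithm is another option and avoids repeated sweeps over the grid. The following is a minimal sketch under that assumption, not the code from this answer; the name min_cost_dijkstra is made up, and it keeps the question's convention that each visited cell adds its own cost while the end cell's cost is not counted.

```python
import heapq

def min_cost_dijkstra(matrix, x, y, end_x, end_y):
    """Cheapest path cost from (x, y) to (end_x, end_y) on a square grid.

    Assumes non-negative costs. Leaving a cell pays that cell's cost,
    so the end cell itself contributes nothing (as in the question's code).
    """
    n = len(matrix)
    dist = [[float('inf')] * n for _ in range(n)]
    dist[y][x] = 0
    heap = [(0, x, y)]
    while heap:
        d, cx, cy = heapq.heappop(heap)
        if (cx, cy) == (end_x, end_y):
            return d
        if d > dist[cy][cx]:
            continue  # stale heap entry, a cheaper route was already found
        nd = d + matrix[cy][cx]
        for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
            if 0 <= nx < n and 0 <= ny < n and nd < dist[ny][nx]:
                dist[ny][nx] = nd
                heapq.heappush(heap, (nd, nx, ny))
    return float('inf')

grid = [
    [0, 0, 1],
    [1, 0, 0],
    [1, 1, 0],
]
print(min_cost_dijkstra(grid, 0, 0, 2, 2))  # 0
```

Each cell is settled at most once, so the running time is roughly O(n^2 log n) for an n*n grid instead of one full sweep per improvement.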

Related

Finding a Pattern in a Grid Python [duplicate]

This question already has answers here:
Largest rectangle of 1's in 2d binary matrix
(6 answers)
Closed 2 years ago.
I have randomly generated grid containing 0 and 1:
1 1 0 0 0 1 0 1
1 1 1 0 1 1 1 1
1 0 0 0 1 0 1 1
0 0 1 0 1 0 1 1
1 1 1 1 0 0 1 1
0 0 1 1 1 1 1 0
0 1 0 0 1 0 1 1
How can I iterate through the grid to find the largest cluster of 1s, that is equal or larger than 4 items (across row and column)?
I assume I need to keep a count of each found cluster while iterating and, once it's more than 4 items, record the count in a list and then find the largest number.
The problem is that I cannot figure out how to do so across both rows and columns and record the count. I have iterated through the grid but not sure how to move further than two rows.
For example in the above example, the largest cluster is 8. There are some other clusters in the grid, but they have 4 elements:
A A 0 0 0 1 0 1
A A 1 0 1 1 1 1
1 0 0 0 1 0 1 1
0 0 1 0 1 0 1 1
1 1 B B 0 0 1 1
0 0 B B 1 1 1 0
0 1 0 0 1 0 1 1
The code I tried:
rectcount = []
for row in range(len(grid)):
for num in range(len(grid[row])):
# count = 0
try:
# if grid[row][num] == 1:
# if grid[row][num] == grid[row][num + 1] == grid[row + 1][num] == grid[row + 1][num + 1]:
# count += 1
if grid[row][num] == grid[row][num + 1]:
if grid[row + 1][num] == grid[row][num + 1]:
count += 1
# if grid[row][num] == grid[row][num + 1] and grid[row][num] == grid[row + 1][num]:
# count += 1
else:
count = 0
if grid[row][num] == grid[row + 1][num]:
count += 1
except:
pass
I've implemented three algorithms.
First algorithm is Simple, using easiest approach of nested loops, it has O(N^5) time complexity (where N is one side of input grid, 10 for our case), for our inputs of size 10x10 time of O(10^5) is quite alright. Algo id in code is algo = 0. If you just want to see this algorithm jump to line ------ Simple Algorithm inside code.
Second algorithm is Advanced, using Dynamic Programming approach, its complexity is O(N^3) which is much faster than first algorithm. Algo id in code is algo = 1. Jump to line ------- Advanced Algorithm inside code.
Third algorithm Simple-ListComp I implemented just for fun, it is almost same like Simple, same O(N^5) complexity, but using Python's list comprehensions instead of regular loops, that's why it is shorter, also a bit slower because doesn't use some optimizations. Algo id in code is algo = 2. Jump to line ------- Simple-ListComp Algorithm inside code to see algo.
The rest of code, besides algorithms, implements checking correctness of results (double-checking between algorithms), printing results, producing text inputs. Code is split into solving-task function solve() and testing function test(). solve() function has many arguments to allow configuring behavior of function.
All main code lines are documented by comments; read them to learn how to use the code. Basically, if the s variable contains multi-line text with grid elements, same as in your question, you just run solve(s, text = True) and it will solve the task and print results. You may also choose one of the three algorithms (0 (Simple), 1 (Advanced), 2 (Simple-ListComp)) by passing the arguments algo = 0, check = False to the solve function (here 0 selects algo 0). Look at the test() function body to see the simplest example of usage.
Algorithms output to console by default all clusters, from largest to smallest, largest is signified by . symbol, the rest by B, C, D, ..., Z symbols. You may set argument show_non_max = False in solve function if you want only first (largest) cluster to be shown.
I'll explain Simple algorithm:
Basically what algorithm does - it searches through all possible angled 1s rectangles and stores info about maximal of them into ma 2D array. Top-left point of such rectangle is (i, j), top-right - (i, k), bottom-left - (l, j + angle_offset), bottom-right - (l, k + angle_offset), all 4 corners, that's why we have so many loops.
In outer two i (row) , j (column) loops we iterate over whole grid, this (i, j) position will be top-left point of 1s rectangle, we need to iterate whole grid because all possible 1s rectangles may have top-left at any (row, col) point of whole grid. At start of j loop we check that grid at (i, j) position should always contain 1 because inside loops we search for all rectangle with 1s only.
k loop iterates through all possible top-right positions (i, k) of 1s rectangle. We should break out of loop if (i, k) equals to 0 because there is no point to extend k further to right because such rectangle will always contain 0.
In previous loops we fixed top-left and top-right corners of rectangle. Now we need to search for two bottom corners. For that we need to extend rectangle downwards at different angles till we reach first 0.
off loop tries extending rectangle downwards at all possible angles (0 (straight vertical), +1 (45 degrees shifted to the right from top to bottom), -1 (-45 degrees)), off basically is such number that grid[y][x] is "above" (corresponds to by Y) grid[y + 1][x + off].
l tries to extend rectangle downwards (in Y direction) at different angles off. It is extended till first 0 because it can't be extended further then (because each such rectangle will already contain 0).
Inside l loop there is if grid[l][max(0, j + off * (l - i)) : min(k + 1 + off * (l - i), c)] != ones[:k - j + 1]: condition, basically this if is meant to check that last row of rectangle contains all 1 if not this if breaks out of loop. This condition compares two list slices for non-equality. Last row of rectangle spans from point (l, j + angle_offset) (expression max(0, j + off * (l - i)), max-limited to be 0 <= X) to point (l, k + angle_offset) (expression min(k + 1 + off * (l - i), c), min-limited to be X < c).
Inside l loop there are other lines, ry, rx = l, k + off * (l - i) computes bottom-right point of rectangle (ry, rx) which is (l, k + angle_offset), this (ry, rx) position is used to store found maximum inside ma array, this array stores all maximal found rectangles, ma[ry][rx] contains info about rectangle that has bottom-right at point (ry, rx).
rv = (l + 1 - i, k + 1 - j, off) line computes new possible candidate for ma[ry][rx] array entry, possible because ma[ry][rx] is updated only if new candidate has larger area of 1s. Here rv[0] value inside rv tuple contains height of such rectangle, rv[1] contains width of such rectangle (width equals to the length of bottom row of rectangle), rv[2] contains angle of such rectangle.
Condition if rv[0] * rv[1] > ma[ry][rx][0] * ma[ry][rx][1]: and its body just checks if rv area is larger than current maximum inside array ma[ry][rx] and if it is larger then this array entry is updated (ma[ry][rx] = rv). I'll remind that ma[ry][rx] contains info (width, height, angle) about current found maximal-area rectangle that has bottom-right point at (ry, rx) and that has these width, height and angle.
Done! After algorithm run array ma contains information about all maximal-area angled rectangles (clusters) of 1s so that all clusters can be restored and printed later to console. Largest of all such 1s-clusters is equal to some rv0 = ma[ry0][rx0], just iterate once through all elements of ma and find such point (ry0, rx0) so that ma[ry0][rx0][0] * ma[ry0][rx0][1] (area) is maximal. Then largest cluster will have bottom-right point (ry0, rx0), bottom-left point (ry0, rx0 - rv0[1] + 1), top-right point (ry0 - rv0[0] + 1, rx0 - rv0[2] * (rv0[0] - 1)), top-left point (ry0 - rv0[0] + 1, rx0 - rv0[1] + 1 - rv0[2] * (rv0[0] - 1)) (here rv0[2] * (rv0[0] - 1) is just angle offset, i.e. how much shifted is first row along X compared to last row of rectangle).
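For comparison, the axis-aligned case alone (angle 0) can be handled with running column heights. The sketch below is not the Simple or Advanced algorithm from this answer; the function name largest_rectangle and the layout of the returned tuple are made up for illustration, and the requirement of at least 2 ones along each axis is carried over from the comments in the code.

```python
def largest_rectangle(grid, one=1):
    """Return (area, height, width, bottom_row, right_col) of the largest
    axis-aligned all-ones rectangle that is at least 2x2.
    Returns (0, 0, 0, 0, 0) if there is none."""
    rows, cols = len(grid), len(grid[0])
    heights = [0] * cols  # run of consecutive ones ending at the current row
    best = (0, 0, 0, 0, 0)
    for i in range(rows):
        for j in range(cols):
            heights[j] = heights[j] + 1 if grid[i][j] == one else 0
        for j in range(cols):            # j is the right edge
            min_h = heights[j]
            for k in range(j, -1, -1):   # k is the left edge, moving left
                min_h = min(min_h, heights[k])
                if min_h < 2:
                    break                # can no longer be 2 rows tall
                width = j - k + 1
                if width >= 2 and min_h * width > best[0]:
                    best = (min_h * width, min_h, width, i, j)
    return best

text = """
1 1 0 0 0 1 0 1
1 1 1 0 1 1 1 1
1 0 0 0 1 0 1 1
0 0 1 0 1 0 1 1
1 1 1 1 0 0 1 1
0 0 1 1 1 1 1 0
0 1 0 0 1 0 1 1
"""
grid = [[int(v) for v in line.split()] for line in text.strip().splitlines()]
print(largest_rectangle(grid))  # (8, 4, 2, 4, 7)
```

On the grid from the question this reports area 8, size (4, 2) and lower-right corner (4, 7). It cannot find the angled clusters that the algorithms in this answer search for.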
# ----------------- Main function solving task -----------------
def solve(
grid, *,
algo = 1, # Choose algorithm, 0 - Simple, 1 - Advanced, 2 - Simple-ListComp
check = True, # If True run all algorithms and check that they produce same results, otherwise run just chosen algorithm without checking
text = False, # If true then grid is a multi-line text (string) having grid elements separated by spaces
print_ = True, # Print results to console
show_non_max = True, # When printing if to show all clusters, not just largest, as B, C, D, E... (chars from "cchars")
cchars = ['.'] + [chr(ii) for ii in range(ord('B'), ord('Z') + 1)], # Clusters-chars, these chars are used to show clusters from largest to smallest
one = None, # Value of "one" inside grid array, e.g. if you have grid with chars then one may be equal to "1" string. Defaults to 1 (for non-text) or "1" (for text).
offs = [0, +1, -1], # All offsets (angles) that need to be checked, "off" is such that grid[i + 1][j + off] corresponds to next row of grid[i][j]
debug = False, # If True, extra debug info is printed
):
# Preparing
assert algo in [0, 1, 2], algo
if text:
grid = [l.strip().split() for l in grid.splitlines() if l.strip()]
if one is None:
one = 1 if not text else '1'
r, c = len(grid), len(grid[0])
sgrid = '\n'.join([''.join([str(grid[ii][jj]) for jj in range(c)]) for ii in range(r)])
mas, ones = [], [one] * max(c, r)
# ----------------- Simple Algorithm, O(N^5) Complexity -----------------
if algo == 0 or check:
ma = [[(0, 0, 0) for jj in range(c)] for ii in range(r)] # Array containing maximal answers, Lower-Right corners
for i in range(r):
for j in range(c):
if grid[i][j] != one:
continue
for k in range(j + 1, c): # Ensure at least 2 ones along X
if grid[i][k] != one:
break
for off in offs:
for l in range(i + 1, r): # Ensure at least 2 ones along Y
if grid[l][max(0, j + off * (l - i)) : min(k + 1 + off * (l - i), c)] != ones[:k - j + 1]:
l -= 1
break
ry, rx = l, k + off * (l - i)
rv = (l + 1 - i, k + 1 - j, off)
if rv[0] * rv[1] > ma[ry][rx][0] * ma[ry][rx][1]:
ma[ry][rx] = rv
mas.append(ma)
ma = None
# ----------------- Advanced Algorithm using Dynamic Programming, O(N^3) Complexity -----------------
if algo == 1 or check:
ma = [[(0, 0, 0) for jj in range(c)] for ii in range(r)] # Array containing maximal answers, Lower-Right corners
for off in offs:
d = [[(0, 0, 0) for jj in range(c)] for ii in range(c)]
for i in range(r):
f, d_ = 0, [[(0, 0, 0) for jj in range(c)] for ii in range(c)]
for j in range(c):
if grid[i][j] != one:
f = j + 1
continue
if f >= j:
# Check that we have at least 2 ones along X
continue
df = [(0, 0, 0) for ii in range(c)]
for k in range(j, -1, -1):
t0 = d[j - off][max(0, k - off)] if 0 <= j - off < c and k - off < c else (0, 0, 0)
if k >= f:
t1 = (t0[0] + 1, t0[1], off) if t0 != (0, 0, 0) else (0, 0, 0)
t2 = (1, j - k + 1, off)
t0 = t1 if t1[0] * t1[1] >= t2[0] * t2[1] else t2
# Ensure that we have at least 2 ones along Y
t3 = t1 if t1[0] > 1 else (0, 0, 0)
if k < j and t3[0] * t3[1] < df[k + 1][0] * df[k + 1][1]:
t3 = df[k + 1]
df[k] = t3
else:
t0 = d_[j][k + 1]
if k < j and t0[0] * t0[1] < d_[j][k + 1][0] * d_[j][k + 1][1]:
t0 = d_[j][k + 1]
d_[j][k] = t0
if ma[i][j][0] * ma[i][j][1] < df[f][0] * df[f][1]:
ma[i][j] = df[f]
d = d_
mas.append(ma)
ma = None
# ----------------- Simple-ListComp Algorithm using List Comprehension, O(N^5) Complexity -----------------
if algo == 2 or check:
ma = [
[
max([(0, 0, 0)] + [
(h, w, off)
for h in range(2, i + 2)
for w in range(2, j + 2)
for off in offs
if all(
cr[
max(0, j + 1 - w - off * (h - 1 - icr)) :
max(0, j + 1 - off * (h - 1 - icr))
] == ones[:w]
for icr, cr in enumerate(grid[max(0, i + 1 - h) : i + 1])
)
], key = lambda e: e[0] * e[1])
for j in range(c)
]
for i in range(r)
]
mas.append(ma)
ma = None
# ----------------- Checking Correctness and Printing Results -----------------
if check:
# Check that we have same answers for all algorithms
masx = [[[cma[ii][jj][0] * cma[ii][jj][1] for jj in range(c)] for ii in range(r)] for cma in mas]
assert all([masx[0] == e for e in masx[1:]]), 'Maximums of algorithms differ!\n\n' + sgrid + '\n\n' + (
'\n\n'.join(['\n'.join([' '.join([str(e1).rjust(2) for e1 in e0]) for e0 in cma]) for cma in masx])
)
ma = mas[0 if not check else algo]
if print_:
cchars = ['.'] + [chr(ii) for ii in range(ord('B'), ord('Z') + 1)] # These chars are used to show clusters from largest to smallest
res = [[grid[ii][jj] for jj in range(c)] for ii in range(r)]
mac = [[ma[ii][jj] for jj in range(c)] for ii in range(r)]
processed = set()
sid = 0
for it in range(r * c):
sma = sorted(
[(mac[ii][jj] or (0, 0, 0)) + (ii, jj) for ii in range(r) for jj in range(c) if (ii, jj) not in processed],
key = lambda e: e[0] * e[1], reverse = True
)
if len(sma) == 0 or sma[0][0] * sma[0][1] <= 0:
break
maxv = sma[0]
if it == 0:
maxvf = maxv
processed.add((maxv[3], maxv[4]))
show = True
for trial in [True, False]:
for i in range(maxv[3] - maxv[0] + 1, maxv[3] + 1):
for j in range(maxv[4] - maxv[1] + 1 - (maxv[3] - i) * maxv[2], maxv[4] + 1 - (maxv[3] - i) * maxv[2]):
if trial:
if mac[i][j] is None:
show = False
break
elif show:
res[i][j] = cchars[sid]
mac[i][j] = None
if show:
sid += 1
if not show_non_max and it == 0:
break
res = '\n'.join([''.join([str(res[ii][jj]) for jj in range(c)]) for ii in range(r)])
print(
'Max:\nArea: ', maxvf[0] * maxvf[1], '\nSize Row,Col: ', (maxvf[0], maxvf[1]),
'\nLowerRight Row,Col: ', (maxvf[3], maxvf[4]), '\nAngle: ', ("-1", " 0", "+1")[maxvf[2] + 1], '\n', sep = ''
)
print(res)
if debug:
# Print all computed maximums, for debug purposes
for cma in [ma, mac]:
print('\n' + '\n'.join([' '.join([f'({e0[0]}, {e0[1]}, {("-1", " 0", "+1")[e0[2] + 1]})' for e0_ in e for e0 in (e0_ or ('-', '-', 0),)]) for e in cma]))
print(end = '-' * 28 + '\n')
return ma
# ----------------- Testing -----------------
def test():
# Iterating over text inputs or other ways of producing inputs
for s in [
"""
1 1 0 0 0 1 0 1
1 1 1 0 1 1 1 1
1 0 0 0 1 0 1 1
0 0 1 0 1 0 1 1
1 1 1 1 0 0 1 1
0 0 1 1 1 1 1 0
0 1 0 0 1 0 1 1
""",
"""
1 0 1 1 0 1 0 0
0 1 1 0 1 0 0 1
1 1 0 0 0 0 0 1
0 1 1 1 0 1 0 1
0 1 1 1 1 0 1 1
1 1 0 0 0 1 0 0
0 1 1 1 0 1 0 1
""",
"""
0 1 1 0 1 0 1 1
0 0 1 1 0 0 0 1
0 0 0 1 1 0 1 0
1 1 0 0 1 1 1 0
0 1 1 0 0 1 1 0
0 0 1 0 1 0 1 1
1 0 0 1 0 0 0 0
0 1 1 0 1 1 0 0
"""
]:
solve(s, text = True)
if __name__ == '__main__':
test()
Output:
Max:
Area: 8
Size Row,Col: (4, 2)
LowerRight Row,Col: (4, 7)
Angle: 0
CC000101
CC1011..
100010..
001010..
1BBB00..
00BBBDD0
010010DD
----------------------------
Max:
Area: 6
Size Row,Col: (3, 2)
LowerRight Row,Col: (2, 1)
Angle: -1
10..0100
0..01001
..000001
0BBB0101
0BBB1011
CC000100
0CC10101
----------------------------
Max:
Area: 12
Size Row,Col: (6, 2)
LowerRight Row,Col: (5, 7)
Angle: +1
0..01011
00..0001
000..010
BB00..10
0BB00..0
001010..
10010000
01101100
----------------------------

Generate binary strings that are at least in d hamming distance using ECC

I want to generate binary strings of length n=128 with the property that any pair of such strings is at least d=10 apart in Hamming distance.
For this I am trying to use an Error Correcting Code (ECC) with minimum distance d=10. However, I cannot find any ECC that has code words of 128-bit length. If the code word length (n) and d are a little smaller or greater than 128 and 10, that still works for me.
Is there any ECC with these (or similar) properties? Is there any Python implementation of it?
Reed-Muller codes RM(3,7) have:
a block size of 128 bits
a minimum distance of 16
a message size of 64 bits
First construct a basis like this:
def popcnt(x):
    return bin(x).count("1")
basis = []
by_ones = list(range(128))
by_ones.sort(key=popcnt)
for i in by_ones:
    count = popcnt(i)
    if count > 3:
        break
    if count <= 1:
        basis.append(((1 << 128) - 1) // ((1 << i) | 1))
    else:
        p = ((1 << 128) - 1)
        for b in [basis[k + 1] for k in range(7) if ((i >> k) & 1) != 0]:
            p = p & b
        basis.append(p)
Then you can use any linear combination of them, which are created by XORing subsets of rows of the basis, for example:
def encode(x, basis):
    # requires x < (1 << 64)
    r = 0
    for i in range(len(basis)):
        if ((x >> i) & 1) != 0:
            r = r ^ basis[i]
    return r
In some other implementation I found this was done by taking dot products with columns of the basis matrix and then reducing modulo 2. I don't know why they do that, it seems much easier to do it more directly by summing a subset of rows.
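To sanity-check the construction, one can encode a batch of random messages and measure the smallest pairwise Hamming distance. The sketch below repackages the basis construction above into a helper called rm_basis (a made-up name, as are the parameters r and m) and relies on the standard Reed-Muller facts that RM(r, m) has dimension sum of C(m, i) for i <= r and minimum distance 2^(m-r).

```python
import random
from math import comb

def popcnt(x):
    return bin(x).count("1")

def rm_basis(r=3, m=7):
    """Basis rows of RM(r, m) as Python ints of 2**m bits each."""
    n = 1 << m
    full = (1 << n) - 1
    # Row for variable k: alternating blocks of 2**k ones and 2**k zeros.
    var_rows = [full // ((1 << (1 << k)) | 1) for k in range(m)]
    basis = []
    for i in sorted(range(n), key=popcnt):
        if popcnt(i) > r:
            break
        p = full
        for k in range(m):
            if (i >> k) & 1:
                p &= var_rows[k]  # product of the selected variable rows
        basis.append(p)
    return basis

def encode(x, basis):
    # XOR together the basis rows selected by the bits of x
    r = 0
    for i, row in enumerate(basis):
        if (x >> i) & 1:
            r ^= row
    return r

basis = rm_basis()
print(len(basis), sum(comb(7, i) for i in range(4)))  # 64 64

rng = random.Random(0)
messages = {rng.getrandbits(64) for _ in range(200)}  # distinct 64-bit messages
codewords = [encode(x, basis) for x in messages]
min_dist = min(
    popcnt(a ^ b)
    for idx, a in enumerate(codewords)
    for b in codewords[idx + 1:]
)
print(min_dist >= 16)  # True
```

Random codewords are usually much further apart than 16; the guarantee of the code is only that no pair is closer than that.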
I needed the exact same thing. For me the naive approach worked very well! Simply generate random bit strings and check hamming distance between them, gradually building a list of strings that fulfills the requirement:
import numpy as np
def random_binary_array(width):
    """Generate random binary array of specific width"""
    # You can enforce additional array level constraints here
    return np.random.randint(2, size=width)
def hamming2(s1, s2):
    """Calculate the Hamming distance between two bit arrays"""
    assert len(s1) == len(s2)
    # return sum(c1 != c2 for c1, c2 in zip(s1, s2)) # Wikipedia solution
    return np.count_nonzero(s1 != s2) # a faster solution
def generate_hamm_arrays(n_values, size, min_hamming_dist=5):
    """
    Generate a list of binary arrays ensuring minimal hamming distance between the arrays.
    """
    hamm_list = []
    while len(hamm_list) < size:
        test_candidate = random_binary_array(n_values)
        valid = True
        for word in hamm_list:
            # reject only candidates closer than min_hamming_dist, so a distance of exactly min_hamming_dist is accepted
            if (word == test_candidate).all() or hamming2(word, test_candidate) < min_hamming_dist:
                valid = False
                break
        if valid:
            hamm_list.append(test_candidate)
    return np.array(hamm_list)
print(generate_hamm_arrays(16, 10))
Output:
[[0 0 1 1 0 1 1 1 0 1 0 1 1 1 1 1]
[1 0 1 0 0 1 0 0 0 1 0 0 1 0 1 1]
[1 1 0 0 0 0 1 0 0 0 1 1 1 1 0 0]
[1 0 0 1 1 0 0 1 1 0 0 1 1 1 0 1]
[0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 1]
[1 1 0 0 0 0 0 1 0 1 1 1 0 1 1 1]
[1 1 0 1 0 1 0 1 1 1 1 0 0 1 0 0]
[0 1 1 1 1 1 1 0 0 0 1 1 0 0 0 0]
[1 1 0 0 0 0 1 1 1 0 0 1 0 0 0 1]
[0 1 0 1 1 0 1 1 1 1 1 1 1 1 1 0]]
And it's not too slow as long as you don't want a very dense list of strings (a small number of bits per string combined with a large Hamming distance). For your specification (128-bit strings with Hamming distance 10) it is no problem: the timing below generates 100 such strings 10 times over, i.e. 1000 bit strings in total, in under 0.2 seconds on a really weak CPU:
import timeit
timeit.timeit(lambda: generate_hamm_arrays(n_values=128, size=100, min_hamming_dist=10), number=10)
>> 0.19202665984630585
Hope this solution is sufficient for you too.
My O(n*n!) solution (works in a reasonable time for N<14)
import numpy as np
from tqdm import tqdm
def hammingDistance(n1, n2):
    return bin(np.bitwise_xor(n1, n2)).count("1")
N = 10 # binary code of length N
D = 6 # with minimum distance D
M = 2**N # number of unique codes in general
# construct hamming distance matrix
A = np.zeros((M, M), dtype=int)
for i in range(M):
    for j in range(i+1, M):
        A[i, j] = hammingDistance(i, j)
A += A.T
def recursivly_find_legit_numbers(nums, codes=set()):
    codes_to_probe = nums
    for num1 in nums:
        codes.add(num1)
        codes_to_probe = codes_to_probe - {num1}
        for num2 in nums - {num1}:
            if A[num1, num2] < D:
                # Distance isn't sufficient, remove this number from set
                codes_to_probe = codes_to_probe - {num2}
        if len(codes_to_probe):
            recursivly_find_legit_numbers(codes_to_probe, codes)
        # return after the first element: the loop only picks one member of nums
        return codes
group_of_codes = {}
for i in tqdm(range(M)):
    satisfying_numbers = np.where(A[i] >= D)[0]
    satisfying_numbers = satisfying_numbers[satisfying_numbers > i]
    nums = set(satisfying_numbers)
    if len(nums) == 0:
        continue
    group_of_codes[i] = recursivly_find_legit_numbers(nums, set())
    group_of_codes[i].add(i)
largest_group = 0
for i, nums in group_of_codes.items():
    if len(nums) > largest_group:
        largest_group = len(nums)
        ind = i
print(f"largest group for N={N} and D={D}: {largest_group}")
print("Number of unique groups:", len(group_of_codes))
largest group for N=10 and D=6: 6
Number of unique groups: 992
# generate largest group of codes
[format(num, f"0{N}b") for num in group_of_codes[ind]]
['0110100001',
'0001000010',
'1100001100',
'1010010111',
'1111111010',
'0001111101']

Inefficient Regularized Logistic Regression with Numpy

I am a machine learning noob attempting to implement regularized logistic regression via Newton's method.
The data have two features, which are supposed to be expanded to 28 by taking all monomial terms of (u,v) up to degree 6.
My code converges to the correct solution of norm(theta)=0.9384 after around 500 or so iterations when it should only take around 15 for lambda = 10, though the exercise is based on Matlab instead of Python. Each cycle of the parameter update is also very slow with my code and I am not sure exactly why. If anyone could explain why my code takes so many iterations to converge and why each iteration is painfully slow I would be very grateful!
The data are taken from Andrew Ng's open course exercise 5. The problem information and data can be found here http://openclassroom.stanford.edu/MainFolder/DocumentPage.php?course=MachineLearning&doc=exercises/ex5/ex5.html
although I posted the data and my code below.
X data with two features
0.051267,0.69956
-0.092742,0.68494
-0.21371,0.69225
-0.375,0.50219
-0.51325,0.46564
-0.52477,0.2098
-0.39804,0.034357
-0.30588,-0.19225
0.016705,-0.40424
0.13191,-0.51389
0.38537,-0.56506
0.52938,-0.5212
0.63882,-0.24342
0.73675,-0.18494
0.54666,0.48757
0.322,0.5826
0.16647,0.53874
-0.046659,0.81652
-0.17339,0.69956
-0.47869,0.63377
-0.60541,0.59722
-0.62846,0.33406
-0.59389,0.005117
-0.42108,-0.27266
-0.11578,-0.39693
0.20104,-0.60161
0.46601,-0.53582
0.67339,-0.53582
-0.13882,0.54605
-0.29435,0.77997
-0.26555,0.96272
-0.16187,0.8019
-0.17339,0.64839
-0.28283,0.47295
-0.36348,0.31213
-0.30012,0.027047
-0.23675,-0.21418
-0.06394,-0.18494
0.062788,-0.16301
0.22984,-0.41155
0.2932,-0.2288
0.48329,-0.18494
0.64459,-0.14108
0.46025,0.012427
0.6273,0.15863
0.57546,0.26827
0.72523,0.44371
0.22408,0.52412
0.44297,0.67032
0.322,0.69225
0.13767,0.57529
-0.0063364,0.39985
-0.092742,0.55336
-0.20795,0.35599
-0.20795,0.17325
-0.43836,0.21711
-0.21947,-0.016813
-0.13882,-0.27266
0.18376,0.93348
0.22408,0.77997
0.29896,0.61915
0.50634,0.75804
0.61578,0.7288
0.60426,0.59722
0.76555,0.50219
0.92684,0.3633
0.82316,0.27558
0.96141,0.085526
0.93836,0.012427
0.86348,-0.082602
0.89804,-0.20687
0.85196,-0.36769
0.82892,-0.5212
0.79435,-0.55775
0.59274,-0.7405
0.51786,-0.5943
0.46601,-0.41886
0.35081,-0.57968
0.28744,-0.76974
0.085829,-0.75512
0.14919,-0.57968
-0.13306,-0.4481
-0.40956,-0.41155
-0.39228,-0.25804
-0.74366,-0.25804
-0.69758,0.041667
-0.75518,0.2902
-0.69758,0.68494
-0.4038,0.70687
-0.38076,0.91886
-0.50749,0.90424
-0.54781,0.70687
0.10311,0.77997
0.057028,0.91886
-0.10426,0.99196
-0.081221,1.1089
0.28744,1.087
0.39689,0.82383
0.63882,0.88962
0.82316,0.66301
0.67339,0.64108
1.0709,0.10015
-0.046659,-0.57968
-0.23675,-0.63816
-0.15035,-0.36769
-0.49021,-0.3019
-0.46717,-0.13377
-0.28859,-0.060673
-0.61118,-0.067982
-0.66302,-0.21418
-0.59965,-0.41886
-0.72638,-0.082602
-0.83007,0.31213
-0.72062,0.53874
-0.59389,0.49488
-0.48445,0.99927
-0.0063364,0.99927
Y data
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
My code below:
import pandas as pd
import numpy as np
import math
def sigmoid(theta, x):
    return 1/(1 + math.exp(-1*theta.T.dot(x)))
def cost_function(X, y, theta):
    s = 0
    for i in range(m):
        loss = -y[i]*np.log(sigmoid(theta, X[i])) - (1-y[i])*np.log(1-sigmoid(theta, X[i]))
        s += loss
    s /= m
    s += (lamb/(2*m))*sum(theta[j]**2 for j in range(1, 28))
    return s
def gradient(theta, X, y):
    # add regularization terms
    add_column = theta * (lamb/m)
    add_column[0] = 0
    a = sum((sigmoid(theta, X[i]) - y[i])*X[i] + add_column for i in range(m))/m
    return a
def hessian(theta, X, reg_matrix):
    matrix = []
    for i in range(28):
        row = []
        for j in range(28):
            cell = sum(sigmoid(theta, X[k])*(1-sigmoid(theta, X[k]))*X[k][i]*X[k][j] for k in range(m))
            row.append(cell)
        matrix.append(row)
    H = np.array(matrix)
    H = np.add(H, reg_matrix)
    return H
def newtons_method(theta, iterations):
    for i in range(iterations):
        g = gradient(theta, X, y)
        H = hessian(theta, X, reg_matrix)
        theta = theta - np.linalg.inv(H).dot(g)
        cost = cost_function(X,y,theta)
        print(cost)
    return theta
def map_feature(u, v): # expand features according to problem instructions
    new_row = []
    new_row.append(1)
    new_row.append(u)
    new_row.append(v)
    new_row.append(u**2)
    new_row.append(u*v)
    new_row.append(v**2)
    new_row.append(u**3)
    new_row.append(u**2*v)
    new_row.append(u*v**2)
    new_row.append(v**3)
    new_row.append(u**4)
    new_row.append(u**3*v)
    new_row.append(u*v**3)
    new_row.append(v**4)
    new_row.append(u**2*v**2)
    new_row.append(u**5)
    new_row.append(u**4*v)
    new_row.append(u*v**4)
    new_row.append(v**5)
    new_row.append(u**2*v**3)
    new_row.append(u**3*v**2)
    new_row.append(u**6)
    new_row.append(u**5*v)
    new_row.append(u*v**5)
    new_row.append(v**6)
    new_row.append(u**4*v**2)
    new_row.append(u**2*v**4)
    new_row.append(u**3*v**3)
    return np.array(new_row)
with open('ex5Logx.dat', 'r') as f:
    array = []
    for line in f.readlines():
        array.append(line.strip().split(','))
    for a in array:
        a[0], a[1] = float(a[0]), float(a[1].strip())
xdata= np.array(array)
with open('ex5Logy.dat', 'r') as f:
    array = []
    for line in f.readlines():
        array.append(line.strip())
    for i in range(len(array)):
        array[i] = float(array[i])
ydata= np.array(array)
X_df = pd.DataFrame(xdata, columns=['score1', 'score2'])
y_df = pd.DataFrame(ydata, columns=['acceptence'])
m = len(y_df)
iterations = 15
ones = np.ones((m,1)) # intercept term in first column
X = np.array(X_df)
X = np.append(ones, X, axis=1)
y = np.array(y_df).flatten()
new_X = [] # prepare new array for expanded features
for i in range(m):
    new_row = map_feature(X[i][1], X[i][2])
    new_X.append(new_row)
X = np.array(new_X)
theta = np.array([0 for i in range(28)]) # initialize parameters to 0
lamb = 10 # lambda constant for regularization
reg_matrix = np.zeros((28,28),dtype=int) # n+1*n+1 regularization matrix
np.fill_diagonal(reg_matrix, 1)
reg_matrix[0] = 0
reg_matrix = (lamb/m)*reg_matrix
theta = newtons_method(theta, iterations)
print(np.linalg.norm(theta))
I am not 100% sure, but I went through a tutorial on logistic regression using Newton's method (http://thelaziestprogrammer.com/sharrington/math-of-machine-learning/solving-logreg-newtons-method) and its implementation of Newton's method is a little different from yours. Actually there is one major difference: that implementation adds the product of the inverse of the Hessian and the gradient to theta, whereas you are subtracting it. I only know logistic regression the normal way, not using Newton's method. Apart from that, you are using loops in the cost function and the Hessian, which I think can each be done with one NumPy statement instead of looping.
I would suggest referring to the link above, as it does the whole implementation in Python with NumPy and has no loops. The loops you have written are hurting performance.
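As a rough illustration of what a loop-free version could look like, here is a sketch that minimizes the same regularized cost with vectorized NumPy operations. The names newton_logreg and the degree parameter are made up, the data is synthetic (not the exercise files), and map_feature produces the 28 monomials in a possibly different order than the question's version. One deliberate difference from the question's hessian(): the data term is divided by m, matching the 1/m in the gradient and in reg_matrix. That scaling mismatch is one plausible cause of the slow convergence, though this was not verified against the exercise data.

```python
import numpy as np

def map_feature(u, v, degree=6):
    """All monomials u**(i-j) * v**j for i <= degree (28 terms for degree 6)."""
    return np.array([u ** (i - j) * v ** j
                     for i in range(degree + 1) for j in range(i + 1)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def newton_logreg(X, y, lamb, iterations=15):
    """Newton's method for L2-regularized logistic regression (cost is minimized)."""
    m, n = X.shape
    theta = np.zeros(n)
    reg = (lamb / m) * np.eye(n)
    reg[0, 0] = 0.0                      # the intercept is not regularized
    for _ in range(iterations):
        h = sigmoid(X @ theta)           # predictions for all rows at once
        grad = X.T @ (h - y) / m + reg @ theta
        H = (X.T * (h * (1 - h))) @ X / m + reg
        theta = theta - np.linalg.solve(H, grad)  # avoids forming the inverse
    return theta

# Synthetic stand-in for the exercise data (assumption: points in [-1, 1]^2,
# label 1 inside a circle).
rng = np.random.default_rng(0)
uv = rng.uniform(-1, 1, size=(100, 2))
X = np.array([map_feature(u, v) for u, v in uv])
y = (uv[:, 0] ** 2 + uv[:, 1] ** 2 < 0.5).astype(float)

theta = newton_logreg(X, y, lamb=10)
print(X.shape, np.linalg.norm(theta))
```

Because the cost is being minimized, the update subtracts the Newton step; an implementation that maximizes the log-likelihood instead would add it, which may explain the sign difference noted above.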

How do I add to a grid coordinate in python?

What I'm trying to do is have a 2D array and for every coordinate in the array, ask all the other 8 coordinates around it if they have stored a 1 or a 0. Similar to a minesweeper looking for mines.
I used to have this:
grid = []
for fila in range(10):
    grid.append([])
    for columna in range(10):
        grid[fila].append(0)
#edited
for fila in range (10):
    for columna in range (10):
        neighbour = 0
        for i in range(10):
            for j in range(10):
                if grid[fila + i][columna + j] == 1:
                    neighbour += 1
But something didn't work well. I also had print statements to try to find the error that way, but I still didn't understand why it only ran half of the for loop. So I changed the second for loop to this:
#edited
for fila in range (10):
    for columna in range (10):
        neighbour = 0
        if grid[fila - 1][columna - 1] == 1:
            neighbour += 1
        if grid[fila - 1][columna] == 1:
            neighbour += 1
        if grid[fila - 1][columna + 1] == 1:
            neighbour += 1
        if grid[fila][columna - 1] == 1:
            neighbour += 1
        if grid[fila][columna + 1] == 1:
            neighbour += 1
        if grid[fila + 1][columna - 1] == 1:
            neighbour += 1
        if grid[fila + 1][columna] == 1:
            neighbour += 1
        if grid[fila + 1][columna + 1] == 1:
            neighbour += 1
And got this error:
if grid[fila - 1][columna + 1] == 1:
IndexError: list index out of range
It seems like I can't add on the grid coordinates but I can subtract. Why is that?
Valid indices in Python run from -len(grid) to len(grid)-1. Positive indices access elements by offset from the front, negative ones from the rear. Adding gives a range error when the index becomes greater than len(grid)-1, which is what you see. Subtracting does not give a range error unless the index drops below -len(grid). Although you do not check the lower bound, which is 0 (zero), it seems to work for you because small negative indices return values from the rear end. This is a silent error that leads to wrong neighborhood results.
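A minimal sketch of that asymmetry, using a made-up three-element list (`row` and the flag names are illustrative only, not from the question):

```python
row = [10, 20, 30]      # len(row) == 3, so valid indices are -3 .. 2

last = row[-1]          # negative index counts from the end
wrapped = row[0 - 1]    # "left neighbour" of index 0 silently wraps to the last element

try:
    row[2 + 1]          # "right neighbour" of the last index does not exist
    overflow_raised = False
except IndexError:
    overflow_raised = True

try:
    row[-4]             # one step below -len(row) is out of range too
    underflow_raised = False
except IndexError:
    underflow_raised = True

print(last, wrapped, overflow_raised, underflow_raised)
```

This prints `30 30 True True`: subtracting at the left edge returns a value from the other end without complaint, while adding at the right edge raises immediately.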
If you are computing offsets, you need to make sure your offsets are within the bounds of the lists you have. So if you have 10 elements, don't try to access the 11th element.
import collections
grid_offset = collections.namedtuple('grid_offset', 'dr dc')
Grid = [[0 for c in range(10)] for r in range(10)]
Grid_height = len(Grid)
Grid_width = len(Grid[0])
Neighbors = [
    grid_offset(dr, dc)
    for dr in range(-1, 2)
    for dc in range(-1, 2)
    if not dr == dc == 0
]
def count_neighbors(row, col):
    count = 0
    for nb in Neighbors:
        r = row + nb.dr
        c = col + nb.dc
        if 0 <= r < Grid_height and 0 <= c < Grid_width:
            # Add the value, or just add one?
            count += Grid[r][c]
    return count
Grid[4][6] = 1
Grid[5][4] = 1
Grid[5][5] = 1
for row in range(10):
    for col in range(10):
        print(count_neighbors(row, col), "", end='')
    print()
Prints:
$ python test.py
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 1 1 1 0 0
0 0 0 1 2 3 1 1 0 0
0 0 0 1 1 2 2 1 0 0
0 0 0 1 2 2 1 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
The error is exactly what it says: you need to check that the coordinates fit within the grid:
0 <= i < 10 and 0 <= j < 10
Otherwise you're either trying to access an element that doesn't exist, or getting an element other than the one you intended: Python accepts negative indexes and counts them from the end.
E.g. a[-1] is the last element, exactly the same as a[len(a) - 1].
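As a rough sketch, the check could be applied to the loop from the question like this. The helper name `count_neighbours` and the constant `SIZE` are hypothetical, and the two cells set to 1 are only there to show the effect at the corners:

```python
SIZE = 10  # assumed 10x10 grid, as in the question

grid = [[0] * SIZE for _ in range(SIZE)]
grid[0][0] = 1
grid[9][9] = 1

def count_neighbours(grid, fila, columna):
    """Count the cells equal to 1 among the up-to-8 cells around (fila, columna)."""
    neighbour = 0
    for i in (-1, 0, 1):
        for j in (-1, 0, 1):
            if i == 0 and j == 0:
                continue  # skip the cell itself
            r, c = fila + i, columna + j
            # Only look at coordinates that really are inside the grid.
            if 0 <= r < SIZE and 0 <= c < SIZE and grid[r][c] == 1:
                neighbour += 1
    return neighbour

print(count_neighbours(grid, 0, 0))  # corner cell: grid[-1][-1] is never consulted
print(count_neighbours(grid, 1, 1))
```

Without the bounds check, the call for (0, 0) would read grid[-1][-1], which is grid[9][9], and count a neighbour from the opposite corner.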

Checking diagonally in nqueen

I have a fragment of my code where I wrote functions to check the rows, columns and diagonals for the queen placement so that the queens will not attack each other. Currently I'm having issues with the diagonal function:
def checkDiagonal(T):
    for i in range(len(T) - 1):
        if abs(T[i] - T[i + 1]) == 1:
            return False
    return True
The problem with this function is that it only catches queens that are one step apart diagonally, not queens that are further apart.
Example, if N = 7 it prints:
Enter the value of N: 7
0 Q 0 0 0 0 0
0 0 0 0 0 0 0
0 0 X 0 0 0 0
0 0 X 0 0 0 0
0 0 X 0 0 0 0
0 0 X 0 0 0 0
Q 0 0 0 0 0 0
The Q's in the output are the partial solution I set in the code. Each X is a possible next position for the queen, but one X in the output is clearly diagonal to a queen and would be attacked.
The partial solution list is [6,0]; this is what gets passed to the function as T.
Two points (x1, y1) and (x2, y2) are on the same lower left -> upper right diagonal if and only if y1 - x1 == y2 - x2.
If I understand your question correctly, the partial solution T = [0,6] would represent the points [(0,0), (1,6)]. Since 0 - 0 = 0 != 5 == 6 - 1, these two elements are not on the same diagonal.
However, for the partial solution [0, 6, 2] = [(0,0), (1,6), (2,2)] we have 0 - 0 == 0 == 2 - 2, and hence the first and third points are on the same lower left -> upper right diagonal.
For the upper left -> lower right diagonal you would then have to find a similar condition, which I think you should be able to figure out, but let me know if you don't manage to find it.
This would lead to something like the following code (only for this diagonal):
def checkDiagonal(T):
    for i in range(len(T) - 1):
        for j in range(i + 1, len(T)):
            if T[i] - i == T[j] - j:
                return False
    return True
Be careful however that I didn't have time to test this, so there might be small errors in it, but the general idea should be right.
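For reference, one possible way to cover both diagonal directions in a single test is to compare the row distance with the column distance. This is only a sketch: the name `check_both_diagonals` is hypothetical, and it assumes T[i] is the row of the queen placed in column i, as the [6,0] example in the question suggests.

```python
def check_both_diagonals(T):
    """Return True if no two queens in T share a diagonal.

    Assumes T[i] is the row of the queen in column i (hypothetical helper).
    """
    for i in range(len(T) - 1):
        for j in range(i + 1, len(T)):
            # Two queens share a diagonal (in either direction) exactly when
            # their row distance equals their column distance.
            if abs(T[i] - T[j]) == abs(i - j):
                return False
    return True


print(check_both_diagonals([6, 0, 2]))  # no pair shares a diagonal
print(check_both_diagonals([6, 0, 4]))  # columns 0 and 2: rows 6 and 4 share a diagonal
```

This prints `True` then `False`. The second case is the one the adjacent-only check in the question misses, because the two queens are two columns apart.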
