I am trying to solve a system of XOR equations. For example:
A = [[1, 1, 1, 0, 0], [0, 1, 1, 1, 0], [0, 0, 1, 1, 1], [0, 1, 1, 0, 1], [0, 1, 0, 1, 1]]
s = [3, 14, 13, 5, 2]
m = 5 # len(s)
Ax = s => x = [12, 9, 6, 1, 10]
I tried 2 ways:
The first way is Gaussian elimination (~2.5 seconds), which was shown here.
The second way is to invert the matrix A modulo 2 and then XOR-multiply A_invert with s (~7.5 seconds).
Could you please show me whether there is a way, or a Python library, to speed this up? I even tried the gmpy2 library, but it did not reduce the time much. I have included the Python code below so that you can follow it easily.
Using Gaussian elimination:
def SolveLinearSystem(A, B, N):
    for K in range(0, N):
        # If the pivot is zero, swap in a row below with a nonzero entry
        if (A[K][K] == 0):
            for i in range(K + 1, N):
                if (A[i][K] != 0):
                    for L in range(0, N):
                        s = A[K][L]
                        A[K][L] = A[i][L]
                        A[i][L] = s
                    s = B[i]
                    B[i] = B[K]
                    B[K] = s
                    break
        # Eliminate column K from every other row (XOR is addition mod 2)
        for I in range(0, N):
            if (I != K):
                if (A[I][K]):
                    for M in range(K, N):
                        A[I][M] = A[I][M] ^ A[K][M]
                    B[I] = B[I] ^ B[K]

SolveLinearSystem(A, s, 5)
Using Inversion
def identitymatrix(n):
    return [[int(x == y) for x in range(0, n)] for y in range(0, n)]

def multiply_vector_scalar(vector, scalar, q):
    kq = []
    for i in range(0, len(vector)):
        kq.append(vector[i] * scalar % q)
    return kq

def minus_vector_scalar(vector1, scalar, vector2, q):
    kq = []
    for i in range(0, len(vector1)):
        kq.append((vector1[i] - scalar * vector2[i]) % q)
    return kq
import gmpy2

def inversematrix(matrix, q):
    n = len(matrix)
    A = []
    for j in range(0, n):  # work on a copy of the input matrix
        temp = []
        for i in range(0, n):
            temp.append(matrix[j][i])
        A.append(temp)
    Ainv = identitymatrix(n)
    for i in range(0, n):
        factor = gmpy2.invert(A[i][i], q)  # invert mod q
        A[i] = multiply_vector_scalar(A[i], factor, q)
        Ainv[i] = multiply_vector_scalar(Ainv[i], factor, q)
        for j in range(0, n):
            if (i != j):
                factor = A[j][i]
                A[j] = minus_vector_scalar(A[j], factor, A[i], q)
                Ainv[j] = minus_vector_scalar(Ainv[j], factor, Ainv[i], q)
    return Ainv
def solve_equation(A, y):
    result = []
    for i in range(0, m):
        temp = 0
        for j in range(0, m):
            temp = temp ^ (A[i][j] * y[j])  # XOR-accumulate; A[i][j] is 0 or 1
        result.append(temp)
    return result

A_invert = inversematrix(A, 2)
print(solve_equation(A_invert, s))
Both of the methods you present make you do a cubic number of bit operations. There are methods that are faster, both asymptotically and in practice.
A first step (that may well be sufficient for you) is to use a 32-bit integer (I believe they're called numpy.int32 in Python) to store 32 consecutive elements of a row. This will speed up row reduction by a factor close to 32 on large enough inputs, and will probably put a significant dent in your running time on modest inputs.
In your particular code, there are a number of things you can trivially specialise to the mod-2 case. Search your code for % and gmpy2.invert and handle all of those; the extra, pointless operations are certainly not helping your runtime.
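Going further in that direction: Python integers are arbitrary-precision, so you can pack an entire row, together with its right-hand side, into a single int and perform a whole row-XOR in one operation. Below is a minimal sketch of that idea (my code, not part of the answer above; solve_gf2_packed is a made-up name), assuming A is square and invertible over GF(2):

def solve_gf2_packed(A, B):
    n = len(A)
    # Pack row r into one integer: bits 0..n-1 hold the coefficients and
    # the bits from position n upward hold B[r]. XOR acts bitwise, so a
    # single ^ updates the coefficients and the right-hand side together.
    rows = [sum(bit << c for c, bit in enumerate(A[r])) | (B[r] << n)
            for r in range(n)]
    for k in range(n):
        # find a pivot row with a 1 in column k and swap it into place
        pivot = next(r for r in range(k, n) if (rows[r] >> k) & 1)
        rows[k], rows[pivot] = rows[pivot], rows[k]
        # clear column k from every other row with one row-wide XOR
        for r in range(n):
            if r != k and (rows[r] >> k) & 1:
                rows[r] ^= rows[k]
    # after full reduction the coefficient part is the identity, so the
    # solution sits in the high bits of each row
    return [rows[r] >> n for r in range(n)]

On the example above, solve_gf2_packed(A, s) should return x = [12, 9, 6, 1, 10].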
I want to define indexes over finite ranges that eliminate ambiguity in Piecewise expressions.
For instance:
from sympy import *
x = IndexedBase('x')
n = Symbol('n', nonnegative = True, integer = True)
k = Idx('k', (1, n))
f = 1/sqrt(Sum(x[k]**2, (k, 1, n)))
j = Idx('j', (1,n))
diff = diff(f, x[j])  # note: this rebinds the name diff, shadowing sympy's function
print(diff.simplify()) returns:
-Piecewise((x[j], n >= j), (0, True))/Sum(x[k]**2, (k, 1, n))**(3/2)
However, when defining j I already declared that n >= j, so I would expect x[j] in the numerator instead of the Piecewise expression. Is there a way to solve this problem?
Why use the index j? It seems to confuse the simplification algorithm. Instead, differentiating with respect to the index k returns the expected result, without the spurious Piecewise split:
from sympy import *
x = IndexedBase('x')
n = Symbol('n', nonnegative = True, integer = True)
k = Idx('k', (1, n))
f = 1/sqrt(Sum(x[k]**2, (k, 1, n)))
diff = diff(f, x[k])
print(str(diff.simplify()))
Output:
-Sum(x[k], (k, 1, n))/Sum(x[k]**2, (k, 1, n))**(3/2)
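If you do need a distinct index j, one blunt workaround (my sketch, not part of the answer above) is to discharge the condition by hand, replacing each Piecewise with its first branch; this is valid precisely because the declared range of j guarantees n >= j:

from sympy import *

x = IndexedBase('x')
n = Symbol('n', nonnegative=True, integer=True)
k = Idx('k', (1, n))
j = Idx('j', (1, n))
f = 1/sqrt(Sum(x[k]**2, (k, 1, n)))
df = f.diff(x[j])
# keep only the first branch of every Piecewise; safe here because its
# condition, n >= j, holds for every valid value of j
df = df.replace(Piecewise, lambda *branches: branches[0].expr)
print(df.simplify())  # expected: -x[j]/Sum(x[k]**2, (k, 1, n))**(3/2)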
I want to translate the following code from MATLAB to Python (I am not familiar with Python in general, but I am trying to translate it from MATLAB using the basics).
% n is a random integer from 1 to 10
% First set the random seed (because we want our results to be reproducible;
% the seed sets a starting point in the sequence of random numbers the
% program generates)
rng(n)
% Generate random columns
a = rand(n, 1);
b = rand(n, 1);
c = rand(n, 1);
% Convert to a matrix
A = zeros(n);
for i = 1:n
    if i ~= n
        A(i + 1, i) = a(i + 1);
        A(i, i + 1) = c(i);
    end
    A(i, i) = b(i);
end
This is my attempt in Python:
import numpy as np
## n is a random integer from 1 to 10
np.random.seed(n)
### generate random columns:
a = np.random.rand(n)
b = np.random.rand(n)
c = np.random.rand(n)
A = np.zeros((n, n))  ## create a zero n-by-n matrix
for i in range(0, n):
    if (i != n):
        A[i + 1, i] = a[i + 1]
        A[i, i + 1] = c[i]
    A[i, i] = b[i]
I run into an error on the line A[i + 1, i] = a[i + 1]. Is there something about Python indexing that I am missing here?
As the above comments clearly point out the indexing error, here is a numpy way of doing it based on np.diag:
import numpy as np
# for reproducibility
np.random.seed(42)
# n is random integer from 1 to 10
n = np.random.randint(low=1, high=10)
# first diagonal below main diag: k = -1
a = np.random.rand(n-1)
# main diag: k = 0
b = np.random.rand(n)
# first diagonal above main diag: k = 1
c = np.random.rand(n-1)
# sum all 2-d arrays in order to obtain A
A = np.diag(a, k=-1) + np.diag(b, k=0) + np.diag(c, k=1)
The short answer is that for i = 1:n iterates over [1, n], inclusive on both bounds, while for i in range(n): iterates over [0, n), exclusive on the right bound. Therefore, the check if i ~= n correctly tests whether you are at the right edge, while if (i != n): does not. Replace it with
if i != n - 1:
The long answer is that you don't need any of that code in either language, since both MATLAB and numpy are intended to be used with vectorized operations. In MATLAB, you can write
A = diag(a(2:end), -1) + diag(b, 0) + diag(c(1:end-1), +1)
In numpy, it's very similar:
A = np.diag(a[1:], -1) + np.diag(b, 0) + np.diag(c[:-1], +1)
There are other tricks you can use, especially if you just want random numbers in the matrix:
A = np.random.rand(n, n)
# zero out everything more than one diagonal away from the main diagonal
A[np.tril_indices(n, -2)] = A[np.triu_indices(n, 2)] = 0
You can use other index-based approaches:
i, j = np.diag_indices(n)
# stack the indices of the super-, main and subdiagonals
i = np.concatenate((i[:-1], i, i[1:]))
j = np.concatenate((j[1:], j, j[:-1]))
A = np.zeros((n, n))
A[i, j] = np.random.rand(3 * n - 2)  # 3n - 2 entries across the three diagonals
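If a scipy dependency is acceptable, scipy.sparse.diags builds the same tridiagonal matrix from the three bands directly; a small sketch of that alternative (my addition, not part of the answers above):

import numpy as np
from scipy.sparse import diags

np.random.seed(42)
n = np.random.randint(low=1, high=10)
a = np.random.rand(n - 1)  # subdiagonal (offset -1)
b = np.random.rand(n)      # main diagonal (offset 0)
c = np.random.rand(n - 1)  # superdiagonal (offset +1)
# diags takes the band contents and their offsets; toarray() densifies
A = diags([a, b, c], [-1, 0, 1]).toarray()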
I am trying to find the minimal-cost path from point (0, 0) to any point (u, v) with u + v <= 100 in a 2d matrix of data I have generated.
My algorithm is pretty simple, but so far it has produced the following (visualized) results, which lead me to understand that my algorithm is way off.
# each cell of path_arr contains a tuple (i, j) of the next cell in the path.
# data contains the "cost" of stepping on its cell.
# total_cost_arr is used to assist in reconstructing the path.
def min_path(data, m=100, n=100):
    total_cost_arr = np.array([np.array([0 for x in range(0, m)]).astype(float) for x in range(0, n)])
    path_arr = np.array([np.array([(0, 0) for x in range(0, m)], dtype='i,i') for x in range(0, n)])
    total_cost_arr[0, 0] = data[0][0]
    for i in range(0, m):
        total_cost_arr[i, 0] = total_cost_arr[i - 1, 0] + data[i][0]
    for j in range(0, n):
        total_cost_arr[0, j] = total_cost_arr[0, j - 1] + data[0][j]
    for i in range(1, m):
        for j in range(1, n):
            total_cost_arr[i, j] = min(total_cost_arr[i - 1, j - 1], total_cost_arr[i - 1, j], total_cost_arr[i, j - 1]) + data[i][j]
            if total_cost_arr[i, j] == total_cost_arr[i - 1, j - 1] + data[i][j]:
                path_arr[i - 1, j - 1] = (i, j)
            elif total_cost_arr[i, j] == total_cost_arr[i - 1, j] + data[i][j]:
                path_arr[i - 1, j] = (i, j)
            else:
                path_arr[i, j - 1] = (i, j)
I think that writing (i, j) into the previous cell is causing conflicts, which leads to this behavior.
I don't think an array is the best structure for your problem.
You should use a graph data structure (with networkx, for example) and an algorithm like Dijkstra's or A* (which is derived from Dijkstra's).
Dijkstra's algorithm is implemented in networkx (see its shortest-path functions).
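For illustration, here is a minimal sketch of that approach (my code, not part of the answer above), assuming the cost of a step is the cost of the destination cell and keeping the same three moves (down, right, diagonal) as the code in the question:

import numpy as np
import networkx as nx

data = np.random.rand(100, 100)  # stand-in for the generated cost data
h, w = data.shape
G = nx.DiGraph()
for i in range(h):
    for j in range(w):
        for di, dj in ((1, 0), (0, 1), (1, 1)):  # down, right, diagonal
            ni, nj = i + di, j + dj
            if ni < h and nj < w:
                # stepping onto cell (ni, nj) costs data[ni, nj]
                G.add_edge((i, j), (ni, nj), weight=data[ni, nj])

# cheapest path from the top-left corner to, e.g., the bottom-right corner
path = nx.dijkstra_path(G, (0, 0), (h - 1, w - 1))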
Let's say I have an array sig:
sig = np.array([1,2,3,4,5])
Another array k which consists of indexes:
k = np.array([1,2,0,4])
I want to find an array that interpolates between sig[k[i]-1] and sig[k[i]] only if k[i] != 0 and k[i] != len(k), i.e.:
p = 2
result = np.zeros(len(k))
for i in range(len(k)):
    if (k[i] == 0):
        result[i] = sig[k[i]]
    elif (k[i] == len(k)):
        result[i] = sig[k[i] - 1]
    else:
        result[i] = sig[k[i] - 1] + (sig[k[i]] - sig[k[i] - 1]) * (p - k[i - 1]) / (k[i] - k[i - 1])
How do I do this with vectorization, without looping over len(k)?
Expected: result = array([1.66666667, 3, 1, 4])
For k = 0 and k = 4 I did not interpolate; the values were returned as sig[0] and sig[3], respectively.
For a (very) limited number of cases like this one, an approach to vectorizing such code is to build a linear combination of each case and its corresponding calculation.
So, set up the vectors
alpha = (k == 0) to match the first case,
beta = (k > 0) to match the second case, and
gamma = (k < len(k)) to match the third case.
Then, build up the proper linear combination:
alpha * sig[k] + beta * sig[k-1] + gamma * (sig[k] - sig[k-1]) * (p - np.roll(k, 1)) / (k - np.roll(k, 1))
Note that, by the way beta and gamma are set up above, the calculations of the second and third cases can be combined. Also, we need np.roll here to get the proper k[i-1].
The final solution, minimized to a one-liner, looks like this:
import numpy as np

# Inputs
sig = np.array([1, 2, 3, 4, 5])
k = np.array([1, 2, 0, 4])
p = 2

# Original solution using loop
result = np.zeros(len(k))
for i in range(len(k)):
    if (k[i] == 0):
        result[i] = sig[k[i]]
    elif (k[i] == len(k)):
        result[i] = sig[k[i] - 1]
    else:
        result[i] = sig[k[i] - 1] + (sig[k[i]] - sig[k[i] - 1]) * (p - k[i - 1]) / (k[i] - k[i - 1])

# Vectorized solution
res = (k == 0) * sig[k] + (k > 0) * sig[k - 1] + (k < len(k)) * (sig[k] - sig[k - 1]) * (p - np.roll(k, 1)) / (k - np.roll(k, 1))

# Outputs
print('Original solution using loop:\n ', result)
print('Vectorized solution:\n ', res)
The outputs are identical:
Original solution using loop:
[1.66666667 3. 1. 4. ]
Vectorized solution:
[1.66666667 3. 1. 4. ]
Hope that helps!
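A variant of the same idea (my sketch, not part of the answer above) that keeps the three cases strictly disjoint is np.select, which takes each element from the first condition that matches instead of summing the cases:

import numpy as np

sig = np.array([1, 2, 3, 4, 5])
k = np.array([1, 2, 0, 4])
p = 2

k_prev = np.roll(k, 1)  # stands in for k[i - 1], as in the loop version
interp = sig[k - 1] + (sig[k] - sig[k - 1]) * (p - k_prev) / (k - k_prev)
# the first matching condition wins; everything else falls back to interp
res = np.select([k == 0, k == len(k)], [sig[k], sig[k - 1]], default=interp)
print(res)  # [1.66666667 3. 1. 4.]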
I have the following code snippet, which essentially does the following: given a 2d numpy array, arr, compute sum_arr as follows:
sum_arr[i, j] = arr[i, j] + min(sum_arr[i - 1, j-1:j+2]) if (i > 0) else arr[i, j]
(with the slice j - 1 : j + 2 clipped to valid indices between 0 and w, of course)
Here's my implementation:
import numpy as np

h, w = 1000, 1000  # Shape of the 2d array
arr = np.arange(h * w).reshape((h, w))
sum_arr = arr.copy()

def min_parent(i, j):
    min_index = j
    if j > 0:
        if sum_arr[i - 1, j - 1] < sum_arr[i - 1, min_index]:
            min_index = j - 1
    if j < w - 1:
        if sum_arr[i - 1, j + 1] < sum_arr[i - 1, min_index]:
            min_index = j + 1
    return (i - 1, min_index)

for i, j in np.ndindex((h - 1, w)):
    sum_arr[i + 1, j] += sum_arr[min_parent(i + 1, j)]
And here's the problem: this code snippet takes way too long to execute for only 1e6 operations (about 5 s on average on my machine).
What is a better way of implementing this?
While your operation is sequential across rows, within each row it is not. It is therefore easy to vectorize row-wise and keep only a 1D outer loop, which in relative terms shouldn't incur too much overhead.
Indeed, doing so gives me a ~200x speedup:
5.2975871179951355 # OP
0.023798351001460105 # vectorized rows
And the code is actually quite simple:
import numpy as np

h, w = 1000, 1000  # Shape of the 2d array
arr = np.arange(h * w).reshape((h, w))

def min_parent(i, j, sum_arr):
    min_index = j
    if j > 0:
        if sum_arr[i - 1, j - 1] < sum_arr[i - 1, min_index]:
            min_index = j - 1
    if j < w - 1:
        if sum_arr[i - 1, j + 1] < sum_arr[i - 1, min_index]:
            min_index = j + 1
    return (i - 1, min_index)

def OP():
    sum_arr = arr.copy()
    for i, j in np.ndindex((h - 1, w)):
        sum_arr[i + 1, j] += sum_arr[min_parent(i + 1, j, sum_arr)]
    return sum_arr

def vect_rows():
    h, w = arr.shape
    if w == 1:
        return arr.cumsum(0)
    out = np.empty_like(arr)
    out[0] = arr[0]
    for i in range(1, h):
        # running minimum over the three parents of each cell
        out[i] = out[i - 1]
        out[i, :-1] = np.minimum(out[i, :-1], out[i - 1, 1:])  # fold in the right parent
        out[i, 1:] = np.minimum(out[i, 1:], out[i - 1, :-1])   # fold in the left parent
        out[i] += arr[i]
    return out

assert np.allclose(OP(), vect_rows())

from timeit import repeat
print(min(repeat(OP, number=3)))
print(min(repeat(vect_rows, number=3)))
Use dynamic programming: in a separate array, precompute the minima of the blocks of size X (in your case size 3, since you check j - 1, j, and j + 1). To determine the minimum for a block, use the value at the referenced position in the original array and the minimum of the previous block, since you are already filling the array dynamically. This way you simply assign to the index that needs it.
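To make that concrete, here is one way (my sketch, not the answerer's exact proposal) to precompute the size-3 block minima per row with scipy.ndimage.minimum_filter1d; mode='nearest' replicates the edge values, which matches clipping j - 1 : j + 2 to the array bounds:

import numpy as np
from scipy.ndimage import minimum_filter1d

h, w = 1000, 1000
arr = np.arange(h * w).reshape((h, w))

out = np.empty_like(arr)
out[0] = arr[0]
for i in range(1, h):
    # minimum over each length-3 window of the previous row, edges clipped
    out[i] = minimum_filter1d(out[i - 1], size=3, mode='nearest') + arr[i]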