Solving linear equations in Python with the answers restricted to 0/1

(My previous post was just closed, though I cannot see what was inappropriate about the question.)
I'm working on a linear equation-solving problem in which the value of each variable is either 0 or 1.
I would like to develop a solver that can tell whether each variable is definitely 0, definitely 1, or undetermined. In the final output, a variable is assigned its value if it is solved; otherwise it is assigned None.
For example, the inputs of
a + b + c = 1
b + c = 1
should generate the outputs of
{a=0, b=None, c=None}
And the inputs of
a + b + 2c + d = 2
a + d = 1
should give
{a=None, b=1, c=0, d=None}
As far as I know, there already exist general linear solvers in Python (e.g. numpy.linalg.solve). Is it possible to use them, perhaps with modifications? If not, what is the recommended approach?
Thank you~

Your idea is very close. np.linalg.solve(a, b) can only be used if a is square and of full rank, i.e. all rows (or, equivalently, columns) must be linearly independent. Otherwise, use for instance np.linalg.lstsq for the least-squares best "solution" of the system.
import numpy as np

# first system: a + b + c = 1, b + c = 1
A = np.array([[1, 1, 1], [0, 1, 1]])
B = np.array([1, 1])
X = np.linalg.lstsq(A, B, rcond=None)[0]  # only interested in the best solution
# solution for [a, b, c]:
# [-1.11022302e-16  5.00000000e-01  5.00000000e-01]

# second system: a + b + 2c + d = 2, a + d = 1
A = np.array([[1, 1, 2, 1], [1, 0, 0, 1]])
B = np.array([2, 1])
X = np.linalg.lstsq(A, B, rcond=None)[0]
# solution for [a, b, c, d]:
# [0.5 0.2 0.4 0.5]
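Note that the least-squares solution is not restricted to 0 and 1, so it does not by itself say which variables are forced. One possible brute-force sketch (my own addition, not the only way to do this) is to enumerate all 0/1 assignments and report a value only when every valid assignment agrees on it:

import itertools
import numpy as np

def solve_binary(A, B):
    """Return, for each variable, 0 or 1 if that value is forced in every
    0/1 solution of A @ x == B, and None otherwise."""
    A, B = np.asarray(A), np.asarray(B)
    n_vars = A.shape[1]
    solutions = [x for x in itertools.product((0, 1), repeat=n_vars)
                 if np.array_equal(A @ np.array(x), B)]
    if not solutions:
        return None  # the system has no 0/1 solution at all
    columns = np.array(solutions).T
    return [int(col[0]) if np.all(col == col[0]) else None for col in columns]

print(solve_binary([[1, 1, 1], [0, 1, 1]], [1, 1]))        # [0, None, None]
print(solve_binary([[1, 1, 2, 1], [1, 0, 0, 1]], [2, 1]))  # [None, 1, 0, None]

This is exponential in the number of variables, so it is only practical for small systems.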

Related

Why does scipy.stats.entropy(a, b) return inf while scipy.stats.entropy(b, a) doesn't?

In [15]: a = np.array([0.5, 0.5, 0, 0, 0])
In [16]: b = np.array([1, 0, 0, 0, 0])
In [17]: entropy(a, b)
Out[17]: inf
In [18]: entropy(b, a)
Out[18]: 0.6931471805599453
From their documentation, I expected both to return inf since the equation given is S = sum(pk * log(pk / qk), axis=0). What is the reason for the finite output of In [18]?
For entropy(b, a), only the first pair of values contributes a non-zero term:
>>> 1 * np.log(1/0.5)
0.6931471805599453
For entropy(a, b), there is one divide-by-zero, 0.5/0 in the second term, which makes the result infinite.
For the remaining terms, entropy() takes 0 * np.log(0/0) to be 0.
Looking into the definition of the Kullback-Leibler divergence, this is simply how it is defined.
From Wikipedia:
Whenever P(x) is zero, the contribution of the corresponding term is interpreted as zero, because lim_{x -> 0+} x*log(x) = 0.
When both p and q are provided, the entropy function computes the KL-divergence. The KL-divergence is asymmetric, meaning that KL(p, q) != KL(q, p) unless p == q, so you will get different answers.
Furthermore, as the other answers explain, having zeros in your distribution means dividing by zero in the definition of the KL-divergence:
KL(p,q) = sum(p * log(p/q))
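To make the asymmetry concrete, here is a small sketch that reproduces both scipy values by applying the definition term by term, using the 0 * log(0/q) = 0 convention:

import numpy as np
from scipy.stats import entropy

def kl(p, q):
    total = 0.0
    for pk, qk in zip(p, q):
        if pk == 0:
            continue           # convention: 0 * log(0/q) contributes 0
        if qk == 0:
            return np.inf      # pk > 0 but qk == 0: the term diverges
        total += pk * np.log(pk / qk)
    return total

a = np.array([0.5, 0.5, 0, 0, 0])
b = np.array([1, 0, 0, 0, 0])
print(kl(b, a), entropy(b, a))  # 0.6931... and 0.6931...
print(kl(a, b), entropy(a, b))  # inf and inf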

Constrained regression in Python with multiple constraints

I am currently working on setting up a constrained regression in Python using the fit_constrained function from statsmodels:

import statsmodels.api as sm

model = sm.GLM(Y, X)
# Restrictions on the parameters are given in the form (R, q), where R and q
# are the constraint matrix and the constraint values, respectively. For
# example, for the restriction c = b - 1 (i.e. b - c = 1), R = [0, 1, -1]
# and q = 1.
model.fit_constrained

but I am running into some issues when I try to set it up with multiple constraints. I have seven coefficients, including a constant. I want to set it up so that a weighted sum of dummy 1 and dummy 2 equals zero and a weighted sum of dummy 3 and dummy 4 equals zero. As a single-constraint example,
results = model.fit_constrained(([0, 0, 0, a, b, 0, 0], 0))
where a and b are the weights on dummy 3 and dummy 4 and are variables I've predefined.
If I didn't have the a and b variables, and the dummies were equally weighted, I could just use the syntax
fit_constrained('Dummy1 + Dummy2, Dummy3 + Dummy4')
but when I try to use a similar syntax using
results = model.fit_constrained(([0, 0, 0, a, b, 0, 0], 0),([0, c, d, 0, 0, 0, 0], 0))
I get the error
ValueError: shapes (2,) and (7,6) not aligned: 2 (dim 0) != 7 (dim 0)
Does anyone have any ideas? Thanks so much!
I am still not sure which model you are running (posting a Minimal, Complete, and Verifiable example would certainly help), but the following should work for GLMs. From the docs, we have,
constraints (formula expression or tuple) – If it is a tuple, then the constraint needs to be given by two arrays (constraint_matrix, constraint_value), i.e. (R, q). Otherwise, the constraints can be given as strings or list of strings. see t_test for details.
This implies the function call should be along the following lines,
R = [[0, 0, 0, a, b, 0, 0],
     [0, c, d, 0, 0, 0, 0]]
q = [0, 0]

results = model.fit_constrained((R, q))
This should work, but since we do not have your model I cannot check that R * params = q expresses the constraints you intend, which is what must hold according to the documentation.
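Since the actual model is not available, here is a minimal sketch with made-up data (the design matrix, response, and the weights a, b, c, d below are placeholders) showing the two stacked constraint rows being passed to fit_constrained:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = sm.add_constant(rng.normal(size=(200, 6)))   # constant + 6 regressors = 7 parameters
Y = rng.normal(size=200)

a, b, c, d = 0.3, 0.7, 0.5, 0.5                  # placeholder weights

R = np.array([[0, 0, 0, a, b, 0, 0],             # a*dummy3 + b*dummy4 = 0
              [0, c, d, 0, 0, 0, 0]])            # c*dummy1 + d*dummy2 = 0
q = np.array([0.0, 0.0])

model = sm.GLM(Y, X)
results = model.fit_constrained((R, q))
print(results.params)
print(R @ results.params)                        # both entries should be ~0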

How to derive with respect to a Matrix element with Sympy

Given the product of a matrix and a vector,
A.v
with A of shape (m, n) and v of dimension n, where m and n are symbols, I need to calculate the derivative with respect to the matrix elements.
I haven't found a way to use a proper vector, so I started with two MatrixSymbols:
import sympy.tensor as tensor
from sympy import symbols, MatrixSymbol, diff

n, m = symbols('n m')
j = tensor.Idx('j')
i = tensor.Idx('i')
l = tensor.Idx('l')
h = tensor.Idx('h')
A = MatrixSymbol('A', n, m)
B = MatrixSymbol('B', m, 1)
C = A * B
Now, if I try to differentiate with respect to one of A's elements using the indices, I get back the unevaluated expression:
diff(C, A[i,j])
>>>> Derivative(A*B, A[i, j])
If I also introduce the indices in C (it won't let me use only one index on the resulting vector), I get the product expressed as a Sum:
C[l,h]
>>>> Sum(A[l, _k]*B[_k, h], (_k, 0, m - 1))
If I differentiate this with respect to the matrix element, I end up with 0 instead of an expression containing KroneckerDelta, which is the result I would like to get:
diff(C[l,h], A[i,j])
>>>> 0
I wonder if maybe I shouldn't be using MatrixSymbols to start with. How should I go about implementing the behaviour that I want to get?
SymPy does not yet know matrix calculus; in particular, one cannot differentiate MatrixSymbol objects. You can do this sort of computation with Matrix objects filled with arrays of symbols; the drawback is that the matrix sizes must be explicit for this to work.
Example:
from sympy import *
A = Matrix(symarray('A', (4, 5)))
B = Matrix(symarray('B', (5, 3)))
C = A*B
print(C.diff(A[1, 2]))
outputs:
Matrix([[0, 0, 0], [B_2_0, B_2_1, B_2_2], [0, 0, 0], [0, 0, 0]])
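As a quick sanity check, the non-zero row agrees with differentiating the scalar entries of C directly:

# Row 1 of C depends on A[1, 2] only through the term A[1, 2] * B[2, k],
# so dC[1, k]/dA[1, 2] == B[2, k], while every other row differentiates to 0.
assert C[1, 0].diff(A[1, 2]) == B[2, 0]
assert C[0, 0].diff(A[1, 2]) == 0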
The git version of SymPy (and the next version) handles this better:
In [55]: print(diff(C[l,h], A[i,j]))
Sum(KroneckerDelta(_k, j)*KroneckerDelta(i, l)*B[_k, h], (_k, 0, m - 1))

Why doesn't SymPy show me the inverse matrix result from the book?

According to the book I'm reading, the inverse of

T = [[1, 1,    1   ],
     [1, a**2, a   ],
     [1, a,    a**2]]

is

(1/3) * [[1, 1,    1   ],
         [1, a,    a**2],
         [1, a**2, a   ]]

where a = e^(j*2π/3), i.e. like the imaginary unit j, but with a phase of 120° instead of 90°.
So I tried this in SymPy:
from sympy import *
a = symbols('a')
T = Matrix([
[1, 1, 1],
[1, a**2, a],
[1, a, a**2]
])
simplify(T.inv())
The result IPython shows (the inverse in terms of the free symbol a) doesn't look like the inverse matrix in the book at all.
Why did I get this?
And how can I get the result in the book using SymPy?
After your edit, it is clear that a is not a parameter but has a precise value, namely -0.5 + i*sqrt(3)/2. If you don't tell SymPy what that value is, it treats a as a free symbol, and the inverted matrix looks the way you saw. But if you give a the right value, everything works:
from sympy import *

a = -0.5 + I*sqrt(3)/2
T = Matrix([
    [1, 1, 1],
    [1, a**2, a],
    [1, a, a**2]
])
invT = Matrix([
    [1, 1, 1],
    [1, a, a**2],
    [1, a**2, a]
])
# the book's inverse is invT/3, so T times it should give the identity
simplify(1/3*(T*invT))
and this gives the identity matrix as expected.
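The same check also goes through with the exact value of a instead of the float -0.5, for example:

from sympy import *

a = Rational(-1, 2) + sqrt(3)*I/2     # exact primitive cube root of unity
T = Matrix([
    [1, 1, 1],
    [1, a**2, a],
    [1, a, a**2]
])
book_inverse = Rational(1, 3) * Matrix([
    [1, 1, 1],
    [1, a, a**2],
    [1, a**2, a]
])
print(simplify(T * book_inverse))     # prints the 3x3 identity matrix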
This was my original answer:
You can't get the result given by your book, because it's wrong.
Emathelp.net confirms that the result found by SymPy is correct, and symbolab.com shows that the result provided by your book is wrong, because if you multiply A * A^-1 you don't get the identity matrix.

Build a matrix from blocks

I have objects described by two quantities, A and B (in the real case there can be more than two). Objects are correlated depending on the values of A and B. In particular, I know the correlation matrix for A and the one for B. Just as an example:
import numpy as np

a = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 1]])
b = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]])
na = a.shape[0]
nb = b.shape[0]
The matrix a is the correlation for A: if one element has A == 0.5 and another has A == 1.5 they are fully correlated, whereas if one has A == 0.5 and the other has A == 3.5 they are uncorrelated. The matrix b plays the same role for B.
Now I want to multiply the two correlation matrices, but the final matrix should have two axes, each a folded version of the original axes:

def get_folded_bin(ia, ib):
    return ia * nb + ib
Here is what I am doing:

result = np.swapaxes(np.tensordot(a, b, axes=0), 1, 2).reshape(na * nb, na * nb)

and in particular this must hold:
for ia1 in range(na):
    for ia2 in range(na):
        for ib1 in range(nb):
            for ib2 in range(nb):
                assert a[ia1, ia2] * b[ib1, ib2] == result[get_folded_bin(ia1, ib1), get_folded_bin(ia2, ib2)]
Actually, my problem is to do this with more quantities (A, B, C, ...) in a general way. Maybe there is also a simpler NumPy function for it.
np.einsum lets you simplify the tensordot expression a bit:
result = np.einsum('ij,kl->ikjl',a,b).reshape(-1, na * nb)
I don't think there's a way of eliminating the reshape.
It may also be easier to generalize to more arrays, though I wouldn't get carried away with too many iteration variables in one einsum expression.
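For example, with the a and b from the question plus a hypothetical third matrix c, the pattern extends by adding one pair of indices per array:

import numpy as np

a = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 1]])
b = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]])
c = np.eye(2)                      # hypothetical third correlation matrix
na, nb, nc = a.shape[0], b.shape[0], c.shape[0]

two = np.einsum('ij,kl->ikjl', a, b).reshape(na * nb, na * nb)
three = np.einsum('ij,kl,mn->ikmjln', a, b, c).reshape(na * nb * nc, na * nb * nc)
print(two.shape, three.shape)      # (12, 12) (24, 24)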
I think I have finally found a solution:
np.kron(a,b)
and then I can compose with
np.kron(np.kron(a,b), c)
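A quick check that np.kron agrees with the einsum construction above, plus functools.reduce for composing an arbitrary number of matrices (small stand-in matrices used just for illustration):

import functools
import numpy as np

a = np.array([[1, 0], [0, 1]])
b = np.array([[1, 1], [1, 1]])
c = np.array([[1, 0], [0, 1]])

# np.kron(a, b) matches the einsum/reshape construction from the other answer
assert np.array_equal(np.kron(a, b),
                      np.einsum('ij,kl->ikjl', a, b).reshape(4, 4))

# chaining np.kron generalizes to any number of quantities
combined = functools.reduce(np.kron, [a, b, c])   # == np.kron(np.kron(a, b), c)
print(combined.shape)                             # (8, 8)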
