I want to test a weighted SDP problem in cvxpy, namely min_{X > 0} | P * X - B |^2, where * is the Hadamard product. I started with the simple case where P is the identity matrix. This is the code I wrote for the problem:
import cvxpy as cp
import numpy as np
import random
##parameters
random.seed(1)
n = 50
CN = 10
#generate real A
D = np.diag(1000 * np.logspace(-np.log(CN), 0, n))
Q, R = np.linalg.qr(np.random.normal(size = (n,n)))
A = Q @ D @ Q
P = np.ones((n,n))
PA = np.multiply(P, A)
X = cp.Variable((n,n), symmetric=True)
X.value = np.identity(n)
constraints = [X >> 0]
prob = cp.Problem(cp.Minimize(0.5 * cp.norm(P @ A - PA, 'fro')),
                  constraints)
prob.solve()
X.value
I used @ in place of * because matrix multiplication with the identity is equal to the Hadamard product with it (I also can't find the atom for element-wise multiplication in cvxpy; if someone knows it, it would be really helpful if you could tell me).
After testing this, I consistently get the zero matrix as the optimal value for X. I think there must be some error in my implementation, but I cannot tell what is wrong. Any help will be much appreciated.
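One thing that may be worth checking first: multiplying by the identity as a matrix product is not the same as taking the Hadamard product with it. Here is a minimal NumPy sketch with made-up 3x3 values (not the data from the question) showing the difference. For the cvxpy side, the element-wise product atom is cp.multiply.

```python
import numpy as np

# Hypothetical small example, not the data from the question
X = np.arange(1.0, 10.0).reshape(3, 3)
I = np.identity(3)

matrix_product = I @ X      # ordinary matrix product: returns X unchanged
hadamard_product = I * X    # element-wise (Hadamard) product: keeps only the diagonal

print(matrix_product)
print(hadamard_product)
```

So the two only agree when X is already diagonal.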
I am trying to calculate the covariance matrix of two vectors a and b, for which I am using NumPy's cov implementation, R = np.cov(a,b). I got a little confused when I noticed that np.cov(a,b)[0,0] != np.var(a); however, I was able to find out that this has to do with biased vs. unbiased estimators and is controlled by ddof.
However, that isn't the end of it. Why is R[0,1] != R[0,0]**0.5 * R[1,1]**0.5? Following my understanding and the definition of the covariance matrix on Wikipedia (https://en.wikipedia.org/wiki/Covariance_matrix):
R[0,1] = R[1,0] = std(a) * std(b)
R[0,1] = R[1,0] = var(a)**0.5 * var(b)**0.5
R[0,1] = R[1,0] = R[0,0]**0.5 * R[1,1]**0.5
Where am I mistaken?
import numpy as np
rng = np.random.default_rng(seed=43)
a = rng.random((1,3))
b = rng.random((1,3))
R = np.cov(a,b,ddof=1)
print(R)
print('var a: ' + str(np.var(a,ddof=1)))
print('var b: ' + str(np.var(b,ddof=1)))
print('cov a,b: ' + str(np.var(a,ddof=1)**0.5*np.var(b,ddof=1)**0.5))
print('cov a,b: ' + str(R[0,0]**0.5*R[1,1]**0.5))
print('cov a,b: ' + str(np.std(a,ddof=1)*np.std(b,ddof=1)))
I apologize in advance for any spelling or Stack Overflow etiquette mistakes on my part. Any help is appreciated.
I'm not sure where the formula var(a)**0.5 * var(b)**0.5 is coming from, but it's not the formula I've seen for cross-covariance. I've seen this as the expected value of the products of x - mean_of_x and y - mean_of_y.
In a loop style (for clarity) this might look like:
a_mean = np.mean(a)
b_mean = np.mean(b)
s = 0
n = len(a[0])
for i, _ in enumerate(a[0]):
    s += (a[0][i] - a_mean) * (b[0][i] - b_mean)
s / (n-1)
# 0.09175517729176114
In Numpy you can also do:
a_mean = np.mean(a)
b_mean = np.mean(b)
(a - a_mean) @ (b - b_mean).T / (n-1)
# array([[0.09175518]])
This corresponds to the off-diagonal values of the matrix returned by np.cov.
If you want to divide by n rather than n-1, you can pass the bias argument to cov():
np.cov(a, b, bias=True)
# array([[0.08562558, 0.06117012],
#        [0.06117012, 0.06361328]])
The off-diagonal entries here are what you get from the code above when dividing by 3 (n) rather than 2 (n-1).
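To tie this back to the formula in the question: the product of the standard deviations is only an upper bound on the covariance, and the two are linked by the Pearson correlation coefficient, cov(a,b) = rho * std(a) * std(b). A small sketch to check this numerically, assuming 1-D arrays instead of the (1,3) arrays in the question (the names cov_ab, rho and std_product are just for illustration):

```python
import numpy as np

rng = np.random.default_rng(seed=43)
a = rng.random(3)
b = rng.random(3)

R = np.cov(a, b, ddof=1)

# Off-diagonal entry from the definition: summed product of deviations over (n - 1)
cov_ab = np.sum((a - a.mean()) * (b - b.mean())) / (len(a) - 1)

# The Pearson correlation coefficient links covariance and standard deviations
rho = np.corrcoef(a, b)[0, 1]
std_product = np.std(a, ddof=1) * np.std(b, ddof=1)

print(R[0, 1], cov_ab, rho * std_product)
```

R[0,1] only equals std(a) * std(b) when the two vectors are perfectly correlated (rho = 1).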
I'm trying to write a program that can solve the general regression formula:
So I'm trying to implement this matrix equation. Is there any way to let the user decide how big it can be, without me adding more and more if conditions (so just one piece of code that collapses to the matrix that the user wishes for)?
Code:
#Solving the general matrix for the coefficients
if 3 == n:
    a = np.array([[np.sum(np.multiply(FL[1],FL[1])),np.sum(np.multiply(FL[1],FL[2]))],
                  [np.sum(np.multiply(FL[1],FL[2])),np.sum(np.multiply(FL[2],FL[2]))]])
    b = np.array([np.sum(np.multiply(FL[0],FL[1])),np.sum(np.multiply(FL[0],FL[2]))])
    x = np.linalg.solve(a, b)
if 4 == n:
    a = np.array([[np.sum(np.multiply(FL[1],FL[1])),np.sum(np.multiply(FL[1],FL[2])),np.sum(np.multiply(FL[1],FL[3]))],
                  [np.sum(np.multiply(FL[1],FL[2])),np.sum(np.multiply(FL[2],FL[2])),np.sum(np.multiply(FL[2],FL[3]))],
                  [np.sum(np.multiply(FL[1],FL[3])),np.sum(np.multiply(FL[2],FL[3])),np.sum(np.multiply(FL[3],FL[3]))]])
    b = np.array([np.sum(np.multiply(FL[0],FL[1])),np.sum(np.multiply(FL[0],FL[2])),np.sum(np.multiply(FL[0],FL[3]))])
    x = np.linalg.solve(a, b)
In this code Phi_0 corresponds to FL[i=1] and FL[0] corresponds to y.
You can make the algorithm independent of the order of the polynomial. The easiest way is using for loops, although these will be slow (since they don't exploit NumPy's vectorization).
Here is a reproducible example with random data:
import numpy as np
# Order of polynomial
n = 5
# Random seed for reproducibility
np.random.seed(1)
# Input arrays
phi = np.random.random((100,n))
y = np.random.random(100)
# Output arrays
a = np.zeros((n,n))
b = np.zeros(n)
for i in range(n):
    b[i] = np.sum(y * phi[:,i])
    for j in range(i,n):
        # Exploit that the matrix is symmetric
        a[i,j] = a[j,i] = np.sum(phi[:,i] * phi[:,j])
# Coefficients array
x = np.linalg.solve(a,b)
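If the loops turn out to be too slow, the same system can be built without them, since a[i,j] is the dot product of columns i and j of phi. A minimal sketch using the same random data as above, with np.linalg.lstsq as a cross-check (the name x_lstsq is just for illustration):

```python
import numpy as np

np.random.seed(1)
n = 5
phi = np.random.random((100, n))
y = np.random.random(100)

# Normal equations: a[i, j] = sum_k phi[k, i] * phi[k, j], b[i] = sum_k y[k] * phi[k, i]
a = phi.T @ phi
b = phi.T @ y
x = np.linalg.solve(a, b)

# Cross-check against NumPy's least-squares solver
x_lstsq, *_ = np.linalg.lstsq(phi, y, rcond=None)
print(np.allclose(x, x_lstsq))
```

For badly conditioned basis functions, calling lstsq directly is usually the safer choice, because forming phi.T @ phi squares the condition number.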
I am trying to solve the following problem via a Finite Difference Approximation in Python using NumPy:
$u_t = k \, u_{xx}$, on $0 < x < L$ and $t > 0$;
$u(0,t) = u(L,t) = 0$;
$u(x,0) = f(x)$.
I take $u(x,0) = f(x) = x^2$ for my problem.
Programming is not my forte so I need help with the implementation of my code. Here is my code (I'm sorry it is a bit messy, but not too bad I hope):
## This program is to implement a Finite Difference method approximation
## to solve the Heat Equation, u_t = k * u_xx,
## in 1D w/out sources & on a finite interval 0 < x < L. The PDE
## is subject to B.C: u(0,t) = u(L,t) = 0,
## and the I.C: u(x,0) = f(x).
import numpy as np
import matplotlib.pyplot as plt
# definition of initial condition function
def f(x):
    return x^2
# parameters
L = 1
T = 10
N = 10
M = 100
s = 0.25
# uniform mesh
x_init = 0
x_end = L
dx = float(x_end - x_init) / N
#x = np.zeros(N+1)
x = np.arange(x_init, x_end, dx)
x[0] = x_init
# time discretization
t_init = 0
t_end = T
dt = float(t_end - t_init) / M
#t = np.zeros(M+1)
t = np.arange(t_init, t_end, dt)
t[0] = t_init
# Boundary Conditions
for m in xrange(0, M):
    t[m] = m * dt
# Initial Conditions
for j in xrange(0, N):
    x[j] = j * dx
# definition of solution to u_t = k * u_xx
u = np.zeros((N+1, M+1)) # NxM array to store values of the solution
# finite difference scheme
for j in xrange(0, N-1):
    u[j][0] = x**2 #initial condition
for m in xrange(0, M):
    for j in xrange(1, N-1):
        if j == 1:
            u[j-1][m] = 0 # Boundary condition
        else:
            u[j][m+1] = u[j][m] + s * ( u[j+1][m] - #FDM scheme
                                        2 * u[j][m] + u[j-1][m] )
        else:
            if j == N-1:
                u[j+1][m] = 0 # Boundary Condition
print u, t, x
#plt.plot(t, u)
#plt.show()
So the first issue I am having is I am trying to create an array/matrix to store values for the solution. I wanted it to be an NxM matrix, but in my code I made the matrix (N+1)x(M+1) because I kept getting an error that the index was going out of bounds. Anyways how can I make such a matrix using numpy.array so as not to needlessly take up memory by creating a (N+1)x(M+1) matrix filled with zeros?
Second, how can I "access" such an array? The real solution u(x,t) is approximated by u(x[j], t[m]), where j is the jth spatial value and m is the mth time value. The finite difference scheme is given by:
u(x[j],t[m+1]) = u(x[j],t[m]) + s * ( u(x[j+1],t[m]) - 2 * u(x[j],t[m]) + u(x[j-1],t[m]) )
(See here for the formulation)
I want to be able to implement the Initial Condition u(x[j],t[0]) = x**2 for all values of j = 0,...,N-1. I also need to implement Boundary Conditions u(x[0],t[m]) = 0 = u(x[N],t[m]) for all values of t = 0,...,M. Is the nested loop I created the best way to do this? Originally I tried implementing the I.C. and B.C. under two different for loops which I used to calculate values of the matrices x and t (in my code I still have comments placed where I tried to do this)
I think I am just not using the right notation, but I cannot find anywhere in the NumPy documentation how to "call" such an array so as to iterate through each value in the proposed scheme. Can anyone shed some light on what I am doing wrong?
Any help is very greatly appreciated. This is not homework but rather to understand how to program FDM for Heat Equation because later I will use similar methods to solve the Black-Scholes PDE.
EDIT: So when I run my code on line 60 (the last "else" that I use) I get an error that says invalid syntax, and on line 51 (u[j][0] = x**2 #initial condition) I get an error that reads "setting an array element with a sequence." What does that mean?
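For what it's worth, here is a minimal sketch of how the scheme above could be laid out with NumPy slicing. It is written for Python 3 and makes a few assumptions that are not in the original code: the grid is built with np.linspace so that it has N+1 points including both ends, s is set directly as in the question, and the boundary values overwrite the initial condition at x = 0 and x = L.

```python
import numpy as np

def f(x):
    return x**2  # ** is exponentiation; ^ is bitwise XOR in Python

L = 1.0   # length of the interval
N = 10    # number of spatial steps, so N + 1 grid points
M = 100   # number of time steps, so M + 1 time levels
s = 0.25  # s = k * dt / dx**2; the explicit scheme needs s <= 0.5

x = np.linspace(0.0, L, N + 1)

# Row j is the grid point x[j], column m is the time level t[m]
u = np.zeros((N + 1, M + 1))
u[:, 0] = f(x)   # initial condition, assigned as a whole column
u[0, :] = 0.0    # boundary condition at x = 0 for all times
u[N, :] = 0.0    # boundary condition at x = L for all times

for m in range(M):
    # Update all interior points j = 1..N-1 at once
    u[1:N, m + 1] = u[1:N, m] + s * (u[2:, m] - 2 * u[1:N, m] + u[:N - 1, m])

print(u[:, -1])
```

The (N+1)x(M+1) shape is not wasted memory: N steps give N+1 grid points and M steps give M+1 time levels. The "setting an array element with a sequence" error comes from assigning the whole array x**2 to the single element u[j][0]; assigning to the column u[:, 0] avoids that.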
I have an m × n × n numpy.ndarray of m simultaneously diagonalizable square matrices and would like to use numpy to obtain their simultaneous eigenvalues.
For example, if I had
from numpy import einsum, diag, array, linalg, random
U = linalg.svd(random.random((3,3)))[2]
M = einsum(
"ij, ajk, lk",
U, [diag([2,2,0]), diag([1,-1,1])], U)
the two matrices in M are simultaneously diagonalizable, and I am looking for a way to obtain the array
array([[2.,  1.],
       [2., -1.],
       [0.,  1.]])
(up to permutation of the rows) from M. Is there a built-in or easy way to get this?
There is a fairly simple and very elegant simultaneous diagonalization algorithm based on Givens rotations that was published by Cardoso and Souloumiac in 1996:
Cardoso, J., & Souloumiac, A. (1996). Jacobi Angles for Simultaneous Diagonalization. SIAM Journal on Matrix Analysis and Applications, 17(1), 161–164. doi:10.1137/S0895479893259546
I've attached a numpy implementation of the algorithm at the end of this response. Caveat: It turns out simultaneous diagonalization is a bit of a tricky numerical problem, with no algorithm (to the best of my knowledge) that guarantees global convergence. However, the cases in which it does not work (see the paper) are degenerate and in practice I have never had the Jacobi angles algorithm fail on me.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Routines for simultaneous diagonalization
Arun Chaganty <arunchaganty@gmail.com>
"""
import numpy as np
from numpy import zeros, eye, diag
from numpy.linalg import norm

def givens_rotate( A, i, j, c, s ):
    """
    Rotate the rows of A along axis (i,j) by c and s (in place)
    """
    Ai, Aj = A[i,:], A[j,:]
    A[i,:], A[j,:] = c * Ai + s * Aj, c * Aj - s * Ai
    return A

def givens_double_rotate( A, i, j, c, s ):
    """
    Rotate the rows and then the columns of A along axis (i,j) by c and s (in place)
    """
    Ai, Aj = A[i,:], A[j,:]
    A[i,:], A[j,:] = c * Ai + s * Aj, c * Aj - s * Ai
    A_i, A_j = A[:,i], A[:,j]
    A[:,i], A[:,j] = c * A_i + s * A_j, c * A_j - s * A_i
    return A

def jacobi_angles( *Ms, **kwargs ):
    r"""
    Simultaneously diagonalize using Jacobi angles
    @article{SC-siam,
       HTML =    "ftp://sig.enst.fr/pub/jfc/Papers/siam_note.ps.gz",
       author = "Jean-Fran\c{c}ois Cardoso and Antoine Souloumiac",
       journal = "{SIAM} J. Mat. Anal. Appl.",
       title = "Jacobi angles for simultaneous diagonalization",
       pages = "161--164",
       volume = "17",
       number = "1",
       month = jan,
       year = {1996}}

    (a) Compute Givens rotations for every pair of indices (i,j) i < j
            - from eigenvectors of G = gg'; g = A_ii - A_jj, A_ij + A_ji
            - Compute c, s as \sqrt{(x+r)/2r}, y/\sqrt{2r(x+r)}
    (b) Update matrices by multiplying by the givens rotation R(i,j,c,s)
    (c) Repeat (a) until stopping criterion: sin theta < threshold for all ij pairs

    Note: the matrices in Ms are modified in place.
    """
    assert len(Ms) > 0
    m, n = Ms[0].shape
    assert m == n

    sweeps = kwargs.get('sweeps', 500)
    threshold = kwargs.get('eps', 1e-8)
    rank = kwargs.get('rank', m)

    R = eye(m)

    for _ in range(sweeps):
        done = True
        for i in range(rank):
            for j in range(i+1, m):
                G = zeros((2,2))
                for M in Ms:
                    g = np.array([ M[i,i] - M[j,j], M[i,j] + M[j,i] ])
                    G += np.outer(g,g) / len(Ms)
                # Compute the eigenvector directly
                t_on, t_off = G[0,0] - G[1,1], G[0,1] + G[1,0]
                theta = 0.5 * np.arctan2( t_off, t_on + np.sqrt( t_on*t_on + t_off * t_off) )
                c, s = np.cos(theta), np.sin(theta)

                if abs(s) > threshold:
                    done = False
                    # Update the matrices and V
                    for M in Ms:
                        givens_double_rotate(M, i, j, c, s)
                        #assert M[i,i] > M[j, j]
                    R = givens_rotate(R, i, j, c, s)
        if done:
            break
    R = R.T

    L = np.zeros((m, len(Ms)))
    err = 0
    for i, M in enumerate(Ms):
        # The off-diagonal elements of M should be 0
        L[:,i] = diag(M)
        err += norm(M - diag(diag(M)))

    return R, L, err
I am not aware of any direct solution. But why not just get the eigenvalues and eigenvectors of the first matrix, and use the eigenvectors to transform all other matrices to diagonal form? Something like:
eigvals, eigvecs = np.linalg.eig(matrix1)
# The transpose assumes orthonormal eigenvectors (symmetric matrices); otherwise use np.linalg.inv(eigvecs)
eigvals2 = np.diagonal(np.dot(np.dot(np.transpose(eigvecs), matrix2), eigvecs))
You can then add the columns to an array via hstack if you like.
UPDATE: As pointed out below, this is only valid if no degenerate eigenvalues occur. Otherwise one would first have to check for degenerate eigenvalues, then transform the second matrix to block-diagonal form, and separately diagonalize any blocks bigger than 1x1.
I am sure there is significant room for improvement in my solution, but I have come up with the following set of three functions doing the calculation for me in a semi-robust way.
import numpy
import scipy.linalg

def clusters(array,
             orig_indices = None,
             start = 0,
             rtol=1e-05,   # default rtol of numpy.allclose
             atol=1e-08):  # default atol of numpy.allclose
    """For an array, return a permutation that sorts the numbers and the sizes of the resulting blocks of identical numbers."""
    array = numpy.asarray(array)
    if not len(array):
        # Integer dtype so the permutation can be used for indexing
        return numpy.array([], dtype=int),[]
    if orig_indices is None:
        orig_indices = numpy.arange(len(array))
    x = array[0]
    close = abs(array-x) <= (atol + rtol*abs(x))
    first = sum(close)
    r_perm, r_sizes = clusters(
        array[~close],
        orig_indices[~close],
        start+first,
        rtol, atol)
    r_sizes.insert(0, first)
    return numpy.concatenate((orig_indices[close], r_perm)), r_sizes

def permutation_matrix(permutation, dtype=float):
    n = len(permutation)
    P = numpy.zeros((n,n), dtype)
    for i,j in enumerate(permutation):
        P[j,i]=1
    return P

def simultaneously_diagonalize(tensor, atol=1e-08):
    tensor = numpy.asarray(tensor)
    old_shape = tensor.shape
    size = old_shape[-1]
    tensor = tensor.reshape((-1, size, size))
    diag_mask = 1-numpy.eye(size)

    eigvalues, diagonalizer = numpy.linalg.eig(tensor[0])
    diagonalization = numpy.dot(
        numpy.dot(
            numpy.linalg.inv(diagonalizer),
            tensor).swapaxes(0,-2),
        diagonalizer)
    if numpy.allclose(diag_mask*diagonalization, 0):
        return diagonalization.diagonal(axis1=-2, axis2=-1).reshape(old_shape[:-1])
    else:
        perm, cluster_sizes = clusters(diagonalization[0].diagonal())
        perm_matrix = permutation_matrix(perm)
        diagonalization = numpy.dot(
            numpy.dot(
                perm_matrix.T,
                diagonalization).swapaxes(0,-2),
            perm_matrix)
        mask = 1-scipy.linalg.block_diag(
            *list(
                numpy.ones((blocksize, blocksize))
                for blocksize in cluster_sizes))
        print(diagonalization)
        assert(numpy.allclose(
                diagonalization*mask,
                0)) # Assert that the matrices are co-diagonalizable
        blocks = numpy.cumsum(cluster_sizes)
        start = 0
        other_part = []
        for block in blocks:
            other_part.append(
                simultaneously_diagonalize(
                    diagonalization[1:, start:block, start:block]))
            start = block
        return numpy.vstack(
            (diagonalization[0].diagonal(axis1=-2, axis2=-1),
             numpy.hstack(other_part)))
If you know something about the size of the eigenvalues of the two matrices in advance, you can diagonalize a linear combination of the two matrices, with coefficients chosen to break the degeneracy. For example, if the eigenvalues of both lie between -10 and 10, you could diagonalize 100*M1 + M2. There's a slight loss of precision, but for many purposes it's good enough--and quick and easy!
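As a rough sketch of this idea applied to the example from the question, assuming the matrices are symmetric (so numpy.linalg.eigh applies) and using the factor 100 suggested above (the names combo, eigs and joint are just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Build two simultaneously diagonalizable symmetric matrices, as in the question
U = np.linalg.svd(rng.random((3, 3)))[2]
D = np.array([np.diag([2.0, 2.0, 0.0]), np.diag([1.0, -1.0, 1.0])])
M = np.einsum("ij,ajk,lk->ail", U, D, U)

# Eigenvalues lie in [-10, 10], so 100*M1 + M2 has no repeated eigenvalues here
combo = 100 * M[0] + M[1]
_, V = np.linalg.eigh(combo)  # eigh assumes a symmetric matrix

# Apply the shared eigenvectors to every matrix (V.T @ M[a] @ V) and read off the diagonals
eigs = np.einsum("ji,ajk,kl->ail", V, M, V)
joint = np.stack([np.diag(E) for E in eigs], axis=1)
print(joint)
```

Each row of joint holds the eigenvalues of all matrices for one shared eigenvector, ordered by the eigenvalues of the combined matrix.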