scipy's sparse eigs yielding wrong eigenvectors - python

I am a bit confused regarding the following issue:
I am computing fixed points of a quantum channel, which means I want to compute the leading eigenvector of a specific matrix. The matrix has dimension n^2 x n^2 and is defined in such a way that the leading eigenvector, reshaped to a matrix of shape n x n, is a positive matrix (self-adjoint with positive eigenvalues).
If I do this with scipy.sparse.linalg.eigs, however, I get wrong results. The exact computation (using scipy.linalg.eig) works fine. I tried playing around with the k and ncv arguments of the solver, but didn't get it working properly unless I set k=n**2, in which case eigs just falls back to eig. This, however, won't work in the case I actually have in mind, where the channel (super_op in the script below) is encoded as a LinearOperator. So I rely on using eigs :/
Does anybody have an idea how to get this to run properly?
Thanks to everybody in advance!
import numpy as np
from numpy.random import rand
from numpy import tensordot as td
from scipy.sparse.linalg import eigs
from scipy.linalg import eig
n = 16
d = 3
kraus_op = .5 - rand(n, d, n) + 1j * (.5 - rand(n, d, n))
super_op = td(kraus_op, kraus_op.conj(), [[1], [1]]).transpose(0, 2, 1, 3)
########
# Sparse
########
vals, vecs = eigs(super_op.reshape(n**2, n**2), k=n*(n-1), which='LM')
rho = vecs[:,0].reshape(n, n)
print('is self adjoint: ', np.allclose(rho, rho.conj().T))
super_op_times_rho = td(super_op, rho, [[2, 3], [0, 1]])
print('super_op(rho) == lambda * rho :', np.allclose(rho, super_op_times_rho/vals[0]))
########
# Exact
########
vals, vecs = eig(super_op.reshape(n**2, n**2))
rho = vecs[:,0].reshape(n, n)
print('is self adjoint: ', np.allclose(rho, rho.conj().T))
super_op_times_rho = td(super_op, rho, [[2, 3], [0, 1]])
print('super_op(rho) == lambda * rho :', np.allclose(rho, super_op_times_rho/vals[0]))
the result is:
is self adjoint: False
super_op(rho) == lambda * rho : True
is self adjoint: True
super_op(rho) == lambda * rho : True
For completeness:
Python 3.5.2
numpy 1.16.1
scipy 1.2.1

In the end I found the solution with some help from my colleagues:
While eig here returned the eigenvectors (1) sorted by eigenvalue magnitude and (2) such that the first entry is real, eigs uses a different ordering that need not match that of eig, and it does not fix the complex phase of the eigenvector. Correcting the phase is easily done by dividing the reshaped eigenvector by the phase of its first entry. This makes the first diagonal element real, which removes the freedom in the overall complex phase of an eigenvector and makes Hermiticity possible.
So the corrected code snippet for the sparse case is:
vals, vecs = eigs(super_op.reshape(n**2, n**2), k=n*(n-1), which='LM')
# find the index corresponding to the largest eigenvalue
arg = np.argmax(np.abs(vals))
rho = vecs[:,arg].reshape(n, n)
# regularize the output array
rho *= np.abs(rho[0, 0])/rho[0, 0]
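Since the real use case only has the channel as a LinearOperator, the same fix can be sketched without ever forming the n^2 x n^2 matrix. This is a minimal sketch, not the code from the question: the names apply_channel and fixed_point are made up for illustration, and dividing by the trace is an assumed variation on the phase fix above (it removes the phase and normalizes rho to unit trace in one step, playing the same role as dividing by the phase of rho[0, 0]).

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigs


def apply_channel(kraus_op, rho):
    """Apply rho -> sum_s K_s rho K_s^dagger, with K_s = kraus_op[:, s, :]."""
    return np.einsum('asb,bc,esc->ae', kraus_op, rho, kraus_op.conj())


def fixed_point(kraus_op):
    """Leading eigenpair of the channel, computed matrix-free with eigs."""
    n = kraus_op.shape[0]
    op = LinearOperator(
        (n**2, n**2),
        matvec=lambda v: apply_channel(kraus_op, v.reshape(n, n)).ravel(),
        dtype=complex,
    )
    # k=1 with which='LM' asks only for the eigenvalue of largest magnitude
    vals, vecs = eigs(op, k=1, which='LM')
    rho = vecs[:, 0].reshape(n, n)
    # An eigenvector is only defined up to a complex factor; fix it via the trace
    rho = rho / np.trace(rho)
    return vals[0], rho


rng = np.random.default_rng(0)
n, d = 6, 3
kraus_op = (.5 - rng.random((n, d, n))) + 1j * (.5 - rng.random((n, d, n)))
val, rho = fixed_point(kraus_op)
print('is self adjoint: ', np.allclose(rho, rho.conj().T))
print('channel(rho) == lambda * rho :',
      np.allclose(apply_channel(kraus_op, rho), val * rho))
```

Asking for k=1 avoids having to search the returned eigenvalues for the largest one; if more eigenpairs are requested, the np.argmax(np.abs(vals)) step from the snippet above is still needed.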

Related

Machine Learning - implementing a Gradient Descent in Python from Octave code

I am trying to implement a gradient descent function in Python from scratch, which I have already implemented and got working in GNU Octave. Unfortunately I am stuck. I fiddled with it for a while and checked the NumPy documentation, but so far no luck.
I am aware of libraries such as scikit-learn, however my purpose is to learn to code such a function from scratch. Perhaps I am going about it the wrong way.
Below you will find all the code necessary to reproduce the error.
Thanks in advance for your help.
Actual result: test fails with error -> "ValueError: matmul: Input operand 0 does not have enough dimensions (has 0, gufunc core with signature (n?,k),(k,m?)->(n?,m?) requires 1)"
Expected result: an array with values [5.2148, -0.5733]
Function gradientDescent() in Octave:
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
    m = length(y); % number of training examples
    J_history = zeros(num_iters, 1);
    for iter = 1:num_iters
        theta = theta - (alpha/m)*X'*(X*theta-y);
        J_history(iter) = computeCost(X, y, theta);
    end
Function gradient_descent() in python:
from numpy import zeros
def compute_cost(X, y, theta):
    m = len(y)
    ans = (X.T @ theta).T - y
    J = (ans @ ans.T) / (2 * m)
    return J[0, 0]
def gradient_descent(X, y, theta, alpha, num_iters):
    m = len(y)
    J_history = zeros((num_iters, 1), dtype=int)
    for iter in range(num_iters):
        theta = theta - (alpha / m) @ X.T @ (X @ theta - y)
        J_history[iter] = compute_cost(X, y, theta)
    return theta
the test file: test_ml_utils.py
import unittest
import numpy as np
from ml.ml_utils import compute_cost, gradient_descent
class TestGradientDescent(unittest.TestCase):
    # TODO: implement tests for Gradient Descent function
    # [theta J_hist] = gradientDescent([1 5; 1 2; 1 4; 1 5],[1 6 4 2]',[0 0]',0.01,1000);
    def test_gradient_descent_00(self):
        X = np.array([[1, 5], [1, 2], [1, 4], [1, 5]])
        y = np.array([1, 6, 4, 2])
        theta = np.zeros(2)
        alpha = 0.01
        num_iter = 1000
        r_theta = np.array([5.2148, -0.5733])
        result = gradient_descent(X, y, theta, alpha, num_iter)
        self.assertEqual((round(result, 4), r_theta), 'Result is wrong!')
if __name__ == '__main__':
    unittest.main()
The __matmul__ operator @ in Python binds more tightly than -. That means you're trying to do matrix multiplication with the operands (alpha / m), which is a scalar, and X.T, which is a matrix. See operator precedence.
In the Octave code, (alpha/m) * X' is doing scalar multiplication, not matrix multiplication, so if you want the same behavior in Python, use * rather than @. This is because Octave overloads the * operator to perform scalar multiplication if one operand is a scalar, but matrix multiplication if both operands are matrices.
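Putting that together, a corrected version might look like the following. This is a sketch under a few assumptions that go beyond the fix itself: y and theta are kept as rank-1 arrays (as in the unit test), J_history is stored as float rather than int so the cost is not truncated, and the function returns J_history alongside theta to mirror the Octave signature.

```python
import numpy as np


def compute_cost(X, y, theta):
    """Squared-error cost J = sum((X @ theta - y)**2) / (2 * m)."""
    m = len(y)
    residual = X @ theta - y
    return (residual @ residual) / (2 * m)


def gradient_descent(X, y, theta, alpha, num_iters):
    """Batch gradient descent; X is (m, n), y is (m,), theta is (n,)."""
    m = len(y)
    J_history = np.zeros(num_iters)  # float, so the cost is not truncated
    for it in range(num_iters):
        # (alpha / m) is a scalar, so it is combined with *;
        # only the matrix-vector products use @
        theta = theta - (alpha / m) * (X.T @ (X @ theta - y))
        J_history[it] = compute_cost(X, y, theta)
    return theta, J_history


X = np.array([[1, 5], [1, 2], [1, 4], [1, 5]])
y = np.array([1, 6, 4, 2])
theta, J_history = gradient_descent(X, y, np.zeros(2), 0.01, 1000)
print(np.round(theta, 4))
```

In the test, comparing with np.allclose(result, r_theta, atol=1e-4) is more robust than rounding and using assertEqual on arrays.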
Adding to Adam's answer (which is correct with regard to the error you're getting):
More generally, this code is meaningless to a reader without some sort of hint (whether programmatic or in the form of a comment) about what dimensions the different variables take.
E.g., there is a hint in the code that y is likely to be 2-dimensional, and you're using len to get its size. Just as an example of how this might fail silently, consider this:
>>> y = numpy.array([[1,2,3,4,5]])
>>> len( y )
1
whereas presumably you want
>>> numpy.shape( y )
(1, 5)
or
>>> numpy.size( y )
5
I note in your unit test that you're passing a rank 1 vector instead of rank 2, so it turns out y is 1D instead of 2D, but operates with X which is 2D due to broadcasting. Your code therefore works despite the logic implied, but in the absence of making such things explicit, this is a runtime error waiting to happen.
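To make the "runtime error waiting to happen" concrete, here is a small sketch of how mixing a rank-1 y with a column-vector theta broadcasts silently to the wrong shape. The helper check_shapes is hypothetical, just one possible way of making the expected dimensions explicit.

```python
import numpy as np

X = np.array([[1, 5], [1, 2], [1, 4], [1, 5]])   # (m, n) = (4, 2)
theta = np.zeros((2, 1))                          # (n, 1) column vector
y_flat = np.array([1, 6, 4, 2])                   # (m,)   rank 1
y_col = y_flat.reshape(-1, 1)                     # (m, 1) column vector

# Intended: an (m, 1) residual
good = X @ theta - y_col
# Silent failure: (m, 1) minus (m,) broadcasts to (m, m)
bad = X @ theta - y_flat

print(good.shape)
print(bad.shape)


def check_shapes(X, y, theta):
    """Hypothetical helper: fail loudly instead of broadcasting silently."""
    m, n = X.shape
    assert y.shape == (m, 1), f"y must be ({m}, 1), got {y.shape}"
    assert theta.shape == (n, 1), f"theta must be ({n}, 1), got {theta.shape}"
```

Calling check_shapes at the top of gradient_descent (or documenting the shapes in the docstring) turns the silent broadcast into an immediate, readable error.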

Computing covariance matrix of complex array with defined function is not matching while comparing with np.cov

I am trying to write a simple covariance matrix function in Python.
import numpy as np
def manual_covariance(x):
    mean = x.mean(axis=1)
    print(x.shape[1])
    cov = np.zeros((len(x), len(x)), dtype='complex64')
    for i in range(len(mean)):
        for k in range(len(mean)):
            s = 0
            for j in range(len(x[1])):  # 5 Col
                s += np.dot((x[i][j] - mean[i]), (x[k][j] - mean[i]))
            cov[i, k] = s / ((x.shape[1]) - 1)
    return cov
With this function if I compute the covariance of:
A = np.array([[1, 2], [1, 5]])
man_cov = manual_covariance(A)
num_cov = np.cov(A)
My answer matches with the np.cov(), and there is no problem. But, when I use complex number instead, my answer does not match with np.cov()
A = np.array([[1+1j, 1+2j], [1+4j, 5+5j]])
man_cov = manual_covariance(A)
num_cov = np.cov(A)
Manual result:
[[-0.5+0.j -0.5+2.j]
[-0.5+2.j 7.5+4.j]]
Numpy cov result:
[[0.5+0.j 0.5+2.j]
[0.5-2.j 8.5+0.j]]
I have tried printing every statement, to check where it can go wrong, but I am not able to find a fault.
It is because the dot product of two complex vectors z1 and z2 is defined as z1 · z2*, where * means conjugation. If you use s += np.dot((x[i,j] - mean[i]), np.conj(x[k,j] - mean[i])) you should get the correct result, where we have used NumPy's conjugate function.
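The same idea can be written without the explicit loops. The following is a sketch rather than the code from the question: it centers each row once and conjugates the second factor, which is an assumed vectorized rewrite of manual_covariance.

```python
import numpy as np


def manual_covariance(x):
    """Covariance of the rows of x; the second factor is conjugated."""
    n_obs = x.shape[1]
    centered = x - x.mean(axis=1, keepdims=True)
    # entry (i, k) is sum_j centered[i, j] * conj(centered[k, j]) / (n_obs - 1)
    return centered @ centered.conj().T / (n_obs - 1)


A = np.array([[1 + 1j, 1 + 2j], [1 + 4j, 5 + 5j]])
print(manual_covariance(A))
print(np.allclose(manual_covariance(A), np.cov(A)))
```

For real input the conjugation is a no-op, so the same function covers both of the cases in the question.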

How to apply crank-nicolson method in python to a wave equation like schrodinger's

I'm trying to do a particle-in-a-box simulation with no potential field. It took me some time to find out that simple explicit and implicit methods break unitary time evolution, so I resorted to Crank-Nicolson, which is supposed to be unitary. But when I try it, I find that it still is not. I'm not sure what I'm missing. The formulation I used is this:
[equation image not preserved]
where T is the tridiagonal Toeplitz matrix for the second derivative wrt x and
[definition not preserved]
The system simplifies to (as implemented in the code below)
A \psi^{n+1} = B \psi^{n}
The A and B matrices are tridiagonal: A has 1 + 2\alpha on the diagonal and -\alpha on the off-diagonals, B has 1 - 2\alpha on the diagonal and \alpha on the off-diagonals.
I just solve this linear system for the wavefunction at the next time step using the sparse module. The math makes sense and I found the same numeric scheme in some papers, so that led me to believe my code is where the problem is.
Here's my code so far:
import numpy as np
import matplotlib.pyplot as plt
from scipy.linalg import toeplitz
from scipy.sparse.linalg import spsolve
from scipy import sparse
# Spatial discretisation
N = 100
x = np.linspace(0, 1, N)
dx = x[1] - x[0]
# Time discretisation
K = 10000
t = np.linspace(0, 10, K)
dt = t[1] - t[0]
alpha = (1j * dt) / (2 * (dx ** 2))
A = sparse.csc_matrix(toeplitz([1 + 2 * alpha, -alpha, *np.zeros(N-4)]), dtype=np.cfloat) # 2 less for both boundaries
B = sparse.csc_matrix(toeplitz([1 - 2 * alpha, alpha, *np.zeros(N-4)]), dtype=np.cfloat)
# Initial and boundary conditions (localized gaussian)
psi = np.exp((1j * 50 * x) - (200 * (x - .5) ** 2))
b = B.dot(psi[1:-1])
psi[0], psi[-1] = 0, 0
for index, step in enumerate(t):
    # Within the domain
    psi[1:-1] = spsolve(A, b)
    # Enforce boundaries
    # psi[0], psi[N - 1] = 0, 0
    b = B.dot(psi[1:-1])
    # Square integration to show if it's unitary
    print(np.trapz(np.abs(psi) ** 2, dx=dx))
You are relying on the Toeplitz constructor to produce a symmetric matrix, so that the entries below the diagonal are the same as those above the diagonal. However, the documentation for scipy.linalg.toeplitz(c, r=None) does not say "transpose", but
"If r is not given, r == conjugate(c) is assumed."
so that the resulting matrix is self-adjoint. Since alpha is purely imaginary, this means that the entries above the diagonal have their sign switched.
It makes no sense to first construct a dense matrix and then extract a sparse representation. Construct it as a sparse tridiagonal matrix from the start, using scipy.sparse.diags:
A = sparse.diags([(N-3)*[-alpha], (N-2)*[1+2*alpha], (N-3)*[-alpha]], [-1, 0, 1], format="csc")
B = sparse.diags([(N-3)*[ alpha], (N-2)*[1-2*alpha], (N-3)*[ alpha]], [-1, 0, 1], format="csc")
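To check that the symmetric matrices do conserve the norm, the time loop can be sketched as below. Several details are assumptions made for illustration and are not in the original post: the step size dt, the number of steps, factorizing A once with splu instead of calling spsolve every step, and measuring the norm as a plain sum times dx (which equals the trapezoidal rule here because psi vanishes at both boundaries).

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import splu

N = 100
x = np.linspace(0, 1, N)
dx = x[1] - x[0]
dt = 1e-3
alpha = 1j * dt / (2 * dx**2)

# Symmetric (not self-adjoint) tridiagonal matrices on the N - 2 interior points
off = np.full(N - 3, alpha)
A = sparse.diags([-off, np.full(N - 2, 1 + 2 * alpha), -off], [-1, 0, 1], format="csc")
B = sparse.diags([off, np.full(N - 2, 1 - 2 * alpha), off], [-1, 0, 1], format="csc")

# Localized Gaussian wave packet with Dirichlet boundaries
psi = np.exp(1j * 50 * x - 200 * (x - .5) ** 2)
psi[0], psi[-1] = 0, 0

solve = splu(A).solve    # A is constant, so factorize it once
norm_start = np.sum(np.abs(psi) ** 2) * dx
for _ in range(500):
    psi[1:-1] = solve(B.dot(psi[1:-1]))
norm_end = np.sum(np.abs(psi) ** 2) * dx

print(norm_start, norm_end)
```

With A = I - alpha*T and B = I + alpha*T for a real symmetric T and purely imaginary alpha, the update is a Cayley transform, so the two printed norms should agree up to round-off.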

Conflict between vectorization/broadcasting and solving an ODE with solve_ivp

Using a NumPy array and vectorization, I'm trying to create a population of n different individuals, with each individual having three properties: alpha, beta, and phenotype (the phenotype being calculated as the steady state of a differential equation that involves alpha and beta). So, I want each individual to have its own phenotype.
However, my code produces the same phenotype for every individual. Moreover, this unwanted behavior only occurs if there happen to be exactly n entries in solve_ivp's y0 array (which here is [0, 1]) -- otherwise, a broadcasting error is produced:
ValueError: operands could not be broadcast together with shapes (2,) (3,)
Here's the code:
import numpy as np
from scipy.integrate import solve_ivp
def create_population(n):
    """creates a population of n individuals"""
    pop = np.zeros(n, dtype=[('alpha','<f8'),('beta','<f8'),('phenotype','<f8')])
    pop['alpha'] = np.random.randn(n)
    pop['beta'] = np.random.randn(n) + 5
    def phenotype(n):
        """creates the phenotype"""
        def pheno_ode(t_ode, y):
            """defines the ode for the phenotype"""
            dydt = 0.123 - y + pop['alpha'] * (y ** pop['beta'] / (1 + y ** pop['beta']))
            return dydt
        t_end = 1e06
        sol = solve_ivp(pheno_ode, [0, t_end], [0, 1], method='BDF')
        return sol.y[0][-1] # last entry is assumed to be the steady state
    pop['phenotype'] = phenotype(n)
    return pop
popul = create_population(3)
print(popul)
In contrast, if the phenotype is calculated from alpha and beta via a "simple" equation, then vectorization works fine:
def phenotype(n):
    """creates the phenotype"""
    phenotype_simple = 2 * pop['alpha'] + pop['beta']
    return phenotype_simple
There are two problems that I can see:
First, you have the initial condition for the ODE set to [0, 1]. That sets the size of the vector solution for solve_ivp to 2, regardless of the value of n. However, the arrays pop['alpha'] and pop['beta'] have length n, and in your script, you call create_population with n set to 3. So you have a mismatch in the array shapes in the formula for dydt: y has length 2, but pop['alpha'] and pop['beta'] have length 3. That causes the error that you see.
You can fix this by using, say, np.ones(n) instead of [0, 1] as the initial condition in your call to solve_ivp.
The second problem is in the statement return sol.y[0][-1] in the function phenotype(n). sol.y has shape (n, num_points), where num_points is the number of points computed by solve_ivp. So sol.y[0] is just the first component of the solution, and sol.y[0][-1] is the last value of the solution for the first component. It is a scalar, so when you execute pop['phenotype'] = phenotype(n), you are assigning the same value (the steady state of the first component) to all the phenotypes.
The return statement should be return sol.y[:, -1]. That returns the last column of the solution array (i.e. all the steady state phenotypes).
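Both fixes together might look like the sketch below. It is not the original script: the function steady_state_phenotypes, the fixed alpha and beta values (chosen so the run is reproducible) and the shorter t_end are assumptions made for illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp


def steady_state_phenotypes(alpha, beta, t_end=1e3):
    """Integrate one ODE per individual and return the final state of each."""
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)

    def pheno_ode(t_ode, y):
        # y, alpha and beta all have length n, so this is elementwise
        return 0.123 - y + alpha * (y ** beta / (1 + y ** beta))

    y0 = np.ones(alpha.size)          # one initial condition per individual
    sol = solve_ivp(pheno_ode, [0, t_end], y0, method='BDF')
    return sol.y[:, -1]               # last column: final value of every component


alpha = np.array([0.5, 3.0, -1.0])
beta = np.array([4.0, 5.0, 6.0])
phenotype = steady_state_phenotypes(alpha, beta)
print(phenotype)
```

Because the result has one entry per individual, it can be assigned directly to pop['phenotype'] in create_population.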

Artefacts from Riemann sum in scipy.signal.convolve

Short summary: How do I quickly calculate the finite convolution of two arrays?
Problem description
I am trying to obtain the finite convolution of two functions f(x), g(x) defined by
(f * g)(x) = \int_0^x f(t) \, g(x - t) \, dt
To achieve this, I have taken discrete samples of the functions and turned them into arrays of length steps:
xarray = [x * i / steps for i in range(steps)]
farray = [f(x) for x in xarray]
garray = [g(x) for x in xarray]
I then tried to calculate the convolution using the scipy.signal.convolve function. This function gives the same results as the algorithm conv suggested here. However, the results differ considerably from analytical solutions. Modifying the algorithm conv to use the trapezoidal rule gives the desired results.
To illustrate this, I let
f(x) = exp(-x)
g(x) = 2 * exp(-2 * x)
the results are:
Here Riemann represents a simple Riemann sum, trapezoidal is a modified version of the Riemann algorithm to use the trapezoidal rule, scipy.signal.convolve is the scipy function and analytical is the analytical convolution.
Now let g(x) = x^2 * exp(-x) and the results become:
Here 'ratio' is the ratio of the values obtained from scipy to the analytical values. The above demonstrates that the problem cannot be solved by renormalising the integral.
The question
Is it possible to use the speed of scipy but retain the better results of a trapezoidal rule or do I have to write a C extension to achieve the desired results?
An example
Just copy and paste the code below to see the problem I am encountering. The two results can be brought to closer agreement by increasing the steps variable. I believe that the problem is due to artefacts from right hand Riemann sums because the integral is overestimated when it is increasing and approaches the analytical solution again as it is decreasing.
EDIT: I have now included the original algorithm 2 as a comparison which gives the same results as the scipy.signal.convolve function.
import numpy as np
import scipy.signal as signal
import matplotlib.pyplot as plt
import math
def convolveoriginal(x, y):
    '''
    The original algorithm from http://www.physics.rutgers.edu/~masud/computing/WPark_recipes_in_python.html.
    '''
    P, Q, N = len(x), len(y), len(x) + len(y) - 1
    z = []
    for k in range(N):
        t, lower, upper = 0, max(0, k - (Q - 1)), min(P - 1, k)
        for i in range(lower, upper + 1):
            t = t + x[i] * y[k - i]
        z.append(t)
    return np.array(z) #Modified to include conversion to numpy array
def convolve(y1, y2, dx = None):
    '''
    Compute the finite convolution of two signals of equal length.
    @param y1: First signal.
    @param y2: Second signal.
    @param dx: [optional] Integration step width.
    @note: Based on the algorithm at http://www.physics.rutgers.edu/~masud/computing/WPark_recipes_in_python.html.
    '''
    P = len(y1) #Determine the length of the signal
    z = [] #Create a list of convolution values
    for k in range(P):
        t = 0
        lower = max(0, k - (P - 1))
        upper = min(P - 1, k)
        for i in range(lower, upper):
            t += (y1[i] * y2[k - i] + y1[i + 1] * y2[k - (i + 1)]) / 2
        z.append(t)
    z = np.array(z) #Convert to a numpy array
    if dx is not None: #Is a step width specified?
        z *= dx
    return z
steps = 50 #Number of integration steps
maxtime = 5 #Maximum time
dt = float(maxtime) / steps #Obtain the width of a time step
time = [dt * i for i in range (steps)] #Create an array of times
exp1 = [math.exp(-t) for t in time] #Create an array of function values
exp2 = [2 * math.exp(-2 * t) for t in time]
#Calculate the analytical expression
analytical = [2 * math.exp(-2 * t) * (-1 + math.exp(t)) for t in time]
#Calculate the trapezoidal convolution
trapezoidal = convolve(exp1, exp2, dt)
#Calculate the scipy convolution
sci = signal.convolve(exp1, exp2, mode = 'full')
#Slice the first half to obtain the causal convolution and multiply by dt
#to account for the step width
sci = sci[0:steps] * dt
#Calculate the convolution using the original Riemann sum algorithm
riemann = convolveoriginal(exp1, exp2)
riemann = riemann[0:steps] * dt
#Plot
plt.plot(time, analytical, label = 'analytical')
plt.plot(time, trapezoidal, 'o', label = 'trapezoidal')
plt.plot(time, riemann, 'o', label = 'Riemann')
plt.plot(time, sci, '.', label = 'scipy.signal.convolve')
plt.legend()
plt.show()
Thank you for your time!
Or, for those who prefer numpy to C: it will be slower than the C implementation, but it's just a few lines.
>>> t = np.linspace(0, maxtime-dt, 50)
>>> fx = np.exp(-np.array(t))
>>> gx = 2*np.exp(-2*np.array(t))
>>> analytical = 2 * np.exp(-2 * t) * (-1 + np.exp(t))
This looks like the trapezoidal rule in this case (but I didn't check the math):
>>> s2a = signal.convolve(fx[1:], gx, 'full')*dt
>>> s2b = signal.convolve(fx, gx[1:], 'full')*dt
>>> s = (s2a+s2b)/2
>>> s[:10]
array([ 0.17235682, 0.29706872, 0.38433313, 0.44235042, 0.47770012,
0.49564748, 0.50039326, 0.49527721, 0.48294359, 0.46547582])
>>> analytical[:10]
array([ 0. , 0.17221333, 0.29682141, 0.38401317, 0.44198216,
0.47730244, 0.49523485, 0.49997668, 0.49486489, 0.48254154])
largest absolute error:
>>> np.max(np.abs(s[:len(analytical)-1] - analytical[1:]))
0.00041657780840698155
>>> np.argmax(np.abs(s[:len(analytical)-1] - analytical[1:]))
6
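For comparison, the trapezoidal rule from the question's convolve function can be reproduced from a single scipy.signal.convolve call. The trapezoidal sum for index k is the plain sum over i = 0..k of f[i] * g[k - i] with the two end terms weighted by 1/2, so it is enough to subtract half of each end term from the plain result. The helper name trapezoidal_convolve is made up for this sketch, and it assumes both arrays have equal length and a uniform step dx.

```python
import numpy as np
import scipy.signal as signal


def trapezoidal_convolve(f, g, dx):
    """Finite (causal) convolution of equally long samples, trapezoidal rule."""
    f, g = np.asarray(f, dtype=float), np.asarray(g, dtype=float)
    n = len(f)
    # Plain sum over i = 0..k of f[i] * g[k - i], i.e. the Riemann-type sum
    full = signal.convolve(f, g, mode='full')[:n]
    # The trapezoidal rule weights the two end points of each sum by 1/2
    return (full - 0.5 * (f[0] * g + f * g[0])) * dx


steps, maxtime = 50, 5
dt = maxtime / steps
t = dt * np.arange(steps)
fx = np.exp(-t)
gx = 2 * np.exp(-2 * t)
analytical = 2 * np.exp(-2 * t) * (np.exp(t) - 1)

trapezoidal = trapezoidal_convolve(fx, gx, dt)
riemann = signal.convolve(fx, gx, mode='full')[:steps] * dt
print(np.max(np.abs(trapezoidal - analytical)))
print(np.max(np.abs(riemann - analytical)))
```

This keeps the heavy lifting inside scipy, so no Python-level double loop is needed; the correction term costs only two elementwise products.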
Short answer: Write it in C!
Long answer
Using the cookbook about numpy arrays I rewrote the trapezoidal convolution method in C. In order to use the C code one requires three files (https://gist.github.com/1626919)
The C code (performancemodule.c).
The setup file to build the code and make it callable from python (performancemodulesetup.py).
The python file that makes use of the C extension (performancetest.py)
The code should run upon downloading by doing the following
Adjust the include path in performancemodule.c.
Run the following
python performancemodulesetup.py build
python performancetest.py
You may have to copy the library file performancemodule.so or performancemodule.dll into the same directory as performancetest.py.
Results and performance
The results agree neatly with one another as shown below:
The performance of the C method is even better than scipy's convolve method. Running 10k convolutions with array length 50 requires
convolve (pure Python): 81.349969 s
scipy.signal.convolve: 1.962599 s
convolve in C: 0.087024 s
Thus, the C implementation is about 1000 times faster than the python implementation and a bit more than 20 times as fast as the scipy implementation (admittedly, the scipy implementation is more versatile).
EDIT: This does not solve the original question exactly but is sufficient for my purposes.
