I am trying to create a 10-dimensional convex function. I know that the eigenvalues of its Hessian matrix must be non-negative for the function to be convex. I am doing the following to find the Hessian matrix, but its input is an array, and I don't know how to represent a function as an array.
import numpy as np

def hessian(x):
    """
    Calculate the Hessian matrix with finite differences.
    Parameters:
    - x : ndarray
    Returns:
    an array of shape (x.ndim, x.ndim) + x.shape
    where array[i, j, ...] corresponds to the second derivative x_ij
    """
    x_grad = np.gradient(x)
    hessian = np.empty((x.ndim, x.ndim) + x.shape, dtype=x.dtype)
    for k, grad_k in enumerate(x_grad):
        # iterate over dimensions:
        # apply gradient again to every component of the first derivative
        tmp_grad = np.gradient(grad_k)
        for l, grad_kl in enumerate(tmp_grad):
            hessian[k, l, ...] = grad_kl
    return hessian
x = np.random.randn(100, 100)
t = hessian(x)
As stated in the question you got this code from, x is the value of the function at the nodes of an evenly spaced mesh in parameter space, not the function itself. If it's not possible to calculate the Hessian of your function analytically, you have to use finite difference formulas like the ones in this example code.
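To make this concrete, here is a minimal sketch (my own example, not from the original thread) that evaluates a simple convex function on an evenly spaced 2-D mesh and checks the eigenvalues of its finite-difference Hessian at one interior node, using the hessian function above:
import numpy as np

# f(x, y) = x**2 + y**2 is convex; evaluate it on an evenly spaced 2-D mesh.
xs = np.linspace(-1.0, 1.0, 100)
X, Y = np.meshgrid(xs, xs, indexing="ij")
f_vals = X**2 + Y**2

H = hessian(f_vals)  # shape (2, 2, 100, 100)

# np.gradient assumes unit spacing, so divide by dx**2 to express the
# second derivatives in the original coordinates, then check one node.
dx = xs[1] - xs[0]
H_node = H[:, :, 50, 50] / dx**2
print(np.linalg.eigvalsh(H_node))  # both eigenvalues ~2, i.e. positive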
I am using a function to calculate a likelihood density. I am passing in two x's, which are vectors of length 7.
import numpy as np
from scipy.stats import multivariate_normal

def lhd(x0, x1, dt):
    """Calculate the likelihood density given two values."""
    d = len(x0)  # Save the length of the inputs for the pdf call below.
    print(d)
    print(len(x1))
    # Take the pdf of x1 under a multivariate normal centred at (1 - dt) * x0.
    lh = multivariate_normal.pdf(x1, mean=(1 - dt) * x0, cov=2 * dt * np.identity(d))
    return lh  # Return this pdf value.
The mean here is a vector of length 7, and the covariance is a (7,7) array.
When I run this, I get the error
ValueError: Array 'mean' must be a vector of length 49.
but looking at the formula of the pdf I do not think this is correct. Any idea what is going wrong here?
If dt is a (7, 7) array, then (1 - dt) is also (7, 7). The * operator in (1 - dt) * x0 is element-wise multiplication, so if x0 is a vector of length 7 the result broadcasts to a (7, 7) array with 49 elements, hence the complaint about a mean of length 49. I guess you meant to use matrix multiplication; you can do that with x0 - dt @ x0 (where @ denotes the matrix multiplication operator).
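A quick standalone sketch of the shape difference (the values here are made up for illustration):
import numpy as np

x0 = np.ones(7)
dt = np.full((7, 7), 0.1)

elementwise = (1 - dt) * x0  # broadcasts to shape (7, 7)
matmul = x0 - dt @ x0        # stays a vector of shape (7,)

print(elementwise.shape)  # (7, 7)
print(matmul.shape)       # (7,)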
I generate a correlation matrix by drawing from a uniform distribution:
corr = np.random.uniform(low=0.1, high=0.3, size=[20, 20])
and set the main diagonal elements to one
corr[np.diag_indices_from(corr)] = 1
Then, I make the correlation matrix symmetric
corr[np.triu_indices(n=20, k=1)] = corr.T[np.triu_indices(n=20, k=1)]
which yields a matrix whose entries are all strictly positive.
According to NumPy, however, the matrix is not positive (semi-)definite:
np.all(np.linalg.eigvals(corr) >= 0)
False
Having all entries positive still doesn't guarantee that the matrix is PSD.
I will give you two easy ways:
Any square non-singular matrix can be used to create a PSD (in fact, positive definite) matrix with
A = A @ A.T
Any matrix can be used to produce a PSD matrix with
A = (A + A.T)/2
A = A - np.eye(len(A)) * (np.min(np.linalg.eigvalsh(A)) - 0.001)
If you want the minimum perturbation of a symmetric matrix (the least-squares projection onto the positive semidefinite cone), clip the negative eigenvalues:
w, v = np.linalg.eigh(A)
A_ = (v * np.maximum(w, 0.01)) @ v.T
print(np.linalg.eigvalsh(A_))
Notice that I am giving a margin of 0.01; if I used exactly zero, your test could still fail due to numerical errors.
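Putting the projection together with the correlation matrix from the question (a minimal sketch, reusing the 0.01 margin from above):
import numpy as np

corr = np.random.uniform(low=0.1, high=0.3, size=[20, 20])
corr[np.diag_indices_from(corr)] = 1
corr = (corr + corr.T) / 2  # symmetrize first; eigh assumes a symmetric input

# Clip the eigenvalues from below and rebuild the matrix.
w, v = np.linalg.eigh(corr)
corr_psd = (v * np.maximum(w, 0.01)) @ v.T

print(np.all(np.linalg.eigvalsh(corr_psd) >= 0))  # True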
I've got a 2x2 matrix defined by the variables J00, J01, J10, J11 coming in from other inputs. Since the matrix is small, I was able to compute the spectral norm by first computing the trace and determinant
J_T = tf.reduce_sum([J00, J11])
J_ad = tf.reduce_prod([J00, J11])
J_cb = tf.reduce_prod([J01, J10])
J_det = tf.reduce_sum([J_ad, -J_cb])
and then solving the quadratic
L1 = J_T/2.0 + tf.sqrt(J_T**2/4.0 - J_det)
L2 = J_T/2.0 - tf.sqrt(J_T**2/4.0 - J_det)
spectral_norm = tf.maximum(L1, L2)
This works, but it looks rather ugly and it isn't generalizable to larger matrices. Is there a cleaner way (maybe a method call that I'm missing) to compute spectral_norm?
The spectral norm of a matrix J equals the largest singular value of the matrix.
Therefore you can use tf.svd() to perform the singular value decomposition, and take the largest singular value:
spectral_norm = tf.svd(J, compute_uv=False)[..., 0]
where J is your matrix.
Notes:
I use compute_uv=False since we are interested only in singular values, not singular vectors.
J does not need to be square.
This solution works also for the case where J has any number of batch dimensions (as long as the two last dimensions are the matrix dimensions).
The ellipsis (...) indexing works as in NumPy.
I take the 0 index because tf.svd returns the singular values in descending order, so the largest one comes first.
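A minimal standalone sketch (assuming TensorFlow 2, where the function lives at tf.linalg.svd; in TF 1 it is tf.svd as above):
import tensorflow as tf

J = tf.constant([[1.0, 2.0],
                 [3.0, 4.0]])

# Largest singular value = spectral norm; compute_uv=False skips the
# singular vectors, and index 0 picks the largest (descending order).
spectral_norm = tf.linalg.svd(J, compute_uv=False)[..., 0]
print(spectral_norm)  # ~5.4650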
In the Python code I'm currently developing there is a particular function that really needs speed optimization.
As a first approximation I would like to focus on pure Python code (no C or Cython implementations).
The function generates a series of Gaussian curves whose sigma varies with position along the x-axis. It takes three arguments:
x0, 1d numpy array, central values of the Gaussian curves
h, 1d numpy array, heights of the Gaussian curves
x, 1d numpy array, points at which the total sum is evaluated
My goal is to obtain the sum of all the curves in the fastest way possible (it is a sort of convolution with a Gaussian curve that has a position-dependent sigma).
At the moment my code is:
sigs = get_sigmas(x0)  # function that returns the value of sigma at each position
all_gauss_args = -0.5 * np.power((x[:, np.newaxis] - x0[np.newaxis, :]) /
                                 sigs[np.newaxis, :], 2.0)
total = (1.0 / (np.sqrt(2 * np.pi) * sigs[np.newaxis, :])) * np.exp(all_gauss_args) * \
        h[np.newaxis, :]
total = np.sum(total, axis=1)
return total
Is it possible to make it faster?
Thanks in advance for the help.
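For reference, here is a self-contained version of the computation above; get_sigmas is undefined in the question, so the linear ramp below is just a hypothetical stand-in:
import numpy as np

def get_sigmas(x0):
    # Hypothetical stand-in: sigma grows linearly with distance from the origin.
    return 0.5 + 0.1 * np.abs(x0)

def gauss_sum(x0, h, x):
    sigs = get_sigmas(x0)
    # Broadcast to a (len(x), len(x0)) grid of exponent arguments.
    args = -0.5 * ((x[:, None] - x0[None, :]) / sigs[None, :]) ** 2
    # Normalized, height-scaled curves, summed over the x0 axis.
    curves = h[None, :] * np.exp(args) / (np.sqrt(2 * np.pi) * sigs[None, :])
    return curves.sum(axis=1)

x0 = np.linspace(-5.0, 5.0, 200)
h = np.random.rand(200)
x = np.linspace(-10.0, 10.0, 1000)
print(gauss_sum(x0, h, x).shape)  # (1000,)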
When I use the linear algebra module in SciPy to calculate the matrix logarithm of a Hermitian matrix, the matrix it outputs isn't Hermitian. I first define a vector using:
n = np.random.uniform(size=3) + 1j * np.random.uniform(size=3)
Then I define the corresponding Hermitian matrix:
N = np.outer(n, n.conj())
However, linalg.logm(N) doesn't return a Hermitian matrix. Why is this happening?
All but one of the eigenvalues of the random matrix are zero, since N is a rank-one outer product. A function of a matrix can be defined through its action on the eigenvalues, so I see why the logarithm has a problem here: log(0) is not defined. The function probably doesn't detect this and just returns garbage.
I guess you just need to make sure that your random Hermitian matrix has no zero eigenvalues.
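A minimal sketch of that fix (the 1e-6 regularization constant is my own choice): shift the matrix by a small multiple of the identity so every eigenvalue is strictly positive before taking the logarithm:
import numpy as np
from scipy.linalg import logm

n = np.random.uniform(size=3) + 1j * np.random.uniform(size=3)
N = np.outer(n, n.conj())

# The rank-one matrix N has two zero eigenvalues; shift them away from zero.
N_reg = N + 1e-6 * np.eye(3)

L = logm(N_reg)
print(np.allclose(L, L.conj().T))  # True, up to numerical precision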