Is there a method that I can call to create a random orthonormal matrix in python? Possibly using numpy? Or is there a way to create an orthonormal matrix using multiple numpy methods? Thanks.
Version 0.18 of scipy has scipy.stats.ortho_group and scipy.stats.special_ortho_group. The pull request where it was added is
For example,
In [24]: from scipy.stats import ortho_group # Requires version 0.18 of scipy
In [25]: m = ortho_group.rvs(dim=3)
In [26]: m
array([[-0.23939017, 0.58743526, -0.77305379],
[ 0.81921268, -0.30515101, -0.48556508],
[-0.52113619, -0.74953498, -0.40818426]])
In [27]: np.set_printoptions(suppress=True)
In [28]: m.dot(m.T)
array([[ 1., 0., -0.],
[ 0., 1., 0.],
[-0., 0., 1.]])
You can obtain a random n x n orthogonal matrix Q, (uniformly distributed over the manifold of n x n orthogonal matrices) by performing a QR factorization of an n x n matrix with elements i.i.d. Gaussian random variables of mean 0 and variance 1. Here is an example:
import numpy as np
from scipy.linalg import qr
n = 3
H = np.random.randn(n, n)
Q, R = qr(H)
print(Q.dot(Q.T))
[[ 1.00000000e+00 -2.77555756e-17 2.49800181e-16]
[ -2.77555756e-17 1.00000000e+00 -1.38777878e-17]
[ 2.49800181e-16 -1.38777878e-17 1.00000000e+00]]
EDIT: (Revisiting this answer after the comment by @g g.) The claim above, that the QR decomposition of a Gaussian matrix provides a uniformly distributed (over the so-called Stiefel manifold) orthogonal matrix, is suggested by Theorems 2.3.18-19 of this reference. Note that the statement of the result suggests a "QR-like" decomposition, however, with the triangular matrix R having positive diagonal elements.
Apparently, the qr function of scipy (numpy) does not guarantee positive diagonal elements for R, and the corresponding Q is actually not uniformly distributed. This has been observed in this monograph, Sec. 4.6 (the discussion refers to MATLAB, but I guess both MATLAB and scipy use the same LAPACK routines). It is suggested there that the matrix Q provided by qr be modified by post-multiplying it with a random unitary diagonal matrix.
Below I reproduce the experiment in the above reference, plotting the empirical distribution (histogram) of phases of eigenvalues of the "direct" Q matrix provided by qr, as well as the "modified" version, where it is seen that the modified version does indeed have a uniform eigenvalue phase, as would be expected from a uniformly distributed orthogonal matrix.
import numpy as np
import matplotlib.pyplot as plt
from scipy.linalg import qr, eigvals
from seaborn import distplot

n = 50
repeats = 10000

angles = []
angles_modified = []
for rp in range(repeats):
    H = np.random.randn(n, n)
    Q, R = qr(H)
    # eigenvalue phases of the Q returned directly by qr
    angles.append(np.angle(eigvals(Q)))
    # post-multiply Q by a random unitary diagonal matrix
    Q_modified = Q @ np.diag(np.exp(1j * np.pi * 2 * np.random.rand(n)))
    angles_modified.append(np.angle(eigvals(Q_modified)))

fig, ax = plt.subplots(1,2, figsize = (10,3))
distplot(np.asarray(angles).flatten(),kde = False, hist_kws=dict(edgecolor="k", linewidth=2), ax= ax[0])
ax[0].set(xlabel='phase', title='direct')
distplot(np.asarray(angles_modified).flatten(),kde = False, hist_kws=dict(edgecolor="k", linewidth=2), ax= ax[1])
ax[1].set(xlabel='phase', title='modified');
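For a real orthogonal matrix, a commonly used variant of the correction described above fixes the signs of the diagonal of R instead of drawing random phases. The following is only a minimal sketch under that assumption; the function name haar_orthogonal is made up for illustration and is not part of numpy or scipy.

```python
import numpy as np

def haar_orthogonal(n, rng=None):
    """Hypothetical helper: random n x n orthogonal matrix via sign-corrected QR."""
    rng = np.random.default_rng() if rng is None else rng
    H = rng.standard_normal((n, n))      # i.i.d. N(0, 1) entries
    Q, R = np.linalg.qr(H)
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0              # guard against an exactly zero diagonal entry
    # Scaling column j of Q by signs[j] corresponds to an R with positive diagonal.
    return Q * signs

Q = haar_orthogonal(4, np.random.default_rng(0))
print(np.allclose(Q @ Q.T, np.eye(4)))
```

Since H = QR = (QS)(SR) for S = diag(signs), the returned matrix is the Q factor of the QR decomposition whose R has a positive diagonal, which is the form the result quoted above asks for.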
This is the rvs method pulled from the, with minimal change - just enough to run as a stand alone numpy function.
import numpy as np

def rvs(dim=3):
    random_state = np.random
    H = np.eye(dim)
    D = np.ones((dim,))
    for n in range(1, dim):
        x = random_state.normal(size=(dim-n+1,))
        D[n-1] = np.sign(x[0])
        x[0] -= D[n-1]*np.sqrt((x*x).sum())
        # Householder transformation
        Hx = (np.eye(dim-n+1) - 2.*np.outer(x, x)/(x*x).sum())
        mat = np.eye(dim)
        mat[n-1:, n-1:] = Hx
        H = np.dot(H, mat)
    # Fix the last sign such that the determinant is 1
    D[-1] = (-1)**(1-(dim % 2))*D.prod()
    # Equivalent to np.dot(np.diag(D), H) but faster, apparently
    H = (D*H.T).T
    return H
It matches Warren's test,
An easy way to create any shape (n x m) orthogonal matrix:
import numpy as np
n, m = 3, 5
H = np.random.rand(n, m)
u, s, vh = np.linalg.svd(H, full_matrices=False)
mat = u @ vh
print(mat @ mat.T) # -> eye(n)
Note that if n > m, it would obtain mat.T @ mat = eye(m).
from scipy.stats import special_ortho_group
x = special_ortho_group.rvs(num_dim)
If you want a non-square matrix with orthonormal column vectors, you could create a square one with any of the mentioned methods and drop some columns.
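As a rough illustration of the two points above, the sketch below draws a special orthogonal matrix and then builds a tall matrix with orthonormal columns by dropping columns from a square orthogonal one. The dimension num_dim = 5, the number of kept columns and the random_state seeds are assumptions chosen only for the example.

```python
import numpy as np
from scipy.stats import ortho_group, special_ortho_group

num_dim = 5  # assumed dimension, chosen only for illustration

# Special orthogonal matrices are orthogonal with determinant +1.
R = special_ortho_group.rvs(num_dim, random_state=0)
det_R = np.linalg.det(R)

# Drop columns of a square orthogonal matrix to get a 5 x 3 matrix.
Q_square = ortho_group.rvs(dim=num_dim, random_state=1)
Q_tall = Q_square[:, :3]
gram = Q_tall.T @ Q_tall  # the kept columns are still orthonormal

print(det_R)
print(np.round(gram, 12))
```

Note that only Q_tall.T @ Q_tall is the identity here; Q_tall @ Q_tall.T is a 5 x 5 projection matrix, not the identity.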
Numpy also has qr factorization.
import numpy as np
a = np.random.rand(3, 3)
q, r = np.linalg.qr(a)
q @ q.T
# array([[ 1.00000000e+00, 8.83206468e-17, 2.69154044e-16],
# [ 8.83206468e-17, 1.00000000e+00, -1.30466244e-16],
# [ 2.69154044e-16, -1.30466244e-16, 1.00000000e+00]])
This code computes the Pearson correlation coefficient for all possible pairs of L=45 element vectors taken from a stack of M=102272. The result is a symmetric MxM matrix occupying about 40 GB of memory. The memory requirement isn't a problem for my computer, but I estimate from test runs that the ~5 billion passes through the inner loop will take a good 2-3 days to complete. My question: Is there a straightforward way to vectorize the inner loop to speed things up significantly?
# L = 45
# M = 102272
# data[M,L] (type 'float32')
cmat = np.zeros((M,M))
for i in range(M):
    v1 = data[i,:]
    z1 = (v1-np.average(v1))/np.std(v1)
    for j in range(i+1):
        v2 = data[j,:]
        z2 = (v2-np.average(v2))/np.std(v2)
        cmat[i,j] = cmat[j,i] = np.dot(z1, z2)/L
There's a built-in numpy function that already exists to compute correlation matrix. Just use it!
>>> import numpy as np
>>> rng = np.random.default_rng(seed=42)
>>> xarr = rng.random((3, 3))
>>> xarr
array([[0.77395605, 0.43887844, 0.85859792],
[0.69736803, 0.09417735, 0.97562235],
[0.7611397 , 0.78606431, 0.12811363]])
>>> R1 = np.corrcoef(xarr)
>>> R1
array([[ 1. , 0.99256089, -0.68080986],
[ 0.99256089, 1. , -0.76492172],
[-0.68080986, -0.76492172, 1. ]])
Documentation link
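If you would rather see what the vectorized computation looks like, the double loop from the question collapses into one matrix product once every row has been z-scored. This is only a sketch on a small stand-in array (M = 200 is an assumption so that it runs quickly), checked against np.corrcoef.

```python
import numpy as np

rng = np.random.default_rng(0)
M, L = 200, 45  # small stand-in for the question's M = 102272
data = rng.random((M, L)).astype(np.float32)

# z-score each row, using the population std (np.std's default, as in the question)
z = (data - data.mean(axis=1, keepdims=True)) / data.std(axis=1, keepdims=True)

# cmat[i, j] = dot(z[i], z[j]) / L for all pairs at once
cmat = (z @ z.T) / L

reference = np.corrcoef(data)
print(np.abs(cmat - reference).max())
```

Because z stays in float32, cmat is float32 as well, which halves the memory of the result compared with a float64 matrix.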
Given a 2-d numpy array, X, of shape [m,m], I wish to apply a function and obtain a new 2-d numpy matrix P, also of shape [m,m], whose [i,j]th element is obtained as follows:
P[i][j] = exp (-|| X[i] - X[j] ||**2)
where ||.|| represents the standard L-2 norm of a vector. Is there any way faster than a simple nested for loop?
For example,
X = [[1,1,1],[2,3,4],[5,6,7]]
Then, at diagonal entries the rows accessed will be the same and the norm/magnitude of their difference will be 0. Hence,
P[0][0] = P[1][1] = P[2][2] = exp (0) = 1.0
P[0][1] = exp (- || X[0] - X[1] ||**2) = exp (- || [-1,-2,-3] || ** 2) = exp (-14)
The most trivial solution using a nested for loop is as follows:
import numpy as np
X = np.array([[1,2,3],[4,5,6],[7,8,9]])
P = np.zeros (shape=[len(X),len(X)])
for i in range (len(X)):
for j in range (len(X)):
P[i][j] = np.exp (- np.linalg.norm (X[i]-X[j])**2)
print (P)
This prints:
P = [[1.00000000e+00 1.87952882e-12 1.24794646e-47]
[1.87952882e-12 1.00000000e+00 1.87952882e-12]
[1.24794646e-47 1.87952882e-12 1.00000000e+00]]
Here, m is of the order of 5e4.
In [143]: X = np.array([[1,2,3],[4,5,6],[7,8,9]])
     ...: P = np.zeros (shape=[len(X),len(X)])
     ...: for i in range (len(X)):
     ...:     for j in range (len(X)):
     ...:         P[i][j] = np.exp (- np.linalg.norm (X[i]-X[j]))
In [144]: P
array([[1.00000000e+00, 5.53783071e-03, 3.06675690e-05],
[5.53783071e-03, 1.00000000e+00, 5.53783071e-03],
[3.06675690e-05, 5.53783071e-03, 1.00000000e+00]])
A no-loop version:
In [145]: np.exp(-np.sqrt(((X[:,None,:]-X[None,:,:])**2).sum(axis=2)))
array([[1.00000000e+00, 5.53783071e-03, 3.06675690e-05],
[5.53783071e-03, 1.00000000e+00, 5.53783071e-03],
[3.06675690e-05, 5.53783071e-03, 1.00000000e+00]])
I had to drop your **2 to match values.
With the norm applied to the 3d difference array:
In [148]: np.exp(-np.linalg.norm(X[:,None,:]-X[None,:,:], axis=2))
array([[1.00000000e+00, 5.53783071e-03, 3.06675690e-05],
[5.53783071e-03, 1.00000000e+00, 5.53783071e-03],
[3.06675690e-05, 5.53783071e-03, 1.00000000e+00]])
In one of the scikit packages (learn?) there's a cdist that may handle this sort of thing faster.
As hpaulj mentioned cdist does it better. Try the following.
from scipy.spatial.distance import cdist
import numpy as np
X = np.array([[1,2,3],[4,5,6],[7,8,9]])
P = np.exp(-cdist(X, X, 'sqeuclidean'))
Notice the sqeuclidean. This means that scipy does not take the square root so you don't have to square like you did above with the norm.
This would be easier if you provided a sample array. You can create an array Q of size [m, m, m] where Q[i, j, k] = X[i, k] - X[j, k] by using
X[:,None,:] - X[None,:,:]
At this point, you're performing simple numpy operations against the third axis.
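To make the broadcasting idea concrete, here is a small sketch using the 3 x 3 array from the question's loop example. It also shows an alternative that is not from the answers above: expanding the squared norm as ||a||^2 + ||b||^2 - 2 a.b, which avoids the 3-d intermediate array and may matter when m is large. Both are compared with scipy's cdist.

```python
import numpy as np
from scipy.spatial.distance import cdist

X = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=float)

# Broadcasting: diff[i, j, k] = X[i, k] - X[j, k], then reduce over the last axis
diff = X[:, None, :] - X[None, :, :]
P_broadcast = np.exp(-(diff ** 2).sum(axis=2))

# Gram-matrix identity: ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b (no 3-d array needed)
sq = (X ** 2).sum(axis=1)
d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
P_gram = np.exp(-np.maximum(d2, 0.0))  # clip tiny negative round-off before exp

# Reference using squared Euclidean distances from scipy
P_cdist = np.exp(-cdist(X, X, "sqeuclidean"))

print(P_broadcast)
```

The broadcasting version needs memory proportional to the number of rows squared times the number of columns, so for large inputs the Gram-matrix form or cdist is the more practical choice.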
I am aware of the scipy.spatial.distance.pdist function and how to compute the mean from the resulting matrix/ndarray.
>>> x = np.random.rand(10000, 2)
>>> y = pdist(x, metric='euclidean')
>>> y.mean()
In the example above y gets quite large (nearly 2,500 times as large as the input array):
>>> y.shape
>>> from sys import getsizeof
>>> getsizeof(x)
>>> getsizeof(y)
>>> getsizeof(y) / getsizeof(x)
But since I am only interested in the mean pairwise distance, the distance matrix doesn't have to be kept in memory. Instead, the mean of each row (or column) can be computed separately. The final mean value can then be computed from the row mean values.
Is there already a function which exploits this property, or is there an easy way to extend/combine existing functions to do so?
If you use the square version of distance, it is equivalent to using the variance with n-1:
from scipy.spatial.distance import pdist, squareform
import numpy as np
x = np.random.rand(10000, 2)
y = np.array([[1,1], [0,0], [2,0]])
print(pdist(x, 'sqeuclidean').mean())
print(np.var(x, 0, ddof=1).sum()*2)
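The identity behind this is that the mean of the squared distances over all distinct pairs equals twice the sum of the per-coordinate sample variances (ddof=1). It holds for the squared distance only, not for the plain Euclidean distance. A quick numerical check, with an arbitrary seed and array size chosen for illustration:

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
x = rng.random((1000, 2))

# Mean of squared distances over all n*(n-1)/2 distinct pairs
mean_sq_dist = pdist(x, "sqeuclidean").mean()

# Twice the summed per-column variance with the n-1 denominator
two_var = 2.0 * np.var(x, axis=0, ddof=1).sum()

print(mean_sq_dist, two_var)
```

The variance route needs only O(n) memory, whereas pdist stores all n*(n-1)/2 distances.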
You will have to weight each row by the number of observations that make up the mean. For example the pdist of a 3 x 2 matrix is the flattened upper triangle (offset of 1) of the squareform 3 x 3 distance matrix.
arr = np.arange(6).reshape(3,2)
arr
array([[0, 1],
       [2, 3],
       [4, 5]])
pdist(arr)
array([2.82842712, 5.65685425, 2.82842712])
from sklearn.metrics import pairwise_distances
square = pairwise_distances(arr)
square
array([[0.        , 2.82842712, 5.65685425],
       [2.82842712, 0.        , 2.82842712],
       [5.65685425, 2.82842712, 0.        ]])
square[np.triu_indices(square.shape[0], 1)]
array([2.82842712, 5.65685425, 2.82842712])
There is the pairwise_distances_chunked function that can be used to iterate over the distance matrix row by row, but you will need to keep track of the row index to make sure you only take the mean of values in the upper/lower triangle of the matrix (the distance matrix is symmetric). This isn't complicated, but I imagine you will introduce a significant slowdown.
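from sklearn.metrics import pairwise_distances_chunked
# number of distinct pairs, i.e. entries in the upper triangle
tot = ((arr.shape[0]**2) - arr.shape[0]) / 2
weighted_means = 0
# assumed setup (missing from the original post): working_memory=0 yields one row per chunk,
# and r is the first column above the diagonal for the current row
gen = pairwise_distances_chunked(arr, working_memory=0)
r = 1
for i in gen:
    if r < arr.shape[0]:
        sm = i[0, r:].mean()
        wgt = (i.shape[1] - r) / tot
        weighted_means += sm * wgt
    r += 1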
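The same row-by-row idea can also be sketched with scipy alone, by computing blocks of rows of the distance matrix with cdist and accumulating their sums. This swaps in cdist for the sklearn generator; the function name mean_pairwise_distance and the chunk size are assumptions made for illustration.

```python
import numpy as np
from scipy.spatial.distance import cdist, pdist

def mean_pairwise_distance(x, chunk=1000):
    """Hypothetical helper: mean distance over distinct pairs, using O(chunk * n) memory."""
    n = x.shape[0]
    total = 0.0
    for start in range(0, n, chunk):
        # Distances from a block of rows to all rows: shape (<= chunk, n)
        block = cdist(x[start:start + chunk], x)
        total += block.sum()
    # The full matrix counts every pair twice and has a zero diagonal,
    # so dividing by n * (n - 1) gives the mean over distinct pairs.
    return total / (n * (n - 1))

rng = np.random.default_rng(0)
x = rng.random((500, 2))
chunked = mean_pairwise_distance(x, chunk=64)
exact = pdist(x).mean()
print(chunked, exact)
```

Summing full rows does twice the work of the upper triangle, but it avoids the index bookkeeping and never holds more than one block in memory.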
I'm trying to find the solution to overdetermined linear homogeneous system (Ax = 0) using numpy in order to get the least linear squares solution for a linear regression.
This is the code I am using to generate the linear regression:
N = 100
x_data = np.linspace(0, N-1, N)
m = +5
n = -5
y_model = m*x_data + n
y_noise = y_model + np.random.normal(0, +5, N)
I want to recover m and n from y_noise. In other words, I want to resolve the homogeneous system (Ax = 0) where "x = (m, n)" and "A = (x_data | 1 | -y_noise)". So I convert non-homogeneous (Ax = y) into homogeneous (Ax = 0) using this code:
A = np.array(np.vstack((x_data, np.ones(N), -y_noise)).T)
I know I could resolve non-homogeneous system using np.linalg.lstsq((x_data | 1), y_noise)) but I want to get the solution for homogeneous system. I am finding a problem with this function as it only returns the trivial solution (x = 0):
x = np.linalg.lstsq(A, np.zeros(N))[0] => array([ 0., 0., 0.])
I was thinking about using eigenvectors to get the solution but it seems not to work:
A_T_A = np.dot(A.T, A)
eigen_values, eigen_vectors = np.linalg.eig(A_T_A)
# eigenvectors
[[ -2.03500000e-01 4.89890000e+00 5.31170000e+00]
[ -3.10000000e-03 1.02230000e+00 -2.64330000e+01]
[ 1.00000000e+00 1.00000000e+00 1.00000000e+00]]
# eigenvectors normalized
[[ -0.98365497700 -4.744666220 1.0] # (m1, n1, 1)
[ 0.00304878118 0.210130914 1.0] # (m2, n2, 1)
[ 25.7752417000 -5.132910010 1.0]] # (m3, n3, 1)
None of them fits the model parameters (m=+5, n=-5).
How can I find (m, n) correctly? Thanks!
I have already found how to fix it: the problem was how I was interpreting the output of the np.linalg.eig function, but the approach using eigenvectors is right. That said, @Stelios is right when he says that np.linalg.lstsq returns the trivial solution (x = 0) because matrix A has full column rank.
I was assuming the output of np.linalg.eig was:
[[m1 n1 1]
[m2 n2 1]
[m3 n3 1]]
But it is not, the correct format is:
[[m1 m2 m3]
[n1 n2 n3]
[ 1 1 1]]
So if we want to get the solution which best fits the model parameters (m, n), we have to choose the eigenvector with the smallest eigenvalue and normalize it:
A_T_A = np.dot(A_homo.T, A_homo)
eigen_values, eigen_vectors = np.linalg.eig(A_T_A)
# eigenvectors
[[ 1.96409304e-01 9.48763118e-01 -2.47531678e-01]
[ 2.94608003e-04 2.52391765e-01 9.67625088e-01]
[ -9.80521952e-01 1.90123494e-01 -4.92925776e-02]]
# MIN eigenvector
eigen_vector_min = eigen_vectors[:, np.argmin(eigen_values)]
[-0.24753168 0.96762509 -0.04929258]
# MIN eigenvector normalized
[ 5.02168258 -19.63023915 1. ] # [m, n, 1]
Finally we get m = 5.02 and n = -19.6. The slope is a pretty good approximation of m = +5, although the intercept is noticeably off from n = -5.
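The eigenvector with the smallest eigenvalue of A^T A is the same as the right singular vector of A with the smallest singular value, so the computation can also be sketched with the SVD, which avoids forming A^T A explicitly. The seed and variable names below are assumptions for illustration. Keep in mind that this minimizes ||Av|| subject to ||v|| = 1, which is a different objective from ordinary least squares, so the recovered intercept in particular can differ from the lstsq result.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100
x_data = np.linspace(0, N - 1, N)
y_noise = 5 * x_data - 5 + rng.normal(0, 5, N)

# Homogeneous system A v = 0 with v proportional to (m, n, 1)
A = np.vstack((x_data, np.ones(N), -y_noise)).T

# Right singular vector of the smallest singular value (singular values are sorted descending)
_, _, vt = np.linalg.svd(A, full_matrices=False)
v_svd = vt[-1] / vt[-1, -1]          # scale so the last entry is 1 -> [m, n, 1]

# Same vector from the symmetric eigenproblem; eigh returns eigenvalues in ascending order
_, vecs = np.linalg.eigh(A.T @ A)
v_eig = vecs[:, 0] / vecs[-1, 0]

# Ordinary least squares on the non-homogeneous system, for comparison
m_ls, n_ls = np.linalg.lstsq(A[:, :2], y_noise, rcond=None)[0]

print(v_svd)
print(m_ls, n_ls)
```

Using eigh instead of eig is reasonable here because A^T A is symmetric; it returns real eigenvalues in sorted order, so no argmin is needed.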
I am trying to do something very simple, but confused by the abundance of information about sparse matrices and vectors in Python.
I want to create two vectors, x and y, one of length 5 and one of length 6, being sparse. Then I want to set one coordinate in each one of them. Then I want to create a matrix A, sparse, which is 5 x 6 and add to it the outer product between x and y. I then want to do SVD on that A.
Here is what I tried, and it goes wrong in many ways.
from scipy import sparse;
import numpy as np;
import scipy.sparse.linalg as ssl;
x = sparse.bsr_matrix(np.zeros(5));
x[1] = 1;
y = sparse.bsr_matrix(np.zeros(6));
y[1] = 2;
A = sparse.coo_matrix(5, 6);
A = A + np.outer(x,y.transpose())
svdresult = ssl.svds(A,1);
First, you should determine the data you want to store in a sparse matrix before constructing it. Otherwise, you should use sparse.csc_matrix or sparse.csr_matrix instead. Then you can assign or change data like this:
x[0, 1] = 1
Second, the outer product of vectors x and y is equivalent to x.transpose() * y.
Here is working code:
from scipy import sparse
import numpy as np
import scipy.sparse.linalg as ssl
x = np.zeros(5)
x[1] = 1
x_bsr = sparse.bsr_matrix(x)
y = np.zeros(6)
y[1] = 2
y_bsr = sparse.bsr_matrix(y)
A = sparse.coo_matrix((5, 6)) # Sparse matrix 5 x 6
B = x_bsr.transpose().dot(y_bsr) # Outer product of x and y
svdresult = ssl.svds((A + B), 1)
(array([[ 5.55111512e-17],
[ -1.00000000e+00],
[ 0.00000000e+00],
[ -2.77555756e-17],
[ 1.11022302e-16]]), array([ 2.]), array([[ 0., -1., 0., 0., 0., 0.]]))
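As a variation on the working code, the sparse vectors can be built directly from their non-zero coordinates, without going through dense numpy arrays first. This is only a sketch; the choice of csr_matrix and of the @ operator for the product are assumptions for illustration, not the only way to do it.

```python
import numpy as np
from scipy import sparse
import scipy.sparse.linalg as ssl

# 1 x 5 and 1 x 6 sparse row vectors from (values, (rows, cols)) triplets
x = sparse.csr_matrix(([1.0], ([0], [1])), shape=(1, 5))
y = sparse.csr_matrix(([2.0], ([0], [1])), shape=(1, 6))

A = sparse.csr_matrix((5, 6))   # empty 5 x 6 sparse matrix
A = A + x.T @ y                 # sparse outer product; the result stays sparse

# Largest singular triplet (k must be smaller than min(A.shape))
u, s, vt = ssl.svds(A, k=1)
print(s)
```

The only non-zero entry of A is A[1, 1] = 2, so the single singular value returned should be 2, matching the output shown above.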