I was wondering if someone would help me with this problem.
I have a complex 3x1 matrix that looks like this:
matrix = np.matrix([[(0.0009118819655545739+0.0009118819655545738j)], [(0.0009118819655544588-0.0009118819655544589j)], [(-0.0009118819655545161-5.421010862427522e-20j)]])
As I am not interested in the last element, I delete it as follows:
vector = np.delete(matrix, 2, 0)
This gives me,
vector = [[(0.0009118819655545739+0.0009118819655545738j)]
          [(0.0009118819655544588-0.0009118819655544589j)]]
I am interested in finding the magnitude of the above vector (2X1). This is what I have done:
for element in vector:
    mag = np.absolute(element)
    return mag
What I get is
mag = 0.0012895958429705512
which is the magnitude of the first element only.
How do I get the magnitude of the new 2x1 vector that incorporates the complex values of all elements in the vector?
Is the math correct? If I consider
vector = [[(x1+x2j)]
          [(y1-y2j)]]
then mag = sqrt((x1)**2 + (x2)**2 + (y1)**2 + (y2)**2),
which would instead give me a mag of 0.00182376392 rather than 0.0012895958429705512.
I would really appreciate the help with this problem.
Many thanks
NumPy comes with numpy.linalg.norm for calculating the norm of a vector; see the documentation here.
In your case that would simply be:
np.linalg.norm(vector)
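For what it's worth, here is a quick check (my own sketch, reusing the values from the question) that np.linalg.norm gives the same sqrt-of-sum-of-squares result described above:
import numpy as np
matrix = np.matrix([[0.0009118819655545739 + 0.0009118819655545738j],
                    [0.0009118819655544588 - 0.0009118819655544589j],
                    [-0.0009118819655545161 - 5.421010862427522e-20j]])
vector = np.delete(matrix, 2, 0)               # drop the last element -> 2x1
mag = np.linalg.norm(vector)                   # Euclidean (Frobenius) norm of the whole vector
print(mag)                                     # ~0.0018237639...
# Written out explicitly: sqrt of the summed squared magnitudes,
# i.e. sqrt(x1**2 + x2**2 + y1**2 + y2**2) from the question.
explicit = np.sqrt(np.sum(np.abs(np.asarray(vector))**2))
print(np.isclose(mag, explicit))               # True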
I'm hoping to find a way around the solution offered here to use 2D arrays in order to do 2D numerical integration.
import numpy as np
ksize = 50
a = 1.0
kdom = np.pi / a
x = np.linspace(- kdom, kdom, ksize)
y = np.linspace(- kdom, kdom, ksize)
dk = x[1]-x[0]
X,Y = np.meshgrid(x,y)
eigval = np.cos(X)+np.cos(Y)
eigvalflat = eigval.flatten()
intval = np.trapz(np.trapz(eigval,x),y)
sumval = np.sum(eigvalflat)*dk/ksize
print(intval,sumval)
Given my dummy example above, I'd like to find a way to properly integrate the flattened 1D array (eigvalflat), keeping it flattened, even though the underlying integral is a double integral.
Computationally, if the integrand is not separable, then the answer is that you can't recast the double integral as a single integral, unless you compute the integral one dimension at a time, which is what the assignment to intval is essentially doing.
Analytically, you'll have a better chance by asking yourself the question: given the 2d region of the integral (a rectangle in your example), can one find an integral over the boundary of that region? For that, Green's theorem has you covered with necessary and sufficient conditions.
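If it helps, here is a sketch (my own, not part of the original answer) of one way to keep working with the flattened array: build the 2D trapezoid weights as the outer product of the 1D trapezoid weights, flatten them the same way as eigval, and take a dot product; this reproduces the nested np.trapz result.
import numpy as np
ksize = 50
a = 1.0
kdom = np.pi / a
x = np.linspace(-kdom, kdom, ksize)
y = np.linspace(-kdom, kdom, ksize)
dk = x[1] - x[0]
X, Y = np.meshgrid(x, y)
eigval = np.cos(X) + np.cos(Y)
eigvalflat = eigval.flatten()
# 1D trapezoid weights: dk everywhere, dk/2 at the two endpoints.
w = np.full(ksize, dk)
w[0] = w[-1] = dk / 2
# 2D weights are the outer product of the 1D weights; flatten to match eigvalflat.
W = np.outer(w, w).flatten()
intval = np.trapz(np.trapz(eigval, x), y)
flatval = W @ eigvalflat
print(intval, flatval)    # the two agree to floating-point precision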
I was going through the book Hands-On Machine Learning with Scikit-Learn, Keras and TensorFlow, where the author explains how the pseudo-inverse (Moore-Penrose inverse) of a matrix is calculated in the context of linear regression. I'm quoting verbatim here:
The pseudoinverse itself is computed using a standard matrix
factorization technique called Singular Value Decomposition (SVD) that
can decompose the training set matrix X into the matrix
multiplication of three matrices U Σ V^T (see numpy.linalg.svd()). The
pseudoinverse is calculated as X^+ = V * Σ^+ * U^T. To compute the matrix
Σ^+, the algorithm takes Σ and sets to zero all values smaller than a
tiny threshold value, then it replaces all nonzero values with their
inverse, and finally it transposes the resulting matrix. This approach
is more efficient than computing the Normal equation.
I've got an understanding of how the pseudo-inverse and SVD are related from this post. But I'm not able to grasp the rationale behind setting all values less than the threshold to zero. The inverse of a diagonal matrix is obtained by taking the reciprocals of the diagonal elements. Then small values would be converted to large values in the inverse matrix, right? Then why are we removing the large values?
I went and looked into the numpy code and, just for reference, it looks as follows:
@array_function_dispatch(_pinv_dispatcher)
def pinv(a, rcond=1e-15, hermitian=False):
    a, wrap = _makearray(a)
    rcond = asarray(rcond)
    if _is_empty_2d(a):
        m, n = a.shape[-2:]
        res = empty(a.shape[:-2] + (n, m), dtype=a.dtype)
        return wrap(res)
    a = a.conjugate()
    u, s, vt = svd(a, full_matrices=False, hermitian=hermitian)

    # discard small singular values
    cutoff = rcond[..., newaxis] * amax(s, axis=-1, keepdims=True)
    large = s > cutoff
    s = divide(1, s, where=large, out=s)
    s[~large] = 0

    res = matmul(transpose(vt), multiply(s[..., newaxis], transpose(u)))
    return wrap(res)
It's almost certainly an adjustment for numerical error. To see why this might be necessary, look what happens when you take the svd of a rank-one 2x2 matrix. We can create a rank-one matrix by taking the outer product of a vector like so:
>>> a = numpy.arange(2) + 1
>>> A = a[:, None] * a[None, :]
>>> A
array([[1, 2],
[2, 4]])
Although this is a 2x2 matrix, it only has one linearly independent column, and so its rank is one instead of two. So we should expect that when we pass it to svd, one of the singular values will be zero. But look what happens:
>>> U, s, V = numpy.linalg.svd(A)
>>> s
array([5.00000000e+00, 1.98602732e-16])
What we actually get is a singular value that is not quite zero. This result is inevitable in many cases given that we are working with finite-precision floating point numbers. So although the problem you have identified is a real one, we will not be able to tell in practice the difference between a matrix that really has a very small singular value and a matrix that ought to have a zero singular value but doesn't. Setting small values to zero is the safest practical way to handle that problem.
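To see what goes wrong without that cutoff, here is a small sketch (using the same rank-one A as above) comparing a naive pseudo-inverse that inverts every singular value with numpy's thresholded pinv:
import numpy as np
a = np.arange(2) + 1
A = a[:, None] * a[None, :]               # the rank-one 2x2 matrix from above
U, s, Vt = np.linalg.svd(A)
# Naive pseudo-inverse: take the reciprocal of *every* singular value,
# including the one that is nonzero only because of floating-point error.
naive = Vt.T @ np.diag(1 / s) @ U.T
# numpy's pinv zeroes out singular values below rcond * max(s) first.
safe = np.linalg.pinv(A)
print(naive)    # entries on the order of 1e15 -- dominated by rounding noise
print(safe)     # entries on the order of 1e-2 -- the sensible answer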
I am trying to understand this optimized code for finding the cosine similarity between the users of a ratings matrix.
def fast_similarity(ratings, epsilon=1e-9):
    # epsilon -> small number for handling divide-by-zero errors
    sim = ratings.T.dot(ratings) + epsilon
    norms = np.array([np.sqrt(np.diagonal(sim))])
    return (sim / norms / norms.T)
If ratings is the users-by-items matrix (rows are users, columns are items)
[[1, 2, 3],
 [4, 5, 6],
 [7, 8, 9]]
then norms will be equal to [1^2 + 5^2 + 9^2],
but why are we writing sim / norms / norms.T to calculate the cosine similarity?
Any help is appreciated.
Going through the code, we have that
sim = ratings.T.dot(ratings)
which means that on the diagonal of the sim matrix we have the result of the dot product of each column with itself.
You can give it a try if you want using a simple matrix, and you can easily check that this matrix product (it is called the Gram matrix) has this property.
Now the code defines norms, which is nothing but an array taking the diagonal of our Gram matrix and applying a sqrt to each element of it.
This gives us an array containing the norm value of each column:
norms[i] = sqrt(sim[i, i]) = ||column_i||
So basically the norms vector contains the norm value of each column of the ratings matrix.
Once we have all those pieces we can evaluate the cosine similarity between those users, and we know that the cosine similarity is evaluated as:
cos_sim(i, j) = (column_i . column_j) / (||column_i|| * ||column_j||)
Note that:
column_i . column_j = sim[i, j]
So we have that our similarity is going to be:
cos_sim(i, j) = sim[i, j] / (norms[i] * norms[j])
Substituting the terms with our code variables (broadcasting divides first by the row vector norms, then by the column vector norms.T), we get exactly this line of code:
return sim / norms / norms.T
EDIT:
Since it seems that I was not clear: every time I talk about matrix multiplication in this answer, I am referring to the dot product of two matrices.
This means that when A*B is written here, it is actually developed and solved as A.T * B (as in ratings.T.dot(ratings)).
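As a sanity check, here is a small sketch (my own, using the 3x3 ratings matrix from the question) comparing fast_similarity with the explicit column-by-column cosine formula:
import numpy as np
def fast_similarity(ratings, epsilon=1e-9):
    sim = ratings.T.dot(ratings) + epsilon
    norms = np.array([np.sqrt(np.diagonal(sim))])
    return sim / norms / norms.T
ratings = np.array([[1, 2, 3],
                    [4, 5, 6],
                    [7, 8, 9]], dtype=float)
fast = fast_similarity(ratings)
# Explicit definition: cos_sim(i, j) = col_i . col_j / (||col_i|| * ||col_j||)
explicit = np.empty((3, 3))
for i in range(3):
    for j in range(3):
        ci, cj = ratings[:, i], ratings[:, j]
        explicit[i, j] = ci @ cj / (np.linalg.norm(ci) * np.linalg.norm(cj))
print(np.allclose(fast, explicit))    # True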
I have 2 arrays (for the sake of the example, let's name them A and B) and I perform the following manipulations on them, but I get an error at the assignment of d2 in my code.
n = len(tracks) #tracks is a list containing different-length 3d arrays
n=30; #test with a few tracks
length = len(tracks) #list containing the total number of "samples"
perm_index = np.random.permutation(length) #uniform sampling without replacement
subset_len = 5 # choose the size of subset of tracks A
subset_A = [tracks[x:x+1] for x in xrange(0, subset_len, 1)]
subset_B = [tracks[x:x+1] for x in xrange(subset_len, n, 1)]
tempA = distance_calc.dist_calcsub(len(subset_A), subset_A) # distance matrix calculation
tempA = mcp.sym_mcp(len(subset_A), tempA) # symmetrize mcp ???
tempB = distance_calc.dist_calcsubs(subset_A, subset_B) # distance matrix calculation
#symmetrize mcp ? ? its not diagonal, symmetric . . .
A = affinity.aff_conv(60, tempA) # conversion to affinity
B = affinity.aff_conv(60, tempB) # conversion to affinity
#((row,col)) = np.shape(A)
#A = normalization_affinity.norm_aff(row, col, A) # normalization of affinity matrix
# Normalize A and B for Laplacian using row sums of W, where W = [A B; B' B'*A^-1*B].
# Let d1 = [A B]*1, d2 = [B' B'*A^-1*B]*1, dhat = sqrt(1./[d1; d2]).
d1 = np.sum( np.vstack((A, np.transpose(B))) )
d2 = np.sum(B,0) + np.dot(np.sum(np.transpose(B),0), np.dot(np.linalg.pinv(A), B ))
dhat = np.transpose(np.sqrt( 1/ np.hstack((d1, d2)) ))
A = A* np.dot( dhat[0:subset_len], np.transpose(dhat[0:subset_len]) )
B = B* np.dot( dhat[0:subset_len], np.transpose(dhat[subset_len:n]) )
The error again is "ValueError: matrices are not aligned", because the arguments to np.dot are 1D vectors of different sizes; I know why this is happening, but I am following exactly the equations for the Nystrom method.
P.S: I am following the method described in p.90-92 in this thesis: thesis link
Looking at the paper, you've got two problems here.
Let's start with the information you left out of your question. You're trying to do this operation:
bc + B.T * A^−1 * br
where ar and br are column vectors containing the row sums of A and B and bc is
the column sum of B.
In particular, you're mapping that A^-1 * br to np.dot( np.linalg.pinv(A), np.sum(B, 0)).
The first problem is that np.linalg.pinv is the pseudo-inverse, A+, not the multiplicative inverse, A^-1. Using a completely different operation just because it doesn't give you an error doesn't solve the problem.
So, how do you calculate the multiplicative inverse? Well, you can't. In general, the multiplicative inverse doesn't exist for non-square matrices, so given a 5x10 A, you're stuck right at the beginning.
Anyway, the second problem comes from the fact that your br isn't a column vector. If you want to think in matrix terms, as the paper does, it's a row vector, 10x1 instead of 1x10. If you want to think in numpy ndarray terms, it's a 1D (10,) array instead of a 2D (1, 10) array. If you think of the operation in matrix multiplication terms, you can't multiply a 10x5 matrix with a 10x1 matrix; if you think of it in NumPy terms as the multidimensional dot product, you can't multiply a (10, 5) array with a (10,) array.
It's true that you can extend the dot product to specifically the domain of MxN matrices vs. M vectors, and under that definition your multiplication would make sense. But that's not the definition used by either the paper's standard matrix multiplication notation or NumPy's dot function. So, what can you do? Well, note that the operation you're trying to do is commutative, so swapping the order of operands is perfectly legal—and if you do that, then it does happen to correspond to the general dot product. So, you could write this as np.dot(np.sum(B, 0), np.linalg.pinv(A)) and get the result you want. And there are a number of other ways you could transform the arrays that are idempotent in your matrix-vs.-vector multiplication domain but meaningful for np.dot, and they will all get you the same result. For example, np.dot(np.linalg.pinv(A).T, np.sum(B, 0)) will also work.
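Here is a tiny shape check (my own sketch, with made-up shapes: A is (5, 10) and B has 10 columns) showing that the two reorderings give the same (5,) result, while the original ordering hits the alignment error:
import numpy as np
rng = np.random.default_rng(0)
A = rng.random((5, 10))
B = rng.random((4, 10))
br = np.sum(B, 0)                       # shape (10,)
# np.dot(np.linalg.pinv(A), br) raises a "not aligned" ValueError,
# since pinv(A) has shape (10, 5) and br has shape (10,).
v1 = np.dot(br, np.linalg.pinv(A))      # (10,) . (10, 5) -> (5,)
v2 = np.dot(np.linalg.pinv(A).T, br)    # (5, 10) . (10,) -> (5,)
print(np.allclose(v1, v2))              # True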
I'm also not sure why you're using dot product in the first place. I don't see anything in the notation to imply that
But all of this is a sideshow; if you inverted A properly, you would have something with the same dimensions as A, and multiplying a 5x10 matrix with a 10x1 vector, or a (5, 10) array with a (10,) array, is already perfectly well defined. The only problem is that, again, you can't generally invert non-square matrices, so there's no way you can actually get to this place.
So, the real solution is to go back to wherever you decided on those shapes for A and B and try again.
In particular, it's pretty clear from the illustration in the paper showing the derivation of A and B from the larger matrix that the height of A is the height of B, and the width of A is the width of B.T, which is of course the height of B again.
Also, if the larger matrix is supposed to be symmetric, and A is the upper left corner of a symmetric matrix, A has to be symmetric.
I also think you've mixed up row-column order and x-y order a few times, and bc is supposed to be the column sums of B, not the column sums of B.T (which would just be the row sums of B, flipped into a row vector instead of a column vector).
While we're at it, let's use methods and operators where possible instead of writing everything in the longest possible way.
So, I think what you wanted is something like this:
A = np.random.random_sample((4, 4)) # square
A = (A + A.T) / 2 # and symmetric
B = np.random.random_sample((4, 10))
ar = A.sum(1)
br = B.sum(1)
bc = B.sum(0) # not B.T.sum(0), that's just br again!
d1 = ar + br
d2 = bc + np.dot(B.T, np.dot(np.linalg.inv(A), br))
Without actually reading the paper I can't be sure this is what you actually want, but this looks like it fits with a quick skim of those two pages, and it runs without any errors, so hopefully you can at least look at the results and see if they are what you want.
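And a quick shape check (my own addition, repeating the assumed shapes from the snippet above) confirms that d1 and d2 now have sizes that stack cleanly for the dhat construction in the question:
import numpy as np
A = np.random.random_sample((4, 4))
A = (A + A.T) / 2                                    # square and symmetric
B = np.random.random_sample((4, 10))
ar, br, bc = A.sum(1), B.sum(1), B.sum(0)
d1 = ar + br                                         # one entry per row of A
d2 = bc + np.dot(B.T, np.dot(np.linalg.inv(A), br))  # one entry per column of B
print(d1.shape, d2.shape)                            # (4,) (10,)
print(np.hstack((d1, d2)).shape)                     # (14,)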
You are summing over the first dimension of B, so the shape is 10, the size of the second dimension of B.
You can calculate
np.dot( np.sum(B, 0), np.linalg.pinv(A))
but this gives you a vector with 5 elements, while B.T has only a size of 4, so something doesn't fit in your sample data.
My code:
from numpy import *

def pca(orig_data):
    data = array(orig_data)
    data = (data - data.mean(axis=0)) / data.std(axis=0)
    u, s, v = linalg.svd(data)
    print s  # should be s**2 instead!
    print v

def load_iris(path):
    lines = []
    with open(path) as input_file:
        lines = input_file.readlines()
    data = []
    for line in lines:
        cur_line = line.rstrip().split(',')
        cur_line = cur_line[:-1]
        cur_line = [float(elem) for elem in cur_line]
        data.append(array(cur_line))
    return array(data)

if __name__ == '__main__':
    data = load_iris('iris.data')
    pca(data)
The iris dataset: http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data
Output:
[ 20.89551896 11.75513248 4.7013819 1.75816839]
[[ 0.52237162 -0.26335492 0.58125401 0.56561105]
[-0.37231836 -0.92555649 -0.02109478 -0.06541577]
[ 0.72101681 -0.24203288 -0.14089226 -0.6338014 ]
[ 0.26199559 -0.12413481 -0.80115427 0.52354627]]
Desired Output:
Eigenvalues - [2.9108 0.9212 0.1474 0.0206]
Principal Components - Same as I got but transposed so okay I guess
Also, what's with the output of the linalg.eig function? According to the PCA description on Wikipedia, I'm supposed to do this:
cov_mat = cov(orig_data)
val, vec = linalg.eig(cov_mat)
print val
But it doesn't really match the output in the tutorials I found online. Plus, if I have 4 dimensions, I thought I should have 4 eigenvalues and not 150 like eig gives me. Am I doing something wrong?
edit: I've noticed that the values differ by a factor of 150, which is the number of elements in the dataset. Also, the eigenvalues are supposed to add up to the number of dimensions, in this case 4. What I don't understand is why this difference is happening. If I simply divided the eigenvalues by len(data) I would get the result I want, but I don't understand why. Either way, the proportion of the eigenvalues isn't altered, but they are important to me, so I'd like to understand what's going on.
You decomposed the wrong matrix.
Principal Component Analysis requires manipulating the eigenvectors/eigenvalues of the covariance matrix, not the data itself. A covariance matrix built from an m x n data matrix is always square, with one row and column per variable; the correlation matrix (which the code below uses, via corrcoef) additionally has ones along the main diagonal.
You can indeed use the cov function, but you need further manipulation of your data. It's probably a little easier to use a similar function, corrcoef:
import numpy as NP
import numpy.linalg as LA

# a simulated data set with 8 data points, each point having five features
# (cast to float so the in-place mean-centering below doesn't fail on an int array)
data = NP.random.randint(0, 10, 40).reshape(8, 5).astype(float)

# usually a good idea to mean center your data first:
data -= NP.mean(data, axis=0)

# calculate the correlation matrix (the covariance of the standardized data);
# this returns an m x m matrix, here a 5 x 5 matrix
C = NP.corrcoef(data, rowvar=0)

# now get the eigenvalues/eigenvectors of C:
eval, evec = LA.eig(C)
To get the eigenvectors/eigenvalues, I did not decompose the covariance matrix using SVD, though you certainly can. My preference is to calculate them using eig in NumPy's (or SciPy's) LA module; it is a little easier to work with than svd, and the return values are the eigenvectors and eigenvalues themselves, nothing else. By contrast, as you know, svd doesn't return these directly.
Granted, the SVD function will decompose any matrix, not just square ones (to which the eig function is limited); however, when doing PCA you'll always have a square matrix to decompose, regardless of the form your data is in. The matrix you are decomposing in PCA is a covariance (or correlation) matrix, which by definition is always square: its rows and columns both index the variables of the original data, and each cell holds the covariance of a pair of variables (the ones down the main diagonal of the correlation matrix reflect the fact that a variable is perfectly correlated with itself).
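If you then want the actual principal-component scores, a possible next step (my own sketch, not part of the original answer, using eigh since the correlation matrix is symmetric) is to sort the eigenvectors by eigenvalue and project the standardized data onto the leading ones:
import numpy as np
rng = np.random.default_rng(0)
data = rng.random((8, 5))                               # 8 data points, 5 features
data = (data - data.mean(axis=0)) / data.std(axis=0)    # standardize
C = np.corrcoef(data, rowvar=0)                         # 5 x 5 correlation matrix
evals, evecs = np.linalg.eigh(C)                        # eigh: symmetric case, ascending order
k = 2
top = evecs[:, ::-1][:, :k]        # the k eigenvectors with the largest eigenvalues
scores = data @ top                # 8 x k principal-component scores
print(scores.shape)                # (8, 2)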
The left singular vectors returned by SVD(A) are the eigenvectors of AA^T.
The covariance matrix of a (centered) dataset A is 1/(N-1) * AA^T.
Now, when you do PCA using the SVD, you have to divide the squared singular values by (N-1) to get the eigenvalues of the covariance matrix with the correct scale.
In your case, N=150 and you haven't done this division, hence the discrepancy.
This is explained in detail here
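To make the scaling concrete, here is a small check (my own sketch, with random data rather than the iris set) that the squared singular values of the centered data, divided by N - 1, match the eigenvalues of the covariance matrix:
import numpy as np
rng = np.random.default_rng(0)
X = rng.random((150, 4))                      # N = 150 samples, 4 features
Xc = X - X.mean(axis=0)                       # center each feature
u, s, vt = np.linalg.svd(Xc, full_matrices=False)
svd_eigvals = s**2 / (len(X) - 1)             # squared singular values / (N - 1)
cov_eigvals = np.linalg.eigvalsh(np.cov(Xc, rowvar=False))[::-1]   # descending
print(np.allclose(svd_eigvals, cov_eigvals))  # True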
(Can you ask one question, please? Or at least list your questions separately. Your post reads like a stream of consciousness because you are not asking one single question.)
You probably used cov incorrectly by not transposing the matrix first. If cov_mat is 4-by-4, then eig will produce four eigenvalues and four eigenvectors.
Note how SVD and PCA, while related, are not exactly the same. Let X be a 4-by-150 matrix of observations where each 4-element column is a single observation. Then, the following are equivalent:
a. the left singular vectors of X,
b. the principal components of X,
c. the eigenvectors of X X^T.
Also, the eigenvalues of X X^T are equal to the square of the singular values of X. To see all this, let X have the SVD X = QSV^T, where S is a diagonal matrix of singular values. Then consider the eigendecomposition D = Q^T X X^T Q, where D is a diagonal matrix of eigenvalues. Replace X with its SVD, and see what happens.
This question has already been addressed: Principal component analysis in Python