I've computed the eigenvalues and eigenstates of a Hamiltonian in Python. I have a matrix psi containing all the wavefunctions in discrete space. I'd like to normalise the total wavefunction (the 'ket', i.e. the matrix of vectors) such that its modulus squared integrates to 1.
I've tried the following:
A= np.linalg.norm(abs(psi.T)**2)
normed_psi=psi.T/np.sqrt(A)
print(np.linalg.norm(normed_psi))
The matrix is transposed so I can access each state using psi[n].
However, the output of the print statement is:
20.44795885105457
when it should be 1. I feel like I'm not using linalg.norm correctly. I've also tried using my own integral function based on the trapezium rule, with no success.
I'm not really sure as to what to do at this point. Any help would be great.
It seems you're confusing np.linalg.norm and np.sum. np.linalg.norm already returns the square root of the sum of the squared magnitudes, so, up to the usual floating point issues, these two snippets should be identical:
normed_psi = psi.T / np.sqrt(np.sum(np.abs(psi.T)**2))
normed_psi = psi.T / np.linalg.norm(psi.T)
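If the goal is for each state to integrate to 1 on the grid, rather than for the whole matrix to have unit norm, a per-state version might look like the sketch below. The grid x, the spacing dx and the sine-wave psi are made up for illustration and stand in for the real eigenvectors; the integral is approximated by a plain Riemann sum.

```python
import numpy as np

# Hypothetical grid and unnormalised states (columns of psi), standing in
# for the eigenvectors returned by the diagonalisation.
x = np.linspace(0.0, 1.0, 201)
dx = x[1] - x[0]
n = np.arange(1, 4)
psi = 3.0 * np.sin(np.pi * np.outer(x, n))    # shape (201, 3)

states = psi.T                                 # states[k] is the k-th state

# Per-state norm: sqrt of the Riemann-sum integral of |psi_k|^2 over x.
norms = np.sqrt(np.sum(np.abs(states) ** 2, axis=1) * dx)
normed_psi = states / norms[:, np.newaxis]

# Each row should now integrate to 1 (up to rounding).
check = np.sum(np.abs(normed_psi) ** 2, axis=1) * dx
print(check)
```

Note that np.linalg.norm(normed_psi) on the whole matrix will then not be 1, since it sums over all states and ignores dx.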
I'm translating a Python class to Matlab. Most of it is straightforward, but I'm not so good with Python syntax (I hardly ever use it). I'm stuck on the following:
# find the basis that will be uncorrelated using the covariance matrix
basis = (sqrt(eigenvalues)[newaxis,:] * eigenvectors).transpose()
Can someone help me figure out what the equivalent Matlab syntax would be?
I've found via Google that np.newaxis increases the dimensionality of the array, and transpose is pretty self explanatory. So for newaxis, something involving cat in matlab would probably do it, but I'm really not clear on how Python handles arrays TBH.
Assuming eigenvalues is a 1D array of length N in Python, then sqrt(eigenvalues)[newaxis,:] would be a 1xN array. This is translated to MATLAB as either sqrt(eigenvalues) or sqrt(eigenvalues).', depending on the orientation of the eigenvalues array in MATLAB.
The * operation then does broadcasting (in MATLAB this is called singleton expansion). It looks like the operation multiplies each eigenvector by the square root of the corresponding eigenvalue (assuming eigenvectors are the columns).
If in MATLAB you computed the eigendecomposition like this:
[eigenvectors, eigenvalues] = eig(A);
then you’d just do:
basis = sqrt(eigenvalues) * eigenvectors.';
or
basis = (eigenvectors * sqrt(eigenvalues)).';
(note the parentheses) because eigenvalues is a diagonal matrix.
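For reference, the original Python line can be checked against the matrix form on a small example. This is only a sketch: the 2x2 matrix A is a made-up stand-in for the covariance matrix, and np.linalg.eigh is assumed as the eigensolver.

```python
import numpy as np

# Hypothetical symmetric positive-definite matrix, for illustration only.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
eigenvalues, eigenvectors = np.linalg.eigh(A)  # eigenvectors are the columns

# Original expression: the (1, N) row of square roots broadcasts across the
# rows of eigenvectors, scaling column j by sqrt(eigenvalues[j]).
row = np.sqrt(eigenvalues)[np.newaxis, :]
basis = (row * eigenvectors).transpose()

# Same thing written with a diagonal matrix, as in the MATLAB version.
basis_matrix_form = np.diag(np.sqrt(eigenvalues)) @ eigenvectors.T

print(row.shape)
print(np.allclose(basis, basis_matrix_form))
```

With this scaling, basis.T @ basis reproduces A, which is a quick sanity check on the orientation of the eigenvectors.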
EDIT:
As it turns out this is still a question of floating point rounding error like others. The asymmetry in fft vs ifft absolute error comes from the difference in the magnitudes of the numbers (1e10 vs 1e8).
So there are many questions about the differences between Numpy/Scipy and MATLAB FFT's; however, most of these come down to floating point rounding errors and the fact that MATLAB will make elements on the order of 1e-15 into true 0's which is not what I'm after.
I am seeing a totally different issue where for identical inputs the Numpy/Scipy FFTs produce differences on the order of 1e-6 from MATLAB. At the same time for identical inputs the Numpy/Scipy IFFTs produce differences on the order of 1e-9. My data is a complex 1D vector of length 2^14 with the zero point in the middle of the array (if you know how to share this let me know). As such, for both languages I am calling fftshift before and after the fft (ifft) operation.
My question is where is this difference coming from and, more importantly, why is it asymmetric with the fft and ifft? I can live with a small difference but 1e-6 is large when it accumulates over a large number of fft's.
The functional form of the fft (I'm not doing anything else to it) for either language is:
def myfft(myData):
    return fftshift(fft(fftshift(myData)))
def myifft(myData):
    return fftshift(ifft(fftshift(myData)))
I have the data saved in a .mat file and load it with scipy.io.loadmat into python. The data is a (2**14,) numpy array
The fft differences are calculated and plotted with
myData = loadmat('mydata.mat',squeeze_me=True)
plt.figure(1)
py = myfft(myData['fft_IN'])
mat = myData['fft_OUT']
plt.plot(py.real-mat.real)
plt.plot(py.imag-mat.imag)
plt.title('FFT Difference')
plt.legend(['real','imaginary'],loc=3)
plt.savefig('fft_diff')
and the ifft differences are calculated with
myData = loadmat('mydata.mat',squeeze_me=True)
plt.figure(1)
py = myifft(myData['ifft_IN'])
mat = myData['ifft_OUT']
plt.plot(py.real-mat.real)
plt.plot(py.imag-mat.imag)
plt.title('IFFT Difference')
plt.legend(['real','imaginary'],loc=3)
plt.savefig('ifft_diff')
Versions:
Python:3.7
MATLAB:R2019a
Scipy:1.4.1
Numpy:1.18.5
As it turns out this is still a question of floating point rounding error like all the other MATLAB vs numpy fft questions.
For my data the output of the fft function has numbers on the order of 1e10. Double precision has a relative precision of around 1e-16, so for floats of this size the absolute error is at most about 1e10 * 1e-16 = 1e-6. The asymmetry in fft vs ifft absolute error comes from the output of the ifft being around 1e8. As such, the absolute error there is at most about 1e-8, which is exactly what we see.
Credit for this goes to @CrisLuengo, who also helpfully pointed out that the ordering of fftshift and ifftshift matters for proper handling of odd-length arrays.
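Both points can be checked quickly in NumPy. This is a small sketch: np.spacing gives the gap between adjacent doubles at a given magnitude, and the length-5 array is just an arbitrary odd-length example.

```python
import numpy as np

# Gap between adjacent doubles near the fft and ifft output magnitudes.
spacing_fft = np.spacing(1e10)    # about 1.9e-06
spacing_ifft = np.spacing(1e8)    # about 1.5e-08
print(spacing_fft, spacing_ifft)

# For odd lengths fftshift is not its own inverse; ifftshift undoes it.
x = np.arange(5)
round_trip_wrong = np.fft.fftshift(np.fft.fftshift(x))
round_trip_right = np.fft.ifftshift(np.fft.fftshift(x))
print(round_trip_wrong, round_trip_right)
```

For an even length such as 2^14 the two shift functions coincide, so the shift ordering would not change the results in the question.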
You'll have to come up with a better workable example to show what you're after (also I don't have MATLAB, just Octave, and likely many others). I ran a quick code of fft and back with no issues. Be aware, generally DFTs (FFTs) are extremely nuanced to work with. You need to consider sampling, windowing, etc. very carefully.
Also, why the comparison to MATLAB to begin with, are you trusting it more, or just want to learn more about why one package produces an answer vs another? MATLAB uses fftw under the hood, which is very well tested and documented, but it doesn't mean that all the above nuances aren't coming into play in a different way.
import numpy as np
import matplotlib.pyplot as plt
fft = np.fft.fft
ifft = np.fft.ifft
def myfft(myData):
    return fft(myData)
def myifft(myData):
    return ifft(myData)
myData = np.exp(-np.linspace(-1, 1, 256)**2 / (2 * .25**2))
plt.figure(1)
fft_python = myifft(myfft(myData))
plt.plot(myData - fft_python.real)
plt.plot(fft_python.imag)
plt.title('FFT Difference')
plt.legend(['real','imaginary'],loc=3)
plt.savefig('fft_diff')
I want to translate this MATLAB code into Python. I think I did everything right, but I don't get the same results.
MATLAB script:
n=2 %Filter_Order
Wn=[0.4 0.6] %# Normalized cutoff frequencies
[b,a] = butter(n,Wn,'bandpass') % Transfer function coefficients of the filter
Python script:
import numpy as np
from scipy import signal
n=2 #Filter_Order
Wn=np.array([0.4,0.6]) # Normalized cutoff frequencies
b, a = signal.butter(n, Wn, btype='band') #Transfer function coefficients of the filter
a coefficients in MATLAB: 1, -5.55e-16, 1.14, -1.66e-16, 0.41
a coefficients in Python: 1, -2.77e-16, 1.14, -1.94e-16, 0.41
Could it just be a question of precision, since the two different values (the 2nd and 4th) are both on the order of 10^(-16)?!
The b coefficients are the same on the other hand.
Your machine precision is about 1e-16 (in MATLAB this can be checked easily with eps; in Python, np.finfo(float).eps gives the same value for double precision). The 'error' you are dealing with is thus on the order of machine precision, i.e. it cannot be resolved at working precision.
Also of note is that MATLAB ~= Python (or != in Python), thus the implementations of butter() on one hand and signal.butter() on the other may differ slightly, for example in the order of the floating point operations, even if you use the exact same numbers.
It rarely matters to have coefficients differing 16 orders of magnitude; the smaller ones would be essentially neglected. In case you do need exact values, consider using either symbolic math, or some kind of Variable Precision Arithmetic (vpa() in MATLAB), but I guess that in your case the difference is irrelevant.
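In practice this means comparing the coefficient vectors with an absolute tolerance rather than expecting exact equality. The sketch below uses the rounded values quoted in the question; the tolerance 1e-12 is an arbitrary choice for illustration.

```python
import numpy as np

# Denominator coefficients as quoted (rounded) in the question.
a_matlab = np.array([1, -5.55e-16, 1.14, -1.66e-16, 0.41])
a_python = np.array([1, -2.77e-16, 1.14, -1.94e-16, 0.41])

eps = np.finfo(float).eps                      # double-precision machine epsilon

# With an absolute tolerance well above eps, the two vectors agree.
same_abs = np.allclose(a_matlab, a_python, rtol=0.0, atol=1e-12)

# A purely relative comparison fails on the entries that are really zero.
same_rel = np.allclose(a_matlab, a_python, rtol=1e-5, atol=0.0)

print(eps, same_abs, same_rel)
```

The relative test fails only because the 2nd and 4th entries are rounding noise around zero, where relative error has no meaning.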
I have a fairly simple question. I have been converting some statistical analysis code from R to Python. Up until now, I have been doing just fine, but I have gotten stuck on this particular line:
nlsfit <- nls(N~pnorm(m, mean=mean, sd=sd),data=data4fit,start=list(mean=mu, sd=sig), control=list(maxiter=100,warnOnly = TRUE))
Essentially, the program is calculating the non-linear least-squares fit for a set of data, the "nls" command. In the original text, the "tilde" looks like an "enye", I'm not sure if that is significant.
As I understand it, the equivalent of pnorm in Python is norm.cdf from scipy.stats. What I want to know is what the "tilde/enye" does before the pnorm function is invoked. "m" is a predefined variable, while "mean" and "sd" are not.
I also found some code essentially reproducing nls in Python: nls Python code. However, because of the date of the post (2013), I was wondering if there are any more recent equivalents, preferably written in Python 3.
Any advice is appreciated, thanks!
As you can see from ?nls, the first argument of nls is formula:
formula: a nonlinear model formula including variables and parameters.
Will be coerced to a formula if necessary
Now, if you do ?formula, we can read this:
The models fit by, e.g., the lm and glm functions are specified in a
compact symbolic form. The ~ operator is basic in the formation of
such models. An expression of the form y ~ model is interpreted as a
specification that the response y is modelled by a linear predictor
specified symbolically by model
Therefore, in your nls call the ~ joins the response (dependent variable, regressand) on the left with the regressors (explanatory variables) on the right of your nonlinear least squares model.
Best!
This minimizes
sum((N - pnorm(m, mean=mean, sd=sd))^2)
using starting values for mean and sd specified in start. It will perform a maximum of 100 iterations and it will return instead of signalling an error in the case of termination before convergence.
The first argument to nls is an R formula which specifies the regression where the left hand side of the tilde (N) is the dependent variable and the right side is the function of the parameters (mean, sd) and data (m) used to predict it.
Note that formula objects do not have a fixed meaning in R but rather each function can interpret them in any way it likes. For example, formula objects used by nls are interpreted differently than formula objects used by lm. In nls the formula y ~ a + b * x would be used to specify a linear regression but in lm the same regression would be expressed as y ~ x .
See ?pnorm, ?nls, ?nls.control and ?formula .
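On the Python side, one possible counterpart is scipy.optimize.curve_fit with norm.cdf as the model. This is a sketch under assumptions, not a drop-in translation: the data m and N are synthetic, the starting values play the role of start=list(mean=mu, sd=sig), and curve_fit has no direct equivalents of maxiter and warnOnly.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

def model(m, mean, sd):
    # Right-hand side of the formula: pnorm(m, mean=mean, sd=sd)
    return norm.cdf(m, loc=mean, scale=sd)

# Synthetic stand-in for data4fit: a noisy normal CDF with mean 2, sd 1.5.
rng = np.random.default_rng(0)
m = np.linspace(-3.0, 7.0, 50)
N = model(m, 2.0, 1.5) + rng.normal(0.0, 0.01, m.size)

# p0 holds the starting values for (mean, sd).
popt, pcov = curve_fit(model, m, N, p0=[1.5, 1.0])
fit_mean, fit_sd = popt
print(fit_mean, fit_sd)
```

curve_fit minimises the same sum of squared residuals, sum((N - model(m, mean, sd))**2), and raises a RuntimeError instead of returning when the fit fails to converge.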
I am trying to find the inverse of this 9x9 covariance matrix so I can use it with the Mahalanobis distance. However, the result I'm getting from the matrix inverse is a matrix full of 1.02939420e+16. I have been trying to find out why, considering Wolfram would give me the correct answer, and this seems to have something to do with the condition number of the matrix, which in this case is 3.98290435292e+16.
Although I would like to understand the math behind this, what I really need at this moment is just a solution to this problem so I can continue with the implementation. Is there a way to find the inverse of such a matrix? Or is it somehow possible to find the inverse covariance matrix directly from the data instead?
Edit: Matrix data (same as the pastebin link)
[[ 0.46811097 0.15024959 0.01806486 -0.03029948 -0.12472314 -0.11952018 -0.14738093 -0.14655549 -0.06794621]
[ 0.15024959 0.19338707 0.09046136 0.01293189 -0.05290348 -0.07200769 -0.09317139 -0.10125269 -0.12769464]
[ 0.01806486 0.09046136 0.12575072 0.06507481 -0.00951239 -0.02944675 -0.05349869 -0.07496244 -0.13193147]
[-0.03029948 0.01293189 0.06507481 0.12214787 0.04527352 -0.01478612 -0.02879678 -0.06006481 -0.1114809 ]
[-0.12472314 -0.05290348 -0.00951239 0.04527352 0.164018 0.05474073 -0.01028871 -0.02695087 -0.03965366]
[-0.11952018 -0.07200769 -0.02944675 -0.01478612 0.05474073 0.13397166 0.06839442 0.00403321 -0.02537928]
[-0.14738093 -0.09317139 -0.05349869 -0.02879678 -0.01028871 0.06839442 0.14424203 0.0906558 0.02984426]
[-0.14655549 -0.10125269 -0.07496244 -0.06006481 -0.02695087 0.00403321 0.0906558 0.17054466 0.14455264]
[-0.06794621 -0.12769464 -0.13193147 -0.1114809 -0.03965366 -0.02537928 0.02984426 0.14455264 0.32968928]]
The matrix m you provide has a determinant of 0 (to within floating point precision) and is hence non-invertible from a numerical point of view. This explains the huge values you get, which tend towards Inf:
In [218]: np.linalg.det(m)
Out[218]: 2.8479946613617788e-16
If you are starting to do linear algebra operations or problem solving, I strongly advise reviewing some basic concepts, which would help you avoid numerical mistakes:
https://en.wikipedia.org/wiki/Invertible_matrix
You are faced with a very important and fundamental mathematical problem. If your method produces a non-invertible matrix, the method is in trouble: it is trying to solve an ill-posed problem. Probably all well-posed problems were solved in the XIX century. The most common way to solve ill-posed problems is regularization. Sometimes the Moore-Penrose pseudoinverse may be convenient, and scipy.linalg provides one (scipy.linalg.pinv). But the pseudoinverse is not a shortcut. By using it you are replacing the unsolvable problem A with a solvable problem B. Sometimes the solution of problem B can successfully stand in for the non-existent solution of problem A, but that is a matter of mathematical research.
Zero determinant means that your matrix has linearly dependent rows (or columns). In other words, some information in your model is redundant (it contains excessive or duplicate information). Re-develop your model in order to exclude redundancy.
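To make the two workarounds concrete, here is a minimal sketch on made-up data whose third column is an exact linear combination of the first two, so its covariance matrix is singular just like the one in the question. The cutoff rcond=1e-10 and the ridge size are arbitrary illustrative choices, not recommendations.

```python
import numpy as np

# Hypothetical data with a redundant column: col3 = -(col1 + col2).
rng = np.random.default_rng(0)
xy = rng.normal(size=(200, 2))
data = np.column_stack([xy, -xy.sum(axis=1)])

cov = np.cov(data, rowvar=False)
cond = np.linalg.cond(cov)                     # huge: cov is rank-deficient

# Option 1: Moore-Penrose pseudoinverse, discarding tiny singular values.
cov_pinv = np.linalg.pinv(cov, rcond=1e-10, hermitian=True)

# Option 2: ridge regularisation, adding a small multiple of the identity.
ridge = 1e-6 * np.trace(cov) / cov.shape[0]
cov_reg_inv = np.linalg.inv(cov + ridge * np.eye(cov.shape[0]))

# Squared Mahalanobis distance of the first sample, computed both ways.
delta = data[0] - data.mean(axis=0)
d2_pinv = delta @ cov_pinv @ delta
d2_reg = delta @ cov_reg_inv @ delta
print(cond, d2_pinv, d2_reg)
```

The two distances agree closely here because the sample lies in the subspace the data actually spans; for points outside it they can differ a lot, which is the "problem B instead of problem A" caveat above.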