Vector Normalization in Python

I'm trying to port this MATLAB function to Python:
fs = 128;
x = (0:1:999)/fs;
y_orig = sin(2*pi*15*x);
y_noised = y_orig + 0.5*randn(1,length(x));
[yseg] = mapstd(y_noised);
I wrote this code (which runs, so there are no problems with missing variables or the like):
from math import sqrt

Norm_Y = 0
Y_Normalized = []
for i in range(0, len(YSeg), 1):
    Norm_Y = Norm_Y + (pow(YSeg[i], 2))
Norm_Y = sqrt(Norm_Y)
for i in range(0, len(YSeg), 1):
    Y_Normalized.append(YSeg[i] / Norm_Y)
    print("%3d %f" % (i, Y_Normalized[i]))
YSeg is Y_Noised (I assign it in another section of the code).
Now I don't expect the values to be the same between the MATLAB code and mine, because YSeg / Y_Noised is generated from random values, so it's fine that they differ; but they differ far too much.
These are the first 10 values in Matlab:
0.145728655284548
1.41918657039301
1.72322238170491
0.684826842884694
0.125379108969931
-0.188899711186140
-1.03820858801652
-0.402591786430960
-0.844782236884026
0.626897216311757
While these are the first 10 numbers from my Python code:
0.052015
0.051132
0.041209
0.034144
0.034450
0.003812
0.048629
0.016854
0.024484
0.021435
It's as if mine are about 100 times lower, so I feel like I've missed a step during normalization. Can you help?

You can normalize a vector quite easily in Python with NumPy:
import numpy as np

def normalize_vector(input_vector):
    return input_vector / np.sqrt(np.sum(input_vector**2))

random_vec = np.random.rand(10)
vec_norm = normalize_vector(random_vec)
print(vec_norm)
You can call the provided function with your input vector (YSeg) and check the output. I would expect a similar output to the one in MATLAB.

This is an implementation in NumPy:
import numpy as np

fs = 128
x = np.arange(1000) / fs
y_orig = np.sin(2 * np.pi * 15 * x)
y_noised = y_orig + 0.5 * np.random.randn(len(x))
yseg = (y_noised - y_noised.mean()) / y_noised.std()
However, why do you consider the values to be "too different"? After all, the values of y_orig are in the range [-1, 1] and you are randomly distorting them by ~0.4 on average.
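Note also what mapstd actually does: it standardizes to zero mean and unit standard deviation, whereas the loop in the question divides by the Euclidean norm. For a roughly zero-mean vector of length N, the Euclidean norm is about sqrt(N) times the standard deviation, so the unit-norm values come out roughly sqrt(1000) ≈ 32 times smaller. A quick sketch of this relationship (the variable names are mine):
import numpy as np

y = np.random.randn(1000)                  # stand-in for the noisy signal
unit_norm = y / np.linalg.norm(y)          # what the question's loop computes
standardized = (y - y.mean()) / y.std()    # what MATLAB's mapstd computes

# for roughly zero-mean y, norm(y) is close to sqrt(len(y)) * y.std()
print(np.linalg.norm(y) / y.std())         # about sqrt(1000), i.e. ~31.6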

Related

Mathieu Characteristic Value

I was trying to obtain the Mathieu characteristic values for a specific problem. I do not have any problem obtaining them, and I have read the documentation from Scipy regarding these functions. The problem is that I know for a fact that the points I am obtaining are not right. My script to obtain the characteristic values I need is below:
import numpy as np
import matplotlib.pyplot as plt
from scipy.special import mathieu_a, mathieu_b, mathieu_cem, mathieu_sem

M = 1.0
g = 1.0
l = 1.0
h = 0.06
U0 = M * g * l
q = 4 * M * l**2 * U0 / h**2

def energy(n, q):
    if n % 2 == 0:
        return (h**2 / (8 * M * l**2)) * mathieu_a(n, q) + U0
    else:
        return (h**2 / (8 * M * l**2)) * mathieu_b(n + 1, q) + U0

n_list = np.arange(0, 80, 1)
e_n = [energy(i, q) for i in n_list]
plt.plot(n_list, e_n, '.')
The resulting plot of these values (not reproduced here) has a zone where there appears to be "noise" or numerical error, and I know that those jumps must not occur. In reality, from around x = 40 onward, the points should behave like a staircase of two consecutive points, similar to what can be seen for 70 < x < 80. And the values x can take in this case are only positive integers.
I saw that the implementation of the Mathieu function has some problems, see here. But this was six years ago! In the answer to this question they use the NAG Library for Python, but it is not exactly open-source.
Is there a way I can still use these functions from Scipy without having this problem? Or is it related to the precision I am using to obtain the Mathieu characteristic value?

Iterative Binomial Update without Loop

Can this be done without a loop?
import numpy as np

n = 10
x = np.random.random(n+1)
a, b = 0.45, 0.55
for i in range(n):
    x = a*x[:-1] + b*x[1:]
I came across this setup in another question, where it was couched in somewhat obscure nomenclature. I guess it is related to the binomial options pricing model, but to be honest I don't quite understand the topic. I was simply intrigued by the formula and this iterative update / shrinking of x, and wondered whether it can be done without a loop. But I cannot wrap my head around it, and I am not sure this is even possible.
What makes me think it might work is that this variation
n = 10
a, b = 0.301201, 0.59692
x0 = 123
x = x0
for i in range(n):
    x = a*x + b*x
# x is ~42
is actually just x0*(a + b)**n
print(np.allclose(x, x0*(a + b)**n))
# True
You are calculating:
sum( a ** (n - i) * b ** i * x[i] * choose(n, i) for 0 <= i <= n)
[That's meant to be pseudocode, not Python.] I'm not sure of the best way to convert that into NumPy, but see the sketch below.
Here choose(n, i) is n! / (i! (n-i)!), i.e. the binomial coefficient, not the numpy choose function.
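One way to express that closed form in NumPy might be the following sketch (using scipy.special.comb for the binomial coefficient; the comparison against the loop is my addition):
import numpy as np
from scipy.special import comb

n = 10
x = np.random.random(n + 1)
a, b = 0.45, 0.55

# closed form: sum over i of choose(n, i) * a**(n-i) * b**i * x[i]
i = np.arange(n + 1)
res = np.sum(comb(n, i) * a**(n - i) * b**i * x)

# compare against the iterative update from the question
loop = x.copy()
for _ in range(n):
    loop = a*loop[:-1] + b*loop[1:]
print(np.allclose(res, loop))  # True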
Using #mathfux's comment, one can do (note that here p plays the role of b, with a = 1 - p, so that the weights form a probability mass function):
import numpy as np
from scipy.stats import binom

p = 0.55  # = b, assuming a = 1 - p
binomial = binom(p=p, n=n)
pmf = binomial.pmf(np.arange(n+1))
res = np.sum(x * pmf)
So
res = x.copy()
for i in range(n):
    res = (1-p)*res[:-1] + p*res[1:]
is just the expected value of x[K], where K is a binomially distributed random index.
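A quick sanity check tying the two formulations together, assuming a = 1 - p (the check itself is my addition):
import numpy as np
from scipy.stats import binom

n, p = 10, 0.55
x = np.random.random(n + 1)
pmf = binom(p=p, n=n).pmf(np.arange(n + 1))

loop = x.copy()
for _ in range(n):
    loop = (1-p)*loop[:-1] + p*loop[1:]
print(np.allclose(loop, np.sum(x * pmf)))  # True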

How to display a 2d interpolation function in python as a matrix?

I looked around a lot but it's hard to find an answer. Basically, when one interpolates v -> w, you would normally use one of the many interpolation functions. But I want to get the corresponding matrix Av = w.
In my case w is a 200x200 matrix, with v being a random subset of w with half as many points. I don't really care for fancy math; it could be as simple as weighting the known points by squared distance. I already tried implementing it all with some for loops, but it only really works for small sizes. Maybe it helps explain my question.
from random import sample
import numpy as np

def testScatter(xbig, ybig):
    NumberOfPoints = int(xbig * ybig / 2)  # half as many points as in full sample
    # choose random coordinates
    Index = sample(range(xbig * ybig), NumberOfPoints)
    IndexYScatter = np.remainder(Index, xbig)
    IndexXScatter = np.array((Index - IndexYScatter) / xbig, dtype=int)
    InterpolationMatrix = np.zeros((xbig * ybig, NumberOfPoints), dtype=np.float32)
    WeightingSum = np.zeros(xbig * ybig)
    coordsSamplePoints = []
    for i in range(NumberOfPoints):  # first set all the given points (no need to interpolate)
        coordsSamplePoints.append(IndexYScatter[i] + xbig * IndexXScatter[i])
        InterpolationMatrix[coordsSamplePoints[i], i] = 1
        WeightingSum[coordsSamplePoints[i]] = 1
    for x in range(xbig * ybig):  # now comes the interpolation
        if x not in coordsSamplePoints:
            YIndexInterpol = x % xbig  # x coord in interpolated matrix
            XIndexInterpol = (x - YIndexInterpol) / xbig  # y coord in interp. matrix
            for y in range(NumberOfPoints):
                XIndexScatter = IndexXScatter[y]
                YIndexScatter = IndexYScatter[y]
                distanceSquared = (np.float32(YIndexInterpol) - np.float32(YIndexScatter))**2 + (np.float32(XIndexInterpol) - np.float32(XIndexScatter))**2
                InterpolationMatrix[x, y] = 1/distanceSquared
                WeightingSum[x] += InterpolationMatrix[x, y]
    return InterpolationMatrix / WeightingSum[:, None], IndexXScatter, IndexYScatter
You need to spend some time with the NumPy documentation, starting at the top of this page and working your way down. Studying answers here on SO to questions asking how to vectorize an operation with NumPy arrays will help you. If you find yourself iterating over indices and performing calculations with NumPy arrays, there is probably a better way.
First cut...
The first for loop can be replaced with:
coordsSamplePoints = IndexYScatter + (xbig * IndexXScatter)
InterpolationMatrix[coordsSamplePoints,np.arange(coordsSamplePoints.shape[0])] = 1
WeightingSum[coordsSamplePoints] = 1
This mainly makes use of elementwise arithmetic and index arrays; the complete indexing tutorial is worth reading in full.
You can test this by enhancing the function, executing the for loop alongside the NumPy way, and then comparing the results.
...
IM = InterpolationMatrix.copy()
WS = WeightingSum.copy()
for i in range(NumberOfPoints):  # first set all the given points (no need to interpolate)
    coordsSamplePoints.append(IndexYScatter[i] + xbig * IndexXScatter[i])
    InterpolationMatrix[coordsSamplePoints[i], i] = 1
    WeightingSum[coordsSamplePoints[i]] = 1
cSS = IndexYScatter + (xbig * IndexXScatter)
IM[cSS, np.arange(cSS.shape[0])] = 1
WS[cSS] = 1
# TEST validity
print((cSS == coordsSamplePoints).all(),
      (IM == InterpolationMatrix).all(),
      (WS == WeightingSum).all())
...
The outer loop:
...
for x in range(xbig * ybig):  # now comes the interpolation
    if x not in coordsSamplePoints:
        YIndexInterpol = x % xbig  # x coord in interpolated matrix
        XIndexInterpol = (x - YIndexInterpol) / xbig  # y coord in interp. matrix
...
Can be replaced with:
...
space = np.arange(xbig * ybig)
mask = ~(space == cSS[:,None]).any(0)
iP = space[mask] # points to interpolate
yIndices = iP % xbig
xIndices = (iP - yIndices) / xbig
...
Complete solution:
import random
import numpy as np

def testScatter(xbig, ybig):
    NumberOfPoints = int(xbig * ybig / 2)  # half as many points as in full sample
    # choose random coordinates
    Index = random.sample(range(xbig * ybig), NumberOfPoints)
    IndexYScatter = np.remainder(Index, xbig)
    IndexXScatter = np.array((Index - IndexYScatter) / xbig, dtype=int)
    InterpolationMatrix = np.zeros((xbig * ybig, NumberOfPoints), dtype=np.float32)
    WeightingSum = np.zeros(xbig * ybig)
    coordsSamplePoints = IndexYScatter + (xbig * IndexXScatter)
    InterpolationMatrix[coordsSamplePoints, np.arange(coordsSamplePoints.shape[0])] = 1
    WeightingSum[coordsSamplePoints] = 1
    IM = InterpolationMatrix
    cSS = coordsSamplePoints
    WS = WeightingSum
    space = np.arange(xbig * ybig)
    mask = ~(space == cSS[:, None]).any(0)
    iP = space[mask]  # points to interpolate
    yIndices = iP % xbig
    xIndices = (iP - yIndices) / xbig
    dSquared = ((yIndices[:, None] - IndexYScatter) ** 2) + ((xIndices[:, None] - IndexXScatter) ** 2)
    IM[iP, :] = 1/dSquared
    WS[iP] = IM[iP, :].sum(1)
    return IM / WS[:, None], IndexXScatter, IndexYScatter
I'm getting about a 200x improvement with this over your original with (100,100) for the arguments. There are probably some other minor improvements possible, but they won't affect execution time significantly.
Broadcasting is another NumPy skill that is a must.
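If broadcasting is unfamiliar, here is a minimal sketch of the trick used to build dSquared above, computing all pairwise squared distances between two coordinate sets (the arrays are made-up examples):
import numpy as np

ay = np.array([0, 1, 2]); ax = np.array([0, 0, 1])  # first point set (y, x)
by = np.array([1, 2]);    bx = np.array([2, 0])     # second point set (y, x)

# ay[:, None] has shape (3, 1); broadcasting against by (shape (2,))
# produces the full (3, 2) matrix of pairwise squared distances
d2 = (ay[:, None] - by)**2 + (ax[:, None] - bx)**2
print(d2.shape)  # (3, 2)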

Why is the result in FFT code without using Scipy not similar to Scipy FFT?

The FFT code below does not give results similar to the SciPy version, but I don't know what's wrong in this code.
import numpy as np
import matplotlib.pyplot as plt
#from scipy.fftpack import fft

def omega(p, q):
    return np.exp((-2j * np.pi * p) / q)

def fft(x):
    N = len(x)
    if N <= 1: return x
    even = fft(x[0::2])
    odd = fft(x[1::2])
    combined = [0] * N
    for k in range(N//2):
        combined[k] = even[k] + omega(k, N) * odd[k]
        combined[k + N//2] = even[k] - omega(k, N) * odd[k]
    return combined

N = 600
T = 1.0 / 800.0
x = np.linspace(0, N*T, N)
#y = np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)
y = np.sin(50.0 * 2.0*np.pi*x)
xf = np.linspace(0.0, 1.0/(2.0*T), N//2)
yf = fft(y)
yfa = 2.0/N * np.abs(yf[0:N//2])
plt.plot(xf, yfa)
plt.show()
This gives a plot that does not match the SciPy result (image not reproduced here).
All the above comments, i.e. roundoff errors and implementation correctness, are true, but you missed an important thing: the original Cooley-Tukey FFT algorithm works only if the number of samples N is a power of 2. You did notice that
np.allclose(yfa, yfa_sp)
>>> False
for your current input N = 600 (here yfa_sp denotes the same post-processed output computed with SciPy's fft); the discrepancies between your output and numpy/scipy are huge. But now, let's use the closest power of two, in this case N = 2**9 = 512, which gives
np.allclose(yfa, yfa_sp)
>>> True
Wonderful! The outputs are now identical, and this can be verified for other power-of-2 sizes of the input signal y (Nyquist criterion apart). For an in-depth explanation, you may read the accepted answer of this question to understand why the numpy/scipy fft functions allow any N (with most efficiency when N is a power of two, and least efficiency when N is prime), instead of your implementation just handling this error, as you should have, with something like:
if np.log2(N) % 1 > 0:
    raise ValueError('size of input y must be a power of 2')
or even, using the bitwise AND operator (a truly elegant test, in my opinion):
if N & (N-1):
    raise ValueError('size of input y must be a power of 2')
As suggested in the comments, if the size of the signal can't be modified so easily, zero-padding is definitely the way to go for this kind of sampling issue.
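A minimal sketch of that padding, reusing the question's own fft (the choice of padded length N2 and the scaling by the original N are my assumptions):
# zero-pad y up to the next power of two before calling fft
N2 = 1 << (N - 1).bit_length()                 # smallest power of 2 >= N
y_padded = np.concatenate([y, np.zeros(N2 - N)])
yf = fft(y_padded)                             # the recursive fft now works
xf = np.linspace(0.0, 1.0/(2.0*T), N2//2)
yfa = 2.0/N * np.abs(np.asarray(yf[0:N2//2]))  # scale by the original N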
Hope this helps.

Issues Translating Custom Discrete Fourier Transform from MATLAB to Python

I'm developing Python software for someone and they specifically requested that I use their DFT function, written in MATLAB, in my program. My translation is just plain not working, tested with sin(2*pi*r).
The MATLAB function below:
function X=dft(t,x,f)
% Compute DFT (Discrete Fourier Transform) at frequencies given
% in f, given samples x taken at times t:
%   X(f) = sum_k { x(k) * e**(-2*pi*j*t(k)*f) }
shape = size(f);
t = t(:); % Format 't' into a column vector
x = x(:); % Format 'x' into a column vector
f = f(:); % Format 'f' into a column vector
W = exp(-2*pi*j * f*t');
X = W * x;
X = reshape(X,shape);
And my Python interpretation:
def dft(t, x, f):
    i = 1j  # might not have to set it to a variable, but better safe than sorry!
    w1 = f * t
    w2 = -2 * math.pi * i
    W = exp(w1 * w2)
    newArr = W * x
    return newArr
Why am I having issues? The MATLAB code works fine, but the Python translation outputs a weird increasing sine curve instead of a Fourier transform. I get the feeling Python is handling the calculations slightly differently, but I don't know why or how to fix this.
Here's your MATLAB code -
t = 0:0.005:10-0.005;
x = sin(2*pi*t);
f = 30*(rand(size(t))+0.225);
shape = size(f);
t = t(:); % Format 't' into a column vector
x = x(:); % Format 'x' into a column vector
f = f(:); % Format 'f' into a column vector
W = exp(-2*pi*1j * f*t');
X = W * x;
X = reshape(X,shape);
figure,plot(f,X,'ro')
And here's what one version of the NumPy-ported code might look like -
import math
import numpy as np
import matplotlib.pyplot as plt
t = np.arange(0, 10, 0.005)
x = np.sin(2*np.pi*t)
f = 30*(np.random.rand(t.size)+0.225)
N = t.size
i = 1j
W = np.exp((-2 * math.pi * i)*np.dot(f.reshape(N,1),t.reshape(1,N)))
X = np.dot(W,x.reshape(N,1))
out = X.reshape(f.shape).T
plt.plot(f, out, 'ro')
MATLAB Plot and Numpy Plot - (images not reproduced here)
NumPy arrays do elementwise multiplication with *.
You need np.dot(w1, w2) for matrix multiplication with NumPy arrays (this is not the case for NumPy matrices).
Make sure you are clear on the distinction between NumPy arrays and matrices. There is a good help page "NumPy for Matlab Users":
http://wiki.scipy.org/NumPy_for_Matlab_Users
It doesn't appear to be working at present, so here is a temporary link.
Also, use t.T to transpose a NumPy array called t (note that .T is a no-op on a 1-D array, which is why the reshape calls above are needed).
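Putting both points together, a minimal sketch of the dft port (my own variant of the ported code above, using np.outer for the f*t' outer product and @ for the matrix product) might look like:
import numpy as np

def dft(t, x, f):
    # flatten the inputs, mirroring MATLAB's t(:), x(:), f(:)
    t = np.asarray(t).ravel()
    x = np.asarray(x).ravel()
    f = np.asarray(f)
    # W = exp(-2*pi*j * f*t') as an outer product, then the matrix product W*x
    W = np.exp(-2j * np.pi * np.outer(f.ravel(), t))
    return (W @ x).reshape(f.shape)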
