If I have a waveform x such as
x = [math.sin(W*t + Ph) for t in range(16)]
with arbitrary W and Ph, and I calculate its (Real) FFT f with
f = numpy.fft.rfft(x)
I can get the original x with
numpy.fft.irfft(f)
Now, what if I need to extend the range of the recovered waveform a number of samples to the left and to the right? I.e. a waveform y such that len(y) == 48, y[16:32] == x and y[0:16], y[32:48] are the periodic extensions of the original waveform.
In other words, if the FFT assumes its input is an infinite function f(t) sampled over t = 0, 1, ... N-1, how can I recover the values of f(t) for t<0 and t>=N?
Note: I used a perfect sine wave as an example, but in practice x could be anything: arbitrary signals such as x = range(16) or x = np.random.rand(16), or a segment of any length taken from a random .wav file.
Now, what if I need to extend the range of the recovered waveform a number of samples to the left and to the right? I.e. a waveform y such that len(y) == 48, y[16:32] == x and y[0:16], y[32:48] are the periodic extensions of the original waveform.
The periodic extensions are also just x, because that is what periodic extension means: y is three copies of x placed end to end.
In other words, if the FFT assumes its input is an infinite function f(t) sampled over t = 0, 1, ... N-1, how can I recover the values of f(t) for t<0 and t>=N?
The N-point FFT "assumes" that your signal is periodic with period N. That's because all the harmonic basis functions your block is decomposed into are periodic, so the preceding N and succeeding N samples are just copies of the main N samples.
If you allow any value for W, your input sinusoid won't be periodic with period N. But that does not stop the FFT from decomposing it into a sum of many periodic sinusoids, and a sum of sinusoids that are each periodic with period N is itself periodic with period N.
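To make the periodicity concrete, here is a minimal sketch that evaluates the inverse-DFT sum at times outside 0..N-1. The helper name idft_at and the values chosen for W and Ph are assumptions made for illustration, and it uses the full complex FFT rather than rfft to keep the sum simple.

```python
import numpy as np

def idft_at(f, t):
    """Evaluate the inverse-DFT sum of coefficients f at integer times t."""
    N = len(f)
    k = np.arange(N)
    t = np.asarray(t)
    # One row per requested time, one column per frequency bin.
    phases = np.exp(2j * np.pi * k[None, :] * t[:, None] / N)
    return (f[None, :] * phases).sum(axis=1).real / N

W, Ph = 0.7, 0.3                      # arbitrary, not periodic in 16 samples
x = np.sin(W * np.arange(16) + Ph)
f = np.fft.fft(x)

t = np.arange(-16, 32)                # 16 samples to the left and to the right
y = idft_at(f, t)
true_sine = np.sin(W * t + Ph)

print(np.allclose(y, np.tile(x, 3)))  # the "extension" is just x repeated
print(np.allclose(y, true_sine))      # it does not follow the underlying sine
```

The middle block y[16:32] reproduces x, while the outer blocks are copies of x rather than a continuation of the sine.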
Clearly, you have to rethink the problem.
Maybe you could make use of linear prediction. Compute a couple of linear prediction coefficients based on your fragment's windowed auto-correlation and the Levinson-Durbin recursion and extrapolate using those prediction coefficients. However, for a stable prediction filter, the prediction will converge to zero and the speed of convergence depends on what kind of signal you have. The perfect linear prediction coefficients for white noise, for example, are all zero. In that case you would "extrapolate" zeros to the left and the right. But there's not much you can do about it. If you have white noise, there is just no information in your fragment about surrounding samples because all the samples are independent (that's what white noise is about).
This kind of linear prediction is actually able to predict sinusoid samples perfectly. So, if your input is sin(W*t+p) for arbitrary W and p you will only need linear prediction with order two. For more complex signals I suggest an order of 10 or 16.
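As a rough sketch of the idea, the snippet below fits order-2 prediction coefficients and extrapolates a sinusoid to the right. Note the swap: it fits the coefficients by plain least squares on the fragment instead of the windowed auto-correlation plus Levinson-Durbin route described above, which is simpler to write down and exact for a noise-free sinusoid. The function name lp_extrapolate and the values of W and Ph are assumptions.

```python
import numpy as np

def lp_extrapolate(x, order, n_extra):
    """Extend x to the right by n_extra samples using linear prediction."""
    x = np.asarray(x, dtype=float)
    # Each row holds the `order` previous samples, most recent first.
    rows = np.array([x[n - order:n][::-1] for n in range(order, len(x))])
    targets = x[order:]
    coeffs, *_ = np.linalg.lstsq(rows, targets, rcond=None)
    out = list(x)
    for _ in range(n_extra):
        recent = np.array(out[-order:][::-1])
        out.append(float(coeffs @ recent))
    return np.array(out)

W, Ph = 0.7, 0.3
x = np.sin(W * np.arange(16) + Ph)

y = lp_extrapolate(x, order=2, n_extra=16)
true_sine = np.sin(W * np.arange(32) + Ph)
print(np.max(np.abs(y - true_sine)))  # tiny for a noise-free sinusoid
```

To extend to the left, the same function can be applied to the reversed fragment x[::-1] and the result reversed again.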
The following examples should give you a good idea of how to go about it:
>>> x1 = np.random.rand(4)
>>> x2 = np.concatenate((x1, x1))
>>> x3 = np.concatenate((x1, x1, x1))
>>> np.fft.rfft(x1)
array([ 2.30410617+0.j , -0.89574460-0.26838271j, -0.26468792+0.j ])
>>> np.fft.rfft(x2)
array([ 4.60821233+0.j , 0.00000000+0.j ,
-1.79148921-0.53676542j, 0.00000000+0.j , -0.52937585+0.j ])
>>> np.fft.rfft(x3)
array([ 6.91231850+0.j , 0.00000000+0.j ,
0.00000000+0.j , -2.68723381-0.80514813j,
0.00000000+0.j , 0.00000000+0.j , -0.79406377+0.j ])
Of course the easiest way to get three periods is to concatenate 3 copies of the inverse FFT in the time domain:
np.concatenate((np.fft.irfft(f),) * 3)
But if you want or have to do this in the frequency domain, you can do the following:
>>> a = np.arange(4)
>>> f = np.fft.rfft(a)
>>> n = 3
>>> ext_f = np.zeros(((len(f) - 1) * n + 1,), dtype=f.dtype)
>>> ext_f[::n] = f * n
>>> np.fft.irfft(ext_f)
array([ 0., 1., 2., 3., 0., 1., 2., 3., 0., 1., 2., 3.])
For stationary waveforms that are periodic in the FFT aperture or length, you can just cyclically repeat the waveform, or the IFFT(FFT()) re-synthesized equivalent waveform, to extend them in the time domain.
For waveforms that are windowed in time from sources that are not periodic in the FFT aperture or length, the FFT result will be the spectrum convolved with a Sinc function, so some equivalent of a deconvolution is required to recover the original un-windowed spectral content. Since this deconvolution is difficult or impossible, an analysis/re-synthesis method is most commonly used instead, such as a phase-vocoder process or other frequency estimators. The estimated frequencies, which may differ from the bin frequencies of a single raw FFT result, can then be fed to a bank of sinusoidal synthesizers, a mix of phase-modified IFFTs, or other re-synthesis methods to create a longer waveform with approximately the same spectral content.
I am trying to get 3D points using cv2.triangulatePoints, but it always returns almost the same Z value. As can be seen in my output, all points have almost the same Z value; there is no depth.
Here is my triangulation:
def triangulate(self, proj_mat1, pts0, pts1):
    proj_mat0 = np.zeros((3,4))
    proj_mat0[:, :3] = np.eye(3)
    pts0, pts1 = self.normalize(pts0), self.normalize(pts1)
    pts4d = cv2.triangulatePoints(proj_mat0, proj_mat1, pts0.T, pts1.T).T
    pts4d /= pts4d[:, 3:]
    out = np.delete(pts4d, 3, 1)
    print(out)
    return out
Here is my projection matrix calculation:
def getP(self, rmat, tvec):
    P = np.concatenate([rmat, tvec.reshape(3, 1)], axis = 1)
    return P
Here is the part that I get rmat, tvec and call triangulation:
E, mask = cv2.findEssentialMat(np.array(aa), np.array(bb), self.K)
_, R, t, mask = cv2.recoverPose(E, np.array(aa), np.array(bb), self.K)
proj_mat1 = self.getP(R, t)
out = self.triangulate(proj_mat1, np.array(aa, dtype = np.float32), np.array(bb, dtype = np.float32))
My camera matrix:
array([[787.8113353 , 0. , 318.49905794],
[ 0. , 786.9638204 , 245.98673477],
[ 0. , 0. , 1. ]])
My projection matrix 1:
array([[1., 0., 0., 0.],
[0., 1., 0., 0.],
[0., 0., 1., 0.]])
Explanations:
aa and bb are matched points from 2 frames.
self.K is my camera matrix
rotation and translation matrices are extracted from Essential matrix
Essential matrix calculated from matched keypoints. It changes every frame.
Projection matrix 2 changes every frame.
Output after changing first projection matrix (I switched from matplotlib to pangolin as 3D visualization tool):
Output after using P1 and P2 that I mentioned in comments:
Where is my mistake? Please let me know if any further information needed. I will update my question.
Unfortunately I can't double-check this directly, but my gut feeling is that the issues you are facing are essentially due to your choice of the first projection matrix.
I did some research and found this great paper covering both theory and practice. Although it differs a little from your approach, one point is worth noting.
If you check carefully, the first projection matrix there is exactly the camera matrix with an additional last column equal to zero. In fact, the rotation matrix for the first camera reduces to the identity matrix and the corresponding translation vector is the null vector, so using this general formula:
P = KT
where P is the projection matrix, K the camera matrix and T the matrix obtained by placing the translation vector t next to the rotation matrix R:
T = [R|t]
you get, for the first camera:
P0 = K[I|0] = [K|0]
Coming back to your case, first of all I would suggest changing your first projection matrix as just described.
Also, I understand that you plan to work with something different at every frame, but if things still don't match after the suggested change, then in your shoes I'd start by working with just 2 images (I think you implicitly did so already to create the correspondence between aa and bb), first calculating the matrices with your algorithm and then checking them against the ones obtained by following the article above.
That way you would be able to work out which matrices are causing the trouble.
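The formula P = K[R|t] can be sketched with NumPy alone. K is the camera matrix from the question; the helper name projection_matrix and the second camera's pose (R1, t1) are made up purely for illustration. One assumption worth stating: projection matrices that include K go together with pixel coordinates, whereas points already normalized by the inverse of K go together with plain [R|t] matrices, so the two conventions should not be mixed.

```python
import numpy as np

# Camera matrix K taken from the question.
K = np.array([[787.8113353, 0.0, 318.49905794],
              [0.0, 786.9638204, 245.98673477],
              [0.0, 0.0, 1.0]])

def projection_matrix(K, R, t):
    """Return the 3x4 matrix P = K [R|t]."""
    Rt = np.hstack([R, np.asarray(t, dtype=float).reshape(3, 1)])
    return K @ Rt

# First camera: identity rotation and zero translation, so P0 = [K|0].
P0 = projection_matrix(K, np.eye(3), np.zeros(3))

# Second camera: a hypothetical pose, only for illustration.
R1 = np.eye(3)
t1 = np.array([-1.0, 0.0, 0.0])
P1 = projection_matrix(K, R1, t1)

# A homogeneous 3D point on the first camera's optical axis
# should project onto the principal point (cx, cy).
X = np.array([0.0, 0.0, 5.0, 1.0])
u = P0 @ X
pixel = u[:2] / u[2]
print(pixel)
```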
Thank you so much for all the effort @Antonino. My webcams were pretty bad. After changing every part of my code and making many trials I decided to change my webcams and bought good webcams. It worked :D Here is the result:
I know how to draw points moved by a matrix, like this:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
x=np.random.randn(2) #2*1 matrix
A=np.random.randn(2,2) #2*2 matrix
print ('the content of x:\n{}\n the content of A:\n{}'.format(x,A))
def action(pt,n):
    record=[pt]
    for i in range(n):
        pt= A@pt
        record=np.vstack([record,pt])
    plt.scatter(record[:,1],record[:,1])
action(x,100)
The function "action" draws something like a line, but I want to move the points by the matrix and then draw them like an orbit.
SHORT ANSWER:
plt.scatter(record[:,1],record[:,1]) feeds the same values to both the x and y dimensions and hence will always produce a line. Replace it with:
X,Y = np.hsplit(record,2)
plt.scatter(X,Y)
LONG ANSWER:
The main reason the plot comes out as a line is that you repeatedly apply the same two constant arrays (although they are randomly generated). I will illustrate with the example below:
>>> c
array([[ 1., 2.],
[ 2., 4.]])
>>> d
array([ 3., 4.])
>>> d@c
array([ 11., 22.])
>>> d@c@c
array([ 55., 110.])
>>> d@c@c@c
array([ 275., 550.])
Notice how, in this example, each repeated multiplication only scales the previous coordinates by 5.
How to get a non-linear plot?
Use the loop variable i, for example by raising it to a power of 2 (a parabola) or more.
Populate the two arrays with random numbers greater than 1. Otherwise the operations either grow the magnitude in the negative direction or, for values between -1 and 1, shrink it.
Use mathematical functions to introduce non-linearity, e.g.:
pt = pt + np.sin(pt)
Consider whether looping over two random arrays is the only way to obtain the curve. If this activity is independent of your bigger programme, then try a different approach using mathematical functions that generate the curve you want.
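One concrete way to get an orbit from a matrix, offered as a sketch rather than the only option, is to replace the random A with a rotation matrix: every multiplication then keeps the distance to the origin fixed, so the iterates trace a circle. The function name iterate, the angle theta and the starting point are assumptions for illustration.

```python
import numpy as np

def iterate(A, pt, n):
    """Apply A to pt n times and return all visited points as rows."""
    record = [pt]
    for _ in range(n):
        pt = A @ pt
        record.append(pt)
    return np.array(record)

theta = 2 * np.pi / 100               # 100 steps make one full turn
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
x = np.array([1.0, 0.5])

record = iterate(A, x, 100)
radii = np.linalg.norm(record, axis=1)
print(radii.min(), radii.max())       # constant radius: a circular orbit

# To draw it, pass the two columns separately:
# plt.scatter(record[:, 0], record[:, 1])
```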
I have a matrix which I'm trying to normalize by transforming each feature column to zero mean and unit standard deviation.
I have the following code that I'm using, but I want to know if that method actually does what I'm trying to or if it uses a different method.
from sklearn import preprocessing
mat_normalized = preprocessing.normalize(mat_from_df)
sklearn.preprocessing.normalize scales each sample vector to unit norm. (The default axis is 1, not 0.) Here's proof of that:
import numpy as np
from sklearn.preprocessing import normalize
np.random.seed(444)
data = np.random.normal(loc=5, scale=2, size=(15, 2))
np.linalg.norm(normalize(data), axis=1)
# array([ 1., 1., 1., 1., 1., 1., ...
It sounds like you're looking for sklearn.preprocessing.scale, which standardizes each feature column to zero mean and unit variance.
from sklearn.preprocessing import scale
# Are the scaled column-wise means approx. 0.?
np.allclose(scale(data).mean(axis=0), 0.)
# True
# Are the scaled column-wise stdevs. approx. 1.?
np.allclose(scale(data).std(axis=0), 1.)
# True
As the documentation states:
sklearn.preprocessing.normalize(X, norm='l2',
axis=1, copy=True,
return_norm=False)
Scale input vectors individually to unit norm (vector length).
So it takes the norm (by default the L2 norm) of each vector and rescales the vector so that this norm is 1.
So if we take an n×m matrix as input, the output is an n×m matrix in which each of the n row vectors (of length m) is normalized. For norm='l2' (the default), this means that the length of each row is calculated (the square root of the sum of the squares of its components) and every element is divided by that length, so that the result is a vector of length 1.
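The difference between the two operations can be written out by hand in plain NumPy. This is a sketch of the arithmetic described above, not scikit-learn's implementation; the variable names are assumptions, and the standardization uses the population standard deviation (NumPy's default, ddof=0).

```python
import numpy as np

rng = np.random.default_rng(444)
data = rng.normal(loc=5, scale=2, size=(15, 2))

# Row-wise L2 normalization: divide each sample (row) by its length.
row_norms = np.sqrt((data ** 2).sum(axis=1, keepdims=True))
unit_rows = data / row_norms

# Column-wise standardization: zero mean and unit standard deviation per feature.
standardized = (data - data.mean(axis=0)) / data.std(axis=0)

print(np.linalg.norm(unit_rows, axis=1))                # all ones
print(standardized.mean(axis=0), standardized.std(axis=0))
```

Only the second operation gives each feature column zero mean and unit standard deviation, which is what the question asks for.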
Is there a numerically stable way to compute the softmax function below?
I am getting values that become NaNs in my neural network code.
np.exp(x)/np.sum(np.exp(x))
The softmax exp(x)/sum(exp(x)) is actually numerically well-behaved. It has only positive terms, so we needn't worry about loss of significance, and the denominator is at least as large as the numerator, so the result is guaranteed to fall between 0 and 1.
The only accident that might happen is over- or under-flow in the exponentials. Overflow of a single or underflow of all elements of x will render the output more or less useless.
But it is easy to guard against that by using the identity softmax(x) = softmax(x + c), which holds for any scalar c. Subtracting max(x) from x leaves a vector that has only non-positive entries, which rules out overflow, and at least one entry that is exactly zero, which rules out a vanishing denominator (underflow in some but not all entries is harmless).
Footnote: theoretically, catastrophic accidents in the sum are possible, but you'd need a ridiculous number of terms. For example, even using 16-bit floats, which can only resolve 3 decimals (compared to 15 decimals of a "normal" 64-bit float), we'd need between 2^1431 (~6 x 10^430) and 2^1432 terms to get a sum that is off by a factor of two.
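A small sketch of the failure mode and of the shift identity; the function names naive_softmax and shifted_softmax and the test vectors are assumptions for illustration.

```python
import numpy as np

def naive_softmax(x):
    e = np.exp(x)
    return e / e.sum()

def shifted_softmax(x):
    # softmax(x) == softmax(x + c) for any scalar c; here c = -max(x).
    e = np.exp(x - np.max(x))
    return e / e.sum()

small = np.array([1.0, 2.0, 3.0])
large = small + 999.0                 # same softmax in exact arithmetic

# exp(1000) overflows to inf in float64, and inf / inf is nan.
with np.errstate(over="ignore", invalid="ignore"):
    naive_large = naive_softmax(large)

print(naive_large)                    # all nan
print(shifted_softmax(large))         # finite, sums to 1
print(shifted_softmax(small))         # the same values
```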
The softmax function is prone to two issues: overflow and underflow.
Overflow: It occurs when very large numbers are approximated as infinity
Underflow: It occurs when very small numbers (near zero in the number line) are approximated (i.e. rounded to) as zero
To combat these issues when doing softmax computation, a common trick is to shift the input vector by subtracting the maximum element in it from all elements. For the input vector x, define z such that:
z = x-max(x)
And then take the softmax of the new (stable) vector z
Example:
import numpy as np
def stable_softmax(x):
    z = x - max(x)
    numerator = np.exp(z)
    denominator = np.sum(numerator)
    softmax = numerator/denominator
    return softmax
# input vector
In [267]: vec = np.array([1, 2, 3, 4, 5])
In [268]: stable_softmax(vec)
Out[268]: array([ 0.01165623, 0.03168492, 0.08612854, 0.23412166, 0.63640865])
# input vector with really large number, prone to overflow issue
In [269]: vec = np.array([12345, 67890, 99999999])
In [270]: stable_softmax(vec)
Out[270]: array([ 0., 0., 1.])
In the above case, we safely avoided the overflow problem by using stable_softmax().
For more details, see the chapter Numerical Computation in the Deep Learning book.
Extending @kmario23's answer to support 1- or 2-dimensional numpy arrays or lists. 2D tensors (assuming the first dimension is the batch dimension) are common if you're passing a batch of results through softmax:
import numpy as np
def stable_softmax(x):
    z = x - np.max(x, axis=-1, keepdims=True)
    numerator = np.exp(z)
    denominator = np.sum(numerator, axis=-1, keepdims=True)
    softmax = numerator / denominator
    return softmax
test1 = np.array([12345, 67890, 99999999]) # 1D numpy
test2 = np.array([[12345, 67890, 99999999], # 2D numpy
[123, 678, 88888888]]) #
test3 = [12345, 67890, 999999999] # 1D list
test4 = [[12345, 67890, 999999999]] # 2D list
print(stable_softmax(test1))
print(stable_softmax(test2))
print(stable_softmax(test3))
print(stable_softmax(test4))
[0. 0. 1.]
[[0. 0. 1.]
[0. 0. 1.]]
[0. 0. 1.]
[[0. 0. 1.]]
There is nothing wrong with calculating the softmax function the way you do. The problem more likely comes from exploding gradients or a similar issue with your training method. Focus on those, for example by clipping values or by choosing a suitable initial distribution for the weights.
I'm wondering if anyone can help me better understand the Fourier series by seeing the output of a Fourier transform actually used as the coefficients in the series of sine and cosine functions.
I have some function, I sample 4 times to get [0, 1, 0, 1], therefore N == 4. How do I express this as a fourier series? Using numpy, the fft gives me...[ 2.+0.j, 0.+0.j, -2.+0.j, 0.+0.j] Basically, I need to see the expansion of this, NOT in summation notation, just because otherwise I will just have a fear-induced cloudy mind. By expansion, I mean sin(something * x) + cos(something * x) + ...
The relationship between the Discrete Fourier Transform (DFT) and Fourier series can be summarized as follows,
The Fourier series coefficients of a periodic signal x are given by the DFT of one period of x, divided by N, where N is also the number of samples in each period.
This means that the Fourier series and the DFT are related only if the period of the signal is equal to N times the sampling interval, which is not the case in general.
Therefore, in practice, when you need the DFT, use scipy.fftpack.fft, while Fourier series coefficients can be calculated with a direct summation in Python. There is plenty of literature online about both concepts, but do not mix the two, as that will probably be more confusing than helpful.
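For the special case where one period is exactly N samples, the direct summation can be sketched as follows and compared with the FFT divided by N. The function name fourier_coefficients is an assumption; the input is the [0, 1, 0, 1] example from the question.

```python
import numpy as np

def fourier_coefficients(x):
    """c_k = (1/N) * sum_n x[n] * exp(-2j*pi*k*n/N), for k = 0..N-1."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    n = np.arange(N)
    return np.array([(x * np.exp(-2j * np.pi * k * n / N)).sum() / N
                     for k in range(N)])

x = [0.0, 1.0, 0.0, 1.0]
c = fourier_coefficients(x)

print(np.round(c, 12))                        # 0.5, 0, -0.5, 0
print(np.allclose(c, np.fft.fft(x) / len(x)))
```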
I guess you are looking for something like this:
from numpy import array, arange, cos, pi
from numpy.fft import fft
x = array([ 0., 1., 0., 1.])
y = fft(x)
# first rescale it
nfft = len(x)
y /= nfft
n = arange(0,4)
# notice that y[1] and y[3] are identically zero:
x_reconstructed = y[0] +y[2] * cos(2*2*pi/nfft*n)
Now x_reconstructed equals x (as complex numbers with zero imaginary part). You can then go to the page of the DFT, especially this equation, and understand the summation notation based on the example above.