Here is a weird one:
I have found myself needing a numpy function that is what I would call the true inverse of np.cos (or of another trigonometric function; cosine is used here for definiteness). What I mean by a "true inverse" is a function invcos such that
np.cos(invcos(x)) = x
for any real float x. Two observations: invcos(x) exists (it is a complex float) and np.arccos(x) does not do the job, because it only works for -1 <= x <= 1.
My question is whether there is an efficient numpy function for this operation, or whether it can be built easily from existing ones.
My attempt was to use a combination of np.arccos and np.arccosh to build the function by hand. This is based on the observation that np.arccos can deal with x in [-1, 1] and np.arccosh can deal with x outside [-1, 1] if one multiplies its result by the imaginary unit. To see that this works:
cos_x = np.array([0.5, 1., 1.5])
x = np.arccos(cos_x)
cos_x_reconstructed = np.cos(x)
# [0.5 1. nan]
x2 = 1j*np.arccosh(cos_x)
cos_x_reconstructed2 = np.cos(x2)
# [nan+nanj 1.-0.j 1.5-0.j]
So we could combine these into:
def invcos(array):
    x1 = np.arccos(array)
    x2 = 1j*np.arccosh(array)
    print(x1)
    print(x2)
    x = np.empty_like(x1, dtype=np.complex128)
    x[~np.isnan(x1)] = x1[~np.isnan(x1)]
    x[~np.isnan(x2)] = x2[~np.isnan(x2)]
    return x
cos_x = np.array([0.5, 1., 1.5])
x = invcos(cos_x)
cos_x_reconstructed = np.cos(x)
# [0.5-0.j 1.-0.j 1.5-0.j]
This gives the correct results, but naturally raises RuntimeWarnings:
RuntimeWarning: invalid value encountered in arccos.
Since numpy warns me about invalid values, I guess this is probably not the best approach. Is there a better way to do this?
For readers who are interested in why this strange function may be useful: The motivation comes from a physics background. In certain theories, one can have vector components that are 'off-shell', which means that the components might even be longer than the vector. The above function can be useful to nevertheless parametrize things in terms of angles.
My question is whether there is an efficient numpy function for this operation, or whether it can be built easily from existing ones.
Yes; it is... np.arccos.
From the documentation:
For real-valued input data types, arccos always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.
For complex-valued input, arccos is a complex analytic function that has branch cuts [-inf, -1] and [1, inf] and is continuous from above on the former and from below on the latter.
So all we need to do is ensure that the input is a complex number (even if its imaginary part is zero):
>>> import numpy as np
>>> np.arccos(2.0)
__main__:1: RuntimeWarning: invalid value encountered in arccos
nan
>>> np.arccos(2 + 0j)
-1.3169578969248166j
For an array, we need the appropriate dtype:
>>> np.arccos(np.ones((3,3)) * 2)
array([[nan, nan, nan],
[nan, nan, nan],
[nan, nan, nan]])
>>> np.arccos(np.ones((3,3), dtype=complex) * 2)
array([[0.-1.3169579j, 0.-1.3169579j, 0.-1.3169579j],
[0.-1.3169579j, 0.-1.3169579j, 0.-1.3169579j],
[0.-1.3169579j, 0.-1.3169579j, 0.-1.3169579j]])
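A minimal wrapper based on the above (a sketch, not something provided by numpy itself): casting the input to a complex dtype before calling np.arccos handles values outside [-1, 1] without warnings.
import numpy as np
def invcos(array):
    # promote to complex so arccos stays defined outside [-1, 1]
    return np.arccos(np.asarray(array, dtype=complex))
cos_x = np.array([0.5, 1., 1.5])
print(np.cos(invcos(cos_x)))   # recovers [0.5, 1., 1.5] (up to a zero imaginary part)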
I know how to draw points moved by a matrix, as below:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
x=np.random.randn(2) #2*1 matrix
A=np.random.randn(2,2) #2*2 matrix
print ('the content of x:\n{}\n the content of A:\n{}'.format(x,A))
def action(pt, n):
    record = [pt]
    for i in range(n):
        pt = A @ pt
        record = np.vstack([record, pt])
    plt.scatter(record[:,1], record[:,1])
action(x,100)
the function "action" will draw something like a line, but I want to move points by matrix and then draw it like an orbit
SHORT ANSWER:
plt.scatter(record[:,1], record[:,1]) feeds the same values to both the x and y dimensions and hence will always produce a line. Replace it with:
X,Y = np.hsplit(record,2)
plt.scatter(X,Y)
LONG ANSWER:
The main reason the plot comes out as a line is that you are generating it from two constant arrays (albeit randomly generated). I will illustrate with the example below:
>>> c
array([[ 1., 2.],
[ 2., 4.]])
>>> d
array([ 3., 4.])
>>> d@c
array([ 11., 22.])
>>> d@c@c
array([ 55., 110.])
>>> d@c@c@c
array([ 275., 550.])
Notice how the repeated operation only multiplies the previous result by 5 at each stage.
How to get a non-linear plot?
Use the loop variable i, for example by raising it to a power of 2 (parabola) or higher.
Populate the two matrices with random numbers greater than 1; otherwise the repeated operations either grow the magnitude in the negative direction or, for values between -1 and 1, shrink it towards zero.
Use mathematical functions to introduce non-linearity, e.g.:
pt = pt + np.sin(pt)
Consider whether using two random matrices and looping over them is really the only way to achieve the curve; a corrected sketch follows below. If this step is independent of your bigger programme, it is probably worth trying a different approach and using mathematical functions that generate the curve you want.
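Here is a minimal sketch of the corrected action() that puts the two ideas together. The rotation matrix A is my own assumption (not from the question) so that the orbit is clearly visible; the essential fix is splitting the record into separate x and y columns before plotting.
import numpy as np
import matplotlib.pyplot as plt
theta = 0.2
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # rotation: points trace a circle
x = np.random.randn(2)
def action(pt, n):
    record = [pt]
    for _ in range(n):
        pt = A @ pt
        record.append(pt)
    record = np.vstack(record)
    X, Y = np.hsplit(record, 2)   # the fix: separate x and y coordinates
    plt.scatter(X, Y)
    plt.show()
action(x, 100)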
Is there a numerically stable way to compute softmax function below?
I am getting values that become NaNs in my neural network code.
np.exp(x)/np.sum(np.exp(x))
The softmax exp(x)/sum(exp(x)) is actually numerically well-behaved. It has only positive terms, so we needn't worry about loss of significance, and the denominator is at least as large as the numerator, so the result is guaranteed to fall between 0 and 1.
The only accident that might happen is over- or under-flow in the exponentials. Overflow of a single element or underflow of all elements of x will render the output more or less useless.
But it is easy to guard against that by using the identity softmax(x) = softmax(x + c) which holds for any scalar c: Subtracting max(x) from x leaves a vector that has only non-positive entries, ruling out overflow and at least one element that is zero ruling out a vanishing denominator (underflow in some but not all entries is harmless).
Footnote: theoretically, catastrophic accidents in the sum are possible, but you'd need a ridiculous number of terms. For example, even using 16-bit floats, which can only resolve 3 decimals (compared to the 15 decimals of a "normal" 64-bit float), we'd need between 2^1431 (~6 x 10^430) and 2^1432 terms to get a sum that is off by a factor of two.
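A quick numerical check of the shift identity discussed above, as a small sketch (the helper softmax here is a naive implementation written just for the check):
import numpy as np
def softmax(x):
    e = np.exp(x)
    return e / e.sum()
x = np.array([1.0, 2.0, 3.0])
print(np.allclose(softmax(x), softmax(x - np.max(x))))   # True: softmax(x) == softmax(x + c)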
The softmax function is prone to two issues: overflow and underflow.
Overflow: occurs when very large numbers are approximated as infinity.
Underflow: occurs when very small numbers (near zero on the number line) are rounded to zero.
To combat these issues when doing softmax computation, a common trick is to shift the input vector by subtracting the maximum element in it from all elements. For the input vector x, define z such that:
z = x-max(x)
And then take the softmax of the new (stable) vector z
Example:
def stable_softmax(x):
    z = x - max(x)
    numerator = np.exp(z)
    denominator = np.sum(numerator)
    softmax = numerator/denominator
    return softmax
# input vector
In [267]: vec = np.array([1, 2, 3, 4, 5])
In [268]: stable_softmax(vec)
Out[268]: array([ 0.01165623, 0.03168492, 0.08612854, 0.23412166, 0.63640865])
# input vector with really large number, prone to overflow issue
In [269]: vec = np.array([12345, 67890, 99999999])
In [270]: stable_softmax(vec)
Out[270]: array([ 0., 0., 1.])
In the above case, we safely avoided the overflow problem by using stable_softmax()
For more details, see the Numerical Computation chapter of the Deep Learning book.
Extending @kmario23's answer to support 1- or 2-dimensional numpy arrays or lists. 2D tensors (assuming the first dimension is the batch dimension) are common if you're passing a batch of results through softmax:
import numpy as np
def stable_softmax(x):
    z = x - np.max(x, axis=-1, keepdims=True)
    numerator = np.exp(z)
    denominator = np.sum(numerator, axis=-1, keepdims=True)
    softmax = numerator / denominator
    return softmax
test1 = np.array([12345, 67890, 99999999]) # 1D numpy
test2 = np.array([[12345, 67890, 99999999], # 2D numpy
[123, 678, 88888888]]) #
test3 = [12345, 67890, 999999999] # 1D list
test4 = [[12345, 67890, 999999999]] # 2D list
print(stable_softmax(test1))
print(stable_softmax(test2))
print(stable_softmax(test3))
print(stable_softmax(test4))
[0. 0. 1.]
[[0. 0. 1.]
[0. 0. 1.]]
[0. 0. 1.]
[[0. 0. 1.]]
There is nothing wrong with calculating the softmax function the way you do it. The problem seems to come from exploding gradients or similar issues with your training method. Focus on those matters, for example by clipping gradient values or choosing a better initial distribution of weights.
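A tiny illustration of the value-clipping suggestion, as a hedged sketch (the variable names are hypothetical, not taken from your code):
import numpy as np
grads = np.array([0.5, -3.7, 120.0, -4500.0])  # hypothetical gradient values
clipped = np.clip(grads, -1.0, 1.0)            # limit each entry to [-1, 1]
print(clipped)                                 # [0.5, -1.0, 1.0, -1.0]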
I have been dealing with linear algebra problems of the form A = Bx in Python and comparing this to a colleague's code in MATLAB and Mathematica. We have noticed differences between Python and the others when B is a singular matrix. When using numpy.linalg.solve() I get a singular matrix error, so I've instead used .pinv() (the Moore-Penrose pseudoinverse).
I understand that storing the inverse is computationally inefficient and am first of all curious if there's a better way of dealing with singular matrices in Python. However the thrust of my question lies in how Python chooses an answer from an infinite solution space, and why it chooses a different one than MATLAB and Mathematica do.
Here is my toy problem:
import numpy as np
from numpy import linalg

B = np.array([[2,4,6],[1,0,3],[0,7,0]])
A = np.array([[12],[4],[7]])
BI = linalg.pinv(B)
x = BI.dot(A)
The answer that Python outputs to me is:
[[ 0.4]
[ 1. ]
[ 1.2]]
While this is certainly a correct answer, it isn't the one I had intended: (1,1,1). Why does Python generate this particular solution? Is there a way to return the space of solutions rather than one possible solution? My colleague's code returned (1, 1, 1) - is there a reason that Python is different from Mathematica and MATLAB?
In short, your code (and apparently np.linalg.lstsq) uses the Moore-Penrose pseudoinverse, which is implemented in np.linalg.pinv. MATLAB and Mathematica likely use Gaussian elimination to solve the system. We can replicate this latter approach in Python using the LU decomposition:
import numpy as np
import scipy.linalg

B = np.array([[2,4,6],[1,0,3],[0,7,0]])
y = np.array([[12],[4],[7]])
P, L, U = scipy.linalg.lu(B)
This decomposes B as B = P L U, where U is now an upper-triangular matrix and P L is invertible. In particular, we find:
>>> U
array([[ 2., 4., 6.],
[ 0., 7., 0.],
[ 0., 0., 0.]])
and
>>> np.linalg.inv(P @ L) @ y
array([[ 12.],
[ 7.],
[ 0.]])
The goal is to solve the under-determined, transformed problem U x = (P L)^{-1} y. Its solution set is the same as that of our original problem. Write a solution as x = (x_1, x_2, x_3). From the second row we immediately see that any solution must have x_2 = 1. From the first row we then have 2 x_1 + 4 + 6 x_3 = 12. Solving for x_1 gives x_1 = 4 - 3 x_3, so any solution is of the form (4 - 3 x_3, 1, x_3).
The easiest way to generate a particular solution of this form is to simply choose x_3 = 1. Then x_1 = 1, and you recover the solution that MATLAB gives you: (1, 1, 1).
On the other hand, np.linalg.pinv computes the Moore-Penrose pseudoinverse, which is the unique matrix satisfying the pseudoinverse properties for B. The emphasis here is on unique. Therefore, when you say:
my question lies in how Python chooses an answer from an infinite solution space
the answer is that all of the choosing is actually done by you when you use the pseudoinverse, because np.linalg.pinv(B) is a unique matrix, and hence np.linalg.pinv(B) @ y is unique.
To generate the full set of solutions, see the comment above by @ali_m.
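If you do want the whole solution space in code, one way is to combine the pseudoinverse solution with a basis of the null space of B. This is a minimal sketch under the assumption that SciPy is available (it uses scipy.linalg.null_space), not something from the original answer:
import numpy as np
from scipy.linalg import null_space
B = np.array([[2, 4, 6], [1, 0, 3], [0, 7, 0]])
y = np.array([[12], [4], [7]])
x_particular = np.linalg.pinv(B) @ y   # the minimum-norm solution, [[0.4], [1.], [1.2]]
N = null_space(B)                      # orthonormal basis for the null space of B
# every solution can be written as x_particular + N @ t for some parameter vector t
t = np.array([[1.0]])
x_other = x_particular + N @ t
print(np.allclose(B @ x_other, y))     # True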
So I'm new to Python AND data analysis, but have been tasked to create a scatter plot. The data set that I'm using has many elements containing None values. When I use the polyfit method to create a trendline (best-fit line) I get errors for the Nones. I've tried using lists and numpy arrays with dismal results. I've also tried masked_array, masked_invalid, etc., in MULTIPLE configurations, but it kept giving me an array filled with Nones. Is there a way of creating a trendline such that I don't need to remove the elements that have None values? I need them to keep my plot dimensions correct. I'm using Python 2.7. This is what I have so far:
import matplotlib.pyplot as plt
import numpy as np
import numpy.ma as ma
import pylab
#The InterpolatedUnivariateSpline method popped up during my endeavor
#to extrapolate the trendline through the gaps in data.
#To be honest, I don't think it's doing anything for me...
from scipy.interpolate import InterpolatedUnivariateSpline
fig, ax = plt.subplots(1,1)
ax.scatter(y, dbm, color = 'purple', marker = 'o', s = 100)
plt.xlim(min(y), max(y))
plt.xlabel('Temp - C')
dbm_array = np.asarray(dbm) #dbm and y are lists earlier in the program
y_array = np.asarray(y)
x = np.linspace(min(y), max(y), len(y))
order = 1
s = InterpolatedUnivariateSpline(y, dbm, k=order)
blah = s(x)
plt.plot(y, blah, '--k')
This gives me the scatter plot without the trendline for some reason. No errors, so I guess I got that going for me....
Thank you so much in advance!
First of all, if you have arrays, there should be no Nones in them, just nans. This is because None is an object which cannot be expressed as a number. So, the first problem may be here. Let's have a look:
import numpy as np
a = np.array([None, 1, 2, 3, 4, None])
What do we get?
>>> a
array([None, 1, 2, 3, 4, None], dtype=object)
This is most certainly not what we wanted. It is an array of objects, which is usually not very useful. You cannot perform numerical calculations on it:
>>> 2*a
TypeError: unsupported operand type(s) for *: 'int' and 'NoneType'
This happens because the element-wise multiplication tries to multiply 2*None.
So, what you really want to have is:
>>> a = np.array([np.nan, 1, 2, 3, 4, np.nan])
>>> a
array([ nan, 1., 2., 3., 4., nan])
>>> a.dtype
dtype('float64')
>>> 2 * a
array([ nan, 2., 4., 6., 8., nan])
Now everything works as expected.
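As a side note, if your data arrives as a Python list that contains None (as in the question), here is a minimal sketch of the conversion: asking explicitly for a float dtype turns each None into nan.
import numpy as np
raw = [None, 1, 2, 3, 4, None]
a = np.array(raw, dtype=float)   # None -> nan
print(a)   # [nan  1.  2.  3.  4. nan]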
So, the first thing is to check that your input arrays have the correct form. If you then have problems with curve fitting, you may create an array without the nasty nans in there:
import numpy as np
a = np.array([[0,np.nan], [1, 1], [2, 1.5], [3.2, np.nan], [4, 5]])
b = a[~np.isnan(a[:,1])]
Let's see the contents of a and b:
>>> a
array([[ 0. , nan],
[ 1. , 1. ],
[ 2. , 1.5],
[ 3.2, nan],
[ 4. , 5. ]])
>>> b
array([[ 1. , 1. ],
[ 2. , 1.5],
[ 4. , 5. ]])
And this is what you want. The curve is fitted with b without any nans which have the habit of migrating around and making the results of calculations nans. (This is by design.)
How does this work, then? np.isnan(a[:,1]) returns a boolean array with True at each position where column 1 of a contains a nan and False for each valid number. As this is exactly the opposite of what we want, we negate it with the ~ operator in front. The indexing then picks only the rows which contain valid numbers.
In case you have your X data and Y data in two different 1-D vectors, do this:
# original y data: Y
# original x data: X
# both have the same length
# calculate a mask to be used (a boolean vector)
msk = ~np.isnan(Y)
# use the mask to plot both X and Y only at the points where Y is not NaN
plt.plot(X[msk], Y[msk])
In some cases you may not have the X data at all, but you would like to number the points from, e.g. 0 onwards (as matplotlib does if you only give it one vector). There are a couple of possibilities, but this is one:
msk = ~np.isnan(Y)
X = np.arange(len(Y))
plt.plot(X[msk], Y[msk])
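Putting it together for the original question, here is a minimal end-to-end sketch (the data values are hypothetical, not from the question): convert the lists to float arrays so None becomes nan, fit with polyfit on the masked values, and evaluate the trendline over the full x range so the plot dimensions stay correct.
import numpy as np
import matplotlib.pyplot as plt
y = [10.0, 12.0, None, 15.0, 18.0, None, 22.0]          # hypothetical temperatures
dbm = [-40.0, None, -35.0, -33.0, None, -30.0, -28.0]   # hypothetical dbm readings
y_arr = np.array(y, dtype=float)      # None -> nan
dbm_arr = np.array(dbm, dtype=float)
msk = ~np.isnan(y_arr) & ~np.isnan(dbm_arr)   # rows where both values are valid
coeffs = np.polyfit(y_arr[msk], dbm_arr[msk], 1)
x = np.linspace(np.nanmin(y_arr), np.nanmax(y_arr), 100)
plt.scatter(y_arr, dbm_arr, color='purple', marker='o', s=100)
plt.plot(x, np.polyval(coeffs, x), '--k')
plt.xlabel('Temp - C')
plt.show()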
If I have a waveform x such as
x = [math.sin(W*t + Ph) for t in range(16)]
with arbitrary W and Ph, and I calculate its (Real) FFT f with
f = numpy.fft.rfft(x)
I can get the original x with
numpy.fft.irfft(f)
Now, what if I need to extend the range of the recovered waveform a number of samples to the left and to the right? I.e. a waveform y such that len(y) == 48, y[16:32] == x and y[0:16], y[32:48] are the periodic extensions of the original waveform.
In other words, if the FFT assumes its input is an infinite function f(t) sampled over t = 0, 1, ... N-1, how can I recover the values of f(t) for t<0 and t>=N?
Note: I used a perfect sine wave as an example, but in practice x could be anything: arbitrary signals such as x = range(16) or x = np.random.rand(16), or a segment of any length taken from a random .wav file.
Now, what if I need to extend the range of the recovered waveform a number of samples to the left and to the right? I.e. a waveform y such that len(y) == 48, y[16:32] == x and y[0:16], y[32:48] are the periodic extensions of the original waveform.
The periodic extensions are also just x, because that is what a periodic extension is.
In other words, if the FFT assumes its input is an infinite function f(t) sampled over t = 0, 1, ... N-1, how can I recover the values of f(t) for t<0 and t>=N?
The "N-point FFT assumes" that your signal is periodic with a periodicity of N. That's because all the harmonic base functions your block is decomposed into are periodic in the way that the previous N and succeding N samples are just a copy of the main N samples.
If you allow any value for W, your input sinusoid won't be periodic with a period of N. But that does not stop the FFT function from decomposing it into a sum of many periodic sinusoids. And a sum of sinusoids that are each periodic with period N will also have a period of N.
Clearly, you have to rethink the problem.
Maybe you could make use of linear prediction. Compute a couple of linear prediction coefficients based on your fragment's windowed auto-correlation and the Levinson-Durbin recursion and extrapolate using those prediction coefficients. However, for a stable prediction filter, the prediction will converge to zero and the speed of convergence depends on what kind of signal you have. The perfect linear prediction coefficients for white noise, for example, are all zero. In that case you would "extrapolate" zeros to the left and the right. But there's not much you can do about it. If you have white noise, there is just no information in your fragment about surrounding samples because all the samples are independent (that's what white noise is about).
This kind of linear prediction is actually able to predict sinusoid samples perfectly. So, if your input is sin(W*t+p) for arbitrary W and p you will only need linear prediction with order two. For more complex signals I suggest an order of 10 or 16.
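To make the linear-prediction idea concrete, here is a minimal sketch. It is an assumption-laden illustration, not the original answer's code: it uses an order-2 predictor (enough for a single sinusoid, as argued above), the fragment's autocorrelation, and scipy.linalg.solve_toeplitz in place of an explicit Levinson-Durbin recursion.
import numpy as np
from scipy.linalg import solve_toeplitz
W, Ph, N, order = 0.7, 0.3, 16, 2
x = np.sin(W * np.arange(N) + Ph)
# autocorrelation of the fragment for lags 0..order
r = np.array([np.dot(x[:N - k], x[k:]) for k in range(order + 1)])
# prediction coefficients a such that x[n] ~ a[0]*x[n-1] + a[1]*x[n-2]
a = solve_toeplitz(r[:order], r[1:order + 1])
# extrapolate 16 samples to the right using the predictor
y = list(x)
for _ in range(16):
    y.append(np.dot(a, y[-1:-order - 1:-1]))
print(np.round(y[N:], 3))   # continues the sinusoid beyond the fragment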
The following examples should give you a good idea of how to go about it:
>>> x1 = np.random.rand(4)
>>> x2 = np.concatenate((x1, x1))
>>> x3 = np.concatenate((x1, x1, x1))
>>> np.fft.rfft(x1)
array([ 2.30410617+0.j , -0.89574460-0.26838271j, -0.26468792+0.j ])
>>> np.fft.rfft(x2)
array([ 4.60821233+0.j , 0.00000000+0.j ,
-1.79148921-0.53676542j, 0.00000000+0.j , -0.52937585+0.j ])
>>> np.fft.rfft(x3)
array([ 6.91231850+0.j , 0.00000000+0.j ,
0.00000000+0.j , -2.68723381-0.80514813j,
0.00000000+0.j , 0.00000000+0.j , -0.79406377+0.j ])
Of course the easiest way to get three periods is to concatenate 3 copies of the inverse FFT in the time domain:
np.concatenate((np.fft.irfft(f),) * 3)
But if you want or have to do this in the frequency domain, you can do the following:
>>> a = np.arange(4)
>>> f = np.fft.rfft(a)
>>> n = 3
>>> ext_f = np.zeros(((len(f) - 1) * n + 1,), dtype=f.dtype)
>>> ext_f[::n] = f * n
>>> np.fft.irfft(ext_f)
array([ 0., 1., 2., 3., 0., 1., 2., 3., 0., 1., 2., 3.])
For stationary waveforms that are periodic in the FFT aperture or length, you can just cyclically repeat the waveform, or the IFFT(FFT()) re-synthesized equivalent waveform, to extend them in the time domain. For waveforms which are windowed in time from sources that are not periodic in the FFT aperture or length, the FFT result will be the spectrum convolved with a Sinc function. So some equivalent of a de-convolution will be required to recover the original un-windowed spectral content. Since this deconvolution is difficult or impossible, most commonly an analysis/re-synthesis method is used instead, such as a phase-vocoder process or other frequency estimators. Then those estimated frequencies, which may be different from those in the bins of a single raw FFT result, can be fed to a bank of sinusoidal synthesizers, a mix of phase-modified IFFTs, or other re-synthesis methods, to create a longer waveform with approximately the same spectral content.