Having trouble plotting a log-log plot in python


Hey, so I'm trying to plot quantities such as age against frequency for a rotating body. I am given the period and the period derivative, as well as their associated errors. Since frequency is related to period by:
f = 1/T
where f is the frequency and T is the period,
then
df = -(1/T^2) * dT
where dT and df are the derivatives of the period and the frequency.
But when it comes to plotting the log of this, I can't do it in Python, because it doesn't accept negative values for a log-log plot.
I've tried a workaround of using only absolute values, but then I only get half of each error bar when plotting. Is there a way to make Python plot both the negative and positive error bars? The frequency derivative itself is a negative quantity.

Unfortunately, you cannot take log(x) of a negative x, because log(x) = y <=> 10^y = x.
Is 10^y ever going to be -5?
It is impossible to make 10^y <= 0: as y goes to -infinity, 10^y approaches 1/infinity; it approaches, but never reaches or passes, 0.
So log(x) cannot be plotted directly where x is negative.
One simple solution to your problem however, is to take the absolute value of df. By doing this, negative numbers become positive. The only downside is that after you've transformed the data this way, you will need to undo the transformation. If the number was negative (and turned positive due to abs(df)), then you must multiply it by -1 afterwards.
You may need to define your own absolute value function that records any values it needs to make positive:
changeList = []

def absRecordChanges(value):
    # Flip a negative value to positive and record the flipped value.
    if value < 0:
        value = value * -1
        changeList.append(value)
    return value
There are other ways to solve the problem, but they are all centred around transforming your data to meet the condition of a log transformation (x > 0), and recording which data you changed so you can change it back afterward.
EDIT:
While fiddling around in Desmos, I was able to plot log(x) for any nonzero x. I used a piecewise function to do this: {x<0:-log(abs(x)),log (x)}.
from math import log10

def piecewiseLog(x):
    # Mirror the logarithm for negative x; x == 0 is still undefined.
    if x < 0:
        return -log10(abs(x))
    else:
        return log10(x)
As I'm not familiar with matlab syntax, this link has an alternative solution: http://www.mathworks.com/matlabcentral/answers/31566-display-negative-values-on-logarithmic-graph
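For the error bars specifically, one option is to plot |df| and pass asymmetric errors, clipping only the lower bar so it never reaches zero. The following is a minimal sketch, assuming the errors on T and dT are uncorrelated and using first-order error propagation; the function names `freq_derivative` and `log_safe_errors`, the `floor` parameter, and the sample numbers are all hypothetical.

```python
import numpy as np

def freq_derivative(T, dT, sigma_T, sigma_dT):
    """Return df = -dT / T**2 and its first-order uncertainty.

    Assumes the errors on T and dT are uncorrelated.
    """
    T, dT = np.asarray(T, float), np.asarray(dT, float)
    sigma_T, sigma_dT = np.asarray(sigma_T, float), np.asarray(sigma_dT, float)
    df = -dT / T**2
    # Partial derivatives: d(df)/d(dT) = -1/T**2 and d(df)/dT = 2*dT/T**3
    sigma_df = np.sqrt((sigma_dT / T**2) ** 2 + (2 * dT * sigma_T / T**3) ** 2)
    return df, sigma_df

def log_safe_errors(values, sigma, floor=1e-3):
    """Return |values| and (lower, upper) error bars usable on a log axis.

    The lower bar is clipped so that |value| - lower stays positive
    (at least floor * |value|); the upper bar is left unchanged.
    """
    mag = np.abs(values)
    lower = np.minimum(sigma, mag * (1.0 - floor))
    return mag, np.vstack([lower, sigma])

# Hypothetical sample values, for illustration only.
df, sigma_df = freq_derivative(T=[0.5, 2.0], dT=[1e-13, 4e-15],
                               sigma_T=[1e-6, 1e-6], sigma_dT=[2e-13, 1e-16])
mag, yerr = log_safe_errors(df, sigma_df)
```

Here `yerr` has shape (2, N), which is the layout matplotlib's `errorbar` accepts for asymmetric bars, so something like `plt.errorbar(freq, mag, yerr=yerr)` followed by `plt.yscale('log')` should draw both halves. The axis label then needs to say that the plotted quantity is |df|.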

Related

Overflow in exp when curve_fit of datetime

I tried to fit datetime vs. float data using curve_fit. As far as I understand, curve_fit does not work with datetime, so I first have to convert the data to numerical values. This gives me very large values for x that cause an overflow in the exp function. My code is below. The same code does work if I fit with a polynomial instead of the exponential.
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit

def func(x, a):
    return np.exp(a * x)

def fit_exponential(gdtemp):
    gdtemp['Date'] = pd.to_datetime(gdtemp.Date)
    mask = (gdtemp['Date'] > '2020-01-30') & (gdtemp['Date'] <= '2020-03-20')
    gdtemp = gdtemp.loc[mask].copy()
    x = pd.to_numeric(gdtemp.Date)
    y = gdtemp['Confirmed']
    popt, pcov = curve_fit(func, x, y)
How can I modify the code to work with the exponential?
I have two ideas on how to fix this but am not sure how to go about implementing this:
1st idea: Don't convert with to_numeric, but in some other way that produces smaller numbers. My input data is fairly simple and consists of exactly 1 row per day, so I don't need time or anything else. Is there another function similar to to_numeric() that ignores the time part and produces smaller numbers?
2nd idea: divide the numeric date values by some large number and later multiply back. What number should I use for dividing?

I solved this by mapping the large numerical x values to the interval [0, 1] and fitting on this interval.
The essential modifications are:
small_x = (x - x.min()) / (x.max() - x.min())
popt, pcov = curve_fit(func, small_x, y)
The values in the exponent are now reasonable (on the order of 1 in my case) and there is no problem with overflows.
Without this mapping I would end up with very large x values (on the order of 10^(15)) and very tiny values for a (on the order of 10^(-15)), which the fitting function obviously did not like.
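Both ideas from the question can be tried without pandas. This is a minimal sketch on made-up daily data (the `dates` and `confirmed` arrays are hypothetical): it converts dates to days since the first date, applies the [0, 1] mapping described above, and, as a quick sanity check that is not a replacement for `curve_fit`, recovers the growth rate with a straight-line fit to log(y).

```python
import numpy as np

# Hypothetical daily data: one row per day, as in the question.
dates = np.arange("2020-01-31", "2020-03-21", dtype="datetime64[D]")
confirmed = 5.0 * np.exp(0.1 * np.arange(dates.size))

# Idea 1: days elapsed since the first date (0.0, 1.0, 2.0, ...).
days = (dates - dates[0]) / np.timedelta64(1, "D")

# Idea 2: start from nanosecond timestamps (large integers, similar to
# what pd.to_numeric returns for datetimes) and map them onto [0, 1].
x = dates.astype("datetime64[ns]").astype(np.int64)
small_x = (x - x.min()) / (x.max() - x.min())

# Sanity check with a swapped-in technique: a linear fit to log(y)
# against days, which avoids evaluating exp on huge arguments.
slope, intercept = np.polyfit(days, np.log(confirmed), 1)
```

Either `days` or `small_x` can be passed to `curve_fit` in place of the raw timestamps. Keep in mind that the fitted `a` then refers to the rescaled axis, not to the original nanosecond values.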

Convert a sawtooth into a continuous linear function

Data from angular encoders is in a sawtooth shape ranging from 0° to 360°. I would now like to create a continuous linear function that describes the total angle.
I would like to go from a sawtooth function that can be created like this (in python with numpy):
x = np.arange(0,1000,2)
y = np.arange(0,1000,2)%360
Plot sawtooth function
Back to the linear (in this case identity) function:
x = np.arange(0,1000,2)
y = np.arange(0,1000,2)
Plot linear function
The data I'm trying to use this on is not generated, it's measurement data from an angular encoder. I do not know the frequency. I know that the function value is in the interval [0,360]. I'm looking for a solution that can also handle a 'negative' sawtooth.

Hi, I faced the same issue and solved it.
This is what my signal looks like:
Sawtooth function of angle
It's an array containing the Angle of a rotating complex number in the range [-pi, pi].
What I wanted is the continuous linear function as you described.
I compare each pair of consecutive elements of the array of angle values; whenever the difference between them jumps by a multiple of pi (a wrap-around), every following value is shifted by the accumulated offset.
from numpy import pi

n = 0
for i in range(len(Angle) - 1):
    # Angle[i] already carries the offset pi*n, so a wrap shows up as n + 2.
    if round((Angle[i] - Angle[i+1]) / pi) == n + 2:
        n = n + 2
    Angle[i+1] = Angle[i+1] + pi * n
This is what I got:
Linear Angle

It looks like you just need to split the effective value into two parts; let's call them base and remainder. The total value would be base + remainder.
Then you analyze changes of the input: if it was high (359 or close) and suddenly became low (0 or close), you add 360 to base. You subtract 360 from base if the change happened in the other direction. After recalculating base, you assign the input value to remainder for future reference. And that's all.
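The base + remainder idea can be sketched in a few lines of NumPy. This assumes consecutive samples never differ by more than half a period, so any larger jump must be a wrap-around; the function name `unwrap_sawtooth` is hypothetical.

```python
import numpy as np

def unwrap_sawtooth(y, period=360.0):
    """Turn a wrapped signal into a continuous one.

    Assumes consecutive samples never differ by more than half a period,
    so any larger jump is treated as a wrap-around.
    """
    y = np.asarray(y, dtype=float)
    steps = np.diff(y)
    # A drop of more than half a period means the input went from high to
    # low (e.g. 359 -> 1), so the base gains one period. A rise of more
    # than half a period is the opposite case ('negative' sawtooth).
    wraps = np.zeros_like(steps)
    wraps[steps < -period / 2] = period
    wraps[steps > period / 2] = -period
    base = np.concatenate(([0.0], np.cumsum(wraps)))
    return y + base  # total = base + remainder

x = np.arange(0, 1000, 2)
y = x % 360
total = unwrap_sawtooth(y)
```

NumPy also ships `np.unwrap`, which does the same job and works in radians by default.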

Find plateau in Numpy array

I am looking for an efficient way to detect plateaus in otherwise very noisy data. The plateaus are always relatively broad. A simple example of what this data could look like:
test=np.random.uniform(0.9,1,100)
test[10:20]=0
plt.plot(test)
Note that there can be multiple plateaus (which should all be detected) which can have different values.
I've tried using scipy.signal.argrelextrema, but it doesn't seem to be doing what I want it to:
peaks=argrelextrema(test,np.less,order=25)
plt.vlines(peaks,ymin=0, ymax=1)
I don't need the exact interval of the plateau; a rough range estimate would be enough, as long as that estimate is greater than or equal to the actual plateau range. It should be relatively efficient, however.

There is a function, scipy.signal.find_peaks, that you can try. Here is an example:
import numpy
from scipy.signal import find_peaks
test = numpy.random.uniform(0.9, 1.0, 100)
test[10 : 20] = 0
peaks, peak_plateaus = find_peaks(- test, plateau_size = 1)
Although find_peaks only finds peaks, it can be used to find valleys if the array is negated. Then you do the following:
for i in range(len(peak_plateaus['plateau_sizes'])):
    if peak_plateaus['plateau_sizes'][i] > 1:
        print('a plateau of size %d is found' % peak_plateaus['plateau_sizes'][i])
        print('its left index is %d and right index is %d' % (peak_plateaus['left_edges'][i], peak_plateaus['right_edges'][i]))
It will print:
a plateau of size 10 is found
its left index is 10 and right index is 19

This is really just a "dumb" machine learning task. You'll want to code a custom function to screen for them. You have two key characteristics to a plateau:
1. They're consecutive occurrences of the same value (or very nearly so).
2. The first and last points deviate strongly from a forward and backward moving average, respectively. (Try quantifying this based on the standard deviation if you expect additive noise; for geometric noise you'll have to take the magnitude of your signal into account too.)
A simple loop should then be sufficient to calculate a forward moving average, stdev of points in that forward moving average, reverse moving average, and stdev of points in that reverse moving average.
1. Read until you find a point well outside the regular noise (compare to variance). Start buffering those indices into a list.
2. Keep reading and buffering indices into that list while they have the same value (or nearly the same, if your plateaus can be a little rough; you'll want to use some tolerance plus the standard deviation of your plateaus, or just some tolerance if you expect them all to behave similarly).
3. If the variance of the points in your buffer gets too high, it's not a plateau, too rough; throw it out and start scanning again from your current position.
4. If the last value was very different from the previous (on the order of the change that triggered your code to start buffering indices) and in the opposite direction of the original impulse, cap your buffer here; you've got a plateau there.
5. Now do whatever you want with the points at those indices. Delete them, replace them with a linear interpolation between the two boundary points, whatever.
I could generate some noise and give you some sample code, but this is really something you're going to have to adapt to your application. (For example, there's a shortcoming in this method that a plateau which captures a point on the middle of the "cliff edge" may leave that point when it removes the rest of the plateau. If that's something you're worried about, you'll have to do a little more exploring after you ID the plateau.) You should be able to do this in a single pass over the data, but it might be wise to get some statistics on the whole set first to intelligently tweak your thresholds.
If you have an exact definition of what constitutes a plateau, you can make this a lot less hand-wavey and ML-looking, but so long as you're trying to identify fuzzy pattern, you're gonna have to take a statistics-based approach.
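As a starting point, the first characteristic alone (consecutive, nearly equal values) can be checked in one vectorized pass. This is a simplified stand-in, not the moving-average procedure described above; the function name `find_flat_runs` and the `tol` and `min_length` parameters are hypothetical, and the thresholds would need tuning for real data.

```python
import numpy as np

def find_flat_runs(signal, tol, min_length):
    """Return (start, stop) index pairs, stop exclusive, of runs in which
    consecutive samples differ by at most `tol` and which contain at
    least `min_length` samples."""
    signal = np.asarray(signal, dtype=float)
    # flat[i] is True when the step from sample i to sample i+1 is small.
    flat = np.abs(np.diff(signal)) <= tol
    padded = np.concatenate(([False], flat, [False]))
    # Positions where the flat/not-flat state changes, in start/end pairs.
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    # A run of small steps k..m covers samples k..m+1, hence the +1.
    starts, stops = edges[::2], edges[1::2] + 1
    keep = (stops - starts) >= min_length
    return list(zip(starts[keep].tolist(), stops[keep].tolist()))

# Data shaped like the example in the question.
rng = np.random.default_rng(0)
test = rng.uniform(0.9, 1.0, 100)
test[10:20] = 0
plateaus = find_flat_runs(test, tol=1e-3, min_length=5)
```

Because it only compares neighbouring samples, this works for plateaus that are much flatter than the surrounding noise. Rough plateaus need the statistics-based checks from the list above.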

I had a similar problem and found a simple heuristic solution, shared below. I find plateaus as ranges of constant gradient of the signal. You could change the code to also check that the gradient is (close to) 0.
I apply a moving average (uniform_filter1d) to filter out noise. I also calculate the first and second derivatives of the signal numerically, so I'm not sure it meets the efficiency requirement. But it worked perfectly for my signal and might be a good starting point for others.
def find_plateaus(F, min_length=200, tolerance=0.75, smoothing=25):
    '''
    Finds plateaus of signal using second derivative of F.

    Parameters
    ----------
    F : Signal.
    min_length: Minimum length of plateau.
    tolerance: Number between 0 and 1 indicating how tolerant
        the requirement of constant slope of the plateau is.
    smoothing: Size of uniform filter 1D applied to F and its derivatives.

    Returns
    -------
    plateaus: array of plateau left and right edges pairs
    dF: (smoothed) derivative of F
    d2F: (smoothed) Second Derivative of F
    '''
    import numpy as np
    from scipy.ndimage import uniform_filter1d

    # calculate smooth gradients
    smoothF = uniform_filter1d(F, size=smoothing)
    dF = uniform_filter1d(np.gradient(smoothF), size=smoothing)
    d2F = uniform_filter1d(np.gradient(dF), size=smoothing)

    def zero_runs(x):
        '''
        Helper function for finding sequences of 0s in a signal
        https://stackoverflow.com/questions/24885092/finding-the-consecutive-zeros-in-a-numpy-array/24892274#24892274
        '''
        iszero = np.concatenate(([0], np.equal(x, 0).view(np.int8), [0]))
        absdiff = np.abs(np.diff(iszero))
        ranges = np.where(absdiff == 1)[0].reshape(-1, 2)
        return ranges

    # Find ranges where second derivative is zero
    # Values under eps are assumed to be zero.
    eps = np.quantile(abs(d2F), tolerance)
    smalld2F = (abs(d2F) <= eps)

    # Find repetitions in the mask "smalld2F" (i.e. ranges where d2F is constantly zero)
    p = zero_runs(np.diff(smalld2F))

    # np.diff(p) gives the length of each range found.
    # only accept plateaus of min_length
    plateaus = p[(np.diff(p) > min_length).flatten()]

    return (plateaus, dF, d2F)

Creating a fool proof graphing calculator using python - Python 2.7

I am trying to create a fool proof graphing calculator using python and pygame.
I created a graphing calculator that works for most functions. It takes a user string infix expression and converts it to postfix for easier calculations. I then loop through and pass in x values into the postfix expression to get a Y value for graphing using pygame.
The first problem I ran into was when taking calculations of impossible things. (like dividing by zero, square root of -1, 0 ^ non-positive number). If something like this would happen I would output None and that pixel wouldn't be added to the list of points to be graphed.
* I have shown all the different attempts I have made at this to help you understand where I am coming from. If you would like to see only my most current code and method, jump down to where it says "Current".
Method 1
My first method was to collect all my pixel values and then paint them using the pygame aalines function. This worked, except when there were missing points in between actual points, because it would just draw the line straight across the gap. (1/x would not work, but something like 0^x would.)
This is what 1/x looks like using the aalines method
Method 1.1
My next idea was to split the line into two lines every time a None was returned. This worked for 1/x, but I quickly realized that it would only work if one of the passed-in X values landed exactly on a Y value of None. 1/x might work, but 1/(x+0.0001) wouldn't.
Method 2
My next method was to convert each pixel x value into the corresponding x point value in the graphing window (for example, (0,0) on the graphing window would actually be pixel (249,249) on a 500x500 program window). I would then calculate every y value with the x values I just created. This works for any line whose slope is not > 1 or < -1.
This is what 1/x would look like using this method.
Current
My most current method is supposed to be an advanced working version of method 2.
It's kind of hard to explain. Basically, I take the x values at the boundaries between columns of the display window, so for every pixel column I get one x value just to the left of it and one just to the right. I then plug those two values into the expression to get two Y values. I then loop through each y value in that column and check whether it lies between the two Y values calculated earlier.
size is a list of size two that is the dimensions of the program window.
xWin is a list of size two that holds the x Min and x Max of the graphing window.
yWin is a list of size two that holds the y Min and y Max of the graphing window.
pixelToPoint is a function that takes scalar pixel value (just x or just y) and converts it to its corresponding value on the graphing window
pixels = []
for x in range(size[0]):
    leftX = pixelToPoint(x, size[0]+1, xWin, False)
    rightX = pixelToPoint(x+1, size[0]+1, xWin, False)
    leftY = calcPostfix(postfix, leftX)
    rightY = calcPostfix(postfix, rightX)
    for y in range(size[1]):
        if leftY is not None and rightY is not None:
            yPoint = pixelToPoint(y, size[1], yWin, True)
            if (rightY <= yPoint <= leftY) or (rightY >= yPoint >= leftY):
                pixels.append((x, y))
for p in pixels:
    screen.fill(BLACK, (p, (1, 1)))
This fixed the problem in method 2 of having the pixels not connected into a continuous line. However, it wouldn't fix the problem of method 1 and when graphing 1/x, it looked exactly the same as the aalines method.
-------------------------------------------------------------------------------------------------------------------------------
I am stuck and can't think of a solution. The only way I can think of fixing this is by using a whole bunch of x values. But this way seems really inefficient. Also I am trying to make my program as resizable and customizable as possible so everything must be variably driven and I am not sure what type of calculations are needed to find out how many x values are needed to be used depending on the program window size and the graph's window size.
I'm not sure if I am on the right track or if there is a completely different method of doing this, but I want my graphing calculator to be able to graph any function (just like my actual graphing calculator).
Edit 1
I just tried using as many x values as there are pixels (500x500 display window calculates 250,000 y values).
It has worked for every function I've tried with it, but it is really slow. It takes about 4 seconds to calculate (it fluctuates depending on the equation). I've looked around online and have found graphing calculators that are almost instantaneous in their graphing, but I can't figure out how they do it.
This online graphing calculator is extremely fast and effective. There must be some algorithm other than using a bunch of x values that can achieve what I want, because that site is doing it.

The problem you have is that, to know whether you can reasonably draw a line between two points, you have to know whether the function is continuous on the interval between them.
That is a complex problem in general. What you could do is use the following heuristic: if the slope of the line has changed too much from the previous one, guess that there is a discontinuity in the interval and don't draw a line.
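A minimal sketch of that heuristic follows. For simplicity it thresholds the vertical jump between neighbouring samples instead of the change in slope, which is a cruder test; the names `split_into_segments`, `safe_inverse` and `max_jump` are hypothetical, and the threshold would have to be derived from the graphing window in a real program.

```python
def split_into_segments(points, max_jump):
    """Split a list of (x, y) samples into separate polylines.

    A new segment starts whenever y is None (undefined value) or the
    vertical jump from the previous sample exceeds `max_jump`, which is
    taken as a hint that the function is discontinuous in between.
    Segments with a single point are dropped, since no line can be
    drawn through them.
    """
    segments, current = [], []
    for x, y in points:
        if y is None:
            if len(current) > 1:
                segments.append(current)
            current = []
            continue
        if current and abs(y - current[-1][1]) > max_jump:
            if len(current) > 1:
                segments.append(current)
            current = []
        current.append((x, y))
    if len(current) > 1:
        segments.append(current)
    return segments

def safe_inverse(x):
    """1/x, returning None where it is undefined."""
    return None if x == 0 else 1.0 / x

# Sample points that never land exactly on x = 0, like 1/(x+0.0001).
xs = [i / 4 - 1.125 for i in range(10)]
points = [(x, safe_inverse(x)) for x in xs]
segments = split_into_segments(points, max_jump=10.0)
```

Each segment, once converted to pixel coordinates, can then be drawn on its own with `pygame.draw.aalines(screen, color, False, segment)`, so no line is drawn across the pole.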

Another solution would be based on your method 2.
After drawing the points that correspond to every value on the x axis, try, for every pair of adjacent x values (x1, x2), to draw the y values between y1 = f(x1) and y2 = f(x2) that can be reached by some x within (x1, x2).
This can be done by searching by dichotomy (bisection) or with Newton's method for an x that fits.
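The dichotomy search could look roughly like this. It is a sketch under the assumption that f(x1) and f(x2) lie on opposite sides of the target y; the function name `reachable` and the `tol` and `max_iter` parameters are hypothetical.

```python
def reachable(f, x1, x2, y_target, tol=1e-6, max_iter=60):
    """Bisection check: is there an x in [x1, x2] with f(x) close to y_target?

    Assumes f(x1) - y_target and f(x2) - y_target have opposite signs.
    Returns False when the interval shrinks onto a jump or pole instead
    of a genuine crossing.
    """
    g1 = f(x1) - y_target
    for _ in range(max_iter):
        mid = 0.5 * (x1 + x2)
        try:
            g_mid = f(mid) - y_target
        except ZeroDivisionError:
            return False          # f is undefined inside the interval
        if abs(g_mid) <= tol:
            return True           # found an x that maps onto y_target
        if (g_mid > 0) == (g1 > 0):
            x1, g1 = mid, g_mid   # same sign as the left end: go right
        else:
            x2 = mid              # sign change is in the left half
    return False

def inverse(x):
    return 1.0 / x

def cube(x):
    return x ** 3

across_pole = reachable(inverse, -0.125, 0.25, 0.0)  # 1/x never equals 0
smooth_case = reachable(cube, -1.0, 2.0, 0.5)        # x**3 does reach 0.5
```

A pixel at height y between y1 and y2 would then be drawn only if `reachable` returns True for it. This costs extra function evaluations per column, so it is probably best reserved for columns where y1 and y2 are far apart.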

Problems trying to calculate FWHM with scipy.interpolate

I am having problems trying to find the FWHM of some data. I initially tried to fit a curve using interpolate.interp1d. With this I was able to create a function that when I entered an x value it would return an interpolated y value. The issue is that I need the inverse of this functionality. In other words, I want to switch my independent and dependent variables. When I try to switch them, I get errors because the independent data has to be sorted. If I sort the data, I will lose the indexes, and therefore lose the shape of my graph.
I tried:
x = np.linspace(0, line.shape[0], line.shape[0])
self.x_curve = interpolate.interp1d(x, y, 'linear')
where y is my data.
To get the inverse, I tried:
self.x_curve = interpolate.interp1d(sorted(y), x, 'linear')
but the values are off.
I then moved on and tried to use UnivariateSpline and get the roots to find the FWHM (from this question here: Finding the full width half maximum of a peak), but the roots() method keeps giving me an empty list [].
This is what I used:
x_curve = interpolate.UnivariateSpline(x, y)
r = x_curve.roots()
print(r)
Here is an image of the data (with the UnivariateSpline):
Any ideas? Thanks.

Using UnivariateSpline.roots() to get FWHM will only work if you shift the data so that its value is 0 at FWHM.
Seeing that the background of the data is noisy, I'd first estimate the baseline. For example:
y_baseline = y[(x<200) | (x>350)].mean()
(adjust the limits for x as you see fit). Then shift the data so that the middle of the baseline and the peak is at 0. Seeing that your data has a minimum and not a maximum as in the example, I'm using y.min():
y_shifted = y - (y.min()+y_baseline)/2.0
Now fit a spline to this shifted data and roots() should be able to find the roots, the difference of which is the FWHM.
x_curve = interpolate.UnivariateSpline(x, y_shifted, s=0)
x_curve.roots()
Increase the s parameter if you want to estimate the FWHM from smoothed data.
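If SciPy's spline is not essential, the half-level crossings can also be found by linear interpolation between samples. This is an alternative technique, not the spline-roots method above, and it assumes a single dip that does not touch the edge of the data; the function name `fwhm_of_dip` and the synthetic Gaussian dip are hypothetical.

```python
import numpy as np

def fwhm_of_dip(x, y, baseline):
    """Estimate the full width at half depth of a single dip in y(x).

    Linearly interpolates where y crosses the level halfway between
    `baseline` and y.min(), on each side of the minimum.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    half = (baseline + y.min()) / 2.0
    i_min = int(np.argmin(y))
    below = y < half
    # Walk outward from the minimum to the last sample still below half level.
    left = i_min
    while left > 0 and below[left - 1]:
        left -= 1
    right = i_min
    while right < len(y) - 1 and below[right + 1]:
        right += 1
    if left == 0 or right == len(y) - 1:
        raise ValueError("dip touches the edge of the data")
    # Interpolate the crossing between the bracketing samples on each side
    # (np.interp needs increasing abscissae, hence the ordering by y).
    x_left = np.interp(half, [y[left], y[left - 1]], [x[left], x[left - 1]])
    x_right = np.interp(half, [y[right], y[right + 1]], [x[right], x[right + 1]])
    return x_right - x_left

# Synthetic noise-free dip with sigma = 20, for illustration only.
x = np.linspace(0, 500, 1001)
y = 10.0 - 4.0 * np.exp(-((x - 275.0) ** 2) / (2 * 20.0 ** 2))
width = fwhm_of_dip(x, y, baseline=10.0)
```

On noisy data the walk outward from the minimum can stop early at a noise spike, so smoothing first (or using the spline with a larger s) is still advisable.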
