Python overflow error in multiplication

My code for an equation I am working on goes like this:
import matplotlib.pyplot as plt

P = []
Q = []
for x in range(0, 20):
    temp = x * (-1*10**(-9) * (2.73**(x/0.0258*0.8) - 1) + 3.1)
    P.append(temp)
    Q.append(x)
    print(temp)
plt.plot(Q, P)
plt.show()
Printing temp gives me this
4.759377049180328889121938118
-33447.32349862001706705983714
-2238083697441414267.104517188
-1.123028419942448387512537968E+32
-5.009018636753031534804021565E+45
-2.094526332030486492065138064E+59
-8.407952213322881981287736804E+72
-3.281407666305436036872349205E+86
-1.254513385166959745710275399E+100
-4.721184644539803475363811828E+113
-1.754816222227633792004755288E+127
-6.457248346728221564046430946E+140
-2.356455347384037854507854340E+154
-8.539736787129928434375037129E+167
-3.076467506425168063232368199E+181
-1.102652635599075169095479067E+195
-3.934509583907661118429424988E+208
-1.398436369682635574296418585E+222
-4.953240988408539700713401539E+235
-1.749015740500628326472633516E+249
The results shown are highly inaccurate. I know this because the graph obtained is not what I am supposed to get. A quick plot of the same equation on google.com gave me this.
This pic shows the differences between the graphs.
The actual plot is the google.com one.
I am fairly certain that the errors are due to the floating point calculations. Can someone help me correct the formulated equations?

Beginning at around x = 0.7, your values plunge steeply toward negative infinity. Google is clever enough to figure that out and limits the y-axis to a reasonable scale. In matplotlib you have to set this scale manually.
Also note that you are plotting integers from 0 to 19. When plotting continuous functions, linearly spaced points in an interval often make more sense.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 0.8, 100)
y = x * (-1e-9 * (2.73**(x/0.0258*0.8) - 1) + 3.1)
plt.plot(x, y)
plt.ylim(-0.5, 2)
plt.show()
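Incidentally, the constant 2.73 looks like it is meant to be Euler's number e ≈ 2.71828 (the expression resembles the Shockley diode equation, with 0.0258 V as the thermal voltage). If so, np.exp is both more accurate and clearer; a minimal sketch under that assumption:

import numpy as np

x = np.linspace(0, 0.8, 100)
y = x * (-1e-9 * (np.exp(x/0.0258*0.8) - 1) + 3.1)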


Curve fitting with cubic spline

I am trying to interpolate a cumulated distribution of e.g. (i) number of people against (ii) number of owned cars, showing that e.g. the top 20% of people own much more than 20% of all cars; of course, 100% of people own 100% of cars. I also know that there are e.g. 100mn people and 200mn cars.
Now coming to my code:
# import libraries
import pandas as pd
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d
%matplotlib inline

curve = pd.read_excel('inputs.xlsx', sheet_name='inputdata')
Input data: Curveplot (cumulated people (x) on the left // cumulated cars (y) on the right)
#Input data in list form (I am not sure how to interpolate from a list for the moment)
cumulatedpeople = [0, 0.453086, 0.772334, 0.950475, 0.978981, 0.999876, 0.999990, 1]
cumulatedcars= [0, 0.016356, 0.126713, 0.410482, 0.554976, 0.950073, 0.984913, 1]
x, y = cumulatedpeople, cumulatedcars
interpolation = interp1d(x, y, kind='cubic')
number_of_people_mn= 100000000
oneperson = 1 / number_of_people_mn
dataset = pd.DataFrame(range(number_of_people_mn + 1))
dataset.columns = ["nr_of_one_person"]
dataset.drop(dataset.index[:1], inplace=True)
#calculating the position of every single person on the cumulated x-axis (between 0 and 1)
dataset["cumulatedpeople"] = dataset["nr_of_one_person"] / number_of_people_mn
#finding the "cumulatedcars" to the "cumulatedpeople" via interpolation (between 0 and 1)
dataset["cumulatedcars"] = interpolation(dataset["cumulatedpeople"])
plt.plot(dataset["cumulatedpeople"], dataset["cumulatedcars"])
plt.legend(['Cubic interpolation'], loc = 'best')
plt.xlabel('Cumulated people')
plt.ylabel('Cumulated cars')
plt.title("People-to-car cumulated curve")
plt.show()
However, when looking at the actual plot, I get the following result, which is wrong: Cubic interpolation
In fact, the curve should look almost like the one from a linear interpolation with the exact same input data; however, linear interpolation is not accurate enough for my purpose: Linear interpolation
Is there any relevant step I am missing, or what would be the best way to get an accurate interpolation from the inputs that looks almost like the linear one?
Short answer: your code is doing the right thing, but the data is unsuitable for cubic interpolation.
Let me explain. Here is your code, simplified for clarity:
import numpy as np
from scipy.interpolate import interp1d
from matplotlib import pyplot as plt

cumulatedpeople = [0, 0.453086, 0.772334, 0.950475, 0.978981, 0.999876, 0.999990, 1]
cumulatedcars = [0, 0.016356, 0.126713, 0.410482, 0.554976, 0.950073, 0.984913, 1]
interpolation = interp1d(cumulatedpeople, cumulatedcars, kind='cubic')
number_of_people_mn = 100  # reduced from 100000000 for speed
cumppl = np.arange(number_of_people_mn + 1) / number_of_people_mn
cumcars = interpolation(cumppl)
plt.plot(cumppl, cumcars)
plt.plot(cumulatedpeople, cumulatedcars, 'o')
plt.show()
Note the last couple of lines: I am plotting, on the same graph, both the interpolated results and the input data. Here is the result:
Orange dots are the original data, the blue line is the cubic interpolation. The interpolator passes through all the points, so technically it is doing the right thing.
Clearly it is not doing what you would want.
The reason for such strange behavior lies mostly at the right end, where you have a few x-points that are very close together: the interpolator produces massive wiggles trying to fit very closely spaced points.
If I remove the two right-most points from the interpolator:
interpolation = interp1d(cumulatedpeople[:-2], cumulatedcars[:-2], kind = 'cubic')
it looks a bit more reasonable:
But one could still argue that linear interpolation is better. The wiggles are now at the left end, because there the gaps between the initial x-points are too large.
The moral here is that cubic interpolation should really be used only if the gaps between x-points are roughly the same.
Your best bet here, I think, is to use something like curve_fit.
A related discussion can be found here.
Specifically, monotone interpolation as explained here yields good results on your data. Copying the relevant bits: you would replace the interpolator with
from scipy.interpolate import pchip
interpolation = pchip(cumulatedpeople, cumulatedcars)
and get a decent-looking fit:
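For reference, a minimal self-contained version of this approach (a sketch assuming the same input lists as above; PchipInterpolator is the class behind the older pchip alias, which newer SciPy releases may no longer export):

import numpy as np
from matplotlib import pyplot as plt
from scipy.interpolate import PchipInterpolator

cumulatedpeople = [0, 0.453086, 0.772334, 0.950475, 0.978981, 0.999876, 0.999990, 1]
cumulatedcars = [0, 0.016356, 0.126713, 0.410482, 0.554976, 0.950073, 0.984913, 1]

# monotone piecewise-cubic (PCHIP) interpolation: no overshoot between points
interpolation = PchipInterpolator(cumulatedpeople, cumulatedcars)
cumppl = np.linspace(0, 1, 101)
plt.plot(cumppl, interpolation(cumppl))
plt.plot(cumulatedpeople, cumulatedcars, 'o')
plt.show()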

Fill missing array values using extrapolated plot Python

I have a 2D numpy array containing X and Y data. The X axis contains time information with nanosecond resolution. My problem occurs because I need to compare a simulated signal and a real signal. The problem with the simulated signal is that the simulator, for optimization purposes, uses varying step sizes, as shown in fig. 1.
On the other hand, my real data was acquired by an oscilloscope, and its points are spaced exactly 1 ns apart. Because of this I need both signals on the same X-axis scale to make a correct comparison. How can I generate the extra points needed to give my simulated data a constant step between points?
EDIT 1
I need these new points to fill my array so that the simulated data has a constant step, as shown in fig 2.
The green points show an example of data extracted from the extrapolated data.
A common way to do this is to simply duplicate some points (adding a point with the same value as its neighbour does not change most statistical measures much), as sketched below.
The downside is that you have to rebuild the dataset every time you change the scale. That takes time at every scale change, but it is very easy. If you don't have to change the scale too often, it may be worth a try.
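A minimal sketch of that duplication idea, using nearest-neighbour resampling onto a uniform grid (the data and variable names here are made up for illustration):

import numpy as np
from scipy.interpolate import interp1d

t = np.array([0.0, 0.4, 1.1, 2.0, 3.5])   # non-uniform time axis (ns)
v = np.array([0.0, 0.2, 0.9, 0.3, -0.1])  # signal values

# kind='nearest' simply copies the closest recorded point to each grid position
resample = interp1d(t, v, kind='nearest')
uniform_t = np.arange(t[0], t[-1], 1.0)   # constant 1 ns step
print(resample(uniform_t))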
This problem was solved using the scipy.interpolate module. For example:
interpolate.py
import matplotlib.pyplot as plt
from scipy import interpolate as inter
import numpy as np

Fs = 0.1     # sampling frequency
f = 0.01     # signal frequency
sample = 10
x = np.arange(sample)
y = np.sin(2 * np.pi * f * x / Fs)

inte = inter.interp1d(x, y)       # linear interpolator over the samples
new_x = np.arange(0, 9, 0.1)      # finer, constant-step grid
new_y = inte(new_x)

fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.scatter(new_x, new_y, s=5, marker='.')
ax1.scatter(x, y, s=50, marker='*')
plt.show()
This code gives the following result.
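To get the constant 1 ns step described in the question, the same pattern applies; a sketch with hypothetical data (assuming the simulated time axis is in nanoseconds):

import numpy as np
from scipy import interpolate as inter

# hypothetical non-uniform simulated data (time axis in ns)
sim_t = np.array([0.0, 0.7, 1.9, 3.2, 5.0])
sim_v = np.array([0.0, 0.5, 0.8, 0.3, 0.1])

inte = inter.interp1d(sim_t, sim_v)              # linear interpolation
uniform_t = np.arange(sim_t[0], sim_t[-1], 1.0)  # constant 1 ns step
uniform_v = inte(uniform_t)
print(uniform_v)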

Plotting functions at a specific y-interval

I need to plot a few exponential curves on the same plot - with the constraint that the plot ends at y=1.
For reference, here is the code:
from numpy import arange
from matplotlib import pyplot as plt
T = arange(60, 89)
curve1 = 2**(T - 74)
curve2 = 2**(T - 60)
plt.plot(T, curve1)
plt.plot(T, curve2)
plt.show()
Here's the result:
The second curve is barely visible, since its values are comparatively low.
The problem I'm having is that all of these curves blow up to 700000+ fairly rapidly, but I'm only interested in the range (0,1). How do I plot just these parts, but with nice smooth curves (so that one curve doesn't just stop halfway along)?
As you've found, this is easy to do if you adjust the range (T) for each function you add. However, if you change the functions, you'll need to recompute it.
The problem you're dealing with, generically, is calculating the x-range of some functions given their y-range, or, as a mathematician might put it, determining the domain of a function that corresponds to a given range of its image. While this is impossible for an arbitrary function, it is possible if your function is injective, as is the case here.
Let's say we have a function y = f(x) and the y-range is [y1, y2]. The x-range will then be [f^(-1)(y1), f^(-1)(y2)], where f^(-1) is the inverse function of f.
Since we have multiple functions to plot, the combined x-range is simply the widest one: its lower bound is the minimum of the individual lower bounds, and its upper bound is the maximum of the individual upper bounds.
Here's some code that exemplifies all this, taking the number of steps as a parameter and calculating the proper T over the x-range:
from numpy import arange
from matplotlib import pyplot as plt
from sympy import sympify, solve

f1 = '2**(T - 74)'  # note these are strings
f2 = '2**(T - 60)'
y_bounds = (0.001, 1)  # exponential functions never reach 0, so we use 0.001
mm = (min, max)
x_bounds = [m(solve(sympify(f + "-" + str(y)))[0] for f in (f1, f2)) for y, m in zip(y_bounds, mm)]
print(x_bounds)

N_STEPS = 100  # distributed over x_bounds
T = arange(x_bounds[0], x_bounds[1] + 0.001, (x_bounds[1] - x_bounds[0]) / N_STEPS)
curve1 = eval(f1)  # evaluates the string as a Python expression over the range
curve2 = eval(f2)
plt.plot(T, curve1)
plt.plot(T, curve2)
plt.ylim([0, 1])
plt.show()
The code outputs both the x-range (50.03, 74) and this plot:
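As an aside, since the inverse of 2**(T - a) is simply T = a + log2(y), the same bounds can be computed in closed form without sympy; a minimal sketch:

import numpy as np

y_bounds = (0.001, 1)
shifts = (74, 60)  # the 'a' in 2**(T - a) for each curve

x_lo = min(a + np.log2(y_bounds[0]) for a in shifts)
x_hi = max(a + np.log2(y_bounds[1]) for a in shifts)
print(x_lo, x_hi)  # roughly 50.03 and 74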
This one was really easy: sorry to have wasted time on it.
Just set the y limits to be [0,1] via
plt.ylim([0,1])
and you're done.
In addition to the other answers, it may be worth plotting on a log-scale, since the growth of your functions is essentially exponential. For example:
from numpy import arange
from matplotlib import pyplot as plt

T = arange(60, 89)
curve1 = 2**(T - 74)
curve2 = 2**(T - 60)
plt.semilogy(T, curve1)
plt.semilogy(T, curve2)
plt.show()

Plot Mandelbrot with matplotlib / pyplot / numpy / python

I am new to Python and learning by following the "Python Scientific Lecture Notes, Release 2013.1" tutorial. Please help me solve the Mandelbrot problem in the screenshot below (p. 71). Please provide step-wise commands with explanations if possible, because programming concepts are new to me.
http://dl.dropbox.com/u/50511173/mandelbrot.png
I tried to solve this as follows:
import numpy as np
import matplotlib.pyplot as plt

x, y = np.ogrid[-2:1:10j, -1.5:1.5:10j]
c = x + 1j*y
z = 0
for g in range(50):
    z = z**2 + c
plt.imshow(z.T, extent=[-2, 1, -1.5, 1.5])
I encountered the following error: "TypeError: Image data can not convert to float".
What does this error mean exactly, and how do I correct it? I find it difficult to understand the imshow() function. What do the individual terms inside imshow() mean?
Thank you.
The Mandelbrot set is not the values of z you are trying to plot (which are giving you problems because they are complex numbers). The Mandelbrot set is made up of the points p of the complex plane for which the recurrence relation z_n = z_{n-1}**2 + p remains bounded. This is checked in practice by comparing the result after a few iterations to some threshold. In your case, if you add the following lines after your for loop:
threshold = 2
mask = np.abs(z) < threshold
and then plot mask, you should see the set on screen.
To understand the general workings of imshow's arguments, you will be better off reading the docs than asking here.
Thanks to @Jan and @Jaime. I got it working as follows; it takes too much time to calculate, though:
import numpy as np
import matplotlib.pyplot as plt

x, y = np.ogrid[-2:1:5000j, -1.5:1.5:5000j]
print('Grid set')
c = x + 1j*y
z = 0
for g in range(500):
    print('Iteration number: ', g)
    z = z**2 + c
threshold = 2
mask = np.abs(z) < threshold
print('Plotting using imshow()')
plt.imshow(mask.T, extent=[-2, 1, -1.5, 1.5])
plt.gray()
print('Preparing to render')
plt.show()
You get this error because plt.imshow does not accept arrays of complex numbers. You can access the real or imaginary part of an array Z as Z.real or Z.imag. Thus if you want to plot the real part,
plt.imshow(z.real.T, extent=[-2,1,-1.5,1.5])
would do the job.
The arguments of imshow are defined as follows: if z is an N-by-M matrix, it is interpreted as point values on a regular grid. With extent you specify how this grid extends in space...
You're trying to plot a complex value with imshow, which is why you're getting that error. You can use a threshold as others have suggested, but you might want to consider using np.angle or np.abs as well. You can also simplify your calculation of z using Python's built-in reduce function (which lives in functools on Python 3).
Had some fun with this one, but this shows the general idea:
from functools import reduce  # built in on Python 2, in functools on Python 3

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

x, y = np.ogrid[-2:1:500j, -1.5:1.5:500j]

# Increase this to improve the shape of the fractal
iterations = 9
c = x + 1j*y
# Apply z -> z**2 + c 'iterations' times, starting from z = c
z = reduce(lambda z, _: z**2 + c, range(iterations), c)

plt.figure(figsize=(10, 10))
plt.imshow(np.angle(z))
plt.figure(figsize=(10, 10))
plt.imshow(np.log(np.abs(z)))

Spline representation with scipy.interpolate: Poor interpolation for low-amplitude, rapidly oscillating functions

I need to (numerically) calculate the first and second derivative of a function, for which I've attempted to use both splrep and UnivariateSpline to create splines to interpolate the function before taking the derivatives.
However, there seems to be an inherent problem in the spline representation itself for functions whose magnitude is of order 10^-1 or lower and which oscillate (rapidly).
As an example, consider the following code, which creates a spline representation of the sine function over the interval (0, 6*pi) (so the function oscillates only three times):
from numpy import linspace
from math import sin, pi
from scipy import interpolate

k = linspace(0, 6.*pi, num=10000)  # interval (0, 6*pi) in 10,000 steps
y = []
A = 1.e0  # amplitude of the sine function
for i in range(len(k)):
    y.append(A*sin(k[i]))
tck = interpolate.UnivariateSpline(x, y, w=None, bbox=[None, None], k=5, s=2)
M = tck(k)
Below are the results for M for A = 1.e0 and A = 1.e-2
http://i.imgur.com/uEIxq.png Amplitude = 1
http://i.imgur.com/zFfK0.png Amplitude = 1/100
Clearly the interpolated function created by the splines is totally incorrect! The second graph does not even oscillate at the correct frequency.
Does anyone have any insight into this problem? Or know of another way to create splines within numpy/scipy?
Cheers,
Rory
I'm guessing that your problem is due to aliasing.
What is x in your example?
If the x values that you're interpolating at are less closely spaced than your original points, you'll inherently lose frequency information. This is completely independent of the type of interpolation; it's inherent in downsampling.
Never mind the above bit about aliasing; it doesn't apply in this case (though I still have no idea what x is in your example...).
I just realized that you're evaluating your points at the original input points while using a non-zero smoothing factor (s).
By definition, smoothing won't fit the data exactly. Try putting in s=0 instead.
As a quick example:
import matplotlib.pyplot as plt
import numpy as np
from scipy import interpolate

x = np.linspace(0, 6.*np.pi, num=100)  # interval (0, 6*pi) in 100 steps
A = 1.e-4  # amplitude of the sine function
y = A*np.sin(x)

fig, axes = plt.subplots(nrows=2)
for ax, s, title in zip(axes, [2, 0], ['With', 'Without']):
    yinterp = interpolate.UnivariateSpline(x, y, s=s)(x)
    ax.plot(x, yinterp, label='Interpolated')
    ax.plot(x, y, 'bo', label='Original')
    ax.legend()
    ax.set_title(title + ' Smoothing')
plt.show()
The reason that you're only clearly seeing the effects of smoothing with a low amplitude is due to the way the smoothing factor is defined. See the documentation for scipy.interpolate.UnivariateSpline for more details.
Even with a higher amplitude, the interpolated data won't match the original data if you use smoothing.
For example, if we just change the amplitude (A) to 1.0 in the code example above, we'll still see the effects of smoothing...
The problem is in choosing suitable values for the s parameter. Its values depend on the scaling of the data.
Reading the documentation carefully, one can deduce that the parameter should be chosen around s = len(y) * np.var(y), i.e. the number of data points times the variance. Taking, for example, s = 0.05 * len(y) * np.var(y) gives a smoothing spline that does not depend on the scaling of the data or the number of data points.
EDIT: sensible values for s of course also depend on the noise level in the data. The docs seem to recommend choosing s in the range (m - sqrt(2*m)) * std**2 <= s <= (m + sqrt(2*m)) * std**2, where std is the standard deviation associated with the "noise" you want to smooth over.
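A quick sketch of that rule of thumb, with a made-up signal and an assumed noise level:

import numpy as np
from scipy.interpolate import UnivariateSpline

x = np.linspace(0, 6*np.pi, 100)
noise_std = 0.05                       # assumed noise level
y = np.sin(x) + np.random.normal(0, noise_std, x.size)

m = len(y)
s = m * noise_std**2                   # midpoint of the recommended range above
spline = UnivariateSpline(x, y, s=s)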
