I've been trying to fit a best-fit line to a specific set of x and y data.
I have tried many times without success, and I can't find a way to fit the data when using yscale('log') and xscale('log'). I get this strange [result] and can't work out why.
[result]: https://www.dropbox.com/s/g6m4f8wh7r7jffg/Imagem%20sem%20t%C3%ADtulo.png?dl=0
My code:
#!/usr/bin/env python
# import the necessary modules
import numpy as np
import matplotlib.pyplot as plt
# Generate x and y values which the curve will be fitted to
# (In practical cases, these should be read in)
x = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048]
y = [497775, 150760, 50929, 19697, 8520, 3948, 1812, 710, 214, 57, 18, 4]
p = np.polyfit(x,y,1)
plt.plot(x, np.polyval(p,x), 'r-')
plt.plot(x, y)
plt.yscale('log')
plt.xscale('log')
plt.show()
I have a hunch that it is because I am using polyval, but I can't find out how to calculate the fit for logarithmic axes. Can you help? I am new to this.
Something that appears linear on a log-log plot is not a linear function; it's a power-law function.
You're getting the best fit line:
y = a * x + b
but what you want is the best-fit power law of the form:
y = a * x**k
There are a number of ways to do this. Using polyfit is a good approach, but you'll need to fit the line in "log space". In other words, fit a line to the logarithm of y as a function of the logarithm of x, then exponentiate the result to get back to the original scale.
For example, based on your code:
import numpy as np
import matplotlib.pyplot as plt
x = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048]
y = [497775, 150760, 50929, 19697, 8520, 3948, 1812, 710, 214, 57, 18, 4]
logx, logy = np.log(x), np.log(y)
p = np.polyfit(logx, logy, 1)
y_fit = np.exp(np.polyval(p, logx))
plt.plot(x, y_fit, 'r-')
plt.plot(x, y)
plt.yscale('log')
plt.xscale('log')
plt.show()
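To read the power-law parameters back out of the log-space fit, note that polyfit returns the slope first and the intercept second, so for y = a * x**k the exponent is k = p[0] and the prefactor is a = exp(p[1]). A minimal sketch, where the helper name fit_power_law is made up for illustration:

```python
import numpy as np

def fit_power_law(x, y):
    """Fit y = a * x**k with a straight-line fit in log-log space.

    Hypothetical helper for illustration; requires x > 0 and y > 0.
    """
    logx, logy = np.log(x), np.log(y)
    k, log_a = np.polyfit(logx, logy, 1)  # slope first, then intercept
    return np.exp(log_a), k

x = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048]
y = [497775, 150760, 50929, 19697, 8520, 3948, 1812, 710, 214, 57, 18, 4]

a, k = fit_power_law(x, y)
print(f"y ~ {a:.4g} * x**{k:.3f}")
```

Keep in mind that this minimizes the error in log(y) rather than in y, so the small y values carry as much weight as the large ones.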
I am trying to plot the trained curve in matplotlib. However I am getting this thing:
The scatter works fine:
How can I create the curve using plot?
Your X_train data may be out of order. Try sorting it. For instance, if X_train is just a list of numbers, you could say:
X_train.sort()
You can plot a smooth line curve by first determining the spline curve’s coefficients using the scipy.interpolate.make_interp_spline():
import numpy as np
from scipy.interpolate import make_interp_spline
import matplotlib.pyplot as plt
# Dataset
x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
y = np.array([20, 30, 5, 12, 39, 48, 50, 3])
X_Y_Spline = make_interp_spline(x, y)
# Returns evenly spaced numbers
# over a specified interval.
X_ = np.linspace(x.min(), x.max(), 500)
Y_ = X_Y_Spline(X_)
# Plotting the Graph
plt.plot(X_, Y_)
plt.title("Plot Smooth Curve Using the scipy.interpolate.make_interp_spline() Class")
plt.xlabel("X")
plt.ylabel("Y")
plt.show()
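How smooth the curve looks depends on the spline degree, which make_interp_spline takes as the k argument (default 3). The sketch below, using the same illustrative dataset, compares the default cubic spline with k=1 and checks that both pass through the sample points:

```python
import numpy as np
from scipy.interpolate import make_interp_spline

x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
y = np.array([20, 30, 5, 12, 39, 48, 50, 3])

cubic = make_interp_spline(x, y)        # default degree k=3
linear = make_interp_spline(x, y, k=1)  # straight segments between points

# Both splines interpolate, so they reproduce the data at the sample points.
print(np.allclose(cubic(x), y), np.allclose(linear(x), y))

# Between samples a cubic spline can leave the range of the data,
# while the piecewise-linear one stays inside it.
x_dense = np.linspace(x.min(), x.max(), 500)
print(cubic(x_dense).min(), cubic(x_dense).max())
print(linear(x_dense).min(), linear(x_dense).max())
```

If the overshoot of the cubic curve is misleading for your data, a lower degree is one option to try.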
It seems that you have unsorted values in X_train. For instance, if
In [1]: X_train
Out [1]: array([30, 20, 50, 40])
then
In [2]: model.predict_proba(X_train)
Out [2]: array([0.2, 0.1, 0.8, 0.5])
Here, plt.plot will try to plot lines from point [30, 0.2] to point [20, 0.1], then from [20, 0.1] to [50, 0.8], then from [50, 0.8] to [40, 0.5].
Thus, the solution to your problem is to sort X_train before plotting =)
import numpy as np
import matplotlib.pyplot as plt
X_train_sorted = np.sort(X_train)
y_train_sorted = model.predict_proba(X_train_sorted)
plt.scatter(X_train_sorted, y_train_sorted)
plt.plot(X_train_sorted, y_train_sorted)
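If the predictions have already been computed, an alternative is to reorder both arrays with the same index permutation from np.argsort, so each x stays paired with its y without predicting again. A small sketch with made-up values that mirror the example above (y_pred stands in for the model output):

```python
import numpy as np

# Illustrative values, mirroring the example above.
X_train = np.array([30, 20, 50, 40])
y_pred = np.array([0.2, 0.1, 0.8, 0.5])  # stands in for the model output

order = np.argsort(X_train)  # indices that would sort X_train
X_sorted = X_train[order]
y_sorted = y_pred[order]     # same permutation keeps the pairs intact

print(X_sorted)
print(y_sorted)
```

Plotting X_sorted against y_sorted then draws the line from left to right instead of jumping back and forth.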
I want to fit a parametric curve to a set of points. The beginning and end of the curve should coincide with the first and last sample points respectively.
I have tried this code below, but it is giving me a closed curve. Is there a way to modify this code slightly to ensure the curve is not closed?
import numpy as np
from scipy import interpolate
from matplotlib import pyplot as plt
x = np.array([23, 24, 24, 25, 25])
y = np.array([13, 12, 13, 12, 13])
# append the starting x,y coordinates
x = np.r_[x, x[0]]
y = np.r_[y, y[0]]
# fit splines to x=f(u) and y=g(u), treating both as periodic. also note that s=0
# is needed in order to force the spline fit to pass through all the input points.
tck, u = interpolate.splprep([x, y], s=0, per=True)
# evaluate the spline fits for 1000 evenly spaced distance values
xi, yi = interpolate.splev(np.linspace(0, 1, 1000), tck)
# plot the result
fig, ax = plt.subplots(1, 1)
ax.plot(x, y, 'or')
ax.plot(xi, yi, '-b')
Many thanks for your help.
You are appending the first x and y values to the end of the x and y arrays:
# append the starting x,y coordinates
x = np.r_[x, x[0]]
y = np.r_[y, y[0]]
This means you want the spline to end in the same place it starts. You are also telling the interpolate.splprep function that you want a periodic curve by passing the per=True keyword argument:
tck, u = interpolate.splprep([x, y], per=True, s=0)
Together, these give you exactly the closed curve you are seeing.
Just remove the two lines where you append the first x and y values to the x and y arrays, remove the per=True keyword argument, and you get what you are asking for:
import numpy as np
from scipy import interpolate
from matplotlib import pyplot as plt
x = np.array([23, 24, 24, 25, 25])
y = np.array([13, 12, 13, 12, 13])
# do not append the starting x,y coordinates; the curve should stay open
# x = np.r_[x, x[0]]
# y = np.r_[y, y[0]]
# fit splines to x=f(u) and y=g(u) without per=True, so the curve is not periodic.
# s=0 is needed in order to force the spline fit to pass through all the input points.
tck, u = interpolate.splprep([x, y], s=0)
# evaluate the spline fits for 1000 evenly spaced distance values
xi, yi = interpolate.splev(np.linspace(0, 1, 1000), tck)
# plot the result
fig, ax = plt.subplots(1, 1)
ax.plot(x, y, 'or')
ax.plot(xi, yi, '-b')
plt.show()
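To confirm that the open curve really starts at the first sample and ends at the last one, you can evaluate the spline at the two ends of the parameter range. This is only a quick check, reusing the same points; the parameter u returned by splprep runs from 0 to 1 by default:

```python
import numpy as np
from scipy import interpolate

x = np.array([23, 24, 24, 25, 25])
y = np.array([13, 12, 13, 12, 13])

# Open (non-periodic) interpolating spline, as in the answer above.
tck, u = interpolate.splprep([x, y], s=0)

# Evaluate at the first and last parameter values.
x_start, y_start = interpolate.splev(0.0, tck)
x_end, y_end = interpolate.splev(1.0, tck)

print("start:", float(x_start), float(y_start))
print("end:  ", float(x_end), float(y_end))
```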
I have the following data-set:
x = 0, 5, 10, 15, 20, 25, 30
y = 0, 0.13157895, 0.31578947, 0.40789474, 0.46052632, 0.5, 0.53947368
Now, I want to plot this data, fit it with my own function f(x) = (A*K*x/(1+K*x)), and find the parameters A and K.
I wrote the following python script but it seems like it can't do the fitting I require:
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
x = np.array([0, 5, 10, 15, 20, 25, 30])
y = np.array([0, 0.13157895, 0.31578947, 0.40789474, 0.46052632, 0.5, 0.53947368])
def func(x, A, K):
    return (A*K*x / (1+K*x))
plt.plot(x, y, 'b-', label='data')
popt, pcov = curve_fit(func, x, y)
plt.plot(x, func(x, *popt), 'r-', label='fit')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()
Still, it's not giving a good fit. Can anyone help me with changes to this script, or with a new script that properly fits the data with my desired function?
The classic problem: you didn't give any initial guess for A or K. In this case the default value will be 1 for all parameters, which is not suitable for your dataset, and the fit will converge somewhere else. You can come up with guesses in different ways: by looking at the data, from the real meaning of the parameters, etc. You pass the guesses through the p0 parameter of scipy.optimize.curve_fit. It accepts a list of values in the order the parameters appear in the func you want to optimize. I used 0.1 for both, and I got this curve:
popt, pcov = curve_fit(func, x, y, p0=[0.1, 0.1])
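Besides the best-fit values, curve_fit also returns the covariance matrix pcov; the square roots of its diagonal are commonly used as one-standard-deviation uncertainties on the parameters. The sketch below reuses the data and model from the question and also compares the residual sum of squares before and after fitting (the names perr, rss_start and rss_fit are just illustrative):

```python
import numpy as np
from scipy.optimize import curve_fit

def func(x, A, K):
    return A * K * x / (1 + K * x)

x = np.array([0, 5, 10, 15, 20, 25, 30])
y = np.array([0, 0.13157895, 0.31578947, 0.40789474, 0.46052632, 0.5, 0.53947368])

p0 = [0.1, 0.1]  # initial guesses for A and K
popt, pcov = curve_fit(func, x, y, p0=p0)
perr = np.sqrt(np.diag(pcov))  # rough 1-sigma uncertainties

# Residual sum of squares at the starting guess and at the fitted parameters.
rss_start = np.sum((y - func(x, *p0)) ** 2)
rss_fit = np.sum((y - func(x, *popt)) ** 2)

for name, value, err in zip(("A", "K"), popt, perr):
    print(f"{name} = {value:.4g} +/- {err:.2g}")
print("RSS:", rss_start, "->", rss_fit)
```

Large uncertainties relative to the values would suggest that A and K are poorly constrained by only seven points.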
Try Minuit, a minimization package developed at CERN.
from iminuit import Minuit
import numpy as np
import matplotlib.pyplot as plt
def func(x, A, K):
    return (A*K*x / (1+K*x))
def least_squares(a, b):
    yvar = 0.01
    return sum((y - func(x, a, b)) ** 2 / yvar)
x = np.array([0, 5, 10, 15, 20, 25, 30])
y = np.array([0, 0.13157895, 0.31578947, 0.40789474, 0.46052632, 0.5, 0.53947368])
m = Minuit(least_squares, a=5, b=5)
m.migrad() # finds minimum of least_squares function
m.hesse() # computes errors
plt.plot(x, y, "o")
plt.plot(x, func(x, *m.values.values()))
# print parameter values and uncertainty estimates
for p in m.parameters:
    print("{} = {} +/- {}".format(p, m.values[p], m.errors[p]))
And the outcome:
a = 0.955697134431429 +/- 0.4957121286951612
b = 0.045175437602766676 +/- 0.04465599806912648
I would like to know how to update a graph or plot in matplotlib every few seconds. Code:
import matplotlib.pyplot as plt
import numpy as np
axes = plt.gca()
axes.set_xlim([0,5])
axes.set_ylim([0,100])
X = [0, 1, 2, 3, 4, 5]
Y = [15, 30, 45, 60, 75, 90]
plt.plot(X, Y)
plt.xlabel('Time spent studying (hours)')
plt.ylabel('Score (percentage)')
plt.show()
What you have written is correct, but to make your code dynamic you can put it in a function and pass the X and Y coordinates to that function. One example is shown below:
def GraphPlot(X, Y):
    "Your code"
GraphPlot([0, 1, 2, 3, 4, 5],[90, 30, 45, 60, 75, 90])
If you are certain that the X axis will not change, then you can fix the X values in the code, take only the Y values as a list from the user, and pass that list to the function as an argument.
Otherwise, if you do want user interaction, the best way is to update the X and Y lists in a loop and pass the X and Y values to the function as arguments.
I used time.sleep(0.5) so that the changes are visible, and reversed Y to provide new data for the update. Hopefully this is what you want:
%matplotlib notebook
import time
import matplotlib.pyplot as plt
X = [0, 1, 2, 3, 4, 5]
Y = [15, 30, 45, 60, 75, 90]
fig, ax = plt.subplots()
ax.set_xlim([0,5])
ax.set_ylim([0,100])
ax.set_xlabel('Time spent studying (hours)')
ax.set_ylabel('Score (percentage)')
l, = ax.plot(X, Y)
for ydata in [Y, Y[::-1]]*2:
    l.set_ydata(ydata)
    fig.canvas.draw()
    time.sleep(0.5)
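Outside a notebook, where the %matplotlib notebook magic is not available, a similar loop can be driven with plt.pause, which runs the GUI event loop for the given interval so the window gets a chance to repaint. The following is only a sketch that assumes an interactive backend; update_line and run are hypothetical helper names:

```python
import matplotlib.pyplot as plt

def update_line(line, new_y):
    """Replace the y-data of an existing line and request a redraw."""
    line.set_ydata(new_y)
    line.figure.canvas.draw_idle()

def run(line, frames, delay=0.5):
    """Show each set of y-values in turn, pausing `delay` seconds between them."""
    for ydata in frames:
        update_line(line, ydata)
        plt.pause(delay)  # lets the GUI process events and repaint

X = [0, 1, 2, 3, 4, 5]
Y = [15, 30, 45, 60, 75, 90]

fig, ax = plt.subplots()
ax.set_xlim(0, 5)
ax.set_ylim(0, 100)
ax.set_xlabel('Time spent studying (hours)')
ax.set_ylabel('Score (percentage)')
line, = ax.plot(X, Y)
```

Calling run(line, [Y, Y[::-1]] * 2) would then replay the same four frames as the notebook version.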
I am running simulations with 2 variables: P and Q.
Both P and Q take the values [0.2, 0.4, 0.6, 0.8].
Each combination of P and Q produces an output which I call nb_means.
nb_means is produced by running the simulator with P=0.2 and varying Q over [.2,.4,.6,.8], after which I move on to the next P (.4) and repeat the same process.
For example, in nb_means below: p=.2 & q=.2 -> 32, p=.2 & q=.4 -> 159, and so on.
I am attempting to plot the wire frame as so:
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm
from matplotlib.ticker import LinearLocator, FormatStrFormatter
import matplotlib.pyplot as plt
import numpy as np
x=[.2,.2,.2,.2,.4,.4,.4,.4,.6,.6,.6,.6,.8,.8,.8,.8]
y=[.2,.4,.6,.8,.2,.4,.6,.8,.2,.4,.6,.8,.2,.4,.6,.8]
nb_means = [32, 159, 216, 327, 206, 282, 295, 225, 308, 252, 226, 229, 301, 276, 262, 273]
fig = plt.figure()
ax = plt.axes(projection='3d')
X,Y = np.meshgrid(x,y)
ax.set_title('Name Based Routing')
ax.set_xlabel('Prob of Request')
ax.set_ylabel('Prob of Publish')
ax.set_zlabel('RTT')
ax.plot_wireframe(X, Y, nb_means, rstride=10, cstride=10)
plt.show()
However, as you can see in the output, the wireframe does not increase along the Q axis as I expected.
Am I setting up my x and y incorrectly?
X, Y, and nb_means are all part of the problem. They should all be 2D arrays of the same shape (your nb_means is currently a 1D list, and meshgrid on two 16-element lists produces 16x16 grids). You also don't need to make X and Y using meshgrid; all you need to do is reshape them all:
X = np.reshape(x, (4,4))
Y = np.reshape(y, (4,4))
nb2 = np.reshape(nb_means, (4,4))
...
ax.plot_wireframe(X, Y, nb2)
You may also not really want that rstride=10 and cstride=10.
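The reshape works because the flat lists are ordered with P varying slowest and Q varying fastest. If you would rather start from the four distinct values per axis, np.meshgrid with indexing='ij' builds the same 4x4 grids, provided it is given the unique values and not the 16-element lists. A sketch, with p_values, q_values, P, Q and Z as illustrative names:

```python
import numpy as np

# Flat lists from the question.
x = [.2, .2, .2, .2, .4, .4, .4, .4, .6, .6, .6, .6, .8, .8, .8, .8]
y = [.2, .4, .6, .8, .2, .4, .6, .8, .2, .4, .6, .8, .2, .4, .6, .8]
nb_means = [32, 159, 216, 327, 206, 282, 295, 225,
            308, 252, 226, 229, 301, 276, 262, 273]

p_values = [0.2, 0.4, 0.6, 0.8]
q_values = [0.2, 0.4, 0.6, 0.8]

# indexing='ij' makes the first axis follow P and the second follow Q.
P, Q = np.meshgrid(p_values, q_values, indexing="ij")
Z = np.reshape(nb_means, (len(p_values), len(q_values)))

# Z[i, j] is the result for p_values[i] and q_values[j].
print(P.shape, Q.shape, Z.shape)
print(P[0, 1], Q[0, 1], Z[0, 1])
```

These P, Q and Z arrays can be passed to ax.plot_wireframe in place of X, Y and nb2.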