Matplotlib Plot curve logistic regression - python

I am trying to plot the trained curve in matplotlib. However I am getting this thing:
The scatter works fine:
How can I create the curve using plot?

The values in X_train may be out of order. Try sorting them first. For instance, if X_train is just a list of numbers, you could call:
X_train.sort()
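One caveat: sorting X_train in place does not reorder predictions that were already computed from the unsorted data. A minimal sketch, assuming NumPy arrays (the values and the names x, p, and order are made up for illustration), uses np.argsort to reorder both arrays with the same index:

```python
import numpy as np

# Hypothetical unsorted inputs and the model outputs that belong to them
x = np.array([30, 20, 50, 40])
p = np.array([0.2, 0.1, 0.8, 0.5])

# Indices that would sort x in ascending order
order = np.argsort(x)

# Apply the same ordering to both arrays so each x keeps its own p
x_sorted = x[order]
p_sorted = p[order]
```

Plotting x_sorted against p_sorted should then give a single left-to-right line instead of a zigzag.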

You can plot a smooth line curve by first determining the spline curve’s coefficients using the scipy.interpolate.make_interp_spline():
import numpy as np
from scipy.interpolate import make_interp_spline
import matplotlib.pyplot as plt
# Dataset
x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
y = np.array([20, 30, 5, 12, 39, 48, 50, 3])
X_Y_Spline = make_interp_spline(x, y)
# Returns evenly spaced numbers
# over a specified interval.
X_ = np.linspace(x.min(), x.max(), 500)
Y_ = X_Y_Spline(X_)
# Plotting the Graph
plt.plot(X_, Y_)
plt.title("Plot Smooth Curve Using the scipy.interpolate.make_interp_spline() Class")
plt.xlabel("X")
plt.ylabel("Y")
plt.show()
Result:

It seems that the values in X_train are unsorted. For instance, if
In [1]: X_train
Out [1]: array([30, 20, 50, 40])
then
In [2]: model.predict_proba(X_train)
Out [2]: array([0.2, 0.1, 0.8, 0.5])
Here, plt.plot will try to plot lines from point [30, 0.2] to point [20, 0.1], then from [20, 0.1] to [50, 0.8], then from [50, 0.8] to [40, 0.5].
Thus, the solution to your problem is to sort X_train before plotting =)
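import numpy as np
import matplotlib.pyplot as plt
# Sort the inputs first, then predict on the sorted inputs so x and y stay paired
X_train_sorted = np.sort(X_train)
# With scikit-learn, predict_proba expects a 2D array and returns one column per class,
# e.g. model.predict_proba(X_train_sorted.reshape(-1, 1))[:, 1]
y_train_sorted = model.predict_proba(X_train_sorted)
plt.scatter(X_train_sorted, y_train_sorted)
plt.plot(X_train_sorted, y_train_sorted)
plt.show()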
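If the training points are sparse, the sorted line can still look jagged. One option, sketched here under assumptions, is to evaluate the model on an evenly spaced grid instead of on the training points. The coefficients w and b below are hypothetical stand-ins for a fitted one-feature logistic model, not values from the question:

```python
import numpy as np

def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical coefficients standing in for a trained model
w, b = 0.15, -5.0

# Unsorted training inputs; only their range is used to build the grid
x_train = np.array([30.0, 20.0, 50.0, 40.0])

# 200 evenly spaced, already sorted x values covering the training range
x_grid = np.linspace(x_train.min(), x_train.max(), 200)

# Predicted probability at every grid point
p_grid = sigmoid(w * x_grid + b)
```

Passing x_grid and p_grid to plt.plot should then draw a smooth curve on top of the scatter of the training points.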

Related

Keep the gap between two datasets in matplotlib

I have two datasets
firstX = [0, 1, 2, 3, 4, 5, 6] # X Axis
firstY = [10, 10, 20, 30, 40, 60, 70] # Y Axis
secondX = [9, 10, 11, 12, 13, 14, 15] # X Axis
secondY = [40, 20, 60, 11, 77, 12, 54] # Y Axis
I want to plot these two datasets in the same chart but without connecting them together. As you can see, there is a disconnection between them (in X axis, 7 and 8 are missing). When I concat them, matplotlib will try to connect the last point of the first dataset (6, 70) with the first point of the second dataset (9, 40). I would like to know how to avoid this behavior
You can just plot them individually. If they're sublists of a list, e.g. X = [[X1], [X2]], Y = [[Y1], [Y2]], you can loop over them.
import matplotlib.pyplot as plt
fig = plt.figure()
for i in range(len(X)):
    plt.plot(X[i], Y[i])
plt.show()
Instead of concatenating the datasets, you can call the plot command two times, plotting two times to the same axes:
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(firstX, firstY)
ax.plot(secondX, secondY)
From what I understand of your question, this should work:
import matplotlib.pyplot as plt
plt.figure()
plt.plot(firstX, firstY, c='b')
plt.plot(secondX, secondY, c='b')
plt.show()
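Another option, if the data has to stay in a single pair of arrays, is to put a NaN between the two datasets; matplotlib does not draw line segments through NaN points. A sketch using the lists from the question (the names x and y are chosen here for illustration):

```python
import numpy as np

firstX = [0, 1, 2, 3, 4, 5, 6]
firstY = [10, 10, 20, 30, 40, 60, 70]
secondX = [9, 10, 11, 12, 13, 14, 15]
secondY = [40, 20, 60, 11, 77, 12, 54]

# Join the datasets with a NaN separator; the result is a float array
x = np.concatenate([firstX, [np.nan], secondX])
y = np.concatenate([firstY, [np.nan], secondY])
```

A single plt.plot(x, y) call should then leave a gap between (6, 70) and (9, 40), with both segments sharing one color and one legend entry.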

Matplotlib scatter(): Polynomial regression line [duplicate]

This question already has answers here:
Simplest way to make a polynomial regression with sklearn?
(2 answers)
polynomial regression using python
(3 answers)
Closed 4 years ago.
Is it possible to do a polynomial regression line on a scatter() in matplotlib?
This is my graph:
https://imgur.com/a/Xh1BO
alg_n = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4...]
orig_hc_runtime = [0.01, 0.02, 0.03, 0.04, 0.04, 0.04, 0.05, 0.09...]
plt.scatter(alg_n, orig_hc_runtime, label="Orig HC", color="b", s=4)
plt.scatter(alg_n, mod_hc_runtime, label="Mod HC", color="c", s=4)
...
x_values = [x for x in range(5, n_init+2, 2)]
y_values = [y for y in range(0, 10, 2)]
plt.xlabel("Number of Queens")
plt.ylabel("Time (sec)")
plt.title("Algorithm Performance: Time")
plt.xticks(x_values)
plt.yticks(y_values)
plt.grid(linewidth="1", color="white")
plt.legend()
plt.show()
Is it possible to have regression lines for each data set? If so, can you please explain how I can do it?
I am not sure whether this can be done with matplotlib alone, but you can always compute the regression separately and plot it. Here is example code that uses scikit-learn to compute the regression line.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
x = [1, 2, 3, 4, 5, 8, 10]
y = [1.1, 3.8, 8.5, 16, 24, 65, 99.2]
model = make_pipeline(PolynomialFeatures(2), LinearRegression())
model.fit(np.array(x).reshape(-1, 1), y)
x_reg = np.arange(11)
y_reg = model.predict(x_reg.reshape(-1, 1))
plt.scatter(x, y)
plt.plot(x_reg, y_reg)
plt.show()
Output :
I would advise you to use the Seaborn library. It is built on top of matplotlib and has many statistical plotting routines. Have a look at the examples for regplot and lmplot: http://seaborn.pydata.org/tutorial/regression.html#functions-to-draw-linear-regression-models
In your case, you could do something like:
import pandas as pd
import seaborn as sns
df = pd.DataFrame.from_dict({"Number of Queens": [1, 1, 1, 2, 2, 2, 3,
                                                  3, 3, 4, 4, 4],
                             "Time (sec)": [0.01, 0.02, 0.03, 0.04, 0.04, 0.04,
                                            0.05, 0.09, 0.12, 0.14, 0.15, 0.16]})
# Recent seaborn versions require x, y, and data to be passed as keywords
sns.lmplot(x='Number of Queens', y='Time (sec)', data=df, order=1)
If you want regression lines for different groups, add a column with the group labels and pass it to the hue parameter of lmplot.
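If adding scikit-learn or seaborn is not an option, NumPy alone can compute a polynomial fit. This is a minimal sketch, assuming a degree-2 polynomial is appropriate; it reuses the sample x and y from the scikit-learn answer above, and the names coeffs, x_reg, and y_reg are chosen for illustration:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 8, 10])
y = np.array([1.1, 3.8, 8.5, 16, 24, 65, 99.2])

# Least-squares fit of a degree-2 polynomial; coefficients come highest power first
coeffs = np.polyfit(x, y, 2)

# Evaluate the fitted polynomial on a dense grid for a smooth regression line
x_reg = np.linspace(x.min(), x.max(), 100)
y_reg = np.polyval(coeffs, x_reg)
```

Calling plt.scatter(x, y) followed by plt.plot(x_reg, y_reg) should overlay the fitted curve on the points; repeating the fit for each data set gives one regression line per series.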

How to update a plot or graph in matplotlib

I would like to know how to update a graph or plot in matplotlib every few seconds. Code:
import matplotlib.pyplot as plt
import numpy as np
axes = plt.gca()
axes.set_xlim([0,5])
axes.set_ylim([0,100])
X = [0, 1, 2, 3, 4, 5]
Y = [15, 30, 45, 60, 75, 90]
plt.plot(X, Y)
plt.xlabel('Time spent studying (hours)')
plt.ylabel('Score (percentage)')
plt.show()
What you have written is correct, but to make your code dynamic you can put it in a function and pass the X and Y coordinates to that function. One example is shown below:
def GraphPlot(X, Y):
    "Your code"
GraphPlot([0, 1, 2, 3, 4, 5], [90, 30, 45, 60, 75, 90])
If you are certain that the X axis will not change, then you can fix the X axis in the code, take only the Y axis values as a list from the user, and pass that list to the function as an argument.
Otherwise, if you do want user interaction, the best way is to update the X and Y lists in a loop and pass them to the function as arguments.
I used time.sleep(0.5) so that the changes are visible, and reversed Y to provide new data for the update. Hopefully this is what you want:
%matplotlib notebook
import time
import matplotlib.pyplot as plt
X = [0, 1, 2, 3, 4, 5]
Y = [15, 30, 45, 60, 75, 90]
fig, ax = plt.subplots()
ax.set_xlim([0,5])
ax.set_ylim([0,100])
ax.set_xlabel('Time spent studying (hours)')
ax.set_ylabel('Score (percentage)')
l, = ax.plot(X, Y)
for ydata in [Y, Y[::-1]]*2:
    l.set_ydata(ydata)
    fig.canvas.draw()
    time.sleep(0.5)

How to create colour map from 3 arrays in python

I'm trying to create a colour plot in python of two arrays t1 and t2 with the colours being set by a third one v, but I can't get the colour bar to be in terms of the v array, it is instead in terms of t1. This is my code:
import matplotlib.pyplot as plt
import numpy as np
t1 = [75, 76, 77, 78]
t2 = [75, 76, 77, 78]
v = [0.5, 0.5, 0.8, 0.8]
image_data = np.column_stack([t1, t2, v])
plt.imshow(image_data)
plt.colorbar()
plt.show()
It produces this figure:
Any help would be much appreciated.
You cannot use imshow with x and y coordinates plus a third array for color.
imshow displays a matrix as an image: there are X*Y values, and each value is mapped to a color.
Perhaps you want to use scatter instead.
E.g. you can try:
import matplotlib.pyplot as plt
t1 = [0,1,2,3]
t2 = [0, 10, 20, 30]
v = [0.5, 0.5, 0.8, 0.8]
plt.scatter(t1, t2, c=v, cmap='Greens')
plt.colorbar()
plt.show()
You can check which colormap is most suitable for you.

How to fit curve lines with yscale('log') - Python

I've been trying to fit some data to the best fit line for a specific set x and y.
I tried unsuccessfully many times, and I can't seem to find a way to fit the data using yscale('log') and xscale('log'). I get this weird result, and I can't work out why:
https://www.dropbox.com/s/g6m4f8wh7r7jffg/Imagem%20sem%20t%C3%ADtulo.png?dl=0
My code:
#!/usr/bin/env python
# import the necessary modules
import numpy as np
import matplotlib.pyplot as plt
# Generate x and y values which the curve will be fitted to
# (In practical cases, these should be read in)
x = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048]
y = [497775, 150760, 50929, 19697, 8520, 3948, 1812, 710, 214, 57, 18, 4]
p = np.polyfit(x,y,1)
plt.plot(x, np.polyval(p,x), 'r-')
plt.plot(x, y)
plt.yscale('log')
plt.xscale('log')
plt.show()
I have a hunch that it is because I am using polyval, but I can't find how to calculate it for logarithms. Can you help? I am new to this and I need help!
Something that appears linear on a log-log plot is not a linear function; it's a power-law function.
You're getting the best fit line:
y = a * x + b
but what you want is the best fit power-law function of the form:
y = a * x**k
There are a number of ways to do this. Using polyfit is a good approach, but you'll need to fit the line in "log space". Taking logarithms of both sides gives log(y) = log(a) + k * log(x), which is a straight line in log(x). In other words, fit a line to the logarithm of y as a function of the logarithm of x.
For example, based on your code:
import numpy as np
import matplotlib.pyplot as plt
x = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048]
y = [497775, 150760, 50929, 19697, 8520, 3948, 1812, 710, 214, 57, 18, 4]
logx, logy = np.log(x), np.log(y)
p = np.polyfit(logx, logy, 1)
y_fit = np.exp(np.polyval(p, logx))
plt.plot(x, y_fit, 'r-')
plt.plot(x, y)
plt.yscale('log')
plt.xscale('log')
plt.show()
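The log-space fit also gives the parameters of y = a * x**k directly: the slope of the fitted line is k and the intercept is log(a). A short sketch using the data from the question (the names k, log_a, and a are chosen here for illustration):

```python
import numpy as np

x = np.array([1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048], dtype=float)
y = np.array([497775, 150760, 50929, 19697, 8520, 3948, 1812, 710, 214, 57, 18, 4],
             dtype=float)

# Fit log(y) = k * log(x) + log(a); polyfit returns the slope first, then the intercept
k, log_a = np.polyfit(np.log(x), np.log(y), 1)
a = np.exp(log_a)

# Evaluate the fitted power law in the original (non-log) space
y_fit = a * x**k
```

Here y_fit holds the same values as np.exp(np.polyval(p, logx)) in the code above, and k should come out negative because y falls as x grows.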
