The boxplot drawer function page at https://matplotlib.org/gallery/statistics/bxp.html contains the following example:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cbook as cbook
# fake data
np.random.seed(19680801)
data = np.random.lognormal(size=(37, 4), mean=1.5, sigma=1.75)
labels = list('ABCD')
# compute the boxplot stats
stats = cbook.boxplot_stats(data, labels=labels, bootstrap=10000)
for n in range(len(stats)):
    stats[n]['med'] = np.median(data)
    stats[n]['mean'] *= 2
print(list(stats[0]))
Inside the for loop there is a line, stats[n]['mean'] *= 2, that I can't understand. Is it a mistake, or does it serve a purpose?
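One way to see what that line does: cbook.boxplot_stats returns a list of plain dictionaries, one per column, so the loop only overwrites the stored statistics and never touches the data. The sketch below assumes the modified stats are later drawn with Axes.bxp and showmeans=True (that call is not part of the excerpt above), in which case the mean markers appear at twice the real column means.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.cbook as cbook

np.random.seed(19680801)
data = np.random.lognormal(size=(37, 4), mean=1.5, sigma=1.75)

# One dict per column, with keys such as 'mean', 'med', 'q1', 'q3'
stats = cbook.boxplot_stats(data)
true_means = data.mean(axis=0)

for column_stats in stats:
    column_stats['mean'] *= 2  # same operation as in the gallery example

doubled_means = np.array([s['mean'] for s in stats])

# bxp draws whatever is stored in the dicts, not the original data
fig, ax = plt.subplots()
ax.bxp(stats, showmeans=True)
```

So the line is not an error in the sense of a crash; it deliberately changes what gets drawn, which only makes sense as a demonstration that the stats can be edited before plotting.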
import numpy as np
import scipy.stats as stats
import math
import ipywidgets as widgets
from ipywidgets import interactive
import seaborn as sns
import matplotlib.pyplot as plt
mu_0 = 50
mu_1 = mu_0*1.1
#mu_2 = mu_0*1.5
n= 3
sigma=4.32/math.sqrt(n)
horizontal_values=np.linspace(55, 75, num=101)
def critical_value(mu_1,sigma, alpha=0.04):
    c=stats.norm.ppf(1-alpha,mu_0,sigma)
    return c
c= critical_value(mu_1,sigma)
power = stats.norm.sf(c,mu_1,sigma)
print (power)
print(c)
Hello,
I need to plot a graph from this calculation: entering a different mu_0 gives a different power.
I need to pass every element of the horizontal_values array to the code that calculates the power, so that we can see how the power varies with the speed,
and then draw the corresponding curve.
TL;DR: I want to vary mu_0 between 55 and 75 and plot the resulting power values, but I don't know how to go about it.
I think this is what you are looking for.
import numpy as np
import scipy.stats as stats
import math
import ipywidgets as widgets
from ipywidgets import interactive
import seaborn as sns
import matplotlib.pyplot as plt
def critical_value(mu_0, sigma, alpha=0.04):
    # critical value of the test under the null mean mu_0
    c = stats.norm.ppf(1 - alpha, mu_0, sigma)
    return c
def func(mu_0): # function for calculating power
    mu_1 = mu_0*1.1
    #mu_2 = mu_0*1.5
    n = 3
    sigma = 4.32/math.sqrt(n)
    c = critical_value(mu_0, sigma) # pass mu_0 explicitly; it is not a global here
    power = stats.norm.sf(c, mu_1, sigma)
    return power
horizontal_values=np.linspace(55, 75, num=101)
power = [func(mu) for mu in horizontal_values] # calculates power for different mu_0
plt.plot(horizontal_values, power) # plot
plt.xlabel('mu')
plt.ylabel('Power')
plt.show()
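As a cross-check on the curve, the same power calculation can be sketched with only the standard library. This is a minimal sketch, not the code from the answer: the function name power_for and its ratio, n, sd and alpha parameters are hypothetical names chosen here, with defaults copied from the question (mu_1 = 1.1 * mu_0, n = 3, 4.32, alpha = 0.04).

```python
import math
from statistics import NormalDist

def power_for(mu_0, ratio=1.1, n=3, sd=4.32, alpha=0.04):
    """Power of the one-sided test when the true mean is ratio * mu_0."""
    sigma = sd / math.sqrt(n)
    # critical value: the (1 - alpha) quantile under the null mean mu_0
    c = NormalDist(mu_0, sigma).inv_cdf(1 - alpha)
    # power: probability of exceeding c under the alternative mean
    return 1 - NormalDist(ratio * mu_0, sigma).cdf(c)

# same grid as np.linspace(55, 75, num=101)
mu_values = [55 + 0.2 * i for i in range(101)]
powers = [power_for(mu) for mu in mu_values]
```

Because sigma is constant while the gap between mu_1 and mu_0 grows with mu_0, the power should rise steadily across the range, which is a quick sanity check for the plotted curve.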
I want to plot a trend line on top of a data plot. This must be simple, but I have not been able to figure out how to do it.
Let us say I have the following:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.DataFrame(np.random.randint(0,100,size=(100, 1)), columns=list('A'))
ax = sns.lineplot(data=df)
ax.set(xlabel="Index",
       ylabel="Variable",
       title="Sample")
plt.show()
The resulting plot is:
What I would like to add is a trend line, something like the red line in the following:
Thank you for any feedback.
A moving average is one method (my first thought, and already suggested).
Another method is to use a polynomial fit. Since you had 100 points in your original data, I picked a 10th order fit (square root of data length) in the example below. With some modification of your original code:
idx = [i for i in range(100)]
rnd = np.random.randint(0,100,size=100)
ser = pd.Series(rnd, idx)
fit = np.polyfit(idx, rnd, 10)
pf = np.poly1d(fit)
plt.plot(idx, rnd, 'b', idx, pf(idx), 'r')
This code provides a plot like this:
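A high-order fit on raw index values (0 to 99 raised to the 10th power) can become poorly conditioned. As a hedged alternative to np.polyfit, the sketch below uses numpy.polynomial.Polynomial.fit, which rescales the x values to [-1, 1] before fitting. The variable names and the seeded generator are choices made for this illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)
idx = np.arange(100)
rnd = rng.integers(0, 100, size=100)

# Polynomial.fit maps idx onto [-1, 1] internally before solving
trend = Polynomial.fit(idx, rnd, deg=10)
line = Polynomial.fit(idx, rnd, deg=1)

# the fitted objects are callable on the original x values
trend_values = trend(idx)
line_values = line(idx)

def rss(fitted):
    """Residual sum of squares of a fitted curve against the data."""
    return float(np.sum((rnd - fitted) ** 2))
```

The values in trend_values can be plotted the same way as pf(idx) above, for example with plt.plot(idx, rnd, 'b', idx, trend_values, 'r').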
You can do something like this using Rolling Average:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.DataFrame(np.random.randint(0,100,size=(100, 1)), columns=list('A'))
df["rolling_avg"] = df.A.rolling(7).mean().shift(-3)
sns.lineplot(data=df)
plt.show()
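The shift(-3) in that line re-centres a trailing 7-point window on each row. A small sketch, using a seeded generator chosen here for reproducibility, suggests the same result can be had with the center=True option of rolling:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 100, size=(100, 1)), columns=list('A'))

# trailing window of 7, then moved back 3 rows so it is centred
shifted = df.A.rolling(7).mean().shift(-3)

# window of 7 centred on each row directly
centered = df.A.rolling(7, center=True).mean()

# both leave NaN in the first and last three rows
same = np.allclose(shifted, centered, equal_nan=True)
```

Either form leaves three undefined points at each end of the series, so the smoothed line is slightly shorter than the data it is drawn over.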
You could also do a Regression plot to analyse how data can be interpolated using:
ax = sns.regplot(x=df.index, y="A",
                 data=df,
                 scatter_kws={"s": 10},
                 order=10,
                 ci=None)
Code is below
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from sklearn.cluster import KMeans
import seaborn as sns
df = pd.DataFrame(np.random.rand(10,3), columns=["A", "B","C"])
km = KMeans(n_clusters=3).fit(df)
df['cluster_id'] = km.labels_
test = {0:"Blue", 1:"Red", 2:"Green"}
#sns.scatterplot()
plt.show()
I am trying to plot the clusters without being tied to two specific columns for x and y. The data can have any number of columns; I just want to plot the cluster graph.
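One possible approach, offered only as a sketch, is to project all columns onto two principal components and colour the points by cluster. The helper project_to_2d is a hypothetical name, the random cluster_id array is a stand-in for km.labels_, and the projection uses a plain NumPy SVD rather than any scikit-learn API.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
features = rng.random((10, 5))            # any number of columns works
cluster_id = rng.integers(0, 3, size=10)  # stand-in for km.labels_

def project_to_2d(values):
    """Project rows onto the first two principal components."""
    centered = values - values.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

xy = project_to_2d(features)
colors = {0: "blue", 1: "red", 2: "green"}

fig, ax = plt.subplots()
ax.scatter(xy[:, 0], xy[:, 1], c=[colors[int(k)] for k in cluster_id])
ax.set(xlabel="component 1", ylabel="component 2")
```

The axes no longer correspond to any single column, so this shows how the clusters separate overall rather than along A, B or C individually.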
Original (2018.11.01)
I have three NumPy arrays, x, y and z, produced by my laser scanner (40 degrees per step).
I want to use them to build a 3D model.
I think I should use matplotlib.tri,
but I don't know how to determine the triangulation.
Here is my data: https://www.dropbox.com/s/d9p62kv9jcq9bwh/xyz.zip?dl=0
And the original model: https://i.imgur.com/XSyONff.jpg
Code:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.tri as mtri
x_all=np.load("x.npy")
y_all=np.load("y.npy")
z_all=np.load("z.npy")
tri = #I have no idea...
fig = plt.figure()
ax = fig.gca(projection='3d')
ax.plot_trisurf(x_all,y_all,z_all,triangles=tri.triangles)
Thanks so much.
Update (2018.11.02)
I tried the approach from the question "Delaunay Triangulation of points from 2D surface in 3D with python?" to determine the triangulation.
Code:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.tri as mtri
from stl import mesh
x_all=np.load("x.npy")
y_all=np.load("y.npy")
z_all=np.load("z.npy")
model=np.vstack((x_all,y_all,z_all))
model=np.transpose(model)
model -= model.mean(axis=0)
rad = np.linalg.norm(model, axis=1)
zen = np.arccos(model[:,-1] / rad)
azi = np.arctan2(model[:,1], model[:,0])
tris = mtri.Triangulation(zen, azi)
plt.show()
And my model looks like:
https://i.stack.imgur.com/KVPHP.png
https://i.stack.imgur.com/LLQsQ.png
https://i.stack.imgur.com/HdzFm.png
The surface looks better, but there is a big hole in my model. Any idea how to fix it?
Assuming you want to reduce the complexity, i.e. find triangles for the points in your files, you could fit a convex hull to your points, for example with scipy.spatial.ConvexHull.
Based on the files you provided, this produces a surface plot of the object.
from numpy import load, stack
from matplotlib.pyplot import subplots
from mpl_toolkits.mplot3d import Axes3D
from scipy import spatial
x = load("x.npy")
y = load("y.npy")
z = load("z.npy")
points = stack((x,y,z), axis = -1)
v = spatial.ConvexHull(points)
fig, ax = subplots(subplot_kw = dict(projection = '3d'))
ax.plot_trisurf(*v.points.T, triangles = v.simplices) # simplices is already (n_triangles, 3)
fig.show()
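A caveat worth keeping in mind: a convex hull only uses the outermost points, so any concave detail in a scan is lost. The toy sketch below uses made-up points (the corners of a unit cube plus one interior point, not the scanner data) to show that the interior point is dropped and that each row of simplices holds the three point indices of one triangle.

```python
import numpy as np
from scipy import spatial

# eight cube corners plus one point in the middle (index 8)
corners = np.array(
    [[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)],
    dtype=float,
)
inner = np.array([[0.5, 0.5, 0.5]])
points = np.vstack((corners, inner))

hull = spatial.ConvexHull(points)

# one row per triangle, three point indices per row
triangles = hull.simplices

# indices of the points that lie on the hull; the interior point is absent
on_hull = set(int(i) for i in hull.vertices)
```

If the scanned object has hollows or overhangs, the hull will bridge over them, so the result is closer to a shrink-wrapped outline than to the true surface.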
I am unable to get the regression line and the variance bounds around it when plotting seaborn.pairplot with kind="reg", as shown in the examples at http://seaborn.pydata.org/generated/seaborn.pairplot.html
import pandas as pd
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
# Preparing random dataFrame with two columns, viz., random x and lag-1 values
lst1 = list(np.random.rand(10000))
df = pd.DataFrame({'x1':lst1})
df['x2'] = df['x1'].shift(1)
df = df[df['x2'] > 0]
# Plotting now
pplot = sns.pairplot(df, kind="reg")
pplot.set(ylim=(min(df['x1']), max(df['x1'])))
pplot.set(xlim=(min(df['x1']), max(df['x1'])))
plt.show()
The regression line is there, you just don't see it, because it's hidden by the unnaturally high number of points in the plot.
So let's reduce the number of points and you'll see the regression as expected.
import pandas as pd
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
# Preparing random dataFrame with two columns, viz., random x and lag-1 values
lst1 = list(np.random.rand(100))
df = pd.DataFrame({'x1':lst1})
df['x2'] = df['x1'].shift(1)
df = df[df['x2'] > 0]
# Plotting now
pplot = sns.pairplot(df, kind="reg")
plt.show()