I'm trying to get the turning points (peaks and valleys) of this data.
Here is what I have tried:
import scipy
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from peakdetect import peakdetect
df = pd.read_csv('temp_sample.csv')
x = df['time'].to_list()
y = df['temp'].to_list()
turnpoints = peakdetect(y, x, lookahead=20)
print(turnpoints)
peaks = np.array(turnpoints[0])
valleys = np.array(turnpoints[1])
plt.plot(x,y)
plt.plot(peaks[:,0], peaks[:,1], 'ro')
plt.plot(valleys[:,0], valleys[:,1], 'ko')
This does not plot all the turning points. This is a temperature dataset, so we can see the rise, steady, and fall states of the temperature. For a steady state, this script only marks one end. Is it possible to get all the turning points in the graph, no matter the state?
I have a dummy dataset, df:
Demand WTP
0 13.0 111.3
1 443.9 152.9
2 419.6 98.2
3 295.9 625.5
4 150.2 210.4
I would like to plot this data as a step function in which the "WTP" are y-values and "Demand" are x-values.
The step curve should start from the row with the lowest value in "WTP", and then increase gradually with the corresponding x-values from "Demand". However, I can't get the x-values to be cumulative, and instead, my plot becomes this:
I'm trying to get something that looks like this:
but instead of a proportion along the y-axis, I want the actual values from my dataset:
This is my code:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
Demand_quantity = pd.Series([13, 443.9, 419.6, 295.9, 150.2])
Demand_WTP = [111.3, 152.9, 98.2, 625.5, 210.4]
demand_data = {'Demand':Demand_quantity, 'WTP':Demand_WTP}
Demand = pd.DataFrame(demand_data)
Demand.sort_values(by = 'WTP', axis = 0, inplace = True)
print(Demand)
# sns.ecdfplot(data = Demand_WTP, x = Demand_quantity, stat = 'count')
plt.step(Demand['Demand'], Demand['WTP'], label='pre (default)')
plt.legend(title='Parameter where:')
plt.title('plt.step(where=...)')
plt.show()
You can try:
import matplotlib.pyplot as plt
import pandas as pd
df=pd.DataFrame({"Demand":[13, 443.9, 419.6, 295.9, 150.2],"WTP":[111.3, 152.9, 98.2, 625.5, 210.4]})
df=df.sort_values(by=["Demand"])
plt.step(df.Demand,df.WTP)
But I am not really sure what you want to do. If the x-values are df.Demand, then the dataframe should be sorted by this column.
If you want to accumulate the x-values, then try numpy.cumsum:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
df=pd.DataFrame({"Demand":[13, 443.9, 419.6, 295.9, 150.2],"WTP":[111.3, 152.9, 98.2, 625.5, 210.4]})
df=df.sort_values(by=["WTP"])
plt.step(np.cumsum(df.Demand),df.WTP)
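If the first step should also start at zero on the x-axis, one option is to prepend a 0 to the cumulative x-values and repeat the first WTP value. This is only a minimal sketch of that idea using NumPy; the names x_edges and y_steps are illustrative and not part of the answer above.

```python
import numpy as np

demand = np.array([13, 443.9, 419.6, 295.9, 150.2])
wtp = np.array([111.3, 152.9, 98.2, 625.5, 210.4])

# Order the rows by ascending WTP, as in the answer above.
order = np.argsort(wtp)
demand_sorted = demand[order]
wtp_sorted = wtp[order]

# Cumulative demand gives the right edge of each step; prepend 0 so the
# first step has a left edge too.
x_edges = np.concatenate(([0.0], np.cumsum(demand_sorted)))

# With where='pre' (the plt.step default), y[i] is drawn over (x[i-1], x[i]],
# so the first WTP value is repeated to pair with the leading 0.
y_steps = np.concatenate(([wtp_sorted[0]], wtp_sorted))
```

Passing these to plt.step(x_edges, y_steps) should then draw each WTP level across its own demand quantity.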
I want to plot a tendency line on top of a data plot. This must be simple but I have not been able to figure out how to get to it.
Let us say I have the following:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.DataFrame(np.random.randint(0,100,size=(100, 1)), columns=list('A'))
ax = sns.lineplot(data=df)
ax.set(xlabel="Index",
ylabel="Variable",
title="Sample")
plt.show()
The resulting plot is:
What I would like to add is a tendency line. Something like the red line in the following:
I thank you for any feedback.
A moving average is one method (my first thought, and already suggested).
Another method is to use a polynomial fit. Since you had 100 points in your original data, I picked a 10th order fit (square root of data length) in the example below. With some modification of your original code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
idx = list(range(100))
rnd = np.random.randint(0,100,size=100)
ser = pd.Series(rnd, idx)
fit = np.polyfit(idx, rnd, 10)
pf = np.poly1d(fit)
plt.plot(idx, rnd, 'b', idx, pf(idx), 'r')
plt.show()
This code provides a plot like this:
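A high-order fit on raw index values can be poorly conditioned. As a hedged alternative (not what the answer above uses), numpy.polynomial.Polynomial.fit rescales the x-values internally before fitting. The sketch below uses a seeded generator only so that it is repeatable; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)      # seeded only to make the sketch repeatable
idx = np.arange(100)
rnd = rng.integers(0, 100, size=100)

# Polynomial.fit maps idx onto [-1, 1] internally before solving the
# least-squares problem, which keeps a degree-10 fit better conditioned.
poly = np.polynomial.Polynomial.fit(idx, rnd, deg=10)
trend = poly(idx)                   # fitted values at the original indices
```

The resulting trend array can be plotted against idx in the same way as pf(idx).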
You can do something like this using Rolling Average:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.DataFrame(np.random.randint(0,100,size=(100, 1)), columns=list('A'))
df["rolling_avg"] = df.A.rolling(7).mean().shift(-3)
sns.lineplot(data=df)
plt.show()
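The shift(-3) centers the 7-point window on each sample. As a small sketch of the same idea, pandas can center the window itself through the center argument of rolling; the comparison below is there only to illustrate that the two forms agree.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 100, size=(100, 1)), columns=list("A"))

# Trailing 7-point mean moved back by 3 samples, as in the snippet above.
shifted = df.A.rolling(7).mean().shift(-3)

# Same window, but labelled at its center by pandas directly.
centered = df.A.rolling(7, center=True).mean()
```

Either way the first and last three values are NaN, because the window is incomplete there.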
You could also use a regression plot to fit a polynomial trend through the data:
ax = sns.regplot(x=df.index, y="A",
data=df,
scatter_kws={"s": 10},
order=10,
ci=None)
Original (2018.11.01)
I have three NumPy arrays, x, y, and z, produced by my laser scanner (40 degrees per step).
I want to use them to build a 3D model.
I think matplotlib.tri should be used,
but I have no idea how to determine the triangulation.
Here is my data: https://www.dropbox.com/s/d9p62kv9jcq9bwh/xyz.zip?dl=0
And the original model: https://i.imgur.com/XSyONff.jpg
Code:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.tri as mtri
x_all=np.load("x.npy")
y_all=np.load("y.npy")
z_all=np.load("z.npy")
tri = #I have no idea...
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.plot_trisurf(x_all,y_all,z_all,triangles=tri.triangles)
Thanks so much.
Update (2018.11.02)
I tried the approach from this question to determine the triangulation:
Delaunay Triangulation of points from 2D surface in 3D with python?
Code:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.tri as mtri
from stl import mesh
x_all=np.load("x.npy")
y_all=np.load("y.npy")
z_all=np.load("z.npy")
model=np.vstack((x_all,y_all,z_all))
model=np.transpose(model)
model -= model.mean(axis=0)
rad = np.linalg.norm(model, axis=1)
zen = np.arccos(model[:,-1] / rad)
azi = np.arctan2(model[:,1], model[:,0])
tris = mtri.Triangulation(zen, azi)
plt.show()
And my model looks like:
https://i.stack.imgur.com/KVPHP.png
https://i.stack.imgur.com/LLQsQ.png
https://i.stack.imgur.com/HdzFm.png
Although the surface looks better, there is a big hole in my model. Any idea how to fix it?
Assuming you want to reduce the complexity, i.e. find triangles that describe the points in your files, you may look into fitting a convex hull to your points; see here for more info.
Based on the files you provided, this produces a surface plot of the object:
from numpy import load, stack
from matplotlib.pyplot import subplots
from mpl_toolkits.mplot3d import Axes3D
from scipy import spatial
x = load("x.npy")
y = load("y.npy")
z = load("z.npy")
points = stack((x,y,z), axis = -1)
v = spatial.ConvexHull(points)
fig, ax = subplots(subplot_kw = dict(projection = '3d'))
ax.plot_trisurf(*v.points.T, triangles = v.simplices)
fig.show()
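Without the scanner files, the same call can be tried on synthetic data. The sketch below assumes random points on a unit sphere as a stand-in for the scan (this data is made up for illustration) and shows the shape of the triangle index array that plot_trisurf expects. Note that a convex hull closes every concavity, so it only approximates objects that are not convex.

```python
import numpy as np
from scipy import spatial

# Synthetic stand-in for the scan: 200 random points on a unit sphere.
rng = np.random.default_rng(0)
pts = rng.normal(size=(200, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)

hull = spatial.ConvexHull(pts)

# One row per triangular facet; each row holds three indices into pts.
triangles = hull.simplices
```

The triangles array has shape (number of facets, 3) and can be passed as the triangles argument together with the three coordinate columns of pts.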
I converted an oscilloscope dataset with millions of values into a pandas DataFrame. The next step is to plot it, but Matplotlib needs ~50 seconds to plot the DataFrame on my fairly powerful machine.
import pandas as pd
import matplotlib.pyplot as plt
import readTrc
datX, datY, m = readTrc.readTrc('C220180104_ch2_UHF00000.trc')
srx, sry = pd.Series(datX), pd.Series(datY)
df = pd.concat([srx, sry], axis = 1)
df.set_index(0, inplace = True)
df.plot(grid = 1)
plt.show()
Now I found out that there is a way to make matplotlib faster with large datasets by using 'Agg'.
import matplotlib
matplotlib.use('Agg')
import pandas as pd
import matplotlib.pyplot as plt
import readTrc
datX, datY, m = readTrc.readTrc('C220180104_ch2_UHF00000.trc')
srx, sry = pd.Series(datX), pd.Series(datY)
df = pd.concat([srx, sry], axis = 1)
df.set_index(0, inplace = True)
df.plot(grid = 1)
plt.show()
Unfortunately no plot is shown. The process of processing the plot takes ~5 seconds (a big improvement) but no plot is shown. Is this method not compatible with pandas?
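For context, 'Agg' is a non-interactive backend: it renders to an image buffer and has no window to open, so plt.show() has nothing to display regardless of whether pandas is involved. The figure has to be written to a file instead. A minimal sketch, using synthetic data in place of the .trc file and a hypothetical output path:

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: renders to files only
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for the oscilloscope trace.
x = np.linspace(0.0, 1.0, 100_000)
y = np.sin(2 * np.pi * 5 * x)

fig, ax = plt.subplots()
ax.plot(x, y)
ax.grid(True)

# Hypothetical output location; with Agg the plot is saved, not shown.
out_path = os.path.join(tempfile.gettempdir(), "trace.png")
fig.savefig(out_path, dpi=100)
plt.close(fig)
```

The saved image can then be opened with any viewer.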
You can use Plotly and Lenspy (which was built to solve this exact problem). Here is an example of how you can plot 10 million points in a scatter plot. This plot runs very fast on my 2016 MacBook.
import numpy as np
import plotly.graph_objects as go
from lenspy import DynamicPlot
# First, let's create a very large figure
x = np.arange(1, 11, 1e-6)
y = 1e-2*np.sin(1e3*x) + np.sin(x) + 1e-3*np.sin(1e10*x)
fig = go.Figure(data=[go.Scattergl(x=x, y=y)])
fig.update_layout(title=f"{len(x):,} Data Points.")
# Use DynamicPlot.show to view the plot
plot = DynamicPlot(fig)
plot.show()
# Plot will be available in the browser at http://127.0.0.1:8050/
For your use case (again, I cannot test this since I don’t have access to your dataset):
import pandas as pd
import matplotlib.pyplot as plt
import readTrc
from lenspy import DynamicPlot
import plotly.graph_objects as go
datX, datY, m = readTrc.readTrc('C220180104_ch2_UHF00000.trc')
srx, sry = pd.Series(datX), pd.Series(datY)
fig = go.Figure(data=[go.Scattergl(x=srx, y=sry)])
fig.update_layout(title=f"{len(srx):,} Data Points.")
# Use DynamicPlot.show to view the plot
plot = DynamicPlot(fig)
plot.show()
Disclaimer: I am the creator of Lenspy
I have a signal that is not sampled at equidistant intervals; for further processing it needs to be. I thought that scipy.signal.resample would do it, but I do not understand its behavior.
The signal is in y, and the corresponding time is in x.
The resampled signal is expected in yy, with the corresponding time in xx. Does anyone know what I am doing wrong or how to achieve what I need?
This code does not work; xx is not the time:
import numpy as np
from scipy import signal
import matplotlib.pyplot as plt
x = np.array([0,1,2,3,4,5,6,6.5,7,7.5,8,8.5,9])
y = np.cos(-x**2/4.0)
num=50
z=signal.resample(y, num, x, axis=0, window=None)
yy=z[0]
xx=z[1]
plt.plot(x,y)
plt.plot(xx,yy)
plt.show()
Even when you give the x coordinates (which correspond to the t argument), resample assumes that the sampling is uniform.
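The returned time axis can be inspected to see this. In the sketch below, which reflects my understanding of SciPy's behaviour and should be treated as an assumption, xx comes back evenly spaced with a step derived from x[1] - x[0], so it runs past the last real sample time.

```python
import numpy as np
from scipy import signal

x = np.array([0, 1, 2, 3, 4, 5, 6, 6.5, 7, 7.5, 8, 8.5, 9])
y = np.cos(-x**2 / 4.0)

num = 50
# With t given, resample returns (resampled_y, resampled_t).
yy, xx = signal.resample(y, num, t=x)

steps = np.diff(xx)   # spacing of the returned time axis
```

Comparing xx[-1] with x[-1] shows that the returned axis does not match the original time span.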
Consider using one of the univariate interpolators in scipy.interpolate.
For example, this script:
import numpy as np
from scipy import interpolate
import matplotlib.pyplot as plt
x = np.array([0,1,2,3,4,5,6,6.5,7,7.5,8,8.5,9])
y = np.cos(-x**2/4.0)
f = interpolate.interp1d(x, y)
num = 50
xx = np.linspace(x[0], x[-1], num)
yy = f(xx)
plt.plot(x,y, 'bo-')
plt.plot(xx,yy, 'g.-')
plt.show()
generates this plot:
Check the docstring of interp1d for options to control the interpolation, and also check out the other interpolation classes.
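As a sketch of two such options, the kind argument of interp1d selects a cubic spline instead of the default linear interpolation, and PchipInterpolator is one of the other classes in scipy.interpolate. The variable names below are illustrative.

```python
import numpy as np
from scipy import interpolate

x = np.array([0, 1, 2, 3, 4, 5, 6, 6.5, 7, 7.5, 8, 8.5, 9])
y = np.cos(-x**2 / 4.0)

num = 50
xx = np.linspace(x[0], x[-1], num)   # uniform time grid

# Cubic spline through the samples: smooth, but may overshoot.
f_cubic = interpolate.interp1d(x, y, kind="cubic")
yy_cubic = f_cubic(xx)

# PCHIP: shape-preserving piecewise cubic that stays within the data range.
f_pchip = interpolate.PchipInterpolator(x, y)
yy_pchip = f_pchip(xx)
```

Which one fits better depends on how smooth the underlying signal is expected to be between the samples.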