I am attempting to find the convolution of two rectangular pulses.
No errors are thrown, and I get a suitably shaped waveform as output; however, the magnitude of my answer appears to be vastly too large, and I'm also unsure how to fit a correct x/time axis to the convolution.
Additionally, the magnitude of the convolution seems to depend on the number of samples in the two pulses (essentially on the sampling frequency), which I would say is incorrect.
As I am attempting to model a continuous time signal, rather than discrete, I have set the sampling frequency very high.
Clearly I am doing something wrong - but what is it, and how do I correct it?
Thanks very much - and apologies if some of the code is not very "pythonic" (Recent Java convert)!
EDIT: While attempting to evaluate this by hand, I have found that the time axis is too small by a factor of 2; again, I don't know why this would be.
import numpy as np
import matplotlib.pyplot as plt
from sympy.functions.special import delta_functions as dlta
def stepFunction(t):  # create pulses below from step functions
    retval = 0
    if t == 0:
        retval = 1
    else:
        retval = dlta.Heaviside(t)
    return retval
def hT(t=0, start=0, dur=8, samples=1000):
    time = np.linspace(start, start + dur, samples, True)
    data = np.zeros(len(time))
    hTArray = np.column_stack((time, data))
    for row in range(len(hTArray)):
        hTArray[row][1] = 2 * (stepFunction(hTArray[row][0] - 4) - stepFunction(hTArray[row][0] - 6))
    return hTArray
def xT(t=0, start=0, dur=8, samples=1000):
    time = np.linspace(start, start + dur, samples, True)
    data = np.zeros(len(time))
    xTArray = np.column_stack((time, data))
    for row in range(len(xTArray)):
        xTArray[row][1] = (stepFunction(xTArray[row][0]) - stepFunction(xTArray[row][0] - 4))
    return xTArray
hTArray = hT() #populate two arrays with functions
xTArray = xT()
resCon = np.convolve(hTArray[:, 1], xTArray[:, 1])  # convolve the signals' array data
Xaxis = np.linspace(hTArray[0][0], hTArray[len(hTArray) - 1][0],
                    len(resCon), endpoint=True)  # create time axis with the same intervals as the original functions
#Plot the functions & convolution
plt.plot(hTArray[:, 0], hTArray[:, 1], label=r'$x1(t)$')
plt.plot(xTArray[:, 0], xTArray[:, 1], label=r'$x2(t)$')
plt.plot(Xaxis, resCon)
plt.legend(bbox_to_anchor=(0., 1.02, 1., .102), loc=3,
           ncol=2, mode="expand", borderaxespad=0.)
ax = plt.gca()
ax.grid(True)
ax.spines['right'].set_color('none')
ax.spines['top'].set_color('none')
ax.xaxis.set_ticks_position('bottom')
ax.spines['bottom'].set_position(('data', 0))
ax.yaxis.set_ticks_position('left')
ax.spines['left'].set_position(('data', 0))
plt.show()
When you convolve discretely sampled signals, you need to scale the result by the sampling interval so that the discrete sum approximates the continuous convolution integral:
import numpy as np
import matplotlib.pyplot as plt
n = 1000
t = np.linspace(0, 8, n)
T = t[1] - t[0] # sampling width
x1 = np.where(t<=4, 1, 0) # build input functions
x2 = np.where(np.logical_and(t>=4, t<=6), 2, 0)
y = np.convolve(x1, x2, mode='full') * T # scaled convolution
ty = np.linspace(0, 2*8, n*2-1) # double time interval
# plot results:
fg, ax = plt.subplots(1, 1)
ax.plot(t, x1, label="$x_1$")
ax.plot(t, x2, label="$x_2$")
ax.plot(ty, y, label="$x_1\\star x_2$")
ax.legend(loc='best')
ax.grid(True)
fg.canvas.draw()
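As a quick sanity check on the scaling (a minimal sketch, assuming the x1, x2, y and ty arrays defined above): the exact continuous convolution of these two pulses peaks at 1 * 2 * 2 = 4, reached while the width-2 pulse lies entirely inside the width-4 pulse, and the full convolution of two 8-second signals spans 16 seconds, which is why ty runs from 0 to 16.
# Verify the scaled result against the analytic answer (uses the arrays above)
print("peak value:", y.max())               # analytic peak is 1 * 2 * 2 = 4.0
print("peak time: ", ty[y.argmax()])        # flat top of the result starts at t = 6
print("length check:", len(ty) == len(y))   # full convolution has 2*n - 1 samples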
I'm trying to show my 2D data on a 3D space.
Here is my code below:
import numpy as np
import matplotlib.pyplot as plt
i = 60
n = 1000
r = 3.8
eps = 0.7
y = np.ones((n, i))
# random numbers on the first row of array x
np.random.seed(1)
x = np.ones((n+1, i))
x[0, :] = np.random.random(i)
def logistic(r, x):
    return r * x * (1 - x)
present_indi = np.arange(i)
next_indi = (present_indi + 1) % i
prev_indi = (present_indi - 1) % i
for n in range(1000):
    y[n, :] = logistic(r, x[n, :])
    x[n+1, :] = (1-eps)*y[n, present_indi] + 0.5*eps*(y[n, prev_indi] + y[n, next_indi])
#print(x)
# the above logic fills the 2D array 'x', with i columns and n+1 rows
fig, ax = plt.subplots()
for i in range(60):
    for n in range(1000):
        if n >= 900:
            ax.plot(i, x[n, i], '*k', ms=0.9)
plt.xlabel('i')
plt.ylabel('x')
plt.title('test')
plt.show()
The above code displays the i vs. x graph correctly. I have plotted all the elements of the 1st column of x, then all elements of the second column, then the third, and so on, using the nested for loops shown in the code.
Now what I need to do is extend the plot to 3D, i.e. use the x-axis for i, the y-axis for n, and the z-axis for the values in the array x.
I want to plot something like this:
I know I have to use mplot3D
But doing the following won't give me any result:
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
for i in range(60):
    for n in range(1000):
        if n >= 900:
            ax.plot_wireframe(i, n, x[n, i], rstride=1, cstride=1)
Plotting 3d images in matplotlib is a little tricky. Generally you plot whole surfaces at once instead of plotting one line at a time. You do so by passing three 2d arrays, one for each position dimension (x, y, z). But you can't just pass any old 2d arrays either; the points themselves have to be in a precise order!
Sometimes you can do something that just works, but I find it easier to explicitly parameterize plots using u and v dimensions. Here's what I was able to get working here:
# Abstract u and v parameters describing surface coordinates
u_plt = np.arange(x.shape[1])
v_plt = np.arange(x.shape[0])
# The outer products here produce 2d arrays. We multiply by
# ones in this case for an identity transformation, but in
# general, you could use any broadcasted operation on `u`
# and `v`.
x_plt = np.outer(np.ones(np.size(v_plt)), u_plt)
y_plt = np.outer(v_plt, np.ones(np.size(u_plt)))
# In this case, our `x` array gives the `z` values directly.
z_plt = x
fig = plt.figure(figsize=(16, 10))
ax = fig.add_subplot(111, projection='3d')
ax.set_zmargin(1) # Add a bit more space around the plot.
ax.plot_wireframe(x_plt, y_plt, z_plt,
                  rstride=1, cstride=1,  # "resolution" of the plot
                  color='blue', linewidth=1.0,
                  alpha=0.7, antialiased=True)
# Tilt the view to match the example.
ax.view_init(elev = 45, azim = -45)
plt.xlabel('i')
plt.ylabel('x')
plt.title('test')
plt.show()
And here's the resulting image. I had to reduce n to 80 to make this comprehensible at all, and I have no idea what I am looking at, so I am not sure it's correct. But I think it looks broadly similar to the example you gave.
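If the full 1000 x 60 array is too dense to read, one option is to subsample before plotting rather than shrinking n. Here is a minimal sketch of that (assuming the x array from the question and the u/v construction above); it keeps only the n >= 900 rows, mirroring the question's 2D plot:
# Plot only the asymptotic regime (rows 900 and up), same construction as above
u_plt = np.arange(x.shape[1])
v_plt = np.arange(900, x.shape[0])
x_plt = np.outer(np.ones(np.size(v_plt)), u_plt)
y_plt = np.outer(v_plt, np.ones(np.size(u_plt)))
z_plt = x[900:, :]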
Just to illustrate the power of this approach, here's a nautilus shell. It uses a two-stage parameterization, which could be compressed, but which I find conceptually clearer:
from matplotlib import cm  # needed for cm.inferno below

n_ticks = 100
# Abstract u and v parameters describing surface coordinates
u_plt = np.arange(n_ticks // 2) * 2
v_plt = np.arange(n_ticks)
# theta is the angle along the leading edge of the shell
# phi is the angle along the spiral of the shell
# r is the distance of the edge from the origin
theta_plt = np.pi * ((u_plt / n_ticks) * 0.99 + 0.005)
phi_plt = np.pi * v_plt / (n_ticks / 5)
r_plt = v_plt / (n_ticks / 5)
# These formulas are based on the formulas for rendering
# a sphere parameterized by theta and phi. The only difference
# is that r is variable here too.
x_plt = r_plt[:, None] * np.cos(phi_plt[:, None]) * np.sin(theta_plt[None, :])
y_plt = r_plt[:, None] * np.sin(phi_plt[:, None]) * np.sin(theta_plt[None, :])
z_plt = r_plt[:, None] * \
    (np.ones(np.shape(phi_plt[:, None])) * np.cos(theta_plt[None, :]))
# This varies the color along phi
colors = cm.inferno(1 - (v_plt[:, None] / max(v_plt))) * \
    np.ones(np.shape(u_plt[None, :, None]))
fig = plt.figure(figsize=(8, 8))
ax = fig.add_subplot(111, projection='3d')
ax.set_zmargin(1)
ax.plot_surface(x_plt, y_plt, z_plt,
                rstride=1, cstride=1,
                facecolors=colors, linewidth=1.0,
                alpha=0.3, antialiased=True)
ax.view_init(elev = 45, azim = -45)
plt.show()
I have a 3D polygon plot and want to smooth the plot on the y axis (i.e. I want it to look like 'slices of a surface plot').
Consider this MWE (taken from here):
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.collections import PolyCollection
import matplotlib.pyplot as plt
from matplotlib import colors as mcolors
import numpy as np
from scipy.stats import norm
fig = plt.figure()
ax = fig.add_subplot(projection='3d')  # fig.gca(projection=...) was removed in matplotlib 3.6
xs = np.arange(-10, 10, 2)
verts = []
zs = [0.0, 1.0, 2.0, 3.0]
for z in zs:
    ys = np.random.rand(len(xs))
    ys[0], ys[-1] = 0, 0
    verts.append(list(zip(xs, ys)))
poly = PolyCollection(verts, facecolors=[mcolors.to_rgba('r', alpha=0.6),
                                         mcolors.to_rgba('g', alpha=0.6),
                                         mcolors.to_rgba('b', alpha=0.6),
                                         mcolors.to_rgba('y', alpha=0.6)])
poly.set_alpha(0.7)
ax.add_collection3d(poly, zs=zs, zdir='y')
ax.set_xlabel('X')
ax.set_xlim3d(-10, 10)
ax.set_ylabel('Y')
ax.set_ylim3d(-1, 4)
ax.set_zlabel('Z')
ax.set_zlim3d(0, 1)
plt.show()
Now, I want to replace the four plots with normal distributions (to ideally form continuous lines).
I have created the distributions here:
def get_xs(lwr_bound=-4, upr_bound=4, n=80):
    """ generates the x space between lwr_bound and upr_bound so that it has n intermediary steps """
    xs = np.arange(lwr_bound, upr_bound, (upr_bound - lwr_bound) / n)  # x space -- number of points on l/r dimension
    return xs
xs = get_xs()
dists = [1, 2, 3, 4]
def get_distribution_params(list_):
    """ generates the distribution parameters (mu and sigma) for len(list_) distributions """
    mus = []
    sigmas = []
    for i in range(len(list_)):
        mus.append(round((i + 1) + 0.1 * np.random.randint(0, 10), 3))
        sigmas.append(round((i + 1) * .01 * np.random.randint(0, 10), 3))
    return mus, sigmas
mus, sigmas = get_distribution_params(dists)
def get_distributions(list_, xs, mus, sigmas):
    """ generates len(list_) normal distributions, with different mu and sigma values """
    distributions = []
    for i in range(len(list_)):
        x_ = xs
        z_ = norm.pdf(xs, loc = mus[i], scale = sigmas[0])
        distributions.append(list(zip(x_, z_)))
        #print(x_[60], z_[60])
    return distributions
distributions = get_distributions(list_ = dists, xs = xs, mus = mus, sigmas = sigmas)
But adding them to the code (with poly = PolyCollection(distributions, ...) and ax.add_collection3d(poly, zs=distributions, zdir='z')) throws a ValueError (ValueError: input operand has more dimensions than allowed by the axis remapping) that I cannot resolve.
The error is caused by passing distributions to zs, where zs expects that when verts in PolyCollection has shape MxNx2, the object passed to zs has shape (M,). So when it reaches this check
cpdef ndarray broadcast_to(ndarray array, shape):
    # ...
    if array.ndim < len(shape):
        raise ValueError(
            'input operand has more dimensions than allowed by the axis '
            'remapping')
    # ...
in the underlying numpy code, it fails. I believe this occurs because the number of dimensions expected (array.ndim) is less than the number of dimensions of zs (len(shape)). It is expecting an array of shape (4,) but receives an array of shape (4, 80, 2).
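To see the mismatch concretely, here is a minimal shape check (a sketch, assuming the distributions and dists variables from the question):
import numpy as np

# zs needs one scalar offset per polygon: shape (M,) for M polygons
print(np.shape(distributions))  # (4, 80, 2): 4 polygons, 80 vertices, (x, z) pairs
print(np.shape(dists))          # (4,): this is the shape zs expects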
This error can be resolved by using an array of the correct shape - e.g. zs from the original example or dists from your code. Using zs=dists and adjusting the axis limits to [0,5] for x, y, and z gives
This looks a bit odd for two reasons:
There is a typo in z_ = norm.pdf(xs, loc = mus[i], scale = sigmas[0]) which gives all the distributions the same sigma; it should be z_ = norm.pdf(xs, loc = mus[i], scale = sigmas[i]).
The viewing geometry: the distributions have the positive xz-plane as their base, which is also the plane we are looking through.
Changing the viewing geometry via ax.view_init will yield a clearer plot:
Edit
Here is the complete code which generates the plot shown,
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.collections import PolyCollection
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
import numpy as np
from scipy.stats import norm
np.random.seed(8)
def get_xs(lwr_bound=-4, upr_bound=4, n=80):
    return np.arange(lwr_bound, upr_bound, (upr_bound - lwr_bound) / n)

def get_distribution_params(list_):
    mus = [round((i+1) + 0.1 * np.random.randint(0, 10), 3) for i in range(len(list_))]
    sigmas = [round((i+1) * .01 * np.random.randint(0, 10), 3) for i in range(len(list_))]
    return mus, sigmas

def get_distributions(list_, xs, mus, sigmas):
    return [list(zip(xs, norm.pdf(xs, loc=mus[i], scale=sigmas[i] if sigmas[i] != 0.0
                                  else 0.1))) for i in range(len(list_))]
dists = [1, 2, 3, 4]
xs = get_xs()
mus, sigmas = get_distribution_params(dists)
distributions = get_distributions(dists, xs, mus, sigmas)
fc = [mcolors.to_rgba('r', alpha=0.6), mcolors.to_rgba('g', alpha=0.6),
      mcolors.to_rgba('b', alpha=0.6), mcolors.to_rgba('y', alpha=0.6)]
poly = PolyCollection(distributions, fc=fc)
fig = plt.figure()
ax = fig.add_subplot(projection='3d')  # fig.gca(projection=...) was removed in matplotlib 3.6
ax.add_collection3d(poly, zs=np.array(dists).astype(float), zdir='z')
ax.view_init(azim=115)
ax.set_zlim([0, 5])
ax.set_ylim([0, 5])
ax.set_xlim([0, 5])
I based it off the code you provide in the question, but made some modifications for brevity and to be more consistent with the usual styling.
Note - The example code you have given can fail depending on the np.random.seed(); in order to ensure it works I have added a check in the call to norm.pdf which ensures the scale is non-zero: scale = sigmas[i] if sigmas[i] != 0.0 else 0.1.
Using ax.add_collection3d(poly, zs=dists, zdir='z') instead of ax.add_collection3d(poly, zs=distributions, zdir='z') should fix the issue.
Additionally, you might want to replace
def get_xs(lwr_bound=-4, upr_bound=4, n=80):
    """ generates the x space between lwr_bound and upr_bound so that it has n intermediary steps """
    xs = np.arange(lwr_bound, upr_bound, (upr_bound - lwr_bound) / n)  # x space -- number of points on l/r dimension
    return xs
xs = get_xs()
by
xs = np.linspace(-4, 4, 80)
Also, I believe the scale = sigmas[0] should actually be scale = sigmas[i] in the line
z_ = norm.pdf(xs, loc = mus[i], scale = sigmas[0])
Finally, I believe you should adjust the xlim, ylim and zlim appropriately, as you swapped the y and z dimensions of the plot and changed its scales compared to the reference code.
I am interested in optimizing a function which is the convolution of two functions. The main problem is that my resulting function is completely off scale, and I do not understand what np.convolve actually does.
I wrote a small script that should convolve two Gaussians, but the resulting Gaussian is much larger in size than the input functions:
from scipy.interpolate import interp1d
import matplotlib.pyplot as plt
import numpy as np
# https://stackoverflow.com/questions/18088918/combining-two-gaussians-into-another-guassian
def gauss(x, p):  # p[0]==mean, p[1]==stdev, p[2]==height, p[3]==baseline
    a = p[2]
    mu = p[0]
    sig = p[1]
    #base = p[3]
    return a * np.exp(-1.0 * ((x - mu)**2.0) / (2.0 * sig**2.0))  #+ base
p0 = [0, 0.3, 1]  # initial guess is a normal distribution
p02 = [0, 0.2, 0.5]
xp = np.linspace(-4, 4, 2000)
convolved = np.convolve(gauss(xp, p0),gauss(xp, p02), mode="same")
fig = plt.figure()
plt.subplot(2, 1, 1)
plt.plot(xp, gauss(xp, p0), lw=3, alpha=0.8)  # alpha must lie in [0, 1]
plt.plot(xp, gauss(xp, p02), lw=3, alpha=0.8)
plt.xlim([-2, 2])
plt.subplot(2, 1, 2)
plt.plot(xp, gauss(xp, p0), lw=3, alpha=0.8)
plt.plot(xp, gauss(xp, p02), lw=3, alpha=0.8)
plt.plot(xp, convolved, lw=3, alpha=0.8, label="too damn high?")
plt.legend()
plt.xlim([-2, 2])
plt.tight_layout()
plt.show()
The resulting Gaussian after the convolution is much higher
than what I expected (Wikipedia):
You've got to renormalize by the dx between two x ticks.
NumPy substitutes a summation for the integration, but since the function only takes the y values, it knows nothing about the volume element on the integration axis, which you need to include manually.
I've had to deal with this problem as well; it is a pain when you start out doing stuff with dx = 1 and then all of a sudden get a wrong result because of a different x-axis spacing.
xp = np.linspace(-4, 4, 2000)
dx = xp[1] - xp[0]
convolved = np.convolve(gauss(xp, p0),gauss(xp, p02), mode="same") * dx
!!NOTE: don't put the renormalization inside the function definition. dx should be counted only once, because the integral turns into a single summation; if you put it inside the function it will actually be counted twice, because both Gaussians are generated using it.
PS: To understand this better, try generating the x-axis data with different spacings; without renormalizing, the height of your convolution will differ (the smaller the spacing, the greater the height), while with the renormalization the curves coincide:
fig = plt.figure()
ax = fig.add_subplot(111)
for spacing in (100, 500, 1000, 2000):
    spacing += 1
    xp = np.linspace(-4, 4, spacing)
    dx = xp[1] - xp[0]
    convolved = np.convolve(gauss(xp, p0), gauss(xp, p02), mode="same") * dx
    ax.plot(xp, convolved, lw=3, alpha=0.8, label="spacing = {:g}".format(8/spacing))
ax.set_title("Convolution with different x spacing. With renormalization")
fig.legend()
plt.show()
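As a cross-check against the closed form (a sketch, assuming the gauss, p0 and p02 definitions from the question): the convolution of two zero-mean Gaussians with amplitudes a1, a2 and widths sig1, sig2 is again a Gaussian, with width sqrt(sig1^2 + sig2^2) and peak a1 * a2 * sig1 * sig2 * sqrt(2*pi / (sig1^2 + sig2^2)).
import numpy as np

# Compare the properly scaled discrete convolution to the analytic peak
a1, sig1 = p0[2], p0[1]
a2, sig2 = p02[2], p02[1]
analytic_peak = a1 * a2 * sig1 * sig2 * np.sqrt(2 * np.pi / (sig1**2 + sig2**2))
xp = np.linspace(-4, 4, 2000)
dx = xp[1] - xp[0]
convolved = np.convolve(gauss(xp, p0), gauss(xp, p02), mode="same") * dx
print(analytic_peak)    # ~0.209
print(convolved.max())  # should agree to a few decimal places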
I have the data for a plot in two arrays that are stored in an unsorted way, so the plot jumps from one place to another discontinuously:
I have tried one example of finding the closest point in a 2D array:
import numpy as np
def distance(pt_1, pt_2):
    pt_1 = np.array((pt_1[0], pt_1[1]))
    pt_2 = np.array((pt_2[0], pt_2[1]))
    return np.linalg.norm(pt_1 - pt_2)

def closest_node(node, nodes):
    nodes = np.asarray(nodes)
    dist_2 = np.sum((nodes - node)**2, axis=1)
    return np.argmin(dist_2)

a = []
for x in range(50000):
    a.append((np.random.randint(0, 1000), np.random.randint(0, 1000)))
some_pt = (1, 2)
closest_node(some_pt, a)
Can I use it somehow to "clean" my data? (in the above code, a can be my data)
Exemplary data from my calculations is:
array([[ 2.08937872e+001, 1.99020033e+001, 2.28260611e+001,
6.27711094e+000, 3.30392288e+000, 1.30312878e+001,
8.80768833e+000, 1.31238275e+001, 1.57400130e+001,
5.00278061e+000, 1.70752624e+001, 1.79131456e+001,
1.50746185e+001, 2.50095731e+001, 2.15895974e+001,
1.23237801e+001, 1.14860312e+001, 1.44268222e+001,
6.37680265e+000, 7.81485403e+000],
[ -1.19702178e-001, -1.14050879e-001, -1.29711421e-001,
8.32977493e-001, 7.27437322e-001, 8.94389885e-001,
8.65931116e-001, -6.08199292e-002, -8.51922900e-002,
1.12333841e-001, -9.88131292e-324, 4.94065646e-324,
-9.88131292e-324, 4.94065646e-324, 4.94065646e-324,
0.00000000e+000, 0.00000000e+000, 0.00000000e+000,
-4.94065646e-324, 0.00000000e+000]])
After using radial_sort_line (from Joe Kington's answer below) I get the following plot:
This is actually a tougher problem than you might think, in general.
In your exact case, you might be able to get away with sorting by the y-values (a quick sketch of that idea follows); it's hard to tell for sure from the plot.
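The sort-by-y version would be just the following (a minimal sketch, assuming x and y are your data arrays):
import numpy as np

# Works only when y increases monotonically along the intended curve
idx = np.argsort(y)
x, y = x[idx], y[idx]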
A more robust approach for somewhat circular shapes like this is to do a radial sort.
For example, let's generate some data somewhat similar to yours:
import numpy as np
import matplotlib.pyplot as plt
t = np.linspace(.2, 1.6 * np.pi)
x, y = np.cos(t), np.sin(t)
# Shuffle the points...
i = np.arange(t.size)
np.random.shuffle(i)
x, y = x[i], y[i]
fig, ax = plt.subplots()
ax.plot(x, y, color='lightblue')
ax.margins(0.05)
plt.show()
Okay, now let's try to undo that shuffle by using a radial sort. We'll use the centroid of the points as the center and calculate the angle to each point, then sort by that angle:
x0, y0 = x.mean(), y.mean()
angle = np.arctan2(y - y0, x - x0)
idx = angle.argsort()
x, y = x[idx], y[idx]
fig, ax = plt.subplots()
ax.plot(x, y, color='lightblue')
ax.margins(0.05)
plt.show()
Okay, pretty close! If we were working with a closed polygon, we'd be done.
However, we have one problem -- this closes the wrong gap. We'd rather have the angle start at the position of the largest gap in the line.
Therefore, we'll need to calculate the gap to each adjacent point on our new line and re-do the sort based on a new starting angle:
dx = np.diff(np.append(x, x[-1]))
dy = np.diff(np.append(y, y[-1]))
max_gap = np.abs(np.hypot(dx, dy)).argmax() + 1
x = np.append(x[max_gap:], x[:max_gap])
y = np.append(y[max_gap:], y[:max_gap])
Which results in:
As a complete, stand-alone example:
import numpy as np
import matplotlib.pyplot as plt
def main():
    x, y = generate_data()
    plot(x, y).set(title='Original data')
    x, y = radial_sort_line(x, y)
    plot(x, y).set(title='Sorted data')
    plt.show()

def generate_data(num=50):
    t = np.linspace(.2, 1.6 * np.pi, num)
    x, y = np.cos(t), np.sin(t)
    # Shuffle the points...
    i = np.arange(t.size)
    np.random.shuffle(i)
    x, y = x[i], y[i]
    return x, y

def radial_sort_line(x, y):
    """Sort unordered verts of an unclosed line by angle from their center."""
    # Radial sort
    x0, y0 = x.mean(), y.mean()
    angle = np.arctan2(y - y0, x - x0)
    idx = angle.argsort()
    x, y = x[idx], y[idx]
    # Split at opening in line
    dx = np.diff(np.append(x, x[-1]))
    dy = np.diff(np.append(y, y[-1]))
    max_gap = np.abs(np.hypot(dx, dy)).argmax() + 1
    x = np.append(x[max_gap:], x[:max_gap])
    y = np.append(y[max_gap:], y[:max_gap])
    return x, y

def plot(x, y):
    fig, ax = plt.subplots()
    ax.plot(x, y, color='lightblue')
    ax.margins(0.05)
    return ax

main()
Sorting the data based on their angle relative to the center, as in @JoeKington's solution, might have problems with some parts of the data:
import scipy.spatial as ss
import matplotlib.pyplot as plt
import numpy as np
import re
%matplotlib inline
data=np.array([[ 2.08937872e+001, 1.99020033e+001, 2.28260611e+001,
6.27711094e+000, 3.30392288e+000, 1.30312878e+001,
8.80768833e+000, 1.31238275e+001, 1.57400130e+001,
5.00278061e+000, 1.70752624e+001, 1.79131456e+001,
1.50746185e+001, 2.50095731e+001, 2.15895974e+001,
1.23237801e+001, 1.14860312e+001, 1.44268222e+001,
6.37680265e+000, 7.81485403e+000],
[ -1.19702178e-001, -1.14050879e-001, -1.29711421e-001,
8.32977493e-001, 7.27437322e-001, 8.94389885e-001,
8.65931116e-001, -6.08199292e-002, -8.51922900e-002,
1.12333841e-001, -9.88131292e-324, 4.94065646e-324,
-9.88131292e-324, 4.94065646e-324, 4.94065646e-324,
0.00000000e+000, 0.00000000e+000, 0.00000000e+000,
-4.94065646e-324, 0.00000000e+000]])
plt.plot(data[0], data[1])
plt.title('Unsorted Data')
See how the x values between 15 and 20 are not sorted correctly.
# Calculate the angle in degrees in [0, 360]
sort_index = np.angle(np.dot((data.T - data.mean(1)), np.array([1.0, 1.0j])), deg=True)
sort_index = np.where(sort_index > 0, sort_index, sort_index + 360)
# Sort the data by angle and plot them
sort_index = sort_index.argsort()
plt.plot(data[0][sort_index], data[1][sort_index])
plt.title('Data sorted by angle relative to the centroid')
plt.plot(data[0], data[1], 'r+')
We can instead sort the data based on a nearest-neighbour approach, but since x and y are on very different scales, the choice of distance metric becomes an important issue. We will just try all the distance metrics available in scipy to get an idea:
def sort_dots(metrics, ax, start):
    dist_m = ss.distance.squareform(ss.distance.pdist(data.T, metrics))
    total_points = data.shape[1]
    points_index = set(range(total_points))
    sorted_index = []
    target = start
    ax.plot(data[0, target], data[1, target], 'o', markersize=16)
    points_index.discard(target)
    while len(points_index) > 0:
        candidate = list(points_index)
        nneigbour = candidate[dist_m[target, candidate].argmin()]
        points_index.discard(nneigbour)
        points_index.discard(target)
        #print points_index, target, nneigbour
        sorted_index.append(target)
        target = nneigbour
    sorted_index.append(target)
    ax.plot(data[0][sorted_index], data[1][sorted_index])
    ax.set_title(metrics)
dmetrics = re.findall(r"pdist\(X,\s+'(.*)'", ss.distance.pdist.__doc__)
f, axes = plt.subplots(4, 6, figsize=(16, 10), sharex=True, sharey=True)
axes = axes.ravel()
for metrics, ax in zip(dmetrics, axes):
    try:
        sort_dots(metrics, ax, 5)
    except:
        ax.set_title(metrics + ' (unsuitable)')
It looks like the standardized Euclidean and Mahalanobis metrics give the best results. Note that we chose the 6th data point (index 5) as the starting point; it is the data point with the largest y value (use argmax to get the index, of course).
f, axes = plt.subplots(4, 6, figsize=(16, 10), sharex=True, sharey=True)
axes = axes.ravel()
for metrics, ax in zip(dmetrics, axes):
    try:
        sort_dots(metrics, ax, 13)
    except:
        ax.set_title(metrics + ' (unsuitable)')
This is what happens if you choose the starting point with the maximum x value (index 13). It appears that the Mahalanobis metric is better than standardized Euclidean, as it is not affected by the starting point we choose.
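If you'd rather not hard-code the starting index, here is a minimal sketch (assuming the data array and sort_dots function above) that picks it with argmax, as mentioned:
# Choose starting points programmatically instead of hard-coding 5 or 13
start_y = data[1].argmax()  # index of the point with the largest y value
start_x = data[0].argmax()  # index of the point with the largest x value
f, axes = plt.subplots(1, 2, figsize=(10, 4), sharex=True, sharey=True)
sort_dots('mahalanobis', axes[0], start_y)
sort_dots('mahalanobis', axes[1], start_x)
plt.show()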
If we assume that the data are 2D and the x-axis values should be increasing, then you could:
sort the x-axis data, e.g. x_old, and store the result in a different variable, e.g. x_new,
for each element in x_new, find its index in the x_old array,
re-order the elements of the y-axis array according to the indices you got from the previous step.
I would do it with Python lists instead of numpy arrays, because the list.index method is easier to use than numpy.where.
E.g. (assuming that x_old and y_old are your previous numpy variables for the x and y axes respectively):
import numpy as np
x_new_tmp = x_old.tolist()
y_new_tmp = y_old.tolist()
x_new = sorted(x_new_tmp)
y_new = [y_new_tmp[x_new_tmp.index(i)] for i in x_new]
Then you can plot x_new and y_new.
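For reference, a purely-numpy sketch of the same idea (assuming x_old and y_old as above) uses np.argsort, which also avoids the pitfall that list.index returns the first match when x contains duplicate values:
import numpy as np

# argsort gives the indices that would sort x_old; applying them to both
# arrays reorders the (x, y) points consistently
idx = np.argsort(x_old)
x_new = x_old[idx]
y_new = y_old[idx]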
I have two arrays. One is the raw signal of length (1000,) and the other is the smoothed signal of length (100,). I want to visually represent how well the smoothed signal represents the raw signal. Since these arrays are of different lengths, I am not able to plot them one over the other. Is there a way to do so in matplotlib?
Thanks!
As rth suggested, define
x1 = np.linspace(0, 1, 1000)
x2 = np.linspace(0, 1, 100)
and then plot raw versus x1, and smooth versus x2:
plt.plot(x1, raw)
plt.plot(x2, smooth)
np.linspace(0, 1, N) returns an array of length N with equally spaced values from 0 to 1 (inclusive).
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(2015)
raw = (np.random.random(1000) - 0.5).cumsum()
smooth = raw.reshape(-1,10).mean(axis=1)
x1 = np.linspace(0, 1, 1000)
x2 = np.linspace(0, 1, 100)
plt.plot(x1, raw)
plt.plot(x2, smooth)
plt.show()
yields
You will need two different x-axes for this job; you cannot plot two variables of different lengths against one single x array.
import matplotlib.pyplot as plt
import numpy as np
y = np.random.random(100)       # the smooth signal
x = np.linspace(0, 100, 100)    # its x-axis
y1 = np.random.random(1000)     # the raw signal
x1 = np.linspace(0, 100, 1000)  # its x-axis
fig = plt.figure()
ax = fig.add_subplot(121)
ax.plot(x,y,label='smooth-signal')
ax.legend(loc='best')
ax2 = fig.add_subplot(122)
ax2.plot(x1,y1,label='raw-signal')
ax2.legend(loc='best')
plt.suptitle('Smooth-vs-raw signal')
fig.show()