non equally spaced points along x-axis in a plot - python

When I want to plot a curve f(x) with pyplot, what I usually do is to create a vector X with all the x-values equally spaced:
import numpy as np
X=np.linspace(0.,1.,100)
then I create the function
def f(x):
return x**2
and then I make the plot
from matplotlib import pyplot as plt
plt.plot(X,f(X))
plt.show()
However, in some cases I might want the x-values not to be equally spaced, when the function is very stiff in some regions and very smooth in others.
What is the correct way to properly choose the best X vector for the function I want to plot?

In its generality there is not definitive answer to this. But you can of course always choose the complete range with the required density,
X = np.linspace(0.,1., 6000)
or you can decide for some intervals and set the density differently for those
x1 = np.linspace(0.0,0.5, 60)
x2 = np.linspace(0.5,0.6, 5000)
x3 = np.linspace(0.6,1.0, 10)
X = np.concatenate((x1, x2, x3))

Related

Plot stochastic trajectories deviations from 'real' path using a colormesh in matplotlib (Python)

Hi I created a program that will create deviations from a real trajectory, it is complicated and I do not have a simple example unfortunately.
It calculates a path with stochastic initial conditions from the real path and does this for x iterations, the goal is to show that the deviations become larger at greater times.
The real path and the deviations are showed below.
However I want to show that the deviations become greater the longer in time we are. Ofcourse I could just calculate the variance and plot mean+var and mean-var at each time step but I was wondering if I could plot something like this, using hist2d
You see that the blocks are not as smooth as a like and this is not that great to use.
Then I went and looked at python's kde and created the following.
This is also not preferable as I think it bins more points at the minima and maxima. Also it is 'too smeared out'. Especially in the beginning, all the points are the same so I want there just to be a straight line to really show that the deviations start later on.
I guess my question is; is what I want even possible and what package/command should I use. I haven't found what I am looking for on other questions. Or has anyone a suggestion to nicely show what I want in a any other way?
Here is an idea plotting multiple curves with transparency on top of each other:
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 10, 200)
for _ in range(1000):
plt.plot(x, np.sin(x * np.random.normal(1, 0.1)) * np.random.normal(1, 0.1), color='r', alpha=0.02)
plt.plot(x, np.sin(x), color='b')
plt.margins(x=0)
plt.show()
Another option creates a 2d histogram:
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 10, 200)
all_curves = np.array([np.sin(x * np.random.normal(1, 0.1)) * np.random.normal(1, 0.1) for _ in range(100)])
plt.hist2d(x=np.tile(x, all_curves.shape[0]), y=all_curves.ravel(), bins=(100, 100), cmap='inferno')
plt.show()
Still another approach would use fill_between (as suggested by #bramb) between confidence intervals:
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 10, 200)
all_curves = np.array([np.sin(x * np.random.normal(1, 0.1)) * np.random.normal(1, 0.1) for _ in range(1000)])
confidence_interval1 = 95
confidence_interval2 = 80
confidence_interval3 = 50
for ci in [confidence_interval1, confidence_interval2, confidence_interval3]:
low = np.percentile(all_curves, 50 - ci / 2, axis=0)
high = np.percentile(all_curves, 50 + ci / 2, axis=0)
plt.fill_between(x, low, high, color='r', alpha=0.2)
plt.plot(x, np.sin(x), color='b')
plt.margins(x=0)
plt.show()
You could use something like the matplotlib.pyplot.fill_between method. It fills everything between y1 (max) and y2 (min) for a given (common) x array. You would then be able to accentuate that the filled region keeps enlarging with increasing x value.
However, this would require you to find the minimal and maximal value of your deviations at each time point and save these to two separate arrays. The exact method of doing this will depend on how you are storing these individual runs.
In case they are separate lists / arrays, you can convert these to a numpy matrix / pandas dataframe and use the minimum / maximum methods along the relevant axis.

Plot and function with three variables in python

An equation which is represent as below
sin(x)*sin(y)*sin(z)+cos(x)*sin(y)*cos(z)=0
I know the code to plot function for z=f(x,y) using matplotlib but to plot above function I don’t know the code, but I tried MATLAB MuPad code which is as follows
Plot(sin(x)*sin(y)*sin(z)+cos(x)*sin(y)*cos(z),#3d)
This will be much easier if you can isolate z. Your equation is the same as sin(z)/cos(z) = -cos(x)*sin(y)/(sin(x)*sin(y)) so z = atan(-cos(x)*sin(y)/(sin(x)*sin(y))).
Please don't mistake me, but I think your given equation to plot can be reduced to a simple 2D plot.
sin(x)*sin(y)*sin(z)+cos(x)*sin(y)*cos(z) = 0
sin(y)[sin(x)*sin(z)+cos(x)*cos(z)] = 0
sin(y)*cos(x-z) = 0
Hence sin(y) = 0 or cos(x-z)=0
Hence y = n*pi (1) or x-z=(2*n + 1)pi/2
Implies, x = z + (2*n + 1)pi/2 (2)
For (1), it will be a straight line (the plot of y vs n) and in second case, you will get parallel lines which cuts x-axis at (2*n + 1)pi/2 and distance between two parallel lines would be pi. (Assuming you keep n constant).
Assuming, your y can't be zero, you could simplify the plot to a 2D plot with just x and z.
And answering your original question, you need to use mplot3d to plot 3D plots. But as with any graphing tool, you need values or points of x, y, z. (You can compute the possible points by programming). Then you feed those points to the plot, like below.
from mpl_toolkits import mplot3d
import numpy as np
import matplotlib.pyplot as plt
fig = plt.figure()
ax = plt.axes(projection="3d")
xs = [] # X values
ys = [] # Y values
zs = [] # Z values
ax.plot3D(xs, ys, zs)
plt.show()

Matplotlib: Coloring scatter plot by density relative to another data set

I'm new to Python and having some trouble with matplotlib. I currently have data that is contained in two numpy arrays, call them x and y, that I am plotting on a scatter plot with coordinates for each point (x, y) (i.e I have points x[0], y[0] and x1, y1 and so on on my plot). I have been using the following code segment to color the points in my scatter plot based on the spatial density of nearby points (found this on another stackoverflow post):
http://prntscr.com/abqowk
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde
x = np.random.normal(size=1000)
y = x*3 + np.random.normal(size=1000)
xy = np.vstack([x,y])
z = gaussian_kde(xy)(xy)
idx = z.argsort()
fig,ax = plt.subplots()
ax.scatter(x,y,c=z,s=50,edgecolor='')
plt.show()
Output:
I've been using it without being sure exactly how it works (namely the point density calculation - if someone could explain how exactly that works, would also be much appreciated).
However, now I'd like to color code by the ratio of the spatial density of points in x,y to that of the spatial density of points in another set of numpy arrays, call them x2, y2. That is, I would like to make a plot such that I can identify how the density of points in x,y compares to the points in x2,y2 on the same scatter plot. Could someone please explain how I could go about doing this?
Thanks in advance for your help!
I've been trying to do the same thing based on that same earlier post, and I think I just figured it out! The trick is to use matplotlib.colors.Normalize() to define a scale and then weight it according to some data set (xnorm,ynorm):
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors as mplc
import matplotlib.cm as cm
from scipy.stats import gaussian_kde
def kdeplot(x,y,xnorm,ynorm):
xy = np.vstack([x,y])
z = gaussian_kde(xy)(xy)
wt = 1.0*len(x)/(len(xnorm)*1.0)
norm = mplc.Normalize(vmin=0, vmax=8/wt)
cmap = cm.gnuplot
idx = z.argsort()
x, y, z = x[idx], y[idx], z[idx]
args = (x,y)
kwargs = {'c':z,'s':10,'edgecolor':'','cmap':cmap,'norm':norm}
return args, kwargs
# (x1,y1) is some data set whose density map coloring you
# want to scale to (xnorm,ynorm)
args,kwargs = kdeplot(x1,y1,xnorm,ynorm)
plt.scatter(*args,**kwargs)
I used trial and error to optimize my normalization for my particular data and choice of colormap. Here's what my data looks like scaled to itself; here's my data scaled to some comparison data (which is on the bottom of that image).
I'm not sure this method is entirely general, but it works in my case: I know that my data and the comparison data are in similar regions of parameter space, and they both have gaussian scatter, so I can use a naive linear scaling determined by the number of data points and it results in something that gives the right idea visually.

How to draw enveloping line with a shaded area which incorporates a large number of data points?

For the figure above, how can I draw an enveloping line with a shaded area, similar to the figure below?
Replicating your example is easy because it's possible to calculate the min and max at each x and fill between them. eg.
import matplotlib.pyplot as plt
import numpy as np
#dummy data
y = [range(20) + 3 * i for i in np.random.randn(3, 20)]
x = list(range(20))
#calculate the min and max series for each x
min_ser = [min(i) for i in np.transpose(y)]
max_ser = [max(i) for i in np.transpose(y)]
#initial plot
fig, axs = plt.subplots()
axs.plot(x, x)
for s in y:
axs.scatter(x, s)
#plot the min and max series over the top
axs.fill_between(x, min_ser, max_ser, alpha=0.2)
giving
For your displayed data, that might prove problematic because the series do not share x values in all cases. If that's the case then you need some statistical technique to smooth the series somehow. One option is to use a package like seaborn, which provides functions to handle all the details for you.

Normalize a multiple data histogram

I have several arrays that I'm plotting a histogram of, like so:
import numpy as np
import matplotlib.pyplot as plt
x = np.random.normal(0,.5,1000)
y = np.random.normal(0,.5,100000)
plt.hist((x,y),normed=True)
Of course, however, this normalizes both of the arrays individually, so that they both have the same peak. I'm looking to normalize them to the total number of elements, so that the histogram of y will be visibly taller than that of x. Is there a handy way to do this in matplotlib or will I have to mess around in numpy? I haven't found anything about it.
Another way to put it is that if I were instead to make a cumulative plot of the two arrays, they shouldn't both top out at 1, but should add to 1.
Yes, you can compute the histogram with numpy and renormalise it.
x = np.random.normal(0,.5,1000)
y = np.random.normal(0,.5,100000)
xhist, xbins = np.histogram(x, normed=True)
yhist, ybins = np.histogram(x, normed=True)
And now, you apply your regularisation. For example, if you want x to be normalised to 1 and y proportional:
yhist *= len(y) / len(x)
Now, to plot the histogram:
def plot_histogram(data, edge_bins, **kwargs):
bins = edge_bins[:-1] + edge_bins[1:]
plt.step(bins, data, **kwargs)
plot_histogram(xhist, xbins, c='b')
plot_histogram(yhist, ybins, c='g')

Categories