I am wondering whether I can plot a graph in which I show a range of best and worst results using matplotlib. The result should look something like this:
Image of the graph I want to replicate here.
You see the ranges around each point that specify what the best and worst measure is? This is exactly what I am looking for.
I'm pretty sure the errorbar function does exactly what you want:
https://matplotlib.org/3.5.0/api/_as_gen/matplotlib.pyplot.errorbar.html
import numpy as np
import matplotlib.pyplot as plt
x = np.arange(10)
y = np.arange(10)
# yerr can be a single number or an array with same length as x and y
# depending on whether you want it to be constant or changing
yerr = 1
plt.errorbar(x, y, yerr=yerr)
plt.show()
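Since the question asks for a best/worst range rather than a symmetric error, it may help to know that yerr also accepts a 2×N array: the first row holds the distance below each point, the second row the distance above. A minimal sketch, where the mean, worst and best arrays are made-up stand-ins for real results:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical measurements: a central value plus the worst and best result per point
x = np.arange(10)
mean = np.linspace(1.0, 5.5, 10)
worst = mean - np.linspace(0.2, 1.0, 10)   # lowest value observed at each x
best = mean + np.linspace(0.5, 0.3, 10)    # highest value observed at each x

# errorbar expects non-negative distances from the plotted point,
# stacked as [lower distances, upper distances]
yerr = np.vstack([mean - worst, best - mean])

fig, ax = plt.subplots()
ax.errorbar(x, mean, yerr=yerr, fmt="o-", capsize=3)
ax.set_xlabel("x")
ax.set_ylabel("result")
```

Call plt.show() afterwards to display the figure.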
I have a function and a histogram, plotted like this:
import matplotlib.pyplot as plt
import numpy as np
lin = np.linspace(min(foo), max(foo), len(foo))
plt.plot(lin, bar)
plt.hist(bar, density=True, bins=100, histtype='stepfilled', alpha=0.2)
plt.show()
Where foo and bar are simple arrays.
However, I would like the whole thing drawn vertically. I could add orientation='horizontal' to the histogram, but that would not change the function (and from what I have seen, there is nothing similar for plot; obviously it would then no longer be a function, but a curve). Alternatively, I could add plt.gca().invert_yaxis() somewhere, but the same problem remains: plot is meant for functions, so inverting the axis gives... well, this:
So the only way I have now is to manually rotate the whole original picture by 90 degrees, but then the axes are rotated too and are no longer on the left and bottom (obviously).
So, do you have another idea? Maybe I should try something other than plt.plot?
EDIT: In the end, I would like something like the image below, but with the axes placed correctly.
If you have a plot of y vs x, you can swap axes by swapping arrays:
plt.plot(bar, lin)
There's no special feature because it's supported out of the box. As you've discovered, plotting a transposed histogram can be accomplished by passing in
orientation='horizontal'
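Putting the two pieces together might look like the sketch below. The arrays samples, grid and pdf are hypothetical stand-ins for the question's foo and bar, which are not shown in full:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical stand-ins for the question's arrays
rng = np.random.default_rng(0)
samples = rng.normal(size=2000)                      # data for the histogram
grid = np.linspace(samples.min(), samples.max(), 200)
pdf = np.exp(-grid**2 / 2) / np.sqrt(2 * np.pi)      # standard normal density

fig, ax = plt.subplots()
# Histogram bars grow along x, the data values run along y
counts, edges, _ = ax.hist(samples, bins=50, density=True,
                           orientation="horizontal",
                           histtype="stepfilled", alpha=0.2)
# Same idea for the curve: pass the function values as x and the grid as y
ax.plot(pdf, grid)
ax.set_xlabel("density")
ax.set_ylabel("value")
```

The axes stay on the left and bottom because only the data is transposed, not the figure.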
I couldn't find any matplotlib method dealing with the issue. You can rotate the curve in a purely mathematical way, i.e. do it through the rotation matrix. In this simple case it is sufficient to just exchange variables x and y but in general it looks like this (let's take a parabola for a clear example):
import numpy as np
import matplotlib.pyplot as plt
rotation = lambda angle: np.array([[np.cos(angle), -np.sin(angle)],
                                   [np.sin(angle),  np.cos(angle)]])
x = np.linspace(-10,10,1000)
y = -x**2
matrix = np.vstack([x,y]).T
# rotate by 90 degrees: left panel rotated, right panel original
rotated_matrix = matrix @ rotation(np.deg2rad(90))
fig, ax = plt.subplots(1,2)
ax[0].plot(rotated_matrix[:,0], rotated_matrix[:,1])
ax[1].plot(x,y)
# rotate by -45 degrees
rotated_matrix = matrix @ rotation(np.deg2rad(-45))
fig, ax = plt.subplots(1,2)
ax[0].plot(rotated_matrix[:,0], rotated_matrix[:,1])
ax[1].plot(x,y)
plt.show()
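The claim that a 90-degree rotation amounts to exchanging x and y (up to a sign) can be checked numerically. With the points stored as rows and multiplied by the matrix on the right, (x, y) is mapped to (y, -x), which is a clockwise quarter turn. A small sketch, with the rotation helper written as a regular function:

```python
import numpy as np

def rotation(angle):
    """2x2 rotation matrix for an angle in radians."""
    return np.array([[np.cos(angle), -np.sin(angle)],
                     [np.sin(angle),  np.cos(angle)]])

x = np.linspace(-10, 10, 1000)
y = -x**2
points = np.vstack([x, y]).T                  # shape (1000, 2), one point per row

rotated = points @ rotation(np.deg2rad(90))   # row vectors times R

# Row-vector multiplication by R(90 deg) sends (x, y) to (y, -x)
swapped = np.column_stack([y, -x])
print(np.allclose(rotated, swapped))
```

To rotate counter-clockwise instead, multiply by the transpose of the matrix, or pass the negated angle.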
Let's say I want to visualize the functions f[n] = e^{-(x-n)^2}/n for n=1...10. Notice that these are not probability distributions.
(not actually the plot I want to do, but close enough).
I'd like to demonstrate it with something like a violin-plot (https://matplotlib.org/gallery/statistics/violinplot.html) where for each n I have a vertical line and I plot the function on both sides of the vertical line.
But violin plots seem to only be used for showing the locations of a sample of data. So all the tools for it require me to give it a data set. The data I want to plot isn't of that type - it's an actual known function.
[if you want more context this is related to an earlier question of mine - https://stats.stackexchange.com/questions/403359/visualizing-2d-data-when-one-dimension-is-discrete-and-the-other-continuous].
The question is a bit broad, so maybe this is not actually what you're looking for. But as I understand it, you just want to plot your function f(x,n) at different positions n, with x on the vertical axis.
import numpy as np
import matplotlib.pyplot as plt
f = lambda x, n: np.exp(-(x-n)**2)/n
x = np.linspace(-2,12,101)
ns = np.arange(1,11)
for n in ns:
    plt.fill_betweenx(x, -f(x,n)+n, f(x,n)+n, color="C0", alpha=0.5)
plt.xlabel("n")
plt.ylabel("x")
plt.xticks(ns)
plt.show()
IIUC, you want something like this:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({n: [np.exp(-(x-n)**2)/n for x in np.arange(-1,1,0.1)] for n in range(1,11)})
fig, ax = plt.subplots(1,1, figsize=(10,10))
ax.violinplot(df.T)
plt.show()
Output:
EDIT: I responded in the comments, but I've tried the method in the marked post - my z data is not calculated from my x and y, so I can't use a function like that.
I have xyz data that looks like the below:
NEW: the xyz data in the file I produce - I extract these as x, y, z
And am desperately trying to get a plot that has x against y with z as the colour.
y is binned data that goes from (for instance) 2.5 to 0.5 in uneven bins. So the y values are all the same for one set of x and z data. The x data is temperature and the z is density info.
So I'm expecting a plot that looks like a bunch of stacked rectangles where there is a gradient of colour for one bin of y values which spans lots of x values.
However all the codes I've tried don't like my z values and the best I can do is:
The axes look right but the colour bar goes from the bottom to the top of the y axis instead of plotting one z value for each x value at the correct y value
I got this to work with this code:
import pandas
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from matplotlib.colors import LogNorm
import numpy as np
import scipy.interpolate
data=pandas.read_csv('Data.csv',delimiter=',', header=0,index_col=False)
x=data.tempbin
y=data.sizefracbin
z=data.den
x=x.values
y=y.values
z=z.values
X,Y=np.meshgrid(x,y)
Z=[]
for i in range(len(x)):
    Z.append(z)
Z=np.array(Z)
plt.pcolormesh(X,Y,Z)
plt.colorbar()
plt.show()
I've tried everything I could find online such as in the post here: matplotlib 2D plot from x,y,z values
But either there is a problem reshaping my z values or it just gives me empty plots with various errors all to do (I think) with my z values.
Am I missing something? Thank you for your help!
Edit in response to ImportanceOfBeingErnest:
I tried this :
import pandas
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from matplotlib.colors import LogNorm
import numpy as np
import scipy.interpolate
data=pandas.read_csv('Data.csv',delimiter=',', header=0,index_col=False)
data.sort_values('sizefrac')
x=data.tempbin
y=data.sizefrac
z=data.INP
x=x.values
y=y.values
z=z.values
X=x[1:].reshape(N,N)
Y=y[1:].reshape(N,N)
Z=z[1:].reshape(N,N)
plt.pcolormesh(X,Y,Z)
plt.colorbar()
plt.show()
and got a very empty plot. It just showed me the axes and colourbar as in my attached image, but pure white inside the axes! No error or anything...
As for the reshaping, I need to remove one data point from each array, because otherwise the reshape won't work.
Adapting the linked question to your problem, you should get:
import numpy as np
import matplotlib.pyplot as plt
x = list(range(10))*10
y = np.repeat(list(range(10)), 10)
# build example z data (here simply the product x*y)
z = np.multiply(x, y)
N = int(len(z)**.5)
Z = z.reshape(N, N)
plt.imshow(Z[::-1], extent=(np.amin(x), np.amax(x), np.amin(y), np.amax(y)), aspect = 'auto')
plt.show()
The answer was found by Silmathoron in a comment on his answer above. The answer itself did not help, but in the comments he noticed that the X,Y data was not gridded in a way that would create rectangles on the plot, and also mentioned that Z needed to be one smaller than X and Y. From this I could fix my code. Thanks all!
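To illustrate that fix: pcolormesh draws one rectangle per Z value when X and Y hold the bin edges, so each must be one element longer than the matching dimension of Z. A minimal sketch with invented temperature and size-fraction edges (none of these numbers come from the original data file):

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical bin edges: 6 temperature edges -> 5 columns,
# 5 uneven size-fraction edges -> 4 rows
temp_edges = np.array([-30.0, -25.0, -20.0, -15.0, -10.0, -5.0])
size_edges = np.array([0.5, 0.8, 1.2, 1.8, 2.5])

# One density value per (size bin, temperature bin) cell
rng = np.random.default_rng(1)
Z = rng.random((len(size_edges) - 1, len(temp_edges) - 1))

fig, ax = plt.subplots()
mesh = ax.pcolormesh(temp_edges, size_edges, Z, shading="flat")
fig.colorbar(mesh, ax=ax, label="density")
ax.set_xlabel("temperature")
ax.set_ylabel("size fraction")
```

With real data, Z would be built by placing each density value in the cell of its (size bin, temperature bin) pair, for example with a pivot of the table, rather than by repeating the z column.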
I am trying to figure out how to make a 3d figure of uni-variate kdensity plots as they change over time (since they pull from a sliding time window of data over time).
Since I can't figure out how to do that directly, I am first trying to get the x,y plotting data for kdensity plots of matplotlib in python. I hope after I extract them I can use them along with a time variable to make a three dimensional plot.
I see several posts telling how to do this in Matlab. All reference getting Xdata and Ydata from the underlying figure:
x=get(h,'Xdata')
y=get(h,'Ydata')
How about in python?
The answer was already contained in another thread (How to create a density plot in matplotlib?). It is pretty easy to get a set of kdensity x's and y's from a set of data.
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import gaussian_kde
data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8 # data is a set of univariate data
xs = np.linspace(0,max(data),200) # This 200 sets the # of x (and so also y) points of the kdensity plot
density = gaussian_kde(data)
density.covariance_factor = lambda : .25
density._compute_covariance()
ys = density(xs)
plt.plot(xs,ys)
And there you have it. Both the kdensity plot and its underlying x,y data.
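Once the x,y data is available, the original goal (densities over a sliding time window, shown in 3D) could be approached roughly as follows. This is only a sketch: the drifting series, the window length and the step are all invented for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

# Hypothetical time series whose mean drifts from 0 to 3
rng = np.random.default_rng(0)
series = rng.normal(loc=np.linspace(0, 3, 500), scale=1.0)

window, step = 100, 50                          # assumed window length and stride
xs = np.linspace(series.min(), series.max(), 200)
starts = np.arange(0, len(series) - window + 1, step)

# One KDE per window, all evaluated on the same x grid
densities = np.array([gaussian_kde(series[s:s + window])(xs) for s in starts])

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
for t, ys in zip(starts, densities):
    # x: value, y: window start (time), z: estimated density
    ax.plot(xs, np.full_like(xs, t), ys)
ax.set_xlabel("value")
ax.set_ylabel("window start")
ax.set_zlabel("density")
```

Because every window shares the same xs grid, densities is a regular 2D array and could also be passed to plot_surface or pcolormesh.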
Not sure how kdensity plots work, but note that matplotlib.pyplot.plot returns a list of the added Line2D objects, which are, in fact, where the X and Y data are stored. I suspect they did that to make it work similarly to MATLAB.
import matplotlib.pyplot as plt
h = plt.plot([1,2,3],[2,4,6]) # [<matplotlib.lines.Line2D object at 0x021DA9F0>]
x = h[0].get_xdata() # [1,2,3]
y = h[0].get_ydata() # [2,4,6]
I'm new to Python and having some trouble with matplotlib. I currently have data that is contained in two numpy arrays, call them x and y, that I am plotting on a scatter plot with coordinates for each point (x, y) (i.e. I have points x[0], y[0] and x[1], y[1] and so on on my plot). I have been using the following code segment to color the points in my scatter plot based on the spatial density of nearby points (found this on another Stack Overflow post):
http://prntscr.com/abqowk
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde
x = np.random.normal(size=1000)
y = x*3 + np.random.normal(size=1000)
xy = np.vstack([x,y])
z = gaussian_kde(xy)(xy)
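# sort the points by density so that the densest points are plotted last
idx = z.argsort()
x, y, z = x[idx], y[idx], z[idx]
fig,ax = plt.subplots()
ax.scatter(x,y,c=z,s=50,edgecolor='none')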
plt.show()
Output:
I've been using it without being sure exactly how it works (namely the point density calculation - if someone could explain how exactly that works, would also be much appreciated).
However, now I'd like to color code by the ratio of the spatial density of points in x,y to that of the spatial density of points in another set of numpy arrays, call them x2, y2. That is, I would like to make a plot such that I can identify how the density of points in x,y compares to the points in x2,y2 on the same scatter plot. Could someone please explain how I could go about doing this?
Thanks in advance for your help!
I've been trying to do the same thing based on that same earlier post, and I think I just figured it out! The trick is to use matplotlib.colors.Normalize() to define a scale and then weight it according to some data set (xnorm,ynorm):
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors as mplc
import matplotlib.cm as cm
from scipy.stats import gaussian_kde
def kdeplot(x,y,xnorm,ynorm):
    xy = np.vstack([x,y])
    z = gaussian_kde(xy)(xy)
    # weight: ratio of the number of points in the two data sets
    wt = 1.0*len(x)/(len(xnorm)*1.0)
    norm = mplc.Normalize(vmin=0, vmax=8/wt)
    cmap = cm.gnuplot
    # sort by density so the densest points are drawn last
    idx = z.argsort()
    x, y, z = x[idx], y[idx], z[idx]
    args = (x,y)
    kwargs = {'c':z,'s':10,'edgecolor':'none','cmap':cmap,'norm':norm}
    return args, kwargs
# (x1,y1) is some data set whose density map coloring you
# want to scale to (xnorm,ynorm)
args,kwargs = kdeplot(x1,y1,xnorm,ynorm)
plt.scatter(*args,**kwargs)
I used trial and error to optimize my normalization for my particular data and choice of colormap. Here's what my data looks like scaled to itself; here's my data scaled to some comparison data (which is on the bottom of that image).
I'm not sure this method is entirely general, but it works in my case: I know that my data and the comparison data are in similar regions of parameter space, and they both have gaussian scatter, so I can use a naive linear scaling determined by the number of data points and it results in something that gives the right idea visually.
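A different route to the ratio asked about in the question, not the normalization trick described above, would be to fit one KDE per data set and evaluate both at the same points. The sketch below assumes two invented data sets; gaussian_kde places a Gaussian on every sample and sums them, and calling the fitted object evaluates that smooth estimate at the given coordinates.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
# Hypothetical data sets; replace with your own arrays
x = rng.normal(size=500)
y = 3 * x + rng.normal(size=500)
x2 = rng.normal(size=500)
y2 = 3 * x2 + 2 * rng.normal(size=500)

xy = np.vstack([x, y])        # shape (2, N): one column per point
xy2 = np.vstack([x2, y2])

dens1 = gaussian_kde(xy)(xy)      # density of set 1 at the points of set 1
dens2 = gaussian_kde(xy2)(xy)     # density of set 2 at the same points

# A log scale keeps ratios above and below 1 visually symmetric
log_ratio = np.log10(dens1 / dens2)

order = log_ratio.argsort()       # draw the highest ratios last
fig, ax = plt.subplots()
sc = ax.scatter(x[order], y[order], c=log_ratio[order], s=10, cmap="coolwarm")
fig.colorbar(sc, ax=ax, label="log10(density ratio)")
```

Values above zero mark regions where the first set is denser than the second. If the two sets barely overlap, dens2 can get very close to zero and the ratio becomes unstable, so this works best for data in similar regions of parameter space.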