I want to plot a 3-D function z=f(x,y) in contourf fashion, while highlighting some levels where z is constant with black lines and custom labels. The x,y,z data points are contained in a *.csv file which I manipulate with pandas. The function f(x,y) (and hence the z-points) scales as f(x,y) proportional to 1/(xy).
This is the result I got with my code
As you can see, the colors do not scale well. All regions above 99% (which take up the vast majority of the image space) are the same color, whereas I'd like a different shade for every "section" (i.e. one shade between 98% and 98.5%, another one between 98.5% and 99%, etc). The colorbar is also weird, and I guess this is due to the wrong scaling in the image to begin with. How do I obtain the wanted result? Here is the code I'm using as of now (it should be plug-and-play).
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import rcParams
rcParams.update({
"text.usetex": True,
"font.family": "sans-serif",
"font.sans-serif": "Helvetica",
})
rcParams['text.latex.preamble'] = r'\usepackage{amsmath}' #for \text command
# This custom formatter removes trailing zeros, e.g. "1.0" becomes "1", and
# then adds a percent sign.
def fmt(x):
    s = f"{x:.2f}"
    if s.endswith("0"):
        s = f"{x:.1f}"
    if s.endswith("0"):
        s = f"{x:.0f}"
    return rf"{s} \%" if plt.rcParams["text.usetex"] else f"{s} %"
contour_data = pd.read_csv('MyData.csv', header=None, names=['x','y','z'])
Z = contour_data.pivot_table(index='x', columns='y', values='z', dropna=False).T.values
X_unique = np.sort(contour_data.x.unique())
Y_unique = np.sort(contour_data.y.unique())
X, Y = np.meshgrid(X_unique, Y_unique)
# Initialize plot objects
rcParams['figure.figsize'] = 5, 5 # sets plot size
fig = plt.figure()
ax = fig.add_subplot(111)
# Define levels in z-axis where we want lines to appear
levels = np.array([98.,98.5,99.,99.5,99.6,99.7,99.8,99.85])
# Generate a color mapping of the levels we've specified
cpf = ax.contourf(X,Y,Z,
len(levels),
extend='both',
cmap='Reds'
)
fig.colorbar(cpf, ticks=levels, orientation='vertical')
# Set all level lines to black
line_colors = ['black' for l in cpf.levels]
# Make plot and customize axes
cp = ax.contour(X, Y, Z,
levels=levels,
colors=line_colors)
ax.clabel(cp, cp.levels, inline=True, fmt=fmt, fontsize=10, colors=line_colors)
ax.set_xlabel(r'$X-\text{axis } [\alpha]$')
ax.set_ylabel(r'$Y-\text{axis } [\beta]$')
#plt.savefig('figure_for_stackoverflow.pdf') # uncomment to save vector/high-res version
The dataset can be found here
The only additional requirement I have is that I'd prefer not to add any other modules to my code if possible. If the result I'm seeking is impossible to obtain with the imported modules, I'm ready to relax this constraint. In this particular dataset there are no "NaN" elements in Z, but as a general rule I'd like the NaN spots in Z to be white. Thanks for your help!
I tried plotting a surface and highlighting constant lines of said surface using contour/contourf, but the result I obtain is not consistent with how my dataset is structured.
Use a BoundaryNorm.
from matplotlib.colors import BoundaryNorm
...
# note that I have extended your levels vector,
# on the bottom and on the top, to cover the full range of z
levels = np.array([z.min(), 98.,98.5,99.,99.5,99.6,99.7,99.8,99.85, z.max()])
cf = plt.contourf(x, y, z,
                  levels=levels,
                  norm=BoundaryNorm(levels, 256), cmap='Reds')
ct = plt.contour(x, y, z, levels=levels, colors='k k k k k k w w w w'.split())
cl = plt.clabel(ct)
cb = plt.colorbar(cf)
plt.show()
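To make the role of the norm concrete, here is a minimal sketch (NumPy and matplotlib only, no data file needed) of how BoundaryNorm gives every interval between consecutive levels its own colormap index. The sample values are made up for illustration; they are not taken from the dataset in the question.

```python
import numpy as np
from matplotlib.colors import BoundaryNorm

levels = np.array([98., 98.5, 99., 99.5, 99.6, 99.7, 99.8, 99.85])
norm = BoundaryNorm(levels, 256)  # 7 intervals spread over 256 colormap entries

# One made-up sample value inside each of the 7 intervals
samples = np.array([98.2, 98.7, 99.2, 99.55, 99.65, 99.75, 99.82])
indices = np.asarray(norm(samples))

# Print which colormap index each interval is mapped to
for lo, hi, idx in zip(levels[:-1], levels[1:], indices):
    print(f"{lo:6.2f} to {hi:6.2f} -> colormap index {idx}")
```

Each interval lands on a different, evenly spaced colormap index regardless of how narrow it is. With a plain linear norm, the narrow intervals above 99.5 would each cover only a few percent of the color range.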
UPDATE
Even better: use extend='both' in contourf and in colorbar,
levels = np.array([98.,98.5,99.,99.5,99.6,99.7,99.8,99.85])
cf = plt.contourf(x, y, z,
levels=levels,
extend='both',
cmap='Reds')
ct = plt.contour(x, y, z, levels=levels, colors='k k k k k w w w'.split())
cl = plt.clabel(ct)
cb = plt.colorbar(cf, extend='both')
plt.show()
Notice the pointed ends in the Colorbar
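For anyone without the CSV file, the following is a self-contained sketch of the same idea on synthetic data. The 1/(xy) surface, the NaN patch and the output file name are assumptions made up for illustration. As far as I know, contourf masks NaN cells and leaves them unfilled, so the white axes background shows through there.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt

# Synthetic stand-in for the CSV data: z approaches 100 like 1/(x*y)
x = np.linspace(1, 20, 200)
y = np.linspace(1, 20, 200)
X, Y = np.meshgrid(x, y)
Z = 100.0 - 4.0 / (X * Y)
Z[80:120, 80:120] = np.nan  # made-up NaN patch to show how missing data looks

levels = np.array([98., 98.5, 99., 99.5, 99.6, 99.7, 99.8, 99.85])

fig, ax = plt.subplots(figsize=(5, 5))
ax.set_facecolor("white")  # NaN cells are not filled, so this color shows through
cf = ax.contourf(X, Y, Z, levels=levels, extend="both", cmap="Reds")
ct = ax.contour(X, Y, Z, levels=levels, colors="black")
ax.clabel(ct, fmt="%.2f %%", fontsize=8)
cb = fig.colorbar(cf, ax=ax, ticks=levels)
fig.savefig("contourf_levels_sketch.png")
```

Passing the levels array itself (rather than its length) to contourf is what ties the filled bands to the labelled lines.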
Related
I'm using Python to create a 3D surface map. I have an array of data I'm trying to plot as a 3D surface. The issue is that I have logged the Z axis (necessary to show peaks in the data), which means the default colormap doesn't work (it displays one continuous color). I've tried using LogNorm to normalise the colormap, but again this produces one continuous color. I'm not sure whether I should be using the logged values to normalise the map, but if I do this the max is negative and produces an error?
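import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import matplotlib.colors as colors
fig=plt.figure(figsize=(10,10))
ax=plt.axes(projection='3d')
def log_tick_formatter(val, pos=None):
    return "{:.2e}".format(10**val)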
ax.zaxis.set_major_formatter(mticker.FuncFormatter(log_tick_formatter))
X=np.arange(0,2,1)
Y=np.arange(0,3,1)
X,Y=np.meshgrid(X,Y)
Z=[[1.2e-11,1.3e-11,-1.8e-11],[6e-13,1.3e-13,2e-15]]
Z_min=np.amin(Z)
Z_max=np.amax(Z)
norm = colors.LogNorm(vmin=1e-15,vmax=(Z_max),clip=False)
ax.plot_surface(X,Y,np.transpose(np.log10(Z)),norm=norm,cmap='rainbow')
Just an example of the logarithmic colors and logarithmic data:
#!/usr/bin/env ipython
import numpy as np
import matplotlib as mpl
import matplotlib.pylab as plt
import matplotlib.colors as colors
# ------------------
X=np.arange(0,401,1);nx= np.size(X)
Y=np.arange(40,200,1);ny = np.size(Y)
X,Y=np.meshgrid(X,Y)
Z = 10000*np.random.random((ny,nx))
Z=np.array(Z)
# ------------------------------------------------------------
Z_min=np.amin(Z)
Z_max=np.amax(Z)
# ------------------------------------------------------------
norm = colors.LogNorm(vmin=np.nanmin(Z),vmax=np.nanmax(Z),clip=False)
# ------------------------------------------------------------
fig = plt.figure(figsize=(15,5));axs = [fig.add_subplot(131),fig.add_subplot(132),fig.add_subplot(133)]
p0 = axs[0].pcolormesh(X,Y,np.log10(Z),cmap='rainbow',norm=norm);plt.colorbar(p0,ax=axs[0]);
axs[0].set_title('Original method: NOT TO DO!')
p1 = axs[1].pcolormesh(X,Y,Z,cmap='rainbow',norm=norm);plt.colorbar(p1,ax=axs[1])
axs[1].set_title('Normalized colorbar, original data')
p2 = axs[2].pcolormesh(X,Y,np.log10(Z),cmap='rainbow');plt.colorbar(p2,ax=axs[2])
axs[2].set_title('Logarithmic data, original colormap')
plt.savefig('test.png',bbox_inches='tight')
# --------------------------------------------------------------
So the result is like this:
In the first case, we have used a logarithmic colormap and also taken the logarithm of the data, so the colorbar no longer works: the values on the map are small, while the colorbar limits are large.
In the middle image, we use the normalized (logarithmic) colorbar with the original data, so it is immediately clear what the image shows and what the values are. The third case is when we take the logarithm of the data and the colorbar just shows the power of 10 needed to interpret the colored value on the plot.
So, in the end, I would suggest the middle method: use the logarithmic colorbar and original data.
Edit: to solve your problem: you are taking the log of the data and then taking it again when calculating the norm. Simply remove the norm and pass vmin and vmax (in log10 units) directly to the drawing function:
ax.plot_surface(X, Y, np.transpose(np.log10(Z)), cmap='rainbow',vmin=np.log10(1e-15),vmax=np.log10(Z_max))
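As a quick numerical check of why this works, the sketch below compares the two consistent options: raw data with LogNorm, and log10 data with a linear norm whose limits are also given in log10 units. The array is made up and strictly positive (the negative entry in the question's Z has no real logarithm, so it is replaced here).

```python
import numpy as np
from matplotlib.colors import LogNorm, Normalize

# Made-up, strictly positive data spanning several decades
Z = np.array([[1.2e-11, 1.3e-11, 1.8e-12],
              [6.0e-13, 1.3e-13, 2.0e-15]])

vmin, vmax = 1e-15, Z.max()

# Option A: raw data, logarithmic norm
a = np.asarray(LogNorm(vmin=vmin, vmax=vmax)(Z))

# Option B: log10 of the data, linear norm with limits also in log10 units
b = np.asarray(Normalize(vmin=np.log10(vmin), vmax=np.log10(vmax))(np.log10(Z)))

print(np.round(a, 3))
print("same color positions:", np.allclose(a, b))
```

Both options place every value at the same position in the colormap. Mixing them (log10 data together with a LogNorm built from the raw limits) is what collapses the surface into a single color.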
You can use the facecolors argument of plot_surface to define the color of each face independently of z. Here's a simplified example:
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
x = np.linspace(0,10,100)
y = np.linspace(0,10,100)
x,y = np.meshgrid(x,y)
z = np.sin(x+y)
fig, ax = plt.subplots(subplot_kw={"projection": "3d"})
cmap = plt.get_cmap('rainbow')  # matplotlib.cm.get_cmap was removed in matplotlib 3.9
def rescale_0_to_1(item):
    max_z = np.amax(item)
    min_z = np.amin(item)
    return (item - min_z)/(max_z-min_z)
rgba = cmap(rescale_0_to_1(z)) # some values of z to calculate color with
real_z = np.log(z+1) # real values of z to draw
surf = ax.plot_surface(x, y, real_z, cmap='rainbow', facecolors=rgba)
plt.show()
You can modify it to calculate colors based on x or y or something completely unrelated.
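As one possible variation, here is a sketch that colors the surface by x instead of z and adds a matching colorbar through a ScalarMappable. Coloring by x, the grid size and the output file name are arbitrary assumptions for illustration.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop for interactive use
import matplotlib.pyplot as plt
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize

x, y = np.meshgrid(np.linspace(0, 10, 40), np.linspace(0, 10, 40))
z = np.sin(x + y)        # heights that are actually drawn
color_values = x         # hypothetical choice: color each face by its x value

norm = Normalize(vmin=color_values.min(), vmax=color_values.max())
cmap = plt.get_cmap("rainbow")
rgba = cmap(norm(color_values))  # one RGBA color per grid point

fig, ax = plt.subplots(subplot_kw={"projection": "3d"})
ax.plot_surface(x, y, z, facecolors=rgba, shade=False)
# The surface itself carries no norm, so build the colorbar from norm + cmap
fig.colorbar(ScalarMappable(norm=norm, cmap=cmap), ax=ax, label="x")
fig.savefig("surface_facecolors_sketch.png")
```

Because the colors are passed explicitly, the colorbar has to be built from the same norm and colormap, otherwise it would not describe what is drawn.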
I am trying to do a matplotlib contourf plot with some x, y, and z values. Basically the z values will define the color of the plot.
However, right now one region (the important region for me) is very small compared to the rest (see figure), so it can be very difficult to see (a few small black "dots"). So I was wondering whether it is possible to draw the first level (or the last level, since the values are negative in this case) in another color, or to outline it with a thin white line or something similar, so that the small and important dots are clearly visible.
I am plotting with this code:
import matplotlib.pyplot as plt
from matplotlib import rcParams
import matplotlib.colors as colors
import numpy as np
nx = 41
ny = 67
x = np.linspace(0.01, 1, nx)
y = np.linspace(0.01, 2, ny)
x_bc = x[:, np.newaxis]
y_bc = y[np.newaxis, :]
z = x_bc*y_bc
max_value = np.amax(z)
cmapp = plt.get_cmap('Greys')
level_intervals = [100, 90, 80, 70, 60, 50, 40, 30, 20, 10, 8, 1.92, 0]
level_list = [max_value-i for i in level_intervals]
col_bar = plt.contourf(x, y, z.T, level_list, cmap=cmapp)
plt.xlabel('x')
plt.ylabel('y')
plt.colorbar(col_bar, cmap=cmapp)
plt.show()
I am sorry for not providing any real data, but I can't replicate the data used for the plot below (where there actually are some small dots of almost black inside the almost black region). However, the size of the z data and how it is created are just as above. There have, however, been many calculations in between before getting the data in the figure.
Edit based on your comment below: You can restrict the contours to the region/range you want. For example, I modified the x, y, and z data in your sample code above to plot more contour lines. I then select only the contour lines for the highest levels, sorted(level_list)[-5:] (the last 5 lines here), and highlight them in red. Try doing this with your actual data and see if the points in the region of interest become visible. Below are only the lines I modified in your code.
fig = plt.figure(figsize=(8, 6))
nx = 67
ny = 77
# Modified your actual values to get some more contour lines
x = np.linspace(1, 16, nx)
y = np.linspace(1, 15, ny)
z = x_bc*y_bc*0.2
col_bar = plt.contourf(x, y, z.T, level_list, cmap=cmapp)
plt.contour(col_bar, levels = sorted(level_list)[-5:], colors=('r',),linestyles=('-',),linewidths=(3,))
Output
You can create a custom colormap based on an existing one and replace one of the colors with e.g. red.
You may then use a BoundaryNorm to use the colors from the new colormap for the specified levels.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors
d = np.linspace(-3,3)
x,y = np.meshgrid(d,d)
data = -585.22 + 94*np.exp(-(x**2+y**2))
levels = np.linspace(-585.22, -485.22, 13)
norm = matplotlib.colors.BoundaryNorm(levels,len(levels))
colors = list(plt.cm.Greys(np.linspace(0,1,len(levels)-1)))
colors[-1] = "red"
cmap = matplotlib.colors.ListedColormap(colors,"", len(colors))
im = plt.contourf(data, levels, cmap=cmap, norm=norm)
plt.colorbar(ticks=levels)
plt.show()
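A variation on the same idea, as a sketch: contourf also accepts an explicit list of fill colors through its colors argument, so the band of interest can be recolored without building a norm, and a thin white contour can outline it as the question suggests. The data reuse the made-up Gaussian from above; the line width and file name are arbitrary.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop for interactive use
import matplotlib.pyplot as plt

d = np.linspace(-3, 3, 50)
x, y = np.meshgrid(d, d)
data = -585.22 + 94 * np.exp(-(x**2 + y**2))
levels = np.linspace(-585.22, -485.22, 13)  # 13 boundaries -> 12 filled bands

# One color per band taken from Greys, with the band of interest replaced by red
greys = plt.get_cmap("Greys")(np.linspace(0, 1, len(levels) - 1))
fill_colors = [tuple(float(v) for v in c) for c in greys]
fill_colors[-1] = (1.0, 0.0, 0.0, 1.0)

fig, ax = plt.subplots()
cf = ax.contourf(x, y, data, levels=levels, colors=fill_colors)
# Thin white outline around the highlighted band
ax.contour(x, y, data, levels=[levels[-2]], colors="white", linewidths=0.8)
fig.colorbar(cf, ax=ax, ticks=levels)
fig.savefig("highlight_band_sketch.png")
```

The list must contain exactly one color per band, i.e. one fewer than the number of levels.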
I am trying to plot data points whose color corresponds to their class labels. I am more familiar with R in terms of data visualization. In R, I would do the following:
x = matrix(runif(100), 2, 20)
y = matrix(runif(100), 2, 20)
labels = c(rep(0, 20), rep(1, 20))
plot(rbind(x, y), col = labels)
Then I will be able to have a scatter plot of data points from two classes and their point colors are the labels. I am not sure how to do this in python. So far what I did was
import numpy
plot(numpy.vstack((x,y)), c = labels)
But apparently python does not like integer values for colors.... Your help will be greatly appreciated!
You are on the right track. You have three vectors of data: x, y, and c, where c is an integer array with class labels.
The simplest thing you can do is:
import matplotlib.pyplot as plt
import numpy as np
# create some random data grouped into three groups
x = np.random.random(100)
y = np.random.random(100)
c = np.random.choice(range(3), 100)
# plot the data
fig = plt.figure()
ax = fig.add_subplot(111)
# plot x,y data with c as the color vector, set the line width of the markers to 0
ax.scatter(x, y, c=c, lw=0)
This gives you:
If you want more control over your colors, you may even create your own color table, for example:
mycolors = np.array([ 'g', 'm', 'c' ])
ax.scatter(x, y, c=mycolors[c], lw=0)
And now the colors are 0=green, 1=magenta, 2=cyan:
Of course, you may also specify color triplets (RGB) or quadruplets (RGBA) instead of the color names. This gives you a more granular control.
You may also use the built-in colormaps or create your own. I just find the above solution the most transparent with discrete data with only few possible values.
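For completeness, here is a sketch of the colormap route with a legend, using a ListedColormap and the legend_elements() helper of the scatter result. The data and class labels are made up.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop for interactive use
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

rng = np.random.default_rng(0)
x = rng.random(90)
y = rng.random(90)
c = np.arange(90) % 3  # made-up class labels 0, 1, 2

fig, ax = plt.subplots()
# Same colors as above: 0=green, 1=magenta, 2=cyan
sc = ax.scatter(x, y, c=c, cmap=ListedColormap(["g", "m", "c"]), lw=0)

# One legend entry per distinct class label
handles, labels = sc.legend_elements()
ax.legend(handles, labels, title="class")
fig.savefig("scatter_classes_sketch.png")
```

This keeps the integer labels in c, which is convenient if you also want a legend or a colorbar.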
I'm trying to make a simple 2d plot from a 3 column data sets e.g. y=f(x) and z=f(x). I want to plot xy and would like to display z using color. For example, the rectangular regions between [x1,x2, min(y), max(y)] ... will be filled by a background color depending on the value of z. I tried to use fill_between but could not associate a colormap with it. I'm new to matplotlib and python. I would very much appreciate your comments/suggestions.
Edit: I don't have an accurate plot but I'll try to explain my query with the help of following figure sample plot
say between x=0.5 to x=1, z=1
x=1.0, to x=1.5, z=2 ....
so I would like to cover x=0.5 to x=1 (min(y) to max(y)] with some color that corresponds to z=1, and between x=1, x=1.5, z=2 and so on.. I want to show this variation using a colormap and to display this colorbar at the right side.
Here's a solution for those who cannot use contourf or need fill_between for some other reason (as in this case, with irregular grid data).
import numpy as np
import matplotlib.pyplot as plt
from random import randint, sample
import matplotlib.colorbar as cbar
# from Numeric import asarray
# %matplotlib inline  # Jupyter/IPython magic; uncomment only inside a notebook
# The edges of 2d grid
# Some x column has varying rows of y (but always the same number of rows)
# z array that corresponds a value in each xy cell
xedges = np.sort(sample(range(1, 9), 6))
yedges = np.array([np.sort(sample(range(1, 9), 6)) for i in range(5)])
z = np.random.random((5,5))
f, ax = plt.subplots(1, sharex=True, figsize=(8,8))
f.subplots_adjust(hspace=0)
ax.set_ylabel(r'y')
ax.set_xlabel(r'x')
ax.set_ylim(0,10)
ax.set_xlim(0,10)
c = ['r','g','b','y','m']
normal = plt.Normalize(z.min(), z.max())
cmap = plt.cm.jet(normal(z))
# plot showing bins, coloured arbitrarily.
# I want each cell coloured according to z.
for i in range(len(xedges)-1):
    for j in range(len(yedges)):
        ax.vlines(xedges[i],yedges[i][j],yedges[i][j+1],linestyle='-')
        ax.hlines(yedges[i][j],xedges[i],xedges[i+1],linestyle='-')
        ax.vlines(xedges[i+1],yedges[i][j],yedges[i][j+1],linestyle='-')
        ax.hlines(yedges[i][j+1],xedges[i],xedges[i+1],linestyle='-')
        ax.fill_between([xedges[i],xedges[i+1]],yedges[i][j],yedges[i][j+1],facecolor=cmap[i][j][:])
cax, _ = cbar.make_axes(ax)
cb2 = cbar.ColorbarBase(cax, cmap=plt.cm.jet,norm=normal)
This gives
It sounds to me like you should use contourf:
http://matplotlib.org/examples/pylab_examples/contourf_demo.html
This would take x as the independent variable and produce y = y(x) and z = z(x). It seems that your z is not dependent on y, but contourf can still handle this.
As a simple example:
import pylab as plt
x = plt.linspace(0,2,100)
y = plt.linspace(0,10,100)
z = [[plt.sinc(i) for i in x] for j in y]
CS = plt.contourf(x, y, z, 20, # [-1, -0.1, 0, 0.1],
                  cmap=plt.cm.rainbow)
plt.colorbar(CS)
plt.plot(x,2+plt.sin(y), "--k")
There are many variations, but hopefully this captures the elements you are looking for.
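If the goal is literally the banded background described in the question (one color per x-interval, chosen by z), a minimal sketch with axvspan and a ScalarMappable colorbar could look like this. The curve, band edges and z values are made up for illustration.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop for interactive use
import matplotlib.pyplot as plt
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize

x = np.linspace(0.5, 2.5, 200)
y = np.sin(3 * x)                                # made-up y = f(x)
band_edges = np.array([0.5, 1.0, 1.5, 2.0, 2.5])
band_z = np.array([1.0, 2.0, 3.0, 4.0])          # made-up z value per band

norm = Normalize(vmin=band_z.min(), vmax=band_z.max())
cmap = plt.get_cmap("viridis")

fig, ax = plt.subplots()
ax.plot(x, y, "k")
# One full-height span per band, drawn behind the curve
spans = [ax.axvspan(lo, hi, facecolor=cmap(norm(zv)), lw=0, zorder=0)
         for lo, hi, zv in zip(band_edges[:-1], band_edges[1:], band_z)]
fig.colorbar(ScalarMappable(norm=norm, cmap=cmap), ax=ax, label="z")
fig.savefig("banded_background_sketch.png")
```

axvspan always covers the full height of the axes, so min(y) and max(y) do not need to be computed.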
If I make a scatter plot with matplotlib:
plt.scatter(randn(100),randn(100))
# set x, y lims
plt.xlim([...])
plt.ylim([...])
I'd like to annotate a given point (x, y) with an arrow pointing to it and a label. I know this can be done with annotate, but I'd like the arrow and its label to be placed "optimally", so that, where possible given the current axis scales/limits, the arrow and the label do not overlap the other points, e.g. when labelling an outlier point. Is there a way to do this? It doesn't have to be perfect, just an intelligent placement of the arrow/label given only the (x, y) coordinates of the point to be labeled. Thanks.
Basically, no, there isn't.
Layout engines that handle placing map labels similar to this are surprisingly complex and beyond the scope of matplotlib. (Bounding box intersections are actually a rather poor way of deciding where to place labels. What's the point in writing a ton of code for something that will only work in one case out of 1000?)
Other than that, due to the amount of complex text rendering that matplotlib does (e.g. latex), it's impossible to determine the extent of text without fully rendering it first (which is rather slow).
However, in many cases, you'll find that using a transparent box behind your label placed with annotate is a suitable workaround.
E.g.
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(1)
x, y = np.random.random((2,500))
fig, ax = plt.subplots()
ax.plot(x, y, 'bo')
# The key option here is `bbox`. I'm just going a bit crazy with it.
ax.annotate('Something', xy=(x[0], y[0]), xytext=(-20,20),
textcoords='offset points', ha='center', va='bottom',
bbox=dict(boxstyle='round,pad=0.2', fc='yellow', alpha=0.3),
arrowprops=dict(arrowstyle='->', connectionstyle='arc3,rad=0.5',
color='red'))
plt.show()
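To illustrate the point about rendering, here is a rough sketch of a naive bounding-box probe: draw the label at a few candidate offsets, render, read back the text extent in pixels, and keep the offset that covers the fewest points. The candidate offsets and the helper name overlaps are made up, and this is only a brute-force heuristic under those assumptions, not a layout engine.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop for interactive use
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
x, y = rng.random((2, 200))

fig, ax = plt.subplots()
ax.plot(x, y, "bo", ms=3)
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)
fig.canvas.draw()  # render once so the data-to-pixel transform is settled

points_px = ax.transData.transform(np.column_stack([x, y]))

def overlaps(offset):
    """Count data points covered by the label when placed at this offset (in points)."""
    ann = ax.annotate("Something", xy=(x[0], y[0]), xytext=offset,
                      textcoords="offset points", ha="center", va="center")
    fig.canvas.draw()  # the text extent is only known after rendering
    box = ann.get_window_extent()
    ann.remove()
    inside = ((points_px[:, 0] >= box.x0) & (points_px[:, 0] <= box.x1) &
              (points_px[:, 1] >= box.y0) & (points_px[:, 1] <= box.y1))
    return int(inside.sum())

# Hypothetical candidate offsets, in points, around the target
candidates = [(40, 40), (-40, 40), (40, -40), (-40, -40), (0, 50), (0, -50)]
scores = {off: overlaps(off) for off in candidates}
best = min(scores, key=scores.get)

ax.annotate("Something", xy=(x[0], y[0]), xytext=best,
            textcoords="offset points", ha="center", va="center",
            bbox=dict(boxstyle="round,pad=0.2", fc="yellow", alpha=0.3),
            arrowprops=dict(arrowstyle="->", color="red"))
fig.savefig("annotate_probe_sketch.png")
```

Every candidate costs a full redraw, which is the slowness mentioned above, and the probe ignores the arrow and the axes borders entirely.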
Use adjustText (full disclosure, I wrote it).
Let's label the first 10 points. The only parameter I changed was lowering the force of repelling from the points, since there are so many of them and we want the algorithm to take a bit more time and place the annotations more carefully.
import numpy as np
import matplotlib.pyplot as plt
from adjustText import adjust_text
np.random.seed(1)
x, y = np.random.random((2,500))
fig, ax = plt.subplots()
ax.plot(x, y, 'bo')
ts = []
for i in range(10):
    ts.append(plt.text(x[i], y[i], 'Something'+str(i)))
adjust_text(ts, x=x, y=y, force_points=0.1, arrowprops=dict(arrowstyle='->',
                                                            color='red'))
plt.show()
It's not ideal, but the points are really dense here and sometimes there is no way to place the text near to its target without overlapping any of them. But it's all automatic and easy to use, and also doesn't let labels overlap each other.
PS
It uses bounding box intersections, but rather successfully I'd say!
Another example using Phlya's awesome package, based on adjustText_mtcars:
from adjustText import adjust_text
import matplotlib.pyplot as plt
import pandas as pd
mtcars = pd.read_csv(
    "https://gist.githubusercontent.com/seankross/a412dfbd88b3db70b74b/raw/5f23f993cd87c283ce766e7ac6b329ee7cc2e1d1/mtcars.csv"
)
def plot_mtcars(adjust=False, force_points=1, *args, **kwargs):
    # plt.figure(figsize=(9, 6))
    plt.scatter(mtcars["wt"], mtcars["mpg"], s=15, c="r", edgecolors=(1, 1, 1, 0))
    texts = []
    for x, y, s in zip(mtcars["wt"], mtcars["mpg"], mtcars["model"]):
        texts.append(plt.text(x, y, s, size=9))
    plt.xlabel("wt")
    plt.ylabel("mpg")
    if adjust:
        plt.title(
            "force_points: %.1f\n adjust_text required %s iterations"
            % (
                force_points,
                adjust_text(
                    texts,
                    force_points=force_points,
                    arrowprops=dict(arrowstyle="-", color="k", lw=0.5),
                    **kwargs,
                ),
            )
        )
    else:
        plt.title("Original")
    return plt
fig = plt.figure(figsize=(12, 12))
force_points = [0.5, 1, 2, 4]
for index, k in enumerate(force_points):
    fig.add_subplot(2, 2, index + 1)
    plot_mtcars(adjust=True, force_points=k)