Matplotlib contourf plots with streamline numbers - python

I am trying to generate a filled contour plot with the contour values labelled inside it. I used plt.contourf to draw the filled contours and plt.clabel to add the labels, but the numbers in my plot are incorrect, as shown in the figure.
Contour plot with lines and wrong numbers
Contour lines with correct numbers
X = Data3.iloc[:,0].drop_duplicates()
Y = Data3.iloc[:,1].drop_duplicates()
Z = Data3.pivot(index='Battery capacity (kWh)', columns='Solar capacity (kW)', values='Diesel electricity generation (kWh)')
plt.figure(figsize=(7, 5))
contours = plt.contourf(X, Y, Z, 10, cmap='viridis', alpha=0.8 )
plt.colorbar();
plt.clabel(contours, inline = True, fontsize=8, fmt='%d', colors = 'black')
plt.xlabel('Solar Capacity (kW)',fontsize = 13) # x-axis label with fontsize 13
plt.ylabel('Battery Capacity (kWh)',fontsize = 12) # y-axis label with fontsize 12
plt.title('Diesel Electricity Generation (% of total generation)',fontsize = 15)
plt.scatter(x=(233*1.83*0.16), y=250, color = 'r', marker='o')
I also tried plt.contour with plt.clabel, and there the numbers were placed correctly. How can I draw labelled lines on top of plt.contourf without the labels getting mixed up?

This question is almost an MRE (minimal reproducible example). It would help if it were one, because then I could copy, paste, and run the code on my computer. The only things missing are the definitions of X, Y and Z, so I made a reproducible version that plots a simpler contour map:
import numpy as np
import itertools
from matplotlib import pyplot as plt
# Initialize the contour map data as a multiplication table
X= np.arange(10)
Y= np.arange(10)
Z= np.zeros((10, 10))
for i in range(10):
    for j in range(10):
        Z[i][j] = i*j
# Rest of example:
contours = plt.contour(X, Y, Z, 10, cmap='viridis', alpha=0.8 )
plt.colorbar();
plt.clabel(contours, inline = True, fontsize=8, fmt='%d', colors = 'black')
plt.xlabel('Solar Capacity (kW)',fontsize = 13) # x-axis label with fontsize 13
plt.ylabel('Battery Capacity (kWh)',fontsize = 12) # y-axis label with fontsize 12
plt.title('Diesel Electricity Generation (% of total generation)',fontsize = 15)
# Change location of single red point:
plt.scatter(x=5, y=7, color = 'r', marker='o')
When I run my example though, the contour labels show up correctly.
I'm wondering if this problem has to do with the input data in the Data3 variable.
Edit: I tried plotting this data with contourf to match the original question with contours = plt.contourf(X, Y, Z, 10, cmap='viridis', alpha=0.8 )
and I now see the problems with the contour labels:
I tried playing with all of the options to clabels and couldn't come up with something that outputs something suitable.
I suspect this is a bug with contourf. I couldn't find a bug report for this, so would you be comfortable with opening a bug ticket here in matplotlib?
In the short term, I suppose you could work around this by using contour() to plot. Then, if the plot really needs filled contours, my best idea is to fill them in manually with MS Paint or something -- but that's not a very good idea at all.
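One workaround that avoids manual editing, offered here as a sketch rather than as a confirmed fix for the asker's data: draw the filled contours with contourf, then draw line contours on top with contour at the same levels, and pass the line contour set (not the filled one) to clabel. The multiplication-table data is the stand-in from this answer, not the original Data3.

```python
import numpy as np
from matplotlib import pyplot as plt

# Stand-in data: a 10 x 10 multiplication table (not the asker's Data3)
X = np.arange(10)
Y = np.arange(10)
Z = np.outer(Y, X)

fig, ax = plt.subplots(figsize=(7, 5))

# Filled contours provide the colours and the colorbar
filled = ax.contourf(X, Y, Z, 10, cmap='viridis', alpha=0.8)
fig.colorbar(filled)

# Line contours at the same levels carry the labels
lines = ax.contour(X, Y, Z, levels=filled.levels, colors='black', linewidths=0.5)
labels = ax.clabel(lines, inline=True, fontsize=8, fmt='%d')
```

Reusing filled.levels keeps the labelled lines on the boundaries between the filled bands, so each number sits on the edge it describes.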

Related

How do I correctly implement contours of histograms with logscale binning in numpy/matplotlib

I am trying to plot contours of data that has been binned using numpy.histogram2d, except the bins are set using numpy.logspace (equal binning in log space).
Unfortunately, this results in a strange behavior that I can't seem to resolve: the placement of the contours does not match the location of the points in x/y. I plot both the 2d histogram of the data, and the contours, and they do not overlap.
It looks like what is actually happening is the contours are being placed on the physical location of the plot in linear space where I expect them to be placed in log space.
It's a strange phenomenon that I think is best described by the following plots, which use identical data binned in different ways:
Here is a minimum working example to produce the logbinned data:
import numpy as np
import matplotlib.pyplot as plt
x = np.random.normal(loc=500, scale=100,size=10000)
y = np.random.normal(loc=600, scale=60, size=10000)
nbins = 50
bins = (np.logspace(np.log10(10),np.log10(1000),nbins),np.logspace(np.log10(10),np.log10(1000),nbins))
HH, xe, ye = np.histogram2d(x,y,bins=bins)
plt.hist2d(x,y,bins=bins,cmin=1);
grid = HH.transpose()
extent = np.array([xe.min(), xe.max(), ye.min(), ye.max()])
cs = plt.contourf(grid,2,extent=extent,extend='max',cmap='plasma',alpha=0.5,zorder=100)
plt.contour(grid,2,extent=extent,colors='k',zorder=100)
plt.yscale('log')
plt.xscale('log')
It's fairly clear what is happening -- the contour is getting misplaced due to the scaling of the bins. I'd like to be able to plot the histogram and the contour here together.
If anyone has an idea of how to resolve this, that would be very helpful - thanks!
This is your problem:
cs = plt.contourf(grid,2,extent=extent,...)
You are passing in a single 2d array specifying the values of the histograms, but you aren't passing the x and y coordinates these data correspond to. By only passing in extent there's no way for pyplot to do anything other than assume that the underlying grid is uniform, stretched out to fit extent.
So instead what you have to do is to define x and y components for each value in grid. You have to think a bit how to do this, because you have (n, n)-shaped data and (n+1,)-shaped edges to go with it. We should probably choose the center of each bin to associate a data point with. So we need to find the midpoint of each bin, and pass those arrays to contour[f].
Something like this:
import numpy as np
import matplotlib.pyplot as plt
rng = np.random.default_rng()
size = 10000
x = rng.normal(loc=500, scale=100, size=size)
y = rng.normal(loc=600, scale=60, size=size)
nbins = 50
bins = (np.geomspace(10, 1000, nbins),) * 2
HH, xe, ye = np.histogram2d(x, y, bins=bins)
fig, ax = plt.subplots()
ax.hist2d(x, y, bins=bins, cmin=1)
grid = HH.transpose()
# compute bin midpoints
midpoints = (xe[1:] + xe[:-1])/2, (ye[1:] + ye[:-1])/2
cs = ax.contourf(*midpoints, grid, levels=2, extend='max', cmap='plasma', alpha=0.5, zorder=100)
ax.contour(*midpoints, grid, levels=2, colors='k', zorder=100)
# these are a red herring during debugging:
#ax.set_yscale('log')
#ax.set_xscale('log')
(I've cleaned up your code a bit.)
Alternatively, if you want to avoid having those white strips at the top and edge, you can keep your bin edges, and pad your grid with zeros:
grid_padded = np.pad(grid, [(0, 1)])
cs = ax.contourf(xe, ye, grid_padded, levels=2, extend='max', cmap='plasma', alpha=0.5, zorder=100)
ax.contour(xe, ye, grid_padded, levels=2, colors='k', zorder=100)
This gives us something like
This seems prettier, but if you think about your data this is less exact, because your data points are shifted with respect to the bin coordinates they correspond to. If you look closely you can see the contours being shifted with respect to the output of hist2d. You could fix this by generating geomspaces with one more final value which you only use for this final plotting step, and again use the midpoints of these edges (complete with a last auxiliary one).
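The auxiliary-edge idea in the last sentence could look roughly like the sketch below. This is one reading of the suggestion, not code from the answer: the names edges_ext and mid_ext are made up for illustration, and the array of ones stands in for the real transposed histogram.

```python
import numpy as np

nbins = 50
edges = np.geomspace(10, 1000, nbins)  # nbins edges give nbins - 1 bins

# Append one auxiliary edge that continues the constant ratio of the geomspace
ratio = edges[-1] / edges[-2]
edges_ext = np.append(edges, edges[-1] * ratio)

# Midpoints of the extended edges: one per real bin plus one auxiliary point
mid_ext = (edges_ext[1:] + edges_ext[:-1]) / 2

# Stand-in for the transposed histogram, padded with one row and column of zeros
grid = np.ones((nbins - 1, nbins - 1))
grid_padded = np.pad(grid, [(0, 1)])

# The padded grid now lines up with the extended midpoints, e.g.
# ax.contourf(mid_ext, mid_ext, grid_padded, levels=2, extend='max')
```

The real bins keep their original midpoints, so the contours stay aligned with hist2d, and the zero padding sits at the auxiliary midpoint just past the last real edge.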

How to show only the outline of a bar plot matplotlib

I'm plotting data as a bar plot in matplotlib and am trying to only show the outline of the bars, so that it appears as a 'stepped graph' of the data.
I've added my code below along with an image of the desired output.
plt.bar(x, y, align='center', width=0.1, edgecolor='black', color='none')
The plot I have:
The plot I would like:
Are there any other libraries that may be able to produce this? The bar keyword arguments don't seem to have anything that can.
Your image looks like a function that is horizontal around each x,y value. The following code simulates this:
for every x,y: create two new points one at x-0.5 and one at x+0.5, both with the same y
to close the shape at the ends, add (x[0]-0.5, 0) at the start and (x[-1]+0.5, 0) at the end.
import numpy as np
from matplotlib import pyplot as plt
x = np.arange(0, 30, 1)
y = np.random.uniform(2, 10, 30)
xs = [x[0] - 0.5]
ys = [0]
for i in range(len(x)):
    xs.append(x[i] - 0.5)
    xs.append(x[i] + 0.5)
    ys.append(y[i])
    ys.append(y[i])
xs.append(x[-1] + 0.5)
ys.append(0)
plt.plot(xs, ys, color='dodgerblue')
# optionally color the area below the curve
plt.fill_between(xs, 0, ys, color='gold')
PS: @AsishM. mentioned in the comments that matplotlib also has its own step function. If that function meets your needs, please use it. If you need some extra control or variation, such as coloring the area below the curve or handling the shape at the ends, this answer could be a starting point.
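For comparison, a minimal sketch of the built-in route, assuming the same kind of evenly spaced x values as above. It shows step with where='mid', and also stairs, which is only available in Matplotlib 3.4 and later; the variable names are illustrative.

```python
import numpy as np
from matplotlib import pyplot as plt

rng = np.random.default_rng(0)
x = np.arange(0, 30, 1)
y = rng.uniform(2, 10, 30)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# where='mid' centres each horizontal segment on its x value
(step_line,) = ax1.step(x, y, where='mid', color='dodgerblue')

# stairs takes bin edges (one more than the number of values)
# and uses baseline=0 by default
edges = np.arange(len(x) + 1) - 0.5
outline = ax2.stairs(y, edges, color='dodgerblue')
```

step works directly from the x positions, while stairs needs explicit edges, which makes the bar width visible in the code.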

matplotlib separating scatterplot points and creating a divisionary curve

I'm attempting to create a divisionary curve on a scatter plot in matplotlib that would divide my scatterplot according to marker size.
The (x,y) are phi0 and phi0dot and I'm coloring/sizing according to a third variable 'e-folds'. I'd like to draw an 'S' shaped curve that divides the plot into small, black markers and large, cyan markers.
Here is a sample scatterplot run with a very small number of points as an example. Ultimately I will run with tens of thousands of data points, so that the division would be much finer and more obviously 'S' shaped. This is roughly what I have in mind.
My code thus far looks like this:
# Set up the PDF
pdf_pages = PdfPages(outfile)
plt.rcParams["font.family"] = "serif"
# Create the canvas
canvas = plt.figure(figsize=(14.0, 14.0), dpi=100)
plt.subplot(1, 1, 1)
for a, phi0, phi0dot, efolds in datastore:
    if efolds[-1] > 65:
        plt.scatter(phi0[0], phi0dot[0], s=200, color='aqua')
    else:
        plt.scatter(phi0[0], phi0dot[0], s=30, color='black')
# Apply labels
plt.xlabel(r"$\phi_0$")
plt.ylabel(r"$\dot{\phi}_0$")
# Finish the file
pdf_pages.savefig(canvas)
pdf_pages.close()
print("Finished!")
This type of separation is very akin to what I'd like to do, but don't see immediately how I would extend this to my problem. Any advice would be much appreciated.
I would assume that the separation line between the differently classified points is a simple contour line along the threshold value.
Here I'm assuming classification takes values of 0 or 1, hence one can draw a contour along 0.5,
ax.contour(x,y,clas, [0.5])
Example:
import numpy as np
import matplotlib.pyplot as plt
# Some data on a grid
x,y = np.meshgrid(np.arange(20), np.arange(10))
z = np.sin(y+1) + 2*np.cos(x/5) + 2
fig, ax = plt.subplots()
# Threshold; values above the threshold belong to a different class than those below.
thresh = 2.5
clas = z > thresh
size = 100*clas + 30*~clas
# scatter plot
ax.scatter(x.flatten(), y.flatten(), s = size.flatten(), c=clas.flatten(), cmap="bwr")
# threshold line
ax.contour(x,y,clas, [.5], colors="k", linewidths=2)
plt.show()
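The example above relies on gridded data. If the points are scattered, as the (phi0, phi0dot) pairs in the question appear to be, a similar line can be sketched with tricontour, which triangulates the points first. The data and the e-folds formula below are invented stand-ins, not the asker's datastore.

```python
import numpy as np
import matplotlib.pyplot as plt

# Invented stand-in data: scattered points and a made-up e-folds value
rng = np.random.default_rng(1)
phi0 = rng.uniform(0, 20, 500)
phi0dot = rng.uniform(0, 10, 500)
efolds = 60 + 5 * np.sin(phi0dot + 1) + 10 * np.cos(phi0 / 5)

# 1.0 where the point exceeds the threshold from the question, else 0.0
clas = (efolds > 65).astype(float)
sizes = np.where(clas > 0, 200, 30)
colors = ['aqua' if c > 0 else 'black' for c in clas]

fig, ax = plt.subplots()
ax.scatter(phi0, phi0dot, s=sizes, c=colors)

# Contour the 0/1 classification at 0.5 on a triangulation of the points
ax.tricontour(phi0, phi0dot, clas, levels=[0.5], colors='k', linewidths=2)
```

With few points the line follows the triangle edges and looks jagged; it should smooth out as the number of points grows.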

Plot map in loop without plotting previous points

I am trying to plot points drifting in the sea. The following code works, but plots all the points of the previous plots as well:
Duration = 6 #hours
plt.figure(figsize=(20,10))#20,10
map = Basemap(width=300000,height=300000,projection='lcc',
              resolution='c',lat_0=51.25,lon_0=-4)#lat_0lon_0 is left under
map.drawmapboundary(fill_color='turquoise')
map.fillcontinents(color='white',lake_color='aqua')
map.drawcountries(linestyle='--')
x=[]
y=[]
for i in range (0,Duration):
    x,y = map(Xpos1[i],Ypos1[i])
    map.scatter(x, y, s=1, c='k', marker='o', label = 'Aurelia aurita', zorder=2)
    plt.savefig('GIF10P%d' %i)
Xpos1 and Ypos1 are a list of masked arrays. Every array in the lists has a length of 10, so 10 points should be plotted in each map:
Xpos1=[[latitude0,lat1,lat2,lat3,..., lat9],
[latitude0,lat1,lat2,lat3,..., lat9],...]
This gives me six figures, I'll show you the first and last:
Every picture is supposed to have 10 points, but the last one is a combination of all the maps (so 60 points).
How do I still get 6 maps with only 10 points each?
Edit:
When I use the answer from "matplotlib.pyplot will not forget previous plots - how can I flush/refresh?", I get the error:
ValueError: Can not reset the axes. You are probably trying to re-use an artist in more than one Axes which is not supported
A similar error pops up when I use the answer from: How to "clean the slate"?
namely,
plt.clf()
plt.cla()#after plt.show()
Any help is deeply appreciated!
Instead of plotting new scatter plots for each image it would make sense to update the scatter plot's data. The advantage is that the map only needs to be created once, saving some time.
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
import numpy as np
Duration = 6 #hours
Xpos1 = np.random.normal(-4, 0.6, size=(Duration,10))
Ypos1 = np.random.normal(51.25, 0.6, size=(Duration,10))
plt.figure(figsize=(20,10))
m = Basemap(width=300000,height=300000,projection='lcc',
            resolution='c',lat_0=51.25,lon_0=-4)
m.drawmapboundary(fill_color='turquoise')
m.fillcontinents(color='white',lake_color='aqua')
m.drawcountries(linestyle='--')
scatter = m.scatter([], [], s=10, c='k', marker='o', label = 'Aurelia aurita', zorder=2)
for i in range (0,Duration):
    x,y = m(Xpos1[i],Ypos1[i])
    scatter.set_offsets(np.c_[x,y])
    plt.savefig('GIF10P%d' %i)
plt.show()
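The same update pattern can be tried without Basemap. The sketch below uses a plain Axes and random stand-in positions, so it only illustrates set_offsets; the file names and the temporary output directory are assumptions made for the example.

```python
import os
import tempfile

import numpy as np
import matplotlib.pyplot as plt

duration = 6  # hours
rng = np.random.default_rng(0)
xpos = rng.normal(-4, 0.6, size=(duration, 10))
ypos = rng.normal(51.25, 0.6, size=(duration, 10))

out_dir = tempfile.mkdtemp()
fig, ax = plt.subplots()
# An empty scatter does not autoscale, so fix the view up front
ax.set_xlim(-7, -1)
ax.set_ylim(48.25, 54.25)
scatter = ax.scatter([], [], s=10, c='k')

frames = []
for i in range(duration):
    # Replace the points instead of adding a new scatter artist
    scatter.set_offsets(np.c_[xpos[i], ypos[i]])
    path = os.path.join(out_dir, 'frame%d.png' % i)
    fig.savefig(path)
    frames.append(path)
```

Each saved frame contains only the ten points of that step, because the single scatter artist is overwritten rather than duplicated.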
As @Primusa said, simply moving everything into the for loop works, because the map is then redefined for each figure.
The correct code is then:
for i in range (0,Duration,int(24/FramesPerDay)):
    plt.figure(figsize=(20,10))#20,10
    map = Basemap(width=300000,height=300000,projection='lcc',
                  resolution='c',lat_0=51.25,lon_0=-4)#lat_0lon_0 is left under
    map.drawmapboundary(fill_color='turquoise')
    map.fillcontinents(color='white',lake_color='aqua')
    map.drawcountries(linestyle='--')
    x,y = map(Xpos1[i],Ypos1[i])
    map.scatter(x, y, s=1, c='k', marker='o', label = 'Aurelia aurita', zorder=2)
    plt.savefig('GIF10P%d' %i)

Instead of grid lines on a plot, can matplotlib print grid crosses?

I want to have some grid lines on a plot, but actually full-length lines are too much/distracting, even dashed light grey lines. I went and manually did some editing of the SVG output to get the effect I was looking for. Can this be done with matplotlib? I had a look at the pyplot api for grid, and the only thing I can see that might be able to get near it are the xdata and ydata Line2D kwargs.
This cannot be done through the basic API, because the grid lines are created using only two points. The grid lines would need a 'data' point at every tick mark for there to be a marker drawn. This is shown in the following example:
import matplotlib.pyplot as plt
ax = plt.subplot(111)
ax.grid(clip_on=False, marker='o', markersize=10)
plt.savefig('crosses.png')
plt.show()
This results in:
Notice how the 'o' markers are only at the beginning and the end of the Axes edges, because the grid lines only involve two points.
You could write a method to emulate what you want, creating the cross marks using a series of Artists, but it's quicker to just leverage the basic plotting capabilities to draw the cross pattern.
This is what I do in the following example:
import matplotlib.pyplot as plt
import numpy as np
NPOINTS=100
def set_grid_cross(ax, in_back=True):
    xticks = ax.get_xticks()
    yticks = ax.get_yticks()
    xgrid, ygrid = np.meshgrid(xticks, yticks)
    kywds = dict()
    if in_back:
        kywds['zorder'] = 0
    grid_lines = ax.plot(xgrid, ygrid, 'k+', **kywds)
xvals = np.arange(NPOINTS)
yvals = np.random.random(NPOINTS) * NPOINTS
ax1 = plt.subplot(121)
ax2 = plt.subplot(122)
ax1.plot(xvals, yvals, linewidth=4)
ax1.plot(xvals, xvals, linewidth=7)
set_grid_cross(ax1)
ax2.plot(xvals, yvals, linewidth=4)
ax2.plot(xvals, xvals, linewidth=7)
set_grid_cross(ax2, in_back=False)
plt.savefig('gridpoints.png')
plt.show()
This results in the following figure:
As you can see, I take the tick marks in x and y to define a series of points where I want grid marks ('+'). I use meshgrid to take two 1D arrays and make 2 2D arrays corresponding to the double loop over each grid point. I plot this with the mark style as '+', and I'm done... almost. This plots the crosses on top, and I added an extra keyword to reorder the list of lines associated with the plot. I adjust the zorder of the grid marks if they are to be drawn behind everything.*****
The example shows the left subplot where by default the grid is placed in back, and the right subplot disables this option. You can notice the difference if you follow the green line in each plot.
If you are bothered by having grid crosses on the border, you can remove the first and last tick marks for both x and y before you define the grid in set_grid_cross, like so:
xticks = ax.get_xticks()[1:-1] #< notice the slicing
yticks = ax.get_yticks()[1:-1] #< notice the slicing
xgrid, ygrid = np.meshgrid(xticks, yticks)
I do this in the following example, using a larger, different marker to make my point:
***** Thanks to the answer by @fraxel for pointing this out.
You can draw line segments at every intersection of the tick points. It's pretty easy to do: grab the tick locations with get_ticklocs() for both axes, then loop through all combinations, drawing short line segments using axhline and axvline, thus creating a cross hair at every intersection. I've set zorder=0 so the cross-hairs are drawn first and end up behind the plot data. It's easy to control the color/alpha and cross-hair size. A couple of slight 'gotchas': do the plot before you get the tick locations, and the xmin and xmax parameters seem to require normalisation.
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(1,1,1)
ax.plot((0,2,3,5,5,5,6,7,8,6,6,4,3,32,7,99), 'r-',linewidth=4)
x_ticks = ax.xaxis.get_ticklocs()
y_ticks = ax.yaxis.get_ticklocs()
for yy in y_ticks[1:-1]:
    for xx in x_ticks[1:-1]:
        plt.axhline(y=yy, xmin=xx / max(x_ticks) - 0.02,
                    xmax=xx / max(x_ticks) + 0.02, color='gray', alpha=0.5, zorder=0)
        plt.axvline(x=xx, ymin=yy / max(y_ticks) - 0.02,
                    ymax=yy / max(y_ticks) + 0.02, color='gray', alpha=0.5, zorder=0)
plt.show()
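On the normalisation gotcha: axhline and axvline take xmin/xmax and ymin/ymax as fractions of the axis span (0 to 1), so dividing by the largest tick only lines up when the axis starts at 0 and ends at that tick. A sketch of a more general conversion for linear axes follows; axis_fraction is a hypothetical helper, not a matplotlib function.

```python
import matplotlib.pyplot as plt


def axis_fraction(value, limits):
    """Map a data coordinate to a 0..1 fraction of a linear axis span."""
    low, high = limits
    return (value - low) / (high - low)


fig, ax = plt.subplots()
ax.plot((0, 2, 3, 5, 5, 5, 6, 7, 8, 6, 6, 4, 3, 32, 7, 99), 'r-', linewidth=4)

# Read limits and ticks after plotting, then freeze the limits
xlim, ylim = ax.get_xlim(), ax.get_ylim()
x_ticks = [t for t in ax.get_xticks() if xlim[0] <= t <= xlim[1]]
y_ticks = [t for t in ax.get_yticks() if ylim[0] <= t <= ylim[1]]
ax.set_xlim(xlim)
ax.set_ylim(ylim)

half = 0.02  # half-length of each cross arm, as a fraction of the axis
for yy in y_ticks:
    for xx in x_ticks:
        fx = axis_fraction(xx, xlim)
        fy = axis_fraction(yy, ylim)
        ax.axhline(y=yy, xmin=fx - half, xmax=fx + half,
                   color='gray', alpha=0.5, zorder=0)
        ax.axvline(x=xx, ymin=fy - half, ymax=fy + half,
                   color='gray', alpha=0.5, zorder=0)
```

Freezing the limits before drawing keeps the fractions valid; if the view changes later (for example by zooming), the crosses would need to be recomputed.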
