Contour Plot of Binary Data (0 or 1) - python

I have x values, y values, and z values. The z values are either 0 or 1, essentially indicating whether an (x,y) pair is a threat (1) or not a threat (0).
I have been trying to plot a 2D contour plot using the matplotlib contourf. This seems to have been interpolating between my z values, which I don't want. So, I did a bit of searching and found that I could use pcolormesh to better plot binary data. However, I am still having some issues.
First, the colorbar of my pcolormesh plot doesn't show two distinct colors (white or red). Instead, it shows a full spectrum from white to red. See the attached plot for what I mean. How do I change this so that the colorbar only shows two colors, for 0 and 1? Second, is there a way to draw a grid of squares into the contour plot so that it is clearer for which x and y intervals the 0s and 1s occur? Third, my code calls minorticks_on(). However, the minor ticks do not show up in the plot. Why?
The code I use is shown here. The vels and ms arrays for x and y can really be anything, and threat_bin is just the corresponding 0 or 1 values for all the (vels, ms) pairs:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
fig=plt.figure(figsize=(6,5))
ax2=fig.add_subplot(111)
XX,YY=np.meshgrid(vels, ms)
cp=ax2.pcolormesh(XX/1000.0,YY,threat_bin, cmap=cm.Reds)
ax2.minorticks_on()
ax2.set_ylabel('Initial Meteoroid Mass (kg)')
ax2.set_xlabel('Initial Meteoroid Velocity (km/s)')
ax2.set_yscale('log')
fig.colorbar(cp, ticks=[0,1], label='Threat Binary')
plt.show()
Please be simple with your recommendations, and let me know the code I should include or change with respect to what I have at the moment.
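One possible way to address all three points, offered as a sketch rather than a definitive fix: a two-entry ListedColormap with a BoundaryNorm gives a colorbar with exactly two colors, edgecolors draws the outline of every cell, and minorticks_on() is called after set_yscale('log'). The stand-in vels, ms, and threat_bin arrays and the centers_to_edges helper are hypothetical, and the idea that call order matters for the minor ticks is an assumption, not something confirmed from the original plot.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap, BoundaryNorm

# Hypothetical stand-in data: replace with the real vels, ms, threat_bin.
vels = np.linspace(11e3, 72e3, 8)   # m/s
ms = np.logspace(0, 6, 7)           # kg
threat_bin = (np.log10(ms)[:, None] + vels[None, :] / 2e4 > 5).astype(int)


def centers_to_edges(c, log=False):
    """Hypothetical helper: edges halfway between neighboring centers.

    With log=True the midpoints are taken in log10 space, which keeps
    every edge positive for a log-scaled axis.
    """
    c = np.log10(c) if log else np.asarray(c, dtype=float)
    mid = 0.5 * (c[:-1] + c[1:])
    edges = np.concatenate(([2 * c[0] - mid[0]], mid, [2 * c[-1] - mid[-1]]))
    return 10.0 ** edges if log else edges


x_edges = centers_to_edges(vels / 1000.0)
y_edges = centers_to_edges(ms, log=True)

# Exactly two colors: values in [-0.5, 0.5) are white, [0.5, 1.5) are red.
cmap = ListedColormap(["white", "red"])
norm = BoundaryNorm([-0.5, 0.5, 1.5], cmap.N)

fig, ax2 = plt.subplots(figsize=(6, 5))
cp = ax2.pcolormesh(x_edges, y_edges, threat_bin, cmap=cmap, norm=norm,
                    edgecolors="k", linewidths=0.5)  # black cell outlines
ax2.set_yscale("log")
ax2.minorticks_on()  # called after the scale is set (assumed to matter)
ax2.set_xlabel("Initial Meteoroid Velocity (km/s)")
ax2.set_ylabel("Initial Meteoroid Mass (kg)")
fig.colorbar(cp, ticks=[0, 1], label="Threat Binary")
```

Call plt.show() afterwards to display the figure. Passing explicit cell edges avoids relying on how pcolormesh guesses edges from centers, which can go wrong on a log axis.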

Related

Random (false data) lines appearing in contourf plot at certain # of levels

I'm trying to use matplotlib and contourf to generate some filled (polar) contour plots of velocity data. I have some data (MeanVel_Z_Run16_np) I am plotting on theta (Th_Run16) and r (R_Run16), as shown here:
fig,ax = plt.subplots(subplot_kw={'projection':'polar'})
levels = np.linspace(-2.5,4,15)
cplot = ax.contourf(Th_Run16,R_Run16,MeanVel_Z_Run16_np,levels,cmap='plasma')
ax.set_rmax(80)
ax.set_rticks([15,30,45,60])
rlabels = ax.get_ymajorticklabels()
for label in rlabels:
    label.set_color('#E6E6FA')
cbar = plt.colorbar(cplot,pad=0.1,ticks=[0,3,6,9,12,15])
cbar.set_label(r'$V_{Z}$ [m/s]')
plt.show()
This generates the following plot:
Velocity plot with 15 levels:
Which looks great (and accurate), outside of that random straight orange line roughly between 90deg and 180deg. I know that this is not real data because I plotted this in MATLAB and it did not appear there. Furthermore, I have realized it appears to relate to the number of contour levels I use. For example, if I bump this code up to 30 levels instead of 15, the result changes significantly, with odd triangular regions of uniform value:
Velocity plot with 30 levels:
Does anyone know what might be going on here? How can I get contourf to just plot my data without these strange misrepresentations? I would like to use 15 contour levels at least. Thank you.

Creating a pseudo color plot with a linear and nonlinear axis and computing values based on the center of grid values

I have the equation: z(x,y)=1+x^(2/3)y^(-3/4)
I would like to calculate values of z for x=[0,100] and y=[10^1,10^4]. I will do this for 100 points in each axis direction. My grid, then, will be 100x100 points. In the x-direction I want the points spaced linearly. In the y-direction I want the points spaced logarithmically.
Were I to need these values I could easily go through the following:
import numpy as np
x=np.linspace(0,100,100)
y=np.logspace(1,4,100)
z=np.zeros( (len(x), len(y)) )
for i in range(len(x)):
    for j in range(len(y)):
        z[i,j]=1+x[i]**(2/3)*y[j]**(-3/4)
The problem for me comes with visualizing these results. I know that I would need to create a grid of points. I feel my options are to create a meshgrid with the values and then use pcolor.
My issue here is that the values at the center of each block do not coincide with the calculated values. In the x-direction I could fix this by shifting the x-vector by half of dx (the step between successive values). I'm not so sure how I would do this for the y-axis. Furthermore, if I wanted to compute values for each of the y-direction values, including the end points, they would not all show up.
In the final visualization I would like to have the y-axis as a log scale and the x-axis as a linear scale. I would also like the tick marks to fall in the center of the cells, correlating with the correct value. Can someone point me to the correct plotting functions for this? I have to resolve the issue using pcolor or pcolormesh.
Should you require more details, please let me know.
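In matplotlib 3.3 and later, you can use pcolormesh with shading='nearest', which centers each block on its data value:
import numpy as np
import matplotlib.pyplot as plt
y_plot = np.log10(y)  # plot against log10(y) so the cells are evenly spaced
z[5, 5] = 0 # to make it more evident
# z is indexed as z[i_x, j_y]; pcolormesh expects rows along y, so transpose
plt.pcolormesh(x, y_plot, z.T, shading="nearest")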
plt.colorbar()
ax = plt.gca()
ax.set_xticks(x)
ax.set_yticks(y_plot)
plt.axvline(x[5])
plt.axhline(y_plot[5])
Output: [figure: pcolormesh plot with the zeroed cell at the intersection of the two marker lines]
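If a true log-scaled y axis is wanted rather than plotting against log10(y), one option is to build the cell edges yourself so that each computed point sits at the center of its cell: arithmetic midpoints in x, geometric midpoints in y. The sketch below assumes the evenly spaced linspace/logspace grids from the question. The names dx, dlog, x_edges, and y_edges are introduced here for illustration only.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 100, 100)
y = np.logspace(1, 4, 100)

# Broadcasting replaces the double loop. Rows follow y and columns follow x,
# which is the (ny, nx) layout that pcolormesh expects.
z = 1 + x[None, :] ** (2 / 3) * y[:, None] ** (-3 / 4)

# Uniform spacing: dx in linear space for x, dlog in log10 space for y.
dx = x[1] - x[0]
dlog = np.log10(y[1]) - np.log10(y[0])

# One more edge than there are centers, shifted by half a step on each side.
x_edges = np.linspace(x[0] - dx / 2, x[-1] + dx / 2, x.size + 1)
y_edges = np.logspace(np.log10(y[0]) - dlog / 2,
                      np.log10(y[-1]) + dlog / 2, y.size + 1)

fig, ax = plt.subplots()
mesh = ax.pcolormesh(x_edges, y_edges, z, shading="flat")
ax.set_yscale("log")
fig.colorbar(mesh)
```

With this layout every value of x and y, including the end points, gets its own cell, and a tick placed at any x[i] or y[j] falls in the middle of that cell on the respective axis scale.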

pcolormesh ticks center for each data point/tile

I have some z=f(x,y) data that I would like to display in a heat map. So I am using np.meshgrid to create a (x,y)-grid and then call pcolormesh. However the ticks are not centered for each "tile" that correspond to a data point -- in the docs, I did not find any instructions on how to center the ticks for each tile so that I can immediately read off the corresponding value. Any ideas?
In the image attached for instance, it is not clear to which x-value the tick corresponds.
In a pcolormesh the grid is defined by the edge values. In the following example the value of 6 in the lower left corner is the value between 0 and 1 in each dimension. I think this is perfectly understandable to everyone.
import matplotlib.pyplot as plt
import numpy as np; np.random.seed(1)
x = np.arange(5)
y = np.arange(3)
X,Y = np.meshgrid(x,y)
Z = np.random.randint(1,9, size=(2,4))
plt.pcolormesh(X,Y,Z)
plt.show()
Now if you want to (only) have the ticks in the middle of the cell, you can set them as follows:
plt.xticks(x[:-1]+0.5)
plt.yticks(y[:-1]+0.5)
If the lower left pixel actually does not correspond to the data between 0 and 1, but to the data at 0, the grid is 'wrong'; a solution would be to fix it by translating it by half the pixel width.
plt.pcolormesh(X-0.5,Y-0.5,Z)
As above, the ticks could be adapted to show only certain numbers, using plt.xticks.

To Plot data points in Basemap and skip values which are zero

Presently I am plotting my data values in Basemap using scatter, but some of the data values are zero, so when I add the color bar I get colored dots even for zero (blue in my case). I would like to show only the nonzero data values in my plot. I find imshow a little complex; I would just like to provide x, y values and plot the data with a colorbar. I want something like vmin that shows only values greater than zero. Could you please suggest an approach?
Below is the code
xs, ys = m(lon,lat)
m.scatter(xs, ys, c=mean)
c = m.colorbar(location='bottom',pad='7%')
You can replace the 0 values with np.nan (after importing numpy), which will keep them from being displayed.
Alternatively, you can use a numpy masked array.
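Both suggestions can be sketched with plain matplotlib as follows. The xs, ys, and mean arrays here are made-up stand-in data, and the assumption is that Basemap's m.scatter accepts the same c argument as the regular Axes.scatter used in this sketch.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up stand-in data: 50 points, every fourth value set to zero.
rng = np.random.default_rng(0)
xs = rng.uniform(0, 10, 50)
ys = rng.uniform(0, 10, 50)
mean = rng.uniform(1, 5, 50)
mean[::4] = 0.0

# Option 1: replace zeros with NaN.
mean_nan = np.where(mean == 0, np.nan, mean)

# Option 2: mask the zeros instead of overwriting them.
mean_masked = np.ma.masked_equal(mean, 0)

fig, ax = plt.subplots()
# Points whose color value is masked (or NaN) are not drawn.
sc = ax.scatter(xs, ys, c=mean_masked)
fig.colorbar(sc, ax=ax, orientation="horizontal")
```

Passing mean_nan instead of mean_masked should give the same picture. The masked array has the advantage of leaving the original values intact.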

Fill area under curve in matplotlib python on log scale

I'm trying to fill the area under a curve with matplotlib. The script below works fine.
import matplotlib.pyplot as plt
from math import sqrt
x = range(100)
y = [sqrt(i) for i in x]
plt.plot(x,y,color='k',lw=2)
plt.fill_between(x,y,0,color='0.8')
plt.show()
However, if I set the y-scale to logarithmic (see below), it sometimes fills the area above the curve! Can anyone help me? I would like to fill the area between the curve and y = 0.
x = range(100)
y = [sqrt(i) for i in x]
plt.plot(x,y,color='k',lw=2)
plt.fill_between(x,y,0,color='0.8')
plt.yscale('log')
plt.show()
Thanks in advance!
With a logarithmic y-scale, fill_between(x, y, 0) tells matplotlib to fill the region between log(0) = -infinity and log(y). Naturally, it balks. You can avoid the problem by changing 0 to some small number like 1e-6.
As mentioned, 0 -> -inf in a log scale. Thus, any plotted value that was less than or equal to zero would be problematic (requiring an infinite ylim in log space). This problem exists independently of whether you are using fill_between() or not.
Fortunately, matplotlib provides a way to handle this nicely. In the default behavior, matplotlib masks every value less than or equal to zero. In your example, this means that your entire y=0 line is masked and excluded from the polygon defining the filled area. The result is that the polygon is simply closed by drawing a line from (100,10) down and leftward to (0,0). Another option is to clip the values. In this case, they are set to 1e-300 and are not consulted when determining the ylim of the plot. So to get your desired result, do the following:
plt.yscale('log', nonpositive='clip')  # this keyword was called nonposy before matplotlib 3.3
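The small-positive-baseline suggestion can also be written out in full. This is a minimal sketch: the variable name floor and its value 0.1 are arbitrary choices made here, and x starts at 1 so that the curve itself never touches zero.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(1, 100)  # start at 1 so that sqrt(x) stays positive
y = np.sqrt(x)
floor = 1e-1           # arbitrary positive stand-in for y = 0

fig, ax = plt.subplots()
ax.plot(x, y, color="k", lw=2)
# Fill down to a small positive value instead of 0, which has no
# finite position on a log axis.
ax.fill_between(x, y, floor, color="0.8")
ax.set_yscale("log")
ax.set_ylim(floor, None)  # pin the bottom of the view to the baseline
```

A smaller floor such as 1e-6 works the same way but stretches the axis over more decades, so the choice mostly depends on how much of the low range you want to see.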
