I am trying to add a small line outside of my axis range, which I want to use as a highly customized legend at a later stage. However, using axes.hlines changes the xlim of my axis, even though I specify transform = axes.transAxes. The xlim appears to be set such that the coordinates of the hlines are included in the data coordinate range, even though these coordinates are meant to be axes coordinates, not data coordinates.
Here comes a minimal working example:
import numpy as np
import matplotlib.pyplot as plt
x_data = np.random.rand(10)+10
y_data = np.random.rand(10)
fig, ax = plt.subplots()
ax.scatter(x_data,y_data)
ax.hlines(0.5,1.1,1.2, transform = ax.transAxes, clip_on = False)
results in xlims being changed by the ax.hlines command:
while with ax.hlines being commented out one gets:
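One possible workaround, sketched here under my own assumptions rather than taken from the post: draw the marker as a plain Line2D instead of going through ax.hlines. As far as I know, ax.add_line only extends the data limits when the line's transform contains ax.transData, so a line given purely in axes coordinates should leave xlim alone. The names legend_line, xlim_before and xlim_after are made up for this example.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

x_data = np.random.rand(10) + 10
y_data = np.random.rand(10)

fig, ax = plt.subplots()
ax.scatter(x_data, y_data)
xlim_before = ax.get_xlim()

# Line from (1.1, 0.5) to (1.2, 0.5) in axes coordinates, i.e. just to the
# right of the axes box. clip_on=False keeps it visible outside the axes.
legend_line = Line2D([1.1, 1.2], [0.5, 0.5], transform=ax.transAxes,
                     clip_on=False, color="k")
ax.add_line(legend_line)

xlim_after = ax.get_xlim()
```

Comparing xlim_before and xlim_after is a quick way to check that the limits still follow the scatter data only. Call plt.show() afterwards to display the figure.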
Please see code below. I am trying to get 2 exact same scatter plots, in ax2 I try to set the color after the scatter plot has been created. How can I achieve this?
I want this because in my interface I am trying to have the user (optionally) select data to color the scatterplot with. I could just redo the whole plot, but I am guessing that for a large number of data points it is better to add colors to the existing axis object. Is that correct?
import numpy as np
import matplotlib.pyplot as plt
fig, (ax1,ax2) = plt.subplots(1,2)
x = np.random.rand(100)
y = np.random.rand(100)
z = np.random.rand(100)
#in ax1, I set color using the 'c' argument
ax1.scatter(x,y, c=z)
sc = ax2.scatter(x,y)
#in ax2, I try to mimic the 'c' argument with set_color but it raises error
# sc.set_color(z)
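A minimal sketch of one way this could be done, assuming the default colormap is wanted in both plots: scatter returns a PathCollection, and its set_array method attaches the values to be color-mapped after the fact. The set_clim call is my addition so that the color range matches the data, as it would with the c argument.

```python
import numpy as np
import matplotlib.pyplot as plt

fig, (ax1, ax2) = plt.subplots(1, 2)
x = np.random.rand(100)
y = np.random.rand(100)
z = np.random.rand(100)

ax1.scatter(x, y, c=z)   # colors set at creation time
sc = ax2.scatter(x, y)   # no color data yet

# Attach the values to be color-mapped, then set the color limits to
# the range of those values.
sc.set_array(z)
sc.set_clim(z.min(), z.max())
```

Since only the existing collection is updated, the points themselves are not recreated, which is the part that matters for the interactive use case.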
I'm trying to get the functionality of fill_betweenx() without having to use the function itself, because it doesn't accept the interpolate parameter. I need the interpolate functionality that is supported by fill_between(), but for the filling to happen relative to the x axis. It sounds like the interpolate parameter will be supported for fill_betweenx() in matplotlib 2.1, but it would be great to have access to the functionality via a workaround in the meantime.
This is the line of code in question:
ax4.fill_betweenx(x,300,p, where=p>=150, interpolate=True, facecolor='White', lw=1, zorder=2)
Unfortunately this gives me AttributeError: Unknown property interpolate.
One lazy way to do it is to use the fill_between() function with inverted coordinates on a figure that you don't show (i.e. close the figure before using plt.show()), and then re-use the vertices of the PolyCollection that fill_between() returns on your actual plot. It's not perfect, but it works as a quick fix. Here is an example of what I'm talking about:
from matplotlib import pyplot as plt
from matplotlib.collections import PolyCollection
import numpy as np
fig, axes = plt.subplots(nrows = 2, ncols =2, figsize=(8,8))
#the data
x = np.linspace(0,np.pi/2,3)
y = np.sin(x)
#fill_between without interpolation
ax = axes[0,0]
ax.plot(x,y,'k')
ax.fill_between(x,0.5,y,where=y>0.25)
#fill_between with interpolation, keep the PolyCollection
ax = axes[0,1]
ax.plot(x,y,'k')
poly_col = ax.fill_between(x,0.5,y,where=y>0.25,interpolate=True)
#fill_betweenx -- no interpolation possible
ax = axes[1,0]
ax.plot(y,x,'k')
ax.fill_betweenx(x,0.5,y,where=y>0.25)
#faked fill_betweenx:
ax = axes[1,1]
ax.plot(y,x,'k')
#get the vertices from the saved PolyCollection, swap x- and y-values
v=poly_col.get_paths()[0].vertices
#convert to correct format
v2=list(zip(v[:,1],v[:,0]))
#and add to axes
ax.add_collection(PolyCollection([v2]))
#voila
plt.show()
The result of the code looks like this:
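For readers on a newer Matplotlib: the question expected interpolate to arrive in fill_betweenx with version 2.1, and current releases do accept it, so the detour via fill_between should not be needed there. A minimal sketch with the same toy data, assuming a Matplotlib version whose fill_betweenx has the interpolate parameter:

```python
import numpy as np
import matplotlib.pyplot as plt

# Same toy data as in the workaround; the curve is drawn with swapped axes.
x = np.linspace(0, np.pi / 2, 3)
y = np.sin(x)

fig, ax = plt.subplots()
ax.plot(y, x, 'k')
# Fill horizontally between the vertical line at 0.5 and the curve.
# interpolate=True extends the filled region to the point where they cross.
poly = ax.fill_betweenx(x, 0.5, y, where=y > 0.25, interpolate=True)
```

On older versions this call raises the AttributeError quoted in the question, in which case the vertex-swapping trick above is still the way to go.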
I'm trying to create a CDF but at the end of the graph, there is a vertical line, shown below:
I've read that this is because matplotlib uses the end of the bins to draw the vertical lines, which makes sense, so I added the following to my code:
bins = sorted(X) + [np.inf]
where X is the data set I'm using and set the bin size to this when plotting:
plt.hist(X, bins = bins, cumulative = True, histtype = 'step', color = 'b')
This does remove the line at the end and produce the desired effect, however when I normalise this graph now it produces an error:
ymin = max(ymin*0.9, minimum) if not input_empty else minimum
UnboundLocalError: local variable 'ymin' referenced before assignment
Is there any way to either normalise the data with
bins = sorted(X) + [np.inf]
in my code or is there another way to remove the line on the graph?
An alternative way to plot a CDF would be as follows (in my example, X is a bunch of samples drawn from the unit normal):
import numpy as np
import matplotlib.pyplot as plt
X = np.random.randn(10000)
n = np.arange(1,len(X)+1) / float(len(X))
Xs = np.sort(X)
fig, ax = plt.subplots()
ax.step(Xs,n)
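The same idea can be wrapped in a small function so it is easy to check on a tiny input. This is only a sketch; the name ecdf and its return values are my own choice, not something from a library.

```python
import numpy as np

def ecdf(samples):
    """Return the sorted samples and the cumulative fraction at each one."""
    xs = np.sort(np.asarray(samples, dtype=float))
    fractions = np.arange(1, len(xs) + 1) / len(xs)
    return xs, fractions

# Three samples: the cumulative fraction rises by 1/3 at each of them.
xs, fractions = ecdf([3.0, 1.0, 2.0])
```

The two arrays can then be passed to ax.step(xs, fractions) exactly as in the snippet above. Because no bins are involved, there is no closing vertical line to get rid of.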
I needed a solution where I would not need to alter the rest of my code (using plt.hist(...) or, with pandas, dataframe.plot.hist(...)) and that I could reuse easily many times in the same jupyter notebook.
I now use this little helper function to do so:
def fix_hist_step_vertical_line_at_end(ax):
    axpolygons = [poly for poly in ax.get_children() if isinstance(poly, mpl.patches.Polygon)]
    for poly in axpolygons:
        poly.set_xy(poly.get_xy()[:-1])
Which can be used like this (without pandas):
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
X = np.sort(np.random.randn(1000))
fig, ax = plt.subplots()
plt.hist(X, bins=100, cumulative=True, density=True, histtype='step')
fix_hist_step_vertical_line_at_end(ax)
Or like this (with pandas):
import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
df = pd.DataFrame(np.random.randn(1000))
fig, ax = plt.subplots()
ax = df.plot.hist(ax=ax, bins=100, cumulative=True, density=True, histtype='step', legend=False)
fix_hist_step_vertical_line_at_end(ax)
This works well even if you have multiple cumulative density histograms on the same axes.
Warning: this may not lead to the wanted results if your axes contain other patches falling under the mpl.patches.Polygon category. That was not my case so I prefer using this little helper function in my plots.
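If other Polygon patches on the axes are a concern, a possible variant is to trim only the patches that hist itself returns. This is a sketch under the assumption that you can call ax.hist directly and keep its return value. The names n_before and n_after are only there to show the effect.

```python
import numpy as np
import matplotlib.pyplot as plt

X = np.random.randn(1000)
fig, ax = plt.subplots()

# For histtype='step', hist returns the step outline as Polygon patches.
counts, edges, patches = ax.hist(X, bins=100, cumulative=True, density=True,
                                 histtype='step')

n_before = len(patches[0].get_xy())
for poly in patches:
    # Drop the last vertex, which brings the outline back down to zero.
    poly.set_xy(poly.get_xy()[:-1])
n_after = len(patches[0].get_xy())
```

Other polygons on the same axes are left untouched this way, at the price of having to hold on to the patches from each hist call.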
Assuming that your intentions are purely aesthetic, add a vertical line of the same color as your plot background:
ax.axvline(x = value, color = 'white', linewidth = 2)
Where "value" stands for the right extreme of the rightmost bin.
I have 3 vectors - x,y,vel each having some 8k values. I also have quite a few files containing these 3 vectors. All the files have different x,y,vel. I want to get multiple scatter plots with the following conditions:
Color coded according to the 3rd variable i.e vel.
Once the ranges have been set for the colors (for the data from the 1st file), they should remain constant for all the remaining files. I don't want the color coding to change dynamically with each new file.
Want to plot a colorbar.
I greatly appreciate all your thoughts!!
I have attached the code for a single file.
import numpy as np
import matplotlib.pyplot as plt
# Create Map
cm = plt.cm.get_cmap('RdYlBu')
x,y,vel = np.loadtxt('finaldata_temp.txt', skiprows=0, unpack=True)
vel = [cm(float(i)/(8000)) for i in xrange(8000)] # 8000 is the no. of values in each of x,y,vel vectors.
# 2D Plot
plt.scatter(x, y, s=27, c=vel, marker='o')
plt.axis('equal')
plt.savefig('testfig.png', dpi=300)
plt.show()
quit()
You will have to iterate over all your data files to get the maximum value for vel. I have added a few lines of code (that need to be adjusted to fit your case) that will do that.
Therefore, your colorbar line has been changed to use the max_vel, allowing you to get rid of that code using the fixed value of 8000.
Additionally, I took the liberty to remove the black edges around the points, because I find that they 'obfuscate' the color of the point.
Lastly, I have adjusted your plot code to use an axis object, which is required to have a colorbar.
import numpy as np
import matplotlib.pyplot as plt
# This is needed to iterate over your data files
import glob
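# Loop over all your data files to get the maximum value for 'vel'.
# You will have to adjust this for your code: replace '<your files>' with a
# pattern that matches your own data files.
max_vel = 0
for fname in glob.glob('<your files>'):
    # Read the third column of each file and keep the largest value seen
    _, _, file_vel = np.loadtxt(fname, unpack=True)
    max_vel = max(max_vel, file_vel.max())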
# Create Map
cm = plt.get_cmap('RdYlBu')
x,y,vel = np.loadtxt('finaldata_temp.txt', skiprows=0, unpack=True)
# Plot the data
fig=plt.figure()
fig.patch.set_facecolor('white')
# Here we switch to an axis object
# Additionally, you can plot several of your files in the same figure using
# the subplot option.
ax=fig.add_subplot(111)
s = ax.scatter(x,y,c=vel,cmap=cm,edgecolor='none')
# Here we assign the color bar to the axis object
cb = plt.colorbar(mappable=s,ax=ax)
# Here we set the range of the color bar based on the maximum observed value
# NOTE: The limits are set on the scatter mappable (s), since current
# Matplotlib versions no longer have a set_clim method on the colorbar.
# For a colorbar that is independent of any plotted data, see
# ColorbarBase (second code snippet).
s.set_clim(0,max_vel)
cb.set_label('Value of \'vel\'')
plt.show()
Snippet, demonstrating ColorbarBase
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
cm = plt.get_cmap('RdYlBu')
x = [1,5,10]
y = [2,6,9]
vel = [7,2,1]
# Plot the data
fig=plt.figure()
fig.patch.set_facecolor('white')
ax=fig.add_subplot(111)
# The norm fixes the color range; it is shared by the scatter and the colorbar
norm = mpl.colors.Normalize(vmin=0, vmax=10)
s = ax.scatter(x,y,c=vel,cmap=cm,norm=norm,edgecolor='none')
ax1 = fig.add_axes([0.95, 0.1, 0.01, 0.8])
cb = mpl.colorbar.ColorbarBase(ax1,norm=norm,cmap=cm,orientation='vertical')
cb.set_label('Value of \'vel\'')
plt.show()
This produces the following plot
For more examples of what you can do with the colorbar, specifically the more flexible ColorbarBase, I would suggest that you check the documentation -> http://matplotlib.org/examples/api/colorbar_only.html
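To make the "same colors for every file" part concrete, here is a rough sketch that shares one Normalize object between several scatter plots. The random arrays are hypothetical stand-ins for the (x, y, vel) columns of the data files, and taking the overall min and max of vel is my assumption; you could just as well fix the range from the first file.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

# Hypothetical stand-ins for the (x, y, vel) columns of two data files.
rng = np.random.default_rng(0)
datasets = [rng.random((3, 50)) * scale for scale in (5.0, 10.0)]

# One color scale for everything: the overall min and max of vel.
vmin = min(vel.min() for _, _, vel in datasets)
vmax = max(vel.max() for _, _, vel in datasets)
norm = Normalize(vmin=vmin, vmax=vmax)

figures = []
for x, y, vel in datasets:
    fig, ax = plt.subplots()
    # Passing the same norm makes equal vel values get equal colors.
    sc = ax.scatter(x, y, c=vel, cmap='RdYlBu', norm=norm, edgecolor='none')
    fig.colorbar(sc, ax=ax, label="Value of 'vel'")
    figures.append((fig, sc))
```

Because every scatter uses the same norm, the colorbars of all figures show the same range, regardless of the values in the individual file.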
I'm trying to do a heat map over a shape file in python. I need to make quite a few of these so don't want to read in the .shp every time.
Instead, I thought I could create a LineCollection instance of the map boundaries and overlay the two images. The problem is that I can't seem to get the two to line up correctly.
Here is the code, where linecol is the LineCollection object.
fig = plt.figure()
ax = fig.add_subplot(111)
ax.contourf(xi,yi,zi)
ax.add_collection(linecol, autolim = False)
plt.show()
Is there an easy way to fix the limits of linecol to match those of the other plot? I've had a play with set_xlim and transforms.Bbox, but can't seem to manage it.
Thank you very much for your help!
Transforms are tricky because of the various coordinate systems involved. See http://matplotlib.sourceforge.net/users/transforms_tutorial.html.
I managed to scale a LineCollection to the appropriate size like this. The key was to realize that I needed to add + ax.transData to the new transform I set on the LineCollection. (When you don't set any transform on an artist object, ax.transData is the default. It converts data coordinates into display coordinates.)
from matplotlib import cm
import matplotlib.pyplot as plt
import matplotlib.collections as mc
import matplotlib.transforms as tx
import numpy as np
fig = plt.figure()
# Heat map spans 1 x 1.
ax = fig.add_subplot(111)
xs = ys = np.arange(0, 1.01, 0.01)
zs = np.random.random((101,101))
ax.contourf(xs, ys, zs, cmap=cm.autumn)
lines = mc.LineCollection([[(5,1), (9,5), (5,9), (1,5), (5,1)]])
# Shape spans 10 x 10. Resize it to 1 x 1 before applying the transform from
# data coords to display coords.
trans = tx.Affine2D().scale(0.1) + ax.transData
lines.set_transform(trans)
ax.add_collection(lines)
plt.show()
(Output here: http://i.stack.imgur.com/hDNN8.png Not enough reputation to post inline.)
It should be easy to modify this if you need the shape translated or scaled unequally on x and y.
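As a sketch of that modification: Affine2D.scale takes separate x and y factors, and translate can be chained after it. The particular numbers and the name shape_to_data below are made up for illustration.

```python
import matplotlib.pyplot as plt
import matplotlib.collections as mc
import matplotlib.transforms as tx

fig, ax = plt.subplots()
lines = mc.LineCollection([[(5, 1), (9, 5), (5, 9), (1, 5), (5, 1)]])

# Scale x by 0.1 and y by 0.05, then shift right by 0.2 (in data units),
# before handing over to the usual data -> display transform.
shape_to_data = tx.Affine2D().scale(0.1, 0.05).translate(0.2, 0.0)
lines.set_transform(shape_to_data + ax.transData)
ax.add_collection(lines)

# Where the shape's corner (9, 5) ends up in data coordinates.
corner = shape_to_data.transform([(9, 5)])[0]
```

The order matters: operations chained on Affine2D are applied left to right, so here the shape is scaled first and translated afterwards.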