getting x,y from a scatter plot with multiple datasets? - python

I have a scatter plot that is composed of different calls for scatter:
import matplotlib.pyplot as plt
import numpy as np
def onpick3(event):
    index = event.ind
    print('--------------')
    print(index)
    artist = event.artist
    print(artist)
fig_handle = plt.figure()
x,y = np.random.rand(10),np.random.rand(10)
x1,y1 = np.random.rand(10),np.random.rand(10)
axes_size = 0.1,0.1,0.9,0.9
ax = fig_handle.add_axes(axes_size)
p = ax.scatter (x,y, marker='*', s=60, color='r', picker=True, lw=2)
p1 = ax.scatter (x1,y1, marker='*', s=60, color='b', picker=True, lw=2)
fig_handle.canvas.mpl_connect('pick_event', onpick3)
plt.show()
I'd like the points to be clickable, and get the x,y of the selected indexes.
However, since scatter is called more than once, I get the same indexes twice, so I can't use x[index] inside the onpick3 method.
Is there a straightforward way to get the points?
It seems that event.artist gives back the same PathCollection that is given back from scatter (p and p1 in this case).
But I couldn't find any way to use it to extract the x,y of the selected indexes
I tried using event.artist.get_paths(), but it doesn't seem to give back all the scatter points, only the one that I clicked on. So I'm really not sure what event.artist and event.artist.get_paths() are giving back.
EDIT
It seems that event.artist._offsets gives an array with the relevant offsets, but for some reason when trying to use event.artist.offsets I get
AttributeError: 'PathCollection' object has no attribute 'offsets'
(although if I understand the docs, it should be there)

To get the x, y coordinates for the collection that scatter returns, use event.artist.get_offsets() (Matplotlib has explicit getters and setters for mostly historical reasons. All get_offsets does is return self._offsets, but the public interface is through the "getter".).
So, to complete your example:
import matplotlib.pyplot as plt
import numpy as np
def onpick3(event):
    index = event.ind
    xy = event.artist.get_offsets()
    print('--------------')
    print(xy[index])
fig, ax = plt.subplots()
x, y = np.random.random((2, 10))
x1, y1 = np.random.random((2, 10))
p = ax.scatter(x, y, marker='*', s=60, color='r', picker=True)
p1 = ax.scatter(x1, y1, marker='*', s=60, color='b', picker=True)
fig.canvas.mpl_connect('pick_event', onpick3)
plt.show()
However, if you're not varying things by a 3rd or 4th variable, you may not want to use scatter to plot points. Use plot instead. scatter returns a collection that's much more difficult to work with than the Line2D that plot returns. (If you do go the route of using plot, you'd use x, y = artist.get_data().)
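As a rough sketch of the plot-based variant mentioned above: the pick handler can read the data back with get_data(). The helper name picked_points is made up for illustration, and plt.show() is left out so the snippet also runs without a display.

```python
import matplotlib.pyplot as plt
import numpy as np


def picked_points(artist, ind):
    """Return the (x, y) pairs of a Line2D at the picked indices."""
    x, y = artist.get_data()
    return np.column_stack([np.asarray(x)[ind], np.asarray(y)[ind]])


def on_pick(event):
    # event.artist is the Line2D that was clicked, so the indices in
    # event.ind always refer to that line's own data.
    print(picked_points(event.artist, event.ind))


fig, ax = plt.subplots()
x, y = np.random.random((2, 10))
x1, y1 = np.random.random((2, 10))

# linestyle='' draws markers only; pickradius is the pick tolerance in points.
(line,) = ax.plot(x, y, marker='*', ms=10, color='r', linestyle='',
                  picker=True, pickradius=5)
(line1,) = ax.plot(x1, y1, marker='*', ms=10, color='b', linestyle='',
                   picker=True, pickradius=5)

fig.canvas.mpl_connect('pick_event', on_pick)
```

Add plt.show() at the end to try it interactively. Because each plot call returns its own Line2D, the handler never needs to know which dataset was clicked.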
Finally, not to plug my own project too much, but you might find mpldatacursor useful. It abstracts away a lot of what you're doing here.
If you decide to go that route, your code would look similar to:

Related

How to get the list of matplotlib.lines.Line2D objects from plot?

Currently I'm doing this:
my_list = []
for x,y in zip(xs, ys):
    my_list.extend(ax.plot(x, y, linestyle = '', marker = 'o'))
because I want to access each line2d object separately. Is there a way to call ax.plot or ax.scatter once and somehow get the list of all line2d objects?
You are trying to access the container that holds the lines in a plot. A plot has many containers, such as the figure and the axes. This is actually easy to do: just plot the lines, and then access them with ax.lines. The latter will be a list of the Line2D objects.
Here is a simple example that demonstrates this container functionality by accessing the lines in a plot and using the matplotlib.lines.Line2D.set_color() function.
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots(figsize=(12,8)) # define fig and axes
x = np.linspace(-20,20) # make some x values between -20 and 20
f_x = np.sin(x/4) # make some y values that are f(x) = sin(x/4)
g_x = np.cos(x/4) # make some y values that are g(x) = cos(x/4)
ax.plot(x, f_x, linewidth=2.5, label='f(x)') # plot f(x)
ax.plot(x, g_x, linewidth=2.5, label='g(x)') # plot g(x)
ax.lines[0].set_color('blue')
ax.lines[1].set_color('red')
#add titles, x labels, y labels, legend
title = ax.set_title('Example plot', fontsize=14, fontweight='bold')
xlabels = ax.set_xlabel('x_values')
ylabels = ax.set_ylabel('y_values')
legend = ax.legend(fontsize=14)
You can learn more about containers by reviewing the Artist Tutorial in the matplotlib documentation.
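For the "call ax.plot once" part of the question, here is a minimal sketch. It assumes the datasets can be stacked as the columns of 2D arrays (xs and ys below are made-up data; transpose them if each dataset is a row). When plot is given 2D arrays, it draws one line per column and returns all of them in a list.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: 4 points in each of 3 datasets, one dataset per column.
xs = np.random.random((4, 3))
ys = np.random.random((4, 3))

fig, ax = plt.subplots()

# A single call; the return value is a list with one Line2D per column.
lines = ax.plot(xs, ys, linestyle='', marker='o')

# Each Line2D can then be modified on its own.
lines[0].set_color('blue')
lines[1].set_color('red')
```

The same objects are also reachable afterwards through ax.lines, as described above.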

Colormap in Matplotlib without using the "scatter" function

I have constructed a scatter plot with x and y positions. Now I have an array with a third variable, density, and I want to assign a color for each point in my scatter plot depending on its density value. I know how to do it using the "scatter" task of matplotlib, for example:
x = [1,2,3,4]
y = [5,3,7,1]
density = [1,2,3,4]
map = plt.scatter(x, y, c=density)
colorbar = plt.colorbar(map)
Now, I would like to do the same using the "plot" function instead, something like:
map = plt.plot(x,y, '.', c=t)
I am trying to do an animation of a galaxy merger, and assign each particle a color depending on the density of that region. So far the code only works with the "plot" function, so I need to implement it that way, but all the examples I've found use the former way.
Thanks in advance!
First off, @tcaswell is right. You're probably wanting to animate a scatter plot. Using lots of plot calls for this will result in much worse performance than changing the collection that scatter returns.
However, here's how you'd go about using multiple plot calls to do this:
import numpy as np
import matplotlib.pyplot as plt
xdata, ydata, zdata = np.random.random((3, 10))
cmap = plt.cm.gist_earth
norm = plt.Normalize(zdata.min(), zdata.max())
fig, ax = plt.subplots()
for x, y, z in zip(xdata, ydata, zdata):
    ax.plot([x], [y], marker='o', ms=20, color=cmap(norm(z)))
sm = plt.cm.ScalarMappable(norm, cmap)
sm.set_array(zdata)
fig.colorbar(sm)
plt.show()
Just for comparison, here's the exact same thing using scatter:
import numpy as np
import matplotlib.pyplot as plt
xdata, ydata, zdata = np.random.random((3, 10))
fig, ax = plt.subplots()
scat = ax.scatter(xdata, ydata, c=zdata, s=200, marker='o')
fig.colorbar(scat)
plt.show()
If you wanted to change the position of the markers in the scatter plot, you'd use scat.set_offsets(xydata), where xydata is an Nx2 array-like sequence.
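As a hedged sketch of what one animation step might look like with this approach: positions are updated with set_offsets and the color values with set_array. The random "motion" and the fixed vmin/vmax below are assumptions for illustration only.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
xy = rng.random((10, 2))       # N x 2 array of positions
density = rng.random(10)       # third variable mapped to color

fig, ax = plt.subplots()
# Fixing vmin/vmax keeps the color scale constant between frames.
scat = ax.scatter(xy[:, 0], xy[:, 1], c=density, s=200, vmin=0, vmax=1)
fig.colorbar(scat)

# One hypothetical animation step: move the points, recompute the densities.
new_xy = np.clip(xy + rng.normal(scale=0.02, size=xy.shape), 0, 1)
new_density = rng.random(10)

scat.set_offsets(new_xy)       # update positions
scat.set_array(new_density)    # update the values that drive the colors
fig.canvas.draw_idle()         # request a redraw
```

Only the existing collection is modified, so no new artists are created per frame.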

Adding y=x to a matplotlib scatter plot if I haven't kept track of all the data points that went in

Here's some code that does scatter plot of a number of different series using matplotlib and then adds the line y=x:
import numpy as np, matplotlib.pyplot as plt, matplotlib.cm as cm, pylab
nseries = 10
colors = cm.rainbow(np.linspace(0, 1, nseries))
all_x = []
all_y = []
for i in range(nseries):
    x = np.random.random(12)+i/10.0
    y = np.random.random(12)+i/5.0
    plt.scatter(x, y, color=colors[i])
    all_x.extend(x)
    all_y.extend(y)
# Could I somehow do the next part (add identity_line) if I haven't been keeping track of all the x and y values I've seen?
identity_line = np.linspace(max(min(all_x), min(all_y)),
                            min(max(all_x), max(all_y)))
plt.plot(identity_line, identity_line, color="black", linestyle="dashed", linewidth=3.0)
plt.show()
In order to achieve this I've had to keep track of all the x and y values that went into the scatter plot so that I know where identity_line should start and end. Is there a way I can get y=x to show up even if I don't have a list of all the points that I plotted? I would think that something in matplotlib can give me a list of all the points after the fact, but I haven't been able to figure out how to get that list.
You don't need to know anything about your data per se. You can get away with what your matplotlib Axes object will tell you about the data.
See below:
import numpy as np
import matplotlib.pyplot as plt
# random data
N = 37
x = np.random.normal(loc=3.5, scale=1.25, size=N)
y = np.random.normal(loc=3.4, scale=1.5, size=N)
c = x**2 + y**2
# now sort it just to make it look like it's related
x.sort()
y.sort()
fig, ax = plt.subplots()
ax.scatter(x, y, s=25, c=c, cmap=plt.cm.coolwarm, zorder=10)
Here's the good part:
lims = [
    np.min([ax.get_xlim(), ax.get_ylim()]),  # min of both axes
    np.max([ax.get_xlim(), ax.get_ylim()]),  # max of both axes
]
# now plot both limits against each other
ax.plot(lims, lims, 'k-', alpha=0.75, zorder=0)
ax.set_aspect('equal')
ax.set_xlim(lims)
ax.set_ylim(lims)
fig.savefig('/Users/paul/Desktop/so.png', dpi=300)
Et voilà
In one line:
ax.plot([0,1],[0,1], transform=ax.transAxes)
No need to modify the xlim or ylim.
Starting with matplotlib 3.3 this has been made very simple with the axline method which only needs a point and a slope. To plot x=y:
ax.axline((0, 0), slope=1)
You don't need to look at your data to use this because the point you specify (i.e. here (0,0)) doesn't actually need to be in your data or plotting range.
If you set scalex and scaley to False, it saves a bit of bookkeeping. This is what I have been using lately to overlay y=x:
xpoints = ypoints = plt.xlim()
plt.plot(xpoints, ypoints, linestyle='--', color='k', lw=3, scalex=False, scaley=False)
or if you've got an axis:
xpoints = ypoints = ax.get_xlim()
ax.plot(xpoints, ypoints, linestyle='--', color='k', lw=3, scalex=False, scaley=False)
Of course, this won't give you a square aspect ratio. If you care about that, go with Paul H's solution.
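For the literal question of getting the plotted points back after the fact, here is a small sketch that assumes everything on the axes was drawn with scatter. Each scatter call adds one collection to ax.collections, and get_offsets() returns its points. The three random series are stand-ins for the real data.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
fig, ax = plt.subplots()
for i in range(3):
    ax.scatter(rng.random(12) + i / 10.0, rng.random(12) + i / 5.0)

# Recover every plotted point from the collections stored on the axes.
pts = np.vstack([np.asarray(c.get_offsets()) for c in ax.collections])

# Same start and end points as in the question, without bookkeeping lists.
lo = max(pts[:, 0].min(), pts[:, 1].min())
hi = min(pts[:, 0].max(), pts[:, 1].max())
ax.plot([lo, hi], [lo, hi], color='black', linestyle='dashed', linewidth=3.0)
```

This only sees collections; points drawn with plot would have to be read from ax.lines via get_data() instead.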

Add second axis to polar plot

I try to plot two polar plots in one figure. See code below:
fig = super(PlotWindPowerDensity, self).get_figure()
rect = [0.1, 0.1, 0.8, 0.8]
ax = WindSpeedDirectionAxes(fig, rect)
self.values_dict = collections.OrderedDict(sorted(self.values_dict.items()))
values = self.values_dict.items()
di, wpd = zip(*values)
wpd = np.array(wpd).astype(np.double)
wpdmask = np.isfinite(wpd)
theta = self.radar_factory(int(len(wpd)))
# spider plot
ax.plot(theta[wpdmask], wpd[wpdmask], color = 'b', alpha = 0.5)
ax.fill(theta[wpdmask], wpd[wpdmask], facecolor = 'b', alpha = 0.5)
# bar plot
ax.plot_bar(table=self.table, sectors=self.sectors, speedbins=self.wpdbins, option='wind_power_density', colorfn=get_sequential_colors)
fig.add_axes(ax)
return fig
The length of the bar is the data base (how many sampling points for this sector). The colors of the bars show the frequency of certain value bins (eg. 2.5-5 m/s) in the correspondent sector (blue: low, red: high). The blue spider plot shows the mean value for each sector.
In the shown figure, the values of each plot are similar, but this is rare. I need to assign the second plot to another axis and show this axis in another direction.
EDIT:
After Joe's nice answer, I get the result shown in the figure.
That's almost everything I wanted to achieve, but there are some points I wasn't able to figure out.
The plot is made for dynamically changing data bases, so I need a dynamic way to get the same locations for the circles. So far I solve it with:
start, end = ax2.get_ylim()
ax2.yaxis.set_ticks(np.arange(0, end, end / len(ax.yaxis.get_ticklocs())))
That is, for the second axis I alter the ticks so that the tick locations match those of the first axis.
In most cases I get some decimal places, which I don't want because they clutter the plot. Is there a smarter way to solve this problem?
The yticks (the radial ones) range from 0 to the next-to-last circle. How can I make the values range from the first circle to the very last (the border), as they do for the first axis?
So, as I understand it, you want to display data with very different magnitudes on the same polar plot. Basically you're asking how to do something similar to twinx for polar axes.
As an example to illustrate the problem, it would be nice to display the green series on the plot below at a different scale than the blue series, while keeping them on the same polar axes for easy comparison:
import numpy as np
import matplotlib.pyplot as plt
numpoints = 30
theta = np.linspace(0, 2*np.pi, numpoints)
r1 = np.random.random(numpoints)
r2 = 5 * np.random.random(numpoints)
params = dict(projection='polar', theta_direction=-1, theta_offset=np.pi/2)
fig, ax = plt.subplots(subplot_kw=params)
ax.fill_between(theta, r2, color='blue', alpha=0.5)
ax.fill_between(theta, r1, color='green', alpha=0.5)
plt.show()
However, ax.twinx() doesn't work for polar plots.
It is possible to work around this, but it's not very straight-forward. Here's an example:
import numpy as np
import matplotlib.pyplot as plt
def main():
    numpoints = 30
    theta = np.linspace(0, 2*np.pi, numpoints)
    r1 = np.random.random(numpoints)
    r2 = 5 * np.random.random(numpoints)
    params = dict(projection='polar', theta_direction=-1, theta_offset=np.pi/2)
    fig, ax = plt.subplots(subplot_kw=params)
    ax2 = polar_twin(ax)
    ax.fill_between(theta, r2, color='blue', alpha=0.5)
    ax2.fill_between(theta, r1, color='green', alpha=0.5)
    plt.show()

def polar_twin(ax):
    ax2 = ax.figure.add_axes(ax.get_position(), projection='polar',
                             label='twin', frameon=False,
                             theta_direction=ax.get_theta_direction(),
                             theta_offset=ax.get_theta_offset())
    ax2.xaxis.set_visible(False)
    # There should be a method for this, but there isn't... Pull request?
    ax2._r_label_position._t = (22.5 + 180, 0.0)
    ax2._r_label_position.invalidate()
    # Ensure that original axes tick labels are on top of plots in twinned axes
    for label in ax.get_yticklabels():
        ax.figure.texts.append(label)
    return ax2

main()
That does what we want, but it looks fairly bad at first. One improvement would be to color the tick labels to correspond to what we're plotting:
plt.setp(ax2.get_yticklabels(), color='darkgreen')
plt.setp(ax.get_yticklabels(), color='darkblue')
However, we still have the double-grids, which are rather confusing. One easy way around this is to manually set the r-limits (and/or r-ticks) such that the grids will fall on top of each other. Alternately, you could write a custom locator to do this automatically. Let's stick with the simple approach here:
ax.set_rlim([0, 5])
ax2.set_rlim([0, 1])
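If the limits are not known in advance, one way to sketch the "grids fall on top of each other" idea is to rescale the first axes' radial ticks into the second axes' range. The helper name matched_rticks is hypothetical, and the example uses two side-by-side polar axes rather than the twinned pair so that it stays self-contained.

```python
import numpy as np
import matplotlib.pyplot as plt


def matched_rticks(ax, ax2):
    """Place ax2's radial ticks at the same relative radii as ax's ticks."""
    r_min, r_max = ax.get_ylim()
    r2_min, r2_max = ax2.get_ylim()
    # Position of each tick of ax as a fraction of its radial range.
    frac = (np.asarray(ax.get_yticks()) - r_min) / (r_max - r_min)
    frac = frac[(frac >= 0) & (frac <= 1)]
    ticks2 = r2_min + frac * (r2_max - r2_min)
    ax2.set_yticks(ticks2)
    return ticks2


fig = plt.figure()
ax = fig.add_subplot(1, 2, 1, projection='polar')
ax2 = fig.add_subplot(1, 2, 2, projection='polar')

ax.set_ylim(0, 5)
ax.set_yticks([1, 2, 3, 4, 5])
ax2.set_ylim(0, 1)

ticks2 = matched_rticks(ax, ax2)
```

The circles then coincide by construction, but the resulting tick values are only "round" if the two ranges are chosen to be compatible.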
Caveat: Because shared axes don't work for polar plots, the implementation I have above will have problems with anything that changes the position of the original axes. For example, adding a colorbar to the figure will cause all sorts of problems. It's possible to work around this, but I've left that part out. If you need it, let me know, and I'll add an example.
At any rate, here's the full, stand-alone code to generate the final figure:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(1977)
def main():
    numpoints = 30
    theta = np.linspace(0, 2*np.pi, numpoints)
    r1 = np.random.random(numpoints)
    r2 = 5 * np.random.random(numpoints)
    params = dict(projection='polar', theta_direction=-1, theta_offset=np.pi/2)
    fig, ax = plt.subplots(subplot_kw=params)
    ax2 = polar_twin(ax)
    ax.fill_between(theta, r2, color='blue', alpha=0.5)
    ax2.fill_between(theta, r1, color='green', alpha=0.5)
    plt.setp(ax2.get_yticklabels(), color='darkgreen')
    plt.setp(ax.get_yticklabels(), color='darkblue')
    ax.set_ylim([0, 5])
    ax2.set_ylim([0, 1])
    plt.show()

def polar_twin(ax):
    ax2 = ax.figure.add_axes(ax.get_position(), projection='polar',
                             label='twin', frameon=False,
                             theta_direction=ax.get_theta_direction(),
                             theta_offset=ax.get_theta_offset())
    ax2.xaxis.set_visible(False)
    # There should be a method for this, but there isn't... Pull request?
    ax2._r_label_position._t = (22.5 + 180, 0.0)
    ax2._r_label_position.invalidate()
    # Bit of a hack to ensure that the original axes tick labels are on top of
    # whatever is plotted in the twinned axes. Tick labels will be drawn twice.
    for label in ax.get_yticklabels():
        ax.figure.texts.append(label)
    return ax2

if __name__ == '__main__':
    main()
Just to add onto @JoeKington's (great) answer, I found that the "hack to ensure that the original axes tick labels are on top of whatever is plotted in the twinned axes" didn't work for me, so as an alternative I've used:
from matplotlib.ticker import MaxNLocator
#Match the tick point locations by setting the same number of ticks in the
# 2nd axis as the first
ax2.yaxis.set_major_locator(MaxNLocator(nbins=len(ax1.get_yticks())))
#Set the last tick as the plot limit
ax2.set_ylim(0, ax2.get_yticks()[-1])
#Remove the tick label at zero
ax2.yaxis.get_major_ticks()[0].label1.set_visible(False)

How do I tell Matplotlib to create a second (new) plot, then later plot on the old one?

I want to plot data, then create a new figure and plot data2, and finally come back to the original plot and plot data3, kinda like this:
import numpy as np
import matplotlib.pyplot as plt
x = np.arange(5)
y = np.exp(x)
plt.figure()
plt.plot(x, y)
z = np.sin(x)
plt.figure()
plt.plot(x, z)
w = np.cos(x)
plt.figure("""first figure""") # Here's the part I need
plt.plot(x, w)
FYI How do I tell matplotlib that I am done with a plot? does something similar, but not quite! It doesn't let me get access to that original plot.
If you find yourself doing things like this regularly it may be worth investigating the object-oriented interface to matplotlib. In your case:
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(5)
y = np.exp(x)
fig1, ax1 = plt.subplots()
ax1.plot(x, y)
ax1.set_title("Axis 1 title")
ax1.set_xlabel("X-label for axis 1")
z = np.sin(x)
fig2, (ax2, ax3) = plt.subplots(nrows=2, ncols=1) # two axes on figure
ax2.plot(x, z)
ax3.plot(x, -z)
w = np.cos(x)
ax1.plot(x, w) # can continue plotting on the first axis
It is a little more verbose but it's much clearer and easier to keep track of, especially with several figures each with multiple subplots.
When you call figure, simply number the plot.
x = np.arange(5)
y = np.exp(x)
plt.figure(0)
plt.plot(x, y)
z = np.sin(x)
plt.figure(1)
plt.plot(x, z)
w = np.cos(x)
plt.figure(0) # Here's the part I need
plt.plot(x, w)
Edit: Note that you can number the plots however you want (here, starting from 0) but if you don't provide figure with a number at all when you create a new one, the automatic numbering will start at 1 ("Matlab Style" according to the docs).
However, numbering starts at 1, so:
x = np.arange(5)
y = np.exp(x)
plt.figure(1)
plt.plot(x, y)
z = np.sin(x)
plt.figure(2)
plt.plot(x, z)
w = np.cos(x)
plt.figure(1) # Here's the part I need, but numbering starts at 1!
plt.plot(x, w)
Also, if you have multiple axes on a figure, such as subplots, use plt.sca(h), where h is the handle of the desired axes object, to focus on that axes. (Older Matplotlib releases also accepted the axes(h) command for this.)
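Since the question wanted to refer to the figure by something like "first figure", it may help that plt.figure also accepts a string label instead of a number. A minimal sketch (the labels "first" and "second" are arbitrary):

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(5)

plt.figure("first")            # creates a figure labelled "first"
plt.plot(x, np.exp(x))

plt.figure("second")           # creates a second, separate figure
plt.plot(x, np.sin(x))

plt.figure("first")            # re-selects the existing "first" figure
plt.plot(x, np.cos(x))         # drawn on the same axes as exp(x)
```

Calling plt.figure with a label (or number) that already exists makes that figure current instead of creating a new one.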
(don't have comment privileges yet, sorry for new answer!)
The accepted answer here says to use the object-oriented interface (matplotlib), but the answer itself incorporates some of the MATLAB-style interface (matplotlib.pyplot).
It is possible to use solely the OOP method, if you like that sort of thing:
import numpy as np
import matplotlib.figure  # makes sure the figure submodule is loaded
x = np.arange(5)
y = np.exp(x)
first_figure = matplotlib.figure.Figure()
first_figure_axis = first_figure.add_subplot()
first_figure_axis.plot(x, y)
z = np.sin(x)
second_figure = matplotlib.figure.Figure()
second_figure_axis = second_figure.add_subplot()
second_figure_axis.plot(x, z)
w = np.cos(x)
first_figure_axis.plot(x, w)
display(first_figure) # Jupyter
display(second_figure)
This gives the user manual control over the figures, and avoids problems associated with pyplot's internal state, which tracks only a single current figure.
An easy way to plot separate frame for each iteration could be:
import matplotlib.pyplot as plt
for grp in list_groups:
    plt.figure()
    plt.plot(grp)
plt.show()
Then python will plot different frames.
One way I found after some struggling is to create a function that takes a data_plot matrix, a file name, and an order as parameters. It creates boxplots from the given data in the figure numbered by order (different orders = different figures) and saves the figure under the given file_name.
def plotFigure(data_plot,file_name,order):
    fig = plt.figure(order, figsize=(9, 6))
    ax = fig.add_subplot(111)
    bp = ax.boxplot(data_plot)
    fig.savefig(file_name, bbox_inches='tight')
    plt.close()
