matplotlib - updating existing plots with points - python

I have a function wrapper for making a plot in matplotlib. I want to know how best to return the figure handle from inside the function. I want to use the figure handle to update the plot by adding more points to it. The size of each point should depend on the value of the data point: the bigger the value, the bigger the point.

One common way is to return an Axes object from your function. You can do additional plotting directly from the Axes.
You don't say whether your function is using the pyplot state machine or bare-bones Matplotlib, but here's an example of the former:
import matplotlib.pyplot as plt
x = range(3)
y1 = [2, 1, 3]
y2 = [3, 2, 1]
def plot_data(x, y):
    """Plots x, y. Returns the Axes."""
    plt.plot(x, y, '-.k')
    return plt.gca()
ax = plot_data(x, y1)
ax.scatter(x, y2, s=y2)
Here we also use the s= argument to specify the size of each point. Matplotlib interprets these values as marker areas in points squared, so you may have to multiply by some constant to scale them to suit your aesthetics.
Note that in addition to returning the Axes, it is sometimes useful to have your plotting function take an existing Axes as an argument.
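A minimal sketch of that pattern follows. The helper name plot_points and the scale factor of 40 are made up for illustration and are not part of Matplotlib or of the question.

```python
import matplotlib.pyplot as plt
import numpy as np

def plot_points(x, y, ax=None, scale=40):
    """Scatter x, y with marker area proportional to y. Returns the Axes.

    `plot_points` and `scale` are illustrative names, not Matplotlib API.
    """
    if ax is None:
        # No Axes supplied: create a new figure and Axes
        fig, ax = plt.subplots()
    # s is an area in points**2, so scale the raw values to something visible
    sizes = scale * np.asarray(y, dtype=float)
    ax.scatter(x, y, s=sizes)
    return ax

ax = plot_points([0, 1, 2], [2, 1, 3])         # first call creates the plot
ax = plot_points([0, 1, 2], [3, 2, 1], ax=ax)  # later calls add points to it
fig = ax.figure                                # the Figure is reachable from the Axes
```

Because the Axes keeps a reference to its Figure, returning the Axes alone is usually enough. Call plt.show() or fig.savefig(...) when you want to see the result.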

Related

Swap axis for a linspace plot

I have a function together with a histogram, plotted like this:
import matplotlib.pyplot as plt
import numpy as np
lin = np.linspace(min(foo), max(foo), len(foo))
plt.plot(lin, bar)
plt.hist(bar, density=True, bins=100, histtype='stepfilled', alpha=0.2)
plt.show()
Where foo and bar are simple arrays.
However, I want the whole thing drawn vertically. I could add orientation='horizontal' to the histogram, but that would not change the function (and from what I have seen, there is nothing similar for a plot; obviously it wouldn't be a function then, but a curve). Otherwise, I could add plt.gca().invert_yaxis() somewhere, but the same problem remains: plot is used for functions, so swapping it does... well, that:
So the only way I have now is to manually rotate the whole original picture by 90 degrees, but then the axes are rotated too and are no longer on the left and bottom (obviously).
So, do you have another idea? Maybe I should try something other than plt.plot?
EDIT: In the end, I want something like the image below, but with the axes done right.
If you have a plot of y vs x, you can swap axes by swapping arrays:
plt.plot(bar, lin)
There's no special feature because it's supported out of the box. As you've discovered, plotting a transposed histogram can be accomplished by passing in
orientation='horizontal'
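Putting both pieces together might look like the sketch below. The foo and bar arrays here are made-up stand-ins, since the question does not show the real data.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up stand-ins for the question's `foo` and `bar`
foo = np.linspace(-3, 3, 200)
bar = np.exp(-foo**2 / 2)

lin = np.linspace(min(foo), max(foo), len(foo))

fig, ax = plt.subplots()
# Curve with x and y exchanged: `bar` now runs along the horizontal axis
line, = ax.plot(bar, lin)
# Histogram drawn sideways so that it is transposed in the same way
counts, edges, patches = ax.hist(bar, density=True, bins=100,
                                 histtype='stepfilled', alpha=0.2,
                                 orientation='horizontal')
```

The axes stay on the left and bottom because only the data are swapped, not the picture.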
I couldn't find any matplotlib method dealing with the issue. You can rotate the curve in a purely mathematical way, i.e. do it through the rotation matrix. In this simple case it is sufficient to just exchange variables x and y but in general it looks like this (let's take a parabola for a clear example):
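import numpy as np
import matplotlib.pyplot as plt
# 2x2 rotation matrix for an angle given in radians
rotation = lambda angle: np.array([[np.cos(angle), -np.sin(angle)],
                                   [np.sin(angle),  np.cos(angle)]])
x = np.linspace(-10, 10, 1000)
y = -x**2
# One (x, y) point per row
matrix = np.vstack([x, y]).T
# @ is matrix multiplication: every row (point) is multiplied by the rotation matrix
rotated_matrix = matrix @ rotation(np.deg2rad(90))
fig, ax = plt.subplots(1, 2)
ax[0].plot(rotated_matrix[:, 0], rotated_matrix[:, 1])
ax[1].plot(x, y)
rotated_matrix = matrix @ rotation(np.deg2rad(-45))
fig, ax = plt.subplots(1, 2)
ax[0].plot(rotated_matrix[:, 0], rotated_matrix[:, 1])
ax[1].plot(x, y)
plt.show()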

Fill area between 2 lines (when one is below another)

When this is run, the graph shown does not shade some parts that are below 5. How should I edit it such that it covers the entire area?
import matplotlib.pyplot as plt
x = [1,2,3,4,5,6,7,8,9,10]
y = [4,9,1,3,6,2,4,7,6,3]
z = [5]*len(y)
plt.plot(x,y)
plt.plot(x,z)
plt.fill_between(x,y,z,where=[(y[i]<z[i]) for i in range(len(x))],facecolor='r')
plt.show()
If you look at the comprehension you are using to calculate where to fill, you'll notice it only checks at the points listed in your y and z lists. However, there are regions in between those points that need to be filled as well.
This behavior is mentioned in the documentation:
Semantically, where is often used for y1 > y2 or similar. By default, the nodes of the polygon defining the filled region will only be placed at the positions in the x array. Such a polygon cannot describe the above semantics close to the intersection. The x-sections containing the intersection are simply clipped.
You need interpolate=True:
Setting interpolate to True will calculate the actual intersection point and extend the filled region up to this point
plt.fill_between(
    x, y, z,
    where=[(y[i] < z[i]) for i in range(len(x))],
    facecolor='r',
    interpolate=True
)
Since you also asked for a way to avoid having a list of 5, you may use axhline instead, as well as switching your lists to numpy arrays for easy comparison:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([1,2,3,4,5,6,7,8,9,10])
y = np.array([4,9,1,3,6,2,4,7,6,3])
z = 5
plt.plot(x,y)
plt.axhline(y=z, color='orange')
plt.fill_between(x,y,z,where=y<z, facecolor='r', interpolate=True)
plt.show()
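To see what is missing without interpolate=True, the crossing points can be worked out by hand. The sketch below only illustrates the idea using linear interpolation between neighbouring samples; it is not Matplotlib's internal code, and the variable names are made up.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=float)
y = np.array([4, 9, 1, 3, 6, 2, 4, 7, 6, 3], dtype=float)
z = 5.0

# The mask passed as where=: True only at sample points lying below z
below = y < z

# A segment contains a crossing when its two endpoints lie on opposite sides of z
crossing = below[:-1] != below[1:]
x0, x1 = x[:-1][crossing], x[1:][crossing]
y0, y1 = y[:-1][crossing], y[1:][crossing]

# Linear interpolation for the x position where each segment reaches z
x_cross = x0 + (z - y0) * (x1 - x0) / (y1 - y0)
print(x_cross)
```

For this data the line crosses y = 5 six times, at roughly x = 1.2, 2.5, 4.67, 5.25, 7.33 and 9.33. None of these is in the x array, which is why the fill stops short of them unless the intersections are interpolated.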

Plot x-values without y-values in pyplot

Plotting y-values in pyplot is easy: given a list y_values = [0, 1, 4, 9], pyplot plots it automatically using
plt.plot(y_values)
plt.show()
pyplot enumerates these automatically as [0, 1, 2, 3]. However, given a list of x_values, is there a way to plot these automatically without providing y-values, i.e. to let pyplot enumerate them?
I've tried
plt.plot(x=x_values); plt.plot(xdata=x_values)
However, neither of these works. Of course, one way would be to flip the axes, but is there a simpler way I've overlooked?
The x and y arguments in pyplot.plot(*args, **kwargs) are positional arguments. According to the documentation, e.g.
plot(x, y) # plot x and y using default line style and color
plot(x, y, 'bo') # plot x and y using blue circle markers
plot(y) # plot y using x as index array 0..N-1
Now, how would pyplot know that, when you specify a single argument, you want it interpreted as the abscissa instead of the ordinate? It's simply not possible the way the function is written.
A solution to plot the index against some list is to supply the index as the y argument:
import matplotlib.pyplot as plt
x_values = [0, 1, 4, 9]
plt.plot(x_values, range(len(x_values)))
plt.show()

Tick label text and frequency in matplotlib plot

I want to plot some data stored in a Pandas Dataframe using matplotlib. I want to put specific labels on x axis ticks. So, I set them with:
ax.xaxis.set_ticklabels(data_frame['labels'])
That works well, but it sets a tick label for each data point, making the plot unreadable, so I tried:
ax.locator_params(axis='x', nbins=3)
which reduces the number of ticks to 3, but the labels are not corresponding to correct data points (if labels are a,b,c,d,e ..., x,y,z I get labels a,b,c instead of a,m,z or something like that). My next idea was to set tick labels positions:
ax.xaxis.set_ticks(data_frame.index.values)
but it does not work.
What works is:
ax.xaxis.set_ticklabels(data_frame['labels'][::step])
ax.xaxis.set_ticks(data_frame.index.values[::step])
without setting any locator_params.
This is almost perfect. It fixes the ticks and labels, but when I zoom the plot (using the matplotlib interactive window) new labels are obviously not appearing. And what I need are readable ticks that adjust themselves depending on plot zoom (this is what ax.locator_params(axis='x', nbins=3) does correctly without any custom labels).
In other words: I need to set specific label for each data point but show only few of them on the plot axis ticks without losing the correct assignment.
Using Locator we can define how many ticks shall be produced and where they should be placed. By sub-classing MaxNLocator (this is essentially the default Locator) we can reuse the functionality and simply filter out unwanted ticks (e.g. ticks outside the label range). My approach could definitely be improved at this point, as sparse or non-equidistant x-range data would break my simple filtering solution. Also float values might be a challenge, but I'm certain such a data range could always be mapped to a convenient integer range if the above conditions do not apply. But this is beyond the scope of this question.
With Formatter we can now simply lookup the corresponding labels in our label list to produce the correct tick label. For finding the closest matching value, we can efficiently utilize the bisect module (related question). For static plots we could rely on the assumption that our Locator already produces indices we can directly use for our list access (avoiding unnecessary bisect operation). However, the dynamic view (see the bottom left corner in the screenshots) uses the Formatter to format non-tick position labels. Thus, using bisect is the more general and stable approach.
import matplotlib.pyplot as plt
import numpy as np
import bisect
from matplotlib.ticker import Formatter
from matplotlib.ticker import MaxNLocator
x = np.arange(0, 100, 1)
y = np.sin(x)
# custom labels, could be anything
labels = ["!{}!".format(v) for v in x]
plt.plot(x, y)
ax = plt.gca()
class LookupLocator(MaxNLocator):
    def __init__(self, valid_ticks, nbins='auto', min_n_ticks=0, integer=True):
        MaxNLocator.__init__(self, integer=integer, nbins=nbins, min_n_ticks=min_n_ticks)
        self._valid_ticks = valid_ticks
        self._integer = integer
    def is_tick_valid(self, t):
        if self._integer:
            return t.is_integer() and int(t) in self._valid_ticks
        return t in self._valid_ticks
    def tick_values(self, vmin, vmax):
        # keep only ticks that match a data point; return a list rather than a lazy filter object
        return list(filter(self.is_tick_valid, MaxNLocator.tick_values(self, vmin, vmax)))
class LookupFormatter(Formatter):
    def __init__(self, tick_values, tick_labels):
        Formatter.__init__(self)
        self._tick_values = tick_values
        self._tick_labels = tick_labels
    def _find_closest(self, x):
        # https://stackoverflow.com/questions/12141150/from-list-of-integers-get-number-closest-to-a-given-value
        i = bisect.bisect_left(self._tick_values, x)
        if i == 0:
            return i
        if i == len(self._tick_values):
            return i - 1
        left, right = self._tick_values[i - 1], self._tick_values[i]
        # x lies between left and right; pick the index of the nearer one
        if left - x < x - right:
            return i
        return i - 1
    def __call__(self, x, pos=None):
        return self._tick_labels[self._find_closest(x)]
ax.xaxis.set_major_locator(LookupLocator(x))
ax.xaxis.set_major_formatter(LookupFormatter(x, labels))
plt.show()
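When the data points sit at consecutive integer positions 0..N-1, as in this example, a lighter variant may be enough. The sketch below swaps the custom classes for Matplotlib's stock MaxNLocator and FuncFormatter; it is an alternative under that assumption, not a replacement for the general approach above. The function name label_for and the nbins value are made up.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import FuncFormatter, MaxNLocator

x = np.arange(0, 100, 1)
y = np.sin(x)
labels = ["!{}!".format(v) for v in x]  # same custom labels as above

def label_for(value, pos=None):
    """Return the label of the nearest data index, or '' outside the data."""
    # Assumes the data positions are exactly 0, 1, ..., len(labels) - 1
    i = int(round(value))
    if 0 <= i < len(labels):
        return labels[i]
    return ""

fig, ax = plt.subplots()
ax.plot(x, y)
# Let Matplotlib choose a handful of integer tick positions that follow the zoom level
ax.xaxis.set_major_locator(MaxNLocator(nbins=5, integer=True))
# Look up the label for whatever position the locator picked
ax.xaxis.set_major_formatter(FuncFormatter(label_for))
```

Ticks that fall outside the data simply get an empty label here instead of being filtered out, which is the main difference from the LookupLocator approach.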

Change axes ticks of quiver - Python

I'm plotting a vector field with the quiver method of Matplotlib.
My array to store this vector has a dimension x * y but I'm working with a space that varies from -2 to 2.
So far, to plot the vector field I have this method:
import matplotlib.pyplot as plt
def plot_quiver(vector_field_x, vector_field_y, file_path):
    plt.figure()
    plt.subplots()
    plt.quiver(vector_field_x, vector_field_y)
    plt.savefig(file_path + '.png')
    plt.close()
Which gives me this output, as an example, for a 10 x 10 array:
But to generate this vector field I centered my data in the x = 0, y = 0, x and y ranging from -2 to 2.
Then, I would like to plot the axis of the image following this pattern.
As a standard approach, I tried the following:
def plot_quiver(vector_field_x, vector_field_y, file_path):
    plt.figure()
    fig, ax = plt.subplots()
    ax.quiver(vector_field_x, vector_field_y)
    ax.set_xticks([-2, 0, 2])
    ax.set_yticks([-2, 0, 2])
    plt.savefig(file_path + '.png')
    plt.close()
This usually works with Matplotlib methods such as imshow and streamplot.
But this is what I got with this code:
Which is not what I want.
So, I'm wondering how can I perform what I explained here to change the axes ticks.
Thank you in advance.
Funny thing, I just learnt about quiver yesterday... :)
According to the quiver documentation, the function can accept from 2 to 5 arguments...
The simplest way to use the function is to pass it two arrays with an equal number of elements, U and V. Matplotlib will then plot an arrow for each element in the arrays. Specifically, for each element i,j you get an arrow placed at i,j with components defined by U[i,j] and V[i,j]. This is what is happening in your case.
A more complete syntax is to pass four arrays with an equal number of elements: X, Y, U and V. Again, you get an arrow for each i,j element with components defined by U[i,j] and V[i,j], but this time the arrows are placed at coordinates X[i,j], Y[i,j].
In conclusion:
you need to call quiver like
quiver(values_x, values_y, vector_field_x, vector_field_y)
Probably you already did it, but you can get values_x and values_y using the numpy.meshgrid function.
The matplotlib example for the quiver function might be useful, also.
I hope it helps!
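For reference, a small sketch of the four-argument call on a 10 x 10 grid spanning -2 to 2. The rotational field used here is made up purely so the example runs; substitute your own vector_field_x and vector_field_y.

```python
import matplotlib.pyplot as plt
import numpy as np

n = 10
coords = np.linspace(-2, 2, n)
# 2D coordinate arrays with the same shape as the vector field
values_x, values_y = np.meshgrid(coords, coords)

# Made-up rotational field standing in for the question's data
vector_field_x = -values_y
vector_field_y = values_x

fig, ax = plt.subplots()
# Arrows are now placed at (values_x, values_y) instead of at array indices
ax.quiver(values_x, values_y, vector_field_x, vector_field_y)
ax.set_xticks([-2, 0, 2])
ax.set_yticks([-2, 0, 2])
```

With the coordinates supplied, the ticks at -2, 0 and 2 line up with the data, and the figure can be saved with fig.savefig(...) as in the question's function.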
