How to find out what set_data expects - python

I'm trying to make an animation with matplotlib, in this case a 3D scatter plot. I'm hitting a problem that I absolutely always hit when I try to do this, which is that I don't know what arguments I should pass to set_data, and I don't know how to find out. In this case, it apparently expects two arguments, despite it being a 3d plot.
Since I've experienced related problems often, rather than asking about the specifics of the particular plot I'm trying to animate, I will ask the general question: given an element of a MatPlotLib plot, how can I determine what arguments its set_data method expects, either by interrogating it, or by knowing where it's documented?

From an example for an Animated 3D random walk from the MatPlotLib documentation:
def update_lines(num, dataLines, lines):
    for line, data in zip(lines, dataLines):
        # NOTE: there is no .set_data() for 3 dim data...
        line.set_data(data[0:2, :num])
        line.set_3d_properties(data[2, :num])
    return lines
So, confusing as it is, set_data by itself is not meant for 3D data. According to the docs it accepts:
2D array (rows are x, y) or two 1D arrays
Looking further at this example, we can see that set_3d_properties is called right after it to supply the z values.
The whole update_lines function is then passed as the callback to animation.FuncAnimation.

Usually, to find the documentation you can either search for it online (e.g. the docs for set_data) or, from a Python prompt, use the help function, which shows the docstring of an object (a module, function, class, etc.) if it has one.
For example, if you want to know what datetime.datetime.now does (I don't have matplotlib installed, so I can't run it on set_data):
>>> import datetime
>>> help(datetime.datetime.now)
Help on built-in function now:
now(tz=None) method of builtins.type instance
    Returns new datetime object representing current time local to tz.
      tz
        Timezone object.
    If no tz is specified, uses local timezone.
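If you would rather get the same information programmatically than read it off the help pager, the standard inspect module can do it. The following is only a minimal sketch: describe is a hypothetical helper name, and json.dumps stands in for the method you are curious about, since matplotlib may not be installed everywhere.

```python
import inspect
import json


def describe(func):
    """Return (signature text, first docstring line) for a callable."""
    try:
        sig = str(inspect.signature(func))
    except (TypeError, ValueError):
        # Some C-implemented callables expose no introspectable signature.
        sig = "(signature not available)"
    doc = inspect.getdoc(func) or ""
    first_line = doc.splitlines()[0] if doc else "(no docstring)"
    return sig, first_line


# json.dumps is only a stand-in; with matplotlib installed you could pass
# matplotlib.lines.Line2D.set_data instead.
signature, summary = describe(json.dumps)
print("json.dumps" + signature)
print(summary)
```

The same call works on a bound method of an existing plot element, for example describe(line.set_data), which is usually the quickest way to check what a particular artist expects.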

Related

Turn off minor ticks on xaxis

I'm trying to simplify the look of a graph of mine. To this end, I would like to set ticks only at specific points.
The 'native' plot, out of a df.groupby.max().plot() operation looks like this:
I don't like the fact that my data starts at 0.3 and ends at 0.6, but the graph is somewhat adding real estate there. To have the plot limited to the numbers I want, I do:
ax1.set_xlim(0.3,0.6)
Which however adds a series of intermediate points I wouldn't like to have:
Now, for some reason, halfway points appear. Note that they do not belong to the measured data.
I've then tried the recipes found - among other places - here
ax1.set_xticks = np.arange(0.3,0.6,0.1)
--> no change
ax1.xaxis.set_tick_params(which='minor',bottom=False)
--> no change
ax1.minorticks_off()
--> no change
I've run out of options and I'm not sure what I'm doing wrong here, any help appreciated.
OK, thanks to @DavidG's hint I found the issue. It's maybe not subtle but worth mentioning. I can remove the whole thing if this turns out to be too trivial.
The issue was created by this wrong call to the ax.set_xticks() function:
ax1.set_xticks = np.arange(0.3,0.6,0.1)
Although I cited the place where I took this approach, I actually managed to implement it wrongly. The right way would have been:
ax1.set_xticks(np.arange(0.3,0.6,0.1))
So, actually, I wasn't seeing any change in the plot because I wasn't calling the function correctly.
But there's more.
My code was actually rebinding the attribute ax1.set_xticks to an np.array, which hid the method of the same name, so that when I then tried the correct syntax, I kept getting an error:
ax1.set_xticks([0.3,0.4,0.5,0.6])
Traceback (most recent call last):
  File "<ipython-input-93-df3b8935eb28>", line 1, in <module>
    ax1.set_xticks([0.3,0.4,0.5,0.6])
TypeError: 'numpy.ndarray' object is not callable
Even with a simple list, I was getting the error. This is because I had assigned the name ax1.set_xticks to, indeed, an np.array object.
Once I reset the variable space and called the function properly, everything ran smoothly.
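The mechanism behind this is plain Python rather than anything specific to matplotlib, so it can be reproduced without a plot. In the sketch below, FakeAxes is a hypothetical stand-in class, not a matplotlib object. It shows the assignment hiding the method, the resulting TypeError, and that deleting the instance attribute brings the method back without resetting the whole session.

```python
class FakeAxes:
    """Hypothetical stand-in for an Axes object; not part of matplotlib."""

    def __init__(self):
        self.ticks = []

    def set_xticks(self, ticks):
        self.ticks = list(ticks)


ax = FakeAxes()

# The mistake: assignment stores a list on the instance under the method's name.
ax.set_xticks = [0.3, 0.4, 0.5]

try:
    ax.set_xticks([0.3, 0.4, 0.5, 0.6])
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Deleting the instance attribute uncovers the class's method again.
del ax.set_xticks
ax.set_xticks([0.3, 0.4, 0.5, 0.6])
print(ax.ticks)
```

On a real Axes, del ax1.set_xticks should work by the same attribute lookup rules, though that has not been checked here against any particular matplotlib version.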

What are the guidelines for using matplotlib's set_array() routine?

The documentation for set_array is very skimpy. What does it do? What range of values can it take? How does it work in conjunction with other color-related routines and data structures?
On the collections docpage it is said to "Set the image array from numpy array A." It is described in the same way in the colormap API. That's all.
I find no mention of set_array() (much less examples) in any of several popular books on matplotlib programming, such as Devert (2014), McGreggor (2015), Root (2015) and Tossi (2009).
Yet, if set_array() is some arcane function that is only needed in rare cases, why does it show up so often both in matplotlib examples and in examples posted on the SciKit Learn website? Seems like a pretty mainstream function, and so it ought to have more mainstream documentation.
For example:
Matplotlib docs: Use of set_array() in creation of a multi-colored line
Matplotlib docs: Line collection with masked arrays
Scikit Learn docs: Visualization of stockmarket structure
Sifting through Stack Overflow posts that mention set_array() I found this one, where a poster states that "set_array() handles mapping an array of data values to RGB", and this one where posters indicate that set_array() must be called in some cases when one is setting up a ScalarMappable object.
I've tried experimenting with the examples I've found on-line, changing the range of values passed in to set_array(), for example, to try to figure out what it is doing. But, at this point, I'm spending way too much time on this one dumb function. Those deeply into color maps have enough context to guess what it does, but I can't afford to take a detour that big, just to understand this one function.
Could someone please offer a quick description and maybe some links?
The set_array method doesn't do much per se. It only defines the content of an array that is internal to the object in which it is defined. See for instance in the source of matplotlib.cm
def set_array(self, A):
    """
    Set the image array from numpy array *A*.
    Parameters
    ----------
    A : ndarray
    """
    self._A = A
    self._update_dict['array'] = True
In the multicolored_line example of the matplotlib documentation, this is used to map colors of a cmap.
Let's take a similar example and create a collection of lines and map the segments to indexed colors in a colormap:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
from matplotlib.colors import ListedColormap, BoundaryNorm

f, axes = plt.subplots(ncols=3)

# Ten short horizontal segments that step upward like a staircase
y = np.arange(0, 1, 0.1).repeat(2)
x = np.append(y[1:], [1])
segments = np.array(list(zip(x, y))).reshape(-1, 2, 2)

# Three colors, with boundaries chosen so that 0, 1 and 2 each select one
cmap = ListedColormap(['r', 'g', 'b'])
norm = BoundaryNorm([-0.5, 0.5, 1.5, 2.5], cmap.N)

for ax in axes:
    ax.add_collection(LineCollection(segments, cmap=cmap, norm=norm))

# axes[0] gets no array; the other two get arrays of different lengths
axes[1].collections[0].set_array(np.array([0, 1]))
axes[2].collections[0].set_array(np.array([0, 1, 2]))
axes[1].set_title('set_array to [0,1]')
axes[2].set_title('set_array to [0,1,2]')
plt.show()
This gives the following output:
What it does is map each segment to one of the indexed colors defined in the cmap (here 0->'r', 1->'g', 2->'b'). This behaviour is specified in the matplotlib.collections source:
Each Collection can optionally be used as its own `.ScalarMappable` by
passing the *norm* and *cmap* parameters to its constructor. If the
Collection's `.ScalarMappable` matrix ``_A`` has been set (via a call
to `.Collection.set_array`), then at draw time this internal scalar
mappable will be used to set the ``facecolors`` and ``edgecolors``,
ignoring those that were manually passed in.
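To see the value-to-color mapping without drawing anything, you can drive a ScalarMappable directly. This is a small sketch that reuses the cmap and norm from the answer above; it assumes matplotlib and numpy are installed, and the variable names are only illustrative.

```python
import numpy as np
from matplotlib.cm import ScalarMappable
from matplotlib.colors import BoundaryNorm, ListedColormap

cmap = ListedColormap(['r', 'g', 'b'])
norm = BoundaryNorm([-0.5, 0.5, 1.5, 2.5], cmap.N)

mappable = ScalarMappable(norm=norm, cmap=cmap)
mappable.set_array(np.array([0, 1, 2]))

# to_rgba applies the norm and then the colormap to the stored values.
rgba = mappable.to_rgba(mappable.get_array())
print(rgba)
```

Each row of the result is one RGBA color, one per value passed to set_array. As far as I can tell this is the same conversion a collection applies to its stored array at draw time, which is why set_array on its own appears to "do nothing" until the figure is rendered.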

Python bokeh modify axis scale

How can I modify the y-axis scale at figures and at charts? I want something like this: my_figure.y_range.end = my_figure.y_range.end * 1.3
So I want a bit higher y-axis. Thank you!
Figure uses DataRange1d objects by default, which causes the range to be computed automatically. But this happens in the browser, because it takes into account information, like glyph extents, that is only available at render time. The reason that my_figure.y_range.end * 1.3 does not work is that the "automatic" value of end is not known yet. It is only set automatically inside the browser. You can override the "automatic" behaviour of a DataRange by supplying start and end, but you have to give it the explicit, numeric value that you want, i.e.:
my_figure.y_range.end = 10
Alternatively, DataRange1d models have a range_padding property that you can set, which controls the amount of "extra padding" added to the automatically computed bounds. It is described here:
http://docs.bokeh.org/en/latest/docs/reference/models/ranges.html#bokeh.models.ranges.DataRange1d.range_padding
This might accomplish what you want in a different way, but note that it affects both start and end.
Finally, if you'd just like to completely control the range, without having auto ranging at all, you can do this when you create the figure:
p = figure(..., x_range=(10, 20))
This will create a fixed Range1d for the x-axis with start=10 and end=20.
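Since the automatic end value is not available on the Python side, one way to get "30% more headroom" is to compute the explicit number from your own data before handing it to Bokeh. The helper below is a hypothetical sketch (padded_range is not a Bokeh function) and assumes the data maximum is positive; the Bokeh assignments appear only as comments so the snippet runs without Bokeh installed.

```python
def padded_range(values, headroom=0.3):
    """Return (start, end) with end raised by `headroom` times the maximum.

    Hypothetical helper, not part of Bokeh; assumes max(values) is positive.
    """
    lowest, highest = min(values), max(values)
    return lowest, highest * (1 + headroom)


y = [2.0, 5.0, 10.0, 7.0]
start, end = padded_range(y)
print(start, end)

# With a Bokeh figure named my_figure (as in the question), either of these
# would then apply the explicit values:
#   my_figure.y_range.end = end
#   my_figure = figure(..., y_range=(start, end))
```

Setting only end keeps the automatic start, while passing the tuple at creation time fixes both ends of the axis.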

Why does ticklabel_format not take effect?

I am using Matplotlib to plot a 3*4 grid of subplots. I want to change the tick label style to 'sci' so that the plots look neater. But with the following code, ticklabel_format does not take effect on the axes at all.
import matplotlib.pylab as pl

fig, axes = pl.subplots(nrows=3, ncols=4)
for i,row in enumerate(axes):
    for j,ax in enumerate(row):
        ax.set_title('title')
        ax.ticklabel_format(styl='plain')
pl.tight_layout()
pl.show()
I have intentionally made a typo, 'styl', but it doesn't report an error. So I assume the ticklabel_format function doesn't even run.
It is not taking effect because ax.ticklabel_format accepts arbitrary keyword arguments and collects them into a dictionary. To see this, take a look at the documentation here and you will see that it takes an argument **kwargs. If you just replace styl with style, your code will work.
I suggest you take a look at this SO post to get a feel for what was going wrong, but in brief: the function can take any keyword argument. It then attempts to pass these on to other functions. If none of these require it, the argument is simply lost in the ether. As a result there is no error message!
Try playing around with the following example to get a feel for **kwargs.
def f(**kwargs):
    # kwargs is an ordinary dict holding every keyword argument passed in
    print(kwargs)
    return

f(anything='something')
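To make the "lost in the ether" effect concrete, here is a small sketch contrasting a function that silently ignores unknown keywords with one that rejects them. Both functions are hypothetical and only imitate the pattern described above; neither is matplotlib code.

```python
def lenient_format(**kwargs):
    """Hypothetical function that reads only the keys it knows about."""
    return kwargs.get('style', 'default')


def strict_format(**kwargs):
    """Same idea, but unknown keys raise instead of vanishing."""
    allowed = {'style', 'axis'}
    unknown = set(kwargs) - allowed
    if unknown:
        raise TypeError("unexpected keyword argument(s): %s" % sorted(unknown))
    return kwargs.get('style', 'default')


# The typo is accepted and the default is used, with no warning at all.
print(lenient_format(styl='plain'))

try:
    strict_format(styl='plain')
except TypeError as exc:
    message = str(exc)
    print(message)
```

The lenient version behaves like the call in the question: the misspelled key is stored in the dictionary, never looked up, and the default wins.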

Semantic Type Safety in Python

In my recent project I have the problem that some values are often misinterpreted. For instance I calculate a wave as a sum of two waves (for which I need two amplitudes and two phase shifts), and then sample it at 4 points. I pass these tuples of four values to different functions, but sometimes I make the mistake of passing wave parameters instead of sample points.
These errors are hard to find, because all the calculations work without any error, but the values are totally meaningless in this context and so the results are just wrong.
What I want now is some kind of semantic type. I want to state that one function returns sample points and another function expects sample points, and that I cannot do anything that conflicts with these declarations without immediately getting an error.
Is there any way to do this in python?
I would recommend implementing specific data types to be able to distinguish between different kinds of information with the same structure.
You can simply subclass list for example and then do some type checking at runtime within your functions:
class WaveParameter(list):
    pass

class Point(list):
    pass

# you can use them just like lists
point = Point([1, 2, 3, 4])
wp = WaveParameter([5, 6])

# of course all methods from list are inherited
wp.append(7)
wp.append(8)

# let's check them
print(point)
print(wp)

# type checking examples
print(isinstance(point, Point))          # True
print(isinstance(wp, Point))             # False
print(isinstance(point, WaveParameter))  # False
print(isinstance(wp, WaveParameter))     # True
So you can include this kind of type checking in your functions, to make sure the correct kind of data was passed to it:
import logging

log = logging.getLogger(__name__)

def example_function_with_waveparameter(data):
    if not isinstance(data, WaveParameter):
        log.error("received wrong parameter type (%s instead of WaveParameter)" %
                  type(data))
    # and then do the stuff
or simply assert:
def example_function_with_waveparameter(data):
    assert isinstance(data, WaveParameter)
Python's notion of a "semantic type" is called a class, but as mentioned, Python is dynamically typed, so even using custom classes instead of tuples you won't get any compile-time error - at best you'll get runtime errors if your classes are designed in such a way that trying to use one instead of the other will fail.
Now classes are not just about data, they are about behaviour too, so if you have functions that do waveform-specific computations these functions would probably become methods of the Waveform class, and idem for the Point part, and this might be enough to avoid logical errors like passing a "waveform" tuple to a function expecting a "point" tuple.
To make a long story short: if you want a statically typed functional language, Python is not the right tool (Haskell might be a better choice). If you really want / have to use Python, try using classes and methods instead of tuples and functions. It still won't detect type errors at compile time, but chances are you'll have fewer type errors AND that these type errors will be detected at runtime instead of producing wrong results.
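As a rough illustration of the "classes instead of tuples" advice, the sketch below wraps the two kinds of four-number data in separate frozen dataclasses and checks the type at the function boundary. All names here (WaveParams, SamplePoints, mean_sample) are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class WaveParams:
    amplitudes: Tuple[float, float]
    phases: Tuple[float, float]


@dataclass(frozen=True)
class SamplePoints:
    values: Tuple[float, float, float, float]


def mean_sample(points: SamplePoints) -> float:
    """Average the four samples, refusing anything that is not SamplePoints."""
    if not isinstance(points, SamplePoints):
        raise TypeError("expected SamplePoints, got %s" % type(points).__name__)
    return sum(points.values) / len(points.values)


samples = SamplePoints((0.0, 1.0, 0.0, -1.0))
params = WaveParams((1.0, 0.5), (0.0, 1.57))

print(mean_sample(samples))

try:
    mean_sample(params)
except TypeError as exc:
    rejected = str(exc)
    print(rejected)
```

The runtime check turns a silently wrong result into an immediate error. The annotations additionally give an optional static checker such as mypy enough information to flag the bad call before the program runs, although Python itself does not enforce them.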
