How to set the physical length of an axis in matplotlib? - python

Is there a command to set the length of an axis? I do not mean the range! Independently of the values, the range of the axis, or other factors, I want to set its physical length. How can I do that?
Something like plt.yaxislenght(20)?

I'm not aware of a specific way to set the length of an axis for axes generated by e.g. plt.subplots. You can use ax.set_aspect(num), but this adjusts the aspect ratio and therefore changes both axes in a dependent way.
You can however use ax = plt.axes([left,bottom,width,height]) to add individual subplots in whatever positions you like. This should allow you to achieve what you want, but will be tedious because you need to place each subplot manually.
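A rough sketch of this manual placement follows. The figure size and rectangle values are arbitrary choices for illustration, not taken from the answer. Each rectangle is given as fractions of the figure, so the physical width of an axes is its fractional width times the figure width.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(6, 4))  # figure size in inches (arbitrary example)

# Each rectangle is [left, bottom, width, height] in fractions of the figure.
ax_wide = plt.axes([0.1, 0.55, 0.8, 0.35])    # x axis spans 0.8 * 6 in = 4.8 in
ax_narrow = plt.axes([0.1, 0.10, 0.4, 0.35])  # x axis spans 0.4 * 6 in = 2.4 in

# The data range has no effect on where the axes sit in the figure.
ax_wide.plot([0, 1000], [0, 1])
ax_narrow.plot([0, 1], [0, 1])
```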

What you want to do is tricky due to the way that mpl works underneath. Most of the artists are specified in units that are not on-screen units (data, axes, or figure space: see the transforms tutorial). This gives you a good deal of power and flexibility, as most of the time you want to work in one of the relative sets of coordinates. However, the cost is that if you want to set 'absolute' sizes of things, you end up having to do it indirectly.
If you want your axis to be a fixed length (in display units) between figures, then you need to control the size of your axes (in figure units) by hand (via fig.add_axes) and then use fig.set_size_inches to set the size of your overall figure. By tweaking these values you can get what you want.
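One way to do this bookkeeping is a small helper that converts inches into figure fractions. This is only a sketch: add_axes_inches is a hypothetical name, not a matplotlib function, and the sizes are made up. It assumes the default "auto" aspect, since a fixed aspect ratio can shrink the axes box.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt


def add_axes_inches(fig, left, bottom, width, height):
    """Hypothetical helper: add an axes whose rectangle is given in inches."""
    fig_w, fig_h = fig.get_size_inches()
    # fig.add_axes expects fractions of the figure, so divide by the figure size.
    return fig.add_axes([left / fig_w, bottom / fig_h, width / fig_w, height / fig_h])


fig = plt.figure()
fig.set_size_inches(6, 5)  # overall figure size in inches

# Axes box 4 in wide and 4 in tall, so the y axis is 4 in long.
ax = add_axes_inches(fig, left=1.0, bottom=0.5, width=4.0, height=4.0)
ax.set_ylim(0, 1000)  # changing the range does not change the physical length

y_axis_length_in = ax.get_position().height * fig.get_size_inches()[1]
```

The fractions are computed once, so call set_size_inches before adding the axes. Resizing the figure afterwards rescales the axes with it.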

Related

Possible to make matplotlib's constrained_layout ignore axis tick labels?

I have a reasonably complicated grid of subplots that involves two sets (one on the left and another on the right) of columns plotting a set of quantities for each row, separated by a common legend to label the entries in each row.
Here is a sample of what I want to accomplish
Using matplotlib with constrained_layout = True works 95% perfectly for working out optimal sizes & spacing for the columns, down to the tricky case of having the legend run down the middle. The remaining 5% is highlighted in red, where the wordy x-axis tick labels seem to push the columns apart: it would be perfect if there were a way to make the layout engine "ignore" the tick labels when determining the spacing.
Methods using other libraries are also appreciated. Thank you in advance.
What I tried:
subplots_adjust
GridSpec
The main difficulty with those attempts:
constrained_layout is incompatible with those settings, so one must sacrifice the optimized legend spacing to get the column spacing right, or vice versa.

Matplotlib: increase distance between automatically generated tick labels

When generating a new figure or axis with matplotlib (or pyplot), there is (I assume) some sort of automated way to determine how many ticks are appropriate for each axis.
Unfortunately, this often results in labels that are too close together to be read comfortably, or that even overlap. I'm aware of the ways to specify tick locations and labels explicitly (e.g. ax.set_xticks, ax.set_xticklabels), but I wonder whether whatever does the automatic tick distribution when nothing is specified can be influenced by some global matplotlib parameter(s).
Do such global parameters exist, and what are they?
I'm generating lots of figures automatically and save them, and it can get a little annoying having to treat them all individually ...
In case there is no simple way to tell matplotlib to thin out the labels, is there some other workaround to achieve more generous spacing between them?
From reading the documentation of matplotlib.pyplot.xticks, there seem to be no such global arguments.
However, it is very simple to get around this by setting the ticks and tick labels explicitly, taking into account that you can:
change the font size (decrease it)
rotate the labels (by 45°) or make them vertical (less overlapping).
increase the fig size.
program a function that generates adaptive xticks based on your input.
and many other possible workarounds.
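The workarounds above could be bundled into one function and applied to every generated figure. The sketch below is one possible version: thin_x_ticks is a hypothetical name, the default values are arbitrary, and using matplotlib.ticker.MaxNLocator to limit the tick count is my substitution for the "adaptive" function mentioned in the list.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator


def thin_x_ticks(ax, max_intervals=5, rotation=45, size=8):
    """Hypothetical helper: fewer, smaller, rotated x tick labels."""
    # Cap the number of automatically chosen tick intervals on the x axis.
    ax.xaxis.set_major_locator(MaxNLocator(nbins=max_intervals))
    # Smaller, rotated labels are less likely to overlap.
    ax.tick_params(axis="x", labelsize=size, labelrotation=rotation)


fig, ax = plt.subplots(figsize=(8, 3))  # a wider figure leaves more room per label
ax.plot(range(100000, 200000, 1000), range(100))
thin_x_ticks(ax)
```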

Python adaptive histogram widths

I am currently working on a project where I have to bin up to 10-dimensional data. This works totally fine with numpy.histogramdd; however, I have one serious obstacle:
My parameter space is pretty large, but only a fraction is actually inhabited by data (say, maybe a few % or so...). In these regions, the data is quite rich, so I would like to use relatively small bin widths. The problem here, however, is that the RAM usage totally explodes. I see usage of 20GB+ for only 5 dimensions which is already absolutely not practical. I tried defining the grid myself, but the problem persists...
My idea would be to manually specify the bin edges, where I just use very large bin widths for empty regions in the data space. Only in regions where I actually have data, I would need to go to a finer scale.
I was wondering if anyone here knows of such an implementation already which works in arbitrary numbers of dimensions.
thanks 😊
I think you should first remap your data, then create the histogram, and then interpret the histogram knowing the values have been transformed. One possibility would be to tweak the histogram tick labels so that they display mapped values.
One possible way of doing it, for example, would be:
Sort one dimension of the data as a one-dimensional array;
Integrate this array, so you have a cumulative distribution;
Find the steepest part of this distribution, and choose a horizontal interval corresponding to a "good" bin size for the peak of your histogram - that is, a size that gives you good resolution;
Find the size of this same interval along the vertical axis. That will give you a bin size to apply along the vertical axis;
Create the bins using the vertical span of that bin - that is, "draw" horizontal, equidistant lines to create your bins, instead of the most common way of drawing vertical ones;
That way, you'll have lots of bins where the data is denser, and fewer bins where the data is sparser.
Two things to consider:
The mapping function is the cumulative distribution of the sorted values along that dimension. This can be quite arbitrary. If the distribution resembles some well known algebraic function, you could define it mathematically and use it to perform a two-way transform between actual value data and "adaptive" histogram data;
This applies to only one dimension. Care must be taken as to how this would work if the histograms from multiple dimensions are to be combined.
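For a single dimension, the idea of drawing equidistant horizontal lines through the cumulative distribution can be sketched with NumPy. This is an illustration under assumptions, not the answerer's code: adaptive_edges is a hypothetical name, the data is synthetic, and np.quantile stands in for the explicit sort-and-integrate steps, because it reads off the data values at equally spaced levels of the empirical cumulative distribution.

```python
import numpy as np


def adaptive_edges(values, n_bins):
    """Hypothetical helper: bin edges at equally spaced levels of the empirical CDF."""
    levels = np.linspace(0.0, 1.0, n_bins + 1)  # the equidistant "horizontal lines"
    edges = np.quantile(values, levels)         # data values where the CDF hits them
    return np.unique(edges)                     # drop duplicate edges caused by ties


# Synthetic example: a dense narrow peak on top of a sparse wide background.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0.0, 0.1, 9000), rng.uniform(-10.0, 10.0, 1000)])

edges = adaptive_edges(data, n_bins=20)
counts, _ = np.histogram(data, bins=edges)
widths = np.diff(edges)  # narrow where the data is dense, wide where it is sparse
```

Each bin then holds roughly the same number of points, so the resolution follows the data density. The multi-dimensional caveat above still applies.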

Python: how to plot points with little overlapping

I am using python to plot points. The plot shows relationship between area and the # of points of interest (POIs) in this area. I have 3000 area values and 3000 # of POI values.
Now the plot looks like this:
The problem is that, at lower left side, points are severely overlapping each other so it is hard to get enough information. Most areas are not that big and they don't have many POIs.
I want to make a plot with little overlapping. I am wondering whether I can use unevenly distributed axis or use histogram to make a beautiful plot. Can anyone help me?
I would suggest using a logarithmic scale for the y axis. You can either use pyplot.semilogy(...) or pyplot.yscale('log') (http://matplotlib.org/api/pyplot_api.html).
Note that points where area <= 0 will not be rendered.
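A minimal sketch of the log-scale suggestion follows. The data is synthetic, and putting area on the y axis is an assumption based on the note above.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt

# Synthetic stand-in for the 3000 (POI count, area) pairs in the question.
rng = np.random.default_rng(0)
n_pois = rng.poisson(5, 3000)
area = rng.lognormal(mean=0.0, sigma=1.5, size=3000)  # positive and heavily skewed

fig, ax = plt.subplots()
ax.scatter(n_pois, area, s=5)
plt.yscale("log")  # applies to the current axes; spreads out the crowded small values
```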
I think we have two major choices here. First adjusting this plot, and second choosing to display your data in another type of plot.
In the first option, I would suggest clipping the boundaries. You have plenty of space around the borders. If you limit the plot to the boundaries of the data, your data would scale better. On top of that, you may choose to plot the points with smaller dots, so that they overlap less.
Second option would be to choose displaying data in a different view, such as histograms. This might give a better insight in terms of distribution of your data among different bins. But this would be completely different type of view, in regards to the former plot.
I would suggest first adjusting the plot by limiting its boundaries to the data points, so that the plot area has enough space to scale the data, and trying histograms later. But as I mentioned, these are two different things and would give different insights into your data.
For adjusting you might try this:
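import matplotlib.pyplot as plt
x1, x2, y1, y2 = plt.axis()  # current limits: xmin, xmax, ymin, ymax
plt.axis((x1, x2, y1, y2))  # pass tightened values here instead of the originals
You would probably need to make minor adjustments to the axis variables. Note that there are definitely better options than this, but it was the first thing that came to my mind.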
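To make the adjustment concrete, here is a sketch that takes the limits from the data and uses smaller markers. The data is synthetic and the marker size is an arbitrary choice.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt

# Synthetic stand-in for the area and POI-count values in the question.
rng = np.random.default_rng(1)
area = rng.lognormal(mean=0.0, sigma=1.0, size=3000)
n_pois = rng.poisson(5, 3000)

fig, ax = plt.subplots()
ax.scatter(area, n_pois, s=3)  # small dots overlap less

# Clip the view to the extent of the data: (xmin, xmax, ymin, ymax).
plt.axis((area.min(), area.max(), n_pois.min(), n_pois.max()))
```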

On adjusting margins in matplotlib

I am trying to minimize the margins around a 1x2 figure, i.e. a figure with two stacked subplots. I searched a lot and came up with commands like:
self.figure.subplots_adjust(left=0.01, bottom=0.01, top=0.99, right=0.99)
This leaves a large gap at the top and between the subplots. Playing with these parameters, let alone understanding them, was tough (I hit errors like ValueError: bottom cannot be >= top).
My questions :
What is the command to completely minimize the margins?
What do these numbers mean, and what coordinate system does this follow (the non-standard percent thing and origin point of this coordinate system)? What are the special rules on top of this coordinate system?
Where is the exact point this command needs to be called? From experiment, I figured out it works after you create subplots. What if you need to call it repeatedly after you resize a window and need to resize the figure to fit inside?
What are the other methods of adjusting layouts, especially for a single subplot?
They're in figure coordinates: http://matplotlib.sourceforge.net/users/transforms_tutorial.html
To remove gaps between subplots, use the wspace and hspace keywords to subplots_adjust.
If you want to have things adjusted automatically, have a look at tight_layout
Gridspec: http://matplotlib.sourceforge.net/users/gridspec.html
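A short sketch of both approaches follows, with an arbitrary figure size. The values passed to subplots_adjust are fractions of the figure, with the origin at the bottom-left corner, which is why bottom must be smaller than top.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt

# Manual margins: left/right/bottom/top are edges of the subplot area as
# fractions of the figure; hspace=0 removes the gap between stacked subplots.
fig, (ax_top, ax_bottom) = plt.subplots(2, 1, figsize=(4, 6))
fig.subplots_adjust(left=0.01, right=0.99, bottom=0.01, top=0.99, hspace=0)
# With margins this small, tick labels fall outside the figure and get cut off.

# Automatic alternative: tight_layout leaves just enough room for the labels.
fig_auto, axes_auto = plt.subplots(2, 1, figsize=(4, 6))
fig_auto.tight_layout(pad=0.1)
```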
