Comparison arrow object between two bar containers - python

So I have the following two arrays:
base = np.arange(2)
y_axis = [32.59, 28.096]
And the following code
base = np.arange(2)
fig,ax = plt.subplots()
fig.set_figheight(10)
fig.set_figwidth(15)
bars = ax.bar(base, y_axis, width = 0.3)
bars[0].set_color('g')
ax.bar_label(bars,[f'{i}%' for i in y_axis])
ax.set_xticks(base, labels = ['Simplificado','Não simplificados'])
ax.arrow(base[0], y_axis[0], dx = base[1], dy = y_axis[1]-y_axis[0])
That results in the following image
What I want is a comparison arrow, something like this. Any ideas on how to build such an arrow?
Sorry for bad image.

You could use matplotlib.path.
It can draw polygons or, as in this case, just a polyline that follows a specific path.
The plot below isn't optimized to look pretty (see the notes at the end for potential improvements); it only shows the concept:
Code:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.path as mpath
base = np.arange(2)
y_axis = [32.59, 28.096]
fig, ax = plt.subplots()
fig.set_figheight(10)
fig.set_figwidth(15)
path_y_gap = 5
delta_value = y_axis[1] - y_axis[0]
Path = mpath.Path
path_data = [
(Path.MOVETO, (base[0],y_axis[0])),
(Path.LINETO, (base[0],y_axis[0]+path_y_gap)),
(Path.LINETO, (base[1],y_axis[0]+path_y_gap)),
#(Path.LINETO, (base[1],y_axis[1])), # alternative to the arrow
]
codes, verts = zip(*path_data)
path = mpath.Path(verts, codes)
x, y = zip(*path.vertices)
line, = ax.plot(x, y, 'k-')
ax.text( 0.5 , y_axis[0] + path_y_gap + 0.5, round(delta_value,2))
ax.arrow(base[1], y_axis[0]+path_y_gap, 0, -(-delta_value + path_y_gap),
head_width = 0.02 , head_length = 0.8, length_includes_head = True)
bars = ax.bar(base, y_axis, width = 0.3)
bars[0].set_color('g')
ax.bar_label(bars,[f'{i}%' for i in y_axis])
ax.set_xticks(base, labels = ['Simplificado','Não simplificados'])
Notes:
A path doesn't offer arrow-shaped ends; as a workaround, the last segment is drawn with a regular matplotlib arrow.
The commented-out line in path_data shows an alternative to the arrow for the last segment.
I haven't dealt with the overlap between the bar % labels and the path / arrow, but you could, for example, add a y-offset variable so the path starts and ends above that text.
See the Bézier example in the matplotlib path tutorial if you prefer a 'rounded' line.
You can format the number of decimal digits in other ways than the round() used here.
The first MOVETO sets the starting point; an explicit endpoint isn't required.
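As a possible alternative to combining a plotted polyline with a separate ax.arrow, the sketch below passes the whole path to matplotlib.patches.FancyArrowPatch, which puts an arrow head on the last segment. This is not the method used in the answer above; the names gap and top and the mutation_scale value are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.path import Path
from matplotlib.patches import FancyArrowPatch

base = [0, 1]
y_axis = [32.59, 28.096]
gap = 5  # assumed vertical clearance above the first bar

fig, ax = plt.subplots()
bars = ax.bar(base, y_axis, width=0.3)

# Up from the first bar, across, then down onto the second bar.
top = y_axis[0] + gap
verts = [(base[0], y_axis[0]), (base[0], top), (base[1], top), (base[1], y_axis[1])]
codes = [Path.MOVETO, Path.LINETO, Path.LINETO, Path.LINETO]

# The arrow head is drawn at the end of the last segment.
arrow = FancyArrowPatch(path=Path(verts, codes), arrowstyle="->",
                        mutation_scale=20, color="k")
ax.add_patch(arrow)

delta = y_axis[1] - y_axis[0]
ax.text(0.5, top + 0.5, f"{delta:.2f}", ha="center")
ax.set_ylim(0, top + 3)  # leave room for the path and the label
```

The path vertices are given in data coordinates, so the arrow follows the bars if the axis limits change.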

Related

How to measure a text element in matplotlib

I need to lay out a table full of text boxes using matplotlib. It should be obvious how to do this: create a gridspec for the table members, fill in each element of the grid, take the maximum heights and widths of the elements in the grid, change the appropriate height and widths of the grid columns and rows. Easy peasy, right?
Wrong.
Everything works except the measurements of the items themselves. Matplotlib consistently returns the wrong size for each item. I believe that I have been able to track this down to not even being able to measure the size of a text path correctly:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatch
import matplotlib.text as mtext
import matplotlib.path as mpath
import matplotlib.patches as mpatches
fig, ax = plt.subplots(1, 1)
ax.set_axis_off()
text = '!?' * 16
size=36
## Build and measure hidden text path
text_path=mtext.TextPath(
(0.0, 0.0),
text,
prop={'size' : size}
)
vertices = text_path.vertices
code = text_path.codes
min_x, min_y = np.min(
text_path.vertices[text_path.codes != mpath.Path.CLOSEPOLY], axis=0)
max_x, max_y = np.max(
text_path.vertices[text_path.codes != mpath.Path.CLOSEPOLY], axis=0)
## Transform measurement to graph units
transData = ax.transData.inverted()
((local_min_x, local_min_y),
(local_max_x, local_max_y)) = transData.transform(
((min_x, min_y), (max_x, max_y)))
## Draw a box which should enclose the path
x_offset = (local_max_x - local_max_y) / 2
y_offset = (local_max_y - local_min_y) / 2
local_min_x = 0.5 - x_offset
local_min_y = 0.5 - y_offset
local_max_x = 0.5 + x_offset
local_max_y = 0.5 + y_offset
path_data = [
(mpath.Path.MOVETO, (local_min_x, local_min_y)),
(mpath.Path.LINETO, (local_max_x, local_min_y)),
(mpath.Path.LINETO, (local_max_x, local_max_y)),
(mpath.Path.LINETO, (local_min_x, local_max_y)),
(mpath.Path.LINETO, (local_min_x, local_min_y)),
(mpath.Path.CLOSEPOLY, (local_min_x, local_min_y)),
]
codes, verts = zip(*path_data)
path = mpath.Path(verts, codes)
patch = mpatches.PathPatch(
path,
facecolor='white',
edgecolor='red',
linewidth=3)
ax.add_patch(patch)
## Draw the text itself
item_textbox = ax.text(
0.5, 0.5,
text,
bbox=dict(boxstyle='square',
fc='white',
ec='white',
alpha=0.0),
transform=ax.transAxes,
size=size,
horizontalalignment="center",
verticalalignment="center",
alpha=1.0)
plt.show()
Run this under Python 3.8
Expect: the red box to be the exact height and width of the text
Observe: the red box is the right height, but is most definitely not the right width.
There doesn't seem to be any way to do this directly, but there is an indirect way: instead of using a text box, use TextPath, transform it to Axes coordinates, and then take the difference between the min and max on each coordinate. (See https://matplotlib.org/stable/gallery/text_labels_and_annotations/demo_text_path.html#sphx-glr-gallery-text-labels-and-annotations-demo-text-path-py for a sample implementation. This implementation has a significant bug: it uses vertices and codes directly, which breaks in the case of a clipped text path.)
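A minimal sketch of that indirect approach, under the assumption that TextPath coordinates are in points (1/72 inch), so that multiplying by dpi/72 gives pixels. The helper name text_size_in_axes is hypothetical, and Path.get_extents() is used here instead of reading vertices and codes by hand.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.textpath import TextPath


def text_size_in_axes(ax, text, size):
    """Hypothetical helper: (width, height) of `text` as fractions of the axes."""
    # Bounding box of the glyph outlines, assumed to be in points.
    bbox = TextPath((0, 0), text, size=size).get_extents()
    px_per_pt = ax.figure.dpi / 72.0
    width_px = bbox.width * px_per_pt
    height_px = bbox.height * px_per_pt
    # ax.bbox is the axes rectangle in display (pixel) coordinates.
    return width_px / ax.bbox.width, height_px / ax.bbox.height


fig, ax = plt.subplots()
w, h = text_size_in_axes(ax, "!?" * 16, 36)
```

The result depends on the figure size and dpi at the time of the call, so it should be recomputed if either changes.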

Python: Changing visual parameters of ptitprince repo derived from seaborn and matplotlib

I am using a github repository called ptitprince, which is derived from seaborn and matplotlib, to generate graphs.
For example, this is the code using the ptitprince repo:
# coding: utf8
import pandas as pd
import ptitprince as pt
import seaborn as sns
import os
import matplotlib.pyplot as plt
#sns.set(style="darkgrid")
#sns.set(style="whitegrid")
#sns.set_style("white")
sns.set(style="whitegrid",font_scale=2)
import matplotlib.collections as clt
df = pd.read_csv ("u118phag.csv", sep= ",")
df.head()
savefigs = True
figs_dir = 'figs'
if savefigs:
    # Make the figures folder if it doesn't yet exist
    if not os.path.isdir('figs'):
        os.makedirs('figs')
#automation
f, ax = plt.subplots(figsize=(4, 5))
#f.subplots_adjust(hspace=0,wspace=0)
dx = "Treatment"; dy = "score"; ort = "v"; pal = "Set2"; sigma = .2
ax=pt.RainCloud(x = dx, y = dy, data = df, palette = pal, bw = sigma,
width_viol = .6, ax = ax, move=.2, offset=.1, orient = ort, pointplot = True)
f.show()
if savefigs:
    f.savefig('figs/figure20.png', bbox_inches='tight', dpi=500)
which generates the following graph
The raw code not using ptitprince is as follows and produces the same graph as above:
# coding: utf8
import pandas as pd
import ptitprince as pt
import seaborn as sns
import os
import matplotlib.pyplot as plt
#sns.set(style="darkgrid")
#sns.set(style="whitegrid")
#sns.set_style("white")
sns.set(style="whitegrid",font_scale=2)
import matplotlib.collections as clt
df = pd.read_csv ("u118phag.csv", sep= ",")
df.head()
savefigs = True
figs_dir = 'figs'
if savefigs:
    # Make the figures folder if it doesn't yet exist
    if not os.path.isdir('figs'):
        os.makedirs('figs')
f, ax = plt.subplots(figsize=(7, 5))
dy="Treatment"; dx="score"; ort="h"; pal = sns.color_palette(n_colors=1)
#adding color
pal = "Set2"
f, ax = plt.subplots(figsize=(7, 5))
ax=pt.half_violinplot( x = dx, y = dy, data = df, palette = pal, bw = .2, cut = 0.,
scale = "area", width = .6, inner = None, orient = ort)
ax=sns.stripplot( x = dx, y = dy, data = df, palette = pal, edgecolor = "white",
size = 3, jitter = 1, zorder = 0, orient = ort)
ax=sns.boxplot( x = dx, y = dy, data = df, color = "black", width = .15, zorder = 10,\
showcaps = True, boxprops = {'facecolor':'none', "zorder":10},\
showfliers=True, whiskerprops = {'linewidth':2, "zorder":10},\
saturation = 1, orient = ort)
if savefigs:
    f.savefig('figs/figure21.png', bbox_inches='tight', dpi=500)
Now, what I'm trying to do is to figure out how to modify the graph so that I can (1) move the plots closer together, so there is not so much white space between them, and (2) shift the x-axis to the right, so that I can make the distribution (violin) plot wider without it getting cut in half by the y-axis.
I have tried to play around with subplots_adjust() as you can see in the first box of code, but I receive an error. I cannot figure out how to appropriately use this function, or even if that will actually bring the different graphs closer together.
I also know that I can increase the distribution size by increasing the value width = .6, but if I increase it too much, the distribution plot begins to be cut off by the y-axis. I can't figure out whether I need to adjust the overall plot using plt.subplots, or whether I need to move each individual plot.
Any advice or recommendations on how to change the visuals of the graph? I've been staring at this for a while, and I can't figure out how to make seaborn/matplotlib play nicely with ptitprince.
You could try changing the displayed x-axis interval with ax.set_xbound (pass a lower bound that is smaller than the current one).
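A minimal sketch of that suggestion on a plain matplotlib Axes. The plotted points are stand-in data rather than the RainCloud plot, and the amount of extra room (extra) is an arbitrary assumption.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(7, 5))
ax.plot([0.5, 1.0, 1.5], [1, 2, 3], "o")  # stand-in data

# Read the current bounds, then move only the lower one further left.
lower, upper = ax.get_xbound()
extra = 0.5  # assumed amount of extra room on the left
ax.set_xbound(lower=lower - extra, upper=upper)
new_lower, new_upper = ax.get_xbound()
```

set_xbound does not switch off autoscaling, so it is safest to call it after all plotting calls on the Axes.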

How to plot a thermometer?

In a recent, very broad question it was asked how to plot several symbols, like "circles, squares, rectangles, stars, thermometers, and boxplots" with matplotlib. From that list, all but thermometers are obvious, as they are either shown in the documentation or covered in many existing Stack Overflow answers. Since the OP did not seem interested in thermometers at all, I'd rather ask a new question specifically about thermometers here.
How to plot thermometers in matplotlib?
In principle you can plot any symbol you like, making it either a marker or a Path. There does not seem to be any unicode symbol for thermometers though. Font Awesome has a thermometer symbol, and plotting FontAwesome symbols in matplotlib is possible. Yet there are only 5 different fill levels.
Also, the color of such font symbol is uniform, yet ideally one would have the inner part of a thermometer (the "mercury pillar") in a different color (probably mostly red for associative reasons) or in different colors as to encode temperature in color as well.
So is it possible to have a temperature symbol where the mercury pillar encodes temperature (or in fact any other quantity) in terms of color and filling level? And if so, how?
(I gave an answer below, alternatives to or improvements of that method are welcome as further answers here.)
An option to plot a thermometer consisting of two parts is to create two Paths, the outer hull and the inner mercury pillar. For this one can create the Paths from scratch and allow the inner path to be variable depending on a (normalized) input parameter.
Both paths can then be plotted as individual scatter plots. In the following, we create a class with a scatter method that works similarly to the usual scatter, except that it also takes the additional arguments temp for the temperature and tempnorm for the normalization of the temperature.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.path as mpath
class TemperaturePlot():
    @staticmethod
    def get_hull():
        verts1 = np.array([[0,-128],[70,-128],[128,-70],[128,0],
                           [128,32.5],[115.8,61.5],[96,84.6],[96,288],
                           [96,341],[53,384],[0,384]])
        verts2 = verts1[:-1,:] * np.array([-1,1])
        codes1 = [1,4,4,4,4,4,4,2,4,4,4]
        verts3 = np.array([[0,-80],[44,-80],[80,-44],[80,0],
                           [80,34.3],[60.7,52],[48,66.5],[48,288],
                           [48,314],[26.5,336],[0,336]])
        verts4 = verts3[:-1,:] * np.array([-1,1])
        verts = np.concatenate((verts1, verts2[::-1], verts4, verts3[::-1]))
        codes = codes1 + codes1[::-1][:-1]
        return mpath.Path(verts/256., codes+codes)
    @staticmethod
    def get_mercury(s=1):
        a = 0; b = 64; c = 35
        d = 320 - b
        e = (1-s)*d
        verts1 = np.array([[a,-b],[c,-b],[b,-c],[b,a],[b,c],[c,b],[a,b]])
        verts2 = verts1[:-1,:] * np.array([-1,1])
        verts3 = np.array([[0,0],[32,0],[32,288-e],[32,305-e],
                           [17.5,320-e],[0,320-e]])
        verts4 = verts3[:-1,:] * np.array([-1,1])
        codes = [1] + [4]*12 + [1,2,2,4,4,4,4,4,4,2,2]
        verts = np.concatenate((verts1, verts2[::-1], verts3, verts4[::-1]))
        return mpath.Path(verts/256., codes)
    def scatter(self, x,y, temp=1, tempnorm=None, ax=None, **kwargs):
        self.ax = ax or plt.gca()
        temp = np.atleast_1d(temp)
        ec = kwargs.pop("edgecolor", "black")
        kwargs.update(linewidth=0)
        # inner scatter: the mercury pillar, colored by the usual scatter arguments
        self.inner = self.ax.scatter(x,y, **kwargs)
        kwargs.update(c=None, facecolor=ec, edgecolor=None, color=None)
        # outer scatter: the hull, drawn in the edge color
        self.outer = self.ax.scatter(x,y, **kwargs)
        self.outer.set_paths([self.get_hull()])
        if not tempnorm:
            mi, ma = np.nanmin(temp), np.nanmax(temp)
            if mi == ma:
                mi=0
            tempnorm = plt.Normalize(mi,ma)
        ipaths = [self.get_mercury(tempnorm(t)) for t in temp]
        self.inner.set_paths(ipaths)
Usage of this class could look like this,
plt.rcParams["figure.figsize"] = (5.5,3)
plt.rcParams["figure.dpi"] = 72*3
fig, ax = plt.subplots()
p = TemperaturePlot()
p.scatter([.25,.5,.75], [.3,.4,.5], s=[800,1200,1600], temp=[28,39,35], color="C3",
ax=ax, transform=ax.transAxes)
plt.show()
where we plot three thermometers with different temperatures depicted by the fill level of the "mercury" pillar. Since no normalization is given, the temperatures [28,39,35] are normalized between their minimum and maximum.
Or we can use color (c) and temp to show the temperature, as in
np.random.seed(42)
fig, ax = plt.subplots()
n = 42
x = np.linspace(0,100,n)
y = np.cumsum(np.random.randn(n))+5
ax.plot(x,y, color="darkgrey", lw=2.5)
p = TemperaturePlot()
p.scatter(x[::4],y[::4]+3, s=300, temp=y[::4], c=y[::4], edgecolor="k", cmap="RdYlBu_r")
ax.set_ylim(-6,18)
plt.show()
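To make the default normalization in the first usage example concrete, here is a small worked sketch of what a min-max Normalize does to the temperatures [28, 39, 35], next to an explicit range. The 0 to 50 range is an arbitrary assumption chosen for illustration; an instance like it is what the tempnorm argument expects.

```python
import numpy as np
from matplotlib.colors import Normalize  # the same class as plt.Normalize

temps = np.array([28, 39, 35])

# Default behaviour when no tempnorm is passed: scale between min and max.
default_norm = Normalize(vmin=temps.min(), vmax=temps.max())
levels = default_norm(temps)  # fill levels 0, 1 and 7/11

# An explicit range (assumed 0..50) keeps fill levels comparable across plots.
explicit_norm = Normalize(vmin=0, vmax=50)
explicit_levels = explicit_norm(temps)  # fill levels 0.56, 0.78 and 0.70
```

With the default, the coldest thermometer is always empty and the hottest always full, whatever the absolute values are.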

python scatter plot with errorbars and colors mapping a physical quantity

I'm trying to do a quite simple scatter plot with error bars and semilogy scale. What is a little bit different from tutorials I have found is that the color of the scatterplot should trace a different quantity. On one hand, I was able to do a scatterplot with the errorbars with my data, but just with one color. On the other hand, I realized a scatterplot with the right colors, but without the errorbars.
I'm not able to combine the two different things.
Here is an example using fake data:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import division
import numpy as np
import matplotlib.pyplot as plt
n=100
Lx_gas = 1e40*np.random.random(n) + 1e37
Tx_gas = np.random.random(n) + 0.5
Lx_plus_error = Lx_gas
Tx_plus_error = Tx_gas/2.
Tx_minus_error = Tx_gas/4.
#actually positive numbers, this is the quantity that should be traced by the
#color, in this example I use random numbers
Lambda = np.random.random(n)
#this is actually different from zero, but I want to be sure that this simple
#code works with the log axis
Lx_minus_error = np.zeros_like(Lx_gas)
#normalize the color, to be between 0 and 1
colors = np.asarray(Lambda)
colors -= colors.min()
colors *= (1./colors.max())
#build the error arrays
Lx_error = [Lx_minus_error, Lx_plus_error]
Tx_error = [Tx_minus_error, Tx_plus_error]
##--------------
##important part of the script
##this works, but all the dots are of the same color
#plt.errorbar(Tx_gas, Lx_gas, xerr = Tx_error,yerr = Lx_error,fmt='o')
##this is what is should be in terms of colors, but it is without the error bars
#plt.scatter(Tx_gas, Lx_gas, marker='s', c=colors)
##what I tried (and failed)
plt.errorbar(Tx_gas, Lx_gas, xerr = Tx_error,yerr = Lx_error,\
color=colors, fmt='o')
ax = plt.gca()
ax.set_yscale('log')
plt.show()
I even tried to plot the scatterplot after the errorbar, but for some reason everything plotted on the same window is put in background with respect to the errorplot.
Any ideas?
Thanks!
You can set the color on the LineCollection objects returned by errorbar, as described here.
from __future__ import division
import numpy as np
import matplotlib.pyplot as plt
n=100
Lx_gas = 1e40*np.random.random(n) + 1e37
Tx_gas = np.random.random(n) + 0.5
Lx_plus_error = Lx_gas
Tx_plus_error = Tx_gas/2.
Tx_minus_error = Tx_gas/4.
#actually positive numbers, this is the quantity that should be traced by the
#color, in this example I use random numbers
Lambda = np.random.random(n)
#this is actually different from zero, but I want to be sure that this simple
#code works with the log axis
Lx_minus_error = np.zeros_like(Lx_gas)
#normalize the color, to be between 0 and 1
colors = np.asarray(Lambda)
colors -= colors.min()
colors *= (1./colors.max())
#build the error arrays
Lx_error = [Lx_minus_error, Lx_plus_error]
Tx_error = [Tx_minus_error, Tx_plus_error]
sct = plt.scatter(Tx_gas, Lx_gas, marker='s', c=colors)
cb = plt.colorbar(sct)
_, __ , errorlinecollection = plt.errorbar(Tx_gas, Lx_gas, xerr = Tx_error,yerr = Lx_error, marker = '', ls = '', zorder = 0)
error_color = sct.to_rgba(colors)
errorlinecollection[0].set_color(error_color)
errorlinecollection[1].set_color(error_color)
ax = plt.gca()
ax.set_yscale('log')
plt.show()

Creating a hexagonal grid (u-matrix) in Python using a RegularPolyCollection

I am trying to create a hexagonal grid to use with a u-matrix in Python (3.4) using a RegularPolyCollection (see code below) and have run into two problems:
The hexagonal grid is not tight. When I plot it there are empty spaces between the hexagons. I can fix this by resizing the window, but since this is not reproducible and I want all of my plots to have the same size, this is not satisfactory. But even if it were, I run into the second problem.
Either the top or right hexagons don't fit in the figure and are cropped.
I have tried a lot of things (changing figure size, subplots_adjust(), different areas, different values of d, etc.) and I am starting to go crazy! It feels like the solution should be simple, but I simply cannot find it!
import SOM
import matplotlib.pyplot as plt
from matplotlib.collections import RegularPolyCollection
import numpy as np
import matplotlib.cm as cm
from mpl_toolkits.axes_grid1 import make_axes_locatable
m = 3 # The height
n = 3 # The width
# Some maths regarding hexagon geometry
d = 10
s = d/(2*np.cos(np.pi/3))
h = s*(1+2*np.sin(np.pi/3))
r = d/2
area = 3*np.sqrt(3)*s**2/2
# The center coordinates of the hexagons are calculated.
x1 = np.array([d*x for x in range(2*n-1)])
x2 = x1 + r
x3 = x2 + r
y = np.array([h*x for x in range(2*m-1)])
c = []
for i in range(2*m-1):
    if i%4 == 0:
        c += [[x,y[i]] for x in x1]
    if (i-1)%2 == 0:
        c += [[x,y[i]] for x in x2]
    if (i-2)%4 == 0:
        c += [[x,y[i]] for x in x3]
c = np.array(c)
# The color of the hexagons
d_matrix = np.zeros(3*3)
# Creating the figure
fig = plt.figure(figsize=(5, 5), dpi=100)
ax = fig.add_subplot(111)
# The collection
coll = RegularPolyCollection(
numsides=6, # a hexagon
rotation=0,
sizes=(area,),
edgecolors = (0, 0, 0, 1),
array= d_matrix,
cmap = cm.gray_r,
offsets = c,
transOffset = ax.transData,
)
ax.add_collection(coll, autolim=True)
ax.axis('off')
ax.autoscale_view()
plt.show()
See this topic
You also need to set the axis limits explicitly, for example:
ax.axis([xmin, xmax, ymin, ymax])
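A minimal sketch of how such limits could be derived from the hexagon centers. The staggered centers array, the pad margin, and the scatter markers standing in for the collection are all illustrative assumptions, not the code from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

d = 10  # distance between neighbouring centers, as in the question
# Assumed 3x3 staggered layout: odd rows are shifted by half a cell.
centers = np.array([[d * i + (d / 2) * (j % 2), j * d * np.sqrt(3) / 2]
                    for j in range(3) for i in range(3)])

pad = d  # assumed margin so that edge hexagons are not cropped
xmin, ymin = centers.min(axis=0) - pad
xmax, ymax = centers.max(axis=0) + pad

fig, ax = plt.subplots(figsize=(5, 5))
ax.scatter(centers[:, 0], centers[:, 1], marker="h")  # stand-in for the collection
ax.axis([xmin, xmax, ymin, ymax])
ax.set_aspect("equal")
```

This only addresses the cropping. The sizes of a RegularPolyCollection are given in points squared, so the gaps between hexagons still depend on how many points one data unit spans in the figure.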
The hexalattice module for Python (pip install hexalattice) provides a solution to both of your concerns:
Grid tightness: you have full control over the hexagon border gap via the 'plotting_gap' argument.
Cropping: the grid plotting takes the final grid size into account and adds sufficient margins to avoid the crop.
Here is a code example that demonstrates the control of the gap, and correctly fits the grid into the plotting window:
from hexalattice.hexalattice import *
create_hex_grid(nx=5, ny=5, do_plot=True) # Create 5x5 grid with no gaps
create_hex_grid(nx=5, ny=5, do_plot=True, plotting_gap=0.2)
See this answer for additional usage examples, more images and links
Disclosure: the hexalattice module was written by me
