I'd like to draw a boxplot with rounded corners, but I'm not sure how. I saw a post on making rounded corners for a barplot, but had no luck adapting it to a boxplot. ax.artists is a list of matplotlib.patches.PathPatch objects, and I think they control the box styles.
Below is some sample code:
import pandas as pd
import numpy as np
import seaborn as sns
df = pd.DataFrame(np.random.rand(100, 1), columns=['value'])
df['type'] = pd.Series(np.repeat(['type1','type2', 'type3', 'type4'], 25))
ax = sns.boxplot(data=df, x="type", y="value")
There are similar questions (e.g. Bar chart with rounded corners and Seaborn barplot with rounded corners). Those solutions need quite a few adaptations to be usable here.
The boxplot's rectangles aren't stored as Rectangle objects, but as PathPatch artists. To get their bounding box, the extent of their path needs to be calculated.
The parameters of FancyBboxPatch need some experimenting. Setting pad=0 makes the rounded rectangle occupy the same space as the original box. mutation_aspect (defaults to 1) is needed to make the vertical boxes look good. For your own application, some fine-tuning might be needed.
from matplotlib import pyplot as plt
from matplotlib.patches import FancyBboxPatch
from matplotlib.path import get_path_collection_extents
import seaborn as sns
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(100, 1), columns=['value'])
df['type'] = pd.Series(np.repeat(['type1', 'type2', 'type3', 'type4'], 25))
ax = sns.boxplot(data=df, x="type", y="value")
new_patches = []
for patch in reversed(ax.artists):
    bb = patch.get_path().get_extents()
    color = patch.get_facecolor()
    p_bbox = FancyBboxPatch((bb.xmin, bb.ymin),
                            abs(bb.width), abs(bb.height),
                            boxstyle="round,pad=0,rounding_size=0.2",
                            ec="black", fc=color,
                            mutation_aspect=0.2)
    patch.remove()
    new_patches.append(p_bbox)
for patch in new_patches:
    ax.add_patch(patch)
plt.show()
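The same replacement idea can be sketched without seaborn, using matplotlib's own ax.boxplot with patch_artist=True as a stand-in for sns.boxplot (an assumption made here only to keep the example small). The random data and the rounded list are illustrative names. Depending on your matplotlib and seaborn versions, the seaborn boxes may show up in ax.patches rather than ax.artists, so it is worth checking both.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import FancyBboxPatch

rng = np.random.default_rng(0)
fig, ax = plt.subplots()
# patch_artist=True makes matplotlib draw the boxes as PathPatch objects
result = ax.boxplot([rng.random(25) for _ in range(4)], patch_artist=True)

rounded = []
for box in result["boxes"]:
    # The path is in data coordinates; its extents give the box rectangle
    bb = box.get_path().get_extents()
    rounded.append(FancyBboxPatch((bb.xmin, bb.ymin), bb.width, bb.height,
                                  boxstyle="round,pad=0,rounding_size=0.2",
                                  ec="black", fc=box.get_facecolor(),
                                  mutation_aspect=0.2))
    box.remove()  # drop the original sharp-cornered box

for patch in rounded:
    ax.add_patch(patch)
```

As in the answer above, rounding_size and mutation_aspect are starting values to tune, not recommended settings.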
Related
When I run the code below, the heatmap is not square even though I passed square=True. Any idea how I can draw the heatmap in a square format? Thank you!
The code:
from datetime import datetime
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os
import seaborn as sns
temp_hourly_A5_A7_AX_ASHRAE=pd.read_csv('C:\\Users\\cvaa4\\Desktop\\projects\\s\\temp_hourly_A5_A7_AX_ASHRAE.csv',index_col=0, parse_dates=True, dayfirst=True, skiprows=2)
sns.heatmap(temp_hourly_A5_A7_AX_ASHRAE,cmap="YlGnBu", vmin=18, vmax=27, square=True, cbar=False, linewidth=0.0001);
The result:
square=True should give you square cells. Below is a working example:
import pandas as pd
import numpy as np
import seaborn as sns
df = pd.DataFrame(np.tile([0,1], 15*15).reshape(-1,15))
sns.heatmap(df, square=True)
If you want a square shape of the plot however, you can use set_aspect and the shape of the data:
ax = sns.heatmap(df)
ax.set_aspect(df.shape[1]/df.shape[0]) # here 0.5 Y/X ratio
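To see why that ratio gives a square plot, here is a minimal sketch that uses matplotlib's pcolormesh as a stand-in for sns.heatmap (an assumption, so that the example needs only NumPy and matplotlib). With 30 rows and 15 columns, an aspect of 15/30 = 0.5 draws each y unit half as long as an x unit, so the 30-unit height matches the 15-unit width.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

data = np.tile([0, 1], 15 * 15).reshape(-1, 15)   # 30 rows x 15 columns
fig, ax = plt.subplots(figsize=(6, 4))
ax.pcolormesh(data)                               # stand-in for sns.heatmap(df)
ax.set_aspect(data.shape[1] / data.shape[0])      # columns / rows = 0.5
fig.canvas.draw()                                 # the aspect is applied at draw time

bb = ax.get_window_extent()                       # axes box in pixels
```

After drawing, bb.width and bb.height should agree to within a pixel, even though the figure itself is not square.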
You can use matplotlib and set a figsize before plotting heatmap.
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
rnd = np.random.default_rng(12345)
data = rnd.uniform(-100, 100, [100, 50])
plt.figure(figsize=(6, 5))
sns.heatmap(data, cmap='viridis');
Note that I used figsize=(6, 5) rather than a square figsize=(5, 5). This is because seaborn also places the colorbar inside the figure, which squeezes the heatmap horizontally. Adjust the figsize to suit what you need.
Currently displaying some data with Seaborn / Pandas. I'm looking to overlay the mean of each category (x=ks2) - but can't figure out how to do this with Seaborn.
I can remove the inner="box" - but want to replace that with a marker for the mean of each category.
Ideally, then link each mean calculated...
Any pointers greatly received.
Cheers
Science.csv has 9k+ entries
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
sns.set(style="whitegrid", palette="pastel", color_codes=True)
# Load the dataset
# df = pd.read_csv("science.csv") << loaded from csv
df = pd.DataFrame({'ks2': [1, 1, 2,3,3,4],
'science': [40, 50, 34,20,0,44]})
# Draw a nested violinplot and split the violins for easier comparison
sns.violinplot(x="ks2", y="science", data=df, split=True,
inner="box",linewidth=2)
sns.despine(left=True)
plt.savefig('plot.png')
Try importing mean from NumPy:
from numpy import mean
Then overlay sns.pointplot with estimator=mean on the violin plot:
sns.pointplot(x='ks2', y='science', data=df, estimator=mean)
Then play with linestyles to style the line that links the means.
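If you prefer to compute the means yourself, a rough alternative to pointplot is a pandas groupby followed by a plain matplotlib line with markers. This is a sketch, not the answer's method. It assumes the categories are drawn at positions 0, 1, 2, ... on the x-axis, which is how seaborn's categorical plots usually place them, and it uses a fresh Axes where you would normally pass the Axes returned by sns.violinplot.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"ks2": [1, 1, 2, 3, 3, 4],
                   "science": [40, 50, 34, 20, 0, 44]})

# One mean per category, sorted by category value
means = df.groupby("ks2")["science"].mean()
positions = list(range(len(means)))   # assumed categorical positions 0..n-1

fig, ax = plt.subplots()              # in practice: ax = sns.violinplot(...)
# Markers show each mean; the dashed line links them
ax.plot(positions, means.to_numpy(), marker="o", linestyle="--", color="k")
ax.set_xticks(positions)
ax.set_xticklabels([str(k) for k in means.index])
```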
I want to replicate plots from this paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5000555/pdf/nihms774453.pdf I'm particularly interested in the plot on page 16, right panel. I tried to do this in matplotlib, but it seems to me that there is no way to access the individual lines in a LineCollection.
I don't know how to change the color along each line according to the value at every index. I'd eventually like to get something like this: https://matplotlib.org/3.1.1/gallery/lines_bars_and_markers/multicolored_line.html but for every line, according to the data.
this is what I tried:
the data in numpy array: https://pastebin.com/B1wJu9Nd
import pandas as pd, numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
from matplotlib import colors as mcolors
%matplotlib inline
base_range = np.arange(qq.index.max()+1)
fig, ax = plt.subplots(figsize=(12,8))
ax.set_xlim(qq.index.min(), qq.index.max())
# ax.set_ylim(qq.columns[0], qq.columns[-1])
ax.set_ylim(-5, len(qq.columns) +5)
line_segments = LineCollection([np.column_stack([base_range, [y]*len(qq.index)]) for y in range(len(qq.columns))],
cmap='viridis',
linewidths=(5),
linestyles='solid',
)
line_segments.set_array(base_range)
ax.add_collection(line_segments)
axcb = fig.colorbar(line_segments)
plt.show()
my result:
what I want to achieve:
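One possible approach, following the multicolored-line gallery example linked above: split each horizontal line into short segments and give each segment its own value through set_array, instead of passing whole lines to a single LineCollection. This is a sketch under assumptions. The values array is random stand-in data, not the pastebin data, and a shared Normalize is assumed so that all rows use the same color scale.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import LineCollection

rng = np.random.default_rng(0)
values = rng.random((5, 50))          # hypothetical data: 5 lines x 50 positions
x = np.arange(values.shape[1])

fig, ax = plt.subplots(figsize=(12, 4))
norm = plt.Normalize(values.min(), values.max())   # one color scale for all rows

for row, vals in enumerate(values):
    # Points along the horizontal line y = row, shaped (n, 1, 2)
    pts = np.column_stack([x, np.full(len(x), float(row))]).reshape(-1, 1, 2)
    # Consecutive points paired into (n - 1) segments, shaped (n - 1, 2, 2)
    segs = np.concatenate([pts[:-1], pts[1:]], axis=1)
    lc = LineCollection(segs, cmap="viridis", norm=norm, linewidths=5)
    lc.set_array(vals[:-1])           # one value per segment
    ax.add_collection(lc)

ax.set_xlim(x.min(), x.max())
ax.set_ylim(-1, values.shape[0])
fig.colorbar(lc, ax=ax)
```

Each segment here takes the value at its left end; averaging the two endpoint values would be an equally reasonable choice.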
I want this plot's y-axis to be centered at 38, and the y-axis scaled such that the 'humps' disappear. How do I accomplish this?
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
s=['05/02/2019', '06/02/2019', '07/02/2019', '08/02/2019',
'09/02/2019', '10/02/2019', '11/02/2019', '12/02/2019',
'13/02/2019', '20/02/2019', '21/02/2019', '22/02/2019',
'23/02/2019', '24/02/2019', '25/02/2019']
df[0]=['38.02', '33.79', '34.73', '36.47', '35.03', '33.45',
'33.82', '33.38', '34.68', '36.93', '33.44', '33.55',
'33.18', '33.07', '33.17']
# Data for plotting
fig, ax = plt.subplots(figsize=(17, 2))
for i,j in zip(s,df[0]):
    ax.annotate(str(j),xy=(i,j+0.8))
ax.plot(s, df[0])
ax.set(xlabel='Dates', ylabel='Latency',
title='Hongkong to sing')
ax.grid()
#plt.yticks(np.arange(min(df[p]), max(df[p])+1, 2))
fig.savefig("test.png")
plt.show()
I'm not entirely certain this is what you're looking for, but you can set the y-limits explicitly to change the scale, e.g.
ax.set_ylim([ax.get_ylim()[0], 42])
This sets only the upper bound and leaves the lower limit unchanged.
You can supply any values you find appropriate, e.g.
ax.set_ylim([22, 52])
which flattens the curve further, because the data then occupies a smaller share of the axis.
Also note that the tick labels and general appearance of your plot will differ from what is shown here.
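The question also asks for the axis to be centered at 38. A minimal sketch of that, assuming limits symmetric around the chosen center are acceptable: center_ylim is a hypothetical helper written for this example, not a matplotlib function, and the margin value is arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

def center_ylim(ax, center, values, margin=1.0):
    """Set y-limits symmetric around `center`, covering all values plus a margin."""
    values = np.asarray(values, dtype=float)
    half = float(np.max(np.abs(values - center))) + margin
    ax.set_ylim(center - half, center + half)
    return center - half, center + half

latency = [38.02, 33.79, 34.73, 36.47, 35.03]   # first few values from the question
fig, ax = plt.subplots(figsize=(17, 3))
ax.plot(range(len(latency)), latency)
lo, hi = center_ylim(ax, center=38, values=latency, margin=10)
```

A larger margin flattens the curve more; a smaller one exaggerates the variation.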
Edit - Here is the complete code as requested. Note that it converts the string values to numbers with pd.to_numeric before plotting:
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df = pd.DataFrame()
s=['05/02/2019', '06/02/2019', '07/02/2019', '08/02/2019',
'09/02/2019', '10/02/2019', '11/02/2019', '12/02/2019',
'13/02/2019', '20/02/2019', '21/02/2019', '22/02/2019',
'23/02/2019', '24/02/2019', '25/02/2019']
df[0]=['38.02','33.79','34.73','36.47','35.03','33.45',
'33.82','33.38','34.68','36.93','33.44','33.55',
'33.18','33.07','33.17']
# Data for plotting
fig, ax = plt.subplots(figsize=(17, 3))
#for i,j in zip(s,df[0]):
# ax.annotate(str(j),xy=(i,j+0.8))
ax.plot(s, pd.to_numeric(df[0]))
ax.set(xlabel='Dates', ylabel='Latency',
title='Hongkong to sing')
ax.set_xticklabels(pd.to_datetime(s, dayfirst=True).strftime('%m.%d'), rotation=45)  # dates are dd/mm/yyyy
ax.set_ylim([22, 52])
plt.show()
I'm trying to create a heatmap in seaborn (Python) in which certain squares are drawn in a different color. These squares contain insignificant data: in my case, values below 1.3, which is the -log of p-values > 0.05. I couldn't find a function for this, and masking these squares didn't work either.
Here is my code:
import matplotlib.pyplot as plt
import numpy as np
import matplotlib as mpl
import seaborn as sns; sns.set()
data = [[1.3531363408, 3.339479161, 0.0760855365], [5.1167382617, 3.2890920405, 2.4764601828], [0.0025058257, 2.3165128345, 1.6532714962], [0.2600549869, 5.8427407219, 6.6627226609], [3.0828581725, 16.3825494439, 12.6722666929], [2.3386307357, 13.7275065772, 12.5760972276], [1.224683813, 2.2213656372, 0.6300876451], [0.4163788387, 1.8128374089, 0.0013106046], [0.0277592882, 2.9286203949, 0.810978992], [0.0086613622, 0.6181261247, 1.8287878837], [1.0174519889, 0.2621290291, 0.1922637697], [3.4687429571, 4.0061981716, 0.5507951444], [7.4201304939, 3.881457516, 0.1294141768], [2.5227546319, 6.0526491816, 0.3814362442], [8.147538027, 14.0975727815, 7.9755706939]]
cmap2 = mpl.colors.ListedColormap(sns.cubehelix_palette(n_colors=20, start=0, rot=0.4, gamma=1, hue=0.8, light=0.85, dark=0.15, reverse=False))
ax = sns.heatmap(data, cmap=cmap2, vmin=0)
plt.show()
I want to add that I'm not a very advanced programmer.
OK, so I can answer my question myself now :) Here is the code that solved the problem:
import matplotlib.pyplot as plt
import numpy as np
import matplotlib as mpl
import seaborn as sns; sns.set()
data = np.array([[1.3531363408, 3.339479161, 0.0760855365],
[5.1167382617, 3.2890920405, 2.4764601828],
[0.0025058257, 2.3165128345, 1.6532714962],
[0.2600549869, 5.8427407219, 6.6627226609],
[3.0828581725, 16.3825494439, 12.6722666929],
[2.3386307357, 13.7275065772, 12.5760972276],
[1.224683813, 2.2213656372, 0.6300876451],
[0.4163788387, 1.8128374089, 0.0013106046],
[0.0277592882, 2.9286203949, 0.810978992],
[0.0086613622, 0.6181261247, 1.8287878837],
[1.0174519889, 0.2621290291, 0.1922637697],
[3.4687429571, 4.0061981716, 0.5507951444],
[7.4201304939, 3.881457516, 0.1294141768],
[2.5227546319, 6.0526491816, 0.3814362442],
[8.147538027, 14.0975727815, 7.9755706939]])
cmap1 = mpl.colors.ListedColormap(['c'])
fig, ax = plt.subplots(figsize=(8, 8))
sns.heatmap(data, ax=ax)
sns.heatmap(data, mask=data > 1.3, cmap=cmap1, cbar=False, ax=ax)
plt.show()
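So the problem with masking, which didn't work before, was that the comparison data > 1.3 works only on NumPy arrays, not on plain lists.
The other part of the fix is plotting the heatmap twice, the second time with the mask.
The one thing that confused me is that the mask seems to act on the opposite cells from what is written: I wanted to mark values below 1.3, but mask=data < 1.3 colored the values above 1.3. The reason is that seaborn hides the cells where the mask is True. So mask=data > 1.3 hides the significant cells in the second heatmap and leaves only the values below 1.3 drawn in the single-color colormap.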
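The mask logic can be checked with NumPy alone. The sketch below uses the first few rows of the data above. It relies on the documented behavior that seaborn's heatmap does not draw cells where the mask is True. The names visible and overlay_cells are illustrative only.

```python
import numpy as np

rows = [[1.3531363408, 3.339479161, 0.0760855365],
        [5.1167382617, 3.2890920405, 2.4764601828],
        [0.0025058257, 2.3165128345, 1.6532714962]]

# Comparing a plain list with a number is not supported in Python 3
try:
    rows > 1.3
    list_comparison_works = True
except TypeError:
    list_comparison_works = False

data = np.array(rows)
mask = data > 1.3                      # True = cell hidden in the second heatmap

# Emulate "hidden where True": only unmasked cells remain
visible = np.ma.masked_array(data, mask=mask)
overlay_cells = visible.compressed()   # the cells the cyan overlay would draw
```

Only the values at or below 1.3 survive in overlay_cells, which is why mask=data > 1.3 is the right expression for highlighting the insignificant cells.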