Horizontal grid only (in Python using pandas plot + pyplot)

I would like to get only a horizontal grid using pandas plot.
The built-in grid parameter of pandas' plot only accepts grid=True or grid=False, so I tried matplotlib pyplot instead, changing the axes parameters with this code:
import pandas as pd
import matplotlib.pyplot as plt
fig = plt.figure()
ax2 = plt.subplot()
ax2.grid(axis='x')
df.plot(kind='bar',ax=ax2, fontsize=10, sort_columns=True)
plt.show(fig)
But I get no grid, neither horizontal nor vertical. Is Pandas overwriting the axes? Or am I doing something wrong?

Try setting the grid after plotting the DataFrame. Also, to get the horizontal grid, you need to use ax2.grid(axis='y'): the grid lines that belong to the y-axis are drawn at the y ticks and therefore run horizontally. Below is an answer using a sample DataFrame.
I have restructured how you define ax2 by making use of subplots.
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'lab':['A', 'B', 'C'], 'val':[10, 30, 20]})
fig, ax2 = plt.subplots()
df.plot(kind='bar', ax=ax2, fontsize=10)  # sort_columns dropped: recent pandas no longer accepts it
ax2.grid(axis='y')
plt.show()
Alternatively, you can use the Axes object returned by DataFrame.plot directly to turn on the horizontal grid (no separate plt.figure() call is needed, since the plot call creates its own figure when no ax is passed):
ax2 = df.plot(kind='bar', fontsize=10)
ax2.grid(axis='y')
A third option, as suggested by @ayorgo in the comments, is to chain the two commands:
df.plot(kind='bar', ax=ax2, fontsize=10).grid(axis='y')
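Here is a minimal sketch of the same fix with the visibility passed explicitly. It relies on matplotlib's documented behavior that ax.grid() called without a visibility argument toggles the current state, so passing True or False makes the result independent of the state the pandas call left the axes in. The helper name bar_with_horizontal_grid is made up for this illustration, and sort_columns is left out on the assumption of a recent pandas where that argument no longer exists.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop it for interactive use
import matplotlib.pyplot as plt
import pandas as pd


def bar_with_horizontal_grid(df):
    """Hypothetical helper: bar plot with horizontal grid lines only."""
    fig, ax = plt.subplots()
    df.plot(kind="bar", ax=ax, fontsize=10)
    # Set the grid AFTER plotting, and state the visibility explicitly
    ax.grid(True, axis="y")   # lines at the y ticks, i.e. horizontal lines
    ax.grid(False, axis="x")  # no vertical lines
    ax.set_axisbelow(True)    # keep the grid behind the bars
    return ax


df = pd.DataFrame({"lab": ["A", "B", "C"], "val": [10, 30, 20]})
ax = bar_with_horizontal_grid(df)
# Call plt.show() here when using an interactive backend
```

The set_axisbelow(True) call is optional; it only keeps the grid lines from being drawn over the bars.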

Related

Heatmap with multi-color y-axis and corresponding colorbar

I want to create a heatmap with seaborn, similar to this (with the following code):
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
# Create data
df = pd.DataFrame(np.random.random((5,5)), columns=["a","b","c","d","e"])
# Default heatmap
ax = sns.heatmap(df)
plt.show()
I'd also like to add a new variable (let's say new_var = pd.DataFrame(np.random.random((5,1)), columns=["new variable"])), such that the values (and possibly the spine and ticks as well) of the y-axis are colored according to the new variable, with a second colorbar plotted in the same figure to represent the colors of the y-axis values. How can I do that?
This uses the new values to color the y-ticks and the y-tick labels and adds the associated colorbar.
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns
import pandas as pd
import numpy as np
# Create data
df = pd.DataFrame(np.random.random((5,5)), columns=["a","b","c","d","e"])
# Default heatmap
ax = sns.heatmap(df)
new_var = pd.DataFrame(np.random.random((5,1)), columns=["new variable"])
# Create the colorbar for y-ticks and labels
norm = plt.Normalize(new_var["new variable"].min(), new_var["new variable"].max())
cmap = plt.get_cmap('turbo')  # matplotlib.cm.get_cmap is gone from recent matplotlib
yticks_locations = ax.get_yticks()
yticks_labels = df.index.values
#hide original ticks
ax.tick_params(axis='y', left=False)
ax.set_yticklabels([])
for var, ytick_loc, ytick_label in zip(new_var.values, yticks_locations, yticks_labels):
    color = cmap(norm(float(var)))
    ax.annotate(ytick_label, xy=(1, ytick_loc), xycoords='data', xytext=(-0.4, ytick_loc),
                arrowprops=dict(arrowstyle="-", color=color, lw=1), zorder=0, rotation=90, color=color)
# Add colorbar for y-tick colors
sm = plt.cm.ScalarMappable(cmap=cmap, norm=norm)
cb = ax.figure.colorbar(sm)
# Match the seaborn style
cb.outline.set_visible(False)
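A lighter variant of the same idea colors the existing tick labels instead of replacing them with annotations. This is only a sketch under stated assumptions: it uses plain matplotlib, with pcolormesh standing in for sns.heatmap, so that it does not depend on a particular seaborn version, and the data is random stand-in data. The loop over ax.get_yticklabels() is the relevant part.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop it for interactive use
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize

rng = np.random.default_rng(0)
data = rng.random((5, 5))   # stand-in for df
new_var = rng.random(5)     # stand-in for the "new variable" column

fig, ax = plt.subplots()
mesh = ax.pcolormesh(data)  # plain-matplotlib stand-in for sns.heatmap
ax.invert_yaxis()           # heatmaps list the first row at the top
ax.set_yticks(np.arange(5) + 0.5)
ax.set_yticklabels([str(i) for i in range(5)])
fig.colorbar(mesh, ax=ax)   # first colorbar: the cell values

# Color each y tick label according to the second variable
norm = Normalize(vmin=new_var.min(), vmax=new_var.max())
cmap = plt.get_cmap("turbo")
for label, value in zip(ax.get_yticklabels(), new_var):
    label.set_color(cmap(norm(value)))

# Second colorbar: explains the tick label colors
second_cb = fig.colorbar(ScalarMappable(norm=norm, cmap=cmap), ax=ax)
second_cb.set_label("new variable")
```

Compared with the annotate approach, this keeps the original ticks in place, at the cost of not drawing colored connector lines.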
I found your problem interesting, and inspired by the unanswered comment above:
How do you change the second colorbar position? For example, one on top the other on bottom sides. - Py-ser
I decided to spend a while doing some tests. After a little digging I found that cbar_kws={"orientation": "horizontal"} is the argument for sns.heatmap that makes the colorbar horizontal.
Borrowing the code from the solution and making some changes, you can format your plot the way you want as in:
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns
import pandas as pd
import numpy as np
# Create data
df = pd.DataFrame(np.random.random((5,5)), columns=["a","b","c","d","e"])
# Default heatmap
ax = sns.heatmap(df, cbar_kws={"orientation": "horizontal"}, square = False, annot = True)
new_var = pd.DataFrame(np.random.random((5,1)), columns=["new variable"])
# Create the colorbar for y-ticks and labels
norm = plt.Normalize(new_var["new variable"].min(), new_var["new variable"].max())
cmap = plt.get_cmap('turbo')  # matplotlib.cm.get_cmap is gone from recent matplotlib
yticks_locations = ax.get_yticks()
yticks_labels = df.index.values
#hide original ticks
ax.tick_params(axis='y', left=False)
ax.set_yticklabels([])
for var, ytick_loc, ytick_label in zip(new_var.values, yticks_locations, yticks_labels):
    color = cmap(norm(float(var)))
    ax.annotate(ytick_label, xy=(1, ytick_loc), xycoords='data', xytext=(-0.4, ytick_loc),
                arrowprops=dict(arrowstyle="-", color=color, lw=1), zorder=0, rotation=90, color=color)
# Add colorbar for y-tick colors
sm = plt.cm.ScalarMappable(cmap=cmap, norm=norm)
cb = ax.figure.colorbar(sm)
# Match the seaborn style
cb.outline.set_visible(False)
Also, you will notice that I annotated each cell of the heatmap with its value; this is only there to make it easier to check that everything works as expected.
I'm still not very happy with the shape/size of the horizontal colorbar, but I'll keep testing and update any progress by editing this answer!
==========================================
EDIT
Just to keep track of the updates: first I tried changing only some parameters of seaborn's heatmap function, though I wouldn't consider this a major improvement. By adding
ax = sns.heatmap(df, cbar_kws = dict(use_gridspec=True, location="top", shrink =0.6), square = True, annot = True)
I end up with:
I did manage to separate the colorbar using matplotlib's subplots routine, and honestly I believe this is the right way given the control over parameters it offers:
# Define two rows for subplots
fig, (cax, ax) = plt.subplots(nrows=2, figsize=(5,5.025), gridspec_kw={"height_ratios":[0.025, 1]})
# Default heatmap
sns.heatmap(df, ax=ax, cbar=False, annot=True)  # pass ax explicitly so the heatmap lands in the lower subplot
# colorbar
fig.colorbar(ax.get_children()[0], cax=cax, orientation="horizontal")
plt.show()
I obtained:
Which is still not the prettiest graph I've ever made, but now the position and size of the heatmap can be edited normally within the plt.subplots subroutines that give absolute control over these parameters.

Multiple count plots in seaborn

I have a CSV file with multiple columns, and I am trying to draw count plots side by side for selected columns. With the code below I can plot only two columns; when I try to add more, it does not work. How can I plot multiple selected columns side by side?
Also, when I plot two graphs they overlap. How can I increase the gap between them?
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
train_data = pd.read_csv(r"train_ctrUa4K.csv")
plt.figure(figsize=(10, 8))
fig, ax =plt.subplots(1,2)
sns.countplot(train_data['Gender'], ax=ax[0])
sns.countplot(train_data['Dependents'], ax=ax[1])
#sns.countplot(train_data['Self_Employed'], ax=ax[1])
#sns.countplot(train_data['Property_Area'], ax=ax[1,1])
fig.show()
Change the number of columns in the call to subplots():
fig, ax = plt.subplots(1,4)
# pass the column as x=...; recent seaborn versions no longer accept it positionally
sns.countplot(x=train_data['Gender'], ax=ax[0])
sns.countplot(x=train_data['Dependents'], ax=ax[1])
sns.countplot(x=train_data['Self_Employed'], ax=ax[2])
sns.countplot(x=train_data['Property_Area'], ax=ax[3])
If you have too many subplots to fit on a single line, you can increase the number of rows as well. Be careful that if you have more than one row and more than one column, then the variable ax will be a 2D array:
fig, ax = plt.subplots(2,2)
sns.countplot(x=train_data['Gender'], ax=ax[0,0])
sns.countplot(x=train_data['Dependents'], ax=ax[0,1])
sns.countplot(x=train_data['Self_Employed'], ax=ax[1,0])
sns.countplot(x=train_data['Property_Area'], ax=ax[1,1])
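The overlap mentioned in the question can be handled in the same call sequence. Below is a minimal sketch, assuming a small made-up DataFrame in place of train_ctrUa4K.csv (only the column names come from the question). It passes figsize to plt.subplots itself, because a separate plt.figure(figsize=...) call beforehand creates a different, unused figure. It loops over axes.flat so the same code works for any grid shape, and calls fig.tight_layout() to add spacing between the subplots.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop it for interactive use
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up stand-in for the CSV from the question; only the column names are real
train_data = pd.DataFrame({
    "Gender": ["Male", "Female", "Male", "Male"],
    "Dependents": ["0", "1", "0", "2"],
    "Self_Employed": ["No", "No", "Yes", "No"],
    "Property_Area": ["Urban", "Rural", "Urban", "Semiurban"],
})
columns = ["Gender", "Dependents", "Self_Employed", "Property_Area"]

# figsize belongs in the subplots() call that creates the figure
fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# axes.flat iterates over the 2D array of Axes row by row
for ax, column in zip(axes.flat, columns):
    sns.countplot(x=column, data=train_data, ax=ax)

# Add padding between subplots so labels and titles do not overlap
fig.tight_layout(pad=2.0)
```

If finer control is needed, fig.subplots_adjust(wspace=..., hspace=...) sets the horizontal and vertical gaps directly.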

Seaborn Heatmap: Move colorbar on top of the plot

I have a basic heatmap created using the seaborn library, and want to move the colorbar from the default, vertical and on the right, to a horizontal one above the heatmap. How can I do this?
Here's some sample data and an example of the default:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
# Create data
df = pd.DataFrame(np.random.random((5,5)), columns=["a","b","c","d","e"])
# Default heatmap
ax = sns.heatmap(df)
plt.show()
Looking at the documentation, we find an argument cbar_kws. It lets you specify arguments that are passed on to matplotlib's fig.colorbar method.
cbar_kws : dict of key, value mappings, optional.
Keyword arguments for fig.colorbar.
So we can use any of the possible arguments to fig.colorbar, providing a dictionary to cbar_kws.
In this case you need location="top" to place the colorbar on top. Because colorbar by default positions the colorbar using a gridspec, which then does not allow for the location to be set, we need to turn that gridspec off (use_gridspec=False).
sns.heatmap(df, cbar_kws = dict(use_gridspec=False,location="top"))
Complete example:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.random((5,5)), columns=["a","b","c","d","e"])
ax = sns.heatmap(df, cbar_kws = dict(use_gridspec=False,location="top"))
plt.show()
I would like to show an example with subplots, which lets you control the size of the plot and so preserve the square geometry of the heatmap. The example is very short:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
# Create data
df = pd.DataFrame(np.random.random((5,5)), columns=["a","b","c","d","e"])
# Define two rows for subplots
fig, (cax, ax) = plt.subplots(nrows=2, figsize=(5,5.025), gridspec_kw={"height_ratios":[0.025, 1]})
# Draw heatmap
sns.heatmap(df, ax=ax, cbar=False)
# colorbar
fig.colorbar(ax.get_children()[0], cax=cax, orientation="horizontal")
plt.show()
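The ax.get_children()[0] lookup depends on the order in which artists were added to the axes. Here is a slightly more explicit sketch, assuming that sns.heatmap draws the cells as a single QuadMesh (via pcolormesh) so that the mappable is the first entry of ax.collections. The figure size and height ratio are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop it for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib.collections import QuadMesh

df = pd.DataFrame(np.random.random((5, 5)), columns=["a", "b", "c", "d", "e"])

# Thin top row for the colorbar, large bottom row for the heatmap
fig, (cax, ax) = plt.subplots(
    nrows=2, figsize=(5, 5.25), gridspec_kw={"height_ratios": [0.05, 1]}
)
sns.heatmap(df, ax=ax, cbar=False)

# The heatmap cells are a QuadMesh stored in the axes' collections
mesh = ax.collections[0]
cb = fig.colorbar(mesh, cax=cax, orientation="horizontal")

# Put the colorbar ticks above the bar so they do not collide with the heatmap
cax.xaxis.set_ticks_position("top")
```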
You have to use axes divider to put colorbar on top of a seaborn figure. Look for the comments.
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
from mpl_toolkits.axes_grid1.axes_divider import make_axes_locatable
# Create data
df = pd.DataFrame(np.random.random((5,5)), columns=["a","b","c","d","e"])
# Use axes divider to put cbar on top
# plot heatmap without colorbar
ax = sns.heatmap(df, cbar = False)
# split axes of heatmap to put colorbar
ax_divider = make_axes_locatable(ax)
# define size and padding of axes for colorbar
cax = ax_divider.append_axes('top', size = '5%', pad = '2%')
# make colorbar for heatmap.
# Heatmap returns an axes obj but you need to get a mappable obj (get_children)
# (mpl_toolkits.axes_grid1.colorbar is gone from recent matplotlib, so use the figure's own colorbar method)
ax.figure.colorbar(ax.get_children()[0], cax = cax, orientation = 'horizontal')
# locate colorbar ticks
cax.xaxis.set_ticks_position('top')
plt.show()
For more info read this official example of matplotlib: https://matplotlib.org/gallery/axes_grid1/demo_colorbar_with_axes_divider.html?highlight=demo%20colorbar%20axes%20divider
A heatmap argument like sns.heatmap(df, cbar_kws = {'orientation':'horizontal'}) is not enough here, because it puts the colorbar at the bottom.

Histogram at specific coordinates inside axes

What I want to achieve with Python 3.6 is something like this :
Obviously made in paint and missing some ticks on the xAxis. Is something like this possible? Essentially, can I control exactly where to plot a histogram (and with what orientation)?
I specifically want them to be on the same axes just like the figure above and not on separate axes or subplots.
fig = plt.figure()
ax2Handler = fig.gca()
ax2Handler.scatter(np.array(np.arange(0,len(xData),1)), xData)
ax2Handler.hist(xData,bins=60,orientation='horizontal',normed=True)
This and other approaches (such as inverting the axes) gave me no results. xData is loaded from a pandas DataFrame.
# This also doesn't work as intended
fig = plt.figure()
axHistHandler = fig.gca()
axScatterHandler = fig.gca()
axHistHandler.invert_xaxis()
axHistHandler.hist(xData,orientation='horizontal')
axScatterHandler.scatter(np.array(np.arange(0,len(xData),1)), xData)
A. using two axes
There is simply no reason not to use two different axes. The plot from the question can easily be reproduced with two different axes:
import numpy as np
import matplotlib.pyplot as plt
plt.style.use("ggplot")
xData = np.random.rand(1000)
fig,(ax,ax2)= plt.subplots(ncols=2, sharey=True)
fig.subplots_adjust(wspace=0)
ax2.scatter(np.linspace(0,1,len(xData)), xData, s=9)
ax.hist(xData,bins=60,orientation='horizontal',density=True)  # density replaces the removed normed argument
ax.invert_xaxis()
ax.spines['right'].set_visible(False)
ax2.spines['left'].set_visible(False)
ax2.tick_params(axis="y", left=0)
plt.show()
B. using a single axes
Just for the sake of answering the question: In order to plot both in the same axes, one can shift the bars by their length towards the left, effectively giving a mirrored histogram.
import numpy as np
import matplotlib.pyplot as plt
plt.style.use("ggplot")
xData = np.random.rand(1000)
fig,ax= plt.subplots(ncols=1)
fig.subplots_adjust(wspace=0)
ax.scatter(np.linspace(0,1,len(xData)), xData, s=9)
xlim1 = ax.get_xlim()
_,__,bars = ax.hist(xData,bins=60,orientation='horizontal',density=True)  # density replaces the removed normed argument
for bar in bars:
    bar.set_x(-bar.get_width())
xlim2 = ax.get_xlim()
ax.set_xlim(-xlim2[1],xlim1[1])
plt.show()
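The same mirrored histogram can also be drawn without moving the bars afterwards. The sketch below is an alternative under stated assumptions, not the method above: it computes the bin densities with np.histogram and hands them to ax.barh with left=-density, so each bar starts at minus its own length and ends at x = 0. The data is random stand-in data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop it for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x_data = rng.random(1000)  # stand-in for xData

fig, ax = plt.subplots()
ax.scatter(np.linspace(0, 1, len(x_data)), x_data, s=9)

# Bin the data ourselves; density=True normalizes the histogram area to 1
density, edges = np.histogram(x_data, bins=60, density=True)
bin_heights = np.diff(edges)

# Each bar spans [-density, 0] horizontally and one bin vertically
bars = ax.barh(edges[:-1], density, height=bin_heights,
               left=-density, align="edge")
```

Multiplying density by a constant before plotting would shrink or stretch the histogram relative to the scatter plot.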
You might be interested in seaborn jointplots:
# Import and fake data
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
data = np.random.randn(2,1000)
# actual plot
jg = sns.jointplot(x=data[0], y=data[1], marginal_kws={"bins":100})  # x and y passed by keyword for recent seaborn
jg.ax_marg_x.set_visible(False) # remove the top axis
plt.subplots_adjust(top=1.15) # fill the empty space
produces this:
See more examples of bivariate distribution representations, available in Seaborn.

Overplot seaborn regplot and swarmplot

I would like to overplot a swarmplot and regplot in seaborn, so that I can have a y=x line through my swarmplot.
Here is my code:
import matplotlib.pyplot as plt
import seaborn as sns
sns.regplot(y=y, x=x, marker=' ', color='k')
sns.swarmplot(x=x_data, y=y_data)
I don't get any errors when I plot, but the regplot never shows on the plot. How can I fix this?
EDIT: My regplot and swarmplot don't overplot and instead, plot in the same frame but separated by some unspecified y amount. If I flip them so regplot is above the call to swarmplot, regplot doesn't show up at all.
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
df = pd.DataFrame({"x":x_data,"y":y_data} )
sns.regplot(y="y", x="x", data= df, color='k', scatter_kws={"alpha" : 0.0})
sns.swarmplot(y="y", x="x", data= df)
SECOND EDIT: The double axis solution from below works beautifully!
In principle the approach of plotting a swarmplot and a regplot simultaneously works fine.
The problem here is that you set an empty marker (marker = " "). This destroys the regplot, such that it's not shown. Apparently this is only an issue when plotting several things to the same graph; plotting a single regplot with empty marker works fine.
The solution would be not to specify the marker argument, but instead set the markers invisible by using the scatter_kws argument: scatter_kws={"alpha" : 0.0}.
Here is a complete example:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
## generate some data
n=19; m=9
y_data = []
for i in range(m):
    a = (np.random.poisson(lam=0.99-float(i)/m,size=n)+i*.9+np.random.rand(1)*2)
    a+=(np.random.rand(n)-0.5)*2
    y_data.append(a*m)
y_data = np.array(y_data).flatten()
x_data = np.floor(np.sort(np.random.rand(n*m))*m)
## put them into dataframe
df = pd.DataFrame({"x":x_data,"y":y_data} )
## plotting
sns.regplot(y="y", x="x", data= df, color='k', scatter_kws={"alpha" : 0.0})
sns.swarmplot(x="x", y="y", data= df)
plt.show()
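As a hedged alternative to making the markers transparent, regplot also has a scatter parameter. With scatter=False only the regression line is drawn, and ci=None additionally drops the confidence band. The sketch below uses made-up data whose x values start at 0, so the numeric positions used by regplot coincide with the category positions used by swarmplot.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop it for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up data: 5 categories (0..4) with 10 noisy points each
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": np.repeat(np.arange(5), 10)})
df["y"] = df["x"] + rng.normal(scale=0.3, size=len(df))

fig, ax = plt.subplots()
# scatter=False: draw only the fitted line; ci=None: no confidence band
sns.regplot(x="x", y="y", data=df, color="k", scatter=False, ci=None, ax=ax)
sns.swarmplot(x="x", y="y", data=df, ax=ax)
```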
Concerning the edited part of the question:
Since swarmplot is a categorical plot, the axis in the plot still goes from -0.5 to 8.5 and not as the labels suggest from 10 to 18.
A possible workaround is to use two axes and twiny.
fig, ax = plt.subplots()
ax2 = ax.twiny()
sns.swarmplot(x="x", y="y", data= df, ax=ax)
sns.regplot(y="y", x="x", data= df, color='k', scatter_kws={"alpha" : 0.0}, ax=ax2)
ax2.grid(False) #remove grid as it overlays the other plot
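The mismatch described above can be made concrete with a few lines of NumPy. This is a sketch with made-up numbers, and category_positions is a hypothetical helper: it maps each x value to the integer slot (0, 1, 2, ...) that a categorical axis assigns to it, assuming the categories are shown in sorted order. It then fits a line in that position space, so the line could be drawn on the swarmplot's own axes instead of on a twin axis.

```python
import numpy as np


def category_positions(x):
    """Hypothetical helper: sorted unique categories and each value's slot index."""
    categories, codes = np.unique(np.asarray(x), return_inverse=True)
    return categories, codes


# Made-up data; note that the value 13 never occurs
x_data = np.array([10, 10, 11, 12, 12, 12, 14])
y_data = np.array([1.0, 1.2, 2.1, 2.9, 3.0, 3.3, 5.2])

categories, codes = category_positions(x_data)

# Fit y against the slot index, not against the original x values
slope, intercept = np.polyfit(codes, y_data, deg=1)
line_x = np.arange(len(categories))
line_y = slope * line_x + intercept

# On the swarmplot's Axes this could then be drawn with:
#   ax.plot(line_x, line_y, color="k")
```

The fit lives in slot space: the gap between 12 and 14 counts as one step, just as it does on the categorical axis. If the original x spacing matters, the twiny approach is the better fit.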
