How to get visible bars on a histogram/distribution plot? [duplicate] - python

While doing some practice problems using seaborn and a Jupyter notebook, I realized that the distplot() graphs did not have the darker outlines on the individual bins that all of the sample graphs in the documentation have. I tried creating the graphs using PyCharm and noticed the same thing. Thinking it was a seaborn problem, I tried some hist() charts using matplotlib, only to get the same results.
import matplotlib.pyplot as plt
import seaborn as sns
titanic = sns.load_dataset('titanic')
plt.hist(titanic['fare'], bins=30)
yielded the following graph:
Finally I stumbled across the 'edgecolor' parameter on the plt.hist() function, and setting it to black did the trick. Unfortunately I haven't found a similar parameter to use on the seaborn distplot() function, so I am still unable to get a chart that looks like it should.
I looked into changing the rcParams in matplotlib, but I have no experience with that and the following script I ran seemed to do nothing:
import matplotlib as mpl
mpl.rcParams['lines.linewidth'] = 1
mpl.rcParams['lines.color'] = 'black'
mpl.rcParams['patch.linewidth'] = 1
mpl.rcParams['patch.edgecolor'] = 'black'
mpl.rcParams['axes.linewidth'] = 1
mpl.rcParams['axes.edgecolor'] = 'black'
I was just kind of guessing at the value I was supposed to change, but running my graphs again showed no changes.
I then attempted to go back to the default settings using mpl.rcdefaults()
but once again, no change.
I reinstalled matplotlib using conda but still the graphs look the same. I am running out of ideas on how to change the default edge color for these charts. I am running the latest versions of Python, matplotlib, and seaborn using the Conda build.

As part of the update to matplotlib 2.0 the edges on bar plots are turned off by default. However, you may use the rcParam
plt.rcParams["patch.force_edgecolor"] = True
to turn the edges on globally.
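To see what this setting does without changing global state, here is a minimal sketch that scopes it with rc_context and inspects the edge color of the first bar. The random data and the helper name first_bar_edge_alpha are made up for illustration, and a reasonably recent matplotlib is assumed.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)  # illustrative data, not from the question


def first_bar_edge_alpha(force):
    """Draw a histogram and return the alpha of the first bar's edge color."""
    with plt.rc_context({"patch.force_edgecolor": force}):
        fig, ax = plt.subplots()
        _, _, patches = ax.hist(x, bins=30)
        alpha = patches[0].get_edgecolor()[3]
        plt.close(fig)
    return alpha


alpha_default = first_bar_edge_alpha(False)  # no edge drawn
alpha_forced = first_bar_edge_alpha(True)    # edge drawn in patch.edgecolor
print(alpha_default, alpha_forced)
```

Because rc_context restores the previous rcParams on exit, this is a convenient way to try the setting before making it permanent with plt.rcParams.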
Probably the easiest option is to specifically set the edgecolor when creating a seaborn plot, using the hist_kws argument,
ax = sns.distplot(x, hist_kws=dict(edgecolor="k", linewidth=2))
For matplotlib plots, you can directly use the edgecolor or ec argument.
plt.bar(x,y, edgecolor="k")
plt.hist(x, edgecolor="k")
Equally, for pandas plots,
df.plot(kind='hist',edgecolor="k")
A complete seaborn example:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
x = np.random.randn(100)
ax = sns.distplot(x, hist_kws=dict(edgecolor="k", linewidth=2))
plt.show()

As of March 2021,
sns.histplot(data, edgecolor='k', linewidth=2)
works.
Using hist_kws=dict(edgecolor="k", linewidth=2) gave an error:
AttributeError: 'PolyCollection' object has no property 'hist_kws'

Using one of the available seaborn styles could also solve your problem.
Available styles in seaborn are:
ticks
dark
darkgrid
white
whitegrid

Related

Cannot change default colormap in matplotlib

I am trying to set the default colormap (not just the color of a specific plot) for matplotlib in my jupyter notebook (Python 3). I found the commands: plt.set_cmap("gray") and mpl.rc('image', cmap='gray'), that should set the default colormap to gray, but both commands are just ignored during execution and I still get the old colormap.
I tried these two snippets:
import matplotlib as mpl
import matplotlib.pyplot as plt
mpl.rc('image', cmap='gray')
plt.hist([[1,2,3],[4,5,6]])
and
import matplotlib.pyplot as plt
plt.set_cmap("gray")
plt.hist([[1,2,3],[4,5,6]])
They should both generate a plot with gray tones. However, the histogram has colors, which correspond to the first two colors of the default colormap. What am I not getting?
Thanks to Chris's comment, I found the issue: it's not the default colormap that I need to change but the default color cycle. It's described here: How to set the default color cycle for all subplots with matplotlib?
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
from cycler import cycler
# Set the default color cycle to three shades of gray
colors = plt.cm.gray(np.linspace(0, 1, 3))
mpl.rcParams['axes.prop_cycle'] = cycler(color=colors)
plt.hist([[1,2,3],[4,5,6]])
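To confirm that the histogram bars take their colors from the color cycle rather than from image.cmap, a small sketch along these lines can help. The name gray_cycle and the sampling range 0.2 to 0.7 are illustrative choices (pure white bars would be invisible on a white background), and rc_context keeps the change local.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np
from cycler import cycler
from matplotlib.colors import to_hex

# Two grays sampled from the "gray" colormap, stored as hex strings
gray_cycle = [to_hex(c) for c in plt.cm.gray(np.linspace(0.2, 0.7, 2))]

with plt.rc_context({"axes.prop_cycle": cycler(color=gray_cycle)}):
    fig, ax = plt.subplots()  # axes pick up the cycle when they are created
    _, _, containers = ax.hist([[1, 2, 3], [4, 5, 6]])

# One bar container per dataset; read back the face color of its first bar
bar_colors = [to_hex(c.patches[0].get_facecolor()) for c in containers]
print(bar_colors)
```

Each dataset gets the next color in the cycle, which is why changing the colormap alone has no visible effect on plt.hist.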
Since you are passing two data sets, you'll need to specify two colors.
plt.hist([[1,2,3],[4,5,6]], color=['black','purple'])
You can make use of the color argument in matplotlib's plotting functions.
import matplotlib.pyplot as plt
plt.hist([[1,2,3],[4,5,6]], color=['gray','gray'])
With this method you have to specify a color for each dataset, hence the list of colors above.
If you are using a version of matplotlib prior to 2.0, you need to use rcParams (this still works in newer versions):
import matplotlib.pyplot as plt
plt.rcParams['image.cmap'] = 'gray'

Python Matplotlib/Seaborn/Jupyter - Putting bar plot in wrong place?

I'm using the following in a Jupyter notebook, using the latest Anaconda update (including Matplotlib 3.1.1).
Thanks to SpghttCd, I have the code to do a stacked horizontal bar, but Seaborn puts it on a new plot below the default one.
How might I best fix this problem?
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
data=pd.DataFrame(data={"R1":["Yes","Yes","Yes","No","No"]})
freq = data["R1"].value_counts(normalize=True)*100
fig,ax = plt.subplots()
freq.to_frame().T.plot.barh(stacked=True)
You see two axes in Jupyter because you create a fresh one with plt.subplots() and pandas also creates another one.
If you need to reuse an existing axes, pass it to the plotting method using the ax argument:
fig, axe = plt.subplots()
freq.to_frame().T.plot.barh(stacked=True, ax=axe)
See the pandas documentation for details; the plotting methods always expose an ax argument:
ax : Matplotlib axis object, optional
If you are happy for pandas to create the axes for you, as @Bharath M suggested, just issue:
axe = freq.to_frame().T.plot.barh(stacked=True)
Then you will see a single axes, and you can access it through the variable axe.
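As a quick check that passing ax really reuses the existing axes, here is a minimal sketch built on the data from the question. The variable names returned, n_axes and n_bars are only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd

data = pd.DataFrame({"R1": ["Yes", "Yes", "Yes", "No", "No"]})
freq = data["R1"].value_counts(normalize=True) * 100

fig, ax = plt.subplots()
# pandas draws into the axes we hand over and returns that same object
returned = freq.to_frame().T.plot.barh(stacked=True, ax=ax)

n_axes = len(fig.axes)     # still a single axes in the figure
n_bars = len(ax.patches)   # one stacked segment per category
print(returned is ax, n_axes, n_bars)
```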

How to show matplotlib plot from a figure object

My code contains the following lines:
from matplotlib.figure import Figure
figure = Figure(figsize=(10, 5), dpi=dpi)
How can I get matplotlib to show this figure? I also show it embedded in tkinter, which works fine. However, I would also like to be able to show it in the standard matplotlib window, but I can't for the life of me get it to work.
According to "AttributeError while trying to load the pickled matplotlib figure", a simple workaround is:
fig = plt.Figure(...)
......
managed_fig = plt.figure(...)
canvas_manager = managed_fig.canvas.manager
canvas_manager.canvas.figure = fig
fig.set_canvas(canvas_manager.canvas)
Note that I encountered "'Figure' object has no attribute '_original_dpi'" in my environment. Not sure if it's some compatibility issue between my PyPlot and the PyQt5. Just did a hack:
fig._original_dpi = 60
to get around this. Not sure if there are any better solutions.
I usually use matplotlib's pyplot for immediate generation (or produce images in jupyter notebooks). This would look like the following:
import matplotlib.pyplot as plt
figure = plt.figure(figsize=(10, 5), dpi=dpi)
plt.show()
This shows the (blank) figure as desired.

Plot semilogx with matplotlib then convert it into Bokeh

I plot a figure containing several curves using matplotlib and then try to convert it into bokeh:
import numpy as np
import matplotlib.pyplot as plt
from bokeh import mpl
from bokeh.plotting import show, output_file
num_plots = 6
colormap = plt.cm.gist_ncar
time = np.random.random_sample((300, 6))
s_strain = np.random.random_sample((300, 6))
def time_s_strain_bokeh(num_plots, colormap, time, s_strain):
    plt.gca().set_color_cycle([colormap(i) for i in np.linspace(0, 0.9, num_plots)])
    plt.figure(2)
    for i in range(0, num_plots):
        plt.plot(time[:,i], s_strain[:,i])
    plt.grid(True)
    # save it to bokeh
    output_file('anywhere.html')
    show(mpl.to_bokeh())
time_s_strain_bokeh(num_plots, colormap, time, s_strain)
It works fine. However, I want to have a semilogx plot. When I change plt.plot in the "for" loop to plt.semilogx, I get the following error:
UnboundLocalError: local variable 'laxis' referenced before assignment
What can I do to change the x-axis onto log scale?
I have the same issue! Half of the solution is this (suppose my data is in a pandas DataFrame called pd):
pd.plot(x='my_x_variable', y='my_y_variable')
p = mpl.to_bokeh()
p.x_mapper_type='log' # I found this property with p.properties_with_values()
show(p)
I edited this answer because I just found the second half of the solution:
With just the code above, the plot is semilog (ok!), but the x axis is flipped (mirrored)!
The solution I found is to explicitly redefine xlim:
p.x_range.start=0.007 # suppose pd['my_x_variable'] starts at 0.007
p.x_range.end=0.17 # suppose pd['my_x_variable'] ends at 0.17
With this, my plot became identical to the original matplotlib plot. The final code looks like:
pd.plot(x='my_x_variable', y='my_y_variable')
p = mpl.to_bokeh()
p.x_mapper_type='log'
p.x_range.start= pd['my_x_variable'].iloc[0] # positional indexing starts at 0, take care!
p.x_range.end= pd['my_x_variable'].iloc[-1]
show(p)
As of Bokeh 0.12, partial and incomplete MPL compatibility is provided by the third party mplexporter library, which now appears to be unmaintained. Full (or at least, much more complete) MPL compat support will not happen until the MPL team implements MEP 25. However, implementing MEP 25 is an MPL project task, and the timeline/schedule is entirely outside of the control of the Bokeh project.
The existing MPL compat based on mplexporter is provided "as-is" in case it is useful in the subset of simple situations that it currently works for. My suggestion is to use native Bokeh APIs directly for anything of even moderate complexity.
You can find an example of a semilog plot created using Bokeh APIs here:
http://docs.bokeh.org/en/latest/docs/user_guide/plotting.html#log-scale-axes
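For orientation, a native Bokeh version of a semilog-x plot might look roughly like the sketch below. It assumes a reasonably recent Bokeh release and uses made-up data; the curve and variable names are illustrative, not taken from the question.

```python
import numpy as np
from bokeh.models import LogScale
from bokeh.plotting import figure, show

x = np.logspace(-2, 1, 300)  # illustrative, strictly positive x values
y = np.sqrt(x)               # illustrative curve

# x_axis_type="log" asks Bokeh for a logarithmic x axis
p = figure(x_axis_type="log", title="semilog-x sketch")
p.line(x, y)

uses_log_x = isinstance(p.x_scale, LogScale)
print(uses_log_x)
# show(p)  # uncomment to open the plot in a browser
```

This avoids the matplotlib conversion layer entirely, so the axis direction and range come straight from the data.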
