I have the following code:
import matplotlib.pyplot as plt
import numpy as np
xticks = ['A','B','C']
Scores = np.array([[5,7],[4,6],[8,3]])
colors = ['red','blue']
fig, ax = plt.subplots()
ax.hist(Scores,bins=3,density=True,histtype='bar',color=colors)
plt.show()
Which gives the following output:
I have two questions:
How can I make the height of the bars represent the values in Scores? For example, the leftmost red bar should have height 5, the leftmost blue bar height 7, and so on.
How can I label the x-axis with the values from the xticks list? For example, the left two bars should have 'A' written under them, the next two 'B', and so on.
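You are confusing a histogram with a bar plot. A histogram counts how often values fall into bins, whereas here you want a bar plot, where each bar's height is the value itself. If you want to use pandas, this is going to be very easy: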
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
xticks = ['A','B','C']
Scores = np.array([[5,7],[4,6],[8,3]])
colors = ['red','blue']
names = ["Cat", "Dog"]
fig, ax = plt.subplots()
pd.DataFrame(Scores, index=xticks, columns=names).plot.bar(color=colors, ax=ax)
plt.show()
If using matplotlib alone, it's slightly more complicated, because each column needs to be plotted independently:
import matplotlib.pyplot as plt
import numpy as np
xticks = ['A','B','C']
Scores = np.array([[5,7],[4,6],[8,3]])
colors = ['red','blue']
names = ["Cat", "Dog"]
fig, ax = plt.subplots()
x = np.arange(len(Scores))
ax.bar(x-0.2, Scores[:,0], color=colors[0], width=0.4, label=names[0])
ax.bar(x+0.2, Scores[:,1], color=colors[1], width=0.4, label=names[1])
ax.set(xticks=x, xticklabels=xticks)
ax.legend()
plt.show()
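The two-column version above can be generalized to any number of columns. The following is only a sketch: `grouped_bar` and its `total_width` parameter are hypothetical names introduced here, not part of matplotlib, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

xticks = ['A', 'B', 'C']
Scores = np.array([[5, 7], [4, 6], [8, 3]])
colors = ['red', 'blue']
names = ["Cat", "Dog"]

def grouped_bar(ax, values, labels, colors, total_width=0.8):
    """Hypothetical helper: draw one bar series per column of `values`."""
    n_groups, n_series = values.shape
    width = total_width / n_series
    x = np.arange(n_groups)
    for j in range(n_series):
        # Shift each series so that the whole group stays centred on x
        offset = (j - (n_series - 1) / 2) * width
        ax.bar(x + offset, values[:, j], width=width,
               color=colors[j], label=labels[j])
    return x

fig, ax = plt.subplots()
x = grouped_bar(ax, Scores, names, colors)
ax.set(xticks=x, xticklabels=xticks)
ax.legend()
fig.canvas.draw()  # render once; use plt.show() in an interactive session
```

With two columns and `total_width=0.8` this reproduces the offsets of ±0.2 and the bar width of 0.4 used above.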
You already did a lot of the work for the histogram. Now you just need some bar plots.
import matplotlib.pyplot as plt
import numpy as np
xticks = ['A','B','C']
Scores = np.array([[5,7],[4,6],[8,3]])
colors = ['red','blue']
fig, ax = plt.subplots()
# Width of bars
w=.2
# Plot both separately
ax.bar([1,2,3],Scores[:,0],width=w,color=colors[0])
ax.bar(np.add([1,2,3],w),Scores[:,1],width=w,color=colors[1])
# Assumes you want ticks in the middle
ax.set_xticks(ticks=np.add([1,2,3],w/2))
ax.set_xticklabels(xticks)
plt.show()
plt.xticks(range(0, 6), ('A', 'A', 'B', 'B', 'C', 'C')) should answer the second question, I believe. I'm not sure about the heights, as I haven't worked with histograms.
I drew a comparison bar chart for very small values with the following code:
import pandas as pd
import matplotlib.pyplot as plt
data = [[ 0.00790019035339353, 0.00002112],
[0.0107705593109131, 0.0000328540802001953],
[0.0507792949676514, 0.0000541210174560547]]
df = pd.DataFrame(data, columns=['A', 'B'])
df.plot.bar()
plt.bar(df['A'], df['B'])
plt.show()
Because the values in column 'B' are so small (e.g. 0.00002112), their bars are not visible in the chart.
How can I modify the code so that the bars for the smaller values (column 'B') become visible? Thanks.
A common way to display data with different orders of magnitude is
to use a logarithmic scaling for the y-axis. Below the logarithm
to base 10 is used but other bases could be chosen.
import pandas as pd
import matplotlib.pyplot as plt
data = [[ 0.00790019035339353, 0.00002112],
[0.0107705593109131, 0.0000328540802001953],
[0.0507792949676514, 0.0000541210174560547]]
df = pd.DataFrame(data, columns=['A', 'B'])
df.plot.bar()
plt.yscale("log")
plt.show()
Update:
To change the formatting of the y-axis tick labels, an instance of ScalarFormatter can be used:
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import ScalarFormatter
data = [[ 0.00790019035339353, 0.00002112],
[0.0107705593109131, 0.0000328540802001953],
[0.0507792949676514, 0.0000541210174560547]]
df = pd.DataFrame(data, columns=['A', 'B'])
df.plot.bar()
plt.yscale("log")
plt.gca().yaxis.set_major_formatter(ScalarFormatter())
plt.show()
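You could create two y-axes like this:
import pandas as pd
import matplotlib.pyplot as plt
# same data as in the question
data = [[ 0.00790019035339353, 0.00002112],
[0.0107705593109131, 0.0000328540802001953],
[0.0507792949676514, 0.0000541210174560547]]
df = pd.DataFrame(data, columns=['A', 'B'])
fig, ax1 = plt.subplots()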
ax2 = ax1.twinx()
width = 0.2
df['A'].plot(kind='bar', color='green', ax=ax1, width=width, position=1, label = 'A')
df['B'].plot(kind='bar', color='blue', ax=ax2, width=width, position=0, label = 'B')
ax1.set_ylabel('A')
ax2.set_ylabel('B')
# legend
h1, l1 = ax1.get_legend_handles_labels()
h2, l2 = ax2.get_legend_handles_labels()
ax1.legend(h1+h2, l1+l2, loc=2)
plt.show()
I am trying to make a stacked histogram using matplotlib by looping through the categories in the dataframe and assigning the bar color based on a dictionary.
I get this error on the ax1.hist() call. How should I fix it?
AttributeError: 'numpy.ndarray' object has no attribute 'hist'
Reproducible Example
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
%matplotlib inline
plt.style.use('seaborn-whitegrid')
y = [1,5,9,2,4,2,5,6,1]
cat = ['A','B','B','B','A','B','B','B','B']
df = pd.DataFrame(list(zip(y,cat)), columns =['y', 'cat'])
fig, axes = plt.subplots(3,3, figsize=(5,5), constrained_layout=True)
fig.suptitle('Histograms')
ax1 = axes[0]
mycolorsdict = {'A':'magenta', 'B':'blue'}
for key, batch in df.groupby(['cat']):
    ax1.hist(batch.y, label=key, color=mycolorsdict[key],
             density=False, cumulative=False, edgecolor='black',
             orientation='horizontal', stacked=True)
Updated effort, still not working
This is close, but it is not stacking (should see stacks at y=5); I think maybe because of the loop?
mycolorsdict = {'A':'magenta', 'B':'blue'}
for ii, ax in enumerate(axes.flat):
    for key, batch in df.groupby(['cat']):
        ax.hist(batch.y,
                label=key, color=mycolorsdict[key], density=False, edgecolor='black',
                cumulative=False, orientation='horizontal', stacked=True)
To draw on a specific subplot, two indices are needed (row, column), so axes[0,0] for the first subplot. The error message comes from using ax1 = axes[0] instead of ax1 = axes[0,0].
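The indexing issue can be checked directly. This small sketch (the variable names `row` and `single` are chosen here for illustration, and the Agg backend is only selected so it runs without a display) shows that a 3x3 grid is a two-dimensional NumPy array, so a single index returns a whole row rather than one subplot:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

fig, axes = plt.subplots(3, 3)
print(type(axes), axes.shape)   # a NumPy array with shape (3, 3)

row = axes[0]        # first row: still an array, so it has no .hist()
single = axes[0, 0]  # one Axes object, which does have .hist()
print(hasattr(row, "hist"), hasattr(single, "hist"))

# axes.flat iterates over all nine Axes objects regardless of the grid shape
n_axes = sum(1 for _ in axes.flat)
print(n_axes)
```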
Now, to create a stacked histogram via ax.hist(), all the y-data need to be provided at the same time. The code below shows how this can be done starting from the result of groupby. Also note that when your values are discrete, it is important to set the bin boundaries explicitly, so that each value falls strictly between two boundaries. Setting the boundaries at the halves is one way.
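To see what the half-integer boundaries do, the per-category counts can be computed with np.histogram before any plotting. This is a sketch using the values from the question, split by category by hand; the names `y_a`, `y_b` and `edges` are introduced here for illustration only.

```python
import numpy as np

# Values from the question, split by category
y_a = [1, 4]                    # rows with cat == 'A'
y_b = [5, 9, 2, 2, 5, 6, 1]     # rows with cat == 'B'

# Edges at 0.5, 1.5, ..., 9.5: every integer 1..9 sits in the middle of its own bin
edges = np.arange(0.5, 10)

counts_a, _ = np.histogram(y_a, bins=edges)
counts_b, _ = np.histogram(y_b, bins=edges)

print(edges)
print(counts_a)
print(counts_b)
print(counts_a + counts_b)  # total length of each stacked bar
```

Passing both datasets to a single hist() call with stacked=True draws the second category on top of the first, so the bar totals correspond to the last printed array.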
Things can be simplified a lot using seaborn's histplot(). Here is a breakdown of the parameters used:
data=df: the dataframe
y='y': the dataframe column to build the histogram from. Use x= (instead of y=) for a vertical histogram.
hue='cat': the dataframe column used to create multiple groups
palette=mycolorsdict: the palette defines the coloring; there are many ways to assign a palette, one of which is a dictionary keyed on the hue values
discrete=True: when working with discrete data, seaborn sets the appropriate bin boundaries
multiple='stack': creates a stacked histogram, depending on the hue categories
alpha=1: by default seaborn sets an alpha of 0.75; optionally this can be changed
ax=axes[0, 1]: draw on the 2nd subplot of the 1st row
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
plt.style.use('seaborn-whitegrid')
y = [1, 5, 9, 2, 4, 2, 5, 6, 1]
cat = ['A', 'B', 'B', 'B', 'A', 'B', 'B', 'B', 'B']
df = pd.DataFrame({'y':y, 'cat':cat})
fig, axes = plt.subplots(3, 3, figsize=(20, 10), constrained_layout=True)
fig.suptitle('Histograms')
mycolorsdict = {'A': 'magenta', 'B': 'blue'}
groups = df.groupby('cat')  # a string key (not a list) keeps the group keys as plain strings
axes[0, 0].hist([batch.y for _, batch in groups],
                label=[key for key, _ in groups],
                color=[mycolorsdict[key] for key, _ in groups],
                density=False, edgecolor='black', cumulative=False,
                orientation='horizontal', stacked=True, bins=np.arange(0.5, 10))
axes[0, 0].legend()
sns.histplot(data=df, y='y', hue='cat', palette=mycolorsdict, discrete=True, multiple='stack', alpha=1, ax=axes[0, 1])
plt.show()
I'm rather new to Matplotlib, which I want to use for making (multiple) histograms of data counts with float intervals on the x-axis in a tkinter toplevel window. See below a highly simplified part of my code. Because I use float intervals, I need to use an ax2.hist(...) call instead of the ax1.bar(...) call. However, the result from ax2.hist(...) is not what I want. I would like to have the counts on the y-axis, as is the case in ax1. In other words, how do I get a histogram with the y-axis from ax1 and the x-axis from ax2?
I hope somebody can suggest how to deal with this. I couldn't find it on the matplotlib site so far.
import tkinter as tk
import matplotlib.pyplot as plt
from matplotlib.backends.backend_tkagg import FigureCanvasTkAgg
import numpy as np
import random
root = tk.Tk()
panel = tk.Toplevel()
panel.title('Title')
lijst = []
for i in range(100):
    a = random.randrange(100)
    a = a/10
    lijst.append(a)
nplijst = np.array(lijst)
counts, bins = np.histogram(nplijst)
names = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h' ,'i', 'j']
print(counts)
print(bins)
fig1 = plt.Figure()
ax1 = fig1.add_subplot(121)
ax2 = fig1.add_subplot(122)
ax1.bar(names, counts, width = 1, edgecolor="k")
ax2.hist(counts, bins = bins, edgecolor="k")
ax1.set_title('ax1')
ax2.set_title('ax2')
chart_type1 = FigureCanvasTkAgg(fig1, panel)
chart_type1.get_tk_widget().pack()
You can use the sharey keyword in plt.subplots like so:
fig1, axs = plt.subplots(1, 2, sharey=True) # create a 1x2 grid of plots
axs[0].bar(names, counts, width = 1, edgecolor="k")
axs[1].hist(counts, bins = bins, edgecolor="k")
axs[0].set_title('ax1')
axs[1].set_title('ax2')
If you want the tick labels back on the second plot, add
for ax in axs:
    ax.yaxis.set_tick_params(labelleft=True)
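A different route, not the one taken in the answer above, is to let hist() draw the precomputed counts directly on the float bin edges: each bin is represented by one point (its left edge) and weighted by its count. The sketch below assumes seeded random data standing in for `lijst`, uses plt.subplots() instead of the tkinter setup, and selects the Agg backend only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Assumption: seeded stand-in for the random values collected in `lijst`
rng = np.random.default_rng(0)
nplijst = rng.integers(0, 100, size=100) / 10

counts, bins = np.histogram(nplijst)

fig1, ax = plt.subplots()
# One point per bin (its left edge), weighted by that bin's count
_, _, bars = ax.hist(bins[:-1], bins=bins, weights=counts, edgecolor="k")
ax.set_xlabel("value")
ax.set_ylabel("count")
fig1.canvas.draw()
```

Here the y-axis shows the counts and the x-axis shows the float bin edges, which is the combination asked for in the question. Calling ax.hist(nplijst, bins=bins) on the raw data should give the same picture.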
Seaborn barplots have an xtick for every unique value along the x coordinate. But for the hue values, there are no ticks:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.DataFrame(columns=["model", "time", "value"])
df["model"] = ["on"]*2 + ["off"]*2
df["time"] = ["short", "long"] * 2
df["value"] = [1, 10, 2, 4]
fig, ax = plt.subplots()
bar = sns.barplot(data=df, x="model", hue="time", y="value", edgecolor="white")
Is it possible to add ticks for the hue, too?
Some of the hue colors are quite similar and I would like to add a text description, too.
You have to be careful about the number of hues that you might have in your dataset, and the number of categories and so forth.
If you have N categories, then they are each plotted at axis coordinates 0, 1, ..., N-1. The bars for the various hues are then centered around this coordinate. For 2 hues, as in your example, the bars are at x±0.2:
fig, ax = plt.subplots()
bar = sns.barplot(data=df, x="model", hue="time", y="value", edgecolor="white")
ax.set_xticks([-0.2,0.2, 0.8,1.2])
ax.set_xticklabels(["on/short","on/long",'off/short','off/long'])
Note that I would strongly recommend that you use order= and hue_order= in your call to barplot() to be sure that your labels match the bars.
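For more than two hues the offsets can be computed instead of typed by hand. This is a sketch, not seaborn API: `hue_tick_positions` is a hypothetical helper, and `group_width=0.8` is an assumption that matches the ±0.2 offsets stated above for two hues.

```python
def hue_tick_positions(categories, hues, group_width=0.8):
    """Hypothetical helper: tick position and label for every category/hue bar.

    Assumes each category group spans `group_width` and is split evenly
    between the hues, centred on the category's integer coordinate.
    """
    n = len(hues)
    bar_width = group_width / n
    positions, labels = [], []
    for i, cat in enumerate(categories):
        for j, hue in enumerate(hues):
            positions.append(i + (j - (n - 1) / 2) * bar_width)
            labels.append(f"{cat}/{hue}")
    return positions, labels

order = ["on", "off"]
hue_order = ["short", "long"]
positions, labels = hue_tick_positions(order, hue_order)
print(positions)
print(labels)
```

Passing the same `order` and `hue_order` lists to barplot() and then calling ax.set_xticks(positions) and ax.set_xticklabels(labels) keeps the labels aligned with the bars.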
My data frame has three columns: SKU, Saving and label (a categorical variable). When I call plt.legend(), it adds a legend entry for "Saving", but I want a legend for my colors (a, b, c, d). How can I do that?
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame(np.random.rand(100,1), columns=['Saving'])
df['SKU'] = np.arange(100)
df['label'] = np.random.choice(['a', 'b', 'c','d'], 100)
fig, ax = plt.subplots()
colors = {'a':'red', 'b':'blue', 'c':'green', 'd':'white'}
figSaving = ax.scatter(df['SKU'], df['Saving'], c=df['label'].apply(lambda x: colors[x]))
plt.show()
plt.legend is a callable. By writing plt.legend={'a', 'b', 'c', 'd'}, you are replacing that callable with a set, which in itself does nothing (except making it impossible to call legend afterwards). What you want to do is call plt.legend() with explicit handles, one per color. See https://matplotlib.org/users/legend_guide.html.
import pandas as pd
import numpy as np
import matplotlib.patches as mpatches
import matplotlib.pyplot as plt
df = pd.DataFrame(np.random.rand(100,1), columns=['Saving'])
df['SKU'] = np.arange(100)
df['label'] = np.random.choice(['a', 'b', 'c','d'], 100)
fig, ax = plt.subplots()
colors = {'a':'red', 'b':'blue', 'c':'green', 'd':'white'}
figSaving = ax.scatter(df['SKU'], df['Saving'], c=df['label'].apply(lambda x: colors[x]))
# build the legend
red_patch = mpatches.Patch(color='red', label='a')
blue_patch = mpatches.Patch(color='blue', label='b')
green_patch = mpatches.Patch(color='green', label='c')
white_patch = mpatches.Patch(color='white', label='d')
# set up for handles declaration
patches = [red_patch, blue_patch, green_patch, white_patch]
# define and place the legend
#legend = ax.legend(handles=patches,loc='upper right')
# alternative declaration for placing legend outside of plot
legend = ax.legend(handles=patches,bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
plt.show()
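The four patches can also be built in a loop from the colors dictionary, so the legend stays in sync if a label is added. This is a sketch under a few assumptions: seeded random data stands in for the data frame, the patches get a black edge (a choice made here so the white entry stays visible), and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.patches as mpatches
import matplotlib.pyplot as plt
import numpy as np

# Assumption: seeded stand-in for the SKU / Saving / label columns
rng = np.random.default_rng(0)
sku = np.arange(100)
saving = rng.random(100)
label = rng.choice(['a', 'b', 'c', 'd'], 100)

colors = {'a': 'red', 'b': 'blue', 'c': 'green', 'd': 'white'}

fig, ax = plt.subplots()
ax.scatter(sku, saving, c=[colors[l] for l in label])

# One legend handle per dictionary entry
handles = [mpatches.Patch(facecolor=color, edgecolor='black', label=name)
           for name, color in colors.items()]
legend = ax.legend(handles=handles, bbox_to_anchor=(1.05, 1), loc=2,
                   borderaxespad=0.)
fig.canvas.draw()
```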