I'm pretty new to matplotlib and I'm trying to plot an array for time-series use, but I don't use a date, only the index for order. I have another array with a colour code for every entry in the previous array.
I'm trying to plot them similar to this but only in one line.
My data looks like:
array = ['event0', 'event1', 'event2', 'event0', 'event6', ..]
colours = ['r', 'g', 'b', 'r', 'y', ..]
import matplotlib.pyplot as plt
# input data
array = ['event0', 'event1', 'event2', 'event0', 'event6']
colours = ['r', 'g', 'b', 'r', 'y']
# for easier plotting, convert your data into numerical data:
# Method 1: events of the same type get the same y-coordinate
array_numbers = [float(a.split('event')[1]) for a in array]
# Method 2 (alternative): all events get the same y-coordinate
array_numbers = [1 for a in array]
# Plotting:
# create figure, and subplots
fig = plt.figure()
ax = plt.subplot(111)
# plot newly generated numbers against their index, using the colours specified
ax.scatter(range(len(array_numbers)), array_numbers, c=colours)
# create ticklabels:
y_tick_labels = ['event{}'.format(a) for a in range(7)]
# set positions of ticks, and add names
ax.set_yticks(range(7))
ax.set_yticklabels(y_tick_labels)
You can use plt.scatter(x, y, c=colours)
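If the event names do not all end in a number, a dictionary lookup is a more general way to turn them into y-coordinates than splitting the string. The following is only a minimal sketch of that idea; the helper name event_positions is made up for illustration and is not part of matplotlib.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt


def event_positions(events):
    """Map each distinct event name to an integer y-position (hypothetical helper)."""
    names = sorted(set(events))
    lookup = {name: i for i, name in enumerate(names)}
    return [lookup[e] for e in events], names


array = ['event0', 'event1', 'event2', 'event0', 'event6']
colours = ['r', 'g', 'b', 'r', 'y']

y, names = event_positions(array)

fig, ax = plt.subplots()
# x is simply the index of each entry, y is the position of its event type
ax.scatter(range(len(y)), y, c=colours)
# label each y-position with the event name it stands for
ax.set_yticks(range(len(names)))
ax.set_yticklabels(names)
```

Call plt.show() (with an interactive backend) or fig.savefig(...) afterwards to see the result. Unlike method 1 above, this packs the event types onto consecutive rows, so event6 ends up directly above event2.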
I am working with matplotlib and below you can see my data and my plot.
import pandas as pd
import matplotlib.pyplot as plt
data = {
    'type_sale': ['g_1','g_2','g_3','g_4','g_5','g_6','g_7','g_8','g_9','g_10'],
    'open': [70,20,24,150,80,90,60,90,20,20],
}
df = pd.DataFrame(data, columns=['type_sale', 'open'])
df.plot(x='type_sale', kind='bar', title='Bar Graph')
Now I want to give the fourth bar a different color (color = 'red'). I tried, but I ended up coloring all the bars instead of only one.
Can anybody help me solve this?
The ax.bar() method returns a list of bars that you can then manipulate, in your case with .set_color():
import matplotlib.pyplot as plt
f=plt.figure()
ax=f.add_subplot(1,1,1)
## bar() will return a list of bars
barlist = ax.bar([1,2,3,4], [1,2,3,4])
barlist[3].set_color('r')
plt.show()
You can try this solution
# libraries
import numpy as np
import matplotlib.pyplot as plt
# create a dataset
height = [3, 12, 5, 18, 45]
bars = ('A', 'B', 'C', 'D', 'E')
x_pos = np.arange(len(bars))
# Create bars with different colors
plt.bar(x_pos, height, color=['black', 'red', 'green', 'blue', 'cyan'])
# Create names on the x-axis
plt.xticks(x_pos, bars)
# Show graph
plt.show()
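Applied to the data from the question, the same idea can be sketched by building the colour list programmatically, so that only the fourth bar differs. This is an illustration with plain lists rather than the DataFrame from the question, and 'C0' (the first colour of the default cycle) is just an assumed base colour.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba

labels = ['g_1', 'g_2', 'g_3', 'g_4', 'g_5', 'g_6', 'g_7', 'g_8', 'g_9', 'g_10']
values = [70, 20, 24, 150, 80, 90, 60, 90, 20, 20]

# one colour per bar: the default colour everywhere, red for the fourth bar
colors = ['C0'] * len(values)
colors[3] = 'red'

fig, ax = plt.subplots()
bars = ax.bar(labels, values, color=colors)
ax.set_title('Bar Graph')
```

The list approach is handy when the bar to highlight is chosen by a rule, for example colors[values.index(max(values))] = 'red'.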
Sometimes datasets have a number of variables with a selection of other 'things' that contribute to them. It can be useful to show the contribution (e.g. %) to a variable of these different 'things'. However, sometimes not all of the 'things' contribute to all of the variables. When plotting as a bar chart, this leads to spaces when a specific variable does not have a contribution from a 'thing'. Is there a way to just not plot the specific bar for a variable in a bar chart if the contribution of the 'thing' is zero?
An example below shows a selection of variables (a-j) that have various things that could contribute to them (1-5). NOTE: the gaps when the contribution of a 'thing' (1-5) to a variable (a-j) is zero.
from random import randrange
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
# Make the dataset of data for variables (a-j)
columns = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
data = np.array([np.random.randn(5)**2 for i in range(10)])
df = pd.DataFrame(data.T, columns=columns)
for col in df.columns:
    # Set up to 3 of the 5 'things' to NaN per column (randrange may repeat an index)
    for n in np.arange(3):
        idx = randrange(5)
        df.loc[list(df.index)[idx], col] = np.nan
    # Normalise the data to 100% of values
    df.loc[:, col] = df[col].values / df[col].sum() * 100
# Setup plot
figsize = matplotlib.figure.figaspect(.33)
fig = plt.figure(figsize=figsize)
ax = plt.gca()
df.T.plot.bar(rot=0, ax=ax)
# Add a legend and show
plt.legend(ncol=len(columns))
plt.show()
As commented, there's no inbuilt function for this. Here's an approach that you can explore:
# we will use this to shift the bars
shifted = df.notnull().cumsum()
# the width for each bar
width = 1 / len(df.columns)
fig = plt.figure(figsize=(10,3))
ax = plt.gca()
colors = [f'C{i}' for i in range(df.shape[1])]
for i, idx in enumerate(df.index):
    offsets = shifted.loc[idx]
    values = df.loc[idx]
    ax.bar(np.arange(df.shape[1]) + offsets*width, values,
           color=colors[i], width=width, label=idx)
ax.set_xticks(np.arange(df.shape[1]))
ax.set_xticklabels(df.columns)
ax.legend()
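A variation on the shifting idea above is to compute, for each group, the x-positions of only the values that are present and centre them on the group's tick. The sketch below uses plain Python lists instead of a DataFrame; compact_positions is a hypothetical helper, and the small data dictionary is made up for illustration.

```python
import math

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt


def compact_positions(values, group_center, width):
    """Return (series_index, x, value) for each non-NaN value.

    The bars are packed side by side and centred on group_center.
    """
    present = [(i, v) for i, v in enumerate(values) if not math.isnan(v)]
    start = group_center - width * (len(present) - 1) / 2
    return [(i, start + k * width, v) for k, (i, v) in enumerate(present)]


nan = float('nan')
# contribution (%) of 'things' 0-4 to each variable; NaN means no contribution
data = {
    'a': [40.0, nan, 60.0, nan, nan],
    'b': [nan, 25.0, nan, 25.0, 50.0],
    'c': [nan, nan, nan, 100.0, nan],
}

width = 0.15
fig, ax = plt.subplots(figsize=(8, 3))
for g, (name, values) in enumerate(data.items()):
    for i, x, v in compact_positions(values, g, width):
        # the colour depends on the 'thing', not on the slot it occupies
        ax.bar(x, v, width=width, color=f'C{i}')
ax.set_xticks(range(len(data)))
ax.set_xticklabels(list(data))
```

Because each bar is drawn individually, a legend would need one labelled proxy artist per 'thing'; that part is left out of the sketch.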
I am attempting to build a violin plot to illustrate depth on the y-axis and distance away from a known point on the x-axis. I am able to get the x-axis labels spaced appropriately along the x-axis based on the variable distances, but I am unable to get the violin plots to align with them. The plots appear to be shifted towards the y-axis. Any help would be appreciated. My code is below:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
path = r'O:\info1.csv'  # raw string, so the backslash is not treated as an escape
df = pd.read_csv(path)
item = ['a', 'b', 'c', 'd', 'e', 'f']
dist = [450, 1400, 2620, 3100, 3830, 4940]
plt.rcParams.update({'font.size': 15})
fig, axes1 = plt.subplots(figsize=(20,10))
axes1 = sns.violinplot(x='item', y='surface', data=df, hue = 'item', order = (item))
axes1.invert_yaxis()
axes1.set_xlabel('Item')
axes1.set_ylabel('Depth')
axes1.set_xticks(dist)
plt.xticks(rotation=20)
plt.show()
You cannot use the seaborn violin plot for this because, according to its documentation:
This function always treats one of the variables as categorical and
draws data at ordinal positions (0, 1, … n) on the relevant axis, even
when the data has a numeric or date type.
So if you draw it directly with seaborn, it is categorical:
sns.violinplot(x='dist', y='surface', data=df, hue = 'item',dodge=False,cut=0)
To place the violins according to their distance, you need to use matplotlib directly. First we get the data out in the required format and define a color palette:
surface_values = list([np.array(value) for name,value in df.groupby('item')['surface']])
dist_values = df.groupby('item')['dist'].agg("mean")
pal = ["crimson","darkblue","rebeccapurple"]
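You need to set the width and provide the distances as positions. For the inner "box", we adapt the code from Matplotlib's violin plot customization example, including its adjacent_values helper:
def adjacent_values(vals, q1, q3):
    # whisker ends: 1.5 * IQR beyond the quartiles, clipped to the data range
    # (vals must be sorted)
    upper = np.clip(q3 + (q3 - q1) * 1.5, q3, vals[-1])
    lower = np.clip(q1 - (q3 - q1) * 1.5, vals[0], q1)
    return lower, upper

fig, ax = plt.subplots(1, 1, figsize=(8, 4))
parts = ax.violinplot(surface_values, widths=200, positions=dist_values,
                      showmeans=False, showmedians=False, showextrema=False)
for i, pc in enumerate(parts['bodies']):
    pc.set_facecolor(pal[i % len(pal)])  # cycle the palette if there are more groups than colors
    pc.set_edgecolor('black')
    pc.set_alpha(1)
# sort each group and compute the quartiles per group, so unequal group sizes also work
sorted_values = [np.sort(v) for v in surface_values]
quartile1, medians, quartile3 = np.array(
    [np.percentile(v, [25, 50, 75]) for v in sorted_values]).T
whiskers = np.array([
    adjacent_values(sorted_array, q1, q3)
    for sorted_array, q1, q3 in zip(sorted_values, quartile1, quartile3)])
whiskersMin, whiskersMax = whiskers[:, 0], whiskers[:, 1]
inds = dist_values
ax.scatter(inds, medians, marker='o', color='white', s=30, zorder=3)
ax.vlines(inds, quartile1, quartile3, color='k', linestyle='-', lw=5)
ax.vlines(inds, whiskersMin, whiskersMax, color='k', linestyle='-', lw=1)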
If you don't need the inner box, you can just call plt.violinplot ...
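For the simpler case without the inner box, a minimal sketch could look like the following. The depth samples are synthetic (generated with NumPy) because the original CSV is not available, and the items, distances and width of 200 are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
items = ['a', 'b', 'c']
dist = [450, 1400, 2620]
# synthetic depth samples, one array per item
samples = [rng.normal(loc=m, scale=5, size=50) for m in (20, 35, 50)]

fig, ax = plt.subplots(figsize=(8, 4))
# positions places each violin at its numeric distance;
# widths is in data units, so it must be scaled to the x range
parts = ax.violinplot(samples, positions=dist, widths=200, showextrema=False)

# tick at each distance, labelled with the item name and its distance
ax.set_xticks(dist)
ax.set_xticklabels([f'{name} ({d})' for name, d in zip(items, dist)])
ax.invert_yaxis()  # depth increases downwards
ax.set_xlabel('Distance')
ax.set_ylabel('Depth')
```

Since the x-axis is numeric here, the spacing between violins reflects the actual distances, which is what the categorical seaborn axis cannot do.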
Thanks for including a bit of data.
To change your plot, remove the item = [a, b, ...] and dist = [...] lists from your code and adjust how the item and dist variables are used. The x-axis ticks set with axes1.set_xticks also need a bit of tweaking to get what you're looking for.
Example 1:
removed the two arrays that were creating the plot you were seeing before; violinplot function unchanged.
# item = ['a', 'b', 'c', 'd', 'e', 'f'] * Removed
# dist = [450, 1400, 2620, 3100, 3830, 4940] * Removed
plt.rcParams.update({'font.size': 15})
fig, axes1 = plt.subplots(figsize=(20,10))
axes1 = sns.violinplot(x='item', y='surface', data=df, hue = 'item', inner = 'box')
axes1.invert_yaxis()
axes1.set_xlabel('Item')
axes1.set_ylabel('Depth')
#axes1.set_xticks(dist) * Removed
plt.xticks(rotation=20)
plt.show()
Inside each curve, there is a black shape with a white dot inside. This is the miniature box plot mentioned above. If you'd like to remove the box plot, you can set the inner = None parameter in the violinplot call to simplify the look of the final visualization.
Example 2:
put dist on your x axis in place of the xticks.
plt.rcParams.update({'font.size': 15})
plt.subplots(figsize=(20,10))
# Put 'dist' as your x input, keep your categorical variable (hue) equal to 'item'
axes1 = sns.violinplot(data = df, x = 'dist', y = 'surface', hue = 'item', inner = 'box')
axes1.invert_yaxis()
axes1.set_xlabel('Item')
axes1.set_ylabel('Depth')
I'm not confident the items and the distances you are working with have a relationship you want to show on the x-axis, or if you just want to use those integers as your tick marks for that axis. If there is an important relationship between the item and the dist, you could use a dictionary new_dict = {450: 'a', 1400: 'b', 2620: 'c' ...
Hope you find this helpful.
Following the pylab_examples, I have created a simple 2x5 cells table in matplotlib.
Code:
# Prepare table
columns = ('A', 'B', 'C', 'D', 'E')
rows = ["A", "B"]
cell_text = [["1", "1","1","1","1"], ["2","2","2","2","2"]]
# Add a table at the bottom of the axes
ax[4].axis('tight')
ax[4].axis('off')
the_table = ax[4].table(cellText=cell_text,colLabels=columns,loc='center')
Now, I want to color cell A1 with color = "#56b5fd" and cell A2 with color = "#1ac3f5". All other cells should remain white. Matplotlib's table_demo.py as well as this example only show me how to apply a color map with pre-defined colors that depend on the values in the cell.
How to assign specific colors to specific cells in a Matplotlib-generated table?
The easiest way to colorize the background of cells in a table is to use the cellColours argument. You may supply a list of lists or an array with the same shape as the data.
import matplotlib.pyplot as plt
# Prepare table
columns = ('A', 'B', 'C', 'D', 'E')
rows = ["A", "B"]
cell_text = [["1", "1","1","1","1"], ["2","2","2","2","2"]]
# Add a table at the bottom of the axes
colors = [["#56b5fd","w","w","w","w"],[ "#1ac3f5","w","w","w","w"]]
fig, ax = plt.subplots()
ax.axis('tight')
ax.axis('off')
the_table = ax.table(cellText=cell_text,cellColours=colors,
colLabels=columns,loc='center')
plt.show()
Alternatively, you can set the facecolor of a specific cell as
the_table[(1, 0)].set_facecolor("#56b5fd")
the_table[(2, 0)].set_facecolor("#1ac3f5")
Resulting in the same output as above.
@ImportanceOfBeingErnest provided an excellent answer. However, in earlier versions of Matplotlib, the second approach:
the_table[(1, 0)].set_facecolor("#56b5fd")
will result in a TypeError: 'Table' object has no attribute '__getitem__'. The error can be avoided by using the following syntax instead:
the_table.get_celld()[(1,0)].set_facecolor("#56b5fd")
the_table.get_celld()[(2,0)].set_facecolor("#1ac3f5")
See also this example.
(Confirmed on Matplotlib 1.3.1)
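When several cells need individual colors, the get_celld() lookup can be wrapped in a small helper that takes a mapping from (row, column) to color. This is only a sketch; colour_cells is a hypothetical name, not a Matplotlib function. Row 0 is the header row created by colLabels, so the data rows start at 1.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba


def colour_cells(table, colours):
    """Set the face color of the given (row, col) cells (hypothetical helper)."""
    cells = table.get_celld()  # dict mapping (row, col) -> Cell
    for position, colour in colours.items():
        cells[position].set_facecolor(colour)


columns = ('A', 'B', 'C', 'D', 'E')
cell_text = [["1", "1", "1", "1", "1"], ["2", "2", "2", "2", "2"]]

fig, ax = plt.subplots()
ax.axis('off')
the_table = ax.table(cellText=cell_text, colLabels=columns, loc='center')

# color the first column of both data rows; all other cells keep their default
colour_cells(the_table, {(1, 0): "#56b5fd", (2, 0): "#1ac3f5"})
```

Going through get_celld() should work on both old and new Matplotlib versions, which is why the helper uses it instead of indexing the table directly.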
I'm trying to plot a dataframe to a few subplots using pandas and matplotlib.pyplot. But I want to have the two columns use different y axes and have those shared between all subplots.
Currently my code is:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'Area':['A', 'A', 'A', 'B', 'B', 'C','C','C','D','D','D','D'],
'Rank':[1,2,3,1,2,1,2,3,1,2,3,4],
'Count':[156,65,152,70,114,110,195,92,44,179,129,76],
'Value':[630,426,312,191,374,109,194,708,236,806,168,812]}
)
df = df.set_index(['Area', 'Rank'])
fig = plt.figure(figsize=(6,4))
for i, l in enumerate(['A','B','C','D']):
    if i == 0:
        sub1 = fig.add_subplot(141+i)
    else:
        sub1 = fig.add_subplot(141+i, sharey=sub1)
    df.loc[l].plot(kind='bar', ax=sub1)
This plots the 4 graphs side by side, which is what I want, but both columns use the same y-axis. I'd like the 'Count' column to use a common y-axis on the left and the 'Value' column to use a common secondary y-axis on the right.
Can anybody suggest a way to do this? My attempts so far have led to each graph having its own independent y-axis.
To create a secondary y axis, you can use twinax = ax.twinx(). One can then link those twin axes so that they share their y axis. The original way to do this was the join method of an axes Grouper, twinax.get_shared_y_axes().join(twinax1, twinax2); newer Matplotlib versions deprecate that call in favour of twinax2.sharey(twinax1). See this question for more details.
The next problem is then to get the two different barplots next to each other. Since I don't think there is a way to do this using the pandas plotting wrappers, one can use a matplotlib bar plot, which lets you specify the bar position quantitatively. The positions of the left bars are then shifted by the bar width.
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'Area':['A', 'A', 'A', 'B', 'B', 'C','C','C','D','D','D','D'],
'Rank':[1,2,3,1,2,1,2,3,1,2,3,4],
'Count':[156,65,152,70,114,110,195,92,44,179,129,76],
'Value':[630,426,312,191,374,109,194,708,236,806,168,812]}
)
df = df.set_index(['Area', 'Rank'])
fig, axes = plt.subplots(ncols=len(df.index.levels[0]), figsize=(6,4), sharey=True)
twinaxes = []
for i, l in enumerate(df.index.levels[0]):
    axes[i].bar(df["Count"].loc[l].index.values-0.4,df["Count"].loc[l], width=0.4, align="edge" )
    ax2 = axes[i].twinx()
    twinaxes.append(ax2)
    ax2.bar(df["Value"].loc[l].index.values,df["Value"].loc[l], width=0.4, align="edge", color="C3" )
    ax2.set_xticks(df["Value"].loc[l].index.values)
    ax2.set_xlabel("Rank")
# share the secondary y axis between all twin axes; Axes.sharey (Matplotlib >= 3.3)
# is used here instead of get_shared_y_axes().join, which newer versions deprecate
for ax in twinaxes[1:]:
    ax.sharey(twinaxes[0])
# only the rightmost twin axes keeps its tick labels
for ax in twinaxes[:-1]:
    ax.tick_params(labelright=False)
axes[0].set_ylabel("Count")
axes[0].yaxis.label.set_color('C0')
axes[0].tick_params(axis='y', colors='C0')
twinaxes[-1].set_ylabel("Value")
twinaxes[-1].yaxis.label.set_color('C3')
twinaxes[-1].tick_params(axis='y', colors='C3')
twinaxes[0].relim()
twinaxes[0].autoscale_view()
plt.show()
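The core of this approach, stripped of pandas, can be sketched as follows. It assumes Matplotlib 3.3 or newer, where Axes.sharey is available, and uses a reduced, hard-coded copy of the data for three areas.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt

counts = {'A': [156, 65, 152], 'B': [70, 114], 'C': [110, 195, 92]}
values = {'A': [630, 426, 312], 'B': [191, 374], 'C': [109, 194, 708]}

# the primary (left) y axes are shared via sharey=True
fig, axes = plt.subplots(ncols=len(counts), sharey=True, figsize=(6, 4))

# one secondary (right) y axis per subplot, all linked to the first one
twins = [ax.twinx() for ax in axes]
for twin in twins[1:]:
    twin.sharey(twins[0])

for ax, twin, area in zip(axes, twins, counts):
    ranks = list(range(1, len(counts[area]) + 1))
    # left bar of each pair on the primary axis, right bar on the twin axis
    ax.bar([r - 0.4 for r in ranks], counts[area], width=0.4, align='edge', color='C0')
    twin.bar(ranks, values[area], width=0.4, align='edge', color='C3')
    ax.set_xticks(ranks)
    ax.set_title(area)

# show the secondary tick labels only on the rightmost subplot
for twin in twins[:-1]:
    twin.tick_params(labelright=False)
axes[0].set_ylabel('Count')
twins[-1].set_ylabel('Value')
```

Because the twin axes are linked, changing the limits on one of them (for example twins[0].set_ylim(0, 1000)) should apply to all of them, while the left 'Count' axes keep their own common scale.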