While I can get multiple lines on a chart and multiple bars on a chart - I cannot get a line and bar on the same chart using the same PeriodIndex.
Faux code follows ...
# play data
n = 100
x = pd.period_range('2001-01-01', periods=n, freq='M')
y1 = (Series(np.random.randn(n)).diff() + 5).tolist()
y2 = (Series(np.random.randn(n)).diff()).tolist()
df = pd.DataFrame({'bar':y2, 'line':y1}, index=x)
# let's plot
plt.figure()
ax = df['bar'].plot(kind='bar', label='bar')
df['line'].plot(kind='line', ax=ax, label='line')
plt.savefig('fred.png', dpi=200)
plt.close()
Any help will be greatly appreciated ...
The problem is that bar plots do not use the index values as x coordinates; they place the bars at range(0, n). You can use twiny() to create a second axes that shares the y-axis with the bar axes, and draw the line on this second axes.
The hardest part is aligning the x-axis ticks. The align_xaxis function below aligns ax2.get_xlim()[0] with x1 in ax1 and ax2.get_xlim()[1] with x2 in ax1:
def align_xaxis(ax2, ax1, x1, x2):
    "maps xlim of ax2 to x1 and x2 in ax1"
    (x1, _), (x2, _) = ax2.transData.inverted().transform(ax1.transData.transform([[x1, 0], [x2, 0]]))
    xs, xe = ax2.get_xlim()
    k, b = np.polyfit([x1, x2], [xs, xe], 1)
    ax2.set_xlim(xs*k+b, xe*k+b)
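To convince yourself that the function does what the description says, here is a small check on plain Matplotlib axes. It is only a sketch: the bar positions, the made-up line x values (100 to 200) and the explicit set_xlim calls are my assumptions, chosen so that the limits of ax2 sit exactly on the first and last line point, as they do for the pandas line plot.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np


def align_xaxis(ax2, ax1, x1, x2):
    "maps xlim of ax2 to x1 and x2 in ax1"
    (x1, _), (x2, _) = ax2.transData.inverted().transform(
        ax1.transData.transform([[x1, 0], [x2, 0]]))
    xs, xe = ax2.get_xlim()
    k, b = np.polyfit([x1, x2], [xs, xe], 1)
    ax2.set_xlim(xs * k + b, xe * k + b)


n = 12
fig, ax1 = plt.subplots()
ax1.bar(np.arange(n), np.ones(n))       # bars centred on 0 .. n-1
ax1.set_xlim(-0.5, n - 0.5)

ax2 = ax1.twiny()
line_x = np.linspace(100.0, 200.0, n)   # made-up x values for the line
ax2.plot(line_x, np.ones(n))
ax2.set_xlim(line_x[0], line_x[-1])     # limits exactly on the first/last point

align_xaxis(ax2, ax1, 0, n - 1)

# pixel x positions of the first/last bar centre and the first/last line point
bar_px = ax1.transData.transform([[0, 0], [n - 1, 0]])[:, 0]
line_px = ax2.transData.transform([[line_x[0], 0], [line_x[-1], 0]])[:, 0]
```

After the call, bar_px and line_px should agree up to floating point error, which means the first and last line points sit exactly above the first and last bar.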
Here is the full code:
from matplotlib import pyplot as plt
import pandas as pd
from pandas import Series
import numpy as np
n = 50
x = pd.period_range('2001-01-01', periods=n, freq='M')
y1 = (Series(np.random.randn(n)) + 5).tolist()
y2 = (Series(np.random.randn(n))).tolist()
df = pd.DataFrame({'bar':y2, 'line':y1}, index=x)
# let's plot
plt.figure(figsize=(20, 4))
ax1 = df['bar'].plot(kind='bar', label='bar')
ax2 = ax1.twiny()
df['line'].plot(kind='line', label='line', ax=ax2)
ax2.grid(color="red", axis="x")
def align_xaxis(ax2, ax1, x1, x2):
    "maps xlim of ax2 to x1 and x2 in ax1"
    (x1, _), (x2, _) = ax2.transData.inverted().transform(ax1.transData.transform([[x1, 0], [x2, 0]]))
    xs, xe = ax2.get_xlim()
    k, b = np.polyfit([x1, x2], [xs, xe], 1)
    ax2.set_xlim(xs*k+b, xe*k+b)
align_xaxis(ax2, ax1, 0, n-1)
and the output:
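If you do not need the period-aware x axis, a simpler route (a different technique from the twiny() approach above) is to skip the pandas plotting wrappers and draw both artists against the same integer positions with plain Matplotlib. The sketch below assumes made-up data and uses hand-written label strings as a stand-in for the PeriodIndex; with the DataFrame from the question the labels could probably come from df.index.astype(str).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

n = 12
labels = [f"2001-{m:02d}" for m in range(1, n + 1)]  # stand-in for the PeriodIndex
rng = np.random.default_rng(0)
bar_values = rng.standard_normal(n)
line_values = rng.standard_normal(n) + 5

pos = np.arange(n)                     # integer positions shared by both artists
fig, ax = plt.subplots(figsize=(10, 4))
bars = ax.bar(pos, bar_values, label="bar")
(line,) = ax.plot(pos, line_values, color="C1", label="line")
ax.set_xticks(pos)
ax.set_xticklabels(labels, rotation=90)
ax.legend()
```

Because bars and line share the same x values there is nothing to align; the price is that you lose the automatic date tick formatting of the pandas line plot.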
I am trying to create a colored line with certain conditions. Basically I would like to have the line colored red when pointing down on the y axis, green when pointing up and blue when neither.
I played around with some similar examples I found but I have never been able to convert them to work with plot() on an axis. Just wondering how this could be done.
Here is some code that I have come up with so far:
#create x,y coordinates
x = numpy.random.choice(10,10)
y = numpy.random.choice(10,10)
#create an array of colors based on direction of line (0=r, 1=g, 2=b)
colors = []
#create an array that is one position away from original
#to determine direction of line
yCopy = list(y[1:])
for y1,y2 in zip(y,yCopy):
    if y1 > y2:
        colors.append(0)
    elif y1 < y2:
        colors.append(1)
    else:
        colors.append(2)
#add tenth spot to array as loop only does nine
colors.append(2)
#create a numpy array of colors
categories = numpy.array(colors)
#create a color map with the three colors
colormap = numpy.array([matplotlib.colors.colorConverter.to_rgb('r'),matplotlib.colors.colorConverter.to_rgb('g'),matplotlib.colors.colorConverter.to_rgb('b')])
#plot line
matplotlib.axes.plot(x,y,color=colormap[categories])
I am not sure how to get plot() to accept an array of colors. I always get an error about the format type used as the color. I tried hexadecimal, decimal, string and float. It works perfectly with scatter().
I don't think that you can use an array of colors in plot (the documentation says that color can be any matplotlib color, while the scatter docs say you can use an array).
However, you could fake it by plotting each line separately:
import numpy
from matplotlib import pyplot as plt
x = range(10)
y = numpy.random.choice(10,10)
for x1, x2, y1,y2 in zip(x, x[1:], y, y[1:]):
    if y1 > y2:
        plt.plot([x1, x2], [y1, y2], 'r')
    elif y1 < y2:
        plt.plot([x1, x2], [y1, y2], 'g')
    else:
        plt.plot([x1, x2], [y1, y2], 'b')
plt.show()
OK. So I figured out how to do it using LineCollection to draw the line on an axis.
import numpy as np
import pylab as pl
from matplotlib import collections as mc
segments = []
colors = np.zeros(shape=(10,4))
x = range(10)
y = np.random.choice(10,10)
i = 0
for x1, x2, y1,y2 in zip(x, x[1:], y, y[1:]):
    if y1 > y2:
        colors[i] = tuple([1,0,0,1])
    elif y1 < y2:
        colors[i] = tuple([0,1,0,1])
    else:
        colors[i] = tuple([0,0,1,1])
    segments.append([(x1, y1), (x2, y2)])
    i += 1
lc = mc.LineCollection(segments, colors=colors, linewidths=2)
fig, ax = pl.subplots()
ax.add_collection(lc)
ax.autoscale()
ax.margins(0.1)
pl.show()
There is an example on the matplotlib page showing how to use a LineCollection to plot a multicolored line.
The remaining problem is to get the colors for the line collection. So if y are the values to compare,
cm = dict(zip(range(-1,2,1),list("rbg")))
colors = list( map( cm.get , np.sign(np.diff(y)) ))
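To see what the mapping produces, here is a tiny worked example on a hand-picked y (my own values, not from the question): one step up, one flat step, one step down. The sign of each difference is looked up in the dictionary, with -1 giving red, 0 blue and +1 green as the question asks. I convert each sign to a plain int before the lookup, which is only a precaution.

```python
import numpy as np

y = np.array([1, 3, 3, 2])                    # up, flat, down
cm = dict(zip(range(-1, 2, 1), list("rbg")))  # {-1: 'r', 0: 'b', 1: 'g'}
signs = np.sign(np.diff(y))                   # one sign per segment
colors = [cm[int(s)] for s in signs]
print(signs.tolist(), colors)  # [1, 0, -1] ['g', 'b', 'r']
```

Note that there is one color per segment, so the list is one element shorter than y.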
Complete code:
import numpy as np; np.random.seed(5)
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
x = np.arange(10)
y = np.random.choice(10,10)
points = np.array([x, y]).T.reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)
cm = dict(zip(range(-1,2,1),list("rbg")))
colors = list( map( cm.get , np.sign(np.diff(y)) ))
lc = LineCollection(segments, colors=colors, linewidths=2)
fig, ax = plt.subplots()
ax.add_collection(lc)
ax.autoscale()
ax.margins(0.1)
plt.show()
I have followed this example (Drawing lines between two plots in Matplotlib) but am running into problems. I believe it has something to do with the fact that I essentially have two different y points, but am not sure how to amend the code to fix it. I would like the line to start at one point and end at the other point directly below it, as well as plotting for all lines.
fig=plt.figure(figsize=(22,10), dpi=150)
ax1 = fig.add_subplot(1, 1, 1)
ax2 = ax1.twinx()
n = 10
y1 = np.random.random(n)
y2 = np.random.random(n) + 1
x1 = np.arange(n)
ax1.scatter(x1, y1)
ax2.scatter(x1, y2)
i = 1
xy = (x1[i],y1[i])
con = ConnectionPatch(xyA=xy, xyB=xy, coordsA="data", coordsB="data",
axesA=ax1, axesB=ax2, color="red")
ax2.add_artist(con)
ax1.plot(x1[i],y1[i],'g+',markersize=12)
ax2.plot(x1[i],y1[i],'g+',markersize=12)
Just iterate over zipped (x, y1, y2):
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import ConnectionPatch
fig = plt.figure(figsize=(10, 5), dpi=100)
ax1 = fig.add_subplot(1, 1, 1)
ax2 = ax1.twinx()
n = 10
y1 = np.random.random(n)
y2 = np.random.random(n) + 1
x1 = np.arange(n)
# I add some colors blue for left y-axis, red for right y-axis
ax1.scatter(x1, y1, c='b')
ax2.scatter(x1, y2, c='r')
# Now iterate over paired x, and 2 y values:
for xi, y1i, y2i in zip(x1, y1, y2):
    con = ConnectionPatch(
        xyA=(xi, y1i),
        xyB=(xi, y2i),
        coordsA="data",
        coordsB="data",
        axesA=ax1,
        axesB=ax2,
        color='g',
    )
    ax1.add_artist(con)
plt.show()
Out:
I want to create a plot for two different datasets similar to the one presented in this answer:
In the above image, the author managed to fix the overlapping problem of the error bars by adding some small random scatter in x to the new dataset.
In my problem, I must plot a similar graphic, but having some categorical data in the x axis:
Any ideas on how to slightly shift the error bars of the second dataset when the x axis holds categorical variables? I want to avoid the overlap between the bars to make the visualization easier to read.
You can translate each errorbar by prepending a translation in data space to the default data transform. This works because categories are placed one data unit apart, so a shift of 0.1 moves a bar by a tenth of the distance between neighbouring categories.
import numpy as np; np.random.seed(42)
import matplotlib.pyplot as plt
from matplotlib.transforms import Affine2D
x = list("ABCDEF")
y1, y2 = np.random.randn(2, len(x))
yerr1, yerr2 = np.random.rand(2, len(x))*4+0.3
fig, ax = plt.subplots()
trans1 = Affine2D().translate(-0.1, 0.0) + ax.transData
trans2 = Affine2D().translate(+0.1, 0.0) + ax.transData
er1 = ax.errorbar(x, y1, yerr=yerr1, marker="o", linestyle="none", transform=trans1)
er2 = ax.errorbar(x, y2, yerr=yerr2, marker="o", linestyle="none", transform=trans2)
plt.show()
Alternatively, you could translate the errorbars after applying the data transform and hence move them in units of points.
import numpy as np; np.random.seed(42)
import matplotlib.pyplot as plt
from matplotlib.transforms import ScaledTranslation
x = list("ABCDEF")
y1, y2 = np.random.randn(2, len(x))
yerr1, yerr2 = np.random.rand(2, len(x))*4+0.3
fig, ax = plt.subplots()
trans1 = ax.transData + ScaledTranslation(-5/72, 0, fig.dpi_scale_trans)
trans2 = ax.transData + ScaledTranslation(+5/72, 0, fig.dpi_scale_trans)
er1 = ax.errorbar(x, y1, yerr=yerr1, marker="o", linestyle="none", transform=trans1)
er2 = ax.errorbar(x, y2, yerr=yerr2, marker="o", linestyle="none", transform=trans2)
plt.show()
While results look similar in both cases, they are fundamentally different. You will observe this difference when interactively zooming the axes or changing the figure size.
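The difference can also be made visible without interacting with the figure. The following sketch (numeric x axis and the helper name pixel_offsets are my own choices for illustration) measures how far each transform moves the point x=1 in pixels, once with a narrow and once with a ten times wider x range.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.transforms import Affine2D, ScaledTranslation

fig, ax = plt.subplots()
data_shift = Affine2D().translate(-0.1, 0.0) + ax.transData
point_shift = ax.transData + ScaledTranslation(-5/72, 0, fig.dpi_scale_trans)


def pixel_offsets():
    "horizontal shift in pixels of both transforms relative to transData at x=1"
    base = ax.transData.transform((1, 0))[0]
    return (data_shift.transform((1, 0))[0] - base,
            point_shift.transform((1, 0))[0] - base)


ax.set_xlim(0, 5)
data_before, point_before = pixel_offsets()
ax.set_xlim(0, 50)   # "zoom out" by a factor of 10
data_after, point_after = pixel_offsets()
```

The data-space shift shrinks with the zoom (data_after is a tenth of data_before), while the shift in points stays at 5/72 inch times the figure dpi regardless of the limits.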
Consider the following approach to highlight plots - combination of errorbar and fill_between with non-zero transparency:
import random
import matplotlib.pyplot as plt
# create sample data
N = 8
data_1 = {
'x': list(range(N)),
'y': [10. + random.random() for dummy in range(N)],
'yerr': [.25 + random.random() for dummy in range(N)]}
data_2 = {
'x': list(range(N)),
'y': [10.25 + .5 * random.random() for dummy in range(N)],
'yerr': [.5 * random.random() for dummy in range(N)]}
# plot
plt.figure()
# only errorbar
plt.subplot(211)
for data in [data_1, data_2]:
    plt.errorbar(**data, fmt='o')
# errorbar + fill_between
plt.subplot(212)
for data in [data_1, data_2]:
    plt.errorbar(**data, alpha=.75, fmt=':', capsize=3, capthick=1)
    data = {
        'x': data['x'],
        'y1': [y - e for y, e in zip(data['y'], data['yerr'])],
        'y2': [y + e for y, e in zip(data['y'], data['yerr'])]}
    plt.fill_between(**data, alpha=.25)
Result:
There is an example on the Matplotlib site: https://matplotlib.org/stable/gallery/lines_bars_and_markers/errorbar_subsample.html
You need the parameter errorevery=(m, n):
n is how often error bars are drawn (on every n-th point), and m is the index of the first point that gets one, in the range from 0 to n.
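As a rough illustration of how this avoids overlap between two series, the sketch below (made-up sine data, a reasonably recent Matplotlib that accepts the tuple form) gives both series the same n but different offsets m, so their error bars land on different points.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(10)
y1 = np.sin(x)
y2 = np.sin(x) + 0.1
yerr = np.full(x.shape, 0.3)

fig, ax = plt.subplots()
# error bars on points 0, 3, 6, 9 of the first series ...
_, _, barcols1 = ax.errorbar(x, y1, yerr=yerr, errorevery=(0, 3))
# ... and on points 1, 4, 7 of the second series
_, _, barcols2 = ax.errorbar(x, y2, yerr=yerr, errorevery=(1, 3))

# the vertical bars of each series are stored in one LineCollection
n_bars1 = len(barcols1[0].get_segments())
n_bars2 = len(barcols2[0].get_segments())
```

Here n_bars1 is 4 and n_bars2 is 3, and no two error bars share an x position.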
Is it at all possible for me to make one set of subplots (with 2 plots) in a for loop that runs three times, and then fit the three sets of subplots into one main figure? The whole point of this is to be able to have 6 plots on one figure, but with a space after every second plot. I know how to have 6 plots in one figure, but I can only put space between every plot instead of between every pair of plots. I hope my question makes sense. As for the data, it is a pretty basic data set that I'm using for practice right now. Each pair of plots shares the same x-axis, which is why I don't want a space between them.
import matplotlib.pyplot as plt
x1 = [0,1,2,3,4,5]
y1 = [i*2 for i in x1]
y2 = [i*3 for i in x1]
x2 = [4,8,12,16,20]
y3 = [i*5 for i in x2]
y4 = [i*3 for i in x2]
x3 = [0,1,2,3,4,5]
y5 = [i*4 for i in x3]
y6 = [i*7 for i in x3]
fig = plt.figure(1,figsize=(5,5))
ax1 = plt.subplot(611)
ax1.plot(x1,y1)
ax2 = plt.subplot(612)
ax2.plot(x1,y2)
ax3 = plt.subplot(613)
ax3.plot(x2,y3)
ax4 = plt.subplot(614)
ax4.plot(x2,y4)
ax5 = plt.subplot(615)
ax5.plot(x3,y5)
ax6 = plt.subplot(616)
ax6.plot(x3,y6)
fig.subplots_adjust(hspace=0.5)
plt.show()
This is what I get:
Your code makes a graph with six sub-plots. If you make eight subplots and leave two of them empty, you get your added space. Here is the code I used, slightly modified from your code.
import matplotlib.pyplot as plt
x1 = [0,1,2,3,4,5]
y1 = [i*2 for i in x1]
y2 = [i*3 for i in x1]
x2 = [4,8,12,16,20]
y3 = [i*5 for i in x2]
y4 = [i*3 for i in x2]
x3 = [0,1,2,3,4,5]
y5 = [i*4 for i in x3]
y6 = [i*7 for i in x3]
fig = plt.figure(1,figsize=(5,7))
ax1 = plt.subplot(811)
ax1.plot(x1,y1)
ax2 = plt.subplot(812)
ax2.plot(x1,y2)
ax3 = plt.subplot(814)
ax3.plot(x2,y3)
ax4 = plt.subplot(815)
ax4.plot(x2,y4)
ax5 = plt.subplot(817)
ax5.plot(x3,y5)
ax6 = plt.subplot(818)
ax6.plot(x3,y6)
fig.subplots_adjust(hspace=0.5)
plt.show()
I get this result:
I had to increase the figure size height to 7 inches to accommodate the extra space. Is that what you want?
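Since the question asked for a loop that runs three times, the same eight-row idea can be written as a loop. This is only a sketch: the pairs list and its (x, factor, factor) layout are names I made up to hold the data from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

x1 = [0, 1, 2, 3, 4, 5]
x2 = [4, 8, 12, 16, 20]
x3 = [0, 1, 2, 3, 4, 5]
# one (x, factor_top, factor_bottom) entry per pair of plots
pairs = [(x1, 2, 3), (x2, 5, 3), (x3, 4, 7)]

fig = plt.figure(figsize=(5, 7))
axes = []
for k, (x, top, bottom) in enumerate(pairs):
    # rows 1-2, 4-5 and 7-8 of an 8-row grid; rows 3 and 6 stay empty
    for offset, factor in enumerate((top, bottom), start=1):
        ax = fig.add_subplot(8, 1, 3 * k + offset)
        ax.plot(x, [i * factor for i in x])
        axes.append(ax)
fig.subplots_adjust(hspace=0.5)
```

Add plt.show() at the end to display the figure. The index 3 * k + offset is what skips every third row.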
I am trying to plot groups of data which have different bar sizes and may have different group sizes. How can I group the bars that belong to the same groups (shown as the same color) so that they are side by side? (Similar to this, except the same colors should be side-by-side)
width = 0.50
groupgap=2
y1=[20,80]
y2=[60,30,10]
x1 = np.arange(len(y1))
x2 = np.arange(len(y2))+groupgap
ind = np.concatenate((x1,x2))
fig, ax = plt.subplots()
rects1 = ax.bar(x1, y1, width, color='r', ecolor= "black",label="Gender")
rects2 = ax.bar(x2, y2, width, color='b', ecolor= "black",label="Type")
ax.set_ylabel('Population',fontsize=14)
ax.set_xticks(ind)
ax.set_xticklabels(('Male', 'Female','Student', 'Faculty','Others'),fontsize=14)
ax.legend()
The idea of using a gap between the categories (groupgap) is indeed a way to go. You would just have to add the length of the first group as well:
x2 = np.arange(len(y2))+groupgap+len(y1)
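For more than two groups the same offset rule can be applied repeatedly. The helper below is hypothetical (group_positions is my own name, not a Matplotlib function) and just accumulates the length of each group plus the gap.

```python
import numpy as np


def group_positions(groups, groupgap=1):
    "x positions for each group, leaving `groupgap` empty slots between groups"
    positions, start = [], 0
    for values in groups:
        positions.append(np.arange(len(values)) + start)
        start += len(values) + groupgap
    return positions


x1, x2 = group_positions([[20, 80], [60, 30, 10]], groupgap=1)
print(x1.tolist(), x2.tolist())  # [0, 1] [3, 4, 5]
```

With the data from the question this reproduces x2 = np.arange(len(y2))+groupgap+len(y1), so position 2 stays empty and separates the two groups.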
Here is the complete example where I used groupgap=1:
import matplotlib.pyplot as plt
import numpy as np
width = 1
groupgap=1
y1=[20,80]
y2=[60,30,10]
x1 = np.arange(len(y1))
x2 = np.arange(len(y2))+groupgap+len(y1)
ind = np.concatenate((x1,x2))
fig, ax = plt.subplots()
rects1 = ax.bar(x1, y1, width, color='r', edgecolor= "black",label="Gender")
rects2 = ax.bar(x2, y2, width, color='b', edgecolor= "black",label="Type")
ax.set_ylabel('Population',fontsize=14)
ax.set_xticks(ind)
ax.set_xticklabels(('Male', 'Female','Student', 'Faculty','Others'),fontsize=14)
plt.show()