In the code below I plotted a timeline chart, but I don't know how to show all Y-axis values as integers at a regular interval. Does anyone have an idea?
code link: https://colab.research.google.com/drive/1Fq91PXlylJMKh6oUpysM95gpLwBfUcGx?usp=sharing
from matplotlib.patches import Patch
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
matplotlib.rcParams.update(matplotlib.rcParamsDefault)
fig, ax = plt.subplots(1, figsize=(20,7))
ax.barh(df.year, df.days_start_to_end, left=df.startNum, color=df.color)
xticks = np.arange(0, df.endNum.max()+1, 3)
xticks_labels = pd.date_range(proj_start, end=df.end.max()).strftime("%m/%d ")
xticks_minor = np.arange(0, df.endNum.max()+1, 1)
#ax.set_yticks(np.arange(len(df.year)))
ax.set_xticks(xticks)
#ax.set_xticks(xticks_minor, minor=True)
ax.set_xticklabels(xticks_labels[::3])
c_dict = { 'Perfect': '#4db249', 'Good':'#539de3',
'Normal':'#ffbd63', 'Severe': '#ff7361', 'Drastic':'#ff2626'}
legendEl = [Patch(facecolor = c_dict[i], label = i) for i in c_dict]
plt.legend(handles = legendEl)
plt.show()
The dates (year) that you have are just numbers, not datetimes. If you would like to see all the labels on the Y-axis, you can simply convert that column to string while plotting. This tells matplotlib that the values are text and should be plotted as categorical data. Below is the updated line. Note that this plots the years without the decimal part; hopefully you don't need it.
ax.barh(df.year.astype('string'), df.days_start_to_end, left=df.startNum, color=df.color)
Updated plot
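If the years should stay numeric instead of becoming categories, another option is to place a tick on every integer with a tick locator. This is only a minimal sketch, not the asker's code: the small DataFrame is made-up stand-in data that merely reuses the column names from the question.

```python
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.ticker import FormatStrFormatter, MultipleLocator

# Made-up stand-in data; only the column names come from the question.
df = pd.DataFrame({
    "year": [2015, 2016, 2017, 2018],
    "startNum": [0, 3, 5, 2],
    "days_start_to_end": [4, 6, 3, 7],
})

fig, ax = plt.subplots(1, figsize=(10, 4))
ax.barh(df.year, df.days_start_to_end, left=df.startNum)

# One major tick per whole number, labelled without a decimal part.
ax.yaxis.set_major_locator(MultipleLocator(1))
ax.yaxis.set_major_formatter(FormatStrFormatter("%d"))
# plt.show()  # uncomment to display the figure interactively
```

MultipleLocator(1) suits a short range of years; for a long range a larger base (e.g. 5) keeps the labels from overlapping.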
I want this plot's y-axis to be centered at 38, and the y-axis scaled such that the 'humps' disappear. How do I accomplish this?
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
s=['05/02/2019', '06/02/2019', '07/02/2019', '08/02/2019',
'09/02/2019', '10/02/2019', '11/02/2019', '12/02/2019',
'13/02/2019', '20/02/2019', '21/02/2019', '22/02/2019',
'23/02/2019', '24/02/2019', '25/02/2019']
df[0]=['38.02', '33.79', '34.73', '36.47', '35.03', '33.45',
'33.82', '33.38', '34.68', '36.93', '33.44', '33.55',
'33.18', '33.07', '33.17']
# Data for plotting
fig, ax = plt.subplots(figsize=(17, 2))
for i,j in zip(s,df[0]):
    ax.annotate(str(j),xy=(i,j+0.8))
ax.plot(s, df[0])
ax.set(xlabel='Dates', ylabel='Latency',
title='Hongkong to sing')
ax.grid()
#plt.yticks(np.arange(min(df[p]), max(df[p])+1, 2))
fig.savefig("test.png")
plt.show()
I'm not entirely certain this is what you're looking for, but you can set the y-limits explicitly to change the scale, e.g.
ax.set_ylim([ax.get_ylim()[0], 42])
This sets only the upper bound and leaves the lower limit unchanged.
You can supply any values you find appropriate, e.g.
ax.set_ylim([22, 52])
will widen the visible range so that the variations in the line look much flatter.
Also note that the tick labels and general appearance of your plot will differ from what is shown here.
Edit - Here is the complete code as requested:
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df = pd.DataFrame()
s=['05/02/2019', '06/02/2019', '07/02/2019', '08/02/2019',
'09/02/2019', '10/02/2019', '11/02/2019', '12/02/2019',
'13/02/2019', '20/02/2019', '21/02/2019', '22/02/2019',
'23/02/2019', '24/02/2019', '25/02/2019']
df[0]=['38.02','33.79','34.73','36.47','35.03','33.45',
'33.82','33.38','34.68','36.93','33.44','33.55',
'33.18','33.07','33.17']
# Data for plotting
fig, ax = plt.subplots(figsize=(17, 3))
#for i,j in zip(s,df[0]):
# ax.annotate(str(j),xy=(i,j+0.8))
ax.plot(s, pd.to_numeric(df[0]))
ax.set(xlabel='Dates', ylabel='Latency',
title='Hongkong to sing')
ax.set_xticklabels(pd.to_datetime(s).strftime('%m.%d'), rotation=45)
ax.set_ylim([22, 52])
plt.show()
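To centre the axis on a specific value such as 38, the limits can be chosen symmetrically around it. The following is a rough sketch under my own assumptions: `center`, `padding` and `half_range` are illustrative names that do not appear in the question, and only a few of the latency values are used.

```python
import matplotlib.pyplot as plt
import pandas as pd

# A few latency values from the question, stored as strings as in the original.
raw = ['38.02', '33.79', '34.73', '36.47', '35.03', '33.45']
values = pd.to_numeric(pd.Series(raw))  # convert so the y-axis is numeric

center = 38.0   # value the axis should be centred on
padding = 3.0   # hypothetical factor; larger values flatten the line more
half_range = (values - center).abs().max() * padding

fig, ax = plt.subplots(figsize=(17, 3))
ax.plot(range(len(values)), values)
# Symmetric limits put `center` exactly in the middle of the y-axis.
ax.set_ylim(center - half_range, center + half_range)
# plt.show()  # uncomment to display the figure interactively
```

Because the half range is derived from the largest deviation from `center`, all data points stay inside the visible area.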
I'm drawing several point plots in seaborn on the same graph. The x-axis is ordinal, not numerical; the ordinal values are the same for each point plot. I would like to shift each plot a bit to the side, the way pointplot(dodge=...) parameter does within multiple lines within a single plot, but in this case for multiple different plots drawn on top of each other. How can I do that?
Ideally, I'd like a technique that works for any matplotlib plot, not just seaborn specifically. Adding an offset to the data won't work easily, since the data is not numerical.
Here is an example in which the plots overlap and become hard to read (the dodge within each plot works fine):
import pandas as pd
import seaborn as sns
df1 = pd.DataFrame({'x':list('ffffssss'), 'y':[1,2,3,4,5,6,7,8], 'h':list('abababab')})
df2 = df1.copy()
df2['y'] = df2['y']+0.5
sns.pointplot(data=df1, x='x', y='y', hue='h', ci='sd', errwidth=2, capsize=0.05, dodge=0.1, markers='<')
sns.pointplot(data=df2, x='x', y='y', hue='h', ci='sd', errwidth=2, capsize=0.05, dodge=0.1, markers='>')
I could use something other than seaborn, but the automatic confidence / error bars are very convenient so I'd prefer to stick with seaborn here.
Answering this for the most general case first.
A dodge can be implemented by shifting the artists in the figure by some amount. It might be useful to use points as units of that shift. E.g. you may want to shift your markers on the plot by 5 points.
This shift can be accomplished by adding a translation to the data transform of the artist. Here I propose a ScaledTranslation.
Now to keep this most general, one may write a function which takes the plotting method, the axes and the data as input, and in addition some dodge to apply, e.g.
draw_dodge(ax.errorbar, X, y, yerr =y/4., ax=ax, dodge=d, marker="d" )
The full functional code:
import matplotlib.pyplot as plt
from matplotlib import transforms
import numpy as np
import pandas as pd
def draw_dodge(*args, **kwargs):
    func = args[0]
    dodge = kwargs.pop("dodge", 0)
    ax = kwargs.pop("ax", plt.gca())
    trans = ax.transData + transforms.ScaledTranslation(dodge/72., 0,
                                       ax.figure.dpi_scale_trans)
    artist = func(*args[1:], **kwargs)
    def iterate(artist):
        if hasattr(artist, '__iter__'):
            for obj in artist:
                iterate(obj)
        else:
            artist.set_transform(trans)
    iterate(artist)
    return artist
X = ["a", "b"]
Y = np.array([[1,2],[2,2],[3,2],[1,4]])
Dodge = np.arange(len(Y),dtype=float)*10
Dodge -= Dodge.mean()
fig, ax = plt.subplots()
for y,d in zip(Y,Dodge):
    draw_dodge(ax.errorbar, X, y, yerr =y/4., ax=ax, dodge=d, marker="d" )
ax.margins(x=0.4)
plt.show()
You can use this with ax.plot, ax.scatter, etc., but not with any of the seaborn functions, because they don't return a useful artist to work with.
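For a single ax.plot call, the same shift can also be sketched without a helper by passing the combined transform through the `transform` keyword. This is a simplified variation on the idea above, not the function from this answer; `dodge_pts` and `shifted` are names I made up for the illustration.

```python
import matplotlib.pyplot as plt
from matplotlib import transforms

fig, ax = plt.subplots()

dodge_pts = 10  # horizontal shift in points (1 point = 1/72 inch)
offset = transforms.ScaledTranslation(dodge_pts / 72.0, 0, fig.dpi_scale_trans)
shifted = ax.transData + offset  # data transform followed by the shift

# Same categorical data twice: once in place, once shifted to the right.
ax.plot(["a", "b"], [1, 2], marker="o")
ax.plot(["a", "b"], [1, 2], marker="s", transform=shifted)
ax.margins(x=0.4)
# plt.show()  # uncomment to display the figure interactively
```

Because the offset is expressed in points, the distance between the two lines stays the same on screen when the axes are zoomed or resized.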
Now for the case in question, the remaining problem is to get the data in a useful format. One option would be the following.
df1 = pd.DataFrame({'x':list('ffffssss'),
'y':[1,2,3,4,5,6,7,8],
'h':list('abababab')})
df2 = df1.copy()
df2['y'] = df2['y']+0.5
N = len(np.unique(df1["x"].values))*len([df1,df2])
Dodge = np.linspace(-N,N,N)/N*10
fig, ax = plt.subplots()
k = 0
for df in [df1,df2]:
    for (n, grp) in df.groupby("h"):
        # select "y" so the non-numeric column "h" is not aggregated
        x = grp.groupby("x")["y"].mean()
        std = grp.groupby("x")["y"].std()
        draw_dodge(ax.errorbar, x.index, x.values,
                   yerr =std.values.flatten(), ax=ax,
                   dodge=Dodge[k], marker="o", label=n)
        k+=1
ax.legend()
ax.margins(x=0.4)
plt.show()
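The aggregation step used in the loop above (mean and standard deviation of y per hue and category) can also be looked at in isolation. This is just a small sketch using the example frame from the question; the variable name `stats` is my own.

```python
import pandas as pd

df1 = pd.DataFrame({'x': list('ffffssss'),
                    'y': [1, 2, 3, 4, 5, 6, 7, 8],
                    'h': list('abababab')})

# One row per (hue, category) pair with the mean and the sample
# standard deviation of y; these are the values the error bars are built from.
stats = df1.groupby(['h', 'x'])['y'].agg(['mean', 'std'])
print(stats)
```

Each row of `stats` corresponds to one marker with its error bar in the plot.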
You can use linspace to easily shift your graphs to where you want them to start and end. It also makes it easy to scale the graphs so that they have visually the same width.
import numpy as np
import matplotlib.pyplot as plt
start_offset = 3
end_offset = start_offset
y1 = np.random.randint(0, 10, 20) ## y1 has 20 random ints from 0 to 9
y2 = np.random.randint(0, 10, 10) ## y2 has 10 random ints from 0 to 9
x1 = np.linspace(0, 20, y1.size) ## y1.size evenly spaced x values from 0 to 20
x2 = np.linspace(0, 20, y2.size) ## same x range, so both lines span the same width
plt.plot(x1, y1)
plt.plot(x2, y2)
plt.show()
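The snippet above defines start_offset and end_offset but never applies them. One way they might be used, and this is only my guess at the intent, is to let the second line start later and end earlier than the first. `x_max` is a name I introduced for the right edge of the range.

```python
import numpy as np
import matplotlib.pyplot as plt

start_offset = 3
end_offset = start_offset
x_max = 20  # assumed right edge of the x range

y1 = np.random.randint(0, 10, 20)
y2 = np.random.randint(0, 10, 10)

# First line spans the full range; second line is squeezed between the offsets.
x1 = np.linspace(0, x_max, y1.size)
x2 = np.linspace(start_offset, x_max - end_offset, y2.size)

plt.plot(x1, y1)
plt.plot(x2, y2)
# plt.show()  # uncomment to display the figure interactively
```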
I have a pandas dataframe which I would like to slice, and plot each slice in a separate subplot. I would like to use the sharey='all' and have matplotlib decide on some reasonable y-axis limits, rather than having to search the dataframe for the min and max and add offsets.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame(np.arange(50).reshape((5,10))).transpose()
fig, axes = plt.subplots(nrows=0,ncols=0, sharey='all', tight_layout=True)
for i in range(1, len(df.columns) + 1):
    ax = fig.add_subplot(2,3,i)
    iC = df.iloc[:, i-1]
    iC.plot(ax=ax)
Which gives the following plot:
In fact, it gives that irrespective of what I specify sharey to be ('all','col','row',True, or False). What I sought after using sharey='all' would be something like:
Can somebody perhaps explain to me what I'm doing wrong here?
The following version would only add those axes you need for your df-columns and share their y-scales:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame(np.arange(50).reshape((5,10))).transpose()
fig = plt.figure(tight_layout=True)
ref_ax = None
for i in range(len(df.columns)):
    ax = fig.add_subplot(2, 3, i+1, sharey=ref_ax)
    ref_ax=ax
    iC = df.iloc[:, i]
    iC.plot(ax=ax)
plt.show()
The grid-layout parameters, which are given explicitly as ...add_subplot(2, 3, ... here, can of course be calculated from len(df.columns).
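Such a calculation could look roughly like this. It is a sketch that assumes a fixed number of columns (`ncols = 3` is my choice, not something from the question) and derives the number of rows from it.

```python
import math

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(np.arange(50).reshape((5, 10))).transpose()

n_plots = len(df.columns)
ncols = 3                            # assumed fixed number of grid columns
nrows = math.ceil(n_plots / ncols)   # enough rows to hold every plot

fig = plt.figure(tight_layout=True)
ref_ax = None
for i in range(n_plots):
    # Every new axes shares its y-scale with the previous one.
    ax = fig.add_subplot(nrows, ncols, i + 1, sharey=ref_ax)
    ref_ax = ax
    df.iloc[:, i].plot(ax=ax)
# plt.show()  # uncomment to display the figure interactively
```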
Your plots are not shared. You create a subplot grid with 0 rows and 0 columns, i.e. no subplots at all, but those nonexisting subplots have their y axes shared. Then you create some other (existing) subplots, which are not shared. Those are the ones that are plotted to.
Instead, you need to set nrows and ncols to useful values and plot to the axes created that way.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame(np.arange(50).reshape((5,10))).transpose()
fig, axes = plt.subplots(nrows=2,ncols=3, sharey='all', tight_layout=True)
for i, ax in zip(range(len(df.columns)), axes.flat):
    iC = df.iloc[:, i]
    iC.plot(ax=ax)
for j in range(len(df.columns),len(axes.flat)):
    axes.flatten()[j].axis("off")
plt.show()
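The difference between shared and unshared axes can be checked by comparing the y-limits after plotting. This is a small sketch of that check; the names `fig_shared`, `fig_indep`, `shared_limits` and `indep_limits` are mine.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(np.arange(50).reshape((5, 10))).transpose()

# Axes created by plt.subplots(..., sharey='all') share their y-limits.
fig_shared, axes = plt.subplots(nrows=2, ncols=3, sharey='all')
for i, ax in zip(range(len(df.columns)), axes.flat):
    df.iloc[:, i].plot(ax=ax)
shared_limits = {ax.get_ylim() for ax in axes.flat}

# Axes added one by one without sharey each autoscale on their own.
fig_indep = plt.figure()
for i in range(len(df.columns)):
    ax = fig_indep.add_subplot(2, 3, i + 1)
    df.iloc[:, i].plot(ax=ax)
indep_limits = {ax.get_ylim() for ax in fig_indep.axes}

# Number of distinct y-limit pairs in each figure.
print(len(shared_limits), len(indep_limits))
```

A single distinct pair of limits in the first figure means all its axes use the same scale, while the second figure ends up with one pair per column.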
UPDATED
I have written code like the one given below.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
df = pd.read_csv("data_1.csv",index_col="Group")
print(df)
fig,ax = plt.subplots(1)
heatmap = ax.pcolor(df)########
ax.pcolor(df,edgecolors='k')
cbar = plt.colorbar(heatmap)##########
plt.ylim([0,12])
ax.invert_yaxis()
locs_y, labels_y = plt.yticks(np.arange(0.5, len(df.index), 1), df.index)
locs_x, labels_x = plt.xticks(np.arange(0.5, len(df.columns), 1), df.columns)
ax.set_xticklabels(labels_x, rotation=10)
ax.set_yticklabels(labels_y,fontsize=10)
plt.show()
It takes input like the one given below and plots a heat map with labels on two sides, left and bottom.
GP1,c1,c2,c3,c4,c5
S1,21,21,20,69,30
S2,28,20,20,39,25
S3,20,21,21,44,21
I further want to add additional labels on the right side of the data, as given below, and plot a heatmap with labels on three sides: right, left and bottom.
GP1,c1,c2,c3,c4,c5
S1,21,21,20,69,30,V1
S2,28,20,20,39,25,V2
S3,20,21,21,44,21,V3
What changes should I incorporate into the code?
Please help.
You may create a new axis on the right of the plot with ax.twinx(). Then you essentially need to adjust this axis the same way you already did with the first axis.
u = u"""GP1,c1,c2,c3,c4,c5
S1,21,21,20,69,30
S2,28,20,20,39,25
S3,20,21,21,44,21"""
import io
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df= pd.read_csv(io.StringIO(u),index_col="GP1")
fig,ax = plt.subplots(1)
heatmap = ax.pcolor(df, edgecolors='k')
cbar = plt.colorbar(heatmap, pad=0.1)
bx = ax.twinx()
ax.set_yticks(np.arange(0.5, len(df.index), 1))
ax.set_xticks(np.arange(0.5, len(df.columns), 1))
ax.set_xticklabels(df.columns, rotation=10)
ax.set_yticklabels(df.index,fontsize=10)
bx.set_yticks(np.arange(0.5, len(df.index), 1))
bx.set_yticklabels(["V1","V2","V3"],fontsize=10)
ax.set_ylim([0,12])
bx.set_ylim([0,12])
ax.invert_yaxis()
bx.invert_yaxis()
plt.show()
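Instead of hard-coding ["V1","V2","V3"], the right-hand labels could be read from the data itself. The sketch below assumes the CSV header gets one extra column name for the labels; the name "label" is hypothetical, since the sample in the question has no header entry for that column.

```python
import io

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Sample data from the question with a hypothetical "label" header added.
csv_text = """GP1,c1,c2,c3,c4,c5,label
S1,21,21,20,69,30,V1
S2,28,20,20,39,25,V2
S3,20,21,21,44,21,V3"""

df = pd.read_csv(io.StringIO(csv_text), index_col="GP1")
right_labels = df.pop("label").tolist()  # remove labels from the numeric data

fig, ax = plt.subplots(1)
heatmap = ax.pcolor(df.to_numpy(), edgecolors='k')
fig.colorbar(heatmap, ax=ax, pad=0.1)

centers = np.arange(0.5, len(df.index), 1)  # middle of each row
ax.set_yticks(centers)
ax.set_yticklabels(df.index, fontsize=10)
ax.set_xticks(np.arange(0.5, len(df.columns), 1))
ax.set_xticklabels(df.columns, rotation=10)
ax.invert_yaxis()

# Second y-axis on the right, aligned with the left one.
bx = ax.twinx()
bx.set_ylim(ax.get_ylim())
bx.set_yticks(centers)
bx.set_yticklabels(right_labels, fontsize=10)
# plt.show()  # uncomment to display the figure interactively
```

Copying the limits from ax to bx keeps the right-hand labels aligned with the rows even after the y-axis has been inverted.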