Matplotlib subplot using for loop Python - python

I am trying to create multiple charts and save them as one image. I managed to combine the charts, but a couple of things are going wrong. I can only set a title for the last chart, not for all of them. The y-axis numbers are also shown in full only on the last chart. I would also like to change the colors of the line (white), the labels (white) and the background (black), and rotate the dates so they are easier to read.
dataSet = {"info":[{"title":{"Value":[list of data]}},{"title":{"Value":[list of data]}},
...]}
fig, ax = plt.subplots(2, 3, sharex=False, sharey=False, figsize=(22, 10), dpi=70,
                       linewidth=0.5)
ax = np.array(ax).flatten()
for i, data in enumerate(dataSet['info']):
    for key in data:
        df: DataFrame = pd.DataFrame.from_dict(data[key]).fillna(method="backfill")
        df['Date'] = pd.to_datetime(df['Date'], unit='ms')
        df.index = pd.DatetimeIndex(df['Date'])
        x = df['Date']
        y = df['Value']
        ax[i].plot(x, y)
        current_values = plt.gca().get_yticks()
        plt.gca().set_yticklabels(['{:,.0f}'.format(x) for x in current_values])
        plt.title(key)
plt.show()

Your figure consists of the various axes objects. To set the title for each plot you need to use the corresponding axes object, which provides the relevant methods you need to change the appearance.
See for example:
import matplotlib.pyplot as plt
import numpy as np
fig, axarr = plt.subplots(2, 2)
titles = list("abcd")
for ax, title in zip(axarr.ravel(), titles):
    x = np.arange(10)
    y = np.random.random(10)
    ax.plot(x, y, color='white')
    ax.set_title(title)
    ax.set_facecolor((0, 0, 0))
fig.tight_layout()
To change the labels, show a legend, or change the background, I recommend reading the documentation.
For the dates, you can rotate the labels or use fig.autofmt_xdate().
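Putting the points above together for the situation in the question, here is a minimal sketch. The data is made up (the `series` dict and its "chart N" titles are hypothetical stand-ins for the asker's dataSet), and it assumes a 2x3 grid. Everything is done through each axes object instead of plt.gca() and plt.title(), which only act on the current (last) axes. The labels are rotated per axes with tick_params rather than fig.autofmt_xdate(), because the latter also hides the x tick labels on every row except the bottom one.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

# Hypothetical stand-in data: six titled series over the same dates
rng = np.random.default_rng(0)
dates = np.arange("2021-01-01", "2021-03-01", dtype="datetime64[D]")
series = {f"chart {i}": rng.integers(1_000, 2_000_000, size=dates.size)
          for i in range(6)}

fig, axarr = plt.subplots(2, 3, figsize=(22, 10), facecolor="black")
for ax, (title, values) in zip(axarr.ravel(), series.items()):
    ax.plot(dates, values, color="white")          # white line
    ax.set_title(title, color="white")             # title on THIS axes
    ax.set_facecolor("black")                      # black plot background
    ax.tick_params(axis="both", colors="white")    # white ticks and tick labels
    ax.tick_params(axis="x", labelrotation=45)     # rotate the dates
    for spine in ax.spines.values():
        spine.set_edgecolor("white")
    # Thousands separators on every axes, not only the last one
    ax.yaxis.set_major_formatter(FuncFormatter(lambda v, pos: f"{v:,.0f}"))
fig.tight_layout()
# plt.show()  # or fig.savefig("charts.png") to write a single image
```

Using a formatter instead of set_yticklabels keeps the labels correct if the tick positions change later, for example after zooming.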

Related

Matplotlib flattens the first of two plots when I add the second plot?

Matplotlib madness...
dfin = pd.read_csv(inputfilename, sep=";", encoding='ISO-8859-1')
# create a return column
dfin['return'] = dfin['close'].shift(9) / dfin['close'].shift(12)
# create a cumulative sum column
dfin['return_cum'] = dfin['return'].cumsum()
close = dfin.iloc[:-1]['close']
test = dfin.iloc[:-1]['close'] * dfin.iloc[:-1]['return']
fig, axs = plt.subplots(figsize=(20, 10), sharex=True, sharey=True)
axs.plot(close, color='black')
axs.plot(test, color='blue')
plt.show()
plt.close()
However, when I try to run a cumulative plot of any kind, MPL flattens the first plot and plots the second relative to it:
test = dfin.iloc[:-1]['close'] * dfin.iloc[:-1]['return_cum']
I'm doing stock analysis, and trying to plot returns relative to the existing closing price. I don't understand why MPL is flatting the first plot - or how to make it stop.
Thanks for any help.
It's not flattening it per se. The scale of the second line is so much larger than that of the first that the first one only looks flat.
You will need to use multiple scales (multiple y-axes).
Check out this example from the matplotlib documentation.
Basically, you will need to do something like this:
...
fig, axs = plt.subplots(figsize=(20, 10), sharex=True, sharey=True)
axs.plot(close, color='black')
# same code as before above
# changed code below
ax2 = axs.twinx()
ax2.plot(test, color='blue')
fig.tight_layout()
plt.show()
plt.close()
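To see the effect without the CSV file, here is a small self-contained sketch with synthetic numbers (the `close` and `scaled` arrays are made up, not the asker's data). One series stays near 100 and the other grows to roughly 10,000, so on a shared y-axis the first would look flat. With twinx() each line gets its own y-limits.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(100)
close = 100 + np.sin(x / 10)               # varies only between about 99 and 101
scaled = close * np.cumsum(np.ones(100))   # grows to roughly 10,000

fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(x, close, color="black")
ax.set_ylabel("close")

ax2 = ax.twinx()                           # second y-axis sharing the same x-axis
ax2.plot(x, scaled, color="blue")
ax2.set_ylabel("close * cumulative return", color="blue")

fig.tight_layout()
# plt.show()
```

The left axis keeps limits that fit the black line, while the right axis scales to the blue one.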

Matplotlib pie charts as scatter plot

I have an interesting problem where I am trying to use multiple matplotlib pie charts as a scatter plot. I have read this post regarding this matplotlib tutorial and was able to get those working. However, I found that I was able to achieve the same results using the built-in pie function and plotting many pie charts on the same axis.
When using this alternative method, I found that after plotting the pie charts the axes lose their labels. When I pan, the original scatter data stays within the axes bounds as it should, but the pie charts are clipped only by the figure canvas, not by the axes.
The following code replicates the issue that I'm having.
import matplotlib.pyplot as plt
import pandas as pd
import random
def rand():  # simulate some random data
    return [random.randint(0,100) for _ in range(10)]
def plot_pie(x, ax):
    ax.pie(x[['a','b','c']], center=(x['lat'],x['lon']), radius=1,colors=['r', 'b', 'g'])
#my data is stored in a similar styled dataframe that I read from a csv and the data is static
sim_data = pd.DataFrame({'a':rand(),'b':rand(),'c':rand(), 'lat':rand(),'lon':rand()})
fig, ax = plt.subplots()
plt.scatter(x=sim_data['lat'], y=sim_data['lon'], s=1000, facecolor='none',edgecolors='r')
y_init = ax.get_ylim()
x_init = ax.get_xlim()
sim_data.apply(lambda x : plot_pie(x,ax), axis=1)
ax.set_ylim(y_init)
ax.set_xlim(x_init)
plt.show()
The reason that I reset the x and y limits of the axis is that I assume the pie function automatically sets the bounds of the axes to the last pie chart and this was my work around.
UPDATE
After reading the docs again, I found that by default matplotlib pie wedges are not clipped to the extent of the axes. Changing that parameter worked for me. I also found that plotting each pie chart removed my axes ticks; to keep them I had to pass the frame parameter to the pie charts. The following code is the solution to my problem.
def plot_pie(x, ax):
    ax.pie(x[['a','b','c']], center=(x['lat'],x['lon']), radius=1,colors=['r', 'b', 'g'], wedgeprops={'clip_on':True}, frame=True)
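For reference, a minimal standalone sketch of the same idea without pandas. The random `shares`, `xs` and `ys` arrays, the radius of 4 and the axis limits are all made up for illustration. It draws several pies on one axes, turns clipping on for the wedges and keeps the frame so the ticks survive.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
n = 8
shares = rng.integers(1, 100, size=(n, 3))   # three slice sizes per pie
xs = rng.uniform(0, 100, size=n)             # pie centres
ys = rng.uniform(0, 100, size=n)

fig, ax = plt.subplots()
for slices, x, y in zip(shares, xs, ys):
    ax.pie(slices, center=(x, y), radius=4, colors=["r", "b", "g"],
           wedgeprops={"clip_on": True},     # clip wedges to the axes
           frame=True)                       # keep the axes frame and ticks
# Set the limits once, after all pies are drawn
ax.set_xlim(-10, 110)
ax.set_ylim(-10, 110)
# plt.show()
```

Note that pie() sets an equal aspect ratio on the axes, so the pies stay circular but the axes box may not fill the figure.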
Data generated as in original post. I added a frame for each plot for clarity.
def plot_pie(x, ax, r=1):
    # radius for pieplot size on a scatterplot
    ax.pie(x[['a','b','c']], center=(x['lat'],x['lon']), radius=r, colors=['r', 'b', 'g'])
fig, axs = plt.subplots(1, 3, figsize=(15, 5))
fig.patch.set_facecolor('white')
# original plot
ax = axs[0]
ax.scatter(x=sim_data['lat'], y=sim_data['lon'], s=1000, facecolor='none', edgecolors='r')
y_init = ax.get_ylim()
x_init = ax.get_xlim()
sim_data.apply(lambda x : plot_pie(x,ax), axis=1)
ax.set_ylim(y_init)
ax.set_xlim(x_init)
ax.set_title('Original')
ax.set_frame_on(True)
# r-beginner's solution
ax = axs[1]
ax.scatter(x=sim_data['lat'], y=sim_data['lon'], s=1000, facecolor='none', edgecolors='r')
y_init = ax.get_ylim()
x_init = ax.get_xlim()
sim_data.apply(lambda x : plot_pie(x,ax), axis=1)
ax.set_ylim([0, y_init[1]*1.1])
ax.set_xlim([0, x_init[1]*1.1])
ax.set_title('r-beginners')
ax.set_frame_on(True)
# my solution
ax = axs[2]
# do not use `s=` for size, it will not work properly when you are scattering pieplots
# because pieplots will be plotted above them
ax.scatter(x=sim_data['lat'], y=sim_data['lon'], s=0)
# get min/max values for the axes
y_init = ax.get_ylim()
x_init = ax.get_xlim()
sim_data.apply(lambda x : plot_pie(x, ax, r=7), axis=1)
# from zero to xlim/ylim with step 10
_ = ax.yaxis.set_ticks(range(0, round(y_init[1])+10, 10))
_ = ax.xaxis.set_ticks(range(0, round(x_init[1])+10, 10))
_ = ax.set_title('My')
ax.set_frame_on(True)

Changing the order of entries for a geopandas choropleth map legend

I am plotting a certain categorical value over the map of a city. The line of code I use to plot is the following:
fig = plt.figure(figsize=(12, 12))
ax = plt.gca()
urban_data.plot(column="category", cmap="viridis", ax=ax, categorical=True,
                k=4, legend=True, linewidth=0.5,
                legend_kwds={'fontsize':'19', 'loc':'lower left'})
where urban_data is a geopandas dataframe, and I am using matplotlib as the plotting library. The argument legend_kwds lets me control minor things in the legend, like the position or the font size, but I cannot control major things such as the order of the entries in the legend box. My categories are ranked, let's say 1-2-3-4, but I always get them displayed in a different order.
Is it possible to have more control over the legend? For example by calling it outside the gdf.plot() function? And, if so, how do I match the colors in the legend with those in the map, which are discrete values (that I don't know exactly) of a viridis colormap?
EDIT: here is a verifiable example. Unfortunately shapefiles need other files to work, and here a geometry (an area, not a point) column is needed, so I have to ask you to download this shpfile of the US. Everything you need is within this folder. Here's the code to reproduce the issue. The plot in output is bad because I did not care about the coordinates system here, but the important thing is the legend.
import geopandas as gpd
import numpy as np
import matplotlib.pyplot as plt
gdf=gpd.read_file('.../USA_adm1.shp')
clusters=np.random.randint(0,4, size=52)
gdf['cluster']=clusters
clusdict={1: 'lower-middle', 2: 'upper-middle', 3: 'upper', 0: 'lower'}
gdf['cluster']=gdf['cluster'].map(clusdict)
fig = plt.figure(figsize=(12, 12))
ax = plt.gca()
gdf.plot(column='cluster',cmap='viridis', categorical=True, legend=True, ax=ax)
The bad news is that categories in legends produced by geopandas are sorted and this is hardcoded (see source-code here).
One solution is hence to have the categorical column such that if it is sorted, it would correspond to the desired order. Using integers seems fine for that. Then one can replace the names in the legend, once it is produced in the correct order.
import geopandas as gpd
import numpy as np
import matplotlib.pyplot as plt
gdf=gpd.read_file('data/USA_adm/USA_adm1.shp')
clusters=np.random.randint(0,4, size=52)
gdf['cluster']=clusters
clusdict={1: 'lower-middle', 2: 'upper-middle', 3: 'upper', 0: 'lower'}
fig = plt.figure(figsize=(12, 12))
ax = plt.gca()
gdf.plot(column='cluster',cmap='viridis', categorical=True, legend=True, ax=ax)
def replace_legend_items(legend, mapping):
    for txt in legend.texts:
        for k,v in mapping.items():
            if txt.get_text() == str(k):
                txt.set_text(v)
replace_legend_items(ax.get_legend(), clusdict)
plt.show()
I had to alter the accepted answer from @ImportanceOfBeingErnest a bit (the second line of the function) to get it to work; maybe there have been updates since:
import geopandas as gpd
import numpy as np
import matplotlib.pyplot as plt
gdf=gpd.read_file('data/USA_adm/USA_adm1.shp')
clusters=np.random.randint(0,4, size=52)
gdf['cluster']=clusters
clusdict={1: 'lower-middle', 2: 'upper-middle', 3: 'upper', 0: 'lower'}
fig = plt.figure(figsize=(12, 12))
ax = plt.gca()
gdf.plot(column='cluster',cmap='viridis', categorical=True, legend=True, ax=ax)
def replace_legend_items(legend, mapping):
    for txt in legend.get_texts():
        for k,v in mapping.items():
            if txt.get_text() == str(k):
                txt.set_text(v)
replace_legend_items(ax.get_legend(), clusdict)
plt.show()
Assuming that you have 4 legend entries, you can do the following to put them in whatever order you like. The following code puts them in this order (by index): 0, 2, 3, 1.
Here ax is the axes object which you have defined using ax = plt.gca()
handles,labels = ax.get_legend_handles_labels()
handles = [handles[0], handles[2], handles[3], handles[1]]
labels = [labels[0], labels[2], labels[3], labels[1]]
ax.legend(handles, labels)
Let me give you an example:
Default order
fig, ax = plt.subplots()
x = np.arange(5)
plt.plot(x, x, label=r'$y=x$')
plt.plot(x, 2*x, label=r'$y=2x$')
plt.plot(x, 3*x, label=r'$y=3x$')
plt.plot(x, 4*x, label=r'$y=4x$')
plt.legend(fontsize=16)
Manually changed order
fig, ax = plt.subplots()
x = np.arange(5)
plt.plot(x, x, label=r'$y=x$')
plt.plot(x, 2*x, label=r'$y=2x$')
plt.plot(x, 3*x, label=r'$y=3x$')
plt.plot(x, 4*x, label=r'$y=4x$')
handles,labels = ax.get_legend_handles_labels()
handles = [handles[0], handles[2],handles[3], handles[1]]
labels = [labels[0], labels[2], labels[3], labels[1]]
ax.legend(handles, labels, fontsize=16)
One can also use list comprehension using a pre-specified order list as
order = [0, 2, 3, 1]
handles,labels = ax.get_legend_handles_labels()
handles = [handles[i] for i in order]
labels = [labels[i] for i in order]
ax.legend(handles, labels, fontsize=16)
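If you would rather state the order by label name than by index, the same approach can be sketched as below. The category names and slopes are made up for illustration, and the lines are plain matplotlib lines. Whether get_legend_handles_labels() returns anything useful for a legend generated by geopandas depends on how geopandas builds that legend, which is not tested here.

```python
import numpy as np
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
x = np.arange(5)
# Lines are deliberately plotted in a "wrong" order
for name, slope in [("upper", 4), ("lower", 1),
                    ("upper-middle", 3), ("lower-middle", 2)]:
    ax.plot(x, slope * x, label=name)

wanted = ["lower", "lower-middle", "upper-middle", "upper"]  # desired legend order
handles, labels = ax.get_legend_handles_labels()
by_label = dict(zip(labels, handles))        # look up each handle by its label
legend = ax.legend([by_label[name] for name in wanted], wanted, fontsize=16)
# plt.show()
```

This fails with a KeyError if a wanted name has no matching label, which is usually preferable to silently dropping an entry.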

How to make xticks evenly spaced despite their value?

I am trying to generate a plot with x-axis being a geometric sequence while the y axis is a number between 0.0 and 1.0. My code looks like this:
from matplotlib import pyplot as plt
plt.xticks(X)
plt.plot(X,Y)
plt.show()
which generates a plot like this:
As you can see, I am explicitly setting the x-axis ticks to the ones belonging to the geometric sequence.
My question: Is it possible to make the x-ticks evenly spaced regardless of their value? The initial terms of the sequence are small and crowded together. I am after something like a logarithmic scale, which would be ideal when dealing with powers of a base, but I don't think it suits a geometric sequence, as is the case here.
You can do it by plotting your variable as a function of the "natural" variable that parametrizes your curve. For example:
n = 12
a = np.arange(n)
x = 2**a
y = np.random.rand(n)
fig = plt.figure(1, figsize=(7,7))
ax1 = fig.add_subplot(211)
ax2 = fig.add_subplot(212)
ax1.plot(x,y)
ax1.xaxis.set_ticks(x)
ax2.plot(a, y) #we plot y as a function of a, which parametrizes x
ax2.xaxis.set_ticks(a) #set the ticks to be a
ax2.xaxis.set_ticklabels(x) # change the ticks' names to x
which produces:
I had the same problem and spent several hours trying to find something appropriate. But it appears to be really easy and you do not need to make any parameterization or play with some x-ticks positions, etc.
The only thing you need to do is just to plot your x-values as str, not int: plot(x.astype('str'), y)
By modifying the code from the previous answer you will get:
n = 12
a = np.arange(n)
x = 2**a
y = np.random.rand(n)
fig = plt.figure(1, figsize=(7,7))
ax1 = fig.add_subplot(211)
ax2 = fig.add_subplot(212)
ax1.plot(x,y)
ax1.xaxis.set_ticks(x)
ax2.plot(x.astype('str'), y)
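The reason this works is that matplotlib treats string x-values as categories and places them at positions 0, 1, 2, ... in order of first appearance. A small sketch to check that (single axes, made-up y data):

```python
import numpy as np
import matplotlib.pyplot as plt

n = 12
x = 2 ** np.arange(n)                        # 1, 2, 4, ..., 2048
y = np.random.default_rng(2).random(n)

fig, ax = plt.subplots()
ax.plot(x.astype(str), y, marker="o")        # strings become evenly spaced categories

tick_positions = list(ax.get_xticks())       # where the categories were placed
print(tick_positions)
# plt.show()
```

Because the positions follow the order of appearance and not the numeric value, the data has to be sorted before plotting, and repeated values end up at the same position.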
Seaborn has several categorical plots that handle this kind of task natively.
Such as pointplot:
sns.pointplot(x="x", y="y", data=df, ax=ax)
Example
fig, [ax1, ax2] = plt.subplots(2, figsize=(7,7))
sns.lineplot(data=df, x="x", y="y", ax=ax1) #relational plot
sns.pointplot(data=df, x="x", y="y", ax=ax2) #categorical plot
In case of using Pandas Dataframe:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
n = 12
df = pd.DataFrame(dict(
X=2**np.arange(n),
Y=np.random.randint(1, 9, size=n),
)).set_index('X')
# index is reset in order to use as xticks
df.reset_index(inplace=True)
fig = plt.figure()
ax1 = plt.subplot(111)
df['Y'].plot(kind='bar', ax=ax1, figsize=(7, 7), use_index=True)
# set_ticklabels used to place original indexes
ax1.xaxis.set_ticklabels(df['X'])
convert int to str:
X = list(map(str, X))
plt.xticks(X)
plt.plot(X,Y)
plt.show()

Pandas dataframe plotting: yscale log and xy label and legend issues

Intro
I am new to python, matplotlib and pandas. I spent a lot of time reviewing material to come up with the following. And I am stuck.
Question:
I am trying to plot using pandas. I have three Y axis and of which one is log scale.
I cannot figure out why the log function (1) and the label function (2) don't work for my secondary axis ax2 in the code. They work everywhere else.
All the legends are separated (3). Is there a simpler way to handle this than doing it manually?
When I plot the secondary axis part, separately it comes out fine. I ran the plot removing third axis, still problem persists. I put here the code with all axis as I need the solution proposed to work together in this manner.
Here methods are given for solving (3) alone but I am particularly looking for dataframe based plotting. Also other manual techniques are given in the same site, which I do not want to use!
Code and explanation
# Importing the basic libraries
import matplotlib.pyplot as plt
from pandas import DataFrame
# test3 = Dataframe with 5 columns
test3 = df.loc[:,['tau','E_tilde','Max_error_red','time_snnls','z_t_gandb']]
# Setting up plot with 3 'y' axis
fig, ax = plt.subplots()
ax2, ax3 = ax.twinx(), ax.twinx()
rspine = ax3.spines['right']
rspine.set_position(('axes', 1.25))
ax3.set_frame_on(True)
ax3.patch.set_visible(False)
fig.subplots_adjust(right=0.75)
# Setting the color and labels
ax.set_xlabel('tau(nounit)')
ax.set_ylabel('Time(s)', color = 'b')
ax2.set_ylabel('Max_error_red', color = 'r')
ax3.set_ylabel('E_tilde', color = 'g')
# Setting the logscaling
ax.set_xscale('log') # Works
ax2.set_yscale('log') # Doesn't work
# Plotting the dataframe
test3.plot(x = 'tau', y = 'time_snnls', ax=ax, style='b-')
test3.plot(x = 'tau', y = 'Max_error_red', ax=ax2, style='r-', secondary_y=True)
test3.plot(x = 'tau', y = 'z_t_gandb', ax=ax, style='b-.')
test3.plot(x = 'tau', y = 'E_tilde', ax=ax3, style='g-')
The issue is the secondary_y=True option. Remove that, and it works fine. I think the problem is that you have already set up your twin axes, and having secondary_y=True is interfering with that.
As for the legend: set legend=False in each of your test3.plot commands, and then gather the legend handles and labels from the axes after you have made the plot using ax.get_legend_handles_labels(). Then you can plot them all on one legend.
Finally, to make sure the axes labels are set correctly, you must set them after you have plotted your data, as the pandas DataFrame plotting methods will overwrite whatever you have tried to set. By doing this afterwards, you make sure that it is your label that is set.
Here's a working script (with dummy data):
import matplotlib.pyplot as plt
from pandas import DataFrame
import numpy as np
# Fake up some data
test3 = DataFrame({
'tau':np.logspace(-3,0,100),
'E_tilde':np.linspace(100,0,100),
'Max_error_red':np.logspace(-2,1,100),
'time_snnls':np.linspace(5,0,100),
'z_t_gandb':np.linspace(16,15,100)
})
# Setting up plot with 3 'y' axis
fig, ax = plt.subplots()
ax2, ax3 = ax.twinx(), ax.twinx()
rspine = ax3.spines['right']
rspine.set_position(('axes', 1.25))
ax3.set_frame_on(True)
ax3.patch.set_visible(False)
fig.subplots_adjust(right=0.75)
# Setting the logscaling
ax.set_xscale('log') # Works
ax2.set_yscale('log') # Works once secondary_y=True is removed
# Plotting the dataframe
test3.plot(x = 'tau', y = 'time_snnls', ax=ax, style='b-',legend=False)
test3.plot(x = 'tau', y = 'Max_error_red', ax=ax2, style='r-',legend=False)
test3.plot(x = 'tau', y = 'z_t_gandb', ax=ax, style='b-.',legend=False)
test3.plot(x = 'tau', y = 'E_tilde', ax=ax3, style='g-',legend=False)
# Setting the color and labels
ax.set_xlabel('tau(nounit)')
ax.set_ylabel('Time(s)', color = 'b')
ax2.set_ylabel('Max_error_red', color = 'r')
ax3.set_ylabel('E_tilde', color = 'g')
# Gather all the legend handles and labels to plot in one legend
l1 = ax.get_legend_handles_labels()
l2 = ax2.get_legend_handles_labels()
l3 = ax3.get_legend_handles_labels()
handles = l1[0]+l2[0]+l3[0]
labels = l1[1]+l2[1]+l3[1]
ax.legend(handles,labels,loc=5)
plt.show()
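The legend-gathering step generalises to any number of twin axes. Below is a rough sketch with a hypothetical helper, combined_legend (not part of matplotlib or pandas), shown here with plain matplotlib lines and dummy data instead of DataFrame.plot.

```python
import numpy as np
import matplotlib.pyplot as plt

def combined_legend(target_ax, *axes, **legend_kwargs):
    """Collect handles and labels from several axes into one legend on target_ax."""
    handles, labels = [], []
    for a in axes:
        h, l = a.get_legend_handles_labels()
        handles += h
        labels += l
    return target_ax.legend(handles, labels, **legend_kwargs)

x = np.logspace(-3, 0, 50)
fig, ax = plt.subplots()
ax2 = ax.twinx()
ax.plot(x, np.linspace(5, 0, 50), "b-", label="time_snnls")
ax2.plot(x, np.logspace(-2, 1, 50), "r-", label="Max_error_red")
ax.set_xscale("log")
ax2.set_yscale("log")                        # log scale only on the twin axis

# Draw the legend on the top-most axes so lines from ax2 do not cover it
legend = combined_legend(ax2, ax, ax2, loc="center right")
# plt.show()
```

The order of the axes passed to the helper determines the order of the legend entries.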
