I have a plot generated with the code below.
I want to fix the x-axis display so that the dates are more readable.
I would also like to be able to enlarge the graph.
My code is:
import requests
import urllib.parse
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
def get_api_call(ids, **kwargs):
    API_BASE_URL = "https://apis.datos.gob.ar/series/api/"
    kwargs["ids"] = ",".join(ids)
    return "{}{}?{}".format(API_BASE_URL, "series", urllib.parse.urlencode(kwargs))
df = pd.read_csv(get_api_call(
    ["168.1_T_CAMBIOR_D_0_0_26", "101.1_I2NG_2016_M_22",
     "116.3_TCRMA_0_M_36", "143.3_NO_PR_2004_A_21", "11.3_VMATC_2004_M_12"],
    format="csv", start_date=2018
))
time = df.indice_tiempo
construccion=df.construccion
emae = df.emae_original
time = pd.to_datetime(time)
# Use a name that does not shadow the built-in list
data = {'date': time, 'const': construccion, 'EMAE': emae}
dataset = pd.DataFrame(data)
plt.plot( 'date', 'EMAE', data=dataset, marker='o', markerfacecolor='blue', markersize=12, color='skyblue', linewidth=4)
plt.plot( 'date', 'const', data=dataset, marker='', color='olive', linewidth=2)
plt.legend()
To make the x-tick labels more readable, try rotating them, for example by 90 degrees:
plt.xticks(rotation=90)
To enlarge the figure, set its size when you create it, before any plotting calls, for example:
fig, ax = plt.subplots(figsize=(10, 8))
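Putting both suggestions together, a minimal sketch could look like the following. The data is synthetic (the `dates` and `emae` values are made up for illustration and are not taken from the API above), and the three-month tick spacing and `%Y-%m` format are just one possible choice.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic monthly data standing in for the real series (assumption).
dates = pd.date_range("2018-01-01", periods=24, freq="MS")
emae = np.linspace(140, 150, len(dates))

# Set the figure size up front instead of resizing afterwards.
fig, ax = plt.subplots(figsize=(10, 8))
ax.plot(dates, emae, marker="o", color="skyblue", label="EMAE")

# Show one tick every three months, formatted as year-month.
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=3))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m"))

# Rotate the labels so they do not overlap.
ax.tick_params(axis="x", labelrotation=90)
ax.legend()
fig.tight_layout()
```

Call plt.show() or fig.savefig(...) afterwards to display or store the result.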
I am fairly sure this can also be done from the Matplotlib window itself. In a recent version you can zoom in on a section of the graph with the zoom button in the toolbar at the bottom left. To make the x-tick labels more readable, you can simply maximize the window with the expand button in the top right, or use Sheldore's solution.
I'm creating an HTML page with a dropdown menu. When the user hits the "submit" button after making their selection from the dropdown menu, the CGI script runs and pulls data from a CSV file, plots it using matplotlib, and then displays the plot using base64. The plot has dates along the x-axis and percentage on the y-axis.
I've got it all working in Python 3.8 using Spyder, but when I load it to my server (which uses Python 3.4) it creates a huge plot that I have to scroll in the browser. When I change the figsize to a height less than 10, it cuts off the x-axis label and tick labels. I've rotated the xticks 30° to make them readable. How do I essentially "zoom out" on the entire figure, including tick and axis labels?
Here's the portion of my code that creates the plot:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
import base64
fig, ax = plt.subplots(figsize=(15, 10))
df = pd.read_csv(filepath, header=1, parse_dates=['Report_Date'], index_col=['Report_Date'])
ax.plot(df.index.values, df['colname'], color='teal')
ax.yaxis.set_major_formatter(mtick.PercentFormatter(xmax=1, decimals=None, symbol='%', is_latex=False))
plt.xlabel('Report Date')
plt.ylabel('ylabel')
plt.title('title')
plt.xticks(rotation=30, ha='right')
plt.savefig('picture.png', dpi=200)
data_uri = base64.b64encode(open('picture.png','rb').read()).decode('utf-8')
img_tag = '<img src="data:image/png;base64,{0}">'.format(data_uri)
print(img_tag)
I think the simplest fix is to call plt.tight_layout() before plt.savefig(), like this:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
import base64
fig, ax = plt.subplots(figsize=(15, 10))
df = pd.read_csv(filepath, header=1, parse_dates=['Report_Date'], index_col=['Report_Date'])
ax.plot(df.index.values, df['colname'], color='teal')
ax.yaxis.set_major_formatter(mtick.PercentFormatter(xmax=1, decimals=None, symbol='%', is_latex=False))
plt.xlabel('Report Date')
plt.ylabel('ylabel')
plt.title('title')
plt.xticks(rotation=30, ha='right')
plt.tight_layout()
plt.savefig('picture.png', dpi=200)
data_uri = base64.b64encode(open('picture.png','rb').read()).decode('utf-8')
img_tag = '<img src="data:image/png;base64,{0}">'.format(data_uri)
print(img_tag)
More information: https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.tight_layout.html
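As a possible variation, the image can be encoded in memory instead of going through picture.png. Note that the pixel size of a saved figure is roughly figsize times dpi, so 15x10 inches at dpi=200 is about 3000x2000 pixels, which may be why the browser has to scroll; lowering dpi shrinks the whole image, labels included. The helper name `figure_to_img_tag` and the sample data below are hypothetical, and this is only a sketch.

```python
import base64
import io

import matplotlib.pyplot as plt


def figure_to_img_tag(fig, dpi=100):
    """Render fig to PNG in memory and return an HTML <img> tag."""
    buffer = io.BytesIO()
    # bbox_inches="tight" adjusts the saved area so labels are not cut off.
    fig.savefig(buffer, format="png", dpi=dpi, bbox_inches="tight")
    data_uri = base64.b64encode(buffer.getvalue()).decode("utf-8")
    return '<img src="data:image/png;base64,{0}">'.format(data_uri)


# Hypothetical small figure; the real script would plot the CSV data here.
fig, ax = plt.subplots(figsize=(8, 5))
ax.plot([0, 1, 2], [0.1, 0.5, 0.3], color="teal")
ax.set_xlabel("Report Date")
fig.tight_layout()
img_tag = figure_to_img_tag(fig, dpi=100)
```

The CGI script would then print(img_tag) as before.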
I have looked on this forum for a solution to my problem, but have not been able to find one that fits my case.
Here is a minimum working example:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame(np.random.randint(0,100,size=(16,15)),
columns=list('ABCDEFGHIJKLMNO'))
clr=["#1b0031","#3fee42","#b609d2","#c9ff5d","#7449f6","#03fca1","#9164ff","#ffaf06",
"#087dff","#ff5c0d","#0081b0","#fff276","#530069","#8cff9c","#ff56d7"]
df1=df.loc[0:3]
df1.loc[4]=clr
df1=df1.drop(columns=["A","M","J","F"])
clr1=list(df1.loc[4])
df1=df1.drop(4)
df2=df.loc[4:7]
df2=df2.reset_index(drop=True)
df2.loc[4]=clr
df2=df2.drop(columns=["B","M","K","L"])
clr2=list(df2.loc[4])
df2=df2.drop(4)
df3=df.loc[8:11]
df3=df3.reset_index(drop=True)
df3.loc[4]=clr
df3=df3.drop(columns=["D","L","F"])
clr3=list(df3.loc[4])
df3=df3.drop(4)
df4=df.loc[12:16]
df4=df4.reset_index(drop=True)
df4.loc[4]=clr
df4=df4.drop(columns=["G","I","N","O"])
clr4=list(df4.loc[4])
df4=df4.drop(4)
fig, axes = plt.subplots(nrows=2, ncols=2,sharex=True,figsize=(8,15))
df1.plot.area(ax=axes[0][0],color=clr1)
box = axes[0][0].get_position()
axes[0][0].legend(loc='center left', bbox_to_anchor=(1, 0.5),fontsize=12)
df2.plot.area(ax=axes[1][0],color=clr2)
box = axes[1][0].get_position()
axes[1][0].legend(loc='center left', bbox_to_anchor=(1, 0.5),fontsize=12)
df3.plot.area(ax=axes[0][1],color=clr3)
box = axes[0][1].get_position()
axes[0][1].legend(loc='center left', bbox_to_anchor=(1, 0.5),fontsize=12)
df4.plot.area(ax=axes[1][1],color=clr4)
box = axes[1][1].get_position()
axes[1][1].legend(loc='center left', bbox_to_anchor=(1, 0.5),fontsize=12)
This generates the following
I want to make a common legend on the right side of the figure. For example, even though "A" appears in three subplots, I would like to have it appear only once in the common legend.
From my dataframe df , I know which column names map to which color. Is there a way to use this information to build a legend?
Looking forward to any suggestions.
Given that each column maps directly onto a single color, the easiest way to do this is to disable the automatic legend generation in pandas.DataFrame.plot.area() via legend=False and build the list of legend handles manually instead.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Patch
df = pd.DataFrame(np.random.randint(0,100,size=(16,15)),
columns=list('ABCDEFGHIJKLMNO'))
clr=["#1b0031","#3fee42","#b609d2","#c9ff5d","#7449f6","#03fca1","#9164ff","#ffaf06",
"#087dff","#ff5c0d","#0081b0","#fff276","#530069","#8cff9c","#ff56d7"]
df1=df.loc[0:3]
df1.loc[4]=clr
df1=df1.drop(columns=["A","M","J","F"])
clr1=list(df1.loc[4])
df1=df1.drop(4)
df2=df.loc[4:7]
df2=df2.reset_index(drop=True)
df2.loc[4]=clr
df2=df2.drop(columns=["B","M","K","L"])
clr2=list(df2.loc[4])
df2=df2.drop(4)
df3=df.loc[8:11]
df3=df3.reset_index(drop=True)
df3.loc[4]=clr
df3=df3.drop(columns=["D","L","F"])
clr3=list(df3.loc[4])
df3=df3.drop(4)
df4=df.loc[12:16]
df4=df4.reset_index(drop=True)
df4.loc[4]=clr
df4=df4.drop(columns=["G","I","N","O"])
clr4=list(df4.loc[4])
df4=df4.drop(4)
fig, axes = plt.subplots(nrows=2, ncols=2,sharex=True,figsize=(8,15))
df1.plot.area(ax=axes[0][0],color=clr1, legend=False)
df2.plot.area(ax=axes[1][0],color=clr2, legend=False)
df3.plot.area(ax=axes[0][1],color=clr3, legend=False)
df4.plot.area(ax=axes[1][1],color=clr4, legend=False)
handles = [Patch(color = clr[i], label = df.columns.values[i]) for i in range(len(clr))]
plt.figlegend(handles=handles)
plt.show()
You can adjust the position of the legend using the bbox_to_anchor and loc arguments, as usual.
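For instance, to place the common legend on the right side of the figure, something along these lines might work. The `color_by_column` mapping is a hypothetical three-column subset used only for illustration, and the 0.8 and 0.82 values are guesses that would need tuning for a real figure.

```python
import matplotlib.pyplot as plt
from matplotlib.patches import Patch

# Hypothetical column-to-color mapping (a subset, for illustration only).
color_by_column = {"A": "#1b0031", "B": "#3fee42", "C": "#b609d2"}

fig, axes = plt.subplots(nrows=2, ncols=2, sharex=True, figsize=(8, 6))

# One handle per column, so each name appears only once in the legend.
handles = [Patch(color=c, label=name) for name, c in color_by_column.items()]

# Leave room on the right, then anchor the legend's left edge in that space.
# For a figure-level legend, bbox_to_anchor is given in figure coordinates.
fig.subplots_adjust(right=0.8)
legend = fig.legend(handles=handles, loc="center left", bbox_to_anchor=(0.82, 0.5))
```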
I have several dataframes that I am trying to plot in one single figure.
However, it needs to happen step by step: the single plot should be updated each time the loop runs.
What I am trying now is:
for i in range(0, len(df)):
    plt.plot(df[i].values[:,0], df[i].values[:,1])
    plt.show()
It seems to work, but it generates a new graph at each iteration.
I want them all to be in one plot that is updated as the loop runs.
Thanks for your help.
Edit: The answers you referred to do not cover what I want.
They just superimpose two datasets.
What I want is for the original figure to be updated at each iteration as a new graph is superimposed, rather than showing them all at once after the loop ends.
Here's an example of a plot that gets updated automatically using matplotlib's animation feature. However, you could also call the update routine yourself, whenever necessary:
import numpy as np
import matplotlib.pyplot as plt
import pandas
import matplotlib.animation as animation
from matplotlib.animation import FuncAnimation
df = pandas.DataFrame(data=np.linspace(0, 100, 101), columns=["colA"])
fig = plt.figure()
ax = plt.gca()
ln, = ax.plot([], [], "o", mew=2, mfc="None", ms=15, mec="r")
class dataPlot(object):
    def __init__(self):
        self.objs = ax.plot(df.loc[0,"colA"], "g*", ms=15, mew=2, mec="g", mfc="None", label="$Data$")
        fig.legend(self.objs, [l.get_label() for l in self.objs], loc="upper center", prop={"size":18}, ncol=2)
    def update(self, iFrame):
        for o in self.objs:
            o.remove()
        print("Rendering frame {:d}".format(iFrame))
        self.objs = ax.plot(df.loc[iFrame,"colA"], "g*", ms=15, mew=2, mec="g", mfc="None", label="$Data$")
        return ln,
dp = dataPlot()
ani = FuncAnimation(fig, dp.update, frames=df.index, blit=True)
plt.show()
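If you would rather call the update yourself inside your own loop, a rough sketch could look like this. The function name `add_curves_incrementally` and the sine data are hypothetical; the idea is simply to keep one Axes, add a line per iteration, and ask the canvas to redraw.

```python
import matplotlib.pyplot as plt
import numpy as np


def add_curves_incrementally(ax, curves, pause=0.0):
    """Add one curve per iteration to the same axes, redrawing each time."""
    lines = []
    for x, y in curves:
        (line,) = ax.plot(x, y)
        lines.append(line)
        ax.relim()            # recompute data limits for the new curve
        ax.autoscale_view()   # rescale the axes to fit everything so far
        ax.figure.canvas.draw_idle()
        if pause > 0:
            plt.pause(pause)  # let the GUI process events and repaint
    return lines


# Hypothetical data: three sine curves with different phases.
x = np.linspace(0, 2 * np.pi, 50)
curves = [(x, np.sin(x + phase)) for phase in (0.0, 0.5, 1.0)]

fig, ax = plt.subplots()
lines = add_curves_incrementally(ax, curves)
```

In an interactive session you would typically call plt.ion() first and pass something like pause=0.5, so that each new curve becomes visible before the next one is added.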
I have 15 barh subplots that look like this:
I can't get the legend to work: I want to see 2, 3 and 4 as separate labels both in the graph and in the legend.
I'm having trouble making this work for subplots. My code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
def plot_bars_by_data(data, title):
    fig, axs = plt.subplots(8,2, figsize=(20,40))
    fig.suptitle(title, fontsize=20)
    fig.subplots_adjust(top=0.95)
    plt.rcParams.update({'font.size': 13})
    axs[7,1].remove()
    column_index = 0
    for ax_line in axs:
        for ax in ax_line:
            if column_index < len(data.columns):
                column_name = data.columns[column_index]
                current_column_values = data[column_name].value_counts().sort_index()
                ax.barh([str(i) for i in current_column_values.index], current_column_values.values)
                ax.legend([str(i) for i in current_column_values.index])
                ax.set_title(column_name)
                column_index +=1
    plt.show()
# random data
df_test = pd.DataFrame([np.random.randint(2,5,size=15) for i in range(15)], columns=list('abcdefghijlmnop'))
plot_bars_by_data(df_test, "testing")
I just get an 8x2 grid of bar charts that look like the graph above. How can I fix this?
I'm using Python 3.6 and Jupyter Python notebook.
Use the following lines in your code. I can't show the whole output here because it's a large figure with lots of subplots, so I am showing only one particular subplot. It turns out that you first have to keep a handle to the bars returned by barh, and then pass both that handle and the legend labels to legend() to produce the desired legend.
colors = ['r', 'g', 'b']
axx = ax.barh([str(i) for i in current_column_values.index], current_column_values.values, color=colors)
ax.legend(axx, [str(i) for i in current_column_values.index])
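As a self-contained illustration of the same idea on a single Axes, here is a small sketch. The `values` series is hypothetical and stands in for one column of df_test.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical column of values, standing in for one column of df_test.
values = pd.Series([2, 3, 3, 4, 4, 4])
counts = values.value_counts().sort_index()
labels = [str(i) for i in counts.index]

fig, ax = plt.subplots()
# barh returns a container holding one rectangle per bar.
bars = ax.barh(labels, counts.values, color=["r", "g", "b"])
# Passing the bars as handles gives each bar its own legend entry.
legend = ax.legend(bars, labels)
ax.set_title("a")
```

Inside plot_bars_by_data, the same two calls would replace the existing ax.barh(...) and ax.legend(...) lines in the loop.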
I am really struggling with matplotlib, especially with the axis settings. My goal is to set up 6 subplots in one figure, which all display different datasets but have the same number of tick labels.
The relevant part of my source code looks like this:
graph4.py:
# Import Matplotlib Modules #
import matplotlib as mpl
from matplotlib.figure import Figure
from matplotlib.backends.backend_gtkagg import FigureCanvasGTKAgg as FigureCanvas
from matplotlib import ticker
import matplotlib.pyplot as plt
mpl.rcParams['font.sans-serif']='Arial' #set font to arial
# Import GTK Modules #
import gtk
#Import System Modules #
import sys
# Import Numpy Modules #
from numpy import genfromtxt
import numpy
# Import Own Modules #
import mysubplot as mysp
class graph4():
    weekdays = ['Montag', 'Dienstag', 'Mittwoch', 'Donnerstag', 'Freitag', 'Samstag']
    def __init__(self, graphview):
        #create new Figure
        self.figure = Figure(figsize=(100,100), dpi=75)
        #create six subplots within self.figure
        self.subplot = []
        for j in range(6):
            self.subplot.append(self.figure.add_subplot(321 + j))
        self.__conf_subplots__() #configure title, xlabel, ylabel and grid of all subplots
        #to make it look better
        self.figure.subplots_adjust(left=0.125, bottom=0.1, right=0.9, top=0.96, wspace=0.2, hspace=0.6)
        #Matplotlib <-> GTK
        self.canvas = FigureCanvas(self.figure) # a gtk.DrawingArea
        self.canvas.set_flags(gtk.HAS_FOCUS|gtk.CAN_FOCUS)
        self.canvas.grab_focus()
        self.canvas.show()
        graphview.pack_start(self.canvas, True, True)
    #add labels and grid to all subplots
    def __conf_subplots__(self):
        index = 0
        for i in self.subplot:
            mysp.conf_subplot(i, 'Zeit', 'Menge', graph4.weekdays[index], True)
            i.plot([], [], 'bo') #empty plot
            index +=1
    def plot(self, filename_list):
        index = 0
        for filename in filename_list:
            data = genfromtxt(filename, delimiter=',') #load data from filename
            if data.size != 0: #only if file isn't empty
                if index < len(self.subplot): #plot every file on a different subplot (valid indices only)
                    mysp.plot(self.subplot[index],data[0:, 1], data[0:, 0])
            index +=1
        self.canvas.draw()
    def clear_plot(self):
        #clear axis of all subplots
        for i in self.subplot:
            i.cla()
        self.__conf_subplots__()
mysubplot.py: (helper module)
# Import Matplotlib Modules
from matplotlib.axes import Subplot
import matplotlib.dates as md
import matplotlib.pyplot as plt
# Import Own Modules #
import mytime as myt
# Import Numpy Modules #
import numpy as np
def conf_subplot(subplot, xlabel, ylabel, title, grid):
    if xlabel is not None:
        subplot.set_xlabel(xlabel)
    if ylabel is not None:
        subplot.set_ylabel(ylabel)
    if title is not None:
        subplot.set_title(title)
    subplot.grid(grid)
    #rotate xaxis labels
    plt.setp(subplot.get_xticklabels(), rotation=30, fontsize=12)
    #display date on xaxis
    subplot.xaxis.set_major_formatter(md.DateFormatter('%H:%M:%S'))
    subplot.xaxis_date()
def plot(subplot, x, y):
    subplot.plot(x, y, 'bo')
I think the best way to explain what goes wrong is with the use of screenshots. After I start my application, everything looks good:
If I double-click a 'Week' entry on the left, the method clear_plot() in graph4.py is called to reset all subplots. Then a list of filenames is passed to the method plot() in graph4.py. The method plot() opens each file and plots each dataset on a different subplot. So after I double-click an entry, it looks like this:
As you can see, each subplot has a different number of xtick labels, which looks pretty ugly to me. Therefore, I am looking for a solution to improve this. My first approach was to set the ticklabels manually with xaxis.set_ticklabels(), so that each subplot has the same number of ticklabels. However, as strange as it sounds, this only works on some datasets and I really don't know why. On some datasets, everything works fine and on other datasets, matplotlib is basically doing what it wants and displays xaxis labels that I didn't specify. I also tried FixedLocator(), but I got the same result. On some datasets it is working and on others, matplotlib is using a different number of xtick labels.
What am I doing wrong?
Edit:
As @sgpc suggested, I tried to use pyplot. My source code now looks like this:
import matplotlib as mpl
import matplotlib.pyplot as plt
from matplotlib.backends.backend_gtkagg import FigureCanvasGTKAgg as FigureCanvas
import matplotlib.dates as md
mpl.rcParams['font.sans-serif']='Arial' #set font to arial
import gtk
import sys
# Import Numpy Modules #
from numpy import genfromtxt
import numpy
# Import Own Modules #
import mysubplot as mysp
class graph2():
    weekdays = ['Montag', 'Dienstag', 'Mittwoch', 'Donnerstag', 'Freitag', 'Samstag']
    def __init__(self, graphview):
        self.figure, temp = plt.subplots(ncols=2, nrows=3, sharex = True)
        #2d array -> list
        self.axes = [ y for x in temp for y in x]
        #axis: date
        for i in self.axes:
            i.xaxis.set_major_formatter(md.DateFormatter('%H:%M:%S'))
            i.xaxis_date()
        #make space and rotate xtick labels
        self.figure.autofmt_xdate()
        #Matplotlib <-> GTK
        self.canvas = FigureCanvas(self.figure) # a gtk.DrawingArea
        self.canvas.set_flags(gtk.HAS_FOCUS|gtk.CAN_FOCUS)
        self.canvas.grab_focus()
        self.canvas.show()
        graphview.pack_start(self.canvas, True, True)
    def plot(self, filename_list):
        index = 0
        for filename in filename_list:
            data = genfromtxt(filename, delimiter=',') #get dataset
            if data.size != 0: #only if file isn't empty
                if index < len(self.axes): #print each dataset on a different subplot
                    self.axes[index].plot(data[0:, 1], data[0:, 0], 'bo')
            index +=1
        self.canvas.draw()
    #not yet implemented
    def clear_plot(self):
        pass
If I plot some datasets, I get the following output:
http://i.imgur.com/3ngYTNr.png (sorry, I still don't have enough reputation to embed pictures)
Moreover, I am not sure that sharing the x-axis is a good idea, because the x-values may differ in every subplot (for example, in the first subplot the x-values range from 8:00am to 11:00am and in the second subplot they range from 7:00pm to 9:00pm).
If I get rid of sharex = True, I get the following output:
http://i.imgur.com/rxHeSyJ.png (sorry, I still don't have enough reputation to embed pictures)
As you can see, the output now looks better. However, the labels on the x-axes are now not updated. I assume that is because the last subplots are empty.
My next attempt was to use a separate axis for each subplot. Therefore, I made these changes:
for i in self.axes:
    plt.setp(i.get_xticklabels(), visible=True, rotation = 30) #<-- I added this line...
    i.xaxis.set_major_formatter(md.DateFormatter('%H:%M:%S'))
    i.xaxis_date()
#self.figure.autofmt_xdate() #<--changed this line
self.figure.subplots_adjust(left=0.125, bottom=0.1, right=0.9, top=0.96, wspace=0.2, hspace=0.6) #<-- and added this line
Now I get the following output:
i.imgur.com/TmA1goE.png (sorry, I still don't have enough reputation to embed pictures)
So with this attempt, I am basically struggling with the same problem as with Figure() and add_subplot().
I really don't know, what else I could try to make it work...
I strongly recommend using pyplot.subplots() with sharex=True:
fig, axes = plt.subplots(ncols=2, nrows=3, sharex=True)
Then you access each Axes using:
ax = axes[i, j]
And you can plot with:
ax.plot(...)
To control the number of ticks for each AxesSubplot you can use:
ax.locator_params(axis='x', nbins=6)
Note: axis can be 'x', 'y' or 'both'.
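One caveat: as far as I know, the nbins parameter of locator_params is aimed at the default numeric locator (MaxNLocator), so it may not have the desired effect on date axes. For time-of-day data without a shared x-axis, a different technique that should give every subplot the same number of ticks is to install a LinearLocator on each axis. This is a sketch with hypothetical data (the `times` helper and the two datasets are made up), not the method from the answer above.

```python
from datetime import datetime, timedelta

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
from matplotlib.ticker import LinearLocator


def times(start_hour, minutes, count):
    """Return `count` datetimes spaced `minutes` apart on an arbitrary day."""
    start = datetime(2020, 1, 6, start_hour)
    return [start + timedelta(minutes=minutes * k) for k in range(count)]


# Hypothetical datasets covering different parts of the day.
datasets = [times(8, 20, 10), times(19, 15, 9)]

fig, axes = plt.subplots(ncols=2, nrows=1, figsize=(10, 4))
for ax, xs in zip(axes, datasets):
    ax.plot(xs, range(len(xs)), "bo")
    # Exactly six evenly spaced ticks, whatever the data range is.
    ax.xaxis.set_major_locator(LinearLocator(numticks=6))
    ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M:%S"))
    ax.tick_params(axis="x", labelrotation=30)
fig.tight_layout()
```

The trade-off is that evenly spaced ticks can land on "odd" times such as 08:36:00 rather than on round hours.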