I am running a loop to extract data and graph plots using Seaborn, Pandas and Python. I just want to save each plot as a graphic and close it but I am not able to figure out how to do this.
/usr/local/lib/python3.6/dist-packages/seaborn/axisgrid.py:311: RuntimeWarning: More than 20 figures have been opened. Figures created through the pyplot interface (matplotlib.pyplot.figure) are retained until explicitly closed and may consume too much memory. (To control this warning, see the rcParam figure.max_open_warning).
I had expected g.close() to work but I got the error:
AttributeError: 'FacetGrid' object has no attribute 'close'
for o in options:
    s = "SELECT * from options_yahoo where contract_name = '" + o + "'"
    SQL_Query = pd.read_sql_query(s, conn)
    df = pd.DataFrame(SQL_Query)
    g = sns.relplot(kind="line", data=df[['bid','ask','lastprice']])
    g.savefig(o + ".png")
    g.close()  # raises the AttributeError shown above
I expect to be able to have a more efficient solution that doesn't use up so much memory and bring warning errors. Some best practices would be much appreciated.
Seaborn plots respond to pyplot commands: you can call plt.close() to close the current figure, even if it was plotted by Seaborn.
If you want to close a specific figure corresponding to a seaborn plot (e.g. a FacetGrid) called sns_plot, use:
plt.close(sns_plot.fig)
I have to use different CSV files to create plots out of them in the same figure. My coding environment is Google Colab (it's like a Jupyter notebook in Google's cloud). So I decided to create a figure and then loop through the files and do the plots. It looks something like this:
import healpy as hp
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(16,12))
ax = fig.add_subplot(111)
for void_file in ['./filepath1.csv','./filepath2.csv','./filepath3.csv', ...]:
    helper_image = hp.gnomview(void_file, .....)
    data = func1(helper_image, .....)
    plt.plot(len(data), data, ......)
What I want is to only add into the figure the plots created with the line plt.plot(len(data), data, ......), but what happens is that also the helper images from the line helper_image = hp.gnomview(....) sneak into the image and spoil it (healpy is a package for spherical data). The line helper_image = .... is only there to make some necessary calculations, but unfortunately they come along with plots.
How can I suppress the creation of the plots by helper_image = hp.gnomview(....)? Or can I somehow tell the figure or ax to include only plots that I specify? Or are there any easy alternatives that don't require a loop for plotting? Thanks.
You can pass the return_projected_map=True and no_plot=True keyword arguments to hp.gnomview, so that it returns the projected map without drawing anything; see https://healpy.readthedocs.io/en/latest/generated/healpy.visufunc.gnomview.html
I am making a lot of plots and saving them to files. It all works, but while the script runs I get the following message:
RuntimeWarning: More than 20 figures have been opened. Figures created through the pyplot interface (`matplotlib.pyplot.figure`) are retained until explicitly closed and may consume too much memory. (To control this warning, see the rcParam `figure.max_open_warning`).
fig = self.plt.figure(figsize=self.figsize)
So I think I could improve the code by closing the figures. I googled it and found that I should use fig.close(). However, I get the following error: 'Figure' object has no attribute 'close'. How can I make it work?
This is the loop in which I create plots:
for i in years:
    ax = newdf.plot.barh(y=str(i), rot=0)
    fig = ax.get_figure()
    fig.savefig('C:\\Users\\rysza\\Desktop\\python data analysis\\zajecia3\\figure'+str(i)+'.jpeg',bbox_inches='tight')
    fig.close()
Replace fig.close() with plt.close(fig). close is a function defined directly in the matplotlib.pyplot module, not a method of the Figure object.
Try matplotlib.pyplot.close(fig). For more information, see the documentation:
https://matplotlib.org/2.1.0/api/_as_gen/matplotlib.pyplot.close.html
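As a complement to closing each figure, a loop like this can also reuse a single figure and close it once at the end. The following is only a sketch under assumptions: newdf is replaced by a small made-up DataFrame, the files go to a temporary folder, and PNG is used instead of JPEG.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend for batch saving
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for newdf: one column per year.
newdf = pd.DataFrame(
    {"2019": [1, 2, 3], "2020": [2, 3, 4]},
    index=["a", "b", "c"],
)
out_dir = tempfile.mkdtemp()  # hypothetical output folder

fig, ax = plt.subplots()      # one figure, reused for every year
written = []
for year in newdf.columns:
    ax.clear()                               # wipe the previous year's bars
    newdf.plot.barh(y=year, rot=0, ax=ax)    # draw into the existing Axes
    path = os.path.join(out_dir, f"figure{year}.png")
    fig.savefig(path, bbox_inches="tight")
    written.append(path)

plt.close(fig)  # module-level function; Figure itself has no close() method
```

Passing ax=ax keeps pandas from creating a new figure on every iteration, so only one figure is ever open.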
I just started switching from R to python, and have been a bit confused by the way plots are handled.
In R, I would generate a scatter plot this way:
myPlot <- ggplot(myData, aes(x=x, y=y)) + geom_point()
myPlot will be treated as an object: I can save it, copy it, pass it, or just plot it later.
However, in python I couldn't figure out how to do it. For example, when I use:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({
    'X': [1,2,3],
    'Y': [4,5,6]
})
ax = df.plot(kind="scatter", x='X', y='Y')
All I want to do here is save the plot to an object, so that it can easily be plotted later without executing all the code again. (That is easy to redo in this dummy case, but my data is far more complicated.)
It seems that I was able to save some information into "ax", as suggested online, but I couldn't figure out how to reproduce the plot with the object "ax".
Thank you so much~
Have a look at the visualization section of the pandas docs.
ax in your example indeed holds the plot object. Depending on your environment you can save it to a figure or display it inline.
The easiest way is just to plt.show().
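A minimal sketch of working with the returned object, assuming the dummy DataFrame from the question: the Axes gives access to its parent Figure, which can be kept around, modified, and saved later. The output path and the title text are made up for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # only needed when running without a display
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3], "Y": [4, 5, 6]})

ax = df.plot(kind="scatter", x="X", y="Y")  # returns a matplotlib Axes
fig = ax.get_figure()                       # the Figure that owns this Axes

# The objects stay usable after the plotting call, e.g. to tweak and save.
ax.set_title("Y against X")
out_path = os.path.join(tempfile.mkdtemp(), "scatter.png")  # hypothetical path
fig.savefig(out_path)
```

In a Jupyter notebook, evaluating fig as the last expression of a cell should display the figure again without re-running the plotting code.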
I've tried different methods to save my plot, but everything I've tried has produced a blank image and I'm now out of ideas. Are there any other suggestions that could fix this? The code sample is below.
word_frequency = nltk.FreqDist(merged_lemmatizedTokens) #obtains frequency distribution for each token
print("\nMost frequent top-10 words: ", word_frequency.most_common(10))
word_frequency.plot(10, title='Top 10 Most Common Words in Corpus')
plt.savefig('img_top10_common.png')
I was able to save the NLTK FreqDist plot, when I first initialized a figure object, then called the plot function and finally saved the figure object.
import matplotlib.pyplot as plt
from nltk.probability import FreqDist
fig = plt.figure(figsize = (10,4))
plt.gcf().subplots_adjust(bottom=0.15) # to avoid x-ticks cut-off
fdist = FreqDist(merged_lemmatizedTokens)
fdist.plot(10, cumulative=False)
plt.show()
fig.savefig('freqDist.png', bbox_inches = "tight")
I think you can try the following:
plt.ion()
word_frequency.plot(10, title='Top 10 Most Common Words in Corpus')
plt.savefig('img_top10_common.png')
plt.ioff()
plt.show()
This is because inside nltk's plot function, plt.show() is called and once the figure is closed, plt.savefig() has no active figure to save anymore.
The workaround is to turn interactive mode on, such that the plt.show() from inside the nltk function does not block. Then savefig is called with a current figure available and saves the correct plot. To then show the figure, interactive mode needs to be turned off again and plt.show() be called externally - this time in a blocking mode.
Ideally, nltk would rewrite its plotting function to allow setting the blocking status, to return the created figure without showing it, or to take an Axes as input to plot to. Feel free to reach out to them with this request.
Let's say there's a time series that I want to plot in matplotlib:
dates = pd.date_range(start='2011-01-01', end='2012-01-01')
s = pd.Series(np.random.rand(1, len(dates))[0], index=dates)
The GUI backends in matplotlib have this nice feature that they show the cursor coordinates in the window. When I plot pandas series using its plot() method like this:
fig = plt.figure()
s.plot()
fig.show()
the cursor's x coords are shown in full yyyy-mm-dd at the bottom of the window as you can see on pic 1.
However when I plot the same series s with pyplot:
fig = plt.figure()
plt.plot(s.index, s.values)
fig.show()
full dates are only shown when I zoom in and in the default view I can only see Mon-yyyy (see pic 2) and I would see just the year if the series were longer.
In my project there are functions for drawing complex, multi-series graphs from time series data using plt.plot(), so when I view the results in GUI I only see the full dates in the close-ups. I'm using ipython3 v. 4.0 and I'm mostly working with the MacOSX backend, but I tried TK, Qt and GTK backends on Linux with no difference in the behavior.
So far I've got 2 ideas on how to get the full dates displayed in GUI at any zoom level:
1. rewrite plt.plot() to pd.Series.plot()
2. use canvas event handler to get the x-coord from the cursor pos and print it somewhere
However before I attempt any of the above I need to know for sure if there is a better quicker way to get the full dates printed in the graph window. I guess there is, because pandas is using it, but I couldn't find it in pyplot docs or examples or elsewhere online and it's none of these 2 calls:
ax.xaxis_date()
fig.autofmt_xdate()
Somebody please advise.
The hooks for formatting the cursor readout are Axes.format_coord and Axes.fmt_xdata. Standard formatters are defined in matplotlib.dates (plus some additions from pandas). A basic solution could be:
import matplotlib.dates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
dates = pd.date_range(start='2011-01-01', end='2012-01-01')
series = pd.Series(np.random.rand(len(dates)), index=dates)
plt.plot(series.index, series.values)
plt.gca().fmt_xdata = matplotlib.dates.DateFormatter('%Y-%m-%d')
plt.show()
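The other hook mentioned above, Axes.format_coord, controls the whole readout rather than just the x part. A rough sketch, assuming the same random series as above; the exact output format is an arbitrary choice:

```python
import datetime as dt

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

dates = pd.date_range(start="2011-01-01", end="2012-01-01")
series = pd.Series(np.random.rand(len(dates)), index=dates)

fig, ax = plt.subplots()
ax.plot(series.index, series.values)

def format_coord(x, y):
    # x arrives as a matplotlib date number; convert it back to a datetime.
    day = mdates.num2date(x).strftime("%Y-%m-%d")
    return f"x={day}  y={y:.3f}"

ax.format_coord = format_coord  # used by GUI backends for the cursor readout

# The hook can also be called directly to check what the window would show.
print(ax.format_coord(mdates.date2num(dt.datetime(2011, 6, 15)), 0.5))
```

Calling plt.show() afterwards opens the window, where the status bar should display the full date at any zoom level.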