Let's say there's a time series that I want to plot in matplotlib:
dates = pd.date_range(start='2011-01-01', end='2012-01-01')
s = pd.Series(np.random.rand(1, len(dates))[0], index=dates)
The GUI backends in matplotlib have this nice feature that they show the cursor coordinates in the window. When I plot pandas series using its plot() method like this:
fig = plt.figure()
s.plot()
fig.show()
the cursor's x coords are shown in full yyyy-mm-dd at the bottom of the window as you can see on pic 1.
However when I plot the same series s with pyplot:
fig = plt.figure()
plt.plot(s.index, s.values)
fig.show()
full dates are shown only when I zoom in. In the default view I can only see Mon-yyyy (see pic 2), and I would see just the year if the series were longer.
In my project there are functions for drawing complex, multi-series graphs from time series data using plt.plot(), so when I view the results in GUI I only see the full dates in the close-ups. I'm using ipython3 v. 4.0 and I'm mostly working with the MacOSX backend, but I tried TK, Qt and GTK backends on Linux with no difference in the behavior.
So far I've got 2 ideas on how to get the full dates displayed in GUI at any zoom level:
rewrite plt.plot() to pd.Series.plot()
use canvas event handler to get the x-coord from the cursor pos and print it somewhere
However before I attempt any of the above I need to know for sure if there is a better quicker way to get the full dates printed in the graph window. I guess there is, because pandas is using it, but I couldn't find it in pyplot docs or examples or elsewhere online and it's none of these 2 calls:
ax.xaxis_date()
fig.autofmt_xdate()
Somebody please advise.
The hooks for formatting the cursor coordinates shown in the window are Axes.format_coord and Axes.fmt_xdata. Standard formatters are defined in matplotlib.dates (plus some additions from pandas). A basic solution could be:
import matplotlib.dates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
dates = pd.date_range(start='2011-01-01', end='2012-01-01')
series = pd.Series(np.random.rand(len(dates)), index=dates)
plt.plot(series.index, series.values)
plt.gca().fmt_xdata = matplotlib.dates.DateFormatter('%Y-%m-%d')
plt.show()
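The other hook mentioned above, Axes.format_coord, replaces the whole coordinate string rather than just the x part. A minimal sketch, assuming you want both x and y in a custom layout; the function name format_coord and the output layout are illustrative choices, not something pandas or matplotlib prescribes:

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

dates = pd.date_range(start='2011-01-01', end='2012-01-01')
series = pd.Series(np.random.rand(len(dates)), index=dates)

fig, ax = plt.subplots()
ax.plot(series.index, series.values)


def format_coord(x, y):
    # x arrives as a matplotlib date number; num2date turns it into a datetime
    return f"x={mdates.num2date(x):%Y-%m-%d}, y={y:.3f}"


# Hypothetical layout: full date for x, three decimals for y
ax.format_coord = format_coord
```

Call plt.show() afterwards to open the window; the status line should then show the full date at any zoom level.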
I would like to know if the behavior of the following code is expected.
The first figure (Series) is saved as I would expect. The second (DataFrame) is not.
If this is not a bug, how can I achieve my (obvious) goal?
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
fig = plt.figure()
pd.Series(np.random.randn(100)).plot()
fig.savefig('c:\\temp\\plt_series.png')
fig = plt.figure()
pd.DataFrame(np.random.randn(100,2)).plot()
fig.savefig('c:\\temp\\plt_df.png')
After saving a figure, call plt.close() to close it; otherwise the old figure stays active while the next plot is being generated. You can also call plt.close('all') to make sure all open figures are closed.
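Another possible cause, which is an assumption and not confirmed in this thread, is that DataFrame.plot() draws into a figure other than the one held in fig. A hedged sketch that sidesteps the question either way: pass the target axes explicitly, or ask the returned Axes for its own figure. The file names are placeholders.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
# Passing ax makes pandas draw into this figure instead of picking one itself
pd.DataFrame(np.random.randn(100, 2)).plot(ax=ax)
fig.savefig('plt_df.png')
plt.close(fig)

# Alternatively, save whichever figure the returned Axes belongs to
ax2 = pd.DataFrame(np.random.randn(100, 2)).plot()
ax2.get_figure().savefig('plt_df_2.png')
plt.close(ax2.get_figure())
```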
I would like to plot an animated heatmap from a group of DataFrames (for example saved in a dictionary), either as gif or a movie.
For example, say I have the following collection of DFs. I can display all of them one after the other. But I would like to have them all being shown in the same figure in the same way as a GIF is shown (a loop of the heatmaps).
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
dataframe_collection = {}
for i in range(5):
    dataframe_collection[i] = pd.DataFrame(np.random.random((5,5)))
    # Here within the same loop just for brevity
    sns.heatmap(dataframe_collection[i])
    plt.show()
The simplest way is to first create separate PNG images, and then use a tool such as ImageMagick to convert them to an animated GIF.
Example that creates the PNGs:
import pandas as pd
import numpy as np
import seaborn as sns  # needed for sns.heatmap below
from matplotlib import pyplot as plt
dataframe_collection = {}
for i in range(5):
    dataframe_collection[i] = pd.DataFrame(np.random.random((5,5)))
    plt.figure()  # start from an empty figure so colorbars do not pile up
    #plt.pcolor(dataframe_collection[i])
    sns.heatmap(dataframe_collection[i])
    plt.gca().set_ylim(0, len(dataframe_collection[i])) #avoiding problem with axes
    plt.axis('off')
    plt.tight_layout()
    plt.savefig(f'dataframe_{i}.png')
    plt.close()  # release the figure once it is saved
After installing ImageMagick, the following shell command creates a GIF. If the defaults are not satisfactory, use the docs to explore the many options.
convert.exe -delay 20 -loop 0 dataframe_*.png dataframes.gif
See also this post about creating animations and an animated gif inside matplotlib.
Note that Seaborn's heatmap also has extra features, such as annotating each cell with its value: sns.heatmap(dataframe_collection[i], annot=True).
If you're unable to use ImageMagick, you could simulate a video by displaying the individual PNG files in quick succession.
This and this post contain more explanations and example code. Especially the second part of this answer looks promising.
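As an alternative to the separate-PNG approach above, the GIF can be written from inside matplotlib. This is a rough sketch, assuming Pillow is installed; it swaps seaborn's heatmap for a plain imshow image because updating one image per frame is simpler to animate. The names image and update and the output file name are illustrative.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation, PillowWriter

dataframe_collection = {i: pd.DataFrame(np.random.random((5, 5))) for i in range(5)}

fig, ax = plt.subplots()
# Draw the first frame once; later frames only swap the image data
image = ax.imshow(dataframe_collection[0].values, vmin=0, vmax=1)
fig.colorbar(image)


def update(i):
    # Replace the pixel data with the i-th DataFrame
    image.set_data(dataframe_collection[i].values)
    ax.set_title(f"dataframe {i}")
    return [image]


anim = FuncAnimation(fig, update, frames=len(dataframe_collection), interval=200)
# PillowWriter writes the GIF without needing ImageMagick
anim.save("dataframes.gif", writer=PillowWriter(fps=5))
```

Fixing vmin and vmax keeps the color scale constant across frames, so the colors stay comparable from one DataFrame to the next.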
I just started switching from R to python, and have been a bit confused by the way plots are handled.
In R, I would generate a scatter plot this way:
myPlot <- ggplot(myData, aes(x=x, y=y)) + geom_point()
myPlot will be treated as an object: I can save it, copy it, pass it, or just plot it later.
However, in python I couldn't figure out how to do it. For example, when I use:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({
    'X': [1,2,3],
    'Y': [4,5,6]
})
ax = df.plot(kind="scatter", x='X', y='Y')
All I want to do here is save the plot to an object so that it can easily be plotted later without executing all the code again (it's easy to redo in this dummy case, but my data is far more complicated).
It seems that I was able to save some information into "ax", as suggested online, but I couldn't figure out how to reproduce the plot from the object "ax".
Thank you so much~
Have a look at the visualization section of the pandas docs.
ax in your example indeed holds the plot object. Depending on your environment you can save it to a figure or display it inline.
The easiest way to display it is to call plt.show().
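To keep the plot around as an object, one option is to go from the Axes to its parent Figure. A minimal sketch, assuming the figure only needs to be saved to disk or restored in a later session; the pickle round trip is my suggestion, not something the pandas docs section above describes, and the file name is a placeholder.

```python
import pickle

import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3], 'Y': [4, 5, 6]})
ax = df.plot(kind="scatter", x='X', y='Y')

# The Figure is the object that can be saved, passed around or shown later
fig = ax.get_figure()
fig.savefig("scatter.png")

# Figures can be pickled, so the plot can be restored without re-running the analysis
payload = pickle.dumps(fig)
restored = pickle.loads(payload)
```

Unlike a ggplot object, a matplotlib Figure is an already drawn canvas and not a plot specification, so restoring it gives back the same drawing and does not re-evaluate the data.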
I've seen a bunch of solutions (including matplotlib example code) showing that to make an x-tick label multi-line you can just introduce a newline character, which I did (below is a code excerpt that adds this newline):
subplot.xaxis.set_major_formatter(mdates.DateFormatter("%m-%d\n%H:%M", tz=startTime.tzinfo))
However, I noticed this introduces a weird quirk in which when I mouse-over the plots it kind of causes all the plots to 'jump' up and down (shift slightly up and then back down when I mouse over again). Note: if there is just one plot then the bottom matplotlib toolbar (with the save button etc..) shifts up and down only. This makes it unpleasant to look at when you are trying to move the mouse around and interact with the plots. I noticed when I take out the new-line character this quirk disappears. Anyone else run into this and solve it (as in keeping multiline label without this weird 'jump' quirk)?
I'm using Python 3.6 and matplotlib 1.5.3 with the TkAgg backend.
By default, the same formatter is used for the values shown in the NavigationToolbar as for the tick labels on the axes. I suppose that you want to use the format "%m-%d\n%H:%M" from the question just for the tick labels and are happy to use a single-line format for the values shown when moving the mouse.
This can be achieved by using a different formatter for those two cases.
# Format tick labels
ax.xaxis.set_major_formatter(mdates.DateFormatter("%m-%d\n%H:%M"))
# Format toolbar coordinates
ax.fmt_xdata = mdates.DateFormatter('%m-%d %H:%M')
Complete code for reproduction:
import matplotlib
matplotlib.use("TkAgg")
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
dates = pd.date_range("2016-06-01 09:00", "2016-06-01 16:00", freq="H" )
y = np.cumsum(np.random.normal(size=len(dates)))
df = pd.DataFrame({"Dates" : dates, "y": y})
fig, ax = plt.subplots()
ax.plot(df["Dates"], df.y, '-')  # plot_date is deprecated in recent matplotlib; plot handles dates
ax.xaxis.set_major_locator(mdates.HourLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%m-%d\n%H:%M"))
ax.fmt_xdata = mdates.DateFormatter('%m-%d %H:%M')
plt.show()
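The difference between the two formatters in the code above can be checked without opening a window. A small sketch, assuming matplotlib's default timezone setting (UTC); the variable names are illustrative. The tick formatter returns a two-line string, which is what makes the toolbar text change height, while the second formatter stays on one line.

```python
import matplotlib.dates as mdates

tick_fmt = mdates.DateFormatter("%m-%d\n%H:%M")      # used for the tick labels
toolbar_fmt = mdates.DateFormatter("%m-%d %H:%M")    # used for ax.fmt_xdata

# Formatters are callables that take a matplotlib date number
x = mdates.datestr2num("2016-06-01 09:00")

tick_text = tick_fmt(x)        # '06-01\n09:00' (two lines)
toolbar_text = toolbar_fmt(x)  # '06-01 09:00' (one line)
```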
I'm required to use the information from a .sac file and plot it against a grid. I know that using various ObsPy functions one is able to plot the seismograms using st.plot(), but I can't seem to get it against a grid. I've also tried following the example given here "How do I draw a grid onto a plot in Python?" but have trouble when trying to configure my x axis to use UTCDateTime. I'm new to Python and programming of this sort, so any advice / help would be greatly appreciated.
Various resources used:
"http://docs.obspy.org/tutorial/code_snippets/reading_seismograms.html"
"http://docs.obspy.org/packages/autogen/obspy.core.stream.Stream.plot.html#obspy.core.stream.Stream.plot"
The Stream's plot() method actually automatically generates a grid, e.g. if you take the default example and plot it via:
from obspy.core import read
st = read() # without filename an example file is loaded
tr = st[0] # we will use only the first channel
tr.plot()
You may want to play with the number_of_ticks, tick_format and tick_rotation parameters as pointed out in http://docs.obspy.org/packages/autogen/obspy.core.stream.Stream.plot.html.
However if you want more control you can pass a matplotlib figure as input parameter to the plot() method:
from obspy.core import read
import matplotlib.pyplot as plt
fig = plt.figure()
st = read('/path/to/file.sac')
st.plot(fig=fig)
# at this point do whatever you want with your figure, e.g.
fig.gca().set_axis_off()
# finally display your figure
plt.show()  # blocks until the window is closed, unlike fig.show() in a script
Hope it helps.