I just started switching from R to Python, and have been a bit confused by the way plots are handled.
In R, I would generate a scatter plot this way:
myPlot <- ggplot(myData, aes(x=x, y=y)) + geom_point()
myPlot is then treated as an object: I can save it, copy it, pass it around, or just plot it later.
However, in Python I couldn't figure out how to do the same. For example, when I use:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({
'X': [1,2,3],
'Y': [4,5,6]
})
ax = df.plot(kind="scatter", x='X', y='Y')
All I want to do here is save the plot to an object, so that it can be plotted later without executing all the code again (it's easy to redo in this dummy case, but my data is far more complicated).
It seems that I was able to save some information into "ax", as suggested online, but I couldn't figure out how to reproduce the plot with the object "ax".
Thank you so much~
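Have a look at the visualization section of the pandas docs.
ax in your example does hold the plot: it is a matplotlib Axes object, and the figure it belongs to is available as ax.figure (or ax.get_figure()). Depending on your environment you can save that figure to a file with ax.figure.savefig(...) or display it inline.
The easiest way is just to call plt.show().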
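As a rough sketch of what "keeping the plot as an object" can look like: the figure behind ax can be rendered again at any time, and it can also be pickled and restored without rebuilding it from the data. The Agg backend, the in-memory buffer, and the variable names below are assumptions made for illustration; pickling generally works for matplotlib figures, but it is worth checking with your own, more complicated plots.

```python
import io
import pickle

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3], "Y": [4, 5, 6]})
ax = df.plot(kind="scatter", x="X", y="Y")

# The Axes belongs to a Figure; the Figure is the object to keep around.
fig = ax.get_figure()

# Render the stored figure whenever needed (here into memory instead of a file).
buffer = io.BytesIO()
fig.savefig(buffer, format="png")

# Serialize the figure and restore it later without re-running the plotting code.
payload = pickle.dumps(fig)
restored = pickle.loads(payload)
restored_ax = restored.axes[0]
```

In a notebook, evaluating fig (or restored) as the last expression of a cell displays it again; in a script, fig.savefig("plot.png") writes it to disk.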
I have to use several csv files to create plots in the same figure. My coding environment is Google Colab (it's like a Jupyter notebook in Google's cloud). So I decided to create a figure and then loop through the files and do the plots. It looks something like this:
import healpy as hp
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(16,12))
ax = fig.add_subplot(111)
for void_file in ['./filepath1.csv','./filepath2.csv','./filepath3.csv', ...]:
helper_image = hp.gnomview(void_file, .....)
data = func1(helper_image, .....)
plt.plot(len(data), data, ......)
What I want is to only add into the figure the plots created with the line plt.plot(len(data), data, ......), but what happens is that also the helper images from the line helper_image = hp.gnomview(....) sneak into the image and spoil it (healpy is a package for spherical data). The line helper_image = .... is only there to make some necessary calculations, but unfortunately they come along with plots.
How can I suppress the creation of the plots by helper_image = hp.gnomview(....)? Or can I somehow tell the figure or ax to include only the plots that I specify? Or are there any easy alternatives that don't require a loop for plotting? Thanks.
You can use the return_projected_map=True and no_plot=True keyword arguments, see https://healpy.readthedocs.io/en/latest/generated/healpy.visufunc.gnomview.html
There are a number of helpful posts for using LineCollections in Matplotlib.
I have working code, but am having trouble figuring out how to set the transparency of the lines. For example, in Pandas it's as easy as doing:
df.plot(kind='line',alpha=.25)
However, I chose the LineCollection method because I want to plot a dataframe with >15k lines and the above example does not work.
I've tried adding ax.set_alpha(.05) in my code:
fig, ax = plt.subplots()
ax.set_xlim(np.min(may_days), np.max(may_days))
ax.set_ylim(np.min(may_segments.min()), np.max(may_segments.max()))
line_segments = LineCollection(may_segments,cmap='jet')
line_segments.set_array(may_days)
ax.add_collection(line_segments)
ax.set_alpha(.05)
ax.set_title('Daily May Data')
plt.show()
but there is no change.
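Unfortunately I cannot provide a sample of the data I'm working with; however, I've found the second example in this Matplotlib gallery doc to be easy to copy.
You do it the same way you'd do it in pandas: pass alpha when creating the collection. ax.set_alpha() does not propagate to the artists added to the Axes, which is why it had no effect.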
line_segments = LineCollection(may_segments, cmap='jet', alpha=0.05)
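Here is a self-contained sketch of the same idea with made-up data, since the original may_segments and may_days are not available. The random-walk values, the number of lines, and the Agg backend are assumptions for illustration only. It also shows that the transparency can be changed afterwards on the collection itself with set_alpha.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import LineCollection

# Hypothetical stand-in data: 100 lines, one value per day of May.
days = np.arange(1, 32)
rng = np.random.default_rng(0)
values = rng.random((100, days.size)).cumsum(axis=1)

# Each segment is an (N, 2) array of (x, y) points.
segments = [np.column_stack([days, row]) for row in values]

fig, ax = plt.subplots()
line_segments = LineCollection(segments, cmap="jet", alpha=0.05)
line_segments.set_array(np.arange(len(segments)))  # one color value per line
ax.add_collection(line_segments)

# add_collection does not necessarily rescale the view, so set the limits explicitly.
ax.set_xlim(days.min(), days.max())
ax.set_ylim(values.min(), values.max())

# Alpha can also be adjusted later, on the collection rather than on the Axes.
line_segments.set_alpha(0.25)
```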
Let's say there's a time series that I want to plot in matplotlib:
dates = pd.date_range(start='2011-01-01', end='2012-01-01')
s = pd.Series(np.random.rand(1, len(dates))[0], index=dates)
The GUI backends in matplotlib have this nice feature that they show the cursor coordinates in the window. When I plot pandas series using its plot() method like this:
fig = plt.figure()
s.plot()
fig.show()
the cursor's x coords are shown in full yyyy-mm-dd at the bottom of the window as you can see on pic 1.
However when I plot the same series s with pyplot:
fig = plt.figure()
plt.plot(s.index, s.values)
fig.show()
full dates are only shown when I zoom in and in the default view I can only see Mon-yyyy (see pic 2) and I would see just the year if the series were longer.
In my project there are functions for drawing complex, multi-series graphs from time series data using plt.plot(), so when I view the results in GUI I only see the full dates in the close-ups. I'm using ipython3 v. 4.0 and I'm mostly working with the MacOSX backend, but I tried TK, Qt and GTK backends on Linux with no difference in the behavior.
So far I've got 2 ideas on how to get the full dates displayed in GUI at any zoom level:
rewrite plt.plot() to pd.Series.plot()
use canvas event handler to get the x-coord from the cursor pos and print it somewhere
However before I attempt any of the above I need to know for sure if there is a better quicker way to get the full dates printed in the graph window. I guess there is, because pandas is using it, but I couldn't find it in pyplot docs or examples or elsewhere online and it's none of these 2 calls:
ax.xaxis_date()
fig.autofmt_xdate()
Somebody please advise.
Hooks for formatting the info are Axes.format_coord or Axes.fmt_xdata. Standard formatters are defined in matplotlib.dates (plus some additions from pandas). A basic solution could be:
import matplotlib.dates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
dates = pd.date_range(start='2011-01-01', end='2012-01-01')
series = pd.Series(np.random.rand(len(dates)), index=dates)
plt.plot(series.index, series.values)
plt.gca().fmt_xdata = matplotlib.dates.DateFormatter('%Y-%m-%d')
plt.show()
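If the y value should be formatted as well, the other hook mentioned above, format_coord, can be replaced with a small function. This is only a sketch: the function body, the output format, and the Agg backend are assumptions, and the function is called by hand at the end just to show what the status bar text would be.

```python
from datetime import datetime

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

dates = pd.date_range(start="2011-01-01", end="2012-01-01")
series = pd.Series(np.random.rand(len(dates)), index=dates)

fig, ax = plt.subplots()
ax.plot(series.index, series.values)


def format_coord(x, y):
    """Build the cursor text: full date for x, three decimals for y."""
    date = mdates.num2date(x)  # x arrives as a matplotlib date number
    return f"x={date:%Y-%m-%d}  y={y:.3f}"


# GUI backends call ax.format_coord(x, y) to fill the status bar.
ax.format_coord = format_coord

# Call the hook by hand to see the text for one cursor position.
sample_x = mdates.date2num(datetime(2011, 3, 15))
status_text = ax.format_coord(sample_x, 0.5)
```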
I'm required to use the information from a .sac file and plot it against a grid. I know that using various ObsPy functions one is able to plot the Seismograms using st.plot() but I can't seem to get it against a grid. I've also tried following the example given here "How do I draw a grid onto a plot in Python?" but have trouble when trying to configure my x axis to use UTCDatetime. I'm new to python and programming of this sort so any advice / help would be greatly appreciated.
Various resources used:
"http://docs.obspy.org/tutorial/code_snippets/reading_seismograms.html"
"http://docs.obspy.org/packages/autogen/obspy.core.stream.Stream.plot.html#obspy.core.stream.Stream.plot"
The Stream's plot() method actually automatically generates a grid, e.g. if you take the default example and plot it via:
from obspy.core import read
st = read() # without filename an example file is loaded
tr = st[0] # we will use only the first channel
tr.plot()
You may want to play with the number_of_ticks, tick_format and tick_rotation parameters as pointed out in http://docs.obspy.org/packages/autogen/obspy.core.stream.Stream.plot.html.
However if you want more control you can pass a matplotlib figure as input parameter to the plot() method:
from obspy.core import read
import matplotlib.pyplot as plt
fig = plt.figure()
st = read('/path/to/file.sac')
st.plot(fig=fig)
# at this point do whatever you want with your figure, e.g.
fig.gca().set_axis_off()
# finally display your figure
fig.show()
Hope it helps.
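Once you hold the figure, the grid itself can be switched on per axes. The sketch below uses a plain matplotlib figure with a synthetic trace and datetime x values as a stand-in, so that it runs without ObsPy or a .sac file; the start time, sampling interval, and signal are made up. Applying the same loop to the fig passed to st.plot(fig=fig) assumes that ObsPy draws onto ordinary matplotlib axes in that figure.

```python
from datetime import datetime, timedelta

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in for a seismogram: 30 s of samples at 100 Hz.
start = datetime(2009, 8, 24, 0, 20, 3)
times = [start + timedelta(seconds=0.01 * i) for i in range(3000)]
amplitude = np.sin(np.linspace(0, 60, len(times)))

fig, ax = plt.subplots()
ax.plot(times, amplitude, color="k", linewidth=0.5)

# Turn the grid on for every axes in the figure.
for axes in fig.axes:
    axes.grid(True, linestyle=":")

# Rotate the date labels so they do not overlap.
fig.autofmt_xdate()
```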
I am writing a script in Python (.py file) and I am using Matplotlib to plot an array.
I want to add a legend with a formula to the plot, but I haven't been able to do it.
I have done this before in IPython or the terminal. In this case, writing something like this:
legend(ur'$The_formula$')
worked perfectly. However, this doesn't work when I call my .py script from the terminal/IPython.
The easiest way is to assign the label when you plot the data,
e.g.:
import matplotlib.pyplot as plt
ax = plt.gca() # or any other way to get an axis object
ax.plot(x, y, label=r'$\sin (x)$')
ax.legend()
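A self-contained version of the snippet above, with x and y filled in by made-up data (a sine curve chosen only for illustration) and the Agg backend assumed so that it runs as a plain .py script:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 200)
y = np.sin(x)

fig, ax = plt.subplots()
# A raw string keeps the backslash in \sin from being treated as an escape.
ax.plot(x, y, label=r"$\sin(x)$")
legend = ax.legend()

# The legend picks up the label given at plot time.
legend_label = legend.get_texts()[0].get_text()
```

In a script, end with plt.show() or fig.savefig(...) to actually see the result.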
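When writing code for labels, it is:
import pylab
# code here
pylab.plot(x, y, ':', label=r'$\sin(x)$')  # dotted line; the label must be passed as a keyword
So perhaps pylab.legend([r'$latex here$'])  # legend() expects a list of labels, not a bare string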
Edit:
The u is for unicode strings; try just r'$\latex$'. Note that the combined ur prefix is a syntax error in Python 3.