Python ggplot and ggplotly - python

As a former R user, I used to combine the ggplot and plot_ly libraries extensively via the ggplotly() function to display data.
Having just moved to Python, I see that the ggplot library is available, but I can't find anything on a simple way to combine it with plotly for interactive graphical displays.
What I am looking for is something like:
from ggplot import *
import numpy as np
import pandas as pd
# randint excludes the upper bound, so 11 keeps the original 0..10 range
# (np.random.random_integers is deprecated)
a = pd.DataFrame({'grid': np.arange(-4, 4),
                  'test_data': np.random.randint(0, 11, 8)})
p2 = ggplot(a, aes(x = 'grid', y = 'test_data')) + geom_line()
p2
ggplotly(p2)
where the last line would launch a classic plotly dynamic viewer with all the great mouse-driven interactions, curve selection and so on.
Thanks for your help :),
Guillaume

This open plotnine issue describes a similar enhancement request.
Currently the mpl_to_plotly function seems to work sometimes (for some geoms?), but not consistently. The following code seems to work OK.
from plotnine import *
from plotly.tools import mpl_to_plotly as ggplotly
import numpy as np
import pandas as pd
a = pd.DataFrame({'grid': np.arange(-4, 4),
                  'test_data': np.random.randint(0, 10, 8)})
p2 = ggplot(a, aes(x = 'grid', y = 'test_data')) + geom_point()
ggplotly(p2.draw())

You don't need ggplotly in Python if all you are looking for is an interactive interface.
ggplot (or at least plotnine, which is the implementation that I am using) draws with matplotlib, whose figure windows are already interactive, unlike the R ggplot2 package, which requires plotly on top.

Related

When I run '''sns.histplot(df['price'])''' in pycharm I get the code output but no graph, why is this?

I'm using PyCharm to run some code that uses Seaborn. I'm very new to Python and am just trying to learn the ropes, so I'm following a tutorial online. I've imported the necessary libraries and have run the code below:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
# import the data saved as a csv
df = pd.read_csv('summer-products-with-rating-and-performance_2020-08.csv')
df["has_urgency_banner"] = df["has_urgency_banner"].fillna(0)
df["discount"] = (df["retail_price"] -
                  df["price"])/df["retail_price"]
df["rating_five_percent"] = df["rating_five_count"]/df["rating_count"]
df["rating_four_percent"] = df["rating_four_count"]/df["rating_count"]
df["rating_three_percent"] = df["rating_three_count"]/df["rating_count"]
df["rating_two_percent"] = df["rating_two_count"]/df["rating_count"]
df["rating_one_percent"] = df["rating_one_count"]/df["rating_count"]
ratings = [
    "rating_five_percent",
    "rating_four_percent",
    "rating_three_percent",
    "rating_two_percent",
    "rating_one_percent"
]
for rating in ratings:
    df[rating] = df[rating].apply(lambda x: x if x>= 0 and x<= 1 else 0)
# Distribution plot on price
sns.histplot(df['price'])
My output is as follows:
Process finished with exit code 0
so I know there are no errors in the code, but I don't see any graphs anywhere as I'm supposed to.
I've found a way around this by adding this at the end:
plt.show()
which opens a new tab and uses matplotlib to show me a similar graph.
However, the code I'm following along with neither imports matplotlib nor calls plt.show() (I understand that seaborn is built on matplotlib), yet a graph is still displayed.
I've also used print which gives me the following
AxesSubplot(0.125,0.11;0.775x0.77)
One last point: the code I'm following along with uses the following
import seaborn as sns
# Distribution plot on price
sns.distplot(df['price'])
but distplot is now deprecated, so I've used histplot instead because I think that's the best alternative to displot. If that's incorrect, please let me know.
I feel there is a simple explanation for why I'm not seeing a graph, but I'm not sure whether it has to do with PyCharm or with something in the code.
matplotlib is a dependency of seaborn. As such, importing matplotlib with import matplotlib.pyplot as plt and calling plt.show() does not add any overhead to your code.
While it is annoying that there is no sns.plt.show() at this time (see this similar question for discussion), I think this is the simplest solution to force plots to show when using PyCharm Community.
Importing matplotlib in this way will not affect how your exercises run as long as you use a namespace like plt.
In my setup I passed the data as a pandas DataFrame (df[['price']]) rather than as a Series (<class 'pandas.core.series.Series'>).
Using this works fine for me:
# Distribution plot on price
sns.histplot(df[['price']])
plt.show()

Using d3.select on a python-generated object

I'm using mpld3 to create a graph in a web app. I want to allow the user to drag the orange line.
I explicitly want to code this drag-and-drop behavior directly in d3.js, because d3.js allows me to set restrictions on the drag behavior (specifically, I want to allow only horizontal dragging, as shown in this JSFiddle).
I am having trouble selecting the orange line element. I am getting its ID with mpld3.utils.get_id(), but passing this ID on the JS side to either d3.select() or mpld3.get_element() does not work.
import mpld3
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(-1,1,20)
fig, ax = plt.subplots()
ax.plot(x,x**3)
line = ax.plot([0,0],[-1,0])
js = '''
<script src="https://d3js.org/d3.v4.min.js"></script>
'''
html = mpld3.fig_to_html(fig)
print(mpld3.utils.get_id(line)) # gives something like 'el68722312816129664'
with open('pured3_output.html', 'w') as f:
    f.write(html+js)
After opening pured3_output.html in the browser, I type this in the browser console:
>>> mpld3.get_element('el68722312816129664')
null
>>> d3.select('#el68722312816129664')
Selection {_groups: Array(1), _parents: Array(1)}
I don't know what it's returning, but it's clearly not the right thing, because d3.select('#el68722312816129664').remove() doesn't do anything.
I'm not particularly attached to mpld3 over other packages built on top of d3.js, such as Plotly. I have been using mpld3 so far because it seems simple and lightweight. (By the way, Plotly did not allow me to do what I wanted without touching d3.js.) If this question is for some reason hard to solve in mpld3 (I don't see why it should be), please let me know how you would do it with a different package.

How do I use the detrending function for matplotlib.pyplot.acorr?

I am a Python beginner. I am trying to detrend a time series before running an autocorrelation analysis with acorr in matplotlib, but there is something about the syntax that I fail to understand.
Matplotlib's website (https://matplotlib.org/3.1.0/api/_as_gen/matplotlib.pyplot.acorr.html) describes how to use detrending with the acorr function: "x is detrended by the detrend callable. This must be a function x = detrend(x) accepting and returning an numpy.array." I must be reading this wrong, because the code I use does not work.
Failed attempts:
plt.acorr(values, detrend=True)
plt.acorr(values, detrend="linear")
plt.acorr(values=detrend(values))
As you can see, some rudimentary fact about syntax or matplotlib escapes me. Please help.
In matplotlib.mlab you find functions which you can use for detrending. An example:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import mlab
wn = np.random.normal(size=10**3)
plt.figure()
plt.acorr(np.abs(wn), maxlags=200, detrend=mlab.detrend_none) #default detrend
plt.figure()
plt.acorr(np.abs(wn), maxlags=200, detrend=mlab.detrend) #subtract sample mean
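To make the "callable" requirement concrete, here is a minimal sketch of passing a function object (not a string or a boolean) as detrend. It assumes a made-up series named values (a linear trend plus noise) and a hypothetical helper remove_linear_trend; neither comes from the question. It also uses mlab.detrend_linear, which subtracts a least-squares straight line, for comparison.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import mlab

# Made-up series for illustration: a linear trend plus white noise.
rng = np.random.default_rng(0)
t = np.arange(500)
values = 0.01 * t + rng.normal(size=t.size)

def remove_linear_trend(x):
    """Hypothetical custom detrend callable: takes an array, returns an array."""
    idx = np.arange(len(x))
    slope, intercept = np.polyfit(idx, x, 1)  # least-squares straight line
    return x - (slope * idx + intercept)

fig, (ax1, ax2) = plt.subplots(2, 1)

# Pass the function itself; acorr calls it on the data before correlating.
lags1, c1, _, _ = ax1.acorr(values, maxlags=50, detrend=mlab.detrend_linear)
lags2, c2, _, _ = ax2.acorr(values, maxlags=50, detrend=remove_linear_trend)

plt.show()
```

Both panels should look essentially the same, since the custom function and mlab.detrend_linear remove the same fitted line. Any function with this array-in, array-out shape can be passed in the same way.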

Python codes using numpy and matplotlib without outputting result

I just started learning financial analytics with Python today. I ran into a block of code that requires importing numpy and matplotlib to run and generate a graph (I'm not sure if it really can).
The code looks like this:
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
mpl.rcParams['font.family'] = 'serif'
K = 8000
S = np.linspace(7000,9000,100)
h = np.maximum(S-K, 0)
plt.figure()
plt.plot(S, h, lw=2.5)
plt.xlabel('index level $S-t$ at maturity')
plt.ylabel("inner value of European call option")
plt.grid(True)
I installed both numpy and matplotlib, but when I ran the code nothing came out.
I really don't know what went wrong. This is my first time using numpy and matplotlib, and I have no idea how to solve this problem.
Please help!

using rpy2 with IPython notebooks?

Is it possible to use rpy2 (calling ggplot2) with IPython notebooks, and then save them (and share on NBViewer like other notebooks http://nbviewer.ipython.org/)? Is there any challenge in having the rpy2 ggplots appear in the notebook and/or interactively? It would be helpful if someone could provide an example session and its output of making a ggplot2 figure within a notebook using rpy2 in IPython.
This was written without looking at the code in rmagic.
They may have a more clever way to do it (I have 11 lines of code).
import uuid
from rpy2.robjects.packages import importr
from IPython.display import Image  # public import path for Image
grdevices = importr('grDevices')
def ggplot_notebook(gg, width = 800, height = 600):
    # draw the plot into a uniquely named PNG file using R's png device
    fn = '{uuid}.png'.format(uuid = uuid.uuid4())
    grdevices.png(fn, width = width, height = height)
    gg.plot()
    grdevices.dev_off()
    # hand the file back to the notebook as an inline image
    return Image(filename=fn)
To try it:
from rpy2.robjects.lib import ggplot2
from rpy2.robjects import Formula
datasets = importr('datasets')
mtcars = datasets.__rdata__.fetch('mtcars')['mtcars']
p = ggplot2.ggplot(mtcars) + \
    ggplot2.aes_string(x='mpg', y='cyl') + \
    ggplot2.geom_point() + \
    ggplot2.geom_smooth() + \
    ggplot2.facet_wrap(Formula('~ am'))
ggplot_notebook(p, height=300)
It's possible with the rmagic extension, which uses rpy2. You seem to need to print() the figure to show it, though. Here's an example session: http://nbviewer.ipython.org/5029692
If you prefer to use rpy2 directly, that should be possible too. Have a look at the rpy2 documentation for ggplot2. To get the figure into the notebook, you can draw to a PNG/SVG device and then read the file from the Python side (this is what rmagic does).
