I know the question has been asked before regarding using matplotlib in pyspark. I wanted to know how to do the same through shell. I used the same code as given in this solution
How to use matplotlib to plot pyspark sql results
However I am getting this when I run in the shell.
<matplotlib.axes.AxesSubplot object at 0x7f1cd604b690>
I am not entirely sure I do understand your question, so please provide more details in case I got you wrong.
If you happen to see an instance of a Subplot-Class such as given by you, rather have the previous command's return value assigned to a variable, because otherwise you cannot use it. I will give a simple example of a subplot below:
In essence, instead of:
import matplotlib.pyplot as plt
fig = plt.figure()
fig.add_subplot(111)
do the following:
import matplotlib.pyplot as plt
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.plot(x_values, y_values)
plt.show()
If this didnt answer your question, please provide a minimal working example of what you are doing.
Related
I'm using pycharm to run some code using Seaborn. I'm very new to python and am just trying to learn the ropes so I'm following a tutorial online. I've imported the necessary libraries and have run the below code
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
# import the data saved as a csv
df = pd.read_csv('summer-products-with-rating-and-performance_2020-08.csv')
df["has_urgency_banner"] = df["has_urgency_banner"].fillna(0)
df["discount"] = (df["retail_price"] -
df["price"])/df["retail_price"]
df["rating_five_percent"] = df["rating_five_count"]/df["rating_count"]
df["rating_four_percent"] = df["rating_four_count"]/df["rating_count"]
df["rating_three_percent"] = df["rating_three_count"]/df["rating_count"]
df["rating_two_percent"] = df["rating_two_count"]/df["rating_count"]
df["rating_one_percent"] = df["rating_one_count"]/df["rating_count"]
ratings = [
"rating_five_percent",
"rating_four_percent",
"rating_three_percent",
"rating_two_percent",
"rating_one_percent"
]
for rating in ratings:
df[rating] = df[rating].apply(lambda x: x if x>= 0 and x<= 1 else 0)
# Distribution plot on price
sns.histplot(df['price'])
My output is as follows:
Process finished with exit code 0
so I know there are no errors in the code but I don't see any graphs anywhere as I'm supposed to.
Ive found a way around this by using this at the end
plt.show()
which opens a new tab and uses matplotlib to show me a similar graph.
However in the code I'm using to follow along, matplotlib is not imported or used (I understand that seaborn has built in Matplotlib functionality) as in the plt.show statement is not used but the a visual graph is still achieved.
I've also used print which gives me the following
AxesSubplot(0.125,0.11;0.775x0.77)
Last point to mention is that the code im following along with uses the following
import seaborn as sns
# Distribution plot on price
sns.distplot(df['price'])
but distplot has now depreciated and I've now used histplot because I think that's the best alternative vs using displot, If that's incorrect please let me know.
I feel there is a simple solution as to why I'm not seeing a graph but I'm not sure if it's to do with pycharm or due to something within the code.
matplotlib is a dependency of seaborn. As such, importing matplotlib with import matplotlib.pyplot as plt and calling plt.show() does not add any overhead to your code.
While it is annoying that there is no sns.plt.show() at this time (see this similar question for discussion), I think this is the simplest solution to force plots to show when using PyCharm Community.
Importing matplotlib in this way will not affect how your exercises run as long as you use a namespace like plt.
Be aware the 'data' must be pandas DataFrame object, not: <class 'pandas.core.series.Series'>
I using this, work finely:
# Distribution plot on price
sns.histplot(df[['price']])
plt.show()
I'm solving the optimization problem by different methods (for study).
The idea is written bellow:
instance = class_with_diff_methods()
instance.create_some_func()
instance.optimization_method_one()
instance.create_data_for_plot()
instance.display()
instance.optimization_method_two()
instance.create_data_for_plot()
instance.display()
So I wanted to add the new data iteratively, without saving and I implemented my idea this way:
`
import matplotlib.pyplot as plt
def display(self):
if not self.__plot_exists:
self.figure = plt.figure()
plt.scatter( self.data[0], self.data[1])
plt.plot( self.data[0], self.appdata)
self.figure.show()
self.__plot_exists = True
else:
plt.plot( self.data[0], self.appdata)
plt.show( block=False)`
This works, but the problem is that I don't really understand, why I even need (it necessary) using "self.figure" in the first part of the code and why I should use "plt.show()" instead of just "self.figure.show" at the last line.
I would be grateful for links, that will help me clearly understand how this works.
The plt.show() method is native from matplotlib, it is what launches the GUI. Check out their documentation if you would like to know about these types of things:
https://matplotlib.org/contents.html
I am using PyCharm 2016.1 and Python 2.7 on Windows 10 and imported the matplotlib module.
As the matplotlib module ist very extensive and I am relatively new to Python, I hoped the Auto Complete function in PyCharm could help me to get an overview of the existent properties/ functions of an object. It would be more convenient as digging through the api documentation every time, not knowing what to look for an where to find it.
For example:
from matplotlib import pyplot as plt
fig, ax = plt.subplots()
When I type ax. there ist no auto completion for the properties, functions etc. of the axis, I only get the suggestions list.
I already tried this and imported the axis module directly with:
import matplotlib.axis as axis
or
from matplotlib.axis import Axis as axis
Smart Auto Completion and 'Collect run-time types information' is already enabled.
Is there a way to enable the auto completion like described or is there another IDE that supports that?
I believe your problem is highlighted here:
https://intellij-support.jetbrains.com/hc/en-us/community/posts/205816499-Improving-collecting-run-time-type-information-for-code-insight?sort_by=votes
Tldr return types can vary, so it cant be figured out at compile time.
Most accepted way is to use a type hint, since it can only figure out what type it as run time :
import matplotlib.axes._axes as axes
fig = plt.figure(figsize=(5,10))
ax1 = fig.add_subplot(3,1,1) # type:axes.Axes
ax1.set_xlabel('Test') <- now autocompletes
You can also try an assert isinstance:
import matplotlib.axes._axes as axes
fig = plt.figure(figsize=(5,10))
ax1 = fig.add_subplot(3,1,1)
assert isinstance(ax1, axes.Axes)
ax1.set_xlabel('Test')
It wont find the autocomplete if you do it after the method you are looking for:
ax1.set_xlabel('Test')
assert isinstance(ax1, axes.Axes)
With this, you shouldnt let isinstance dictate the control flow of your code, if you are trying to run a method that doesnt exist on an object, it should crash, however, if your different object has a method of the same name (!) then you have inadvertently reached that goal without annotations being there. So I like it better, since you want it to crash early and in the correct place. YMMV
From the doc:
Assertions should not be used to test for failure cases that can
occur because of bad user input or operating system/environment
failures, such as a file not being found. Instead, you should raise an
exception, or print an error message, or whatever is appropriate. One
important reason why assertions should only be used for self-tests of
the program is that assertions can be disabled at compile time.
If Python is started with the -O option, then assertions will be
stripped out and not evaluated. So if code uses assertions heavily,
but is performance-critical, then there is a system for turning them
off in release builds. (But don't do this unless it's really
necessary.
https://wiki.python.org/moin/UsingAssertionsEffectively
Alternatively, if you dont want to add to your code in this fashion, and have Ipython/jupyter installed through anoconda, you can get the code completion from the console by right clicking the code to be ran and choosing "execute selection in console"
In addition to Paul's answer. If you are using fig, ax = plt.subplots() , you could use figure type hint. See below example:
from matplotlib import pyplot as plt
import matplotlib.axes._axes as axes
import matplotlib.figure as figure
fig, ax = plt.subplots() # type:figure.Figure, axes.Axes
ax.
fig.
I import matplotlib:
import matplotlib.pyplot as plt
Then, at some point I try this:
plt.set_xlim(datetime.date(2011,1,1),datetime.date(2015,9,1))
And as a result I get this:
TypeError: 'list' object is not callable
My guess was that I, at some point, rewritten plt.xlim (or something like that). So, I tried
plt.clf()
It does not help. I am still getting the same error message. So, does anybody know why plt.set_xlim is a list and what to do with that?
I had the same problem too. Apparently the xlim must be set by the axis not the plot. This contradicts some of the matplotlib documentation and examples (e.g., here)
Anyways:
ax = plt.gca()
ax.set_xlim(your_xmin, your_xmax)
Edit: It appears that the matplotlib documentation has now been fixed.
I am using the anaconda distribution of ipython/Qt console. I want to plot things inline so I type the following from the ipython console:
%pylab inline
Next I type the tutorial at (http://pandas.pydata.org/pandas-docs/dev/visualization.html) into ipython...
import matplotlib.pyplot as plt
import pandas as pd
ts = pd.Series(randn(1000), index = pd.date_range('1/1/2000', periods=1000))
ts = ts.cumsum()
ts.plot()
... and this is all that i get back:
<matplotlib.axes.AxesSubplot at 0x109253410>
But there is no plot. What could be wrong? Is there another command that I need to supply? The tutorial suggests that that is all that I need to type.
Plots are not displayed until you run
plt.show()
There could be 2 ways to approach this problem:
1) Either invoke the inline/osx/qt/gtk/gtk3/tk backend. Depends on the ipython console that you have been using. So, simply do:
%matplotlib inline #Here the inline backend is invoked, which removes the necessity of calling show after each plot.
or for ipython/qt console, do:
%matplotlib qt #This one works for me, thus, depends on the ipython console you use.
#
2) Or, do the traditional way as aforementioned (already answered above on this page):
plt.show() #However, you will have to call this show function each time.