Python Seaborn and Pandas Data Presentation in Graphical Form - python

I was trying out ML in Python. Along the way I came across seaborn.pairplot and other seaborn plot functions, as well as the pandas.plot functions. To keep things simple, I'm limiting the scope of this question to seaborn, since the same solution is likely to work for pandas too.
I was trying the very basic code presented on the seaborn website.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
tips = sns.load_dataset("tips")
g = sns.relplot(data=tips, x="total_bill", y="tip")
g.ax.axline(xy1=(10, 2), slope=.2, color="b", dashes=(5, 2))
The datasets have been cloned from the seaborn GitHub repository. Still, the moment execution reaches the sns.*plot() functions, it gives a 'NoneType' object is not callable error.
The seaborn dataset repository has quite a few CSV datasets. The above was about tips.csv. sns.load_dataset("tips") can load the dataset tips.csv from the current directory, whereas for another dataset, penguins, residing in the same directory, sns.load_dataset("penguins") fails!
After importing os in penguins.py, print(os.getcwd()) gives a different result, even though penguins.csv and tips.csv are in the same folder, and penguins.py and tips.py are also in the same folder (a different one from the dataset folder), as below.
'D:\Protege\Documents\Seaborn\~.csv' #where all datasets are
'D:\Protege\Documents\Seaborn\process\~.py' #where all ~.py files are
For tips, os.getcwd() rightly gives the first directory, whereas in penguins.py the same function gives the second.
If I call sns.load_dataset('D:\Protege\Documents\Seaborn\Penguins.csv'), it says penguins is not one of the datasets.
Let's discuss tips and the 'NoneType' object is not callable error first.
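A note on the path problem described above, offered as a sketch rather than a definitive fix. As far as I know, sns.load_dataset expects the name of one of seaborn's example datasets and fetches it from the online seaborn-data repository (caching it locally); it does not take a path to a CSV file. For a local copy, pandas.read_csv is the usual route, and building the path from a known folder avoids depending on whatever os.getcwd() happens to return. The helper name load_local_csv and the folder layout in the comments are assumptions for illustration.

```python
from pathlib import Path

import pandas as pd


def load_local_csv(name, data_dir):
    """Read <data_dir>/<name>.csv with pandas (hypothetical helper)."""
    csv_path = Path(data_dir) / f"{name}.csv"
    if not csv_path.is_file():
        raise FileNotFoundError(f"No such dataset file: {csv_path}")
    return pd.read_csv(csv_path)


# Assumed layout from the question: the scripts live in ...\Seaborn\process
# and the CSV files one level up in ...\Seaborn. Inside a script one could write:
#   data_dir = Path(__file__).resolve().parent.parent
#   penguins = load_local_csv("penguins", data_dir)
```

The returned DataFrame can then be passed to seaborn in the usual way, e.g. sns.relplot(data=penguins, ...).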

Related

Can't run Python visual element in Power BI

I'm trying to create a histogram plot in Power BI.
I have installed Anaconda and MS Visual Studio Code.
Screenshots with my settings:
I'm trying to make a histogram from a simple table with 1 column.
# The following code to create a dataframe and remove duplicated rows is always executed and acts as a preamble for your script:
# dataset = pandas.DataFrame(reg_min_ses_dt)
# dataset = dataset.drop_duplicates()
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
sns.histplot(data=dataset['reg_min_ses_dt'])
plt.show()
But I get this error:
I think I just didn't set up some Python extension or something else.
I just want to make a Python visual like this.
You need to activate conda before you can use its packages in PBIDesktop.exe. Simple as that.

How do I make the plot show in my df analysis

I have a dataframe of emails that has three columns: From, Message and Received (which is a date format).
I've written the below script to show how many messages there are per month in a bar plot.
But the plot doesn't show and I can't work out why, it's no doubt very simple. Any help understanding why is much appreciated!
Thanks!
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('XXX')
df = df[df['Message'].notna()]
df['Received'] = pd.to_datetime(df['Received'], format='%d/%m/%Y')
df['Received'].groupby(df['Received'].dt.month).count().plot
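A pyplot figure (pyplot is commonly imported as plt) is not shown until you call plt.show(). It is designed that way so you can create your plot and then modify it as needed before showing or saving. Note also that .plot in your last line has to be called, i.e. .plot(), otherwise no plot is created in the first place.
Also check out plt.savefig().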
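To make that concrete, here is a minimal sketch that uses made-up rows in place of the CSV from the question (the sample data and the function name plot_messages_per_month are assumptions). It calls .plot with parentheses and only then calls plt.show().

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_messages_per_month(df):
    """Draw a bar chart of message counts per calendar month."""
    received = pd.to_datetime(df["Received"], format="%d/%m/%Y")
    counts = received.groupby(received.dt.month).count()
    # .plot must be called; without the parentheses nothing is drawn.
    ax = counts.plot(kind="bar")
    ax.set_xlabel("Month")
    ax.set_ylabel("Messages")
    return ax


if __name__ == "__main__":
    # Made-up sample standing in for the real email data.
    sample = pd.DataFrame({
        "From": ["a@example.com", "b@example.com", "c@example.com"],
        "Message": ["hi", "hello", "hey"],
        "Received": ["05/01/2021", "17/01/2021", "02/03/2021"],
    })
    plot_messages_per_month(sample)
    plt.show()  # or plt.savefig("per_month.png") to write a file instead
```

Returning the Axes from the function leaves room to adjust labels or titles before the figure is shown or saved.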

Plotting dataframes using matplotlib in Python IDE

I am trying to plot a dataframe obtained from get_data_yahoo in pandas_datareader.data, in a Python IDE, using matplotlib.pyplot, and I am getting a KeyError for the x-coordinate in prices.plot no matter what I try. Please help!
I have tried this:
import matplotlib.pyplot as plt
from pandas import Series,DataFrame
import pandas_datareader.data as pdweb
import datetime
prices=pdweb.get_data_yahoo(['CVX','XOM','BP'],start=datetime.datetime(2020,2,24),
end=datetime.datetime(2020,3,20))['Adj Close']
prices.plot(x="Date",y=["CVX","XOM","BP"])
plt.imshow()
plt.show()
And I have tried this as well:-
prices=DataFrame(prices.to_dict())
prices.plot(x="Timestamp",y=["CVX","XOM","BP"])
plt.imshow()
plt.show()
Please Help...!!
P.S: I am also getting some kind of warning, please explain about it if you could :)
The issue is that Date isn't an actual column when you import the data; it's the index. So just use:
prices = prices.reset_index()
before plotting. This will convert the index into a column and generate a new, integer-labelled index.
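Here is a small sketch of that fix with made-up numbers standing in for the Yahoo download, so it runs without a network connection (the values are assumptions, only the shape matches the question). Separately, plt.imshow() with no image argument raises a TypeError, so that line from the question should simply be dropped.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up prices standing in for the Yahoo data; the dates live in the index.
dates = pd.date_range("2020-02-24", periods=5, freq="D", name="Date")
prices = pd.DataFrame(
    {
        "CVX": np.linspace(90, 80, 5),
        "XOM": np.linspace(50, 45, 5),
        "BP": np.linspace(30, 27, 5),
    },
    index=dates,
)

# "Date" is the index name here, not a column, hence the KeyError with x="Date".
date_is_column_before = "Date" in prices.columns

prices = prices.reset_index()  # "Date" becomes an ordinary column
ax = prices.plot(x="Date", y=["CVX", "XOM", "BP"])

if __name__ == "__main__":
    plt.show()
```

Alternatively, leaving the index alone and calling prices.plot() without x also works, because pandas uses the index for the x-axis by default.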
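Also, regarding the warnings: pandas emits a lot of them and they can be annoying. You can turn them off with the standard library module warnings, though be aware that the snippet below hides every warning, including deprecation notices you may want to see.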
import warnings
warnings.filterwarnings('ignore')
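If silencing everything feels too broad, a narrower variant is possible with the same standard library module. The sketch below ignores a single warning category, and only inside a with block; noisy_call is a hypothetical stand-in for whichever library call produces the warning.

```python
import warnings


def noisy_call():
    """Hypothetical function that emits a FutureWarning and returns a value."""
    warnings.warn("hypothetical deprecation notice", FutureWarning)
    return 42


# Silence only FutureWarning, and only inside this block.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=FutureWarning)
    value = noisy_call()
```

Outside the with block the previous warning filters are restored, so other warnings stay visible.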

Pandas pyplot throwing error "no numeric data to plot" when the dataset clearly has correct data

So I have this very basic piece of code just to learn box plots in matplotlib.pyplot. I am following a tutorial where it works perfectly well for the instructor but not for me. It's literally the same code, so I would like to know whether this feature has been changed or something. Dataset
import pandas
import matplotlib.pyplot as plt
# This is read from a CSV that is easily available on the web, especially on Kaggle
url ="D:\PycharmProjects\ML\Datasets\pima-indians-diabetes-database\diabetes.csv"
names = ['preg','plas','pres','skin','test','mass','pedi','age','class']
data = pandas.read_csv(url,names = names)
# this is where the issue arises
data.plot(kind='box',subplots = 'True',layout=(3,3),sharex=False,sharey=False)
plt.show()
Before plotting, add this line to your code (the question imports pandas without the pd alias, so the full module name is used here):
data = data.apply(pandas.to_numeric)
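One likely cause is worth checking, on the assumption that the CSV file starts with its own header row: when names= is passed without header=0, pandas reads the header line as an ordinary data row, so every column ends up as text and there is nothing numeric to plot. The sketch below shows the difference with a tiny in-memory stand-in for the file; the three column names and values are made up.

```python
import io

import pandas as pd

# Tiny in-memory stand-in for diabetes.csv; the header row is an assumption.
csv_text = "Pregnancies,Glucose,Outcome\n6,148,1\n1,85,0\n"
names = ["preg", "plas", "class"]

# names= alone: the header line is read as the first data row, so columns are text.
as_text = pd.read_csv(io.StringIO(csv_text), names=names)

# header=0 tells pandas to replace the file's header row with the given names.
as_numbers = pd.read_csv(io.StringIO(csv_text), names=names, header=0)

print(as_text.dtypes)
print(as_numbers.dtypes)
```

With the numeric frame, a call such as as_numbers.plot(kind='box', subplots=True) has data to work with. Passing subplots as the boolean True rather than the string 'True' is the documented form.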

Plot multiple data using for loop, pyplot and genfromtxt

I am pretty sure this particular problem must have been addressed somewhere, but I cannot find it, so I'm asking.
I have 66 files with data stored in a single column. I wish to plot all the data in a single plot. I'm used to doing this in bash, where acquiring and plotting data inside a loop is pretty trivial, but I can't figure it out in Python.
Thanks a lot for your help.
NM
Something like this should do it, although it will depend on how your data files are named.
import matplotlib.pyplot as plt
import numpy as np
fig,ax = plt.subplots()
# Let's say your files are called data-00.txt, data-01.txt etc.
for i in range(66):
    data=np.genfromtxt('data-{:02d}.txt'.format(i))
    ax.plot(data)
fig.savefig('my_fig.png')
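If the files are not numbered that neatly, a variant using glob may be easier, since it picks up whatever matches a pattern. This is only a sketch: the function name plot_all and the pattern in the usage comment are assumptions, and each file is assumed to hold one numeric column with more than one row.

```python
import glob

import matplotlib.pyplot as plt
import numpy as np


def plot_all(pattern, out_file):
    """Plot every single-column file matching pattern on one set of axes."""
    fig, ax = plt.subplots()
    paths = sorted(glob.glob(pattern))  # sorted for a stable plotting order
    for path in paths:
        data = np.genfromtxt(path)
        ax.plot(data, label=path)
    fig.savefig(out_file)
    return ax, paths


# Hypothetical usage, assuming the files sit in the current directory:
#   plot_all("data-*.txt", "my_fig.png")
```

The label on each line makes it possible to call ax.legend() afterwards, although with 66 files a legend may be too crowded to be useful.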
