Why does this Python pandas DataFrame code not work? - python

My code:
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
income_vs_hardship = %sql SELECT per_capita_income_, hardship_index FROM chicago_socioeconomic_data;
plot = sns.jointplot(x='per_capita_income_',y='hardship_index', data=pd.DataFrame(income_vs_hardship))
Correct answer:
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
income_vs_hardship = %sql SELECT per_capita_income_, hardship_index FROM chicago_socioeconomic_data;
plot = sns.jointplot(x='per_capita_income_',y='hardship_index', data=income_vs_hardship.DataFrame())
The only difference:
data=pd.DataFrame(income_vs_hardship) vs. data=income_vs_hardship.DataFrame()
If DataFrame belongs to pandas, why does my code not work?
The error says 'unable to interpret the per_capita_income.'

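DataFrame is a class of the pandas module, not a method that you can apply to a DataFrame instance.
Here, however, income_vs_hardship is not a DataFrame: it is the result object returned by the %sql magic, and that object provides its own DataFrame() method, which is why income_vs_hardship.DataFrame() works and keeps the column names from the query. pd.DataFrame(income_vs_hardship) also creates a DataFrame object, but it treats the result as a plain sequence of rows, so the columns may end up labelled 0 and 1 and seaborn cannot find 'per_capita_income_'.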
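
A minimal sketch of why the column names matter, assuming the query result behaves like a plain list of row tuples (the rows and values below are made up for illustration and stand in for the real %sql result):

```python
import pandas as pd

# Hypothetical stand-in for the query result: a list of row tuples.
rows = [(23939, 39.0), (23040, 46.0), (35787, 20.0)]
keys = ["per_capita_income_", "hardship_index"]

# Built from bare tuples, the frame gets integer column labels.
unnamed = pd.DataFrame(rows)

# Passing the column names explicitly gives seaborn something to look up.
named = pd.DataFrame(rows, columns=keys)

print(list(unnamed.columns))
print(list(named.columns))
```

With the named frame, x='per_capita_income_' and y='hardship_index' refer to real columns; with the unnamed one they do not.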

Related

Why does a pandas plot look different when using csv or xlsx data?

I've got two datasets with exactly the same data, but they look different when plotted the same way. One is a .xlsx file and one is a .csv file.
Here are the two scripts:
For the CSV:
import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
from sklearn.cluster import KMeans
daten = pd.read_csv(r"Path\Übungsdaten.csv", header=0, sep=";")
print("Total rows: {0}".format(len(daten)))
print(daten.columns)
plt.scatter(daten['InsuredValue'], daten['Policy'])
plt.xlim(2500000)
plt.ylim(100100)
plt.show()
And for the xlsx:
import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
from sklearn.cluster import KMeans
daten = pd.read_excel(r"Path\Übungsdaten.xlsx")
print("Total rows: {0}".format(len(daten)))
plt.scatter(daten['InsuredValue'],daten['Policy'] )
plt.xlim(2500000)
plt.ylim(100100)
plt.show()
Here are the plots:
csv with plt.xlim(2500000) plt.ylim(100100)
and the csv without restrictions:
and finally the .xlsx plot:
My first question is: why is there a black bar at the bottom of the first two plots? (I'm guessing it is every single value of "InsuredValue".) And how can I give the csv plot the same scale as the xlsx plot?
Thank you very much
I had to convert the "InsuredValue" column to int with the following code (astype returns a new DataFrame, so the result has to be assigned back):
daten = daten.astype({'InsuredValue': 'int'})
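
A small sketch of the likely cause, assuming the CSV column was read as text rather than as numbers (the values below are invented; only the column names come from the question). When a column holds strings, matplotlib treats each distinct value as a category and draws one tick label per value, which could explain the black bar along the axis.

```python
import pandas as pd

# Hypothetical data: 'InsuredValue' arrives as strings, as it might from a CSV.
daten = pd.DataFrame({
    "InsuredValue": ["2500000", "3100000", "2750000"],
    "Policy": [100101, 100102, 100103],
})

# Check whether the column is numeric before plotting.
was_numeric = pd.api.types.is_numeric_dtype(daten["InsuredValue"])

# astype returns a new object, so assign it back.
daten = daten.astype({"InsuredValue": "int"})
is_numeric = pd.api.types.is_numeric_dtype(daten["InsuredValue"])

print(was_numeric, is_numeric)
```

Printing daten.dtypes right after read_csv is a quick way to confirm whether this is what happened in a given file.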

seaborn mixing of plots

I'm having trouble creating this plot in Spyder:
import seaborn as sns
import pandas as pd
from pandas.api.types import CategoricalDtype
diamonds= sns.load_dataset("diamonds")
df=diamonds.copy()
cut_Kategoriler=["Fair","Good","Very Good","Premium","Ideal"]
df.cut=df.cut.astype(CategoricalDtype(categories = cut_Kategoriler,ordered=True))
print(df.head())
sns.catplot(x="cut",y="price",data=df)
sns.barplot(x="cut",y="price",hue="color",data=df)
I want to create two plots, but they overlap. How can I separate the plots produced by the last two lines?
You need to import matplotlib.pyplot as plt and then add plt.show() after each of the two plots.
The modified code is added below:
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt # Import Matplotlib
from pandas.api.types import CategoricalDtype
diamonds = sns.load_dataset("diamonds")
df=diamonds.copy()
cut_Kategoriler=["Fair","Good","Very Good","Premium","Ideal"]
df.cut=df.cut.astype(CategoricalDtype(categories = cut_Kategoriler,ordered=True))
print(df.head())
sns.catplot(x="cut",y="price",data=df)
plt.show() # Display the first plot
sns.barplot(x="cut",y="price",hue="color",data=df)
plt.show() # Display the second plot

Finding the maximum and minimum value of a pandas MultiIndex pivot table

I have a pivot table as shown below. I need to find the maximum and minimum values in the column
"Chip_Current[uAmp]". Could you please tell me how to approach this?
Please see my code below:
import pandas as pd
import numpy as np
import xlsxwriter
import plotly
import cufflinks as cf
#Enabling the offline mode for interactive plotting locally
from plotly.offline import download_plotlyjs,init_notebook_mode,plot,iplot
init_notebook_mode(connected=True)
cf.go_offline()
%matplotlib inline
init_notebook_mode()
df = pd.read_csv("Chip_Current_pdm_dis_Corners_2p0_A.txt",delim_whitespace=True)
F_16MHz=LP = df[(df['Frequency[MHz]'] == 1.6)]
F_16MHz_PVT=pd.pivot_table(F_16MHz, index = ['Device_ID', 'Temp(deg)' ,'Supply[V]','Frequency[MHz]'],values = 'Chip_Current[uAmp]')
F_16MHz_PVT['SPEC_MAX[uA]']=710
F_16MHz_PVT
You can use the .min() and .max() functions, as follows:
F_16MHz_PVT['Chip_Current[uAmp]'].min()
F_16MHz_PVT['Chip_Current[uAmp]'].max()
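
A self-contained sketch of the same idea on made-up data (the device IDs, temperatures and currents are invented; only the column names follow the question). It also uses idxmin() and idxmax(), which return the MultiIndex key of the row holding the extreme value, in case the location matters as well as the value.

```python
import pandas as pd

# Hypothetical measurements, one row per (device, temperature) pair.
df = pd.DataFrame({
    "Device_ID": ["A", "A", "B", "B"],
    "Temp(deg)": [25, 85, 25, 85],
    "Chip_Current[uAmp]": [650.0, 702.5, 640.0, 715.0],
})

# Pivot with a two-level index, as in the question (default aggregation is the mean).
pvt = pd.pivot_table(df, index=["Device_ID", "Temp(deg)"],
                     values="Chip_Current[uAmp]")

col = pvt["Chip_Current[uAmp]"]
lowest, highest = col.min(), col.max()

# Index keys (tuples) of the rows where the extremes occur.
where_lowest, where_highest = col.idxmin(), col.idxmax()

print(lowest, where_lowest)
print(highest, where_highest)
```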

How can I use pandas to plot the graph?

I have a length.csv file with this content:
How can I use pandas to plot a scatter (dot) graph based on the xy and yx columns?
import pandas as pd
# Raw string, so that \t and \f in the Windows path are not read as escape sequences
df = pd.read_csv(r'C:\path\to\folder\length.csv')
Now if you print df, you will get the following
df.plot(x='yx', y='xy', kind='scatter')
You can change your plot type to different types like line, bar etc.
Refer to https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.plot.html
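
One thing to watch with Windows paths like the one above: in an ordinary string literal, sequences such as \t and \f are escape characters (tab and form feed), so the path can silently change. A short sketch using only the directory part of the example path:

```python
# Ordinary literal: '\t' becomes a tab and '\f' a form feed.
plain = 'C:\\path\to\folder'

# Raw literal: backslashes are kept exactly as typed.
raw = r'C:\path\to\folder'

print(repr(plain))
print(repr(raw))
print('\t' in plain, '\t' in raw)
```

A raw string (or forward slashes) keeps the path intact before it reaches pd.read_csv.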
You can easily use matplotlib. The plot method in Pandas is a wrapper for matplotlib.
If you wish to use Pandas, you can do it as such:
import pandas as pd
df = pd.read_csv('length.csv')
df.plot(x='xy', y='yx')
If you decide to go ahead with matplotlib, you can do as follows:
import matplotlib.pyplot as plt
import pandas as pd
# Include the next line only if on a notebook (like Jupyter or Colab)
%matplotlib inline
df = pd.read_csv('length.csv')
plt.plot(df['xy'], df['yx'])
plt.xlabel('xy')
plt.ylabel('yx')
plt.title('xy vs yx Plot')
plt.show()

plot graph from python dataframe

I want to convert that dataframe
into this dataframe and plot a matplotlib graph with the date along the x axis
changed dataframe
Use df.T.plot(kind='bar'):
import pandas as pd
import matplotlib.pyplot as plt
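# DataFrame.from_csv was removed in pandas 1.0; read_csv with these arguments matches its defaults
df = pd.read_csv('./housing_price_index_2010-11_100.csv', index_col=0, parse_dates=True)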
df.T.plot(kind='bar')
plt.show()
You can also assign the transpose to a new variable and plot that (what you asked in the comment):
import pandas as pd
import matplotlib.pyplot as plt
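# DataFrame.from_csv was removed in pandas 1.0; read_csv with these arguments matches its defaults
df = pd.read_csv('./housing_price_index_2010-11_100.csv', index_col=0, parse_dates=True)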
df_transposed = df.T
df_transposed.plot(kind='bar')
plt.show()
Both give the same result:
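
To see what the transpose does before plotting, here is a small sketch on invented data (the city names and index values are hypothetical, not taken from the housing file). The dates start out as columns; after .T they become the index, which is what pandas puts on the x axis.

```python
import pandas as pd

# Hypothetical frame: one row per city, one column per date.
df = pd.DataFrame(
    {"2010-01": [100.0, 100.0], "2010-02": [101.2, 99.5]},
    index=["City A", "City B"],
)

# Transpose: dates become the index, cities become the columns.
df_transposed = df.T

print(df_transposed)
```

Calling df_transposed.plot(kind='bar') on this frame would then draw one group of bars per date, with one bar per city.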
