Suppose I have a DataFrame whose index is composed of two columns, and I want to plot it:
import pandas
from matplotlib import pyplot as plot
df=pandas.DataFrame(data={'floor':[1,1,1,2,2,2,3,3],'room':[1,2,3,1,1,2,1,3],'count':[1, 1, 3,2,2,4,1,5]})
df2=df.groupby(['floor','room']).sum()
df2.plot()
plot.show()
The above example results in a plot where row numbers are used for the x axis and there are no tick labels. Is there any facility to use the index instead?
Say, I'd like the x axis divided into even sections for the first index level, with the points for the second index level spread out inside those sections.
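One possible approach, sketched here rather than offered as the definitive answer: either label each row's tick with its (floor, room) pair, or unstack the second index level so that each floor becomes one section of the x axis with one bar per room. The tick-label format and the choice of a bar chart are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "floor": [1, 1, 1, 2, 2, 2, 3, 3],
    "room": [1, 2, 3, 1, 1, 2, 1, 3],
    "count": [1, 1, 3, 2, 2, 4, 1, 5],
})
df2 = df.groupby(["floor", "room"]).sum()

# Option 1: one tick per row, labelled with its (floor, room) pair.
fig, ax = plt.subplots()
ax.plot(range(len(df2)), df2["count"].to_numpy(), marker="o")
ax.set_xticks(range(len(df2)))
ax.set_xticklabels(
    [f"floor {f}, room {r}" for f, r in df2.index], rotation=45, ha="right"
)

# Option 2: floors on the x axis, one bar per room inside each floor.
wide = df2["count"].unstack("room")  # rows: floor, columns: room
ax2 = wide.plot(kind="bar")
ax2.set_ylabel("count")
```

Missing (floor, room) combinations become NaN after unstack and are simply not drawn.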
Related
I have a very large DataFrame that I want to plot. I created a smaller one for demonstration here:
How can I plot this DataFrame? I would like the columns on the x axis, with the values in each column summed and shown as bars.
I was trying to draw a seaborn distribution plot for each column in a list.
for i in ['age', 'trestbps', 'chol', 'thalach', 'oldpeak', 'ca']:
    sns.distplot(Data_heart_copy[i])
The output of the above code:
However, I want to display the distplots for all of the above columns in a single window, using a compact command.
The output I am looking for looks like this:
You need to use subplot to put plots side by side:
import matplotlib.pyplot as plt
import seaborn as sns

for i, col in enumerate(['age', 'trestbps', 'chol', 'thalach', 'oldpeak', 'ca']):
    plt.subplot(2, 3, i + 1)  # 2 x 3 grid; subplot indices start at 1
    sns.distplot(Data_heart_copy[col])
plt.show()
plt.subplot(nrow, ncol, item) takes three arguments: the number of rows in the grid, the number of columns, and the index of the subplot (running from 1 to nrow x ncol).
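The same grid can also be built with plt.subplots, which returns all the axes at once. The sketch below is only an illustration: it uses random numbers as a hypothetical stand-in for Data_heart_copy, and it draws plain matplotlib histograms instead of sns.distplot so that it runs without seaborn.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

cols = ["age", "trestbps", "chol", "thalach", "oldpeak", "ca"]

# Hypothetical stand-in for Data_heart_copy: random values, not real data.
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.normal(size=(100, len(cols))), columns=cols)

# 2 rows x 3 columns of axes; axes.flat walks the grid row by row.
fig, axes = plt.subplots(2, 3, figsize=(12, 6))
for ax, col in zip(axes.flat, cols):
    ax.hist(data[col], bins=20)
    ax.set_title(col)
fig.tight_layout()
```

With seaborn available, the ax.hist call could presumably be replaced by a seaborn call that accepts an ax argument.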
I have the following pandas DataFrame:
I need to make line plots using the column names (400, 400.5, 401....) as the x axis and the DataFrame values as the y axis, with the index column ('fluorophore') as the label for each line. I also want to be able to choose which fluorophores to plot.
How can I accomplish that?
I do not know your dataset, but if the NaN values always fill entire columns, you could do
df[non_nan_cols].T[['FAM', 'TET']].plot.line()
where non_nan_cols is a list of the columns that contain no NaN values.
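As a rough sketch of how non_nan_cols might be built, assuming a small made-up frame with fluorophores as rows and wavelengths as columns (the values and the 'HEX' row are hypothetical):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import numpy as np
import pandas as pd

# Hypothetical data: one row per fluorophore, one column per wavelength.
df = pd.DataFrame(
    [[0.1, 0.2, np.nan, 0.4],
     [0.3, 0.5, np.nan, 0.2],
     [0.0, 0.1, np.nan, 0.9]],
    index=["FAM", "TET", "HEX"],
    columns=[400.0, 400.5, 401.0, 401.5],
)

# Keep only the columns in which every value is present.
non_nan_cols = [c for c in df.columns if df[c].notna().all()]

# Transpose so wavelengths form the index (x axis), then pick fluorophores.
subset = df[non_nan_cols].T[["FAM", "TET"]]
ax = subset.plot.line()
```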
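Alternatively, you could loop over the rows and mask out the NaN values:
import numpy as np
import matplotlib.pyplot as plt

choice_of_fp = df.index.tolist()  # fluorophores to plot (here: all of them)
x_val = np.asarray(df.columns.tolist())
for i in choice_of_fp:
    row = df.loc[i].values
    mask = np.isfinite(row)  # keep only the non-NaN points
    plt.plot(x_val[mask], row[mask], label=i)
plt.legend()
plt.show()
This handles rows that contain NaN values. Here choice_of_fp is a list containing the fluorophores you want to plot.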
You can do the following; it plots all columns except the one used as the index.
abs_data.set_index('fluorophore').plot()
If you want to filter by fluorophore, you can do this:
abs_data[abs_data.fluorophore.isin(['A', 'B'])].set_index('fluorophore').plot()
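Note that DataFrame.plot draws one line per column and puts the index on the x axis, so to get wavelengths on the x axis and one line per fluorophore the frame probably needs transposing first. A minimal sketch with hypothetical values for abs_data:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import pandas as pd

# Hypothetical stand-in for abs_data: a 'fluorophore' column plus wavelengths.
abs_data = pd.DataFrame({
    "fluorophore": ["A", "B", "C"],
    400.0: [0.10, 0.30, 0.05],
    400.5: [0.20, 0.25, 0.10],
    401.0: [0.40, 0.20, 0.15],
})

selected = abs_data[abs_data["fluorophore"].isin(["A", "B"])]
wide = selected.set_index("fluorophore")
wide.columns = wide.columns.astype(float)  # make the wavelengths numeric

# After transposing, each fluorophore is a column, hence one line each.
ax = wide.T.plot()
ax.set_xlabel("wavelength")
```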
I have a very simple DataFrame, but I could not plot a line using a row and a column. Here is an image; I would like to plot a "line" that connects the points.
I tried to plot it, but the x-axis disappeared. I would also like to swap the axes. I could not find an easy way to plot this simple thing.
Try:
import matplotlib.pyplot as plt
# Categories will be the x axis, Seconds will be the y axis
plt.plot(data["Categories"], data["Seconds"])
plt.show()
Matplotlib sizes the axes dynamically, so if the x-axis labels do not appear, you may have to increase the size of your plot.
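As a sketch of what increasing the size might look like, the figure dimensions can be set with figsize, and the axes can be swapped by passing the columns in the opposite order. The values in data below are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the asker's data.
data = pd.DataFrame({
    "Categories": ["a", "b", "c", "d"],
    "Seconds": [12.0, 7.5, 9.1, 15.3],
})

# A wider figure leaves more room for the x tick labels.
fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(data["Categories"], data["Seconds"], marker="o")
ax.set_xlabel("Categories")
ax.set_ylabel("Seconds")
ax.tick_params(axis="x", rotation=45)

# Swapped axes: Seconds on x, Categories on y.
fig_swapped, ax_swapped = plt.subplots(figsize=(4, 6))
ax_swapped.plot(data["Seconds"], data["Categories"], marker="o")
ax_swapped.set_xlabel("Seconds")
ax_swapped.set_ylabel("Categories")
```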
I am trying to plot each variable against SalePrice. I tried pd.scatter_matrix, but I am getting a number of unnecessary plots for the various combinations. What I am looking for is SalePrice on the y axis and one scatter plot for each variable in the data set. Here is the code I tried.
data_prep_num['Sales_test_data']=data_sales_price_old
att=['Sales_test_data','YearBuilt','LotArea','MSSubClass','BsmtFinSF1','TotalBsmtSF','1stFlrSF','2ndFlrSF','GrLivArea','GarageArea']
pd.scatter_matrix(data_prep_num[att],alpha=.4,figsize=(30,30))
If you want to use pd.plotting.scatter_matrix but only want one row of the grid (i.e. the Sales_test_data row), you can iterate over the plotting axes and hide the combinations you don't want.
Assuming the SalePrice is the very first column (index 0):
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

axes = pd.plotting.scatter_matrix(data_prep_num[att], alpha=0.4, figsize=(30, 30))
# Hide every row of the grid except the first (the SalePrice row)
for i in range(np.shape(axes)[0]):
    if i != 0:
        for j in range(np.shape(axes)[1]):
            axes[i, j].set_visible(False)
plt.show()
Note: this is not very efficient once you have many columns, since every combination is still drawn before being hidden.
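An alternative that avoids drawing the unwanted combinations is to loop over the variables and make one scatter plot each. This is only a sketch: it swaps scatter_matrix for plain matplotlib subplots, uses a shortened att list, and fills data_prep_num with random numbers as a hypothetical stand-in.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for data_prep_num, with a shortened column list.
att = ["Sales_test_data", "YearBuilt", "LotArea", "GrLivArea"]
rng = np.random.default_rng(0)
data_prep_num = pd.DataFrame(rng.random((50, len(att))), columns=att)

target = att[0]      # plotted on the y axis of every panel
features = att[1:]   # one panel per remaining column

fig, axes = plt.subplots(
    1, len(features), figsize=(4 * len(features), 4), sharey=True
)
for ax, col in zip(axes, features):
    ax.scatter(data_prep_num[col], data_prep_num[target], alpha=0.4)
    ax.set_xlabel(col)
axes[0].set_ylabel(target)
fig.tight_layout()
```

Only len(features) panels are created, rather than the full square grid.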