Use dataframe column names as labels in pylab.plot

I would like to plot the data in a dataframe and have the column headers be the labels. I tried this:
dfm.columns = ['a','b']
plot(dfm.cumsum(), label= dfm.columns.values)
legend(loc='upper left')
But got this:
Instead of both lines being labeled ['a','b'], I'd like the blue line to be a and the green to be b using pylab

I think the problem is in how your data is set up, in a part of the code you're not showing.
Here's an example; I used df.plot() in this case, which labels each line with its column name.
import pandas as pd
import random
import matplotlib.pyplot as plt
x = [random.randint(10,20) for r in range(100)]
y = [random.randint(0,10) for r in range(100)]
df = pd.DataFrame([x,y]).T #T for transpose
df.columns=['a','b']
df.plot(kind='line')
plt.legend(loc='upper left')
plt.show()
Edit
pylab version
import pandas as pd
import random
import matplotlib.pylab as plt
x = [random.randint(10,20) for r in range(100)]
y = [random.randint(0,10) for r in range(100)]
df = pd.DataFrame([x,y]).T
df.columns = ['a','b']
plt.plot(df)
plt.legend(list(df.columns), loc='upper left')  # labels taken from the column names
plt.show()
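A minimal sketch of one way to take the labels from the column names instead of typing them out. When plot receives 2-D y data it returns one line per column, so the returned handles can be paired with dfm.columns. The random data standing in for dfm and the Agg backend line are assumptions made so the example runs on its own.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's dfm
rng = np.random.default_rng(0)
dfm = pd.DataFrame(rng.integers(0, 10, size=(100, 2)), columns=["a", "b"])
cumulative = dfm.cumsum()

fig, ax = plt.subplots()
# 2-D y input: one Line2D per column, returned in column order
lines = ax.plot(cumulative.index, cumulative.to_numpy())
# Pair each line handle with its column name
ax.legend(lines, list(dfm.columns), loc="upper left")

legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(legend_labels)
```

Passing a single list to label, as in the question, attaches the whole list to every line, which is why both lines ended up labeled ['a','b'].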

Related

How to plot Multiline Graphs Via Seaborn library in Python?

I have written a code that looks like this:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
T = np.array([10.03,100.348,1023.385])
power1 = np.array([100000,86000,73000])
power2 = np.array([1008000,95000,1009000])
df1 = pd.DataFrame(data = {'Size': T, 'Encrypt_Time': power1, 'Decrypt_Time': power2})
exp1= sns.lineplot(data=df1)
plt.savefig('exp1.png')
exp1_smooth= sns.lmplot(x='Size', y='Time', data=df, ci=None, order=4, truncate=False)
plt.savefig('exp1_smooth.png')
That gives me Graph_1:
Size, which should be the x-axis, is drawn as a constant line, but as you can see in my code it varies (10, 100, 1000).
How does this produce a constant line? I want to produce a multiline graph with x-axis = Size (T) and y-axis = Encrypt_Time and Decrypt_Time (power1 & power2).
I also wanted to plot a smoothed version of the same graph, but that gives me an error. What needs to be done to get a smooth multi-line graph with x-axis = Size (T) and y-axis = Encrypt_Time and Decrypt_Time (power1 & power2)?

I don't think that is the problem: the line for Size looks constant, but it is not.
The Size values range from about 10 to 1000, while the smallest division of the y-axis is 20,000 (20 times bigger), so the line looks horizontal on your graph.
You can try bigger values to see the slope clearly.
If you want 'Size' as the x-axis, you can try the example below:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
T = np.array([10.03,100.348,1023.385])
power1 = np.array([100000,86000,73000])
power2 = np.array([1008000,95000,1009000])
df1 = pd.DataFrame(data = {'Size': T, 'Encrypt_Time': power1, 'Decrypt_Time': power2})
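fig, ax = plt.subplots()
# draw both series on the same Axes, each against Size
sns.lineplot(data=df1, x='Size', y='Encrypt_Time', label='Encrypt_Time', ax=ax)
sns.lineplot(data=df1, x='Size', y='Decrypt_Time', label='Decrypt_Time', ax=ax)
plt.show()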

Consistent color argument between matplotlib scatter to matplotlib plot?

I'm hoping to use matplotlib to plot inter-annual variation of monthly data (below). By passing c=ds['time.year'] in plt.scatter(), I achieve the desired outcome. However, I would like to be able to connect the points with an analogous plt.plot() call. Is this possible?
import pandas as pd
import matplotlib.pyplot as plt
import xarray as xr
# create y data
y = []
for yr in range(10):
    for mo in range(12):
        y.append(yr+mo+(yr*mo)**2)
# create datetime vector
t = pd.date_range(start='1/1/2010', periods=120, freq='M')
# combine in DataArray
ds = xr.DataArray(y, coords={'time':t}, dims=['time'])
# scatter plot with color
im = plt.scatter(ds['time.month'], ds.values, c=ds['time.year'])
plt.colorbar(im)
Output:
I have tried the following, but it does not work:
plt.plot(ds['time.month'], ds.values, c=ds['time.year'])

You can create a norm that maps the range of years to the range of colors. The norm, together with the colormap used, can serve as input for a ScalarMappable to create an accompanying colorbar. With the default 'viridis' colormap the code could look like:
import matplotlib.pyplot as plt
from matplotlib.cm import ScalarMappable
import pandas as pd
import xarray as xr
y = []
for yr in range(10):
    for mo in range(12):
        y.append(yr + mo + (yr * mo) ** 2)
t = pd.date_range(start='1/1/2010', periods=120, freq='M')
ds = xr.DataArray(y, coords={'time': t}, dims=['time'])
norm = plt.Normalize(ds['time.year'].min(), ds['time.year'].max())
cmap = plt.get_cmap('viridis')  # plt.cm.get_cmap was removed in Matplotlib 3.9
for year in range(int(ds['time.year'].min()), int(ds['time.year'].max()) + 1):
    plt.plot(ds['time.month'][ds['time.year'] == year],
             ds.values[ds['time.year'] == year],
             ls='-', marker='o', color=cmap(norm(year)))
# recent Matplotlib versions need to be told which Axes the colorbar belongs to
plt.colorbar(ScalarMappable(cmap=cmap, norm=norm), ax=plt.gca())
plt.xticks(range(1, 13))
plt.show()
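The same idea can be sketched without xarray, which makes the mapping from year to color easier to see. This is an illustration under assumptions: the years and months arrays are built by hand to mimic the question's data, and the Agg backend is only there so the block runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize

# Ten years of monthly values, mimicking the question's formula
years = np.repeat(np.arange(2010, 2020), 12)
months = np.tile(np.arange(1, 13), 10)
yr, mo = years - 2010, months - 1
values = yr + mo + (yr * mo) ** 2

norm = Normalize(vmin=years.min(), vmax=years.max())  # year -> [0, 1]
cmap = plt.get_cmap("viridis")                        # [0, 1] -> RGBA

fig, ax = plt.subplots()
for year in np.unique(years):
    mask = years == year
    # plot accepts one color per call, so each year gets its own line
    ax.plot(months[mask], values[mask], marker="o", color=cmap(norm(year)))
fig.colorbar(ScalarMappable(norm=norm, cmap=cmap), ax=ax, label="year")
ax.set_xticks(range(1, 13))
```

The original plt.plot call fails because plot takes a single color for the whole line, whereas scatter accepts an array of values and maps them through a colormap itself.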

How to fill color by groups in histogram using Matplotlib?

I know how to do this in R and have provided the code for it below. I would like to know how to do something similar in Python, using Matplotlib or any other library.
library(ggplot2)
ggplot(dia[1:768,], aes(x = Glucose, fill = Outcome)) +
geom_bar() +
ggtitle("Glucose") +
xlab("Glucose") +
ylab("Total Count") +
labs(fill = "Outcome")

Using pandas you can pivot the dataframe and directly plot it.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# dataframe with two columns in "long form"
g = np.array([np.random.normal(5, 10, 500),
np.random.rayleigh(10, size=500)]).flatten()
df = pd.DataFrame({'Glucose': g, 'Outcome': np.repeat([0,1],500)})
# pivot and plot
df.pivot(columns="Outcome", values="Glucose").plot.hist(bins=100)
plt.show()
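For a result closer to the stacked bars that ggplot draws with fill, here is a hedged sketch using only Matplotlib: pass one array per group to hist and set stacked=True. The random data, the bin count, and the Agg backend are assumptions for a self-contained run.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data in the same long form as the question
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Glucose": rng.normal(120, 30, size=300),
    "Outcome": rng.integers(0, 2, size=300),
})

# One array of Glucose values per Outcome group
grouped = df.groupby("Outcome")["Glucose"]
labels = [str(key) for key, _ in grouped]
samples = [values.to_numpy() for _, values in grouped]

fig, ax = plt.subplots()
# stacked=True piles the groups on top of each other within each bin
counts, edges, _ = ax.hist(samples, bins=20, stacked=True, label=labels)
ax.set_title("Glucose")
ax.set_xlabel("Glucose")
ax.set_ylabel("Total Count")
ax.legend(title="Outcome")
```

With stacked=True the returned counts are cumulative across groups, so the last row holds the total height of each bar.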

Please consider the following example, which uses seaborn 0.11.1.
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
# generate random data
data = {'Glucose': np.random.normal(5, 10, 100),
'Outcome': np.random.randint(2, size=100)}
df = pd.DataFrame(data)
# plot
fig, ax = plt.subplots(figsize=(10, 10))
sns.histplot(data=df, x='Glucose', hue='Outcome', stat='count', edgecolor=None, ax=ax)
ax.set_title('Glucose')
plt.show()

How to annotate regression lines in seaborn lmplot?

I have plotted two variables against each other in Seaborn and used the hue keyword to separate the variables into two categories.
I want to annotate each regression line with its coefficient of determination. An existing question only describes how to show the labels for a line using the legend.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.read_excel(open('intubation data.xlsx', 'rb'), sheet_name='Data (pretest)',
                   header=1, na_values='x')
vars_of_interest = ['PGY','Time (sec)','Aspirate (cc)']
df['Resident'] = df['PGY'] < 4
lm = sns.lmplot(x=vars_of_interest[1], y=vars_of_interest[2],
data=df, hue='Resident', robust=True, truncate=True,
line_kws={'label':"bob"})

Using your code as it is:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.read_excel(open('intubation data.xlsx', 'rb'), sheet_name='Data (pretest)',
                   header=1, na_values='x')
vars_of_interest = ['PGY','Time (sec)','Aspirate (cc)']
df['Resident'] = df['PGY'] < 4
p = sns.lmplot(x=vars_of_interest[1], y=vars_of_interest[2],
data=df, hue='Resident', robust=True, truncate=True,
line_kws={'label':"bob"}, legend=True)
# assuming you have 2 groups
ax = p.axes[0, 0]
ax.legend()
leg = ax.get_legend()
L_labels = leg.get_texts()
# assuming you computed r_squared which is the coefficient of determination somewhere else
label_line_1 = r'$R^2:{0:.2f}$'.format(0.3)
label_line_2 = r'$R^2:{0:.2f}$'.format(0.21)
L_labels[0].set_text(label_line_1)
L_labels[1].set_text(label_line_2)
Voila:
Graph created with my own random data since OP hasn't provided any.
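The answer assumes R² has been computed elsewhere. One possible way to get a value per group is sketched below with an ordinary least-squares fit from NumPy. Note the assumptions: the data is random, the helper r_squared is made up for this example, and because the lmplot call uses robust=True, an OLS R² is only an approximation of the fit that seaborn actually draws.

```python
import numpy as np
import pandas as pd

# Hypothetical data using the question's column names
rng = np.random.default_rng(1)
n = 60
df = pd.DataFrame({
    "Time (sec)": rng.uniform(10, 120, n),
    "Resident": rng.integers(0, 2, n).astype(bool),
})
df["Aspirate (cc)"] = 0.05 * df["Time (sec)"] + rng.normal(0, 1, n)


def r_squared(x, y):
    """Coefficient of determination of a straight-line least-squares fit."""
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1 - ss_res / ss_tot


# One R^2 value and one formatted legend label per hue group
r2 = {
    key: r_squared(group["Time (sec)"].to_numpy(), group["Aspirate (cc)"].to_numpy())
    for key, group in df.groupby("Resident")
}
labels = {key: r"$R^2:{0:.2f}$".format(value) for key, value in r2.items()}
print(labels)
```

The strings in labels can then be passed to set_text on the legend entries in place of the hard-coded 0.3 and 0.21.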

Problem when using datetime data to draw graphic

I want to draw a graph using data in datetime format as the x-axis, but it takes extremely long: after more than 30 minutes there is still no graph. If I use the data from another column instead, the graph appears almost immediately. All of the data is held in lists.
I'm confused: since the columns are all in the same format, why can't I draw the graph using the datetime values as the x-axis?
Here is my code. I appreciate your time and help!
from matplotlib import pyplot as plt
import csv
names = []
x = []
y = []
names=[]
with open('all.csv','r') as csvfile: #this csv file contains over 16000 rows
    plots= csv.reader(csvfile,delimiter=',')
    for row in plots:
        x.append(row[1]) #row1 is the datetime format data
        y.append(row[2])
print(x,y)
plt.plot(x,y)
plt.show()
Lines of my csv file look something like:
2016/05/02 10:47:45,14.1,20.1,N.C.,170.7,518.3,-1259,-12.61,375.8,44.92,13.76,92.74,132.6,38.86,165.3,170.9,311.5,252.3,501.2,447.2,378.4,35.48,7.868,181.2,
I want the first column as the x-axis and the following columns as the y-axis...
Also, the y-axis doesn't change, no matter how I change the y-axis limits.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df = pd.read_csv('all.csv')
x = df.iloc[:,1]
y = df.iloc[:,3]
x = pd.to_datetime(x)
plt.figure(num=3, figsize=(15, 5))
plt.plot(x,y)
my_y_ticks = np.arange(0, 40, 10)
plt.xticks(rotation = 90)
plt.show()

I haven't understood exactly what you mean by all the data being in 'list' format, but I think you could use something like this:
import matplotlib.pyplot as plt
import pandas as pd
df = pd.read_csv('all.csv')
x = df.iloc[:,0]
y = df.iloc[:,1]
x = pd.to_datetime(x)
plt.plot(x,y)
plt.show()
Showing some rows of your file would be useful.
EDIT:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd
df = pd.read_csv('all.csv')
x = df.iloc[:,0]
y = df.iloc[:,1]
x = pd.to_datetime(x, format="%Y/%m/%d %H:%M:%S") #if the format is different, change here
fig, ax = plt.subplots()
ax.plot(x, y)
xfmt = mdates.DateFormatter("%Y/%m/%d %H:%M:%S")
ax.xaxis.set_major_formatter(xfmt)
plt.xticks(rotation=70)
plt.show()
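A likely reason the csv.reader version is so slow: it yields strings, and Matplotlib treats string values as categories, placing a tick for every distinct value. With more than 16000 distinct timestamps that means more than 16000 tick labels. The sketch below parses both axes into real types first. It is a self-contained illustration: the first row is a truncated copy of the sample line in the question, the other rows are made up, and header=None assumes the file has no header row.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# First row truncated from the question's sample; the remaining rows are hypothetical
csv_text = (
    "2016/05/02 10:47:45,14.1,20.1,N.C.\n"
    "2016/05/02 10:47:46,14.3,20.0,N.C.\n"
    "2016/05/02 10:47:47,14.2,19.9,N.C.\n"
)
df = pd.read_csv(io.StringIO(csv_text), header=None)

# Parse timestamps explicitly; an explicit format avoids per-row format guessing
x = pd.to_datetime(df.iloc[:, 0], format="%Y/%m/%d %H:%M:%S")
# Force numeric values; entries such as 'N.C.' become NaN instead of staying strings
y = pd.to_numeric(df.iloc[:, 1], errors="coerce")
not_connected = pd.to_numeric(df.iloc[:, 3], errors="coerce")

fig, ax = plt.subplots(figsize=(15, 5))
ax.plot(x, y)
fig.autofmt_xdate()  # rotate the date labels so they do not overlap
```

Numeric y values also explain the second symptom: once the axis is no longer categorical, setting y-axis limits and ticks behaves as expected.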
