I'm having trouble setting the xlim when I'm working with a timedelta.
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter
import datetime
fig1 = plt.figure(figsize=(20,10))
ax1 = fig1.add_subplot(111)
df = pd.DataFrame({'deltaTime': [0, 10, 20, 30], 'length': [0.002, 0.005, 0.004, 0.003]})
df['deltaTime'] = pd.to_timedelta(df['deltaTime'], unit='m')
ax1.xaxis.set_major_formatter(DateFormatter('%M'))
ax1.set_xlim([datetime.time(0,0,0), datetime.time(1,0,0)])
ax1.plot_date(df['deltaTime'], df['length'], marker='o', markersize=5, linestyle='-')
plt.show()
This line does not seem to work:
ax1.set_xlim([datetime.time(0,0,0), datetime.time(1,0,0)])
Is there something similar that I could use in order to get my limits set when I'm using pandas timedelta?
matplotlib's plot_date expects datetime objects for x, not timedelta (duration) objects. You can convert the timedelta values to datetime objects by adding them to a reference date, as shown below. Hope this is what you were looking for.
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter
import datetime
fig1 = plt.figure(figsize=(4,4))
ax1 = fig1.add_subplot(111)
df = pd.DataFrame({'deltaTime': [0, 10, 20, 30], 'length': [0.002, 0.005, 0.004, 0.003]})
df['deltaTime'] = pd.to_timedelta(df['deltaTime'], unit='m')
df['start_date'] = pd.Timestamp('20171204')+ df['deltaTime']
print(df['start_date'])
ax1.plot_date(df['start_date'], df['length'], marker='o', markersize=5, linestyle='-')
ax1.xaxis.set_major_formatter(DateFormatter('%M'))
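# pass real timestamps as limits so they are not mistaken for plain strings
ax1.set_xlim([pd.Timestamp('2017-12-04 00:10:00'), pd.Timestamp('2017-12-04 00:30:00')])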
plt.show()
The print call outputs:
0 2017-12-04 00:00:00
1 2017-12-04 00:10:00
2 2017-12-04 00:20:00
3 2017-12-04 00:30:00
Name: start_date, dtype: datetime64[ns]
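If the reference date is not needed, another option is to avoid datetimes altogether: convert the timedeltas to plain numbers and set the limits numerically. The following is only a minimal sketch under that assumption. The `minutes` column name is made up for illustration, and the Agg backend line is there only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'deltaTime': [0, 10, 20, 30],
                   'length': [0.002, 0.005, 0.004, 0.003]})
# 'min' means minutes, like the 'm' unit used in the question
df['deltaTime'] = pd.to_timedelta(df['deltaTime'], unit='min')

# Convert the durations to plain floats (minutes) so the x axis is numeric.
df['minutes'] = df['deltaTime'].dt.total_seconds() / 60

fig, ax = plt.subplots(figsize=(4, 4))
ax.plot(df['minutes'], df['length'], marker='o', markersize=5, linestyle='-')
ax.set_xlim(0, 60)        # limits are ordinary numbers: 0 to 60 minutes
ax.set_xlabel('minutes')
xlim = ax.get_xlim()
```

With a numeric axis no DateFormatter is needed, at the cost of losing date-aware tick placement.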
How can I change the frequency of my x ticks to every hour using matplotlib.pyplot? I looked at similar posts, but could not figure out how to apply their solutions to my data since I only have times, not full dates. Here's an example of my data:
Time SRH_1000m
14:03:00 318
14:08:00 321
14:13:00 261
14:17:00 312
14:22:00 285
See: https://matplotlib.org/stable/gallery/text_labels_and_annotations/date.html
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
df = pd.DataFrame({'time': ['14:03:00', '14:07:00', '14:08:00', '14:15:00'], 'value': [0,1,2,3]})
df['time'] = pd.to_datetime(df['time'], format='%H:%M:%S')
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.plot(df['time'], df['value'])
ax.xaxis.set_major_locator(mdates.MinuteLocator(interval=5))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M:%S'))
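Since the question asks for a tick every hour, the same pattern should work with `mdates.HourLocator` in place of `MinuteLocator`. This is a sketch, not a tested answer for the asker's full data set: the sample times below are invented so that they span several hours, and the Agg backend line only makes the snippet runnable without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Invented sample spanning several hours (values borrowed from the question).
df = pd.DataFrame({'time': ['14:03:00', '15:08:00', '16:13:00', '17:17:00'],
                   'value': [318, 321, 261, 312]})
# Times without a date are parsed onto the default date 1900-01-01.
df['time'] = pd.to_datetime(df['time'], format='%H:%M:%S')

fig, ax = plt.subplots()
ax.plot(df['time'], df['value'])
ax.xaxis.set_major_locator(mdates.HourLocator(interval=1))   # one tick per hour
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))  # hide the dummy date
```

Because the formatter prints only hours and minutes, the placeholder date never shows up on the axis.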
How do you reformat from datetime to Week 1, Week 2... to plot onto a seaborn line chart?
Input
Date Ratio
0 2019-10-04 0.350365
1 2019-10-04 0.416058
2 2019-10-11 0.489051
3 2019-10-18 0.540146
4 2019-10-25 0.598540
5 2019-11-08 0.547445
6 2019-11-01 0.722628
7 2019-11-15 0.788321
8 2019-11-22 0.875912
9 2019-11-27 0.948905
Desired output
I was able to cheese it by matching the natural index of the dataframe to the week. I wonder if there's another way to do this.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
data = {'Date': ['2019-10-04',
'2019-10-04',
'2019-10-11',
'2019-10-18',
'2019-10-25',
'2019-11-08',
'2019-11-01',
'2019-11-15',
'2019-11-22',
'2019-11-27'],
'Ratio': [0.350365,
0.416058,
0.489051,
0.540146,
0.598540,
0.547445,
0.722628,
0.788321,
0.875912,
0.948905]}
df = pd.DataFrame(data)
df['Date'] = pd.to_datetime(df['Date'])
graph = sns.lineplot(data=df,x='Date',y='Ratio')
plt.show()
# First plot looks bad.
week_mapping = dict(zip(df['Date'].unique(),range(len(df['Date'].unique()))))
df['Week'] = df['Date'].map(week_mapping)
graph = sns.lineplot(data=df,x='Week',y='Ratio')
plt.show()
# This plot looks better, but method seems cheesy.
It looks like your data is already spaced weekly, so you can just do:
df.groupby('Date',as_index=False)['Ratio'].mean().plot()
Output:
You can make a new column with the week number and use that as your x value. This gives you the week of the year. If you want your week numbers to start at 0, just subtract the week number of the first date from each value (see the commented-out line in the code).
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from datetime import datetime as dt
data = {'Date': ['2019-10-04',
'2019-10-04',
'2019-10-11',
'2019-10-18',
'2019-10-25',
'2019-11-08',
'2019-11-01',
'2019-11-15',
'2019-11-22',
'2019-11-27'],
'Ratio': [0.350365,
0.416058,
0.489051,
0.540146,
0.598540,
0.547445,
0.722628,
0.788321,
0.875912,
0.948905]}
df = pd.DataFrame(data)
df['Date'] = pd.to_datetime(df['Date'])
# To get the week number of the year
# (Series.dt.week was removed in pandas 2.0; isocalendar().week replaces it)
df['Week'] = df['Date'].dt.isocalendar().week.astype(int)
# Or you can use the line below for the exact output you had
#df['Week'] = df['Date'].dt.isocalendar().week.astype(int) - df['Date'].min().week
graph = sns.lineplot(data=df,x='Week',y='Ratio')
plt.show()
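Week-of-year numbers restart at 1 when the data crosses into a new year. If that matters, one alternative is to count weeks from the first date in the data instead. The sketch below assumes that is the intent behind "Week 1, Week 2..."; the `Week` and `WeekLabel` column names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2019-10-04', '2019-10-04', '2019-10-11', '2019-10-18',
             '2019-10-25', '2019-11-08', '2019-11-01', '2019-11-15',
             '2019-11-22', '2019-11-27'],
    'Ratio': [0.350365, 0.416058, 0.489051, 0.540146, 0.598540,
              0.547445, 0.722628, 0.788321, 0.875912, 0.948905],
})
df['Date'] = pd.to_datetime(df['Date'])

# Whole days elapsed since the earliest date, grouped into 7-day blocks.
days_since_start = (df['Date'] - df['Date'].min()).dt.days
df['Week'] = days_since_start // 7 + 1              # first week is Week 1
df['WeekLabel'] = 'Week ' + df['Week'].astype(str)  # e.g. "Week 3"

week_numbers = df['Week'].tolist()
```

The numeric `Week` column can be passed as `x` to `sns.lineplot` just like in the answers above. Note that dates fewer than seven days apart can share a week number (2019-11-22 and 2019-11-27 both fall in week 8 here).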
I have one dataframe df as below:
df = pd.DataFrame({'date': [20121231,20130102, 20130105, 20130106, 20130107, 20130108],'price': [25, 163, 235, 36, 40, 82]})
How can I convert df['date'] to a date type and plot 'price' on the y-axis against 'date' on the x-axis?
Thanks a lot.
Use to_datetime with the format parameter (see http://strftime.org/ for the format codes):
df['date'] = pd.to_datetime(df['date'], format='%Y%m%d')
print(df)
date price
0 2012-12-31 25
1 2013-01-02 163
2 2013-01-05 235
3 2013-01-06 36
4 2013-01-07 40
5 2013-01-08 82
And then plot:
df.plot(x='date', y='price')
import pandas as pd
%matplotlib inline
df = pd.DataFrame({'date': [20121231,20130102, 20130105, 20130106, 20130107,
20130108],'price': [25, 163, 235, 36, 40, 82]})
df['date'] = pd.to_datetime(df['date'], format='%Y%m%d')
df.plot(x='date', y='price')
With pandas you can convert the date column directly to a datetime type, and then plot it with matplotlib.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as dates
df = pd.DataFrame(
{'date': [20121231, 20130102, 20130105, 20130106, 20130107, 20130108],
'price': [25, 163, 235, 36, 40, 82]
})
fig, ax = plt.subplots()
# Date plot with matplotlib
ax.plot_date(
pd.to_datetime(df["date"], format="%Y%m%d"),
df["price"],
'v-'
)
# Days and months and the horizontal locators
ax.xaxis.set_minor_locator(dates.DayLocator())
ax.xaxis.set_minor_formatter(dates.DateFormatter('%d\n%a'))
ax.xaxis.set_major_locator(dates.MonthLocator())
ax.xaxis.set_major_formatter(dates.DateFormatter('\n\n\n%b\n%Y'))
ax.xaxis.grid(True, which="minor")
ax.yaxis.grid()
plt.tight_layout()
plt.show()
Result:
My DataFrame looks like this:
I plot it with this code:
tmp['event_name'].plot(style='.', figsize=(20,10), grid=True)
The result looks like this:
I want to change the size of the points using the details column.
Question:
How can I do it? plot has no size argument, and I cannot use plot.scatter() because I cannot use a time format for the x axis.
DataFrame.plot passes any unknown keyword arguments down to matplotlib, as stated in the linked docs. Therefore, you can specify the marker size using matplotlib's usual ms (markersize) keyword:
tmp['event_name'].plot(style='.', figsize=(20,10), grid=True, ms=5)
That said, you can use plt.scatter with timestamps as well, which makes using the 'details' column as the marker size more straightforward:
import matplotlib.pyplot as plt
import pandas as pd
data = {'time': ['2015-01-01', '2015-01-02', '2015-01-03', '2015-01-04'],
'event_name': [2, 2, 2, 2],
'details': [46, 16, 1, 7]}
df = pd.DataFrame(data)
dates = [pd.to_datetime(date) for date in df.time]
plt.scatter(dates, df.event_name, s=df.details)
plt.show()
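One detail worth knowing: the `s` argument of scatter is the marker area in points squared, so small values such as 1 or 7 produce barely visible markers. A possible tweak, sketched here with a made-up `scale` factor that you would tune by eye, is to multiply the column before passing it in. The Agg backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    'time': pd.to_datetime(['2015-01-01', '2015-01-02', '2015-01-03', '2015-01-04']),
    'event_name': [2, 2, 2, 2],
    'details': [46, 16, 1, 7],
})

scale = 10  # hypothetical factor, chosen only for illustration

fig, ax = plt.subplots()
# s is the marker area in points**2, one value per row
points = ax.scatter(df['time'], df['event_name'], s=df['details'] * scale)
fig.autofmt_xdate()  # rotate the date labels so they do not overlap
sizes = points.get_sizes()
```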
You can try this:
for index, i in enumerate(df['details']):
    # draw one point per row, scaling the marker size by the 'details' value
    plt.plot(df.index[index], df.iloc[index]['event_name'], marker='.', linestyle='None', markersize=i*4, color='b')
plt.show()
Example:
import matplotlib.pyplot as plt
import pandas as pd
df = {'time': ['2015-01-01','2015-01-02','2015-01-03', '2015-01-04', '2015-01-05'],'event_name': [2,2,2,2,2], 'details':[46,16,1,7,4]}
df = pd.DataFrame(data=df)
df['time'] = pd.to_datetime(df['time'], format='%Y-%m-%d')
df = df.set_index('time')
df:
details event_name
time
2015-01-01 46 2
2015-01-02 16 2
2015-01-03 1 2
2015-01-04 7 2
2015-01-05 4 2
Output:
Hi, I have a dataframe like this:
Date Influenza[it] Febbre[it] Cefalea[it] Paracetamolo[it] \
0 2008-01 989 2395 1291 2933
1 2008-02 962 2553 1360 2547
2 2008-03 1029 2309 1401 2735
3 2008-04 1031 2399 1137 2296
Unnamed: 6 tot_incidence
0 NaN 4.56
1 NaN 5.98
2 NaN 6.54
3 NaN 6.95
I'd like to plot several figures, each with the Date column on the x-axis and the Influenza[it] column plus one other column (e.g. Febbre[it]) on the y-axis; then again Date on the x-axis with Influenza[it] and another column (e.g. Paracetamolo[it]), and so on. I'm trying to figure out whether there is a quick way to do this without completely reshaping the dataframe.
You can simply plot 3 different subplots.
import pandas as pd
import matplotlib.pyplot as plt
dic = {"Date" : ["2008-01","2008-02", "2008-03", "2008-04"],
"Influenza[it]" : [989,962,1029,1031],
"Febbre[it]" : [2395,2553,2309,2399],
"Cefalea[it]" : [1291,1360,1401,1137],
"Paracetamolo[it]" : [2933,2547,2735,2296]}
df = pd.DataFrame(dic)
#optionally convert to datetime
df['Date'] = pd.to_datetime(df['Date'])
fig, ax = plt.subplots(1,3, figsize=(13,7))
df.plot(x="Date", y=["Influenza[it]","Febbre[it]" ], ax=ax[0])
df.plot(x="Date", y=["Influenza[it]","Cefalea[it]" ], ax=ax[1])
df.plot(x="Date", y=["Influenza[it]","Paracetamolo[it]" ], ax=ax[2])
#optionally equalize yaxis limits
for a in ax:
    a.set_ylim([800, 3000])
plt.show()
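If more columns are added later, the three near-identical plot calls can be generated in a loop. This is a sketch under a few assumptions: every column other than Date and the reference column should be paired with Influenza[it], the Date strings are treated as year-month, and the `reference` and `others` names are made up here. The Agg backend line only makes the snippet runnable without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Date": ["2008-01", "2008-02", "2008-03", "2008-04"],
    "Influenza[it]": [989, 962, 1029, 1031],
    "Febbre[it]": [2395, 2553, 2309, 2399],
    "Cefalea[it]": [1291, 1360, 1401, 1137],
    "Paracetamolo[it]": [2933, 2547, 2735, 2296],
})
df["Date"] = pd.to_datetime(df["Date"], format="%Y-%m")  # assumption: year-month

reference = "Influenza[it]"
# Every remaining column gets its own subplot next to the reference series.
others = [c for c in df.columns if c not in ("Date", reference)]

fig, axes = plt.subplots(1, len(others), figsize=(13, 4), sharey=True)
for ax, col in zip(axes, others):
    ax.plot(df["Date"], df[reference], label=reference)
    ax.plot(df["Date"], df[col], label=col)
    ax.set_title(col)
    ax.legend()
fig.autofmt_xdate()  # rotate the date labels so they do not overlap
```

Using `sharey=True` keeps the y-axis limits equal across subplots without setting them by hand.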
If you want to plot each plot separately in a jupyter notebook, the following might do what you want.
Additionally, the dates are converted from year-week format to datetimes so that matplotlib can plot them.
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
dic = {"Date" : ["2008-01","2008-02", "2008-03", "2008-04"],
"Influenza[it]" : [989,962,1029,1031],
"Febbre[it]" : [2395,2553,2309,2399],
"Cefalea[it]" : [1291,1360,1401,1137],
"Paracetamolo[it]" : [2933,2547,2735,2296]}
df = pd.DataFrame(dic)
#convert to datetime, format year-week -> date (monday of that week)
df['Date'] = [ date + "-1" for date in df['Date']] # add "-1" indicating monday of that week
df['Date'] = pd.to_datetime(df['Date'], format="%Y-%W-%w")
cols = ["Febbre[it]", "Cefalea[it]", "Paracetamolo[it]"]
for col in cols:
    plt.close()
    fig, ax = plt.subplots(1,1)
    ax.set_ylim([800, 3000])
    ax.plot(df.Date, df["Influenza[it]"], label="Influenza[it]")
    ax.plot(df.Date, df[col], label=col)
    ax.legend()
    plt.show()