I have imported an Excel (.xlsx) spreadsheet into my Python code (using pandas) and want to extract data from it. The spreadsheet contains the following:
DATE: Lecture1: Lecture2:
16/07/2020 09:30 11:00
17/07/2020 09:45 11:30
18/07/2020 09:45 11:00
19/07/2020 10:00 14:30
20/07/2020 09:30 14:45
How can I write the code so that, given "now = date.today()", it prints only the row with my lectures for that day?
I have the following;
import pandas as pd
data = pd.read_excel(r'/home/timetable1.xlsx')
data["Date"] = pd.to_datetime(data["Date"]).dt.strftime("%d-%m-%Y")
df = pd.DataFrame(data)
print (df)
This prints out the whole timetable as shown below (note the format changes slightly);
Date Lecture1 Lecture2
0 16-07-2020 09:30:00 11:00:00
1 17-07-2020 09:45:00 11:30:00
2 18-07-2020 09:45:00 11:00:00
3 19-07-2020 10:00:00 14:30:00
4 20-07-2020 09:30:00 14:45:00
I am not sure what code is needed to determine today's date and show only today's lecture times. Maybe something like this:
now = date.today()
now.strftime("%d-%m-%y")
if ["Date" == now]:
print ('timetable1.xlsx' index_col=now)
I am new to coding, so I am not very good at it. I know the above code is wrong, but I can't think of a way to display the info.
My desired output is:
Date Lecture1 Lecture2
18-07-2020 09:45:00 11:00:00
Your input would be much appreciated.
Check this:
data['Date'] = pd.to_datetime(data['Date']).dt.strftime("%d-%m-%Y")
now = pd.to_datetime('today').strftime("%d-%m-%Y")
print(data[data['Date'] == now])
Here you go:
from datetime import date
df['DATE'] = pd.to_datetime(df.DATE, format='%d/%m/%Y')
print(df[df.DATE == pd.to_datetime(date.today())])
Output (It's 19th for me)
DATE Lecture1 Lecture2
3 2020-07-19 10:00 14:30
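What you can do is format the current date the same way as the Date column (note the capital %Y for a four-digit year, matching the "%d-%m-%Y" used when the column was formatted):
from datetime import date
today = date.today()
compare = today.strftime("%d-%m-%Y")
And then do a .loc lookup on the dataframe:
df.loc[df['Date'] == compare]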
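The answers above compare formatted strings, which only works when both sides use exactly the same format. A minimal sketch of the same lookup that compares datetimes instead, assuming an in-memory stand-in for the spreadsheet (the sample rows and the name todays_lectures are hypothetical, not from the original file):

```python
import pandas as pd

# Hypothetical stand-in for timetable1.xlsx; one row is dated today so it matches.
today = pd.Timestamp.today().normalize()  # today's date at midnight
df = pd.DataFrame({
    "Date": [today - pd.Timedelta(days=1), today, today + pd.Timedelta(days=1)],
    "Lecture1": ["09:45", "09:45", "10:00"],
    "Lecture2": ["11:30", "11:00", "14:30"],
})

# Compare real datetimes rather than strings, so the display format is irrelevant.
todays_lectures = df[df["Date"].dt.normalize() == today]
print(todays_lectures.to_string(index=False))
```

With a real spreadsheet, the Date column would first be converted with pd.to_datetime (passing dayfirst=True or an explicit format for day/month/year dates).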
Related
I have a pandas DataFrame in which one of the columns is a datetime column created using pd.to_datetime(). I want to extract the date and hour from each datetime object; in other words, I want to change the minutes and seconds to 0.
I used normalize() to change the time to midnight, but I don't know how to change the time to the start of the hour. Please suggest a way to do so.
making some test data and turning it into a dataframe
rng = pd.date_range('1/1/2018 11:59:00', periods=3, freq='min')
df = pd.DataFrame(rng)
print(df)
print(df[0].dt.round('H'))
gives the input
0
0 2018-01-01 11:59:00
1 2018-01-01 12:00:00
2 2018-01-01 12:01:00
and rounded to the nearest hour gives
0
0 2018-01-01 12:00:00
1 2018-01-01 12:00:00
2 2018-01-01 12:00:00
and
print(df[0].dt.floor('H'))
gives
0
0 2018-01-01 11:00:00
1 2018-01-01 12:00:00
2 2018-01-01 12:00:00
if you always want to round down. Likewise, use dt.ceil('H') if you want to round up. (Recent pandas versions prefer the lowercase alias 'h' over 'H'.)
I think you need to check out pandas.Series.dt.strftime
Or try this:
import datetime
df=pd.DataFrame({'timestamp':[pd.Timestamp('today')]})
df['Date']=[pd.to_datetime(i.date())+ datetime.timedelta(hours=i.hour) for i in df['timestamp']]
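As a rough sketch, the row-by-row list comprehension above can be compared against the vectorized dt.floor call from the earlier answer. The sample timestamps and the column name hour_start are made up for illustration, and the lowercase 'h' alias is used because newer pandas versions deprecate 'H':

```python
import datetime
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2018-01-01 11:59:30", "2018-01-01 12:01:05"])
})

# Row-by-row version: date at midnight plus the whole hours
looped = [pd.to_datetime(t.date()) + datetime.timedelta(hours=t.hour)
          for t in df["timestamp"]]

# Vectorized version: truncate each timestamp to the start of its hour
df["hour_start"] = df["timestamp"].dt.floor("h")

print(df)
```

Both approaches give the same timestamps; dt.floor avoids the Python-level loop.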
How can I create a new column that has the day only, and hour of day only based of a column that has a datetime timestamp?
DF has column such as:
Timestamp
2019-05-31 21:11:43
2018-11-21 18:01:00
2017-11-21 22:01:04
2020-04-15 11:01:00
2017-04-20 04:00:33
I want two new columns that look like below:
Day | Hour of Day
2019-05-31 21:00
2018-11-21 18:00
2017-11-21 22:00
2020-04-15 11:00
2017-04-20 04:00
I tried something like the code below, but it only gives me a number for the hour of day:
df['hour'] = pd.to_datetime(df['Timestamp'], format='%H:%M:%S').dt.hour
where the output would be 9 for 9:32:00, which isn't what I want.
Thanks!
Please try dt.strftime with a format string that appends ":00" to the hour:
df['hour'] = pd.to_datetime(df['Timestamp']).dt.strftime("%H"+":00")
Following your comments below, let's use df.assign and extract the hour and the date separately:
df=df.assign(hour=pd.to_datetime(df['Timestamp']).dt.strftime("%H"+":00"), Day=pd.to_datetime(df['Timestamp']).dt.date)
You could convert time to string and then just select substrings by index.
df = pd.DataFrame({'Timestamp': ['2019-05-31 21:11:43', '2018-11-21 18:01:00',
'2017-11-21 22:01:04', '2020-04-15 11:01:00',
'2017-04-20 04:00:33']})
df['Day'], df['Hour of Day'] = zip(*df.Timestamp.apply(lambda x: [str(x)[:10], str(x)[11:13]+':00']))
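Slicing the string by index depends on the timestamps always having exactly the same layout. A small sketch of an alternative that parses the column once and formats both new columns with dt.strftime (the three sample rows are taken from the question; the variable name ts is only for illustration):

```python
import pandas as pd

df = pd.DataFrame({"Timestamp": ["2019-05-31 21:11:43",
                                 "2018-11-21 18:01:00",
                                 "2017-04-20 04:00:33"]})

# Parse once, then derive both columns from the parsed datetimes.
ts = pd.to_datetime(df["Timestamp"])
df["Day"] = ts.dt.strftime("%Y-%m-%d")       # date part as a string
df["Hour of Day"] = ts.dt.strftime("%H:00")  # hour with minutes forced to 00

print(df)
```

If Day should stay a real date rather than a string, ts.dt.date or ts.dt.normalize() could be used instead.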
I have a pandas dataframe with timestamps shown below:
6/30/2019 3:45:00 PM
I would like to round the date based on time. Anything before 6AM will be counted as the day before.
6/30/2019 5:45:00 AM -> 6/29/2019
6/30/2019 6:30:00 AM -> 6/30/2019
What I have considered doing is splitting the date and time into 2 different columns, then using an if statement to shift the date (if time >= 06:00, etc.). I am just wondering whether there is a built-in function in pandas to do this. I've seen posts of people rounding up and down based on the closest hour, but never on a specific time threshold (6 AM).
Thank you for the help!
There could be a better way to do this, but this is one way of doing it.
import pandas as pd
def checkDates(d):
    # Timestamps before 6 AM count as the previous day
    if d.time().hour < 6:
        return d - pd.Timedelta(days=1)
    else:
        return d
ls = ["12/31/2019 3:45:00 AM", "6/30/2019 9:45:00 PM", "6/30/2019 10:45:00 PM", "1/1/2019 4:45:00 AM"]
df = pd.DataFrame(ls, columns=["dates"])
df["dates"] = df["dates"].apply(lambda d: checkDates(pd.to_datetime(d)))
print (df)
dates
0 2019-12-30 03:45:00
1 2019-06-30 21:45:00
2 2019-06-30 22:45:00
3 2018-12-31 04:45:00
Also note that the time of day is left unchanged in the result.
If you just want the date at the end of it, you can get that out of the datetime object by doing something like this:
print(pd.to_datetime("12/31/2019 3:45:00 AM").date())  # 2019-12-31
If you understand Python well and don't want anyone else (in the future) to understand what you are doing, a one-liner for the above is:
df["dates"] = df["dates"].apply(lambda d: pd.to_datetime(d) - pd.Timedelta(days=1) if pd.to_datetime(d).time().hour < 6 else pd.to_datetime(d))
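The question asked whether pandas has something built in. There is no dedicated "threshold" function that I know of, but one possible vectorized sketch is to shift every timestamp back by six hours and then truncate to the date. The column name day and the sample rows are assumptions for illustration, and the explicit format string assumes the month/day/year layout shown in the question:

```python
import pandas as pd

df = pd.DataFrame({"dates": ["6/30/2019 5:45:00 AM",
                             "6/30/2019 6:30:00 AM",
                             "1/1/2019 4:45:00 AM"]})

ts = pd.to_datetime(df["dates"], format="%m/%d/%Y %I:%M:%S %p")

# Anything before 06:00 lands on the previous calendar day after the shift;
# normalize() then drops the time of day.
df["day"] = (ts - pd.Timedelta(hours=6)).dt.normalize()

print(df)
```

A timestamp at exactly 6:00:00 AM stays on the same day, which matches "anything before 6 AM counts as the day before".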
I have a pandas dataframe with some time data which looks like
0 08:00 AM
1 08:15 AM
2 08:30 AM
3 7:45 AM
4 7:30 AM
There are 660 rows like these in total (datatype: string). I want to plot the distribution (histogram) of this column. How can I do that? Some of the rows are just empty strings (missing data), so I also have to handle that while plotting. What is the best way to handle that?
I have tried to use pandas.to_datetime() to convert string to timestamp, but still after that I am stuck on how to plot distribution of those timestamps and missing data.
Let's assume you have the dataframe you're talking about, and you're able to cast as pandas datetime objects:
import pandas as pd
df = pd.DataFrame(['8:00 AM', '8:15 AM', '08:30 AM', '', '7:45 AM','7:45 AM'], columns = ['time'])
df.time = pd.to_datetime(df.time)
df looks like this:
time
0 2019-08-16 08:00:00
1 2019-08-16 08:15:00
2 2019-08-16 08:30:00
3 NaT
4 2019-08-16 07:45:00
5 2019-08-16 07:45:00
I would group by both hour and minute:
df.groupby([df['time'].dt.hour, df['time'].dt.minute]).count().plot(kind="bar")
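A minimal sketch of one way to handle the empty strings explicitly and bin the times before plotting. The 30-minute bin width, the variable names, and the sample values are assumptions for illustration; the plotting call is left as a comment because it needs matplotlib:

```python
import pandas as pd

df = pd.DataFrame({"time": ["08:00 AM", "08:15 AM", "08:30 AM", "",
                            "7:45 AM", "7:45 AM"]})

# Empty or unparseable strings become NaT instead of raising an error.
parsed = pd.to_datetime(df["time"], format="%I:%M %p", errors="coerce")

n_missing = int(parsed.isna().sum())  # report missing values separately
valid = parsed.dropna()

# Bucket the valid times into 30-minute bins and count each bucket.
counts = valid.dt.floor("30min").dt.strftime("%H:%M").value_counts().sort_index()

print(counts)
print("missing:", n_missing)
# counts.plot(kind="bar")  # draws the histogram; requires matplotlib
```

Counting the missing rows separately keeps them out of the histogram while still making them visible.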
I am trying to extract only time from a datetime column but cannot find any solution. I am not good at string manipulation either.
Example:
Datetime
2017-01-17 00:40:00
2017-01-17 01:40:00
2017-01-17 02:40:00
2017-01-17 03:40:00
2017-01-17 04:40:00
Desired Output:
Time
00:40:00
01:40:00
02:40:00
03:40:00
04:40:00
You can do this with the dt.time
df = pd.DataFrame({'date': {0: '26-1-2014 04:40:00', 1: '26-1-2014 03:40:00', 2:'26-1-2015 02:40:00', 3:'30-1-2014 01:40:00'}})
df['time'] = pd.to_datetime(df.date).dt.time
This will add a time column
Let's assume that the column holding your datetime objects is called 'DatetimeColumn'. You can iterate over the DataFrame rows and format each datetime so that it shows only the time. Here is how you would do that row by row:
for index, row in df.iterrows():
    timeValue = row['DatetimeColumn'].strftime('%H:%M:%S')
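Looping over rows works but is slow on large frames. A short sketch of the vectorized equivalents, using the sample values from the question (the column names Time and TimeStr are chosen here only for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "Datetime": pd.to_datetime(["2017-01-17 00:40:00", "2017-01-17 01:40:00"])
})

# datetime.time objects, useful for further comparisons
df["Time"] = df["Datetime"].dt.time

# plain strings, useful for display or export
df["TimeStr"] = df["Datetime"].dt.strftime("%H:%M:%S")

print(df)
```

Which of the two to keep depends on whether the times are needed for further calculation or only for display.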