The right way to parse date - python

Is there an easier way to parse this kind of date?
I'm trying to build a filter in pandas that finds the date 3 months ago and then selects (with loc) that entire month.
The code works, but I'm searching for the best way.
final_date = pd.to_datetime(f'{(datetime.today() - timedelta(days=90)).year}-{(datetime.today() - timedelta(days=90)).month}-01', dayfirst=True)

My Suggestion
This version of the code is more concise and easier to read.
It avoids using multiple calls to datetime.today() and string formatting.
import pandas as pd
from datetime import date, timedelta
# Calculate the date 90 days ago
d = date.today() - timedelta(days=90)
# Format the date as a string in the '%Y-%m-01' format
date_str = d.strftime('%Y-%m-01')
# Parse the year-first string as a datetime object (dayfirst is not needed for this format)
final_date = pd.to_datetime(date_str)
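Note that 90 days is only an approximation of three months. For example, 90 days before 31 May 2024 is 2 March, not a date in February. If calendar months are what you want, a rough sketch with pd.DateOffset could look like the following. The helper name month_window and the sample DataFrame are made up for illustration, and "today" is fixed so the result is reproducible.

```python
import pandas as pd

def month_window(today, months_back=3):
    """Return [start, end) for the calendar month `months_back` months before `today`."""
    # DateOffset(months=...) steps by calendar months and clips to the month end
    anchor = pd.Timestamp(today).normalize() - pd.DateOffset(months=months_back)
    start = anchor.replace(day=1)
    end = start + pd.DateOffset(months=1)
    return start, end

# Hypothetical sample data, not from the question
df = pd.DataFrame(
    {"date": pd.to_datetime(["2024-01-31", "2024-02-01", "2024-02-29", "2024-03-01"])}
)

start, end = month_window("2024-05-31")
# Half-open interval: keeps every row in the target month, excludes the next month
in_month = df.loc[(df["date"] >= start) & (df["date"] < end)]

print(start.date(), end.date())          # 2024-02-01 2024-03-01
print(in_month["date"].dt.day.tolist())  # [1, 29]
```

In real use you would call month_window(pd.Timestamp.today()) instead of passing a fixed date.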

Related

Using the time functions

I have a small question. I have an array that saves dates in the following format.
'01/02/20|07/02/20'
It is saved as a string, which uses the start date on the left side of the "|" and end date on the other side.
Only the end date matters here. Is there a function or algorithm I can use to automatically calculate the difference in days and months between datetime.now() and the end date (right-hand side of the "|")?
Thanks, everyone
datetime.strptime is the main routine for parsing strings into datetimes. It can handle all sorts of formats, with the format determined by a format string you give it:
In [33]: s = '01/02/20|07/02/20'
In [34]: from datetime import datetime
In [35]: end_date = datetime.strptime(s.split('|')[1], '%d/%m/%y')
In [36]: diff = datetime.now() - end_date
In [37]: diff
Out[37]: datetime.timedelta(days=81, seconds=81712, microseconds=14069)
In [38]: diff.days
Out[38]: 81
You might be looking for something like this.
Python's datetime module is the way to go for a problem like this:
import datetime
dates = '01/02/20|07/02/20'
enddate = dates.split('|')[-1]
# Use %y for 2 digits year else %Y for 4 digit year
enddate = datetime.datetime.strptime(enddate, "%d/%m/%y")
today = datetime.date.today()
print(abs(enddate.date() - today).days)
Output:
81
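The answers above give the difference in days only. For the months part of the question, one option is relativedelta from the third-party dateutil package. This is just a sketch: the function name months_and_days_since is made up, and "now" is fixed to 28 April 2020 (which reproduces the 81-day result above) so the output is reproducible.

```python
from datetime import datetime
from dateutil.relativedelta import relativedelta

def months_and_days_since(date_range, now):
    """Return (whole months, remaining days) between the end date and `now`."""
    # The end date is the part to the right of the "|"
    end = datetime.strptime(date_range.split('|')[-1], '%d/%m/%y')
    delta = relativedelta(now, end)
    return delta.years * 12 + delta.months, delta.days

now = datetime(2020, 4, 28)  # assumed "now", fixed for the example
months, days = months_and_days_since('01/02/20|07/02/20', now)
print(months, days)                        # 2 21
print((now - datetime(2020, 2, 7)).days)   # 81
```

Pass datetime.now() as the second argument to compare against the current time.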

How to play around with JSON date format?

I have a data set of JSON dates and I'm trying to calculate the time difference between two JSON DateTime values.
For example :
'2015-01-28T21:41:38.508275' - '2015-01-28T21:41:34.921589'
Please look at the python code below:
#let's say 'time' is my data frame and JSON formatted time values are under the 'due_date' column
time_spent = time.iloc[2]['due_date'] - time.iloc[10]['due_date']
This doesn't work. I also tried to cast each operand to int, but it also didn't help. What are the different ways to perform this calculation?
I use the parser from dateutil.
Something like this:
from dateutil.parser import parse
first_date_obj = parse("2015-01-28T21:41:38.508275")
second_date_obj = parse("2015-02-28T21:41:38.508275")
print(second_date_obj - first_date_obj)
You can also access the year, month, and day of the date object like this:
print(first_date_obj.year)
print(first_date_obj.month)
print(first_date_obj.day)
# and so on
from datetime import datetime
date_format = '%Y-%m-%dT%H:%M:%S.%f'
d2 = time.iloc[2]['due_date']
d1 = time.iloc[10]['due_date']
time_spent = datetime.strptime(d2, date_format) - datetime.strptime(d1, date_format)
print(time_spent.days) # 0
print(time_spent.microseconds) # 586686
print(time_spent.seconds) # 3
print(time_spent.total_seconds()) # 3.586686
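On Python 3.7 or later, the standard library's datetime.fromisoformat should also handle this timestamp layout without a format string. A minimal sketch using the two values from the question:

```python
from datetime import datetime

# Both strings are ISO 8601 timestamps with microseconds and no timezone
later = datetime.fromisoformat("2015-01-28T21:41:38.508275")
earlier = datetime.fromisoformat("2015-01-28T21:41:34.921589")

time_spent = later - earlier       # a datetime.timedelta
print(time_spent)                  # 0:00:03.586686
print(time_spent.total_seconds())  # 3.586686
```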
The easiest thing to do is to use the pandas datetime capability (since you are already using iloc I assume you are using pandas). You can convert the entire dataframe column labeled due_date to be a pandas datetime datatype using
import pandas as pd
time['due_date'] = pd.to_datetime(time['due_date'])
then calculate the time difference you want using
time_spent = time.iloc[2]['due_date'] - time.iloc[10]['due_date']
time_spent will be a pandas timedelta object that you can then manipulate as necessary.
See https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html and https://pandas.pydata.org/pandas-docs/stable/user_guide/timedeltas.html.
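To make the pandas approach concrete, here is a small self-contained sketch. The two-row DataFrame is a stand-in built from the timestamps in the question; the real data frame will have different rows and positions.

```python
import pandas as pd

# Stand-in for the question's data frame (hypothetical two-row sample)
time = pd.DataFrame(
    {"due_date": ["2015-01-28T21:41:34.921589", "2015-01-28T21:41:38.508275"]}
)

# Convert the whole column from strings to datetime64 once
time["due_date"] = pd.to_datetime(time["due_date"])

# Subtracting two elements now yields a pandas Timedelta
time_spent = time["due_date"].iloc[1] - time["due_date"].iloc[0]
print(time_spent.total_seconds())  # 3.586686
```

Once the column is datetime64, time["due_date"].diff() gives the difference between consecutive rows in one call.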

Comparison between datetime and datetime64[ns] in pandas

I'm writing a program that checks an Excel file, and if today's date is in the file's date column, I parse it.
I'm using:
cur_date = datetime.today()
for today's date. I'm checking if today is in the column with:
bool_val = cur_date in df['date'] #evaluates to false
I do know for a fact that today's date is in the file in question. The dtype of the series is datetime64[ns]
Also, I am only checking the date itself and not the timestamp afterwards, if that matters. I'm doing this to make the timestamp 00:00:00:
cur_date = datetime.strptime(cur_date.strftime('%Y_%m_%d'), '%Y_%m_%d')
And the type of that object after printing is datetime as well
For anyone who landed here while comparing a DataFrame date column to a date variable and did not find an exact answer to their question: you can use the code below.
Instead of:
self.df["date"] = pd.to_datetime(self.df["date"])
You can add .dt.date to the end, which turns the column values into datetime.date objects that compare directly with a datetime.date value:
self.df["date"] = pd.to_datetime(self.df["date"]).dt.date
You can use
pd.Timestamp('today')
or
pd.to_datetime('today')
But both of those give the date and time for 'now'.
Try this instead:
pd.Timestamp('today').floor('D')
or
pd.to_datetime('today').floor('D')
You could also have passed the datetime object to pandas.to_datetime, but I like the other option more.
pd.to_datetime(datetime.datetime.today()).floor('D')
Pandas also has a Timedelta object
pd.Timestamp('now').floor('D') + pd.Timedelta(-3, unit='D')
Or you can use the offsets module
pd.Timestamp('now').floor('D') + pd.offsets.Day(-3)
To check for membership, try one of these
cur_date in df['date'].tolist()
Or
df['date'].eq(cur_date).any()
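The reason the original check returns False is that the in operator on a Series tests the index labels, not the values. A small sketch with a made-up two-row DataFrame and a fixed date in place of today:

```python
import pandas as pd

# Hypothetical sample data with the default integer index (0, 1)
df = pd.DataFrame({"date": pd.to_datetime(["2024-03-01", "2024-03-02"])})

# Stand-in for "today", floored to midnight like the column values
cur_date = pd.Timestamp("2024-03-02 15:30").floor("D")

in_index = cur_date in df["date"]                # checks the index labels 0 and 1
in_values = bool(df["date"].eq(cur_date).any())  # checks the column values
also_in_values = bool(df["date"].isin([cur_date]).any())

print(in_index, in_values, also_in_values)  # False True True
```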
When you convert a datetime64 value with pd.Timestamp(), make sure you compare it to another Timestamp (not to a datetime.date).
import numpy as np
import pandas as pd
from datetime import timedelta
# Convert a date string to numpy.datetime64
date = '2022-11-20 00:00:00'
date64 = np.datetime64(date)
# Seven days ago, as a Timestamp
sevenDaysAgoTs = pd.to_datetime('today') - timedelta(days=7)
# Convert date64 to a Timestamp and check whether it falls within the last 7 days
print(pd.Timestamp(date64) >= sevenDaysAgoTs)

Check date is within one year python

I have a date string formatted like this: "2017-05-31T06:44:13Z".
I need to check whether this date is within a one year span from today's date.
Which is the best method to do it: convert it into a timestamp and check, or convert into a date format?
Convert the timestamp to a datetime object so it can be compared with other datetime objects using <, >, and ==.
from datetime import datetime
from dateutil.relativedelta import relativedelta
# NOTE this format basically ignores the timezone. This may or may not be what you want
date_to_check = datetime.strptime('2017-05-31T06:44:13Z', '%Y-%m-%dT%H:%M:%SZ')
today = datetime.today()
one_year_from_now = today + relativedelta(years=1)
if today <= date_to_check <= one_year_from_now:
    pass  # do whatever
Use the datetime package together with timedelta:
import datetime
then = datetime.datetime.strptime("2017-05-31T06:44:13Z".replace('T',' ')[:-1],'%Y-%m-%d %H:%M:%S')
now = datetime.datetime.now()
d = datetime.timedelta(days = 365)
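and simply check the result: if now - d > then, the date is more than a year old; otherwise (for a past date) it falls within the last year.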
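If the trailing "Z" (UTC) should be respected rather than ignored, one possible sketch is to parse it with %z, which accepts "Z" on Python 3.7 and later, and compare timezone-aware values. The function name within_past_year is made up, "within one year" is interpreted here as the past 365 days, and the reference time is fixed so the example is reproducible.

```python
from datetime import datetime, timedelta, timezone

def within_past_year(stamp, now=None):
    """Return True if `stamp` lies in the 365 days before `now` (UTC-aware)."""
    # %z parses the trailing "Z" as UTC, giving an aware datetime
    then = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S%z")
    if now is None:
        now = datetime.now(timezone.utc)
    return now - timedelta(days=365) <= then <= now

fixed_now = datetime(2018, 1, 1, tzinfo=timezone.utc)  # assumed reference time
print(within_past_year("2017-05-31T06:44:13Z", fixed_now))  # True
print(within_past_year("2016-05-31T06:44:13Z", fixed_now))  # False
```

Swap timedelta(days=365) for relativedelta(years=1) if leap years matter for your use case.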

Extract Date from excel and append it in a list using python

I have a column in Excel which has dates in the format "17-12-2015 19:35". How can I extract the first 2 digits as integers and append them to a list? In this case I need to extract 17 and append it to a list. Can it also be done using pandas?
Code thus far:
import pandas as pd
Location = r'F:\Analytics Materials\files\paymenttransactions.csv'
df = pd.read_csv(Location)
time = df['Creation Date'].tolist()
print (time)
You could extract the day of each timestamp like this:
from datetime import datetime
import pandas as pd
location = r'F:\Analytics Materials\files\paymenttransactions.csv'
df = pd.read_csv(location)
timestamps = df['Creation Date'].tolist()
dates = [datetime.strptime(timestamp, '%d-%m-%Y %H:%M') for timestamp in timestamps]
days = [int(date.strftime('%d')) for date in dates]  # int() because the question asks for integers
print(days)
The '%d-%m-%Y %H:%M' and '%d' bits are format specifiers that describe how your timestamp is formatted. See the strftime() and strptime() format codes section of the Python documentation for a complete list of directives.
datetime.strptime parses a string into a datetime object using such a specifier. dates will thus hold a list of datetime instances instead of strings.
datetime.strftime does the opposite: it turns a datetime object into a string, again using a format specifier. %d simply instructs strftime to output only the day of a date.
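As for doing it in pandas alone: a minimal sketch is to parse the column with pd.to_datetime and read the .dt.day accessor. The two-row DataFrame below is a made-up stand-in for the CSV from the question.

```python
import pandas as pd

# Hypothetical stand-in for pd.read_csv(location)
df = pd.DataFrame({"Creation Date": ["17-12-2015 19:35", "03-01-2016 08:05"]})

# Parse the whole column with an explicit day-first format, then take the day
parsed = pd.to_datetime(df["Creation Date"], format="%d-%m-%Y %H:%M")
days = parsed.dt.day.tolist()

print(days)  # [17, 3]
```

The values in days are plain integers, so no extra conversion is needed.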
