I would like to be able to do greater than and less than against dates. How would I go about doing that? For example:
date1 = "20/06/2013"
date2 = "25/06/2013"
date3 = "01/07/2013"
date4 = "07/07/2013"
datelist = [date1, date2, date3]
for j in datelist:
    if j <= date4:
        print(j)
If I run the above, I get date3 back and not date1 or date2. I think I need to get the system to realise it's a date, but I don't know how to do that. Can someone lend a hand?
Thanks
You can use the datetime module to convert them all to datetime objects. You are comparing strings in your example:
>>> from datetime import datetime
>>> date1 = datetime.strptime(date1, "%d/%m/%Y")
>>> date2 = datetime.strptime(date2, "%d/%m/%Y")
>>> date3 = datetime.strptime(date3, "%d/%m/%Y")
>>> date4 = datetime.strptime(date4, "%d/%m/%Y")
>>> datelist = [date1, date2, date3]
>>> for j in datelist:
...     if j <= date4:
...         print(j.strftime('%d/%m/%Y'))
...
20/06/2013
25/06/2013
01/07/2013
You are comparing strings, not dates. You should use a date-based object-type, such as datetime.
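To make the difference concrete, here is a minimal sketch that runs the same filter twice, once on the raw strings and once on parsed dates. The helper name parse_dmy is made up for this example; the dates are the ones from the question.

```python
from datetime import datetime

def parse_dmy(text):
    """Parse a 'dd/mm/yyyy' string into a date (illustrative helper)."""
    return datetime.strptime(text, "%d/%m/%Y").date()

dates = ["20/06/2013", "25/06/2013", "01/07/2013"]
cutoff = "07/07/2013"

# String comparison works character by character, so only the string
# starting with "01" sorts before the one starting with "07".
as_strings = [d for d in dates if d <= cutoff]

# Date comparison is chronological, so all three dates qualify.
as_dates = [d for d in dates if parse_dmy(d) <= parse_dmy(cutoff)]

print(as_strings)
print(as_dates)
```

The first list contains only the July date, which is the behaviour described in the question; the second contains all three.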
How to compare two dates?
You can use the datetime module:
>>> from datetime import datetime
>>> d = datetime.strptime(date4, '%d/%m/%Y')
>>> for j in datelist:
...     d1 = datetime.strptime(j, '%d/%m/%Y')
...     if d1 <= d:
...         print(j)
...
20/06/2013
25/06/2013
01/07/2013
The problem with your comparison is that a string comparison first compares the first character, followed by the second one, and the third, and so on. You can of course convert the strings to dates, like the other answers suggest, but there is a different solution as well.
In order to compare dates as strings, you need to have them in a different format, such as 'yyyy-mm-dd'. That way the comparison first looks at the year, then the month, and finally the day:
>>> d1 = '2012-10-11'
>>> d2 = '2012-10-12'
>>> if d2 > d1:
...     print('this works!')
...
this works!
The advantages of this are simplicity (for me at least) and performance, because it saves converting the strings to dates (and possibly back) while still comparing the dates reliably. In my own programs I compare dates a lot as well. Since I read the dates from files, they are always strings to begin with, and because performance is an issue in my program, I normally prefer to compare dates as strings in this way.
Of course this would mean you would have to convert your dates to a different format, but if that is a one time action, it could well be worth the effort.. :)
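As a rough sketch of that one-time conversion, the dd/mm/yyyy strings from the question can be rearranged by slicing, without going through datetime at all. The helper name to_iso is hypothetical, and the slicing assumes every input is well formed and zero padded (exactly ten characters).

```python
def to_iso(text):
    """Rearrange 'dd/mm/yyyy' into 'yyyy-mm-dd' (assumes zero-padded input)."""
    return f"{text[6:10]}-{text[3:5]}-{text[0:2]}"

iso_dates = [to_iso(d) for d in ["20/06/2013", "25/06/2013", "01/07/2013"]]
iso_cutoff = to_iso("07/07/2013")

# Plain string comparison is now chronological.
on_or_before = [d for d in iso_dates if d <= iso_cutoff]
print(on_or_before)
```

Because no validation happens here, a malformed string would silently produce a wrong key; parsing with strptime is the safer choice when the input is not trusted.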
Related
I am given two dates as strings below, and I want to subtract them to get the number 16 as my output. I tried converting them to date format first and then doing the math, but it didn't work.
from datetime import datetime
date_string = '2021-05-27'
prev_date_string = '2021-05-11'
a = datetime.strptime(date_string '%y/%m/%d')
b = datetime.strptime(prev_date_string '%y/%m/%d')
c = a - b
print (c)
There are two problems with the strptime calls. First, they are missing commas (,) between the two arguments. Second, the format string you use must match the format of the dates you have.
Also, note that the result of subtracting two datetime objects is a timedelta object. If you just want to print out the number 16, you'll need to read the days attribute of the result:
a = datetime.strptime(date_string, '%Y-%m-%d')
b = datetime.strptime(prev_date_string, '%Y-%m-%d')
c = a-b
print (c.days)
A simpler answer, if you can construct the date objects directly:
from datetime import date
a = date(2021, 5, 11)
b = date(2021, 5, 27)
c = b - a
print(c.days)
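Since the strings in the question are already in yyyy-mm-dd form, another option is date.fromisoformat, which is available from Python 3.7 onwards. This is only a sketch reusing the variable names from the question:

```python
from datetime import date

date_string = "2021-05-27"
prev_date_string = "2021-05-11"

# fromisoformat parses 'YYYY-MM-DD' directly, so no format string is needed.
delta = date.fromisoformat(date_string) - date.fromisoformat(prev_date_string)
print(delta.days)
```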
I am trying to convert the date of birth in pandas from mm/dd/yy to mm/dd/yyyy.
The issue I'm having is that when converting dates of birth such as
06/13/54
04/15/70
to the mm/dd/yyyy format, pandas assumes that the date is in the 2000s. Obviously those users wouldn't even be born yet. Is there a function or something similar that can make sure the conversion is done properly, or as properly as it can be? Let's assume for this case that no user lives past 90.
You really shouldn't strftime back to the very bad, not good mm-dd-yy format, but keep things as Pandas datetimes.
Either way, you can come up with a function that fixes "bad-looking" dates and .apply() it – this is using a single pd.Series, but that's what dataframes are composed of anyway, so you get the idea.
>>> import pandas as pd
>>> s = pd.Series(["06/13/54", "04/15/70"])
>>> s2 = pd.to_datetime(s)
>>> s2
0   2054-06-13
1   2070-04-15
dtype: datetime64[ns]
>>> def fix_date(dt):
...     if dt.year >= 2021:  # change threshold accordingly
...         return dt.replace(year=dt.year - 100)
...     return dt
...
>>> s3 = s2.apply(fix_date)
>>> s3
0   1954-06-13
1   1970-04-15
dtype: datetime64[ns]
Simply replace the year if it is in the future:
import datetime
import pandas as pd
x = pd.to_datetime('06/13/54', format='%m/%d/%y')
if x > datetime.datetime.now():
    x = x.replace(year=x.year - 100)  # replace() returns a new object
Input
df = pd.DataFrame({'date':["06/13/54", "04/15/70"]})
df.date = pd.to_datetime(df.date, format='%m/%d/%y')
df
Input df
date
0 2054-06-13
1 1970-04-15
Code
df.date = df.date.mask(df.date.gt(pd.Timestamp('today')), df.date-pd.DateOffset(years=100))
df
Output
date
0 1954-06-13
1 1970-04-15
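For comparison, the same "shift future dates back a century" rule can be sketched with the standard library alone. The function name parse_birth_date and the fixed reference date are assumptions made for this example; in real use you would probably pass date.today().

```python
from datetime import date, datetime

def parse_birth_date(text, today):
    """Parse 'mm/dd/yy'; a result later than `today` is moved back 100 years."""
    parsed = datetime.strptime(text, "%m/%d/%y").date()
    if parsed > today:
        parsed = parsed.replace(year=parsed.year - 100)
    return parsed

reference = date(2021, 6, 1)  # fixed "today" so the example is reproducible
fixed = [parse_birth_date(s, reference) for s in ["06/13/54", "04/15/70"]]
print([d.strftime("%m/%d/%Y") for d in fixed])
```

Note that this rule cannot tell a very recent birth from one a century earlier; it only encodes the assumption that no date of birth lies in the future.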
I am trying to store a date in a human readable format. For that I save and read back a string containing a date. I am using date.min to denote a date before any other.
from datetime import datetime, date
d = date.min
s = datetime.strftime(d, "%Y-%m-%d")
print(s)
# 1-01-01
d2 = datetime.strptime(s, "%Y-%m-%d")
# ValueError: time data '1-01-01' does not match format '%Y-%m-%d'
However, when I try to parse the date using strptime that was output by strftime, I only get an error. It seems that strptime is expecting leading zeros like 0001, which strftime is not outputting.
It might be possible to use None. Are there any other ways to work around what seems like a bug to me?
You need to add leading zeros.
Try replacing:
s = datetime.strftime(d, "%Y-%m-%d")
with:
s = f'{d.year:04d}-{datetime.strftime(d, "%m-%d")}'
If you want to work with dates easily, I can really recommend the Arrow library.
https://pypi.org/project/arrow/
Python 3.9 on Linux exhibits the problem as expected in the many comments on the question. One workaround that should work on all platforms is to use ISO format for the date string with date.isoformat():
>>> from datetime import date, datetime
>>> s = date.min.isoformat()
>>> s
'0001-01-01'
>>> d = datetime.strptime(s, "%Y-%m-%d")
>>> d
datetime.datetime(1, 1, 1, 0, 0)
>>> assert d.date() == date.min
You can also use date.fromisoformat() instead of strptime():
>>> date.fromisoformat(date.min.isoformat())
datetime.date(1, 1, 1)
strftime does not reliably add leading zeros to the year. It calls the underlying C function, whose behavior for leading zeros in the year is platform specific. You can work around this by formatting the date object yourself, padding the year with as many leading zeros as needed to reach four digits:
from datetime import date, datetime
d = date.min
year_str = str(d.year).zfill(4)  # pad the year to four digits, e.g. '0001'
s = '{year}-{md}'.format(year=year_str, md=datetime.strftime(d, "%m-%d"))
d2 = datetime.strptime(s, "%Y-%m-%d")
# now d2 is equal to datetime.datetime(1, 1, 1, 0, 0)
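The same idea can be wrapped into a pair of helpers that avoid strftime entirely when writing the string. This is only a sketch; format_padded and parse_padded are made-up names.

```python
from datetime import date, datetime

def format_padded(d):
    """Format a date as 'YYYY-MM-DD' with a zero-padded four-digit year."""
    return f"{d.year:04d}-{d.month:02d}-{d.day:02d}"

def parse_padded(s):
    """Parse a string produced by format_padded back into a date."""
    return datetime.strptime(s, "%Y-%m-%d").date()

s = format_padded(date.min)
print(s)
print(parse_padded(s) == date.min)
```

Since the formatting is done with Python's own string formatting, the padding does not depend on the platform's C library.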
I have a small question. I have an array that saves dates in the following format.
'01/02/20|07/02/20'
It is saved as a string, which uses the start date on the left side of the "|" and end date on the other side.
It is only the end date that matters here, but is there a function or algorithm I can use to automatically calculate the difference in days and months between datetime.now() and the end date (the right-hand side of the "|")?
Thanks, everyone
datetime.strptime is the main routine for parsing strings into datetimes. It can handle all sorts of formats, with the format determined by a format string you give it:
In [34]: from datetime import datetime
In [35]: end_date = datetime.strptime(s.split('|')[1], '%d/%m/%y')
In [36]: diff = datetime.now() - end_date
In [37]: diff
Out[37]: datetime.timedelta(days=81, seconds=81712, microseconds=14069)
In [38]: diff.days
Out[38]: 81
You might be looking for something like this.
Python's datetime module is the way to go for a problem like this:
import datetime
dates = '01/02/20|07/02/20'
enddate = dates.split('|')[-1]
# Use %y for 2 digits year else %Y for 4 digit year
enddate = datetime.datetime.strptime(enddate, "%d/%m/%y")
today = datetime.date.today()
print(abs(enddate.date() - today).days)
Output:
81
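The question also asks for the difference in months, which timedelta does not provide. One possible convention, sketched below, is to count whole calendar months and then the leftover days. The function name months_and_days is hypothetical, and the fixed "today" is only there to make the example reproducible.

```python
import calendar
from datetime import date, datetime

def months_and_days(start, end):
    """Return (whole months, leftover days) between two dates, in either order."""
    if start > end:
        start, end = end, start
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the last month is not complete yet
    # Move `start` forward by the whole months, clamping to the month's last day.
    years_ahead, month_index = divmod(start.month - 1 + months, 12)
    year, month = start.year + years_ahead, month_index + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    anchor = date(year, month, day)
    return months, (end - anchor).days

dates = '01/02/20|07/02/20'
end_date = datetime.strptime(dates.split('|')[1], '%d/%m/%y').date()
today = date(2020, 4, 28)  # fixed stand-in for date.today()

print((today - end_date).days)
print(months_and_days(end_date, today))
```

Other conventions for "months between" exist (for example, how to treat the 31st), so treat this as one reasonable choice and not the only one.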
I have two different dates and I want to know the difference in days between them. The format of the date is YYYY-MM-DD.
I have a function that can ADD or SUBTRACT a given number to a date:
import time
def addonDays(a, x):
    ret = time.strftime("%Y-%m-%d", time.localtime(time.mktime(time.strptime(a, "%Y-%m-%d")) + x*3600*24 + 3600))
    return ret
where a is the date and x is the number of days I want to add. The result is another date.
I need a function where I can give two dates and the result would be an int with date difference in days.
Use - to get the difference between two datetime objects and take the days member.
from datetime import datetime
def days_between(d1, d2):
    d1 = datetime.strptime(d1, "%Y-%m-%d")
    d2 = datetime.strptime(d2, "%Y-%m-%d")
    return abs((d2 - d1).days)
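As a side note, the addonDays helper from the question goes through time.mktime and adds an extra hour, presumably to dodge daylight-saving shifts. A sketch of the same operation with timedelta, which works on calendar days and needs no such offset, could look like this (add_days is a made-up name):

```python
from datetime import datetime, timedelta

def add_days(day, n):
    """Return the 'YYYY-MM-DD' date that is n days after (or before) `day`."""
    parsed = datetime.strptime(day, "%Y-%m-%d")
    return (parsed + timedelta(days=n)).strftime("%Y-%m-%d")

print(add_days("2013-01-01", 31))
print(add_days("2013-03-01", -1))
```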
Another short solution:
from datetime import date
def diff_dates(date1, date2):
    return abs(date2 - date1).days
def main():
    d1 = date(2013, 1, 1)
    d2 = date(2013, 9, 13)
    result1 = diff_dates(d2, d1)
    print('{} days between {} and {}'.format(result1, d1, d2))
    print("Happy programmer's day!")
main()
You can use the third-party library dateutil, which is an extension for the built-in datetime.
Parsing dates with the parser module is very straightforward:
from dateutil import parser
date1 = parser.parse('2019-08-01')
date2 = parser.parse('2019-08-20')
diff = date2 - date1
print(diff)
print(diff.days)
Answer based on the one from this deleted duplicate
I tried the code posted by larsmans above, but there are a couple of problems:
1) The code as is will throw the error as mentioned by mauguerra
2) If you change the code to the following:
...
d1 = d1.strftime("%Y-%m-%d")
d2 = d2.strftime("%Y-%m-%d")
return abs((d2 - d1).days)
This will convert your datetime objects to strings, but that causes two problems:
1) Trying to do d2 - d1 will fail, as you cannot use the minus operator on strings, and
2) The first line of the above answer says to use the - operator on two datetime objects, but you have just converted them to strings
What I found is that you literally only need the following:
import datetime
end_date = datetime.datetime.utcnow()
start_date = end_date - datetime.timedelta(days=8)
difference_in_days = abs((end_date - start_date).days)
print(difference_in_days)
Try this:
import numpy as np
import pandas as pd
data=pd.read_csv(r'C:\Users\Desktop\Data Exploration.csv')  # raw string so the backslashes are not treated as escapes
data.head(5)
first=data['1st Gift']
last=data['Last Gift']
maxi=data['Largest Gift']
l_1=np.mean(first)-3*np.std(first)
u_1=np.mean(first)+3*np.std(first)
m=np.abs(data['1st Gift']-np.mean(data['1st Gift']))>3*np.std(data['1st Gift'])
pd.value_counts(m)
l=first[m]
data.loc[m,'1st Gift']=np.mean(data['1st Gift'])+3*np.std(data['1st Gift'])
data['1st Gift'].head()
m=np.abs(data['Last Gift']-np.mean(data['Last Gift']))>3*np.std(data['Last Gift'])
pd.value_counts(m)
l=last[m]
data.loc[m,'Last Gift']=np.mean(data['Last Gift'])+3*np.std(data['Last Gift'])
data['Last Gift'].head()
I tried a couple of approaches, but ended up using something as simple as this (in Python 3):
from datetime import datetime
df['difference_in_datetime'] = abs(df['end_datetime'] - df['start_datetime'])
If your start_datetime and end_datetime columns are in datetime64[ns] format, pandas subtracts them directly and returns the difference as days plus a time component, in timedelta64[ns] format.
If you want to see only the difference in days, you can first extract the date portion of start_datetime and end_datetime (the same approach works for the time portion, via .dt.time):
df['start_date'] = df['start_datetime'].dt.date
df['end_date'] = df['end_datetime'].dt.date
And then run:
df['difference_in_days'] = abs(df['end_date'] - df['start_date'])
pd.date_range('2019-01-01', '2019-02-01').shape[0]