Get age from timestamp - python

I have a dataframe with timestamp of BirthDate = 2001-10-10 11:01:04.343
How can I get an actual age?
I tried like that:
i.loc[0, "BirthDate"] = pd.to_datetime('today').normalize() - i.loc[0, "BirthDate"].normalize()
output is: 7248 days 00:00:00
but is there a better method that gives me just the output 19 years?
If I use:
(i.loc[0, "BirthDate"] = pd.to_datetime('today').normalize() - i.loc[0, "BirthDate"].normalize())/365
the output is:
19 days 20:34:50.958904109 and it is type <class 'pandas.timedeltas.Timedelta'>

The timedelta result is misleading because you are dividing a duration by the plain number 365. The value 19 days 20:34:50.958904109 is 19.86 days, and it actually stands for 19.86 years.
In some more detail, you are taking a value which is a duration of 7248 days and dividing it by 365, so the result is still a duration, namely 1/365 of the input. The proper way to get the result as a number of years is to divide by another timedelta.
>>> from datetime import timedelta
>>> t = timedelta(days=7248)
>>> 7248/365.0
19.85753424657534
>>> print(t)
7248 days, 0:00:00
>>> t/timedelta(days=365)
19.85753424657534
>>> # years
How exactly to represent a year is not completely well-defined. You could use timedelta(days=365.2425), the average length of a year in the Gregorian calendar, but then of course that produces odd results if you try to convert the value back to a resolution where hours and minutes are important.
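As a rough sketch of the division approach, the helper below (the name years_between is made up for illustration) divides a timedelta by a one-year timedelta. It assumes the 365.2425-day average year, so the result is an approximation and not a calendar-exact age.

```python
from datetime import timedelta

# Assumption: one "year" is the average Gregorian year of 365.2425 days.
AVERAGE_YEAR = timedelta(days=365.2425)

def years_between(delta):
    """Return the length of a timedelta as a float number of average years."""
    return delta / AVERAGE_YEAR

age = years_between(timedelta(days=7248))
print(int(age))  # whole years only: 19
```

Truncating with int() drops the fractional year, which is usually what "age" means; near a birthday the average-year assumption can still be off by a day.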

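First, strip the time part from the timestamp so that only the date remains. Assuming the column then holds strings in day/month/year format, the following Python code can be applied:
from datetime import datetime, date
def calculate_age(born):
    # Parse the day/month/year string into a date
    born = datetime.strptime(born, "%d/%m/%Y").date()
    today = date.today()
    # Subtract one year if this year's birthday has not happened yet
    return today.year - born.year - ((today.month, today.day) < (born.month, born.day))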
df['Age'] = df['Date of birth'].apply(calculate_age)
print(df)
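The same calendar comparison can probably be applied without trimming the timestamp first. The sketch below is one possible variant: it parses the full timestamp from the question and takes the reference date as a parameter (the fixed date used here is a hypothetical "today", chosen only to make the result reproducible).

```python
from datetime import date, datetime

def calculate_age(born, today):
    """Whole years between two dates, counting a year only once the birthday has passed."""
    return today.year - born.year - ((today.month, today.day) < (born.month, born.day))

# The timestamp from the question, parsed with its time part left in place.
born = datetime.strptime("2001-10-10 11:01:04.343", "%Y-%m-%d %H:%M:%S.%f")

# Hypothetical reference date, fixed so the result is reproducible.
print(calculate_age(born, date(2021, 8, 14)))  # 19
```

Because the function only reads the year, month and day attributes, it should also accept pandas Timestamp values, which expose the same attributes.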

Related

Python3 - datetime function -- clarification on .days meaning

I am wanting to take user input of date1 and date2 and calculate the difference to determine how many weeks are in between. My entire program is below. The example dates I'm using are:
date1 = 2023-03-15
date2 = 2022-11-09
Output is Number of weeks: 18 -- which is correct.
My 1st question that I need help in clarifying is why do I need the .days after the days = abs(date2-date1).days? I have searched for many hours via Google, stackoverflow, Youtube and Python docs https://docs.python.org/3.9/library/datetime.html?highlight=datetime#module-datetime. I'm pretty new to Python and reading the docs sometimes trips me up, so please forgive if it's in there -- I've struggled reading through some of it. Why is the .days needed? I know that if I remove .days, the output is: Number of weeks: 18 days, 0:00:00. Where is the documentation on needing the .days listed in the datetime module docs??? Can someone help me understand this please?
My 2nd question is why do I get Number of weeks: 0 when I change .days to .seconds? (this is when I was testing things and comment out the weeks = days//7 and print out days) The one part in the docs that I think addresses this the following: https://docs.python.org/3.9/library/datetime.html?highlight=datetime#module-datetime:~:text=the%20given%20year.-,Supported%20operations%3A,!%3D.%20The%20latter%20cases%20return%20False%20or%20True%2C%20respectively.,-In%20Boolean%20contexts.... and if this is correct, am I reading it correctly that if the difference in dates are to be determined, only "days" are returned, and thus no seconds or microseconds?
Thank you for your help! Code below:
#Find the number of weeks between two given dates
from datetime import datetime
#User input for 1st date in YYYY-MM-DD format
date1 = input("Enter 1st date in YYYY-MM-DD format: ")
date1 = datetime.strptime(date1, "%Y-%m-%d")
#User input for 2nd date in YYYY-MM-DD format
date2 = input("Enter 2nd date in YYYY-MM-DD format: ")
date2 = datetime.strptime(date2, "%Y-%m-%d")
#Calculate the weeks between the 2 given dates
days = abs(date2-date1).days
weeks = days//7
print("Number of weeks: ", weeks)
Output-correct answer with .days included:
Enter 1st date in YYYY-MM-DD format: 2023-03-15
Enter 2nd date in YYYY-MM-DD format: 2022-11-09
Number of weeks: 18
Output-with no .days added:
Enter 1st date in YYYY-MM-DD format: 2023-03-15
Enter 2nd date in YYYY-MM-DD format: 2022-11-09
Number of weeks: 18 days, 0:00:00
Output (regarding the 2nd question, with .seconds put in place of .days):
Enter 1st date in YYYY-MM-DD format: 2023-03-15
Enter 2nd date in YYYY-MM-DD format: 2022-11-09
Number of weeks: 0
Subtracting two datetime.datetime objects returns a datetime.timedelta object:
>>> date1 = datetime.strptime('2023-03-15', "%Y-%m-%d")
>>> date2 = datetime.strptime('2022-11-09', "%Y-%m-%d")
>>> date2-date1
datetime.timedelta(days=-126)
>>>
From the docs:
Only days, seconds and microseconds are stored internally. Arguments are converted to those units:
A millisecond is converted to 1000 microseconds.
A minute is converted to 60 seconds.
An hour is converted to 3600 seconds.
A week is converted to 7 days.
Here is some example usage of datetime.timedelta objects. So for your second question, I believe that you're right: both of your datetimes fall exactly on midnight, so their difference is a whole number of days and .seconds is 0. For .seconds to be nonzero, the difference would need a leftover part of at least one second (1,000,000 microseconds) but less than a full day (86,400 seconds).
TL;DR: The answer to both of your questions is "because that is a property of the datetime.timedelta class."
More fully, the date1 and date2 objects you create in your code are both instances of the datetime.datetime class. The - operation between them makes a timedelta object.
Why is the .days needed to avoid printing ", 0:00:00"?
By default, printing a timedelta shows all of its information (the days plus the hours:minutes:seconds part). That is still true after the //7 operation, because a timedelta floor-divided by an integer is another timedelta. When you use the . operator, you access the days attribute of the timedelta object, which is a plain integer, and only that attribute's value is used.
Why do I get "Number of weeks: 0" when I change .days to .seconds?
The value of the seconds attribute of the timedelta object you created with the operation abs(date2-date1) is integer 0.
See below:
>>> from datetime import datetime
>>> date1 = "2023-03-15"
>>> date1, type(date1)
('2023-03-15', <class 'str'>)
>>> date1 = datetime.strptime(date1, "%Y-%m-%d")
>>> date1, type(date1)
(datetime.datetime(2023, 3, 15, 0, 0), <class 'datetime.datetime'>)
>>> date2 = "2022-11-09"
>>> date2, type(date2)
('2022-11-09', <class 'str'>)
>>> date2 = datetime.strptime(date2, "%Y-%m-%d")
>>> date2, type(date2)
(datetime.datetime(2022, 11, 9, 0, 0), <class 'datetime.datetime'>)
>>> abs(date2-date1), type(abs(date2-date1))
(datetime.timedelta(days=126), <class 'datetime.timedelta'>)
>>> abs(date2-date1).days, type(abs(date2-date1).days)
(126, <class 'int'>)
>>> abs(date2-date1).seconds, type(abs(date2-date1).seconds)
(0, <class 'int'>)
See also: this discussion.
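To make the difference between the attributes concrete, here is a small sketch using a hypothetical second datetime that is not at midnight. It shows that .seconds holds only the part beyond whole days, while total_seconds() covers the whole duration.

```python
from datetime import datetime

# Hypothetical datetimes: the first one is at 06:30, not at midnight.
delta = datetime(2023, 3, 15, 6, 30) - datetime(2022, 11, 9)

print(delta.days)             # 126
print(delta.seconds)          # 23400 (6 h 30 min, the part beyond whole days)
print(delta.total_seconds())  # 10909800.0 (the whole duration in seconds)
```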
To be precise, you are calculating the difference in 7-day intervals, but a week is actually a period that starts on Monday (or Sunday) and ends on Sunday (or Saturday). So the difference between 2022-10-01 (Sat) and 2022-10-04 (Tue) in weeks of the year is 1, but in days it is 3 (that is, zero 7-day intervals in your case).
So if you need to find the distance between two dates in weeks of the year, you have to take the weekdays into account:
from datetime import date
d1 = date(2022,10,18)
d2 = date(2022,10,5)
w1 = d1.weekday()
w2 = d2.weekday()
# difference in weeks of year = 2
((d1-d2).days - (w1-w2))/7 # 2.0
# difference in days = 13
d1-d2 # datetime.timedelta(days=13)
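One way to package the weekday adjustment is to snap each date back to the Monday of its week and then count 7-day steps between the two Mondays. This is only a sketch: the function names are invented here, and it assumes weeks start on Monday.

```python
from datetime import date, timedelta

def week_start(d):
    """Return the Monday of the week containing d (weeks assumed to start on Monday)."""
    return d - timedelta(days=d.weekday())

def calendar_weeks_between(d1, d2):
    """Number of week boundaries crossed between two dates."""
    return abs((week_start(d1) - week_start(d2)).days) // 7

print(calendar_weeks_between(date(2022, 10, 1), date(2022, 10, 4)))   # 1
print(calendar_weeks_between(date(2022, 10, 18), date(2022, 10, 5)))  # 2
```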

How to convert numbers to dates

The employee number is composed of the year and month plus a 3-digit control number. How can I find the number of years they have worked, based on today's date? Employee1 201011003, employee2 200605015
You can use datetime library like this:
from datetime import date
date_str = '201011003'
year = int(date_str[0:4])
month = int(date_str[4:6])
d = date(year, month, 1)
year_delta = (date.today() - d).days // 365
print(year_delta)
You can use datetime.strptime to read the date string into a datetime object. By subtracting two datetime objects you'll get back a timedelta object, which you can use to compute the years the employee has been there.
from datetime import datetime
def get_date(s):
    return datetime.strptime(s[:6], '%Y%m')
Examples
>>> get_date('201011003')
datetime.datetime(2010, 11, 1, 0, 0)
>>> get_date('200605015')
datetime.datetime(2006, 5, 1, 0, 0)
Depending on the precision you want, you can approximate the number of years the employee has been there like
def get_years(s):
    start = datetime.strptime(s[:6], '%Y%m')
    now = datetime.now()
    return (now - start).days / 365.25
>>> get_years('201011003')
9.527720739219713
>>> get_years('200605015')
14.03148528405202
To get very accurate results, I suggest using the dateutil package. It contains a very powerful class called relativedelta that gives you the years, months and days that have passed since the day you are interested in, taking leap years into account (instead of just days, as datetime.timedelta does).
Also, just as CoryKramer did, we can use the strptime function to parse the date from the employee's codes you have.
import datetime as dt
from dateutil.relativedelta import relativedelta
employee = '201011003'
date_joined = dt.datetime.strptime(employee[:6], '%Y%m')
result = relativedelta(dt.datetime.today(), date_joined)
print('The employee has been working for {} years, {} months and {} days'.format(
result.years, result.months, result.days))
Outputs
The employee has been working for 9 years, 6 months and 11 days
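If only whole years are needed and installing dateutil is not an option, a calendar comparison with the standard library may be enough. The sketch below is one such approach; the function name years_of_service and the fixed reference date are assumptions made for illustration, and the joining day is taken to be the first of the month.

```python
from datetime import date

def years_of_service(employee_number, today):
    """Whole years since the first day of the joining month."""
    year = int(employee_number[:4])
    month = int(employee_number[4:6])
    start = date(year, month, 1)
    # Subtract one year if this year's anniversary has not been reached yet.
    return today.year - start.year - ((today.month, today.day) < (start.month, start.day))

reference = date(2020, 5, 12)  # hypothetical "today", fixed for reproducibility
print(years_of_service('201011003', reference))  # 9
print(years_of_service('200605015', reference))  # 14
```

In real use, date.today() would be passed as the second argument.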

Given a specific datetime, how do I subtract it by day?

I'm aware of how to subtract the current date from a day datetime.datetime.now() - timedelta(days=n_days), but how do I subtract a specific day (datetime format) from a number of days?
Thanks in advance.
I tried subtracting the datetime directly from timedelta(days=n_days), but it gave a type error.
what I got:
difference = a_datetime - timedelta(days=n_days)
but it gave a type error.
expected result
difference = something - timedelta(days=n_days)
should result the date n days from date something
Below code works:
import datetime
dt = datetime.date(2019, 1, 23)
print(dt)
# Subtracting a timedelta from a date gives a new date
new_dt = dt - datetime.timedelta(days=1)
print(new_dt)
Output:
2019-01-23
2019-01-22
Speculation: You seem to be missing a datetime before timedelta in your code
Are you sure you want to subtract a datetime from a number of days? Think about it: You're trying to do:
e.g: 203 days - now
203 - 12/02/2019
Interpret current date as days?
203 - 737510.75
= -737307.75
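The order of the operands is what matters here. A short sketch with made-up values shows that a datetime minus a timedelta works, while a timedelta minus a datetime raises TypeError.

```python
from datetime import datetime, timedelta

a_datetime = datetime(2019, 1, 23, 8, 30)  # hypothetical specific datetime
n_days = 10

# datetime - timedelta is defined and yields a new datetime.
earlier = a_datetime - timedelta(days=n_days)
print(earlier)  # 2019-01-13 08:30:00

# timedelta - datetime is not defined, so Python raises TypeError.
try:
    timedelta(days=n_days) - a_datetime
    reversed_ok = True
except TypeError:
    reversed_ok = False
print(reversed_ok)  # False
```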

Subtracting Dates With Python

I'm working on a simple program to tell an individual how long they have been alive.
I know how to get the current date, and get their birthday. The only problem is I have no way of subtracting the two, I know a way of subtracting two dates, but unfortunately it does not include hours, minutes, or seconds.
I am looking for a method that can subtract two dates and return the difference down to the second, not merely the day.
from datetime import datetime
birthday = datetime(1988, 2, 19, 12, 0, 0)
diff = datetime.now() - birthday
print(diff)
# 8954 days, 7:03:45.765329
Use UTC time, otherwise the age in seconds can go backwards during a DST transition:
from datetime import datetime, timezone
born = datetime(1981, 12, 2, tzinfo=timezone.utc) # provide UTC time
age = datetime.now(timezone.utc) - born
print(age.total_seconds())
You also can't use local time if your program runs on a computer in a different place (timezone) from where the person was born, or if the time rules in that place have changed since the birthday. Either can introduce an error of several hours.
If you want to take into account leap seconds then the task becomes almost impossible.
When subtracting two datetime objects you will get a new datetime.timedelta object.
from datetime import datetime
x = datetime.now()
y = datetime.now()
delta = y - x
It will give you the time difference with a resolution down to microseconds.
For more information take a look at the official documentation.
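To report such a difference down to the second, the timedelta can be split with divmod. The following is a minimal sketch; the helper name breakdown and the fixed "now" value are assumptions chosen so the result is reproducible.

```python
from datetime import datetime

def breakdown(delta):
    """Split a non-negative timedelta into (days, hours, minutes, seconds)."""
    minutes, seconds = divmod(delta.seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return delta.days, hours, minutes, seconds

birthday = datetime(1988, 2, 19, 12, 0, 0)
now = datetime(2012, 8, 25, 19, 3, 45)  # hypothetical fixed "now"

diff = now - birthday
print(breakdown(diff))       # (8954, 7, 3, 45)
print(diff.total_seconds())  # 773651025.0
```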
Create a datetime.datetime from your date:
datetime.datetime.combine(birthdate, datetime.time())
Now you can subtract it from datetime.datetime.now().
>>> from datetime import date, datetime, time
>>> bday = date(1973, 4, 1)
>>> datetime.now() - datetime.combine(bday, time())
datetime.timedelta(days=14392, seconds=4021, microseconds=789383)
>>> print(datetime.now() - datetime.combine(bday, time()))
14392 days, 1:08:13.593813
import datetime
born = datetime.date(2002, 10, 31)
today = datetime.date.today()
age = today - born
print(age.total_seconds())
Output: 463363200.0
Since datetime.datetime is an immutable type, operations like these always produce a new object. The difference of two datetime objects is a datetime.timedelta:
from datetime import date,datetime,time,timedelta
dt=datetime.now()
print(dt)
dt2=datetime(1997,7,7,22,30)
print(dt2)
delta=dt-dt2
print(delta)
print(delta.days // 365, "Year")
print(abs(12 - (dt2.month - dt.month)), "Month")
print(dt.day, "Day")
A printed timedelta such as 8747 days, 23:48:42.94 (the exact value depends on when you run the code) indicates that the timedelta encodes an offset of 8747 days, 23 hours and 48 minutes ...
The Output
2021-06-19 22:27:36.383761
1997-07-07 22:30:00
8747 days, 23:57:36.383761
23 Year
11 Month
19 Day

Compute number of dates between two string dates and return an integer

I have a .txt file data-set like this with the date column of interest:
1181206,3560076,2,01/03/2010,46,45,M,F
2754630,2831844,1,03/03/2010,56,50,M,F
3701022,3536017,1,04/03/2010,40,38,M,F
3786132,3776706,2,22/03/2010,54,48,M,F
1430789,3723506,1,04/05/2010,55,43,F,M
2824581,3091019,2,23/06/2010,59,58,M,F
4797641,4766769,1,04/08/2010,53,49,M,F
I want to work out the number of days between each date and 01/03/2010 and replace the date with the days offset {0, 2, 3, 21...} yielding an output like this:
1181206,3560076,2,0,46,45,M,F
2754630,2831844,1,2,56,50,M,F
3701022,3536017,1,3,40,38,M,F
3786132,3776706,2,21,54,48,M,F
1430789,3723506,1,64,55,43,F,M
2824581,3091019,2,114,59,58,M,F
4797641,4766769,1,156,53,49,M,F
I've been trying for ages and it's getting really frustrating. I've tried converting to datetime using the datetime.datetime.strptime( '01/03/2010', "%d/%m/%Y").date() method and then subtracting the two dates, but it gives me an output of e.g. '3 days, 0:00:00' and I can't seem to get an output of only the number!
The difference between two dates is a timedelta. Any timedelta instance has a days attribute, which is the integer value you want.
This is fairly simple. Using the code you gave:
date1 = datetime.datetime.strptime('01/03/2010', '%d/%m/%Y').date()
date2 = datetime.datetime.strptime('04/03/2010', '%d/%m/%Y').date()
You get two date objects.
(date2-date1)
will give you the time delta. The mistake you're making is to convert that timedelta to a string. timedelta objects have a days attribute. Therefore, you can get the number of days using it:
(date2-date1).days
This generates the desired output.
Using your input (a bit verbose...)
#!/usr/bin/env python
import datetime
with open('input') as fd:
    # Reference date that every row is compared against
    d_first = datetime.date(2010, 3, 1)
    for line in fd:
        # The date is the fourth comma-separated field, in day/month/year order
        date = line.split(',')[3]
        day, month, year = date.split('/')
        d = datetime.date(int(year), int(month), int(day))
        diff = d - d_first
        print(diff.days)
Gives
0
2
3
21
64
114
156
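To produce the full rows with the date replaced, as the question asks, one option is the csv module together with strptime. This is a sketch under assumptions: the sample rows are embedded in a string so the example is self-contained, whereas a real script would open the .txt file instead.

```python
import csv
import io
from datetime import datetime

# Sample rows from the question; a real script would read them from the file.
data = """1181206,3560076,2,01/03/2010,46,45,M,F
2754630,2831844,1,03/03/2010,56,50,M,F
3701022,3536017,1,04/03/2010,40,38,M,F
3786132,3776706,2,22/03/2010,54,48,M,F
1430789,3723506,1,04/05/2010,55,43,F,M
2824581,3091019,2,23/06/2010,59,58,M,F
4797641,4766769,1,04/08/2010,53,49,M,F
"""

FIRST = datetime.strptime('01/03/2010', '%d/%m/%Y').date()

rows = []
for row in csv.reader(io.StringIO(data)):
    day = datetime.strptime(row[3], '%d/%m/%Y').date()
    # Replace the date field with its offset in days from the reference date.
    row[3] = str((day - FIRST).days)
    rows.append(row)

for row in rows:
    print(','.join(row))
```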
Have a look at PLEAC; it has a lot of date examples using Python.
