Getting the age from a Mexican RFC (tax ID) - python

I have a data frame and I'm trying to get the age of each user. The problem is that there is no birth date column, but here in my country there is a tax ID (the RFC) from which you can recover it:
ABCD971021XZY or ABCD971021
The first 4 letters come from the person's name and last name, and the 6 digits are the date of birth (YYMMDD).
In the case above that would be 1997/10/21.
So far I have tried this:
# To slice the RFC
df_v['new_column'] = df_v['RFC'].apply(lambda x: x[4:10])
# Trying to get the date
from datetime import datetime, timedelta
s = "971021"
date = datetime(year=int(s[0:2]), month=int(s[2:4]), day=int(s[4:6]))
OUT: 0097-10-21
What I'm looking for is a result like this:
1997-10-21

The problem is that the millennium and century are not given explicitly in the tax ID, and there is no single way to convert a two-digit year to a four-digit year.
E.g. 971021 tells you that the birth year is xx97, but for all datetime knows, that could mean the year 1597 or 1097 or 2397.
You as the programmer will have to decide how to encode your assumptions about which millennium and century a person was most likely born in. For example, a simplistic (untested) solution could be:
year_last_two = int(s[0:2])
# If the year given is less than 20, this person was most likely born in the 2000s
if year_last_two < 20:
    year = 2000 + year_last_two
# Otherwise, the person was most likely born in the 1900s
else:
    year = 1900 + year_last_two
date = datetime(year=year, month=int(s[2:4]), day=int(s[4:6]))
Of course, this solution only applies in 2019, and also assumes no one is more than 100 years old. You could make it better by using the current year as the splitting point.
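Using the current year as the splitting point could look roughly like the sketch below. The names rfc_birthdate and age_in_years are made up for illustration, and the code assumes the date always sits in characters 4 to 9 of the RFC (as in the two examples in the question) and that nobody in the data is 100 or older.

```python
from datetime import date


def rfc_birthdate(rfc, today=None):
    """Guess the birth date from an RFC (hypothetical helper).

    Assumes the six digits at positions 4..9 are YYMMDD.
    """
    today = today or date.today()
    digits = rfc[4:10]
    yy, mm, dd = int(digits[0:2]), int(digits[2:4]), int(digits[4:6])
    # Two-digit years up to the current one are read as 20xx, the rest as 19xx.
    century = 2000 if yy <= today.year % 100 else 1900
    return date(century + yy, mm, dd)


def age_in_years(birth, today=None):
    """Completed years between birth and today."""
    today = today or date.today()
    # Subtract one year if this year's birthday has not happened yet.
    before_birthday = (today.month, today.day) < (birth.month, birth.day)
    return today.year - birth.year - before_birthday
```

With the data frame from the question, something like df_v['RFC'].apply(rfc_birthdate) should then give a column of dates, as long as every RFC follows the same layout.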

Related

How to get a year-week text in Python?

I have a table which contains information on the number of changes done on a particular day. I want to add a text field to it in the format YYYY-WW (e. g. 2022-01) which indicates the week number of the day. I need this information to determine in what week the total number of changes was the highest.
How can I determine the week number in Python?
Below is the code based on this answer:
week_nr = day.isocalendar().week
year = day.isocalendar().year
week_nr_txt = "{:4d}-{:02d}".format(year, week_nr)
At first glance it seems to work, but I am not sure that week_nr_txt will contain the year-week tuple according to the ISO 8601 standard.
Will it?
If not, how do I need to change my code in order to avoid any week-related errors (see the example below)?
Example of a week-related error: In year y1 there are 53 weeks and the last week spills over into the year y1+1.
The correct year-week tuple is y1-53. But I am afraid that my code above will result in y2-53 (y2=y1+1) which is wrong.
You can use Python's datetime module like this:
from datetime import datetime
date = datetime(year, month, day)
# and format the datetime object like this:
date.strftime('%Y-%U')
This gives you the calendar year and a week number for the day. Note that %U counts weeks starting on Sunday and puts the days before the first Sunday of the year in week 00, so it is not the ISO 8601 week number.
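For the ISO 8601 question itself: isocalendar() returns the ISO year together with the ISO week, so taking both values from it (as the code in the question does) keeps them consistent around New Year. A small sketch to check the edge cases, where iso_year_week is just an illustrative name:

```python
from datetime import date


def iso_year_week(day):
    """Format a date as ISO year and ISO week, e.g. '2020-53'."""
    # isocalendar() yields (ISO year, ISO week, ISO weekday); tuple unpacking
    # also works on Python versions older than 3.9, where .week and .year
    # are not available as attributes.
    iso_year, iso_week, _ = day.isocalendar()
    return "{:04d}-{:02d}".format(iso_year, iso_week)


print(iso_year_week(date(2021, 1, 1)))    # 2020-53, still the last week of 2020
print(iso_year_week(date(2019, 12, 30)))  # 2020-01, already the first week of 2020
```

The mismatch described in the question only shows up if the week comes from isocalendar() but the year comes from day.year.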

Why is my program outputting the wrong number of days?

So I am writing a program to work out the number of days you have been alive after inputting your birthday. There is a problem: I am getting the wrong number of days but can't figure out why. I entered my birthday as 04/04/19 and got 730625 days, which is clearly wrong.
import datetime #imports module
year = int(input("What year were you born in"))
month = int(input("What month where you born in (number)"))
date = int(input("What date is your birthday? "))
birthdate = datetime.date(date, month, year) #converts to dd/mm/yy
today = datetime.date.today() #todays date
daysAlive = (today - birthdate).days #calculates how many days since birth
print("You have been alive for {} days.".format(daysAlive)) #outputs result
I initially got the same error as you but then I checked my code and managed to fix my mistake.
So your DOB is 04/04/19, when you input that into datetime.date() and it looks at the value for year which is 19, it will treat that as 0019. As in 19 AD, not 2019. You should make sure that you input the full year.
Also like SimonN said, the parameters for datetime.date() are year, month, day, not the other way around.
You have the parameters the wrong way round in datetime.date; they should be (year, month, day).
datetime.date takes its arguments as (year, month, day). Note that you cannot enter a year like 09 for 2009; datetime will treat it as 0009-MM-DD. You have to enter the complete year, 2009, in the input.
...
birthdate = datetime.date(year, month, date)
...
So, with your input, the output for me is (It may differ with your timezone):
You have been alive for 170 days.
class datetime.date(year, month, day)
The arguments must be given in the order year, month, day, with the full four-digit year.
Try this code for Python 3.6 or higher
(because of f-strings):
import datetime
year = int(input("What year were you born in: "))
month = int(input("What month were you born in (number): "))
day = int(input("What day were you born in: "))
birth_date = datetime.date(year, month, day) # builds the date from year, month, day
today = datetime.date.today() # today's date
days_alive = (today - birth_date).days # calculates how many days since birth
print(f"You are {days_alive} days old.") # outputs result
Check the answer using other sources.
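If you would rather keep typing the birthday as dd/mm/yy, one option is to let strptime do the parsing. This is only a sketch: days_alive is a made-up name, and it relies on the %y convention that 69 to 99 mean 1969 to 1999 and 00 to 68 mean 2000 to 2068, which will be wrong for anyone born before 1969.

```python
from datetime import date, datetime


def days_alive(birthday_text, today=None):
    """Days between a dd/mm/yy birthday string and today."""
    today = today or date.today()
    # %y expands the two-digit year, so "19" becomes 2019 rather than year 19.
    birth = datetime.strptime(birthday_text, "%d/%m/%y").date()
    return (today - birth).days


# A fixed "today" is passed in here only to make the example reproducible.
print(days_alive("04/04/19", today=date(2019, 9, 21)))  # 170
```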

Checking errors in year, month, day input

1. The date entered should be after 15/10/1582
2. Leap years should be taken into account.
3. Even when "ctrl + c" or letters are entered, the program should keep going (use try...except)
4. Repeat until 0 is entered in 'year'.
This is what I tried:
while True:
    year = int(input("Year: "))
    if year == 0:
        break
    month = int(input("Month: "))
    day = int(input("Days: "))
I completely can't think of how to solve this, so I'd like to get some hints how I should deal with this problem!
Year: 2019 Month: 0 Day: 12 There is only January ~ December
Year: 2019 Month: 1 Day: 0 Day should be at least 1
Year: 2019 Month: 1 Day: 32 January is up to 31
Year: 2020 Month: 2 Day: 30 2020 is a leap year, but February is up to 29
Year: 2019 Month: 2 Day: 29 2019 is not a leap year, so February is up to 28
Year: 1582 Month: 1 Day: 1 1/1/1582 is before the Gregorian calendar started
Year: 2019 Month: 1 Day: 8 OK
Year: 0
Well, the first thing to do, obviously, is to check whether your user inputs are proper numerics. As mentioned in the instructions, this can be done using exception handling (try/except blocks). Exception handling is documented, so first check the docs and use the interactive shell to test things out until you get how it works... Just a couple of hints here: only catch the exact exceptions you expect at a given point, and keep as little code as possible in the try block so you're sure you only catch the exceptions raised by this exact piece of code.
(NB : Note that this can ALSO be done without exception handling by testing the content of the strings returned by input() _before_ passing them to int(), but you're obviously expected to use exception handling here, cf the "Use try/except" mention.)
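To make those two hints concrete, here is a minimal sketch. parse_int and read_int are hypothetical helper names; each try block wraps a single call and catches only the one exception that call is expected to raise.

```python
def parse_int(text):
    """Return text as an int, or None if it is not a whole number."""
    try:
        return int(text)
    except ValueError:  # raised for input such as "abc" or "12.5"
        return None


def read_int(prompt):
    """Keep asking until the user types a whole number."""
    while True:
        try:
            text = input(prompt)
        except KeyboardInterrupt:  # Ctrl+C pressed while waiting for input
            print()
            continue
        value = parse_int(text)
        if value is not None:
            return value
        print("Please enter a whole number")
```

The loop from the question could then call read_int("Year: ") and so on in place of int(input(...)).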
The second thing is to validate that the individual values entered for day, month and year are in the expected range; i.e. there are only 12 months, so for this variable, any value lower than 1 (January) or higher than 12 (December) is invalid.
Note that since the number of days in a month changes from month to month and, for february, can change from year to year, you can only validate days once you know the month and the year.
I suggest you first make the "day" validation work without taking care of leap years, and only then take care of the leap year special case. As often, a good data structure is key to simple effective code, so read about the standard basic Python data types (lists, dicts, tuples etc) and think about which of those types you could use to map a month number to how many days it has (for a non leap year, that is).
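One possible shape for this, using a plain list as the month-to-days mapping, is sketched below. The function names and the returned messages are invented for the example, and it assumes 15/10/1582 itself counts as the first valid date.

```python
# Days per month for a non-leap year; index 0 is January.
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def is_leap_year(year):
    """Gregorian rule: every 4th year, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def days_in_month(year, month):
    if month == 2 and is_leap_year(year):
        return 29
    return DAYS_IN_MONTH[month - 1]


def check_date(year, month, day):
    """Return 'OK' or a short description of what is wrong."""
    if not 1 <= month <= 12:
        return "month must be between 1 and 12"
    if day < 1:
        return "day must be at least 1"
    if day > days_in_month(year, month):
        return "day is too large for this month"
    # Tuples compare element by element, which orders dates correctly.
    if (year, month, day) < (1582, 10, 15):
        return "date is before the Gregorian calendar started"
    return "OK"
```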
There are quite a few other things to take care of, but first manage to get those first two points working and the rest should not be too difficult.

Convert the date (start and stop) to a time interval so I can compare using Python

Hello, I am working with a .csv file that contains the birth date and death date of all the presidents. The problem I am trying to solve is: in which year were the most presidents alive? I assume that to do this I have to convert the birth and death dates of the presidents to a time series, and that for presidents who are currently alive the death date has to be changed to the present time. Does anyone know how I can go about doing this using Python with the packages pandas and NumPy? Here is the code I have so far:
Also the date is in this format: Feb 22 1732
If the president hasn't died then his death date is blank
#!/usr/bin/python
#simple problem: find the year that the most presidents
#were alive
import pandas as pd
import numpy as np
#import the presidents.csv and save as a dataframe
presidents = pd.read_csv('presidents.csv')
#view the first ten lines of the dataframe
presidents.head(10)
#change the column names to remove whitespace
presidents.columns = ['President','Birth Date','Birth Place','Death Date','Location of Death']
#save the column names of the dataframe into a list
columns_of_pres = list(presidents.columns)
#create a data frame that contains just the name, birth and death date of the president
birth_and_death = presidents[['President','Birth Date','Death Date']]
If your only goal is to find the year in which the most presidents were alive, then you can just:
1) get the year out of your date field: year = int('Feb 22 1732'.split(' ')[-1])
2) for each president, make a list of the years in which he was alive: aliveYears = range(birthYear, deathYear)
3) use collections.Counter() to count in which year you find the most presidents.
Something like this:
from collections import Counter
yearCount = Counter()
for p in presidents:
    # "...." stands for this president's birth / death date string
    birthYear = int(....split(' ')[-1])
    deathYear = int(....split(' ')[-1])
    for year in range(birthYear, deathYear):
        yearCount.update({year})
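A runnable variant of the same idea, as a sketch only: busiest_year is a made-up name, the three lifespans are invented sample data rather than real presidents, a blank death date is treated as "still alive", and unlike the outline above the death year itself is counted as a year alive.

```python
from collections import Counter


def busiest_year(lifespans, current_year):
    """lifespans: (birth_text, death_text) pairs such as ('Feb 22 1732', '').

    Returns (year, count) for the year in which the most people were alive.
    """
    year_count = Counter()
    for birth_text, death_text in lifespans:
        birth_year = int(birth_text.split(' ')[-1])
        # A blank death date means the person is still alive.
        death_year = int(death_text.split(' ')[-1]) if death_text else current_year
        # Counter.update adds one count for every year in the range.
        year_count.update(range(birth_year, death_year + 1))
    return year_count.most_common(1)[0]


sample = [
    ("Jan 1 1875", "Jan 1 1925"),
    ("Jan 1 1900", "Jan 1 1950"),
    ("Jan 1 1925", ""),
]
print(busiest_year(sample, current_year=1975))  # (1925, 3)
```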
Let's assume you have transformed your dataframe in the following format:
president birth_year death_year
President1 1875 1925
President2 1900 1950
President3 1925 1975
(If you need help with that transformation let me know.)
Then the following function will count the number of presidents alive at a given year:
def president_count(year):
    return ((df['birth_year'] <= year) & (df['death_year'] >= year)).sum()
Indeed, ((df['birth_year'] <= year) & (df['death_year'] >= year)) returns a boolean series, with true or false depending on whether the president is alive. You then sum the series to get the number of presidents alive.
You can then use a simple loop to get the maximum.
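The loop over years might look like the following sketch, which uses the three example rows from the table above and assumes the year columns hold plain integers. Here max with a key function plays the role of the loop.

```python
import pandas as pd

# Example rows from the table above (not real presidents).
df = pd.DataFrame({
    'president': ['President1', 'President2', 'President3'],
    'birth_year': [1875, 1900, 1925],
    'death_year': [1925, 1950, 1975],
})


def president_count(year):
    """Number of rows whose lifespan includes the given year."""
    return int(((df['birth_year'] <= year) & (df['death_year'] >= year)).sum())


# Try every year between the earliest birth and the latest death.
years = range(int(df['birth_year'].min()), int(df['death_year'].max()) + 1)
best_year = max(years, key=president_count)
print(best_year, president_count(best_year))  # 1925 3
```

If several years tie for the maximum, max returns the earliest of them.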

How can I subtract two dates in Python?

So basically what I want to do is subtract the date of birth from today's date in order to get a person's age. I've successfully done this, but I can only get it to show the person's age in days.
import datetime
dateofbirth = 19981128
dateofbirth = list(str(dateofbirth))
now = datetime.date.today()
yr = dateofbirth[:4]
yr = ''.join(map(str, yr))
month = dateofbirth[4:6]
month = ''.join(map(str, month))
day = dateofbirth[6:8]
day = ''.join(map(str, day))
birth = datetime.date(int(yr), int(month), int(day))
age = now - birth
print(age)
In this case, age comes out in days. Is there any way to get it as xx years, xx months and xx days?
You can use strptime:
>>> import datetime
>>> datetime.datetime.strptime('19981128', '%Y%m%d')
datetime.datetime(1998, 11, 28, 0, 0)
>>> datetime.datetime.now() - datetime.datetime.strptime('19981128', '%Y%m%d')
datetime.timedelta(5823, 81486, 986088)
>>> print (datetime.datetime.now() - datetime.datetime.strptime('19981128', '%Y%m%d'))
5823 days, 22:38:18.039365
The result of subtracting two dates in Python is a timedelta object, which just represents a duration. It doesn't "remember" when it starts, and so it can't tell you how many months have elapsed.
Consider that the period from 1st January to 1st March is "two months", and the period from 1st March to 29th April is "1 month and 28 days", but in a non-leap year they're both the same duration, 59 days. (Strictly speaking there is daylight saving time to consider as well, but let's not make this any more complicated than it needs to be to make the point ;-)
There may be a third-party library that helps you, but as far as standard Python libraries are concerned, AFAIK you'll have to roll your sleeves up and do it yourself by finding the differences of the day/month/year components of the two dates in turn. Of course, the month and day differences might be negative numbers so you'll have to deal with those cases. Recall how you were taught to do subtraction in school, and be very careful when carrying numbers from the month column to the days column, to use the correct number of days for the relevant month.
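A do-it-yourself version with only the standard library could look like this sketch. age_ymd is a hypothetical name, and the convention chosen for the awkward cases is to clamp the birth day to the length of the month being borrowed from, so 31 January to 1 March counts as 1 month and 1 day; other conventions are possible.

```python
import calendar
from datetime import date


def age_ymd(birth, today):
    """Return (years, months, days) elapsed from birth to today."""
    years = today.year - birth.year
    months = today.month - birth.month
    days = today.day - birth.day
    if days < 0:
        # Borrow from the month before today's month.
        months -= 1
        prev_month = today.month - 1 or 12
        prev_year = today.year if today.month > 1 else today.year - 1
        prev_len = calendar.monthrange(prev_year, prev_month)[1]
        # Days left in that month after the (clamped) birth day, plus the
        # days already elapsed in the current month.
        days = prev_len - min(birth.day, prev_len) + today.day
    if months < 0:
        # Borrow from the year column.
        years -= 1
        months += 12
    return years, months, days


print(age_ymd(date(1998, 11, 28), date(2014, 11, 7)))  # (15, 11, 10)
```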
