python parse date string with out of range days without error - python

I want to parse a date string and manipulate the year, month, date in cases where I either get '00' for month or day or in cases where I get a day beyond the possible days of that year/month. Given a '2012-00-00' or a '2020-02-31', I get a ValueError. What I want, is to catch the error and then turn the former into '2012-01-01' and the latter to '2020-02-29'. No results on Google so far.
Clarification: I use try/except/ValueError... what I want is to parse out the year, month, day and fix the day or month when they are having a ValueError... without having to code the parsing and regular expressions myself... which defeats the purpose of using a library to begin with.
# Try dateutil
blah = dateutil.parser.parse(date_string, fuzzy=True)
print(blah)
# Try datetime
date_object = datetime.strptime(date_string, date_format)
return_date_string = date_object.date().strftime('%Y-%m-%d')

I know you don't want to parse the date yourself but I think you will probably have to. One option would be to split the incoming string into its component year, month and day parts and check them against valid values, adjusting as required. You can then create a date from that and call strftime to get a valid date string:
from datetime import datetime, date
import calendar
def parse_date(dt):
    [y, m, d] = map(int, dt.split('-'))
    # optional error checking on y
    # ...
    # check month
    m = 1 if m == 0 else 12 if m > 12 else m
    # check day
    last = calendar.monthrange(y, m)[-1]
    d = 1 if d == 0 else last if d > last else d
    return date(y, m, d).strftime('%Y-%m-%d')
print(parse_date('2012-00-00'))
print(parse_date('2020-02-31'))
Output:
2012-01-01
2020-02-29
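Since the question mentions already using try/except, the clamping above can also be kept as a fallback that only runs when normal parsing fails. This is a minimal sketch, not a library feature: normalize_date is a hypothetical name, and the fallback assumes the fixed 'YYYY-MM-DD' layout from the question.

```python
import calendar
from datetime import date, datetime


def normalize_date(text):
    """Return text as a valid 'YYYY-MM-DD' string, clamping month and day if needed."""
    try:
        # Valid dates go through the library parser untouched.
        return datetime.strptime(text, '%Y-%m-%d').date().strftime('%Y-%m-%d')
    except ValueError:
        pass
    # Fallback (assumption: the string is always 'YYYY-MM-DD' with numeric parts).
    y, m, d = (int(part) for part in text.split('-'))
    m = min(max(m, 1), 12)                  # clamp month into 1..12
    last = calendar.monthrange(y, m)[1]     # number of days in that month
    d = min(max(d, 1), last)                # clamp day into 1..last
    return date(y, m, d).strftime('%Y-%m-%d')


print(normalize_date('2012-00-00'))
print(normalize_date('2020-02-31'))
```

A string whose parts are not numeric still raises ValueError from int(), which seems preferable to guessing.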

Related

Choose specific calendar dates in python

I am trying to convert a column of dates from MonthYear form to mm/dd/yyyy. I can do it as a string replace, but it requires 157 lines of code to get all the data changed. I want to be able to take the month and year and output the second Wednesday of the month in mm/dd/yyyy form. Is that possible?
I am currently using this code
df['Column']=df['Column'].str.replace("December2009", "12/11/2009")
I don't know of a standard library tool for this, but it's easy to make your own, something like this:
from datetime import datetime, timedelta
import pandas as pd
test_arr = ['December2009', 'August2012', 'March2015']
def replacer(d):
    # take a datestring of format %B%Y and find the second wednesday
    dt = datetime.strptime(d, '%B%Y')
    x = 0
    # start at day 1 and increment through until conditions satisfied
    while True:
        s = dt.strftime('%A')
        if s == 'Wednesday':
            x += 1 # if a wednesday found, increment the counter
            if x == 2:
                break # when two wednesdays found then break
        dt += timedelta(days = 1)
    return dt.strftime('%m/%d/%Y')
df = pd.DataFrame(test_arr, columns = ['a'])
df['a'].apply(replacer) # .apply() applies the given python function to each element in the df column
Maybe the calendar module, as recommended in the other comments, could make the code look nicer, but I'm unfamiliar with it, so it might be something you want to look into to improve the solution.
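For what it's worth, here is a rough sketch of how the calendar module could replace the day-by-day loop with a little arithmetic. The name second_wednesday is made up for this example, and '%B' assumes English month names (it depends on the locale).

```python
import calendar
from datetime import date, datetime


def second_wednesday(month_year):
    """Turn a string like 'December2009' into the second Wednesday as mm/dd/yyyy."""
    first = datetime.strptime(month_year, '%B%Y')
    # monthrange returns (weekday of the 1st, days in month); Monday is 0.
    first_weekday, _ = calendar.monthrange(first.year, first.month)
    # Days from the 1st to the first Wednesday (0 if the 1st is a Wednesday).
    offset = (calendar.WEDNESDAY - first_weekday) % 7
    day = 1 + offset + 7                    # one week after the first Wednesday
    return date(first.year, first.month, day).strftime('%m/%d/%Y')


for item in ['December2009', 'August2012', 'March2015']:
    print(item, second_wednesday(item))
```

With pandas this could be applied the same way as replacer above, via df['a'].apply(second_wednesday).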

In python, how can I find the start & end dates for a random quarter in the past?

Get start and end date of quarter from date and fiscal year end provides great helper functions to get the current/prior quarter. I'm unable to generalize the prev_quarter_range function to include a quarters_ago param that returns the start & end dates for a random quarter n quarters ago.
Ideally, I want a function named get_quarter_start_end_dates that takes in (dt, quarters_ago) and outputs (start_dt, end_dt). Here are some sample inputs --> outputs:
('2017-01-01', 0) --> ('2017-01-01', '2017-04-01')
('2017-01-01', 1) --> ('2016-10-01', '2017-01-01')
('2017-01-01', 2) --> ('2016-07-01', '2016-10-01')
('2017-02-01', 12) --> ('2014-01-01', '2014-04-01')
How about:
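from dateutil.relativedelta import relativedelta
def get_quarter_start_end_dates(dt, quarters_ago):
    # dt is a date or datetime; snap it to the first day of its quarter first
    quarter_start = dt.replace(month=3 * ((dt.month - 1) // 3) + 1, day=1)
    start_dt = quarter_start - relativedelta(months=3 * quarters_ago)
    end_dt = start_dt + relativedelta(months=3)
    return (start_dt, end_dt)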
The arrow library (for date processing) is very good for this purpose.
Get today's date in tod.
Find the first day of the month by replacing the day in today's date with one.
Suppose we want to go back n=3 quarters.
Then shift the first of the month back by 3*n months.
To get the end of the quarter shift the first of the month back by 3*(n-1) months, then by 1 day.
>>> import arrow
>>> tod = arrow.now()
>>> first_of_month = tod.replace(day=1)
>>> n = 3
>>> n_quarters_back = first_of_month.shift(months=-3*n)
>>> n_quarters_back
<Arrow [2017-04-01T16:56:46.377079-04:00]>
>>> end_of_n_quarters_back = first_of_month.shift(months=-3*(n-1)).shift(days=-1)
>>> end_of_n_quarters_back
<Arrow [2017-06-30T16:56:46.377079-04:00]>
I had this same question and after some trial and error I found the following to work:
def get_n_quarters_back(n_quarters):
    date_in_quarter = arrow.now().shift(months=-3*n_quarters)
    quarter_start, quarter_end = list(arrow.Arrow.interval('quarter',
        date_in_quarter, date_in_quarter))[0]
    return quarter_start, quarter_end
If you want to specify the datetime to shift back from just add the datetime parameter instead of going from now. In this case ref_date would be an Arrow object.
def get_n_quarters_back(ref_date, n_quarters):
    date_in_quarter = ref_date.shift(months=-3*n_quarters)
    quarter_start, quarter_end = list(arrow.Arrow.interval('quarter',
        date_in_quarter, date_in_quarter))[0]
    return quarter_start, quarter_end
NOTE: The arrow documentation states that the end date is an optional parameter on the interval classmethod. However, in practice this does not appear to be the case, because omitting it raises TypeError: interval() missing 1 required positional argument: 'end'.
With Arrow it's quite easy actually
import arrow
def get_quarter_start_end_dates(dt, quarters_ago):
    date_in_past = arrow.get(dt).shift(quarters=-quarters_ago)
    quarter_start_date = date_in_past.floor('quarter').datetime
    quarter_end_date = date_in_past.ceil('quarter').datetime
    return quarter_start_date, quarter_end_date
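If installing a third-party library is not an option, the sample inputs from the question can also be reproduced with the standard library alone by counting quarters as a single integer. This is only a sketch: it assumes ISO 'YYYY-MM-DD' strings in and out (as in the question's samples), treats the end date as the first day of the next quarter, and needs Python 3.7+ for date.fromisoformat.

```python
from datetime import date


def get_quarter_start_end_dates(dt, quarters_ago):
    """Return (start, end) ISO strings for the quarter `quarters_ago` before dt's quarter."""
    d = date.fromisoformat(dt)
    # Number quarters consecutively: year * 4 + (0..3), then step back.
    q_index = d.year * 4 + (d.month - 1) // 3 - quarters_ago
    start = date(q_index // 4, (q_index % 4) * 3 + 1, 1)
    next_q = q_index + 1
    end = date(next_q // 4, (next_q % 4) * 3 + 1, 1)   # first day of the following quarter
    return start.isoformat(), end.isoformat()


print(get_quarter_start_end_dates('2017-01-01', 1))
print(get_quarter_start_end_dates('2017-02-01', 12))
```

Subtract one day from end if an inclusive last day of the quarter is wanted instead.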

Can a datetime.date object without the day be created in python?

I'm trying to enter a date in Python but sometimes I don't know the exact day or month. So I would like to record only the year. I would like to do something like:
datetime.date(year=1940, month="0 or None", day="0 or None")
Is there a code for doing this? Or if not, how would you manage to deal with this problem?
Unfortunately, you can't pass 0 because there is no month 0; you'll get ValueError: month must be in 1..12. You cannot skip the month or the day either, as both are required.
If you do not know the exact month or day, just pass in 1 for the month and day and then keep only the year part.
>>> import datetime
>>> d = datetime.date(year=1940, month=1, day=1)
>>> d
datetime.date(1940, 1, 1)
>>> d.year
1940
>>> d = datetime.date(year=1940, month=1, day=1).year
>>> d
1940
The second statement is a shorthand for the first.
However, if you want to just store the year, you don't need a datetime object. You can store the integer value separately. A date object implies month and day.
Pandas has Period class where you don't have to supply day if you don't know that:
import pandas as pd
pp = pd.Period('2013-12', 'M')
print(pp)
print(pp + 1)
print(pp - 1)
print((pp + 1).year, (pp + 1).month)
Output:
2013-12
2014-01
2013-11
2014 1
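If neither a bare integer nor pandas fits, another option is a small container that keeps the month and day optional. A minimal sketch follows; PartialDate and its earliest() method are hypothetical names invented here, not part of datetime or pandas.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PartialDate:
    """A date whose month and/or day may be unknown (None)."""
    year: int
    month: Optional[int] = None
    day: Optional[int] = None

    def earliest(self) -> date:
        # Fill unknown parts with 1, as suggested above, for sorting or comparison.
        return date(self.year, self.month or 1, self.day or 1)

    def __str__(self) -> str:
        parts = [f"{self.year:04d}"]
        if self.month is not None:
            parts.append(f"{self.month:02d}")
            if self.day is not None:
                parts.append(f"{self.day:02d}")
        return "-".join(parts)


born = PartialDate(1940)
print(born, born.earliest())
```

The point of the design is that "unknown" stays distinguishable from "1 January", which a plain datetime.date(1940, 1, 1) cannot express.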

Python 3.2 input date function

I would like to write a function that takes a date entered by the user, stores it with the shelve function and prints the date thirty days later when called.
I'm trying to start with something simple like:
import datetime
def getdate():
    date1 = input(datetime.date)
    return date1
getdate()
print(date1)
This obviously doesn't work.
I've used the answers to the above question and now have that section of my program working! Thanks!
Now for the next part:
I'm trying to write a simple program that takes the date the way you instructed me to get it and adds 30 days.
import datetime
from datetime import timedelta
d = datetime.date(2013, 1, 1)
print(d)
year, month, day = map(int, d.split('-'))
d = datetime.date(year, month, day)
d = dplanted.strftime('%m/%d/%Y')
d = datetime.date(d)+timedelta(days=30)
print(d)
This gives me an error:
year, month, day = map(int, d.split('-'))
AttributeError: 'datetime.date' object has no attribute 'split'
Ultimately what I want is to have 01/01/2013 + 30 days and print 01/31/2013.
Thanks in advance!
The input() function can only take text from the terminal. You'll thus have to figure out a way to parse that text and turn it into a date.
You could go about that in two different ways:
Ask the user to enter the 3 parts of a date separately, so call input() three times, turn the results into integers, and build a date:
year = int(input('Enter a year'))
month = int(input('Enter a month'))
day = int(input('Enter a day'))
date1 = datetime.date(year, month, day)
Ask the user to enter the date in a specific format, then turn that format into the three numbers for year, month and day:
date_entry = input('Enter a date in YYYY-MM-DD format')
year, month, day = map(int, date_entry.split('-'))
date1 = datetime.date(year, month, day)
Both these approaches are only examples; no error handling has been included, for instance. You'll need to read up on Python exception handling to figure that out for yourself. :-)
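As a starting point for that error handling, here is a rough sketch of the second approach that returns None instead of crashing on bad input. The names parse_date_entry and ask_for_date are made up for this example.

```python
import datetime


def parse_date_entry(text):
    """Return a datetime.date for 'YYYY-MM-DD' text, or None if it is not a valid date."""
    try:
        year, month, day = map(int, text.strip().split('-'))
        return datetime.date(year, month, day)
    except ValueError:
        # Raised for non-numeric parts, the wrong number of parts,
        # and impossible dates such as 2013-02-30.
        return None


def ask_for_date():
    """Keep prompting until the user types a valid date."""
    while True:
        result = parse_date_entry(input('Enter a date in YYYY-MM-DD format: '))
        if result is not None:
            return result
        print('That is not a valid date, please try again.')


print(parse_date_entry('2013-01-01'))
print(parse_date_entry('2013-02-30'))
```

Calling ask_for_date() then gives back a datetime.date that can be stored or used in arithmetic.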
Thanks. I have been trying to figure out how to add info to datetime.datetime(xxx) and this explains it nicely. It's as follows
datetime.datetime(year,month, day, hour, minute, second) with parameters all integer. It works!
Use the dateutil module:
from dateutil import parser
date = parser.parse(input("Enter date: "))
You can also use:
import datetime
time_str = input("enter time in this format yyyy-mm-dd")
time=datetime.datetime.strptime(time_str, "%Y-%m-%d")
datetime.datetime.strptime() parses the given string according to the format you give it.
Import the library with
import datetime
and construct the date directly:
date = datetime.datetime(2013, 1, 1)
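For the follow-up part of the question (adding 30 days and printing the result as mm/dd/yyyy), the AttributeError comes from calling split() on a date object that is already parsed. A minimal sketch that parses once, adds a timedelta, and formats only at the end is shown here; thirty_days_later is a hypothetical name, and the input is assumed to be 'YYYY-MM-DD' text.

```python
import datetime


def thirty_days_later(date_text):
    """Parse 'YYYY-MM-DD' text and return the date 30 days later as mm/dd/yyyy."""
    planted = datetime.datetime.strptime(date_text, '%Y-%m-%d').date()
    due = planted + datetime.timedelta(days=30)    # date arithmetic stays on date objects
    return due.strftime('%m/%d/%Y')                # convert to text only for display


print(thirty_days_later('2013-01-01'))
```

Note that 01/01/2013 plus 30 days comes out as 01/31/2013.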

nth weekday calculation in Python - whats wrong with this code?

I'm trying to calculate the nth weekday for a given date. For example, I should be able to calculate the 3rd wednesday in the month for a given date.
I have written 2 versions of a function that is supposed to do that:
from datetime import datetime, timedelta
### version 1
def nth_weekday(the_date, nth_week, week_day):
    temp = the_date.replace(day=1)
    adj = (nth_week-1)*7 + temp.weekday()-week_day
    return temp + timedelta(days=adj)
### version 2
def nth_weekday(the_date, nth_week, week_day):
    temp = the_date.replace(day=1)
    adj = temp.weekday()-week_day
    temp += timedelta(days=adj)
    temp += timedelta(weeks=nth_week)
    return temp
Console output
# Calculate the 3rd Friday for the date 2011-08-09
x=nth_weekday(datetime(year=2011,month=8,day=9),3,4)
print('output:', x.strftime('%d%b%y'))
# output: 11Aug11 (Expected: '19Aug11')
The logic in both functions is obviously wrong, but I can't seem to locate the bug - can anyone spot what is wrong with the code - and how do I fix it to return the correct value?
Your problem is here:
adj = temp.weekday()-week_day
First of all, you are subtracting things the wrong way: you need to subtract the actual day from the desired one, not the other way around.
Second, you need to ensure that the result of the subtraction is not negative - it should be put in the range 0-6 using % 7.
The result:
adj = (week_day - temp.weekday()) % 7
In addition, in your second version, you need to add nth_week-1 weeks like you do in your first version.
Complete example:
def nth_weekday(the_date, nth_week, week_day):
    temp = the_date.replace(day=1)
    adj = (week_day - temp.weekday()) % 7
    temp += timedelta(days=adj)
    temp += timedelta(weeks=nth_week-1)
    return temp
>>> nth_weekday(datetime(2011,8,9), 3, 4)
datetime.datetime(2011, 8, 19, 0, 0)
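One thing the corrected function does not guard against is asking for an occurrence that does not exist, such as a 5th Friday in a month with only four; the result then silently lands in the next month. A possible variant with that check is sketched here; nth_weekday_checked is a hypothetical name, and it takes year and month directly instead of a date.

```python
from datetime import date, timedelta


def nth_weekday_checked(year, month, nth, week_day):
    """Return the nth occurrence (1-based) of week_day (Monday == 0) in the month."""
    first = date(year, month, 1)
    adj = (week_day - first.weekday()) % 7          # days until the first occurrence
    result = first + timedelta(days=adj, weeks=nth - 1)
    if result.month != month:
        raise ValueError('no occurrence number %d of weekday %d in %d-%02d'
                         % (nth, week_day, year, month))
    return result


print(nth_weekday_checked(2011, 8, 3, 4))
```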
one-liner
You can find the nth weekday with a one liner that uses calendar from the standard library.
import calendar
calendar.Calendar(x).monthdatescalendar(year, month)[n][0]
where:
x : the integer representing your weekday (0 is Monday)
n : the 'nth' part of your question
year, month : the integers year and month
This will return a datetime.date object.
broken down
It can be broken down this way:
calendar.Calendar(x)
creates a calendar object with weekdays starting on your required weekday.
.monthdatescalendar(year, month)
returns all the calendar days of that month.
[n][0]
returns the 0 indexed value of the nth week (the first day of that week, which starts on the xth day).
why it works
The reason for starting the week on your required weekday is that by default 0 (Monday) will be used as the first day of the week and if the month starts on a Wednesday, calendar will consider the first week to start on the first occurrence of Monday (ie. week 2) and you'll be a week behind.
example
If you were to need the third Saturday of September 2013 (that month's US stock option expiry day), you would use the following:
calendar.Calendar(5).monthdatescalendar(2013,9)[3][0]
The problem with the most-voted one-liner is that it doesn't always work.
It can, however, be used as a basis for refinement.
This is what you get:
c = calendar.Calendar(calendar.SUNDAY).monthdatescalendar(2018, 7)
for c2 in c:
    print(c2[0])
2018-07-01
2018-07-08
2018-07-15
2018-07-22
2018-07-29
c = calendar.Calendar(calendar.SUNDAY).monthdatescalendar(2018, 8)
for c2 in c:
    print(c2[0])
2018-07-29
2018-08-05
2018-08-12
2018-08-19
2018-08-26
If you think about it, the calendar is organised into nested lists so that a week's worth of dates can be printed at a time, so stragglers from neighbouring months come into play. Building a new list of only the days that fall in the month does the trick.
Answer with appended list
import calendar
import datetime
def get_nth_DOW_for_YY_MM(dow, yy, mm, nth) -> datetime.date:
    #dow - Python Cal - 6 Sun 0 Mon ... 5 Sat
    #nth is 1 based... -1. is ok for last.
    i = -1 if nth == -1 or nth == 5 else nth -1
    valid_days = []
    for d in calendar.Calendar(dow).monthdatescalendar(yy, mm):
        if d[0].month == mm:
            valid_days.append(d[0])
    return valid_days[i]
So here's how it could be called:
firstSundayInJuly2018 = get_nth_DOW_for_YY_MM(calendar.SUNDAY, 2018, 7, 1)
firstSundayInAugust2018 = get_nth_DOW_for_YY_MM(calendar.SUNDAY, 2018, 8, 1)
print(firstSundayInJuly2018)
print(firstSundayInAugust2018)
And here is the output:
2018-07-01
2018-08-05
get_nth_DOW_for_YY_MM() can be refactored using lambda expressions like so:
Answer with lambda expression refactoring
import calendar
import datetime
def get_nth_DOW_for_YY_MM(dow, yy, mm, nth) -> datetime.date:
    #dow - Python Cal - 6 Sun 0 Mon ... 5 Sat
    #nth is 1 based... -1. is ok for last.
    i = -1 if nth == -1 or nth == 5 else nth -1
    return list(filter(lambda x: x.month == mm, \
        list(map(lambda x: x[0], \
            calendar.Calendar(dow).monthdatescalendar(yy, mm) \
        )) \
    ))[i]
The one-liner answer does not seem to work if the target day falls on the first of the month. For instance, if you want the 2nd Friday of every month, then the one-liner approach
calendar.Calendar(4).monthdatescalendar(year, month)[2][0]
for March 2013 will return March 15th 2013 when it should be March 8th 2013. Perhaps add in a check like
if date(year, month, 1).weekday() == x:
    delivery_date.append(calendar.Calendar(x).monthdatescalendar(year, month)[n-1][0])
else:
    delivery_date.append(calendar.Calendar(x).monthdatescalendar(year, month)[n][0])
Alternatively, this will work for Python 2. It returns the occurrence of the weekday in the given month, i.e. if 16 June 2018 is the input, it returns which occurrence of that weekday 16 June 2018 is.
You may substitute the month/year/date integers with anything you want; right now it takes the date from the system via datetime.
Omit the print statements or use pass where they're not needed.
import calendar
import datetime
import pprint
month_number = int(datetime.datetime.now().strftime('%m'))
year_number = int(datetime.datetime.now().strftime('%Y'))
date_number = int(datetime.datetime.now().strftime('%d'))
day_ofweek = str(datetime.datetime.now().strftime('%A'))
def weekday_occurance():
    print "\nFinding current date here\n"
    for week in xrange(5):
        try:
            calendar.monthcalendar(year_number, month_number)[week].index(date_number)
            occurance = week + 1
            print "Date %s of month %s and year %s is %s #%s in this month." % (date_number,month_number,year_number,day_ofweek,occurance)
            return occurance
            break
        except ValueError as e:
            print "The date specified is %s which is week %s" % (e,week)
myocc = weekday_occurance()
print myocc
A little tweak would make the one-liner work correctly:
import calendar
calendar.Calendar((weekday+1)%7).monthdatescalendar(year, month)[n_th][-1]
Here n_th should be interpreted as c-style, e.g. 0 is the first index.
Example: to find 1st Sunday in July 2018 one could type:
>>> calendar.Calendar(0).monthdatescalendar(2018, 7)[0][-1]
datetime.date(2018, 7, 1)
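Given how many of the one-liners in this thread turned out to have edge cases, it may be worth cross-checking this one against a brute-force count. The sketch below does that for the first four occurrences of every weekday in every month of a year; the helper names are made up for this example.

```python
import calendar
from datetime import date


def nth_weekday_oneliner(year, month, weekday, n_th):
    # Weeks start the day after `weekday`, so the last slot of each week row
    # is always the wanted weekday (n_th is 0-based).
    return calendar.Calendar((weekday + 1) % 7).monthdatescalendar(year, month)[n_th][-1]


def nth_weekday_bruteforce(year, month, weekday, n_th):
    # Walk every day of the month and keep the ones on the wanted weekday.
    days_in_month = calendar.monthrange(year, month)[1]
    matches = [date(year, month, day) for day in range(1, days_in_month + 1)
               if date(year, month, day).weekday() == weekday]
    return matches[n_th]


def find_mismatches(year):
    """Return every (month, weekday, n_th) where the two approaches disagree."""
    return [(month, weekday, n_th)
            for month in range(1, 13)
            for weekday in range(7)
            for n_th in range(4)
            if nth_weekday_oneliner(year, month, weekday, n_th)
            != nth_weekday_bruteforce(year, month, weekday, n_th)]


print(find_mismatches(2018))
```

An empty list means the two agree for that year. A fifth occurrence (n_th = 4) is deliberately left out, since it does not exist in every month.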
People here seem to like one-liners, so I will propose one below.
import calendar
[cal[0] for cal in calendar.Calendar(x).monthdatescalendar(year, month) if cal[0].month == month][n]
The relativedelta module that's an extension from the Python dateutil package (pip install python-dateutil) does exactly what you want:
from dateutil import relativedelta
import datetime
def nth_weekday(the_date, nth_week, week_day):
    return the_date.replace(day=1) + relativedelta.relativedelta(
        weekday=week_day(nth_week)
    )
print(nth_weekday(datetime.date.today(), 3, relativedelta.FR))
The key part here evaluates to weekday=relativedelta.FR(3): the third Friday of the month. Here is the relevant part of the docs for the weekday parameter:
weekday:
One of the weekday instances (MO, TU, etc) available in the
relativedelta module. These instances may receive a parameter N,
specifying the Nth weekday, which could be positive or negative
(like MO(+1) or MO(-2)).
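The negative N mentioned in the docs counts backwards from the date being adjusted. A related need is the last given weekday of a month; for anyone who cannot install dateutil, a standard-library sketch of that case follows. The name last_weekday is hypothetical, and weekdays are numbered as in datetime (Monday is 0).

```python
import calendar
from datetime import date, timedelta


def last_weekday(year, month, week_day):
    """Return the last occurrence of week_day (Monday == 0) in the given month."""
    days_in_month = calendar.monthrange(year, month)[1]
    last_day = date(year, month, days_in_month)
    # Step back from the month's final day to the nearest wanted weekday.
    back = (last_day.weekday() - week_day) % 7
    return last_day - timedelta(days=back)


print(last_weekday(2011, 8, 4))
```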
