I am working on parsing a date range from an email in Zapier. Here is what comes in: Dec 4 - Jan 4, 2020. From this I need to separate the start and end dates into something like 12/04/2019 and 01/04/2020, accounting for the fact that some ranges start in the prior year, as in the example above, and some fall within the same year, for example Mar 4 - Mar 22, 2020. It seems the code to use in Zapier is Python. I have looked at examples for pandas:
import pandas as pd
date_series = pd.date_range(start='Mar 4' -, end='Mar 7, 2020')
print(date)
But keep getting errors.
Any suggestions would be much appreciated thanks
This is one way to do it:
import pandas as pd

def parse_email_range(date_string):
    dates = date_string.split(' - ')
    # The start date has no year, so only its month and day are used
    month_1 = pd.to_datetime(dates[0], format='%b %d').month
    month_2 = pd.to_datetime(dates[1]).month
    day_1 = pd.to_datetime(dates[0], format='%b %d').day
    day_2 = pd.to_datetime(dates[1]).day
    year_2 = pd.to_datetime(dates[1]).year
    # If the start falls later in the calendar than the end, it belongs to the prior year
    year_1 = year_2 if (month_1 < month_2) or (month_1 == month_2 and day_1 < day_2) else year_2 - 1
    return '{}-{}-{}'.format(year_1, month_1, day_1), '{}-{}-{}'.format(year_2, month_2, day_2)

parse_email_range('Dec 4 - Jan 4, 2020')
## ('2019-12-4', '2020-1-4')
Split the two dates and record them into a single variable:
raw_dates = 'Dec 4 - Jan 4, 2020'.split(" - ")
dateutil package is capable of parsing most dates:
from dateutil.parser import parse
Parse and separate start and end date from the raw dates:
start_date, end_date = (parse(date) for date in raw_dates)
strftime is the method that could be used to format dates.
Store desired format in a variable (please note I have used day first format):
date_format = '%d/%m/%Y'
Convert the end date into the desired format:
print(end_date.strftime(date_format))
'04/01/2020'
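Convert the start date. Because 'Dec 4' carries no year, parse fills it in from today's date, so the result below is only correct when the code runs in the same year as the end date.
Here the range crosses a year boundary, so dateutil's relativedelta is used to subtract one year from the start date (skip this step when both dates fall in the same year):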
from dateutil.relativedelta import relativedelta
adjusted_start_date = start_date - relativedelta(years=1)
print(adjusted_start_date.strftime(date_format))
'04/12/2019'
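The one-year subtraction above is applied unconditionally, so it would be wrong for a range such as Mar 4 - Mar 22, 2020. A minimal standard-library sketch that picks the start year by comparison and produces the month-first format the question asks for. The function name `split_date_range` is hypothetical, and the sketch assumes the input always looks like `Mon D - Mon D, YYYY` with English month abbreviations:

```python
from datetime import datetime

def split_date_range(text, out_format='%m/%d/%Y'):
    """Return (start, end) strings for input like 'Dec 4 - Jan 4, 2020'."""
    start_raw, end_raw = (part.strip() for part in text.split(' - '))
    end = datetime.strptime(end_raw, '%b %d, %Y')
    # Try the end year first, then the year before it
    for year in (end.year, end.year - 1):
        try:
            start = datetime.strptime('{}, {}'.format(start_raw, year), '%b %d, %Y')
        except ValueError:
            continue  # e.g. Feb 29 in a non-leap year
        if start <= end:
            return start.strftime(out_format), end.strftime(out_format)
    raise ValueError('could not place start date before end date: ' + text)

print(split_date_range('Dec 4 - Jan 4, 2020'))   # ('12/04/2019', '01/04/2020')
print(split_date_range('Mar 4 - Mar 22, 2020'))  # ('03/04/2020', '03/22/2020')
```

Attaching a candidate year before parsing avoids parsing a day and month with no year at all.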
I have a pandas dataframe in which the date information is a string with the month and year:
date = ["JUN 17", "JULY 17", "AUG 18", "NOV 19"]
Note that the month is usually written as a 3-letter abbreviation, but June and July are sometimes written out in full.
I would like to convert this into a datetime format which assumes each date is on the first of the month:
date = [06-01-2017, 07-01-2017, 08-01-2018, 11-01-2019]
Edit to provide more information:
Two main issues I wasn't sure how to handle:
The month is not in a consistent format. I tried to solve this by taking just the first three characters of the string.
The year is given as its last two digits only. I was struggling to specify that it belongs to the 2000s without it getting very messy.
I have tried a dozen different things that didn't work, most recent attempt is below:
df['date'] = pd.to_datetime(dict(year = df['Record Month'].astype(str).str[-2:], month = df['Record Month'].astype(str).str[0:3], day=1))
This gives the error "Unable to parse string "JUN" at position 0".
If you are not sure of the many spellings that can show up then a dictionary mapping would not work. Perhaps your best chance is to split and slice so you normalize into year and month columns and then build the date.
If date is a list as in your example.
date = [d.split() for d in date]
df = pd.DataFrame([[m[:3].lower(), '20' + y] for m, y in date],
                  columns=['month', 'year'])
# Equivalent, without the intermediate split:
# df = pd.DataFrame([[s.split()[0][:3].lower(), '20' + s.split()[1]] for s in date],
#                   columns=['month', 'year'])
Then pass a mapper to series.replace as in
df.month = df.month.replace({'jan': 1, 'feb': 2 ...})
Then parse the dates from its components
# first cap the date to the first day of the month
df['day'] = 1
df = pd.to_datetime(df)
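The same split, slice and map idea can be sketched without pandas, building the month lookup from `calendar.month_abbr` instead of typing it out. The names `MONTHS` and `first_of_month` are hypothetical, and the sketch assumes English month names and that every two-digit year belongs to the 2000s:

```python
import calendar
from datetime import date

# month_abbr[0] is an empty string, so it is skipped
MONTHS = {name.lower(): number
          for number, name in enumerate(calendar.month_abbr) if name}

def first_of_month(text):
    """Turn 'JULY 17' into date(2017, 7, 1)."""
    month, year = text.split()
    # Only the first three letters matter, so 'JULY' and 'JUL' both work
    return date(2000 + int(year), MONTHS[month[:3].lower()], 1)

dates = ["JUN 17", "JULY 17", "AUG 18", "NOV 19"]
print([first_of_month(d).isoformat() for d in dates])
# ['2017-06-01', '2017-07-01', '2018-08-01', '2019-11-01']
```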
You were close with using pandas.to_datetime(). Instead of using a dictionary though, you could just reformat the date strings to a more standard format. If you convert each date string into MMMYY format (pretty similar to what you were doing) you can pass the strftime format "%b%y" to to_datetime() and it will convert the strings into dates.
import pandas as pd
date = ["JUN 17", "JULY 17", "AUG 18", "NOV 19"]
df = pd.DataFrame(date, columns=["Record Month"])
df['date'] = pd.to_datetime(df["Record Month"].str[:3] + df["Record Month"].str[-2:], format='%b%y')
print(df)
Produces the following result:
  Record Month       date
0       JUN 17 2017-06-01
1      JULY 17 2017-07-01
2       AUG 18 2018-08-01
3       NOV 19 2019-11-01
I'm quite lost and I'm in need of trying to format some code so it ends up having dashes in the date. I can get 3, 12, 28 but I can't get 3-12-28. I am a super new beginner so I'm quite lost at the moment.
year = 3
month = 12
day = 28
print(date)
Try
print("{0}-{1}-{2}".format(year,month,day))
You could use datetime to format the result
import datetime
year = 3
month = 12
day = 28
dt = (datetime.date(year, month, day))
print(dt)
The result will be 0003-12-28.
If you want more examples of datetime, take a look at https://docs.python.org/2/library/datetime.html#
As you say you are new to Python, you can concatenate strings together. The numbers must be converted with str() first, otherwise adding an int to a string raises a TypeError.
year = 3
month = 12
day = 28
date = str(year) + "-" + str(month) + "-" + str(day)
print(date)
Alternatively, you can use an f-string to put the variables into your required format.
print(f"{year}-{month}-{day}")
Another method is to use datetime if you are working with today's date
import datetime
today = datetime.date.today()
print(today)
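The difference between the outputs shown in these answers (3-12-28 versus 0003-12-28) comes down to zero padding. A small sketch showing both forms with format specifiers, using the values from the question:

```python
year, month, day = 3, 12, 28

# No padding: the numbers are inserted as they are
plain = f"{year}-{month}-{day}"

# Zero-padded to 4, 2 and 2 digits, matching what datetime.date prints
padded = f"{year:04d}-{month:02d}-{day:02d}"

print(plain)   # 3-12-28
print(padded)  # 0003-12-28
```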
I have a dataset with the column 'Date', which has dates in several formats, including:
2018.05.07
01-Jun-2018
Reported 01 Jun 2018
Jun 2018
2018
before 1970
1941-1945
Ca. 1960
There are also invalid dates, such as:
190Feb-2010
I am trying to find entries that contain an exact date (day, month, and year) and convert them to datetime. I also need to exclude entries with "Reported" in the field. Is there any way to filter such data without first enumerating all the possible date formats?
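Using the dateutil library:
Parse each string twice with different default values. If any part of the date (day, month, or year) is missing, it is filled in from the default, so the two results differ and the entry is skipped.
Use fuzzy=True if you want to extract dates from strings such as "Reported 01 Jun 2018".
import datetime
import dateutil.parser

dates = ["2018.05.07","01-Jun-2018","Reported 01 Jun 2018","Jun 2018","2018","before 1970","1941-1945","Ca. 1960","190Feb-2010"]
default_a = datetime.datetime(2015, 1, 1)
default_b = datetime.datetime(2016, 2, 2)
formated_date = []
for date in dates:
    try:
        first = dateutil.parser.parse(date, fuzzy=False, default=default_a)
        second = dateutil.parser.parse(date, fuzzy=False, default=default_b)
    except (ValueError, OverflowError):
        continue  # not parseable as a date
    # Identical results mean nothing was filled in from the defaults
    if first == second:
        formated_date.append(first)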
Another solution. This is a brute-force method that checks each date against every format. Keep adding formats to make it cover more inputs. It is slow, though.
import datetime

dates = ["2018.05.07","01-Jun-2018","Reported 01 Jun 2018","Jun 2018","2018","before 1970","1941-1945","Ca. 1960","190Feb-2010"]
# %b and %B are the abbreviated and full month names
formats = ["%Y%m%d","%Y.%m.%d","%Y-%m-%d","%Y/%m/%d","%Y%b%d","%Y.%b.%d","%Y-%b-%d","%Y%B%d","%Y.%B.%d","%Y-%B-%d",
           "%d-%m-%Y","%d.%m.%Y","%d%m%Y","%d/%m/%Y","%d-%b-%Y","%d%b%Y","%d.%b.%Y","%d/%b/%Y"]
formated_date = []
for date in dates:
    for fmt in formats:
        try:
            dt = datetime.datetime.strptime(date, fmt)
        except ValueError:
            continue  # this format does not match, try the next one
        formated_date.append(dt)
        break  # stop at the first matching format
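The format-by-format check can also be wrapped in a helper that returns `None` for anything that is not an exact date, which makes the filtering step explicit. A minimal sketch; `parse_exact` and `EXACT_FORMATS` are hypothetical names, and only two formats are listed for illustration:

```python
from datetime import datetime

# Every format here contains day, month and year, so a match is an exact date
EXACT_FORMATS = ["%Y.%m.%d", "%d-%b-%Y"]

def parse_exact(text, formats=EXACT_FORMATS):
    """Return a datetime if text matches one of the formats, else None."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # try the next format
    return None

raw = ["2018.05.07", "01-Jun-2018", "Reported 01 Jun 2018", "Jun 2018",
       "2018", "before 1970", "1941-1945", "Ca. 1960", "190Feb-2010"]
exact = [d for d in map(parse_exact, raw) if d is not None]
print(exact)
# [datetime.datetime(2018, 5, 7, 0, 0), datetime.datetime(2018, 6, 1, 0, 0)]
```

Because strptime must match the whole string, entries with a "Reported" prefix are rejected without a separate check.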
In [1]: string_with_dates = """entries are due by January 4th, 2017 at 8:00pm created 01/15/2005 by ACME Inc. and associates."""
In [2]: import datefinder
In [3]: matches = datefinder.find_dates(string_with_dates)
In [4]: for match in matches:
...: print(match)
2017-01-04 20:00:00
2005-01-15 00:00:00
Hope this would help you to find dates from string with dates
I am trying to get last month and current year in the format: July 2016.
I have tried (but that didn't work) and it does not print July but the number:
import datetime
now = datetime.datetime.now()
print now.year, now.month(-1)
If you're manipulating dates then the dateutil library is always a great one to have handy for things the Python stdlib doesn't cover easily.
First, install the dateutil library if you haven't already:
pip install python-dateutil
Next:
from datetime import datetime
from dateutil.relativedelta import relativedelta
# Returns the same day of last month if possible otherwise end of month
# (e.g. March 31st -> Feb 29th and July 31st -> June 30th)
last_month = datetime.now() - relativedelta(months=1)
# Create string of month name and year...
text = format(last_month, '%B %Y')
Gives you:
'July 2016'
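import datetime
now = datetime.datetime.now()
last_month = now.month - 1 if now.month > 1 else 12
# The year only rolls back when the current month is January
last_year = now.year if now.month > 1 else now.year - 1
To get the month name you can use
"Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split()[last_month-1]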
An alternative solution using Pandas which converts today to a monthly period and then subtracts one (month). Converted to desired format using strftime.
import datetime as dt
import pandas as pd
>>> (pd.Period(dt.datetime.now(), 'M') - 1).strftime('%B %Y')
u'July 2016'
You can use just the Python datetime library to achieve this.
Explanation:
Replace day in today's date with 1, so you get date of first day of this month.
Doing - timedelta(days=1) will give last day of previous month.
format and use '%B %Y' to convert to required format.
import datetime as dt
>>> format(dt.date.today().replace(day=1) - dt.timedelta(days=1), '%B %Y')
'June 2019'
from datetime import date, timedelta
last_month = date.today().replace(day=1) - timedelta(1)
last_month.strftime("%B %Y")
date.today().replace(day=1) gets the first day of the current month; subtracting 1 day gives the last day of the previous month.
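The first-of-month trick is easier to check when the reference date is passed in rather than read from the clock. A small sketch; `previous_month_label` is a hypothetical name, and the month names assume an English locale:

```python
from datetime import date, timedelta

def previous_month_label(today):
    """Return e.g. 'July 2016' for any date in August 2016."""
    # Step back from the 1st of this month to the last day of the previous one
    last_day_of_previous = today.replace(day=1) - timedelta(days=1)
    return last_day_of_previous.strftime('%B %Y')

print(previous_month_label(date(2016, 8, 15)))  # July 2016
print(previous_month_label(date(2016, 1, 3)))   # December 2015
```

The January case shows that the year rolls back on its own, with no special handling.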
import datetime

def subOneMonth(dt):
    day = dt.day
    # Last day of the previous month
    res = dt.replace(day=1) - datetime.timedelta(days=1)
    try:
        # Keep the original day of the month when the previous month has it
        res = res.replace(day=day)
    except ValueError:
        pass  # e.g. March 31 stays on the last day of February
    return res

print(subOneMonth(datetime.datetime(2016, 7, 11)).strftime('%d, %b %Y'))
11, Jun 2016
print(subOneMonth(datetime.datetime(2016, 1, 11)).strftime('%d, %b %Y'))
11, Dec 2015
print(subOneMonth(datetime.datetime(2016, 3, 31)).strftime('%d, %b %Y'))
29, Feb 2016
from datetime import datetime, timedelta, date, time
#Datetime: 1 month ago
datetime_to = datetime.now().replace(day=15) - timedelta(days=30 * 1)
#Date : 2 months ago
date_to = date.today().replace(day=15) - timedelta(days=30 * 2)
#Date : 12 months ago
date_to = date.today().replace(day=15) - timedelta(days=30 *12)
#Accounting standards: 13 months before the previous day
date_ma = (date.today()-timedelta(1)).replace(day=15)-timedelta(days=30*13)
yyyymm = date_ma.strftime('%Y%m') #201909
yyyy = date_ma.strftime('%Y') #2019
#Error Range Test
from datetime import datetime, timedelta, date
import pandas as pd
for i in range(1, 120):
    # Exact month arithmetic with pandas versus the 30-day approximation
    pdmon = (pd.Period(datetime.now(), 'M') - i).strftime('%Y%m')
    wamon = (date.today().replace(day=15) - timedelta(days=30 * i)).strftime('%Y%m')
    if pdmon != wamon:
        print('Incorrect %s months ago:%s,%s' % (i, pdmon, wamon))
        break
#Incorrect 37 months ago:201709,201710
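The 30-day approximation drifts because months are not all 30 days long, which is why the range test eventually reports a mismatch. A minimal sketch of exact month arithmetic using only the standard library; `months_ago` is a hypothetical helper that returns the first day of the target month:

```python
from datetime import date

def months_ago(d, n):
    """Return the first day of the month that is n months before d."""
    # Count months since year 0, subtract, then split back into year and month
    total = d.year * 12 + (d.month - 1) - n
    year, month_index = divmod(total, 12)
    return date(year, month_index + 1, 1)

print(months_ago(date(2020, 10, 20), 37).strftime('%Y%m'))  # 201709
print(months_ago(date(2016, 1, 31), 1).strftime('%Y%m'))    # 201512
```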
import datetime as dt
.replace(day=1) replaces today's date with the first day of the month, simple
subtracting timedelta(1) subtracts 1 day, giving the last day of the previous month
last_month = dt.datetime.today().replace(day=1) - dt.timedelta(1)
The user wanted the month name (July), not the month number, so %m is replaced with %B
last_month.strftime("%Y, %B")
Say I have a week number of a given year (e.g. week number 6 of 2014).
How can I convert this to the date of the Monday that starts that week?
One brute force solution I thought of would be to go through all Mondays of the year:
import datetime
from datetime import timedelta

date1 = datetime.date(2014, 1, 1)
date2 = datetime.date(2014, 12, 31)
def monday_range(date1, date2):
    while date1 < date2:
        if date1.weekday() == 0:
            yield date1
        date1 = date1 + timedelta(days=1)
and store a hash from the first to the last Monday of the year, but this wouldn't do it, since, the first week of the year may not contain a Monday.
You could just feed the data into time.asctime().
>>> import time
>>> week = 6
>>> year = 2014
>>> atime = time.asctime(time.strptime('{} {} 1'.format(year, week), '%Y %W %w'))
>>> atime
'Mon Feb 10 00:00:00 2014'
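EDIT:
To get a datetime.date object, parse the same string with datetime.strptime instead (time.mktime expects a struct_time, not the string returned by asctime):
>>> import datetime
>>> datetime.datetime.strptime('{} {} 1'.format(year, week), '%Y %W %w').date()
datetime.date(2014, 2, 10)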
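Note that `%W` counts weeks from the first Monday of the year, which is not the same as ISO 8601 week numbering. If the week number in question is an ISO week (an assumption, since the question does not say), `date.fromisocalendar` from Python 3.8+ gives the Monday directly. The two function names below are hypothetical:

```python
from datetime import date, datetime

def monday_of_iso_week(year, week):
    """Monday of the given ISO 8601 week (ISO weekday 1 is Monday)."""
    return date.fromisocalendar(year, week, 1)

def monday_of_strptime_week(year, week):
    """Monday of week `week`, where week 1 starts on the year's first Monday."""
    return datetime.strptime('{} {} 1'.format(year, week), '%Y %W %w').date()

print(monday_of_iso_week(2014, 6))       # 2014-02-03
print(monday_of_strptime_week(2014, 6))  # 2014-02-10
```

For 2014 the two conventions differ by a week, because January 1 fell on a Wednesday: the ISO rule counts that partial week as week 1, while `%W` counts it as week 0.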
All about strptime / strftime:
https://docs.python.org/2/library/datetime.html
mytime.strftime('%U') # week number, weeks commencing Sunday
mytime.strftime('%W') # week number, weeks commencing Monday
from datetime import datetime
mytime = datetime.strptime('2012W6 MON', '%YW%W %a')
Strptime needs to see both the year and the weekday to do this. I'm assuming you've got weekly data so just add 'mon' to the end of the string.
Enjoy
A simple function to get the Monday, given a date.
def get_monday(dte):
    return dte - datetime.timedelta(days=dte.weekday())
Some sample output:
>>> get_monday(date1)
datetime.date(2013, 12, 30)
>>> get_monday(date2)
datetime.date(2014, 12, 29)
Call this function within your loop.
We can just add the number of weeks to the first day of the year. Note that the result falls on the same weekday as January 1, which is not necessarily a Monday:
>>> import datetime
>>> from dateutil.relativedelta import relativedelta
>>> week = 40
>>> year = 2019
>>> date = datetime.date(year,1,1)+relativedelta(weeks=+week)
>>> date
datetime.date(2019, 10, 8)
To piggyback and give a different version of the answer #anon582847382 gave, you can do something like the below code if you're creating a function for it and the week number is given like "11-2023":
import time
from datetime import datetime
def get_date_from_week_number(str_value):
    # str_value looks like "11-2023": characters 0-1 are the week, 3-6 the year
    temp_str = time.asctime(time.strptime('{} {} 1'.format(str_value[3:7], str_value[0:2]), '%Y %W %w'))
    return datetime.strptime(temp_str, '%a %b %d %H:%M:%S %Y').date()