Obtaining decimal format for range of years and specific months - python

I have monthly data (1993 - 2019), and I want the decimal-year values for only July, August, and September of each year.
The code below produces the decimal values for all 12 months between 1993 and 2019; I want the same thing restricted to July, August, and September:
year_start = 1993
year_end = 2019
full_time_months = np.arange(year_start+.5/12,year_end+1,1/12)
print(full_time_months[:12])
# these are the 12 months in 1993 as decimals
1993.04166667 1993.125 1993.20833333 1993.29166667 1993.375
1993.45833333 1993.54166667 1993.625 1993.70833333 1993.79166667
1993.875 1993.95833333
My goal is to get an array of just the months July, August, and September:
1993.54167, 1993.625, 1993.708... 2019.54167 , 2019.625, 2019.708
where year.54167 = July, year.625 = August, and year.708 = September.
How might I go about doing this? Hope my question is clear enough, please comment if something is unclear, thank you!!!

I'm not sure what you want to achieve with this, but you can do something like the following to pick out the data you want.
import numpy as np
year_start = 1993
year_end = 2019
full_time_months = np.arange(year_start+.5/12,year_end+1,1/12)
# Reshape into 2D array
full_time_months = full_time_months.reshape(-1, 12)
# Choose selected columns
# July, Aug, Sept
selected_months = full_time_months[:, [6,7,8]]
print(selected_months)
Results:
[[1993.54166667 1993.625 1993.70833333]
[1994.54166667 1994.625 1994.70833333]
...
[2018.54166667 2018.625 2018.70833333]
[2019.54166667 2019.625 2019.70833333]]
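If a flat 1-D array is wanted, as in the question, one possible sketch (not part of the answer above) is to build the values directly with broadcasting instead of slicing the full array. The names month_numbers and mid_month_fractions are illustrative, and the mid-month convention (month - 0.5) / 12 is taken from the question's own values.

```python
import numpy as np

year_start = 1993
year_end = 2019

# Calendar month numbers to keep: July, August, September
month_numbers = np.array([7, 8, 9])

# Mid-month offset within a year, matching the question's convention
# (July -> 6.5/12 = 0.54166..., August -> 0.625, September -> 0.70833...)
mid_month_fractions = (month_numbers - 0.5) / 12

years = np.arange(year_start, year_end + 1)

# Broadcasting (n_years, 1) + (3,) gives one row per year; ravel() flattens
# the rows year by year into a single 1-D array
selected_months = (years[:, None] + mid_month_fractions).ravel()

print(selected_months[:3])
print(selected_months[-3:])
```

Because the years are integers and only the fractions are floats, this avoids accumulating a floating-point step across the whole range.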

Related

Deducting x number of months from a given date

Is there a way to deduct a specified number of months from a given date? For example, if the date is 2006/02/27, I want to go back 3 months and find the months covered, counting the month of the date itself. In this case, that would be Feb, Jan, and Dec. What I am really after is a range of months.
I can think of using timedelta and specifying 93 days (31 x 3). But this could be a problem if the date falls early in the month. For example, 01/03/2006 - 93 days gives a date in November 2005, so the range would include March, Feb, Jan, Dec, and Nov. What I want is March, Feb, and Jan.
from datetime import datetime,timedelta
someDate = datetime(2006,2,27)
newDate =someDate - timedelta(days = 3)
#someDate - 3months
Any ideas on how to solve?
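One possible approach, sketched under the assumption that only the (year, month) pairs are needed and not exact dates, is to do the arithmetic on a running month count rather than on days. The helper name previous_months is hypothetical.

```python
from datetime import date


def previous_months(d, count):
    """Return (year, month) pairs for d's month and the months before it,
    newest first, covering `count` months in total."""
    # Number of months elapsed since year 0, with January as index 0
    total = d.year * 12 + (d.month - 1)
    result = []
    for offset in range(count):
        year, month_index = divmod(total - offset, 12)
        result.append((year, month_index + 1))
    return result


print(previous_months(date(2006, 2, 27), 3))
print(previous_months(date(2006, 3, 1), 3))
```

Working in whole months sidesteps the varying month lengths that make a fixed timedelta unreliable here.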

Pandas - math operation on H:M:S string format

I have some time data where I need to subtract the smallest value (first row) from the largest value (last row) per month. The HOURS column is in string (object) format, though, and I don't know how to convert it properly and then get the result back into the current format. The end result needs to be displayed as H:M:S. The data looks as follows:
MACHINE HOURS MONTH
M400 54:56:00 December
M400 61:54:52 December
M400 75:38:52 December
M400 89:21:09 December
M400 13:44:00 November
M400 27:28:00 November
M400 41:12:00 November
The end result I'm looking for is:
MACHINE HOURS MONTH
M400 34:25:09 December
M400 27:28:00 November
What is the fastest way to convert this (I'm assuming to datetime format), do the math, then reverse back?
Here is one way to achieve this, although there might be a more efficient one.
def convert(h):
    # Convert an "H:M:S" string to a total number of seconds
    h = h.split(':')
    h = list(map(int, h))
    return h[0]*3600 + h[1]*60 + h[2]
def convert_back(t):
    # Convert a number of seconds back to an "H:MM:SS" string
    m, s = divmod(t, 60)
    h, m = divmod(m, 60)
    return f"{h}:{m:02d}:{s:02d}"
# Column names match the sample data (HOURS, MONTH)
df['time'] = df['HOURS'].apply(convert)
final = (df.groupby('MONTH').max()['time'] - df.groupby('MONTH').min()['time']).apply(convert_back)
df_final = df.groupby('MONTH').max()
df_final['HOURS'] = final
df_final is what you are looking for.
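As a sanity check on the arithmetic, here is a pandas-free sketch that applies the same string-to-seconds round trip to the sample rows from the question. The helper names to_seconds and to_hms are illustrative only.

```python
# Sample rows copied from the question: (MACHINE, HOURS, MONTH)
rows = [
    ("M400", "54:56:00", "December"),
    ("M400", "61:54:52", "December"),
    ("M400", "75:38:52", "December"),
    ("M400", "89:21:09", "December"),
    ("M400", "13:44:00", "November"),
    ("M400", "27:28:00", "November"),
    ("M400", "41:12:00", "November"),
]


def to_seconds(text):
    # "H:M:S" -> total seconds; hours may exceed 24
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hms(total_seconds):
    # Total seconds -> "H:MM:SS" with zero-padded minutes and seconds
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


# Collect the values per month, then take largest minus smallest
seconds_by_month = {}
for machine, hours, month in rows:
    seconds_by_month.setdefault(month, []).append(to_seconds(hours))

spans = {month: to_hms(max(values) - min(values))
         for month, values in seconds_by_month.items()}
print(spans)  # {'December': '34:25:09', 'November': '27:28:00'}
```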

Extraction of some date formats failed when using Dateutil in Python

I went through multiple links before posting this question, so please read through. The two answers below solved 90% of my problem:
parse multiple dates using dateutil
How to parse multiple dates from a block of text in Python (or another language)
Problem: I need to parse multiple dates in multiple formats in Python
Solution from the above links: this works for most inputs, but there are still certain formats I am not able to parse.
Formats which still can't be parsed are:
text ='I want to visit from May 16-May 18'
text ='I want to visit from May 16-18'
text ='I want to visit from May 6 May 18'
I also tried regex, but since dates can come in any format, I ruled that option out because the code was getting very complex. Please suggest modifications to the code presented in the links so that the 3 formats above can also be handled.
This kind of problem is always going to need tweaking for new edge cases, but the following approach is fairly robust:
from itertools import groupby, zip_longest
from datetime import datetime
import calendar
import string
import re
def get_date_part(x):
    # Return a month name unchanged, the digits of a day or year, or False otherwise
    if x.lower() in month_list:
        return x
    day = re.match(r'(\d+)(\b|st|nd|rd|th)', x, re.I)
    if day:
        return day.group(1)
    return False
def month_full(month):
    # Normalise a full or abbreviated month name to its abbreviated form
    try:
        return datetime.strptime(month, '%B').strftime('%b')
    except ValueError:
        return datetime.strptime(month, '%b').strftime('%b')
tests = [
    'I want to visit from May 16-May 18',
    'I want to visit from May 16-18',
    'I want to visit from May 6 May 18',
    'May 6,7,8,9,10',
    '8 May to 10 June',
    'July 10/20/30',
    'from June 1, july 5 to aug 5 please',
    '2nd March to the 3rd January',
    '15 march, 10 feb, 5 jan',
    '1 nov 2017',
    '27th Oct 2010 until 1st jan',
    '27th Oct 2010 until 1st jan 2012'
]
cur_year = 2017
month_list = [m.lower() for m in list(calendar.month_name) + list(calendar.month_abbr) if len(m)]
# Translation table that maps every punctuation character to a space
remove_punc = str.maketrans(string.punctuation, ' ' * len(string.punctuation))
for date in tests:
    date_parts = [get_date_part(part) for part in date.translate(remove_punc).split() if get_date_part(part)]
    days = []
    months = []
    years = []
    # Sorting puts month names first and digit strings last; groupby then splits the two runs
    for k, g in groupby(sorted(date_parts, key=lambda x: x.isdigit()), lambda y: not y.isdigit()):
        values = list(g)
        if k:
            months = [month_full(v) for v in values]
        else:
            for v in values:
                if 1900 <= int(v) <= 2100:
                    years.append(int(v))
                else:
                    days.append(v)
    if days and months:
        if years:
            dates_raw = [datetime.strptime('{} {} {}'.format(m, d, y), '%b %d %Y') for m, d, y in zip_longest(months, days, years, fillvalue=years[0])]
        else:
            dates_raw = [datetime.strptime('{} {}'.format(m, d), '%b %d').replace(year=cur_year) for m, d in zip_longest(months, days, fillvalue=months[0])]
            years = [cur_year]
        # Fix for jumps in year
        dates = []
        start_date = datetime(years[0], 1, 1)
        next_year = years[0] + 1
        for d in dates_raw:
            if d < start_date:
                d = d.replace(year=next_year)
                next_year += 1
            start_date = d
            dates.append(d)
        print("{} -> {}".format(date, ', '.join(d.strftime("%d/%m/%Y") for d in dates)))
This converts the test strings as follows:
I want to visit from May 16-May 18 -> 16/05/2017, 18/05/2017
I want to visit from May 16-18 -> 16/05/2017, 18/05/2017
I want to visit from May 6 May 18 -> 06/05/2017, 18/05/2017
May 6,7,8,9,10 -> 06/05/2017, 07/05/2017, 08/05/2017, 09/05/2017, 10/05/2017
8 May to 10 June -> 08/05/2017, 10/06/2017
July 10/20/30 -> 10/07/2017, 20/07/2017, 30/07/2017
from June 1, july 5 to aug 5 please -> 01/06/2017, 05/07/2017, 05/08/2017
2nd March to the 3rd January -> 02/03/2017, 03/01/2018
15 march, 10 feb, 5 jan -> 15/03/2017, 10/02/2018, 05/01/2019
1 nov 2017 -> 01/11/2017
27th Oct 2010 until 1st jan -> 27/10/2010, 01/01/2011
27th Oct 2010 until 1st jan 2012 -> 27/10/2010, 01/01/2012
This works as follows:
First create a list of valid months names, i.e. both full and abbreviated.
Make a translation table to make it easy to quickly remove any punctuation from the text.
Split the text, and extract only the date parts by using a function with a regular expression to spot days or months.
Sort the list based on whether or not each part is a digit; this groups months at the front and digits at the end.
Pair the months with the days (and years, if any), padding the shorter list with a fill value. Normalise months to their abbreviated form, e.g. August to Aug, and convert each combination into a datetime object.
If a date appears to be before the previous one, add a whole year.
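The last step can be looked at in isolation. The following is a minimal sketch of the idea, not the exact loop used in the answer: the function name roll_years_forward is hypothetical, and it assumes the dates are meant to be in chronological order.

```python
from datetime import datetime


def roll_years_forward(dates):
    """Push each date into a later year until it is not before its predecessor.

    Assumption: none of the dates is 29 February, since replace(year=...)
    raises ValueError when the target year is not a leap year.
    """
    result = []
    previous = None
    for d in dates:
        while previous is not None and d < previous:
            d = d.replace(year=d.year + 1)
        result.append(d)
        previous = d
    return result


raw = [datetime(2017, 3, 15), datetime(2017, 2, 10), datetime(2017, 1, 5)]
fixed = roll_years_forward(raw)
print(', '.join(d.strftime("%d/%m/%Y") for d in fixed))
# 15/03/2017, 10/02/2018, 05/01/2019
```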

How to use day of the week and the day of the month to get year in python?

For example, how would you get "2016" from
Fri, Feb, 19?
I have a database that has entries from the last 4 years but only lists dates as day of the week and day of the month, and I need to get the year for each. There should not be any repetitions because there are only four years of data, and date and day-of-the-week alignments do not repeat in that timeframe.
How about this:
import datetime
year = next(date for year in (2013, 2014, 2015, 2016)
            for date in (datetime.date(year, 2, 19),)
            if date.weekday() == 4).year
This is just slightly different from @zondo's answer, which beat me to it, but I'll put this up anyway because I prefer not to have the .year at the end of a long expression:
year = next(y for y in range(2013, 2017) if datetime.date(y, 2, 19).weekday() == 4)
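To apply the same idea to every row, the text form could be parsed first. This is a minimal sketch assuming entries look exactly like "Fri, Feb, 19" with English three-letter abbreviations; infer_years, WEEKDAYS, and MONTHS are illustrative names. It returns all matching years so that an unexpected ambiguity is visible rather than hidden.

```python
from datetime import date

# English abbreviations, spelled out so the result does not depend on the locale
WEEKDAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def infer_years(text, candidate_years):
    """Return the candidate years in which the given month/day falls on the given weekday."""
    weekday_name, month_name, day = (part.strip() for part in text.split(","))
    weekday = WEEKDAYS.index(weekday_name)   # Monday == 0, as in date.weekday()
    month = MONTHS.index(month_name) + 1
    matches = []
    for year in candidate_years:
        try:
            candidate = date(year, month, int(day))
        except ValueError:
            # The date does not exist in this year, e.g. Feb 29 in a non-leap year
            continue
        if candidate.weekday() == weekday:
            matches.append(year)
    return matches


print(infer_years("Fri, Feb, 19", range(2013, 2017)))  # [2016]
```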

How to find out week no of the month in python?

I have seen many ways to determine the week of the year. For example, datetime.date(2016, 2, 14).isocalendar()[1] gives 6 as output, which means 14th Feb 2016 falls in the 6th week of the year. But I couldn't find any way to get the week of the month.
That is, if I give input as some_function(2016,2,16)
I should get output as 3, meaning that 16th Feb 2016 is in the 3rd week of Feb 2016.
[This is a different question from the similar existing one: here I'm asking about the week number of the month, not of the year.]
This function did what I wanted:
from math import ceil
def week_of_month(dt):
    # Weekday of the 1st of the month (Monday == 0) shifts the day count
    first_day = dt.replace(day=1)
    dom = dt.day
    adjusted_dom = dom + first_day.weekday()
    return int(ceil(adjusted_dom/7.0))
I got this function from this Stack Overflow answer.
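import datetime
def week_number_of_month(date_value):
    # ISO week of the date minus ISO week of the 1st of its month, counted from 1
    week_number = (date_value.isocalendar()[1] - date_value.replace(day=1).isocalendar()[1] + 1)
    # Late-December dates that ISO assigns to week 1 of the next year give -46 when
    # December 1 falls in ISO week 48 (e.g. 2018-12-31); other year-boundary cases are not covered
    if week_number == -46:
        week_number = 6
    return week_number
date_given = datetime.datetime(year=2018, month=12, day=31).date()
week_number_of_month(date_given)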
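An alternative sketch that avoids ISO week numbers, and with them the year-boundary special case, is to look the day up in the month grid from the standard calendar module. It assumes weeks start on Monday, the same convention as the functions above; the function name is illustrative.

```python
import calendar


def week_of_month_from_grid(year, month, day):
    """Return the 1-based index of the Monday-first calendar row containing the day."""
    if day < 1:
        raise ValueError("day must be at least 1")
    # Each row is one week of day numbers; days outside the month are 0
    weeks = calendar.Calendar(firstweekday=calendar.MONDAY).monthdayscalendar(year, month)
    for index, week in enumerate(weeks, start=1):
        if day in week:
            return index
    raise ValueError("day is not in the given month")


print(week_of_month_from_grid(2016, 2, 16))   # 3
print(week_of_month_from_grid(2018, 12, 31))  # 6
```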
