Python function for billing calculation - python

How can I write a Python function for an invoicing app that calculates the dates for different billing / invoice terms?
Overall, given a category name, the function should return the job_date, the start_date (first of the DAYS TO COVER) and the end_date (last of the DAYS TO COVER).
Date format to be returned: "%Y-%m-%dT00:00:00Z"
The categories are listed below. For each one, the first row is the category name, the second row gives the job day / date, and the third row specifies the start and end dates to be returned.
WEEKLY
JOB DATE = SUNDAY
DAYS TO COVER = RECENT LAST 7 DAYS (SUN-SAT)
TWICE_A_WEEK
JOB RUN = THURSDAY OR MONDAY
DAYS TO COVER = MON-WED, THURSDAY-SUNDAY
SEMI_MONTHLY - 1-15, 16-31
JOB RUN ON 16 OR 1
DAYS TO COVER = 1-15 OR 16-31
MONTHLY -
JOB RUN ON 1 OR 16
DAYS TO COVER = 1-31, 16-15
CUSTOM - (GET USER INPUT FOR JOB RUN AND DAYS TO COVER)
Where an OR condition appears in JOB RUN above, the script should find the most recent JOB RUN date and return the corresponding start_date, end_date and job_run date.
Can someone help me with Python code, or else simply with framing an algorithm? Alternative suggestions are also welcome!
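One way to read the rules above is sketched below. It is only an illustration under stated assumptions: the names billing_dates and most_recent_weekday are made up for this example, "most recent" is taken to include today, MONTHLY is assumed to follow the same "most recent of the 1st or 16th" rule as SEMI_MONTHLY, and CUSTOM is left out because it depends on user input.

```python
from datetime import date, timedelta

DATE_FORMAT = "%Y-%m-%dT00:00:00Z"  # format requested in the question


def most_recent_weekday(today, weekdays):
    """Latest date <= today whose weekday() is in `weekdays` (Mon=0 ... Sun=6)."""
    for days_back in range(7):
        candidate = today - timedelta(days=days_back)
        if candidate.weekday() in weekdays:
            return candidate
    raise ValueError("weekdays must contain a value between 0 and 6")


def billing_dates(category, today=None):
    """Return (job_date, start_date, end_date) as formatted strings (hypothetical helper)."""
    today = today or date.today()
    one_day = timedelta(days=1)

    if category == "WEEKLY":
        job = most_recent_weekday(today, {6})              # Sunday
        start, end = job - timedelta(days=7), job - one_day  # previous Sun-Sat
    elif category == "TWICE_A_WEEK":
        job = most_recent_weekday(today, {0, 3})           # Monday or Thursday
        # A Thursday run covers Mon-Wed (3 days); a Monday run covers Thu-Sun (4 days).
        span = 3 if job.weekday() == 3 else 4
        start, end = job - timedelta(days=span), job - one_day
    elif category == "SEMI_MONTHLY":
        if today.day >= 16:       # run on the 16th covers the 1st-15th
            job = today.replace(day=16)
            start, end = today.replace(day=1), today.replace(day=15)
        else:                     # run on the 1st covers the 16th to month end
            job = today.replace(day=1)
            end = job - one_day   # last day of the previous month
            start = end.replace(day=16)
    elif category == "MONTHLY":
        if today.day >= 16:       # run on the 16th covers previous 16th to this 15th
            job = today.replace(day=16)
            end = today.replace(day=15)
            start = (today.replace(day=1) - one_day).replace(day=16)
        else:                     # run on the 1st covers the whole previous month
            job = today.replace(day=1)
            end = job - one_day
            start = end.replace(day=1)
    else:
        raise ValueError(f"unsupported category: {category}")

    return tuple(d.strftime(DATE_FORMAT) for d in (job, start, end))
```

Passing today explicitly, for example billing_dates("WEEKLY", date(2021, 6, 9)), keeps the function easy to test. Subtracting one day from the 1st of the month gives the real last day of the previous month, so the "31" in the rules does not need special handling for shorter months.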

Related

Calculate Business Days in Django Annotate

I want to calculate business days in a Django annotate. For example, if an event was generated 7 days ago, I want to know how many business days have passed. In this example the 7 days include [Monday - Sunday], and I only want to count [Monday - Friday], which means 5 business days. I've done this with a Python workaround, but I want to do it in Django's annotate() method so that I can filter the results based on business days.
Here is an example of what I've done so far:
table_values = Table.objects.all().annotate(issue_severity=datetime.utcnow() - F("event__created"))
for table in table_values:
    date = datetime.now(timezone.utc) - table.issue_severity
    dates = (date + timedelta(x + 1) for x in range(table.issue_severity.days))
    table.business_days = sum(day.weekday() < 5 for day in dates)
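The loop above visits every day individually. As a plain-Python sketch only (it does not by itself solve the annotate() part of the question), the same count can be obtained with week arithmetic: every full week contributes 5 business days, and only the leftover days need checking. The name business_days_after is made up for this example.

```python
from datetime import date, timedelta


def business_days_after(start, days):
    """Count Mon-Fri dates among start+1 ... start+days (hypothetical helper)."""
    full_weeks, extra = divmod(days, 7)
    count = full_weeks * 5  # every block of 7 consecutive days has 5 business days
    # The leftover days fall on the same weekdays as start+1 ... start+extra.
    for offset in range(1, extra + 1):
        if (start + timedelta(days=offset)).weekday() < 5:
            count += 1
    return count
```

For the example in the question, business_days_after(date(2021, 6, 7), 7) counts the 7 days after a Monday and gives 5.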

How to retrieve a certain day of the month for each row based on a dataframe value?

I am trying to replace some hardcoded SQL queries related to timezone changes with a more dynamic/data-driven Python script. I have a dataset that looks like this spreadsheet below. WEEK_START/DAY/MONTH is the week, day, and month when daylight savings time begins (for example Canberra starts the first Sunday of April while Vienna is the last Sunday of March). The end variables are in the same format and display when it ends.
Dataset
Here is the issue. I have seen solutions for specific use cases such as this, finding the last Sunday of the month:
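import calendar
from datetime import date

today = date.today()
current_year = today.year
current_month = today.month
current_day = today.day
month = calendar.monthcalendar(current_year, current_month)
# the last Sunday is in the last week row, or in the one before it
day_of_month = max(month[-1][calendar.SUNDAY], month[-2][calendar.SUNDAY])
print(day_of_month)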
31
This tells me that the last Sunday of this month is the 31st. I can adjust the attributes for one given month/scenario, but how would I make a column that retrieves the date for each and every row (city)? That is, for several cities that change times on different dates? I thought that setting the attributes in day_of_month inside an apply function would work, but when I pass something like weekday='SUNDAY' it raises an error, because of course the string 'SUNDAY' is not the same as SUNDAY, the attribute of calendar. My SQL queries are grouped by cities that change on the same day, but ideally anyone would be able to edit the CSV that loads the above dataset as needed, and then each day the script would run once to see if today is between the start and end of daylight savings. We might have new cities to add in the future. I'm confident in doing that bit but quite lost on how to retrieve the dates for a given year.
My alternate, less resilient, option is to look at the distinct list of potential dates (last Sunday of March, first Sunday of April, etc.), write code to retrieve each one upfront (as in the snippet above), and assign the dates in that way. I say that this is less resilient because if a city is added that does not fit in an existing group for time changes, the code would need to be altered as well.
So stackoverflow, is there a way to do this in a data driven way in pandas through an apply or something similar? Thanks in advance.
Basically I think you have most of what you need. Just map the WEEK_START / WEEK_END column values {-1, 1} to the last or first such day of the month, put it all in a function and apply it to each row. Example:
import calendar
import pandas as pd

def get_date(year: int, month: int, dayname: str, first=-1) -> pd.Timestamp:
    """
    get the first or last day "dayname" in given month and year.
    returns last by default.
    """
    daysinmonth = calendar.monthcalendar(year, month)
    weekday = getattr(calendar, dayname.upper())  # e.g. 'Sunday' -> calendar.SUNDAY
    if first == 1:
        # days outside the month are 0, so fall back to the second week
        day = daysinmonth[0][weekday] or daysinmonth[1][weekday]
    else:
        day = max(daysinmonth[-1][weekday], daysinmonth[-2][weekday])
    return pd.Timestamp(year, month, day)

year = 2021 # we need a year...
df['date_start'] = df.apply(lambda row: get_date(year,
                                                 row['MONTH_START'],
                                                 row['DAY_START'],
                                                 row['WEEK_START']), # selects first or last
                            axis=1) # to each row
df['date_end'] = df.apply(lambda row: get_date(year,
                                               row['MONTH_END'],
                                               row['DAY_END'],
                                               row['WEEK_END']),
                          axis=1)
giving you for the sample data
df[['CITY', 'date_start', 'date_end']]
               CITY date_start   date_end
0          Canberra 2021-04-04 2021-10-03
1         Melbourne 2021-04-04 2021-10-03
2            Sydney 2021-04-04 2021-10-03
3         Kitzbuhel 2021-03-28 2021-10-31
4            Vienna 2021-03-28 2021-10-31
5           Antwerp 2021-03-28 2021-10-31
6          Brussels 2021-03-28 2021-10-31
7  Louvain-la-Neuve 2021-03-28 2021-10-31
Once you start working with time zones and DST transitions, the question "Is there a way to infer in Python if a date is the actual day in which the DST (Daylight Saving Time) change is made?" might also be interesting to you.

Date selection using Selenium

So I'm building a program that extracts data from several different websites that will display all our business data on a dashboard.
I've managed to get to the point where I can extract the data I need, and have it set up to select today's data, the previous week and the month to date by importing datetime. For the previous week, for example, I use the following:
date = datetime.date.today()
yesterday = (date - datetime.timedelta(days = 1)).strftime('%d')
lastWeek = (date - datetime.timedelta(days = 7)).strftime('%d')
today = date.strftime('%d')
Then just using the find element function:
browser.find_element_by_link_text(lastWeek).click()
However, I'm running into problems at the beginning of the month: I can't simply subtract 7 from the day number, as that would take me into negative numbers, and it wouldn't land on the right date in the previous month because each month has a different number of days.
Is there any way of getting around this?
TIA
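A minimal sketch of the date arithmetic only, with no Selenium calls: subtracting a timedelta from a full date (rather than from the day number) already rolls back into the previous month correctly, and comparing months tells you how many times the calendar widget would need to be moved back before the link can be clicked. The name target_day is made up for this example, and how the widget is navigated depends on the site, so that step is not shown.

```python
import datetime


def target_day(today, days_back):
    """Return the target date and how many months before `today` it lies (hypothetical helper)."""
    target = today - datetime.timedelta(days=days_back)
    months_back = (today.year - target.year) * 12 + (today.month - target.month)
    return target, months_back


target, months_back = target_day(datetime.date(2021, 3, 3), 7)
link_text = str(target.day)  # "24"; strftime('%d') would zero-pad single digits, e.g. "03"
```

Here target is 2021-02-24 and months_back is 1, so the picker would first have to show February. Whether the link text is zero-padded is an assumption to check against the actual page.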

Python Count number of records within a given date range

We have a backend table that stores transaction details, including the time in seconds since the epoch. I am creating a UI that collects from/to dates and displays the count of transactions that occurred between them.
Assuming that the date range is 07/01/2012 - 07/30/2012, I am unable to work out logic that increments a counter for the records that happened within that period. I should hit the DB only once, as querying once per day would perform poorly.
I am stuck on the logic:
Convert 07/01/2012 & 07/30/2012 to seconds since epoch.
Get the records for start date - end date [as converted to seconds since epoch]
For each record get the month / date
-- now how will we add counters for each date in between 07/01/2012 - 07/30/2012
MySQL has the function FROM_UNIXTIME which will convert your seconds since epoch into datetime and you can then extract the DATE part of it (YYYY-MM-DD format) and group according to it.
SELECT DATE(FROM_UNIXTIME(timestamp_column)), COUNT(*)
FROM table_name
GROUP BY DATE(FROM_UNIXTIME(timestamp_column))
This will return something like
2012-07-01 2
2012-07-03 4
…
(no entries for days without transactions)
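If the rows are fetched in a single query and counted on the Python side instead, the per-day counters from the question could look roughly like the sketch below. The name daily_counts is made up for this example, and the timestamps are assumed to be interpreted in UTC (FROM_UNIXTIME uses the MySQL session time zone, so the two approaches can disagree near midnight). Unlike the GROUP BY result, days without transactions appear with a count of 0.

```python
from collections import Counter
from datetime import date, datetime, timedelta, timezone


def daily_counts(epoch_seconds, start, end):
    """Map every date in [start, end] to the number of timestamps on that date (UTC assumed)."""
    counts = Counter(
        datetime.fromtimestamp(ts, tz=timezone.utc).date() for ts in epoch_seconds
    )
    total_days = (end - start).days + 1
    return {
        start + timedelta(days=i): counts.get(start + timedelta(days=i), 0)
        for i in range(total_days)
    }
```

The start and end dates would also be converted to seconds since the epoch to restrict the single query to the requested range, as described in the steps above.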

Sorting scheduled events python

So I have a list of events that are sort of like alarms. They're defined by their start and end time (in hours and minutes), a range of days (i.e. 1-3, which is Sunday through Wed.), and a range of months (i.e. 1-3, January through March). The format of that data is largely unchangeable. I don't necessarily need to sort the list, but I do need to find the next upcoming event based on the current time. There are just so many different ways to do this and so many different corner cases. This is my pseudo code:
now = time()
diff = []
# Start difference between now and start times
for s in schedule:  # assuming appending to diff
    diff.minutes = s.minutes - now.minutes
    diff.hours = s.hours - now.hours
    diff.days = s.days - now.days
    diff.months = s.months - now.months
for d in diff:
    if d < 0:
        d = period + d
        # period is the maximum period of the attribute, i.e. minutes is 60, hours is 24
# repeat for event end times
So now I have a list of tuples of differences in hours, minutes, days, and months. Each tuple already takes into account whether the current time is past the start time but before the end time. So let's say it's August, the start month of the event is July and the end month is September; then diff.month == 0.
Now this specific corner case is giving me trouble:
Let's say a schedule runs from 0 to 23:59 on Thursdays in August, and it's Friday the 27th. Running my algorithm, the difference in months would be 0, when in reality the event won't run again until next August, so it should be 12. And I'm stuck. I think the month is the only problem, because the month is the only attribute that directly depends on the specific date within the month (versus just the day). Is my algorithm OK, so that I can just deal with this special case? Or is there something better out there for this?
This is the data I'm working with
map['start_time']=''
map['end_time']=''
map['start_moy']=''
map['end_moy']=''
map['start_dow']=''
map['end_dow']=''
The schedule getAllSchedules method just returns a list of all the schedules. I can change the schedule class, but I'm not sure what difference I can make there. I can't add to or change the format of the schedules I'm given.
Convert the items from the schedule into datetime objects. Then you can simply sort them
from datetime import datetime
events = sorted(datetime(s.year, s.month, s.day, s.hour, s.minute) for s in schedule)
Since your resolution is in minutes, and assuming that you don't have many events, then I'd simply scan all the events every minute.
Filter your events so that you have a new list where the event range match the current month and day.
Then for each of those events declare that they are active or inactive according to whether the current time matches the event's range.
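The filtering described here could be sketched as follows. This is only an illustration: the field names are taken from the question, but the values are assumed to have been parsed already into integers and datetime.time objects, weekdays are assumed to use Python's numbering (Monday=0 ... Sunday=6), and ranges are assumed not to wrap around (for example November through February).

```python
from datetime import datetime, time


def is_active(now, event):
    """True if `now` falls inside the event's month, weekday and time ranges (hypothetical helper)."""
    return (
        event["start_moy"] <= now.month <= event["end_moy"]
        and event["start_dow"] <= now.weekday() <= event["end_dow"]
        and event["start_time"] <= now.time() <= event["end_time"]
    )


# Hypothetical event: Thursdays from July through September, all day.
event = {
    "start_moy": 7, "end_moy": 9,
    "start_dow": 3, "end_dow": 3,
    "start_time": time(0, 0), "end_time": time(23, 59),
}
```

With this event, a Thursday such as datetime(2010, 8, 26, 12, 0) is active, while the following Friday is not.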
The primary issue seems to be with the fact that you're using the day of the week, instead of explicit days of the month.
While your cited edge case is one example, does this issue not crop up with all events scheduled in any month outside of the current one?
I think the most robust approach here would be to do the work to get your scheduled events into datetime format, then use @gnibbler's suggestion of sorting the datetime objects.
Once you have determined that the last event for the current month has already passed, calculate the distance to the next month the event occurs in (be it + 1 year, or just + 1 month), then construct a datetime object with that information:
first_of_month = datetime.date(calculated_year, calculated_month, 1)
By using the first day of the month, you can then use:
day_of_week = first_of_month.strftime('%w')
To give you what day of the week the first of that month falls on, which you can then use to calculate how many days to add to get to the first, second, third, etc. instance of a given day of the week, for that month. Once you have that day, you can construct a valid datetime object and do whatever comparisons you wish with now().
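A simpler, if less clever, alternative to computing the offsets by hand is to walk forward one day at a time until a day matches both the month and the weekday. The sketch below is not the method described above, just an illustration; the name next_start is made up, and months and weekdays are assumed to be sets of integers with Python's weekday numbering (Monday=0 ... Sunday=6).

```python
from datetime import datetime, time, timedelta


def next_start(now, start_time, months, weekdays, max_days=2 * 366):
    """First datetime >= now at `start_time` on a day whose month and weekday both match."""
    for offset in range(max_days):
        day = now.date() + timedelta(days=offset)
        candidate = datetime.combine(day, start_time)
        if candidate >= now and day.month in months and day.weekday() in weekdays:
            return candidate
    return None  # nothing matched within the search window


# The corner case from the question: Thursdays in August, asked on Friday 27 August 2010.
upcoming = next_start(datetime(2010, 8, 27, 10, 0), time(0, 0), {8}, {3})
```

Here upcoming is 2011-08-04 00:00, the first Thursday of the following August, so the "next year" case falls out of the search without special handling. Scanning at most two years of days is cheap, and the candidates from all events can then be compared with min().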
I couldn't figure out how to do it using only datetimes. But I found a module and used this. It's perfect
http://labix.org/python-dateutil
