I have to subtract a time interval from a date using datetime, like this:
datetime.datetime.now() - datetime.timedelta(minutes=15)
# subtracting 15 minutes
The problem is that I don't necessarily need to subtract minutes.
I have a dictionary that tells me what I should subtract.
period = {'quantity': '15', 'time': 'minutes'}
But I need the subtracted time to change whenever the dictionary changes. Like this:
if period['time'] == 'minutes':
    tempo_atras = datetime.datetime.now() - datetime.timedelta(minutes=int(period['quantity']))
elif period['time'] == 'hours':
    tempo_atras = datetime.datetime.now() - datetime.timedelta(hours=int(period['quantity']))
elif period['time'] == 'days':
    tempo_atras = datetime.datetime.now() - datetime.timedelta(days=int(period['quantity']))
elif period['time'] == 'weeks':
    tempo_atras = datetime.datetime.now() - datetime.timedelta(weeks=int(period['quantity']))
I feel that the way I wrote it is not clean, so I need a way to turn the period['time'] string into the name of the keyword argument; something like:
tempo_atras = datetime.datetime.now() - datetime.timedelta(period['time']=int(period['quantity']))
How can I do this?
You can use the dictionary unpacking operator ** to expand a dict into keyword arguments, so you just need to make the dict first.
q = {period['time']: int(period['quantity'])}
tempo_atras = datetime.datetime.now() - datetime.timedelta(**q)
Docs:
Tutorial: Unpacking Argument Lists
Reference: Calls
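As a minimal sketch of the unpacking approach above, the helper below wraps it in a function. The name time_ago, the ALLOWED_UNITS set and the now parameter are assumptions added for illustration; they are not part of the original answer. Checking the unit first turns an unexpected string into a clear error before it reaches timedelta.

```python
import datetime

# Hypothetical whitelist: a subset of the keyword names datetime.timedelta accepts
ALLOWED_UNITS = {"seconds", "minutes", "hours", "days", "weeks"}


def time_ago(period, now=None):
    """Return `now` minus the interval described by `period`."""
    unit = period["time"]
    if unit not in ALLOWED_UNITS:
        raise ValueError(f"unsupported time unit: {unit!r}")
    # Build {unit: quantity} and unpack it into keyword arguments
    delta = datetime.timedelta(**{unit: int(period["quantity"])})
    if now is None:
        now = datetime.datetime.now()
    return now - delta


# A fixed reference time keeps the example reproducible
reference = datetime.datetime(2024, 1, 10, 12, 0)
print(time_ago({"quantity": "15", "time": "minutes"}, now=reference))
print(time_ago({"quantity": "2", "time": "weeks"}, now=reference))
```

Passing now explicitly is only there to make the result predictable; in the original use case it can be left out so that datetime.datetime.now() is used.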
You have to convert the dictionary so that the period is pointing to the quantity. Then use dictionary unpacking normally.
import datetime
period = {'quantity': '15', 'time': 'minutes'}
# convert the dictionary to {'minutes': 15}
period = {period['time']: int(period['quantity'])}
print(period)
tempo_atras = datetime.datetime.now() - datetime.timedelta(**period)
Note: you can read more about this format (and the * format for lists) here: More on Defining Functions
I have a data frame that has a date column. What I need is to create another 2 columns with the "start of week date" and "end of week date". The reason for this is that I will then need to group by an "isoweek" column, but also keep the two columns "start_of_week_date" and "end_of_week_date".
I've created the below function:
def myfunc(dt, option):
    wkday = dt.isoweekday()
    if option == 'start':
        delta = datetime.timedelta(1 - wkday)
    elif option == 'end':
        delta = datetime.timedelta(7 - wkday)
    else:
        raise TypeError
    # the parameter is named dt, so use dt (not date) here
    return dt + delta
Now I don't know how to use the above function to populate the columns.
I probably don't even need my function to get what I need, which is this: I have a DF with the columns below
date, isoweek, qty
I will need to change it to:
isoweek, start_of_week_date, end_of_week_date, qty
This would then take my data from 1.8 million rows to 300 thousand rows :D
Can someone help me?
Thank you
There might be built-in functions that one can use, and I can see that one of the answers proposes such.
However, if you wish to apply your own function (which is perfectly acceptable), then you could use apply with a lambda.
Here is an example:
import datetime
import pandas as pd
# an example dataframe with a datetime column
d = {'some date': pd.to_datetime(['2019-05-06', '2019-05-08', '2019-05-10', '2019-05-12']),
     'other data': [2, 4, 6, 8]}
df = pd.DataFrame(d)
# user defined function from the question (returning dt + delta)
def myfunc(dt, option):
    wkday = dt.isoweekday()
    if option == 'start':
        delta = datetime.timedelta(1 - wkday)
    elif option == 'end':
        delta = datetime.timedelta(7 - wkday)
    else:
        raise TypeError
    return dt + delta
# call the function once per row, passing that row's date
df['new_col'] = df.apply(lambda x: myfunc(x['some date'], 'start'), axis=1)
Hope I understand correctly. Refer to dt.weekday for calculating the week start and week end; here I've used 6 for Sunday. If you need any other day as the end of the week, use the appropriate number.
The day of the week is numbered with Monday=0, Sunday=6.
df['start_of_week_date'] = df['Date'] - df['Date'].dt.weekday.astype('timedelta64[D]')
df['end_of_week_date'] = df['Date'] + (6 - df['Date'].dt.weekday).astype('timedelta64[D]')
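The following is a rough sketch of the same idea carried through to the grouping step the question asks about. It swaps astype('timedelta64[D]') for pd.to_timedelta(..., unit='D'), since the astype form may not be accepted by recent pandas versions. The sample data, including the isoweek values, is invented for illustration.

```python
import pandas as pd

# Invented sample data; the question's real frame has date, isoweek and qty columns
df = pd.DataFrame({
    "Date": pd.to_datetime(["2019-05-06", "2019-05-08", "2019-05-12", "2019-05-13"]),
    "isoweek": [19, 19, 19, 20],
    "qty": [1, 2, 3, 4],
})

weekday = df["Date"].dt.weekday  # Monday=0 ... Sunday=6
df["start_of_week_date"] = df["Date"] - pd.to_timedelta(weekday, unit="D")
df["end_of_week_date"] = df["Date"] + pd.to_timedelta(6 - weekday, unit="D")

# One row per week, keeping both boundary columns and summing qty
weekly = df.groupby(
    ["isoweek", "start_of_week_date", "end_of_week_date"], as_index=False
)["qty"].sum()
print(weekly)
```

Because the start and end dates are fully determined by the week, adding them to the grouping keys does not create extra groups; it only carries them into the result.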
I am trying to apply the following function, which takes two datetime64 pandas dataframe columns as arguments:
from datetime import datetime
import pandas as pd
def set_dif_months_na(start_date, end_date):
    if (pd.isnull(start_date) and pd.notnull(end_date)):
        return None
    elif (pd.notnull(start_date) and pd.isnull(end_date)):
        return None
    elif (pd.isnull(start_date) and pd.isnull(end_date)):
        return None
    else:
        start_date = datetime.strptime(start_date, "%d/%m/%Y")
        end_date = datetime.strptime(end_date, "%d/%m/%Y")
        return abs((end_date.year - start_date.year) * 12 + (end_date.month - start_date.month))
This function is intended to return the month difference as an integer given two dates as arguments; otherwise it has to return None.
When I apply it to a new pandas dataframe column like this:
df['new_col'] = [set_dif_months_na(date1, date2)
for date1,date2 in
zip(df['date1'], df['date2'])]
The following error is raised:
TypeError: strptime() argument 1 must be str, not Timestamp
How could I adjust the function in order to properly apply it over a new pandas dataframe column?
You see, pandas stores dates as numpy.datetime64 values (exposed as Timestamp objects), so the columns already hold parsed dates rather than strings. That is why strptime, which only accepts strings, rejects them.
There are a couple of different solutions, but if you want to use datetime, which is more readable in my opinion, you can do something like this. First we define a function to convert between the two data types (got it from here):
import datetime
import numpy as np
def numpy2datetime(date):
    # seconds since the Unix epoch, converted to a naive UTC datetime
    return datetime.datetime.utcfromtimestamp(
        (date - np.datetime64('1970-01-01T00:00:00')) /
        np.timedelta64(1, 's'))
Then you may be able to do what you want by changing your function from:
start_date = datetime.strptime(start_date, "%d/%m/%Y")
end_date = datetime.strptime(end_date, "%d/%m/%Y")
to
start_date = numpy2datetime(start_date)
end_date = numpy2datetime(end_date)
This should work. However, I may have some additional suggestions for you. First, you can change all your if and elif to a single one by using the or logical operator:
if pd.isnull(start_date) or pd.isnull(end_date):
    return None
else:
    start_date = numpy2datetime(start_date)
    end_date = numpy2datetime(end_date)
    return abs((end_date.year - start_date.year) * 12 + (end_date.month - start_date.month))
And a last one is regarding your list comprehension. You don't need zip at all, since both columns are within the same dataframe. You can simply do:
df['new_col'] = [set_dif_months_na(date1, date2)
for date1,date2 in
df[['date1','date2']].values]
Don't know if it's faster, but at least it's clearer.
Hope it's useful. And let us know if you have any further issues.
By changing start_date and end_date setting from strptime to pd.to_datetime the function worked without any error:
def set_dif_months_na(start_date, end_date):
    if (pd.isnull(start_date) and pd.notnull(end_date)):
        return None
    elif (pd.notnull(start_date) and pd.isnull(end_date)):
        return None
    elif (pd.isnull(start_date) and pd.isnull(end_date)):
        return None
    else:
        start_date = pd.to_datetime(start_date, format="%d/%m/%Y")
        end_date = pd.to_datetime(end_date, format="%d/%m/%Y")
        return abs((end_date.year - start_date.year) * 12 + (end_date.month - start_date.month))
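If both columns are already datetime64, one possible alternative is to skip the per-row function and compute the difference on whole columns through the .dt accessor. This is only a sketch with invented sample data; note that missing dates come out as NaN (so the column becomes float) rather than None.

```python
import pandas as pd

# Invented sample data; None becomes NaT after parsing
df = pd.DataFrame({
    "date1": pd.to_datetime(["01/01/2019", "15/03/2019", None], format="%d/%m/%Y"),
    "date2": pd.to_datetime(["01/04/2019", None, "10/10/2020"], format="%d/%m/%Y"),
})

# Same arithmetic as set_dif_months_na, applied column-wise
year_diff = df["date2"].dt.year - df["date1"].dt.year
month_diff = df["date2"].dt.month - df["date1"].dt.month
df["new_col"] = (year_diff * 12 + month_diff).abs()
print(df)
```

Any row where either date is missing propagates NaN through the arithmetic, which plays the role of the three None branches in the function above.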
I'm trying to provide a dynamic keyword to the relativedelta function. For example, instead of hard-coding relativedelta(days=1), I would like to choose days, months or years at run time. Consider my situation as follows.
I will get a list dynamically, as follows:
Ex: 1
list = ['today', 'minus', '1', 'days']
Ex: 2
list = ['today', 'plus', '1', 'year']
Ex: 3
list = ['today', 'plus', '1', 'months']
I wrote my code to handle the calculation
import operator
from datetime import datetime
from datetime import date
from dateutil.relativedelta import relativedelta
operations = {
'plus': operator.add,
'minus': operator.sub,
}
today = date.today()
new_date = operations['plus'](today, relativedelta(days=1))
# the above is some thing like [today + relativedelta(days=1)]
What I'm trying to do is, just like with operations, to pass days, months or years to relativedelta() dynamically, but I wasn't able to do it. Is there a suggested way to do it?
Found a way to do it!
We can use the **expression call syntax to pass a dictionary to a function; it will be expanded into keyword arguments (which a **kwargs function parameter would capture again):
attributes = {'days': 1}
relativedelta(**attributes)
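Put together with the token lists from the question, this could look roughly like the sketch below. The function name shift_date, the UNITS set and the today parameter are assumptions made for illustration. One detail worth handling: relativedelta treats singular keywords such as year=1 as absolute values (set the year to 1) and plural ones such as years=1 as relative offsets, so the sketch normalizes 'year' to 'years' before unpacking.

```python
import operator
from datetime import date

from dateutil.relativedelta import relativedelta

OPERATIONS = {"plus": operator.add, "minus": operator.sub}
# Plural keyword names are the relative offsets in relativedelta
UNITS = {"days", "weeks", "months", "years"}


def shift_date(tokens, today=None):
    """Evaluate a list such as ['today', 'plus', '1', 'year']."""
    base, op_name, amount, unit = tokens
    if base != "today":
        raise ValueError(f"unsupported base: {base!r}")
    if not unit.endswith("s"):
        unit += "s"  # 'year' -> 'years', so it is not read as an absolute year
    if unit not in UNITS:
        raise ValueError(f"unsupported unit: {unit!r}")
    start = today if today is not None else date.today()
    return OPERATIONS[op_name](start, relativedelta(**{unit: int(amount)}))


fixed_today = date(2020, 1, 31)  # fixed date so the example is reproducible
print(shift_date(["today", "minus", "1", "days"], today=fixed_today))
print(shift_date(["today", "plus", "1", "year"], today=fixed_today))
print(shift_date(["today", "plus", "1", "months"], today=fixed_today))
```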
I'm trying to create a list of the hours contained within each specified interval, which would be quite complicated with a loop. Therefore, I wanted to ask for datetime recommendations.
# input in format DDHH/ddhh:
validity = ['2712/2812','2723/2805','2800/2812']
# demanded output:
val_hours = ['2712', '2713', '2714'..., '2717', '2723', '2800',...'2804',]
It would be great if the last hour of validity were considered non-valid, because the interval ends at that hour or, more precisely, at the 59th minute of the previous one.
I've tried a quite complicated way with if conditions and loops, but I am convinced that there is a better one, as always.
It is something like:
#input in format DDHH/ddhh:
validity = ['2712/2812','2723/2805','2800/2812']
output = []
#upbound = previously defined function returning the length of each group
upbound = [24, 6, 12]
#For only the first 24-hour group:
for hour in range(0,upbound[0]):
    item = int(validity[0][-7:-5]) + hour
    if (item >= 24):
        item = item - 24
    output.append(item)
Further, I would have to prefix values whose day is smaller than 10, like 112 (the 1st, 12:00 Zulu), with a zero and ensure the correct day.
Loops and ifs seem just too complicated to me. Not to mention error handling; it looks like two or three conditions.
Thank you for your help!
For each validity string, I use datetime.strptime to parse the start and end, then calculate the hours depending on whether the start date is less than or equal to the end date, or greater than it.
If the start date is less than or equal to the end date, I use the original validity string; otherwise I split it into two strings, start_date/3023 and 0100/end_date (this assumes the month ends on day 30).
import datetime
validity = ['2712/2812','2723/2805','2800/2812','3012/0112','3023/0105','0110/0112']
def get_valid_hours(valid):
    hours_li = []
    #Parse the start date and end date as datetime
    start_date_str, end_date_str = valid.split('/')
    start_date = datetime.datetime.strptime(start_date_str,'%d%H')
    end_date = datetime.datetime.strptime(end_date_str, '%d%H')
    #If start date less than equal to end date
    if start_date <= end_date:
        dt = start_date
        i=0
        #Keep creating new dates until we hit end date
        while dt < end_date:
            #Append the dates to a list
            dt = start_date+datetime.timedelta(hours=i)
            hours_li.append(dt.strftime('%d%H'))
            i+=1
    #Else split the validity into two and calculate them separately
    else:
        start_date_str, end_date_str = valid.split('/')
        return get_valid_hours('{}/3023'.format(start_date_str)) + get_valid_hours('0100/{}'.format(end_date_str))
    #Append sublist to a bigger list
    return hours_li
for valid in validity:
    print(get_valid_hours(valid))
The output then looks like this; I'm not sure if this was the format needed!
['2712', '2713', '2714', '2715', '2716', '2717', '2718', '2719', '2720', '2721', '2722', '2723', '2800', '2801', '2802', '2803', '2804', '2805', '2806', '2807', '2808', '2809', '2810', '2811', '2812']
['2723', '2800', '2801', '2802', '2803', '2804', '2805']
['2800', '2801', '2802', '2803', '2804', '2805', '2806', '2807', '2808', '2809', '2810', '2811', '2812']
['3012', '3013', '3014', '3015', '3016', '3017', '3018', '3019', '3020', '3021', '3022', '3023', '0100', '0101', '0102', '0103', '0104', '0105', '0106', '0107', '0108', '0109', '0110', '0111', '0112']
['0100', '0101', '0102', '0103', '0104', '0105']
['0110', '0111', '0112']
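The output above includes the closing hour of each interval, whereas the question asks for it to be left out. A possible variant that excludes it, and that takes the month length as a parameter instead of hard-coding day 30, is sketched below. The name valid_hours and the days_in_month parameter are assumptions for illustration; the DDHH strings carry no month, so the caller has to supply the length.

```python
def valid_hours(interval, days_in_month=31):
    """List the DDHH hours in 'DDHH/ddhh', excluding the closing hour."""
    start, end = interval.split("/")
    day, hour = int(start[:2]), int(start[2:])
    end_day, end_hour = int(end[:2]), int(end[2:])
    # Validate first so the loop below is guaranteed to reach the end point
    for d, h in ((day, hour), (end_day, end_hour)):
        if not (1 <= d <= days_in_month and 0 <= h <= 23):
            raise ValueError(f"invalid day/hour in {interval!r}")

    hours = []
    while (day, hour) != (end_day, end_hour):
        hours.append(f"{day:02d}{hour:02d}")
        hour += 1
        if hour == 24:  # roll over to the next day, wrapping at month end
            hour = 0
            day = day + 1 if day < days_in_month else 1
    return hours


print(valid_hours("2723/2805"))
print(valid_hours("3023/0105", days_in_month=30))
```

Zero-padding comes from the :02d format, so days and hours below 10 keep their leading zero.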
Finally, I created something easy like this:
validity = ['3012/0112','3023/0105','0110/0112']
upbound = [24, 6, 12]
hours_list = []
for idx, val in enumerate(validity):
    hours_li = []
    DD = val[:2]
    HH = val[2:4]
    dd = val[5:7]
    hh = val[7:9]
    if DD == dd:
        for i in range(int(HH),upbound[idx]):
            hours_li.append(DD + str(i).zfill(2))
    if DD != dd:
        for i in range(int(HH),24):
            hours_li.append(DD + str(i).zfill(2))
        for j in range(0,int(hh)):
            hours_li.append(dd + str(j).zfill(2))
    hours_list.append(hours_li)
This works for 24h validity (other cases could be solved by one more if condition and a similar concatenation block) and does not use datetime, just numbers and str. It is neither pythonic nor fast, but it works.
elif row[inc3].startswith('LIGHT ON'):
    onstr = row[inc3 + 1]
    onlst.append(onstr)
elif row[inc4].startswith('LIGHT OFF'):
    offstr = row[inc4 + 1]
    offlst.append(offstr)
for idx, val in enumerate(onlst):
    tdifflst.append(float(offlst[idx]) - float(onlst[idx]))
Here I pulled out the code from a script that extracts data from an Excel spreadsheet and analyzes it. The two types of values are the time a light turned on and the time a light turned off. For instance, light on at 0500 and light off at 2300.
I want to subtract the on time from the off time, but I obviously can't treat these as true floats because there are 60 minutes in an hour. How do I treat these "floats" like the times they are?
You could do something like this:
>>> from datetime import datetime
>>> light_on = datetime.strptime('0500', '%H%M')
>>> light_off = datetime.strptime('2300', '%H%M')
>>> print(light_off - light_on)
18:00:00
light_on and light_off are datetime objects and the difference is a timedelta object.
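Applied to the two lists from the question, the strptime approach might be sketched as follows. The helper name light_duration and the sample values are made up for illustration, and treating an off time earlier than the on time as "the next day" is an assumption about the data, not something the question states.

```python
from datetime import datetime, timedelta


def light_duration(on_str, off_str):
    """Return the timedelta between two 'HHMM' strings."""
    on = datetime.strptime(on_str, "%H%M")
    off = datetime.strptime(off_str, "%H%M")
    if off < on:
        # Assumption: the light was switched off after midnight
        off += timedelta(days=1)
    return off - on


# Invented sample values in the same shape as onlst / offlst
onlst = ["0500", "2230"]
offlst = ["2300", "0615"]

# Durations in decimal hours, one per on/off pair
tdifflst = [
    light_duration(on, off).total_seconds() / 3600
    for on, off in zip(onlst, offlst)
]
print(tdifflst)
```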
I think you are at least implying that the time will always be reported in the form hhmm,
so it is trivial to modify your code:
elif row[inc3].startswith('LIGHT ON'):
    raw_time = row[inc3 + 1]
    # hhmm -> decimal hours, e.g. '0530' -> 5.5
    float_time = int(raw_time[0:2]) + int(raw_time[2:]) / 60.
    onlst.append(float_time)
elif row[inc4].startswith('LIGHT OFF'):
    raw_time = row[inc4 + 1]
    float_time = int(raw_time[0:2]) + int(raw_time[2:]) / 60.
    offlst.append(float_time)
Another way:
import datetime
hh, mm = int(offstr[0:2]), int(offstr[2:4])
# a dummy date 101/1/1
toff = datetime.datetime(101, 1, 1, hh, mm)
hh, mm = int(onstr[0:2]), int(onstr[2:4])
tdiff = toff - datetime.timedelta(hours=hh, minutes=mm)
tdiff.hour
tdiff.minute
You can also add the whole date (year, month, day) to the datetime objects, so you can get the result for differences greater than 24 hours.