I am trying to get the timestamps for an entire day in an array. I need one timestamp per second from 00:00:00 to 24:00:00, i.e. 86400 values.
I tried to do this with pandas but could not get it to work:
import datetime
import pandas as pd
n_days = 1
# today's date as a timestamp
base = pd.Timestamp.today()
timestamp_list = [base + datetime.timedelta(days=x) for x in range(n_days)]
This should do it then:
import pandas as pd
start = pd.Timestamp('2022-02-16') # start of the day
end = start + pd.Timedelta(days=1) # end of the day
timestamps = pd.date_range(start, end, freq='s')
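Note that `pd.date_range(start, end, ...)` includes both endpoints by default, so the range above contains 86401 values (the last one is 00:00:00 of the next day). If exactly 86400 values are wanted, one option is to pass `periods` instead of `end`. This is a minimal sketch that reuses the same example date:

```python
import pandas as pd

start = pd.Timestamp('2022-02-16')  # start of the day (example date)

# one timestamp per second, 86400 of them, ending at 23:59:59
timestamps = pd.date_range(start, periods=86400, freq='s')

print(len(timestamps))
print(timestamps[0], timestamps[-1])
```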
Related
I have a pandas dataframe df in which I have a column named time_column which consists of timestamp objects. I want to calculate the number of seconds elapsed from the start of the day i.e from 00:00:00 Hrs for each timestamp. How can that be done?
You can use pandas.Series.dt.total_seconds
df['time_column'] = pd.to_datetime(df['time_column'])
df['second'] = pd.to_timedelta(df['time_column'].dt.time.astype(str)).dt.total_seconds()
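An alternative that avoids the round trip through strings is to subtract each timestamp's midnight from the timestamp itself. The sketch below assumes `time_column` is already a datetime column; the sample data is made up for illustration.

```python
import pandas as pd

# made-up sample data standing in for the real dataframe
df = pd.DataFrame({
    "time_column": pd.to_datetime([
        "2021-01-01 00:00:00",
        "2021-01-01 01:02:03",
        "2021-06-15 23:59:59",
    ])
})

# dt.normalize() sets the time part to 00:00:00, keeping the date
midnight = df["time_column"].dt.normalize()

# the difference is a timedelta; convert it to seconds
df["second"] = (df["time_column"] - midnight).dt.total_seconds()

print(df)
```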
Do df['time_column']. That will give you the time column. Then just do something like:
time_elapsed = []
for ts in df['time_column']:
    # seconds elapsed since 00:00:00 of the timestamp's own day
    time_elapsed.append(ts.hour*60*60 + ts.minute*60 + ts.second)
I am trying to generate 8-day intervals between two dates using pandas.date_range. In addition, when an 8-day interval runs past the end of the year (day 365 or 366), I would like the range to restart at the beginning of the next year. Below is example code covering just two years; however, I plan to use it across several years, e.g., 2014-01-01 to 2021-01-01.
import pandas as pd
print(pd.date_range(start='2018-12-01', end='2019-01-31', freq='8D'))
This results in:
DatetimeIndex(['2018-12-01', '2018-12-09', '2018-12-17', '2018-12-25','2019-01-02', '2019-01-10', '2019-01-18', '2019-01-26'], dtype='datetime64[ns]', freq='8D')
However, I would like the intervals in 2019 to restart on the first day of the year, i.e. 2019-01-01.
You could loop creating a date_range up to the start of the next year for each year, appending them until you hit the end date.
import pandas as pd
from datetime import date

def date_range_with_resets(start, end, freq):
    start = date.fromisoformat(start)
    end = date.fromisoformat(end)
    result = pd.date_range(start=start, end=start, freq=freq)  # initialize result with just start date
    next_year_start = start.replace(year=start.year+1, month=1, day=1)
    while next_year_start < end:
        result = result.append(pd.date_range(start=start, end=next_year_start, freq=freq))
        start = next_year_start
        next_year_start = next_year_start.replace(year=next_year_start.year+1)
    result = result.append(pd.date_range(start=start, end=end, freq=freq))
    return result[1:]  # remove duplicate start date

start = '2018-12-01'
end = '2019-01-31'
date_range_with_resets(start, end, freq='8D')
Edit:
Here's a simpler way without using datetime. Create a date_range of years between start and end, then loop through those.
def date_range_with_resets(start, end, freq):
    years = pd.date_range(start=start, end=end, freq='YS')  # YS=year start
    if len(years) == 0:
        return pd.date_range(start=start, end=end, freq=freq)
    result = pd.date_range(start=start, end=years[0], freq=freq)
    for i in range(0, len(years) - 1):
        result = result.append(pd.date_range(start=years[i], end=years[i+1], freq=freq))
    result = result.append(pd.date_range(start=years[-1], end=end, freq=freq))
    # drop the duplicate that appears when a sub-range starts or ends exactly on 1 January
    return result.unique()
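For comparison, the same reset logic can be sketched with only the standard library, stepping one interval at a time and jumping to 1 January whenever a step would cross into a new year. The function name `dates_with_yearly_reset` and its `step_days` parameter are made up for this illustration; it returns plain `date` objects rather than a `DatetimeIndex`, and it assumes a step shorter than a year.

```python
from datetime import date, timedelta

def dates_with_yearly_reset(start, end, step_days):
    """Return dates from start to end (inclusive, ISO strings) spaced
    step_days apart, restarting on 1 January of every new year."""
    current = date.fromisoformat(start)
    last = date.fromisoformat(end)
    step = timedelta(days=step_days)
    result = []
    while current <= last:
        result.append(current)
        nxt = current + step
        # if the step crosses a year boundary, restart at 1 January instead
        if nxt.year != current.year:
            nxt = date(current.year + 1, 1, 1)
        current = nxt
    return result

for d in dates_with_yearly_reset('2018-12-01', '2019-01-31', 8):
    print(d.isoformat())
```

Because each date is produced exactly once, no de-duplication is needed when the start date itself falls on 1 January.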
I am writing a script for my project in which I need to enter a start date and an end date on a web page. The start date must be the first day of the current month and the end date must be yesterday's date.
Below is my code, which uses predefined day offsets. At the moment I have to set the number of days back from today manually, but I need the start date to be set to the first day of the current month automatically.
import datetime as dt
daystostart = 6
daystoend = 1
# Time and date
yesterday = dt.datetime.now() - dt.timedelta(days=daystostart)
StartDT = yesterday.strftime("%Y-%m-%d ") + "00:00:00"
yesterdayNightEnd = dt.datetime.now() - dt.timedelta(days=daystoend)
EndDT = yesterdayNightEnd.strftime("%Y-%m-%d ") + "23:59:59"
from datetime import datetime
today = datetime.today().date()
first_day = today.replace(day=1)
To get the first day of the current month:
from datetime import datetime as dt
now = dt.now()
dt(now.year, now.month, 1)
from datetime import datetime
today = datetime.today()
first_day = datetime(today.year,today.month,1)
#2021-08-01 00:00:00
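Putting the pieces together for the original question, the start and end strings could be built as follows. This is only a sketch: the helper name `month_to_date_window` and its optional `today` argument are made up here so that the result can be checked for a fixed date.

```python
from datetime import date, timedelta

def month_to_date_window(today=None):
    """Return (StartDT, EndDT) strings: first day of the current month
    at 00:00:00 and yesterday at 23:59:59."""
    today = today or date.today()
    first_day = today.replace(day=1)          # first day of the current month
    yesterday = today - timedelta(days=1)     # yesterday's date
    start_dt = first_day.strftime("%Y-%m-%d") + " 00:00:00"
    end_dt = yesterday.strftime("%Y-%m-%d") + " 23:59:59"
    return start_dt, end_dt

StartDT, EndDT = month_to_date_window()
print(StartDT, EndDT)
```

One caveat: on the first day of a month, yesterday belongs to the previous month, so the end date comes before the start date. How to handle that case depends on what the web page expects.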
My below working code calculates date/month ranges, but I am using the Pandas library, which I want to get rid of.
import pandas as pd
from pandas.tseries.offsets import MonthEnd
dates = pd.date_range("2019-11", "2020-02", freq='MS').strftime("%Y%m%d").tolist()
# print(dates): ['20191101', '20191201', '20200101', '20200201']
df = (pd.to_datetime(dates, format="%Y%m%d") + MonthEnd(1)).strftime("%Y%m%d").tolist()
# print(df): ['20191130', '20191231', '20200131', '20200229']
How can I rewrite this code without using Pandas?
I don't want to use Pandas library as I am triggering my job through Oozie and we don't have Pandas installed on all our nodes.
Pandas offers some nice datetime functionality that the standard library datetime module does not have (such as frequencies or MonthEnd). You have to reproduce it yourself.
import datetime as DT

def next_first_of_the_month(dt):
    """return a new datetime where the month has been increased by 1 and
    the day is always the first
    """
    new_month = dt.month + 1
    if new_month == 13:
        new_year = dt.year + 1
        new_month = 1
    else:
        new_year = dt.year
    return DT.datetime(new_year, new_month, day=1)

start, stop = [DT.datetime.strptime(dd, "%Y-%m") for dd in ("2019-11", "2020-02")]
dates = [start]
cd = next_first_of_the_month(start)
while cd <= stop:
    dates.append(cd)
    cd = next_first_of_the_month(cd)

str_dates = [d.strftime("%Y%m%d") for d in dates]
print(str_dates)
# prints: ['20191101', '20191201', '20200101', '20200201']

end_dates = [next_first_of_the_month(d) - DT.timedelta(days=1) for d in dates]
str_end_dates = [d.strftime("%Y%m%d") for d in end_dates]
print(str_end_dates)
# prints ['20191130', '20191231', '20200131', '20200229']
Here I used a function that returns a datetime for the first day of the month following the input datetime. Sadly, timedelta does not work with months, and adding 30 days is not an option since not all months have 30 days.
Then a while loop builds the sequence of first days of the month up to the stop date.
To get the end of each month, take the first day of the following month for each datetime in the list and subtract one day.
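Another standard-library route to the month ends is `calendar.monthrange`, which returns the weekday of the first day and the number of days in a given month. The sketch below uses a made-up helper name, `month_starts_and_ends`, and assumes the bounds are given as "YYYY-MM" strings, as in the question.

```python
import calendar
from datetime import date

def month_starts_and_ends(start, stop):
    """Return two lists of 'YYYYMMDD' strings: the first and the last day
    of every month from start to stop (both 'YYYY-MM', inclusive)."""
    year, month = map(int, start.split("-"))
    stop_year, stop_month = map(int, stop.split("-"))
    starts, ends = [], []
    while (year, month) <= (stop_year, stop_month):
        # monthrange returns (weekday of day 1, number of days in the month)
        last_day = calendar.monthrange(year, month)[1]
        starts.append(date(year, month, 1).strftime("%Y%m%d"))
        ends.append(date(year, month, last_day).strftime("%Y%m%d"))
        # move to the next month, rolling over the year after December
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return starts, ends

starts, ends = month_starts_and_ends("2019-11", "2020-02")
print(starts)
print(ends)
```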
I have a dataframe (df) with a start_date column and an add_days column (=10). I want to create a target_date column (= start_date + add_days) that excludes weekends and holidays (the holidays are in a separate dataframe).
I did some research and tried this:
import pandas as pd
from datetime import timedelta

df["start_date"] = pd.to_datetime(df["start_date"])
Holidays['Date_holi'] = pd.to_datetime(Holidays['Date_holi'])

def date_by_adding_business_days(from_date, add_days, holidays):
    business_days_to_add = add_days
    current_date = from_date
    while business_days_to_add > 0:
        current_date += timedelta(days=1)
        weekday = current_date.weekday()
        if weekday >= 5:  # Saturday = 5, Sunday = 6
            continue
        if current_date in holidays:
            continue
        business_days_to_add -= 1
    return current_date

# demo:
df["Target_date"] = date_by_adding_business_days(df["start_date"], 10, Holidays['Date_holi'])
But I get this error:
AttributeError: 'Series' object has no attribute 'weekday'
Thank you for your help.
The comments by ALollz are very valid; customizing your date during creation to only keep what is defined as business day for your problem would be optimal.
However, I assume that you cannot define the business day beforehand and that you need to solve the problem with the data frame constructed as is.
Here is one possible solution:
import pandas as pd
import numpy as np
from datetime import timedelta

# Goal is to offset a start date by N business days (weekday + not a holiday)
# Here we fake the dataset as it was not provided
num_row = 1000
df = pd.DataFrame()
df['start_date'] = pd.date_range(start='1/1/1979', periods=num_row, freq='D')
df['add_days'] = pd.Series([10]*num_row)

# Define what is a week day
week_day = [0,1,2,3,4]  # Monday to Friday
# Define what is a holiday with month and day without year (you can add more)
holidays = ['10-30','12-24']

def add_days_to_business_day(df, week_day, holidays, increment=10):
    '''
    modify the dataframe to increment only the days that are part of a weekday
    and not part of a pre-defined holiday
    >>> add_days_to_business_day(df, [0,1,2,3,4], ['10-31','12-31'])
    this will increment by 10 the days from Monday to Friday excluding Halloween and new year-eve
    '''
    # Increment everything that is in a business day
    df.loc[df['start_date'].dt.dayofweek.isin(week_day),'target_date'] = df['start_date'] + timedelta(days=increment)
    # Remove every increment done on a holiday
    df.loc[df['start_date'].dt.strftime('%m-%d').isin(holidays), 'target_date'] = np.datetime64('NaT')

add_days_to_business_day(df, week_day, holidays)
df
Note: I'm not using the 'add_days' column since it's just a repeated value. Instead, my function takes an increment parameter, which adds N days (with a default of N = 10).
Hope it helps!
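If the goal is instead to count forward N business days while skipping weekends and holidays, which is what the loop in the question attempts, the AttributeError can be avoided by working on one date at a time rather than on the whole Series. The following is a minimal standard-library sketch; the function name `add_business_days` and the sample holidays are made up for illustration.

```python
from datetime import date, timedelta

def add_business_days(start, n_days, holidays=()):
    """Return the date n_days business days after start, skipping
    Saturdays, Sundays and any date listed in holidays."""
    holidays = set(holidays)
    current = start
    remaining = n_days
    while remaining > 0:
        current += timedelta(days=1)
        # weekday(): Monday = 0 ... Saturday = 5, Sunday = 6
        if current.weekday() >= 5 or current in holidays:
            continue
        remaining -= 1
    return current

# made-up example: two Friday holidays around the end of 2021
sample_holidays = [date(2021, 12, 24), date(2021, 12, 31)]
print(add_business_days(date(2021, 12, 20), 10, sample_holidays))
```

To use it on a dataframe column, it could be called once per row (for instance through `Series.apply`), provided the holidays are converted to the same type as the values in the column.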