I have a dataset that I have converted into a NumPy array. The array contains a series of date stamps.
A sample value would be: 2014-03-01 09:00:00.
Does anyone know how to convert a NumPy datetime to the day of the week? For example, the sample value above falls on a Saturday.
I suppose this is what your array looks like. If so, here's an example of how to do this.
import numpy, datetime
a=numpy.array([datetime.datetime.now(),datetime.datetime.now()+datetime.timedelta(days=2)])
a[0].weekday()
The return value of weekday is day of the week as an integer, where Monday is 0 and Sunday is 6 according to the docs. You could also use isoweekday to get numbers from 1 to 7. So all you need now is, say, a dictionary like this: daysOfWeek={0:'Monday',1:'Tuesday'} etc. to get the names of the days of week instead of numbers.
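If the array holds numpy.datetime64 values rather than datetime.datetime objects, there is no weekday() method on the elements. A minimal sketch of one way around that, assuming a datetime64 array (the variable names and sample dates are only illustrative): count whole days since 1970-01-01, which was a Thursday, and shift so that Monday is 0.

```python
import numpy as np

# Sample data: the value from the question plus the following Monday.
dates = np.array(["2014-03-01T09:00:00", "2014-03-03T09:00:00"],
                 dtype="datetime64[s]")

# Whole days since the Unix epoch (1970-01-01, a Thursday).
days_since_epoch = dates.astype("datetime64[D]").astype(np.int64)

# Thursday is weekday 3 when Monday is 0, so shift by 3 before taking mod 7.
weekday = (days_since_epoch + 3) % 7

# Look up the names with an array instead of a dictionary.
names = np.array(["Monday", "Tuesday", "Wednesday", "Thursday",
                  "Friday", "Saturday", "Sunday"])
day_names = names[weekday]
```

The numbering matches datetime.weekday(), so the same lookup table works for both approaches.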
Suppose a variable in an xarray dataset has a time dimension with daily values over a multiyear time span, e.g.
2017-01-01 ... 2018-12-31. It is then possible to group the data by month, or by the day of the year, using
.groupby("time.month") or .groupby("time.dayofyear")
Is there a way to efficiently group the data by the day of the month, for example if I wanted to calculate the mean value on the 21st of each month?
See the xarray docs on the DateTimeAccessor helper object. For more info, you can also check out the xarray docs on Working with Time Series Data: Datetime Components, which in turn refers to the pandas docs on date/time components.
You're looking for day. Unfortunately, both pandas and xarray simply describe .dt.day as referring to "the days of the datetime", which isn't particularly helpful. But if you take a look at Python's native datetime.date.day definition, you'll see the more specific:
date.day
Between 1 and the number of days in the given month of the given year.
So, simply
da.groupby("time.day")
Should do the trick!
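To see what grouping by the day component does, here is a small sketch using pandas, whose date/time components the xarray accessor refers to. It assumes a made-up daily series over the span from the question; the variable names are illustrative only, and this is the pandas analogue rather than the xarray call itself.

```python
import numpy as np
import pandas as pd

# Hypothetical daily data over the span mentioned in the question.
time = pd.date_range("2017-01-01", "2018-12-31", freq="D")
values = pd.Series(np.arange(time.size, dtype=float), index=time)

# .day is the day of the month (1..31), so this yields one group per day number.
by_day = values.groupby(values.index.day).mean()

# Mean over every 21st of a month in the series.
mean_on_21st = by_day.loc[21]
```

Days 29 to 31 do not occur in every month, so their groups contain fewer samples than the others.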
I'm not sure, but maybe you can do something like this:
import datetime
x = datetime.datetime.now()
day = x.strftime("%d")
month = x.strftime("%m")
year = x.strftime("%Y")
.groupby(month) or .groupby(year)
To use numpy.arange to create an array of dates which increase in 1 day intervals is straightforward and can be achieved using the code
np.arange(datetime(1985,7,1), datetime(2015,7,1), relativedelta(days=1)).astype(datetime)
However, I require an array of dates which increase in 1 year intervals. To do this, I cannot use
np.arange(datetime(1985,7,1), datetime(2015,7,1), relativedelta(days=365)).astype(datetime)
since this does not account for leap years and I need the day and month of my dates to stay the same for every element.
Is there a way to achieve this using np.arange?
I wish to use numpy.arange since I am hoping to use @Mustafa Aydın's answer to my earlier question (https://stackoverflow.com/a/68032151/10346788) but with dates rather than with integers.
Specify only the year and the month in the datetime64, and set the interval to 1 year. For example, to generate every March 10 from 1985 up to (but not including) 2015:
np.arange(np.datetime64("1985-03"), np.datetime64("2015-03"), np.timedelta64(1, "Y")) + np.timedelta64(9, "D")
array(['1985-03-10', '1986-03-10', '1987-03-10', '1988-03-10',
'1989-03-10', '1990-03-10', '1991-03-10', '1992-03-10',
'1993-03-10', '1994-03-10', '1995-03-10', '1996-03-10',
'1997-03-10', '1998-03-10', '1999-03-10', '2000-03-10',
'2001-03-10', '2002-03-10', '2003-03-10', '2004-03-10',
'2005-03-10', '2006-03-10', '2007-03-10', '2008-03-10',
'2009-03-10', '2010-03-10', '2011-03-10', '2012-03-10',
'2013-03-10', '2014-03-10'], dtype='datetime64[D]')
Try:
np.array([datetime(i,7,1) for i in range(1985,2015+1)])
EDIT: or just as a normal list - in case it does not have to be a numpy array:
[datetime(i,7,1) for i in range(1985,2015+1)]
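If a datetime64 array is wanted rather than datetime objects, the same yearly series can also be built step by step. The sketch below assumes the target is 1 July of each year from 1985 through 2015, as in the question; it starts from year resolution, moves to month resolution to reach July, then converts to day resolution.

```python
import numpy as np

# One entry per year, 1985 through 2015 (the stop value is exclusive).
years = np.arange("1985", "2016", dtype="datetime64[Y]")

# Casting to month resolution gives January of each year; add 6 months for July.
july = years.astype("datetime64[M]") + np.timedelta64(6, "M")

# Casting to day resolution gives the first day of each month.
dates = july.astype("datetime64[D]")
```

Because the stepping happens in calendar years and months, leap years never shift the day or month.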
Currently my script subtracts the times in a DataFrame column called "Creation" from the current time, generating a new column with the difference in days. I get the difference in days with this code:
df['Creation'] = pandas.to_datetime(df["Creation"], dayfirst=True)
# Generates a new column with the difference in days.
df['Difference'] = pandas.to_datetime('now') - df['Creation']
What I want now is the same difference in days, but without counting Saturdays and Sundays. How can I do that?
You can make use of numpy's busday_count. Example:
import pandas as pd
import numpy as np
# some dummy data
df = pd.DataFrame({'Creation': ['2021-03-29', '2021-03-30']})
# make sure we have datetime
df['Creation'] = pd.to_datetime(df['Creation'])
# set now to a fixed date
now = pd.Timestamp('2021-04-05')
# difference in business days, excluding weekends
# need to cast to datetime64[D] dtype so that np.busday_count works
df['busday_diff'] = np.busday_count(df['Creation'].values.astype('datetime64[D]'),
np.repeat(now, df['Creation'].size).astype('datetime64[D]'))
df['busday_diff'] # no holidays are defined, so Good Friday (2021-04-02) is counted as a business day:
0 5
1 4
Name: busday_diff, dtype: int64
If you need the output to be of dtype timedelta, you can easily cast to that via
df['busday_diff'] = pd.to_timedelta(df['busday_diff'], unit='d')
df['busday_diff']
0 5 days
1 4 days
Name: busday_diff, dtype: timedelta64[ns]
Note: np.busday_count also allows you to set a custom weekmask (to exclude days other than Saturday and Sunday) or a list of holidays. See the np.busday_count docs.
Related: Calculate difference between two dates excluding weekends in python?, how to use (np.busday_count) with pandas.core.series.Series
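As a rough illustration of the weekmask and holidays options mentioned in the note, the sketch below reuses the dates from the example above. The holiday date and the four-day weekmask are assumptions chosen only for demonstration.

```python
import numpy as np

start = np.datetime64("2021-03-29")  # a Monday
end = np.datetime64("2021-04-05")    # the following Monday (not counted itself)

# Default: Monday to Friday are business days, no holidays.
default_count = np.busday_count(start, end)

# Treat Good Friday 2021 as a holiday (assumed holiday list).
holidays = np.array(["2021-04-02"], dtype="datetime64[D]")
with_holiday = np.busday_count(start, end, holidays=holidays)

# Custom weekmask: only Monday to Thursday count as business days.
four_day_week = np.busday_count(start, end, weekmask="Mon Tue Wed Thu")
```

The end date is exclusive, which is why a Monday-to-Monday span gives five business days by default.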
I have a datetime (int64) column in my pandas DataFrame.
I'm trying to convert its value of 201903250428 to a datetime value.
The values in the datetime (int64) column only go down to minute level, in 24-hour format.
I tried various methods like strptime and to_datetime, but with no luck.
pd.datetime.strptime('201903250428','%y%m%d%H%M')
I get this error when i use the above code.
ValueError: unconverted data remains: 0428
I want this value to be converted to something like '25-03-2019 04:28:00'.
Lower-case y means two-digit years only, so this is trying to parse "20" as the year, 1 as the month, 9 the day, and 03:25 as the time, leaving "0428" unconverted.
You need to use %Y which will work fine:
pd.datetime.strptime('201903250428','%Y%m%d%H%M')
http://strftime.org/ is a handy reference for time formatting/parsing parameters.
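For a whole int64 column, the same format string can be passed to pd.to_datetime after casting to string. This is a sketch with a made-up column name ("stamp") and sample values; it also avoids the pd.datetime alias, which has been removed in recent pandas versions.

```python
import pandas as pd

# Hypothetical frame with the int64 representation from the question.
df = pd.DataFrame({"stamp": [201903250428, 201903251530]})

# Cast to str so the digits are parsed by position: %Y is the four-digit year.
df["parsed"] = pd.to_datetime(df["stamp"].astype(str), format="%Y%m%d%H%M")

# Render in the day-first layout the question asks for.
df["formatted"] = df["parsed"].dt.strftime("%d-%m-%Y %H:%M:%S")
```

The "parsed" column keeps a real datetime dtype, while "formatted" is plain text intended for display only.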
Python and Matlab quite often have integer date representations as follows:
733828.0
733829.0
733832.0
733833.0
733834.0
733835.0
733836.0
733839.0
733840.0
733841.0
These numbers correspond to some dates this year. Does anyone know which function can convert them back to YYYYMMDD format?
thanks a million!
The datetime.datetime class can help you here. The following works, if those values are treated as integer days (you don't specify what they are).
>>> from datetime import datetime
>>> dt = datetime.fromordinal(733828)
>>> dt
datetime.datetime(2010, 2, 25, 0, 0)
>>> dt.strftime('%Y%m%d')
'20100225'
You show the values as floats, and the above doesn't take floats. If you can give more detail about what the data is (and where it comes from) it will be possible to give a more complete answer.
Since a Python example has already been shown, here is the Matlab one:
>> datestr(733828, 'yyyymmdd')
ans =
20090224
Also, note that while looking similar these are actually different things in Matlab and Python:
Matlab
A serial date number represents the whole and fractional number of days
from a specific date and time, where datenum('Jan-1-0000 00:00:00') returns
the number 1. (The year 0000 is merely a reference point and is not intended
to be interpreted as a real year in time.)
Python, datetime.date.fromordinal
Return the date corresponding to the proleptic Gregorian ordinal, where January 1 of year 1 has ordinal 1.
So they would differ by 366 days, which is apparently the length of the year 0.
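Putting the two definitions together, a small sketch of converting either kind of number in Python might look like this. The helper names are made up for illustration, and the float handling (treating the fractional part as a fraction of a day) is an assumption about how the values were produced.

```python
from datetime import datetime, timedelta

# Matlab datenum = Python proleptic Gregorian ordinal + 366.
MATLAB_OFFSET_DAYS = 366

def from_python_ordinal(value):
    """Convert a (possibly float) Python ordinal to a datetime."""
    whole = int(value)
    fraction = value - whole  # fractional part of a day, if any
    return datetime.fromordinal(whole) + timedelta(days=fraction)

def from_matlab_datenum(value):
    """Convert a Matlab serial date number to a datetime."""
    return from_python_ordinal(value - MATLAB_OFFSET_DAYS)

as_python = from_python_ordinal(733828.0).strftime("%Y%m%d")
as_matlab = from_matlab_datenum(733828.0).strftime("%Y%m%d")
```

Which helper applies depends on where the numbers came from; the same value gives 2010 dates under the Python convention and 2009 dates under the Matlab one.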
Dates like 733828.0 are Rata Die dates, counted from January 1, 1 A.D. (and decimal fraction of days). They may be UTC or by your timezone.
Julian Dates, used mostly by astronomers, count the days (and decimal fraction of days) since January 1, 4713 BC Greenwich noon. Julian date is frequently confused with Ordinal date, which is the date count from January 1 of the current year (Feb 2 = ordinal day 33).
So datetime is calling these things ordinal dates, but I think this only makes sense locally, in the world of python.
Is 733828.0 a timestamp? If so, you can do the following:
import datetime as dt
dt.date.fromtimestamp(733828.0).strftime('%Y%m%d')
PS
I think Peter Hansen is right :)
I am not a native English speaker. Just trying to help. I don't quite know the difference between a timestamp and an ordinal :(