I have to convert a MATLAB datenum to a Python datetime (e.g. 2010-11-04 00:03:50.209589).
The datenum is represented in milliseconds, and the date must fall between 2010-11-04 00:00:00 and 2011-06-11 00:00:00.
My code is as follows:
import datetime
matlab_datenum = 6.365057116950260162e+10
python_datetime = datetime.datetime.fromtimestamp(matlab_datenum / 1e3)
print(python_datetime)
The result is: 1972-01-07 16:42:51.169503
This is wrong, because the date must fall between 2010-11-04 and 2011-06-11.
Do you have any idea how to correct the result?
Thank you for your help.
The datenum page in the Matlab documentation states:
The datenum function creates a numeric array that represents each point in time as the number of days from January 0, 0000.
Python's datetime module page states the following for fromtimestamp:
Return the local date corresponding to the POSIX timestamp
where a POSIX timestamp counts seconds from 00:00:00 UTC on 1 January 1970.
The two functions are counting from different start points and using different units (days and seconds), hence the discrepancy between your two dates.
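The epoch difference can be bridged with a fixed offset. The following is a minimal sketch, assuming the input is a genuine datenum measured in days (the question states its value is in milliseconds, so it would need rescaling to days first). The function name datenum_to_utc and the sample value 734446.0 are illustrative only; 719529 is the datenum of 1970-01-01.

```python
from datetime import datetime, timedelta, timezone

# datenum('1970-01-01') in MATLAB, i.e. the Unix epoch expressed as a datenum
UNIX_EPOCH_AS_DATENUM = 719529
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def datenum_to_utc(datenum_days):
    """Convert a MATLAB datenum (in days) to a timezone-aware UTC datetime."""
    return UNIX_EPOCH + timedelta(days=datenum_days - UNIX_EPOCH_AS_DATENUM)

print(datenum_to_utc(734446.0))  # 2010-11-04 00:00:00+00:00
```

Working in UTC avoids the local-time shift that fromtimestamp applies by default.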
I have an array of numbers (e.g. 279.341, 279.345, 279.348) which relate to a date and time in 2017 (it's supposed to be October 6th, 2017). To be able to compare this data to another dataset, I need to convert that array into an array of UNIX timestamps.
I have successfully done something similar in MATLAB (code below) but don't know how to translate this to Python.
MATLAB:
adcpTimeStr = datestr(adcp.adcp_day_num,'2017 mmm dd HH:MM:SS');
adcpTimeRaw = datetime(adcpTimeStr,'InputFormat','yyyy MMM dd HH:mm:ss');
adcpTimenumRaw = datenum(adcpTimeRaw)';
What would be a good way of converting the array into UNIX timestamps?
Assuming these numbers are fractional days of the year (UTC) and the year is 2017, in Python you would do:
from datetime import datetime, timedelta, timezone
year = datetime(2017,1,1, tzinfo=timezone.utc) # the starting point
doy = [279.341, 279.345, 279.348]
# add days to starting point as timedelta and call timestamp() method:
unix_t = [(year+timedelta(d)).timestamp() for d in doy]
# [1507363862.4, 1507364208.0, 1507364467.2]
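Note that the code above treats each number as days elapsed since 2017-01-01 00:00 UTC, so 279.341 lands on October 7th. If the day numbers are instead 1-based (day 1 is January 1st), which would match the October 6th the question expects, one day has to be subtracted. This is a sketch under that assumption; the variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

start_of_year = datetime(2017, 1, 1, tzinfo=timezone.utc)
doy = [279.341, 279.345, 279.348]

# Assumption: day numbers are 1-based, so day 1.0 is 2017-01-01 00:00 UTC
dates_one_based = [start_of_year + timedelta(days=d - 1) for d in doy]
unix_one_based = [dt.timestamp() for dt in dates_one_based]

print(dates_one_based[0].date())  # 2017-10-06
```

Which convention applies depends on how the instrument or export tool numbers its days, so it is worth checking against a known sample.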
I just started moving from Matlab to Python 2.7 and I have some trouble reading my .mat-files. Time information is stored in Matlab's datenum format. For those who are not familiar with it:
A serial date number represents a calendar date as the number of days that has passed since a fixed base date. In MATLAB, serial date number 1 is January 1, 0000.
MATLAB also uses serial time to represent fractions of days beginning at midnight; for example, 6 p.m. equals 0.75 serial days. So the string '31-Oct-2003, 6:00 PM' in MATLAB is date number 731885.75.
(taken from the Matlab documentation)
I would like to convert this to Python's time format, and I found this tutorial. In short, the author states that
If you parse this using python's datetime.fromordinal(731965.04835648148) then the result might look reasonable [...]
(before any further conversions), which doesn't work for me, since datetime.fromordinal expects an integer:
>>> datetime.fromordinal(731965.04835648148)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: integer argument expected, got float
While I could just round them down for daily data, I actually need to import minutely time series. Does anyone have a solution for this problem? I would like to avoid reformatting my .mat files since there's a lot of them and my colleagues need to work with them as well.
If it helps, someone else asked for the other way round. Sadly, I'm too new to Python to really understand what is happening there.
/edit (2012-11-01): This has been fixed in the tutorial posted above.
The solution you link to has a small issue. The corrected version is:
from datetime import datetime, timedelta
python_datetime = datetime.fromordinal(int(matlab_datenum)) + timedelta(days=matlab_datenum%1) - timedelta(days = 366)
A longer explanation can be found here.
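As a quick check, the one-liner can be wrapped in a function and run on the example from the MATLAB documentation quoted in the question ('31-Oct-2003, 6:00 PM' is 731885.75). This is only a sketch; the function name matlab_to_datetime is made up for illustration.

```python
from datetime import datetime, timedelta

def matlab_to_datetime(matlab_datenum):
    """Convert a MATLAB datenum (days, possibly fractional) to a naive datetime."""
    whole_days = int(matlab_datenum)   # calendar day
    day_fraction = matlab_datenum % 1  # time of day as a fraction of 24 h
    return (datetime.fromordinal(whole_days)
            + timedelta(days=day_fraction)
            - timedelta(days=366))     # MATLAB counts one extra (leap) year

print(matlab_to_datetime(731885.75))  # 2003-10-31 18:00:00
```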
Using pandas, you can convert a whole array of datenum values with fractional parts:
import numpy as np
import pandas as pd
datenums = np.array([737125, 737124.8, 737124.6, 737124.4, 737124.2, 737124])
timestamps = pd.to_datetime(datenums-719529, unit='D')
The value 719529 is the datenum value of the Unix epoch start (1970-01-01), which is the default origin for pd.to_datetime().
I used the following Matlab code to set this up:
datenum('1970-01-01') % gives 719529
datenums = datenum('06-Mar-2018') - linspace(0,1,6) % test data
datestr(datenums) % human readable format
Just in case it's useful to others, here is a full example of loading time series data from a Matlab mat file, converting a vector of Matlab datenums to a list of datetime objects using carlosdc's answer (defined as a function), and then plotting as time series with Pandas:
from scipy.io import loadmat
import pandas as pd
import datetime as dt
import urllib
# In Matlab, I created this sample 20-day time series:
# t = datenum(2013,8,15,17,11,31) + [0:0.1:20];
# x = sin(t)
# y = cos(t)
# plot(t,x)
# datetick
# save sine.mat
# Python 2 call; on Python 3 this function lives at urllib.request.urlretrieve
urllib.urlretrieve('http://geoport.whoi.edu/data/sine.mat','sine.mat')
# If you don't use squeeze_me=True, then Pandas doesn't like
# the arrays in the dictionary, because they look like arrays
# of 1-element arrays. squeeze_me=True fixes that.
mat_dict = loadmat('sine.mat',squeeze_me=True)
# make a new dictionary with just dependent variables we want
# (we handle the time variable separately, below)
my_dict = { k: mat_dict[k] for k in ['x','y']}
def matlab2datetime(matlab_datenum):
    day = dt.datetime.fromordinal(int(matlab_datenum))
    dayfrac = dt.timedelta(days=matlab_datenum%1) - dt.timedelta(days = 366)
    return day + dayfrac
# convert Matlab variable "t" into list of python datetime objects
my_dict['date_time'] = [matlab2datetime(tval) for tval in mat_dict['t']]
# build the DataFrame and index it by time
df = pd.DataFrame(my_dict)
df = df.set_index('date_time')
# printing df gives:
# <class 'pandas.core.frame.DataFrame'>
# DatetimeIndex: 201 entries, 2013-08-15 17:11:30.999997 to 2013-09-04 17:11:30.999997
# Data columns (total 2 columns):
# x    201 non-null values
# y    201 non-null values
# dtypes: float64(2)
# plot with Pandas
df.plot()
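The trailing .999997 in the index above is floating-point residue: a datenum near 735000 only resolves time to roughly ten microseconds. If the data is known to sit on a coarser grid, the time of day can be rounded. This is a sketch for Python 3, and the function name and the default one-millisecond resolution are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

def matlab2datetime_rounded(matlab_datenum, resolution=timedelta(milliseconds=1)):
    """Convert a MATLAB datenum, rounding the time of day to `resolution`."""
    whole_days = int(matlab_datenum)
    time_of_day = timedelta(days=matlab_datenum - whole_days)
    # number of whole `resolution` steps closest to the time of day
    steps = round(time_of_day / resolution)
    return datetime.fromordinal(whole_days - 366) + steps * resolution

# same instant as datenum(2013,8,15,17,11,31) in the MATLAB setup above
t0 = 735461 + (17 * 3600 + 11 * 60 + 31) / 86400
print(matlab2datetime_rounded(t0))
```

Rounding is only appropriate when the true sampling times are known to fall on the chosen grid.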
Here's a way to convert these using numpy.datetime64, rather than datetime.
import numpy as np
serdate = 737125  # example MATLAB serial date (whole days)
origin = np.datetime64('0000-01-01', 'D') - np.timedelta64(1, 'D')
date = serdate * np.timedelta64(1, 'D') + origin
This works when serdate is either a single integer or an integer array.
Just building on and adding to previous comments. The key is in the day counting as carried out by the method toordinal and constructor fromordinal in the class datetime and related subclasses. For example, from the Python Library Reference for 2.7, one reads that fromordinal
Return the date corresponding to the proleptic Gregorian ordinal, where January 1 of year 1 has ordinal 1. ValueError is raised unless 1 <= ordinal <= date.max.toordinal().
However, MATLAB starts counting one year earlier, in year 0, which is a leap year, so an extra 366 days need to be taken into account. (Year 0 is a leap year in the proleptic Gregorian calendar, just like 2016, which comes exactly 504 four-year cycles later.)
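The offset can be checked directly with toordinal. This is a small sketch; the helper name date_to_datenum is illustrative, and the expected values are datenums quoted elsewhere in this thread (719529 for 1970-01-01, 731885 for 31-Oct-2003).

```python
from datetime import date

# Python: 1 January of year 1 has ordinal 1
assert date(1, 1, 1).toordinal() == 1

MATLAB_OFFSET_DAYS = 366  # the extra leap year 0 that MATLAB counts

def date_to_datenum(d):
    """Whole-day MATLAB datenum for a datetime.date."""
    return d.toordinal() + MATLAB_OFFSET_DAYS

print(date_to_datenum(date(1970, 1, 1)))    # 719529
print(date_to_datenum(date(2003, 10, 31)))  # 731885
```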
These are two functions that I have been using for similar purposes:
import datetime
def datetime_pytom(d,t):
    '''
    Input
        d   Date as an instance of type datetime.date
        t   Time as an instance of type datetime.time
    Output
        The fractional day count since 0-Jan-0000 (proleptic ISO calendar)
        This is the 'datenum' datatype in matlab
    Notes on day counting
        matlab: day one is 1 Jan 0000
        python: day one is 1 Jan 0001
        hence an increase of 366 days, for year 0 AD was a leap year
    '''
    dd = d.toordinal() + 366
    tt = datetime.timedelta(hours=t.hour,minutes=t.minute,
                            seconds=t.second)
    tt = tt.total_seconds() / 86400
    return dd + tt
def datetime_mtopy(datenum):
    '''
    Input
        The fractional day count according to datenum datatype in matlab
    Output
        The date and time as an instance of type datetime in python
    Notes on day counting
        matlab: day one is 1 Jan 0000
        python: day one is 1 Jan 0001
        hence a reduction of 366 days, for year 0 AD was a leap year
    '''
    ii = datetime.datetime.fromordinal(int(datenum) - 366)
    ff = datetime.timedelta(days=datenum%1)
    return ii + ff
Hope this helps and happy to be corrected.
I have seen that Excel identifies dates with specific serial numbers. For example:
09/07/2018 = 43290
10/07/2018 = 43291
I know that we use the DATEVALUE, VALUE and TEXT functions to convert between these types.
But what is the logic behind this conversion? Why 43290 for 09/07/2018?
Also, if I have a list of these dates in the number format in a dataframe (Python), how can I convert these numbers to the date format?
Similarly with time, I see decimal values in place of a regular time format. What is the logic behind these time conversions?
The following question that has been given in the comments is informative, but does not answer my question of the logic behind the conversion between Date and Text format :
convert numerical representation of date (excel format) to python date and time, then split them into two seperate dataframe columns in pandas
It is simply the number of days (or fraction of days, if talking about date and time) since January 1st 1900:
The DATEVALUE function converts a date that is stored as text to a
serial number that Excel recognizes as a date. For example, the
formula =DATEVALUE("1/1/2008") returns 39448, the serial number of the
date 1/1/2008. Remember, though, that your computer's system date
setting may cause the results of a DATEVALUE function to vary from
this example
...
Excel stores dates as sequential serial numbers so that they can be used in calculations. By default, January 1, 1900 is serial number 1, and January 1, 2008 is serial number 39448 because it is 39,447 days after January 1, 1900.
from DATEVALUE docs
if I have a list of these dates in the number format in a dataframe
(Python), how can I convert this number to the date format?
Since we know this number represents the number of days since 1/1/1900 it can be easily converted to a date:
from datetime import datetime, timedelta
day_number = 43290
print(datetime(1900, 1, 1) + timedelta(days=day_number - 2))
# 2018-07-09 00:00:00
# Subtracting 2: one day because 1/1/1900 is "day 1", not "day 0",
# and one more because Excel counts a nonexistent 29 February 1900.
However pd.read_excel should be able to handle this automatically.
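The same arithmetic also covers the decimal time values asked about, since the fractional part of the serial is the fraction of a day. This is a minimal sketch, assuming Excel's default 1900 date system and serials from 61 onward (that is, dates from 1 March 1900); the function name is illustrative.

```python
from datetime import datetime, timedelta

# 1900-01-01 shifted back by 2 days: serial 1 is "day 1", and Excel also
# counts a 29 February 1900 that never existed
EXCEL_ORIGIN = datetime(1899, 12, 30)

def excel_serial_to_datetime(serial):
    """Convert an Excel serial (days, possibly fractional) to a datetime."""
    return EXCEL_ORIGIN + timedelta(days=serial)

print(excel_serial_to_datetime(43290))     # 2018-07-09 00:00:00
print(excel_serial_to_datetime(43290.75))  # 2018-07-09 18:00:00
```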
I have a dataset that I have converted into a NumPy dataset. The dataset contains a series of date stamps.
A sample value would be: 2014-03-01 09:00:00.
Does anyone know how to convert a NumPy datetime to the day of the week? In this case, for example, it is a Saturday.
I suppose this is what your array looks like. If so, here's an example of how to do this.
import numpy, datetime
a=numpy.array([datetime.datetime.now(),datetime.datetime.now()+datetime.timedelta(days=2)])
a[0].weekday()
The return value of weekday is day of the week as an integer, where Monday is 0 and Sunday is 6 according to the docs. You could also use isoweekday to get numbers from 1 to 7. So all you need now is, say, a dictionary like this: daysOfWeek={0:'Monday',1:'Tuesday'} etc. to get the names of the days of week instead of numbers.
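The lookup described above could be sketched as follows, using a list instead of a dictionary since the keys are just 0 to 6. The names DAY_NAMES and weekday_name are illustrative, and the sample date is the one from the question.

```python
from datetime import datetime

# index matches datetime.weekday(): Monday is 0, Sunday is 6
DAY_NAMES = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
             'Friday', 'Saturday', 'Sunday']

def weekday_name(dt):
    """Return the English weekday name for a datetime."""
    return DAY_NAMES[dt.weekday()]

print(weekday_name(datetime(2014, 3, 1, 9, 0)))  # Saturday
```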
Python and Matlab quite often have integer date representations as follows:
733828.0
733829.0
733832.0
733833.0
733834.0
733835.0
733836.0
733839.0
733840.0
733841.0
These numbers correspond to some dates this year. Does anyone know which function can convert them back to YYYYMMDD format?
Thanks a million!
The datetime.datetime class can help you here. The following works, if those values are treated as integer days (you don't specify what they are).
>>> from datetime import datetime
>>> dt = datetime.fromordinal(733828)
>>> dt
datetime.datetime(2010, 2, 25, 0, 0)
>>> dt.strftime('%Y%m%d')
'20100225'
You show the values as floats, and the above doesn't take floats. If you can give more detail about what the data is (and where it comes from) it will be possible to give a more complete answer.
Since a Python example has already been shown, here is the MATLAB one:
>> datestr(733828, 'yyyymmdd')
ans =
20090224
Also, note that while looking similar these are actually different things in Matlab and Python:
Matlab
A serial date number represents the whole and fractional number of days
from a specific date and time, where datenum('Jan-1-0000 00:00:00') returns
the number 1. (The year 0000 is merely a reference point and is not intended
to be interpreted as a real year in time.)
Python, datetime.date.fromordinal
Return the date corresponding to the proleptic Gregorian ordinal, where January 1 of year 1 has ordinal 1.
So they would differ by 366 days, which is apparently the length of the year 0.
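To make the 366-day gap concrete, the same number can be read under both conventions in Python. This is a sketch only; the variable names are illustrative, and the two results match the Python and MATLAB outputs shown above.

```python
from datetime import datetime, timedelta

serial = 733828

# read as a Python proleptic Gregorian ordinal
as_python_ordinal = datetime.fromordinal(serial)
# read as a MATLAB datenum: shift back by the 366 days of year 0
as_matlab_datenum = datetime.fromordinal(serial) - timedelta(days=366)

print(as_python_ordinal.strftime('%Y%m%d'))  # 20100225
print(as_matlab_datenum.strftime('%Y%m%d'))  # 20090224
```

Which reading is right depends on where the numbers were produced, so the source of the data decides the offset.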
Dates like 733828.0 are Rata Die dates, counted from January 1, 1 A.D. (and decimal fraction of days). They may be UTC or by your timezone.
Julian Dates, used mostly by astronomers, count the days (and decimal fraction of days) since January 1, 4713 BC Greenwich noon. Julian date is frequently confused with Ordinal date, which is the date count from January 1 of the current year (Feb 2 = ordinal day 33).
So datetime is calling these things ordinal dates, but I think this only makes sense locally, in the world of python.
Is 733828.0 a timestamp? If so, you can do the following:
import datetime as dt
dt.date.fromtimestamp(733828.0).strftime('%Y%m%d')
PS
I think Peter Hansen is right :)
I am not a native English speaker. Just trying to help. I don't quite know the difference between a timestamp and an ordinal :(