How to convert numpy datetime64 into datetime [duplicate] - python

This question already has answers here:
Converting between datetime, Timestamp and datetime64
(14 answers)
Closed 7 years ago.
I basically face the same problem posted here: Converting between datetime, Timestamp and datetime64,
but I couldn't find a satisfying answer there. My question is how to extract a datetime from the numpy.datetime64 type:
if I try:
np.datetime64('2012-06-18T02:00:05.453000000-0400').astype(datetime.datetime)
it gives me:
1339999205453000000L
My current solution is to convert the datetime64 into a string and then back into a datetime, but that seems like a rather silly method.
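For reference, the string round trip described above would look roughly like this (a sketch; the truncation to microseconds and the format string are assumptions, not the asker's exact code):
import datetime
import numpy as np

x = np.datetime64('2012-06-18T02:00:05.453000000-0400')  # numpy normalizes this to UTC
s = str(x.astype('M8[us]'))         # drop nanoseconds: '2012-06-18T06:00:05.453000'
dt = datetime.datetime.strptime(s, '%Y-%m-%dT%H:%M:%S.%f')
print(dt)                           # 2012-06-18 06:00:05.453000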

Borrowing from Converting between datetime, Timestamp and datetime64:
In [220]: x
Out[220]: numpy.datetime64('2012-06-17T23:00:05.453000000-0700')
In [221]: datetime.datetime.utcfromtimestamp(x.tolist()/1e9)
Out[221]: datetime.datetime(2012, 6, 18, 6, 0, 5, 452999)
Accounting for timezones I think that's right. Looks rather clunky though.
Using int() is more explicit (I think) than tolist():
In [294]: datetime.datetime.utcfromtimestamp(int(x)/1e9)
Out[294]: datetime.datetime(2012, 6, 18, 6, 0, 5, 452999)
or, to get the datetime in local time:
In [295]: datetime.datetime.fromtimestamp(x.astype('O')/1e9)
But in the test_datetime.py file
https://github.com/numpy/numpy/blob/master/numpy/core/tests/test_datetime.py
I find some other options - first convert the general datetime64 to a dtype that specifies the unit:
In [296]: x.astype('M8[D]').astype('O')
Out[296]: datetime.date(2012, 6, 18)
In [297]: x.astype('M8[ms]').astype('O')
Out[297]: datetime.datetime(2012, 6, 18, 6, 0, 5, 453000)
This works for arrays:
In [303]: np.array([[x,x],[x,x]],dtype='M8[ms]').astype('O')[0,1]
Out[303]: datetime.datetime(2012, 6, 18, 6, 0, 5, 453000)

Note that Timestamp IS a sub-class of datetime.datetime, so the result of [4] below will generally work wherever a datetime is expected:
In [4]: pd.Timestamp(np.datetime64('2012-06-18T02:00:05.453000000-0400'))
Out[4]: Timestamp('2012-06-18 06:00:05.453000')
In [5]: pd.Timestamp(np.datetime64('2012-06-18T02:00:05.453000000-0400')).to_pydatetime()
Out[5]: datetime.datetime(2012, 6, 18, 6, 0, 5, 453000)
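Putting those together, a minimal sketch (assuming numpy, pandas and datetime are imported as shown) where all three routes land on the same instant as a plain datetime:
import datetime
import numpy as np
import pandas as pd

x = np.datetime64('2012-06-18T02:00:05.453000000-0400')

via_epoch  = datetime.datetime.utcfromtimestamp(int(x) / 1e9)  # float division loses sub-ms precision
via_astype = x.astype('M8[ms]').astype('O')                    # unit-specific dtype, then to object
via_pandas = pd.Timestamp(x).to_pydatetime()                   # Timestamp is a datetime subclass

print(via_epoch, via_astype, via_pandas)
# 2012-06-18 06:00:05.452999 2012-06-18 06:00:05.453000 2012-06-18 06:00:05.453000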

Related

Converting integer Seconds values into datetime. Python

I'm trying to figure out the easiest way to automate the conversion of an array of seconds into datetime. I'm very familiar with converting seconds since 1970 into datetime, but the values that I have here are the seconds elapsed within a given day. For example, 14084 is the number of seconds that had passed on 2011-11-11, and I was able to generate the datetime below.
import datetime as dt
from datetime import date, time

str(dt.timedelta(seconds=14084))
Out[245]: '3:54:44'
dt.datetime.combine(date(2011, 11, 11), time(3, 54, 44))
Out[250]: datetime.datetime(2011, 11, 11, 3, 54, 44)
Is there a faster way to do this conversion for an array?
numpy has support for arrays of datetimes with a timedelta type for manipulating them:
https://numpy.org/doc/stable/reference/arrays.datetime.html
e.g. you can do this:
import numpy as np
date_array = np.arange('2005-02', '2005-03', dtype='datetime64[D]')
date_array = date_array + np.timedelta64(4, 's')  # add 4 seconds; the result promotes to datetime64[s]
If you have an array of seconds, you could convert it into an array of timedeltas and add that to a fixed datetime.
Say you have
seconds = [14084, 14085, 15003]
You can use pandas
import pandas as pd
series = pd.to_timedelta(seconds, unit='s') + pd.to_datetime('2011-11-11')
series = series.to_series().reset_index(drop=True)
print(series)
0   2011-11-11 03:54:44
1   2011-11-11 03:54:45
2   2011-11-11 04:10:03
dtype: datetime64[ns]
Or a list comprehension:
import datetime
list_comp = [datetime.datetime(2011, 11, 11) +
             datetime.timedelta(seconds=s) for s in seconds]
print(list_comp)
[datetime.datetime(2011, 11, 11, 3, 54, 44), datetime.datetime(2011, 11, 11, 3, 54, 45), datetime.datetime(2011, 11, 11, 4, 10, 3)]
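The same result can also be had with plain numpy, using the timedelta64 support mentioned above (a sketch; the base date 2011-11-11 is taken from the question):
import numpy as np

seconds = np.array([14084, 14085, 15003])
dates = np.datetime64('2011-11-11') + seconds.astype('timedelta64[s]')
print(dates)
# ['2011-11-11T03:54:44' '2011-11-11T03:54:45' '2011-11-11T04:10:03']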

Pandas Convert Particular columns to a list [duplicate]

This question already has answers here:
Pandas DataFrame column to list [duplicate]
(4 answers)
How do I convert a Pandas series or index to a NumPy array? [duplicate]
(8 answers)
Closed 3 years ago.
I have a pandas DataFrame with multiple columns.
LogBlk  Page  BayFail
0       0     [0, 1, 8, 9]
1       16    [0, 1, 4, 5, 6, 8, 9, 12, 13, 14]
2       32    [0, 1, 4, 5, 6, 8, 9, 12, 13, 14]
3       48    [0, 1, 4, 5, 6, 8, 9, 12, 13, 14]
I want to find the BayFail values that are associated with LogBlk=0 and Page=0.
df2 = df[ (df['Page'] == 0) & (df['LogBlk'] == 0) ]['BayFail']
This will return [0,1,8,9]
What I want to do is convert this pandas.Series into a list. Does anyone know how to do that?
pandas.Series has a tolist method:
In [10]: import pandas as pd
In [11]: s = pd.Series([0,1,8,9], name = 'BayFail')
In [12]: s.tolist()
Out[12]: [0L, 1L, 8L, 9L]
Technical note: In my original answer I said that Series was a subclass of numpy.ndarray and inherited its tolist method. While that's true for pandas version 0.12 or older, in the soon-to-be-released pandas version 0.13, Series has been refactored to be a subclass of NDFrame. Series still has a tolist method, but it has no direct relationship to the numpy.ndarray method of the same name.
You can also convert them to numpy arrays:
In [124]: import numpy as np
In [125]: s = pd.Series([0, 1, 8, 9], name='BayFail')
In [126]: a = np.array(s)
In [127]: a
Out[127]: array([0, 1, 8, 9], dtype=int64)
In [128]: a[0]
Out[128]: 0
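Applied to the DataFrame from the question (rebuilt here, so the exact values are an assumption), note that the filter returns a Series whose single element is the list itself:
import pandas as pd

df = pd.DataFrame({
    'LogBlk': [0, 1, 2, 3],
    'Page': [0, 16, 32, 48],
    'BayFail': [[0, 1, 8, 9],
                [0, 1, 4, 5, 6, 8, 9, 12, 13, 14],
                [0, 1, 4, 5, 6, 8, 9, 12, 13, 14],
                [0, 1, 4, 5, 6, 8, 9, 12, 13, 14]],
})

sel = df[(df['LogBlk'] == 0) & (df['Page'] == 0)]['BayFail']
print(sel.iloc[0])     # the stored list: [0, 1, 8, 9]
print(sel.tolist())    # a list of lists: [[0, 1, 8, 9]]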

find repeating dates between two datetime arrays python

I have two datetime arrays, and I am trying to output an array with only those dates that appear in both arrays. I feel like this is something I should be able to answer myself, but I have spent a lot of time searching and I do not understand how to solve this.
>>> datetime1[0:4]
array([datetime.datetime(2014, 6, 19, 4, 0),
datetime.datetime(2014, 6, 19, 5, 0),
datetime.datetime(2014, 6, 19, 6, 0),
datetime.datetime(2014, 6, 19, 7, 0)], dtype=object)
>>> datetime2[0:4]
array([datetime.datetime(2014, 6, 19, 3, 0),
datetime.datetime(2014, 6, 19, 4, 0),
datetime.datetime(2014, 6, 19, 5, 0),
datetime.datetime(2014, 6, 19, 6, 0)], dtype=object)
I've tried this below, but I still do not understand why it does not work:
>>> np.where(datetime1==datetime2)
(array([], dtype=int64),)
This:
datetime1 == datetime2
is an element-wise comparison. It compares element [0] with element [0], then [1] with [1], and gives you a boolean array.
Instead, try:
np.in1d(datetime1, datetime2)
This gives you a boolean array the same size as datetime1, set to True for those elements which exist in datetime2.
If your goal is only to get the values rather than the indexes, use this:
np.intersect1d(datetime1, datetime2)
https://docs.scipy.org/doc/numpy/reference/generated/numpy.intersect1d.html
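A small runnable demonstration of both calls, using arrays like the ones in the question (a sketch):
import datetime
import numpy as np

datetime1 = np.array([datetime.datetime(2014, 6, 19, h, 0) for h in (4, 5, 6, 7)])
datetime2 = np.array([datetime.datetime(2014, 6, 19, h, 0) for h in (3, 4, 5, 6)])

mask = np.in1d(datetime1, datetime2)          # True where a datetime1 element exists in datetime2
print(datetime1[mask])                        # the matching elements, in datetime1 order
print(np.intersect1d(datetime1, datetime2))   # the common values, sorted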
I would say just iterate over the values of datetime1 and check each one for containment in datetime2. So for example:
for date in datetime1:
    if date in datetime2:
        print(date)
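If the arrays are large, `date in datetime2` is a linear scan each time; building a set first is usually much faster (a sketch):
seen = set(datetime2)                          # datetimes hash cheaply
common = [date for date in datetime1 if date in seen]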

Difficulty getting local time

I am struggling to understand how to get a time series in local time (with DST) from an Excel file into a pandas time series in UTC. I've tried various combinations of .localize(), .replace(), .astimezone(), and .to_datetime(), all without luck, and I feel like I am getting inconsistent results in situations that should (at least in my mind) return the same thing. The example that gets to the heart of the dilemma follows. I would be grateful if someone could explain why these two sections of code produce different answers. The first section accomplishes what I would like; it seems like the second should do the same, but it doesn't.
# assumes: from pytz import timezone; import datetime
In[1]: UTCtz=timezone("UTC")
In[2]: localtz=timezone("US/Central")
In[3]: dt1=datetime.datetime(2016,3,13,1,0,0,0)
In[4]: dt2=datetime.datetime(2016,3,13,3,0,0,0)
In[5]: dt3=localtz.localize(dt1)
In[6]: dt3
Out[6]: datetime.datetime(2016, 3, 13, 1, 0, tzinfo=<DstTzInfo 'US/Central' CST-1 day, 18:00:00 STD>)
In[7]: dt4=localtz.localize(dt2)
In[8]: dt4
Out[8]: datetime.datetime(2016, 3, 13, 3, 0, tzinfo=<DstTzInfo 'US/Central' CDT-1 day, 19:00:00 DST>)
In[9]: dt3.astimezone(UTCtz)
Out[9]: datetime.datetime(2016, 3, 13, 7, 0, tzinfo=<UTC>)
In[10]: dt4.astimezone(UTCtz)
Out[10]: datetime.datetime(2016, 3, 13, 8, 0, tzinfo=<UTC>)
dt1 and dt2 straddle the DST switch and the conversion to UTC is what I was expecting.
Alternatively, here is what I get from the series that I've read from Excel into a DataFrame (column "timeindex").
In[1]: df.iloc[2]["timeindex"]
Out[1]: Timestamp('2016-03-13 01:00:00')
In[2]: df.iloc[3]["timeindex"]
Out[2]: Timestamp('2016-03-13 03:00:00')
In[3]: localtz.localize(df.iloc[2]["timeindex"])
Out[3]: Timestamp('2016-03-13 01:00:00-0600', tz='US/Central')
In[4]: localtz.localize(df.iloc[3]["timeindex"])
Out[4]: Timestamp('2016-03-13 04:00:00-0500', tz='US/Central')
The same function calls as before, but the UTC gap between the times increases by an hour!
What is going on here?
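For reference, the vectorized pandas equivalent of the per-value conversion in the first section would be roughly the following sketch (assuming a tz-naive datetime64 column; it reproduces the expected 07:00/08:00 UTC values):
import pandas as pd

s = pd.Series(pd.to_datetime(['2016-03-13 01:00:00', '2016-03-13 03:00:00']))
utc = s.dt.tz_localize('US/Central').dt.tz_convert('UTC')
print(utc)
# 0   2016-03-13 07:00:00+00:00
# 1   2016-03-13 08:00:00+00:00
# dtype: datetime64[ns, UTC]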

How to calculate timedelta between datetimes with different timezones in Python

How do you get a valid timedelta instance when differencing datetimes with different timezones in Python? I'm finding the timedelta is always 0 if the timezones are different.
>>> from dateutil.parser import parse
>>> dt0=parse('2017-02-06 18:14:32-05:00')
>>> dt0
datetime.datetime(2017, 2, 6, 18, 14, 32, tzinfo=tzoffset(None, -18000))
>>> dt1=parse('2017-02-06 23:14:32+00:00')
>>> dt1
datetime.datetime(2017, 2, 6, 23, 02, 12, tzinfo=tzutc())
>>> (dt1-dt0).total_seconds()
0.0
This doesn't make any sense to me. I would have thought that Python's datetime class would be smart enough to normalize both values to UTC internally, and then return a timedelta based on those values. Or throw an exception. Instead it returns 0, implying both datetimes are equal, which clearly they're not. What am I doing wrong here?
You are confused about what the timezone means; the two times you gave are identical, so of course their difference is zero. I can duplicate your results, except that I don't have the discrepancy between the second string and second datetime that you have:
>>> from dateutil.parser import parse
>>> dt0=parse('2017-02-06 18:14:32-05:00')
>>> dt0
datetime.datetime(2017, 2, 6, 18, 14, 32, tzinfo=tzoffset(None, -18000))
>>> dt1=parse('2017-02-06 23:14:32+00:00')
>>> dt1
datetime.datetime(2017, 2, 6, 23, 14, 32, tzinfo=tzutc())
>>> (dt1-dt0).total_seconds()
0.0
But watch what happens when I convert dt0 to UTC. The time gets adjusted by the 5-hour timezone difference, and it becomes identical to the second one.
>>> dt0.astimezone(dt1.tzinfo)
datetime.datetime(2017, 2, 6, 23, 14, 32, tzinfo=tzutc())
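To actually see a non-zero timedelta, the two datetimes must name different instants rather than the same instant written in two zones, for example (a sketch):
from dateutil.parser import parse

a = parse('2017-02-06 18:14:32-05:00')   # 23:14:32 UTC
b = parse('2017-02-06 18:14:32+00:00')   # same wall-clock time, but in UTC
print((b - a).total_seconds())           # -18000.0, i.e. b is 5 hours earlier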
