Convert from numpy.timedelta64 to time interval - python

I'm having a hard time converting a numpy.timedelta64, which is directional (it can be negative), into a time span, which is not. I don't see any instructions in the docs.
The .abs function does not seem to work on timedelta values.

Use np.abs(x) or abs(x); both work fine.
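A minimal sketch of both calls, assuming numpy is imported as np; the example dates and variable names are made up for illustration:

```python
import numpy as np

# Subtracting a later date from an earlier one gives a negative timedelta64.
delta = np.datetime64('2021-01-01') - np.datetime64('2021-01-03')

span_np = np.abs(delta)   # numpy's absolute-value ufunc
span_builtin = abs(delta) # Python's built-in abs

# The same ufunc also works element-wise on an array of timedelta64 values.
deltas = np.array([-3, 5], dtype='timedelta64[h]')
spans = np.abs(deltas)

print(span_np, span_builtin, spans)
```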

Related

PySpark - converting hour and minute data to seconds

I have a given time of XXh:YYm (ex 1h:23m) that I'm trying to convert to seconds. The tricky part is that if it is less than an hour then the time will be given as just YYm (eg 52m).
I am currently using
%pyspark
newColumn = unix_timestamp(col("time"), "H:mm")
dataF.withColumn('time', regexp_replace('time', 'h|m', '')).withColumn("time", newColumn).show()
This works great for removing the h and m letters and then converting to seconds, but it returns null when the time is less than an hour, as explained above, since the value is not actually in the H:mm format. What's a good approach to this? I keep trying different things that seem to overcomplicate it, and I still haven't found a solution.
I am leaning toward some sort of conditional like
if value contains 'h:' then newColumn = unix_timestamp(col("time"), "H:mm")
else newColumn = unix_timestamp(col("time"), "mm")
but I am fairly new to pyspark and not sure how to do this to get the final output. I am basically looking for an approach that will convert a time to seconds and can handle formats of '1h:23m' as well as '53m'.
This should do the trick, assuming the time column is StringType. I used when/otherwise to separate the two formats (by checking whether the value contains 'h') and substring to extract the minutes.
from pyspark.sql import functions as F
df.withColumn("seconds", F.when(F.col("time").contains("h"), F.unix_timestamp(F.regexp_replace("time", "h|m", ''),"H:mm"))\
.otherwise(F.unix_timestamp(F.substring("time",1,2),"mm")))\
.show()
+------+-------+
|  time|seconds|
+------+-------+
|1h:23m|   4980|
|   23m|   1380|
+------+-------+
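The same two-format logic can be sketched in plain Python, which makes the arithmetic explicit. The function name to_seconds and the regular expression are assumptions made for this illustration, not part of the answer above. Such a function could be wrapped in a UDF if the built-in column functions prove awkward.

```python
import re

# Optional "<hours>h:" prefix followed by a mandatory "<minutes>m".
_TIME_PATTERN = re.compile(r'^(?:(\d+)h:)?(\d+)m$')

def to_seconds(text):
    """Convert '1h:23m' or '53m' to seconds; return None if the text does not match."""
    match = _TIME_PATTERN.match(text.strip())
    if match is None:
        return None
    hours = int(match.group(1) or 0)  # no hour part means zero hours
    minutes = int(match.group(2))
    return hours * 3600 + minutes * 60

print(to_seconds('1h:23m'))  # 4980
print(to_seconds('23m'))     # 1380
```

Because it works on the digits directly, this version also accepts single-digit minutes such as '5m'.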
You can use the "unix_timestamp" function to convert a DateTime to a Unix timestamp in seconds.
You can refer to one of my blog posts on the Spark DateTime functions and go to the "unix_timestamp" section.
https://medium.com/expedia-group-tech/deep-dive-into-apache-spark-datetime-functions-b66de737950a
Regards,
Neeraj

get time difference in seconds for time referring to another timezone

I intend to find the difference, in seconds, between two times. The issue is that I am referring to a time in a different zone. I have managed to find a solution, but it mixes pandas datetime functions with the Python datetime library. I suspect the objective can be achieved with pandas/numpy alone and in fewer lines of code. Below is my code; I'd appreciate any guidance on how I can compute final_output more efficiently.
import pandas as pd
from datetime import timedelta
local_time = pd.to_datetime('now').tz_localize('UTC').tz_convert('Asia/Dubai')
t1 = timedelta(hours=local_time.now('Asia/Dubai').hour, minutes=local_time.now('Asia/Dubai').minute)
t2 = timedelta(hours=9, minutes=14)
final_output = (t2 - t1).seconds
You may want to convert both times to UTC, then find the difference. Programmers usually like to work with UTC until the time reaches the front end.
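One way to read that advice, sketched with the standard library only. The helper name seconds_until and the fixed UTC+4 offset standing in for Asia/Dubai are assumptions made for this illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Asia/Dubai is modelled as a fixed UTC+4 offset (no daylight saving).
DUBAI = timezone(timedelta(hours=4), 'Asia/Dubai')

def seconds_until(now, hour, minute):
    """Signed seconds from `now` to hour:minute on the same local day."""
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    # Convert both instants to UTC before subtracting.
    diff = target.astimezone(timezone.utc) - now.astimezone(timezone.utc)
    # total_seconds() keeps the sign; the .seconds attribute is never negative.
    return diff.total_seconds()

now = datetime(2024, 1, 1, 8, 0, tzinfo=DUBAI)
print(seconds_until(now, 9, 14))  # 4440.0
```

With the current time, the call would be seconds_until(datetime.now(DUBAI), 9, 14). A target that has already passed gives a negative result, whereas (t2 - t1).seconds in the question would wrap around to a large positive number.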

Efficient way for python date string manipulation

I want to turn '07/18/2013' to '07/2013' and there are a lot of these strings to be processed. What would be the most efficient way to do it?
I am thinking of using
''.join(['07/18/2013'[0:3],'07/18/2013'[6:]])
Look into strftime and strptime.
Assuming you start with the string s you can put it into a datetime object using strptime then take that back out into a string with only the necessary fields using strftime. I didn't actually run this code so I don't know if it is perfect, but the idea is here.
from datetime import datetime
temp = datetime.strptime(s, "%m/%d/%Y")
final = temp.strftime("%m/%Y")
You can find info on the datetime functions here https://docs.python.org/2/library/datetime.html
Use datetime module:
import datetime
print(datetime.datetime.strptime("07/18/2013", '%m/%d/%Y').strftime('%m/%Y'))
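For a large batch of strings that are all known to be in the MM/DD/YYYY layout, plain slicing, as the question proposes, skips date parsing altogether. A small sketch follows; the function name to_month_year is made up here. Unlike strptime, it does no validation, so it assumes every input is well formed.

```python
from datetime import datetime

def to_month_year(date_string):
    """Turn 'MM/DD/YYYY' into 'MM/YYYY' by slicing; assumes a fixed-width input."""
    return date_string[:3] + date_string[6:]

dates = ['07/18/2013', '12/01/2014']
converted = [to_month_year(d) for d in dates]
print(converted)

# The parsing route gives the same result for valid dates.
parsed = [datetime.strptime(d, '%m/%d/%Y').strftime('%m/%Y') for d in dates]
print(parsed == converted)
```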

Write numpy datetime64 in ISO 8601 with timezone

How can the time zone be controlled when writing numpy datetime64 objects as an ISO 8601 string? Specifically, I would like the time zone to be "+0000", just like the input below. For this very simple example I just want it to print back the original string.
import numpy
print(numpy.datetime64('2014-03-07T17:52:00.000+0000'))
For me, it returns
2014-03-07T12:52:00.000-0500
I am using python 3.4, numpy 1.9.2, and windows.
This question is similar, but the first two answers don't actually answer the question and the third answer is specific to unix.
import numpy
import pytz
s = '2014-03-07T17:52:00.000+0000'
print(numpy.datetime64(s).item().replace(tzinfo=pytz.UTC).isoformat('T'))
Thanks to ShadowRanger for getting me going in the right direction. item() gets a naive datetime from the datetime64; replace() then sets the time zone to UTC, since I know that's what it is in this case; and isoformat('T') writes it in ISO format with the 'T' separator.
This should work:
import numpy, time, os
os.environ['TZ'] = 'GMT'
time.tzset()
print(numpy.datetime64('2014-03-07T17:52:00.000+0000'))
based on this stackoverflow answer:
https://stackoverflow.com/a/32764078/5915424
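A variant of the item() approach that does not rely on the TZ environment variable, sketched with only numpy and the standard library. The helper name to_iso_utc is hypothetical, and the sketch assumes the datetime64 value already represents UTC; the input string is written without an offset here.

```python
from datetime import timezone
import numpy as np

def to_iso_utc(value):
    """Format a datetime64 (assumed to hold UTC) as ISO 8601 with a +0000 suffix."""
    # At millisecond precision, item() returns a naive datetime.datetime.
    dt = value.astype('datetime64[ms]').item().replace(tzinfo=timezone.utc)
    millis = dt.microsecond // 1000
    # %z renders the UTC offset as +0000 (no colon).
    return dt.strftime('%Y-%m-%dT%H:%M:%S') + '.%03d' % millis + dt.strftime('%z')

print(to_iso_utc(np.datetime64('2014-03-07T17:52:00.000')))
# 2014-03-07T17:52:00.000+0000
```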

Compare if datetime.timedelta is between two values

I have a datetime.timedelta object in Python (e.g. 00:02:00). I want to check whether this time is less than 5 minutes and greater than 1 minute.
I'm not sure how to construct a timedelta object and I'm also not sure if this is the right format to compare times. Would anyone know the most efficient way to do this?
So if you start with a string that's rigorously and precisely in the format 'HH:MM:SS', timedelta doesn't directly offer a string-parsing function, but it's not hard to make one:
import datetime
def parsedelta(hhmmss):
    h, m, s = hhmmss.split(':')
    return datetime.timedelta(hours=int(h), minutes=int(m), seconds=int(s))
If you need to parse many different variants you'll be better off looking for third-party packages like dateutil.
Once you do have a timedelta instance, the check you request is easy, e.g.:
onemin = datetime.timedelta(minutes=1)
fivemin = datetime.timedelta(minutes=5)
if onemin < parsedelta('00:02:00') < fivemin:
    print('yep')
will, as expected, display yep.
