Converting from local time to UTC time in Python Pandas dataframe? - python

How would I efficiently convert local times in a dataframe to UTC times? There are 3 columns with information: the date (string), the timezone code (string), and the hour of the day (integer).
date               timezone  hour
7/31/2010 0:00:00  EST       1
6/14/2010 0:00:00  PST       3
6/14/2010 0:00:00  PST       4
5/30/2010 0:00:00  EDT       23
5/30/2010 0:00:00  EDT       24
After the data is converted I will be aggregating it to monthly data.

Gday.
Working with dates is described reasonably well in this answer here: converting utc to est time in python
In that case they have the timezone offsets as numbers, e.g. +11:00. You have the US short codes, so you could convert that column to the numerical equivalent first and then use that function.
Personally I find the notation "Australia/Melbourne" way easier to deal with, especially because it handles daylight saving time for you. Timezones are a nightmare. That's described here: Python: datetime tzinfo time zone names documentation
For the hour column, you can use a string function to join the date and the hour into a single date-and-time string.
So I'd suggest you convert the timezone column to that format (i.e. EST as America/New_York, etc.), then feed all three columns into a datetime conversion as in the first answer.
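A minimal sketch of the numeric-offset route, assuming each abbreviation stands for a fixed UTC offset and that the hour column counts hours after midnight (so hour 24 rolls over to the next day). The `UTC_OFFSET_HOURS` table and the `utc` column name are illustrative assumptions, not something from the question.

```python
import pandas as pd

# Assumed mapping from US abbreviations to fixed UTC offsets, in hours.
UTC_OFFSET_HOURS = {"EST": -5, "EDT": -4, "PST": -8, "PDT": -7}

df = pd.DataFrame({
    "date": ["7/31/2010 0:00:00", "6/14/2010 0:00:00", "6/14/2010 0:00:00",
             "5/30/2010 0:00:00", "5/30/2010 0:00:00"],
    "timezone": ["EST", "PST", "PST", "EDT", "EDT"],
    "hour": [1, 3, 4, 23, 24],
})

# Local wall-clock time: parsed date plus the hour column as a timedelta.
local = (pd.to_datetime(df["date"], format="%m/%d/%Y %H:%M:%S")
         + pd.to_timedelta(df["hour"], unit="h"))

# UTC = local time minus the zone's offset from UTC.
offset = pd.to_timedelta(df["timezone"].map(UTC_OFFSET_HOURS), unit="h")
df["utc"] = (local - offset).dt.tz_localize("UTC")

# Row counts per UTC month, as a stand-in for the monthly aggregation.
monthly = df.groupby(df["utc"].dt.strftime("%Y-%m")).size()
print(df[["timezone", "hour", "utc"]])
print(monthly)
```

Everything here is vectorized, so it should stay fast on large frames. Abbreviations missing from the mapping produce NaT and can be checked with `df["utc"].isna()`.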

Related

Set column with different time zones as index

I have a DataFrame with time values from different timezones. See here:
The start of the data is the usual time and the second half is daylight saving time. As you can see I want to convert it to a datetime column but because of the different time zones it doesn't work. My goal is to set this column as index. How can I do that?
Timezone-aware inputs with mixed time offsets can be problematic in pandas. However, pandas.to_datetime has a parameter setting that converts such inputs to UTC and returns a DatetimeIndex, which may be acceptable for your case.
Excerpt from the docs:
... timezone-aware inputs with mixed time offsets (for example issued from a timezone with daylight savings, such as Europe/Paris)
are not successfully converted to a DatetimeIndex. Instead a simple
Index containing datetime.datetime objects is returned:
...
Setting utc=True solves most of the ... issues:
...
Timezone-aware inputs are converted to UTC (the output represents the exact same datetime, but viewed from the UTC time offset +00:00)
[and a DatetimeIndex is returned].
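A small sketch of the utc=True setting, assuming the column holds ISO-style strings with explicit offsets. The sample strings, the `value` column and the Europe/Paris zone are made up for illustration.

```python
import pandas as pd

# Two timestamps on either side of a DST change, so their offsets differ.
stamps = ["2021-03-27 12:00:00+01:00", "2021-03-28 12:00:00+02:00"]

# utc=True converts every value to UTC and yields a DatetimeIndex.
idx = pd.to_datetime(stamps, utc=True)
df = pd.DataFrame({"value": [1, 2]}, index=idx)

# If local wall-clock time is wanted again, convert the index to one named zone.
local = df.tz_convert("Europe/Paris")
print(df.index)
print(local.index)
```

Converting back with a single named zone keeps the index a DatetimeIndex, because the zone carries the DST rules instead of each value carrying its own offset.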

Can't read date variables in Python from SPSS (.sav) files

I'm working with a .sav (SPSS) file in Python. All the variables look fine after import while using PyreadStat (also when using Pandas) except for the datetime variables. They read in as exponential numbers of type float using Python. But their original SPSS format is dd-mmm-yy (e.g., 02-feb-2021) of type date.
This is what the date variable looks like:
1.383160e+10
Is there a way to convert this format to datetime using Python?
I've tried various ways of using the datetime module and time module. But what I get is a date from the year 2408
# Here I'm using the float from the first row in the dataframe
time.gmtime(13831603200)
The results
time.struct_time(tm_year=2408, tm_mon=4, tm_mday=22, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=1, tm_yday=113, tm_isdst=0)
When I use the datetime module:
python_date = datetime.fromtimestamp(13831603200).strftime('%d-%b-%Y, %H:%M:%S')
print(python_date)
22-Apr-2408, 00:00:00
[How the datetime variable (Vdatesub) is showing when using Python][1]
[1]: https://i.stack.imgur.com/I7yza.png
This is answered under these two posts (one Python, one R):
Convert 'seconds since October 14, 1582' to Python datetime
Read SPSS file into R, the data format for date is wrong, and generate more variable
In short: the date is stored as number of seconds from 14 Oct 1582, while Python starts at the Epoch date (01 Jan 1970).
You would need to calculate the number of seconds between 1582-10-14 and 1970-01-01 to adjust the timestamp value as per this post:
Timestamp out of range for platform localtime()/gmtime() function
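(That difference is 12,219,379,200 seconds, i.e. 141,428 days in the proleptic Gregorian calendar that Python uses. Subtracting it from 13831603200 gives 1612224000, which is 02 Feb 2021 00:00:00 UTC.)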
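A minimal sketch of that adjustment using only the standard library. Instead of hard-coding the number of seconds, it adds the SPSS value to the 1582-10-14 origin directly; the function name `spss_to_datetime` is hypothetical.

```python
from datetime import datetime, timedelta

# SPSS stores dates as seconds since 14 Oct 1582.
SPSS_EPOCH = datetime(1582, 10, 14)

def spss_to_datetime(seconds):
    """Convert an SPSS date value (seconds since 1582-10-14) to a datetime."""
    return SPSS_EPOCH + timedelta(seconds=seconds)

converted = spss_to_datetime(13831603200)
print(converted.strftime("%d-%b-%Y"))

# The same origin also gives the offset between the SPSS and Unix epochs.
epoch_gap = int((datetime(1970, 1, 1) - SPSS_EPOCH).total_seconds())
print(epoch_gap)
```

On a dataframe column this could be applied with something like `df["Vdatesub"].map(spss_to_datetime)`, assuming the column contains no missing values.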

How can I convert text descriptions of location time zones to UTC in Python?

I need to be able to convert a region-based time zone stored as a string either to a UTC time zone or to a common time zone across locations. For example, “Canada/Vancouver” and “America/Los_Angeles” should both resolve to “US/Pacific”. This solution should also work for other time zones like “Canada/Toronto” and “America/New_York” to “US/Eastern”, also extending to time zones for other locations like Mexico, etc.
I have no idea how to do this or even think about this. I could convert it to a UTC-7 but that doesn’t handle PST vs PDT shifts.
Can someone help?
Edit: after reading the comments and answer I realized that my question wasn’t clear enough.
I have a set of phone numbers, and I use the “phonenumbers” package to get the time zone in the newer format for each number, but I want to count the number of unique phone numbers by the old region time zone naming convention. Hence I want to convert the newer “Continent/City” time zones to “Country/Region” time zones. The UTC idea was just me trying to think of a way to convert the region/city formats into a common name.
Time zones from the IANA database refer to regions in a geographical sense. UTC, on the other hand, is not a time zone; it is universal (not specific to a region).
For a time zone, you can have an offset from UTC (like UTC-8 for 8 hours behind UTC).
A certain date/time in a given time zone has a specific UTC offset, as derived from the rules for that time zone (when to apply DST etc.).
The other way around, a certain UTC offset can apply in multiple time zones at given date/time, so mapping back needs a definition, otherwise it's ambiguous.
Regarding the naming of time zones, "Continent/City"-style time zone names are preferred. Old names like "US/Pacific" (as from before 1993) are kept in the database for backwards-compatibility - see also eggert-tz/backward.
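Since mapping back needs a definition, one option is to write that definition down explicitly. The sketch below assumes a hand-maintained lookup table; `LEGACY_NAME`, `to_legacy` and the sample identifiers are hypothetical, and the table would need one entry per zone you care about.

```python
from collections import Counter

# Hand-written definition of which "Continent/City" zone counts as which
# legacy "Country/Region" name (an assumption, extend as needed).
LEGACY_NAME = {
    "America/Vancouver": "US/Pacific",
    "America/Los_Angeles": "US/Pacific",
    "America/Toronto": "US/Eastern",
    "America/New_York": "US/Eastern",
}

def to_legacy(zone):
    """Return the legacy name for a zone, or the zone itself if unmapped."""
    return LEGACY_NAME.get(zone, zone)

# Placeholder identifiers standing in for phone numbers and their zones.
zone_by_number = {
    "number-a": "America/Vancouver",
    "number-b": "America/Los_Angeles",
    "number-c": "America/New_York",
    "number-d": "Europe/Berlin",
}

counts = Counter(to_legacy(zone) for zone in zone_by_number.values())
print(counts)
```

Unmapped zones fall through unchanged, so gaps in the table show up in the counts rather than being silently dropped.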
Python >= 3.9 supports IANA time zones with the standard library via the zoneinfo module. Using that, you can create aware datetime objects easily and get their UTC offset, e.g. like
from datetime import datetime
from zoneinfo import ZoneInfo

tznames = ["America/Vancouver", "America/Los_Angeles",
           "America/Toronto", "America/New_York", "Europe/Berlin"]

def timedelta_to_str(td):
    # Format a UTC offset as +HH:MM / -HH:MM; also correct for negative half-hour offsets.
    total_minutes = int(td.total_seconds() // 60)
    sign = "-" if total_minutes < 0 else "+"
    hours, minutes = divmod(abs(total_minutes), 60)
    return f"{sign}{hours:02d}:{minutes:02d}"

# Naive "now" is treated as the machine's local time by astimezone().
now = datetime.now().replace(microsecond=0)
for z in tznames:
    local_now = now.astimezone(ZoneInfo(z))
    print(f"now in zone {z}:\n\t{local_now.isoformat(' ', timespec='seconds')}, "
          f"UTC offset: {timedelta_to_str(local_now.utcoffset())} hours\n")
    # or also e.g. print(f"local time {z}:\n\t{local_now}, UTC offset: {local_now.strftime('%z')}\n")

# now in zone America/Vancouver:
#     2022-01-12 06:30:08-08:00, UTC offset: -08:00 hours
# now in zone America/Los_Angeles:
#     2022-01-12 06:30:08-08:00, UTC offset: -08:00 hours
# now in zone America/Toronto:
#     2022-01-12 09:30:08-05:00, UTC offset: -05:00 hours
# now in zone America/New_York:
#     2022-01-12 09:30:08-05:00, UTC offset: -05:00 hours
# now in zone Europe/Berlin:
#     2022-01-12 15:30:08+01:00, UTC offset: +01:00 hours
see also on SO:
Python: datetime tzinfo time zone names documentation
Display the time in a different time zone
Format timedelta to string

Converting strings to datetime while changing timezone

I have many strings of dates and times (or both), like these:
'Thu Jun 18 19:30:21 2015'
'21:07:52'
I want to convert these times to the proper datetime format while also changing the timezone to UTC. The current timezone is 4 hours behind UTC. Is there a way that I can tell python to add 4 hours while converting the formats? Can it also take care of the date in UTC such that when the hour goes past 24 the date changes and time resets?
I will ultimately be inserting these into a mysql table into fields with the 'datetime' and 'time' data type, but they all need to be in UTC.
I would approach this with time.strptime() to parse the source time string, time.mktime() to convert the resulting time vector into an epoch time (seconds since 1970-01-01 00:00:00), and time.strftime() to format the time as you like.
For the timezone adjustment, you could add 4*3600 to the epoch time value or, more generally, append a timezone string to the source and use %Z to parse it.
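As an alternative to the time-module route, here is a sketch using datetime with a fixed UTC-4 offset, which also handles the date rollover. The name `to_utc` and the `LOCAL` constant are assumptions for illustration, and a fixed offset ignores any daylight saving changes in the source zone.

```python
from datetime import datetime, timedelta, timezone

# Assumed source zone: a fixed offset 4 hours behind UTC.
LOCAL = timezone(timedelta(hours=-4))

def to_utc(text, fmt="%a %b %d %H:%M:%S %Y"):
    """Parse a local time string and return the equivalent aware UTC datetime."""
    local = datetime.strptime(text, fmt).replace(tzinfo=LOCAL)
    return local.astimezone(timezone.utc)

stamp = to_utc("Thu Jun 18 19:30:21 2015")
print(stamp.strftime("%Y-%m-%d %H:%M:%S"))

# A late-evening local time rolls over to the next UTC date.
rolled = to_utc("Thu Jun 18 21:07:52 2015")
print(rolled.strftime("%Y-%m-%d %H:%M:%S"))
```

A time-only string such as '21:07:52' has no date to roll over, so it needs a date attached before conversion; the formatted strings above match what a MySQL datetime field accepts.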

How to get utc time from time string in python?

In my application, I have two select options: one for the hour and another for the minute. When I fetch the selected values I get a time string like '12:34', and I want to convert that time string to UTC time.
Can anyone please suggest how to get the UTC time from the time string?
Thank you.
You don't have enough information to do that because you need to know the offset of your timezone from UTC.
But to get you started
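from datetime import datetime, timedelta
timestring = '12:34'  # example value from the two select options
offset = -2  # I live in Central European Summer Time, so I am 2 hours ahead of UTC
# strptime fills in 1900-01-01 as the date; only the time part is meaningful here
myutctime = datetime.strptime(timestring, '%H:%M') + timedelta(hours=offset)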
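If the user's zone is known by name, the offset need not be hard-coded. The sketch below assumes Python 3.9+ with the zoneinfo module and an available time zone database; the function name `local_time_to_utc`, the zone and the dates are illustrative assumptions.

```python
from datetime import date, datetime, timezone
from zoneinfo import ZoneInfo

def local_time_to_utc(timestring, zone, on_date):
    """Interpret 'HH:MM' as wall-clock time in `zone` on `on_date`, return UTC."""
    wall_time = datetime.strptime(timestring, "%H:%M").time()
    local = datetime.combine(on_date, wall_time, tzinfo=ZoneInfo(zone))
    return local.astimezone(timezone.utc)

# The date matters because the offset differs between summer and winter.
summer = local_time_to_utc("12:34", "Europe/Berlin", date(2022, 7, 1))
winter = local_time_to_utc("12:34", "Europe/Berlin", date(2022, 1, 12))
print(summer.strftime("%H:%M"), winter.strftime("%H:%M"))
```

Passing `date.today()` as `on_date` would give the conversion for the current day.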
