Can't read date variables in Python from SPSS (.sav) files - python

I'm working with a .sav (SPSS) file in Python. All the variables look fine after import with pyreadstat (and also with pandas) except for the datetime variables. They are read in as floats shown in exponential notation, but their original SPSS format is dd-mmm-yy (e.g., 02-feb-2021) of type date.
This is what the date variable looks like:
1.383160e+10
Is there a way to convert this format to datetime using Python?
I've tried various ways of using the datetime module and time module. But what I get is a date from the year 2408
# Here I'm using the float from the first row in the dataframe
time.gmtime(13831603200)
The results
time.struct_time(tm_year=2408, tm_mon=4, tm_mday=22, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=1, tm_yday=113, tm_isdst=0)
When I use the datetime module:
python_date = datetime.fromtimestamp(13831603200).strftime('%d-%b-%Y, %H:%M:%S')
print(python_date)
22-Apr-2408, 00:00:00
[How the datetime variable (Vdatesub) is showing when using Python][1]
[1]: https://i.stack.imgur.com/I7yza.png

This is answered under these two posts (one Python, one R):
Convert 'seconds since October 14, 1582' to Python datetime
Read SPSS file into R, the data format for date is wrong, and generate more variable
In short: SPSS stores the date as the number of seconds since 14 Oct 1582, while Python timestamps count from the Unix epoch (01 Jan 1970).
You need to subtract the number of seconds between 1582-10-14 and 1970-01-01 from the timestamp value, as per this post:
Timestamp out of range for platform localtime()/gmtime() function
(That offset is 12,219,379,200 seconds, i.e. 141,428 days in the proleptic Gregorian calendar that Python uses. With it, 13831603200 becomes 1612224000, which is 02 Feb 2021.)
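A minimal sketch of that adjustment, assuming the column really holds seconds since 1582-10-14 00:00:00. Rather than hard-coding the offset, it adds the seconds to the SPSS epoch using the standard datetime module. The names SPSS_EPOCH and spss_seconds_to_datetime are illustrative and not part of pyreadstat or pandas.

```python
from datetime import datetime, timedelta

# Assumption: SPSS counts seconds from this instant (proleptic Gregorian calendar).
SPSS_EPOCH = datetime(1582, 10, 14)


def spss_seconds_to_datetime(seconds):
    """Convert an SPSS date value (seconds since 1582-10-14) to a datetime."""
    return SPSS_EPOCH + timedelta(seconds=seconds)


# The float from the first row of the question's dataframe.
print(spss_seconds_to_datetime(13831603200))  # 2021-02-02 00:00:00
```

For a whole column, the same function could be applied row by row, with missing values handled separately, since timedelta does not accept NaN.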

Related

Converting from local time to UTC time in Python Pandas dataframe?

How would I efficiently convert local times in a dataframe to UTC times? There are 3 columns with information: the date (string), the timezone code (string), and the hour of the day (integer).
date               timezone  hour
7/31/2010 0:00:00  EST       1
6/14/2010 0:00:00  PST       3
6/14/2010 0:00:00  PST       4
5/30/2010 0:00:00  EDT       23
5/30/2010 0:00:00  EDT       24
After the data is converted I will be aggregating it to monthly data.
Gday.
Working with dates is described reasonably well in this answer here: converting utc to est time in python
In that case the timezone offsets are given as numbers, e.g. +11:00, whereas you have the US short codes. So you could convert that column to the numerical equivalent first and then use that function.
Personally I find the notation "Australia/Melbourne" way easier to deal with, especially because it takes care of daylight saving time for you. Timezones are a nightmare. That's described here: Python: datetime tzinfo time zone names documentation
In terms of the hour column, you can just use a string function to join those two values together to form a date and time string.
So I'd suggest you convert that timezone column to that format (i.e. EST as America/New_York, etc.), then feed all three columns into a datetime conversion as per the first answer.
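As a rough sketch of the first suggestion (turning the short codes into numeric offsets), the following uses only the standard library. The offset table and the to_utc helper are assumptions made for illustration, and the hour column is assumed to mean "hours after midnight", so that 24 rolls over to the next day.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets (in hours) for the short codes that appear in the data.
UTC_OFFSET_HOURS = {"EST": -5, "EDT": -4, "PST": -8, "PDT": -7}


def to_utc(date_text, tz_code, hour):
    """Combine the three columns into one timezone-aware UTC datetime."""
    local_tz = timezone(timedelta(hours=UTC_OFFSET_HOURS[tz_code]))
    midnight = datetime.strptime(date_text, "%m/%d/%Y %H:%M:%S")
    # Adding a timedelta (instead of replacing the hour) lets hour 24 roll over.
    local_dt = (midnight + timedelta(hours=hour)).replace(tzinfo=local_tz)
    return local_dt.astimezone(timezone.utc)


print(to_utc("7/31/2010 0:00:00", "EST", 1))   # 2010-07-31 06:00:00+00:00
print(to_utc("5/30/2010 0:00:00", "EDT", 24))  # 2010-05-31 04:00:00+00:00
```

In a dataframe the helper could be applied per row; the resulting UTC datetimes can then be grouped by month for the aggregation step.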

how can I convert date(YYYY, MM, DD) format to YYYY/MM/DD with python? [duplicate]

I have a JSON that returns a date in the following date format:
datetime(2015, 12, 1)
So the key value from JSON is
'CreateDate': datetime(2015, 12, 1)
In order to be able to subtract two dates, I need to convert above to date format:
YYYY/MM/DD
so in above case that would become: 2015/12/01
Is there a smart way to do that? Is it at all possible? Or do I really have to parse it as a block of text? I tried using datetime.strptime, but I can't get it to work.
datetime(2015,12,1).strftime("%Y/%m/%d")
will do the trick for you. A complete Python program would look like this:
# import the datetime and date classes from the datetime module
from datetime import datetime, date
# create your datetime(..) instance from your JSON somehow
# The included Python JSON tools do not usually know about date-times,
# so this may require a special step
# Assume you have a datetime now
dt = datetime(2015,12,1)
# Print it out in your format
print( dt.strftime("%Y/%m/%d") )
Two important details:
You are using just a date in a Python datetime. Nothing wrong with that but just note that the Python datetime module also has a date class
You can enable your JSON encoder/decoder to recognise dates and datetimes automatically but it requires extra work.
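The "extra work" for JSON could look roughly like the sketch below, which uses the default and object_hook parameters of the standard json module. The "__datetime__" marker key and both helper names are made up for this example; any convention that both sides agree on would do.

```python
import json
from datetime import datetime


def encode_datetime(obj):
    """Called by json.dumps for objects it cannot serialise itself."""
    if isinstance(obj, datetime):
        return {"__datetime__": obj.isoformat()}
    raise TypeError(f"Cannot serialise {type(obj).__name__}")


def decode_datetime(mapping):
    """Called by json.loads for every decoded JSON object."""
    if "__datetime__" in mapping:
        return datetime.fromisoformat(mapping["__datetime__"])
    return mapping


payload = json.dumps({"CreateDate": datetime(2015, 12, 1)}, default=encode_datetime)
restored = json.loads(payload, object_hook=decode_datetime)
print(restored["CreateDate"].strftime("%Y/%m/%d"))  # 2015/12/01
```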
Now, to subtract datetimes from each other they should remain instances of the datetime class. You cannot subtract datetimes from each other once they have been formatted as strings.
Once you subtract one Python datetime from another, the result will be an instance of the timedelta class.
from datetime import datetime, timedelta
# time_diff here is a timedelta type
time_diff = datetime(2015,12,1) - datetime(2014,11,1)
Now you can look up the Python timedelta type and extract the days, hours, minutes etc. that you need. Be aware that a timedelta can be negative if you subtract a later datetime from an earlier one.
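For instance, a small sketch of pulling those parts out of a timedelta (the 06:30 time of day is added here only so that the hours and minutes are non-zero):

```python
from datetime import datetime

diff = datetime(2015, 12, 1, 6, 30) - datetime(2014, 11, 1)

# A timedelta stores days and seconds; hours and minutes are derived from seconds.
hours, remainder = divmod(diff.seconds, 3600)
minutes = remainder // 60
print(diff.days, hours, minutes)  # 395 6 30
print(diff.total_seconds())       # 34151400.0

# Subtracting the later date from the earlier one gives a negative timedelta.
negative = datetime(2014, 11, 1) - datetime(2015, 12, 1)
print(negative.days)       # -395
print(abs(negative).days)  # 395
```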
The datetime class has a strftime method for this, which is easy to use, as shown below:
from datetime import datetime
print(datetime(2015,12,1).strftime("%Y/%m/%d"))
Also read more about the module here https://docs.python.org/3/library/datetime.html

Converting UTC Timestamp in CSV to Local time (PST)

I am looking for code to convert timestamps from some GPS data in a CSV file to local time (in this case PST). I also have some other files that I would have to convert to CDT and EDT.
This is what the output looks like:
2019-09-18T07:07:48.000Z
I would like to create separate columns to the right in Excel, one for the date and another for the time. Example:
TIME_UTC                  DATE        TIME_PST
2019-09-18T07:07:48.000Z  09-18-2019  12:07:48 AM
I only know basic Python and nothing about Excel in python so it would be super helpful!
Thank you!!!
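Calling localize tells pytz which timezone your naive time is in. So, in your example, you declare that your date is in UTC and then call astimezone with the target timezone. For example:
from datetime import datetime
import pytz
# Parse the UTC string and mark it as UTC
utc_dt = pytz.utc.localize(datetime.strptime('2019-09-18T07:07:48.000Z', '%Y-%m-%dT%H:%M:%S.%fZ'))
pst_tz = pytz.timezone('US/Pacific')
# Convert to Pacific time (PST or PDT, depending on the date)
pst_dt = utc_dt.astimezone(pst_tz)
print(pst_dt.strftime('%m-%d-%Y'), pst_dt.strftime('%I:%M:%S %p'))
For more examples, visit here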
If you want to use Excel Formula:
For the date:
=INT(SUBSTITUTE(LEFT(A2,LEN(A2)-1),"T"," ")-TIME(7,0,0))
For the Time:
=MOD(SUBSTITUTE(LEFT(A2,LEN(A2)-1),"T"," ")-TIME(7,0,0),1)
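Then format the outputs with the desired formats: mm-dd-yyyy and hh:mm:ss AM/PM respectively. Note that TIME(7,0,0) corresponds to Pacific daylight time (PDT, UTC-7); for dates in standard time (PST, UTC-8) use TIME(8,0,0).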
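The Excel formulas above can also be mirrored in Python. The sketch below is only an illustration under stated assumptions: it uses the same fixed 7-hour offset as the formulas (so no daylight-saving logic), expects the timestamp in the first column, and reads from an in-memory string instead of a real file. The helper names and the column headers are taken from the question or invented for the example.

```python
import csv
import io
from datetime import datetime, timedelta

# Assumption: the same fixed offset as TIME(7,0,0) in the Excel formulas.
OFFSET = timedelta(hours=7)


def split_local(utc_text, offset=OFFSET):
    """Return (date, time) strings for a timestamp like 2019-09-18T07:07:48.000Z."""
    utc_dt = datetime.strptime(utc_text, "%Y-%m-%dT%H:%M:%S.%fZ")
    local_dt = utc_dt - offset
    return local_dt.strftime("%m-%d-%Y"), local_dt.strftime("%I:%M:%S %p")


def add_local_columns(reader, writer):
    """Copy every row and append DATE and TIME_PST columns."""
    header = next(reader)
    writer.writerow(header + ["DATE", "TIME_PST"])
    for row in reader:
        writer.writerow(row + list(split_local(row[0])))


source = io.StringIO("TIME_UTC\n2019-09-18T07:07:48.000Z\n")
target = io.StringIO()
add_local_columns(csv.reader(source), csv.writer(target))
print(target.getvalue())
```

With real files, the two StringIO objects would be replaced by files opened with open(..., newline=""), as the csv module documentation recommends.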

Converting strings to datetime while changing timezone

I have many strings of dates and times (or both), like these:
'Thu Jun 18 19:30:21 2015'
'21:07:52'
I want to convert these times to the proper datetime format while also changing the timezone to UTC. The current timezone is 4 hours behind UTC. Is there a way that I can tell python to add 4 hours while converting the formats? Can it also take care of the date in UTC such that when the hour goes past 24 the date changes and time resets?
I will ultimately be inserting these into a mysql table into fields with the 'datetime' and 'time' data type, but they all need to be in UTC.
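I would approach this with time.strptime() to parse the source time string, time.mktime() to convert the resulting struct_time into an epoch time (seconds since 1970-01-01 00:00:00 UTC), and time.strftime() to format the time as you like. Note that time.mktime() interprets the struct_time in the machine's local timezone; calendar.timegm() is the variant that treats it as UTC, which is the one to pair with a manual offset.
For the timezone adjustment, you could add 4*3600 to the epoch time value or, more generally, append a timezone string to the source and use %Z to parse it (bearing in mind that %Z only recognises a limited set of timezone names).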
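An alternative sketch using the datetime module instead of the time module, assuming the source times are always exactly 4 hours behind UTC as stated in the question. SOURCE_TZ and to_utc are names chosen for this example. Day rollover is handled by the datetime arithmetic itself.

```python
from datetime import datetime, timedelta, timezone

# Assumption from the question: the source timezone is a fixed UTC-4.
SOURCE_TZ = timezone(timedelta(hours=-4))


def to_utc(text, fmt="%a %b %d %H:%M:%S %Y"):
    """Parse a local time string and return the equivalent UTC datetime."""
    local_dt = datetime.strptime(text, fmt).replace(tzinfo=SOURCE_TZ)
    return local_dt.astimezone(timezone.utc)


# Full date and time, formatted the way a MySQL DATETIME literal is written.
print(to_utc("Thu Jun 18 19:30:21 2015").strftime("%Y-%m-%d %H:%M:%S"))
# 2015-06-18 23:30:21

# Time-only strings get strptime's default date of 1900-01-01, so the time wraps
# past midnight correctly, but the date part carries no meaning.
print(to_utc("21:07:52", fmt="%H:%M:%S").strftime("%H:%M:%S"))
# 01:07:52
```

If the date matters for the time-only values, they would need to be combined with their real date before conversion, since the rollover changes the day.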

storing date into mongodb using python in ISO format

I am trying to store date into mongodb using python(bottle framework).
I want to store it in the below format:
ISODate("2015-06-08 03:38:28")
Currently I am using the following command:
datetime.strptime(DateField, '%m/%d/%Y %H:%M:%S %p')
it is getting stored like this:
ISODate("2015-06-08T03:38:28Z")
How to store it without "T" and "Z" in it??
You are confusing how something is stored vs. how something is displayed.
In MongoDB, dates are stored as 64-bit integers (milliseconds since the Unix epoch). What you are seeing is the way that value is represented so that we can easily tell which date and time the 64-bit number stands for.
ISODate is just a helper, which formats the date in the ISO date format.
So when you pass in a normal date and time string, it will convert it into the correct format.
The format adds the T (to separate the time part) and the Z (as you have not identified a time zone, it defaults to UTC).
In short - you are not storing it with the T and the Z, that's just how it is displayed back to you.
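To make the storage versus display distinction concrete, here is a small sketch using only the standard library (no MongoDB driver is involved). It computes the kind of integer a BSON date holds, which counts milliseconds since the Unix epoch, and then renders the same instant in two display formats. The variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

value = datetime(2015, 6, 8, 3, 38, 28, tzinfo=timezone.utc)
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)

# The stored form: a plain integer count of milliseconds since the epoch.
millis = (value - epoch) // timedelta(milliseconds=1)
print(millis)

# Two display forms of the same stored value.
print(value.strftime("%Y-%m-%dT%H:%M:%SZ"))  # 2015-06-08T03:38:28Z
print(value.strftime("%Y-%m-%d %H:%M:%S"))   # 2015-06-08 03:38:28
```

The format without "T" and "Z" is therefore something to apply when reading the value back for display, not something to change at storage time.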
