I have a string datetime "2017-01-01T20:19:47.922596536+09".
I would like to convert this into Snowflake's TIMESTAMP_NTZ data type. The Snowflake documentation defines it as follows:
TIMESTAMP_NTZ
TIMESTAMP_NTZ internally stores “wallclock” time with a specified precision. All operations are performed without taking any time zone into account.
If the output format contains a time zone, the UTC indicator (Z) is displayed.
TIMESTAMP_NTZ is the default for TIMESTAMP.
Aliases for TIMESTAMP_NTZ:
TIMESTAMPNTZ
TIMESTAMP WITHOUT TIME ZONE
I've tried using numpy.datetime64 but I get the following:
> numpy.datetime64("2017-01-01T20:19:47.922596536+09")
numpy.datetime64('2017-01-01T11:19:47.922596536')
For some reason this converts the time to UTC instead of keeping the wall-clock value.
I've also tried pd.to_datetime:
> pd.to_datetime("2017-01-01T20:19:47.922596536+09")
Timestamp('2017-01-01 20:19:47.922596536+0900', tz='pytz.FixedOffset(540)')
This gives me the correct value, but when I try to insert it into Snowflake, I get the following error:
sqlalchemy.exc.ProgrammingError: (snowflake.connector.errors.ProgrammingError) 252004: Failed processing pyformat-parameters: 255001: Binding data in type (timestamp) is not supported.
Any suggestions would be much appreciated!
You can do this on the Snowflake side if you want, by sending the string as-is and converting it to a timestamp_ntz. This single statement shows two ways: one that simply strips off the time zone information, and one that converts the time to UTC before stripping off the time zone.
select try_to_timestamp_ntz('2017-01-01T20:19:47.922596536+09',
'YYYY-MM-DD"T"HH:MI:SS.FF9TZH') TS_NTZ
,convert_timezone('UTC',
try_to_timestamp_tz('2017-01-01T20:19:47.922596536+09',
'YYYY-MM-DD"T"HH:MI:SS.FF9TZH'))::timestamp_ntz UTC_TS_NTZ
;
Note that Snowflake UI by default only shows 3 decimal places (milliseconds) unless you specify higher precision for the output display using to_varchar() and a timestamp format string.
TS_NTZ                         UTC_TS_NTZ
2017-01-01 20:19:47.922596536  2017-01-01 11:19:47.922596536
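If you would rather prepare the value in Python before sending it, the same two variants can be sketched with the standard library only. This is a minimal sketch, not the connector's own method: the helper name to_ntz_strings is hypothetical, it assumes the input always looks like the string in the question (an ISO date, a "T", an optional fraction, and a numeric offset), and it keeps the fractional digits as text so the nanoseconds are not truncated.

```python
import re
from datetime import datetime, timedelta

# Hypothetical helper: assumes input shaped like "2017-01-01T20:19:47.922596536+09"
_PATTERN = re.compile(
    r"^(\d{4}-\d{2}-\d{2})T(\d{2}:\d{2}:\d{2})(\.\d+)?([+-])(\d{2})(?::?(\d{2}))?$"
)

def to_ntz_strings(text):
    """Return (wall_clock, utc) strings without any time zone suffix."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"unsupported timestamp format: {text!r}")
    date_part, time_part, fraction, sign, hours, minutes = match.groups()
    fraction = fraction or ""
    # Parse only whole seconds; the fraction is carried along as text.
    wall = datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H:%M:%S")
    offset = timedelta(hours=int(hours), minutes=int(minutes or 0))
    if sign == "-":
        offset = -offset
    utc = wall - offset  # local time minus its UTC offset gives UTC
    fmt = "%Y-%m-%d %H:%M:%S"
    return wall.strftime(fmt) + fraction, utc.strftime(fmt) + fraction

wall_clock, utc_value = to_ntz_strings("2017-01-01T20:19:47.922596536+09")
print(wall_clock)
print(utc_value)
```

Either string can then be bound as plain text and cast to timestamp_ntz on the Snowflake side.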
I have a value in a raw binary file (part of a database) and I want to convert to a Python format which can be interpreted by a human being. This is part of a forensic carving procedure.
I can convert 8-byte values using this SQL statement (you will see one date in GMT+2 and one in GMT)
SELECT CAST(0x0000ae9401039c4a AS datetime), CAST(0x0000ae9400e2a6ca AS datetime)
which returns
2022-05-13 15:45:12.780 2022-05-13 13:45:12.780
I have tried to convert the binary value with DCODE v5.5 (https://www.digital-detective.net/dcode/) but can't find any format that matches the output of the previous SQL statement (I have checked that it is right in the database I'm trying to carve).
Does anyone know how to perform the conversion in Python?
I imagine I just need the origin of this time representation and how much time each unit represents. Comparing two timestamps exactly 2 hours apart gives a factor of 300 that I don't know how to interpret. Is one unit 1/300 of a second?
>>> t1=0xae9401039c4a
>>> t2=0xae9400e2a6ca
>>> t1-t2
2160000
>>> (t1-t2)/(2*3600)
300.0
These are the properties of the database I need to carve:
Short Version
The legacy datetime type stores dates as a 64-bit floating point offset from 1900-01-01
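import datetime
import struct
# bytes: the 8 raw bytes read from the file, interpreted as a little-endian double
floatValue = struct.unpack('<d', bytes)[0]
OLE_TIME_ZERO = datetime.datetime(1900, 1, 1, 0, 0, 0)
date = OLE_TIME_ZERO + datetime.timedelta(days=floatValue)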
Newer types don't use that format though.
Excel handling libraries like openpyxl offer functions that convert OA/Serial dates like openpyxl.utils.datetime.from_excel
Long Explanation
The legacy datetime type in SQL Server uses the OLE Automation Date storage format that was also used in Excel, Visual Basic and all desktop applications that supported OLE/COM Automation in the late 1990s and early 2000s, before macro viruses. This is a 64-bit floating point number (called a Serial date in Excel) whose integral part is an offset since 1899-12-30 and fractional part is the time of day. Except when it's 1899-12-31 or 1900-01-01.
Back when Excel was released, Lotus 1-2-3 was the most popular spreadsheet and a de facto standard, and it incorrectly considered 1900 a leap year. To ensure compatibility, Excel adopted the same bug. VBA tried both to fix the bug and to ensure formulas produced the same results as Excel and Lotus, so it uses 1899-12-30 as a base.
The SQL Server team didn't care about the bug and used the logical 1900-01-01 instead.
Essentially, this value is a timedelta. In Python, you can convert this float to a timedelta by passing it as the days parameter to timedelta, and add it to the base 1900-01-01:
OLE_TIME_ZERO = datetime.datetime(1900, 1, 1, 0, 0, 0)
date = OLE_TIME_ZERO + datetime.timedelta(days=floatValue)
To get a 64-bit float from an array of bytes you can use struct.unpack with the appropriate format string. A 64-bit float is actually a double:
floatValue=struct.unpack('<d',bytes)[0]
WARNING
datetime is a legacy type. The types introduced in SQL Server 2008 (date, time, datetime2 and datetimeoffset) have a different storage format. datetime2 and datetimeoffset have variable precision and variable size.
For future reference, I was finally able to find the data I needed in this post: What is the internal representation of datetime in sql server?
The details are supposedly opaque, but most resources (1), (2) that I've found on the web state the following:
The first 4 bytes store the number of days since SQL Server's epoch (1st Jan 1900) and that
the second 4 bytes stores the number of ticks after midnight, where a "tick" is 3.3 milliseconds.
The first four bytes are signed (can be positive or negative), which explains why dates earlier than the epoch can be represented.
https://learn.microsoft.com/en-us/sql/t-sql/functions/date-and-time-data-types-and-functions-transact-sql?redirectedfrom=MSDN&view=sql-server-ver16
Range: 1753-01-01 through 9999-12-31
Accuracy: 0.00333 second
This function will do the conversion:
from datetime import datetime, timedelta

def extr_datetime(raw):
    # raw: the 8 bytes as stored on disk (little-endian): time ticks first, then days
    days_off = int.from_bytes(raw[4:8], byteorder='little', signed=True)
    # each tick is 1/300 of a second after midnight
    ticks_off = int.from_bytes(raw[0:4], byteorder='little', signed=True) / 300.0
    epoch = '1900/01/01 00:00:00'
    epoch_obj = datetime.strptime(epoch, '%Y/%m/%d %H:%M:%S')
    d = epoch_obj + timedelta(days=days_off) + timedelta(seconds=ticks_off)
    return d
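As a check, the layout described above can be applied to the two values from the question. This is a minimal sketch: the helper name decode_sql_datetime is hypothetical, and it assumes the hex literals from the SQL statement correspond to the on-disk bytes read as one little-endian 64-bit integer (high 32 bits are the days, low 32 bits are the ticks).

```python
from datetime import datetime, timedelta

SQL_SERVER_EPOCH = datetime(1900, 1, 1)
TICKS_PER_SECOND = 300  # one tick is 1/300 of a second

def decode_sql_datetime(raw):
    """Decode 8 on-disk bytes of a legacy SQL Server datetime (assumed layout)."""
    ticks = int.from_bytes(raw[0:4], byteorder="little", signed=True)
    days = int.from_bytes(raw[4:8], byteorder="little", signed=True)
    return SQL_SERVER_EPOCH + timedelta(days=days, seconds=ticks / TICKS_PER_SECOND)

# The hex values from the question, turned back into on-disk byte order.
t1 = (0x0000AE9401039C4A).to_bytes(8, byteorder="little")
t2 = (0x0000AE9400E2A6CA).to_bytes(8, byteorder="little")

print(decode_sql_datetime(t1))
print(decode_sql_datetime(t2))
```

The two results differ by exactly two hours, which matches the 2160000-tick difference computed in the question (2160000 / 300 = 7200 seconds).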
I am migrating an InfluxDB database to mySQL. I have managed to export the influx data to a CSV file, which is great, but now I am stuck with the date and time field which has been given to me.
I have no idea what format it is in. After doing some research I believe it is epoch time, but when I use Python to convert the timestamp to an ISO format, it isn't recognised as a valid timestamp. Any idea how to get this converted, ideally into separate date and time columns? The data that I have looks like this:
time,absoluteHumidity
1578152602608558363,5.788981747966442
1578152608059500073,4.769760557208695
1578152613662193439,5.788981747966442
And the python that I was using to try and convert it, was this :
from datetime import datetime, timezone
print (datetime.fromtimestamp(1578152602608558363, timezone.utc))
Any help or suggestions would be appreciated !
According to the influxdb docs they store timestamp values with nanoseconds precision.
However, datetime.fromtimestamp expects the number of seconds since the epoch (as an int or float), not nanoseconds.
So your approach is generally right; you just need to divide the InfluxDB timestamp by 1e9:
from datetime import datetime, timezone
print(datetime.fromtimestamp(1578152602608558363 / 1e9, timezone.utc))
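Since the question asks for separate date and time columns, here is a minimal sketch of one way to do that. The helper name split_influx_ns is hypothetical. It uses integer division instead of dividing by a float, so the sub-second part is not affected by floating-point rounding; Python's datetime only keeps microseconds, so the last three nanosecond digits are dropped.

```python
from datetime import datetime, timezone

def split_influx_ns(nanoseconds):
    """Split a nanosecond epoch timestamp into (date, time) ISO strings in UTC."""
    seconds, remainder_ns = divmod(nanoseconds, 1_000_000_000)
    moment = datetime.fromtimestamp(seconds, timezone.utc)
    # datetime resolution is microseconds, so truncate the nanosecond remainder
    moment = moment.replace(microsecond=remainder_ns // 1000)
    return moment.date().isoformat(), moment.time().isoformat()

date_column, time_column = split_influx_ns(1578152602608558363)
print(date_column, time_column)
```

The two strings can be written to the CSV as-is and loaded into MySQL DATE and TIME(6) columns.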
Some code I am writing in Python takes in a date from a server in the struct_time format (with the 9 args).
How can I store this date in an SQL database, and be able to read back this date as a struct_time while keeping the timezone and all additional information coming from struct_time?
I tried putting the struct_time directly in the SQL
struct_date = time.struct_time((2020, 9, 10, 22, 31, 4, 3, 254, 0))  # (tm_year, tm_mon, tm_mday, tm_hour, tm_min, tm_sec, tm_wday, tm_yday, tm_isdst)
cursor.execute("UPDATE dbo.RSS_Links SET last_update=? WHERE link=?;", struct_date, links)
> "A TVP's rows must be Sequence objects.", 'HY000'
I can put the time in the database using the below, but I don't see where the timezone is kept when converting to strftime.
date_to_store = time.strftime("%Y-%m-%d %H:%M:%S", struct_date)
I'd highly suggest doing one of these (in this specific order):
Use built-in DATETIME data type and store all dates in UTC
Use LONG/BIGINT type to store date in epoch
Use built-in DATETIME format that can store time zone information
Don't store dates as strings, don't couple it with struct_time or any other struct/class, you'll regret it later :)
Your application should have a data layer, which would handle data serialization/deserialization.
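A minimal sketch of such a data layer for the first two options, assuming the struct_time coming from the server is already expressed in UTC (that is an assumption; if it is local time, it must be converted first). The function names are hypothetical.

```python
import calendar
import time
from datetime import datetime, timezone

def struct_time_to_epoch(st):
    """Interpret a UTC struct_time as whole seconds since the Unix epoch."""
    return calendar.timegm(st)

def struct_time_to_utc_datetime(st):
    """Build a timezone-aware UTC datetime, suitable for a DATETIME column."""
    return datetime.fromtimestamp(calendar.timegm(st), tz=timezone.utc)

def utc_datetime_to_struct_time(dt):
    """Rebuild a UTC struct_time from an aware datetime read back from the database."""
    return dt.utctimetuple()

struct_date = time.struct_time((2020, 9, 10, 22, 31, 4, 3, 254, 0))
as_datetime = struct_time_to_utc_datetime(struct_date)
round_tripped = utc_datetime_to_struct_time(as_datetime)
print(as_datetime, tuple(round_tripped))
```

The weekday and day-of-year fields are derived from the date, so nothing is lost by storing only the UTC instant.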
I am loading the data using COPY command.
My Dates are in the following format.
D/MM/YYYY eg. 1/12/2016
DD/MM/YYYY eg. 23/12/2016
My target table data type is DATE. I am getting the following error "Invalid Date Format - length must be 10 or more"
As per the AWS Redshift documentation,
The default date format is YYYY-MM-DD. The default time stamp without
time zone (TIMESTAMP) format is YYYY-MM-DD HH:MI:SS.
So, as your date is not in the same format and of different length, you are getting this error. Append the following at the end of your COPY command and it should work.
[[COPY command as you are using right now]] + DATEFORMAT 'DD/MM/YYYY'
Not sure about the single digit case though. You might want to pad the incoming values with a 0 in the beginning to match the format length.
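If padding turns out to be necessary, it can be done in a small preprocessing step before the file is uploaded. This is a minimal sketch with a hypothetical helper name, assuming every value is a valid day/month/year date separated by slashes.

```python
from datetime import datetime

def pad_date(value):
    """Normalise D/MM/YYYY or DD/MM/YYYY to a zero-padded DD/MM/YYYY string."""
    parsed = datetime.strptime(value.strip(), "%d/%m/%Y")  # accepts 1 or 2 digit days
    return parsed.strftime("%d/%m/%Y")

for raw in ["1/12/2016", "23/12/2016"]:
    print(raw, "->", pad_date(raw))
```

Parsing with strptime also rejects impossible dates early, before COPY sees them.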
I am trying to store date into mongodb using python(bottle framework).
I want to store it in the below format:
ISODate("2015-06-08 03:38:28")
Currently I am using the following command:
datetime.strptime(DateField, '%m/%d/%Y %H:%M:%S %p')
it is getting stored like this:
ISODate("2015-06-08T03:38:28Z")
How can I store it without the "T" and "Z" in it?
You are confusing how something is stored vs. how something is displayed.
In MongoDB, dates are stored as 64-bit integers that count milliseconds since the Unix epoch. What you are seeing is just the way that number is represented so that we can easily tell which date and time it stands for.
The ISODate is just a helper method, which formats the date in the ISO date format.
So when you pass it in a normal date and time string, it will convert it into the correct format.
The format adds the T (to separate the time part) and the Z (as you have not identified a time zone, it is defaulted to UTC).
In short - you are not storing it with the T and the Z, that's just how it is displayed back to you.
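The difference between storage and display can be sketched in plain Python without a database. This is only an illustration under the assumption that the stored value is a count of milliseconds since the Unix epoch in UTC; the helper names are hypothetical and no MongoDB driver is involved.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_stored_millis(dt):
    """What gets stored: an integer number of milliseconds since the epoch."""
    return (dt - EPOCH) // timedelta(milliseconds=1)

def from_stored_millis(millis):
    """What comes back: a datetime rebuilt from that integer."""
    return EPOCH + timedelta(milliseconds=millis)

stored = to_stored_millis(datetime(2015, 6, 8, 3, 38, 28, tzinfo=timezone.utc))
restored = from_stored_millis(stored)

# The same stored integer can be displayed in any format you like.
iso_display = restored.strftime("%Y-%m-%dT%H:%M:%SZ")
plain_display = restored.strftime("%Y-%m-%d %H:%M:%S")
print(stored, iso_display, plain_display)
```

If you need the format without "T" and "Z", apply it when you read the value back and display it, not when you store it.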