Storing a date into MongoDB using Python in ISO format

I am trying to store a date in MongoDB using Python (the Bottle framework).
I want to store it in the below format:
ISODate("2015-06-08 03:38:28")
Currently I am using the following command:
datetime.strptime(DateField, '%m/%d/%Y %H:%M:%S %p')
but it is getting stored like this:
ISODate("2015-06-08T03:38:28Z")
How can I store it without the "T" and "Z" in it?

You are confusing how something is stored vs. how something is displayed.
In MongoDB, dates are stored as 64-bit integers (milliseconds since the Unix epoch); what you are seeing is just how that number is rendered so that we can easily read the date and time it represents.
ISODate is just a helper that renders the date in ISO 8601 format.
When you pass it a plain date-and-time string, it converts it into that format.
The format adds the T (to separate the time part) and the Z (since you have not specified a time zone, it defaults to UTC).
In short - you are not storing it with the T and the Z, that's just how it is displayed back to you.
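If you are using PyMongo, here is a minimal sketch of that store/display split (it assumes a local server; the database and collection names are hypothetical). Note that %I (12-hour clock) is the directive that pairs with %p, not %H:
from datetime import datetime
from pymongo import MongoClient

client = MongoClient()                 # hypothetical local mongod
events = client["mydb"]["events"]      # hypothetical db/collection names

# %I (12-hour clock) pairs with %p; %H (24-hour clock) does not
dt = datetime.strptime("06/08/2015 03:38:28 AM", "%m/%d/%Y %I:%M:%S %p")
events.insert_one({"created": dt})     # stored as a BSON date (64-bit ms since epoch)

# Formatting only happens when you read it back and display it yourself
doc = events.find_one()
print(doc["created"].strftime("%Y-%m-%d %H:%M:%S"))   # 2015-06-08 03:38:28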

Related

convert nanosecond precision datetime to snowflake TIMESTAMP_NTZ format

I have a string datetime "2017-01-01T20:19:47.922596536+09".
I would like to convert this into Snowflake's TIMESTAMP_NTZ type (see the Snowflake documentation). Simply put, it is defined as
TIMESTAMP_NTZ
TIMESTAMP_NTZ internally stores “wallclock” time with a specified precision. All operations are performed without taking any time zone into account.
If the output format contains a time zone, the UTC indicator (Z) is displayed.
TIMESTAMP_NTZ is the default for TIMESTAMP.
Aliases for TIMESTAMP_NTZ:
TIMESTAMPNTZ
TIMESTAMP WITHOUT TIME ZONE
I've tried using numpy.datetime64 but I get the following:
> numpy.datetime64("2017-01-01T20:19:47.922596536+09")
numpy.datetime64('2017-01-01T11:19:47.922596536')
This converts the time: numpy applies the +09 offset and stores the value as UTC, dropping the zone.
I've also tried pd.to_datetime:
> pd.to_datetime("2017-01-01T20:19:47.922596536+09")
Timestamp('2017-01-01 20:19:47.922596536+0900', tz='pytz.FixedOffset(540)')
This gives me the correct value, but when I try to insert it into the Snowflake DB, I get the following error:
sqlalchemy.exc.ProgrammingError: (snowflake.connector.errors.ProgrammingError) 252004: Failed processing pyformat-parameters: 255001: Binding data in type (timestamp) is not supported.
Any suggestions would be much appreciated!
You can do this on the Snowflake side if you want by sending the string format as-is and converting to a timestamp_ntz. This single query shows two ways: one that simply strips off the time zone information, and one that converts the time zone to UTC before stripping it off.
select try_to_timestamp_ntz('2017-01-01T20:19:47.922596536+09',
'YYYY-MM-DD"T"HH:MI:SS.FF9TZH') TS_NTZ
,convert_timezone('UTC',
try_to_timestamp_tz('2017-01-01T20:19:47.922596536+09',
'YYYY-MM-DD"T"HH:MI:SS.FF9TZH'))::timestamp_ntz UTC_TS_NTZ
;
Note that the Snowflake UI by default only shows three decimal places (milliseconds) unless you specify higher precision for the display using to_varchar() and a timestamp format string.
TS_NTZ                         UTC_TS_NTZ
2017-01-01 20:19:47.922596536  2017-01-01 11:19:47.922596536
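If you'd rather normalize on the Python side before binding, a minimal pandas sketch along the same lines (binding the value as a string is an assumption about what the connector version will accept):
import pandas as pd

ts = pd.to_datetime("2017-01-01T20:19:47.922596536+09")
# Convert to UTC, then drop the zone to get the "wallclock" NTZ value
naive_utc = ts.tz_convert("UTC").tz_localize(None)
print(naive_utc)               # 2017-01-01 11:19:47.922596536
# Bind as a string (naive_utc.to_pydatetime() also works, but loses
# sub-microsecond digits)
param = str(naive_utc)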

Trying to convert influxdb timestamp in a csv to date and time columns

I am migrating an InfluxDB database to MySQL. I have managed to export the Influx data to a CSV file, which is great, but now I am stuck with the date and time field it has given me.
I had no idea what format it was in; after doing some research it appears to be epoch time, but when I use Python to convert the timestamp to ISO format, it isn't recognised as a valid timestamp. Any idea how to convert it, ideally into separate date and time columns? The data I have looks like this:
time,absoluteHumidity
1578152602608558363,5.788981747966442
1578152608059500073,4.769760557208695
1578152613662193439,5.788981747966442
And the Python I was using to try to convert it was this:
from datetime import datetime, timezone
print (datetime.fromtimestamp(1578152602608558363, timezone.utc))
Any help or suggestions would be appreciated!
According to the InfluxDB docs, timestamps are stored with nanosecond precision.
However, datetime.fromtimestamp expects a floating-point number of seconds since the epoch.
So your approach is generally right; you just need to divide the Influx timestamp by 1e9 and it should work:
from datetime import datetime, timezone
print(datetime.fromtimestamp(1578152602608558363 / 1e9, timezone.utc))
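To get the separate date and time columns you mention, here is a pandas sketch along the same lines (export.csv is a placeholder for your exported file):
import pandas as pd

df = pd.read_csv("export.csv")                        # placeholder file name
dt = pd.to_datetime(df["time"], unit="ns", utc=True)  # nanoseconds since the epoch
df["date"] = dt.dt.date                               # e.g. 2020-01-04
df["time"] = dt.dt.time                               # e.g. 15:43:22.608558363
df.to_csv("export_split.csv", index=False)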

Pyspark - Convert to Timestamp

Spark version: 2.1
I'm trying to convert a string datetime column to a UTC timestamp with the format yyyy-MM-dd'T'HH:mm:ss.
I first change the format of the string column to yyyy-MM-dd'T'HH:mm:ss
and then convert it to the timestamp type. Later I would convert the timestamp to UTC using the to_utc_timestamp function.
df.select(
    f.to_timestamp(
        f.date_format(f.col("time"), "yyyy-MM-dd'T'HH:mm:ss"),
        "yyyy-MM-dd'T'HH:mm:ss",
    )
).show(5, False)
The date_format works fine and gives me the correct format. But when I apply to_timestamp on top of that result, the format changes to yyyy-MM-dd HH:mm:ss, when it should instead be yyyy-MM-dd'T'HH:mm:ss. Why does this happen?
Could someone tell me how I could retain the format given by date_format? What should I do?
The function to_timestamp converts a string to a timestamp. A timestamp has no display format of its own; Spark renders it as yyyy-MM-dd HH:mm:ss when you show it.
The second argument defines the format of the datetime in the string you are trying to parse, not the output format.
You can see a couple of examples in the official documentation.
The code should look like this; note the single 'd' here, which matches both one- and two-digit days and trips people up in many cases.
data = data.withColumn('date', to_timestamp(col('date'), 'yyyy/MM/d'))
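If you need the 'T' back for output, format the timestamp column into a string at display time. A sketch, assuming the time column from the question parses with that pattern:
from pyspark.sql import functions as f

# ts is a real timestamp column; ts_display is a string that keeps the 'T'
df = df.withColumn("ts", f.to_timestamp(f.col("time"), "yyyy-MM-dd'T'HH:mm:ss"))
df = df.withColumn("ts_display", f.date_format(f.col("ts"), "yyyy-MM-dd'T'HH:mm:ss"))
df.select("ts", "ts_display").show(5, False)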

Storing struct_time in SQL

Some code I am writing in Python receives a date from a server in the struct_time format (with its 9 fields).
How can I store this date in an SQL database and read it back as a struct_time, keeping the time zone and all the additional information that comes with struct_time?
I tried putting the struct_time directly into the SQL statement:
struct_date = time.struct_time(tm_year=2020, tm_mon=9, tm_mday=10, tm_hour=22, tm_min=31, tm_sec=4, tm_wday=3, tm_yday=254, tm_isdst=0)
cursor.execute("UPDATE dbo.RSS_Links SET last_update=? WHERE link=?;", struct_date, links)
> "A TVP's rows must be Sequence objects.", 'HY000'
I can put the time in the database using the line below, but I don't see where the time zone is kept when converting with strftime.
date_to_store = time.strftime("%Y-%m-%d %H:%M:%S", struct_date)
I'd strongly suggest doing one of these (in this order of preference):
1. Use the built-in DATETIME data type and store all dates in UTC.
2. Use a LONG/BIGINT column to store the date as an epoch value.
3. Use a built-in DATETIME type that can store time zone information.
Don't store dates as strings, and don't couple them to struct_time or any other struct/class; you'll regret it later :)
Your application should have a data layer, which would handle data serialization/deserialization.
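For options 1 and 2 above, a minimal round-trip sketch, assuming the struct_time represents UTC (struct_time carries no zone of its own, only a DST flag):
import calendar
import time
from datetime import datetime, timezone

struct_date = time.struct_time((2020, 9, 10, 22, 31, 4, 3, 254, 0))

# struct_time -> epoch seconds (timegm treats the struct as UTC)
epoch = calendar.timegm(struct_date)

# Option 1: bind an aware datetime to a DATETIME column
dt_utc = datetime.fromtimestamp(epoch, timezone.utc)
# Option 2: bind epoch itself to a BIGINT column

# Reading back: epoch (or datetime) -> struct_time
restored = time.gmtime(epoch)          # or dt_utc.utctimetuple()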

Redshift COPY Statement Date load error

I am loading data using the COPY command.
My dates are in the following formats:
D/MM/YYYY, e.g. 1/12/2016
DD/MM/YYYY, e.g. 23/12/2016
My target column's data type is DATE, and I am getting the following error: "Invalid Date Format - length must be 10 or more".
As per the AWS Redshift documentation:
The default date format is YYYY-MM-DD. The default time stamp without time zone (TIMESTAMP) format is YYYY-MM-DD HH:MI:SS.
So, because your dates are not in this format and have varying lengths, you get this error. Append the following to the end of your COPY command and it should work:
[[COPY command as you are using right now]] + DATEFORMAT 'DD/MM/YYYY'
I'm not sure about the single-digit day case, though. You might want to zero-pad the incoming values to match the format length, as in the sketch below.
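A quick way to do that padding, sketched in Python on the assumption that you can preprocess the file before running COPY:
from datetime import datetime

def pad_date(s: str) -> str:
    # strptime accepts "1/12/2016"; strftime re-emits it zero-padded
    return datetime.strptime(s, "%d/%m/%Y").strftime("%d/%m/%Y")

print(pad_date("1/12/2016"))   # 01/12/2016
print(pad_date("23/12/2016"))  # 23/12/2016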
