Redshift COPY Statement Date load error - python

I am loading data with the COPY command.
My dates are in the following formats:
D/MM/YYYY, e.g. 1/12/2016
DD/MM/YYYY, e.g. 23/12/2016
My target column's data type is DATE. I am getting the following error: "Invalid Date Format - length must be 10 or more"

As per the AWS Redshift documentation,
The default date format is YYYY-MM-DD. The default time stamp without
time zone (TIMESTAMP) format is YYYY-MM-DD HH:MI:SS.
Your dates are not in that format, and some of them are shorter than 10 characters, which is why you are getting this error. Append the following to the end of your COPY command and it should work.
[[COPY command as you are using right now]] + DATEFORMAT 'DD/MM/YYYY'
I am not sure about the single-digit case, though. You may need to pad the incoming values with a leading 0 so that they match the format length.
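If the padding turns out to be necessary, it could be done in Python before the file is uploaded. The following is only a minimal sketch: `pad_date` is a hypothetical helper name, and it assumes every value is a day/month/year string separated by slashes.

```python
def pad_date(value: str) -> str:
    """Zero-pad the day and month of a D/M/YYYY string to get DD/MM/YYYY."""
    day, month, year = value.strip().split("/")
    return f"{day.zfill(2)}/{month.zfill(2)}/{year}"


print(pad_date("1/12/2016"))   # 01/12/2016
print(pad_date("23/12/2016"))  # 23/12/2016
```

After this step every value is exactly 10 characters long, which matches the DD/MM/YYYY format string.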

convert nanosecond precision datetime to snowflake TIMESTAMP_NTZ format

I have a string datetime "2017-01-01T20:19:47.922596536+09".
I would like to convert this into Snowflake's TIMESTAMP_NTZ data type. The documentation defines it as follows:
TIMESTAMP_NTZ
TIMESTAMP_NTZ internally stores “wallclock” time with a specified precision. All operations are performed without taking any time zone into account.
If the output format contains a time zone, the UTC indicator (Z) is displayed.
TIMESTAMP_NTZ is the default for TIMESTAMP.
Aliases for TIMESTAMP_NTZ:
TIMESTAMPNTZ
TIMESTAMP WITHOUT TIME ZONE
I've tried using numpy.datetime64 but I get the following:
> numpy.datetime64("2017-01-01T20:19:47.922596536+09")
numpy.datetime64('2017-01-01T11:19:47.922596536')
For some reason, this shifts the time into a different time zone (apparently UTC).
I've also tried pd.to_datetime:
> pd.to_datetime("2017-01-01T20:19:47.922596536+09")
Timestamp('2017-01-01 20:19:47.922596536+0900', tz='pytz.FixedOffset(540)')
This gives me the correct value but when I try to insert the above value to snowflake db, I get the following error:
sqlalchemy.exc.ProgrammingError: (snowflake.connector.errors.ProgrammingError) 252004: Failed processing pyformat-parameters: 255001: Binding data in type (timestamp) is not supported.
Any suggestions would be much appreciated!
You can do this on the Snowflake side if you want, by sending the string as-is and converting it to a timestamp_ntz there. This single statement shows two ways: one that simply strips off the time zone information, and one that converts the time to UTC before stripping off the time zone.
select try_to_timestamp_ntz('2017-01-01T20:19:47.922596536+09',
'YYYY-MM-DD"T"HH:MI:SS.FF9TZH') TS_NTZ
,convert_timezone('UTC',
try_to_timestamp_tz('2017-01-01T20:19:47.922596536+09',
'YYYY-MM-DD"T"HH:MI:SS.FF9TZH'))::timestamp_ntz UTC_TS_NTZ
;
Note that the Snowflake UI by default shows only 3 decimal places (milliseconds) unless you specify a higher precision for the output display using to_varchar() and a timestamp format string.
TS_NTZ                         UTC_TS_NTZ
2017-01-01 20:19:47.922596536  2017-01-01 11:19:47.922596536
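For comparison, the same two results can be produced on the Python side as plain strings, which avoids binding a pandas Timestamp object. This is only a sketch using the standard library: `split_timestamp` is a hypothetical helper, and it assumes the input always has a fractional part of up to 9 digits and a whole-hour offset such as `+09`.

```python
import re
from datetime import datetime, timedelta
from typing import Tuple

# Date and time to the second, fractional digits, offset sign, offset hours.
_PATTERN = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.(\d{1,9})([+-])(\d{2})$"
)


def split_timestamp(text: str) -> Tuple[str, str]:
    """Return (wall-clock string, UTC string), keeping all fractional digits."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"unexpected timestamp format: {text!r}")
    seconds_part, fraction, sign, hours = match.groups()
    wallclock = datetime.strptime(seconds_part, "%Y-%m-%dT%H:%M:%S")
    offset = timedelta(hours=int(hours))
    # A positive offset means local time is ahead of UTC, so subtract it.
    utc = wallclock - offset if sign == "+" else wallclock + offset
    fmt = "%Y-%m-%d %H:%M:%S"
    return (
        f"{wallclock.strftime(fmt)}.{fraction}",
        f"{utc.strftime(fmt)}.{fraction}",
    )


ts_ntz, utc_ts_ntz = split_timestamp("2017-01-01T20:19:47.922596536+09")
print(ts_ntz)      # 2017-01-01 20:19:47.922596536
print(utc_ts_ntz)  # 2017-01-01 11:19:47.922596536
```

The fractional part is carried through as text because Python's datetime only holds microseconds, so parsing it would drop the last three digits.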

What's the correct datetime format for this string date generated by python?

I have this example date, '2022-08-30T11:53:52.204219', stored in a database. When I read it back its type is string, so I wanted to convert it to a date type using this Python code:
datetime.strptime('2022-08-30T11:53:52.204219', "%Y-%m-%d'T'%H:%M:%S.%f")
I also tried this one
datetime.strptime('2022-08-30T11:53:52.204219', "yyyy-MM-dd'T'HH:mm:ssZ")
But I always get this error: time data '2022-08-30T11:53:52.204219' does not match format "%Y-%m-%d'T'%H:%M:%S.%f"
I need help converting this string into an actual date.
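As per the comment, drop the single quotes around the T. strptime matches literal characters in the format string as they are, so the quotes themselves were being looked for in the input: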
from datetime import datetime
print(datetime.strptime('2022-08-30T11:53:52.204219', "%Y-%m-%dT%H:%M:%S.%f"))
Result:
2022-08-30 11:53:52.204219
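Since the string is already in ISO 8601 form, `datetime.fromisoformat` (available from Python 3.7) may be a simpler alternative that needs no format string at all. A minimal sketch, assuming the stored values always look like the example above:

```python
from datetime import datetime

raw = "2022-08-30T11:53:52.204219"

# Parse the ISO 8601 string without spelling out a format.
parsed = datetime.fromisoformat(raw)
print(parsed)  # 2022-08-30 11:53:52.204219

# Both approaches give the same datetime object.
print(parsed == datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S.%f"))  # True
```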

querying Elasticsearch for parse date field with format

I am querying Elasticsearch based on date, passing in a date and time string in this format yyyy-mm-dd hh:mm:ss, but Elasticsearch and DateTime are unable to accept this format.
I am writing a script that takes input and queries Elasticsearch based on it, primarily by index and date-time. I first wrote the script to use command-line arguments, entering the date-time in the same format, and it runs perfectly. However, when I convert the script to run with hardcoded inputs, this error appears:
error elasticsearch.exceptions.RequestError: RequestError(400, 'search_phase_execution_exception', 'failed to parse date field
[2019-07-01 00:00:00] with format
[strict_date_optional_time||epoch_millis]')
#this throws the error
runQueryWithoutCommandLine("log4j-*", "2019-07-01 00:00:00", "csv", "json")
#this does not throw error
def runQueryWithCommandLine(*args):
# "yyyy-mm-dd hh:mm:ss" date-time format is given in commandline
Why is this error appearing, and how can I get rid of it? Thank you!
The date format "strict_date_optional_time||epoch_millis" in Elasticsearch follows the ISO date format standard.
The ISO format for the string representation of a date is:
date-opt-time = date-element ['T' [time-element] [offset]]
In your case, the time portion is separated from the date by a space rather than a 'T', hence the parsing error.
In addition, since the time you pass is 00:00:00, you can simply omit it, because that is the default when no time portion is specified.
So either of the following date values will work:
1) 2019-07-01T00:00:00
2) 2019-07-01
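If the script has to keep accepting the space-separated form, the value could be converted just before it goes into the query. This is a minimal sketch: `to_es_datetime` is a hypothetical helper, and it assumes the input is always in "yyyy-mm-dd hh:mm:ss" form.

```python
from datetime import datetime


def to_es_datetime(value: str) -> str:
    """Convert 'YYYY-MM-DD HH:MM:SS' into the ISO form with a 'T' separator."""
    parsed = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
    return parsed.isoformat()


print(to_es_datetime("2019-07-01 00:00:00"))  # 2019-07-01T00:00:00
```

Parsing and re-serializing, rather than replacing the space directly, also rejects malformed input early with a ValueError.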

Apache Spark Query only on YEAR from "dd/mm/yyyy" format

I have more than 1 million records in an Excel file. I want to query the table using Python, but the date format is dd/mm/yyyy. I know that the format MySQL supports is yyyy-mm-dd. I am not allowed to change the format of the date. Is there any way to do this at run time, that is, to query only on the yyyy part of dd/mm/yyyy and fetch the records?
How do I query such a format by year only, and not by month or day, to get the data?
Assuming the "date" is being received as a string, RIGHT(date, 4) will give you just the year.
(I see no need to reformat the string if you only need the data. Otherwise, see STR_TO_DATE().)
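The same idea, taking the last four characters of the string, can be sketched in plain Python for rows that have already been loaded. `rows_for_year` and the `date` key are hypothetical names, and the rows are assumed to be dictionaries holding dd/mm/yyyy strings.

```python
def rows_for_year(rows, year, date_key="date"):
    """Keep rows whose dd/mm/yyyy string ends with the given year."""
    wanted = str(year)
    # The last four characters of dd/mm/yyyy are the year, like RIGHT(date, 4).
    return [row for row in rows if row[date_key].strip()[-4:] == wanted]


records = [
    {"id": 1, "date": "01/12/2016"},
    {"id": 2, "date": "23/05/2017"},
    {"id": 3, "date": "9/01/2016"},
]
print(rows_for_year(records, 2016))
```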

storing date into mongodb using python in ISO format

I am trying to store date into mongodb using python(bottle framework).
I want to store it in the below format:
ISODate("2015-06-08 03:38:28")
Currently I am using the following command:
datetime.strptime(DateField, '%m/%d/%Y %H:%M:%S %p')
it is getting stored like this:
ISODate("2015-06-08T03:38:28Z")
How can I store it without the "T" and "Z" in it?
You are confusing how something is stored vs. how something is displayed.
In MongoDB, dates are stored as 64-bit integers (milliseconds since the Unix epoch). What you are seeing is a representation that makes it easy to tell which date and time that 64-bit number stands for.
ISODate is just a helper that formats the date in the ISO date format.
So when you pass it an ordinary date and time string, it converts it into that format.
The format adds the T (to separate the time part) and the Z (because you have not specified a time zone, it defaults to UTC).
In short - you are not storing it with the T and the Z, that's just how it is displayed back to you.
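The difference between the stored number and the displayed text can be illustrated with the standard library alone. This sketch does not talk to MongoDB. It only mimics the idea, assuming the value is interpreted as UTC.

```python
from datetime import datetime, timezone

# The moment in time from the question, interpreted as UTC.
stored = datetime(2015, 6, 8, 3, 38, 28, tzinfo=timezone.utc)

# What a date boils down to in storage: milliseconds since the Unix epoch.
millis = int(stored.timestamp() * 1000)

# Reading it back and choosing a display format is a separate step.
restored = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
display = restored.strftime("%Y-%m-%d %H:%M:%S")

print(millis)
print(display)  # 2015-06-08 03:38:28
```

The "T" and "Z" never appear in the integer. They only show up, or not, depending on how the value is formatted when it is read back.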
