Pyspark - Convert to Timestamp - python

Spark version : 2.1
I'm trying to convert a string datetime column to a UTC timestamp in the format yyyy-MM-dd'T'HH:mm:ss.
I start by changing the format of the string column to yyyy-MM-dd'T'HH:mm:ss and then convert it to timestamp type. Later I would convert the timestamp to UTC using the to_utc_timestamp function.
from pyspark.sql import functions as f

df.select(
    f.to_timestamp(
        f.date_format(f.col("time"), "yyyy-MM-dd'T'HH:mm:ss"), "yyyy-MM-dd'T'HH:mm:ss"
    )
).show(5, False)
The date_format works fine by giving me the correct format. But, when I do to_timestamp on top of that result, the format changes to yyyy-MM-dd HH:mm:ss, when it should instead be yyyy-MM-dd'T'HH:mm:ss. Why does this happen?
Could someone tell me how I could retain the format given by date_format? What should I do?

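The function to_timestamp converts a string to a timestamp. A timestamp column has no format of its own; show() simply renders it as yyyy-MM-dd HH:mm:ss. To get a string that contains the T, apply date_format as the last step, after to_timestamp and to_utc_timestamp, because date_format is the function that returns a string.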
The second argument is used to define the format of the DateTime in the string you are trying to parse.
You can see a couple of examples in the official documentation.
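The parse-versus-display distinction is not specific to Spark. As a rough illustration in plain Python (an analogy using the standard datetime module, not Spark code; the variable names are made up for this sketch), a parsed value carries no format, and the T only reappears when the value is formatted back into a string:

```python
from datetime import datetime

# Parse: the format string describes how the INPUT text is laid out.
raw = "2022-08-30T11:53:52"
parsed = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S")

# The parsed value is a datetime object, not text. Its default text
# rendering puts a space between the date and the time.
default_text = str(parsed)

# Format: turn the value back into text with whatever layout is needed.
iso_text = parsed.strftime("%Y-%m-%dT%H:%M:%S")

print(default_text)  # 2022-08-30 11:53:52
print(iso_text)      # 2022-08-30T11:53:52
```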

The code should look like this. Pay attention to the single 'd' in the pattern; this detail is easy to get wrong.
from pyspark.sql.functions import col, to_timestamp
data = data.withColumn('date', to_timestamp(col('date'), 'yyyy/MM/d'))

Related

What's the correct datetime format for this string date generated by python?

I have this example date, '2022-08-30T11:53:52.204219', stored in a database. When I read it back its type is string, so I wanted to convert it to a date type using this Python code:
datetime.strptime('2022-08-30T11:53:52.204219', "%Y-%m-%d'T'%H:%M:%S.%f")
I also tried this one
datetime.strptime('2022-08-30T11:53:52.204219', "yyyy-MM-dd'T'HH:mm:ssZ")
But I always get this error: time data '2022-08-30T11:53:52.204219' does not match format "%Y-%m-%d'T'%H:%M:%S.%f"
I need help converting this string to an actual date.
As per the comment: strptime has no quoting syntax, so the T goes into the format string without quotes.
from datetime import datetime
print(datetime.strptime('2022-08-30T11:53:52.204219', "%Y-%m-%dT%H:%M:%S.%f"))
Result:
2022-08-30 11:53:52.204219
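To see why the quoted version fails, here is a small sketch (the helper name try_parse is made up for illustration). The apostrophes around T are matched as literal characters, and the input contains none. On Python 3.7 or later, datetime.fromisoformat should also accept this particular string without any format at all.

```python
from datetime import datetime

def try_parse(text, fmt):
    """Return the parsed datetime, or None if text does not match fmt."""
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        return None

stamp = "2022-08-30T11:53:52.204219"

# The quotes are taken literally, so this format does not match.
quoted = try_parse(stamp, "%Y-%m-%d'T'%H:%M:%S.%f")

# A bare T in the format matches the T in the input.
plain = try_parse(stamp, "%Y-%m-%dT%H:%M:%S.%f")

# ISO 8601 text in this layout can also be parsed without a format string.
iso = datetime.fromisoformat(stamp)

print(quoted, plain, iso == plain)
```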

How to fix weird date values to datetime type in pandas

I'm new to Python and dataframes.
I have a date value that is not formatted as date. Since this value has a 'weird' format, pandas' function to_datetime() doesn't work properly. The values are formatted like:
['20190630', '20190103']
This is the 'yyyymmdd' format.
I have tried to slice the values into separate year, month and day columns, but the slicing did not work. This is the code I have now, but it isn't doing anything.
df.Date = pd.to_datetime(df.date)
I would like to have the dd-mm-yyyy format and datetime type. What can I do?
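One possible approach, sketched under the assumption that every value is an 8-digit yyyymmdd string (the column name date follows the question; the sample frame and the date_text column are made up): pass an explicit format to to_datetime, then use dt.strftime only where a dd-mm-yyyy string is needed for display.

```python
import pandas as pd

df = pd.DataFrame({"date": ["20190630", "20190103"]})

# An explicit format tells pandas how to read the compact yyyymmdd strings.
df["date"] = pd.to_datetime(df["date"], format="%Y%m%d")

# The column is now a datetime type; dd-mm-yyyy is a text rendering of it,
# kept in a separate column so the datetime values stay available.
df["date_text"] = df["date"].dt.strftime("%d-%m-%Y")

print(df)
```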

Formatting datetime pandas

I have some rows in my dataset with the following release date format:
1995-10-30
It is an object/string. However, I want to convert it to datetime, so I wrote the following to achieve that:
movies_df["release_date"] = pd.to_datetime(movies_df.release_date)
It gets converted to datetime as it should, but I would like to have the following format
mm-dd-year
I have tried yearfirst=False and dayfirst=False, but nothing seems to happen and I can't figure out why it isn't working.
I have also tried to specify the format in the to_datetime method as following:
movies_df["release_date"] = pd.to_datetime(movies_df.release_date, format="%Y/%m/%d", dayfirst=False, yearfirst=False)
Any help is appreciated.
You can convert datetimes to strings in mm-dd-YYYY format:
movies_df["release_date"] = pd.to_datetime(movies_df.release_date).dt.strftime('%m-%d-%Y')
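But if you want the column to stay a datetime type and also display as mm-dd-YYYY, that is not possible: a datetime value has no display format of its own, only its string representation does.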

storing date into mongodb using python in ISO format

I am trying to store a date in MongoDB using Python (Bottle framework).
I want to store it in the below format:
ISODate("2015-06-08 03:38:28")
Currently I am using the following command:
datetime.strptime(DateField, '%m/%d/%Y %H:%M:%S %p')
it is getting stored like this:
ISODate("2015-06-08T03:38:28Z")
How can I store it without the "T" and "Z" in it?
You are confusing how something is stored vs. how something is displayed.
In MongoDB, dates are stored as 64-bit integers (milliseconds since the Unix epoch). What you are seeing is only a representation, chosen so that we can easily tell what date and time the 64-bit number stands for.
The ISODate is just a helper method, which formats the date in the ISO date format.
So when you pass it in a normal date and time string, it will convert it into the correct format.
The format adds the T (to separate the time part) and the Z (as you have not identified a time zone, it is defaulted to UTC).
In short - you are not storing it with the T and the Z, that's just how it is displayed back to you.
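A small standard-library sketch of the same idea, assuming the value is in UTC (the helper names are made up, and no MongoDB driver is involved): the stored quantity is a single integer, and the T and Z only appear when that integer is rendered as text.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_epoch_millis(dt):
    """Milliseconds since the Unix epoch, as an integer."""
    return (dt - EPOCH) // timedelta(milliseconds=1)

def from_epoch_millis(millis):
    """Rebuild an aware UTC datetime from integer milliseconds."""
    return EPOCH + timedelta(milliseconds=millis)

moment = datetime(2015, 6, 8, 3, 38, 28, tzinfo=timezone.utc)
stored = to_epoch_millis(moment)      # one integer, no T or Z anywhere
restored = from_epoch_millis(stored)

# The same stored value can be displayed in either layout.
print(stored)
print(restored.strftime("%Y-%m-%d %H:%M:%S"))
print(restored.strftime("%Y-%m-%dT%H:%M:%SZ"))
```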

String to DATETIME in MySQL

I would like to convert strings to datetime objects to be used in the insert statement for MySQL. The strings are received in the following format :
2010-12-21T22:57:04.000Z
The data type of the MySQL column is DATETIME.
You can use the strptime function.
For instance, after cutting off the fractional seconds and the trailing Z:
from datetime import datetime
myDatetime = datetime.strptime(myString.split(".")[0], "%Y-%m-%dT%H:%M:%S")
[EDIT] Well, I've seen this has been treated in another thread with a better answer than mine: How to parse an ISO 8601-formatted date?
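If the milliseconds should be parsed rather than cut off, one hedged alternative is to include them in the format. This sketch assumes the input always has a fractional part and ends with a literal Z; the variable names are made up.

```python
from datetime import datetime

my_string = "2010-12-21T22:57:04.000Z"

# %f consumes the fractional seconds; the trailing Z is matched literally.
my_datetime = datetime.strptime(my_string, "%Y-%m-%dT%H:%M:%S.%fZ")

# Text form in the YYYY-MM-DD HH:MM:SS layout used for MySQL DATETIME values.
mysql_text = my_datetime.strftime("%Y-%m-%d %H:%M:%S")

print(mysql_text)  # 2010-12-21 22:57:04
```

The parsed value is a naive datetime, so the UTC meaning of the Z is not carried along. Python MySQL drivers generally accept a datetime object as a query parameter, in which case the text form is likely only needed when building the SQL string by hand.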
