How can I convert the following date-time string into a pandas datetime? - python

How can I convert the following string format of datetime into a datetime object to be used in a pandas DataFrame? I tried many examples, but my format seems to differ from the standard pandas datetime format. I know this may be a duplicate, but the solutions I tried from Stack Exchange don't work!

The code below converts it into the appropriate format:
import pandas as pd
df = pd.DataFrame({'datetime':['2013-11-1_00:00','2013-11-1_00:10','2013-11-1_00:20']})
df['datetime_changed'] = pd.to_datetime(df['datetime'].str.replace('_','T'))
df.head()

You can use pd.to_datetime with the format argument:
df['datetime'] = pd.to_datetime(df['datetime'], format='%Y-%m-%d_%H:%M')
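To make the explicit-format approach concrete, here is a minimal sketch using the three sample strings from the answer above. The `elapsed` and `messy` names and the "not a date" value are illustrative assumptions, not part of the original question.

```python
import pandas as pd

# Sample values copied from the answer above.
df = pd.DataFrame({"datetime": ["2013-11-1_00:00", "2013-11-1_00:10", "2013-11-1_00:20"]})

# The format string mirrors the text exactly, including the "_" separator.
df["datetime"] = pd.to_datetime(df["datetime"], format="%Y-%m-%d_%H:%M")

# Once the column is a real datetime, arithmetic works directly.
elapsed = df["datetime"] - df["datetime"].iloc[0]

# Hypothetical messy input: errors="coerce" turns unparseable values into NaT
# instead of raising.
messy = pd.to_datetime(
    pd.Series(["2013-11-1_00:00", "not a date"]),
    format="%Y-%m-%d_%H:%M",
    errors="coerce",
)
```

Passing the format explicitly avoids any guessing about which field is the day or the month, which is usually what goes wrong with non-standard separators.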

Related

Convert mm-yyyy to datetime datatype in Python

I am trying to convert a datetime datatype of the form 24/12/2021 07:24:00 to mm-yyyy format which is 12-2021 with datetime datatype. I need the mm-yyyy in datetime format in order to sort the column 'Month-Year' in a time series. I have tried
import pandas as pd
from datetime import datetime
df = pd.read_excel('abc.xlsx')
df['Month-Year'] = df['Due Date'].map(lambda x: x.strftime('%m-%y'))
df.set_index(['ID', 'Month-Year'], inplace=True)
df.sort_index(inplace=True)
df
The 'Month-Year' column does not sort chronologically because it is of object datatype. How can I convert the 'Month-Year' column to a datetime datatype?
I have been able to obtain a solution to the problem.
df['month_year'] = pd.to_datetime(df['Due Date']).dt.to_period('M')
I got this from the link below
https://www.interviewqs.com/ddi-code-snippets/extract-month-year-pandas
df['Month-Year'] = pd.to_datetime(df['Month-Year']).dt.normalize()
This converts the Month-Year column to datetime64[ns].
Use it before sorting.
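As a rough illustration of the to_period('M') approach, the sketch below builds a small frame with made-up IDs and due dates (assumptions, since the original Excel file is not shown), sorts by month, and only then renders the mm-yyyy text. Note that the resulting column has a period dtype rather than datetime64[ns], but it still sorts chronologically.

```python
import pandas as pd

# Hypothetical data standing in for the contents of abc.xlsx.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "Due Date": ["24/12/2021 07:24:00", "03/01/2022 09:00:00", "15/11/2021 12:30:00"],
})

# Parse day-first strings with an explicit format, then keep only year and month.
due = pd.to_datetime(df["Due Date"], format="%d/%m/%Y %H:%M:%S")
df["Month-Year"] = due.dt.to_period("M")

# Periods sort chronologically, unlike 'mm-yyyy' strings.
df = df.sort_values("Month-Year")

# Render the mm-yyyy text only for display, after sorting.
df["label"] = df["Month-Year"].dt.strftime("%m-%Y")
```

Keeping the period column for sorting and producing the text label last avoids the string-ordering problem described in the question.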

Pandas: to_datetime() question

I have Date/Time in the following format:
10/01/21 04:49:43.75
MM/DD/YY HH:MM:SS.ms
I am trying to convert this from being an object to a datetime. I tried the following code, but I am getting an error that it does not match the format. Any ideas?
df['Date/Time'] = pd.to_datetime(df['Date/Time'], format = '%m%d%y %H%M%S%f')
You can try letting pandas infer the datetime format with:
pd.to_datetime(df['Date/Time'], infer_datetime_format=True)
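The format attempted in the question leaves out the `/`, `:` and `.` separators that appear in the data, which is the likely cause of the mismatch error. As a hedged sketch, the snippet below checks a corrected format string with the standard library's `datetime.strptime`; the `raw` and `fmt` names are illustrative. Also note that `infer_datetime_format` is deprecated in pandas 2.0 and later, where consistent-format inference is the default.

```python
from datetime import datetime

# Sample value from the question.
raw = "10/01/21 04:49:43.75"

# Each literal separator in the text must appear in the format string.
# %y is a two-digit year; %f accepts 1 to 6 fractional digits.
fmt = "%m/%d/%y %H:%M:%S.%f"

parsed = datetime.strptime(raw, fmt)
print(parsed.isoformat())
```

The same format string should work as the `format` argument of `pd.to_datetime(df['Date/Time'], format=fmt)`, assuming every row follows this layout.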

What is the Vaex function to parse a string to datetime64, equivalent to pandas to_datetime, that allows a custom format?

I have a date as a string (example: 3/24/2020) that I would like to convert to datetime64[ns] format:
df2['date'] = pd.to_datetime(df1["str_date"], format='%m/%d/%Y')
Using pandas to_datetime on a vaex dataframe results in an error:
ValueError: time data 'str_date' does not match format '%m/%d/%Y' (match)
I have seen a possibly duplicate question:
df2['pdate']=df2.date.astype('datetime64[ns]')
However, that answer is a type cast. My case requires parsing the string with a format ('%m/%d/%Y') into datetime64[ns], not just a type cast.
Solution: write a custom function, then use .apply
vaex can use apply for object operations, so you can convert each date string with datetime and np.datetime64, then apply the function to the column.
import numpy as np
from datetime import datetime

def convert_to_datetime(date_string):
    # Parse one string with an explicit format; '%m/%d/%Y' matches the
    # question's example (3/24/2020). Change it to match your data.
    return np.datetime64(datetime.strptime(str(date_string), "%m/%d/%Y"))

df['date'] = df.date.apply(convert_to_datetime)

How to change string to date type using dask dataframes in python?

I am parsing JSON data from a POST request and want to extract a date and convert it to a date type in Python, all within dask. How can I convert it?
Previously I tried this with pandas dataframes:
datetime.strptime(i, '%Y-%m-%d').date().__str__()
If I understand correctly, the list somedates contains date strings in the format "2020-01-23", and you want to convert them to the datetime type.
import datetime as dt

myformat = "%Y-%m-%d"
for i in range(len(somedates)):
    # Replace each string with the parsed datetime object.
    somedates[i] = dt.datetime.strptime(somedates[i], myformat)
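The same conversion can be written as a list comprehension, which leaves the original list untouched. This is only a sketch with made-up sample strings (the contents of `somedates` are not given in the question), and it covers plain Python lists rather than dask dataframes.

```python
from datetime import datetime

# Hypothetical sample values in the "%Y-%m-%d" layout described above.
somedates = ["2020-01-23", "2020-02-01", "2019-12-31"]

# Build a new list of datetime objects instead of mutating in place.
as_datetimes = [datetime.strptime(s, "%Y-%m-%d") for s in somedates]

# Drop the time part when only the calendar date is needed.
as_dates = [d.date() for d in as_datetimes]
```

Once the values are date objects, comparisons such as `min(as_dates)` work chronologically rather than alphabetically.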

Pandas, convert datetime format mm/dd/yyyy to dd/mm/yyyy

The default date format of my CSV is dd/mm/yyyy. When I convert it to datetime with df['Date']=pd.to_datetime(df['Date']), it changes the format to mm/dd/yyyy.
Then I used df['Date'] = pd.to_datetime(df['Date']).dt.strftime('%d/%m/%Y')
to convert to dd/mm/yyyy, but the values are then strings (object dtype). I need them as datetime, and when I call df['Date']=pd.to_datetime(df['Date']) again, they revert to the previous format. How can I fix this?
You can use the parse_dates and dayfirst arguments of pd.read_csv (see the docs for read_csv()):
df = pd.read_csv('myfile.csv', parse_dates=['Date'], dayfirst=True)
This will read the Date column as datetime values, correctly taking the first part of the date input as the day. Note that in general you will want your dates to be stored as datetime objects.
Then, if you need to output the dates as a string you can call dt.strftime():
df['Date'].dt.strftime('%d/%m/%Y')
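Here is a small self-contained sketch of that approach. It reads from an in-memory string instead of 'myfile.csv', and the column values are invented for illustration.

```python
import io
import pandas as pd

# Hypothetical CSV content with day-first dates.
csv_text = "Date,Value\n05/01/2020,10\n13/01/2020,20\n"

# dayfirst=True makes 05/01/2020 parse as 5 January, not 1 May.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], dayfirst=True)

# The column stays datetime; format to text only when displaying or exporting.
shown = df["Date"].dt.strftime("%d/%m/%Y")
```

The `Date` column keeps its datetime dtype throughout, while `shown` holds the dd/mm/yyyy text for output.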
When I use again this: df['Date'] = pd.to_datetime(df['Date']), it gets back to the previous format.
No, you cannot simultaneously have the string format of your choice and keep your series of type datetime. As remarked here:
datetime series are stored internally as integers. Any
human-readable date representation is just that, a representation,
not the underlying integer. To access your custom formatting, you can
use methods available in Pandas. You can even store such a text
representation in a pd.Series variable:
formatted_dates = df['datetime'].dt.strftime('%m/%d/%Y')
The dtype of formatted_dates will be object, which indicates
that the elements of your series point to arbitrary Python types. In
this case, those arbitrary types happen to be all strings.
Lastly, I strongly recommend you do not convert a datetime series
to strings until the very last step in your workflow. This is because
as soon as you do so, you will no longer be able to use efficient,
vectorised operations on such a series.
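A short sketch can show why converting to strings early causes trouble. The two dates are made up for illustration; the point is that the formatted text sorts alphabetically, while the datetime series sorts chronologically.

```python
import pandas as pd

# Hypothetical datetime series.
s = pd.Series(pd.to_datetime(["2021-03-05", "2021-01-20"]))

# Formatting produces text; the result is no longer a datetime series.
formatted = s.dt.strftime("%d/%m/%Y")

# Sorting datetimes is chronological: January comes before March.
sorted_dates = s.sort_values()

# Sorting the text compares characters, so "05/03/2021" sorts before "20/01/2021".
sorted_text = formatted.sort_values()
```

This is the practical reason to keep the datetime column as the working data and apply `strftime` only at the output step.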
This solution will work for all cases where a column has mixed date formats. Add more conditions to the function if needed. Pandas to_datetime() function was not working for me, but this seems to work well.
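import datetime
import pandas as pd

def fix_date(val):
    # Let pandas parse the value, then render it as month/day/year text.
    a = pd.to_datetime(val, errors='coerce', cache=False).strftime('%m/%d/%Y')
    try:
        # Re-read the text day-first; this only succeeds when the second
        # field is 12 or less, i.e. when the value was ambiguous.
        date_time_obj = datetime.datetime.strptime(a, '%d/%m/%Y')
    except ValueError:
        # Otherwise keep the month-first reading.
        date_time_obj = datetime.datetime.strptime(a, '%m/%d/%Y')
    return date_time_obj.date()
Saving the changes to the same column.
df['Date'] = df['Date'].apply(fix_date)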
Saving as CSV.
df.to_csv(f'{file_name}.csv', index=False, date_format='%s')
