String to date in pandas - python

I have a dataset with dates encoded as strings formatted as %B %d, %Y, e.g. September 10, 2021.
Using:
df['sale_date'] = pd.to_datetime(df.sale_date, format='%B %d, %Y')
produces this error:
ValueError: time data 'September 10, 2021' does not match format '%B %d, %Y' (match)
Manually checking with strptime:
datetime.strptime('September 10, 2021', '%B %d, %Y')
produces the correct datetime object.
Is there something I missed in pd.to_datetime?
Thanks.

Upon further investigation, I found that the error only happens on the first element of the series. It seems that the string has '\ufeff' prepended to it. So I just did a series.str.replace() and now it is working. Sorry for the bother. The question is: how did that BOM end up there?
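The question doesn't say how the data was loaded, so this is only a guess at the cause: a file saved as "UTF-8 with BOM" starts with a byte-order mark, and decoding it as plain UTF-8 leaves that mark as '\ufeff' in front of the first value. A minimal sketch of the effect and of two ways around it (the raw bytes below are made-up sample data, not the asker's file):

```python
import pandas as pd

# Hypothetical raw bytes: UTF-8 text preceded by a byte-order mark (BOM).
raw = b"\xef\xbb\xbfSeptember 10, 2021\nSeptember 11, 2021"

# Decoding as plain UTF-8 keeps the BOM as the character '\ufeff'.
with_bom = raw.decode("utf-8").split("\n")

# Decoding as 'utf-8-sig' drops a leading BOM if one is present.
without_bom = raw.decode("utf-8-sig").split("\n")

# If the Series already contains the stray character, remove it before parsing.
sale_date = pd.Series(with_bom).str.replace("\ufeff", "", regex=False)
parsed = pd.to_datetime(sale_date, format="%B %d, %Y")
```

If the data comes from a file you open yourself, passing encoding='utf-8-sig' when reading it is usually the cleaner fix, since the BOM never reaches the column in the first place.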

Very likely you have to strip some whitespace first!
If I add whitespace at the beginning, the end, or both:
datestring = ' September 10, 2021 '
datetime.datetime.strptime(datestring, '%B %d, %Y')
it results in the same error message as yours:
ValueError: time data ' September 10, 2021 ' does not match format '%B %d, %Y'
As a solution for a single value, use:
datestring = ' September 10, 2021 '
datestring = datestring.strip()  # strip() returns a new string, so reassign it
For a column in a dataframe, use:
dummy = pd.DataFrame(columns=['Date'], data=[' September 10, 2021 ', ' September 11, 2021 ', ' September 12, 2021 '])
dummy.Date = dummy.Date.str.strip()

Related

time data 'June 13, 1980 (United States)' does not match format '%m/%d/%Y' (match)

How can I pass a datetime format to a column of strings such as June 13, 1980 (United States)?
I tried:
df['format_released'] = pd.to_datetime(df['released'], format='%m/%d/%Y')
and got this error:
time data 'June 13, 1980 (United States)' does not match format '%m/%d/%Y' (match)
The correct format is: pd.to_datetime(df['released'], format='%B %d, %Y')
For the full month name, you need to specify %B in the format.
You don't need the value "(United States)" in the string.
You need to preprocess the column to discard the irrelevant data.
Using str.replace:
df['format_released'] = pd.to_datetime(df['released'].str.replace(r'\s*\(.*$', '', regex=True), format='%B %d, %Y')
Or using str.extract:
df['format_released'] = pd.to_datetime(df['released'].str.extract(r'(\w+ \d+, \d+)', expand=False), format='%B %d, %Y')
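To see the str.extract approach end to end, here is a small sketch on made-up rows (the sample values and the extra "unknown" row are assumptions, not data from the question). Passing errors='coerce' is optional; it turns anything that still fails to parse into NaT instead of raising.

```python
import pandas as pd

# Hypothetical sample data in the shape described in the question.
df = pd.DataFrame({
    "released": [
        "June 13, 1980 (United States)",
        "July 2, 1980 (United States)",
        "unknown",
    ]
})

# Keep only the "Month day, year" part; rows without it become NaN.
date_part = df["released"].str.extract(r"(\w+ \d+, \d+)", expand=False)

# Parse the cleaned strings; unparseable or missing values become NaT.
df["format_released"] = pd.to_datetime(date_part, format="%B %d, %Y", errors="coerce")
```

Checking df['format_released'].isna() afterwards is a quick way to find the rows that did not contain a usable date.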

convert 'July 31, 2021' to YYYY-MM-DD format caused ValueError: time data 'July 31, 2021' does not match format '%m %d, %Y'

I have a web app using Django as the backend.
I used the datetime.strptime function in Python 3 to convert the date to the format needed for a MySQL database.
But I got the error: ValueError: time data 'July 31, 2021' does not match format '%m %d, %Y'
end_date = request.GET.getlist('end_date')[0] # end_date = 'July 31, 2021' in the test case
end_date_converted = datetime.strptime(end_date, "%m %d, %Y").strftime("%Y-%m-%d")
How can I convert 'July 31, 2021' to YYYY-MM-DD format so I can save it to a MySQL date column?
According to the docs, %m is "Month as a zero-padded decimal number", not the month name. You should be using
%B %d, %Y
as the format specifier. For example:
>>> datetime.strptime('July 31, 2021', '%B %d, %Y').strftime('%Y-%m-%d')
'2021-07-31'
Replace %m with %B, which parses the full month name.
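Since the value comes from a request parameter, it may be worth guarding against input that doesn't parse. A minimal sketch, assuming the input always uses full English month names; to_iso_date is a hypothetical helper name, not part of Django or the standard library:

```python
from datetime import datetime


def to_iso_date(text):
    """Return 'YYYY-MM-DD' for input like 'July 31, 2021', or None if it does not parse."""
    try:
        # %B matches the full month name; strip() tolerates stray surrounding whitespace.
        return datetime.strptime(text.strip(), "%B %d, %Y").date().isoformat()
    except ValueError:
        return None


end_date_converted = to_iso_date("July 31, 2021")
```

Returning None here is just one choice; in a view you might prefer to respond with a validation error instead.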

How to convert string date (Nov 13, 2020) to datetime in pandas?

I have a dataset with abbreviated month names, and I have tried to follow some other solutions posted here, such as :
r.Date = pd.to_datetime(r.Date, format='%MMM %d, %Y')
but unfortunately it is giving me a ValueError: time data 'Nov 13, 2020' does not match format '%d %B, %Y' (match). The month names are all abbreviated.
Use %b, the directive for abbreviated month names, and put the month before the day:
pd.to_datetime('Nov 13, 2020',format='%b %d, %Y')
Out[23]: Timestamp('2020-11-13 00:00:00')

Date does not match parsed format in python

The following snippet of code throws an error stating "ValueError: time data 'Dec 25 2017' does not match format '%b /%d /%y'"
import datetime,time
from Hall import Hall
Fd=input("Enter Start time\n")
d1 = datetime.datetime.strptime(Fd, '%b /%d /%y')
Sd=input("Enter the End time\n")
d2 = datetime.datetime.strptime(Sd, '%b /%d /%y')
cost=int(input("Enter the cost per day\n"))
x = Hall(d1,d2,cost)
The format I want to use is Dec 25 2017. I would appreciate any help.
The date that you input, namely Dec 25 2017, needs to match the format you specify in strptime.
Try the following, and enter the same input Dec 25 2017:
Fd=input("Enter Start time\n")
d1 = datetime.datetime.strptime(Fd, '%b %d %Y')
Sd=input("Enter the End time\n")
d2 = datetime.datetime.strptime(Sd, '%b %d %Y')
There are two issues with your date format:
You have extra / characters in the format that the input does not contain.
And to parse the complete four-digit year you need to use %Y (capital Y).
Try this:
datetime.datetime.strptime(Sd, '%b %d %Y')
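The difference between %y and %Y is easy to check in isolation. A short sketch (the variable names are just for illustration):

```python
from datetime import datetime

# %Y expects a four-digit year, %y a two-digit year.
full_year = datetime.strptime("Dec 25 2017", "%b %d %Y")
short_year = datetime.strptime("Dec 25 17", "%b %d %y")

# Mixing them up fails: %y consumes "20" and leaves "17" unparsed.
try:
    datetime.strptime("Dec 25 2017", "%b %d %y")
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)
```

Both successful calls produce the same date, December 25, 2017, while the mismatched call raises ValueError.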

Format datestamp in python

Using import datetime in python, is it possible to take a formatted time/date string such as:
2012-06-21 20:36:11
And convert it into an object that I can then use to produce a newly formatted string such as:
21st June 2012 20:36
import time
s = '2012-06-21 20:36:11'
t = time.strptime(s, '%Y-%m-%d %H:%M:%S')
print(time.strftime('%d %B %Y %H:%M', t))
returns
21 June 2012 20:36
If you really want the 'st',
def nth(n):
    # 11, 12 and 13 take 'th' even though they end in 1, 2 and 3
    if 11 <= int(n) % 100 <= 13:
        return str(n) + 'th'
    return str(n) + nth.ext[int(n) % 10]
nth.ext = ['th', 'st', 'nd', 'rd'] + ['th'] * 6
print(nth(t.tm_mday) + time.strftime(' %B %Y %H:%M', t))
gets you
21st June 2012 20:36
You want datetime.strptime, it parses text into datetimes:
>>> d = "2012-06-21 20:36:11"
>>> datetime.datetime.strptime(d, "%Y-%m-%d %H:%M:%S")
datetime.datetime(2012, 6, 21, 20, 36, 11)
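Formatting the date in the way you want is almost doable (only the ordinal suffix is missing). Note that minutes are %M; lowercase %m is the month:
>>> t = datetime.datetime.strptime(d, "%Y-%m-%d %H:%M:%S")
>>> t.strftime("%d %B %Y %H:%M")
'21 June 2012 20:36'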
