Convert df column to date - python

I have the following sample data to convert to date.
data = {'Dates':['20030430', '20010131', '20190805', '20191115']}
df = pd.DataFrame(data)
The code I am using is
df['Converted Date'] = pd.to_datetime(df['Dates'], format='%Y%m%d')
it gives the following error.
ValueError: time data '2011-10-13 00:00:00' does not match format '%Y%m%d' (match)
I tried
df['Converted Date'] = pd.to_datetime(df['Dates'], format='%Y%m%d%f')
which results in below error.
ValueError: time data '20190805' does not match format '%Y%m%d%f' (match)
Kindly help to resolve.
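The sample data itself parses with format='%Y%m%d'; the first error message mentions '2011-10-13 00:00:00', which suggests the real column also holds values in another layout. A minimal sketch for locating such rows, assuming a hypothetical mixed Series named mixed (not part of the question's data):

```python
import pandas as pd

# The sample data from the question parses cleanly with %Y%m%d.
data = {'Dates': ['20030430', '20010131', '20190805', '20191115']}
df = pd.DataFrame(data)
df['Converted Date'] = pd.to_datetime(df['Dates'], format='%Y%m%d')

# Hypothetical column that mixes two layouts, as the error message hints.
mixed = pd.Series(['20030430', '2011-10-13 00:00:00'])

# errors='coerce' turns values that do not match the format into NaT,
# which makes the offending rows easy to list.
parsed = pd.to_datetime(mixed, format='%Y%m%d', errors='coerce')
bad = mixed[parsed.isna()]
print(bad)
```

Once the non-matching values are known, they can be cleaned or parsed separately with their own format.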

Related

Converting dates to datetime64 results in day and month places getting swapped

I am pulling a time series from a csv file which has dates in "mm/dd/yyyy" format
df = pd.read_csv('lib_file.csv')
df['Date'] = df['Date'].apply(lambda x:datetime.strptime(x,'%m/%d/%Y').strftime('%d/%m/%Y'))
below is the output
I convert dtypes for ['Date'] from object to datetime64
df['Date'] = pd.to_datetime(df['Date'])
but that changes my dates as well
how do I fix it?
Try this:
df['Date'] = pd.to_datetime(df['Date'], infer_datetime_format=True)
This infers the format from the first non-NaN element, which is parsed correctly in your case, and applies it to the whole column instead of guessing the format row by row.
Just using the code below helped:
df = pd.read_csv('lib_file.csv')
df['Date'] = pd.to_datetime(df['Date'])
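Another way to avoid the day/month swap is to skip the intermediate strftime step and state the source format explicitly. A minimal sketch, assuming hypothetical sample values in the "mm/dd/yyyy" layout described in the question:

```python
import pandas as pd

# Hypothetical sample standing in for the CSV column (mm/dd/yyyy strings).
df = pd.DataFrame({'Date': ['04/03/2020', '04/13/2020']})

# An explicit format leaves nothing to guess: month first, then day.
df['Date'] = pd.to_datetime(df['Date'], format='%m/%d/%Y')

# Reformat to dd/mm/yyyy only when a string is needed for display or export.
as_text = df['Date'].dt.strftime('%d/%m/%Y')
print(df['Date'].dtype)
print(as_text.tolist())
```

Keeping the column as datetime64 and formatting only at output time means the values never pass through an ambiguous string form.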

Panda dataframe conversion of series of 03Mar2020 date format to 2020-03-03

I'm not able to convert input
Dates = {'dates': ['05Sep2009','13Sep2011','21Sep2010']}
to desired output
Dates = {'dates': [2019-09-02,2019-09-13,2019-09-21]}
using Pandas Dataframe.
data = {'dates': ['05Sep2009','13Sep2011','21Sep2010']}
df = pd.DataFrame(data, columns=['dates'])
df['dates'] = pd.to_datetime(df['dates'], format='%Y%m%d')
print (df)
Output:
ValueError: time data '05Sep2009' does not match format '%Y%m%d' (match)
I'm new to this library. Help is appreciated.
Currently the months are abbreviated and are not numeric, so you can't use %m.
To convert abbreviated months and get the expected output use %b, like this:
df['dates'] = pd.to_datetime(df['dates'], format='%d%b%Y')
Update: to convert the DataFrame back to a dictionary you can use the function to_dict(), but first, to get the desired output, you need to convert the column from datetime back to string type. You can achieve it like this:
df['dates'] = df['dates'].astype(str)
df.to_dict('list')
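The steps of this answer can be combined into one runnable sketch. It uses dt.strftime instead of astype(str) to make the output layout explicit; the variable name result is only for illustration. Note that %b matches month abbreviations in the current locale, so this assumes an English locale.

```python
import pandas as pd

data = {'dates': ['05Sep2009', '13Sep2011', '21Sep2010']}
df = pd.DataFrame(data, columns=['dates'])

# %d = day, %b = abbreviated month name, %Y = four-digit year.
df['dates'] = pd.to_datetime(df['dates'], format='%d%b%Y')

# Turn the datetimes back into ISO-style strings, then into a dict of lists.
df['dates'] = df['dates'].dt.strftime('%Y-%m-%d')
result = df.to_dict('list')
print(result)
```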
You must replace %m with %b, because %m expects the month as a zero-padded decimal number, while your dataframe contains abbreviated month names. Try this code:
data = {'dates': ['05Sep2009','13Sep2011','21Sep2010']}
df = pd.DataFrame(data, columns=['dates'])
df['convert of dates'] = pd.to_datetime(df['dates'], format='%d%b%Y')
display(df)
You can also check this link for the other format codes:
https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior

How to convert non-traditional formatted Pandas date object to datetime

I have an interesting scenario where a date object looks like the following:
'6/7/2018 7:59:11 PM'
in the format m/d/yyyy h:mm:ss PM (or AM). Note that the month and hour are not padded with a zero. I have tried the following lines of code using a Pandas date object:
data = pd.read_csv('file.txt', sep="\t", header=None, dtype = 'str')
data.columns = ['A', 'B', 'C', ...]
The data.columns provides a look at the format of the file, all tab-delimited (note that is not an actual line of code, just an arbitrary way to show how the columns were labeled). The time series are in Column A. I attempted the conversion using:
time = pd.to_datetime(pd.Series(data['A']), format = '%-m/%-d/%Y %-H/%M/%S %p')
The return is:
ValueError: '-' is a bad directive in format '%-m/%-d/%Y %-H/%M/%S %p'
Any suggestions on how to go about resolving this issue would be greatly appreciated!
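Your datetime format should be '%m/%d/%Y %I:%M:%S %p'. Use %I (12-hour clock) rather than %H, because %p only affects the parsed hour when the hour is read with %I. The '-' flag (as in %-m) is not supported when parsing; %m, %d and %I already accept values without a leading zero.
Ex:
import pandas as pd
data = pd.DataFrame({"A": ['6/7/2018 7:59:11 PM' ]})
time = pd.to_datetime(data['A'], format = '%m/%d/%Y %I:%M:%S %p')
print( time )
Output:
0 2018-06-07 19:59:11
Name: A, dtype: datetime64[ns]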
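The choice between %H and %I matters for PM times: with %H the AM/PM marker is read but does not change the hour. A small sketch contrasting the two, using the standard library and pandas on the string from the question (the variable names are only for illustration):

```python
from datetime import datetime

import pandas as pd

s = '6/7/2018 7:59:11 PM'

# %H treats "7" as a 24-hour value, so the PM marker has no effect.
with_H = datetime.strptime(s, '%m/%d/%Y %H:%M:%S %p')

# %I treats "7" as a 12-hour value, so PM shifts it to 19.
with_I = datetime.strptime(s, '%m/%d/%Y %I:%M:%S %p')

# The same format string works with pandas.
parsed = pd.to_datetime(pd.Series([s]), format='%m/%d/%Y %I:%M:%S %p')

print(with_H.hour, with_I.hour, parsed.dt.hour.tolist())
```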

Python ValueError: time data '2001-11-03 ' %Y:%m %d %H:%M:%S' when dates in csv file are month/day/year

I'm having an issue where the date format is not matching up. In my .csv file the dates are in %m/%d/%Y format (ex. 11/3/2001), but the error suggests %Y/%m/%d or %Y/%d/%m. I've tried all the possible permutations of year, month and day and I continue to receive the same error: ValueError: time data '2001-11-03 ' %Y:%m %d %H:%M:%S'. Below is my code. Thanks.
df = pd.read_excel('.xlsx', header=None)
df.to_csv('.csv', header=None, index=False)
df = pd.read_csv('.csv', index_col=[5,8,9,12], date_parser=lambda x: datetime.datetime.strptime(x, '%Y/%m/%d %H:%M:%S').strftime('%m/%d/%Y'))
Note: What I'm trying to do is convert an .xlsx file to .csv and then remove the trailing 0:00 from multiple columns within the .csv file. Hope this helps.
Use parse from dateutil.parser to parse the dates. It is easy to use and handles many layouts without an explicit format string.
from dateutil.parser import parse
df = pd.read_csv('filename.csv', date_parser = parse, index_..)
or you can use to_datetime, which is native to Pandas
pd.to_datetime(df['Date Col'])
In order to format the date properly, you should use the following:
date_parser=lambda x: parse(x)
#parse from dateutil.parser
df['Date Col'] = df['Date Col'].dt.strftime('%m/%d/%Y')
df.to_csv('New File.csv')
You can use to_datetime since you are using pandas.
import pandas as pd
df = pd.DataFrame({"a": ["11/3/2001", '2001-11-03']})
df["a"] = pd.to_datetime(df["a"])
print(df["a"])
Output:
0 2001-11-03
1 2001-11-03
Name: a, dtype: datetime64[ns]
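For the stated goal of dropping the trailing 0:00 before writing the CSV, one option is to parse the column and then format it as month/day/year text. A minimal sketch, assuming a hypothetical column named 'Date Col' with hypothetical values in the 'YYYY-MM-DD HH:MM:SS' layout that the round trip through Excel produces:

```python
import pandas as pd

# Hypothetical values standing in for the column read from the file.
df = pd.DataFrame({'Date Col': ['2001-11-03 00:00:00', '2002-01-15 00:00:00']})

# Parse with the layout actually present in the data...
parsed = pd.to_datetime(df['Date Col'], format='%Y-%m-%d %H:%M:%S')

# ...then format as month/day/year text, which drops the time part.
df['Date Col'] = parsed.dt.strftime('%m/%d/%Y')
print(df['Date Col'].tolist())
```

After this step the column holds plain strings, so df.to_csv writes them exactly as shown.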

Why does time data not match format?

I have a dataframe with strings that I am converting to datetimes. They all look like "12/20/17 5:45:30" (month/day/year hour:minute:second). This is my code:
for col in cols:
    df[col] = pd.to_datetime(df[col], format='%m/%d/%Y %H:%M:%S')
But I get the following error:
ValueError: time data '4/19/16 1:05:30' does not match format '%m/%d/%Y %H:%M:%S'
The date shown in the error is the very first date in the dataframe, so it is not working at all. Can someone explain what's wrong with my datetime format? How does that datetime not match the format? By the way, before I was doing this with a file that had no seconds, and my format was %m/%d/%Y %H:%M, which worked fine, but now with seconds it does not.
Your format string does not match because it uses %Y (four-digit year) where the data needs %y (two-digit year), so format='%m/%d/%y %H:%M:%S' works. Alternatively, pandas can often figure the format out for you via the infer_datetime_format parameter of pandas.to_datetime().
Code:
df[col] = pd.to_datetime(df[col], infer_datetime_format=True)
Test Code:
df = pd.DataFrame(["12/20/17 5:45:30", "4/19/16 1:05:30"], columns=['date'])
print(df)
for col in df.columns:
    df[col] = pd.to_datetime(df[col], infer_datetime_format=True)
print(df)
Results:
date
0 12/20/17 5:45:30
1 4/19/16 1:05:30
date
0 2017-12-20 05:45:30
1 2016-04-19 01:05:30
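The explicit-format fix can be sketched on the same test data by using %y for the two-digit year. This may be preferable on recent pandas versions, where, as far as I know, infer_datetime_format is deprecated (since pandas 2.0). The variable name as_text is only for illustration.

```python
import pandas as pd

df = pd.DataFrame(["12/20/17 5:45:30", "4/19/16 1:05:30"], columns=['date'])

# %y matches a two-digit year; %m and %H accept values without a leading zero.
df['date'] = pd.to_datetime(df['date'], format='%m/%d/%y %H:%M:%S')

as_text = df['date'].dt.strftime('%Y-%m-%d %H:%M:%S').tolist()
print(as_text)
```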
