I have a CSV file with a column 'date' that contains dates in many different formats, such as ddmmyy, mmddyy, and yymmdd. I want to convert all of the dates to y-m-d format.
df = pd.read_csv(file)
df = df['date'].dt.strftime('%y-%m-%d')
This code gives error: "Can only use .dt accessor with datetimelike values"
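You can use pd.to_datetime to parse the strings first and then format them (note that in pandas 2.0 and later, a column that mixes layouts like the one below needs format='mixed') -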
>>> import pandas as pd
>>>
>>> df = pd.DataFrame(['1/2/2020','12/31/2020','20-Jun-20'],columns=['Date'])
>>> df
Date
0 1/2/2020
1 12/31/2020
2 20-Jun-20
>>>
>>> df['Date'] = pd.to_datetime(df['Date'])
>>> df
Date
0 2020-01-02
1 2020-12-31
2 2020-06-20
>>>
>>> df['Date'] = pd.to_datetime(df['Date']).dt.strftime('%y-%m-%d')
>>>
>>> df
Date
0 20-01-02
1 20-12-31
2 20-06-20
>>>
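If the layouts in the column are known in advance, another option is to try each candidate format in turn. The following is only a minimal sketch using the standard library: the helper name parse_any and the list of formats are hypothetical, chosen to match the sample data above. The order of the list decides how an ambiguous value such as 1/2/2020 is read (January 2 here), so it has to come from knowledge of the data.

```python
from datetime import datetime

# Hypothetical candidate layouts, tried in order.
# "%b" (abbreviated month name) depends on the locale; English names are assumed.
CANDIDATE_FORMATS = ["%m/%d/%Y", "%d-%b-%y"]


def parse_any(text, formats=CANDIDATE_FORMATS):
    """Return the first successful parse of text, or None if no format matches."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None


raw = ["1/2/2020", "12/31/2020", "20-Jun-20", "not a date"]
parsed = [parse_any(value) for value in raw]

# Format every successfully parsed value as year-month-day text.
as_text = [d.strftime("%Y-%m-%d") if d else None for d in parsed]
print(as_text)
```

Values that match none of the formats come back as None, so they can be inspected separately instead of silently turning into a wrong date.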
Step 0:-
Load your dataframe:-
df=pd.read_csv('your file name.csv')
Step 1:-
First convert your 'date' column to datetime with the to_datetime() method:-
df['date']=pd.to_datetime(df['date'])
Step 2:-
If you then want the dates as strings, use:-
df['date']=df['date'].astype(str)
Now print df (or just evaluate df if you are using a Jupyter notebook).
Output:-
0 2020-01-01
1 2020-12-31
2 2020-06-20
I have data in the form yyyymm (e.g. 202201) in a CSV file. I want to import it into pandas and find the range of the time period. I want to apply datetime functions to it but am unable to convert it into an appropriate format.
test['YEAR_MONTH'] = pd.to_datetime(
test['YEARMONTH'], format='%Y%m', errors='coerce').dropna()
I tried using this, but to no avail.
>>> import pandas as pd
>>> import datetime
# Sample data as per OP's format
>>> test = pd.DataFrame({'YEARMONTH':['202201','202202','202203']})
>>> test
YEARMONTH
0 202201
1 202202
2 202203
# Using strptime to convert to datetime object
>>> test_mod = test['YEARMONTH'].apply(lambda x: datetime.datetime.strptime(x,'%Y%m'))
>>> test_mod
0 2022-01-01
1 2022-02-01
2 2022-03-01
Name: YEARMONTH, dtype: datetime64[ns]
# Note - by default the day is set to the first day of each month
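The same conversion can also be done column-wise with pd.to_datetime, after which the range of the period is just the minimum and maximum. This is a sketch under an assumption the question does not confirm: that the CSV column was read as integers, which is why it is cast to str first.

```python
import pandas as pd

# Sample data; integers are an assumption that mimics a numeric CSV column.
test = pd.DataFrame({"YEARMONTH": [202201, 202202, 202203]})

# Cast to str so the '%Y%m' format is applied to text such as '202201'.
test["YEAR_MONTH"] = pd.to_datetime(
    test["YEARMONTH"].astype(str), format="%Y%m", errors="coerce"
)

# Range of the time period covered by the column.
start = test["YEAR_MONTH"].min()
end = test["YEAR_MONTH"].max()
print(start.strftime("%Y-%m"), "to", end.strftime("%Y-%m"))
```

With errors='coerce', values that do not match the format become NaT, and min() and max() skip them.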
I have a month column with values formatted as 2019M01.
To find the seasonality I need this converted to pandas datetime format.
How do I convert 2019M01 into datetime so that I can use it for my seasonality plotting?
Thanks.
Use to_datetime with format parameter:
print (df)
date
0 2019M01
1 2019M03
2 2019M04
df['date'] = pd.to_datetime(df['date'], format='%YM%m')
print (df)
date
0 2019-01-01
1 2019-03-01
2 2019-04-01
My data contains dates that look like "2014-12-19T05:00:00". I want to convert them to a Date or String object formatted as "dd-MM-YYYY", for example "01-04-2018", in a dataframe. How can I do it?
The result will be used for a time series. So far, my time series plot does not come out right, perhaps because the date format is not detected (the x-axis is not in datetime).
For a pandas dataframe column/series:
Convert a string column (dtype of object) to a datetime column (dtype of datetime64[ns]) using to_datetime. Then if you want another column with your datetimes back in a string format of your choosing, use dt.strftime.
An example:
df = pd.DataFrame({
"Date": ["2014-12-19T05:00:00", "2014-12-20T05:00:00", "2014-12-21T05:00:00"],
"Value": [0, 2, 4]})
df['DateTime'] = pd.to_datetime(df['Date'])
df['MyDateTimeString'] = df['DateTime'].dt.strftime('%Y-%m-%d')
print(df)
# Date Value DateTime MyDateTimeString
# 0 2014-12-19T05:00:00 0 2014-12-19 05:00:00 2014-12-19
# 1 2014-12-20T05:00:00 2 2014-12-20 05:00:00 2014-12-20
# 2 2014-12-21T05:00:00 4 2014-12-21 05:00:00 2014-12-21
In general:
To read your strings into datetime objects, use strptime:
import datetime
d = datetime.datetime.strptime("2014-12-19T05:00:00", "%Y-%m-%dT%H:%M:%S")
Then to get a string representation of those datetime objects, use strftime:
d.strftime("%d-%m-%Y")
For more general string-to-datetime parsing, the dateparser library is handy:
import dateparser
dateparser.parse("2014-12-19T05:00:00").strftime("%d-%m-%Y")
# '19-12-2014'
dateparser.parse("December 19, 2014 at 5am").strftime("%d-%m-%Y")
# '19-12-2014'
I recommend using https://pypi.org/project/python-dateutil/
(Install with pip install python-dateutil.)
>>> import dateutil.parser
>>> d = dateutil.parser.isoparse('2014-12-19T05:00:00')
>>> print(d.strftime('%m-%d-%Y'))
12-19-2014
I have a column I_DATE of type string (object) in a dataframe called train, as shown below.
I_DATE
28-03-2012 2:15:00 PM
28-03-2012 2:17:28 PM
28-03-2012 2:50:50 PM
How do I convert I_DATE from string to datetime format and specify the format of the input string?
Also, how do I filter rows based on a range of dates in pandas?
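Use to_datetime. For this sample there is no need for a format string since the parser is able to handle it; for ambiguous day-first values such as 01-03-2012, pass dayfirst=True or an explicit format instead: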
In [51]:
pd.to_datetime(df['I_DATE'])
Out[51]:
0 2012-03-28 14:15:00
1 2012-03-28 14:17:28
2 2012-03-28 14:50:50
Name: I_DATE, dtype: datetime64[ns]
To access the date/day/time component, assign the converted column back (df['I_DATE'] = pd.to_datetime(df['I_DATE'])) and use the dt accessor:
In [54]:
df['I_DATE'].dt.date
Out[54]:
0 2012-03-28
1 2012-03-28
2 2012-03-28
dtype: object
In [56]:
df['I_DATE'].dt.time
Out[56]:
0 14:15:00
1 14:17:28
2 14:50:50
dtype: object
You can filter using date strings, for example:
In [59]:
import datetime as dt
df = pd.DataFrame({'date':pd.date_range(start = dt.datetime(2015,1,1), end = dt.datetime.now())})
df[(df['date'] > '2015-02-04') & (df['date'] < '2015-02-10')]
Out[59]:
date
35 2015-02-05
36 2015-02-06
37 2015-02-07
38 2015-02-08
39 2015-02-09
Approach: 1
Given original string format: 2019/03/04 00:08:48
you can use
updated_df = df['timestamp'].astype('datetime64[ns]')
The result will be in this datetime format: 2019-03-04 00:08:48
Approach: 2
updated_df = df.astype({'timestamp':'datetime64[ns]'})
For a datetime in AM/PM format, the time format is '%I:%M:%S %p'. See all possible format combinations at https://strftime.org/. N.B. If the values have a time component, as in the OP, the conversion is much faster if you pass format=.
df['I_DATE'] = pd.to_datetime(df['I_DATE'], format='%d-%m-%Y %I:%M:%S %p')
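A quick way to check a format string before handing it to pandas is to try it on a few sample values with the standard library, which uses the same directives. This is only a sketch; the sample strings are copied from the question, and the names FORMAT, samples and parsed are illustrative.

```python
from datetime import datetime

# Day-month-year, 12-hour clock, AM/PM marker (the marker is locale-dependent;
# an English locale is assumed).
FORMAT = "%d-%m-%Y %I:%M:%S %p"

samples = [
    "28-03-2012 2:15:00 PM",
    "28-03-2012 2:17:28 PM",
    "28-03-2012 2:50:50 PM",
]

# strptime raises ValueError if a value does not match the format.
parsed = [datetime.strptime(s, FORMAT) for s in samples]
for value in parsed:
    print(value.isoformat(sep=" "))
```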
To filter a datetime using a range, you can use query:
df = pd.DataFrame({'date': pd.date_range('2015-01-01', '2015-04-01')})
df.query("'2015-02-04' < date < '2015-02-10'")
or use between to create a mask and filter.
df[df['date'].between('2015-02-04', '2015-02-10')]
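One difference between these filters is easy to miss: the chained comparison excludes both end dates, while between includes them by default. A small sketch on the same date range (the variable names strict and inclusive are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"date": pd.date_range("2015-01-01", "2015-04-01")})

# Strict comparisons: both bounds are excluded.
strict = df[(df["date"] > "2015-02-04") & (df["date"] < "2015-02-10")]

# between: both bounds are included by default.
inclusive = df[df["date"].between("2015-02-04", "2015-02-10")]

print(len(strict), len(inclusive))
```

Which behaviour is wanted depends on whether the end dates themselves belong in the result.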
I have a Pandas Dataframe df:
a date
1 2014-06-29 00:00:00
df.dtypes returns:
a object
date object
I want to convert the date column to dates without the time, but:
df['date']=df['date'].astype('datetime64[s]')
returns:
a date
1 2014-06-28 22:00:00
df.dtypes returns:
a object
date datetime64[ns]
But the value is wrong.
I would like to have:
a date
1 2014-06-29
or:
a date
1 2014-06-29 00:00:00
I would start by converting your dates with pd.to_datetime:
df['date'] = pd.to_datetime(df.date)
Now, you can see that the time component is still there:
df.date.values
array(['2014-06-28T19:00:00.000000000-0500'], dtype='datetime64[ns]')
If you are OK with having strings again, you want:
df['date'] = [x.strftime("%Y-%m-%d") for x in df.date]
To end up with datetime.date objects instead:
df['date'] = [x.date() for x in df.date]
df.date
datetime.date(2014, 6, 29)
Here you go. Just use this pattern:
pd.to_datetime(df['date']).dt.date
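To put the options from this thread side by side, here is a minimal sketch on the value from the question. The new column names (midnight, only_date, as_text) are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [1], "date": ["2014-06-29 00:00:00"]})

# Parse the strings; the column becomes a datetime column.
df["date"] = pd.to_datetime(df["date"])

# Option 1: keep a datetime column, with the time part reset to midnight.
df["midnight"] = df["date"].dt.normalize()

# Option 2: plain datetime.date objects (the column dtype becomes object).
df["only_date"] = df["date"].dt.date

# Option 3: formatted strings.
df["as_text"] = df["date"].dt.strftime("%Y-%m-%d")

print(df)
```

Option 1 keeps the column usable with the dt accessor and date arithmetic; options 2 and 3 are mainly useful for display or export.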