I have a column of dates:
df['Date']
Date
0 2020-25-04
1 2020-26-04
2 2020-27-04
3 2020-12-05
4 2020-06-05
Name: Date, Length: 5, dtype: datetime64[ns]
I want to swap the day and the month in each date, so that I get:
df['Date']
Date
0 2020-04-25
1 2020-04-26
2 2020-04-27
3 2020-05-12
4 2020-05-06
Name: Date, Length: 5, dtype: datetime64[ns]
Any help would be appreciated.
import pandas as pd
import numpy as np
df = pd.DataFrame({'Date':[np.datetime64('2020-04-25') ,np.datetime64('2020-04-26')]})
df['Date'] = df['Date'].apply(lambda x: x.strftime('%Y-%m-%d'))
print(df)
I converted the data to np.datetime64 and applied a lambda function that formats each value with strftime.
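If the goal is really to swap day and month, one option is to tell the parser which field is which. This is only a sketch: it assumes the column holds strings laid out as year-day-month (a true datetime64 value cannot have month 25), and the names wrong and fixed are made up for illustration.

```python
import pandas as pd

# Assumption: the raw values are strings in year-day-month order.
df = pd.DataFrame({'Date': ['2020-25-04', '2020-26-04', '2020-27-04',
                            '2020-12-05', '2020-06-05']})

# Name the layout explicitly so the day and month fields are read correctly.
df['Date'] = pd.to_datetime(df['Date'], format='%Y-%d-%m')
print(df['Date'])

# If the column was already parsed with day and month transposed,
# write it out with the fields swapped and parse it again.
# This only works for rows where the day is 12 or less.
wrong = pd.Series(pd.to_datetime(['2020-12-05', '2020-06-05']))
fixed = pd.to_datetime(wrong.dt.strftime('%Y-%d-%m'), format='%Y-%m-%d')
print(fixed)
```

The first form is the safer one, since it fixes the problem at parse time instead of repairing values afterwards.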
The MonthEnd offset moves each date to the end of its month, as shown below, but I am not clear on how it works. Does it look at the left-hand side of the '+' sign and then calculate the number of days to add? Normally I would expect something like MonthEnd(Date).
import pandas as pd
from pandas.tseries.offsets import MonthEnd
df = pd.DataFrame({'Date': [20010410, 20050805, 20100219, 20160211, 19991208, 20061122]})
df['EndOfMonth'] = pd.to_datetime(df['Date'], format="%Y%m%d") + MonthEnd(1)
print(df['EndOfMonth'])
0 2001-04-30
1 2005-08-31
2 2010-02-28
3 2016-02-29
4 1999-12-31
5 2006-11-30
Name: EndOfMonth, dtype: datetime64[ns]
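As far as I understand it, MonthEnd(1) is an offset object rather than a function of the date, and adding it to a date rolls that date forward to the next month end. A small sketch to poke at the behaviour (the variable names are only for illustration):

```python
import pandas as pd
from pandas.tseries.offsets import MonthEnd

mid_month = pd.Timestamp('2001-04-10')
month_end = pd.Timestamp('2001-04-30')

# A date inside the month rolls forward to the end of that month.
a = mid_month + MonthEnd(1)   # 2001-04-30
# A date already on a month end moves to the NEXT month end.
b = month_end + MonthEnd(1)   # 2001-05-31
# MonthEnd(0) rolls forward but leaves an existing month end alone.
c = month_end + MonthEnd(0)   # 2001-04-30
# n counts month ends, not months or days.
d = mid_month + MonthEnd(2)   # 2001-05-31
# Subtracting rolls back to the previous month end.
e = mid_month - MonthEnd(1)   # 2001-03-31

print(a, b, c, d, e, sep='\n')
```

So if a column may already contain month-end dates that should stay put, MonthEnd(0) is probably the one to use.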
I have a dataset like this
df = pd.DataFrame({'time': ('08.02.2020', '21.02.2020', '2020.05.04')})
df
I do
pd.to_datetime(df['time'])
0 2020-08-02
1 2020-02-21
2 2020-05-04
Name: time, dtype: datetime64[ns]
But the first row must be
0 2020-02-08
If I do
pd.to_datetime(df['time']).dt.strftime('%d-%m-%Y')
0 02-08-2020
1 21-02-2020
2 04-05-2020
Name: time, dtype: object
Again I get 02-08-2020 instead of 08-02-2020.
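One way to handle a column with two layouts is to parse it once per format and combine the results. This is a sketch that assumes there are exactly two layouts, day.month.year and year.month.day; the names day_first, year_first and the parsed column are not from the question.

```python
import pandas as pd

df = pd.DataFrame({'time': ('08.02.2020', '21.02.2020', '2020.05.04')})

# Parse with each candidate format; rows that do not match become NaT.
day_first = pd.to_datetime(df['time'], format='%d.%m.%Y', errors='coerce')
year_first = pd.to_datetime(df['time'], format='%Y.%m.%d', errors='coerce')

# Keep the day-first result and fill the gaps from the year-first pass.
df['parsed'] = day_first.fillna(year_first)
print(df)
```

With explicit formats nothing is guessed, so '08.02.2020' cannot be read as August 2nd. Any row that matches neither format stays NaT and can be inspected separately.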
I have a dataframe with two columns (date and days).
df = pd.DataFrame({'date': ['2020-01-31', '2020-01-21', '2020-01-11'], 'days': [1, 2, 3]})
I want to add a third column (date_2) that holds the date minus the number of days. Therefore, date_2 would be [2020-01-30, 2020-01-19, 2020-01-08].
I know about timedelta(days=i), but I cannot pass it the contents of df['days'] as i in pandas.
Use to_timedelta with unit='d' and subtract:
>>> pd.to_datetime(df['date']) - pd.to_timedelta(df['days'], unit='d')
0 2020-01-30
1 2020-01-19
2 2020-01-08
dtype: datetime64[ns]
Use to_datetime for datetimes and subtract by Series.sub with timedeltas created by to_timedelta:
df['new'] = pd.to_datetime(df['date']).sub(pd.to_timedelta(df['days'], unit='d'))
print (df)
date days new
0 2020-01-31 1 2020-01-30
1 2020-01-21 2 2020-01-19
2 2020-01-11 3 2020-01-08
My dataset has dates in the European format (day first), and I'm struggling to get pd.to_datetime to parse them correctly: for every date where the day is 12 or less, the month and day are switched.
Is there an easy solution to this?
import pandas as pd
import datetime as dt
df = pd.read_csv(loc,dayfirst=True)
df['Date']=pd.to_datetime(df['Date'])
Is there a way to force to_datetime to treat the input as dd/mm/yyyy?
Thanks for the help!
Edit, a sample from my dates:
renewal["Date"].head()
Out[235]:
0 31/03/2018
2 30/04/2018
3 28/02/2018
4 30/04/2018
5 31/03/2018
Name: Earliest renewal date, dtype: object
After running the following:
renewal['Date']=pd.to_datetime(renewal['Date'],dayfirst=True)
I get:
Out[241]:
0 2018-03-31 #Correct
2 2018-04-01 #<-- this number is wrong and should be 01-04 instead
3 2018-02-28 #Correct
Pass an explicit format:
df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%Y')
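A self-contained version of the same idea, as a sketch. The frame is rebuilt from the sample in the question, and the last value ('01/04/2018') is added here only to show an ambiguous date.

```python
import pandas as pd

# Sample values from the question plus one ambiguous date (hypothetical).
renewal = pd.DataFrame(
    {'Date': ['31/03/2018', '30/04/2018', '28/02/2018', '01/04/2018']}
)

# With an explicit format every row is read as day/month/year,
# so nothing depends on whether the day happens to be 12 or less.
renewal['Date'] = pd.to_datetime(renewal['Date'], format='%d/%m/%Y')
print(renewal)
```

As far as I know, dayfirst=True in read_csv only affects columns that are actually parsed as dates (for example via parse_dates), which may be why the original read_csv call had no visible effect.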
You can control the date construction directly if you define separate columns for 'year', 'month' and 'day', like this:
import pandas as pd
df = pd.DataFrame(
{'Date': ['01/03/2018', '06/08/2018', '31/03/2018', '30/04/2018']}
)
# Split 'dd/mm/yyyy' strings into three integer columns in one vectorized step
date_parts = df['Date'].str.split('/', expand=True).astype(int)
date_parts.columns = ['day', 'month', 'year']
df['Date'] = pd.to_datetime(date_parts)
date_parts
# day month year
# 0 1 3 2018
# 1 6 8 2018
# 2 31 3 2018
# 3 30 4 2018
df
# Date
# 0 2018-03-01
# 1 2018-08-06
# 2 2018-03-31
# 3 2018-04-30
I have a pandas dataframe with a date column whose dtype is datetime64[ns]. There are over 1000 observations in the dataframe. I want to transform the following column:
date
2013-05-01
2013-05-01
to
date
05/2013
05/2013
or
date
05-2013
05-2013
EDIT//
This is my sample code as of now:
test = pd.DataFrame({'a':['07/2017','07/2017',pd.NaT]})
a
0 2017-07-13
1 2017-07-13
2 NaT
test['a'].apply(lambda x: x if pd.isnull(x) == True else x.strftime('%Y-%m'))
0 2017-07-01
1 2017-07-01
2 NaT
Name: a, dtype: datetime64[ns]
Why did only the date change and not the format?
You can convert datetime64 into whatever string format you like using the strftime method. In your case you would apply it like this:
df.date = df.date[df.date.notnull()].map(lambda x: x.strftime('%m/%Y'))
df.date
Out[111]:
0 05/2013
1 05/2013
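A vectorized alternative is the .dt.strftime accessor, which skips the lambda and leaves missing values as missing. This is a sketch on a small made-up frame; the month_year column name is just for illustration. A datetime64 column has no display format of its own, so the formatted result is a column of strings, kept separate here so the original datetimes remain available.

```python
import pandas as pd

# Small stand-in for the real data, including one missing value.
df = pd.DataFrame({'date': pd.to_datetime(['2013-05-01', '2013-05-01', None])})

# Format every row at once; NaT rows come back as missing values.
df['month_year'] = df['date'].dt.strftime('%m/%Y')
print(df)
```

Use '%m-%Y' instead to get the 05-2013 form.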