How to convert month number to datetime in pandas - python

I have followed the instructions from this thread, but have run into issues.
Converting month number to datetime in pandas
I think it may have to do with having an additional variable in my dataframe but I am not sure. Here is my dataframe:
0 Month Temp
1 0 2
2 1 4
3 2 3
What I want is:
0 Month Temp
1 1990-01 2
2 1990-02 4
3 1990-03 3
Here is what I have tried:
df= pd.to_datetime('1990-' + df.Month.astype(int).astype(str) + '-1', format = '%Y-%m')
And I get this error:
ValueError: time data 1990-0-1 doesn't match format specified

IIUC, we can manually create your datetime object then format it as your expected output:
m = np.where(df['Month'].eq(0),
df['Month'].add(1), df['Month']
).astype(int).astype(str)
df['date'] = pd.to_datetime(
"1900" + "-" + pd.Series(m), format="%Y-%m"
).dt.strftime("%Y-%m")
print(df)
Month Temp date
0 0 2 1900-01
1 1 4 1900-02
2 2 3 1900-03

Try .dt.strftime() to show how to display the date, because datetime values are by default stored in %Y-%m-%d 00:00:00 format.
import pandas as pd
df= pd.DataFrame({'month':[1,2,3]})
df['date']=pd.to_datetime(df['month'], format="%m").dt.strftime('%Y-%m')
print(df)
You have to explicitly tell pandas to add 1 to the months as they are from range 0-11 not 1-12 in your case.
df=pd.DataFrame({'month':[11,1,2,3,0]})
df['date']=pd.to_datetime(df['month']+1, format='%m').dt.strftime('1990-%m')

Here is my solution for you
import pandas as pd
Data = {
'Month' : [1,2,3],
'Temp' : [2,4,3]
}
data = pd.DataFrame(Data)
data['Month']= pd.to_datetime('1990-' + data.Month.astype(int).astype(str) + '-1', format = '%Y-%m').dt.to_period('M')
Month Temp
0 1990-01 2
1 1990-02 4
2 1990-03 3
If you want Month[0] means 1 then you can conditionally add this one

Related

Pandas change values based on previous value in same column

I have the following dataframe:
import pandas as pd
import datetime
df = pd.DataFrame({'ID': [1, 2, 1, 1],
'Date' : [datetime.date(year=2022,month=5,day=1), datetime.date(year=2022,month=11,day=1),
datetime.date(year=2022,month=10,day=1), datetime.date(year=2022,month=11,day=1)],
"Lifecycle ID": [5,5,5,5]})
And I need to change the lifecycle based on the lifecycle 6 month ago (if it was 5, it should always be 6 (not +1)).
I'm currently trying:
df.loc[(df["Date"] == (df["Date"] - pd.DateOffset(months=6))) & (df["Lifecycle ID"] == 5), "Lifecycle ID"] = 6
However Pandas is not considering the ID and I don't know how.
The output should be this dataframe (only last Lifecycle ID changed to 6):
Could you please help me here?
The logic is not fully clear, but if I guess correctly:
# ensure datetime type
df['Date'] = pd.to_datetime(df['Date'])
# add the time delta to form a helper DataFrame
df2 = df.assign(Date=df['Date'].add(pd.DateOffset(months=6)))
# merge on ID/Date, retrieve "Lifecycle ID"
# and check if the value is 5
m = df[['ID', 'Date']].merge(df2, how='left')['Lifecycle ID'].eq(5)
# if it is, update the value
df.loc[m, 'Lifecycle ID'] = 6
If you want to increment the value automatically from the value 6 months ago:
s = df[['ID', 'Date']].merge(df2, how='left')['Lifecycle ID']
df.loc[s.notna(), 'Lifecycle ID'] = s.add(1)
Output:
ID Date Lifecycle ID
0 1 2022-05-01 5
1 2 2022-11-01 5
2 1 2022-10-01 5
3 1 2022-11-01 6

How add dig in a string by index without deleting anything

How adding a dig / in a string that looks like this:
012019
so it will look like this:
01/2019
Also add maybe day like
01/01/2019
The data:
import pandas as pd
df= pd.DataFrame({ "month": ["012019","152019","222019","142019","302019","012020"]})
My code:
df.month = df.month.apply(lambda x: '{:0>2}'.format(x.split('/')[0]))
But it does not work.
If I understood correctly, you just want to add a slash between 2nd and 3rd characters. then it's easy:
df['new'] = df.month.str.slice(0, 2) + '/' + df.month.str.slice(2)
IIUC just convert to datetime and use dt.strftime
df['month'] = pd.to_datetime(df['month'],format='%d%Y').dt.strftime('%d/%Y')
output:
print(df)
month
0 01/2019
1 15/2019
2 22/2019
3 14/2019
4 30/2019
5 01/2020
if you want to add a month as well just add it to your string
month = '01'
df['month'] = pd.to_datetime(df['month'].astype(str) +
month,format='%d%Y%m').dt.strftime('%m/%d/%Y')
print(df)
month
0 01/01/2019
1 01/15/2019
2 01/22/2019
3 01/14/2019
4 01/30/2019
5 01/01/2020

Pandas - Increase date by 1

I have a dataframe like this,
ID DateCol
1 26/06/2017
2 Employee Not Found
I want to increase the date by 1.
The expected output is,
ID DateCol
1 27/06/2017
2 Employee Not Found
I tried,
temp_result_df['NewDateCol'] = pd.to_datetime(temp_result_df['DateCol']).apply(pd.DateOffset(1))
And it is not working, as I believe there is a string on it.
Here is best working with datetimes in column with NaT for missing values - use to_datetime with Timedelta or Day offset:
temp_result_df['NewDateCol'] = pd.to_datetime(temp_result_df['DateCol'], errors='coerce')
temp_result_df['NewDateCol'] += pd.Timedelta(1, 'd')
#alternative
#temp_result_df['NewDateCol'] += pd.offsets.Day(1)
print (temp_result_df)
ID DateCol NewDateCol
0 1 26/06/2017 2017-06-27
1 2 Employee Not Found NaT
If need strings like original data need strftime with replace:
s = pd.to_datetime(temp_result_df['DateCol'], errors='coerce') + pd.Timedelta(1, 'd')
temp_result_df['NewDateCol'] = s.dt.strftime('%d/%m/%Y').replace('NaT','Employee Not Found')
print (temp_result_df)
ID DateCol NewDateCol
0 1 26/06/2017 27/06/2017
1 2 Employee Not Found Employee Not Found

Converting dataframe column of datetime data to DD/MM/YYYY string data

I have a dataframe column with datetime data in 1980-12-11T00:00:00 format.
I need to convert the whole column to DD/MM/YYY string format.
Is there any easy code for this?
Creating a working example:
df = pd.DataFrame({'date':['1980-12-11T00:00:00', '1990-12-11T00:00:00', '2000-12-11T00:00:00']})
print(df)
date
0 1980-12-11T00:00:00
1 1990-12-11T00:00:00
2 2000-12-11T00:00:00
Convert the column to datetime by pd.to_datetime() and invoke strftime()
df['date_new']=pd.to_datetime(df.date).dt.strftime('%d/%m/%Y')
print(df)
date date_new
0 1980-12-11T00:00:00 11/12/1980
1 1990-12-11T00:00:00 11/12/1990
2 2000-12-11T00:00:00 11/12/2000
You can use pd.to_datetime to convert string to datetime data
pd.to_datetime(df['col'])
You can also pass specific format as:
pd.to_datetime(df['col']).dt.strftime('%d/%m/%Y')
When using pandas, try pandas.to_datetime:
import pandas as pd
df = pd.DataFrame({'date': ['1980-12-%sT00:00:00'%i for i in range(10,20)]})
df.date = pd.to_datetime(df.date).dt.strftime("%d/%m/%Y")
print(df)
date
0 10/12/1980
1 11/12/1980
2 12/12/1980
3 13/12/1980
4 14/12/1980
5 15/12/1980
6 16/12/1980
7 17/12/1980
8 18/12/1980
9 19/12/1980

Changing format of date in pandas dataframe

I have a pandas dataframe, in which a column is a string formatted as
yyyymmdd
which should be a date. Is there an easy way to convert it to a recognizable form of date?
And then what python libraries should I use to handle them?
Let's say, for example, that I would like to consider all the events (rows) whose date field is a working day (so mon-fri). What is the smoothest way to handle such a task?
Ok so you want to select Mon-Friday. Do that by converting your column to datetime and check if the dt.dayofweek is lower than 6 (Mon-Friday --> 0-4)
m = pd.to_datetime(df['date']).dt.dayofweek < 5
df2 = df[m]
Full example:
import pandas as pd
df = pd.DataFrame({
'date': [
'20180101',
'20180102',
'20180103',
'20180104',
'20180105',
'20180106',
'20180107'
],
'value': range(7)
})
m = pd.to_datetime(df['date']).dt.dayofweek < 5
df2 = df[m]
print(df2)
Returns:
date value
0 20180101 0
1 20180102 1
2 20180103 2
3 20180104 3
4 20180105 4

Categories