Get month-day pair without a year from pandas date time - python

I am trying to use the code below, but I still end up with the year-month-day format, with the year set to the default "1900". I want to get only month-day pairs, if that is possible.
df['date'] = pd.to_datetime(df['date'], format="%m-%d")

If you convert anything to a datetime, it will always include a year, i.e. to_datetime always yields a datetime with a year.
To drop the year, you need to store the value as a string, e.g. by running the inverse of your example on the datetime column (note that strftime takes the format as a positional argument):
df['date'] = df['date'].dt.strftime("%m-%d")
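The round trip described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical three-row frame; the `month_day` column and the `parsed` variable are names invented for the example.

```python
import pandas as pd

# Hypothetical sample data; the column name 'date' follows the question.
df = pd.DataFrame({"date": ["03-15", "12-25", "01-02"]})

# Parsing without a year fills in the default year 1900.
parsed = pd.to_datetime(df["date"], format="%m-%d")

# Formatting back to text drops the year again; the result is a string column.
df["month_day"] = parsed.dt.strftime("%m-%d")

# Zero-padded "MM-DD" strings sort chronologically within a year.
df = df.sort_values("month_day").reset_index(drop=True)
```

Because the month and day are zero-padded, the string column still sorts in calendar order, which is often all that is needed from a yearless date.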

Related

Convert month, day in string to date in Python

I have 2 columns, month and day, in my dataframe, both of dtype object. I want to sort them in ascending order (Jan, Feb, Mar), but to do that I need to convert them to a date format. I tried the following code, and some more, but nothing seems to work.
ff['month'] = dt.datetime.strptime(ff['month'],format='%b')
and
ff['month'] = pd.to_datetime(ff['month'], format="%b")
Any help would be appreciated. Thank you
This works to convert Month Names to Integers:
import datetime as dt
ff['month'] = [dt.datetime.strptime(m, "%b").month for m in ff['month']]
(Basically, you're just passing strings one by one to the first function you mentioned, to make it work.)
You can then manipulate (e.g. sort) them.
Working with dataframe:
ff['month'] = ff['month'].apply(lambda x: dt.datetime.strptime(x, "%b"))
ff = ff.sort_values(by=['month'])
ff['month'] = ff['month'].apply(lambda x: x.strftime("%b"))
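As an alternative to converting back and forth, the month column can be given an explicit calendar order. The sketch below is not the approach from the answer above; it swaps in an ordered `pd.Categorical`, and the sample rows and the `month_order` list are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data shaped like the question: both columns are strings.
ff = pd.DataFrame({
    "month": ["Mar", "Jan", "Dec", "Feb", "Jan"],
    "day": ["3", "15", "1", "28", "2"],
})

# English month abbreviations in calendar order (hard-coded to avoid
# depending on the process locale).
month_order = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# An ordered categorical sorts by category position, not alphabetically.
ff["month"] = pd.Categorical(ff["month"], categories=month_order, ordered=True)

# Days are converted to integers so that "2" sorts before "15".
ff["day"] = ff["day"].astype(int)

ff = ff.sort_values(["month", "day"]).reset_index(drop=True)
```

The month column keeps its readable labels throughout, so no second conversion step is needed after sorting.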

Group by month in a Pandas dataframe when there is no year in the datetime object

I have a large dataset with a date_time field (object) that is in this format: 01/01 01:00:00 (month/day hour:minute:second). There is no year. I want to be able to group the dataset by month in a Pandas dataframe.
Whatever I try, I either get an error like, "Error parsing datetime string " 01/01 01:00:00" at position 3" or an out-of-bounds error. I'm a bit of a newbie here. I suspect it is a datetime formatting issue because there is no year...but I cannot figure it out.
If you don't have a year, you don't really have a date. But you can still group by month, just treat it like a string!
Something along the lines of this should work:
# create a month string column called month_str
# the lambda turns each yearless 'date' into a str, strips stray whitespace,
# and keeps only the first two characters (the month)
df['month_str'] = df['date_time'].apply(lambda x: str(x).strip()[0:2])
# this returns a GroupBy object; chain an aggregation such as .size() or .mean()
df.groupby('month_str')
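If a numeric month is preferred over a string, the yearless values can also be parsed with an explicit format, as long as the default year is acceptable. This is a sketch under assumptions: the sample rows and the `value` column are hypothetical, and the `date_time` column name is taken from the question.

```python
import pandas as pd

# Hypothetical sample data; note the leading space seen in the error message.
df = pd.DataFrame({
    "date_time": [" 01/01 01:00:00", " 01/15 13:00:00", " 02/03 07:30:00"],
    "value": [1.0, 2.0, 4.0],
})

# Strip stray whitespace, then parse with an explicit format.
# The missing year is filled in with the default, 1900.
parsed = pd.to_datetime(df["date_time"].str.strip(), format="%m/%d %H:%M:%S")

# Group on the integer month and aggregate.
df["month"] = parsed.dt.month
monthly_mean = df.groupby("month")["value"].mean()
```

One caveat: 1900 is not a leap year, so a value such as 02/29 would most likely fail to parse this way. In that case the string-slicing approach above is the safer choice.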

Trying to convert '2020-12-28' to only month, for example 'December'

I already converted the column to datetime from object and then used the following code:
df['month'] = pd.DatetimeIndex(df['ArrivalDate']).month
But this code gives me the error
'Length of value does not match length of index'.
However, the column 'ArrivalDate' is not the index and I do not intend to make it either. I also have multiple values with the same dates and I want to aggregate them based on months.
You can use pandas.Series.dt.strftime() to convert a datetime column to strings in the format you specify.
df['month'] = df['ArrivalDate'].dt.strftime('%B')
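Since the question also asks about aggregating by month, here is a minimal sketch of that follow-up step. The sample rows and the `guests` column are hypothetical, and the month names assume the default English (C) locale.

```python
import pandas as pd

# Hypothetical sample data with repeated dates, as described in the question.
df = pd.DataFrame({
    "ArrivalDate": ["2020-12-28", "2020-12-28", "2020-11-03"],
    "guests": [2, 3, 1],
})

# Make sure the column is a real datetime column before using .dt.
df["ArrivalDate"] = pd.to_datetime(df["ArrivalDate"])

# Full month name for each row.
df["month"] = df["ArrivalDate"].dt.strftime("%B")

# Aggregate the rows that share a month name.
totals = df.groupby("month")["guests"].sum()
```

Grouping by name alone merges the same month across different years; if that matters, grouping by both `df["ArrivalDate"].dt.year` and the month keeps them apart.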

Pandas sets datetime to first day of month if missing day?

When I used Pandas to convert my datetime string, it sets it to the first day of the month if the day is missing.
For example:
pd.to_datetime('2017-06')
OUT[]: Timestamp('2017-06-01 00:00:00')
Is there a way to have it use the 15th (middle) day of the month?
EDIT:
I only want it to use day 15 if the day is missing, otherwise use the actual date - so offsetting all values by 15 won't work.
While this isn't possible with the to_datetime call itself, you could use regex matching to check whether the string contains a day and proceed accordingly. Note: this code only works with '-' delimited dates:
import re
import pandas as pd
date_str = '2017-06'
# no second '-' means the day is missing, so set it to the 15th
if not re.match('.+-.+-.+', date_str):
    ts = pd.to_datetime(date_str).replace(day=15)
else:
    ts = pd.to_datetime(date_str)
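For a whole column rather than a single string, the same idea can be applied without a loop. This is a sketch, assuming '-' delimited strings in either "YYYY-MM" or "YYYY-MM-DD" form; the `dates`, `missing_day`, and `completed` names are invented for the example.

```python
import pandas as pd

# Hypothetical sample data mixing strings with and without a day.
dates = pd.Series(["2017-06", "2017-06-20", "2018-01"])

# A string with exactly one '-' has a year and a month but no day.
missing_day = dates.str.count("-") == 1

# Append the default day only where it is missing; other rows are kept as-is.
completed = dates.where(~missing_day, dates + "-15")

# Every string is now a full date, so one explicit format covers them all.
parsed = pd.to_datetime(completed, format="%Y-%m-%d")
```

Completing the string before parsing avoids having to tell apart a genuine first-of-month date from one that pandas filled in by default.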

Pandas: select all dates with specific month and day

I have a dataframe full of dates and I would like to select all dates where the month==12 and the day==25 and replace the zero in the xmas column with a 1.
Is there any way to do this? The second line of my code errors out.
df = DataFrame({'date':[datetime(2013,1,1).date() + timedelta(days=i) for i in range(0,365*2)], 'xmas':np.zeros(365*2)})
df[df['date'].month==12 and df['date'].day==25] = 1
Pandas Series with datetime now behaves differently. See .dt accessor.
This is how it should be done now:
df.loc[(df['date'].dt.day==25) & (df['date'].dt.month==12), 'xmas'] = 1
Basically, what you tried won't work because you need to use & to compare arrays element-wise; you also need parentheses because of operator precedence. On top of this, you should use loc to perform the indexing:
df.loc[(df['date'].month==12) & (df['date'].day==25), 'xmas'] = 1
An update was needed in reply to this question. As of today, there's a slight difference in how you extract months from datetime objects in a pd.Series.
So, from the very start: in case you have a raw date column, first convert it to datetime objects using a simple function:
import datetime as dt
def read_as_datetime(str_date):
    # replace %Y-%m-%d with your own date format
    return dt.datetime.strptime(str_date, '%Y-%m-%d')
Then apply this function to your dates column and save the results in a new column named datetime:
df['datetime'] = df.dates.apply(read_as_datetime)
Finally, to select dates by day and month, use the same piece of code that @Shayan RC explained, with one slight change: notice the .dt accessor after the datetime column:
df.loc[(df['datetime'].dt.month==12) & (df['datetime'].dt.day==25), 'xmas'] = 1
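Putting the pieces of this thread together, a self-contained version might look like the sketch below. It rebuilds the frame from the question; the conversion with `pd.to_datetime` is an added step, needed here because the original column holds plain `datetime.date` objects, and `is_xmas` and `n_days` are names invented for the example.

```python
from datetime import date, timedelta

import numpy as np
import pandas as pd

# Two years of consecutive days starting 2013-01-01, as in the question.
n_days = 365 * 2
df = pd.DataFrame({
    "date": [date(2013, 1, 1) + timedelta(days=i) for i in range(n_days)],
    "xmas": np.zeros(n_days),
})

# The column holds datetime.date objects (dtype object), so convert it to
# datetime64 first; the .dt accessor only works on datetime-like columns.
df["date"] = pd.to_datetime(df["date"])

# Element-wise & with parentheses around each comparison.
is_xmas = (df["date"].dt.month == 12) & (df["date"].dt.day == 25)

# Use .loc with the boolean mask and the target column to assign in place.
df.loc[is_xmas, "xmas"] = 1
```

Keeping the mask in its own variable makes it easy to reuse, for example `df[is_xmas]` to inspect the matching rows.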
