How to obtain just the year from pandas data frame? [duplicate] - python

This question already has answers here:
python pandas extract year from datetime: df['year'] = df['date'].year is not working
(5 answers)
Closed 3 months ago.
So I wrote some code to turn a list of strings into date times:
s = pd.Series(["14 Nov 2020", "14/11/2020", "2020/11/14",
"Hello World", "Nov 14th, 2020"])
s_dates = pd.to_datetime(s, errors='coerce', exact=False)
print(s_dates)
It produced the following output:
0 2020-11-14
1 2020-11-14
2 2020-11-14
3 NaT
4 2020-11-14
dtype: datetime64[ns]
How would I obtain just the year from this?

Since your series s_dates has dtype datetime64[ns], you can use Series.dt.year directly:
print(s_dates.dt.year)
This returns a series containing only the year. Because s_dates contains NaT, the result here is float64 with NaN in row 3; a series with no missing values gives an integer dtype.
Check the documentation for more useful datetime transformations.
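If whole-number years are needed despite the missing value, one option is pandas' nullable integer dtype. The following is a minimal sketch, assuming a small made-up series with one unparseable entry (the sample values and the name years are illustrative, not from the question):

```python
import pandas as pd

# Illustrative data: the middle entry cannot be parsed and becomes NaT.
s = pd.Series(["2020-11-14", "not a date", "2019-01-02"])
s_dates = pd.to_datetime(s, format="%Y-%m-%d", errors="coerce")

# .dt.year yields NaN for NaT; the nullable "Int64" dtype keeps the
# years as integers and marks the missing row as <NA>.
years = s_dates.dt.year.astype("Int64")
print(years)
```

Rows that failed to parse stay missing rather than being turned into a placeholder year.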

Assuming the years are always 4 digits, you can also use str.extract on the original strings (s, before conversion), since the .str accessor does not work on a datetime series:
year = s.str.extract(r'(\d{4})', expand=False)

Related

How to convert datetime to strings in python [duplicate]

This question already has answers here:
Converting a datetime column to a string column
(4 answers)
Closed 3 days ago.
I have a dataframe which contains a column called period that has datetime values in it in the following format:
2020-03-01T00:00:00.000000000
I want to convert the datetime to strings with the format - 03/01/2020 (month, day, year)
How would I do this?
import pandas as pd
df = pd.DataFrame({'period': ['2020-03-01T00:00:00.000000000', '2020-04-01T00:00:00.000000000']})
df['period'] = pd.to_datetime(df['period'])
df['period'] = df['period'].dt.strftime('%m/%d/%Y')
print(df)
Output
period
0 03/01/2020
1 04/01/2020

Computing time between two dates and returning number of days [duplicate]

This question already has answers here:
Pandas Timedelta in Days
(5 answers)
Closed 3 years ago.
Given two columns in a dataframe that are date time objects:
Checkin Checkout
2018-09-13 19:55:00 2018-09-16 13:08:00
I'd like to compute the time difference in days and have it output as an integer to a new column. So far, I've done this but the output also includes seconds.
delta = df['Checkin'] - df['Checkout']
print(delta)
The output however ends up being:
2 days 17:13:00
and is output as a DT object. I'd like it to just output as 2 and as an integer in a new column.
How would I go about doing that?
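You need dt.days. Subtract Checkin from Checkout (not the other way round) so the result is positive:
(df['Checkout'] - df['Checkin']).dt.days
Output:
0 2
dtype: int64
With the operands reversed the result is -3, because a negative timedelta is stored as -3 days plus a positive remainder.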
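To see the difference between whole days and fractional days, here is a minimal sketch using the single sample row from the question. The new column names days and days_fractional are made up for illustration:

```python
import pandas as pd

# The one sample row shown in the question.
df = pd.DataFrame({
    "Checkin": pd.to_datetime(["2018-09-13 19:55:00"]),
    "Checkout": pd.to_datetime(["2018-09-16 13:08:00"]),
})

stay = df["Checkout"] - df["Checkin"]  # timedelta64 series: 2 days 17:13:00

# Whole days only (the hours and minutes are dropped).
df["days"] = stay.dt.days

# Fractional days, in case the partial day matters.
df["days_fractional"] = stay.dt.total_seconds() / 86400

print(df[["days", "days_fractional"]])
```

dt.days is enough when only the integer is wanted; total_seconds() is the usual route when partial days should be kept.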

pandas - add 1 month to a pd.Timestamp [duplicate]

This question already has answers here:
Add months to a date in Pandas
(4 answers)
How can I get pandas Timestamp offset by certain amount of months?
(1 answer)
Closed 4 years ago.
I have multiple df, and they are indexed with timestamps for consecutive months. For example:
1996-01-01 01:00:00
1996-02-01 01:00:00
1996-03-01 01:00:00
1996-04-01 01:00:00
1996-05-01 01:00:00
1996-06-01 01:00:00
I'm trying to create a function where I can add an arbitrary number of rows onto the df, continuing on from whatever the last month happens to be. I tried to solve this by using:
df.iloc[-1].name + pd.Timedelta(1, unit='M')
in a for loop, but this only seems to add 30 days, instead of changing the month value +1. Is there a more reliable way to fetch a pd.Timestamp and add 1 month?
Thank you
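One commonly used approach is pd.DateOffset(months=1), which shifts the calendar month instead of adding a fixed number of days. Below is a minimal sketch, assuming a hypothetical helper named extend_months and a made-up value column; neither comes from the question:

```python
import pandas as pd

# Rebuild an index like the one in the question (monthly, at 01:00).
index = pd.to_datetime([f"1996-{m:02d}-01 01:00:00" for m in range(1, 7)])
df = pd.DataFrame({"value": range(6)}, index=index)

# DateOffset moves by calendar months, keeping the day and time.
next_month = df.index[-1] + pd.DateOffset(months=1)
print(next_month)  # 1996-07-01 01:00:00


def extend_months(frame, n):
    """Hypothetical helper: append n empty monthly rows after the last index."""
    last = frame.index[-1]
    new_index = pd.DatetimeIndex(
        [last + pd.DateOffset(months=i) for i in range(1, n + 1)]
    )
    # Reindexing on the combined index fills the new rows with NaN.
    return frame.reindex(frame.index.append(new_index))


extended = extend_months(df, 2)
print(extended.tail(3))
```

When the target month is shorter, DateOffset clips to the last valid day, so 2020-01-31 plus one month gives 2020-02-29.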

"AttributeError: Can only use .dt accessor with datetimelike values" when trying to extract year from a date column [duplicate]

This question already has answers here:
python pandas extract year from datetime: df['year'] = df['date'].year is not working
(5 answers)
Closed 5 years ago.
I have a dataframe with a column 'brth_dt' and its type is datetime64[ns]. I want to extract each person's age, but when I run:
all_df['brth_dt'].dt.year or
all_df['age'] = (pd.datetime.today().year - all_df['brth_dt'].dt.year)
I get an error: can only use .dt accessor with datetimelike values
The brth_dt column looks like this:
brth_date
1 14Oct1978
2 21Aug1970
3 06Jan1980
4 09Mar1992
Any advice? Thanks!
You need to convert the column to datetime first using
df['brth_date'] = pd.to_datetime(df['brth_date'], format = '%d%b%Y')
Then you can use the dt accessor
df['brth_date'].dt.year
You get
1 1978
2 1970
3 1980
4 1992
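For the age calculation the question was after, here is a minimal sketch. It assumes an English locale for the %b month abbreviations and uses a fixed reference date so the result is reproducible; pd.Timestamp.today() would be the live equivalent, since pd.datetime has been removed from recent pandas releases:

```python
import pandas as pd

# Sample values from the question.
all_df = pd.DataFrame(
    {"brth_date": ["14Oct1978", "21Aug1970", "06Jan1980", "09Mar1992"]}
)

# Parse the strings first; .dt only works on datetime-like columns.
all_df["brth_date"] = pd.to_datetime(all_df["brth_date"], format="%d%b%Y")

# Fixed reference date (an assumption for reproducibility).
today = pd.Timestamp("2024-01-01")

# Year difference only: it ignores whether the birthday has passed yet.
all_df["age"] = today.year - all_df["brth_date"].dt.year
print(all_df)
```

This is the same year subtraction the question attempted, so it can be off by one for people whose birthday falls later in the reference year.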

How to convert to just a date - Pandas, Python [duplicate]

This question already has answers here:
In Pandas how do I convert a string of date strings to datetime objects and put them in a DataFrame?
(3 answers)
Closed 8 years ago.
I'm trying to convert a string to a date and I understand how to use the to_datetime that comes with pandas but I'd like to be able to do this without inserting a time?
I'm sure this is very simple but I'm a little new to this.
You don't need the time component. Whether you use datetime.strptime or to_datetime, the conversion is the same:
In [10]:
df = pd.DataFrame({'date':['2012/04/06']})
df
Out[10]:
date
0 2012/04/06
In [11]:
import datetime as dt
df['date'].apply(lambda x: dt.datetime.strptime(x, '%Y/%m/%d'))
Out[11]:
0 2012-04-06
Name: date, dtype: datetime64[ns]
In [13]:
pd.to_datetime(df['date'])
Out[13]:
0 2012-04-06
Name: date, dtype: datetime64[ns]
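If the input strings do carry a time and only the date should remain, two options are dt.normalize() and dt.date. A minimal sketch, assuming a made-up timestamp string and illustrative column names (midnight, date_only):

```python
import datetime

import pandas as pd

# Illustrative input that includes a time of day.
df = pd.DataFrame({"date": ["2012/04/06 13:45:00"]})
parsed = pd.to_datetime(df["date"], format="%Y/%m/%d %H:%M:%S")

# Keeps the datetime64 dtype but resets the time to 00:00:00.
df["midnight"] = parsed.dt.normalize()

# Converts to plain datetime.date objects (object dtype).
df["date_only"] = parsed.dt.date

print(df)
```

normalize() keeps the column usable with the .dt accessor, while dt.date gives plain Python date objects and loses the vectorised datetime operations.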
