How to convert datetime to strings in python [duplicate] - python

I have a DataFrame with a column called period that holds datetime values in the following format:
2020-03-01T00:00:00.000000000
I want to convert these datetimes to strings in the format 03/01/2020 (month/day/year).
How would I do this?

import pandas as pd
df = pd.DataFrame({'period': ['2020-03-01T00:00:00.000000000', '2020-04-01T00:00:00.000000000']})
df['period'] = pd.to_datetime(df['period'])
df['period'] = df['period'].dt.strftime('%m/%d/%Y')
print(df)
Output
period
0 03/01/2020
1 04/01/2020
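One thing to keep in mind with the answer above: after strftime the column holds plain strings, not datetimes. Here is a minimal sketch, using a made-up two-row Series rather than the asker's data, of how a missing value passes through the formatting and how the strings can be parsed back with the same format. The names period, formatted and restored are only for illustration.

```python
import pandas as pd

# Hypothetical sample: one real timestamp and one missing value.
period = pd.Series([pd.Timestamp("2020-03-01"), pd.NaT])

# Format as month/day/year; the missing entry stays missing instead of
# becoming a string.
formatted = period.dt.strftime("%m/%d/%Y")

# Parse the strings back into datetimes using the same explicit format.
restored = pd.to_datetime(formatted, format="%m/%d/%Y")

print(formatted)
print(restored)
```

Passing the format explicitly when parsing back avoids any day/month ambiguity for values such as 03/01/2020.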

Related

How to obtain just the year from pandas data frame? [duplicate]

So I wrote some code to turn a list of strings into date times:
s = pd.Series(["14 Nov 2020", "14/11/2020", "2020/11/14",
"Hello World", "Nov 14th, 2020"])
s_dates = pd.to_datetime(s, errors='coerce', exact=False)
print(s_dates)
It produced the following output:
0 2020-11-14
1 2020-11-14
2 2020-11-14
3 NaT
4 2020-11-14
dtype: datetime64[ns]
How would I obtain just the year from this?
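Since your Series s_dates has dtype datetime64[ns], you can use Series.dt.year directly:
print(s_dates.dt.year)
This returns a Series containing only the year. Because s_dates contains a NaT, the missing entry comes back as NaN and the result has a float dtype rather than an integer one.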
Check the documentation for more useful datetime transformations.
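If you want whole-number years even when some rows failed to parse, one option is pandas' nullable integer dtype. This is a small sketch on made-up input (three strings in one consistent format, one of them unparseable), not the asker's exact Series:

```python
import pandas as pd

# Hypothetical input: a single consistent format plus one bad entry.
s = pd.Series(["2020-11-14", "not a date", "2019-01-02"])
s_dates = pd.to_datetime(s, errors="coerce")

# With a NaT present, the year comes back as a float column containing NaN.
years = s_dates.dt.year

# The nullable "Int64" dtype keeps integers and shows the gap as <NA>.
years_int = years.astype("Int64")
print(years_int)
```

Dropping the missing rows first with s_dates.dropna() is another way to avoid the float column, if the unparseable entries are not needed.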
Assuming your years are always 4 digits, you can also try str.extract on the original strings in s:
years = s.str.extract(r'(\d{4})', expand=False)

Converting column type 'datetime64[ns]' to datetime in Python3

I would like to compare two dates in Python 3: one comes from a pandas DataFrame and the other is calculated. I want to keep only the rows whose 'Publication_date' is less than or equal to today's date and greater than the date 10 years ago.
The pandas df looks like this:
PMID Publication_date
0 31611796 2019-09-27
1 33348808 2020-12-17
2 12089324 2002-06-27
3 31028872 2019-04-25
4 26805781 2016-01-21
I am doing the comparison as shown below.
df[(df['Publication_date']> datetime.date.today() - datetime.timedelta(days=3650)) &
(df['Publication_date']<= datetime.date.today())]
Applied to df, this filter should exclude the third row (index 2, dated 2002-06-27).
The 'Publication_date' column has type string. I converted it to a date using the line below in my script.
df_phenotype['publication_date']= pd.to_datetime(df_phenotype['publication_date'])
But this changes the column type to datetime64[ns], which makes the comparison between datetime64[ns] and datetime.date incompatible.
How can I perform this comparison?
Any help is highly appreciated.
You can work with datetimes using pandas alone. Timestamp.floor removes the time from a datetime (sets it to 00:00:00):
df['Publication_date']= pd.to_datetime(df['Publication_date'])
today = pd.to_datetime('now').floor('d')
df1 = df[(df['Publication_date']> today - pd.Timedelta(days=3650)) &
(df['Publication_date']<= today)]
You can also use a 10-year offset:
today = pd.to_datetime('now').floor('d')
df1 = df[(df['Publication_date']> today - pd.offsets.DateOffset(years=10)) &
(df['Publication_date']<= today)]
print (df1)
PMID Publication_date
0 31611796 2019-09-27
1 33348808 2020-12-17
3 31028872 2019-04-25
4 26805781 2016-01-21
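Because the result depends on the day the code is run, here is a self-contained sketch of the same filter with a fixed reference date standing in for "today". The date 2021-01-01 and the names reference, cutoff and recent are assumptions made for reproducibility, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "PMID": [31611796, 33348808, 12089324, 31028872, 26805781],
    "Publication_date": ["2019-09-27", "2020-12-17", "2002-06-27",
                         "2019-04-25", "2016-01-21"],
})
df["Publication_date"] = pd.to_datetime(df["Publication_date"])

# Assumption: a fixed stand-in for "today" so the output never changes.
reference = pd.Timestamp("2021-01-01")

# DateOffset(years=10) steps back ten calendar years (here to 2011-01-01).
cutoff = reference - pd.DateOffset(years=10)

# Keep rows newer than the cutoff and not later than the reference date.
mask = (df["Publication_date"] > cutoff) & (df["Publication_date"] <= reference)
recent = df[mask]
print(recent)
```

Both sides of each comparison are pandas datetime values here, so there is no mixing of datetime64[ns] with datetime.date.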

pandas - add 1 month to a pd.Timestamp [duplicate]

I have multiple DataFrames, each indexed by timestamps for consecutive months. For example:
1996-01-01 01:00:00
1996-02-01 01:00:00
1996-03-01 01:00:00
1996-04-01 01:00:00
1996-05-01 01:00:00
1996-06-01 01:00:00
I'm trying to create a function that adds an arbitrary number of rows to the df, continuing from whatever the last month happens to be. I tried to solve this by using:
df.iloc[-1].name + pd.Timedelta(1, unit='M')
in a for loop, but this only seems to add 30 days instead of incrementing the month by 1. Is there a more reliable way to take a pd.Timestamp and add 1 month?
Thank you
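No answer is shown for this one here, so the following is only a sketch of one common approach: pd.DateOffset(months=1) moves a timestamp forward by a calendar month instead of a fixed number of days. The helper name extend_by_months and the sample frame are hypothetical.

```python
import pandas as pd

def extend_by_months(df, n):
    """Return a copy of df with n extra monthly rows appended (filled with NaN)."""
    last = df.index[-1]
    # DateOffset(months=i) shifts the calendar month and keeps the time of day.
    new_stamps = [last + pd.DateOffset(months=i) for i in range(1, n + 1)]
    return df.reindex(df.index.append(pd.DatetimeIndex(new_stamps)))

# Hypothetical frame indexed like the example in the question.
df = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0]},
    index=pd.to_datetime(["1996-04-01 01:00:00",
                          "1996-05-01 01:00:00",
                          "1996-06-01 01:00:00"]),
)

extended = extend_by_months(df, 2)
print(extended)
```

Each new timestamp is computed from the last existing one rather than from the previous new one, which matters for month-end dates: adding one month to January 31 lands on the last day of February, and chaining from there would stay on the 28th.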

"AttributeError: Can only use .dt accessor with datetimelike values" when trying to extract year from a date column [duplicate]

I have a DataFrame with a column 'brth_dt' whose type is datetime64[ns]. I want to compute each person's age, but when I run:
all_df['brth_dt'].dt.year
or
all_df['age'] = (pd.datetime.today().year - all_df['brth_dt'].dt.year)
I get an error: Can only use .dt accessor with datetimelike values
The brth_dt column looks like this:
brth_date
1 14Oct1978
2 21Aug1970
3 06Jan1980
4 09Mar1992
Any advice? Thanks!
The column holds strings, so you need to convert it to datetime first:
df['brth_date'] = pd.to_datetime(df['brth_date'], format = '%d%b%Y')
Then you can use the dt accessor:
df['brth_date'].dt.year
You get
1 1978
2 1970
3 1980
4 1992
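Since the original goal was an age rather than just the birth year, here is a rough sketch that goes one step further. It assumes a fixed reference date (2021-06-30, chosen only so the numbers are reproducible) and English month abbreviations in the data; the names ref and birthday_pending are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({"brth_date": ["14Oct1978", "21Aug1970", "06Jan1980", "09Mar1992"]})
df["brth_date"] = pd.to_datetime(df["brth_date"], format="%d%b%Y")

# Assumption: a fixed reference date; use pd.Timestamp.today() for real data.
ref = pd.Timestamp("2021-06-30")

month = df["brth_date"].dt.month
day = df["brth_date"].dt.day

# True where the birthday has not yet occurred in the reference year.
birthday_pending = (month > ref.month) | ((month == ref.month) & (day > ref.day))

# Year difference, minus one for people whose birthday is still to come.
df["age"] = ref.year - df["brth_date"].dt.year - birthday_pending.astype(int)
print(df)
```

Subtracting the years alone, as in the question, overstates the age by one for anyone whose birthday falls later in the year than the reference date.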

How to convert to just a date - Pandas, Python [duplicate]

I'm trying to convert a string to a date. I understand how to use the to_datetime function that comes with pandas, but I'd like to do this without adding a time component.
I'm sure this is very simple, but I'm a little new to this.
The time component is not something you have to supply: whether you use datetime.strptime or to_datetime, the conversion gives the same result:
In [10]:
df = pd.DataFrame({'date':['2012/04/06']})
df
Out[10]:
date
0 2012/04/06
In [11]:
import datetime as dt
df['date'].apply(lambda x: dt.datetime.strptime(x, '%Y/%m/%d'))
Out[11]:
0 2012-04-06
Name: date, dtype: datetime64[ns]
In [13]:
pd.to_datetime(df['date'])
Out[13]:
0 2012-04-06
Name: date, dtype: datetime64[ns]
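If the aim is a column that carries no time at all, two options are worth sketching. The column names date_only and midnight are made up for this example, and which option fits depends on whether you still need datetime operations afterwards.

```python
import datetime as dt
import pandas as pd

df = pd.DataFrame({"date": ["2012/04/06"]})
parsed = pd.to_datetime(df["date"], format="%Y/%m/%d")

# Option 1: plain datetime.date objects; the .dt accessor no longer applies
# to this column afterwards.
df["date_only"] = parsed.dt.date

# Option 2: stay in datetime64 and force the time part to midnight.
df["midnight"] = parsed.dt.normalize()

print(df)
```

Option 2 keeps fast vectorized date arithmetic, while option 1 gives values that compare directly with datetime.date objects.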
