AttributeError: 'Series' object has no attribute 'days' - python

I have a column 'delta' in a dataframe with dtype timedelta64[ns], calculated by subtracting one date from another. I am trying to return the number of days as a float by using this code:
from datetime import datetime
from datetime import date
df['days'] = float(df['delta'].days)
but I receive this error:
AttributeError: 'Series' object has no attribute 'days'
Any ideas why?

A DataFrame column is a Series, and for a Series you need the .dt accessor to get the days (if you are using a newer pandas version).
So, you need to change:
df['days'] = float(df['delta'].days)
To
df['days'] = df['delta'].dt.days.astype(float)
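A minimal sketch of why the attribute lookup differs between one value and a whole column. The frame and its "start" and "end" columns are made up for illustration; only the 'delta' and 'days' names come from the question.

```python
import pandas as pd

# Hypothetical data: two date columns whose difference is a timedelta64 column.
df = pd.DataFrame({
    "start": pd.to_datetime(["2001-01-01", "2004-06-05"]),
    "end": pd.to_datetime(["2001-01-11", "2004-06-06"]),
})
df["delta"] = df["end"] - df["start"]

# A single element is a pd.Timedelta, which has a plain .days attribute.
first_days = df["delta"].iloc[0].days

# The whole column is a Series, so the same attribute sits behind .dt.
# astype(float) converts every row; the built-in float() only accepts one value.
df["days"] = df["delta"].dt.days.astype(float)
```

Note that .dt.days keeps only the whole-day component, so any hours or minutes in the timedelta are dropped.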

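When subtracting the dates, you can use the following code.
import pandas as pd
df = pd.DataFrame([ pd.Timestamp('20010101'), pd.Timestamp('20040605') ])
(df.loc[0]-df.loc[1]).astype('timedelta64[D]')
So basically use .astype('timedelta64[D]') on the subtracted column. Note that this cast works on older pandas versions; pandas 2.0 and later only support the 's', 'ms', 'us', and 'ns' resolutions and raise a TypeError for 'D', so use .dt.days there instead.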

Related

Extracting seconds from a pandas TimeStamp series: 'Series' object has no attribute 'second'

I am reading a CSV file using pandas, where one of the columns has a TimeStamp value that looks like this:
2022.7.11/11:34:36:372
My end goal is to extract the TimeStamp value as datetime and subsequently get the values of seconds that I want to use in further arithmetic operations.
So far the code I have:
df = pd.read_csv(location, encoding='utf8', delimiter=';')
s = pd.Series((df['TimeStamp']).values).str.split('/', n=4 ,expand=True)
t1 = pd.to_datetime(s[1] , format='%H:%M:%S:%f')
But when I try to get the seconds value from t1 using t1.second, I get an exception that says: 'Series' object has no attribute 'second'
Where could I be going wrong?
After converting the object column to datetime, you should be able to access the seconds using:
df['TimeStamp'] = pd.to_datetime(df['TimeStamp'], format='%Y.%m.%d/%H:%M:%S:%f')
df['TimeStamp'].dt.second
0 36
Name: TimeStamp, dtype: int64
You can access the microsecond part using
df['TimeStamp'].dt.microsecond
You should use t1.dt.second instead of t1.second.
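For the arithmetic mentioned in the question, the whole-second and sub-second parts can be combined into one float. A minimal sketch, assuming a single sample value in the format shown in the question (the variable names are illustrative):

```python
import pandas as pd

# One sample value in the question's format; a real column would come from read_csv.
raw = pd.Series(["2022.7.11/11:34:36:372"], name="TimeStamp")

# Parse the whole string in one pass instead of splitting on '/' first.
ts = pd.to_datetime(raw, format="%Y.%m.%d/%H:%M:%S:%f")

# Whole seconds within the minute, via the .dt accessor.
seconds = ts.dt.second

# Seconds including the fractional part, for further arithmetic.
fractional = ts.dt.second + ts.dt.microsecond / 1_000_000
```

Here %f reads the trailing "372" as a fraction of a second, so it ends up as 372000 microseconds.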

How to group a datetime column by year?

I have a dataframe with a column of dtype datetime64[ns].
I want to group the rows by year. The dates are of this format: 2019-01-08 02:27:17
I unsuccessfully tried
df1=df.groupby([(df.modification_datetime.year)]).sum()
AttributeError: 'Series' object has no attribute 'year'
Do you know how to solve that?
EDIT :
The solution is
df1=df.groupby(df.modification_datetime.dt.year).sum()
We don't need the brackets!
You don't need all of those brackets, and you need to use the dt accessor to extract the year from the date:
df1 = df.groupby(df.modification_datetime.dt.year).sum()
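A small self-contained sketch of the same grouping. The "value" column and the sample rows are hypothetical; only modification_datetime comes from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "modification_datetime": pd.to_datetime(
        ["2019-01-08 02:27:17", "2019-06-01 10:00:00", "2020-03-15 08:30:00"]
    ),
    "value": [1, 2, 5],  # hypothetical numeric column to aggregate
})

# Group by the year extracted through .dt, then sum only the numeric column.
df1 = df.groupby(df["modification_datetime"].dt.year)["value"].sum()
```

Selecting the numeric column before calling sum() is a deliberate choice here: it keeps the datetime column itself out of the aggregation, which recent pandas versions may refuse to sum.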

Python - pd.to_datetime does not convert object string to datetime, keeps being object

I have the following problem:
I'm trying to convert a column of a DataFrame from a string to a datetime object with the following code:
df = pd.read_csv('data.csv')
df['Time (CET)'] = pd.to_datetime(df['Time (CET)'])
This should be the standard pandas way to do so. But the dtype of the column doesn't change; it stays object, and no error or exception is raised.
The entries look like 2018-12-31 17:47:14+01:00.
If I apply pd.to_datetime with utc=True, it works completely fine, dtype changes from object to datetime64[ns, UTC]. Unfortunately I don't want to convert the time to UTC, only converting the string to a datetime object without any time zone changes.
Thanks a lot!
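One possible cause, not confirmed by the question, is that the column mixes UTC offsets (for example +01:00 in winter and +02:00 in summer), in which case pandas cannot give the column a single timezone-aware dtype. A hedged sketch of a workaround under that assumption: parse with utc=True, then convert back to a named zone. The zone "Europe/Berlin" and the second sample value are guesses for illustration.

```python
import pandas as pd

# Sample strings with two different offsets; the second value is hypothetical.
s = pd.Series(["2018-12-31 17:47:14+01:00", "2018-07-01 12:00:00+02:00"])

# Parse to UTC first, then convert to a named zone so wall-clock times are restored.
# "Europe/Berlin" is an assumption; use whichever zone the data was recorded in.
parsed = pd.to_datetime(s, utc=True).dt.tz_convert("Europe/Berlin")
```

The result is a proper timezone-aware datetime column whose displayed times match the original local times, even though the values passed through UTC on the way.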

How do I group date by month using pd.Grouper?

I've searched Stack Overflow to find out how to group a datetime column by month, and for some reason I keep receiving this error, even after I pass the dataframe through pd.to_datetime:
TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Int64Index'
df['Date'] = pd.to_datetime(df['Date'])
df['Date'].groupby(pd.Grouper(freq='M'))
When I pull the datatype for df['Date'] it shows dtype: datetime64[ns]. Any ideas why I keep getting this error?
The reason is simple: you didn't pass a groupby key to groupby.
What you want is to group the entire dataframe by the month values of the contents of df['Date'].
However, what df['Date'].groupby(pd.Grouper(freq='M')) actually does is first extract a pd.Series from the DataFrame's Date column. Then, it attempts to perform a groupby on that Series; without a specified key, it defaults to attempting to group by the index, which is of course numeric.
This will work:
df.groupby(pd.Grouper(key='Date', freq='M'))
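A runnable sketch of grouping through the key argument, with made-up rows and a hypothetical "amount" column. It uses the 'MS' (month start) alias rather than 'M', because recent pandas versions deprecate 'M' in favor of 'ME' for month-end frequencies; the grouping logic is otherwise the same.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": pd.to_datetime(["2021-01-05", "2021-01-20", "2021-02-03"]),
    "amount": [10, 20, 5],  # hypothetical column to aggregate
})

# key="Date" tells the Grouper to bin on that column instead of the integer index.
monthly = df.groupby(pd.Grouper(key="Date", freq="MS"))["amount"].sum()
```

With 'MS' each group is labeled by the first day of its month; a month-end alias would label it by the last day instead.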

How to apply to_datetime on pandas Dataframe column?

I have a dataframe with Timestamp entries in one column, created from strings like so:
df = pd.DataFrame({"x": pd.to_datetime("MARCH2016")})
Now I want to select from df based on month, cutting across years, by accessing the .month attribute of the datetime object. However, to_datetime actually created a Timestamp object from the string, and I can't seem to coerce it to datetime. The following works as expected:
type(df.x[0].to_datetime()) # gives datetime object
but using apply (which in my real life example of course I want to do given that I have more than one row) doesn't:
type(df.x.apply(pd.to_datetime)[0]) # returns Timestamp
What am I missing?
The fact that it's a Timestamp is irrelevant here; you can still access the month attribute using the .dt accessor:
In [79]:
df = pd.DataFrame({"x": [pd.to_datetime("MARCH2016")]})
df['x'].dt.month
Out[79]:
0 3
Name: x, dtype: int64
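To select rows by month across years, the .dt.month values can be used as a boolean mask. A minimal sketch with hypothetical dates:

```python
import pandas as pd

# Hypothetical rows spanning two years.
df = pd.DataFrame({"x": pd.to_datetime(["2015-03-01", "2016-03-01", "2016-07-01"])})

# Keep every row whose month is March, regardless of year.
march_rows = df[df["x"].dt.month == 3]
```

No conversion to datetime.datetime is needed for this, since pd.Timestamp is itself a subclass of datetime.datetime.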
