Remove the time from datetime.datetime in pandas column - python

I have a pandas column called 'date'
which has values and type like 2014-07-30 00:00:00 <class 'datetime.datetime'>.
I want to remove the time from the date. The end result should be `2014-07-30` in datetime.datetime format.
I tried a bunch of solutions like:
df['PSG Date '] = df['PSG Date '].dt.date
but it gives me this error:
AttributeError: Can only use .dt accessor with datetimelike values

I believe you first need to_datetime, and then dt.date for dates:
df['PSG Date '] = pd.to_datetime(df['PSG Date '], errors='coerce').dt.date
If you want datetimes with the time reset to midnight, use dt.floor:
df['PSG Date '] = pd.to_datetime(df['PSG Date '], errors='coerce').dt.floor('d')
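
To make the cause of the AttributeError concrete, here is a minimal sketch. The data is made up for illustration: an object-dtype series holding datetime.datetime values plus a missing entry, which is my assumption about what the question's column looks like. It shows the .dt accessor failing on the object column and working after pd.to_datetime.

```python
import datetime
import pandas as pd

# Hypothetical data: object dtype, datetime.datetime values and one gap.
col = pd.Series(
    [datetime.datetime(2014, 7, 30, 0, 0), datetime.datetime(2014, 7, 31, 8, 15), None],
    dtype=object,
    name='PSG Date ',
)

# The .dt accessor looks at the dtype, not at the individual values.
try:
    col.dt.date
    dt_accessor_failed = False
except AttributeError:
    dt_accessor_failed = True

# After conversion the series is datetime64 and the gap becomes NaT.
parsed = pd.to_datetime(col, errors='coerce')
dates_only = parsed.dt.date        # object series of datetime.date
midnights = parsed.dt.floor('D')   # still datetime64, time set to 00:00:00
```

Note that errors='coerce' silently turns anything unparseable into NaT, so it may be worth checking parsed.isna().sum() afterwards.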

First, you should begin with a datetime series; if you don't have one, use pd.to_datetime to force this conversion. This will permit vectorised computations:
df = pd.DataFrame({'col': ['2014-07-30 12:19:22', '2014-07-30 05:52:05',
'2014-07-30 20:15:00']})
df['col'] = pd.to_datetime(df['col'])
Next, note you cannot remove time from a datetime series in Pandas. By definition, a datetime series will include both "date" and "time" components.
Normalize time
You can use pd.Series.dt.floor or pd.Series.dt.normalize to reset the time component to 00:00:00:
df['col_floored'] = df['col'].dt.floor('d')
df['col_normalized'] = df['col'].dt.normalize()
print(df['col_floored'].iloc[0]) # 2014-07-30 00:00:00
print(df['col_normalized'].iloc[0]) # 2014-07-30 00:00:00
Convert to datetime.date pointers
You can convert your datetime series to an object series, consisting of datetime.date objects representing dates:
df['col_date'] = df['col'].dt.date
print(df['col_date'].iloc[0]) # 2014-07-30
Since these are not held in a contiguous memory block, operations on df['col_date'] will not be vectorised.
How to check the difference
It's useful to check the dtype for the series we have derived. Notice the one option which "removes" time involves converting your series to object.
Computations will be non-vectorised with such a series, since it consists of pointers to datetime.date objects instead of data in a contiguous memory block.
print(df.dtypes)
col datetime64[ns]
col_date object
col_floored datetime64[ns]
col_normalized datetime64[ns]
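
As a rough illustration of why keeping the datetime dtype can be convenient, the sketch below filters and groups by calendar day using the normalized column. The sample values and the variable names (day, mask, per_day) are my own, not from the answer above.

```python
import pandas as pd

df = pd.DataFrame({'col': pd.to_datetime(['2014-07-30 12:19:22',
                                          '2014-07-31 05:52:05',
                                          '2014-07-30 20:15:00'])})

# Midnight-normalized copy: same dtype, time component zeroed.
day = df['col'].dt.normalize()

# Comparison against a Timestamp is an ordinary vectorised operation.
mask = day == pd.Timestamp('2014-07-30')

# Counting rows per calendar day also works directly on the datetime series.
per_day = day.value_counts().sort_index()
```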

You can convert a datetime.datetime to a datetime.date by calling the object's .date() method, e.g.
import datetime
current_datetime = datetime.datetime.now()
date_only = current_datetime.date()

Related

Dataframe - Converting entire column from str object to datetime object - TypeError: strptime() argument 1 must be str, not Series

I want to convert values in entire column from strings to datetime objects, but I can't accomplish it with this code which works on solo strings i.e. (if I add .iloc[] and specify the index):
price_df_higher_interval['DateTime'] = datetime.datetime.strptime(price_df_higher_interval['DateTime'],
'%Y-%m-%d %H:%M:%S')
Also, I would like to avoid looping through the dataframe, but I don't know whether that will be necessary.
Thank you for your help :)
You could use the pd.to_datetime function.
df = pd.DataFrame({"str_date": ["2023-01-01 12:13:21", "2023-01-02 13:10:24"]})
df["date"] = pd.to_datetime(df["str_date"], format="%Y-%m-%d %H:%M:%S")
df.dtypes
str_date object
date datetime64[ns]
dtype: object
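
For what it's worth, here is a small sketch of why the original attempt fails and how the column-wide conversion compares. The frame name price_df and its values are invented for the example. strptime parses one string at a time, while pd.to_datetime takes the whole series.

```python
import datetime
import pandas as pd

fmt = '%Y-%m-%d %H:%M:%S'
# Hypothetical stand-in for the question's price_df_higher_interval.
price_df = pd.DataFrame({'DateTime': ['2023-01-01 12:13:21', '2023-01-02 13:10:24']})

# Works: a single string.
one = datetime.datetime.strptime(price_df['DateTime'].iloc[0], fmt)

# Fails: strptime does not accept a Series.
try:
    datetime.datetime.strptime(price_df['DateTime'], fmt)
    raised = False
except TypeError:
    raised = True

# Column-wide conversion without an explicit loop.
price_df['DateTime'] = pd.to_datetime(price_df['DateTime'], format=fmt)
```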

How to remove the time from datetime of the pandas Dataframe. The type of the column is str and objects, but the value is datetime [duplicate]

I have a variable consisting of 300k records with dates, and the dates look like
2015-02-21 12:08:51
From that date I want to remove the time.
The type of the date variable is pandas.core.series.Series.
This is what I tried:
from datetime import datetime,date
date_str = textdata['vfreceiveddate']
format_string = "%Y-%m-%d"
then = datetime.strftime(date_str,format_string)
some Random ERROR
In the above code, textdata is my dataset name and vfreceiveddate is a variable consisting of dates.
How can I write the code to remove the time from the datetime?
Assuming all your datetime strings are in a similar format then just convert them to datetime using to_datetime and then call the dt.date attribute to get just the date portion:
In [37]:
df = pd.DataFrame({'date':['2015-02-21 12:08:51']})
df
Out[37]:
date
0 2015-02-21 12:08:51
In [39]:
df['date'] = pd.to_datetime(df['date']).dt.date
df
Out[39]:
date
0 2015-02-21
EDIT
If you want to keep the datetime dtype and only reset the time to midnight (so that just the date is displayed), you can call dt.normalize:
In[10]:
df['date'] = pd.to_datetime(df['date']).dt.normalize()
df
Out[10]:
date
0 2015-02-21
You can see that the dtype remains as datetime:
In[11]:
df.dtypes
Out[11]:
date datetime64[ns]
dtype: object
You're calling datetime.datetime.strftime, which requires as its first argument a datetime.datetime instance, because it's an unbound method; but you're passing it a string instead of a datetime instance, whence the obvious error.
You can work purely at a string level if that's the result you want; with the data you give as an example, date_str.split()[0] for example would be exactly the 2015-02-21 string you appear to require.
Or, you can use datetime, but then you need to parse the string first, not format it -- hence, strptime, not strftime:
dt = datetime.strptime(date_str, '%Y-%m-%d %H:%M:%S')
date = dt.date()
if it's a datetime.date object you want (but if all you want is the string form of the date, such an approach might be "overkill":-).
Simply writing
date.strftime("%d-%m-%Y") will drop the hour, minute and second (the result is a string)
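
Since the question is about a whole Series rather than one string, here is a sketch of how the two ideas above (working at string level, or parsing and then formatting) might look column-wide. The sample rows are made up; textdata and vfreceiveddate are the names from the question.

```python
import pandas as pd

# Hypothetical sample of the question's data.
textdata = pd.DataFrame({'vfreceiveddate': ['2015-02-21 12:08:51',
                                            '2015-03-01 00:15:00']})

# Pure string route: keep the part before the first whitespace.
date_strings = textdata['vfreceiveddate'].str.split().str[0]

# Parse-then-format route: the result is again a series of strings.
parsed = pd.to_datetime(textdata['vfreceiveddate'], format='%Y-%m-%d %H:%M:%S')
formatted = parsed.dt.strftime('%Y-%m-%d')
```

The string route does no validation, so malformed entries pass through unchanged; the parsing route raises on them unless errors='coerce' is given.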

convert dataframe column from timestamp with timezone to timestamp type

I have a dataframe with 3 columns. The dataframe is created from Postgres table.
How can I do a conversion from timestamptz to timestamp please?
I did
df['StartTime'] = df["StartTime"].apply(lambda x: x.tz_localize(None))
example of data in the StartTime :
2013-09-27 14:19:46.825000+02:00
2014-02-07 10:52:25.392000+01:00
Thank you,
To give a more thorough answer, the point here is that in your example, you have timestamps with mixed UTC offsets. Without setting any keywords, pandas will convert the strings to datetime but leave the Series with object dtype holding native Python datetime values, not pandas (NumPy) datetime64. That makes it kind of hard to use built-in methods like tz_localize. But you can work your way around it. Ex:
import pandas as pd
# exemplary Series
StartTime = pd.Series(["2013-09-27 14:19:46.825000+02:00", "2014-02-07 10:52:25.392000+01:00"])
# make sure we have datetime Series
StartTime = pd.to_datetime(StartTime)
# notice the dtype:
print(type(StartTime.iloc[0]))
# <class 'datetime.datetime'>
# we also cannot use dt accessor:
# print(StartTime.dt.date)
# >>> AttributeError: Can only use .dt accessor with datetimelike values
# ...but we can use replace method of datetime object and remove tz info:
StartTime = StartTime.apply(lambda t: t.replace(tzinfo=None))
# now we have
StartTime
0 2013-09-27 14:19:46.825
1 2014-02-07 10:52:25.392
dtype: datetime64[ns]
# and can use e.g.
StartTime.dt.date
# 0 2013-09-27
# 1 2014-02-07
# dtype: object
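
Two variants that may be useful here, sketched under the assumption that the input is the same pair of strings as in the example: one keeps each value's local wall-clock time (as the replace approach does), the other first converts everything to UTC and then drops the zone. Which one is right depends on what the timestamp column is supposed to mean.

```python
import pandas as pd

StartTime = pd.Series(["2013-09-27 14:19:46.825000+02:00",
                       "2014-02-07 10:52:25.392000+01:00"])

# Variant 1: keep the wall-clock time of each value, drop the offset.
wall_clock = StartTime.apply(lambda s: pd.Timestamp(s).tz_localize(None))

# Variant 2: convert everything to UTC first, then drop the zone.
utc_naive = pd.to_datetime(StartTime, utc=True).dt.tz_localize(None)
```

With variant 1 the first value stays at 14:19:46.825; with variant 2 it becomes 12:19:46.825, because the +02:00 offset is applied before the zone is removed.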

Trying to convert object to DateTime, getting TypeError

I have two dataframes (see here), which contain dates and times.
The details for the first data frame are:
Date object
Time object
Channel1 float64
Channel2 float64
Channel3 float64
Channel4 float64
Channel5 float64
dtype: object
The details for the second data frame are:
Date object
Time object
Mean float64
STD float64
Min float64
Max float64
dtype: object
I am trying to convert the times to a DateTime object so that I can then do a calculation to make the time relative to the first time instance (i.e. the earliest time would become 0, and then all others would be seconds after the start).
When I try (from here):
df['Time'] = df['Time'].apply(pd.Timestamp)
I get this error:
TypeError: Cannot convert input [15:35:45] of type <class 'datetime.time'> to Timestamp
When I try (from here):
df['Time'] = pd.to_datetime(df['Time'])
but it gives me this error:
TypeError: <class 'datetime.time'> is not convertible to datetime
Any suggestions would be appreciated.
The reason you are getting the error
TypeError: <class 'datetime.time'> is not convertible to datetime
is literally what it says: your df['Time'] contains datetime.time objects, which cannot be converted to a datetime.datetime or Timestamp object, since both of those require a date component as well.
The solution is to combine df['Date'] and df['Time'] and then pass the result to pd.to_datetime. See the code sample below:
df = pd.DataFrame({'Date': ['3/11/2000', '3/12/2000', '3/13/2000'],
'Time': ['15:35:45', '18:35:45', '05:35:45']})
df['datetime'] = pd.to_datetime(df['Date'] + ' ' + df['Time'])
Output
Date Time datetime
0 3/11/2000 15:35:45 2000-03-11 15:35:45
1 3/12/2000 18:35:45 2000-03-12 18:35:45
2 3/13/2000 05:35:45 2000-03-13 05:35:45
In the end my solution was different for the two dataframes which I had.
For the first dataframe, the solution which combines the Date column with the Time column worked well:
df['Date Time'] = df['Date'] + ' ' + df['Time']
After the two columns are combined, the following code is used to turn the result into a datetime object (note that the format='%d/%m/%Y %H:%M:%S' part is required, because otherwise it confuses the day and month and uses US formatting, i.e. it thinks 11/12/2018 is the 12th of November and not the 11th of December):
df['Date Time'] = pd.to_datetime(df['Date Time'], format='%d/%m/%Y %H:%M:%S')
For my second dataframe, I went back to an earlier step in my data processing and found an option which saves the date and time to a single column directly. After that, the following code converted it to a datetime object:
df['Date Time'] = df['Date Time'].apply(pd.Timestamp)
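
For the original goal (time relative to the first sample, in seconds), a sketch along these lines might work. It assumes, as in the question, that the Time column holds datetime.time objects and that the Date column holds day-first strings; the sample values and the elapsed_s column name are made up.

```python
import datetime
import pandas as pd

# Hypothetical data: day-first date strings plus datetime.time objects.
df = pd.DataFrame({
    'Date': ['11/03/2000', '11/03/2000', '11/03/2000'],
    'Time': [datetime.time(15, 35, 45),
             datetime.time(15, 36, 0),
             datetime.time(15, 40, 45)],
})

# str(datetime.time) gives 'HH:MM:SS' when there are no microseconds.
combined = df['Date'].astype(str) + ' ' + df['Time'].astype(str)
df['Date Time'] = pd.to_datetime(combined, format='%d/%m/%Y %H:%M:%S')

# Seconds elapsed since the earliest timestamp.
df['elapsed_s'] = (df['Date Time'] - df['Date Time'].min()).dt.total_seconds()
```

If the times carry microseconds, the format string would need a '.%f' suffix, since str() then includes the fractional part.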

How to convert timedelta to time of day in pandas?

I have a SQL table that contains data of the MySQL time type as follows:
time_of_day
-----------
12:34:56
I then use pandas to read the table in:
df = pd.read_sql('select * from time_of_day', engine)
Looking at df.dtypes yields:
time_of_day timedelta64[ns]
My main issue is that, when writing my df to a csv file, the data comes out all messed up, instead of essentially looking like my SQL table:
time_of_day
0 days 12:34:56.000000000
I'd like to instead (obviously) store this record as a time, but I can't find anything in the pandas docs that talks about a time dtype.
Does pandas lack this functionality intentionally? Is there a way to solve my problem without requiring janky data casting?
Seems like this should be elementary, but I'm confounded.
Pandas does not support a time dtype series
Pandas (and NumPy) do not have a time dtype. Since you wish to avoid Pandas timedelta, you have 3 options: Pandas datetime, Python datetime.time, or Python str. Below they are presented in order of preference. Let's assume you start with the following dataframe:
df = pd.DataFrame({'time': pd.to_timedelta(['12:34:56', '05:12:45', '15:15:06'])})
print(df['time'].dtype) # timedelta64[ns]
Pandas datetime series
You can use Pandas datetime series and include an arbitrary date component, e.g. today's date. Underlying such a series are integers, which makes this solution the most efficient and adaptable.
The default date, if unspecified, is 1-Jan-1970:
df['time'] = pd.to_datetime(df['time'])
print(df)
# time
# 0 1970-01-01 12:34:56
# 1 1970-01-01 05:12:45
# 2 1970-01-01 15:15:06
You can also specify a date, such as today (starting again from the original timedelta series):
df['time'] = pd.Timestamp('today').normalize() + df['time']
print(df)
# time
# 0 2019-01-02 12:34:56
# 1 2019-01-02 05:12:45
# 2 2019-01-02 15:15:06
Pandas object series of Python datetime.time values
The Python datetime module from the standard library supports datetime.time objects. You can convert your series to an object dtype series containing pointers to a sequence of datetime.time objects. Operations will no longer be vectorised, but each underlying value will be represented internally by a number.
df['time'] = pd.to_datetime(df['time']).dt.time
print(df)
# time
# 0 12:34:56
# 1 05:12:45
# 2 15:15:06
print(df['time'].dtype)
# object
print(type(df['time'].at[0]))
# <class 'datetime.time'>
Pandas object series of Python str values
Converting to strings is only recommended for presentation purposes that are not supported by other types, e.g. Pandas datetime or Python datetime.time. For example:
df['time'] = pd.to_datetime(df['time']).dt.strftime('%H:%M:%S')
print(df)
# time
# 0 12:34:56
# 1 05:12:45
# 2 15:15:06
print(df['time'].dtype)
# object
print(type(df['time'].at[0]))
# <class 'str'>
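
A sketch of the three options that always starts from the timedelta series and adds it to an explicit anchor date, rather than passing the timedelta to pd.to_datetime. I am assuming here that some pandas versions refuse that direct conversion; adding a Timestamp sidesteps the question. The variable names are my own.

```python
import datetime
import pandas as pd

df = pd.DataFrame({'time': pd.to_timedelta(['12:34:56', '05:12:45', '15:15:06'])})

# Option 1: datetime series anchored at an explicit (arbitrary) date.
as_datetime = pd.Timestamp('1970-01-01') + df['time']

# Option 2: object series of datetime.time values.
as_time = as_datetime.dt.time

# Option 3: object series of 'HH:MM:SS' strings, e.g. for writing to CSV.
as_str = as_datetime.dt.strftime('%H:%M:%S')
```

Each option is derived from the same timedelta column, so they can be tried independently without re-creating the frame.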
It's a hack, but you can pull out the components to create a string and convert that string to a datetime.time object:
from datetime import datetime

def convert(td):
    # Build 'H:M:S' from the Timedelta components, then parse it as a time.
    time = [str(td.components.hours), str(td.components.minutes),
            str(td.components.seconds)]
    return datetime.strptime(':'.join(time), '%H:%M:%S').time()

df['time'] = df['time'].apply(convert)
Found a solution, but I feel like there's got to be something more elegant than this:
def convert(x):
    return pd.to_datetime(x).strftime('%H:%M:%S')
df['time_of_day'] = df['time_of_day'].apply(convert)
df['time_of_day'] = pd.to_datetime(df['time_of_day']).apply(lambda x: x.time())
Adapted this code
