How to convert a pandas datetime column from UTC to EST - python

There is another question that is eleven years old with a similar title.
I have a pandas dataframe with a column of datetime.time values.
val time
a 12:30:01.323
b 12:48:04.583
c 14:38:29.162
I want to convert the time column from UTC to EST.
I tried to do dataframe.tz_localize('utc').tz_convert('US/Eastern') but it gave me the following error: RangeIndex Object has no attribute tz_localize

tz_localize and tz_convert work on the index of the DataFrame, so you can do the following:
1. Convert the "time" column to Timestamp format.
2. Set the "time" column as the index and apply the conversion functions.
3. Call reset_index().
4. Keep only the time.
Try:
dataframe["time"] = pd.to_datetime(dataframe["time"],format="%H:%M:%S.%f")
output = (dataframe.set_index("time")
.tz_localize("utc")
.tz_convert("US/Eastern")
.reset_index()
)
output["time"] = output["time"].dt.time
>>> output
time val
0 15:13:12.349211 a
1 15:13:13.435233 b
2 15:13:14.345233 c
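The same conversion can also be applied to the column directly through the .dt accessor, without moving it into the index. A minimal sketch, assuming the times belong to a known calendar date (the date 2023-01-15 and the column name time_est are hypothetical, added only for illustration). A date matters because the US/Eastern offset depends on daylight saving time:

```python
import pandas as pd

df = pd.DataFrame({
    "val": ["a", "b", "c"],
    "time": ["12:30:01.323", "12:48:04.583", "14:38:29.162"],
})

# Assumed date (not in the question): prepend it so the UTC offset is well defined.
stamps = pd.to_datetime("2023-01-15 " + df["time"], format="%Y-%m-%d %H:%M:%S.%f")

# The Series-level counterparts of tz_localize / tz_convert live under .dt
df["time_est"] = (
    stamps.dt.tz_localize("UTC")
          .dt.tz_convert("US/Eastern")
          .dt.time
)
print(df)
```

In January the US/Eastern offset is UTC-5, so 12:30:01.323 UTC becomes 07:30:01.323.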

to_datetime accepts an argument utc (bool) which, when True, makes the parsed timestamps timezone-aware in UTC.
Applied to a column, to_datetime returns a datetime Series whose .dt accessor exposes tz_convert (a DatetimeIndex has the same method directly). This method converts tz-aware timestamps from one timezone to another.
So, this transformation could be concisely written as
df = pd.DataFrame(
[['a', '12:30:01.323'],
['b', '12:48:04.583'],
['c', '14:38:29.162']],
columns=['val', 'time']
)
df['time'] = pd.to_datetime(df.time, utc=True, format='%H:%M:%S.%f')
# convert string to timezone aware field ^^^
df['time'] = df.time.dt.tz_convert('EST').dt.time
# convert timezone, discarding the date part ^^^
This produces the following dataframe:
val time
0 a 07:30:01.323000
1 b 07:48:04.583000
2 c 09:38:29.162000
This could also be a 1-liner as below:
pd.to_datetime(df.time, utc=True, format='%H:%M:%S.%f').dt.tz_convert('EST').dt.time
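Note that 'EST' is a fixed UTC-5 zone, while 'US/Eastern' follows daylight saving time, so the two can disagree for summer dates. A small sketch of the difference, assuming a hypothetical July timestamp that is not part of the question:

```python
import pandas as pd

# Hypothetical summer timestamp, chosen only to show the daylight saving effect.
ts = pd.Series(pd.to_datetime(["2021-07-01 12:30:01.323"], utc=True))

fixed = ts.dt.tz_convert("EST")         # always UTC-5
zone = ts.dt.tz_convert("US/Eastern")   # UTC-4 while daylight saving is in effect

print(fixed.dt.time.iloc[0])  # 07:30:01.323000
print(zone.dt.time.iloc[0])   # 08:30:01.323000
```

Which one is appropriate depends on whether "EST" in the question means the fixed offset or local time in the US Eastern zone.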

import pandas as pd

list_temp = []
for row in df['time_UTC']:
    # interpret each value as UTC, then convert it to US Eastern time
    list_temp.append(pd.Timestamp(row, tz='UTC').tz_convert('US/Eastern'))
df['time_EST'] = list_temp

Related

pandas convert object to time format

I have a dataframe time column with object datatype and would like to convert time format for graph.
import pandas as pd
df = pd.DataFrame({
"time":["12:30:31.320"]
})
df["time"]
df['time'] = pd.to_datetime(df['time'],format='%H:%M:%S.%f').dt.strftime('%H:%M:%S')
df['time'] # Output Name: time, dtype: object
To keep Python's time instance, you can use:
df['time'] = (pd.to_datetime(df['time'], format='%H:%M:%S.%f')
              .dt.floor('s')  # remove milliseconds (the 'S' alias is deprecated in recent pandas)
              .dt.time)       # keep time part
Output:
>>> df['time']
0 12:30:31
Name: time, dtype: object # the dtype is object but...
>>> df.loc[0, 'time']
datetime.time(12, 30, 31) # ...each element is a datetime.time object
You appear to be attempting to convert the 'time' column back to a string in the format '%H:%M:%S' after converting it to datetime.
You may accomplish this by using the dt.strftime function.
However, after converting back to strings, the dtype of df['time'] is still reported as object, because that is how pandas has traditionally stored Python strings.
Calling astype(str) does not change this; to get the dedicated pandas string dtype, use astype("string") instead:
df['time'] = df['time'].astype("string")
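A short sketch of the steps described above, using the sample value from the question. The variable names formatted and as_string are illustrative, and the reported dtypes may vary with the pandas version:

```python
import pandas as pd

df = pd.DataFrame({"time": ["12:30:31.320"]})

# Parse, then format back to text without the fractional seconds.
formatted = pd.to_datetime(df["time"], format="%H:%M:%S.%f").dt.strftime("%H:%M:%S")

# Opt in to the dedicated pandas string dtype.
as_string = formatted.astype("string")

print(formatted.iloc[0])   # 12:30:31
print(as_string.dtype)     # string
```

For plotting, it is often simpler to keep the parsed datetime values and let the plotting library format the axis, rather than converting back to text.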

converting float number into datetime format

I have a file that contains a DateTime in float format, for example 14052020175648.000000.
I want to convert this into 14-05-2020 and drop the time part.
input ==> 14052020175648.000000
expected output ==> 14-05-2020
Use pd.to_datetime:
df = pd.DataFrame({'Timestamp': ['14052020175648.000000']})
df['Date'] = pd.to_datetime(df['Timestamp'].astype(str).str[:8], format='%d%m%Y')
print(df)
# Output:
Timestamp Date
0 14052020175648.000000 2020-05-14
I used astype(str) in case Timestamp is a float number rather than a string; it is not needed if your column already contains strings.
This can solve your problem
from datetime import datetime
string = "14052020175648.000000"
yourDate = datetime.strptime(string[:8], '%d%m%Y').strftime("%d-%m-%Y")
print(yourDate)
Output:
14-05-2020
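If the values really are floats, a day below 10 loses its leading zero, and slicing the first eight characters then picks up the wrong digits. A hedged sketch that pads the value back to 14 digits first, assuming the layout is always ddmmYYYYHHMMSS (the second sample value is made up for illustration):

```python
import pandas as pd

# The second value is hypothetical: 4 May 2020, whose leading zero is lost as a float.
df = pd.DataFrame({"Timestamp": [14052020175648.0, 4052020175648.0]})

# Render without the fractional part, then left-pad to 14 digits.
as_text = df["Timestamp"].map("{:.0f}".format).str.zfill(14)

# Parse the first eight characters (ddmmYYYY) and format as dd-mm-YYYY.
df["Date"] = pd.to_datetime(as_text.str[:8], format="%d%m%Y").dt.strftime("%d-%m-%Y")
print(df["Date"].tolist())  # ['14-05-2020', '04-05-2020']
```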

How to remove the time from datetime of the pandas Dataframe. The type of the column is str and objects, but the value is datetime [duplicate]

I have a variable consisting of 300k records with dates, and the dates look like
2015-02-21 12:08:51
From that date I want to remove the time.
The type of the date variable is pandas.core.series.Series.
This is what I tried:
from datetime import datetime,date
date_str = textdata['vfreceiveddate']
format_string = "%Y-%m-%d"
then = datetime.strftime(date_str,format_string)
some Random ERROR
In the above code, textdata is my dataset name and vfreceiveddate is a variable consisting of dates.
How can I write the code to remove the time from the datetime?
Assuming all your datetime strings are in a similar format then just convert them to datetime using to_datetime and then call the dt.date attribute to get just the date portion:
In [37]:
df = pd.DataFrame({'date':['2015-02-21 12:08:51']})
df
Out[37]:
date
0 2015-02-21 12:08:51
In [39]:
df['date'] = pd.to_datetime(df['date']).dt.date
df
Out[39]:
date
0 2015-02-21
EDIT
If you just want to change the display and not the dtype then you can call dt.normalize:
In[10]:
df['date'] = pd.to_datetime(df['date']).dt.normalize()
df
Out[10]:
date
0 2015-02-21
You can see that the dtype remains as datetime:
In[11]:
df.dtypes
Out[11]:
date datetime64[ns]
dtype: object
You're calling datetime.datetime.strftime, which requires as its first argument a datetime.datetime instance, because it's an unbound method; but you're passing it a string instead of a datetime instance, whence the obvious error.
You can work purely at a string level if that's the result you want; with the data you give as an example, date_str.split()[0] for example would be exactly the 2015-02-21 string you appear to require.
Or, you can use datetime, but then you need to parse the string first, not format it -- hence, strptime, not strftime:
dt = datetime.strptime(date_str, '%Y-%m-%d %H:%M:%S')
date = dt.date()
if it's a datetime.date object you want (but if all you want is the string form of the date, such an approach might be "overkill":-).
If date is already a datetime object, simply writing date.strftime("%d-%m-%Y") will drop the hour, minute and second and return the date as a string.
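Since the question's variable is a whole Series of strings rather than a single string, the string-level approach can be applied through the .str accessor. A minimal sketch, assuming every value has the form "YYYY-MM-DD HH:MM:SS" (the second sample row is invented for illustration):

```python
import pandas as pd

# Stand-in for the question's dataset; the second row is a made-up example.
textdata = pd.DataFrame({
    "vfreceiveddate": ["2015-02-21 12:08:51", "2015-03-01 08:00:00"],
})

# Split each string on whitespace and keep the first piece (the date).
date_only = textdata["vfreceiveddate"].str.split().str[0]
print(date_only.tolist())  # ['2015-02-21', '2015-03-01']
```

This keeps the values as strings; use pd.to_datetime(...).dt.date instead if date objects are needed.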

Slicing a string to filter a pandas dataframe

Should be an easy one, just not getting anywhere with it after looking at any existing examples.
I'm trying to filter a df where a date/time in my df equals a date/time I have in another variable called "date".
Both of these are stored as strings.
The format of df['DATE'] is like this:
2017/11/28 14:19:58
The format of date is like this:
11/28/2017 14:19
I want these to return a match.
df = df[df['DATE'][:-3] == date]
Error I get is this:
raise IndexingError('Unalignable boolean Series provided as '
pandas.core.indexing.IndexingError: Unalignable boolean Series provided
as indexer (index of the boolean Series and of the indexed object do not match
Seems like interpreter treats it as I am referencing the df position, not slicing the string within.
You need to use the pd.Series.str accessor for slicing:
import pandas as pd
from datetime import datetime
s = pd.Series(['2016/09/25 12:29:18', '2017/11/28 14:19:58', '2018/01/02 03:35:12'])
date = '11/28/2017 14:19'
res = (s.str[:-3] == datetime.strptime(date, '%m/%d/%Y %H:%M').strftime('%Y/%m/%d %H:%M'))
print(res)
0 False
1 True
2 False
dtype: bool
df
DATE
0 2017/11/21 14:19:58
1 2017/11/20 14:19:58
2 2017/11/21 12:19:58
date = '11/20/2017 14:19'
df[pd.to_datetime(df['DATE'], format='%Y/%m/%d %H:%M:%S').dt.strftime('%m/%d/%Y %H:%M') == date]
DATE
1 2017/11/20 14:19:58
You can convert either of them or both if you would like to do any other datetime based operations.
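Converting both sides to datetimes avoids depending on string layout altogether. A sketch of that variant, assuming the two formats shown in the question; the names target, parsed and matches are illustrative:

```python
import pandas as pd

df = pd.DataFrame({"DATE": ["2017/11/21 14:19:58",
                            "2017/11/20 14:19:58",
                            "2017/11/21 12:19:58"]})
date = "11/20/2017 14:19"

# Parse both sides with explicit formats.
target = pd.to_datetime(date, format="%m/%d/%Y %H:%M")
parsed = pd.to_datetime(df["DATE"], format="%Y/%m/%d %H:%M:%S")

# Drop the seconds by flooring to the minute, then compare.
matches = df[parsed.dt.floor("min") == target]
print(matches)
```

Only the row at index 1 is returned, since it is the only one falling within the 14:19 minute on 2017-11-20.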

Convert float64 column to datetime pandas

I have the following pandas DataFrame column dfA['TradeDate']:
0 20100329.0
1 20100328.0
2 20100329.0
...
and I wish to transform it to a datetime.
Based on another thread on SO, I first convert it to a string and then apply the strptime function.
dfA['TradeDate'] = datetime.datetime.strptime( dfA['TradeDate'].astype('int').to_string() ,'%Y%m%d')
However, this returns an error saying that my format is incorrect (ValueError).
One issue I spotted is that the column is not converted to a string, but to an object.
When I try:
dfA['TradeDate'] = datetime.datetime.strptime( dfA['TradeDate'].astype(int).astype(str),'%Y%m%d')
It returns: must be a Str and not Series.
You can use:
df['TradeDate'] = pd.to_datetime(df['TradeDate'], format='%Y%m%d.0')
print (df)
TradeDate
0 2010-03-29
1 2010-03-28
2 2010-03-29
If the column contains some bad values, add errors='coerce' to replace them with NaT:
print (df)
TradeDate
0 20100329.0
1 20100328.0
2 20100329.0
3 20153030.0
4 yyy
df['TradeDate'] = pd.to_datetime(df['TradeDate'], format='%Y%m%d.0', errors='coerce')
print (df)
TradeDate
0 2010-03-29
1 2010-03-28
2 2010-03-29
3 NaT
4 NaT
You can use to_datetime with a custom format on a string representation of the values:
import pandas as pd
pd.to_datetime(pd.Series([20100329.0, 20100328.0, 20100329.0]).astype(str), format='%Y%m%d.0')
The strptime function works on a single value, not on a Series. You need to apply that function to each element of the column.
Try the pandas.to_datetime method, e.g.
import datetime
import pandas
dfA = pandas.DataFrame({"TradeDate": [20100329.0, 20100328.0]})
# cast to int and then to str first, so the trailing ".0" does not break the format
pandas.to_datetime(dfA['TradeDate'].astype(int).astype(str), format="%Y%m%d")
or
dfA['TradeDate'].astype(int).astype(str)\
    .apply(lambda x: datetime.datetime.strptime(x, '%Y%m%d'))
In your first attempt you converted the column to a string and passed it to strptime, which resulted in a ValueError. This happens because dfA['TradeDate'].astype('int').to_string() creates a single string containing all dates as well as their row numbers. You can change this to
dates = dfA['TradeDate'].astype('int').to_string(index=False).split()
dates
['20100329', '20100328', '20100329']
to get a list of dates. Then use a Python list comprehension to convert each element to datetime:
dfA['TradeDate'] = [datetime.datetime.strptime(x, '%Y%m%d') for x in dates]
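A float64 column often signals missing values, and astype(int) fails on NaN. One possible way to handle that, sketched under the assumption that every non-missing value is a valid YYYYMMDD number (the NaN row is added here for illustration):

```python
import pandas as pd

# The NaN row is hypothetical; it stands in for a missing trade date.
dfA = pd.DataFrame({"TradeDate": [20100329.0, 20100328.0, float("nan")]})

# Convert only the rows that have a value, via int and then str.
valid = dfA["TradeDate"].notna()
text = dfA.loc[valid, "TradeDate"].astype(int).astype(str)

# Parse, then reindex so the missing rows come back as NaT.
dfA["TradeDate"] = pd.to_datetime(text, format="%Y%m%d").reindex(dfA.index)
print(dfA)
```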
