'Index' object has no attribute 'tz_localize' - python

I'm trying to convert all instances of 'GMT' time in a time/date column ('Created_At') in a csv file so that it is all formatted in 'EST'.
Please see below:
import pandas as pd
from pandas.tseries.resample import TimeGrouper
from pandas.tseries.offsets import DateOffset
from pandas.tseries.index import DatetimeIndex
cambridge = pd.read_csv('\Users\cgp\Desktop\Tweets.csv')
cambridge['Created_At'] = pd.to_datetime(pd.Series(cambridge['Created_At']))
cambridge.set_index('Created_At', drop=False, inplace=True)
cambridge.index = cambridge.index.tz_localize('GMT').tz_convert('EST')
cambridge.index = cambridge.index - DateOffset(hours = 12)
The error I'm getting is:
cambridge.index = cambridge.index.tz_localize('GMT').tz_convert('EST')
AttributeError: 'Index' object has no attribute 'tz_localize'
I've tried various different things but am stumped as to why the Index object won't recognized the tz_attribute. Thank you so much for your help!

Replace
cambridge.set_index('Created_At', drop=False, inplace=True)
with
cambridge.set_index(pd.DatetimeIndex(cambridge['Created_At']), drop=False, inplace=True)

Hmm. Like the other tz_localize current problem, this works fine for me. Does this work for you? I have simplified some of the calls a bit from your example:
df2 = pd.DataFrame(randn(3, 3), columns=['A', 'B', 'C'])
# randn(3,3) returns nine random numbers in a 3x3 array.
# the columns argument to DataFrame names the 3 columns.
# no datetimes here! (look at df2 to check)
df2['A'] = pd.to_datetime(df2['A'])
# convert the random numbers to datetimes -- look at df2 again
# if A had values to_datetime couldn't handle, we'd clean up A first
df2.set_index('A',drop=False, inplace=True)
# and use that column as an index for the whole df2;
df2.index = df2.index.tz_localize('GMT').tz_convert('US/Eastern')
# make it timezone-conscious in GMT and convert that to Eastern
df2.index.tzinfo
<DstTzInfo 'US/Eastern' LMT-1 day, 19:04:00 STD>

Related

pandas convert object to time format

I have a dataframe time column with object datatype and would like to convert time format for graph.
import pandas as pd
df = pd.DataFrame({
"time":["12:30:31.320"]
})
df["time"]
df['time'] = pd.to_datetime(df['time'],format='%H:%M:%S.%f').dt.strftime('%H:%M:%S')
df['time'] # Output Name: time, dtype: object
To keep Python's time instance, you can use:
df['time'] = (pd.to_datetime(df['time'],format='%H:%M:%S.%f')
.dt.floor('S') # remove milliseconds
.dt.time) # keep time part
Output:
>>> df['time']
0 12:30:31
Name: time, dtype: object # the dtype is object but...
>>> df.loc[0, 'time']
datetime.time(12, 30, 31) # ...contain a list of time objects
You appear to be attempting to convert the 'time' column back to a string in the format '%H:%M:%S' after converting it to datetime.
You may accomplish this by using the dt.strftime function.
However, after converting back to string, the output of df['time'] is still of object data type.
You may use the astype method to convert the data type of this column to string:
df['time'] = df['time'].astype(str)

How to convert a pandas datetime column from UTC to EST

There is another question that is eleven years old with a similar title.
I have a pandas dataframe with a column of datetime.time values.
val time
a 12:30:01.323
b 12:48:04.583
c 14:38:29.162
I want to convert the time column from UTC to EST.
I tried to do dataframe.tz_localize('utc').tz_convert('US/Eastern') but it gave me the following error: RangeIndex Object has no attribute tz_localize
tz_localize and tz_convert work on the index of the DataFrame. So you can do the following:
convert the "time" to Timestamp format
set the "time" column as index and use the conversion functions
reset_index()
keep only the time
Try:
dataframe["time"] = pd.to_datetime(dataframe["time"],format="%H:%M:%S.%f")
output = (dataframe.set_index("time")
.tz_localize("utc")
.tz_convert("US/Eastern")
.reset_index()
)
output["time"] = output["time"].dt.time
>>> output
time val
0 15:13:12.349211 a
1 15:13:13.435233 b
2 15:13:14.345233 c
to_datetime accepts an argument utc (bool) which, when true, coerces the timestamp to utc.
to_datetime returns a DateTimeIndex, which has a method tz_convert. this method will convert tz-aware timestamps from one timezeone to another.
So, this transformation could be concisely written as
df = pd.DataFrame(
[['a', '12:30:01.323'],
['b', '12:48:04.583'],
['c', '14:38:29.162']],
columns=['val', 'time']
)
df['time'] = pd.to_datetime(df.time, utc=True, format='%H:%M:%S.%f')
# convert string to timezone aware field ^^^
df['time'] = df.time.dt.tz_convert('EST').dt.time
# convert timezone, discarding the date part ^^^
This produces the following dataframe:
val time
0 a 07:30:01.323000
1 b 07:48:04.583000
2 c 09:38:29.162000
This could also be a 1-liner as below:
pd.to_datetime(df.time, utc=True, format='%H:%M:%S.%f').dt.tz_convert('EST').dt.time
list_temp = []
for row in df['time_UTC']:
list_temp.append(Timestamp(row, tz = 'UTC').tz_convert('US/Eastern'))
df['time_EST'] = list_temp

How do I make a column with iso weeks?

I have a data frame (df) with a column of dates (DATUM).
i then try to make a new column with iso weeks on the dates.
But as a beginner in python, I have run into a problem.
When I try to use:
df ['iso_week_num'] = df ["DATUM"]. isocalendar () [1]
I get the following error message:
AttributeError: 'Series' object has no attribute 'isocalendar'
What am I doing wrong?
Notice that isocalendar must be applied to a timestamp (check documentation) - it is not an attribute as the error raised. Said that, this should work for your case:
# Generating some data (you may use rand too):
dates = ['2021-01-03', '2021-01-04', '2021-01-05', '2021-01-06', '2021-01-07']
dataframe = pd.DataFrame(data = dates, columns = ['date'])
dataframe['date'] = pd.to_datetime(dataframe['date'])
# Applying isocalendar for each element in the series:
dataframe['year'] = dataframe['date'].apply(lambda x: x.isocalendar()[0])
You should use pandas.Series.dt() to access object for datetimelike properties of the Series values.
df['week'] = df['date'].dt.isocalendar().week

I have a date column in a dataframe. I want to change the format of the dates,in that column

I have a date column in a dataset where the dates are like 'Apr-12','Jan-12' format. I would like to change the format to 04-2012,01-2012. I am looking for a function which can do this.
I think I know one guy with the same name. Jokes apart here is the solution to your problem.
We do have an inbuilt function named as strptime(), so it takes up the string and then convert into the format you want.
You need to import datetime first since it is the part of the datetime package of python. Don't no need to install anything, just import it.
Then this works like this: datetime.strptime(your_string, format_you_want)
# You can also do this, from datetime import * (this imports all the functions of datetime)
from datetime import datetime
str = 'Apr-12'
date_object = datetime.strptime(str, '%m-%Y')
print(date_object)
I hope this will work for you. Happy coding :)
You can do following:
import pandas as pd
df = pd.DataFrame({
'date': ['Apr-12', 'Jan-12', 'May-12', 'March-13', 'June-14']
})
pd.to_datetime(df['date'], format='%b-%y')
This will output:
0 2012-04-01
1 2012-01-01
2 2012-05-01
Name: date, dtype: datetime64[ns]
Which means you can update your date column right away:
df['date'] = pd.to_datetime(df['date'], format='%b-%y')
You can chain a couple of pandas methods together to get this the desired output:
df = pd.DataFrame({'date_fmt':['Apr-12','Jan-12']})
df
Input dataframe:
date_fmt
0 Apr-12
1 Jan-12
Use pd.to_datetime chained with .dt date accessor and strftime
pd.to_datetime(df['date_fmt'], format='%b-%y').dt.strftime('%m-%Y')
Output:
0 04-2012
1 01-2012
Name: date_fmt, dtype: object

Select Pandas dataframe rows based on 'hour' datetime

I have a pandas dataframe 'df' with a column 'DateTimes' of type datetime.time.
The entries of that column are hours of a single day:
00:00:00
.
.
.
23:59:00
Seconds are skipped, it counts by minutes.
How can I choose rows by hour, for example the rows between 00:00:00 and 00:01:00?
If I try this:
df.between_time('00:00:00', '00:00:10')
I get an error that index must be a DateTimeIndex.
I set the index as such with:
df=df.set_index(keys='DateTime')
but I get the same error.
I can't seem to get 'loc' to work either. Any suggestions?
Here a working example of what you are trying to do:
times = pd.date_range('3/6/2012 00:00', periods=100, freq='S', tz='UTC')
df = pd.DataFrame(np.random.randint(10, size=(100,1)), index=times)
df.between_time('00:00:00', '00:00:30')
Note the index has to be of type DatetimeIndex.
I understand you have a column with your dates/times. The problem probably is that your column is not of this type, so you have to convert it first, before setting it as index:
# Method A
df.set_index(pd.to_datetime(df['column_name'], drop=True)
# Method B
df.index = pd.to_datetime(df['column_name'])
df = df.drop('col', axis=1)
(The drop is only necessary if you want to remove the original column after setting it as index)
Check out these links:
convert column to date type: Convert DataFrame column type from string to datetime
filter dataframe on dates: Filtering Pandas DataFrames on dates
Hope this helps

Categories