I am trying to merge two pandas dataframes, and to do this I want to make it so that they both have the same index. The problem is, one df has an index of datatype object which just includes the date while the other df has an index of datatype datetime64[ns] which includes the date and time. Is there a way to make these both the same data type so that I can merge the two dataframes?
Convert both date types into a pandas datetime format and convert them with having just dates.
df['date_only'] = df['dates'].dt.date
You could convert a date and time format to just date as below
import pandas as pd
date_n_time='2015-01-08 22:44:09'
date=pd.to_datetime(date_n_time).date()
make your index as a column using
df.reset_index()
set it back to index using
df.set_index()
Related
I am trying to convert a datetime datatype of the form 24/12/2021 07:24:00 to mm-yyyy format which is 12-2021 with datetime datatype. I need the mm-yyyy in datetime format in order to sort the column 'Month-Year' in a time series. I have tried
import pandas as pd
from datetime import datetime
df = pd.read_excel('abc.xlsx')
df['Month-Year'] = df['Due Date'].map(lambda x: x.strftime('%m-%y'))
df.set_index(['ID', 'Month-Year'], inplace=True)
df.sort_index(inplace=True)
df
The column 'Month-Year' does not sort in time series because 'Month-Year' is of object datatype. How do I please convert 'Month-Year' column to datetime datatype?
I have been able to obtain a solution to the problem.
df['month_year'] = pd.to_datetime(df['Due Date']).dt.to_period('M')
I got this from the link below
https://www.interviewqs.com/ddi-code-snippets/extract-month-year-pandas
df['Month-Year']=pd.to_datetime(df['Month-Year']).dt.normalize()
will convert the Month-Year to datetime64[ns].
Use it before sorting.
Basically i am sas developer.
As of now i am doing sas2python migrations.
Before reading to pandas dataframe i have two columns ie,
DATE NAME
01JAN1988 VARUN
11JAN1999 THARUN
After reading to pandas dataframe the DATE columns is automatically read as float values. Now I need to show it as DATE Columns as date9 format
Could you please provide the steps
you can use apply function to convert the values into date objects and datetime module to covert them:
df['DATE'] = df['DATE'].apply(lambda x: datetime.datetime.strptime(x,'%d%b%Y').date())
Output:
DATE NAME
0 1988-01-01 VARUN
1 1999-01-11 THARUN
I have a pandas dataframe 'df' with a column 'DateTimes' of type datetime.time.
The entries of that column are hours of a single day:
00:00:00
.
.
.
23:59:00
Seconds are skipped, it counts by minutes.
How can I choose rows by hour, for example the rows between 00:00:00 and 00:01:00?
If I try this:
df.between_time('00:00:00', '00:00:10')
I get an error that index must be a DateTimeIndex.
I set the index as such with:
df=df.set_index(keys='DateTime')
but I get the same error.
I can't seem to get 'loc' to work either. Any suggestions?
Here a working example of what you are trying to do:
times = pd.date_range('3/6/2012 00:00', periods=100, freq='S', tz='UTC')
df = pd.DataFrame(np.random.randint(10, size=(100,1)), index=times)
df.between_time('00:00:00', '00:00:30')
Note the index has to be of type DatetimeIndex.
I understand you have a column with your dates/times. The problem probably is that your column is not of this type, so you have to convert it first, before setting it as index:
# Method A
df.set_index(pd.to_datetime(df['column_name'], drop=True)
# Method B
df.index = pd.to_datetime(df['column_name'])
df = df.drop('col', axis=1)
(The drop is only necessary if you want to remove the original column after setting it as index)
Check out these links:
convert column to date type: Convert DataFrame column type from string to datetime
filter dataframe on dates: Filtering Pandas DataFrames on dates
Hope this helps
I've seen several articles about using datetime and dateutil to convert into datetime objects.
However, I can't seem to figure out how to convert a column into a datetime object so I can pivot out that columns and perform operations against it.
I have a dataframe as such:
Col1 Col 2
a 1/1/2013
a 1/12/2013
b 1/5/2013
b 4/3/2013 ....etc
What I want is :
pivott = pivot_table( df, rows ='Col1', values='Col2', and then I want to get the range of dates for each value in Col1)
I am not sure how to correctly approach this. Even after using
df['Col2']= pd.to_datetime(df['Col2'])
I couldn't do operations against the dates since they are strings...
Any advise?
Use datetime.strptime
import pandas as pd
from datetime import datetime
df = pd.read_csv('somedata.csv')
convertdatetime = lambda d: datetime.strptime(d,'%d/%m/%Y')
converted = df['DATE_TIME_IN_STRING'].apply(convertdatetime)
converted[:10] # you should be getting dtype: datetime64[ns]
I have a data frame indexed with a date (Python datetime object). How could I find the frequency as the number of months of data in the data frame?
I tried the attribute data_frame.index.freq, but it returns a none value. I also tried asfreq function using data_frame.asfreq('M',how={'start','end'} but it does not return the expected results. Please advise how I can get the expected results.
You want to convert you index of datetimes to a DatetimeIndex, the easiest way is to use to_datetime:
df.index = pd.to_datetime(df.index)
Now you can do timeseries/frame operations, like resample or TimeGrouper.
If your data has a consistent frequency, then this will be df.index.freq, if it doesn't (e.g. if some days are missing) then df.index.freq will be None.
You probably want to be use pandas Timestamp for your index instead of datetime to use 'freq'. See example below
import pandas as pd
dates = pd.date_range('2012-1-1','2012-2-1')
df = pd.DataFrame(index=dates)
print (df.index.freq)
yields,
<Day>
You can easily convert your dataframe like so,
df.index = [pd.Timestamp(d) for d in df.index]