This question already has answers here:
Pandas convert column with year integer to datetime
(3 answers)
Closed 3 years ago.
I have a year and I need to convert it to datetime formatted as Y-01-01. I have tried the following:
testdf = pd.DataFrame({"A":[1994]})
pd.to_datetime(testdf.A)
yields
0 1970-01-01 00:00:00.000001994
Name: A, dtype: datetime64[ns]
Desired output would be this:
0 1994-01-01
I have also tried various combinations of format, unit, etc., but to no avail. I assume I must be missing something glaringly obvious, as this seems like a fairly trivial task!
You can specify the format as %Y:
pd.to_datetime(testdf.A, format='%Y')
then
0 1994-01-01
Name: A, dtype: datetime64[ns]
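As a minimal sketch of why the two calls differ: without a format, pandas treats the integers as nanoseconds since the Unix epoch, while format='%Y' treats them as calendar years. The second year (2001) and the variable names are made up for illustration, and the cast to str is only a defensive assumption so the format is matched against text.

```python
import pandas as pd

testdf = pd.DataFrame({"A": [1994, 2001]})

# No format: the integers are read as nanoseconds since 1970-01-01
as_ns = pd.to_datetime(testdf["A"])

# With format='%Y': the values are read as years (cast to str defensively)
as_year = pd.to_datetime(testdf["A"].astype(str), format="%Y")

print(as_ns)
print(as_year)
```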
This question already has answers here:
pandas datetime to unixtime
(2 answers)
Closed 2 years ago.
I would like to convert a date in the format yyyymmddhhnn (with yyyy=year, mm=month, dd=day, hh=hour, nn=minute) into a Unix timestamp.
I tried:
df_out['unixtime'] = datetime(df_out['yyyymmddhhmm'].dt.year.to_numpy(),df_out['yyyymmddhhmm'].dt.month.to_numpy(),df_out['yyyymmddhhmm'].dt.day.to_numpy(),df_out['yyyymmddhhmm'].dt.hour.to_numpy(),df_out['yyyymmddhhmm'].dt.minute.to_numpy()).timestamp()
but I got the error message:
TypeError: only size-1 arrays can be converted to Python scalars
What am I doing wrong?
Any help is highly appreciated!
Regards,
Alexander
The officially recommended way is to subtract the epoch and then to floor-divide by the “unit” (1 second):
df = pd.DataFrame({'yyyymmddhhmm': pd.to_datetime(['20201108121314', '20201109121314'])})
df['unixtime'] = (df.yyyymmddhhmm - pd.Timestamp('1970-01-01')) // pd.Timedelta('1s')
Result:
yyyymmddhhmm unixtime
0 2020-11-08 12:13:14 1604837594
1 2020-11-09 12:13:14 1604923994
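For the minute-resolution strings from the question, the same epoch subtraction could look like the sketch below. The sample values are made up, the timestamps are assumed to be naive and to represent UTC, and the round trip through pd.to_datetime(..., unit="s") is only there as a sanity check.

```python
import pandas as pd

# Hypothetical sample data in the yyyymmddhhmm layout from the question
df = pd.DataFrame({
    "yyyymmddhhmm": pd.to_datetime(["202011081213", "202011091213"],
                                   format="%Y%m%d%H%M")
})

# Subtract the epoch, then floor-divide by one second to get whole seconds
epoch = pd.Timestamp("1970-01-01")
df["unixtime"] = (df["yyyymmddhhmm"] - epoch) // pd.Timedelta("1s")

# Sanity check: converting the seconds back should give the original values
roundtrip = pd.to_datetime(df["unixtime"], unit="s")
print(df)
```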
You can create a single column for the date using the pandas library
df_out['date_format'] = pd.to_datetime(df_out['date_time_column'], format='%Y%m%d%H%M')
Then you can create new columns holding the year, month, day, and hour with:
pd.DatetimeIndex(df_out['date_format']).year
pd.DatetimeIndex(df_out['date_format']).month
pd.DatetimeIndex(df_out['date_format']).day
pd.DatetimeIndex(df_out['date_format']).hour
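A possible variation, assuming the parsed column has a datetime dtype, is the .dt accessor, which exposes the same components without building a DatetimeIndex. The sample strings and the new column names below are hypothetical.

```python
import pandas as pd

# Hypothetical input in the yyyymmddhhmm layout
df_out = pd.DataFrame({"date_time_column": ["202011081213", "202112310905"]})
df_out["date_format"] = pd.to_datetime(df_out["date_time_column"],
                                       format="%Y%m%d%H%M")

# The .dt accessor returns each component as an integer Series
df_out["year"] = df_out["date_format"].dt.year
df_out["month"] = df_out["date_format"].dt.month
df_out["day"] = df_out["date_format"].dt.day
df_out["hour"] = df_out["date_format"].dt.hour
df_out["minute"] = df_out["date_format"].dt.minute

print(df_out)
```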
This question already has answers here:
ValueError: day is out of range for month
(2 answers)
Closed 3 years ago.
I am trying to convert a column to datetime.
Value format in the column: '2016-04-10 12:17:52'
df['created_at_new']
Output:
0 2016-04-10 12:17:52
1 2016-04-13 06:44:12
2 2016-04-13 06:54:43
3 2016-04-13 08:33:50
Name: created_at_new, Length: 328, dtype: object
I am trying the following code:
df['created_at_new'] = pd.to_datetime(df['created_at_new'])
ValueError: day is out of range for month
Desired result is a datetime
('2010-11-12 00:00:00')
When I tried the same example, it worked for me. To work around the error, you can try the following:
Check whether you have the latest version of pandas. If not, update it.
Try specifying the date format explicitly:
df['created_at_new'] = pd.to_datetime(df['created_at_new'], format='%Y-%m-%d %H:%M:%S')
If it still doesn't work, you can skip the rows that fail to parse by passing the argument errors='coerce'. Each skipped value is replaced by NaT.
For more details, check out this answer.
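A small sketch of how errors='coerce' can also be used to locate the offending rows. The sample values are made up, with one deliberately invalid date (30 February) standing in for whatever triggers the error in the real data.

```python
import pandas as pd

# Hypothetical data: the second entry has a day that does not exist
raw = pd.Series([
    "2016-04-10 12:17:52",
    "2016-02-30 06:44:12",
    "2016-04-13 06:54:43",
])

# Unparseable values become NaT instead of raising ValueError
parsed = pd.to_datetime(raw, format="%Y-%m-%d %H:%M:%S", errors="coerce")

# Select the original strings that failed to parse, for inspection
bad_rows = raw[parsed.isna()]
print(bad_rows)
```

Inspecting bad_rows before discarding anything makes it easier to decide whether the values should be repaired or dropped.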
Being new to Python and pandas, I ran into the following problem.
In my dataframe I have a column with dates (yyyy-mm-ddThh-mm-sec), where most of the years are fine (e.g. 2008), but some are written like 0008. Because of this I have trouble converting the column with pd.to_datetime.
My idea was to convert it to a 2-digit year first (using pd.to_datetime(df['date']).dt.strftime('%y %b, %d %H:%M:%S.%f +%Z')), but I got the error Out of bounds nanosecond timestamp: 08-10-02 14:41:00.
Are there any other options to convert 0008 to 2008 in a dataframe?
Thanks in advance for the help.
If the bad data always has the same format (that is, the year is always 4 characters long), you can slice off the first two characters with the str accessor and let pandas parse the remaining 2-digit year:
df = pd.DataFrame({'date':['2008-01-01', '0008-01-02']})
df['date'] = pd.to_datetime(df['date'].str[2:], yearfirst=True)
date
0 2008-01-01
1 2008-01-02
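If the column may also contain years from other centuries, slicing every row down to a 2-digit year could be risky. A more targeted sketch, assuming every bad year starts with "00" and really belongs to the 2000s, rewrites only that prefix. The third sample date is made up to show that correct years are left untouched.

```python
import pandas as pd

df = pd.DataFrame({"date": ["2008-01-01", "0008-01-02", "1999-05-06"]})

# Replace a leading "00" with "20"; strings that start differently are unchanged
fixed = df["date"].str.replace(r"^00", "20", regex=True)

# Parse with an explicit format so the year is read as 4 digits
df["date"] = pd.to_datetime(fixed, format="%Y-%m-%d")
print(df)
```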
This question already has answers here:
How to change the datetime format in Pandas
(8 answers)
Closed 4 years ago.
I have this pandas DataFrame df.
Name Date Score Score2
Joe 26-12-2007 53.45 53.4500
Joe 27-12-2007 52.38 52.7399
Joe 28-12-2007 51.71 51.8500
I would like to convert the date format in the Date column from dd-mm-yyyy to yyyy-mm-dd. The converted dataframe should look like this:
Name Date Score Score2
Joe 2007-12-26 53.45 53.4500
Joe 2007-12-27 52.38 52.7399
Joe 2007-12-28 51.71 51.8500
I am using Python v3.6.
EDIT: The duplicate question assumes that the original date format is yyyy-mm-dd. However, my original date format is dd-mm-yyyy. If I apply the answer from that question, the converted dates are wrong.
How to change the datetime format in pandas
Use (passing the format explicitly so that dd-mm-yyyy is not misread as mm-dd-yyyy):
df['Date'] = pd.to_datetime(df['Date'], format='%d-%m-%Y').dt.strftime('%Y-%m-%d')
If the Date column already has a datetime dtype, I think you need this:
df['Date'] = df['Date'].dt.strftime('%Y-%m-%d')
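A short sketch of the full round trip for day-first strings. The second date (01-02-2007) is a made-up ambiguous value, added to show that an explicit format reads it as 1 February and not 2 January. Note that strftime returns strings, so the reformatted column no longer has a datetime dtype.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Joe", "Joe"],
    "Date": ["26-12-2007", "01-02-2007"],  # dd-mm-yyyy
})

# Parse with the exact input layout, so day and month cannot be swapped
parsed = pd.to_datetime(df["Date"], format="%d-%m-%Y")

# Render as yyyy-mm-dd text; the result is a column of strings
df["Date"] = parsed.dt.strftime("%Y-%m-%d")
print(df)
```

Keeping the parsed datetime column around (here, parsed) may be preferable if the dates are needed for sorting or arithmetic later.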
This question already has answers here:
How to convert integer into date object python?
(4 answers)
Closed 5 years ago.
I have following dataframe.
id int_date
1 20160228
2 20161231
3 20160618
4 20170123
5 20151124
How can I convert the dates above from int format to the date format mm/dd/yyyy? I need this particular format for further Excel operations.
id int_date
1 02/28/2016
2 12/31/2016
3 06/18/2016
4 01/23/2017
5 11/24/2015
Is it also possible to generate a third column from int_date containing only the month in words, like January, February, etc.?
I tried the following:
date = datetime(year=int(s[0:4]), month=int(s[4:6]), day=int(s[6:8]))
but date is a datetime object. How can I store it as a date in a pandas DataFrame?
You can use datetime methods.
from datetime import datetime
a = '20160228'
date = datetime.strptime(a, '%Y%m%d').strftime('%m/%d/%Y')
Good luck!
Build a new column with applymap:
import pandas as pd
dates = [
20160228,
20161231,
20160618,
20170123,
20151124,
]
df = pd.DataFrame(data=list(enumerate(dates, start=1)), columns=['id','int_date'])
df[['str_date']] = df[['int_date']].applymap(str).applymap(lambda s: "{}/{}/{}".format(s[4:6],s[6:], s[0:4]))
print(df)
Emits:
$ python test.py
id int_date str_date
0 1 20160228 02/28/2016
1 2 20161231 12/31/2016
2 3 20160618 06/18/2016
3 4 20170123 01/23/2017
4 5 20151124 11/24/2015
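An alternative sketch that avoids manual string slicing: parse the integers once with pd.to_datetime, then derive both the mm/dd/yyyy text and the month name from the parsed column. The column names str_date and month are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "int_date": [20160228, 20161231, 20160618, 20170123, 20151124],
})

# Cast to str so the yyyymmdd layout can be matched by the format
parsed = pd.to_datetime(df["int_date"].astype(str), format="%Y%m%d")

# mm/dd/yyyy as text, plus the month written out in words
df["str_date"] = parsed.dt.strftime("%m/%d/%Y")
df["month"] = parsed.dt.month_name()
print(df)
```

Because the values pass through a real datetime, an impossible date such as 20160230 would raise an error here instead of being silently reformatted.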
There is bound to be a better solution to this, but since your dates are zero-padded (i.e. 06 instead of 6), why not just convert the value to a string and convert the fixed-width slices?
Using datetime would also get you the month names, etc.
//edit:
To be a little more precise, something like this should do the job:
import datetime

def get_datetime(date):
    date_string = str(date)
    # Slice out year, month and day, converting each piece to int
    return datetime.date(int(date_string[:4]), int(date_string[4:6]), int(date_string[6:8]))