This question already has answers here:
ValueError: day is out of range for month
(2 answers)
Closed 3 years ago.
I am trying to convert a column type to datetime
Value Format in Column: '2016-04-10 12:17:52'
df['created_at_new']
output
0 2016-04-10 12:17:52
1 2016-04-13 06:44:12
2 2016-04-13 06:54:43
3 2016-04-13 08:33:50
Name: created_at_new, Length: 328, dtype: object
I am trying the following code:
df['created_at_new'] = pd.to_datetime(df['created_at_new'])
ValueError: day is out of range for month
Desired result is a datetime
('2010-11-12 00:00:00')
I could not reproduce the error with the sample values shown; they parse fine for me. To track down the problem, you can try the following:
Check whether you have the latest version of pandas, and update it if not.
Specify the date format explicitly:
df['created_at_new'] = pd.to_datetime(df['created_at_new'], format='%Y-%m-%d %H:%M:%S')
If it still fails, you can pass the argument errors='coerce'. Values that cannot be parsed are then replaced with NaT instead of raising an error.
For more details, check out this answer.
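Since errors='coerce' silently turns bad values into NaT, it also gives a convenient way to find the rows that cause the ValueError. The following is a minimal sketch; the sample values, including the deliberately impossible date 30 February, are made up for illustration and are not taken from the question's data.

```python
import pandas as pd

# Hypothetical sample: the second value has an impossible day (30 February).
s = pd.Series([
    "2016-04-10 12:17:52",
    "2016-02-30 06:44:12",
    "2016-04-13 06:54:43",
])

# Unparseable values become NaT instead of raising ValueError.
parsed = pd.to_datetime(s, format="%Y-%m-%d %H:%M:%S", errors="coerce")

# Use the NaT mask to look at the original strings that failed to parse.
bad_rows = s[parsed.isna()]
print(bad_rows)
```

Inspecting bad_rows before discarding anything shows whether the problem is a typo, a different date format, or genuinely invalid data.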
This question already has answers here:
Convert Pandas Column to DateTime
(8 answers)
Closed 1 year ago.
I have a data frame with various data, including two columns that contain dates (date of admission and date of discharge). The format of these dates is xxxx-xx-xx 00:00:00. I want to do some calculations such as subtraction. The code I used was
covid_19_admission['days_Hospitalised'] = (covid_19_admission['Discharge_date'] - covid_19_admission['Date']).dt.days
# I expected a new column with the days.
But I got the error TypeError: unsupported operand type(s) for -: 'str' and 'str'.
I am not interested in hours, minutes or seconds, so I tried to remove them and to understand the format, but when I check the column types, both date columns show up as object:
covid_19_admission.dtypes
Patient_admitted_id int64
Date object
Hospital_ID int64
Discharge_date object
dtype: object
I am new to working with dates, so I don't know how to do this.
Convert both columns to datetimes:
covid_19_admission['days_Hospitalised'] = (pd.to_datetime(covid_19_admission['Discharge_date']) - pd.to_datetime(covid_19_admission['Date'])).dt.days
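If the dates are needed for more than one calculation, it may be simpler to convert the two columns once and store the result, so that every later operation sees datetime64 values rather than strings. A minimal sketch, using made-up rows with the column names from the question:

```python
import pandas as pd

# Hypothetical data mirroring the column names in the question.
covid_19_admission = pd.DataFrame({
    "Date": ["2020-03-01 00:00:00", "2020-03-05 00:00:00"],
    "Discharge_date": ["2020-03-10 00:00:00", "2020-03-06 00:00:00"],
})

# Convert in place so later code can use datetime operations directly.
for col in ["Date", "Discharge_date"]:
    covid_19_admission[col] = pd.to_datetime(covid_19_admission[col])

# Subtracting two datetime columns gives timedeltas; .dt.days keeps whole days.
covid_19_admission["days_Hospitalised"] = (
    covid_19_admission["Discharge_date"] - covid_19_admission["Date"]
).dt.days

print(covid_19_admission.dtypes)
```

After the conversion, dtypes should report datetime64[ns] for both date columns instead of object.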
This question already has answers here:
pandas datetime to unixtime
(2 answers)
Closed 2 years ago.
I would like to convert a date given in a format with yyyy=year, mm=month, dd=day, hh=hour, nn=minute to a Unix timestamp.
I tried:
df_out['unixtime'] = datetime(df_out['yyyymmddhhmm'].dt.year.to_numpy(),df_out['yyyymmddhhmm'].dt.month.to_numpy(),df_out['yyyymmddhhmm'].dt.day.to_numpy(),df_out['yyyymmddhhmm'].dt.hour.to_numpy(),df_out['yyyymmddhhmm'].dt.minute.to_numpy()).timestamp()
but I got the error message:
TypeError: only size-1 arrays can be converted to Python scalars
What am I doing wrong?
Any help is highly appreciated!
Regards,
Alexander
The officially recommended way is to subtract the epoch and then to floor-divide by the “unit” (1 second):
df = pd.DataFrame({'yyyymmddhhmm': pd.to_datetime(['20201108121314', '20201109121314'])})
df['unixtime'] = (df.yyyymmddhhmm - pd.Timestamp('1970-01-01')) // pd.Timedelta('1s')
Result:
yyyymmddhhmm unixtime
0 2020-11-08 12:13:14 1604837594
1 2020-11-09 12:13:14 1604923994
You can create a single column for the date using the pandas library
df_out['date_format'] = pd.to_datetime(df_out['date_time_column'], format='%Y%m%d%H%M')
Then you can extract the year, month, day and hour from it with
pd.DatetimeIndex(df_out['date_format']).year
pd.DatetimeIndex(df_out['date_format']).month
pd.DatetimeIndex(df_out['date_format']).day
pd.DatetimeIndex(df_out['date_format']).hour
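Putting the pieces together, a rough end-to-end sketch might look like the following. The input strings are invented, the column names date_time_column and date_format are taken from the answer above, and the timestamps are assumed to be naive values that should be treated as UTC.

```python
import pandas as pd

# Hypothetical input: strings in yyyymmddhhmm form (no seconds).
df_out = pd.DataFrame({"date_time_column": ["202011081213", "202011091213"]})

# Parse the whole column at once; no Python-level loop is needed.
df_out["date_format"] = pd.to_datetime(
    df_out["date_time_column"], format="%Y%m%d%H%M"
)

# Component columns via the .dt accessor on the datetime column.
df_out["year"] = df_out["date_format"].dt.year
df_out["hour"] = df_out["date_format"].dt.hour

# Unix time in seconds: subtract the epoch, then floor-divide by one second.
df_out["unixtime"] = (
    df_out["date_format"] - pd.Timestamp("1970-01-01")
) // pd.Timedelta("1s")

print(df_out)
```

The original attempt failed because the datetime constructor accepts single numbers, not arrays; the operations above work on the whole column at once.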
This question already has answers here:
Pandas convert column with year integer to datetime
(3 answers)
Closed 3 years ago.
I have a year and I need to convert it to datetime formatted as Y-01-01. I have tried the following:
testdf = pd.DataFrame({"A": [1994]})
pd.to_datetime(testdf.A)
yields
0 1970-01-01 00:00:00.000001994
Name: A, dtype: datetime64[ns]
Desired output would be this:
0 1994-01-01
I have also tried various combinations of format, unit etc., but to no avail. I can only assume I must be missing something glaringly obvious, as this seems like a fairly trivial task!
You can specify the format as %Y:
pd.to_datetime(testdf.A, format='%Y')
then
0 1994-01-01
Name: A, dtype: datetime64[ns]
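The 1970 result in the question comes from pandas reading a bare integer as a count of nanoseconds since the epoch. The sketch below contrasts the two behaviours; casting to str before parsing is not required by the answer above, it is only an assumption made here to keep the intent explicit.

```python
import pandas as pd

testdf = pd.DataFrame({"A": [1994, 2003]})

# Without a format, integers are read as nanoseconds since 1970-01-01.
as_epoch_ns = pd.to_datetime(testdf["A"])

# Casting to str first makes the intent explicit: parse a four-digit year.
as_year = pd.to_datetime(testdf["A"].astype(str), format="%Y")

print(as_epoch_ns)
print(as_year)
```

Month and day default to 1 when the format contains only %Y, which gives the Y-01-01 result asked for.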
This question already has answers here:
Extracting just Month and Year separately from Pandas Datetime column
(13 answers)
Closed 3 years ago.
I have a dataframe with dates, and I want to make a column with only the month of the corresponding date in each row. First, I converted my dates to ts objects like this:
df['Date'] = pd.to_datetime(df['Date'])
After that, I tried to make my new column for the month like this:
df['Month'] = df['Date'].month
However, it gives me an error:
AttributeError: 'Series' object has no attribute 'month'
I do not understand why I can't do it like this. I double checked whether the conversion to ts objects actually works, and that does work. Also, if I extract 1 date using slicing, I can append .month to get the month. I technically could solve the problem by looping over all indices and then slicing for each index, but my dataframe contains 166000+ rows so that is not an option.
You have to use the dt accessor, which exposes the datetime properties of a Series:
df['Month'] = df['Date'].dt.month
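The difference between a single element and the whole column explains the behaviour seen in the question. Here is a small sketch with made-up dates: one element is a Timestamp with its own .month attribute, while the column is a Series whose datetime properties are reached through .dt.

```python
import pandas as pd

# Hypothetical dates, using the column name from the question.
df = pd.DataFrame({"Date": ["2021-01-15", "2021-07-04"]})
df["Date"] = pd.to_datetime(df["Date"])

# A single element is a Timestamp, which has .month directly.
first_month = df["Date"].iloc[0].month

# The whole column is a Series, so datetime attributes live under .dt.
df["Month"] = df["Date"].dt.month
df["Year"] = df["Date"].dt.year

print(df)
```

The .dt properties are vectorised, so no loop over the 166000+ rows is needed.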
Being new to Python and pandas, I ran into the following problem.
In my dataframe I have a column with dates (yyyy-mm-ddThh-mm-sec) where most of the years are fine (e.g. 2008), but in some rows the year is written as 0008. Because of this, I cannot convert the column with pd.to_datetime.
My idea was to convert it to a 2-digit year first (using pd.to_datetime(df['date']).dt.strftime('%y %b, %d %H:%M:%S.%f +%Z')), but I got the error Out of bounds nanosecond timestamp: 08-10-02 14:41:00.
Is there another way to convert 0008 to 2008 in a dataframe?
Thanks for the help in advance
If the bad data always has the same format (that is, the year is always written with 4 characters), you can use the str accessor to slice off the first two characters and parse the remaining two-digit year:
df = pd.DataFrame({'date':['2008-01-01', '0008-01-02']})
df['date'] = pd.to_datetime(df['date'].str[2:], yearfirst=True)
date
0 2008-01-01
1 2008-01-02
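For strings that also carry a time part, as in the question, another option is to repair only the broken prefix and then parse with an explicit format. This is a sketch under two assumptions that are not confirmed by the question: the values look like 2008-10-02T14:41:00, and every year starting with 00 really belongs to the 2000s.

```python
import pandas as pd

# Hypothetical values; the second one has the broken "0008" year.
df = pd.DataFrame({"date": ["2008-10-02T14:41:00", "0008-10-03T09:15:30"]})

# Replace a leading "00" with "20"; strings that already start with "20" are untouched.
fixed = df["date"].str.replace(r"^00", "20", regex=True)

# An explicit format avoids any guessing about day/month/year order.
df["date"] = pd.to_datetime(fixed, format="%Y-%m-%dT%H:%M:%S")

print(df)
```

Unlike slicing with str[2:], this keeps four-digit years throughout, so the result does not depend on how two-digit years are mapped to a century.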