I've been trying to use the to_datetime function to convert values in my column to datetime:
df['date'] = pd.to_datetime(df['date'],errors='coerce',format='%Y-%m-%d %H:%M:%S %z %Z')
After that, I received only NaT values.
Example: Value Format in Column: '1979-01-01 00:00:00 +0000 UTC'
I think you can't parse the UTC offset (+0000) and the timezone name (UTC) at the same time.
You might want to remove the UTC at the end and only parse the offset.
df['date'] = df.date.str[:-4]
df['date'] = pd.to_datetime(df['date'], format='%Y-%m-%d %H:%M:%S %z')
Pandas can't manage both %z and %Z as you can see here. Note that Python's strptime can handle this, but doesn't deal with %Z.
In your case you might want to just peel off the last bit with ser.str and consider opening a feature request.
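A minimal sketch of the "peel off the last bit" approach, assuming the column always ends in the literal text " UTC" (the sample rows below are made up to match the format in the question):

```python
import pandas as pd

# Hypothetical sample data mirroring the value format from the question.
df = pd.DataFrame({"date": ["1979-01-01 00:00:00 +0000 UTC",
                            "1979-01-02 12:30:00 +0000 UTC"]})

# Drop the trailing " UTC" name so only the numeric offset remains.
stripped = df["date"].str.replace(r"\s+UTC$", "", regex=True)

# Parse what is left; utc=True yields a UTC-aware datetime column.
df["date"] = pd.to_datetime(stripped, format="%Y-%m-%d %H:%M:%S %z", utc=True)

print(df["date"].dt.tz)
```

Using a regex anchored at the end of the string instead of a fixed slice leaves rows without the suffix untouched.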
I have a pandas dataframe that contains a couple of columns. Two of which are start_time and end_time. In those columns the values look like - 2020-01-04 01:38:33 +0000 UTC
I am not able to create a datetime object from these strings because I am not able to get the format right -
df['start_time'] = pd.to_datetime(df['start_time'], format="yyyy-MM-dd HH:mm:ss +0000 UTC")
I also tried using yyyy-MM-dd HH:mm:ss %z UTC as a format
This gives the error -
ValueError: time data '2020-01-04 01:38:33 +0000 UTC' does not match format 'yyyy-MM-dd HH:mm:ss +0000 UTC' (match)
You just need to use a format string that to_datetime will recognize:
df['start_time'] = pd.to_datetime(df['start_time'], format="%Y-%m-%d %H:%M:%S +0000 UTC")
There are some notes below about this problem:
1. About your error
The error occurs because you passed an invalid format string. For the valid directives, check https://strftime.org/. The correct format for this problem would be: "%Y-%m-%d %H:%M:%S %z UTC"
2. Pandas limitation with timezone
Parsing the UTC timezone with %z doesn't work on a pd.Series (it only works on index values). So if you use this, it will not work:
df['startTime'] = pd.to_datetime(df.startTime, format="%Y-%m-%d %H:%M:%S %z UTC", utc=True)
The solution is to parse the strings with Python's built-in datetime library:
from datetime import datetime
f = lambda x: datetime.strptime(x, "%Y-%m-%d %H:%M:%S %z UTC")
df['startTime'] = pd.to_datetime(df.startTime.apply(f), utc=True)
@fmarm's answer only handles the date and time, not the UTC timezone.
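To illustrate the difference between the two format styles, here is a small standard-library sketch. It assumes every value ends in the literal text " UTC", which is matched as plain characters rather than through a directive:

```python
from datetime import datetime

raw = "2020-01-04 01:38:33 +0000 UTC"

# strptime uses %-directives. Java-style patterns such as "yyyy-MM-dd" are
# treated as literal text, so they can never match an actual date.
fmt = "%Y-%m-%d %H:%M:%S %z UTC"  # %z reads "+0000", " UTC" is matched literally

parsed = datetime.strptime(raw, fmt)
print(parsed.isoformat())
```

The result is a timezone-aware datetime whose offset comes from the "+0000" part of the string.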
I'm trying to convert strings in my dataset('2016-01-01 00:00:00') to time stamps using pd.to_datetime.
I'm trying:
pd.to_datetime(train["timestamp"],format='%Y/%m/%d %I:%M:%S')
but I get
time data '2016-01-01 00:00:00' does not match format '%Y/%m/%d %I:%M:%S' (match)
How can I fix this?
If you want it to be in the specific format that you mentioned, that is %Y/%m/%d %I:%M:%S, then do it like this.
First convert your string to datetime format using to_datetime:
df['timestamp'] = pd.to_datetime(df['timestamp'])
Now that your column is in datetime format, convert to the following format using strftime:
df['timestamp'] = df['timestamp'].dt.strftime('%Y/%m/%d %I:%M:%S')
Output:
timestamp
0 2016/01/01 12:00:00
1 2016/01/01 12:00:00
As others pointed out, use %H instead of %I for 24 hour format, like this:
df['timestamp'] = df['timestamp'].dt.strftime('%Y/%m/%d %H:%M:%S')
That's because the format of the values in your df is different. Try the following, using - as the separator and %H for the 24-hour clock:
pd.to_datetime(train["timestamp"],format='%Y-%m-%d %H:%M:%S')
2 issues here:
Use - instead of /
%I is for Hour 01-12, use %H for Hour 00-23
pd.to_datetime(train["timestamp"],format='%Y-%m-%d %H:%M:%S')
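The two issues can be checked in isolation with the standard library. This is only a sketch; try_parse is a hypothetical helper introduced for the illustration:

```python
from datetime import datetime

raw = "2016-01-01 00:00:00"

def try_parse(text, fmt):
    """Return the parsed datetime, or None if the format does not match."""
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        return None

wrong_sep = try_parse(raw, "%Y/%m/%d %H:%M:%S")   # '/' does not match '-'
wrong_hour = try_parse(raw, "%Y-%m-%d %I:%M:%S")  # %I rejects hour 00
correct = try_parse(raw, "%Y-%m-%d %H:%M:%S")

# Formatting goes the other way: %I renders midnight as 12.
as_12h = correct.strftime("%Y/%m/%d %I:%M:%S")
print(wrong_sep, wrong_hour, correct, as_12h)
```

Only the last format parses the value, and formatting midnight with %I produces the 12:00:00 seen in the output above.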
I have the following strings that I'd like to convert to datetime objects:
'01-01-16 7:43'
'01-01-16 3:24'
However, when I try to use strptime it always results in a does not match format error.
Pandas to_datetime function nicely handles the automatic conversion, but I'd like to solve it with the datetime library as well.
format_ = '%m-%d-%Y %H:%M'
my_date = datetime.strptime("01-01-16 4:51", format_)
ValueError: time data '01-01-16 4:51' does not match format '%m-%d-%Y %H:%M'
As I see it, your datetime string '01-01-16 7:43' has a 2-digit year, not a 4-digit year.
To parse a 2-digit year, e.g. '16' rather than '2016', you need %y instead of %Y.
You can do that like this:
from datetime import datetime
datetime_str = '01-01-16 7:43'
datetime_object = datetime.strptime(datetime_str, '%m-%d-%y %H:%M')
print(type(datetime_object))
print(datetime_object)
This prints <class 'datetime.datetime'> followed by 2016-01-01 07:43:00
First of all, if you want to match 2016 you should write %Y while for 16 you should write %y.
That means you should write:
format_ = '%m-%d-%y %H:%M'
Check this link for all format codes.
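A short sketch contrasting the two directives on the string from the question. The last two lines show how strptime picks the century for a 2-digit year, which is worth knowing if the data contains older dates:

```python
from datetime import datetime

text = "01-01-16 7:43"

try:
    datetime.strptime(text, "%m-%d-%Y %H:%M")  # %Y expects four digits
    four_digit_ok = True
except ValueError:
    four_digit_ok = False

parsed = datetime.strptime(text, "%m-%d-%y %H:%M")  # %y expects two digits

# Two-digit years 00-68 map to 2000-2068, and 69-99 map to 1969-1999.
y68 = datetime.strptime("68", "%y").year
y69 = datetime.strptime("69", "%y").year
print(four_digit_ok, parsed, y68, y69)
```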
I scraped a website and got the following Output:
2018-06-07T12:22:00+0200
2018-06-07T12:53:00+0200
2018-06-07T13:22:00+0200
Is there a way I can take the first one and convert it into a DateTime value?
Just parse the string into year, month, day, hour and minute integers and then create a new date time object with those variables.
Check out the datetime docs
You can convert the string to a datetime object using strptime; here %z matches the UTC offset:
import datetime
dt = "2018-06-07T12:22:00+0200"
ndt = datetime.datetime.strptime(dt, "%Y-%m-%dT%H:%M:%S%z")
print(ndt)
# output
# 2018-06-07 12:22:00+02:00
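The same format string works for all three scraped values. As a sketch, this parses the whole list and also converts the results to UTC, which may or may not be what you need:

```python
from datetime import datetime, timezone

scraped = [
    "2018-06-07T12:22:00+0200",
    "2018-06-07T12:53:00+0200",
    "2018-06-07T13:22:00+0200",
]

# %z reads the "+0200" offset, so each result is timezone-aware.
parsed = [datetime.strptime(s, "%Y-%m-%dT%H:%M:%S%z") for s in scraped]

# Optional: normalize everything to UTC for comparison or storage.
as_utc = [dt.astimezone(timezone.utc) for dt in parsed]
print(parsed[0], as_utc[0])
```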
The following function (not mine) should help you with what you want:
df['date_column'] = pd.to_datetime(df['date_column'], format = '%d/%m/%Y %H:%M').dt.strftime('%Y%V')
You can mess around with the keys next to the % symbols to achieve what you want. You may, however, need to do some light cleaning of your values before you can use them with this function, i.e. replacing 2018-06-07T12:22:00+0200 with 2018-06-07 12:22.
You can use datetime lib.
from datetime import datetime
datetime_object = datetime.strptime('Jun 1 2005 1:33PM', '%b %d %Y %I:%M%p')
datetime.strptime documentation
Solution here
I have a datetime type mydate in %Y-%m-%dT%H:%M:%S format.
I want to replace the hours
I did this using mydate.replace() method
Now I want to compare it with another specific date -> myNEWdate whose format is %Y-%m-%d %H:%M:%S :
newdate = mydate.replace(hour = islot)
print newdate
appointmentDict[mydate]['time_start'] = datetime.strptime(str(newdate),"%Y/%m/%d %H:%M:%S")
The date is printed as 2015-06-26 08:00:00
and I get the error
ValueError: time data '2015-06-26 08:00:00' does not match format '%Y/%m/%d %H:%M:%S'
What should I do to resolve this
You need to set the correct format:
datetime.strptime(str(newdate),"%Y-%m-%d %H:%M:%S")
This resolves the exception, although converting from datetime to string and back doesn't make much sense, as mentioned in the comments.
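As a sketch of the point about the round trip: replace() already returns a datetime, so it can be stored and compared directly. The starting value and the hour slot below are made-up stand-ins for mydate and islot from the question:

```python
from datetime import datetime

mydate = datetime(2015, 6, 26, 14, 0, 0)  # hypothetical starting value
islot = 8                                 # hypothetical hour slot

# replace() returns a new datetime; no string conversion is needed.
newdate = mydate.replace(hour=islot)

# strptime is only needed for values that really arrive as text.
other = datetime.strptime("2015-06-26 08:00:00", "%Y-%m-%d %H:%M:%S")

same_moment = newdate == other
print(newdate, same_moment)
```

The format string only describes how text is read or written; datetime objects themselves carry no format, so two of them compare equal whenever they represent the same moment.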