How to change date format in python with pandas - python

I'm working with big data in pandas and I have a problem with the date format. This is the format of one column:
Wed Feb 24 12:06:14 +0000 2021
and I think it would be easier to convert all such columns to a format like this:
'%d/%m/%Y, %H:%M:%S'
How can I do that?

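Does this work for you? The format argument of to_datetime describes the input strings, so parse with the source layout first and then format the result:
pandas.to_datetime(s, format='%a %b %d %H:%M:%S %z %Y').dt.strftime('%d/%m/%Y, %H:%M:%S')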
Source: https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior

You can use the following function for your dataset.
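import datetime as dt
def change_format(x):
    # Parse the original layout, e.g. 'Wed Feb 24 12:06:14 +0000 2021'
    parsed = dt.datetime.strptime(x, "%a %b %d %H:%M:%S %z %Y")
    # Render the parsed value in the target layout
    return parsed.strftime('%d/%m/%Y, %H:%M:%S')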
Then apply it using df['date_column'] = df['date_column'].apply(change_format).
Here df is your dataset.
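As a vectorized alternative to apply, the same conversion can be sketched with pandas directly. This assumes pandas is installed and that every value follows the 'Wed Feb 24 12:06:14 +0000 2021' layout; the sample DataFrame and the column name date_column are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data in the layout from the question
df = pd.DataFrame({
    "date_column": [
        "Wed Feb 24 12:06:14 +0000 2021",
        "Thu Feb 25 08:00:00 +0000 2021",
    ]
})

# Parse the whole column at once; the format describes the INPUT strings
parsed = pd.to_datetime(df["date_column"], format="%a %b %d %H:%M:%S %z %Y")

# Render the parsed timestamps as strings in the target layout
df["date_column"] = parsed.dt.strftime("%d/%m/%Y, %H:%M:%S")
print(df["date_column"].tolist())
```

Note that the result of strftime is a column of plain strings, so it may be worth keeping the parsed column around if you still need date arithmetic.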

Related

Unable to get time difference between two pandas dataframe columns

I have a pandas dataframe that contains a couple of columns, two of which are start_time and end_time. In those columns the values look like this: 2020-01-04 01:38:33 +0000 UTC
I am not able to create a datetime object from these strings because I cannot get the format right:
df['start_time'] = pd.to_datetime(df['start_time'], format="yyyy-MM-dd HH:mm:ss +0000 UTC")
I also tried using yyyy-MM-dd HH:mm:ss %z UTC as a format
This gives the error -
ValueError: time data '2020-01-04 01:38:33 +0000 UTC' does not match format 'yyyy-MM-dd HH:mm:ss +0000 UTC' (match)
You just need to use the proper timestamp format that to_datetime will recognize
df['start_time'] = pd.to_datetime(df['start_time'], format="%Y-%m-%d %H:%M:%S +0000 UTC")
There are some notes below about this problem:
1. About your error
The format you passed is not a valid strptime format, which is what causes the error. See https://strftime.org/ for the valid directives. The correct format for this problem would be: "%Y-%m-%d %H:%M:%S %z UTC"
2. Pandas limitation with timezone
In older pandas versions, parsing the UTC offset with %z does not work on a pd.Series (it only works on index values); more recent releases accept %z in to_datetime. On an affected version, this will not work:
df['startTime'] = pd.to_datetime(df.startTime, format="%Y-%m-%d %H:%M:%S %z UTC", utc=True)
The solution is to parse the strings with Python's built-in datetime module:
from datetime import datetime
f = lambda x: datetime.strptime(x, "%Y-%m-%d %H:%M:%S %z UTC")
df['startTime'] = pd.to_datetime(df.startTime.apply(f), utc=True)
@fmarm's answer only handles the date and time, not the UTC offset.
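Since the original goal was a time difference, here is a minimal standard-library sketch of that step. The end timestamp is made up for illustration; only the start value comes from the question.

```python
from datetime import datetime

# The trailing " UTC" is matched literally; %z consumes the "+0000" offset
FMT = "%Y-%m-%d %H:%M:%S %z UTC"

start = datetime.strptime("2020-01-04 01:38:33 +0000 UTC", FMT)
end = datetime.strptime("2020-01-04 02:10:03 +0000 UTC", FMT)  # hypothetical end_time

# Subtracting two aware datetimes yields a timedelta
elapsed = end - start
print(elapsed)                  # 0:31:30
print(elapsed.total_seconds())  # 1890.0
```

Both values are timezone-aware, so the subtraction stays correct even if the two columns carry different offsets.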

How to custom parse dates in pandas

I know this seems like a duplicated question (and it kinda is), but previous answers didn't let me achieve what I'm looking for. I have a date Series with the following format:
date
Jun 13 14:46
Jun 13 17:11
And so, I wanted to turn it into a datetime object. I did the following:
pd.to_datetime(df.date, format='%b %d %I:%M')
Which based on this question should be enough: Convert custom date formats in pandas
But, I'm still getting ValueError: time data 'Jun 13 14:46' does not match format '%b %d %I:%M' (match)
What am I missing?
Thanks
Try pd.to_datetime(df.date, format='%b %d %H:%M')
%I is for a 12-hour clock.
%H is for a 24-hour clock.
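The difference between the two directives can be checked with the standard library alone. This is only a sketch: the strings carry no year, so one is assumed here (without it, strptime falls back to 1900).

```python
from datetime import datetime

raw = "Jun 13 14:46"
year = 2021  # assumed; the source data does not include a year

# %H accepts hours 00-23, so 14 parses fine
parsed = datetime.strptime(f"{year} {raw}", "%Y %b %d %H:%M")
print(parsed)  # 2021-06-13 14:46:00

# %I only accepts hours 1-12, so the same string is rejected
try:
    datetime.strptime(f"{year} {raw}", "%Y %b %d %I:%M")
    rejected = False
except ValueError as exc:
    rejected = True
    print("12-hour directive rejected it:", exc)
```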

How to convert datetime to different timezone?

I'm trying to convert a datetime string into a different timezone. My code works but the result is not what I'm looking for.
I've already tried .localize() and .astimezone but the output is the same.
phtimezone = timezone('Asia/Manila')
test = datetime.datetime.strptime('Sun Sep 16 03:38:40 +0000 2018','%a %b %d %H:%M:%S +0000 %Y')
date = phtimezone.localize(test)
print (date)
date = test.astimezone(phtimezone)
print (date)
The output is 2018-09-16 03:38:40+08:00. I was expecting it to be 2018-09-16 11:38:40+08:00.
Your parsed object test does not contain a timezone. It's a naïve datetime object. Using both localize and astimezone cannot do any conversion, since they don't know what they're converting from; so they just attach the timezone as given to the naïve datetime.
Also parse the timezone, using %z in place of the literal +0000:
datetime.datetime.strptime('Sun Sep 16 03:38:40 +0000 2018','%a %b %d %H:%M:%S %z %Y')
This gives you an aware datetime object in the UTC timezone which can be converted to other timezones.
I was able to fix it thanks to @deceze. Here is the code:
phtimezone = pytz.timezone('Asia/Manila')
test = datetime.datetime.strptime('Sun Sep 16 03:38:40 +0000 2018','%a %b %d %H:%M:%S %z %Y')
test_utc = test.replace(tzinfo=pytz.utc)  # test is already UTC-aware after %z; kept explicit
date = test_utc.astimezone(phtimezone)
print (date)
The output is now 2018-09-16 11:38:40+08:00
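For comparison, the same conversion can be sketched without pytz. This version assumes a fixed UTC+8 offset for Manila (no daylight saving time), which is a simplification; a full time zone database is the safer choice for zones whose offset changes.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Manila is modelled as a fixed UTC+8 offset
manila = timezone(timedelta(hours=8), "PHT")

# %z makes the parsed object timezone-aware (UTC here)
aware = datetime.strptime(
    "Sun Sep 16 03:38:40 +0000 2018", "%a %b %d %H:%M:%S %z %Y"
)

# astimezone can now convert, because it knows the source offset
converted = aware.astimezone(manila)
print(converted)  # 2018-09-16 11:38:40+08:00
```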

Python date conversion?

I'm currently trying to convert a file format into a slightly different style to allow easier importing into a program; however, I can't quite get my head around how to convert datetime strings between formats. The original I have is the following:
2016-12-15 17:26:45
However the required format for the date time is:
Thu Dec 15 17:19:03 2016
Does anyone know if there is an easy way to convert between these? These values are always in the same place and format so it doesn't need to be too dynamic so to speak outside of recognising what a certain day of the month is (if that can be done at all?)
Update - The conversion has worked for 1 date but not the other weirdly :/ The code to grab the two dates is the following:
startDate=startDate.replace("Started : ","")
startDate=startDate.replace(" (ISO format YYYY-MM-DD HH:MM:SS)","")
startDate=startDate.strip()
startDt = datetime.strptime(startDate, '%Y-%m-%d %H:%M:%S')
startDt=startDt.strftime('%a %b %d %H:%M:%S %Y ')
print (startDt)
This part works as intended and outputs the required format:
"2016-12-15 17:26:45
Thu Dec 15 17:26:45 2016"
The end date part is a bit "ham fisted" so to speak and I'm sure there are better ways to do the re.sub search just to do anything in brackets but I'll edit that later.
endDate=endDate.replace("Ended : ","")
endDate=endDate.strip()
endDate = re.sub("\(.*?\)", "", endDate)
endDate.strip()
endDt = datetime.strptime(endDate, '%Y-%m-%d %H:%M:%S')
endDt=endDt.strftime('%a %b %d %H:%M:%S %Y ')
print (endDt)
This part however despite the outputs being an identical format
"2016-12-15 17:26:45
2016-12-15 21:22:11"
produces the following error:
endDt = datetime.strptime(endDate, '%Y-%m-%d %H:%M:%S')
File "C:\Python27\lib\_strptime.py", line 335, in _strptime
data_string[found.end():])
ValueError: unconverted data remains:
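A likely cause of that error, sketched below: strings are immutable, so the bare endDate.strip() call throws its result away, and the space left behind after removing the bracketed text still reaches strptime. The input line used here is a guess at what the file contains.

```python
import re
from datetime import datetime

# Assumed shape of the raw line from the file
end_date = "Ended : 2016-12-15 21:22:11 (ISO format YYYY-MM-DD HH:MM:SS)"

end_date = end_date.replace("Ended : ", "")
end_date = re.sub(r"\(.*?\)", "", end_date)  # leaves a trailing space behind
end_date = end_date.strip()                  # assign the result back

end_dt = datetime.strptime(end_date, "%Y-%m-%d %H:%M:%S")
formatted = end_dt.strftime("%a %b %d %H:%M:%S %Y")
print(formatted)  # Thu Dec 15 21:22:11 2016
```

The two printed strings in the question look identical because the leftover whitespace is invisible when printed.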
from datetime import datetime
dt = datetime.strptime('2016-06-01 1:33:45', '%Y-%m-%d %H:%M:%S')
dt.strftime('%a %b %d %H:%M:%S %Y')
>>> 'Wed Jun 01 01:33:45 2016'
It's a pretty easy task with the datetime module.
As has been pointed out, checking the docs will get you a lot of useful info, starting with the directives to feed to the strptime and strftime (respectively, parse and format time) functions, which you'll need here.
A working example for your case would be:
from datetime import datetime
myDateString = '2016-12-15 17:26:45'
myDateObj = datetime.strptime(myDateString, '%Y-%m-%d %H:%M:%S')
myDateFormat = myDateObj.strftime('%a %b %d %H:%M:%S %Y')
Check out this section of the docs to have a better understanding of the formatting placeholders.
You can use the datetime module:
from datetime import datetime
string = '2016-12-15 17:26:45'
date = datetime.strptime(string, '%Y-%m-%d %H:%M:%S')
date2 = date.strftime("%a %b %d %H:%M:%S %Y")
print(date2)
Output:
Thu Dec 15 17:26:45 2016

String to Time Python strptime

I'm pulling the date value from gmail and trying to perform some functions on it. First I simply want to display it, but I can't even do that. See my code and error below.
from datetime import datetime
timeString = 'Sat, 2 Aug 2014 09:29:31 -0700'
myTime = datetime.strptime(timeString, '%m-%d-%Y %I:%M %p')
Here's the error I get. Do you think it's the -0700 that's getting in the way?
ValueError: time data 'Sat, 2 Aug 2014 09:29:31 -0700' does not match format '%m-%d-%Y %I:%M %p'
As the error message suggests, the format has to match the layout of your date string. I've not tried it, but something like this should work (note %H, since the hour is on a 24-hour clock):
myTime = datetime.strptime(timeString, '%a, %d %b %Y %H:%M:%S %z')
Check here for complete details.
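Because the value comes from an email Date header, the standard library's email.utils.parsedate_to_datetime may be a convenient alternative to a hand-written format. A small sketch comparing the two approaches on the string from the question:

```python
from datetime import datetime
from email.utils import parsedate_to_datetime

time_string = "Sat, 2 Aug 2014 09:29:31 -0700"

# Explicit format: %z consumes the "-0700" offset
via_strptime = datetime.strptime(time_string, "%a, %d %b %Y %H:%M:%S %z")

# Dedicated parser for email-style date headers
via_email = parsedate_to_datetime(time_string)

print(via_strptime)               # 2014-08-02 09:29:31-07:00
print(via_strptime == via_email)  # True
```

Either result is timezone-aware, so it can be displayed with strftime or converted with astimezone.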
