Python Date / Time Regular Expression - python

I am pretty new to regular expressions and they are pretty alien to me. I am parsing an XML feed which produces a date time as follows:
Wed, 23 July 2014 19:25:52 GMT
But I want to split it up as follows:
date = 23/07/2014
time = 19/25/52
Where would I start? I have looked at a couple of other questions on SO and all of them deviate a bit from what I am trying to achieve.

Use datetime.strptime to parse the date from string and then format it using the strftime method of datetime objects:
>>> from datetime import datetime
>>> dt = datetime.strptime("Wed, 23 July 2014 19:25:52 GMT", "%a, %d %B %Y %H:%M:%S %Z")
>>> dt.strftime('%d/%m/%Y')
'23/07/2014'
>>> dt.strftime('%H/%M/%S')
'19/25/52'
But if you're okay with the ISO format you can call date and time methods:
>>> str(dt.date())
'2014-07-23'
>>> str(dt.time())
'19:25:52'
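If this has to run for every item in the feed, the two strftime calls can be wrapped in a small helper. This is only a sketch: the name split_feed_timestamp and the FEED_FORMAT constant are made up for illustration, and it assumes every timestamp matches the format shown in the question and that the default English/C locale is active (since %a and %B are locale-dependent).

```python
from datetime import datetime

# Assumed to match every entry in the feed; adjust if the feed varies.
FEED_FORMAT = "%a, %d %B %Y %H:%M:%S %Z"

def split_feed_timestamp(raw):
    """Return (date, time) strings in the layout the question asks for."""
    dt = datetime.strptime(raw, FEED_FORMAT)
    return dt.strftime("%d/%m/%Y"), dt.strftime("%H/%M/%S")

date_part, time_part = split_feed_timestamp("Wed, 23 July 2014 19:25:52 GMT")
print(date_part)  # 23/07/2014
print(time_part)  # 19/25/52
```

A timestamp that does not match the format raises ValueError, so the caller can decide how to handle malformed entries.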

Related

Convert different date strings to unix timestamp in python 3.6

I have two strings. How can I convert them to UNIX timestamp (eg.: "1284101485")? (Please observe that 1284101485 is not the correct answer for this case.)
I don't care about time zones as long as it is consistent.
string_1_to_convert = 'Tue Jun 25 13:53:58 CEST 2019'
string_2_to_convert = '2019-06-25 13:53:58'
You can use dateparser
Install:
$ pip install dateparser
Sample code:
import dateparser
from time import mktime
string_1_to_convert = 'Tue Jun 25 13:53:58 CEST 2019'
string_2_to_convert = '2019-06-25 13:53:58'
datetime1 = dateparser.parse(string_1_to_convert)
datetime2 = dateparser.parse(string_2_to_convert)
unix_secs_1 = mktime(datetime1.timetuple())
unix_secs_2 = mktime(datetime2.timetuple())
print(unix_secs_1)
print(unix_secs_2)
Output:
1561492438.0
1561488838.0
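The above implementation parses both strings and doesn't give you an error when trying to parse CEST. Note that mktime interprets the time tuple as local time, so the printed values depend on the timezone of the machine running the code.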
You can use .strptime to parse by a format you specify.
Try this:
import datetime
string_1_to_convert = 'Tue Jun 25 13:53:58 CEST 2019'
string_2_to_convert = '2019-06-25 13:53:58'
ts1 = datetime.datetime.strptime(string_1_to_convert, "%a %b %d %H:%M:%S %Z %Y").timestamp()
ts2 = datetime.datetime.strptime(string_2_to_convert, "%Y-%m-%d %H:%M:%S").timestamp()
print(ts1)
print(ts2)
NOTICE: the CEST part might be non-portable, as strptime's %Z only accepts UTC, GMT, and the timezone names that appear in time.tzname.
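One way to sidestep the %Z portability problem with only the standard library is to pull the abbreviation out of the string and map it to a fixed offset yourself. The following is a rough sketch, not a general solution: the ABBREVIATION_OFFSETS table and both function names are hypothetical, the table only covers the abbreviations listed in it, and the second string is simply assumed to be UTC because it carries no zone information.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical lookup table: extend it with whatever abbreviations your data uses.
ABBREVIATION_OFFSETS = {
    "CEST": timezone(timedelta(hours=2)),
    "CET": timezone(timedelta(hours=1)),
    "UTC": timezone.utc,
}

def to_unix_with_abbreviation(text):
    """Parse 'Tue Jun 25 13:53:58 CEST 2019' style strings without using %Z."""
    weekday, month, day, clock, abbreviation, year = text.split()
    naive = datetime.strptime(f"{month} {day} {clock} {year}", "%b %d %H:%M:%S %Y")
    # Attach the fixed offset, then convert to seconds since the epoch.
    return naive.replace(tzinfo=ABBREVIATION_OFFSETS[abbreviation]).timestamp()

def to_unix_assuming_utc(text):
    """Parse '2019-06-25 13:53:58' style strings, treating them as UTC (an assumption)."""
    naive = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    return naive.replace(tzinfo=timezone.utc).timestamp()

ts1 = to_unix_with_abbreviation('Tue Jun 25 13:53:58 CEST 2019')
ts2 = to_unix_assuming_utc('2019-06-25 13:53:58')
print(ts1)  # 1561463638.0
print(ts2)  # 1561470838.0
```

Because both datetimes are timezone-aware before .timestamp() is called, the results do not depend on the local timezone of the machine.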

Convert weekday name string into datetime

I have the following date (stored with object dtype): Tue 31 Jan in a pandas Series,
and I am trying to change it into: 31/01/2019
How can I achieve this? I understand more or less that pandas.Datetime can convert easily when the date string is clearer (like 6/1/1930 22:00), but not in my case, where there is a weekday name.
Thank you for your help.
Concatenate the year and call pd.to_datetime with a custom format:
import pandas as pd
s = pd.Series(['Tue 31 Jan', 'Mon 20 Feb'])
pd.to_datetime(s + ' 2019', format='%a %d %b %Y')
0 2019-01-31
1 2019-02-20
dtype: datetime64[ns]
This is fine as long as all your dates follow this format. If that is not the case, this cannot be solved reliably.
More information on datetime formats at strftime.org.
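The question asks for strings such as 31/01/2019 rather than datetime64 values. A minimal sketch of that last step, assuming pandas is installed and that 2019 is the intended year for every row, is to chain .dt.strftime onto the parsed Series:

```python
import pandas as pd

s = pd.Series(['Tue 31 Jan', 'Mon 20 Feb'])

# Append the assumed year, parse, then format back to day/month/year strings.
parsed = pd.to_datetime(s + ' 2019', format='%a %d %b %Y')
formatted = parsed.dt.strftime('%d/%m/%Y')

print(formatted.tolist())  # ['31/01/2019', '20/02/2019']
```

The result is a Series of plain strings again, so do this only at the point where the values are displayed or exported.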
Another option is using the 3rd party dateutil library:
import dateutil.parser  # import the submodule explicitly; a bare "import dateutil" may not load it
s.apply(dateutil.parser.parse)
0 2018-01-31
1 2018-02-20
dtype: datetime64[ns]
It can be installed from PyPI (pip install python-dateutil).
Another, slower but more flexible option is the third-party datefinder library, which sniffs dates out of strings containing arbitrary text (if this is what you need):
import datefinder
s.apply(lambda x: next(datefinder.find_dates(x)))
0 2018-01-31
1 2018-02-20
dtype: datetime64[ns]
You can install it from PyPI (pip install datefinder).
Convert to a datetime object
If you wanted to use the datetime module, you could get the year by doing the following:
import datetime as dt
d = dt.datetime.strptime('Tue 31 Jan', '%a %d %b').replace(year=dt.datetime.now().year)
This is taking the date in your format, but replacing the default year 1900 with the current year in a reliable way.
This is similar to the other answers, but uses the builtin replace method as opposed to concatenating a string.
Output
To get the desired output from your new datetime object, you could perform the following:
>>> d.strftime('%d/%m/%Y')
'31/01/2018'
Here are two alternative ways to achieve the same result.
Method 1: Using datetime module
from datetime import datetime
datetime_object = datetime.strptime('Tue 31 Jan', '%a %d %b')
print(datetime_object) # outputs 1900-01-31 00:00:00
If you had supplied a year, as in Tue 31 Jan 2018, then this code would work:
from datetime import datetime
datetime_object = datetime.strptime('Tue 31 Jan 2018', '%a %d %b %Y')
print(datetime_object) # outputs 2018-01-31 00:00:00
To print the resulting date in a format like 31/01/2019, you can use:
print(datetime_object.strftime("%d/%m/%Y")) # outputs 31/01/2018
All of the available format codes are listed in the Python documentation for strftime() and strptime().
Method 2: Using dateutil.parser
This method automatically fills in the Year parameter with current year.
from dateutil import parser
string = "Tue 31 Jan"
date = parser.parse(string)
print(date) # outputs 2018-01-31 00:00:00
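All of the approaches above assume a year, and strptime itself does not check that the weekday name agrees with the date. In fact 31 January 2019 fell on a Thursday, not a Tuesday. If the weekday in the data is trustworthy, it can be used to guess the year instead. The sketch below is one hypothetical way to do that: the function name, the start_year argument, and the 28-year lookback limit are all assumptions made for illustration, and it relies on English weekday and month abbreviations.

```python
from datetime import date, datetime

def infer_date_from_weekday(text, start_year, max_lookback=28):
    """Return the most recent date (searching back from start_year) whose weekday matches."""
    weekday_name, day, month = text.split()
    # Year 2000 is a leap-year placeholder so that "29 Feb" still parses.
    partial = datetime.strptime(f"{day} {month} 2000", "%d %b %Y")
    for year in range(start_year, start_year - max_lookback, -1):
        try:
            candidate = date(year, partial.month, partial.day)
        except ValueError:
            # 29 Feb does not exist in this year; try the previous one.
            continue
        if candidate.strftime("%a") == weekday_name:
            return candidate
    return None  # no matching year inside the lookback window

result = infer_date_from_weekday("Tue 31 Jan", 2019)
print(result.strftime("%d/%m/%Y"))  # 31/01/2017
```

A weekday and day-of-month pair repeats every few years, so this only picks the nearest candidate; it cannot prove which year was meant.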

How to parse time retrieved from Facebook Graph into 12 hour format?

When I pull event start times from Facebook Graph, they come in this form:
2017-09-26T18:00:00+0300
I'd like to convert it into readable format so I use this:
readable_event_date = dateutil.parser.parse(event_date).strftime('%a, %b %d %Y %H:%M:%S')
and it comes out like this:
Tue, 26 Sep 2017 18:00:00
Which is good but it loses the offset from UTC and I'd like it in AM PM format.
Thus, I would like it like this:
Tue, 26 Sep 2017 9:00 PM
To get the 12-hour format and keep the offset from UTC for printing:
from dateutil.parser import parse
event_date = '2017-09-26T18:00:00+0300'
date = parse(event_date)
offset = date.utcoffset()  # timedelta of +3 hours; public API instead of tzinfo._offset
readable_event_date = (date + offset).strftime('%a, %b %d %Y %I:%M:%S %p')
print(readable_event_date)
Output:
Tue, Sep 26 2017 09:00:00 PM
It seems like what you want is this time, expressed in UTC, in the format '%a, %b %d %Y %I:%M:%S %p'. Luckily, all the information you need to do this is contained in the datetime object that you parsed; you just need to convert it to UTC.
Python 2.6+ or Python 3.3+:
The approach you've taken using dateutil will work for Python 2.6+ or Python 3.3+ (and also works for a greater variety of datetime string formats):
from dateutil.parser import parse
# datetime.timezone does not exist in Python 2.7, so use dateutil's tzutc there
from dateutil.tz import tzutc
UTC = tzutc()
dt_str = '2017-09-26T18:00:00+0300'
dt = parse(dt_str)
dt_utc = dt.astimezone(UTC) # Convert to UTC
print(dt_utc.strftime('%a, %b %d %Y %I:%M:%S %p'))
# Tue, Sep 26 2017 03:00:00 PM
One thing I notice is that the date you've provided, as far as I can tell, represents 3PM in UTC, not 9PM (as your example states). This is one reason you should use .astimezone(UTC) rather than some other approach.
If you want to include the time zone offset information, you can also use the %z parameter on the non-converted version of the datetime object.
print(dt.strftime('%a, %b %d %Y %I:%M:%S%z %p'))
# Tue, Sep 26 2017 06:00:00+0300 PM
This %z parameter may also be useful even if you are keeping it in UTC, because then you can at least be clear that the date the user is seeing is a UTC date.
Python 3.2+ only:
Given that you know the exact format of the input string, in Python 3.2+ you can achieve the same thing without pulling in dateutil, and it will almost certainly be faster (which may or may not be a concern for you). In your case, here is how to rewrite the code so that it works with just the standard library:
from datetime import datetime, timezone
UTC = timezone.utc
dt_str = '2017-09-26T18:00:00+0300'
dt = datetime.strptime(dt_str, '%Y-%m-%dT%H:%M:%S%z')
dt_utc = dt.astimezone(UTC)
print(dt_utc.strftime('%a, %b %d %Y %I:%M:%S %p'))
# Tue, Sep 26 2017 03:00:00 PM
print(dt.strftime('%a, %b %d %Y %I:%M:%S%z %p'))
# Tue, Sep 26 2017 06:00:00+0300 PM
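To display the same instant in some other offset, astimezone accepts any tzinfo, including a fixed-offset datetime.timezone. Below is a small sketch built on the standard-library version above; the helper name to_12_hour, its target parameter, and the chosen output format are assumptions for illustration, and the +06:00 offset is just an example value.

```python
from datetime import datetime, timedelta, timezone

def to_12_hour(dt_str, target=timezone.utc):
    """Parse a Graph-style timestamp and render it in 12-hour form in the target zone."""
    dt = datetime.strptime(dt_str, '%Y-%m-%dT%H:%M:%S%z')
    # astimezone keeps the instant the same and only changes how it is displayed.
    return dt.astimezone(target).strftime('%a, %d %b %Y %I:%M %p %z')

in_utc = to_12_hour('2017-09-26T18:00:00+0300')
in_plus_six = to_12_hour('2017-09-26T18:00:00+0300', timezone(timedelta(hours=6)))
print(in_utc)       # Tue, 26 Sep 2017 03:00 PM +0000
print(in_plus_six)  # Tue, 26 Sep 2017 09:00 PM +0600
```

Keeping %z in the output makes it explicit which offset the displayed time refers to.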

Calculating days using string dates in Python

I have dates in the following string format: 'Tue Feb 19 00:09:28 +1100 2013'
I'm trying to figure out how many days have passed between the date in the string and the present date.
I've been able to convert the string into a date.
import time
day = time.strptime('Tue Feb 19 00:09:28 +1100 2013', '%a %b %d %H:%M:%S +1100 %Y')
Use the datetime module instead:
import datetime
day = datetime.datetime.strptime('Tue Feb 19 00:09:28 +1100 2013', '%a %b %d %H:%M:%S +1100 %Y')
delta = datetime.datetime.now() - day  # time elapsed since the parsed date
print(delta.days)
Subtracting two datetime.datetime values returns a datetime.timedelta object, which has a days attribute.
Your strings do contain a timezone offset, and you hardcoded it to match; if the value varies you'll have to use a parser that can handle the offset. The python-dateutil package includes both an excellent parser and the timezone support to handle this:
>>> from dateutil import parser
>>> parser.parse('Tue Feb 19 00:09:28 +1100 2013')
datetime.datetime(2013, 2, 19, 0, 9, 28, tzinfo=tzoffset(None, 39600))
Note that because this result includes the timezone, you now need to use timezone-aware datetime objects when using date arithmetic:
>>> from dateutil import tz
>>> import datetime
>>> utcnow = datetime.datetime.now(tz.tzutc())
>>> then = parser.parse('Tue Feb 19 00:09:28 +1100 2013')
>>> utcnow - then
datetime.timedelta(31, 12087, 617740)
>>> (utcnow - then).days
31
I created a utcnow variable in the above example based on the UTC timezone before calculating how long ago the parsed date was.
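On Python 3.2+ the same calculation can be done without dateutil, because strptime understands a numeric offset through %z. The following is a sketch under that assumption; the function name days_since and its now parameter are hypothetical and exist mainly so the example can be run with a fixed reference time.

```python
from datetime import datetime, timezone

def days_since(text, now=None):
    """Return whole days elapsed between the timestamp in text and now."""
    # %z consumes the '+1100' offset, producing a timezone-aware datetime.
    then = datetime.strptime(text, '%a %b %d %H:%M:%S %z %Y')
    if now is None:
        now = datetime.now(timezone.utc)
    return (now - then).days

# Fixed reference time so the result is reproducible.
fixed_now = datetime(2013, 3, 22, 0, 0, 0, tzinfo=timezone.utc)
elapsed = days_since('Tue Feb 19 00:09:28 +1100 2013', now=fixed_now)
print(elapsed)  # 31
```

Both operands are timezone-aware, so the subtraction compares actual instants rather than wall-clock readings.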

date format with timezone

How do I format a date in python to look like this: weekday:month:day(number):HH:MM:SS(military):EST/CST/PST:YYYY? I am familiar with strftime(), but I am unsure how I would handle the HH:MM:SS and EST/CST/PST.
example of how I am trying to get the date to look:
Sun Mar 10 15:53:00 EST 2013
from time import gmtime, strftime
print(strftime("%a %b %d %H:%M:%S %Z %Y", gmtime()))
This will produce
Fri Mar 22 21:10:56 Eastern Standard Time 2013
You'll have to settle for the long name of the timezone unless you want to use pytz. I suppose it's worth noting that timezone abbreviations aren't unique.
Use strftime to output a formatted string representation:
import time
print(time.strftime("%a %b %d %H:%M:%S %Z %Y"))
A list of the format codes can be found in the Python documentation for time.strftime.
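If the short abbreviation matters more than real timezone rules, one standard-library option on Python 3 is a fixed-offset datetime.timezone with an explicit name, which %Z then prints as given. This is a sketch with a hand-made EST constant (an assumption, not something the platform supplies); it ignores daylight saving time entirely, so it labels every timestamp EST regardless of the date.

```python
from datetime import datetime, timedelta, timezone

# Hand-made fixed offset: UTC-5, labelled "EST". No daylight saving rules are applied.
EST = timezone(timedelta(hours=-5), "EST")

stamp = datetime(2013, 3, 10, 15, 53, 0, tzinfo=EST)
formatted = stamp.strftime("%a %b %d %H:%M:%S %Z %Y")
print(formatted)  # Sun Mar 10 15:53:00 EST 2013
```

For zones that switch between standard and daylight time, a library with real timezone data such as pytz is the more appropriate tool.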
