I have found answers to questions like this one helpful, but not complete for my problem.
I have a form that automatically produces a date for the user. I would like to store that as a datetime.
I don't need any of the information after the seconds, but I cannot find a datetime.datetime.strptime format code to translate the remaining part. So I would like either a strptime format that works for Python 2.7 on Google App Engine, or a string-editing trick for removing the extra information that is not needed.
datefromuser = '2012-09-22 07:36:36.333373-05:00'
You can slice your string to only select the first 19 characters:
>>> datefromuser='2012-09-22 07:36:36.333373-05:00'
>>> datefromuser[:19]
'2012-09-22 07:36:36'
This lets you parse the date without having to bother with the microseconds and timezone.
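To complete the parse, the sliced string can be handed to strptime. A minimal sketch, assuming the input always starts with the 19-character 'YYYY-MM-DD HH:MM:SS' prefix shown above:

```python
import datetime

datefromuser = '2012-09-22 07:36:36.333373-05:00'

# Keep only 'YYYY-MM-DD HH:MM:SS'; the microseconds and UTC offset are discarded.
parsed = datetime.datetime.strptime(datefromuser[:19], '%Y-%m-%d %H:%M:%S')
print(parsed)  # 2012-09-22 07:36:36
```

The result is a naive datetime (no tzinfo), so the -05:00 offset is lost.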
Do note that you probably do want to parse the timezone too though. You can use the iso8601 module to handle the whole format, without the need to slice:
>>> import iso8601
>>> iso8601.parse_date(datefromuser)
datetime.datetime(2012, 9, 22, 7, 36, 36, 333373, tzinfo=<FixedOffset '-05:00'>)
The iso8601 module is written in pure python and works without problems on the Google App Engine.
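For readers who are not tied to Python 2.7, the standard library can parse this exact string as well. This is only a sketch and assumes Python 3.7 or later, where datetime.fromisoformat exists; it does not apply to the asker's App Engine 2.7 environment:

```python
from datetime import datetime, timezone

datefromuser = '2012-09-22 07:36:36.333373-05:00'

# Assumes Python 3.7+; fromisoformat is not available on Python 2.7.
aware = datetime.fromisoformat(datefromuser)

# Normalise to UTC and drop the microseconds before storing.
as_utc = aware.astimezone(timezone.utc).replace(microsecond=0)
print(as_utc.isoformat())  # 2012-09-22T12:36:36+00:00
```

Converting to UTC before storing keeps the instant unambiguous even though the original offset is no longer recorded.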
Python Docs would be a good place to start. strptime() would be your best option.
import datetime
datefromuser = '2012-09-22 07:36:36.333373-05:00'
datetime.datetime.strptime(datefromuser.split(".")[0], "%Y-%m-%d %H:%M:%S")
2012-09-22 07:36:36
http://docs.python.org/library/datetime.html#strftime-and-strptime-behavior
I'm trying to use Twilio's messaging services to schedule a text, but I can't figure out how to format the time properly. I've pretty much copied the instructions to a T from Twilio's website, but I keep getting an invalid syntax error. Here's the line of code for the send_at variable:
send_at=datetime(2022-2-8'T'17:50:00'Z'),
How do I format this properly so it'll run? Thanks in advance for the help.
How is this timestamp being generated? Are you hard-coding it? The syntax error you're seeing is Python being unable to understand what's between the parentheses; it isn't valid syntax.
You could simply use the isoformat() method from Python's datetime library to convert a regular datetime object to an ISO 8601 string. For example:
>>> import datetime
>>> x = datetime.datetime(2022, 2, 8, 17, 50)
>>> x.isoformat()
'2022-02-08T17:50:00'
Twilio's own docs directly suggest this pattern (see the "Send Scheduled SMS in Python" section):
message = client.messages.create(
    from_=messaging_service_sid,
    to='+1xxxxxxxxxx',  # ← your phone number here
    body='Friendly reminder that you have an appointment with us next week.',
    schedule_type='fixed',
    send_at=send_when.isoformat() + 'Z',
)
Looks like the only additional detail is appending that 'Z' at the end, which is the only difference between my first snippet and your original example. If there are different docs that you followed that you can share a link to, happy to give more specific advice.
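One caveat with appending 'Z' by hand: it only gives a correct string when the datetime is naive and already expressed in UTC. A hedged sketch of an alternative that formats the string explicitly (the send time here is a made-up example, and treating it as UTC is an assumption):

```python
from datetime import datetime, timezone

# Hypothetical send time; interpreted as UTC here.
send_when = datetime(2022, 2, 8, 17, 50, tzinfo=timezone.utc)

# isoformat() on an aware datetime ends in '+00:00', so appending 'Z' to it
# would give '...+00:00Z'. Formatting explicitly avoids that.
send_at = send_when.strftime('%Y-%m-%dT%H:%M:%SZ')
print(send_at)  # 2022-02-08T17:50:00Z
```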
Suppose I have a large set of strings that I want to parse into a set of datetime objects. I could use dateutil.parser and iterate through the set, but that is more computationally intensive and slower than parsing one string, retrieving the strptime format that was applied, and then calling datetime.strptime(string, model) on the rest.
I wanted to create a function, a bit like the following:
def retrieve_format(datetime_object, string):
    # do some things
    return model
with the model being a string.
I have found nothing that explains the inner workings of the dateutil parser, and I believe the developers have the ability to add such a feature.
Any idea how to do it? It would save time and computing power.
Example
Suppose I have a set of strings that are formatted the same way as this one:
myStr = '27/03/2020 - 16:20'
I could do
myDate = dateutil.parser.parse(myStr)
and get 'myDate' as being
datetime.datetime(2020, 3, 27, 16, 20)
but now I could use my function as such
>>> model = retrieve_format(myDate, myStr)
>>> print(model)
%d/%m/%Y - %H:%M
I could then do
datetime_set = set()  # {} would create a dict, which has no .add()
for formatted_string in string_set:  # string_set: the strings to parse
    raw = datetime.datetime.strptime(formatted_string, model)
    datetime_set.add(raw)
to treat all the other elements very efficiently.
Okay, so thanks to snakecharmerb's comment on my question, I found this comment which uses the dateinfer library. Here, just the string is needed. It can be installed with pip:
pip install pydateinfer
A working example would be the following
import dateinfer
dateinfer.infer(['27/03/2020 - 16:20', '28/03/2020 - 14:56' ])
and the output is
'%d/%m/%Y - %H:%M'
The input is always a list, even if it contains only one element.
Depending on how ambiguous the strings are, the list needs more or fewer elements. For example, in '04/04/2020' there is no way to distinguish the day from the month.
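Once the format string is known, the rest of the set can be parsed with plain strptime. The sketch below hard-codes the format returned in the example above instead of calling dateinfer, and the second half illustrates the day/month ambiguity with a made-up string:

```python
import datetime

# Format string as returned by dateinfer.infer in the example above.
model = '%d/%m/%Y - %H:%M'

strings = {'27/03/2020 - 16:20', '28/03/2020 - 14:56'}
datetime_set = {datetime.datetime.strptime(s, model) for s in strings}

# Ambiguity: '03/04/2020' parses under both orders, giving different dates.
day_first = datetime.datetime.strptime('03/04/2020', '%d/%m/%Y')
month_first = datetime.datetime.strptime('03/04/2020', '%m/%d/%Y')
print(day_first.date(), month_first.date())  # 2020-04-03 2020-03-04
```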
Here's an array of datetime values:
array = np.array(['2016-05-01T00:00:59.3+10:00', '2016-05-01T00:02:59.4+10:00',
                  '2016-05-01T00:03:59.4+10:00', '2016-05-01T00:13:00.1+10:00',
                  '2016-05-01T00:22:00.5+10:00', '2016-05-01T00:31:01.1+10:00'],
                 dtype=object)
pd.to_datetime is very good at inferring datetime formats.
array = pd.to_datetime(array)
print(array)
DatetimeIndex(['2016-04-30 14:00:59.300000', '2016-04-30 14:02:59.400000',
               '2016-04-30 14:03:59.400000', '2016-04-30 14:13:00.100000',
               '2016-04-30 14:22:00.500000', '2016-04-30 14:31:01.100000'],
              dtype='datetime64[ns]', freq=None)
How can I dynamically figure out what datetime format pd.to_datetime inferred? Something like: %Y-%m-%dT... (sorry, my datetime foo is really bad).
I don't think it's possible to do this in full generality in pandas.
As mentioned in other comments and answers, the internal function _guess_datetime_format is close to being what you ask for, but it has strict criteria for what constitutes a guessable format and so it will only work for a restricted class of datetime strings.
These criteria are set out in the _guess_datetime_format function on these lines and you can also see some examples of good and bad formats in the test_parsing script.
Some of the main points are:
year, month and day must each be present and identifiable
the year must have four digits
exactly six digits must be used if using microseconds
you can't specify a timezone
This means that it will fail to guess the format of the datetime strings in the question, even though they are valid ISO 8601:
>>> from pandas.core.tools.datetimes import _guess_datetime_format_for_array
>>> array = np.array(['2016-05-01T00:00:59.3+10:00'])
>>> _guess_datetime_format_for_array(array)
# returns None
In this case, dropping the timezone and padding the microseconds to six digits is enough to make pandas recognise the format:
>>> array = np.array(['2016-05-01T00:00:59.300000']) # six digits, no tz
>>> _guess_datetime_format_for_array(array)
'%Y-%m-%dT%H:%M:%S.%f'
This is probably as good as it gets.
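The normalisation step described above (drop the timezone, pad the fractional seconds to six digits) can be done in plain Python before handing strings to the guesser. This is a rough sketch: the helper name strip_tz_and_pad and its regular expression are my own, and it only covers timestamps shaped like the ones in the question:

```python
import re
from datetime import datetime

# Matches 'YYYY-MM-DDTHH:MM:SS', optional fractional seconds, optional offset.
_PATTERN = re.compile(
    r'^(?P<base>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})'
    r'(?:\.(?P<frac>\d{1,6}))?'
    r'(?P<tz>Z|[+-]\d{2}:\d{2})?$'
)

def strip_tz_and_pad(ts):
    """Hypothetical helper: drop the UTC offset, pad fraction to six digits."""
    match = _PATTERN.match(ts)
    if match is None:
        raise ValueError('unsupported timestamp: %r' % ts)
    frac = (match.group('frac') or '').ljust(6, '0')
    return '%s.%s' % (match.group('base'), frac)

cleaned = strip_tz_and_pad('2016-05-01T00:00:59.3+10:00')
print(cleaned)  # 2016-05-01T00:00:59.300000

# The cleaned string now matches the format pandas guessed above.
parsed = datetime.strptime(cleaned, '%Y-%m-%dT%H:%M:%S.%f')
```

Note that discarding the offset changes the meaning of the timestamp, so this is only suitable for discovering the format, not for the final conversion.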
If pd.to_datetime is not asked to infer the format of the array, or given a format string to try, it will just try and parse each string separately and hope that it is successful. Crucially, it does not need to infer a format in advance to do this.
First, pandas parses the string assuming it is (approximately) an ISO 8601 format. This begins in a call to _string_to_dts and ultimately hits the low-level parse_iso_8601_datetime function that does the hard work.
You can check if your string is able to be parsed in this way using the _test_parse_iso8601 function. For example:
from pandas._libs.tslib import _test_parse_iso8601
def is_iso8601(string):
    try:
        _test_parse_iso8601(string)
        return True
    except ValueError:
        return False
The dates in the array you give are recognised as this format:
>>> is_iso8601('2016-05-01T00:00:59.3+10:00')
True
But this doesn't deliver what the question asks for and I don't see any realistic way to recover the exact format that is recognised by the parse_iso_8601_datetime function.
If parsing the string as an ISO 8601 format fails, pandas falls back to using the parse() function from the third-party dateutil library (called by parse_datetime_string). This allows a fantastic level of parsing flexibility but, again, I don't know of any good way to extract the recognised datetime format from this function.
If both of these two parsers fail, pandas either raises an error, ignores the string or defaults to NaT (depending on what the user specifies). No further attempt is made to parse the string or guess the format of the string.
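The strict-then-permissive fallback described above can be imitated with the standard library alone. To be clear, this is an analogy and not pandas' implementation: datetime.fromisoformat (Python 3.7+) stands in for the ISO 8601 parser, a hypothetical list of strptime formats stands in for dateutil, and None stands in for NaT:

```python
from datetime import datetime

# Hypothetical fallback formats; stand-ins for the flexible second parser.
FALLBACK_FORMATS = ('%d/%m/%Y - %H:%M', '%Y/%m/%d')

def parse_with_fallback(text, errors='raise'):
    # Step 1: try the strict ISO 8601 parser first.
    try:
        return datetime.fromisoformat(text)
    except ValueError:
        pass
    # Step 2: fall back to the more permissive parser(s).
    for fmt in FALLBACK_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    # Step 3: nothing worked; raise or return a missing-value marker.
    if errors == 'coerce':
        return None  # stands in for NaT
    raise ValueError('could not parse %r' % text)
```

As in pandas, no format is ever inferred here; each string is simply tried against each parser in turn.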
The DateInfer (PyDateInfer) library infers a date format from a list of example date strings:
github.com/wdm0006/dateinfer
Usage from docs:
>>> import dateinfer
>>> dateinfer.infer(['Mon Jan 13 09:52:52 MST 2014', 'Tue Jan 21 15:30:00 EST 2014'])
'%a %b %d %H:%M:%S %Z %Y'
Disclaimer: I have used and then contributed to this library
You can use _guess_datetime_format from core.tools to get the format, i.e.
from pandas.core.tools import datetimes as tools
tools._guess_datetime_format(pd.to_datetime(array).format()[0][:10])
Output :
'%Y-%m-%d'
To know more about this method you can see here. Hope it helps.
I am looking to retrieve the next possible date for a weekday contained in a string. Complexity being that this weekday will be in foreign language (sv_SE).
In bash I can solve this using `dateround`:
startdate=$(dateround --from-locale=sv_SE -z CET today $startday)
Highly appreciate your guidance on how to solve this in Python.
Thank you very much!
Dateparser has support for quite a few languages. You could parse the weekday to a datetime object then determine the next possible date available.
-- Edit --
from dateparser import parse
parse('Onsdag').isoweekday() # 3
Now that you have the iso weekday, you can find the next possible date. You can refer to this to see how.
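The last step, going from an ISO weekday number to the next matching date, needs only the standard library. In this sketch the Swedish name lookup is a hand-written dictionary standing in for dateparser, the function name is made up, and "next" is assumed to include today when today already falls on the requested weekday:

```python
import datetime

# Hand-written mapping from Swedish weekday names to ISO weekday numbers
# (Monday = 1 ... Sunday = 7); dateparser would replace this lookup.
SWEDISH_WEEKDAYS = {
    'måndag': 1, 'tisdag': 2, 'onsdag': 3, 'torsdag': 4,
    'fredag': 5, 'lördag': 6, 'söndag': 7,
}

def next_date_for(weekday_name, today=None):
    """Return the first date on or after `today` that falls on the weekday."""
    if today is None:
        today = datetime.date.today()
    target = SWEDISH_WEEKDAYS[weekday_name.lower()]
    days_ahead = (target - today.isoweekday()) % 7
    return today + datetime.timedelta(days=days_ahead)

# 2017-10-31 was a Tuesday, so the next Wednesday is the following day.
print(next_date_for('Onsdag', datetime.date(2017, 10, 31)))  # 2017-11-01
```

If today should never count as a match, one option is to replace a days_ahead of 0 with 7.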
It seems locale aliases are platform-specific and case-sensitive. I'm on Windows, so the locale will be sv_SE.
You can use Babel for date/time conversion; it is much more comprehensive than the native locale module.
Babel is an integrated collection of utilities that assist in internationalizing and localizing Python applications, with an emphasis on web-based applications.
Which can be installed as:
pip install Babel
Once installed, we can use the format_date, format_datetime and format_time utilities to render a date or time in another language.
You can use these utilities to convert date/time data between English and Swedish.
>>> import datetime
>>> from babel.dates import format_date, format_datetime, format_time
# Here we get the current date and time as a datetime object
>>> now = datetime.datetime.now()
>>> now
datetime.datetime(2017, 10, 31, 9, 46, 32, 650000)
# We format the datetime object in English using Babel
>>> format_date(now, locale='en')
u'Oct 31, 2017'
# We format the datetime object in Swedish using Babel
>>> format_date(now, locale='sv_SE')
u'31 okt. 2017'
Does anyone know how to parse the format as described in the title using Python's strptime method?
I have something similar to this:
import datetime
date = datetime.datetime.strptime(entry.published.text, '%Y-%m-%dT%H:%M:%S.Z')
I can't seem to figure out what kind of timeformat this is. By the way, I'm a newbie at the Python language (I'm used to C#).
UPDATE
This is how I changed the code based on the advice (answers) below:
from dateutil.parser import *
from datetime import *
date = parse(entry.published.text)
That date is in ISO 8601, or more specifically RFC 3339, format.
Such dates can't be parsed with strptime. There's a Python issue that discusses this.
dateutil.parser.parse can handle a wide variety of dates, including the one in your example.
If you're using an external module for XML or RSS parsing, there is probably a routine in there to parse that date.
Here's a good way to find the answer: using strftime, construct a format string that will emit what you see. That string will, by definition, be the string needed to PARSE the time with strptime.
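A sketch of that round-trip check follows. The sample timestamp is hypothetical, since the original string from the question is not shown here, and the candidate format treats 'T' and 'Z' as literal characters:

```python
import datetime

# Hypothetical sample; the original timestamp from the question is not shown.
sample = '2010-10-15T22:46:56.000Z'

# Candidate format: 'T' and 'Z' are literals, %f matches the fractional seconds.
fmt = '%Y-%m-%dT%H:%M:%S.%fZ'

parsed = datetime.datetime.strptime(sample, fmt)

# Round trip: formatting with the same string gives an equivalent timestamp.
# strftime always writes six fractional digits, so the text differs slightly.
print(parsed.strftime(fmt))  # 2010-10-15T22:46:56.000000Z
```

Because 'Z' is matched as plain text, the resulting datetime is naive; the UTC meaning of the suffix is not carried over.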
If you are trying to parse RSS or Atom feeds then use Universal Feed Parser. It supports many date/time formats.
>>> import feedparser # parse feed
>>> d = feedparser.parse("http://stackoverflow.com/feeds/question/3946689")
>>> t = d.entries[0].published_parsed # get date of the first entry as a time tuple
>>> import datetime
>>> datetime.datetime(*t[:6]) # convert time tuple to datetime object
datetime.datetime(2010, 10, 15, 22, 46, 56)
That's the standard XML datetime format, ISO 8601. If you're already using an XML library, most of them have datetime parsers built in. xml.utils.iso8601 works reasonably well.
import xml.utils.iso8601
date = xml.utils.iso8601.parse(entry.published.text)
You can look at a bunch of other ways to deal with that here:
http://wiki.python.org/moin/WorkingWithTime