validate date : python - python

I want to know how to convert dates in different formats to one expected format in Python.
For example, I want every date in this format: 2/29/2012
['2012-02-01 // 2012-02-28', '2/15/2012', '2/13/2012', '2/14/2012', '2/23/2012', '2/18/2012', '2/29/2012']
How can I check whether today's date falls in the range '2012-02-01 // 2012-02-28'?
Please share your suggestions.

Use the datetime library in Python. You can compare two datetime.datetime objects directly, and you can pull out the year, month and day separately to put them in any form you want.
See this link for the full library details:
http://docs.python.org/library/datetime.html
Hope that helps.

The dateutil python library parses a wider variety of date formats than the standard datetime module.
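A minimal sketch of the datetime approach using only the standard library. The helper names (parse_entry, to_us_format, in_range) are made up for illustration, and the sketch assumes every entry is either a single M/D/YYYY date or a 'YYYY-MM-DD // YYYY-MM-DD' range, as in the question's list:

```python
from datetime import date, datetime


def parse_entry(text):
    """Return (start, end) dates for a single date or a 'start // end' range."""
    if "//" in text:
        start, end = (part.strip() for part in text.split("//"))
        return (
            datetime.strptime(start, "%Y-%m-%d").date(),
            datetime.strptime(end, "%Y-%m-%d").date(),
        )
    day = datetime.strptime(text.strip(), "%m/%d/%Y").date()
    return day, day


def to_us_format(day):
    # Built by hand because strftime has no portable unpadded month/day directive.
    return f"{day.month}/{day.day}/{day.year}"


def in_range(day, start, end):
    # date objects compare chronologically, so a chained comparison is enough.
    return start <= day <= end


start, end = parse_entry("2012-02-01 // 2012-02-28")
print(to_us_format(start), to_us_format(end))
print(in_range(date.today(), start, end))
```

strptime raises ValueError for an impossible date such as 2/30/2012, so parsing also serves as validation.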

Related

Python library to return date format

I need to return the date format from a string. Currently I am using a parser to parse the string as a date, then replacing the year with yyyy or yy, and similarly for the other date components. Is there some function I could use that would return mm-dd-yyyy when I send 12-05-2018?
Technically, it is an impossible question. If you send in 12-05-2018, there is no way for me to know whether you are sending in a mm-dd-yyyy (Dec 5, 2018) or dd-mm-yyyy (May 12, 2018).
One approach might be to do a regex replacement of anything which matches your expected date pattern, e.g.
import re
date = "Here is a date: 12-05-2018 and here is another one: 10-31-2010"
date_masked = re.sub(r'\b\d{2}-\d{2}-\d{4}\b', 'mm-dd-yyyy', date)
print(date)
print(date_masked)
Here is a date: 12-05-2018 and here is another one: 10-31-2010
Here is a date: mm-dd-yyyy and here is another one: mm-dd-yyyy
Of course, the above script makes no effort to check whether the dates are actually valid. If you require that, you may use one of the date libraries available in Python.
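If validity matters, one possible extension is to mask only the matches that actually parse as dates. This is a sketch using the standard library's datetime.strptime; the function name mask_valid_dates and its parameters are hypothetical, and it assumes the mm-dd-yyyy reading of each match:

```python
import re
from datetime import datetime

DATE_PATTERN = re.compile(r"\b\d{2}-\d{2}-\d{4}\b")


def mask_valid_dates(text, date_format="%m-%d-%Y", mask="mm-dd-yyyy"):
    """Replace each match with the mask, but only if it parses as a real date."""

    def replace(match):
        try:
            datetime.strptime(match.group(0), date_format)
        except ValueError:
            # Looks like a date but is not one (e.g. month 13): leave it alone.
            return match.group(0)
        return mask

    return DATE_PATTERN.sub(replace, text)


print(mask_valid_dates("valid: 12-05-2018, invalid: 13-45-2018"))
```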
I don't really understand what you plan to do with the format. There are two reasons I can think of why you might want it. (1) You want at some future point to convert a normalized datetime back into the original string. If that is what you want, you would be better off just storing the normalized datetime and the original string. Or (2) you want to draw (dodgy) conclusions about the person sending the data, because different nationalities tend to use different formats. But whatever you want it for, you can do it this way:
from dateutil import parser
def get_date_format(date_input):
    date = parser.parse(date_input)
    for date_format in ("%m-%d-%Y", "%d-%m-%Y", "%Y-%m-%d"):
        # You can extend the list above to include formats with %y in addition to %Y, etc, etc
        if date.strftime(date_format) == date_input:
            return date_format
>>> date_input = "12-05-2018"
>>> get_date_format(date_input)
'%m-%d-%Y'
You mention in a comment you are prepared to make assumptions about ambiguous dates like 12-05-2018 (could be May or December) and 05-12-18 (could be 2018 or 2005). You can pass those assumptions to dateutil.parser.parse. It accepts boolean keyword parameters dayfirst and yearfirst which it will use in ambiguous cases.
Take a look at the datetime library. There you will find the function strptime(), which is exactly what you are looking for.
Here is the documentation: https://docs.python.org/3/library/datetime.html#strftime-strptime-behavior
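Note that strptime() parses a string against a format you supply; it does not report the format by itself. A rough way to use it for this question is to try a list of candidate formats and collect every one that parses. The candidate list and the name matching_formats below are assumptions for illustration:

```python
from datetime import datetime

CANDIDATE_FORMATS = ("%m-%d-%Y", "%d-%m-%Y", "%Y-%m-%d")


def matching_formats(date_input, candidates=CANDIDATE_FORMATS):
    """Return every candidate format that strptime accepts for date_input."""
    matches = []
    for fmt in candidates:
        try:
            datetime.strptime(date_input, fmt)
        except ValueError:
            continue  # this format does not fit the string
        matches.append(fmt)
    return matches


print(matching_formats("12-05-2018"))  # ambiguous: two formats fit
print(matching_formats("31-10-2010"))  # only day-first fits
```

Returning a list rather than a single format makes the ambiguity of inputs like 12-05-2018 visible to the caller.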

Is it possible to extract a format string (e.g. "YY-mm-DD HH:MM:SS.sss") from a python datetime object? [duplicate]

Here's an array of datetime values:
array = np.array(['2016-05-01T00:00:59.3+10:00', '2016-05-01T00:02:59.4+10:00',
'2016-05-01T00:03:59.4+10:00', '2016-05-01T00:13:00.1+10:00',
'2016-05-01T00:22:00.5+10:00', '2016-05-01T00:31:01.1+10:00'],
dtype=object)
pd.to_datetime is very good at inferring datetime formats.
array = pd.to_datetime(array)
print(array)
DatetimeIndex(['2016-04-30 14:00:59.300000', '2016-04-30 14:02:59.400000',
'2016-04-30 14:03:59.400000', '2016-04-30 14:13:00.100000',
'2016-04-30 14:22:00.500000', '2016-04-30 14:31:01.100000'],
dtype='datetime64[ns]', freq=None)
How can I dynamically figure out what datetime format pd.to_datetime inferred? Something like: %Y-%m-%dT... (sorry, my datetime foo is really bad).
I don't think it's possible to do this in full generality in pandas.
As mentioned in other comments and answers, the internal function _guess_datetime_format is close to being what you ask for, but it has strict criteria for what constitutes a guessable format and so it will only work for a restricted class of datetime strings.
These criteria are set out in the _guess_datetime_format function on these lines and you can also see some examples of good and bad formats in the test_parsing script.
Some of the main points are:
year, month and day must each be present and identifiable
the year must have four digits
exactly six digits must be used if using microseconds
you can't specify a timezone
This means that it will fail to guess the format for datetime strings in the question despite them being a valid ISO 8601 format:
>>> from pandas.core.tools.datetimes import _guess_datetime_format_for_array
>>> array = np.array(['2016-05-01T00:00:59.3+10:00'])
>>> _guess_datetime_format_for_array(array)
# returns None
In this case, dropping the timezone and padding the microseconds to six digits is enough to make pandas recognise the format:
>>> array = np.array(['2016-05-01T00:00:59.300000']) # six digits, no tz
>>> _guess_datetime_format_for_array(array)
'%Y-%m-%dT%H:%M:%S.%f'
This is probably as good as it gets.
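The normalisation described above (drop the timezone, pad the fraction to six digits) can be sketched with the standard library alone. The helper name strip_tz_and_pad and the regular expression are assumptions, not part of pandas, and dropping the offset discards information, so this only suits format guessing:

```python
import re
from datetime import datetime

_ISO_STAMP = re.compile(
    r"^(?P<base>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})"  # date and time
    r"(?:\.(?P<frac>\d{1,6}))?"                        # optional fraction
    r"(?P<tz>Z|[+-]\d{2}:?\d{2})?$"                    # optional UTC offset
)


def strip_tz_and_pad(stamp):
    """Drop the UTC offset and right-pad the fraction to six digits."""
    match = _ISO_STAMP.match(stamp)
    if match is None:
        raise ValueError(f"unsupported timestamp: {stamp!r}")
    fraction = (match.group("frac") or "").ljust(6, "0")
    return f"{match.group('base')}.{fraction}"


padded = strip_tz_and_pad("2016-05-01T00:00:59.3+10:00")
print(padded)
# The padded string now parses with the format quoted above.
print(datetime.strptime(padded, "%Y-%m-%dT%H:%M:%S.%f"))
```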
If pd.to_datetime is not asked to infer the format of the array, or given a format string to try, it will just try and parse each string separately and hope that it is successful. Crucially, it does not need to infer a format in advance to do this.
First, pandas parses the string assuming it is (approximately) an ISO 8601 format. This begins in a call to _string_to_dts and ultimately hits the low-level parse_iso_8601_datetime function that does the hard work.
You can check if your string is able to be parsed in this way using the _test_parse_iso8601 function. For example:
from pandas._libs.tslib import _test_parse_iso8601
def is_iso8601(string):
    try:
        _test_parse_iso8601(string)
        return True
    except ValueError:
        return False
The dates in the array you give are recognised as this format:
>>> is_iso8601('2016-05-01T00:00:59.3+10:00')
True
But this doesn't deliver what the question asks for and I don't see any realistic way to recover the exact format that is recognised by the parse_iso_8601_datetime function.
If parsing the string as an ISO 8601 format fails, pandas falls back to using the parse() function from the third-party dateutil library (called by parse_datetime_string). This allows a fantastic level of parsing flexibility but, again, I don't know of any good way to extract the recognised datetime format from this function.
If both of these two parsers fail, pandas either raises an error, ignores the string or defaults to NaT (depending on what the user specifies). No further attempt is made to parse the string or guess the format of the string.
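The control flow described above (strict ISO parser first, a more lenient fallback second, a missing value if both fail) can be imitated with the standard library. This is only an analogy, not pandas' implementation: datetime.fromisoformat stands in for the ISO parser, a hypothetical list of strptime formats stands in for dateutil, and None stands in for NaT.

```python
from datetime import datetime

# Hypothetical fallback formats; dateutil is far more flexible than this.
FALLBACK_FORMATS = ("%m/%d/%Y", "%Y/%m/%d")


def parse_or_none(text):
    """Try an ISO 8601 parse, then fallback formats, else return None."""
    try:
        return datetime.fromisoformat(text)
    except ValueError:
        pass  # not ISO 8601, move on to the fallback stage
    for fmt in FALLBACK_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None  # both stages failed


print(parse_or_none("2016-05-01T00:00:59.300000+10:00"))
print(parse_or_none("2/15/2012"))
print(parse_or_none("not a date"))
```

As in pandas, nothing in this chain records which format matched; each string is simply parsed or not.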
The DateInfer (PyDateInfer) library infers a date format from a sequence of example dates:
github.com/wdm0006/dateinfer
Usage from docs:
>>> import dateinfer
>>> dateinfer.infer(['Mon Jan 13 09:52:52 MST 2014', 'Tue Jan 21 15:30:00 EST 2014'])
'%a %b %d %H:%M:%S %Z %Y'
>>>
Disclaimer: I have used and then contributed to this library
You can use _guess_datetime_format from core.tools to get the format, i.e.
from pandas.core.tools import datetimes as tools
tools._guess_datetime_format(pd.to_datetime(array).format()[0][:10])
Output :
'%Y-%m-%d'
To know more about this method you can see here. Hope it helps.

Converting "2013-01-06T22:25:08.733" to "2013-01-06 22:25:08" in python

I have a big .csv file which holds machine log data. One of the fields is a timestamp. It stores the date and time as shown in the title, and I would like to drop the milliseconds and convert it into the format also shown in the title. Can anyone help me with that? I'm new to Python and IPython.
Much obliged.
For your special case, this should suffice:
t.replace('T', ' ')[:19]
But I would recommend that you use the datetime module of the standard library instead, so your time conversion could also be internationalized.
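A minimal sketch of the datetime-based version. The function name drop_milliseconds is made up, and the sketch assumes every timestamp carries a fractional part as in the title:

```python
from datetime import datetime


def drop_milliseconds(stamp):
    """Convert '2013-01-06T22:25:08.733' to '2013-01-06 22:25:08'."""
    # %f accepts one to six fractional digits, so '.733' parses fine.
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%f")
    # The output format has no %f, so the fraction is truncated, not rounded.
    return parsed.strftime("%Y-%m-%d %H:%M:%S")


print(drop_milliseconds("2013-01-06T22:25:08.733"))
```

Unlike slicing, this raises ValueError on a malformed row, which can help catch bad log lines early.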
You can use easy_date to make it easy:
import date_converter
new_date = date_converter.string_to_string("2013-01-06T22:25:08.733", "%Y-%m-%dT%H:%M:%S.%f", "%Y-%m-%d %H:%M:%S")

How to implement calendar in Python?

For example I give the date as:
2/12/2015
The result should be:
February/Thursday/2015
I tried to do it with if statements, but I'm not getting the result. It would be nice if you could show me the long way too (without relying much on built-in modules like datetime). I'm new to Python and not much is taught at my school.
You don't have to use datetime much; simply parse the date and output it in whatever format you want:
from datetime import datetime
d = "2/12/2015"
print(datetime.strptime(d,"%m/%d/%Y").strftime("%B/%A/%Y"))
February/Thursday/2015
%A = Locale’s full weekday name.
%B = Locale’s full month name.
%Y = Year with century as a decimal number.
All the format directives are here
You could create a dict mapping, but you will find datetime is a lot simpler.
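For comparison, here is a sketch of the long way with plain lookup tables and no datetime. The weekday calculation uses Sakamoto's method, which is brought in here for illustration and is not from the answers above; it assumes Gregorian dates in M/D/YYYY form, and all names are hypothetical:

```python
MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]
DAY_NAMES = [
    "Sunday", "Monday", "Tuesday", "Wednesday",
    "Thursday", "Friday", "Saturday",
]
# Per-month offsets used by Sakamoto's method.
_MONTH_OFFSETS = [0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4]


def weekday_index(year, month, day):
    """Return 0 for Sunday through 6 for Saturday (Gregorian calendar)."""
    if month < 3:
        year -= 1  # January and February count as part of the previous year
    return (
        year + year // 4 - year // 100 + year // 400
        + _MONTH_OFFSETS[month - 1] + day
    ) % 7


def describe(date_text):
    month, day, year = (int(part) for part in date_text.split("/"))
    weekday = DAY_NAMES[weekday_index(year, month, day)]
    return f"{MONTH_NAMES[month - 1]}/{weekday}/{year}"


print(describe("2/12/2015"))  # February/Thursday/2015
```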

How to parse e.g. 2010-04-24T07:47:00.007+02:00 with Python strptime

Does anyone know how to parse the format described in the title using Python's strptime method?
I have something similar to this:
import datetime
date = datetime.datetime.strptime(entry.published.text, '%Y-%m-%dT%H:%M:%S.Z')
I can't seem to figure out what kind of time format this is. By the way, I'm a newbie at the Python language (I'm used to C#).
UPDATE
This is how I changed the code based on the advice (answers) below:
from dateutil.parser import *
from datetime import *
date = parse(entry.published.text)
That date is in ISO 8601, or more specifically RFC 3339, format.
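At the time this was written, such dates couldn't be parsed with strptime, because %z did not accept a UTC offset containing a colon (Python 3.7 and later do accept it). There's a Python issue that discusses this.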
dateutil.parser.parse can handle a wide variety of dates, including the one in your example.
If you're using an external module for XML or RSS parsing, there is probably a routine in there to parse that date.
Here's a good way to find the answer: using strftime, construct a format string that will emit what you see. That string will, by definition, be the string needed to PARSE the time with strptime.
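Following that suggestion on a newer interpreter: on Python 3.7 and later, the %z directive of strptime accepts an offset written with a colon, so the title's timestamp can be parsed with the standard library. A sketch, with the constant name RFC3339_FORMAT chosen for illustration:

```python
from datetime import datetime, timedelta

# %f takes the fractional seconds, %z the UTC offset (colon allowed on 3.7+).
RFC3339_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"

parsed = datetime.strptime("2010-04-24T07:47:00.007+02:00", RFC3339_FORMAT)
print(parsed.isoformat())
print(parsed.utcoffset() == timedelta(hours=2))
```

The round trip is not exact: strftime with the same format writes six fractional digits and an offset without the colon, so the strftime trick gives a starting point rather than a guarantee.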
If you are trying to parse RSS or Atom feeds then use Universal Feed Parser. It supports many date/time formats.
>>> import feedparser # parse feed
>>> d = feedparser.parse("http://stackoverflow.com/feeds/question/3946689")
>>> t = d.entries[0].published_parsed # get date of the first entry as a time tuple
>>> import datetime
>>> datetime.datetime(*t[:6]) # convert time tuple to datetime object
datetime.datetime(2010, 10, 15, 22, 46, 56)
That's the standard XML datetime format, ISO 8601. If you're already using an XML library, most of them have datetime parsers built in. xml.utils.iso8601 works reasonably well.
import xml.utils.iso8601
date = xml.utils.iso8601.parse(entry.published.text)
You can look at a bunch of other ways to deal with that here:
http://wiki.python.org/moin/WorkingWithTime
