I have a Python script that generates .csv files from other data sources.
Currently, an error happens when the user manually adds a space to a date by accident. Instead of inputting the date as "1/13/17", a space may be added at the front (" 1/13/17") so that there's a space in front of the month.
I've included the relevant part of my Python script below:
def processDateStamp(sourceStamp):
    matchObj = re.match(r'^(\d+)/(\d+)/(\d+)\s', sourceStamp)
    (month, day, year) = (matchObj.group(1), matchObj.group(2), matchObj.group(3))
    return "%s/%s/%s" % (month, day, year)
How do I trim the space in front of the month, and possibly around the other components of the date (the day and year) as well, so this doesn't break in the future?
Thanks in advance.
Since you're dealing with dates, it might be more appropriate to use datetime.strptime than regex here. There are two advantages of this approach:
It makes it slightly clearer to anyone reading that you're trying to parse dates.
Your code is more likely to raise an exception when the input doesn't represent a date, or represents one in an unexpected format. This is good because it helps you catch and address issues that might otherwise go unnoticed.
Here's the code:
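from datetime import datetime

def processDateStamp(sourceStamp):
    # Remove every space, then parse as month/day/two-digit year.
    # %m is the month; %M would mean minutes.
    date = datetime.strptime(sourceStamp.replace(' ', ''), '%m/%d/%y')
    # Format without zero padding on month and day, keeping a two-digit year.
    return '{}/{}/{:02d}'.format(date.month, date.day, date.year % 100)

if __name__ == '__main__':
    print(processDateStamp('1/13/17'))  # 1/13/17
    print(processDateStamp(' 1/13/17'))  # 1/13/17
    print(processDateStamp(' 1 /13 /17'))  # 1/13/17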
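To illustrate the second advantage, here is a minimal sketch of how a caller might react to the exception. The names `process_date_stamp` and `safe_process`, and the choice to return `None` for bad input, are assumptions made for illustration; they are not part of the original script.

```python
from datetime import datetime

def process_date_stamp(source_stamp):
    # Strip all spaces, then parse as month/day/two-digit year.
    date = datetime.strptime(source_stamp.replace(' ', ''), '%m/%d/%y')
    return '{}/{}/{:02d}'.format(date.month, date.day, date.year % 100)

def safe_process(source_stamp):
    # Hypothetical wrapper: report unparseable input instead of crashing.
    try:
        return process_date_stamp(source_stamp)
    except ValueError:
        return None

print(safe_process(' 1/13/17'))    # 1/13/17
print(safe_process('13/45/17'))    # None (there is no month 13)
print(safe_process('not a date'))  # None
```

Whether to return `None`, log the value, or re-raise depends on how the .csv generation should treat bad rows.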
You can also use parser from the python-dateutil library. The main benefit is that it can recognize the datetime format for you, which is sometimes useful:
from dateutil import parser
from datetime import datetime
def processDateTimeStamp(sourceStamp):
    dt = parser.parse(sourceStamp)
    return dt.strftime("%m/%d/%y")
processDateTimeStamp(" 1 /13 / 17") # returns 01/13/17
processDateTimeStamp(" jan / 13 / 17")
processDateTimeStamp(" 1 - 13 - 17")
processDateTimeStamp(" 1 .13 .17")
Once again, a perfect opportunity to use split, strip, and join:
def remove_spaces(date_string):
    date_list = date_string.split('/')
    result = '/'.join(x.strip() for x in date_list)
    return result
Examples
In [7]: remove_spaces('1/13/17')
Out[7]: '1/13/17'
In [8]: remove_spaces(' 1/13/17')
Out[8]: '1/13/17'
In [9]: remove_spaces(' 1/ 13/17')
Out[9]: '1/13/17'
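If stray tabs or newlines can show up as well, a related variant is to drop every whitespace character rather than stripping each component. This is only a sketch; the function name `remove_all_whitespace` is made up for the example, and it assumes a valid date never contains whitespace.

```python
def remove_all_whitespace(date_string):
    # str.split() with no argument splits on any run of whitespace,
    # so joining the pieces removes spaces, tabs and newlines alike.
    return ''.join(date_string.split())

print(remove_all_whitespace(' 1 /\t13 / 17\n'))  # 1/13/17
```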
def checkage(year,month,day):
    today_day=11
    today_month=6
    today_year=2021
    age=((today_year*365)+
    (today_month*30)+today_day)-
    ((year*365)+(month*30)+day)
    print(age//12,"year",(age-
    ((age//12)*365))//30,"month",
    (age-((age-
    ((age//12)*365))//30)*30,"days")
checkage(1996,12,11)
Python: the last line shows a syntax error. Why?
You need one more ) after ..."days")
(that clears the syntax error, at least )
Your parentheses are not balanced. If you want an expression to span several lines, you need to wrap the whole expression in an additional pair of parentheses. You should also use indentation to make your code more readable, like so:
def checkage(year,month,day):
    today_day=11
    today_month=6
    today_year=2021
    age=(((today_year*365)+
          (today_month*30)+today_day)-
         ((year*365)+(month*30)+day))
    print(age//12,
          "year",
          ((age-((age//12)*365))//30),
          "month",
          (age-((age-((age//12)*365))//30)*30),"days")
checkage(1996,12,11)
output:
745 year -8766 month 271925 days
This doesn't seem to be doing the correct calculation either. If you want to work with dates, you should consider the datetime library:
from datetime import date
def calculate_age(born):
    today = date.today()
    return today.year - born.year - ((today.month, today.day) < (born.month, born.day))
print(calculate_age(date(1996, 12, 11)))
output:
24
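The value printed above depends on the day the script is run. As a sketch, the same calculation can take the reference date as an optional argument so the result is reproducible; the `today` parameter is an addition made for this example and is not in the answer's code.

```python
from datetime import date

def calculate_age(born, today=None):
    # Fall back to the current date when no reference date is given.
    if today is None:
        today = date.today()
    # The tuple comparison is True (counts as 1) when this year's birthday
    # has not happened yet, so one year is subtracted in that case.
    return today.year - born.year - ((today.month, today.day) < (born.month, born.day))

print(calculate_age(date(1996, 12, 11), today=date(2021, 6, 11)))   # 24
print(calculate_age(date(1996, 12, 11), today=date(2021, 12, 11)))  # 25
```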
You are missing a ) in your last print statement.
def checkage(year,month,day):
    today_day=11
    today_month=6
    today_year=2021
    age=((today_year*365)+ (today_month*30)+today_day)- ((year*365)+(month*30)+day)
    print(age//12,"year",(age- ((age//12)*365))//30,"month", (age-((age-((age//12)*365))//30)*30,"days"))
checkage(1991,12,21)
Output
896 year -10543 month (327050, 'days')
P.S. Your logic isn't right: it accepts even 51 as a month, which is incorrect.
You are just missing the closing parenthesis of the print function:
def checkage(year,month,day):
    today_day=11
    today_month=6
    today_year=2021
    age=((today_year*365)+
         (today_month*30)+today_day)-((year*365)+(month*30)+day)
    print(age//12,"year",(age-((age//12)*365))//30,"month",(age-((age-((age//12)*365))//30)*30,"days"))
checkage(1996,12,11)
I'm building an application that parses XML from an HTTP request, and one of the attributes is a date.
The problem is that the format is a string without separators, for example '20190327200000000W', and I need to transform it into a datetime to send it to a database.
All the information I have found deals with strings that have some kind of separator character (2019-03-23 ...). Can you help me?
Thanks!!!
Maybe this? (in a Jupyter notebook)
from datetime import datetime
datetime_object = datetime.strptime('20190327200000000W', '%Y%m%d%H%M%S%fW')
datetime_object
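As a small sketch of the same idea wrapped in a function: the name `parse_compact` is hypothetical, and the code assumes the final character is a fixed one-letter suffix and that the three digits before it are milliseconds, which the question does not confirm.

```python
from datetime import datetime

def parse_compact(raw):
    # Assumption: the last character is a one-letter suffix to discard,
    # and the digits after the seconds are a fraction of a second.
    return datetime.strptime(raw[:-1], '%Y%m%d%H%M%S%f')

parsed = parse_compact('20190327200000000W')
print(parsed)              # 2019-03-27 20:00:00
print(parsed.isoformat())  # 2019-03-27T20:00:00
```

Most database drivers accept a `datetime` object directly, so converting back to a string is often unnecessary.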
Well, I have solved this. At first I did what Xenobiologist said, but I had a format problem, so I decided to delete the last character (the X of %X)... and then I realized that I didn't have a string, I had a list, so I converted it to a string and did the operations. My code (only the part inside the for loop, without the parsing part):
for parse in tree.iter(aaa):
    a = parse.get(m)
    respon = a.split(' ')
    if m == 'Fh':
        x = str(respon[0])
        x2 = len(x)
        x3 = x[:x2-1]  # drop the last character
        print(x3)
        y = time.strptime(x3, "%Y%m%d%H%M%S%f")
I have two DateTime strings. How would I compare them and tell which comes first?
A = '2019-02-12 15:01:45:145'
B = '2019-02-12 15:02:02:22'
This format includes milliseconds, so time.strptime cannot parse it directly. I chose to split on the last colon, parse the left part, convert the right part manually, and add the two together.
A = '2019-02-12 15:01:45:145'
B = '2019-02-12 15:02:02:22'
import time
def parse_date(s):
    date,millis = s.rsplit(":",1)
    return time.mktime(time.strptime(date,"%Y-%m-%d %H:%M:%S")) + int(millis)/1000.0
print(parse_date(A))
print(parse_date(B))
prints:
1549958505.145
1549958522.022
Now compare the results instead of printing them to get what you want.
If your convention on milliseconds is different (ex: here 22 could also mean 220), then it's slightly different. Pad with zeroes on the right, then parse:
def parse_date(s):
    date,millis = s.rsplit(":",1)
    millis = millis+"0"*(3-len(millis)) # pad with zeroes
    return time.mktime(time.strptime(date,"%Y-%m-%d %H:%M:%S")) + int(millis)/1000.0
In that case the result is:
1549958505.145
1549958522.22
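Note that time.mktime interprets the parsed time in the local time zone, so the printed numbers can differ from machine to machine. For a pure comparison, a sketch using datetime objects avoids that; the helper name `parse_stamp` is made up here, and the code assumes the trailing field is a plain millisecond count (so 22 means 22 ms).

```python
from datetime import datetime, timedelta

def parse_stamp(s):
    # Split off the part after the last colon and treat it as milliseconds.
    date_part, millis = s.rsplit(':', 1)
    base = datetime.strptime(date_part, '%Y-%m-%d %H:%M:%S')
    return base + timedelta(milliseconds=int(millis))

A = '2019-02-12 15:01:45:145'
B = '2019-02-12 15:02:02:22'

# datetime objects support <, >, == directly.
print(parse_stamp(A) < parse_stamp(B))  # True
```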
If both the date/time strings are in ISO 8601 format (YYYY-MM-DD hh:mm:ss) you can compare them with a simple string compare, like this:
a = '2019-02-12 15:01:45.145'
b = '2019-02-12 15:02:02.022'
if a < b:
    print('Time a comes before b.')
else:
    print('Time a does not come before b.')
Your strings, however, have an extra ':' after which come... milliseconds? I'm not sure. But if you convert them to a standard hh:mm:ss.xxx... form, then your date strings will be naturally comparable.
If there is no way to change the fact that you're receiving those strings in hh:mm:ss:xx format (I'm assuming that xx is milliseconds, but only you can say for sure), then you can "munge" the string slightly by parsing out the final ":xx" and re-attaching it as ".xxx", like this:
def mungeTimeString(timeString):
    """Converts a time string in "YYYY-MM-DD hh:mm:ss:xx" format
    to a time string in "YYYY-MM-DD hh:mm:ss.xxx" format."""
    head, _, tail = timeString.rpartition(':')
    return '{}.{:03d}'.format(head, int(tail))
Then call it with:
a = '2019-02-12 15:01:45:145'
b = '2019-02-12 15:02:02:22'
a = mungeTimeString(a)
b = mungeTimeString(b)
if a < b:
    print('Time a comes before b.')
else:
    print('Time a does not come before b.')
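To see why the zero padding matters, here is a short sketch comparing two made-up timestamps within the same second, before and after munging. It assumes, as above, that the trailing field is a millisecond count; `munge_time_string` is simply a copy of the function above under a different name so the example is self-contained.

```python
def munge_time_string(time_string):
    # Re-attach the final ":xx" field as a zero-padded ".xxx" fraction.
    head, _, tail = time_string.rpartition(':')
    return '{}.{:03d}'.format(head, int(tail))

early = '2019-02-12 15:02:02:22'   # 22 ms
late = '2019-02-12 15:02:02:100'   # 100 ms

# Raw strings compare character by character, so '22' sorts after '100'.
print(early < late)                                        # False
# After padding, '.022' sorts before '.100' as intended.
print(munge_time_string(early) < munge_time_string(late))  # True
```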
In my Python 3.6 application, from my input data I can receive datetimes in two different formats:
"datefield":"12/29/2017" or "datefield":"2017-12-31"
I need to make sure that I can handle either datetime format and convert it to (or leave it in) the ISO 8601 format. I want to do something like this:
#python pseudocode
import datetime
if datefield = "m%/d%/Y%":
    final_date = datetime.datetime.strptime(datefield, "%Y-%m-%d").strftime("%Y-%m-%d")
elif datefield = "%Y-%m-%d":
    final_date = datefield
The problem is I don't know how to check the datefield for a specific datetime format in that first if-statement in my pseudocode. I want a true or false back. I read through the Python docs and some tutorials. I did see one or two obscure examples that used try-except blocks, but that doesn't seem like an efficient way to handle this. This question is different from other Stack Overflow posts because I need to handle and validate two different cases, not just one case where I can simply fail it if it does not validate.
You can detect the first style of date by a simple string test, looking for the / separators. Depending on how "loose" you want the check to be, you could check a specific index or scan the whole string with a substring test using the in operator:
if "/" in datefield:  # or "if datefield[2] == '/'", or maybe "if datefield[-5] == '/'"
    final_date = datetime.datetime.strptime(datefield, "%m/%d/%Y").strftime("%Y-%m-%d")
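A substring test only shows that a separator is present. If a real true-or-false answer about the whole format is wanted, one possible sketch is a small helper that attempts the parse; the name `matches_format` is hypothetical, and it does rely on try-except internally.

```python
from datetime import datetime

def matches_format(value, fmt):
    # Return True when `value` parses completely under `fmt`, else False.
    try:
        datetime.strptime(value, fmt)
    except ValueError:
        return False
    return True

print(matches_format("12/29/2017", "%m/%d/%Y"))  # True
print(matches_format("2017-12-31", "%m/%d/%Y"))  # False
print(matches_format("2017-12-31", "%Y-%m-%d"))  # True
```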
Since you'll only ever deal with two date formats, just check for a / or a - character.
import datetime

# M/D/Y
if '/' in datefield:
    # strptime is the parsing function; %m and %d are month and day.
    final_date = datetime.datetime.strptime(datefield, '%m/%d/%Y').date().isoformat()
# Y-M-D
elif '-' in datefield:
    final_date = datetime.datetime.strptime(datefield, '%Y-%m-%d').date().isoformat()
A possible approach is to use the dateutil library. It contains many of the commonest datetime formats and can automagically detect these formats for you.
>>> from dateutil.parser import parse
>>> d1 = "12/29/2017"
>>> d2 = "2017-12-31"
>>> parse(d1)
datetime.datetime(2017, 12, 29, 0, 0)
>>> parse(d2)
datetime.datetime(2017, 12, 31, 0, 0)
NOTE: dateutil is a 3rd party library so you may need to install it with something like:
pip install python-dateutil
It can be found on pypi:
https://pypi.python.org/pypi/python-dateutil/2.6.1
And works with Python2 and Python3.
Alternate Examples:
Here are a couple of alternate examples of how well dateutil handles random date formats:
>>> d3 = "December 28th, 2017"
>>> parse(d3)
datetime.datetime(2017, 12, 28, 0, 0)
>>> d4 = "27th Dec, 2017"
>>> parse(d4)
datetime.datetime(2017, 12, 27, 0, 0)
I went with the advice of #ChristianDean and used the try-except block in an effort to be Pythonic. The first format, %m/%d/%Y, appears a bit more often in my data, so I lead the try-except with that format.
Here is my final solution:
import datetime
try:
    final_date = datetime.datetime.strptime(datefield, "%m/%d/%Y").strftime("%Y-%m-%d")
except ValueError:
    final_date = datetime.datetime.strptime(datefield, "%Y-%m-%d").strftime("%Y-%m-%d")
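If more formats turn up later, the same try-except idea can be sketched as a loop over candidate formats. The names `KNOWN_FORMATS` and `to_iso_date` are invented for this example, and raising ValueError for unknown input is an assumption about the desired behavior.

```python
from datetime import datetime

# Most common format first, so the typical input parses on the first try.
KNOWN_FORMATS = ("%m/%d/%Y", "%Y-%m-%d")

def to_iso_date(datefield, formats=KNOWN_FORMATS):
    for fmt in formats:
        try:
            return datetime.strptime(datefield, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue  # try the next candidate format
    raise ValueError("unrecognized date format: {!r}".format(datefield))

print(to_iso_date("12/29/2017"))  # 2017-12-29
print(to_iso_date("2017-12-31"))  # 2017-12-31
```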
I have a time format like this
t = "2012-03-20T08:31:00-05:00"
I can extract the time contents using RegEx like this.
p = re.compile(r"(\d{4})-(\d\d)-(\d\d)T(\d\d):(\d\d):(\d\d)[-+]\d\d:\d\d")
matches = p.findall(t)
But I was wondering if there is a way to convert this format directly to a Unix timestamp without using RegEx. Is there a calendar library or something similar?
datetime.datetime.strptime is your friend :)
http://docs.python.org/library/datetime.html#datetime.datetime.strptime
Use time.strptime:
time.strptime(t, "%Y-%m-%dT%H:%M:%S-%Z")
Unfortunately, this doesn't appear to work with numeric time zone offsets. python-dateutil should be able to handle your format:
dateutil.parser.parse(t)
Alternatively, you could split your string before the numeric offset:
t, offset_sign, offset = t[:-6], t[-6], t[-5:]
t = time.strptime(t, "%Y-%m-%dT%H:%M:%S")
offset = time.strptime(offset, "%H:%M")
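On Python 3.7 or later (an assumption about the environment, since the question does not state a version), a standard-library sketch can also handle the numeric offset: datetime.fromisoformat accepts this exact layout, and timestamp() takes the offset into account.

```python
from datetime import datetime

t = "2012-03-20T08:31:00-05:00"

# fromisoformat (Python 3.7+) returns an offset-aware datetime.
dt = datetime.fromisoformat(t)

# timestamp() on an aware datetime is independent of the local time zone.
unix_timestamp = int(dt.timestamp())
print(unix_timestamp)  # 1332250260
```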
Use strptime from the time module