Date regex in a sentence [duplicate] - python

This question already has answers here:
How to match a whole word with a regular expression?
(4 answers)
Closed 4 years ago.
I'm trying to use the date regex from this post:
^(?:(?:31(\/|-|\.)(?:0?[13578]|1[02]|(?:Jan|Mar|May|Jul|Aug|Oct|Dec)))\1|(?:(?:29|30)(\/|-|\.)(?:0?[1,3-9]|1[0-2]|(?:Jan|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec))\2))(?:(?:1[6-9]|[2-9]\d)?\d{2})$|^(?:29(\/|-|\.)(?:0?2|(?:Feb))\3(?:(?:(?:1[6-9]|[2-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))$|^(?:0?[1-9]|1\d|2[0-8])(\/|-|\.)(?:(?:0?[1-9]|(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep))|(?:1[0-2]|(?:Oct|Nov|Dec)))\4(?:(?:1[6-9]|[2-9]\d)?\d{2})$
However, I want to find all matches that are surrounded by whitespace inside a longer string.
For example in this sentence:
I went to Disney World on 11/11/1989 and once more on 12/12/2009
I want to get back:
11/11/1989
12/12/2009
How do I accomplish this? I'm using Python3 regex module if it matters.

If you want to tweak the regex you linked to work in a string like that, change the three ^ and $s to word boundaries (\b) instead:
\b(?:(?:31(\/|-|\.)(?:0?[13578]|1[02]|(?:Jan|Mar|May|Jul|Aug|Oct|Dec)))\1|(?:(?:29|30)(\/|-|\.)(?:0?[1,3-9]|1[0-2]|(?:Jan|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec))\2))(?:(?:1[6-9]|[2-9]\d)?\d{2})\b|\b(?:29(\/|-|\.)(?:0?2|(?:Feb))\3(?:(?:(?:1[6-9]|[2-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))\b|\b(?:0?[1-9]|1\d|2[0-8])(\/|-|\.)(?:(?:0?[1-9]|(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep))|(?:1[0-2]|(?:Oct|Nov|Dec)))\4(?:(?:1[6-9]|[2-9]\d)?\d{2})\b
https://regex101.com/r/WX5Itv/1
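In Python, one thing to watch for: the pattern contains capturing groups (the separators), so re.findall would return tuples of groups rather than the dates. A minimal sketch using re.finditer follows. The names PART_A, PART_B, PART_C and DATE_RE are just illustrative, and each of the three alternatives is wrapped in \b on both sides.

```python
import re

# The three alternatives of the date pattern, split up only for readability.
# Order matters: the backreferences \1..\4 rely on the group numbering.
PART_A = (r"(?:(?:31(\/|-|\.)(?:0?[13578]|1[02]|(?:Jan|Mar|May|Jul|Aug|Oct|Dec)))\1"
          r"|(?:(?:29|30)(\/|-|\.)(?:0?[1,3-9]|1[0-2]"
          r"|(?:Jan|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec))\2))"
          r"(?:(?:1[6-9]|[2-9]\d)?\d{2})")
PART_B = (r"(?:29(\/|-|\.)(?:0?2|(?:Feb))\3"
          r"(?:(?:(?:1[6-9]|[2-9]\d)?(?:0[48]|[2468][048]|[13579][26])"
          r"|(?:(?:16|[2468][048]|[3579][26])00))))")
PART_C = (r"(?:0?[1-9]|1\d|2[0-8])(\/|-|\.)"
          r"(?:(?:0?[1-9]|(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep))"
          r"|(?:1[0-2]|(?:Oct|Nov|Dec)))\4"
          r"(?:(?:1[6-9]|[2-9]\d)?\d{2})")

# Wrap every alternative in word boundaries instead of ^ ... $.
DATE_RE = re.compile("|".join(r"\b" + part + r"\b" for part in (PART_A, PART_B, PART_C)))

text = "I went to Disney World on 11/11/1989 and once more on 12/12/2009"

# group(0) is the whole match, regardless of the capturing groups inside.
dates = [m.group(0) for m in DATE_RE.finditer(text)]
print(dates)  # ['11/11/1989', '12/12/2009']
```

Note that \b is a word boundary, not a whitespace check. If the date must strictly be delimited by whitespace or the string ends, (?<!\S) and (?!\S) could be used in place of the leading and trailing \b.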

Related

Python Regex to match a colon either side (left and right) of a word [duplicate]

This question already has answers here:
Regex to find and replace emoji names within colons
(4 answers)
Closed 3 months ago.
At a complete loss here - trying to match a colon on either side of any given word in a passage of text.
For example:
:wave: Hello guys! :partyface: another huge win for us all to celebrate!
An appropriate regex that would match:
:wave:
:partyface:
Really appreciate your help!
\w*:\b
To match the whole token, including the colons:
:[^:]*:
To match only the content between the colons:
(?<=:)[^:]*(?=:)
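A quick check of both patterns against the sample sentence, as a rough sketch. One caveat: because the lookaround version does not consume the colons, it also picks up the text sitting between two emoji tokens. The \w+ variants at the end are my own suggestion, not part of the patterns above.

```python
import re

text = ":wave: Hello guys! :partyface: another huge win for us all to celebrate!"

# Whole token, colons included.
with_colons = re.findall(r":[^:]*:", text)
print(with_colons)  # [':wave:', ':partyface:']

# Lookarounds leave the colons unconsumed, so the text between
# ":wave:" and ":partyface:" also qualifies as "between two colons".
between = re.findall(r"(?<=:)[^:]*(?=:)", text)
print(between)  # ['wave', ' Hello guys! ', 'partyface']

# Suggested variant: require word characters only between the colons.
tokens = re.findall(r":\w+:", text)
names = re.findall(r":(\w+):", text)  # findall returns the group's text
print(tokens)  # [':wave:', ':partyface:']
print(names)   # ['wave', 'partyface']
```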

Python: Using Regex to remove multiple occurrences of punctuation? [duplicate]

This question already has answers here:
strip punctuation with regex - python
(4 answers)
Closed 2 years ago.
I'm looking to remove reoccurring punctuation in a row.
E.g. turn 'Hello...' into 'Hello.'
I've been reading some of the documentation on the matter, but am struggling to find a definitive method. (I personally find the docs on regex to be a little overwhelming, and unclear at times.)
I thought it may be something along the lines of:
re.sub('[!()-{};:,<>./?##$%^&*_~]+', '', input)
But this doesn't work. Any help? Thanks.
You can use this:
import re
input='Hello...'
re.sub(r'(\W)(?=\1)', '', input)
Output:
'Hello.'
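For what it's worth, here is a sketch of why the attempt in the question misbehaves, next to the pattern above. Inside a character class, )-{ is a range from ')' to '{', which covers digits and both cases of ASCII letters, and the replacement is an empty string, so everything is deleted. The variable name text and the last pattern are my own additions, not from the answer.

```python
import re

text = "Hello..."  # named `text` to avoid shadowing the built-in input()

# Attempt from the question: ")-{" is a range that includes letters and
# digits, and the replacement is '', so the whole string is removed.
attempt = re.sub('[!()-{};:,<>./?##$%^&*_~]+', '', text)
print(repr(attempt))  # ''

# Pattern from the answer: drop any non-word character that is
# immediately followed by a copy of itself.
collapsed = re.sub(r'(\W)(?=\1)', '', text)
print(collapsed)  # Hello.

# \W also matches whitespace, so repeated spaces collapse as well.
print(re.sub(r'(\W)(?=\1)', '', 'a  b'))  # a b

# Alternative sketch: collapse runs of the same punctuation character
# while leaving whitespace alone.
collapsed_alt = re.sub(r'([^\w\s])\1+', r'\1', 'Wait!!! What??')
print(collapsed_alt)  # Wait! What?
```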

Regex working in text editor(sublime) but not in python [duplicate]

This question already has answers here:
Case insensitive regular expression without re.compile?
(10 answers)
Closed 2 years ago.
I want to extract the line using regex.
The line that I want to extract from document is:
":method":"POST",":path":"/api/browser/projects/8bd4d1d3-0b69-515e-8e15-e9c49992f7d5/buckets/b-ao-mock-testing/copy
The regex I am using is:
":method"[:"a-z,/\d-]{20,1000}/copy
The code for the same in python is:
re.findall('":method"[:"a-z,/\d-]{20,1000}/copy', str(s), re.MULTILINE)
It works perfectly fine in Sublime Text but not in Python, where it returns an empty list. How can I resolve this?
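You need the i modifier (case-insensitive match); in Python, pass re.IGNORECASE (or re.I) as the flags argument.
Your character class only contains a-z, so without that flag the uppercase POST cannot match.
Alternatively, add the uppercase range to the class: ":method"[:"a-zA-Z,/\d-]{20,1000}/copy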
See demo
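A small sketch of the difference in Python, using the line from the question as the input. The variable names are illustrative, and the pattern is written as a raw string so that \d is not treated as a string escape.

```python
import re

line = ('":method":"POST",":path":"/api/browser/projects/'
        '8bd4d1d3-0b69-515e-8e15-e9c49992f7d5/buckets/b-ao-mock-testing/copy')

pattern = r'":method"[:"a-z,/\d-]{20,1000}/copy'

# Case-sensitive: "POST" is not covered by a-z, so nothing matches.
case_sensitive = re.findall(pattern, line)
print(case_sensitive)  # []

# Case-insensitive: a-z now also matches A-Z.
case_insensitive = re.findall(pattern, line, re.IGNORECASE)
print(case_insensitive == [line])  # True
```

The inline form (?i) at the start of the pattern has the same effect as passing re.IGNORECASE.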

RegEx for finding words between dots [duplicate]

This question already has answers here:
How to find overlapping matches with a regexp?
(4 answers)
Closed 4 years ago.
I am new to RegEx and I want to use regular expression to find words between dots.
For example, the text is something like:
abc.efg.hij.klm.opq.
I tried with below RegEx:
\.(\w+)\.
It only shows me 2 matches:
.efg.
.klm.
Why am I getting this result?
Here is the link to the RegEx: https://regex101.com/r/pqMN8t/1/
It only shows two matches because the regex engine will not match what it has already matched. After matching .efg., it won't match the dot before hij, because that dot has already been matched (the dot after efg).
One way to fix this is to not match the dots and use lookaheads and lookbehinds instead:
(?<=\.)\w+(?=\.)
This way, the dots won't get matched.
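The same comparison in Python, as a minimal sketch (re.findall returns the group's text when the pattern has one group). The last pattern is an extra variant of mine for the case where the first word, which has no dot in front of it, should be included too.

```python
import re

text = "abc.efg.hij.klm.opq."

# Consuming both dots: the dot after "efg" is used up, so "hij" is skipped.
consuming = re.findall(r"\.(\w+)\.", text)
print(consuming)  # ['efg', 'klm']

# Lookarounds assert the dots without consuming them.
lookaround = re.findall(r"(?<=\.)\w+(?=\.)", text)
print(lookaround)  # ['efg', 'hij', 'klm', 'opq']

# Variant: only require a trailing dot, which also picks up "abc".
trailing_dot = re.findall(r"\w+(?=\.)", text)
print(trailing_dot)  # ['abc', 'efg', 'hij', 'klm', 'opq']
```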

What are () (parentheses) for in Python regex [duplicate]

This question already has answers here:
Python regex -- extraneous matchings
(5 answers)
Closed 6 years ago.
I searched all over the internet and didn't find a good answer on this.
What do parentheses stand for in Python regex? It's very weird.
For example, if i do:
re.split(r'(\s*)', "ho from there")
it gives me a list of the separate words with the spaces between them. How does that happen?
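This isn't specific to Python: in a regex, parentheses denote a capture group. When the pattern passed to re.split contains a capture group, the text matched by that group is also returned in the result list, which is why the separators show up between the words.
Further information on how groups are handled is in the re.split documentation.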
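A short sketch of the effect. It uses \s+ rather than the pattern from the question, which is an assumption on my part: on Python 3.7 and later, a pattern that can match the empty string also splits between individual characters, which would obscure the point.

```python
import re

text = "ho from there"

# No group: the separators are discarded.
plain = re.split(r"\s+", text)
print(plain)  # ['ho', 'from', 'there']

# Capture group: the matched separators are kept in the result.
captured = re.split(r"(\s+)", text)
print(captured)  # ['ho', ' ', 'from', ' ', 'there']

# Non-capturing group (?:...): groups the pattern without keeping
# the separators.
non_capturing = re.split(r"(?:\s+)", text)
print(non_capturing)  # ['ho', 'from', 'there']
```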
