regex: getting all matches for group - python

in python, i'm trying to extract all time-ranges (of the form HHmmss-HHmmss) from a string. i'm using this python code.
text = "random text 0700-1300 random text 1830-2230 random 1231 text"
regex = "(.*(\d{4,10}-\d{4,10}))*.*"
match = re.search(regex, text)
this only returns 1830-2230 but i'd like to get 0700-1300 and 1830-2230. in my application there may be zero or any number of time-ranges (within reason) in the text string. i'd appreciate any hints.

Try to use re.findall to find all matches (Regex demo.):
import re
text = "random text 0700-1300 random text 1830-2230 random 1231 text"
pat = r"\b\d{4,10}-\d{4,10}\b"
for m in re.findall(pat, text):
print(m)
Prints:
0700-1300
1830-2230

Try this regex.
(\d{4}-\d{4})
All you need is pattern {four digts}minus{ four digits}. Other parts are not needed.

Related

How to find specific pattern in a paragraph in Python?

I want to find a specific pattern in a paragraph. The pattern must contain a-zA-Z and 0-9 and length is 5 or more than 5. How to implement it on Python?
My code is:
str = "I love5 verye mu765ch"
print(re.findall('(?=.*[0-9])(?=.*[a-zA-Z]{5,})',str))
this will return a null.
Expected result like:
love5
mu765ch
the valid pattern is like:
9aacbe
aver23893dk
asdf897
This is easily done with some programming logic and a simple regex:
import re
string = "I love5 verye mu765ch a123...bbb"
pattern = re.compile(r'(?=\D*\d)(?=[^a-zA-Z]*[a-zA-Z]).{5,}')
interesting = [word for word in string.split() if pattern.match(word)]
print(interesting)
This yields
['love5', 'mu765ch', 'a123...bbb']
See a demo on ideone.com.

Strike-Down Markdown text between two symols

I'm trying to emulate the strike-through markdown from GitHub in python and I managed to do half of the job. Now there's just one problem I have: The pattern I'm using doesn't seem to replace the text containing symbols and I couldn't figure it out so I hope someone can help me
text = "This is a ~~test?~~"
match = re.findall(r"(?<![.+?])(~{2})(?!~~)(.+?)(?<!~~)\1(?![.+?])", text) # Finds all the text between ~~ symbols
if match:
for _, m in match: # Iterates though the matches. First variable (_) containing the symbol ~ and the second one (m) contains the text I want to replace
text = re.sub(f"~~{m}~~", "\u0336".join(m) + "\u0336", text) # Should replace ~~test?~~ with t̶e̶s̶t̶?̶ but it fails
There is a problem in the string that you are trying to replace. In your case, ~~{m}~~ where value of m is test? the regex to be replaced becomes ~~test?~~ and here ? has a special meaning which you aren't escaping hence the replace doesn't work properly. Just try using re.escape(m) instead of m so meta characters get escaped and are treated as literals.
Try your modified Python code,
import re
text = "This is a ~~test?~~"
match = re.findall(r"(?<![.+?])(~{2})(?!~~)(.+?)(?<!~~)\1(?![.+?])", text) # Finds all the text between ~~ symbols
if match:
for _, m in match: # Iterates though the matches. First variable (_) containing the symbol ~ and the second one (m) contains the text I want to replace
print(m)
text = re.sub(f"~~{re.escape(m)}~~", "\u0336".join(m) + "\u0336", text) # Should replace ~~test?~~ with t̶e̶s̶t̶?̶ but it fails
print(text)
This replaces like you expected and prints,
This is a t̶e̶s̶t̶?̶

How to extract function name python regex

Hello I am trying to extract the function name in python using Regex however I am new to Python and nothing seems to be working for me. For example: if i have a string "def myFunction(s): ...." I want to just return myFunction
import re
def extractName(s):
string = []
regexp = re.compile(r"\s*(def)\s+\([^\)]*\)\s*{?\s*")
for m in regexp.finditer(s):
string += [m.group()]
return string
Assumption: You want the name myFunction from "...def myFunction(s):..."
I find something missing in your regex and the way it is structured.
\s*(def)\s+\([^\)]*\)\s*{?\s*
Lets look at it step by step:
\s*: match to zero or more white spaces.
(def): match to the word def.
\s+: match to one or more white spaces.
\([^\)]*\): match to balanced ()
\s*: match to zero or more white spaces.
After that pretty much doesn't matter if you are going for just the name of the function. You are not matching the exact thing you want out of the regex.
You can try this regex if you are interested in doing it by regex:
\s*(def)\s([a-zA-Z]*)\([a-zA-z]*\)
Now the way I have structured the regex, you will get def myFunction(s) in group0, def in group1 and myFunction in group2. So you can use the following code to get you result:
import re
def extractName(s):
string = ""
regexp = re.compile(r"(def)\s([a-zA-Z]*)\([a-zA-z]*\)")
for m in regexp.finditer(s):
string += m.group(2)
return string
You can check your regex live by going on this site.
Hope it helps!

regex search and replace with modified result

I have a string with tagged elements inside. I want to remove the tags and add some characters to the content inside the tags.
s = 'Hello there <something>, this is more text <tagged content>'
result = 'Hello there somethingADDED, this is more text tagged contentADDED
So far, I've tried
import re
result = re.search('\<(.*)\>', s)
result = result.group(1)
and s = s.split('>') and regex each substring one by one, but it doesn't seem like the correct or efficient way of doing this.
Use back-reference \1.
x="Hello there <something>, this is more text <tagged content>"
print re.sub(r"<([^>]*)>",r"\1added",x)
Output :Hello there somethingadded, this is more text tagged contentadded

Python Regular Expression: Replace Withing a group

Is there a way to do substitution on a group?
Say I am trying to insert a link into text, based on custom formatting. So, given something like this:
This is a random text. This should be a [[link somewhere]]. And some more text at the end.
I want to end up with
This is a random text. This should be a link somewhere. And some more text at the end.
I know that '\[\[(.*?)\]\]' will match stuff within square brackets as group 1, but then I want to do another substitution on group 1, so that I can replace space with _.
Is that doable in a single re.sub regex expression?
You can use a function as a replacement instead of string.
>>> import re
>>> def as_link(match):
... link = match.group(1)
... return '{}'.format(link.replace(' ', '_'), link)
...
>>> text = 'This is a random text. This should be a [[link somewhere]]. And some more text at the end.'
>>> re.sub(r'\[\[(.*?)\]\]', as_link, text)
'This is a random text. This should be a link somewhere. And some more text at the end.'
You could do something like this.
import re
pattern = re.compile(r'\[\[([^]]+)\]\]')
def convert(text):
def replace(match):
link = match.group(1)
return '{}'.format(link.replace(' ', '_'), link)
return pattern.sub(replace, text)
s = 'This is a random text. This should be a [[link somewhere]]. .....'
convert(s)
See working demo

Categories