This question already has answers here:
My regex is matching too much. How do I make it stop? [duplicate]
(5 answers)
Closed 5 years ago.
I have a file that contains a repeated block of output:
!-----------------------------------------------------------------
line 1
line 2
line 3
.....
-------------------------------------------------------------------!
I am trying to match and extract all occurrences of these blocks, but the code below returns the whole file:
match = re.search(r'\!-.*-\!', data, re.DOTALL)
print(match.group())
Quantifiers in Python regexes are greedy by default, meaning * will consume as many characters as possible. You can turn off greediness by using *? instead:
match = re.search(r'\!-.*?-\!', data, re.DOTALL)
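Note that re.search returns only the first match. To extract every block, the lazy pattern can be passed to re.findall. Below is a minimal sketch; the contents of `data` are made up for illustration and only mimic the layout in the question.

```python
import re

# Hypothetical sample data with two blocks, laid out like the file in the question.
data = (
    "!-----\n"
    "line 1\n"
    "line 2\n"
    "-----!\n"
    "unrelated text\n"
    "!-----\n"
    "line 3\n"
    "-----!\n"
)

# Lazy .*? stops at the first "-!" after each "!-", so each block is matched separately.
blocks = re.findall(r'!-.*?-!', data, re.DOTALL)

# For comparison, greedy .* runs from the first "!-" to the last "-!".
greedy = re.search(r'!-.*-!', data, re.DOTALL).group()

print(len(blocks))  # 2
```

The backslashes before ! are dropped here because ! has no special meaning in a regex; keeping them is harmless.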
This question already has answers here:
Exclude characters from a character class
(5 answers)
Closed 2 years ago.
I want to replace every character in the specified ranges with *, except for a few characters such as \u0650, \u0660 and \u064F.
Note: I don't want to split the ranges, because I have a lot of characters to preserve.
data = re.sub(r'[\u0600-\u061E\u0620-\u065F\u0670-\u06ef]', "*", data)
You can put the characters to be excluded in a negative lookahead before the main character class.
For example:
(?![\u0650\u0660\u064F])[\u0600-\u061E\u0620-\u065F\u0670-\u06ef]
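Applied to the re.sub call from the question, the lookahead could be used roughly like this. The input string is a made-up example, not data from the question.

```python
import re

# The negative lookahead rejects the three protected characters before the
# character class gets a chance to match them.
pattern = re.compile(
    r'(?![\u0650\u0660\u064F])[\u0600-\u061E\u0620-\u065F\u0670-\u06ef]'
)

# Hypothetical input: alef, kasra (kept), beh, damma (kept), then ASCII text.
data = "\u0627\u0650\u0628\u064F abc"
masked = pattern.sub("*", data)

print(masked)
```

Here only the alef and the beh are replaced; the kasra, the damma and the ASCII text are left untouched.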
This question already has answers here:
Regular expression to match a line that doesn't contain a word
(34 answers)
Closed 3 years ago.
I'm trying to write a regex that filters out matches if they contain "plex" in them.
plex-release -> should not match
my-release -> should match
potato -> should match
Been playing with pythex and came up with this one that works partially:
(?![plex])(\w+)[-_](release|version)$
However this also messes with any other values containing the letter "p".
I'm trying to come up with a regex that rejects only values containing the exact string "plex", with the letters in that order, not values containing just any letter from it.
Yes, you can do it using this regex:
^((?!plex).)*$
Source: Regular expression to match a line that doesn't contain a word
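The difference from the attempt in the question is that [plex] is a character class matching any single one of the letters p, l, e or x, whereas (?!plex) looks for the literal sequence "plex". A short sketch of the pattern in Python follows; the extra names "complex-release" and "apple" are added here for illustration only.

```python
import re

# At every position, assert that "plex" does not start here, then consume one character.
no_plex = re.compile(r'^((?!plex).)*$')

names = ["plex-release", "my-release", "potato", "complex-release", "apple"]
accepted = [name for name in names if no_plex.match(name)]

print(accepted)  # ['my-release', 'potato', 'apple']
```

Values that merely contain a "p" (such as "potato" or "apple") are accepted; only values containing "plex" anywhere are rejected.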
This question already has an answer here:
Learning Regular Expressions [closed]
(1 answer)
Closed 3 years ago.
I can't find a solution for a regex that looks for a pattern, but only in a specific range of the string.
I want to find $ $, but only if it occupies positions 5 to 7 of the string; it doesn't matter which character is between the two $ signs.
Example
xxxx$x$xxxxx would match
xx$x$xxxxxxx would not
import re
should = "xxxx$x$xxxxx would match"
shouldnt = "xx$x$xxxxxxx would not"
pattern = r'^.{4}\$.\$.+'
re.match(pattern, should)
re.match(pattern, shouldnt)
gives
match
None
https://regex101.com/r/RLHrZb/1
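One thing to be aware of: the trailing .+ in the pattern above requires at least one character after position 7, so a string that ends right after the second $ would not match. If that case should also count, a variant without the trailing part could look like the sketch below. The helper name `has_dollars_at_5_and_7` is made up for this example.

```python
import re

# Positions 1-4: any four characters; position 5: "$";
# position 6: any character; position 7: "$". Nothing is required afterwards.
pattern = re.compile(r'.{4}\$.\$')


def has_dollars_at_5_and_7(s):
    """Return True if s has "$" at positions 5 and 7 (1-based)."""
    # re.match anchors at the start of the string, so no leading ^ is needed.
    return pattern.match(s) is not None


print(has_dollars_at_5_and_7("xxxx$x$xxxxx"))  # True
print(has_dollars_at_5_and_7("xx$x$xxxxxxx"))  # False
print(has_dollars_at_5_and_7("xxxx$x$"))       # True
```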
This question already has answers here:
How can I find all matches to a regular expression in Python?
(1 answer)
Python regular expression re.match, why this code does not work? [duplicate]
(1 answer)
Closed 4 years ago.
I'm having trouble finding the matches with the code below:
import re
sample = """[2019-01-02 16:15:17.882][P:1624/T:1420][UIPCall.cpp:743
CUIPCall::HandleUICEvent()][Enter]
[2019-01-02 16:15:17.883][P:1624/T:1420][UIPCallState.cpp:1776
CUIPCallIncomingLine1State::HandleUICEvent()][Enter]"""
pattern = r'\[(.*?)\]\[(.*?)\]\[(.+?)(HandleUICEvent|FastNtfClosed_Line1_Common|Login|Logout)\(\)\]\[(.*?)\]$'
p = re.compile(pattern, re.MULTILINE | re.DOTALL)
p.match(sample)
This puzzles me because it works on https://regex101.com/r/hw7pyY/1 but does not match anything in Python.
It has to be re.match() as I need the .end() and .start() functions.
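For what it's worth, .start() and .end() are methods of the match object, so they are equally available on the results of re.search and re.finditer. re.match only tries the pattern at the very beginning of the string and returns at most one match, while finditer scans the whole string. A minimal sketch using the sample and pattern from the question, assuming the goal is to locate every record:

```python
import re

sample = """[2019-01-02 16:15:17.882][P:1624/T:1420][UIPCall.cpp:743
CUIPCall::HandleUICEvent()][Enter]
[2019-01-02 16:15:17.883][P:1624/T:1420][UIPCallState.cpp:1776
CUIPCallIncomingLine1State::HandleUICEvent()][Enter]"""

pattern = r'\[(.*?)\]\[(.*?)\]\[(.+?)(HandleUICEvent|FastNtfClosed_Line1_Common|Login|Logout)\(\)\]\[(.*?)\]$'
p = re.compile(pattern, re.MULTILINE | re.DOTALL)

# finditer yields one match object per occurrence; each has start() and end().
spans = [(m.start(), m.end(), m.group(4)) for m in p.finditer(sample)]

for start, end, event in spans:
    print(start, end, event)
```

Also note that a bare `p.match(sample)` line prints nothing when run as a script; the result has to be assigned or printed to be seen.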
This question already has answers here:
My regex is matching too much. How do I make it stop? [duplicate]
(5 answers)
What do 'lazy' and 'greedy' mean in the context of regular expressions?
(13 answers)
Closed 4 years ago.
I am new to Python and I am struggling a bit with regular expressions. If I have an input like this:
text = '<tag>xyz</tag>\n<tag>abc</tag>'
Is it possible to get an output list with elements like:
matches = ['<tag>xyz</tag>', '<tag>abc</tag>']
Right now I am using the following regex
matches = re.findall(r"<tag>[\w\W]*</tag>", text)
But instead of a list with two elements I am getting only one element with the whole input string like:
matches = ['<tag>xyz</tag>\n<tag>abc</tag>']
Could someone please guide me?
Thank you.
You just need to make your quantifier non-greedy.
Change this regex,
<tag>[\w\W]*</tag>
to
<tag>[\w\W]*?</tag>
import re
text = '<tag>xyz</tag>\n<tag>abc</tag>'
matches = re.findall(r"<tag>[\w\W]*?</tag>", text)
print(matches)
Prints,
['<tag>xyz</tag>', '<tag>abc</tag>']