Python's regex '|' (or) operator [duplicate] - python

This question already has answers here:
python re.findall() with substring in alternations
(1 answer)
How can I find all matches to a regular expression in Python?
(1 answer)
Order of regular expression operator (..|.. ... ..|..)
(1 answer)
Closed 2 years ago.
According to Python's documentation for regular expression syntax for | (or) operator: "...once A matches, B will not be tested further, even if it would produce a longer overall match. In other words, the '|' operator is never greedy."
I have tried this in my console (running Python 3.7.6):
import re
txt = 'tim is walking and tom is running'
pattern = 'tim|tom'
re.findall(pattern, txt)
and I get:
['tim', 'tom']
Why is the right side of | still evaluated in this case?
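A minimal sketch of what is probably going on here: the quoted sentence describes a single match attempt at one position, while re.findall keeps scanning the rest of the string for further non-overlapping matches. At the position of 'tom' the left alternative 'tim' fails, so the right one is tried. The 'timothy' strings below are made up for illustration and are not from the question.

```python
import re

txt = 'tim is walking and tom is running'

# findall restarts the search after each match, so both names are found.
all_matches = re.findall('tim|tom', txt)
print(all_matches)  # ['tim', 'tom']

# The documented rule shows up when both alternatives could match at the
# same position: the left one wins, even though the right one is longer.
left_wins = re.findall('tim|timothy', 'timothy')
print(left_wins)  # ['tim']

# Swapping the order lets the longer alternative match first.
longer_first = re.findall('timothy|tim', 'timothy')
print(longer_first)  # ['timothy']
```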

Related

Why does regex with “|” (or/alternation) match differently when order is switched? [duplicate]

This question already has answers here:
Why doesn't regular expression alternation (A|B) match as per doc?
(3 answers)
Closed 3 years ago.
I want to clarify a doubt about regular expressions in Python.
import re
stri="Item3. Super Market ListsItem4"
# 1st print
print(re.sub(r'(Item[0-9]|Item[0-9]\.)', "", stri))
# 2nd print
print(re.sub(r'(Item[0-9]\.|Item[0-9])', "", stri))
In stri, I need to remove "Item3." and "Item4".
Output:
'. Super Market Lists'
' Super Market Lists'
My question is this: I used the OR (|) operator in both patterns.
In the 1st print statement, the dot (.) was not removed from the string. In the 2nd print statement, I swapped the two alternatives around the OR operator, and this time the dot (.) was removed. Why does this happen?
Thank you.
It happens because the regex engine first tries to match the left operand of the OR operator.
Because Item[0-9] already matches without the dot, the matched part is removed and the right operand is never tried at that position.
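The effect of the ordering can be checked with a short sketch. The first two patterns are the ones from the question (without the redundant group); the third, with an optional dot, is an alternative that I am suggesting and that was not part of the original answer.

```python
import re

stri = "Item3. Super Market ListsItem4"

# Shorter alternative first: "Item3" matches and the dot is left behind.
shorter_first = re.sub(r'Item[0-9]|Item[0-9]\.', "", stri)
print(repr(shorter_first))  # '. Super Market Lists'

# Longer alternative first: "Item3." is consumed including the dot.
longer_first = re.sub(r'Item[0-9]\.|Item[0-9]', "", stri)
print(repr(longer_first))  # ' Super Market Lists'

# Suggested alternative: make the dot optional, so order no longer matters.
optional_dot = re.sub(r'Item[0-9]\.?', "", stri)
print(repr(optional_dot))  # ' Super Market Lists'
```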

Regex but just in substring [duplicate]

This question already has an answer here:
Learning Regular Expressions [closed]
(1 answer)
Closed 3 years ago.
I can't find a regex that looks for a pattern only in a specific range of the string.
I want to find $ $, but only if the two dollar signs are at positions 5 and 7 of the string; it doesn't matter which character is between them.
Example
xxxx$x$xxxxx would match
xx$x$xxxxxxx would not
import re
should = "xxxx$x$xxxxx would match"
shouldnt = "xx$x$xxxxxxx would not"
pattern = r'^.{4}\$.\$.+'
re.match(pattern, should)
re.match(pattern, shouldnt)
gives
match
None
https://regex101.com/r/RLHrZb/1
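For readers who want to see the result printed, here is a small sketch that wraps the same pattern in a helper. The function name is_positioned is hypothetical, and the plain index check is an alternative I am adding for comparison, assuming "position 5 and 7" means the 5th and 7th characters.

```python
import re

# Same pattern as in the answer: four arbitrary characters, "$", any one
# character, "$", then at least one more character.
PATTERN = re.compile(r'^.{4}\$.\$.+')

def is_positioned(s):
    """Return True if "$" sits at the 5th and 7th character of s."""
    return PATTERN.match(s) is not None

should = "xxxx$x$xxxxx would match"
shouldnt = "xx$x$xxxxxxx would not"

print(is_positioned(should))    # True
print(is_positioned(shouldnt))  # False

# Without a regex, the same positions can be tested by index (0-based).
by_index = len(should) > 7 and should[4] == '$' and should[6] == '$'
print(by_index)  # True
```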

Python: Regex unable to find any matches on string [duplicate]

This question already has answers here:
How can I find all matches to a regular expression in Python?
(1 answer)
Python regular expression re.match, why this code does not work? [duplicate]
(1 answer)
Closed 4 years ago.
I'm having trouble finding matches with the code below:
import re
sample="""[2019-01-02 16:15:17.882][P:1624/T:1420][UIPCall.cpp:743
CUIPCall::HandleUICEvent()][Enter]
[2019-01-02 16:15:17.883][P:1624/T:1420][UIPCallState.cpp:1776
CUIPCallIncomingLine1State::HandleUICEvent()][Enter]"""
pattern=r'\[(.*?)\]\[(.*?)\]\[(.+?)(HandleUICEvent|FastNtfClosed_Line1_Common|Login|Logout)\(\)\]\[(.*?)\]$'
p= re.compile(pattern, re.MULTILINE | re.DOTALL)
p.match(sample)
It is troubling me because it works on https://regex101.com/r/hw7pyY/1 but does not match anything in Python.
It has to be re.match() as I need the .end() and .start() functions.
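One thing worth knowing here: re.match only attempts a match at the very start of the string, even with re.MULTILINE, so it can return at most one record. A possible approach, assuming the goal is to get .start() and .end() for every log record, is re.finditer, which yields ordinary match objects. This is a sketch using the sample and pattern as posted, not a confirmed fix for the asker's real input.

```python
import re

# Log sample copied from the question, line breaks kept as posted.
sample = """[2019-01-02 16:15:17.882][P:1624/T:1420][UIPCall.cpp:743
CUIPCall::HandleUICEvent()][Enter]
[2019-01-02 16:15:17.883][P:1624/T:1420][UIPCallState.cpp:1776
CUIPCallIncomingLine1State::HandleUICEvent()][Enter]"""

pattern = r'\[(.*?)\]\[(.*?)\]\[(.+?)(HandleUICEvent|FastNtfClosed_Line1_Common|Login|Logout)\(\)\]\[(.*?)\]$'
p = re.compile(pattern, re.MULTILINE | re.DOTALL)

# Each item from finditer is a match object with start(), end() and groups.
records = [(m.start(), m.end(), m.group(4)) for m in p.finditer(sample)]
for start, end, event in records:
    print(start, end, event)
```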

Why is the regex "java" not matching "/something.java" using Python's re module? [duplicate]

This question already has answers here:
What is the difference between re.search and re.match?
(9 answers)
Closed 6 years ago.
This is the code:
import re
regex = re.compile('java')
print(regex.match('/something.java'))
This is the output:
None
Because re.match only matches at the beginning of the string. See
python -- re.match vs. re.search
Use re.search to find the pattern anywhere in the string, or use the pattern .*java if you want to keep using match.
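The difference can be seen side by side in a short sketch. The variable names are mine; the calls are the standard re.match and re.search.

```python
import re

regex = re.compile('java')

# match only tries position 0, where the string has "/", so it fails.
at_start = regex.match('/something.java')
print(at_start)  # None

# search scans forward and finds "java" near the end of the string.
anywhere = regex.search('/something.java')
print(anywhere.span())  # (11, 15)

# With a leading .*, match succeeds because .* absorbs the prefix.
with_prefix = re.match('.*java', '/something.java')
print(with_prefix.group())  # /something.java
```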

unable to match this regular expression in python [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 4 years ago.
I am trying to match a regular expression using Python in this code.
CDS_REGEX = re.compile(r'\+CDS:\s*"([^"]+)",\s*(\d+)$')
cdsiMatch = allLinesMatchingPattern(self.CDS_REGEX, notificationLine)
print(cdsiMatch)
Matching String:
['+CDS: 24', '079119890400202306A00AA17909913764514010106115225140101061452200']
Please help me, I am not able to find my mistake.
As @Blckknght said, are you sure you really want to match that string?
What is ([^"]+) supposed to match?
Your pattern looks for a double quote (") but the string shown uses single quotes ('); you probably want ['"].
You're only checking for digits here: (\d+), but your long string clearly contains A's.
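To make these points concrete, here is a rough sketch that tests the posted pattern against the two posted strings. The line '+CDS: "SM", 24', the RELAXED pattern and the HEX pattern are assumptions of mine for illustration; the question does not say what the real input format or the helper allLinesMatchingPattern look like.

```python
import re

CDS_REGEX = re.compile(r'\+CDS:\s*"([^"]+)",\s*(\d+)$')

lines = ['+CDS: 24',
         '079119890400202306A00AA17909913764514010106115225140101061452200']

# The posted pattern requires a quoted field after "+CDS:", so this fails.
no_match = CDS_REGEX.match(lines[0])
print(no_match)  # None

# Hypothetical line in the shape the pattern does expect.
quoted = CDS_REGEX.match('+CDS: "SM", 24')
print(quoted.groups())  # ('SM', '24')

# Hypothetical relaxed pattern for a header with only a number.
RELAXED = re.compile(r'\+CDS:\s*(\d+)$')
relaxed = RELAXED.match(lines[0])
print(relaxed.group(1))  # 24

# The long line contains A's, so \d+ alone cannot cover it; a hex class can.
digits_only = re.match(r'\d+$', lines[1])
HEX = re.compile(r'[0-9A-Fa-f]+$')
hex_line = HEX.match(lines[1])
print(digits_only, hex_line is not None)  # None True
```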
