How can I make a regex match the entire string? - python

Suppose I have a string like test-123.
I want to test whether it matches a pattern like test-<number>, where <number> means one or more digit symbols.
I tried this code:
import re
correct_string = 'test-251'
wrong_string = 'test-123x'
regex = re.compile(r'test-\d+')
if regex.match(correct_string):
    print('Matching correct string.')
if regex.match(wrong_string):
    print('Matching wrong_string.')
How can I make it so that only the correct_string matches, and the wrong_string doesn't? I tried using .search instead of .match but it didn't help.

Try specifying the start and end anchors in your regex:
re.compile(r'^test-\d+$')

For an exact match, use regex = r'^(some-regex-here)$'
^ : Start of string
$ : End of string (it also matches just before a trailing newline; use \Z to match only at the very end)

Since Python 3.4 you can use re.fullmatch to avoid adding ^ and $ to your pattern.
>>> import re
>>> p = re.compile(r'\d{3}')
>>> bool(p.match('1234'))
True
>>> bool(p.fullmatch('1234'))
False
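As a minimal sketch, here is re.fullmatch applied to the strings from the question. The helper name is_exact and the extra 'test-251\n' sample are assumptions added for illustration; they are not part of the re module or the original question.

```python
import re

pattern = re.compile(r'test-\d+')

def is_exact(candidate):
    """Return True only if the whole candidate matches the pattern."""
    return pattern.fullmatch(candidate) is not None

# The third sample has a trailing newline (added here for illustration).
results = {s: is_exact(s) for s in ('test-251', 'test-123x', 'test-251\n')}
print(results)

# By contrast, '$' also matches just before a trailing newline,
# so the anchored pattern accepts 'test-251\n'.
dollar_accepts_newline = re.match(r'^test-\d+$', 'test-251\n') is not None
print(dollar_accepts_newline)
```

If a trailing newline must be rejected, fullmatch (or \Z in place of $) is the safer choice.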

This may help:
import re
pattern = r"test-[0-9]+$"
s = input()
if re.match(pattern, s):
    print('matched')
else:
    print('not matched')

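You can try re.findall(), but note that it only checks whether the pattern occurs somewhere in the string, so it also accepts a string such as 'test-123x':
import re
correct_string = 'test-251'
if len(re.findall(r"test-\d+", correct_string)) > 0:
    print("Match found")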

A pattern such as \btest-\d+\b should work:
matches = re.search(r'\btest-\d+\b', search_string)
This requires word boundaries on both sides of the match, so it prevents other word characters (letters, digits, or underscores) from occurring directly before or after your desired match.
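A short sketch of what the word-boundary pattern accepts and rejects. The sample strings other than 'test-123x' are made up for illustration. Note that a non-word character such as a hyphen or a space still counts as a boundary, so this is weaker than matching the entire string.

```python
import re

word_bounded = re.compile(r'\btest-\d+\b')

# Hypothetical samples; only 'test-123x' comes from the question.
samples = ['test-123', 'test-123x', 'test-123-x', 'see test-123 here']
found = {s: bool(word_bounded.search(s)) for s in samples}
print(found)
```

Only 'test-123x' is rejected here; the strings with extra text separated by a hyphen or space still contain a match.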

Related

Python Regex find all matches after specific word

I have a string as below
"Server: myserver.mysite.com\r\nAddress: 111.122.133.144\r\n\r\nName: myserver.mysite.com\r\nAddress: 123.144.412.111\r\nAliases: alias1.myserver.mysite.com\r\n\t myserver.mysite.com\r\n\r\n"
I'm currently struggling to write a function in Python that will find all aliases and put them in a list. So basically, I need the list ['alias1.myserver.mysite.com', 'myserver.mysite.com'].
I tried the following code
pattern = '(?<=Aliases: )([\S*]+)'
name = re.findall(pattern, mystring)
but it only matches the first alias and not both of them.
Any ideas on this?
Greatly appreciated!
Try the following:
import re
s = "Server: myserver.mysite.com\r\nAddress: 111.122.133.144\r\n\r\nName: myserver.mysite.com\r\nAddress: 123.144.412.111\r\nAliases: alias1.myserver.mysite.com\r\n\t myserver.mysite.com\r\n\r\n"
l = re.findall(r'\S+', s.split('Aliases: ')[1])
print(l)
Prints:
['alias1.myserver.mysite.com', 'myserver.mysite.com']
Explanation
First we split the string into two pieces and keep the second piece with s.split('Aliases: ')[1]. This evaluates to the part of the string that follows 'Aliases: '.
Next we use findall with the regular expression:
\S+
This matches every run of one or more consecutive non-whitespace characters.
In this case, though, the same thing can be done more simply without a regex:
s = "Server: myserver.mysite.com\r\nAddress: 111.122.133.144\r\n\r\nName: myserver.mysite.com\r\nAddress: 123.144.412.111\r\nAliases: alias1.myserver.mysite.com\r\n\t myserver.mysite.com\r\n\r\n"
l = s.split('Aliases: ')[1].split()
print(l)
Try this :
import re
regex = re.compile(r'[\n\r\t]')
t="Server: myserver.mysite.com\r\nAddress: 111.122.133.144\r\n\r\nName: myserver.mysite.com\r\nAddress: 123.144.412.111\r\nAliases: alias1.myserver.mysite.com\r\n\t myserver.mysite.com\r\n\r\n"
t = regex.sub(" ", t)
t = t.split("Aliases:")[1].strip().split()
print(t)
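The split-based answers above raise an IndexError when the text has no 'Aliases:' section. A minimal sketch that handles that case, assuming the alias block ends at the first blank line (as it does in the sample string); extract_aliases is a hypothetical helper name:

```python
def extract_aliases(text):
    """Return the whitespace-separated names that follow 'Aliases:'."""
    _, found, rest = text.partition('Aliases:')
    if not found:
        return []  # no alias section at all
    # Assumption: the alias block ends at the first blank line.
    block = rest.split('\r\n\r\n', 1)[0]
    return block.split()

sample = ("Server: myserver.mysite.com\r\nAddress: 111.122.133.144\r\n\r\n"
          "Name: myserver.mysite.com\r\nAddress: 123.144.412.111\r\n"
          "Aliases: alias1.myserver.mysite.com\r\n\t myserver.mysite.com\r\n\r\n")
print(extract_aliases(sample))
print(extract_aliases("Server: myserver.mysite.com\r\n"))
```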

Getting word from string

How can I get the word example from a string like this:
str = "http://test-example:123/wd/hub"
I wrote something like this:
print(str[10:str.rfind(':')])
but it doesn't work correctly if the string looks like
"http://tests-example:123/wd/hub"
You can use this regex to capture the value preceded by - and followed by : using lookarounds
(?<=-).+(?=:)
Python code,
import re
str = "http://test-example:123/wd/hub"
print(re.search(r'(?<=-).+(?=:)', str).group())
Outputs,
example
A non-regex way to get the same result is to use these two splits,
str = "http://test-example:123/wd/hub"
print(str.split(':')[1].split('-')[1])
Prints,
example
You can use the following non-regex approach if you know that example is a 7-letter word:
s.split('-')[1][:7]
For any arbitrary word, that would change to:
s.split('-')[1].split(':')[0]
There are many ways.
Using splitting:
example_str = str.split('-')[-1].split(':')[0]
This is fragile, and could break if there are more hyphens or colons in the string.
Using regex:
import re
pattern = re.compile(r'-(.*):')
example_str = pattern.search(str).group(1)
This still expects a particular format, but is more easily adaptable (if you know how to write regexes).
I am not sure why you want to get a particular word from a string. I guess you want to check whether this word is present in a given string.
If that is the case, the code below can be used.
import re
str1 = "http://tests-example:123/wd/hub"
matched = re.findall('example',str1)
Split on the -, and then on :
s = "http://test-example:123/wd/hub"
print(s.split('-')[1].split(':')[0])
#example
Using re:
import re
text = "http://test-example:123/wd/hub"
m = re.search(r'(?<=-).+(?=:)', text)
if m:
    print(m.group())
Python strings have a built-in find method:
a="http://test-example:123/wd/hub"
b="http://test-exaaaample:123/wd/hub"
print(a.find('example'))
print(b.find('example'))
This prints:
12
-1
Each value is the index of the found substring. If it equals -1, the substring was not found in the string. You can also use the in keyword:
'example' in 'http://test-example:123/wd/hub'
True
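Since the input is a URL, another option is to let the standard library isolate the host name first and only then split on the hyphen. This is a sketch that assumes the wanted word is everything after the first hyphen in the host name; word_after_hyphen is a hypothetical helper name. Note that urlsplit(...).hostname is lowercased.

```python
from urllib.parse import urlsplit

def word_after_hyphen(url):
    """Return the part of the host name after its first hyphen, or None."""
    host = urlsplit(url).hostname or ''
    _, sep, tail = host.partition('-')
    return tail if sep else None

print(word_after_hyphen("http://test-example:123/wd/hub"))
print(word_after_hyphen("http://tests-example:123/wd/hub"))
```

Because the port and path are stripped before splitting, extra colons or hyphens later in the URL do not affect the result.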

Python regular expression not matching

This is one of those things where I'm sure I'm missing something simple, but... In the sample program below, I'm trying to use Python's RE library to parse the string "line" to get the floating-point number just before the percent sign, i.e. "90.31". But the code always prints "no match".
I've tried a couple other regular expressions as well, all with the same result. What am I missing?
#!/usr/bin/python
import re
line = ' 0 repaired, 90.31% done'
pct_re = re.compile(' (\d+\.\d+)% done$')
#pct_re = re.compile(', (.+)% done$')
#pct_re = re.compile(' (\d+.*)% done$')
match = pct_re.match(line)
if match:
    print('got match, pct=' + match.group(1))
else:
    print('no match')
re.match only matches at the beginning of the string. Your code works fine if you use pct_re.search(line) instead.
You should use re.findall instead:
>>> import re
>>> line = ' 0 repaired, 90.31% done'
>>> pattern = re.compile(r"\d+[.]\d+(?=%)")
>>> re.findall(pattern, line)
['90.31']
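re.match only matches at the start of the string, so you would need to build a regex that covers the complete string.
Try this if you really want to use match (the .*? is non-greedy; a greedy .* would swallow the leading digit and capture 0.31 instead of 90.31):
re.match(r'.*?(\d+\.\d+)% done$', line)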
r'...' is a "raw" string, in which backslashes are not treated as escape characters; using raw strings for regex patterns is good practice in Python. – kratenko (see comment below)
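To make the difference concrete, here is a small sketch that runs the question's pattern through both match and search. The variable names at_start, anywhere, and pct are chosen for illustration.

```python
import re

line = ' 0 repaired, 90.31% done'
pct_re = re.compile(r' (\d+\.\d+)% done$')

# match() only tries the pattern at position 0, where it fails.
at_start = pct_re.match(line)

# search() scans forward until the pattern matches somewhere.
anywhere = pct_re.search(line)
pct = float(anywhere.group(1)) if anywhere else None

print(at_start, pct)
```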

How can I get part of regex match as a variable in python?

In Perl it is possible to do something like this (I hope the syntax is right...):
$string =~ m/lalala(I want this part)lalala/;
$whatIWant = $1;
I want to do the same in Python and get the text inside the parentheses in a string like $1.
If you want to get parts by name you can also do this:
>>> import re
>>> m = re.match(r"(?P<first_name>\w+) (?P<last_name>\w+)", "Malcom Reynolds")
>>> m.groupdict()
{'first_name': 'Malcom', 'last_name': 'Reynolds'}
The example was taken from the re docs
See: Python regex match objects
>>> import re
>>> p = re.compile("lalala(I want this part)lalala")
>>> p.match("lalalaI want this partlalala").group(1)
'I want this part'
import re
astr = 'lalalabeeplalala'
match = re.search('lalala(.*)lalala', astr)
whatIWant = match.group(1) if match else None
print(whatIWant)
A small note: in Perl, when you write
$string =~ m/lalala(.*)lalala/;
the regexp can match anywhere in the string. The equivalent is accomplished with the re.search() function, not the re.match() function, which requires that the pattern match starting at the beginning of the string.
import re
data = "some input data"
m = re.search(r"some (input) data", data)
if m:  # "if match was successful" / "if matched"
    print(m.group(1))
Check the docs for more.
There's no need for a regex. Think simple.
>>> "lalala(I want this part)lalala".split("lalala")
['', '(I want this part)', '']
>>> "lalala(I want this part)lalala".split("lalala")[1]
'(I want this part)'
import re
match = re.match('lalala(I want this part)lalala', 'lalalaI want this partlalala')
print(match.group(1))
import re
string_to_check = "other_text...lalalaI want this partlalala...other_text"
p = re.compile("lalala(I want this part)lalala")  # regex pattern
m = p.search(string_to_check)  # use p.match if what you want is always at the beginning of the string
if m:
    print(m.group(1))
While converting a Perl program that parses function names out of modules to Python, I ran into this problem: I received an error saying "group" was undefined. I soon realized that the exception was being thrown because p.match / p.search returns None if there is no matching string.
None has no group method, so calling it fails. To avoid the exception, check that a match object was returned before calling group on it.
import re
filename = './file_to_parse.py'
p = re.compile(r'def (\w*)')  # \w* greedily matches characters such as [a-zA-Z0-9_]
with open(filename, 'r') as source:
    for each_line in source:
        m = p.match(each_line)  # tries to match the pattern at the start of the line
        if m:
            print(m.group(1))
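The check for a missing match can be wrapped in a small helper so that callers never touch None directly. This is only a sketch; first_group and its default parameter are hypothetical names, not part of the re module.

```python
import re

def first_group(pattern, text, default=None):
    """Return capture group 1 of the first match, or default if none."""
    m = re.search(pattern, text)
    return m.group(1) if m else default

print(first_group(r'lalala(.*)lalala', 'xxlalalabeeplalalaxx'))
print(first_group(r'lalala(.*)lalala', 'no match here'))
```

This plays the role of Perl's $1, with the difference that a failed match yields the default value instead of leaving a stale capture.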
