Python re wrong output [duplicate] - python

This question already has answers here:
get index of character in python list
(4 answers)
Regular expression to match a dot
(7 answers)
Closed 3 years ago.
I want to find the position of '.', but when I run the code below:
import re
text = 'Hello world.'
pattern = '.'
search = re.search(pattern,text)
print(search.start())
print(search.end())
Output is:
0
1
The position of '.' isn't 0 to 1.
So why is it giving the wrong output?

You can use the str.find method for this task.
my_string = "test"
s_position = my_string.find('s')
print(s_position)
Output
2
If you really want to use a regex, be sure to escape the dot, or it will be interpreted as a special character.
An unescaped dot matches any character except a newline, which is why your pattern matched the 'H' at positions 0 to 1.
import re
text = 'Hello world.'
pattern = r'\.'
search = re.search(pattern,text)
print(search.start())
print(search.end())
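To make the difference concrete, here is a small sketch comparing the unescaped pattern, the escaped pattern, and re.escape (which builds the escaped pattern for you). The variable names are only illustrative.

```python
import re

text = 'Hello world.'

# Unescaped dot: matches any character, so the first match is 'H'.
any_char = re.search(r'.', text)
print(any_char.span())      # (0, 1)

# Escaped dot: matches only a literal '.'.
literal = re.search(r'\.', text)
print(literal.span())       # (11, 12)

# re.escape produces the escaped form of an arbitrary literal string.
escaped = re.search(re.escape('.'), text)
print(escaped.span())       # (11, 12)

# For a plain substring, str.find gives the same start index without a regex.
print(text.find('.'))       # 11
```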

Related

Regular Expression to replace first occurrence of match [duplicate]

This question already has answers here:
How to replace the first occurrence of a regular expression in Python?
(2 answers)
Closed 6 months ago.
Simple regex question. I have a string in the following format:
string = """陣頭には見るも<RUBY text="いかめ">厳</RUBY>しい、厚い鎧姿の武士達が立つ。
 分厚い鉄甲、長大な太刀――彼らの<RUBY text="かも">醸</RUBY>し出す威圧感
は、一騎のみでも背後の兵全てに優る戦力たり得ると
いう事実を、何より雄弁に物語っている。"""
What is the regular expression to find the first occurrence of <RUBY text="something">something</RUBY> and replace it with something like HELLO, i.e.
 陣頭には見るもHELLOしい、厚い鎧姿の武士達が立つ。
 分厚い鉄甲、長大な太刀――彼らの<RUBY text="かも">醸</RUBY>し出す威圧感
は、一騎のみでも背後の兵全てに優る戦力たり得ると
いう事実を、何より雄弁に物語っている。
I tried it with (<R(.*?)/RUBY>){0} but this didn't work.
string = re.sub("(\<R(.*?)\/RUBY>){0}", "HELLO", string)
print(string)
It can be done like this:
import re
string = """陣頭には見るも<RUBY text="いかめ">厳</RUBY>しい、厚い鎧姿の武士達が立つ。
 分厚い鉄甲、長大な太刀――彼らの<RUBY text="かも">醸</RUBY>し出す威圧感
は、一騎のみでも背後の兵全てに優る戦力たり得ると
いう事実を、何より雄弁に物語っている。"""
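try:
    # Non-greedy .*? keeps the match from running on to a later </RUBY> on the same line.
    first_match = re.findall(r'<RUBY text=.*?</RUBY>', string)[0]
    parts = string.split(first_match)
    result = f'{parts[0]}HELLO{first_match.join(parts[1:])}'
except IndexError:
    # No <RUBY> tag found: leave the string unchanged.
    result = string
print(result)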
Result:
陣頭には見るもHELLOしい、厚い鎧姿の武士達が立つ。
 分厚い鉄甲、長大な太刀――彼らの<RUBY text="かも">醸</RUBY>し出す威圧感
は、一騎のみでも背後の兵全てに優る戦力たり得ると
いう事実を、何より雄弁に物語っている。
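A shorter alternative, offered as a sketch rather than the answer's own method: re.sub takes a count argument, and count=1 replaces only the first match. Note that the {0} quantifier in the original attempt means "repeat zero times", so that pattern matches the empty string instead of the tag. The sample string below is a shortened, made-up stand-in for the real text.

```python
import re

# Hypothetical shortened input with two RUBY tags.
string = 'a<RUBY text="x">b</RUBY>c<RUBY text="y">d</RUBY>e'

# Non-greedy quantifiers stop at the first closing tag;
# count=1 limits the substitution to the first match only.
result = re.sub(r'<RUBY text=".*?">.*?</RUBY>', 'HELLO', string, count=1)
print(result)  # aHELLOc<RUBY text="y">d</RUBY>e
```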

Add a single backslash ("\") to string in python [duplicate]

This question already has answers here:
Process escape sequences in a string in Python
(8 answers)
Closed 7 months ago.
I have an array of strings which looks like:
["U0001f308", "U0001F602"]
I need to add “\” in front of the first letter U so the output will be like:
["\U0001f308", "\U0001F602"]
This is the code I have tried so far:
matches = ["U0001f308", "U0001F602"]
emojis = [emoji.replace('U', r"\U") for emoji in matches]
print(emojis)  # this prints ['\\U0001f308', '\\U0001F602'], which has two backslashes
How can I add only one backslash in front of every string?
I guess what you want is the following code:
matches = ["U0001f308", "U0001F602"]
emojis = [emoji.replace('U', r"\U").encode().decode('unicode-escape') for emoji in matches]
print(emojis)
which prints
['🌈', '😂']
It's the same result as when we execute the following code:
print(["\U0001f308", "\U0001F602"])
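It may also help to see that the doubled backslash in the question is only how a list displays its strings (via repr); each string already contains a single backslash. The sketch below checks that and, as an alternative to the unicode-escape round trip, converts the hex digits directly with chr and int. It assumes every input is a 'U' followed by hex digits.

```python
matches = ["U0001f308", "U0001F602"]

# Prepend one literal backslash to each string.
escaped = ["\\" + m for m in matches]
print(escaped)          # the list repr shows each backslash doubled
print(escaped[0])       # \U0001f308
print(len(escaped[0]))  # 10: one backslash plus the nine original characters

# Alternative: drop the leading 'U' and parse the rest as a hex code point.
emojis = [chr(int(m[1:], 16)) for m in matches]
print(emojis)
```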

Using regular expression to count the number of spaces at the beginning of a string [duplicate]

This question already has answers here:
Check string indentation?
(4 answers)
Closed 4 years ago.
How can I use a regex to count the number of spaces at the beginning of a string? For example:
string = ' area border router'
The count_space variable should be 1, since there is one space at the beginning of the string. If my string is:
string = '  router ospf 1'
the count_space variable should be 2, since there are two spaces at the beginning of the string. And so on.
I think the expression would be something like RE = '^\s', but I'm not sure how to formulate it.
You don't need regex, you can just do this:
s = ' area border router'
print(len(s)-len(s.lstrip()))
Output:
1
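If you do want a regex, a minimal sketch could look like this. The helper name count_leading_spaces is made up for illustration. Unlike lstrip(), which strips all leading whitespace including tabs, the pattern ' *' counts space characters only.

```python
import re

def count_leading_spaces(s):
    # re.match anchors at the start of the string; ' *' always matches
    # (possibly the empty string), so .group() is safe to call.
    return len(re.match(r' *', s).group())

print(count_leading_spaces(' area border router'))  # 1
print(count_leading_spaces('  router ospf 1'))      # 2
print(count_leading_spaces('no indent'))            # 0
```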

Replace sequence of chars in string with its length [duplicate]

This question already has answers here:
Python replace string pattern with output of function
(4 answers)
Closed 5 years ago.
Say I have the following string:
mystr = "6374696f6e20????28??????2c??2c????29"
I want to replace every run of "??" pairs with its length divided by 2. For the example above, I want to get the following result:
mystr = "6374696f6e2022832c12c229"
Meaning:
???? replaced with 2
?????? replaced with 3
?? replaced with 1
???? replaced with 2
I tried the following, but I'm not sure it's a good approach, and in any case it doesn't work:
regex = re.compile('(\?+)')
matches = regex.findall(mystr)
if matches:
    for match in matches:
        match_length = len(match)/2
        if (match_length > 0):
            mystr = regex.sub(match_length, mystr)
You can pass a callback function as the replacement argument of re.sub; it is called with each match object and returns the replacement string. (Lambda expressions are shorthand for creating anonymous functions.)
import re
mystr = "6374696f6e20????28??????2c??2c????29"
regex = re.compile(r"\?+")
print(regex.sub(lambda m: str(len(m.group()) // 2), mystr))
There seems to be uncertainty about what should happen in the case of ???. The code above produces 1, because integer division discards the remainder; plain division would give 1.5. If you want ??? to become 1?, use the pattern (?:\?{2})+ instead.
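As a sketch of the pair-only variant: the pattern (?:\?{2})+ matches whole "??" pairs, so a trailing odd '?' is left in place. The function name collapse_pairs and the "ab???cd" input are made up for illustration.

```python
import re

def collapse_pairs(s):
    # Each match is an even-length run of '?', so // 2 is exact here.
    return re.sub(r"(?:\?{2})+", lambda m: str(len(m.group()) // 2), s)

print(collapse_pairs("6374696f6e20????28??????2c??2c????29"))
# 6374696f6e2022832c12c229
print(collapse_pairs("ab???cd"))
# ab1?cd
```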

finding number in string REGEX [duplicate]

This question already has answers here:
What is the difference between re.search and re.match?
(9 answers)
Closed 9 years ago.
I am very new to regex and learning by practice. I wrote the following regex for finding a number inside a string of characters, however, it returns nothing. Why is that?
import re
string = "hello world & bello stack 12456";
findObj = re.match(r'[0-9]+',string,re.I);
if findObj:
    print(findObj.group());
else:
    print("nothing matched")
Regards
re.match only matches at the beginning of the string.
Use re.search instead, which scans the whole string for the first match:
>>> import re
>>> my_string = "hello world & bello stack 12456"
>>> find_obj = re.search(r'[0-9]+', my_string, re.I)
>>> print(find_obj.group())
12456
P.S. Semicolons are not necessary in Python.
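For comparison, a short sketch of both functions on the same input. The second string, "12456 hello", is a made-up example where the digits come first, so re.match succeeds too.

```python
import re

my_string = "hello world & bello stack 12456"

# re.match only tries position 0, where 'h' is not a digit.
at_start = re.match(r'[0-9]+', my_string)
print(at_start)             # None

# re.search scans forward until the pattern matches somewhere.
anywhere = re.search(r'[0-9]+', my_string)
print(anywhere.group())     # 12456

# re.match does succeed when the digits are at the very beginning.
leading = re.match(r'[0-9]+', "12456 hello")
print(leading.group())      # 12456
```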
