This question already has answers here:
How to replace the first occurrence of a regular expression in Python?
(2 answers)
Closed 6 months ago.
Simple regex question. I have a string in the following format:
string = """陣頭には見るも<RUBY text="いかめ">厳</RUBY>しい、厚い鎧姿の武士達が立つ。
分厚い鉄甲、長大な太刀――彼らの<RUBY text="かも">醸</RUBY>し出す威圧感
は、一騎のみでも背後の兵全てに優る戦力たり得ると
いう事実を、何より雄弁に物語っている。"""
What is the regular expression to find the first occurrence of <RUBY text="something">something</RUBY> and replace it with something like HELLO, i.e.:
陣頭には見るもHELLOしい、厚い鎧姿の武士達が立つ。
分厚い鉄甲、長大な太刀――彼らの<RUBY text="かも">醸</RUBY>し出す威圧感
は、一騎のみでも背後の兵全てに優る戦力たり得ると
いう事実を、何より雄弁に物語っている。
I tried it with (<R(.*?)/RUBY>){0} but this didn't work.
string = re.sub("(\<R(.*?)\/RUBY>){0}", "HELLO", string)
print(string)
It can be done like this:
import re
string = """陣頭には見るも<RUBY text="いかめ">厳</RUBY>しい、厚い鎧姿の武士達が立つ。
分厚い鉄甲、長大な太刀――彼らの<RUBY text="かも">醸</RUBY>し出す威圧感
は、一騎のみでも背後の兵全てに優る戦力たり得ると
いう事実を、何より雄弁に物語っている。"""
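try:
    # Non-greedy match so that only the first <RUBY>...</RUBY> element is captured
    first_match = re.findall(r'<RUBY text=.*?</RUBY>', string)[0]
    parts = string.split(first_match)
    result = f'{parts[0]}HELLO{first_match.join(parts[1:])}'
except IndexError:
    # No match found: leave the string unchanged
    result = string
print(result)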
Result:
陣頭には見るもHELLOしい、厚い鎧姿の武士達が立つ。
分厚い鉄甲、長大な太刀――彼らの<RUBY text="かも">醸</RUBY>し出す威圧感
は、一騎のみでも背後の兵全てに優る戦力たり得ると
いう事実を、何より雄弁に物語っている。
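As for why the original attempt fails: a quantifier of {0} asks for zero repetitions of the group, so the pattern matches the empty string at every position instead of the tag. A shorter alternative to the split/join approach is the count argument of re.sub, which limits the number of replacements. The following is a minimal sketch, assuming the tag always has the form <RUBY text="...">...</RUBY>; the shortened sample text and the exact pattern are illustrative choices, not part of the original answer.

```python
import re

# Shortened sample text with two RUBY elements (illustrative).
text = '陣頭には見るも<RUBY text="いかめ">厳</RUBY>しい、彼らの<RUBY text="かも">醸</RUBY>し出す威圧感'

# Non-greedy body so the match stops at the first closing tag.
pattern = r'<RUBY text="[^"]*">.*?</RUBY>'

# count=1 replaces only the first (leftmost) match.
result = re.sub(pattern, "HELLO", text, count=1)
print(result)
```

Only the first element is replaced; the second <RUBY> element is left untouched.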
This question already has answers here:
Regular Expressions: Is there an AND operator?
(14 answers)
Closed 2 years ago.
I have a list of strings like:
1,-102a
1,123-f
1943dsa
-da238,
-,dwjqi92
How can I write a regular expression in Python that matches as long as the string contains both of the characters , and - regardless of the order or the pattern in which they appear?
I would use the following regex alternation:
,.*-|-.*,
Sample script:
import re
inp = ['1,-102a', '1,123-f', '1943dsa', '-da238,', '-,dwjqi92']
output = [x for x in inp if re.search(r',.*-|-.*,', x)]
print(output)
This prints:
['1,-102a', '1,123-f', '-da238,', '-,dwjqi92']
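The alternation works because it spells out both possible orders. Two other ways to express the same "AND" condition are a pair of lookaheads, or no regex at all. This is a sketch under the assumption that only the presence of both characters matters; the variable names are illustrative.

```python
import re

inp = ['1,-102a', '1,123-f', '1943dsa', '-da238,', '-,dwjqi92']

# Each lookahead checks for one character anywhere in the string
# without consuming input, so the order does not matter.
both = re.compile(r'(?=.*,)(?=.*-)')
with_lookahead = [x for x in inp if both.match(x)]

# Plain membership tests give the same result without a regex.
with_in = [x for x in inp if ',' in x and '-' in x]

print(with_lookahead)
print(with_in)
```

The lookahead form scales more easily if more required characters are added later, while the membership test is arguably the most readable for two characters.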
This question already has answers here:
How to split strings into text and number?
(11 answers)
Closed 2 years ago.
I have strings like 'S10', 'S11', etc.
How can I split them into ['S', '10'], ['S', '11']?
example:
import re
str = 'S10'
re.compile(...)
result = re.split(str)
result:
print(result)
// ['S','10']
resolved at How to split strings into text and number?
This should do the trick. I'm using capture groups (the parentheses) to match the alphabetical part into the first group and the digits into the second group.
Code:
import re
str_data = 'S10'
# Raw string; [A-Za-z]+ captures the letters, \d+ captures the digits
exp = r"([A-Za-z]+)(\d+)"
match = re.match(exp, str_data)
result = match.groups()
print(result)
Output:
('S', '10')
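Note that groups() returns a tuple, while the question asks for a list, and that re.match returns None when the string does not fit the pattern. A small sketch covering both points follows; the helper name split_text_number and the extra sample strings are hypothetical, introduced only for illustration.

```python
import re

# Letters followed by digits; fullmatch requires the whole string to fit.
SPLIT_RE = re.compile(r"([A-Za-z]+)(\d+)")

def split_text_number(s):
    """Return [text, number] as strings, or None if s does not match."""
    m = SPLIT_RE.fullmatch(s)
    return list(m.groups()) if m else None

results = [split_text_number(s) for s in ['S10', 'S11', 'AB123', '42']]
print(results)
```

Returning None for non-matching input is one possible design; raising an exception would be equally reasonable depending on the caller.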
This question already has answers here:
Split a string by a delimiter in python
(5 answers)
Closed 2 years ago.
How can I get a string after and before a specific substring?
For example, I want to get the strings before and after : in
my_string="str1:str2"
(which in this case are str1 and str2).
Depending on your use case you may want different things, but this might work best for you:
lst = my_string.split(":")
Then, lst will be: ['str1', 'str2']
You can also find the index of the substring by doing something like:
substring = ":"
index = my_string.find(substring)
Then split the string on that index:
first_string = my_string[:index]
second_string = my_string[index+len(substring):]
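One caveat with the find approach is that find returns -1 when the substring is absent, so the slices above would silently produce wrong pieces. The built-in str.partition handles both cases in one call. A minimal sketch, with illustrative variable names:

```python
my_string = "str1:str2"

# partition splits at the first occurrence of the separator and
# returns a 3-tuple: (before, separator, after).
before, sep, after = my_string.partition(":")
print(before, after)

# If the separator is missing, the whole string comes back as the
# first element and the other two are empty strings.
missing = "no_separator_here".partition(":")
print(missing)
```

Checking whether sep is empty is a simple way to detect that the separator was not found.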
This question already has answers here:
get index of character in python list
(4 answers)
Regular expression to match a dot
(7 answers)
Closed 3 years ago.
I want to find the position of '.', but when I run the code below:
text = 'Hello world.'
pattern = '.'
search = re.search(pattern,text)
print(search.start())
print(search.end())
Output is:
0
1
The position of '.' isn't 0 to 1.
So why is it giving the wrong output?
You can use the str.find method for this task.
my_string = "test"
s_position = my_string.find('s')
print(s_position)
Output
2
If you really want to use RegEx be sure to escape the dot character or it will be interpreted as a special character.
The dot in RegEx matches any character except the newline symbol.
import re
text = 'Hello world.'
pattern = r'\.'  # escaped dot: matches a literal '.'
search = re.search(pattern,text)
print(search.start())
print(search.end())
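If the character to search for comes from a variable, re.escape can do the escaping, and re.finditer gives every position rather than only the first. A short sketch; the second sample string 'a.b.c' is made up for illustration.

```python
import re

text = 'Hello world.'

# re.escape backslash-escapes regex metacharacters such as '.'.
pattern = re.escape('.')
m = re.search(pattern, text)
span = (m.start(), m.end())
print(span)

# finditer yields one match object per occurrence.
positions = [hit.start() for hit in re.finditer(pattern, 'a.b.c')]
print(positions)
```

For 'Hello world.' the dot is the last of 12 characters, so the match spans indices 11 to 12.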
This question already has answers here:
Python replace string pattern with output of function
(4 answers)
Closed 5 years ago.
Say I have the following string:
mystr = "6374696f6e20????28??????2c??2c????29"
And I want to replace every sequence of "??" with its length divided by 2. So for the example above, I want to get the following result:
mystr = "6374696f6e2022832c12c229"
Meaning:
???? replaced with 2
?????? replaced with 3
?? replaced with 1
???? replaced with 2
I tried the following but I'm not sure it's the good approach, and anyway -- it doesn't work:
regex = re.compile('(\?+)')
matches = regex.findall(mystr)
if matches:
    for match in matches:
        match_length = len(match)/2
        if (match_length > 0):
            mystr = regex.sub(match_length, mystr)
You can use a callback function in Python's re.sub. FYI lambda expressions are shorthand to create anonymous functions.
import re
mystr = "6374696f6e20????28??????2c??2c????29"
regex = re.compile(r"\?+")
print(re.sub(regex, lambda m: str(int(len(m.group())/2)), mystr))
There seems to be uncertainty about what should happen in the case of ???. The above code will produce 1, since the result is converted to int. Without the int conversion the result would be 1.5 in Python 3. If you want ??? to become 1? you can use the pattern (?:\?{2})+ instead.
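To make the difference concrete, here is a sketch of the pair-based pattern mentioned above. It only consumes question marks two at a time, so an odd one is left in place. The helper name replace_pairs and the odd-length sample "ab???cd" are hypothetical, added for illustration.

```python
import re

def replace_pairs(s):
    """Replace each run of '??' pairs with the number of pairs in it."""
    # (?:\?{2})+ matches one or more complete pairs; a trailing
    # single '?' is not part of the match and stays in the string.
    return re.sub(r"(?:\?{2})+", lambda m: str(len(m.group()) // 2), s)

even = replace_pairs("6374696f6e20????28??????2c??2c????29")
odd = replace_pairs("ab???cd")
print(even)
print(odd)
```

Integer division (//) is used here so that no float-to-int conversion is needed.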