Finding a repetitive pattern in Python strings [duplicate]

This question already has an answer here:
Detect repetitions in string
(1 answer)
Closed 8 years ago.
Suppose I have this string:
s = '123123123'
Here the sub-string '123' is repeated.
s = '1234'
Here the sub-string would be '1234', with no repetitions.
s = '11111'
Here the sub-string would be '1'.
How can I get this with Python? Any hints?

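import re

strings = ['123123123', '1234', '11111']
# (.+?) lazily captures the shortest unit; \1+ requires at least one repeat of it.
pattern, result = re.compile(r'(.+?)\1+'), []
for item in strings:
    # findall returns the captured unit; fall back to the whole string if nothing repeats.
    result.extend(pattern.findall(item) or [item])
print(result)
# ['123', '1234', '1']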
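Note that the pattern above searches anywhere in the string, so it also reports a unit that repeats in only part of the input. If the whole string is expected to consist of the repeated unit, anchoring the match is one option. A minimal sketch, assuming a hypothetical helper name `smallest_repeating_unit` that is not part of the original answer:

```python
import re

# Hypothetical helper (not part of the original answer).
# (.+?) lazily captures the shortest candidate unit; \1* then requires the
# rest of the string to be zero or more copies of that unit.
_UNIT = re.compile(r'(.+?)\1*', re.DOTALL)

def smallest_repeating_unit(s):
    """Return the shortest sub-string whose repetition yields all of s."""
    match = _UNIT.fullmatch(s)
    # fullmatch only fails for the empty string, which has no unit to report.
    return match.group(1) if match else s

for item in ['123123123', '1234', '11111', '12312']:
    print(item, '->', smallest_repeating_unit(item))
```

With this version '12312' is returned unchanged, because no shorter unit tiles the whole string.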

Related

Regex split numbers and letter groups without spaces with python [duplicate]

This question already has answers here:
How to split strings into text and number?
(11 answers)
Closed 2 years ago.
I have strings like 'S10', 'S11', etc.
How can I split them into ['S', '10'], ['S', '11']?
Example:
import re
str = 'S10'
re.compile(...)
result = re.split(str)
Desired result:
print(result)
# ['S', '10']
Resolved at: How to split strings into text and number?

This should do the trick.
The parentheses define capture groups: the first group matches the alphabetical part and the second group matches the digits.
Code:
import re
str_data = 'S10'
# Group 1: one or more letters; group 2: one or more digits.
exp = r"([A-Za-z]+)(\d+)"
match = re.match(exp, str_data)
result = match.groups()
print(result)
Output:
('S', '10')
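re.match returns None when a string does not fit the pattern, so calling .groups() on the result directly can raise AttributeError. A small sketch that applies the same idea to several strings and returns lists, as the question asked for. The helper name `split_letters_digits` and the choice of `[A-Za-z]+` for the letter part are assumptions made for illustration:

```python
import re

# Assumed pattern: one or more ASCII letters followed by one or more digits.
_LETTERS_DIGITS = re.compile(r'([A-Za-z]+)(\d+)')

def split_letters_digits(text):
    """Return [letters, digits], or None if text does not have that shape."""
    match = _LETTERS_DIGITS.fullmatch(text)
    return list(match.groups()) if match else None

print([split_letters_digits(s) for s in ['S10', 'S11', 'AB123', '10S']])
# [['S', '10'], ['S', '11'], ['AB', '123'], None]
```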

python string parsing with regular expression [duplicate]

This question already has answers here:
In Python, how do I split a string and keep the separators?
(19 answers)
Closed 2 years ago.
I can get the numeric parts with this:
>>> import re
>>> re.findall(r'\d+', '!"123%&654()')
['123', '654']
How can I get all the components?
['!"', '123', '%&', '654', '()']

For reference, with findall, you would greedily look for only digits, or only non-digits:
re.findall(r'\d+|\D+', '!"123%&654()')
# ['!"', '123', '%&', '654', '()']
split is a little cleaner.
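For comparison, here is a sketch of the re.split variant: wrapping the pattern in a capture group makes re.split keep the separators, and any empty strings at the edges are filtered out. The helper name `split_keep_digits` is hypothetical.

```python
import re

def split_keep_digits(text):
    """Split text into alternating digit and non-digit runs."""
    # The capture group makes re.split return the digit runs as well.
    parts = re.split(r'(\d+)', text)
    # Drop the empty strings produced when text starts or ends with digits.
    return [part for part in parts if part]

print(split_keep_digits('!"123%&654()'))
# ['!"', '123', '%&', '654', '()']
print(split_keep_digits('123abc'))
# ['123', 'abc']
```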

How can I split a string maintaining the punctuation? (Python) [duplicate]

This question already has answers here:
Splitting a string into words and punctuation
(11 answers)
Closed 3 years ago.
How can I split a string in Python so that the punctuation is kept as separate items in the result?
The following code:
s = "Hello, my name is Robert."
s_splitted = s.split()
will give as output:
["Hello,","my","name","is","Robert."]
How can I obtain the following result?
["Hello",",","my","name","is","Robert","."]

Regex can handle this.
import re
s = "Hello, my name is Robert."
s_splitted = [part for part in re.split(r'\b|\s', s) if part != '']
# ['Hello', ',', 'my', 'name', 'is', 'Robert', '.']
# (requires Python 3.7+, where re.split can split on empty matches such as \b)

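Alternatively, following the linked question on splitting a string into words and punctuation, in your case:
import re
s = "Hello, my name is Robert."
# Match a run of word characters/apostrophes, or a single punctuation mark.
items = re.findall(r"[\w']+|[.,!?;]", s)
# ['Hello', ',', 'my', 'name', 'is', 'Robert', '.']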
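The character class [.,!?;] covers only the listed punctuation marks. If any character that is neither whitespace nor a word character should become its own token, a slightly more general pattern is possible. A sketch, with the helper name `tokenize` chosen for illustration:

```python
import re

def tokenize(text):
    """Split text into words and single punctuation characters."""
    # \w+ matches a run of word characters; [^\w\s] matches one character
    # that is neither a word character nor whitespace.
    return re.findall(r"\w+|[^\w\s]", text)

print(tokenize("Hello, my name is Robert."))
# ['Hello', ',', 'my', 'name', 'is', 'Robert', '.']
```

Unlike [\w']+, this variant treats an apostrophe as a separate token, so which one fits depends on how contractions should be handled.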

How does list comprehension work in this code block? [duplicate]

This question already has answers here:
Remove empty strings from a list of strings
(13 answers)
Closed 3 years ago.
In the code block below, I understand that s is the string. re.split() will generate a list of split results and the list comprehension will iterate through every result created.
I don't understand how "if i" will work here.
This is from the following Stack Overflow thread: https://stackoverflow.com/a/28290501/11292262
>>> s = '125km'
>>> [i for i in re.split(r'([A-Za-z]+)', s) if i]
['125', 'km']
>>> [i for i in re.split(r'(\d+)', s) if i]
['125', 'km']

Empty strings evaluate to False. Note what happens when we take the if out:
import re
s = '125km'
print(re.split(r'([A-Za-z]+)', s))
print(re.split(r'(\d+)', s))
Output:
['125', 'km', '']
['', '125', 'km']
The if is used to remove the empty string, which is unwanted, per that question. Note that the capture groups in both expressions are needed to ensure that the part of the string split on (value or unit) is also returned.
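Both points can be checked directly. The sketch below shows that an empty string is falsy, that the capture group controls whether the matched delimiter is returned, and that filter(None, ...) is an equivalent way to drop empty strings:

```python
import re

s = '125km'

# An empty string is falsy, so "if i" skips it.
print(bool(''), bool('km'))        # False True

# Without a capture group the matched digits are discarded.
without_group = re.split(r'\d+', s)
print(without_group)               # ['', 'km']

# With a capture group the matched digits are kept in the result.
with_group = re.split(r'(\d+)', s)
print(with_group)                  # ['', '125', 'km']

# filter(None, ...) removes falsy items, like the comprehension's "if i".
cleaned = list(filter(None, with_group))
print(cleaned)                     # ['125', 'km']
```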

Splitting on a lookahead [duplicate]

This question already has answers here:
Decode HTML entities in Python string?
(6 answers)
Closed 6 years ago.
I'm trying to split on a lookahead, but it doesn't work for the last occurrence. How do I do this?
import re
my_str = 'HRC&#226;&#128;&#153;s'
print(re.split(r'.(?=&)', my_str))
My output:
['HR', '&#226', '&#128', '&#153;s']
My desired output:
['HRC', '&#226', '&#128', '&#153', 's']

A solution using re.findall():
import re
my_str = 'HRC&#226;&#128;&#153;s'
# Match a run of word characters, or an entity up to (not including) its ';'.
result = re.findall(r'\w+|&#\d+(?=;)', my_str)
print(result)
The output:
['HRC', '&#226', '&#128', '&#153', 's']
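Since the question was closed as a duplicate of one about decoding HTML entities, it may be that the goal is the decoded text rather than the split pieces. Here is a sketch using the standard library's html.unescape. The second step rests on an assumption that is not stated in the question, namely that the string is UTF-8 text that was mis-decoded as Windows-1252:

```python
import html

my_str = 'HRC&#226;&#128;&#153;s'

# html.unescape converts named and numeric character references to text.
decoded = html.unescape(my_str)
print(decoded)   # HRCâ€™s

# Assumption: the three references are the UTF-8 bytes of one character that
# were misread as Windows-1252. Re-encoding and decoding reverses that.
repaired = decoded.encode('cp1252').decode('utf-8')
print(repaired)  # HRC’s
```

If the input is not double-encoded in this way, the first step alone is the relevant one.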
