python string parsing with regular expression [duplicate]

This question already has answers here:
In Python, how do I split a string and keep the separators?
(19 answers)
Closed 2 years ago.
I can get the numeric parts with this:
>>> import re
>>> re.findall(r'\d+', '!"123%&654()')
['123', '654']
How can I get all the components?
['!"', '123', '%&', '654', '()']

For reference, with findall, you would greedily look for only digits, or only non-digits:
re.findall(r'\d+|\D+', '!"123%&654()')
# ['!"', '123', '%&', '654', '()']
re.split with a capturing group is a little cleaner.
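As a rough sketch of the split-based approach: wrapping the pattern in a capturing group makes re.split return the separators too. The helper name split_keep_digits is made up for illustration, and the filter is there because re.split emits empty strings when the input starts or ends with digits.

```python
import re

def split_keep_digits(text):
    """Split text into alternating runs of digits and non-digits."""
    # The capturing group makes re.split keep the digit runs in the result.
    parts = re.split(r'(\d+)', text)
    # Drop the empty strings produced when text starts or ends with digits.
    return [p for p in parts if p]

print(split_keep_digits('!"123%&654()'))
# ['!"', '123', '%&', '654', '()']
```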

Related

converting a string to a readable sentence [duplicate]

This question already has answers here:
How do I convert a list into a string with spaces in Python?
(6 answers)
Closed 4 years ago.
Say you have a list like so:
lst = ['my', 'name', 'is', 'jack.']
If I convert to a string doing this:
''.join(lst)
output
'mynameisjack.'
How do I make the list print out:
"my name is jack."
instead of running all the words together?

Instead of ''.join(lst), which joins on an empty string, use ' '.join(lst), which joins on a space (see the documentation of str.join).
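A short sketch of the difference, using the list from the question. The numbers list is an extra illustration, not part of the original question: str.join only accepts strings, so other items have to be converted first.

```python
lst = ['my', 'name', 'is', 'jack.']

# The string that join is called on is placed between the items.
sentence = ' '.join(lst)
print(sentence)
# my name is jack.

# join raises TypeError on non-string items, so convert them first.
numbers = [1, 2, 3]
numbers_text = ' '.join(str(n) for n in numbers)
print(numbers_text)
# 1 2 3
```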

how to split entries in a list by more than one delimiter Python [duplicate]

This question already has answers here:
Split Strings into words with multiple word boundary delimiters
(31 answers)
Closed 5 years ago.
I have a .txt file with entries separated by a newline and a comma, alternating.
x = file_1.read().split("\n")
...
x = ['10,0902', '13897,00641']
How can I also split on the comma? .split("\n" and ",") does not seem to work.

.split("\n" and ",") is the same as .split(","): when both operands are truthy, and evaluates to its last operand, so the newline is never used as a delimiter.
You'd want to use re.split so you can split by a regex:
import re
string = '1,2\n3,4'
print(re.split(r'(?:\n|,)', string))
# ['1', '2', '3', '4']
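To make this concrete, here is a small sketch applied to the data from the question. The variable text stands in for the result of file_1.read() and is an assumption; the character class and the replace-then-split variant are alternatives to the regex above, not the only way to do it.

```python
import re

# `and` returns its last operand when every operand is truthy,
# so this expression is just the comma.
delimiter = "\n" and ","
print(repr(delimiter))
# ','

text = '10,0902\n13897,00641'  # stand-in for file_1.read()

# A character class is another way to write "newline or comma".
entries = re.split(r'[\n,]', text)
print(entries)
# ['10', '0902', '13897', '00641']

# Without regular expressions: normalize to one delimiter, then split.
same_entries = text.replace('\n', ',').split(',')
```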

Splitting on a lookahead [duplicate]

This question already has answers here:
Decode HTML entities in Python string?
(6 answers)
Closed 6 years ago.
I'm trying to split on a lookahead, but it doesn't work for the last occurrence. How do I do this?
my_str = 'HRC&#226;&#128;&#153;s'
import re
print(re.split(r'.(?=&)', my_str))
My output:
['HR', '&#226', '&#128', 's']
My desired output:
['HRC', '&#226', '&#128', '&#153', 's']

The solution using the re.findall() function:
import re
my_str = 'HRC&#226;&#128;&#153;s'
result = re.findall(r'\w+|&#\d+(?=;)', my_str)
print(result)
The output:
['HRC', '&#226', '&#128', '&#153', 's']
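The original split loses a character because the `.` in the pattern is part of the delimiter and gets consumed. If you would rather keep re.split, one possible sketch is below. It assumes Python 3.7 or later, where re.split can split on zero-width matches, and it assumes the input is the entity-encoded string from the question.

```python
import re

my_str = 'HRC&#226;&#128;&#153;s'

# Split on each ';' (consumed as the delimiter) and, with a zero-width
# match, at the boundary between a word character and the next '&'.
parts = re.split(r';|(?<=\w)(?=&)', my_str)
print(parts)
# ['HRC', '&#226', '&#128', '&#153', 's']
```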

How to remove hyphens from a list of strings [duplicate]

This question already has answers here:
How to delete all instances of a character in a string in python?
(6 answers)
Closed 6 years ago.
['0-0-0', '1-10-20', '3-10-15', '2-30-20', '1-0-5', '1-10-6', '3-10-30', '3-10-4']
How can I remove all the hyphens between the numbers?

You can iterate over the list with a for loop and replace each hyphen with an empty string.
hyphenlist = ['0-0-0', '1-10-20', '3-10-15', '2-30-20', '1-0-5', '1-10-6', '3-10-30', '3-10-4']
newlist = []
for x in hyphenlist:
    newlist.append(x.replace('-', ''))
After the loop, newlist holds the strings without the hyphens.

Or as a list comprehension:
>>> l = ['0-0-0', '1-10-20', '3-10-15', '2-30-20', '1-0-5', '1-10-6', '3-10-30', '3-10-4']
>>> [i.replace('-', '') for i in l]
['000', '11020', '31015', '23020', '105', '1106', '31030', '3104']

Finding a repetitive pattern in Python strings [duplicate]

This question already has an answer here:
Detect repetitions in string
(1 answer)
Closed 8 years ago.
Let's suppose I have this string
s = '123123123'
I can notice the '123' sub-string is being repeated.
here = '1234'
The sub-string would be '1234' with no repetitions.
s = '11111'
The sub-string would be '1'
How can I get this with Python? Any hints?

import re
strings = ['123123123', '1234', '11111']
pattern, result = re.compile(r'(.+?)\1+'), []
for item in strings:
    # Fall back to the whole string when nothing repeats.
    result.extend(pattern.findall(item) or [item])
print(result)
# ['123', '1234', '1']
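Note that findall reports a repeated run anywhere in the string, so an input such as '123123abc' would also yield '123'. If the unit has to account for the entire string, a variant using re.fullmatch may be closer to what is wanted. This is only a sketch; the function name smallest_repeating_unit is hypothetical.

```python
import re

def smallest_repeating_unit(s):
    """Return the shortest substring that, repeated, makes up all of s."""
    # (.+?) lazily tries the shortest candidate first; \1* then has to
    # consume the rest of the string for fullmatch to succeed.
    match = re.fullmatch(r'(.+?)\1*', s, flags=re.DOTALL)
    # An empty string has no unit, so return it unchanged.
    return match.group(1) if match else s

for item in ['123123123', '1234', '11111']:
    print(smallest_repeating_unit(item))
# 123
# 1234
# 1
```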
