Python regex: how to split two joined words and add a comma? [closed] - python

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 10 months ago.
This is my string:
Hair ReplacementHair BraidingHair Supplies & Accessories
My expected result is:
Hair Replacement,Hair Braiding,Hair Supplies & Accessories
If two words are joined together, as in ReplacementHair, I want to split them and put a comma between them.
I tried this code:
re.sub(r"(\w)([A-Z])", r"\1 \2", text)
The code above splits the two words and adds a space between them. I want a comma instead of a space.

You can replace the space in the replacement pattern with a comma.
import re
text = "Hair ReplacementHair BraidingHair Supplies & Accessories"
text2 = re.sub(r"(\w)([A-Z])", r"\1,\2", text)
print(text2)
Output:
Hair Replacement,Hair Braiding,Hair Supplies & Accessories
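One thing to be aware of: \w also matches uppercase letters and digits, so the pattern above would split inside an all-caps word too. A minimal sketch of a stricter variant follows, assuming the words are always joined at a lowercase-to-uppercase boundary. The helper name split_joined_words and the "USA TodayNews" sample are made up for illustration and are not from the question.

```python
import re

def split_joined_words(text, sep=","):
    """Insert `sep` at each lowercase-to-uppercase boundary (hypothetical helper)."""
    # Zero-width lookarounds: nothing is consumed, so no groups need re-inserting.
    return re.sub(r"(?<=[a-z])(?=[A-Z])", sep, text)

text = "Hair ReplacementHair BraidingHair Supplies & Accessories"
result = split_joined_words(text)
print(result)  # Hair Replacement,Hair Braiding,Hair Supplies & Accessories

# Made-up sample: with \w the all-caps word is split as well.
loose = re.sub(r"(\w)([A-Z])", r"\1,\2", "USA TodayNews")
strict = split_joined_words("USA TodayNews")
print(loose)   # U,SA Today,News
print(strict)  # USA Today,News
```

For the exact string in the question both patterns give the same result, so the stricter form only matters if the input may contain acronyms.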

Related

Adding sentence breaks to the beginning and end of each element in a list [closed]

Closed 1 year ago.
I have a list of strings:
mini_corpus = ['I am Sam','Sam I am','I am Sam','I do not like green eggs and Sam']
I need to add a sentence boundary at the beginning and end of each element (i.e. 'BOS I am Sam EOS', 'BOS Sam I am EOS', etc.)
I've tried using map: mini_corpv2 = list(map(lambda x: 'BOS{}EOS'.format(x), mini_corpus)), but it raises TypeError: 'list' object is not callable
Can anyone tell me what I'm doing wrong or suggest another method to implement this?
I suppose the problem is somewhere else. Your code runs without problems, resulting in
['BOSI am SamEOS',
'BOSSam I amEOS',
'BOSI am SamEOS',
'BOSI do not like green eggs and SamEOS']
(so you will probably want to add spaces after BOS and before EOS).
An alternative solution using list comprehension:
mini_corpv2 = [f'BOS {x} EOS' for x in mini_corpus]
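The question does not show where the error comes from, so the following is only a guess at one common cause: the name list was rebound to an actual list earlier in the session, which makes list(...) fail with exactly this message. The sketch reproduces that inside a function so the built-in is not disturbed elsewhere. The function names and the shadowing assignment are hypothetical.

```python
mini_corpus = ['I am Sam', 'Sam I am', 'I am Sam', 'I do not like green eggs and Sam']

def reproduce_error(corpus):
    # Hypothetical earlier assignment that shadows the built-in `list`.
    list = ["some", "earlier", "data"]
    try:
        return list(map(lambda x: 'BOS {} EOS'.format(x), corpus))
    except TypeError as exc:
        return str(exc)

def add_boundaries(corpus):
    # The list comprehension never looks up the name `list`, so shadowing cannot break it.
    return [f'BOS {x} EOS' for x in corpus]

error_message = reproduce_error(mini_corpus)
mini_corpv2 = add_boundaries(mini_corpus)
print(error_message)  # 'list' object is not callable
print(mini_corpv2[0])  # BOS I am Sam EOS
```

If this is the cause in an interactive session, running del list (or restarting the interpreter) makes the built-in visible again.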

write to file \t creates spaces not tabs [closed]

Closed 6 years ago.
I have a list of lists and am trying to write the values to a tab-delimited file:
# The values were originally written as 01 and 02; leading zeros are a syntax error in Python 3.
sorted_results = [
    ["test1", 1],
    ["test2", 2],
]
with open('outfile.txt', 'a') as write_file:
    for i in sorted_results:
        write_file.write("{}\t{}\n".format(i[0], i[1]))
The end result comes out as:
test1 01
test2 02
The values are space-delimited, not tab-delimited. What am I missing? If I add a space before \t, the end result has a space and a tab between the values.
You can read the file back in and inspect the resulting data.
>>> open('outfile.txt').read()
'test1\t1\ntest2\t2\n'
This shows that the tab character is indeed written to the file. If you are still in doubt use a hex editor to view the characters.
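The same check can be scripted. The sketch below writes the rows to a temporary file (an assumption made here so that nothing is left on disk; the question uses outfile.txt) and inspects the raw characters with repr, which shows a tab as \t rather than rendering it as whitespace.

```python
import os
import tempfile

rows = [["test1", 1], ["test2", 2]]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "outfile.txt")  # temporary stand-in for the real file
    with open(path, "w") as write_file:
        for name, value in rows:
            write_file.write("{}\t{}\n".format(name, value))
    with open(path) as read_file:
        content = read_file.read()

# repr() makes the delimiter visible instead of letting the viewer render it.
print(repr(content))      # 'test1\t1\ntest2\t2\n'
has_tab = "\t" in content
print(has_tab)            # True
```

Many editors and terminals render a tab as a run of spaces, which is a likely reason the output looked space-delimited.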

Capture group with python regex not capturing [closed]

Closed 6 years ago.
I'm trying to gain an understanding of capture groups using this example:
sentence = "the quick brown fox jumps over the lazy dog"
re.search(r'\S+\s+\S+',sentence)
<_sre.SRE_Match object; span=(0, 9), match='the quick'>
I can see this matches as follows:
re.search(r'\S+\s+\S+',sentence).group()
'the quick'
I want to add a match group for the word 'quick' so I try this:
re.search(r'\S+\s+\(S+)',sentence)
Which gives an error:
error: unbalanced parenthesis at position 10
What am I doing wrong here?
Looks like a typo, but I'll still provide an explanation.
You are escaping the opening parenthesis, so \( matches a literal ( character. That leaves the closing parenthesis at the end of the expression without an opening counterpart, hence the error. Replace:
\S+\s+\(S+)
with:
\S+\s+(\S+)
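As a quick check of the corrected pattern, here is a small sketch using the sentence from the question. group(0) is the whole match and group(1) is the first capture group; the variable names are chosen here for illustration.

```python
import re

sentence = "the quick brown fox jumps over the lazy dog"

match = re.search(r"\S+\s+(\S+)", sentence)
whole = match.group(0)        # the entire match
second_word = match.group(1)  # only what the parentheses captured
print(whole)        # the quick
print(second_word)  # quick

# The pattern from the question: \( is a literal parenthesis, so ) has no partner.
try:
    re.compile(r"\S+\s+\(S+)")
    compiled = True
except re.error:
    compiled = False
print(compiled)  # False
```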

Python Regex stops after first "|" match [closed]

Closed 6 years ago.
p = re.compile("[AG].{2}[ATG|ATA|AAG].{1}G")
regex_result = p.search('ZZZAXXATGXGZZZ')
regex_result.group()
'AXXATG'
I was expecting AXXATGXG instead.
Use a grouping construct (...) rather than a character class [...] around the alternatives:
p = re.compile("[AG].{2}(?:ATG|ATA|AAG).G")
                        ^^^^^^^^^^^^^^^
The (?:ATG|ATA|AAG) group matches one of three sequences: ATG, ATA, or AAG. The [ATG|ATA|AAG] character class matches a single character: A, T, G, or |.
Note the {1} is redundant and can be removed.
Python:
import re
p = re.compile("[AG].{2}(?:ATG|ATA|AAG).G")
regex_result = p.search('ZZZAXXATGXGZZZ')
print(regex_result.group())
# => AXXATGXG
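The difference between the two constructs can also be checked in isolation. This is a small sketch using fullmatch on the bracketed and grouped forms alone; the variable names are illustrative.

```python
import re

char_class = re.compile(r"[ATG|ATA|AAG]")   # a set of single characters: A, T, G, |
group = re.compile(r"(?:ATG|ATA|AAG)")      # three alternative 3-character sequences

class_matches_pipe = char_class.fullmatch("|") is not None
class_matches_atg = char_class.fullmatch("ATG") is not None
group_matches_atg = group.fullmatch("ATG") is not None
group_matches_pipe = group.fullmatch("|") is not None

print(class_matches_pipe)  # True: | is just another member of the set
print(class_matches_atg)   # False: the class consumes exactly one character
print(group_matches_atg)   # True
print(group_matches_pipe)  # False

# This is why the original pattern stopped early: class took "A", "." took "T", then "G".
original = re.search(r"[AG].{2}[ATG|ATA|AAG].{1}G", "ZZZAXXATGXGZZZ").group()
print(original)  # AXXATG
```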

Python sub string [closed]

Closed 8 years ago.
I have run into an odd situation while trying to find a substring in Python. I am aware that I should use the in operator.
My string looks like '(email#email.org, Name, ext)'.
When I run this in the interactive terminal, it starts to not match:
>>> '(foo#bar.org,' in a
True
>>> '(foo#bar.org, B' in a
False
I have the string exactly as the pattern is in the text I provide. I am just curious as to why in isn't working once it passes the first comma?
a is:
Purpose: foo - bar\n\n Server Admin: (baz#bar.org, a f. g, 6-6405) \n\n App Owner Group: hi\n\n App Owners: (blah, blah blah, 6-5627)\n (foo#bar.org,  Brian Cody, 6-5624)\n\nNotes for Alerts:\n
Everything works as expected if a actually contains 'foo#bar.org, B':
>>> a = '(foo#bar.org, Bob, x1234)'
>>> 'foo#bar.org,' in a
True
>>> 'foo#bar.org, B' in a
True
>>>
The string that you provided actually has two spaces between (foo#bar.org, and Brian Cody. Therefore, your second expression will return False because it's looking for one and only one space.
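If the spacing in the source text cannot be relied on, one option is to collapse runs of whitespace before testing. The sketch below assumes a shortened version of the string with the double space described above; the shortening and the variable names are choices made here, not part of the question.

```python
import re

# Shortened sample; note the two spaces after "(foo#bar.org,".
a = "App Owners: (blah, blah blah, 6-5627)\n (foo#bar.org,  Brian Cody, 6-5624)\n"

found_raw = '(foo#bar.org, B' in a
print(found_raw)  # False: the needle has one space, the text has two

# Collapse every run of whitespace (spaces, tabs, newlines) to a single space.
normalized = re.sub(r"\s+", " ", a)
found_normalized = '(foo#bar.org, B' in normalized
print(found_normalized)  # True
```

Printing repr(a) is also a quick way to spot doubled spaces or other invisible characters.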
