Splitting two concatenated terms in python [duplicate]

This question already has answers here:
Split a string at uppercase letters
(22 answers)
Closed 6 years ago.
In general I have a string say
temp = "ProgramField"
Now I want to split strings like these into two terms (I can identify the two strings based on the uppercase character)
term1 = "Program"
term2 = "Field"
How to achieve this in python?
I tried regular expressions and splitting the terms, but nothing gave me the result that I expected.
Python code -
re.split("[A-Z][a-z]*","ProgramField")
Any suggestions?

You have to include a capturing group so that the matched text is kept in the result:
re.split('([A-Z][a-z]*)', 'ProgramField')
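Note that splitting with a capturing group also leaves empty strings around the matches. A minimal sketch of two ways to get just the two terms, either by filtering the split result or by using re.findall instead (the variable names here are only illustrative):

```python
import re

text = "ProgramField"

# With a capturing group, re.split keeps the matched terms, but it also
# yields empty strings before, between, and after them.
parts = re.split(r"([A-Z][a-z]*)", text)

# Dropping the empty strings leaves only the terms.
terms = [p for p in parts if p]

# re.findall returns the matched terms directly, with no empty strings.
found = re.findall(r"[A-Z][a-z]*", text)

term1, term2 = found
print(parts)
print(terms)
print(term1, term2)
```

Both approaches assume every term starts with one uppercase letter followed only by lowercase letters.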

Related

Using regex to match two specific characters [duplicate]

This question already has answers here:
Regular Expressions: Is there an AND operator?
(14 answers)
Closed 2 years ago.
I have a list of strings like:
1,-102a
1,123-f
1943dsa
-da238,
-,dwjqi92
How can I write a regex in Python that matches as long as the string contains both of the characters , and - regardless of the order or the pattern in which they appear?
I would use the following regex alternation:
,.*-|-.*,
Sample script:
import re

inp = ['1,-102a', '1,123-f', '1943dsa', '-da238,', '-,dwjqi92']
# Keep strings where a comma is followed by a hyphen, or a hyphen by a comma
output = [x for x in inp if re.search(r',.*-|-.*,', x)]
print(output)
This prints:
['1,-102a', '1,123-f', '-da238,', '-,dwjqi92']
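As an alternative sketch, the same "contains both" check can be written without alternation, either with plain substring tests or with two lookaheads, which is the usual way to express AND in a regex. This assumes single-line strings, since . does not match a newline by default:

```python
import re

inp = ['1,-102a', '1,123-f', '1943dsa', '-da238,', '-,dwjqi92']

# Plain membership tests: no regex needed for two fixed characters.
both_plain = [s for s in inp if ',' in s and '-' in s]

# Two lookaheads anchored at the start: each one must succeed,
# so the string has to contain a comma AND a hyphen, in any order.
both_lookahead = [s for s in inp if re.search(r'^(?=.*,)(?=.*-)', s)]

print(both_plain)
print(both_lookahead)
```

The lookahead form scales more easily if more required characters are added later; for just two fixed characters the membership test is probably the simplest.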

Regex for number after a Parenthesis ( [duplicate]

This question already has answers here:
Python - Regular expressions get numbers between parenthesis
(2 answers)
Closed 2 years ago.
I have a column with values like this:
4 (3 in force)
44 (39 in force)
I was able to use this to get a new column for the first number.
df['new'] = df['column'].str.extract(r'(\d+)')
How can I get a new column for the second number? (3,39, etc.)
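One way that specifically answers your question would be to use a lookbehind regular expression that basically says "the first number directly after an opening parenthesis". Python's re module only supports fixed-width lookbehinds, and str.extract needs a capture group, so the lookbehind covers only the parenthesis and the digits are captured:
df['new'] = df['column'].str.extract(r'(?<=\()(\d+)')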
But if you're extracting multiple parts from a single string, you might consider combining the two and using groups in the regex to access the parts you want.
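A rough sketch of that combined approach, using the plain re module and named groups. The group names total and in_force are made up for illustration; with pandas, a pattern with several groups passed to str.extract gives one column per group:

```python
import re

rows = ['4 (3 in force)', '44 (39 in force)']

# One pattern, two groups: the leading number and the number
# right after the opening parenthesis.
pattern = re.compile(r'(?P<total>\d+)\s*\((?P<in_force>\d+)')

pairs = []
for row in rows:
    m = pattern.search(row)
    if m:  # skip rows that do not have the expected shape
        pairs.append((m.group('total'), m.group('in_force')))

print(pairs)
```

The values stay strings here; converting them with int() would be a separate step.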
You could just take the row, convert it into a string, split it, and access the needed numbers, for example:
row = '4 (3 in force)'
row.split(' ') # This returns ['4', '(3', 'in', 'force)']
row.split(' ')[1] # This returns '(3'
row.split(' ')[1][1:] # And this returns all numbers after the bracket, so '3'

Regular expression to find largest repeating pattern? [duplicate]

This question already has answers here:
Longest consecutive substring of certain character type in python
(2 answers)
Closed 2 years ago.
How can I use regular expressions to find the largest repeating pattern?
For example, in the string "CATchickenchickenCATCATCATCATchickenchickenCATCATchicken"
I need a way to get this string: "CATCATCATCAT" since it is the largest repeating chunk of my substring "CAT"
How can I do this?
Thanks :)
import re
string = "CATchickenchickenCATCATCATCATchickenchickenCATCATchicken"
pattern = "((CAT)+)"
print(max(re.findall(pattern, string), key=lambda tpl: len(tpl[0]))[0])
Output:
CATCATCATCAT
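The same idea can be wrapped in a small helper that works for any substring. This is only a sketch: longest_run is a hypothetical name, and re.escape is used so that substrings containing regex metacharacters are matched literally:

```python
import re

def longest_run(text, chunk):
    """Return the longest run of back-to-back copies of chunk in text."""
    # A non-capturing group makes findall return the whole match.
    pattern = f"(?:{re.escape(chunk)})+"
    runs = re.findall(pattern, text)
    # default="" covers the case where chunk never occurs.
    return max(runs, key=len, default="")

string = "CATchickenchickenCATCATCATCATchickenchickenCATCATchicken"
print(longest_run(string, "CAT"))
```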

How do you check if a string has more than one of a specific character in Python? Example: the string 'mood' would clearly have two 'o' characters [duplicate]

This question already has answers here:
Count the number of occurrences of a character in a string
(26 answers)
Closed 3 years ago.
How do you check if a string has more than one of a specific character in Python? For example, the string 'mood' would clearly have two 'o' characters.
You can use the str.count method:
>>> 'mood'.count('o') > 1
True

Finding strings with gaps that match a string in a list of strings [duplicate]

This question already has answers here:
Python wildcard search in string
(6 answers)
Closed 5 years ago.
I have a list of strings that looks like this: ['ban*', 'c*rr*r', 'pl*s', 'pist*l']. I want to check whether those strings have matching equivalents in another list of strings, which is the following:
['banner', 'bannana', 'ban', 'carrer', 'clorror', 'planes', 'plots']
Comparing the first string from the list, I have 'banner' and 'bannana', which means there are words matching that string ("ban*"). So the '*' means that there can be one or more letters at that position in the word.
Try this fnmatch approach:
import fnmatch
lst = ['banner', 'bannana', 'ban', 'carrer', 'clorror', 'planes', 'plots']
f1 = fnmatch.filter(lst, 'ban*')
print(f1)
Output:
['banner', 'bannana', 'ban']
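Note that in fnmatch the '*' matches zero or more characters, which is why 'ban' is included. A sketch that checks every pattern from the question, plus a stricter variant where '*' must stand for at least one character (matches_one_or_more is a hypothetical helper, not part of fnmatch):

```python
import fnmatch
import re

patterns = ['ban*', 'c*rr*r', 'pl*s', 'pist*l']
words = ['banner', 'bannana', 'ban', 'carrer', 'clorror', 'planes', 'plots']

# fnmatch semantics: '*' matches zero or more characters.
matches = {p: fnmatch.filter(words, p) for p in patterns}

def matches_one_or_more(pattern, word):
    """Treat each '*' as 'one or more characters' instead."""
    # Escape the literal pieces and join them with '.+'.
    regex = '.+'.join(re.escape(part) for part in pattern.split('*'))
    return re.fullmatch(regex, word) is not None

strict = {p: [w for w in words if matches_one_or_more(p, w)] for p in patterns}

for p in patterns:
    print(p, matches[p], strict[p])
```

With the stricter reading, 'ban' no longer matches 'ban*', while the other patterns give the same results for this word list.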
