Python Split Outputting in square brackets - python

I have code that looks like this:
import re
activity = "Basketball - Girls 9th"
activity = re.sub(r'\s', ' ', activity).split('-')
activity = str(activity [1:2]) + str(activity [0:1])
print("".join(activity))
I want the output to look like Girls 9th Basketball, but the current output when printed is
[' Girls 9th']['Basketball ']
I want to get rid of the square brackets. I know I can simply trim it, but I would rather know how to do it right.

You're almost there. Calling .join on a list already produces a string, so you can drop the str() calls.
import re
activity = "Basketball - Girls 9th"
activity = re.sub(r'\s', ' ', activity).split('-')
activity = activity[1:2] + activity[0:1]
print(" ".join(activity))

You are stringifying the lists, which is the same as using print(someList): you get the representation of a list, which puts the [] around it.
import re
activity = "Basketball - Girls 9th"
activity = re.sub(r'\s', ' ', activity).split('-')
activity = activity[1:2] + [" "] + activity[0:1]  # reorders and reassigns the list
print("".join(activity))
You could consider adding a step:
# remove empty list items and strip leading/trailing whitespace
cleaned = [x.strip() for x in activity if x.strip() != ""]
print(" ".join(cleaned)) # put a space between each list item
This just reorders the list and adds the needed space item in between so your output fits.
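To make the difference concrete, here is a minimal sketch comparing str() on a list slice with join on the same items. The names parts, as_repr, and as_text are illustrative and not from the original post.

```python
activity = "Basketball - Girls 9th"
parts = activity.split('-')          # ['Basketball ', ' Girls 9th']

# str() on a list returns its representation, brackets and quotes included.
as_repr = str(parts[1:2])

# join() concatenates the items themselves, so no brackets appear.
as_text = " ".join(p.strip() for p in parts[1:2] + parts[0:1])

print(as_repr)   # [' Girls 9th']
print(as_text)   # Girls 9th Basketball
```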

You can solve it in one line:
new_activity = ' '.join(activity.split(' - ')[::-1])
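If the spacing around the dash is not guaranteed, a slightly more forgiving variant splits on the bare dash and strips each part. This is only a sketch under that assumption; the function name reorder is hypothetical.

```python
def reorder(activity):
    # Split on the dash, trim each piece, then join the pieces in reverse order.
    parts = [part.strip() for part in activity.split('-')]
    return ' '.join(reversed(parts))

print(reorder("Basketball - Girls 9th"))   # Girls 9th Basketball
print(reorder("Basketball-Girls 9th"))     # Girls 9th Basketball
```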

You can try something like this:
import re
activity = "Basketball - Girls 9th"
activity = re.sub(r'\s', ' ', activity).split('-')
activity = activity[1].strip() + ' ' + activity[0].strip()  # strip both parts so no stray spaces remain
print(activity)
output:
Girls 9th Basketball

Related

How to replace a string in a list of strings with regex?

my_list = [
'<instance id="line-nw8_059:8174:">',
' advanced micro devices inc sunnyvale calif and siemens ag of west germany '
'said they agreed to jointly develop manufacture and market microchips for '
'data communications and telecommunications with an emphasis on the '
'integrated services digital network the integrated services digital '
'network or isdn is an international standard used to transmit voice data '
'graphics and video images over telephone <head>line</head> ',
'<instance id="line-nw7_098:12684:">',
' in your may 21 story about the phone industry billing customers for '
'unconnected calls i was surprised that you did not discuss whether such '
'billing is appropriate a caller who keeps a <head>line</head> open '
'waiting for a connection uses communications switching and transmission '
'equipment just as if a conversation were taking place ',
'<instance id="line-nw8_106:13309:">'
]
I have to replace all of the <instance id="line-nw8_106:13309:"> tags (in any variation) with whitespace, and also add them all to their own list. I have figured out how to add them to their own list with regex like this:
instanceList = []
instanceMatch = '<instance id="([^"]*)"'
for i in contentsTestSplit:
    matchy = re.match(instanceMatch, i)
    if matchy:
        instanceMatchy = matchy.group(0)
        instanceList.append(instanceMatchy)
print("instance list: ", instanceList)
So this works, but I can't figure out how to replace all of them with whitespace. I have attempted this, along with using replace methods, and it is not working. Any help would be appreciated:
instanceList = []
instanceMatch = '<instance id="([^"]*)"'
pat = re.compile(r'<instance id="([^"]*)"')
for i in contentsTestSplit:
    matchy = re.match(instanceMatch, i)
    if matchy:
        instanceMatchy = matchy.group(0)
        instanceList.append(instanceMatchy)
        i = pat.sub("", i)
print("instance list: ", instanceList)
I have also attempted this; it locates the occurrences accurately but doesn't replace them:
for i in contentsTestSplit:
    if i.startswith("<instance id="):
        i.replace(i, "")
You can use regex with substitution to replace all instances with a whitespace. You can then pass it a custom function to return the matches and append the results to your instances list.
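import re
def _sub(match):
    instanceList.append(match[0])  # record the matched tag
    return ' '  # replace it with a single space
instanceList = []
instanceMatch = '<instance id="([^"]*)"'
for i in my_list:
    re.sub(instanceMatch, _sub, i)  # returns the new string; it is discarded here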
I didn't know what you wanted to do with the processed data, but re.sub(instanceMatch, _sub, i) returns your text with the substitutions.
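Because strings are immutable and re.sub returns a new string, the result has to be stored somewhere. Here is a minimal sketch, assuming the goal is a list of tags plus a cleaned copy of the data. The names sample, tags, cleaned, and collect are illustrative, the sample data is shortened, and the pattern here also consumes the closing >, which the pattern above does not.

```python
import re

sample = [
    '<instance id="line-nw8_059:8174:">',
    ' advanced micro devices inc <head>line</head> ',
    '<instance id="line-nw7_098:12684:">',
]

tags = []
pattern = re.compile(r'<instance id="[^"]*">')

def collect(match):
    # Remember the full tag, then substitute a single space for it.
    tags.append(match.group(0))
    return ' '

# Build a new list; the original strings are left unchanged.
cleaned = [pattern.sub(collect, item) for item in sample]

print(tags)
print(cleaned)
```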
I wonder why id="([^"])" represents a string but id="[^"]" represents nothing. What is the use of the ()?
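On the question about the parentheses: in Python's re module, parentheses create a capture group. The pattern matches the same text with or without them; the group only makes the part inside available separately. A small sketch with an illustrative tag:

```python
import re

tag = '<instance id="line-nw8_106:13309:">'

with_group = re.match(r'<instance id="([^"]*)"', tag)
without_group = re.match(r'<instance id="[^"]*"', tag)

print(with_group.group(0))      # whole match: <instance id="line-nw8_106:13309:"
print(with_group.group(1))      # captured part: line-nw8_106:13309:
print(without_group.group(0))   # same whole match, but there is no group(1)
```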

python bypass re.finditer match when searched words are in a defined expression

I have a list of words (find_list) that I want to find in a text and a list of expressions containing those words that I want to bypass (scape_list) when it is in the text.
I can find all the words in the text using this code:
import re
find_list = ['name', 'small']
scape_list = ['small software', 'company name']
text = "My name is Klaus and my middle name is Smith. I work for a small company. The company name is Small Software. Small Software sells Software Name."
final_list = []
for word in find_list:
    s = r'\W{}\W'.format(word)
    matches = re.finditer(s, text, (re.MULTILINE | re.IGNORECASE))
    for word_ in matches:
        final_list.append(word_.group(0))
The final_list is:
[' name ', ' name ', ' name ', ' Name.', ' small ', ' Small ', ' Small ']
Is there a way to bypass expressions listed in scape_list and obtain a final_list like this one:
[' name ', ' name ', ' Name.', ' small ']
final_list and scape_list are always being updated. So I think that regex is a good approach.
You can capture the word before and after the find_list word using the regex and check that neither combination is present in scape_list. I have added comments where I changed the code. (It is also better to change scape_list to a set if it can become large in the future.)
import re
find_list = ['name', 'small']
scape_list = ['small software', 'company name']
text = "My name is Klaus and my middle name is Smith. I work for a small company. The company name is Small Software. Small Software sells Software Name."
final_list = []
for word in find_list:
    s = r'(\w*\W)({})(\W\w*)'.format(word)  # change the regex to capture adjacent words
    matches = re.finditer(s, text, (re.MULTILINE | re.IGNORECASE))
    for word_ in matches:
        if ((word_.group(1) + word_.group(2)).strip().lower() not in scape_list
                and (word_.group(2) + word_.group(3)).strip().lower() not in scape_list):  # added this condition
            final_list.append(word_.group(2))  # changed here
final_list
['name', 'name', 'Name', 'small']
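An alternative sketch, not the approach above: first record where the expressions in scape_list occur, then keep only the matches that fall outside those spans. It uses \b word boundaries and re.escape, so words at the very start or end of the text are found too. The function name find_outside and its parameter names are hypothetical.

```python
import re

def find_outside(text, find_list, skip_list):
    # Collect the (start, end) span of every occurrence of a skip expression.
    skip_spans = []
    for phrase in skip_list:
        pattern = r'\b{}\b'.format(re.escape(phrase))
        for m in re.finditer(pattern, text, re.IGNORECASE):
            skip_spans.append(m.span())

    results = []
    for word in find_list:
        pattern = r'\b{}\b'.format(re.escape(word))
        for m in re.finditer(pattern, text, re.IGNORECASE):
            # Keep the word only if it is not contained in any skip span.
            inside = any(start <= m.start() and m.end() <= end
                         for start, end in skip_spans)
            if not inside:
                results.append(m.group(0))
    return results

text = ("My name is Klaus and my middle name is Smith. I work for a small "
        "company. The company name is Small Software. Small Software sells "
        "Software Name.")
found = find_outside(text, ['name', 'small'], ['small software', 'company name'])
print(found)   # ['name', 'name', 'Name', 'small']
```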

How to print next line after regex search using python

I have the below text:
subject = "Madam / Dear Sir, ', ' ', 'The terrorist destroyed the building at 23:45 with a remote
detonation device', ' ', 'so a new line character is appended to the string"
I have used the below regex code to search :
[p for p in re.split('\,', str(subject)) if re.search('(M[a-z]+ / \w+ \w+r)', p)]
getting output: Madam / Dear Sir
Expected output : The terrorist destroyed the building at 23:45 with a remote
detonation device
Please note the expected output should always be after the regex expression is found.
Can you please help me on this?
You can extend the split pattern a bit, to \s*',\s*'\s*, so that it matches all the parts that you don't want until the next part that you do want.
Then use a loop to first match your pattern M[a-z]+ / \w+ \w+r. Then get the next item, if one is present.
Example code
import re
subject = "Madam / Dear Sir, ', ' ', 'The terrorist destroyed the building at 23:45 with a remote detonation device', ' ', 'so a new line character is appended to the string"
filteredList = list(filter(None, re.split(r"\s*',\s*'\s*", subject)))  # raw string avoids invalid-escape warnings
l = len(filteredList)
for i, s in enumerate(filteredList):
    if re.match(r"M[a-z]+ / \w+ \w+r", s) and i + 1 < l:
        print(filteredList[i + 1])
Output
The terrorist destroyed the building at 23:45 with a remote detonation device
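The same idea can be wrapped in a small helper that returns the item following the first match, or None when there is none. This is only a sketch; the name item_after is hypothetical, and the split pattern is the one from the answer above.

```python
import re

def item_after(subject, pattern):
    # Split on the quote/comma separators and drop empty pieces.
    parts = [p for p in re.split(r"\s*',\s*'\s*", subject) if p]
    # Pair each item with its successor; return the successor of the first match.
    for current, following in zip(parts, parts[1:]):
        if re.match(pattern, current):
            return following
    return None

subject = ("Madam / Dear Sir, ', ' ', 'The terrorist destroyed the building "
           "at 23:45 with a remote detonation device', ' ', "
           "'so a new line character is appended to the string")
print(item_after(subject, r"M[a-z]+ / \w+ \w+r"))
```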

replace spaces with % in django or python app

I'm having a hard time fixing this one. I have a search function that looks for a campaign name or a campaign launcher name. For example, if a user looks for all campaigns launched by john doe, I want to replace all spaces with '%' (expected: %john%doe%).
campaigns = Campaign.objects.filter(title(re.sub('/\s/g ', '%', search)) | launcher(re.sub('/\s/g ', '%', search)))
I also tried
campaigns = Campaign.objects.filter(title(re.sub(' ', '%', search)) | launcher(re.sub(' ', '%', search)))
but my code is not doing the right thing. I'm getting
`camp`.`name` LIKE '%john doe%' OR `user`.`name` LIKE '%john doe%'
and if I use search.replace(" ", "%") I'm getting
`camp`.`name` LIKE '%john\\%doe%' OR `user`.`name` LIKE '%john\\%doe%'
I also have these helper functions:
def search_campaign(request, search):
    def title(search):
        return Q(name__icontains=search)
    def launcher(search):
        return Q(created_by_name__icontains=search)
Any help will be much appreciated.
search.replace(" ", "%") should work for input search = "john doe"
If you want to send the query with %, then the code below will work for you.
>>> import re
>>> re.sub(" ", "%", " jhon doe ")
'%jhon%doe%'
If you want to send a title like %john%doe% for " john doe ", then this query should work.
campaigns = Campaign.objects.filter(title(re.sub(' ', '%', search)) | launcher(re.sub(' ', '%', search)))
Correct me if I got something wrong from the question.
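One possible explanation for the escaped \\% in the generated SQL: Django escapes % and _ in the value passed to icontains, so a wildcard cannot be passed in through the search string. A sketch of one workaround, assuming the iregex lookup is acceptable for this query, is to build a regular expression that allows any text between the search terms. The helper name terms_to_regex is hypothetical, and re.escape follows Python's regex syntax, which may not match the database's regex dialect exactly.

```python
import re

def terms_to_regex(search):
    # Split on whitespace, escape each term, and allow any text between terms.
    terms = search.split()
    return '.*'.join(re.escape(term) for term in terms)

pattern = terms_to_regex(" john   doe ")
print(pattern)   # john.*doe

# Hypothetical use with the Q helpers from the question:
#   Q(name__icontains=search) would become Q(name__iregex=pattern)
```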
"Regex" in python put spaces between words staring with capital letters.
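For this, you first need to import re:
import re
def space(text):
    # Each word is a capital letter followed by lowercase letters.
    words = re.findall("[A-Z][a-z]*", text)
    result = []
    for word in words:
        # Lowercase the first letter by shifting its ASCII code by 32.
        word = chr(ord(word[0]) + 32) + word[1:]
        result.append(word)
    print(' '.join(result))
if __name__ == "__main__":
    space("JohnDoe")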

Comma in return value [duplicate]

I know the desired syntax lies in the first function, but for the life of me I can't find where it is.
I've attempted to remove commas and add spaces in each .split(); each attempt has yielded an undesired return value.
def get_country_codes(prices):
    price_list = prices.split(',')
    results = ''
    for price in price_list:
        results += price.split('$')[0]
    return results
def main():
    prices = "US$40, AU$89, JP$200"
    price_result = get_country_codes(prices)
    print(price_result)
if __name__ == "__main__":
    main()
The current output:
US AU JP
The desired output:
US, AU, JP
It looks like you could benefit from using a list to collect the country codes of the prices instead of a string. Then you can use ', '.join() later.
Maybe like this:
def get_country_codes(prices):
    country_code_list = []
    for price in prices.split(','):
        country_code = price.split('$')[0].strip()
        country_code_list.append(country_code)
    return country_code_list
if __name__ == '__main__':
    prices = "US$40, AU$89, JP$200"
    result_list = get_country_codes(prices)
    print(', '.join(result_list))
Or if you like really short code:
prices = "US$40, AU$89, JP$200"
print(
    ', '.join(
        price.split('$')[0].strip()
        for price in prices.split(',')))
You could also use regex if you want to. Since you know country codes will be two capital letters only (A-Z), you can look for a match of two capital letters that precede a dollar sign.
import re
def get_country_codes(prices):
    country_codes = re.findall(r'([A-Z]{2})\$', prices)
    return ', '.join(country_codes)
Look at the successive steps:
Your string:
In [1]: prices = "US$40, AU$89, JP$200"
split into a list on comma
In [2]: alist = prices.split(',')
In [3]: alist
Out[3]: ['US$40', ' AU$89', ' JP$200']
split the substrings on $
In [4]: [price.split('$') for price in alist]
Out[4]: [['US', '40'], [' AU', '89'], [' JP', '200']]
select the first element:
In [5]: [price.split('$')[0] for price in alist]
Out[5]: ['US', ' AU', ' JP']
Your += joins the strings as is, the same as join with ''. Note that the substrings still have the leading blank from the original string.
In [6]: ''.join([price.split('$')[0] for price in alist])
Out[6]: 'US AU JP'
Join with comma:
In [7]: ','.join([price.split('$')[0] for price in alist])
Out[7]: 'US, AU, JP'
join is the easiest way of joining a list of strings with a specific delimiter between, in effect reversing a split. += in a loop is harder to use, since it tends to add an extra delimiter at the start or end.
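The last point can be illustrated with a short sketch. The names codes, with_loop, and with_join are illustrative.

```python
prices = "US$40, AU$89, JP$200"
codes = [price.split('$')[0].strip() for price in prices.split(',')]

# Building the string with += leaves a trailing delimiter to clean up.
with_loop = ''
for code in codes:
    with_loop += code + ', '

# join() places the delimiter only between items.
with_join = ', '.join(codes)

print(repr(with_loop))   # 'US, AU, JP, '
print(with_join)         # US, AU, JP
```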
