How to print next line after regex search using python

I have the below text:
subject = "Madam / Dear Sir, ', ' ', 'The terrorist destroyed the building at 23:45 with a remote
detonation device', ' ', 'so a new line character is appended to the string"
I have used the below regex code to search :
[p for p in re.split('\,', str(subject)) if re.search('(M[a-z]+ / \w+ \w+r)', p)]
getting output: Madam / Dear Sir
Expected output : The terrorist destroyed the building at 23:45 with a remote
detonation device
Please note the expected output should always be after the regex expression is found.
Can you please help me on this?

You can extend the split pattern to \s*',\s*'\s* so that it matches all the parts that you don't want, up to the next part that you do want.
Then use a loop to first match your pattern M[a-z]+ / \w+ \w+r. Then get the next item, if there is one.
Example code
import re
subject = "Madam / Dear Sir, ', ' ', 'The terrorist destroyed the building at 23:45 with a remote detonation device', ' ', 'so a new line character is appended to the string"
# Split on the quote/comma separators and drop the empty strings in between
filteredList = list(filter(None, re.split(r"\s*',\s*'\s*", subject)))
l = len(filteredList)
for i, s in enumerate(filteredList):
    # Print the item that follows the matching one, if there is one
    if re.match(r"M[a-z]+ / \w+ \w+r", s) and i + 1 < l:
        print(filteredList[i + 1])
Output
The terrorist destroyed the building at 23:45 with a remote detonation device
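The same idea can be wrapped in a small helper that pairs each item with its successor. This is only a sketch: the name item_after_match is made up for illustration, and it assumes the text is split with the same separator pattern as above.

```python
import re

def item_after_match(text, pattern, sep=r"\s*',\s*'\s*"):
    """Return the item that follows the first item matching `pattern`, or None."""
    # Split on the separator and drop the empty strings left between separators
    items = [part for part in re.split(sep, text) if part]
    # Pair every item with the one after it
    for current, following in zip(items, items[1:]):
        if re.match(pattern, current):
            return following
    return None

subject = "Madam / Dear Sir, ', ' ', 'The terrorist destroyed the building at 23:45 with a remote detonation device', ' ', 'so a new line character is appended to the string"
print(item_after_match(subject, r"M[a-z]+ / \w+ \w+r"))
```

Because zip stops at the shorter sequence, a match on the very last item simply yields None, with no index check needed.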

Related

replace spaces with % in django or python app

I'm having a hard time fixing this one. I have a search function that looks for a campaign name or a campaign launcher name. For example, if a user looks for all campaigns launched by john doe, I want to replace all spaces with '%' (expected: %john%doe%).
campaigns = Campaign.objects.filter(title(re.sub('/\s/g ', '%', search)) | launcher(re.sub('/\s/g ', '%', search)))
I also tried
campaigns = Campaign.objects.filter(title(re.sub(' ', '%', search)) | launcher(re.sub(' ', '%', search)))
but my code is not doing the right thing. I'm getting
`camp`.`name` LIKE '%john doe%' OR `user`.`name` LIKE '%john doe%'
and if I use search.replace(" ", "%") I'm getting
`camp`.`name` LIKE '%john\\%doe%' OR `user`.`name` LIKE '%john\\%doe%'
I also have these helper functions:
def search_campaign(request, search):
    def title(search):
        return Q(name__icontains=search)
    def launcher(search):
        return Q(created_by_name__icontains=search)
Any help will be much appreciated.
search.replace(" ", "%") should work for the input search = "john doe".
If you want to send the query with %, then the code below will work for you.
>>> import re
>>> re.sub(" ", "%", " jhon doe ")
'%jhon%doe%'
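Note that '/\s/g' is JavaScript regex-literal syntax; in Python the pattern is passed as a plain (ideally raw) string. Here is a minimal sketch of the substitution on its own, without Django. The helper name to_like_pattern is an assumption for illustration.

```python
import re

def to_like_pattern(search):
    """Wrap the search terms in % and replace each run of whitespace with one %."""
    # strip() avoids doubled % at the ends; \s+ collapses repeated spaces
    return "%" + re.sub(r"\s+", "%", search.strip()) + "%"

print(to_like_pattern("john doe"))
```

Whether the % then acts as a wildcard depends on how the ORM escapes the value passed to icontains, as the escaped output in the question suggests.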
If you want to send title like %john%doe% for " john doe " then this query should work.
campaigns = Campaign.objects.filter(title(re.sub(' ', '%', search)) | launcher(re.sub(' ', '%', search)))
Correct me if I got something wrong from the question.
"Regex" in Python: put spaces between words starting with capital letters.
For this, you first need to import re:
import re
def space(text):
    # Find each word that starts with a capital letter
    words = re.findall("[A-Z][a-z]*", text)
    result = []
    for word in words:
        # Lowercase the first letter (ASCII offset of 32)
        word = chr(ord(word[0]) + 32) + word[1:]
        result.append(word)
    print(' '.join(result))
if __name__ == "__main__":
    space("JohnDoe")

Python str.maketrans Remove Punctuation with Empty Space

I am using maketrans from string module in Python 3 to do simple text preprocessing like lowering, removing digits and punctuations. The problem is that during the punctuation removal all words are attached together with no empty space! For example, let's say I have the following text:
text='[{"Hello":"List:","Test"321:[{"Hello":"Airplane Towel for Kitchen"},{"Hello":2 " Repair massive utilities "2},{"Hello":"Some 3 appliance for our kitchen"2}'
text=text.lower()
text=text.translate(str.maketrans(' ',' ',string.digits))
Works just fine, it gives:
'[{"hello":"list:","test":[{"hello":"airplane towel for kitchen"},{"hello": " repair massives utilities "},{"hello":"some appliance for our kitchen"}'
But once I want to remove the punctuations:
text=text.translate(str.maketrans(' ',' ',string.punctuation))
It gives me this:
'hellolisttesthelloairplane towel for kitchenhello nbsprepair massives utilitiesnbsphellosome appliance for our kitchen'
Ideally it should yield:
'hello list test hello airplane towel for kitchen hello nbsp repair massives utilities nbsp hello some appliance for our kitchen'
There is no specific reason I am doing it with maketrans, but I like it because it is fast and easy, and I am kind of stuck solving it. Thanks!
Disclaimer: I already know how to do it with re like the following:
import re
s = "string.]With. Punctuation?"
s = re.sub(r'[^\w\s]','',s)
well... this works
txt = text.translate(str.maketrans(string.punctuation, ' ' * len(string.punctuation))).replace(' '*4, ' ').replace(' '*3, ' ').replace(' '*2, ' ').strip()
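The chained replace calls only handle runs of up to a few spaces. A sketch of the same idea that collapses any amount of whitespace is shown here; the function name punctuation_to_spaces is hypothetical.

```python
import string

def punctuation_to_spaces(text):
    """Replace every punctuation character with a space, then collapse whitespace."""
    # Map each punctuation character to a single space
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    # split() with no arguments splits on any run of whitespace and drops empties
    return " ".join(text.translate(table).split())

print(punctuation_to_spaces('[{"hello":"list:","test":[{"hello":"airplane towel for kitchen"}'))
```

The first two arguments of str.maketrans must be strings of equal length, which is why the replacement is a space repeated len(string.punctuation) times.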

Python Split Outputting in square brackets

I have code that looks like this:
import re
activity = "Basketball - Girls 9th"
activity = re.sub(r'\s', ' ', activity).split('-')
activity = str(activity [1:2]) + str(activity [0:1])
print("".join(activity))
I want the output to look like Girls 9th Basketball, but the current output when printed is
[' Girls 9th']['Basketball ']
I want to get rid of the square brackets. I know I can simply trim it, but I would rather know how to do it right.
You're almost there. Calling .join on a list already produces a string, so you can omit the str() conversion step.
import re
activity = "Basketball - Girls 9th"
activity = re.sub(r'\s', ' ', activity).split('-')
activity = activity[1:2] + activity[0:1]
print(" ".join(activity))
You are stringifying the lists, which is the same as using print(someList): you get the representation of a list, which puts the [] around it.
import re
activity = "Basketball - Girls 9th"
activity = re.sub(r'\s', ' ', activity).split('-')
activity = activity[1:2] + [" "] + activity[0:1] # reorders and reassigns the list
print("".join(activity))
You could consider adding a step:
# remove empty list items and strip leading/trailing whitespace
cleaned = [x.strip() for x in activity if x.strip() != ""]
print(" ".join(cleaned)) # put a space between each list item
This just reorders the list and adds the needed space item in between so your output fits.
You can solve it in one line:
new_activity = ' '.join(activity.split(' - ')[::-1])
You can try something like this:
import re
activity = "Basketball - Girls 9th"
activity = re.sub(r'\s', ' ', activity).split('-')
activity = str(activity [1:2][0]).strip() + ' ' + str(activity [0:1][0])
print(activity)
output:
Girls 9th Basketball
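If there is only ever one separator, str.partition avoids list slicing altogether. This is a sketch under that assumption; the function name reorder is made up for illustration.

```python
def reorder(activity, sep="-"):
    """Move the part after `sep` in front of the part before it."""
    before, found, after = activity.partition(sep)
    if not found:
        # No separator: return the input unchanged apart from outer whitespace
        return activity.strip()
    return f"{after.strip()} {before.strip()}"

print(reorder("Basketball - Girls 9th"))
```

partition always returns three strings, so there is nothing to index and no brackets to get rid of.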

Python - split() producing ValueError

I am trying to split the line:
American plaice - 11,000 lbs # 35 cents or trade for SNE stocks
at the word or but I receive ValueError: not enough values to unpack (expected 2, got 1).
Which doesn't make sense, if I split the sentence at or then that will indeed leave 2 sides, not 1.
Here's my code:
if ('-' in line) and ('lbs' in line):
    fish, remainder = line.split('-')
    if 'trade' in remainder:
        weight, price = remainder.split('to ')
        weight, price = remainder.split('or')
The 'to' line is what I normally use, and it has worked fine, but this new line appeared with an 'or' instead of a 'to'. I tried writing one line that would handle either case but couldn't figure it out, so I simply wrote a second line and am now running into the error listed above.
Any help is appreciated, thanks.
The most straightforward way is probably to use a regular expression to do the split. Then you can split on either word, whichever appears. The ?: inside the parentheses makes the group non-capturing so that the matched word doesn't appear in the output.
import re
# ...
weight, price = re.split(" (?:or|to) ", remainder, maxsplit=1)
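Applied to the sample line from the question, that split can be sketched as follows. The strip() calls and the maxsplit on the hyphen are additions for tidiness, not part of the original code.

```python
import re

line = "American plaice - 11,000 lbs # 35 cents or trade for SNE stocks"

# Split the species name from the rest at the first hyphen only
fish, remainder = line.split("-", 1)

# Split once on a standalone " or " / " to "; the surrounding spaces keep "for" intact
weight, price = re.split(r" (?:or|to) ", remainder, maxsplit=1)

print(fish.strip())
print(weight.strip())
print(price.strip())
```

If neither word is present, re.split returns a single-element list and the unpacking still raises ValueError, so a guard is needed for lines that may lack both.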
You split on 'to ' before you attempt to split on 'or', which is what throws the error. The return value of remainder.split('to ') is [' 11,000 lbs # 35 cents or trade for SNE stocks'], which cannot be unpacked into two separate values. You can fix this by first testing which word you need to split on.
if ('-' in line) and ('lbs' in line):
    fish, remainder = line.split('-')
    if 'trade' in remainder:
        if 'to ' in remainder:
            weight, price = remainder.split('to ')
        elif ' or ' in remainder:
            weight, price = remainder.split(' or ') # add spaces so we don't match 'for'
This should solve your problem by checking if your separator is in the string first.
Also note that split(sep, 1) makes sure that the string is split at most once (e.g. "hello all world".split(" ", 1) == ["hello", "all world"]).
if ('-' in line) and ('lbs' in line):
    fish, remainder = line.split('-')
    if 'trade' in remainder:
        weight, price = remainder.split(' to ', 1) if ' to ' in remainder else remainder.split(' or ', 1)
The problem is that the word "for" also contains "or", so you end up with the following:
a = 'American plaice - 11,000 lbs # 35 cents or trade for SNE stocks'
a.split('or')
gives
['American plaice - 11,000 lbs # 35 cents ', ' trade f', ' SNE stocks']
Stephen Rauch's answer does fix the problem
When 'to ' is not in the string, the first split() returns a one-element list, so the unpacking fails before the second split() is ever reached. And if you just copy the line, the second assignment will overwrite your earlier results. You can instead normalize the separator on the string first and then split once:
weight, price = remainder.replace(' or ', ' to ').split(' to ', 1)

Split string with delimiters in Python

I have such a String as an example:
"[greeting] Hello [me] my name is John."
I want to split it and get such a result
('[greeting]', 'Hello' , '[me]', 'my name is John')
Can it be done in one line of code?
OK another example as it seems that many misunderstood the question.
"[greeting] Hello my friends [me] my name is John. [bow] nice to meet you."
then I should get
('[greeting]', ' Hello my friends ' , '[me]', ' my name is John. ', '[bow]', ' nice to meet you.')
I basically want to send this kind of string to my robot. It will automatically decompose it, perform the motions corresponding to [greeting], [me] and [bow], and speak the other strings in between.
Using regex:
>>> import re
>>> s = "[greeting] Hello my friends [me] my name is John. [bow] nice to meet you."
>>> re.findall(r'\[[\w\s.]+\]|[\w\s.]+', s)
['[greeting]', ' Hello my friends ', '[me]', ' my name is John. ', '[bow]', ' nice to meet you.']
Edit:
>>> s = "I can't see you"
>>> re.findall(r'\[.*?\]|.*?(?=\[|$)', s)[:-1]
["I can't see you"]
>>> s = "[greeting] Hello my friends [me] my name is John. [bow] nice to meet you."
>>> re.findall(r'\[.*?\]|.*?(?=\[|$)', s)[:-1]
['[greeting]', ' Hello my friends ', '[me]', ' my name is John. ', '[bow]', ' nice to meet you.']
The function you're after is .split(). The function accepts a delimiter as its argument and returns a list made by splitting the string at every occurrence of the delimiter. To split a string, using either "[" or "]" as a delimiter, you should use a regular expression:
import re
s = "[greeting] Hello [me] my name is John."  # avoid shadowing the built-in str
re.split(r"\]|\[", s)
# returns ['', 'greeting', ' Hello ', 'me', ' my name is John.']
This uses a regular expression to split the string.
\] # escape the right bracket
| # OR
\[ # escape the left bracket
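Splitting on the brackets alone drops them from the result. If the tags should be kept, as in the expected output of the question, one option is to put the whole tag in a capturing group, because re.split also returns the text matched by groups in the pattern. This is a sketch only; the function name split_keep_tags is hypothetical.

```python
import re

def split_keep_tags(text):
    """Split on [tag] markers while keeping the markers themselves."""
    # The parentheses make re.split return the matched delimiters too
    parts = re.split(r"(\[[^\]]*\])", text)
    # Drop the empty strings produced when the text starts or ends with a tag
    return [part for part in parts if part]

s = "[greeting] Hello my friends [me] my name is John. [bow] nice to meet you."
print(split_keep_tags(s))
```

A string without any tags comes back as a single-element list, so the caller can treat both cases the same way.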
I don't think it can be done in one line with str.split() alone; you first need to split by ], then by [:
# Run in the python shell
sentence = "[greeting] Hello [me] my name is John."
for part in sentence.split(']'):
    print(part.split('['))
# Output
['', 'greeting']
[' Hello ', 'me']
[' my name is John.']
