Getting the match number when passing a function in re.sub [duplicate] - python

This question already has an answer here:
Replace part of a string in python multiple times but each replace increments a number in the string by 1
(1 answer)
Closed 2 years ago.
When using a function in re.sub:
import re
def custom_replace(match):
    # how to get the match number here? i.e. 0, 1, 2
    return 'a'
print(re.sub(r'o', custom_replace, "oh hello wow"))
How to get the match number inside custom_replace?
i.e. 0, 1, 2 for the three "o" of the example input string.
NB: I don't want to use a global variable for this, because multiple such operations might happen in different threads etc.

Based on @Barmar's answer, I tried this:
import re
def custom_replace(match, matchcount):
    result = 'a' + str(matchcount.i)
    matchcount.i += 1
    return result
def any_request():
    matchcount = lambda: None  # an empty "object", see https://stackoverflow.com/questions/19476816/creating-an-empty-object-in-python/37540574#37540574
    matchcount.i = 0  # benefit: it's a local variable that we pass to custom_replace "by reference"
    print(re.sub(r'o', lambda match: custom_replace(match, matchcount), "oh hello wow"))
    # a0h hella1 wa2w
any_request()
and it seems to work.
Reason: I was a bit reluctant to use a global variable for this, because I'm using this inside a web framework, in a route function (called any_request() here).
Let's say there are many requests in parallel (in threads), I don't want a global variable to be "mixed" between different calls (since the operations are probably not atomic?)

There doesn't seem to be a built-in way. You can use a global variable as a counter.
import re
def custom_replace(match):
    global match_num
    result = 'a' + str(match_num)
    match_num += 1
    return result
match_num = 0
print(re.sub(r'o', custom_replace, "oh hello wow"))
Output is
a0h hella1 wa2w
Don't forget to reset match_num to 0 before each time you call re.sub() with this function.
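The reset step, and the thread-safety concern raised in the question, can both be avoided by keeping the counter local to each call. A minimal sketch, assuming a hypothetical wrapper named numbered_sub (it is not part of the re module) and using itertools.count as the per-call counter:

```python
import itertools
import re

def numbered_sub(pattern, text, prefix="a"):
    """Replace each match with prefix + its match number (0, 1, 2, ...)."""
    counter = itertools.count()  # created fresh on every call, so nothing is shared

    def replace(match):
        # next(counter) yields 0 for the first match, 1 for the second, ...
        return prefix + str(next(counter))

    return re.sub(pattern, replace, text)

print(numbered_sub(r"o", "oh hello wow"))  # a0h hella1 wa2w
```

Because the counter lives inside the function call, two threads calling numbered_sub at the same time each get their own numbering.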

You can use re.search with re.sub.
import re
def count_sub(pattern,text,repl=''):
    count=1
    while re.search(pattern,text):
        text=re.sub(pattern,repl+str(count),text,count=1)
        count+=1
    return text
Output:
count_sub(r'o', 'oh hello world')
# '1h hell2 w3rld'
count_sub(r'o', 'oh hello world','a')
# 'a1h hella2 wa3rld'
Alternative:
def count_sub1(pattern,text,repl=''):
    it=enumerate(re.finditer(pattern,text),1)
    count=1
    while count:
        count,_=next(it,(0,0))
        text=re.sub(pattern,repl+str(count),text,count=1)
    return text
Output:
count_sub1(r'o','oh hello world')
# '1h hell2 w3rld'
count_sub1(r'o','oh hello world','a')
# 'a1h hella2 wa3rld'
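Both versions above search the text again after every replacement, so a replacement that itself matches the pattern (for example repl='o') would be matched again. A single-pass sketch that avoids this, assuming a hypothetical name count_sub_single_pass, walks the matches once with re.finditer and rebuilds the string from the pieces between them:

```python
import re

def count_sub_single_pass(pattern, text, repl=""):
    """Replace the n-th match (counting from 1) with repl + str(n) in one pass."""
    pieces = []
    last_end = 0
    for number, match in enumerate(re.finditer(pattern, text), 1):
        pieces.append(text[last_end:match.start()])  # text between matches
        pieces.append(repl + str(number))            # numbered replacement
        last_end = match.end()
    pieces.append(text[last_end:])                   # remainder after the last match
    return "".join(pieces)

print(count_sub_single_pass(r"o", "oh hello world"))       # 1h hell2 w3rld
print(count_sub_single_pass(r"o", "oh hello world", "a"))  # a1h hella2 wa3rld
```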

Related

How do I increase the value of a variable in a command line?

a = 1
for i in range(5):
    browser.find_element_by_xpath("/html/body/div[6]/div/div/div[2]/div/div/div[1]/div[3]/button").click()
    sleep(1)
I want to increase the 1 in div[1] by 1 on every loop iteration, but how can I do that?
I thought I needed to add a variable, insert it with "+a+", and finally do "a = a + 1" to increase the value each time, but it didn't work.
a = 1
for i in range(5):
    browser.find_element_by_xpath("/html/body/div[6]/div/div/div[2]/div/div/div["+a+"]/div[3]/button").click()
    a = a + 1
    sleep(1)
for i in range(1,6):
    browser.find_element_by_xpath("/html/body/div[6]/div/div/div[2]/div/div/div["+str(i)+"]/div[3]/button").click()
    sleep(1)
You don't need two variables, just the loop variable i. Convert it to a string with str() and concatenate it where you need it. The value of i increases on every iteration of the loop, going from 1 to 5, which is exactly what you need.
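The reason the "+a+" attempt fails is that Python does not concatenate a str with an int. A small sketch without Selenium, assuming hypothetical names PREFIX, SUFFIX and button_xpath, shows the error and an f-string alternative to str():

```python
# The XPath from the question, split around the index that should change.
PREFIX = "/html/body/div[6]/div/div/div[2]/div/div/div["
SUFFIX = "]/div[3]/button"

def button_xpath(index):
    """Build the XPath for the index-th button; the f-string converts the int."""
    return f"{PREFIX}{index}{SUFFIX}"

try:
    PREFIX + 1 + SUFFIX  # str + int is not allowed
except TypeError as error:
    print("concatenating str and int fails:", error)

for i in range(1, 6):
    print(button_xpath(i))
```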
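As an alternative to Elyes' answer: if this loop lives inside a function and a is defined at module level, declare a with the 'global' keyword at the top of the function so that a = a + 1 updates it. You still need str(a) when concatenating it into the path.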
You don't really need two variables for this unless you are going to use the second variable for something. However, look at the following code and it will show you that both i and a will give you the same result:
from time import sleep
a = 1
for i in range(1, 6):
    path = "/html/body/div[6]/div/div/div[2]/div/div/div[{idx}]/div[3]/button".format(idx=i)
    print(path, 'using i')
    path = "/html/body/div[6]/div/div/div[2]/div/div/div[{idx}]/div[3]/button".format(idx=a)
    a += 1
    print(path, 'using a')
    sleep(1)
Result:
/html/body/div[6]/div/div/div[2]/div/div/div[1]/div[3]/button using i
/html/body/div[6]/div/div/div[2]/div/div/div[1]/div[3]/button using a
/html/body/div[6]/div/div/div[2]/div/div/div[2]/div[3]/button using i
/html/body/div[6]/div/div/div[2]/div/div/div[2]/div[3]/button using a
/html/body/div[6]/div/div/div[2]/div/div/div[3]/div[3]/button using i
/html/body/div[6]/div/div/div[2]/div/div/div[3]/div[3]/button using a
/html/body/div[6]/div/div/div[2]/div/div/div[4]/div[3]/button using i
/html/body/div[6]/div/div/div[2]/div/div/div[4]/div[3]/button using a
/html/body/div[6]/div/div/div[2]/div/div/div[5]/div[3]/button using i
/html/body/div[6]/div/div/div[2]/div/div/div[5]/div[3]/button using a
You can read up on range here

how to use conditionals (multiple IF and ELSE statements) in a function using PYTHON [duplicate]

This question already has answers here:
Why is "None" printed after my function's output?
(7 answers)
Closed 6 months ago.
I'm new to programming and I'm taking a course on edx.org.
I'm having issues with using conditionals in a function. Each time I call the function it gives me the output I want but also shows "None" at the end. Is there any way I can use the return keyword in the code? Below is the question and my code.
###create a functions using startswith('w')
###w_start_test() tests if starts with "w"
# function should have a parameter for test_string and print the test result
# test_string_1 = "welcome"
# test_string_2 = "I have $3"
# test_string_3 = "With a function it's efficient to repeat code"
# [ ] create a function w_start_test() use if & else to test with startswith('w')
# [ ] Test the 3 string variables provided by calling w_start_test()
test_string_1='welcome'.lower()
test_string_2='I have $3'.lower()
test_string_3='With a function it\'s efficient to repeat code'.lower()
def w_start_test():
    if test_string_1.startswith('w'):
        print(test_string_1,'starts with "w"')
    else:
        print(test_string_2,'does not start with "w"')
    if test_string_2.startswith('w'):
        print(test_string_2,'starts with "w"')
    else:
        print(test_string_2,'does not starts with "w"')
    if test_string_3.startswith('w'):
        print(test_string_3,'starts with "w"')
    else:
        print(test_string_3,'does not start with "w"')
print(w_start_test())
There are a number of questions here, I'll try to answer them.
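You are printing the result of calling your function. print(w_start_test()) first calls the function, which prints its own messages, and then prints the function's return value. Because w_start_test() has no return statement, that value is None, which is the extra line you see. Calling w_start_test() without the surrounding print() removes it.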
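A minimal illustration of that behaviour, using a hypothetical function named greet that is not part of the exercise:

```python
def greet():
    # Prints a message but has no return statement,
    # so the call evaluates to None.
    print("hello")

result = greet()   # prints: hello
print(result)      # prints: None

greet()            # calling without print() shows only: hello
```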
From my understanding you are wanting to compare many different strings, there are a few ways you can do that but here's my solution:
You take your 3 strings, and put them into a list like so:
test_strings = ['welcome'.lower(),'I have $3'.lower(),'With a function it\'s efficient to repeat code'.lower()]
We create our function as you have done so already but include parameters instead:
def w_start_test(test_string_list):
    for string in test_string_list:
        if string.startswith('w'):
            print(string,'starts with "w"')
        else:
            print(string,'does not start with "w"')
    return
This function takes a parameter, test_string_list and loops through all objects within this list and does the comparisons you have provided. We then return nothing because I am not sure what you want to return.
Let's say you wanted to return 'Completed', you would do this:
test_strings = ['welcome'.lower(),'I have $3'.lower(),'With a function it\'s efficient to repeat code'.lower()]
def w_start_test(test_string_list):
    for string in test_string_list:
        if string.startswith('w'):
            print(string,'starts with "w"')
        else:
            print(string,'does not start with "w"')
    return 'Completed Test'
if __name__ == '__main__':
    ValueOfTest = w_start_test(test_strings)
    print(ValueOfTest)
Functions are slightly complicated. The solution which you are looking for is as below:
def w_start_test(alpha):
    if alpha.lower().startswith("w"):
        print("The word starts with 'w'")
    else:
        print("The word doesn't start with 'w'")
w_start_test(test_string_1)
w_start_test(test_string_2)
w_start_test(test_string_3)
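The question also asks whether return can be used. One possible sketch, assuming the function is allowed to build the message and leave the printing to the caller (the exercise text itself asks the function to print, so treat this as a variation rather than the expected answer):

```python
def w_start_test(test_string):
    """Return a message saying whether test_string starts with 'w' (case-insensitive)."""
    if test_string.lower().startswith("w"):
        return test_string + ' starts with "w"'
    else:
        return test_string + ' does not start with "w"'

test_string_1 = "welcome"
test_string_2 = "I have $3"
test_string_3 = "With a function it's efficient to repeat code"

# The caller prints the returned string, so no stray None appears.
for test_string in (test_string_1, test_string_2, test_string_3):
    print(w_start_test(test_string))
```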
I was trying to find the right answer, and I think I did.
Here is my variant of the solution.
test_string_1 = "welcome"
test_string_2 = "I have $3"
test_string_3 = "With a function it's efficient to repeat code"
# [ ] create a function w_start_test() use if & else to test with startswith('w')
# [ ] Test the 3 string variables provided by calling w_start_test()
if test_string_1.lower().startswith('w'):
    print('this string starts with \'w\'')
else:
    pass
if test_string_2.lower().startswith('w'):
    print('this string starts with \'w\'')
else:
    print('this string doesn\'t start with \'w\'')
if test_string_3.lower().startswith('w'):
    print('this string starts with \'w\'')
else:
    pass

search string with find() method and returning every char after 2nd occurrence in string

Here goes another edX exercise I got stuck earlier:
They asked me to create a function called "after_second" that accepts two
arguments:
1. a string to search
2. a search term.
Function: return everything in the first string after
the SECOND occurrence of the search term.
For example:
after_second("1122334455321", "3") -> 4455321
The search term "3" appears at indices 4 and 5. So, this returns everything from the index 6 to the end.
after_second("heyyoheyhi!", "hey") -> hi!
The search term "hey" appears at indices 0 and 5. The search term itself is three characters. So, this returns everything from the index 8 to the end.
This is my code:
def after_second(searchString, searchTerm):
    finder = searchString.find(searchTerm)
    count = 0
    while not finder == -1:
        finder = searchString.find(searchTerm, finder + 1)
        count += 1
        if count == 1:
            return searchString[finder + len(searchTerm):]
print(after_second("1122334455321", "3")) #Sample problems by edX
print(after_second("heyyoheyhi!", "hey")) #Sample problems by edX
Which returns the expected correct answers:
4455321
hi!
I'm wondering if there is a better way to structure the code I made. It outputs the correct answer but I'm not convinced it is the best answer.
Thank you in advance!
You can easily use str.split by passing the maxsplit parameter as 2, then taking the final item from the split:
>>> "1122334455321".split('3', 2)[-1]
'4455321'
>>> "heyyoheyhi!".split('hey', 2)[-1]
'hi!'
And your function can now be written as:
def after_second(search_string, search_term):
    return search_string.split(search_term, 2)[-1]
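One caveat: when the term occurs fewer than two times, split(...)[-1] still returns a string (the whole input, or the part after the first occurrence). A sketch that makes that case explicit with two find calls, assuming an empty string is the desired result there (the exercise does not say) and using the hypothetical name after_second_strict:

```python
def after_second_strict(search_string, search_term):
    """Return the text after the second occurrence of search_term, or '' if there is none."""
    first = search_string.find(search_term)
    if first == -1:
        return ""  # assumption: no occurrence at all -> empty string
    # Search from first + 1, as the question's code does, so overlapping
    # occurrences are counted too.
    second = search_string.find(search_term, first + 1)
    if second == -1:
        return ""  # assumption: only one occurrence -> empty string
    return search_string[second + len(search_term):]

print(after_second_strict("1122334455321", "3"))  # 4455321
print(after_second_strict("heyyoheyhi!", "hey"))  # hi!
```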

Making string series in Python

I have a problem in Python I simply can't wrap my head around, even though it's fairly simple (I think).
I'm trying to make "string series". I don't really know what it's called, but it goes like this:
I want a function that makes strings that run in series, so that every time the functions get called it "counts" up once.
I have a list with "a-z0-9._-" (a to z, 0 to 9, dot, underscore, dash). And the first string I should receive from my method is aaaa, next time I call it, it should return aaab, next time aaac etc. until I reach ----
Also the length of the string is fixed for the script, but should be fairly easy to change.
(Before you look at my code, I would like to apologize if my code doesn't adhere to conventions; I started coding Python some days ago so I'm still a noob).
What I've got:
Generating my list of available characters
chars = []
for i in range(26):
    chars.append(str(chr(i + 97)))
for i in range(10):
    chars.append(str(i))
chars.append('.')
chars.append('_')
chars.append('-')
Getting the next string in the sequence
iterationCount = 0
nameLen = 3
charCounter = 1
def getString():
    global charCounter, iterationCount
    name = ''
    for i in range(nameLen):
        name += chars[((charCounter + (iterationCount % (nameLen - i) )) % len(chars))]
        charCounter += 1
    iterationCount += 1
    return name
It's the getString() function that needs to be fixed, specifically the way name gets built.
I have a feeling that it's possible by using the right "modulo hack" in the index, but I can't make it work as intended!
What you are trying to do can be done very easily using generators and itertools.product:
import itertools
def getString(length=4, characters='abcdefghijklmnopqrstuvwxyz0123456789._-'):
    for s in itertools.product(characters, repeat=length):
        yield ''.join(s)
for s in getString():
    print(s)
aaaa
aaab
aaac
aaad
aaae
aaaf
...
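The "modulo hack" the question hints at also works: treat the call number as a number written in base 39, where each digit indexes into the character list. A minimal sketch, assuming a hypothetical helper nth_name and the 39-character alphabet described in the question:

```python
import string

# a-z, 0-9, dot, underscore, dash: 39 characters in total
CHARS = string.ascii_lowercase + string.digits + "._-"

def nth_name(n, length=4, chars=CHARS):
    """Return the n-th fixed-length string in counting order (n = 0 gives 'aaaa')."""
    digits = []
    for _ in range(length):
        n, remainder = divmod(n, len(chars))  # peel off the least significant digit
        digits.append(chars[remainder])
    return "".join(reversed(digits))          # most significant digit first

print(nth_name(0))  # aaaa
print(nth_name(1))  # aaab
print(nth_name(2))  # aaac
```

Unlike the generator, this gives random access: any position in the series can be computed without producing the strings before it.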

make a global condition break

allow me to preface this by saying that i am learning python on my own as part of my own curiosity, and i was recommended a free online computer science course that is publicly available, so i apologize if i am using terms incorrectly.
i have seen questions regarding this particular problem on here before - but i have a separate question from them and did not want to hijack those threads. the question:
"a substring is any consecutive sequence of characters inside another string. The same substring may occur several times inside the same string: for example "assesses" has the substring "sses" 2 times, and "trans-Panamanian banana" has the substring "an" 6 times. Write a program that takes two lines of input, we call the first needle and the second haystack. Print the number of times that needle occurs as a substring of haystack."
my solution (which works) is:
first = str(input())
second = str(input())
count = 0
location = 0
while location < len(second):
    if location == 0:
        location = str.find(second,first,0)
        if location < 0:
            break
        count = count + 1
    location = str.find(second,first,location +1)
    if location < 0:
        break
    count = count + 1
print(count)
if you notice, i have on two separate occasions made the if statement that if location is less than 0, to break. is there some way to make this a 'global' condition so i do not have repetitive code? i imagine efficiency becomes paramount with increasing program sophistication so i am trying to develop good practice now.
how would python gurus optimize this code or am i just being too nitpicky?
I think Matthew and darshan have the best solution. I will just post a variation which is based on your solution:
first = str(input())
second = str(input())
def count_needle(first, second):
    location = str.find(second,first)
    if location == -1:
        return 0 # none whatsoever
    else:
        count = 1
        while location < len(second):
            location = str.find(second,first,location +1)
            if location < 0:
                break
            count = count + 1
        return count
print(count_needle(first, second))
Idea:
use a function to structure the code when appropriate
initialising the variable location before entering the while loop saves you from checking location < 0 multiple times
Check out regular expressions, python's re module (http://docs.python.org/library/re.html). For example,
import re
first = str(input())
second = str(input())
regex = first[:-1] + '(?=' + first[-1] + ')'
print(len(re.findall(regex, second)))
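Two caveats about the pattern above: regex metacharacters in the needle are not escaped, and each match still consumes all but the last character of the needle, so heavily overlapping needles (for example "aaa" in "aaaaa") can be undercounted. A sketch that puts the whole needle inside the lookahead and escapes it, assuming a hypothetical function name count_overlapping:

```python
import re

def count_overlapping(needle, haystack):
    """Count occurrences of needle in haystack, including overlapping ones."""
    if not needle:
        return 0  # assumption: an empty needle counts as zero occurrences
    # The lookahead matches at every position where needle starts,
    # without consuming any characters; re.escape treats needle literally.
    pattern = "(?=" + re.escape(needle) + ")"
    return len(re.findall(pattern, haystack))

print(count_overlapping("sses", "assesses"))               # 2
print(count_overlapping("an", "trans-Panamanian banana"))  # 6
print(count_overlapping("aaa", "aaaaa"))                   # 3
```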
As mentioned by Matthew Adams, the best way to do it is to use Python's re module.
For your case the solution would look something like this:
import re
def find_needle_in_heystack(needle, heystack):
    return len(re.findall(needle, heystack))
Since you are learning Python, a good habit is to follow the 'DRY' (Don't Repeat Yourself) mantra. There are lots of Python utilities that you can use in many similar situations.
For a quick overview of a few very important Python modules you can go through this class:
Google Python Class
which should only take you a day.
Even your approach could, in my opinion, be simplified by using the fact that find returns -1 when the substring is not found, including when you ask it to search from an offset past the end of the string:
>>> x = 'xoxoxo'
>>> start = x.find('o')
>>> indexes = []
>>> while start > -1:
...     indexes.append(start)
...     start = x.find('o',start+1)
...
>>> indexes
[1, 3, 5]
needle = "ss"
haystack = "ssi lass 2 vecess estan ss."
# note: str.count counts non-overlapping occurrences only
print('needle occurs %d times in haystack.' % haystack.count(needle))
Here you go :
first = str(input())
second = str(input())
x=len(first)
counter=0
for i in range(0,len(second)):
    if first==second[i:(x+i)]:
        counter=counter+1
print(counter)
Answer
needle=input()
haystack=input()
counter=0
for i in range(0,len(haystack)):
    if(haystack[i:len(needle)+i]!=needle):
        continue
    counter=counter+1
print(counter)
