I was studying iter(), and its official documentation says I can call iter(v, w), so that iter() keeps calling v until it returns the value w, and then stops.
But I tried for half an hour and still can't work out a function that returns multiple results.
Here is my code; I expect it to return 1, 2, 3, 4, 5:
def x():
    for i in range(10):
        return i
a = iter(x, 5)
a.next()
I know that when I return i, I actually quit the function.
Maybe it's impossible for a function to return a result multiple times.
But how should I write a function so that iter(x, 5) works properly?
iter() calls the function anew each time it needs a value. Your function returns the same value on every call (0, the first number in range(10)), so the sentinel 5 is never reached.
You could change the function to use a global to illustrate how iter() with two arguments works:
i = 0
def f():
    global i
    i += 1
    return i
for x in iter(f, 5):
    print x
Now each time f() is called, a new number is returned. You could use a default argument, or an instance with state and a method on that instance, too. As long as the function returns something different when called more than once, it fits the iter(a, b) use case.
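The "instance with state and a method" variant mentioned above could look roughly like this. It is only a minimal Python 3 sketch; the Counter class and its step method are hypothetical names made up for illustration.

```python
class Counter:
    """Hypothetical helper: remembers how many times step() was called."""

    def __init__(self):
        self.count = 0

    def step(self):
        # Each call changes the instance state, so each call returns a new value.
        self.count += 1
        return self.count


counter = Counter()
# iter() keeps calling counter.step() until it returns the sentinel 5.
values = list(iter(counter.step, 5))
print(values)  # [1, 2, 3, 4]
```

The sentinel itself is never yielded: step() is called five times, but the fifth result (5) only ends the iteration.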
iter() with two arguments is most often called with a method, where the state of an instance changes with each call. The .readline() method on a file object, for example:
for line in iter(fileobject.readline, ''):
which would work exactly like iterating over the fileobject iterable directly, except it wouldn't use the internal file iteration buffer. That could sometimes be a requirement (see the file.next() method for more information on the file iteration buffer).
You can of course pass in a lambda function too:
for chunk in iter(lambda: fileobject.read(2048), ''):
Now we are reading the file object in chunks of up to 2048 bytes instead of line by line.
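To see the chunked read in action without a real file, here is a small Python 3 sketch. It assumes an in-memory io.BytesIO object as a stand-in for a file opened in binary mode, and a chunk size of 4 instead of 2048 to keep the output short. Note that in Python 3 a binary file returns b"" at end of file, so that is the sentinel to use.

```python
import io

# In-memory stand-in for a real file opened in binary mode (an assumption).
fileobject = io.BytesIO(b"abcdefghij")

# read(4) returns b"" at end of file, which is the sentinel that stops iter().
chunks = list(iter(lambda: fileobject.read(4), b""))
print(chunks)  # [b'abcd', b'efgh', b'ij']
```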
After @Martijn Pieters's answer, I got the idea.
And this is the code I wrote, which uses iter(v, w) correctly:
import random
def x():
    return random.randrange(1,10)
a = iter(x,5)
while True:
    print a.next()
In this code, a.next() returns the value a gets from x() until x() returns 5; at that point a.next() raises StopIteration and the loop stops.
You could also use a generator function, via the yield keyword. Example:
def x():
    for i in range(10):
        yield i
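A generator like this is iterated directly, with no need for the two-argument form of iter(). As a rough Python 3 sketch, the snippet below collects all values and then, as one possible way to mimic the "stop at 5" behaviour, uses itertools.takewhile (my own choice here, not something the answer prescribes).

```python
from itertools import takewhile


def x():
    for i in range(10):
        yield i  # execution pauses here and resumes on the next request


everything = list(x())
# Stop as soon as the value 5 shows up, mimicking iter(callable, 5).
before_five = list(takewhile(lambda value: value != 5, x()))
print(everything)
print(before_five)  # [0, 1, 2, 3, 4]
```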
I was taking a look at the code of a coworker and I felt like this was an unnecessary use of the yield statement. It was something like this:
def standardize_text(text: str):
    pattern = r"ABC" # some regex
    yield re.sub(pattern, "X", text)
def preprocess_docs(docs: List[str]):
    for doc in docs:
        yield standardize_text(doc)
I understand the use of yield in preprocess_docs so that I can return a generator, which would be helpful if docs is a large list. But I don't understand the value of the yield in the standardize_text function. To me, a return statement would do the exact same thing.
Is there a reason why that yield would be useful?
To me, a return statement would do the exact same thing.
Using return instead wouldn't be the same as yield, as explained in ShadowRanger's comment.
With yield, calling the function gives you a generator object:
>>> standardize_text("ABCD")
<generator object standardize_text at 0x10561f740>
Generators can produce more than one result (unlike functions that use return). This generator happens to produce exactly one item, which is a string (the result of re.sub). You can collect the generator's results into a list(), for example, or just grab the first result with next():
>>> list(standardize_text("ABCD"))
['XD']
>>> g = standardize_text("ABCD")
>>> next(g)
'XD'
>>> next(g) # raises StopIteration, indicating the generator has finished
If we change the function to use return:
def standardize_text(text: str):
    pattern = r"ABC" # some regex
    return re.sub(pattern, "X", text)
Then calling the function gives us the single result directly; no list() or next() is needed.
>>> standardize_text("ABCD")
'XD'
Is there a reason why that yield would be useful?
In the standardize_text function, no, not really. But your preprocess_docs function actually does make use of returning more than one value with yield: it returns a generator with one result for each of the values in docs. Those results are either generators themselves (in your original code with yield) or strings (if we change standardize_text to use return).
def preprocess_docs(docs: List[str]):
    for doc in docs:
        yield standardize_text(doc)
# returns a generator because the implementation uses "yield"
>>> preprocess_docs(["ABCD", "AAABC"])
<generator object preprocess_docs at 0x10561f820>
# with standardize_text using "yield re.sub..."
>>> for x in preprocess_docs(["ABCD", "AAABC"]): print(x)
...
<generator object standardize_text at 0x1056cce40>
<generator object standardize_text at 0x1056cceb0>
# with standardize_text using "return re.sub..."
>>> for x in preprocess_docs(["ABCD", "AAABC"]): print(x)
...
XD
AAX
Note: Prior to Python 3's async/await, some concurrency libraries used yield in the same way that await is now used. For example, Twisted's @inlineCallbacks. I don't think this is directly relevant to your question, but I included it for completeness.
def testfunction():
    for i in range(10):
        return('a')
print(testfunction())
I want 'a' output 10 times on one line. If I use print instead of return, it gives me 10 'a's, but each on a new line. Can you help?
return terminates the current function, while print is a call to another function (at least in Python 3).
Any code after a return statement will not be run.
Python's way of printing 10 a's would be:
print('a' * 10)
In your case it would look like the following:
def testfunction():
    return 'a' * 10
print(testfunction())
The reason it only prints once is that the return statement ends the function (and with it, the loop).
To print 'a' 10 times (each on its own line), do the following:
def testfunction():
    for i in range(10):
        print('a')
testfunction()
If you want "a" printed 10 times in one single line then you can simply go for:
def TestCode():
    print("a"*10)
There's no need for a for loop. A for loop with print would output "a" 10 times, but each time on a new line.
You can also take a function argument and get "a" printed as many times as desired.
Such as:
def TestCode(times):
    t = "a"*times
    print(t)
Test:
TestCode(5)
>>> aaaaa
TestCode(7)
>>> aaaaaaa
print and return often get mixed up when starting Python.
A function can return anything, but that doesn't mean the value will be printed for you to see. A function can even return another function (functions that take or return other functions are called higher-order functions).
The function below is adapted from your question and returns a string object. When you call the function, the returned string is assigned to the variable ten_a_letters. It contains everything you wanted, and you can print it to the console.
You could also have used yield or print in your for loop, but that may be outside the scope of this question.
def test_function(item:str="a", n:int=10):
    line = item*n # this will be a string object
    return line
ten_a_letters = test_function()
print(ten_a_letters)
"aaaaaaaaaa"
two_b_letters = test_function("b",2)
print(two_b_letters)
"bb"
I want 'a' outputed 10 times in one line. If I use print instead of
return, it gives me 10 'a's but each on a new line.
If you want to use print, then you need to pass the end keyword argument, as follows:
def testfunction():
    for i in range(10):
        print('a', end='')
However, I think the Pythonic way would be to do the following:
def testfunction():
    print('a' * 10)
When you use return you end the execution of the function immediately and only one value is returned.
Other answers here provide an easier way to solve your problem (which is great), but I would like to suggest a different approach using yield (instead of return) and create a generator (which might be an overkill but a valid alternative nonetheless):
def testfunction():
    for i in range(10):
        yield 'a'
print(''.join(x for x in testfunction()))
1. What does "yield" keyword do?
def test():
    print('a' * 10)
test()
Output will be 'aaaaaaaaaa'.
I would like concurrent.futures.ProcessPoolExecutor.map() to call a function that takes 2 or more arguments. In the example below, I have resorted to using a lambda function and defining ref as a list of the same size as numberlist, filled with an identical value.
1st Question: Is there a better way of doing this? numberlist can hold millions to billions of elements, and ref would have to be just as long, so this approach unnecessarily takes up precious memory, which I would like to avoid. I did this because I read that map stops as soon as the shortest iterable is exhausted.
import concurrent.futures as cf
nmax = 10
numberlist = range(nmax)
ref = [5, 5, 5, 5, 5, 5, 5, 5, 5, 5]
workers = 3
def _findmatch(listnumber, ref):
    print('def _findmatch(listnumber, ref):')
    x=''
    listnumber=str(listnumber)
    ref = str(ref)
    print('listnumber = {0} and ref = {1}'.format(listnumber, ref))
    if ref in listnumber:
        x = listnumber
    print('x = {0}'.format(x))
    return x
a = map(lambda x, y: _findmatch(x, y), numberlist, ref)
for n in a:
    print(n)
    if str(ref[0]) in n:
        print('match')
with cf.ProcessPoolExecutor(max_workers=workers) as executor:
    #for n in executor.map(_findmatch, numberlist):
    for n in executor.map(lambda x, y: _findmatch(x, ref), numberlist, ref):
        print(type(n))
        print(n)
        if str(ref[0]) in n:
            print('match')
Running the code above, I found that the map function was able to achieve my desired outcome. However, when I transferred the same terms to concurrent.futures.ProcessPoolExecutor.map(), python3.5 failed with this error:
Traceback (most recent call last):
  File "/usr/lib/python3.5/multiprocessing/queues.py", line 241, in _feed
    obj = ForkingPickler.dumps(obj)
  File "/usr/lib/python3.5/multiprocessing/reduction.py", line 50, in dumps
    cls(buf, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function <lambda> at 0x7fd2a14db0d0>: attribute lookup <lambda> on __main__ failed
Question 2: Why did this error occur and how do I get concurrent.futures.ProcessPoolExecutor.map() to call a function with more than 1 argument?
To answer your second question first, you are getting an exception because a lambda function like the one you're using is not picklable. Since Python uses the pickle protocol to serialize the data passed between the main process and the ProcessPoolExecutor's worker processes, this is a problem. It's not clear why you are using a lambda at all. The lambda you had takes two arguments, just like the original function. You could use _findmatch directly instead of the lambda and it should work.
with cf.ProcessPoolExecutor(max_workers=workers) as executor:
    for n in executor.map(_findmatch, numberlist, ref):
        ...
As for the first issue about passing the second, constant argument without creating a giant list, you could solve this in several ways. One approach might be to use itertools.repeat to create an iterable object that repeats the same value forever when iterated on.
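To illustrate how an endless itertools.repeat pairs up with a finite iterable, here is a small sketch using the built-in map rather than the executor, simply to keep it short. The add function is a hypothetical stand-in for _findmatch.

```python
import itertools


def add(a, b):
    # Hypothetical two-argument function standing in for _findmatch.
    return a + b


# repeat(5) never ends, but map() stops when the shortest input (range(3)) ends.
results = list(map(add, range(3), itertools.repeat(5)))
print(results)  # [5, 6, 7]
```

No list of repeated values is ever built; repeat(5) just hands out the same object each time it is asked.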
But a better approach would probably be to write an extra function that passes the constant argument for you. (Perhaps this is why you were trying to use a lambda function?) It should work if the function you use is accessible at the module's top-level namespace:
def _helper(x):
    return _findmatch(x, 5)
with cf.ProcessPoolExecutor(max_workers=workers) as executor:
    for n in executor.map(_helper, numberlist):
        ...
(1) No need to make a list. You can use itertools.repeat to create an iterator that just repeats the same value.
(2) You need to pass a named function to map because it will be passed to the subprocess for execution. map uses the pickle protocol to send things, and lambdas can't be pickled, so they can't be part of the map call. But the lambda is totally unnecessary anyway: all it did was call a 2-parameter function with 2 parameters. Remove it completely.
The working code is
import concurrent.futures as cf
import itertools
nmax = 10
numberlist = range(nmax)
workers = 3
def _findmatch(listnumber, ref):
    print('def _findmatch(listnumber, ref):')
    x=''
    listnumber=str(listnumber)
    ref = str(ref)
    print('listnumber = {0} and ref = {1}'.format(listnumber, ref))
    if ref in listnumber:
        x = listnumber
    print('x = {0}'.format(x))
    return x
with cf.ProcessPoolExecutor(max_workers=workers) as executor:
    #for n in executor.map(_findmatch, numberlist):
    for n in executor.map(_findmatch, numberlist, itertools.repeat(5)):
        print(type(n))
        print(n)
        #if str(ref[0]) in n:
        #    print('match')
Regarding your first question, do I understand it correctly that you want to pass an argument whose value is determined only at the time you call map but constant for all instances of the mapped function? If so, I would do the map with a function derived from a "template function" with the second argument (ref in your example) baked into it using functools.partial:
from functools import partial
refval = 5
def _findmatch(ref, listnumber): # arguments swapped
    ...
with cf.ProcessPoolExecutor(max_workers=workers) as executor:
    for n in executor.map(partial(_findmatch, refval), numberlist):
        ...
Re. question 2, first part: I haven't found the exact piece of code that tries to pickle (serialize) the function that should then be executed in parallel, but it sounds natural that that has to happen -- not only the arguments but also the function has to be transferred to the workers somehow, and it likely has to be serialized for this transfer. The fact that partial functions can be pickled while lambdas cannot is mentioned elsewhere, for instance here: https://stackoverflow.com/a/19279016/6356764.
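The difference between partial objects and lambdas can be checked directly with the pickle module. The sketch below is only an illustration: it wraps operator.add (chosen because it is importable by name) instead of _findmatch, and it catches both PicklingError and AttributeError because which one is raised for a lambda can depend on where it is defined and on the Python version.

```python
import operator
import pickle
from functools import partial

# A partial built from an importable function pickles fine.
add_five = partial(operator.add, 5)
restored = pickle.loads(pickle.dumps(add_five))
print(restored(10))  # 15

# A lambda has no importable name, so pickling it fails.
try:
    pickle.dumps(lambda x: x + 5)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False
print(lambda_pickled)  # False
```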
Re. question 2, second part: if you wanted to call a function with more than one argument in ProcessPoolExecutor.map, you would pass it the function as the first argument, followed by an iterable of first arguments for the function, followed by an iterable of its second arguments etc. In your case:
for n in executor.map(_findmatch, numberlist, ref):
    ...
def apply_twice(func,arg):
    return func(func(arg))
def add_five(x):
    return x+5
print (apply_twice(add_five,10))
The output I get is 20.
This one is confusing me: how does it work? Can anybody explain it by breaking it down?
The function apply_twice(func,arg) takes two arguments, a function object func and an argument to pass to the function func called arg.
In Python, functions can easily be passed to other functions as arguments; they are not treated differently from any other argument type (i.e., they are first-class citizens).
Inside apply_twice, func is called twice in the line:
func(func(arg))
Which, alternatively, can be viewed in a more friendly way as:
res = func(arg)
func(res)
If you replace func with the name of the function passed in add_five you get the following:
res = add_five(arg) # equals: 15
add_five(res) # result: 20
which, of course, returns your expected result.
The key point to remember is that you shouldn't think of functions in Python as some special construct; functions are objects, just like ints, lists, and everything else.
Expanding the code it executes as follows, starting with the print call:
apply_twice(add_five,10)
add_five(add_five(10)) # add_five(10) = 15
add_five(15) # add_five(15) = 20
Which gives you the result: 20.
When apply_twice is called, you are passing in a function object and a value. As you can see in the apply_twice definition, where you see func that is substituted with the function object passed to it (in this case, add_five). Then, starting with the inner func(arg) call, evaluate the result, which is then passed to add_five again, in the outer return func( ... ) call.
What you need to understand here is that
apply_twice(func,arg)
is a higher-order function that accepts two arguments (another function, func, and an argument, arg). It first evaluates the inner call func(arg), then passes that value as the argument to the outer call of func.
Remember that we have a function add_five(x), which adds 5 to the argument supplied to it.
This function is then passed as an argument to another function,
apply_twice(func,arg), which returns func(func(arg)).
Splitting func(func(arg)) up, we have
a = func(arg) # let's call the inner result a
so func(func(arg)) == func(a), since a = func(arg).
Here func is our add_five function: after it adds 5 to 10, the resulting 15 is reused as a fresh argument and another 5 is added to it, which is why the result is 20.
Another example is:
def test(func, arg):
    return func(func(arg))
def mult(x):
    return x * x
print(test(mult, 2))
which gives 16 as the result.
I'm brand-new to decorators and closures, I'm trying to practice with a simple example. When executed it raises an error of:
NameError: name 'congratulate' is not defined
What do I need to change?
"""
A recursive function to check if a string is a palindrome.
"""
@congratulate
def palindrome(phrase):
    characters = [char.lower() for char in phrase if char.isalpha()]
    chars_len = len(characters)
    out1 = characters[0]
    out2 = characters[-1]
    if chars_len <= 2:
        return out1 == out2
    else:
        if out1 == out2:
            return palindrome(characters[1:-1])
        else:
            return False
def congratulate(func):
    if func:
        print('Congratulations, it\'s a palindrome!')
if __name__ == '__main__':
    print(palindrome('Rats live on no evil star'))
"""
A recursive function to check if a string is a palindrome.
"""
def congratulate(func):
    def wrapper(*argv, **kargs):
        result = func(*argv, **kargs)
        if result:
            print('Congratulations, it\'s a palindrome!')
        return result
    return wrapper
@congratulate
def palindrome(phrase):
    characters = [char.lower() for char in phrase if char.isalpha()]
    chars_len = len(characters)
    out1 = characters[0]
    out2 = characters[-1]
    if chars_len <= 2:
        return out1 == out2
    else:
        if out1 == out2:
            return palindrome(characters[1:-1])
        else:
            return False
if __name__ == '__main__':
    print(palindrome('Rats live on no evil star'))
The essence of understanding decorators is:
@f
def g(args): ...
=>
g = f(g), so a call g(args) becomes f(g)(args)
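The equivalence can be checked with a tiny example. This is only a sketch: shout, greet, and greet_plain are hypothetical names invented for the illustration.

```python
def shout(func):
    # Hypothetical decorator: returns a new function wrapping func.
    def wrapper(text):
        return func(text).upper()
    return wrapper


@shout
def greet(name):
    return "hello " + name


def greet_plain(name):
    return "hello " + name


# Doing by hand what the @ line does automatically.
greet_manual = shout(greet_plain)

print(greet("bob"))         # HELLO BOB
print(greet_manual("bob"))  # HELLO BOB
```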
I know I'm late to the party, but I want to expand.
As noted, the NameError in this case is caused by the fact that you use a name before you actually create one. Moving congratulate() to the top remedies this.
Apart from the NameError, you have two implicit logic errors relating to decorator/function functionality:
First Issue:
Your if clause in congratulate always evaluates to True; you aren't exactly congratulating when a string is a palindrome.
This is caused by the fact that function objects always evaluate to True, so a condition of the form if func: will always execute:
def f():
    pass
if f:
    print("I'm true!")
# Prints: I'm true!
This is thankfully trivial and can easily be fixed by actually calling the function if func("test string"):
Second Issue:
The second issue is less trivial and is probably caused by the fact that decorators can be confusing. You aren't actually using congratulate() the way decorators are supposed to be used.
A decorator is a callable that returns a callable (callables are things like functions, or instances of classes that define __call__). What your 'decorator' is doing here is simply accepting a function object, evaluating whether the object is true, and then printing congratulations.
Worst part? It is also implicitly rebinding the name palindrome to None.
Again, you can see this indirect effect (+1 for rhyming) in this next snippet:
def decor(f):
    if f: print("Decorating can be tricky")
@decor
def f():
    print("Do I even Exist afterwards?")
# When executed, this prints:
Decorating can be tricky
Cool, our function f has been decorated, but, look what happens when we try calling our function f:
f()
TypeError Traceback (most recent call last)
<ipython-input-31-0ec059b9bfe1> in <module>()
----> 1 f()
TypeError: 'NoneType' object is not callable
Yes, our function object f has now been assigned to None, the return value of our decor function.
This happens because, as pointed out, the @ syntax is directly equivalent to the following:
@decor
def f(): pass
# similar to
f = decor(f) # we re-assign the name f!
Because of this we must make sure the return value of a decorator is an object that can afterwards be called again, ergo, a callable object.
So what do you do? One option you might consider would be simply returning the function you passed:
def congratulate(func):
    if func("A test Phrase!"):
        print('Congratulations, it\'s a palindrome!')
    return func
This will guarantee that after the decorator runs on your palindrome() function, the name palindrome is still going to map to a callable object.
The problem? This turns out to be a one-time ride. When Python encounters your decorator and your function, it's going to execute congratulate once and as a result only going to execute your if clause once.
But you need it to run this if every time your function is called! What can you do in order to accomplish this? Return a function that executes the decorated function (so called nested function decorators).
By doing this you create a new function for the name palindrome and this function contains your original function which you make sure is executed each time palindrome() is called.
def congratulate(func): # grabs your decorated function
    # a new function that uses the original decorated function
    def newFunc():
        # Use the function
        if func("Test string"):
            print('Congratulations, it\'s a palindrome!')
    # Return the function that uses the original function
    return newFunc
newFunc is now a function that issues calls to your original function.
The decoration process now assigns the palindrome name to the newFunc object (notice how we returned it with return newFunc).
As a result, each time you execute a call of the form palindrome(), this is translated to newFunc(), which in turn calls func() in its body. (If you're still with me, I commend you.)
What's the final issue here? We've hard-coded the parameters for func. As is, every time you call palindrome(), newFunc() will call your original function func with a call signature of func("Test string"), which is not what we want; we need to be able to pass parameters.
What's the solution? Thankfully, this is simple: Pass an argument to newFunc() which will then pass the argument to func():
def congratulate(func): # grabs your decorated function
    # a new function that uses the original decorated function
    # we pass the required argument <phrase>
    def newFunc(phrase):
        # Use the function
        # we use the argument <phrase>
        if func(phrase):
            print('Congratulations, it\'s a palindrome!')
    # Return the function that uses the original function
    return newFunc
Now, every time you call palindrome('Rats live on no evil star'), this will translate to a call of newFunc('Rats live on no evil star'), which will then pass that call on to your func as func('Rats live on no evil star') in the if clause.
After execution, this works wonderfully and gets you the result you wanted:
palindrome('Rats live on no evil star')
Congratulations, it's a palindrome!
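One thing worth noting is that newFunc has no return statement, so the decorated palindrome() prints the message but returns None. A possible variant, offered only as a sketch, passes the result through, accepts any arguments, and uses functools.wraps to keep the original function's name. The is_palindrome function below is a simplified, non-recursive stand-in I made up for the example, not the original palindrome().

```python
import functools


def congratulate(func):
    @functools.wraps(func)  # keep the original name and docstring
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        if result:
            print("Congratulations, it's a palindrome!")
        return result  # pass the original return value through
    return wrapper


@congratulate
def is_palindrome(phrase):
    # Simplified, non-recursive stand-in for the palindrome() function.
    letters = [char.lower() for char in phrase if char.isalpha()]
    return letters == letters[::-1]


print(is_palindrome('Rats live on no evil star'))  # message, then True
print(is_palindrome('Not a palindrome'))           # False
```

The stand-in is non-recursive on purpose: a recursive function calls itself through its decorated name, so the wrapper (and its message) would run once per recursion level.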
I hope you enjoy reading, I believe I'm done (for now)!
Move the congratulate() function above the function it's decorating (palindrome).