Questions about "yield from" and "next" behaviour - python

So I am making a generator from a list and would like to call next on it, which should just return the next item in the list. However, it returns the same object every time, i.e. the whole piece of code is run again instead of just advancing to the next yield. The example below shows the expected behaviour when looping through the list, but then next returns 1 twice, whereas I would like the second call of next to return 2.
class demo:
    @property
    def mygen(self):
        a = [1, 2, 3, 4, 5]
        b = [6, 7, 8, 9, 10]
        yield from a
        yield from b

if __name__ == '__main__':
    demo1 = demo()
    print([_ for _ in demo1.mygen])
    demo2 = demo()
    print(next(demo2.mygen))
    print(next(demo2.mygen))
There's a reason I am turning a list into a generator: the list is the response from an API call, and I would like to dynamically return the next item in it, making another API call when I reach the end of that list.

Every call to the property creates a new generator. You should store the generator returned by the property in a variable; then you will be able to call next on it multiple times. Change
print(next(demo2.mygen))
print(next(demo2.mygen)) # calls next on a fresh generator
to
gen = demo2.mygen
print(next(gen))
print(next(gen)) # calls next on the SAME generator
As others have pointed out, this behaviour should have you reconsider making this a property in the first place. Seeing
demo2.mygen()
makes it much more obvious that there is some dynamic stuff going on, while
demo2.mygen
gives the impression of a more static attribute producing the same object every time. You can find some more elaboration on that here.
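For the API use case mentioned in the question, the same store-the-generator rule applies. Here is a minimal sketch of a paging generator; fetch_page is a hypothetical stand-in for the real API call and is assumed to return a (possibly empty) list per page:
def paged_results(fetch_page):
    page = 0
    while True:
        items = fetch_page(page)  # hypothetical API call returning a list
        if not items:  # an empty page signals the end of the data
            return
        yield from items  # hand out the current page one item at a time
        page += 1

gen = paged_results(lambda p: [[1, 2], [3]][p] if p < 2 else [])
print(next(gen))  # 1
print(next(gen))  # 2
print(next(gen))  # 3 -- fetching the second page happened transparently
Because the generator is stored in gen once, each next(gen) advances the same iterator, and a new page is only fetched when the current one runs out.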

Related

What happens if we run a for loop over an exhausted generator in Python?

I am creating a generator in Python 3 which may yield a single value or more.
What I want is to loop over this generator starting at its second value, running an API request function with each value. If the generator yields only a single value, the for loop and its body should not execute at all. If the generator yields more than one value, the function inside the for loop should run for the second value of the generator and every value after it.
The reason I want to start at the second value is that the first value has already been used for an API request and its result has been stored.
My question concerns a generator that produces a single value.
I give a code example below (I simulated the API request with the print() function):
def iterexample():  # creating a simple generator that yields a single value
    yield 0

print(0)  # simulates the API request already made with the first value
it = iterexample()
next(it)  # generator is consumed once here
for i in it:  # 1 the generator is exhausted
    print(i, ' inside loop')  # 2 this is skipped because the generator is exhausted
# 3 rest of the code outside the loop will be executed
It returns what I expected: only 0 is printed, not "0 inside loop"
0
My questions are:
1. Is this the safest and most Pythonic way to do it? Will it raise any error?
2. Will it produce an infinite loop? I am very afraid of ending up in an infinite loop of API requests.
3. Please review my #1 ~ #3 comments in the code above: is my understanding correct?
Thanks for the response and the help. Cheers!
1. Is this the safest and most Pythonic way to do it? Will it raise any error?
Once a generator is exhausted, it raises StopIteration every time it is asked for a new value. A for loop handles this case by terminating when that exception is raised, which makes it safe to pass an exhausted generator to a for loop.
However, your code also calls next directly, and that is only safe if it handles StopIteration as well. You would need to document that the generator provided must produce one or more values, or make the code tolerant of the empty case; if the generator yielded no values, you would get an error, e.g.
def iterexample():
    while False:
        yield 0

print(next(iterexample()))
Traceback (most recent call last):
File "test.py", line 5, in <module>
print(next(iterexample()))
StopIteration
To guard against empty generators, you can pass the optional second argument to next, a default value:
print(next(iterexample(), "default"))
default
2. Will it produce an infinite loop? I am very afraid of ending up in an infinite loop of API requests.
Again, this depends on the generator. Generators do not need to have an end value; you can easily define a never-ending generator like this:
def iterexample():
    i = 0
    while True:
        yield i
        i += 1

for i in iterexample():  # This never ends.
    print(i)
If this is a concern for you, one way to guard against never-ending output is islice, which cuts off your generator after a given number of values have been consumed:
from itertools import islice

for i in islice(iterexample(), 5):
    print(i)
0
1
2
3
4
If I understand your issue correctly: you need the first value for one case and the rest for another.
I would recommend building a structure that fits your needs, something like this:
class MyStructure:
    def __init__(self, initial_data):
        if not initial_data:
            # Make sure your data structure is valid before using it
            raise ValueError("initial_data is empty")
        self.initial_data = initial_data

    @property
    def cached_value(self):
        return self.initial_data[0]

    @property
    def all_but_first(self):
        return self.initial_data[1:]
This way you make sure your data is valid, and you can give your accessors names that reflect what those values represent. In this example I gave them dummy names, but you should try to pick names that are relevant to your business.
Such a class could be used this way (the names are changed just to illustrate how method naming can document your code):
tasks = TaskQueue(get_input_in_some_way())
advance_task_status(tasks.current_task)
for pending_task in tasks.pending_tasks:
    log_remaining_time(pending_task)
You should first try to understand what your data structure represents, then build a useful API that hides the implementation and better reflects your business.

Python pass list as argument sends all of list data or just reference

I have the following code:
def test(a):
    a.append(3)

test_list = [1, 2]
test(test_list)
In the above code, when I pass test_list as an argument to the test function, does Python send the whole list object/data (which could be arbitrarily large) or just a reference to the list (which is much smaller, since it's essentially a pointer)?
From what I understood by reading about Python's pass-by-object-reference, it only sends the reference, but I don't know how to verify that this is indeed the case.
It's passing an alias to the function; for the life of the function, or until the function intentionally rebinds a to point to something else (with a = something), the function's a is an alias to the same list bound to the caller's test_list.
The straightforward ways to confirm this are (a sketch combining both follows the list):
- Print test_list after the call. If the list were actually copied, the append in the function would only affect the copy and the caller wouldn't see it; in fact, test_list will have had a new element appended, so it was the same list inside the function.
- Print id(test_list) outside the function and id(a) inside it. ids are required to be unique at any given point in time, so the only way two different lists could have the same id is if one was destroyed before the other was created; since test_list exists before and after the call, if a has the same id, it is by definition the same list.
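Here is that sketch, reusing the names from the question:
def test(a):
    print(id(a))  # same id as the caller's list
    a.append(3)

test_list = [1, 2]
print(id(test_list))  # matches the id printed inside test()
test(test_list)
print(test_list)  # [1, 2, 3] -- the caller sees the mutation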
Function arguments are passed by object reference in Python (sometimes called "call by sharing"), so the a variable in the function test refers to the very same list object as test_list in the calling code. After the function returns, test_list will contain [1, 2, 3].

What happens when you invoke a function that contains yield?

I read here the following example:
>>> def double_inputs():
... while True: # Line 1
... x = yield # Line 2
... yield x * 2 # Line 3
...
>>> gen = double_inputs()
>>> next(gen) # Run up to the first yield
>>> gen.send(10) # goes into 'x' variable
If I understand the above correctly, it seems to imply that Python actually waits until next(gen) to "run up to" to Line 2 in the body of the function. Put another way, the interpreter would not start executing the body of the function until we call next.
Is that actually correct?
To my knowledge, Python does not do AOT compilation, and it doesn't "look ahead" much except for parsing the code and making sure it's valid Python. Is this correct?
If the above are true, how would Python know when I invoke double_inputs() that it needs to wait until I call next(gen) before it even enters the loop while True?
Correct. Calling double_inputs never executes any of its body; it simply returns a generator object. The presence of the yield expression in the body, discovered when the def statement is compiled, changes its semantics: calling the resulting function creates a generator object rather than running the code.
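You can see this for yourself by putting a print before the first yield; nothing is printed until the first next call:
>>> def gen_fn():
...     print("body started")  # runs on the first next(), not at call time
...     yield 1
...
>>> g = gen_fn()  # no output: the body has not run yet
>>> next(g)
body started
1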
A function that contains yield is a generator function.
When you call gen = double_inputs(), you get a generator instance as the result. You need to consume this generator by calling next on it.
So for your first question, it is true: the body only starts running on the first next call, which executes line 1 and pauses at the yield on line 2; the later gen.send(10) resumes it and runs line 3.
For your second question, I don't exactly get your point. When you define the function, Python knows what you are defining; it doesn't need to look ahead when running it.
For your third question, the key is the yield keyword.
A generator function is de jure a function, but de facto it is an iterator, i.e. a class (with __next__(), __iter__(), and some other methods implemented). In other words, it is a class disguised as a function.
This means that "calling" this function is in reality creating an instance of that class, which explains why the "called function" initially does nothing. That is the answer to your 3rd question.
The answer to your 1st question is, surprisingly, no.
Instances always wait for their methods to be called, and the __next__() method (launched indirectly by the next() built-in function) is not the only method of generators. Another is .send(), and you may use gen.send(None) instead of your next(gen).
The answer to your 2nd question is no. The Python interpreter by no means "looks ahead", and there are no exceptions to this, including your
... except for parsing the code and making sure it's valid Python.
Or the answer to this question is yes, if you mean “parsing only up to the next command”. ;-)
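To make the "class disguised as a function" point concrete, here is a simple generator next to a rough hand-written iterator class that behaves the same way (CountUpTo is an illustrative name, not anything CPython actually generates):
def count_up_to(n):  # generator version
    i = 0
    while i < n:
        yield i
        i += 1

class CountUpTo:  # roughly equivalent hand-written iterator
    def __init__(self, n):
        self.i = 0
        self.n = n
    def __iter__(self):
        return self
    def __next__(self):
        if self.i >= self.n:
            raise StopIteration
        value = self.i
        self.i += 1
        return value

print(list(count_up_to(3)))  # [0, 1, 2]
print(list(CountUpTo(3)))    # [0, 1, 2]
In both cases, the call count_up_to(3) / CountUpTo(3) only builds the object; no values are produced until something iterates over it.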

Does accessing a method's returned value re-run the method?

As I am learning Python, and programming in general, I've come across a concept I'm not very sure of. I am working on a script, and I am curious: if I have already created an object and want to access the returned value from one of that object's methods, will it re-run the method, or will it simply reply with the values it already returned? For example:
class ClassOne():
    def oneMethod(self):
        x = 2
        y = 10
        return x, y

class ClassTwo():
    def twoMethod(self, x, y):
        ...

newObject = ClassOne()
newObject.oneMethod()
secondObject = ClassTwo()
# My question is, will the lines below re-execute newObject.oneMethod()
# or will they simply pull the already returned value?
secondObject.twoMethod(newObject.oneMethod()[0],
                       newObject.oneMethod()[1])
While my script isn't necessarily large enough to be super worried about performance, it's just something I'm wondering and couldn't find much info about online.
Your title asks a different question from the text body.
No, accessing the result of a method won't rerun the method. But that's not what you're doing here; you explicitly call the method twice, so of course it will run twice.
The normal thing to do is to assign the returned value to a variable and use that as many times as you want.
The answer is yes, the method will re-execute each time it is called.
You could verify that by adding a call to print() in oneMethod() to see whether its code is executed every time it is called. You will find that it is.
You can avoid re-execution by binding the return value to some variable(s), e.g.
a, b = newObject.oneMethod() # will call and execute the code in oneMethod()
secondObject = ClassTwo()
secondObject.twoMethod(a, b)
Here the tuple (2, 10) will be returned from oneMethod() and it will be unpacked into the variables a and b such that a = 2 and b = 10. You could also use a single tuple variable, and then access individual elements via indexing:
t = newObject.oneMethod()
secondObject.twoMethod(t[0], t[1])
# secondObject.twoMethod(*t) # also works, see below...
Another way, without having to save the return value first, is to pass the return value of oneMethod() directly into twoMethod() using tuple unpacking with the * operator:
secondObject = ClassTwo()
secondObject.twoMethod(*newObject.oneMethod())
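If what you actually want is attribute-style access that computes the value once and then reuses it, the standard library offers functools.cached_property (Python 3.8+); a minimal sketch, with illustrative names:
from functools import cached_property

class ClassOne:
    @cached_property
    def pair(self):
        print('computing')  # proves the body runs only once
        return 2, 10

obj = ClassOne()
print(obj.pair)  # prints 'computing', then (2, 10)
print(obj.pair)  # prints only (2, 10); the cached value is reused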

Understanding Python 2.7 email.feedparser FeedParser __init__ function

(So I'm trying to learn python. I figured it would be good to read code by people better than me. I decided to read through the email module...)
The __init__ function for the FeedParser class in the email.feedparser module is defined as:
def __init__(self, _factory=message.Message):
    """_factory is called with no arguments to create a new message obj"""
    self._factory = _factory
    self._input = BufferedSubFile()
    self._msgstack = []
    self._parse = self._parsegen().next
    self._cur = None
    self._last = None
    self._headersonly = False
The line I'm having trouble with is:
self._parse = self._parsegen().next
which I think should mean "set the attribute self._parse to the value of the next attribute of the return value of the method self._parsegen()".
As far as I can tell, self._parsegen(), when called during __init__(), will first call self._new_message(), which sets/adds values on self._cur, self._last, and self._msgstack. It will then assign an empty list to the local variable headers and start iterating over the self._input object. I think the first value for line will be a NeedMoreData object. Since the NeedMoreData class just extends object, it should have no attribute or method named next. So does next just refer back to the iterator (self._input)?
Is there any way to have a look at this in the interpreter so that I can step through each line of the script?
So does next just refer back to the iterator (self._input)?
next does refer to the generator. Since the _parsegen() method uses yield, it returns a generator object. Consider the following simple example (from IPython):
In [1]: def a():
   ...:     yield 1
   ...:     yield 2
   ...:
In [2]: a()
Out[2]: <generator object a at 0x1a56550>
In [3]: a().next
Out[3]: <method-wrapper 'next' of generator object at 0x1a567d0>
In [4]: a().next()
Out[4]: 1
So, yes, you are mostly right: next here is the bound method of the generator object that returns its next value, not something on self._input.
Is there any way to have a look at this in the interpreter so that I can step through each line of the script?
You can use pdb for that.
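For example, you can break just before constructing the parser and then step into __init__ with the s command:
import pdb
from email.feedparser import FeedParser

pdb.set_trace()  # execution pauses here; use s to step, n for next line, p <expr> to inspect
f = FeedParser()  # step into __init__ from the pdb prompt with s
Alternatively, run the whole script under the debugger with python -m pdb script.py.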
The next method is the way to get the next value out of a Python iterator or generator. The easiest way to think about this is to rewrite a for loop.
You have a really easy syntax for looping over a list:
for element in list:
    print element
which will produce an element on each iteration. But under the hood, Python is actually doing something akin to this:
iterator = iter(list)
while True:
    element = iterator.next()
    # do something with element (e.g. print it)
    print element
When the iterator is exhausted (has no more items), it raises the StopIteration exception, which is how for loops and other methods employing iterators know when to stop. (so the previous code snippet should really be wrapped in a try/except block, but I figured it would be clearer to read without it).
You can read about the iterator protocol in the Python docs. (Basically, anything can be iterable if it defines __iter__ returning an iterator, where an iterator is any object that defines both __iter__ and next.)
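A minimal object satisfying that protocol, written to work under both Python 2 and 3 since the question concerns 2.7:
class Countdown(object):
    def __init__(self, n):
        self.n = n
    def __iter__(self):
        return self  # an iterator is its own iterable
    def next(self):  # Python 2 protocol method
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1
    __next__ = next  # same method under the Python 3 protocol

for x in Countdown(3):
    print(x)  # 3, then 2, then 1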
