What is the Kotlin equivalent of Python generators? - python

If I have a python generator function, let's say this one:
def gen():
    x = 0
    while True:
        yield x
        x += 1
This function remembers its current state, and every time you call gen(), yields a new value. Essentially, I would like a Kotlin sequence which can remember its state.

def gen():
    x = 0
    while True:
        yield x
        x += 1
This function remembers its current state, and every time you call gen(), yields a new value.
This is incorrect. Every time you call gen() you get a new "generator object" whose state is independent of any other generator object created by this function. You then query the generator object to get the next number. For example:
def demo():
    numbers = gen()  # 'gen()' from your question
    for _ in range(0, 3):
        next_number = next(numbers)
        print(next_number)

if __name__ == '__main__':
    demo()
    print()
    demo()
Output:
0
1
2
0
1
2
As you can see, the sequence of numbers "starts over" when you call gen() again (though if you kept a reference to the old generator object, it would continue with 3 on the next call to next(), even after calling gen() again).
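A minimal Python sketch of this independence, assuming the gen() from the question with the loop written as while True; the names first and second are only illustrative:

```python
def gen():
    x = 0
    while True:
        yield x
        x += 1

first = gen()
print(next(first), next(first), next(first))  # 0 1 2

second = gen()       # a brand-new generator object with its own state
print(next(second))  # 0
print(next(first))   # 3: the old generator continues where it stopped
```

Each call to gen() builds a separate generator object, so state lives in the object, not in the function.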
In Kotlin, you can use the kotlin.sequences.iterator function. It creates an Iterator which lazily yields the next value, just like a Python generator object. For example:
fun gen() = iterator {
    var x = 0
    while (true) {
        yield(x)
        x++
    }
}

fun demo() {
    val numbers = gen()
    repeat(3) {
        val nextNumber = numbers.next()
        println(nextNumber)
    }
}

fun main() {
    demo()
    println()
    demo()
}
Which will output:
0
1
2
0
1
2
Just like the Python code.
Note that you can do essentially the same thing with a Kotlin Sequence; you just have to convert the Sequence into an Iterator if you want to use it like a Python generator object. Keep in mind, though, that Kotlin sequences are meant more for defining a series of operations and then lazily processing a group of elements in one go (sort of like Java streams, if you're familiar with them).

As stated before in the comments, sequences (https://kotlinlang.org/docs/sequences.html) are the answer, and you don't even need an iterator. You can generate a sequence using generateSequence (https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.sequences/generate-sequence.html),
and here is a little playground which produces a sequence similar to your generator: https://pl.kotl.in/LdboRzAzr

Related

Function that returns an accumulator in Python

I am reading Hackers and Painters and am confused by a problem mentioned by the author to illustrate the power of different programming languages.
The problem is:
We want to write a function that generates accumulators—a function that takes a number n, and returns a function that takes another number i and returns n incremented by i. (That’s incremented by, not plus. An accumulator has to accumulate.)
The author mentions several solutions with different programming languages. For example, Common Lisp:
(defun foo (n)
(lambda (i) (incf n i)))
and JavaScript:
function foo(n) { return function (i) { return n += i } }
However, when it comes to Python, the following code does not work:
def foo(n):
    s = n
    def bar(i):
        s += i
        return s
    return bar
f = foo(0)
f(1) # UnboundLocalError: local variable 's' referenced before assignment
A simple modification will make it work:
def foo(n):
    s = [n]
    def bar(i):
        s[0] += i
        return s[0]
    return bar
I am new to Python. Why does the first solution not work while the second one does? The author mentions lexical variables but I still don't get it.
s += i is just sugar for s = s + i.*
This means you assign a new value to the variable s (instead of mutating it in place). When a function assigns to a variable anywhere in its body, Python treats that variable as local to the function. However, before assigning, Python needs to evaluate s + i, and the local s has not been assigned yet, hence the UnboundLocalError.
In the second case s[0] += i you never assign to s directly, but only ever access an item from s. So Python can clearly see that it is not a local variable and goes looking for it in the outer scope.
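The difference can be inspected directly on the inner functions. This is a rough sketch for CPython; make_broken and make_list_based are hypothetical names for the two versions from the question:

```python
def make_broken(n):
    s = n
    def bar(i):
        s += i        # assignment makes `s` a local variable of bar
        return s
    return bar

def make_list_based(n):
    s = [n]
    def bar(i):
        s[0] += i     # item assignment only; the name `s` is never rebound
        return s[0]
    return bar

broken = make_broken(0)
working = make_list_based(0)

print(broken.__code__.co_varnames)    # ('i', 's'): s is compiled as a local
print(working.__code__.co_freevars)   # ('s',): s is looked up in the enclosing scope

try:
    broken(1)
except UnboundLocalError as exc:
    print(type(exc).__name__)         # UnboundLocalError

print(working(1), working(2))         # 1 3
```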
Finally, a nicer alternative (in Python 3) is to explicitly tell it that s is not a local variable:
def foo(n):
    s = n
    def bar(i):
        nonlocal s
        s += i
        return s
    return bar
(There is actually no need for s - you could simply use n instead inside bar.)
*The situation is slightly more complex, but the important issue is that computation and assignment are performed in two separate steps.
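One way to see the "slightly more complex" part, as a small sketch: for mutable objects such as lists, += first mutates the object in place and then rebinds the name to that same object, while for immutable objects such as ints it binds the name to a new object. The variable names below are illustrative only.

```python
nums = [1]
alias = nums
nums += [2]                   # the list is extended in place, then nums is rebound to it
print(alias, nums is alias)   # [1, 2] True

count = 1
old = count
count += 1                    # ints are immutable, so count is bound to a new object
print(old, count)             # 1 2
```

In both cases an assignment to the name happens, which is why the compiler treats the name as local.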
An infinite generator is one implementation. You can call next() on a generator instance to extract successive results iteratively.
def incrementer(n, i):
    while True:
        n += i
        yield n
g = incrementer(2, 5)
print(next(g)) # 7
print(next(g)) # 12
print(next(g)) # 17
If you need a flexible incrementer, one possibility is an object-oriented approach:
class Inc(object):
    def __init__(self, n=0):
        self.n = n
    def incrementer(self, i):
        self.n += i
        return self.n
g = Inc(2)
g.incrementer(5) # 7
g.incrementer(3) # 10
g.incrementer(7) # 17
In Python, arguments and variables are references to objects. If you rebind a name inside a function (as s += i does for a number), you create a new local binding, and the change is not reflected in the original variable.
But when you use a list instead, changes that you make to the list's contents inside the function (as s[0] += i does) mutate the same list object that the outer scope refers to, so they are visible outside the function.
This is the reason the second option works and the first one doesn't.
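The distinction between rebinding a name and mutating an object can be sketched as follows; rebind and mutate are hypothetical helper names, not part of the question:

```python
def rebind(value):
    value = value + 1   # rebinds the local name only

def mutate(items):
    items[0] += 1       # changes the list object the caller also references

n = 1
rebind(n)
print(n)     # 1

box = [1]
mutate(box)
print(box)   # [2]
```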

Looping structure best approach

Consider these two variants of the same loop structure:
x = find_number_of_iterations()
for n in range(x):
    pass  # do something in loop
and:
for n in range(find_number_of_iterations()):
    pass  # do something
Will the second loop evaluate the method find_number_of_iterations in every subsequent loop run, or will the method find_number_of_iterations be evaluated only once even in the second variant?
I suspect that your mentor's confusion is traceable to the fact that the semantics of Python's for loop are so different from those in other languages.
In a language like C a for loop is more or less syntactic sugar for a while loop:
for(i = 0; i < n; i++)
{
    //do stuff
}
is equivalent to:
i = 0;
while(i < n)
{
    //do stuff
    i++;
}
In Python it is different. Its for loops are iterator-based. The iterator object is initialized just once and then consumed in subsequent iterations. The following snippets show that Python's for loop is not (easily) translatable into a while loop, and also show that with a while loop your mentor's concern is valid:
>>> def find_number_of_iterations():
...     print("called")
...     return 3
...
>>> for i in range(find_number_of_iterations()): print(i)
...
called
0
1
2
>>> i = 0
>>> while i < find_number_of_iterations():
...     print(i)
...     i += 1
...
called
0
called
1
called
2
called
The function is called once. Logically, were it to be called on every iteration then the loop range could change causing all kinds of havoc. This is easily tested:
def find_iterations():
    print("find_iterations called")
    return 5
for n in range(find_iterations()):
    print(n)
Results in:
$ python test.py
find_iterations called
0
1
2
3
4
Either way, the function only gets called once. You can demonstrate this as follows:
>>> def test_func():
...     """Function to count calls and return integers."""
...     test_func.called += 1
...     return 3
...
# first version
>>> test_func.called = 0
>>> x = test_func()
>>> for _ in range(x):
...     print('loop')
...
loop
loop
loop
>>> test_func.called
1
# second version
>>> test_func.called = 0
>>>
>>> for _ in range(test_func()):
...     print('loop')
...
loop
loop
loop
>>> test_func.called
1
The function is called once, and the result of calling that function is passed to range (then the result of calling range is iterated over); the two versions are logically equivalent.
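For a side-by-side check, here is a short sketch that counts calls for both a for loop and a while loop. The calls list is an illustrative helper, and find_number_of_iterations is a stand-in for the function in the question:

```python
calls = []

def find_number_of_iterations():
    calls.append(1)   # record every call
    return 3

for n in range(find_number_of_iterations()):
    pass
for_calls = len(calls)

calls.clear()
i = 0
while i < find_number_of_iterations():
    i += 1
while_calls = len(calls)

print(for_calls, while_calls)   # 1 4
```

The while loop re-evaluates its condition before every iteration, including the final failing check, which is where the fourth call comes from.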

Genshi and Python Generators (yield)

How do I create/call a python generator in Genshi? Is that even possible?
For example (and no, I'm not looking for an alternative solution to this problem, of which there are many, including enumerate in the for-each, etc.):
<?python
""" a bunch of other code ... """
def bg_color_gen():
    """ Alternate background color every call """
    while 1:
        yield "#FFFFFF"
        yield "#EBEBEB"
?>
And then calling this function:
<fo:block background-color="${bg_color_gen()}">First entry</fo:block>
<fo:block background-color="${bg_color_gen()}">Second entry</fo:block>
<fo:block background-color="${bg_color_gen()}">Third entry</fo:block>
This has nothing to do with my <fo:block>, which you could replace with <div>. It is not an FO question but a Genshi question.
I'm guessing Genshi doesn't recognize the 'yield' and runs 'while 1' ad-infinitum?
Also, I do realize I could use a global to keep track of a counter, and then call
counter += 1
if counter % yieldCount == 0: return "#FFFFFF"
elif counter % yieldCount == 1: return "#EBEBEB"
But this is not a generator and gets ugly very quickly!
Clarification:
Another way to ask this question: how would you code
def fib():
    a,b = 0,1
    while True:
        yield a
        b = a+b
        yield b
        a = a+b
Which would then be called in the sentence "The first number is $fib(), the second is $fib(), the third is $fib(), and so on."
================================================
Updated full solution based on accepted answer:
<?python
def fib_generator():
    a,b = 0,1
    while True:
        yield a
        b = a+b
        yield b
        a = a+b
fib = fib_generator()
?>
The first number is ${next(fib)},
the second is ${next(fib)},
the third is ${next(fib)}, and so on.
Without knowing the structure of your content, I would suggest the following:
<fo:block py:for="i, entry in enumerate(entries)"
          background-color="${'#FFFFFF' if i % 2 else '#EBEBEB'}">
  ${entry}
</fo:block>
However, if you truly want to use a generator, then you can evaluate it using Python's built-in next():
<py:with vars="color=bg_color_gen();">
  <fo:block background-color="${next(color)}">First entry</fo:block>
</py:with>
You would want to create the generator first and then call next on it to get a yielded color.
In your example you are creating three different generator instances by calling bg_color_gen() three times, i.e.:
# this creates a generator
>>> bg_color_gen()
<generator object bg_color_gen at 0x02B21A30>
>>> bgcg = bg_color_gen()
# this gets values
>>> next(bgcg)
'#FFFFFF'
>>> next(bgcg)
'#EBEBEB'
>>> next(bgcg)
'#FFFFFF'
>>>
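Outside of Genshi, the same point can be sketched in plain Python. This assumes the bg_color_gen from the question; itertools.cycle is shown only as a standard-library alternative and is not something the original answers use:

```python
import itertools

def bg_color_gen():
    while True:
        yield "#FFFFFF"
        yield "#EBEBEB"

# Calling the function each time creates a fresh generator, so the first value repeats.
print([next(bg_color_gen()) for _ in range(3)])   # ['#FFFFFF', '#FFFFFF', '#FFFFFF']

# Creating one generator and reusing it alternates as intended.
colors = bg_color_gen()
print([next(colors) for _ in range(3)])           # ['#FFFFFF', '#EBEBEB', '#FFFFFF']

# itertools.cycle gives the same alternation without a hand-written generator.
cycled = itertools.cycle(["#FFFFFF", "#EBEBEB"])
print([next(cycled) for _ in range(3)])           # ['#FFFFFF', '#EBEBEB', '#FFFFFF']
```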

Lazy evaluation in Python

What is lazy evaluation in Python?
One website said :
In Python 3.x the range() function returns a special range object which computes elements of the list on demand (lazy or deferred evaluation):
>>> r = range(10)
>>> print(r)
range(0, 10)
>>> print(r[3])
3
What is meant by this?
The object returned by range() (or xrange() in Python 2.x) is known as a lazy iterable.
Instead of storing the entire range, [0, 1, 2, ..., 9], in memory, it stores only a definition (start 0, stop 10, step 1) and computes each value only when needed (a.k.a. lazy evaluation).
Generators are lazy in a similar way. A generator lets you return a list-like structure, but there are some differences:
A list stores all elements when it is created. A generator generates the next element when it is needed.
A list can be iterated over as much as you need, a generator can only be iterated over exactly once.
A list can get elements by index, a generator cannot -- it only generates values once, from start to end.
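These three differences can be checked with a short sketch; squares_list and squares_gen are illustrative names:

```python
squares_list = [n * n for n in range(4)]
squares_gen = (n * n for n in range(4))

print(squares_list[2])      # 4: lists support indexing
print(list(squares_list))   # [0, 1, 4, 9]: a list can be iterated again and again
print(list(squares_gen))    # [0, 1, 4, 9]: the first pass consumes the generator
print(list(squares_gen))    # []: a second pass yields nothing

try:
    squares_gen[2]
except TypeError:
    print("generators do not support indexing")
```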
A generator can be created in two ways:
(1) Very similar to a list comprehension:
# this is a list, create all 5000000 x/2 values immediately, uses []
lis = [x/2 for x in range(5000000)]
# this is a generator, creates each x/2 value only when it is needed, uses ()
gen = (x/2 for x in range(5000000))
(2) As a function, using yield to return the next value:
# this is also a generator; it runs until a yield occurs and returns that result.
# on the next call to next() it picks up where it left off and continues until a yield occurs...
def divby2(n):
    num = 0
    while num < n:
        yield num/2
        num += 1
# produces the same values as (x/2 for x in range(5000000))
print(divby2(5000000))  # prints the generator object itself, not the values
Note: Even though range(5000000) is lazy in Python 3.x, [x/2 for x in range(5000000)] is still a list. range(...) does its job and produces x one at a time, but the entire list of x/2 values is computed when this list is created.
In a nutshell, lazy evaluation means that the object is evaluated when it is needed, not when it is created.
In Python 2, range returns a list, which means that if you give it a large number, it computes the whole range and returns it at the time of creation:
>>> i = range(100)
>>> type(i)
<type 'list'>
In Python 3, however you get a special range object:
>>> i = range(100)
>>> type(i)
<class 'range'>
Only when you consume it, will it actually be evaluated - in other words, it will only return the numbers in the range when you actually need them.
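A quick sketch of what this means in practice for Python 3. The size check is specific to CPython and is an assumption about the implementation, not a language guarantee:

```python
import sys

big = range(10**9)

print(len(big))                        # 1000000000
print(big[5])                          # 5
print(10**9 - 1 in big)                # True, computed without iterating
print(sys.getsizeof(big) < 1000)       # True on CPython: only start, stop and step are stored
print(list(big[:3]), list(big[:3]))    # [0, 1, 2] [0, 1, 2]: a range can be consumed repeatedly
```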
A GitHub repo named python-patterns and Wikipedia describe lazy evaluation as follows:
It delays the evaluation of an expression until its value is needed and avoids repeated evaluations.
range in Python 3 is not complete lazy evaluation in this sense, because it doesn't avoid repeated evaluation.
A more classic example for lazy evaluation is cached_property:
import functools
class cached_property(object):
    def __init__(self, function):
        self.function = function
        functools.update_wrapper(self, function)
    def __get__(self, obj, type_):
        if obj is None:
            return self
        val = self.function(obj)
        obj.__dict__[self.function.__name__] = val
        return val
The cached_property (a.k.a. lazy_property) is a decorator which converts a function into a lazily evaluated property. The first time the property is accessed, the function is called to get the result, and that stored value is then used every subsequent time you access the property.
For example:
class LogHandler:
    def __init__(self, file_path):
        self.file_path = file_path
    @cached_property
    def load_log_file(self):
        with open(self.file_path) as f:
            # the file is so big that reading all of it takes 2s
            return f.read()
log_handler = LogHandler('./sys.log')
# only the first access costs 2s.
print(log_handler.load_log_file)
# the return value is cached on the log_handler object.
print(log_handler.load_log_file)
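Since Python 3.8 the standard library ships functools.cached_property, which behaves much like the hand-written descriptor above. A minimal sketch, where Report, data and loads are hypothetical names used only to count how often the function body runs:

```python
import functools

class Report:
    def __init__(self):
        self.loads = 0   # counts how often the expensive step runs

    @functools.cached_property
    def data(self):
        self.loads += 1
        return "expensive result"

report = Report()
print(report.data, report.data)   # expensive result expensive result
print(report.loads)               # 1
```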
To use a more fitting term, lazy Python objects such as range are designed more along the lines of the call-by-need pattern than as full lazy evaluation.

Please explain Python for loops to a C++/C#/Java programmer

I have experience in Java/C#/C++, where for loops are done pretty much, if not exactly, the same way. Now I'm learning Python through Codecademy, and I find the way it tries to explain for loops poor. The code they give you is
my_list = [1,9,3,8,5,7]
for number in my_list:
    # Your code here
    print(2 * number)
Is this saying: for every number in my_list, print 2 * number?
That somewhat makes sense to me if that's true, but I don't get number and how that works. It's not even a variable declared earlier. Are you declaring a variable within the for loop? And how does Python know that number is accessing the values within my_list and multiplying them by 2? Also, how do for loops work with things other than lists? I've looked at other Python code that contains for loops and it makes no sense to me. Could you please find some way to explain how these are similar to something like C# for loops, or just explain Python for loops in general?
Yes, number is a newly defined variable. Python does not require variables to be declared before using them. And the understanding of the loop iteration is correct.
This is the same syntax that Bourne-style shells (such as bash) use.
The logic of the for loop is this: assign the named variable the next value in the list, execute the loop body, repeat.
As for other non-list values, they should be sequences (or other iterables) in Python. Try this:
val = "1 2 3"
for number in val:
    print(number)
Note this prints "1", " ", "2", " ", "3".
Here's a useful reference: http://www.tutorialspoint.com/python/python_for_loop.htm.
The quick answer, to relate to C#, is that a Python for loop is roughly equivalent to a C# foreach loop. C++ sort of has similar facilities (BOOST_FOREACH for example, or the for syntax in C++11), but C does not have an equivalent.
There is no equivalent in Python of the C-style for (initial; condition; increment) style loop.
Python for loops can iterate over more than just lists; they can iterate over anything that is iterable. See for example What makes something iterable in python.
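A few examples of iterating over things other than lists, as a rough sketch with made-up values:

```python
for ch in "abc":                            # strings yield characters
    print(ch)

for key in {"x": 1, "y": 2}:                # dicts yield their keys
    print(key)

for index, item in enumerate(["a", "b"]):   # enumerate yields (index, item) pairs
    print(index, item)

for square in (n * n for n in range(3)):    # generator expressions are iterable too
    print(square)
```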
Python doesn't need variables to be declared; a variable is created the first time it is assigned.
while loops are similar to those in these languages (Python has no do-while), but the for loop is quite different in Python.
You can use it on a list, similar to a for-each loop,
but for other purposes, such as running from 1 to 10, you can use
for number in range(1, 11):
    print(number)
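As a hedged aside, range accepts start, stop and step arguments, which covers most uses of a C-style counting loop. Note that the stop value is exclusive:

```python
# Roughly: for (i = 1; i <= 10; i++)
print(list(range(1, 11)))       # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Roughly: for (i = 10; i > 0; i -= 3)
print(list(range(10, 0, -3)))   # [10, 7, 4, 1]

# range(10) starts at 0 and stops before 10
print(list(range(10)))          # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```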
Python for loops should be pretty similar to C# foreach loops. The loop steps through my_list, and at each step you can use "number" to refer to that element in the list.
If you want to access list indices as well as list elements while you are iterating, the usual idiom is to use the enumerate function:
for (i, x) in enumerate(my_list):
    print("the", i, "number in the list is", x)
The foreach loop should be similar to the following desugared code:
my_iterator = iter(my_list)
while True:
    try:
        number = next(my_iterator)
    except StopIteration:
        break
    # Your code here
    print(2 * number)
Pretty similar to this Java loop: Java for loop syntax: "for (T obj : objects)"
In Python there's no need to declare a variable's type; that's why number has no declared type.
In Python you don't need to declare variables. In this case the number variable is defined by using it in the loop.
As for the loop construct itself, it's similar to the C++11 range-based for loop:
std::vector<int> my_list = { 1, 9, 3, 8, 5, 7 };
for (auto& number : my_list)
    std::cout << 2 * number << '\n';
This can of course be implemented pre-C++11 using std::for_each with a suitable functor object (which may of course be a C++11 lambda expression).
Python does not have an equivalent to the normal C-style for loop.
I will try to explain the Python for loop in as basic a way as possible:
Let's say we have a list:
a = [1, 2, 3, 4, 5]
Before we jump into the for loop, note that we don't have to specify a type when declaring a variable in Python.
int a or str a is not required.
Let's go to the for loop now.
for i in a:
    print(2*i)
Now, what does it do?
The loop starts from the first element, so
i is bound to 1, which is multiplied by 2 and displayed. After it's done with 1, it moves on to 2.
Regarding your other question:
Python knows a variable's type at execution time:
>>> a = ['a', 'b', 'c']
>>> for i in a:
...     print(2*i)
...
aa
bb
cc
>>>
Python uses protocols (duck-typing with specially named methods, pre and post-fixed with double underscores). The equivalent in Java would be an interface or an abstract base class.
In this case, anything in Python which implements the iterator protocol can be used in a for loop:
class TheStandardProtocol(object):
    def __init__(self):
        self.i = 0
    def __iter__(self):
        return self
    def __next__(self):
        self.i += 1
        if self.i > 15: raise StopIteration()
        return self.i
    # In Python 2 `next` is the only protocol method without double underscores
    next = __next__
class TheListProtocol(object):
    """A less common option, but still valid"""
    def __getitem__(self, index):
        if index > 15: raise IndexError()
        return index
We can then use instances of either class in a for loop and everything will work correctly:
standard = TheStandardProtocol()
for i in standard:  # `__iter__` invoked to get the iterator
    # `__next__` invoked and its return value bound to `i`
    # until the underlying iterator returned by `__iter__`
    # raises a StopIteration exception
    print(i)
# prints 1 to 15
list_protocol = TheListProtocol()
for x in list_protocol:  # Python creates an iterator for us
    # `__getitem__` is invoked with ascending integers
    # and the return value bound to `x`
    # until the instance raises an IndexError
    print(x)
# prints 0 to 15
The equivalent in Java is the pair of Iterable and Iterator interfaces:
import java.util.Iterator;

class MyIterator implements Iterable<Integer>, Iterator<Integer> {
    private Integer i = 0;
    public Iterator<Integer> iterator() {
        return this;
    }
    public boolean hasNext() {
        return i < 16;
    }
    public Integer next() {
        return i++;
    }
}
// Elsewhere
MyIterator anIterator = new MyIterator();
for(Integer x: anIterator) {
    System.out.println(x.toString());
}
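Returning to Python: a common shortcut for the iterator protocol is to write __iter__ as a generator function, so there is no separate __next__ to maintain. A minimal sketch, where Countdown is a hypothetical example class and not part of the answer above:

```python
class Countdown:
    """Hypothetical example: iterable via a generator-based __iter__."""
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Because this method contains `yield`, each call returns a fresh generator,
        # which already implements both __iter__ and __next__.
        n = self.start
        while n > 0:
            yield n
            n -= 1

countdown = Countdown(3)
print(list(countdown))   # [3, 2, 1]
print(list(countdown))   # [3, 2, 1]: a new iterator is created for every loop
```

Unlike a class that returns self from __iter__, this object can be looped over more than once.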
