Get a subset of a generator - python

I have a generator function and want to get the first ten items from it; my first attempt was:
my_generator()[:10]
This doesn't work because generators aren't subscriptable, as the error tells me. Right now I have worked around that with:
list(my_generator())[:10]
This works since it converts the generator to a list; however, it's inefficient and defeats the point of having a generator. Is there some built-in, Pythonic equivalent of [:10] for generators?

import itertools
itertools.islice(mygenerator(), 10)
itertools has a number of utilities for working with iterators. islice takes start, stop, and step arguments to slice an iterator just as you would slice a list.

to clarify the above comments:
from itertools import islice
def fib_gen():
a, b = 1, 1
while True:
yield a
a, b = b, a + b
assert [1, 1, 2, 3, 5] == list(islice(fib_gen(), 5))

Related

Extra next() for generator in `zip()`?

Given,
import itertools as it
def foo():
idx = 0
while True:
yield idx
idx += 1
k = foo()
When I use zip() as in the following,
>>> list(zip(k,[11,12,13]))
[(0, 11), (1, 12), (2, 13)]
and then immediately after,
>>> list(zip(k,[11,12,13]))
[(4, 11), (5, 12), (6, 13)]
Notice that the second zip should have started with (3,11) but it jumped to (4,11)
instead. It's as if there is another hidden next(k) somewhere. This does not happen when I use it.islice
>>> k = foo()
>>> list(it.islice(k,6))
[0, 1, 2, 3, 4, 5]
Notice it.islice is not missing the 3 term.
I am using Python 3.8.
zip basically (and necessarily, given the design of the iterator protocol) works like this:
# zip is actually a class, but we'll pretend it's a generator
# function for simplicity.
def zip(xs, ys):
# zip doesn't require its arguments to be iterators, just iterable
xs = iter(xs)
ys = iter(ys)
while True:
x = next(xs)
y = next(ys)
yield x, y
There is no way to tell if ys is exhausted before an element of xs is consumed, and the iterator protocol doesn't provide a way for zip to put x "back" in xs if next(ys) raises a StopIteration exception.
For the special case where one of the input iterables is sized, you can do a little better than zip:
import itertools as it
from collections.abc import Sized
def smarter_zip(*iterables):
sized = [i for i in iterables if isinstance(i, Sized)]
try:
min_length = min(len(s) for s in sized)
except ValueError:
# can't determine a min length.. fall back to regular zip
return zip(*iterables)
return zip(*[it.islice(i, min_length) for i in iterables])
It uses islice to prevent zip from consuming more from each iterator than we know is strictly necessary. This smarter_zip will solve the problem for the case posed in the original question.
However, in the general case, there is no way to tell beforehand whether an iterator is exhausted or not (consider a generator yielding bytes arriving on a socket). If the shortest of the iterables is not sized, the original problem still remains. For solving the general case, you may want to wrap iterators in a class which remembers the last-yielded item, so that it can be recalled from memory if necessary.

Looking for a fold-like idiom

So my friend presented a problem for me to solve, and I'm currently writing a solution in functional-style Python. The problem itself isn't my question; I'm looking for a possible idiom that I can't find at the moment.
What I need is a fold, but instead of using the same function for every one of it's applications, it would do a map-like exhaustion of another list containing functions. For example, given this code:
nums = [1, 2, 3]
funcs = [add, sub]
special_foldl(nums, funcs)
the function (special_foldl) would fold the number list down with ((1 + 2) - 3). Is there a function/idiom that elegantly does this, or should I just roll my own?
There is no such function in the Python standard library. You'll have to roll you own, perhaps something like this:
import operator
import functools
nums = [1, 2, 3]
funcs = iter([operator.add, operator.sub])
def special_foldl(nums, funcs):
return functools.reduce(lambda x,y: next(funcs)(x,y), nums)
print(special_foldl(nums, funcs))
# 0

good practice for string.partition in python

Sometime I write code like this:
a,temp,b = s.partition('-')
I just need to pick the first and 3rd elements. temp would never be used. Is there a better way to do this?
In other terms, is there a better way to pick distinct elements to make a new list?
For example, I want to make a new list using the elements 0,1,3,7 from the old list. The
code would be like this:
newlist = [oldlist[0],oldlist[1],oldlist[3],oldlist[7]]
It's pretty ugly, isn't it?
Be careful using
a, _, b = s.partition('-')
sometimes _ is use for internationalization (gettext), so you wouldn't want to accidentally overwrite it.
Usually I would do this for partition rather than creating a variable I don't need
a, b = s.partition('-')[::2]
and this in the general case
from operator import itemgetter
ig0137 = itemgetter(0, 1, 3, 7)
newlist = ig0137(oldlist)
The itemgetter is more efficient than a list comprehension if you are using it in a loop
For the first there's also this alternative:
a, b = s.partition('-')[::2]
For the latter, since there's no clear interval there is no way to do it too clean. But this might suit your needs:
newlist = [oldlist[k] for k in (0, 1, 3, 7)]
You can use Python's extended slicing feature to access a list periodically:
>>> a = range(10)
>>> # Pick every other element in a starting from a[1]
>>> b = a[1::2]
>>> print b
>>> [1, 3, 5, 7, 9]
Negative indexing works as you'd expect:
>>> c = a[-1::-2]
>>> print c
>>> [9, 7, 5, 3, 1]
For your case,
>>> a, b = s.partition('-')[::2]
the common practice in Python to pick 1st and 3rd values is:
a, _, b = s.partition('-')
And to pick specified elements in a list you can do :
newlist = [oldlist[k] for k in (0, 1, 3, 7)]
If you don't need to retain the middle field you can use split (and similarly rsplit) with the optional maxsplit parameter to limit the splits to the first (or last) match of the separator:
a, b = s.split('-', 1)
This avoids a throwaway temporary or additional slicing.
The only caveat is that with split, unlike partition, the original string is returned if the separator is not found. The attempt to unpack will fail as a result. The partition method always returns a 3-tuple.

Python 3 turn range to a list

I'm trying to make a list with numbers 1-1000 in it. Obviously this would be annoying to write/read, so I'm attempting to make a list with a range in it. In Python 2 it seems that:
some_list = range(1,1000)
would have worked, but in Python 3 the range is similar to the xrange of Python 2?
Can anyone provide some insight into this?
You can just construct a list from the range object:
my_list = list(range(1, 1001))
This is how you do it with generators in python2.x as well. Typically speaking, you probably don't need a list though since you can come by the value of my_list[i] more efficiently (i + 1), and if you just need to iterate over it, you can just fall back on range.
Also note that on python2.x, xrange is still indexable1. This means that range on python3.x also has the same property2
1print xrange(30)[12] works for python2.x
2The analogous statement to 1 in python3.x is print(range(30)[12]) and that works also.
In Pythons <= 3.4 you can, as others suggested, use list(range(10)) in order to make a list out of a range (In general, any iterable).
Another alternative, introduced in Python 3.5 with its unpacking generalizations, is by using * in a list literal []:
>>> r = range(10)
>>> l = [*r]
>>> print(l)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Though this is equivalent to list(r), it's literal syntax and the fact that no function call is involved does let it execute faster. It's also less characters, if you need to code golf :-)
in Python 3.x, the range() function got its own type. so in this case you must use iterator
list(range(1000))
The reason why Python3 lacks a function for directly getting a ranged list is because the original Python3 designer was quite novice in Python2. He only considered the use of range() function in a for loop, thus, the list should never need to be expanded. In fact, very often we do need to use the range() function to produce a list and pass into a function.
Therefore, in this case, Python3 is less convenient as compared to Python2 because:
In Python2, we have xrange() and range();
In Python3, we have range() and list(range())
Nonetheless, you can still use list expansion in this way:
[*range(N)]
You really shouldn't need to use the numbers 1-1000 in a list. But if for some reason you really do need these numbers, then you could do:
[i for i in range(1, 1001)]
List Comprehension in a nutshell:
The above list comprehension translates to:
nums = []
for i in range(1, 1001):
nums.append(i)
This is just the list comprehension syntax, though from 2.x. I know that this will work in python 3, but am not sure if there is an upgraded syntax as well
Range starts inclusive of the first parameter; but ends Up To, Not Including the second Parameter (when supplied 2 parameters; if the first parameter is left off, it'll start at '0')
range(start, end+1)
[start, start+1, .., end]
Python 3:
my_list = [*range(1001)]
Actually, if you want 1-1000 (inclusive), use the range(...) function with parameters 1 and 1001: range(1, 1001), because the range(start, end) function goes from start to (end-1), inclusive.
Use Range in Python 3.
Here is a example function that return in between numbers from two numbers
def get_between_numbers(a, b):
"""
This function will return in between numbers from two numbers.
:param a:
:param b:
:return:
"""
x = []
if b < a:
x.extend(range(b, a))
x.append(a)
else:
x.extend(range(a, b))
x.append(b)
return x
Result
print(get_between_numbers(5, 9))
print(get_between_numbers(9, 5))
[5, 6, 7, 8, 9]
[5, 6, 7, 8, 9]
In fact, this is a retro-gradation of Python3 as compared to Python2. Certainly, Python2 which uses range() and xrange() is more convenient than Python3 which uses list(range()) and range() respectively. The reason is because the original designer of Python3 is not very experienced, they only considered the use of the range function by many beginners to iterate over a large number of elements where it is both memory and CPU inefficient; but they neglected the use of the range function to produce a number list. Now, it is too late for them to change back already.
If I was to be the designer of Python3, I will:
use irange to return a sequence iterator
use lrange to return a sequence list
use range to return either a sequence iterator (if the number of elements is large, e.g., range(9999999) or a sequence list (if the number of elements is small, e.g., range(10))
That should be optimal.

zipWith analogue in Python?

What is the analogue of Haskell's zipWith function in Python?
zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
map()
map(operator.add, [1, 2, 3], [3, 2, 1])
Although a LC with zip() is usually used.
[x + y for (x, y) in zip([1, 2, 3], [3, 2, 1])]
You can create yours, if you wish, but in Python we mostly do
list_c = [ f(a,b) for (a,b) in zip(list_a,list_b) ]
as Python is not inherently functional. It just happens to support a few convenience idioms.
You can use map:
>>> x = [1,2,3,4]
>>> y = [4,3,2,1]
>>> map(lambda a, b: a**b, x, y)
[1, 8, 9, 4]
A lazy zipWith with itertools:
import itertools
def zip_with(f, *coll):
return itertools.starmap(f, itertools.izip(*coll))
This version generalizes the behaviour of zipWith with any number of iterables.
Generally as others have mentioned map and zip can help you replicate the functionality of zipWith as in Haskel.
Generally you can either apply a defined binary operator or some binary function on two list.An example to replace an Haskel zipWith with Python's map/zip
Input: zipWith (+) [1,2,3] [3,2,1]
Output: [4,4,4]
>>> map(operator.add,[1,2,3],[4,3,2])
[5, 5, 5]
>>> [operator.add(x,y) for x,y in zip([1,2,3],[4,3,2])]
[5, 5, 5]
>>>
There are other variation of zipWith aka zipWith3, zipWith4 .... zipWith7. To replicate these functionalists you may want to use izip and imap instead of zip and map.
>>> [x for x in itertools.imap(lambda x,y,z:x**2+y**2-z**2,[1,2,3,4],[5,6,7,8],[9,10,11,12])]
>>> [x**2+y**2-z**2 for x,y,z in itertools.izip([1,2,3,4],[5,6,7,8],[9,10,11,12])]
[-55, -60, -63, -64]
As you can see, you can operate of any number of list you desire and you can still use the same procedure.
I know this is an old question, but ...
It's already been said that the typical python way would be something like
results = [f(a, b) for a, b in zip(list1, list2)]
and so seeing a line like that in your code, most pythonistas will understand just fine.
There's also already been a (I think) purely lazy example shown:
import itertools
def zipWith(f, *args):
return itertools.starmap(f, itertools.izip(*args))
but I believe that starmap returns an iterator, so you won't be able to index, or go through multiple times what that function will return.
If you're not particularly concerned with laziness and/or need to index or loop through your new list multiple times, this is probably as general purpose as you could get:
def zipWith(func, *lists):
return [func(*args) for args in zip(*lists)]
Not that you couldn't do it with the lazy version, but you could also call that function like so if you've already built up your list of lists.
results = zipWith(func, *lists)
or just like normal like:
results = zipWith(func, list1, list2)
Somehow, that function call just looks simpler and easier to grok than the list comprehension version.
Looking at that, this looks strangely reminiscent of another helper function I often write:
def transpose(matrix):
return zip(*matrix)
which could then be written like:
def transpose(matrix):
return zipWith(lambda *x: x, *matrix)
Not really a better version, but I always find it interesting how when writing generic functions in a functional style, I often find myself going, "Oh. That's just a more general form of a function I've already written before."

Categories