Pythonic way to skip items in an iterable?

Say I have an iterable and want to skip elements as long as the elements match a particular predicate. I want to affect the current iterator, not return a new one.
I could simply do this:
# untested, just for explanation
e = next(iterable)
while True:
    if something(e):
        e = next(iterable)
    else:
        break
But is there a built-in function for this, or some common idiom?

This is the basic use-case of itertools.dropwhile.
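As a rough sketch of how itertools.dropwhile behaves (the predicate is_negative and the sample data are made up for illustration, standing in for something() and the asker's iterable): it returns a new iterator wrapping the original one, and pulling from the wrapper also advances the original.

```python
from itertools import dropwhile


def is_negative(n):
    # Hypothetical predicate standing in for something() from the question.
    return n < 0


numbers = iter([-3, -1, 4, -2, 5])

# dropwhile discards leading items while the predicate holds, then yields
# everything else unchanged (including later items that match again).
remaining = dropwhile(is_negative, numbers)

first = next(remaining)
rest = list(remaining)
print(first, rest)  # 4 [-2, 5]
```

Note that the first non-matching element (4 here) is consumed from the underlying iterator, so it is only reachable through the wrapper.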

Why not just
e = next(iterable)
while something(e):
    e = next(iterable)
Also, why "I want to affect the current iterator, not return a new one."? I can't think of a scenario when you would need that. If you allow for a new iterator to wrap the current one, itertools.dropwhile that wim suggests is much more Pythonic (and readable).

Your example just plain stops after the first false result, so this does the same:
>>> import itertools
>>> it = iter(range(100000)) # just constructing _an_ iterator
>>> next(itertools.filterfalse(lambda i: i != 12, it))
12
I'm not clear on what exactly "I want to affect the current iterator, not return a new one" means, but do note that the iterator it has itself advanced:
>>> next(it)
13

Related

How to change this for loop to while loop?

Having a bit of trouble with while loops. I understand this basic for loop runs through whatever is passed into the function but how would I change the for loop to a while loop? Thought it would be as easy as changing the for to while but apparently not so.
def print_data(items):
    for item in items:
        print(item)

You can do something like this to have the same printing functionality with a while loop:
def print_data(items):
    i = 0
    while i < len(items):
        print(items[i])
        i += 1

Here is a while loop version that works by constructing an iterator and manually walking it. It works regardless of whether the input is a generator or a list.
def print_data(items):
    it = iter(items)
    while True:
        try:
            print(next(it))
        except StopIteration:
            break
print_data([1,2,3])

One option, that works on both lists and generators, is to construct an iterator and then use Python's built-in next function. When the iterator reaches the end, the next function will raise a StopIteration exception, which you can use to break the loop:
def print_data(items):
    it = iter(items)
    while True:
        try:
            print(next(it))
        except StopIteration:
            break
print_data(['a', 'b', 'c'])
print_data(('a', 'b', 'c'))
There's more about the next built-in function and iterators in the docs.

If you are learning Python:
If you need to iterate over an iterable (a list, a generator, a string, etc.; loosely speaking, something that contains things and can "give" them to you one by one), you should use for.
In Python, for was made for iterables, so you don't need a while.
If you are learning programming in general:
while needs a condition that must hold to keep looping. You create that condition yourself by keeping a counter that is incremented on every pass and looping while the counter is less than the length of your items, as shown in Mathias711's answer.

The for loop you are using iterates over a so-called iterator.
This means it walks through an iterable object (list, generator, dict, ...) by obtaining an iterator for it and repeatedly fetching the next item with the built-in function next(). If there is no item left, next() raises a so-called StopIteration error, which causes the iteration to stop.
So the Pythonic way to iterate through iterable objects is in fact the for loop you provided in your question. However, if you really want to use a while loop (which is generally not recommended), you have to iterate using a try-except block and handle the StopIteration error raised when no item is left.
def iterate_manually(items):
    # convert items list into iterator
    iterator = iter(items)
    while True:
        try:
            print(next(iterator))
        # handle StopIteration error and exit while-loop
        except StopIteration:
            break
iterate_manually(['foo', 'bar', 'baz'])

You can try this
def print_data(items):
    i = 0
    while items:
        if i < len(items):
            print(items[i])
            i = i+1
        else:
            break

python find the num in list greater than left and right

The task is to use an iterator to find the numbers in a list that are greater than both their left and right neighbours.
For example, select([0,1,0,2,0,3,0,4,0]) returns [1,2,3,4].
My first try is
def select(iterable):
    answer = []
    it = iter(iterable)
    try:
        v2 = next(it)
        while True:
            v1,v2= v2,next(it)
            if v2>v1 and v2>next(it):
                answer.append(v2)
    except StopIteration:
        pass
    return answer
This code fails.
I thought the next(it) inside the while loop would return the same item each time, but every next() call advances the iterator to the next one.
Then I changed the code to the one below, and it works.
try:
    v1,v2,v3 = next(it),next(it),next(it)
    while True:
        if v2>v1 and v2>v3:
            answer.append(v2)
        v1,v2,v3 = v2,v3,next(it)
except StopIteration:
    pass
Can someone explain what the difference is here?

There are two issues with the first snippet:
Every time you call next(it), it advances the iterator. You have to store the returned value if you want to access it more than once. Calling next(it) again won't do that; it will advance the iterator yet again.
Even if the first point weren't an issue, the following would still be problematic:
if v2>v1 and v2>next(it):
The issue here is short-circuit evaluation. Depending on whether v2>v1, this may or may not advance the iterator.
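To make the short-circuit point concrete, here is a small sketch (the list values are arbitrary) showing that next(it) on the right-hand side of and only runs when the left-hand side is true:

```python
it = iter([10, 20, 30])
skipped = False and next(it)   # left side is False: next(it) is never called
after_false = next(it)         # so the iterator is still at the first item

it = iter([10, 20, 30])
consumed = True and next(it)   # left side is True: next(it) runs and returns 10
after_true = next(it)          # the iterator has moved on to the second item

print(after_false, after_true)  # 10 20
```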

The error lies in this line:
if v2>v1 and v2>next(it):
You're calling next(it), but you don't store the return value. You just compare it to v2, so the value gets skipped.
Edit:
By the way, if you compare multiple values, the following chained comparison is much cleaner:
if v1 < v2 > v3:
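Putting both points together, one possible rewrite of select (a sketch, not the only way to do it) reads every value exactly once, keeps it in a variable, and uses the chained comparison:

```python
def select(iterable):
    """Return items strictly greater than both of their neighbours."""
    it = iter(iterable)
    answer = []
    try:
        # Read each value exactly once and keep it in a variable.
        v1, v2 = next(it), next(it)
    except StopIteration:
        # Fewer than two items: nothing can have two neighbours.
        return answer
    for v3 in it:
        if v1 < v2 > v3:
            answer.append(v2)
        # Slide the three-item window one step to the right.
        v1, v2 = v2, v3
    return answer


print(select([0, 1, 0, 2, 0, 3, 0, 4, 0]))  # [1, 2, 3, 4]
```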

Pythonic way to get the single element of a 1-sized list

What's the most pythonic way to take the single item of a 1-sized list in python?
I usually go for
item = singlet_list[0]
This would fail for an empty list, but I would like a way to make it fail even if the list is longer, something like:
assert(len(singlet_list) == 1)
item = singlet_list[0]
but I find this ugly. Is there anything better?

This blog post suggests an elegant solution I fell in love with:
(item,) = singlet_list
I find it much more readable, and plus it works with any iterable, even if it is not indexable.
EDIT: Let me dig a little more
This construct is called sequence unpacking or multiple assignment throughout the python documentation, but here I'm using a 1-sized tuple on the left of the assignment.
This actually behaves much like the two-liner in the initial question: if the list/iterable singlet_list is of length 1, it will assign its only element to item. Otherwise, it will fail with an appropriate ValueError (rather than an AssertionError as in the question's two-liner):
>>> (item,) = [1]
>>> item
1
>>> (item,) = [1, 2]
ValueError: too many values to unpack
>>> (item,) = []
ValueError: need more than 0 values to unpack
As pointed out in the blog post, this has some advantages:
This will not fail silently, as required by the original question.
It is much more readable than the [0] indexing, and it doesn't go unnoticed even in complex statements.
It works for any iterable object.
(Of course) it uses only one line, compared with explicitly using assert.
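A short sketch of the "works for any iterable" point; the helper names only and fails_to_unpack are invented here for illustration, and the unpacking line is the actual technique:

```python
def only(iterable):
    # Hypothetical helper: unpack into a one-element tuple target.
    (item,) = iterable
    return item


def fails_to_unpack(iterable):
    # Hypothetical helper: report whether the unpacking raises ValueError.
    try:
        only(iterable)
    except ValueError:
        return True
    return False


# A generator is not indexable, but unpacking still works.
single = only(x * x for x in [7])
print(single)                   # 49
print(fails_to_unpack([]))      # True
print(fails_to_unpack([1, 2]))  # True
```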

You could use an inline if...else and define a default value like this:
If singlet_list contains one or more values:
singlet_list = [2]
item = singlet_list[0] if singlet_list else False
print(item)
output:
2
If singlet_list is empty:
singlet_list = []
item = singlet_list[0] if singlet_list else False
print(item)
output:
False
This uses the fact that an empty list evaluates to False.
Similarly, if you would like a default value to be assigned if the list doesn't contain exactly one element, you could check the length:
item = singlet_list[0] if len(singlet_list) == 1 else False

Python for loop implementation

Can someone tell me how exactly Python's for loops are implemented? The reason I'm asking this is because I'm getting different behavior in the following two for loops when I expect the same behavior (assuming cases is just a set of elements):
First for loop:
for case in cases:
    blah
Second for loop:
for i in range(len(cases)):
    case = cases[i]
    blah
I'm running my code in a multi-threaded environment.
Basically, I'm wondering whether Python's for loop iterating over a set (as in the first for loop) is simply shorthand for the second one. What exactly happens when we use the Python for loop, and is there any underlying optimization or implementation detail that may be causing the behavior difference I'm observing?

No, the second format is quite different.
The for loop calls iter() on the to-loop-over sequence, and uses next() calls on the result. Consider it the equivalent of:
iterable = iter(cases)
while True:
    try:
        case = next(iterable)
    except StopIteration:
        break
    # blah
The result of calling iter() on a list is a list iterator object:
>>> iter([])
<list_iterator object at 0x10fcc6a90>
This object keeps a reference to the original list and keeps track of the index it is at. That index starts at 0 and increments until the list has been iterated over fully.
Different objects can return different iterators with different behaviours. With threading mixed in, you could end up replacing cases with something else, but the iterator would still reference the old sequence.
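A minimal single-threaded sketch of that last point (the rebinding here simply stands in for what another thread might do): once the iterator exists, assigning a new object to the name cases does not change what the loop walks over.

```python
cases = [1, 2, 3]
iterator = iter(cases)      # what the for loop does behind the scenes
first = next(iterator)

# Rebinding the name does not affect the iterator, which still holds a
# reference to the original list object.
cases = ["a", "b"]
remaining = list(iterator)

print(first, remaining)  # 1 [2, 3]
```

The index-based loop, by contrast, looks up cases again on every pass, so it would see the replacement.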

l = [1, 2, 3, 4, 5]
l = iter(l)
while True:
    try:
        print(next(l))
    except StopIteration:
        break

I didn't see any difference. Check the session below; is this what you are trying to do?
>>> cases = [1,2,3]
>>> for case in cases:
...     print(case)
...
1
2
3
>>> i=0
>>> for i in range(len(cases)):
...     print(cases[i])
...
1
2
3
>>>

Python idiom to return first item or None

I'm calling a bunch of methods that return a list. The list may be empty. If the list is non-empty, I want to return the first item; otherwise, I want to return None. This code works:
def main():
    my_list = get_list()
    if len(my_list) > 0:
        return my_list[0]
    return None
but it seems to me that there should be a simple one-line idiom for doing this. Is there?

Python 2.6+
next(iter(your_list), None)
If your_list can be None:
next(iter(your_list or []), None)
Python 2.4
def get_first(iterable, default=None):
    if iterable:
        for item in iterable:
            return item
    return default
Example:
x = get_first(get_first_list())
if x:
    ...
y = get_first(get_second_list())
if y:
    ...
Another option is to inline the above function:
for x in get_first_list() or []:
    # process x
    break # process at most one item
for y in get_second_list() or []:
    # process y
    break
To avoid break you could write:
for x in yield_first(get_first_list()):
    x # process x
for y in yield_first(get_second_list()):
    y # process y
Where:
def yield_first(iterable):
    for item in iterable or []:
        yield item
        return
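For the next(iter(...), None) form at the top of this answer, a small sketch of how it behaves on a few inputs; the wrapper name first_or_none is made up here for illustration:

```python
def first_or_none(items):
    # Hypothetical wrapper around the idiom; `items` may be None or any iterable.
    return next(iter(items or []), None)


print(first_or_none([]))        # None
print(first_or_none(None))      # None
print(first_or_none([0, 1]))    # 0 (a falsy first item is still returned)
print(first_or_none(c for c in "ab"))  # a
```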

The best way is this:
a = get_list()
return a[0] if a else None
You could also do it in one line, but it's much harder for the programmer to read:
return (get_list()[:1] or [None])[0]

(get_list() or [None])[0]
That should work.
BTW I didn't use the variable list, because that overwrites the builtin list() function.

The most idiomatic Python way is to use next() on an iterator, since a list is iterable, just as @J.F.Sebastian put it in a comment on Dec 13, 2011.
next(iter(the_list), None)
This returns None if the_list is empty. See next() (Python 2.6+).
Or, if you know for sure that the_list is not empty:
iter(the_list).next()
See iterator.next() (Python 2.2+).

If you find yourself trying to pluck the first thing (or None) from a list comprehension you can switch to a generator to do it like:
next((x for x in blah if cond), None)
Pro: it works even if blah isn't indexable. Con: the syntax is unfamiliar. It's useful while hacking around and filtering stuff in IPython, though.
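A small sketch of this pattern, assuming a made-up helper name first_match and arbitrary sample numbers. Because the generator expression is lazy, it stops pulling items as soon as one matches:

```python
def first_match(items, predicate, default=None):
    # Hypothetical helper: first item satisfying predicate, else default.
    return next((x for x in items if predicate(x)), default)


def is_even(n):
    return n % 2 == 0


numbers = iter([1, 3, 4, 6, 8])
found = first_match(numbers, is_even)
leftover = list(numbers)   # items after the match were never examined

print(found, leftover)                # 4 [6, 8]
print(first_match([1, 3], is_even))   # None
```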

The OP's solution is nearly there, there are just a few things to make it more Pythonic.
For one, there's no need to get the length of the list. Empty lists in Python evaluate to False in an if check. Just simply say
if list:
Additionally, it's a very Bad Idea to assign to variables that shadow built-in names. "list" is the name of a built-in type in Python (it is not a reserved word, so rebinding it is legal, but doing so hides the built-in).
So let's change that to
some_list = get_list()
if some_list:
A really important point that a lot of solutions here miss is that all Python functions/methods return None by default. Try the following below.
def does_nothing():
    pass
foo = does_nothing()
print(foo)
Unless you need to return None to terminate a function early, it's unnecessary to explicitly return None. Quite succinctly, just return the first entry, should it exist.
some_list = get_list()
if some_list:
    return some_list[0]
And finally, perhaps this was implied, but just to be explicit (because explicit is better than implicit), you should not have your function get the list from another function; just pass it in as a parameter. So, the final result would be
def get_first_item(some_list):
    if some_list:
        return some_list[0]
my_list = get_list()
first_item = get_first_item(my_list)
As I said, the OP was nearly there, and just a few touches give it the Python flavor you're looking for.

Python idiom to return first item or None?
The most Pythonic approach is what the most upvoted answer demonstrated, and it was the first thing to come to my mind when I read the question. Here's how to use it, first if the possibly empty list is passed into a function:
def get_first(l):
return l[0] if l else None
And if the list is returned from a get_list function:
l = get_list()
return l[0] if l else None
New in Python 3.8, Assignment Expressions
Assignment expressions use the in-place assignment operator :=, informally called the walrus operator. New in Python 3.8, it lets us do the check and the assignment in place, allowing the one-liner:
return l[0] if (l := get_list()) else None
As a long-time Python user, this feels like we're trying to do too much on one line - I feel it would be better style to do the presumptively equally performant:
if l := get_list():
    return l[0]
return None
In support of this formulation is Tim Peters' essay in the PEP proposing this change to the language. He didn't address the first formulation, but based on the other formulations he did like, I don't think he would mind.
Other ways demonstrated to do this here, with explanations
for
When I began trying to think of clever ways to do this, this is the second thing I thought of:
for item in get_list():
    return item
This presumes the function ends here, implicitly returning None if get_list returns an empty list. The below explicit code is exactly equivalent:
for item in get_list():
    return item
return None
if some_list
The following was also proposed (I corrected the incorrect variable name) which also uses the implicit None. This would be preferable to the above, as it uses the logical check instead of an iteration that may not happen. This should be easier to understand immediately what is happening. But if we're writing for readability and maintainability, we should also add the explicit return None at the end:
some_list = get_list()
if some_list:
    return some_list[0]
slice or [None] and select zeroth index
This one is also in the most up-voted answer:
return (get_list()[:1] or [None])[0]
The slice is unnecessary, and creates an extra one-item list in memory. The following should be more performant. To explain, or returns the second element if the first is False in a boolean context, so if get_list returns an empty list, the expression contained in the parentheses will return a list with 'None', which will then be accessed by the 0 index:
return (get_list() or [None])[0]
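To see what or is doing here, a quick sketch with the list passed in directly; head_or_none is a name invented for this example:

```python
def head_or_none(items):
    # `or` returns its left operand if it is truthy, otherwise the right one,
    # so an empty list is swapped for the one-item list [None].
    return (items or [None])[0]


print([] or [None])           # [None]
print([5, 6] or [None])       # [5, 6]
print(head_or_none([]))       # None
print(head_or_none([5, 6]))   # 5
print(head_or_none([0]))      # 0 (only the list's truthiness is tested)
```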
The next one uses the fact that and returns the second item if the first is True in a boolean context, and since it references my_list twice, it is no better than the ternary expression (and technically not a one-liner):
my_list = get_list()
return (my_list and my_list[0]) or None
next
Then we have the following clever use of the builtin next and iter
return next(iter(get_list()), None)
To explain, iter returns an iterator with a .next method. (.__next__ in Python 3.) Then the builtin next calls that .next method, and if the iterator is exhausted, returns the default we give, None.
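As an illustration of that protocol in Python 3, here is a toy iterator class (Countdown is invented for this sketch) whose __next__ raises StopIteration when exhausted; the builtin next turns that into the default we pass:

```python
class Countdown:
    """Toy iterator yielding start, start - 1, ..., 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


it = iter(Countdown(2))
values = [next(it, None), next(it, None), next(it, None)]
print(values)  # [2, 1, None]
```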
redundant ternary expression (a if b else c) and circling back
The below was proposed, but the inverse would be preferable, as logic is usually better understood in the positive instead of the negative. Since get_list is called twice, unless the result is memoized in some way, this would perform poorly:
return None if not get_list() else get_list()[0]
The better inverse:
return get_list()[0] if get_list() else None
Even better, use a local variable so that get_list is only called one time, and you have the recommended Pythonic solution first discussed:
l = get_list()
return l[0] if l else None

Regarding idioms, there is an itertools recipe called nth.
From itertools recipes:
from itertools import islice

def nth(iterable, n, default=None):
    "Returns the nth item or a default value"
    return next(islice(iterable, n, None), default)
If you want one-liners, consider installing a library that implements this recipe for you, e.g. more_itertools:
import more_itertools as mit
mit.nth([3, 2, 1], 0)
# 3
mit.nth([], 0) # default is `None`
# None
Another tool is available that only returns the first item, called more_itertools.first.
mit.first([3, 2, 1])
# 3
mit.first([], default=None)
# None
These itertools scale generically for any iterable, not only for lists.

for item in get_list():
    return item

Frankly speaking, I do not think there is a better idiom: yours is clear and terse, with no need for anything "better". Maybe, but this is really a matter of taste, you could replace if len(list) > 0: with if list:, since an empty list always evaluates to False.
On a related note, Python is not Perl (no pun intended!); you do not have to write the coolest code possible.
Actually, the worst code I have seen in Python was also very cool :-) and completely unmaintainable.
By the way, most of the solutions I have seen here do not take into account the case where list[0] evaluates to False (e.g. empty string, or zero); in this case, they all return None and not the correct element.

my_list[0] if len(my_list) else None

Not sure how pythonic this is but until there is a first function in the library I include this in the source:
first = lambda l, default=None: next(iter(l or []), default)
It's just one line (conforms to black) and avoids dependencies.

Out of curiosity, I ran timings on two of the solutions. The solution which uses a return statement to prematurely end a for loop is slightly more costly on my machine with Python 2.5.1; I suspect this has to do with setting up the iterable.
import random
import timeit

def index_first_item(some_list):
    if some_list:
        return some_list[0]

def return_first_item(some_list):
    for item in some_list:
        return item

empty_lists = []
for i in range(10000):
    empty_lists.append([])
assert empty_lists[0] is not empty_lists[1]

full_lists = []
for i in range(10000):
    full_lists.append(list([random.random() for i in range(10)]))

mixed_lists = empty_lists[:50000] + full_lists[:50000]
random.shuffle(mixed_lists)

if __name__ == '__main__':
    # the setup string imports this file, so it must be saved as firstitem.py
    ENV = 'import firstitem'
    test_data = ('empty_lists', 'full_lists', 'mixed_lists')
    funcs = ('index_first_item', 'return_first_item')
    for data in test_data:
        print("%s:" % data)
        for func in funcs:
            t = timeit.Timer('firstitem.%s(firstitem.%s)' % (
                func, data), ENV)
            times = t.repeat()
            avg_time = sum(times) / len(times)
            print(" %s:" % func)
            for time in times:
                print(" %f seconds" % time)
            print(" %f seconds avg." % avg_time)
These are the timings I got:
empty_lists:
index_first_item:
0.748353 seconds
0.741086 seconds
0.741191 seconds
0.743543 seconds avg.
return_first_item:
0.785511 seconds
0.822178 seconds
0.782846 seconds
0.796845 seconds avg.
full_lists:
index_first_item:
0.762618 seconds
0.788040 seconds
0.786849 seconds
0.779169 seconds avg.
return_first_item:
0.802735 seconds
0.878706 seconds
0.808781 seconds
0.830074 seconds avg.
mixed_lists:
index_first_item:
0.791129 seconds
0.743526 seconds
0.744441 seconds
0.759699 seconds avg.
return_first_item:
0.784801 seconds
0.785146 seconds
0.840193 seconds
0.803380 seconds avg.

try:
    return a[0]
except IndexError:
    return None

def head(iterable):
    try:
        return next(iter(iterable))
    except StopIteration:
        return None
print(head(range(42, 1000))) # 42
print(head([])) # None
BTW: I'd rework your general program flow into something like this:
lists = [
    ["first", "list"],
    ["second", "list"],
    ["third", "list"]
]
def do_something(element):
    if not element:
        return
    else:
        # do something
        pass
for li in lists:
    do_something(head(li))
(Avoiding repetition whenever possible)

Borrowing more_itertools.first_true code yields something decently readable:
def first_true(iterable, default=None, pred=None):
    return next(filter(pred, iterable), default)
def get_first_non_default(items_list, default=None):
    return first_true(items_list, default, pred=lambda x: x!=default)

The following code covers several scenarios by using a lambda:
l1 = [1,2,3]
l2 = []
l3 = None
first_elem = lambda x: x[0] if x else None
print(first_elem(l1))
print(first_elem(l2))
print(first_elem(l3))

Using the and-or trick:
a = get_list()
return a and a[0] or None

Probably not the fastest solution, but nobody mentioned this option:
dict(enumerate(get_list())).get(0)
if get_list() can return None you can use:
dict(enumerate(get_list() or [])).get(0)
Advantages:
- one line
- you just call get_list() once
- easy to understand

My use case was only to set the value of a local variable.
Personally I found the try and except style cleaner to read
items = [10, 20]
try: first_item = items[0]
except IndexError: first_item = None
print(first_item)
than slicing a list.
items = [10, 20]
first_item = (items[:1] or [None, ])[0]
print(first_item)

How about this:
(my_list and my_list[0]) or None
Note: This should work fine for lists of objects but it might return incorrect answer in case of number or string list per the comments below.

You could use Extract Method. In other words extract that code into a method which you'd then call.
I wouldn't try to compress it much more, the one liners seem harder to read than the verbose version. And if you use Extract Method, it's a one liner ;)

Several people have suggested doing something like this:
list = get_list()
return list and list[0] or None
That works in many cases, but it will only work if list[0] is not equal to 0, False, or an empty string. If list[0] is 0, False, or an empty string, the method will incorrectly return None.
I've created this bug in my own code one too many times!
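A side-by-side sketch of that pitfall; both function names are invented here, and the list is passed in rather than fetched from get_list():

```python
def first_and_or(items):
    # The and-or trick: a falsy first element falls through to None.
    return items and items[0] or None


def first_conditional(items):
    # Conditional expression: only the list's own truthiness is tested.
    return items[0] if items else None


print(first_and_or([0, 1]), first_conditional([0, 1]))  # None 0
print(first_and_or([""]), first_conditional([""]))      # prints "None" then an empty string
print(first_and_or([3]), first_conditional([3]))        # 3 3
print(first_and_or([]), first_conditional([]))          # None None
```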

Isn't this the idiomatic Python equivalent of C-style ternary operators?
cond and true_expr or false_expr
i.e.
list = get_list()
return list and list[0] or None

if mylist != []:
    print(mylist[0])
else:
    print(None)
