Is there a built-in python equivalent of std::find_if to find the first element of a list for which a given condition is true? In other words, something like the index() function of a list, but with an arbitrary unary predicate rather than just a test for equality.
I don't want to use list comprehension, because the specific predicate I have in mind is somewhat expensive to compute.
Using a tip from an answer to a related question, and borrowing from the answer taras posted, I came up with this:
>>> lst=[1,2,10,3,5,3,4]
>>> next(n for n in lst if n%5==0)
10
A slight modification will give you the index rather than the value:
>>> next(idx for idx,n in enumerate(lst) if n%5==0)
2
If there is no match, this raises a StopIteration exception. You might want to use a function that handles the exception and returns None when there is no match:
def first_match(iterable, predicate):
    try:
        return next(idx for idx,n in enumerate(iterable) if predicate(n))
    except StopIteration:
        return None
lst=[1,2,10,3,5,3,4]
print(first_match(lst, lambda x: x%5 == 0))
Note that this uses a generator expression, not a list comprehension. A list comprehension would apply the condition to every member of the list and produce a list of all matches. This applies it to each member until it finds a match and then stops, which is the minimum work to solve the problem.
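To make the difference concrete, here is a small sketch that counts how often the predicate runs in each case. The `counted` wrapper and the `calls` list are hypothetical helpers added only for this illustration; they are not part of the answer above.

```python
lst = [1, 2, 10, 3, 5, 3, 4]
calls = []  # hypothetical helper: records every value the predicate is asked about

def counted(n):
    calls.append(n)
    return n % 5 == 0

# Generator expression: evaluation stops as soon as next() gets its first match.
first = next(n for n in lst if counted(n))
lazy_calls = len(calls)

del calls[:]  # reset the record

# List comprehension: the predicate is applied to every element.
all_matches = [n for n in lst if counted(n)]
eager_calls = len(calls)

print(first, lazy_calls)         # 10 3
print(all_matches, eager_calls)  # [10, 5] 7
```

With an expensive predicate, the gap between 3 and 7 calls is exactly the saving the generator expression buys.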
Say you have some predicate function pred and a list lst.
You can use itertools.dropwhile to get the first element in lst
for which pred returns True:
next(itertools.dropwhile(lambda x: not pred(x), lst))
dropwhile skips all elements for which pred(x) is False, and next()
then yields the first value for which pred(x) is True.
Edit:
A sample use to find the first element in lst divisible by 5
>>> import itertools
>>> lst = [1,2,10,3,5,3,4]
>>> pred = lambda x: x % 5 == 0
>>> next(itertools.dropwhile(lambda x: not pred(x), lst))
10
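As a hedged variation on the example above: the built-in next() is the portable spelling (in Python 3 the iterator method is named __next__), and it accepts a default that is returned when nothing matches. The value 99 below is just an assumed "not in the list" target for illustration.

```python
import itertools

lst = [1, 2, 10, 3, 5, 3, 4]
pred = lambda x: x % 5 == 0

# First element satisfying pred, or None if there is none.
first = next(itertools.dropwhile(lambda x: not pred(x), lst), None)

# No element equals 99, so dropwhile drops everything and the default is returned.
missing = next(itertools.dropwhile(lambda x: x != 99, lst), None)

print(first)    # 10
print(missing)  # None
```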
I am refreshing my Python (2.7) and discovering iterators and generators.
As I understand it, they are an efficient way of iterating over values without consuming too much memory.
The following code does a kind of logical indexing on a list:
it removes the values of a list L for which a condition, represented here by the function f, is False.
I am not satisfied with my code because I feel it is not optimal, for three reasons:
I read somewhere that it is better to use a for loop than a while loop.
However, in the usual for i in range(10), I can't modify the value of i, because the loop ignores the change.
Logical indexing is pretty strong in matrix-oriented languages, and there should be a way to do the same in Python (by hand, granted, but maybe better than my code).
The third reason is just that I want to use a generator/iterator on this example to help me understand.
TL;DR : Is this code a good pythonic way to do logical indexing ?
#f string -> bool
def f(s):
    return 'c' in s
L=['','a','ab','abc','abcd','abcde','abde'] #example
length=len(L)
i=0
while i < length:
    if not f(L[i]): #f is a conditional statement (input string output bool)
        del L[i]
        length-=1 #cut and push leftwise
    else:
        i+=1
print 'Updated list is :', L
print length
This code has a few problems, but the main one is that you must never modify a list you're iterating over. Rather, you create a new list from the elements that match your condition. This can be done simply in a for loop:
newlist = []
for item in L:
    if f(item):
        newlist.append(item)
which can be shortened to a simple list comprehension:
newlist = [item for item in L if f(item)]
It looks like filter() is what you're after:
newlist = filter(lambda x: not f(x), L)
filter() filters (...) an iterable and only keeps the items for which a predicate returns True. In your case f(..) is not quite the predicate but not f(...).
Simpler:
def f(s):
    return 'c' not in s
newlist = filter(f, L)
See: https://docs.python.org/2/library/functions.html#filter
Never modify a list with del, pop or other methods that mutate the length of the list while iterating over it. Read this for more information.
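A short sketch of what goes wrong, assuming the f and the example list from the question: removing items inside a plain for loop shifts the remaining items left, so the element that slides into the current position is never tested.

```python
def f(s):
    return 'c' in s

broken = ['', 'a', 'ab', 'abc', 'abcd', 'abcde', 'abde']
for item in broken:          # iterating over and mutating the same list
    if not f(item):
        broken.remove(item)  # later items shift left; the next one is skipped

# Building a new list instead tests every element exactly once.
expected = [item for item in ['', 'a', 'ab', 'abc', 'abcd', 'abcde', 'abde'] if f(item)]

print(broken)    # ['a', 'abc', 'abcd', 'abcde']  ('a' was skipped and survived)
print(expected)  # ['abc', 'abcd', 'abcde']
```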
The "pythonic" way to filter a list is to use reassignment and either a list comprehension or the built-in filter function:
List comprehension:
>>> [item for item in L if f(item)]
['abc', 'abcd', 'abcde']
i want to use generator/iterator on this example to help me understand
The for item in L part is implicitly making use of the iterator protocol. Python lists are iterable, and iter(somelist) returns an iterator.
>>> from collections import Iterable, Iterator
>>> isinstance([], Iterable)
True
>>> isinstance([], Iterator)
False
>>> isinstance(iter([]), Iterator)
True
__iter__ is not only being called when using a traditional for-loop, but also when you use a list comprehension:
>>> class mylist(list):
...     def __iter__(self):
...         print('iter has been called')
...         return super(mylist, self).__iter__()
...
>>> m = mylist([1,2,3])
>>> [x for x in m]
iter has been called
[1, 2, 3]
Filtering:
>>> filter(f, L)
['abc', 'abcd', 'abcde']
In Python 3, use list(filter(f, L)) to get a list.
Of course, to filter a list, Python needs to iterate over it, too:
>>> filter(None, mylist())
iter has been called
[]
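For Python 3 specifically, filter returns a lazy iterator rather than a list, which is why the list(...) call is needed there. A minimal sketch, reusing the f and L from the question:

```python
def f(s):
    return 'c' in s

L = ['', 'a', 'ab', 'abc', 'abcd', 'abcde', 'abde']

lazy = filter(f, L)  # Python 3: nothing has been tested yet
first = next(lazy)   # pulls items from L only until the first match
rest = list(lazy)    # consumes whatever is left

print(first)  # abc
print(rest)   # ['abcd', 'abcde']
```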
"The python way" to do it would be to use a generator expression:
# list comprehension
L = [l for l in L if f(l)]
# alternative generator comprehension
L = (l for l in L if f(l))
It depends on your context if a list or a generator is "better" (see e.g. this so question). Because your source data is coming from a list, there is no real benefit of using a generator here.
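One practical difference worth seeing before choosing: a generator can be consumed only once, whereas a list can be iterated repeatedly. A small sketch, with f assumed to be the question's 'c' in s test:

```python
def f(s):
    return 'c' in s

L = ['', 'a', 'ab', 'abc', 'abcd', 'abcde', 'abde']

as_list = [s for s in L if f(s)]
as_gen = (s for s in L if f(s))

first_pass = list(as_gen)   # consumes the generator
second_pass = list(as_gen)  # already exhausted, so nothing is left

print(as_list)      # ['abc', 'abcd', 'abcde']
print(first_pass)   # ['abc', 'abcd', 'abcde']
print(second_pass)  # []
```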
For simply deleting elements, especially if the original list is no longer needed, just iterate backwards:
Python 2.x:
for i in xrange(len(L) - 1, -1, -1):
    if not f(L[i]):
        del L[i]
Python 3.x:
for i in range(len(L) - 1, -1, -1):
    if not f(L[i]):
        del L[i]
By iterating from the end, the "next" index does not change after deletion and a for loop is possible. Note that you should use xrange in Python 2, or range in Python 3, both of which produce the indices lazily instead of building a full list, to save memory*.
In cases where you must iterate forward, use your given solution above.
*Note that Python 2's xrange will break if there are >= 2 ** 32 - 1 elements. Python 3's range, as well as the less efficient Python 2's range do not have this limitation.
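The backward loop can be wrapped in a small function to check that it really filters in place. The name delete_backwards is a hypothetical helper chosen for this sketch, and the Python 3 range spelling is assumed.

```python
def f(s):
    return 'c' in s

def delete_backwards(items, keep):
    # Walk from the last index down to 0 so deletions never shift
    # an index that has not been visited yet.
    for i in range(len(items) - 1, -1, -1):
        if not keep(items[i]):
            del items[i]
    return items

L = ['', 'a', 'ab', 'abc', 'abcd', 'abcde', 'abde']
result = delete_backwards(L, f)

print(result)       # ['abc', 'abcd', 'abcde']
print(result is L)  # True: the original list object was modified in place
```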
I did some google searching on how to check if a string has any elements of a list in it and I found this bit of code that works:
if any(i in string for i in list):
I know this works, but I don't really know why. Could you share some insight?
As the docs for any say:
Return True if any element of the iterable is true. If the iterable is empty, return False. Equivalent to:
def any(iterable):
    for element in iterable:
        if element:
            return True
    return False
So, this is equivalent to:
for element in (i in string for i in list):
    if element:
        return True
return False
… which is itself effectively equivalent to:
for i in list:
    element = i in string
    if element:
        return True
return False
If you don't understand the last part, first read the tutorial section on list comprehensions, then skip ahead to iterators, generators, and generator expressions.
If you want to really break it down, you can do this:
elements = []
for i in list:
    elements.append(i in string)
for element in elements:
    if element:
        return True
return False
That still isn't exactly the same, because a generator expression builds a generator, not a list, but it should be enough to get you going until you read the tutorial sections.
But meanwhile, the point of having any and comprehensions and so on is that you can almost read them as plain English:
if any(i in string for i in list): # Python
if any of the i's is in the string, for each i in the list: # pseudo-English
i in string for i in list
This produces an iterable of booleans indicating whether each item in list is in string. Then you check whether any item in this iterable of bools is true.
In effect, you're checking whether any of the items in the list are substrings of string.
What's going on here with:
if any(i in string for i in list):
is best explained by illustrating:
>>> xs = ["Goodbye", "Foo", "Balloon"]
>>> s = "Goodbye World"
>>> [i in s for i in xs]
[True, False, False]
>>> any([i in s for i in xs])
True
If you read the any documentation you'll note:
any(iterable) Return True if any element of the iterable is true.
If the iterable is empty, return False. Equivalent to:
The list comprehension should be more obvious as it constructs a list of i in s for each element of xs.
Basically (in English), you are checking whether any of the sub-strings exists in the search string (the haystack).
It's also important to note that any() will short-circuit and stop at the first true-ish value it finds. any() can be implemented in pure Python like this:
def any(iterable):
    for x in iterable:
        if x:
            return True
    return False
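The short-circuit only saves work when any() is fed a generator expression; a list comprehension is fully built before any() sees it. A sketch that records which substrings actually get tested (the contains wrapper and the seen list are hypothetical helpers for this illustration):

```python
seen = []  # hypothetical helper: records each substring that gets tested

def contains(needle, haystack):
    seen.append(needle)
    return needle in haystack

xs = ["Goodbye", "Foo", "Balloon"]
s = "Goodbye World"

# Generator expression: any() stops after the first True.
found_lazy = any(contains(x, s) for x in xs)
seen_lazy = list(seen)

del seen[:]  # reset the record

# List comprehension: all three tests run before any() is even called.
found_eager = any([contains(x, s) for x in xs])
seen_eager = list(seen)

print(found_lazy, seen_lazy)    # True ['Goodbye']
print(found_eager, seen_eager)  # True ['Goodbye', 'Foo', 'Balloon']
```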
Basically, my question is: say you have a list containing None; how would you retrieve the sum of the list? Below is an example I tried, which doesn't work; I get the error: TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'. Thanks
from itertools import chain
def sumImport(self):
    my_list = [[1,2,3,None],[1,2,3],[1,1],[1,1,2,2]]
    k = sum(chain.from_iterable(my_list))
    return k
You can use the filter function:
>>> sum(filter(None, [1,2,3,None]))
6
Updated from comments
Typically filter usage is filter(func, iterable), but passing None as first argument is a special case, described in Python docs. Quoting:
If function is None, the identity function is assumed, that is, all elements of iterable that are false are removed.
Remove None (and zero) elements before summing by using filter:
>>> k = sum(filter(None, chain.from_iterable(my_list)))
>>> k
20
To see why this works, see the documentation for filter:
filter(function, iterable)
Construct a list from those elements of iterable for which function returns true. iterable may be either a sequence, a container which supports iteration, or an iterator. If iterable is a string or a tuple, the result also has that type; otherwise it is always a list. If function is None, the identity function is assumed, that is, all elements of iterable that are false are removed.
Note that filter(function, iterable) is equivalent to [item for item in iterable if function(item)] if function is not None and [item for item in iterable if item] if function is None.
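Because filter(None, ...) drops every false value, zeros disappear along with None. That is harmless for a sum, but it matters if the filtered values are reused for anything else. A sketch contrasting it with an explicit is not None test (the sample list is an assumption made for illustration):

```python
from itertools import chain

my_list = [[1, 2, 3, None], [1, 2, 3], [1, 1], [1, 1, 2, 2]]

# Summing while skipping only None.
total = sum(x for x in chain.from_iterable(my_list) if x is not None)

sample = [0, 1, None, 2]                        # assumed sample data
truthy = list(filter(None, sample))             # drops 0 as well as None
not_none = [x for x in sample if x is not None] # keeps 0

print(total)     # 20
print(truthy)    # [1, 2]
print(not_none)  # [0, 1, 2]
```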
Another suggestion:
from itertools import chain
k = sum(x for x in chain.from_iterable(my_list) if x)
Assuming you want to treat None as zero, a simple way is
sum(x if x is not None else 0 for x in chain.from_iterable(my_list))
Explicitly, this is equivalent to filter:
k = sum([x for x in chain.from_iterable(my_list) if x])
That saves me from remembering another function. :P
You always have the option of just writing the loop you want:
k = 0
for sublist in my_list:
    for val in sublist:
        if val is not None:
            k += val
But it certainly doesn’t hurt to know about filter either.
Just using sum and map:
sum(map(lambda x: x or 0, [1,2,3,None]))
# 6
I am using generators to perform searches in lists like this simple example:
>>> a = [1,2,3,4]
>>> (i for i, v in enumerate(a) if v == 4).next()
3
(Just to frame the example a bit, I am using very much longer lists compared to the one above, and the entries are a little bit more complicated than int. I do it this way so the entire lists won't be traversed each time I search them)
Now if I instead change that to v == 666, it raises StopIteration because it can't find any 666 entry in a.
How can I make it return None instead? I could of course wrap it in a try ... except clause, but is there a more pythonic way to do it?
If you are using Python 2.6+ you should use the next built-in function, not the next method (which was replaced with __next__ in 3.x). The next built-in takes an optional default argument to return if the iterator is exhausted, instead of raising StopIteration:
next((i for i, v in enumerate(a) if v == 666), None)
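Wrapped in a function, the default-argument form of next() might look like the sketch below. The name first_index is a hypothetical helper; note that the generator expression needs its own parentheses because it is not the only argument to next().

```python
def first_index(items, target, default=None):
    # Returns the index of the first item equal to target,
    # or default if the generator is exhausted without a match.
    return next((i for i, v in enumerate(items) if v == target), default)

a = [1, 2, 3, 4]

print(first_index(a, 4))    # 3
print(first_index(a, 666))  # None
```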
You can chain the generator with (None,):
from itertools import chain
a = [1,2,3,4]
print chain((i for i, v in enumerate(a) if v == 6), (None,)).next()
But I think a.index(2) will not traverse the full list: once 2 is found, the search is finished. You can test this:
>>> import timeit
>>> timeit.timeit("a.index(0)", "a=range(10)")
0.19335955439601094
>>> timeit.timeit("a.index(99)", "a=range(100)")
2.1938486138533335
I'm looking for a concise and functional style way to apply a function to one element of a tuple and return the new tuple, in Python.
For example, for the following input:
inp = ("hello", "my", "friend")
I would like to be able to get the following output:
out = ("hello", "MY", "friend")
I came up with two solutions which I'm not satisfied with.
One uses a higher-order function.
def apply_at(arr, func, i):
    # wrap the new element in a one-element tuple so it can be concatenated with tuple slices
    return arr[0:i] + (func(arr[i]),) + arr[i+1:]
apply_at(inp, lambda x: x.upper(), 1)
One uses list comprehensions (this one assumes the length of the tuple is known).
[(a,b.upper(),c) for a,b,c in [inp]][0]
Is there a better way? Thanks!
Here is a version that works on any iterable and returns a generator:
>>> inp = ("hello", "my", "friend")
>>> def apply_nth(fn, n, iterable):
...     return (fn(x) if i==n else x for (i,x) in enumerate(iterable))
...
>>> tuple(apply_nth(str.upper, 1, inp))
('hello', 'MY', 'friend')
You can extend this so that instead of one position you can give it a list of positions:
>>> def apply_at(fn, pos_lst, iterable):
...     pos_lst = set(pos_lst)
...     return (fn(x) if i in pos_lst else x for (i,x) in enumerate(iterable))
...
>>> ''.join(apply_at(str.upper, [2,4,6,8], "abcdefghijklmno"))
'abCdEfGhIjklmno'
I commented in support of your first snippet, but here are a couple other ways for the record:
(lambda (a,b,c): [a,b.upper(),c])(inp)
(Won't work in Python 3.x.) And:
[inp[0], inp[1].upper(), inp[2]]
>>> inp = "hello", "my", "friend"
>>> index = 1
>>> inp[:index] + ( str.upper(inp[index]),) + inp[index + 1:]
('hello', 'MY', 'friend')
Seems simple, the only thing you may need to know is that to make a single element tuple, do (elt,)
Maybe something like this?
>>> inp = ("hello", "my", "friend")
>>> out = tuple([i == 1 and x.upper() or x for (x,i) in zip(inp,range(len(inp)))])
>>> out
('hello', 'MY', 'friend')
Note: rather than (x,i) in zip(inp, range(len(inp))) I should have thought of using the enumerate function: (i,x) in enumerate(inp)
Making it a bit more general:
Rather than hard-coding the 1, we can place it in a variable.
Also, by using a tuple for that purpose, we can apply the function to elements at multiple indexes.
>>> inp = ("hello", "my", "friend")
>>> ix = (0,2)
>>> out = tuple([i in ix and x.upper() or x for (i, x) in enumerate(inp)])
>>> out
('HELLO', 'my', 'FRIEND')
Also, we can "replace" the zip()/enumerate() by map(), in something like
out = tuple(map(lambda x,i : i == 1 and x.upper() or x, inp, range(len(inp)) ) )
Edit: (addressing comment about specifying the function to apply):
Could be something as simple as:
>>> f = str.upper # or whatever function taking a single argument
>>> out = tuple(map(lambda x,i : i == 1 and f(x) or x, inp, range(len(inp)) ) )
Since we're talking about applying any function, we should mention the small caveat with the condition and if_true or if_false construct which is not exactly a substitute for the if/else ternary operator found in other languages. The limitation is that the function cannot return a value which is equivalent to False (None, 0, 0.0, '' for example). A suggestion to avoid this problem, is, with Python 2.5 and up, to use the true if-else ternary operator, as shown in Dave Kirby's answer (note the when_true if condition else when_false syntax of this operator)
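The caveat is easy to reproduce. In this sketch the applied function is len, chosen only because it returns 0 (a false value) for the empty string; the input tuple is likewise an assumption made for illustration.

```python
inp = ("hello", "", "friend")
f = len  # returns 0 for the empty string, which is a false value

# and/or idiom: when f(x) is falsy, the "or x" branch wins and the original value leaks through.
with_and_or = tuple(i == 1 and f(x) or x for i, x in enumerate(inp))

# True conditional expression: the result of f(x) is used whatever its truthiness.
with_ternary = tuple(f(x) if i == 1 else x for i, x in enumerate(inp))

print(with_and_or)   # ('hello', '', 'friend')
print(with_ternary)  # ('hello', 0, 'friend')
```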
I don't understand if you want to apply a certain function to every element in the tuple that passes some test, or if you would like it to apply the function to any element present at a certain index of the tuple. So I have coded both algorithms:
This is the algorithm (coded in Python) that I would use to solve this problem in a functional language like scheme:
This function will identify the element identifiable by id and apply func to it and return a list with that element changed to the output of func. It will do this for every element identifiable as id:
def doSomethingTo(tup, id):
    return tuple(doSomethingToHelper(list(tup), id))
def doSomethingToHelper(L, id):
    if len(L) == 0:
        return L
    elif L[0] == id:
        return [func(L[0])] + doSomethingToHelper(L[1:], id)
    else:
        return [L[0]] + doSomethingToHelper(L[1:], id)
This algorithm will find the element at the index of the tuple and apply func to it, and stick it back into its original index in the tuple
def doSomethingAt(tup, i):
    return tuple(doSomethingAtHelper(list(tup), i, 0))
def doSomethingAtHelper(L, index, i):
    if len(L) == 0:
        return L
    elif i == index:
        return [func(L[0])] + L[1:]
    else:
        return [L[0]] + doSomethingAtHelper(L[1:], index, i+1)
I also like the answer that Dave Kirby gave. However, as a public service announcement, I'd like to say that this is not a typical use case for tuples. These are data structures that originated in Python as a means to move data (parameters, arguments) to and from functions; they were not meant for the programmer to use as general array-like data structures in applications, which is why lists exist. Naturally, if you need the read-only/immutable feature of tuples, that is a fair argument, but given the OP's question, this should have been done with lists instead. Note how there is extra code to pull the tuple apart and put the resulting one together, and/or the need to temporarily convert to a list and back.