Basically, my question is: say you have a list containing None, how would you retrieve the sum of the list? Below is an example I tried, which doesn't work; I get the error: TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'. Thanks
from itertools import chain

def sumImport(self):
    my_list = [[1,2,3,None],[1,2,3],[1,1],[1,1,2,2]]
    k = sum(chain.from_iterable(my_list))
    return k
You can use the filter function:
>>> sum(filter(None, [1,2,3,None]))
6
Updated from comments
Typical usage is filter(func, iterable), but passing None as the first argument is a special case described in the Python docs. Quoting:
If function is None, the identity function is assumed, that is, all elements of iterable that are false are removed.
Remove None (and zero) elements before summing by using filter:
>>> k = sum(filter(None, chain.from_iterable(my_list)))
>>> k
20
To see why this works, see the documentation for filter:
filter(function, iterable)
Construct a list from those elements of iterable for which function returns true. iterable may be either a sequence, a container which supports iteration, or an iterator. If iterable is a string or a tuple, the result also has that type; otherwise it is always a list. If function is None, the identity function is assumed, that is, all elements of iterable that are false are removed.
Note that filter(function, iterable) is equivalent to [item for item in iterable if function(item)] if function is not None and [item for item in iterable if item] if function is None.
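One caveat worth spelling out: filter(None, ...) removes every falsy element, not just None. That is harmless for a sum, since dropping zeros does not change the total, but it matters for anything that counts or inspects the elements. A minimal sketch, where values is made-up sample data rather than anything from the question:

```python
values = [0, 1, None, 2, 0.0, 3]

# filter(None, ...) drops every falsy item: None, 0 and 0.0 all go.
truthy_only = list(filter(None, values))

# An explicit identity test drops only None and keeps the zeros.
not_none = [v for v in values if v is not None]

print(truthy_only)                        # [1, 2, 3]
print(not_none)                           # [0, 1, 2, 0.0, 3]
print(sum(truthy_only) == sum(not_none))  # True: zeros don't affect the sum
```

If you later reuse the filtered list for a count or an average, the "is not None" form is probably the one you want.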
Another suggestion:
from itertools import chain
k = sum(x for x in chain.from_iterable(my_list) if x)
Assuming you want to treat None as zero, a simple way is
sum(x if x is not None else 0 for x in chain.from_iterable(my_list))
Explicitly, this is equivalent to filter:
k = sum([x for x in chain.from_iterable(my_list) if x])
That saves me from remembering another function. :P
You always have the option of just writing the loop you want:
k = 0
for sublist in my_list:
    for val in sublist:
        if val is not None:
            k += val
But it certainly doesn’t hurt to know about filter either.
Just using sum and map:
sum(map(lambda x: x or 0, [1,2,3,None]))
# 6
Related
Is there a built-in python equivalent of std::find_if to find the first element of a list for which a given condition is true? In other words, something like the index() function of a list, but with an arbitrary unary predicate rather than just a test for equality.
I don't want to use list comprehension, because the specific predicate I have in mind is somewhat expensive to compute.
Using a tip from an answer to a related question, and borrowing from the answer taras posted, I came up with this:
>>> lst=[1,2,10,3,5,3,4]
>>> next(n for n in lst if n%5==0)
10
A slight modification will give you the index rather than the value:
>>> next(idx for idx,n in enumerate(lst) if n%5==0)
2
Now, if there is no match, this will raise a StopIteration exception. You might want to use a function that handles the exception and returns None if there was no match:
def first_match(iterable, predicate):
    try:
        return next(idx for idx,n in enumerate(iterable) if predicate(n))
    except StopIteration:
        return None

lst=[1,2,10,3,5,3,4]
print(first_match(lst, lambda x: x%5 == 0))
Note that this uses a generator expression, not a list comprehension. A list comprehension would apply the condition to every member of the list and produce a list of all matches. This applies it to each member until it finds a match and then stops, which is the minimum work to solve the problem.
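To make the difference concrete, here is a small sketch that counts how often the predicate runs in each case. The names expensive_predicate and calls are made up for the illustration; the list is the one from the answer above.

```python
calls = []

def expensive_predicate(n):
    # Record each call so we can see how many elements were tested.
    calls.append(n)
    return n % 5 == 0

lst = [1, 2, 10, 3, 5, 3, 4]

# Generator expression: stops at the first match (10, the third element).
first = next(n for n in lst if expensive_predicate(n))
lazy_calls = len(calls)

del calls[:]
# List comprehension: tests all seven elements before we can take [0].
all_matches = [n for n in lst if expensive_predicate(n)]
eager_calls = len(calls)

print(first, lazy_calls)         # 10 3
print(all_matches, eager_calls)  # [10, 5] 7
```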
Say you have some predicate function pred and a list lst.
You can use itertools.dropwhile to get the first element in lst
for which pred returns True with
next(itertools.dropwhile(lambda x: not pred(x), lst))
It skips all elements for which pred(x) is False, and next()
then yields the first value for which pred(x) is True. (On Python 2 you can also call the iterator's .next() method; the built-in next() works on Python 2.6+ and on Python 3.)
Edit:
A sample use to find the first element in lst divisible by 5
>>> import itertools
>>> lst = [1,2,10,3,5,3,4]
>>> pred = lambda x: x % 5 == 0
>>> next(itertools.dropwhile(lambda x: not pred(x), lst))
10
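On Python 3, where filter is lazy, the same idea can probably be written without the inverted lambda. This is a sketch assuming Python 3 (on Python 2, filter builds the whole list first, so it would not stop early); find_first is a hypothetical helper name, not a standard library function.

```python
def find_first(pred, iterable, default=None):
    # filter() is lazy on Python 3, so next() pulls only until the first hit.
    return next(filter(pred, iterable), default)

lst = [1, 2, 10, 3, 5, 3, 4]
print(find_first(lambda x: x % 5 == 0, lst))  # 10
print(find_first(lambda x: x > 100, lst))     # None
```

The default argument also avoids the StopIteration that a bare next() raises when nothing matches.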
def shuffle(self, x, random=None, int=int):
    """x, random=random.random -> shuffle list x in place; return None.
    Optional arg random is a 0-argument function returning a random
    float in [0.0, 1.0); by default, the standard random.random.
    """
    randbelow = self._randbelow
    for i in reversed(range(1, len(x))):
        # pick an element in x[:i+1] with which to exchange x[i]
        j = randbelow(i+1) if random is None else int(random() * (i+1))
        x[i], x[j] = x[j], x[i]
When I run the shuffle function it raises the following error, why is that?
TypeError: 'dict_keys' object does not support indexing
Clearly you're passing in d.keys() to your shuffle function. Probably this was written with python2.x (when d.keys() returned a list). With python3.x, d.keys() returns a dict_keys object which behaves a lot more like a set than a list. As such, it can't be indexed.
The solution is to pass list(d.keys()) (or simply list(d)) to shuffle.
You're passing the result of somedict.keys() to the function. In Python 3, dict.keys doesn't return a list, but a set-like object that represents a view of the dictionary's keys and (being set-like) doesn't support indexing.
To fix the problem, use list(somedict.keys()) to collect the keys, and work with that.
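A short sketch of what "set-like view" means in practice, assuming Python 3. The dictionary d and the variable names are made up for the illustration.

```python
import random

d = {'a': 1, 'b': 2, 'c': 3}
keys_view = d.keys()

# Views are set-like: membership tests and set operations work...
print('a' in keys_view)                # True
print(sorted(keys_view & {'a', 'z'}))  # ['a']

# ...but indexing does not.
try:
    keys_view[0]
    indexing_failed = False
except TypeError as exc:
    indexing_failed = True
    print('TypeError:', exc)

# A view also tracks later changes to the dict; a list copy does not.
keys_list = list(d)
d['d'] = 4
print(len(keys_view), len(keys_list))  # 4 3

# The list copy is what random.shuffle needs.
random.shuffle(keys_list)
print(sorted(keys_list))               # ['a', 'b', 'c']
```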
Converting an iterable to a list may have a cost. Instead, to get the first item, you can use:
next(iter(keys))
Or, if you want to iterate over all items, you can use:
items = iter(keys)
while True:
    try:
        item = next(items)
    except StopIteration:
        break  # finished
In Python 2, dict.keys() returns a list, whereas in Python 3 it returns a view object (dict_keys), not a list.
You can iterate over it directly; if you need indexing, you have to convert it explicitly, i.e. pass it to list().
Why implement shuffle yourself when it already exists? Stand on the shoulders of giants.
import random
d1 = {0:'zero', 1:'one', 2:'two', 3:'three', 4:'four',
5:'five', 6:'six', 7:'seven', 8:'eight', 9:'nine'}
keys = list(d1)
random.shuffle(keys)
d2 = {}
for key in keys: d2[key] = d1[key]
print(d1)
print(d2)
I am refreshing my Python (2.7) and I am discovering iterators and generators.
As I understand it, they are an efficient way of navigating over values without consuming too much memory.
The following code does a kind of logical indexing on a list:
it removes the values of a list L for which a condition, represented here by the function f, is False.
I am not satisfied with my code; I feel it is not optimal, for three reasons:
I read somewhere that it is better to use a for loop than a while loop.
However, in the usual for i in range(10), I can't modify the value of i, because the iteration ignores the change.
Logical indexing is pretty strong in matrix-oriented languages, and there should be a way to do the same in Python (by hand, granted, but maybe better than my code).
The third reason is just that I want to use a generator/iterator on this example to help me understand.
TL;DR: Is this code a good, Pythonic way to do logical indexing?
#f string -> bool
def f(s):
    return 'c' in s

L=['','a','ab','abc','abcd','abcde','abde'] #example
length=len(L)
i=0
while i < length:
    if not f(L[i]): #f is a conditional statement (input string output bool)
        del L[i]
        length-=1 #cut and push leftwise
    else:
        i+=1
print 'Updated list is :', L
print length
This code has a few problems, but the main one is that you must never modify a list you're iterating over. Rather, you create a new list from the elements that match your condition. This can be done simply in a for loop:
newlist = []
for item in L:
    if f(item):
        newlist.append(item)
which can be shortened to a simple list comprehension:
newlist = [item for item in L if f(item)]
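To see why modifying the list during iteration goes wrong, and how to still update the original list object when other code holds a reference to it, here is a small sketch. It reuses f and the sample data from the question; buggy, safe and alias are names made up for the illustration.

```python
def f(s):
    return 'c' in s

original = ['', 'a', 'ab', 'abc', 'abcd', 'abcde', 'abde']

# Buggy: removing during iteration shifts later items left, so the
# element right after each removed one is never tested.
buggy = list(original)
for item in buggy:
    if not f(item):
        buggy.remove(item)
print(buggy)          # ['a', 'abc', 'abcd', 'abcde']  ('a' slipped through)

# Safe: build the filtered list, then write it back through a slice so
# every existing reference to the list sees the result.
safe = list(original)
alias = safe
safe[:] = [item for item in safe if f(item)]
print(safe)           # ['abc', 'abcd', 'abcde']
print(alias is safe)  # True
```

Plain reassignment (newlist = [...]) is usually enough; the slice form only matters when the same list object is shared elsewhere.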
It looks like filter() is what you're after:
newlist = filter(f, L)
filter() filters an iterable and keeps only the items for which a predicate returns True. Your loop deletes the items for which f(...) is False, so f itself is the predicate.
If you wanted the opposite (dropping the items that contain 'c'), invert the predicate:
def f(s):
    return 'c' not in s
newlist = filter(f, L)
See: https://docs.python.org/2/library/functions.html#filter
Never modify a list with del, pop or other methods that mutate the length of the list while iterating over it. Read this for more information.
The "pythonic" way to filter a list is to use reassignment and either a list comprehension or the built-in filter function:
List comprehension:
>>> [item for item in L if f(item)]
['abc', 'abcd', 'abcde']
i want to use generator/iterator on this example to help me understand
The for item in L part implicitly makes use of the iterator protocol. Python lists are iterable, and iter(somelist) returns an iterator. (On Python 3, import Iterable and Iterator from collections.abc instead.)
>>> from collections import Iterable, Iterator
>>> isinstance([], Iterable)
True
>>> isinstance([], Iterator)
False
>>> isinstance(iter([]), Iterator)
True
__iter__ is not only being called when using a traditional for-loop, but also when you use a list comprehension:
>>> class mylist(list):
...     def __iter__(self):
...         print('iter has been called')
...         return super(mylist, self).__iter__()
...
>>> m = mylist([1,2,3])
>>> [x for x in m]
iter has been called
[1, 2, 3]
Filtering:
>>> filter(f, L)
['abc', 'abcd', 'abcde']
In Python3, use list(filter(f, L)) to get a list.
Of course, to filter a list, Python needs to iterate over it, too:
>>> filter(None, mylist())
iter has been called
[]
"The python way" to do it would be to use a generator expression:
# list comprehension
L = [l for l in L if f(l)]
# alternative: generator expression
L = (l for l in L if f(l))
Whether a list or a generator is "better" depends on your context (see e.g. this SO question). Because your source data is already in a list, there is no real benefit to using a generator here.
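One practical difference to keep in mind when choosing between the two: a generator can be consumed only once and has no length or indexing. A minimal sketch, reusing f and the sample list from the question (as_list and as_gen are made-up names):

```python
def f(s):
    return 'c' in s

L = ['', 'a', 'ab', 'abc', 'abcd', 'abcde', 'abde']

as_list = [s for s in L if f(s)]
as_gen = (s for s in L if f(s))

print(len(as_list))      # 3; a generator has no len()
first_pass = list(as_gen)
second_pass = list(as_gen)
print(first_pass)        # ['abc', 'abcd', 'abcde']
print(second_pass)       # []; the generator is already exhausted
print(as_list[0])        # abc; lists can be indexed and reused
```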
For simply deleting elements, especially if the original list is no longer needed, just iterate backwards:
Python 2.x:
for i in xrange(len(L) - 1, -1, -1):
    if not f(L[i]):
        del L[i]
Python 3.x:
for i in range(len(L) - 1, -1, -1):
    if not f(L[i]):
        del L[i]
By iterating from the end, the "next" index does not change after a deletion, so a for loop is possible. Note that you should use xrange in Python 2, or range in Python 3, to save memory*: both produce their indices lazily instead of building a full list (they are lazy sequence objects rather than generators).
In cases where you must iterate forward, use your given solution above.
*Note that CPython 2's xrange only accepts arguments that fit in a native C long, so it can fail for extremely large lengths on some platforms. Python 3's range, as well as the less efficient Python 2 range, do not have this limitation.
I'm from a C++ background so this problem seems a little absurd to me:
Let's suppose I have a function:
def scale(data, factor):
    for val in data:
        val *= factor
This doesn't work as intended: if I pass a list, it changes nothing. But
def scale(data, factor):
    for index, val in enumerate(data):
        data[index] *= factor
and lst = [val * factor for val in lst] work properly.
How does Python handle argument passing? How do I know if the actual reference, or alias is passed?
If you want to mutate the list, you need to assign to its elements by index. This version instead builds a new sequence with map (it could also be written as a list comprehension):
def scale(data, factor):
    return map(lambda x : x*factor, data)
A lambda function is an anonymous function.
>>> (lambda x : x + 1) (5)
6
Here x is bound to 5, so the result is 5 + 1.
So in this case, we traverse the list, applying the function f(x) -> x * factor to every element. The original list is not mutated; instead we return a new version. (On Python 3, map returns a lazy iterator, so wrap it in list() if you need a list.)
In Python, every argument is passed the same way: the function receives a reference to the same object the caller has (sometimes called "pass by assignment" or "call by sharing"). What differs is whether the object is mutable. int, str, bool and so on are immutable, so no operation can change them in place.
Lists, dicts and most class instances are mutable, so changes made through any reference are visible through all of them.
In your example, the problem is how you use the for loop, not the function argument. If you do:
for val in lst:
    val += 1
the values inside lst won't get updated. val starts out referring to the same object as lst[0], lst[1] and so on, but for an immutable value val += 1 creates a new object and rebinds the name val to it; the list slot still refers to the old object.
Second, in your example with enumerate, you assign to data[index], which replaces the element in the actual list.
And finally, in your example with the list comprehension:
lst = [val * factor for val in lst] - here the comprehension loops over every element and creates a new list, which is then bound to the name lst. This is something like a = a + 2, but extended to lists.
This behaviour arises because basic data types such as int are immutable, while lists are mutable. Consider this:
>>> x = 24
>>> x + 1
25
>>> x
24
but on the other hand, with a list:
>>> y = [1, 2, 3, 4]
>>> y.remove(2)
>>> y
[1, 3, 4]
So you should always reassign the result when operating on immutable types, and be careful with mutable objects such as lists: a function that receives one can modify it in place, changing the caller's variable without you noticing.
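The distinction that matters in practice is rebinding a name versus mutating an object. A small sketch, with function names made up for the illustration:

```python
def scale_rebind(data, factor):
    for val in data:
        val *= factor      # rebinds the local name val; the list is untouched

def scale_in_place(data, factor):
    for index, val in enumerate(data):
        data[index] = val * factor   # stores a new object in the list slot

def rebind_argument(data):
    data = [0, 0, 0]       # rebinds the local name only; caller unaffected

nums = [1, 2, 3]
scale_rebind(nums, 10)
print(nums)    # [1, 2, 3]

scale_in_place(nums, 10)
print(nums)    # [10, 20, 30]

rebind_argument(nums)
print(nums)    # [10, 20, 30]

# With mutable elements, an in-place operator does change the shared object.
nested = [[1], [2]]
for inner in nested:
    inner *= 2             # list.__imul__ mutates inner in place
print(nested)  # [[1, 1], [2, 2]]
```

So the function always receives the same list object the caller has; whether the caller sees a change depends on whether the function mutates that object or merely points a local name somewhere else.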
I am using generators to perform searches in lists like this simple example:
>>> a = [1,2,3,4]
>>> (i for i, v in enumerate(a) if v == 4).next()
3
(Just to frame the example a bit, I am using very much longer lists compared to the one above, and the entries are a little bit more complicated than int. I do it this way so the entire lists won't be traversed each time I search them)
Now if I instead changed that to v == 666, it would raise StopIteration, because it can't find any 666 entry in a.
How can I make it return None instead? I could of course wrap it in a try ... except clause, but is there a more pythonic way to do it?
If you are using Python 2.6+ you should use the next built-in function, not the next method (which was replaced with __next__ in 3.x). The next built-in takes an optional default argument to return if the iterator is exhausted, instead of raising StopIteration:
next((i for i, v in enumerate(a) if v == 666), None)
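Wrapped up as reusable helpers, this might look like the sketch below. The names first_index, first_value and _MISSING are hypothetical. The sentinel is only needed when None can itself be a legitimate match, in which case a None default would be ambiguous.

```python
_MISSING = object()   # hypothetical sentinel, distinct from every real value

def first_index(seq, pred):
    """Return the index of the first item satisfying pred, or None."""
    return next((i for i, v in enumerate(seq) if pred(v)), None)

def first_value(seq, pred, default=_MISSING):
    """Return the first item satisfying pred; raise if absent and no default."""
    found = next((v for v in seq if pred(v)), _MISSING)
    if found is not _MISSING:
        return found
    if default is _MISSING:
        raise ValueError('no matching item')
    return default

a = [1, 2, 3, 4]
print(first_index(a, lambda v: v == 4))                # 3
print(first_index(a, lambda v: v == 666))              # None
print(first_value([0, None, 5], lambda v: v is None))  # None (a real match)
```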
You can chain the generator with (None,):
from itertools import chain
a = [1,2,3,4]
print chain((i for i, v in enumerate(a) if v == 6), (None,)).next()
But I think a.index(2) will not traverse the full list either: once 2 is found, the search is finished. You can test this:
>>> import timeit
>>> timeit.timeit("a.index(0)", "a=range(10)")
0.19335955439601094
>>> timeit.timeit("a.index(99)", "a=range(100)")
2.1938486138533335