Speed difference between iterating over generators and lists - python

In the following trivial examples there are two functions that sort a list of random numbers. The first method passes sorted a generator expression, the second method creates a list first:
import random
l = [int(1000*random.random()) for i in xrange(10*6)]
def sort_with_generator():
return sorted(a for a in l)
def sort_with_list():
return sorted([a for a in l])
Benchmarking with line profiler indicates that the second option (sort_with_list) is about twice as fast as the generator expression.
Can anyone explain what's happening, and why the first method is so much slower than the second?

Your first example is a generator expression that iterates over a list. Your second example is a list expression that iterates over a list. Indeed, the second example is slightly faster.
>>> import timeit
>>> timeit("sorted(a for a in l)", setup="import random;l = [int(1000*random.random()) for i in xrange(10*6)]")
5.963912010192871
>>> timeit("sorted([a for a in l])", setup="import random;l = [int(1000*random.random()) for i in xrange(10*6)]")
5.021576881408691
The reason for this is undoubtedly that making a list is done in one go, while iterating over a generator requires function calls.
Generators are not to speed up small lists like this (you have 60 elements in the list, that's very small). It's to save memory when creating long lists, primarily.

If you look at the source for sorted, any sequence you pass in gets copied into a new list first.
newlist = PySequence_List(seq);
generator --> list appears to be slower than list --> list.
>>> timeit.timeit('x = list(l)', setup = 'l = xrange(1000)')
16.656711101531982
>>> timeit.timeit('x = list(l)', setup = 'l = range(1000)')
4.525658845901489
As to why a copy must be made, think about how sorting works. Sorts aren't linear algorithms. We move through the data multiple times, sometimes traversing data in both directions. A generator is intended for producing a sequence through which we iterate once and only once, from start to somewhere after it. A list allows for random access.
On the other hand, creating a list from a generator will mean only one list in memory, while making a copy of a list will mean two lists in memory. Good 'ol fashioned space-time tradeoff.
Python uses Timsort, a hybrid of merge sort and insertion sort.

List expressions, firstly, loads data into a memory. Then doing any operations with resulting list. Let the allocation time is T2 (for second case).
Generator expressions not allocating time at once, but it change iterator value for time t1[i]. Sum of all t1[i] will be T1. T1 ≈ T2.
But when you call sorted(), in the first case time T1 added with time of allocation memory of every pairs compared to sort (tx1[i]). In result, T1 added with sum of all tx1[i].
Therefore, T2 < T1 + sum(tx1[i])

Related

Complexity of built-in methods like list.count(). Do they take O(1) time?

I have two methods to count occurrences of any element. One is using built-in method count and other is using loop.
Time complexity for 2nd method is O(n), but not sure of built-in method.
Does count take time of O(1) or O(n)?Please also tell me about other built-in methods like reverse, index, etc. Using count.
List1 = [10,4,5,10,6,4,10]
print(List1.count(10))
using loop
List2 = [10,4,5,10,6,4,10]
count = 0
for ele in List2:
if (ele == 10):
count += 1
print(count)
As per the documentation
list.count(x) - Return the number of times x appears in the list.
Now think about it: if you have 10 cups over some coloured balls, can you be 100% certain about the number of red balls under the cups before you check under all of the cups?
Hint: No
Therefore, list.count(x) has to check the entire list. As the list has size n, list.count(x) has to be O(n).
EDIT: For the pedantic readers out there, of course there could be an implementation of lists that stores the count of every item. This would lead to an increase in memory usage but would provide the O(1) for list.count(x).
EDIT2: You can have a look at the implementation of list.count here. You will see the for loop that runs exactly n times, definitely answering your question: Built-in methods do not necessarily take O(1) time, list.count(x) being an example of a built-in method which is O(n)
The build in count() method in python is also having time complexity of O(n)
The time complexity of the count(value) method is O(n) for a list with n elements. The standard Python implementation cPython “touches” all elements in the original list to check if they are equal to the value. Thus, the time complexity is linear in the number of list elements.
It's an easy thing to see for yourself.
>>> import timeit
>>> timeit.timeit('x.count(10)', 'x=list(range(100))', number=1000)
0.007884800899773836
>>> timeit.timeit('x.count(10)', 'x=list(range(1000))', number=1000)
0.03378760418854654
>>> timeit.timeit('x.count(10)', 'x=list(range(10000))', number=1000)
0.2234031839761883
>>> timeit.timeit('x.count(10)', 'x=list(range(100000))', number=1000)
2.1812812101561576
Maybe a little better than O(n), but definitely closer to that than O(1).

Most efficient way to convert/unpack an itertools.chain object to an unordered and ordered list

Other than using the list and sorted methods to convert an itertools.chain object to get an unordered and ordered list, respectively, are there more efficient ways of doing the same in python3? I read in this answer that list is for debugging. Is this true?
Below is an example code where I time the processes:
from itertools import chain
from time import time
def foo(n):
for i in range(n):
yield range(n)
def check(n):
# check list method
start = time()
a = list(itertools.chain.from_iterable(foo(n)))
end = time()- start
print('Time for list = ', end)
# check sorted method
start = time()
b = sorted(itertools.chain.from_iterable(foo(n)))
end = time()- start
print('Time for sorted = ', end)
Results:
>>> check(1000)
Time for list = 0.04650092124938965
Time for sorted = 0.08582258224487305
>>> check(10000)
Time for list = 1.615750789642334
Time for sorted = 8.84056806564331
>>>
Other than using the list and sorted methods to convert an itertools.chain object to get an unordered and ordered list, respectively, are there more efficient ways of doing the same in python3?
simple answer: no. When working with the python generators and iterators, the only caveat you have to avoid is to convert a generator into a list, then into a generator, then into a list again etc…
i.e. a chain like that would be stupid:
list(sorted(list(filter(list(map(…
because you would then lose all the added value of the generators.
I read in this answer that list is for debugging. Is this true?
it depends on your context, generally speaking a list() is not for debugging, it's a different way to represent an iterable.
You might want to use a list() if you need to access an element at a given index, or if you want to know the length of the dataset.
You'll want to not use a list() if you can consume the data as it goes.
Think of all the generators/iterator scheme as a way to apply an algorithm for each item as they are available, whereas you work on lists as bulk.
About the question you quote, the question is very specific, and it asks how it can introspect a generator from the REPL, in order to know what is inside. And the advice from the person who answered this is to use the list(chain) only for introspection, but keep as it was originally.
The most efficient way is using list() but if you want to flatten a nested iterable by itertools.chain() or concatenate some iterables and then convert it to list you can just use a nested list comprehension at once. Also the reason that sorted() takes more time is that it's sorting the iterable while list just calls some methods of generator function (like __next__) in order to copy the items to a list object.
Note that in terms of run time the itertools.chain can perform slightly even faster than list comprehension (python 2.x and python 3.x). Here is an example:
In [27]: lst = [range(10000) for _ in range(10000)]
In [28]: %timeit [i for sub in lst for i in sub]
1 loops, best of 3: 3.94 s per loop
In [29]: %timeit list(chain.from_iterable(lst))
1 loops, best of 3: 2.75 s per loop

Complexity of enumerate

I see a lot of questions about the run-time complexity of python's built in methods, and there are a lot of answers for a lot of the methods (e.g. https://wiki.python.org/moin/TimeComplexity , https://www.ics.uci.edu/~pattis/ICS-33/lectures/complexitypython.txt , Cost of len() function , etc.)
What I don't see anything that addresses enumerate. I know it returns at least one new array (the indexes) but how long does it take to generate that and is the other array just the original array?
In other words, I'm assuming it's O(n) for creating a new array (iteration) and O(1) for the reuse of the original array...O(n) in total (I think). Is the another O(n) for the copy making it O(n^2), or something else...?
The enumerate-function returns an iterator. The concept of an iterator is described here.
Basically this means that the iterator gets initialized pointing to the first item of the list and then returning the next element of the list every time its next() method gets called.
So the complexity should be:
Initialization: O(1)
Returning the next element: O(1)
Returning all elements: n * O(1)
Please note that enumerate does NOT create a new data structure (list of tuples or something like that)! It is just iterating over the existing list, keeping the element index in mind.
You can try this out by yourself:
# First, create a list containing a lot of entries:
# (O(n) - executing this line should take some noticeable time)
a = [str(i) for i in range(10000000)] # a = ["0", "1", ..., "9999999"]
# Then call the enumeration function for a.
# (O(1) - executes very fast because that's just the initialization of the iterator.)
b = enumeration(a)
# use the iterator
# (O(n) - retrieving the next element is O(1) and there are n elements in the list.)
for i in b:
pass # do nothing
Assuming the naïve approach (enumerate duplicates the array, then iterates over it), you have O(n) time for duplicating the array, then O(n) time for iterating over it. If that was just n instead of O(n), you would have 2 * n time total, but that's not how O(n) works; all you know is that the amount of time it takes will be some multiple of n. That's (basically) what O(n) means anyway, so in any case, the enumerate function is O(n) time total.
As martineau pointed out, enumerate() does not make a copy of the array. Instead it returns an object which you use to iterate over the array. The call to enumerate() itself is O(1).

Which gives better performance: len(a[:-1]) or len(a)-1

In Python which is better in terms of performance:
1)
for i in range(len(a[:-1])):
foo()
or
2)
for i in range(len(a)-1):
foo()
UPDATE:
Some context on why I'm looping over indices (non-idiomatic?):
for c in reversed(range(len(self._N)-1)):
D[c] = np.dot(self._W[c], D[c-1])*A[c]*(1-A[c])
The second one is better, two reasons:
The first one created a new list a[:-1], which takes up unnecessary time & memory, the second one didn't create a new list.
The second one is more intuitive and clear.
[:] returns a shallow copy of a list. it means that every slice notation returns a list which have new address in memory, but its elements would have same addresses that elements of source list have.
range(0, len(a) - 1) should be more performant of the 2 options. The other makes an unnecessary copy of the sequence (sans 1 element). For looping like this, you might even get a performance boost by using xrange on python2.x as that avoids materializing a list from the range call.
Of course, you usually don't need to loop over indices in python -- There's usually a better way.
The expression a[:-1] creates a new list for which you take the length and discard the list. This is O(n) time and space complexity, compared to O(1) time and space complexity for the second example.
neither is the right way
for i,item in enumerate(a[:-1]):
...
is how you should probably do it ...
in general in python you should never do
for i in range(len(a_list)): ...
here are the timing differences
>>> timeit.timeit("for i in range(len(a)-1):x=i","a=range(1000000)",number=100)
3.30806345671283
>>> timeit.timeit("for i in enumerate(a[:-1]):x=i","a=range(1000000)",number=100
5.3319918613661201
as you can see it takes an extra 2 seconds to do 100 times on a large list of integers ... but its still a better idea imho

Is there any built-in way to get the length of an iterable in python?

For example, files, in Python, are iterable - they iterate over the lines in the file. I want to count the number of lines.
One quick way is to do this:
lines = len(list(open(fname)))
However, this loads the whole file into memory (at once). This rather defeats the purpose of an iterator (which only needs to keep the current line in memory).
This doesn't work:
lines = len(line for line in open(fname))
as generators don't have a length.
Is there any way to do this short of defining a count function?
def count(i):
c = 0
for el in i: c += 1
return c
To clarify, I understand that the whole file will have to be read! I just don't want it in memory all at once
Short of iterating through the iterable and counting the number of iterations, no. That's what makes it an iterable and not a list. This isn't really even a python-specific problem. Look at the classic linked-list data structure. Finding the length is an O(n) operation that involves iterating the whole list to find the number of elements.
As mcrute mentioned above, you can probably reduce your function to:
def count_iterable(i):
return sum(1 for e in i)
Of course, if you're defining your own iterable object you can always implement __len__ yourself and keep an element count somewhere.
If you need a count of lines you can do this, I don't know of any better way to do it:
line_count = sum(1 for line in open("yourfile.txt"))
The cardinality package provides an efficient count() function and some related functions to count and check the size of any iterable: http://cardinality.readthedocs.org/
import cardinality
it = some_iterable(...)
print(cardinality.count(it))
Internally it uses enumerate() and collections.deque() to move all the actual looping and counting logic to the C level, resulting in a considerable speedup over for loops in Python.
I've used this redefinition for some time now:
def len(thingy):
try:
return thingy.__len__()
except AttributeError:
return sum(1 for item in iter(thingy))
It turns out there is an implemented solution for this common problem. Consider using the ilen() function from more_itertools.
more_itertools.ilen(iterable)
An example of printing a number of lines in a file (we use the with statement to safely handle closing files):
# Example
import more_itertools
with open("foo.py", "r+") as f:
print(more_itertools.ilen(f))
# Output: 433
This example returns the same result as solutions presented earlier for totaling lines in a file:
# Equivalent code
with open("foo.py", "r+") as f:
print(sum(1 for line in f))
# Output: 433
Absolutely not, for the simple reason that iterables are not guaranteed to be finite.
Consider this perfectly legal generator function:
def forever():
while True:
yield "I will run forever"
Attempting to calculate the length of this function with len([x for x in forever()]) will clearly not work.
As you noted, much of the purpose of iterators/generators is to be able to work on a large dataset without loading it all into memory. The fact that you can't get an immediate length should be considered a tradeoff.
Because apparently the duplication wasn't noticed at the time, I'll post an extract from my answer to the duplicate here as well:
There is a way to perform meaningfully faster than sum(1 for i in it) when the iterable may be long (and not meaningfully slower when the iterable is short), while maintaining fixed memory overhead behavior (unlike len(list(it))) to avoid swap thrashing and reallocation overhead for larger inputs.
# On Python 2 only, get zip that lazily generates results instead of returning list
from future_builtins import zip
from collections import deque
from itertools import count
def ilen(it):
# Make a stateful counting iterator
cnt = count()
# zip it with the input iterator, then drain until input exhausted at C level
deque(zip(it, cnt), 0) # cnt must be second zip arg to avoid advancing too far
# Since count 0 based, the next value is the count
return next(cnt)
Like len(list(it)), ilen(it) performs the loop in C code on CPython (deque, count and zip are all implemented in C); avoiding byte code execution per loop is usually the key to performance in CPython.
Rather than repeat all the performance numbers here, I'll just point you to my answer with the full perf details.
For filtering, this variation can be used:
sum(is_good(item) for item in iterable)
which can be naturally read as "count good items" and is shorter and simpler (although perhaps less idiomatic) than:
sum(1 for item in iterable if is_good(item)))
Note: The fact that True evaluates to 1 in numeric contexts is specified in the docs
(https://docs.python.org/3.6/library/stdtypes.html#boolean-values), so this coercion is not a hack (as opposed to some other languages like C/C++).
We'll, if you think about it, how do you propose you find the number of lines in a file without reading the whole file for newlines? Sure, you can find the size of the file, and if you can gurantee that the length of a line is x, you can get the number of lines in a file. But unless you have some kind of constraint, I fail to see how this can work at all. Also, since iterables can be infinitely long...
I did a test between the two common procedures in some code of mine, which finds how many graphs on n vertices there are, to see which method of counting elements of a generated list goes faster. Sage has a generator graphs(n) which generates all graphs on n vertices. I created two functions which obtain the length of a list obtained by an iterator in two different ways and timed each of them (averaging over 100 test runs) using the time.time() function. The functions were as follows:
def test_code_list(n):
l = graphs(n)
return len(list(l))
and
def test_code_sum(n):
S = sum(1 for _ in graphs(n))
return S
Now I time each method
import time
t0 = time.time()
for i in range(100):
test_code_list(5)
t1 = time.time()
avg_time = (t1-t0)/10
print 'average list method time = %s' % avg_time
t0 = time.time()
for i in range(100):
test_code_sum(5)
t1 = time.time()
avg_time = (t1-t0)/100
print "average sum method time = %s" % avg_time
average list method time = 0.0391882109642
average sum method time = 0.0418473792076
So computing the number of graphs on n=5 vertices this way, the list method is slightly faster (although 100 test runs isn't a great sample size). But when I increased the length of the list being computed by trying graphs on n=7 vertices (i.e. changing graphs(5) to graphs(7)), the result was this:
average list method time = 4.14753051996
average sum method time = 3.96504004002
In this case the sum method was slightly faster. All in all, the two methods are approximately the same speed but the difference MIGHT depend on the length of your list (it might also just be that I only averaged over 100 test runs, which isn't very high -- would have taken forever otherwise).

Categories