Just some Python code for an example:
from timeit import default_timer as timer  # assuming the usual default_timer alias for timer

nums = [1, 2, 3]

start = timer()
for i in range(len(nums)):
    print(nums[i])
end = timer()
print(end - start)  # computed to 0.0697546862831

start = timer()
print(nums[0])
print(nums[1])
print(nums[2])
end = timer()
print(end - start)  # computed to 0.0167170338524
I can grasp that some extra time will be taken in the loop because the value of i must be incremented a few times, but the difference between the running times of these two approaches seems a lot bigger than I expected. Is there something else happening under the hood that I'm not considering?
Short answer: the loop itself isn't significantly slower, unless the loop is very small. The for loop has a small overhead, but the way you're doing it is inefficient. By using range(len(nums)) you're creating another sequence (a list in Python 2, a lazy range object in Python 3) and iterating through that, then doing the same index lookups anyway. Try this:
for i in nums:
    print(i)
Results for me were as expected:
>>> import timeit
>>> timeit.timeit('nums[0];nums[1];nums[2]', setup='nums = [1,2,3]')
0.10711812973022461
>>> timeit.timeit('for i in nums:pass', setup='nums = [1,2,3]')
0.13474011421203613
>>> timeit.timeit('for i in range(len(nums)):pass', setup='nums = [1,2,3]')
0.42371487617492676
With a bigger list the advantage of the loop becomes apparent, because the incremental cost of accessing an element by index outweighs the one-off cost of the loop:
>>> timeit.timeit('for i in nums:pass', setup='nums = range(0,100)')
1.541944980621338
>>> timeit.timeit(';'.join('nums[%s]' % i for i in range(0,100)), setup='nums = range(0,100)')
2.5244338512420654
In Python 3, which puts a greater emphasis on iterators over indexable lists, the difference is even greater:
>>> timeit.timeit('for i in nums:pass', setup='nums = range(0,100)')
1.6542046590038808
>>> timeit.timeit(';'.join('nums[%s]' % i for i in range(0,100)), setup='nums = range(0,100)')
10.331634456000756
With such a small array you're probably measuring noise first, and then the overhead of calling range(). Note that range not only has to increment a variable a few times, it also creates an object that holds its state (the current value) — in Python 3 a lazy range object from which an iterator is built, in Python 2 an actual list. The function call and object creation are two things you don't pay for in the second example, and for very short iterations they will probably dwarf three array accesses.
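You can see those extra objects interactively (Python 3 shown here; in Python 2 range() would build an actual list instead):
>>> nums = [1, 2, 3]
>>> r = range(len(nums))   # a function call that creates a new object
>>> r
range(0, 3)
>>> it = iter(r)           # looping creates yet another object to track the position
>>> next(it), next(it), next(it)
(0, 1, 2)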
Essentially your second snippet does loop unrolling, which is a viable and frequent technique of speeding up performance-critical code.
A for loop has a cost in any case, and the one you wrote is especially costly. Here are four versions, using timeit to measure the time:
from timeit import timeit

NUMS = [1, 2, 3]

def one():
    for i in range(len(NUMS)):
        NUMS[i]

def one_no_access():
    for i in range(len(NUMS)):
        i

def two():
    NUMS[0]
    NUMS[1]
    NUMS[2]

def three():
    for i in NUMS:
        i

for func in (one, one_no_access, two, three):
    print(func.__name__ + ':', timeit(func))
Here are the measured times:
one: 1.0467438200000743
one_no_access: 0.8853238560000136
two: 0.3143197629999577
three: 0.3478466749998006
The one_no_access version shows the cost of the expression range(len(NUMS)).
Since lists in Python are stored contiguously in memory, random access to an element is O(1), which explains why two is the quickest.
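If you want to convince yourself of the O(1) claim, a quick sketch: indexing near the front and near the back of a large list takes roughly the same time (absolute numbers depend on your machine):
from timeit import timeit

BIG = list(range(1000000))

print(timeit("BIG[0]", globals=globals()))       # access near the start
print(timeit("BIG[999999]", globals=globals()))  # access near the end: about the same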
Here's an example of initializing an array of ten million random numbers, using a list (a) and using a tuple-like generator (b). The result is exactly the same, and the list or tuple is never used afterwards, so there's no practical advantage to one or the other:
from random import randint
from array import array
a = array('H', [randint(1, 100) for _ in range(0, 10000000)])
b = array('H', (randint(1, 100) for _ in range(0, 10000000)))
So the question is which one to use. In principle, my understanding is that a tuple should be able to get away with using less resources than a list, but since this list and tuple are not kept, it should be possible for the code to execute without ever initializing the intermediate data structure… My tests indicate that the list is slightly faster in this case. I can only imagine that this is because the Python implementation has more optimization around lists than tuples. Can I expect this to be consistent?
More generally, should I use one or the other, and why? (Or should I do this kind of initialization some other way completely?)
Update: Answers and comments made me realize that the b example is not actually a tuple but a generator, so I edited a bit in the headline and the text above to reflect that. Also I tried splitting the list version into two lines like this, which should force the list to actually be instantiated:
g = [randint(1, 100) for _ in range(0, 10000000)]
a = array('H', g)
It appears to make no difference. The list version takes about 8.5 seconds, and the generator version takes about 9 seconds.
Although it looks like it, (randint(1, 100) for _ in range(0, 1000000)) is not a tuple, it's a generator:
>>> type((randint(1, 100) for _ in range(0, 1000000)))
<class 'generator'>
>>>
If you really want a tuple, use:
b = array('H', tuple(randint(1, 100) for _ in range(0, 1000000)))
The list being a bit faster than the generator makes sense, since the generator generates the next value when asked, one at a time, while the list comprehension allocates all the memory needed and then proceeds to fill it with values all in one go. That optimisation for speed is paid for in memory space.
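A rough way to see the space side of that trade-off is sys.getsizeof, which reports only the size of the container object itself (for the list that means the stored references, not the integers; the numbers are CPython-specific and approximate):
import sys
from random import randint

lst = [randint(1, 100) for _ in range(1000000)]
gen = (randint(1, 100) for _ in range(1000000))

print(sys.getsizeof(lst))  # several megabytes: every reference is stored up front
print(sys.getsizeof(gen))  # a couple of hundred bytes: just the generator's state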
I'd favour the generator, since it will work regardless of most reasonable memory restrictions and would work for any number of random numbers, while the speedup of the list is minimal. Unless you need to generate this list again and again, at which time the speedup would start to count - but then you'd probably use the same copy of the list each time to begin with.
[randint(1, 100) for _ in range(0, 10000000)]
This is a list comprehension. Every element is evaluated in a tight loop and put together into a list, so it is generally faster but takes more RAM (everything comes out at once).
(randint(1, 100) for _ in range(0, 10000000))
This is a generator expression. No element is evaluated at this point, and one of them comes out at a time when you call next() on the resulting generator. It's slower but takes a consistent (small) amount of memory.
As given in the other answer, if you want a tuple, you should convert either into one:
tuple([randint(1, 100) for _ in range(0, 10000000)])
tuple(randint(1, 100) for _ in range(0, 10000000))
Let's come back to your question:
When to use which?
In general, if you use a list comprehension or generator expression as an initializer of another sequential data structure (list, array, etc.), it makes no difference except for the memory-time tradeoff mentioned above. The things you need to consider are as simple as performance and memory budget: prefer the list comprehension if you need more speed (or write a C program to be absolutely fast), or the generator expression if you need to keep memory consumption low.
If you plan to reuse the resulting sequence, things start to get interesting.
A list is strictly a list, and can for all purposes be used as a list:
a = [i for i in range(5)]
a[3]  # 3
a.append(5)  # a = [0, 1, 2, 3, 4, 5]

for _ in a:
    print("Hello")
# Prints 6 lines in total

for _ in a:
    print("Bye")
# Prints another 6 lines

b = list(reversed(a))  # b = [5, 4, 3, 2, 1, 0]
A generator can only be used once.
a = (i for i in range(5))
a[3]  # TypeError: generator object isn't subscriptable
a.append(5)  # AttributeError: generator has no attribute 'append'

for _ in a:
    print("Hello")
# Prints 5 lines in total

for _ in a:
    print("Bye")
# Nothing this time, because
# the generator has already been consumed

b = list(reversed(a))  # TypeError: generator isn't reversible
The final answer is: Know what you want to do, and find the appropriate data structure for it.
I expected that in the case of multiple loops, list iteration would be much faster than using a generator, but my code suggests this is false.
My understanding is (by operation I mean any expression defining an element):
a list requires n operations to be initialized
but then every loop over the list is just grabbing an element from memory
thus, m loops over a list require only n operations
a generator does not require any operations to be initialized
however, looping over a generator runs the operations on the fly
thus, one loop over a generator requires n operations
but m loops over a generator require n x m operations
And I checked my expectations using the following code:
from timeit import timeit

def pow2_list(n):
    """Return a list with powers of 2"""
    results = []
    for i in range(n):
        results.append(2**i)
    return results

def pow2_gen(n):
    """Generator of powers of 2"""
    for i in range(n):
        yield 2**i

def loop(iterator, n=1000):
    """Loop n times over iterable object"""
    for _ in range(n):
        for _ in iterator:
            pass

l = pow2_list(1000)  # point to a list
g = pow2_gen(1000)   # point to a generator

time_list = \
    timeit("loop(l)", setup="from __main__ import loop, l", number=10)
time_gen = \
    timeit("loop(g)", setup="from __main__ import loop, g", number=10)

print("Loops over list took: ", time_list)
print("Loops over generator took: ", time_gen)
And the results surprised me...
Loops over list took: 0.20484769299946493
Loops over generator took: 0.0019217690005461918
Somehow using generators appears much faster than lists, even when looping over 1000 times. And in this case we are talking about two orders of magnitude! Why?
EDIT:
Thanks for the answers. Now I see my mistake. I wrongly assumed that a generator starts from the beginning on each new loop, like range does:
>>> x = range(10)
>>> sum(x)
45
>>> sum(x)
45
But this was naive (range is not a generator...).
Regarding possible duplicate comment: my problem concerned multiple loops over generator, which is not explained in the other thread.
Your generator is actually only looping once. Once created with pow2_gen, g stores a generator; the very first time through loop, this generator is consumed and raises StopIteration. On the other passes through loop, next(g) (or g.next() in Python 2) just continues to raise StopIteration, so, in effect, g represents an empty sequence.
To make the comparison more fair, you would need to re-create the generator every time you loop.
A further difficulty with the way you've approached this is that you're calling append to build your list, which is likely the very slowest way to construct a list. More often, lists are built with list comprehensions.
The following code lets us pick apart the timing a bit more carefully. create_list and create_gen create lists and generators, respectively, using list comprehension and generator expressions. time_loop is like your loop method, while time_apply is a version of loop that re-creates the iterable each time through the loop.
from timeit import timeit

def create_list(n=1000):
    return [2**i for i in range(n)]

def create_gen(n=1000):
    return (2**i for i in range(n))

def time_loop(iterator, n=1000):
    for t in range(n):
        for v in iterator:
            pass

def time_apply(create_fn, fn_arg, n=1000):
    for t in range(n):
        iterator = create_fn(fn_arg)
        time_loop(iterator, 1)

print('time_loop(create_list): %.3f' % timeit("time_loop(create_list(1000))",
                                              setup="from __main__ import *",
                                              number=10))
print('time_loop(create_gen): %.3f' % timeit("time_loop(create_gen(1000))",
                                             setup="from __main__ import *",
                                             number=10))
print('time_apply(create_list): %.3f' % timeit("time_apply(create_list, 1000)",
                                               setup="from __main__ import *",
                                               number=10))
print('time_apply(create_gen): %.3f' % timeit("time_apply(create_gen, 1000)",
                                              setup="from __main__ import *",
                                              number=10))
Results on my box suggest that building a list (time_apply(create_list)) is similar in time to (or maybe even faster than) building a generator (time_apply(create_gen)).
time_loop(create_list): 0.244
time_loop(create_gen): 0.028
time_apply(create_list): 21.190
time_apply(create_gen): 21.555
You can see the same effect you've documented in your question, which is that time_loop(create_gen) is an order of magnitude faster than time_loop(create_list). Again, this is because the generator created is only being iterated over once, rather than the many loops over the list.
As you hypothesise, building a list once and iterating over it many times (time_loop(create_list)) is faster than iterating over a generator many times (time_apply(create_gen)) in this particular scenario.
The trade-off between list and generator is going to be strongly dependent on how big the iterator you're creating is. With 1000 items, I would expect lists to be pretty fast. With 100,000 items, things might look different.
print('create big list: %.3f' % timeit("l = create_list(100000)",
                                       setup="from __main__ import *",
                                       number=10))
print('create big gen: %.3f' % timeit("g = create_gen(100000)",
                                      setup="from __main__ import *",
                                      number=10))
Here I'm getting:
create big list: 209.748
create big gen: 0.023
Python uses up between 700 and 800 MB of memory building the big list; the generator uses almost nothing at all. Memory allocation and garbage cleanup are computationally expensive in Python, and predictably make your code slow; generators are a very simple way to avoid gobbling up your machine's RAM, and can make a big difference to runtime.
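If you want to put numbers on the memory side yourself, a rough sketch with the standard tracemalloc module works (peak values will vary by interpreter and platform; the helper name peak_of is just for this illustration):
import tracemalloc

def peak_of(build):
    """Return the peak traced memory (in bytes) while building and consuming an iterable."""
    tracemalloc.start()
    data = build()
    for _ in data:      # consume it so both versions do the same amount of work
        pass
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak

print(peak_of(lambda: [2**i for i in range(10000)]))  # peak grows with the whole list
print(peak_of(lambda: (2**i for i in range(10000))))  # peak stays small: one value at a time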
There is a problem with your test: a generator is not reusable. Once exhausted it cannot be used again, and a new one must be created. For example:
l = [0, 1, 2, 4, 5]
g = iter(l)  # creates an iterator over the list (it behaves like a generator here)

sum_list0 = sum(l)
sum_list1 = sum(l)
assert sum_list0 == sum_list1  # all working normally

sum_gen0 = sum(g)  # consumes the iterator
sum_gen1 = sum(g)  # sum of an exhausted iterator is 0
assert sum_gen0 == sum_list1  # result is correct
assert sum_gen1 == sum_list1, "second result was incorrect"  # fails, because the iterator was exhausted
For your test to work you must recreate the generator afresh in the statement you pass to timeit.
from timeit import timeit

n = 1000
repeats = 10000

list_powers = [2**i for i in range(n)]

def gen_powers():
    for i in range(n):
        yield 2**i

time_list = timeit("min(list_powers)", globals=globals(), number=repeats)
time_gen = timeit("min(gen_powers())", globals=globals(), number=repeats)

print("Loops over list took: ", time_list)
print("Loops over generator took: ", time_gen)
gives:
Loops over list took: 0.24689035064701784
Loops over generator took: 13.551637053904571
Now the generator is two orders of magnitude slower than the list. This is to be expected, as the size of the sequence is small compared to the number of iterations over it. If n were large, list creation would become slower, because of how lists are expanded when new items are appended and because the final size is not known when the list is created. Increasing the number of iterations speeds up the list relative to the generator, since the work the generator has to do grows with the number of iterations while the list's creation cost stays constant. Since n is only 1000 (small) and repeats dominates n, the generator is slower.
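You can watch that expansion happen with sys.getsizeof while appending; the exact growth pattern is a CPython implementation detail, so the printed sizes may differ between versions:
import sys

lst = []
last = sys.getsizeof(lst)
print(0, last)
for i in range(20):
    lst.append(i)
    size = sys.getsizeof(lst)
    if size != last:          # the size jumped: the list over-allocated new capacity
        print(len(lst), size)
        last = size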
Your test does not work because your generator is exhausted on the first pass through loop(). This is one of the advantages of lists over generators: you can iterate over them multiple times (at the expense of storing the full list in memory).
Here is an illustration of this. I'm using a generator expression and list comprehension (which is more optimized than using append in a for loop) but the concept is the same:
>>> gen = (i for i in range(3))
>>> for n in range(2):
...     for i in gen:
...         print(i)
...
0 # 1st print
1
2 # after one loop the iterator is exhausted
>>>
>>> lst = [x for x in range(3)]
>>> for n in range(2):
...     for i in lst:
...         print(i)
...
0 # 1st print
1
2
0 # 2nd print
1
2
>>>
For an equivalent test you should rebuild the generator after each iteration of the outer loop:
>>> for n in range(2):
...     gen = (i for i in range(3))
...     for i in gen:
...         print(i)
...
0 # 1st print
1
2
0 # 2nd print
1
2
>>>
How to best write a Python function (check_list) to efficiently test if an element (x) occurs at least n times in a list (l)?
My first thought was:
def check_list(l, x, n):
    return l.count(x) >= n
But this doesn't short-circuit once x has been found n times, and it always scans the whole list, i.e. it is O(len(l)).
A simple approach that does short-circuit would be:
def check_list(l, x, n):
    count = 0
    for item in l:
        if item == x:
            count += 1
            if count == n:
                return True
    return False
I also have a more compact short-circuiting solution with a generator:
def check_list(l, x, n):
    gen = (1 for item in l if item == x)
    return all(next(gen, 0) for i in range(n))
Are there other good solutions? What is the best efficient approach?
Thank you
Instead of incurring extra overhead with the setup of a range object and using all which has to test the truthiness of each item, you could use itertools.islice to advance the generator n steps ahead, and then return the next item in the slice if the slice exists or a default False if not:
from itertools import islice
def check_list(lst, x, n):
    gen = (True for i in lst if i == x)
    return next(islice(gen, n-1, None), False)
Note that like list.count, itertools.islice also runs at C speed. And this has the extra advantage of handling iterables that are not lists.
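That last point is easy to demonstrate: the same check_list works on a plain generator, which has no .count() method at all (a small usage sketch):
letters = (ch for ch in "abracadabra")   # a generator, not a list
print(check_list(letters, "a", 5))       # True: 'a' occurs five times
print(check_list((ch for ch in "abracadabra"), "a", 6))  # False: only five 'a's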
Some timing:
In [1]: from itertools import islice
In [2]: from random import randrange
In [3]: lst = [randrange(1,10) for i in range(100000)]
In [5]: %%timeit # using list.index
....: check_list(lst, 5, 1000)
....:
1000 loops, best of 3: 736 µs per loop
In [7]: %%timeit # islice
....: check_list(lst, 5, 1000)
....:
1000 loops, best of 3: 662 µs per loop
In [9]: %%timeit # using list.index
....: check_list(lst, 5, 10000)
....:
100 loops, best of 3: 7.6 ms per loop
In [11]: %%timeit # islice
....: check_list(lst, 5, 10000)
....:
100 loops, best of 3: 6.7 ms per loop
You could use the second argument of index to find the subsequent indices of occurrences:
def check_list(l, x, n):
    i = 0
    try:
        for _ in range(n):
            i = l.index(x, i) + 1
        return True
    except ValueError:
        return False
print( check_list([1,3,2,3,4,0,8,3,7,3,1,1,0], 3, 4) )
About index arguments
The official documentation does not mention the method's second and third arguments in the Python Tutorial, section 5, but you can find them in the more comprehensive Python Standard Library reference, section 4.6:
s.index(x[, i[, j]]) index of the first occurrence of x in s (at or after index i and before index j) (8)
(8) index raises ValueError when x is not found in s. When supported, the additional arguments to the index method allow efficient searching of subsections of the sequence. Passing the extra arguments is roughly equivalent to using s[i:j].index(x), only without copying any data and with the returned index being relative to the start of the sequence rather than the start of the slice.
Performance Comparison
In comparing this list.index method with the islice(gen) method, the most important factor is the distance between the occurrences to be found. Once that distance is on average 13 or more, the list.index has a better performance. For lower distances, the fastest method also depends on the number of occurrences to find. The more occurrences to find, the sooner the islice(gen) method outperforms list.index in terms of average distance: this gain fades out when the number of occurrences becomes really large.
The following graph draws the (approximate) border line, at which both methods perform equally well (the X-axis is logarithmic):
Ultimately short circuiting is the way to go if you expect a significant number of cases will lead to early termination. Let's explore the possibilities:
Take the case of the list.index method versus the list.count method (these were the two fastest according to my testing, although ymmv)
For list.index, the method is called up to n times if the list contains n or more occurrences of x. Within each list.index call, execution is very fast, allowing much faster iteration than the custom generator. If the occurrences of x are far enough apart, a large speedup will be seen from the lower-level execution of index. If instances of x are close together (shorter list / more common x's), much more of the time is spent executing the slower Python code that mediates the rest of the function (looping over n and incrementing i).
The benefit of list.count is that it does all of the heavy lifting outside of slow Python execution. It is a much easier function to analyse, as it is simply O(n) in the length of the list. Because it spends almost none of its time in the Python interpreter, it is almost guaranteed to be faster for short lists.
Summary of selection criteria:
shorter lists favor list.count
lists of any length that don't have a high probability to short circuit favor list.count
lists that are long and likely to short circuit favor list.index
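If you want to check those criteria against your own data, a hedged sketch along these lines will do; check_count and check_index below are just the count-based and index-based versions already shown in this thread, and the cut-off points will differ from machine to machine:
from timeit import timeit
from random import randrange

def check_count(l, x, n):
    return l.count(x) >= n

def check_index(l, x, n):
    i = 0
    try:
        for _ in range(n):
            i = l.index(x, i) + 1
        return True
    except ValueError:
        return False

lst = [randrange(1, 10) for _ in range(100000)]   # occurrences of any value are close together

print(timeit("check_count(lst, 5, 100)", globals=globals(), number=1000))
print(timeit("check_index(lst, 5, 100)", globals=globals(), number=1000))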
I would recommend using Counter from the collections module.
from collections import Counter
import numpy as np
%%time
[k for k,v in Counter(np.random.randint(0,10000,10000000)).items() if v>1100]
#Output:
Wall time: 2.83 s
[1848, 1996, 2461, 4481, 4522, 5844, 7362, 7892, 9671, 9705]
This shows another way of doing it.
Sort the list.
Find the index of the first occurrence of the item.
Increase the index by one less than the number of times the item must occur. (n - 1)
Find if the element at that index is the same as the item you want to find.
def check_list(l, x, n):
    _l = sorted(l)
    try:
        index_1 = _l.index(x)
        return _l[index_1 + n - 1] == x
    except (ValueError, IndexError):  # ValueError: x does not occur in the list at all
        return False
# assumes l is the list, k the element to look for, and n the required count
c = 0
for i in l:
    if i == k:
        c += 1
if c >= n:
    print("true")
else:
    print("false")
Another possibility might be:
def check_list(l, x, n):
    return sum([1 for i in l if i == x]) >= n
What is the fasted way to get a sorted, unique list in python? (I have a list of hashable things, and want to have something I can iterate over - doesn't matter whether the list is modified in place, or I get a new list, or an iterable. In my concrete use case, I'm doing this with a throwaway list, so in place would be more memory efficient.)
I've seen solutions like
input = [5, 4, 2, 8, 4, 2, 1]
sorted(set(input))
but it seems to me that first checking for uniqueness and then sorting is wasteful (since when you sort the list, you basically have to determine insertion points, and thus get the uniqueness test as a side effect). Maybe there is something more along the lines of unix's
cat list | sort | uniq
that just picks out consecutive duplications in an already sorted list?
Note that in the question 'Fastest way to uniqify a list in Python' the list is not sorted, and 'What is the cleanest way to do a sort plus uniq on a Python list?' asks for the cleanest / most pythonic way; the accepted answer there suggests sorted(set(input)), which I'm trying to improve on.
I believe sorted(set(sequence)) is the fastest way of doing it.
Yes, set iterates over the sequence, but that's a C-level loop, which is a lot faster than any looping you would do at the Python level.
Note that even with groupby you still have O(n) + O(nlogn) = O(nlogn), and what's worse is that groupby requires a Python-level loop, which dramatically increases the constants in that O(n), so in the end you obtain worse results.
When speaking of CPython, the way to optimize things is to do as much as you can at the C level (see this answer for another example of counter-intuitive performance). To get a faster solution you would have to reimplement the sort in a C extension. And even then, good luck obtaining something as fast as Python's Timsort!
A small comparison of the "canonical solution" versus the groupby solution:
>>> import timeit
>>> sequence = list(range(500)) + list(range(700)) + list(range(1000))
>>> timeit.timeit('sorted(set(sequence))', 'from __main__ import sequence', number=1000)
0.11532402038574219
>>> import itertools
>>> def my_sort(seq):
...     return list(k for k,_ in itertools.groupby(sorted(seq)))
...
>>> timeit.timeit('my_sort(sequence)', 'from __main__ import sequence, my_sort', number=1000)
0.3162040710449219
As you can see it's 3 times slower.
The version provided by jdm is actually even worse:
>>> def make_unique(lst):
...     if len(lst) <= 1:
...         return lst
...     last = lst[-1]
...     for i in range(len(lst) - 2, -1, -1):
...         item = lst[i]
...         if item == last:
...             del lst[i]
...         else:
...             last = item
...
>>> def my_sort2(seq):
...     make_unique(sorted(seq))
...
>>> timeit.timeit('my_sort2(sequence)', 'from __main__ import sequence, my_sort2', number=1000)
0.46814608573913574
Almost 5 times slower.
Note that seq.sort() followed by make_unique(seq), and make_unique(sorted(seq)), are actually the same thing: since Timsort uses O(n) space you always have some reallocation anyway, so using sorted(seq) does not change the timings much.
jdm's benchmarks give different results because the inputs he is using are way too small, so all of the time is taken by the time.clock() calls.
Maybe this is not the answer you are searching for, but anyway, you should take this into your consideration.
Basically, you have 2 operations on a list:
unique_list = set(your_list) # O(n) complexity
sorted_list = sorted(unique_list) # O(nlogn) complexity
Now, you say "it seems to me that first checking for uniqueness and then sorting is wasteful", and you are right. But how bad is that redundant step really? Take n = 1000000:
# sorted(set(a_list))
O(n)     => 1000000
O(nlogn) => 1000000 * 20 = 20000000
Total    => 21000000

# Your fastest way
O(nlogn) => 20000000
Total    => 20000000
Speed gain: (1 - 20000000/21000000) * 100 = 4.76 %
For n = 5000000, speed gain: ~1.6 %
Now, is that optimization worth it?
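If you would rather measure than estimate, a rough sketch: time the sort alone (a lower bound for any combined sort-plus-uniq) against sorted(set(...)) on data that is mostly unique; the gap will shrink or grow with the duplicate ratio:
from timeit import timeit
from random import randrange

data = [randrange(1000000) for _ in range(1000000)]   # mostly unique values

print(timeit("sorted(data)", globals=globals(), number=10))       # the sort alone
print(timeit("sorted(set(data))", globals=globals(), number=10))  # with the extra set step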
This is just something I whipped up in a couple minutes. The function modifies a list in place, and removes consecutive repeats:
def make_unique(lst):
    if len(lst) <= 1:
        return lst
    last = lst[-1]
    for i in range(len(lst) - 2, -1, -1):
        item = lst[i]
        if item == last:
            del lst[i]
        else:
            last = item
Some representative input data:
inp = [
(u"Tomato", "de"), (u"Cherry", "en"), (u"Watermelon", None), (u"Apple", None),
(u"Cucumber", "de"), (u"Lettuce", "de"), (u"Tomato", None), (u"Banana", None),
(u"Squash", "en"), (u"Rubarb", "de"), (u"Lemon", None),
]
Make sure both variants work as wanted:
print inp
print sorted(set(inp))
# copy because we want to modify it in place
inp1 = inp[:]
inp1.sort()
make_unique(inp1)
print inp1
Now to the testing. I'm not using timeit, since I don't want to time the copying of the list, only the sorting. time1 is sorted(set(...)), time2 is list.sort() followed by make_unique, and time3 is the solution with itertools.groupby by Avinash Y.
import time

def time1(number):
    total = 0
    for i in range(number):
        start = time.clock()
        sorted(set(inp))
        total += time.clock() - start
    return total

def time2(number):
    total = 0
    for i in range(number):
        inp1 = inp[:]
        start = time.clock()
        inp1.sort()
        make_unique(inp1)
        total += time.clock() - start
    return total

import itertools

def time3(number):
    total = 0
    for i in range(number):
        start = time.clock()
        list(k for k, _ in itertools.groupby(sorted(inp)))
        total += time.clock() - start
    return total
sort + make_unique is approximately as fast as sorted(set(...)). I'd have to do a couple more iterations to see which one is potentially faster, but within the variations they are very similar. The itertools version is a bit slower.
# done each 3 times
print time1(100000)
# 2.38, 3.01, 2.59
print time2(100000)
# 2.88, 2.37, 2.6
print time3(100000)
# 4.18, 4.44, 4.67
Now with a larger list (the + str(i) is to prevent duplicates):
old_inp = inp[:]
inp = []
for i in range(100):
    for j in old_inp:
        inp.append((j[0] + str(i), j[1]))
print time1(10000)
# 40.37
print time2(10000)
# 35.09
print time3(10000)
# 40.0
Note that if there are a lot of duplicates in the list, the first version is much faster (since it does less sorting).
inp = []
for i in range(100):
    for j in old_inp:
        #inp.append((j[0] + str(i), j[1]))
        inp.append((j[0], j[1]))
print time1(10000)
# 3.52
print time2(10000)
# 26.33
print time3(10000)
# 20.5
import numpy as np
np.unique(...)
The np.unique function returns a sorted ndarray of the unique values in its array-like argument. This works with any numpy dtype, but also with regular Python values that are orderable.
If you need a regular python list, use np.unique(...).tolist()
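For instance, with the example list from the question (this assumes numpy is installed; output shown from a recent numpy version):
>>> import numpy as np
>>> np.unique([5, 4, 2, 8, 4, 2, 1])
array([1, 2, 4, 5, 8])
>>> np.unique([5, 4, 2, 8, 4, 2, 1]).tolist()
[1, 2, 4, 5, 8]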
>>> import itertools
>>> a=[2,3,4,1,2,7,8,3]
>>> list(k for k,_ in itertools.groupby(sorted(a)))
[1, 2, 3, 4, 7, 8]
Does using Higher Order Functions & Lambdas make running time & memory efficiency better or worse?
For example, to multiply all numbers in a list :
nums = [1, 2, 3, 4, 5]
prod = 1
for n in nums:
    prod *= n
vs
prod2 = reduce(lambda x,y:x*y , nums)
Does the HOF version have any advantage over the loop version other than its fewer lines of code and its functional approach?
EDIT:
I am not able to add this as an answer as I don't have the required reputation.
I tried to profile the loop and HOF approaches using timeit as suggested by @DSM:
import timeit

def test1():
    s = """
nums = [a for a in range(1,1001)]
prod = 1
for n in nums:
    prod *= n
"""
    t = timeit.Timer(stmt=s)
    return t.repeat(repeat=10, number=100)

def test2():
    s = """
nums = [a for a in range(1,1001)]
prod2 = reduce(lambda x,y: x*y, nums)
"""
    t = timeit.Timer(stmt=s)
    return t.repeat(repeat=10, number=100)
And this is my result:
Loop:
[0.08340786340144211, 0.07211491653462579, 0.07162720686361926, 0.06593182661083438, 0.06399049758613146, 0.06605228229559557, 0.06419744588664211, 0.0671893658461038, 0.06477527090075941, 0.06418023793167627]
test1 average: 0.0644778902685
HOF:
[0.0759414223099324, 0.07616920129277016, 0.07570730355421262, 0.07604965128984942, 0.07547092059389193, 0.07544737286604364, 0.075532959799953, 0.0755039779810629, 0.07567424616704144, 0.07542563650187661]
test2 average: 0.0754917512762
On average, the loop approach seems to be faster than using HOFs.
Higher-order functions can be very fast.
For example, map(ord, somebigstring) is much faster than the equivalent list comprehension [ord(c) for c in somebigstring]. The former wins for three reasons:
map() pre-sizes the result list to the length of somebigstring. In contrast, the list comprehension must make many calls to realloc() as it grows.
map() only has to do one lookup for ord, first checking globals, then checking and finding it in builtins. The list comprehension has to repeat this work on every iteration.
The inner loop for map runs at C speed. The loop body for the list comprehension is a series of pure Python steps that each need to be dispatched or handled by the eval-loop.
Here are some timings to confirm the prediction:
>>> from timeit import Timer
>>> print min(Timer('map(ord, s)', 's="x"*10000').repeat(7, 1000))
0.808364152908
>>> print min(Timer('[ord(c) for c in s]', 's="x"*10000').repeat(7, 1000))
1.2946639061
In my experience, loops can be very fast provided they are not nested too deeply and don't involve complex higher math operations; for simple operations and a single layer of loops they can be as fast as any other approach, maybe faster, as long as only integers are used as the loop index. It really depends on what you are doing.
It might also very well be that the higher-order function produces just as many loops as the loop version, and might even be a little slower; you would have to time them both... just to be sure.
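If you do decide to time them both, here is a minimal sketch (note that in Python 3 reduce lives in functools, so it needs an import there):
from functools import reduce
from timeit import timeit

nums = list(range(1, 1001))

def loop_prod():
    prod = 1
    for n in nums:
        prod *= n
    return prod

def reduce_prod():
    return reduce(lambda x, y: x * y, nums)

print(timeit(loop_prod, number=100))     # plain for loop
print(timeit(reduce_prod, number=100))   # reduce with a lambda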