Python: is for-each faster than an indexed for loop?

In Python, which is faster?

1.

for word in listOfWords:
    doSomethingToWord(word)

2.

for i in range(len(listOfWords)):
    doSomethingToWord(listOfWords[i])

Of course I'd use xrange in Python 2.x.
My assumption is that 1 is faster than 2. If so, why?

Use Python's timeit module to answer this kind of question:
duncan@ubuntu:~$ python -m timeit -s "listOfWords=['hello']*1000" "for word in listOfWords: len(word)"
10000 loops, best of 3: 37.2 usec per loop
duncan@ubuntu:~$ python -m timeit -s "listOfWords=['hello']*1000" "for i in range(len(listOfWords)): len(listOfWords[i])"
10000 loops, best of 3: 52.1 usec per loop
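The same measurement can be scripted with the timeit API instead of the command line; a minimal sketch reusing the setup from the shell runs above:

import timeit

setup = "listOfWords = ['hello'] * 1000"
direct = timeit.timeit("for word in listOfWords: len(word)",
                       setup=setup, number=10000)
indexed = timeit.timeit("for i in range(len(listOfWords)): len(listOfWords[i])",
                        setup=setup, number=10000)
print(direct, indexed)  # expect the direct loop to win, as in the shell runs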

Instead of asking questions like this, you can always try them yourself; it is not hard. Super simple benchmarking will show you the difference.
from datetime import datetime

arr = [4 for _ in xrange(10**8)]

startTime = datetime.now()
for i in arr:
    i
print datetime.now() - startTime

startTime = datetime.now()
for i in xrange(len(arr)):
    arr[i]
print datetime.now() - startTime
On my machine it is:
0:00:04.822513
0:00:05.676396
Note that the list you are iterating over should be pretty big to see the difference. The second loop is slower because each iteration needs a lookup by index (arr[i]) and also has to generate the index values with xrange.
Please do not spend too much time on mostly useless micro-optimization; rather, look at whether you can improve the computational complexity of your inner-loop functions.
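To make that last point concrete, here is a hedged sketch (with example data of my own, not from the answer above) of a complexity improvement that dwarfs any loop-style gain: turning an O(n) membership test inside a loop into an O(1) set lookup.

words = ['alpha', 'beta', 'gamma'] * 10000
stop_list = ['beta', 'delta', 'epsilon']  # list: O(len) per 'in' test
stop_set = set(stop_list)                 # set: O(1) average per 'in' test

slow = [w for w in words if w not in stop_list]
fast = [w for w in words if w not in stop_set]
assert slow == fast  # same result; the set version scales far better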

Simply try timeit:

In [2]: def solve(listOfWords):
   ...:     for word in range(len(listOfWords)):
   ...:         pass
   ...:

In [3]: %timeit solve(xrange(10**5))
100 loops, best of 3: 4.34 ms per loop

In [4]: def solve(listOfWords):
   ...:     for word in listOfWords:
   ...:         pass
   ...:

In [5]: %timeit solve(xrange(10**5))
1000 loops, best of 3: 1.84 ms per loop

In addition to the speed advantage, 1 is "cleaner-looking", and it also works for sequences that do not support len, namely generator expressions and the results of generator functions. To use solution 2, you would first have to convert the generator to a list just to get its length, if you even could. But what if the generator is generating the list of all prime numbers, and doSomething is looking for the first value > 100?
for num in prime_number_generator():
    if num > 100: return num
There is no way to convert this to the second form, since this generator has no end.
Also, what if it is very expensive to create the elements of the list (as in fetching from a database or a remote web server)? If you are looking for a matching value out of a generated set of N values, with #1 you can exit as soon as you find a match, avoiding on average the generation of N/2 values. To use #2, you first have to generate all N values just to get the length needed to build the range.
There is a reason Python 3 converted many builtins to return iterators instead of lists - they are more flexible.
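Wrapped in a function so the early return is legal, a runnable version of the idea might look like this (the trial-division generator is just a stand-in of mine, not from the original answer):

def prime_number_generator():
    # Endless trial-division prime generator; a stand-in implementation.
    n = 2
    while True:
        if all(n % p for p in range(2, int(n ** 0.5) + 1)):
            yield n
        n += 1

def first_prime_above(limit):
    # Form 1 can stop as soon as a match appears; form 2 never could,
    # because this generator has no length.
    for num in prime_number_generator():
        if num > limit:
            return num

print(first_prime_above(100))  # 101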
What is Pythonic?
"for i in range(len(seq)):"? No.
Use "for x in seq:"

Related

Python: Very slow execution loops

I am writing code for proposing typo corrections using an HMM and the Viterbi algorithm. At some point, for each word in the text, I have to do the following (let's assume I have 10,000 words):
# FYI: Windows 10, 64-bit, Intel i7, 4GB RAM, Python 2.7.3
import numpy as np
import pandas as pd

for k in range(10000):
    tempWord = corruptList20[k]  # temp word read from the list which has all of the words
    delta = np.zeros((26, len(tempWord)))
    sai = np.chararray((26, len(tempWord)))
    sai[:] = '#'

    # INITIALIZATION OF DELTA
    for i in range(26):
        delta[i][0] = # CALCULATION: matrix read and multiplication; each cell is different
    # INITIALIZATION END

    # 6. DELTA CALCULATION
    for deltaIndex in range(1, len(tempWord)):
        for j in range(26):
            tempDelta = 0.0
            maxDelta = 0.0
            maxState = ''
            for i in range(26):
                # CALCULATION to fill each cell involves:
                # 1. matrix read and multiplication
                # 2. finding the column max
                # logical operations and if-then-else operations

    # 7. SAI BACKWARD TRACKING
    delta2 = pd.DataFrame(delta)
    sai2 = pd.DataFrame(sai)
    proposedWord = np.zeros(len(tempWord), str)
    editId = 0
    for col in delta2.columns:
        # CALCULATION to fill each cell involves:
        # 1. matrix read and multiplication
        # 2. finding the column max
        # logical operations and if-then-else operations
    editList20.append(''.join(editWord))
# END OF LOOP
As you can see, it is computationally involved, and when I run it, it takes too much time.
My laptop was recently stolen, so currently I run this on Windows 10, 64-bit, 4GB RAM, Python 2.7.3.
My question: can anybody see any point I could use to optimize? Do I have to delete the matrices I created in the loop before the loop goes to the next round to free memory, or is this done automatically?
Update: after the comments below, using xrange instead of range increased performance by almost 30%.
I don't think the range discussion makes much difference. With Python 3, where range is evaluated lazily, expanding it into a list before iteration doesn't change the time much:
In [107]: timeit for k in range(10000):x=k+1
1000 loops, best of 3: 1.43 ms per loop
In [108]: timeit for k in list(range(10000)):x=k+1
1000 loops, best of 3: 1.58 ms per loop
With numpy and pandas the real key to speeding up loops is to replace them with compiled operations that work on the whole array or dataframe. But even in pure Python, focus on streamlining the contents of the iteration, not the iteration mechanism.
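For a sense of what "whole array" means, here is a minimal sketch (with made-up data, not the asker's calculation):

import numpy as np

arr = np.random.rand(10**6)

# Python-level loop: one interpreter round trip per element
total = 0.0
for x in arr:
    total += x * 2.0

# Compiled whole-array operation: one call into C
total_vec = (arr * 2.0).sum()

assert np.isclose(total, total_vec)

On CPython the second form is typically far faster, consistent with the 100x estimate in the answer further down.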
======================
for i in range(26):
    delta[i][0] = # CALCULATION: matrix read and multiplication

A minor change: delta[i, 0] = ...; this is the array way of addressing a single element. Functionally it is often the same, but the intent is clearer. But think: can't you set all of that column at once?

delta[:, 0] = ...
====================
N = len(tempWord)
delta = np.zeros((26, N))
etc.

In tight loops, temporary variables like this can save time. This isn't tight, so here it just adds clarity.
===========================
This is one ugly nested triple loop; admittedly 26 steps isn't large, but 26*26*N is:

for deltaIndex in range(1, N):
    for j in range(26):
        tempDelta = 0.0
        maxDelta = 0.0
        maxState = ''
        for i in range(26):
            # CALCULATION
            # 1. matrix read and multiplication
            # 2. finding the column max
            # logical operations and if-then-else operations
Focus on replacing this with array operations. It's those 3 commented lines that need to be changed, not the iteration mechanism.
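The question elides the actual cell calculation, so this is only a guess at its shape: if each cell combines the previous delta column with a 26x26 transition matrix and then takes a column max (the usual Viterbi recurrence), the two inner loops collapse into whole-array operations. All names and shapes below are hypothetical:

import numpy as np

N = 10                              # hypothetical word length
trans = np.random.rand(26, 26)      # hypothetical transition matrix
emit = np.random.rand(26, N)        # hypothetical per-position scores
delta = np.zeros((26, N))
sai = np.zeros((26, N), dtype=int)  # backpointers
delta[:, 0] = emit[:, 0]

for t in range(1, N):
    # scores[i, j] = delta[i, t-1] * trans[i, j], for all i and j at once
    scores = delta[:, t - 1][:, None] * trans
    sai[:, t] = scores.argmax(axis=0)            # column argmax, no i/j loops
    delta[:, t] = scores.max(axis=0) * emit[:, t]

Only the loop over word positions remains; the 26*26 work per position runs in compiled code.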
================
Making proposedWord a list rather than an array might be faster. Small list operations are often faster than array ones, since numpy arrays have a creation overhead.
In [136]: timeit np.zeros(20,str)
100000 loops, best of 3: 2.36 µs per loop
In [137]: timeit x=[' ']*20
1000000 loops, best of 3: 614 ns per loop
You have to be careful when creating "empty" lists that the elements are truly independent, not just copies of the same thing; see the aliasing sketch after the timings below.
In [159]: %%timeit
   .....: x = np.zeros(20, str)
   .....: for i in range(20):
   .....:     x[i] = chr(65 + i)
   .....:
100000 loops, best of 3: 14.1 µs per loop

In [160]: timeit [chr(65+i) for i in range(20)]
100000 loops, best of 3: 7.7 µs per loop
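The classic aliasing trap (a standard Python gotcha, added here as an illustration, not from the original answer): multiplying a list of lists copies references, so every row is the same object.

rows = [[0] * 3] * 3      # three references to ONE inner list
rows[0][0] = 99
print(rows)               # [[99, 0, 0], [99, 0, 0], [99, 0, 0]]

safe = [[0] * 3 for _ in range(3)]  # three independent inner lists
safe[0][0] = 99
print(safe)               # [[99, 0, 0], [0, 0, 0], [0, 0, 0]]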
As noted in the comments, the behavior of range changed between Python 2 and 3.
In 2, range constructs an entire list populated with the numbers to iterate over, then iterates over that list. Doing this in a tight loop is very expensive.
In 3, range instead constructs a simple object that (as far as I know) stores only three numbers: the start, the step (distance between numbers), and the end. Using simple math, you can compute any point along the range without iterating to it. This makes "random access" O(1) instead of the O(n) you pay when an entire list is created and iterated, and it avoids building a costly list at all.
In 2, use xrange to iterate over a range object instead of a list.
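In Python 3 you can observe the O(1) behavior directly:

r = range(0, 10**12, 7)   # created instantly; no list is built
print(r[10**9])           # O(1) indexing: 7000000000
print(7 * 10**9 in r)     # O(1) membership, computed arithmetically
print(len(r))             # also arithmetic, not a count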
(@Tom: I'll delete this if you post an answer.)
It's hard to see exactly what you need to do because of the missing code, but it's clear that you need to learn how to vectorize your numpy code. This can lead to a 100x speedup.
You can probably get rid of all the inner for-loops and replace them with vectorized operations.
E.g. instead of

for i in range(26):
    delta[i][0] = # CALCULATION: matrix read and multiplication; each cell is different

do

delta[:, 0] = # vectorized form of whatever operation you were going to do

Converting str numbers in a list to int and finding the sum of the list

I have a list, but the numbers in it are strings, so I can't find the sum of the list. I need help converting the numbers in the list to int.
This is my code:

def convertStr(cals):
    ret = float(cals)
    return ret

TotalCal = sum(cals)

So basically there is a list called cals, and it looks like this:
(20,45,...etc)
But the numbers in it are strings, so when I try finding the sum like this:
TotalCal = sum(cals)
it shows an error saying that the list needs to be in int format.
So the question is: how do I convert all the numbers in the list to int?
If you have a different way of finding the sum of a list, that would be good too.
You can use either the Python builtin map or a list comprehension for this:

def convertStr(cals):
    ret = [float(i) for i in cals]
    return ret

or

def convertStr(cals):
    return map(float, cals)
Here are the timeit results for both the approaches
$ python -m timeit "cals = ['1','2','3','4'];[float(i) for i in (cals)]"
1000000 loops, best of 3: 0.804 usec per loop
$ python -m timeit "cals = ['1','2','3','4'];map(float,cals)"
1000000 loops, best of 3: 0.787 usec per loop
As you can see, map is marginally faster here and arguably more Pythonic than the list comprehension. This is discussed at full length here:
map may be microscopically faster in some cases (when you're NOT making a lambda for the purpose, but using the same function in map and a listcomp). List comprehensions may be faster in other cases
Another way is to use itertools.imap. This is the fastest for long lists:

from itertools import imap
TotalCal = sum(imap(float, cals))
And using timeit for a list with 1000 entries.
$ python -m timeit "import random;cals = [str(random.randint(0,100)) for r in range(1000)];sum(map(float,cals))"
1000 loops, best of 3: 1.38 msec per loop
$ python -m timeit "import random;cals = [str(random.randint(0,100)) for r in range(1000)];[float(i) for i in (cals)]"
1000 loops, best of 3: 1.39 msec per loop
$ python -m timeit "from itertools import imap;import random;cals = [str(random.randint(0,100)) for r in range(1000)];imap(float,cals)"
1000 loops, best of 3: 1.24 msec per loop
As Padraic mentions below, the imap way is the best way to go! It is fast¹ and looks great! The cost of importing a library function has a bearing on small lists only, not on large lists; thus, for large lists, imap is better suited.
¹ The list comprehension is still slower than map, by one microsecond!!! Thank god.
sum(map(float,cals))
or
sum(float(i) for i in cals)
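If some entries might not be clean numeric strings, a defensive variant is possible; this is a hedged sketch of mine (safe_sum is a made-up helper name, and skipping bad entries is a design choice, not the questioner's requirement):

def safe_sum(cals):
    # Skips entries that cannot be parsed as numbers.
    total = 0.0
    for item in cals:
        try:
            total += float(item)
        except (TypeError, ValueError):
            pass  # ignore malformed entries
    return total

print(safe_sum(['20', '45', 'oops', 30]))  # 95.0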

In python, will converting variables to its own type waste CPU power?

I'm trying to fetch an int id number from a database, and some ids are mistakenly stored as strings. I'm wondering which of the following ways is better:
# the first method
new_id = int(old_id) + 1

# the second method
if isinstance(old_id, str):
    new_id = int(old_id) + 1
else:
    new_id = old_id + 1
So the question is, does it cost to convert a variable to its own type in python?
Let's check!
~/Coding > python -m timeit -s "id1=1;id2='1'" "new_id = int(id1)" "new_id = int(id2)"
1000000 loops, best of 3: 0.755 usec per loop
~/Coding > python -m timeit -s "id1=1;id2='1';f=lambda x: int(x) if isinstance(x, str) else x" "new_id=f(id1)" "new_id=f(id2)"
1000000 loops, best of 3: 1.15 usec per loop
Looks like the most efficient way is simply doing the int conversion without checking.
I'm open to being corrected that the issue here is the lambda or something else I did.
Update:
This may actually not be a fair answer, because the if check itself is much quicker than the type conversion.
~/Coding > python -m timeit "int('3')"
1000000 loops, best of 3: 0.562 usec per loop
~/Coding > python -m timeit "int(3)"
10000000 loops, best of 3: 0.136 usec per loop
~/Coding > python -m timeit "if isinstance('3', str): pass"
10000000 loops, best of 3: 0.0966 usec per loop
This means that it depends on how many of your ids you expect to be strings to see which is worth it.
Update 2:
I've gone a bit overboard here, but we can determine exactly when it's right to switch over using the above timings depending on how many strings you expect to have.
Where z is the total number of ids and s is the fraction of them that are strings, with all costs in microseconds:

Always check the type (assuming returning an int costs no time):

0.0966*z + 0.562*z*s

Always convert without checking:

0.136*z*(1-s) + 0.562*z*s

When we do the math, the z's and the string-conversion terms cancel out (you have to convert the strings regardless), leaving 0.0966 = 0.136*(1-s), which gives:

s ≈ 0.2897

So roughly 29% strings is the point where you'd cross over from one method to the other.
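The crossover can be checked in a couple of lines of arithmetic using the timings measured above:

check = 0.0966        # isinstance test, usec (measured above)
int_on_int = 0.136    # int(3), usec (measured above)
# Break-even when check == int_on_int * (1 - s):
s = 1 - check / int_on_int
print(s)              # ~0.2897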

What is the speed difference between Python's set() and set([])?

Is there a big difference in speed in these two code fragments?
1.
x = set( i for i in data )
versus:
2.
x = set( [ i for i in data ] )
I've seen people recommending set() instead of set([]); is this just a matter of style?
The form
x = set(i for i in data)
is shorthand for:
x = set((i for i in data))
This creates a generator expression which evaluates lazily. Compared to:
x = set([i for i in data])
which creates an entire list before passing it to set.
From a performance standpoint, generator expressions allow short-circuiting in certain functions (all and any come to mind) and take less memory, since you don't need to store the extra list. In some cases this can be very significant.
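A quick illustration of the short-circuiting point (made-up data):

data = range(10**6)

print(any(x > 5 for x in data))    # True; stops after seven elements
print(any([x > 5 for x in data]))  # True; builds the full million-element list first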
If you actually are going to iterate over the entire iterable data, and memory isn't a problem for you, I've found that typically the list comprehension is slightly faster than the equivalent generator expression*.
temp $ python -m timeit 'set(i for i in "xyzzfoobarbaz")'
100000 loops, best of 3: 3.55 usec per loop
temp $ python -m timeit 'set([i for i in "xyzzfoobarbaz"])'
100000 loops, best of 3: 3.42 usec per loop
Note that if you're curious about speed -- Your fastest bet will probably be just:
x = set(data)
proof:
temp $ python -m timeit 'set("xyzzfoobarbaz")'
1000000 loops, best of 3: 1.83 usec per loop
*CPython only; I don't know how Jython or PyPy optimize this stuff.
The [] syntax creates a list, which is discarded immediately after the set is created, so you are increasing the memory footprint of the program.
The generator syntax avoids that.

How to measure length of generator sequence (list comp vs generator expression)

I have a generator that generates a finite sequence. To determine
the length of this sequence I tried these two approaches:
seq_len = sum([1 for _ in euler14_seq(sv)]) # list comp
and
seq_len = sum(1 for _ in euler14_seq(sv)) # generator expression
sv is a constant starting value for the sequence.
I had expected the list comprehension to be slower and the generator expression faster, but it turns out the other way around.
I assume the first one is much more memory intensive, since it creates a complete list in memory first; that is part of the reason I thought it would also be slower.
My question: is this observation generalizable? And is it due to having two generators involved in the second statement versus one in the first?
I've looked at "What's the shortest way to count the number of items in a generator/iterator?", "Length of generator output", and "Is there any built-in way to get the length of an iterable in python?" and saw some other approaches to measuring the length of a sequence, but I'm specifically curious about the comparison of list comp vs generator expression.
PS: This came up when I decided to solve Euler Project #14 based on a
question asked on SO yesterday.
(By the way, what's the general feeling regarding the use of '_' in places where variable values are not needed?)
This was done with Python 2.7.2 (32-bit) under Windows 7 64-bit
On this computer, the generator expression becomes faster somewhere between 100,000 and 1,000,000 items:
$ python -m timeit "sum(1 for x in xrange(100000))"
10 loops, best of 3: 34.8 msec per loop
$ python -m timeit "sum([1 for x in xrange(100000)])"
10 loops, best of 3: 20.8 msec per loop
$ python -m timeit "sum(1 for x in xrange(1000000))"
10 loops, best of 3: 315 msec per loop
$ python -m timeit "sum([1 for x in xrange(1000000)])"
10 loops, best of 3: 469 msec per loop
The following code block should generate the length:
>>> gen1 = (x for x in range(10))
>>> len(list(gen1))
10
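Note that len(list(gen1)) materializes every element just to count them; the counting idiom from the question avoids the intermediate list:

gen1 = (x for x in range(10))
print(sum(1 for _ in gen1))  # 10, with no intermediate list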
