This question already has answers here:
Are list-comprehensions and functional functions faster than "for loops"?
(8 answers)
Closed 7 years ago.
I was trying out some simple procedures using lists.
In the book Learning Python I saw the list-comprehension syntax, and I also knew that a loop could replace it.
Now I really want to know which is faster: the loop or the comprehension?
These are my programs.
a = []
for x in range(1, 101):
    a.append(x)
This sets a to [1, 2, 3, ..., 99, 100].
Now this is what I have done with the comprehension.
[x ** 2 for x in a]
This is what I did with the loop.
c = []
for x in a:
    b = [x ** 2]
    c += b
Could anyone suggest a way to find out which of the above is faster? Please also try to explain how comprehensions differ from loops.
Any help is appreciated.
You can use the timeit library, or just use time.time() to time it yourself:
>>> from time import time
>>> def first():
...     ftime = time()
...     _foo = [x ** 2 for x in range(1, 101)]
...     print "First", time() - ftime
...
>>> def second():
...     ftime = time()
...     _foo = []
...     for x in range(1, 101):
...         _b = [x ** 2]
...         _foo += _b
...     print "Second", time() - ftime
...
>>> first()
First 5.60283660889e-05
>>> second()
Second 8.79764556885e-05
>>> first()
First 4.88758087158e-05
>>> second()
Second 8.39233398438e-05
>>> first()
First 2.8133392334e-05
>>> second()
Second 7.29560852051e-05
>>>
Evidently, the list comprehension runs faster, by a factor of roughly 1.5 to 2.5 in these runs.
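The timeit module mentioned above gives more reliable numbers than hand-rolled time.time() calls, since it repeats the statement many times. A minimal sketch (Python 3 syntax; the repeat count is arbitrary):

```python
import timeit

# Time the list comprehension.
comp_time = timeit.timeit("[x ** 2 for x in range(1, 101)]", number=10000)

# Time the equivalent for loop (using append, the idiomatic loop form).
loop_stmt = """
c = []
for x in range(1, 101):
    c.append(x ** 2)
"""
loop_time = timeit.timeit(loop_stmt, number=10000)

print("comprehension:", comp_time)
print("loop:         ", loop_time)
```

On most machines the comprehension wins, mainly because it avoids the repeated attribute lookup and method call for append on every iteration.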
This question already has answers here:
Creating functions (or lambdas) in a loop (or comprehension)
(6 answers)
Closed 6 months ago.
Say I want to create a list of 5 functions, where the i-th function just adds i to the argument. The naive code
L = []
for i in range(5):
    def f(z):
        return z + i
    L.append(f)
apparently does not work: print([f(0) for f in L]) yields [4, 4, 4, 4, 4]. Similarly,
L = [lambda z: z+i for i in range(5)]
does not work either. The value of i at the time the function is defined is not bound into f. A clumsy hack is
tmp = ["lambda z:z+{}".format(i) for i in range(5)]
L = eval("[" + ",".join(tmp) + "]")
But I'm sure that there is a clean solution! Which is it?
def get_func(i):
    return lambda z: z + i

L = [get_func(i) for i in range(5)]
According to an answer to a related question (thanks to rdas for the link in the comments), default arguments are evaluated when a function is created, not when it is called. So this alternative to Peter Collingridge's suggestion works too:
L = []
for i in range(5):
    def f(z, i=i):
        return z + i
    L.append(f)
or
L = [lambda z, i=i: z+i for i in range(5)]
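A quick check (my own sketch, not part of the original answer) that the default-argument version really captures each i, while the plain lambda exhibits the late-binding behavior described in the question:

```python
# Each lambda gets its own default value of i, evaluated at definition time.
L = [lambda z, i=i: z + i for i in range(5)]
print([f(0) for f in L])  # [0, 1, 2, 3, 4]

# The plain version: every closure sees the same i, which ends up as 4.
M = [lambda z: z + i for i in range(5)]
print([f(0) for f in M])  # [4, 4, 4, 4, 4]
```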
There is a multidimensional list with an unclear structure:
a=[[['123', '456'], ['789', '1011']], [['1213', '1415']], [['1617', '1819']]]
And there is a recursive function, which operates the list:
def get_The_Group_And_Rewrite_It(workingList):
    if isinstance(workingList[0], str):
        doSomething(workingList)
    else:
        for i in workingList:
            get_The_Group_And_Rewrite_It(i)
At some point get_The_Group_And_Rewrite_It() will receive a flat list, for instance ['123', '456'], and as soon as it does, the doSomething function should rewrite it as ['abc'] within the entire structure.
The same goes for every other list of the form [str, str, ...]. In the end I should get something like
a = [[['abc'], ['abc']], [['abc']], [['abc']]]
I can see how this would be easy in C++ using pointers, but how do I do it in Python?
For this case, you can use slice assignment:
>>> a = [[['123', '456']]]
>>> x = a[0][0]
>>> x[:] = ['abc']
>>> a
[[['abc']]]
>>> def f(workingList):
...     if isinstance(workingList[0], str):
...         workingList[:] = ['abc']
...     else:
...         for i in workingList:
...             f(i)
...
>>> a=[[['123', '456'], ['789', '1011']], [['1213', '1415']], [['1617', '1819']]]
>>> f(a)
>>> a
[[['abc'], ['abc']], [['abc']], [['abc']]]
In Python, as far as I know, there are at least four ways to create and initialize lists of a given size:
Simple loop with append:
my_list = []
for i in range(50):
    my_list.append(0)
Simple loop with +=:
my_list = []
for i in range(50):
    my_list += [0]
List comprehension:
my_list = [0 for i in range(50)]
List and integer multiplication:
my_list = [0] * 50
In these examples I don't think there would be any performance difference given that the lists have only 50 elements, but what if I need a list of a million elements? Would the use of xrange make any improvement? Which is the preferred/fastest way to create and initialize lists in python?
Let's run some time tests* with timeit.timeit:
>>> from timeit import timeit
>>>
>>> # Test 1
>>> test = """
... my_list = []
... for i in xrange(50):
... my_list.append(0)
... """
>>> timeit(test)
22.384258893239178
>>>
>>> # Test 2
>>> test = """
... my_list = []
... for i in xrange(50):
... my_list += [0]
... """
>>> timeit(test)
34.494779364416445
>>>
>>> # Test 3
>>> test = "my_list = [0 for i in xrange(50)]"
>>> timeit(test)
9.490926919482774
>>>
>>> # Test 4
>>> test = "my_list = [0] * 50"
>>> timeit(test)
1.5340533503559755
>>>
As you can see above, the last method is the fastest by far.
However, it should only be used with immutable items (such as integers). This is because it will create a list with references to the same item.
Below is a demonstration:
>>> lst = [[]] * 3
>>> lst
[[], [], []]
>>> # The ids of the items in `lst` are the same
>>> id(lst[0])
28734408
>>> id(lst[1])
28734408
>>> id(lst[2])
28734408
>>>
This behavior is very often undesirable and can lead to bugs in the code.
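To make the bug concrete (a quick sketch of my own): appending through one reference appears to change every element, because all three slots hold the very same list.

```python
lst = [[]] * 3    # three references to one and the same inner list
lst[0].append(1)
print(lst)        # [[1], [1], [1]] -- all three "change" at once
```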
If you have mutable items (such as lists), then you should use the still very fast list comprehension:
>>> lst = [[] for _ in xrange(3)]
>>> lst
[[], [], []]
>>> # The ids of the items in `lst` are different
>>> id(lst[0])
28796688
>>> id(lst[1])
28796648
>>> id(lst[2])
28736168
>>>
*Note: In all of the tests, I replaced range with xrange. Since the latter produces its values lazily instead of building the whole list in memory, it should generally be faster than the former.
If you want to see how the timings depend on the list length n:
Pure Python
I tested list lengths up to n = 10000 and the behavior remains the same, so the integer-multiplication method is the fastest by a clear margin.
Numpy
For lists with more than ~300 elements you should consider numpy.
Benchmark code:
import time

def timeit(f):
    def timed(*args, **kwargs):
        start = time.clock()
        for _ in range(100):
            f(*args, **kwargs)
        end = time.clock()
        return end - start
    return timed
@timeit
def append_loop(n):
    """Simple loop with append"""
    my_list = []
    for i in xrange(n):
        my_list.append(0)

@timeit
def add_loop(n):
    """Simple loop with +="""
    my_list = []
    for i in xrange(n):
        my_list += [0]

@timeit
def list_comprehension(n):
    """List comprehension"""
    my_list = [0 for i in xrange(n)]

@timeit
def integer_multiplication(n):
    """List and integer multiplication"""
    my_list = [0] * n

import numpy as np

@timeit
def numpy_array(n):
    my_list = np.zeros(n)

import pandas as pd

df = pd.DataFrame([(integer_multiplication(n), numpy_array(n)) for n in range(1000)],
                  columns=['Integer multiplication', 'Numpy array'])
df.plot()
Gist here.
There is one more method which, while sounding weird, is handy in the right circumstances. If you need to produce the same list many times (initializing a matrix for roguelike pathfinding and related stuff in my case), you can store a copy of the list in a tuple, then turn it back into a list when you need it. It is noticeably quicker than generating the list via a comprehension and, unlike list multiplication, works with nested data structures.
# In class definition
def __init__(self):
    self.l = [[1000 for x in range(1000)] for y in range(1000)]
    self.t = tuple(self.l)

def some_method(self):
    self.l = list(self.t)
    self._do_fancy_computation()
    # self.l is changed by this method

# Later in code:
for a in range(10):
    obj.some_method()
Voila, on every iteration you have a fresh copy of the same list in no time!
Disclaimer:
I do not have the slightest idea why this is so quick, or whether it works anywhere outside CPython 3.4.
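One caveat worth noting (my note, not part of the original answer): list(self.t) is a shallow copy, so the inner lists are still shared with the stored tuple. The trick is only safe when the computation replaces rows wholesale rather than mutating them in place. A minimal sketch of the pitfall:

```python
import copy

rows = [[0, 0], [0, 0]]
saved = tuple(rows)          # "template" copy, as in the answer above

fresh = list(saved)          # shallow copy: new outer list, same inner lists
fresh[0][0] = 99             # mutating a row also changes the template
print(list(saved)[0][0])     # 99 -- the saved rows were not protected

independent = copy.deepcopy(list(saved))  # deep copy: rows are truly fresh
independent[0][0] = 0        # does not touch the template
```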
If you want to create a list of consecutive numbers, i.e. incrementing by 1 every time, use the range function. In range the start argument is included and the end argument is excluded, as shown below:
list(range(10, 20))
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
If you want to create a list stepping by 2 instead, use this:
list(range(10, 20, 2))
[10, 12, 14, 16, 18]
Here the third argument is the step size. You can give any start element, end element and step size and create many lists quickly and easily.
I'm trying to create a function that takes two lists and selects an element at random from each of them. Is there any way to do this using the random.seed function?
You can use random.choice to pick a random element from a sequence (like a list).
If your two lists are list1 and list2, that would be:
a = random.choice(list1)
b = random.choice(list2)
Are you sure you want to use random.seed? Seeding initializes the random number generator in a consistent way each time, which can be very useful if you want subsequent runs to be identical, but in general it is not what you want. For example, the following function will always return 8, even though it looks like it should randomly choose a number between 0 and 10.
>>> def not_very_random():
... random.seed(0)
... return random.choice(range(10))
...
>>> not_very_random()
8
>>> not_very_random()
8
>>> not_very_random()
8
>>> not_very_random()
8
Note: @F.J's solution is much less complicated and better.
Use random.randint to pick a pseudo-random index from the list. Then use that index to select the element:
>>> import random as r
>>> r.seed(14) # used random number generator of ... my head ... to get 14
>>> mylist = [1,2,3,4,5]
>>> mylist[r.randint(0, len(mylist) - 1)]
You can easily extend this to work on two lists.
Why do you want to use random.seed?
Example (using Python2.7):
>>> import collections as c
>>> c.Counter([mylist[r.randint(0, len(mylist) - 1)] for x in range(200)])
Counter({1: 44, 5: 43, 2: 40, 3: 39, 4: 34})
Is that random enough?
I totally redid my previous answer. Here is a class which wraps a random-number generator (with optional seed) with the list. This is a minor improvement over F.J.'s, because it gives deterministic behavior for testing. Calling choice() on the first list should not affect the second list, and vice versa:
import random

class rlist():
    def __init__(self, lst, rg=None, rseed=None):
        self.lst = lst
        if rg is not None:
            self.rg = rg
        else:
            self.rg = random.Random()
        if rseed is not None:
            self.rg.seed(rseed)

    def choice(self):
        return self.rg.choice(self.lst)

if __name__ == '__main__':
    rl1 = rlist([1, 2, 3, 4, 5], rseed=1234)
    rl2 = rlist(['a', 'b', 'c', 'd', 'e'], rseed=1234)
    print 'First call:'
    print rl1.choice(), rl2.choice()
    print 'Second call:'
    print rl1.choice(), rl2.choice()
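Rewritten in Python 3 syntax, a quick check (my own sketch) that two instances seeded identically stay in lockstep without disturbing each other's random streams, which is the deterministic-testing property the class is built for:

```python
import random

class rlist():
    """A list paired with its own random generator, optionally seeded."""
    def __init__(self, lst, rg=None, rseed=None):
        self.lst = lst
        self.rg = rg if rg is not None else random.Random()
        if rseed is not None:
            self.rg.seed(rseed)

    def choice(self):
        return self.rg.choice(self.lst)

# Two wrappers seeded alike draw identical sequences of choices.
rl1 = rlist([1, 2, 3, 4, 5], rseed=1234)
rl2 = rlist([1, 2, 3, 4, 5], rseed=1234)
first = [rl1.choice() for _ in range(10)]
second = [rl2.choice() for _ in range(10)]
print(first == second)  # True
```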