python 2.7 for loop to generate a list - python

I have tested in Python 2.7 that the two styles below produce the same result. When I read the first one, I am always a bit unsure whether i%2 == 0 controls whether the whole loop for i in range(100) runs, or whether i%2 == 0 is evaluated inside the loop for each i. I probably have this confusion because I used to write Java and C++ and still think in those terms.
I am looking for advice on how to read list-generating code. Normally the pattern is [<something before the loop> <the loop> <something after the loop>]; in this case "something before the loop" is 1, "the loop" is for i in range(100), and "something after the loop" is if i%2 == 0.
I would also like to know whether writing code as in method 1 is good coding style in Python 2.7. Thanks.
a = [1 for i in range(100) if i%2 == 0]
print a
a = []
for i in range(100):
    if i%2 == 0:
        a.append(1)
print a
Edit 1:
I also want to compare an explicit loop over xrange with the list comprehension in method 1 (pros and cons), for example:
a = []
for i in xrange(100):
    if i%2 == 0:
        a.append(1)
print a
Edit 2:
a = [1 for i in xrange(100) if i%2 == 0]

1) As already mentioned, in Python 2.7 it is usually suggested to use xrange, since it only keeps a counter that is incremented (like in C) instead of building the whole sequence.
range, by contrast, really creates a whole list in memory, from 0 to 99.
Also remember that the upper bound is exclusive: if you need 100 included, use 101 ;)
2) You have it right: the condition is indeed evaluated "under" the loop, once for every i.
List comprehensions are a powerful way to build exactly the list you need. Be aware, though, that they can become hard to read, especially when several loop variables such as x and y are involved.
I would choose your first line; just take care of the lower and upper bounds. As said, you may need to include the 100th element, and you can speed things up by using xrange instead of range.
a = [1 for i in range(100) if i%2 == 0]
3) It is also worth reading up on xrange and while loops; there are plenty of discussions on Stack Overflow comparing the speed of the two. (This is only a suggestion.)
I hope this clarifies your question. Have a nice day!
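To make the reading order concrete, here is a small sketch. It is written for Python 3, which is an assumption on my part; in 2.7 you would write print a and could use xrange. The clauses of a comprehension read left to right in the same order as the equivalent nested statements, and the leading expression is what gets appended in the innermost body.

```python
# The comprehension: <expression> <for clause> <if clause>
a = [1 for i in range(100) if i % 2 == 0]

# The same logic as explicit statements, nested in the same left-to-right order.
b = []
for i in range(100):        # the "for" clause comes first
    if i % 2 == 0:          # the "if" clause is checked for every i
        b.append(1)         # the leading expression is appended last

print(a == b)   # the two lists are built by the same logic
print(len(a))   # one entry per even i
```

So the if never switches the whole loop on or off; it only decides, item by item, whether the expression is added.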

Related

I want to exclude all numbers that aren't divisible by 7 and that are not multiples of 5 in a range of numbers from (0,300)

Basically, I have to get a list of numbers that are divisible by 7 but are not multiples of 5, but for some reason when I add the conditions I get an error.
for i in [x for x in xrange(0,100) if x%7 ==0 and if x%5 !=0 ]:
    print i
I know you posted something along the lines of a list comprehension but it's a bit hard to read. So a few things...
I would try writing this as a multi-line for loop before condensing it down to a list comprehension.
I'm not sure why you have an 'x' in here, and 'xrange' doesn't make sense.
Edit: Just realized why I don't recognize xrange and it's because I never worked with Python 2.x
So thinking through this, you are basically looking for any number from 0-300 that is divisible by 7 but is not a multiple of 5.
Means we have a few things...
range(0,301): Since range is not inclusive of our last value we want n+1
Our number, let's say "i" is both... "i%7==0" and "i%5!=0"
So let's look line-by-line
for i in range(0,301):
Okay cool, now you don't need a nested for loop list comprehension like you did in your example. Now, you need to know "if" i is ____... So we need an if statement.
if i%7==0 and i%5!=0:
See the logic? And of course that if statement is inside of our for loop to loop over all the values in our range.
Finally, if our "i" meets our criteria, then we can print all the values.
print(i)
So, our final code looks like...
for i in range(0,301):
    if (i % 7 == 0) and (i % 5 != 0):
        print(i)
Of course, there are ways you can make this more elegant, but this is the general idea.
List Comprehension:
party = [i for i in range(0,301) if i%7==0 and i%5!=0]
print(party)
That stores them all in a list so you can access them whenever. Or you can print it without assigning it of course.
Edit: The title and then what you say in the body are kind of conflicting. After reading over my own answer, I'm not completely sure if that's what you're looking for, but that is how it came across to me. Hope it helps!
Your list comprehension is incorrect. It should be something similar to:
[x for x in xrange(100) if x%5 and not x%7]
Even better (more efficient) will be something similar to
[x for x in xrange (7, 100, 7) if x%5]
Even better will be ... Nah, we'll just stop here for now.
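As a quick check that the two forms agree, here is a hedged Python 3 sketch (range instead of xrange). The name LIMIT and the bound of 301 are my assumptions, based on the 0 to 300 range in the question title.

```python
LIMIT = 301  # assumed upper bound: range() excludes it, so 300 is included

# Test every number against both conditions.
by_condition = [x for x in range(LIMIT) if x % 7 == 0 and x % 5 != 0]

# Step through multiples of 7 only, then drop the multiples of 5.
by_step = [x for x in range(0, LIMIT, 7) if x % 5]

print(by_condition == by_step)
print(by_step[:5])
```

The stepped version looks at roughly a seventh of the numbers, which is why it was called more efficient above. Note that there is only one if, with the two conditions joined by and.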

Python: weird list index out of range error [duplicate]

This question already has answers here:
Strange result when removing item from a list while iterating over it
l = range(100)
for i in l:
    print i,
    print l.pop(0),
    print l.pop(0)
The above Python code gives output quite different from what I expected. I want to loop over the items in a way that lets me skip an item while looping.
Please explain.
Never alter the container you're looping on, because iterators on that container are not going to be informed of your alterations and, as you've noticed, that's quite likely to produce a very different loop and/or an incorrect one. In normal cases, looping on a copy of the container helps, but in your case it's clear that you don't want that, as the container will be empty after 50 legs of the loop and if you then try popping again you'll get an exception.
What's anything BUT clear is, what behavior are you trying to achieve, if any?! Maybe you can express your desires with a while...?
i = 0
while i < len(some_list):
    print i,
    print some_list.pop(0),
    print some_list.pop(0)
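To see concretely why the original for loop behaves oddly, here is a small Python 3 sketch (the names items and seen are made up for illustration). The for loop tracks a position in the list, so when two items are popped from the front on each pass, the remaining items shift left and the loop lands on every third original value.

```python
items = list(range(100))
seen = []  # (loop value, first popped, second popped) for each pass

for i in items:
    first = items.pop(0)   # removes the current front item
    second = items.pop(0)  # removes the next front item
    seen.append((i, first, second))

print(seen[:3])     # the loop variable advances by 3, the pops by 2
print(len(seen))    # number of passes before the iterator runs off the end
print(len(items))   # items that were never popped
```

The loop ends early and quietly once its position passes the end of the shrinking list, which is exactly the kind of surprise the advice above is meant to avoid.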
I've been bitten before by (someone else's) "clever" code that tries to modify a list while iterating over it. I resolved that I would never do it under any circumstance.
You can use the slice operator mylist[::3] to skip across to every third item in your list.
mylist = [i for i in range(100)]
for i in mylist[::3]:
    print(i)
Other points about my example relate to new syntax in python 3.0.
I use a list comprehension to define mylist because it works in Python 3.0 (see below)
print is a function in python 3.0
Python 3.0 range() now behaves like xrange() used to behave, except it works with values of arbitrary size. The latter no longer exists.
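If the goal is to handle each item together with the two that follow it, slices can do that without mutating anything. This is only a sketch of one possible reading of the question, in Python 3; the name triples is illustrative.

```python
mylist = list(range(100))

# Three strided slices, zipped together, give consecutive groups of three.
# zip() stops at the shortest slice, so a leftover item (99 here) is dropped.
triples = list(zip(mylist[0::3], mylist[1::3], mylist[2::3]))

for first, second, third in triples[:3]:
    print(first, second, third)
```

The original list is left untouched, so it can still be used afterwards.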
The general rule of thumb is that you don't modify a collection/array/list while iterating over it.
Use a secondary list to store the items you want to act upon and execute that logic in a loop after your initial loop.
Use a while loop that checks for the truthfulness of the array:
while array:
    value = array.pop(0)
    # do some calculation here
And it should do it without any errors or funny behaviour.
Try this. It avoids mutating a thing you're iterating across, which is generally a code smell.
for i in xrange(0, 100, 3):
    print i
See xrange.
I guess this is what you want:
l = range(100)
index = 0
for i in l:
    print i,
    try:
        print l.pop(index+1),
        print l.pop(index+1)
    except IndexError:
        pass
    index += 1
It is quite handy to code this way when the number of items to be popped is a run-time decision.
But it is very inefficient and the code is hard to maintain.
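This slice syntax iterates over a copy of the list, so the loop itself is not disturbed by the pops. Note that with 100 items the list is empty after 50 passes, so the next pop(0) raises IndexError unless you guard against it:
l = range(100)
for i in l[:]:
    print i,
    print l.pop(0),
    print l.pop(0)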

Python: List comprehension significantly faster than Filter? [duplicate]

I have a list that I want to filter by an attribute of the items.
Which of the following is preferred (readability, performance, other reasons)?
xs = [x for x in xs if x.attribute == value]
xs = filter(lambda x: x.attribute == value, xs)
It is strange how much beauty varies for different people. I find the list comprehension much clearer than filter+lambda, but use whichever you find easier.
There are two things that may slow down your use of filter.
The first is the function call overhead: as soon as you use a Python function (whether created by def or lambda) it is likely that filter will be slower than the list comprehension. It almost certainly is not enough to matter, and you shouldn't think much about performance until you've timed your code and found it to be a bottleneck, but the difference will be there.
The other overhead that might apply is that the lambda is being forced to access a scoped variable (value). That is slower than accessing a local variable and in Python 2.x the list comprehension only accesses local variables. If you are using Python 3.x the list comprehension runs in a separate function so it will also be accessing value through a closure and this difference won't apply.
The other option to consider is to use a generator instead of a list comprehension:
def filterbyvalue(seq, value):
    for el in seq:
        if el.attribute==value: yield el
Then in your main code (which is where readability really matters) you've replaced both list comprehension and filter with a hopefully meaningful function name.
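For completeness, a minimal Python 3 sketch of how such a generator might be called. The Item type and the sample data are hypothetical stand-ins for "objects with an attribute"; they are not from the question.

```python
from collections import namedtuple

# Hypothetical record type standing in for "objects with an attribute".
Item = namedtuple("Item", ["name", "attribute"])


def filterbyvalue(seq, value):
    """Yield the elements of seq whose .attribute equals value."""
    for el in seq:
        if el.attribute == value:
            yield el


xs = [Item("a", 1), Item("b", 2), Item("c", 1)]

# The generator is lazy; list() materialises the matches when needed.
matches = list(filterbyvalue(xs, 1))
print([item.name for item in matches])
```

Because value is passed in as a parameter, it is a plain local variable inside the function rather than a closure variable.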
This is a somewhat religious issue in Python. Even though Guido considered removing map, filter and reduce from Python 3, there was enough of a backlash that in the end only reduce was moved from built-ins to functools.reduce.
Personally I find list comprehensions easier to read. It is more explicit what is happening from the expression [i for i in list if i.attribute == value] as all the behaviour is on the surface not inside the filter function.
I would not worry too much about the performance difference between the two approaches as it is marginal. I would really only optimise this if it proved to be the bottleneck in your application which is unlikely.
Also since the BDFL wanted filter gone from the language then surely that automatically makes list comprehensions more Pythonic ;-)
Since any speed difference is bound to be minuscule, whether to use filters or list comprehensions comes down to a matter of taste. In general I'm inclined to use comprehensions (which seems to agree with most other answers here), but there is one case where I prefer filter.
A very frequent use case is pulling out the values of some iterable X subject to a predicate P(x):
[x for x in X if P(x)]
but sometimes you want to apply some function to the values first:
[f(x) for x in X if P(f(x))]
As a specific example, consider
primes_cubed = [x*x*x for x in range(1000) if prime(x)]
I think this looks slightly better than using filter. But now consider
prime_cubes = [x*x*x for x in range(1000) if prime(x*x*x)]
In this case we want to filter against the post-computed value. Besides the issue of computing the cube twice (imagine a more expensive calculation), there is the issue of writing the expression twice, violating the DRY aesthetic. In this case I'd be apt to use
prime_cubes = filter(prime, [x*x*x for x in range(1000)])
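Another way to avoid writing (and computing) the expression twice is to feed an inner generator expression into the comprehension. A minimal Python 3 sketch follows; is_palindrome is a made-up predicate standing in for prime, and squares stand in for cubes, so that the result is not empty.

```python
def is_palindrome(n):
    """Hypothetical predicate: True if n reads the same backwards."""
    text = str(n)
    return text == text[::-1]


# The expression x*x appears twice and is evaluated twice per kept value.
twice = [x * x for x in range(30) if is_palindrome(x * x)]

# The inner generator computes each square once; the outer part filters it.
once = [y for y in (x * x for x in range(30)) if is_palindrome(y)]

# The filter-based form from the text, wrapped in list() for Python 3.
with_filter = list(filter(is_palindrome, [x * x for x in range(30)]))

print(once)
```

All three build the same list; which one reads best is again a matter of taste.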
Although filter may be the "faster way", the "Pythonic way" would be not to care about such things unless performance is absolutely critical (in which case you wouldn't be using Python!).
I thought I'd just add that in python 3, filter() is actually an iterator object, so you'd have to pass your filter method call to list() in order to build the filtered list. So in python 2:
lst_a = range(25) #arbitrary list
lst_b = [num for num in lst_a if num % 2 == 0]
lst_c = filter(lambda num: num % 2 == 0, lst_a)
Lists b and c have the same values, and were completed in about the same time, as filter() was equivalent to [x for x in y if z]. However, in 3, this same code would leave lst_c containing a filter object, not a filtered list. To produce the same values in 3:
lst_a = range(25) #arbitrary list
lst_b = [num for num in lst_a if num % 2 == 0]
lst_c = list(filter(lambda num: num %2 == 0, lst_a))
The problem is that list() takes an iterable as its argument and creates a new list from it. As a result, using filter this way in Python 3 can take up to twice as long as the [x for x in y if z] method, because the output of filter() has to be iterated over in addition to the original list.
An important difference is that list comprehension will return a list while the filter returns a filter, which you cannot manipulate like a list (ie: call len on it, which does not work with the return of filter).
My own self-learning brought me to some similar issue.
That being said, if there is a way to have the resulting list from a filter, a bit like you would do in .NET when you do lst.Where(i => i.something()).ToList(), I am curious to know it.
EDIT: This is the case for Python 3, not 2 (see discussion in comments).
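On the question of getting a list out of a filter in Python 3: wrapping the call in list() does it, roughly like .ToList() in .NET. A short sketch with illustrative variable names:

```python
numbers = list(range(25))

evens = filter(lambda num: num % 2 == 0, numbers)  # a lazy filter object

try:
    len(evens)              # filter objects have no length
except TypeError as exc:
    print("len() failed:", exc)

evens_list = list(evens)    # consume the iterator into a real list
print(len(evens_list))

# The iterator is now exhausted: a second pass yields nothing.
leftover = list(evens)
print(leftover)
```

The last line is the part that tends to surprise people: a filter object can only be walked once.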
I find the second way more readable. It tells you exactly what the intention is: filter the list.
PS: do not use 'list' as a variable name
generally filter is slightly faster if using a builtin function.
I would expect the list comprehension to be slightly faster in your case
Filter is just that: it filters out elements of a list. The definition in the official docs says the same. A list comprehension, by contrast, produces a new list after acting on the elements of the previous one. (Both filter and a list comprehension create a new list rather than operating in place on the old one. With a comprehension, the new list can hold an entirely different data type, for example integers converted to strings.)
In your example, it is better to use filter than a list comprehension, as per the definition. However, if you want to retrieve, say, other_attribute from the matching elements as a new list, then you can use a list comprehension.
return [item.other_attribute for item in my_list if item.attribute==value]
This is how I remember the difference between filter and list comprehensions: to remove a few things from a list and keep the other elements intact, use filter; to apply some logic of your own to the elements and build a derived list suited to some purpose, use a list comprehension.
Here's a short piece I use when I need to filter on something after the list comprehension. Just a combination of filter, lambda, and lists (otherwise known as the loyalty of a cat and the cleanliness of a dog).
In this case I'm reading a file, stripping out blank lines, commented out lines, and anything after a comment on a line:
# Throw out blank lines and comments
with open('file.txt', 'r') as lines:
    # From the inside out:
    # [s.partition('#')[0].strip() for s in lines]... Throws out comments
    # filter(lambda x: x!= '', [s.part... Filters out blank lines
    # y for y in filter... Converts filter object to list
    file_contents = [y for y in filter(lambda x: x != '', [s.partition('#')[0].strip() for s in lines])]
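The same clean-up can also be written without filter or lambda, by chaining a generator expression into a comprehension. This is a sketch rather than a drop-in replacement: io.StringIO stands in for the open file so the example runs on its own, and the sample text is invented.

```python
import io

# Stand-in for an open file; io.StringIO iterates line by line like a file.
lines = io.StringIO("alpha  # trailing comment\n\n# full-line comment\nbeta\n")

# Strip the comment part of each line lazily, then keep the non-empty results.
stripped = (s.partition('#')[0].strip() for s in lines)
file_contents = [text for text in stripped if text]

print(file_contents)
```

Only one list is built here, whereas the nested version builds an intermediate list before filtering it.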
It took me some time to get familiar with the higher-order functions filter and map. Once I got used to them I actually liked filter, as it was explicit that it filters by keeping whatever is truthy, and I felt cool knowing some functional programming terms.
Then I read this passage (Fluent Python Book):
The map and filter functions are still builtins in Python 3, but since the introduction of list comprehensions and generator expressions, they are not as important. A listcomp or a genexp does the job of map and filter combined, but is more readable.
And now I think: why bother with the concept of filter / map if you can achieve the same thing with already widespread idioms like list comprehensions? Furthermore, map and filter are functions themselves, and in this case I prefer using anonymous functions (lambdas).
Finally, just for the sake of having it tested, I've timed both methods (map and listComp) and I didn't see any relevant speed difference that would justify making arguments about it.
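from timeit import Timer
timeMap = Timer(lambda: list(map(lambda x: x*x, range(10**7))))
print(timeMap.timeit(number=100))
# The lambda has to be called with x; without the (x) call the list holds function objects, not squares.
timeListComp = Timer(lambda: [(lambda x: x*x)(x) for x in range(10**7)])
print(timeListComp.timeit(number=100))
# Reported by the author for the version without the (x) call:
#Map: 166.95695265199174
#List Comprehension 177.97208347299602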
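If you want to repeat this kind of measurement yourself, a hedged sketch of a like-for-like comparison might look as follows. The size N and the repeat count are arbitrary small values chosen so it finishes quickly; the function names are illustrative, and the numbers you get will depend on your machine and Python version.

```python
from timeit import timeit

N = 10_000  # assumed small size so the sketch finishes quickly


def with_map():
    return list(map(lambda x: x * x, range(N)))


def with_listcomp():
    return [x * x for x in range(N)]


# Both callables must build the same list, otherwise the timings are not comparable.
same_result = with_map() == with_listcomp()

map_seconds = timeit(with_map, number=20)
listcomp_seconds = timeit(with_listcomp, number=20)

print(same_result)
print("map:      %.4f s" % map_seconds)
print("listcomp: %.4f s" % listcomp_seconds)
```

Checking that both versions produce the same result before timing them is a cheap guard against comparing different amounts of work.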
In addition to the accepted answer, there is a corner case where you should use filter instead of a list comprehension. If the list is unhashable, you cannot directly process it with a list comprehension. A real-world example is using pyodbc to read results from a database: the fetchall() results from the cursor are an unhashable list. In this situation, to manipulate the returned results directly, filter should be used:
cursor.execute("SELECT * FROM TABLE1;")
data_from_db = cursor.fetchall()
processed_data = filter(lambda s: 'abc' in s.field1 or s.StartTime >= start_date_time, data_from_db)
If you use list comprehension here you will get the error:
TypeError: unhashable type: 'list'
In terms of performance, it depends.
filter does not return a list but an iterator. If you need the list immediately, filtering plus list conversion is slower than a list comprehension by about 40% for very large lists (>1M elements). Up to 100K elements there is almost no difference; from 600K onwards differences start to appear.
If you don't convert to a list, filter is practically instantaneous, because no filtering work is done until the iterator is consumed.
More info at: https://blog.finxter.com/python-lists-filter-vs-list-comprehension-which-is-faster/
Curiously on Python 3, I see filter performing faster than list comprehensions.
I always thought that the list comprehensions would be more performant.
Something like:
[name for name in brand_names_db if name is not None]
The bytecode generated is a bit better.
>>> def f1(seq):
...     return list(filter(None, seq))
>>> def f2(seq):
...     return [i for i in seq if i is not None]
>>> disassemble(f1.__code__)
2 0 LOAD_GLOBAL 0 (list)
2 LOAD_GLOBAL 1 (filter)
4 LOAD_CONST 0 (None)
6 LOAD_FAST 0 (seq)
8 CALL_FUNCTION 2
10 CALL_FUNCTION 1
12 RETURN_VALUE
>>> disassemble(f2.__code__)
2 0 LOAD_CONST 1 (<code object <listcomp> at 0x10cfcaa50, file "<stdin>", line 2>)
2 LOAD_CONST 2 ('f2.<locals>.<listcomp>')
4 MAKE_FUNCTION 0
6 LOAD_FAST 0 (seq)
8 GET_ITER
10 CALL_FUNCTION 1
12 RETURN_VALUE
But they are actually slower:
>>> timeit(stmt="f1(range(1000))", setup="from __main__ import f1,f2")
21.177661532000116
>>> timeit(stmt="f2(range(1000))", setup="from __main__ import f1,f2")
42.233950221000214
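One caveat when reading this comparison: filter(None, seq) and the is not None comprehension do not test the same thing, so they can return different lists. A small Python 3 sketch with made-up sample data:

```python
seq = [0, 1, None, '', 'a', [], False]

# filter(None, ...) keeps only truthy items, so 0, '', [] and False go too.
truthy_only = list(filter(None, seq))

# The comprehension removes None and nothing else.
not_none = [i for i in seq if i is not None]

print(truthy_only)
print(not_none)
```

For range(1000) as used in the timing above, the only practical difference is that filter(None, ...) also drops the 0.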
My conclusion: use a list comprehension over filter, since it is
more readable
more pythonic
faster (for Python 3.11, see attached benchmark, also see )
Keep in mind that filter returns an iterator, not a list.
python3 -m timeit '[x for x in range(10000000) if x % 2 == 0]'
1 loop, best of 5: 270 msec per loop
python3 -m timeit 'list(filter(lambda x: x % 2 == 0, range(10000000)))'
1 loop, best of 5: 432 msec per loop
Summarizing other answers
Looking through the answers, we have seen a lot of back and forth, whether or not list comprehension or filter may be faster or if it is even important or pythonic to care about such an issue. In the end, the answer is as most times: it depends.
I just stumbled across this question while optimizing code where this exact question (albeit combined with an in expression, not ==) is very relevant - the filter + lambda expression is taking up a third of my computation time (of multiple minutes).
My case
In my case, the list comprehension is much faster (twice the speed). But I suspect that this varies strongly based on the filter expression as well as the Python interpreter used.
Test it for yourself
Here is a simple code snippet that should be easy to adapt. If you profile it (most IDEs can do that easily), you will be able to easily decide for your specific case which is the better option:
whitelist = set(range(0, 100000000, 27))
input_list = list(range(0, 100000000))
proximal_list = list(filter(
    lambda x: x in whitelist,
    input_list
))
proximal_list2 = [x for x in input_list if x in whitelist]
print(len(proximal_list))
print(len(proximal_list2))
If you do not have an IDE that lets you profile easily, try this instead (extracted from my codebase, so a bit more complicated). This code snippet will create a profile for you that you can easily visualize using e.g. snakeviz:
import cProfile
from time import time
class BlockProfile:
    def __init__(self, profile_path):
        self.profile_path = profile_path
        self.profiler = None
        self.start_time = None
    def __enter__(self):
        self.profiler = cProfile.Profile()
        self.start_time = time()
        self.profiler.enable()
    def __exit__(self, *args):
        self.profiler.disable()
        exec_time = int((time() - self.start_time) * 1000)
        self.profiler.dump_stats(self.profile_path)
whitelist = set(range(0, 100000000, 27))
input_list = list(range(0, 100000000))
with BlockProfile("/path/to/create/profile/in/profile.pstat"):
    proximal_list = list(filter(
        lambda x: x in whitelist,
        input_list
    ))
    proximal_list2 = [x for x in input_list if x in whitelist]
print(len(proximal_list))
print(len(proximal_list2))
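If you would rather read the numbers straight in the terminal than open a visualizer, the standard pstats module can print a profile as text. The following is a sketch under my own assumptions: the input is much smaller than above so it runs quickly, and the report variable is just an in-memory buffer.

```python
import cProfile
import io
import pstats

whitelist = set(range(0, 100_000, 27))   # much smaller than above, for speed
input_list = list(range(0, 100_000))

profiler = cProfile.Profile()
profiler.enable()
proximal_list = list(filter(lambda x: x in whitelist, input_list))
proximal_list2 = [x for x in input_list if x in whitelist]
profiler.disable()

# Render the five most expensive entries, by cumulative time, into a string.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

The lambda shows up as its own row with one call per input item, which makes the function call overhead discussed in the other answers visible.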
Your question is simple yet interesting. It shows how flexible Python is as a programming language: you can express the same logic in whichever way matches your own understanding, and that is fine as long as you get the right answer.
In your case this is a simple filtering task that both approaches handle, but I would prefer the first one, my_list = [x for x in my_list if x.attribute == value], because it is simple and needs no special syntax. Anyone can understand this statement and change it if needed.
(The second method is also simple, but for beginner-level programmers it is still a little more complex than the first.)

Enormous Input - for loop faster than list comprehension

I'm trying to solve a codechef beginner problem - Enormous Input Test. My code
a,b = [ int(i) for i in raw_input().split()]
print [input()%b==0 for i in range(a)].count(True)
gets timed out. Another solution, which uses basic for-loops, seems to be working fine.
I believed that list comprehensions are quicker than basic for loops. Why, then, is the former slower? Also, would using generators in this case reduce the memory used and make the computation faster? If so, how can I do it?
Why do you believe that list comprehension is quicker than basic for loops? (Hint: they are both implemented using the same underlying instructions.)
Your code will be executed in some manner like this:
a, b = ...
temp = []
for i in range(a):
    temp.append(int(raw_input()) % b == 0)
print temp.count(True)
As you can see, it creates a large list in memory (range(a)), iterates over it to create a second list, and then iterates over the second list to produce the count. Neither list ever needs to be created.
a, b = ...
count = 0
for i in xrange(a):
    if int(raw_input()) % b == 0:
        count += 1
print count
Some compilers are capable of optimizing hylomorphisms to remove the intermediate list, but I know of no Python implementation capable of this. So you are stuck optimizing by hand.
Note: Do not use input in Python 2.x, unless you know what you are doing. I have changed the code to use int(raw_input()) because that is safe, whereas input() is dangerous.
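On the generator part of the question: a generator expression passed to sum() gives the same count without building any list. Below is a Python 3 sketch, not a tested submission; count_divisible is a name I made up, and io.StringIO stands in for standard input so the example is self-contained (in a real run you would pass sys.stdin instead).

```python
import io


def count_divisible(lines, divisor):
    """Count the integers in lines that divisor divides evenly.

    The generator expression feeds sum() one value at a time,
    so no intermediate list is built.
    """
    return sum(1 for line in lines if int(line) % divisor == 0)


# Stand-in for standard input: first line is "n k", then n numbers.
stream = io.StringIO("7 3\n1\n51\n966369\n7\n9\n999996\n11\n")
n, k = (int(token) for token in stream.readline().split())
result = count_divisible(stream, k)
print(result)
```

Iterating over the stream directly also avoids one function call per line, which matters when the input is enormous.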

Most pythonic way to truncate a list to N indices when you can't guarantee the list is at least N length?

What is the most pythonic way to truncate a list to N indices when you can not guarantee the list is even N length? Something like this:
l = range(6)
if len(l) > 4:
    l = l[:4]
I'm fairly new to python and am trying to learn to think pythonicly. The reason I want to even truncate the list is because I'm going to enumerate on it with an expected length and I only care about the first 4 elements.
Python handles slice indices that are out of range gracefully.
In [5]: k = range(2)
In [6]: k[:4]
Out[6]: [0, 1]
In [7]: k = range(6)
In [8]: k[:4]
Out[8]: [0, 1, 2, 3]
BTW, the degenerate slice behavior is explained in The Python tutorial. That is a good place to start because it covers a lot of concepts very quickly.
You've got it here:
lst = lst[:4]
This works regardless of the number of items in the list, even if it's less than 4.
If you want it to always have 4 elements, padding it with (say) zeroes or None if it's too short, try this:
lst = (lst + [0] * 4)[:4]
When you have a question like this, it's usually feasible to try it and see what happens, or look it up in the documentation.
It's a bad idea to name a variable list, by the way; this will prevent you from referring to the built-in list type.
All of the answers so far don't truncate the list. They follow your example in assigning the name to a new list which contains the first up to 4 elements of the old list. To truncate the existing list, delete elements whose index is 4 or higher. This is done very simply:
del lst[4:]
Moving on to what you really want to do, one possibility is:
for i, value in enumerate(lst):
    if i >= 4:
        break
    do_something_with(lst, i, value)
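The difference between rebinding the name to a slice and truncating in place with del only shows up when something else refers to the same list. A small Python 3 sketch with illustrative names:

```python
original = list(range(6))
alias = original            # a second name for the same list object

shortened = original[:4]    # slicing builds a new list; original is untouched
print(original, shortened)

del original[4:]            # del removes items from the existing list itself
print(alias)                # the alias sees the truncation too

short = [0, 1]
del short[4:]               # no error when the list is shorter than 4
padded = (short + [None] * 4)[:4]   # pad to exactly four items
print(short, padded)
```

Which behaviour you want depends on whether other code holding the same list should see the change.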
I think the most pythonic solution that answers your question is this:
for x in l[:4]:
    do_something_with(x)
The advantages:
It's the most succinct and descriptive way of doing what you're after.
I suggest that the pythonic way to do this is to slice the source array rather than truncate it. Once you remove elements from the source array they are gone forever, and while in your trivial example that's fine, I think in general it's not a great practice.
Slicing the array to the first four is almost certainly more efficient than removing a possibly large number of elements from its tail.
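A related option, in case the source is an iterator or a very large list, is itertools.islice, which stops after N items without making a copy. A hedged Python 3 sketch; do_something_with here is a hypothetical placeholder, not the asker's function.

```python
from itertools import islice


def do_something_with(position, value):
    """Hypothetical placeholder for the real per-item work."""
    return (position, value)


lst = list(range(6))
results = []

# islice stops after four items without copying the list,
# and it also accepts iterators that cannot be sliced.
for position, value in enumerate(islice(lst, 4)):
    results.append(do_something_with(position, value))

print(results)
```

Like a slice, islice simply yields fewer items when the source is shorter than N, so no length check is needed.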
