Can Python slicing be used to skip one specific element by index? - python

Suppose I am writing a recursive function where I want to pass a list to a function with a single element missing, as part of a loop. Here is one possible solution:
def Foo(input):
if len(input) == 0: return
for j in input:
t = input[:]
t.remove(j)
Foo(t)
Is there a way to abuse the slice operator to pass the list minus the element j without explicitly copying the list and removing the item from it?

What about this?
for i in range(len(list_)):
Foo(list_[:i] + list_[i+1:])
You are stilling copying items, though you ignore the element at index i while copying.
BTW, you can always try to avoid overriding built-in names like list by appending underscores.

If your lists are small, I recommend using the approach in the answer from #satoru.
If your lists are very large, and you want to avoid the "churn" of creating and deleting list instances, how about using a generator?
import itertools as it
def skip_i(seq, i):
return it.chain(it.islice(seq, 0, i), it.islice(seq, i+1, None))
This pushes the work of skipping the i'th element down into the C guts of itertools, so this should be faster than writing the equivalent in pure Python.
To do it in pure Python I would suggest writing a generator like this:
def gen_skip_i(seq, i):
for j, x in enumerate(seq):
if i != j:
yield x
EDIT: Here's an improved version of my answer, thanks to #Blckknght in comments below.
import itertools as it
def skip_i(iterable, i):
itr = iter(iterable)
return it.chain(it.islice(itr, 0, i), it.islice(itr, 1, None))
This is a big improvement over my original answer. My original answer only worked properly on indexable things like lists, but this will work correctly for any iterable, including iterators! It makes an explicit iterator from the iterable, then (in a "chain") pulls the first i values, and skips only a single value before pulling all remaining values.
Thank you very much #Blckknght!

Here is a code equivalent to satoru's, but is faster, as it makes one copy of the list per iteration instead of two:
before = []
after = list_[:]
for x in range(0, len(list_)):
v = after.pop(0)
Foo(before + after)
before.append(v)
(11ms instead of 18ms on my computer, for a list generated with list(range(1000)))

Related

How can I yield from an arbitrary depth of recursion?

I've written a function to create combinations of inputs of an arbitrary length, so recursion seemed to be the obvious way to do it. While it's OK for a small toy example to return a list of the results, I'd like to yield them instead. I've read about yield from, but don't fully understand how it is used, the examples don't appear to cover my use case, and hoping'n'poking it into my code has not yet produced anything that works. Note that writing this recursive code was at the limit of my python ability, hence the copious debug print statements.
This is the working list return code, with my hopeful non-working yield commented out.
def allposs(elements, output_length):
"""
return all zero insertion paddings of elements up to output_length maintaining order
elements - an iterable of length >= 1
output_length >= len(elements)
for instance allposs((3,1), 4) returns
[[3,1,0,0], [3,0,1,0], [3,0,0,1], [0,3,1,0], [0,3,0,1], [0,0,3,1]]
"""
output_list = []
def place_nth_element(nth, start_at, output_so_far):
# print('entering place_nth_element with nth =', nth,
# ', start_at =', start_at,
# ', output_so_far =', output_so_far)
last_pos = output_length - len(elements) + nth
# print('iterating over range',start_at, 'to', last_pos+1)
for pos in range(start_at, last_pos+1):
output = list(output_so_far)
# print('placing', elements[nth], 'at position', pos)
output[pos] = elements[nth]
if nth == len(elements)-1:
# print('appending output', output)
output_list.append(output)
# yield output
else:
# print('making recursive call')
place_nth_element(nth+1, pos+1, output)
place_nth_element(0, 0, [0]*output_length)
return output_list
if __name__=='__main__':
for q in allposs((3,1), 4):
print(q)
What is the syntax to use yield from to get my list generated a combination at a time?
Recursive generators are a powerful tool and I'm glad you're putting in the effort to study them.
What is the syntax to use yield from to get my list generated a combination at a time?
You put yield from in front of the expression from which results should be yielded; in your case, the recursive call. Thus: yield from place_nth_element(nth+1, pos+1, output). The idea is that each result from the recursively-called generator is iterated over (behind the scenes) and yielded at this point in the process.
Note that for this to work:
You need to yield the individual results at the base level of the recursion
To "collect" the results from the resulting generator, you need to iterate over the result from the top-level call. Fortunately, iteration is built-in in a lot of places; for example, you can just call list and it will iterate for you.
Rather than nesting the recursive generator inside a wrapper function, I prefer to write it as a separate helper function. Since there is no longer a need to access output_list from the recursion, there is no need to form a closure; and flat is better than nested as they say. This does, however, mean that we need to pass elements through the recursion. We don't need to pass output_length because we can recompute it (the length of output_so_far is constant across the recursion).
Also, I find it's helpful, when doing these sorts of algorithms, to think as functionally as possible (in the paradigm sense - i.e., avoid side effects and mutability, and proceed by creating new objects). You had a workable approach using list to make copies (although it is clearer to use the .copy method), but I think there's a cleaner way, as shown below.
All this advice leads us to:
def place_nth_element(elements, nth, start_at, output_so_far):
last_pos = len(output_so_far) - len(elements) + nth
for pos in range(start_at, last_pos+1):
output = output_so_far[:pos] + (elements[nth],) + output_so_far[pos+1:]
if nth == len(elements)-1:
yield output
else:
yield from place_nth_element(elements, nth+1, pos+1, output)
def allposs(elements, output_length):
return list(place_nth_element(elements, 0, 0, (0,)*output_length))
HOWEVER, I would not solve the problem that way - because the standard library already offers a neat solution: we can find the itertools.combinations of indices where a value should go, and then insert them. Now that we no longer have to think recursively, we can go ahead and mutate values :)
from itertools import combinations
def place_values(positions, values, size):
result = [0] * size
for position, value in zip(positions, values):
result[position] = value
return tuple(result)
def possibilities(values, size):
return [
place_values(positions, values, size)
for positions in combinations(range(size), len(values))
]

How to return a list that is made up of extracted elements from another list in python?

I am building a function to extract all negatives from a list called xs and I need it to add those extracted numbers into another list called new_home. I have come up with a code that I believe should work, however; it is only showing an empty list.
Example input/output:
xs=[1,2,3,4,0,-1,-2,-3,-4] ---> new_home=[1,2,3,4,0]
Here is my code that returns an empty list:
def extract_negatives(xs):
new_home=[]
for num in range(len(xs)):
if num <0:
new_home= new_home+ xs.pop(num)
return
return new_home
Why not use
[v for v in xs if v >= 0]
def extract_negatives(xs):
new_home=[]
for num in range(len(xs)):
if xs[num] < 0:
new_home.append(xs[num])
return new_home
for your code
But the Chuancong Gao solution is better:
def extract_negative(xs):
return [v for v in xs if v >= 0]
helper function filter could also help. Your function actually is
new_home = filter(lambda x: x>=0, xs)
Inside the loop of your code, the num variable doesn't really store the value of the list as you expect. The loop just iterates for len(xs) times and passes the current iteration number to num variable.
To access the list elements using loop, you should construct loop in a different fashion like this:
for element in list_name:
print element #prints all element.
To achieve your goal, you should do something like this:
another_list=[]
for element in list_name:
if(element<0): #only works for elements less than zero
another_list.append(element) #appends all negative element to another_list
Fortunately (or unfortunately, depending on how you look at it) you aren't examining the numbers in the list (xs[num]), you are examining the indexes (num). This in turn is because as a Python beginner you probably nobody haven't yet learned that there are typically easier ways to iterate over lists in Python.
This is a good (or bad, depending on how you look at it) thing, because had your code taken that branch you would have seen an exception occurring when you attempted to add a number to a list - though I agree the way you attempt it seems natural in English. Lists have an append method to put new elements o the end, and + is reserved for adding two lists together.
Fortunately ignorance is curable. I've recast your code a bit to show you how you might have written it:
def extract_negatives(xs):
out_list = []
for elmt in xs:
if elmt < 0:
out_list.append(elmt)
return out_list
As #ChuangongGoa suggests with his rather terse but correct answer, a list comprehension such as he uses is a much better way to perform simple operations of this type.

How can I limit iterations of a loop?

Say I have a list of items, and I want to iterate over the first few of it:
items = list(range(10)) # I mean this to represent any kind of iterable.
limit = 5
Naive implementation
The Python naïf coming from other languages would probably write this perfectly serviceable and performant (if unidiomatic) code:
index = 0
for item in items: # Python's `for` loop is a for-each.
print(item) # or whatever function of that item.
index += 1
if index == limit:
break
More idiomatic implementation
But Python has enumerate, which subsumes about half of that code nicely:
for index, item in enumerate(items):
print(item)
if index == limit: # There's gotta be a better way.
break
So we've about cut the extra code in half. But there's gotta be a better way.
Can we approximate the below pseudocode behavior?
If enumerate took another optional stop argument (for example, it takes a start argument like this: enumerate(items, start=1)) that would, I think, be ideal, but the below doesn't exist (see the documentation on enumerate here):
# hypothetical code, not implemented:
for _, item in enumerate(items, start=0, stop=limit): # `stop` not implemented
print(item)
Note that there would be no need to name the index because there is no need to reference it.
Is there an idiomatic way to write the above? How?
A secondary question: why isn't this built into enumerate?
How can I limit iterations of a loop in Python?
for index, item in enumerate(items):
print(item)
if index == limit:
break
Is there a shorter, idiomatic way to write the above? How?
Including the index
zip stops on the shortest iterable of its arguments. (In contrast with the behavior of zip_longest, which uses the longest iterable.)
range can provide a limited iterable that we can pass to zip along with our primary iterable.
So we can pass a range object (with its stop argument) to zip and use it like a limited enumerate.
zip(range(limit), items)
Using Python 3, zip and range return iterables, which pipeline the data instead of materializing the data in lists for intermediate steps.
for index, item in zip(range(limit), items):
print(index, item)
To get the same behavior in Python 2, just substitute xrange for range and itertools.izip for zip.
from itertools import izip
for index, item in izip(xrange(limit), items):
print(item)
If not requiring the index, itertools.islice
You can use itertools.islice:
for item in itertools.islice(items, 0, stop):
print(item)
which doesn't require assigning to the index.
Composing enumerate(islice(items, stop)) to get the index
As Pablo Ruiz Ruiz points out, we can also compose islice with enumerate.
for index, item in enumerate(islice(items, limit)):
print(index, item)
Why isn't this built into enumerate?
Here's enumerate implemented in pure Python (with possible modifications to get the desired behavior in comments):
def enumerate(collection, start=0): # could add stop=None
i = start
it = iter(collection)
while 1: # could modify to `while i != stop:`
yield (i, next(it))
i += 1
The above would be less performant for those using enumerate already, because it would have to check whether it is time to stop every iteration. We can just check and use the old enumerate if don't get a stop argument:
_enumerate = enumerate
def enumerate(collection, start=0, stop=None):
if stop is not None:
return zip(range(start, stop), collection)
return _enumerate(collection, start)
This extra check would have a slight negligible performance impact.
As to why enumerate does not have a stop argument, this was originally proposed (see PEP 279):
This function was originally proposed with optional start
and stop arguments. GvR [Guido van Rossum] pointed out that the function call
enumerate(seqn, 4, 6) had an alternate, plausible interpretation as
a slice that would return the fourth and fifth elements of the
sequence. To avoid the ambiguity, the optional arguments were
dropped even though it meant losing flexibility as a loop counter.
That flexibility was most important for the common case of
counting from one, as in:
for linenum, line in enumerate(source,1): print linenum, line
So apparently start was kept because it was very valuable, and stop was dropped because it had fewer use-cases and contributed to confusion on the usage of the new function.
Avoid slicing with subscript notation
Another answer says:
Why not simply use
for item in items[:limit]: # or limit+1, depends
Here's a few downsides:
It only works for iterables that accept slicing, thus it is more limited.
If they do accept slicing, it usually creates a new data structure in memory, instead of iterating over the reference data structure, thus it wastes memory (All builtin objects make copies when sliced, but, for example, numpy arrays make a view when sliced).
Unsliceable iterables would require the other kind of handling. If you switch to a lazy evaluation model, you'll have to change the code with slicing as well.
You should only use slicing with subscript notation when you understand the limitations and whether it makes a copy or a view.
Conclusion
I would presume that now the Python community knows the usage of enumerate, the confusion costs would be outweighed by the value of the argument.
Until that time, you can use:
for index, element in zip(range(limit), items):
...
or
for index, item in enumerate(islice(items, limit)):
...
or, if you don't need the index at all:
for element in islice(items, 0, limit):
...
And avoid slicing with subscript notation, unless you understand the limitations.
You can use itertools.islice for this. It accepts start, stop and step arguments, if you're passing only one argument then it is considered as stop. And it will work with any iterable.
itertools.islice(iterable, stop)
itertools.islice(iterable, start, stop[, step])
Demo:
>>> from itertools import islice
>>> items = list(range(10))
>>> limit = 5
>>> for item in islice(items, limit):
print item,
...
0 1 2 3 4
Example from docs:
islice('ABCDEFG', 2) --> A B
islice('ABCDEFG', 2, 4) --> C D
islice('ABCDEFG', 2, None) --> C D E F G
islice('ABCDEFG', 0, None, 2) --> A C E G
Why not simply use
for item in items[:limit]: # or limit+1, depends
print(item) # or whatever function of that item.
This will only work for some iterables, but since you specified Lists, it works.
It doesn't work if you use Sets or dicts etc.
Pass islice with the limit inside enumerate
a = [2,3,4,2,1,4]
for a, v in enumerate(islice(a, 3)):
print(a, v)
Output:
0 2
1 3
2 4
Why not loop until the limit or the end of list, whichever occurs earlier, like this:
items = range(10)
limit = 5
for i in range(min(limit, len(items))):
print items[i]
Output:
0
1
2
3
4
short solution
items = range(10)
limit = 5
for i in items[:limit]: print(i)

for-in loop's upper limit changing in each loop

How can I update the upper limit of a loop in each iteration? In the following code, List is shortened in each loop. However, the lenList in the for, in loop is not, even though I defined lenList as global. Any ideas how to solve this? (I'm using Python 2.sthg)
Thanks!
def similarity(List):
import difflib
lenList = len(List)
for i in range(1,lenList):
import numpy as np
global lenList
a = List[i]
idx = [difflib.SequenceMatcher(None, a, x).ratio() for x in List]
z = idx > .9
del List[z]
lenList = len(List)
X = ['jim','jimmy','luke','john','jake','matt','steve','tj','pat','chad','don']
similarity(X)
Looping over indices is bad practice in python. You may be able to accomplish what you want like this though (edited for comments):
def similarity(alist):
position = 0
while position < len(alist):
item = alist[position]
position += 1
# code here that modifies alist
A list will evaluate True if it has any entries, or False when it is empty. In this way you can consume a list that may grow during the manipulation of its items.
Additionally, if you absolutely have to have indices, you can get those as well:
for idx, item in enumerate(alist):
# code here, where items are actual list entries, and
# idx is the 0-based index of the item in the list.
In ... 3.x (I believe) you can even pass an optional parameter to enumerate to control the starting value of idx.
The issue here is that range() is only evaluated once at the start of the loop and produces a range generator (or list in 2.x) at that time. You can't then change the range. Not to mention that numbers and immutable, so you are assigning a new value to lenList, but that wouldn't affect any uses of it.
The best solution is to change the way your algorithm works not to rely on this behaviour.
The range is an object which is constructed before the first iteration of your loop, so you are iterating over the values in that object. You would instead need to use a while loop, although as Lattyware and g.d.d.c point out, it would not be very Pythonic.
What you are effectively looping on in the above code is a list which got generated in the first iteration itself.
You could have as well written the above as
li = range(1,lenList)
for i in li:
... your code ...
Changing lenList after li has been created has no effect on li
This problem will become quite a lot easier with one small modification to how your function works: instead of removing similar items from the existing list, create and return a new one with those items omitted.
For the specific case of just removing similarities to the first item, this simplifies down quite a bit, and removes the need to involve Numpy's fancy indexing (which you weren't actually using anyway, because of a missing call to np.array):
import difflib
def similarity(lst):
a = lst[0]
return [a] + \
[x for x in lst[1:] if difflib.SequenceMatcher(None, a, x).ratio() > .9]
From this basis, repeating it for every item in the list can be done recursively - you need to pass the list comprehension at the end back into similarity, and deal with receiving an empty list:
def similarity(lst):
if not lst:
return []
a = lst[0]
return [a] + similarity(
[x for x in lst[1:] if difflib.SequenceMatcher(None, a, x).ratio() > .9])
Also note that importing inside a function, and naming a variable list (shadowing the built-in list) are both practices worth avoiding, since they can make your code harder to follow.

Remove items from a list while iterating without using extra memory in Python

My problem is simple: I have a long list of elements that I want to iterate through and check every element against a condition. Depending on the outcome of the condition I would like to delete the current element of the list, and continue iterating over it as usual.
I have read a few other threads on this matter. Two solutions seam to be proposed. Either make a dictionary out of the list (which implies making a copy of all the data that is already filling all the RAM in my case). Either walk the list in reverse (which breaks the concept of the alogrithm I want to implement).
Is there any better or more elegant way than this to do it ?
def walk_list(list_of_g):
g_index = 0
while g_index < len(list_of_g):
g_current = list_of_g[g_index]
if subtle_condition(g_current):
list_of_g.pop(g_index)
else:
g_index = g_index + 1
li = [ x for x in li if condition(x)]
and also
li = filter(condition,li)
Thanks to Dave Kirby
Here is an alternative answer for if you absolutely have to remove the items from the original list, and you do not have enough memory to make a copy - move the items down the list yourself:
def walk_list(list_of_g):
to_idx = 0
for g_current in list_of_g:
if not subtle_condition(g_current):
list_of_g[to_idx] = g_current
to_idx += 1
del list_of_g[to_idx:]
This will move each item (actually a pointer to each item) exactly once, so will be O(N). The del statement at the end of the function will remove any unwanted items at the end of the list, and I think Python is intelligent enough to resize the list without allocating memory for a new copy of the list.
removing items from a list is expensive, since python has to copy all the items above g_index down one place. If the number of items you want to remove is proportional to the length of the list N, then your algorithm is going to be O(N**2). If the list is long enough to fill your RAM then you will be waiting a very long time for it to complete.
It is more efficient to create a filtered copy of the list, either using a list comprehension as Marcelo showed, or use the filter or itertools.ifilter functions:
g_list = filter(not_subtle_condition, g_list)
If you do not need to use the new list and only want to iterate over it once, then it is better to use ifilter since that will not create a second list:
for g_current in itertools.ifilter(not_subtle_condtion, g_list):
# do stuff with g_current
The built-in filter function is made just to do this:
list_of_g = filter(lambda x: not subtle_condition(x), list_of_g)
How about this?
[x for x in list_of_g if not subtle_condition(x)]
its return the new list with exception from subtle_condition
For simplicity, use a list comprehension:
def walk_list(list_of_g):
return [g for g in list_of_g if not subtle_condition(g)]
Of course, this doesn't alter the original list, so the calling code would have to be different.
If you really want to mutate the list (rarely the best choice), walking backwards is simpler:
def walk_list(list_of_g):
for i in xrange(len(list_of_g), -1, -1):
if subtle_condition(list_of_g[i]):
del list_of_g[i]
Sounds like a really good use case for the filter function.
def should_be_removed(element):
return element > 5
a = range(10)
a = filter(should_be_removed, a)
This, however, will not delete the list while iterating (nor I recommend it). If for memory-space (or other performance reasons) you really need it, you can do the following:
i = 0
while i < len(a):
if should_be_removed(a[i]):
a.remove(a[i])
else:
i+=1
print a
If you perform a reverse iteration, you can remove elements on the fly without affecting the next indices you'll visit:
numbers = range(20)
# remove all numbers that are multiples of 3
l = len(numbers)
for i, n in enumerate(reversed(numbers)):
if n % 3 == 0:
del numbers[l - i - 1]
print numbers
The enumerate(reversed(numbers)) is just a stylistic choice. You may use a range if that's more legible to you:
l = len(numbers)
for i in range(l-1, -1, -1):
n = numbers[i]
if n % 3 == 0:
del numbers[i]
If you need to travel the list in order, you can reverse it in place with .reverse() before and after the reversed iteration. This won't duplicate your list either.

Categories