Modify a list while iterating [duplicate] - python

This question already has answers here:
How to modify list entries during for loop?
(10 answers)
Closed 5 months ago.
I know you should not add/remove items while iterating over a list. But can I modify an item in a list I'm iterating over if I do not change the list length?
class Car(object):
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return type(self).__name__ + "_" + self.name

my_cars = [Car("Ferrari"), Car("Mercedes"), Car("BMW")]
print(my_cars)  # [Car_Ferrari, Car_Mercedes, Car_BMW]

for car in my_cars:
    car.name = "Moskvich"

print(my_cars)  # [Car_Moskvich, Car_Moskvich, Car_Moskvich]
Or should I iterate over the list indices instead? Like that:
for car_id in range(len(my_cars)):
    my_cars[car_id].name = "Moskvich"
The question is: are both of the ways above allowed, or is only the second one error-free?
If the answer is yes, will the following snippet be valid?
lovely_numbers = [[41, 32, 17], [26, 55]]
for numbers_pair in lovely_numbers:
    numbers_pair.pop()

print(lovely_numbers)  # [[41, 32], [26]]
UPD: I'd like to see the Python documentation that says these operations are allowed, rather than someone's assumptions.

You are not modifying the list, so to speak. You are simply modifying the elements in the list. I don't believe this is a problem.
To answer your second question, both ways are indeed allowed (as you know, since you ran the code), but it would depend on the situation. Are the contents mutable or immutable?
For example, if you want to add one to every element in a list of integers, this would not work:
>>> x = [1, 2, 3, 4, 5]
>>> for i in x:
...     i += 1
...
>>> x
[1, 2, 3, 4, 5]
Indeed, ints are immutable objects. Instead, you'd need to iterate over the indices and change the element at each index, like this:
>>> for i in range(len(x)):
...     x[i] += 1
...
>>> x
[2, 3, 4, 5, 6]
If your items are mutable, then the first method (iterating directly over the elements rather than over the indices) is more efficient, because it avoids the extra overhead of indexing.
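For instance, with mutable elements such as inner lists, direct iteration works fine (a minimal sketch; the rows data is just illustrative):

rows = [[1, 2], [3, 4], [5, 6]]
for row in rows:
    row.append(0)   # mutates each inner list in place

print(rows)  # [[1, 2, 0], [3, 4, 0], [5, 6, 0]]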

I know you should not add/remove items while iterating over a list. But can I modify an item in a list I'm iterating over if I do not change the list length?
You're not modifying the list in any way at all. What you are modifying is the elements in the list; that is perfectly fine. As long as you don't directly change the actual list, you're fine.
There's no need to iterate over the indices. In fact, that's unidiomatic. Unless you are actually trying to change the list itself, simply iterate over the list by value.
If the answer is yes, will the following snippet be valid?
lovely_numbers = [[41, 32, 17], [26, 55]]
for numbers_pair in lovely_numbers:
    numbers_pair.pop()

print(lovely_numbers)  # [[41, 32], [26]]
Absolutely, for the exact same reasons as above. You're not modifying lovely_numbers itself; you're only modifying the elements in lovely_numbers.

Examples of the list being modified while iterating over its elements, and of the same modification made while iterating over a fixed range instead:
list_modified_during_iteration.py

a = [1, 2]
i = 0
for item in a:
    if i < 5:
        print('append')
        a.append(i + 2)
    print(a)
    i += 1
list_not_modified_during_iteration.py (the loop variable item was changed to i, iterating over range(len(a)) instead of over a itself)

a = [1, 2]
i = 0
for i in range(len(a)):
    if i < 5:
        print('append')
        a.append(i + 2)
    print(a)
    i += 1

Of course, you can. The first way is normal, but in some cases you can also use list comprehensions or map().
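For instance, using the Car class from the question, a comprehension builds a new list while map() applies a function to each existing element (a sketch; whether you want new objects or in-place mutation depends on your use case):

my_cars = [Car("Ferrari"), Car("Mercedes"), Car("BMW")]

# New list of new objects via a list comprehension
renamed = [Car("Moskvich") for _ in my_cars]

# In-place mutation via map(); map() is lazy in Python 3,
# so wrap it in list() to force evaluation
def rename(car):
    car.name = "Moskvich"
    return car

renamed_in_place = list(map(rename, my_cars))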


"list index out of range" - Python

I was trying to write a piece of program that will remove any repeating items in the list, but I get a list index out of range
Here's the code:
a_list = [1, 4, 3, 2, 3]

def repeating(any_list):
    list_item, comparable = any_list, any_list
    for x in any_list:
        list_item[x]
        comparable[x]
        if list_item == comparable:
            any_list.remove(x)
    print(any_list)

repeating(a_list)
So my question is, what's wrong?
Your code does not do what you think it does.
First you are creating additional references to the same list here:
list_item, comparable = any_list, any_list
list_item and comparable are just additional names to access the same list object.
You then loop over the values contained in any_list:
for x in any_list:
This assigns first 1, then 4, then 3, then 2, then 3 again to x.
Next, you use those values as indexes into the other two references to the list, but ignore the result of those expressions:
list_item[x]
comparable[x]
This doesn't do anything, other than test if those indexes exist.
The following line then is always true:
if list_item == comparable:
because the two variables reference the same list object.
Because that is always true, the following line is always executed:
any_list.remove(x)
This removes the first x from the list, making the list shorter, while still iterating. This causes the for loop to skip items as it'll move the pointer to the next element. See Loop "Forgets" to Remove Some Items for why that is.
All in all, you end up with 4, then 3 items in the list, so list_item[3] then fails and throws the exception.
The proper way to remove duplicates is to use a set object:
def repeating(any_list):
    return list(set(any_list))
because a set can only hold unique items. It'll alter the order however. If the order is important, you can use a collections.OrderedDict() object:
def repeating(any_list):
    return list(OrderedDict.fromkeys(any_list))
Like a set, a dictionary can only hold unique keys, but an OrderedDict also keeps track of the order of insertion; the dict.fromkeys() method gives each element in any_list a value of None, and duplicates are simply ignored because the key is already present. Turning that back into a list gives you the unique elements in first-come, first-served order:
>>> from collections import OrderedDict
>>> a_list = [1, 4, 3, 2, 3]
>>> list(set(a_list))
[1, 2, 3, 4]
>>> list(OrderedDict.fromkeys(a_list))
[1, 4, 3, 2]
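On Python 3.7 and later, plain dicts also preserve insertion order, so the same trick works without the import:

>>> list(dict.fromkeys(a_list))
[1, 4, 3, 2]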
See How do you remove duplicates from a list whilst preserving order? for more options still.
The easiest way to solve your issue is to convert the list to a set and then back to a list:

def repeating(any_list):
    print(list(set(any_list)))
You're probably having the issue because you're modifying the list (removing from it) while iterating over it.
If you want to remove duplicates from a list but don't care about the element order, then you can use:

def removeDuplicate(numlist):
    return list(set(numlist))

If you want to preserve the order, then:

def removeDuplicate(numlist):
    return sorted(list(set(numlist)), key=numlist.index)
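For example (the exact ordering of the first result depends on the set implementation):

>>> numlist = [1, 4, 3, 2, 3]
>>> list(set(numlist))                             # order not guaranteed
[1, 2, 3, 4]
>>> sorted(list(set(numlist)), key=numlist.index)  # original order kept
[1, 4, 3, 2]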

creating recursive list length checker in python [duplicate]

This question already has answers here:
Flatten an irregular (arbitrarily nested) list of lists
(51 answers)
Closed 8 years ago.
I have a set of elements that have multiple other elements nested inside of them. I am trying to extract all of them through recursion, since I don't know how many levels deep the nesting goes. To put this in more Pythonic terms, imagine a list of elements. Each item on that list can be either a single value or another list of elements. Then, for each sub-list, there can be either a single value or more sub-lists. I want to go through all of them and pull out every element until the last of the sub-lists contains nothing but single items.
lst = [1,[[2,3,4],[5,6,7]],[[8,9,10],[[11,12,13],[14,15,16]]],17,18]
for i in lst:
    subElem = i.GetSubComponentIds()
    if subElem.Count >= 1:
        idsList.append(subElem)
        for i in subElem:
            subElem2 = i.GetSubComponentIds()
            if subElem2.Count >= 1: ... and so on
How would I set up a recursive function that takes every element of an input list and runs GetSubComponentIds() on it (which either returns another list or nothing)? If the return is a list, run GetSubComponentIds() on each item of that sub-list, and so on until nothing more is returned. For the elements that return nothing, I want their Id appended. So if I used the lst from the example above, I would end up with a list of all of the elements 1-18 (the only trick is that I don't know how many sub-lists deep each element of the original list is).
As I understand it, you want to use recursion to extract the elements buried in some nested object. Here is one method:
def is_list(x):
    # Replace this with an appropriate test for your type
    return hasattr(x, 'index')

def recurse(lst):
    if is_list(lst):
        elements = []
        for element in lst:
            elements += recurse(element)
        return elements
    else:
        return [lst]
Run on your sample list:
>>> recurse(lst)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
Please refer to the following code, which works with regular lists:
def flattern(lst, res):
    for elem in lst:
        if isinstance(elem, list):
            flattern(elem, res)
        else:
            res.append(elem)
Please adapt it to use your functions.
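A minimal sketch of that adaptation, assuming GetSubComponentIds() returns either a collection of child elements or an empty/None result for leaf elements (the exact return value depends on your API):

def collect_leaves(element, res):
    # Hypothetical adaptation of the flattener above
    sub_elements = element.GetSubComponentIds()
    if sub_elements:
        for sub in sub_elements:
            collect_leaves(sub, res)
    else:
        # No children: keep the element (or its Id) itself
        res.append(element)

ids = []
for item in lst:
    collect_leaves(item, ids)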

Is there a more Pythonic way to prevent adding a duplicate to a list?

Is there a more Pythonic (or succinct) way to prevent adding a duplicate to a list?
if item not in item_list:
    item_list.append(item)
Or is this in fact a cheap operation?
Since @hcwsha's original solution has been replaced, I'm recording it here:
seen = set(item_list)
# [...]
if item not in seen:
    seen.add(item)
    item_list.append(item)
This runs in O(1) and could therefore be considered better than the one you are currently using.
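Wrapped up as a small helper, the same pattern looks like this (a sketch; the names are illustrative):

def append_unique(item_list, seen, item):
    # seen is a set mirroring item_list, giving O(1) membership tests
    if item not in seen:
        seen.add(item)
        item_list.append(item)

item_list = [1, 7, 11]
seen = set(item_list)
append_unique(item_list, seen, 7)    # ignored, already present
append_unique(item_list, seen, 14)   # appended
print(item_list)                     # [1, 7, 11, 14]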
Your way is great! Sets are useful for this sort of thing, but as previously mentioned, they don't maintain order. Other, slightly more succinct ways of writing it, though maybe not as clear, are shown below:
item_list.append(item) if item not in item_list else None
and
item_list += [item] if item not in item_list else []
This last one can be adapted if you want to add multiple items, new_items = [item1, ...], like so:
item_list += [item for item in new_items if item not in item_list]
Use a set to keep track of seen items; sets provide O(1) lookup.
>>> item_list = [1, 7, 7, 7, 11, 14 ,100, 100, 4, 4, 4]
>>> seen = set()
>>> item_list[:] = [item for item in item_list
...                 if item not in seen and not seen.add(item)]
>>> item_list
[1, 7, 11, 14, 100, 4]
If order doesn't matter then simply use set() on item_list:
>>> set(item_list)
set([1, 100, 7, 11, 14, 4])
If you have multiple places where you append to the collection, it's not very convenient to write boilerplate like if item not in item_list: ... everywhere. You should either have a separate function that tracks changes to the collection, or subclass list and override its append method:
class CollisionsList(list):
    def append(self, other):
        if other in self:
            raise ValueError('--> Value already added: {0}'.format(other))
        super().append(other)

l = CollisionsList()
l.append('a')
l.append('b')
l.append('a')
print(l)
You can use the built-in set() function as shown below, and the list() function to convert that set object back to a normal Python list:
item_list = ['a', 'b', 'b']
print(list(set(item_list)))
# ['a', 'b']
Note: The order is not maintained when using sets
For when you have objects in a list and need to check a certain attribute to see if it's already in the list.
Not saying this is the best solution, but it does the job:
def _extend_object_list_prevent_duplicates(list_to_extend, sequence_to_add, unique_attr):
    """
    Extends list_to_extend with sequence_to_add (of objects), preventing duplicate values.
    Uses unique_attr to distinguish between objects.
    """
    objects_currently_in_list = {getattr(obj, unique_attr) for obj in list_to_extend}
    for obj_to_add in sequence_to_add:
        obj_identifier = getattr(obj_to_add, unique_attr)
        if obj_identifier not in objects_currently_in_list:
            list_to_extend.append(obj_to_add)
    return list_to_extend
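For example, with a hypothetical User class keyed on its id attribute:

class User:
    def __init__(self, uid, name):
        self.id = uid
        self.name = name

existing = [User(1, "Ann"), User(2, "Bob")]
incoming = [User(2, "Bobby"), User(3, "Cara")]

merged = _extend_object_list_prevent_duplicates(existing, incoming, "id")
print([u.id for u in merged])  # [1, 2, 3] -- the second object with id 2 is skipped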

how to convert two lists into a dictionary (one list is the keys and the other is the values)? [duplicate]

This question already has answers here:
How can I make a dictionary (dict) from separate lists of keys and values?
(21 answers)
Closed 6 years ago.
This is code from IDLE2 in Python, along with the error it produces. I need to use each "dato" element as a key and the corresponding "otro" element as its value, in order. Both "dato" and "otro" are lists of 38 strings, while "dik" is a dictionary.
>>> for i in range(len(otro)+1):
        dik[dato[i]] = otro[i]

Traceback (most recent call last):
  File "<pyshell#206>", line 2, in <module>
    dik[dato[i]] = otro[i]
IndexError: list index out of range
>>>
The problem: range(0, 38) outputs 0, 1, 2, 3 ... 37, and it all ends up messy.
I think something like:
dik = dict(zip(dato,otro))
is a little cleaner...
If dik already exists and you're just updating it:
dik.update(zip(dato,otro))
If you don't know about zip, you should invest a little time learning it. It's super useful.
a = [1, 2, 3, 4]
b = ['a', 'b', 'c', 'd']
zip(a, b)  # => [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')]  (this is actually a zip object on Python 3.x)
zip can also take more arguments (zip(a, b, c), for example, will give you a list of 3-tuples), but that's not terribly important for the discussion here.
This happens to be exactly one of the forms the dict "constructor" (type) accepts to initialize a set of key-value pairs: the first element in each tuple is the key and the second element is the value.
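Putting it together with small sample lists (the real dato and otro just need to be the same length):

>>> dato = ['a', 'b', 'c']
>>> otro = [1, 2, 3]
>>> dict(zip(dato, otro))
{'a': 1, 'b': 2, 'c': 3}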
The error comes from this: range(len(otro)+1). When you use range, the upper value isn't actually reached, so range(5), for instance, iterates over 0, 1, 2, 3, 4, and the last valid index of a 5-element list is 4. If you wrote for i in range(len(nums)+1): print(nums[i]) for that 5-element list, the final i would be len(nums) = 5, which, as you can see, would cause an IndexError.
The more 'Pythonic' way to iterate over something is to not use the len of the list - you iterate over the list itself, pulling out the index if necessary by using enumerate:
In [1]: my_list = ['one', 'two', 'three']

In [2]: for index, item in enumerate(my_list):
   ...:     print(index, item)
   ...:
0 one
1 two
2 three
Applying this to your case, you can then say:
>>> for index, item in enumerate(otro):
...     dik[dato[index]] = item
However, keeping with the Pythonicity theme, @mgilson's zip is the better version of this construct.

Linear merging for lists in Python

I'm working through Google's Python class exercises. One of the exercises is this:
Given two lists sorted in increasing order, create and return a merged list of all the elements in sorted order. You may modify the passed in lists. Ideally, the solution should work in "linear" time, making a single pass of both lists.
The solution I came up with was:
def linear_merge(list1, list2):
    list1.extend(list2)
    return sorted(list1)
It passed the test function, but the solution given is this:
def linear_merge(list1, list2):
    result = []
    # Look at the two lists so long as both are non-empty.
    # Take whichever element [0] is smaller.
    while len(list1) and len(list2):
        if list1[0] < list2[0]:
            result.append(list1.pop(0))
        else:
            result.append(list2.pop(0))
    # Now tack on what's left
    result.extend(list1)
    result.extend(list2)
    return result
Included as part of the solution was this:
Note: the solution above is kind of cute, but unfortunately list.pop(0) is
not constant time with the standard python list implementation, so the
above is not strictly linear time. An alternate approach uses pop(-1) to
remove the endmost elements from each list, building a solution list which
is backwards. Then use reversed() to put the result back in the correct
order. That solution works in linear time, but is more ugly.
Why are these two solutions so different? Am I missing something, or are they being unnecessarily complicated?
They're encouraging you to think about the actual method (algorithm) of merging two sorted lists. Suppose you had two stacks of paper with names on them, each in alphabetical order, and you wanted to make one sorted stack from them. You wouldn't just lump them together and then sort that from scratch; that would be too much work. You'd make use of the fact that each pile is already sorted, so you can just take the one that comes first off of one pile or the other, and put them into a new stack.
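One way to write that idea down without any pop(0) calls is a two-index merge. This is a sketch of the standard technique, not the course's official solution:

def linear_merge(list1, list2):
    result = []
    i = j = 0
    # Walk both lists once, always taking the smaller front element
    while i < len(list1) and j < len(list2):
        if list1[i] < list2[j]:
            result.append(list1[i])
            i += 1
        else:
            result.append(list2[j])
            j += 1
    # One list is exhausted; tack on whatever remains of the other
    result.extend(list1[i:])
    result.extend(list2[j:])
    return result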
As you noted, your solution works perfectly. So why the complexity? Well, for a start
Ideally, the solution should work in "linear" time, making a single
pass of both lists.
Well, you're not explicitly passing through any lists, but you are calling sorted(). So how many times will sorted() pass over the lists?
Well, I don't actually know. Normally, a sorting algorithm would operate in something like O(n*log(n)) time, though look at this quote from the Python docs:
The Timsort algorithm used in Python does multiple sorts efficiently
because it can take advantage of any ordering already present in a
dataset.
Maybe someone who knows timsort better can figure it out.
But what they're doing in the solution, is using the fact that they know they have 2 sorted lists. So rather than starting from "scratch" with sorted, they're picking off elements 1 by 1.
I like @Abhijit's approach the most. Here is a slightly more Pythonic/readable version of his code snippet:
def linear_merge(list1, list2):
    result = []
    while list1 and list2:
        result.append((list1 if list1[-1] > list2[-1] else list2).pop(-1))
    return (result + list1 + list2)[-1::-1]
With the help of built-in Python features, we:
- don't need to explicitly check if the lists are empty with the len function.
- can merge/append empty lists and the result will remain unchanged, so there is no need for explicit checking.
- can combine multiple statements (if readability allows), which sometimes makes the code more compact.
result = []
while list1 and list2:
    result.append((list1 if list1[-1] > list2[-1] else list2).pop(-1))
if len(list1):
    result += list1[-1::-1]
if len(list2):
    result += list2[-1::-1]
return result[-1::-1]
The solutions by @Abhijit and @intel do not work in all cases, because they do not reverse the leftover parts of the original lists. If we have list1 = [1, 2, 3, 5, 9, 11, 13, 17] and list2 = [6, 7, 12, 15], then their solution would give [5, 3, 2, 1, 6, 7, 9, 11, 12, 13, 15, 17] where we would want [1, 2, 3, 5, 6, 7, 9, 11, 12, 13, 15, 17].
Your solution is O(n log n), which means that if your lists were 10 times as long, the program would take (roughly) 30 times as much time. Their solution would only take 10 times as long.
Pop off the end of the lists until one is empty. I think this is linear, and the reverses are linear too. Ugly, but a solution:
def linear_merge(list1, list2):
    # NOT return sorted(list1 + list2), as this is not linear
    list3 = []
    rem = []
    empty = False
    while not empty:
        # Get last items from each list, if they exist
        if len(list1) > 0:
            a = list1[-1]
        else:
            rem = list2[:]
            empty = True
        if len(list2) > 0:
            b = list2[-1]
        else:
            rem = list1[:]
            empty = True
        # Pop the one that's largest onto the new list
        if not empty:
            if a > b:
                list3.append(a)
                list1.pop()
            else:
                list3.append(b)
                list2.pop()
    # add the (reversed) remainder to the list
    rem.reverse()
    list3 += rem
    # reverse the entire list
    list3.reverse()
    return list3
A slightly refined but still ugly solution (in Python 3.5):
def linear_merge(list1: list, list2: list):
    result = []
    while len(list1) and len(list2):
        result.append((list1 if list1[-1] > list2[-1] else list2).pop(-1))
    result += list1 if len(list1) else list2
    return result[-1::-1]
def linear_merge(list1, list2):
    a = list1 + list2
    a.sort()
    return a
