Iterating over lists of lists in Python - python

I have a list of lists:
lst1 = [["(a)", "(b)", "(c)"],["(d)", "(e)", "(f)", "(g)"]]
I want to iterate over each element and perform some string operations on it, for example:
replace("(", "")
I tried iterating over the list using:
for l1 in lst1:
    for i in l1:
        lst2.append(list(map(str.replace("(", ""), l1)))
I want the result to have the same structure as the original list of lists, but without the parentheses. Also, I am looking for a general method for editing lists of lists rather than a specific solution to this one case.
Thank you,

Edit:
Yes, you should use normal for-loops if you want to:
Perform multiple operations on each item contained in each sub-list.
Keep both the main list as well as the sub-lists as the same objects.
Below is a simple demonstration of how to do this:
main = [["(a)", "(b)", "(c)"], ["(d)", "(e)", "(f)", "(g)"]]
print(id(main))
print(id(main[0]))
print(id(main[1]))
print()
for sub in main:
    for index, item in enumerate(sub):
        ### Perform operations ###
        item = item.replace("(", "")
        item = item.replace(")", "")
        item *= 2
        sub[index] = item  # Reassign the item
print(main)
print()
print(id(main))
print(id(main[0]))
print(id(main[1]))
Output:
25321880
25321600
25276288
[['aa', 'bb', 'cc'], ['dd', 'ee', 'ff', 'gg']]
25321880
25321600
25276288
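As a more general take on the loop above, the same pattern can be wrapped in a small helper. This is only a sketch: apply_in_place and clean are hypothetical names, not from the original answer, and it assumes a list nested exactly two levels deep.

```python
def apply_in_place(nested, func):
    """Replace every item of every sub-list with func(item), keeping all list objects."""
    for sub in nested:
        for index, item in enumerate(sub):
            sub[index] = func(item)


def clean(item):
    # Hypothetical operation: drop both parentheses.
    return item.replace("(", "").replace(")", "")


main = [["(a)", "(b)", "(c)"], ["(d)", "(e)", "(f)", "(g)"]]
outer_id, inner_ids = id(main), [id(sub) for sub in main]

apply_in_place(main, clean)
print(main)  # [['a', 'b', 'c'], ['d', 'e', 'f', 'g']]
```

Any function that takes one item and returns its replacement can be passed in place of clean.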
Use a nested list comprehension:
>>> lst1 = [["(a)", "(b)", "(c)"],["(d)", "(e)", "(f)", "(g)"]]
>>> id(lst1)
35863808
>>> lst1[:] = [[y.replace("(", "") for y in x] for x in lst1]
>>> lst1
[['a)', 'b)', 'c)'], ['d)', 'e)', 'f)', 'g)']]
>>> id(lst1)
35863808
>>>
The [:] will keep the list object the same.
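One caveat, sketched below on the assumption that other code holds references to the sub-lists: lst1[:] = ... keeps the outer list object, but the comprehension still builds new inner lists. Slice-assigning each sub-list keeps those objects too. The name first_row is only illustrative.

```python
lst1 = [["(a)", "(b)", "(c)"], ["(d)", "(e)", "(f)", "(g)"]]
first_row = lst1[0]  # an outside reference to a sub-list

# Slice-assign each sub-list so the inner list objects survive as well.
for sub in lst1:
    sub[:] = [item.replace("(", "") for item in sub]

print(lst1[0] is first_row)  # True
print(first_row)             # ['a)', 'b)', 'c)']
```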

I did essentially what you did, but used the fact that each element of a list can be assigned a new (or updated) value:
>>> lst1 = [["(a)", "(b)", "(c)"],["(d)", "(e)", "(f)", "(g)"]]
>>> for x in range(len(lst1)):
...     for y in range(len(lst1[x])):
...         lst1[x][y] = lst1[x][y].replace("(", "")
...
>>> lst1
[['a)', 'b)', 'c)'], ['d)', 'e)', 'f)', 'g)']]
EDIT
This is how you do it with the "real problem" that you mentioned in the comment:
a = [[(12.22, 12.122, 0.000)], [(1232.11, 123.1231, 0.000)]]
some_num = 10
for x in range(len(a)):
    b = list(a[x][0])
    for y in range(len(b)):
        b[y] *= some_num
    a[x] = tuple(b)
print(a)
OUTPUT:
[(122.2, 121.22, 0.0), (12321.099999999999, 1231.231, 0.0)]
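^ All elements have been multiplied by some_num. Note that a[x] = tuple(b) replaces each one-element sublist with the tuple itself; use a[x] = [tuple(b)] if you need to keep the original nesting.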
This is how it works:
The initial list 'a' has two sublists, each with only ONE element (the tuple that contains the x, y, z coordinates). I go through list 'a', convert each tuple to a list, and assign it to 'b' (so the fourth line gives 'b' the value [12.22, 12.122, 0.000] the first time around, and the next tuple, as a list, the next time around).
Then I go through each element of 'b' (the tuple converted into a list) and multiply it by a number using an augmented assignment operator (+=, -=, /=, *=). Once this loop is done, I set that same position in the master list 'a' to 'b' converted back to a tuple. In other words, the initial tuples are converted into lists (because tuples cannot be modified in place), operated on, and then converted back to tuples (since you want to end up with the same element type as before).
Hope this helps!

>>> lst1 = [["(a)", "(b)", "(c)"],["(d)", "(e)", "(f)", "(g)"]]
>>> [[j.strip('()') for j in i] for i in lst1]
[['a', 'b', 'c'], ['d', 'e', 'f', 'g']]
>>> [[j.lstrip('(') for j in i] for i in lst1]
[['a)', 'b)', 'c)'], ['d)', 'e)', 'f)', 'g)']]
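Note that strip only trims characters from the two ends of a string, whereas replace removes every occurrence. A small sketch of the difference, using a made-up string with an inner parenthesis; str.translate is shown as one option for deleting several characters at once.

```python
s = "(a(b))"  # made-up sample with a parenthesis in the middle

stripped = s.strip("()")                               # trims only the ends
replaced = s.replace("(", "").replace(")", "")         # removes every occurrence
translated = s.translate(str.maketrans("", "", "()"))  # deletes both characters in one pass

print(stripped)    # a(b
print(replaced)    # ab
print(translated)  # ab
```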

Related

For Loop incrementing by 2s instead of 1 [duplicate]

Now I know that it is not safe to modify the list during an iterative looping. However, suppose I have a list of strings, and I want to strip the strings themselves. Does replacement of mutable values count as modification?
See Scope of python variable in for loop for a related problem: assigning to the iteration variable does not modify the underlying sequence, and also does not impact future iteration.
Since the loop below only modifies elements already seen, it would be considered acceptable:
a = ['a',' b', 'c ', ' d ']
for i, s in enumerate(a):
    a[i] = s.strip()
print(a) # -> ['a', 'b', 'c', 'd']
Which is different from:
a[:] = [s.strip() for s in a]
in that it doesn't require the creation of a temporary list and an assignment of it to replace the original, although it does require more indexing operations.
Caution: Although you can modify entries this way, you can't change the number of items in the list without risking the chance of encountering problems.
Here's an example of what I mean—deleting an entry messes-up the indexing from that point on:
b = ['a', ' b', 'c ', ' d ']
for i, s in enumerate(b):
    if s.strip() != b[i]:  # leading or trailing whitespace?
        del b[i]
print(b) # -> ['a', 'c '] # WRONG!
(The result is wrong because it didn't delete all the items it should have.)
Update
Since this is a fairly popular answer, here's how to effectively delete entries "in-place" (even though that's not exactly the question):
b = ['a',' b', 'c ', ' d ']
b[:] = [entry for entry in b if entry.strip() == entry]
print(b) # -> ['a'] # CORRECT
See How to remove items from a list while iterating?.
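If a comprehension does not fit, another commonly used option is to walk the indices backwards, so that a deletion never shifts an item that has not been visited yet. A minimal sketch using the same sample data:

```python
b = ['a', ' b', 'c ', ' d ']

# Visit indices from last to first; deleting b[i] only moves items after i,
# and those have already been checked.
for i in range(len(b) - 1, -1, -1):
    if b[i].strip() != b[i]:
        del b[i]

print(b)  # ['a']
```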
It's considered poor form. Use a list comprehension instead, with slice assignment if you need to retain existing references to the list.
a = [1, 3, 5]
b = a
a[:] = [x + 2 for x in a]
print(b)
One more for loop variant, looks cleaner to me than one with enumerate():
for idx in range(len(list)):
    list[idx]=... # set a new value
    # some other code which doesn't let you use a list comprehension
Modifying each element while iterating over a list is fine, as long as you do not add elements to or remove elements from the list.
You can use list comprehension:
l = ['a', ' list', 'of ', ' string ']
l = [item.strip() for item in l]
or just use an index-based for loop with enumerate:
for index, item in enumerate(l):
    l[index] = item.strip()
The answer given by Ignacio Vazquez-Abrams is really good. It can be further illustrated by this example. Imagine that:
A list with two vectors is given to you.
You would like to traverse the list and reverse the order of each one of the arrays.
Let's say you have:
import numpy as np
v = np.array([1,2,3,4])
b = np.array([3,4,6])
for i in [v, b]:
    i = i[::-1] # This only rebinds the name i; the arrays are not reversed.
print([v,b])
You will get:
[array([1, 2, 3, 4]), array([3, 4, 6])]
On the other hand, if you do:
v = np.array([1,2,3,4])
b = np.array([3,4,6])
for i in [v, b]:
    i[:] = i[::-1] # Slice assignment writes into the array, so it is reversed in place.
print([v,b])
The result is:
[array([4, 3, 2, 1]), array([6, 4, 3])]
No, you wouldn't alter the "content" of the list if you could mutate strings that way. But in Python strings are immutable: any string operation returns a new string.
If you had a list of objects you knew were mutable, you could do this as long as you don't change the actual contents of the list.
Thus you will need to do a map of some sort. If you use a generator expression it [the operation] will be done as you iterate and you will save memory.
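For illustration, a generator expression looks like a list comprehension written in parentheses, and it yields each transformed item on demand rather than building a whole list first. The sample list words is made up.

```python
words = ['a', ' list', 'of ', ' string ']  # made-up sample data

# Nothing is stripped yet; the work happens as the generator is consumed.
stripped = (w.strip() for w in words)

result = []
for w in stripped:
    result.append(w)

print(result)  # ['a', 'list', 'of', 'string']
print(words)   # the original list is untouched
```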
You can do something like this:
a = [1,2,3,4,5]
b = [i**2 for i in a]
It's called a list comprehension, to make it easier for you to loop inside a list.
It is not clear from your question what the criteria are for deciding which strings to remove, but if you have or can make a list of the strings that you want to remove, you could do the following:
my_strings = ['a','b','c','d','e']
undesirable_strings = ['b','d']
for undesirable_string in undesirable_strings:
    for i in range(my_strings.count(undesirable_string)):
        my_strings.remove(undesirable_string)
which changes my_strings to ['a', 'c', 'e']
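In short, to modify a list while iterating over that same list, assign a comprehension to a slice of it:
list[:] = ["new value" for each_element in list "condition check"]
Example (keeps everything except "data1" and "data2"; calling list.remove inside the comprehension would not work, since remove returns None and changes the list while it is being iterated):
list[:] = [each_element for each_element in list if each_element not in ["data1", "data2"]]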
Something I just discovered - when looping over a list of mutable types (such as dictionaries) you can just use a normal for loop like this:
l = [{"n": 1}, {"n": 2}]
for d in l:
    d["n"] += 1
print(l)
# prints [{'n': 2}, {'n': 3}]

Easier way to check if an item from one list of tuples doesn't exist in another list of tuples in python

I have two lists of tuples, say,
list1 = [('item1',),('item2',),('item3',), ('item4',)] # Contains just one item per tuple
list2 = [('item1', 'd',),('item2', 'a',),('item3', 'f',)] # Contains multiple items per tuple
Expected output: 'item4' # Item that doesn't exist in list2
As shown in above example I want to check which item in tuples in list 1 does not exist in first index of tuples in list 2. What is the easiest way to do this without running two for loops?
Assuming your tuple structure is exactly as shown above, this would work:
tuple(set(x[0] for x in list1) - set(x[0] for x in list2))
or per #don't talk just code, better as set comprehensions:
tuple({x[0] for x in list1} - {x[0] for x in list2})
result:
('item4',)
This gives you {'item4'}:
next(zip(*list1)) - dict(list2).keys()
The next(zip(*list1)) gives you the tuple ('item1', 'item2', 'item3', 'item4').
The dict(list2).keys() gives you dict_keys(['item1', 'item2', 'item3']), which happily offers you set operations like that set difference.
This is the only way I can think of doing it, not sure if it helps though. I removed the commas in the items in list1 because I don't see why they are there and it affects the code.
list1 = [('item1'),('item2'),('item3'), ('item4')] # Contains just one item per tuple
list2 = [('item1', 'd',),('item2', 'a',),('item3', 'f',)] # Contains multiple items per tuple
not_in_tuple = []
OutputTuple = [(a) for a, b in list2]
for i in list1:
    if i in OutputTuple:
        pass
    else:
        not_in_tuple.append(i)
for i in not_in_tuple:
    print(i)
You don't really have a choice but to loop over the two lists. One efficient way could be to first construct a set of the first elements of list2:
items = {e[0] for e in list2}
list3 = list(filter(lambda x:x[0] not in items, list1))
Output:
>>> list3
[('item4',)]
Try set.difference:
>>> set(next(zip(*list1))).difference(dict(list2))
{'item4'}
>>>
Or even better:
>>> set(list1) ^ {x[:1] for x in list2}
{('item4',)}
>>>
This is a difference operation for sets:
set1 = set(j[0] for j in list1)
set2 = set(j[0] for j in list2)
result = set1.difference(set2)
output:
{'item4'}
for i in list1:
    a=i[0]
    for j in list2:
        b=j[0]
        if a==b:
            break
    else:
        print(a)
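The else clause above belongs to the inner for loop and runs only when that loop finishes without hitting break, that is, when no match was found. A rough equivalent using any(), with the lists from the question (the name missing is only for illustration):

```python
list1 = [('item1',), ('item2',), ('item3',), ('item4',)]
list2 = [('item1', 'd'), ('item2', 'a'), ('item3', 'f')]

missing = []
for i in list1:
    # any() stops at the first match, much like break in the nested loop.
    if not any(i[0] == j[0] for j in list2):
        missing.append(i[0])

print(missing)  # ['item4']
```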

Replace elements in a list of lists python

I have a list of lists as follows:
list=[]
*some code to append elements to list*
list=[['a','bob'],['a','bob'],['a','john']]
I want to go through this list and change all instances of 'bob' to 'b' and leave others unchanged.
for x in list:
    for a in x:
        if "bob" in a:
            a.replace("bob", 'b')
After the loop, printing the list shows that it is unchanged, whereas I expected the following:
list=[['a','b'],['a','b'],['a','john']]
Why is the change not being reflected in list?
Because str.replace doesn't work in-place; it returns a copy. Since strings are immutable, you need to assign the new strings back to the elements of your list of lists.
You can assign directly to your list of lists if you extract indexing integers via enumerate:
L = [['a','bob'],['a','bob'],['a','john']]
for i, x in enumerate(L):
    for j, a in enumerate(x):
        if 'bob' in a:
            L[i][j] = a.replace('bob', 'b')
Result:
[['a', 'b'], ['a', 'b'], ['a', 'john']]
More Pythonic would be to use a list comprehension to create a new list. For example, if only the second of two values contains names which need checking:
L = [[i, j if j != 'bob' else 'b'] for i, j in L]
You can try using a Python dictionary to hold the replacements:
import numpy as np
L = [['a','bob'],['a','bob'],['a','john']]
dic = {'bob':'b'} # you can specify more changes here
new_list = [dic.get(n, n) for n in np.concatenate(L)]
print(np.reshape(new_list,[-1,2]).tolist())
Result is
[['a', 'b'], ['a', 'b'], ['a', 'john']]
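The same lookup idea also works without NumPy and without assuming that every sub-list has exactly two elements. A sketch using a plain nested comprehension with dict.get (the name replacements is only for illustration):

```python
L = [['a', 'bob'], ['a', 'bob'], ['a', 'john']]
replacements = {'bob': 'b'}  # add more old -> new pairs as needed

# dict.get(item, item) returns the replacement if one exists, else the item itself.
new_list = [[replacements.get(item, item) for item in row] for row in L]

print(new_list)  # [['a', 'b'], ['a', 'b'], ['a', 'john']]
```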
I'm going to use a simple example, but basically x is another variable and isn't linked to the list element. You have to change the list element directly in order to alter the list.
l=[1,2,3,4]
for x in l:
    x=x+1
This doesn't change the list
l=[1,2,3,4]
for i,x in enumerate(l):
    l[i]=x+1
this changes the list
I might be a little late to the party, but a more Pythonic way of doing this is to use map inside a list comprehension. It works on a list of lists with any number of values.
l = [['a','bob'],['a','bob'],['a','john']]
[list(map(lambda x: x if x != 'bob' else 'b', i)) for i in l]
it gives you the desired output
[['a', 'b'], ['a', 'b'], ['a', 'john']]
The main idea is that the comprehension visits each inner list, and map applies the simple lambda function to every element of it to perform the replacement.
I hope that this helps anyone else who is looking out for something similar.
This is the case because you are only changing the temporary variable a.
list = [1,2,3]
for i in list:
    i+=1
list will still be [1,2,3]
you have to edit the string based on its index in the list

Python list for loop

This is a function that I saw to find the unique items in an array in order, I am new to python but this seemed very elegant.
unique_in_order = lambda l: [z for i, z in enumerate(l) if i == 0 or l[i - 1] != z]
How exactly does this for loop work?
z for i,z in enumerate(l)
enumerate(..) is a builtin function that takes as input an iterable object (l here) and generates a sequence of tuples containing the index and the element for each element.
So enumerate([1,4,2,5]) emits tuples like (0,1), (1,4), (2,2), (3,5). If you use a comma-separated list of identifiers in the head of the for loop, each tuple is unpacked into those names. So:
for i,z in enumerate([1,4,2,5]):
    pass
will iterate four times, the first time i will be 0 and z 1; the next iteration i will be 1 and z 4; the next iteration i will be 2 and z 2; the next iteration i will be 3 and z 5.
Now your statement also contains some list comprehension, the first z in z for i,z in enumerate(l) means it will emit the z values. Notice furthermore that there is condition (the if part), so not all values will be emitted.
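Written out as an ordinary loop, the comprehension does roughly the following (the function name unique_in_order_loop is only for illustration):

```python
def unique_in_order_loop(l):
    result = []
    for i, z in enumerate(l):
        # Keep the first item, and any item that differs from its predecessor.
        if i == 0 or l[i - 1] != z:
            result.append(z)
    return result


print(unique_in_order_loop([1, 1, 4, 4, 2, 5, 5]))  # [1, 4, 2, 5]
```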
You should start with concept of list comprehensions in python to understand what does this lambda function do. In short it creates list of z elements that meet a condition on right side of statement.
Another important thing is the builtin enumerate function, which simply yields tuples, each consisting of an index and the element at that index.
enumerate() helps you to iterate over both the indices and the items of sequences at once.
Here is an example :
>>> l=['a','b','c']
>>> for index,value in enumerate(l):
...     print (index,value)
...
0 a
1 b
2 c
The solution you've posted is wrong and doesn't return unique elements as it only checks for duplicates on the previous item only (l[i-1]!=z).
To elaborate on what I meant, here is a test run :
>>> unique_in_order = lambda l: [z for i, z in enumerate(l) if i == 0 or l[i - 1] != z]
>>> l=[1,1,123,5,6,123]
>>> unique_in_order(l)
[1, 123, 5, 6, 123]
You can see that 123 occurs twice because it was tested only against its previous element 6.
Before I provide a simple solution, we need to be clear that we are finding unique items from a list in order or we are trying to get rid of duplicates entirely.
A simple and elegant solution would be to use list.count method. It returns the number of times an item occurs in the list.
>>> l=['a', 'a',2,5,6,'b', 'c', 'd', 'e','e',2,2,6]
>>> [x for x in l if l.count(x)<2]
[5, 'b', 'c', 'd']
If you did not mean to discard the duplicates entirely and instead wanted the list to keep a single occurrence of each duplicated item, then you can do this:
>>> l=['a', 'a',2,5,6,'b', 'c', 'd', 'e','e',2,2,6]
>>> dups=set()
>>> [x for x in l if x not in dups and (dups.add(x) or True)]
['a', 2, 5, 6, 'b', 'c', 'd', 'e']
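On Python 3.7 and later, where dictionaries are guaranteed to preserve insertion order, the same result can be had with dict.fromkeys. This is a sketch that assumes all items are hashable.

```python
l = ['a', 'a', 2, 5, 6, 'b', 'c', 'd', 'e', 'e', 2, 2, 6]

# Dict keys are unique and keep first-seen order (guaranteed since Python 3.7).
unique = list(dict.fromkeys(l))

print(unique)  # ['a', 2, 5, 6, 'b', 'c', 'd', 'e']
```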

List index out of range error

So I am getting a list index out of range error in python again, and I can't figure out what's wrong.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
f1 = open("membrane_GO.txt","r")
new_list1 = f1.readlines()
new_list2 = new_list1
for i in range(len(new_list1)):
    if "Reactome" in new_list1[i]:
        new_list2.pop(i)
print new_list2
f1.close()
I made sure that a duplicate list is being modified while the primary list is iterated over, so that can't be the problem.
Appreciate any help
Thanks :)
You only duplicated a reference to the list. If you want to make a separate copy of a list, use a slice: list2 = list1[:], or look into copy.deepcopy from the copy module.
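A short sketch of the difference between a second reference, a slice copy, and a deep copy. The sample data and the names alias, shallow and deep are made up for illustration.

```python
import copy

list1 = [['a'], ['b']]

alias = list1                # same object, no copy at all
shallow = list1[:]           # new outer list, shared inner lists
deep = copy.deepcopy(list1)  # new outer list and new inner lists

print(alias is list1)          # True
print(shallow is list1)        # False
print(shallow[0] is list1[0])  # True
print(deep[0] is list1[0])     # False
```

For a flat list of strings, such as the lines of a file, the slice copy is enough.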
When you pop, the array size goes down. That means if the list has length 10, and you pop(0), then the list has length 9. If you then pop(9), which doesn't exist it will give you an out of bounds error.
Example:
>>> x = [0,1,2,3,4]
>>> print(x, len(x))
[0, 1, 2, 3, 4] 5
>>> x.pop(0)
0
>>> print(x, len(x))
[1, 2, 3, 4] 4
This is an error in your case because you go from 0 to len(new_list1).
The approach I advise you to take is to create a new list where "Reactome" is not in new_list1[i].
You can do this easily in a list comprehension.
with open("membrane_GO.txt","r") as f:
    lines = [line for line in f.readlines() if "Reactome" not in line]
print(lines)
Assume that your list is initially ['a', 'b', 'c'],
then list1 = list2 = ['a', 'b', 'c'].
Then you iterate len(list2) times, i.e. 3 times,
so i will take the values 0, 1, and 2.
In each iteration you are removing one element from list1.
i = 0
remove list1[0]
new list = ['b', 'c']
i = 1
remove list1[1]
new list = ['b']
i = 2
remove list1[2], which does not exist.
So you will get an index out of range error.
Just to add to TigerHawk's answer:
Because you have only duplicated the reference (not the list itself), when you pop() an element out of new_list2, you also remove it from new_list1, because they're both references to the same list.
Say there are 'n' elements in new_list1 at the start of the loop. It will run for 'n' iterations.
Suppose then that you pop an element out of new_list2 (and so out of new_list1 as well) within the loop. You will get an index out of range error when the loop tries to access the 'nth' element of a list which now has only 'n-1' elements in it.
For this to work properly use slicing to copy the list:
new_list2 = new_list1[:]
Incidentally, for i in range(len(new_list1)): is considered un-pythonic, I believe. A 'better' way would be to use enumerate:
for index, element in enumerate(new_list1):
    if "Reactome" in element:
        new_list2.pop(index)
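One thing to watch for, even with a real copy: popping from new_list2 by an index taken from new_list1 stops lining up after the first removal, because the remaining items in new_list2 shift left. A sketch with made-up lines standing in for the file contents:

```python
# Made-up sample standing in for the lines of membrane_GO.txt.
new_list1 = ["Reactome x", "keep 1", "Reactome y", "keep 2"]

new_list2 = new_list1[:]
for index, element in enumerate(new_list1):
    if "Reactome" in element:
        new_list2.pop(index)  # index refers to new_list1, not the shrunken new_list2

print(new_list2)  # ['keep 1', 'Reactome y'], which is not what was intended

# Filtering into a new list avoids the index bookkeeping entirely.
filtered = [line for line in new_list1 if "Reactome" not in line]
print(filtered)   # ['keep 1', 'keep 2']
```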
