When looping through a list, you can work with the current item of the list. For example, if you want to replace certain items with others, you can use:
a=['a','b','c','d','e']
b=[]
for i in a:
    if i=='b':
        b.append('replacement')
    else:
        b.append(i)
print b
['a', 'replacement', 'c', 'd', 'e']
However, I wish to replace certain values based not on index i, but on index i+1. I've been trying for ages and I can't seem to make it work. I would like something like this:
c=['a','b','c','d','e']
d=[]
for i in c:
    if i+1=='b':
        d.append('replacement')
    else:
        d.append(i)
print d
d=['replacement','b','c','d','e']
Is there any way to achieve this?
Use a list comprehension along with enumerate
>>> ['replacement' if a[i+1]=='b' else v for i,v in enumerate(a[:-1])]+[a[-1]]
['replacement', 'b', 'c', 'd', 'e']
The code replaces every element whose next element is 'b'. To avoid an IndexError at the last index, we loop only up to the penultimate element and then append the last element unchanged.
Without a list comprehension
a=['a','b','c','d','e']
d=[]
for i,v in enumerate(a[:-1]):
    if a[i+1]=='b':
        d.append('replacement')
    else:
        d.append(v)
d.append(a[-1])
print d
It's generally better style to not iterate over indices in Python. A common way to approach a problem like this is to use zip (or the similar izip_longest in itertools) to see multiple values at once:
In [32]: from itertools import izip_longest
In [33]: a=['a','b','c','d','e']
In [34]: b = []
In [35]: for c, next in izip_longest(a, a[1:]):
   ....:     if next == 'd':
   ....:         b.append("replacement")
   ....:     else:
   ....:         b.append(c)
   ....:
In [36]: b
Out[36]: ['a', 'b', 'replacement', 'd', 'e']
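On Python 3 the same idea can be sketched with itertools.zip_longest, which is the Python 3 name for izip_longest. The names current, following and result are illustrative; following is used instead of next only to avoid shadowing the built-in.

```python
from itertools import zip_longest

a = ['a', 'b', 'c', 'd', 'e']
result = []
# Pair each item with the one after it; the last item is paired with None.
for current, following in zip_longest(a, a[1:]):
    if following == 'd':
        result.append('replacement')
    else:
        result.append(current)

print(result)  # ['a', 'b', 'replacement', 'd', 'e']
```

Because the last item is paired with None, it is kept unchanged without any special handling of the boundary.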
I think there's a confusion in your post between the list indices and list elements. In the loop as you have written it i will be the actual element (e.g. 'b') and not the index, thus i+1 is meaningless and will throw a TypeError exception.
I think one of the smallest set of changes you can do to your example to make it work is:
c = ['a', 'b', 'c', 'd', 'e']
d = []
for i, el in enumerate(c[:-1]):
    if c[i + 1] == 'b':
        d.append('replacement')
    else:
        d.append(el)
print d
# Output...
# ['replacement', 'b', 'c', 'd']
Additionally it's undefined how you should deal with the boundaries. Particularly when i points to the last element 'e', what should i+1 point to? There are many possible answers here. In the example above I've chosen one option, which is to end the iteration one element early (so we never point to the last element e).
If I was doing this I would do something similar to a combination of the other answers:
c = ['a', 'b', 'c', 'd', 'e']
d = ['replacement' if next == 'b' else current
for current, next in zip(c[:-1], c[1:]) ]
print d
# Output...
# ['replacement', 'b', 'c', 'd']
where I have used a list comprehension to avoid the loop, and zip on the list and a shifted list to avoid the explicit indices.
Try using the index of the current element to check the next element in the list.
Replace
if i+1=='b':
with
if c[c.index(i)+1]=='b':
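A caveat on this approach, as far as I can tell: c.index(i) returns the position of the first occurrence of i, so it can point at the wrong place when the list contains duplicates, and c.index(i)+1 runs past the end of the list for the last element. A minimal sketch that guards the boundary and tracks positions with enumerate instead (the function name replace_before is hypothetical):

```python
def replace_before(items, target, replacement='replacement'):
    """Replace every item whose successor equals target; keep the rest."""
    result = []
    for pos, item in enumerate(items):
        has_next = pos + 1 < len(items)  # the last item has no successor
        if has_next and items[pos + 1] == target:
            result.append(replacement)
        else:
            result.append(item)
    return result

print(replace_before(['a', 'b', 'c', 'd', 'e'], 'b'))
# ['replacement', 'b', 'c', 'd', 'e']
print(replace_before(['x', 'b', 'x', 'c'], 'b'))
# ['replacement', 'b', 'x', 'c']
```

In the second call the later 'x' is left alone, because its own successor is 'c' rather than 'b'.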
I want to compare two different lists and return the indexes at which both lists hold the same string.
For example, if I have two lists like:
grades = ['A', 'B', 'A', 'E', 'D']
scored = ['A', 'B', 'F', 'F', 'D']
My expected output is:
[0, 1, 4] #The indexes of similar strings in both lists
However this is the result I am getting at the moment:
[0, 1, 2, 4] #Problem: The 2nd index being counted again
I have tried two approaches.
First Approach:
def markGrades(grades, scored):
    indices = [i for i, item in enumerate(grades) if item in scored]
    return indices
Second Approach:
def markGrades(grades, scored):
    indices = []
    for i, item in enumerate(grades):
        if i in scored and i not in indices:
            indices.append(i)
    return indices
The second approach returns correct strings but not the indexes.
You can use enumerate along with zip in list comprehension to achieve this as:
>>> grades = ['A', 'B', 'A', 'E', 'D']
>>> scored = ['A', 'B', 'F', 'F', 'D']
>>> [i for i, (g, s) in enumerate(zip(grades, scored)) if g==s]
[0, 1, 4]
The issue with your code is that you are not comparing the elements at the same index. Instead, by using in, you are checking whether an element of one list is present anywhere in the other list.
Because 'A' at index 2 of grades is present in the scored list, you get index 2 in your result.
Your logic fails in that it doesn't check whether the elements are in the same position, merely that the grades element appears somewhere in scored. If you check corresponding elements instead, the fix is simple.
Using your second approach:
for i, item in enumerate(grades):
    if item == scored[i]:
        indices.append(i)
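For reference, a self-contained version of that loop might look like the sketch below. The name mark_grades is an assumption (a snake_case spelling of the question's markGrades), and the len() check is an added safeguard in case scored is shorter than grades.

```python
def mark_grades(grades, scored):
    """Return the indexes at which both lists hold the same value."""
    indices = []
    for i, item in enumerate(grades):
        # Compare only same-position elements; skip positions scored lacks.
        if i < len(scored) and item == scored[i]:
            indices.append(i)
    return indices

grades = ['A', 'B', 'A', 'E', 'D']
scored = ['A', 'B', 'F', 'F', 'D']
print(mark_grades(grades, scored))  # [0, 1, 4]
```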
The solution that Anonymous gives is what I was about to add as the "Pythonic" way to solve the problem.
You can access the two lists in pairs (to avoid the over-generalization of finding a match anywhere in the other array) with zip
grades = ['A', 'B', 'A', 'E', 'D']
scored = ['A', 'B', 'F', 'F', 'D']
matches = []
for ix, (gr, sc) in enumerate(zip(grades,scored)):
    if gr == sc:
        matches.append(ix)
or more compactly with list comprehension, if that suits your purpose
matches = [ix for ix, (gr, sc) in enumerate(zip(grades,scored)) if gr == sc]
I'm trying to create a list of lists from a single list. I can do this when every sublist has the same number of elements, but that will not always be the case.
As noted, the function below works when every sublist has the same number of elements.
I've tried using regular expressions to determine whether an element matches a pattern, using
pattern2=re.compile(r'\d\d\d\d\d\d'), because the first value of each new sublist will always be 6 digits and will be the only one in that format. However, I'm not sure how to make it stop at the next match and start another list.
def chunks(l,n):
    for i in range(0,len(l),n):
        yield l[i:i+n]
The code above works if the list of lists will contain the same number of elements
Below is what I expect.
OldList=[111111,a,b,c,d,222222,a,b,c,333333,a,d,e,f]
DesiredList=[[111111,a,b,c,d],[222222,a,b,c],[333333,a,d,e,f]]
Many thanks indeed.
Cheers
There is likely a more efficient way to do this (with fewer loops), but here is one approach: find the indexes of the breakpoints, then slice the list from index to index, appending None to the end of the index list to capture the remaining items. If your 6-digit numbers are really strings, you can drop the str() inside re.match().
import re
d = [111111,'a','b','c','d',222222,'a','b','c',333333,'a','d','e','f']
indexes = [i for i, x in enumerate(d) if re.match(r'\d{6}', str(x))]
groups = [d[s:e] for s, e in zip(indexes, indexes[1:] + [None])]
print(groups)
# [[111111, 'a', 'b', 'c', 'd'], [222222, 'a', 'b', 'c'], [333333, 'a', 'd', 'e', 'f']]
You can use a fold.
First, define a function to locate the start flag:
>>> def is_start_flag(v):
...     return len(v) == 6 and v.isdigit()
That will be useful if the flags are not exactly what you expected them to be, or to exclude some false positives, or even if you need a regex.
Then use functools.reduce:
>>> L = d = ['111111', 'a', 'b', 'c', 'd', '222222', 'a', 'b', 'c', '333333', 'a', 'd', 'e', 'f']
>>> import functools
>>> functools.reduce(lambda acc, x: acc+[[x]] if is_start_flag(x) else acc[:-1]+[acc[-1]+[x]], L, [])
[['111111', 'a', 'b', 'c', 'd'], ['222222', 'a', 'b', 'c'], ['333333', 'a', 'd', 'e', 'f']]
If the next element x is the start flag, then append a new list [x] to the accumulator. Otherwise, add the element to the current list, i.e. the last list of the accumulator.
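The fold builds a new accumulator list on every step. If that copying matters for long inputs, a plain loop that appends in place is one possible alternative. This is a sketch under the same assumption as the fold, namely that the items are strings; split_on_flags is a hypothetical name, and is_start_flag is repeated here so the block is self-contained.

```python
def is_start_flag(v):
    return len(v) == 6 and v.isdigit()

def split_on_flags(items):
    """Group items into sublists, starting a new sublist at each start flag."""
    groups = []
    for x in items:
        # "not groups" also opens a group if the input does not begin with a flag.
        if is_start_flag(x) or not groups:
            groups.append([x])      # open a new group
        else:
            groups[-1].append(x)    # extend the current group
    return groups

L = ['111111', 'a', 'b', 'c', 'd', '222222', 'a', 'b', 'c', '333333', 'a', 'd', 'e', 'f']
print(split_on_flags(L))
# [['111111', 'a', 'b', 'c', 'd'], ['222222', 'a', 'b', 'c'], ['333333', 'a', 'd', 'e', 'f']]
```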
I am searching through a list like this:
my_list = [['a','b'],['b','c'],['a','x'],['f','r']]
and I want to see which elements come with 'a'. So first I have to find lists in which 'a' occurs. Then get access to the other element of the list. I do this by abs(pair.index('a')-1)
for pair in my_list:
    if 'a' in pair:
        print( pair[abs(pair.index('a')-1)] )
Is there any better pythonic way to do that?
Something like: pair.index(not 'a') maybe?
UPDATE:
Maybe it is good to point out that 'a' is not necessarily the first element.
In my case, ['a','a'] doesn't happen, but in general it may be good to choose a solution that handles this situation too.
Are you looking for elements that accompany a? If so, a simple list comprehension will do:
In [110]: [x for x in my_list if 'a' in x]
Out[110]: [['a', 'b'], ['a', 'x']]
If you just want the elements and not the pairs, how about getting rid of a before printing:
In [112]: [(set(x) - {'a'}).pop() for x in my_list if 'a' in x]
Out[112]: ['b', 'x']
I use a set because a could either be the first or second element in the pair.
If I understand your question correctly, the following should work:
my_list = filter(
    lambda e: 'a' not in e,
    my_list
)
Note that in Python 3, this returns a filter object. You may want to wrap the expression in a list() call to get a list instead.
That technique works ok here, but it may be more efficient, and slightly more readable, to do it using sets. Here's one way to do that.
def paired_with(seq, ch):
    chset = set(ch)
    return [(set(pair) - chset).pop() for pair in seq if ch in pair]
my_list = [['a','b'], ['b','c'], ['x','a'], ['f','r']]
print(paired_with(my_list, 'a'))
output
['b', 'x']
If you want to do lots of tests on the same list, it would be more efficient to build a list of sets.
def paired_with(seq, ch):
    chset = set(ch)
    return [(pair - chset).pop() for pair in seq if ch in pair]
my_list = [['a','b'], ['b','c'], ['x','a'], ['f','r']]
my_sets = [set(u) for u in my_list]
print(my_sets)
print(paired_with(my_sets, 'a'))
output
[{'b', 'a'}, {'c', 'b'}, {'x', 'a'}, {'r', 'f'}]
['b', 'x']
This will fail if there's a pair like ['a', 'a'], but we can easily fix that:
def paired_with(seq, ch):
    chset = set(ch)
    return [(pair - chset or chset).pop() for pair in seq if ch in pair]
my_list = [['a','b'], ['b','c'], ['x','a'], ['f','r'], ['a', 'a']]
my_sets = [set(u) for u in my_list]
print(paired_with(my_sets, 'a'))
output
['b', 'x', 'a']
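If the pairs always have exactly two elements, a set-free sketch is also possible: take the second element when the first is the one searched for, and the first otherwise. This assumes two-element pairs; partner_of is a hypothetical name.

```python
def partner_of(pairs, ch):
    """For each two-element pair containing ch, return the other element."""
    return [second if first == ch else first
            for first, second in pairs
            if ch in (first, second)]

my_list = [['a', 'b'], ['b', 'c'], ['x', 'a'], ['f', 'r'], ['a', 'a']]
print(partner_of(my_list, 'a'))  # ['b', 'x', 'a']
```

This variant does not need the elements to be hashable and handles ['a', 'a'] without a special case.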
I am looking to subtract one list from another while respecting repetitions:
>>> a = ['a', 'b', 'c','c', 'c', 'c', 'd', 'e', 'e']
>>> b = ['a', 'c', 'e', 'f','c']
>>> a - b
['b', 'c','c', 'd', 'e']
Order of elements does not matter.
There is a question with answers here but it ignores the repetitions. Solutions there would give:
>>> a - b
['b', 'd']
One solution considers duplicates, but it alters one of the original lists:
[i for i in a if not i in b or b.remove(i)]
I wrote this solution:
a_sub_b = list(a)
b_sub_a = list(b)
for e in a:
    if e in b_sub_a:
        a_sub_b.remove(e)
        b_sub_a.remove(e)
print a_sub_b # a - b
print b_sub_a # b - a
That works for me, but is there a better solution, one that is simpler or more efficient?
If order doesn't matter, use collections.Counter:
from collections import Counter
c = list((Counter(a) - Counter(b)).elements())
Counter(a) - Counter(b) builds a Counter with the count of an element x equal to the number of times x appears in a minus the number of times x appears in b. elements() creates an iterator that yields each element a number of times equal to its count, and list turns that into a list. The whole thing takes O(len(a)+len(b)) time.
Note that depending on what you're doing, it might be best to not work in terms of lists and just keep a, b, and c represented as Counters.
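As a worked example with the lists from the question (a sketch; the result is sorted here only to make the printed order predictable, and diff is an illustrative name):

```python
from collections import Counter

a = ['a', 'b', 'c', 'c', 'c', 'c', 'd', 'e', 'e']
b = ['a', 'c', 'e', 'f', 'c']

diff = Counter(a) - Counter(b)   # counts that drop to zero or below are discarded
c = sorted(diff.elements())      # each element repeated according to its remaining count
print(c)  # ['b', 'c', 'c', 'd', 'e']
```

Here 'a' disappears because its count reaches zero, and 'f' never appears because it only occurs in b.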
This is going to search every element of b for each element of a. It's also going to do a linear remove on each list for each element that matches. So, your algorithm takes quadratic time—O(max(N, M)^2) where N is the length of a and M is the length of b.
If you just copy b into a set instead of a list, that solves the problem. Now you're just doing a constant-time set lookup for each element in a, and a constant-time set remove instead of a list remove. But you've still got the problem with the linear-time and incorrect removing from the a copy. And you can't just copy a into a set, because that loses duplicates.
On top of that, a_sub_b.remove(e) removes an element matching e. That isn't necessarily the same element as the element you just looked up. It's going to be an equal element, and if identity doesn't matter at all, that's fine… but if it does, then remove may do the wrong thing.
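To illustrate the point about equality versus identity: list.remove deletes the first element that compares equal to its argument. The values below are contrived purely for the demonstration.

```python
values = [1, 1.0, True]
values.remove(True)   # 1 == True in Python, so the first element is the one removed
print(values)         # [1.0, True]
```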
At any rate, performance is already a good enough reason not to use remove. Once you've solved the problems above, this is the only thing making your algorithm quadratic instead of linear.
The easiest way to solve this problem is to build up a new list, rather than copying the list and removing from it.
Solving both problems, you have O(2N+M) time, which is linear.
So, putting the two together:
b_set = set(b)
new_a = []
for element in a:
    if element in b_set:
        b_set.remove(element)
    else:
        new_a.append(element)
However, this still may have a problem. You haven't stated things very clearly, so it's hard to be sure, but can b contain duplicates, and, if so, does that mean the duplicated elements should be removed from a multiple times? If so, you need a multi-set, not a set. The easiest way to do that in Python is with a Counter:
from collections import Counter
b_counts = Counter(b)
new_a = []
for element in a:
    if b_counts[element]:
        b_counts[element] -= 1
    else:
        new_a.append(element)
On the other hand, if the order of neither a nor b matters, this just reduces to multiset difference, which makes it even easier:
new_a = list((Counter(a) - Counter(b)).elements())
But really, if the order of both is meaningless, you probably should have been using a Counter or other multiset representation in the first place, not a list…
The following uses standard library only:
a = ['a', 'b', 'b', 'c', 'c', 'c', 'c', 'd', 'd', 'd', 'e', 'e']
b = ['a', 'c', 'e', 'f','c']
a_set = set(a)
b_set = set(b)
only_in_a = list(a_set - b_set)
diff_list = list()
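for _o in only_in_a:
    # Repeat the element inside a list so multi-character strings are not split up
    tmp = [_o] * a.count(_o)
    diff_list.extend(tmp)
for _b in b_set:
    # A zero or negative difference yields an empty list
    tmp = [_b] * (a.count(_b) - b.count(_b))
    diff_list.extend(tmp)
print diff_list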
And gives:
['b', 'b', 'd', 'd', 'd', 'c', 'c', 'e']
as expected.
What I basically need is to check every element of a list and, if it meets some criterion, remove it from the list.
So for example let's say that
list=['a','b','c','d','e']
I basically want to write (in principle and not the actual code I try to implement)
If an element of the list is 'b' or 'c' remove it from the list and take the next.
But
for s in list:
    if s=='b' or s=='c':
        list.remove(s)
fails because when 'b' is removed the loop takes 'd' and not 'c' as the next element. So is there a way to do that faster than storing the elements in a separate list and removing them afterwards?
Thanks.
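The skipping behaviour described above can be reproduced with a short sketch. The list is named items here, rather than list, only to avoid shadowing the built-in; visited is an illustrative helper that records which elements the loop actually saw.

```python
items = ['a', 'b', 'c', 'd', 'e']
visited = []
for s in items:
    visited.append(s)
    if s == 'b' or s == 'c':
        items.remove(s)   # shifts later elements left while the loop position moves on

print(visited)  # ['a', 'b', 'd', 'e']  ('c' was never visited)
print(items)    # ['a', 'c', 'd', 'e']
```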
The easier way is to use a copy of the list - it can be done with a slice that extends "from the beginning" to the "end" of the list, like this:
for s in list[:]:
    if s=='b' or s=='c':
        list.remove(s)
You have considered this already, and it is simple enough to use in your code unless the list is really big and sits in a critical part of the code (such as the main loop of an action game). In that case, I sometimes use the following idiom:
to_remove = []
for index, s in enumerate(list):
    if s == "b" or s == "c":
        to_remove.append(index)
for index in reversed(to_remove):
    del list[index]
Of course you can resort to a while loop instead:
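index = 0
while index < len(list):
    s = list[index]  # look up the current element on every pass
    if s == "b" or s == "c":
        del list[index]
        continue  # do not advance: the next element has shifted into this slot
    index += 1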
It's better not to reinvent things that are already available. Use filter and a lambda in these cases. It's more Pythonic and looks cleaner.
filter(lambda x:x not in ['b','c'],['a','b','c','d','e'])
Alternatively, you can use a list comprehension:
[x for x in ['a','b','c','d','e'] if x not in ['b','c']]
This is exactly what itertools.ifilter is designed for.
from itertools import ifilter
ifilter(lambda x: x not in ['b', 'c'], ['a', 'b', 'c', 'd', 'e'])
will give you back an iterator over the filtered items. If you actually need a list, you can create it using one of the standard techniques for converting an iterator to a list:
list(ifilter(lambda x: x not in ['b', 'c'], ['a', 'b', 'c', 'd', 'e']))
or
[x for x in ifilter(lambda x: x not in ['b', 'c'], ['a', 'b', 'c', 'd', 'e'])]
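Note that itertools.ifilter exists only in Python 2. On Python 3, the built-in filter already returns a lazy iterator, so a sketch of the equivalent would be (unwanted and lazy are illustrative names):

```python
unwanted = {'b', 'c'}
lazy = filter(lambda x: x not in unwanted, ['a', 'b', 'c', 'd', 'e'])
kept = list(lazy)   # consume the iterator to build a real list
print(kept)  # ['a', 'd', 'e']
```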
If you are ok with creating a copy of the list you can do it like this (list comprehension):
[s for s in list if s != 'b' and s != 'c']
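If other code holds a reference to the same list object, one option (a sketch, not the only way) is to assign the comprehension back through a slice, which replaces the contents in place. The name items stands in for the question's list, and alias is illustrative.

```python
items = ['a', 'b', 'c', 'd', 'e']
alias = items                       # a second reference to the same list
items[:] = [s for s in items if s != 'b' and s != 'c']
print(items)           # ['a', 'd', 'e']
print(alias is items)  # True: the list object itself was kept
```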