remove elements from list of strings while traversing [duplicate]


This question already has answers here:
Modifying list while iterating [duplicate]
(7 answers)
How to remove items from a list while iterating?
(25 answers)
Closed 3 years ago.
How do I remove elements from a list of strings while traversing it?
I have a list:
list1 = ['', '$', '32,324', '$', '32', '$', '(35', ')', '$', '32,321']
I want to remove $ from the list, and if a ), )% or % comes up, append it to the previous element of the list.
The expected output is:
['', '32,324', '32', '(35)', '32,321']
What I have tried is:
for j,element in enumerate(list1):
    if element == '%' or element == ")%" or element ==')':
        list1[j-1] = list1[j-1] + element
        list1.pop(j)
    elif element == '$':
        list1.pop(j)
But the output I am getting is:
['', '32,324', '32', '(35)', '$', '32,321']
which is not the expected output. Please help.
This question is different from the suggested references: here I have to concatenate the current element with the previous one if the current element is ), )% or %.

What Green Cloak Guy said is mostly correct. Changing the size of the list (by calling .pop()) while enumerating it leaves j pointing at the wrong element. To me, the easiest way to fix this while keeping most of your existing code is to not mutate your list at all and build up a new one instead:
new_list = []
for element in list1:
    if element == '%' or element == ")%" or element == ')':
        new_list[-1] += element  # append to the previous kept element
    elif element != '$':
        new_list.append(element)
However, I would encourage you to think about your edge cases here. What happens when a ')' is followed by another ')' in your list? This may need a special case in your if statement. Hope this helps!
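The edge cases mentioned above can be made explicit. The following is only a sketch: `merge_tokens` and `MERGE` are hypothetical names, and it assumes that a closing token with nothing before it should simply be kept as its own element.

```python
MERGE = (")", ")%", "%")  # tokens glued onto the previous kept element


def merge_tokens(tokens):
    """Return a new list: drop '$', append closing tokens to the previous item."""
    out = []
    for tok in tokens:
        if tok == "$":
            continue  # dropped entirely
        if tok in MERGE and out:
            out[-1] += tok  # consecutive closers accumulate on the same item
        else:
            out.append(tok)  # also covers a closer with nothing before it
    return out


list1 = ['', '$', '32,324', '$', '32', '$', '(35', ')', '$', '32,321']
print(merge_tokens(list1))  # ['', '32,324', '32', '(35)', '32,321']
```

Because the function only reads its argument, list1 itself is left unchanged.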

Instead of attempting to remove and merge elements dynamically while iterating over the list, it is much easier to build a new list based on the conditions here.
list1 = ['', '$', '32,324', '$', '32', '$', '(35', ')', '$', '32,321']
out = []
for element in list1:
    if element == "$":
        continue  # skip if $ present
    elif element in ("%", ")", ")%"):
        out[-1] = out[-1] + element  # merge with last element of out so far
    else:
        out.append(element)
print(out)
#Output:
['', '32,324', '32', '(35)', '32,321']

I think this list comprehension works (haven't seen an example of how % is handled):
[ (a+b if b in (')',')%','%') else a) for a,b in zip(list1,list1[1:]+['']) if a not in ('$',')',')%','%')]
The idea is to:
make a list of pairings of elements and their successors
filter out elements that should be removed
add the successor as appropriate to those that we keep
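The three steps above can be spelled out one at a time. This is a sketch of the same idea, not a different method; the intermediate names `closers`, `pairs` and `kept` are introduced here only for illustration.

```python
list1 = ['', '$', '32,324', '$', '32', '$', '(35', ')', '$', '32,321']
closers = (')', ')%', '%')

# Step 1: pair each element with its successor ('' pads the last one).
pairs = list(zip(list1, list1[1:] + ['']))

# Step 2: keep only pairs whose first item is a real value.
kept = [(a, b) for a, b in pairs if a != '$' and a not in closers]

# Step 3: glue the successor on when it is a closing token.
result = [a + b if b in closers else a for a, b in kept]
print(result)  # ['', '32,324', '32', '(35)', '32,321']
```

One limitation of the pairing approach: a closer is attached only when it immediately follows a kept element, so a second consecutive closer would be dropped rather than merged.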

Related

Want to remove elements based on first character - Python

This is a program that lists all the substrings except the ones that start with a vowel.
However, I don't understand why the startswith() function doesn't work as I expected. It is not removing all the substrings that start with the letter 'A'.
Here is my code:
ban = 'BANANA'
cur_pos=0
sub = []
#Finding the substrings
for i in range(len(ban)):
    limit=1
    for j in range(len(ban)):
        a = ban[cur_pos:limit]
        sub.append(a)
        limit+=1
    cur_pos+=1
#removing the substrings that starts with vowels
for i in sub:
    if (i.startswith(('A','E','I','O','U'))):
        sub.remove(i)
print(sub)

Why this doesn't work...
To answer your question, the mantra for this issue is delete array elements in reverse order, which I occasionally forget and wonder whatever has gone wrong.
Explanation
The problem isn't with startswith() but using remove() inside this specific type of for loop, which uses an iterator rather than a range.
for i in sub:
This fails in this code for the following reason.
ban = 'BANANA'
cur_pos=0
sub = []
#Finding the substrings
for i in range(len(ban)):
    limit=1
    for j in range(len(ban)):
        a = ban[cur_pos:limit]
        sub.append(a)
        limit+=1
    cur_pos+=1
print(sub)
#removing the substrings that start with vowels
for i in sub:
    if (i.startswith(('A','E','I','O','U'))):
        sub.remove(i)
        print(sub)
print(sub)
I've added some print statements to assist debugging.
Initially the array is:
['B', 'BA', 'BAN', 'BANA', 'BANAN', 'BANANA', '', 'A', 'AN', 'ANA', 'ANAN', 'ANANA', '', '', 'N', 'NA', 'NAN', 'NANA', '', '', '', 'A', 'AN', 'ANA', '', '', '', '', 'N', 'NA', '', '', '', '', '', 'A']
...then we eventually get to remove the first 'A', which seems to be removed fine...
['B', 'BA', 'BAN', 'BANA', 'BANAN', 'BANANA', '', 'AN', 'ANA', 'ANAN', ...etc...
...but there is some nastiness happening behind the scenes that shows up when we reach the next vowel...
['B', 'BA', 'BAN', 'BANA', 'BANAN', 'BANANA', '', 'AN', 'ANAN',
Notice that 'ANA' was removed, not the expected 'AN'!
Why?
Because the remove() modified the array and shifted all the elements along by one position, but the for loop index behind the scenes does not know about this. The index is still pointing to the next element which it expects is 'AN' but because we moved all the elements by one position it is actually pointing to the 'ANA' element.
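The skipping described above can be reproduced on a much smaller list. This is a minimal sketch with made-up sample data; `visited` is a hypothetical helper list that records which elements the loop actually saw.

```python
items = ['A', 'AN', 'ANA', 'B']
visited = []

for s in items:
    visited.append(s)
    if s.startswith('A'):
        items.remove(s)  # shifts every later element one slot to the left

print(visited)  # ['A', 'ANA']  ('AN' and 'B' were never visited)
print(items)    # ['AN', 'B']   ('AN' survived although it starts with 'A')
```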
Fixing the problem
One way is to append the substrings that don't start with a vowel to a new empty array:
ban = 'BANANA'
cur_pos=0
sub = []
add = []
#Finding the substrings
for i in range(len(ban)):
    limit=1
    for j in range(len(ban)):
        a = ban[cur_pos:limit]
        sub.append(a)
        limit+=1
    cur_pos+=1
#adding the substrings that don't start with vowels
for i in sub:
    if (not i.startswith(('A','E','I','O','U'))):
        add.append(i)
print(add)
Another way
There is, however, a simple way to modify the original array, as you wanted, and that's to iterate through the array in reverse order using an index-based for loop.
The important part here is that you only ever remove elements at or after the current index, that is, in the part of the array you have already processed. The elements still to be visited sit at lower indexes and do not move, so the array index never points to the wrong element. This is common and acceptable practice, so long as you understand and make clear what you're doing.
ban = 'BANANA'
cur_pos=0
sub = []
#Finding the substrings
for i in range(len(ban)):
    limit=1
    for j in range(len(ban)):
        a = ban[cur_pos:limit]
        sub.append(a)
        limit+=1
    cur_pos+=1
#removing the substrings that start with vowels, in reverse index order
start = len(sub)-1 # index of the last element (zero-based array indexing)
stopAt = -1 # first element index, less one (range() excludes the stop value)
step = -1 # step backwards
for index in range(start,stopAt,step): # count backwards from last element to the first
    i = sub[index]
    if (i.startswith(('A','E','I','O','U'))):
        print('#'+str(index)+' = '+i)
        del sub[index]
print(sub)
For more details, see the official page on for
https://docs.python.org/3/reference/compound_stmts.html#index-6
Aside: This is my favourite array problem.
Edit: I just got bitten by this in Javascript, while removing DOM nodes.

It is not good practice to iterate over a list and remove items from it during the loop. I suggest you change it to this:
sub2=list()
#keeping the substrings that do not start with vowels
for i in sub:
    if not (i.startswith(('A','E','I','O','U'))):
        sub2.append(i)
print(sub2)
So if the substring does not start with a vowel, it is added to another list, sub2.

As mentioned in the comments, in Python you shouldn't remove items from a list while iterating over its elements, since you mutate the original list before the loop ends. If you want to do that, you'll either have to use another list and then assign it to your old one, or do it directly using a list comprehension like so:
sub = [i for i in sub if not i.startswith(('A','E','I','O','U'))]
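If other names refer to the same list and should also see the filtered result, slice assignment replaces the contents in place instead of rebinding the name. A small sketch with made-up sample data; `alias` is a hypothetical second reference added only to show the difference.

```python
sub = ['B', 'BA', '', 'A', 'AN', 'N', 'NA']
alias = sub  # a second name for the same list object

vowels = ('A', 'E', 'I', 'O', 'U')
# sub[:] = ... overwrites the contents of the existing list object.
sub[:] = [s for s in sub if not s.startswith(vowels)]

print(sub)           # ['B', 'BA', '', 'N', 'NA']
print(alias is sub)  # True: both names still refer to the same, now filtered, list
```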

Removing specific set of characters in a list of strings

I have a list of strings and want to use a second list of strings to remove every occurrence of those unwanted sequences from the first. The output for the example below would be foo, bar, foobar, foofoo. Here is one of the things I have tried:
mylist = ['foo!', 'bar\\n', 'foobar!!??!!', 'foofoo::!*']
remove_list = ['\\n', '!', '*', '?', ':']
for remove in remove_list:
    for strings in mylist:
        strings = strings.replace(bad, ' ')
The above code doesn't work. At one point I assigned the result to a new variable and appended that afterwards, but that wasn't working well because if there were two issues in a string it would be appended twice.

You changed the temporary variable, not the original list. Instead, assign the result back into mylist:
for bad in remove_list:
    for pos, string in enumerate(mylist):
        mylist[pos] = string.replace(bad, ' ')
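The difference between rebinding the loop variable and writing back through the index can be seen side by side. A minimal sketch with made-up sample data; `words` and `unchanged` are names chosen for this illustration.

```python
words = ['foo!', 'bar!']

# Rebinding the loop variable creates a new string; the list is untouched.
for w in words:
    w = w.replace('!', '')
unchanged = list(words)
print(unchanged)  # ['foo!', 'bar!']

# Writing back through the index stores the new string in the list.
for pos, w in enumerate(words):
    words[pos] = w.replace('!', '')
print(words)  # ['foo', 'bar']
```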

Try this:
mylist = ['foo!', 'bar\\n', 'foobar!!??!!', 'foofoo::!*']
bads = ['\\n', '!', '*', '?', ':']
result = []
for s in mylist:
    # s is a temporary name for the current string
    for bad in bads:
        s = s.replace(bad, '')  # remove every occurrence of this bad sequence
    result.append(s)
print(result)
This could be implemented more concisely, but this way it's easier to understand.

I had a hard time interpreting the question, but I see you have the desired result at the top of your question.
mylist = ['foo!', 'bar\\n', 'foobar!!??!!', 'foofoo::!*']
remove_list = ['\\n', '!', '*', '?', ':']
output = []
for strings in mylist:
    for remove in remove_list:
        strings = strings.replace(remove, '')
    output.append(strings)

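import re
# Assumption: the definition of `regex` is missing from this answer; the pattern
# below (anything that is not an ASCII letter) is a guess based on the text.
regex = re.compile(r'[^a-zA-Z]')
for list1 in mylist:
    t = regex.sub('', list1)
    print(t)
If you just want to get rid of non-letter characters, do this. It works a lot better than comparing two separate lists.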

Why not have regex do the work for you? No nested loops this way (just make sure to escape correctly):
import re
mylist = ['foo!', 'bar\\n', 'foobar!!??!!', 'foofoo::!*']
remove_list = [r'\\n', r'\!', r'\*', r'\?', ':']  # raw strings avoid invalid-escape warnings
removals = re.compile('|'.join(remove_list))
print([removals.sub('', s) for s in mylist])
['foo', 'bar', 'foobar', 'foofoo']
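Escaping by hand is easy to get wrong, so the pattern can also be built from the plain strings with `re.escape` from the standard library. A sketch under the assumption that every entry in the removal list is meant literally; `pattern` and `cleaned` are names chosen here.

```python
import re

mylist = ['foo!', 'bar\\n', 'foobar!!??!!', 'foofoo::!*']
remove_list = ['\\n', '!', '*', '?', ':']

# re.escape turns each literal string into a pattern that matches it exactly.
pattern = re.compile('|'.join(re.escape(item) for item in remove_list))
cleaned = [pattern.sub('', s) for s in mylist]
print(cleaned)  # ['foo', 'bar', 'foobar', 'foofoo']
```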

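Another solution is a list comprehension that removes the unwanted sequences. The replacements have to be chained for each word: a flat "for word in mylist for bad in remove_list" produces one partially cleaned copy per (word, bad) pair, and deleting duplicates afterwards does not repair that.
from functools import reduce
list_good = [reduce(lambda w, bad: w.replace(bad, ''), remove_list, word) for word in mylist]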

my_list = ["foo!", "bar\\n", "foobar!!??!!", "foofoo::*!"]
to_remove = ["!", "\\n", "?", ":", "*"]
for index, item in enumerate(my_list):
    for char in to_remove:
        if char in item:
            item = item.replace(char, "")
    my_list[index] = item
print(my_list) # outputs ['foo', 'bar', 'foobar', 'foofoo']

Del list and next list element in list if string exist

I have an example:
list = [['2 a', 'nnn', 'xxxx','last'], ['next, next'], ['3', '4', 'next']]
for i in range(len(list)):
    if list[i][-1] == "last":
        del(list[i+1])
        del(list[i])
I'd like to delete the list whose last item is "last", together with the next item in the outer list.
This example fails every time. I tried different configurations, including replacing the list with a numpy array, but nothing helps.
Traceback:
IndexError: list index out of range
I want the final result of this list to be ['3', '4', 'next']
Please give me some tips on how I can solve it.

Try this:
l = [['2 a', 'nnn', 'xxxx','last'], ['next, next'], ['3', '4', 'next']]
delete_next = False
to_ret = []
for x in l:
    if x[-1] == 'last':
        delete_next = True
    elif delete_next:
        delete_next = False
    else:
        to_ret.append(x)
This uses a flag variable to remember whether the next list needs to be skipped.
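The same "skip this one and the one after it" logic can also be written with an explicit iterator instead of a flag. This is an alternative sketch, not the answer's own code; `drop_last_and_next` is a hypothetical name.

```python
def drop_last_and_next(groups):
    """Return a new list without any group ending in 'last' or the group after it."""
    kept = []
    it = iter(groups)
    for group in it:
        if group and group[-1] == 'last':
            next(it, None)  # consume the following group, if there is one
            continue
        kept.append(group)
    return kept


data = [['2 a', 'nnn', 'xxxx', 'last'], ['next, next'], ['3', '4', 'next']]
print(drop_last_and_next(data))  # [['3', '4', 'next']]
```

The two versions differ when the skipped group itself ends in 'last': the flag version then also skips the group after that, while this sketch consumes exactly one following group.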

Loop over the list; if the last element of the current sub-list is 'last', skip it, otherwise append it to a new list. Note that this drops only the sub-list ending in 'last'; the sub-list that follows it is kept, so it needs a separate check.
Also, it is not recommended to edit lists while iterating over them, because the indexes change underneath the loop and strange things can happen, as mentioned in the comments above.
l = [['2 a', 'nnn', 'xxxx','last'], ['next, next'], ['3', '4', 'next']]
newlist = []
for i in l:
    if i[-1] == 'last':
        continue
    else:
        newlist.append(i)

Comparing strings in a list and appending those that have the same first and last character to a new list

I'm in an Intro to Python class and was given this assignment:
Given a list of strings, return a new list containing all the strings from the original list that begin and end with the same character. Matching is not case-sensitive, meaning 'a' should match with 'A'. Do not alter the original list in any way.
I was running into problems with slicing and comparing the strings because the possible lists given include '' (empty string). I'm pretty stumped and any help would be appreciated.
def first_last(strings):
    match=[]
    x=''
    count=0
    while count<len(strings):
        if x[0] == x[-1]:
            match.append(x)
        x+=x
        count+=1
So, when given:
['aba', 'dcn', 'z', 'zz', '']
or
['121', 'NbA', '898', '']
I get this:
string index out of range
When I should be seeing:
['aba', 'z', 'zz']
and
['121', '898']

Your list contains an empty string (''). Thus, you will have to check for the length of each element that you are currently iterating over. Also, it does not seem that you use x:
def first_last(strings):
    match=[]
    count=0
    while count<len(strings):
        if strings[count]:
            if strings[count][0].lower() == strings[count][-1].lower():
                match.append(strings[count])
        count += 1
    return match
Note, however, that you can also use list comprehension:
s = ['aba', 'dcn', 'z', 'zz', '']
final_strings = [i for i in s if i and i[0].lower() == i[-1].lower()]

def first_last(strings):
    match=[]
    for x in strings:
        if x == '':
            continue  # skip empty strings, which have no first or last character
        if x.lower()[0] == x.lower()[-1]:
            match.append(x)
    return match

Test that the list element is not empty first:
def first_last(strings):
    match = []
    for element in strings:
        if element and element[0].lower() == element[-1].lower():
            match.append(element)
    return match
or with a list comprehension:
match = [element for element in strings if element and element[0].lower() == element[-1].lower()]
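The emptiness test works because `and` short-circuits: for an empty string the left side is already false, so the indexing on the right is never evaluated. A small sketch; the `samples` list is made up for illustration and mixes in the cases from the question.

```python
def first_last(strings):
    # `s and ...` short-circuits: for '' the indexing on the right never runs.
    return [s for s in strings if s and s[0].lower() == s[-1].lower()]


samples = ['aba', 'dcn', 'z', 'zz', '', 'NbA', 'Abba']
print(first_last(samples))  # ['aba', 'z', 'zz', 'Abba']
```

The comprehension builds a new list, so the original list is not altered.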

How to force process identical elements individually in a list of lists?

I am writing a program that tags parts of speech, producing a list of lists. Here is an example function from the program:
phrase = [['he',''],['is', ''],['believed', ''],['to',''],['have',''],['believed','']]
def parts_tagger(input_list):
    parts = []
    for [x,y] in input_list:
        prior_word = input_list[input_list.index([x,y]) - 1][0]
        if x.startswith('be') and y == '' and prior_word == 'is':
            parts.append([x,'passive'])
        else:
            parts.append([x,y])
    return parts
print(parts_tagger(phrase))
When you run this piece of code, Python finds the first word to which the condition applies (the first "believed") and tags it correctly:
[['he', ''], ['is', ''], ['believed', 'passive'], ['to', ''], ['have', ''], ['believed', 'passive']]
But then it somehow applies the same tag to other identical words (the second "believed") in the list, to which the condition does not apply. What am I doing wrong? How can I fix this and force Python to treat each item in the list individually?

The problem is with this line
prior_word = input_list[input_list.index([x,y]) - 1][0]
list.index returns the index of the first match.
Return the index in the list of the first item whose value is x. It is an error if there is no such item.
You can use enumerate to solve your problem. Change your loop and the next line to these.
for ind,[x,y] in enumerate(input_list):
    prior_word = input_list[ind - 1][0]
The output will be as expected
[['he', ''], ['is', ''], ['believed', 'passive'], ['to', ''], ['have', ''], ['believed', '']]
As Shawn pointed out below (in a now-deleted comment), the first element needs special handling because it has no previous value; with ind - 1 equal to -1, Python would silently read the last element instead. There are two workarounds for this.
Start with the second element and fill in the value for the first element manually:
for ind,[x,y] in enumerate(input_list[1:],start=1):
Or add a condition to the loop body:
for ind,[x,y] in enumerate(input_list):
    prior_index = ind - 1
    if prior_index<0:
        # Do something for the first element
        continue
    prior_word = input_list[prior_index][0]
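Putting the pieces together, the whole function might look like the sketch below. It assumes the first word should be treated as having no prior word at all, which is represented here by None; that choice is an assumption, not something stated in the question.

```python
def parts_tagger(input_list):
    parts = []
    for ind, (word, tag) in enumerate(input_list):
        # ind - 1 would be -1 for the first item and silently wrap to the last
        # element, so the first word gets an explicit "no prior word" value.
        prior_word = input_list[ind - 1][0] if ind > 0 else None
        if word.startswith('be') and tag == '' and prior_word == 'is':
            parts.append([word, 'passive'])
        else:
            parts.append([word, tag])
    return parts


phrase = [['he', ''], ['is', ''], ['believed', ''], ['to', ''], ['have', ''], ['believed', '']]
print(parts_tagger(phrase))
# [['he', ''], ['is', ''], ['believed', 'passive'], ['to', ''], ['have', ''], ['believed', '']]
```

Without the guard, a phrase such as [['believed', ''], ['is', '']] would tag the first word as passive, because index -1 refers to 'is' at the end of the list.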
