How to remove matching item from nested list? - python

I have a list of lists, and I would like to remove any item matching a wildcard pattern from each inner list if present; otherwise the inner list should be returned as it is.
Example
nested_list = [["abc","fds","gfssdf"],["dfsdf","cds","dvc"],["dsaf","abcvs","ewq"],...]
What I tried to do is:
for x in nested_list:
    for y in x:
        if re.search('abc.+', y) in x:
            nested_list.remove(x)
However, it returns the same list, without any changes.
My desired output would be:
nested_list = [["fds","gfssdf"],["dfsdf","cds","dvc"],["dsaf","ewq"],...]
Is there a solution?

Here is one way to do this with a nested 2D list comprehension:
import re

nested_list = [["abc","fds","gfssdf"],["dfsdf","cds","dvc"],["dsaf","abcvs","ewq"]]
output = [[y for y in x if not re.search(r'^abc', y)] for x in nested_list]
print(output) # [['fds', 'gfssdf'], ['dfsdf', 'cds', 'dvc'], ['dsaf', 'ewq']]

You could also do this using startswith instead of re:
>>> [[y for y in x if not y.startswith("abc")] for x in nested_list]
[['fds', 'gfssdf'], ['dfsdf', 'cds', 'dvc'], ['dsaf', 'ewq']]

The other answers provide nice solutions, but I wanted to address the OP's original code for learning purposes.
There are some mistakes in your code; I'll address them one by one:
if re.search('abc.+', y) in x:
re.search returns a match object, or None if nothing is found, so you can drop the in x and test the result directly.
The + in abc.+ requires one or more characters after abc, so a bare "abc" is never matched. Change the + to a ? so that the extra character becomes optional (zero or one).
The original code also removes the whole inner list x from nested_list, instead of removing the item y from x.
If you remove all the elements from an inner list, you'll end up with an empty list, so let's add a check for that and remove the empty list:
if not x:
    nested_list.remove(x)
Applying those fixes gives us:
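import re

nested_list = [["abc","fds","gfssdf"],["dfsdf","cds","dvc"],["dsaf","abcvs","ewq"], ["abc"]]
for x in nested_list[:]:      # iterate over a copy so that removing from nested_list skips nothing
    for y in x[:]:            # same for the inner list
        if re.search('abc.?', y):
            x.remove(y)
    if not x:                 # the inner list became empty
        nested_list.remove(x)
print(nested_list)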
Which gives the expected output:
[['fds', 'gfssdf'], ['dfsdf', 'cds', 'dvc'], ['dsaf', 'ewq']]
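As a complement to the loop-based fix above, here is a minimal sketch of an in-place variant that avoids calling remove during iteration altogether. It uses slice assignment, so every existing reference to the lists sees the change. The names pattern, inner and same_object are illustrative, and the sketch assumes that any item containing "abc" should be dropped.

```python
import re

# Sample data mirroring the lists used above.
nested_list = [["abc", "fds", "gfssdf"], ["dfsdf", "cds", "dvc"],
               ["dsaf", "abcvs", "ewq"], ["abc"]]
same_object = nested_list  # second reference, to show the update happens in place

pattern = re.compile(r"abc")  # assumption: any item containing "abc" is dropped

for inner in nested_list:
    # Slice assignment replaces the contents without creating a new list object.
    inner[:] = [item for item in inner if not pattern.search(item)]

# Drop inner lists that ended up empty, again in place.
nested_list[:] = [inner for inner in nested_list if inner]

print(nested_list)
```

Because nothing is removed while a list is being iterated, no element can be skipped, whichever items happen to match.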

Related

Deleting list from a list of lists

I need to delete the lists inside a list that contain the / symbol.
For example:
X = [['a/','$1'], ["c","d"]]
so X[0] should be deleted. The actual list is much longer and contains more instances of this condition.
I tried to use something like:
print([l for l in X if l.count("/") <1])
But if I understand correctly, because the / is attached to another character it is not counted.
Should I convert this list of lists to a string, separate the / from the other characters, and then use the count function, or is there a better solution?
One way to search "/" in each item in the sublists is to wrap a generator expression with any. Since you don't want sublists with "/" in it, the condition should be not any():
out = [lst for lst in X if not any('/' in x for x in lst)]
Output:
[['c', 'd']]
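To see why the count-based attempt in the question keeps every sublist, here is a small sketch. list.count compares whole elements, whereas the in operator on a string looks inside it. The variable names kept_by_count and kept_by_any are made up for illustration.

```python
X = [['a/', '$1'], ['c', 'd']]

# list.count compares whole elements, so 'a/' is not counted as '/'.
print(X[0].count('/'))    # 0
# str.count and the `in` operator look inside a single string.
print('a/'.count('/'))    # 1
print('/' in 'a/')        # True

# The original attempt therefore keeps every sublist ...
kept_by_count = [l for l in X if l.count('/') < 1]
# ... while testing each string individually drops the first one.
kept_by_any = [l for l in X if not any('/' in s for s in l)]

print(kept_by_count)  # [['a/', '$1'], ['c', 'd']]
print(kept_by_any)    # [['c', 'd']]
```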
Alternatively, filter() applies a lambda function to every list in X and filters out the lists containing '/':
result = list(filter(lambda l: not any('/' in s for s in l), X))
counter = 0
while counter < len(X):
    removed = False
    for i in X[counter]:
        if '/' in i:
            X.pop(counter)
            removed = True
            break
    if not removed:
        counter += 1
Given:
X = [['a/','$1'], ["c","d"]]
You can convert the sublists to their repr string representations and detect the / in that string:
new_x = [sl for sl in X if '/' not in repr(sl)]
Or, you can use next with a default, which stops at the first item containing a /:
new_x = [sl for sl in X if not next((True for s in sl if '/' in s), False)]
Either way:
>>> new_x
[['c', 'd']]

Remove list for list

I get a list inside a list in this example:
y.append([x for x in range(0,6)])
The result:
[[0,1,2,3,4,5]]
How can I remove one []?
y.extend([x for x in range(0,6)])
or, if you want to assign it to another variable:
a = [x for x in range(0,6)]
Just do this:
y.append([x for x in range(0,6)])
new_list = y[0]
new_list will then be the inner list, without the extra set of brackets.
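The difference between append and extend that the answers above rely on can be seen in a minimal sketch. The names nested and flat are made up for illustration.

```python
nested = []
nested.append([x for x in range(0, 6)])   # adds the whole list as ONE element

flat = []
flat.extend(range(0, 6))                  # adds each number as its own element

print(nested)     # [[0, 1, 2, 3, 4, 5]]
print(flat)       # [0, 1, 2, 3, 4, 5]
print(nested[0])  # [0, 1, 2, 3, 4, 5]
```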

python efficient way to compare nested lists and append matches to new list

I wish to compare two nested lists. If there is a match between the first element of each sublist, I wish to add the matched element to a new list for further operations. Below is an example and what I've tried so far:
Example:
x = [['item1','somethingelse1'], ['item2', 'somethingelse2']...]
y = [['item1','somethingelse3'], ['item3','somethingelse4']...]
What I've tried so far:
match = []
for itemx in x:
    for itemy in y:
        if itemx[0] == itemy[0]:
            match.append(itemx)
The code above does append the matched items to the new list, but my two nested lists are very long, and this approach is very slow on them. Is there a more efficient way to get the matched items between two nested lists?
Yes, use a data structure with constant-time membership testing. So, using a set, for example:
seen = set()
for first,_ in x:
    seen.add(first)
matched = []
for first,_ in y:
    if first in seen:
        matched.append(first)
Or, more succinctly using set/list comprehensions:
seen = {first for first,_ in x}
matched = [first for first,_ in y if first in seen]
(This was before the OP changed the question from append(itemx[0]) to append(itemx)...)
>>> {a[0] for a in x} & {b[0] for b in y}
{'item1'}
Or if the inner lists are always pairs:
>>> dict(x).keys() & dict(y)
{'item1'}
IIUC using numpy:
import numpy as np
y=[l[0] for l in y]
x=np.array(x)
x[np.isin(x[:, 0], y)]
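For the revised question, which appends the whole sublist itemx rather than just its first element, the set-based idea might be adapted as in the sketch below. It uses plain Python and hypothetical sample data in the shape shown in the question; keys_in_y is an illustrative name.

```python
# Hypothetical sample data in the shape used in the question.
x = [['item1', 'somethingelse1'], ['item2', 'somethingelse2']]
y = [['item1', 'somethingelse3'], ['item3', 'somethingelse4']]

# Collect the first elements of y once; set lookups take constant time on average.
keys_in_y = {sub[0] for sub in y}

# Keep every sublist of x whose first element also starts a sublist of y.
match = [sub for sub in x if sub[0] in keys_in_y]

print(match)  # [['item1', 'somethingelse1']]
```

Unlike the nested loop, this appends each sublist of x at most once, even if several sublists of y share the same first element.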

Create list from 2D list based on conditions

I have a 2D Array in Python that is set up like
MY_ARRAY = [
['URL1', "ABC"],
['URL2'],
['URL3'],
['URL4', "ABC"]
]
I want to make an array of first element of each array only if the 2nd parameter is "ABC". So the result of the above example would be ['URL1', 'URL4']
I tried to do [x[0] for x in MY_ARRAY if x[1] == 'ABC'] but this raises IndexError: list index out of range. I think this is because sometimes x[1] is nonexistent.
I am looking for this to be a simple one-liner.
You could simply add a length check to the filtering criteria first. This works because Python short-circuits boolean expressions.
[ele[0] for ele in MY_ARRAY if len(ele) > 1 and ele[1] == 'ABC']
Also note that the proper terminology here is a list of lists, not an array.
I think you should try doing this:
if len(x) > 1:
    if x[1] == 'ABC':
        pass  # do something here
For the one-liner, you can combine both checks into a single condition:
if len(x) > 1 and x[1] == "ABC"
Cheers!
Try this
[x[0] for x in MY_ARRAY if len(x) > 1 and x[1] == 'ABC']
This happens because two of your lists have only one item, and you are trying to access the second item of those lists.
First, you are on the right track: when x[1] doesn't exist, you get the error.
But, it's a bad habit to insist on doing things as a one-liner if it complicates matters.
Having said that, here's a one-liner that does that:
NEW_ARRAY = [x[0] for x in MY_ARRAY if len(x)>1 and x[1]=='ABC']
I found a simple solution that works for my usage.
[x[0] for x in MY_ARRAY if x[-1] == 'ABC']
x[-1] exists for every non-empty list, since that's the last element in the array/list.
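The two approaches agree on the data from the question but can differ on edge cases. The sketch below compares them. The tricky list and the empty sublist are hypothetical inputs that do not appear in the question.

```python
MY_ARRAY = [['URL1', 'ABC'], ['URL2'], ['URL3'], ['URL4', 'ABC']]

by_length = [x[0] for x in MY_ARRAY if len(x) > 1 and x[1] == 'ABC']
by_last = [x[0] for x in MY_ARRAY if x[-1] == 'ABC']
print(by_length)  # ['URL1', 'URL4']
print(by_last)    # ['URL1', 'URL4']

# Hypothetical edge case: a one-element sublist whose only item is 'ABC'.
tricky = [['ABC'], ['URL5', 'ABC']]
tricky_by_length = [x[0] for x in tricky if len(x) > 1 and x[1] == 'ABC']
tricky_by_last = [x[0] for x in tricky if x[-1] == 'ABC']
print(tricky_by_length)  # ['URL5']
print(tricky_by_last)    # ['ABC', 'URL5']

# Hypothetical edge case: an empty sublist has no last element.
try:
    [x[0] for x in [[]] if x[-1] == 'ABC']
    empty_ok = True
except IndexError:
    empty_ok = False
print(empty_ok)  # False
```

If one-element or empty sublists can never occur in the real data, the x[-1] version is fine; otherwise the length check is the safer choice.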

python list that matches everything

I probably didn't ask correctly: I would like a list value that can match any list: the "inverse" of (None,)
but even with (None,) it will match item as None (which I don't want)
The point is I have a function working with: [x for x in my_list if x[field] not in filter_list]
and I would like to filter everything or nothing without making tests like:
if filter_list==(None,): return [] and if filter_list==('*',): return my_list
PS: I wanted to simplify my question leading to some errors (list identifier) or stupid thing [x for x in x] ;)
Hi,
I need to do some filtering with list comprehension in python.
if I do something like that:
[x for x in list if x in (None,)]
I get rid of all values, which is fine
but I would like to have the same thing to match everything
I can do something like:
[x for x in list if x not in (None,)]
but it won't be homogeneous with the rest
I tried some things but for example (True,) matches only 1
Note that the values to filter are numeric, but if you have something generic (like (None,) to match nothing), it would be great
Thanks
Louis
__contains__ is the magic method that checks if something is in a sequence:
class everything(object):
    def __contains__(self, _):
        return True

for x in (1,2,3):
    print(x in everything())
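A sketch of how such an object could plug into the comprehension from the question without any special-case tests. The function name apply_filter, the class name Everything and the sample rows are assumptions made for illustration.

```python
class Everything:
    """Container stand-in that claims to contain every value."""
    def __contains__(self, _):
        return True

def apply_filter(my_list, field, filter_list):
    # Same comprehension as in the question; filter_list only needs to support `in`.
    return [x for x in my_list if x[field] not in filter_list]

rows = [{'n': 1}, {'n': 2}, {'n': None}]  # hypothetical records

filtered_all = apply_filter(rows, 'n', Everything())  # everything is filtered out
filtered_none = apply_filter(rows, 'n', ())           # empty tuple filters nothing
filtered_two = apply_filter(rows, 'n', (2,))          # ordinary tuple still works

print(filtered_all)   # []
print(filtered_none)  # [{'n': 1}, {'n': 2}, {'n': None}]
print(filtered_two)   # [{'n': 1}, {'n': None}]
```

An empty tuple serves as the "filter nothing" value here, so a None item is kept, unlike with (None,).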
The better syntax would be:
[x for x in lst if x is None]
[x for x in lst if x is not None]
What do you mean by
I would like to have the same thing to match everything
Just do
[x for x in list]
and every item in list is matched.
You could change your program to accept a filter object, instead of a list.
The abstract base filter would have a matches method that returns true if x "matches".
Your general case filters would be constructed with a list argument, and would filter on membership of the list - the matches function would search the list and return true if the argument was in the list.
You could also have two special subclasses of the filter object : none and all.
These would have special match functions which either always return true (all) or false (none).
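The filter-object design described above might look roughly like the following sketch. All class and function names (Filter, ListFilter, AllFilter, NoneFilter, keep_matching) are hypothetical; the answer only specifies a matches method and the two special cases.

```python
class Filter:
    """Abstract base: subclasses decide whether a value matches."""
    def matches(self, x):
        raise NotImplementedError

class ListFilter(Filter):
    """General case: matches values that are members of the given list."""
    def __init__(self, values):
        self.values = list(values)

    def matches(self, x):
        return x in self.values

class AllFilter(Filter):
    """Special case: matches every value."""
    def matches(self, x):
        return True

class NoneFilter(Filter):
    """Special case: matches no value."""
    def matches(self, x):
        return False

def keep_matching(items, flt):
    # The caller never needs to test which kind of filter it was given.
    return [x for x in items if flt.matches(x)]

data = [1, 2, 3, None]
print(keep_matching(data, ListFilter([1, 3])))  # [1, 3]
print(keep_matching(data, AllFilter()))         # [1, 2, 3, None]
print(keep_matching(data, NoneFilter()))        # []
```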
You don't need an if, you can just say
[x for x in list]
but I would like to have the same thing to match everything
To match everything, you don't need an if statement:
[x for x in list1]
or, if you really want one:
[x for x in list1 if x in [x]]
Answering your revised question: the list that "matches" all possible values is effectively of infinite length. So you can't do what you want to do without an if test. I suggest that your arg should be either a list or one of two values representing the "all" and "none" cases:
FILTER_NONE = object() # or []
FILTER_ALL = object()
def filter_func(alist, filter_list):
    if filter_list is FILTER_ALL:
        return []
    elif filter_list is FILTER_NONE:
        return alist
        # or maybe alist[:] # copy the list
    return [x for x in alist if x not in filter_list]
If filter_list is large, you may wish to replace the last line with:
    filter_set = set(filter_list)
    return [x for x in alist if x not in filter_set]
Alternatively, don't bother; just document that filter_list (renamed as filter_collection) can be anything that supports __contains__() and remind readers that sets will be faster than lists.
