python list that matches everything - python

I probably didn't ask correctly: I would like a list value that can match any list: the "inverse" of (None,)
but even with (None,) it will match item as None (which I don't want)
The point is I have a function working with: [x for x in my_list if x[field] not in filter_list]
and I would like to filter everything or nothing without making tests like:
if filter_list==(None,): return [] and if filter_list==('*',): return my_list
PS: I wanted to simplify my question leading to some errors (list identifier) or stupid thing [x for x in x] ;)
Hi,
I need to do some filtering with list comprehension in python.
if I do something like that:
[x for x in list if x in (None,)]
I get rid of all values, which is fine
but I would like to have the same thing to match everything
I can do something like:
[x for x in list if x not in (None,)]
but it won't be homogeneous with the rest
I tried some things but for example (True,) matches only 1
Note that the values to filter are numeric, but if you have something generic (like (None,) to match nothing), it would be great
Thanks
Louis

__contains__ is the magic method that checks if something is in a sequence:
class everything(object):
    def __contains__(self, _):
        return True
for x in (1, 2, 3):
    print(x in everything())
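As a rough sketch of how this could plug into the filter from the question: an instance of everything can be passed wherever a tuple of values is expected, and an empty tuple covers the opposite case, since nothing is "in" it. The sample records, the "id" field and the helper name keep_unfiltered are made up for illustration.

```python
class everything(object):
    """Container-like object that claims to contain any value."""
    def __contains__(self, _):
        return True


def keep_unfiltered(my_list, field, filter_list):
    # Hypothetical helper wrapping the comprehension from the question.
    return [x for x in my_list if x[field] not in filter_list]


records = [{"id": 1}, {"id": 2}, {"id": 3}]  # made-up sample data

print(keep_unfiltered(records, "id", everything()))  # every record is filtered out
print(keep_unfiltered(records, "id", ()))            # nothing is filtered out
print(keep_unfiltered(records, "id", (2,)))          # only the record with id 2 is filtered out
```

This keeps the comprehension itself unchanged; only the object passed as filter_list varies.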

The better syntax would be:
[x for x in lst if x is None]
[x for x in lst if x is not None]

What do you mean by
I would like to have the same thing to match everything
Just do
[x for x in list]
and every item in list is matched.

You could change your program to accept a filter object, instead of a list.
The abstract base filter would have a matches method that returns true if x "matches".
Your general case filters would be constructed with a list argument, and would filter on membership of the list - the matches function would search the list and return true if the argument was in the list.
You could also have two special subclasses of the filter object : none and all.
These would have special match functions which either always return true (all) or false (none).
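A minimal sketch of that design, assuming hashable (for example numeric) values. All class and function names here (Filter, ListFilter, AllFilter, NoneFilter, apply_filter) are hypothetical and only illustrate the idea:

```python
class Filter(object):
    """Base filter; subclasses decide whether a value matches."""
    def matches(self, value):
        raise NotImplementedError


class ListFilter(Filter):
    """General case: matches values that are members of the given list."""
    def __init__(self, values):
        self.values = set(values)  # assumes the values are hashable

    def matches(self, value):
        return value in self.values


class AllFilter(Filter):
    """Special case: matches every value."""
    def matches(self, value):
        return True


class NoneFilter(Filter):
    """Special case: matches no value."""
    def matches(self, value):
        return False


def apply_filter(items, flt):
    # Keep the items the filter does NOT match, as in the question.
    return [x for x in items if not flt.matches(x)]


print(apply_filter([1, 2, 3], ListFilter([2])))  # drops 2
print(apply_filter([1, 2, 3], AllFilter()))      # drops everything
print(apply_filter([1, 2, 3], NoneFilter()))     # drops nothing
```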

You don't need an if, you can just say
[x for x in list]

but I would like to have the same
thing to match everything
To match everything, you don't need if statement
[x for x in list1]
or If you really like to do
[x for x in list1 if x in [x]]

Answering your revised question: the list that "matches" all possible values is effectively of infinite length. So you can't do what you want to do without an if test. I suggest that your arg should be either a list or one of two values representing the "all" and "none" cases:
FILTER_NONE = object() # or []
FILTER_ALL = object()
def filter_func(alist, filter_list):
    if filter_list is FILTER_ALL:
        return []
    elif filter_list is FILTER_NONE:
        return alist  # or maybe alist[:] to copy the list
    return [x for x in alist if x not in filter_list]
If filter_list is large, you may wish to replace the last line with:
    filter_set = set(filter_list)
    return [x for x in alist if x not in filter_set]
Alternatively, don't bother; just document that filter_list (renamed as filter_collection) can be anything that supports __contains__() and remind readers that sets will be faster than lists.

Related

How to remove matching item from nested list?

I have a list with lists and I would like remove a wildcard matching item from each list if present, otherwise return it as it is.
Example
nested_list = [["abc","fds","gfssdf"],["dfsdf","cds","dvc"],["dsaf","abcvs","ewq"],...]
What I tried to do is:
for x in nested_list :
    for y in x:
        if re.search('abc.+', y) in x:
            nested_list.remove(x)
However it returns the same list, without any changes
My desirable output would be:
nested_list = [["fds","gfssdf"],["dfsdf","cds","dvc"],["dsaf","ewq"],...]
Is there a solution?
Here is one way to do this with a nested 2D list comprehension:
import re
nested_list = [["abc","fds","gfssdf"],["dfsdf","cds","dvc"],["dsaf","abcvs","ewq"]]
output = [[y for y in x if not re.search(r'^abc', y)] for x in nested_list]
print(output) # [['fds', 'gfssdf'], ['dfsdf', 'cds', 'dvc'], ['dsaf', 'ewq']]
You could also do this using startswith instead of re:
>>> [[y for y in x if not y.startswith("abc")] for x in nested_list]
[['fds', 'gfssdf'], ['dfsdf', 'cds', 'dvc'], ['dsaf', 'ewq']]
The other answers are providing a nice solution, but I wanted to answer OP's original question for learning purposes
There are some mistakes in your code; I'll address them one by one:
if re.search('abc.+', y) in x:
re.search returns None if nothing is found (and a match object otherwise), so you can remove the in x.
The + in abc.+ requires one or more characters after abc, so a bare "abc" is never matched. Change the + to a ? so that the extra character is optional (zero or one).
The original also calls nested_list.remove(x), which removes the whole sublist; to remove only the matching item, use x.remove(y).
If you remove all the elements from an inner list, you end up with an empty list, so let's add a check for that and remove the empty list:
if not x:
    nested_list.remove(x)
Applying those fixes gives us:
import re
nested_list = [["abc","fds","gfssdf"],["dfsdf","cds","dvc"],["dsaf","abcvs","ewq"], ["abc"]]
for x in nested_list[:]:  # iterate over a copy, since nested_list may shrink
    for y in x[:]:  # likewise, iterate over a copy of the sublist
        if re.search('abc.?', y):
            x.remove(y)
    if not x:
        nested_list.remove(x)
print(nested_list)
Which gives the expected output:
[['fds', 'gfssdf'], ['dfsdf', 'cds', 'dvc'], ['dsaf', 'ewq']]
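A general caveat with this kind of in-place approach: removing items from a list while iterating over that same list can silently skip elements, so iterating over a copy (or building a new list) is safer. A small sketch with made-up data:

```python
# Removing while iterating over the same list: the element after each
# removed one shifts into the current position and is never visited.
items = ["abc", "abcd", "xyz"]
for item in items:
    if item.startswith("abc"):
        items.remove(item)
print(items)  # "abcd" was skipped and is still present

# Building a new list (assigned back in place) avoids the problem.
safe = ["abc", "abcd", "xyz"]
safe[:] = [item for item in safe if not item.startswith("abc")]
print(safe)
```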

behaviour difference between list() function and [] in a conditional, the former behaves like string

Why does list(str) behave like a string here when [str] doesn't?
Is there a difference between these methods?
Before someone marks this as a duplicate, please link the answer, because I've spent a fair bit of time trawling through Stack Overflow!
code
x = 'ar'
'a' in list(x)
#True
'a' in [x]
#False
l = list(x)
'a' in l
#True
type(list(x))
#list
type([x])
#list
This is because list() converts the string to a list where each letter is one element, whereas [] creates a list whose elements are the things inside the brackets. In other words, list() converts the string to a list, while [] just puts the string in a list.
You can use debug output for clarifying such things. Like this:
x = 'ar'
print(list(x))
print([x])
Prints this:
['a', 'r']
['ar']
Then let's think logically. list(x) is a constructor of a list from the string, it creates a list of all characters of a given string. And [x] just creates a list with one item: x.
Because you are asking if the element 'a' is in the list. Which it is not, your only element is 'ar'. If you print([x]) the result should be ['ar']
[x] creates a single-element list, where the element is x. So if x = 'ar', then the resulting list is ['ar'].
list(x) casts the variable x into a list. This can work on any iterable object, and strings are iterable. The resulting list is ['a', 'r'].
The element 'a' is in the second list but not the first.
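One way to see the difference is to compare the three membership tests side by side. Note that in on a string is a substring test, while in on a list compares whole elements. This is only an illustrative sketch using the x from the question:

```python
x = 'ar'

print('a' in x)          # substring test on the string itself
print('a' in list(x))    # element test against ['a', 'r']
print('a' in [x])        # element test against ['ar']

print('ar' in x)         # substring test
print('ar' in list(x))   # no single element equals 'ar'
print('ar' in [x])       # the only element equals 'ar'
```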

Python - comparing value to an element of list of lists

Let's assume I have a list like this:
list = [[1],[3],[4],[5],[8],[9],[12],[14],[15]]
Then, for each item in range(16), I want to compare it to the list elements and, if they are equal, do something.
My best attempt so far is this code:
for f in range(16):
    if any(f == any(list) for x in list):
        print('f: ',f)
In this case it prints only once, for f == 1, whereas I want it to print for each equal element. I'm pretty sure I'm comparing an int to a list, which is why I'm not getting the desired result, but I don't know how to get at the inner lists' values :-/
You can use any here. You should also rename list to avoid shadowing the built-in type by that name.
l = [[1],[3],[4],[5],[8],[9],[12],[14],[15]]
for f in range(16):
    if any(f in sub_list for sub_list in l):
        print('f:', f)
any accepts an iterable, and returns True if any element of that iterable is true and False otherwise. What we are doing here is defining a generator expression (f in sub_list...) that checks each sublist for membership. Since any short-circuits (i.e. it doesn't keep checking elements once it has discovered one is True), using a lazily evaluated iterator saves unnecessary effort.
What was happening in your original code was that True also has a numeric value of 1 (bool is a subclass of int, mostly for legacy reasons). So since any(list) was always going to be True, f == any(list) is true only when f is 1.
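The explanation above can be checked directly. This is just a small sketch with a shortened version of the list:

```python
l = [[1], [3], [4]]

print(any(l))                  # True: non-empty sublists are truthy
print(isinstance(True, int))   # True: bool is a subclass of int
print(True == 1)               # True

# So "f == any(l)" is really "f == True", which only holds for f == 1.
matches = [f for f in range(16) if f == any(l)]
print(matches)
```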
In this case you should try to have a list of integers. If it is not possible, I would use the following code:
l = [[1],[3],[4],[5],[8],[9],[12],[14],[15]]
for i in l:
    if i[0] in range(16):
        print(i[0])

What is an elegant way to convert the result of [x for x in y] from list to a regular variable?

What is an elegant way to convert the result of [x for x in y] from list to a regular variable?
result= [x for x in range(10) if x==7]
The result of the above will be [7].
I am now using result=result[0] but ...it does not look right :-)
thanks
You have a list comprehension on the right hand side. It evaluates to a list.
You want to pick up the first element (which is perhaps the only element for the kind of problems you are trying to solve) from it, so index the 0-th element in the list returned by the list comprehension, just like you would do it for a regular list.
result = [x for x in range(10) if x == 7][0]
You can also use a generator expression instead of a list comprehension and then call the next() function to retrieve the first item from the iterator returned by the generator expression.
result = next(x for x in range(10) if x == 7)
You can use next, which retrieves the next object from an iterator. The argument passed to next here is a generator expression. This saves you from constructing the full list first: the generator only iterates until it hits 7, and it won't evaluate anything further unless next(..) is called on it again.
>>> next(x for x in range(10) if x==7)
7
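One thing to keep in mind with next: if nothing matches, it raises StopIteration unless a default is supplied as the second argument. A short sketch (the value 42 is just an arbitrary non-matching example):

```python
# With a default, a missing match yields the default instead of an exception.
result = next((x for x in range(10) if x == 7), None)
missing = next((x for x in range(10) if x == 42), None)
print(result, missing)

# Without a default, an exhausted generator raises StopIteration.
raised = False
try:
    next(x for x in range(10) if x == 42)
except StopIteration:
    raised = True
print(raised)
```

Note the extra parentheses around the generator expression, which are required once next is given a second argument.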

Uniqueify returning a empty list

I'm new to python and trying to make a function Uniqueify(L) that will be given either a list of numbers or a list of strings (non-empty), and will return a list of the unique elements of that list.
So far I have:
def Uniquefy(x):
    a = []
    for i in range(len(x)):
        if x[i] in a == False:
            a.append(x[i])
    return a
It looks like the if str(x[i]) in a == False: is failing, and that's causing the function to return an empty list.
Any help you guys can provide?
Relational operators all have exactly the same precedence and are chained. This means that this line:
if x[i] in a == False:
is evaluated as follows:
if (x[i] in a) and (a == False):
This is obviously not what you want.
The solution is to remove the second relational operator:
if x[i] not in a:
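The chained evaluation described above can be checked with a small experiment. The variable names are only illustrative:

```python
a = []
x = 1

chained = x in a == False        # evaluated as (x in a) and (a == False)
explicit = (x in a) == False     # what the original code intended
idiomatic = x not in a           # the recommended spelling

print(chained, explicit, idiomatic)

# Even when x IS in the list, the chained form is still False,
# because a list never compares equal to False.
chained_when_present = 1 in [1] == False
print(chained_when_present)
```

Since the chained form is False in both cases, the append never runs and the function returns an empty list.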
You can just create a set based on the list which will only contain unique values:
>>> s = ["a", "b", "a"]
>>> print set(s)
set(['a', 'b'])
The best option here is to use a set instead! By definition, sets only contain unique items and putting the same item in twice will not result in two copies.
If you need to create it from a list and need a list back, try this. However, if there's not a specific reason you NEED a list, then just pass around a set instead (that would be the duck-typing way anyway).
def uniquefy(x):
    return list(set(x))
You can use the built in set type to get unique elements from a collection:
x = [1,2,3,3]
unique_elements = set(x)
You should use set() here. It reduces the in operation time:
def Uniquefy(x):
    a = set()
    for item in x:
        if item not in a:
            a.add(item)
    return list(a)
Or equivalently:
def Uniquefy(x):
    return list(set(x))
If order matters:
def uniquefy(x):
    s = set()
    return [i for i in x if i not in s and s.add(i) is None]
Else:
def uniquefy(x):
    return list(set(x))
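As another possible order-preserving variant, assuming Python 3.7 or later (where dicts keep insertion order) and hashable items, dict.fromkeys can do the deduplication. The function name uniquefy_ordered is made up for this sketch:

```python
def uniquefy_ordered(x):
    # dict keys are unique and keep first-seen order (Python 3.7+).
    return list(dict.fromkeys(x))


print(uniquefy_ordered([3, 1, 3, 2, 1]))
print(uniquefy_ordered(["b", "a", "b"]))
```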
