I am doing some stuff in python, but I am not that used to it yet.
This is what I want the code to do (pseudocode):
if x in list or y in list but not x and y:
    # do stuff
The first part is correct (if x in list or y in list), but I don't know what to write after that: the "but not x and y" part.
I'd probably go for using a set here instead...
if len({x, y}.intersection(your_list)) == 1:
    # do something
This has the advantage that it only needs to iterate once over your_list.
Example:
examples = [
    [1, 2, 3, 4, 5],
    [2, 3, 4, 5, 6],
    [100, 101, 102, 103, 104]
]
for example in examples:
    overlap = {1, 5}.intersection(example)
    print('overlap is', overlap, 'with a length of', len(overlap))
# overlap is {1, 5} with a length of 2
# overlap is {5} with a length of 1
# overlap is set() with a length of 0
With "but" spelled as and, plus a pair of parentheses, you can use:
x_in_list = x in list
y_in_list = y in list
if (x_in_list or y_in_list) and not (x_in_list and y_in_list):
    ...
Or, as suggested by behzad.nouri, use xor:
if (x in list) ^ (y in list):
    ...
This is shorter, but possibly less understandable to a non-CS-savvy reader.
Exclusive or means "one or the other, but not both" and maps perfectly onto what you want to do:
if (x in list) ^ (y in list):
    ...
This looks a bit odd because xor is normally only used for bitwise operations, but it works here because bool is a subclass of int (True and False behave as 1 and 0), and ^ applied to two booleans returns a boolean, so the operator works as expected in an if.
Note, however, that the parentheses are necessary because the xor operator ^ has a higher precedence than in (it's most often used for bitwise math, so this choice is reasonable).
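To make the precedence point concrete, here is a small sketch. The helper name exactly_one_in and the sample values are made up for illustration; they are not from the answers above.

```python
def exactly_one_in(x, y, items):
    """Return True if exactly one of x and y occurs in items."""
    return (x in items) ^ (y in items)

items = [1, 2, 3]

print(exactly_one_in(1, 9, items))   # True: only 1 is present
print(exactly_one_in(1, 2, items))   # False: both are present
print(exactly_one_in(8, 9, items))   # False: neither is present

# Without the parentheses, ^ binds tighter than `in`, so the expression
# is read as the chained comparison `1 in (items ^ 9) in items`,
# and `items ^ 9` is not defined for a list.
try:
    1 in items ^ 9 in items
except TypeError as exc:
    print("TypeError:", exc)
```

Wrapping the test in a function also avoids naming a variable list, which would shadow the built-in.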
Inspired by Jon Clements's centi-upvoted answer,
items, it = {x, y}, (i in items for i in lst)
print(any(it) and not any(it))
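This version short-circuits, so it does less work in the negative cases (for example, it can stop as soon as a second match is found).
it is an iterator over the membership tests for the elements of lst. The first any(it) consumes it up to the first element that is in items; not any(it) then checks that none of the remaining elements are in items.
Note: this only works if the items are unique in lst.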
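A short sketch of how the shared iterator behaves, including the duplicate caveat from the note. The function name exactly_one_short_circuit and the sample lists are invented for this illustration.

```python
def exactly_one_short_circuit(x, y, lst):
    items = {x, y}
    hits = (i in items for i in lst)  # lazy stream of True/False values
    # The first any() stops right after the first True; the second any()
    # resumes from that point and looks for another True.
    return any(hits) and not any(hits)

print(exactly_one_short_circuit(1, 5, [1, 2, 3]))  # True: only 1 is present
print(exactly_one_short_circuit(1, 5, [1, 2, 5]))  # False: both are present
print(exactly_one_short_circuit(1, 5, [7, 8, 9]))  # False: neither is present
print(exactly_one_short_circuit(1, 5, [1, 1, 3]))  # False: the repeated 1 counts as a second hit
```

The last call shows why the list needs unique items: only one of the two values is present, yet the repeated 1 makes the check fail.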
I am new to Python programming. I want to rewrite the following code as a list comprehension:
lx = [1, 2, 3, 4, 5, 1, 2]
ly = [2, 5, 4]
lz = []
for x in lx:
    if x in ly and x not in lz:
        lz.append(x)
This will create a new list with common elements of lx and ly; but the condition x not in lz depends on the list that is being built. How can this code be rewritten as a list comprehension?
You cannot do it that way in a list comprehension as you cannot compare against the list lz that does not yet exist - assuming you are trying to avoid duplicates in the resulting list as in your example.
Instead, you can use the python set which will enforce only a single instance of each value:
lz = set(x for x in lx if x in ly)
And if what you are really after is a set intersection (elements in common):
lz = set(lx) & set(ly)
UPDATE:
As pointed out by @Błotosmętek in the comments, using a set will not retain the order of the elements, because a set is, by definition, unordered. If the order of the elements is significant, a different strategy will be necessary.
The correct answer here is to use sets, because (1) sets naturally have distinct elements, and (2) sets are more efficient than lists for membership tests. So the simple solution is list(set(lx) & set(ly)).
However, sets do not preserve the order that elements are inserted in, so in case the order is important, here's a solution which preserves the order from lx. (If you want the order from ly, simply swap the roles of the two lists.)
def ordered_intersection(lx, ly):
    ly_set = set(ly)
    return [ly_set.remove(x) or x for x in lx if x in ly_set]
Example:
>>> ordered_intersection(lx, ly)
[2, 4, 5]
>>> ordered_intersection(ly, lx)
[2, 5, 4]
It works because ly_set.remove(x) always returns None, which is falsy, so ly_set.remove(x) or x always has the value of x.
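If the remove(x) or x trick feels too clever, here is a sketch of an alternative that avoids side effects inside the comprehension. It is not the answer's method: the name ordered_intersection_fromkeys is made up, and it assumes Python 3.7+, where dict preserves insertion order, so dict.fromkeys(lx) yields each element of lx once, in first-seen order.

```python
def ordered_intersection_fromkeys(lx, ly):
    ly_set = set(ly)  # O(1) membership tests
    # dict.fromkeys drops duplicates while keeping first-occurrence order
    return [x for x in dict.fromkeys(lx) if x in ly_set]

lx = [1, 2, 3, 4, 5, 1, 2]
ly = [2, 5, 4]

print(ordered_intersection_fromkeys(lx, ly))  # [2, 4, 5]
print(ordered_intersection_fromkeys(ly, lx))  # [2, 5, 4]
```

Like the set-based version, this requires the elements to be hashable.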
The reason you cannot do this with a simpler list comprehension like lz = [... if x in lz] is because the whole list comprehension will be evaluated before the resulting list is assigned to the variable lz; so the x in lz test will give a NameError because there is no such variable yet.
That said, it is possible to rewrite your code to directly use a generator expression (which is somewhat like a list comprehension) instead of a for loop; but it is bad code and you shouldn't do this:
def ordered_intersection_bad_dont_do_this(lx, ly):
    lz = []
    lz.extend(x for x in lx if x in ly and x not in lz)
    return lz
This is not just bad because of repeatedly testing membership of lists; it is worse, because it depends on an unspecified behaviour of the extend method. In particular, it adds each element one by one rather than exhausting the iterator first and then adding them all at once. The docs don't say that this is guaranteed to happen, so this bad solution won't necessarily work in other versions of Python.
If you don't want to use set, this can be another approach using list comprehension.
lx = [1, 2, 3, 4, 5, 1, 2]
ly = [2, 5, 4]
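lz = []
# The comprehension is used only for its side effect; its own result (a list of None) is discarded.
[lz.append(x) for x in lx if (x in ly and x not in lz)]
print(lz)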
Python 3's range objects support O(1) containment checking for integers (1) (2).
So one can do 15 in range(3, 19, 2) and get the correct answer, True.
However, range doesn't support containment checking between two ranges:
a = range(0, 10)
b = range(3, 7)
a in b # False
b in a # False, even though every element in b is also in a
a < b # TypeError: '<' not supported between instances of 'range' and 'range'
It seems that b in a is interpreted as "is any element in the range a equal to the object b?"
However, since a range cannot contain anything but integers, range(...) in range(...) will always return False. IMHO, such a query should be answered as "is every element in range b also in range a?" Given that a range only stores the start, stop, step and length, this query can also be answered in O(1).
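A minimal sketch of what such an O(1) check could look like. The helper range_is_subset is hypothetical, not part of Python; it relies only on len(), indexing, the in operator for integers, and the step attribute of range objects, all of which are constant-time.

```python
def range_is_subset(inner, outer):
    """Return True if every element of range `inner` is also in range `outer`."""
    if len(inner) == 0:
        return True   # the empty range is a subset of anything
    if inner[0] not in outer or inner[-1] not in outer:
        return False  # an endpoint of inner is missing from outer
    if len(inner) == 1:
        return True
    # Both endpoints lie on outer's grid; the points in between do too
    # exactly when inner's step is a multiple of outer's step.
    return inner.step % outer.step == 0

a = range(0, 10)
b = range(3, 7)

print(range_is_subset(b, a))  # True
print(range_is_subset(a, b))  # False
print(range_is_subset(range(0, 10, 2), range(0, 10, 4)))  # False: 2 is not in the outer range
```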
The slice object doesn't help either. It doesn't implement the __contains__ method, and the __lt__ method simply compares two slices as tuples (which makes no sense).
Is there a reason behind the current implementation of these, or is it just an "it happened to be implemented this way" thing?
It looks like the implementation of __contains__ for ranges is range_contains, which just checks if the given element is in the iterable, with a special case for longs.
As you have correctly observed, e in b returns true iff e is an element of b. Any other implementation, such as one that checks if e is a subset of b, would be ambiguous. This is especially problematic in Python, which doesn't require iterables to be homogeneous, and allows nested lists/tuples/sets/iterables.
Consider a world where your desired behavior was implemented. Now, suppose we have the following list:
my_list = [1, 2, (3, 4), 5]
ele1 = (3, 4)
ele2 = (2, 5)
In this case, what should ele1 in my_list and ele2 in my_list return? Your desired behavior makes it tedious to write code such as
if e in my_iterable:
    # ah! e must exist!
    my_iterable.remove(e)
A safer, better way is to keep the current behavior, and instead use a different type-sensitive operator to implement subset predicates:
x = set([1])
y = set([1,2])
x < y # True
[1] < y # raises a TypeError
You're confusing "b contains a" with "a is a subset of b". These are two different things.
b containing a means that the object a itself (here, range(0, 10)) is one of the elements of b. Let's say:
a = [1, 2, 3]
and
b = [1, 2, 3, 4, 5]
a in b would only be true if the list [1, 2, 3] itself were an element of b, as in [[1, 2, 3], 4, 5]. So in checks whether the list itself is inside the other list, not whether all of its elements are.
A list a is a subset of b if all elements of a are inside b. In your example, b is a subset of a, yes, but the range b itself is not an element of a.
If you want subset tests like this, you should probably use a set data structure.
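The difference between membership and subset can be sketched with the two lists above; everything used here is built into Python.

```python
a = [1, 2, 3]
b = [1, 2, 3, 4, 5]

print(a in b)                   # False: no element of b is the list [1, 2, 3]
print(a in [[1, 2, 3], 4, 5])   # True: here the list itself is an element
print(set(a) <= set(b))         # True: subset test on sets
print(set(a).issubset(b))       # True: issubset accepts any iterable
print(all(x in b for x in a))   # True: element-by-element check, no sets needed
```

The all(...) form also works when b is a range, where each x in b test is O(1) for integers.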
Range objects implement the collections.abc.Sequence ABC, which supports containment tests.
a in b
b in a
In this case, you are searching for the range object a inside range b, and vice versa. A range only contains integers, so both should be False.
I have a list, say l = [1,5,8,-3,6,8,-3,2,-4,6,8]. I'm trying to split it into sublists of positive integers, i.e. the above list would give me [[1,5,8],[6,8],[2],[6,8]]. I've tried the following:
l = [1,5,8,-3,6,8,-3,2,-4,6,8]
index = 0
def sublist(somelist):
    a = []
    for i in somelist:
        if i > 0:
            a.append(i)
        else:
            global index
            index += somelist.index(i)
            break
    return a
print(sublist(l))
With this I can get the first sublist ([1,5,8]) and the index of the first negative integer, 3. If I run my function again and pass it l[index+1:], I can get the next sublist, and I assume index would be updated to 6. However, I can't for the life of me figure out how to run the function in a loop, or what condition to use, so that I keep passing it l[index+1:], where index is the most recently encountered position of a negative integer. Any help will be greatly appreciated.
You need to keep track of two levels of list here - the large list that holds the sublists, and the sublists themselves. Start a large list, start a sublist, and keep appending to the current sublist while i is non-negative (which includes positive numbers and 0, by the way). When i is negative, append the current sublist to the large list and start a new sublist. Also note that you should handle cases where the first element is negative or the last element isn't negative.
l = [1,5,8,-3,6,8,-3,2,-4,6,8]
def sublist(somelist):
    result = []
    a = []
    for i in somelist:
        if i > 0: # strictly positive; use i >= 0 if zeros should stay in the sublists
            a.append(i)
        else:
            if a: # make sure a has something in it
                result.append(a)
            a = []
    if a: # if a is still accumulating elements
        result.append(a)
    return result
The result:
>>> sublist(l)
[[1, 5, 8], [6, 8], [2], [6, 8]]
Since somelist never changes, calling somelist.index(i) will always return the index of the first occurrence of that value, not the one you just reached. I'd suggest looking at enumerate to get the index and element as you loop, so no calls to index are necessary.
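A sketch of what the enumerate suggestion could look like, assuming the goal is still runs of strictly positive numbers. The name sublist_by_index is made up for this example.

```python
def sublist_by_index(somelist):
    result = []
    start = 0  # index where the current run of positive numbers begins
    for index, value in enumerate(somelist):
        if value <= 0:
            if index > start:  # skip empty runs (e.g. two separators in a row)
                result.append(somelist[start:index])
            start = index + 1  # the next run begins after this separator
    if start < len(somelist):  # the list ended in the middle of a run
        result.append(somelist[start:])
    return result

l = [1, 5, 8, -3, 6, 8, -3, 2, -4, 6, 8]
print(sublist_by_index(l))  # [[1, 5, 8], [6, 8], [2], [6, 8]]
```

Because enumerate supplies the position of each element directly, no global counter and no call to index are needed.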
That said, you could use the included batteries to solve this as a one-liner, using itertools.groupby:
from itertools import groupby
def sublist(somelist):
    return [list(g) for k, g in groupby(somelist, key=(0).__le__) if k]
Still worth working through your code to understand it, but the above is going to be fast and fairly simple.
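To see what groupby is doing, here is a small sketch that prints each run. It swaps in an explicit lambda key (v > 0) instead of (0).__le__, so unlike the one-liner above it treats 0 as a separator; positive_runs is a name invented for this example.

```python
from itertools import groupby

l = [1, 5, 8, -3, 6, 8, -3, 2, -4, 6, 8]

# groupby yields one (key, group) pair per run of consecutive items
# that share the same key value.
for is_positive, run in groupby(l, key=lambda v: v > 0):
    print(is_positive, list(run))

def positive_runs(somelist):
    # keep only the runs whose key is True
    return [list(run) for keep, run in groupby(somelist, key=lambda v: v > 0) if keep]

print(positive_runs(l))  # [[1, 5, 8], [6, 8], [2], [6, 8]]
```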
This code makes use of concepts found at this URL:
Python list comprehension- "pop" result from original list?
Applying an interesting concept found there to your problem, the following are some alternatives to what others have posted for this question so far. Both use list comprehensions and are commented to explain the purpose of the second option versus the first. I did this experiment as part of my own learning curve, but I hope it helps you and others on this thread as well:
What's nice about these is that if your input list is very very large, you won't have to double your memory expenditure to get the job done. You build one up as you shrink the other down.
This code was tested on Python 2.7 and Python 3.6:
o1 = [1,5,8,-3,6,9,-4,2,-5,6,7,-7, 999, -43, -1, 888]
# modified version of poster's list
o1b = [1,5,8,-3,6,8,-3,2,-4,6,8] # poster's list
o2 = [x for x in (o1.pop() for i in range(len(o1))) \
if (lambda x: True if x < 0 else o1.insert(0, x))(x)]
o2b = [x for x in (o1b.pop() for i in range(len(o1b))) \
if (lambda x: True if x < 0 else o1b.insert(0, x))(x)]
print(o1)
print(o2)
print("")
print(o1b)
print(o2b)
It produces result sets like this (on iPython Jupyter Notebooks):
[1, 5, 8, 6, 9, 2, 6, 7, 999, 888]
[-1, -43, -7, -5, -4, -3]
[1, 5, 8, 6, 8, 2, 6, 8]
[-4, -3, -3]
Here is another version that also uses list comprehensions as the workhorse, but moves the logic into functions in a way that is more readable (I think) and easier to test with different numeric lists. Some will probably prefer the original code since it is shorter:
p1 = [1,5,8,-3,6,9,-4,2,-5,6,7,-7, 999, -43, -1, 888]
# modified version of poster's list
p1b = [1,5,8,-3,6,8,-3,2,-4,6,8] # poster's list
def lst_mut_byNeg_mod(x, pLst): # list mutation by neg nums module
    # this function only makes sense in the context of its usage in
    # split_pos_negs_in_list()
    if x < 0: return True
    else:
        pLst.insert(0,x)
        return False
def split_pos_negs_in_list(pLst):
    pLngth = len(pLst) # reduces nesting of ((()))
    return [x for x in (pLst.pop() for i in range(pLngth)) \
            if lst_mut_byNeg_mod(x, pLst)]
p2 = split_pos_negs_in_list(p1)
print(p1)
print(p2)
print("")
p2b = split_pos_negs_in_list(p1b)
print(p1b)
print(p2b)
Final Thoughts:
Link provided earlier had a number of ideas in the comment thread:
It recommends a Google search for "python bloom filter library". This sounds promising from a performance standpoint, but I have not yet looked into it.
There is a post on that thread with 554 upvotes, and yet it has at least 4 comments explaining what might be faulty with it. When exploring options, it may be advisable to scan the comment trail and not just review what gets the most votes. There are many options proposed for situations like this.
Just for fun, you can use re too for a one-liner.
import re
l = [1,5,8,-3,6,8,-3,2,-4,6,8]
print([list(map(int, x.split(","))) for x in re.findall(r"(?<=[,\[])\s*\d+(?:,\s*\d+)*(?=,\s*-\d+|\])", str(l))])
Output: [[1, 5, 8], [6, 8], [2], [6, 8]]
I want to create a range x from 0 ... n, without any of the numbers in the list y. How can I do this?
For example:
n = 10
y = [3, 7, 8]
x = # Do Something
Should give the output:
x = [0, 1, 2, 4, 5, 6, 9]
One naive way would be to concatenate several ranges, each spanning a set of numbers which have been intersected by the numbers in y. However, I'm not sure of what the simplest syntax to do this is in Python.
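For reference, the naive "concatenate several ranges" idea could be sketched as follows. The function name range_without is made up, and the sketch assumes y holds integers, some of which may fall outside 0 ... n - 1.

```python
from itertools import chain

def range_without(n, excluded):
    # Excluded values inside the range, sorted and de-duplicated
    cuts = sorted(set(v for v in excluded if 0 <= v < n))
    starts = [0] + [c + 1 for c in cuts]  # each sub-range starts just after a cut
    stops = cuts + [n]                    # ... and stops just before the next one
    return list(chain.from_iterable(range(s, e) for s, e in zip(starts, stops)))

n = 10
y = [3, 7, 8]
print(range_without(n, y))  # [0, 1, 2, 4, 5, 6, 9]
```

Adjacent excluded values (7 and 8 here) simply produce an empty sub-range, so no special case is needed.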
You can use a list comprehension to filter the range from 0 to n. range(n) produces the integers from 0 to n - 1, inclusive (as a list in Python 2, or as a lazy range object in Python 3):
x = [i for i in range(n) if i not in y]
This filters out all numbers in y from the range.
You can also turn it into a generator expression (which you can only iterate over once, but which avoids building the whole result list in memory for (very) large n) by replacing [ with ( and ] with ). Further, in Python 2, you can use xrange instead of range to avoid loading the entire range into memory at once. Also, especially if y is a large list, you can turn it into a set first to get O(1) membership checks instead of the O(len(y)) checks on list or tuple objects. Such a version might look like
s = set(y)
x = (i for i in range(n) if i not in s)
hlt's answer is ideal, but I'll quickly suggest another way using set operations.
n = 10
y = [3, 7, 8]
x = set(range(n)) - set(y)
x will be a set object. If you definitely need x to be a list, you can just write x = list(x).
Note that the ordering of a set in Python is not guaranteed to be anything in particular. If order is needed, remember to sort.
Adding on to the above answers, here is my answer using filter with a lambda function (in Python 3, filter returns an iterator, so wrap it in list to get a list):
x = list(filter(lambda i: i not in y, range(n)))
I want to compare two lists of same length
a = [1, 3, 5, 7, 9]
b = [1, 2, 5, 7, 3]
and find out the number of differences n (in this case it'll be n = 2), and also raise an error if the lengths are not equal. What's the pythonic way of doing this?
The simplest way to do this is to use the sum() built-in and a generator expression:
def differences(a, b):
    if len(a) != len(b):
        raise ValueError("Lists of different length.")
    return sum(i != j for i, j in zip(a, b))
We loop over the lists together using zip() and then compare them. As True == 1 and False == 0, we just sum this to get the number of differences. Another option would be to use the conditional part of the generator expression:
sum(1 for i, j in zip(a, b) if i != j)
I can't really say I feel one is more readable than the other, and doubt there will be a performance difference.
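The intermediate booleans make the True == 1 point easier to see. This sketch uses the two lists from the question; the name flags is just for illustration.

```python
a = [1, 3, 5, 7, 9]
b = [1, 2, 5, 7, 3]

# One boolean per position: True where the elements differ
flags = [i != j for i, j in zip(a, b)]
print(flags)       # [False, True, False, False, True]

# Summing treats True as 1 and False as 0
print(sum(flags))  # 2
```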
One-liner solution that also produces an error if the lengths are not equal (in Python 2 only; see below):
>>> sum(map(lambda x,y: bool(x-y),a,b))
2
Now try inputs of different lengths:
>>> sum(map(lambda x,y: bool(x-y),[1,2],[1]))
TypeError
How it works: bool(x-y) returns True if the elements are different. We map this function over the two lists and get [False, True, False, False, True].
If the lists passed to map() have different lengths, Python 2 pads the shorter one with None, so x-y raises a TypeError. In Python 3, map() instead stops at the end of the shorter list and no error is raised.
Finally, sum() of this list of booleans gives 2.
You could use sets. Cast both to a set, then find difference between the two. For example:
>>> a = [1,3,5,7,9]
>>> b = [1,2,5,7,2]
>>> len(set(a) - set(b))
2
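This could be wrapped up in a function to check for length differences first. Note, however, that sets ignore positions and duplicates, so this counts the distinct values of a that are missing from b rather than the positions at which the two lists differ; the two counts only agree in special cases such as this example.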
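A quick sketch, using the lists from the original question, of how the set-based count can differ from the position-by-position count. The variable names positional and by_sets are made up.

```python
a = [1, 3, 5, 7, 9]
b = [1, 2, 5, 7, 3]  # the lists from the question

# Compare element by element, position by position
positional = sum(i != j for i, j in zip(a, b))

# Count the distinct values of a that do not appear anywhere in b
by_sets = len(set(a) - set(b))

print(positional)  # 2
print(by_sets)     # 1: only 9 is missing from b; 3 appears in b at another position
```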