Given two lists, what is the best way to remove the intersection of the two? For example, given:
a = [2,2,2,3]
b = [2,2,5]
I want to return:
a = [2,3]
b = [5]
Let's assume you want to handle the general case, where the same element can appear more than once in each list (a so-called multiset).
You can use collections.Counter:
from collections import Counter
intersection = Counter(a) & Counter(b)
multiset_a_without_common = Counter(a) - intersection
multiset_b_without_common = Counter(b) - intersection
new_a = list(multiset_a_without_common.elements())
new_b = list(multiset_b_without_common.elements())
For your values of a, b, you'll get:
a = [2,2,2,3]
b = [2,2,5]
new_a = [2, 3]
new_b = [5]
Note that in the special case where each element appears at most once in each list, you can use the standard set, as the other answers suggest.
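One thing the Counter version does not promise is that the remaining elements keep their original positions, since elements() groups equal values together. As a minimal sketch, assuming order matters to you, the same intersection can be used as a "budget" of items to drop while walking each list. The name remove_common is hypothetical and not part of collections.

```python
from collections import Counter

def remove_common(a, b):
    """Return copies of a and b with their multiset intersection removed.

    Hypothetical helper: the relative order of the surviving elements is kept.
    """
    common = Counter(a) & Counter(b)

    def strip(items):
        budget = Counter(common)  # fresh copy of the counts to drop
        kept = []
        for item in items:
            if budget[item] > 0:
                budget[item] -= 1  # drop this occurrence
            else:
                kept.append(item)
        return kept

    return strip(a), strip(b)

new_a, new_b = remove_common([2, 2, 2, 3], [2, 2, 5])
print(new_a, new_b)
```

The inputs are not modified, and the first matching occurrences are the ones dropped.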
You can loop through the two lists and remove elements as you find a match, as follows:
a = [2, 2, 2, 3]
b = [2, 2, 5]
for c in list(a):  # iterate over a copy, because a is modified inside the loop
    delete = []
    for n in b:
        if n == c:
            delete.append(c)
            delete.append(n)
            break
    if delete:  # only remove when a match was found
        a.remove(delete[0])
        b.remove(delete[1])
print(a)
print(b)
output:
[2, 3]
[5]
a = [2,2,2,3]
b = [2,2,5]
for i in list(b):  # I call list() on b because otherwise I can't remove from it during the for loop.
    if i in a:
        a.remove(i)
        b.remove(i)
Output:
a = [2, 3]
b = [5]
In summary, I have two keys in the same dictionary, each with its own list.
I want to compare both lists to find their common and differing elements, so that I can count how many elements are identical and how many are present in only one key's list.
I insert the elements at the start, using the files passed as arguments; they are read in this function:
def shared(list):
    dict_shared = {}
    for i in list:
        infile = open(i, 'r')
        if i not in dict_shared:
            dict_shared[i] = []
        for line in infile:
            # note: spacer and record are not defined in this snippet
            dict_shared[spacer].append(record.id)
    return dict_shared
Now I am stuck trying to find a way to compare the lists created and present in the dictionary.
dict = {'a':[1,2,3,4,5], 'b':[2,3,4,6]}
My intention is to compare the lists in order to have the lines shared between two texts.
a: [1,5]
b: [6]
a-b: [2,3,4]
So far I can't find a way to solve this. Any suggestions?
You could use set:
d = {'a':[1,2,3,4,5], 'b':[2,3,4,6]}
print(list(set(d['a'])-set(d['b'])))
print(list(set(d['b'])-set(d['a'])))
print(list(set(d['b'])&set(d['a'])))
result:
[1, 5]
[6]
[2, 3, 4]
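Since the question also asks how many elements fall into each group, here is a small sketch that wraps the same set operations and counts the results. The function name compare_lists and the dictionary keys are made up for illustration, and sorted() assumes the elements can be compared with each other.

```python
def compare_lists(first, second):
    """Return the elements unique to each list and the shared ones, as sorted lists."""
    first_set, second_set = set(first), set(second)
    return {
        "only_first": sorted(first_set - second_set),
        "only_second": sorted(second_set - first_set),
        "shared": sorted(first_set & second_set),
    }

d = {'a': [1, 2, 3, 4, 5], 'b': [2, 3, 4, 6]}
summary = compare_lists(d['a'], d['b'])
# Count how many elements ended up in each group.
counts = {name: len(values) for name, values in summary.items()}
print(summary)
print(counts)
```

Sorting is only there to make the output stable, because sets have no defined order.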
You can do that with Python's built-in set methods such as union, difference, and intersection.
Note: these work on sets.
You can convert a list to a set with:
first_set = set(a)
Example:
print(first_set.difference(second_set))
print(first_set.intersection(second_set))
print(first_set.union(second_set))
You can refer to the following links for more information:
https://www.geeksforgeeks.org/python-intersection-two-lists/
https://www.geeksforgeeks.org/python-union-two-lists/
https://www.geeksforgeeks.org/python-difference-two-lists/
A solution with list comprehension would be:
dictionary = {'a':[1,2,3,4,5], 'b':[2,3,4,6]}
only_in_a = [x for x in dictionary['a'] if x not in dictionary['b']]
only_in_b = [x for x in dictionary['b'] if x not in dictionary['a']]
in_both = [x for x in dictionary['a'] if x in dictionary['b']]
Note that this scales poorly for larger lists: every membership test (x in some_list) scans the whole list, so the total work grows with the product of the two list lengths.
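If the lists are large, one possible way around that cost is to build a set from each list once and test membership against the sets, while still iterating over the lists so the original order is kept. This is a sketch that assumes the elements are hashable; the names set_a and set_b are illustrative.

```python
dictionary = {'a': [1, 2, 3, 4, 5], 'b': [2, 3, 4, 6]}

# Build each set once; membership tests on a set take constant time on average.
set_a = set(dictionary['a'])
set_b = set(dictionary['b'])

only_in_a = [x for x in dictionary['a'] if x not in set_b]
only_in_b = [x for x in dictionary['b'] if x not in set_a]
in_both = [x for x in dictionary['a'] if x in set_b]

print(only_in_a, only_in_b, in_both)
```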
Not sure if I understand correctly what you are trying to achieve but it seems like you'd need set operations:
dictionary = {"a":[1,2,3,4,5], "b":[2,3,4,6]}
#in a but not in b
set(dictionary["a"]) - set(dictionary["b"])
#in b but not in a
set(dictionary["b"]) - set(dictionary["a"])
#union of both
set(dictionary["b"]).union(set(dictionary["a"]))
#intersection of both
set(dictionary["b"]).intersection(set(dictionary["a"]))
You can try something like this
>>> mydict = {'a': [1,2,3,4,5], 'b': [2,3,4,6]}
>>> list(set(mydict['a']).intersection(mydict['b'])) # common to both
[2, 3, 4]
>>> list(set(mydict['a']).difference(mydict['b'])) # a - b
[1, 5]
>>> list(set(mydict['b']).difference(mydict['a'])) # b - a
[6]
>>> list(set(mydict['a']).union(mydict['b'])) # union of both
[1, 2, 3, 4, 5, 6]
Try this
print("a - b : {} ".format(list(set(_dict['a']) - set(_dict['b']))))
print('b - a : {} '.format(list(set(_dict['b']) - set(_dict['a']))))
print('a \u2229 b : {} '.format(list(set(_dict['a']).intersection(set(_dict['b'])))))
Output
a - b : [1, 5]
b - a : [6]
a ∩ b : [2, 3, 4]
I have two lists:
A = [5,5,4,3]
B = [5,1]
I want to remove values that appear in both lists, but only once, i.e. output should be:
Aprime = [5,4,3]
Bprime = [1]
I understand that a good way to get the difference is with sets, but that removes all repeats, not just one occurrence.
You can create collections.Counter objects with the input lists and obtain the differences of the two:
from collections import Counter
a = Counter(A)
b = Counter(B)
Aprime = list((a - b).elements()) # Aprime becomes: [5, 4, 3]
Bprime = list((b - a).elements()) # Bprime becomes: [1]
Use sets to find the duplicates, but then remove them once from the original lists.
dups = set(A).intersection(set(B))
for dup in dups:
    A.remove(dup)
    B.remove(dup)
You can subtract, from the Counter built from each list, a Counter built from the set of the other list, so that each shared value is only removed once.
>>> from collections import Counter
>>> A = [5,5,5,1,1]
>>> B = [5,1,1]
>>> a_new = list((Counter(A) - Counter(set(B))).elements())
>>> b_new = list((Counter(B) - Counter(set(A))).elements())
>>> a_new
[5, 5, 1]
>>> b_new
[1]
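The answers in this thread implement two slightly different readings of "remove once", and they only agree when the second list has no repeats. A short side-by-side sketch on the same input may make the difference clearer; the variable names are illustrative, and sorted() is used only to make the comparison independent of element order.

```python
from collections import Counter

A = [5, 5, 5, 1, 1]
B = [5, 1, 1]

# Full multiset difference: every matching pair is cancelled.
full_a = sorted((Counter(A) - Counter(B)).elements())
full_b = sorted((Counter(B) - Counter(A)).elements())

# "Once only" difference: each shared value is cancelled a single time.
once_a = sorted((Counter(A) - Counter(set(B))).elements())
once_b = sorted((Counter(B) - Counter(set(A))).elements())

print(full_a, full_b)
print(once_a, once_b)
```

Here the value 1 appears twice in both lists, so the full difference cancels both copies, while the once-only version leaves one copy on each side.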
I am a bit stuck on this:
a = [1,2,3,2,4,5]
b = [2,5]
I want to compare the two lists and generate a list with the same items as a, but with any items that don't occur in b set to 0. Valid outputs would be these:
c = [0,2,0,0,0,5]
# or
c = [0,0,0,2,0,5]
I would not know the number of elements in either list beforehand.
I tried for loops, but this:
['0' for x in a if x not in b]
removes all instances of 2, whereas I only want that to happen once (2 occurs once in b for the moment). I need to add a condition to the loop above to keep the elements that match.
The following would work:
a = [1,2,3,2,4,5]
b = [2, 5]
output = []
for x in a:
    if x in b:
        b.remove(x)
        output.append(x)
    else:
        output.append(0)
or for a one-liner, using the fact that b.remove(x) returns None:
a = [1,2,3,2,4,5]
b = {2, 5}
output = [(b.remove(x) or x) if x in b else 0 for x in a]
If the elements in b are unique, this is best done with a set, because sets allow very efficient membership testing:
a = [1,2,3,2,4,5]
b = {2, 5} # make this a set
result = []
for num in a:
    # If this number occurs in b, remove it from b.
    # Otherwise, append a 0.
    if num in b:
        b.remove(num)
        result.append(num)
    else:
        result.append(0)
# result: [0, 2, 0, 0, 0, 5]
If b can contain duplicates, you can replace the set with a Counter, which represents a multiset:
import collections
a = [1,2,3,2,4,5]
b = collections.Counter([2, 2, 5])
result = []
for num in a:
    if b[num] > 0:
        b[num] -= 1
        result.append(num)
    else:
        result.append(0)
# result: [0, 2, 0, 2, 0, 5]
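The loops above consume b as they go, which may be surprising if b is needed afterwards. As a sketch under that assumption, the Counter logic can be wrapped in a function that works on its own copy of the counts. The name mask_unmatched and the fill parameter are hypothetical.

```python
from collections import Counter

def mask_unmatched(a, b, fill=0):
    """Return a copy of a where items not matched by an occurrence in b become fill."""
    remaining = Counter(b)  # private copy of the counts, so b itself is untouched
    result = []
    for item in a:
        if remaining[item] > 0:
            remaining[item] -= 1  # use up one occurrence from b
            result.append(item)
        else:
            result.append(fill)
    return result

a = [1, 2, 3, 2, 4, 5]
b = [2, 5]
print(mask_unmatched(a, b))
print(mask_unmatched(a, [2, 2, 5]))
```

This keeps the first matching occurrence in a, which corresponds to the first of the two valid outputs in the question.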
Here's one way using set. Downside is the list copy operation and initial set conversion. Upside is O(1) removal and lookup operations.
a = [1,2,3,2,4,5]
b = [2,5]
b_set = set(b)
c = a.copy()
for i in range(len(c)):
    if c[i] in b_set:
        b_set.remove(c[i])
    else:
        c[i] = 0
print(c)
[0, 2, 0, 0, 0, 5]
For two lists I want
A = [ 1,2,3,4,5]
B = [4,5,6,7]
result
C = [1,2,3,4,5,6,7]
if I specify an overlap of 2.
Code so far:
concat_list = []
word_overlap = 2
for lst in [lst1, lst2, lst3]:
    if (len(concat_list) != 0):
        if (concat_list[-word_overlap:] != lst[:word_overlap]):
            concat_list += lst
        elif ([concat_list[-word_overlap:]] == lst[:word_overlap]):
            raise SystemExit
    else:
        concat_list += lst
I'm doing this for lists of strings, but it should be the same thing.
EDIT:
What I want my code to do is, first, check if there is any overlap (of 1, of 2, etc), then concatenate lists, eliminating the overlap (so I don't get double elements).
[1,2,3,4,5] + [4,5,6,7] = [1,2,3,4,5,6,7]
but
[1,2,3] + [4,5,6] = [1,2,3,4,5,6]
I want it to also check for any overlap smaller than my set word_overlap.
Here's a naïve variant:
def concat_nooverlap(a,b):
    maxoverlap=min(len(a),len(b))
    for overlap in range(maxoverlap,-1,-1):
        # Check for longest possible overlap first
        if a[-overlap:]==b[:overlap]:
            break # Found an overlap, don't check any shorter
    return a+b[overlap:]
It would be more efficient with types that support slicing by reference, such as buffers or numpy arrays.
One quite odd thing this does is, upon reaching overlap=0, it compares the entirety of a (sliced, which is a copy for a list) with an empty slice of b. That comparison will fail unless they were empty, but it still leaves overlap=0, so the return value is correct. We can handle this case specifically with a slight rewrite:
def concat_nooverlap(a,b):
    maxoverlap=min(len(a),len(b))
    for overlap in range(maxoverlap,0,-1):
        # Check for longest possible overlap first
        if a[-overlap:]==b[:overlap]:
            return a+b[overlap:]
    else:
        return a+b
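The question also mentions a configured word_overlap and checking for any overlap smaller than it. As a hedged sketch of how that limit might be combined with the function above, the search can simply start at the smaller of the limit and the two list lengths. The name concat_with_max_overlap and its max_overlap parameter are assumptions, not part of the original answer.

```python
def concat_with_max_overlap(a, b, max_overlap):
    """Concatenate a and b, dropping the longest overlap of at most max_overlap items."""
    limit = min(max_overlap, len(a), len(b))
    # Try the longest allowed overlap first, then shorter ones.
    for overlap in range(limit, 0, -1):
        if a[-overlap:] == b[:overlap]:
            return a + b[overlap:]
    return a + b  # no overlap found

print(concat_with_max_overlap([1, 2, 3, 4, 5], [4, 5, 6, 7], 2))
print(concat_with_max_overlap([1, 2, 3], [4, 5, 6], 2))
print(concat_with_max_overlap(["the", "quick", "brown"], ["brown", "fox"], 2))
```

Because the comparison uses plain list slices, the same function works for lists of strings as well as numbers.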
You can use set and union:
s.union(t): new set with elements from both s and t
>>> list(set(A) | set(B))
[1, 2, 3, 4, 5, 6, 7]
But you can't control the exact number of overlapping elements this way.
To answer your question, you will have to be a bit crafty and use a combination of sets:
get a new list with elements from both A and B
get a new list with elements common to A and B
keep only the number of elements you need in this list using slicing
get a new list with elements in either of the two previous lists but not both
OVERLAP = 1
A = [1, 2, 3, 4, 5]
B = [4, 5, 6, 7]
C = list(set(A) | set(B)) # [1, 2, 3, 4, 5, 6, 7]
D = list(set(A) & set(B)) # [4, 5]
D = D[OVERLAP:] # [5]
print(list(set(C) ^ set(D))) # [1, 2, 3, 4, 6, 7]
Just for fun, a one-liner could give this:
list((set(A) | set(B)) ^ set(list(set(A) & set(B))[OVERLAP:])) # [1, 2, 3, 4, 6, 7]
where OVERLAP is the constant giving how many of the common elements you want to keep.
Not sure if I correctly interpreted your question, but you could do it like this:
A = [ 1,2,3,4,5]
B = [4,5,6,7]
overlap = 2
print(A[0:-overlap] + B)
If you want to make sure they have the same value, your check could be along the lines of:
if A[-overlap:] == B[:overlap]:
    print(A[0:-overlap] + B)
else:
    print("error")
Assuming that both lists hold consecutive integers and that list a always has smaller values than list b, I came up with this solution.
This will also help you detect overlap.
def concatenate_list(a,b):
    max_a = a[len(a)-1]
    min_b = b[0]
    if max_a >= min_b:
        print('overlap exists')
        b = b[(max_a - min_b) + 1:]
    else:
        print('no overlap')
    return a + b
For strings you can also do this:
def concatenate_list_strings(a,b):
    count = 0
    for i in range(min(len(a),len(b))):
        max_a = a[len(a) - 1 - count:]
        min_b = b[0:count+1]
        if max_a == min_b:
            b = b[count +1:]
            return 'overlap count ' + str(count), a+b
        count += 1
    return a + b
I'm trying to write a function that, when one list is empty, empties a second list into it in reverse order. So far, I have tried:
a = [1,2,3,4]
b = []
def a_to_b(a, b):
    if not a:
        print('a:',a)
        print('b:',b)
        for c in b:
            a.append(c)
            b.remove(c)
        print('a:',a)
        print('b:',b)
        return True
    else:
        top = a.pop()
        b.append(top)
        print('a:',a)
        print('b:',b)
        return True
I want it to be after each run:
1) a = [1,2,3]
b = [4]
2) a = [1,2]
b = [4,3]
3) a = [1]
b = [4,3,2]
4) a = []
b = [4,3,2,1]
5) a = [1,2,3,4]
b = []
But after this fifth run it is giving me
a = [4,2]
b = [3,1]
And I cannot figure out why it is only applying to every other number in b.
This should work:
a = [1,2,3,4]
b = []
for i in range(len(a)):
    b.append(a[-1])
    a.pop()
    print(a, b)
Your problem is caused by removing elements from the list you're looping over, so the for loop hops over some of them.
The loop's pointer goes through indexes 0, 1, 2, 3, but you have already removed the 0th element, so the pointer goes straight to the value 2 (which is now at index 1 in the remaining list).
To avoid it, you could change your code to:
for c in b:
    a.append(c)
for c in a:
    b.remove(c)
Here's the reason why you're getting a weird result:
for c in b:
    a.append(c)
    b.remove(c)
You're changing list b as you're iterating over it, so things are not going to come out as you expect. Why don't you just do
b.reverse()
a.extend(b)  # a is empty here; extending it in place also changes the caller's list
del b[:]     # empty b in place
in place of what you had before? So it'd be
a = [1,2,3,4]
b = []
def a_to_b(a, b):
    if not a:
        print('a:',a)
        print('b:',b)
        b.reverse()
        a.extend(b)  # modify a in place; rebinding the name would not affect the caller
        del b[:]     # empty b in place
        print('a:',a)
        print('b:',b)
        return True
    else:
        top = a.pop()
        b.append(top)
        print('a:',a)
        print('b:',b)
        return True
def f(a, b):
    if a:
        b.append(a.pop())
    else:
        while b:
            a.append(b.pop())
    print('a: %s' % a)
    print('b: %s' % b)
>>> a = [1, 2, 3, 4]
>>> b = []
>>> f(a, b)
a: [1, 2, 3]
b: [4]
>>> f(a, b)
a: [1, 2]
b: [4, 3]
>>> f(a, b)
a: [1]
b: [4, 3, 2]
>>> f(a, b)
a: []
b: [4, 3, 2, 1]
>>> f(a, b)
a: [1, 2, 3, 4]
b: []
The problem is here:
for c in b:
    a.append(c)
    b.remove(c)
You can't remove objects from the middle of an iterable without confusing the loop.
Here's what happens:
You have b=[4,3,2,1] and a=[]. You start the for loop and c is pointing to the first index of b, i.e. 4. You remove c from the list and put it in a.
Now you have b = [3,2,1] and a=[4] like you expect.
When you try to start the next loop, your index is incremented (c is now pointing at the 2nd element), but the problem is you've messed with the structure of the iterable. So the loop removes c like it's supposed to, but c=2, not 3 like you are expecting.
Now you have a=[4,2] and b=[3,1], and when the loop checks for index 2, it finds that b only has 2 elements, so it exits.
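The walkthrough above can be reproduced with a short sketch that runs the problematic loop next to one possible in-place alternative. Both function names are hypothetical, and each works on a copy of its argument so the two runs do not interfere.

```python
def drain_while_iterating(source):
    """Reproduce the bug: remove items from the list being looped over."""
    source = list(source)
    target = []
    for item in source:      # the loop tracks an index into source
        target.append(item)
        source.remove(item)  # later items shift left, so the next one is skipped
    return target, source

def drain_reversed(source):
    """Move everything to a new list in reverse order without looping and removing."""
    source = list(source)
    target = list(reversed(source))
    source.clear()
    return target, source

print(drain_while_iterating([4, 3, 2, 1]))  # ([4, 2], [3, 1])
print(drain_reversed([4, 3, 2, 1]))         # ([1, 2, 3, 4], [])
```

The first call stops after two iterations because the list has shrunk to two items by the time the loop asks for the third.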