Common elements between two lists not using sets in Python - python

I want count the same elements of two lists. Lists can have duplicate elements, so I can't convert this to sets and use & operator.
a=[2,2,1,1]
b=[1,1,3,3]
set(a) & set(b) work
a & b don't work
It is possible to do it withoud set and dictonary?

In Python 3.x (and Python 2.7, when it's released), you can use collections.Counter for this:
>>> from collections import Counter
>>> list((Counter([2,2,1,1]) & Counter([1,3,3,1])).elements())
[1, 1]
Here's an alternative using collections.defaultdict (available in Python 2.5 and later). It has the nice property that the order of the result is deterministic (it essentially corresponds to the order of the second list).
from collections import defaultdict
def list_intersection(list1, list2):
bag = defaultdict(int)
for elt in list1:
bag[elt] += 1
result = []
for elt in list2:
if elt in bag:
# remove elt from bag, making sure
# that bag counts are kept positive
if bag[elt] == 1:
del bag[elt]
else:
bag[elt] -= 1
result.append(elt)
return result
For both these solutions, the number of occurrences of any given element x in the output list is the minimum of the numbers of occurrences of x in the two input lists. It's not clear from your question whether this is the behavior that you want.

Using sets is the most efficient, but you could always do r = [i for i in l1 if i in l2].

SilentGhost, Mark Dickinson and Lo'oris are right, Thanks very much for report this problem - I need common part of lists, so for:
a=[1,1,1,2]
b=[1,1,3,3]
result should be [1,1]
Sorry for comment in not suitable place - I have registered today.
I modified yours solutions:
def count_common(l1,l2):
l2_copy=list(l2)
counter=0
for i in l1:
if i in l2_copy:
counter+=1
l2_copy.remove(i)
return counter
l1=[1,1,1]
l2=[1,2]
print count_common(l1,l2)
1

Related

python list of sets find symmetric difference in all elements

Consider this list of sets
my_input_list= [
{1,2,3,4,5},
{2,3,7,4,5},
set(),
{1,2,3,4,5,6},
set(),]
I want to get the only exclusive elements 6 and 7 as the response, list or set. Set preferred.
I tried
print reduce(set.symmetric_difference,my_input_list) but that gives
{2,3,4,5,6,7}
And i tried sorting the list by length, smallest first raises an error due to two empty sets. Largest first gives the same result as unsorted.
Any help or ideas please?
Thanks :)
Looks like the most straightforward solution is to count everything and return the elements that only appear once.
This solution uses chain.from_iterable (to flatten your sets) + Counter (to count things). Finally, use a set comprehension to filter elements with count == 1.
from itertools import chain
from collections import Counter
c = Counter(chain.from_iterable(my_input_list))
print({k for k in c if c[k] == 1})
{6, 7}
A quick note; the empty literal {} is used to indicate an empty dict, not set. For the latter, use set().
You could use itertools.chain and collection.Counter:
from itertools import chain
from collections import Counter
r = {k for k,v in Counter(chain.from_iterable(my_input_list)).items() if v==1}

Check number not a sum of 2 ints on a list

Given a list of integers, I want to check a second list and remove from the first only those which can not be made from the sum of two numbers from the second. So given a = [3,19,20] and b = [1,2,17], I'd want [3,19].
Seems like a a cinch with two nested loops - except that I've gotten stuck with break and continue commands.
Here's what I have:
def myFunction(list_a, list_b):
for i in list_a:
for a in list_b:
for b in list_b:
if a + b == i:
break
else:
continue
break
else:
continue
list_a.remove(i)
return list_a
I know what I need to do, just the syntax seems unnecessarily confusing. Can someone show me an easier way? TIA!
You can do like this,
In [13]: from itertools import combinations
In [15]: [item for item in a if item in [sum(i) for i in combinations(b,2)]]
Out[15]: [3, 19]
combinations will give all possible combinations in b and get the list of sum. And just check the value is present in a
Edit
If you don't want to use the itertools wrote a function for it. Like this,
def comb(s):
for i, v1 in enumerate(s):
for j in range(i+1, len(s)):
yield [v1, s[j]]
result = [item for item in a if item in [sum(i) for i in comb(b)]]
Comments on code:
It's very dangerous to delete elements from a list while iterating over it. Perhaps you could append items you want to keep to a new list, and return that.
Your current algorithm is O(nm^2), where n is the size of list_a, and m is the size of list_b. This is pretty inefficient, but a good start to the problem.
Thee's also a lot of unnecessary continue and break statements, which can lead to complicated code that is hard to debug.
You also put everything into one function. If you split up each task into different functions, such as dedicating one function to finding pairs, and one for checking each item in list_a against list_b. This is a way of splitting problems into smaller problems, and using them to solve the bigger problem.
Overall I think your function is doing too much, and the logic could be condensed into much simpler code by breaking down the problem.
Another approach:
Since I found this task interesting, I decided to try it myself. My outlined approach is illustrated below.
1. You can first check if a list has a pair of a given sum in O(n) time using hashing:
def check_pairs(lst, sums):
lookup = set()
for x in lst:
current = sums - x
if current in lookup:
return True
lookup.add(x)
return False
2. Then you could use this function to check if any any pair in list_b is equal to the sum of numbers iterated in list_a:
def remove_first_sum(list_a, list_b):
new_list_a = []
for x in list_a:
check = check_pairs(list_b, x)
if check:
new_list_a.append(x)
return new_list_a
Which keeps numbers in list_a that contribute to a sum of two numbers in list_b.
3. The above can also be written with a list comprehension:
def remove_first_sum(list_a, list_b):
return [x for x in list_a if check_pairs(list_b, x)]
Both of which works as follows:
>>> remove_first_sum([3,19,20], [1,2,17])
[3, 19]
>>> remove_first_sum([3,19,20,18], [1,2,17])
[3, 19, 18]
>>> remove_first_sum([1,2,5,6],[2,3,4])
[5, 6]
Note: Overall the algorithm above is O(n) time complexity, which doesn't require anything too complicated. However, this also leads to O(n) extra auxiliary space, because a set is kept to record what items have been seen.
You can do it by first creating all possible sum combinations, then filtering out elements which don't belong to that combination list
Define the input lists
>>> a = [3,19,20]
>>> b = [1,2,17]
Next we will define all possible combinations of sum of two elements
>>> y = [i+j for k,j in enumerate(b) for i in b[k+1:]]
Next we will apply a function to every element of list a and check if it is present in above calculated list. map function can be use with an if/else clause. map will yield None in case of else clause is successful. To cater for this we can filter the list to remove None values
>>> list(filter(None, map(lambda x: x if x in y else None,a)))
The above operation will output:
>>> [3,19]
You can also write a one-line by combining all these lines into one, but I don't recommend this.
you can try something like that:
a = [3,19,20]
b= [1,2,17,5]
n_m_s=[]
data=[n_m_s.append(i+j) for i in b for j in b if i+j in a]
print(set(n_m_s))
print("after remove")
final_data=[]
for j,i in enumerate(a):
if i not in n_m_s:
final_data.append(i)
print(final_data)
output:
{19, 3}
after remove
[20]

searching a list for duplicate values in python 2.7

list1 = ["green","red","yellow","purple","Green","blue","blue"]
so I got a list, I want to loop through the list and see if the colour has not been mentioned more than once. If it has it doesn't get append to the new list.
so u should be left with
list2 = ["red","yellow","purple"]
So i've tried this
list1 = ["green","red","yellow","purple","Green","blue","blue","yellow"]
list2 =[]
num = 0
for i in list1:
if (list1[num]).lower == (list1[i]).lower:
num +=1
else:
list2.append(i)
num +=1
but i keep getting an error
Use a Counter: https://docs.python.org/2/library/collections.html#collections.Counter.
from collections import Counter
list1 = ["green","red","yellow","purple","green","blue","blue","yellow"]
list1_counter = Counter([x.lower() for x in list1])
list2 = [x for x in list1 if list1_counter[x.lower()] == 1]
Note that your example is wrong because yellow is present twice.
The first problem is that for i in list1: iterates over the elements in the list, not indices, so with i you have an element in your hands.
Next there is num which is an index, but you seem to increment it rather the wrong way.
I would suggest that you use the following code:
for i in range(len(list1)):
unique = True
for j in range(len(list1)):
if i != j and list1[i].lower() == list1[j].lower():
unique = False
break
if unique:
list2.append(list1[i])
How does this work: here i and j are indices, you iterate over the list and with i you iterate over the indices of element you potentially want to add, now you do a test: you check if somewhere in the list you see another element that is equal. If so, you set unique to False and do it for the next element, otherwise you add.
You can also use a for-else construct like #YevhenKuzmovych says:
for i in range(len(list1)):
for j in range(len(list1)):
if i != j and list1[i].lower() == list1[j].lower():
break
else:
list2.append(list1[i])
You can make the code more elegant by using an any:
for i in range(len(list1)):
if not any(i != j and list1[i].lower() == list1[j].lower() for j in range(len(list1))):
list2.append(list1[i])
Now this is more elegant but still not very efficient. For more efficiency, you can use a Counter:
from collections import Counter
ctr = Counter(x.lower() for x in list1)
Once you have constructed the counter, you look up the amount of times you have seen the element and if it is less than 2, you add it to the list:
from collections import Counter
ctr = Counter(x.lower() for x in list1)
for element in list1:
if ctr[element.lower()] < 2:
list2.append(element)
Finally you can now even use list comprehension to make it very elegantly:
from collections import Counter
ctr = Counter(x.lower() for x in list1)
list2 = [element for element in list1 if ctr[element.lower()] < 2]
Here's another solution:
list2 = []
for i, element in enumerate(list1):
if element.lower() not in [e.lower() for e in list1[:i] + list1[i + 1:]]:
list2.append(element)
By combining the built-ins, you can keep only unique items and preserve order of appearance, with just a single empty class definition and a one-liner:
from future_builtins import map # Only on Python 2 to get generator based map
from collections import Counter, OrderedDict
class OrderedCounter(Counter, OrderedDict):
pass
list1 = ["green","red","yellow","purple","Green","blue","blue"]
# On Python 2, use .iteritems() to avoid temporary list
list2 = [x for x, cnt in OrderedCounter(map(str.lower, list1)).items() if cnt == 1]
# Result: ['red', 'yellow', 'purple']
You use the Counter feature to count an iterable, while inheriting from OrderedDict preserves key order. All you have to do is filter the results to check and strip the counts, reducing your code complexity significantly.
It also reduces the work to one pass of the original list, with a second pass that does work proportional to the number of unique items in the list, rather than having to perform two passes of the original list (important if duplicates are common and the list is huge).
Here's yet another solution...
Firstly, before I get started, I think there is a typo... You have "yellow" listed twice and you specified you'd have yellow in the end. So, to that, I will provide two scripts, one that allows the duplicates and one that doesn't.
Original:
list1 = ["green","red","yellow","purple","Green","blue","blue","yellow"]
list2 =[]
num = 0
for i in list1:
if (list1[num]).lower == (list1[i]).lower:
num +=1
else:
list2.append(i)
num +=1
Modified (does allow duplicates):
colorlist_1 = ["green","red","yellow","purple","Green","blue","blue","yellow"]
colorlist_2 = []
seek = set()
i = 0
numberOfColors = len(colorlist_1)
while i < numberOfColors:
if numberOfColors[i].lower() not in seek:
seek.add(numberOfColors[i].lower())
colorlist_2.append(numberOfColors[i].lower())
i+=1
print(colorlist_2)
# prints ["green","red","yellow","purple","blue"]
Modified (does not allow duplicates):
EDIT
Willem Van Onsem's answer is totally applicable and already thorough.

Find how many lists in list have the same element

I am new at Python, so I'm having trouble with something. I have a few string lists in one list.
list=[ [['AA','A0'],['AB','A0']],
[['AA','B0'],['AB','A0']],
[['A0','00'],['00','A0'], [['00','BB'],['AB','A0'],['AA','A0']] ]
]
And I have to find how many lists have the same element. For example, the correct result for the above list is 3 for the element ['AB','A0'] because it is the element that connects the most of them.
I wrote some code...but it's not good...it works for 2 lists in list,but not for more....
Please,help!
This is my code...for the above list...
for t in range(0,len(list)-1):
pattern=[]
flag=True
pattern.append(list[t])
count=1
rest=list[t+1:]
for p in pattern:
for j in p:
if flag==False:
break
pair= j
for i in rest:
for y in i:
if pair==y:
count=count+1
break
if brojac==len(list):
flag=False
break
Since your data structure is rather complex, you might want to build a recursive function, that is a function that calls itself (http://en.wikipedia.org/wiki/Recursion_(computer_science)).
This function is rather simple. You iterate through all items of the original list. If the current item is equal to the value you are searching for, you increment the number of found objects by 1. If the item is itself a list, you will go through that sub-list and find all matches in that sub-list (by calling the same function on the sub-list, instead of the original list). You then increment the total number of found objects by the count in your sub-list. I hope my explanation is somewhat clear.
alist=[[['AA','A0'],['AB','A0']],[['AA','B0'],['AB','A0']],[['A0','00'],['00','A0'],[['00','BB'],['AB','A0'],['AA','A0']]]]
def count_in_list(val, arr):
val_is_list = isinstance(val, list)
ct = 0
for item in arr:
item_is_list = isinstance(item, list)
if item == val or (val_is_list and item_is_list and sorted(item) == sorted(val)):
ct += 1
if item_is_list :
ct += count_in_list(val, item)
return ct
print count_in_list(['AB', 'A0'], alist)
This is an iterative approach that will also work using python3 that will get the count of all sublists:
from collections import defaultdict
d = defaultdict(int)
def counter(lst, d):
it = iter(lst)
nxt = next(it)
while nxt:
if isinstance(nxt, list):
if nxt and isinstance(nxt[0], str):
d[tuple(nxt)] += 1
rev = tuple(reversed(nxt))
if rev in d:
d[rev] += 1
else:
lst += nxt
nxt = next(it,"")
return d
print((counter(lst, d)['AB', 'A0'])
3
It will only work on data like your input, nesting of strings beside lists will break the code.
To get a single sublist count is easier:
def counter(lst, ele):
it = iter(lst)
nxt = next(it)
count = 0
while nxt:
if isinstance(nxt, list):
if ele in (nxt, nxt[::-1]):
count += 1
else:
lst += nxt
nxt = next(it, "")
return count
print(counter(lst, ['AB', 'A0']))
3
Ooookay - this maybe isn't very nice and straightforward code, but that's how i'd try to solve this. Please don't hurt me ;-)
First,
i'd fragment the problem in three smaller ones:
Get rid of your multiple nested lists,
Count the occurence of all value-pairs in the inner lists and
Extract the most occurring value-pair from the counting results.
1.
I'd still use nested lists, but only of two-levels depth. An outer list, to iterate through, and all the two-value-lists inside of it. You can finde an awful lot of information about how to get rid of nested lists right here. As i'm just a beginner, i couldn't make much out of all that very detailed information - but if you scroll down, you'll find an example similar to mine. This is what i understand, this is how i can do.
Note that it's a recursive function. As you mentioned in comments that you think this isn't easy to understand: I think you're right. I'll try to explain it somehow:
I don't know if the nesting depth is consistent in your list. and i don't want to exctract the values themselves, as you want to work with lists. So this function loops through the outer list. For each element, it checks if it's a list. If not, nothing happens. If it is a list, it'll have a look at the first element inside of that list. It'll check again if it's a list or not.
If the first element inside the current list is another list, the function will be called again - recursive - but this time starting with the current inner list. This is repeated until the function finds a list, containing an element on the first position that is NOT a list.
In your example, it'll dig through the complete list-of-lists, until it finds your first string values. Then it gets the list containing this value - and put that in another list, the one that is returned.
Oh boy, that sounds really crazy - tell me if that clarified anything... :-D
"Yo dawg, i herd you like lists, so i put a list in a list..."
def get_inner_lists(some_list):
inner_lists = []
for item in some_list:
if hasattr(item, '__iter__') and not isinstance(item, basestring):
if hasattr(item[0], '__iter__') and not isinstance(item[0], basestring):
inner_lists.extend(get_inner_lists(item))
else:
inner_lists.append(item)
return inner_lists
Whatever - call that function and you'll find your list re-arranged a little bit:
>>> foo = [[['AA','A0'],['AB','A0']],[['AA','B0'],['AB','A0']],[['A0','00'],['00','A0'],[['00','BB'],['AB','A0'],['AA','A0']]]]
>>> print get_inner_lists(foo)
[['AA', 'A0'], ['AB', 'A0'], ['AA', 'B0'], ['AB', 'A0'], ['A0', '00'], ['00', 'A0'], ['00', 'BB'], ['AB', 'A0'], ['AA', 'A0']]
2.
Now i'd iterate through that lists and build a string with their values. This will only work with lists of two values, but as this is what you showed in your example it'll do. While iterating, i'd build up a dictionary with the strings as keys and the occurrence as values. That makes it really easy to add new values and raise the counter of existing ones:
def count_list_values(some_list):
result = {}
for item in some_list:
str = item[0]+'-'+item[1]
if not str in result.keys():
result[str] = 1
else:
result[str] += 1
return result
There you have it, all the counting is done. I don't know if it's needed, but as a side effect there are all values and all occurrences:
>>> print count_list_values(get_inner_lists(foo))
{'00-A0': 1, '00-BB': 1, 'A0-00': 1, 'AB-A0': 3, 'AA-A0': 2, 'AA-B0': 1}
3.
But you want clear results, so let's loop through that dictionary, list all keys and all values, find the maximum value - and return the corresponding key. Having built the string-of-two-values with a seperator (-), it's easy to split it and make a list out of it, again:
def get_max_dict_value(some_dict):
all_keys = []
all_values = []
for key, val in some_dict.items():
all_keys.append(key)
all_values.append(val)
return all_keys[all_values.index(max(all_values))].split('-')
If you define this three little functions and call them combined, this is what you'll get:
>>> print get_max_dict_value(count_list_values(get_inner_lists(foo)))
['AB', 'A0']
Ta-Daa! :-)
If you really have such lists with only nine elements, and you don't need to count values that often - do it manually. By reading values and counting with fingers. It'll be so much easier ;-)
Otherwise, here you go!
Or...
...you wait until some Guru shows up and gives you a super fast, elegant one-line python command that i've never seen before, which will do the same ;-)
This is as simple as I can reasonably make it:
from collections import Counter
lst = [ [['AA','A0'],['AB','A0']],
[['AA','B0'],['AB','A0']],
[['A0','00'],['00','A0'], [['00','BB'],['AB','A0'],['AA','A0']] ]
]
def is_leaf(element):
return (isinstance(element, list) and
len(element) == 2 and
isinstance(element[0], basestring)
and isinstance(element[1], basestring))
def traverse(iterable):
for element in iterable:
if is_leaf(element):
yield tuple(sorted(element))
else:
for value in traverse(element):
yield value
value, count = Counter(traverse(lst)).most_common(1)[0]
print 'Value {!r} is present {} times'.format(value, count)
The traverse() generate yields a series of sorted tuples representing each item in your list. The Counter object counts the number of occurrences of each, and its .most_common(1) method returns the value and count of the most common item.
You've said recursion is too difficult, but I beg to differ: it's the simplest way possible to attack this problem. The sooner you come to love recursion, the happier you'll be. :-)
Hopefully soemthing like this is what you were looking for. It is a bit tenuous and would suggest that recursion is better. But Since you didn't want it that way here is some code that might work. I am not super good at python but hope it will do the job:
def Compare(List):
#Assuming that the list input is a simple list like ["A1","B0"]
myList =[[['AA','A0'],['AB','A0']],[['AA','B0'],['AB','A0']],[['A0','00'],['00','A0'],[['00','BB'],['AB','A0'],['AA','A0']]]]
#Create a counter that will count if the elements are the same
myCounter = 0;
for innerList1 in myList:
for innerList2 in innerList1
for innerList3 in innerList2
for element in innerList3
for myListElements in myList
if (myListElements == element)
myCounter = myCounter + 1;
#I am putting the break here so that it counts how many lists have the
#same elements, not how many elements are the same in the lists
break;
return myCounter;

How does `sum` flatten lists?

A multidimensional list like l=[[1,2],[3,4]] could be converted to a 1D one by doing sum(l,[]). How does this happen?
(This doesn't work directly for higher multidimensional lists, but it can be repeated to handle those cases. For example if A is a 3D-list, then sum(sum(A),[]),[]) will flatten A to a 1D list.)
If your list nested is, as you say, "2D" (meaning that you only want to go one level down, and all 1-level-down items of nested are lists), a simple list comprehension:
flat = [x for sublist in nested for x in sublist]
is the approach I'd recommend -- much more efficient than summing would be (sum is intended for numbers -- it was just too much of a bother to somehow make it block all attempts to "sum" non-numbers... I was the original proposer and first implementer of sum in the Python standard library, so I guess I should know;-).
If you want to go down "as deep as it takes" (for deeply nested lists), recursion is the simplest way, although by eliminating the recursion you can get higher performance (at the price of higher complication).
This recipe suggests a recursive solution, a recursion elimination, and other approaches
(all instructive, though none as simple as the one-liner I suggested earlier in this answer).
sum adds a sequence together using the + operator. e.g sum([1,2,3]) == 6. The 2nd parameter is an optional start value which defaults to 0. e.g. sum([1,2,3], 10) == 16.
In your example it does [] + [1,2] + [3,4] where + on 2 lists concatenates them together. Therefore the result is [1,2,3,4]
The empty list is required as the 2nd paramter to sum because, as mentioned above, the default is for sum to add to 0 (i.e. 0 + [1,2] + [3,4]) which would result in unsupported operand type(s) for +: 'int' and 'list'
This is the relevant section of the help for sum:
sum(sequence[, start]) -> value
Returns the sum of a sequence of
numbers (NOT strings) plus the value
of parameter 'start' (which defaults
to 0).
Note
As wallacoloo comented this is not a general solution for flattening any multi dimensional list. It just works for a list of 1D lists due to the behavior described above.
Update
For a way to flatten 1 level of nesting see this recipe from the itertools page:
def flatten(listOfLists):
"Flatten one level of nesting"
return chain.from_iterable(listOfLists)
To flatten more deeply nested lists (including irregularly nested lists) see the accepted answer to this question (there are also some other questions linked to from that question itself.)
Note that the recipe returns an itertools.chain object (which is iterable) and the other question's answer returns a generator object so you need to wrap either of these in a call to list if you want the full list rather than iterating over it. e.g. list(flatten(my_list_of_lists)).
For any kind of multidiamentional array, this code will do flattening to one dimension :
def flatten(l):
try:
return flatten(l[0]) + (flatten(l[1:]) if len(l) > 1 else []) if type(l) is list else [l]
except IndexError:
return []
It looks to me more like you're looking for a final answer of:
[3, 7]
For that you're best off with a list comprehension
>>> l=[[1,2],[3,4]]
>>> [x+y for x,y in l]
[3, 7]
I wrote a program to do multi-dimensional flattening using recursion. If anyone has comments on making the program better, you can always see me smiling:
def flatten(l):
lf=[]
li=[]
ll=[]
p=0
for i in l:
if type(i).__name__=='list':
li.append(i)
else:
lf.append(i)
ll=[x for i in li for x in i]
lf.extend(ll)
for i in lf:
if type(i).__name__ =='list':
#not completely flattened
flatten(lf)
else:
p=p+1
continue
if p==len(lf):
print(lf)
I've written this function:
def make_array_single_dimension(l):
l2 = []
for x in l:
if type(x).__name__ == "list":
l2 += make_array_single_dimension(x)
else:
l2.append(x)
return l2
It works as well!
The + operator concatenates lists and the starting value is [] an empty list.

Categories