Find indexes of common items in two python lists

Find indexes of common items in two python lists - python

I have two lists in python list_A and list_B and I want to find the common item they share. My code to do so is the following:
both = []
for i in list_A:
for j in list_B:
if i == j:
both.append(i)
The list common in the end contains the common items. However, I want also to return the indexes of those elements in the initial two lists. How can I do so?

It is advised in python that you avoid as much as possible to use for loops if better methods are available. You can efficiently find the common elements in the two lists by using python set as follows
both = set(list_A).intersection(list_B)
Then you can find the indices using the build-in index method
indices_A = [list_A.index(x) for x in both]
indices_B = [list_B.index(x) for x in both]

Instead of iterating through the list, access elements by index:
both = []
for i in range(len(list_A)):
for j in range(len(list_B)):
if list_A[i] == list_B[j]:
both.append((i,j))
Here i and j will take integer values and you can check values in list_A and list_B by index.

You can also get common elements and their indexes with numpy.intersect1d()
common_elements, a_indexes, b_indexes = np.intersect1d(a, b, return_indices=True)

Related

Grouping a grouped list of str without duplicates

I have a grouped list of strings that sort of looks like this, the lists inside of these groups will always contain 5 elements:
text_list = [['aaa','bbb','ccc','ddd','eee'],
['fff','ggg','hhh','iii','jjj'],
['xxx','mmm','ccc','bbb','aaa'],
['fff','xxx','aaa','bbb','ddd'],
['aaa','bbb','ccc','ddd','eee'],
['fff','xxx','aaa','ddd','eee'],
['iii','xxx','ggg','jjj','aaa']]
The objective is simple, group all of the list that is similar by the first 3 elements that is then compared against all of the elements inside of the other groups.
So from the above example the output might look like this (output is the index of the list):
[[0,2,4],[3,5]]
Notice how if there is another list that contains the same elements but in a different order is removed.
I've written the following code to extract the groups but they would return duplicates and I am unsure how to proceed. I also think this might not be the most efficient way to do the extraction as the real list can contain upwards to millions of groups:
grouped_list = []
for i in range(0,len(text_list)):
int_temp = []
for m in range(0,len(text_list)):
if i == m:
continue
bool_check = all( x in text_list[m] for x in text_list[i][0:3])
if bool_check:
if len(int_temp) == 0:
int_temp.append(i)
int_temp.append(m)
continue
int_temp.append(m)
grouped_list.append(int_temp)
## remove index with no groups
grouped_list = [x for x in grouped_list if x != []]
Is there a better way to go about this? How do I remove the duplicate group afterwards? Thank you.
Edit:
To be clearer, I would like to retrieve the lists that is similar to each other but only using the first 3 elements of the other lists. For example, using the first 3 elements from list A, check if list B,C,D... contains all 3 of the elements from list A. Repeat for the entire list then remove any list that contains duplicate elements.

You can build a set of frozensets to keep track of indices of groups with the first 3 items being a subset of the rest of the members:
groups = set()
sets = list(map(set, text_list))
for i, lst in enumerate(text_list):
groups.add(frozenset((i, *(j for j, s in enumerate(sets) if set(lst[:3]) <= s))))
print([sorted(group) for group in groups if len(group) > 1])
If the input list is long, it would be faster to create a set of frozensets of the first 3 items of all sub-lists and use the set to filter all combinations of 3 items from each sub-list, so that the time complexity is essentially linear to the input list rather than quadratic despite the overhead in generating combinations:
from itertools import combinations
sets = {frozenset(lst[:3]) for lst in text_list}
groups = {}
for i, lst in enumerate(text_list):
for c in map(frozenset, combinations(lst, 3)):
if c in sets:
groups.setdefault(c, []).append(i)
print([sorted(group) for group in groups.values() if len(group) > 1])

How to find how many items of list B forms list A

I want to find minimum items of list B that forms list A. lets assume A=['abcdef'] and B=['abc','ad','adef','adef','bdf']. Then since items with index 0 and 2 contains all letters of list A, the answer will be 2. I used the combination to find all the possible combination of B. But I am not sure how to continue. It seems most optimized way is Brute Force.
A=['abcdef']
B=['abc','ad','adef','adef','bdf']
for L in range(0, len(B)+1):
for subset in itertools.combinations(B, L):
if subset==A:
print(subset)

Here is an extension of your attempt. I have used a rather confusing comprehension to flatten the the list out into a set so I can make use of straightforward set operations to check whether one is a subset of the other.
import itertools
A='abcdef'
B=['abc','ad','adef','adef','bdf']
set_a = set(A)
solved = False
for L in range(0, len(B)+1):
for subset in itertools.combinations(B, L):
s = set(item for sublist in subset for item in sublist)
if set_a.issubset(s):
print(f'String A can be made from {L} items of list B, specifically: {subset}')
solved = True
break
if solved: break

Reordering elements of lists in a list of lists

I have a list of lists composed of elements that are strings and integers, e.g.
a = [['i','j',1,2,3...n], ['k','l',4,5,6...n], ['m','n',7,8,9...n]]
I would like to reorder the string elements in each sub list such that a[l][1] takes the place of a[l][0] and vice versa, so I end up with:
a = [['j','i',1,2,3...n], ['l','k',4,5,6...n], ['n','m',7,8,9...n]]
I have tried iterating through each sub list and using an order variable to change the integer positions, but it does not seem to change the ordering in each list:
order = [1,0,2,3,4...n]
for l in reversed(range(len(a))):
[a[l][j] for j in order]
What am I doing wrong?

[a[l][j] for j in order]
This is just an expression that is evaluated and then discarded.
If you want to change a, you need to assign to it:
a[l] = [a[l][j] for j in order]

The simplest for the concrete issue would be slice assignment in a loop:
for sub in a:
sub[:2] = reversed(sub[:2])
With your general order approach, you could do:
for sub in a:
sub[:] = [sub[i] for i in order] # needs to mutate the actual object!

just use this code:
a[1][1], a[1][0] = a[1][0] , a[1][1]
thanks to python

Check number not a sum of 2 ints on a list

Given a list of integers, I want to check a second list and remove from the first only those which can not be made from the sum of two numbers from the second. So given a = [3,19,20] and b = [1,2,17], I'd want [3,19].
Seems like a a cinch with two nested loops - except that I've gotten stuck with break and continue commands.
Here's what I have:
def myFunction(list_a, list_b):
for i in list_a:
for a in list_b:
for b in list_b:
if a + b == i:
break
else:
continue
break
else:
continue
list_a.remove(i)
return list_a
I know what I need to do, just the syntax seems unnecessarily confusing. Can someone show me an easier way? TIA!

You can do like this,
In [13]: from itertools import combinations
In [15]: [item for item in a if item in [sum(i) for i in combinations(b,2)]]
Out[15]: [3, 19]
combinations will give all possible combinations in b and get the list of sum. And just check the value is present in a
Edit
If you don't want to use the itertools wrote a function for it. Like this,
def comb(s):
for i, v1 in enumerate(s):
for j in range(i+1, len(s)):
yield [v1, s[j]]
result = [item for item in a if item in [sum(i) for i in comb(b)]]

Comments on code:
It's very dangerous to delete elements from a list while iterating over it. Perhaps you could append items you want to keep to a new list, and return that.
Your current algorithm is O(nm^2), where n is the size of list_a, and m is the size of list_b. This is pretty inefficient, but a good start to the problem.
Thee's also a lot of unnecessary continue and break statements, which can lead to complicated code that is hard to debug.
You also put everything into one function. If you split up each task into different functions, such as dedicating one function to finding pairs, and one for checking each item in list_a against list_b. This is a way of splitting problems into smaller problems, and using them to solve the bigger problem.
Overall I think your function is doing too much, and the logic could be condensed into much simpler code by breaking down the problem.
Another approach:
Since I found this task interesting, I decided to try it myself. My outlined approach is illustrated below.
1. You can first check if a list has a pair of a given sum in O(n) time using hashing:
def check_pairs(lst, sums):
lookup = set()
for x in lst:
current = sums - x
if current in lookup:
return True
lookup.add(x)
return False
2. Then you could use this function to check if any any pair in list_b is equal to the sum of numbers iterated in list_a:
def remove_first_sum(list_a, list_b):
new_list_a = []
for x in list_a:
check = check_pairs(list_b, x)
if check:
new_list_a.append(x)
return new_list_a
Which keeps numbers in list_a that contribute to a sum of two numbers in list_b.
3. The above can also be written with a list comprehension:
def remove_first_sum(list_a, list_b):
return [x for x in list_a if check_pairs(list_b, x)]
Both of which works as follows:
>>> remove_first_sum([3,19,20], [1,2,17])
[3, 19]
>>> remove_first_sum([3,19,20,18], [1,2,17])
[3, 19, 18]
>>> remove_first_sum([1,2,5,6],[2,3,4])
[5, 6]
Note: Overall the algorithm above is O(n) time complexity, which doesn't require anything too complicated. However, this also leads to O(n) extra auxiliary space, because a set is kept to record what items have been seen.

You can do it by first creating all possible sum combinations, then filtering out elements which don't belong to that combination list
Define the input lists
>>> a = [3,19,20]
>>> b = [1,2,17]
Next we will define all possible combinations of sum of two elements
>>> y = [i+j for k,j in enumerate(b) for i in b[k+1:]]
Next we will apply a function to every element of list a and check if it is present in above calculated list. map function can be use with an if/else clause. map will yield None in case of else clause is successful. To cater for this we can filter the list to remove None values
>>> list(filter(None, map(lambda x: x if x in y else None,a)))
The above operation will output:
>>> [3,19]
You can also write a one-line by combining all these lines into one, but I don't recommend this.

you can try something like that:
a = [3,19,20]
b= [1,2,17,5]
n_m_s=[]
data=[n_m_s.append(i+j) for i in b for j in b if i+j in a]
print(set(n_m_s))
print("after remove")
final_data=[]
for j,i in enumerate(a):
if i not in n_m_s:
final_data.append(i)
print(final_data)
output:
{19, 3}
after remove
[20]

Python List Indexing or Appending?

What is the best way to add values to a List in terms of processing time, memory usage and just generally what is the best programming option.
list = []
for i in anotherArray:
list.append(i)
or
list = range(len(anotherArray))
for i in list:
list[i] = anotherArray[i]
Considering that anotherArray is for example an array of Tuples. (This is just a simple example)

It really depends on your use case. There is no generic answer here as it depends on what you are trying to do.
In your example, it looks like you are just trying to create a copy of the array, in which case the best way to do this would be to use copy:
from copy import copy
list = copy(anotherArray)
If you are trying to transform the array into another array you should use list comprehension.
list = [i[0] for i in anotherArray] # get the first item from tuples in anotherArray
If you are trying to use both indexes and objects, you should use enumerate:
for i, j in enumerate(list)
which is much better than your second example.
You can also use generators, lambas, maps, filters, etc. The reason all of these possibilities exist is because they are all "better" for different reasons. The writters of python are pretty big on "one right way", so trust me, if there was one generic way which was always better, that is the only way that would exist in python.
Edit: Ran some results of performance for tuple swap and here are the results:
comprehension: 2.682028295999771
enumerate: 5.359116118001111
for in append: 4.177091988000029
for in indexes: 4.612594166001145
As you can tell, comprehension is usually the best bet. Using enumerate is expensive.
Here is the code for the above test:
from timeit import timeit
some_array = [(i, 'a', True) for i in range(0,100000)]
def use_comprehension():
return [(b, a, i) for i, a, b in some_array]
def use_enumerate():
lst = []
for j, k in enumerate(some_array):
i, a, b = k
lst.append((b, a, i))
return lst
def use_for_in_with_append():
lst = []
for i in some_array:
i, a, b = i
lst.append((b, a, i))
return lst
def use_for_in_with_indexes():
lst = [None] * len(some_array)
for j in range(len(some_array)):
i, a, b = some_array[j]
lst[j] = (b, a, i)
return lst
print('comprehension:', timeit(use_comprehension, number=200))
print('enumerate:', timeit(use_enumerate, number=200))
print('for in append:', timeit(use_for_in_with_append, number=200))
print('for in indexes:', timeit(use_for_in_with_indexes, number=200))
Edit2:
It was pointed out to me the the OP just wanted to know the difference between "indexing" and "appending". Really, those are used for two different use cases as well. Indexing is for replacing objects, whereas appending is for adding. However, in a case where the list starts empty, appending will always be better because the indexing has the overhead of creating the list initially. You can see from the results above that indexing is slightly slower, mostly because you have to create the first list.

Best way is list comprehension :
my_list=[i for i in anotherArray]
But based on your problem you can use a generator expression (is more efficient than list comprehension when you just want to loop over your items and you don't need to use some list methods like indexing or len or ... )
my_list=(i for i in anotherArray)

I would actually say the best is a combination of index loops and value loops with enumeration:
for i, j in enumerate(list): # i is the index, j is the value, can't go wrong

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Find indexes of common items in two python lists - python

Instead of iterating through the list, access elements by index: both = [] for i in range(len(list_A)): for j in range(len(list_B)): if list_A[i] == list_B[j]: both.append((i,j)) Here i and j will take integer values and you can check values in list_A and list_B by index.

You can also get common elements and their indexes with numpy.intersect1d() common_elements, a_indexes, b_indexes = np.intersect1d(a, b, return_indices=True)

Related

Grouping a grouped list of str without duplicates

How to find how many items of list B forms list A

Reordering elements of lists in a list of lists

Check number not a sum of 2 ints on a list

Python List Indexing or Appending?

Categories

Resources