Get unique elements from a 2D list - python

I have a 2D list which I create like so:
Z1 = [[0 for x in range(3)] for y in range(4)]
I then proceed to populate this list, such that Z1 looks like this:
[[1, 2, 3], [4, 5, 6], [2, 3, 1], [2, 5, 1]]
I need to extract the unique 1x3 elements of Z1, without regard to order:
Z2 = makeUnique(Z1) # The solution
The contents of Z2 should look like this:
[[4, 5, 6], [2, 5, 1]]
As you can see, I consider [1, 2, 3] and [2, 3, 1] to be duplicates because I don't care about the order.
Also note that single numeric values may appear more than once across elements (e.g. [2, 3, 1] and [2, 5, 1]); it's only when all three values appear together more than once (in the same or different order) that I consider them to be duplicates.
I have searched dozens of similar problems, but none of them seems to address my exact issue. I'm a complete Python beginner so I just need a push in the right direction.
I have already tried :
Z2= dict((x[0], x) for x in Z1).values()
Z2= set(i for j in Z2 for i in j)
But this does not produce the desired behaviour.
Thank you very much for your help!
Louis Vallance

If the order of the elements inside the sublists does not matter, you could use the following:
from collections import Counter
z1 = [[1, 2, 3], [4, 5, 6], [2, 3, 1], [2, 5, 1]]
temp = Counter([tuple(sorted(x)) for x in z1])
z2 = [list(k) for k, v in temp.items() if v == 1]
print(z2) # [[4, 5, 6], [1, 2, 5]]
Some remarks:
sorting makes lists [1, 2, 3] and [2, 3, 1] from the example equal so they get grouped by the Counter
casting to tuple converts the lists to something that is hashable and can therefore be used as a dictionary key.
the Counter creates a dict with the tuples created above as keys and a value equal to the number of times they appear in the original list
the final list-comprehension takes all those keys from the Counter dictionary that have a count of 1.
If the order does matter you can use the following instead:
z1 = [[1, 2, 3], [4, 5, 6], [2, 3, 1], [2, 5, 1]]
def test(sublist, list_):
for sub in list_:
if all(x in sub for x in sublist):
return False
return True
z2 = [x for i, x in enumerate(z1) if test(x, z1[:i] + z1[i+1:])]
print(z2) # [[4, 5, 6], [2, 5, 1]]

Related

How can i sum up all values with the same index in a dictionary which each key has a nested list as a value?

I have a dictionary, each key of dictionary has a list of list (nested list) as its value. What I want is imagine we have:
x = {1: [[1, 2], [3, 5]], 2: [[2, 1], [2, 6]], 3: [[1, 5], [5, 4]]}
My question is how can I access each element of the dictionary and concatenate those with same index: for example first list from all keys:
[1,2] from first keye +
[2,1] from second and
[1,5] from third one
How can I do this?
You can access your nested list easily when you're iterating through your dictionary and append it to a new list and the you apply the sum function.
Code:
x={1: [[1,2],[3,5]] , 2:[[2,1],[2,6]], 3:[[1,5],[5,4]]}
ans=[]
for key in x:
ans += x[key][0]
print(sum(ans))
Output:
12
Assuming you want a list of the first elements, you can do:
>>> x={1: [[1,2],[3,5]] , 2:[[2,1],[2,6]], 3:[[1,5],[5,4]]}
>>> y = [a[0] for a in x.values()]
>>> y
[[1, 2], [2, 1], [1, 5]]
If you want the second element, you can use a[1], etc.
The output you expect is not entirely clear (do you want to sum? concatenate?), but what seems clear is that you want to handle the values as matrices.
You can use numpy for that:
summing the values
import numpy as np
sum(map(np.array, x.values())).tolist()
output:
[[4, 8], [10, 15]] # [[1+2+1, 2+1+5], [3+2+5, 5+6+4]]
concatenating the matrices (horizontally)
import numpy as np
np.hstack(list(map(np.array, x.values()))).tolist()
output:
[[1, 2, 2, 1, 1, 5], [3, 5, 2, 6, 5, 4]]
As explained in How to iterate through two lists in parallel?, zip does exactly that: iterates over a few iterables at the same time and generates tuples of matching-index items from all iterables.
In your case, the iterables are the values of the dict. So just unpack the values to zip:
x = {1: [[1, 2], [3, 5]], 2: [[2, 1], [2, 6]], 3: [[1, 5], [5, 4]]}
for y in zip(*x.values()):
print(y)
Gives:
([1, 2], [2, 1], [1, 5])
([3, 5], [2, 6], [5, 4])

Python - doing maths with a nested list

If I have a nested list, e.g. x = [[1, 2, 3], [2, 4, 6], [3, 5, 7]], how can I calculate the difference between all of them? Let's called the lists inside x - A, B, and C. I want to calculate the difference of A from B & C, then B from A & C, then C from A & B, then put them in a list diff = [].
My problem is correctly indexing the numbers and using them to do maths with corresponding elements in other lists.
This is what I have so far:
for i in range(len(x)):
diff = []
for j in range(len(x)):
if x[i]!=x[j]:
a = x[i]
b = x[j]
for h in range(len(a)):
d = a[h] - b[h]
diff.append(d)
Essentially for the difference of A to B it is ([1-2] + [2-4] + [3-6])
I would like it to return: diff = [[diff(A,B), diff(A,C)], [diff(B,A), diff(B,C)], [diff(C,A), diff(C,B)]] with the correct differences between points.
Thanks in advance!
Your solution is actually not that far off. As Aniketh mentioned, one issue is your use of x[i] != x[j]. Since x[i] and x[j] are arrays, that will actually always evaluate to false.
The reason is that python will not do a useful comparison of arrays by default. It will just check if the array reference is the same. This is obviously not what you want, you are trying to see if the array is at the same index in x. For that use i !=j.
Though there are other solutions posted here, I'll add mine below because I already wrote it. It makes use of python's list comprehensions.
def pairwise_diff(x):
diff = []
for i in range(len(x)):
A = x[i]
for j in range(len(x)):
if i != j:
B = x[j]
assert len(A) == len(B)
item_diff = [A[i] - B[i] for i in range(len(A))]
diff.append(sum(item_diff))
# Take the answers and group them into arrays of length 2
return [diff[i : i + 2] for i in range(0, len(diff), 2)]
x = [[1, 2, 3], [2, 4, 6], [3, 5, 7]]
print(pairwise_diff(x))
This is one of those problems where it's really helpful to know a bit of Python's standard library — especially itertools.
For example to get the pairs of lists you want to operate on, you can reach for itertools.permutations
x = [[1, 2, 3], [2, 4, 6], [3, 5, 7]]
list(permutations(x, r=2))
This gives the pairs of lists your want:
[([1, 2, 3], [2, 4, 6]),
([1, 2, 3], [3, 5, 7]),
([2, 4, 6], [1, 2, 3]),
([2, 4, 6], [3, 5, 7]),
([3, 5, 7], [1, 2, 3]),
([3, 5, 7], [2, 4, 6])]
Now, if you could just group those by the first of each pair...itertools.groupby does just this.
x = [[1, 2, 3], [2, 4, 6], [3, 5, 7]]
list(list(g) for k, g in groupby(permutations(x, r=2), key=lambda p: p[0]))
Which produces a list of lists grouped by the first:
[[([1, 2, 3], [2, 4, 6]), ([1, 2, 3], [3, 5, 7])],
[([2, 4, 6], [1, 2, 3]), ([2, 4, 6], [3, 5, 7])],
[([3, 5, 7], [1, 2, 3]), ([3, 5, 7], [2, 4, 6])]]
Putting it all together, you can make a simple function that subtracts the lists the way you want and pass each pair in:
from itertools import permutations, groupby
def sum_diff(pairs):
return [sum(p - q for p, q in zip(*pair)) for pair in pairs]
x = [[1, 2, 3], [2, 4, 6], [3, 5, 7]]
# call sum_diff for each group of pairs
result = [sum_diff(g) for k, g in groupby(permutations(x, r=2), key=lambda p: p[0])]
# [[-6, -9], [6, -3], [9, 3]]
This reduces the problem to just a couple lines of code and will be performant on large lists. And, since you mentioned the difficulty in keeping indices straight, notice that this uses no indices in the code other than selecting the first element for grouping.
Here is the code I believe you're looking for. I will explain it below:
def diff(a, b):
total = 0
for i in range(len(a)):
total += a[i] - b[i]
return total
x = [[1, 2, 3], [2, 4, 6], [3, 5, 7]]
differences = []
for i in range(len(x)):
soloDiff = []
for j in range(len(x)):
if i != j:
soloDiff.append(diff(x[i],x[j]))
differences.append(soloDiff)
print(differences)
Output:
[[-6, -9], [6, -3], [9, 3]]
First off, in your explanation of your algorithm, you are making it very clear that you should use a function to calculate the differences between two lists since you will be using it repeatedly.
Your for loops start off fine, but you should have a second list to append diff to 3 times. Also, when you are checking for repeats you need to make sure that i != j, not x[i] != x[j]
Let me know if you have any other questions!!
this is the simplest solution i can think:
import numpy as np
x = [[1, 2, 3], [2, 4, 6], [3, 5, 7]]
x = np.array(x)
vectors = ['A','B','C']
for j in range(3):
for k in range(3):
if j!=k:
print(vectors[j],'-',vectors[k],'=', x[j]-x[k])
which will return
A - B = [-1 -2 -3]
A - C = [-2 -3 -4]
B - A = [1 2 3]
B - C = [-1 -1 -1]
C - A = [2 3 4]
C - B = [1 1 1]

Indices of duplicate lists in a nested list

I am trying to solve a problem that is a part of my genome alignment project. The problem goes as follows:
if given a nested list
y = [[1,2,3],[1,2,3],[3,4,5],[6,5,4],[4,2,5],[4,2,5],[1,2,8],[1,2,3]]
extract indices of unique lists into a nested list again.
For example, the output for the above nested list should be
[[0,1,7],[2],[3],[4,5],[6]].
This is because list [1,2,3] is present in 0,1,7th index positions, [3,4,5] in 2nd index position and so on.
Since I will be dealing with large lists, what could be the most optimal way of achieving this in Python?
You could create an dictionary (or OrderedDict if on older pythons). The keys of the dict will be tuples of the sub-lists and the values will be an array of indexes. After looping through, the dictionary values will hold your answer:
from collections import OrderedDict
y = [[1,2,3],[1,2,3],[3,4,5],[6,5,4],[4,2,5],[4,2,5],[1,2,8],[1,2,3]]
lookup = OrderedDict()
for idx,l in enumerate(y):
lookup.setdefault(tuple(l), []).append(idx)
list(lookup.values())
# [[0, 1, 7], [2], [3], [4, 5], [6]]
You could use list comprehension and range to check for duplicate indexes and append them to result.
result = []
for num in range(len(y)):
occurances = [i for i, x in enumerate(y) if x == y[num]]
if occurances not in result: result.append(occurances)
result
#[[0, 1, 7], [2], [3], [4, 5], [6]]
Consider numpy to solve this:
import numpy as np
y = [
[1, 2, 3],
[1, 2, 3],
[3, 4, 5],
[6, 5, 4],
[4, 2, 5],
[4, 2, 5],
[1, 2, 8],
[1, 2, 3]
]
# Returns unique values of array, indices of that
# array, and the indices that would rebuild the original array
unique, indices, inverse = np.unique(y, axis=0, return_index=True, return_inverse=True)
Here's a print out of each variable:
unique = [
[1 2 3]
[1 2 8]
[3 4 5]
[4 2 5]
[6 5 4]]
indices = [0 6 2 4 3]
inverse = [0 0 2 4 3 3 1 0]
If we look at our variable - inverse, we can see that we do indeed get [0, 1, 7] as the index positions for our first unique element [1,2,3], all we need to do now is group them appropriately.
new_list = []
for i in np.argsort(indices):
new_list.append(np.where(inverse == i)[0].tolist())
Output:
new_list = [[0, 1, 7], [2], [3], [4, 5], [6]]
Finally, refs for the code above:
Numpy - unique, where, argsort
One more solution:
y = [[1, 2, 3], [1, 2, 3], [3, 4, 5], [6, 5, 4], [4, 2, 5], [4, 2, 5], [1, 2, 8], [1, 2, 3]]
occurrences = {}
for i, v in enumerate(y):
v = tuple(v)
if v not in occurrences:
occurrences.update({v: []})
occurrences[v].append(i)
print(occurrences.values())

Generate all combinations of 2 lists (game playing)

I am trying to generate all possible combinations between 2 lists A and B in python with a few constraints. A and B alternate in picking values, A always picks first. A and B may have overlapping values. If A has already picked a value, then B cannot pick it, and vice versa.
Both lists need not be of equal lengths. If one list has no available values to pick then I stop generating combinations
Also the elements picked by each must be in increasing order, i.e. A[1] < A[2] < .... A[n] and B[1] < B[2] < .... B[n] where A[i] and B[i] is the i-th element picked by A and B respectively
Example:
A = [1, 2, 3, 4]
B = [2, 5]
Solution I need is
(1), (2), (3), (4),
(1,2), (1,5), (2,5), (3,2), (3,5), (4,2), (4,5),
(1,2,3), (1,2,4), (3,2,4), (1,5,2), (1,5,3), (1,5,4), (2,5,3), (2,5,4), (3,5,4),
(1,2,3,5), (1,2,4,5), (3,2,4,5)
(1,2,3,5,4)
I believe itertools in python can be useful for this but I havent really figured out how to implement it for this case.
As of now, this is how I am solving it:
A = [1, 2, 3, 4]
B = [2, 5]
A_set = set(A)
B_set = set(b)
#Append both sets
C = A.union(B)
for L in range(len(C), 0, -1):
for subset in itertools.combinations(C, L):
#Check if subset meets constraints and print it if it does
As noted in comments, this is probably much too specific to be easily solved using itertools, and you should use a recursive (generator) function instead. Just pick the next element from whichever list's turn it is, keeping track of the elements already selected, and recursively call the function again, swapping and shortening the lists and adding the element to the set of selected elements, until you've got the required number.
Something like this (this might be improved by adding parameters for the current index in both lists instead of actually slicing the lists for the recursive calls):
def solve(n, lst1, lst2, selected):
if n == 0:
yield []
elif lst1:
for i, x in enumerate(lst1):
if x not in selected:
selected.add(x)
for rest in solve(n-1, lst2, lst1[i+1:], selected):
yield [x] + rest
selected.remove(x)
Or a bit more condensed:
def solve(n, lst1, lst2, selected):
if n == 0:
yield []
elif lst1:
yield from ([x] + rest for i, x in enumerate(lst1) if x not in selected
for rest in solve(n-1, lst2, lst1[i+1:], selected.union({x})))
Example:
A = [1, 2, 3, 4]
B = [2, 5]
result = [res for n in range(1, len(A)+len(B)+1) for res in solve(n, A, B, set())]
Afterwards, result is:
[[1], [2], [3], [4],
[1, 2], [1, 5], [2, 5], [3, 2], [3, 5], [4, 2], [4, 5],
[1, 2, 3], [1, 2, 4], [1, 5, 2], [1, 5, 3], [1, 5, 4], [2, 5, 3], [2, 5, 4], [3, 2, 4], [3, 5, 4],
[1, 2, 3, 5], [1, 2, 4, 5], [3, 2, 4, 5],
[1, 2, 3, 5, 4]]

finding a list in a list of list based on one element

I have a list of lists representing a connectivity graph in Python. This list look like a n*2 matrix
example = [[1, 2], [1, 5], [1, 8], [2, 1], [2, 9], [2,5] ]
what I want to do is to find the value of the first elements of the lists where the second element is equal to a user defined value. For instance :
input 1 returns [2] (because [2,1])
input 5 returns [1,2] (because [1,5] and [2,5])
input 7 returns []
in Matlab, I could use
output = example(example(:,1)==input, 2);
but I would like to do this in Python (in the most pythonic and efficient way)
You can use list comprehension as a filter, like this
>>> example = [[1, 2], [1, 5], [1, 8], [2, 1], [2, 9], [2,5]]
>>> n = 5
>>> [first for first, second in example if second == n]
[1, 2]
You can work with the Python functions map and filter very comfortable:
>>> example = [[1, 2], [1, 5], [1, 8], [2, 1], [2, 9], [2,5] ]
>>> n = 5
>>> map(lambda x: x[0], filter(lambda x: n in x, example))
[1,2]
With lambda you can define anonyme functions...
Syntax:
lambda arg0,arg1...: e
arg0,arg1... are your parameters of the fucntion, and e is the expression.
They use lambda functions mostly in functions like map, reduce, filter etc.
exemple = [[1, 2], [1, 5], [1, 8], [2, 1], [2, 9], [2,5] ]
foundElements = []
** input = [...] *** List of Inputs
for item in exemple:
if item[1] in input :
foundElements.append(item[0])

Categories