Related
If I have a nested list, e.g. x = [[1, 2, 3], [2, 4, 6], [3, 5, 7]], how can I calculate the difference between all of them? Let's called the lists inside x - A, B, and C. I want to calculate the difference of A from B & C, then B from A & C, then C from A & B, then put them in a list diff = [].
My problem is correctly indexing the numbers and using them to do maths with corresponding elements in other lists.
This is what I have so far:
for i in range(len(x)):
diff = []
for j in range(len(x)):
if x[i]!=x[j]:
a = x[i]
b = x[j]
for h in range(len(a)):
d = a[h] - b[h]
diff.append(d)
Essentially for the difference of A to B it is ([1-2] + [2-4] + [3-6])
I would like it to return: diff = [[diff(A,B), diff(A,C)], [diff(B,A), diff(B,C)], [diff(C,A), diff(C,B)]] with the correct differences between points.
Thanks in advance!
Your solution is actually not that far off. As Aniketh mentioned, one issue is your use of x[i] != x[j]. Since x[i] and x[j] are arrays, that will actually always evaluate to false.
The reason is that python will not do a useful comparison of arrays by default. It will just check if the array reference is the same. This is obviously not what you want, you are trying to see if the array is at the same index in x. For that use i !=j.
Though there are other solutions posted here, I'll add mine below because I already wrote it. It makes use of python's list comprehensions.
def pairwise_diff(x):
diff = []
for i in range(len(x)):
A = x[i]
for j in range(len(x)):
if i != j:
B = x[j]
assert len(A) == len(B)
item_diff = [A[i] - B[i] for i in range(len(A))]
diff.append(sum(item_diff))
# Take the answers and group them into arrays of length 2
return [diff[i : i + 2] for i in range(0, len(diff), 2)]
x = [[1, 2, 3], [2, 4, 6], [3, 5, 7]]
print(pairwise_diff(x))
This is one of those problems where it's really helpful to know a bit of Python's standard library — especially itertools.
For example to get the pairs of lists you want to operate on, you can reach for itertools.permutations
x = [[1, 2, 3], [2, 4, 6], [3, 5, 7]]
list(permutations(x, r=2))
This gives the pairs of lists your want:
[([1, 2, 3], [2, 4, 6]),
([1, 2, 3], [3, 5, 7]),
([2, 4, 6], [1, 2, 3]),
([2, 4, 6], [3, 5, 7]),
([3, 5, 7], [1, 2, 3]),
([3, 5, 7], [2, 4, 6])]
Now, if you could just group those by the first of each pair...itertools.groupby does just this.
x = [[1, 2, 3], [2, 4, 6], [3, 5, 7]]
list(list(g) for k, g in groupby(permutations(x, r=2), key=lambda p: p[0]))
Which produces a list of lists grouped by the first:
[[([1, 2, 3], [2, 4, 6]), ([1, 2, 3], [3, 5, 7])],
[([2, 4, 6], [1, 2, 3]), ([2, 4, 6], [3, 5, 7])],
[([3, 5, 7], [1, 2, 3]), ([3, 5, 7], [2, 4, 6])]]
Putting it all together, you can make a simple function that subtracts the lists the way you want and pass each pair in:
from itertools import permutations, groupby
def sum_diff(pairs):
return [sum(p - q for p, q in zip(*pair)) for pair in pairs]
x = [[1, 2, 3], [2, 4, 6], [3, 5, 7]]
# call sum_diff for each group of pairs
result = [sum_diff(g) for k, g in groupby(permutations(x, r=2), key=lambda p: p[0])]
# [[-6, -9], [6, -3], [9, 3]]
This reduces the problem to just a couple lines of code and will be performant on large lists. And, since you mentioned the difficulty in keeping indices straight, notice that this uses no indices in the code other than selecting the first element for grouping.
Here is the code I believe you're looking for. I will explain it below:
def diff(a, b):
total = 0
for i in range(len(a)):
total += a[i] - b[i]
return total
x = [[1, 2, 3], [2, 4, 6], [3, 5, 7]]
differences = []
for i in range(len(x)):
soloDiff = []
for j in range(len(x)):
if i != j:
soloDiff.append(diff(x[i],x[j]))
differences.append(soloDiff)
print(differences)
Output:
[[-6, -9], [6, -3], [9, 3]]
First off, in your explanation of your algorithm, you are making it very clear that you should use a function to calculate the differences between two lists since you will be using it repeatedly.
Your for loops start off fine, but you should have a second list to append diff to 3 times. Also, when you are checking for repeats you need to make sure that i != j, not x[i] != x[j]
Let me know if you have any other questions!!
this is the simplest solution i can think:
import numpy as np
x = [[1, 2, 3], [2, 4, 6], [3, 5, 7]]
x = np.array(x)
vectors = ['A','B','C']
for j in range(3):
for k in range(3):
if j!=k:
print(vectors[j],'-',vectors[k],'=', x[j]-x[k])
which will return
A - B = [-1 -2 -3]
A - C = [-2 -3 -4]
B - A = [1 2 3]
B - C = [-1 -1 -1]
C - A = [2 3 4]
C - B = [1 1 1]
I am trying to solve a problem that is a part of my genome alignment project. The problem goes as follows:
if given a nested list
y = [[1,2,3],[1,2,3],[3,4,5],[6,5,4],[4,2,5],[4,2,5],[1,2,8],[1,2,3]]
extract indices of unique lists into a nested list again.
For example, the output for the above nested list should be
[[0,1,7],[2],[3],[4,5],[6]].
This is because list [1,2,3] is present in 0,1,7th index positions, [3,4,5] in 2nd index position and so on.
Since I will be dealing with large lists, what could be the most optimal way of achieving this in Python?
You could create an dictionary (or OrderedDict if on older pythons). The keys of the dict will be tuples of the sub-lists and the values will be an array of indexes. After looping through, the dictionary values will hold your answer:
from collections import OrderedDict
y = [[1,2,3],[1,2,3],[3,4,5],[6,5,4],[4,2,5],[4,2,5],[1,2,8],[1,2,3]]
lookup = OrderedDict()
for idx,l in enumerate(y):
lookup.setdefault(tuple(l), []).append(idx)
list(lookup.values())
# [[0, 1, 7], [2], [3], [4, 5], [6]]
You could use list comprehension and range to check for duplicate indexes and append them to result.
result = []
for num in range(len(y)):
occurances = [i for i, x in enumerate(y) if x == y[num]]
if occurances not in result: result.append(occurances)
result
#[[0, 1, 7], [2], [3], [4, 5], [6]]
Consider numpy to solve this:
import numpy as np
y = [
[1, 2, 3],
[1, 2, 3],
[3, 4, 5],
[6, 5, 4],
[4, 2, 5],
[4, 2, 5],
[1, 2, 8],
[1, 2, 3]
]
# Returns unique values of array, indices of that
# array, and the indices that would rebuild the original array
unique, indices, inverse = np.unique(y, axis=0, return_index=True, return_inverse=True)
Here's a print out of each variable:
unique = [
[1 2 3]
[1 2 8]
[3 4 5]
[4 2 5]
[6 5 4]]
indices = [0 6 2 4 3]
inverse = [0 0 2 4 3 3 1 0]
If we look at our variable - inverse, we can see that we do indeed get [0, 1, 7] as the index positions for our first unique element [1,2,3], all we need to do now is group them appropriately.
new_list = []
for i in np.argsort(indices):
new_list.append(np.where(inverse == i)[0].tolist())
Output:
new_list = [[0, 1, 7], [2], [3], [4, 5], [6]]
Finally, refs for the code above:
Numpy - unique, where, argsort
One more solution:
y = [[1, 2, 3], [1, 2, 3], [3, 4, 5], [6, 5, 4], [4, 2, 5], [4, 2, 5], [1, 2, 8], [1, 2, 3]]
occurrences = {}
for i, v in enumerate(y):
v = tuple(v)
if v not in occurrences:
occurrences.update({v: []})
occurrences[v].append(i)
print(occurrences.values())
I have a 2D list which I create like so:
Z1 = [[0 for x in range(3)] for y in range(4)]
I then proceed to populate this list, such that Z1 looks like this:
[[1, 2, 3], [4, 5, 6], [2, 3, 1], [2, 5, 1]]
I need to extract the unique 1x3 elements of Z1, without regard to order:
Z2 = makeUnique(Z1) # The solution
The contents of Z2 should look like this:
[[4, 5, 6], [2, 5, 1]]
As you can see, I consider [1, 2, 3] and [2, 3, 1] to be duplicates because I don't care about the order.
Also note that single numeric values may appear more than once across elements (e.g. [2, 3, 1] and [2, 5, 1]); it's only when all three values appear together more than once (in the same or different order) that I consider them to be duplicates.
I have searched dozens of similar problems, but none of them seems to address my exact issue. I'm a complete Python beginner so I just need a push in the right direction.
I have already tried :
Z2= dict((x[0], x) for x in Z1).values()
Z2= set(i for j in Z2 for i in j)
But this does not produce the desired behaviour.
Thank you very much for your help!
Louis Vallance
If the order of the elements inside the sublists does not matter, you could use the following:
from collections import Counter
z1 = [[1, 2, 3], [4, 5, 6], [2, 3, 1], [2, 5, 1]]
temp = Counter([tuple(sorted(x)) for x in z1])
z2 = [list(k) for k, v in temp.items() if v == 1]
print(z2) # [[4, 5, 6], [1, 2, 5]]
Some remarks:
sorting makes lists [1, 2, 3] and [2, 3, 1] from the example equal so they get grouped by the Counter
casting to tuple converts the lists to something that is hashable and can therefore be used as a dictionary key.
the Counter creates a dict with the tuples created above as keys and a value equal to the number of times they appear in the original list
the final list-comprehension takes all those keys from the Counter dictionary that have a count of 1.
If the order does matter you can use the following instead:
z1 = [[1, 2, 3], [4, 5, 6], [2, 3, 1], [2, 5, 1]]
def test(sublist, list_):
for sub in list_:
if all(x in sub for x in sublist):
return False
return True
z2 = [x for i, x in enumerate(z1) if test(x, z1[:i] + z1[i+1:])]
print(z2) # [[4, 5, 6], [2, 5, 1]]
I'm relatively new to programming, and I want to sort a 2-D array (lists as they're called in Python) by the value of all the items in each sub-array. For example:
pop = [[1,5,3],[1,1,1],[7,5,8],[2,5,4]]
The sum of the first element of pop would be 9, because 1 + 5 + 3 = 9. The sum of the second would be 3, because 1 + 1 + 1 = 3, and so on.
I want to rearrange this so the new order would be:
newPop = [pop[1], pop[0], pop[3], pop[2]]
How would I do this?
Note: I don't want to sort the elements each sub-array, but sort according to the sum of all the numbers in each sub-array.
You can use sorted():
>>> pop = [[1,5,3],[1,1,1],[7,5,8],[2,5,4]]
>>> newPop = sorted(pop, key=sum)
>>> newPop
[[1, 1, 1], [1, 5, 3], [2, 5, 4], [7, 5, 8]]
You can also sort in-place with pop.sort(key=sum). Unless you definitely want to preserve the original list, you should prefer in-pace sorting.
Try this:
sorted(pop, key=sum)
Explanation:
The sorted() procedure sorts an iterable (a list in this case) in ascending order
Optionally, a key parameter can be passed to determine what property of the elements in the list is going to be used for sorting
In this case, the property is the sum of each of the elements (which are sublists)
So essentially this is what's happening:
[[1,5,3], [1,1,1], [7,5,8], [2,5,4]] # original list
[sum([1,5,3]), sum([1,1,1]), sum([7,5,8]), sum([2,5,4])] # key=sum
[9, 3, 20, 11] # apply key
sorted([9, 3, 20, 11]) # sort
[3, 9, 11, 20] # sorted
[[1,1,1], [1,5,3], [2,5,4], [7,5,8]] # elements coresponding to keys
#arshajii beat me to the punch, and his answer is good. However, if you would prefer an in-place sort:
>>> pop = [[1,5,3],[1,1,1],[7,5,8],[2,5,4]]
>>> pop.sort(key=sum)
>>> pop
[[1, 1, 1], [1, 5, 3], [2, 5, 4], [7, 5, 8]]
I have to look up Python's sorting algorithm -- I think it's called Timsort, bit I'm pretty sure an in-place sort would be less memory intensive and about the same speed.
Edit: As per this answer, I would definitely recommend x.sort()
If you wanted to sort the lists in a less traditional way, you could write your own function (that takes one parameter.) At risk of starting a flame war, I would heavily advise against lambda.
For example, if you wanted the first number to be weighted more heavily than the second number more heavily than the third number, etc:
>>> def weightedSum(listToSum):
... ws = 0
... weight = len(listToSum)
... for i in listToSum:
... ws += i * weight
... weight -= 1
... return ws
...
>>> weightedSum([1, 2, 3])
10
>>> 1 * 3 + 2 * 2 + 3 * 1
10
>>> pop
[[1, 5, 3], [1, 1, 1], [7, 5, 8], [2, 5, 4]]
>>> pop.sort(key=weightedSum)
>>> pop
[[1, 1, 1], [1, 5, 3], [2, 5, 4], [7, 5, 8]]
>>> pop += [[1, 3, 8]]
>>> pop.sort(key=weightedSum)
>>> pop
[[1, 1, 1], [1, 5, 3], [1, 3, 8], [2, 5, 4], [7, 5, 8]]
I have a list in the form of
[ [[a,b,c],[d,e,f]] , [[a,b,c],[d,e,f]] , [[a,b,c],[d,e,f]] ... ] etc.
I want to return the minimal c value and the maximal c+f value. Is this possible?
For the minimum c:
min(c for (a,b,c),(d,e,f) in your_list)
For the maximum c+f
max(c+f for (a,b,c),(d,e,f) in your_list)
Example:
>>> your_list = [[[1,2,3],[4,5,6]], [[0,1,2],[3,4,5]], [[2,3,4],[5,6,7]]]
>>> min(c for (a,b,c),(d,e,f) in lst)
2
>>> max(c+f for (a,b,c),(d,e,f) in lst)
11
List comprehension to the rescue
a=[[[1,2,3],[4,5,6]], [[2,3,4],[4,5,6]]]
>>> min([x[0][2] for x in a])
3
>>> max([x[0][2]+ x[1][2] for x in a])
10
You have to map your list to one containing just the items you care about.
Here is one possible way of doing this:
x = [[[5, 5, 3], [6, 9, 7]], [[6, 2, 4], [0, 7, 5]], [[2, 5, 6], [6, 6, 9]], [[7, 3, 5], [6, 3, 2]], [[3, 10, 1], [6, 8, 2]], [[1, 2, 2], [0, 9, 7]], [[9, 5, 2], [7, 9, 9]], [[4, 0, 0], [1, 10, 6]], [[1, 5, 6], [1, 7, 3]], [[6, 1, 4], [1, 2, 0]]]
minc = min(l[0][2] for l in x)
maxcf = max(l[0][2]+l[1][2] for l in x)
The contents of the min and max calls is what is called a "generator", and is responsible for generating a mapping of the original data to the filtered data.
Of course it's possible. You've got a list containing a list of two-element lists that turn out to be lists themselves. Your basic algorithm is
for each of the pairs
if c is less than minimum c so far
make minimum c so far be c
if (c+f) is greater than max c+f so far
make max c+f so far be (c+f)
suppose your list is stored in my_list:
min_c = min(e[0][2] for e in my_list)
max_c_plus_f = max(map(lambda e : e[0][2] + e[1][2], my_list))