I'm relatively new to programming, and I want to sort a 2-D array (a list of lists, as it's called in Python) by the sum of the items in each sub-list. For example:
pop = [[1,5,3],[1,1,1],[7,5,8],[2,5,4]]
The sum of the first sub-list of pop would be 9, because 1 + 5 + 3 = 9. The sum of the second would be 3, because 1 + 1 + 1 = 3, and so on.
I want to rearrange this so the new order would be:
newPop = [pop[1], pop[0], pop[3], pop[2]]
How would I do this?
Note: I don't want to sort the elements of each sub-array, but sort the sub-arrays according to the sum of all the numbers in each one.
You can use sorted():
>>> pop = [[1,5,3],[1,1,1],[7,5,8],[2,5,4]]
>>> newPop = sorted(pop, key=sum)
>>> newPop
[[1, 1, 1], [1, 5, 3], [2, 5, 4], [7, 5, 8]]
You can also sort in place with pop.sort(key=sum). Unless you definitely want to preserve the original list, you should prefer in-place sorting.
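For instance, a quick sketch of the difference (same data as above):
>>> pop = [[1, 5, 3], [1, 1, 1], [7, 5, 8], [2, 5, 4]]
>>> newPop = sorted(pop, key=sum)   # pop is left untouched
>>> pop
[[1, 5, 3], [1, 1, 1], [7, 5, 8], [2, 5, 4]]
>>> pop.sort(key=sum)               # sorts pop itself and returns None
>>> pop
[[1, 1, 1], [1, 5, 3], [2, 5, 4], [7, 5, 8]]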
Try this:
sorted(pop, key=sum)
Explanation:
The sorted() function sorts an iterable (a list in this case) in ascending order
Optionally, a key parameter can be passed to determine what property of the elements in the list is going to be used for sorting
In this case, the property is the sum of each of the elements (which are sublists)
So essentially this is what's happening:
[[1,5,3], [1,1,1], [7,5,8], [2,5,4]] # original list
[sum([1,5,3]), sum([1,1,1]), sum([7,5,8]), sum([2,5,4])] # key=sum
[9, 3, 20, 11] # apply key
sorted([9, 3, 20, 11]) # sort
[3, 9, 11, 20] # sorted
[[1,1,1], [1,5,3], [2,5,4], [7,5,8]] # elements corresponding to keys
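If it helps to see it spelled out, here is a rough decorate-sort-undecorate sketch of the same idea (just an illustration, not how sorted() is implemented internally):
decorated = [(sum(sub), sub) for sub in pop]     # attach each sublist's sum as a key
decorated.sort(key=lambda pair: pair[0])         # sort on the attached key
newPop = [sub for key, sub in decorated]         # strip the keys off again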
@arshajii beat me to the punch, and his answer is good. However, if you would prefer an in-place sort:
>>> pop = [[1,5,3],[1,1,1],[7,5,8],[2,5,4]]
>>> pop.sort(key=sum)
>>> pop
[[1, 1, 1], [1, 5, 3], [2, 5, 4], [7, 5, 8]]
I'd have to look up Python's sorting algorithm -- I think it's called Timsort, but I'm pretty sure an in-place sort would be less memory intensive and about the same speed.
Edit: As per this answer, I would definitely recommend x.sort()
If you wanted to sort the lists in a less traditional way, you could write your own key function (one that takes a single parameter). At the risk of starting a flame war, I would heavily advise against lambda.
For example, if you wanted the first number to be weighted more heavily than the second, the second more heavily than the third, and so on:
>>> def weightedSum(listToSum):
...     ws = 0
...     weight = len(listToSum)
...     for i in listToSum:
...         ws += i * weight
...         weight -= 1
...     return ws
...
>>> weightedSum([1, 2, 3])
10
>>> 1 * 3 + 2 * 2 + 3 * 1
10
>>> pop
[[1, 5, 3], [1, 1, 1], [7, 5, 8], [2, 5, 4]]
>>> pop.sort(key=weightedSum)
>>> pop
[[1, 1, 1], [1, 5, 3], [2, 5, 4], [7, 5, 8]]
>>> pop += [[1, 3, 8]]
>>> pop.sort(key=weightedSum)
>>> pop
[[1, 1, 1], [1, 5, 3], [1, 3, 8], [2, 5, 4], [7, 5, 8]]
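For what it's worth, the same weighting can be written a bit more compactly; this is just an equivalent sketch of the function above:
def weighted_sum(seq):
    # weights count down from len(seq) to 1, exactly like the loop above
    return sum(value * weight for value, weight in zip(seq, range(len(seq), 0, -1)))

pop.sort(key=weighted_sum)  # same ordering as key=weightedSum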
case 1:
a = [1, 2, 3, 4, 5]
k = 2  # the index
a[:k], a[k:] = a[k:], a[:k]
When I swap the slices like this, I get this output:
OUTPUT: [3, 4, 1, 2]
case 2:
b = [1, 2, 3, 4, 5]
b[k:], b[:k] = b[:k], b[k:]
But when I swap the slices like this, I get a different result. The only difference is the order of the assignment targets:
OUTPUT: [3, 4, 5, 1, 2]
If we swap two variables, the order of swapping doesn't make a difference, i.e. a, b = b, a is the same as b, a = a, b.
Why doesn't this work in the case of lists/arrays?
The right hand side is evaluated fully before any assignments are done. Subsequently, the assignments are performed in left to right order.
So the first case evaluates to:
a[:2], a[2:] = [3, 4, 5], [1, 2]
The first assignment replaces the first two elements [1, 2] by the three elements [3, 4, 5], so now we have a == [3, 4, 5, 3, 4, 5]. The second assignment then replaces element 2 onwards by [1, 2], so it results in a == [3, 4, 1, 2].
In the second case, we have:
b[2:], b[:2] = [1, 2], [3, 4, 5]
The first assignment replaces from element 2 onwards by [1, 2] resulting in b == [1, 2, 1, 2]. The second assignment replaces the first two elements by [3, 4, 5], giving b == [3, 4, 5, 1, 2].
You may have noticed by now that this is rather tricky to think about and sometimes works mostly by accident, so I'd recommend simplifying the whole thing to:
a[:] = a[k:] + a[:k]
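For example:
a = [1, 2, 3, 4, 5]
k = 2
a[:] = a[k:] + a[:k]
print(a)  # [3, 4, 5, 1, 2]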
If you swap two plain variables, there's no relation between them, so the order doesn't matter. With slices of the same list, each assignment changes the list that the following slice assignment then operates on.
You can see the steps for case 1 as follows:
>>> a=[1,2,3,4,5]
>>> k=2
>>> # a[:k], a[k:] = a[k:], a[:k]
>>> x, y = a[k:], a[:k]
>>> a[:k] = x
>>> x, a
([3, 4, 5], [3, 4, 5, 3, 4, 5])
>>> a[k:] = y
>>> y, a
([1, 2], [3, 4, 1, 2])
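And the same step-by-step trace for case 2:
>>> b = [1, 2, 3, 4, 5]
>>> # b[k:], b[:k] = b[:k], b[k:]
>>> x, y = b[:k], b[k:]
>>> b[k:] = x
>>> x, b
([1, 2], [1, 2, 1, 2])
>>> b[:k] = y
>>> y, b
([3, 4, 5], [3, 4, 5, 1, 2])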
I have a 2D list which I create like so:
Z1 = [[0 for x in range(3)] for y in range(4)]
I then proceed to populate this list, such that Z1 looks like this:
[[1, 2, 3], [4, 5, 6], [2, 3, 1], [2, 5, 1]]
I need to extract the unique 1x3 elements of Z1, without regard to order:
Z2 = makeUnique(Z1) # The solution
The contents of Z2 should look like this:
[[4, 5, 6], [2, 5, 1]]
As you can see, I consider [1, 2, 3] and [2, 3, 1] to be duplicates because I don't care about the order.
Also note that single numeric values may appear more than once across elements (e.g. [2, 3, 1] and [2, 5, 1]); it's only when all three values appear together more than once (in the same or different order) that I consider them to be duplicates.
I have searched dozens of similar problems, but none of them seems to address my exact issue. I'm a complete Python beginner so I just need a push in the right direction.
I have already tried :
Z2= dict((x[0], x) for x in Z1).values()
Z2= set(i for j in Z2 for i in j)
But this does not produce the desired behaviour.
Thank you very much for your help!
If the order of the elements inside the sublists does not matter, you could use the following:
from collections import Counter
z1 = [[1, 2, 3], [4, 5, 6], [2, 3, 1], [2, 5, 1]]
temp = Counter([tuple(sorted(x)) for x in z1])
z2 = [list(k) for k, v in temp.items() if v == 1]
print(z2) # [[4, 5, 6], [1, 2, 5]]
Some remarks:
sorting makes lists [1, 2, 3] and [2, 3, 1] from the example equal so they get grouped by the Counter
casting to tuple converts the lists to something that is hashable and can therefore be used as a dictionary key.
the Counter creates a dict with the tuples created above as keys and a value equal to the number of times they appear in the original list
the final list-comprehension takes all those keys from the Counter dictionary that have a count of 1.
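To make the intermediate step concrete, temp ends up holding something like this (the exact display order of a Counter can vary):
print(temp)  # Counter({(1, 2, 3): 2, (4, 5, 6): 1, (1, 2, 5): 1})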
If the original order of the elements inside the kept sub-lists matters to you, you can use the following instead:
z1 = [[1, 2, 3], [4, 5, 6], [2, 3, 1], [2, 5, 1]]
def test(sublist, list_):
    for sub in list_:
        if all(x in sub for x in sublist):
            return False
    return True
z2 = [x for i, x in enumerate(z1) if test(x, z1[:i] + z1[i+1:])]
print(z2) # [[4, 5, 6], [2, 5, 1]]
I may not be explaining this clearly.
I want a comparison function that checks whether two lists follow the same pattern of repeated values at the same indices.
For example, take 2 lists A and B, which should be considered the same (accuracy = 100%):
A=[1,2,1,1,3,4,3,2,5]
B=[4,2,4,4,3,1,3,2,5]
since A(0), A(2), A(3) all share the value 1, and B(0), B(2), B(3) all share the value 4;
A(1), A(7) share the value 2, and so do B(1), B(7);
A(4), A(6) share the value 3, and so do B(4), B(6);
A(5) is a unique value in list A, and likewise B(5) is unique in B;
A(8) is a unique value in list A, and likewise B(8) is unique in B.
Then, applying the same rule to lists C and D, the accuracy should be 80%.
C=[1,2,2,2,3,4,4,4,5,6]
D=[3,4,4,4,1,5,5,6,5,6]
To match C's pattern, D(7) should have the same value as D(5) and D(6) (not the same value as D(9)), and D(8) should not share a value with D(5) and D(6); it should be a standalone value.
Note: the values in a list may not be sequential numbers. List A could also be [1,26,1,1,30,4,30,26,5] and B could be [4,22,4,4,3,100,3,22,5], and I would still consider them the same.
How can I write a comparison function that gives me this accuracy?
Thanks!
One option is to compare the length of the set intersection to the length of the set union:
How many elements are in both lists? (set intersection &)
How many elements are there in total? (set union |)
This method doesn't take position or distribution into account:
A = [1, 2, 1, 1, 3, 4, 3, 2, 5]
B = [4, 2, 4, 4, 3, 1, 3, 2, 5]
C = [1, 2, 2, 2, 3, 4, 4, 4, 5, 6]
D = [3, 4, 4, 4, 1, 5, 5, 6, 5, 6]
def overlapping_percentage(x, y):
    return (100.0 * len(set(x) & set(y))) / len(set(x) | set(y))
print(overlapping_percentage(A, B))
# 100.0
print(overlapping_percentage(C, D))
# 83.3
Here's a different method, which might be closer to what you want. It's not perfect and you'll probably have to optimize it.
To be honest, I don't understand exactly where that 80% figure comes from.
This method extracts a "fingerprint" out of lists: where the elements are placed, independently from their values. The fingerprints are then compared to one another:
from collections import defaultdict
A=[1,2,1,1,3,4,3,2,5]
B=[4,2,4,4,3,1,3,2,5]
C = [1, 2, 2, 2, 3, 4, 4, 4, 5, 6]
D = [3, 4, 4, 4, 1, 5, 5, 6, 5, 6]
def fingerprint(lst):
    r = defaultdict(list)
    for i, x in enumerate(lst):
        r[x].append(i)
    return sorted(r.values())
fA = fingerprint(A)
# [[0, 2, 3], [1, 7], [4, 6], [5], [8]]
fB = fingerprint(B)
# [[0, 2, 3], [1, 7], [4, 6], [5], [8]]
fC = fingerprint(C)
# [[0], [1, 2, 3], [4], [5, 6, 7], [8], [9]]
fD = fingerprint(D)
# [[0], [1, 2, 3], [4], [5, 6, 8], [7, 9]]
print((100.0*sum(1 for a,b in zip(fA, fB) if a == b)/len(fB)))
# 100.0
print((100.0*sum(1 for c,d in zip(fC, fD) if c == d)/len(fD)))
# 60.0
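One thing to be aware of: zip() stops at the shorter fingerprint, so the extra group in fC is silently ignored. If you would rather count missing groups as mismatches, a small variation (my assumption about what you want) would be:
from itertools import zip_longest
print(100.0 * sum(1 for c, d in zip_longest(fC, fD) if c == d) / max(len(fC), len(fD)))
# 50.0, since only 3 of the 6 groups line up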
It's a bit late to answer, but I did the same thing in another way: you can calculate the percentage of values that are the same as in the other list at the same index. I'll just give you my code; you may refer to it.
import pandas as pd

def accuracy(self, *args):
    # final_result (a DataFrame) and csvpath come from my own code; replace them with yours
    # rows where the two columns hold the same value at the same index
    check = final_result['train_num'] == final_result['test_num']
    passed = final_result[check]
    accuracy = len(passed.index) / len(final_result.index)
    # how often each matching value occurs
    analysis = passed['test_num'].value_counts()
    analysis = analysis / 50
    analysis['accuracy'] = round(accuracy, 5)
    pd.Series.to_csv(analysis, csvpath + "accuracy.csv", sep=',')
    print("accuracy:{:.3f}".format(accuracy))
train_num and test_num are 2 columns of the dataframe final_result; you can replace them with your data.
Be careful with analysis = analysis / 50: 50 is the total count for each element in my data, so you should change it for yours.
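For illustration, here is a minimal self-contained sketch of the same idea applied to the question's A and B. Note that it measures exact value equality at each index, which is a different metric from the pattern accuracy asked about, so it gives about 0.56 here rather than 100%:
import pandas as pd

A = [1, 2, 1, 1, 3, 4, 3, 2, 5]
B = [4, 2, 4, 4, 3, 1, 3, 2, 5]

final_result = pd.DataFrame({"train_num": A, "test_num": B})
check = final_result["train_num"] == final_result["test_num"]
accuracy = check.sum() / len(final_result)
print("accuracy:{:.3f}".format(accuracy))  # 0.556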
I have a variable stacks:
stacks = [[1, 2, 3], [[4, 5, 6], [1, 2, 3]]]
From this I want to create another list of heights, where each height is the sum of the element at index 1 of each stack in stacks. In the example above, heights would be:
heights = [2, 7]
Where 2 is stacks[0][1] and 7 is stacks[1][0][1] + stacks[1][1][1]. Sorry if it was unclear before. How can I do this concisely using list comprehensions, map and / or reduce?
Assuming that stacks is exactly as you've described:
>>> stacks = [[1, 2, 3], [[4, 5, 6], [1, 2, 3]]]
>>> wrapped = (s if isinstance(s[0], list) else [s] for s in stacks)
>>> total = [sum(x[1] for x in w) for w in wrapped]
>>> total
[2, 7]
It would be more natural, IMHO, if the elements of stacks were always lists of lists:
>>> stacks = [[[1, 2, 3]], [[4, 5, 6], [1, 2, 3]]]
>>> total = [sum(x[1] for x in w) for w in stacks]
>>> total
[2, 7]
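For reference, the two steps can also be inlined into a single comprehension (same wrapping trick as above):
>>> stacks = [[1, 2, 3], [[4, 5, 6], [1, 2, 3]]]
>>> [sum(x[1] for x in (s if isinstance(s[0], list) else [s])) for s in stacks]
[2, 7]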
I want to merge two arrays in Python based on the first element of each row of each array.
For example,
A = ([[1, 2, 3],
[4, 5, 6],
[4, 6, 7],
[5, 7, 8],
[5, 9, 1]])
B = ([[1, .002],
[4, .005],
[5, .006]])
So that I get an array
C = ([[1, 2, 3, .002],
[4, 5, 6, .005],
[4, 6, 7, .005],
[5, 7, 8, .006],
[5, 9, 1, .006]])
For more clarity:
First column in A is 1, 4, 4, 5, 5 and
First column of B is 1, 4, 5
So that 1 in A matches up with 1 in B and gets .002
How would I do this in python? Any suggestions would be great.
Is it OK to modify A in place?
d = dict((x[0],x[1:]) for x in B)
Now d is a dictionary where the first column are keys and the subsequent columns are values.
for lst in A:
    if lst[0] in d:  # Is the first value something that we can extend?
        lst.extend(d[lst[0]])

print(A)
To do it out of place (inspired by the answer by Ashwini):
d = dict((x[0],x[1:]) for x in B)
C = [lst + d.get(lst[0],[]) for lst in A]
However, with this approach you need to have lists in both A and B. If you have some lists and some tuples it'll fail (it could be worked around if you needed to, but that would complicate the code slightly).
With either of these approaches, B can have an arbitrary number of columns (see the sketch below).
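A quick sketch of that, using a made-up third column tacked onto B (the 'a'/'b'/'c' labels are just placeholders):
B3 = [[1, .002, 'a'], [4, .005, 'b'], [5, .006, 'c']]
d = dict((x[0], x[1:]) for x in B3)
C = [lst + d.get(lst[0], []) for lst in A]
# [1, 2, 3] becomes [1, 2, 3, 0.002, 'a'], and so on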
As a side note on style: I would write the lists as:
A = [[1, 2, 3],
[4, 5, 6],
[4, 6, 7],
[5, 7, 8],
[5, 9, 1]]
Where I've dropped the parentheses ... they make it look too much like you're putting a list in a tuple. Python's automatic line continuation happens with parentheses (), square brackets [], or braces {}.
(This answer assumes these are just regular lists. If they’re NumPy arrays, you have more options.)
It looks like you want to use B as a lookup table to find values to add to each row of A.
I would start by making a dictionary out of the data in B. As it happens, B is already in just the right form to be passed to the dict() builtin:
B_dict = dict(B)
Then you just need to build C row by row.
For each row in A, row[0] is the first element, so B_dict[row[0]] is the value you want to add to the end of the row. Therefore row + [B_dict[row[0]]] is the row you want to add to C.
Here is a list comprehension that builds C from A and B_dict:
C = [row + [B_dict[row[0]]] for row in A]
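Put together with the data from the question, a complete sketch looks like this:
A = [[1, 2, 3], [4, 5, 6], [4, 6, 7], [5, 7, 8], [5, 9, 1]]
B = [[1, .002], [4, .005], [5, .006]]

B_dict = dict(B)                           # {1: 0.002, 4: 0.005, 5: 0.006}
C = [row + [B_dict[row[0]]] for row in A]
print(C)
# [[1, 2, 3, 0.002], [4, 5, 6, 0.005], [4, 6, 7, 0.005], [5, 7, 8, 0.006], [5, 9, 1, 0.006]]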
You can convert B to a dictionary first, with the first element of each sublist as key and second one as value.
Then simply iterate over A and append the related value fetched from the dict.
In [114]: A = ([1, 2, 3],
[4, 5, 6],
[4, 6, 7],
[5, 7, 8],
[6, 9, 1])
In [115]: B = ([1, .002],
[4, .005],
[5, .006])
In [116]: dic = dict(B)
In [117]: [x + ([dic[x[0]]] if x[0] in dic else []) for x in A]
Out[117]:
[[1, 2, 3, 0.002],
[4, 5, 6, 0.005],
[4, 6, 7, 0.005],
[5, 7, 8, 0.006],
[6, 9, 1]]
Here is a solution using itertools.product() that avoids having to create a dictionary for B:
In [1]: from itertools import product
In [2]: [lst_a + lst_b[1:] for (lst_a, lst_b) in product(A, B) if lst_a[0] == lst_b[0]]
Out[2]:
[[1, 2, 3, 0.002],
[4, 5, 6, 0.005],
[4, 6, 7, 0.005],
[5, 7, 8, 0.006],
[5, 9, 1, 0.006]]
The naive, simple way:
for alist in A:
    for blist in B:
        if blist[0] == alist[0]:
            alist.extend(blist[1:])
            # alist.append(blist[1]) if B will only ever contain 2-tuples.
            break  # Remove this if you want to append more than one.
The downside here is that it's O(N^2) complexity. For most small data sets, that should be OK. If you're looking for something more comprehensive, you'll probably want to look at @mgilson's answer. Some comparison:
His response converts everything in B to a dict and performs list slicing on each element. If you have a lot of values in B, that could be expensive. This uses the existing lists (you're only looking at the first value, anyway).
Because he's using dicts, he gets O(1) lookup times (his answer also assumes that you're never going to append multiple values to the end of the values in A). That means overall, his algorithm will achieve O(N). You'll need to weigh whether the overhead of creating a dict is going to outweigh the cost of iterating over the values in B.