Making a list of lists in one line (Python)

I have a list
l=[(1,2),(1,6),(3,4),(3,6),(1,4),(4,3)]
I want to return a list of lists, grouped by the first number in each tuple.
Something like this:
[[2,4,6],[4,6],[3]]
Writing a whole function that iterates over the list is easy.
I want to find a one-liner, a Pythonic way of doing it.
Any ideas?

>>> from itertools import groupby
>>> from operator import itemgetter
>>> L = [(1,2), (1,6), (3,4), (3,6), (1,4), (4,3)]
>>> [[y for x, y in v] for k, v in groupby(sorted(L), itemgetter(0))]
[[2, 4, 6], [4, 6], [3]]
Explanation
This works by using itertools.groupby. groupby finds consecutive groups in an iterable and returns an iterator over (key, group) pairs.
The second argument given to groupby is a key function, itemgetter(0), which is called for each tuple and returns its first item as the grouping key.
groupby groups elements in their original order, so if you want to group by the first number in each tuple, the list must first be sorted. That way equal first numbers sit next to each other and groupby can actually group them.
>>> sorted(L)
[(1, 2), (1, 4), (1, 6), (3, 4), (3, 6), (4, 3)]
This is the sorted list, where you can clearly see the groups that will appear in the final output. Now you can use groupby to show the (key, group) pairs.
>>> list(groupby(sorted(L), itemgetter(0)))
[(1, <itertools._grouper object at 0x02BB7ED0>), (3, <itertools._grouper object at 0x02BB7CF0>), (4, <itertools._grouper object at 0x02BB7E30>)]
Here are the sorted items grouped by the first number. groupby returns the group for each key as an iterator, which is efficient, but for this example we will convert each group to a list to check that it works properly.
>>> [(k, list(v)) for k,v in groupby(sorted(L), itemgetter(0))]
[(1, [(1, 2), (1, 4), (1, 6)]), (3, [(3, 4), (3, 6)]), (4, [(4, 3)])]
That is almost right, but the required output shows only the second number of each tuple in every group. The following achieves the desired result:
[[y for x, y in v] for k, v in groupby(sorted(L), itemgetter(0))]
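To see why the sorted call matters, here is a minimal sketch that runs the same comprehension on the unsorted and the sorted list. The names unsorted_groups and sorted_groups are only illustrative.

```python
from itertools import groupby
from operator import itemgetter

L = [(1, 2), (1, 6), (3, 4), (3, 6), (1, 4), (4, 3)]

# Without sorting, groupby starts a new group every time the key changes,
# so the key 1 produces two separate groups.
unsorted_groups = [[y for x, y in v] for k, v in groupby(L, itemgetter(0))]
print(unsorted_groups)  # [[2, 6], [4, 6], [4], [3]]

# Sorting first puts equal keys next to each other.
sorted_groups = [[y for x, y in v] for k, v in groupby(sorted(L), itemgetter(0))]
print(sorted_groups)  # [[2, 4, 6], [4, 6], [3]]
```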

l = [(1, 2), (1, 6), (3, 4), (3, 6), (1, 4), (4, 3)]
d = {}
for (k, v) in l:
    # setdefault creates the list for k on first use; then we append to it
    d.setdefault(k, []).append(v)
print(list(d.values()))
I know it's not a one liner, but perhaps it's easier to read than a one liner.
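A possible variant of the same idea, assuming you also want the keys and the values in each group sorted as in the question's expected output. It swaps setdefault for collections.defaultdict; the names groups and result are just for illustration.

```python
from collections import defaultdict

l = [(1, 2), (1, 6), (3, 4), (3, 6), (1, 4), (4, 3)]

groups = defaultdict(list)  # a missing key starts out as an empty list
for k, v in l:
    groups[k].append(v)

# Sort the keys and each group's values to match the output in the question.
result = [sorted(groups[k]) for k in sorted(groups)]
print(result)  # [[2, 4, 6], [4, 6], [3]]
```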

Related

How do I make "rows" consisting of pairs from a list of objects that is sorted based on their attributes?

I have created a class with attributes and sorted its instances by their level of x, from 1 to 6. I then want to arrange the list into pairs, where the object with the highest level of "x" and the object with the lowest level of "x" are paired together, then the second highest with the second lowest, and so on. If it were up to me, it would look like this, even though objects are not iterable:
for objects in sortedlist:
    i = 0
    row(i) = [[sortedlist[i], list[-(i)-1]]
    i += 1
    if i => len(sortedlist)
        break
Using zip
I think the code you want is:
rows = list(zip(sortedList, reversed(sortedList)))
However, note that this would "duplicate" the elements:
>>> sortedList = [1, 2, 3, 4, 5]
>>> list(zip(sortedList, reversed(sortedList)))
[(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)]
If you know that the list has an even number of elements and want to avoid duplicates, you can instead write:
rows = list(zip(sortedList[:len(sortedList)//2], reversed(sortedList[len(sortedList)//2:])))
With the following result:
>>> sortedList = [1,2,3,4,5,6]
>>> list(zip(sortedList[:len(sortedList)//2], reversed(sortedList[len(sortedList)//2:])))
[(1, 6), (2, 5), (3, 4)]
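If the length might be odd, one option is a small helper like the sketch below. The name pair_ends and the choice to return the leftover middle element as a 1-tuple are my own assumptions, not something the question specifies.

```python
def pair_ends(items):
    """Pair the i-th lowest with the i-th highest item (hypothetical helper)."""
    half = len(items) // 2
    # zip stops at the shorter input, so only `half` pairs are built.
    pairs = list(zip(items[:half], reversed(items)))
    if len(items) % 2:
        # Odd length: the middle item has no partner, so keep it on its own.
        pairs.append((items[half],))
    return pairs

print(pair_ends([1, 2, 3, 4, 5, 6]))  # [(1, 6), (2, 5), (3, 4)]
print(pair_ends([1, 2, 3, 4, 5]))     # [(1, 5), (2, 4), (3,)]
```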
Using loops
Although I recommend using zip rather than a for-loop, here is how to fix the loop you wrote:
rows = []
for i in range(len(sortedList)):
    rows.append((sortedList[i], sortedList[-i-1]))
With result:
>>> sortedList=[1,2,3,4,5]
>>> rows = []
>>> for i in range(len(sortedList)):
...     rows.append((sortedList[i], sortedList[-i-1]))
...
>>> rows
[(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)]

Sorting in a dictionary

I'm trying to get the output from my dictionary ordered by its values instead of its keys.
Question:
Write a function ValueCount that accepts a list as a parameter. Your function will return a list of tuples. Each tuple will contain a value and the number of times that value appears in the list.
Desired outcome
>>> data = [1,2,3,1,2,3,5,5,4]
>>> ValueCount(data)
[(1, 2), (2, 2), (5, 1), (4, 1)]
My code and outcome
def CountValues(data):
    dict1 = {}
    for number in data:
        if number not in dict1:
            dict1[number] = 1
        else:
            dict1[number] += 1
    tuple_data = dict1.items()
    lst = sorted(tuple_data)
    return lst
>>>[(1, 2), (2, 2), (3, 2), (4, 1), (5, 2)]
How would I sort it in ascending order by the values instead of the keys?
If you want to sort by the values (the second item in each tuple), specify key:
sorted(tuple_data, key=lambda x: x[1])
Or with operator.itemgetter:
sorted(tuple_data, key=operator.itemgetter(1))
Also as a side note, your counting code:
dict1 = {}
for number in data:
    if number not in dict1:
        dict1[number] = 1
    else:
        dict1[number] += 1
Can be simplified with collections.Counter:
dict1 = collections.Counter(data)
With all the above in mind, your code could look like this:
from operator import itemgetter
from collections import Counter
def CountValues(data):
    counts = Counter(data)
    return sorted(counts.items(), key=itemgetter(1))
print(CountValues([1,2,3,1,2,3,5,5,4]))
# [(4, 1), (1, 2), (2, 2), (3, 2), (5, 2)]
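If the highest counts should come first instead, Counter already has a method for that. A minimal sketch, assuming Python 3.7 or later, where items with equal counts keep the order in which they were first seen:

```python
from collections import Counter

data = [1, 2, 3, 1, 2, 3, 5, 5, 4]
counts = Counter(data)

# most_common() returns (value, count) pairs ordered by count, highest first.
print(counts.most_common())
# [(1, 2), (2, 2), (3, 2), (5, 2), (4, 1)]
```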
You can use sorted with the key parameter. It does not sort in place, so it never modifies the original data.
In [18]: data = [1,2,3,1,2,3,5,5,4]
In [19]: from collections import Counter
In [20]: x=Counter(data).items()
#Sorted OUTPUT
In [21]: sorted(list(x), key= lambda i:i[1] )
Out[21]: [(4, 1), (1, 2), (2, 2), (3, 2), (5, 2)]
In [22]: x
Out[22]: dict_items([(1, 2), (2, 2), (3, 2), (5, 2), (4, 1)])
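For contrast, here is a short sketch of the difference between sorted and list.sort on a plain list. The sample pairs and variable names are made up for illustration.

```python
pairs = [(1, 2), (4, 1), (5, 2)]

by_count = sorted(pairs, key=lambda p: p[1])  # returns a new list
still_original = list(pairs)                  # copy taken after sorted() ran
print(by_count)        # [(4, 1), (1, 2), (5, 2)]
print(still_original)  # [(1, 2), (4, 1), (5, 2)]

pairs.sort(key=lambda p: p[1])  # sorts the list itself and returns None
print(pairs)           # [(4, 1), (1, 2), (5, 2)]
```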
"Sort" function uses first element of data.
To sort a dictionary by its values, you can loop over the values:
d={1:1,2:2,5:2,4:3,3:2}
x=[]
for i in sorted(set(d.values())):
    for j in sorted(d.items()):
        if j[1]==i:
            x.append(j)
print(x)
If you don't convert d.values() to a set, the loop checks every value, even repeated ones. For example, if your values are [1,2,2,3], it checks the items for value 2 twice, and the resulting list contains every item with value 2 twice. A set keeps only one of each element, so the for-loop checks each distinct value of d.values() once. The set is then passed to sorted(), because a set itself has no guaranteed order. If several items have the same value, they are ordered by key because of sorted(d.items()).
(To understand this better, try the code without set and with d.items() instead of sorted(d.items()).)
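The same ordering can probably be had without the nested loops by sorting on a (value, key) tuple. A minimal sketch using the same sample dictionary:

```python
d = {1: 1, 2: 2, 5: 2, 4: 3, 3: 2}

# Sort by value first, then by key for items that share a value.
x = sorted(d.items(), key=lambda item: (item[1], item[0]))
print(x)  # [(1, 1), (2, 2), (3, 2), (5, 2), (4, 3)]
```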

How do I sort this list of tuples by both values?

I have a list of tuples: [(2, Operation.SUBSTITUTED), (1, Operation.DELETED), (2, Operation.INSERTED)]
I would like to sort this list in 2 ways:
First, by its 1st value in ascending order, i.e. 1, 2, 3, etc.
Second, by its 2nd value in reverse alphabetical order, i.e. Operation.SUBSTITUTED, Operation.INSERTED, Operation.DELETED
So the above list should be sorted as:
[(1, Operation.DELETED), (2, Operation.SUBSTITUTED), (2, Operation.INSERTED)]
How do I go about sorting this list?
Since sorting is guaranteed to be stable, you can do this in 2 steps:
lst = [(2, 'Operation.SUBSTITUTED'), (1, 'Operation.DELETED'), (2, 'Operation.INSERTED')]
res_int = sorted(lst, key=lambda x: x[1], reverse=True)
res = sorted(res_int, key=lambda x: x[0])
print(res)
# [(1, 'Operation.DELETED'), (2, 'Operation.SUBSTITUTED'), (2, 'Operation.INSERTED')]
In this particular case, because the order of comparison can easily be inverted for integers, you can sort in a single pass by negating the integer key and using reverse:
lst = [(2, 'Operation.SUBSTITUTED'), (1, 'Operation.DELETED'), (2, 'Operation.INSERTED')]
res = sorted(lst, key=lambda x: (-x[0],x[1]), reverse=True)
Result:
[(1, 'Operation.DELETED'), (2, 'Operation.SUBSTITUTED'), (2, 'Operation.INSERTED')]
Negating the integer key cancels the effect of reverse on it, so the reversal only applies to the second (string) criterion.
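The examples here use strings in place of the real Operation values. If Operation is a plain Enum, its members cannot be compared with each other directly, so one option is to sort on the member names. The Operation class below is a hypothetical stand-in, since the question does not show its definition.

```python
from enum import Enum

class Operation(Enum):  # hypothetical stand-in for the asker's type
    DELETED = 1
    INSERTED = 2
    SUBSTITUTED = 3

lst = [(2, Operation.SUBSTITUTED), (1, Operation.DELETED), (2, Operation.INSERTED)]

# Plain Enum members are not orderable, so sort on their .name strings.
by_name_desc = sorted(lst, key=lambda x: x[1].name, reverse=True)
# The sort is stable, so ties on the number keep the reverse-name order.
res = sorted(by_name_desc, key=lambda x: x[0])
print([(n, op.name) for n, op in res])
# [(1, 'DELETED'), (2, 'SUBSTITUTED'), (2, 'INSERTED')]
```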
You can use this:
from operator import itemgetter
d = [(1, 'DELETED'), (2, 'INSERTED'), (2, 'SUBSTITUTED')]
d.sort(key=itemgetter(1),reverse=True)
d.sort(key=itemgetter(0))
print(d)
Another way using itemgetter from operator module:
from operator import itemgetter
lst = [(2, 'Operation.SUBSTITUTED'), (1, 'Operation.DELETED'), (2, 'Operation.INSERTED')]
inter = sorted(lst, key=itemgetter(1), reverse=True)
sorted_lst = sorted(inter, key=itemgetter(0))
print(sorted_lst)
# [(1, 'Operation.DELETED'), (2, 'Operation.SUBSTITUTED'), (2, 'Operation.INSERTED')]

How to sort a tuple based on a value within the list of tuples

In Python, I wish to sort tuples based on the value of their last element. For example, I have a list of tuples like the one below.
tuples = [(2,3),(5,7),(4,3,1),(6,3,5),(6,2),(8,9)]
which after sort I wish to be in this format.
tuples = [(4,3,1),(6,2),(2,3),(6,3,5),(5,7),(8,9)]
How do I go about doing that?
Provide list.sort with an appropriate key function that returns the last element of a tuple:
tuples.sort(key=lambda x: x[-1])
You can use:
from operator import itemgetter
tuples = sorted(tuples, key=itemgetter(-1))
The point is that key is a function that maps each element to an orderable value to sort on. With itemgetter(-1) we construct a function that, for a value x, returns x[-1], i.e. the last element.
This produces:
>>> sorted(tuples, key=itemgetter(-1))
[(4, 3, 1), (6, 2), (2, 3), (6, 3, 5), (5, 7), (8, 9)]
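To make the role of the key function more concrete, this small sketch prints the values that itemgetter(-1) extracts and then reuses the same key for a descending sort. The name last is only illustrative.

```python
from operator import itemgetter

tuples = [(2, 3), (5, 7), (4, 3, 1), (6, 3, 5), (6, 2), (8, 9)]

last = itemgetter(-1)
# These are the values the sort actually compares.
print([last(t) for t in tuples])  # [3, 7, 1, 5, 2, 9]

# The same key works for a descending sort via reverse=True.
print(sorted(tuples, key=last, reverse=True))
# [(8, 9), (5, 7), (6, 3, 5), (2, 3), (6, 2), (4, 3, 1)]
```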

Sort list by nested tuple values

Is there a better way to sort a list by nested tuple values than writing an itemgetter alternative that extracts the nested tuple value:
def deep_get(*idx):
    def g(t):
        for i in idx: t = t[i]
        return t
    return g
>>> l = [((2,1), 1),((1,3), 1),((3,6), 1),((4,5), 2)]
>>> sorted(l, key=deep_get(0,0))
[((1, 3), 1), ((2, 1), 1), ((3, 6), 1), ((4, 5), 2)]
>>> sorted(l, key=deep_get(0,1))
[((2, 1), 1), ((1, 3), 1), ((4, 5), 2), ((3, 6), 1)]
I thought about using compose, but that's not in the standard library:
sorted(l, key=compose(itemgetter(1), itemgetter(0)))
Is there something I missed in the libs that would make this code nicer?
The implementation should work reasonably with 100k items.
Context: I would like to sort the items of a dictionary that represents a histogram. The keys are tuples (a,b) and the value is the count. In the end the items should be sorted by count descending, then by a and b. An alternative is to flatten the tuple and use itemgetter directly, but this way a lot of tuples will be generated.
Yes, you could just use key=lambda x: x[0][1]
Your approach is quite good, given the data structure that you have.
Another approach would be to use another structure.
If you want speed, the de facto standard NumPy is the way to go. Its job is to efficiently handle large arrays. It even has some nice sorting routines for arrays like yours. Here is how you would write your sort over the counts, and then over (a, b):
>>> import numpy
>>> arr = numpy.array([((2,1), 1),((1,3), 1),((3,6), 1),((4,5), 2)],
...                   dtype=[('pos', [('a', int), ('b', int)]), ('count', int)])
>>> print(numpy.sort(arr, order=['count', 'pos']))
[((1, 3), 1) ((2, 1), 1) ((3, 6), 1) ((4, 5), 2)]
This is very fast (it's implemented in C).
If you want to stick with standard Python, a list containing (count, a, b) tuples would automatically get sorted in the way you want by Python (which uses lexicographic order on tuples).
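A minimal sketch of that flattening idea, assuming a small sample histogram. Since the question asks for counts in descending order, the count is negated here so that a plain ascending sort does the job; the names hist, flat and result are illustrative.

```python
hist = {(2, 1): 1, (1, 3): 1, (3, 6): 1, (4, 5): 2}  # sample histogram

# Negate the count so an ordinary ascending tuple sort puts large counts first.
flat = sorted((-count, a, b) for (a, b), count in hist.items())

# Rebuild the ((a, b), count) shape from the flat tuples.
result = [((a, b), -neg) for neg, a, b in flat]
print(result)  # [((4, 5), 2), ((1, 3), 1), ((2, 1), 1), ((3, 6), 1)]
```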
I compared two similar solutions. The first one uses a simple lambda:
def sort_one(d):
    result = list(d.items())  # items() is a view in Python 3, so copy it into a list
    result.sort(key=lambda x: (-x[1], x[0]))
    return result
Note the minus on x[1], because you want the sort to be descending on count.
The second one takes advantage of the fact that sort in Python is stable. First, we sort by (a, b) (ascending). Then we sort by count, descending:
def sort_two(d):
    result = list(d.items())  # items() is a view in Python 3, so copy it into a list
    result.sort()
    result.sort(key=itemgetter(1), reverse=True)
    return result
The first one is 10-20% faster (both on small and large datasets), and both complete in under 0.5 seconds on my Q6600 (one core used) for 100k items. So avoiding the creation of tuples doesn't seem to help much.
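If you want to repeat the comparison yourself, something like the following sketch should work. The random histogram, the seed and the repeat count are arbitrary assumptions, and the timings will differ from machine to machine, so none are shown.

```python
import random
import timeit
from operator import itemgetter

def sort_one(d):
    return sorted(d.items(), key=lambda x: (-x[1], x[0]))

def sort_two(d):
    result = sorted(d.items())                    # (a, b) ascending
    result.sort(key=itemgetter(1), reverse=True)  # stable sort on count, descending
    return result

random.seed(0)  # arbitrary seed, only to make the run repeatable
hist = {(random.randrange(1000), random.randrange(1000)): random.randrange(50)
        for _ in range(100_000)}

# Both approaches should give exactly the same ordering.
same = sort_one(hist) == sort_two(hist)
print(same)  # True

for fn in (sort_one, sort_two):
    print(fn.__name__, timeit.timeit(lambda: fn(hist), number=3))
```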
This might be a slightly faster version of your approach:
from functools import reduce  # reduce is not a builtin in Python 3
l = [((2,1), 1), ((1,3), 1), ((3,6), 1), ((4,5), 2)]
def deep_get(*idx):
    def g(t):
        return reduce(lambda t, i: t[i], idx, t)
    return g
>>> sorted(l, key=deep_get(0,1))
[((2, 1), 1), ((1, 3), 1), ((4, 5), 2), ((3, 6), 1)]
Which could be shortened to:
def deep_get(*idx):
    return lambda t: reduce(lambda t, i: t[i], idx, t)
or even simply written out inline:
sorted(l, key=lambda t: reduce(lambda t, i: t[i], (0,1), t))
