Arranging elements of a list of tuples - python

I have a list 'l' of tuples.
l = [('apple',4), ('carrot',2), ('apple',1), ('carrot',7)]
I want to group the values by the first element of each tuple, with each group's values in ascending order.
The expected result is:
result = [('apple', (1,4)), ('carrot', (2,7))]
I tried as:
for x in l:
    variables = list(set(x[0]))
I suppose there is a better way of doing it. Any ideas, please?

You could use a defaultdict to collect those values, and then get the items from the dictionary to get the desired result:
>>> l = [('apple',4), ('carrot',2), ('apple',1), ('carrot',7)]
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> for k, v in l:
...     d[k].append(v)
...
>>> dict(d)
{'carrot': [2, 7], 'apple': [4, 1]}
>>> list(d.items())
[('carrot', [2, 7]), ('apple', [4, 1])]
In order to sort those sublists then, you could use a list comprehension:
>>> [(k, tuple(sorted(v))) for k, v in d.items()]
[('carrot', (2, 7)), ('apple', (1, 4))]
And if you want to sort that also by the “key”, just sort that resulting list using list.sort().
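Sorting by key in place might look like the following. This is a minimal sketch that rebuilds the dictionary so it runs on its own; the name `result` is only illustrative.

```python
from collections import defaultdict

l = [('apple', 4), ('carrot', 2), ('apple', 1), ('carrot', 7)]

# Collect the values for each key.
d = defaultdict(list)
for k, v in l:
    d[k].append(v)

# Build the (key, sorted values) pairs, then sort the list in place.
result = [(k, tuple(sorted(v))) for k, v in d.items()]
result.sort()  # tuples compare element-wise, so this orders by key first
print(result)
```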

Here is my solution:
from collections import defaultdict
l = [('apple',4), ('carrot',2), ('apple',1), ('carrot',7)]
d = defaultdict(list)
for i, j in l:
    d[i].append(j)
result = sorted([tuple([x, tuple(sorted(y))]) for x, y in d.items()])
print(result)
And here is the result:
[('apple', (1, 4)), ('carrot', (2, 7))]

Here's a one liner for you:
>>> a = [('apple',4), ('carrot',2), ('apple',1), ('carrot',7)]
>>> sorted([(n, tuple(sorted([e[1] for e in a if e[0] == n]))) for n in set(e for e,f in a)])
[('apple', (1, 4)), ('carrot', (2, 7))]
This sorts both the first element (apple, carrot, ...), and each second element ( (1,4) (2,7) ).
Note that @poke's solution does not sort the outer list by key.

Related

How do I keep dictionary both key and value while sorting? (python)

item = {"num1":[1,3] , "num2": [2,4]}
wanted output
num1:1
num2:2
num1:3
num2:4
(Basically, order by value but somehow keep the key each value came with.
It doesn't have to be a dictionary, as long as the output is ordered by value.)
I'm completely stumped at the moment; any help would be much appreciated.
Make pairs (2-tuples) of each key and each of its elements.
Sort them by the second item (sorted(target, key=lambda x: x[1])).
item = {'num1': [1, 3], 'num2': [2, 4]}
result = sorted(
    [
        (k, v)
        for k, list_ in item.items()
        for v in list_
    ],
    key=lambda x: x[1],
)
print(result)
for r in result:
    print(f'{r[0]}:{r[1]}')
output:
[('num1', 1), ('num2', 2), ('num1', 3), ('num2', 4)]
num1:1
num2:2
num1:3
num2:4
@OldBill's solution:
By reordering the elements of each tuple, you can use the tuple itself as the sort key.
Tuple unpacking in the for loop makes it more readable.
item = {'num1': [1, 3], 'num2': [2, 4]}
result = sorted([
    (v, k)
    for k, list_ in item.items()
    for v in list_
])
print(result)
for v, k in result:
    print(f'{k}:{v}')
output:
[(1, 'num1'), (2, 'num2'), (3, 'num1'), (4, 'num2')]
num1:1
num2:2
num1:3
num2:4

Sort tuple list with another list

I have a tuple list to_order such as:
to_order = [(0, 1), (1, 3), (2, 2), (3,2)]
And a list which gives the order to apply to the second element of each tuple of to_order:
order = [2, 1, 3]
So I am looking for a way to get this output:
ordered_list = [(2, 2), (3,2), (0, 1), (1, 3)]
Any ideas?
You can provide a key that will check the index (of the second element) in order and sort based on it:
to_order = [(0, 1), (1, 3), (2, 2), (3,2)]
order = [2, 1, 3]
print(sorted(to_order, key=lambda item: order.index(item[1]))) # [(2, 2), (3, 2), (0, 1), (1, 3)]
EDIT
Since a discussion on time complexity was started, here you go: the following algorithm runs in O(n+m), using Eric's input example:
from random import randrange, shuffle
from timeit import timeit
N = 5
to_order = [(randrange(N), randrange(N)) for _ in range(10*N)]
order = list(set(pair[1] for pair in to_order))
shuffle(order)
def eric_sort(to_order, order):
    bins = {}
    for pair in to_order:
        bins.setdefault(pair[1], []).append(pair)
    return [pair for i in order for pair in bins[i]]
def alfasin_new_sort(to_order, order):
    # One bucket per position in order; d maps each value to its bucket index
    arr = [[] for i in range(len(order))]
    d = {k:v for v, k in enumerate(order)}
    for item in to_order:
        arr[d[item[1]]].append(item)
    return [item for sublist in arr for item in sublist]
print("eric_sort", timeit("eric_sort(to_order, order)", globals=globals(), number=1000))
print("alfasin_new_sort", timeit("alfasin_new_sort(to_order, order)", globals=globals(), number=1000))
OUTPUT:
eric_sort 59.282021682999584
alfasin_new_sort 44.28244407700004
Algorithm
You can distribute the tuples in a dict of lists according to the second element and iterate over order indices to get the sorted list:
from collections import defaultdict
to_order = [(0, 1), (1, 3), (2, 2), (3, 2)]
order = [2, 1, 3]
bins = defaultdict(list)
for pair in to_order:
    bins[pair[1]].append(pair)
print(bins)
# defaultdict(<class 'list'>, {1: [(0, 1)], 3: [(1, 3)], 2: [(2, 2), (3, 2)]})
print([pair for i in order for pair in bins[i]])
# [(2, 2), (3, 2), (0, 1), (1, 3)]
Neither sort nor index is needed, and the output is stable.
This algorithm is similar to the mapping mentioned in the supposed duplicate. This linked answer only works if to_order and order have the same lengths, which isn't the case in OP's question.
Performance
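This algorithm iterates twice over each element of to_order. The complexity is O(n). @alfasin's first algorithm (sorted with order.index, alfasin2 below) is much slower (O(n * m + n log n), since the key is computed once per element and each order.index call is O(m)), but his second one (alfasin1 below) is also O(n).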
Here's a list with 10000 random pairs between 0 and 1000. We extract the unique second elements and shuffle them in order to define order:
from random import randrange, shuffle
from collections import defaultdict
from timeit import timeit
from itertools import chain
N = 1000
to_order = [(randrange(N), randrange(N)) for _ in range(10*N)]
order = list(set(pair[1] for pair in to_order))
shuffle(order)
def eric(to_order, order):
    bins = defaultdict(list)
    for pair in to_order:
        bins[pair[1]].append(pair)
    return list(chain.from_iterable(bins[i] for i in order))
def alfasin1(to_order, order):
    arr = [[] for i in range(len(order))]
    d = {k:v for v, k in enumerate(order)}
    for item in to_order:
        arr[d[item[1]]].append(item)
    return [item for sublist in arr for item in sublist]
def alfasin2(to_order, order):
    return sorted(to_order, key=lambda item: order.index(item[1]))
print(eric(to_order, order) == alfasin1(to_order, order))
# True
print(eric(to_order, order) == alfasin2(to_order, order))
# True
print("eric", timeit("eric(to_order, order)", globals=globals(), number=100))
# eric 0.3117517130003762
print("alfasin1", timeit("alfasin1(to_order, order)", globals=globals(), number=100))
# alfasin1 0.36100843100030033
print("alfasin2", timeit("alfasin2(to_order, order)", globals=globals(), number=100))
# alfasin2 15.031453827000405
Another solution:
[item for key in order for item in filter(lambda x: x[1] == key, to_order)]
This solution works off of order first, filtering to_order for each key in order.
Equivalent:
ordered = []
for key in order:
    for item in filter(lambda x: x[1] == key, to_order):
        ordered.append(item)
Shorter, but I'm not aware of a way to do this with list comprehension:
ordered = []
for key in order:
    ordered.extend(filter(lambda x: x[1] == key, to_order))
Note: This will not throw a ValueError if to_order contains a tuple x where x[1] is not in order.
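The difference described in the note can be seen with a small sketch. The extra tuple (4, 9) is an assumption added for illustration; its second element does not appear in order.

```python
to_order = [(0, 1), (1, 3), (2, 2), (3, 2), (4, 9)]  # (4, 9): 9 is not in order
order = [2, 1, 3]

# The filter-based approach silently drops (4, 9).
filtered = [item for key in order
            for item in filter(lambda x: x[1] == key, to_order)]
print(filtered)

# The index-based key raises ValueError for the same input.
try:
    sorted(to_order, key=lambda item: order.index(item[1]))
    raised = False
except ValueError:
    raised = True
print(raised)
```

Which behavior is preferable depends on whether unexpected values should be ignored or reported.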
I personally prefer the list object's sort method over the built-in sorted, which generates a new list rather than changing the list in place.
to_order = [(0, 1), (1, 3), (2, 2), (3,2)]
order = [2, 1, 3]
to_order.sort(key=lambda x: order.index(x[1]))
print(to_order)
>[(2, 2), (3, 2), (0, 1), (1, 3)]
A little explanation: the key parameter of the sort method computes a sort value for each element, and the list is ordered by those values. In our case, order.index() finds the first occurrence of the item's second element in order and returns its position.
x = [1,2,3,4,5,3,3,5]
print(x.index(5))
>4
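Because order.index() scans the list on every call, a precomputed lookup table is one possible alternative for long inputs. This is a sketch assuming order contains no duplicates; the name `rank` is illustrative and not part of the answer above.

```python
to_order = [(0, 1), (1, 3), (2, 2), (3, 2)]
order = [2, 1, 3]

# Map each value to its position in order once, so each key lookup is O(1).
rank = {value: position for position, value in enumerate(order)}

to_order.sort(key=lambda x: rank[x[1]])
print(to_order)
```

A value missing from order raises KeyError here rather than ValueError.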

How to flatten a list of tuples and remove the duplicates?

I'm trying to remove the duplicates from the tuples in a list and add the elements to a new list without duplicates.
I tried two nested loops that check for duplicates, and I tried sets, but the problem is that there are three tuples.
Can anyone help me? I'm stuck here.
example
[(2, 5), (3, 5), (2, 5)]
Output
[2, 3, 5]
If order isn't important, you can make a set, add each element of each tuple to the set, and the set is your result.
s = set()
for tup in lst:
    for el in tup:
        s.add(el)
# or use a set comprehension:
# s = {el for tup in lst for el in tup}
If order IS important, you can do mostly the same, but also have a result list to add to.
s = set()
result = []
for tup in lst:
    for el in tup:
        if el not in s:
            s.add(el)
            result.append(el)
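A more compact order-preserving variant is sketched below. It swaps in dict.fromkeys and itertools.chain, which are not used in the answer above, and relies on dictionaries keeping insertion order (Python 3.7+).

```python
from itertools import chain

lst = [(2, 5), (3, 5), (2, 5)]

# chain.from_iterable flattens the tuples; dict.fromkeys keeps only the
# first occurrence of each element, in the order it was first seen.
result = list(dict.fromkeys(chain.from_iterable(lst)))
print(result)
```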
You can use set comprehension:
lst = [(2, 5), (3, 5), (2, 5)]
{e for l in lst for e in l}
You need to iterate through each tuple and then through each element of the tuple. Before appending, just check whether the element is already in the list:
a = [(2, 5), (3, 5), (2, 5)]
b = []
for i in a:
    for j in i:
        if j not in b:
            b.append(j)
print(b)
After running above code I got this output:
[2, 5, 3]
An easy way to do so is using numpy ravel, and then set:
import numpy as np
lst = [(2, 5), (3, 5), (2, 5)]
res = list(set(np.ravel(lst)))
gives:
[2, 3, 5]
Answer to Apero's comment:
If you don't want to use numpy, you can flatten the list with:
lst = [(2,5), (3,5), (2,5)]
tmp = []
for i in lst:
    for j in i:
        tmp.append(j)
res = list(set(tmp))
print(res)
which gives:
[2, 3, 5]

Making list of list oneliner -python

I have a list
l=[(1,2),(1,6),(3,4),(3,6),(1,4),(4,3)]
I want to return a list that contains lists by the first number in each tuple.
Something like this:
[[2,4,6],[4,6],[3]]
Writing a whole function that iterates over the list is easy.
I want to find a one-liner, a Pythonic way of doing it.
Any ideas?
>>> from itertools import groupby
>>> from operator import itemgetter
>>> L = [(1,2), (1,6), (3,4), (3,6), (1,4), (4,3)]
>>> [[y for x, y in v] for k, v in groupby(sorted(L), itemgetter(0))]
[[2, 4, 6], [4, 6], [3]]
Explanation
This works by using itertools.groupby. groupby finds consecutive groups in an iterable, returning an iterator through key, group pairs.
The argument given to groupby is a key function, itemgetter(0) which is called for each tuple, returning the first item as the key to groupby.
groupby only groups consecutive elements, in their original order, so to group by the first number the list must be sorted first; groupby can then go through the first numbers in ascending order and actually group them.
>>> sorted(L)
[(1, 2), (1, 4), (1, 6), (3, 4), (3, 6), (4, 3)]
There is the sorted list where you can clearly see the groups that will be created if you look back to the final output. Now you can use groupby to show the key, group pairs.
>>> list(groupby(sorted(L), itemgetter(0)))
[(1, <itertools._grouper object at 0x02BB7ED0>), (3, <itertools._grouper object at 0x02BB7CF0>), (4, <itertools._grouper object at 0x02BB7E30>)]
Here are the sorted items grouped by the first number. groupby returns the group for each key as an iterator, this is great and very efficient but for this example we will just convert it to a list to make sure it's working properly.
>>> [(k, list(v)) for k,v in groupby(sorted(L), itemgetter(0))]
[(1, [(1, 2), (1, 4), (1, 6)]), (3, [(3, 4), (3, 6)]), (4, [(4, 3)])]
That is almost the right thing but the required output shows only the 2nd number in the groups in each list. So the following achieves the desired result.
[[y for x, y in v] for k, v in groupby(sorted(L), itemgetter(0))]
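To see why the sorting step matters, this short sketch runs the same comprehension with and without sorted(). The names `unsorted_groups` and `sorted_groups` are illustrative.

```python
from itertools import groupby
from operator import itemgetter

L = [(1, 2), (1, 6), (3, 4), (3, 6), (1, 4), (4, 3)]

# Without sorting, the key 1 appears in two separate runs, so it yields two groups.
unsorted_groups = [[y for x, y in v] for k, v in groupby(L, itemgetter(0))]
# With sorting, equal keys are adjacent and each key yields one group.
sorted_groups = [[y for x, y in v] for k, v in groupby(sorted(L), itemgetter(0))]

print(unsorted_groups)
print(sorted_groups)
```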
l = [(1, 2), (1, 6), (3, 4), (3, 6), (1, 4), (4, 3)]
d = {}
for (k, v) in l:
    d.setdefault(k, []).append(v)
print(list(d.values()))
I know it's not a one liner, but perhaps it's easier to read than a one liner.

Using Python's list index() method on a list of tuples or objects?

Python's list type has an index() method that takes one parameter and returns the index of the first item in the list matching the parameter. For instance:
>>> some_list = ["apple", "pear", "banana", "grape"]
>>> some_list.index("pear")
1
>>> some_list.index("grape")
3
Is there a graceful (idiomatic) way to extend this to lists of complex objects, like tuples? Ideally, I'd like to be able to do something like this:
>>> tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11)]
>>> tuple_list.getIndexOfTuple(1, 7)
1
>>> tuple_list.getIndexOfTuple(0, "kumquat")
2
getIndexOfTuple() is just a hypothetical method that accepts a sub-index and a value, and then returns the index of the list item with the given value at that sub-index. I hope
Is there some way to achieve that general result, using list comprehensions or lambdas or something "in-line" like that? I think I could write my own class and method, but I don't want to reinvent the wheel if Python already has a way to do it.
How about this?
>>> tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11)]
>>> [x for x, y in enumerate(tuple_list) if y[1] == 7]
[1]
>>> [x for x, y in enumerate(tuple_list) if y[0] == 'kumquat']
[2]
As pointed out in the comments, this would get all matches. To just get the first one, you can do:
>>> [y[0] for y in tuple_list].index('kumquat')
2
There is a good discussion in the comments about the speed differences between the posted solutions. I may be a little biased, but I would personally stick to a one-liner: the speed difference is pretty insignificant compared with creating functions and importing modules for this problem. If you plan to do this on a very large number of elements, though, you might want to look at the other answers, as they are faster than what I provided.
Those list comprehensions are messy after a while.
I like this Pythonic approach:
from operator import itemgetter
tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11)]
def collect(l, index):
    # list() is needed in Python 3, where map returns an iterator without .index
    return list(map(itemgetter(index), l))
# And now you can write this:
collect(tuple_list,0).index("cherry") # = 1
collect(tuple_list,1).index(3) # = 2
If you need your code to be all super performant:
# Stops iterating through the list as soon as it finds the value
def getIndexOfTuple(l, index, value):
    for pos,t in enumerate(l):
        if t[index] == value:
            return pos
    # Matches behavior of list.index
    raise ValueError("list.index(x): x not in list")
getIndexOfTuple(tuple_list, 0, "cherry") # = 1
One possibility is to use the itemgetter function from the operator module:
import operator
f = operator.itemgetter(0)
print(list(map(f, tuple_list)).index("cherry")) # yields 1
The call to itemgetter returns a function that will do the equivalent of foo[0] for anything passed to it. Using map, you then apply that function to each tuple, extracting the info into a new list (via list() in Python 3, where map returns an iterator), on which you then call index as normal.
list(map(f, tuple_list))
is equivalent to:
[f(tuple_list[0]), f(tuple_list[1]), ...etc]
which in turn is equivalent to:
[tuple_list[0][0], tuple_list[1][0], tuple_list[2][0]]
which gives:
["pineapple", "cherry", ...etc]
You can do this with a list comprehension and index()
tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11)]
[x[0] for x in tuple_list].index("kumquat")
2
[x[1] for x in tuple_list].index(7)
1
Inspired by this question, I found this quite elegant:
>>> tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11)]
>>> next(i for i, t in enumerate(tuple_list) if t[1] == 7)
1
>>> next(i for i, t in enumerate(tuple_list) if t[0] == "kumquat")
2
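One thing to keep in mind is that next() raises StopIteration when nothing matches. A sketch of passing a default as the second argument follows; the choice of None as the default and the names `found` and `missing` are assumptions.

```python
tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11)]

# The generator needs its own parentheses when next() takes a second argument.
found = next((i for i, t in enumerate(tuple_list) if t[1] == 7), None)
missing = next((i for i, t in enumerate(tuple_list) if t[1] == 99), None)

print(found)
print(missing)
```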
I would place this as a comment to Triptych, but I can't comment yet due to lack of rating:
Using the enumerator method to match on sub-indices in a list of tuples.
e.g.
li = [(1,2,3,4), (11,22,33,44), (111,222,333,444), ('a','b','c','d'),
      ('aa','bb','cc','dd'), ('aaa','bbb','ccc','ddd')]
# want pos of item having [22,44] in positions 1 and 3:
def getIndexOfTupleWithIndices(li, indices, vals):
    # indices is a sequence of sub-indices; vals holds the value expected at each one
    for pos,k in enumerate(li):
        match = True
        for i, val in zip(indices, vals):
            if k[i] != val:
                match = False
                break
        if match:
            return pos
    # Matches behavior of list.index
    raise ValueError("list.index(x): x not in list")
idx = [1,3]
vals = [22,44]
print(getIndexOfTupleWithIndices(li,idx,vals)) # = 1
idx = [0,1]
vals = ['a','b']
print(getIndexOfTupleWithIndices(li,idx,vals)) # = 3
idx = [2,1]
vals = ['cc','bb']
print(getIndexOfTupleWithIndices(li,idx,vals)) # = 4
ok, it might be a mistake in vals(j), the correction is:
def getIndex(li,indices,vals):
    for pos,k in enumerate(li):
        match = True
        for i in indices:
            if k[i] != vals[indices.index(i)]:
                match = False
                break
        if match:
            return pos
z = list(zip(*tuple_list))  # z[0] holds the names, z[1] the numbers
z[1][z[0].index('kumquat')]  # the number paired with 'kumquat'
tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11)]
def eachtuple(tupple, pos1, val):
    # True if the element at position pos1 equals val
    return tupple[pos1] == val
for e in tuple_list:
    if eachtuple(e, 1, 7) is True:
        print(tuple_list.index(e))
for e in tuple_list:
    if eachtuple(e, 0, "kumquat") is True:
        print(tuple_list.index(e))
Python's list.index(x) returns the index of the first occurrence of x in the list, so we can pass the objects selected by a list comprehension to get their indices.
>>> tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11)]
>>> [tuple_list.index(t) for t in tuple_list if t[1] == 7]
[1]
>>> [tuple_list.index(t) for t in tuple_list if t[0] == 'kumquat']
[2]
With the same line, we can also get the list of indices when there are multiple matching elements.
>>> tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11), ("banana", 7)]
>>> [tuple_list.index(t) for t in tuple_list if t[1] == 7]
[1, 4]
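A caveat worth checking: since list.index returns the first occurrence, identical tuples all map to the same index. The sketch below uses a made-up list with a repeated ("cherry", 7) entry to compare this with enumerate.

```python
# Illustrative data: ("cherry", 7) appears twice.
tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("cherry", 7)]

# list.index always finds the first ("cherry", 7), so index 1 is reported twice.
by_index = [tuple_list.index(t) for t in tuple_list if t[1] == 7]
# enumerate reports the actual position of every match.
by_enumerate = [i for i, t in enumerate(tuple_list) if t[1] == 7]

print(by_index)
print(by_enumerate)
```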
I guess the following is not the best way to do it (speed and elegance concerns) but well, it could help :
from collections import OrderedDict as od
t = [('pineapple', 5), ('cherry', 7), ('kumquat', 3), ('plum', 11)]
list(od(t).keys()).index('kumquat')
2
list(od(t).values()).index(7)
1
# bonus :
od(t)['kumquat']
3
A list of 2-member tuples can be converted to an ordered dict directly, since the two structures hold the same data, so we can use dict methods on the fly.
This is also possible using Lambda expressions:
l = [('rana', 1, 1), ('pato', 1, 1), ('perro', 1, 1)]
list(map(lambda x:x[0], l)).index("pato") # returns 1
Edit to add examples:
l=[['rana', 1, 1], ['pato', 2, 1], ['perro', 1, 1], ['pato', 2, 2], ['pato', 2, 2]]
extract all items by condition:
list(filter(lambda x:x[0]=="pato", l)) #[['pato', 2, 1], ['pato', 2, 2], ['pato', 2, 2]]
extract all items by condition with index:
>>> list(filter(lambda x:x[1][0]=="pato", enumerate(l)))
[(1, ['pato', 2, 1]), (3, ['pato', 2, 2]), (4, ['pato', 2, 2])]
>>> list(map(lambda x:x[1],_))
[['pato', 2, 1], ['pato', 2, 2], ['pato', 2, 2]]
Note: The _ variable only works in the interactive interpreter. More generally, one must explicitly assign _, i.e. _=list(filter(lambda x:x[1][0]=="pato", enumerate(l))).
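The same two steps can also be written as list comprehensions, which avoids relying on `_`. This is a sketch; the names `indexed` and `items` are illustrative.

```python
l = [['rana', 1, 1], ['pato', 2, 1], ['perro', 1, 1], ['pato', 2, 2], ['pato', 2, 2]]

# Keep (index, item) pairs whose first field is "pato".
indexed = [(i, x) for i, x in enumerate(l) if x[0] == "pato"]
# Drop the indices again to get just the matching items.
items = [x for i, x in indexed]

print(indexed)
print(items)
```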
I came up with a quick and dirty approach using max and lambda.
>>> tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11)]
>>> target = 7
>>> max(range(len(tuple_list)), key=lambda i: tuple_list[i][1] == target)
1
There is a caveat though that if the list does not contain the target, the returned index will be 0, which could be misleading.
>>> target = -1
>>> max(range(len(tuple_list)), key=lambda i: tuple_list[i][1] == target)
0
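One way to guard against that caveat is to verify the candidate index before returning it. The helper below is hypothetical (`index_of_target` is not from the answer above) and returns None when nothing matches or the list is empty.

```python
def index_of_target(tuple_list, target):
    """Hypothetical helper: index of the first tuple whose second element
    equals target, or None if there is no such tuple."""
    if not tuple_list:
        return None  # max() would raise ValueError on an empty range
    candidate = max(range(len(tuple_list)),
                    key=lambda j: tuple_list[j][1] == target)
    # max() returns index 0 when every key is False, so confirm the match.
    return candidate if tuple_list[candidate][1] == target else None


tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11)]
print(index_of_target(tuple_list, 7))
print(index_of_target(tuple_list, -1))
```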
