Comparing items of a tuple within a list of tuples - python

I have written a program that finds the maximum tuples in a list, but I am stuck at the point where I have to compare item[1] of the maximum tuple with item[1] of another tuple.
This is the list of tuples:
fff = []
anslist = [(2, [1]), (3, [7]), (4, [2]), (5, [5]), (6, [8]), (7, [16]), (8, [3]), (9, [19]), (10, [6]), (11, [14]), (12, [9]), (13, [9]), (14, [17]), (15, [17]), (16, [4]), (17, [12]), (18, [20]), (19, [20]), (20, [7])]
This is the code I have:
print(max(anslist, key=lambda x: x[1]))
fff.append(max(anslist, key=lambda x: x[1]))
anslist.remove(max(anslist, key=lambda x: x[1]))
while (max(fff[1], key=lambda x: x[1])) == (max(anslist[1], key=lambda x: x[1])):
    print(max(anslist, key=lambda x: x[1]))
    anslist.remove(max(anslist, key=lambda x: x[1]))
I expect my program to print (18, [20]) and (19, [20]). Because the list of tuples changes in response to the user's input, I need a loop that goes through the tuples until the second item of the tuple is no longer the largest.
This is the error I get:
while (max(fff[1], key=lambda x: x[1])) == (max(anslist[1], key=lambda x: x[1])):
IndexError: list index out of range

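The problem is the subscript [1], which is in the wrong position: fff[1] asks for the second element of fff, but fff holds only one tuple at that point, hence the IndexError.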
Instead of:
while (max(fff[1], key=lambda x: x[1])) == (max(anslist[1], key=lambda x: x[1])):
This is what you want:
while max(fff, key=lambda x: x[1])[1] == max(anslist, key=lambda x: x[1])[1]:
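With that change the loop prints (19, [20]); together with the print before the loop, you get both (18, [20]) and (19, [20]), as you expect.
(I also removed the unneeded parentheses from your expression.)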
A much more efficient way to do this is to find the max only once and filter the list based on that:
max_element = max(x[1] for x in anslist)
max_tuples = [x for x in anslist if x[1] == max_element]
Another option is to sort anslist so that (18, [20]) and (19, [20]) come first; you can then use a simple for loop and break when the second element of the tuple changes.
sorted_anslist = sorted(anslist, key=lambda x: x[1], reverse=True)
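The loop that the sorting suggestion describes is not spelled out, so here is a minimal sketch of it. The shortened anslist and the names top_value and max_tuples are illustrative only, and the code assumes the list is not empty.

```python
# Shortened sample data; the full list from the question works the same way.
anslist = [(2, [1]), (3, [7]), (18, [20]), (9, [19]), (19, [20]), (20, [7])]

# Largest second item first. sorted() is stable, even with reverse=True,
# so tuples with equal second items keep their original relative order.
sorted_anslist = sorted(anslist, key=lambda x: x[1], reverse=True)

top_value = sorted_anslist[0][1]  # assumes anslist is not empty
max_tuples = []
for tup in sorted_anslist:
    if tup[1] != top_value:
        break  # the second item changed, so no more maxima follow
    max_tuples.append(tup)

print(max_tuples)  # [(18, [20]), (19, [20])]
```

Sorting costs O(n log n), so the max-then-filter version is probably still the cheaper choice when only the maxima are needed.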

Related

Remove overlapping tuple ranges from list leaving only the longest range

For a given list of range tuples, I need to remove overlapping range tuples, keeping the longest range among those that overlap or, if they have the same length, keeping both.
For example:
input = [ [(1, 7), (2, 3), (7, 8), (9, 20)], [(4, 7), (2, 3), (7, 10)], [(1, 7), (2, 3), (7, 8)]]
expected_output = [ [(1,7), (9,20)], [(4,7), (2, 3), (7,10)], [(1,7)] ]
So only the longest of the overlapping range tuples should be kept.
def overlap(x: tuple, y: tuple) -> bool:
    return bool(len(range(max(x[0], y[0]), min(x[1], y[1]) + 1)))

def drop_overlaps(tuples: list):
    def other_tuples(elems: list, t: tuple) -> list:
        return [e for e in elems if e != t]
    return [t for t in tuples if not any(overlap(t, other_tuple)
            for other_tuple in other_tuples(tuples, t))]
How do I remove the overlaps and keep the longest of them and those that are non-overlapping?
You can sort the tuples by their first element, then compare each one with the last kept tuple using your overlap function. If they overlap and have equal length, append the new tuple to the result; if they overlap and differ in length, replace the last element of the result with the longer of the two; if they do not overlap, append the new tuple.
def drop(lst):
    sorted_lst = sorted(lst, key=lambda x: x[0])
    diff = lambda x: abs(x[0]-x[1])
    res = [sorted_lst[0]]
    for x in sorted_lst[1:]:
        if overlap(res[-1], x):
            if diff(res[-1]) == diff(x):
                res.append(x)
            else:
                res[-1] = max(res[-1], x, key=diff)
        else:
            res.append(x)
    return res
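As a rough check of this approach, the sketch below applies drop to each inner list from the question. The names groups and result are illustrative, and overlap is rewritten as a direct comparison, which should be equivalent to the range-based version for integer endpoints. Note that drop assumes each inner list is non-empty.

```python
def overlap(x, y):
    # Inclusive ranges overlap when the larger start is <= the smaller end.
    return max(x[0], y[0]) <= min(x[1], y[1])

def drop(lst):
    sorted_lst = sorted(lst, key=lambda x: x[0])
    diff = lambda x: abs(x[0] - x[1])  # length of a range
    res = [sorted_lst[0]]              # raises IndexError on an empty list
    for x in sorted_lst[1:]:
        if overlap(res[-1], x):
            if diff(res[-1]) == diff(x):
                res.append(x)          # same length: keep both
            else:
                res[-1] = max(res[-1], x, key=diff)  # keep the longer one
        else:
            res.append(x)              # no overlap: keep
    return res

groups = [[(1, 7), (2, 3), (7, 8), (9, 20)],
          [(4, 7), (2, 3), (7, 10)],
          [(1, 7), (2, 3), (7, 8)]]
result = [drop(group) for group in groups]
print(result)
# [[(1, 7), (9, 20)], [(2, 3), (4, 7), (7, 10)], [(1, 7)]]
```

The second group comes back ordered by start value rather than in its input order, because drop sorts before comparing; the set of kept ranges matches the expected output.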

Python sorting list with multiple keys and negating for some of them

I have a list of tuples in Python and I would like to sort it first in decreasing order of value (int) and, where values tie, in increasing order of word (str).
data = [(1, u'day'), (2, u'is'), (2, u'lunny'), (4, u'the')]
data.sort(key = lambda x: (x[0], x[1]), reverse=True)
The above sorts by decreasing order of value but fails to sort by increasing order of str (the 2nd value in the tuple).
Does anyone have a suggestion for how to fix this?
You could leave reverse at its default of False and just negate the first value:
>>> data.sort(key=lambda x: (-x[0], x[1]))
>>> data
[(4, 'the'), (2, 'is'), (2, 'lunny'), (1, 'day')]
You could also run sort twice, taking advantage of it being stable (see the documentation). You may find this more explicit:
>>> data = [(1, u'day'), (2, u'is'), (2, u'lunny'), (4, u'the')]
>>> data.sort(key = lambda x: x[1])
>>> data.sort(key = lambda x: x[0], reverse=True)
>>> data
[(4, 'the'), (2, 'is'), (2, 'lunny'), (1, 'day')]
When sorting by x[0], the previous order (ascending by x[1]) is preserved.
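The two-pass idea generalizes to any number of criteria, including keys such as strings that cannot be negated. The sketch below wraps it in a small helper; multisort is a hypothetical name (a similar helper appears in the Python Sorting HOWTO), and specs is assumed to list (index, reverse) pairs from most to least significant.

```python
from operator import itemgetter

def multisort(items, specs):
    """Return a new list sorted by several (index, reverse) criteria.

    specs lists the criteria from most significant to least significant.
    """
    result = list(items)
    # Apply the least significant criterion first; each later stable
    # sort keeps the earlier ordering among elements with equal keys.
    for index, reverse in reversed(specs):
        result.sort(key=itemgetter(index), reverse=reverse)
    return result

data = [(1, 'day'), (2, 'is'), (2, 'lunny'), (4, 'the')]
ordered = multisort(data, [(0, True), (1, False)])
print(ordered)  # [(4, 'the'), (2, 'is'), (2, 'lunny'), (1, 'day')]
```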

How do I sort this list of tuples by both values?

I have a list of tuples: [(2, Operation.SUBSTITUTED), (1, Operation.DELETED), (2, Operation.INSERTED)]
I would like to sort this list in 2 ways:
First by its 1st value in ascending order, i.e. 1, 2, 3... etc
Second by its 2nd value in reverse alphabetical order, i.e. Operation.SUBSTITUTED, Operation.INSERTED, Operation.DELETED
So the above list should be sorted as:
[(1, Operation.DELETED), (2, Operation.SUBSTITUTED), (2, Operation.INSERTED)]
How do I go about sorting this list?
Since sorting is guaranteed to be stable, you can do this in 2 steps:
lst = [(2, 'Operation.SUBSTITUTED'), (1, 'Operation.DELETED'), (2, 'Operation.INSERTED')]
res_int = sorted(lst, key=lambda x: x[1], reverse=True)
res = sorted(res_int, key=lambda x: x[0])
print(res)
# [(1, 'Operation.DELETED'), (2, 'Operation.SUBSTITUTED'), (2, 'Operation.INSERTED')]
In this particular case, because the comparison order is easy to invert for integers, you can sort in a single pass by negating the integer key and using reverse:
lst = [(2, 'Operation.SUBSTITUTED'), (1, 'Operation.DELETED'), (2, 'Operation.INSERTED')]
res = sorted(lst, key=lambda x: (-x[0],x[1]), reverse=True)
Result:
[(1, 'Operation.DELETED'), (2, 'Operation.SUBSTITUTED'), (2, 'Operation.INSERTED')]
Negating the integer key cancels the effect of reverse for that key, so the reversal applies only to the second (string) criterion.
You can use this:
from operator import itemgetter
d = [(1, 'DELETED'), (2, 'INSERTED'), (2, 'SUBSTITUTED')]
d.sort(key=itemgetter(1),reverse=True)
d.sort(key=itemgetter(0))
print(d)
Another way using itemgetter from operator module:
from operator import itemgetter
lst = [(2, 'Operation.SUBSTITUTED'), (1, 'Operation.DELETED'), (2, 'Operation.INSERTED')]
inter = sorted(lst, key=itemgetter(1), reverse=True)
sorted_lst = sorted(inter, key=itemgetter(0))
print(sorted_lst)
# [(1, 'Operation.DELETED'), (2, 'Operation.SUBSTITUTED'), (2, 'Operation.INSERTED')]
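The answers above use strings in place of the Operation values. If Operation is actually an enum.Enum (an assumption; the question does not show its definition), its members do not support < by default, so a sketch of the two-step sort could key on the member's name instead. The Operation class below is a hypothetical stand-in.

```python
from enum import Enum

class Operation(Enum):  # hypothetical stand-in for the asker's type
    DELETED = 1
    INSERTED = 2
    SUBSTITUTED = 3

lst = [(2, Operation.SUBSTITUTED), (1, Operation.DELETED), (2, Operation.INSERTED)]

# Secondary criterion first: member name in reverse alphabetical order.
by_name_desc = sorted(lst, key=lambda x: x[1].name, reverse=True)
# Primary criterion second: the stable sort keeps the name order within ties.
res = sorted(by_name_desc, key=lambda x: x[0])

print([(n, op.name) for n, op in res])
# [(1, 'DELETED'), (2, 'SUBSTITUTED'), (2, 'INSERTED')]
```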

Sum values in tuple (values in dict)

I have a dictionary data that looks like this, with sample values:
defaultdict(<type 'list'>,
            {(None, 2014): [(5, 1), (10, 2)],
             (u'Middle', 2014): [(6, 2), (11, 3)],
             (u'SouthWest', 2015): [(7, 3), (12, 4)]})
I get this from collections.defaultdict(list) because my values have to be lists.
My goal is to get a new dictionary that will contain the sum values for every tuple with respect to their position in the tuple.
By running
out = {k:(sum(tup[0] for tup in v),sum(tup[1] for tup in v)) for k,v in data.items()}
I get
{(None, 2014): (15, 3), (u'Middle', 2014): (17, 5), (u'SouthWest', 2015): (19, 7)}
However, the number of items in each tuple is not fixed when I write the code, so using sum(tup[0] for tup in v) with hard-coded indices is not an option. The length is only known at run time: it is an integer that I receive along with the data dict. All tuples are always of the same length (in this example, of length 2).
How do I tell Python that I want the out dict to contain tuples whose size matches that length?
I think you want the built-in zip function:
In [26]: {k: tuple(sum(x) for x in zip(*v)) for k, v in data.items()}
Out[26]:
{('SouthWest', 2015): (19, 7),
(None, 2014): (15, 3),
('Middle', 2014): (17, 5)}
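To make the zip answer easy to try, here is a self-contained sketch that rebuilds the sample data and also runs the same expression on three-element tuples. The name wider is illustrative only.

```python
from collections import defaultdict

data = defaultdict(list)
data[(None, 2014)] += [(5, 1), (10, 2)]
data[('Middle', 2014)] += [(6, 2), (11, 3)]
data[('SouthWest', 2015)] += [(7, 3), (12, 4)]

# zip(*v) regroups the tuples by position: [(5, 1), (10, 2)] -> (5, 10), (1, 2).
# Summing each group gives one total per position, whatever the tuple length.
out = {k: tuple(sum(col) for col in zip(*v)) for k, v in data.items()}
print(out[(None, 2014)])  # (15, 3)

# The same expression works unchanged for longer tuples.
wider = {'a': [(1, 2, 3), (4, 5, 6)]}
print({k: tuple(sum(col) for col in zip(*v)) for k, v in wider.items()})
# {'a': (5, 7, 9)}
```

One thing to keep in mind: a key whose list is empty produces an empty tuple, since zip() has nothing to group.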

How to sort a list/tuple of lists/tuples by the element at a given index?

I have some data either in a list of lists or a list of tuples, like this:
data = [[1,2,3], [4,5,6], [7,8,9]]
data = [(1,2,3), (4,5,6), (7,8,9)]
And I want to sort by the 2nd element in the subset. Meaning, sorting by 2,5,8 where 2 is from (1,2,3), 5 is from (4,5,6). What is the common way to do this? Should I store tuples or lists in my list?
sorted_by_second = sorted(data, key=lambda tup: tup[1])
or:
data.sort(key=lambda tup: tup[1]) # sorts in place
The default sort mode is ascending. To sort in descending order use the option reverse=True:
sorted_by_second = sorted(data, key=lambda tup: tup[1], reverse=True)
or:
data.sort(key=lambda tup: tup[1], reverse=True) # sorts in place
from operator import itemgetter
data.sort(key=itemgetter(1))
For sorting by multiple criteria, namely for instance by the second and third elements in a tuple, let
data = [(1,2,3),(1,2,1),(1,1,4)]
and so define a lambda that returns a tuple that describes priority, for instance
sorted(data, key=lambda tup: (tup[1],tup[2]) )
[(1, 1, 4), (1, 2, 1), (1, 2, 3)]
To add to Stephen's answer: if you want to sort the list from high to low, another way, besides those in the comments above, is to add this argument to the call:
reverse = True
so that the call becomes:
data.sort(key=lambda tup: tup[1], reverse=True)
Stephen's answer is the one I'd use. For completeness, here's the DSU (decorate-sort-undecorate) pattern with list comprehensions:
decorated = [(tup[1], tup) for tup in data]
decorated.sort()
undecorated = [tup for second, tup in decorated]
Or, more tersely:
[b for a,b in sorted((tup[1], tup) for tup in data)]
As noted in the Python Sorting HowTo, this has been unnecessary since Python 2.4, when key functions became available.
To sort a list of tuples (<word>, <count>) by count in descending order and then by word in alphabetical order:
data = [
('betty', 1),
('bought', 1),
('a', 1),
('bit', 1),
('of', 1),
('butter', 2),
('but', 1),
('the', 1),
('was', 1),
('bitter', 1)]
I use this method:
sorted(data, key=lambda tup:(-tup[1], tup[0]))
and it gives me the result:
[('butter', 2),
('a', 1),
('betty', 1),
('bit', 1),
('bitter', 1),
('bought', 1),
('but', 1),
('of', 1),
('the', 1),
('was', 1)]
Without lambda:
def sec_elem(s):
    return s[1]

sorted(data, key=sec_elem)
itemgetter() is somewhat faster than lambda tup: tup[1], but the increase is relatively modest (around 10 to 25 percent).
(IPython session)
>>> from operator import itemgetter
>>> from numpy.random import randint
>>> values = randint(0, 9, 30000).reshape((10000,3))
>>> tpls = [tuple(values[i,:]) for i in range(len(values))]
>>> tpls[:5] # display sample from list
[(1, 0, 0),
(8, 5, 5),
(5, 4, 0),
(5, 7, 7),
(4, 2, 1)]
>>> sorted(tpls[:5], key=itemgetter(1)) # example sort
[(1, 0, 0),
(4, 2, 1),
(5, 4, 0),
(8, 5, 5),
(5, 7, 7)]
>>> %timeit sorted(tpls, key=itemgetter(1))
100 loops, best of 3: 4.89 ms per loop
>>> %timeit sorted(tpls, key=lambda tup: tup[1])
100 loops, best of 3: 6.39 ms per loop
>>> %timeit sorted(tpls, key=(itemgetter(1,0)))
100 loops, best of 3: 16.1 ms per loop
>>> %timeit sorted(tpls, key=lambda tup: (tup[1], tup[0]))
100 loops, best of 3: 17.1 ms per loop
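For readers without IPython or NumPy, a comparable measurement can be sketched with the standard library alone. This is not the session above: the data is generated with random instead of numpy, the repeat count of 20 is arbitrary, and the timings will vary by machine and Python version.

```python
import random
import timeit
from operator import itemgetter

random.seed(0)  # fixed seed so the sample data is repeatable
tpls = [tuple(random.randrange(9) for _ in range(3)) for _ in range(10000)]

# Both key functions should produce exactly the same ordering.
by_itemgetter = sorted(tpls, key=itemgetter(1))
by_lambda = sorted(tpls, key=lambda tup: tup[1])
same_order = by_itemgetter == by_lambda

# Total time for 20 sorts with each key function.
t_itemgetter = timeit.timeit(lambda: sorted(tpls, key=itemgetter(1)), number=20)
t_lambda = timeit.timeit(lambda: sorted(tpls, key=lambda tup: tup[1]), number=20)
print("itemgetter: %.4f s, lambda: %.4f s" % (t_itemgetter, t_lambda))
```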
@Stephen's answer is to the point! Here is an example for better visualization.
Shout out to the Ready Player One fans! =)
>>> gunters = [('2044-04-05', 'parzival'), ('2044-04-07', 'aech'), ('2044-04-06', 'art3mis')]
>>> gunters.sort(key=lambda tup: tup[0])
>>> print(gunters)
[('2044-04-05', 'parzival'), ('2044-04-06', 'art3mis'), ('2044-04-07', 'aech')]
key is a function that is called on each item of the collection to produce the value used for comparison, somewhat like the compareTo method in Java.
The parameter passed to key must be something that is callable. Here, the use of lambda creates an anonymous function (which is a callable).
The syntax of lambda is the word lambda, followed by the parameter name(s), a colon, and then a single expression.
In this example, we are sorting a list of tuples that hold the time of a certain event and the actor's name.
We are sorting this list by the time of the event, which is the 0th element of each tuple.
Note: s.sort([cmp[, key[, reverse]]]) sorts the items of s in place (this is the Python 2 signature; Python 3 removed the cmp argument).
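To illustrate that key only needs a callable, the sketch below sorts the same kind of data with a named function and with an equivalent lambda. The names events and event_time are illustrative only.

```python
events = [('2044-04-05', 'parzival'), ('2044-04-07', 'aech'), ('2044-04-06', 'art3mis')]

def event_time(tup):
    # Return the value to compare on: the date string at index 0.
    return tup[0]

by_def = sorted(events, key=event_time)
by_lambda = sorted(events, key=lambda tup: tup[0])  # same callable, written inline
print(by_def == by_lambda)  # True

# Changing the index in the key changes the sort criterion.
by_name = sorted(events, key=lambda tup: tup[1])
print(by_name)
# [('2044-04-07', 'aech'), ('2044-04-06', 'art3mis'), ('2044-04-05', 'parzival')]
```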
I use this in my code:
#To sort the list based on each element's second integer (elem[1])
sorted(d2, key=lambda elem: elem[1])
Depending on which element you want to sort by, put its index inside the brackets:
(elem[*insert the index of the element you are sorting by*])
Sorting a tuple is quite simple:
tuple(sorted(t))
