most pythonic3 dezip(?) implementation - python

("dezip" is obviously a bad name, but I'm not sure what the right one would be. Please excuse me if that means I've missed an authoritative answer by not knowing what to search for.)
Let's say we have
people = [
(1, 'anne'),
(2, 'ben'),
(3, 'charlie'),
]
(common in django for choices etc.)
Now we want a list of "keys" or list of the first elements: [1, 2, 3]
In python 3 I'm using
people_ids, _ = list(zip(*people))
# or even
people_ids = [p[0] for p in people]
The zip way doesn't seem very neat, particularly with the extra list(...) required by python 3 making zip an iterator.
The second comprehension approach is slightly more readable but wouldn't generalise as well, eg. return lists of the the second, third elements etc. in the same call.
Is there a better way?
(where "better" mainly means clean and readable but performance might also have some considerable)

Using next, you can get the first item from the iterable:
>>> people = [
... (1, 'anne'),
... (2, 'ben'),
... (3, 'charlie'),
... ]
>>> next(zip(*people))
(1, 2, 3)
alternative using map with operator.itemgetter:
>>> import operator
>>> list(map(operator.itemgetter(0), people))
[1, 2, 3]
BTW, zip solution without list should work:
>>> people_ids, _ = zip(*people)
>>> people_ids
(1, 2, 3)

For the particular example, you can "abuse" a little bit the dict:
people = [
(1, 'anne'),
(2, 'ben'),
(3, 'charlie'),
]
d_people = dict(people)
And then you have a nice data model. This has some problems: keys cannot be repeated, and won't work with tuples with more than two elements. But for this case (which is quite typical!) works very nicely.
Then you can simply get the keys by doing keys:
d_people.keys()
or, explicit list, either:
list(d_people)
list(d_people.keys())
which are equivalent.

Getting a subset based on ids may be done by operator.itemgetter. Getting a subslice of zipped values may be done by creating explicit slice object and pass it to function.
import operator
people = [
(1, 'anne', 'some'),
(2, 'ben', 'another'),
(3, 'charlie', 'field'),
]
people_ids = [p[0] for p in people] # 0 may be passed as funtion argument
people_ids_and_another = [operator.itemgetter(*[0, 2])(p) for p in people] # [0, 2] may be passed as function argument
people_ids_and_name_via_slice = [p[slice(0,2,None)] for p in people] # equal to p[0:2], but passable as argument
To demonstrate function usage:
def dezip(seq, what):
if isinstance(what, list):
return [operator.itemgetter(*what)(p) for p in people]
else:
return [p[what] for p in people]
assert dezip(people, slice(0,2,None)) == [(1, 'anne'), (2, 'ben'), (3, 'charlie')]
assert dezip(people, 0) == [1, 2, 3]
assert dezip(people, [0, 2]) == [(1, 'some'), (2, 'another'), (3, 'field')]
If you'll drop 'list of indices' requirement, you may drop if statement in function body.

Related

Implementation of nested pairs

I'm doing Advent of Code and until this point I had no problem to solve issues on my own until day 18 which got me. I want to solve it on my own but I feel like I can't even start however the task itself is not difficult. It's kinda long to elaborate but the point is the following:
I have to implement nested pairs, so every item of a pair can be an another pair and so on, for example:
[[[[[9,8],1],2],3],4]
[[3,[2,[1,[7,3]]]],[6,[5,[4,[3,2]]]]]
I have to do different operations on these pairs, such as deleting pairs on a very deep level and use its values at higher level. I know Python handles tuples really great but I have yet to find a solution for recursive (?) traversal, deletion and "saving" values from the "deep". It's not a wise solution to delete items during iterating. Shall I use a different approach or some custom data structure for a task like this? I don't need an exact solution just some generic guidance.
I haven't read problem 18 from advent of code, but here is an example answer to your question with a list version and a tuple version.
Imagine I have nested pairs, and I want to write a function that delete all the deepest pairs and replaced them with a single number being the sum of the two numbers in the pair. For example:
input -> output
(1, 2) -> 3
((1, 2), 3) -> (3, 3)
(((1,2), (3,4)), ((5,6), (7,8))) -> ((3, 7), (11, 15))
((((((1, 2), 3), 4), 5), 6), 7) -> (((((3, 3), 4), 5), 6), 7)
Here is a version with immutable tuples, that returns a new tuple:
def updated(p):
result, _ = updated_helper(p)
return result
def updated_helper(p):
if isinstance(p, tuple) and len(p) == 2:
a, b = p
new_a, a_is_number = updated_helper(a)
new_b, b_is_number = updated_helper(b)
if a_is_number and b_is_number:
return a+b, False
else:
return (new_a, new_b), False
else:
return p, True
Here is a version with mutable lists, that returns nothing useful but mutates the list:
def update(p):
if isinstance(p, list) and len(p) == 2:
a, b = p
a_is_number = update(a)
b_is_number = update(b)
if isinstance(a, list) and len(a) == 1:
p[0] = a[0]
if isinstance(b, list) and len(b) == 1:
p[1] = b[0]
if a_is_number and b_is_number:
p[:] = [a+b]
return False
else:
return True
Note how I used a substantive, updated, and a verb, update, to highlight the different logic between these two similar functions. Function update performs an action (modifying a list), whereas updated(p) doesn't perform any action, but is the updated pair.
Testing:
print( updated( (((1,2), (3,4)), ((5,6), (7,8))) ) )
# ((3, 7), (11, 15))
l = [[[1,2], [3,4]], [[5,6], [7,8]]]
update(l)
print(l)
# [[3, 7], [11, 15]]

Is There A Universal Selector Option For if...in Clauses?

I have a "large" list of tuples:
thelist=[(1,2),(1,3),(2,3)]
I want to check whether any tuple in the list starts with a 1, and if it does, print "aaa":
for i in thelist:
templist.append((i[0],i))
for i in templist:
if i[0]==1:
print("aaa")
break
Which is rather ardurous as I have to create the templist. Is there any way I can do this:
if (1,_) in thelist:
print("aaa")
Where _ is the universal selector. Note that the list would be very large and thus it is very costly to implement another list.
There isn't, although you can just use any
any(i[0] == 1 for i in thelist) --> Returns true if the first element is 1
If you don’t actually need the actual tuple, like you do in your example, then you can actually use tuple unpacking for exactly that purpose:
>>> the_list = [(1, 2), (1, 3), (2, 3)]
>>> for x, y in the_list:
if x == 1:
print('aaa')
break
aaa
If you add a * in front of the y, you can also unpack tuples of different sizes, collecting the remainder of the tuple:
>>> other_list = [(1, 2, 3, 4, 5), (1, 3), (2, 3)]
>>> for x, *y in other_list:
if x == 1:
print(y)
break
[2, 3, 4, 5]
Otherwise, if you just want to filter your list based on some premise and then do something on those filtered items, you can use filter with a custom function:
>>> def startsWithOne(x):
return x[0] == 1
>>> thelist = [(1, 2), (1, 3), (2, 3)]
>>> for x in filter(starts_with_one, the_list):
print(x)
(1, 2)
(1, 3)
This is probably the most flexible way which also avoids creating a separate list in memory, as the elements are filtered lazily when you interate the list with your loop.
Finally, if you just want to figure out if any of your items starts with a 1, like you do in your example code, then you could just do it like this:
>>> if any(filter(starts_with_one, the_list)):
print('aaa')
aaa
But I assume that this was just an oversimplified example.

I get the empty set when using the for loop

I know this must be something very basic, but I dont know why I keep getting an empty set. What should I do in order to not get an empty set?
s=set()
for a in [(1,2),(1,3),(1,2)]:
b=[]
for j in range(len(a)-1):
b.append(a)
s.union(b)
print(s)
I get:
s=([])
but the result I want is
{(1,2),(1,3)}
I know there is another way to take the union, but I wish to do so with this for loop.
Try this:
s = set()
s = s.union([(1,2),(1,3),(1,2)])
set.union doesn't modify the set, so you need to save the result
if you want to use a for-loop, as you did in your post, then make sure to save the result of the union: s = s.union(b)
The function union returns a set that is the union of s with b, you should change your code to:
s = set()
for a in [(1, 2), (1, 3), (1, 2)]:
b = []
for j in range(len(a) - 1):
b.append(a)
s = s.union(b)
print(s)
Output
{(1, 2), (1, 3)}
As an alternative you can use the update function:
s.update(b)
Output
{(1, 2), (1, 3)}
You only need the union method if you want to union two non-empty sets. Since you start with an empty set, you can simply use the set constructor:
s = set([(1,2),(1,3),(1,2)])
and s would become:
{(1, 2), (1, 3)}

Testing functions returning iterable in python

I'm having difficulties testing python functions that
return an iterable, like functions that are
yielding or functions that simply return an iterable, like return imap(f, some_iter) or return permutations([1,2,3]).
So with the permutations example, I expect the output of the function to be [(1, 2, 3), (1, 3, 2), ...]. So, I start testing my code.
def perm3():
return permutations([1,2,3])
# Lets ignore test framework and such details
def test_perm3():
assertEqual(perm3(), [(1, 2, 3), (1, 3, 2), ...])
This will not work, since perm3() is an iterable, not a
list. So we can fix this particular example.
def test_perm3():
assertEqual(list(perm3()), [(1, 2, 3), (1, 3, 2), ...])
And this works fine. But what if I have nested iterables? That is
iterables yielding iterables? Like say the expressions
product(permutations([1, 2]), permutations([3, 4])). Now this is
probably not useful but it's clear that it will be (once unrolling the
iterators) something like [((1, 2), (3, 4)), ((1, 2), (4, 3)), ...].
However, we can not just wrap list around our result, as that will only
turn iterable<blah> to [iterable<blah>, iterable<blah>, ...]. Well
of course I can do map(list, product(...)), but this only works for a
nesting level of 2.
So, does the python testing community have any solution for the
problems when testing iterables? Naturally some iterables can't
be tested in this way, like if you want an infinite generator, but
still this issue should be common enough for somebody to have thought
about this.
I use KennyTM's assertRecursiveEq:
import unittest
import collections
import itertools
class TestCase(unittest.TestCase):
def assertRecursiveEq(self, first, second, *args, **kwargs):
"""
https://stackoverflow.com/a/3124155/190597 (KennyTM)
"""
if (isinstance(first, collections.Iterable)
and isinstance(second, collections.Iterable)):
for first_, second_ in itertools.izip_longest(
first, second, fillvalue = object()):
self.assertRecursiveEq(first_, second_, *args, **kwargs)
else:
# If first = np.nan and second = np.nan, I want them to
# compare equal. np.isnan raises TypeErrors on some inputs,
# so I use `first != first` as a proxy. I avoid dependency on numpy
# as a bonus.
if not (first != first and second != second):
self.assertAlmostEqual(first, second, *args, **kwargs)
def perm3():
return itertools.permutations([1,2,3])
class Test(TestCase):
def test_perm3(self):
self.assertRecursiveEq(perm3(),
[(1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1)])
if __name__ == '__main__':
import sys
sys.argv.insert(1, '--verbose')
unittest.main(argv = sys.argv)
1. If the order of results doesn't matter
Use unittest.assertItemsEqual(). This tests that the items are present in both self and reference, but ignores the order. This works on your example one nested deep example. It also works on a 2-deep example that I concocted.
2. If the order of results matters
I would suggest not ever casting the results of perm3() to a list. Instead, compare the elements directly as you iterate. Here's a test function that will work for your example. I added it to a subclass of unittest.TestCase:
def assertEqualIterables(self, itable1, itable2):
for ival1, ival2 in zip(itable1, itable2):
if "__iter__" in dir(ival1):
self.assertEqualIterables(ival1, ival2)
else:
self.assertEquals(ival1, ival2)
Use it like:
def test_perm3(self):
reference = [((1, 2), (3, 4)), ((1, 2), (4, 3)),
((2, 1), (3, 4)), ((2, 1), (4, 3)),]
self.assertEqualIterables(perm3(), reference)
You could extend you suggestion to include type (that was allowing you to distinguish between lists, tuples, etc.), like so:
def unroll(item):
if "__iter__" in dir(item):
return map(unroll, item), type(item)
else:
return item, type(item)
For example:
got = unroll(permutations([1,2]))
([([(1, <type 'int'>), (2, <type 'int'>)], <type 'tuple'>), ([(2, <type 'int'>), (1, <type 'int'>)], <type 'tuple'>)], <type 'itertools.permutations'>)
# note the final: <type 'itertools.permutations'>
expected = [(1, 2), (2, 1)]
assertEqual(x[0], unroll(expected) ) # check underlying
assertEqual(x[1], type(permutations([]) ) # check type
.
One thing to mention, type is coarse in distinguishing between objects e.g. <type 'classobj'>...
I don't know of any standard way python programmers test iterables, but you
can simply apply your idea of map and list into a recursive function
working for any level of nestedness.
def unroll(item):
if "__iter__" in dir(item):
return map(unroll, item)
else:
return item
Then your test will actually work.
def test_product_perms():
got = unroll(product(...))
expected = [[[1, 2], [3, 4]], [[1, 2], [4, 3]], ...]
assertEqual(got, expected)
However there is a flaw with this as you can see. When unrolling something, it
will always be turned to an array, this was desireable for iterables but it
also applies to the tuples. So therfore I had to manually convert the tuples in the expected result to lists. Hence, you can't diffrentiate if outputs are lists or tuples.
Another problem with this naive approach is that a passing test doesn't mean
that the function work. Say that you check assertEqual(list(my_fun()), [1, 2,
3]), while you think it might return an iterable that when "listed" is
equal to [1, 2, 3]. It might be that it did not return an iterable as you
wanted, it might have returned a list or a tuple too!

Using Python's list index() method on a list of tuples or objects?

Python's list type has an index() method that takes one parameter and returns the index of the first item in the list matching the parameter. For instance:
>>> some_list = ["apple", "pear", "banana", "grape"]
>>> some_list.index("pear")
1
>>> some_list.index("grape")
3
Is there a graceful (idiomatic) way to extend this to lists of complex objects, like tuples? Ideally, I'd like to be able to do something like this:
>>> tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11)]
>>> some_list.getIndexOfTuple(1, 7)
1
>>> some_list.getIndexOfTuple(0, "kumquat")
2
getIndexOfTuple() is just a hypothetical method that accepts a sub-index and a value, and then returns the index of the list item with the given value at that sub-index. I hope
Is there some way to achieve that general result, using list comprehensions or lambas or something "in-line" like that? I think I could write my own class and method, but I don't want to reinvent the wheel if Python already has a way to do it.
How about this?
>>> tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11)]
>>> [x for x, y in enumerate(tuple_list) if y[1] == 7]
[1]
>>> [x for x, y in enumerate(tuple_list) if y[0] == 'kumquat']
[2]
As pointed out in the comments, this would get all matches. To just get the first one, you can do:
>>> [y[0] for y in tuple_list].index('kumquat')
2
There is a good discussion in the comments as to the speed difference between all the solutions posted. I may be a little biased but I would personally stick to a one-liner as the speed we're talking about is pretty insignificant versus creating functions and importing modules for this problem, but if you are planning on doing this to a very large amount of elements you might want to look at the other answers provided, as they are faster than what I provided.
Those list comprehensions are messy after a while.
I like this Pythonic approach:
from operator import itemgetter
tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11)]
def collect(l, index):
return map(itemgetter(index), l)
# And now you can write this:
collect(tuple_list,0).index("cherry") # = 1
collect(tuple_list,1).index("3") # = 2
If you need your code to be all super performant:
# Stops iterating through the list as soon as it finds the value
def getIndexOfTuple(l, index, value):
for pos,t in enumerate(l):
if t[index] == value:
return pos
# Matches behavior of list.index
raise ValueError("list.index(x): x not in list")
getIndexOfTuple(tuple_list, 0, "cherry") # = 1
One possibility is to use the itemgetter function from the operator module:
import operator
f = operator.itemgetter(0)
print map(f, tuple_list).index("cherry") # yields 1
The call to itemgetter returns a function that will do the equivalent of foo[0] for anything passed to it. Using map, you then apply that function to each tuple, extracting the info into a new list, on which you then call index as normal.
map(f, tuple_list)
is equivalent to:
[f(tuple_list[0]), f(tuple_list[1]), ...etc]
which in turn is equivalent to:
[tuple_list[0][0], tuple_list[1][0], tuple_list[2][0]]
which gives:
["pineapple", "cherry", ...etc]
You can do this with a list comprehension and index()
tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11)]
[x[0] for x in tuple_list].index("kumquat")
2
[x[1] for x in tuple_list].index(7)
1
Inspired by this question, I found this quite elegant:
>>> tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11)]
>>> next(i for i, t in enumerate(tuple_list) if t[1] == 7)
1
>>> next(i for i, t in enumerate(tuple_list) if t[0] == "kumquat")
2
I would place this as a comment to Triptych, but I can't comment yet due to lack of rating:
Using the enumerator method to match on sub-indices in a list of tuples.
e.g.
li = [(1,2,3,4), (11,22,33,44), (111,222,333,444), ('a','b','c','d'),
('aa','bb','cc','dd'), ('aaa','bbb','ccc','ddd')]
# want pos of item having [22,44] in positions 1 and 3:
def getIndexOfTupleWithIndices(li, indices, vals):
# if index is a tuple of subindices to match against:
for pos,k in enumerate(li):
match = True
for i in indices:
if k[i] != vals[i]:
match = False
break;
if (match):
return pos
# Matches behavior of list.index
raise ValueError("list.index(x): x not in list")
idx = [1,3]
vals = [22,44]
print getIndexOfTupleWithIndices(li,idx,vals) # = 1
idx = [0,1]
vals = ['a','b']
print getIndexOfTupleWithIndices(li,idx,vals) # = 3
idx = [2,1]
vals = ['cc','bb']
print getIndexOfTupleWithIndices(li,idx,vals) # = 4
ok, it might be a mistake in vals(j), the correction is:
def getIndex(li,indices,vals):
for pos,k in enumerate(lista):
match = True
for i in indices:
if k[i] != vals[indices.index(i)]:
match = False
break
if(match):
return pos
z = list(zip(*tuple_list))
z[1][z[0].index('persimon')]
tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11)]
def eachtuple(tupple, pos1, val):
for e in tupple:
if e == val:
return True
for e in tuple_list:
if eachtuple(e, 1, 7) is True:
print tuple_list.index(e)
for e in tuple_list:
if eachtuple(e, 0, "kumquat") is True:
print tuple_list.index(e)
Python's list.index(x) returns index of the first occurrence of x in the list. So we can pass objects returned by list compression to get their index.
>>> tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11)]
>>> [tuple_list.index(t) for t in tuple_list if t[1] == 7]
[1]
>>> [tuple_list.index(t) for t in tuple_list if t[0] == 'kumquat']
[2]
With the same line, we can also get the list of index in case there are multiple matched elements.
>>> tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11), ("banana", 7)]
>>> [tuple_list.index(t) for t in tuple_list if t[1] == 7]
[1, 4]
I guess the following is not the best way to do it (speed and elegance concerns) but well, it could help :
from collections import OrderedDict as od
t = [('pineapple', 5), ('cherry', 7), ('kumquat', 3), ('plum', 11)]
list(od(t).keys()).index('kumquat')
2
list(od(t).values()).index(7)
7
# bonus :
od(t)['kumquat']
3
list of tuples with 2 members can be converted to ordered dict directly, data structures are actually the same, so we can use dict method on the fly.
This is also possible using Lambda expressions:
l = [('rana', 1, 1), ('pato', 1, 1), ('perro', 1, 1)]
map(lambda x:x[0], l).index("pato") # returns 1
Edit to add examples:
l=[['rana', 1, 1], ['pato', 2, 1], ['perro', 1, 1], ['pato', 2, 2], ['pato', 2, 2]]
extract all items by condition:
filter(lambda x:x[0]=="pato", l) #[['pato', 2, 1], ['pato', 2, 2], ['pato', 2, 2]]
extract all items by condition with index:
>>> filter(lambda x:x[1][0]=="pato", enumerate(l))
[(1, ['pato', 2, 1]), (3, ['pato', 2, 2]), (4, ['pato', 2, 2])]
>>> map(lambda x:x[1],_)
[['pato', 2, 1], ['pato', 2, 2], ['pato', 2, 2]]
Note: The _ variable only works in the interactive interpreter. More generally, one must explicitly assign _, i.e. _=filter(lambda x:x[1][0]=="pato", enumerate(l)).
I came up with a quick and dirty approach using max and lambda.
>>> tuple_list = [("pineapple", 5), ("cherry", 7), ("kumquat", 3), ("plum", 11)]
>>> target = 7
>>> max(range(len(tuple_list)), key=lambda i: tuple_list[i][1] == target)
1
There is a caveat though that if the list does not contain the target, the returned index will be 0, which could be misleading.
>>> target = -1
>>> max(range(len(tuple_list)), key=lambda i: tuple_list[i][1] == target)
0

Categories