Need an easy way to remove duplicates of nested tuples in python - python

I am currently working with a script that has lists that look like this:
example = [ ((2,1),(0,1)), ((0,1),(2,1)), ((2,1),(0,1)) ]
Now turning this list to a set returns:
set( [ ((2,1),(0,1)), ((0,1),(2,1)) ] )
For my purposes I need to recognize these tuples as being equal as well. I don't care about retaining the order. All the solutions I can think of are really messy, so if anyone has any ideas I would be grateful.

It sounds like you may be better off using frozensets instead of tuples.
>>> x = [((2, 1), (0, 1)), ((0, 1), (2, 1)), ((2, 1), (0, 1))]
>>> x
[((2, 1), (0, 1)), ((0, 1), (2, 1)), ((2, 1), (0, 1))]
>>> set(frozenset(ts) for ts in x)
set([frozenset([(0, 1), (2, 1)])])

In [10]: set(tuple(sorted(elt)) for elt in example)
Out[10]: set([ ((0, 1), (2, 1)) ])

First transform all elements to frozensets too (plain sets are not hashable, so they cannot be members of another set). Then make a set of the whole list.
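A minimal sketch of that idea, using the list from the question. The names unique_pairs and unique_tuples are only illustrative, and the conversion back to sorted tuples is an optional extra step, not something the question requires:

```python
example = [((2, 1), (0, 1)), ((0, 1), (2, 1)), ((2, 1), (0, 1))]

# frozenset is hashable, so it can be stored inside another set.
# Two frozensets with the same members compare equal regardless of order.
unique_pairs = {frozenset(pair) for pair in example}

# Optional: turn each frozenset back into a tuple with a canonical order.
unique_tuples = sorted(tuple(sorted(pair)) for pair in unique_pairs)
print(unique_tuples)  # [((0, 1), (2, 1))]
```

One caveat: a frozenset also collapses repeats inside a single pair, so ((1, 1), (1, 1)) would become a one-element frozenset. If that matters, the tuple(sorted(...)) approach keeps both members.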

Related

Problem with using sets and issubset in Python

I'm not clear about how my sets need to be written. The answer I'm getting is False, which means I'm either using the issubset method wrong or my sets are not written correctly.
some_cords = {(1, 2), (2, 2), (3, 2)}
line_cords = {((1, 2), (2, 2), (3, 2)), ((1, 3), (2, 3), (3, 3)), ((2, 1), (2, 2), (2, 3)),
((2, 4), (2, 3), (2, 2))}
print(some_cords.issubset(line_cords))
print(any(some_cords in k for k in line_cords))
>>False
>>False
I've tried these two methods but why they're False, I'm not sure. I've also tried writing some_cords and line_cords as tuples rather than sets, i.e.
some_cords = ((1, 2), (2, 2), (3, 2)) but I'm getting the same result. I'm running Python 3.10 in IntelliJ IDE. Thanks for any help.
As written, some_cords is a set with 3 elements, each one is a tuple: (1, 2), (2, 2) and (3, 2). On the other hand, line_cords is a set of 4 tuples, each tuple being a tuple of length 3.
Adding parentheses around the tuples of some_cords will turn it into a set with one element, similar to the elements of line_cords:
some_cords = {((1, 2), (2, 2), (3, 2))}
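With that change, the first test returns True.
As for the second test (using in), it checks whether some_cords is an element of each k, but the elements of each k are coordinate pairs such as (1, 2), so it returns False whether some_cords is a set or a tuple. A membership test does work if some_cords is a tuple (remove the outer braces) and you test it against line_cords itself, i.e. some_cords in line_cords, but then you cannot use issubset.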
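The following sketch puts the variants side by side. The names wrapped, as_tuple and the result variables are made up for illustration; the last check (comparing against set(line)) is an extra option for when the order of points within a line should not matter:

```python
some_cords = {(1, 2), (2, 2), (3, 2)}
line_cords = {
    ((1, 2), (2, 2), (3, 2)),
    ((1, 3), (2, 3), (3, 3)),
    ((2, 1), (2, 2), (2, 3)),
    ((2, 4), (2, 3), (2, 2)),
}

# Elements of some_cords are pairs; elements of line_cords are 3-tuples of pairs.
subset_of_points = some_cords.issubset(line_cords)   # False

# Wrap the three pairs in one tuple so the element types match.
wrapped = {((1, 2), (2, 2), (3, 2))}
subset_of_lines = wrapped.issubset(line_cords)       # True

# Plain membership test with a tuple against the whole set.
as_tuple = ((1, 2), (2, 2), (3, 2))
member = as_tuple in line_cords                      # True

# Order-insensitive: compare the set of points with each line as a set.
matches_any_line = any(some_cords == set(line) for line in line_cords)  # True
```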

Alternate ways in place of zip

I have the inputs below:
inp = 'Sample'
n = 5
I would like to generate a list of n tuples, each pairing the input with an index, so that my output is:
[('Sample', 0), ('Sample', 1), ('Sample', 2), ('Sample', 3), ('Sample', 4)]
The snippet below does the job neatly:
output = zip([inp]*n, range(n))
I'm just curious about alternative approaches to solve the same problem.
The most obvious solution (a list comprehension) has already been mentioned in the comments, so here's an alternative with itertools.zip_longest, just for fun -
from itertools import zip_longest
r = list(zip_longest([], range(n), fillvalue=inp))
print(r)
[('Sample', 0), ('Sample', 1), ('Sample', 2), ('Sample', 3), ('Sample', 4)]
On Python 2.x, you'd need izip_longest instead.
inp='sample'
n=5
print([(inp, i) for i in range(n)])
The output is:
[('sample', 0), ('sample', 1), ('sample', 2), ('sample', 3), ('sample', 4)]
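A few more standard-library options, as a sketch only. The variable names with_repeat, with_product and with_comprehension are just for illustration:

```python
from itertools import product, repeat

inp = 'Sample'
n = 5

# repeat(inp) is endless; zip stops when range(n) is exhausted.
with_repeat = list(zip(repeat(inp), range(n)))

# The Cartesian product of a one-element list with range(n).
with_product = list(product([inp], range(n)))

# The list comprehension mentioned above, for comparison.
with_comprehension = [(inp, i) for i in range(n)]

print(with_repeat)
```

All three build the same list; repeat avoids creating the intermediate [inp]*n list.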

Python: Print a generator expression's values when those values are itertools.product objects

I'm trying to dig into some code I found online here to better understand Python.
This is the code fragment I'm trying to get a feel for:
from itertools import chain, product
def generate_groupings(word_length, glyph_sizes=(1,2)):
    cartesian_products = (
        product(glyph_sizes, repeat=r)
        for r in range(1, word_length + 1)
    )
Here, word_length is 3.
I'm trying to evaluate the contents of the cartesian_products generator. From what I can gather after reading the answer at this SO question, generators do not iterate (and thus, do not yield a value) until they are called as part of a collection, so I've placed the generator in a list:
list(cartesian_products)
Out[6]:
[<itertools.product at 0x1025d1dc0>,
<itertools.product at 0x1025d1e10>,
<itertools.product at 0x1025d1f50>]
Obviously, I now see inside the generator, but I was hoping to get more specific information than the raw details of the itertools.product objects. Is there a way to accomplish this?
If you don't mind exhausting the generator, you can use:
list(map(list,cartesian_products))
You will get the following for word_length = 3
Out[1]:
[[(1,), (2,)],
[(1, 1), (1, 2), (2, 1), (2, 2)],
[(1, 1, 1),
(1, 1, 2),
(1, 2, 1),
(1, 2, 2),
(2, 1, 1),
(2, 1, 2),
(2, 2, 1),
(2, 2, 2)]]
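To make the "exhausting" part concrete, here is a small sketch. The helper cartesian_products below is a hypothetical stand-in that only rebuilds the generator expression from the question; it is not the original generate_groupings function:

```python
from itertools import product

def cartesian_products(word_length, glyph_sizes=(1, 2)):
    # Hypothetical helper: rebuilds the generator expression from the question.
    return (product(glyph_sizes, repeat=r) for r in range(1, word_length + 1))

gen = cartesian_products(3)

# Materialize every product object into a list of tuples.
materialized = [list(p) for p in gen]
print([len(group) for group in materialized])  # [2, 4, 8]

# The generator has now been consumed, so a second pass yields nothing.
leftover = list(gen)
print(leftover)  # []
```

If the values are needed again later, keep the materialized list around or create a fresh generator.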

Is it possible to include multiple equations in a lambda function

I'm trying to include multiple operations in the lambda function with variables that have different lengths, i.e. something like:
$ serial_result = map(lambda x,y:(x**2,y**3), range(20), range(10))
but this doesn't work. Could someone tell me how to get around this?
I understand that:
$ serial_result = map(lambda x,y:(x**2,y**3), range(0,20,2), range(10))
works because the arrays of "x" and "y" have the same length.
If you want the product of range items you can use itertools.product :
>>> from itertools import product
>>> serial_result = map(lambda x:(x[0]**2,x[1]**3), product(range(20), range(10)))
If you want to pass the pairs to the lambda as in the second case, you can use itertools.zip_longest (izip_longest in Python 2) and pass a fillvalue to fill in the missing items:
>>> from itertools import zip_longest
>>> serial_result = map(lambda x:(x[0]**2,x[1]**3), zip_longest(range(20), range(10),fillvalue=1))
Note that in Python 2 you can unpack a tuple argument directly in the lambda's parameter list (this syntax was removed in Python 3):
>>> serial_result = map(lambda (x,y):(x**2,y**3), product(range(20), range(10)))
See the difference between zip_longest and product in the following example:
>>> list(product(range(5),range(3)))
[(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2), (3, 0), (3, 1), (3, 2), (4, 0), (4, 1), (4, 2)]
>>> list(zip_longest(range(5),range(3)))
[(0, 0), (1, 1), (2, 2), (3, None), (4, None)]
>>> list(zip_longest(range(5),range(3),fillvalue=1))
[(0, 0), (1, 1), (2, 2), (3, 1), (4, 1)]
It sounds like you may be unsure how exactly you want to use all the values in these two variables. There are several ways to combine them:
If you want a result for every combination of an element in a and an element in b: itertools.product(a, b).
If you want to stop once you get to the end of the shorter one: zip(a, b).
If you want to continue until you've used all of the longer one: itertools.zip_longest(a, b) (izip_longest in Python 2). Once the shorter one runs out of elements, its slot is filled with None, or with a fillvalue you provide.
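A short Python 3 sketch comparing the three options on small ranges. The function power_pair and the result names are made up for this example; fillvalue=1 is an arbitrary choice, as in the answer above:

```python
from itertools import product, zip_longest

a = range(5)
b = range(3)

def power_pair(x, y):
    # Same computation as the lambda in the question.
    return (x**2, y**3)

# Every combination of an element of a with an element of b (5 * 3 results).
every_combination = [power_pair(x, y) for x, y in product(a, b)]

# Pairwise, stopping at the end of the shorter input.
until_shorter = [power_pair(x, y) for x, y in zip(a, b)]

# Pairwise, padding the shorter input with 1 until the longer one ends.
until_longer = [power_pair(x, y) for x, y in zip_longest(a, b, fillvalue=1)]

print(len(every_combination))  # 15
print(until_shorter)           # [(0, 0), (1, 1), (4, 8)]
print(until_longer)            # [(0, 0), (1, 1), (4, 8), (9, 1), (16, 1)]
```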

Double sort with reverse

It is easy to implement a regular double sort:
pairs = [(1, 2), (2, 1), (1, 3), (2, 4), (3, 1)]
sorted(pairs,key=lambda x: (x[0],x[1]))
# out: [(1, 2), (1, 3), (2, 1), (2, 4), (3, 1)]
I am interested in how to do it with the second elements in reverse order. This can be implemented by first grouping the pairs by the first item and then joining the groups with their second items sorted in reverse. I have implemented this using both itertools.groupby and defaultdict. Still, it remains far more complex than the regular double sort, so I wonder if there is a neat trick to do it more concisely.
double_sort(pairs)
# out: [(1, 3), (1, 2), (2, 4), (2, 1), (3, 1)]
PS! I know how to do it with numpy.argsort and would mostly like to see a standard lib approach.
This works for numbers and other types that support negation:
sorted(pairs, key=lambda x: (x[0], -x[1]))
This works for all comparable types, but only in Python 2, because the cmp argument of sorted and the built-in cmp function were removed in Python 3:
sorted(pairs, lambda x, y: cmp(x[0], y[0]) or cmp(y[1], x[1]))
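For Python 3, two standard-library sketches that should also work for any comparable type. The local cmp function below is a hand-written replacement for the removed built-in, and the names by_cmp and by_two_passes are only illustrative:

```python
from functools import cmp_to_key
from operator import itemgetter

pairs = [(1, 2), (2, 1), (1, 3), (2, 4), (3, 1)]

def cmp(a, b):
    # Stand-in for the Python 2 built-in: returns -1, 0 or 1.
    return (a > b) - (a < b)

# Option 1: wrap the old-style comparison function with cmp_to_key.
by_cmp = sorted(
    pairs,
    key=cmp_to_key(lambda x, y: cmp(x[0], y[0]) or cmp(y[1], x[1])),
)

# Option 2: sort twice. Python's sort is stable, so sorting by the
# secondary key first (descending) and then by the primary key keeps
# the secondary order within each group.
by_two_passes = sorted(
    sorted(pairs, key=itemgetter(1), reverse=True),
    key=itemgetter(0),
)

print(by_cmp)         # [(1, 3), (1, 2), (2, 4), (2, 1), (3, 1)]
print(by_two_passes)  # [(1, 3), (1, 2), (2, 4), (2, 1), (3, 1)]
```

The two-pass version is usually the simpler one to read; cmp_to_key is handy when an existing comparison function has to be reused.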
