Using tuples as indexes to compare items across a list

I have a list of three tuples and a list of three strings:
pairs = [(0, 1), (0, 2), (1, 2)]
values = ['aac', 'ccc', 'caa']
I would like to use the elements of the pairs as indexes to compare the strings in the following way:
The first pair of indexes, (0, 1), operates across the first letter of each string: a from the first, c from the second, and c from the third. That is, it compares the values at index 0 and 1 in the sequence a, c, c. Since a is lexically less than c, this comparison should give 'smaller'.
The second pair is (0, 2) and operates across the second letter of each string: a, c, a. Since they're both a, the result should be 'equal'.
Finally, (1, 2) is checked on c, c, a, resulting in 'bigger'.
So the total expected output is the following list:
['smaller', 'bigger', 'equal']
I have tried the following code:
n=0
for x,y in pairs:
    if ord(values[x][n])>ord(values[y][n]):
        print('bigger')
        n+=1
    elif ord(values[x][n])<ord(values[y][n]):
        print('smaller')
        n+=1
    else:
        print('equal')
        n+=1
However, not only does it print the results instead of building a list, it also gives incorrect results (smaller, equal, bigger). How do I achieve my intended result?

You could use a list comprehension with the zip function to combine the two lists:
pairs = [(0, 1), (0, 2), (1, 2)]
values = ['aac', 'ccc', 'caa']
result = [ ("smaller","equal","bigger")[(v[x]>v[y])+(v[x]>=v[y])]
           for v,(x,y) in zip(zip(*values),pairs) ]
print(result)
['smaller', 'equal', 'bigger']
zip(*values) will create tuples with the nth character of each string: ('a','c','c'), ('a','c','a'), ('c','c','a')
zip(zip(*values),pairs) combines those character tuples with each corresponding pair: (('a','c','c'),(0,1)), (('a','c','a'),(0,2)), (('c','c','a'),(1,2))
These become v (the nth characters of each value) and x,y (the nth index pair).
The appropriate keyword is then chosen from ("smaller","equal","bigger") using the index 0, 1 or 2.
Python treats True as 1 and False as 0 when adding booleans (comparison results), so the index will be 1+1 if v[x] is greater than v[y], 0+1 if v[x] is equal to v[y], and 0+0 otherwise. BTW, you don't need ord() to compare characters.
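The boolean arithmetic described above can be checked in isolation. The following is only a minimal sketch: the helper name compare_label is hypothetical and not part of the original answer, and the data is the one from the question.

```python
def compare_label(a, b):
    """Return 'smaller', 'equal' or 'bigger' for a relative to b."""
    # (a > b) + (a >= b) is 0 when a < b, 1 when a == b and 2 when a > b,
    # because True counts as 1 and False as 0 in arithmetic.
    return ("smaller", "equal", "bigger")[(a > b) + (a >= b)]


pairs = [(0, 1), (0, 2), (1, 2)]
values = ['aac', 'ccc', 'caa']

# zip(*values) yields one tuple of characters per letter position.
result = [compare_label(column[x], column[y])
          for column, (x, y) in zip(zip(*values), pairs)]
print(result)  # ['smaller', 'equal', 'bigger']
```

Pulling the lookup into a named function is a readability choice only; the comprehension in the answer does the same work inline.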

Your code is good; there are just a few things to improve.
There is no need to compare the ord values, you can compare characters directly.
Instead of printing, save each step in a list.
You can use enumerate to enumerate iterables:
out = []
for n, (x,y) in enumerate(pairs):
    if values[x][n]>values[y][n]:
        out.append('bigger')
    elif values[x][n]<values[y][n]:
        out.append('smaller')
    else:
        out.append('equal')
out
Output:
>>> out
['smaller', 'equal', 'bigger']
NB. I am not commenting on the overall logic, as what you ultimately want to do was not made explicit.

Related

Efficiently combine two uneven lists into dictionary based on condition

I have two lists of tuples and I want to map elements in e to elements in s based on a condition. The condition is that the 1st number of a tuple in e needs to be >= the 1st number of a tuple in s, and the sum of the two numbers in the e tuple needs to be <= the sum of the two numbers in the s tuple. The 1st number in each s tuple is a start position and the second is the length. I can do it as follows:
e = [('e1',3,3),('e2',6,2),('e3',330,3)]
s = [('s1',0,10),('s2',11,24),('s3',35,35),('s4',320,29)]
for i in e:
    d = {i:j for j in s if i[1]>=j[1] and i[1]+i[2]<=j[1]+j[2]}
    print(d)
Output (what I want):
{('e1', 3, 3): ('s1', 0, 10)}
{('e2', 6, 2): ('s1', 0, 10)}
{('e3', 330, 3): ('s4', 320, 29)}
Is there a more efficient way to get to this result, ideally without the for loop (at least not a double loop)? I have tried some things with zip as well as something along the lines of
list(map(lambda i,j: {i:j} if i[1]>=j[1] and i[1]+i[2]<=j[1]+j[2] else None, e, s))
but it is not giving me quite what I am looking for.
The elements in s will never overlap. For example, you wouldn't have ('s1',0,10) and ('s2', 5,15). In other words, the range (0, 0+10) will never overlap with (5,5+15) or that of any other tuple. Additionally, all the tuples in e will be unique.

The constraint that the tuples in s can't overlap is pretty strong. In particular, it implies that each value in e can only match with at most one value in s (I think the easiest way to show that is to assume two distinct, non-overlapping tuples in s match with a single element in e and derive a contradiction).
Because of that property, any s-tuple s1 matching an e-tuple e1 has the property that among all tuples sx in s with sx[1]<=e1[1], it is the one with the greatest sum(sx[1:]), since if it weren't then the s-tuple with a small enough 1st coordinate and greater sum would also be a match (which we already know is impossible).
That observation lends itself to a fairly simple algorithm where we linearly walk through both e and s (sorted), keeping track of the s-tuple with the biggest sum. If that sum is big enough compared to the e-tuple we're looking at, add the pair to our result set.
def pairs(e, s):
    # Might be able to skip or replace with .sort() depending
    # on the rest of your program
    e = sorted(e, key=lambda t: t[1])
    s = sorted(s, key=lambda t: t[1])
    i, max_seen, max_sum, results = 0, None, None, {}
    for ex in e:
        # Advance through every s-tuple that starts at or before ex
        # (the := operator requires Python 3.8+)
        while i < len(s) and (sx := s[i])[1] <= ex[1]:
            if max_seen is None or sum(sx[1:]) > max_sum:
                max_seen, max_sum = sx, sum(sx[1:])
            i += 1
        # ex fits if the furthest-reaching s-tuple ends at or after ex ends
        if max_seen is not None and max_sum >= sum(ex[1:]):
            results[ex] = max_seen
    return results
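As an alternative to the linear walk, the same non-overlap property also allows a binary search per e-tuple. The following is a sketch using the standard bisect module, not the answer's method; the function name match_intervals is hypothetical, and it assumes the ranges in s do not overlap, so only the last s-tuple starting at or before an e-tuple can contain it.

```python
from bisect import bisect_right


def match_intervals(e, s):
    """Map each e-tuple to the s-tuple whose range contains it, if any."""
    s_sorted = sorted(s, key=lambda t: t[1])
    starts = [t[1] for t in s_sorted]
    results = {}
    for ex in e:
        # Index of the last s-tuple that starts at or before ex's start.
        idx = bisect_right(starts, ex[1]) - 1
        if idx < 0:
            continue  # every s-tuple starts after ex
        sx = s_sorted[idx]
        # ex fits when it ends no later than sx does.
        if ex[1] + ex[2] <= sx[1] + sx[2]:
            results[ex] = sx
    return results


e = [('e1', 3, 3), ('e2', 6, 2), ('e3', 330, 3)]
s = [('s1', 0, 10), ('s2', 11, 24), ('s3', 35, 35), ('s4', 320, 29)]
matched = match_intervals(e, s)
for key, value in matched.items():
    print(key, '->', value)
```

This variant does not need e to be sorted, at the cost of one binary search per e-tuple.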

How do I find the intersection of a set of tuples while ignoring the last element of the tuples?

Edit: Solved
I've solved this by creating dictionaries a and b where the keys are the tuples (x,y) and my values are the integers t. I then return my keys as sets, take the built-in intersection, then get the values for all intersecting (x,y) points.
a = {(x,y): t, ...}
b = {(x,y): t, ...}
c = set([*a]).intersection(set([*b]))
for each in c:
    val_a = a.get(each)
    val_b = b.get(each)
Original Question
I have two sets of tuples, each of the form
a = {(x,y,t), (x,y,t), ...}
b = {(x,y,t), (x,y,t), ...}
I'd like to find the "intersection" of a and b while ignoring the t element of the tuples.
For example:
a = {(1,2,5), (4,6,7)}
b = {(1,2,7), (5,5,3)}
c = a.magicintersection(b,'ignore-last-element-of-tuple-magic-keyword')
where c, the desired output, would yield {(1,2,5), (1,2,7)}.
I'd like to leverage the built-in intersection function rather than writing my own (horribly inefficient) function but I can't see a way around this.

You can't use the built-in intersection methods for that. You also can't attach functions to built-ins:
def magic_intersect(x):
    pass
set.mi = magic_intersect
results in
    set.mi = magic_intersect
TypeError: can't set attributes of built-in/extension type 'set'
You could put them all into a dictionary whose keys are the first two elements of each tuple and whose values are sets (or lists) of all tuples that match, to get the result:
a = {(1,2,5), (4,6,7)}
b = {(1,2,7), (5,5,3)}
from collections import defaultdict
d = defaultdict(set)
for x in (a,b):
    for s in x:
        d[(s[0],s[1])].add(s)
print(d)
print(d.get( (1,2) )) # get all tuples that start with (1,2,_)
Output:
defaultdict(<class 'set'>, {
(4, 6): {(4, 6, 7)},
(1, 2): {(1, 2, 5), (1, 2, 7)},
(5, 5): {(5, 5, 3)}})
{(1, 2, 5), (1, 2, 7)}
But that's only going to be worth it if you need to query for those multiple times and do not need to put millions of sets in them.
The actual "lookup" of which 2-tuple has which 3-tuples is O(1) fast, but you need space/time to build the dictionary.
This approach loses the information about which set of tuples the items came from. If you need to preserve that as well, you would have to store it somehow.

What would your "intersection" have as a result, if the third component varies?
Anyway, the way to do this is to have a dictionary where the key is a tuple with the components of interest. The dictionary values can be lists with all matching 3-tuples, and then you can select just those which have more than one element.
This is not inefficient: you only have to walk each set once, so it is O(M + N). Even if you have a lot of lists and thousands of tuples with the same x, y, building the dictionary just appends each matching tuple to a list, which is O(1).
matches = {}
for series_of_tuples in (a, b):
    for item in series_of_tuples:  # "item" avoids shadowing the built-in tuple
        matches.setdefault(item[:2], []).append(item)
intersection = [values for values in matches.values() if len(values) > 1]
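The dictionary approach can also be combined with the built-in set operations on dictionary key views, which keeps track of which set each tuple came from. This is only a sketch on the question's example data; the helper name group_by_xy is hypothetical.

```python
from collections import defaultdict


def group_by_xy(tuples):
    """Group 3-tuples by their first two elements."""
    groups = defaultdict(set)
    for item in tuples:
        groups[item[:2]].add(item)
    return groups


a = {(1, 2, 5), (4, 6, 7)}
b = {(1, 2, 7), (5, 5, 3)}

groups_a, groups_b = group_by_xy(a), group_by_xy(b)

# Dictionary key views support set operations, so & gives the (x, y)
# pairs that occur in both a and b.
shared = groups_a.keys() & groups_b.keys()

c = set()
for key in shared:
    c |= groups_a[key] | groups_b[key]
print(c)  # contains (1, 2, 5) and (1, 2, 7); set print order may vary
```

Grouping each set separately means an (x, y) pair that appears twice in a alone is not reported, which a single shared dictionary filtered by length would report.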

What are 'c' and 'value' ? Can someone explain how these works? [duplicate]

First example:
my_list = ['apple', 'banana', 'grapes', 'pear']
for c, value in enumerate(my_list, 1):
    print(c, value)
Another example:
[print(int(x)==sum(int(d)**p for p,d in enumerate(x,1)))for x in[input()]]
How do x, d, and p work?

So, there are two small questions here:
a)
my_list = ['apple', 'banana', 'grapes', 'pear']
for c, value in enumerate(my_list, 1):
    print(c, value)
The steps are as follows:
enumerate(my_list, 1) pairs each item with a counter that starts at 1. It returns an enumerate object; if you use list(enumerate(my_list, 1)) to have a look, it is [(1, 'apple'), (2, 'banana'), (3, 'grapes'), (4, 'pear')].
So, on each pass of the for loop, the first iteration gets c=1, value='apple', the second gets c=2, value='banana', and so on.
Then the final output is:
1 apple
2 banana
3 grapes
4 pear
b)
[print(int(x)==sum(int(d)**p for p,d in enumerate(x,1)))for x in[input()]]
The steps are as follows:
First, it's a list comprehension; I suppose you already know that.
input() first waits for user input. Let's enter 100, for example. input() returns it as a str, so [input()] is ['100'].
Then with for x in [input()], x is '100'.
Next, the list comprehension evaluates int(x)==sum(int(d)**p for p,d in enumerate(x,1)).
(int(d)**p for p,d in enumerate(x,1)) iterates over '100'; list(enumerate(x,1)) would show [(1, '1'), (2, '0'), (3, '0')], just as in example 1. It then calculates int(d)**p for every pair, and sum adds the results: int('1')**1 + int('0')**2 + int('0')**3, which is 1.
So print(int('100')==1) outputs False.
The return value of a print function call is always None, so the list comprehension builds the new list [None].
So the final output is (NOTE: 100 is the echo of your input):
>>> [print(int(x)==sum(int(d)**p for p,d in enumerate(x,1)))for x in[input()]]
100
False
[None]
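Unrolling the one-liner into an ordinary loop may make the roles of x, d and p easier to see. This is a sketch only; the function name digit_power_sum_matches is hypothetical, and it takes the string as an argument instead of calling input().

```python
def digit_power_sum_matches(x):
    """Return True if int(x) equals the sum of its digits, each raised
    to the power of its 1-based position in the string x."""
    total = 0
    # p is the 1-based position, d is the digit character at that position.
    for p, d in enumerate(x, 1):
        total += int(d) ** p
    return int(x) == total


print(digit_power_sum_matches('100'))  # False: 1**1 + 0**2 + 0**3 == 1
print(digit_power_sum_matches('89'))   # True: 8**1 + 9**2 == 89
```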

It is good practice to look up things in the Python documentation when you have questions about built-in functions like enumerate.
To explain it, enumerate will help you iterate over all values in a list, and if you pass in an optional number (like the 1 in your example), it specifies the value the counter starts from. It does not change which item the iteration starts at.
What enumerate returns is first the counter and then the current item of the list. So c in your example is going to be each count (1, 2, 3, etc.) as it loops through the for loop, and value in your example is the actual item stored in the list.
Also remember that lists start at index 0 instead of 1, so with a start of 1 the counter is one higher than the item's real index.

A common naming convention when using enumerate is idx, item. It makes it clear what each element represents, so it would be best to follow it:
l = ['a', 'b', 'c']
print([(idx, item) for idx, item in enumerate(l)])
[(0, 'a'), (1, 'b'), (2, 'c')]
As you can see idx represents the index of the item, and item is the item.

Python enumerate - what cannot I see?

Can anyone explain what is going on here as I am flummoxed
I have a module-wide list variable with elements that have fields - mylist with 'n' entries, each of field1, field2..fieldx
I want to access them in a procedure, so have (with some trace/debug statements)
print(mylist[1].dataFieldCheckType)
for lIndex, lField in enumerate(mylist, start = 1):
    print(lField.dataFieldCheckType)
The first print statement gives the value -4 (which is correct), the second gives a different value, 0, over a simple one-statement step.
To my mind, lField is being created as a new element with default values but I do not know, nor understand, why. Why is the second print statement giving a different value from the first?
What am I doing wrong? Or, probably more pertinently, what am I not understanding?
I have asked this in another forum but no-one has come up with a plausible explanation.

In enumerate(), start does not specify the starting index into the iterable. It specifies the starting value of the count. enumerate() iterates over the whole iterable, from the first (index 0) to the last element, regardless of the start parameter.
The first print statement in your loop prints mylist[0].dataFieldCheckType, just as it ought to. You're just hoping it would be mylist[1].dataFieldCheckType.
If you want to take all elements of the list starting at the second (index 1), just slice it:
mylist[1:]
And if you really do need the index, too, combine the slice with enumerate():
enumerate(mylist[1:], start=1)
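The difference between the start parameter and slicing can be seen in a small sketch. The list contents here are made up for illustration and stand in for the objects in the question.

```python
mylist = ['first', 'second', 'third']

# start only changes the counter; iteration still begins at mylist[0].
with_start = list(enumerate(mylist, start=1))
print(with_start)   # [(1, 'first'), (2, 'second'), (3, 'third')]

# Slicing skips the first element; start=1 keeps the counter aligned
# with the indexes of the original list.
sliced = list(enumerate(mylist[1:], start=1))
print(sliced)       # [(1, 'second'), (2, 'third')]
```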

enumerate yields (index + start, value) tuples for every value of an iterable. The optional start parameter is used as an offset value to compute the first element of the generated tuples:
>>> a = ['hi', 'stack', 'overflow']
>>> for x in enumerate(a, -4):
...     x
...
(-4, 'hi') # 0 + (-4)
(-3, 'stack') # 1 + (-4)
(-2, 'overflow') # 2 + (-4)
If you want to skip elements of an iterable, but don't need that particular slice in memory (all you want to do is iteration), use itertools.islice:
>>> from itertools import islice
>>> for x in islice(a, 2, None):
...     x
...
'overflow'
Of course, you could combine the two for great justice.
>>> for x in islice(enumerate(a), 0, 2):
...     x
...
(0, 'hi')
(1, 'stack')

Sorting a list numerically that has string and integer values

I am looking for code that can sort a list, for example list x, which contains integers and strings. The code would sort the list x so that each integer value stays with its corresponding string. So far I have tried this code, however it does not work.
x =["a" 2,"c" 10, "b" 5]
x.sort()
print (x)
I want the result to be
["a" 2 "b" 5 "C" 10]
So the list is sorted numerically in ascending order and the string is also printed.

Use a list of tuples and then sort them according to what you want. For example:
x = [('b',5),('a',2),('c',10)]
x.sort() # This will sort them based on item[0] of each tuple
x.sort(key=lambda s: s[1]) # This will sort them based on item[1] of each tuple
Another approach is to use a dictionary instead of a list of tuples. For example:
x = {'b':5,'a':2,'c':10} # A dictionary is not sorted; since Python 3.7 it keeps insertion order
If you print x, you will get:
{'b': 5, 'a': 2, 'c': 10}
If you want to sort them based on the value of each element, then:
x = sorted(x.items(), key=lambda s:s[1])
This will create a new list of tuples, since sorted() returns a new sorted list. Hence the result will be:
[('a', 2), ('b', 5), ('c', 10)]

If I deduced correctly, you also want the resulting list to have an integer where the original list has an integer (and the same for characters).
I don't think there is an out-of-the-box way to do that. One possible approach is to separate your list into two others: one with integer, one with chars. Then, after sorting each list, you can merge them respecting the desired positions of integers and chars.

Use a nested iterable to pair the letters with the numbers, then sort the items by their second elements:
pairs = [('a', 2), ('c', 10), ('b', 5)]
pairs.sort(key = lambda x: x[1])
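If the data starts out as one flat list, it can be paired up before sorting. This is a sketch under the assumption that the list strictly alternates letter, number; the variable names flat and flattened are made up for illustration.

```python
flat = ["a", 2, "c", 10, "b", 5]

# Even positions hold the letters, odd positions hold the numbers.
pairs = list(zip(flat[::2], flat[1::2]))

# Sort by the number, the second element of each pair.
pairs.sort(key=lambda pair: pair[1])
print(pairs)  # [('a', 2), ('b', 5), ('c', 10)]

# Flatten back into a single list if that shape is needed.
flattened = [item for pair in pairs for item in pair]
print(flattened)  # ['a', 2, 'b', 5, 'c', 10]
```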

I assumed the elements are separate items in the list. The following code might help; you can keep or remove the print statement in the second loop, as you wish.
x = ["a", 2, "c", 10, "b", 5]
numbers = []
letters = []
for element in x:
    try:
        numbers.append(int(element))
    except ValueError:
        letters.append(str(element))
numbers.sort()
letters.sort()
# Reverse so that pop() returns the smallest remaining item
numbers.reverse()
letters.reverse()
for index, item in enumerate(x):
    try:
        print(int(item), end=" ")
        x[index] = numbers.pop()
    except ValueError:
        x[index] = letters.pop()
print("\n" + str(x))
