Python: How to make an array of arrays of pairs? - python

I want to make an array which contains arrays of pairs (C++ like pairs) of different sizes, any idea how to do it in python?
To be more specific, I need to know the equivalent to this C++ code in python:
vector<vector<pair<int, int>>> my_vector;

You can implement pairs with tuples, which are created by separating elements with commas, usually enclosed in parentheses for readability.
For the vectors, you can use lists, which let you add and remove elements as you see fit. Lists are created with comma-separated elements enclosed in square brackets.
An example of implementing your type of structure would be:
pair1 = (0, 1)
pair2 = (4, 3)
inner_vector1 = [pair1, pair2]
inner_vector2 = [pair2]
inner_vector2.append(pair1)
outer_vector = [inner_vector1, inner_vector2]
Which results in the object:
[[(0, 1), (4, 3)], [(4, 3), (0, 1)]]
Which can be visualized as:
outer_vector (list)
{
    inner_vector1 (list)
    {
        pair1, (tuple)
        pair2 (tuple)
    },
    inner_vector2 (list)
    {
        pair2, (tuple)
        pair1 (tuple)
    }
}
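As a rough sketch of how this maps back to the C++ declaration, the structure can also be annotated with type hints. The alias name Pair and the sample values are only illustrative, and the hints are optional: Python does not enforce them at runtime.

```python
from typing import List, Tuple

# Hypothetical alias standing in for C++'s pair<int, int>.
Pair = Tuple[int, int]

# Roughly: vector<vector<pair<int, int>>> my_vector;
my_vector: List[List[Pair]] = []

my_vector.append([(0, 1), (4, 3)])  # inner list with two pairs
my_vector.append([(4, 3)])          # inner list with one pair

# Inner lists may have different sizes.
sizes = [len(inner) for inner in my_vector]

# Tuples can be unpacked directly in a loop, much like .first/.second in C++.
total = sum(first + second for inner in my_vector for first, second in inner)
```

Here sizes holds the length of each inner list and total adds up every element of every pair.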

my_vector = [] # empty Python list; the inner lists are appended below
my_vector.append([(3,5),(-2,1)]) # my_vector = [[(3,5),(-2,1)]]
my_vector.append([(8,4)]) # my_vector = [[(3,5),(-2,1)],[(8,4)]]
my_vector[1].append((-1,1)) # my_vector = [[(3,5),(-2,1)],[(8,4),(-1,1)]]
my_vector[0].pop() # my_vector = [[(3,5)],[(8,4),(-1,1)]]
my_vector.pop() # my_vector = [[(3,5)]]
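append() is similar to the push_back() method of a C++ vector.
pop() with no argument removes the last element, like pop_back() on a C++ vector (unlike pop_back(), it also returns the removed element).
A tuple literal such as (3, 5) takes the place of make_pair() for a C++ pair.
Hope this clears it up.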

In Python, the equivalent of a vector is called a list.
To construct an empty list, use
l = []
Note that ll = [[]] is not an empty list of lists: it is an outer list that already contains one empty inner list.
To build a list of lists of tuples, start from such a list of lists: lll = [[]]. Then construct a pair, which in Python is a tuple, say t = (3, 9), and append it to the first inner list with lll[0].append(t).
Printing lll then gives [[(3, 9)]].
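If you need several empty inner lists up front, a minimal sketch might look like the following. The names n, rows and shared are made up for illustration. It also shows a common pitfall: multiplying [[]] repeats a reference to the same inner list instead of creating independent ones.

```python
n = 3

# One independent empty inner list per slot.
rows = [[] for _ in range(n)]
rows[0].append((3, 9))

# Pitfall: this repeats a reference to ONE inner list n times.
shared = [[]] * n
shared[0].append((3, 9))

print(rows)    # only the first inner list received the tuple
print(shared)  # every slot shows the tuple, because all slots are the same list
```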

Related

How do I find the intersection of a set of tuples while ignoring the last element of the tuples?

Edit: Solved
I've solved this by creating dictionaries a and b where the keys are the tuples (x,y) and my values are the integers t. I then return my keys as sets, take the built-in intersection, then get the values for all intersecting (x,y) points.
a = {(x, y): t, ...}
b = {(x, y): t, ...}
c = set([*a]).intersection(set([*b]))
for each in c:
    val_a = a.get(each)
    val_b = b.get(each)
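A runnable sketch of this idea, using the sample data from the question, could look like this. The names a_by_xy, b_by_xy, common and result are illustrative, and it assumes each (x, y) occurs at most once per set, since a dictionary keeps only one t per key.

```python
a = {(1, 2, 5), (4, 6, 7)}
b = {(1, 2, 7), (5, 5, 3)}

# Re-key each set by (x, y), keeping t as the value.
# Assumption: no (x, y) appears twice within the same set.
a_by_xy = {(x, y): t for x, y, t in a}
b_by_xy = {(x, y): t for x, y, t in b}

# Dictionary key views support set operations directly.
common = a_by_xy.keys() & b_by_xy.keys()

# Rebuild the full 3-tuples from both sides for every shared (x, y).
result = {
    (x, y, lookup[(x, y)])
    for lookup in (a_by_xy, b_by_xy)
    for (x, y) in common
}
print(result)
```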
Original Question
I have two sets of tuples, each of the form
a = {(x,y,t), (x,y,t), ...}
b = {(x,y,t), (x,y,t), ...}
I'd like to find the "intersection" of a and b while ignoring the t element of the tuples.
For example:
a = {(1,2,5), (4,6,7)}
b = {(1,2,7), (5,5,3)}
c = a.magicintersection(b,'ignore-last-element-of-tuple-magic-keyword')
where c, the desired output, would yield {(1,2,5), (1,2,7)}.
I'd like to leverage the built-in intersection function rather than writing my own (horribly inefficient) function but I can't see a way around this.
You can't use the built-in intersection methods for that. You also can't attach functions to built-in types:
def magic_intersect(x):
    pass
set.mi = magic_intersect
results in
    set.mi = magic_intersect
TypeError: can't set attributes of built-in/extension type 'set'
You could put them all into a dictionary keyed by the first two elements of each tuple, with a set (or list) of all matching tuples as the value, to get the result:
a = {(1,2,5), (4,6,7)}
b = {(1,2,7), (5,5,3)}
from collections import defaultdict
d = defaultdict(set)
for x in (a,b):
    for s in x:
        d[(s[0],s[1])].add(s)
print(d)
print(d.get( (1,2) )) # get all tuples that start with (1,2,_)
Output:
defaultdict(<class 'set'>, {
(4, 6): {(4, 6, 7)},
(1, 2): {(1, 2, 5), (1, 2, 7)},
(5, 5): {(5, 5, 3)}})
{(1, 2, 5), (1, 2, 7)}
But that is only going to be worth it if you need to query for those multiple times and do not need to put millions of sets in them.
The actual "lookup" of which 2-tuple has which 3-tuples is O(1), but you need space and time to build the dictionary.
This approach loses the information about which set of tuples each item came from. If you need to preserve that as well, you would have to store it too, somehow.
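One possible way to keep track of the origin, sketched under the assumption that labelling the two inputs "a" and "b" is acceptable (the names by_xy and both are hypothetical), is to store a small per-source dictionary under each key:

```python
from collections import defaultdict

a = {(1, 2, 5), (4, 6, 7)}
b = {(1, 2, 7), (5, 5, 3)}

# For every (x, y) key, remember which 3-tuples came from which input set.
by_xy = defaultdict(lambda: {"a": set(), "b": set()})
for label, tuples in (("a", a), ("b", b)):
    for tup in tuples:
        by_xy[tup[:2]][label].add(tup)

# Keep only the keys that occur in both inputs.
both = {xy: groups for xy, groups in by_xy.items() if groups["a"] and groups["b"]}
print(both)
```

This also avoids counting two tuples from the same input set as an "intersection".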
What would your "intersection" have as a result, if the third component varies?
Anyway, the way to do this is to have a dictionary where the key is a tuple with the components of interest. The dictionary values can be lists with all matching 3-tuples, and then you can select just those which have more than one element.
This is not inefficient: you only have to walk each set once, so it is O(M + N). Even if you have a lot of lists and thousands of tuples with the same x, y, building the dictionary only appends each matching tuple to a list, which is O(1).
matches = {}
for series_of_tuples in (a, b):
    for tup in series_of_tuples:  # named tup to avoid shadowing the built-in tuple
        matches.setdefault(tup[:2], []).append(tup)
intersection = [values for values in matches.values() if len(values) > 1]

How do I implement a Schwartzian Transform in Python?

In Perl I sometimes use the Schwartzian Transform to efficiently sort complex arrays:
@sorted = map { $_->[0] } # sort by word length
sort { $a->[1] <=> $b->[1] } # use numeric comparison
map { [$_, length($_)] } # calculate the length of the string
@unsorted;
How to implement this transform in Python?
You don't need to. Python has this feature built in, and in fact Python 3 removed C-style custom comparisons because this is so much better in the vast majority of cases.
To sort by word length:
unsorted.sort(key=lambda item: len(item))
Or, because len is already a unary function:
unsorted.sort(key=len)
This also works with the built-in sorted function.
If you want to sort on multiple criteria, you can take advantage of the fact that tuples sort lexicographically:
# sort by word length, then alphabetically in case of a tie
unsorted.sort(key=lambda item: (len(item), item))
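As a small illustrative sketch (the word list is made up), the same tuple-key trick can mix directions, for example longest words first with alphabetical order as the tie-breaker, by negating the numeric part of the key:

```python
words = ["pear", "fig", "apple", "kiwi"]

# Negating the length sorts longest first; the word itself breaks ties alphabetically.
by_length_desc = sorted(words, key=lambda item: (-len(item), item))
print(by_length_desc)
```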
While there should normally be no reason not to use the key argument for the sorted function or list.sort method, you can of course do without it, by creating a list of pairs (called tmp below) where the first item is the sort key and the second item is the original item.
Due to lexicographical sorting, sorting this list will sort by the key first. Then you can take the items in the desired order from the sorted tmp list of pairs.
example = ["hello", "spam", "foo"]
tmp = []
for item in example:
    sort_key = len(item)
    tmp.append((sort_key, item))
# tmp: [(5, "hello"), (4, "spam"), (3, "foo")]
tmp.sort()
# tmp: [(3, "foo"), (4, "spam"), (5, "hello")]
result = []
for _, item in tmp:
    result.append(item)
# result: ["foo", "spam", "hello"]
Note that usually this would be written with list comprehensions instead of calling .append in a loop, but the purpose of this answer is to illustrate the underlying algorithm in a way most likely to be understood by beginners.
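For comparison, a sketch of the list-comprehension form might look like this. Adding the original index as a second tuple element is an extra step not used above: it keeps equal keys in their original order and means the items themselves never have to be compared.

```python
example = ["hello", "spam", "foo"]

# Decorate: (sort key, original position, item).
decorated = [(len(item), index, item) for index, item in enumerate(example)]

# Sort: tuples compare element by element, so the key decides first.
decorated.sort()

# Undecorate: keep only the original items.
result = [item for _, _, item in decorated]
print(result)
```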

string multi index replace without affecting the insert order?

Suppose I have a string value str=xxx. Now I want to replace it via multi-index, such as {(1, 3):'rep1', (4, 7):'rep2', (8, 9):'rep3'}, without disturbing the index order. How can I do it?
Pseudo-code (in Python 2.7):
str = 'abcdefghij'
replacement = {(1, 3):'123', (4, 7):'+', (8, 9):'&'} # the right index is exclusive
# after some replacements:
str = some_replace(str, replacement)
# i want to get this:
print str
# 'a123d+h&j'
# since string is immutable, make a list out of the string to modify in-place
lst = list(str)
Use slice assignment to modify the items. slice(*k) creates a slice object from a key of the replacement dict. For instance, slice(*(1, 3)) gives slice(1, 3), which is equivalent to lst[1:3] when used as an index; assigning to that slice replaces the corresponding elements with the characters of the replacement value:
# Sort the index ranges in reverse order so that earlier replacements do not shift
# the positions of ranges still to be processed when the replacement length differs
for k, v in sorted(replacement.items(), reverse=True):
    lst[slice(*k)] = v
''.join(lst)
# 'a123d+h&j'
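Put together as a function, a minimal sketch could look like this. It borrows the name some_replace from the question, is written for Python 3, and assumes the index ranges do not overlap.

```python
def some_replace(text, replacement):
    """Replace each (start, stop) range of text with its string; stop is exclusive."""
    chars = list(text)  # strings are immutable, so work on a list of characters
    # Process ranges from right to left so earlier positions stay valid.
    for (start, stop), new in sorted(replacement.items(), reverse=True):
        chars[start:stop] = new
    return "".join(chars)


replaced = some_replace("abcdefghij", {(1, 3): "123", (4, 7): "+", (8, 9): "&"})
print(replaced)
```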

Modifying every tuple (and eventually just removing) from a list of tuples

I have a list of tuples like this:
tradeRanges = [(0,3), (10,14), (16,16), (21,23), (25,25)]
What I would like to do is:
Take every tuple and compute the difference between its two numbers;
If this difference is non-zero, append it to the tuple as a third element; if it is zero, remove the tuple from the list.
The final output, hence, would be this:
tradeRanges = [(0,3,3), (10,14,4), (21,23,2)]
With this purpose I have tried to write the following script:
for tups in tradeRanges:
    tradeRanges.remove(tups)
    tups = list(tups)
    lenTup = tups[1]-tups[0]
    if lenTup > 0:
        tups.append(lenTup) #so when it's done I would have the list into the same order
        tups = tuple(tups)
        tradeRanges.append(tups)
The problem here is that it skips the elements. When it gets the element (0,3) and remove it, rather than saving in memory the element (10,14) it will save the element (16,16). I have a vague idea of why this happens (probably the for loop is taking care of the positioning of the elements?) but I have no clue how to fix it. Is there any elegant way or I should use some control variables to take into account the position of the elements within the list?
tradeRanges = [(0,3), (10,14), (16,16), (21,23), (25,25)]
print([(n1, n2, abs(n1-n2)) for n1, n2 in tradeRanges if n1 != n2])
# [(0, 3, 3), (10, 14, 4), (21, 23, 2)]
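The original loop skips elements because removing an item from a list while iterating over it shifts the remaining items left, so the iterator's next position lands one element further on. A sketch of an explicit loop that avoids this by building a new list (the name result is illustrative) would be:

```python
tradeRanges = [(0, 3), (10, 14), (16, 16), (21, 23), (25, 25)]

# Build a new list instead of mutating the one being iterated over.
result = []
for start, end in tradeRanges:
    length = end - start
    if length > 0:  # drop ranges whose endpoints are equal
        result.append((start, end, length))

tradeRanges = result
print(tradeRanges)
```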

Find duplicate items within a list of list of tuples Python

I want to find the matching items in the list given below. My list may be very large.
The first item in the tuple, "N1_10", is duplicated and matches an item in another inner list:
tuple in 1st array in the ListA ('N1_10', 'N2_28')
tuple in 2nd array in the ListA ('N1_10', 'N3_98')
ListA = [[('N1_10', 'N2_28'), ('N1_35', 'N2_44')],
[('N1_22', 'N3_72'), ('N1_10', 'N3_98')],
[('N2_33', 'N3_28'), ('N2_55', 'N3_62'), ('N2_61', 'N3_37')]]
what I want for the output is
output --> [('N1_10','N2_28','N3_98') , .... and the rest whatever match one of the
key will get into same tuple]
If you think changing the data structure of ListA is a better option, please feel free to advise!
Thanks for helping out!
SIMPLIFIED VERSION
List A = [[(a,x),(b,k),(c,l),(d,m)],[(e,d),(a,p),(g,s)],[...],[...]....]
wantedOutput --> [(a,x,p),(b,k),(c,l),(d,m,e),(g,s).....]
Update: After rereading your question, it appears that you're trying to create equivalence classes, rather than collecting values for keys. If
[[(1, 2), (3, 4), (2, 3)]]
should become
[(1, 2, 3, 4)]
, then you're going to need to interpret your input as a graph and apply a connected components algorithm. You could turn your data structure into an adjacency list representation and traverse it with a breadth-first or depth-first search, or iterate over your list and build disjoint sets. In either case, your code is going to suddenly involve a lot of graph-related complexity, and it'll be hard to provide any output ordering guarantees based on the order of the input. Here's an algorithm based on a breadth-first search:
import collections
# build an adjacency list representation of your input
graph = collections.defaultdict(set)
for l in ListA:
    for first, second in l:
        graph[first].add(second)
        graph[second].add(first)
# breadth-first search the graph to produce the output
output = []
marked = set() # a set of all nodes whose connected component is known
for node in graph:
    if node not in marked:
        # this node is not in any previously seen connected component
        # run a breadth-first search to determine its connected component
        frontier = set([node])
        connected_component = []
        while frontier:
            marked |= frontier
            connected_component.extend(frontier)
            # find all unmarked nodes directly connected to frontier nodes
            # they will form the new frontier
            new_frontier = set()
            for node in frontier:
                new_frontier |= graph[node] - marked
            frontier = new_frontier
        output.append(tuple(connected_component))
Don't just copy this without understanding it, though; understand what it's doing, or write your own implementation. You'll probably need to be able to maintain this. (I would've used pseudocode, but Python is practically as simple as pseudocode already.)
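The disjoint-set alternative mentioned above could be sketched roughly as follows. The function names find and connected_groups are made up for this example, and each group is sorted only to make the result predictable.

```python
def find(parent, x):
    """Return the representative of x's group, shortening the path on the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x


def connected_groups(list_of_pair_lists):
    parent = {}
    for pairs in list_of_pair_lists:
        for first, second in pairs:
            parent.setdefault(first, first)
            parent.setdefault(second, second)
            root_a, root_b = find(parent, first), find(parent, second)
            if root_a != root_b:
                parent[root_b] = root_a  # merge the two groups
    # Collect every node under its final representative.
    groups = {}
    for node in parent:
        groups.setdefault(find(parent, node), []).append(node)
    return [tuple(sorted(group)) for group in groups.values()]


ListA = [[('N1_10', 'N2_28'), ('N1_35', 'N2_44')],
         [('N1_22', 'N3_72'), ('N1_10', 'N3_98')],
         [('N2_33', 'N3_28'), ('N2_55', 'N3_62'), ('N2_61', 'N3_37')]]
print(connected_groups(ListA))
```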
In case my original interpretation of your question was correct, and your input is a collection of key-value pairs that you want to aggregate, here's my original answer:
Original answer
import collections
clusterer = collections.defaultdict(list)
for l in ListA:
    for k, v in l:
        clusterer[k].append(v)
output = clusterer.values()
defaultdict(list) is a dict that automatically creates a list as the value for any key that wasn't already present. The loop goes over all the tuples, collecting all values that match up to the same key, then creates a list of (key, value_list) pairs from the defaultdict.
(The output of this code is not quite in the form you specified, but I believe this form is more useful. If you want to change the form, that should be a simple matter.)
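If the flat tuples from the question are wanted, one possible conversion, assuming Python 3.7+ so that dictionaries keep insertion order, is to unpack each value list after its key:

```python
import collections

ListA = [[('N1_10', 'N2_28'), ('N1_35', 'N2_44')],
         [('N1_22', 'N3_72'), ('N1_10', 'N3_98')],
         [('N2_33', 'N3_28'), ('N2_55', 'N3_62'), ('N2_61', 'N3_37')]]

clusterer = collections.defaultdict(list)
for pairs in ListA:
    for key, value in pairs:
        clusterer[key].append(value)

# One tuple per key: the key followed by all of its collected values.
output = [(key, *values) for key, values in clusterer.items()]
print(output)
```

Note that this only groups by the first element of each pair; it does not merge groups that are linked through a second element.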
Does output order matter? This is the simplest way I could think of:
ListA = [[('N1_10', 'N2_28'), ('N1_35', 'N2_44')],[('N1_22', 'N3_72'), ('N1_10', 'N3_98')],
         [('N2_33', 'N3_28'), ('N2_55', 'N3_62'), ('N2_61', 'N3_37')]]
idx = dict()
for sublist in ListA:
    for pair in sublist:
        for item in pair:
            mapping = idx.get(item,set())
            mapping.update(pair)
            idx[item] = mapping
            for subitem in mapping:
                submapping = idx.get(subitem,set())
                submapping.update(mapping)
                idx[subitem] = submapping
for x in set([frozenset(x) for x in idx.values()]):
    print(list(x))
Output:
['N3_72', 'N1_22']
['N2_28', 'N3_98', 'N1_10']
['N2_61', 'N3_37']
['N2_33', 'N3_28']
['N2_55', 'N3_62']
['N2_44', 'N1_35']
tupleList = [(1, 2), (3, 4), (1, 4), (3, 2), (1, 2), (7, 9), (9, 8), (5, 6)]
newSetSet = set ([frozenset (aTuple) for aTuple in tupleList])
setSet = set ()
while newSetSet != setSet:
    print ('*')
    setSet = newSetSet
    newSetSet = set ()
    for set0 in setSet:
        merged = False
        for set1 in setSet:
            if set0 & set1 and set0 != set1:
                newSetSet.add (set0 | set1)
                merged = True
        if not merged:
            newSetSet.add (set0)
    print ([tuple (element) for element in setSet])
    print ([tuple (element) for element in newSetSet])
    print ()
print ([tuple (element) for element in newSetSet])
# Result: [(1, 2, 3, 4), (5, 6), (8, 9, 7)]
