tree where elements can be moved around - python

I am looking for a data structure that allows moving of sub-trees.
I have a list of elements, e.g. [('A', 'B'), ('C', 'D'), ('B', 'C')].
When I use a normal tree I only get:
A--B
C--D
B--C
But I would like to have:
+--B
|
A
|
+--C---D
Is there a known data structure to do this?
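One possible approach, sketched here under the assumption that each pair means (parent, child): keep a child-to-parent dictionary, so that moving a sub-tree is a single reassignment. The helper names `build_parent_map`, `move_subtree` and `children_of` are made up for this illustration and do not come from any library.

```python
def build_parent_map(edges):
    """Map each child to its parent; assumes every pair is (parent, child)."""
    return {child: parent for parent, child in edges}


def move_subtree(parent_of, node, new_parent):
    """Re-attach `node` (and everything below it) under `new_parent`."""
    # Walk up from new_parent; if we meet `node`, the move would create a loop.
    current = new_parent
    while current is not None:
        if current == node:
            raise ValueError("cannot move a node under its own descendant")
        current = parent_of.get(current)
    parent_of[node] = new_parent


def children_of(parent_of, node):
    """Return the direct children of `node`, sorted for stable output."""
    return sorted(child for child, parent in parent_of.items() if parent == node)


edges = [('A', 'B'), ('C', 'D'), ('B', 'C')]
parent_of = build_parent_map(edges)   # chain A -> B -> C -> D
move_subtree(parent_of, 'C', 'A')     # C (with D below it) now hangs off A
print(children_of(parent_of, 'A'))
print(children_of(parent_of, 'C'))
```

After the move, A has the children B and C, and D stays attached to C, which is the shape drawn in the question.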

Related

Select first item in each list

Here is my list:
[(('A', 'B'), ('C', 'D')), (('E', 'F'), ('G', 'H'))]
Basically, I'd like to get:
[('A', 'C'), ('E', 'G')]
So, I'd like to select first elements from the lowest-level lists and build mid-level lists with them.
====================================================
Additional explanation below:
I could just zip them by
list(zip([w[0][0] for w in list1], [w[1][0] for w in list1]))
But later I'd like to add a condition: the second elements in the lowest level lists must be 'B' and 'D' respectively, so the final outcome should be:
[('A', 'C')] # ('E', 'G') must be sorted out
I'm a beginner, but can't find the case anywhere... Would be grateful for help.
I'd do it the following way:
data = [(('A', 'B'), ('C', 'D')), (('E', 'F'), ('G', 'H'))]
out = []
for group in data:
    firsts = []
    for pair in group:
        # keep only the first element of each innermost tuple
        firsts.append(pair[0])
    out.append((firsts[0], firsts[1]))
print(out)
I hope that's what you're looking for.
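The same selection can also be written as a comprehension, and the extra condition from the question (second elements must be 'B' and 'D') fits in as a filter. This is only a sketch; the names `data`, `firsts`, `wanted_seconds` and `filtered` are chosen here for illustration.

```python
data = [(('A', 'B'), ('C', 'D')), (('E', 'F'), ('G', 'H'))]

# First element of every innermost tuple, grouped per mid-level tuple.
firsts = [tuple(pair[0] for pair in group) for group in data]

# Keep a group only if its second elements are exactly ('B', 'D').
wanted_seconds = ('B', 'D')
filtered = [
    tuple(pair[0] for pair in group)
    for group in data
    if tuple(pair[1] for pair in group) == wanted_seconds
]

print(firsts)    # [('A', 'C'), ('E', 'G')]
print(filtered)  # [('A', 'C')]
```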

write the elements of list to file

bigram is a list that looks like this:
[('a', 'b'), ('b', 'b'), ('b', 'b'), ('b', 'c'), ('c', 'c'), ('c', 'c'), ('c', 'd'), ('d', 'd'), ('d', 'e')]
Now I am trying to write each element of the list as a separate line in a file with this code:
bigram = list(nltk.bigrams(s.split()))
outfile1.write("%s" % ''.join(ele) for ele in bigram)
but I am getting this error :
TypeError: write() argument must be str, not generator
I want the file to contain:
('a', 'b')
('b', 'b')
('b', 'b')
('b', 'c')
('c', 'c')
......
You're passing a generator expression to write, which needs a string.
If I understand correctly, you want to write one tuple representation per line.
You can achieve that with:
outfile1.write("".join('{}\n'.format(ele) for ele in bigram))
or
outfile1.writelines('{}\n'.format(ele) for ele in bigram)
The second version passes a generator expression to writelines, which avoids building one big string in memory before writing it (and looks more like your attempt).
It produces a file with this content:
('a', 'b')
('b', 'b')
('b', 'b')
('b', 'c')
('c', 'c')
('c', 'c')
('c', 'd')
('d', 'd')
('d', 'e')
Try this:
outfile1.writelines("{}\n".format(ele) for ele in bigram)
This is about how the expression is grouped.
The argument you pass is parsed like this, which is a generator expression:
("%s" % ''.join(ele)) for ele in bigram
It is not parsed like this, which would be a single string:
"%s" % (''.join(ele) for ele in bigram)
So write() receives a generator, not a string.
You need to call write on each element the generator yields.
If you want to write each pair on a separate line, you have to add line separators explicitly. The easiest, to my mind, is an explicit loop (%r keeps the quotes around each element, as in your expected output):
for pair in bigram:
    outfile1.write("(%r, %r)\n" % pair)
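To try the writelines approach without touching the disk, an in-memory text buffer can stand in for the file. This is a small sketch; `buffer` and the shortened `bigram` list are stand-ins for the question's `outfile1` and real data.

```python
import io

bigram = [('a', 'b'), ('b', 'b'), ('b', 'c')]

# io.StringIO behaves like a text file opened for writing.
buffer = io.StringIO()
buffer.writelines("{}\n".format(pair) for pair in bigram)

text = buffer.getvalue()
print(text, end="")
```

Each tuple is formatted with its default string form, so the quotes and the comma separator appear exactly as in the expected output.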

"Transpose" (rotate?) nested list

I have a list of lists of lists like this:
[
[
[a,b],
[c,d]
],
[
[e,f],
[g,h]
]
]
Basically, this is a cube of values.
What I want is a different order of items in the same cube, like this:
[
[
[a,e],
[b,f]
],
[
[c,g],
[d,h]
]
]
And, preferably, in a one-liner (yes, I do know that's not the best practice).
I know of the map(list, zip(*a)) trick, but I couldn't figure out how to apply it here. Something with lambdas and maps, probably?
UPD: As for what I need it for --- I've done some tests for speeds of different sorting algorithms; each deepest list has values -- the times that the sorting algorithms that I tested took. These lists are in lists, which represent different types of tests, and the outer list has the same thing repeated for different test sizes. After such rotation, I will have list (test size) of lists (test type) of lists (sort type) of items (time), which is so much more convenient to plot.
If I understand you correctly, you want to first transpose all the sublists then transpose the newly transposed groups:
print([list(zip(*sub)) for sub in zip(*l)])
Output:
In [69]: [list(zip(*sub)) for sub in zip(*l)]
Out[69]: [[('a', 'e'), ('b', 'f')], [('c', 'g'), ('d', 'h')]]
If you want some map foo with a lambda:
In [70]: list(map(list, map(lambda x: zip(*x), zip(*l))))
Out[70]: [[('a', 'e'), ('b', 'f')], [('c', 'g'), ('d', 'h')]]
For Python 2 you don't need the extra map call, but I would use itertools.izip to do the initial transpose:
In [9]: from itertools import izip
In [10]: map(lambda x: zip(*x), izip(*l))
Out[10]: [[('a', 'e'), ('b', 'f')], [('c', 'g'), ('d', 'h')]]
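For reference, here is a self-contained Python 3 version of the double transpose. The variable names `cube` and `rotated` are chosen here, and the inner tuples are converted back to lists, which is an assumption about the shape wanted for plotting.

```python
cube = [
    [['a', 'b'], ['c', 'd']],
    [['e', 'f'], ['g', 'h']],
]

# zip(*cube) pairs up the rows that sit at the same position in each block;
# zip(*rows) then transposes each of those groups.
rotated = [[list(column) for column in zip(*rows)] for rows in zip(*cube)]

print(rotated)
# [[['a', 'e'], ['b', 'f']], [['c', 'g'], ['d', 'h']]]
```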

Python Shortest path between 2 points

I have found many algorithms and approaches for finding the shortest path between 2 points, but in my situation the data is modeled as:
[(A,B),(C,D),(B,C),(D,E)...] # list of possible paths
Suppose I need the path from A to E; the result should be:
(A,B)=>(B,C)=>(C,D)=>(D,E)
But I can't find a Pythonic way to do this search.
The Pythonic way is to use a module if one exists. In this case networkx is available, so we can write:
Implementation
import networkx as nx
G = nx.Graph([('A','B'),('C','D'),('B','C'),('D','E')])
path = nx.shortest_path(G, 'A', 'E')
Output
list(zip(path, path[1:]))
[('A', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'E')]
If you think of your points as vertices in a graph and your pairs as edges in that graph, then you can assign to each edge a weight equal to the distance between its points.
Framed this way your problem is just the classic shortest path problem.
You asked for a Pythonic way to write it. The only advice I'd give is represent your graph as a dictionary, so that each key is a point, the returned values are a list of the other points directly reachable from that point. That will make traversing the graph faster. graph[C] -> [B, D] for your example.
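A minimal sketch of the dictionary idea, assuming every pair can be walked in both directions and every step costs the same, in which case a breadth-first search already returns a shortest path. The function names `build_graph` and `shortest_path` are made up for this example.

```python
from collections import deque


def build_graph(pairs):
    """Adjacency dictionary: each point maps to the points reachable from it."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)  # assumption: edges are undirected
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns a list of points or None if unreachable."""
    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the predecessor links back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in graph.get(node, []):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


pairs = [('A', 'B'), ('C', 'D'), ('B', 'C'), ('D', 'E')]
path = shortest_path(build_graph(pairs), 'A', 'E')
steps = list(zip(path, path[1:]))
print(steps)  # [('A', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'E')]
```

If the edges carry different weights, breadth-first search is no longer enough and something like Dijkstra's algorithm would be needed instead.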
Here is a solution using A*:
pip install pyformulas==0.2.8
import pyformulas as pf
transitions = [('A', 'B'), ('B', 'C'), ('C', 'A'), ('C', 'F'), ('D', 'F'), ('F', 'D'), ('F', 'B'), ('D', 'E'), ('E', 'C')]
initial_state = ('A',)
def expansion_fn(state):
    valid_transitions = [tn for tn in transitions if tn[0] == state[-1]]
    step_costs = [1 for t in valid_transitions]
    return valid_transitions, step_costs
def goal_fn(state):
    return state[-1] == 'E'
path = pf.discrete_search(initial_state, expansion_fn, goal_fn) # A*
print(path)
Output:
[('A',), ('A', 'B'), ('B', 'C'), ('C', 'F'), ('F', 'D'), ('D', 'E')]

Optimal strategy for choosing pairs from a list of combinations

Question: Can someone help me figure out how to calculate cycles that have the maximum number of pairs (three per cycle, see the last example)?
This is what I want to do:
-> pair two users every cycle such that
- each user is paired with only one other user in a given cycle
- each user is paired with every other user at most once across all cycles
Real world:
You meet one new person from a list every week (week = cycle).
You never meet the same person again.
Every user is matched to someone else per week
This is my problem:
I'm able to create combinations of users and select pairs of users that never have met. However, sometimes I'm able to only match two pairs in a cycle instead of three. Therefore,
I'm searching for a way to create the optimal selections from a list of combinations.
1) I start with 6 users:
users = ["A","B","C","D","E","F"]
2) From this list, I create possible combinations:
candidates = []
x = itertools.combinations(users, 2)
for i in x:
    candidates.append(i)
This gives me:
. A,B A,C A,D A,E A,F
. . B,C B,D B,E B,F
. . . C,D C,E C,F
. . . . D,E D,F
. . . . . E,F
or
candidates = [('A', 'B'), ('A', 'C'), ('A', 'D'), ('A', 'E'), ('A', 'F'), ('B', 'C'),
('B', 'D'), ('B', 'E'), ('B', 'F'), ('C', 'D'), ('C', 'E'), ('C', 'F'),
('D', 'E'), ('D', 'F'), ('E', 'F')]
3) Now, I would like to select pairs from this list, such that a user (A to F) is only present once & all users are paired with someone in this cycle
Example:
cycle1 = ('A','B'), ('C','D'), ('E','F')
Next cycle, I want to find another set of three pairs.
I calculated that with 6 users there should be 5 cycles with 3 pairs each:
Example:
cycle 1: AF BC DE
cycle 2: AB CD EF
cycle 3: AC BE DF
cycle 4: AE BD CF
cycle 5: AD BF CE
Can someone help me figure out how to calculate cycles that have the maximum number of pairs (three per cycle, see the last example)?
Like Whatang mentioned in the comments your problem is in fact equivalent to that of creating a round-robin style tournament. This is a Python version of the algorithm mentioned on the Wikipedia page, see also this and this answer.
def schedule(users):
    # first make copy or convert to list with length `n`
    users = list(users)
    n = len(users)
    # add dummy for uneven number of participants
    if n % 2:
        users.append('_')
        n += 1
    cycles = []
    for _ in range(n-1):
        # "folding", `//` for integer division
        c = zip(users[:n//2], reversed(users[n//2:]))
        cycles.append(list(c))
        # rotate, fixing user 0 in place
        users.insert(1, users.pop())
    return cycles
schedule(['A', 'B', 'C', 'D', 'E', 'F'])
For your example it produces the following:
[[('A', 'F'), ('B', 'E'), ('C', 'D')],
[('A', 'E'), ('F', 'D'), ('B', 'C')],
[('A', 'D'), ('E', 'C'), ('F', 'B')],
[('A', 'C'), ('D', 'B'), ('E', 'F')],
[('A', 'B'), ('C', 'F'), ('D', 'E')]]
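To double-check a schedule like the one above against the two rules from the question, a small validator can help. This is a sketch that assumes an even number of users (no dummy entry); `is_valid_schedule` is a name invented for this example.

```python
from itertools import combinations


def is_valid_schedule(users, cycles):
    """True if every cycle pairs each user exactly once and, across all
    cycles, every possible pair of users meets exactly once."""
    seen = set()
    for cycle in cycles:
        members = [user for pair in cycle for user in pair]
        if sorted(members) != sorted(users):
            return False  # someone is missing or appears twice in this cycle
        for pair in cycle:
            key = frozenset(pair)  # ('F', 'D') and ('D', 'F') are the same pair
            if key in seen:
                return False  # these two users already met in an earlier cycle
            seen.add(key)
    return len(seen) == len(list(combinations(users, 2)))


users = ['A', 'B', 'C', 'D', 'E', 'F']
cycles = [
    [('A', 'F'), ('B', 'E'), ('C', 'D')],
    [('A', 'E'), ('F', 'D'), ('B', 'C')],
    [('A', 'D'), ('E', 'C'), ('F', 'B')],
    [('A', 'C'), ('D', 'B'), ('E', 'F')],
    [('A', 'B'), ('C', 'F'), ('D', 'E')],
]
print(is_valid_schedule(users, cycles))  # True
```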
Here's an itertools-based solution:
import itertools
def hasNoRepeats(matching):
    flattenedList = list(itertools.chain.from_iterable(matching))
    flattenedSet = set(flattenedList)
    return len(flattenedSet) == len(flattenedList)
def getMatchings(users, groupSize=2):
    # Get all possible pairings of users
    pairings = list(itertools.combinations(users, groupSize))
    # Get all possible groups of pairings of the correct size, then filter to eliminate groups of pairings where a user appears more than once
    # (list() and // keep this working on Python 3 as well)
    possibleMatchings = list(filter(hasNoRepeats, itertools.combinations(pairings, len(users) // groupSize)))
    # Select a series of the possible matchings, making sure no users are paired twice, to create a series of matching cycles.
    cycles = [possibleMatchings.pop(0)]
    for matching in possibleMatchings:
        # pairingsToDate represents a flattened list of all pairs made in cycles so far
        pairingsToDate = list(itertools.chain.from_iterable(cycles))
        # The following checks to make sure there are no pairs in matching (the group of pairs being considered for this cycle) that have occurred in previous cycles (pairingsToDate)
        if not any(pair in pairingsToDate for pair in matching):
            # Ok, 'matching' contains only pairs that have never occurred so far, so we'll add 'matching' as the next cycle
            cycles.append(matching)
    return cycles
# Demo:
users = ["A","B","C","D","E","F"]
matchings = getMatchings(users, groupSize=2)
for matching in matchings:
    print(matching)
output:
(('A', 'B'), ('C', 'D'), ('E', 'F'))
(('A', 'C'), ('B', 'E'), ('D', 'F'))
(('A', 'D'), ('B', 'F'), ('C', 'E'))
(('A', 'E'), ('B', 'D'), ('C', 'F'))
(('A', 'F'), ('B', 'C'), ('D', 'E'))
Python 2.7. It's a little brute-forcey, but it gets the job done.
Ok this is pseudo code, but it should do the trick
while length(candidates) > length(users)/2 do
{
    (pairs, candidates) = selectPairs(candidates, candidates)
    if (length(pairs) == length(users)/2)
        cycles.append(pairs)
}
selectPairs(ccand, cand)
{
    if notEmpty(ccand) then
        cpair = cand[0]
        ncand = remove(cpair, cand)
        nccand = removeOccurences(cpair, ncand)
        (pairs, tmp) = selectPairs(nccand, ncand)
        return (pairs.append(cpair), tmp)
    else
        return ([], cand)
}
where:
remove(x, xs): removes x from xs
removeOccurences(x, xs): removes every pair of xs containing at least one element of the pair x
EDIT: the condition to stop the algorithm may need further thought ...
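Since a purely greedy selection can get stuck, one alternative is a backtracking search, which undoes a choice when it leads to a dead end. The sketch below is not a translation of the pseudocode above but a different technique; it assumes an even number of users, and `perfect_matchings` and `find_cycles` are names invented here. It is exponential in the worst case, so it only suits small groups.

```python
def perfect_matchings(users):
    """Yield every way to split `users` into pairs (even length assumed)."""
    if not users:
        yield ()
        return
    first, rest = users[0], users[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for matching in perfect_matchings(remaining):
            yield ((first, partner),) + matching


def find_cycles(users):
    """Backtracking search for len(users) - 1 matchings sharing no pair."""
    users = list(users)
    target = len(users) - 1
    matchings = list(perfect_matchings(users))

    def search(start, used, chosen):
        if len(chosen) == target:
            return chosen
        for index in range(start, len(matchings)):
            matching = matchings[index]
            if used.isdisjoint(matching):
                result = search(index + 1, used | set(matching), chosen + [matching])
                if result is not None:
                    return result
        return None  # dead end: the caller tries its next candidate

    return search(0, frozenset(), [])


cycles = find_cycles(["A", "B", "C", "D", "E", "F"])
for cycle in cycles:
    print(cycle)
```

For six users this finds five cycles of three pairs each, so all fifteen possible pairs are used exactly once.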
