Here is my list:
[(('A', 'B'), ('C', 'D')), (('E', 'F'), ('G', 'H'))]
Basically, I'd like to get:
[('A', 'C'), ('E', 'G')]
So, I'd like to select first elements from the lowest-level lists and build mid-level lists with them.
====================================================
Additional explanation below:
I could just zip them by
list(zip([w[0][0] for w in list1], [w[1][0] for w in list1]))
But later I'd like to add a condition: the second elements in the lowest level lists must be 'B' and 'D' respectively, so the final outcome should be:
[('A', 'C')] # ('E', 'G') must be sorted out
I'm a beginner, but can't find the case anywhere... Would be grateful for help.
I'd do it the following way
list = [(('A', 'B'), ('C', 'D')), (('E', 'F'), ('G', 'H'))]
out = []
for i in list:
listAux = []
for j in i:
listAux.append(j[0])
out.append((listAux[0],listAux[1]))
print(out)
I hope that's what you're looking for.
I am using python regular expressions. I want all colon separated values in a line.
e.g.
input = 'a:b c:d e:f'
expected_output = [('a','b'), ('c', 'd'), ('e', 'f')]
But when I do
>>> re.findall('(.*)\s?:\s?(.*)','a:b c:d')
I get
[('a:b c', 'd')]
I have also tried
>>> re.findall('(.*)\s?:\s?(.*)[\s$]','a:b c:d')
[('a', 'b')]
The following code works for me:
inpt = 'a:b c:d e:f'
re.findall('(\S+):(\S+)',inpt)
Output:
[('a', 'b'), ('c', 'd'), ('e', 'f')]
Use split instead of regex, also avoid giving variable name like keywords
:
inpt = 'a:b c:d e:f'
k= [tuple(i.split(':')) for i in inpt.split()]
print(k)
# [('a', 'b'), ('c', 'd'), ('e', 'f')]
The easiest way using list comprehension and split :
[tuple(ele.split(':')) for ele in input.split(' ')]
#driver values :
IN : input = 'a:b c:d e:f'
OUT : [('a', 'b'), ('c', 'd'), ('e', 'f')]
You may use
list(map(lambda x: tuple(x.split(':')), input.split()))
where
input.split() is
>>> input.split()
['a:b', 'c:d', 'e:f']
lambda x: tuple(x.split(':')) is function to convert string to tuple 'a:b' => (a, b)
map applies above function to all list elements and returns a map object (in Python 3) and this is converted to list using list
Result
>>> list(map(lambda x: tuple(x.split(':')), input.split()))
[('a', 'b'), ('c', 'd'), ('e', 'f')]
Bigram is a list which looks like-
[('a', 'b'), ('b', 'b'), ('b', 'b'), ('b', 'c'), ('c', 'c'), ('c', 'c'), ('c', 'd'), ('d', 'd'), ('d', 'e')]
Now I am trying to wrote each element if the list as a separate line in a file with this code-
bigram = list(nltk.bigrams(s.split()))
outfile1.write("%s" % ''.join(ele) for ele in bigram)
but I am getting this error :
TypeError: write() argument must be str, not generator
I want the result as in file-
('a', 'b')
('b', 'b')
('b', 'b')
('b', 'c')
('c', 'c')
......
you're passing a generator comprehension to write, which needs strings.
If I understand correctly you want to write one representation of tuple per line.
You can achieve that with:
outfile1.write("".join('{}\n'.format(ele) for ele in bigram))
or
outfile1.writelines('{}\n'.format(ele) for ele in bigram)
the second version passes a generator comprehension to writelines, which avoids to create the big string in memory before writing to it (and looks more like your attempt)
it produces a file with this content:
('a', 'b')
('b', 'b')
('b', 'b')
('b', 'c')
('c', 'c')
('c', 'c')
('c', 'd')
('d', 'd')
('d', 'e')
Try this:
outfile1.writelines("{}\n".format(ele) for ele in bigram)
This is the operator precedence problem.
You want an expression like this:
("%s" % ''.join(ele)) for ele in bigram
Instead, you get it interpreted like this, where the part in the parens is indeed a generator:
"%s" % (''.join(ele) for ele in bigram)
Use the explicit parentheses.
Please note that ("%s" % ''.join(ele)) for ele in bigram is itself a generator. You need to call write on each element from it.
If you want to write each pair in a separate line, you have to add line separators explicitly. The easiest, to my mind, is an explicit loop:
for pair in bigram:
outfile.write("(%s, %s)\n" % pair)
Questions: Can someone help me to figure out how to calculate cycles that have the maximum amount of pairs (three per cycle - see last example)?
This is what I want to do:
-> pair two users every cycle such that
- each user is only paired once with an other user in a given cycle
- each user is only paired once with every other user in all cycles
Real world:
You meet one new person from a list every week (week = cycle).
You never meet the same person again.
Every user is matched to someone else per week
This is my problem:
I'm able to create combinations of users and select pairs of users that never have met. However, sometimes I'm able to only match two pairs in a cycle instead of three. Therefore,
I'm searching for a way to create the optimal selections from a list of combinations.
1) I start with 6 users:
users = ["A","B","C","D","E","F"]
2) From this list, I create possible combinations:
x = itertools.combinations(users,2)
for i in x:
candidates.append(i)
This gives me:
. A,B A,C A,D A,E A,F
. . B,C B,D B,E B,F
. . . C,D C,E C,F
. . . . D,E D,F
. . . . . E,F
or
candidates = [('A', 'B'), ('A', 'C'), ('A', 'D'), ('A', 'E'), ('A', 'F'), ('B', 'C'),
('B', 'D'), ('B', 'E'), ('B', 'F'), ('C', 'D'), ('C', 'E'), ('C', 'F'),
('D', 'E'), ('D', 'F'), ('E', 'F')]
3) Now, I would like to select pairs from this list, such that a user (A to F) is only present once & all users are paired with someone in this cycle
Example:
cycle1 = ('A','B'),('C','D') ('E','F')
Next cycle, I want to find an other set of three pairs.
I calculated that with 6 users there should be 5 cycles with 3 pairs each:
Example:
cycle 1: AF BC DE
cycle 2: AB CD EF
cycle 3: AC BE DF
cycle 4: AE BD CF
cycle 5: AD BF CE
Can someone help me to figure out how to calculate cycles that have the maximum amount of pairs (three per cycle - see last example)?
Like Whatang mentioned in the comments your problem is in fact equivalent to that of creating a round-robin style tournament. This is a Python version of the algorithm mentioned on the Wikipedia page, see also this and this answer.
def schedule(users):
# first make copy or convert to list with length `n`
users = list(users)
n = len(users)
# add dummy for uneven number of participants
if n % 2:
users.append('_')
n += 1
cycles = []
for _ in range(n-1):
# "folding", `//` for integer division
c = zip(users[:n//2], reversed(users[n//2:]))
cycles.append(list(c))
# rotate, fixing user 0 in place
users.insert(1, users.pop())
return cycles
schedule(['A', 'B', 'C', 'D', 'E', 'F'])
For your example it produces the following:
[[('A', 'F'), ('B', 'E'), ('C', 'D')],
[('A', 'E'), ('F', 'D'), ('B', 'C')],
[('A', 'D'), ('E', 'C'), ('F', 'B')],
[('A', 'C'), ('D', 'B'), ('E', 'F')],
[('A', 'B'), ('C', 'F'), ('D', 'E')]]
Here's an itertools-based solution:
import itertools
def hasNoRepeats(matching):
flattenedList = list(itertools.chain.from_iterable(matching))
flattenedSet = set(flattenedList)
return len(flattenedSet) == len(flattenedList)
def getMatchings(users, groupSize=2):
# Get all possible pairings of users
pairings = list(itertools.combinations(users, groupSize))
# Get all possible groups of pairings of the correct size, then filter to eliminate groups of pairings where a user appears more than once
possibleMatchings = filter(hasNoRepeats, itertools.combinations(pairings, len(users)/groupSize))
# Select a series of the possible matchings, making sure no users are paired twice, to create a series of matching cycles.
cycles = [possibleMatchings.pop(0)]
for matching in possibleMatchings:
# pairingsToDate represents a flattened list of all pairs made in cycles so far
pairingsToDate = list(itertools.chain.from_iterable(cycles))
# The following checks to make sure there are no pairs in matching (the group of pairs being considered for this cycle) that have occurred in previous cycles (pairingsToDate)
if not any([pair in pairingsToDate for pair in matching]):
# Ok, 'matching' contains only pairs that have never occurred so far, so we'll add 'matching' as the next cycle
cycles.append(matching)
return cycles
# Demo:
users = ["A","B","C","D","E","F"]
matchings = getMatchings(users, groupSize=2)
for matching in matchings:
print matching
output:
(('A', 'B'), ('C', 'D'), ('E', 'F'))
(('A', 'C'), ('B', 'E'), ('D', 'F'))
(('A', 'D'), ('B', 'F'), ('C', 'E'))
(('A', 'E'), ('B', 'D'), ('C', 'F'))
(('A', 'F'), ('B', 'C'), ('D', 'E'))
Python 2.7. It's a little brute-forcey, but it gets the job done.
Ok this is pseudo code, but it should do the trick
while length(candidates) > length(users)/2 do
{
(pairs, candidates) = selectPairs(candidates, candidates)
if(length(pairs) == length(users)/2)
cycles.append(pairs)
}
selectPairs(ccand, cand)
{
if notEmpty(ccand) then
cpair = cand[0]
ncand = remove(cpair, cand)
nccand = removeOccurences(cpair, ncand)
(pairs, tmp) = selectPairs(nccand, ncand)
return (pairs.append(cpair), tmp)
else
return ([],cand)
}
where:
remove(x, xs) remove x from xs
removeOccurences(x, xs) remove every pair of xs containing at least one element of the pair `x
EDIT: the condition to stop the algorithm may need further thought ...