Better ways to find pairs that sum to N - python

Is there a faster way to write this? The function takes a list and a value, and finds the pairs of numeric values in that list that sum to N, without duplicates. I tried to make it faster by using sets instead of the list itself (however, I used count(), which I know is linear time). Any suggestions? I know there is probably a way.
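def pairsum_n(list1, value):
    set1 = set(list1)
    solution = {(min(i, value - i), max(i, value - i)) for i in set1 if value - i in set1}
    # drop (value/2, value/2) unless value/2 occurs at least twice;
    # discard() avoids a KeyError when that pair was never produced
    if list1.count(value/2) < 2:
        solution.discard((value/2, value/2))
    return solution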
"""
Example: value = 10, list1 = [1,2,3,4,5,6,7,8,9]
pairsum_n = { (1,9), (2,8), (3,7), (4,6) }
Example: value = 10, list2 = [5,6,7,5,7,5,3]
pairsum_n = { (5,5), (3,7) }
"""

Your approach is quite good, it just needs a few tweaks to make it more efficient. itertools is convenient, but it's not really suitable for this task because it produces so many unwanted pairs. It's ok if the input list is small, but it's too slow if the input list is large.
We can avoid producing duplicates by looping over the numbers in order, stopping when i >= value/2, after using a set to get rid of dupes.
def pairsum_n(list1, value):
    set1 = set(list1)
    list1 = sorted(set1)
    solution = []
    maxi = value / 2
    for i in list1:
        if i >= maxi:
            break
        j = value - i
        if j in set1:
            solution.append((i, j))
    return solution
Note that the original list1 is not modified. The assignment in this function creates a new local list1. If you do actually want (value/2, value/2) in the output, just change the break condition.
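One possible reading of "change the break condition" is sketched below. It assumes the pair (value/2, value/2) should only be reported when value/2 occurs at least twice in the input, as in the question's second example. The use of collections.Counter and the name pairsum_n_with_half are illustrative choices, not part of the original answer.

```python
from collections import Counter

def pairsum_n_with_half(list1, value):
    # Counter gives both the distinct values and how often each occurs.
    counts = Counter(list1)
    solution = []
    for i in sorted(counts):
        j = value - i
        if i > j:
            break
        if i == j:
            # (value/2, value/2) needs two separate occurrences of value/2.
            if counts[i] >= 2:
                solution.append((i, j))
            break
        if j in counts:
            solution.append((i, j))
    return solution

print(pairsum_n_with_half([5, 6, 7, 5, 7, 5, 3], 10))  # [(3, 7), (5, 5)]
```

The extra cost over the set-based version is only the counting pass, which is linear in the length of the input.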
Here's a slightly more compact version.
def pairsum_n(list1, value):
    set1 = set(list1)
    solution = []
    for i in sorted(set1):
        j = value - i
        if i >= j:
            break
        if j in set1:
            solution.append((i, j))
    return solution
It's possible to condense this further, e.g. using itertools.takewhile, but it will be harder to read and there won't be any improvement in efficiency.
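For illustration, a takewhile-based condensation might look like the sketch below. It is only one way to write it and keeps the same behaviour as the compact version (the pair (value/2, value/2) is never reported).

```python
from itertools import takewhile

def pairsum_n(list1, value):
    set1 = set(list1)
    # takewhile stops at the first i with i >= value - i,
    # which plays the role of the break in the loop version.
    return [(i, value - i)
            for i in takewhile(lambda i: i < value - i, sorted(set1))
            if value - i in set1]

print(pairsum_n([1, 2, 3, 4, 5, 6, 7, 8, 9], 10))  # [(1, 9), (2, 8), (3, 7), (4, 6)]
```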

Try this, running time O(nlogn):
v = [1, 2, 3, 4, 5, 6, 7, 8, 9]
l = 0
r = len(v)-1
def myFunc(v, value):
    ans = []
    # this block searches for the pair (value//2, value//2)
    if value % 2 == 0:
        c = [i for i in v if i == value // 2]
        if len(c) >= 2:
            ans.append((c[0], c[1]))
    v = list(set(v))
    l = 0
    r = len(v)-1
    v.sort()
    while l<len(v) and r >= 0 and l < r:
        if v[l] + v[r] == value:
            ans.append((v[l], v[r]))
            l += 1
            r -= 1
        elif v[l] + v[r] < value:
            l += 1
        else:
            r -= 1
    return list(set(ans))
It is called the two-pointers technique, and it works as follows. First, sort the array; this imposes a minimum running time of O(n log n). Then set two pointers: l pointing at the start of the array and r pointing at its last element (the names stand for left and right).
Now, look at the list. If the sum of the values at positions l and r is lower than the value we are looking for, we need to increment l. If it's greater, we need to decrement r.
If v[l] + v[r] == value, then we can increment l and decrement r at the same time, since in any case we want to skip the combination (v[l], v[r]) as we don't want duplicates.

Timings: this is actually slower than the other two solutions. Because of the number of combinations that are produced but not actually needed, it gets worse the bigger the lists are.
You can use itertools.combinations to produce the 2-tuple-combinations for you.
Put them into a set if they match your value, then return as set/list:
from itertools import combinations
def pairsum_n(list1, value):
    """Returns the unique list of pairs of combinations of numbers from
    list1 that sum up `value`. Reorders the values to (min_value,max_value)."""
    result = set()
    for n in combinations(list1, 2):
        if sum(n) == value:
            result.add( (min(n),max(n)) )
    return list(result)
    # more ugly one-liner:
    # return list(set(((min(n),max(n)) for n in combinations(list1,2) if sum(n)==value)))
data = [1,2,3,4,5,6,6,5,4,3,2,1]
print(pairsum_n(data,7))
Output:
[(1, 6), (2, 5), (3, 4)]
Fun little thing, with some sorting overhead you can get all at once:
def pairsum_n2(data, count_nums=2):
    """Generate a dict with all count_nums-tuples from data. Key into the
    dict is the sum of all tuple-values."""
    d = {}
    for n in (tuple(sorted(p)) for p in combinations(data,count_nums)):
        d.setdefault(sum(n),set()).add(n)
    return d
get_all = pairsum_n2(data,2) # 2 == number of numbers to combine
for k in get_all:
    print(k," -> ", get_all[k])
Output:
3 -> {(1, 2)}
4 -> {(1, 3), (2, 2)}
5 -> {(2, 3), (1, 4)}
6 -> {(1, 5), (2, 4), (3, 3)}
7 -> {(3, 4), (2, 5), (1, 6)}
2 -> {(1, 1)}
8 -> {(2, 6), (4, 4), (3, 5)}
9 -> {(4, 5), (3, 6)}
10 -> {(5, 5), (4, 6)}
11 -> {(5, 6)}
12 -> {(6, 6)}
And then just access the one you need via:
print(get_all.get(7,"Not possible")) # {(3, 4), (2, 5), (1, 6)}
print(get_all.get(17,"Not possible")) # Not possible

I have another solution. It's a lot faster than the one I just wrote, though not as fast as PM 2Ring's answer:
from itertools import filterfalse
def pairsum_n(list1, value):
    set1 = set(list1)
    if list1.count(value/2) < 2:
        # discard() instead of remove(): no KeyError when value/2 is absent
        set1.discard(value/2)
    return set((min(x, value - x) , max(x, value - x)) for x in filterfalse(lambda x: (value - x) not in set1, set1))

Related

Given a list of numbers, how many different ways can you add them together to get a sum S?

Given a list of numbers, how many different ways can you add them together to get a sum S?
Example:
list = [1, 2]
S = 5
1) 1+1+1+1+1 = 5
2) 1+1+1+2 = 5
3) 1+2+2 = 5
4) 2+1+1+1 = 5
5) 2+2+1 = 5
6) 1+2+1+1 = 5
7) 1+1+2+1 = 5
8) 2+1+2 = 5
Answer = 8
This is what I've tried, but it only outputs 3 as the answer
from itertools import combinations_with_replacement
lst = [1, 2]
i = 1
result = 0
while i <= 5:
    s_list = [sum(comb) for comb in combinations_with_replacement(lst, i)]
    for val in s_list:
        if val == 5:
            result += 1
    i+= 1
print(result)
I believe it outputs three because it doesn't account for the different orders in which you can add the numbers. Any ideas on how to solve this?
The problem should work for much larger data: however, I give this simple example to give the general idea.
Using both itertools.combinations_with_replacement and permutations:
import itertools
l = [1,2]
s = 5
res = []
for i in range(1, s+1):
    for tup in itertools.combinations_with_replacement(l, i):
        if sum(tup) == s:
            res.extend(list(itertools.permutations(tup, i)))
res = list(set(res))
print(res)
[(1, 2, 2),
(2, 2, 1),
(1, 1, 2, 1),
(1, 2, 1, 1),
(2, 1, 1, 1),
(1, 1, 1, 2),
(2, 1, 2),
(1, 1, 1, 1, 1)]
print(len(res))
# 8
How about using dynamic programming? I believe it's easier to understand and can be implemented easily.
def cal(target, choices, record):
    min_choice = min(choices)
    if min_choice > target:
        return False
    for i in range(0, target+1):
        if i == 0:
            record.append(1)
        elif i < min_choice:
            record.append(0)
        elif i == min_choice:
            record.append(1)
        else:
            num_solution = 0
            j = 0
            while j < len(choices) and i-choices[j] >= 0:
                num_solution += record[i-choices[j]]
                j += 1
            record.append(num_solution)
choices = [1, 2]
record = []
cal(5, choices, record)
print(record)
print(f"Answer:{record[-1]}")
The core idea here is to use an extra record array that stores how many ways there are to reach each sum, e.g. record[2] = 2 means there are two ways to get a sum of 2 (1+1 or 2).
And we have record[target] = sum(record[target-choices[i]]) where i iterates over choices. The number of ways to get sum=5 is built from the number of ways to get sum=4, sum=3, and so on.
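The recurrence record[target] = sum(record[target-choices[i]]) can also be written as a short bottom-up loop. The sketch below assumes all choices are positive integers; the function name count_ordered_sums is made up for this example.

```python
def count_ordered_sums(target, choices):
    # record[s] = number of ordered ways to write s as a sum of the choices
    record = [0] * (target + 1)
    record[0] = 1  # one way to reach 0: use no numbers at all
    for total in range(1, target + 1):
        record[total] = sum(record[total - c] for c in choices if c <= total)
    return record[target]

print(count_ordered_sums(5, [1, 2]))  # 8
```

Unlike the while loop above, this version does not require the choices to be sorted, because every choice is checked against the current total.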
Use Dynamic Programming.
We suppose that your list consists of [1,2,5] so we have this recursive function :
f(n,[1,2,5]) = f(n-1,[1,2,5]) + f(n-2,[1,2,5]) + f(n-5,[1,2,5])
Because if the first number in the sum is 1, then you have f(n-1,[1,2,5]) options for the rest, and if it is 2 you have f(n-2,[1,2,5]) options for the rest, and so on.
So start from f(1) and work your way up with dynamic programming. In the worst case this solution is O(n^2), which happens when your list has O(n) items.
Solution would be something like this:
answers = []
lst = [1,2]
number = 5
def f(target):
    val = 0
    for i in lst:  # O(len(lst))
        current = target - i
        if current > 0:
            val += answers[current-1]
    if target in lst:  # O(len(lst))
        val += 1
    answers.insert(target,val)
j = 1
while j<=number:  # O(n) for while loop
    f(j)
    j+=1
print(answers[number-1])
here is a working version.
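The same recurrence, f(n) = f(n-1) + f(n-2) + f(n-5) for the list [1,2,5], can also be written top-down with memoization. This is a sketch rather than the answer's own code: count_ways is a hypothetical name, the numbers are assumed to be positive integers, and functools.lru_cache is used so each f(n) is computed only once.

```python
from functools import lru_cache

def count_ways(n, numbers):
    numbers = tuple(numbers)

    @lru_cache(maxsize=None)
    def f(remaining):
        if remaining == 0:
            return 1  # the numbers chosen so far add up exactly
        if remaining < 0:
            return 0  # overshot the target
        # the first number in the sum can be any k; the rest must sum to remaining - k
        return sum(f(remaining - k) for k in numbers)

    return f(n)

print(count_ways(5, [1, 2]))     # 8
print(count_ways(5, [1, 2, 5]))  # 9
```

For large n the recursion depth grows with n, so the bottom-up loop is the safer choice there.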
You'd want to use recursion to traverse through each possibility for each stage of addition, and pass back the numbers used once you've reached a number that is equal to the expected.
def find_addend_combinations(sum_value, addend_choices, base=0, history=None):
    if history is None: history = []
    if base == sum_value:
        return tuple(history)
    elif base > sum_value:
        return None
    else:
        results = []
        for v in addend_choices:
            r = find_addend_combinations(sum_value, addend_choices, base + v,
                                         history + [v])
            if isinstance(r, tuple):
                results.append(r)
            elif isinstance(r, list):
                results.extend(r)
        return results
You could write the last part as a list comprehension, but I think this way is clearer.
Combinations with the same elements in a different order are considered to be equivalent. For example, #3 and #5 from your list of summations are considered equivalent if you are only talking about combinations.
In contrast, permutations consider two collections distinct if they consist of the same elements in a different order.
To get the answer you are looking for, you need to combine both concepts.
First, use your technique to find combinations that meet your criteria.
Next, permute the collection of numbers from each combination.
Finally, collect the generated permutations in a set to remove duplicates.
[ins] In [01]: def combination_generator(numbers, k, target):
          ...:     assert k > 0, "Must be a positive number; 'k = {}".format(k)
          ...:     assert len(numbers) > 0, "List of numbers must have at least one element"
          ...:
          ...:     for candidate in (
          ...:         {'numbers': combination, 'sum': sum(combination)}
          ...:         for num_elements in range(1, k + 1)
          ...:         for combination in itertools.combinations_with_replacement(numbers, num_elements)
          ...:     ):
          ...:         if candidate['sum'] != target:
          ...:             continue
          ...:         for permuted_candidate in itertools.permutations(candidate['numbers']):
          ...:             yield permuted_candidate
          ...:
[ins] In [02]: {candidate for candidate in combination_generator([1, 2], 5, 5)}
Out[02]:
{(1, 1, 1, 1, 1),
(1, 1, 1, 2),
(1, 1, 2, 1),
(1, 2, 1, 1),
(1, 2, 2),
(2, 1, 1, 1),
(2, 1, 2),
(2, 2, 1)}

Efficiently searching a pair of ordered lists with noise

Assuming two data sets are in order and that they contain pairwise matches, what is an efficient way to discover the pairs? There can be noise in either list.
From sets A,B the set C will consist of pairs (A[X1],B[Y1]),(A[X2],B[Y2]),...,(A[Xn],B[Yn]) such that X1 < X2 < ... < Xn and Y1 < Y2 < ... < Yn.
The problem can be demonstrated with the simplified Python block, where the specifics of how a successful pair is validated is irrelevant.
Because the validation condition is irrelevant, the condition return_pairs(A, B, validate) == return_pairs(B, A, validate) is not required to hold, given that the data in A,B need not be the same, just that there must exist a validation function for (A[x],B[y])
A = [0,0,0,1,2,0,3,4,0,5,6,0,7,0,0,8,0,0,9]
B = [1,2,0,0,0,0,0,3,0,0,4,0,5,6,0,0,7,0,0,8,0,9]
B1 = [1,2,0,0,0,0,0,3,0,0,4,0,5,6,0,0,7,7,7,0,0,8,0,9]
def validate(a,b):
    return a and b and a==b
def return_pairs(A,B, validation):
    ret = []
    x,y = 0,0
    # Do loops and index changes...
    if validation(A[x], B[y]):
        ret.append((A[x], B[y]))
    return ret
assert zip(range(1,10), range(1,10)) == return_pairs(A,B,validate)
assert zip(range(1,10), range(1,10)) == return_pairs(A,B1,validate)
Instead of iterating over each list in two nested loops, you can first remove the noise according to your own criteria, then create a third list from the filtered elements and run each item of that list (a newly formed tuple) against your validation. This assumes I understood the question correctly, which I am not sure I did:
Demo
A = [0,0,0,1,2,0,3,4,0,5,6,0,7,0,0,8,0,0,9]
B = [1,2,0,0,0,0,0,3,0,0,4,0,5,6,0,0,7,0,0,8,0,9]
def clean(oldList):
    newList = []
    for item in oldList:
        if 0<item and (not newList or item>newList[-1]):
            newList.append(item)
    return newList
def validate(C):
    for item in C:
        if item[0] != item[1]:
            return False
    return True
C = zip(clean(A),clean(B))
#clean(A):[1, 2, 3, 4, 5, 6, 7, 8, 9]
#clean(B):[1, 2, 3, 4, 5, 6, 7, 8, 9]
#list(C):[(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7), (8, 8), (9, 9)]
#validate(C): True
A solution, O(n^2):
def return_pairs(A,B, validation):
    ret = []
    used_x, used_y = -1,-1
    for x, _x in enumerate(A):
        for y, _y in enumerate(B):
            if x <= used_x or y <= used_y:
                continue
            if validation(A[x], B[y]):
                used_x,used_y = x,y
                ret.append((A[x], B[y]))
    return ret

Create a random order of (x, y) pairs, without repeating/subsequent x's

Say I have a list of valid X = [1, 2, 3, 4, 5] and a list of valid Y = [1, 2, 3, 4, 5].
I need to generate all combinations of every element in X and every element in Y (in this case, 25) and get those combinations in random order.
This in itself would be simple, but there is an additional requirement: In this random order, there cannot be a repetition of the same x in succession. For example, this is okay:
[1, 3]
[2, 5]
[1, 2]
...
[1, 4]
This is not:
[1, 3]
[1, 2] <== the "1" cannot repeat, because there was already one before
[2, 5]
...
[1, 4]
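The restriction can be stated as a small check, which is also handy for testing any candidate ordering. This is only a sketch; the helper name has_adjacent_repeat is made up for illustration.

```python
def has_adjacent_repeat(pairs):
    """Return True if two consecutive pairs share the same x (first element)."""
    return any(a[0] == b[0] for a, b in zip(pairs, pairs[1:]))

print(has_adjacent_repeat([[1, 3], [2, 5], [1, 2]]))  # False
print(has_adjacent_repeat([[1, 3], [1, 2], [2, 5]]))  # True
```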
Now, the least efficient idea would be to simply reshuffle the full set until there are no more repetitions. My approach was a bit different: repeatedly create a shuffled variant of X and a list of all Y * X, then pick a random next one from that. So far, I've come up with this:
import random
output = []
num_x = 5
num_y = 5
all_ys = list(xrange(1, num_y + 1)) * num_x
while True:
    # end if no more are available
    if len(output) == num_x * num_y:
        break
    xs = list(xrange(1, num_x + 1))
    while len(xs):
        next_x = random.choice(xs)
        next_y = random.choice(all_ys)
        if [next_x, next_y] not in output:
            xs.remove(next_x)
            all_ys.remove(next_y)
            output.append([next_x, next_y])
print(sorted(output))
But I'm sure this can be done even more efficiently or in a more succinct way?
Also, my solution first goes through all X values before continuing with the full set again, which is not perfectly random. I can live with that for my particular application case.
A simple solution to ensure an average O(N*M) complexity:
def pseudorandom(M,N):
    l=[(x+1,y+1) for x in range(N) for y in range(M)]
    random.shuffle(l)
    for i in range(M*N-1):
        for j in range (i+1,M*N): # find a compatible ...
            if l[i][0] != l[j][0]:
                l[i+1],l[j] = l[j],l[i+1]
                break
        else: # or insert otherwise.
            while True:
                l[i],l[i-1] = l[i-1],l[i]
                i-=1
                if l[i][0] != l[i-1][0]: break
    return l
Some tests:
In [354]: print(pseudorandom(5,5))
[(2, 2), (3, 1), (5, 1), (1, 1), (3, 2), (1, 2), (3, 5), (1, 5), (5, 4),\
(1, 3), (5, 2), (3, 4), (5, 3), (4, 5), (5, 5), (1, 4), (2, 5), (4, 4), (2, 4),\
(4, 2), (2, 1), (4, 3), (2, 3), (4, 1), (3, 3)]
In [355]: %timeit pseudorandom(100,100)
10 loops, best of 3: 41.3 ms per loop
Here is my solution. First, the tuples are chosen among those that have a different x value from the previously selected tuple. But I've noticed that you have to prepare a final trick for the case where only tuples with the bad x value are left to place at the end.
import random
num_x = 5
num_y = 5
all_ys = range(1,num_y+1)*num_x
all_xs = sorted(range(1,num_x+1)*num_y)
output = []
last_x = -1
for i in range(0,num_x*num_y):
    #get list of possible tuple to place
    all_ind = range(0,len(all_xs))
    all_ind_ok = [k for k in all_ind if all_xs[k]!=last_x]
    ind = random.choice(all_ind_ok)
    last_x = all_xs[ind]
    output.append([all_xs.pop(ind),all_ys.pop(ind)])
    if(all_xs.count(last_x)==len(all_xs)):#if only last_x tuples,
        break
if len(all_xs)>0: # if there are still tuples they are randomly placed
    nb_to_place = len(all_xs)
    while(len(all_xs)>0):
        place = random.randint(0,len(output)-1)
        # compare the x of the neighbouring pair, not the whole pair
        if output[place][0]==last_x:
            continue
        if place>0:
            if output[place-1][0]==last_x:
                continue
        output.insert(place,[all_xs.pop(),all_ys.pop()])
print output
Here's a solution using NumPy
import numpy as np
def generate_pairs(xs, ys):
    n = len(xs)
    m = len(ys)
    indices = np.arange(n)
    array = np.tile(ys, (n, 1))
    [np.random.shuffle(array[i]) for i in range(n)]
    counts = np.full_like(xs, m)
    i = -1
    for _ in range(n * m):
        weights = np.array(counts, dtype=float)
        if i != -1:
            weights[i] = 0
        weights /= np.sum(weights)
        i = np.random.choice(indices, p=weights)
        counts[i] -= 1
        pair = xs[i], array[i, counts[i]]
        yield pair
Here's a Jupyter notebook that explains how it works
Inside the loop, we have to copy the weights, add them up, and choose a random index using the weights. These are all linear in n. So the overall complexity to generate all pairs is O(n^2 m)
But the runtime is deterministic and overhead is low. And I'm fairly sure it generates all legal sequences with equal probability.
An interesting question! Here is my solution. It has the following properties:
If there is no valid solution it should detect this and let you know
The iteration is guaranteed to terminate so it should never get stuck in an infinite loop
Any possible solution is reachable with nonzero probability
I do not know the distribution of the output over all possible solutions, but I think it should be uniform because there is no obvious asymmetry inherent in the algorithm. I would be surprised and pleased to be shown otherwise, though!
import random
def random_without_repeats(xs, ys):
    pairs = [[x,y] for x in xs for y in ys]
    output = [[object()], [object()]]
    seen = set()
    while pairs:
        # choose a random pair from the ones left
        indices = list(set(xrange(len(pairs))) - seen)
        try:
            index = random.choice(indices)
        except IndexError:
            raise Exception('No valid solution exists!')
        # the first element of our randomly chosen pair
        x = pairs[index][0]
        # search for a valid place in output where we slot it in
        for i in xrange(len(output) - 1):
            left, right = output[i], output[i+1]
            if x != left[0] and x != right[0]:
                output.insert(i+1, pairs.pop(index))
                seen = set()
                break
        else:
            # make sure we don't randomly choose a bad pair like that again
            seen |= {i for i in indices if pairs[i][0] == x}
    # trim off the sentinels
    output = output[1:-1]
    assert len(output) == len(xs) * len(ys)
    assert not any(L==R for L,R in zip(output[:-1], output[1:]))
    return output
nx, ny = 5, 5 # OP example
# nx, ny = 2, 10 # output must alternate in 1st index
# nx, ny = 4, 13 # shuffle 'deck of cards' with no repeating suit
# nx, ny = 1, 5 # should raise 'No valid solution exists!' exception
xs = range(1, nx+1)
ys = range(1, ny+1)
for pair in random_without_repeats(xs, ys):
    print pair
This should do what you want.
rando will never generate the same X twice in a row, but I realized that it is possible (though seems unlikely, in that I never noticed it happen in the 10 or so times I ran without the extra check) that due to the potential discard of duplicate pairs it could happen upon a previous X. Oh! But I think I figured it out... will update my answer in a moment.
import itertools
import random
X = [1,2,3,4,5]
Y = [1,2,3,4,5]
def rando(choice_one, choice_two):
    last_x = random.choice(choice_one)
    while True:
        yield last_x, random.choice(choice_two)
        possible_x = choice_one[:]
        possible_x.remove(last_x)
        last_x = random.choice(possible_x)
all_pairs = set(itertools.product(X, Y))
result = []
r = rando(X, Y)
while set(result) != all_pairs:
    pair = next(r)
    if pair not in result:
        if result and result[-1][0] == pair[0]:
            continue
        result.append(pair)
import pprint
pprint.pprint(result)
For completeness, I guess I will throw in the super-naive "just keep shuffling till you get one" solution. It's not guaranteed to even terminate, but if it does, it will have a good degree of randomness, and you did say one of the desired qualities was succinctness, and this sure is succinct:
import itertools
import random
x = range(5) # this is a list in Python 2
y = range(5)
all_pairs = list(itertools.product(x, y))
s = list(all_pairs) # make a working copy
while any(s[i][0] == s[i + 1][0] for i in range(len(s) - 1)):
    random.shuffle(s)
print s
As was commented, for small values of x and y (especially y!), this is actually a reasonably quick solution. Your example of 5 for each completes in an average time of "right away". The deck of cards example (4 and 13) can take much longer, because it will usually require hundreds of thousands of shuffles. (And again, is not guaranteed to terminate at all.)
Distribute the x values (5 times each value) evenly across your output:
import random
def random_combo_without_x_repeats(xvals, yvals):
    # produce all valid combinations, but group by `x` and shuffle the `y`s
    grouped = [[x, random.sample(yvals, len(yvals))] for x in xvals]
    last_x = object()  # sentinel not equal to anything
    while grouped[0][1]:  # still `y`s left
        for _ in range(len(xvals)):
            # shuffle the `x`s, but skip any ordering that would
            # produce consecutive `x`s.
            random.shuffle(grouped)
            if grouped[0][0] != last_x:
                break
        else:
            # we tried to reshuffle N times, but ended up with the same `x` value
            # in the first position each time. This is pretty unlikely, but
            # if this happens we bail out and just reverse the order. That is
            # more than good enough.
            grouped = grouped[::-1]
        # yield a set of (x, y) pairs for each unique x
        # Pick one y (from the pre-shuffled groups per x)
        for x, ys in grouped:
            yield x, ys.pop()
            last_x = x
This shuffles the y values per x first, then gives you a x, y combination for each x. The order in which the xs are yielded is shuffled each iteration, where you test for the restriction.
This is random, but you'll get all numbers between 1 and 5 in the x position before you'll see the same number again:
>>> list(random_combo_without_x_repeats(range(1, 6), range(1, 6)))
[(2, 1), (3, 2), (1, 5), (5, 1), (4, 1),
(2, 4), (3, 1), (4, 3), (5, 5), (1, 4),
(5, 2), (1, 1), (3, 3), (4, 4), (2, 5),
(3, 5), (2, 3), (4, 2), (1, 2), (5, 4),
(2, 2), (3, 4), (1, 3), (4, 5), (5, 3)]
(I manually grouped that into sets of 5). Overall, this makes for a pretty good random shuffling of a fixed input set with your restriction.
It is efficient too; because there is only a 1-in-N chance that you have to re-shuffle the x order, you should only see one reshuffle on average during a full run of the algorithm. The whole algorithm therefore stays within O(N*M) bounds, which is pretty much ideal for something that produces N times M elements of output. Because we limit the reshuffling to at most N attempts before falling back to a simple reverse, we avoid the (extremely unlikely) possibility of endlessly reshuffling.
The only drawback then is that it has to create N copies of the M y values up front.
Here is an evolutionary algorithm approach. It first evolves a list in which the elements of X are each repeated len(Y) times and then it randomly fills in each element of Y len(X) times. The resulting orders seem fairly random:
import random
#the following fitness function measures
#the number of times in which
#consecutive elements in a list
#are equal
def numRepeats(x):
    n = len(x)
    if n < 2: return 0
    repeats = 0
    for i in range(n-1):
        if x[i] == x[i+1]: repeats += 1
    return repeats
def mutate(xs):
    #swaps random pairs of elements
    #returns a new list
    #one of the two indices is chosen so that
    #it is in a repeated pair
    #and swapped element is different
    n = len(xs)
    repeats = [i for i in range(n) if (i > 0 and xs[i] == xs[i-1]) or (i < n-1 and xs[i] == xs[i+1])]
    i = random.choice(repeats)
    j = random.randint(0,n-1)
    while xs[j] == xs[i]: j = random.randint(0,n-1)
    ys = xs[:]
    ys[i], ys[j] = ys[j], ys[i]
    return ys
def evolveShuffle(xs, popSize = 100, numGens = 100):
    #tries to evolve a shuffle of xs so that consecutive
    #elements are different
    #takes the best 10% of each generation and mutates each 9
    #times. Stops when a perfect solution is found
    #popsize assumed to be a multiple of 10
    population = []
    for i in range(popSize):
        deck = xs[:]
        random.shuffle(deck)
        fitness = numRepeats(deck)
        if fitness == 0: return deck
        population.append((fitness,deck))
    for i in range(numGens):
        population.sort(key = (lambda p: p[0]))
        newPop = []
        for i in range(popSize//10):
            fit,deck = population[i]
            newPop.append((fit,deck))
            for j in range(9):
                newDeck = mutate(deck)
                fitness = numRepeats(newDeck)
                if fitness == 0: return newDeck
                newPop.append((fitness,newDeck))
        population = newPop
    #if you get here :
    return [] #no special shuffle found
#the following function takes a list x
#with n distinct elements (n>1) and an integer k
#and returns a random list of length nk
#where consecutive elements are not the same
def specialShuffle(x,k):
    n = len(x)
    if n == 2:
        if random.random() < 0.5:
            a,b = x
        else:
            b,a = x
        return [a,b]*k
    else:
        deck = x*k
        return evolveShuffle(deck)
def randOrder(x,y):
    xs = specialShuffle(x,len(y))
    d = {}
    for i in x:
        ys = y[:]
        random.shuffle(ys)
        d[i] = iter(ys)
    pairs = []
    for i in xs:
        pairs.append((i,next(d[i])))
    return pairs
for example:
>>> randOrder([1,2,3,4,5],[1,2,3,4,5])
[(1, 4), (3, 1), (4, 5), (2, 2), (4, 3), (5, 3), (2, 1), (3, 3), (1, 1), (5, 2), (1, 3), (2, 5), (1, 5), (3, 5), (5, 5), (4, 4), (2, 3), (3, 2), (5, 4), (2, 4), (4, 2), (1, 2), (5, 1), (4, 1), (3, 4)]
As len(X) and len(Y) get larger, this has more difficulty finding a solution (and is designed to return the empty list in that eventuality), in which case the parameters popSize and numGens could be increased. As is, it is able to find 20x20 solutions very rapidly. It takes about a minute when X and Y are of size 100, but even then it is able to find a solution (in the times that I have run it).
Interesting restriction! I probably overthought this, solving a more general problem: shuffling an arbitrary list of sequences such that (if possible) no two adjacent sequences share a first item.
from itertools import product
from random import choice, randrange, shuffle
def combine(*sequences):
    return playlist(product(*sequences))
def playlist(sequence):
    r'''Shuffle a set of sequences, avoiding repeated first elements.
    '''
    result = list(sequence)
    length = len(result)
    if length < 2:
        # No rearrangement is possible.
        return result
    def swap(a, b):
        if a != b:
            result[a], result[b] = result[b], result[a]
    swap(0, randrange(length))
    for n in range(1, length):
        previous = result[n-1][0]
        choices = [x for x in range(n, length) if result[x][0] != previous]
        if not choices:
            # Trapped in a corner: Too many of the same item are left.
            # Backtrack as far as necessary to interleave other items.
            minor = 0
            major = length - n
            while n > 0:
                n -= 1
                if result[n][0] == previous:
                    major += 1
                else:
                    minor += 1
                if minor == major - 1:
                    if n == 0 or result[n-1][0] != previous:
                        break
            else:
                # The requirement can't be fulfilled,
                # because there are too many of a single item.
                shuffle(result)
                break
            # Interleave the majority item with the other items.
            major = [item for item in result[n:] if item[0] == previous]
            minor = [item for item in result[n:] if item[0] != previous]
            shuffle(major)
            shuffle(minor)
            result[n] = major.pop(0)
            n += 1
            while n < length:
                result[n] = minor.pop(0)
                n += 1
                result[n] = major.pop(0)
                n += 1
            break
        swap(n, choice(choices))
    return result
This starts out simple, but when it discovers that it can't find an item with a different first element, it figures out how far back it needs to go to interleave that element with something else. Therefore, the main loop traverses the array at most three times (once backwards), but usually just once. Granted, each iteration of the first forward pass checks each remaining item in the array, and the array itself contains every pair, so the overall run time is O((NM)**2).
For your specific problem:
>>> X = Y = [1, 2, 3, 4, 5]
>>> combine(X, Y)
[(3, 5), (1, 1), (4, 4), (1, 2), (3, 4),
(2, 3), (5, 4), (1, 5), (2, 4), (5, 5),
(4, 1), (2, 2), (1, 4), (4, 2), (5, 2),
(2, 1), (3, 3), (2, 5), (3, 2), (1, 3),
(4, 3), (5, 3), (4, 5), (5, 1), (3, 1)]
By the way, this compares x values by equality, not by position in the X array, which may make a difference if the array can contain duplicates. In fact, duplicate values might trigger the fallback case of shuffling all pairs together if more than half of the X values are the same.

top n keys with highest values in dictionary with tuples as keys

I want to get the top n keys of a dictionary with tuples as keys, where the first value of the tuple is a particular number (1 in the example below):
a = {}
a[1,2] = 3
a[1,0] =4
a[1,5] = 1
a[2,3] = 9
I want [1,0] and [1,2] to be returned, where the first element of the tuple/key = 1
this
import heapq
k = heapq.nlargest(2, a, key=a.get(1,))
returns [1,4] and [1,3], the highest keys/tuples with first element = 1, though if I make it
k = heapq.nlargest(2, a, key=a.get(2,))
it returns the same thing?
First you should take only the keys with first coordinate 1. Otherwise, if only a few elements have 1 as first coordinate, you may get other tuples as well. Then you can use heapq normally. For example:
a = {
    (1, 2): 3,
    (1, 0): 4,
    (1, 5): 1,
    (2, 3): 9
}
import heapq
print heapq.nlargest(2, (k for k in a if k[0] == 1), key=lambda k: a[k])
print heapq.nlargest(2, (k for k in a if k[0] == 2), key=lambda k: a[k])
Output:
[(1, 0), (1, 2)]
[(2, 3)]
The key parameter should be a function. But you are passing in a.get(1,). This calls a.get(1,), which is the same as a.get(1), which is the same as a.get(1, None), and passes the result as the key.
The dictionary doesn't have a 1 key so it returns None which means you are doing the equivalent of passing key=None which is the same as not passing a key at all: you are using the identity function as key.
Then heapq.nlargest returns the top 2 elements which are, correctly, [1, 4] and [1, 3].
This explains why using a.get(1,) and a.get(2,) does the same thing. The above reasoning works for both values and you end up with key=None in both cases.
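A quick way to see this is to run the calls on the small four-entry dictionary from the question. This is only an illustrative sketch; the keys quoted in the question do not appear in this small dictionary, so the concrete result here differs from the one quoted there.

```python
import heapq

a = {(1, 2): 3, (1, 0): 4, (1, 5): 1, (2, 3): 9}

# a.get(1,) is evaluated immediately; there is no key 1, so it returns None
print(a.get(1,))  # None

# key=None means "compare the keys themselves", whatever number was passed to get()
print(heapq.nlargest(2, a, key=a.get(1,)))  # [(2, 3), (1, 5)]
print(heapq.nlargest(2, a, key=a.get(2,)))  # [(2, 3), (1, 5)]
print(heapq.nlargest(2, a))                 # [(2, 3), (1, 5)]
```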
To achieve what you want use something like:
key=lambda x: (x[0] == 1, a[x])
If you find yourself using this kind of keys often you can create a key maker function:
def make_key(value, container):
    def key(x):
        return x[0] == value, container[x]
    return key
using it as:
nlargest(2, a, key=make_key(1, a))
nlargest(2, a, key=make_key(2, a))
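Run against the question's dictionary, the key maker behaves as sketched below. Note one consequence of this style of key: it only ranks matching tuples first, it does not filter, so when fewer than n keys match, the remaining slots are filled with non-matching keys.

```python
import heapq

def make_key(value, container):
    def key(x):
        # (True, v) sorts above (False, v), so matching keys come first
        return x[0] == value, container[x]
    return key

a = {(1, 2): 3, (1, 0): 4, (1, 5): 1, (2, 3): 9}

print(heapq.nlargest(2, a, key=make_key(1, a)))  # [(1, 0), (1, 2)]
# only one key starts with 2, so the second slot goes to the best non-matching key
print(heapq.nlargest(2, a, key=make_key(2, a)))  # [(2, 3), (1, 0)]
```

If non-matching keys must never appear, filtering the keys first (as in the generator-expression answer above) is the safer option.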

Looping through a 2-D list within a 1-D list

I have 2 lists:
One dimensional: x_int_loc = [0,1,2,3,4,5]
Two dimensional: xtremes = [[0,2],[0,3],[1,3],[1,5],[2,5],[3,6],[4,8]]
I am trying to gather a count of how many times each element in x_int_loc lies within the range of values in the xtremes list. That is, count of 1 (in list x_int_loc) will be 2, as it appears in [0,2], [0,3] and so on.
Although this appears to be quite simple, I got a bit stuck while looping through these lists. Here is my code:
for i in range(len(x_int_loc)):
    while k < len(xtremes):
        if x_int_loc[i]>xtremes[k][0] and xtremes[k][1] > x_int_loc[i]:
            count[i] = count[i]+1
print(count[:])
Could any of you please tell me where I am going wrong?
You never increment k, or reset it when i increments. The minimal fix is:
for i in range(len(x_int_loc)):
    k = 0
    while k < len(xtremes):
        if x_int_loc[i]>xtremes[k][0] and xtremes[k][1] > x_int_loc[i]:
            count[i] = count[i]+1
        k += 1
It is not good practice to use a while loop with a manual index; as this clearly demonstrates, it is error prone. Why not just for loop over xtremes directly? All you really need is:
count = [sum(x < n < y for x, y in xtremes) for n in x_int_loc]
which gives me:
>>> count
[0, 2, 3, 2, 3, 2]
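If the lists get large, the same counts can be obtained without comparing every point against every range. The sketch below swaps in a different technique (sorting plus binary search with the standard bisect module) and assumes every range satisfies low < high; the name count_containing is made up for this example.

```python
from bisect import bisect_left, bisect_right

def count_containing(points, intervals):
    starts = sorted(lo for lo, hi in intervals)
    ends = sorted(hi for lo, hi in intervals)
    # ranges with lo < n, minus those that have already ended (hi <= n),
    # leaves exactly the ranges with lo < n < hi
    return [bisect_left(starts, n) - bisect_right(ends, n) for n in points]

x_int_loc = [0, 1, 2, 3, 4, 5]
xtremes = [[0, 2], [0, 3], [1, 3], [1, 5], [2, 5], [3, 6], [4, 8]]
print(count_containing(x_int_loc, xtremes))  # [0, 2, 3, 2, 3, 2]
```

For inputs as small as the ones in the question, the one-line comprehension is simpler and just as good.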
Unless you are very particular about optimization, the following solution should be good enough in general:
>>> x_int_loc = [0,1,2,3,4,5]
>>> xtremes = [[0,2],[0,3],[1,3],[1,5],[2,5],[3,6],[4,8]]
>>> xtremes_ranges = [xrange(r[0]+1,r[1]) for r in xtremes]
>>> [(x, sum(x in r for r in xtremes_ranges)) for x in x_int_loc]
[(0, 0), (1, 2), (2, 3), (3, 2), (4, 3), (5, 2)]
