I am trying to count all pairs of numbers in a list. A pair is just two numbers that are the same. My current code looks like this.
def pairs(lst):
    lst.sort()
    count = 0
    for x in range(len(lst)):
        if x+1 < len(lst):
            if lst[x] == lst[x+1]:
                count +=1
    return count
pairs([1, 1, 1, 1, 1])
What do I need to change to be able to have it count each pair of 1's?
The function gives the wrong value because it compares each item with the next one, so it counts adjacent equal neighbours rather than pairs: a value that matches both of its neighbours is counted twice. Looping with nested conditional statements is also inefficient. It may be better to think of the problem as the sum, over each distinct item in the list, of its count divided by 2 using integer division.
Try this:
Include incomplete pairs:
import math
def count_pairs(a_list):
    counter = 0
    for x in set(a_list):
        counter += math.ceil(a_list.count(x) / 2)
    print(counter)
Include only complete pairs:
import math
def count_pairs(a_list):
    counter = 0
    for x in set(a_list):
        counter += math.floor(a_list.count(x) / 2)
    print(counter)
Example:
lst=[1,1,1,1,1,2,2,2,2,2,3,3,3,4,4,5,6,5]
count_pairs(lst)
Output 1
11
Output 2
7
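The floor-division idea can also be sketched with collections.Counter, which tallies every value in a single pass instead of calling list.count once per distinct value. The name count_complete_pairs is an assumption made for illustration, and this version returns the total rather than printing it.

```python
from collections import Counter


def count_complete_pairs(values):
    """Count disjoint pairs of equal values (hypothetical helper name)."""
    # A value that occurs k times contributes k // 2 complete pairs.
    return sum(k // 2 for k in Counter(values).values())


print(count_complete_pairs([1, 1, 1, 1, 1]))
print(count_complete_pairs([1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 6, 5]))
```

Counter makes the whole computation linear in the length of the list, which matters once the list has many distinct values.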
You can try this approach:
lst = [1,1,1,1,1,1,2,2,3,3,4]
lst.sort()
# remove last element if len(lst) is odd
if len(lst) % 2 != 0:
    lst.pop()
c = 0
# create an `iter` object to simplify comparisons
it = iter(lst)
for x1 in it:
    x2 = next(it)
    if x1 == x2:
        c += 1
print(c)
It wasn't clear to me whether you only want pairs of 1. If that is the case, add a check for x1 or x2 being greater than 1 and break out of the loop.
Code
def count_pairs(lst):
    ' Using generator with Walrus operator '
    return sum(cnt*(cnt-1)//2 for element in set(lst) if (cnt:=lst.count(element)))
Test
print(count_pairs([1, 1, 1, 1, 1])) # Output: 10
print(count_pairs([1,1,1,1,1,2,2,2,2,2,3,3,3,4,4,5,6,5])) # Output: 25
Explanation
The number of pairs of a number in the list is found by:
counting the frequency of the number in the list
counting its combinations taken 2 at a time (i.e. for frequency k, combinations = k*(k-1)//2)
We sum the pair counts for each unique number in the list (i.e. set(lst)).
For clarity, the oneliner solution can be expanded to the following.
def count_pairs(lst):
    cnt = 0
    for element in set(lst):
        frequency = lst.count(element)              # frequency of element
        cnt += frequency * (frequency - 1) // 2     # accumulate count of pairs of element
    return cnt
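As a cross-check of the k*(k-1)//2 formula, here is a brute-force sketch that enumerates every pair of positions with itertools.combinations and keeps the equal ones. The name count_pairs_brute_force is hypothetical; the approach is quadratic, so it is only meant for verifying small inputs.

```python
from itertools import combinations


def count_pairs_brute_force(lst):
    """Count position pairs (i < j) holding equal values; hypothetical name."""
    # combinations(lst, 2) yields each unordered pair of positions once.
    return sum(1 for a, b in combinations(lst, 2) if a == b)


print(count_pairs_brute_force([1, 1, 1, 1, 1]))
```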
Hello, here is the problem. My input looks like this: [[1,2,3],[4,5,6],[7,8,9]] and so on.
I want to generate every combination in the product of those lists, then multiply the elements of each resulting combination together, and finally keep only the results that fall within an interval.
So first I input n lists: [[1,2,3],[4,5,6],[7,8,9],[10,11,12]] and so on.
Which will then give (1,4,7,10)
(1,4,7,11)
(1,4,7,12)
and so on.
Then I take the combinations of k out of n from each result, like (1,4,7) (1,4,10) (1,7,10) for the first row.
The product x of each is 1*4*7 = 28, 1*4*10 = 40, 1*7*10 = 70.
From these I want only the unique combinations whose result lies in an interval chosen beforehand: if x > 50 and x < 100, I will get (1,7,10) : 70.
I tried:
import itertools

def mult(lst): #A function mult i'm using later
    r = 1
    for element in lst:
        r *= element
    return round(r)
s = [] #Where i add my list of list
for i in range(int(input1)):
    b = input("This is line %s : " % (i+1)).split()
    for i in range(len(b)):
        b[i] = float(b[i])
    s.append(b)
low_result = input("Expected low_result : ")
high_result = input("Expected high_result : ")
combine = []
my_list = []
for element in itertools.product(*s):
    l= [float(x) for x in element]
    comb = itertools.combinations([*l], int(input2))
    for i in list(comb):
        combine.append(i)
        res = mult(i)
        if res >= int(low_result) and res <= int(high_result):
            my_list.append(res)
            f = open("list_result.txt","a+")
            f.write("%s : result is %s\n" % (i, res))
            f.close()
It always ends in a memory error because there are too many variations for what I'm seeking.
What I would like is a way to generate, from a list of 20 or more lists, all the products and the resulting combinations of k in n, keeping only the results in the interval that I need.
As suggested above, I think this can be done without exploding your memory by never holding an array in memory at any time. But the main issue is then runtime.
The maths
As written we are:
Producing every combination of one item from each of m rows of n items: n ** m
Then taking a choice of c items from those m values: C(m, c)
This is very large. If we have m=25 rows of n=3 items each and pick c=3 items from them we get:
= n ** m * C(m, c)
= 3 ** 25 * 2300
= 1.948763802×10¹⁵
If instead we:
Choose c rows from the m rows: C(m, c) as before
Then pick every combination of one item from each of these c rows: n ** c
With m=25 rows of n=3 items each and c=3 items picked from them we get:
= n ** c * C(m, c)
= 3 ** 3 * 2300
= 62100
This is now a solvable problem.
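The two counts can be reproduced with math.comb (available from Python 3.8). This is only a sketch of the arithmetic; the function names product_then_choose and choose_then_product are hypothetical labels for the two orderings described above.

```python
from math import comb


def product_then_choose(m, n, c):
    """Work items when the full product is built first, then c are chosen."""
    return n ** m * comb(m, c)


def choose_then_product(m, n, c):
    """Work items when c rows are chosen first, then their product is built."""
    return comb(m, c) * n ** c


m, n, c = 25, 3, 3
print(comb(m, c))
print(product_then_choose(m, n, c))
print(choose_then_product(m, n, c))
```

The second ordering depends on n ** c instead of n ** m, so its cost no longer grows exponentially with the number of rows.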
The code
from itertools import product, combinations
def mult(values, min_value, max_value):
    """
    Multiply together the values, but return None if we get too big or too
    small
    """
    output = 1
    for value in values:
        output *= value
        # Early return if we go too big
        if output > max_value:
            return None
        # Early return if we goto zero (from which we never return)
        if output == 0 and min_value != 0:
            return None
    if output < min_value:
        return None
    return output
def yield_valid_combos(values, choose, min_value, max_value):
    # No doubt an even fancier list comprehension would get this too
    for rows in combinations(values, choose):
        for combos in product(*rows):
            value = mult(combos, min_value, max_value)
            if value is not None:
                yield combos, value
values = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
with open('list_result.txt', 'w') as fh:
    for selection, value in yield_valid_combos(
            values, choose=3, min_value=50, max_value=100):
        fh.write('{}: result is {}\n'.format(selection, value))
This solution also returns no duplicate answers (unless the same value appears in multiple rows).
As an optimisation the multiplication method attempts to return early if we detect the result will be too big or small. We also only open the file once and then keep adding rows to it as they come.
Further optimisation
You can also optimise your set of values ahead of time by screening out values which cannot contribute to a solution. But for smaller values of c, you may find this is not even necessary.
The smallest possible combination of values is c items from the set of the smallest values in each row. If we take the c - 1 smallest items from the set of smallest values, multiply them together and then divide the maximum by this number, it gives us an upper bound for the largest value which can be in a solution. We can then screen out all values above this bound (cutting down on permutations).
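The screening step could be sketched as follows. The helper name screen_values is hypothetical, and the sketch assumes every value is a positive number, since the bound relies on products never shrinking below the product of the smallest row minima.

```python
from math import prod


def screen_values(rows, choose, max_value):
    """Drop values too large to appear in any valid product.

    Hypothetical helper; assumes every value is a positive number.
    """
    # Smallest value of each row, in ascending order.
    smallest = sorted(min(row) for row in rows)
    # Smallest product that the other (choose - 1) picks could contribute.
    floor_product = prod(smallest[:choose - 1])
    # Any value above this bound would push the product past max_value.
    bound = max_value / floor_product
    return [[v for v in row if v <= bound] for row in rows]


rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
print(screen_values(rows, choose=3, max_value=40))
```

A row that ends up empty can never take part in a solution, so it could be dropped before the rows are passed to combinations.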
I have been given the following problem to solve:
A non-empty array A consisting of N integers is given. The array contains an odd number of elements, and each element of the array can be paired with another element that has the same value, except for one element that is left unpaired.
For example, in array A such that: A[0] = 9 A[1] = 3 A[2] = 9 A[3] = 3 A[4] = 9 A[5] = 7 A[6] = 9
the elements at indexes 0 and 2 have value 9,
the elements at indexes 1 and 3 have value 3,
the elements at indexes 4 and 6 have value 9,
the element at index 5 has value 7 and is unpaired.
Write a function:
def solution(A)
that, given an array A consisting of N integers fulfilling the above conditions, returns the value of the unpaired element.
For example, given array A such that:
A[0] = 9 A[1] = 3 A[2] = 9
A[3] = 3 A[4] = 9 A[5] = 7
A[6] = 9
the function should return 7, as explained in the example above.
Write an efficient algorithm for the following assumptions:
N is an odd integer within the range [1..1,000,000];
each element of array A is an integer within the range [1..1,000,000,000];
all but one of the values in A occur an even number of times.
I think I am only halfway to solving the problem:
def findOddItem(A):
    for i, item in enumerate(A): # look to the left, but not at the immediate neighbour
        if A[i] != A[i - 2]:
            print(A[i])
but this prints the wrong result.
I would go with reduce() (moved to functools.reduce() in Python 3.x) in combination with operator.xor():
# Uncomment for Python 3.x:
# from functools import reduce
import operator
def solution(A):
    return reduce(operator.xor, A)
arr = [9, 3, 9, 3, 9, 7, 9]
print(solution(arr))  # 7
It's an O(n) solution that is about as clean as it gets.
You could use the bitwise XOR operator. Since all elements occur twice except one, the paired elements cancel each other out, leaving the element that occurs only once.
def findSingleOccurance( arr, n):
    res = arr[0]
    # Do XOR of all elements and return
    for i in range(1,n):
        res = res ^ arr[i]
    return res
Time complexity O(n), space complexity O(1).
Hope this helps.
Since there is no condition that all but one of the elements occur exactly twice, I guess they could also occur 4, 6, ... times.
In this case, I would rather use numpy.bincount to see which integer has an odd count.
import numpy
a = [1,1,2,2,3,3,5,3,3,4,5,5,5,10,10]
a_cnt = list(numpy.bincount(a))
for i in a_cnt:
    if i != 0 and i%2 == 1:
        print(a_cnt.index(i))
# 4
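Because the problem statement allows values up to 1,000,000,000, a count array indexed by value can become very large. A dictionary-based tally avoids that; the following is a minimal sketch using collections.Counter, with odd_one_out as a hypothetical function name.

```python
from collections import Counter


def odd_one_out(values):
    """Return the value that occurs an odd number of times (hypothetical name)."""
    for value, count in Counter(values).items():
        if count % 2 == 1:
            return value
    # No value has an odd count.
    return None


print(odd_one_out([1, 1, 2, 2, 3, 3, 5, 3, 3, 4, 5, 5, 5, 10, 10]))
```

Memory use here grows with the number of distinct values rather than with the largest value.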
You could use the bitwise XOR operator. Since all elements occur twice except one, the paired elements cancel each other out, leaving the element that occurs only once.
def SingleOccurance( arr, n):
    result = arr[0]
    # XOR all elements together: 'a' xor 'a' is 0, so every paired number
    # cancels to 0, and 'a' xor 0 is 'a', which leaves the single number
    for i in range(1,n):
        result = result ^ arr[i]
    return result
Or
If every element instead occurs three times except one, we can sum the bits in the same positions for all the numbers and take the sum modulo 3. The bits for which the sum is not a multiple of 3 are the bits of the number with a single occurrence.
Let us consider the example array {5, 5, 5, 8}. In binary the elements are 0101, 0101, 0101, 1000.
Sum of first bits%3 = (1 + 1 + 1 + 0)%3 = 0
Sum of second bits%3 = (0 + 0 + 0 + 0)%3 = 0
Sum of third bits%3 = (1 + 1 + 1 + 0)%3 = 0
Sum of fourth bits%3 = (0 + 0 + 0 + 1)%3 = 1
Hence the number which appears once is 1000 (8).
Code:
INT_SIZE = 32
def getSingle(arr, n) :
    # Initialize result
    result = 0
    # Iterate through every bit
    for i in range(0, INT_SIZE) :
        # Find sum of set bits at ith position in all array elements
        sm = 0
        x = (1 << i)
        for j in range(0, n) :
            if (arr[j] & x) :
                sm = sm + 1
        # The bits with sum not multiple of 3, are the
        # bits of element with single occurrence.
        if (sm % 3) :
            result = result | x
    return result
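The same bit-counting idea extends to any repeat count k: a bit belongs to the single element exactly when the number of array elements with that bit set is not a multiple of k. The sketch below is an illustration under assumptions: single_among_k is a hypothetical name, and the values are taken to be non-negative integers that fit in int_size bits.

```python
def single_among_k(arr, k=3, int_size=32):
    """Find the element occurring once when all others occur k times.

    Hypothetical helper; assumes non-negative integers below 2 ** int_size.
    """
    result = 0
    for i in range(int_size):
        bit = 1 << i
        # Count how many elements have this bit set.
        set_count = sum(1 for value in arr if value & bit)
        # A leftover after grouping by k must come from the single element.
        if set_count % k:
            result |= bit
    return result


print(single_among_k([5, 5, 5, 8]))
```

With k=2 this gives the same answer as the XOR approach, because XOR is bit counting modulo 2.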
When we sort a list, like
a = [1,2,3,3,2,2,1]
sorted(a) => [1, 1, 2, 2, 2, 3, 3]
equal elements are always adjacent in the resulting list.
How can I achieve the opposite task - shuffle the list so that equal elements are never (or as seldom as possible) adjacent?
For example, for the above list one of the possible solutions is
p = [1,3,2,3,2,1,2]
More formally, given a list a, generate a permutation p of it that minimizes the number of pairs p[i]==p[i+1].
Since the lists are large, generating and filtering all permutations is not an option.
Bonus question: how to generate all such permutations efficiently?
This is the code I'm using to test the solutions: https://gist.github.com/gebrkn/9f550094b3d24a35aebd
UPD: Choosing a winner here was a tough choice, because many people posted excellent answers. @VincentvanderWeele, @David Eisenstat, @Coady, @enrico.bacis and @srgerg provided functions that generate the best possible permutation flawlessly. @tobias_k and David also answered the bonus question (generate all permutations). Additional points to David for the correctness proof.
The code from @VincentvanderWeele appears to be the fastest.
This is along the lines of Thijser's currently incomplete pseudocode. The idea is to take the most frequent of the remaining item types unless it was just taken. (See also Coady's implementation of this algorithm.)
import collections
import heapq
class Sentinel:
    pass
def david_eisenstat(lst):
    counts = collections.Counter(lst)
    heap = [(-count, key) for key, count in counts.items()]
    heapq.heapify(heap)
    output = []
    last = Sentinel()
    while heap:
        minuscount1, key1 = heapq.heappop(heap)
        if key1 != last or not heap:
            last = key1
            minuscount1 += 1
        else:
            minuscount2, key2 = heapq.heappop(heap)
            last = key2
            minuscount2 += 1
            if minuscount2 != 0:
                heapq.heappush(heap, (minuscount2, key2))
        output.append(last)
        if minuscount1 != 0:
            heapq.heappush(heap, (minuscount1, key1))
    return output
Proof of correctness
For two item types, with counts k1 and k2, the optimal solution has k2 - k1 - 1 defects if k1 < k2, 0 defects if k1 = k2, and k1 - k2 - 1 defects if k1 > k2. The = case is obvious. The others are symmetric; each instance of the minority element prevents at most two defects out of a total of k1 + k2 - 1 possible.
This greedy algorithm returns optimal solutions, by the following logic. We call a prefix (partial solution) safe if it extends to an optimal solution. Clearly the empty prefix is safe, and if a safe prefix is a whole solution then that solution is optimal. It suffices to show inductively that each greedy step maintains safety.
The only way that a greedy step introduces a defect is if only one item type remains, in which case there is only one way to continue, and that way is safe. Otherwise, let P be the (safe) prefix just before the step under consideration, let P' be the prefix just after, and let S be an optimal solution extending P. If S extends P' also, then we're done. Otherwise, let P' = Px and S = PQ and Q = yQ', where x and y are items and Q and Q' are sequences.
Suppose first that P does not end with y. By the algorithm's choice, x is at least as frequent in Q as y. Consider the maximal substrings of Q containing only x and y. If the first substring has at least as many x's as y's, then it can be rewritten without introducing additional defects to begin with x. If the first substring has more y's than x's, then some other substring has more x's than y's, and we can rewrite these substrings without additional defects so that x goes first. In both cases, we find an optimal solution T that extends P', as needed.
Suppose now that P does end with y. Modify Q by moving the first occurrence of x to the front. In doing so, we introduce at most one defect (where x used to be) and eliminate one defect (the yy).
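The two-type claim at the start of the proof can be checked by brute force on small counts. In the sketch below the closed form is written as max(0, |k1 - k2| - 1), which matches the three cases stated there; min_defects_two_types is a hypothetical name, and the exhaustive search is only practical for a handful of items.

```python
from itertools import permutations


def defects(seq):
    """Number of adjacent equal pairs in seq."""
    return sum(seq[i - 1] == seq[i] for i in range(1, len(seq)))


def min_defects_two_types(k1, k2):
    """Exhaustively find the fewest defects for k1 a's and k2 b's."""
    items = "a" * k1 + "b" * k2
    return min(defects(p) for p in set(permutations(items)))


# Compare the brute-force optimum with the closed form for small counts.
for k1 in range(1, 5):
    for k2 in range(1, 5):
        expected = max(0, abs(k1 - k2) - 1)
        assert min_defects_two_types(k1, k2) == expected
print("formula matches brute force for all 1 <= k1, k2 <= 4")
```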
Generating all solutions
This is tobias_k's answer plus efficient tests to detect when the choice currently under consideration is globally constrained in some way. The asymptotic running time is optimal, since the overhead of generation is on the order of the length of the output. The worst-case delay unfortunately is quadratic; it could be reduced to linear (optimal) with better data structures.
from collections import Counter
from itertools import permutations
from operator import itemgetter
from random import randrange
def get_mode(count):
    return max(count.items(), key=itemgetter(1))[0]
def enum2(prefix, x, count, total, mode):
    prefix.append(x)
    count_x = count[x]
    if count_x == 1:
        del count[x]
    else:
        count[x] = count_x - 1
    yield from enum1(prefix, count, total - 1, mode)
    count[x] = count_x
    del prefix[-1]
def enum1(prefix, count, total, mode):
    if total == 0:
        yield tuple(prefix)
        return
    if count[mode] * 2 - 1 >= total and [mode] != prefix[-1:]:
        yield from enum2(prefix, mode, count, total, mode)
    else:
        defect_okay = not prefix or count[prefix[-1]] * 2 > total
        mode = get_mode(count)
        for x in list(count.keys()):
            if defect_okay or [x] != prefix[-1:]:
                yield from enum2(prefix, x, count, total, mode)
def enum(seq):
    count = Counter(seq)
    if count:
        yield from enum1([], count, sum(count.values()), get_mode(count))
    else:
        yield ()
def defects(lst):
    return sum(lst[i - 1] == lst[i] for i in range(1, len(lst)))
def test(lst):
    perms = set(permutations(lst))
    opt = min(map(defects, perms))
    slow = {perm for perm in perms if defects(perm) == opt}
    fast = set(enum(lst))
    print(lst, fast, slow)
    assert slow == fast
for r in range(10000):
    test([randrange(3) for i in range(randrange(6))])
Pseudocode:
Sort the list
Loop over the first half of the sorted list and fill all even indices of the result list
Loop over the second half of the sorted list and fill all odd indices of the result list
You will only have p[i]==p[i+1] if more than half of the input consists of the same element, in which case there is no other choice than putting the same element in consecutive spots (by the pigeonhole principle).
As pointed out in the comments, this approach may have one conflict too many in case one of the elements occurs at least n/2 times (or n/2+1 for odd n; this generalizes to (n+1)/2 for both even and odd n). There are at most two such elements and if there are two, the algorithm works just fine. The only problematic case is when there is one element that occurs at least half of the time. We can simply solve this problem by finding the element and dealing with it first.
I don't know enough about python to write this properly, so I took the liberty to copy the OP's implementation of a previous version from github:
def interleave_sorted(lst):  # hypothetical wrapper name; the snippet showed only the body
    # Sort the list
    a = sorted(lst)
    # Put the element occurring more than half of the times in front (if needed)
    n = len(a)
    m = (n + 1) // 2
    for i in range(n - m + 1):
        if a[i] == a[i + m - 1]:
            a = a[i:] + a[:i]
            break
    result = [None] * n
    # Loop over the first half of the sorted list and fill all even indices of the result list
    for i, elt in enumerate(a[:m]):
        result[2*i] = elt
    # Loop over the second half of the sorted list and fill all odd indices of the result list
    for i, elt in enumerate(a[m:]):
        result[2*i+1] = elt
    return result
The algorithm already given of taking the most common item left that isn't the previous item is correct. Here's a simple implementation, which optimally uses a heap to track the most common.
import collections, heapq
def nonadjacent(keys):
    heap = [(-count, key) for key, count in collections.Counter(keys).items()]
    heapq.heapify(heap)
    count, key = 0, None
    while heap:
        count, key = heapq.heapreplace(heap, (count, key)) if count else heapq.heappop(heap)
        yield key
        count += 1
    for index in range(-count):
        yield key
>>> a = [1,2,3,3,2,2,1]
>>> list(nonadjacent(a))
[2, 1, 2, 3, 1, 2, 3]
You can generate all the 'perfectly unsorted' permutations (that have no two equal elements in adjacent positions) using a recursive backtracking algorithm. In fact, the only difference to generating all the permutations is that you keep track of the last number and exclude some solutions accordingly:
def unsort(lst, last=None):
    if lst:
        for i, e in enumerate(lst):
            if e != last:
                for perm in unsort(lst[:i] + lst[i+1:], e):
                    yield [e] + perm
    else:
        yield []
Note that in this form the function is not very efficient, as it creates lots of sub-lists. Also, we can speed it up by looking at the most-constrained numbers first (those with the highest count). Here's a much more efficient version using only the counts of the numbers.
import collections
def unsort_generator(lst, sort=False):
    counts = collections.Counter(lst)
    def unsort_inner(remaining, last=None):
        if remaining > 0:
            # most-constrained first, or sorted for pretty-printing?
            items = sorted(counts.items()) if sort else counts.most_common()
            for n, c in items:
                if n != last and c > 0:
                    counts[n] -= 1   # update counts
                    for perm in unsort_inner(remaining - 1, n):
                        yield [n] + perm
                    counts[n] += 1   # revert counts
        else:
            yield []
    return unsort_inner(len(lst))
You can use this to generate just the next perfect permutation, or a list holding all of them. But note, that if there is no perfectly unsorted permutation, then this generator will consequently yield no results.
>>> lst = [1,2,3,3,2,2,1]
>>> next(unsort_generator(lst))
[2, 1, 2, 3, 1, 2, 3]
>>> list(unsort_generator(lst, sort=True))
[[1, 2, 1, 2, 3, 2, 3],
... 36 more ...
[3, 2, 3, 2, 1, 2, 1]]
>>> next(unsort_generator([1,1,1]))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
To circumvent this problem, you could use this together with one of the algorithms proposed in the other answers as a fallback. This will guarantee to return a perfectly unsorted permutation, if there is one, or a good approximation otherwise.
def unsort_safe(lst):
    try:
        return next(unsort_generator(lst))
    except StopIteration:
        return unsort_fallback(lst)
In Python you could do the following.
Suppose you have a sorted list l; you can do:
length = len(l)
odd_ind = length%2
odd_half = (length - odd_ind)//2
for i in range(odd_half)[::2]:
    l[i], l[odd_half+odd_ind+i] = l[odd_half+odd_ind+i], l[i]
These are just in-place operations and should thus be rather fast (O(N)).
Note that you will shift from l[i] == l[i+1] to l[i] == l[i+2], so the order you end up with is anything but random, but from how I understand the question it is not randomness you are looking for.
The idea is to split the sorted list in the middle and then exchange every other element in the two parts.
For l = [1, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5] this leads to l = [3, 1, 4, 2, 5, 3, 1, 4, 1, 5, 2]
The method fails to get rid of all the l[i] == l[i + 1] as soon as the abundance of one element is bigger than or equal to half of the length of the list.
While the above works fine as long as the abundance of the most frequent element is smaller than half the size of the list, the following function also handles the limit cases (the famous off-by-one issue) where every other element starting with the first one must be the most abundant one:
def no_adjacent(my_list):
    my_list.sort()
    length = len(my_list)
    odd_ind = length%2
    odd_half = (length - odd_ind)//2
    for i in range(odd_half)[::2]:
        my_list[i], my_list[odd_half+odd_ind+i] = my_list[odd_half+odd_ind+i], my_list[i]
    #this is just for the limit case where the abundance of the most frequent is half of the list length
    if max([my_list.count(val) for val in set(my_list)]) + 1 - odd_ind > odd_half:
        max_val = my_list[0]
        max_count = my_list.count(max_val)
        for val in set(my_list):
            if my_list.count(val) > max_count:
                max_val = val
                max_count = my_list.count(max_val)
        while max_val in my_list:
            my_list.remove(max_val)
        out = [max_val]
        max_count -= 1
        for val in my_list:
            out.append(val)
            if max_count:
                out.append(max_val)
                max_count -= 1
        if max_count:
            print('this is not working')
            return my_list
            #raise Exception('not possible')
        return out
    else:
        return my_list
Here is a good algorithm:
First of all, count how often each number occurs. Place the answer in a map.
Sort this map so that the numbers that occur most often come first.
The first number of your answer is the first number in the sorted map.
Re-sort the map, with the count of that first number now being one smaller.
If you want to improve efficiency, look for ways to increase the efficiency of the sorting step.
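The steps above can be sketched as follows. The steps do not say how to avoid placing the same number twice in a row, so this sketch borrows the rule stated in another answer (take the most frequent remaining value unless it was just taken); that rule and the name spread_by_resorting are assumptions, not part of the pseudocode.

```python
from collections import Counter


def spread_by_resorting(items):
    """Greedy rearrangement that re-ranks the counts at every step.

    Hypothetical helper; the "skip the value just placed" rule is an
    assumption borrowed from another answer.
    """
    counts = Counter(items)
    result = []
    while counts:
        # Re-sort the remaining values, most frequent first.
        ranked = [value for value, _ in counts.most_common()]
        choice = ranked[0]
        # Avoid repeating the previous value when another one is available.
        if result and choice == result[-1] and len(ranked) > 1:
            choice = ranked[1]
        result.append(choice)
        counts[choice] -= 1
        if counts[choice] == 0:
            del counts[choice]
    return result


print(spread_by_resorting([1, 2, 3, 3, 2, 2, 1]))
```

Re-sorting on every step costs O(k log k) for k distinct values, which is the sorting step the last line suggests optimising, for example with a heap.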
In answer to the bonus question: this is an algorithm which finds all permutations of a set where no adjacent elements can be identical. I believe this to be the most efficient algorithm conceptually (although others may be faster in practice because they translate into simpler code). It doesn't use brute force, it only generates unique permutations, and paths not leading to solutions are cut off at the earliest point.
I will use the term "abundant element" for an element in a set which occurs more often than all other elements combined, and the term "abundance" for the number of abundant elements minus the number of other elements.
e.g. the set abac has no abundant element, the sets abaca and aabcaa have a as the abundant element, and abundance 1 and 2 respectively.
Start with a set like:
aaabbcd
Separate the first occurrences from the repeats:
firsts: abcd
repeats: aab
Find the abundant element in the repeats, if any, and calculate the abundance:
abundant element: a
abundance: 1
Generate all permutations of the firsts where the number of elements after the abundant element is not less than the abundance (so in the example the "a" cannot be last, which rules out bcda, bdca, cbda, cdba, dbca and dcba):
abcd, abdc, acbd, acdb, adbc, adcb, bacd, badc, bcad, bdac,
cabd, cadb, cbad, cdab, dabc, dacb, dbac, dcab
For each permutation, insert the set of repeated characters one by one, following these rules:
5.1. If the abundance of the set is greater than the number of elements after the last occurrence of the abundant element in the permutation so far, skip to the next permutation.
e.g. when permutation so far is abc, a set with abundant element a can only be inserted if the abundance is 2 or less, so aaaabc is ok, aaaaabc isn't.
5.2. Select the element from the set whose last occurrence in the permutation comes first.
e.g. when permutation so far is abcba and set is ab, select b
5.3. Insert the selected element at least 2 positions to the right of its last occurrence in the permutation.
e.g. when inserting b into permutation babca, results are babcba and babcab
5.4. Recurse step 5 with each resulting permutation and the rest of the set.
EXAMPLE:
set = abcaba
firsts = abc
repeats = aab
perm3  set  select  perm4  set  select  perm5  set  select  perm6
abc    aab  a       abac   ab   b       ababc  a    a       ababac
                                                            ababca
                                        abacb  a    a       abacab
                                                            abacba
                    abca   ab   b       abcba  a    -
                                        abcab  a    a       abcaba
acb    aab  a       acab   ab   a       acaba  b    b       acabab
                    acba   ab   b       acbab  a    a       acbaba
bac    aab  b       babc   aa   a       babac  a    a       babaca
                                        babca  a    -
                    bacb   aa   a       bacab  a    a       bacaba
                                        bacba  a    -
bca    aab  -
cab    aab  a       caba   ab   b       cabab  a    a       cababa
cba    aab  -
This algorithm generates unique permutations. If you want to know the total number of permutations (where aba is counted twice because you can switch the a's), multiply the number of unique permutations with a factor:
F = N1! * N2! * ... * Nn!
where N is the number of occurrences of each element in the set. For a set abcdabcaba this would be 4! * 3! * 2! * 1! or 288, which demonstrates how inefficient an algorithm is that generates all permutations instead of only the unique ones. To list all permutations in this case, just list the unique permutations 288 times :-)
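The factor F can be computed directly from the element counts. A small Python sketch, with duplicate_factor as a hypothetical name (math.prod needs Python 3.8 or later):

```python
from collections import Counter
from math import factorial, prod


def duplicate_factor(items):
    """Product of the factorials of each element's count (hypothetical name)."""
    return prod(factorial(n) for n in Counter(items).values())


print(duplicate_factor("abcdabcaba"))
```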
Below is a (rather clumsy) implementation in Javascript; I suspect that a language like Python may be better suited for this sort of thing. Run the code snippet to calculate the separated permutations of "abracadabra".
// FIND ALL PERMUTATIONS OF A SET WHERE NO ADJACENT ELEMENTS ARE IDENTICAL
function seperatedPermutations(set) {
    var unique = 0, factor = 1, firsts = [], repeats = [], abund;
    seperateRepeats(set);
    abund = abundance(repeats);
    permutateFirsts([], firsts);
    alert("Permutations of [" + set + "]\ntotal: " + (unique * factor) + ", unique: " + unique);
    // SEPARATE REPEATED CHARACTERS AND CALCULATE TOTAL/UNIQUE RATIO
    function seperateRepeats(set) {
        for (var i = 0; i < set.length; i++) {
            var first, elem = set[i];
            if (firsts.indexOf(elem) == -1) firsts.push(elem)
            else if ((first = repeats.indexOf(elem)) == -1) {
                repeats.push(elem);
                factor *= 2;
            } else {
                repeats.splice(first, 0, elem);
                factor *= repeats.lastIndexOf(elem) - first + 2;
            }
        }
    }
    // FIND ALL PERMUTATIONS OF THE FIRSTS USING RECURSION
    function permutateFirsts(perm, set) {
        if (set.length > 0) {
            for (var i = 0; i < set.length; i++) {
                var s = set.slice();
                var e = s.splice(i, 1);
                if (e[0] == abund.elem && s.length < abund.num) continue;
                permutateFirsts(perm.concat(e), s, abund);
            }
        }
        else if (repeats.length > 0) {
            insertRepeats(perm, repeats);
        }
        else {
            document.write(perm + "<BR>");
            ++unique;
        }
    }
    // INSERT REPEATS INTO THE PERMUTATIONS USING RECURSION
    function insertRepeats(perm, set) {
        var abund = abundance(set);
        if (perm.length - perm.lastIndexOf(abund.elem) > abund.num) {
            var sel = selectElement(perm, set);
            var s = set.slice();
            var elem = s.splice(sel, 1)[0];
            for (var i = perm.lastIndexOf(elem) + 2; i <= perm.length; i++) {
                var p = perm.slice();
                p.splice(i, 0, elem);
                if (set.length == 1) {
                    document.write(p + "<BR>");
                    ++unique;
                } else {
                    insertRepeats(p, s);
                }
            }
        }
    }
    // SELECT THE ELEMENT FROM THE SET WHOSE LAST OCCURRENCE IN THE PERMUTATION COMES FIRST
    function selectElement(perm, set) {
        var sel, pos, min = perm.length;
        for (var i = 0; i < set.length; i++) {
            pos = perm.lastIndexOf(set[i]);
            if (pos < min) {
                min = pos;
                sel = i;
            }
        }
        return(sel);
    }
    // FIND ABUNDANT ELEMENT AND ABUNDANCE NUMBER
    function abundance(set) {
        if (set.length == 0) return ({elem: null, num: 0});
        var elem = set[0], max = 1, num = 1;
        for (var i = 1; i < set.length; i++) {
            if (set[i] != set[i - 1]) num = 1
            else if (++num > max) {
                max = num;
                elem = set[i];
            }
        }
        return ({elem: elem, num: 2 * max - set.length});
    }
}
seperatedPermutations(["a","b","r","a","c","a","d","a","b","r","a"]);
The idea is to sort the elements from the most common to the least common, take the most common, decrease its count and put it back in the list keeping the descending order (but avoiding putting the last used element first to prevent repetitions when possible).
This can be implemented using Counter and bisect:
from collections import Counter
from bisect import bisect
def unsorted(lst):
    # use elements (-count, item) so bisect will put biggest counts first
    items = [(-count, item) for item, count in Counter(lst).most_common()]
    result = []
    while items:
        count, item = items.pop(0)
        result.append(item)
        if count != -1:
            element = (count + 1, item)
            index = bisect(items, element)
            # prevent insertion in position 0 if there are other items
            items.insert(index or (1 if items else 0), element)
    return result
Example
>>> print unsorted([1, 1, 1, 2, 3, 3, 2, 2, 1])
[1, 2, 1, 2, 1, 3, 1, 2, 3]
>>> print unsorted([1, 2, 3, 2, 3, 2, 2])
[2, 3, 2, 1, 2, 3, 2]
Sort the list.
Generate a "best shuffle" of the list using this algorithm
It will give the minimum of items from the list in their original place (by item value) so it will try, for your example, to put the 1's, 2's and 3's away from their sorted positions.
Start with the sorted list of length n.
Let m=n/2.
Take the values at 0, then m, then 1, then m+1, then 2, then m+2, and so on.
Unless you have more than half of the numbers the same, you'll never get equivalent values in consecutive order.
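The interleaving described above could be sketched like this. For odd lengths the text leaves the rounding of m open, so the sketch rounds up, which is an assumption; interleave_halves is a hypothetical name.

```python
def interleave_halves(values):
    """Take sorted values at 0, m, 1, m+1, ... (hypothetical helper)."""
    a = sorted(values)
    # Assumption: round m up so the first half is never the shorter one.
    m = (len(a) + 1) // 2
    result = []
    for i in range(m):
        result.append(a[i])
        # The second half may be one element shorter for odd lengths.
        if m + i < len(a):
            result.append(a[m + i])
    return result


print(interleave_halves([1, 2, 3, 3, 2, 2, 1]))
```

As the text notes, this only avoids equal neighbours when no value makes up more than half of the list.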
Please forgive my "me too" style answer, but couldn't Coady's answer be simplified to this?
from collections import Counter
from heapq import heapify, heappop, heapreplace
from itertools import repeat
def srgerg(data):
    heap = [(-freq+1, value) for value, freq in Counter(data).items()]
    heapify(heap)
    freq = 0
    while heap:
        freq, val = heapreplace(heap, (freq+1, val)) if freq else heappop(heap)
        yield val
    yield from repeat(val, -freq)
Edit: Here's a python 2 version that returns a list:
def srgergpy2(data):
    heap = [(-freq+1, value) for value, freq in Counter(data).items()]
    heapify(heap)
    freq = 0
    result = list()
    while heap:
        freq, val = heapreplace(heap, (freq+1, val)) if freq else heappop(heap)
        result.append(val)
    result.extend(repeat(val, -freq))
    return result
Count number of times each value appears
Select values in order from most frequent to least frequent
Add selected value to final output, incrementing the index by 2 each time
Reset index to 1 if index out of bounds
from collections import defaultdict
from heapq import heapify, heappop
def distribute(values):
    counts = defaultdict(int)
    for value in values:
        counts[value] += 1
    counts = [(-count, key) for key, count in counts.items()]
    heapify(counts)
    index = 0
    length = len(values)
    distributed = [None] * length
    while counts:
        count, value = heappop(counts)
        for _ in range(-count):
            distributed[index] = value
            index = index + 2 if index + 2 < length else 1
    return distributed