Array of positive integers, ideas for efficient implementation - python

I have a small problem within a bigger problem.
I have an array of positive integers. I need to find a position i in the array such that all the numbers which are smaller than the element at position i should appear after it.
Example:
(let's assume array is indexed at 1)
2, 3, 4, 1, 9,3, 2 => 3rd pos // 1,2,3 are less than 4 and are occurring after it.
5, 2, 1, 5 => 2nd pos
1,2,1 => 2nd pos
1, 4, 6, 7, 2, 3 => doesn't exist
I'm thinking of using a hash table, but I don't know exactly how. Or would sorting be better? Any ideas for an efficient approach?

We can start by creating a map (or hash table or whatever), which records the last occurrence for each entry:
for i from 1 to n
    lastOccurrence[arr[i]] = i
next
We know that if j is a valid answer, then every number smaller than j is also a valid answer. So we want to find the maximum j. The minimum j is obviously 1 because then the left sublist is empty.
We can then iterate all possible js and check their validity.
maxJ = n
for j from 1 to n
    if j > maxJ
        return maxJ
    if lastOccurrence[arr[j]] == j
        return j
    maxJ = min(maxJ, lastOccurrence[arr[j]] - 1)
next

def findMaxIndex(array):
    lastSet = set()  # built-in set; the old `sets` module was removed in Python 3
    size = len(array)
    maxIndex = size
    for index in range(size-1,-1,-1):
        if array[index] in lastSet:
            continue
        else:
            lastSet.add(array[index])
            maxIndex = index + 1
    if maxIndex == 1:
        return 0 # doesn't exist
    else:
        return maxIndex
Iterate from the last element to the first, using a set to record the elements already seen. If the current element (index i) is not in the set, then i becomes the current answer and the element is added to the set.
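One way to read the four examples in the question is "the first 1-based position whose value never appears again, with position 1 counting as 'doesn't exist'". Under that assumption, a dictionary of last occurrences gives a short sketch. The function name is made up for illustration:

```python
def first_non_recurring_position(values):
    """Return the first 1-based position whose value never appears again.

    Assumption: position 1 is reported as None ("doesn't exist"),
    which is how the question's last example is interpreted here.
    """
    # Later occurrences overwrite earlier ones, so each value maps to its last position.
    last_position = {value: pos for pos, value in enumerate(values, start=1)}
    for pos, value in enumerate(values, start=1):
        if last_position[value] == pos:
            return pos if pos > 1 else None
    return None  # only reached for an empty list


print(first_non_recurring_position([2, 3, 4, 1, 9, 3, 2]))  # 3
print(first_non_recurring_position([1, 4, 6, 7, 2, 3]))     # None
```

Both passes are linear, so the whole thing runs in O(n) time with O(n) extra space.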

Related

How many steps to the nearest zero value

Looking for some help with solving a seemingly easy algorithm.
Brief overview:
We have an unsorted list of whole numbers. The goal is to find out how far each element of this list is from the nearest '0' value.
So if we have a list similar to this: [0, 1, 2, 0, 4, 5, 6, 7, 0, 5, 6, 9]
The expected result will be: [0, 1, 1, 0, 1, 2, 2, 1, 0, 1, 2, 3]
I've tried to simplify the problem in order to come up with some naive algorithm, but I can't figure out how to keep track of previous and next zero values.
My initial thoughts were to figure out all indexes for zeros in the list and fill the gaps between those zeros with values, but this obviously didn't quite work out for me.
The poorly implemented code (so far I'm just counting down the steps to the next zero):
def get_empty_lot_index(arr: list) -> list:
    ''' Gets all indices of empty lots '''
    lots = []
    for i in range(len(arr)):
        if arr[i] == 0:
            lots.append(i)
    return lots
def space_to_empty_lots(arr: list) -> list:
    empty_lots = get_empty_lot_index(arr)
    new_arr = []
    start = 0
    for i in empty_lots:
        steps = i - start
        while steps >= 0:
            new_arr.append(steps)
            steps -= 1
        start = i + 1
    return new_arr
One possible algorithm is to make two sweeps through the input list: once forward, once backward. Each time retain the index of the last encountered 0 and store the difference. In the second sweep take the minimum of what was stored in the first sweep and the new result:
def space_to_empty_lots(arr: list) -> list:
    result = []
    # first sweep
    lastZero = -len(arr)
    for i, value in enumerate(arr):
        if value == 0:
            lastZero = i
        result.append(i - lastZero)
    # second sweep
    lastZero = len(arr)*2
    for i, value in reversed(list(enumerate(arr))):
        if value == 0:
            lastZero = i
        result[i] = min(result[i], lastZero - i)
    return result
NB: this function assumes that there is at least one 0 in the input list. It is not clear what the function should do when there is no 0. In that case this implementation will return a list with values greater than the length of the input list.
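If the no-zero case matters, one possible variation of the same two-sweep idea is to use infinity as the "no zero seen yet" marker, so that a list without zeros yields infinite distances instead of arbitrary large numbers. This is only a sketch; the function name and the choice of `float("inf")` are assumptions, not part of the answer above:

```python
def nearest_zero_distances(values):
    """Distance from each element to the nearest 0, or infinity if there is no 0."""
    infinity = float("inf")
    distances = [infinity] * len(values)

    # Forward sweep: distance to the closest zero on the left.
    last_zero = None
    for i, value in enumerate(values):
        if value == 0:
            last_zero = i
        if last_zero is not None:
            distances[i] = i - last_zero

    # Backward sweep: keep the smaller of the left and right distances.
    last_zero = None
    for i in range(len(values) - 1, -1, -1):
        if values[i] == 0:
            last_zero = i
        if last_zero is not None:
            distances[i] = min(distances[i], last_zero - i)

    return distances


print(nearest_zero_distances([0, 1, 2, 0, 4, 5, 6, 7, 0, 5, 6, 9]))
# [0, 1, 1, 0, 1, 2, 2, 1, 0, 1, 2, 3]
```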

How can i find the median of a number from a list? Python

I'm trying to find the median of a list. As I understand it, the median is the term at position (number of terms)/2. The code I've tried finds that position and then indexes the list, but when I index I get 0 or an error. Why is this?
def Median():
    #MedianList_Str = ""
    MedianList = [2,4,6]
    print("What number do you want to add to the array, enter 0 to exit")
    try:
        Int = int(input())
        if Int == 0:
            QuitApp()
        else:
            MedianList.append(Int)
    except:
        print("Please enter a number")
    MedianT = math.floor(len(MedianList)/2) #finds the nth term
    MedianList.sort #sorts the list so you can find the median term
    MedianList_Str.join(MedianList)
This is what I've done. I've also tried index:
def Median():
    MedianList_Str = ""
    MedianList = [2,4,6]
    print("What number do you want to add to the array, enter 0 to exit")
    try:
        Int = int(input())
        if Int == 0:
            QuitApp()
        else:
            MedianList.append(Int)
    except:
        print("Please enter a number")
    MedianT = math.floor(len(MedianList)/2) #finds the nth term
    MedianList.sort #sorts the list so you can find the median term
    print(MedianList.index(MedianT))
which gets me 0.
What can I do to get the median? I understand that such a question already exists, but I want to try a different way.
Here's a nice trick to avoid using if/else to handle the odd and even length cases separately: the indices in the middle are (len(nums) - 1) // 2 and len(nums) // 2. If the length is odd, then these indices are equal, so adding the values and dividing by 2 has no effect.
Note that you should do floor-division with the // operator to get an integer to use as an index.
def median(nums):
    nums = sorted(nums)
    middle1 = (len(nums) - 1) // 2
    middle2 = len(nums) // 2
    return (nums[middle1] + nums[middle2]) / 2
Examples:
>>> median([1, 2, 3, 4])
2.5
>>> median([1, 2, 3, 4, 5])
3.0
As others have mentioned, I would use sorted(MedianList) instead of MedianList.sort (which, written without parentheses, never actually sorts the list).
In addition, use square-bracket indexing instead of the .index() function.
For example, this:
print(MedianList[MedianT])
Instead of this:
print(MedianList.index(MedianT))
I have included below with comments the logic and thought process of finding median value of an array in Python.
def median(array):
    length = len(array)
    sorted_arr = sorted(array) # sorting takes O(n log n) time
    # we are subtracting 1 from overall
    # length because indexing is 0-based
    # essentially indexes of arrays start at 0
    # while counting length starts at 1
    # idx_norm = (length-1) / 2 # using only single division operator yields a float
    idx = (length-1) // 2 # using floor division operator // yields an Int which can be used for index
    # we run a simple condition to see
    # whether the overall length of array
    # is even or odd.
    # If odd then we can use index value (idx) to find median
    # we use modulus operator to see if there is any remainder
    # for a division operation by 2. If remainder equals 0 then
    # array is even. If not array is odd.
    if length % 2 == 0:
        # If even we have to use index value (idx) and the next index value (idx + 1)
        # to create total and then divide for average
        return (sorted_arr[idx] + sorted_arr[idx + 1]) / 2.0 # if you need an Int returned, then wrap this operation in int() conversion method
    else:
        return sorted_arr[idx]
a = [1, 2, 3, 4, 12, 1, 9] # 7 elements, odd length --> return 3
b = [2, 3, 7, 6, 8, 9] # 6 elements, even length --> return 6.5
median(a)
median(b)
Please let me know if you have any questions and hope this helps. Cheers!
The median is either the middle element by value, or the average of the middle two if the length of the array is even.
So first we must sort the array, and then apply our logic.
def median(l):
    l = sorted(l)
    middle = len(l) // 2
    return l[middle] if len(l) % 2 == 1 else (l[middle - 1] + l[middle]) / 2
Note that there exist more efficient algorithms to find the median, that take O(n) instead of O(n log n) time, however they're not trivial to implement and are not available in Python's standard library.
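For the sort-based approach itself, the standard library does ship a ready-made function: `statistics.median` (available since Python 3.4). As far as I know it sorts its input internally, so it does not change the O(n log n) bound mentioned above. A minimal usage sketch:

```python
import statistics

# Even length: the mean of the two middle values.
print(statistics.median([1, 2, 3, 4]))     # 2.5

# Odd length: the middle value itself, returned unchanged.
print(statistics.median([1, 2, 3, 4, 5]))  # 3
```

Note that it raises `statistics.StatisticsError` on empty input, which the hand-written versions in this thread do not handle either.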
I created a class. You can add elements to your list with the addNum method. The findMedian method will check if the length of the list is odd or even and will give you the median depending on the situation.
class MedianFinder:
    def __init__(self):
        self.num_list = []
    def addNum(self, num):
        self.num_list.append(num)
        return self.num_list
    def findMedian(self):
        # The median is defined on sorted data, so sort a copy first
        nums = sorted(self.num_list)
        # Check len of list
        if len(nums) % 2 == 0:
            l = len(nums) // 2
            return (nums[l - 1] + nums[l]) / 2
        else:
            l = len(nums) // 2
            return nums[l]
Example usage:
s = MedianFinder()
s.addNum(2)
s.addNum(4)
s.addNum(6)
s.addNum(8)
print(s.findMedian())
Output: 5.0

Compare if every element of a list is larger than every other element

I want to check whether the first element of a list is larger than every other element (and the same for every other element).
If one element is larger than another it gets a 1. The sum of 1s (the number of comparisons "won") should be stored in a way that lets me know how many comparisons were won by each specific element of the list.
To clarify, every element of the list would be an individual with an ID.
Python
#Here I create 10 random values which I call individual with the random
#funcion plus mean and standard deviation
a, b = 3, 10
mu, sigma = 5.6, 2
dist = stats.truncnorm((a - mu) / sigma, (b - mu) / sigma, loc=mu, scale=sigma)
individuals = dist.rvs(10)
#Initialize the list where I want to store the 1s
outcome = num.zeros(n)
#Trying to loop through all the elements
for k in range(0, n):
    for j in range(0, n):
        if individuals[k] == individuals[j]:
            continue
        elif individuals[k] < individuals[j]:
            continue
        elif individuals[k] > individuals[j]:
            outcome[i] += 1
return outcome[i]
I end up having an outcome with one single value.
Probably it summed up every 1s in the first element
Here is a more efficient way, by sorting the list first, which makes the process O(n*log(n)) instead of O(n**2).
We sort the list, keeping the original index of each value (this is O(n*log(n))).
Then, we go once over the list to set the output counts, which are the indices of the values in the sorted list, except for the duplicates - in this case, we just keep track of the number of identical values to adjust the result.
def larger_than(values):
    ordered_values = sorted((value, index) for index, value in enumerate(values))
    out = [None] * len(values)
    # take care of equal values
    equals = 0
    prev = None
    for rank, (value, index) in enumerate(ordered_values):
        if value == prev:
            equals += 1
        else:
            equals = 0
            prev = value
        out[index] = rank - equals
    return out
Some test:
values = [1, 4, 3, 3, 10, 1, 5, 2, 7, 6]
print(larger_than(values))
# [0, 5, 3, 3, 9, 0, 6, 2, 8, 7]
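A shorter way to get the same counts, still in O(n*log(n)), is to sort once and then binary-search each value: the leftmost insertion point of a value in the sorted list equals the number of strictly smaller elements. This is a sketch of an alternative, not the code above; the function name is hypothetical:

```python
from bisect import bisect_left


def count_smaller(values):
    """For each element, count how many other elements are strictly smaller."""
    ordered = sorted(values)
    # bisect_left returns the index of the first entry >= value,
    # i.e. the number of entries strictly below it.
    return [bisect_left(ordered, value) for value in values]


print(count_smaller([1, 4, 3, 3, 10, 1, 5, 2, 7, 6]))
# [0, 5, 3, 3, 9, 0, 6, 2, 8, 7]
```

Duplicates need no special handling here, because equal values all share the same leftmost insertion point.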

Generate all permutations of a list without adjacent equal elements

When we sort a list, like
a = [1,2,3,3,2,2,1]
sorted(a) => [1, 1, 2, 2, 2, 3, 3]
equal elements are always adjacent in the resulting list.
How can I achieve the opposite task - shuffle the list so that equal elements are never (or as seldom as possible) adjacent?
For example, for the above list one of the possible solutions is
p = [1,3,2,3,2,1,2]
More formally, given a list a, generate a permutation p of it that minimizes the number of pairs p[i]==p[i+1].
Since the lists are large, generating and filtering all permutations is not an option.
Bonus question: how to generate all such permutations efficiently?
This is the code I'm using to test the solutions: https://gist.github.com/gebrkn/9f550094b3d24a35aebd
UPD: Choosing a winner here was a tough choice, because many people posted excellent answers. @VincentvanderWeele, @David Eisenstat, @Coady, @enrico.bacis and @srgerg provided functions that generate the best possible permutation flawlessly. @tobias_k and David also answered the bonus question (generate all permutations). Additional points to David for the correctness proof.
The code from @VincentvanderWeele appears to be the fastest.
This is along the lines of Thijser's currently incomplete pseudocode. The idea is to take the most frequent of the remaining item types unless it was just taken. (See also Coady's implementation of this algorithm.)
import collections
import heapq
class Sentinel:
    pass
def david_eisenstat(lst):
    counts = collections.Counter(lst)
    heap = [(-count, key) for key, count in counts.items()]
    heapq.heapify(heap)
    output = []
    last = Sentinel()
    while heap:
        minuscount1, key1 = heapq.heappop(heap)
        if key1 != last or not heap:
            last = key1
            minuscount1 += 1
        else:
            minuscount2, key2 = heapq.heappop(heap)
            last = key2
            minuscount2 += 1
            if minuscount2 != 0:
                heapq.heappush(heap, (minuscount2, key2))
        output.append(last)
        if minuscount1 != 0:
            heapq.heappush(heap, (minuscount1, key1))
    return output
Proof of correctness
For two item types, with counts k1 and k2, the optimal solution has k2 - k1 - 1 defects if k1 < k2, 0 defects if k1 = k2, and k1 - k2 - 1 defects if k1 > k2. The = case is obvious. The others are symmetric; each instance of the minority element prevents at most two defects out of a total of k1 + k2 - 1 possible.
This greedy algorithm returns optimal solutions, by the following logic. We call a prefix (partial solution) safe if it extends to an optimal solution. Clearly the empty prefix is safe, and if a safe prefix is a whole solution then that solution is optimal. It suffices to show inductively that each greedy step maintains safety.
The only way that a greedy step introduces a defect is if only one item type remains, in which case there is only one way to continue, and that way is safe. Otherwise, let P be the (safe) prefix just before the step under consideration, let P' be the prefix just after, and let S be an optimal solution extending P. If S extends P' also, then we're done. Otherwise, let P' = Px and S = PQ and Q = yQ', where x and y are items and Q and Q' are sequences.
Suppose first that P does not end with y. By the algorithm's choice, x is at least as frequent in Q as y. Consider the maximal substrings of Q containing only x and y. If the first substring has at least as many x's as y's, then it can be rewritten without introducing additional defects to begin with x. If the first substring has more y's than x's, then some other substring has more x's than y's, and we can rewrite these substrings without additional defects so that x goes first. In both cases, we find an optimal solution T that extends P', as needed.
Suppose now that P does end with y. Modify Q by moving the first occurrence of x to the front. In doing so, we introduce at most one defect (where x used to be) and eliminate one defect (the yy).
Generating all solutions
This is tobias_k's answer plus efficient tests to detect when the choice currently under consideration is globally constrained in some way. The asymptotic running time is optimal, since the overhead of generation is on the order of the length of the output. The worst-case delay unfortunately is quadratic; it could be reduced to linear (optimal) with better data structures.
from collections import Counter
from itertools import permutations
from operator import itemgetter
from random import randrange
def get_mode(count):
    return max(count.items(), key=itemgetter(1))[0]
def enum2(prefix, x, count, total, mode):
    prefix.append(x)
    count_x = count[x]
    if count_x == 1:
        del count[x]
    else:
        count[x] = count_x - 1
    yield from enum1(prefix, count, total - 1, mode)
    count[x] = count_x
    del prefix[-1]
def enum1(prefix, count, total, mode):
    if total == 0:
        yield tuple(prefix)
        return
    if count[mode] * 2 - 1 >= total and [mode] != prefix[-1:]:
        yield from enum2(prefix, mode, count, total, mode)
    else:
        defect_okay = not prefix or count[prefix[-1]] * 2 > total
        mode = get_mode(count)
        for x in list(count.keys()):
            if defect_okay or [x] != prefix[-1:]:
                yield from enum2(prefix, x, count, total, mode)
def enum(seq):
    count = Counter(seq)
    if count:
        yield from enum1([], count, sum(count.values()), get_mode(count))
    else:
        yield ()
def defects(lst):
    return sum(lst[i - 1] == lst[i] for i in range(1, len(lst)))
def test(lst):
    perms = set(permutations(lst))
    opt = min(map(defects, perms))
    slow = {perm for perm in perms if defects(perm) == opt}
    fast = set(enum(lst))
    print(lst, fast, slow)
    assert slow == fast
for r in range(10000):
    test([randrange(3) for i in range(randrange(6))])
Pseudocode:
Sort the list
Loop over the first half of the sorted list and fill all even indices of the result list
Loop over the second half of the sorted list and fill all odd indices of the result list
You will only have p[i]==p[i+1] if more than half of the input consists of the same element, in which case there is no other choice than putting the same element in consecutive spots (by the pigeonhole principle).
As pointed out in the comments, this approach may have one conflict too many in case one of the elements occurs at least n/2 times (or n/2+1 for odd n; this generalizes to (n+1)/2 for both even and odd n). There are at most two such elements, and if there are two, the algorithm works just fine. The only problematic case is when there is one element that occurs at least half of the time. We can simply solve this problem by finding the element and dealing with it first.
I don't know enough about python to write this properly, so I took the liberty to copy the OP's implementation of a previous version from github:
def solve(lst):  # function name is illustrative; the copied snippet shows only the body
    # Sort the list
    a = sorted(lst)
    # Put the element occurring more than half of the times in front (if needed)
    n = len(a)
    m = (n + 1) // 2
    for i in range(n - m + 1):
        if a[i] == a[i + m - 1]:
            a = a[i:] + a[:i]
            break
    result = [None] * n
    # Loop over the first half of the sorted list and fill all even indices of the result list
    for i, elt in enumerate(a[:m]):
        result[2*i] = elt
    # Loop over the second half of the sorted list and fill all odd indices of the result list
    for i, elt in enumerate(a[m:]):
        result[2*i+1] = elt
    return result
The algorithm already given of taking the most common item left that isn't the previous item is correct. Here's a simple implementation, which optimally uses a heap to track the most common.
import collections, heapq
def nonadjacent(keys):
    # count the argument itself rather than relying on a global list
    heap = [(-count, key) for key, count in collections.Counter(keys).items()]
    heapq.heapify(heap)
    count, key = 0, None
    while heap:
        count, key = heapq.heapreplace(heap, (count, key)) if count else heapq.heappop(heap)
        yield key
        count += 1
    for index in range(-count):
        yield key
>>> a = [1,2,3,3,2,2,1]
>>> list(nonadjacent(a))
[2, 1, 2, 3, 1, 2, 3]
You can generate all the 'perfectly unsorted' permutations (that have no two equal elements in adjacent positions) using a recursive backtracking algorithm. In fact, the only difference to generating all the permutations is that you keep track of the last number and exclude some solutions accordingly:
def unsort(lst, last=None):
    if lst:
        for i, e in enumerate(lst):
            if e != last:
                for perm in unsort(lst[:i] + lst[i+1:], e):
                    yield [e] + perm
    else:
        yield []
Note that in this form the function is not very efficient, as it creates lots of sub-lists. Also, we can speed it up by looking at the most-constrained numbers first (those with the highest count). Here's a much more efficient version using only the counts of the numbers.
import collections
def unsort_generator(lst, sort=False):
    counts = collections.Counter(lst)
    def unsort_inner(remaining, last=None):
        if remaining > 0:
            # most-constrained first, or sorted for pretty-printing?
            items = sorted(counts.items()) if sort else counts.most_common()
            for n, c in items:
                if n != last and c > 0:
                    counts[n] -= 1   # update counts
                    for perm in unsort_inner(remaining - 1, n):
                        yield [n] + perm
                    counts[n] += 1   # revert counts
        else:
            yield []
    return unsort_inner(len(lst))
You can use this to generate just the next perfect permutation, or a list holding all of them. But note, that if there is no perfectly unsorted permutation, then this generator will consequently yield no results.
>>> lst = [1,2,3,3,2,2,1]
>>> next(unsort_generator(lst))
[2, 1, 2, 3, 1, 2, 3]
>>> list(unsort_generator(lst, sort=True))
[[1, 2, 1, 2, 3, 2, 3],
... 36 more ...
[3, 2, 3, 2, 1, 2, 1]]
>>> next(unsort_generator([1,1,1]))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
To circumvent this problem, you could use this together with one of the algorithms proposed in the other answers as a fallback. This will guarantee to return a perfectly unsorted permutation, if there is one, or a good approximation otherwise.
def unsort_safe(lst):
    try:
        return next(unsort_generator(lst))
    except StopIteration:
        return unsort_fallback(lst)
In python you could do the following.
Consider you have a sorted list l, you can do:
length = len(l)
odd_ind = length%2
odd_half = (length - odd_ind)//2
for i in range(odd_half)[::2]:
    l[i], l[odd_half+odd_ind+i] = l[odd_half+odd_ind+i], l[i]
These are just in place operations and should thus be rather fast (O(N)).
Note that you will shift from l[i] == l[i+1] to l[i] == l[i+2] so the order you end up with is anything but random, but from how I understand the question it is not randomness you are looking for.
The idea is to split the sorted list in the middle then exchange every other element in the two parts.
For l = [1, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5] this leads to l = [3, 1, 4, 2, 5, 3, 1, 4, 1, 5, 2]
The method fails to get rid of all the l[i] == l[i + 1] as soon as the abundance of one element is bigger than or equal to half of the length of the list.
While the above works fine as long as the abundance of the most frequent element is smaller than half the size of the list, the following function also handles the limit cases (the famous off-by-one issue) where every other element starting with the first one must be the most abundant one:
def no_adjacent(my_list):
    my_list.sort()
    length = len(my_list)
    odd_ind = length%2
    odd_half = (length - odd_ind)//2  # floor division so the result can be used in range()
    for i in range(odd_half)[::2]:
        my_list[i], my_list[odd_half+odd_ind+i] = my_list[odd_half+odd_ind+i], my_list[i]
    #this is just for the limit case where the abundance of the most frequent is half of the list length
    if max([my_list.count(val) for val in set(my_list)]) + 1 - odd_ind > odd_half:
        max_val = my_list[0]
        max_count = my_list.count(max_val)
        for val in set(my_list):
            if my_list.count(val) > max_count:
                max_val = val
                max_count = my_list.count(max_val)
        while max_val in my_list:
            my_list.remove(max_val)
        out = [max_val]
        max_count -= 1
        for val in my_list:
            out.append(val)
            if max_count:
                out.append(max_val)
                max_count -= 1
        if max_count:
            print('this is not working')
            return my_list
            #raise Exception('not possible')
        return out
    else:
        return my_list
Here is a good algorithm:
First of all count for all numbers how often they occur. Place the answer in a map.
Sort this map so that the numbers that occur most often come first.
The first number of your answer is the first number in the sorted map.
Re-sort the map with the count of that first number now being one smaller.
If you want to improve efficiency look for ways to increase the efficiency of the sorting step.
In answer to the bonus question: this is an algorithm which finds all permutations of a set where no adjacent elements can be identical. I believe this to be the most efficient algorithm conceptually (although others may be faster in practice because they translate into simpler code). It doesn't use brute force, it only generates unique permutations, and paths not leading to solutions are cut off at the earliest point.
I will use the term "abundant element" for an element in a set which occurs more often than all other elements combined, and the term "abundance" for the number of abundant elements minus the number of other elements.
e.g. the set abac has no abundant element, the sets abaca and aabcaa have a as the abundant element, and abundance 1 and 2 respectively.
1. Start with a set like:
aaabbcd
2. Separate the first occurrences from the repeats:
firsts: abcd
repeats: aab
3. Find the abundant element in the repeats, if any, and calculate the abundance:
abundant element: a
abundance: 1
4. Generate all permutations of the firsts where the number of elements after the abundant element is not less than the abundance (so in the example the "a" cannot be last):
abcd, abdc, acbd, acdb, adbc, adcb, bacd, badc, bcad, bdac,
cabd, cadb, cbad, cdab, dabc, dacb, dbac, dcab
(bcda, bdca, cbda, cdba, dbca and dcba are skipped because "a" is last.)
5. For each permutation, insert the set of repeated characters one by one, following these rules:
5.1. If the abundance of the set is greater than the number of elements after the last occurrence of the abundant element in the permutation so far, skip to the next permutation.
e.g. when permutation so far is abc, a set with abundant element a can only be inserted if the abundance is 2 or less, so aaaabc is ok, aaaaabc isn't.
5.2. Select the element from the set whose last occurrence in the permutation comes first.
e.g. when permutation so far is abcba and set is ab, select b
5.3. Insert the selected element at least 2 positions to the right of its last occurrence in the permutation.
e.g. when inserting b into permutation babca, results are babcba and babcab
5.4. Recurse step 5 with each resulting permutation and the rest of the set.
EXAMPLE:
set = abcaba
firsts = abc
repeats = aab
perm3  set  select  perm4  set  select  perm5  set  select  perm6
abc    aab  a       abac   ab   b       ababc  a    a       ababac
                                                            ababca
                                        abacb  a    a       abacab
                                                            abacba
                    abca   ab   b       abcba  a    -
                                        abcab  a    a       abcaba
acb    aab  a       acab   ab   a       acaba  b    b       acabab
                    acba   ab   b       acbab  a    a       acbaba
bac    aab  b       babc   aa   a       babac  a    a       babaca
                                        babca  a    -
                    bacb   aa   a       bacab  a    a       bacaba
                                        bacba  a    -
bca    aab  -
cab    aab  a       caba   ab   b       cabab  a    a       cababa
cba    aab  -
This algorithm generates unique permutations. If you want to know the total number of permutations (where aba is counted twice because you can switch the a's), multiply the number of unique permutations by a factor:
F = N1! * N2! * ... * Nn!
where N is the number of occurrences of each element in the set. For a set abcdabcaba this would be 4! * 3! * 2! * 1! or 288, which demonstrates how inefficient an algorithm is that generates all permutations instead of only the unique ones. To list all permutations in this case, just list the unique permutations 288 times :-)
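The factor F is easy to compute directly from the element counts. A minimal Python sketch, assuming Python 3.8+ for `math.prod` (the function name is made up for illustration):

```python
from collections import Counter
from math import factorial, prod


def repeat_factor(items):
    """Product of the factorials of each element's multiplicity."""
    return prod(factorial(count) for count in Counter(items).values())


print(repeat_factor("abcdabcaba"))  # 4! * 3! * 2! * 1! = 288
```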
Below is a (rather clumsy) implementation in Javascript; I suspect that a language like Python may be better suited for this sort of thing. Run the code snippet to calculate the separated permutations of "abracadabra".
// FIND ALL PERMUTATIONS OF A SET WHERE NO ADJACENT ELEMENTS ARE IDENTICAL
function seperatedPermutations(set) {
    var unique = 0, factor = 1, firsts = [], repeats = [], abund;
    seperateRepeats(set);
    abund = abundance(repeats);
    permutateFirsts([], firsts);
    alert("Permutations of [" + set + "]\ntotal: " + (unique * factor) + ", unique: " + unique);
    // SEPERATE REPEATED CHARACTERS AND CALCULATE TOTAL/UNIQUE RATIO
    function seperateRepeats(set) {
        for (var i = 0; i < set.length; i++) {
            var first, elem = set[i];
            if (firsts.indexOf(elem) == -1) firsts.push(elem)
            else if ((first = repeats.indexOf(elem)) == -1) {
                repeats.push(elem);
                factor *= 2;
            } else {
                repeats.splice(first, 0, elem);
                factor *= repeats.lastIndexOf(elem) - first + 2;
            }
        }
    }
    // FIND ALL PERMUTATIONS OF THE FIRSTS USING RECURSION
    function permutateFirsts(perm, set) {
        if (set.length > 0) {
            for (var i = 0; i < set.length; i++) {
                var s = set.slice();
                var e = s.splice(i, 1);
                if (e[0] == abund.elem && s.length < abund.num) continue;
                permutateFirsts(perm.concat(e), s, abund);
            }
        }
        else if (repeats.length > 0) {
            insertRepeats(perm, repeats);
        }
        else {
            document.write(perm + "<BR>");
            ++unique;
        }
    }
    // INSERT REPEATS INTO THE PERMUTATIONS USING RECURSION
    function insertRepeats(perm, set) {
        var abund = abundance(set);
        if (perm.length - perm.lastIndexOf(abund.elem) > abund.num) {
            var sel = selectElement(perm, set);
            var s = set.slice();
            var elem = s.splice(sel, 1)[0];
            for (var i = perm.lastIndexOf(elem) + 2; i <= perm.length; i++) {
                var p = perm.slice();
                p.splice(i, 0, elem);
                if (set.length == 1) {
                    document.write(p + "<BR>");
                    ++unique;
                } else {
                    insertRepeats(p, s);
                }
            }
        }
    }
    // SELECT THE ELEMENT FROM THE SET WHOSE LAST OCCURRENCE IN THE PERMUTATION COMES FIRST
    function selectElement(perm, set) {
        var sel, pos, min = perm.length;
        for (var i = 0; i < set.length; i++) {
            pos = perm.lastIndexOf(set[i]);
            if (pos < min) {
                min = pos;
                sel = i;
            }
        }
        return(sel);
    }
    // FIND ABUNDANT ELEMENT AND ABUNDANCE NUMBER
    function abundance(set) {
        if (set.length == 0) return ({elem: null, num: 0});
        var elem = set[0], max = 1, num = 1;
        for (var i = 1; i < set.length; i++) {
            if (set[i] != set[i - 1]) num = 1
            else if (++num > max) {
                max = num;
                elem = set[i];
            }
        }
        return ({elem: elem, num: 2 * max - set.length});
    }
}
seperatedPermutations(["a","b","r","a","c","a","d","a","b","r","a"]);
The idea is to sort the elements from the most common to the least common, take the most common, decrease its count and put it back in the list keeping the descending order (but avoiding putting the last used element first to prevent repetitions when possible).
This can be implemented using Counter and bisect:
from collections import Counter
from bisect import bisect
def unsorted(lst):
    # use elements (-count, item) so bisect will put biggest counts first
    items = [(-count, item) for item, count in Counter(lst).most_common()]
    result = []
    while items:
        count, item = items.pop(0)
        result.append(item)
        if count != -1:
            element = (count + 1, item)
            index = bisect(items, element)
            # prevent insertion in position 0 if there are other items
            items.insert(index or (1 if items else 0), element)
    return result
Example
>>> print(unsorted([1, 1, 1, 2, 3, 3, 2, 2, 1]))
[1, 2, 1, 2, 1, 3, 1, 2, 3]
>>> print(unsorted([1, 2, 3, 2, 3, 2, 2]))
[2, 3, 2, 1, 2, 3, 2]
Sort the list.
Generate a "best shuffle" of the list using this algorithm
It will give the minimum of items from the list in their original place (by item value) so it will try, for your example, to put the 1's, 2's and 3's away from their sorted positions.
Start with the sorted list of length n.
Let m=n/2.
Take the values at 0, then m, then 1, then m+1, then 2, then m+2, and so on.
Unless you have more than half of the numbers the same, you'll never get equivalent values in consecutive order.
Please forgive my "me too" style answer, but couldn't Coady's answer be simplified to this?
from collections import Counter
from heapq import heapify, heappop, heapreplace
from itertools import repeat
def srgerg(data):
    heap = [(-freq+1, value) for value, freq in Counter(data).items()]
    heapify(heap)
    freq = 0
    while heap:
        freq, val = heapreplace(heap, (freq+1, val)) if freq else heappop(heap)
        yield val
    yield from repeat(val, -freq)
Edit: Here's a python 2 version that returns a list:
def srgergpy2(data):
    heap = [(-freq+1, value) for value, freq in Counter(data).items()]
    heapify(heap)
    freq = 0
    result = list()
    while heap:
        freq, val = heapreplace(heap, (freq+1, val)) if freq else heappop(heap)
        result.append(val)
    result.extend(repeat(val, -freq))
    return result
Count number of times each value appears
Select values in order from most frequent to least frequent
Add selected value to final output, incrementing the index by 2 each time
Reset index to 1 if index out of bounds
from collections import defaultdict
from heapq import heapify, heappop
def distribute(values):
    counts = defaultdict(int)
    for value in values:
        counts[value] += 1
    # negate the counts so the most frequent value is popped first
    counts = [(-count, key) for key, count in counts.items()]
    heapify(counts)
    index = 0
    length = len(values)
    distributed = [None] * length
    while counts:
        count, value = heappop(counts)
        for _ in range(-count):
            distributed[index] = value
            index = index + 2 if index + 2 < length else 1
    return distributed

better algorithm for checking 5 in a row/col in a matrix

Is there a good algorithm for checking whether there are 5 identical elements in a row, a column or a diagonal of a square matrix, say 6x6?
There is of course the naive algorithm of iterating through every spot and then, for each point in the matrix, iterating through its row, column and diagonal. I am wondering if there is a better way of doing it.
You could keep a histogram in a dictionary (mapping element type -> int). And then you iterate over your row or column or diagonal, and increment histogram[element], and either check at the end to see if you have any 5s in the histogram, or if you can allow more than 5 copies, you can just stop once you've reached 5 for any element.
Simple, one-dimensional, example:
m = ['A', 'A', 'A', 'A', 'B', 'A']
h = {}
for x in m:
    if x in h:
        h[x] += 1
    else:
        h[x] = 1
print("Histogram:", h)
for k in h:
    if h[k]>=5:
        print("%s appears %d times." % (k,h[k]))
Output:
Histogram: {'A': 5, 'B': 1}
A appears 5 times.
Essentially, h[x] will store the number of times the element x appears in the array (in your case, this will be the current row, or column or diagonal). The elements don't have to appear consecutively, but the counts would be reset each time you start considering a new row/column/diagonal.
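To apply the same histogram to a whole matrix, something like the following could work. It is only a sketch, written for Python 3: the helper names lines and has_k_copies are mine, and I assume a square matrix and only look at the two main diagonals. As above, the copies do not have to be adjacent.

```python
from collections import Counter

def lines(matrix):
    """Yield every row, every column and both main diagonals of a square matrix."""
    n = len(matrix)
    for row in matrix:
        yield list(row)
    for c in range(n):
        yield [matrix[r][c] for r in range(n)]
    yield [matrix[i][i] for i in range(n)]
    yield [matrix[i][n - 1 - i] for i in range(n)]

def has_k_copies(matrix, k=5):
    """True if some element occurs at least k times within a single line."""
    for line in lines(matrix):
        counts = Counter(line)  # a fresh histogram for each line
        if counts and max(counts.values()) >= k:
            return True
    return False
```

For a 6x6 matrix you would call has_k_copies(matrix, 5); the shorter diagonals of length 5 are deliberately left out here and would need extra cases in lines.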
You can check whether some element appears k times in a row, column or diagonal of an integer matrix in a single pass.
Suppose that n is the size of the matrix and m is the largest absolute value of any element. We have n columns, n rows and 1 diagonal.
Each column, row or diagonal holds at most n distinct elements, each in the range -m..m.
Now we can create a histogram containing (n + n + 1) * (2 * m + 1) counters: one block of
2 * m + 1 counters for each row, each column and the diagonal.
size = (n + n + 1) * (2 * m + 1)
histogram = zeros(size, Int)
Now the tricky part is how to update this histogram.
Consider this function in pseudo-code:
updateHistogram(i, j, element)
    if (element < 0)
        element = m - element    // negative values map to m+1 .. 2*m
    stride = 2 * m + 1           // counters per row/column/diagonal
    rowIndex = i * stride + element
    columnIndex = (n + j) * stride + element
    diagonalIndex = 2 * n * stride + element
    histogram[rowIndex] = histogram[rowIndex] + 1
    histogram[columnIndex] = histogram[columnIndex] + 1
    if (i == j)
        histogram[diagonalIndex] = histogram[diagonalIndex] + 1
Now all you have to do is iterate through the histogram and check whether any counter is >= k.
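As a rough illustration, here is how the flat histogram might look in Python 3. The function name has_k_in_line is made up, and I use a block of 2 * m + 1 counters per line so that negative values get their own slots. Like the pseudo-code, it only covers the main diagonal and counts copies whether or not they are adjacent.

```python
def has_k_in_line(matrix, k):
    """True if some value occurs at least k times in a row, a column or the main diagonal."""
    n = len(matrix)
    m = max(abs(x) for row in matrix for x in row)  # largest absolute value
    stride = 2 * m + 1                              # counters per line
    histogram = [0] * ((2 * n + 1) * stride)

    for i in range(n):
        for j in range(n):
            element = matrix[i][j]
            # non-negative values use slots 0..m, negative ones m+1..2*m
            slot = m - element if element < 0 else element
            histogram[i * stride + slot] += 1        # block for row i
            histogram[(n + j) * stride + slot] += 1  # block for column j
            if i == j:
                histogram[2 * n * stride + slot] += 1  # block for the diagonal
    return any(count >= k for count in histogram)
```

The memory use grows with m, so this only makes sense when the values are small integers.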
Your best approach may depend on whether you control the placement of elements.
For example, if you were building a game and just placed the most recent element on the grid, you could capture into four strings the vertical, horizontal, and diagonal strips that intersected that point, and use the same algorithm on each strip, tallying each element and evaluating the totals. The algorithm may be slightly different depending on whether you're counting five contiguous elements out of the six, or allow gaps as long as the total is five.
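For the game case, a sketch along these lines might do (Python 3). Instead of copying the four strips into strings, it walks outward from the most recent move in both directions; that is my variation, and the names DIRECTIONS, count_run and move_wins are hypothetical. It counts contiguous elements only.

```python
# horizontal, vertical and the two diagonal directions
DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]

def count_run(board, row, col, d_row, d_col):
    """Length of the contiguous run through (row, col) along one direction."""
    n = len(board)
    value = board[row][col]
    total = 1
    for sign in (1, -1):  # walk forwards, then backwards
        r, c = row + sign * d_row, col + sign * d_col
        while 0 <= r < n and 0 <= c < n and board[r][c] == value:
            total += 1
            r += sign * d_row
            c += sign * d_col
    return total

def move_wins(board, row, col, needed=5):
    """True if the element just placed at (row, col) completes a run of `needed`."""
    return any(count_run(board, row, col, dr, dc) >= needed
               for dr, dc in DIRECTIONS)
```

This touches at most a few cells per direction for each move, so the whole board never has to be rescanned.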
For rows you can keep a counter, which indicates how many of the same elements in a row you currently have. To do this, iterate through the row and
if current element matches the previous element, increase the counter by one. If counter is 5, then you have found the 5 elements you wanted.
if current element doesn't match previous element, set the counter to 1.
The same principle can be applied to columns and diagonals as well. You probably want to use an array of counters for columns (one counter per column) and for diagonals, so that you can iterate through the matrix only once.
Here is a small example for a smaller case (3 in a row in a 4x4 matrix), but you can easily change it:
n = 3
matrix = [[1, 2, 3, 4],
          [1, 2, 3, 1],
          [2, 3, 1, 3],
          [2, 1, 4, 2]]
col_counter = [1, 1, 1, 1]
for row in range(0, len(matrix)):
    row_counter = 1
    for col in range(0, len(matrix[row])):
        current_element = matrix[row][col]
        # check elements in a same row
        if col > 0:
            previous_element = matrix[row][col - 1]
            if current_element == previous_element:
                row_counter = row_counter + 1
                if row_counter == n:
                    print n, 'in a row at:', row, col - n + 1
            else:
                row_counter = 1
        # check elements in a same column
        if row > 0:
            previous_element = matrix[row - 1][col]
            if current_element == previous_element:
                col_counter[col] = col_counter[col] + 1
                if col_counter[col] == n:
                    print n, 'in a column at:', row - n + 1, col
            else:
                col_counter[col] = 1
I left out diagonals to keep the example short and simple, but for diagonals you can use the same principle as you use on columns. The previous element would be one of the following (depending on the direction of diagonal):
matrix[row - 1][col - 1]
matrix[row - 1][col + 1]
Note that the second case takes a little extra effort: for example, you can traverse the row in the inner loop from right to left.
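A possible single-pass version that also covers both diagonal directions, written for Python 3. The names find_runs, diag_run and anti_run are mine. Rather than traversing right to left, this sketch keeps the previous row's diagonal counters in separate lists, which I assume is an acceptable trade for a little extra memory.

```python
def find_runs(matrix, n):
    """Return (kind, row, col) for each cell where a run reaches exactly length n."""
    rows, cols = len(matrix), len(matrix[0])
    found = []
    col_run = [0] * cols
    diag_run = [0] * cols  # previous row, runs via matrix[r - 1][c - 1]
    anti_run = [0] * cols  # previous row, runs via matrix[r - 1][c + 1]
    for r in range(rows):
        row_run = 0
        new_diag = [1] * cols
        new_anti = [1] * cols
        for c in range(cols):
            cur = matrix[r][c]
            row_run = row_run + 1 if c > 0 and matrix[r][c - 1] == cur else 1
            col_run[c] = col_run[c] + 1 if r > 0 and matrix[r - 1][c] == cur else 1
            if r > 0 and c > 0 and matrix[r - 1][c - 1] == cur:
                new_diag[c] = diag_run[c - 1] + 1
            if r > 0 and c + 1 < cols and matrix[r - 1][c + 1] == cur:
                new_anti[c] = anti_run[c + 1] + 1
            runs = (('row', row_run), ('column', col_run[c]),
                    ('diagonal', new_diag[c]), ('anti-diagonal', new_anti[c]))
            for kind, run in runs:
                if run == n:
                    found.append((kind, r, c))
        diag_run, anti_run = new_diag, new_anti
    return found
```

The reported position is the cell where the run ends, not where it starts. On the 4x4 matrix from the example with n = 3, the only run is the anti-diagonal of 1s ending at row 3, column 1.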
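I don't think you can avoid iteration. You can compare two integer elements with XOR (a ^ b is 0 exactly when a == b), but XOR-ing a whole run together is not a valid equality test: 1 ^ 2 ^ 3 is 0 even though the elements differ, and five equal values XOR to the value itself rather than to 0.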
You can try to improve your method with some heuristics: use the knowledge of the matrix size to exclude element sequences that cannot fit, and stop unnecessary calculation early. If the given vector size is 6, you want to find 5 equal elements, and the first 3 elements are different, further calculation makes no sense.
This approach can give you a significant advantage if 5 equal elements in a row happen rarely enough.
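One way to turn that heuristic into code (Python 3 sketch, has_run is a made-up name): in a vector of length 6, any run of 5 contiguous cells has to cover positions 1 to 4, so if those four are not all equal there is nothing left to check. The general form of that observation is my own extension of the idea.

```python
def has_run(line, k):
    """True if `line` contains k equal contiguous elements."""
    size = len(line)
    if k < 1 or k > size:
        return False
    # Every window of k cells covers positions size-k .. k-1.
    # The slice is empty when 2*k <= size, so the shortcut is skipped then.
    core = line[size - k:k]
    if any(x != core[0] for x in core):
        return False  # early exit: no window of k equal cells can exist
    run = 1
    for prev, cur in zip(line, line[1:]):
        run = run + 1 if cur == prev else 1
        if run >= k:
            return True
    return k == 1
```

Whether the shortcut pays off depends on how often the early exit fires; for a 6x6 board the full scan is already cheap.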
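If you encode each row/column/diagonal as a bitmap (one bit per cell, set where the cell holds the element you are testing for), "exactly five in a row and nothing else" means "mask % 31 == 0 && mask / 31 == power_of_two":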
00011111 := 0x1f 31 (five in a row)
00111110 := 0x3e 62 (five in a row)
00111111 := 0x3f 63 (six in a row)
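If you want to treat the six-in-a-row case as five-in-a-row too, the easiest way is probably to strip the trailing zero bits and then test the low five bits:
if (mask == 0) return 0;  /* otherwise the loop below never ends */
for ( ; !(mask & 1) ; mask >>= 1 ) {;}
return ((mask & 0x1f) == 0x1f) ? 1 : 0;  /* parentheses needed: == binds tighter than & */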
Maybe the Stanford bit-tweaking department has a better solution or suggestion that does not need looping?
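One loop-free possibility, which is not from the answer above but is a common shift-and-AND trick: AND the mask with itself shifted right by 1, 2, 3 and 4. A bit survives only if it and the four bits above it are all set. A Python 3 sketch, assuming non-negative masks; the names has_five_in_a_row and line_to_mask are mine.

```python
def has_five_in_a_row(mask):
    """True if `mask` has at least five consecutive set bits anywhere."""
    return (mask & (mask >> 1) & (mask >> 2) & (mask >> 3) & (mask >> 4)) != 0

def line_to_mask(line, player):
    """Encode a row/column/diagonal as a bitmap, first cell in the highest bit."""
    mask = 0
    for cell in line:
        mask = (mask << 1) | int(cell == player)
    return mask
```

This also accepts six in a row, and it still works when other stray bits are set elsewhere in the mask.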
