Seeking the algorithm to generate this table of numbers [duplicate] - python

I need to know how to calculate the positions of the QR Code alignment patterns as defined in the table of ISO/IEC 18004:2000 Annex E.
I don't understand how it's calculated. Take Version 16, for example: the positions are {6,26,50,74}, so the distances between the points are {20,24,24}. Why isn't it {6,28,52,74}, which gives more evenly distributed distances of {22,24,22}?
I would like to know how this can be generated procedurally.

While the specification does provide a table of the alignment, this is a reasonable question (and one I found myself with :-)) - the possibility of generating the positions procedurally has its merits (less typo-prone code, smaller code footprint, knowing pattern/properties of the positions).
I'm happy to report that, yes, a procedure exists (and it is even fairly simple).
The specification itself says most of it:
[The alignment patterns] are spaced as evenly as possible between the Timing Pattern and the opposite side of the symbol, any uneven spacing being accommodated between the timing pattern and the first alignment pattern in the symbol interior.
That is, only the interval between the first and second coordinate may differ from the rest of the intervals. The rest must be equal.
Another important bit is of course that, for the APs to agree with the timing patterns, the intervals must be even.
The remaining tricky bit is just getting the rounding right.
Anyway - here's code printing the alignment position table:
def size_for_version(version):
    return 17 + 4 * version
def alignment_coord_list(version):
    if version == 1:
        return []
    divs = 2 + version // 7
    size = size_for_version(version)
    total_dist = size - 7 - 6
    divisor = 2 * (divs - 1)
    # Step must be even, for alignment patterns to agree with timing patterns
    step = (total_dist + divisor // 2 + 1) // divisor * 2  # Get the rounding right
    coords = [6]
    for i in range(divs - 2, -1, -1):  # divs-2 down to 0, inclusive
        coords.append(size - 7 - i * step)
    return coords
for version in range(1, 40 + 1):  # 1 to 40 inclusive
    print("V%d: %s" % (version, alignment_coord_list(version)))

Here's a Python solution which is basically equivalent to the C# solution posted by @jgosar, except that it corrects a deviation from the thonky.com table for version 32 (that other solution reports 110 for the second-to-last position, whereas the linked table says 112):
def get_alignment_positions(version):
    positions = []
    if version > 1:
        n_patterns = version // 7 + 2
        first_pos = 6
        positions.append(first_pos)
        matrix_width = 17 + 4 * version
        last_pos = matrix_width - 1 - first_pos
        second_last_pos = (
            (first_pos + last_pos * (n_patterns - 2)  # Interpolate end points to get point
             + (n_patterns - 1) // 2)                 # Round to nearest int by adding half
                                                      # of divisor before division
            // (n_patterns - 1)                       # Floor-divide by number of intervals
                                                      # to complete interpolation
        ) & -2                                        # Round down to even integer
        pos_step = last_pos - second_last_pos
        second_pos = last_pos - (n_patterns - 2) * pos_step
        positions.extend(range(second_pos, last_pos + 1, pos_step))
    return positions
The correction consists of first rounding the second last position (up or down) to the nearest integer and then rounding down to the nearest even integer (instead of directly rounding down to the nearest even integer).
Disclaimer: Like @jgosar, I don't know whether the thonky.com table is correct (I'm not going to buy the spec to find out). I've simply verified (by pasting the table into a suitable wrapper around the above function) that my solution matches that table in its current version.

Sorry about my English.
I hope this helps, even though the reply is late.
First, the standard leaves out an important detail: the top-left module is defined as (0,0).
The list { 6, 26, 50, 74 } gives both the row coordinates and the column coordinates of the alignment points. I don't know why the standard presents it this way; maybe to save space. To get the actual points, combine the values with each other. For example, from:
{ 6, 26, 50, 74 }
we get:
{ 6 , 6 } ---> (the x coordinate is 6 and the y coordinate is 6, counted from the top left)
{ 6 , 26 }
{ 6 , 50 }
{ 6 , 74 }
{ 26, 26 }
{ 26, 50 }
{ 26, 74 }
{ 50, 50 }
{ 50, 74 }
{ 74, 74 }
These points, together with their mirror images (for example { 26, 6 } for { 6, 26 }), are the actual coordinates of the alignment pattern centers.
P.S.: if a position is already occupied by a position detection pattern, no alignment pattern is output there, as with position (6, 6).
I had the same question before and have since solved it, so I hope you can solve it too.
Good luck!
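The combination step described above can be sketched in Python. This is only an illustration: the function name `alignment_centers` is hypothetical, and it assumes that the three excluded positions are exactly the corners that coincide with the position detection patterns (top-left, top-right, bottom-left).

```python
from itertools import product

def alignment_centers(coords):
    """Return (row, col) centers for a list of alignment coordinates."""
    if not coords:
        return []
    first, last = coords[0], coords[-1]
    # Corners that coincide with the three position detection patterns.
    finder_corners = {(first, first), (first, last), (last, first)}
    # Every pairing of a row coordinate with a column coordinate,
    # minus the three occupied corners.
    return [rc for rc in product(coords, repeat=2) if rc not in finder_corners]

centers = alignment_centers([6, 26, 50, 74])
print(len(centers))   # 13
print(centers[:3])    # [(6, 26), (6, 50), (26, 6)]
```

For Version 16 this gives 4 * 4 - 3 = 13 centers.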

There are some comments on the top-rated answer that suggest it isn't 100% accurate, so I'm contributing my solution as well.
My solution is written in C#. It should be easy to translate it to a language of your choice.
private static int[] getAlignmentCoords(int version)
{
    if (version <= 1)
    {
        return new int[0];
    }
    int num = (version / 7) + 2; // number of coordinates to return
    int[] result = new int[num];
    result[0] = 6;
    if (num == 1)
    {
        return result;
    }
    result[num - 1] = 4 * version + 10;
    if (num == 2)
    {
        return result;
    }
    result[num - 2] = 2 * ((result[0] + result[num - 1] * (num - 2)) / ((num - 1) * 2)); // leave these brackets alone, because of integer division they ensure you get a number that's divisible by 2
    if (num == 3)
    {
        return result;
    }
    int step = result[num - 1] - result[num - 2];
    for (int i = num - 3; i > 0; i--)
    {
        result[i] = result[i + 1] - step;
    }
    return result;
}
The values I get with it are the same as shown here: http://www.thonky.com/qr-code-tutorial/alignment-pattern-locations/
To sum it up, the first coordinate is always 6.
The last coordinate is always 7 less than the image size. The image size is calculated as 4*version+17, therefore the last coordinate is 4*version+10.
If the coordinates were precisely evenly spaced, the position of the second-to-last coordinate would be (first_coordinate+(num-2) * last_coordinate)/(num-1), where num is the number of all coordinates.
But the coordinates are not evenly spaced, so this position has to be reduced to an even number.
Each of the remaining coordinates is spaced the same distance from the next one as the last two are from each other.
Disclaimer: I didn't read any of the documentation, I just wrote some code that generates a sequence of numbers that's the same as in the table I linked to.

Starting with @ericsoe's answer, and noting it's incorrect for v36 and v39 (thanks to @Ana's remarks), I've developed a function that returns the correct sequences. Pardon the JavaScript (fairly easy to translate to other languages, though):
function getAlignmentCoordinates(version) {
  if (version === 1) {
    return [];
  }
  const intervals = Math.floor(version / 7) + 1;
  const distance = 4 * version + 4; // between first and last alignment pattern
  const step = Math.ceil(distance / intervals / 2) * 2; // To get the next even number
  return [6].concat(Array.from(
    { length: intervals },
    (_, index) => distance + 6 - (intervals - 1 - index) * step)
  );
}

I don't know if this is a useful question to ask. It just is the way it is, and it doesn't really matter much if it were {22,24,22}. Why are you asking?
My guess is that the spacing should be a multiple of 4 modules.

It seems like most answers aren't correct for all versions (especially v32, v36 and v39) and/or are quite convoluted.
Based on @MaxArt's great solution (which produces wrong coordinates for v32), here's a C function which calculates the correct coordinates for all versions:
#include <math.h>
int getAlignmentCoordinates(int version, int *coordinates) {
    if (version <= 1) return 0;
    int intervals = (version / 7) + 1; // Number of gaps between alignment patterns
    int distance = 4 * version + 4; // Distance between first and last alignment pattern
    int step = lround((double)distance / (double)intervals); // Round equal spacing to nearest integer
    step += step & 1; // Round step to next even number
    coordinates[0] = 6; // First coordinate is always 6 (can't be calculated with step)
    for (int i = 1; i <= intervals; i++) {
        coordinates[i] = 6 + distance - step * (intervals - i); // Start right/bottom and go left/up by step*k
    }
    return intervals + 1;
}
The key is to first round the division to the nearest integer (instead of up) and then round it to the next largest even number.
The C program below uses this function to generate the same values as in the table of ISO/IEC 18004:2000 Annex E linked by OP and the (updated) list found on thonky.com:
#include <stdio.h>
int main(void) {
    for (int version = 2; version <= 40; version++) {
        int coordinates[7];
        int n = getAlignmentCoordinates(version, coordinates);
        printf("%d:", version);
        for (int i = 0; i < n; i++) {
            printf(" %d", coordinates[i]);
        }
        printf("\n");
    }
    return 0;
}
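Since the question is tagged Python, here is a rough Python port of the C function above. It is a sketch, not a verified reference implementation: the name `alignment_coordinates` is made up for this example, and `lround` is replaced with integer arithmetic that rounds halves up (an assumption that should not matter here, because the division never lands exactly on a half for versions 2 to 40).

```python
def alignment_coordinates(version):
    """Alignment pattern coordinates for a QR code version (1..40)."""
    if version <= 1:
        return []
    intervals = version // 7 + 1   # number of gaps between alignment patterns
    distance = 4 * version + 4     # distance between first and last coordinate
    # Round distance / intervals to the nearest integer using integers only.
    step = (2 * distance + intervals) // (2 * intervals)
    step += step & 1               # bump odd steps up to the next even number
    # The first coordinate is always 6; the rest are counted back from the end.
    return [6] + [6 + distance - step * (intervals - i)
                  for i in range(1, intervals + 1)]

for version in (16, 32, 36, 39):
    print(version, alignment_coordinates(version))
# 16 [6, 26, 50, 74]
# 32 [6, 34, 60, 86, 112, 138]
# 36 [6, 24, 50, 76, 102, 128, 154]
# 39 [6, 26, 54, 82, 110, 138, 166]
```

The versions printed are the ones discussed in this thread: 16 from the question, and 32, 36 and 39 where other answers disagree with the table.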

Related

Random Distribution Calculation

If I specify a number, is there a way to assign a random portion of that number as a total to several groups?
e.g Total 1.
Group 1 - 0.1
Group 2 - 0.3
Group 3 - 0.4
Group 4 - 0.2
It's very simple to do in Java ...
You generate a random number from 1 to 100 by using a function like this
// min = 1 max=100 in your case
public int getRandomNumber(int min, int max) {
    // "+ 1" makes max itself a possible result
    return (int) (Math.random() * (max - min + 1)) + min;
}
Then in your function which selects the element you do like this ...
Group 1 - 0.1 Group 2 - 0.3 Group 3 - 0.4 Group 4 - 0.2
If the number is 1 to 10 select group 1.
If the number is 11 to 40 select group 2.
If the number is 41 to 80 select group 3
If the number is 81 to 100 select group 4
It's as easy as calculating percentages.
Does this solve your problem ? Let me know in the comments.
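The range lookup above can be sketched in Python as follows. This is a minimal illustration of that answer's idea (picking one group with the given probabilities), not of splitting a total into portions; the names `WEIGHTS`, `UPPER_BOUNDS` and `group_for` are invented for the example.

```python
import random
from bisect import bisect_left
from itertools import accumulate

WEIGHTS = [10, 30, 40, 20]                 # percentages for groups 1..4
UPPER_BOUNDS = list(accumulate(WEIGHTS))   # [10, 40, 80, 100]

def group_for(number):
    """Map a number in 1..100 to its 1-based group."""
    if not 1 <= number <= UPPER_BOUNDS[-1]:
        raise ValueError("number must be between 1 and 100")
    # The first upper bound that is >= number identifies the group.
    return bisect_left(UPPER_BOUNDS, number) + 1

print(group_for(10), group_for(11), group_for(80), group_for(81))  # 1 2 3 4
print(group_for(random.randint(1, 100)))   # a randomly selected group
```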
Well, if you don't care about getting a roughly even distribution, you can just do this:
void foo(){
    double total = 1.0;
    double[] group = new double[4];
    for(int i = 0; i < group.length-1; i++){
        //get random 0.0-1.0
        double rand = getRandom(0.0, 1.0);
        double portion = rand * total;
        group[i] = portion;
        total -= portion;
    }
    group[group.length-1] = total;
}
If you do care, you can set getRandom to your liking, f.e.
//get random 0.0-1.0
double rand = getRandom(1.0/group.length*0.7, 1.0/group.length * 1.3);
so it will be 70% to 130% of the average.
This method can distribute a value to an array or "groups" of that value type randomly,
you can switch out double with int or float as you see fit.
public void assignGroups(double[] groups, double number){
    Random rand = new Random();
    for(int i=0; i<groups.length-1; i++){
        double randomNum = rand.nextDouble()*number; // randomly picks number between 0 and the current amount left
        groups[i] = randomNum;
        number-=randomNum; // subtracts the random number from the total
    }
    groups[groups.length-1] = number; // sets the last value of the groups to the number left
}
In Python:
from random import random
num_groups = 5 # Number of groups
total = 5 # The given number
base = [0.] + sorted(random() for _ in range(num_groups - 1)) + [1.]
portions = [(right - left) * total for left, right in zip(base[:-1], base[1:])]
Result (print(portions)): A list of length num_groups (number of groups) which contains the distributed total (given number):
[2.5749833618941995, 0.010389749273946869, 0.3718137712569358, 0.3725336641218424, 1.6702794534530752]
Using Java:
private static double roundValue(double value, int precision) {
    return (double) Math.round(value * Math.pow(10, precision))/
            Math.pow(10, precision);
}
public static double[] generateGroups(double total, int groupsNumber, int precision){
    double[] result = new double[groupsNumber];
    double sum = 0;
    for (int i = 0; i < groupsNumber - 1; i++) {
        result[i] = roundValue((total - sum) * Math.random(), precision);
        sum += result[i];
    }
    result[groupsNumber-1] = roundValue((total - sum), precision);
    return result;
}
public static void main(String... args) {
    double[] result = generateGroups(1.0, 4, 1);
    System.out.println(Arrays.toString(result)); // needs import java.util.Arrays
}

Cartesian product in Gray code order : including affected set in this order?

Having an excellent solution to: Cartesian product in Gray code order with itertools?, is there a way to add something simple to this solution to also report the set (its index) that underwent the change in going from one element to the next of the Cartesian product in Gray code order? That is, a gray_code_product_with_change(['a','b','c'], [0,1], ['x','y']) which would produce something like:
(('a',0,'x'), -1)
(('a',0,'y'), 2)
(('a',1,'y'), 1)
(('a',1,'x'), 2)
(('b',1,'x'), 0)
(('b',1,'y'), 2)
(('b',0,'y'), 1)
(('b',0,'x'), 2)
(('c',0,'x'), 0)
(('c',0,'y'), 2)
(('c',1,'y'), 1)
(('c',1,'x'), 2)
I want to avoid taking the "difference" between consecutive tuples, but to have constant-time updates --- hence the Gray code order thing to begin with. One solution could be to write an index_changed iterator, i.e., index_changed(3,2,2) would return the sequence -1,2,1,2,0,2,1,2,0,2,1,2 that I want, but can something even simpler be added to the solution above to achieve the same result?
There are several things wrong with this question, but I'll keep it like this, rather than only making it worse by turning it into a "chameleon question"
Indeed, why even ask for the elements of the Cartesian product in Gray code order, when you have this "index changed" sequence? So I suppose what I was really looking for was efficient computation of this sequence. So I ended up implementing the above-mentioned gray_code_product_with_change, which takes a base set of sets, e.g., ['a','b','c'], [0,1], ['x','y'], computing this "index changed" sequence, and updating this base set of sets as it moves through the sequence. Since the implementation ended up being more interesting than I thought, I figured I would share, should someone find it useful:
(Disclaimer: probably not the most pythonic code, rather almost C-like)
def gray_code_product_with_change(*args, repeat=1):
    sets = args * repeat
    s = [len(x) - 1 for x in sets]
    n = len(s)
    # setup parity array and first combination
    p = n * [True]  # True: move forward (False: move backward)
    c = n * [0]  # initial combo: all 0's (first element of each set)
    # emit the first combination
    yield tuple(sets[i][x] for i, x in enumerate(c))
    # incrementally update combination in Gray code order
    has_next = True
    while has_next:
        # look for the smallest index to increment/decrement
        has_next = False
        for j in range(n-1, -1, -1):
            if p[j]:  # currently moving forward..
                if c[j] < s[j]:
                    c[j] += 1
                    has_next = True
                    # emit affected set (forward direction)
                    yield j
            else:  # ..moving backward
                if c[j] > 0:
                    c[j] -= 1
                    has_next = True
                    # emit affected set (reverse direction)
                    yield -j
            # we did manage to increment/decrement at position j..
            if has_next:
                # emit the combination
                yield tuple(sets[i][x] for i, x in enumerate(c))
                for q in range(n-1, j, -1):  # cascade
                    p[q] = not p[q]
                break
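For comparison, a compact variant that yields (tuple, changed_index) pairs in the shape shown in the question could look like the sketch below. It is an illustration under assumptions, not the code above: the name `gray_product_pairs` is hypothetical, directions are stored as +1/-1, and the changed index is always reported unsigned, with -1 for the first element.

```python
def gray_product_pairs(*sets):
    """Yield (combination, index_of_changed_set) in Gray code order."""
    n = len(sets)
    idx = [0] * n          # current position inside each set
    direction = [1] * n    # +1: moving forward, -1: moving backward
    yield tuple(s[0] for s in sets), -1
    while True:
        # Try the rightmost set first, like an odometer.
        for j in range(n - 1, -1, -1):
            nxt = idx[j] + direction[j]
            if 0 <= nxt < len(sets[j]):
                idx[j] = nxt
                # Reverse the direction of every set to the right of j.
                for q in range(j + 1, n):
                    direction[q] = -direction[q]
                yield tuple(s[i] for s, i in zip(sets, idx)), j
                break
        else:
            return  # no set could move: the product is exhausted

for combo, changed in gray_product_pairs(['a', 'b', 'c'], [0, 1], ['x', 'y']):
    print(combo, changed)
```

Each step does work proportional to the number of sets, and consecutive tuples differ in exactly one position.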
Trying to tease out as much performance as I could in just computing this sequence --- since the number of elements in the Cartesian product of a set of sets grows exponentially with the number of sets (of size 2 or more) --- I implemented this in C. What it essentially does, is implement the above-mentioned index_changed (using a slightly different notation):
(Disclaimer: there is much room for optimization here)
#include <stdio.h>
void gray_code_sequence(int s[], int n) {
    // set up parity array
    int p[n];
    for(int i = 0; i < n; ++i) {
        p[i] = 1; // 1: move forward, (-1: move backward)
    }
    // initialize / emit first combination
    int c[n];
    printf("(");
    for(int i = 0; i < n-1; ++i) {
        c[i] = 0; // initial combo: all 0s (first element of each set)
        printf("%d, ", c[i]); // emit the first combination
    }
    c[n-1] = 0;
    printf("%d)\n", c[n-1]);
    int has_next = 1;
    while(has_next) {
        // look for the smallest index to increment/decrement
        has_next = 0;
        for(int j = n-1; j >= 0; --j) {
            if(p[j] > 0) { // currently moving forward..
                if(c[j] < s[j]) {
                    c[j] += 1;
                    has_next = 1;
                    printf("%d\n", j);
                }
            }
            else { // ..moving backward
                if(c[j] > 0) {
                    c[j] -= 1;
                    has_next = 1;
                    printf("%d\n", -j);
                }
            }
            if(has_next) {
                for(int q = n-1; q > j; --q) {
                    p[q] = -1 * p[q]; // cascade
                }
                break;
            }
        }
    }
}
When compared to the above python (where the yielding of the elements of the Cartesian product is suppressed, and only the elements of the sequence are yielded, so that the output is essentially the same, for a fair comparison), this C implementation seems to be about 15 times as fast, asymptotically.
Again, this C code could be highly optimized (the irony that the Python code is so C-like being well noted). For example, the parity array could be stored in a single int, using bit shift >> operations, etc., so I bet that even a 30 or 40x speedup could be achieved.

I'm trying to understand how to print all the possible combinations of an array

i = start
while i <= end and end - i + 1 >= r - index:
    data[index] = arr[i]
    combinationUtil(arr, data, i + 1,
                    end, index + 1, r)
    i += 1
I'm having a hard time understanding why the condition "end - i + 1 >= r - index" is needed. I've tried running the code with and without it, and it produced the same output. I want to know what edge case causes this condition to evaluate to False.
The full code is available here.
Try to group the variables into pieces that are easier to understand e.g.
int values_left_to_print = r - index; // (size of combination to be printed) - (current index into data)
int values_left_in_array = end - i + 1; // number of values left until the end of given arr
Now we can interpret it like this:
for (int i = start; i <= end && (values_left_in_array >= values_left_to_print); i++)
{
so if i is near the end of the given array and there are not enough values left to print a full combination, then the loop (and function) will stop. Let's look at an example:
Given
arr = {1,2,3,4}
n = 4; // size of arr
r = 3; // size of combination
The top-level function will start to form combinations with 1 and then with 2, resulting in (1,2,3), (1,2,4), (1,3,4) and (2,3,4).
It will not try 3 and 4 as the first value, because (values_left_in_array < values_left_to_print).
If the condition were not there, the function would try 3 and 4, but the values in a combination only ever increase in index from left to right in the given array, so those attempts end without printing anything: i reaches end before 3 values have been found.
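To see that the condition only prunes dead-end work, here is a small self-contained Python sketch. It is not the code from the linked page: `combinations_util` and its `prune` flag are made up for this example, and it collects results in a list instead of printing them.

```python
def combinations_util(arr, r, prune=True):
    """Collect all r-element combinations of arr, optionally pruning dead ends."""
    result = []
    data = [None] * r
    end = len(arr) - 1

    def recurse(start, index):
        if index == r:                 # a full combination has been built
            result.append(tuple(data))
            return
        i = start
        # With prune=True this is the loop condition from the question.
        while i <= end and (not prune or end - i + 1 >= r - index):
            data[index] = arr[i]
            recurse(i + 1, index + 1)
            i += 1

    recurse(0, 0)
    return result

print(combinations_util([1, 2, 3, 4], 3))
# [(1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4)]
print(combinations_util([1, 2, 3, 4], 3) == combinations_util([1, 2, 3, 4], 3, prune=False))
# True
```

Both calls return the same combinations; without the check the recursion just explores branches such as "start with 3" or "start with 4" that run out of elements before reaching r values.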

Is it possible to determine if two lists are identical (rotatable) without going through every rotation? [duplicate]

For instance, I have lists:
a[0] = [1, 1, 1, 0, 0]
a[1] = [1, 1, 0, 0, 1]
a[2] = [0, 1, 1, 1, 0]
# and so on
They seem to be different, but if it is supposed that the start and the end are connected, then they are circularly identical.
The problem is that each of my lists has a length of 55 and contains only three ones and 52 zeros. Without the circular condition, there are 26,235 (55 choose 3) lists. However, with the circular condition, a huge number of these lists are circularly identical.
Currently I check for circular identity as follows:
import numpy
def is_dup(a, b):
    for i in range(len(a)):
        if a == list(numpy.roll(b, i)):  # shift b circularly by i
            return True
    return False
This function requires 55 cyclic shift operations in the worst case, and there are 26,235 lists to be compared with each other. In short, I need 55 * 26,235 * (26,235 - 1) / 2 = 18,926,847,225 computations. That's nearly 20 billion!
Is there a good way to do it with fewer computations? Or a data type that supports circular comparison?
First off, this can be done in O(n) in terms of the length of the list.
Notice that if you concatenate a list with itself ([1, 2, 3] becomes [1, 2, 3, 1, 2, 3]), the new list contains every possible cyclic rotation of the original.
So all you need to do is check whether the list you are searching for is inside the doubled starting list. In Python you can achieve this in the following way (assuming that the lengths are the same).
list1 = [1, 1, 1, 0, 0]
list2 = [1, 1, 0, 0, 1]
print ' '.join(map(str, list2)) in ' '.join(map(str, list1 * 2))
Some explanation about my oneliner:
list * 2 will combine a list with itself, map(str, [1, 2]) convert all numbers to string and ' '.join() will convert array ['1', '2', '111'] into a string '1 2 111'.
As pointed out by some people in the comments, the one-liner can potentially give some false positives, so to cover all the possible edge cases:
def isCircular(arr1, arr2):
    if len(arr1) != len(arr2):
        return False
    str1 = ' '.join(map(str, arr1))
    str2 = ' '.join(map(str, arr2))
    if len(str1) != len(str2):
        return False
    return str1 in str2 + ' ' + str2
P.S.1 Regarding time complexity, note that O(n) is achieved only if the substring search runs in O(n) time. That is not always the case and depends on the implementation in your language (although it can be done in linear time, with KMP for example).
P.S.2 For people who are wary of string operations and therefore think this answer is not good: what matters is complexity and speed. This algorithm potentially runs in O(n) time and O(n) space, which makes it much better than anything in the O(n^2) domain. To see this for yourself, you can run a small benchmark (it creates a random list, pops the first element and appends it to the end, thus creating a cyclic list; you are free to do your own manipulations).
from random import random
bigList = [int(1000 * random()) for i in xrange(10**6)]
bigList2 = bigList[:]
bigList2.append(bigList2.pop(0))
# then test how much time will it take to come up with an answer
from datetime import datetime
startTime = datetime.now()
print isCircular(bigList, bigList2)
print datetime.now() - startTime # please feel free to use timeit, but it will give similar results
0.3 seconds on my machine. Not really long. Now try to compare this with O(n^2) solutions. While it is comparing it, you can travel from US to Australia (most probably by a cruise ship)
Not knowledgeable enough in Python to answer this in your requested language, but in C/C++, given the parameters of your question, I'd convert the zeros and ones to bits and push them onto the least significant bits of an uint64_t. This will allow you to compare all 55 bits in one fell swoop - 1 clock.
Wickedly fast, and the whole thing will fit in on-chip caches (209,880 bytes). Hardware support for shifting all 55 list members right simultaneously is available only in a CPU's registers. The same goes for comparing all 55 members simultaneously. This allows for a 1-for-1 mapping of the problem to a software solution. (and using the SIMD/SSE 256 bit registers, up to 256 members if needed) As a result the code is immediately obvious to the reader.
You might be able to implement this in Python, I just don't know it well enough to know if that's possible or what the performance might be.
After sleeping on it a few things became obvious, and all for the better.
1.) It's so easy to spin the circularly linked list using bits that Dali's very clever trick isn't necessary. Inside a 64-bit register, standard bit shifting accomplishes the rotation very simply. In an attempt to make this all more Python-friendly, I use arithmetic instead of bit ops.
2.) Bit shifting can be accomplished easily using divide by 2.
3.) Checking the end of the list for 0 or 1 can be easily done by modulo 2.
4.) "Moving" a 0 to the head of the list from the tail can be done by dividing by 2. This is because if the zero were actually moved it would make the 55th bit false, which it already is by doing absolutely nothing.
5.) "Moving" a 1 to the head of the list from the tail can be done by dividing by 2 and adding 18,014,398,509,481,984 - which is the value created by marking the 55th bit true and all the rest false.
6.) If a comparison of the anchor and composed uint64_t is TRUE after any given rotation, break and return TRUE.
I would convert the entire array of lists into an array of uint64_ts right up front to avoid having to do the conversion repeatedly.
After spending a few hours trying to optimize the code, studying the assembly language I was able to shave 20% off the runtime. I should add that the O/S and MSVC compiler got updated mid-day yesterday as well. For whatever reason/s, the quality of the code the C compiler produced improved dramatically after the update (11/15/2014). Run-time is now ~ 70 clocks, 17 nanoseconds to compose and compare an anchor ring with all 55 turns of a test ring and NxN of all rings against all others is done in 12.5 seconds.
This code is so tight all but 4 registers are sitting around doing nothing 99% of the time. The assembly language matches the C code almost line for line. Very easy to read and understand. A great assembly project if someone were teaching themselves that.
Hardware is Haswell i7, MSVC 64-bit, full optimizations.
#include "stdafx.h"
#include <stdint.h>
#include <string>
#include <memory>
#include <stdio.h>
#include <time.h>
const uint8_t LIST_LENGTH = 55; // uint8_t supports full width of SIMD and AVX2
// max left shifts is 32, so must use right shifts to create head_bit
const uint64_t head_bit = (0x8000000000000000 >> (64 - LIST_LENGTH));
const uint64_t CPU_FREQ = 3840000000; // turbo-mode clock freq of my i7 chip
const uint64_t LOOP_KNT = 688275225; // 26235^2 // 1000000000;
// ----------------------------------------------------------------------------
__inline uint8_t is_circular_identical(const uint64_t anchor_ring, uint64_t test_ring)
{
    // By trial and error, try to synch 2 circular lists by holding one constant
    // and turning the other 0 to LIST_LENGTH positions. Return compare count.
    // Return the number of tries which aligned the circularly identical rings,
    // where any non-zero value is treated as a bool TRUE. Return a zero/FALSE,
    // if all tries failed to find a sequence match.
    // If anchor_ring and test_ring are equal to start with, return one.
    for (uint8_t i = LIST_LENGTH; i; i--)
    {
        // This function could be made bool, returning TRUE or FALSE, but
        // as a debugging tool, knowing the try_knt that got a match is nice.
        if (anchor_ring == test_ring) { // test all 55 list members simultaneously
            return (LIST_LENGTH + 1) - i;
        }
        if (test_ring % 2) { // ring's tail is 1 ?
            test_ring /= 2; // right-shift 1 bit
            // if the ring tail was 1, set head to 1 to simulate wrapping
            test_ring += head_bit;
        } else { // ring's tail must be 0
            test_ring /= 2; // right-shift 1 bit
            // if the ring tail was 0, doing nothing leaves head a 0
        }
    }
    // if we got here, they can't be circularly identical
    return 0;
}
// ----------------------------------------------------------------------------
int main(void) {
    clock_t start = clock();
    uint64_t anchor, test_ring, i, milliseconds;
    uint8_t try_knt;
    anchor = 31525197391593472; // bits 55,54,53 set true, all others false
    // Anchor right-shifted LIST_LENGTH/2 represents the average search turns
    test_ring = anchor >> (1 + (LIST_LENGTH / 2)); // 117440512;
    printf("\n\nRunning benchmarks for %llu loops.", LOOP_KNT);
    start = clock();
    for (i = LOOP_KNT; i; i--) {
        try_knt = is_circular_identical(anchor, test_ring);
        // The shifting of test_ring below is a test fixture to prevent the
        // optimizer from optimizing the loop away and returning instantly
        if (i % 2) {
            test_ring /= 2;
        } else {
            test_ring *= 2;
        }
    }
    milliseconds = (uint64_t)(clock() - start);
    printf("\nET for is_circular_identical was %f milliseconds."
        "\n\tLast try_knt was %u for test_ring list %llu",
        (double)milliseconds, try_knt, test_ring);
    printf("\nConsuming %7.1f clocks per list.\n",
        (double)((milliseconds * (CPU_FREQ / 1000)) / (uint64_t)LOOP_KNT));
    getchar();
    return 0;
}
Reading between the lines, it sounds as though you're trying to enumerate one representative of each circular equivalence class of strings with 3 ones and 52 zeros. Let's switch from a dense representation to a sparse one (set of three numbers in range(55)). In this representation, the circular shift of s by k is given by the comprehension set((i + k) % 55 for i in s). The lexicographic minimum representative in a class always contains the position 0. Given a set of the form {0, i, j} with 0 < i < j, the other candidates for minimum in the class are {0, j - i, 55 - i} and {0, 55 - j, 55 + i - j}. Hence, we need (i, j) <= min((j - i, 55 - i), (55 - j, 55 + i - j)) for the original to be minimum. Here's some enumeration code.
def makereps():
    reps = []
    for i in range(1, 55 - 1):
        for j in range(i + 1, 55):
            if (i, j) <= min((j - i, 55 - i), (55 - j, 55 + i - j)):
                reps.append('1' + '0' * (i - 1) + '1' + '0' * (j - i - 1) + '1' + '0' * (55 - j - 1))
    return reps
Repeat the first array, then use the Z algorithm (O(n) time) to find the second array inside the first.
(Note: you don't have to physically copy the first array. You can just wrap around during matching.)
The nice thing about the Z algorithm is that it's very simple compared to KMP, BM, etc.
However, if you're feeling ambitious, you could do string matching in linear time and constant space -- strstr, for example, does this. Implementing it would be more painful, though.
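A minimal Python sketch of the Z-algorithm approach might look like this. The names `z_array` and `is_rotation` are hypothetical, and for simplicity it physically builds the pattern + separator + doubled list rather than wrapping around during matching, so it uses O(n) extra space.

```python
def z_array(seq):
    """z[i] = length of the longest prefix of seq that also starts at index i."""
    n = len(seq)
    z = [0] * n
    if n:
        z[0] = n
    left = right = 0   # [left, right) is the rightmost prefix match found so far
    for i in range(1, n):
        if i < right:
            z[i] = min(right - i, z[i - left])   # reuse earlier comparisons
        while i + z[i] < n and seq[z[i]] == seq[i + z[i]]:
            z[i] += 1
        if i + z[i] > right:
            left, right = i, i + z[i]
    return z

def is_rotation(a, b):
    """True if list b is a circular rotation of list a."""
    if len(a) != len(b):
        return False
    if not a:
        return True
    separator = object()   # compares unequal to every list element
    combined = list(b) + [separator] + list(a) + list(a)
    z = z_array(combined)
    # A full-length match of b anywhere inside a + a means b is a rotation of a.
    return any(length >= len(b) for length in z[len(b) + 1:])

print(is_rotation([1, 1, 1, 0, 0], [1, 1, 0, 0, 1]))  # True
print(is_rotation([1, 0, 0, 1], [1, 0, 1, 0]))        # False
```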
Following up on Salvador Dali's very smart solution, the best way to handle it is to make sure all elements are of the same length, as well as both LISTS are of the same length.
def is_circular_equal(lst1, lst2):
    if len(lst1) != len(lst2):
        return False
    lst1, lst2 = map(str, lst1), map(str, lst2)
    len_longest_element = max(map(len, lst1))
    template = "{{:{}}}".format(len_longest_element)
    circ_lst = " ".join([template.format(el) for el in lst1]) * 2
    return " ".join([template.format(el) for el in lst2]) in circ_lst
No clue if this is faster or slower than AshwiniChaudhary's recommended regex solution in Salvador Dali's answer, which reads:
import re
def is_circular_equal(lst1, lst2):
    if len(lst1) != len(lst2):
        return False
    return bool(re.search(r"\b{}\b".format(' '.join(map(str, lst2))),
                          ' '.join(map(str, lst1)) * 2))
Given that you need to do so many comparisons might it be worth your while taking an initial pass through your lists to convert them into some sort of canonical form that can be easily compared?
Are you trying to get a set of circularly-unique lists? If so you can throw them into a set after converting to tuples.
def normalise(lst):
    # Pick the 'maximum' out of all cyclic options
    return max([lst[i:]+lst[:i] for i in range(len(lst))])
a_normalised = map(normalise,a)
a_tuples = map(tuple,a_normalised)
a_unique = set(a_tuples)
Apologies to David Eisenstat for not spotting his v.similar answer.
You can roll one list like this:
list1, list2 = [0,1,1,1,0,0,1,0], [1,0,0,1,0,0,1,1]
str_list1="".join(map(str,list1))
str_list2="".join(map(str,list2))
def rotate(string_to_rotate, result=None):
    if result is None:  # avoid sharing one default list between calls
        result = []
    result.append(string_to_rotate)
    for i in xrange(1,len(string_to_rotate)):
        result.append(result[-1][1:]+result[-1][0])
    return result
for x in rotate(str_list1):
    if cmp(x,str_list2)==0:
        print "lists are rotationally identical"
        break
First convert each of your lists (in a copy if necessary) to the rotated version of it that is lexically greatest.
Then sort the resulting list of lists (retaining an index into the original list position) and unify the sorted list, marking all the duplicates in the original list as needed.
Piggybacking on @SalvadorDali's observation about looking for a match of a in any len(a)-sized slice of b+b, here is a solution using just list operations.
def rollmatch(a,b):
    bb=b*2
    return any(not any(ax^bbx for ax,bbx in zip(a,bb[i:])) for i in range(len(a)))
l1 = [1,0,0,1]
l2 = [1,1,0,0]
l3 = [1,0,1,0]
rollmatch(l1,l2) # True
rollmatch(l1,l3) # False
2nd approach: [deleted]
Not a complete, free-standing answer, but on the topic of optimizing by reducing comparisons, I too was thinking of normalized representations.
Namely, if your input alphabet is {0, 1}, you could reduce the number of allowed permutations significantly. Rotate the first list to a (pseudo-) normalized form (given the distribution in your question, I would pick one where one of the 1 bits is on the extreme left, and one of the 0 bits is on the extreme right). Now before each comparison, successively rotate the other list through the possible positions with the same alignment pattern.
For example, if you have a total of four 1 bits, there can be at most 4 permutations with this alignment, and if you have clusters of adjacent 1 bits, each additional bit in such a cluster reduces the amount of positions.
List 1:  1 1 1 0 1 0
List 2:  1 0 1 1 1 0   1st permutation
         1 1 1 0 1 0   2nd permutation, final permutation, match, done
This generalizes to larger alphabets and different alignment patterns; the main challenge is to find a good normalization with only a few possible representations. Ideally, it would be a proper normalization, with a single unique representation, but given the problem, I don't think that's possible.
Building further on RocketRoy's answer:
Convert all your lists up front to unsigned 64 bit numbers.
For each list, rotate those 55 bits around to find the smallest numerical value.
You are now left with a single unsigned 64 bit value for each list that you can compare straight with the value of the other lists. Function is_circular_identical() is not required anymore.
(In essence, you create an identity value for your lists that is not affected by the rotation of the lists elements)
That would even work if you have an arbitrary number of ones in your lists.
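A minimal Python sketch of that idea, assuming the lists contain only 0 and 1. Python integers are unbounded, so the 64-bit limit does not apply here; the names to_int and canonical_value are invented for the example.

```python
def to_int(bits):
    """Pack a list of 0/1 values into an integer, first item as the highest bit."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def canonical_value(bits):
    """Smallest integer obtainable by rotating the bit pattern of the list."""
    n = len(bits)
    mask = (1 << n) - 1
    value = to_int(bits)
    best = value
    for _ in range(n - 1):
        # Rotate left by one bit within an n-bit word.
        value = ((value << 1) & mask) | (value >> (n - 1))
        best = min(best, value)
    return best


same = canonical_value([1, 0, 0, 1]) == canonical_value([1, 1, 0, 0])
different = canonical_value([1, 0, 0, 1]) == canonical_value([1, 0, 1, 0])
```

Two lists of the same length are rotations of each other exactly when their canonical values are equal, so the values can be stored in a set or sorted directly.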
This is the same idea as Salvador Dali's, but it doesn't need the string conversion. Behind it is the same KMP fallback idea, which avoids inspecting shifts that cannot match. Then you only need to call Search(list1, list2 + list2) on the KmpModified class below.
public class KmpModified
{
public int[] CalculatePhi(int[] pattern)
{
var phi = new int[pattern.Length + 1];
phi[0] = -1;
phi[1] = 0;
int pos = 1, cnd = 0;
while (pos < pattern.Length)
if (pattern[pos] == pattern[cnd])
{
cnd++;
phi[pos + 1] = cnd;
pos++;
}
else if (cnd > 0)
cnd = phi[cnd];
else
{
phi[pos + 1] = 0;
pos++;
}
return phi;
}
public IEnumerable<int> Search(int[] pattern, int[] list)
{
var phi = CalculatePhi(pattern);
int m = 0, i = 0;
while (m < list.Length)
if (pattern[i] == list[m])
{
i++;
if (i == pattern.Length)
{
yield return m - i + 1;
i = phi[i];
}
m++;
}
else if (i > 0)
{
i = phi[i];
}
else
{
i = 0;
m++;
}
}
[Fact]
public void BasicTest()
{
var pattern = new[] { 1, 1, 10 };
var list = new[] {2, 4, 1, 1, 1, 10, 1, 5, 1, 1, 10, 9};
var matches = Search(pattern, list).ToList();
Assert.Equal(new[] {3, 8}, matches);
}
[Fact]
public void SolveProblem()
{
var random = new Random();
var list = new int[10];
for (var k = 0; k < list.Length; k++)
list[k]= random.Next();
var rotation = new int[list.Length];
for (var k = 1; k < list.Length; k++)
rotation[k - 1] = list[k];
rotation[rotation.Length - 1] = list[0];
Assert.True(Search(list, rotation.Concat(rotation).ToArray()).Any());
}
}
Hope this helps!
Simplifying The Problem
The problem consists of lists of ordered items
The domain of values is binary (0, 1)
We can reduce the problem by mapping consecutive 1s into a count
and consecutive 0s into a negative count
Example
A = [ 1, 1, 1, 0, 0, 1, 1, 0 ]
B = [ 1, 1, 0, 1, 1, 1, 0, 0 ]
~
A = [ +3, -2, +2, -1 ]
B = [ +2, -1, +3, -2 ]
This process requires that the first item and the last item be different
This will reduce the number of comparisons overall
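As a rough Python sketch of this mapping (the name map_list mirrors the pseudo-code further down, while is_duplicate here is a deliberately simple brute-force comparison and only an illustration, not the lookup scheme described later):

```python
from itertools import groupby


def map_list(bits):
    """Run-length encode a circular 0/1 list: +k for k ones, -k for k zeros."""
    n = len(bits)
    # Rotate so the first and last items differ; then no run straddles the wrap-around.
    start = next((k for k in range(n) if bits[k] != bits[k - 1]), 0)
    rotated = bits[start:] + bits[:start]
    return [(1 if value else -1) * len(list(run)) for value, run in groupby(rotated)]


def is_duplicate(l1, l2):
    """True when l2 is a rotation of l1, compared on the shorter run-length form."""
    if len(l1) != len(l2):
        return False
    a, b = map_list(l1), map_list(l2)
    if len(a) != len(b):
        return False
    return any(a == b[k:] + b[:k] for k in range(max(len(b), 1)))


A = [1, 1, 1, 0, 0, 1, 1, 0]
B = [1, 1, 0, 1, 1, 1, 0, 0]
```

With the example lists above, map_list(A) gives [3, -2, 2, -1] and map_list(B) gives [2, -1, 3, -2], so only four items per list need to be compared instead of eight.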
Checking Process
If we assume that they're duplicate, then we can assume what we are looking for
Basically the first item from the first list must exist somewhere in the other list
Followed by what is followed in the first list, and in the same manner
The previous items should be the last items from the first list
Since it's circular, the order is the same
The Grip
The question here is where to start, technically known as lookup and look-ahead
We will just check where the first element of the first list exists in the second list
The probability of hitting a frequent element is lower, given that we mapped the lists into histograms
Pseudo-Code
FUNCTION IS_DUPLICATE (LIST L1, LIST L2) : BOOLEAN
    LIST A = MAP_LIST(L1)
    LIST B = MAP_LIST(L2)
    LIST ALPHA = LOOKUP_INDEX(B, A[0])
    IF A.SIZE != B.SIZE
       OR COUNT_CHAR(A, 0) != COUNT_CHAR(B, ALPHA[0]) THEN
        RETURN FALSE
    END IF
    FOR EACH INDEX IN ALPHA
        IF ALPHA_NGRAM(A, B, INDEX, 1) THEN
            IF IS_DUPLICATE(A, B, INDEX) THEN
                RETURN TRUE
            END IF
        END IF
    END FOR
    RETURN FALSE
END FUNCTION
FUNCTION IS_DUPLICATE (LIST L1, LIST L2, INTEGER INDEX) : BOOLEAN
    INTEGER I = 0
    WHILE I < L1.SIZE DO
        IF L1[I] != L2[(INDEX+I)%L2.SIZE] THEN
            RETURN FALSE
        END IF
        I = I + 1
    END WHILE
    RETURN TRUE
END FUNCTION
Functions
MAP_LIST(LIST A): LIST - MAP CONSECUTIVE ELEMENTS AS COUNTS IN A NEW LIST
LOOKUP_INDEX(LIST A, INTEGER E): LIST - RETURN LIST OF INDICES WHERE THE ELEMENT E EXISTS IN THE LIST A
COUNT_CHAR(LIST A, INTEGER E): INTEGER - COUNT HOW MANY TIMES AN ELEMENT E OCCURS IN A LIST A
ALPHA_NGRAM(LIST A, LIST B, INTEGER I, INTEGER N): BOOLEAN - CHECK IF B[I] IS EQUIVALENT TO A[0] N-GRAM IN BOTH DIRECTIONS
Finally
If the lists are going to be very large, or if the element we start checking the cycle from occurs very frequently, then we can do the following:
Look for the least-frequent item in the first list to start with
Increase the n-gram N parameter to lower the probability of going through the linear check
An efficient, quick-to-compute "canonical form" for the lists in question can be derived as:
Count the number of zeroes between the ones (ignoring wrap-around), to get three numbers.
Rotate the three numbers so that the biggest number is first.
The first number (a) must be between 18 and 52 (inclusive). Re-encode it as between 0 and 34.
The second number (b) must be between 0 and 26, but it doesn't matter much.
Drop the third number, since it's just 52 - (a + b) and adds no information
The canonical form is the integer b * 35 + a, which is between 0 and 936 (inclusive), which is fairly compact (there are 477 circularly-unique lists in total).
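A possible Python reading of these steps, assuming (as the numbers above imply) lists of 55 items containing exactly three ones. The name canonical_form is invented, the gap that wraps around the end of the list is counted as the third gap, and ties for the biggest gap are broken by taking the lexicographically greatest rotation, which is my own assumption.

```python
def canonical_form(bits):
    """Map a 55-item list with exactly three ones to the integer b * 35 + (a - 18)."""
    ones = [k for k, bit in enumerate(bits) if bit == 1]
    if len(bits) != 55 or len(ones) != 3:
        raise ValueError("expected 55 items with exactly three ones")
    p0, p1, p2 = ones
    # Zeroes between consecutive ones; the last gap wraps around the end of the list.
    gaps = [p1 - p0 - 1, p2 - p1 - 1, p0 + 55 - p2 - 1]
    # Rotate so the biggest gap comes first (greatest rotation breaks ties).
    a, b, _ = max(gaps[k:] + gaps[:k] for k in range(3))
    # The third gap is 52 - (a + b), so it carries no extra information.
    return b * 35 + (a - 18)


example = [0] * 55
for position in (0, 1, 2):
    example[position] = 1
rotated = example[7:] + example[:7]
```

Rotating a list leaves its canonical form unchanged, so equality of the integers can stand in for the circular comparison.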
I wrote a straightforward solution which compares both lists and just increases (and wraps around) the index of the compared value for each iteration.
I don't know python well so I wrote it in Java, but it's really simple so it should be easy to adapt it to any other language.
By this you could also compare lists of other types.
public class Main {
public static void main(String[] args){
int[] a = {0,1,1,1,0};
int[] b = {1,1,0,0,1};
System.out.println(isCircularIdentical(a, b));
}
public static boolean isCircularIdentical(int[] a, int[]b){
if(a.length != b.length){
return false;
}
//The outer loop is for the increase of the index of the second list
outer:
for(int i = 0; i < a.length; i++){
//Loop through the list and compare each value to the corresponding value of the second list
for(int k = 0; k < a.length; k++){
// I use modulo length to wrap around the index
if(a[k] != b[(k + i) % a.length]){
//If the values do not match I continue and shift the index one further
continue outer;
}
}
return true;
}
return false;
}
}
As others have mentioned, once you find the normalized rotation of each list, you can compare them.
Here's some working code that does this.
Basic method is to find a normalized rotation for each list and compare:
Calculate a normalized rotation index on each list.
Loop over both lists with their offsets, comparing each item, returning if they mismatch.
Note that this method doesn't depend on numbers; you can pass in lists of strings (any values which can be compared).
Instead of doing a list-in-list search, we know we want the list to start with the minimum value, so we can loop over the minimum values, searching until we find which one has the lowest successive values, storing this for further comparisons until we have the best.
There are many opportunities to exit early when calculating the index. Details on some optimizations:
Skip searching for the best minimum value when there's only one.
Skip searching minimum values when the previous is also a minimum value (it will never be a better match).
Skip searching when all values are the same.
Fail early when lists have different minimum values.
Use regular comparison when offsets match.
Adjust offsets to avoid wrapping the index values on one of the lists during comparison.
Note that in Python a list-in-list search may well be faster; however, I was interested in finding an efficient algorithm which could be used in other languages too. Also, there is some advantage to avoiding the creation of new lists.
def normalize_rotation_index(ls, v_min_other=None):
    """ Return the index or -1 (when the minimum is above `v_min_other`) """
    if len(ls) <= 1:
        return 0
    def compare_rotations(i_a, i_b):
        """ Return True when i_a is smaller.
        Note: unless there are large duplicate sections of identical values,
        this loop will exit early on.
        """
        for offset in range(1, len(ls)):
            v_a = ls[(i_a + offset) % len(ls)]
            v_b = ls[(i_b + offset) % len(ls)]
            if v_a < v_b:
                return True
            elif v_a > v_b:
                return False
        return False
    v_min = ls[0]
    i_best_first = 0
    i_best_last = 0
    i_best_total = 1
    for i in range(1, len(ls)):
        v = ls[i]
        if v_min > v:
            v_min = v
            i_best_first = i
            i_best_last = i
            i_best_total = 1
        elif v_min == v:
            i_best_last = i
            i_best_total += 1
    # all values match
    if i_best_total == len(ls):
        return 0
    # exit early if we're not matching another lists minimum
    if v_min_other is not None:
        if v_min != v_min_other:
            return -1
    # simple case, only one minimum
    if i_best_first == i_best_last:
        return i_best_first
    # otherwise find the minimum with the lowest values compared to all others.
    # start looking after the first we've found
    i_best = i_best_first
    for i in range(i_best_first + 1, i_best_last + 1):
        if (ls[i] == v_min) and (ls[i - 1] != v_min):
            if compare_rotations(i, i_best):
                i_best = i
    return i_best
def compare_circular_lists(ls_a, ls_b):
    # sanity checks
    if len(ls_a) != len(ls_b):
        return False
    if len(ls_a) <= 1:
        return (ls_a == ls_b)
    index_a = normalize_rotation_index(ls_a)
    index_b = normalize_rotation_index(ls_b, ls_a[index_a])
    if index_b == -1:
        return False
    if index_a == index_b:
        return (ls_a == ls_b)
    # cancel out 'index_a'
    index_b = (index_b - index_a)
    if index_b < 0:
        index_b += len(ls_a)
    index_a = 0  # ignore it
    # compare rotated lists
    for i in range(len(ls_a)):
        if ls_a[i] != ls_b[(index_b + i) % len(ls_b)]:
            return False
    return True
assert(compare_circular_lists([0, 9, -1, 2, -1], [-1, 2, -1, 0, 9]) == True)
assert(compare_circular_lists([2, 9, -1, 0, -1], [-1, 2, -1, 0, 9]) == False)
assert(compare_circular_lists(["Hello", "Circular", "World"], ["World", "Hello", "Circular"]) == True)
assert(compare_circular_lists(["Hello", "Circular", "World"], ["Circular", "Hello", "World"]) == False)
See: this snippet for some more tests/examples.
You can check to see if a list A is equal to a cyclic shift of list B in expected O(N) time pretty easily.
I would use a polynomial hash function to compute the hash of list A, and every cyclic shift of list B. Where a shift of list B has the same hash as list A, I'd compare the actual elements to see if they are equal.
The reason this is fast is that with polynomial hash functions (which are extremely common!), you can calculate the hash of each cyclic shift from the previous one in constant time, so you can calculate hashes for all of the cyclic shifts in O(N) time.
It works like this:
Let's say B has N elements, then the hash of B using prime P is:
Hb=0;
for (i=0; i<N ; i++)
{
Hb = Hb*P + B[i];
}
This is an optimized way to evaluate a polynomial in P, and is equivalent to:
Hb=0;
for (i=0; i<N ; i++)
{
Hb += B[i] * P^(N-1-i); //^ is exponentiation, not XOR
}
Notice how every B[i] is multiplied by P^(N-1-i). If we shift B to the left by 1, then every B[i] will be multiplied by an extra P, except the first one. Since multiplication distributes over addition, we can multiply all the components at once just by multiplying the whole hash, and then fix up the factor for the first element.
The hash of the left shift of B is just
Hb1 = Hb*P + B[0]*(1-(P^N))
The second left shift:
Hb2 = Hb1*P + B[1]*(1-(P^N))
and so on...
NOTE: all math above is performed modulo some machine word size, and you only have to calculate P^N once.
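Here is a small Python sketch of the rolling update described above, assuming the list elements are integers. The function name is_cyclic_shift, the base p and the modulus (a large prime rather than a machine word size) are illustrative choices, not something the answer prescribes.

```python
def is_cyclic_shift(a, b, p=1000003, mod=(1 << 61) - 1):
    """Return True if list a equals some cyclic left shift of list b."""
    n = len(a)
    if n != len(b):
        return False
    if n == 0:
        return True

    def poly_hash(seq):
        h = 0
        for x in seq:
            h = (h * p + x) % mod
        return h

    ha = poly_hash(a)
    hb = poly_hash(b)
    # Factor (1 - P^N), computed once and reused for every shift.
    fix = (1 - pow(p, n, mod)) % mod
    for shift in range(n):
        # On a hash match, confirm with a real comparison to rule out collisions.
        if hb == ha and a == b[shift:] + b[:shift]:
            return True
        # Hash of the next left shift: Hb*P + B[shift]*(1 - P^N)
        hb = (hb * p + b[shift] * fix) % mod
    return False


match = is_cyclic_shift([1, 1, 1, 0, 0], [0, 1, 1, 1, 0])
no_match = is_cyclic_shift([1, 0, 0, 1], [1, 0, 1, 0])
```

The element-by-element comparison only runs when the hashes agree, which is what keeps the expected cost linear when collisions are rare.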
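To stick to the most Pythonic way to do it, use sets! Note that a set keeps only the distinct values and discards their order, so this check is only a necessary condition: any two lists containing the same values compare equal, whether or not one is a rotation of the other.
a = set([1, 1, 1, 0, 0])
b = set([0, 1, 1, 1, 0])
c = set([1, 0, 0, 1, 1])
a == b
True
a == b == c
True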

Get a permutation as a function of a unique given index in O(n)

I would like to have a function get_permutation that, given a list l and an index i, returns a permutation of l such that the permutations are unique for all i with 0 <= i < n! (where n = len(l)).
I.e. get_permutation(l, i) != get_permutation(l, j) if i != j, for all i, j s.t. 0 <= i, j < len(l)!.
Moreover, this function has to run in O(n).
For example, this function would comply with the requirements, if it weren't for the exponential order:
import itertools
def get_permutation(l, i):
    return list(itertools.permutations(l))[i]
Does anyone have a solution for the above described problem?
EDIT: I want the permutation from the index NOT the index from the permutation
If you don't care about which permutations get which indices, an O(n) solution becomes possible if we consider that arithmetic operations with arbitrary integers are O(1).
For example, see the paper "Ranking and unranking permutations in linear time" by Wendy Myrvold and Frank Ruskey.
In short, there are two ideas.
(1) Consider Fisher-Yates shuffle method to generate a random permutation (pseudocode below):
p = [0, 1, ..., n-1]
for i := 0 upto n-1:
    j := random_integer (0, i)
    exchange p[i] and p[j]
This transform is injective: if we give it a different sequence of random integers, it is guaranteed to produce a different permutation. So, we substitute random integers by non-random ones: the first one is 0, the second one 0 or 1, ..., the last one can be any integer from 0 to n-1.
(2) There are n! permutations of order n. What we want to do now is to write an integer from 0 to n!-1 in factorial number system: the last digit is always 0, the previous one is 0 or 1, ..., and there are n possibilities from 0 to n-1 for the first digit. Thus we will get a unique sequence to feed the above pseudocode with.
Now, if we consider division of our number by an integer from 1 to n to be O(1) operation, transforming the number to factorial system is O(n) such divisions. This is, strictly speaking, not true: for large n, the number n! contains on the order of O(n log n) binary digits, and that division's cost is proportional to the number of digits.
In practice, for small n, O(n^2) or O(n log n) methods to rank or unrank a permutation, and also methods requiring O(2^n) or O(n!) memory to store some precomputed values, may be faster than an O(n) method involving integer division, which is a relatively slow operation on modern processors.
For n large enough so that the n! does not fit into a machine word, the "O(n) if order-n! integer operations are O(1)" argument stops working. So, you may be better off for both small and large n if you don't insist on it being theoretically O(n).
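Putting ideas (1) and (2) together, a short Python sketch might look like the following. It is only an illustration of the approach and is not taken from the paper; which permutation ends up at which index is whatever the swaps happen to produce.

```python
def get_permutation(l, i):
    """Return a permutation of l that is distinct for each i in 0 .. n!-1."""
    p = list(l)
    for k in range(len(p)):
        # Peel off the next factorial-base digit: j is in 0..k, as in the shuffle above.
        i, j = divmod(i, k + 1)
        p[k], p[j] = p[j], p[k]
    return p


all_perms = {tuple(get_permutation(range(4), i)) for i in range(24)}
```

The loop does n divisions and n swaps, so it is O(n) under the assumption discussed above that each division counts as a constant-time operation.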
Based on http://www.2ality.com/2013/03/permutations.html here's a possible solution. As @Gassa pointed out, elements.pop is not a constant-time operation, and hence the solution is not linear in the length of the list. Therefore, I won't mark this as an accepted answer. But it does the job.
import math
def integerToCode(idx, permSize):
    if permSize <= 1:
        return [0]
    multiplier = math.factorial(permSize - 1)
    digit = idx // multiplier  # integer division: the next factorial-base digit
    return [digit] + integerToCode(idx % multiplier, permSize - 1)
def codeToPermutation(elements, code):
    # pop() removes the chosen element, so later digits index the remaining ones
    return [elements.pop(i) for i in code]
def get_permutation(l, i):
    c = integerToCode(i, len(l))
    return codeToPermutation(list(l), c)
Update: possible dupe of Finding n-th permutation without computing others, see there for algorithm.
If len(l) will be small, you could precompute perm_index = list(permutations(range(len(l)))) and use it as a list of lists of indexes into your actual data.
Moreover, if you have a list of permutations from range(len(l)) and you need one for range(len(l) - 1), you can do something like this (for i < (len(l) - 1)!, i.e. while the permutation still starts with 0):
[x - 1 for x in perm_index[i][1:]]
Which takes advantage of the fact that the permutations are in sorted order when generated.
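To make this concrete, here is a small sketch; the data list and the helper name smaller_perms are made up for the example. It relies on itertools.permutations emitting permutations in lexicographic order when the input is sorted.

```python
from itertools import permutations
from math import factorial

data = ["a", "b", "c", "d"]
n = len(data)

# Precompute every permutation of the indexes 0..n-1, in sorted order.
perm_index = list(permutations(range(n)))

# Permutation number i of the actual data:
i = 3
picked = [data[k] for k in perm_index[i]]


def smaller_perms(perm_index, n):
    """Derive the permutations of range(n - 1) from those of range(n)."""
    # Only the first (n-1)! entries start with 0; drop that 0 and shift the rest down.
    return [tuple(x - 1 for x in perm_index[i][1:]) for i in range(factorial(n - 1))]
```

Here picked is simply data rearranged by the fourth precomputed index tuple, and smaller_perms(perm_index, 4) reproduces the permutations of range(3) in order.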
This solution works in O(1) per lookup (runtime complexity; amortised cost for dictionary lookups), although producing a new permutation of n items still costs O(n):
Code
#!/usr/bin/env python
import itertools
def get_permutation():
    memoize = {}
    def _memoizer(l, i):
        key = str(l)
        if key not in memoize:
            # first request for this list: create its permutation generator
            memoize[key] = {'permutations': itertools.permutations(l)}
        if i not in memoize[key]:
            # first request for this index: assign it the next unused permutation
            memoize[key][i] = next(memoize[key]['permutations'])
        return memoize[key][i]
    return _memoizer
if __name__ == '__main__':
    get_permutation = get_permutation()
    l1 = list(range(10))
    l2 = list(range(5))
    print(get_permutation(l1, 1))
    print(get_permutation(l1, 20))
    print(get_permutation(l2, 3))
Output
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
(0, 1, 2, 3, 4, 5, 6, 7, 9, 8)
(0, 1, 2, 3, 4)
How it works
The code stores all past calls in a dictionary. It also stores the permutation object(s). So in case a new permutation gets requested, the next permutation is used.
The code uses itertools.permutations
A bit too late... C# code that should give you the result you expect:
using System;
using System.Collections.Generic;
namespace WpfPermutations
{
public class PermutationOuelletLexico3<T>
{
// ************************************************************************
private T[] _sortedValues;
private bool[] _valueUsed;
public readonly long MaxIndex; // long to support 20! or less
// ************************************************************************
public PermutationOuelletLexico3(T[] sortedValues)
{
if (sortedValues.Length <= 0)
{
throw new ArgumentException("sortedValues.Length should be greater than 0");
}
_sortedValues = sortedValues;
Result = new T[_sortedValues.Length];
_valueUsed = new bool[_sortedValues.Length];
MaxIndex = Factorial.GetFactorial(_sortedValues.Length);
}
// ************************************************************************
public T[] Result { get; private set; }
// ************************************************************************
/// <summary>
/// Return the permutation relative to the index received, according to
/// _sortedValues.
/// Sort Index is 0 based and should be less than MaxIndex. Otherwise you get an exception.
/// </summary>
/// <param name="sortIndex">0-based index of the requested permutation.</param>
/// <remarks>The permutation is written to Result; the buffer is re-used in order to save memory.</remarks>
public void GetValuesForIndex(long sortIndex)
{
int size = _sortedValues.Length;
if (sortIndex < 0)
{
throw new ArgumentException("sortIndex should be greater or equal to 0.");
}
if (sortIndex >= MaxIndex)
{
throw new ArgumentException("sortIndex should be less than factorial(the length of items)");
}
for (int n = 0; n < _valueUsed.Length; n++)
{
_valueUsed[n] = false;
}
long factorielLower = MaxIndex;
for (int index = 0; index < size; index++)
{
long factorielBigger = factorielLower;
factorielLower = Factorial.GetFactorial(size - index - 1); // factorielBigger / inverseIndex;
int resultItemIndex = (int)(sortIndex % factorielBigger / factorielLower);
int correctedResultItemIndex = 0;
for(;;)
{
if (! _valueUsed[correctedResultItemIndex])
{
resultItemIndex--;
if (resultItemIndex < 0)
{
break;
}
}
correctedResultItemIndex++;
}
Result[index] = _sortedValues[correctedResultItemIndex];
_valueUsed[correctedResultItemIndex] = true;
}
}
// ************************************************************************
/// <summary>
/// Calc the index, relative to _sortedValues, of the permutation received
/// as argument. Returned index is 0 based.
/// </summary>
/// <param name="values"></param>
/// <returns></returns>
public long GetIndexOfValues(T[] values)
{
int size = _sortedValues.Length;
long valuesIndex = 0;
List<T> valuesLeft = new List<T>(_sortedValues);
for (int index = 0; index < size; index++)
{
long indexFactorial = Factorial.GetFactorial(size - 1 - index);
T value = values[index];
int indexCorrected = valuesLeft.IndexOf(value);
valuesIndex = valuesIndex + (indexCorrected * indexFactorial);
valuesLeft.Remove(value);
}
return valuesIndex;
}
// ************************************************************************
}
}
