Why is "1000000000000000 in range(1000000000000001)" so fast in Python 3? - python

It is my understanding that the range() function, which is actually an object type in Python 3, generates its contents on the fly, similar to a generator.
This being the case, I would have expected the following line to take an inordinate amount of time because, in order to determine whether 1 quadrillion is in the range, a quadrillion values would have to be generated:
1_000_000_000_000_000 in range(1_000_000_000_000_001)
Furthermore: it seems that no matter how many zeroes I add on, the calculation more or less takes the same amount of time (basically instantaneous).
I have also tried things like this, but the calculation is still almost instant:
# count by tens
1_000_000_000_000_000_000_000 in range(0,1_000_000_000_000_000_000_001,10)
If I try to implement my own range function, the result is not so nice!
def my_crappy_range(N):
    i = 0
    while i < N:
        yield i
        i += 1
    return
What is the range() object doing under the hood that makes it so fast?
Martijn Pieters's answer was chosen for its completeness, but also see abarnert's first answer for a good discussion of what it means for range to be a full-fledged sequence in Python 3, and for some information and warnings about the potential inconsistency of the __contains__ optimization across Python implementations. abarnert's other answer goes into more detail and provides links for those interested in the history behind the optimization in Python 3 (and the lack of optimization of xrange in Python 2). Answers by poke and by wim provide the relevant C source code and explanations for those who are interested.

The Python 3 range() object doesn't produce numbers immediately; it is a smart sequence object that produces numbers on demand. All it contains is your start, stop and step values; then, as you iterate over the object, the next integer is calculated on each iteration.
The object also implements the object.__contains__ hook, which calculates whether your number is part of its range. That calculation is a (near) constant-time operation*. There is never a need to scan through all possible integers in the range.
From the range() object documentation:
The advantage of the range type over a regular list or tuple is that a range object will always take the same (small) amount of memory, no matter the size of the range it represents (as it only stores the start, stop and step values, calculating individual items and subranges as needed).
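You can see this directly: a range object's footprint stays tiny no matter what span it represents, while a list's grows with its contents (a quick demonstration; exact byte counts vary by Python version and platform):

>>> import sys
>>> sys.getsizeof(range(1_000_000_000_000_000))  # constant, tiny
48
>>> sys.getsizeof(list(range(1_000_000)))        # grows with the contents
8000056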
So at a minimum, your range() object would do:
class my_range:
    def __init__(self, start, stop=None, step=1, /):
        if stop is None:
            start, stop = 0, start
        self.start, self.stop, self.step = start, stop, step
        if step < 0:
            lo, hi, step = stop, start, -step
        else:
            lo, hi = start, stop
        self.length = 0 if lo > hi else ((hi - lo - 1) // step) + 1

    def __iter__(self):
        current = self.start
        if self.step < 0:
            while current > self.stop:
                yield current
                current += self.step
        else:
            while current < self.stop:
                yield current
                current += self.step

    def __len__(self):
        return self.length

    def __getitem__(self, i):
        if i < 0:
            i += self.length
        if 0 <= i < self.length:
            return self.start + i * self.step
        raise IndexError('my_range object index out of range')

    def __contains__(self, num):
        if self.step < 0:
            if not (self.stop < num <= self.start):
                return False
        else:
            if not (self.start <= num < self.stop):
                return False
        return (num - self.start) % self.step == 0
This is still missing several things that a real range() supports (such as the .index() or .count() methods, hashing, equality testing, or slicing), but should give you an idea.
I also simplified the __contains__ implementation to only focus on integer tests; if you give a real range() object a non-integer value (including subclasses of int), a slow scan is initiated to see if there is a match, just as if you use a containment test against a list of all the contained values. This was done to continue to support other numeric types that just happen to support equality testing with integers but are not expected to support integer arithmetic as well. See the original Python issue that implemented the containment test.
* Near-constant time because Python integers are unbounded, so the math operations grow in time as N grows, making this an O(log N) operation. Since it's all executed in optimised C code and Python stores integer values in 30-bit chunks, you'd run out of memory before you saw any performance impact due to the size of the integers involved here.
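To see that the class above really does what's claimed, here's a quick check against the numbers from the question (a usage sketch; my_range is the class defined above):

>>> 1_000_000_000_000_000 in my_range(1_000_000_000_000_001)  # instant
True
>>> 1_000_000_000_000_000_000_000 in my_range(0, 10**21 + 1, 10)
True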

The fundamental misunderstanding here is in thinking that range is a generator. It's not. In fact, it's not any kind of iterator.
You can tell this pretty easily:
>>> a = range(5)
>>> print(list(a))
[0, 1, 2, 3, 4]
>>> print(list(a))
[0, 1, 2, 3, 4]
If it were a generator, iterating it once would exhaust it:
>>> b = my_crappy_range(5)
>>> print(list(b))
[0, 1, 2, 3, 4]
>>> print(list(b))
[]
What range actually is, is a sequence, just like a list. You can even test this:
>>> import collections.abc
>>> isinstance(a, collections.abc.Sequence)
True
This means it has to follow all the rules of being a sequence:
>>> a[3] # indexable
3
>>> len(a) # sized
5
>>> 3 in a # membership
True
>>> reversed(a) # reversible
<range_iterator at 0x101cd2360>
>>> a.index(3) # implements 'index'
3
>>> a.count(3) # implements 'count'
1
The difference between a range and a list is that a range is a lazy or dynamic sequence; it doesn't remember all of its values, it just remembers its start, stop, and step, and creates the values on demand on __getitem__.
(As a side note, if you print(iter(a)), you'll see that CPython actually gives range its own range_iterator type, as shown above, but that's incidental: a generic sequence iterator doesn't need anything from the underlying object except a C implementation of __getitem__, so the same iterator machinery that works for list would work fine for range too.)
Now, there's nothing that says that Sequence.__contains__ has to be constant time—in fact, for obvious examples of sequences like list, it isn't. But there's nothing that says it can't be. And it's easier to implement range.__contains__ to just check it mathematically ((val - start) % step, but with some extra complexity to deal with negative steps) than to actually generate and test all the values, so why shouldn't it do it the better way?
But there doesn't seem to be anything in the language that guarantees this will happen. As Ashwini Chaudhary points out, if you give it a non-integral value, instead of converting to integer and doing the mathematical test, it will fall back to iterating all the values and comparing them one by one. And just because CPython 3.2+ and PyPy 3.x versions happen to contain this optimization, and it's an obvious good idea and easy to do, there's no reason that IronPython or NewKickAssPython 3.x couldn't leave it out. (And in fact, CPython 3.0-3.1 didn't include it.)
If range actually were a generator, like my_crappy_range, then it wouldn't make sense to test __contains__ this way, or at least the way it makes sense wouldn't be obvious. If you'd already iterated the first 3 values, is 1 still in the generator? Should testing for 1 cause it to iterate and consume all the values up to 1 (or up to the first value >= 1)?
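To make that last point concrete, here is what a containment test does to a real generator such as my_crappy_range from the question (values are consumed as a side effect):

>>> g = my_crappy_range(5)
>>> 3 in g   # falls back to iterating, consuming 0, 1, 2 and 3
True
>>> 4 in g   # 4 is still ahead, so this works...
True
>>> 2 in g   # ...but 2 was already consumed
False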

Use the source, Luke!
In CPython, range(...).__contains__ (a method wrapper) will eventually delegate to a simple calculation which checks if the value can possibly be in the range. The reason for the speed here is we're using mathematical reasoning about the bounds, rather than a direct iteration of the range object. To explain the logic used:
Check that the number is between start and stop, and
Check that the stride value doesn't "step over" our number.
For example, 994 is in range(4, 1000, 2) because:
4 <= 994 < 1000, and
(994 - 4) % 2 == 0.
The full C code is included below, which is a bit more verbose because of memory management and reference counting details, but the basic idea is there:
static int
range_contains_long(rangeobject *r, PyObject *ob)
{
    int cmp1, cmp2, cmp3;
    PyObject *tmp1 = NULL;
    PyObject *tmp2 = NULL;
    PyObject *zero = NULL;
    int result = -1;

    zero = PyLong_FromLong(0);
    if (zero == NULL) /* MemoryError in int(0) */
        goto end;

    /* Check if the value can possibly be in the range. */
    cmp1 = PyObject_RichCompareBool(r->step, zero, Py_GT);
    if (cmp1 == -1)
        goto end;
    if (cmp1 == 1) { /* positive steps: start <= ob < stop */
        cmp2 = PyObject_RichCompareBool(r->start, ob, Py_LE);
        cmp3 = PyObject_RichCompareBool(ob, r->stop, Py_LT);
    }
    else { /* negative steps: stop < ob <= start */
        cmp2 = PyObject_RichCompareBool(ob, r->start, Py_LE);
        cmp3 = PyObject_RichCompareBool(r->stop, ob, Py_LT);
    }

    if (cmp2 == -1 || cmp3 == -1) /* TypeError */
        goto end;
    if (cmp2 == 0 || cmp3 == 0) { /* ob outside of range */
        result = 0;
        goto end;
    }

    /* Check that the stride does not invalidate ob's membership. */
    tmp1 = PyNumber_Subtract(ob, r->start);
    if (tmp1 == NULL)
        goto end;
    tmp2 = PyNumber_Remainder(tmp1, r->step);
    if (tmp2 == NULL)
        goto end;

    /* result = ((int(ob) - start) % step) == 0 */
    result = PyObject_RichCompareBool(tmp2, zero, Py_EQ);

end:
    Py_XDECREF(tmp1);
    Py_XDECREF(tmp2);
    Py_XDECREF(zero);
    return result;
}

static int
range_contains(rangeobject *r, PyObject *ob)
{
    if (PyLong_CheckExact(ob) || PyBool_Check(ob))
        return range_contains_long(r, ob);

    return (int)_PySequence_IterSearch((PyObject*)r, ob,
                                       PY_ITERSEARCH_CONTAINS);
}
The "meat" of the idea is mentioned in the comment lines:
/* positive steps: start <= ob < stop */
/* negative steps: stop < ob <= start */
/* result = ((int(ob) - start) % step) == 0 */
As a final note - look at the range_contains function at the bottom of the code snippet. If the exact type check fails then we don't use the clever algorithm described, instead falling back to a dumb iteration search of the range using _PySequence_IterSearch! You can check this behaviour in the interpreter (I'm using v3.5.0 here):
>>> x, r = 1000000000000000, range(1000000000000001)
>>> class MyInt(int):
...     pass
...
>>> x_ = MyInt(x)
>>> x in r # calculates immediately :)
True
>>> x_ in r # iterates for ages.. :(
^\Quit (core dumped)

To add to Martijn’s answer, this is the relevant part of the source (in C, as the range object is written in native code):
static int
range_contains(rangeobject *r, PyObject *ob)
{
    if (PyLong_CheckExact(ob) || PyBool_Check(ob))
        return range_contains_long(r, ob);

    return (int)_PySequence_IterSearch((PyObject*)r, ob,
                                       PY_ITERSEARCH_CONTAINS);
}
So for PyLong objects (which is int in Python 3), it will use the range_contains_long function to determine the result. And that function essentially checks if ob is in the specified range (although it looks a bit more complex in C).
If it’s not an int object, it falls back to iterating until it finds the value (or not).
The whole logic could be translated to pseudo-Python like this:
def range_contains(rangeObj, obj):
    if isinstance(obj, int):
        return range_contains_long(rangeObj, obj)
    # default logic by iterating
    return any(obj == x for x in rangeObj)

def range_contains_long(r, num):
    if r.step > 0:
        # positive step: r.start <= num < r.stop
        cmp2 = r.start <= num
        cmp3 = num < r.stop
    else:
        # negative step: r.start >= num > r.stop
        cmp2 = num <= r.start
        cmp3 = r.stop < num

    # outside of the range boundaries
    if not cmp2 or not cmp3:
        return False

    # num must be on a valid step inside the boundaries
    return (num - r.start) % r.step == 0
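Since Python 3.3, range objects expose .start, .stop and .step attributes, so you can run the pseudo-Python above against a real range (a quick sanity check, nothing more):

>>> range_contains(range(4, 1000, 2), 994)
True
>>> range_contains(range(4, 1000, 2), 995)    # the step skips over 995
False
>>> range_contains(range(4, 1000, 2), 994.0)  # non-int: falls back to iterating
True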

If you're wondering why this optimization was added to range.__contains__, and why it wasn't added to xrange.__contains__ in 2.7:
First, as Ashwini Chaudhary discovered, issue 1766304 was opened explicitly to optimize [x]range.__contains__. A patch for this was accepted and checked in for 3.2, but not backported to 2.7 because "xrange has behaved like this for such a long time that I don't see what it buys us to commit the patch this late." (2.7 was nearly out at that point.)
Meanwhile:
Originally, xrange was a not-quite-sequence object. As the 3.1 docs say:
Range objects have very little behavior: they only support indexing, iteration, and the len function.
This wasn't quite true; an xrange object actually supported a few other things that come automatically with indexing and len,* including __contains__ (via linear search). But nobody thought it was worth making them full sequences at the time.
Then, as part of implementing the Abstract Base Classes PEP, it was important to figure out which builtin types should be marked as implementing which ABCs, and xrange/range claimed to implement collections.Sequence, even though it still only handled the same "very little behavior". Nobody noticed that problem until issue 9213. The patch for that issue not only added index and count to 3.2's range, it also re-worked the optimized __contains__ (which shares the same math with index, and is directly used by count).** This change went in for 3.2 as well, and was not backported to 2.x, because "it's a bugfix that adds new methods". (At this point, 2.7 was already past rc status.)
So, there were two chances to get this optimization backported to 2.7, but they were both rejected.
* In fact, you even get iteration for free with indexing alone, but in 2.3 xrange objects got a custom iterator.
** The first version actually reimplemented it, and got the details wrong—e.g., it would give you MyIntSubclass(2) in range(5) == False. But Daniel Stutzbach's updated version of the patch restored most of the previous code, including the fallback to the generic, slow _PySequence_IterSearch that pre-3.2 range.__contains__ was implicitly using when the optimization doesn't apply.

The other answers explained it well already, but I'd like to offer another experiment illustrating the nature of range objects:
>>> r = range(5)
>>> for i in r:
...     print(i, 2 in r, list(r))
...
0 True [0, 1, 2, 3, 4]
1 True [0, 1, 2, 3, 4]
2 True [0, 1, 2, 3, 4]
3 True [0, 1, 2, 3, 4]
4 True [0, 1, 2, 3, 4]
As you can see, a range object is an object that remembers its range and can be used many times (even while iterating over it), not just a one-time generator.

It's all about a lazy approach to evaluation plus some extra optimization in range. Values in a range don't need to be computed until real use, and sometimes not even then, thanks to the extra optimization.
By the way, your integer is not that big; consider sys.maxsize:
sys.maxsize in range(sys.maxsize)  # pretty fast
This is fast due to the optimization: it's easy to compare the given integer just with the min and max of the range.
But:
Decimal(sys.maxsize) in range(sys.maxsize)  # pretty slow
In this case there is no optimization in range, so if Python receives an unexpected Decimal, it will compare all the numbers.
You should be aware that this is an implementation detail and should not be relied upon, because it may change in the future.
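A rough way to see the difference yourself (a minimal sketch; timings are machine-dependent, and the Decimal case is deliberately scaled down so it finishes quickly):

import timeit

# Hits the (near) constant-time __contains__ fast path for exact ints.
print(timeit.timeit('10**15 in range(10**15 + 1)', number=1000))

# A Decimal forces the linear fallback, so use a much smaller
# range to keep the demonstration short.
print(timeit.timeit('Decimal(10**6) in range(10**6 + 1)',
                    setup='from decimal import Decimal', number=1))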

TL;DR
The object returned by range() is actually a range object. This object is a sequence, so you can iterate over its values one at a time, just as you can with a generator, list, or tuple.
But it also implements the __contains__ hook, which is what gets called on the object on the right-hand side of the in operator. The __contains__() method returns a bool saying whether or not the item on the left-hand side of in is in the object. Since range objects know their bounds and stride, this is very easy to implement in (effectively) constant time.
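You can watch the hook being used (a quick interactive check):

>>> r = range(4, 1000, 2)
>>> 994 in r             # the in operator calls r.__contains__(994)
True
>>> r.__contains__(994)  # same check, invoked directly
True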

Due to the optimization, it is very easy to compare given integers just with the min and max of the range.
The reason range() is so fast in Python 3 is that membership is decided by mathematical reasoning about the bounds, rather than by direct iteration over the range object.
So, to explain the logic:
Check whether the number is between start and stop, and
check that the step value doesn't "step over" our number.
For example, 997 is in range(4, 1000, 3) because:
4 <= 997 < 1000, and (997 - 4) % 3 == 0.
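You can confirm this directly in the interpreter:

>>> 997 in range(4, 1000, 3)
True
>>> (997 - 4) % 3
0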

Try x - 1 in (i for i in range(x)) for large x values, which uses a generator expression to avoid invoking the range.__contains__ optimisation.
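A rough comparison of the two (a minimal sketch; exact numbers vary by machine, and x is kept at 10**8 so the slow case finishes in seconds):

import timeit

x = 10**8

# Hits the optimized range.__contains__: effectively instant.
print(timeit.timeit(f'{x - 1} in range({x})', number=1))

# A generator has no __contains__ shortcut, so `in` consumes
# its values one by one until it finds a match.
print(timeit.timeit(f'{x - 1} in (i for i in range({x}))', number=1))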

TL;DR
A range is an arithmetic sequence, so it can very cheaply calculate whether an object is in it. It could even give you the index of a value, as if it were a list, really quickly.
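For instance, both index() and indexing are computed arithmetically rather than by scanning:

>>> r = range(0, 10**18, 3)
>>> r.index(300)   # (300 - 0) // 3, no iteration
100
>>> r[10**15]      # start + i * step
3000000000000000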

The __contains__ method compares the value directly against the start and end of the range (and checks the step), instead of iterating through it.

Related

Rewriting recursive algorithm to memoized algorithm

I have written the following recursive algorithm:
p = [2, 3, 2, 1, 4]

def fn(c, i):
    if c < 0 or i < 0:
        return 0
    if c == 0:
        return 1
    return fn(c, i - 1) + fn(c - p[i - 1], i - 1)
It's a solution to a problem where you have c coins, and you have to find out how many ways you can spend them on beers. There are n different beers, only one of each beer.
i denotes the i'th beer, with price p[i]; the prices are stored in the array p.
The algorithm recursively calls itself, and if c == 0 it returns 1, as it has found a valid permutation. If c or i is less than 0 it returns 0, as it's not a valid permutation because it exceeds the number of coins available.
Now I need to rewrite the algorithm as a memoized algorithm. This is my first time trying this, so I'm a little confused about how to do it.
I've been trying different things; my latest try is the following code:
import numpy as np

p = [2, 3, 2, 1, 4]
prev = np.empty([5, 5])

def fni(c, i):
    if prev[c][i] != None:
        return prev[c][i]
    if c < 0 or i < 0:
        prev[c][i] = 0
        return 0
    if c == 0:
        prev[c][i] = 1
        return 1
    prev[c][i] = fni(c, i - 1) + fni(c - p[i - 1], i - 1)
    return prev[c][i]
"Surprisingly" it doesn't work, and im sure it's completely wrong. My thought was to save the results of the recursive call in an 2d array of 5x5, and check in the start if the result is already saved in the array, and if it is just return it.
I only provided my above attempt to show something, so don't take the code too seriously.
My prev array is all 0's, and should be values of null so just ignore that.
My task is actually only to solve it as pseudocode, but I thought it would be easier to write it as code to make sure that it would actually work, so pseudo code would help as well.
I hope I have provided enough information, else feel free to ask!
EDIT: I forgot to mention that I have 5 coins, and 5 different beers (one of each beer). So c = 5, and i = 5
First, np.empty() by default gives an array of uninitialized values, not Nones, as the documentation points out:
>>> np.empty([2, 2])
array([[ -9.74499359e+001,   6.69583040e-309],
       [  2.13182611e-314,   3.06959433e-309]])  # uninitialized
Secondly, although this is more subjective, you should default to using dictionaries for memoization in Python. Arrays may be more efficient if you know you'll actually memoize most of the possible values, but it can be hard to tell that ahead of time. At the very least, make sure your array values are initialized. It's good that you're using numpy-- that will help you avoid the common beginner mistake of writing memo = [[0]*5]*5.
Thirdly, you should perform checks for 'out of bounds' or negative parameters (c < 0 or i < 0) before you use them to access an array as in prev[c][i] != None. Negative indices in Python could map you to a different memoized parameter's value.
Besides those details, your memoization code and strategy are sound.
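For reference, a dict-based version could look like this (a minimal sketch, not the only way to structure it; note that the base-case guards run before any table access, which also avoids the negative-index problem):

p = [2, 3, 2, 1, 4]
memo = {}

def fn(c, i):
    # Base cases are checked before touching the memo table,
    # so negative parameters never index into it.
    if c < 0 or i < 0:
        return 0
    if c == 0:
        return 1
    if (c, i) not in memo:
        memo[(c, i)] = fn(c, i - 1) + fn(c - p[i - 1], i - 1)
    return memo[(c, i)]

print(fn(5, 5))  # number of ways to spend 5 coins on the 5 beers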

How do I find all 32 bit binary numbers that have exactly six 1s and the rest 0s

I could do this in brute force, but I was hoping there was clever coding, or perhaps an existing function, or something I am not realising...
So some examples of numbers I want:
00000000001111110000
11111100000000000000
01010101010100000000
10101010101000000000
00100100100100100100
The full permutation. Except with results that have ONLY six 1's. Not more. Not less. 64 or 32 bits would be ideal. 16 bits if that provides an answer.
I think what you need here is the itertools module.
BAD SOLUTION
But you need to be careful; for instance, using something like permutations would only work for very small inputs. E.g.:
Something like the below would give you the binary representations:
>>> ["".join(v) for v in set(itertools.permutations(["1"]*2+["0"]*3))]
['11000', '01001', '00101', '00011', '10010', '01100', '01010', '10001', '00110', '10100']
then just getting the decimal representation of those numbers (note the base is 2, since these are binary strings):
>>> [int("".join(v), 2) for v in set(itertools.permutations(["1"]*2+["0"]*3))]
[24, 9, 5, 3, 18, 12, 10, 17, 6, 20]
if you wanted 32 bits with 6 ones and 26 zeroes, you'd use:
>>> [int("".join(v), 2) for v in set(itertools.permutations(["1"]*6+["0"]*26))]
but this computation would take a supercomputer to deal with (32! = 263130836933693530167218012160000000 )
DECENT SOLUTION
So a more clever way to do it is using combinations, maybe something like this:
import itertools

num_bits = 32
num_ones = 6

lst = [
    f"{sum([2**vv for vv in v]):b}".zfill(num_bits)
    for v in list(itertools.combinations(range(num_bits), num_ones))
]
print(len(lst))
this tells us there are 906192 numbers with 6 ones in the whole spectrum of 32-bit numbers.
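As a cross-check, that count is just the binomial coefficient C(32, 6), which the standard library can confirm (math.comb requires Python 3.8+):

import math
print(math.comb(32, 6))  # 906192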
CREDITS:
Credits for this answer go to @Mark Dickinson, who pointed out that using permutations was infeasible and suggested the use of combinations.
Well, I am not a Python coder so I cannot post valid Python code for you. Instead I can do a C++ one...
If you look at your problem, you set 6 bits and many zeros ... so I would approach this with 6 nested for loops computing all the possible positions of the 1s and setting the bits...
Something like:
for (i0=   0;i0<32-5;i0++)
 for (i1=i0+1;i1<32-4;i1++)
  for (i2=i1+1;i2<32-3;i2++)
   for (i3=i2+1;i3<32-2;i3++)
    for (i4=i3+1;i4<32-1;i4++)
     for (i5=i4+1;i5<32-0;i5++)
      {
      // here i0,...,i5 mark the set bit positions
      }
So the O(2^32) brute force becomes roughly O(32*31*30*29*28*27/6!) (that is, one iteration per combination of bit positions, which matches the 906192 count below), and you cannot go faster than that, as going faster would mean missing valid solutions...
I assume you want to print the number so for speed up you can compute the number as a binary number string from the start to avoid slow conversion between string and number...
The nested for loops can be encoded as increment operation of an array (similar to bignum arithmetics)
When I put all together I got this C++ code:
int generate()
{
    const int n1 = 6;   // number of set bits
    const int n = 32;   // number of bits
    char x[n+2];        // output number string
    int i[n1], j, cnt;  // nested for loop iterator variables and found solutions count

    for (j=0;j<n;j++) x[j]='0'; x[j]='b'; j++; x[j]=0;  // x = 0
    for (j=0;j<n1;j++){ i[j]=j; x[i[j]]='1'; }          // first solution

    for (cnt=0;;)
    {
        // Form1->mm_log->Lines->Add(x);  // here x is the valid answer to print
        cnt++;
        for (j=n1-1;j>=0;j--)  // this emulates n1 nested for loops
        {
            x[i[j]]='0'; i[j]++;
            if (i[j]<n-n1+j+1){ x[i[j]]='1'; break; }
        }
        if (j<0) break;
        for (j++;j<n1;j++){ i[j]=i[j-1]+1; x[i[j]]='1'; }
    }
    return cnt;  // found valid answers
}
When I use this with n1=6,n=32 I got this output (without printing the numbers):
cnt = 906192
and it was finished in 4.246 ms on AMD A8-5500 3.2GHz (win7 x64 32bit app no threads) which is fast enough for me...
Beware that once you start outputting the numbers somewhere, the speed will drop drastically. Especially if you output to a console or whatever ... it might be better to buffer the output somehow, like outputting 1024 string numbers at once etc. But as I mentioned before, I am no Python coder, so it might already be handled by the environment...
On top of all this, once you play with the variables n1, n you can use a faster approach by doing the same for zeros instead of ones (if there are fewer zeros than ones, use the nested loops to mark zeros instead of ones).
If the solutions are wanted as numbers (not strings), it is possible to rewrite this so the i[] (or i0,...,i5) holds bitmasks instead of bit positions ... instead of incrementing/decrementing you just shift left/right ... and there is no need for the x array anymore, as the number would be x = i0|...|i5 ...
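Along the same bitmask lines, here is a Python sketch of Gosper's hack (my addition, not part of the answer above): it computes the next integer with the same popcount using only arithmetic and shifts, so the 6-set-bit values come out in increasing numeric order:

def same_popcount(bits=32, ones=6):
    # Smallest value with `ones` set bits: 0b111111
    v = (1 << ones) - 1
    limit = 1 << bits
    while v < limit:
        yield v
        c = v & -v                      # lowest set bit
        r = v + c                       # ripple the carry upward
        v = (((r ^ v) >> 2) // c) | r   # Gosper's hack: next same-popcount value

print(sum(1 for _ in same_popcount(32, 6)))  # 906192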
You could create a counter array for positions of 1s in the number and assemble it by shifting the bits in their respective positions. I created an example below. It runs pretty fast (less than a second for 32 bits on my laptop):
bitCount = 32
oneCount = 6
maxBit = 1 << (bitCount - 1)
ones = [1 << b for b in reversed(range(oneCount))]  # start with bits on low end
ones[0] >>= 1  # shift back 1st one because it will be incremented at start of loop
index = 0
result = []
while index < len(ones):
    ones[index] <<= 1  # shift one at current position
    if index == 0:
        number = sum(ones)  # build output number
        result.append(number)
    if ones[index] == maxBit:
        index += 1  # go to next position when bit reaches max
    elif index > 0:
        index -= 1  # return to previous position
        ones[index] = ones[index + 1]  # and prepare it to move up (relative to next)
64 bits takes about a minute, roughly proportional to the number of values that are output. O(n)
The same approach can be expressed more concisely in a recursive generator function which will allow more efficient use of the bit patterns:
def genOneBits(bitcount=32, onecount=6):
    for bitPos in range(onecount - 1, bitcount):
        value = 1 << bitPos
        if onecount == 1: yield value; continue
        for otherBits in genOneBits(bitPos, onecount - 1):
            yield value + otherBits

result = [n for n in genOneBits(32, 6)]
This is not faster when you get all the numbers but it allows partial access to the list without going through all values.
If you need direct access to the Nth bit pattern (e.g. to get a random one-bits pattern), you can use the following function. It works like indexing a list but without having to generate the list of patterns.
def numOneBits(bitcount=32, onecount=6):
    def factorial(X): return 1 if X < 2 else X * factorial(X - 1)
    return factorial(bitcount) // factorial(onecount) // factorial(bitcount - onecount)

def nthOneBits(N, bitcount=32, onecount=6):
    if onecount == 1: return 1 << N
    bitPos = 0
    while bitPos <= bitcount - onecount:
        group = numOneBits(bitcount - bitPos - 1, onecount - 1)
        if N < group: break
        N -= group
        bitPos += 1
    if bitPos > bitcount - onecount: return None
    result = 1 << bitPos
    result |= nthOneBits(N, bitcount - bitPos - 1, onecount - 1) << (bitPos + 1)
    return result
# bit pattern at position 1000:
nthOneBits(1000)  # --> 10485799 (00000000101000000000000000100111)
This allows you to get the bit patterns on very large integers that would be impossible to generate completely:
nthOneBits(10000, bitcount=256, onecount=9)
# 77371252457588066994880639
# 100000000000000000000000000000000001000000000000000000000000000000000000000000001111111
It is worth noting that the pattern order does not follow the numerical order of the corresponding numbers
Although nthOneBits() can produce any pattern instantly, it is much slower than the other functions when mass producing patterns. If you need to manipulate them sequentially, you should go for the generator function instead of looping on nthOneBits().
Also, it should be fairly easy to tweak the generator to have it start at a specific pattern so you could get the best of both approaches.
Finally, it may be useful to obtain then next bit pattern given a known pattern. This is what the following function does:
def nextOneBits(N=0, bitcount=32, onecount=6):
    if N == 0: return (1 << onecount) - 1
    bitPositions = []
    for pos in range(bitcount):
        bit = N % 2
        N //= 2
        if bit == 1: bitPositions.insert(0, pos)
    index = 0
    result = None
    while index < onecount:
        bitPositions[index] += 1
        if bitPositions[index] == bitcount:
            index += 1
            continue
        if index == 0:
            result = sum(1 << bp for bp in bitPositions)
            break
        if index > 0:
            index -= 1
            bitPositions[index] = bitPositions[index + 1]
    return result
nthOneBits(12) #--> 131103 00000000000000100000000000011111
nextOneBits(131103) #--> 262175 00000000000001000000000000011111 5.7ns
nthOneBits(13) #--> 262175 00000000000001000000000000011111 49.2ns
Like nthOneBits(), this one does not need any setup time. It could be used in combination with nthOneBits() to get subsequent patterns after getting an initial one at a given position. nextOneBits() is much faster than nthOneBits(i+1) but is still slower than the generator function.
For very large integers, using nthOneBits() and nextOneBits() may be the only practical options.
You are dealing with permutations of multisets. There are many ways to achieve this and, as @BPL points out, doing this efficiently is non-trivial. There are many great methods mentioned here: permutations with unique values. The cleanest (not sure if it's the most efficient) is to use multiset_permutations from the sympy module.
import time
from sympy.utilities.iterables import multiset_permutations

t = time.process_time()

# Credit to @BPL for the general setup
multiPerms = ["".join(v) for v in multiset_permutations(["1"]*6 + ["0"]*26)]

elapsed_time = time.process_time() - t
print(elapsed_time)
On my machine, the above computes in just over 8 seconds. It generates just under a million results as well:
len(multiPerms)
906192

How to stop short circuiting in Python?

Python short-circuits the logical operators. For example:
if False and Condition2:
    # Condition2 won't even be checked, because the first condition is already False
Is there a way to stop this behavior? I want it to check both conditions and then perform the and operation (as done in C, C++, etc.). It's useful when we are performing some operation along with the condition, e.g.:
if a < p.pop() and b < p.pop():
One way can be checking the conditions before and then comparing the Boolean values. But that would be a waste of memory.
if all([a < p.pop(), b < p.pop()]):
This creates a list, which will be evaluated in its entirety, and then uses all to confirm that both values are truthy. But this is somewhat obscure and I'd rather suggest you write plain, easy to understand code:
a_within_limit = a < p.pop()
b_within_limit = b < p.pop()
if a_within_limit and b_within_limit:
If the conditions are booleans, as they are in your example, you could use & instead:
>>> a, b, p = 1, 1, [0, 0]
>>> (a < p.pop()) & (b < p.pop())
False
>>> p
[]
You can use the all() and any() built-in functions to somewhat emulate the and and or operators. Both take an iterable of boolean-like values as a parameter. If you give them a literal tuple or list, all members will be fully evaluated:
# all emulates the and operator
if all((False, Condition2)):
    do_stuff()

# any emulates the or operator
if any((False, Condition2)):
    do_stuff()
Short answer: No, you cannot stop it from doing this.
For example:
av = p.pop()
bv = p.pop()
if a < av and b < bv:
    pass
Or:
av, bv = p.pop(), p.pop()
if a < av and b < bv:
    pass
Also, there is no waste of memory in these examples. In Python, almost everything is done by reference. The value object being popped already exists somewhere. Even scalars like strings, ints, etc. are objects (some of them are slightly optimized). The only memory changes here are (1) the creation of a new name that refers to the same existing object, and (2) the removal of the record in the list at the same time (which referred to that object before popping). They are of similar scale.

"in" statement for lists/tuples of floats

Should the use of in or not in be avoided when dealing with lists/tuples of floats? Is its implementation something like the code below or is it something more sophisticated?
check = False
for item in list_to_search_the_value_in:
    if value_to_search_for == item:
        check = True
in and not in should be your preferred way of membership testing. Both operators can make use (via __contains__()) of any optimized membership test that the container offers.
Your problem is with the float part, because in makes an equality comparison with == (optimized to check for identity, first).
In general, for floating point comparing for equality does not yield the desired results. Hence for lists of floats, you want something like
def is_in_float(item, sequence, eps=None):
    eps = eps or 2**-52
    return any((abs(item - seq_item) < eps) for seq_item in sequence)
Combine with sorting and binary search to find the closest matching float at your convenience.
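For example, with a pre-sorted list you could bisect to the insertion point and only test the two neighbours (a minimal sketch; is_in_sorted_floats is a hypothetical helper, and eps must suit your data's scale):

import bisect

def is_in_sorted_floats(item, sorted_seq, eps=2**-52):
    # Find where `item` would be inserted; the closest existing
    # values can only be at index i-1 or i.
    i = bisect.bisect_left(sorted_seq, item)
    for j in (i - 1, i):
        if 0 <= j < len(sorted_seq) and abs(item - sorted_seq[j]) < eps:
            return True
    return False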
Here's the part of the documentation saying that in checks for equality on sequence types. So no, this should not be used for sequences of floats.
The in operator uses regular equality checks behind the scenes, so it has the same limitations as __eq__() when it comes to floats. Use with caution if at all.
>>> 0.3 == 0.4 - 0.1
False
>>> 0.3 in [0.4 - 0.1]
False
Since the in operator uses an equality check, it will frequently fail, because floating point math is "broken" (well, it's not, but you get the point).
You may easily achieve similar functionality by using any:
epsilon = 1e-9
check = any(abs(f - value_to_search_for) < epsilon for f in seq)
# or
check = False
if any(abs(f - value_to_search_for) < epsilon for f in seq):
    check = True
Python's list type has its __contains__ method implemented in C:
static int
list_contains(PyListObject *a, PyObject *el)
{
    Py_ssize_t i;
    int cmp;

    for (i = 0, cmp = 0 ; cmp == 0 && i < Py_SIZE(a); ++i)
        cmp = PyObject_RichCompareBool(el, PyList_GET_ITEM(a, i),
                                       Py_EQ);
    return cmp;
}
A literal translation to Python might be:
def list_contains(a, el):
    cmp = False
    for i in range(len(a)):
        if cmp: break
        cmp = a[i] == el
    return cmp
Your example is a more idiomatic translation.
In any case, as the other answers have noted, it uses equality to test the list items against the element you're checking for membership. With float values, that can be perilous, as numbers we'd expect to be equal may not be due to floating point rounding.
A more float-safe way of implementing the check yourself might be:
any(abs(x - el) < epsilon for x in a)
where epsilon is some small value. How small it needs to be will depend on the size of the numbers you're dealing with and how precise you care to be. If you can estimate the amount of numeric error that might differentiate el from an equivalent value in the list, you can set epsilon to one order of magnitude larger and be confident that you'll not give a false negative (and probably only give false positives in cases that are impossible to get right).
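Note that the standard library's math.isclose (Python 3.5+) handles the tolerance choice more robustly, combining relative and absolute bounds; a float_in helper (my naming, for illustration) might look like:

import math

def float_in(el, seq, rel_tol=1e-9, abs_tol=0.0):
    # math.isclose applies both relative and absolute tolerances
    return any(math.isclose(el, x, rel_tol=rel_tol, abs_tol=abs_tol) for x in seq)

print(float_in(0.3, [0.4 - 0.1]))  # True, unlike: 0.3 in [0.4 - 0.1]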

Should you always favor xrange() over range()?

Why or why not?
For performance, especially when you're iterating over a large range, xrange() is usually better. However, there are still a few reasons why you might prefer range():
In Python 3, range() does what xrange() used to do and xrange() does not exist. If you want to write code that will run on both Python 2 and Python 3, you can't use xrange().
range() can actually be faster in some cases, e.g. when iterating over the same sequence multiple times. xrange() has to reconstruct the integer object every time, but range() will have real integer objects. (It will always perform worse in terms of memory, however.)
xrange() isn't usable in all cases where a real list is needed. For instance, it doesn't support slices or any list methods.
[Edit] There are a couple of posts mentioning how range() will be upgraded by the 2to3 tool. For the record, here's the output of running the tool on some sample usages of range() and xrange()
RefactoringTool: Skipping implicit fixer: buffer
RefactoringTool: Skipping implicit fixer: idioms
RefactoringTool: Skipping implicit fixer: ws_comma
--- range_test.py (original)
+++ range_test.py (refactored)
@@ -1,7 +1,7 @@
 for x in range(20):
-    a=range(20)
+    a=list(range(20))
     b=list(range(20))
     c=[x for x in range(20)]
     d=(x for x in range(20))
-    e=xrange(20)
+    e=range(20)
As you can see, when used in a for loop or comprehension, or where already wrapped with list(), range is left unchanged.
No, they both have their uses:
Use xrange() when iterating, as it saves memory. Say:
for x in xrange(1, one_zillion):
rather than:
for x in range(1, one_zillion):
On the other hand, use range() if you actually want a list of numbers.
multiples_of_seven = range(7,100,7)
print "Multiples of seven < 100: ", multiples_of_seven
You should favour range() over xrange() only when you need an actual list. For instance, when you want to modify the list returned by range(), or when you wish to slice it. For iteration or even just normal indexing, xrange() will work fine (and usually much more efficiently). There is a point where range() is a bit faster than xrange() for very small lists, but depending on your hardware and various other details, the break-even can be at a result of length 1 or 2; not something to worry about. Prefer xrange().
One other difference is that the Python 2 implementation of xrange() can't support numbers bigger than C ints, so if you want to have a range using Python's built-in large number support, you have to use range().
Python 2.7.3 (default, Jul 13 2012, 22:29:01)
[GCC 4.7.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> range(123456787676676767676676,123456787676676767676679)
[123456787676676767676676L, 123456787676676767676677L, 123456787676676767676678L]
>>> xrange(123456787676676767676676,123456787676676767676679)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C long
Python 3 does not have this problem:
Python 3.2.3 (default, Jul 14 2012, 01:01:48)
[GCC 4.7.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> range(123456787676676767676676,123456787676676767676679)
range(123456787676676767676676, 123456787676676767676679)
xrange() is more efficient because instead of generating a list of objects, it just generates one object at a time. Instead of 100 integers, and all of their overhead, and the list to put them in, you just have one integer at a time. Faster generation, better memory use, more efficient code.
Unless I specifically need a list for something, I always favor xrange()
range() returns a list, xrange() returns an xrange object.
xrange() is a bit faster, and a bit more memory efficient. But the gain is not very large.
The extra memory used by a list is of course not just wasted; lists have more functionality (slice, repeat, insert, ...). Exact differences can be found in the documentation. There is no hard and fast rule; use what is needed.
Python 3.0 is still in development, but IIRC range() will be very similar to xrange() in 2.x, and list(range(...)) can be used to generate lists.
I would just like to say that it REALLY isn't that difficult to get an xrange object with slice and indexing functionality. I have written some code that works pretty dang well and is just as fast as xrange for when it counts (iterations).
from __future__ import division

def read_xrange(xrange_object):
    # returns the xrange object's start, stop, and step
    start = xrange_object[0]
    if len(xrange_object) > 1:
        step = xrange_object[1] - xrange_object[0]
    else:
        step = 1
    stop = xrange_object[-1] + step
    return start, stop, step

class Xrange(object):
    ''' creates an xrange-like object that supports slicing and indexing.
    ex: a = Xrange(20)
        a.index(10)
        will work

    Also a[:5]
    will return another Xrange object with the specified attributes

    Also allows for the conversion from an existing xrange object
    '''
    def __init__(self, *inputs):
        # allow inputs of xrange objects
        if len(inputs) == 1:
            test, = inputs
            if type(test) == xrange:
                self.xrange = test
                self.start, self.stop, self.step = read_xrange(test)
                return

        # or create one from start, stop, step
        self.start, self.step = 0, None
        if len(inputs) == 1:
            self.stop, = inputs
        elif len(inputs) == 2:
            self.start, self.stop = inputs
        elif len(inputs) == 3:
            self.start, self.stop, self.step = inputs
        else:
            raise ValueError(inputs)
        if self.step is None:
            self.step = 1  # xrange requires an integer step

        self.xrange = xrange(self.start, self.stop, self.step)

    def __iter__(self):
        return iter(self.xrange)

    def __getitem__(self, item):
        if type(item) is int:
            if item < 0:
                item += len(self)
            return self.xrange[item]

        if type(item) is slice:
            # get the indexes, and then convert to the number
            start, stop, step = item.start, item.stop, item.step
            start = start if start != None else 0  # convert start = None to start = 0
            if start < 0:
                start += start
            start = self[start]
            if start < 0: raise IndexError(item)
            step = (self.step if self.step != None else 1) * (step if step != None else 1)
            stop = stop if stop is not None else self.xrange[-1]
            if stop < 0:
                stop += stop
            stop = self[stop]
            stop = stop
            if stop > self.stop:
                raise IndexError
            if start < self.start:
                raise IndexError
            return Xrange(start, stop, step)

    def index(self, value):
        error = ValueError('object.index({0}): {0} not in object'.format(value))
        index = (value - self.start)/self.step
        if index % 1 != 0:
            raise error
        index = int(index)
        try:
            self.xrange[index]
        except (IndexError, TypeError):
            raise error
        return index

    def __len__(self):
        return len(self.xrange)
Honestly, I think the whole issue is kind of silly and xrange should do all of this anyway...
Go with range for these reasons:
1) xrange will be going away in newer Python versions. This gives you easy future compatibility.
2) range will take on the efficiencies associated with xrange.
A good example given in book: Practical Python By Magnus Lie Hetland
>>> zip(range(5), xrange(100000000))
[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
I wouldn’t recommend using range instead of xrange in the preceding example—although
only the first five numbers are needed, range calculates all the numbers, and that may take a lot
of time. With xrange, this isn’t a problem because it calculates only those numbers needed.
Yes, I read @Brian's answer: in Python 3, range() is a generator anyway and xrange() does not exist.
While xrange is faster than range in most circumstances, the difference in performance is pretty minimal. The little program below compares iterating over a range and an xrange:
import timeit

# Try various list sizes.
for list_len in [1, 10, 100, 1000, 10000, 100000, 1000000]:
    # Time doing a range and an xrange.
    rtime = timeit.timeit('a=0;\nfor n in range(%d): a += n' % list_len, number=1000)
    xrtime = timeit.timeit('a=0;\nfor n in xrange(%d): a += n' % list_len, number=1000)
    # Print the result
    print "Loop list of len %d: range=%.4f, xrange=%.4f" % (list_len, rtime, xrtime)
The results below shows that xrange is indeed faster, but not enough to sweat over.
Loop list of len 1: range=0.0003, xrange=0.0003
Loop list of len 10: range=0.0013, xrange=0.0011
Loop list of len 100: range=0.0068, xrange=0.0034
Loop list of len 1000: range=0.0609, xrange=0.0438
Loop list of len 10000: range=0.5527, xrange=0.5266
Loop list of len 100000: range=10.1666, xrange=7.8481
Loop list of len 1000000: range=168.3425, xrange=155.8719
So by all means use xrange, but unless you're on constrained hardware, don't worry too much about it.
Okay, everyone here has a different opinion as to the tradeoffs and advantages of xrange versus range. They're mostly correct: xrange is an iterator, and range fleshes out and creates an actual list. For the majority of cases, you won't really notice a difference between the two. (You can use map with range but not with xrange, but it uses up more memory.)
What I think you really want to hear, however, is that the preferred choice is xrange. Since range in Python 3 is an iterator, the code conversion tool 2to3 will correctly convert all uses of xrange to range, and will throw an error or warning for uses of range. If you want to be sure to easily convert your code in the future, you'll use xrange only, and list(xrange(...)) when you're sure that you want a list. I learned this during the CPython sprint at PyCon this year (2008) in Chicago.
range(): range(1, 10) returns a list of the numbers from 1 to 9 and holds the whole list in memory.
xrange(): Like range(), but instead of returning a list, returns an object that generates the numbers in the range on demand. For looping, this is slightly faster than range() and more memory efficient. An xrange() object is like an iterator and generates the numbers on demand (lazy evaluation).
In [1]: range(1, 10)
Out[1]: [1, 2, 3, 4, 5, 6, 7, 8, 9]

In [2]: xrange(10)
Out[2]: xrange(10)

In [3]: print xrange.__doc__
xrange([start,] stop[, step]) -> xrange object
In Python 3, range() does what xrange() used to do, and xrange() does not exist there.
range() can actually be faster in some scenarios, e.g. if you iterate over the same sequence multiple times: xrange() has to reconstruct the integer object every time, but range() will have real integer objects.
