Worst-case time complexity of Python's int.bit_length()

When we call int.bit_length() on an integer n,
is the worst-case time complexity O(log n), or does Python use some trick to improve it (e.g. storing the position of the most significant bit of n when it is created)?

In CPython, for values with fewer internal-representation digits than PY_SSIZE_T_MAX/PyLong_SHIFT – i.e. fewer than PY_SSIZE_T_MAX binary digits – it’s calculated from the number of internal digits, yes:
msd = ((PyLongObject *)self)->ob_digit[ndigits-1];
msd_bits = bits_in_digit(msd);
if (ndigits <= PY_SSIZE_T_MAX/PyLong_SHIFT)
    return PyLong_FromSsize_t((ndigits-1)*PyLong_SHIFT + msd_bits);
Otherwise, it computes the same (ndigits-1)*PyLong_SHIFT + msd_bits expression with Python bigints instead. Since ndigits is itself on the order of log N, arithmetic on it takes O(log log N) overall (which isn’t exactly true either in this strange mix of practice and theory, so…).
/* expression above may overflow; use Python integers instead */
result = (PyLongObject *)PyLong_FromSsize_t(ndigits - 1);
if (result == NULL)
    return NULL;
x = (PyLongObject *)PyLong_FromLong(PyLong_SHIFT);
if (x == NULL)
    goto error;
y = (PyLongObject *)long_mul(result, x);
Py_DECREF(x);
if (y == NULL)
    goto error;
Py_DECREF(result);
result = y;
x = (PyLongObject *)PyLong_FromLong((long)msd_bits);
if (x == NULL)
    goto error;
y = (PyLongObject *)long_add(result, x);
Py_DECREF(x);
if (y == NULL)
    goto error;
Py_DECREF(result);
result = y;
return (PyObject *)result;
tl;dr: it’s O(1)
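As a quick empirical check, here is a minimal timing sketch (assuming CPython; the sizes and call counts are arbitrary). If bit_length() is O(1), the timings should stay roughly flat as the integer grows:

import timeit

# Assumption: CPython. bit_length() reads the stored digit count, so the
# timings below should be roughly constant regardless of magnitude.
for bits in (10, 1_000, 1_000_000):
    n = 1 << bits
    t = timeit.timeit(n.bit_length, number=100_000)
    print(f"{bits:>9}-bit int: {t:.3f}s for 100k calls")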

Related

Python Runtime of collections.Counter Equality

I am wondering what the big-O runtime complexity is for comparing two collections.Counter objects. Here is some code to demonstrate what I mean:
import collections
counter_1 = collections.Counter("abcabcabcabcabcabcdefg")
counter_2 = collections.Counter("xyzxyzxyzabc")
comp = counter_1 == counter_2 # What is the runtime of this comparison statement?
Is the runtime of the equality comparison in the final statement O(1)? Or is it O(num_of_unique_keys_in_largest_counter)? Or is it something else?
For reference, here is the source code for collections.Counter: https://github.com/python/cpython/blob/0250de48199552cdaed5a4fe44b3f9cdb5325363/Lib/collections/__init__.py#L497
I do not see the class implementing an __eq__() method.
Bonus points: if the answer to this question changes between Python 2 and Python 3, I would love to hear the difference.
Counter is a subclass of dict, therefore the big-O analysis is that of dict, with the caveat that Counter objects are specialized to hold only int values (i.e., they cannot hold collections of values as dicts can); this simplifies the analysis.
Looking at the C code implementation of the equality comparison:
There is an early exit if the number of keys differs (this does not influence big-O).
Then a loop iterates over all the keys, exiting early if a key is not found or if the corresponding value differs (again, this has no bearing on big-O).
If all keys are found and the corresponding values are all equal, the dictionaries are declared equal. The lookup and comparison of each key-value pair is O(1) on average; this operation is repeated at most n times (n being the number of keys).
In all, the time complexity is O(n), with n the number of keys.
This applies to both python 2 and 3.
from dictobject.c
/* Return 1 if dicts equal, 0 if not, -1 if error.
* Gets out as soon as any difference is detected.
* Uses only Py_EQ comparison.
*/
static int
dict_equal(PyDictObject *a, PyDictObject *b)
{
    Py_ssize_t i;

    if (a->ma_used != b->ma_used)
        /* can't be equal if # of entries differ */
        return 0;
    /* Same # of entries -- check all of 'em.  Exit early on any diff. */
    for (i = 0; i < a->ma_keys->dk_nentries; i++) {
        PyDictKeyEntry *ep = &DK_ENTRIES(a->ma_keys)[i];
        PyObject *aval;
        if (a->ma_values)
            aval = a->ma_values[i];
        else
            aval = ep->me_value;
        if (aval != NULL) {
            int cmp;
            PyObject *bval;
            PyObject *key = ep->me_key;
            /* temporarily bump aval's refcount to ensure it stays
               alive until we're done with it */
            Py_INCREF(aval);
            /* ditto for key */
            Py_INCREF(key);
            /* reuse the known hash value */
            b->ma_keys->dk_lookup(b, key, ep->me_hash, &bval);
            if (bval == NULL) {
                Py_DECREF(key);
                Py_DECREF(aval);
                if (PyErr_Occurred())
                    return -1;
                return 0;
            }
            cmp = PyObject_RichCompareBool(aval, bval, Py_EQ);
            Py_DECREF(key);
            Py_DECREF(aval);
            if (cmp <= 0)  /* error or not equal */
                return cmp;
        }
    }
    return 1;
}
Internally, collections.Counter stores the counts in a dictionary (that's why it subclasses dict), so the same rules apply as when comparing dictionaries: each key and its value must be checked against the other dictionary to establish equality. For CPython, that is implemented in dict_equal(); other implementations may vary, but logically you have to do this key-by-key comparison to ensure equality.
This also means that the complexity is O(n) at its worst (it loops through one of the dictionaries and checks whether each value is the same in the other). There are no significant changes between Python 2.x and Python 3.x in this regard.
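For intuition, here is a rough pure-Python transliteration of the dict_equal() loop above. This is a sketch of the logic only, not the actual implementation (the helper name and sentinel are made up):

_MISSING = object()  # sentinel: distinguishes an absent key from a None value

def dicts_equal(a, b):
    # Early exit when the sizes differ (does not affect big-O).
    if len(a) != len(b):
        return False
    # O(n) loop: each lookup and value comparison is O(1) on average.
    for key, a_val in a.items():
        b_val = b.get(key, _MISSING)
        if b_val is _MISSING or a_val != b_val:
            return False
    return True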

nan float identity compares False, but nans in tuples compare True

Can someone explain how the following is possible? I tried it in Python 2 and 3, and got the same result. Shouldn't the nans always compare not equal? Or, if it's comparing pointers, shouldn't the pointers always compare equal? What's going on?
>>> n = float('nan')
>>> n == n
False
>>> (n,) == (n,)
True
For n == n, Python uses float's comparison method, which follows the IEEE 754 rule that NaN compares unequal to everything, including itself.
For (n,) == (n,), it calls tuple's comparison method:
/* Search for the first index where items are different.
* Note that because tuples are immutable, it's safe to reuse
* vlen and wlen across the comparison calls.
*/
for (i = 0; i < vlen && i < wlen; i++) {
    int k = PyObject_RichCompareBool(vt->ob_item[i],
                                     wt->ob_item[i], Py_EQ);
    if (k < 0)
        return NULL;
    if (!k)
        break;
}
For each pair of items, that loop calls PyObject_RichCompareBool, which starts with an identity check and returns true immediately if the two objects are the same object:
/* Quick result when objects are the same.
   Guarantees that identity implies equality. */
if (v == w) {
    if (op == Py_EQ)
        return 1;
    else if (op == Py_NE)
        return 0;
}
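Putting the two code paths together, here is a short demonstration of the identity shortcut (output as observed in CPython):

n = float("nan")
print(n == n)                   # False: float comparison, NaN != NaN
print((n,) == (n,))             # True: identity check short-circuits
print((n,) == (float("nan"),))  # False: two distinct NaN objects
print(n in [n])                 # True: `in` uses the same identity shortcut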

Long numbers hashing complexity in Python

How does Python hash long numbers? I guess it takes O(1) time for 32-bit ints, but the way long integers work in Python makes me think the complexity is not O(1) for them. I've looked for answers in relevant questions, but have found none straightforward enough to make me confident. Thank you in advance.
The long_hash() function indeed loops and depends on the size of the integer, yes:
/* The following loop produces a C unsigned long x such that x is
congruent to the absolute value of v modulo ULONG_MAX. The
resulting x is nonzero if and only if v is. */
while (--i >= 0) {
    /* Force a native long #-bits (32 or 64) circular shift */
    x = (x >> (8*SIZEOF_LONG-PyLong_SHIFT)) | (x << PyLong_SHIFT);
    x += v->ob_digit[i];
    /* If the addition above overflowed we compensate by
       incrementing.  This preserves the value modulo
       ULONG_MAX. */
    if (x < v->ob_digit[i])
        x++;
}
where i is the 'object size', i.e. the number of internal digits used to represent the number (the size of a digit depends on your platform). Since the number of digits grows with log(n), hashing a long integer is O(log n) in the magnitude of the value.
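A minimal timing sketch to check this empirically (assuming CPython; note that int hashes are recomputed on every call, unlike str hashes, which are cached):

import timeit

# Hash time should grow roughly linearly with the number of internal digits.
for bits in (1_000, 100_000, 10_000_000):
    n = (1 << bits) - 1
    t = timeit.timeit(lambda: hash(n), number=1_000)
    print(f"{bits:>10}-bit int: {t:.4f}s for 1000 hashes")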

What is the difference between this C++ code and this Python code?

Answer
Thanks to @TheDark for spotting the overflow. The new C++ solution is pretty freakin' funny, too. It's extremely redundant:
if (2*i > n && 2*i > i)
replaced the old line of code if (2*i > n).
Background
I'm doing this problem on HackerRank, though the problem may not be entirely related to this question. If you cannot see the webpage, or have to make an account and don't want to, the problem is listed in plain text below.
Question
My C++ code is timing out, but my Python code is not. I first suspected this was due to overflow, but I used sizeof to be sure that unsigned long long can reach 2^64 - 1, the upper limit of the problem.
I practically translated my C++ code directly into Python to see if it was my algorithms causing the timeouts, but to my surprise my Python code passed every test case.
C++ code:
#include <iostream>

bool pot(unsigned long long n)
{
    if (n % 2 == 0) return pot(n/2);
    return (n == 1); // returns true if n is a power of two
}

unsigned long long gpt(unsigned long long n)
{
    unsigned long long i = 1;
    while (2*i < n) {
        i *= 2;
    }
    return i; // returns greatest power of two less than n
}

int main()
{
    unsigned int t;
    std::cin >> t;
    std::cout << sizeof(unsigned long long) << std::endl;
    for (unsigned int i = 0; i < t; i++)
    {
        unsigned long long n;
        unsigned long long count = 1;
        std::cin >> n;
        while (n > 1) {
            if (pot(n)) n /= 2;
            else n -= gpt(n);
            count++;
        }
        if (count % 2 == 0) std::cout << "Louise" << std::endl;
        else std::cout << "Richard" << std::endl;
    }
}
Python 2.7 code:
def pot(n):
    while n % 2 == 0:
        n /= 2
    return n == 1

def gpt(n):
    i = 1
    while 2*i < n:
        i *= 2
    return i

t = int(raw_input())
for i in range(t):
    n = int(raw_input())
    count = 1
    while n != 1:
        if pot(n):
            n /= 2
        else:
            n -= gpt(n)
        count += 1
    if count % 2 == 0:
        print "Louise"
    else:
        print "Richard"
To me, both versions look identical. I still think I'm somehow being fooled and am actually getting overflow, causing timeouts, in my C++ code.
Problem
Louise and Richard play a game. They have a counter set to N. Louise gets the first turn and the turns alternate thereafter. In the game, they perform the following operations:
If N is not a power of 2, they reduce the counter by the largest power of 2 less than N.
If N is a power of 2, they reduce the counter by half of N.
The resultant value is the new N which is again used for subsequent operations.
The game ends when the counter reduces to 1, i.e., N == 1, and the last person to make a valid move wins.
Given N, your task is to find the winner of the game.
Input Format
The first line contains an integer T, the number of testcases.
T lines follow. Each line contains N, the initial number set in the counter.
Constraints
1 ≤ T ≤ 10
1 ≤ N ≤ 2^64 - 1
Output Format
For each test case, print the winner's name in a new line. So if Louise wins the game, print "Louise". Otherwise, print "Richard". (Quotes are for clarity)
Sample Input
1
6
Sample Output
Richard
Explanation
As 6 is not a power of 2, Louise reduces the largest power of 2 less than 6 i.e., 4, and hence the counter reduces to 2.
As 2 is a power of 2, Richard reduces the counter by half of 2 i.e., 1. Hence the counter reduces to 1.
As we reach the terminating condition with N == 1, Richard wins the game.
When n is greater than 2^63, your gpt function will eventually have i equal to 2^63; multiplying that by 2 overflows and yields 0. The loop then never terminates, multiplying 0 by 2 each time.
Try this bit-twiddling hack, which is probably slightly faster:
unsigned long largest_power_of_two_not_greater_than(unsigned long x) {
    for (unsigned long y; (y = x & (x - 1)); x = y) {}
    return x;
}
x & (x-1) is x without its least significant one-bit, so y becomes zero (terminating the loop) exactly when x has been reduced to a power of two, which is the largest power of two not greater than the original x. The loop executes once for every 1-bit in x, which is on average half as many iterations as your approach. Also, this one has no issues with overflow. (It does return 0 if the original x was 0. That may or may not be what you want.)
Note that if the original x was a power of two, that value is simply returned immediately. So the function doubles as a test for whether x is a power of two (or 0).
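For comparison, here is the same idea transliterated into Python, where overflow cannot happen; the bit_length() version ties back to the first question above. Both helper names are made up, and both assume n >= 1:

def gpt_bit_length(n):
    # Greatest power of two not greater than n, via the stored bit length.
    return 1 << (n.bit_length() - 1)

def gpt_twiddle(x):
    # x & (x - 1) clears the least significant set bit; loop until only
    # the most significant bit remains.
    while x & (x - 1):
        x &= x - 1
    return x

for n in (1, 6, 2**63, 2**64 - 1):
    assert gpt_bit_length(n) == gpt_twiddle(n)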
While that is fun and all, in real-life code you'd probably be better off finding your compiler's equivalent to this gcc built-in (unless your compiler is gcc, in which case here it is):
Built-in Function: int __builtin_clz (unsigned int x)
    Returns the number of leading 0-bits in x, starting at the most
    significant bit position. If x is 0, the result is undefined.
(Also available as __builtin_clzl for unsigned long arguments and __builtin_clzll for unsigned long long.)

Binary layout of Python lists [duplicate]

This question already has answers here: How is Python's List Implemented?
I am writing a program where I need to know the efficiency (memory-wise) of different data containers in Python / Cython. One of said containers is the standard Python list.
The Python list is tripping me up because I do not know how it works on the binary level. Unlike Python, C's arrays are easy to understand, because all of the elements are the same type, and the space is declared ahead of time. This means when the programmer wants to go in and index the array, the program knows mathematically what memory address to go to. But the problem is, a Python list can store many different data types, and even nested lists inside of a list. The size of these data structures changes all the time, and the list still holds them, accounting for the changes. Does extra separator memory exist to make the list as dynamic as it is?
If you could, I would appreciate an actual binary layout of an example list in RAM, annotated with what each byte represents. This will help me to fully understand the inner workings of the list, as I am working on the binary level.
The list object is defined in Include/listobject.h. The structure is really simple:
typedef struct {
    PyObject_VAR_HEAD
    /* Vector of pointers to list elements.  list[0] is ob_item[0], etc. */
    PyObject **ob_item;
    /* ob_item contains space for 'allocated' elements.  The number
     * currently in use is ob_size.
     * Invariants:
     *     0 <= ob_size <= allocated
     *     len(list) == ob_size
     *     ob_item == NULL implies ob_size == allocated == 0
     *     list.sort() temporarily sets allocated to -1 to detect mutations.
     *
     * Items must normally not be NULL, except during construction when
     * the list is not yet visible outside the function that builds it.
     */
    Py_ssize_t allocated;
} PyListObject;
and PyObject_VAR_HEAD is defined as
typedef struct _object {
    _PyObject_HEAD_EXTRA
    Py_ssize_t ob_refcnt;
    struct _typeobject *ob_type;
} PyObject;

typedef struct {
    PyObject ob_base;
    Py_ssize_t ob_size; /* Number of items in variable part */
} PyVarObject;
Basically, then, a list object looks like this:
[ssize_t ob_refcnt]
[type *ob_type]
[ssize_t ob_size]
[object **ob_item] -> [object *][object *][object *]...
[ssize_t allocated]
Note that len retrieves the value of ob_size.
ob_item points to an array of PyObject * pointers. Each element in a list is a Python object, and Python objects are always passed by reference (at the C-API level, as pointers to the actual PyObjects). Therefore, lists only store pointers to objects, and not the objects themselves.
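A small illustration of those pointer semantics (nothing CPython-specific here; this is just Python's object model):

inner = [1, 2]
outer = [inner, inner]  # the list stores two pointers to the same object
inner.append(3)
print(outer)            # [[1, 2, 3], [1, 2, 3]] -- one shared object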
When a list fills up, it will be reallocated. allocated tracks how many elements the list can hold at maximum (before reallocation). The reallocation algorithm is in Objects/listobject.c:
/* Ensure ob_item has room for at least newsize elements, and set
* ob_size to newsize. If newsize > ob_size on entry, the content
* of the new slots at exit is undefined heap trash; it's the caller's
* responsibility to overwrite them with sane values.
* The number of allocated elements may grow, shrink, or stay the same.
* Failure is impossible if newsize <= self.allocated on entry, although
* that partly relies on an assumption that the system realloc() never
* fails when passed a number of bytes <= the number of bytes last
* allocated (the C standard doesn't guarantee this, but it's hard to
* imagine a realloc implementation where it wouldn't be true).
* Note that self->ob_item may change, and even if newsize is less
* than ob_size on entry.
*/
static int
list_resize(PyListObject *self, Py_ssize_t newsize)
{
    PyObject **items;
    size_t new_allocated;
    Py_ssize_t allocated = self->allocated;

    /* Bypass realloc() when a previous overallocation is large enough
       to accommodate the newsize.  If the newsize falls lower than half
       the allocated size, then proceed with the realloc() to shrink the list.
    */
    if (allocated >= newsize && newsize >= (allocated >> 1)) {
        assert(self->ob_item != NULL || newsize == 0);
        Py_SIZE(self) = newsize;
        return 0;
    }

    /* This over-allocates proportional to the list size, making room
     * for additional growth.  The over-allocation is mild, but is
     * enough to give linear-time amortized behavior over a long
     * sequence of appends() in the presence of a poorly-performing
     * system realloc().
     * The growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
     */
    new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);

    /* check for integer overflow */
    if (new_allocated > PY_SIZE_MAX - newsize) {
        PyErr_NoMemory();
        return -1;
    }
    else {
        new_allocated += newsize;
    }

    if (newsize == 0)
        new_allocated = 0;
    items = self->ob_item;
    if (new_allocated <= (PY_SIZE_MAX / sizeof(PyObject *)))
        PyMem_RESIZE(items, PyObject *, new_allocated);
    else
        items = NULL;
    if (items == NULL) {
        PyErr_NoMemory();
        return -1;
    }
    self->ob_item = items;
    Py_SIZE(self) = newsize;
    self->allocated = new_allocated;
    return 0;
}
As you can see from the comments, lists grow rather slowly, in the following sequence:
0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
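You can watch the over-allocation from Python with sys.getsizeof(), which reports the list header plus the pointer array. This is a sketch; the exact byte counts vary across CPython versions and platforms:

import sys

lst = []
last = sys.getsizeof(lst)
print(f"len=0: {last} bytes")
for _ in range(30):
    lst.append(None)
    size = sys.getsizeof(lst)
    if size != last:  # a jump means the pointer array was reallocated
        print(f"len={len(lst)}: {size} bytes")
        last = size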
