Fast way to find bit length of large positive integer from decimal - python

Given the decimal string representation of a large positive integer, what's a fast way to find the integer's bit length? Using int() and then bit_length() is slow. This example with a million digits takes over five seconds to tell me it has 3321926 bits:
s = '1234567890' * 10**5
print(int(s).bit_length())
Result should be exact, at least for all strings one can actually have in memory (so let's say up to 100 billion decimal digits).

If storage space is not an issue and you don't mind spending time up-front, (and you'd rather have a solution that doesn't depend on floating point accuracy, even if it's otherwise impractical) you can solve just about any speed issue with more memory. Build a lookup table of the string representations of 2**n. Set up a dictionary, keying the string length to a list of (string of that length, corresponding n value) pairs. To test an input, look up the appropriate list, and then use ordinary string comparison to figure out which bit-length category it's in.
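A minimal sketch of the lookup-table idea, assuming the input has no leading zeros and that the table was built far enough to contain every power of two with as many digits as the input. The names build_power_table and bit_length_from_table are made up for illustration.

```python
def build_power_table(max_exponent):
    """Map decimal length -> list of (decimal string of 2**n, n), in increasing n."""
    table = {}
    power = 1
    for n in range(max_exponent + 1):
        text = str(power)
        table.setdefault(len(text), []).append((text, n))
        power *= 2
    return table


def bit_length_from_table(s, table):
    """Bit length of the positive integer written as decimal string s."""
    largest = None
    # Strings of equal length compare like the numbers they spell.
    for text, n in table.get(len(s), []):
        if text <= s:
            largest = n
    if largest is None:
        # No power of two with len(s) digits is <= s, so the largest power
        # of two with one digit fewer is the one that counts.
        largest = table[len(s) - 1][-1][1]
    return largest + 1


# Small table for demonstration; 2**64 has 20 digits, so inputs of up to
# 19 digits are fully covered.
table = build_power_table(64)
result = bit_length_from_table("1000", table)  # same as (1000).bit_length()
```

Building the table for realistic input sizes is exactly the impractical up-front cost the answer mentions; the sketch only shows that the comparison logic is plain string comparison.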

This should be accurate for billions of digits, I think. Calculate the exact result for 100000...00000 by simply bits-per-digit, then add the log of the first 10 digits.
import math
s = '1234567890' * 10**5
dper = math.log(10)/math.log(2)
base= (len(s)-10)*dper
extra = math.log(int(s[:10]))/math.log(2)
print(int(base+extra+0.99))

This does the example in about 0.15 seconds, and '1234567890' * 10**6 in about 2 seconds and '1234567890' * 10**7 in about 20 seconds. First I approximate the bit length with logarithms (similar to Tim's way), then I use decimal.Decimal to adjust until exact. That class uses base 10, so it doesn't need a costly base conversion.
Bit length b covers the interval [2**(b-1), 2**b). So we want (the exponent of) the smallest power of 2 larger than the number.
from time import time
from math import log2
from decimal import *
setcontext(Context(prec=MAX_PREC, Emax=MAX_EMAX, Emin=MIN_EMIN))
def bit_length(s):
    if len(s) <= 20:
        return int(s).bit_length()
    head_bits = log2(int(s[:20]))
    tail_bits = (len(s) - 20) * log2(10)
    b = int(head_bits + tail_bits)
    n = Decimal(s)
    power = Decimal(2) ** b
    while power > n:
        b -= 1
        power //= 2
    while power <= n:
        b += 1
        power *= 2
    return b
s = '1234567890' * 10**5
start = time()
print(bit_length(s))
print(time() - start, 'seconds')
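The interval property stated above, that bit length b means 2**(b-1) <= n < 2**b, can be checked directly on small values. This is only an illustrative sketch; the helper names are hypothetical and the doubling loop is far too slow for million-digit inputs.

```python
def in_bit_length_interval(n, b):
    """True if the positive integer n lies in [2**(b-1), 2**b)."""
    return 2 ** (b - 1) <= n < 2 ** b


def bit_length_by_doubling(n):
    """Exponent of the smallest power of two larger than n (small n only)."""
    b, power = 0, 1
    while power <= n:
        b += 1
        power *= 2
    return b


# Each value should fall inside the interval belonging to its own bit length.
checks = [
    in_bit_length_interval(n, bit_length_by_doubling(n))
    for n in (1, 2, 3, 255, 256, 10**30)
]
```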

Related

Exact Value after Floating point not rounding up [duplicate]

I want to remove digits from a float to have a fixed number of digits after the dot, like:
1.923328437452 → 1.923
I need to output as a string to another function, not print.
Also I want to ignore the lost digits, not round them.
round(1.923328437452, 3)
See Python's documentation on the standard types. You'll need to scroll down a bit to get to the round function. Essentially the second number says how many decimal places to round it to.
First, the function, for those who just want some copy-and-paste code:
def truncate(f, n):
    '''Truncates/pads a float f to n decimal places without rounding'''
    s = '{}'.format(f)
    if 'e' in s or 'E' in s:
        return '{0:.{1}f}'.format(f, n)
    i, p, d = s.partition('.')
    return '.'.join([i, (d+'0'*n)[:n]])
This is valid in Python 2.7 and 3.1+. For older versions, it's not possible to get the same "intelligent rounding" effect (at least, not without a lot of complicated code), but rounding to 12 decimal places before truncation will work much of the time:
def truncate(f, n):
    '''Truncates/pads a float f to n decimal places without rounding'''
    s = '%.12f' % f
    i, p, d = s.partition('.')
    return '.'.join([i, (d+'0'*n)[:n]])
Explanation
The core of the underlying method is to convert the value to a string at full precision and then just chop off everything beyond the desired number of characters. The latter step is easy; it can be done either with string manipulation
i, p, d = s.partition('.')
'.'.join([i, (d+'0'*n)[:n]])
or the decimal module
str(Decimal(s).quantize(Decimal((0, (1,), -n)), rounding=ROUND_DOWN))
The first step, converting to a string, is quite difficult because there are some pairs of floating point literals (i.e. what you write in the source code) which both produce the same binary representation and yet should be truncated differently. For example, consider 0.3 and 0.29999999999999998. If you write 0.3 in a Python program, the compiler encodes it using the IEEE floating-point format into the sequence of bits (assuming a 64-bit float)
0011111111010011001100110011001100110011001100110011001100110011
This is the closest value to 0.3 that can accurately be represented as an IEEE float. But if you write 0.29999999999999998 in a Python program, the compiler translates it into exactly the same value. In one case, you meant it to be truncated (to one digit) as 0.3, whereas in the other case you meant it to be truncated as 0.2, but Python can only give one answer. This is a fundamental limitation of Python, or indeed any programming language without lazy evaluation. The truncation function only has access to the binary value stored in the computer's memory, not the string you actually typed into the source code.[1]
If you decode the sequence of bits back into a decimal number, again using the IEEE 64-bit floating-point format, you get
0.2999999999999999888977697537484345957637...
so a naive implementation would come up with 0.2 even though that's probably not what you want. For more on floating-point representation error, see the Python tutorial.
It's very rare to be working with a floating-point value that is so close to a round number and yet is intentionally not equal to that round number. So when truncating, it probably makes sense to choose the "nicest" decimal representation out of all that could correspond to the value in memory. Python 2.7 and up (but not 3.0) includes a sophisticated algorithm to do just that, which we can access through the default string formatting operation.
'{}'.format(f)
The only caveat is that this acts like a g format specification, in the sense that it uses exponential notation (1.23e+4) if the number is large or small enough. So the method has to catch this case and handle it differently. There are a few cases where using an f format specification instead causes a problem, such as trying to truncate 3e-10 to 28 digits of precision (it produces 0.0000000002999999999999999980), and I'm not yet sure how best to handle those.
If you actually are working with floats that are very close to round numbers but intentionally not equal to them (like 0.29999999999999998 or 99.959999999999994), this will produce some false positives, i.e. it'll round numbers that you didn't want rounded. In that case the solution is to specify a fixed precision.
'{0:.{1}f}'.format(f, sys.float_info.dig + n + 2)
The number of digits of precision to use here doesn't really matter, it only needs to be large enough to ensure that any rounding performed in the string conversion doesn't "bump up" the value to its nice decimal representation. I think sys.float_info.dig + n + 2 may be enough in all cases, but if not that 2 might have to be increased, and it doesn't hurt to do so.
In earlier versions of Python (up to 2.6, or 3.0), the floating point number formatting was a lot more crude, and would regularly produce things like
>>> 1.1
1.1000000000000001
If this is your situation, if you do want to use "nice" decimal representations for truncation, all you can do (as far as I know) is pick some number of digits, less than the full precision representable by a float, and round the number to that many digits before truncating it. A typical choice is 12,
'%.12f' % f
but you can adjust this to suit the numbers you're using.
[1] Well... I lied. Technically, you can instruct Python to re-parse its own source code and extract the part corresponding to the first argument you pass to the truncation function. If that argument is a floating-point literal, you can just cut it off a certain number of places after the decimal point and return that. However this strategy doesn't work if the argument is a variable, which makes it fairly useless. The following is presented for entertainment value only:
import inspect
import io
import tokenize
def trunc_introspect(f, n):
    '''Truncates/pads the float f to n decimal places by looking at the caller's source code'''
    current_frame = None
    caller_frame = None
    s = inspect.stack()
    try:
        current_frame = s[0]
        caller_frame = s[1]
        gen = tokenize.tokenize(io.BytesIO(caller_frame[4][caller_frame[5]].encode('utf-8')).readline)
        for token_type, token_string, _, _, _ in gen:
            if token_type == tokenize.NAME and token_string == current_frame[3]:
                next(gen) # left parenthesis
                token_type, token_string, _, _, _ = next(gen) # float literal
                if token_type == tokenize.NUMBER:
                    try:
                        cut_point = token_string.index('.') + n + 1
                    except ValueError: # no decimal in string
                        return token_string + '.' + '0' * n
                    else:
                        if len(token_string) < cut_point:
                            token_string += '0' * (cut_point - len(token_string))
                        return token_string[:cut_point]
                else:
                    raise ValueError('Unable to find floating-point literal (this probably means you called {} with a variable)'.format(current_frame[3]))
                break
    finally:
        del s, current_frame, caller_frame
Generalizing this to handle the case where you pass in a variable seems like a lost cause, since you'd have to trace backwards through the program's execution until you find the floating-point literal which gave the variable its value. If there even is one. Most variables will be initialized from user input or mathematical expressions, in which case the binary representation is all there is.
The result of round is a float, so watch out (example is from Python 2.6):
>>> round(1.923328437452, 3)
1.923
>>> round(1.23456, 3)
1.2350000000000001
You will be better off when using a formatted string:
>>> "%.3f" % 1.923328437452
'1.923'
>>> "%.3f" % 1.23456
'1.235'
n = 1.923328437452
str(n)[:5]
At my Python 2.7 prompt:
>>> int(1.923328437452 * 1000)/1000.0
1.923
The truly pythonic way of doing it is
from decimal import *
with localcontext() as ctx:
    ctx.rounding = ROUND_DOWN
    print(Decimal('1.923328437452').quantize(Decimal('0.001')))
or shorter:
from decimal import Decimal as D, ROUND_DOWN
D('1.923328437452').quantize(D('0.001'), rounding=ROUND_DOWN)
Update
Usually the problem is not in truncating floats itself, but in the improper usage of float numbers before rounding.
For example: int(0.7*3*100)/100 == 2.09.
If you are forced to use floats (say, you're accelerating your code with numba), it's better to use cents as "internal representation" of prices: (70*3 == 210) and multiply/divide the inputs/outputs.
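A short sketch of the difference between the two routes, using the same 0.7 * 3 example. The variable names are illustrative only.

```python
# Float route: 0.7 * 3 lands slightly below 2.1, so truncating to two
# places drops a cent.
float_total = 0.7 * 3
truncated_float = int(float_total * 100) / 100

# Integer route: keep prices in cents and convert only for display.
unit_price_cents = 70
total_cents = unit_price_cents * 3
units, cents = divmod(total_cents, 100)
display = "{}.{:02d}".format(units, cents)
```

With integer cents no truncation step is needed at all; the division by 100 happens once, at the output boundary.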
Simple python script -
n = 1.923328437452
n = float(int(n * 1000))
n /=1000
def trunc(num, digits):
    sp = str(num).split('.')
    return '.'.join([sp[0], sp[1][:digits]])
This should work. It should give you the truncation you are looking for.
So many of the answers given for this question are just completely wrong. They either round up floats (rather than truncate) or do not work for all cases.
This is the top Google result when I search for 'Python truncate float', a concept which is really straightforward, and which deserves better answers. I agree with Hatchkins that using the decimal module is the pythonic way of doing this, so I give here a function which I think answers the question correctly, and which works as expected for all cases.
As a side-note, fractional values, in general, cannot be represented exactly by binary floating point variables (see here for a discussion of this), which is why my function returns a string.
from decimal import Decimal, localcontext, ROUND_DOWN
def truncate(number, places):
    if not isinstance(places, int):
        raise ValueError("Decimal places must be an integer.")
    if places < 1:
        raise ValueError("Decimal places must be at least 1.")
    # If you want to truncate to 0 decimal places, just do int(number).
    with localcontext() as context:
        context.rounding = ROUND_DOWN
        exponent = Decimal(str(10 ** - places))
        return Decimal(str(number)).quantize(exponent).to_eng_string()
>>> from math import floor
>>> floor((1.23658945) * 10**4) / 10**4
1.2365
# divide and multiply by 10**number of desired digits
If you fancy some mathemagic, this works for +ve numbers:
>>> v = 1.923328437452
>>> v - v % 1e-3
1.923
I did something like this:
from math import trunc
def truncate(number, decimals=0):
    if decimals < 0:
        raise ValueError('truncate received an invalid value of decimals ({})'.format(decimals))
    elif decimals == 0:
        return trunc(number)
    else:
        factor = float(10**decimals)
        return trunc(number*factor)/factor
You can do:
import math
def truncate(f, n):
    return math.floor(f * 10 ** n) / 10 ** n
testing:
>>> f=1.923328437452
>>> [truncate(f, n) for n in range(5)]
[1.0, 1.9, 1.92, 1.923, 1.9233]
Just wanted to mention that the old "make round() with floor()" trick of
round(f) = floor(f+0.5)
can be turned around to make floor() from round()
floor(f) = round(f-0.5)
Although both these rules break around negative numbers, so using it is less than ideal:
def trunc(f, n):
    if f > 0:
        return "%.*f" % (n, (f - 0.5*10**-n))
    elif f == 0:
        return "%.*f" % (n, f)
    elif f < 0:
        return "%.*f" % (n, (f + 0.5*10**-n))
def precision(value, precision):
    """
    param: value: takes a float
    param: precision: int, number of decimal places
    returns a float
    """
    x = 10.0**precision
    num = int(value * x)/ x
    return num
precision(1.923328437452, 3)
1.923
Short and easy variant
def truncate_float(value, digits_after_point=2):
    pow_10 = 10 ** digits_after_point
    return (float(int(value * pow_10))) / pow_10
>>> truncate_float(1.14333, 2)
1.14
>>> truncate_float(1.14777, 2)
1.14
>>> truncate_float(1.14777, 4)
1.1477
When using a pandas df this worked for me
import math
def truncate(number, digits) -> float:
    stepper = 10.0 ** digits
    return math.trunc(stepper * number) / stepper
df['trunc'] = df['float_val'].apply(lambda x: truncate(x,1))
df['trunc']=df['trunc'].map('{:.1f}'.format)
int(16.5)
This gives the integer value 16, i.e. it truncates. It can't take a number of decimals, but you can get that with:
import math
def trunc(invalue, digits):
    return int(invalue*math.pow(10,digits))/math.pow(10,digits)
Here is an easy way:
def truncate(num, res=3):
    return (floor(num*pow(10, res)+0.5))/pow(10, res)
for num = 1.923328437452, this outputs 1.923
def trunc(f,n):
    return ('%.16f' % f)[:(n-16)]
A general and simple function to use:
def truncate_float(number, length):
    """Truncate float numbers, up to the number specified
    in length that must be an integer"""
    number = number * pow(10, length)
    number = int(number)
    number = float(number)
    number /= pow(10, length)
    return number
There is an easy workaround in Python 3. I defined where to cut with a helper variable, decPlace, to make it easy to adapt.
f = 1.12345
decPlace= 4
f_cut = int(f * 10**decPlace) /10**decPlace
Output:
f = 1.1234
Hope it helps.
Most answers are way too complicated in my opinion, how about this?
digits = 2 # Specify how many digits you want
fnum = '122.485221'
truncated_float = float(fnum[:fnum.find('.') + digits + 1])
>>> 122.48
Simply scanning for the index of '.' and truncate as desired (no rounding).
Convert string to float as final step.
Or in your case if you get a float as input and want a string as output:
fnum = str(122.485221) # convert float to string first
truncated_float = fnum[:fnum.find('.') + digits + 1] # string output
I think a better version would be just to find the index of decimal point . and then to take the string slice accordingly:
def truncate(number, n_digits:int=1)->float:
    '''
    :param number: real number ℝ
    :param n_digits: Maximum number of digits after the decimal point after truncation
    :return: truncated floating point number with at least one digit after decimal point
    '''
    decimalIndex = str(number).find('.')
    if decimalIndex == -1:
        return float(number)
    else:
        return float(str(number)[:decimalIndex+n_digits+1])
int(1.923328437452 * 1000) / 1000
>>> 1.923
int(1.9239 * 1000) / 1000
>>> 1.923
By multiplying the number by 1000 (10 ^ 3 for 3 digits) we shift the decimal point 3 places to the right and get 1923.3284374520001. When we convert that to an int the fractional part 3284374520001 will be discarded. Then we undo the shifting of the decimal point again by dividing by 1000 which returns 1.923.
use numpy.round
import numpy as np
precision = 3
floats = [1.123123123, 2.321321321321]
new_float = np.round(floats, precision)
Something simple enough to fit in a list-comprehension, with no libraries or other external dependencies. For Python >=3.6, it's very simple to write with f-strings.
The idea is to let the string-conversion do the rounding to one more place than you need and then chop off the last digit.
>>> nout = 3 # desired number of digits in output
>>> [f'{x:.{nout+1}f}'[:-1] for x in [2/3, 4/5, 8/9, 9/8, 5/4, 3/2]]
['0.666', '0.800', '0.888', '1.125', '1.250', '1.500']
Of course, there is rounding happening here (namely for the fourth digit), but rounding at some point is unavoidable. In case the transition between truncation and rounding is relevant, here's a slightly better example:
>>> nacc = 6 # desired accuracy (maximum 15!)
>>> nout = 3 # desired number of digits in output
>>> [f'{x:.{nacc}f}'[:-(nacc-nout)] for x in [2.9999, 2.99999, 2.999999, 2.9999999]]
['2.999', '2.999', '2.999', '3.000']
Bonus: removing zeros on the right
>>> nout = 3 # desired number of digits in output
>>> [f'{x:.{nout+1}f}'[:-1].rstrip('0') for x in [2/3, 4/5, 8/9, 9/8, 5/4, 3/2]]
['0.666', '0.8', '0.888', '1.125', '1.25', '1.5']
The core idea given here seems to me to be the best approach for this problem.
Unfortunately, it has received fewer votes, while the later answer that has more votes is not complete (as observed in the comments). Hopefully, the implementation below provides a short and complete solution for truncation.
def trunc(num, digits):
    l = str(float(num)).split('.')
    digits = min(len(l[1]), digits)
    return l[0] + '.' + l[1][:digits]
which should take care of all corner cases found here and here.
Am also a python newbie and after making use of some bits and pieces here, I offer my two cents
print str(int(time.time()))+str(datetime.now().microsecond)[:3]
str(int(time.time())) will take the time epoch as int and convert it to string and join with...
str(datetime.now().microsecond)[:3] which returns the microseconds only, convert to string and truncate to first 3 chars
# value: value to be truncated
# n: number of digits to keep after the decimal point
value = 0.999782
n = 3
float(int(value * 10**n)) / 10**n

Counting the number of set bits in a number

The problem statement is:
Write an efficient program to count number of 1s in binary representation of an integer.
I found a post on this problem here which outlines multiple solutions which run in log(n) time, including Brian Kernighan's algorithm and the gcc __builtin_popcount() method.
One solution that wasn't mentioned was the python method: bin(n).count("1")
which also achieves the same effect. Does this method also run in log n time?
You are converting the integer to a string, which means it'll have to produce N '0' and '1' characters. You then use str.count() which must visit every character in the string to count the '1' characters.
All in all you have a O(N) algorithm, with a relatively high constant cost.
Note that this is the same complexity as the code you linked to; the integer n has log(n) bits, but the algorithm still has to make N = log(n) steps to calculate the number of bits. The bin(n).count('1') algorithm is thus equivalent, but slow as there is a high cost to produce the string in the first place.
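For comparison, here is a sketch of Brian Kernighan's approach mentioned in the question, next to the string-based version. The function names are illustrative. Note that on arbitrary-size Python integers each `n &= n - 1` itself touches every machine word of n, so this is not claimed to be faster than bin(n).count('1').

```python
def popcount_kernighan(n):
    """Count set bits of a non-negative int by clearing the lowest set bit each pass."""
    count = 0
    while n:
        n &= n - 1  # clears the lowest set bit
        count += 1
    return count


def popcount_str(n):
    """Count set bits by producing the binary string and scanning it."""
    return bin(n).count("1")
```

The loop runs once per set bit rather than once per bit, which is the usual selling point of this method on fixed-width integers.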
At the cost of a table, you could move to processing integers per byte:
table = [0]
while len(table) < 256:
    table += [t + 1 for t in table]
length = sum(map(table.__getitem__, n.to_bytes(n.bit_length() // 8 + 1, 'little')))
However, because Python needs to produce a series of new objects (a bytes object and several integers) this method never quite is fast enough to beat the bin(n).count('1') method:
>>> from random import choice
>>> import timeit
>>> table = [0]
>>> while len(table) < 256:
...     table += [t + 1 for t in table]
...
>>> def perbyte(n): return sum(map(table.__getitem__, n.to_bytes(n.bit_length() // 8 + 1, 'little')))
...
>>> def strcount(n): return bin(n).count('1')
...
>>> n = int(''.join([choice('01') for _ in range(2 ** 16)]))
>>> for f in (strcount, perbyte):
...     print(f.__name__, timeit.timeit('f(n)', 'from __main__ import f, n', number=1000))
...
strcount 1.11822146497434
perbyte 1.4401431040023454
No matter the bit-length of the test number, perbyte is always a percentage slower.
Let's say you are trying to count the number of set bits of n. On typical Python implementations, bin will compute the binary representation in O(log n) time and count will go through the string, therefore resulting in an overall O(log n) complexity.
However, note that usually, the input parameter of algorithms is the "size" of the input. When you work with integers, this corresponds to their logarithm. That's why the current algorithm is said to have a linear complexity (the variable is m = log n, and the complexity O(m)).

Wrong answer in SPOJ `CUBERT` [closed]

I am getting a Wrong Answer for my solution to this problem on SPOJ.
The problem asks to calculate the cube root of an integer (which can be up to 150 digits long), and output the answer truncated to 10 decimal places.
It also asks to calculate the sum of all the digits in the answer modulo 10 as a 'checksum' value.
Here is the exact problem statement:
Your task is to calculate the cube root of a given positive integer.
We can not remember why exactly we need this, but it has something in
common with a princess, a young peasant, kissing and half of a kingdom
(a huge one, we can assure you).
Write a program to solve this crucial task.
Input
The input starts with a line containing a single integer t <= 20, the
number of test cases. t test cases follow.
The next lines consist of large positive integers of up to 150 decimal
digits. Each number is on its own separate line of the input file. The
input file may contain empty lines. Numbers can be preceded or
followed by whitespaces but no line exceeds 255 characters.
Output
For each number in the input file your program should output a line
consisting of two values separated by single space. The second value
is the cube root of the given number, truncated (not rounded!) after
the 10th decimal place. First value is a checksum of all printed
digits of the cube root, calculated as the sum of the printed digits
modulo 10.
Example
Input:
5
1
8
1000
2
33076161
Output:
1 1.0000000000
2 2.0000000000
1 10.0000000000
0 1.2599210498
6 321.0000000000
Here is my solution:
from math import pow
def foo(num):
    num_cube_root = pow(num, 1.0 / 3)
    # First round upto 11 decimal places
    num_cube_root = "%.11f" % (num_cube_root)
    # Then remove the last decimal digit
    # to achieve a truncation of 10 decimal places
    num_cube_root = str(num_cube_root)[0:-1]
    num_cube_root_sum = 0
    for digit in num_cube_root:
        if digit != '.':
            num_cube_root_sum += int(digit)
    num_cube_root_sum %= 10
    return (num_cube_root_sum, num_cube_root)
def main():
    # Number of test cases
    t = int(input())
    while t:
        t -= 1
        num = input().strip()
        # If line empty, ignore
        if not num:
            t += 1
            continue
        num = int(num)
        ans = foo(num)
        print(str(ans[0]) + " " + ans[1])
if __name__ == '__main__':
    main()
It is working perfectly for the sample cases: Live demo.
Can anyone tell what is the problem with this solution?
Your solution has two problems, both related to the use of floating-point arithmetic. The first issue is that Python floats only carry roughly 16 significant decimal digits of precision, so as soon as your answer requires more than 16 significant digits or so (so more than 6 digits before the point, and 10 digits after), you've very little hope of getting the correct trailing digits. The second issue is more subtle, and affects even small values of n. That's that your approach of rounding to 11 decimal digits and then dropping the last digit suffers from potential errors due to double rounding. For an example, take n = 33. The cube root of n, to 20 decimal places or so, is:
3.20753432999582648755...
When that's rounded to 11 places after the point, you end up with
3.20753433000
and now dropping the last digit gives 3.2075343300, which isn't what you wanted. The problem is that that round to 11 decimal places can end up affecting digits to the left of the 11th place digit.
So what can you do to fix this? Well, you can avoid floating-point altogether and reduce this to a pure integer problem. We need the cube root of some integer n to 10 decimal places (rounding the last place down). That's equivalent to computing the cube root of 10**30 * n to the nearest integer, again rounding down, then dividing the result by 10**10. So the essential task here is to compute the floor of the cube root of any given integer n. I was unable to find any existing Stack Overflow answers about computing integer cube roots (still less in Python), so I thought it worth showing how to do so in detail.
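The reduction itself can be sketched with plain bisection standing in for the integer cube root. Bisection is a swapped-in technique chosen here for brevity, not the method this answer develops, and the names floor_cbrt_bisect and cbrt_truncated are hypothetical.

```python
def floor_cbrt_bisect(n):
    """Floor of the cube root of a non-negative integer, by bisection."""
    lo = 0
    # A power of two whose cube is guaranteed to exceed n.
    hi = 1 << -(-n.bit_length() // 3)
    # Invariant: lo**3 <= n < hi**3.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid
    return lo


def cbrt_truncated(n, places=10):
    """Cube root of n as a string, truncated after `places` decimals."""
    # Scaling n by 10**(3*places) scales its cube root by 10**places.
    scaled_root = floor_cbrt_bisect(n * 10 ** (3 * places))
    whole, frac = divmod(scaled_root, 10 ** places)
    return "{}.{:0{}d}".format(whole, frac, places)
```

Because everything stays in integers, the n = 33 case that trips up the float-based approach comes out truncated rather than rounded.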
Computing cube roots of integers turns out to be quite easy (with the help of a tiny bit of mathematics). There are various possible approaches, but one approach that's both efficient and easy to implement is to use a pure-integer version of the Newton-Raphson method. Over the real numbers, Newton's method for solving the equation x**3 = n takes an approximation x to the cube root of n, and iterates to return an improved approximation. The required iteration is:
x_next = (2*x + n/x**2)/3
In the real case, you'd repeat the iteration until you reached some desired tolerance. It turns out that over the integers, essentially the same iteration works, and with the right exit condition it will give us exactly the correct answer (no tolerance required). The iteration in the integer case is:
a_next = (2*a + n//a**2)//3
(Note the uses of the floor division operator // in place of the usual true division operator / above.) Mathematically, a_next is exactly the floor of (2*a + n/a**2)/3.
Here's some code based on this iteration:
def icbrt_v1(n, initial_guess=None):
    """
    Given a positive integer n, find the floor of the cube root of n.
    Args:
        n : positive integer
        initial_guess : positive integer, optional. If given, this is an
            initial guess for the floor of the cube root. It must be greater
            than or equal to floor(cube_root(n)).
    Returns:
        The floor of the cube root of n, as an integer.
    """
    a = initial_guess if initial_guess is not None else n
    while True:
        d = n//a**2
        if a <= d:
            return a
        a = (2*a + d)//3
And some example uses:
>>> icbrt_v1(100)
4
>>> icbrt_v1(1000000000)
1000
>>> large_int = 31415926535897932384626433
>>> icbrt_v1(large_int**3)
31415926535897932384626433
>>> icbrt_v1(large_int**3-1)
31415926535897932384626432
There are a couple of annoyances and inefficiencies in icbrt_v1 that we'll fix shortly. But first, a brief explanation of why the above code works. Note that we start with an initial guess that's assumed to be greater than or equal to the floor of the cube root. We'll show that this property is a loop invariant: every time we reach the top of the while loop, a is at least floor(cbrt(n)). Furthermore, each iteration produces a value of a strictly smaller than the old one, so our iteration is guaranteed to eventually converge to floor(cbrt(n)). To prove these facts, note that as we enter the while loop, there are two possibilities:
Case 1. a is strictly greater than the cube root of n. Then a > n//a**2, and the code proceeds to the next iteration. Write a_next = (2*a + n//a**2)//3, then we have:
a_next >= floor(cbrt(n)). This follows from the fact that (2*a + n/a**2)/3 is at least the cube root of n, which in turn follows from the AM-GM inequality applied to a, a and n/a**2: the geometric mean of these three quantities is exactly the cube root of n, so the arithmetic mean must be at least the cube root of n. So our loop invariant is preserved for the next iteration.
a_next < a: since we're assuming that a is larger than the cube root, n/a**2 < a, and it follows that (2a + n/a**2) / 3 is smaller than a, and hence that floor((2a + n/a**2) / 3) < a. This guarantees that we make progress towards the solution at each iteration.
Case 2. a is less than or equal to the cube root of n. Then a <= floor(cbrt(n)), but from the loop invariant established above we also know that a >= floor(cbrt(n)). So we're done: a is the value we're after. And the while loop exits at this point, since a <= n // a**2.
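The two properties argued above (every iterate stays at or above floor(cbrt(n)), and the iterates strictly decrease) can be observed by recording the values of a. This is only an illustrative sketch; icbrt_trace is a hypothetical helper that mirrors the loop but returns the whole sequence.

```python
def icbrt_trace(n, start):
    """Return every value `a` takes in the integer Newton iteration for cbrt(n)."""
    a = start
    seen = [a]
    while True:
        d = n // a ** 2
        if a <= d:
            return seen
        a = (2 * a + d) // 3
        seen.append(a)


# Start from the (inefficient) initial guess a = n.
trace = icbrt_trace(100, 100)
strictly_decreasing = all(x > y for x, y in zip(trace, trace[1:]))
never_below_root = all(a >= trace[-1] for a in trace)
```

Printing trace for this input also shows the slow early phase, where each step shrinks a by roughly a third.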
There are a couple of issues with the code above. First, starting with an initial guess of n is inefficient: the code will spend its first few iterations (roughly) dividing the current value of a by 3 each time until it gets into the neighborhood of the solution. A better choice for the initial guess (and one that's easily computable in Python) is to use the first power of two that exceeds the cube root of n.
initial_guess = 1 << -(-n.bit_length() // 3)
Even better, if n is small enough to avoid overflow, is to use floating-point arithmetic to provide the initial guess, with something like:
initial_guess = int(round(n ** (1/3.)))
But this brings us to our second issue: the correctness of our algorithm requires that the initial guess is no smaller than the actual integer cube root, and as n gets large we can't guarantee that for the float-based initial_guess above (though for small enough n, we can). Luckily, there's a very simple fix: for any positive integer a, if we perform a single iteration we always end up with a value that's at least floor(cbrt(a)) (using the same AM-GM argument that we used above). So all we have to do is perform at least one iteration before we start testing for convergence.
With that in mind, here's a more efficient version of the above code:
def icbrt(n):
    """
    Given a positive integer n, find the floor of the cube root of n.
    Args:
        n : positive integer
    Returns:
        The floor of the cube root of n, as an integer.
    """
    if n.bit_length() < 1024: # float(n) safe from overflow
        a = int(round(n**(1/3.)))
        a = (2*a + n//a**2)//3 # Ensure a >= floor(cbrt(n)).
    else:
        a = 1 << -(-n.bit_length()//3)
    while True:
        d = n//a**2
        if a <= d:
            return a
        a = (2*a + d)//3
And with icbrt in hand, it's easy to put everything together to compute cube roots to ten decimal places. Here, for simplicity, I output the result as a string, but you could just as easily construct a Decimal instance.
def cbrt_to_ten_places(n):
    """
    Compute the cube root of `n`, truncated to ten decimal places.
    Returns the answer as a string.
    """
    a = icbrt(n * 10**30)
    q, r = divmod(a, 10**10)
    return "{}.{:010d}".format(q, r)
Example outputs:
>>> cbrt_to_ten_places(2)
'1.2599210498'
>>> cbrt_to_ten_places(8)
'2.0000000000'
>>> cbrt_to_ten_places(31415926535897932384626433)
'315536756.9301821867'
>>> cbrt_to_ten_places(31415926535897932384626433**3)
'31415926535897932384626433.0000000000'
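The problem statement also asks for a checksum in front of each result: the sum of the printed digits modulo 10. A minimal sketch, assuming the cube root has already been produced as a string of the form shown above; checksum and output_line are illustrative names.

```python
def checksum(printed):
    """Sum of the printed digits modulo 10; the decimal point is skipped."""
    return sum(int(ch) for ch in printed if ch.isdigit()) % 10


def output_line(cube_root_text):
    """Format one answer line: checksum, a space, then the truncated cube root."""
    return "{} {}".format(checksum(cube_root_text), cube_root_text)
```

Applied to the sample values from the problem statement, this reproduces the expected lines such as "0 1.2599210498" and "6 321.0000000000".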
You may try to use the decimal module with a sufficiently large precision value.
EDIT: Thanks to @DSM, I realised that the decimal module will not produce very exact cube roots. I suggest that you check whether all digits are 9s and round it to an integer if that is the case.
Also, I now perform the 1/3 division with Decimals as well, because passing the result of 1/3 to Decimal constructor leads to reduced precision.
import decimal
def cbrt(n):
    nd = decimal.Decimal(n)
    with decimal.localcontext() as ctx:
        ctx.prec = 50
        i = nd ** (decimal.Decimal(1) / decimal.Decimal(3))
    return i
ret = str(cbrt(1233412412430519230351035712112421123121111))
print(ret)
left, right = ret.split('.')
print(left + '.' + ''.join(right[:10]))
Output:
107243119477324.80328931501744819161741924145124146
107243119477324.8032893150
Output of cbrt(1000) is:
9.9999999999999999999999999999999999999999999999998
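One way to handle the "all 9s" case without inspecting digits is to round the Decimal result to the nearest integer and test whether that integer cubes back to n exactly. This is only a sketch of that idea: cbrt_decimal is a hypothetical name, and the default precision of 50 is simply carried over from the code above.

```python
from decimal import Decimal, localcontext


def cbrt_decimal(n, prec=50):
    """Cube root of a positive integer n via decimal, exact for perfect cubes."""
    with localcontext() as ctx:
        ctx.prec = prec
        root = Decimal(n) ** (Decimal(1) / Decimal(3))
    # Round to the nearest integer and verify with exact integer arithmetic.
    nearest = int(root.to_integral_value())
    if nearest ** 3 == n:
        return Decimal(nearest)
    return root
```

The integer check is exact, so a result such as 9.99...98 is only replaced when n really is a perfect cube. The precision still has to be large enough for the rounded value to land on the right integer.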

Fastest way to generate number like 66666 when the number of digits is given

I have an interesting problem where I want to generate a big number (~30000 digits) but it has to be all identical digits, like 66666666666666.......
So far I have done this by:
def fillWithSixes(digits):
    result = 0
    for i in range(digits):
        result *= 10
        result += 6
    return result
However, this is very inefficient, and I was wondering if there is a better way. An answer in C++ or Java is okay too.
Edit:
Let's not just solve for 666666..... I want it to be generic for any number. How about 7777777777.... or 44444........ or 55555...?
String operations are worse: they increase the complexity from the current O(n) to O(n^2).
You may use the formula 666...666 = 6/9*(10**n-1), where n is the number of digits.
So, in Python, you would write that as
n = int(input())
a = 6 * (10**n - 1) // 9
print(a)
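Since the question asks for any repeated digit, the formula generalises directly: ddd...d with n digits equals d * (10**n - 1) // 9. A minimal sketch, with repdigit as a hypothetical function name:

```python
def repdigit(digit, n):
    """Integer made of `n` copies of the decimal digit `digit` (1 to 9)."""
    if not 1 <= digit <= 9:
        raise ValueError("digit must be between 1 and 9")
    if n < 1:
        raise ValueError("n must be at least 1")
    # (10**n - 1) is n nines, so dividing by 9 gives n ones.
    return digit * (10**n - 1) // 9


print(repdigit(6, 5))   # 66666
print(repdigit(7, 10))  # 7777777777
```

The division is exact because 10**n - 1 is always a multiple of 9.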
You can use ljust or rjust:
number = 6
amount_of_times_to_repeat = 30000
big_number = int("".ljust(amount_of_times_to_repeat, str(number)))
print big_number
In one single line:
print int("".ljust(30000, str(6)))
Or:
new_number = int("".ljust(30000, str(6)))
The fastest method to generate such numbers with 100000+ digits is decimal.Decimal():
from decimal import Decimal as D
d = D('6' * n)
Measurements show that 6 * (10**n - 1) // 9 is O(n*log n) while D('6' * n) is O(n). Though for small n (less than ~10000), the former can be faster.
Decimal's internal representation stores decimal digits directly. If you need to print the numbers later, str(Decimal) is much faster than str(int).
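The relative speed of the two constructions can be measured with timeit. The sketch below makes no claim about the actual timings, which depend on the machine and Python version; repdigit_int, repdigit_decimal and compare are hypothetical names.

```python
import timeit
from decimal import Decimal


def repdigit_int(digit, n):
    """Repeated digit as an int, via the closed-form formula."""
    return digit * (10**n - 1) // 9


def repdigit_decimal(digit, n):
    """Repeated digit as a Decimal, built directly from the digit string."""
    return Decimal(str(digit) * n)


def compare(n, repeats=3):
    """Best-of-`repeats` wall time in seconds for each construction."""
    t_int = min(timeit.repeat(lambda: repdigit_int(6, n), number=1, repeat=repeats))
    t_dec = min(timeit.repeat(lambda: repdigit_decimal(6, n), number=1, repeat=repeats))
    return t_int, t_dec


if __name__ == "__main__":
    for n in (10**3, 10**4, 10**5):
        t_int, t_dec = compare(n)
        print("n=%d  int: %.6fs  Decimal: %.6fs" % (n, t_int, t_dec))
```

The Decimal constructor keeps every digit of the string regardless of the context precision, so no context setup is needed just to build the number.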

Finding digits in powers of 2 fast

The task is to search every power of two below 2^10000, returning the index of the first power in which a string is contained. For example if the given string to search for is "7" the program will output 15, as 2^15 is the first power to contain 7 in it.
I have approached this with a brute force attempt which times out on ~70% of test cases.
for i in range(1,9999):
    if search in str(2**i):
        print i
        break
How would one approach this with a time limit of 5 seconds?
Try not to compute 2^i at each step.
pow = 2  # 2**1, so that pow == 2**i at the top of each iteration
for i in xrange(1,9999):
    if search in str(pow):
        print i
        break
    pow *= 2
You can compute it as you go along. This should save a lot of computation time.
Using xrange will prevent a list from being built, but that will probably not make much of a difference here.
in is probably implemented as a quadratic string search algorithm. It may (or may not, you'd have to test) be more efficient to use something like KMP for string searching.
A faster approach could be computing the numbers directly in decimal
def double(x):
    carry = 0
    for i, v in enumerate(x):
        d = v*2 + carry
        if d > 99999999:
            x[i] = d - 100000000
            carry = 1
        else:
            x[i] = d
            carry = 0
    if carry:
        x.append(carry)
Then the search function can become
def p2find(s):
    x = [1]
    for y in xrange(10000):
        # Most significant chunk as-is, the rest zero-padded to 8 digits.
        if s in str(x[-1])+"".join(("00000000"+str(chunk))[-8:]
                                   for chunk in x[::-1][1:]):
            return y
        double(x)
    return None
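For reference, here is a Python 3 sketch of the same base-10**8 approach, with the string conversion pulled out into its own helper so it can be checked against Python's built-in integers. The names BASE, double_in_place, limbs_to_str and first_power_containing are hypothetical and chosen for illustration only.

```python
BASE = 10**8  # each list element holds 8 decimal digits, least significant first


def double_in_place(limbs):
    """Double the number stored in `limbs`, propagating carries."""
    carry = 0
    for i, v in enumerate(limbs):
        d = 2*v + carry
        if d >= BASE:
            limbs[i] = d - BASE
            carry = 1
        else:
            limbs[i] = d
            carry = 0
    if carry:
        limbs.append(carry)


def limbs_to_str(limbs):
    """Decimal string: top limb unpadded, lower limbs zero-padded to 8 digits."""
    head = str(limbs[-1])
    tail = "".join("{:08d}".format(v) for v in reversed(limbs[:-1]))
    return head + tail


def first_power_containing(s, limit=10000):
    """Smallest exponent e < limit such that s occurs in the digits of 2**e."""
    limbs = [1]
    for exponent in range(limit):
        if s in limbs_to_str(limbs):
            return exponent
        double_in_place(limbs)
    return None
```

Because each limb is already decimal, building the string is just concatenation, and no binary-to-decimal conversion of a large integer is ever needed.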
Note also that the digits of all powers of two up to 2^10000 total only about 15 million, and searching static data is much faster. If the program does not have to be restarted for each query, then:
def p2find(s, digits = []):
    if len(digits) == 0:
        # This precomputation happens only ONCE
        p = 1
        for k in xrange(10000):
            digits.append(str(p))
            p *= 2
    for i, v in enumerate(digits):
        if s in v: return i
    return None
With this approach the first check will take some time, next ones will be very very fast.
Compute every power of two and build a suffix tree using each string. This is linear time in the size of all the strings. Now, the lookups are basically linear time in the length of each lookup string.
I don't think you can beat this for computational complexity.
There are only 10000 numbers. You don't need any complex algorithms. Simply calculate them in advance and search. This should take merely 1 or 2 seconds.
powers_of_2 = [str(1<<i) for i in range(10000)]
def search(s):
    for i in range(len(powers_of_2)):
        if s in powers_of_2[i]:
            return i
Try this
twos = []
twoslen = []
two = 1
for i in xrange(10000):
    twos.append(two)
    twoslen.append(len(str(two)))
    two *= 2
tens = []
ten = 1
for i in xrange(len(str(two))):
    tens.append(ten)
    ten *= 10
s = raw_input()
l = len(s)
n = int(s)
for i in xrange(len(twos)):
    for j in xrange(twoslen[i]):
        k = twos[i] // tens[j]  # drop the last j digits
        if k < n: continue
        if (k - n) % tens[l] == 0:
            print i
            exit()
The idea is to precompute every power of 2 and of 10, and also the number of digits of every power of 2. In this way the problem reduces to finding the minimum i for which there exists a j such that, after removing the last j digits from 2 ** i, you obtain a number that ends with n. Expressed as a formula: (2 ** i / 10 ** j - n) % 10 ** len(str(n)) == 0.
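The arithmetic test can be isolated into a small function and compared against the in operator. This is a sketch rather than the answer's code: contains_digits is a hypothetical name, and the lower bound on k is an added assumption so that patterns with leading zeros (such as "01") do not match numbers that are too short.

```python
def contains_digits(value, s):
    """True if the digit string s occurs in the decimal digits of value."""
    n = int(s)
    width = len(s)
    modulus = 10**width
    smallest = 10**(width - 1)  # smallest integer that has `width` digits
    k = value
    while k >= smallest:
        # Do the last `width` digits of k spell out s?
        if k % modulus == n:
            return True
        k //= 10  # remove one more trailing digit
    return False
```

For example, with value = 65536 and s = "65", three trailing digits are removed before k becomes 65 and the test succeeds.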
A big problem here is that converting a binary integer to decimal notation takes time quadratic in the number of bits (at least in the straightforward way Python does it). It's actually faster to fake your own decimal arithmetic, as @6502 did in his answer.
But it's very much faster to let Python's decimal module do it - at least under Python 3.3.2 (I don't know how much C acceleration is built in to Python decimal versions before that). Here's code:
class S:
    def __init__(self):
        import decimal
        decimal.getcontext().prec = 4000  # way more than enough for 2**10000
        p2 = decimal.Decimal(1)
        full = []
        for i in range(10000):
            s = "%s<%s>" % (p2, i)
            ##assert s == "%s<%s>" % (str(2**i), i)
            full.append(s)
            p2 *= 2
        self.full = "".join(full)

    def find(self, s):
        import re
        # Raw string so that \d reaches the regex engine unchanged.
        pat = s + r"[^<>]*<(\d+)>"
        m = re.search(pat, self.full)
        if m:
            return int(m.group(1))
        else:
            print(s, "not found!")
and sample usage:
>>> s = S()
>>> s.find("1")
0
>>> s.find("2")
1
>>> s.find("3")
5
>>> s.find("65")
16
>>> s.find("7")
15
>>> s.find("00000")
1491
>>> s.find("666")
157
>>> s.find("666666")
2269
>>> s.find("66666666")
66666666 not found!
s.full is a string with a bit over 15 million characters. It looks like this:
>>> print(s.full[:20], "...", s.full[-20:])
1<0>2<1>4<2>8<3>16<4 ... 52396298354688<9999>
So the string contains each power of 2, with each power followed by its exponent enclosed in angle brackets. The find() method constructs a regular expression that searches for the desired substring and then looks ahead to find the exponent.
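The tagging scheme is easiest to see on a short string. The sketch below builds the tagged string for just the first 20 powers using plain ints; build_tagged and find_exponent are hypothetical names, and re.escape is an added precaution rather than part of the original code.

```python
import re


def build_tagged(limit):
    """Concatenate 'power<exponent>' for exponents 0 .. limit-1."""
    return "".join("%d<%d>" % (2**i, i) for i in range(limit))


def find_exponent(tagged, s):
    """Exponent of the first power whose digits contain s, or None."""
    # After s, skip the remaining digits of that power, then capture the tag.
    m = re.search(re.escape(s) + r"[^<>]*<(\d+)>", tagged)
    return int(m.group(1)) if m else None


tagged = build_tagged(20)
print(tagged[:21])                # 1<0>2<1>4<2>8<3>16<4>
print(find_exponent(tagged, "7"))   # 15
print(find_exponent(tagged, "65"))  # 16
```

A match cannot start inside a tag, because the digits of an exponent are followed by ">" rather than by more digits and a "<".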
Playing around with this, I'm convinced that just about any way of searching is "fast enough". It's getting the decimal representations of the large powers that sucks up the vast bulk of the time. And the decimal module solves that one.
