Problem 48 description from Project Euler:
The series, 1^1 + 2^2 + 3^3 + ... + 10^10 = 10405071317. Find the last
ten digits of the series, 1^1 + 2^2 + 3^3 + ... + 1000^1000.
I've just solved this problem using a one-liner in Python:
print sum([i**i for i in range(1,1001)])%(10**10)
I did it that way almost instantly, as I remembered that modular arithmetic is very fast in Python. But I still don't understand how this works under the hood (what optimizations does Python do?) and why it is so fast.
Could you please explain this to me? Is the mod 10**10 operation optimized to be applied at every iteration of the list comprehension instead of to the whole sum?
$ time python pe48.py
9110846700
real 0m0.070s
user 0m0.047s
sys 0m0.015s
Given that
print sum([i**i for i in range(1,1001)])%(10**10)
and
print sum([i**i for i in range(1,1001)])
function equally fast in Python, the answer to your last question is 'no'.
So, Python must be able to do integer exponentiation really fast. And it so happens that integer exponentiation is O(log(n)) multiplications: http://en.wikipedia.org/wiki/Exponentiation#Efficient_computation_of_integer_powers
Essentially, instead of computing 2^100 = 2*2*2*2*2... with 100 multiplications, you realize that 2^100 is also 2^64 * 2^32 * 2^4, and that you can square 2 over and over to get 2^2, then 2^4, then 2^8, and so on. Once you've found the values of all three of those components, you multiply them together for the final answer. This requires far fewer multiplication operations. The specifics are a bit more involved, but Python is mature enough to be well optimized on such a core feature.
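A minimal sketch of that square-and-multiply idea (the helper name ipow is mine; CPython's built-in ** uses a comparable technique internally):

def ipow(base, exp):
    # exponentiation by repeated squaring: O(log exp) multiplications
    result = 1
    square = base            # takes the values base**1, base**2, base**4, ...
    while exp:
        if exp & 1:          # this power of two appears in exp's binary form
            result *= square
        square *= square
        exp >>= 1
    return result

# 100 = 64 + 32 + 4, so only the squarings 2**4, 2**32 and 2**64 get multiplied in:
assert ipow(2, 100) == 2**100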
No, it's applied to the whole sum. The sum itself is very fast to compute. Doing exponents isn't that hard to do quickly if the arguments are integers.
I am developing a machine-learning-based algorithm in Python. The main thing I need to calculate to solve this problem is probabilities. I have the following code:
class_ans = class_probability[current_class] * lambdas[current_class]
for word in appears_words:
    if word in message:
        class_ans *= words_probability[(word, current_class)]
    else:
        class_ans *= (1 - words_probability[(word, current_class)])
ans.append(class_ans)
ans[current_class] /= summ
It works, but when the dataset is too big or the lambda values are too small, I run out of float precision.
I've tried to find another way of calculating my answer's value, multiplying and dividing different variables by some constants to keep them in range, but nothing helped.
So I would like to ask: is there any way to increase float precision in Python?
Thanks!
You cannot increase the precision of the built-in float. For serious scientific computation where precision is key (and speed is not), consider the following two options:
Instead of using float, switch your datatype to decimal.Decimal and set your desired precision.
For a more battle-hardened, thorough implementation, switch to gmpy2.mpfr as your data type.
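For instance, a minimal sketch of the decimal option (the precision of 50 digits is an arbitrary choice):

from decimal import Decimal, getcontext

getcontext().prec = 50        # 50 significant digits; set whatever you need

p = Decimal(1)
for _ in range(20):
    p *= Decimal("1e-30")     # a float would underflow to 0.0 long before this
print(p)                      # 1E-600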
However, if your entire computation (or at least the problematic part) involves the multiplication of factors, you can often bypass the need for the above by working in log-space as Konrad Rudolph suggests in the comments:
a * b * c * d * ... = exp(log(a) + log(b) + log(c) + log(d) + ...)
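For example, a minimal sketch of the log-space trick, assuming all factors are strictly positive:

import math

# accumulate log-probabilities instead of multiplying raw probabilities,
# so tiny products cannot underflow to 0.0
probs = [1e-100] * 10              # the true product, 1e-1000, is not a valid float
log_product = sum(math.log(x) for x in probs)
print(log_product)                 # about -2302.59, i.e. log(1e-1000)
# exponentiating this would underflow, so keep comparing scores in log-space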
I am trying to find a very fast way to find the next higher power of 2 than a very large number (1,000,000 digits). Example: I have 1009, and want to find its next higher power of two, which is 1024, or 2**10.
I tried using a loop, but for large numbers this is very, very slow:
y = 0
while (1 << y) < 1009:
    y += 1
print(1 << y)
1024
While this works, it's slow for numbers larger than a million digits. Is there a faster algorithm to find the next higher power of 2 than a number that is large?
ANSWERED BY @JonClements
Using 2**number.bit_length() works perfectly, so this will work for large numbers as well. Thanks Jon.
Here's a code example from Jon's implementation:
2**j.bit_length()
1024
Here's a code example using the shift operator
2<<(j.bit_length()-1)
1024
Here is the time difference using the million-digit number; the shift-operator version of bit_length is significantly faster:
len(str(aa))
1000000

def useBITLENGTHwithshiftoperator(hm):
    return (1 << hm.bit_length() - 1) << 1

def useBITLENGTHwithpowersoperator(hm):
    return 2**hm.bit_length()

start = time.time()
l = useBITLENGTHwithpowersoperator(aa)
end = time.time()
print(end - start)
0.014303922653198242

start = time.time()
l = useBITLENGTHwithshiftoperator(aa)
end = time.time()
print(end - start)
0.0002968311309814453
Take 2^ceiling(logBase2(x)) - this should work unless x is itself a power of 2, which you can check for by testing whether logBase2(x) == ceiling(logBase2(x)).
I do not code in Python, but millions of digits implies bignums, so:
Try to look inside your bignum lib.
It might return the number of words or bits used in O(1), as some number representations need that to speed up other operations. In such a case you can obtain your answer in O(1) for free.
As @JonClements suggested in a comment, try bit_length() and measure whether it is O(1) or O(log(n))...
Your while loop is O(n^3) instead of O(n^2).
You are bit-shifting from 1 over and over again in each iteration. Why not just shift the last result by 1 bit again instead? Something like
for (y=0,yy=1;yy<1009;y++,yy<<=1);
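In Python, the same incremental idea looks roughly like this:

y, yy = 0, 1
while yy < 1009:
    y += 1
    yy <<= 1     # reuse the previous result instead of recomputing 1 << y
print(yy)        # 1024 (and y == 10)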
Using log2 might be faster.
If the bignum class you use implements it correctly, then above some number-size threshold log2(1009) might be significantly faster. But that depends on the type of numbers you are using and on the bignum implementation itself.
Bit-shifting can be even faster.
If you have an upper limit on your numbers, you can use binary search, turning your bit-shifting into O(n.log2(n)).
If not, you can start bit-shifting by 32 bits instead of by 1 and, once you reach the target size, bit-shift by 1 bit. Or even use more layers, like 1024/128/16/1 bits. The complexity would still be O(n^2), but the constant factor would be ~1024 times smaller, speeding up your code ~1024x for big numbers.
Another option is to find the limit by shifting by 1 bit, then by 2, then by 4, 8, 16, 32, 64, ... until the result is bigger than your target number, and from there either bit-shift back down or use binary search (see the sketch after this answer). This one would be O(n.log2(n)) even without any upper limit.
However, all of these add overhead and will slow down the processing of smaller numbers.
Constructing 2^(y-1) < x <= 2^y might be possible to enhance too. For example, by using the bit-shifting approach to find y, you get your answer as a byproduct for free. With floating-point or fixed-point numbers you can construct such a number directly, by computing the exponent for 1 or by setting the correct bit in a zero. But for arbitrary-precision numbers (where the size of the number is dynamic) this is much harder/slower. So it all boils down to what kind of bignum class you have and what values you use.
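Here is a rough Python sketch of that doubling-then-binary-search idea (the helper name is mine; in CPython, int.bit_length() beats it in practice):

def next_pow2(x):
    # find the smallest power of 2 strictly greater than x
    hi = 1
    while (1 << hi) <= x:    # gallop: try exponents 1, 2, 4, 8, ... until past x
        hi <<= 1
    lo = hi >> 1             # the answer's exponent now lies in (lo, hi]
    while lo < hi:           # binary search for that exponent
        mid = (lo + hi) // 2
        if (1 << mid) <= x:
            lo = mid + 1
        else:
            hi = mid
    return 1 << hi

assert next_pow2(1009) == 1024
assert next_pow2(1024) == 2048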
I was writing a program where I need to calculate insanely huge numbers.
k = int(input())
print(int((2**k) * 5 % (10**9 + 7)))
Here, k is of the order of 10**9.
As expected, this was rather slow (taking up to 5 seconds to calculate), whereas my program needs to finish computing in 1 second.
After a little research online I found the function pow(), and wrote:
p = 10**9 + 7
print(int(pow(2, k - 1, p) * 10))
This works fine for small numbers but messes up at large numbers. I can understand why that is happening (because this isn't essentially what I want to calculate, and the modulus operation with such a large number doesn't affect the calculation for small values of k).
I also found libraries like gmpy2 and numpy, but I don't know how to use them since I'm just a beginner with Python.
So how can I write an expression for what I want to calculate that runs fast enough and doesn't err at large numbers either?
You can optimize your operation by passing the modulus as the third argument of the builtin pow, multiplying the result by 5, and then reducing once more so the product drops back below the modulus:

def func(k):
    p = pow(10, 9) + 7
    # pow(2, k, p) never materializes the full 2**k; the trailing % p
    # keeps the result below p after the multiplication by 5
    return pow(2, k, p) * 5 % p
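A quick sanity check with a small exponent (2**10 * 5 = 5120, which is already below the modulus):

>>> func(10)
5120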
I was solving a problem I came across: what is the sum of the powers of 3 from 0 to 2009, mod 8?
I got an answer using pen and paper, and tried to verify it with some simple Python:
print(sum(3**k for k in range(2010)) % 8)
I was surprised by how quickly it returned an answer. My question is what optimisations or tricks are used by the interpreter to get the answer so quickly?
None, it's just not a lot of computation for a computer to do.
Your code is equivalent to:
>>> a = sum(3**k for k in range(2010))
>>> a % 8
4
a is a 959-digit number - it's just not a large task to ask of a computer.
Try sticking two zeros on the end of the 2010 and you will see it taking an appreciable amount of time.
The only optimization at work is that each instance of 3**k is evaluated using a number of multiplications proportional to the number of bits in k (it does not multiply 3 by itself k-1 times).
As already noted, if you boost 2010 to 20100 or 201000 or ..., it will take much longer, because 3**k becomes very large. However, in those cases you can speed it enormously again by rewriting it as, e.g.,
print(sum(pow(3, k, 8) for k in range(201000)) % 8)
Internally, pow(3, k, 8) still does a number of multiplications proportional to the number of bits in k, but doesn't need to retain any integers internally larger than about 8**2 (the square of the modulus).
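A minimal sketch of how a three-argument pow can keep its intermediates small (the same principle, though not CPython's actual code):

def pow_mod(base, exp, mod):
    # square-and-multiply with a reduction after every step, so no
    # intermediate ever exceeds (mod - 1) ** 2 before being reduced
    result = 1
    base %= mod
    while exp:
        if exp & 1:
            result = result * base % mod
        base = base * base % mod
        exp >>= 1
    return result

assert pow_mod(3, 2009, 8) == pow(3, 2009, 8)  # both give 3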
No fancy optimizations are responsible for the fast response you observed. Computers are just a lot faster in absolute terms than you expected.
This question is a parallel to python - How do I decompose a number into powers of 2?. Indeed, it is the same question, but rather than using Python (or Javascript, or C++, as these also seem to exist), I'm wondering how it can be done using Lua. I have a very basic understanding of Python, so I took the code first listed in the site above and attempted to translate it to Lua, with no success. Here's the original, and following, my translation:
Python
def myfunc(x):
    powers = []
    i = 1
    while i <= x:
        if i & x:
            powers.append(i)
        i <<= 1
    return powers
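For reference, a quick check of the Python version (12 decomposes as 4 + 8):

>>> myfunc(12)
[4, 8]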
Lua
function powerfind(n)
    local powers = {}
    i = 1
    while i <= n do
        if bit.band(i, n) then -- bitwise and check
            table.insert(powers, i)
        end
        i = bit.shl(i, 1) -- bitwise shift to the left
    end
    return powers
end
Unfortunately, my version locks up and "runs out of memory". This was after using the number 12 as a test. It's more than likely that my primitive knowledge of Python is failing me and I'm not able to translate the code from Python to Lua correctly, so hopefully someone can offer a fresh set of eyes and help me fix it.
Thanks to the comments from user2357112, I've got it fixed: in Lua, 0 is truthy (only nil and false are falsy), so the result of bit.band has to be compared against 0 explicitly. I'm posting the answer in case anyone else comes across this issue:
function powerfind(n)
    local powers = {}
    i = 1
    while i <= n do
        if bit.band(i, n) ~= 0 then -- bitwise and check
            table.insert(powers, i)
        end
        i = bit.shl(i, 1) -- bitwise shift to the left
    end
    return powers
end
I saw that in the other question it became a sort of speed contest. This one should also be easy to understand.
i is the current power; it isn't used for calculations.
n is the current position in the array.
r is the remainder after division of x by two.
If the remainder is 1, then you know that i is a power of two which is used in the binary representation of x.
local function powerfind(x)
    local powers = {
        nil, nil, nil, nil,
        nil, nil, nil, nil,
        nil, nil, nil, nil,
        nil, nil, nil, nil,
    }
    local i, n = 1, 0
    while x ~= 0 do
        local r = x % 2
        if r == 1 then
            x, n = x - 1, n + 1
            powers[n] = i
        end
        x, i = x / 2, 2 * i
    end
    return powers
end
Running a million iterations, x from 1 to 1000000, takes me 0.29 seconds. I initialize the size of the powers table to 16.