How can I handle very high exponentials in Python? - python

I'm working on an RSA algorithm in Python. My code runs smoothly whenever I use small prime values such as 17 and 41; no problems happen.
However, if I try primes over 1000, the code "stops" running (I mean, it gets stuck in the exponentiation) and doesn't return.
The part where the problem lies is this:
msg = ''
msg = msg + ''.join(chr((char ** d) % n) for char in array)
I know that exponentiations with numbers like d = 32722667 and char = 912673 require a lot of computing, but I guess it shouldn't take more than 5 or 10 minutes without returning anything.
My computer is an i3-6006U with 4 GB of RAM.

If you are interested in the result of x**y % z rather than x**y, you might harness the built-in function pow. From help(pow):
Help on built-in function pow in module builtins:
pow(x, y, z=None, /)
Equivalent to x**y (with two arguments) or x**y % z (with three arguments)
Some types, such as ints, are able to use a more efficient algorithm when
invoked using the three argument form.
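For instance, a minimal sketch using the exponent from the question (the modulus n below is a made-up illustration value, not from the question):

d = 32722667
c = 912673              # an example ciphertext value from the question
n = 18923626564169      # hypothetical modulus, purely for illustration
print(pow(c, d, n))     # returns almost instantly: modular exponentiation by repeated squaring
# (c ** d) % n would first build an astronomically large intermediate integer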

What you're trying to do is calculate a modular exponentiation. You should not compute the full exponentiation first and then take the modulo; there are more efficient algorithms, as you can read on Wikipedia. In Python you can use the built-in pow function for modular exponentiation, as was already mentioned.
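For illustration, here is a minimal sketch of the square-and-multiply idea that three-argument pow uses conceptually (the function name mod_exp is mine, not part of any library):

def mod_exp(base, exp, mod):
    # Right-to-left binary (square-and-multiply) modular exponentiation.
    # Every intermediate value stays smaller than mod**2.
    result = 1
    base %= mod
    while exp > 0:
        if exp & 1:                      # current bit of the exponent is set
            result = (result * base) % mod
        base = (base * base) % mod       # square for the next bit
        exp >>= 1
    return result

print(mod_exp(912673, 32722667, 2**61 - 1) == pow(912673, 32722667, 2**61 - 1))  # True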

Related

Check if float is an integer: is_integer() vs. modulo 1

I've seen a number of questions asking how to check if a float is an integer. The majority of answers seem to recommend using is_integer():
(1.0).is_integer()
(1.55).is_integer()
I have also occasionally seen math.floor() being used:
import math
1.0 == math.floor(1.0)
1.55 == math.floor(1.55)
I'm wondering why % 1 is rarely used or recommended?
1.0 % 1 == 0
1.55 % 1 == 0
Is there a problem with using modulo for this purpose? Are there edge cases that this doesn't catch? Performance issues for really large numbers?
If % 1 is a fine alternative, then I'm also wondering why is_integer() was introduced to the standard library?
It seems that % is much more flexible. For example, it's common to use % 2 to check if a number is odd or even, or % n to check if something is a multiple of n (see the sketch below). Given this flexibility, why introduce a new method (is_integer) that does the same thing, or use math.floor, both of which require knowing/remembering that they exist and knowing how to use them? I know that math.floor has uses beyond just integer checking, but still...
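To illustrate the flexibility point, a small sketch (Fraction and Decimal are just types I picked as examples, not from the answers below):

from fractions import Fraction
from decimal import Decimal

print(7 % 2 == 1)                  # odd/even check on an int
print(15 % 5 == 0)                 # multiple-of-n check
print(Fraction(6, 3) % 1 == 0)     # integer check works on Fraction...
print(Decimal("2.0") % 1 == 0)     # ...and on Decimal
print((2.0).is_integer())          # while is_integer() is a float method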
All are valid for the purpose. The math.floor option requires an exact match between a specific value and the result of the floor function, which is not very convenient if you want to encapsulate it in a generic method. So it boils down to the first and third options. Both are valid and will do the job, so the key difference is simple: performance:
from timeit import Timer

def with_isint(num):
    return num.is_integer()

def with_mod(num):
    return num % 1 == 0

Timer(lambda: with_isint(10.0)).timeit(number=10000000)
#output: 2.0617980659008026
Timer(lambda: with_mod(10.0)).timeit(number=10000000)
#output: 2.6560597440693527
Naturally this is a simple operation so you'd need a lot of calls in order to see a considerable difference, as you can see in the example.
One soft reason is definitely: readability
If a function called is_integer() returns True, it is obvious what you have been testing.
However, using the modulo solution, one has to think through the process to see, that it is actually testing if a float is an integer. If you wrap your modulo formalism in a function with an obvious name such as simon_says_its_an_integer(), I think it's just as fine (apart from needlessly introducing an already existing function).

Inaccurate Large Fibonacci Numbers in Python

I am currently implementing this simple code trying to find the n-th element of the Fibonacci sequence using Python 2.7:
import numpy as np

def fib(n):
    F = np.empty(n+2)
    F[1] = 1
    F[0] = 0
    for i in range(2, n+1):
        F[i] = F[i-1] + F[i-2]
    return int(F[n])
This works fine for n < 79, but after that I get wrong numbers. For example, according to Wolfram Alpha F79 should be equal to 14472334024676221, but fib(79) gives me 14472334024676220. I think this could be caused by the way Python deals with integers, but I have no idea what exactly the problem is. Any help is greatly appreciated!
The default data type for an array created with np.empty is a 64-bit float, which can only represent integers exactly up to 2**53; anything larger gets rounded.
Pure Python would let you have arbitrarily long integers; numpy does not.
So it's more the way numpy deals with numbers; pure Python would do just fine.
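A quick sketch of what happens (the values are my own illustration):

import numpy as np

F = np.empty(5)
print(F.dtype)                    # float64: the default for np.empty
print(int(float(2**53 + 1)))      # 9007199254740992, not ...93: exactness ends at 2**53
# F79 = 14472334024676221 > 2**53, so storing it in a float64 array rounds it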
Python will deal with integers perfectly fine here. Indeed, that is the beauty of Python. numpy, on the other hand, introduces ugliness and just happens to be completely unnecessary here, and will likely slow you down. Your implementation will also require much more space. Python allows you to write beautiful, readable code. Here is Raymond Hettinger's canonical implementation of iterative Fibonacci in Python:
def fib(n):
    x, y = 0, 1
    for _ in range(n):
        x, y = y, x + y
    return x
That is O(n) time and constant space. It is beautiful, readable, and succinct. It will also give you the correct integer as long as you have memory to store the number on your machine. Learn to use numpy when it is the appropriate tool, and as importantly, learn to not use it when it is inappropriate.
Unless you want to generate a list with all the Fibonacci numbers up to Fn, there is no need to use a list, numpy, or anything else like that. A simple loop and 2 variables are enough, as you only really need to know the 2 previous values:
def fib(n):
    Fk, Fk1 = 0, 1
    for _ in range(n):
        Fk, Fk1 = Fk1, Fk + Fk1
    return Fk
Of course, there are better ways to do it using the mathematical properties of the Fibonacci numbers. With those, we know that there is a matrix that gives us the right result:
import numpy

def fib_matrix(n):
    mat = numpy.matrix([[1, 1], [1, 0]], dtype=object) ** n
    return mat[0, 1]
which I assume uses an optimized matrix exponentiation, making it more efficient than the previous method.
Using the properties of the underlying Lucas sequence, it is possible to do it without the matrix, just as efficiently as exponentiation by squaring and with the same number of variables as the first example, but it is a little harder to understand at first glance because it requires more mathematical background (see the sketch below).
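For reference, a sketch of that idea via the fast-doubling identities F(2k) = F(k)*(2*F(k+1) - F(k)) and F(2k+1) = F(k)**2 + F(k+1)**2 (this is my own illustration, not code from the thread):

def fib_fast_doubling(n):
    # Returns (F(n), F(n+1)) using O(log n) integer multiplications.
    if n == 0:
        return (0, 1)
    a, b = fib_fast_doubling(n // 2)   # a = F(k), b = F(k+1) with k = n // 2
    c = a * (2 * b - a)                # F(2k)
    d = a * a + b * b                  # F(2k+1)
    if n % 2 == 0:
        return (c, d)
    return (d, c + d)

print(fib_fast_doubling(79)[0])   # 14472334024676221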
The closed form, the one with the golden ratio, will give you the result even faster, but it has the risk of being inaccurate because of the use of floating-point arithmetic.
As an additional note to the previous answer by hiro protagonist: if using numpy is a requirement, you can solve your issue very easily by replacing:
F = np.empty(n+2)
with
F = np.empty(n+2, dtype=object)
but this does nothing more than transfer the computation back to pure Python.

Why is Python's built-in multiplication so fast [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
So the other day I was trying something in Python: writing a custom multiplication function.
def multi(x, y):
    z = 0
    while y > 0:
        z = z + x
        y = y - 1
    return z
However, when I ran it with extremely large numbers like (1 << 90) and (1 << 45), i.e. computing (2**90) * (2**45), it took forever to compute.
So I tried looking into different types of multiplication, like the Russian peasant multiplication technique, implemented below, which was extremely fast but not as readable as multi(x, y):
def russian_peasant(x, y):
    z = 0
    while y > 0:
        if y % 2 == 1:
            z = z + x
        x = x << 1
        y = y >> 1
    return z
What I want you to answer is: how do programming languages like Python multiply numbers?
Your multi version runs in O(N) whereas russian_peasant version runs in O(logN), which is far better than O(N).
To realize how fast your russian_peasant version is, check this out
from math import log
print round(log(100000000, 2)) # 27.0
So, the loop has to be executed just 27 times, but your multi version's while loop has to be executed 100000000 times, when y is 100000000.
To answer your other question,
What I want you to answer is how do programming languages like python
multiply numbers ?
Python uses the O(N^2) grade-school multiplication algorithm for small numbers, but for big numbers it uses the Karatsuba algorithm.
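As an illustration of the Karatsuba idea, here is a pure-Python sketch (my own example; CPython's real implementation lives in C and works on 30-bit "digits" rather than bits):

def karatsuba(x, y):
    # Multiply non-negative ints with three recursive multiplications instead
    # of four: x*y = ac*2**(2h) + ((a+b)*(c+d) - ac - bd)*2**h + bd
    if x < 16 or y < 16:                     # small operands: fall back to *
        return x * y
    h = max(x.bit_length(), y.bit_length()) // 2
    a, b = x >> h, x & ((1 << h) - 1)        # x = a*2**h + b
    c, d = y >> h, y & ((1 << h) - 1)        # y = c*2**h + d
    ac = karatsuba(a, c)
    bd = karatsuba(b, d)
    mid = karatsuba(a + b, c + d) - ac - bd  # equals a*d + b*c
    return (ac << (2 * h)) + (mid << h) + bd

print(karatsuba(1 << 90, 1 << 45) == (1 << 90) * (1 << 45))   # True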
Basically multiplication is handled in C code, which can be compiled to machine code and executed faster.
Programming languages like Python use the multiplication instruction provided by your computer's CPU.
In addition, you have to remember that Python is a very high-level programming language, which runs on a virtual machine that itself runs on your computer. As such, it is inherently a few orders of magnitude slower than native code. Translating your algorithm to assembly (or even to C) would result in a massive speedup, although it'd still be slower than the CPU's multiplication operation.
On the plus side, unlike naive assembly/C, Python auto-promotes integers to bignums instead of overflowing when your numbers are bigger than 2**32.
The basic answer to your question is this: multiplication using * is handled through C code. In essence, if you write something in pure Python, it's going to be slower than the C implementation. Let me give you an example.
The operator.mul function is implemented in C, but a lambda is implemented in Python. We're going to try to find the product of all the numbers in an array using functools.reduce, with two cases: one using operator.mul and another using a lambda, which both do the same thing (on the surface):
from timeit import timeit
setup = """
from functools import reduce
from operator import mul
"""
print(timeit('reduce(mul, range(1, 10))', setup=setup))
print(timeit('reduce(lambda x, y: x * y, range(1, 10))', setup=setup))
Output:
1.48362842561
2.67425475375
operator.mul takes less time, as you can see.
Usually, functional programming involving many computations can be made to take less time using memoization. The basic idea is that if you feed a true function (something that always evaluates to the same result for a given argument) the same argument twice or more, you're wasting time, time that could easily be saved by identifying common calls and storing whatever they evaluate to in a hash table or other quickly-accessible object. See https://en.wikipedia.org/wiki/Memoization for the basic theory. It is well implemented in Common Lisp.
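In Python specifically, a minimal sketch of that idea is functools.lru_cache (the Fibonacci function here is just a stand-in example):

from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # Each fib(k) is computed once; repeated calls are answered from the cache,
    # turning the naive exponential recursion into linear time.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(90))   # 2880067194370816120, computed instantly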

Exponentiation in Python - should I prefer ** operator instead of math.pow and math.sqrt? [duplicate]

This question already has answers here:
Which is faster in Python: x**.5 or math.sqrt(x)?
(15 answers)
Closed 9 years ago.
In my field it's very common to square some numbers, operate on them together, and take the square root of the result. This is done in the Pythagorean theorem and the RMS calculation, for example.
In numpy, I have done the following:
result = numpy.sqrt(numpy.sum(numpy.power(some_vector, 2)))
And in pure python something like this would be expected:
result = math.sqrt(math.pow(A, 2) + math.pow(B,2)) # example with two dimensions.
However, I have been using this pure python form, since I find it much more compact, import-independent, and seemingly equivalent:
result = (A**2 + B**2)**0.5 # two dimensions
result = (A**2 + B**2 + C**2 + D**2)**0.5
I have heard some people argue that the ** operator is sort of a hack, and that taking the square root of a number by raising it to the power 0.5 is not so readable. But what I'd like to ask is:
"Is there any COMPUTATIONAL reason to prefer the former two alternatives over the third one(s)?"
Thanks for reading!
math.sqrt is the C implementation of square root and is therefore different from using the ** operator, which goes through Python's built-in pow machinery. Thus, using math.sqrt may actually give a different answer than using the ** operator, and there is indeed a computational reason to prefer the numpy or math module implementation over the built-in one. Specifically, the sqrt functions are probably implemented in the most efficient way possible, whereas ** operates over a large range of bases and exponents and is probably unoptimized for the specific case of square root. On the other hand, the built-in pow handles a few extra cases like "complex numbers, unbounded integer powers, and modular exponentiation".
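A quick sketch of those extra cases (my own examples, not from the linked question):

import math

print(2 ** 1024)          # exact arbitrary-precision integer
# math.pow(2, 1024) raises OverflowError: it works on C doubles only
print(pow(3, 5, 7))       # 5: three-argument modular form, which math.pow lacks
print((-8) ** (1 / 3))    # complex result in Python 3; math.pow(-8, 1/3) raises ValueError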
See this Stack Overflow question for more information on the difference between ** and math.sqrt.
In terms of which is more "Pythonic", I think we need to discuss the very definition of that word. The official Python glossary states that a piece of code or idea is Pythonic if it "closely follows the most common idioms of the Python language, rather than implementing code using concepts common to other languages." In every single other language I can think of, there is some math module with basic square root functions. However, there are languages that lack a power operator like **, e.g. C++. So ** is probably more Pythonic, but whether or not it's objectively better depends on the use case.
Even in base Python you can do the computation in generic form:
result = sum(x**2 for x in some_vector) ** 0.5
x ** 2 is surely not a hack, and the computation performed is the same (I checked the CPython source code). I actually find it more readable (and readability counts).
Using x ** 0.5 to take the square root doesn't do exactly the same computation as math.sqrt, as the former is (probably) computed using logarithms and the latter (probably) uses the dedicated square-root instruction of the math processor.
I often use x ** 0.5 simply because I don't want to import math just for that. I'd expect, however, a dedicated square-root instruction to work better (more accurately) than a multi-step operation with logarithms.
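A quick sanity check one can run (timings will vary by machine and Python version):

import math
from timeit import timeit

x = 2.0
print(math.sqrt(x) == x ** 0.5)   # usually True, though bit-identical results aren't guaranteed
print(timeit('math.sqrt(x)', setup='import math; x = 2.0'))
print(timeit('x ** 0.5', setup='x = 2.0'))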

Which programming language or a library can process Infinite Series?

Which programming language or library is able to process infinite series (like geometric or harmonic)? It should perhaps have a database of well-known series, automatically give the proper value in case of convergence, and maybe raise an exception in case of divergence.
For example, in Python it could look like:
sum = 0
sign = -1.0
for i in range(1, Infinity, 2):
    sign = -sign
    sum += sign / i
then, sum must be math.pi/4 without doing any computations in the loop (because it's a well-known sum).
Most functional languages which evaluate lazily can simulate the processing of infinite series. Of course, on a finite computer it is not possible to process infinite series, as I am sure you are aware. Off the top of my head, I guess Mathematica can do most of what you might want, I suspect that Maple can too, maybe Sage and other computer-algebra systems and I'd be surprised if you can't find a Haskell implementation that suits you.
EDIT to clarify for OP: I do not propose generating infinite loops. Lazy evaluation allows you to write programs (or functions) which simulate infinite series, programs which themselves are finite in time and space. With such languages you can determine many of the properties, such as convergence, of the simulated infinite series with considerable accuracy and some degree of certainty. Try Mathematica or, if you don't have access to it, try Wolfram Alpha to see what one system can do for you.
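In Python itself, generators give a rough analogue of that laziness. The sketch below is my own illustration: it numerically approaches the pi/4 sum from the question rather than recognizing it symbolically.

from itertools import islice

def leibniz_terms():
    # Lazily yields 1, -1/3, 1/5, -1/7, ... forever; nothing is computed
    # until a consumer asks for terms.
    k, sign = 1, 1.0
    while True:
        yield sign / k
        k += 2
        sign = -sign

print(4 * sum(islice(leibniz_terms(), 1000000)))   # ~3.1415916, approaching math.pi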
One place to look might be the Wikipedia category of Computer Algebra Systems.
There are two tools available in Haskell for this beyond simply supporting infinite lists.
First, there is a module that supports looking up sequences in the OEIS. This can be applied to the first few terms of your series and can help you identify a series for which you don't know the closed form, etc. The other is the 'CReal' library of computable reals. If you have the ability to generate an ever-improving bound on your value (e.g. by summing over a prefix), you can declare that as a computable real number, which admits a partial ordering, etc. In many ways this gives you a value that you can use like the sum above.
However in general computing the equality of two streams requires an oracle for the halting problem, so no language will do what you want in full generality, though some computer algebra systems like Mathematica can try.
Maxima can calculate some infinite sums, but in this particular case it doesn't seem to find the answer :-s
(%i1) sum((-1)^k/(2*k), k, 1, inf), simpsum;
(%o1) ('sum((-1)^k/k, k, 1, inf))/2
but, for example, these work:
(%i2) sum(1/(k^2), k, 1, inf), simpsum;
(%o2) %pi^2/6
(%i3) sum((1/2^k), k, 1, inf), simpsum;
(%o3) 1
You can solve the series problem in Sage (a free Python-based math software system) exactly as follows:
sage: k = var('k'); sum((-1)^k/(2*k+1), k, 1, infinity)
1/4*pi - 1
Behind the scenes, this is really using Maxima (a component of Sage).
For Python, check out SymPy, a symbolic mathematics library in the spirit of Mathematica.
There is also a heavier Python-based math-processing tool called Sage.
You need something that can do symbolic computation, like Mathematica.
You can also consider querying Wolfram Alpha: sum((-1)^i*1/i, i, 1, inf)
There is a Python library called mpmath, used by sympy, which provides the series support for sympy (I believe it also backs Sage).
More specifically, all of the series functionality can be found here: Series documentation
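For example, a small sketch with mpmath's nsum (it evaluates the series numerically with convergence acceleration rather than returning a symbolic closed form):

from mpmath import nsum, inf

# Alternating harmonic series: 1 - 1/2 + 1/3 - ... = log(2)
print(nsum(lambda k: (-1)**(k + 1) / k, [1, inf]))   # 0.693147180559945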
The C++ iRRAM library performs real arithmetic exactly. Among other things it can compute limits exactly using the limit function. The homepage for iRRAM is here. Check out the limit function in the documentation. Note that I'm not talking about arbitrary precision arithmetic. This is exact arithmetic, for a sensible definition of exact. Here's their code to compute e exactly, pulled from the example on their web site:
//---------------------------------------------------------------------
// Compute an approximation to e=2.71.. up to an error of 2^p
REAL e_approx (int p)
{
    if ( p >= 2 ) return 0;
    REAL y=1, z=2;
    int i=2;
    while ( !bound(y, p-1) ) {
        y = y/i;
        z = z+y;
        i += 1;
    }
    return z;
};

//---------------------------------------------------------------------
// Compute the exact value of e=2.71..
REAL e()
{
    return limit(e_approx);
};
Clojure and Haskell, off the top of my head.
Sorry I couldn't find a better link to Haskell's sequences; if someone else has it, please let me know and I'll update.
Just install sympy on your computer, then run the following code:
from sympy.abc import k
from sympy import Sum, oo

Sum((-1)**k/(2*k+1), (k, 0, oo)).doit()
Result will be: pi/4
I have worked with a couple of huge data series for research purposes, using Matlab for that. I don't know whether it can process infinite series, but I think there is a possibility. You can try :)
This can be done in, for instance, sympy and Sage (among open-source alternatives). In the following, a few examples using sympy:
In [10]: summation(1/k**2, (k, 1, oo))
Out[10]: pi**2/6

In [11]: summation(1/k**4, (k, 1, oo))
Out[11]: pi**4/90

In [12]: summation((-1)**k/k, (k, 1, oo))
Out[12]: -log(2)

In [13]: summation((-1)**(k+1)/k, (k, 1, oo))
Out[13]: log(2)
Behind the scenes, this uses the theory of hypergeometric series; a nice introduction is the book "A=B" by Marko Petkovšek, Herbert S. Wilf and Doron Zeilberger, which you can find by googling. What is a hypergeometric series?
Everybody knows what a geometric series is: $x_1, x_2, x_3, \dots, x_k, \dots$ is geometric if the ratio of consecutive terms $x_{k+1}/x_k$ is constant. It is hypergeometric if the ratio of consecutive terms is a rational function in $k$! sympy can handle basically all infinite sums where this last condition is fulfilled, but only very few others (see the sketch below).
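A small sketch of that criterion in sympy itself (the series 1/k! is my example; its consecutive-terms ratio is the rational function 1/(k+1), so sympy can sum it):

from sympy import symbols, factorial, combsimp, summation, oo

k = symbols('k', positive=True)
term = 1 / factorial(k)

print(combsimp(term.subs(k, k + 1) / term))   # 1/(k + 1): rational in k, so hypergeometric
print(summation(term, (k, 1, oo)))            # e - 1, since the sum of 1/k! from k=0 is e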
