The pow(a, x, c) function in Python returns (a**x) % c. If I have the values of a and c and the result of this operation, how can I find the value of x?
Additionally, this is all the information I have
pow(a,x,c) = pow(d,e,c)
Where I know the value of a,c,d, and e.
These numbers are very large (a = 814779647738427315424653119, d = 3, e = 40137673778629769409284441239, c = 1223334444555556666667777777), so I cannot just compute these values directly.
I'm aware of Carmichael's lambda function, which can be used to solve for a, but I am not sure whether or how it applies to solving for x.
Any help will be appreciated.
As @user2357112 says in the comments, this is the discrete logarithm problem, which is computationally very difficult for large c, and no fast general solution is known.
However, for small c there are still some things you can do. Given that a and c are coprime, there is an exponent k < c such that a^k = 1 mod c, after which the powers repeat. Let b = a^x mod c. If you brute-force it by computing successive powers of a until you reach b, you loop at most c times:
def do_log(a, b, c):
    x = 1
    p = a
    while p != b and p != 1:
        x += 1
        p *= a
        p %= c
    if p == b:
        return x
    else:
        return None # no such x
If you run this calculation multiple times with the same a, you can do even better.
# a, c constant
p_to_x = {1: 0}
x = 1
p = a
while p != 1:
    p_to_x[p] = x
    x += 1
    p *= a
    p %= c
def do_log_a_c(b):
    return p_to_x[b]
Here a cache is made in a loop running at most c times and the cache is accessed in the log function.
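As a quick sanity check of the brute-force idea, here is a minimal sketch with small numbers that verifies the recovered exponent against the built-in pow. The name brute_force_dlog and the sample values are illustrative assumptions, not taken from the question, and the approach is only practical for small c.

```python
def brute_force_dlog(a, b, c):
    """Return the smallest x >= 1 with pow(a, x, c) == b, or None.

    Hypothetical helper for illustration; it takes O(c) steps, so it is
    only practical for small c.
    """
    p = a % c
    for x in range(1, c + 1):
        if p == b:
            return x
        p = (p * a) % c
    return None


# Small example: recover the exponent 5 from 3**5 % 17.
a, c = 3, 17
b = pow(a, 5, c)
x = brute_force_dlog(a, b, c)
print(x, pow(a, x, c) == b)  # prints: 5 True
```

For the 27-digit modulus in the question this loop would never finish, which is the point of the remark about large c.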
I'm trying to fit some pairs of x, y data with a quadratic regression; a sample formula can be found at http://polynomialregression.drque.net/math.html.
Following is my code, which does the regression both with that explicit formula and with NumPy's built-in functions:
import numpy as np
x = [6.230825,6.248279,6.265732]
y = [0.312949,0.309886,0.306639472]
toCheck = x[2]
def evaluateValue(coeff,x):
    c,b,a = coeff
    val = np.around( a+b*x+c*x**2,9)
    act = 0.306639472
    error= np.abs(act-val)*100/act
    print "Value = {:.9f} Error = {:.2f}%".format(val,error)
###### USing numpy######################
coeff = np.polyfit(x,y,2)
evaluateValue(coeff, toCheck)
################# Using explicit formula
def determinant(a,b,c,d,e,f,g,h,i):
    # the matrix is [[a,b,c],[d,e,f],[g,h,i]]
    return a*(e*i - f*h) - b*(d*i - g*f) + c*(d*h - e*g)
a = b = c = d = e = m = n = p = 0
a = len(x)
for i,j in zip(x,y):
    b += i
    c += i**2
    d += i**3
    e += i**4
    m += j
    n += j*i
    p += j*i**2
det = determinant(a,b,c,b,c,d,c,d,e)
c0 = determinant(m,b,c,n,c,d,p,d,e)/det
c1 = determinant(a,m,c,b,n,d,c,p,e)/det
c2 = determinant(a,b,m,b,c,n,c,d,p)/det
evaluateValue([c2,c1,c0], toCheck)
######Using another explicit alternative
def determinantAlt(a,b,c,d,e,f,g,h,i):
    return a*e*i - a*f*h - b*d*i +b*g*f + c*d*h - c*e*g # <- brackets removed
a = b = c = d = e = m = n = p = 0
a = len(x)
for i,j in zip(x,y):
    b += i
    c += i**2
    d += i**3
    e += i**4
    m += j
    n += j*i
    p += j*i**2
det = determinantAlt(a,b,c,b,c,d,c,d,e)
c0 = determinantAlt(m,b,c,n,c,d,p,d,e)/det
c1 = determinantAlt(a,m,c,b,n,d,c,p,e)/det
c2 = determinantAlt(a,b,m,b,c,n,c,d,p)/det
evaluateValue([c2,c1,c0], toCheck)
This code gives this output
Value = 0.306639472 Error = 0.00%
Value = 0.308333580 Error = 0.55%
Value = 0.585786477 Error = 91.03%
As you can see, these differ from each other, and the third one is totally wrong. Now my questions are:
1. Why does the explicit formula give a slightly wrong result, and how can I improve it?
2. How does numpy give such an accurate result?
3. In the third case, how can the result change so drastically just by expanding the parentheses?
A few things are going on here that are unfortunately working against the way you are doing this. Take a look at this code:
for i,j in zip(x,y):
    b += i
    c += i**2
    d += i**3
    e += i**4
    m += j
    n += j*i
    p += j*i**2
You are building features such that the x values are not only squared, but cubed and fourth powered.
If you print out each of these values before you put them into the 3 x 3 matrix to solve:
In [35]: a = b = c = d = e = m = n = p = 0
...: a = len(x)
...: for i,j in zip(x,y):
...:     b += i
...:     c += i**2
...:     d += i**3
...:     e += i**4
...:     m += j
...:     n += j*i
...:     p += j*i**2
...: print(a, b, c, d, e, m, n, p)
...:
3 18.744836 117.12356813829001 731.8283056811686 4572.738547313946 0.9294744720000001 5.807505391292503 36.28641270376207
With floating-point arithmetic, the order of operations matters, especially when values of very different magnitudes are mixed. Here the sums of powers of x are large (up to about 4572), while the determinant built from them is very small, so it is computed as a tiny difference of large products and most of the significant digits cancel. That is why the factored and expanded forms of the determinant give different results; also look at the magnitude of the values:
In [36]: det = determinant(a,b,c,b,c,d,c,d,e)
In [37]: det
Out[37]: 1.0913403514223319e-10
In [38]: det = determinantAlt(a,b,c,b,c,d,c,d,e)
In [39]: det
Out[39]: 2.3283064365386963e-10
The determinant is on the order of 10^-10! In exact arithmetic both determinant formulas would give the same result, but in floating-point they differ because of error propagation. Only a finite number of bits represents each floating-point number, so the order of operations changes how rounding errors accumulate; removing the parentheses leaves the formula mathematically equivalent but changes the sequence of operations used to reach the result. This article is essential reading for any software developer who deals with floating-point arithmetic regularly: What Every Computer Scientist Should Know About Floating-Point Arithmetic.
Therefore, when you solve the system with Cramer's Rule, you divide by this main determinant. The absolute difference between the two methods is only on the order of 10^-10, but that is as large as the determinant itself, so the resulting coefficients come out very different.
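The dependence on operation order can be seen on a much smaller scale. The snippet below is a minimal illustration with IEEE-754 doubles (the variable names are arbitrary): the same three numbers are summed with two different groupings and the results are not equal.

```python
# Mathematically identical sums, grouped differently.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

print(left == right)  # False
print(left, right)    # 0.6000000000000001 0.6
```

The difference here is only about 1e-16, but when a quantity carrying that kind of error is itself about 1e-10 and is then used as a divisor, the relative error in the quotient becomes large.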
NumPy doesn't have this problem because it solves the system by least squares using the pseudo-inverse rather than Cramer's Rule. I would not recommend Cramer's Rule for finding regression coefficients, mostly from experience and because there are more robust ways of doing it.
However, to solve your particular problem, it's good to normalize the data so that its range is centered at 0. The features used to construct your coefficient matrix are then better scaled, and the computation has an easier time dealing with the data. In your case, something as simple as subtracting the mean of the x values from the data should work. Consequently, if you have new data points you want to predict, you must subtract the mean of the original x data from them before doing the prediction.
Therefore, at the beginning of your code, perform mean subtraction and regress on this data. I've marked where I modified your code:
import numpy as np
x = [6.230825,6.248279,6.265732]
y = [0.312949,0.309886,0.306639472]
# Calculate mean
me = sum(x) / len(x)
# Make new dataset that is mean subtracted
xx = [pt - me for pt in x]
#toCheck = x[2]
# Data point to check is now mean subtracted
toCheck = x[2] - me
def evaluateValue(coeff,x):
    c,b,a = coeff
    val = np.around( a+b*x+c*x**2,9)
    act = 0.306639472
    error= np.abs(act-val)*100/act
    print("Value = {:.9f} Error = {:.2f}%".format(val,error))
###### USing numpy######################
coeff = np.polyfit(xx,y,2) # Change
evaluateValue(coeff, toCheck)
################# Using explicit formula
def determinant(a,b,c,d,e,f,g,h,i):
    # the matrix is [[a,b,c],[d,e,f],[g,h,i]]
    return a*(e*i - f*h) - b*(d*i - g*f) + c*(d*h - e*g)
a = b = c = d = e = m = n = p = 0
a = len(x)
for i,j in zip(xx,y): # Change
    b += i
    c += i**2
    d += i**3
    e += i**4
    m += j
    n += j*i
    p += j*i**2
det = determinant(a,b,c,b,c,d,c,d,e)
c0 = determinant(m,b,c,n,c,d,p,d,e)/det
c1 = determinant(a,m,c,b,n,d,c,p,e)/det
c2 = determinant(a,b,m,b,c,n,c,d,p)/det
evaluateValue([c2,c1,c0], toCheck)
######Using another explicit alternative
def determinantAlt(a,b,c,d,e,f,g,h,i):
    return a*e*i - a*f*h - b*d*i +b*g*f + c*d*h - c*e*g # <- brackets removed
a = b = c = d = e = m = n = p = 0
a = len(x)
for i,j in zip(xx,y): # Change
    b += i
    c += i**2
    d += i**3
    e += i**4
    m += j
    n += j*i
    p += j*i**2
det = determinantAlt(a,b,c,b,c,d,c,d,e)
c0 = determinantAlt(m,b,c,n,c,d,p,d,e)/det
c1 = determinantAlt(a,m,c,b,n,d,c,p,e)/det
c2 = determinantAlt(a,b,m,b,c,n,c,d,p)/det
evaluateValue([c2,c1,c0], toCheck)
When I run this, I now get:
In [41]: run interp_test
Value = 0.306639472 Error = 0.00%
Value = 0.306639472 Error = 0.00%
Value = 0.306639472 Error = 0.00%
As some final reading, here is a similar problem that someone else encountered and that I addressed in their question: Fitting a quadratic function in python without numpy polyfit. In summary, I advised them not to use Cramer's Rule and to use least squares through the pseudo-inverse instead, and I showed them how to get exactly the same results without using numpy.polyfit. Least squares also generalizes: if you have more than 3 points, you can still fit a quadratic through them so that the model has the smallest possible squared error.
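For reference, a minimal sketch of a least-squares fit through the pseudo-inverse, combined with the mean subtraction described earlier, could look like the following. It uses np.vander and np.linalg.pinv; the helper name predict and the variable names are assumptions made for this illustration and are not taken from the linked answer.

```python
import numpy as np

x = np.array([6.230825, 6.248279, 6.265732])
y = np.array([0.312949, 0.309886, 0.306639472])

# Center x so the powers of x stay well scaled.
x_mean = x.mean()
xc = x - x_mean

# Design matrix with columns [x^2, x, 1] (highest power first).
A = np.vander(xc, 3)

# Least-squares coefficients via the Moore-Penrose pseudo-inverse.
coeffs = np.linalg.pinv(A) @ y


def predict(x_new):
    """Evaluate the fitted quadratic; new inputs are centered the same way."""
    return np.polyval(coeffs, x_new - x_mean)


print("Value = {:.9f}".format(predict(x[2])))
```

With more than three points the same code applies unchanged, since the pseudo-inverse returns the coefficients that minimize the squared error.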
I'm new to Python, and I'm trying to get familiar with it by solving problems on CodeChef. I'm attempting to solve the Easy problem Number Game. The issue is that the execution time is too long for my code.
I have translated the Python solution I wrote into C++, and the submission was accepted, so I know I have a correct answer, and it's just off by a constant multiple.
Is it possible to solve this problem in Python 3 in the allotted time? Can you help me speed up my code to accomplish this?
import time
def getStartValues(A, M):
    startVals = [0]*M
    b = [0]*len(A)
    for i in range(len(A)-1):
        b[i+1] = (10*b[i] + A[i]) % M
    f = 0
    power = 1
    for i in range(len(A)-1,0,-1):
        startVals[(b[i]*power + f) % M] += 1
        f = (A[i]*power + f) % M
        power = (power*10 % M)
    startVals[f] += 1
    return startVals, power
def checkValues(i, startVals, M, powNm1, checked, chklst):
    if checked[i] == 1:
        return startVals[i]
    q = [i]
    chk = [0]*M
    chk[i] = 1
    while len(q) > 0:
        val = q.pop(0)
        for j in chklst:
            val2 = (powNm1*val + j) % M
            if checked[val2] > 0:
                checked[i] = 1
                return startVals[i]
            elif chk[val2] == 0:
                q.append(val2)
                chk[val2] = 1
    return 0
def compute(A, M):
    startVals, power = getStartValues(A, M)
    checked = [0]*M
    checked[0] = 1
    chklst = [j for j in range(M) if startVals[j] > 0]
    total = 0
    for i in chklst:
        c = checkValues(i, startVals, M, power, checked, chklst)
        total += c
    return total
start = time.time()
file = open('numbgame.in', 'r')
#T = int(input())
T = int(file.readline())
for i in range(T):
    #A, M = input().split()
    A, M = file.readline().split()
    A = list(map(int,A))
    M = int(M)
    print(compute(A, M))
tDiff = time.time() - start
print('Total time: %s' % tDiff)
Note that I have modified the code to read from a file and to display execution time, as a convenience, and some small alterations are needed before it can be submitted.
getStartValues takes in the (big) list of digits of the input A and the (small) integer M and returns the values modulo M that can be generated from A by removing a single digit.
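To make the intent of getStartValues concrete, here is a small brute-force sketch of the same quantity: for every residue modulo M, it counts how many single-digit deletions of A produce that residue. The function name start_values_brute_force is a hypothetical helper for cross-checking on tiny inputs only; for this sample input, the first return value of getStartValues works out to the same counts.

```python
def start_values_brute_force(digits, m):
    """Count, per residue mod m, the numbers obtained by deleting one digit.

    Hypothetical reference helper: it rebuilds each number from scratch, so
    it is far too slow for 10^6 digits and is meant only for cross-checking.
    """
    counts = [0] * m
    for i in range(len(digits)):
        remaining = digits[:i] + digits[i + 1:]
        value = int(remaining) if remaining else 0
        counts[value % m] += 1
    return counts


# Deleting one digit from "123" gives 23, 13 and 12, i.e. 2, 6 and 5 mod 7.
print(start_values_brute_force("123", 7))  # [0, 0, 1, 0, 0, 1, 1]
```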
checkValues takes an index i, the list startVals, the integer M, the integer powNm1 (which is the value 10^(n-1) mod M, where n is the number of digits in A), a list checked that keeps track of whether a value has already been determined to be solvable, and the list chklst (which contains the indices i such that startVals[i] > 0).
The majority of the time is spent in the function getStartValues, since A could be up to 10^6 digits long. On my desktop, the getStartValues function call takes about 1.2s, while the rest of the compute function takes about 0.04s (for worst case inputs).
I'm a beginner in Python and am trying to take MIT 6.00; the page provided is the assignments page.
I'm at assignment 2, where I have to find a solution to a Diophantine equation. I'm really not that great at math, so I tried to understand what it does as much as I can and to think of a solution for it.
Here's what I got to:
def test(x):
    for a in range(1,150):
        for b in range(1,150):
            for c in range(1,150):
                y = 6*a+9*b+20*c
                if y == x:
                    print "this --> " , a, b, c
                    break
                else : ##this to see how close i was to the number
                    if y - x < 3:
                        print a, b, c , y
The assignment states that there's a solution for 50, 51, 52, 53, 54, and 55, but unfortunately the script only finds the solutions for 50, 53 and 55.
I'd be very grateful if someone explained what's wrong in my code. Or, if I'm not understanding Diophantine equations at all, please tell me what they are about and how to find a solution, since I can't get the assignment's explanation into my head.
Thanks.
The assignment says:
"To determine if it is possible to buy exactly n McNuggets, one has to solve a Diophantine equation: find non-negative integer values of a, b, and c, such that 6a + 9b + 20c = n."
It seems that you have to include zero in the ranges of your function. That way, you can find solutions for all the numbers you need.
A solution to
6*a+9*b+20*c = 51
with integers a, b, c must have at least one of the integers 0 or negative. Some solutions are
6*7 + 9*1 + 20*0 = 51
6*0 + 9*(-1) + 20*3 = 51
Depending on the constraints in the assignment, you need to include 0 or even negative numbers among the possible coefficients.
A solution for 51 is 5*9 + 1*6.
Hint: where's the 20? What does this mean for its coefficient?
A solution for 54 is 3*20 + (-1)*6. You figure out the rest.
For a start, you can usefully exploit bounds analysis. Given
6a + 9b + 20c = n
0 <= a
0 <= b
0 <= c
we can systematically set pairs of {a, b, c} to 0 to infer the upper bound for the remaining variable. This gives us
a <= floor(n / 6)
b <= floor(n / 9)
c <= floor(n / 20)
Moreover, if you pick a strategy (e.g., assign c then b then a), you can tighten the upper bounds further, for instance:
b <= floor((n - 20c) / 9)
Also, the last variable to be assigned must be a function of the other variables: you don't need to search for that.
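The bounds above can be sketched as follows: c is assigned first, then b within the tightened bound, and a is computed from the remainder instead of being searched for. The function name mcnugget_solution is an illustrative assumption.

```python
def mcnugget_solution(n):
    """Return non-negative (a, b, c) with 6a + 9b + 20c == n, or None."""
    for c in range(n // 20 + 1):            # c <= floor(n / 20)
        rest_c = n - 20 * c
        for b in range(rest_c // 9 + 1):    # b <= floor((n - 20c) / 9)
            rest = rest_c - 9 * b
            if rest % 6 == 0:               # a is determined by b and c
                return rest // 6, b, c
    return None


print(mcnugget_solution(51))  # (7, 1, 0)
print(mcnugget_solution(43))  # None
```

Because a is derived rather than enumerated, the search uses two nested loops instead of three.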
You can run your ranges for a, b, c from 0 to 150.
I am also a beginner and have only just started MIT 6.00.
On reading the problem, I think 150 is meant as an upper limit on the largest number that cannot be bought exactly.
This is a solution in Perl, or rather a hack using a regex.
Following this blog post on solving algebraic equations using regex, we can use the following script for 3x + 2y + 5z = 40:
#!/usr/bin/perl
$_ = 'o' x 40;
$a = 'o' x 3;
$b = 'o' x 2;
$c = 'o' x 5;
$_ =~ /^((?:$a)+)((?:$b)+)((?:$c)+)$/;
print "x = ", length($1)/length($a), "\n";
print "y = ", length($2)/length($b), "\n";
print "z = ", length($3)/length($c), "\n";
Output: x = 11, y = 1, z = 1
The famous "Oldest plays the piano" puzzle ends up as a three-variable equation.
This method applies only when the variables are positive integers and the constant is positive.
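For readers following along in Python, the same regex trick can be sketched with the standard re module. The function name solve_with_regex is an assumption for this illustration; like the Perl version, it only finds solutions where every variable is at least 1, because each group uses +.

```python
import re


def solve_with_regex(coeffs, total):
    """Solve sum(k_i * v_i) == total for positive integers v_i by regex backtracking."""
    # One capturing group per coefficient k: one or more repetitions of 'o' * k.
    pattern = "^" + "".join("((?:%s)+)" % ("o" * k) for k in coeffs) + "$"
    match = re.match(pattern, "o" * total)
    if match is None:
        return None
    return tuple(len(group) // k for group, k in zip(match.groups(), coeffs))


print(solve_with_regex((3, 2, 5), 40))  # (11, 1, 1)
```

The regex engine does the search by backtracking, so this is a curiosity rather than an efficient method.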
Check this one, which I adapted from yours. It seems to fix your problem:
variables=range(0,10)
exams=range(51,56)
for total in exams:
    for a in variables:
        for b in variables:
            for c in variables:
                if total==6*a+9*b+20*c:
                    print('%d six pieces, %d nine pieces and %d twenty pieces for a total of %d' % (a, b, c, total))
The break statement only breaks out of the innermost enclosing loop. The code below uses an indicator variable to break out of each loop.
n = 1 # n starting from 1
count = 0 # Count + 1 every time we find a possible value.
# Reset = 0 after a non-possible value.
notPossibleValue = ()
while True:
    ind = 0 # becomes 1 if integer solutions were found
    for c in range (0,n//20+1):
        if ind == 1: break
        for b in range (0,n//9+1):
            if ind == 1: break
            for a in range (0, n//6+1):
                if (n-20*c) == (b*9+a*6):
                    count += 1
                    ind = 1
                    # print('n=', n, a,b,c, 'count', count)
                    break # Break out of "a" loop
    if ind == 0:
        count = 0
        notPossibleValue += (n,)
        # print(notPossibleValue, 'count', count)
    if count == 6:
        print('The largest number of McNuggets that cannot be bought in exact quantity is %d' % notPossibleValue[-1])
        break
    n += 1
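An alternative to indicator variables, shown here only as a sketch, is to move the nested search into a function and use return, which leaves all loops at once. The names can_buy and largest_not_buyable are assumptions for this example. The stopping rule is the same as in the code above: once six consecutive quantities can be bought, every larger one can be reached by adding packs of 6.

```python
def can_buy(n):
    """True if n == 6a + 9b + 20c for some non-negative integers a, b, c."""
    for c in range(n // 20 + 1):
        for b in range((n - 20 * c) // 9 + 1):
            if (n - 20 * c - 9 * b) % 6 == 0:
                return True  # leaves both loops at once
    return False


def largest_not_buyable():
    """Scan upward until six consecutive quantities are buyable."""
    consecutive = 0
    largest = None
    n = 1
    while consecutive < 6:
        if can_buy(n):
            consecutive += 1
        else:
            consecutive = 0
            largest = n
        n += 1
    return largest


print(largest_not_buyable())  # 43
```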
n=1
a=0
b=0
c=0
mcnugget = []
for i in range (n,100):
for a in range (0,20):
if 6*a + 9* b +20*c ==i:
mcnugget.append(i)
break
else:
for b in range (0,12):
if 6*a + 9* b +20*c ==i:
mcnugget.append(i)
break
else:
for c in range(0,5):
if 6*a + 9* b +20*c ==i:
mcnugget.append(i)
break
else:
if i>8:
if mcnugget[-1]==mcnugget[-2]+1==mcnugget[-3]+2==mcnugget[-4]+3==mcnugget[-5]+4==mcnugget[-6]+5 and mcnugget[-6]>0 :
break
mcnugget = set (mcnugget)
mcnugget = list (mcnugget)
count = 0
for z in mcnugget:
count += 1
if mcnugget [count]==mcnugget [count-1]+1==mcnugget [count-2]+2==mcnugget [count-3]+3==mcnugget [count-4]+4==mcnugget[count-5]+5:
biggestN= mcnugget[count-6]
break
#print (mcnugget)
biggestN = str(biggestN)
print ('Largest number of McNuggets that cannot be bought in exact quantity: <'+ biggestN +'>')