I found a code snippet, which is a custom metric for TensorBoard (PyTorch training):
def specificity(output, target, t=0.5):
    tp, tn, fp, fn = tp_tn_fp_fn(output, target, t)
    if fp == 0:
        return 1
    s = tn / (tn + fp)
    if s != s:
        s = 1
    return s

def tp_tn_fp_fn(output, target, t):
    with torch.no_grad():
        preds = output > t  # torch.argmax(output, dim=1)
        preds = preds.long()
        num_true_neg = torch.sum((preds == target) & (target == 0), dtype=torch.float).item()
        num_true_pos = torch.sum((preds == target) & (target == 1), dtype=torch.float).item()
        num_false_pos = torch.sum((preds != target) & (target == 1), dtype=torch.float).item()
        num_false_neg = torch.sum((preds != target) & (target == 0), dtype=torch.float).item()
    return num_true_pos, num_true_neg, num_false_pos, num_false_neg
In terms of the calculation itself it is easy enough to understand.
What I don't understand is s != s. What does that check do, how can the two s even be different?
Since it's ML-related, I'll assume the data are all numbers. The only number for which s != s is true is the special not-a-number value nan. Any equality or ordering comparison involving nan evaluates to false, so nan == nan is false, and it follows that nan != nan is true. The check is therefore a way of testing whether s is nan.
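To make the behaviour concrete, here is a minimal sketch using plain Python floats. The helper name is_nan is only for illustration and is not part of the snippet in the question; math.isnan is the standard-library way of writing the same test more readably.

```python
import math

nan = float("nan")

# nan is the only float value that is not equal to itself
print(nan == nan)  # False
print(nan != nan)  # True

def is_nan(s):
    """Hypothetical helper: same test as the `s != s` check in the question."""
    return s != s

print(is_nan(nan))      # True
print(is_nan(0.25))     # False
print(math.isnan(nan))  # True, the more explicit spelling
```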
I'm trying to write a program that solves various normal distribution questions in pure Python (no modules other than math), to 4 decimal places only, for A Levels. There is a problem that occurs in the call get_z_less_than_a_equal(0.75): apparently, without the assert statement in the except clause, the variables all get messed up and change. The error I'm catching is the recursion error. If there is an easier and more efficient way to do things, I'd appreciate it.
import math

mean = 0
standard_dev = 1
percentage_points = {0.5000: 0.0000, 0.4000: 0.2533, 0.3000: 0.5244, 0.2000: 0.8416, 0.1000: 1.2816, 0.0500: 1.6440, 0.0250: 1.9600, 0.0100: 2.3263, 0.0050: 2.5758, 0.0010: 3.0902, 0.0005: 3.2905}

def get_z_less_than(x):
    """
    P(Z < x)
    """
    return round(0.5 * (1 + math.erf((x - mean)/math.sqrt(2 * standard_dev**2))), 4)

def get_z_greater_than(x):
    """
    P(Z > x)
    """
    return round(1 - get_z_less_than(x), 4)

def get_z_in_range(lower_bound, upper_bound):
    """
    P(lower_bound < Z < upper_bound)
    """
    return round(get_z_less_than(upper_bound) - get_z_less_than(lower_bound), 4)

def get_z_less_than_a_equal(x):
    """
    P(Z < a) = x
    acquires a, given x
    """
    # first trial: brute forcing
    for i in range(401):
        a = i/100
        p = get_z_less_than(a)
        if x == p:
            return a
        elif p > x:
            break
    # second trial: using symmetry
    try:
        res = -get_z_less_than_a_equal(1 - x)
    except:
        # third trial: using estimation
        assert a, "error"
        prev = get_z_less_than(a-0.01)
        p = get_z_less_than(a)
        if abs(x - prev) > abs(x - p):
            res = a
        else:
            res = a - 0.01
    return res

def get_z_greater_than_a_equal(x):
    """
    P(Z > a) = x
    """
    if x in percentage_points:
        return percentage_points[x]
    else:
        return get_z_less_than_a_equal(1-x)

print(get_z_in_range(-1.20, 1.40))
print(get_z_less_than_a_equal(0.7517))
print(get_z_greater_than_a_equal(0.1000))
print(get_z_greater_than_a_equal(0.0322))
print(get_z_less_than_a_equal(0.1075))
print(get_z_less_than_a_equal(0.75))
Since Python 3.8, the statistics module in the standard library has a NormalDist class, so we could use that to implement our functions "with pure python", or at least for testing:
import math
from statistics import NormalDist

normal_dist = NormalDist(mu=0, sigma=1)
for i in range(-2000, 2000):
    test_val = i / 1000
    assert get_z_less_than(test_val) == round(normal_dist.cdf(test_val), 4)
This doesn't throw an error, so that part probably works fine.
Your get_z_less_than_a_equal seems to be the equivalent of NormalDist.inv_cdf.
There are very efficient ways to compute it accurately using the inverse of the error function (see Wikipedia and Python implementation), but we don't have that in the standard library.
Since you only care about the first few digits and get_z_less_than is monotonic, we can use a simple bisection method to find our solution.
Newton's method would be much faster, and not too hard to implement since we know that the derivative of the cdf is just the pdf, but it is still probably more complex than what we need.
def get_z_less_than_a_equal(x):
    """
    P(Z < a) = x
    acquires a, given x
    """
    if x <= 0.0 or x >= 1.0:
        raise ValueError("x must be >0.0 and <1.0")
    min_res, max_res = -10, 10
    while max_res - min_res > 1e-7:
        mid = (max_res + min_res) / 2
        if get_z_less_than(mid) < x:
            min_res = mid
        else:
            max_res = mid
    return round((max_res + min_res) / 2, 4)
Let's test this:
for i in range(1, 2000):
    test_val = i / 2000
    left_val = get_z_less_than_a_equal(test_val)
    right_val = round(normal_dist.inv_cdf(test_val), 4)
    assert left_val == right_val, f"{left_val} != {right_val}"
# AssertionError: -3.3201 != -3.2905
We see that we are losing some precision. That's because the error introduced by get_z_less_than (which rounds to 4 digits) gets propagated and amplified when we use it to estimate its inverse (see Wikipedia - error propagation for details).
So let's add a "digits" parameter to get_z_less_than and change our functions slightly:
def get_z_less_than(x, digits=4):
    """
    P(Z < x)
    """
    res = 0.5 * (1 + math.erf((x - mean) / math.sqrt(2 * standard_dev ** 2)))
    return round(res, digits)

def get_z_less_than_a_equal(x, digits=4):
    """
    P(Z < a) = x
    acquires a, given x
    """
    if x <= 0.0 or x >= 1.0:
        raise ValueError("x must be >0.0 and <1.0")
    min_res, max_res = -10, 10
    while max_res - min_res > 10 ** -(digits * 2):
        mid = (max_res + min_res) / 2
        if get_z_less_than(mid, digits * 2) < x:
            min_res = mid
        else:
            max_res = mid
    return round((max_res + min_res) / 2, digits)
And now we can try the same test again and see that it passes.
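For comparison, the Newton's method idea mentioned earlier could be sketched roughly as follows. This is only an illustration for the standard normal (mean 0, standard deviation 1); the names normal_cdf, normal_pdf and inverse_cdf_newton, as well as the tol and max_iter parameters, are made up for this sketch. Starting from a = 0 the iterates approach the root from one side, because the cdf is convex to the left of 0 and concave to the right.

```python
import math

def normal_cdf(a):
    """P(Z < a) for the standard normal, unrounded."""
    return 0.5 * (1 + math.erf(a / math.sqrt(2)))

def normal_pdf(a):
    """Density of the standard normal; this is the derivative of normal_cdf."""
    return math.exp(-a * a / 2) / math.sqrt(2 * math.pi)

def inverse_cdf_newton(x, digits=4, tol=1e-12, max_iter=100):
    """Find a with P(Z < a) = x using Newton's method (illustrative sketch)."""
    if x <= 0.0 or x >= 1.0:
        raise ValueError("x must be >0.0 and <1.0")
    a = 0.0  # start at the median
    for _ in range(max_iter):
        # Newton update for f(a) = cdf(a) - x, with f'(a) = pdf(a)
        step = (normal_cdf(a) - x) / normal_pdf(a)
        a -= step
        if abs(step) < tol:
            break
    return round(a, digits)

print(inverse_cdf_newton(0.975))
print(inverse_cdf_newton(0.75))
```

It typically needs only a handful of iterations, compared with a few dozen halvings for bisection, at the cost of having to supply the derivative.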
Write a function answer(str_S) which, given the base-10 string
representation of an integer S, returns the largest n such that R(n) =
S. Return the answer as a string in base-10 representation. If there
is no such n, return "None". S will be a positive integer no greater
than 10^25.
where R(n) is the number of zombits at time n:
R(0) = 1
R(1) = 1
R(2) = 2
R(2n) = R(n) + R(n + 1) + n (for n > 1)
R(2n + 1) = R(n - 1) + R(n) + 1 (for n >= 1)
Test cases
==========
Inputs:
(string) str_S = "7"
Output:
(string) "4"
Inputs:
(string) str_S = "100"
Output:
(string) "None"
My program below is correct, but it is not scalable, since S can be a very large number like 10^24. Could anyone suggest how to improve the code so that it can cover any input case?
def answer(str_S):
    d = {0: 1, 1: 1, 2: 2}
    str_S = int(str_S)
    i = 1
    while True:
        if i > 1:
            d[i*2] = d[i] + d[i+1] + i
            if d[i*2] == str_S:
                return i*2
            elif d[i*2] > str_S:
                return None
        if i >= 1:
            d[i*2+1] = d[i-1] + d[i] + 1
            if d[i*2+1] == str_S:
                return i*2 + 1
            elif d[i*2+1] > str_S:
                return None
        i += 1

print(answer('7'))
First of all, where are you having trouble with the scaling? I ran your code on a 30-digit number, and it seemed to complete okay. Do you have a memory limit? Python handles arbitrarily large integers, although values beyond the machine word size are handled by slower arbitrary-precision arithmetic.
Given how densely the values of n are filled in, I suspect that you can save space as well as time if you switch to a straight array: use n as a list index instead of as a dict key.
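As a rough sketch of the list-based storage suggested here (the function name r_values and its limit parameter are made up for illustration, and this only tabulates R; it does not by itself solve the search for the largest n):

```python
def r_values(limit):
    """Return [R(0), R(1), ..., R(limit)] stored in a plain list (at least R(0)..R(2))."""
    r = [1, 1, 2]  # R(0), R(1), R(2)
    while len(r) <= limit:
        k = len(r)   # next index to fill in
        n = k // 2
        if k % 2 == 0:
            # R(2n) = R(n) + R(n + 1) + n
            r.append(r[n] + r[n + 1] + n)
        else:
            # R(2n + 1) = R(n - 1) + R(n) + 1
            r.append(r[n - 1] + r[n] + 1)
    return r

print(r_values(7))  # [1, 1, 2, 3, 7, 4, 13, 6]
```

Because every index from 0 up to the limit is used, the list wastes no slots and avoids the hashing overhead of a dict.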
I'm currently writing a function that makes and returns a new function to create polynomial expressions. I want the function to store the coefficients of the polynomial and the polynomial itself in string form. However, it doesn't seem that I can set either of the attributes without the interpreter insisting the newly created function has no such attribute.
Please see my function below
def poly(coefs):
    """Return a function that represents the polynomial with these coefficients.
    For example, if coefs=(10, 20, 30), return the function of x that computes
    '30 * x**2 + 20 * x + 10'. Also store the coefs on the .coefs attribute of
    the function, and the str of the formula on the .__name__ attribute.'"""
    # your code here (I won't repeat "your code here"; there's one for each function)
    def createPoly(x):
        formulaParts = []
        power = 0
        createPoly.coefs = coefs
        for coef in coefs:
            if power == 0:
                formulaParts += [('%d') % (coef)]
            elif power == 1:
                formulaParts += [('%d * x') % (coef)]
            else:
                formulaParts += [('%d * x**%d') % (coef, power)]
            power += 1
        createPoly.__name__ = ' + '.join(formulaParts[::-1])
        createPoly.value = eval(createPoly.__name__)
        return createPoly.value
    return createPoly
As you can see, when I set the attributes inside the above code and use them there, there is no problem. However, if I use code like the below, that's when the error occurs:
y = poly((5,10,5))
print(y.__name__)
It might be something REALLY simple I'm overlooking. Please help
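Your code to set up the inner function can't be inside the inner function. The body of createPoly only runs when createPoly is called, so until the first call .coefs does not exist and .__name__ is still 'createPoly'. Move the setup into poly itself: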
def poly(coefs):
    def createPoly(x):
        createPoly.value = eval(createPoly.__name__)
        return createPoly.value
    formulaParts = []
    power = 0
    for coef in coefs:
        if power == 0:
            formulaParts += [('%d') % (coef)]
        elif power == 1:
            formulaParts += [('%d * x') % (coef)]
        else:
            formulaParts += [('%d * x**%d') % (coef, power)]
        power += 1
    createPoly.__name__ = ' + '.join(formulaParts[::-1])
    createPoly.coefs = coefs
    return createPoly
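If you would rather not rely on eval, one possible variant (a sketch only; the names poly_no_eval and evaluate are made up here) sets the attributes in the same place but computes the value with Horner's rule:

```python
def poly_no_eval(coefs):
    """Like poly, but evaluates with Horner's rule instead of eval (illustrative)."""
    def evaluate(x):
        result = 0
        # Horner's rule: start from the highest power and work down
        for coef in reversed(coefs):
            result = result * x + coef
        return result

    parts = []
    for power, coef in enumerate(coefs):
        if power == 0:
            parts.append('%d' % coef)
        elif power == 1:
            parts.append('%d * x' % coef)
        else:
            parts.append('%d * x**%d' % (coef, power))

    # attributes are set once, when the function object is created
    evaluate.__name__ = ' + '.join(reversed(parts))
    evaluate.coefs = coefs
    return evaluate

p = poly_no_eval((10, 20, 30))
print(p.__name__)  # 30 * x**2 + 20 * x + 10
print(p.coefs)     # (10, 20, 30)
print(p(2))        # 170
```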
I am creating a sigsum() function which takes the sum using an input equation and an input variable. Here's what I have so far:
def sigsum(eqn, index, lower=0, upper=None, step=1):
    if type(step) is not int:
        raise TypeError('step must be an integer')
    elif step < 1:
        raise ValueError('step must be greater than or equal to 1')
    if upper is None:
        upper = 1280000
    if lower is None:
        lower = -1280000
    if (upper - lower) % step:
        upper -= (upper - lower) % step
    index = lower
    total = 0
    while True:
        total += eqn
        if index == upper:
            break
        index += step
    return total
Usage of function:
print(sigsum('1/(i+5)','i'))
>>> 12.5563
My current problem is converting 'eqn' and 'index' to variables that exist inside the function local namespace. I heard around that using exec is not a good idea and that maybe setattr() might work. Can anyone help me out?
Thanks.
For eqn I suggest using a lambda function:
eqn = lambda i: 1 / (i + 5)
then index is not needed, because it is just "the variable passed to the function" (does not need a name).
Then your function becomes
def integrate(fn, start = 0, end = 128000, step = 1):
    """
    Return a stepwise approximation of
    the integral of fn from start to end
    """
    num_steps = (end - start) // step
    if num_steps < 0:
        raise ValueError("bad step value")
    else:
        return sum(fn(start + k*step) for k in range(num_steps))
and you can run it like
res = integrate(eqn)    # => 10.253703030104417
Note that there are many steps to this, and many of them involve very small numbers; rounding errors can become a major problem. If accuracy is important you may want to manually derive an integral,
from math import log

eqn = lambda i: 1 / (i + 5)
eqn.integral = lambda i: log(i + 5)

def integrate(fn, start = 0, end = 128000, step = 1):
    """
    Return the integral of fn from start to end
    If fn.integral is defined, use it;
    otherwise do a stepwise approximation
    """
    if hasattr(fn, "integral"):
        return fn.integral(end) - fn.integral(start)
    else:
        num_steps = (end - start) // step
        if num_steps < 0:
            raise ValueError("bad step value")
        else:
            return sum(fn(start + k*step) for k in range(num_steps))
which again runs like
res = integrate(eqn)    # => 10.150386692204735
(note that the stepwise approximation was about 1% too high.)
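On the rounding concern raised above: if you stay with the stepwise sum, one option is math.fsum from the standard library, which tracks the partial sums so that floating-point rounding errors do not accumulate. The sketch below is a variation on the stepwise version; the name integrate_fsum is made up for illustration.

```python
import math

def integrate_fsum(fn, start=0, end=128000, step=1):
    """Stepwise sum like integrate, but accumulated with math.fsum (illustrative)."""
    num_steps = (end - start) // step
    if num_steps < 0:
        raise ValueError("bad step value")
    # math.fsum avoids the accumulated rounding error of the built-in sum
    return math.fsum(fn(start + k * step) for k in range(num_steps))

# ten additions of 0.1: plain sum drifts, fsum does not
print(sum(0.1 for _ in range(10)))              # 0.9999999999999999
print(integrate_fsum(lambda i: 0.1, 0, 10))     # 1.0
print(integrate_fsum(lambda i: 1.0 / (i + 5)))
```

This only addresses rounding in the summation; the gap between the stepwise sum and the true integral remains.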
I would use a lambda function, as Hugh Bothwell suggested. You would have to modify sigsum as follows:
def sigsum(eqn, lower=0, upper=None, step=1):
    if type(step) is not int:
        raise TypeError('step must be an integer')
    elif step < 1:
        raise ValueError('step must be greater than or equal to 1')
    if upper is None:
        upper = 1280000
    if lower is None:
        lower = -1280000
    if (upper - lower) % step:
        upper -= (upper - lower) % step
    index = lower
    total = 0
    while True:
        total += eqn(index)
        if index == upper:
            break
        index += step
    return total
Usage of function:
print(sigsum(lambda i: 1/(i+5)))
>>> 12.5563
You can also define a function separately:
def myfunction(i):
    return 1/(i+5)
and pass it to sigsum:
print(sigsum(myfunction))
>>> 12.5563
Being able to pass a function as a parameter is what is meant by functions being first-class objects. (JavaScript and Python have this; C and older versions of Java, for example, do not, although C has function pointers.)
How do I use z3 to count the number of solutions? For example, I want to prove that for any n, there are 2 solutions to the set of equations {x^2 == 1, y_1 == 1, ..., y_n == 1}. The following code shows satisfiability for a given n, which isn't quite what I want (I want number of solutions for an arbitrary n).
#!/usr/bin/env python
from z3 import *

# Add the equations { x_1^2 == 1, x_2 == 1, ... x_n == 1 } to s and return it.
def add_constraints(s, n):
    assert n > 1
    X = IntVector('x', n)
    s.add(X[0]*X[0] == 1)
    for i in range(1, n):
        s.add(X[i] == 1)
    return s

s = Solver()
add_constraints(s, 3)
s.check()
s.model()
If there are a finite number of solutions, you can enumerate all of them by repeatedly adding the disjunction of the constants (your x_i's) not being equal to their assigned model values. If there are infinitely many solutions (which is the case if you want to prove this for all natural numbers n), you can use the same technique; you couldn't enumerate them all, but you could generate many solutions up to some bound you pick. If you want to prove this for all n > 1, you will need to use quantifiers. I've added a discussion of this below.
While you didn't quite ask this question, you should see this question/answer as well: Z3: finding all satisfying models
Here's your example doing this (z3py link here: http://rise4fun.com/Z3Py/643M ):
# Add the equations { x_1^2 == 1, x_2 == 1, ... x_n == 1 } to s and return it.
def add_constraints(s, n, model):
    assert n > 1
    X = IntVector('x', n)
    s.add(X[0]*X[0] == 1)
    for i in range(1, n):
        s.add(X[i] == 1)
    # block the previous model: at least one constant must take a different value
    notAgain = []
    for val in model:
        # val is a declaration; val() is the corresponding constant
        notAgain.append(val() != model[val])
    if len(notAgain) > 0:
        s.add(Or(notAgain))
        print(Or(notAgain))
    return s

for n in range(2,5):
    s = Solver()
    i = 0
    add_constraints(s, n, [])
    while s.check() == sat:
        print(s.model())
        i = i + 1
        add_constraints(s, n, s.model())
    print(i)  # solutions
If you want to prove there are no other solutions for any choice of n, you need to use quantifiers, since the previous approach will only work for finite n (and it gets very expensive quickly). Here is an encoding showing this proof. You could generalize this to incorporate the model generation capability in the previous part to come up with the +/- 1 solution for a more general formula. If the equation has a number of solutions independent of n (like in your example), this would allow you to prove equations have some finite number of solutions. If the number of solutions is a function of n, you'd have to figure that function out. z3py link: http://rise4fun.com/Z3Py/W9En
x = Function('x', IntSort(), IntSort())
s = Solver()
n = Int('n')
# theorem says that x(1)^2 == 1 and that x(1) != +/- 1, and forall n >= 2, x(n) == 1
# try removing the x(1) != +/- constraints
theorem = ForAll([n], And(Implies(n == 1, And(x(n) * x(n) == 1, x(n) != 1, x(n) != -1) ), Implies(n > 1, x(n) == 1)))
#s.add(Not(theorem))
s.add(theorem)
print(s.check())
# print(s.model())  # unsat, no model available, no other solutions