Python: sort function breaks in the presence of nan
Please explain!
I can't even reproduce it on another computer.
Sorting is based on the idea that the values being sorted form a total order. A total order requires several properties be met, including
for any two values a and b, exactly one of a < b, a == b, or a > b is true
the relations are transitive: a < b and b < c implies that a < c.
Knowing that we have a total order, we can assume that a sorted order exists, and that we can find it using O(n lg n) comparisons. In the absence of this assumption, we need to make O(n^2) comparisons to even determine if a sorted order exists, let alone produce one.
Floating point values with np.nan do not form a total order.
All three comparisons np.nan < x, np.nan == x, and np.nan > x are false no matter what x is. As a result, there is no "correct" place for np.nan in a list; it all depends on which comparisons are made and when.
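As a rough illustration of the point above, here is a minimal sketch using only the standard library. The helper name sort_with_nan_last is made up for this example; it sidesteps the problem by never letting the sort compare against NaN directly.

```python
import math

nan = float("nan")

# Every comparison involving NaN is False, so NaN has no fixed rank.
nan_comparisons = [nan < 1.0, nan == 1.0, nan > 1.0]


def sort_with_nan_last(values):
    """Sort floats ascending, placing every NaN after all other values."""
    # The key is a tuple: the boolean picks the group, and NaN is replaced
    # by 0.0 so that no comparison ever involves NaN itself.
    return sorted(values, key=lambda v: (math.isnan(v), 0.0 if math.isnan(v) else v))


result = sort_with_nan_last([3.0, nan, 1.0, nan, 2.0])
print(nan_comparisons)  # [False, False, False]
print(result)
```

Whether NaN belongs at the end, at the start, or should be dropped is a design decision; the key function only makes that decision explicit.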
I think I have encountered a bug in SymPy's evalf() method when substitutions are passed.
By accident, I found an expression that evaluates to a wrong value if I replace the variable x by an integer rather than a Float. The funny thing is that this happens only for some values of the precision.
The expression looks somewhat arbitrary, but if I attempt to simplify it further the bug disappears. This is a minimal working example:
#!/usr/bin/env python
import sympy
from sympy.abc import x
# Some valid mathematical expression
expr = 1/((x - 9)*(x - 8)*(x - 7)*(x - 4)**2*(x - 3)**3*(x - 2))
def example(prec):
    # This is the string 1.( prec-2 zeroes )1
    almost1 = '1.'+(prec-2)*'0'+'1'
    # We replace the integer 1
    res1 = expr.evalf(prec, subs={x:1})
    # We replace a Float veeery close to 1
    res_almost1 = expr.evalf(prec, subs={x:sympy.Float(almost1,prec)})
    return res1, res_almost1
The expected outcome is that the returned tuple contains similar numbers, since 1 and almost1 are very close. However, for some values of prec the result obtained by substituting 1 into expr is zero, while the one obtained by substituting almost1 is close to the correct value.
You may ask: "What are the values of prec for which the expression is wrong?". By running the code
wrong = [str(a) for a in range(10,1001) if example(a)[0] == 0]
print(','.join(wrong))
I obtain this seemingly completely random list
11,20,22,29,31,38,40,49,58,67,76,78,85,87,94,96,105,114,123,132,134,141,143,150,152,159,161,170,179,188,190,197,199,206,208,215,217,226,235,244,253,255,262,264,271,273,282,291,300,309,311,318,320,327,329,338,347,356,365,367,374,376,383,385,392,394,403,412,421,423,430,432,439,441,448,450,459,468,477,486,488,495,497,504,506,515,524,533,542,544,551,553,560,562,571,580,589,598,600,607,609,616,618,625,627,636,645,654,663,665,672,674,681,683,692,701,710,719,721,728,730,737,739,748,757,766,775,777,784,786,793,795,802,804,813,822,831,833,840,842,849,851,858,860,869,878,887,896,898,905,907,914,916,925,934,943,952,954,961,963,970,972,981,990,999
I posted it here to see whether I made some blunder in my code; otherwise I plan to file a bug report on SymPy's GitHub.
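For reference, the value at x = 1 can be cross-checked without SymPy at all. The sketch below uses exact rational arithmetic from the standard library; expr_exact is a name made up for this example and is not part of the original code. It only shows what the correct value should be, not why evalf() returns zero.

```python
from fractions import Fraction


def expr_exact(x):
    """Evaluate 1/((x-9)(x-8)(x-7)(x-4)^2 (x-3)^3 (x-2)) exactly."""
    x = Fraction(x)
    denominator = (x - 9) * (x - 8) * (x - 7) * (x - 4) ** 2 * (x - 3) ** 3 * (x - 2)
    return 1 / denominator


reference = expr_exact(1)
print(reference)         # -1/24192
print(float(reference))  # small but clearly nonzero
```

Since the exact value is -1/24192, a result of exactly zero cannot be explained by rounding alone.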
I'm writing unit tests for my simulation and want to check that for specific parameters the result, a numpy array, is zero. Due to calculation inaccuracies, small values are also accepted (1e-7). What is the best way to assert this array is close to 0 in all places?
np.testing.assert_array_almost_equal(a, np.zeros(a.shape)) and assert_allclose fail because the relative error is inf (or 1 if you swap the arguments).
I feel like np.testing.assert_array_almost_equal_nulp(a, np.zeros(a.shape)) is not precise enough, as it compares the difference to the spacing; it is therefore always true for nulps >= 1 and false otherwise, but says nothing about the amplitude of a.
Using np.testing.assert_(np.all(np.absolute(a) < 1e-7)), based on this question, does not give any of the detailed output I am used to from other np.testing methods.
Is there another way to test this? Maybe another testing package?
If you compare a numpy array with all zeros, you can use the absolute tolerance, as the relative tolerance does not make sense here:
import numpy as np
from numpy.testing import assert_allclose
def test_zero_array():
    a = np.array([0, 1e-07, 1e-08])
    assert_allclose(a, 0, atol=1e-07)
The rtol value does not matter in this case, as it is multiplied by 0 when calculating the tolerance:
atol + rtol * abs(desired)
Update: Replaced np.zeros_like(a) with the simpler scalar 0. As pointed out by @hintze, np array comparisons also work against scalars.
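To see why rtol drops out, here is a scalar sketch of the tolerance check in plain Python. The function within_tolerance is a made-up stand-in for illustration, not NumPy's implementation; it only mirrors the formula atol + rtol * abs(desired) quoted above.

```python
def within_tolerance(actual, desired, rtol=1e-07, atol=0.0):
    """Scalar version of the criterion |actual - desired| <= atol + rtol*|desired|."""
    return abs(actual - desired) <= atol + rtol * abs(desired)


# Against desired == 0 the rtol term vanishes, so only atol decides,
# even with an absurdly large rtol.
small_ok = within_tolerance(1e-08, 0.0, rtol=1e3, atol=1e-07)
too_big = within_tolerance(1e-06, 0.0, rtol=1e3, atol=1e-07)
default_atol = within_tolerance(1e-08, 0.0)  # atol=0 rejects any nonzero value
print(small_ok, too_big, default_atol)  # True False False
```

The last case is the reason the comparison against zero fails with default arguments: with atol left at 0, nothing but an exact zero can pass.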
I'll preface this by saying it is solely to satisfy my curiosity rather than a request for help on a coding project. I'd like to know whether there is a function (preferably in Python, but I'll accept a valid mathematical concept), somewhat like absolute value, that returns 0 for a negative number and the number itself for a positive one.
Pseudo code:
def myFunc(x):
    if x > 0:
        return x
    else:
        return 0
Again, I'm not asking the question out of complexity, just curiosity. I've needed it a couple of times now, and was wondering if I really did need to write my own function or if one already existed. If there isn't a function to do this, is there a way to write this in one line using an expression that isn't evaluated twice?
i.e.
myVar = x-y if x-y>0 else 0
I'd be fine with a solution like that if x-y wasn't evaluated twice. So if anyone out there has any solution, I'd appreciate it.
Thanks
One way...
>>> max(0, x)
This should do it:
max(x-y, 0)
Sounds like an analysis type question. numpy can come to the rescue!
If you have your data in an array:
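import numpy as np
y = 3  # example value to subtract (not given in the original)
x = np.arange(-5,11)
print(x)
[-5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10]
# Now do your subtraction and replacement.
# Subtract once, then zero out the negative entries of the result.
x = x - y
x[x < 0] = 0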
If I understand your question correctly, you want to replace values in x where x-y<0 with zeros, otherwise replace with x-y.
NOTE: the solution above works well for subtracting an integer from an array, or operating on two arrays of equal dimensions. However, Daniel's solution is more elegant when working on two lists of equal length. It all depends on your needs (and whether you want to venture into the world of numpy or not).
An alternative expression would be
x -= min(x, y)
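The variants suggested in this thread can be compared side by side. This is just a sketch; the three function names are made up for the comparison.

```python
def clamp_if(x, y):
    diff = x - y  # evaluated once, then reused
    return diff if diff > 0 else 0


def clamp_max(x, y):
    return max(x - y, 0)


def clamp_min(x, y):
    return x - min(x, y)


pairs = [(5, 3), (3, 5), (4, 4), (-2, -7), (2.5, 1.0)]
results = [(clamp_if(x, y), clamp_max(x, y), clamp_min(x, y)) for x, y in pairs]
print(results)
```

For these inputs all three agree, and in each of them x - y (or the equivalent subtraction) is computed only once.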
So in Ruby there is a trick to specify infinity:
1.0/0
=> Infinity
I believe in Python you can do something like this
float('inf')
These are just examples though, I'm sure most languages have infinity in some capacity. When would you actually use this construct in the real world? Why would using it in a range be better than just using a boolean expression? For instance
(0..1.0/0).include?(number) == (number >= 0) # True for all values of number
=> true
To summarize, what I'm looking for is a real world reason to use Infinity.
EDIT: I'm looking for real world code. It's all well and good to say when you "could" use it, but when have people actually used it?
Dijkstra's algorithm typically assigns infinity as the initial tentative distance of every node in the graph. This doesn't have to be "infinity", just some arbitrarily large constant, but in Java I typically use Double.POSITIVE_INFINITY. I assume Ruby could be used similarly.
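A minimal Python sketch of that idea, assuming the graph is given as a dict of adjacency dicts with non-negative weights (the graph below is invented for the example):

```python
import heapq


def dijkstra(graph, source):
    """Return shortest distances from source; unreachable nodes stay at inf."""
    dist = {node: float("inf") for node in graph}
    dist[source] = 0
    queue = [(0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist[node]:
            continue  # stale queue entry, a shorter path was already found
        for neighbour, weight in graph[node].items():
            candidate = d + weight
            if candidate < dist[neighbour]:
                dist[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return dist


# Hypothetical example graph: adjacency dict of non-negative edge weights.
graph = {
    "a": {"b": 7, "c": 3},
    "b": {"d": 1},
    "c": {"b": 2, "d": 8},
    "d": {},
    "e": {"a": 1},  # nothing leads to "e", so it stays unreachable
}
distances = dijkstra(graph, "a")
print(distances)
```

Because every finite path length compares less than inf, the first real path found always replaces the initial value, and nodes that are never reached keep inf as a natural "no path" marker.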
Off the top of the head, it can be useful as an initial value when searching for a minimum value.
For example:
lowest = float('inf')
for x in somelist:
    if x < lowest:
        lowest = x
I prefer this to initially setting lowest to the first value of somelist.
Of course, in Python, you should just use the min() built-in function in most cases.
There seems to be an implied "Why does this functionality even exist?" in your question. And the reason is that Ruby and Python are just giving access to the full range of values that one can specify in floating point form as specified by IEEE.
This page seems to describe it well:
http://steve.hollasch.net/cgindex/coding/ieeefloat.html
As a result, you can also have NaN (Not-a-number) values and -0.0, while you may not immediately have real-world uses for those either.
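In Python, for instance, these special values can be poked at directly. A small sketch using only the standard library:

```python
import math

pos_inf = float("inf")
neg_inf = float("-inf")
nan = float("nan")
neg_zero = -0.0

print(pos_inf > 1e308, neg_inf < -1e308)  # True True
print(nan == nan, math.isnan(nan))        # False True
print(neg_zero == 0.0)                    # True
print(math.copysign(1.0, neg_zero))       # -1.0
```

Note that -0.0 compares equal to 0.0; the sign only shows up through functions such as math.copysign.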
In some physics calculations you can normalize irregularities (i.e., infinite quantities) of the same order against each other, canceling them both and allowing an approximate result to come through.
When you deal with limits, calculations like (infinity / infinity) approaching a finite number can be carried out. It's useful for the language to have the ability to override the regular divide-by-zero error.
Use Infinity and -Infinity when implementing a mathematical algorithm calls for it.
In Ruby, Infinity and -Infinity have nice comparative properties so that -Infinity < x < Infinity for any real number x. For example, Math.log(0) returns -Infinity, extending to 0 the property that x > y implies that Math.log(x) > Math.log(y). Also, Infinity * x is Infinity if x > 0, -Infinity if x < 0, and 'NaN' (not a number; that is, undefined) if x is 0.
For example, I use the following bit of code in part of the calculation of some log likelihood ratios. I explicitly reference -Infinity to define a value even if k is 0 or n AND x is 0 or 1.
Infinity = 1.0/0.0
def Similarity.log_l(k, n, x)
  unless x == 0 or x == 1
    k * Math.log(x.to_f) + (n-k) * Math.log(1.0-x)
  else
    -Infinity
  end
end
Alpha-beta pruning
I use it to specify the mass and inertia of a static object in physics simulations. Static objects are essentially unaffected by gravity and other simulation forces.
In Ruby, infinity can be used to implement lazy lists. Say I want N numbers starting at 200 which get successively larger by 100 units each time:
Inf = 1.0 / 0.0
(200..Inf).step(100).take(N)
More info here: http://banisterfiend.wordpress.com/2009/10/02/wtf-infinite-ranges-in-ruby/
I've used it for cases where you want to define ranges of preferences / allowed.
For example, in 37signals apps you have a limit on the number of projects:
Infinity = 1 / 0.0
FREE = 0..1
BASIC = 0..5
PREMIUM = 0..Infinity
then you can do checks like
if PREMIUM.include? current_user.projects.count
  # do something
end
I used it for representing camera focus distance and to my surprise in Python:
>>> float("inf") is float("inf")
False
>>> float("inf") == float("inf")
True
I wonder why that is.
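A likely explanation: is tests object identity, and each float("inf") call generally builds a new float object, while == compares values. A small sketch of the distinction:

```python
import math

a = float("inf")
b = float("inf")

print(a == b)         # True: same value
print(a is a)         # True: same object
print(math.isinf(b))  # True
```

So for checking whether a value is infinite, == or math.isinf() is the reliable choice; whether two separately created floats happen to be the same object is an implementation detail.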
I've used it in the minimax algorithm. When I'm generating new moves, if the min player wins on that node then the value of the node is -∞. Conversely, if the max player wins then the value of that node is +∞.
Also, if you're generating nodes/game states and then trying out several heuristics, you can set all the node values to -∞/+∞, whichever makes sense, and then when you're running a heuristic it's easy to set the node value:
node_val = float('-inf')
node_val = max(heuristic1(node), node_val)
node_val = max(heuristic2(node), node_val)
node_val = max(heuristic3(node), node_val)
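As a rough sketch of how the infinities act as starting values in minimax itself, here is a tiny version over a game tree given as nested lists. The tree and its scores are invented for the example.

```python
def minimax(node, maximizing):
    """Score a game tree given as nested lists with numeric leaves."""
    if not isinstance(node, list):
        return node  # leaf: already a score
    if maximizing:
        best = float("-inf")  # anything the children return is at least this
        for child in node:
            best = max(best, minimax(child, False))
    else:
        best = float("inf")  # anything the children return is at most this
        for child in node:
            best = min(best, minimax(child, True))
    return best


# Hypothetical two-ply tree: the max player picks a branch, the min player a leaf.
# A leaf of +inf stands for "the max player wins outright".
tree = [[3, 5], [2, 9], [float("inf"), 4]]
value = minimax(tree, True)
print(value)  # 4
```

Starting from -inf or +inf means the first child always replaces the initial value, so no special case for "no value yet" is needed.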
I've used it in a DSL similar to Rails' has_one and has_many:
has 0..1, :author
has 0..INFINITY, :tags
This makes it easy to express concepts like Kleene star and plus in your DSL.
I use it when I have a Range object where one or both ends need to be open
I've used symbolic values for positive and negative infinity in dealing with range comparisons to eliminate corner cases that would otherwise require special handling:
Given two ranges A=[a,b) and C=[c,d) do they intersect, is one greater than the other, or does one contain the other?
A > C iff a >= d
A < C iff b <= c
etc...
If you have values for positive and negative infinity that respectively compare greater than and less than all other values, you don't need to do any special handling for open-ended ranges. Since floats and doubles already implement these values, you might as well use them instead of trying to find the largest/smallest values on your platform. With integers, it's more difficult to use "infinity" since it's not supported by hardware.
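A minimal sketch of those comparisons in Python, assuming non-empty half-open ranges stored as (start, end) tuples; the function names are made up for the example.

```python
INF = float("inf")


def is_greater(a, c):
    """A > C iff a >= d, for half-open ranges A=[a,b) and C=[c,d)."""
    return a[0] >= c[1]


def is_less(a, c):
    """A < C iff b <= c."""
    return a[1] <= c[0]


def intersects(a, c):
    return not is_greater(a, c) and not is_less(a, c)


below_ten = (-INF, 10)  # open-ended on the left
from_ten = (10, INF)    # open-ended on the right
middle = (5, 15)

print(is_less(below_ten, from_ten))   # True
print(intersects(middle, from_ten))   # True
print(intersects(below_ten, middle))  # True
print(is_greater(from_ten, (0, 10)))  # True
```

The open-ended ranges go through exactly the same code path as the bounded one, which is the point made above.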
I ran across this because I'm looking for an "infinite" value to set for a maximum, if a given value doesn't exist, in an attempt to create a binary tree. (Because I'm selecting based on a range of values, and not just a single value, I quickly realized that even a hash won't work in my situation.)
Since I expect all numbers involved to be positive, the minimum is easy: 0. Since I don't know what to expect for a maximum, though, I would like the upper bound to be Infinity of some sort. This way, I won't have to figure out what "maximum" I should compare things to.
Since this is a project I'm working on at work, it's technically a "real world problem". It may be kind of rare, but like a lot of abstractions, it's convenient when you need it!
Also, to those who say that this (and other examples) are contrived, I would point out that all abstractions are somewhat contrived; that doesn't mean they aren't useful when you contrive them.
When working in a problem domain where trig is used (especially tangent) infinity is an answer that can come up. Trig ends up being used heavily in graphics applications, games, and geospatial applications, plus the obvious math applications.
I'm sure there are other ways to do this, but you could use Infinity to check for reasonable inputs in a String-to-Float conversion. In Java, at least, the Float.isNaN() static method will return false for numbers with infinite magnitude, indicating they are valid numbers, even though your program might want to classify them as invalid. Checking against the Float.POSITIVE_INFINITY and Float.NEGATIVE_INFINITY constants solves that problem. For example:
// Some sample values to test our code with
String stringValues[] = {
    "-999999999999999999999999999999999999999999999",
    "12345",
    "999999999999999999999999999999999999999999999"
};
// Loop through each string representation
for (String stringValue : stringValues) {
    // Convert the string representation to a Float representation
    Float floatValue = Float.parseFloat(stringValue);
    System.out.println("String representation: " + stringValue);
    System.out.println("Result of isNaN: " + floatValue.isNaN());
    // Check the result for positive infinity, negative infinity, and
    // "normal" float numbers (within the defined range for Float values).
    if (floatValue == Float.POSITIVE_INFINITY) {
        System.out.println("That number is too big.");
    } else if (floatValue == Float.NEGATIVE_INFINITY) {
        System.out.println("That number is too small.");
    } else {
        System.out.println("That number is jussssst right.");
    }
}
Sample Output:
String representation: -999999999999999999999999999999999999999999999
Result of isNaN: false
That number is too small.
String representation: 12345
Result of isNaN: false
That number is jussssst right.
String representation: 999999999999999999999999999999999999999999999
Result of isNaN: false
That number is too big.
It is used quite extensively in graphics. For example, any pixel in a 3D image that is not part of an actual object is marked as infinitely far away, so that it can later be replaced with a background image.
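A toy sketch of that compositing step in Python, with a made-up 2x3 image stored as nested lists (real renderers use array buffers, so treat this purely as an illustration):

```python
INF = float("inf")

# Hypothetical 2x3 depth buffer: INF marks pixels where no object was drawn.
depth = [
    [2.5, INF, 4.0],
    [INF, INF, 1.0],
]
color = [
    ["red", None, "blue"],
    [None, None, "green"],
]
background = [
    ["sky", "sky", "sky"],
    ["sea", "sea", "sea"],
]

# Keep the rendered colour where something was hit, else use the background.
composited = [
    [background[r][c] if depth[r][c] == INF else color[r][c] for c in range(3)]
    for r in range(2)
]
print(composited)  # [['red', 'sky', 'blue'], ['sea', 'sea', 'green']]
```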
I'm using a network library where you can specify the maximum number of reconnection attempts. Since I want mine to reconnect forever:
my_connection = ConnectionLibrary(max_connection_attempts = float('inf'))
In my opinion, it's more clear than the typical "set to -1 to retry forever" style, since it's literally saying "retry until the number of connection attempts is greater than infinity".
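How such a library might honour that setting internally can be sketched like this. The function connect_with_retries and its parameters are hypothetical and not taken from any real library.

```python
def connect_with_retries(try_connect, max_attempts=float("inf")):
    """Call try_connect until it succeeds or the attempt limit is reached."""
    attempts = 0
    # With max_attempts == inf this condition never becomes False.
    while attempts < max_attempts:
        attempts += 1
        if try_connect():
            return attempts
    return None  # gave up


# Hypothetical flaky connection that succeeds on the fifth call.
outcomes = iter([False, False, False, False, True])
used = connect_with_retries(lambda: next(outcomes))
limited = connect_with_retries(lambda: False, max_attempts=3)
print(used, limited)  # 5 None
```

The loop needs no special case for "retry forever": the same comparison covers both the finite and the infinite limit.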
Some programmers use Infinity or NaNs to show a variable has never been initialized or assigned in the program.
Use it if you want the largest number from user input, where the user might enter very large negative numbers. If I enter -13543124321.431, it still works out as the largest number, since it's bigger than -inf.
initial_value = float('-inf')
while True:
    try:
        x = input('gimmee a number or type the word, stop ')
    except KeyboardInterrupt:
        print("we done - by yo command")
        break
    if x == "stop":
        print("we done")
        break
    try:
        x = float(x)
    except ValueError:
        print('not a number')
        continue
    if x > initial_value: initial_value = x
print("The largest number is: " + str(initial_value))
You can also use:
import decimal
decimal.Decimal("Infinity")
or:
from decimal import *
Decimal("Infinity")
For sorting
I've seen it used as a sort value, to say "always sort these items to the bottom".
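A small Python sketch of that pattern, with made-up task data where a missing priority means "sort to the bottom":

```python
tasks = [
    {"name": "write tests", "priority": 2},
    {"name": "someday/maybe", "priority": None},
    {"name": "fix crash", "priority": 1},
]


def sort_key(task):
    # Items without a priority get +inf, so they sort after everything else.
    priority = task["priority"]
    return float("inf") if priority is None else priority


ordered = [task["name"] for task in sorted(tasks, key=sort_key)]
print(ordered)  # ['fix crash', 'write tests', 'someday/maybe']
```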
To specify a non-existent maximum
If you're dealing with numbers, nil represents an unknown quantity, and should be preferred to 0 for that case. Similarly, Infinity represents an unbounded quantity, and should be preferred to (arbitrarily_large_number) in that case.
I think it can make the code cleaner. For example, I'm using Float::INFINITY in a Ruby gem for exactly that: the user can specify a maximum string length for a message, or they can specify :all. In that case, I represent the maximum length as Float::INFINITY, so that later when I check "is this message longer than the maximum length?" the answer will always be false, without needing a special case.
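The same idea translated to Python, as a sketch only: the function is_too_long is invented for illustration and is not the gem's actual code.

```python
def is_too_long(message, max_length=float("inf")):
    """True when the message exceeds max_length; never true for inf."""
    return len(message) > max_length


print(is_too_long("hello world", max_length=5))  # True
print(is_too_long("hello world"))                # False
print(is_too_long("x" * 10_000))                 # False
```

With infinity as the default, "no limit" needs no separate branch in the length check.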