[c++] Comparing floating point number to zero

The C++ FAQ lite "[29.17] Why doesn't my floating-point comparison work?" recommends this equality test:

#include <cmath>  /* for std::abs(double) */

inline bool isEqual(double x, double y)
{
  const double epsilon = 1e-5; /* some small number such as 1e-5 */
  return std::abs(x - y) <= epsilon * std::abs(x);
  // see Knuth section 4.2.2 pages 217-218
}
  1. Is it correct that this implies that the only numbers which are equal to zero are +0 and -0?
  2. Should one also use this function when testing for zero, or rather a test like |x| < epsilon?

Update

As Daniel Daranas pointed out, the function should probably be called isNearlyEqual (which is the case I care about).

Someone pointed out "Comparing Floating Point Numbers", which I want to share more prominently.

Tags: c++, floating-point

Answers


No.

Equality is equality.

The function you wrote will not test two doubles for equality, as its name promises. It will only test if two doubles are "close enough" to each other.

If you really want to test two doubles for equality, use this one:

inline bool isEqual(double x, double y)
{
   return x == y;
}

Coding standards usually recommend against comparing two doubles for exact equality. But that is a different subject. If you actually want to compare two doubles for exact equality, x == y is the code you want.

10.000000000000001 is not equal to 10.0, no matter what they tell you.

An example of using exact equality is when a particular value of a double is used as a synonym for some special state, such as "pending calculation" or "no data available". This is possible only if the actual numeric values after that pending calculation are a subset of the possible values of a double. The most typical case is when that value is nonnegative, and you use -1.0 as an (exact) representation of "pending calculation" or "no data available". You could represent that with a constant:

const double NO_DATA = -1.0;

double myData = getSomeDataWhichIsAlwaysNonNegative(someParameters);

if (myData != NO_DATA)
{
    ...
}

If you are only interested in +0.0 and -0.0, you can use std::fpclassify from <cmath>. For instance:

if (std::fpclassify(x) == FP_ZERO) { /* x is exactly +0.0 or -0.0 */ }
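For example, here is a minimal sketch (the sample values are illustrative): only the true zeros classify as FP_ZERO, while tiny-but-nonzero values come out as subnormal or normal.

#include <cmath>
#include <cstdio>

int main() {
    const double values[] = { +0.0, -0.0, 1e-320, 1e-5 };
    for (double v : values) {
        // only exact zeros classify as FP_ZERO; 1e-320 is subnormal, 1e-5 is normal
        std::printf("%g -> zero? %d\n", v, std::fpclassify(v) == FP_ZERO ? 1 : 0);
    }
    return 0;
}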


As @Exceptyon pointed out, this function is 'relative' to the values you're comparing. The epsilon * abs(x) measure scales with the value of x, so you'll get a comparison accurate to within epsilon irrespective of the magnitude of x or y.

If you're comparing zero (y) to another really small value (x), say 1e-8, then abs(x - y) = 1e-8 will still be much larger than epsilon * abs(x) = 1e-13. So unless you're dealing with extremely small numbers that can't be represented in a double, this function does the job and will match zero only against +0 and -0.

The function seems perfectly valid for zero comparison. If you're planning to use it, I suggest you use it everywhere floats are involved, rather than having special cases for things like zero, so that the code stays uniform.
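To see that concretely, here is a small sketch using the question's isEqual with epsilon fixed at 1e-5:

#include <cmath>
#include <cstdio>

inline bool isEqual(double x, double y)
{
    const double epsilon = 1e-5;
    return std::abs(x - y) <= epsilon * std::abs(x);
}

int main() {
    std::printf("%d\n", isEqual(1e-8, 0.0) ? 1 : 0);  // 0: 1e-8 does not count as zero
    std::printf("%d\n", isEqual(0.0, -0.0) ? 1 : 0);  // 1: only +0 and -0 match zero
    return 0;
}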

P.S.: This is a neat function. Thanks for pointing to it.


Consider this example:

bool isEqual = (23.42f == 23.42);

What is isEqual? 9 out of 10 people will say "It's true, of course" and 9 out of 10 people are wrong: https://rextester.com/RVL15906

That's because floating point numbers are not exact numeric representations.

Being binary, they cannot even exactly represent all numbers that can be exactly represented as decimals. E.g. while 0.1 can be represented exactly as a decimal number (it is exactly one tenth of 1), it cannot be represented exactly in binary floating point, because in binary it is the periodic fraction 0.00011001100110011... 0.1 is for binary floating point what 1/3 is for decimal (where it is the periodic 0.33333...).

The consequence is that calculations like 0.3 + 0.6 can result in 0.89999999999999991, which is not 0.9, although it is close to it. And thus a test like 0.1 + 0.2 - 0.3 == 0.0 might fail, because the result of the calculation may not be exactly 0, although it will be very close to 0.
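You can see the stored values by printing with more digits than the default formatting shows; a small sketch (the digit strings in the comments are what IEEE 754 doubles typically produce):

#include <cstdio>

int main() {
    std::printf("%.17g\n", 0.1);              // 0.10000000000000001
    std::printf("%.17g\n", 0.3 + 0.6);        // 0.89999999999999991
    std::printf("%.17g\n", 0.1 + 0.2 - 0.3);  // 5.5511151231257827e-17, not 0
    return 0;
}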

== is an exact test, and performing an exact test on inexact numbers is usually not very meaningful. Since many floating point calculations involve rounding errors, you usually want your comparisons to allow small errors as well, and that is what the test code you posted is all about. Instead of testing "is A equal to B" it tests "is A very close to B", because very close is quite often the best result you can expect from floating point calculations.


Simple comparison of FP numbers has its own peculiarities, and the key to understanding them is the FP format itself (see https://en.wikipedia.org/wiki/IEEE_floating_point).

When FP numbers are calculated in different ways, one through sin() and the other through exp(), strict equality won't work, even though mathematically the numbers may be equal. In the same way, equality with a constant won't work. In fact, in many situations FP numbers must not be compared using strict equality (==).

In such cases you should use the DBL_EPSILON constant, which is the smallest value that still changes the representation of 1.0 when added to it. For floating point numbers greater than 2.0, adding DBL_EPSILON changes nothing at all. Meanwhile, DBL_EPSILON has exponent -16, which means that all numbers with, say, exponent -34 would compare as absolutely equal against a plain DBL_EPSILON tolerance.
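A small sketch of those properties (they follow directly from how doubles are spaced):

#include <cfloat>
#include <cstdio>

int main() {
    // DBL_EPSILON is the gap between 1.0 and the next representable double
    std::printf("%d\n", 1.0 + DBL_EPSILON != 1.0 ? 1 : 0);  // 1
    // at 2.0 the gap has doubled, so adding DBL_EPSILON rounds back to 2.0
    std::printf("%d\n", 2.0 + DBL_EPSILON == 2.0 ? 1 : 0);  // 1
    // and an unscaled DBL_EPSILON tolerance swallows tiny numbers entirely
    std::printf("%d\n", 1e-34 <= DBL_EPSILON ? 1 : 0);      // 1
    return 0;
}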

Also, see the example of why 10.0 == 10.0000000000000001: the two literals are so close that they round to the very same double.

Comparing two floating point numbers thus depends on the numbers' nature: we should scale DBL_EPSILON so that it becomes meaningful for the comparison. Simply put, we should multiply DBL_EPSILON by one of the two numbers. Which of them? The maximum, of course:

#include <cmath>   // std::fabs, std::fmax
#include <cfloat>  // DBL_EPSILON

bool close_enough(double a, double b)
{
    // scale the tolerance by the larger magnitude of the two operands
    return std::fabs(a - b) <= DBL_EPSILON * std::fmax(std::fabs(a), std::fabs(b));
}
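For instance, with the classic 0.1 + 0.2 case (a usage sketch assuming the close_enough above):

#include <cstdio>

int main() {
    double a = 0.1 + 0.2;  // stored as 0.30000000000000004..., one ulp above 0.3
    double b = 0.3;
    std::printf("exact: %d\n", a == b ? 1 : 0);              // exact: 0
    std::printf("close: %d\n", close_enough(a, b) ? 1 : 0);  // close: 1
    return 0;
}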

Any other way will give you inequality bugs that can be very hard to catch.


You can use std::nextafter with a fixed factor of the epsilon of a value, like the following:

#include <cmath>   // std::nextafter
#include <limits>  // std::numeric_limits

bool isNearlyEqual(double a, double b)
{
  const int factor = 4; /* a fixed factor of epsilon; 4 is just an example value */

  // min_a and max_a bracket a by "factor" representable steps in each direction
  double min_a = a - (a - std::nextafter(a, std::numeric_limits<double>::lowest())) * factor;
  double max_a = a + (std::nextafter(a, std::numeric_limits<double>::max()) - a) * factor;

  return min_a <= b && max_a >= b;
}
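Each std::nextafter call moves one representable value (one ulp) away from a, so factor is effectively a tolerance measured in ulps. For example, with the illustrative factor of 4 above:

bool near = isNearlyEqual(0.1 + 0.2, 0.3);  // true: the two results differ by one ulp
bool far  = isNearlyEqual(1.0, 1.1);        // false: vastly more than 4 ulps apart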

Notice that the posted code is equivalent to

std::abs((x - y) / x) <= epsilon

i.e. you are requiring that the "relative error" on the variable be <= epsilon, not that the absolute difference be. (The multiplied form has the advantage of avoiding the division, which is exactly what breaks down when x is zero.)


2 + 2 = 5(*)

(for some floating-precision values of 2)

This problem frequently arises when we think of "floating point" as a way to increase precision. Then we run afoul of the "floating" part, which means there is no guarantee of which numbers can be represented.

So while we can easily represent some numbers exactly (1.0, -1.0, 0.5, -0.5), others (0.1, -0.1) are only approximated, and the approximations get coarser as the numbers grow - or we would see that, except we often hide it by truncating the numbers for display.

As a result, we might think the computer is storing "0.003" when it may instead be storing something like "0.0030000000000000001".

What happens if you perform "0.0003 - 0.0002"? We expect 0.0001, but the values actually stored might be more like "0.00033" - "0.00029", which yields "0.00004", or rather the closest representable value, which might round toward 0, or might be something like "0.00006".

With current floating point math operations, it is not guaranteed that (a / b) * b == a.

#include <stdio.h>

// defeat inline optimizations of 'a / b * b' to 'a'
extern double bodge(int base, int divisor) {
    return static_cast<double>(base) / static_cast<double>(divisor);
}

int main() {
    int errors = 0;
    for (int b = 1; b < 100; ++b) {
        for (int d = 1; d < 100; ++d) {
            // b / d * d ... should == b
            double res = bodge(b, d) * static_cast<double>(d);
            // but it doesn't always
            if (res != static_cast<double>(b))
                ++errors;
        }
    }
    printf("errors: %d\n", errors);
}

ideone reports 599 instances where (b / d) * d != b, using just the 9,801 combinations of 1 <= b < 100 and 1 <= d < 100.

The solution described in the FAQ is essentially to apply a granularity constraint - to test if (a == b +/- epsilon).

An alternative approach is to avoid the problem entirely by using fixed point arithmetic, or by using your desired granularity as the base unit for your storage. E.g. if you want times stored with nanosecond precision, use nanoseconds as your unit of storage.

C++11 introduced std::ratio as the basis for exact, compile-time conversions between different time units in std::chrono.
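For instance, here is a small sketch of the nanoseconds-as-base-unit idea using std::chrono, where times are integer tick counts and exact comparison is meaningful again:

#include <chrono>
#include <cstdio>

int main() {
    using namespace std::chrono;

    // times stored as integer nanosecond counts, not floating point seconds
    nanoseconds a = seconds(1) + nanoseconds(500);
    nanoseconds b = nanoseconds(1000000500);

    // integer arithmetic is exact, so == behaves as expected
    std::printf("equal: %d\n", a == b ? 1 : 0);  // prints "equal: 1"
    return 0;
}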