[c++] Comparing floating point number to zero

The C++ FAQ lite "[29.17] Why doesn't my floating-point comparison work?" recommends this equality test:

#include <cmath>  /* for std::abs(double) */

inline bool isEqual(double x, double y)
{
  const double epsilon = 1e-5; /* some small number such as 1e-5 */
  return std::abs(x - y) <= epsilon * std::abs(x);
  // see Knuth section 4.2.2 pages 217-218
}
  1. Is it correct that this implies that the only numbers which are equal to zero are +0 and -0?
  2. Should one also use this function when testing for zero, or rather a test like |x| < epsilon?

Update

As Daniel Daranas pointed out, the function should probably be called isNearlyEqual (which is the case I care about).

Someone pointed out "Comparing Floating Point Numbers", which I want to share more prominently.

Tags: c++, floating-point

Answers


No.

Equality is equality.

The function you wrote will not test two doubles for equality, as its name promises. It will only test if two doubles are "close enough" to each other.

If you really want to test two doubles for equality, use this one:

inline bool isEqual(double x, double y)
{
   return x == y;
}

Coding standards usually recommend against comparing two doubles for exact equality. But that is a different subject. If you actually want to compare two doubles for exact equality, x == y is the code you want.

10.000000000000001 is not equal to 10.0, no matter what they tell you.

An example of using exact equality is when a particular value of a double is used as a synonym for some special state, such as "pending calculation" or "no data available". This is possible only if the actual numeric values after that pending calculation are a subset of the possible values of a double. The most typical case is when that value is nonnegative, and you use -1.0 as an (exact) representation of "pending calculation" or "no data available". You could represent that with a constant:

const double NO_DATA = -1.0;

double myData = getSomeDataWhichIsAlwaysNonNegative(someParameters);

if (myData != NO_DATA)
{
    ...
}

If you are only interested in +0.0 and -0.0, you can use std::fpclassify from <cmath>. For instance:

if (std::fpclassify(x) == FP_ZERO) { /* x is exactly +0.0 or -0.0 */ }
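For example, here is a minimal sketch (the sample values are illustrative): only the true zeros classify as FP_ZERO, while tiny-but-nonzero values come out as subnormal or normal.

#include <cmath>
#include <cstdio>

int main() {
    const double values[] = { +0.0, -0.0, 1e-320, 1e-5 };
    for (double v : values) {
        // only exact zeros classify as FP_ZERO; 1e-320 is subnormal, 1e-5 is normal
        std::printf("%g -> zero? %d\n", v, std::fpclassify(v) == FP_ZERO ? 1 : 0);
    }
    return 0;
}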


As @Exceptyon pointed out, this function is 'relative' to the values you're comparing. The epsilon * abs(x) measure scales with the value of x, so you'll get a comparison accurate to within epsilon irrespective of the magnitude of x or y.

If you're comparing zero (y) to another really small value (x), say 1e-8, then abs(x - y) = 1e-8 will still be much larger than epsilon * abs(x) = 1e-13. So unless you're dealing with extremely small numbers that can't be represented in a double, this function does the job and will match zero only against +0 and -0.

The function seems perfectly valid for zero comparison. If you're planning to use it, I suggest you use it everywhere floats are involved, rather than having special cases for things like zero, so that the code stays uniform.
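To see that concretely, here is a small sketch using the question's isEqual with epsilon fixed at 1e-5:

#include <cmath>
#include <cstdio>

inline bool isEqual(double x, double y)
{
    const double epsilon = 1e-5;
    return std::abs(x - y) <= epsilon * std::abs(x);
}

int main() {
    std::printf("%d\n", isEqual(1e-8, 0.0) ? 1 : 0);  // 0: 1e-8 does not count as zero
    std::printf("%d\n", isEqual(0.0, -0.0) ? 1 : 0);  // 1: only +0 and -0 match zero
    return 0;
}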

P.S.: This is a neat function. Thanks for pointing to it.


Consider this example:

bool isEqual = (23.42f == 23.42);

What is isEqual? 9 out of 10 people will say "It's true, of course" and 9 out of 10 people are wrong: https://rextester.com/RVL15906

That's because floating point numbers are not exact numeric representations.

Being binary, they cannot even exactly represent all numbers that can be exactly represented as decimals. E.g. while 0.1 can be represented exactly as a decimal number (it is exactly one tenth of 1), it cannot be represented exactly in binary floating point, because in binary it is the periodic fraction 0.00011001100110011... 0.1 is for binary floating point what 1/3 is for decimal (where it is the periodic 0.33333...).

The consequence is that calculations like 0.3 + 0.6 can result in 0.89999999999999991, which is not 0.9, although it is close to it. And thus a test like 0.1 + 0.2 - 0.3 == 0.0 might fail, because the result of the calculation may not be exactly 0, although it will be very close to 0.
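You can see the stored values by printing with more digits than the default formatting shows; a small sketch (the digit strings in the comments are what IEEE 754 doubles typically produce):

#include <cstdio>

int main() {
    std::printf("%.17g\n", 0.1);              // 0.10000000000000001
    std::printf("%.17g\n", 0.3 + 0.6);        // 0.89999999999999991
    std::printf("%.17g\n", 0.1 + 0.2 - 0.3);  // 5.5511151231257827e-17, not 0
    return 0;
}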

== is an exact test, and performing an exact test on inexact numbers is usually not very meaningful. Since many floating point calculations involve rounding errors, you usually want your comparisons to allow small errors as well, and that is what the test code you posted is all about. Instead of testing "is A equal to B" it tests "is A very close to B", because very close is quite often the best result you can expect from floating point calculations.


Simple comparison of FP numbers has its own peculiarities, and the key to understanding them is the FP format itself (see https://en.wikipedia.org/wiki/IEEE_floating_point).

When FP numbers are calculated in different ways, one through sin() and the other through exp(), strict equality won't work, even though mathematically the numbers may be equal. In the same way, equality with a constant won't work. In fact, in many situations FP numbers must not be compared using strict equality (==).

In such cases you should use the DBL_EPSILON constant, which is the smallest value that still changes the representation of 1.0 when added to it. For floating point numbers greater than 2.0, adding DBL_EPSILON changes nothing at all. Meanwhile, DBL_EPSILON has exponent -16, which means that all numbers with, say, exponent -34 would compare as absolutely equal against a plain DBL_EPSILON tolerance.
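A small sketch of those properties (they follow directly from how doubles are spaced):

#include <cfloat>
#include <cstdio>

int main() {
    // DBL_EPSILON is the gap between 1.0 and the next representable double
    std::printf("%d\n", 1.0 + DBL_EPSILON != 1.0 ? 1 : 0);  // 1
    // at 2.0 the gap has doubled, so adding DBL_EPSILON rounds back to 2.0
    std::printf("%d\n", 2.0 + DBL_EPSILON == 2.0 ? 1 : 0);  // 1
    // and an unscaled DBL_EPSILON tolerance swallows tiny numbers entirely
    std::printf("%d\n", 1e-34 <= DBL_EPSILON ? 1 : 0);      // 1
    return 0;
}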

Also, see the example of why 10.0 == 10.0000000000000001: the two literals are so close that they round to the very same double.

Comparing two floating point numbers thus depends on the numbers' nature: we should scale DBL_EPSILON so that it becomes meaningful for the comparison. Simply put, we should multiply DBL_EPSILON by one of the two numbers. Which of them? The maximum, of course:

#include <cmath>   // std::fabs, std::fmax
#include <cfloat>  // DBL_EPSILON

bool close_enough(double a, double b)
{
    // scale the tolerance by the larger magnitude of the two operands
    return std::fabs(a - b) <= DBL_EPSILON * std::fmax(std::fabs(a), std::fabs(b));
}
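For instance, with the classic 0.1 + 0.2 case (a usage sketch assuming the close_enough above):

#include <cstdio>

int main() {
    double a = 0.1 + 0.2;  // stored as 0.30000000000000004..., one ulp above 0.3
    double b = 0.3;
    std::printf("exact: %d\n", a == b ? 1 : 0);              // exact: 0
    std::printf("close: %d\n", close_enough(a, b) ? 1 : 0);  // close: 1
    return 0;
}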

Any other way will give you inequality bugs that can be very hard to catch.


You can use std::nextafter with a fixed factor of the epsilon of a value, like the following:

#include <cmath>   // std::nextafter
#include <limits>  // std::numeric_limits

bool isNearlyEqual(double a, double b)
{
  const int factor = 4; /* a fixed factor of epsilon; 4 is just an example value */

  // min_a and max_a bracket a by "factor" representable steps in each direction
  double min_a = a - (a - std::nextafter(a, std::numeric_limits<double>::lowest())) * factor;
  double max_a = a + (std::nextafter(a, std::numeric_limits<double>::max()) - a) * factor;

  return min_a <= b && max_a >= b;
}
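Each std::nextafter call moves one representable value (one ulp) away from a, so factor is effectively a tolerance measured in ulps. For example, with the illustrative factor of 4 above:

bool near = isNearlyEqual(0.1 + 0.2, 0.3);  // true: the two results differ by one ulp
bool far  = isNearlyEqual(1.0, 1.1);        // false: vastly more than 4 ulps apart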

Notice that the posted code is equivalent to

std::abs((x - y) / x) <= epsilon

i.e. you are requiring that the "relative error" on the variable be <= epsilon, not that the absolute difference be. (The multiplied form has the advantage of avoiding the division, which is exactly what breaks down when x is zero.)


2 + 2 = 5(*)

(for some floating-precision values of 2)

This problem frequently arises when we think of "floating point" as a way to increase precision. Then we run afoul of the "floating" part, which means there is no guarantee of which numbers can be represented.

So while we can easily represent some numbers exactly (1.0, -1.0, 0.5, -0.5), others (0.1, -0.1) are only approximated, and the approximations get coarser as the numbers grow - or we would see that, except we often hide it by truncating the numbers for display.

As a result, we might think the computer is storing "0.003" when it may instead be storing something like "0.0030000000000000001".

What happens if you perform "0.0003 - 0.0002"? We expect 0.0001, but the values actually stored might be more like "0.00033" - "0.00029", which yields "0.00004", or rather the closest representable value, which might round toward 0, or might be something like "0.00006".

With current floating point math operations, it is not guaranteed that (a / b) * b == a.

#include <stdio.h>

// defeat inline optimizations of 'a / b * b' to 'a'
extern double bodge(int base, int divisor) {
    return static_cast<double>(base) / static_cast<double>(divisor);
}

int main() {
    int errors = 0;
    for (int b = 1; b < 100; ++b) {
        for (int d = 1; d < 100; ++d) {
            // b / d * d ... should == b
            double res = bodge(b, d) * static_cast<double>(d);
            // but it doesn't always
            if (res != static_cast<double>(b))
                ++errors;
        }
    }
    printf("errors: %d\n", errors);
}

ideone reports 599 instances where (b / d) * d != b, using just the 9,801 combinations of 1 <= b < 100 and 1 <= d < 100.

The solution described in the FAQ is essentially to apply a granularity constraint - to test if (a == b +/- epsilon).

An alternative approach is to avoid the problem entirely by using fixed point arithmetic, or by using your desired granularity as the base unit for your storage. E.g. if you want times stored with nanosecond precision, use nanoseconds as your unit of storage.

C++11 introduced std::ratio as the basis for exact, compile-time conversions between different time units in std::chrono.
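For instance, here is a small sketch of the nanoseconds-as-base-unit idea using std::chrono, where times are integer tick counts and exact comparison is meaningful again:

#include <chrono>
#include <cstdio>

int main() {
    using namespace std::chrono;

    // times stored as integer nanosecond counts, not floating point seconds
    nanoseconds a = seconds(1) + nanoseconds(500);
    nanoseconds b = nanoseconds(1000000500);

    // integer arithmetic is exact, so == behaves as expected
    std::printf("equal: %d\n", a == b ? 1 : 0);  // prints "equal: 1"
    return 0;
}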