[floating-point] Floating point inaccuracy examples

How do you explain floating point inaccuracy to fresh programmers and laymen who still think computers are infinitely wise and accurate?
Do you have a favourite example or anecdote which seems to get the idea across much better than an precise, but dry, explanation?
How is this taught in Computer Science classes?

This question is related to floating-point floating-accuracy

The answer is


Here is my simple understanding.

Problem: The value 0.45 cannot be accurately be represented by a float and is rounded up to 0.450000018. Why is that?

Answer: An int value of 45 is represented by the binary value 101101. In order to make the value 0.45 it would be accurate if it you could take 45 x 10^-2 (= 45 / 10^2.) But that’s impossible because you must use the base 2 instead of 10.

So the closest to 10^2 = 100 would be 128 = 2^7. The total number of bits you need is 9 : 6 for the value 45 (101101) + 3 bits for the value 7 (111). Then the value 45 x 2^-7 = 0.3515625. Now you have a serious inaccuracy problem. 0.3515625 is not nearly close to 0.45.

How do we improve this inaccuracy? Well we could change the value 45 and 7 to something else.

How about 460 x 2^-10 = 0.44921875. You are now using 9 bits for 460 and 4 bits for 10. Then it’s a bit closer but still not that close. However if your initial desired value was 0.44921875 then you would get an exact match with no approximation.

So the formula for your value would be X = A x 2^B. Where A and B are integer values positive or negative. Obviously the higher the numbers can be the higher would your accuracy become however as you know the number of bits to represent the values A and B are limited. For float you have a total number of 32. Double has 64 and Decimal has 128.


In python:

>>> 1.0 / 10
0.10000000000000001

Explain how some fractions cannot be represented precisely in binary. Just like some fractions (like 1/3) cannot be represented precisely in base 10.


Show them that the base-10 system suffers from exactly the same problem.

Try to represent 1/3 as a decimal representation in base 10. You won't be able to do it exactly.

So if you write "0.3333", you will have a reasonably exact representation for many use cases.

But if you move that back to a fraction, you will get "3333/10000", which is not the same as "1/3".

Other fractions, such as 1/2 can easily be represented by a finite decimal representation in base-10: "0.5"

Now base-2 and base-10 suffer from essentially the same problem: both have some numbers that they can't represent exactly.

While base-10 has no problem representing 1/10 as "0.1" in base-2 you'd need an infinite representation starting with "0.000110011..".


How's this for an explantation to the layman. One way computers represent numbers is by counting discrete units. These are digital computers. For whole numbers, those without a fractional part, modern digital computers count powers of two: 1, 2, 4, 8. ,,, Place value, binary digits, blah , blah, blah. For fractions, digital computers count inverse powers of two: 1/2, 1/4, 1/8, ... The problem is that many numbers can't be represented by a sum of a finite number of those inverse powers. Using more place values (more bits) will increase the precision of the representation of those 'problem' numbers, but never get it exactly because it only has a limited number of bits. Some numbers can't be represented with an infinite number of bits.

Snooze...

OK, you want to measure the volume of water in a container, and you only have 3 measuring cups: full cup, half cup, and quarter cup. After counting the last full cup, let's say there is one third of a cup remaining. Yet you can't measure that because it doesn't exactly fill any combination of available cups. It doesn't fill the half cup, and the overflow from the quarter cup is too small to fill anything. So you have an error - the difference between 1/3 and 1/4. This error is compounded when you combine it with errors from other measurements.


A cute piece of numerical weirdness may be observed if one converts 9999999.4999999999 to a float and back to a double. The result is reported as 10000000, even though that value is obviously closer to 9999999, and even though 9999999.499999999 correctly rounds to 9999999.


Another example, in C

printf (" %.20f \n", 3.6);

incredibly gives

3.60000000000000008882