[math] Why can't decimal numbers be represented exactly in binary?

There have been several questions posted to SO about floating-point representation. For example, the decimal number 0.1 doesn't have an exact binary representation, so it's dangerous to use the == operator to compare it to another floating-point number. I understand the principles behind floating-point representation.

What I don't understand is why, from a mathematical perspective, are the numbers to the right of the decimal point any more "special" that the ones to the left?

For example, the number 61.0 has an exact binary representation because the integral portion of any number is always exact. But the number 6.10 is not exact. All I did was move the decimal one place and suddenly I've gone from Exactopia to Inexactville. Mathematically, there should be no intrinsic difference between the two numbers -- they're just numbers.

By contrast, if I move the decimal one place in the other direction to produce the number 610, I'm still in Exactopia. I can keep going in that direction (6100, 610000000, 610000000000000) and they're still exact, exact, exact. But as soon as the decimal crosses some threshold, the numbers are no longer exact.

What's going on?

Edit: to clarify, I want to stay away from discussion about industry-standard representations, such as IEEE, and stick with what I believe is the mathematically "pure" way. In base 10, the positional values are:

... 1000  100   10    1   1/10  1/100 ...

In binary, they would be:

... 8    4    2    1    1/2  1/4  1/8 ...

There are also no arbitrary limits placed on these numbers. The positions increase indefinitely to the left and to the right.

This question is related to math floating-point

The answer is


The root (mathematical) reason is that when you are dealing with integers, they are countably infinite.

Which means, even though there are an infinite amount of them, we could "count out" all of the items in the sequence, without skipping any. That means if we want to get the item in the 610000000000000th position in the list, we can figure it out via a formula.

However, real numbers are uncountably infinite. You can't say "give me the real number at position 610000000000000" and get back an answer. The reason is because, even between 0 and 1, there are an infinite number of values, when you are considering floating-point values. The same holds true for any two floating point numbers.

More info:

http://en.wikipedia.org/wiki/Countable_set

http://en.wikipedia.org/wiki/Uncountable_set

Update: My apologies, I appear to have misinterpreted the question. My response is about why we cannot represent every real value, I hadn't realized that floating point was automatically classified as rational.


There are an infinite number of rational numbers, and a finite number of bits with which to represent them. See http://en.wikipedia.org/wiki/Floating_point#Accuracy_problems.


The problem is that you do not really know whether the number actually is exactly 61.0 . Consider this:


float a = 60;
float b = 0.1;
float c = a + b * 10;

What is the value of c? It is not exactly 61, because b is not really .1 because .1 does not have an exact binary representation.


It's the same reason you cannot represent 1/3 exactly in base 10, you need to say 0.33333(3). In binary it is the same type of problem but just occurs for different set of numbers.


For a simple answer: The computer doesn't have infinite memory to store fraction (after representing the decimal number as the form of scientific notation). According to IEEE 754 standard for double-precision floating-point numbers, we only have a limit of 53 bits to store fraction. For more info: http://mathcenter.oxford.emory.edu/site/cs170/ieee754/


A parallel can be made of fractions and whole numbers. Some fractions eg 1/7 cannot be represented in decimal form without lots and lots of decimals. Because floating point is binary based the special cases change but the same sort of accuracy problems present themselves.


As we have been discussing, in floating point arithmetic, the decimal 0.1 cannot be perfectly represented in binary.

Floating point and integer representations provide grids or lattices for the numbers represented. As arithmetic is done, the results fall off the grid and have to be put back onto the grid by rounding. Example is 1/10 on a binary grid.

If we use binary coded decimal representation as one gentleman suggested, would we be able to keep numbers on the grid?


This is a good question.

All your question is based on "how do we represent a number?"

ALL the numbers can be represented with decimal representation or with binary (2's complement) representation. All of them !!

BUT some (most of them) require infinite number of elements ("0" or "1" for the binary position, or "0", "1" to "9" for the decimal representation).

Like 1/3 in decimal representation (1/3 = 0.3333333... <- with an infinite number of "3")

Like 0.1 in binary ( 0.1 = 0.00011001100110011.... <- with an infinite number of "0011")

Everything is in that concept. Since your computer can only consider finite set of digits (decimal or binary), only some numbers can be exactly represented in your computer...

And as said Jon, 3 is a prime number which isn't a factor of 10, so 1/3 cannot be represented with a finite number of elements in base 10.

Even with arithmetic with arbitrary precision, the numbering position system in base 2 is not able to fully describe 6.1, although it can represent 61.

For 6.1, we must use another representation (like decimal representation, or IEEE 854 that allows base 2 or base 10 for the representation of floating-point values)


(Note: I'll append 'b' to indicate binary numbers here. All other numbers are given in decimal)

One way to think about things is in terms of something like scientific notation. We're used to seeing numbers expressed in scientific notation like, 6.022141 * 10^23. Floating point numbers are stored internally using a similar format - mantissa and exponent, but using powers of two instead of ten.

Your 61.0 could be rewritten as 1.90625 * 2^5, or 1.11101b * 2^101b with the mantissa and exponents. To multiply that by ten and (move the decimal point), we can do:

(1.90625 * 2^5) * (1.25 * 2^3) = (2.3828125 * 2^8) = (1.19140625 * 2^9)

or in with the mantissa and exponents in binary:

(1.11101b * 2^101b) * (1.01b * 2^11b) = (10.0110001b * 2^1000b) = (1.00110001b * 2^1001b)

Note what we did there to multiply the numbers. We multiplied the mantissas and added the exponents. Then, since the mantissa ended greater than two, we normalized the result by bumping the exponent. It's just like when we adjust the exponent after doing an operation on numbers in decimal scientific notation. In each case, the values that we worked with had a finite representation in binary, and so the values output by the basic multiplication and addition operations also produced values with a finite representation.

Now, consider how we'd divide 61 by 10. We'd start by dividing the mantissas, 1.90625 and 1.25. In decimal, this gives 1.525, a nice short number. But what is this if we convert it to binary? We'll do it the usual way -- subtracting out the largest power of two whenever possible, just like converting integer decimals to binary, but we'll use negative powers of two:

1.525         - 1*2^0   --> 1
0.525         - 1*2^-1  --> 1
0.025         - 0*2^-2  --> 0
0.025         - 0*2^-3  --> 0
0.025         - 0*2^-4  --> 0
0.025         - 0*2^-5  --> 0
0.025         - 1*2^-6  --> 1
0.009375      - 1*2^-7  --> 1
0.0015625     - 0*2^-8  --> 0
0.0015625     - 0*2^-9  --> 0
0.0015625     - 1*2^-10 --> 1
0.0005859375  - 1*2^-11 --> 1
0.00009765625...

Uh oh. Now we're in trouble. It turns out that 1.90625 / 1.25 = 1.525, is a repeating fraction when expressed in binary: 1.11101b / 1.01b = 1.10000110011...b Our machines only have so many bits to hold that mantissa and so they'll just round the fraction and assume zeroes beyond a certain point. The error you see when you divide 61 by 10 is the difference between:

1.100001100110011001100110011001100110011...b * 2^10b
and, say:
1.100001100110011001100110b * 2^10b

It's this rounding of the mantissa that leads to the loss of precision that we associate with floating point values. Even when the mantissa can be expressed exactly (e.g., when just adding two numbers), we can still get numeric loss if the mantissa needs too many digits to fit after normalizing the exponent.

We actually do this sort of thing all the time when we round decimal numbers to a manageable size and just give the first few digits of it. Because we express the result in decimal it feels natural. But if we rounded a decimal and then converted it to a different base, it'd look just as ugly as the decimals we get due to floating point rounding.


In the equation

2^x = y ;  
x = log(y) / log(2)

Hence, I was just wondering if we could have a logarithmic base system for binary like,

 2^1, 2^0, 2^(log(1/2) / log(2)), 2^(log(1/4) / log(2)), 2^(log(1/8) / log(2)),2^(log(1/16) / log(2)) ........

That might be able to solve the problem, so if you wanted to write something like 32.41 in binary, that would be

2^5 + 2^(log(0.4) / log(2)) + 2^(log(0.01) / log(2))

Or

2^5 + 2^(log(0.41) / log(2))

The high scoring answer above nailed it.

First you were mixing base 2 and base 10 in your question, then when you put a number on the right side that is not divisible into the base you get problems. Like 1/3 in decimal because 3 doesnt go into a power of 10 or 1/5 in binary which doesnt go into a power of 2.

Another comment though NEVER use equal with floating point numbers, period. Even if it is an exact representation there are some numbers in some floating point systems that can be accurately represented in more than one way (IEEE is bad about this, it is a horrible floating point spec to start with, so expect headaches). No different here 1/3 is not EQUAL to the number on your calculator 0.3333333, no matter how many 3's there are to the right of the decimal point. It is or can be close enough but is not equal. so you would expect something like 2*1/3 to not equal 2/3 depending on the rounding. Never use equal with floating point.


For example, the number 61.0 has an exact binary representation because the integral portion of any number is always exact. But the number 6.10 is not exact. All I did was move the decimal one place and suddenly I've gone from Exactopia to Inexactville. Mathematically, there should be no intrinsic difference between the two numbers -- they're just numbers.

Let's step away for a moment from the particulars of bases 10 and 2. Let's ask - in base b, what numbers have terminating representations, and what numbers don't? A moment's thought tells us that a number x has a terminating b-representation if and only if there exists an integer n such that x b^n is an integer.

So, for example, x = 11/500 has a terminating 10-representation, because we can pick n = 3 and then x b^n = 22, an integer. However x = 1/3 does not, because whatever n we pick we will not be able to get rid of the 3.

This second example prompts us to think about factors, and we can see that for any rational x = p/q (assumed to be in lowest terms), we can answer the question by comparing the prime factorisations of b and q. If q has any prime factors not in the prime factorisation of b, we will never be able to find a suitable n to get rid of these factors.

Thus for base 10, any p/q where q has prime factors other than 2 or 5 will not have a terminating representation.

So now going back to bases 10 and 2, we see that any rational with a terminating 10-representation will be of the form p/q exactly when q has only 2s and 5s in its prime factorisation; and that same number will have a terminating 2-representatiion exactly when q has only 2s in its prime factorisation.

But one of these cases is a subset of the other! Whenever

q has only 2s in its prime factorisation

it obviously is also true that

q has only 2s and 5s in its prime factorisation

or, put another way, whenever p/q has a terminating 2-representation, p/q has a terminating 10-representation. The converse however does not hold - whenever q has a 5 in its prime factorisation, it will have a terminating 10-representation , but not a terminating 2-representation. This is the 0.1 example mentioned by other answers.

So there we have the answer to your question - because the prime factors of 2 are a subset of the prime factors of 10, all 2-terminating numbers are 10-terminating numbers, but not vice versa. It's not about 61 versus 6.1 - it's about 10 versus 2.

As a closing note, if by some quirk people used (say) base 17 but our computers used base 5, your intuition would never have been led astray by this - there would be no (non-zero, non-integer) numbers which terminated in both cases!


BCD - Binary-coded Decimal - representations are exact. They are not very space-efficient, but that's a trade-off you have to make for accuracy in this case.


If you make a big enough number with floating point (as it can do exponents), then you'll end up with inexactness in front of the decimal point, too. So I don't think your question is entirely valid because the premise is wrong; it's not the case that shifting by 10 will always create more precision, because at some point the floating point number will have to use exponents to represent the largeness of the number and will lose some precision that way as well.


I'm surprised no one has stated this yet: use continued fractions. Any rational number can be represented finitely in binary this way.

Some examples:

1/3 (0.3333...)

0; 3

5/9 (0.5555...)

0; 1, 1, 4

10/43 (0.232558139534883720930...)

0; 4, 3, 3

9093/18478 (0.49209871198181621387596060179673...)

0; 2, 31, 7, 8, 5

From here, there are a variety of known ways to store a sequence of integers in memory.

In addition to storing your number with perfect accuracy, continued fractions also have some other benefits, such as best rational approximation. If you decide to terminate the sequence of numbers in a continued fraction early, the remaining digits (when recombined to a fraction) will give you the best possible fraction. This is how approximations to pi are found:

Pi's continued fraction:

3; 7, 15, 1, 292 ...

Terminating the sequence at 1, this gives the fraction:

355/113

which is an excellent rational approximation.


you know integer numbers right? each bit represent 2^n


2^4=16
2^3=8
2^2=4
2^1=2
2^0=1

well its the same for floating point(with some distinctions) but the bits represent 2^-n 2^-1=1/2=0.5
2^-2=1/(2*2)=0.25
2^-3=0.125
2^-4=0.0625

Floating point binary representation:

sign  Exponent    Fraction(i think invisible 1 is appended to the fraction )
B11  B10 B9 B8   B7 B6 B5 B4 B3 B2 B1 B0


There's a threshold because the meaning of the digit has gone from integer to non-integer. To represent 61, you have 6*10^1 + 1*10^0; 10^1 and 10^0 are both integers. 6.1 is 6*10^0 + 1*10^-1, but 10^-1 is 1/10, which is definitely not an integer. That's how you end up in Inexactville.


The number 61.0 does indeed have an exact floating-point operation—but that's not true for all integers. If you wrote a loop that added one to both a double-precision floating point number and a 64-bit integer, eventually you'd reach a point where the 64-bit integer perfectly represents a number, but the floating point doesn't—because there aren't enough significant bits.

It's just much easier to reach the point of approximation on the right side of the decimal point. If you started writing out all the numbers in binary floating point, it'd make more sense.

Another way of thinking about it is that when you note that 61.0 is perfectly representable in base 10, and shifting the decimal point around doesn't change that, you're performing multiplication by powers of ten (10^1, 10^-1). In floating point, multiplying by powers of two does not affect the precision of the number. Try taking 61.0 and dividing it by three repeatedly for an illustration of how a perfectly precise number can lose its precise representation.


To repeat what I said in my comment to Mr. Skeet: we can represent 1/3, 1/9, 1/27, or any rational in decimal notation. We do it by adding an extra symbol. For example, a line over the digits that repeat in the decimal expansion of the number. What we need to represent decimal numbers as a sequence of binary numbers are 1) a sequence of binary numbers, 2) a radix point, and 3) some other symbol to indicate the repeating part of the sequence.

Hehner's quote notation is a way of doing this. He uses a quote symbol to represent the repeating part of the sequence. The article: http://www.cs.toronto.edu/~hehner/ratno.pdf and the Wikipedia entry: http://en.wikipedia.org/wiki/Quote_notation.

There's nothing that says we can't add a symbol to our representation system, so we can represent decimal rationals exactly using binary quote notation, and vice versa.