# [c] What does AND 0xFF do?

In the following code:

``````short = ((byte2 << 8) | (byte1 & 0xFF))
``````

What is the purpose of `& 0xFF`? Sometimes I see it written as:

``````short = ((byte2 << 8) | byte1)
``````

And that seems to work fine too?

This question is related to `c` `bit-manipulation` `bitwise-operators` `bit-shift`

It clears all the bits that are not in the least significant byte.

If `byte1` is an 8-bit unsigned integer type, the mask is pointless. If it is wider than 8 bits, the mask essentially gives you the lowest 8 bits of the value:

``````    0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
 &  0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
    -------------------------------
    0 0 0 0 0 0 0 0 0 1 0 1 0 1 0 1
``````

On its own, `& 0xFF` only ensures that, if a byte is wider than 8 bits (which the language standard allows), the extra bits are ignored.

And that seems to work fine too?

If the result ends up greater than `SHRT_MAX`, converting it to `short` is implementation-defined (the value may change, or a signal may be raised). In that respect both versions work equally poorly.

Assuming `byte1` is a single byte (8 bits), a bitwise AND with 0xFF gives you back the same byte.

So `byte1` is the same as `byte1 & 0xFF`

Say `byte1` is `01001101` , then `byte1 & 0xFF = 01001101 & 11111111 = 01001101 = byte1`

If `byte1` is of some other type, say a 4-byte integer, a bitwise AND with 0xFF leaves you with the least significant byte (8 bits) of `byte1`.

The danger of the second expression arises when the type of `byte1` is `char`. Some implementations treat plain `char` as signed, which results in sign extension when the expression is evaluated.

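``````#include <stdio.h>

int main(void)
{
    /* 0x80 does not fit in a signed char; the conversion is
       implementation-defined and typically yields -128. */
    signed char byte1 = (signed char)0x80;
    signed char byte2 = 0x10;

    unsigned short value1 = ((byte2 << 8) | (byte1 & 0xFF)); /* masked */
    unsigned short value2 = ((byte2 << 8) | byte1);          /* sign-extended */

    printf("value1=%hu %hx\n", value1, value1);
    printf("value2=%hu %hx\n", value2, value2);
    return 0;
}
``````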

will print

``````value1=4224 1080     right
value2=65408 ff80    wrong!!
``````

I tried it on gcc v3.4.6 on Solaris SPARC 64 bit and the result is the same with `byte1` and `byte2` declared as `char`.

TL;DR

The masking is to avoid implicit sign extension.

EDIT: I checked, it's the same behaviour in C++.

EDIT2: As requested, an explanation of sign extension. Sign extension is a consequence of the way C evaluates expressions. C has a rule called integer promotion: it implicitly converts all types narrower than `int` to `int` before doing the evaluation. Let's see what happens to our expression:

``````unsigned short value2 = ((byte2 << 8) | byte1);
``````

`byte1` is a variable containing the bit pattern 0x80. If `char` is `unsigned`, that value is interpreted as 128; if it is `signed`, it is -128. When doing the calculation, C extends the value to the size of an `int` (generally 16 or 32 bits). If the variable is `unsigned`, the value 128 is kept, and its bit pattern as a 32-bit `int` is 0x00000080. If it is `signed`, the value -128 is kept, and its bit pattern is 0xFFFFFF80. The sign was extended to the size of the temporary used for the calculation, so ORing with that temporary yields the wrong result.

On x86 this is done with the `movsx` instruction (`movzx` for zero extension). Other CPUs have their own instructions for it (the 6809 had `SEX`).

The `byte1 & 0xff` ensures that only the 8 least significant bits of `byte1` can be non-zero.

If `byte1` is already an unsigned type that has only 8 bits (e.g., `char` in some cases, or `unsigned char` in most), the mask makes no difference and is completely unnecessary.

If `byte1` is a type that's signed or has more than 8 bits (e.g., `short`, `int`, `long`), and any bit other than the 8 least significant is set, then there will be a difference: the mask zeroes those upper bits before `or`ing with the other variable, so this operand of the `or` affects only the 8 least significant bits of the result.