Byte and char conversion in Java

Question

If I convert a character to byte and then back to char, that character mysteriously disappears and becomes something else. How is this possible?

This is the code:

    char a = 'È';        // line 1
    byte b = (byte) a;   // line 2
    char c = (char) b;   // line 3
    System.out.println((char) c + " " + (int) c);

Until line 2 everything is fine:

In line 1 I could print "a" in the console and it would show "È".

In line 2 I could print "b" in the console and it would show -56, that is 200 because byte is signed. And 200 is "È". So it's still fine.

But what's wrong in line 3? "c" becomes something else and the program prints a different character and 65480. That's something completely different.

What should I write in line 3 in order to get the correct result?

User · Answer

A char in Java is a UTF-16 code unit, which is treated as an unsigned 16-bit number. So if you perform c = (char) b, the value you get is 2^16 - 56, that is 65536 - 56 = 65480.

Or more precisely, the byte is first converted to a signed integer with the value 0xFFFFFFC8 using sign extension in a widening conversion. This in turn is then narrowed down to 0xFFC8 when casting to a char, which translates to the positive number 65480.

From the language specification:

5.1.4. Widening and Narrowing Primitive Conversion

First, the byte is converted to an int via widening primitive conversion (§5.1.2), and then the resulting int is converted to a char by narrowing primitive conversion (§5.1.3).
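The two conversions described above can be observed one step at a time. The following is a minimal sketch; the class name SignExtensionDemo and the helper methods widen and narrow are made up for illustration and are not part of any library.

```java
public class SignExtensionDemo {
    // Step 1 of the cast: widen byte to int. The sign bit is copied
    // into the upper 24 bits (sign extension).
    public static int widen(byte b) {
        return b; // implicit widening primitive conversion
    }

    // Step 2 of the cast: narrow int to char. Only the low 16 bits are kept,
    // and char interprets them as an unsigned value.
    public static char narrow(int i) {
        return (char) i;
    }

    public static void main(String[] args) {
        byte b = (byte) 200;        // does not fit in a signed byte, stored as -56
        int widened = widen(b);     // 0xFFFFFFC8
        char c = narrow(widened);   // 0xFFC8

        System.out.println(b);                             // -56
        System.out.println(Integer.toHexString(widened));  // ffffffc8
        System.out.println(Integer.toHexString(c));        // ffc8
        System.out.println((int) c);                       // 65480
    }
}
```

Writing (char) b directly performs both steps in one expression, which is why the intermediate int is easy to overlook.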

To get the right code point, use char c = (char) (b & 0xFF), which first converts the byte value of b to the positive integer 200 by using a mask that zeroes the top 24 bits after conversion: 0xFFFFFFC8 becomes 0x000000C8, or the positive number 200 in decimal.
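As a sketch of the masking fix applied to the code from the question (the class name MaskDemo and the helper toCharUnsigned are hypothetical names chosen for this example):

```java
public class MaskDemo {
    // Treats the byte as an unsigned value in the range 0..255
    // and returns the char with that code.
    public static char toCharUnsigned(byte b) {
        return (char) (b & 0xFF); // mask clears the sign-extended upper bits
    }

    public static void main(String[] args) {
        char a = '\u00C8';             // È, code 200
        byte b = (byte) a;             // -56
        char c = toCharUnsigned(b);    // back to 200

        System.out.println((int) c);   // 200
        System.out.println(c == a);    // true
        // Byte.toUnsignedInt (Java 8 and later) performs the same masking.
        System.out.println(Byte.toUnsignedInt(b)); // 200
    }
}
```

Note that this round trip only works for characters whose code fits in a single byte (0 to 255); the cast to byte in line 2 discards the upper 8 bits of any larger char.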

The above is a direct explanation of what happens during conversion between the byte, int and char primitive types.

If you want to encode/decode characters from bytes, use Charset, CharsetEncoder, CharsetDecoder or one of the convenience methods such as new String(byte[] bytes, Charset charset) or String#getBytes(Charset charset). You can get common character sets (such as UTF-8 or ISO-8859-1) from StandardCharsets; others, such as Windows-1252, are available through Charset.forName.
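A minimal sketch of encoding and decoding through an explicit charset, using only String#getBytes(Charset) and new String(byte[], Charset). The class name CharsetRoundTrip and the method roundTrip are invented for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {
    // Encodes the text to bytes and decodes it again with the same charset.
    public static String roundTrip(String text, Charset charset) {
        byte[] encoded = text.getBytes(charset);
        return new String(encoded, charset);
    }

    public static void main(String[] args) {
        String text = "\u00C8"; // È

        // ISO-8859-1 encodes È as the single byte 0xC8.
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(latin1.length); // 1

        // UTF-8 needs two bytes for the same character: 0xC3 0x88.
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        System.out.println(utf8.length);   // 2

        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text)); // true
    }
}
```

The same charset has to be used in both directions; decoding UTF-8 bytes as ISO-8859-1, for instance, yields two unrelated characters instead of one.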

User · Answer

new String(byteArray, Charset.defaultCharset());

This decodes a byte array into a String using the JVM's default charset. It may throw exceptions depending on what you supply as byteArray (for example, a NullPointerException if it is null).
