[java] Get unicode value of a character

Is there any way in Java so that I can get Unicode equivalent of any character? e.g.

Suppose a method getUnicode(char c). A call getUnicode('÷') should return \u00f7.

This question is related to java unicode

The answer is


char c = 'a';
String a = Integer.toHexString(c); // gives you---> a = "61"

are you picky with using Unicode because with java its more simple if you write your program to use "dec" value or (HTML-Code) then you can simply cast data types between char and int

char a = 98;
char b = 'b';
char c = (char) (b+0002);

System.out.println(a);
System.out.println((int)b);
System.out.println((int)c);
System.out.println(c);

Gives this output

b
98
100
d

If you have Java 5, use char c = ...; String s = String.format ("\\u%04x", (int)c);

If your source isn't a Unicode character (char) but a String, you must use charAt(index) to get the Unicode character at position index.

Don't use codePointAt(index) because that will return 24bit values (full Unicode) which can't be represented with just 4 hex digits (it needs 6). See the docs for an explanation.

[EDIT] To make it clear: This answer doesn't use Unicode but the method which Java uses to represent Unicode characters (i.e. surrogate pairs) since char is 16bit and Unicode is 24bit. The question should be: "How can I convert char to a 4-digit hex number", since it's not (really) about Unicode.


I found this nice code on web.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class Unicode {

public static void main(String[] args) {
System.out.println("Use CTRL+C to quite to program.");

// Create the reader for reading in the text typed in the console. 
InputStreamReader inputStreamReader = new InputStreamReader(System.in);
BufferedReader bufferedReader = new BufferedReader(inputStreamReader);

try {
  String line = null;
  while ((line = bufferedReader.readLine()).length() > 0) {
    for (int index = 0; index < line.length(); index++) {

      // Convert the integer to a hexadecimal code.
      String hexCode = Integer.toHexString(line.codePointAt(index)).toUpperCase();


      // but the it must be a four number value.
      String hexCodeWithAllLeadingZeros = "0000" + hexCode;
      String hexCodeWithLeadingZeros = hexCodeWithAllLeadingZeros.substring(hexCodeWithAllLeadingZeros.length()-4);

      System.out.println("\\u" + hexCodeWithLeadingZeros);
    }

  }
} catch (IOException ioException) {
       ioException.printStackTrace();
  }
 }
}

Original Article


private static String toUnicode(char ch) {
    return String.format("\\u%04x", (int) ch);
}

First, I get the high side of the char. After, get the low side. Convert all of things in HexString and put the prefix.

int hs = (int) c  >> 8;
int ls = hs & 0x000F;

String highSide = Integer.toHexString(hs);
String lowSide = Integer.toHexString(ls);
lowSide = Integer.toHexString(hs & 0x00F0);
String hexa = Integer.toHexString( (int) c );

System.out.println(c+" = "+"\\u"+highSide+lowSide+hexa);