How to check if a String contains only ASCII?

Question

The call Character.isLetter(c) returns true if the character is a letter. But is there a way to quickly find out whether a String contains only the base characters of ASCII?

User · Answer

commons-lang3 from Apache contains valuable utility/convenience methods for all kinds of "problems", including this one:

    System.out.println(StringUtils.isAsciiPrintable(text));
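Note that "printable" is narrower than "ASCII": as documented, StringUtils.isAsciiPrintable accepts only characters 32 through 126 and returns false for null, so tabs and newlines fail the check. A rough dependency-free sketch of that behavior follows; the class and method here are made up for illustration and are not the library's source.

```java
final class AsciiPrintableSketch {

    // Hypothetical helper: true when every char lies in the printable ASCII range 32..126.
    static boolean isAsciiPrintable(CharSequence cs) {
        if (cs == null) {
            return false; // mirrors the documented null handling of the library method
        }
        for (int i = 0; i < cs.length(); i++) {
            char ch = cs.charAt(i);
            if (ch < 32 || ch > 126) {
                return false; // control character, DEL, or non-ASCII
            }
        }
        return true; // also true for the empty string
    }

    public static void main(String[] args) {
        System.out.println(isAsciiPrintable("Hello, world!"));
        System.out.println(isAsciiPrintable("tab\there"));
        System.out.println(isAsciiPrintable("caf\u00e9"));
    }
}
```

If control characters such as tabs should count as ASCII, one of the 0..127 checks in the other answers is the better fit.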

User · Answer

    private static boolean isASCII(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 127) {
                return false;
            }
        }
        return true;
    }

User · Answer

You can do it with java.nio.charset.Charset.

    import java.nio.charset.Charset;

    public class StringUtils {

        public static boolean isPureAscii(String v) {
            return Charset.forName("US-ASCII").newEncoder().canEncode(v);
            // or "ISO-8859-1" for ISO Latin 1
            // or StandardCharsets.US_ASCII with JDK 1.7+
        }

        public static void main(String[] args) throws Exception {
            String test = "Réal";
            System.out.println(test + " isPureAscii() : " + StringUtils.isPureAscii(test));
            test = "Real";
            System.out.println(test + " isPureAscii() : " + StringUtils.isPureAscii(test));

            /*
             * output :
             *   Réal isPureAscii() : false
             *   Real isPureAscii() : true
             */
        }
    }

Detect non-ASCII character in a String
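On Java 7 and later, the StandardCharsets.US_ASCII constant mentioned in the comment avoids looking the charset up by name. A minimal sketch, with a made-up class name; it creates a fresh encoder on every call because CharsetEncoder instances are documented as not safe for concurrent use.

```java
import java.nio.charset.StandardCharsets;

final class AsciiEncoderCheck {

    // Hypothetical helper: true when the US-ASCII encoder can encode every char of v.
    static boolean isPureAscii(String v) {
        // A new encoder per call keeps the method safe to use from several threads.
        return StandardCharsets.US_ASCII.newEncoder().canEncode(v);
    }

    public static void main(String[] args) {
        System.out.println(isPureAscii("Real"));
        System.out.println(isPureAscii("R\u00e9al"));
    }
}
```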

User · Answer

Or you copy the code from the IDN class:

    // to check if a string only contains US-ASCII code point
    //
    private static boolean isAllASCII(String input) {
        boolean isASCII = true;
        for (int i = 0; i < input.length(); i++) {
            int c = input.charAt(i);
            if (c > 0x7F) {
                isASCII = false;
                break;
            }
        }
        return isASCII;
    }

User · Answer

This will return true if the String contains only ASCII characters and false when it does not:

    Charset.forName("US-ASCII").newEncoder().canEncode(str)

If you want to remove the non-ASCII characters, here is the snippet:

    if (!Charset.forName("US-ASCII").newEncoder().canEncode(str)) {
        str = str.replaceAll("[^\\p{ASCII}]", "");
    }

User · Answer

It was possible. Pretty problem.

    import java.io.UnsupportedEncodingException;
    import java.nio.charset.Charset;
    import java.nio.charset.CharsetEncoder;

    public class EncodingTest {

        static CharsetEncoder asciiEncoder = Charset.forName("US-ASCII")
                .newEncoder();

        public static void main(String[] args) {

            String testStr = "E  s  W          i  T        3      i  T  U2  KITEC 3 F Rotunda 2";
            String[] strArr = testStr.split("~~", 2);
            int count = 0;
            boolean encodeFlag = false;

            do {
                encodeFlag = asciiEncoderTest(strArr[count]);
                System.out.println(encodeFlag);
                count++;
            } while (count < strArr.length);
        }

        public static boolean asciiEncoderTest(String test) {
            boolean encodeFlag = false;
            try {
                encodeFlag = asciiEncoder.canEncode(new String(test
                        .getBytes("ISO8859_1"), "BIG5"));
            } catch (UnsupportedEncodingException e) {
                e.printStackTrace();
            }
            return encodeFlag;
        }
    }

User · Answer

Iterate through the string and make sure all the characters have a value less than 128.

Java Strings are conceptually encoded as UTF-16. In UTF-16, the ASCII character set is encoded as the values 0 - 127, and the encoding for any non-ASCII character (which may consist of more than one Java char) is guaranteed not to include the numbers 0 - 127.
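To illustrate the surrogate point, here is a small sketch (class and method names are invented for the example). It builds a string from the single code point U+1F600, which Java stores as two char values, and shows that both fall outside 0 - 127, so a per-char check still rejects it.

```java
final class SurrogateDemo {

    // Hypothetical helper: true when every UTF-16 code unit is below 128.
    static boolean isAscii(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) >= 128) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // One supplementary code point, stored as a high and a low surrogate.
        String emoji = new String(Character.toChars(0x1F600));
        System.out.println("chars: " + emoji.length());
        for (int i = 0; i < emoji.length(); i++) {
            // Print each code unit in hex; both are in the surrogate range D800..DFFF.
            System.out.println(Integer.toHexString(emoji.charAt(i)));
        }
        System.out.println(isAscii(emoji));
        System.out.println(isAscii("plain text"));
    }
}
```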

User · Answer

Try this:

    for (char c : string.toCharArray()) {
        if (((int) c) > 127) {
            return false;
        }
    }
    return true;

User · Answer

Here is another way, not depending on a library but using a regex.

You can use this single line:

    text.matches("\\A\\p{ASCII}*\\z")

Whole example program:

    public class Main {
        public static void main(String[] args) {
            char nonAscii = 0x00FF;
            String asciiText = "Hello";
            String nonAsciiText = "Buy: " + nonAscii;
            System.out.println(asciiText.matches("\\A\\p{ASCII}*\\z"));
            System.out.println(nonAsciiText.matches("\\A\\p{ASCII}*\\z"));
        }
    }

User · Answer

    // returns true if c is an uppercase or lowercase ASCII letter
    public boolean isASCIILetter(char c) {
        return (c > 64 && c < 91) || (c > 96 && c < 123);
    }

User · Answer

From Guava 19.0 onward, you may use:

    boolean isAscii = CharMatcher.ascii().matchesAllOf(someString);

This uses the matchesAllOf(someString) method, which relies on the factory method ascii() rather than the now deprecated ASCII singleton.

Here ASCII includes all ASCII characters, including the non-printable characters lower than 0x20 (space) such as tab, line feed and carriage return, but also BEL with code 0x07 and DEL with code 0x7F.

This code incorrectly uses characters rather than code points, even if code points are indicated in the comments of earlier versions. Fortunately, the characters required to create a code point with a value of U+010000 or over are two surrogate characters with values outside of the ASCII range. So the method still succeeds in testing for ASCII, even for strings containing emoji.

For earlier Guava versions without the ascii() method you may write:

    boolean isAscii = CharMatcher.ASCII.matchesAllOf(someString);

User · Answer

Iterate through the string, and use charAt() to get the char. Then treat it as an int, and see if it has a Unicode value (a superset of ASCII) which you like.

Break at the first one you don't like.
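On Java 8 or later, the same scan-and-stop idea can be written with String.chars(); allMatch is short-circuiting, so it stops at the first char that fails. This is only a sketch: the class and method names are hypothetical, and the predicate stands in for whatever range of values "you like".

```java
import java.util.function.IntPredicate;

final class CharScan {

    // Hypothetical helper: true when every char value of s satisfies the predicate.
    static boolean allCharsMatch(String s, IntPredicate accepted) {
        // chars() yields each UTF-16 code unit as an int.
        return s.chars().allMatch(accepted);
    }

    // The ASCII case: accept only values 0..127.
    static boolean isAscii(String s) {
        return allCharsMatch(s, c -> c < 128);
    }

    public static void main(String[] args) {
        System.out.println(isAscii("Hello"));
        System.out.println(isAscii("Hell\u00f6"));
        // A different accepted range, here ISO Latin 1 (0..255).
        System.out.println(allCharsMatch("Hell\u00f6", c -> c < 256));
    }
}
```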
