[java] Extract digits from a string in Java

I have a Java String object. I need to extract only digits from it. I'll give an example:

"123-456-789" I want "123456789"

Is there a library function that extracts only digits?

Thanks for the answers. Before I try these, do I need to install any additional libraries?

The answer is


You can use str.replaceAll("[^0-9]", "");
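
No additional library is needed for this approach: String.replaceAll and java.util.regex ship with the JDK. Below is a minimal sketch of the same idea with the pattern compiled once for repeated use. The class name DigitsOnly and the helper digitsOnly are illustrative names, not part of the original answer.

```java
import java.util.regex.Pattern;

public class DigitsOnly {
    // Compiled once; matches any run of characters that are not ASCII digits.
    private static final Pattern NON_DIGITS = Pattern.compile("[^0-9]+");

    // Hypothetical helper name, chosen for illustration.
    public static String digitsOnly(String input) {
        return NON_DIGITS.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(digitsOnly("123-456-789")); // prints 123456789
    }
}
```

Compiling the pattern up front avoids recompiling the regex on every call, which is what str.replaceAll does internally.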


Code:

public class saasa {

    public static void main(String[] args) {
        String t = "123-456-789";
        // Removes only the hyphens; any other non-digit characters would remain.
        t = t.replaceAll("-", "");
        System.out.println(t); // prints 123456789
    }
}

Using Google Guava (in Guava 19.0 and later the DIGIT constant is available as the method CharMatcher.digit()):

CharMatcher.DIGIT.retainFrom("123-456-789");

CharMatcher is pluggable and quite interesting to use. For instance, you can do the following:

String input = "My phone number is 123-456-789!";
String output = CharMatcher.is('-').or(CharMatcher.DIGIT).retainFrom(input);

output == 123-456-789


Here's a more verbose solution. Less elegant, but probably faster:

public static String stripNonDigits(
            final CharSequence input /* inspired by seh's comment */){
    final StringBuilder sb = new StringBuilder(
            input.length() /* also inspired by seh's comment */);
    for(int i = 0; i < input.length(); i++){
        final char c = input.charAt(i);
        if(c > 47 && c < 58){
            sb.append(c);
        }
    }
    return sb.toString();
}

Test Code:

public static void main(final String[] args){
    final String input = "0-123-abc-456-xyz-789";
    final String result = stripNonDigits(input);
    System.out.println(result);
}

Output:

0123456789

BTW: I did not use Character.isDigit(ch) because it accepts many other characters besides 0-9 (for example, digits from non-Latin scripts).
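
To illustrate the difference between Character.isDigit and an explicit ASCII range check, here is a small sketch. IsDigitDemo and isAsciiDigit are hypothetical names introduced for this example.

```java
public class IsDigitDemo {
    // True only for the ten ASCII digits '0' through '9'.
    public static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    public static void main(String[] args) {
        char arabicIndicThree = '\u0663'; // ARABIC-INDIC DIGIT THREE
        System.out.println(Character.isDigit(arabicIndicThree)); // prints true
        System.out.println(isAsciiDigit(arabicIndicThree));      // prints false
        System.out.println(isAsciiDigit('7'));                   // prints true
    }
}
```

If the extracted string is later parsed or stored as a plain ASCII number, the range check is usually the safer choice.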


public class FindDigitFromString 
{

    public static void main(String[] args) 
    {
        String s = "  Hi How Are You 11  ";
        // Replace every run of non-digit characters with the empty string.
        String s1 = s.replaceAll("[^0-9]+", "");
        System.out.println(s1);
    }
}

Output: 11


input.replaceAll("[^0-9.]", "")

This keeps the decimal points along with the digits.

E.g., for the input 445.3kg the output is 445.3.
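
A sketch of how this might be used when the goal is a numeric value, assuming the input contains a single well-formed number. DecimalExtract and keepDigitsAndDots are hypothetical names used only for this example.

```java
public class DecimalExtract {
    // Hypothetical helper: keeps ASCII digits and '.', drops everything else.
    public static String keepDigitsAndDots(String input) {
        return input.replaceAll("[^0-9.]", "");
    }

    public static void main(String[] args) {
        String cleaned = keepDigitsAndDots("445.3kg");
        System.out.println(cleaned);                     // prints 445.3
        System.out.println(Double.parseDouble(cleaned)); // prints 445.3
        // Caveat: every dot is kept, so "v1.2.3" yields "1.2.3",
        // which Double.parseDouble would reject.
        System.out.println(keepDigitsAndDots("v1.2.3")); // prints 1.2.3
    }
}
```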


public class FindDigits {

    public static void main(String[] args) {
        FindDigits h = new FindDigits();
        h.checkStringIsNumerical();
    }

    // Reports, for every non-space character, whether it is a digit.
    void checkStringIsNumerical() {
        String h = "hello 123 for the rest of the 98475wt355";
        for (int i = 0; i < h.length(); i++) {
            char chr = h.charAt(i);
            if (chr != ' ') {
                System.out.println("Is '" + chr + "' a digit?: " + Character.isDigit(chr));
            }
        }
    }

    // Prints only the digits of the string: 123298475355.
    void checkStringIsNumerical2() {
        String h = "hello 123 for 2the rest of the 98475wt355";
        for (int i = 0; i < h.length(); i++) {
            char chr = h.charAt(i);
            if (Character.isDigit(chr)) {
                System.out.print(chr);
            }
        }
    }
}

Using Google Guava:

CharMatcher.inRange('0','9').retainFrom("123-456-789")

UPDATE:

Using a precomputed CharMatcher can further improve performance:

CharMatcher ASCII_DIGITS = CharMatcher.inRange('0', '9').precomputed();
ASCII_DIGITS.retainFrom("123-456-789");

public String extractDigits(String src) {
    StringBuilder builder = new StringBuilder();
    for (int i = 0; i < src.length(); i++) {
        char c = src.charAt(i);
        if (Character.isDigit(c)) {
            builder.append(c);
        }
    }
    return builder.toString();
}

Using Kotlin and a lambda expression, you can do it like this:

val digitStr = str.filter { it.isDigit() }
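
For comparison, a roughly equivalent filter can be written in Java 8+ with streams. This is a sketch, not part of the original answer: StreamDigits and digitsOf are hypothetical names, and unlike Kotlin's isDigit() it keeps ASCII digits only (swap the lambda for Character::isDigit to accept other Unicode digits as well).

```java
import java.util.stream.Collectors;

public class StreamDigits {
    // Hypothetical helper: filters the string's chars, keeping ASCII digits.
    public static String digitsOf(String str) {
        return str.chars()                          // IntStream of UTF-16 code units
                .filter(c -> c >= '0' && c <= '9')  // keep '0'..'9' only
                .mapToObj(c -> String.valueOf((char) c))
                .collect(Collectors.joining());
    }

    public static void main(String[] args) {
        System.out.println(digitsOf("123-456-789")); // prints 123456789
    }
}
```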

Inspired by Sean Patrick Floyd's code, I rewrote it slightly for the best performance I could get.

// Requires: import java.nio.CharBuffer;
public static String stripNonDigitsV2( CharSequence input ) {
    if (input == null)
        return null;
    if ( input.length() == 0 )
        return "";

    char[] result = new char[input.length()];
    int cursor = 0;
    CharBuffer buffer = CharBuffer.wrap( input );

    while ( buffer.hasRemaining() ) {
        char chr = buffer.get();
        if ( chr > 47 && chr < 58 )
            result[cursor++] = chr;
    }

    return new String( result, 0, cursor );
}

I ran a performance test on a very long string containing few digits. The results:

  • The original code is 25.5% slower
  • The Guava approach is 2.5-3 times slower
  • A regular expression with \D+ is 3-3.5 times slower
  • A regular expression with only \D is 25+ times slower

By the way, the results depend on how long the string is. For a string that contains only 6 digits, Guava is 50% slower and the regexp is 100% slower.


Use a regular expression to match your requirement.

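// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
String str = "123-456-789";
String regex = "(\\d+)";
Matcher matcher = Pattern.compile(regex).matcher(str);
// Each find() yields the next run of digits: 123, 456, 789.
while (matcher.find()) {
    String num = matcher.group();
    System.out.print(num); // prints 123456789 in total
}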
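
If the individual groups of digits are needed rather than one concatenated string, the same Matcher loop can collect them into a list. A minimal sketch follows. DigitGroups and digitGroups are hypothetical names, and the pattern uses [0-9]+ to match ASCII digits explicitly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DigitGroups {
    // One or more consecutive ASCII digits.
    private static final Pattern DIGIT_RUN = Pattern.compile("[0-9]+");

    // Hypothetical helper: returns each run of digits as a separate element.
    public static List<String> digitGroups(String input) {
        List<String> groups = new ArrayList<>();
        Matcher matcher = DIGIT_RUN.matcher(input);
        while (matcher.find()) {
            groups.add(matcher.group());
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> groups = digitGroups("123-456-789");
        System.out.println(groups);                  // prints [123, 456, 789]
        System.out.println(String.join("", groups)); // prints 123456789
    }
}
```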

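I have adapted the code for phone numbers such as +9 (987) 124124. Besides digits, it keeps the characters ( ) * + , - . / (char codes 40 to 47).

It also skips literal Unicode escapes of the form \uXXXX: a u followed by four hex digits.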

// Requires: import java.nio.CharBuffer;
public static String stripNonDigitsV2( CharSequence input ) {
    if (input == null)
        return null;
    if ( input.length() == 0 )
        return "";

    char[] result = new char[input.length()];
    int cursor = 0;
    CharBuffer buffer = CharBuffer.wrap( input );
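    int i = 0;
    while ( i < buffer.length() ) {
        char chr = buffer.get(i);
        if ( chr == 'u' ) {
            // Treat 'u' as the start of a literal \uXXXX escape: skip it and its four hex digits.
            i = i + 5;
            if ( i >= buffer.length() )
                break; // the escape ran to the end of the input
            chr = buffer.get(i);
        }

        // Keep '(' through '/' (codes 40-47) and the digits '0'-'9' (codes 48-57).
        if ( chr > 39 && chr < 58 )
            result[cursor++] = chr;
        i = i + 1;
    }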

    return new String( result, 0, cursor );
}