[java] Use String.split() with multiple delimiters

I need to split a string base on delimiter - and .. Below are my desired output.

AA.BB-CC-DD.zip ->

AA
BB
CC
DD
zip 

but my following code does not work.

private void getId(String pdfName){
    String[]tokens = pdfName.split("-\\.");
}

This question is related to java regex

The answer is


Try this code:

var string = 'AA.BB-CC-DD.zip';
array = string.split(/[,.]/);

For two char sequence as delimeters "AND" and "OR" this should be worked. Don't forget to trim while using.

 String text ="ISTANBUL AND NEW YORK AND PARIS OR TOKYO AND MOSCOW";
 String[] cities = text.split("AND|OR"); 

Result : cities = {"ISTANBUL ", " NEW YORK ", " PARIS ", " TOKYO ", " MOSCOW"}


s.trim().split("[\\W]+") 

should work.


pdfName.split("[.-]+");

  • [.-] -> any one of the . or - can be used as delimiter

  • + sign signifies that if the aforementioned delimiters occur consecutively we should treat it as one.


Using Guava you could do this:

Iterable<String> tokens = Splitter.on(CharMatcher.anyOf("-.")).split(pdfName);

I'd use Apache Commons:

import org.apache.commons.lang3.StringUtils;

private void getId(String pdfName){
    String[] tokens = StringUtils.split(pdfName, "-.");
}

It'll split on any of the specified separators, as opposed to StringUtils.splitByWholeSeparator(str, separator) which uses the complete string as a separator


Try this regex "[-.]+". The + after treats consecutive delimiter chars as one. Remove plus if you do not want this.


The string you give split is the string form of a regular expression, so:

private void getId(String pdfName){
    String[]tokens = pdfName.split("[\\-.]");
}

That means to split on any character in the [] (we have to escape - with a backslash because it's special inside []; and of course we have to escape the backslash because this is a string). (Conversely, . is normally special but isn't special inside [].)


If you know the sting will always be in the same format, first split the string based on . and store the string at the first index in a variable. Then split the string in the second index based on - and store indexes 0, 1 and 2. Finally, split index 2 of the previous array based on . and you should have obtained all of the relevant fields.

Refer to the following snippet:

String[] tmp = pdfName.split(".");
String val1 = tmp[0];
tmp = tmp[1].split("-");
String val2 = tmp[0];
...

You may also specified regular expression as argument in split() method ..see below example....

private void getId(String pdfName){
String[]tokens = pdfName.split("-|\\.");
}

It's better to use something like this:

s.split("[\\s\\-\\.\\'\\?\\,\\_\\@]+");

Have added a few other characters as sample. This is the safest way to use, because the way . and ' is treated.


You can use the regex "\W".This matches any non-word character.The required line would be:

String[] tokens=pdfName.split("\\W");

String[] token=s.split("[.-]");