[java] How can I count the number of matches for a regex?

Let's say I have a string which contains this:

HelloxxxHelloxxxHello

I compile a pattern to look for 'Hello'

Pattern pattern = Pattern.compile("Hello");
Matcher matcher = pattern.matcher("HelloxxxHelloxxxHello");

It should find three matches. How can I get a count of how many matches there were?

I've tried various loops and using the matcher.groupCount() but it didn't work.

This question is related to java regex

The answer is


If you want to use Java 8 streams and are allergic to while loops, you could try this:

public static int countPattern(String references, Pattern referencePattern) {
    Matcher matcher = referencePattern.matcher(references);
    return Stream.iterate(0, i -> i + 1)
            .filter(i -> !matcher.find())
            .findFirst()
            .get();
}

Disclaimer: this only works for disjoint matches.

Example:

public static void main(String[] args) throws ParseException {
    Pattern referencePattern = Pattern.compile("PASSENGER:\\d+");
    System.out.println(countPattern("[ \"PASSENGER:1\", \"PASSENGER:2\", \"AIR:1\", \"AIR:2\", \"FOP:2\" ]", referencePattern));
    System.out.println(countPattern("[ \"AIR:1\", \"AIR:2\", \"FOP:2\" ]", referencePattern));
    System.out.println(countPattern("[ \"AIR:1\", \"AIR:2\", \"FOP:2\", \"PASSENGER:1\" ]", referencePattern));
    System.out.println(countPattern("[  ]", referencePattern));
}

This prints out:

2
0
1
0

This is a solution for disjoint matches with streams:

public static int countPattern(String references, Pattern referencePattern) {
    return StreamSupport.stream(Spliterators.spliteratorUnknownSize(
            new Iterator<Integer>() {
                Matcher matcher = referencePattern.matcher(references);
                int from = 0;

                @Override
                public boolean hasNext() {
                    return matcher.find(from);
                }

                @Override
                public Integer next() {
                    from = matcher.start() + 1;
                    return 1;
                }
            },
            Spliterator.IMMUTABLE), false).reduce(0, (a, c) -> a + c);
}

Use the below code to find the count of number of matches that the regex finds in your input

        Pattern p = Pattern.compile(regex, Pattern.MULTILINE | Pattern.DOTALL);// "regex" here indicates your predefined regex.
        Matcher m = p.matcher(pattern); // "pattern" indicates your string to match the pattern against with
        boolean b = m.matches();
        if(b)
        count++;
        while (m.find())
        count++;

This is a generalized code not specific one though, tailor it to suit your need

Please feel free to correct me if there is any mistake.


From Java 9, you can use the stream provided by Matcher.results()

long matches = matcher.results().count();

This should work for matches that might overlap:

public static void main(String[] args) {
    String input = "aaaaaaaa";
    String regex = "aa";
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(input);
    int from = 0;
    int count = 0;
    while(matcher.find(from)) {
        count++;
        from = matcher.start() + 1;
    }
    System.out.println(count);
}