[java] Occurrences of substring in a string

Why is the following algorithm not halting for me? (str is the string I am searching in, findStr is the string I am trying to find)

String str = "helloslkhellodjladfjhello";
String findStr = "hello";
int lastIndex = 0;
int count = 0;

while (lastIndex != -1) {
    lastIndex = str.indexOf(findStr,lastIndex);

    if( lastIndex != -1)
        count++;

    lastIndex += findStr.length();
}

System.out.println(count);

This question is related to java string

The answer is


This solution prints the total number of occurrence of a given substring throughout the string, also includes the cases where overlapping matches do exist.

class SubstringMatch{
    public static void main(String []args){
        //String str = "aaaaabaabdcaa";
        //String sub = "aa";
        //String str = "caaab";
        //String sub = "aa";
        String str="abababababaabb";
        String sub = "bab";

        int n = str.length();
        int m = sub.length();

        // index=-1 in case of no match, otherwise >=0(first match position)
        int index=str.indexOf(sub), i=index+1, count=(index>=0)?1:0;
        System.out.println(i+" "+index+" "+count);

        // i will traverse up to only (m-n) position
        while(index!=-1 && i<=(n-m)){   
            index=str.substring(i, n).indexOf(sub);
            count=(index>=0)?count+1:count;
            i=i+index+1;  
            System.out.println(i+" "+index);
        }
        System.out.println("count: "+count);
    }
}

You can number of occurrences using inbuilt library function:

import org.springframework.util.StringUtils;
StringUtils.countOccurrencesOf(result, "R-")

public static int getCountSubString(String str , String sub){
int n = 0, m = 0, counter = 0, counterSub = 0;
while(n < str.length()){
  counter = 0;
  m = 0;
  while(m < sub.length() && str.charAt(n) == sub.charAt(m)){
    counter++;
    m++; n++;
  }
  if (counter == sub.length()){
    counterSub++;
    continue;
  }
  else if(counter > 0){
    continue;
  }
  n++;
}

return  counterSub;

}


Increment lastIndex whenever you look for next occurrence.

Otherwise it's always finding the first substring (at position 0).


A lot of the given answers fail on one or more of:

  • Patterns of arbitrary length
  • Overlapping matches (such as counting "232" in "23232" or "aa" in "aaa")
  • Regular expression meta-characters

Here's what I wrote:

static int countMatches(Pattern pattern, String string)
{
    Matcher matcher = pattern.matcher(string);

    int count = 0;
    int pos = 0;
    while (matcher.find(pos))
    {
        count++;
        pos = matcher.start() + 1;
    }

    return count;
}

Example call:

Pattern pattern = Pattern.compile("232");
int count = countMatches(pattern, "23232"); // Returns 2

If you want a non-regular-expression search, just compile your pattern appropriately with the LITERAL flag:

Pattern pattern = Pattern.compile("1+1", Pattern.LITERAL);
int count = countMatches(pattern, "1+1+1"); // Returns 2

If you need the index of each substring within the original string, you can do something with indexOf like this:

 private static List<Integer> getAllIndexesOfSubstringInString(String fullString, String substring) {
    int pointIndex = 0;
    List<Integer> allOccurences = new ArrayList<Integer>();
    while(fullPdfText.indexOf(substring,pointIndex) >= 0){
       allOccurences.add(fullPdfText.indexOf(substring, pointIndex));
       pointIndex = fullPdfText.indexOf(substring, pointIndex) + substring.length();
    }
    return allOccurences;
}

Your lastIndex += findStr.length(); was placed outside the brackets, causing an infinite loop (when no occurence was found, lastIndex was always to findStr.length()).

Here is the fixed version :

String str = "helloslkhellodjladfjhello";
String findStr = "hello";
int lastIndex = 0;
int count = 0;

while (lastIndex != -1) {

    lastIndex = str.indexOf(findStr, lastIndex);

    if (lastIndex != -1) {
        count++;
        lastIndex += findStr.length();
    }
}
System.out.println(count);

Based on the existing answer(s) I'd like to add a "shorter" version without the if:

String str = "helloslkhellodjladfjhello";
String findStr = "hello";

int count = 0, lastIndex = 0;
while((lastIndex = str.indexOf(findStr, lastIndex)) != -1) {
    lastIndex += findStr.length() - 1;
    count++;
}

System.out.println(count); // output: 3

Try this one. It replaces all the matches with a -.

String str = "helloslkhellodjladfjhello";
String findStr = "hello";
int numberOfMatches = 0;
while (str.contains(findStr)){
    str = str.replaceFirst(findStr, "-");
    numberOfMatches++;
}

And if you don't want to destroy your str you can create a new string with the same content:

String str = "helloslkhellodjladfjhello";
String strDestroy = str;
String findStr = "hello";
int numberOfMatches = 0;
while (strDestroy.contains(findStr)){
    strDestroy = strDestroy.replaceFirst(findStr, "-");
    numberOfMatches++;
}

After executing this block these will be your values:

str = "helloslkhellodjladfjhello"
strDestroy = "-slk-djladfj-"
findStr = "hello"
numberOfMatches = 3

This below method show how many time substring repeat on ur whole string. Hope use full to you:-

    String searchPattern="aaa"; // search string
    String str="aaaaaababaaaaaa"; // whole string
    int searchLength = searchPattern.length(); 
    int totalLength = str.length(); 
    int k = 0;
    for (int i = 0; i < totalLength - searchLength + 1; i++) {
        String subStr = str.substring(i, searchLength + i);
        if (subStr.equals(searchPattern)) {
           k++;
        }

    }

here is the other solution without using regexp/patterns/matchers or even not using StringUtils.

String str = "helloslkhellodjladfjhelloarunkumarhelloasdhelloaruhelloasrhello";
        String findStr = "hello";
        int count =0;
        int findStrLength = findStr.length();
        for(int i=0;i<str.length();i++){
            if(findStr.startsWith(Character.toString(str.charAt(i)))){
                if(str.substring(i).length() >= findStrLength){
                    if(str.substring(i, i+findStrLength).equals(findStr)){
                        count++;
                    }
                }
            }
        }
        System.out.println(count);

String str = "helloslkhellodjladfjhello";
String findStr = "hello";
int lastIndex = 0;
int count = 0;

while((lastIndex = str.indexOf(findStr, lastIndex)) != -1) {
     count++;
     lastIndex += findStr.length() - 1;
}
System.out.println(count);

at the end of the loop count is 3; hope it helps


I'm very surprised no one has mentioned this one liner. It's simple, concise and performs slightly better than str.split(target, -1).length-1

public static int count(String str, String target) {
    return (str.length() - str.replace(target, "").length()) / target.length();
}

As @Mr_and_Mrs_D suggested:

String haystack = "hellolovelyworld";
String needle = "lo";
return haystack.split(Pattern.quote(needle), -1).length - 1;

try adding lastIndex+=findStr.length() to the end of your loop, otherwise you will end up in an endless loop because once you found the substring, you are trying to find it again and again from the same last position.


The answer given as correct is no good for counting things like line returns and is far too verbose. Later answers are better but all can be achieved simply with

str.split(findStr).length

It does not drop trailing matches using the example in the question.


public int indexOf(int ch,
                   int fromIndex)

Returns the index within this string of the first occurrence of the specified character, starting the search at the specified index.

So your lastindex value is always 0 and it always finds hello in the string.


Do you really have to handle the matching yourself ? Especially if all you need is the number of occurences, regular expressions are tidier :

String str = "helloslkhellodjladfjhello";
Pattern p = Pattern.compile("hello");
Matcher m = p.matcher(str);
int count = 0;
while (m.find()){
    count +=1;
}
System.out.println(count);     

Here it is, wrapped up in a nice and reusable method:

public static int count(String text, String find) {
        int index = 0, count = 0, length = find.length();
        while( (index = text.indexOf(find, index)) != -1 ) {                
                index += length; count++;
        }
        return count;
}

A shorter version. ;)

String str = "helloslkhellodjladfjhello";
String findStr = "hello";
System.out.println(str.split(findStr, -1).length-1);

How about using StringUtils.countMatches from Apache Commons Lang?

String str = "helloslkhellodjladfjhello";
String findStr = "hello";

System.out.println(StringUtils.countMatches(str, findStr));

That outputs:

3

Here is the advanced version for counting how many times the token occurred in a user entered string:

public class StringIndexOf {

    public static void main(String[] args) {

        Scanner scanner = new Scanner(System.in);

        System.out.println("Enter a sentence please: \n");
        String string = scanner.nextLine();

        int atIndex = 0;
        int count = 0;

        while (atIndex != -1)
        {
            atIndex = string.indexOf("hello", atIndex);

            if(atIndex != -1)
            {
                count++;
                atIndex += 5;
            }
        }

        System.out.println(count);
    }

}

public int countOfOccurrences(String str, String subStr) {
  return (str.length() - str.replaceAll(Pattern.quote(subStr), "").length()) / subStr.length();
}