[java] Count the number of Occurrences of a Word in a String

I am new to Java Strings the problem is that I want to count the Occurrences of a specific word in a String. Suppose that my String is:

i have a male cat. the color of male cat is Black

Now I dont want to split it as well so I want to search for a word that is "male cat". it occurs two times in my string!

What I am trying is:

int c = 0;
for (int j = 0; j < text.length(); j++) {
    if (text.contains("male cat")) {
        c += 1;
    }
}

System.out.println("counter=" + c);

it gives me 46 counter value! So whats the solution?

This question is related to java regex string

The answer is


This should be a faster non-regex solution.
(note - Not a Java programmer)

 String str = "i have a male cat. the color of male cat is Black";
 int found  = 0;
 int oldndx = 0;
 int newndx = 0;

 while ( (newndx=str.indexOf("male cat", oldndx)) > -1 )
 {
     found++;
     oldndx = newndx+8;
 }

If you find the String you are searching for, you can go on for the length of that string (if in case you search aa in aaaa you consider it 2 times).

int c=0;
String found="male cat";
 for(int j=0; j<text.length();j++){
     if(text.contains(found)){
         c+=1;
         j+=found.length()-1;
     }
 }
 System.out.println("counter="+c);

Replace the String that needs to be counted with empty string and then use the length without the string to calculate the number of occurrence.

public int occurrencesOf(String word)
    {
    int length = text.length();
    int lenghtofWord = word.length();
    int lengthWithoutWord = text.replace(word, "").length();
    return (length - lengthWithoutWord) / lenghtofWord ;
    }

Java 8 version:

    public static long countNumberOfOccurrencesOfWordInString(String msg, String target) {
    return Arrays.stream(msg.split("[ ,\\.]")).filter(s -> s.equals(target)).count();
}

This static method does returns the number of occurrences of a string on another string.

/**
 * Returns the number of appearances that a string have on another string.
 * 
 * @param source    a string to use as source of the match
 * @param sentence  a string that is a substring of source
 * @return the number of occurrences of sentence on source 
 */
public static int numberOfOccurrences(String source, String sentence) {
    int occurrences = 0;

    if (source.contains(sentence)) {
        int withSentenceLength    = source.length();
        int withoutSentenceLength = source.replace(sentence, "").length();
        occurrences = (withSentenceLength - withoutSentenceLength) / sentence.length();
    }

    return occurrences;
}

Tests:

String source = "Hello World!";
numberOfOccurrences(source, "Hello World!");   // 1
numberOfOccurrences(source, "ello W");         // 1
numberOfOccurrences(source, "l");              // 3
numberOfOccurrences(source, "fun");            // 0
numberOfOccurrences(source, "Hello");          // 1

BTW, the method could be written in one line, awful, but it also works :)

public static int numberOfOccurrences(String source, String sentence) {
    return (source.contains(sentence)) ? (source.length() - source.replace(sentence, "").length()) / sentence.length() : 0;
}

Simple solution is here-

Below code uses HashMap as it will maintain keys and values. so here keys will be word and values will be count (occurance of a word in a given string).

public class WordOccurance 
{

 public static void main(String[] args) 
 {
    HashMap<String, Integer> hm = new HashMap<>();
    String str = "avinash pande avinash pande avinash";

    //split the word with white space       
    String words[] = str.split(" ");
    for (String word : words) 
    {   
        //If already added/present in hashmap then increment the count by 1
        if(hm.containsKey(word))    
        {           
            hm.put(word, hm.get(word)+1);
        }
        else //if not added earlier then add with count 1
        {
            hm.put(word, 1);
        }

    }
    //Iterate over the hashmap
    Set<Entry<String, Integer>> entry =  hm.entrySet();
    for (Entry<String, Integer> entry2 : entry) 
    {
        System.out.println(entry2.getKey() + "      "+entry2.getValue());
    }
}

}


Java 8 version.

System.out.println(Pattern.compile("\\bmale cat")
            .splitAsStream("i have a male cat. the color of male cat is Black")
            .count()-1);

We can count from many ways for the occurrence of substring:-

public class Test1 {
public static void main(String args[]) {
    String st = "abcdsfgh yfhf hghj gjgjhbn hgkhmn abc hadslfahsd abcioh abc  a ";
    count(st, 0, "a".length());

}

public static void count(String trim, int i, int length) {
    if (trim.contains("a")) {
        trim = trim.substring(trim.indexOf("a") + length);
        count(trim, i + 1, length);
    } else {
        System.out.println(i);
    }
}

public static void countMethod2() {
    int index = 0, count = 0;
    String inputString = "mynameiskhanMYlaptopnameishclMYsirnameisjasaiwalmyfrontnameisvishal".toLowerCase();
    String subString = "my".toLowerCase();

    while (index != -1) {
        index = inputString.indexOf(subString, index);
        if (index != -1) {
            count++;
            index += subString.length();
        }
    }
    System.out.print(count);
}}

If you just want the count of "male cat" then I would just do it like this:

String str = "i have a male cat. the color of male cat is Black";
int c = str.split("male cat").length - 1;
System.out.println(c);

and if you want to make sure that "female cat" is not matched then use \\b word boundaries in the split regex:

int c = str.split("\\bmale cat\\b").length - 1;

public class WordCount {

public static void main(String[] args) {
    // TODO Auto-generated method stub
    String scentence = "This is a treeis isis is is is";
    String word = "is";
    int wordCount = 0;
    for(int i =0;i<scentence.length();i++){
        if(word.charAt(0) == scentence.charAt(i)){
            if(i>0){
                if(scentence.charAt(i-1) == ' '){
                    if(i+word.length()<scentence.length()){
                        if(scentence.charAt(i+word.length()) != ' '){
                            continue;}
                        }
                    }
                else{
                    continue;
                }
            }
            int count = 1;
            for(int j=1 ; j<word.length();j++){
                i++;
                if(word.charAt(j) != scentence.charAt(i)){
                    break;
                }
                else{
                    count++;
                }
            }
            if(count == word.length()){
                wordCount++;
            }

        }
    }
    System.out.println("The word "+ word + " was repeated :" + wordCount);
}

}


I've got another approach here:

String description = "hello india hello india hello hello india hello";
String textToBeCounted = "hello";

// Split description using "hello", which will return 
//string array of words other than hello
String[] words = description.split("hello");

// Get number of characters words other than "hello"
int lengthOfNonMatchingWords = 0;
for (String word : words) {
    lengthOfNonMatchingWords += word.length();
}

// Following code gets length of `description` - length of all non-matching
// words and divide it by length of word to be counted
System.out.println("Number of matching words are " + 
(description.length() - lengthOfNonMatchingWords) / textToBeCounted.length());

The string contains that string all the time when looping through it. You don't want to ++ because what this is doing right now is just getting the length of the string if it contains " "male cat"

You need to indexOf() / substring()

Kind of get what i am saying?


This will work

int word_count(String text,String key){
   int count=0;
   while(text.contains(key)){
      count++;
      text=text.substring(text.indexOf(key)+key.length());
   }
   return count;
}

Complete Example here,

package com.test;

import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class WordsOccurances {

      public static void main(String[] args) {

            String sentence = "Java can run on many different operating "
                + "systems. This makes Java platform independent.";

            String[] words = sentence.split(" ");
            Map<String, Integer> wordsMap = new HashMap<String, Integer>();

            for (int i = 0; i<words.length; i++ ) {
                if (wordsMap.containsKey(words[i])) {
                    Integer value = wordsMap.get(words[i]);
                    wordsMap.put(words[i], value + 1);
                } else {
                    wordsMap.put(words[i], 1);
                }
            }

            /*Now iterate the HashMap to display the word with number 
           of time occurance            */

           Iterator it = wordsMap.entrySet().iterator();
           while (it.hasNext()) {
                Map.Entry<String, Integer> entryKeyValue = (Map.Entry<String, Integer>) it.next();
                System.out.println("Word : "+entryKeyValue.getKey()+", Occurance : "
                                +entryKeyValue.getValue()+" times");
           }
     }
}

for scala it's just 1 line

def numTimesOccurrenced(text:String, word:String) =text.split(word).size-1


public int occurrencesOf(String word) {
    int length = text.length();
    int lenghtofWord = word.length();
    int lengthWithoutWord = text.replaceAll(word, "").length();
    return (length - lengthWithoutWord) / lenghtofWord ;
}

using indexOf...

public static int count(String string, String substr) {
    int i;
    int last = 0;
    int count = 0;
    do {
        i = string.indexOf(substr, last);
        if (i != -1) count++;
        last = i+substr.length();
    } while(i != -1);
    return count;
}

public static void main (String[] args ){
    System.out.println(count("i have a male cat. the color of male cat is Black", "male cat"));
}

That will show: 2

Another implementation for count(), in just 1 line:

public static int count(String string, String substr) {
    return (string.length() - string.replaceAll(substr, "").length()) / substr.length() ;
}

Once you find the term you need to remove it from String under process so that it won't resolve the same again, use indexOf() and substring() , you don't need to do contains check length times


Why not recursive ?

public class CatchTheMaleCat  {
    private static final String MALE_CAT = "male cat";
    static int count = 0;
    public static void main(String[] arg){
        wordCount("i have a male cat. the color of male cat is Black");
        System.out.println(count);
    }

    private static boolean wordCount(String str){
        if(str.contains(MALE_CAT)){
            count++;
            return wordCount(str.substring(str.indexOf(MALE_CAT)+MALE_CAT.length()));
        }
        else{
            return false;
        }
    }
}

StringUtils in apache commons-lang have CountMatches method to counts the number of occurrences of one String in another.

   String input = "i have a male cat. the color of male cat is Black";
   int occurance = StringUtils.countMatches(input, "male cat");
   System.out.println(occurance);

public class TestWordCount {

public static void main(String[] args) {

    int count = numberOfOccurences("Alice", "Alice in wonderland. Alice & chinki are classmates. Chinki is better than Alice.occ");
    System.out.println("count : "+count);

}

public static int numberOfOccurences(String findWord, String sentence) {

    int length = sentence.length();
    int lengthWithoutFindWord = sentence.replace(findWord, "").length();
    return (length - lengthWithoutFindWord)/findWord.length();

}

}


There are so many ways for the occurrence of substring and two of theme are:-

public class Test1 {
public static void main(String args[]) {
    String st = "abcdsfgh yfhf hghj gjgjhbn hgkhmn abc hadslfahsd abcioh abc  a ";
    count(st, 0, "a".length());

}

public static void count(String trim, int i, int length) {
    if (trim.contains("a")) {
        trim = trim.substring(trim.indexOf("a") + length);
        count(trim, i + 1, length);
    } else {
        System.out.println(i);
    }
}

public static void countMethod2() {
    int index = 0, count = 0;
    String inputString = "mynameiskhanMYlaptopnameishclMYsirnameisjasaiwalmyfrontnameisvishal".toLowerCase();
    String subString = "my".toLowerCase();

    while (index != -1) {
        index = inputString.indexOf(subString, index);
        if (index != -1) {
            count++;
            index += subString.length();
        }
    }
    System.out.print(count);
}}

Examples related to java

Under what circumstances can I call findViewById with an Options Menu / Action Bar item? How much should a function trust another function How to implement a simple scenario the OO way Two constructors How do I get some variable from another class in Java? this in equals method How to split a string in two and store it in a field How to do perspective fixing? String index out of range: 4 My eclipse won't open, i download the bundle pack it keeps saying error log

Examples related to regex

Why my regexp for hyphenated words doesn't work? grep's at sign caught as whitespace Preg_match backtrack error regex match any single character (one character only) re.sub erroring with "Expected string or bytes-like object" Only numbers. Input number in React Visual Studio Code Search and Replace with Regular Expressions Strip / trim all strings of a dataframe return string with first match Regex How to capture multiple repeated groups?

Examples related to string

How to split a string in two and store it in a field String method cannot be found in a main class method Kotlin - How to correctly concatenate a String Replacing a character from a certain index Remove quotes from String in Python Detect whether a Python string is a number or a letter How does String substring work in Swift How does String.Index work in Swift swift 3.0 Data to String? How to parse JSON string in Typescript