[java] Counting number of lines, words, and characters in a text file

I am trying to take a file name from the user and print the number of lines, words, and characters in that text file. However, only the word count is correct; the program always prints 0 for the lines and characters.

import java.util.*;
import java.io.*;

public class TextFileInfoPrinter
{  
    public static void main(String[]args) throws FileNotFoundException        
    { 
            Scanner console = new Scanner(System.in);           

            System.out.println("File to be read: ");
            String inputFile = console.next();

            File file = new File(inputFile);
            Scanner in = new Scanner(file);

            int words = 0;
            int lines = 0;
            int chars = 0;

            while(in.hasNext())
            {
                in.next();
                words++;
            }

            while(in.hasNextLine())
            {
                in.nextLine();
                lines++;
            }

            while(in.hasNextByte())
            {
                in.nextByte();
                chars++;
            }

            System.out.println("Number of lines: " + lines);
            System.out.println("Number of words: " + words);
            System.out.println("Number of characters: " + chars);
    }
}


The answer is


The first while loop leaves the Scanner at the end of the file, so the later loops have nothing left to read. Try this:

Scanner in = new Scanner(file);
while(in.hasNext())
{
    in.next();
    words++;
}

in = new Scanner(file);
while(in.hasNextLine())
{
    in.nextLine();
    lines++;
}

in = new Scanner(file);
while(in.hasNextByte())
{
    in.nextByte();
    chars++;
}
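One caveat about the last loop: per the Scanner documentation, hasNextByte() reports whether the next token can be parsed as a byte-sized number, not whether another character is available. On ordinary text it returns false straight away, so chars can stay at 0 even with a fresh Scanner. Below is a minimal sketch of counting characters with a Reader instead; the class and method names (CharCountSketch, countChars) are made up for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Scanner;

// Hypothetical helper class, not part of the original answer.
class CharCountSketch {
    // Counts every character the reader delivers, including spaces and line breaks.
    static int countChars(Reader reader) {
        int count = 0;
        try {
            while (reader.read() != -1) {
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        // "hello" is not a number, so hasNextByte() is false on the first call.
        System.out.println(new Scanner("hello world").hasNextByte());
        // Reading character by character counts the space as well.
        System.out.println(countChars(new StringReader("hello world")));
    }
}
```

For a real file you would pass something like new BufferedReader(new FileReader(file)) instead of the StringReader used here.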

in.next() consumes all of the input in the first while loop. Once that loop ends, there are no more characters left to read from the input stream.

You should nest your character and word-counting within a while loop counting lines.
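The exhaustion described above can be reproduced with a small sketch that scans a string rather than a file. The class name, the helper methods, and the sample text are assumptions for illustration; the sample deliberately has no trailing newline.

```java
import java.util.Scanner;

// Hypothetical demo class, not part of the original answer.
class ScannerExhaustionSketch {
    // Counts whitespace-separated tokens; leaves the scanner at the end of its input.
    static int countWords(Scanner in) {
        int words = 0;
        while (in.hasNext()) {
            in.next();
            words++;
        }
        return words;
    }

    // Counts whatever lines are still left in the scanner.
    static int countLines(Scanner in) {
        int lines = 0;
        while (in.hasNextLine()) {
            in.nextLine();
            lines++;
        }
        return lines;
    }

    public static void main(String[] args) {
        String text = "one two\nthree";   // sample input, no trailing newline
        Scanner in = new Scanner(text);
        int words = countWords(in);
        int linesSameScanner = countLines(in);                  // nothing left to read
        int linesFreshScanner = countLines(new Scanner(text));  // starts from the beginning
        System.out.println(words + " " + linesSameScanner + " " + linesFreshScanner);
    }
}
```

The same Scanner reports no lines after the word loop, while a fresh Scanner over the same text sees both lines.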


Is there some reason why you think that:

while(in.hasNext())
{
    in.next();
    words++;
}

will not consume the entire input stream?

It will do so, meaning that your other two while loops will never iterate. That's why your values for lines and characters are still set to zero.

You're probably better off reading the file one character at a time, increasing the character count each time through the loop, and also detecting the character to decide whether or not to increment the other counters.

Basically, wherever you find a \n, increase the line count - you should probably also do this if the last character in the stream wasn't \n.

And, whenever you transition from white-space to non-white-space, increase the word count (there'll probably be some tricky edge case processing at the stream beginning but that's an implementation issue).

You're looking at something like the following pseudo-code:

# Init counters and last character.

charCount = 0
wordCount = 0
lineCount = 0
lastChar = ' '

# Start loop.

currChar = getNextChar()
while currChar != EOF:
    # Every character counts.

    charCount++

    # Words only on whitespace-to-non-whitespace transitions.

    if isWhite(lastChar) && !isWhite(currChar):
        wordCount++

    # Lines only on newline characters.

    if currChar == '\n':
        lineCount++

    lastChar = currChar
    currChar = getNextChar()

# Handle incomplete last line (but not an empty stream).

if charCount > 0 && lastChar != '\n':
    lineCount++
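A rough Java rendering of the pseudocode might look like the sketch below. The class name SinglePassCounter, the int[] return layout, and the use of Character.isWhitespace for isWhite are my assumptions, not something the answer specifies. It also skips the final line adjustment when the input is empty.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical class name; a sketch of the single-pass idea, not a definitive implementation.
class SinglePassCounter {
    // Returns {lines, words, chars} for everything the reader delivers.
    static int[] count(Reader reader) {
        int lines = 0, words = 0, chars = 0;
        int last = ' ';   // treat the start of input as whitespace
        try {
            int curr;
            while ((curr = reader.read()) != -1) {
                chars++;   // every character counts
                // a word starts on each whitespace-to-non-whitespace transition
                if (Character.isWhitespace(last) && !Character.isWhitespace(curr)) {
                    words++;
                }
                if (curr == '\n') {
                    lines++;
                }
                last = curr;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // count an incomplete last line, but not for empty input
        if (chars > 0 && last != '\n') {
            lines++;
        }
        return new int[] {lines, words, chars};
    }

    public static void main(String[] args) {
        int[] r = count(new StringReader("one two\nthree"));
        System.out.println("lines=" + r[0] + " words=" + r[1] + " chars=" + r[2]);
    }
}
```

Note that the character count here is in Java char units as delivered by the Reader, and it includes the line break characters.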

I'm no Java expert, but I would presume that the .hasNext, .hasNextLine and .hasNextByte all use and increment the same file position indicator. You'll need to reset that, either by creating a new Scanner as Aashray mentioned, or using a RandomAccessFile and calling file.seek(0); after each loop.
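To illustrate the seek(0) idea: Scanner has no constructor that takes a RandomAccessFile, so this sketch reads through the RandomAccessFile directly, counting lines with readLine() and then rewinding to count bytes. The class name RewindSketch and the temp-file helper countText are hypothetical and exist only to make the example self-contained.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of the original answer.
class RewindSketch {
    // Returns {lines, bytes}, reading the file twice through one RandomAccessFile.
    static long[] countTwice(Path path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "r")) {
            long lines = 0;
            while (raf.readLine() != null) {
                lines++;
            }
            raf.seek(0);   // rewind to the start of the file
            long bytes = 0;
            while (raf.read() != -1) {
                bytes++;
            }
            return new long[] {lines, bytes};
        }
    }

    // Demo helper: writes the text to a temporary file, counts it, then deletes the file.
    static long[] countText(String text) {
        try {
            Path tmp = Files.createTempFile("rewind-sketch", ".txt");
            try {
                Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
                return countTwice(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] r = countText("one two\nthree");
        System.out.println("lines=" + r[0] + " bytes=" + r[1]);
    }
}
```

RandomAccessFile is unbuffered, so reading byte by byte like this is slow on large files, and the second pass counts bytes rather than characters. For most cases creating a new Scanner is the simpler option.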


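import java.io.*;
class wordcount
{
    public static int words=0;
    public static int lines=0;
    public static int chars=0;
    public static void wc(InputStreamReader isr)throws IOException
    {
        int c=0;
        boolean lastwhite=true; // true while the previous character was whitespace
        while((c=isr.read())!=-1)
        {
            chars++; // every character counts exactly once
            if(c=='\n')
                lines++;
            if(c=='\t' || c==' ' || c=='\n' || c=='\r')
                lastwhite=true;
            else
            {
                // a word starts at each whitespace-to-non-whitespace transition
                if(lastwhite)
                    ++words;
                lastwhite=false;
            }
        }
    }
    public static void main(String[] args)
    {
        try
        {
            if(args.length==0)
            {
                wc(new InputStreamReader(System.in));
            }
            else
            {
                for(int i=0;i<args.length;i++)
                {
                    // try-with-resources closes each file after counting
                    try(FileReader fr=new FileReader(args[i]))
                    {
                        wc(fr);
                    }
                }
            }
        }
        catch(IOException ie)
        {
            System.err.println("Could not read input: "+ie.getMessage());
            return;
        }
        System.out.println(lines+" "+words+" "+chars);
    }
}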

I think the best answer is

int words = 0;
int lines = 0;
int chars = 0;
while(in.hasNextLine())  {
    lines++;
    String line = in.nextLine();
    for(int i=0;i<line.length();i++)
    {
        if(line.charAt(i)!=' ' && line.charAt(i)!='\n')
            chars++;
    }
    words += new StringTokenizer(line, " ,").countTokens();
}

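I agree with @Cthulhu's answer. Note that calling in.reset() will not help here: Scanner.reset() only restores the scanner's default delimiter, locale, and radix, and does not move it back to the start of the file. To read the file again from the first line, create a new Scanner:

in = new Scanner(file);

This gives you a Scanner positioned at the first line of your file.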


while(in.hasNextLine())  {
    lines++;
    String line = in.nextLine();
    for(int i=0;i<line.length();i++)
    {
        if(line.charAt(i)!=' ' && line.charAt(i)!='\n')
            chars++;
    }
    words += new StringTokenizer(line, " ,;:.").countTokens();
}
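The delimiter string passed to StringTokenizer decides what counts as a word boundary, so the word count depends on it. A small sketch comparing the two delimiter sets used in these answers; the class name and the sample line are made up for illustration.

```java
import java.util.StringTokenizer;

// Hypothetical demo class, not part of the original answer.
class DelimiterSketch {
    // Counts the tokens in a line for a given set of delimiter characters.
    static int countTokens(String line, String delimiters) {
        return new StringTokenizer(line, delimiters).countTokens();
    }

    public static void main(String[] args) {
        String line = "Hello, world. Bye;now";   // sample line
        // Space and comma only: "Hello", "world.", "Bye;now"
        System.out.println(countTokens(line, " ,"));
        // Space plus common punctuation: "Hello", "world", "Bye", "now"
        System.out.println(countTokens(line, " ,;:."));
    }
}
```

Which set is right depends on what you want to treat as a word; neither handles tabs unless you add '\t' to the delimiter string.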

Try this:

    int words = 0;
    int lines = 0;
    int chars = 0;
    while(in.hasNextLine())  {
        lines++;
        String line = in.nextLine();
        chars += line.length();
        words += new StringTokenizer(line, " ,").countTokens();
    }

You could use regular expressions to count for you.

// requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
String subject = "First Line\n Second Line\nThird Line";
Matcher wordM = Pattern.compile("\\b\\S+?\\b").matcher(subject); //matches a word
Matcher charM = Pattern.compile(".").matcher(subject); //matches a character (by default "." skips line terminators)
Matcher newLineM = Pattern.compile("\\r?\\n").matcher(subject); //matches a linebreak

//newLines is initially 1 because the first line has no corresponding linebreak;
//text that ends with a linebreak is therefore counted as having one extra line
int words=0,chars=0,newLines=1;

while(wordM.find()) words++;
while(charM.find()) chars++;
while(newLineM.find()) newLines++;

System.out.println("Words: "+words);
System.out.println("Chars: "+chars);
System.out.println("Lines: "+newLines);
