[bash] Count number of lines in a git repository

How would I count the total number of lines present in all the files in a git repository?

git ls-files gives me a list of files tracked by git.

I'm looking for a command to cat all those files. Something like

git ls-files | [cat all these files] | wc -l

This question is related to bash git shell line-count


This tool on GitHub, https://github.com/flosse/sloc, gives more descriptive output. It creates stats of your source code:

  • physical lines
  • lines of code (source)
  • lines with comments
  • single-line comments
  • lines with block comments
  • lines mixed up with source and comments
  • empty lines

Try:

find . -type f -name '*.*' -exec wc -l {} + 

on the directory/directories in question


I was playing around with cmder (http://gooseberrycreative.com/cmder/) and wanted to count the lines of HTML, CSS, Java and JavaScript. While some of the answers above worked, the "or" pattern in grep didn't. I found here (https://unix.stackexchange.com/questions/37313/how-do-i-grep-for-multiple-patterns) that I had to escape it.

So this is what I use now:

git ls-files | grep "\.\(html\|css\|js\|java\)$" | xargs wc -l
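
As a hedged alternative, extended regular expressions (grep -E) avoid the backslash-escaped alternation altogether. The sketch below wraps that in a small helper. The name filter_by_ext is made up for illustration, and the helper assumes file names contain no newlines and that the extensions passed in contain no regex metacharacters.

```shell
# filter_by_ext: hypothetical helper, not part of git.
# Reads file names on stdin (one per line) and keeps only those whose
# extension is one of the arguments.
filter_by_ext() {
    (
        # Inside this subshell, "$*" joins the arguments with "|",
        # producing an ERE alternation such as html|css|js|java.
        IFS='|'
        grep -E "\.($*)\$"
    )
}

# Example (run inside a repository):
#   git ls-files | filter_by_ext html css js java | xargs wc -l
```

The literal dot and the trailing $ anchor mean that notes.json is not mistaken for a .js file.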


If you want to get the number of lines from a certain author, try the following code:

git ls-files "*.java" | xargs -I{} git blame {} | grep "${your_name}" | wc -l
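
The grep above matches the name anywhere on a blame line, including inside the source text itself. A stricter variant, sketched here under assumptions, uses git blame --line-porcelain, which prints a separate "author NAME" header for every line. The function name count_author_lines is illustrative only, and the author argument must match the recorded author name exactly.

```shell
# count_author_lines: hypothetical helper name.
# $1 = exact author name as recorded by git; remaining args = pathspecs.
count_author_lines() {
    author=$1
    shift
    # -z / -0 keep file names with spaces intact; -n1 runs one blame per file.
    git ls-files -z -- "$@" |
        xargs -0 -n1 git blame --line-porcelain -- |
        # -x matches whole lines only, -F treats the name as a fixed string,
        # -c prints the number of matching lines.
        grep -cxF "author $author"
}

# Example (run inside a repository):
#   count_author_lines "Jane Doe" "*.java"
```

Source lines in porcelain output start with a tab, so a line of code that happens to read "author Jane Doe" is not counted. If the pathspec matches no files, some xargs implementations still run git blame once with no file, which fails with an error.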

I use the following:

git grep ^ | wc -l

This searches all files versioned by git for the regex ^, which represents the beginning of a line, so this command gives the total number of lines!
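
If a breakdown by file extension is useful, git grep -c ^ prints one path:count pair per file, and those pairs can be grouped with awk. This is a rough sketch: lines_by_extension is an illustrative name, and it assumes paths contain no newlines.

```shell
# lines_by_extension: hypothetical helper name.
# Reads "path:count" lines (the format printed by `git grep -c`) on stdin
# and prints the summed count per file extension, largest first.
lines_by_extension() {
    awk -F: '
        {
            count = $NF    # the last colon-separated field is the count
            path = substr($0, 1, length($0) - length(count) - 1)
            n = split(path, parts, "/")
            base = parts[n]
            ext = "(none)"
            # Use the text after the last dot, unless the dot is the
            # first character (dotfiles such as .gitignore).
            if (match(base, /\.[^.]+$/) && RSTART > 1)
                ext = substr(base, RSTART + 1)
            total[ext] += count
        }
        END { for (e in total) print total[e], e }
    ' | sort -rn
}

# Example (run inside a repository):
#   git grep -c ^ | lines_by_extension
```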


The answer by Carl Norum assumes that no file names contain spaces. Space is one of the characters in IFS, the others being tab and newline, so such names get split into several words. The solution is to terminate each file name with a NUL byte instead:

 git ls-files -z | xargs -0 cat | wc -l

I've encountered batching problems with git ls-files | xargs wc -l when dealing with large numbers of files, where the line counts will get chunked out into multiple total lines.

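Taking a tip from the question Why does the wc utility generate multiple lines with "total"?, I've found that the following command bypasses the issue, as long as no file name contains whitespace and the file list fits within the system's argument-length limit: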

wc -l $(git ls-files)

Or if you want to only examine some files, e.g. code:

wc -l $(git ls-files | grep '\.cs$')
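
Another way to cope with batching, sketched here under assumptions, is to let xargs split the file list however it likes and add up the per-file counts afterwards, ignoring every "total" line. The name sum_wc_counts is illustrative, and the helper assumes no tracked file is literally named "total".

```shell
# sum_wc_counts: hypothetical helper name.
# Reads `wc -l` output on stdin and sums the per-file counts, skipping the
# "total" summary line that wc prints once per batch.
sum_wc_counts() {
    awk '!(NF == 2 && $2 == "total") { sum += $1 } END { print sum + 0 }'
}

# Example (run inside a repository):
#   git ls-files -z | xargs -0 wc -l | sum_wc_counts
```

Because the summary lines are skipped rather than relied upon, the result does not depend on how many batches xargs created.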


The best solution, to me anyway, is buried in the comments of @ephemient's answer. I am just pulling it up here so that it doesn't go unnoticed. The credit for this should go to @FRoZeN (and @ephemient).

git diff --shortstat `git hash-object -t tree /dev/null`

returns the total number of files and lines in the working directory of a repo, without any additional noise. As a bonus, only the source code is counted: binary files are excluded from the tally.

The command above works on Linux and OS X. The cross-platform version of it is

git diff --shortstat 4b825dc642cb6eb9a060e54bf8d69288fbee4904

That works on Windows, too.

For the record, the options for excluding blank lines,

  • -w/--ignore-all-space,
  • -b/--ignore-space-change,
  • --ignore-blank-lines,
  • --ignore-space-at-eol

don't have any effect when used with --shortstat. Blank lines are counted.


If you want this count because you want to get an idea of the project’s scope, you may prefer the output of CLOC (“Count Lines of Code”), which gives you a breakdown of significant and insignificant lines of code by language.

cloc $(git ls-files)

(This line is equivalent to git ls-files | xargs cloc. It uses sh’s $() command substitution feature.)

Sample output:

      20 text files.
      20 unique files.                              
       6 files ignored.

http://cloc.sourceforge.net v 1.62  T=0.22 s (62.5 files/s, 2771.2 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Javascript                       2             13            111            309
JSON                             3              0              0             58
HTML                             2              7             12             50
Handlebars                       2              0              0             37
CoffeeScript                     4              1              4             12
SASS                             1              1              1              5
-------------------------------------------------------------------------------
SUM:                            14             22            128            471
-------------------------------------------------------------------------------

You will have to install CLOC first. You can probably install cloc with your package manager – for example, brew install cloc with Homebrew.

cloc $(git ls-files) is often an improvement over plain "cloc ." (cloc run on the current directory). For example, the above sample output with git ls-files reports 471 lines of code. For the same project, "cloc ." reports a whopping 456,279 lines (and takes six minutes to run), because it searches the dependencies in the Git-ignored node_modules folder.


git diff --stat 4b825dc642cb6eb9a060e54bf8d69288fbee4904

This shows the differences from the empty tree to your current working tree, which happens to count all lines in your current working tree.

To get the numbers in your current working tree, do this:

git diff --shortstat `git hash-object -t tree /dev/null`

It will give you a string like 1770 files changed, 166776 insertions(+).
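
If only the number is wanted, for example in a script, the insertion count can be pulled out of that string. A minimal sketch follows; insertions_from_shortstat is an illustrative name, and the helper assumes git prints its messages in English.

```shell
# insertions_from_shortstat: hypothetical helper name.
# Reads `git diff --shortstat` output on stdin and prints only the number
# that precedes "insertion(+)" or "insertions(+)".
insertions_from_shortstat() {
    sed -n 's/.* \([0-9][0-9]*\) insertion.*/\1/p'
}

# Example (run inside a repository):
#   git diff --shortstat 4b825dc642cb6eb9a060e54bf8d69288fbee4904 | insertions_from_shortstat
```

Nothing is printed when the input contains no insertion count, so an empty result can be treated as zero by the caller.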


If you want to find the total number of non-empty lines, you could use AWK:

git ls-files | xargs cat | awk '/[^[:space:]]/{x++} END{print "Total number of non-empty lines:", x+0}'

This uses regex to count the lines containing a non-whitespace character.
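
The same count can be had from grep alone. The sketch below is one possible variant rather than the answer's own method; count_nonblank_lines is an illustrative name, and the POSIX class [[:space:]] is used so that it does not depend on GNU extensions.

```shell
# count_nonblank_lines: hypothetical helper name.
# Reads text on stdin and prints how many lines contain at least one
# non-whitespace character.
count_nonblank_lines() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # nothing matches, so "|| true" keeps the function from reporting failure.
    grep -c '[^[:space:]]' || true
}

# Example (run inside a repository):
#   git ls-files -z | xargs -0 cat | count_nonblank_lines
```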


This works as of cloc 1.68:

cloc --vcs=git


Depending on whether or not you want to include binary files, there are two solutions.

  1. git grep --cached -al '' | xargs -P 4 cat | wc -l
  2. git grep --cached -Il '' | xargs -P 4 cat | wc -l

    "xargs -P 4" means it can read the files using four parallel processes. This can be really helpful if you are scanning very large repositories. Depending on capacity of the machine you may increase number of processes.

    -a, process binary files as text (include binary files)
    -l, show only file names instead of matching lines; the empty pattern '' matches every line, so only non-empty files are listed
    -I, don't match patterns in binary files (exclude binary files)
    --cached, search the index instead of the work tree (includes staged but uncommitted files)


: | git mktree | git diff --shortstat --stdin

Or:

git ls-tree @ | sed '1i\\' | git mktree --batch | xargs | git diff-tree --shortstat --stdin

I did this:

git ls-files | xargs file | grep "ASCII" | cut -d : -f 1 | xargs wc -l

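This works if you count all text files in the repository as the files of interest. If some are considered documentation, etc., an exclusion filter can be added. Note that files which file reports as UTF-8 text rather than ASCII are skipped by the "ASCII" filter.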

