[git] Generating statistics from Git repository

I'm looking for some good tools/scripts that allow me to generate a few statistics from a git repository. I've seen this feature on some code hosting sites, and they contained information like...

  • commits per author
  • commits per day/week/year/etc.
  • lines of code over time
  • graphs
  • ... much more

Basically I just want to get an idea how much my project grows over time, which developer commits most code, and so on.

This question is related to git graph statistics

The answer is


Just want to add gitqlite into the mix of answers here, which is a command-line tool that enables execution of SQL queries on git data, such as SELECT * FROM commits WHERE author_name = 'foo' etc.
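The idea of querying commits with SQL can also be tried without any extra tool. The sketch below is not gitqlite and does not use its schema; it loads `git log` output into an in-memory SQLite table through Python's standard `sqlite3` module. The table name `commits` and the column `author_name` mirror the query above, while `hash`, `author_date`, and the helper names `read_commits` and `build_db` are assumptions made for illustration.

```python
import sqlite3
import subprocess


def read_commits(repo_path="."):
    """Return (hash, author_date, author_name) tuples from `git log`.

    Hypothetical helper: needs `git` on PATH and a repository at repo_path.
    """
    out = subprocess.run(
        ["git", "-C", repo_path, "log", "--date=short",
         "--format=%H%x09%ad%x09%an"],  # %x09 is a tab separator
        capture_output=True, text=True, check=True,
    ).stdout
    # The name comes last so that a tab inside it cannot shift the columns.
    return [tuple(line.split("\t", 2)) for line in out.splitlines() if line]


def build_db(rows):
    """Create an in-memory database with a `commits` table filled from rows."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE commits (hash TEXT, author_date TEXT, author_name TEXT)"
    )
    db.executemany("INSERT INTO commits VALUES (?, ?, ?)", rows)
    return db


# Usage (inside a git repository):
#   db = build_db(read_commits("."))
#   db.execute("SELECT * FROM commits WHERE author_name = ?", ("foo",)).fetchall()
```

Once the rows are in a table, aggregates such as `SELECT author_name, COUNT(*) FROM commits GROUP BY author_name` give commits per author.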

Full disclosure, I'm a creator/maintainer of the project!


If your project is on GitHub, you now (April 2013) have Pulse (see "Get up to speed with Pulse"):

It is more limited, and won't display all the stats you might need, but is readily available for any GitHub project.

Pulse is a great way to discover recent activity on projects.
Pulse will show you who has been actively committing and what has changed in a project's default branch.

You can find the link to the left of the nav bar.

Note that there isn't (yet) an API to extract that information.


repostat is an enhanced fork of the gitstats tool.

I'm not sure if it's in any way related to the project with the same name on pypi, so your best bet is to download the latest release from GitHub and install it in your Python environment.

As of November 2019, I was able to use v1.2.0 under Windows 7, after making gnuplot available in PATH.


usage: repostat [-h] [-v] [-c CONFIG_FILE] [--no-browser] [--copy-assets]
                git_repo output_path

Git repository desktop analyzer. Analyze and generate git statistics in HTML
format

positional arguments:
git_repo              Path to git repository
output_path           Path to an output directory

optional arguments:
-h, --help            show this help message and exit
-v, --version         show program's version number and exit
-c CONFIG_FILE, --config-file CONFIG_FILE
                        Configuration file path
--no-browser          Do not open report in browser
--copy-assets         Copy assets (images, css, etc.) into report folder
                        (report becomes relocatable)

Just yesterday I added my git-analytics docker-compose file, which starts several containers for analyzing multiple git repositories and comparing them against each other.

It can show you per-author commit statistics over time, as well as several diff statistics.

You can use the provided angular client and also kibana to visualize the statistics.

https://github.com/alexejsailer/git-analytics-docker

It will be improved over time.

[Screenshot: Angular client]

[Screenshot: Kibana client]


I tried http://gitstats.sourceforge.net/; the stats are very interesting.

Once git clone git://repo.or.cz/gitstats.git is done, go to that folder and run gitstats <git repo location> <report output folder>. Create a new folder for the report, as this generates lots of files.

Here is a quick list of stats from this:

  • activity
    • hour of the day
    • day of week
  • authors
    • List of Authors
    • Author of Month
    • Author of Year
  • files
    • File count by date
    • Extensions
  • lines
    • Lines of Code
  • tags

commits per author

git shortlog -s -n 
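To use those counts in a script, the output of `git shortlog -s -n` can be parsed line by line. This is a minimal sketch, assuming Python 3.7+ and `git` on PATH; the function names `parse_shortlog` and `commits_per_author` are made up for illustration. Note the explicit `HEAD`: when no revision is given and standard input is not a terminal, `git shortlog` reads a log from standard input instead of the repository.

```python
import subprocess


def parse_shortlog(text):
    """Parse `git shortlog -s -n` output into (author, commit_count) pairs."""
    result = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line looks like "    42<TAB>Author Name".
        count, _, author = line.partition("\t")
        result.append((author, int(count)))
    return result


def commits_per_author(repo_path="."):
    """Run git in repo_path and return the parsed counts (hypothetical helper)."""
    out = subprocess.run(
        ["git", "-C", repo_path, "shortlog", "-s", "-n", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_shortlog(out)


# Usage (inside a git repository):
#   for author, count in commits_per_author("."):
#       print(f"{count:6d}  {author}")
```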

git-bars can show you "commits per day/week/year/etc".

You can install it with pip install git-bars (cf. https://github.com/knadh/git-bars)

The output looks like this:

$ git-bars -p month
370 commits over 19 month(s)
2019-10  7    ¯¯¯¯¯¯
2019-09  36   ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
2019-08  7    ¯¯¯¯¯¯
2019-07  10   ¯¯¯¯¯¯¯¯
2019-05  4    ¯¯¯
2019-04  2    ¯
2019-03  28   ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
2019-02  32   ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
2019-01  16   ¯¯¯¯¯¯¯¯¯¯¯¯¯¯
2018-12  41   ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
2018-11  52   ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
2018-10  57   ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
2018-09  37   ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
2018-08  17   ¯¯¯¯¯¯¯¯¯¯¯¯¯¯
2018-07  1    
2018-04  7    ¯¯¯¯¯¯
2018-03  12   ¯¯¯¯¯¯¯¯¯¯
2018-02  2    ¯
2016-01  2    ¯
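A similar per-month summary can be approximated with plain `git log`. The sketch below is not how git-bars is implemented; it simply counts author dates per month and draws `#` bars scaled to the busiest month. The helper names `author_dates`, `commits_per_month`, and `render_bars` are hypothetical, and `git` is assumed to be on PATH.

```python
import subprocess
from collections import Counter


def author_dates(repo_path="."):
    """Return one YYYY-MM-DD author date per commit (hypothetical helper)."""
    out = subprocess.run(
        ["git", "-C", repo_path, "log", "--date=short", "--format=%ad"],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.split()


def commits_per_month(dates):
    """Count commits per month from ISO dates such as '2019-10-03'."""
    return Counter(d[:7] for d in dates if d)


def render_bars(counts, width=50):
    """Return one text line per month, newest first, with a scaled bar."""
    peak = max(counts.values(), default=0)
    lines = []
    for month in sorted(counts, reverse=True):
        n = counts[month]
        bar = "#" * (n * width // peak) if peak else ""
        lines.append(f"{month}  {n:<4d} {bar}")
    return lines


# Usage (inside a git repository):
#   print("\n".join(render_bars(commits_per_month(author_dates(".")))))
```

Grouping by week or year only needs a different key in `commits_per_month`, for example `d[:4]` for the year.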

A quick Google search led me to: http://gitstats.sourceforge.net/

Have you tried this project? I'm sure there are similar projects.


And if you prefer a hosted solution, you should check out Open Hub (formerly Ohloh.net). It is nice, but don't expect extensive statistics.


I'm writing a git repository statistics generator in Ruby; it's called git_stats.

You can find examples generated for some repositories on the project page.

Here is a list of what it can do:

  • General statistics
    • Total files (text and binary)
    • Total lines (added and deleted)
    • Total commits
    • Authors
  • Activity (total and per author)
    • Commits by date
    • Commits by hour of day
    • Commits by day of week
    • Commits by hour of week
    • Commits by month of year
    • Commits by year
    • Commits by year and month
  • Authors
    • Commits by author
    • Lines added by author
    • Lines deleted by author
    • Lines changed by author
  • Files and lines
    • By date
    • By extension
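For readers curious how figures such as "lines added by author" can be derived, here is a rough sketch based on `git log --numstat`. It is not taken from git_stats (which is written in Ruby); the `author:` marker, the helper names `numstat_log` and `lines_by_author`, and the choice to skip binary files are all assumptions of this example.

```python
import subprocess
from collections import defaultdict


def numstat_log(repo_path="."):
    """Return `git log --numstat` text, each commit headed by 'author:<name>'."""
    return subprocess.run(
        ["git", "-C", repo_path, "log", "--numstat", "--format=author:%an"],
        capture_output=True, text=True, check=True,
    ).stdout


def lines_by_author(log_text):
    """Sum (added, deleted) line counts per author from numstat_log() output."""
    totals = defaultdict(lambda: [0, 0])  # author -> [added, deleted]
    author = None
    for line in log_text.splitlines():
        if line.startswith("author:"):
            author = line[len("author:"):]
            totals[author]  # register the author even without countable lines
            continue
        parts = line.split("\t")
        if author is None or len(parts) < 3:
            continue  # blank separator lines between header and numstat rows
        added, deleted = parts[0], parts[1]
        if added == "-" or deleted == "-":
            continue  # binary files are reported as "-" and carry no counts
        totals[author][0] += int(added)
        totals[author][1] += int(deleted)
    return {name: tuple(counts) for name, counts in totals.items()}


# Usage (inside a git repository):
#   for name, (added, deleted) in lines_by_author(numstat_log(".")).items():
#       print(name, added, deleted)
```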

If you have any ideas about what to add or improve, please let me know; I would appreciate any feedback.

