[python] Python Git Module experiences?

What are people's experiences with any of the Git modules for Python? (I know of GitPython, PyGit, and Dulwich - feel free to mention others if you know of them.)

I am writing a program which will have to interact (add, delete, commit) with a Git repository, but have no experience with Git, so one of the things I'm looking for is ease of use/understanding with regards to Git.

The other things I'm primarily interested in are maturity and completeness of the library, a reasonable lack of bugs, continued development, and helpfulness of the documentation and developers.

If you think of something else I might want/need to know, please feel free to mention it.

This question is related to python git

The answer is


PTBNL's Answer is quite perfect for me. I make a little more for Windows user.

import time
import subprocess
def gitAdd(fileName, repoDir):
    cmd = 'git add ' + fileName
    pipe = subprocess.Popen(cmd, shell=True, cwd=repoDir,stdout = subprocess.PIPE,stderr = subprocess.PIPE )
    (out, error) = pipe.communicate()
    print out,error
    pipe.wait()
    return 

def gitCommit(commitMessage, repoDir):
    cmd = 'git commit -am "%s"'%commitMessage
    pipe = subprocess.Popen(cmd, shell=True, cwd=repoDir,stdout = subprocess.PIPE,stderr = subprocess.PIPE )
    (out, error) = pipe.communicate()
    print out,error
    pipe.wait()
    return 
def gitPush(repoDir):
    cmd = 'git push '
    pipe = subprocess.Popen(cmd, shell=True, cwd=repoDir,stdout = subprocess.PIPE,stderr = subprocess.PIPE )
    (out, error) = pipe.communicate()
    pipe.wait()
    return 

temp=time.localtime(time.time())
uploaddate= str(temp[0])+'_'+str(temp[1])+'_'+str(temp[2])+'_'+str(temp[3])+'_'+str(temp[4])

repoDir='d:\\c_Billy\\vfat\\Programming\\Projector\\billyccm' # your git repository , windows your need to use double backslash for right directory.
gitAdd('.',repoDir )
gitCommit(uploaddate, repoDir)
gitPush(repoDir)

This is a pretty old question, and while looking for Git libraries, I found one that was made this year (2013) called Gittle.

It worked great for me (where the others I tried were flaky), and seems to cover most of the common actions.

Some examples from the README:

from gittle import Gittle

# Clone a repository
repo_path = '/tmp/gittle_bare'
repo_url = 'git://github.com/FriendCode/gittle.git'
repo = Gittle.clone(repo_url, repo_path)

# Stage multiple files
repo.stage(['other1.txt', 'other2.txt'])

# Do the commit
repo.commit(name="Samy Pesse", email="[email protected]", message="This is a commit")

# Authentication with RSA private key
key_file = open('/Users/Me/keys/rsa/private_rsa')
repo.auth(pkey=key_file)

# Do push
repo.push()

Maybe it helps, but Bazaar and Mercurial are both using dulwich for their Git interoperability.

Dulwich is probably different than the other in the sense that's it's a reimplementation of git in python. The other might just be a wrapper around Git's commands (so it could be simpler to use from a high level point of view: commit/add/delete), it probably means their API is very close to git's command line so you'll need to gain experience with Git.


The git interaction library part of StGit is actually pretty good. However, it isn't broken out as a separate package but if there is sufficient interest, I'm sure that can be fixed.

It has very nice abstractions for representing commits, trees etc, and for creating new commits and trees.


For the record, none of the aforementioned Git Python libraries seem to contain a "git status" equivalent, which is really the only thing I would want since dealing with the rest of the git commands via subprocess is so easy.


An updated answer reflecting changed times:

GitPython currently is the easiest to use. It supports wrapping of many git plumbing commands and has pluggable object database (dulwich being one of them), and if a command isn't implemented, provides an easy api for shelling out to the command line. For example:

repo = Repo('.')
repo.checkout(b='new_branch')

This calls:

bash$ git checkout -b new_branch

Dulwich is also good but much lower level. It's somewhat of a pain to use because it requires operating on git objects at the plumbing level and doesn't have nice porcelain that you'd normally want to do. However, if you plan on modifying any parts of git, or use git-receive-pack and git-upload-pack, you need to use dulwich.


For the sake of completeness, http://github.com/alex/pyvcs/ is an abstraction layer for all dvcs's. It uses dulwich, but provides interop with the other dvcs's.


Here's a really quick implementation of "git status":

import os
import string
from subprocess import *

repoDir = '/Users/foo/project'

def command(x):
    return str(Popen(x.split(' '), stdout=PIPE).communicate()[0])

def rm_empty(L): return [l for l in L if (l and l!="")]

def getUntracked():
    os.chdir(repoDir)
    status = command("git status")
    if "# Untracked files:" in status:
        untf = status.split("# Untracked files:")[1][1:].split("\n")
        return rm_empty([x[2:] for x in untf if string.strip(x) != "#" and x.startswith("#\t")])
    else:
        return []

def getNew():
    os.chdir(repoDir)
    status = command("git status").split("\n")
    return [x[14:] for x in status if x.startswith("#\tnew file:   ")]

def getModified():
    os.chdir(repoDir)
    status = command("git status").split("\n")
    return [x[14:] for x in status if x.startswith("#\tmodified:   ")]

print("Untracked:")
print( getUntracked() )
print("New:")
print( getNew() )
print("Modified:")
print( getModified() )

I'd recommend pygit2 - it uses the excellent libgit2 bindings


While this question was asked a while ago and I don't know the state of the libraries at that point, it is worth mentioning for searchers that GitPython does a good job of abstracting the command line tools so that you don't need to use subprocess. There are some useful built in abstractions that you can use, but for everything else you can do things like:

import git
repo = git.Repo( '/home/me/repodir' )
print repo.git.status()
# checkout and track a remote branch
print repo.git.checkout( 'origin/somebranch', b='somebranch' )
# add a file
print repo.git.add( 'somefile' )
# commit
print repo.git.commit( m='my commit message' )
# now we are one commit ahead
print repo.git.status()

Everything else in GitPython just makes it easier to navigate. I'm fairly well satisfied with this library and appreciate that it is a wrapper on the underlying git tools.

UPDATE: I've switched to using the sh module for not just git but most commandline utilities I need in python. To replicate the above I would do this instead:

import sh
git = sh.git.bake(_cwd='/home/me/repodir')
print git.status()
# checkout and track a remote branch
print git.checkout('-b', 'somebranch')
# add a file
print git.add('somefile')
# commit
print git.commit(m='my commit message')
# now we are one commit ahead
print git.status()