[git] What's the difference between HEAD, working tree and index, in Git?

This is an inevitably long yet easy to follow explanation from ProGit book:

Note: For reference you can read Chapter 7.7 of the book, Reset Demystified

Git as a system manages and manipulates three trees in its normal operation:

  • HEAD: Last commit snapshot, next parent
  • Index: Proposed next commit snapshot
  • Working Directory: Sandbox

The HEAD

HEAD is the pointer to the current branch reference, which is in turn a pointer to the last commit made on that branch. That means HEAD will be the parent of the next commit that is created. It’s generally simplest to think of HEAD as the snapshot of your last commit on that branch.

What does it contain?
To see what that snapshot looks like run the following in root directory of your repository:

                                 git ls-tree -r HEAD

it would result in something like this:

                       $ git ls-tree -r HEAD  
                       100644 blob a906cb2a4a904a152... README  
                       100644 blob 8f94139338f9404f2... Rakefile  
                       040000 tree 99f1a6d12cb4b6f19... lib  

The Index

Git populates this index with a list of all the file contents that were last checked out into your working directory and what they looked like when they were originally checked out. You then replace some of those files with new versions of them, and git commit converts that into the tree for a new commit.

What does it contain?
Use git ls-files -s to see what it looks like. You should see something like this:

                 100644 a906cb2a4a904a152e80877d4088654daad0c859 0 README   
                 100644 8f94139338f9404f26296befa88755fc2598c289 0 Rakefile  
                 100644 47c6340d6459e05787f644c2447d2595f5d3a54b 0 lib/simplegit.rb  

The Working Directory

This is where your files reside and where you can try changes out before committing them to your staging area (index) and then into history.

Visualized Sample

Let's see how do these three trees (As the ProGit book refers to them) work together?
Git’s typical workflow is to record snapshots of your project in successively better states, by manipulating these three trees. Take a look at this picture:

enter image description here

To get a good visualized understanding consider this scenario. Say you go into a new directory with a single file in it. Call this v1 of the file. It is indicated in blue. Running git init will create a Git repository with a HEAD reference which points to the unborn master branch

enter image description here

At this point, only the working directory tree has any content. Now we want to commit this file, so we use git add to take content in the working directory and copy it to the index.

enter image description here

Then we run git commit, which takes the contents of the index and saves it as a permanent snapshot, creates a commit object which points to that snapshot, and updates master to point to that commit.

enter image description here

If we run git status, we’ll see no changes, because all three trees are the same.

The beautiful point

git status shows the difference between these trees in the following manner:

  • If the Working Tree is different from index, then git status will show there are some changes not staged for commit
  • If the Working Tree is the same as index, but they are different from HEAD, then git status will show some files under changes to be committed section in its result
  • If the Working Tree is different from the index, and index is different from HEAD, then git status will show some files under changes not staged for commit section and some other files under changes to be committed section in its result.

For the more curious

Note about git reset command
Hopefully, knowing how reset command works will further brighten the reason behind the existence of these three trees.

reset command is your Time Machine in git which can easily take you back in time and bring some old snapshots for you to work on. In this manner, HEAD is the wormhole through which you can travel in time. Let's see how it works with an example from the book:

Consider the following repository which has a single file and 3 commits which are shown in different colours and different version numbers:

enter image description here

The state of trees is like the next picture:

enter image description here

Step 1: Moving HEAD (--soft):

The first thing reset will do is move what HEAD points to. This isn’t the same as changing HEAD itself (which is what checkout does). reset moves the branch that HEAD is pointing to. This means if HEAD is set to the master branch, running git reset 9e5e6a4 will start by making master point to 9e5e6a4. If you call reset with --soft option it will stop here, without changing index and working directory. Our repo will look like this now:
Notice: HEAD~ is the parent of HEAD

enter image description here

Looking a second time at the image, we can see that the command essentially undid the last commit. As the working tree and the index are the same but different from HEAD, git status will now show changes in green ready to be committed.

Step 2: Updating the index (--mixed):

This is the default option of the command

Running reset with --mixed option updates the index with the contents of whatever snapshot HEAD points to currently, leaving Working Directory intact. Doing so, your repository will look like when you had done some work that is not staged and git status will show that as changes not staged for commit in red. This option will also undo the last commit and also unstage all the changes. It's like you made changes but have not called git add command yet. Our repo would look like this now:

enter image description here

Step 3: Updating the Working Directory (--hard)

If you call reset with --hard option it will copy contents of the snapshot HEAD is pointing to into HEAD, index and Working Directory. After executing reset --hard command, it would mean like you got back to a previous point in time and haven't done anything after that at all. see the picture below:

enter image description here

Conclusion

I hope now you have a better understanding of these trees and have a great idea of the power they bring to you by enabling you to change your files in your repository to undo or redo things you have done mistakenly.