[git] Is it possible to move/rename files in Git and maintain their history?

I would like to rename/move a project subtree in Git moving it from

/project/xyz

to

/components/xyz

If I use a plain git mv project components, then all the commit history for the xyz project gets lost. Is there a way to move this such that the history is maintained?


The answer is


Simply move the file and stage with:

git add .

Before commit you can check the status:

git status

That will show:

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        renamed:    old-folder/file.txt -> new-folder/file.txt

I tested with Git version 2.26.1.

Extracted from GitHub Help Page.
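As a rough check of the steps above, here is a minimal sketch that replays them in a throwaway repository. The temporary directory, the identity settings, and the file content are assumptions made for illustration; only the folder and file names come from the answer.

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository (assumption)
cd "$repo"
git init -q
git config user.name "Example User"    # hypothetical identity
git config user.email "user@example.com"

mkdir old-folder
printf 'some content\n' > old-folder/file.txt
git add .
git commit -q -m "add file"

# Move with plain mv (not git mv), then stage everything.
mkdir new-folder
mv old-folder/file.txt new-folder/file.txt
git add .

# --porcelain is the script-friendly form of the status output.
status_output=$(git status --porcelain)
echo "$status_output"
```

The rename is inferred from the staged content, so a plain mv followed by git add gives the same staged result as git mv.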


It is possible to rename a file and keep the history intact, although it causes the file to be renamed throughout the entire history of the repository. This is probably only for the obsessive git-log-lovers, and has some serious implications, including these:

  • You could be rewriting a shared history, which is the most important DON'T while using Git. If someone else has cloned the repository, you'll break it doing this. They will have to re-clone to avoid headaches. This might be OK if the rename is important enough, but you'll need to consider this carefully -- you might end up upsetting an entire open-source community!
  • If you've referenced the file using its old name earlier in the repository history, you're effectively breaking earlier versions. To remedy this, you'll have to do a bit more hoop jumping. It's not impossible, just tedious and possibly not worth it.

Now, since you're still with me, you're probably a solo developer renaming a completely isolated file. Let's move a file using filter-branch with a tree filter!

Assume you're going to move a file old into a folder dir and give it the name new

This could be done with git mv old dir/new && git add -u dir/new, but that breaks history.

Instead:

git filter-branch --tree-filter 'if [ -f old ]; then mkdir -p dir && mv old dir/new; fi' HEAD

will redo every commit in the branch, executing the command inside the quotes once per commit. Plenty of stuff can go wrong when you do this. I normally test to see if the file is present (otherwise it's not there yet to move) and then perform the necessary steps to shoehorn the tree to my liking. Here you might sed through files to alter references to the file and so on. Knock yourself out! :)

When completed, the file is moved and the log is intact. You feel like a ninja pirate.

Also, the mkdir step is only necessary if you move the file to a new folder, of course. The if prevents this folder from being created earlier in history than your file exists.
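To see what the tree filter does, here is a minimal sketch that runs it in a throwaway repository. The three commits, the identity settings, and the use of FILTER_BRANCH_SQUELCH_WARNING (which silences the deprecation notice in newer Git versions) are assumptions added for illustration; the filter follows the command above, with mkdir -p so it tolerates an existing dir.

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository (assumption)
cd "$repo"
git init -q
git config user.name "Example User"    # hypothetical identity
git config user.email "user@example.com"

# Commit 1 has no file "old" yet; commits 2 and 3 create and edit it.
echo "readme" > README
git add README
git commit -q -m "initial commit"
echo "v1" > old
git add old
git commit -q -m "add old"
echo "v2" >> old
git commit -q -am "edit old"

# Rewrite every commit reachable from HEAD.
export FILTER_BRANCH_SQUELCH_WARNING=1
git filter-branch --tree-filter \
  'if [ -f old ]; then mkdir -p dir && mv old dir/new; fi' HEAD >/dev/null

# Count the commits that touch each path after the rewrite.
new_path_commits=$(git rev-list --count HEAD -- dir/new)
old_path_commits=$(git rev-list --count HEAD -- old)
echo "dir/new: $new_path_commits commits, old: $old_path_commits commits"
```

After the rewrite the file appears to have lived at dir/new from the start. Every rewritten commit gets a new hash, which is why this is only safe on history nobody else has cloned.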


I would like to rename/move a project subtree in Git moving it from

/project/xyz

to

/components/xyz

If I use a plain git mv project components, then all the commit history for the xyz project gets lost.

No (8 years later, Git 2.19, Q3 2018), because Git will detect the directory rename, and this is now better documented.

See commit b00bf1c, commit 1634688, commit 0661e49, commit 4d34dff, commit 983f464, commit c840e1a, commit 9929430 (27 Jun 2018), and commit d4e8062, commit 5dacd4a (25 Jun 2018) by Elijah Newren (newren).
(Merged by Junio C Hamano -- gitster -- in commit 0ce5a69, 24 Jul 2018)

That is now explained in Documentation/technical/directory-rename-detection.txt:

Example:

When all of x/a, x/b and x/c have moved to z/a, z/b and z/c, it is likely that x/d added in the meantime would also want to move to z/d by taking the hint that the entire directory 'x' moved to 'z'.

But there are many other cases, like:

one side of history renames x -> z, and the other renames some file to x/e, causing the need for the merge to do a transitive rename.

To simplify directory rename detection, Git enforces these rules:

a couple basic rules limit when directory rename detection applies:

  1. If a given directory still exists on both sides of a merge, we do not consider it to have been renamed.
  2. If a subset of to-be-renamed files have a file or directory in the way (or would be in the way of each other), "turn off" the directory rename for those specific sub-paths and report the conflict to the user.
  3. If the other side of history did a directory rename to a path that your side of history renamed away, then ignore that particular rename from the other side of history for any implicit directory renames (but warn the user).

You can see a lot of tests in t/t6043-merge-rename-directories.sh, which also point out that:

  • a) If renames split a directory into two or more others, the directory with the most renames, "wins".
  • b) Avoid directory-rename-detection for a path, if that path is the source of a rename on either side of a merge.
  • c) Only apply implicit directory renames to directories if the other side of history is the one doing the renaming.
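The x/d example from the documentation excerpt can be sketched as follows. The branch names, the file contents, and the explicit merge.directoryRenames=true setting are assumptions not present in the original text; without that setting, newer Git versions may report the implied move as a conflict instead of applying it.

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main  # fix the branch name for this demo
git config user.name "Example User"    # hypothetical identity
git config user.email "user@example.com"

mkdir x
for f in a b c; do echo "content of $f" > "x/$f"; done
git add x
git commit -q -m "add x/a, x/b, x/c"

# Side 1: rename the whole directory x -> z.
git checkout -q -b rename-side
git mv x z
git commit -q -m "rename x to z"

# Side 2: meanwhile, add a new file under the old directory name.
git checkout -q main
echo "content of d" > x/d
git add x/d
git commit -q -m "add x/d"

# Merge the rename into the side that added x/d.
git -c merge.directoryRenames=true merge -q --no-edit rename-side
merged_files=$(git ls-files)
echo "$merged_files"
```

The new file follows the directory rename even though neither side ever moved it explicitly.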

I move the files and then run

git add -A

which puts all deleted and new files in the staging area. At this point Git recognizes that the file was moved.

git commit -m "my message"
git push

I do not know why but this works for me.


I do:

git mv {old} {new}
git add -u {new}

First create a standalone commit that contains only the rename.

Then put any subsequent changes to the file content in a separate commit.
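A minimal sketch of this two-commit approach, in a throwaway repository. The file names, contents, and identity settings are hypothetical.

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository (assumption)
cd "$repo"
git init -q
git config user.name "Example User"    # hypothetical identity
git config user.email "user@example.com"

printf 'line 1\nline 2\nline 3\n' > old.txt
git add old.txt
git commit -q -m "add old.txt"

# Commit 1: the rename and nothing else.
git mv old.txt new.txt
git commit -q -m "rename old.txt to new.txt"

# Commit 2: content changes only.
echo "line 4" >> new.txt
git commit -q -am "edit new.txt"

# Compare the trees before and after the rename-only commit.
rename_record=$(git diff -M --name-status HEAD~2 HEAD~1)
echo "$rename_record"
```

Because the rename-only commit leaves the content byte-for-byte identical, the diff reports it with a similarity score of 100, which is the easiest case for rename detection.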


While the core of Git (the plumbing) doesn't keep track of renames, the history you display with the git log "porcelain" can detect them if you ask it to.

For a given git log use the -M option:

git log -p -M

With a current version of Git.

This works for other commands like git diff as well.

There are options to make the comparisons more or less rigorous. If you rename a file without making significant changes to the file at the same time it makes it easier for Git log and friends to detect the rename. For this reason some people rename files in one commit and change them in another.
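As a sketch of how the similarity threshold changes the result, the example below renames a file and rewrites part of it in the same commit, then diffs with a strict and a lenient -M value. The file names, the contents, and the 90% and 30% thresholds are assumptions chosen for illustration.

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository (assumption)
cd "$repo"
git init -q
git config user.name "Example User"    # hypothetical identity
git config user.email "user@example.com"

for i in 1 2 3 4 5 6 7 8 9 10; do
  echo "this is line number $i of the original file"
done > before.txt
git add before.txt
git commit -q -m "add before.txt"

# Rename the file and rewrite its last four lines in the same commit.
{
  for i in 1 2 3 4 5 6; do
    echo "this is line number $i of the original file"
  done
  for i in 7 8 9 10; do
    echo "replacement text $i, unrelated to what was here"
  done
} > after.txt
git rm -q before.txt
git add after.txt
git commit -q -m "rename and edit in one commit"

# Strict threshold: the files must be at least 90% similar to count as a rename.
strict=$(git diff -M90% --name-status HEAD~1 HEAD)
# Lenient threshold: 30% similarity is enough.
lenient=$(git diff -M30% --name-status HEAD~1 HEAD)
printf 'strict:\n%s\nlenient:\n%s\n' "$strict" "$lenient"
```

With the strict threshold the change is reported as an unrelated delete and add; with the lenient one it is reported as a rename plus edits.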

There's a cost in CPU use whenever you ask Git to find where files have been renamed, so whether you use it or not, and when, is up to you.

If you would like to always have your history reported with rename detection in a particular repository you can use:

git config diff.renames 1

Files moving from one directory to another are detected. Here's an example:

commit c3ee8dfb01e357eba1ab18003be1490a46325992
Author: John S. Gruber <[email protected]>
Date:   Wed Feb 22 22:20:19 2017 -0500

    test rename again

diff --git a/yyy/power.py b/zzz/power.py
similarity index 100%
rename from yyy/power.py
rename to zzz/power.py

commit ae181377154eca800832087500c258a20c95d1c3
Author: John S. Gruber <[email protected]>
Date:   Wed Feb 22 22:19:17 2017 -0500

    rename test

diff --git a/power.py b/yyy/power.py
similarity index 100%
rename from power.py
rename to yyy/power.py

Please note that this works whenever you are using diff, not just with git log. For example:

$ git diff HEAD c3ee8df
diff --git a/power.py b/zzz/power.py
similarity index 100%
rename from power.py
rename to zzz/power.py

As a trial I made a small change in one file in a feature branch and committed it and then in the master branch I renamed the file, committed, and then made a small change in another part of the file and committed that. When I went to feature branch and merged from master the merge renamed the file and merged the changes. Here's the output from the merge:

 $ git merge -v master
 Auto-merging single
 Merge made by the 'recursive' strategy.
  one => single | 4 ++++
  1 file changed, 4 insertions(+)
  rename one => single (67%)

The result was a working directory with the file renamed and both text changes made. So it's possible for Git to do the right thing despite the fact that it doesn't explicitly track renames.
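The trial described above can be reproduced with a sketch along these lines. The file contents, the write_file helper, and the identity settings are hypothetical; the file names one and single and the branch names follow the merge output shown. The strategy name printed by newer Git versions may differ from 'recursive'.

```shell
set -eu
repo=$(mktemp -d)                        # throwaway repository (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master  # fix the branch name for this demo
git config user.name "Example User"      # hypothetical identity
git config user.email "user@example.com"

# Hypothetical helper: writes a ten-line file; $1 = path, $2 = first line, $3 = last line.
write_file() {
  {
    echo "$2"
    for i in 2 3 4 5 6 7 8 9; do echo "line $i"; done
    echo "$3"
  } > "$1"
}

write_file one "first line" "last line"
git add one
git commit -q -m "add one"

# Feature branch: small change near the top of the file.
git checkout -q -b feature
write_file one "first line (changed on feature)" "last line"
git commit -q -am "feature: edit first line"

# Master: rename the file, then change a different part of it.
git checkout -q master
git mv one single
git commit -q -m "rename one to single"
write_file single "first line" "last line (changed on master)"
git commit -q -am "master: edit last line"

# Merge master into feature.
git checkout -q feature
git merge -q --no-edit master
cat single
```

The merge carries the feature-branch edit over to the renamed file, so single ends up with both changes and one is gone.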

This is a late answer to an old question, so the other answers may have been correct for the Git version at the time.


I followed this multi-step process to move code to the parent directory and retained history.

Step 0: Created a branch 'history' from 'master' for safekeeping

Step 1: Used the git-filter-repo tool to rewrite history. The command below moved the folder 'FolderwithContentOfInterest' one level up and rewrote the relevant commit history

git filter-repo --path-rename ParentFolder/FolderwithContentOfInterest/:FolderwithContentOfInterest/ --force

Step 2: By this time the local repository had lost its remote, because git filter-repo removes the origin remote when it rewrites history. Added the remote reference again

git remote add origin [email protected]:MyCompany/MyRepo.git

Step 3: Pull information on repository

git pull

Step 4: Connect the local lost branch with the origin branch

git branch --set-upstream-to=origin/history history

Step 5: Address merge conflict for the folder structure if prompted

Step 6: Push!!

git push

Note: The modified history and moved folder appear to already be committed.

Done. Code moves to the parent / desired directory keeping history intact!


Yes

  1. You convert the commit history of files into email patches using git log --pretty=email
  2. You reorganize these files in new directories and rename them
  3. You convert back these files (emails) to Git commits to keep the history using git am.

Limitation

  • Tags and branches are not kept
  • History is cut at any file rename (or rename of a parent directory)

Step by step explanation with examples

1. Extract history in email format

Example: Extract history of file3, file4 and file5

my_repo
+-- dirA
¦   +-- file1
¦   +-- file2
+-- dirB            ^
¦   +-- subdir      | To be moved
¦   ¦   +-- file3   | with history
¦   ¦   +-- file4   | 
¦   +-- file5       v
+-- dirC
    +-- file6
    +-- file7

Set/clean the destination

export historydir=/tmp/mail/dir       # Absolute path
rm -rf "$historydir"    # Caution when cleaning the folder

Extract history of each file in email format

cd my_repo/dirB
find -name .git -prune -o -type d -o -exec bash -c 'mkdir -p "$historydir/${0%/*}" && git log --pretty=email -p --stat --reverse --full-index --binary -- "$0" > "$historydir/$0"' {} ';'

Unfortunately, the options --follow and --find-copies-harder cannot be combined with --reverse. This is why history is cut when a file is renamed (or when a parent directory is renamed).

Temporary history in email format:

/tmp/mail/dir
    +-- subdir
    ¦   +-- file3
    ¦   +-- file4
    +-- file5

Dan Bonachea suggests inverting the loops of the git log generation command in this first step: rather than running git log once per file, run it exactly once with a list of files on the command line and generate a single unified log. This way commits that modify multiple files remain a single commit in the result, and all the new commits maintain their original relative order. Note that this also requires changes in the second step below when rewriting filenames in the (now unified) log.
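A minimal sketch of that unified-log variant, under several assumptions: two throwaway repositories, a hypothetical three-commit history, and no renaming of paths (so the second step is skipped entirely). It only illustrates that a single git log call over several paths can be fed to git am.

```shell
set -eu
src=$(mktemp -d)                       # source repository (assumption)
dst=$(mktemp -d)                       # destination repository (assumption)
mbox=$(mktemp)                         # unified history in email format

cd "$src"
git init -q
git config user.name "Example User"    # hypothetical identity
git config user.email "user@example.com"
mkdir subdir
echo "3" > subdir/file3
echo "5" > file5
git add .
git commit -q -m "add file3 and file5 together"
echo "3b" >> subdir/file3
git commit -q -am "edit file3"
echo "other" > unrelated
git add unrelated
git commit -q -m "unrelated change"

# One git log call for all paths of interest, oldest commit first.
git log --pretty=email -p --stat --reverse --full-index --binary \
  -- subdir/file3 file5 > "$mbox"

cd "$dst"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git am -q --committer-date-is-author-date < "$mbox"

applied=$(git rev-list --count HEAD)
echo "applied commits: $applied"
```

The commit that touched both files arrives as one commit, and the commit that only touched the unrelated file is left out.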


2. Reorganize file tree and update filenames

Suppose you want to move these three files into this other repo (it can be the same repo).

my_other_repo
+-- dirF
¦   +-- file55
¦   +-- file56
+-- dirB              # New tree
¦   +-- dirB1         # from subdir
¦   ¦   +-- file33    # from file3
¦   ¦   +-- file44    # from file4
¦   +-- dirB2         # new dir
¦        +-- file5    # from file5
+-- dirH
    +-- file77

Therefore reorganize your files:

cd /tmp/mail/dir
mkdir -p dirB/dirB1
mv subdir/file3 dirB/dirB1/file33
mv subdir/file4 dirB/dirB1/file44
mkdir -p dirB/dirB2
mv file5 dirB/dirB2

Your temporary history is now:

/tmp/mail/dir
    +-- dirB
        +-- dirB1
        ¦   +-- file33
        ¦   +-- file44
        +-- dirB2
             +-- file5

Also change the filenames within the history:

cd "$historydir"
find * -type f -exec bash -c 'sed "/^diff --git a\|^--- a\|^+++ b/s:\( [ab]\)/[^ ]*:\1/$0:g" -i "$0"' {} ';'

3. Apply new history

Your other repo is:

my_other_repo
+-- dirF
¦   +-- file55
¦   +-- file56
+-- dirH
    +-- file77

Apply commits from temporary history files:

cd my_other_repo
find "$historydir" -type f -exec cat {} + | git am --committer-date-is-author-date

--committer-date-is-author-date preserves the original commit time-stamps (Dan Bonachea's comment).

Your other repo is now:

my_other_repo
+-- dirF
¦   +-- file55
¦   +-- file56
+-- dirB
¦   +-- dirB1
¦   ¦   +-- file33
¦   ¦   +-- file44
¦   +-- dirB2
¦        +-- file5
+-- dirH
    +-- file77

Use git status to see the number of commits ready to be pushed :-)


Extra trick: Check renamed/moved files within your repo

To list the files having been renamed:

find -name .git -prune -o -exec git log --pretty=tformat:'' --numstat --follow {} ';' | grep '=>'

More customizations: You can extend the git log command with the options --find-copies-harder or --reverse. You can also remove the first two columns using cut -f3- and grep for the complete pattern '{.* => .*}'.

find -name .git -prune -o -exec git log --pretty=tformat:'' --numstat --follow --find-copies-harder --reverse {} ';' | cut -f3- | grep '{.* => .*}'

No.

The short answer is NO. It is not possible to rename a file in Git and remember the history. And it is a pain.

Rumor has it that git log --follow --find-copies-harder will work, but it does not work for me, even if there are zero changes to the file contents, and the moves have been made with git mv.

(Initially I used Eclipse to rename and update packages in one operation, which may have confused Git. But that is a very common thing to do. --follow does seem to work if only a mv is performed and then a commit and the mv is not too far.)

Linus says that you are supposed to understand the entire contents of a software project holistically, not needing to track individual files. Well, sadly, my small brain cannot do that.

It is really annoying that so many people have mindlessly repeated the statement that Git automatically tracks moves. They have wasted my time. Git does no such thing. By design(!) Git does not track moves at all.

My solution is to rename the files back to their original locations. Change the software to fit the source control. With Git you just seem to need to "git" it right the first time.

Unfortunately, that breaks Eclipse, which seems to use --follow. git log --follow sometimes does not show the full history of files with complicated rename histories even though git log does. (I do not know why.)

(There are some too clever hacks that go back and recommit old work, but they are rather frightening. See GitHub-Gist: emiller/git-mv-with-history.)


To rename a directory or file (I don't know much about complex cases, so there might be some caveats):

git filter-repo --path-rename OLD_NAME:NEW_NAME

To rename a directory in files that mention it (it's possible to use callbacks, but I don't know how):

git filter-repo --replace-text expressions.txt

expressions.txt is a file filled with lines like literal:OLD_NAME==>NEW_NAME (it's possible to use Python's RE with regex: or glob with glob:).
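As an illustration of the file format described above, this sketch writes a small expressions.txt. OLD_NAME and NEW_NAME are the placeholders from this answer, the second rule (a regex matching a versioned variant of the name) is a made-up example, and the scratch directory is an assumption. The filter-repo call itself is left as a comment because it rewrites history.

```shell
set -eu
workdir=$(mktemp -d)                   # scratch directory (assumption)
cd "$workdir"

# One replacement rule per line: <prefix>:<pattern>==><replacement>
cat > expressions.txt <<'EOF'
literal:OLD_NAME==>NEW_NAME
regex:\bOLD_NAME_v[0-9]+\b==>NEW_NAME
EOF
cat expressions.txt

# Then, inside a fresh clone of the repository to rewrite:
#   git filter-repo --replace-text /path/to/expressions.txt
```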

To rename a directory in messages of commits:

git filter-repo --message-callback 'return message.replace(b"OLD_NAME", b"NEW_NAME")'

Python's regular expressions are also supported, but they must be written in Python, manually.

If the repository is not a fresh clone (for example, an original repository without a remote), you will have to add --force to force the rewrite. (You may want to create a backup of your repository before doing this.)

If you do not want to preserve refs (they will be displayed in the branch history of Git GUI), you will have to add --replace-refs delete-no-add.


git log --follow [file]

will show you the history through renames.
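A minimal sketch comparing the log of a renamed file with and without --follow, in a throwaway repository. The file names, the three-commit history, and the identity settings are hypothetical.

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository (assumption)
cd "$repo"
git init -q
git config user.name "Example User"    # hypothetical identity
git config user.email "user@example.com"

printf 'a\nb\nc\n' > old.txt
git add old.txt
git commit -q -m "add old.txt"
echo "d" >> old.txt
git commit -q -am "edit old.txt"
git mv old.txt new.txt
git commit -q -m "rename old.txt to new.txt"

# Count the commits listed for the new path, with and without --follow.
with_follow=$(git log --follow --oneline -- new.txt | grep -c .)
without_follow=$(git log --oneline -- new.txt | grep -c .)
echo "with --follow: $with_follow, without: $without_follow"
```

Without --follow the log stops at the rename commit; with it, the commits made under the old name are listed as well. Note that --follow works for a single file only.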


I have faced the issue "Renaming the folder without losing history". To fix it, run:

$ git mv oldfolder temp && git mv temp newfolder
$ git commit
$ git push