[git] How to move files from one git repo to another (not a clone), preserving history

This answer provides commands based on git am, presented step by step with examples.

Objective

  • You want to move some or all files from one repository to another.
  • You want to keep their history.
  • But you do not care about keeping tags and branches.
  • You accept limited history for renamed files (and files in renamed directories).

Procedure

  1. Extract history in email format using
    git log --pretty=email -p --reverse --full-index --binary
  2. Reorganize file tree and update filename change in history [optional]
  3. Apply new history using git am
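
To get a feel for the procedure before the detailed steps, here is a minimal sketch on two throwaway repos. The repo names, the file notes.txt and the demo identity are all hypothetical, and step 2 (renaming) is skipped.

```shell
# Toy run of steps 1 and 3 on throwaway repos (all names are hypothetical).
set -eu
work=$(mktemp -d)
# Identity used only for these demo commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Source repo with two commits touching one file.
git init -q "$work/src"
cd "$work/src"
echo one > notes.txt && git add notes.txt && git commit -q -m "Add notes"
echo two >> notes.txt && git commit -q -am "Extend notes"

# Step 1: extract the file's history in email format.
git log --pretty=email -p --reverse --full-index --binary -- notes.txt > "$work/notes.mbox"

# Step 3: replay that history in an unrelated repo.
git init -q "$work/dst"
cd "$work/dst"
git commit -q --allow-empty -m "Initial commit"
git am -q < "$work/notes.mbox"

# Count the commits that touch notes.txt in the destination repo.
applied=$(git rev-list --count HEAD -- notes.txt)
echo "commits touching notes.txt: $applied"
```

The commit hashes in the destination differ from the source ones, since git am creates new commits from the patches.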

1. Extract history in email format

Example: Extract history of file3, file4 and file5

my_repo
+-- dirA
¦   +-- file1
¦   +-- file2
+-- dirB            ^
¦   +-- subdir      | To be moved
¦   ¦   +-- file3   | with history
¦   ¦   +-- file4   | 
¦   +-- file5       v
+-- dirC
    +-- file6
    +-- file7

Clean the temporary destination directory

export historydir=/tmp/mail/dir  # Absolute path
rm -rf "$historydir"             # Caution when cleaning
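
If you would rather not run rm -rf on a hand-typed path, one possible alternative (assuming mktemp is available on your system) is to let it create a fresh, empty directory for you:

```shell
# Let mktemp create a new, empty directory and print its path,
# so there is nothing to clean beforehand.
historydir=$(mktemp -d)
export historydir   # exported so the bash -c child processes can read it
echo "history will be written to: $historydir"
```

The rest of the procedure only needs $historydir to be an exported path to an empty directory, so either approach should work.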

Clean the source repo

git commit ...           # Commit your working files
rm .gitignore            # Disable gitignore
git clean -n             # Simulate removal
git clean -f             # Remove untracked files
git checkout .gitignore  # Restore gitignore
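
If the only reason for removing .gitignore is to make git clean delete ignored files as well, the -x option of git clean may do the same without touching the file. A sketch on a throwaway repo (the file names and demo identity are hypothetical):

```shell
set -eu
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/repo"
cd "$work/repo"
echo '*.log' > .gitignore
git add .gitignore && git commit -q -m "Add gitignore"

touch build.log scratch.txt   # one ignored file, one plain untracked file

git clean -n -x               # Simulate removal, ignored files included
git clean -f -x               # Remove untracked and ignored files
ls -A                         # .gitignore is tracked, so it stays
```

Note that -x also bypasses ignore rules coming from places other than .gitignore, so check the -n output before running -f.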

Extract history of each file in email format

cd my_repo/dirB
find -name .git -prune -o -type d -o -exec bash -c 'mkdir -p "$historydir/${0%/*}" && git log --pretty=email -p --stat --reverse --full-index --binary -- "$0" > "$historydir/$0"' {} ';'

Unfortunately, the options --follow and --find-copies-harder cannot be combined with --reverse. This is why the history is cut where a file is renamed (or where a parent directory is renamed).
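
The find command above relies on two shell details: the first argument after the bash -c script string becomes $0, and ${0%/*} strips the last path component. A small sketch using the example paths:

```shell
# With bash -c, the argument following the script string is available as $0.
dir=$(bash -c 'echo "${0%/*}"' ./subdir/file3)   # directory part of the path
echo "$dir"

# A file sitting directly in the current directory yields "."
top=$(bash -c 'echo "${0%/*}"' ./file5)
echo "$top"
```

This is why mkdir -p "$historydir/${0%/*}" recreates the source directory layout under $historydir before each history file is written.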

After: Temporary history in email format

/tmp/mail/dir
    +-- subdir
    ¦   +-- file3
    ¦   +-- file4
    +-- file5

2. Reorganize file tree and update filename change in history [optional]

Suppose you want to move these three files into this other repo (it can be the same repo).

my_other_repo
+-- dirF
¦   +-- file55
¦   +-- file56
+-- dirB              # New tree
¦   +-- dirB1         # was subdir
¦   ¦   +-- file33    # was file3
¦   ¦   +-- file44    # was file4
¦   +-- dirB2         # new dir
¦        +-- file5    # = file5
+-- dirH
    +-- file77

Therefore reorganize your files:

cd /tmp/mail/dir
mkdir     dirB
mv subdir dirB/dirB1
mv dirB/dirB1/file3 dirB/dirB1/file33
mv dirB/dirB1/file4 dirB/dirB1/file44
mkdir    dirB/dirB2
mv file5 dirB/dirB2

Your temporary history is now:

/tmp/mail/dir
    +-- dirB
        +-- dirB1
        ¦   +-- file33
        ¦   +-- file44
        +-- dirB2
             +-- file5

Also change the filenames within the history:

cd "$historydir"
find * -type f -exec bash -c 'sed "/^diff --git a\|^--- a\|^+++ b/s:\( [ab]\)/[^ ]*:\1/$0:g" -i "$0"' {} ';'

Note: This rewrites the paths stored in the history so that they match
      the new location/name of each file within the new repo.
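
To see what the sed expression above does, here is a sketch that runs it on three hand-written header lines instead of a real history file. The sample lines and the variable new (standing in for $0) are assumptions for illustration, and GNU sed is assumed for the \| alternation.

```shell
# Hypothetical patch header lines, as they might appear for file3 in the source repo.
sample='diff --git a/dirB/subdir/file3 b/dirB/subdir/file3
--- a/dirB/subdir/file3
+++ b/dirB/subdir/file3'

new=dirB/dirB1/file33   # path of the history file relative to $historydir

# Same expression as above, with $new in place of $0 and without -i.
rewritten=$(printf '%s\n' "$sample" |
  sed "/^diff --git a\|^--- a\|^+++ b/s:\( [ab]\)/[^ ]*:\1/$new:g")
printf '%s\n' "$rewritten"
```

Only lines starting with "diff --git a", "--- a" or "+++ b" are touched, so a "--- /dev/null" line (file creation) is left alone.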


3. Apply new history

Your other repo is:

my_other_repo
+-- dirF
¦   +-- file55
¦   +-- file56
+-- dirH
    +-- file77

Apply commits from temporary history files:

cd my_other_repo
find "$historydir" -type f -exec cat {} + | git am 
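
By default git am stamps each new commit with the current time as committer date. If you would like the committer date to match the original author date, git am has a --committer-date-is-author-date option. A sketch on throwaway repos (the names, the date and the demo identity are hypothetical):

```shell
set -eu
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/src"
cd "$work/src"
echo one > notes.txt && git add notes.txt
# Give the source commit a fixed, old author date.
GIT_AUTHOR_DATE='2015-01-02 03:04:05 +0000' git commit -q -m "Add notes"
git log --pretty=email -p --reverse --full-index --binary -- notes.txt > "$work/notes.mbox"

git init -q "$work/dst"
cd "$work/dst"
git commit -q --allow-empty -m "Initial commit"
# Reuse the author date from each patch as the committer date.
git am -q --committer-date-is-author-date < "$work/notes.mbox"

author_ts=$(git log -1 --format=%at)
committer_ts=$(git log -1 --format=%ct)
echo "author=$author_ts committer=$committer_ts"
```

Whether this is desirable depends on how you want the imported commits to sort against the existing ones in the destination repo.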

Your other repo is now:

my_other_repo
+-- dirF
¦   +-- file55
¦   +-- file56
+-- dirB            ^
¦   +-- dirB1       | New files
¦   ¦   +-- file33  | with
¦   ¦   +-- file44  | history
¦   +-- dirB2       | kept
¦        +-- file5  v
+-- dirH
    +-- file77

Use git status to see the number of commits ready to be pushed :-)

Note: Because the history has been rewritten to reflect the new path and filename
      (i.e. compared to the location/name within the previous repo):

  • No need to git mv to change the location/filename.
  • No need to git log --follow to access full history.

Extra trick: Detect renamed/moved files within your repo

To list the files that have been renamed:

find -name .git -prune -o -exec git log --pretty=tformat:'' --numstat --follow {} ';' | grep '=>'

More customizations: You can extend the git log command with the options --find-copies-harder or --reverse. You can also drop the first two columns with cut -f3- and grep for the complete pattern '{.* => .*}'.

find -name .git -prune -o -exec git log --pretty=tformat:'' --numstat --follow --find-copies-harder --reverse {} ';' | cut -f3- | grep '{.* => .*}'
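
To check what the rename detection prints, here is a sketch on a throwaway repo with a single renamed file. The file names and the demo identity are hypothetical.

```shell
set -eu
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/repo"
cd "$work/repo"
printf 'line 1\nline 2\nline 3\n' > old_name.txt
git add old_name.txt && git commit -q -m "Add file"
git mv old_name.txt new_name.txt && git commit -q -m "Rename file"

# --follow takes a single path; a rename appears as "old => new" in the --numstat output.
renames=$(git log --pretty=tformat: --numstat --follow -- new_name.txt | grep '=>')
printf '%s\n' "$renames"
```

The first two columns of each line are the added and deleted line counts, which is what cut -f3- removes in the command above.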