What are the file limits in Git (number and size)?

Question

Does anyone know what the Git limits are for number of files and size of files?

User · Answer

I found this while trying to store a massive number of files (350k+) in a repo. Yes, store. Laughs.

$ time git add . 
git add . 333.67s user 244.26s system 14% cpu 1:06:48.63 total

The following extracts from the Bitbucket documentation are quite interesting.

When you work with a DVCS repository cloning, pushing, you are working with the entire repository and all of its history. In practice, once your repository gets larger than 500MB, you might start seeing issues.

... 94% of Bitbucket customers have repositories that are under 500MB. Both the Linux Kernel and Android are under 900MB.

The recommended solution on that page is to split your project into smaller chunks.

User · Answer

This message from Linus himself can help you with some other limits:

"[...] CVS, ie it really ends up being pretty much oriented to a "one file at a time" model.

Which is nice in that you can have a million files, and then only check out a few of them - you'll never even see the impact of the other 999,995 files.

Git fundamentally never really looks at less than the whole repo. Even if you limit things a bit (ie check out just a portion, or have the history go back just a bit), git ends up still always caring about the whole thing, and carrying the knowledge around.

So git scales really badly if you force it to look at everything as one huge repository. I don't think that part is really fixable, although we can probably improve on it.

And yes, then there's the "big file" issues. I really don't know what to do about huge files. We suck at them, I know."

See more in my other answer: the limit with Git is that each repository must represent a "coherent set of files", the "all system" in itself (you can not tag "part of a repository"). If your system is made of autonomous (but inter-dependent) parts, you must use submodules.

As illustrated by Talljoe's answer, the limit can be a system one (large number of files), but if you do understand the nature of Git (about data coherency represented by its SHA-1 keys), you will realize the true "limit" is a usage one: i.e., you should not try to store everything in a Git repository, unless you are prepared to always get or tag everything back. For some large projects, it would make no sense.

For a more in-depth look at git limits, see "git with large files" (which mentions git-lfs: a solution to store large files outside the git repo. GitHub, April 2015).

The three issues that limit a git repo:

- huge files (the xdelta for packfile is in memory only, which isn't good with large files)
- huge number of files, which means one file per blob, and slow git gc to generate one packfile at a time
- huge packfiles, with a packfile index inefficient to retrieve data from the (huge) packfile

A more recent thread (Feb. 2015) illustrates the limiting factors for a Git repo:

"Will a few simultaneous clones from the central server also slow down other concurrent operations for other users?"

There are no locks in server when cloning, so in theory cloning does not affect other operations. Cloning can use lots of memory though (and a lot of cpu unless you turn on reachability bitmap feature, which you should).

"Will 'git pull' be slow?"

If we exclude the server side, the size of your tree is the main factor, but your 25k files should be fine (linux has 48k files).

"'git push'?"

This one is not affected by how deep your repo's history is, or how wide your tree is, so should be quick.

Ah, the number of refs may affect both git-push and git-pull. I think Stefan knows better than I in this area.

"'git commit'? (It is listed as slow in reference 3.) 'git status'? (Slow again in reference 3 though I don't see it.) (also git-add)"

Again, the size of your tree. At your repo's size, I don't think you need to worry about it.

"Some operations might not seem to be day-to-day but if they are called frequently by the web front-end to GitLab/Stash/GitHub etc then they can become bottlenecks. (e.g. 'git branch --contains' seems terribly adversely affected by large numbers of branches.)"

git-blame could be slow when a file is modified a lot.

User · Answer

Back in Feb 2012, there was a very interesting thread on the Git mailing list from Joshua Redstone, a Facebook software engineer testing Git on a huge test repository:

"The test repo has 4 million commits, linear history and about 1.3 million files."

Tests that were run show that for such a repo Git is unusable (cold operation lasting minutes), but this may change in the future. Basically the performance is penalized by the number of stat() calls to the kernel FS module, so it will depend on the number of files in the repo, and the FS caching efficiency. See also this Gist for further discussion.
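To get a rough feel for the stat() cost mentioned above, here is a minimal Python sketch that walks a working tree and issues one os.lstat() call per file, similar in spirit to the per-file checks Git has to make. The function name time_stat_calls and the path "." are assumptions made for this illustration, not anything Git provides, and the timing will vary a lot depending on whether the filesystem cache is warm.

```python
import os
import time


def time_stat_calls(root):
    """Call os.lstat() on every file under root; return (file_count, seconds)."""
    count = 0
    start = time.perf_counter()
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Git's own metadata directory so only the working tree is measured.
        if ".git" in dirnames:
            dirnames.remove(".git")
        for name in filenames:
            os.lstat(os.path.join(dirpath, name))
            count += 1
    return count, time.perf_counter() - start


if __name__ == "__main__":
    files, seconds = time_stat_calls(".")
    print(f"{files} files stat'ed in {seconds:.3f}s")
```

Running it twice in a row on a large tree usually shows the difference between a cold and a warm cache.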

User · Answer

I think that it's good to try to avoid large file commits as being part of the repository (e.g. a database dump might be better off elsewhere), but if one considers the size of the kernel in its repository, you can probably expect to work comfortably with anything smaller in size and less complex than that.

User · Answer

Git has a 4G (32-bit) limit for a repo: http://code.google.com/p/support/wiki/GitFAQ

User · Answer

I have a generous amount of data that's stored in my repo as individual JSON fragments. There are about 75,000 files sitting under a few directories and it's not really detrimental to performance. Checking them in the first time was, obviously, a little slow.

User · Answer

It depends on what your meaning is. There are practical size limits (if you have a lot of big files, it can get boringly slow). If you have a lot of files, scans can also get slow. There aren't really inherent limits to the model, though. You can certainly use it poorly and be miserable.

User · Answer

As of 2018-04-20, Git for Windows has a bug which effectively limits the file size to 4GB max using that particular implementation (this bug propagates to lfs as well).
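A 4GB ceiling is what you would expect from a size stored in an unsigned 32-bit integer (2^32 bytes = 4 GiB). As a hedged sketch, assuming the limit sits exactly at that boundary (the answer does not give the exact byte count), a short Python script can flag files that would cross it before you try to add them. The names LIMIT_32BIT and files_over_limit are made up for this illustration.

```python
import os

# Assumption: the limit is the largest value an unsigned 32-bit integer can hold.
LIMIT_32BIT = 2**32 - 1  # 4294967295 bytes, just under 4 GiB


def files_over_limit(root, limit=LIMIT_32BIT):
    """Return sorted paths under root whose size in bytes exceeds limit."""
    too_big = []
    for dirpath, dirnames, filenames in os.walk(root):
        if ".git" in dirnames:
            dirnames.remove(".git")  # ignore Git's own object store
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.lstat(path).st_size > limit:
                too_big.append(path)
    return sorted(too_big)


if __name__ == "__main__":
    for path in files_over_limit("."):
        print(path)
```

Files reported by such a scan are candidates for keeping outside the repository.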

User · Answer

There is no real limit -- everything is named with a 160-bit name. The size of the file must be representable in a 64-bit number so no real limit there either.

There is a practical limit, though. I have a repository that's ~8GB with >880,000 files and git gc takes a while. The working tree is rather large so operations that inspect the entire working directory take quite a while. This repo is only used for data storage, though, so it's just a bunch of automated tools that handle it. Pulling changes from the repo is much, much faster than rsyncing the same data.

$ find . -type f | wc -l
791887
$ time git add .
git add .  6.48s user 13.53s system 55% cpu 36.121 total
$ time git status
# On branch master
nothing to commit (working directory clean)
git status  0.00s user 0.01s system 0% cpu 47.169 total
$ du -sh .
29G     .
$ cd .git
$ du -sh .
7.9G    .
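The "160-bit name" in this answer is a SHA-1 hash. As an illustration, the following Python sketch reproduces how Git names a blob: it hashes a header of the form "blob <size>\0" followed by the file content. The helper name git_blob_id is an assumption for this example; in a repository using SHA-1 object names, the result should agree with what git hash-object prints for the same bytes.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the SHA-1 object name Git would assign to a blob with this content."""
    # Git hashes "blob <decimal size>\0" followed by the raw content.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


if __name__ == "__main__":
    name = git_blob_id(b"")
    print(name)           # object name of the empty blob
    print(len(name) * 4)  # 40 hex digits * 4 bits each = 160 bits
```

Because the name depends only on the content, the number of distinct objects is bounded by the hash space and not by any counter inside Git.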

User · Answer

If you add files that are too large (GBs in my case, Cygwin, XP, 3 GB RAM), expect this:

fatal: Out of memory, malloc failed

More details here.

Update 3/2/11: Saw similar in Windows 7 x64 with Tortoise Git. Tons of memory used, very very slow system response.
