[windows] NTFS performance and large volumes of files and directories

How does Windows with NTFS perform with large volumes of files and directories?

Is there any guidance around limits of files or directories you can place in a single directory before you run into performance problems or other issues?

E.g. is having a folder with 100,000 folders inside of it an OK thing to do?

This question is related to windows performance filesystems ntfs

The answer is


For local access, large numbers of directories and files don't seem to be an issue. However, if you're accessing them across a network, there's a noticeable performance hit after a few hundred entries, especially when accessed from Vista machines (XP to Windows Server with NTFS seemed to run much faster in that regard).


I have real-world experience with about 100,000 files (each several MB) in a single NTFS directory, from copying an online library.

It takes about 15 minutes to open the directory with Explorer or 7-Zip.

Writing a site copy with WinHTTrack always gets stuck after some time. It also had to deal with a directory containing about 1,000,000 files. I think the worst thing is that the MFT can only be traversed sequentially.

Opening the same directory through ext2fsd on ext3 gave almost the same timing. Moving to reiserfs (not reiser4fs) might help.

Avoiding this situation altogether is probably best.

For your own programs, using blobs without any file system could be beneficial. That's how Facebook stores photos.
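
The blob idea can be sketched as follows: many small payloads are appended to one large file, and an in-memory index maps each key to an offset and a length. This is only a minimal illustration. The class name `BlobStore` and its methods are hypothetical, it does not claim to reflect how Facebook's photo storage is actually built, and it ignores durability, deletion, and concurrency.

```python
import os
import tempfile


class BlobStore:
    """Hypothetical append-only store: one data file plus an in-memory index."""

    def __init__(self, path):
        self.path = path
        self.index = {}  # key -> (offset, length)
        # Create the data file if it does not exist yet.
        open(self.path, "ab").close()

    def put(self, key, data):
        # Append the payload and remember where it landed.
        with open(self.path, "ab") as f:
            f.seek(0, os.SEEK_END)
            offset = f.tell()
            f.write(data)
        self.index[key] = (offset, len(data))

    def get(self, key):
        # Seek straight to the payload instead of asking the file system
        # to look up one directory entry per object.
        offset, length = self.index[key]
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(length)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = BlobStore(os.path.join(tmp, "blobs.bin"))
        store.put("a.jpg", b"first payload")
        store.put("b.jpg", b"second payload")
        print(store.get("b.jpg"))
```

The directory then contains a single large file, so the cost of enumerating or opening it no longer depends on the number of stored objects. A real store would also have to persist the index.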


100,000 should be fine.

I have (anecdotally) seen people having problems with many millions of files, and I have had problems myself with Explorer not having a clue how to count past 60-something thousand files, but NTFS should be fine for the volumes you're talking about.

In case you're wondering, the technical (and I hope theoretical) maximum number of files is 4,294,967,295 (2^32 - 1).


When you create a folder with N entries, you create a list of N items at file-system level. This list is a system-wide shared data structure. If you then start modifying this list continuously by adding/removing entries, I expect at least some lock contention over shared data. This contention - theoretically - can negatively affect performance.

For read-only scenarios, I can't imagine any reason for performance to degrade in directories with a large number of entries.


I am building a file structure to host up to 2 billion (roughly 2^31) files and ran the following tests, which show a sharp drop in navigate + read performance at about 250 files or 120 directories per NTFS directory on a solid-state drive (SSD):

  • File performance drops by about 40% between 250 and 1000 files.
  • Directory performance drops by about 60% between 120 and 1000 directories.
  • Values for counts above 1000 remain relatively stable.

Interestingly, the number of directories and the number of files do not significantly interfere with each other.

So the lessons are:

  • More than 250 files per directory costs a factor of 2.
  • More than 120 subdirectories per directory costs a factor of 2.5.
  • File Explorer in Windows 7 can handle large numbers of files or directories, but usability is still bad.
  • Introducing subdirectories is not expensive.
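
One way to act on these lessons is to spread files over nested subdirectories so that no directory holds more than a couple of hundred entries. The sketch below derives the subdirectory names from a hash of the file name. The helper names `sharded_path` and `capacity`, the fan-out of 100 per level, the 250-files-per-leaf figure, and the use of SHA-1 are all illustrative assumptions, not part of the measurements.

```python
import hashlib
from pathlib import PureWindowsPath

FAN_OUT = 100  # assumed cap on subdirectories per level, kept below the ~120 mark


def sharded_path(root, name, levels=2):
    """Place a file under `levels` two-digit subdirectories chosen by hash."""
    n = int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")
    parts = []
    for _ in range(levels):
        n, bucket = divmod(n, FAN_OUT)
        parts.append(f"{bucket:02d}")
    return PureWindowsPath(root, *parts, name)


def capacity(levels, files_per_dir=250):
    """Files that fit if every leaf directory holds `files_per_dir` entries."""
    return FAN_OUT ** levels * files_per_dir


if __name__ == "__main__":
    print(sharded_path("C:\\data", "photo.jpg"))
    print(capacity(2))
```

Under these assumptions two levels hold 2,500,000 files, and four levels would be needed to reach 2 billion. The same name always maps to the same path, so no lookup table is required.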

This is the data (two measurement runs each for files and for directories; the first column is the number of entries in the tested directory):

(FOPS = file operations per second)
(DOPS = directory operations per second)

#Items  lg(#)   FOPS    FOPS2   DOPS    DOPS2
   10   1.00    16692   16692   16421   16312
  100   2.00    16425   15943   15738   16031
  120   2.08    15716   16024   15878   16122
  130   2.11    15883   16124   14328   14347
  160   2.20    15978   16184   11325   11128
  200   2.30    16364   16052   9866    9678
  210   2.32    16143   15977   9348    9547
  220   2.34    16290   15909   9094    9038
  230   2.36    16048   15930   9010    9094
  240   2.38    15096   15725   8654    9143
  250   2.40    15453   15548   8872    8472
  260   2.41    14454   15053   8577    8720
  300   2.48    12565   13245   8368    8361
  400   2.60    11159   11462   7671    7574
  500   2.70    10536   10560   7149    7331
 1000   3.00    9092    9509    6569    6693
 2000   3.30    8797    8810    6375    6292
10000   4.00    8084    8228    6210    6194
20000   4.30    8049    8343    5536    6100
50000   4.70    7468    7607    5364    5365
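
The relative drops can be recomputed from the table. The short sketch below averages the two runs for three rows copied from the data above and prints the percentage change. The helper names `mean_ops` and `drop_percent` are just for illustration.

```python
# (FOPS, FOPS2, DOPS, DOPS2) for selected entry counts, copied from the table
ROWS = {
    120: (15716, 16024, 15878, 16122),
    250: (15453, 15548, 8872, 8472),
    1000: (9092, 9509, 6569, 6693),
}


def mean_ops(entries, kind):
    """Average the two runs for kind 'file' or 'dir' at the given entry count."""
    fops, fops2, dops, dops2 = ROWS[entries]
    a, b = (fops, fops2) if kind == "file" else (dops, dops2)
    return (a + b) / 2


def drop_percent(start, end, kind):
    """Percentage drop in operations per second between two entry counts."""
    before, after = mean_ops(start, kind), mean_ops(end, kind)
    return 100 * (before - after) / before


if __name__ == "__main__":
    print(round(drop_percent(250, 1000, "file")))
    print(round(drop_percent(120, 1000, "dir")))
```

With these rows the file rate falls by roughly 40% between 250 and 1000 entries, and the directory rate by roughly 59% between 120 and 1000 entries.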

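And this is the test code:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using NUnit.Framework;

[TestFixture]
public class DirPerformanceTests { // wrapper class so the test compiles; the name is arbitrary
    // "Result =" is NUnit 2.x syntax; NUnit 3 calls it "ExpectedResult".
    [TestCase(50000, false, Result = 50000)]
    [TestCase(50000, true, Result = 50000)]
    public static int TestDirPerformance(int numFilesInDir, bool testDirs) {
        var files = new List<string>();
        // Work in a fresh, uniquely named directory below the temp folder.
        var dir = Path.Combine(Path.GetTempPath(), "Sub", Guid.NewGuid().ToString());
        Directory.CreateDirectory(dir);
        Console.WriteLine("prepare...");
        const string FILE_NAME = "file.txt";
        for (int i = 0; i < numFilesInDir; i++) {
            string filename = Path.Combine(dir, Guid.NewGuid().ToString());
            if (testDirs) {
                // Directory test: one subdirectory per entry, each holding a single empty file.
                var dirName = filename + "D";
                Directory.CreateDirectory(dirName);
                using (File.Create(Path.Combine(dirName, FILE_NAME))) { }
            } else {
                // File test: one empty file per entry.
                using (File.Create(filename)) { }
            }
            files.Add(filename);
        }
        //Adding 1000 Directories didn't change File Performance
        /*for (int i = 0; i < 1000; i++) {
            string filename = Path.Combine(dir, Guid.NewGuid().ToString());
            Directory.CreateDirectory(filename + "D");
        }*/
        Console.WriteLine("measure...");
        var r = new Random();
        var sw = new Stopwatch();
        sw.Start();
        int len = 0;
        int count = 0;
        // Read randomly chosen entries for 5 seconds and count the reads.
        while (sw.ElapsedMilliseconds < 5000) {
            string filename = files[r.Next(files.Count)];
            string path = testDirs ? Path.Combine(filename + "D", FILE_NAME) : filename;
            string text = File.ReadAllText(path);
            len += text.Length;
            count++;
        }
        Console.WriteLine("{0} File Ops/sec ", count / 5);
        return numFilesInDir;
    }
}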


Short file name (8.3) creation can also slow things down. Microsoft recommends turning off short file name creation if you have more than 300,000 files in a folder [1]. The less unique the first six characters of the file names are, the bigger the problem.

[1] How NTFS Works from http://technet.microsoft.com, search for "300,000"
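
To see why the leading characters matter: 8.3 short names are derived from the start of the long name, so files that share their first six characters compete for the same short-name candidates. The sketch below only counts such clashes. It is a simplified model, not the actual NTFS generation algorithm (it ignores character stripping and extensions, for example), and `prefix_clashes` is a hypothetical helper.

```python
from collections import Counter


def prefix_clashes(names, prefix_len=6):
    """Count names per case-insensitive leading prefix (simplified model)."""
    counts = Counter(name[:prefix_len].upper() for name in names)
    # Keep only prefixes shared by more than one name.
    return {prefix: n for prefix, n in counts.items() if n > 1}


if __name__ == "__main__":
    names = ["report_2019.txt", "report_2020.txt", "report_2021.txt", "summary.txt"]
    print(prefix_clashes(names))  # {'REPORT': 3}
```

Names that put the unique part first (for example a GUID or a counter) keep these groups small. Short name creation itself is usually switched off with `fsutil behavior set disable8dot3 1`; check the fsutil documentation for your Windows version before changing it.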

