[c] Socket accept - "Too many open files"

I am working on a school project where I had to write a multi-threaded server, and now I am comparing it to apache by running some tests against it. I am using autobench to help with that, but after I run a few tests, or if I give it too high of a rate (around 600+) to make the connections, I get a "Too many open files" error.

After I am done with dealing with request, I always do a close() on the socket. I have tried to use the shutdown() function as well, but nothing seems to help. Any way around this?

This question is related to c sockets

The answer is


There are multiple places where Linux can have limits on the number of file descriptors you are allowed to open.

You can check the following:

cat /proc/sys/fs/file-max

That will give you the system wide limits of file descriptors.

On the shell level, this will tell you your personal limit:

ulimit -n

This can be changed in /etc/security/limits.conf - it's the nofile param.

However, if you're closing your sockets correctly, you shouldn't receive this unless you're opening a lot of simulataneous connections. It sounds like something is preventing your sockets from being closed appropriately. I would verify that they are being handled properly.


I had the same problem and I wasn't bothering to check the return values of the close() calls. When I started checking the return value, the problem mysteriously vanished.

I can only assume an optimisation glitch of the compiler (gcc in my case), is assuming that close() calls are without side effects and can be omitted if their return values aren't used.


On MacOS, show the limits:

launchctl limit maxfiles

Result like: maxfiles 256 1000

If the numbers (soft limit & hard limit) are too low, you have to set upper:

sudo launchctl limit maxfiles 65536 200000

I had this problem too. You have a file handle leak. You can debug this by printing out a list of all the open file handles (on POSIX systems):

void showFDInfo()
{
   s32 numHandles = getdtablesize();

   for ( s32 i = 0; i < numHandles; i++ )
   {
      s32 fd_flags = fcntl( i, F_GETFD ); 
      if ( fd_flags == -1 ) continue;


      showFDInfo( i );
   }
}

void showFDInfo( s32 fd )
{
   char buf[256];

   s32 fd_flags = fcntl( fd, F_GETFD ); 
   if ( fd_flags == -1 ) return;

   s32 fl_flags = fcntl( fd, F_GETFL ); 
   if ( fl_flags == -1 ) return;

   char path[256];
   sprintf( path, "/proc/self/fd/%d", fd );

   memset( &buf[0], 0, 256 );
   ssize_t s = readlink( path, &buf[0], 256 );
   if ( s == -1 )
   {
        cerr << " (" << path << "): " << "not available";
        return;
   }
   cerr << fd << " (" << buf << "): ";

   if ( fd_flags & FD_CLOEXEC )  cerr << "cloexec ";

   // file status
   if ( fl_flags & O_APPEND   )  cerr << "append ";
   if ( fl_flags & O_NONBLOCK )  cerr << "nonblock ";

   // acc mode
   if ( fl_flags & O_RDONLY   )  cerr << "read-only ";
   if ( fl_flags & O_RDWR     )  cerr << "read-write ";
   if ( fl_flags & O_WRONLY   )  cerr << "write-only ";

   if ( fl_flags & O_DSYNC    )  cerr << "dsync ";
   if ( fl_flags & O_RSYNC    )  cerr << "rsync ";
   if ( fl_flags & O_SYNC     )  cerr << "sync ";

   struct flock fl;
   fl.l_type = F_WRLCK;
   fl.l_whence = 0;
   fl.l_start = 0;
   fl.l_len = 0;
   fcntl( fd, F_GETLK, &fl );
   if ( fl.l_type != F_UNLCK )
   {
      if ( fl.l_type == F_WRLCK )
         cerr << "write-locked";
      else
         cerr << "read-locked";
      cerr << "(pid:" << fl.l_pid << ") ";
   }
}

By dumping out all the open files you will quickly figure out where your file handle leak is.

If your server spawns subprocesses. E.g. if this is a 'fork' style server, or if you are spawning other processes ( e.g. via cgi ), you have to make sure to create your file handles with "cloexec" - both for real files and also sockets.

Without cloexec, every time you fork or spawn, all open file handles are cloned in the child process.

It is also really easy to fail to close network sockets - e.g. just abandoning them when the remote party disconnects. This will leak handles like crazy.


When your program has more open descriptors than the open files ulimit (ulimit -a will list this), the kernel will refuse to open any more file descriptors. Make sure you don't have any file descriptor leaks - for example, by running it for a while, then stopping and seeing if any extra fds are still open when it's idle - and if it's still a problem, change the nofile ulimit for your user in /etc/security/limits.conf


This means that the maximum number of simultaneously open files.

Solved:

At the end of the file /etc/security/limits.conf you need to add the following lines:

* soft nofile 16384
* hard nofile 16384

In the current console from root (sudo does not work) to do:

ulimit -n 16384

Although this is optional, if it is possible to restart the server.

In /etc/nginx/nginx.conf file to register the new value worker_connections equal to 16384 divide by value worker_processes.

If not did ulimit -n 16384, need to reboot, then the problem will recede.

PS:

If after the repair is visible in the logs error accept() failed (24: Too many open files):

In the nginx configuration, propevia (for example):

worker_processes 2;

worker_rlimit_nofile 16384;

events {
  worker_connections 8192;
}

Use lsof -u `whoami` | wc -l to find how many open files the user has


For future reference, I ran into a similar problem; I was creating too many file descriptors (FDs) by creating too many files and sockets (on Unix OSs, everything is a FD). My solution was to increase FDs at runtime with setrlimit().

First I got the FD limits, with the following code:

// This goes somewhere in your code
struct rlimit rlim;

if (getrlimit(RLIMIT_NOFILE, &rlim) == 0) {
    std::cout << "Soft limit: " << rlim.rlim_cur << std::endl;
    std::cout << "Hard limit: " << rlim.rlim_max << std::endl;
} else {
    std::cout << "Unable to get file descriptor limits" << std::endl;
}

After running getrlimit(), I could confirm that on my system, the soft limit is 256 FDs, and the hard limit is infinite FDs (this is different depending on your distro and specs). Since I was creating > 300 FDs between files and sockets, my code was crashing.

In my case I couldn't decrease the number of FDs, so I decided to increase the FD soft limit instead, with this code:

// This goes somewhere in your code
struct rlimit rlim;

rlim.rlim_cur = NEW_SOFT_LIMIT;
rlim.rlim_max = NEW_HARD_LIMIT;

if (setrlimit(RLIMIT_NOFILE, &rlim) == -1) {
    std::cout << "Unable to set file descriptor limits" << std::endl;
}

Note that you can also get the number of FDs that you are using, and the source of these FDs, with this code.

Also you can find more information on gettrlimit() and setrlimit() here and here.


TCP has a feature called "TIME_WAIT" that ensures connections are closed cleanly. It requires one end of the connection to stay listening for a while after the socket has been closed.

In a high-performance server, it's important that it's the clients who go into TIME_WAIT, not the server. Clients can afford to have a port open, whereas a busy server can rapidly run out of ports or have too many open FDs.

To achieve this, the server should never close the connection first -- it should always wait for the client to close it.


I had similar problem. Quick solution is :

ulimit -n 4096

explanation is as follows - each server connection is a file descriptor. In CentOS, Redhat and Fedora, probably others, file user limit is 1024 - no idea why. It can be easily seen when you type: ulimit -n

Note this has no much relation to system max files (/proc/sys/fs/file-max).

In my case it was problem with Redis, so I did:

ulimit -n 4096
redis-server -c xxxx

in your case instead of redis, you need to start your server.


Just another information about CentOS. In this case, when using "systemctl" to launch process. You have to modify the system file ==> /usr/lib/systemd/system/processName.service .Had this line in the file :

LimitNOFILE=50000

And just reload your system conf :

systemctl daemon-reload

it can take a bit of time before a closed socket is really freed up

lsof to list open files

cat /proc/sys/fs/file-max to see if there's a system limit


Similar issue on Ubuntu 18 on vsphere. The cause - Config file nginx.conf contains too many log files and sockets. Sockets are treated as files in Linux. When nginx -s reload or sudo service nginx start/restart, the Too many open files error appeared in error.log.

NGINX worker processes were launched by NGINX user. Ulimit (soft and hard) for nginx user was 65536. The ulimit and setting limits.conf did not work.

The rlimit setting in nginx.conf did not help either: worker_rlimit_nofile 65536;

The solution that worked was:

$ mkdir -p /etc/systemd/system/nginx.service.d
$ nano /etc/systemd/system/nginx.service.d/nginx.conf
    [Service]
    LimitNOFILE=30000
$ systemctl daemon-reload
$ systemctl restart nginx.service