
Is it a good choice to increase the NCONNECTION value?

Hello,


The soft limit on the number of file descriptors is 1024 on GNU/Linux, but the default hard limit is 1048576 or 524288 on major modern distributions: Ubuntu, Fedora, Debian.

I do not have access to a Macintosh, but it looks like the soft limit is 256 and hard limit is "unlimited", though actually, the real hard limit has been reported as 10240 (https://developer.r-project.org/Blog/public/2018/03/23/maximum-number-of-dlls/index.html).
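For readers who want to check these figures on their own system, here is a minimal sketch using the POSIX getrlimit() call with RLIMIT_NOFILE. The helper names get_fd_limits and print_fd_limits are illustrative only; they are not part of R.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Fetch the soft and hard limits on open file descriptors.
 * Returns 0 on success, -1 on failure (errno is set by getrlimit). */
int get_fd_limits(rlim_t *soft, rlim_t *hard)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    *soft = rl.rlim_cur;
    *hard = rl.rlim_max;
    return 0;
}

/* Print both limits; a hard limit of RLIM_INFINITY is reported as "unlimited". */
void print_fd_limits(void)
{
    rlim_t soft, hard;
    if (get_fd_limits(&soft, &hard) != 0) {
        perror("getrlimit");
        return;
    }
    if (hard == RLIM_INFINITY)
        printf("soft=%llu hard=unlimited\n", (unsigned long long) soft);
    else
        printf("soft=%llu hard=%llu\n",
               (unsigned long long) soft, (unsigned long long) hard);
}
```

Calling print_fd_limits() from a small main() should show the same values as `ulimit -Sn` and `ulimit -Hn` in the shell that launched the program.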


Therefore, R should easily be able to change the limit without superuser privileges, with a call to setrlimit().

This should make file descriptor exhaustion very unlikely, except for buggy programs leaking file descriptors.


The simplest approach would be to set the soft limit to the value of the hard limit. To be nicer, R could instead set it to 10000 (or to the hard limit if that is lower), which should be enough for intensive use but would not consume too many system resources in case of file descriptor leaks.
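A minimal sketch of this policy, assuming a POSIX system; the function names choose_soft_limit and raise_fd_soft_limit are hypothetical and do not exist in R. The cap is passed in so that 10000 is only one possible choice, and the soft limit is never lowered if it is already above the cap.

```c
#include <sys/resource.h>

/* Pure policy: pick the soft limit to request.
 * Result is min(cap, hard), but never below the current soft limit. */
rlim_t choose_soft_limit(rlim_t soft, rlim_t hard, rlim_t cap)
{
    rlim_t target;
    if (soft == RLIM_INFINITY)
        return soft;                       /* already unlimited: leave alone */
    target = (hard == RLIM_INFINITY || hard > cap) ? cap : hard;
    return target > soft ? target : soft;  /* do not shrink the limit */
}

/* Apply the policy to RLIMIT_NOFILE. Needs no superuser privileges,
 * because the requested soft limit never exceeds the hard limit.
 * Returns 0 on success, -1 on failure (errno set by the system call). */
int raise_fd_soft_limit(rlim_t cap)
{
    struct rlimit rl;
    rlim_t target;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    target = choose_soft_limit(rl.rlim_cur, rl.rlim_max, cap);
    if (target == rl.rlim_cur)
        return 0;                          /* nothing to change */
    rl.rlim_cur = target;
    return setrlimit(RLIMIT_NOFILE, &rl);
}
```

A call such as raise_fd_soft_limit(10000) early at startup would implement the "nicer" variant. With the macOS figures quoted above, 10000 stays below the reported real limit of 10240 even though the hard limit reads as "unlimited".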


To get R to work reliably on more esoteric operating systems or on poorly configured systems (e.g. systems with a hard limit of 1024), a second safeguard could be added: a request for a new connection would be denied if the actual number of open file descriptors (or connections, if that is easier to compute) is too close to the hard limit. A fixed amount (e.g. 128) or a proportion (e.g. 25%) of file descriptors would be reserved for "other uses", such as shared libraries.
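The check itself could be as simple as the following sketch. The names may_open_connection and reserve_percent are hypothetical, and the count of open descriptors (or connections) is assumed to be tracked elsewhere and passed in.

```c
/* Descriptors to hold back when reserving a percentage of the hard limit,
 * e.g. reserve_percent(1024, 25) gives 256. */
long reserve_percent(long hard_limit, int percent)
{
    return (long) ((long long) hard_limit * percent / 100);
}

/* Return 1 if one more connection may be opened, 0 if the request
 * should be denied because it would eat into the reserved descriptors. */
int may_open_connection(long open_count, long hard_limit, long reserve)
{
    return open_count + reserve < hard_limit;
}
```

With a hard limit of 1024 and a fixed reserve of 128, the check accepts requests while fewer than 896 descriptors are open and denies them from 896 onwards, leaving 128 descriptors for other uses.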


This discussion reminds me of the fixed number of file descriptors of MS-DOS, defined at boot time in config.sys (e.g. files=20).

It is incredible that 64-bit computers in 2021, with gigabytes of RAM, still have similar limits, and that R has a hard-coded limit of 128.


--

Sincerely

André GILLIBERT

________________________________
From: qweytr1 at mail.ustc.edu.cn <qweytr1 at mail.ustc.edu.cn>
Sent: Wednesday, 25 August 2021 06:15:59
To: Simon Urbanek
Cc: Martin Maechler; GILLIBERT, Andre; R-devel
Subject: Re: [SPAM] Re: [Rd] Is it a good choice to increase the NCONNECTION value?



Simon,

What about using dynamically allocated connections and a modifiable MAX_NCONNECTIONS limit?
ulimit can be modified by root users, whereas NCONNECTIONS currently cannot.

I tried changing the program to allocate the memory with malloc and realloc. Because I am unfamiliar with `.Internal` calls, I could not provide a function that modifies MAX_NCONNECTIONS (but it is possible).
The test and the changes are shown below. I would appreciate it if you could tell me whether there could be a bug.

(A demo that may change MAX_NCONNECTIONS; not tested.)
static int SetMaxNconnections(int now){ // returns the new value of MAX_NCONNECTIONS
  if(now<3)error(_("Could not shrink the MAX_NCONNECTIONS less than 3"));
  if(now>65536)warning(_("Setting MAX_NCONNECTIONS=%d, larger than 65536, may be crazy. Use at your own risk."),now);
  // Setting MAX_NCONNECTIONS to a really large value is safe, since the allocation is not done immediately. Thus this is only a warning.
  if(now>=NCONNECTIONS)return MAX_NCONNECTIONS=now; // nothing is allocated beyond NCONNECTIONS<=now, so only the cap changes and this is safe.
  R_gc(); /* Try to reclaim unused connections */
  // Scan down from the last valid index (NCONNECTIONS-1; index NCONNECTIONS would be out of bounds).
  for(int i=NCONNECTIONS-1;i>=now;--i){// now >= 3 here, thus no underflow occurs.
    // A connection still in use at index i prevents shrinking below i+1.
    if(Connections[i]){now=i+1;break;}
  }
  // Here we could call realloc, but since *Connections only takes several kilobytes, a shrinking realloc seems pointless.
  // A real realloc is triggered when NCONNECTIONS<MAX_NCONNECTIONS and NextConnection is called while all connections are in use.
  return MAX_NCONNECTIONS=NCONNECTIONS=now;
}



test result:

$ LC_ALL=C R-4.1.1/bin/R -q -e 'library(doParallel);cl=makeForkCluster(128);max(sapply(clusterCall(cl,function()runif(10)),"+"))'
WARNING: ignoring environment value of R_HOME
Loading required package: foreach
Loading required package: iterators
Loading required package: parallel
Warning messages:
1: In socketAccept(socket = socket, blocking = TRUE, open = "a+b",  :
  increase max connections from 16 to 32
2: In socketAccept(socket = socket, blocking = TRUE, open = "a+b",  :
  increase max connections from 32 to 64
3: In socketAccept(socket = socket, blocking = TRUE, open = "a+b",  :
  increase max connections from 64 to 128
4: In socketAccept(socket = socket, blocking = TRUE, open = "a+b",  :
  increase max connections from 128 to 256
[1] 0.9975836
tested changes:


~line 127

static int NCONNECTIONS=16; /* need one per cluster node; 16 is the
  initial value, which grows dynamically */
static int MAX_NCONNECTIONS=8192; /* increasing it only affects the speed of
  finding the correct connection; if you have a machine with more than
  4096 threads, you could submit an issue or modify this value manually */
#define NSINKS 21

static Rconnection *Connections=NULL; /* we will allocate it later */
...

~line 146



static int NextConnection(void)
{
    int i;
    for(i = 3; i < NCONNECTIONS; i++)
        if(!Connections[i]) break;
    if(i >= NCONNECTIONS) {
        R_gc(); /* Try to reclaim unused ones */
        for(i = 3; i < NCONNECTIONS; i++)
            if(!Connections[i]) break;
        if(i >= NCONNECTIONS) {
            if(i >= MAX_NCONNECTIONS)
                error(_("all connections are in use"));
            int old_connections = NCONNECTIONS;
            int new_connections = NCONNECTIONS*2; /* try dynamic alloc */
            if(new_connections > MAX_NCONNECTIONS)
                new_connections = MAX_NCONNECTIONS;
            Rconnection *ptr = realloc(Connections, new_connections*sizeof(Rconnection));
            if(ptr == NULL)
                error(_("alloc extra connections failed"));
            Connections = ptr;
            NCONNECTIONS = new_connections;
            for(int j = i; j < NCONNECTIONS; j++) Connections[j] = NULL;
            /* Warn only once the table is consistent: with options(warn = 2),
               warning() becomes an error and does not return. */
            warning(_("increase max connections from %d to %d\n"), old_connections, new_connections);
        }
    }
    return i;
}
...



~line 5265

void attribute_hidden InitConnections()
{
    int i;
    Connections=malloc(NCONNECTIONS*sizeof(Rconnection));
    if(Connections == NULL) {
        error(_("Cannot alloc connections."));
        abort();
    }
...