Simon Urbanek
2021-Aug-25 00:25 UTC
[Rd] Is it a good choice to increase the NCONNECTION value?
Martin,

I don't think a static connection limit is sensible. Recall that connections can be anything, not necessarily sockets or file descriptors, so they are not linked to the system fd limit. For example, if you use a codec then you will need twice as many connections as fds. To be honest, the connection limit is one of the main reasons why in our big data applications we have always avoided R connections and used C-level sockets instead (another was the lack of control over the socket flags, but that has been addressed in the last release). So I'd vote for at the very least increasing the limit significantly (at least 1k if not more) and, ideally, making it dynamic if memory footprint is an issue.

Cheers,
Simon

> On Aug 25, 2021, at 8:53 AM, Martin Maechler <maechler at stat.math.ethz.ch> wrote:
>
>>>>>> GILLIBERT, Andre
>>>>>>     on Tue, 24 Aug 2021 09:49:52 +0000 writes:
>
>> Rconnection is a pointer to a Rconn structure. The Rconn
>> structure must be allocated independently (e.g. by
>> malloc() in R_new_custom_connection). Therefore,
>> increasing NCONNECTION to 1024 should only use 8
>> kilobytes on 64-bit platforms and 4 kilobytes on 32-bit
>> platforms.
>
> You are right indeed, and I was wrong.
>
>> Ideally, it should be dynamically allocated: either as
>> a linked list or as a dynamic array
>> (malloc/realloc). However, a simple change of
>> NCONNECTION to 1024 should be enough for most uses.
>
> There is one other important problem I've been made aware of
> (similar to the number of open DLL libraries, an issue 1-2
> years ago):
>
> The OS itself has limits on the number of open files
> (yes, I know that there are other connections than files) and
> these limits may differ considerably from platform to platform.
>
> On my Linux laptop, in a shell, I see
>
>   $ ulimit -n
>   1024
>
> which is barely conformant with your proposed 1024 NCONNECTION.
>
> Now if NCONNECTION is larger than the max allowed number of
> open files and if R opens more files than the OS allows, the
> user may get quite unpleasant behavior, e.g. R being terminated brutally
> (or behaving crazily) without good R-level warning / error messages.
>
> It's also not at all sufficient to check the open files
> limit at compile time; it has to be checked at R process startup time.
>
> So this may need considerably more work than you / we have
> hoped, and it's probably hard to find a safe number that is
> considerably larger than 128 and less than the smallest of all
> non-crazy platforms' {number of open files limit}.
>
>> Sincerely
>> André GILLIBERT
>
> [............]
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
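Two points in the exchange above can be checked with a few lines of code: the footprint of a 1024-entry pointer table, and comparing a proposed connection limit with the per-process open-file limit at process startup rather than at compile time. The following is only an illustrative sketch in Python, not R's C code, and it assumes a Unix-like system where the `resource` module is available. The names `PROPOSED_NCONNECTIONS`, `RESERVED_FDS`, `pointer_table_bytes` and `safe_connection_limit` are hypothetical, and the headroom of 64 descriptors is an arbitrary assumption.

```python
import ctypes
import resource

PROPOSED_NCONNECTIONS = 1024  # the value proposed in the thread
RESERVED_FDS = 64  # hypothetical headroom for stdio, shared libraries, package loading


def pointer_table_bytes(n_slots):
    """Bytes needed for a table of n_slots pointers on the current platform."""
    return n_slots * ctypes.sizeof(ctypes.c_void_p)


def safe_connection_limit(proposed, soft_limit, reserved=RESERVED_FDS):
    """Cap `proposed` so that, in the worst case of one fd per connection,
    the process stays below the soft open-file limit.

    This is deliberately conservative: as noted in the thread, not every
    connection is backed by a file descriptor.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return proposed
    return max(0, min(proposed, soft_limit - reserved))


if __name__ == "__main__":
    # Query the limit when the process starts, not when the code is built.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("pointer table bytes:", pointer_table_bytes(PROPOSED_NCONNECTIONS))
    print("soft/hard open-file limit:", soft, hard)
    print("capped connection limit:",
          safe_connection_limit(PROPOSED_NCONNECTIONS, soft))
```

With a soft limit of 1024 and the assumed headroom, the cap comes out below the proposed 1024, which is the tension Martin describes. Whether such a cap is desirable at all is exactly what Simon disputes, since connections need not consume descriptors.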
luke-tierney at uiowa.edu
2021-Aug-25 01:45 UTC
[Rd] [External] Re: Is it a good choice to increase the NCONNECTION value?
We do need to be careful about using too many file descriptors. The standard soft limit on Linux is fairly low (1024; the hard limit is usually quite a bit higher). Hitting that limit, e.g. with runaway code allocating lots of connections, can cause other things, like loading packages, to fail with hard-to-diagnose error messages.

A static connection limit is a crude way to guard against that. Doing anything substantially better is probably a lot of work. A simple option that may be worth pursuing is to allow the limit to be adjusted at runtime. Users who want to go higher would do so at their own risk and may need to know how to adjust the soft limit on the process.

Best,

luke

On Wed, 25 Aug 2021, Simon Urbanek wrote:

> Martin,
>
> I don't think a static connection limit is sensible. Recall that connections can be anything, not necessarily sockets or file descriptors, so they are not linked to the system fd limit. For example, if you use a codec then you will need twice as many connections as fds. To be honest, the connection limit is one of the main reasons why in our big data applications we have always avoided R connections and used C-level sockets instead (another was the lack of control over the socket flags, but that has been addressed in the last release). So I'd vote for at the very least increasing the limit significantly (at least 1k if not more) and, ideally, making it dynamic if memory footprint is an issue.
>
> Cheers,
> Simon
>
>> On Aug 25, 2021, at 8:53 AM, Martin Maechler <maechler at stat.math.ethz.ch> wrote:
>>
>>>>>>> GILLIBERT, Andre
>>>>>>>     on Tue, 24 Aug 2021 09:49:52 +0000 writes:
>>
>>> Rconnection is a pointer to a Rconn structure. The Rconn
>>> structure must be allocated independently (e.g. by
>>> malloc() in R_new_custom_connection). Therefore,
>>> increasing NCONNECTION to 1024 should only use 8
>>> kilobytes on 64-bit platforms and 4 kilobytes on 32-bit
>>> platforms.
>>
>> You are right indeed, and I was wrong.
>>
>>> Ideally, it should be dynamically allocated: either as
>>> a linked list or as a dynamic array
>>> (malloc/realloc). However, a simple change of
>>> NCONNECTION to 1024 should be enough for most uses.
>>
>> There is one other important problem I've been made aware of
>> (similar to the number of open DLL libraries, an issue 1-2
>> years ago):
>>
>> The OS itself has limits on the number of open files
>> (yes, I know that there are other connections than files) and
>> these limits may differ considerably from platform to platform.
>>
>> On my Linux laptop, in a shell, I see
>>
>>   $ ulimit -n
>>   1024
>>
>> which is barely conformant with your proposed 1024 NCONNECTION.
>>
>> Now if NCONNECTION is larger than the max allowed number of
>> open files and if R opens more files than the OS allows, the
>> user may get quite unpleasant behavior, e.g. R being terminated brutally
>> (or behaving crazily) without good R-level warning / error messages.
>>
>> It's also not at all sufficient to check the open files
>> limit at compile time; it has to be checked at R process startup time.
>>
>> So this may need considerably more work than you / we have
>> hoped, and it's probably hard to find a safe number that is
>> considerably larger than 128 and less than the smallest of all
>> non-crazy platforms' {number of open files limit}.
>>
>>> Sincerely
>>> André GILLIBERT
>>
>> [............]

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney at uiowa.edu
Iowa City, IA 52242                 WWW:   http://www.stat.uiowa.edu
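The remark above that users raising the connection limit "may need to know how to adjust the soft limit on the process" can be illustrated with a short sketch. It is written in Python for brevity and is not a proposal for R's implementation; it assumes a Unix-like system with the `resource` module. The helper names `clamp_soft_target` and `raise_soft_nofile` are hypothetical.

```python
import resource


def clamp_soft_target(target, hard):
    """Largest soft limit not exceeding `target` that the hard limit permits."""
    if hard == resource.RLIM_INFINITY:
        return target
    return min(target, hard)


def raise_soft_nofile(target):
    """Try to raise the soft open-file limit to `target`.

    Never lowers the limit and never touches the hard limit.
    Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft  # already high enough
    new_soft = clamp_soft_target(target, hard)
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    except (ValueError, OSError):
        return soft  # the platform refused; keep the old limit
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    print("soft limit before:", resource.getrlimit(resource.RLIMIT_NOFILE)[0])
    print("soft limit after: ", raise_soft_nofile(4096))
```

An unprivileged process can raise its soft limit only up to the hard limit, which is why the sketch clamps the target instead of failing. A runtime-adjustable connection limit would presumably have to cope with the same constraint.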