Alex Brown
2013-Mar-06 18:24 UTC
[Rd] do_fileinfo / file.info test for file IS directory during package load pointlessly stresses NIS by getting username / group info
*Summary:* During package loading, the library function calls file.info to determine whether a file is a directory. This needlessly invokes getpwuid and getgrgid, which can be expensive if the user/group databases are held on a network.

Note that file_test ALSO uses file.info for the same purpose.

Suggest rebuilding file_test to use a ‘stat’-based call for the directory test, and using file_test in function library. Note that functions like R_unlink use stat calls to determine if something is a directory.

-Alex Brown

*Detail:* While developing an application using Shiny, my (large Fortune 500) company started to have network issues, and NIS performance issues in particular. Shiny restarts R relatively frequently, which entails loading a small handful of packages. I noticed that R startup became very slow, and my Shiny server started failing with timeouts in other parts of the server (Apache mod_proxy and Shiny’s node components have 60s timeouts).

I narrowed this down to long-latency calls on NIS, caused by calls to libc’s getpwuid and getgrgid, always for the same user. But why?
*Strace:*

    bind(5, {sa_family=AF_INET, sin_port=htons(994), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EACCES (Permission denied)
    setsockopt(5, SOL_IP, IP_RECVERR, [1], 4) = 0
    close(4) = 0
    sendto(5, "_*n\230\0\0\0\0\0\0\0\2\0\1\206\244\0\0\0\2\0\0\0\3\0\0\0\0\0\0\0\0"..., 76, 0, {sa_family=AF_INET, sin_port=htons(776), sin_addr=inet_addr("10.3.147.16")}, 16) = 76
    poll([{fd=5, events=POLLIN}], 1, 5000) = 1 ([{fd=5, revents=POLLIN}])

*gdb (break on bind)*

    #0  0x00007ffff6ee3c00 in bind () from /lib64/libc.so.6
    #1  0x00007ffff6f069b3 in bindresvport () from /lib64/libc.so.6
    #2  0x00007ffff6f08a0f in __libc_clntudp_bufcreate_internal () from /lib64/libc.so.6
    #3  0x00007ffff58ddd37 in yp_bind_client_create () from /lib64/libnsl.so.1
    #4  0x00007ffff58dde06 in yp_bind_file () from /lib64/libnsl.so.1
    #5  0x00007ffff58de043 in __yp_bind () from /lib64/libnsl.so.1
    #6  0x00007ffff58de47c in do_ypcall () from /lib64/libnsl.so.1
    #7  0x00007ffff58de569 in do_ypcall_tr () from /lib64/libnsl.so.1
    #8  0x00007ffff58df0a2 in yp_match () from /lib64/libnsl.so.1
    #9  0x00007ffff5af5f79 in _nss_nis_getpwuid_r () from /lib64/libnss_nis.so.2
    #10 0x00007ffff6eb040c in getpwuid_r@@GLIBC_2.2.5 () from /lib64/libc.so.6
    #11 0x00007ffff6eafc6f in getpwuid () from /lib64/libc.so.6
    #12 0x00007ffff7905262 in do_fileinfo (call=<optimized out>, op=<optimized out>, args=<optimized out>, rho=<optimized out>) at platform.c:946
    #13 0x00007ffff7894f62 in bcEval (body=<optimized out>, rho=<optimized out>, useCache=<optimized out>) at eval.c:4444

*R: trace(file.info,browser) where*

    where 7: file.info(lib.loc)$isdir %in% TRUE
    where 8: library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, warn.conflicts = warn.conflicts, quietly = quietly)

*R: library*

    lib.loc <- lib.loc[file.info(lib.loc)$isdir %in% TRUE]
    if (!character.only)

-Alex Brown
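
*Sketch of a stat-only directory test:*

For illustration, the ‘stat’-based check suggested in the summary might look roughly like the C below, assuming a POSIX system. The helper name is_directory is hypothetical and is not taken from R's sources; it only shows that the directory question can be answered from st_mode alone, without the getpwuid / getgrgid lookups seen in the backtrace.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical helper, not part of R: returns true if `path` names a
 * directory. Only stat() is called, so the owner and group IDs are never
 * translated to names and no passwd/group (NIS) lookup is triggered. */
bool is_directory(const char *path)
{
    struct stat sb;

    /* A missing or inaccessible path is treated as "not a directory". */
    if (path == NULL || stat(path, &sb) != 0)
        return false;

    /* S_ISDIR inspects the file-type bits of st_mode. */
    return S_ISDIR(sb.st_mode);
}
```

A check of this shape returns only the one logical value that the lib.loc filter in library actually uses.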