Alex Brown
2013-Mar-06 18:24 UTC
[Rd] do_fileinfo / file.info test for file IS directory during package load pointlessly stresses NIS by getting username / group info
*Summary:*
During package loading, the library function calls file.info to determine
whether a file is a directory. This needlessly invokes getpwuid and getgrgid,
which can be expensive if the user/group databases are held on a network.
Note that file_test ALSO uses file.info for the same purpose.
I suggest rebuilding file_test to use a ‘stat’-based call for the directory
test, and using file_test in the library function.
Note that functions like R_unlink use stat calls to determine whether
something is a directory.
-Alex Brown
*Detail:*
While developing an application using Shiny, my (large Fortune 500) company
started to have network issues, and NIS performance issues in particular.
Shiny restarts R relatively frequently, which entails loading a small
handful of packages.
I noticed that R startup had become very slow, and my Shiny server
started failing with timeouts in other parts of the server (Apache
mod_proxy and Shiny’s node components have 60s timeouts).
I narrowed this down to long-latency calls on NIS, caused by calls to libc’s
getpwuid and getgrgid, always for the same user. But why?
*Strace:*
bind(5, {sa_family=AF_INET, sin_port=htons(994), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EACCES (Permission denied)
setsockopt(5, SOL_IP, IP_RECVERR, [1], 4) = 0
close(4) = 0
sendto(5, "_*n\230\0\0\0\0\0\0\0\2\0\1\206\244\0\0\0\2\0\0\0\3\0\0\0\0\0\0\0\0"..., 76, 0, {sa_family=AF_INET, sin_port=htons(776), sin_addr=inet_addr("10.3.147.16")}, 16) = 76
poll([{fd=5, events=POLLIN}], 1, 5000) = 1 ([{fd=5, revents=POLLIN}])
*gdb (break on bind)*
#0 0x00007ffff6ee3c00 in bind () from /lib64/libc.so.6
#1 0x00007ffff6f069b3 in bindresvport () from /lib64/libc.so.6
#2 0x00007ffff6f08a0f in __libc_clntudp_bufcreate_internal () from /lib64/libc.so.6
#3 0x00007ffff58ddd37 in yp_bind_client_create () from /lib64/libnsl.so.1
#4 0x00007ffff58dde06 in yp_bind_file () from /lib64/libnsl.so.1
#5 0x00007ffff58de043 in __yp_bind () from /lib64/libnsl.so.1
#6 0x00007ffff58de47c in do_ypcall () from /lib64/libnsl.so.1
#7 0x00007ffff58de569 in do_ypcall_tr () from /lib64/libnsl.so.1
#8 0x00007ffff58df0a2 in yp_match () from /lib64/libnsl.so.1
#9 0x00007ffff5af5f79 in _nss_nis_getpwuid_r () from /lib64/libnss_nis.so.2
#10 0x00007ffff6eb040c in getpwuid_r@@GLIBC_2.2.5 () from /lib64/libc.so.6
#11 0x00007ffff6eafc6f in getpwuid () from /lib64/libc.so.6
#12 0x00007ffff7905262 in do_fileinfo (call=<optimized out>, op=<optimized out>, args=<optimized out>, rho=<optimized out>) at platform.c:946
#13 0x00007ffff7894f62 in bcEval (body=<optimized out>, rho=<optimized out>, useCache=<optimized out>) at eval.c:4444
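Frames #11 and #12 are consistent with the general pattern below: a stat call followed by name lookups for the file's owner and group. This is only an illustrative sketch, not the code from platform.c. The struct file_details, the function get_file_details, and the want_names flag are all hypothetical; the flag shows how a caller that only needs isdir could skip the lookups:

```c
#include <assert.h>
#include <grp.h>
#include <pwd.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical result type, for illustration only. */
struct file_details {
    int  isdir;        /* 1 if the path is a directory */
    char uname[64];    /* owner name, empty if not looked up */
    char grname[64];   /* group name, empty if not looked up */
};

/* Hypothetical function, not R's do_fileinfo.
 * Returns 0 on success, -1 if stat fails. The getpwuid/getgrgid calls,
 * which may go out to NIS, happen only when want_names is nonzero. */
int get_file_details(const char *path, int want_names,
                     struct file_details *out)
{
    struct stat sb;

    assert(path != NULL && out != NULL);
    out->isdir = 0;
    out->uname[0] = '\0';
    out->grname[0] = '\0';

    if (stat(path, &sb) != 0)
        return -1;
    out->isdir = S_ISDIR(sb.st_mode) ? 1 : 0;

    if (want_names) {
        /* These are the potentially expensive calls. */
        struct passwd *pw = getpwuid(sb.st_uid);
        struct group  *gr = getgrgid(sb.st_gid);
        if (pw != NULL)
            snprintf(out->uname, sizeof out->uname, "%s", pw->pw_name);
        if (gr != NULL)
            snprintf(out->grname, sizeof out->grname, "%s", gr->gr_name);
    }
    return 0;
}
```

Calling get_file_details(path, 0, &d) yields the same isdir answer without touching the user or group database.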
*R: trace(file.info,browser) where*
where 7: file.info(lib.loc)$isdir %in% TRUE
where 8: library(package, lib.loc = lib.loc, character.only = TRUE,
logical.return = TRUE,
warn.conflicts = warn.conflicts, quietly = quietly)
*R: library*
lib.loc <- lib.loc[file.info(lib.loc)$isdir %in% TRUE]
if (!character.only)
-Alex Brown
