Erik Iverson
2008-Jun-03 05:20 UTC
[Rd] More information on R segfaults, tcltk package, and graphics devices
Dear R-devel,

I have investigated the report I made at https://stat.ethz.ch/pipermail/r-devel/2008-May/049683.html some more, and believe I have enough information to warrant an update. My sessionInfo() immediately after starting R is at the bottom of this message.

I decided to concentrate first on finding out why I sometimes receive a segfault when I close a graphics window while it is redrawing after a resize, seemingly only after loading the tcltk package.

I run the following code in a --vanilla R session.

library(grid)
library(tcltk)
for(i in seq(0, 1, by = .1)) {
  for(j in seq(0, 1, by = .01)) {
    angle <- runif(1, 1, 180)
    col <- sample(colors(), 1)
    pushViewport(viewport(x = i, y = j, width = .1, height = .1,
                          angle = angle, gp = gpar(col = col)))
    grid.rect()
    popViewport()
  }
}

I then simply resize the X11 window a bit to force a redraw of the graphic, and rapidly hit the 'X' close button on the X11 window while the rectangles are redrawing. Often the window closes and R segfaults.

The gdb backtraces from the core dumps I produced were mostly failing in GEcheckState from engine.c, but it was not clear to me from the backtrace what was going on. After much trial and error, I decided to put a breakpoint in the removeDevice function from devices.c. I then do what I describe above, and get the following backtrace from gdb, edited to show what I think is going on.

(gdb) bt
#0 removeDevice (devNum=1, findNext=TRUE) at devices.c:307
#1 0xb7962855 in handleEvent (event {type = 33, xany = {type = 33, serial = 15621, send_event = 1, ...(snip)... , 268686226}}) at devX11.c:627
#2 0xb796296c in R_ProcessX11Events (data=0x0) at devX11.c:665
#3 0x080fd99c in R_runHandlers (handlers=0x8263d28, readMask=0x82cf6a0) at sys-std.c:363
#4 0xb74e159e in RTcl_eventProc (evPtr=0x97dfbf0, flags=-1) at tcltk_unix.c:136
#5 0xb749d6a3 in Tcl_ServiceEvent () from /usr/lib/libtcl8.4.so.0
#6 0xb749da32 in Tcl_DoOneEvent () from /usr/lib/libtcl8.4.so.0
#7 0xb74e14ee in TclSpinLoop (data=0x0) at tcltk_unix.c:60
#8 0x0814d4a6 in R_ToplevelExec (fun=0xb74e14d0 <TclSpinLoop>, data=0x0) at context.c:604
#9 0xb74e14b2 in TclHandler () at tcltk_unix.c:67
#10 0x08184f11 in R_CheckUserInterrupt () at errors.c:125
#11 0x0818d5cc in Rf_eval (e=0x8ab3010, rho=0x858f6f8) at eval.c:370
... (snip)... Many Rf_eval, Rf_applyClosure, etc.
#73 0x08173480 in do_recordGraphics (call=0x8308040, op=0x83223e0, args=0x91f4c58, env=0x8308040) at engine.c:2757
#74 0x081730a7 in GEplayDisplayList (dd=0x974f8e0) at engine.c:2547
#75 0xb7962659 in handleEvent (event {type = 12, xany = {type = 12, serial = 14493, send_event = 0, ...snip...
#79 0x0805b0ca in Rf_ReplIteration (rho=0x832ac68, savestack=0, browselevel=0, state=0xbffbbc34) at main.c:206
#80 0x0805b1ea in R_ReplConsole (rho=0x832ac68, savestack=0, browselevel=0) at main.c:306
#81 0x0805b4d8 in run_Rmainloop () at main.c:967
#82 0x08058d91 in main (ac=0, av=0x0) at Rmain.c:35
#83 0xb7d61450 in __libc_start_main () from /lib/tls/i686/cmov/libc.so.6
#84 0x08058cc1 in _start ()

What seems to be happening is that, during the "while (theList != R_NilValue && plotok)" loop in GEplayDisplayList, R_CheckUserInterrupt can be called at some point. If the tcltk package has been loaded, its TclHandler is then called, which eventually leads to removeDevice being called, as the backtrace above shows.
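
To make the suspected sequence concrete, here is a small stand-alone C model of it. This is only a sketch and not R's code: Device, device_table, open_device, remove_device, check_user_interrupt, close_at_step and replay_display_list are all hypothetical names that merely play the roles of the device, removeDevice, R_CheckUserInterrupt and GEplayDisplayList. The loop looks the device up again on every iteration instead of holding on to a pointer obtained before the hook ran, which is one conceivable guard and not a proposed patch.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_DEVICES 4

/* Hypothetical stand-in for a graphics device; not R's real device struct. */
typedef struct {
    int id;
    int ops_replayed;
} Device;

/* A NULL slot means "no device open in this slot". */
static Device *device_table[MAX_DEVICES];

/* Step at which the simulated user closes the window (-1 = never). */
static int close_at_step = -1;

static Device *open_device(int slot) {
    assert(slot >= 0 && slot < MAX_DEVICES);
    Device *dd = malloc(sizeof *dd);
    if (dd == NULL) return NULL;
    dd->id = slot;
    dd->ops_replayed = 0;
    device_table[slot] = dd;
    return dd;
}

/* Plays the role of removeDevice: frees the device and clears its slot. */
static void remove_device(int slot) {
    assert(slot >= 0 && slot < MAX_DEVICES);
    free(device_table[slot]);
    device_table[slot] = NULL;
}

/* Plays the role of the handler reached through the interrupt check:
   at the chosen step it removes the device, as a window close would. */
static void check_user_interrupt(int step, int slot) {
    if (step == close_at_step && device_table[slot] != NULL)
        remove_device(slot);
}

/* Replays n_ops display-list operations on the device in `slot`.
   Returns how many operations completed; fewer than n_ops means the
   device was removed part-way through the replay. */
static int replay_display_list(int slot, int n_ops) {
    int done = 0;
    for (int step = 0; step < n_ops; step++) {
        /* Look the device up again rather than trusting a pointer
           fetched before the hook had a chance to free it. */
        Device *dd = device_table[slot];
        if (dd == NULL) break;
        dd->ops_replayed++;
        done++;
        check_user_interrupt(step, slot);  /* may free the device */
    }
    return done;
}
```

In this model, opening a device in slot 1, setting close_at_step = 2 and calling replay_display_list(1, 10) stops after three operations with the slot cleared. A version that cached dd before the loop would instead keep writing to freed memory, which is the kind of access the backtraces point at.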

Once removeDevice has run (please excuse my possibly loose terminology), the device no longer exists as far as R is concerned, and accessing the 'dd' variable, as GEcheckState does, can cause a segfault. That is, if something has not already gone wrong while replaying the display list, such as the strange grid errors I had been receiving: "Cannot pop top-level viewport" and "VECTOR_ELT() can only be applied to a 'list', not a 'NULL'".

I have no idea at this point whether there is a fix or how to go about one, but I believe that is what is happening, so if anyone wants to investigate further, this should be a good starting point. Perhaps the relevant advice here is "Don't do that".

Please ask if I have not been clear enough or if additional information from gdb is needed.

Best,
Erik Iverson
iverson at biostat.wisc.edu

sessionInfo()
R version 2.7.0 (2008-04-22)
i686-pc-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base