maechler@stat.math.ethz.ch
2004-Jul-17 12:15 UTC
[Rd] gsub(*, perl=TRUE) bug incl. seg.fault (PR#7108)
Experimenting a bit further, I've "found" the following

1) the problem seems only gsub(), not sub()

   > sub(" ", "", "b c + d | a * b", perl=TRUE)
   [1] "bc + d | a * b"
   > gsub(" ", "", "b c + d | a * b", perl=TRUE)
   NULL

2) only if perl = TRUE, not otherwise

3) Also modifying the input string only slightly leads to a
   different (correct instead of bug) result:

   > gsub(" ", "", "bc + d | a * b", perl=TRUE)
   [1] "bc+d|a*b"
   > gsub(" ", "", "b c + d | a *", perl=TRUE)
   [1] "bc+d|a*"
   > gsub(" ", "", "b c + d | b", perl=TRUE)
   [1] "bc+d|b"

Whereas those give the bug as well :

   > gsub(" ", "", "b : : d : a : b", perl=TRUE)
   NULL
   > gsub(" ", "", "b : : : : a : b", perl=TRUE)
   NULL
   > gsub(" ", "", "b : : : : : : b", perl=TRUE)
   NULL
   > gsub(" ", "", "a : : : : : : a", perl=TRUE)
   NULL

but not this

   > gsub(" ", "", "b : : : : : : a", perl=TRUE)
   [1] "b::::::a"

But it's even worse :

   > gsub(" ", "", "a: 12345 :a", perl=TRUE)

   -> segmentation fault

and from "R -d gdb" :

   > gsub(" ", "", "a: 12345 :a", perl=TRUE)

   Program received signal SIGSEGV, Segmentation fault.
   hashIndex (symbol=0x8fd0060, table=0x823f2b8) at ../../../R/src/main/envir.c:599
   599         if( !HASHASH(c) ) {
   (gdb) bt
   #0  hashIndex (symbol=0x8fd0060, table=0x823f2b8)
       at ../../../R/src/main/envir.c:599
   #1  0x080b145d in R_GetGlobalCache (symbol=0x1)
       at ../../../R/src/main/envir.c:656
   #2  0x080b1ac1 in findGlobalVar (symbol=0x8fd0060)
       at ../../../R/src/main/envir.c:923
   #3  0x080b8873 in Rf_eval (e=0x823f2b8, rho=0x823f014)
       at ../../../R/src/main/eval.c:329
   #4  0x080b88c9 in Rf_eval (e=0x8fd1e60, rho=0x821e540)
       at ../../../R/src/main/eval.c:354
   #5  0x080e1e0e in GetObject (cptr=0xbfffd400)
       at ../../../R/src/main/objects.c:88
   #6  0x080e2a3b in do_usemethod (call=0x84eb084, op=0x823be78, args=0x84eb0a0,
       env=0x8fd1df0) at ../../../R/src/main/objects.c:381
   #7  0x080b8c33 in Rf_eval (e=0x84eb084, rho=0x8fd1df0)
       at ../../../R/src/main/eval.c:375
   #8  0x080b8f48 in Rf_applyClosure (call=0x8fd1eb4, op=0x84eafa4,
       arglist=0x8fd1e44, rho=0x823f014, suppliedenv=0x821e540)
       at ../../../R/src/main/eval.c:559
   #9  0x080b89bf in Rf_eval (e=0x8fd1eb4, rho=0x823f014)
       at ../../../R/src/main/eval.c:410
   #10 0x0811b0f3 in Rf_PrintValueEnv (s=0x8fd1eb4, env=0x823f014)
       at ../../../R/src/main/print.c:775
   #11 0x080d55f1 in Rf_ReplIteration (rho=0x823f014, savestack=150798432,
       browselevel=1, state=0x823f014) at ../../../R/src/main/main.c:254
   #12 0x080d578d in R_ReplConsole (rho=0x823f014, savestack=0, browselevel=0)
       at ../../../R/src/main/main.c:298
   #13 0x080d60d0 in run_Rmainloop ()
       at ../../../R/src/main/main.c:656
   #14 0x08147e90 in main (ac=150798432, av=0x8fd0060)
       at ../../../R/src/unix/system.c:99
   #15 0x42017589 in __libc_start_main () from /lib/i686/libc.so.6

(which is not directly helpful since src/main/pcre.c (where
the C source for gsub(*, perl=TRUE) resides) isn't mentioned above).

Must be a memory allocation / mismatch / end_of_string problem
somewhere.

Martin Maechler, ETH Zurich
Prof Brian Ripley
2004-Jul-26 17:47 UTC
[Rd] gsub(*, perl=TRUE) bug incl. seg.fault (PR#7108)
I presume this was in R 1.9.1?

It seems that this was something I had already fixed (from another
symptom) and not been able to commit because the CVS archive was down.

I have now fixed it in R 1.9.1 patched and R-devel, and put a
regression test into the latter.

Brian

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
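
For readers who want to check their own build against this report, the cases from the original message can be gathered into one small script. This is only a sketch and is not the regression test that was committed to R-devel; the variable names (inputs, perl_result, plain_result) are made up for illustration. It assumes that for a literal one-space pattern the perl=TRUE and default code paths should return the same strings.

```r
# Hypothetical check for PR#7108; NOT the regression test added to R-devel.
# The inputs are the strings quoted in the original report.
inputs <- c("b c + d | a * b",
            "b : : d : a : b",
            "b : : : : a : b",
            "b : : : : : : b",
            "a : : : : : : a",
            "b : : : : : : a",
            "a: 12345 :a")

# Assumption: a literal single-space pattern means the same thing to both
# engines, so the two results should be identical.
perl_result  <- gsub(" ", "", inputs, perl = TRUE)
plain_result <- gsub(" ", "", inputs)

stopifnot(is.character(perl_result),               # the bug returned NULL
          length(perl_result) == length(inputs),
          identical(perl_result, plain_result),
          !grepl(" ", perl_result, fixed = TRUE))  # no spaces left over
```

On an affected build the script would be expected to stop at the first stopifnot() condition, or to crash outright on the last input, rather than finish silently.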
maechler@stat.math.ethz.ch
2004-Jul-26 18:17 UTC
[Rd] gsub(*, perl=TRUE) bug incl. seg.fault (PR#7108)
>>>>> "BDR" == Prof Brian Ripley <ripley@stats.ox.ac.uk>
>>>>>     on Mon, 26 Jul 2004 16:11:51 +0100 (BST) writes:

    BDR> I presume this was in R 1.9.1?

Yes, and R-patched (and R-devel) of that time -- of course not
containing your "off line" fixes.

Thanks a lot for taking care of it!
Martin