Laurent Gautier
2019-Dec-09 13:54 UTC
[Rd] Inconsistent behavior for the C API's R_ParseVector() ?
On Mon, Dec 9, 2019 at 05:43, Tomas Kalibera <tomas.kalibera at gmail.com> wrote:

> On 12/7/19 10:32 PM, Laurent Gautier wrote:
>
>> Thanks for the quick response Tomas.
>>
>> The same error is indeed happening when trying to have a zero-length
>> variable name in an environment. The surprising bit is then "why is
>> this happening during parsing" (that is, why are variables assigned to
>> an environment)?
>
> The emitted R error (in the R console) is not a parse (syntax) error,
> but an error emitted during parsing when the parser tries to intern a
> name - look it up in a symbol table. Empty string is not allowed as a
> symbol name, and hence the error. In the call "list(''=1)", the empty
> name is what could eventually become a name of a local variable inside
> list(), even though not yet during parsing.

Thanks Tomas.

I guess this has to do with R expressions being lazily evaluated, and names of arguments in a call are also part of the expression. Now the puzzling part is why that is part of the parsing at all: I would have expected R_ParseVector() to be restricted to parsing... Now it feels like R_ParseVector() is performing parsing, and a first level of evaluation for expressions that "should never work" (the empty name).

> There is probably some error in how the external code is handling R
> errors (Fatal error: unable to initialize the JIT, stack smashing, etc)
> and possibly also how R is initialized before calling ParseVector.
> Probably you would get the same problem when running say
> "stop('myerror')". Please note R errors are implemented as long-jumps,
> so care has to be taken when calling into R, Writing R Extensions has
> more details (and section 8 specifically about embedding R). This is
> unlike parse (syntax) errors signaled via return value to ParseVector()

The issue is that the segfault (because of stack smashing, therefore because of what I also suspect to be an uncontrolled jump) is happening within the execution of R_ParseVector(). I would think that an issue with the initialization of R is less likely because the project is otherwise used a fair bit and is well covered by automated continuous tests.

After looking more into R's gram.c I suspect that an execution context is required for R_ParseVector() to work properly (to know where to jump in case of error) when the parsing code decides to fail outside what it thinks is a syntax error. If that is the case, this would make R_ParseVector() function well when called from, say, a C extension to an R package, but fail the way I am seeing it fail when called from an embedded R.

Best,

Laurent

> Best,
> Tomas
>
>> We are otherwise aware that the error is not occurring in the R
>> console, but can be traced to a call to R_ParseVector() in R's C API
>> (https://github.com/rpy2/rpy2/blob/master/rpy2/rinterface_lib/_rinterface_capi.py#L509).
>>
>> Our specific setup is calling an embedded R from Python, using the cffi
>> library. An error on our end was the first possibility considered, but
>> the puzzling specificity of the error (as shown below, other parsing
>> errors are handled properly) and the difficulty tracing what is
>> happening in R_ParseVector() made me ask whether someone on this list
>> had a suggestion about the possible issue.
>>
>> ```
>> >>> import rpy2.rinterface as ri
>> >>> ri.initr()
>> >>> e = ri.parse("list(''=1+")
>> ---------------------------------------------------------------------------
>> RParsingError                             Traceback (most recent call last)
>> >>> e = ri.parse("list(''=123")
>> R[write to console]: Error: attempt to use zero-length variable name
>> R[write to console]: Fatal error: unable to initialize the JIT
>>
>> *** stack smashing detected ***: <unknown> terminated
>> ```
>>
>> On Mon, Dec 2, 2019 at 06:37, Tomas Kalibera <tomas.kalibera at gmail.com> wrote:
>>
>>> Dear Laurent,
>>>
>>> could you please provide a complete reproducible example where parsing
>>> results in a crash of R? Calling parse(text="list(''=123") from R
>>> works fine for me (gives Error: attempt to use zero-length variable
>>> name).
>>>
>>> I don't think the problem you observed could be related to the memory
>>> leak. The leak is on the heap, not stack.
>>>
>>> Zero-length names of elements in a list are allowed. They are not the
>>> same thing as zero-length variables in an environment. If you try to
>>> convert "lst" from your example to an environment, you would get the
>>> error (attempt to use zero-length variable name).
>>>
>>> Best
>>> Tomas
>>>
>>> On 11/30/19 11:55 PM, Laurent Gautier wrote:
>>>
>>>> Hi again,
>>>>
>>>> Beside R_ParseVector()'s possible inconsistent behavior, R's handling
>>>> of zero-length named elements does not seem consistent either:
>>>>
>>>> ```
>>>> > lst <- list()
>>>> > lst[[""]] <- 1
>>>> > names(lst)
>>>> [1] ""
>>>> > list("" = 1)
>>>> Error: attempt to use zero-length variable name
>>>> ```
>>>>
>>>> Should the parser be made to accept as valid what is otherwise
>>>> possible when using `[[<-`?
>>>>
>>>> Best,
>>>>
>>>> Laurent
>>>>
>>>> On Sat, Nov 30, 2019 at 17:33, Laurent Gautier <lgautier at gmail.com> wrote:
>>>>
>>>>> I found the following code comment in `src/main/gram.c`:
>>>>>
>>>>> ```
>>>>> /* Memory leak
>>>>>
>>>>> yyparse(), as generated by bison, allocates extra space for the parser
>>>>> stack using malloc(). Unfortunately this means that there is a memory
>>>>> leak in case of an R error (long-jump). In principle, we could define
>>>>> yyoverflow() to relocate the parser stacks for bison and allocate say on
>>>>> the R heap, but yyoverflow() is undocumented and somewhat complicated
>>>>> (we would have to replicate some macros from the generated parser here).
>>>>> The same problem exists at least in the Rd and LaTeX parsers in tools.
>>>>> */
>>>>> ```
>>>>>
>>>>> Could this be related to the issue?
>>>>>
>>>>> On Sat, Nov 30, 2019 at 14:04, Laurent Gautier <lgautier at gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> The behavior of
>>>>>> ```
>>>>>> SEXP R_ParseVector(SEXP, int, ParseStatus *, SEXP);
>>>>>> ```
>>>>>> defined in `src/include/R_ext/Parse.h` appears to be inconsistent
>>>>>> depending on the string to be parsed.
>>>>>>
>>>>>> Trying to parse a string such as `"list(''=1+"` sets the
>>>>>> `ParseStatus` to incomplete parsing error but trying to parse
>>>>>> `"list(''=123"` will result in R sending a message to the console
>>>>>> (followed by a crash):
>>>>>>
>>>>>> ```
>>>>>> R[write to console]: Error: attempt to use zero-length variable name
>>>>>> R[write to console]: Fatal error: unable to initialize the JIT
>>>>>> *** stack smashing detected ***: <unknown> terminated
>>>>>> ```
>>>>>>
>>>>>> Is there a reason for the difference in behavior, and is there a
>>>>>> workaround?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Laurent
>>>>
>>>> ______________________________________________
>>>> R-devel at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
Tomas Kalibera
2019-Dec-09 14:57 UTC
[Rd] Inconsistent behavior for the C API's R_ParseVector() ?
On 12/9/19 2:54 PM, Laurent Gautier wrote:

> On Mon, Dec 9, 2019 at 05:43, Tomas Kalibera <tomas.kalibera at gmail.com> wrote:
>
>> The emitted R error (in the R console) is not a parse (syntax) error,
>> but an error emitted during parsing when the parser tries to intern a
>> name - look it up in a symbol table. Empty string is not allowed as a
>> symbol name, and hence the error. In the call "list(''=1)", the empty
>> name is what could eventually become a name of a local variable inside
>> list(), even though not yet during parsing.
>
> Thanks Tomas.
>
> I guess this has to do with R expressions being lazily evaluated, and
> names of arguments in a call are also part of the expression. Now the
> puzzling part is why that is part of the parsing at all: I would have
> expected R_ParseVector() to be restricted to parsing... Now it feels
> like R_ParseVector() is performing parsing, and a first level of
> evaluation for expressions that "should never work" (the empty name).

Think of it as an exception in, say, Python. Some failures during parsing result in an exception (called an error in R and implemented using a long jump). Any time you are calling into R you can get an error; out of memory is also signalled as an R error.

>> There is probably some error in how the external code is handling R
>> errors (Fatal error: unable to initialize the JIT, stack smashing, etc)
>> and possibly also how R is initialized before calling ParseVector.
>> Probably you would get the same problem when running say
>> "stop('myerror')". Please note R errors are implemented as long-jumps,
>> so care has to be taken when calling into R, Writing R Extensions has
>> more details (and section 8 specifically about embedding R). This is
>> unlike parse (syntax) errors signaled via return value to ParseVector()
>
> The issue is that the segfault (because of stack smashing, therefore
> because of what I also suspect to be an uncontrolled jump) is happening
> within the execution of R_ParseVector(). I would think that an issue
> with the initialization of R is less likely because the project is
> otherwise used a fair bit and is well covered by automated continuous
> tests.
>
> After looking more into R's gram.c I suspect that an execution context
> is required for R_ParseVector() to work properly (to know where to jump
> in case of error) when the parsing code decides to fail outside what it
> thinks is a syntax error. If that is the case, this would make
> R_ParseVector() function well when called from, say, a C extension to
> an R package, but fail the way I am seeing it fail when called from an
> embedded R.

Yes, contexts are used internally to handle errors. For external use please see Writing R Extensions, section 6.12.

Best
Tomas
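Tomas's comparison with Python exceptions can be made concrete with a small toy model. The sketch below is not R's parser: `toy_parse`, `guarded_parse`, `ParseStatus` and `RError` are hypothetical names invented for illustration, and the string checks only mimic the two inputs reported in this thread. It shows one function with two failure channels, a status value for incomplete input and an exception standing in for an R error, and a guard that turns the second channel into an ordinary return value.

```python
from enum import Enum


class ParseStatus(Enum):
    OK = "ok"
    INCOMPLETE = "incomplete"


class RError(Exception):
    """Stand-in for an R error (a long jump in the real C code)."""


def toy_parse(text):
    """Hypothetical toy, not R's grammar: returns (expression, status)."""
    compact = text.replace(" ", "")
    # Channel 1: input that stops mid-expression is reported via the status.
    if compact.endswith("+"):
        return None, ParseStatus.INCOMPLETE
    # Channel 2: a completed argument with an empty name fails when the name
    # is interned, which surfaces as an error instead of a status value.
    if "''=" in compact or '""=' in compact:
        raise RError("attempt to use zero-length variable name")
    if compact.count("(") > compact.count(")"):
        return None, ParseStatus.INCOMPLETE
    return compact, ParseStatus.OK


def guarded_parse(text):
    """Catch the error channel so the caller always gets a plain result."""
    try:
        expression, status = toy_parse(text)
        return expression, status.value
    except RError as err:
        return None, "error: " + str(err)
```

A caller that only inspects the status, as in the unguarded `toy_parse`, never sees the second channel coming; in C the equivalent of the `try` block has to be set up before the call, which is what the facilities described in Writing R Extensions are for.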
Jürgen
2019-Dec-11 20:42 UTC
[Rd] Why does INT 3 (opcode 0xCC) SIGTRAP break to debugger (gdb) in Rgui.exe and Rterm.exe but NOT in R.exe on Windows (64 bit)?
I am developing a package to improve the debugging of Rcpp (C++) and SEXP based C code in gdb by providing convenience print, subset and other functions:

https://github.com/aryoda/R_CppDebugHelper

I also want to solve the Windows-only problem that you can break into the debugger from R only via Rgui.exe (menu "Misc > break to debugger") by supporting breakpoints for R.exe.

I want breakpoints support in R.exe because debugging in Rgui.exe has an unwanted side effect:

https://stackoverflow.com/questions/59236579/gdb-prints-output-stdout-to-rgui-console-instead-of-gdb-console-on-windows-whe

My idea is to break into the debugger from R.exe by calling a little C(++) code that contains an INT 3 (opcode 0xCC) SIGTRAP code:

    // break_to_debugger.cpp
    #include <Rcpp.h>

    // [[Rcpp::export]]
    int break_to_debugger() {
      int a = 3;
      asm("int $3");  // this code line shall break into the debugger
      // Idea taken from "Rgui > break into debugger":
      // https://github.com/wch/r-source/blob/5a156a0865362bb8381dcd69ac335f5174a4f60c/src/gnuwin32/rui.c#L431
      a++;
      return a;
    }

    # breakpoint.R

    #' breaks the execution into the debugger
    #'
    #' @return
    #' @export
    breakpoint <- function() {
      break_to_debugger()
    }

Surprisingly this works not only on Linux but also on Windows (v10, x64 architecture = 64 bit) in Rterm.exe, but NOT for R.exe (64 bit):

- Rgui.exe: Works
- Rscript.exe: Works
- R.exe: Does not work: R.exe is exited with:
  [Inferior 1 (process 20704) exited with code 020000000003]

Can you please help me to understand why it works for Rgui.exe and Rscript.exe but not for R.exe?

Why is int 3 exiting R.exe?

And: How could I make it also work with R.exe?

Thanks a lot for sharing your ideas and experiences!

Jürgen

PS 1: My sessionInfo():

    R version 3.6.1 (2019-07-05)
    Platform: x86_64-w64-mingw32/x64 (64-bit)
    Running under: Windows 10 x64 (build 17134)

PS 2: My package "CppDebugHelper" was compiled with -g -O0 -std=c++11

PS 3: Here is my captured gdb output for the three test cases:

1. Rgui.exe
------------------------------------------------------------------------

    >gdb --quiet --args Rgui.exe --silent --vanilla
    Reading symbols from Rgui.exe...(no debugging symbols found)...done.
    (gdb) run
    Starting program: C:\R\bin\x64\Rgui.exe --silent --vanilla
    [New Thread 14476.0x3710]
    [New Thread 14476.0x284c]
    [New Thread 14476.0x50ec]
    [New Thread 14476.0x2d24]
    warning: Invalid parameter passed to C runtime function.

    [In RGui's R console:]
    library(CppDebugHelper)
    breakpoint()

    [in gdb again:]
    Program received signal SIGTRAP, Trace/breakpoint trap.
    break_to_debugger () at break_to_debugger.cpp:33
    33        a++;
    (gdb) b debug_example_rcpp
    Breakpoint 1 at 0x66ac6846: file debug_example_rcpp.cpp, line 13.
    (gdb) continue
    Continuing.

    [In RGui's R console:]
    debug_example_rcpp()

    [in gdb again:]
    Breakpoint 1, debug_example_rcpp () at debug_example_rcpp.cpp:13
    13        CharacterVector cv = CharacterVector::create("foo", "bar", NA_STRING, "hello") ;
    (gdb) next
    14        NumericVector nv = NumericVector::create(0.0, 1.0, NA_REAL, 10) ;
    (gdb) n
    16        DateVector dv = DateVector::create( 14974, 14975, 15123, NA_REAL); // TODO how to use real dates instead?
    (gdb) n
    17        DateVector dv2 = DateVector::create(Date("2010-12-31"), Date("01.01.2011", "%d.%m.%Y"), Date(2011, 05, 29), NA_REAL);
    (gdb) n
    18        DatetimeVector dtv = DatetimeVector::create(1293753600, Datetime("2011-01-01"), Datetime("2011-05-29 10:15:30") , NA_REAL);
    (gdb) n
    19        DataFrame df = DataFrame::create(Named("name1") = cv, _["value1"] = nv, _["dv2"] = dv2); // Named and _[ ] are the same
    (gdb) n
    20        CharacterVector col1 = df["name1"]; // get the first column
    (gdb) call dbg_print(df)
    (gdb) call dbg_str(df)
    (gdb) continue
    Continuing.

    [Output for the dbg_* function calls is printed to Rgui's R console (NOT the gdb terminal!):]
      name1 value1        dv2
    1   foo      0 2010-12-31
    2   bar      1 2011-01-01
    3  <NA>     NA 2011-05-29
    4 hello     10       <NA>
    'data.frame':   4 obs. of 3 variables:
     $ name1 : Factor w/ 3 levels "bar","foo","hello": 2 1 NA 3
     $ value1: num 0 1 NA 10
     $ dv2   : Date, format: "2010-12-31" "2011-01-01" ...

2. R.exe
------------------------------------------------------------------------

    >gdb --quiet --args R.exe --silent --vanilla
    Reading symbols from R.exe...(no debugging symbols found)...done.
    (gdb) r
    Starting program: C:\R\bin\x64\R.exe --silent --vanilla
    [New Thread 20704.0x2b20]
    [New Thread 20704.0x4c08]
    [New Thread 20704.0x425c]
    [New Thread 20704.0x45f8]
    > library(CppDebugHelper)
    > breakpoint()
    [Thread 20704.0x45f8 exited with code 2147483651]
    [Thread 20704.0x425c exited with code 2147483651]
    [Thread 20704.0x4c08 exited with code 2147483651]
    [Inferior 1 (process 20704) exited with code 020000000003]
    (gdb) bt
    No stack.
    (gdb)

3. Rterm.exe
------------------------------------------------------------------------

    >gdb --quiet --args Rterm.exe --silent --vanilla
    Reading symbols from Rterm.exe...(no debugging symbols found)...done.
    (gdb) run
    Starting program: C:\R\bin\x64\Rterm.exe --silent --vanilla
    [New Thread 8132.0x3ee8]
    [New Thread 8132.0x3828]
    [New Thread 8132.0x4f1c]
    [New Thread 8132.0x4ff4]
    warning: Invalid parameter passed to C runtime function.
    [New Thread 8132.0x4dc8]
    > library(CppDebugHelper)
    > breakpoint()
    Program received signal SIGTRAP, Trace/breakpoint trap.
    break_to_debugger () at break_to_debugger.cpp:33
    33        a++;
    (gdb) b debug_example_rcpp
    Breakpoint 1 at 0x66ac6846: file debug_example_rcpp.cpp, line 13.
    (gdb) c
    Continuing.
    [1] 4
    > debug_example_rcpp()
    Breakpoint 1, debug_example_rcpp () at debug_example_rcpp.cpp:13
    13        CharacterVector cv = CharacterVector::create("foo", "bar", NA_STRING, "hello") ;
    (gdb) n
    14        NumericVector nv = NumericVector::create(0.0, 1.0, NA_REAL, 10) ;
    (gdb) n
    16        DateVector dv = DateVector::create( 14974, 14975, 15123, NA_REAL); // TODO how to use real dates instead?
    (gdb) call dbg_print(nv)
    [1] 0 1 NA 10
    (gdb) call dbg_print(dbg_subset(nv, 1, 2))
    [1] 1 NA
    (gdb)
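The two exit codes in the R.exe transcript look different but may be the same number. The short check below assumes that the process exit code `020000000003` is printed in octal (the leading zero suggests this, and it makes the values agree) while the per-thread code `2147483651` is decimal. The constant name `EXCEPTION_BREAKPOINT` and its value 0x80000003 come from the Windows exception codes, not from the message itself.

```python
# Values copied from the R.exe transcript above.
thread_exit_code = 2147483651                 # decimal, from the "[Thread ...]" lines
process_exit_code = int("020000000003", 8)    # assumed octal, from the "[Inferior 1 ...]" line

# Windows exception code for a breakpoint (what "int 3" raises).
EXCEPTION_BREAKPOINT = 0x80000003

same_value = thread_exit_code == process_exit_code == EXCEPTION_BREAKPOINT
print(hex(thread_exit_code), hex(process_exit_code), same_value)
```

If that reading is right, the process ended with the breakpoint exception as its exit status, which is consistent with the trap not having been caught by gdb in the R.exe case. It does not by itself explain why gdb missed it.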
Laurent Gautier
2019-Dec-14 16:25 UTC
[Rd] Inconsistent behavior for the C API's R_ParseVector() ?
On Mon, Dec 9, 2019 at 09:57, Tomas Kalibera <tomas.kalibera at gmail.com> wrote:

> On 12/9/19 2:54 PM, Laurent Gautier wrote:
>
>> I guess this has to do with R expressions being lazily evaluated, and
>> names of arguments in a call are also part of the expression. Now the
>> puzzling part is why that is part of the parsing at all: I would have
>> expected R_ParseVector() to be restricted to parsing... Now it feels
>> like R_ParseVector() is performing parsing, and a first level of
>> evaluation for expressions that "should never work" (the empty name).
>
> Think of it as an exception in, say, Python. Some failures during
> parsing result in an exception (called an error in R and implemented
> using a long jump). Any time you are calling into R you can get an
> error; out of memory is also signalled as an R error.

The surprising bit for me was that I had expected the function to solely perform parsing. I did not expect an exception (and a jump smashing the stack) when the function concerned is in the C API, is parsing a string, and is using a parameter (pointer) to store whether parsing was a failure or a success.

Since you are making a comparison with Python, the distinction I am making between parsing and evaluation seems to apply there. For example:

```
>>> import parser
>>> parser.expr('1+')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1
    1+
     ^
SyntaxError: unexpected EOF while parsing
>>> p = parser.expr('list(""=1)')
>>> p
<parser.st at 0x7f360e5329f0>
>>> eval(p)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: eval() arg 1 must be a string, bytes or code object
>>> list(""=1)
  File "<stdin>", line 1
SyntaxError: keyword can't be an expression
```

>> The issue is that the segfault (because of stack smashing, therefore
>> because of what I also suspect to be an uncontrolled jump) is happening
>> within the execution of R_ParseVector(). I would think that an issue
>> with the initialization of R is less likely because the project is
>> otherwise used a fair bit and is well covered by automated continuous
>> tests.
>>
>> After looking more into R's gram.c I suspect that an execution context
>> is required for R_ParseVector() to work properly (to know where to jump
>> in case of error) when the parsing code decides to fail outside what it
>> thinks is a syntax error. If that is the case, this would make
>> R_ParseVector() function well when called from, say, a C extension to
>> an R package, but fail the way I am seeing it fail when called from an
>> embedded R.
>
> Yes, contexts are used internally to handle errors. For external use
> please see Writing R Extensions, section 6.12.

I have wrapped my call to R_ParseVector() in an R_tryCatchError(), and this seems to help me overcome the issue. Thanks for the pointer.

Best,

Laurent

> Best
> Tomas
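The Python session in the message above relies on the `parser` module, which was removed in Python 3.10. As a rough sketch of the same parse-versus-evaluate distinction on current interpreters, the snippet below uses the standard `ast.parse`, `compile` and `eval` instead; the helper name `classify` is made up for this illustration, and the swap from `parser.expr` to `ast.parse` is a substitution, not what was run in the thread.

```python
import ast


def classify(source):
    """Report which stage rejects `source`: 'parse', 'eval', or 'none'."""
    try:
        # Parsing only: builds a syntax tree, nothing is executed.
        tree = ast.parse(source, mode="eval")
    except SyntaxError:
        return "parse"
    try:
        # Evaluation: compile the tree and run it with empty globals.
        eval(compile(tree, "<string>", "eval"), {})
    except Exception:
        return "eval"
    return "none"


for example in ("1+", 'list(""=1)', "list(a=1)", "undefined_name + 1", "1 + 1"):
    print(repr(example), "->", classify(example))
```

With `ast.parse`, the empty-keyword call is rejected at the parse stage with a `SyntaxError`, while an unknown name or an unsupported keyword argument only fails once the expression is evaluated.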