Laurent Gautier
2019-Dec-14 16:25 UTC
[Rd] Inconsistent behavior for the C AP's R_ParseVector() ?
On Mon, 9 Dec 2019 at 09:57, Tomas Kalibera <tomas.kalibera at gmail.com> wrote:

> On 12/9/19 2:54 PM, Laurent Gautier wrote:
>
>> On Mon, 9 Dec 2019 at 05:43, Tomas Kalibera <tomas.kalibera at gmail.com> wrote:
>>
>>> On 12/7/19 10:32 PM, Laurent Gautier wrote:
>>>
>>>> Thanks for the quick response, Tomas.
>>>>
>>>> The same error is indeed happening when trying to have a zero-length
>>>> variable name in an environment. The surprising bit is then "why is this
>>>> happening during parsing" (that is, why are variables assigned to an
>>>> environment)?
>>>
>>> The emitted R error (in the R console) is not a parse (syntax) error, but
>>> an error emitted during parsing when the parser tries to intern a name -
>>> look it up in a symbol table. An empty string is not allowed as a symbol
>>> name, hence the error. In the call "list(''=1)", the empty name is what
>>> could eventually become the name of a local variable inside list(), even
>>> though not yet during parsing.
>>
>> Thanks Tomas.
>>
>> I guess this has to do with R expressions being lazily evaluated, and names
>> of arguments in a call are also part of the expression. Now the puzzling
>> part is why that is part of the parsing at all: I would have expected
>> R_ParseVector() to be restricted to parsing... Now it feels like
>> R_ParseVector() is performing parsing, and a first level of evaluation for
>> expressions that "should never work" (the empty name).
>
> Think of it as an exception in, say, Python. Some failures during parsing
> result in an exception (called an error in R and implemented using a long
> jump). Any time you are calling into R you can get an error; out of memory
> is also signalled as an R error.

The surprising bit for me was that I had expected the function to solely perform parsing. I did not expect an exception (and a jump smashing the stack) when the function concerned is in the C API, is parsing a string, and is using a parameter (pointer) to store whether parsing was a failure or a success.

Since you are making a comparison with Python, the distinction I am making between parsing and evaluation seems to apply there. For example:

```
>>> import parser
>>> parser.expr('1+')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1
    1+
     ^
SyntaxError: unexpected EOF while parsing
>>> p = parser.expr('list(""=1)')
>>> p
<parser.st at 0x7f360e5329f0>
>>> eval(p)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: eval() arg 1 must be a string, bytes or code object
>>> list(""=1)
  File "<stdin>", line 1
SyntaxError: keyword can't be an expression
```

>>> There is probably some error in how the external code is handling R
>>> errors (Fatal error: unable to initialize the JIT, stack smashing, etc.)
>>> and possibly also in how R is initialized before calling ParseVector.
>>> Probably you would get the same problem when running, say,
>>> "stop('myerror')". Please note that R errors are implemented as long
>>> jumps, so care has to be taken when calling into R; Writing R Extensions
>>> has more details (and section 8 specifically about embedding R). This is
>>> unlike parse (syntax) errors, which are signaled via the return value of
>>> ParseVector().
>>
>> The issue is that the segfault (because of stack smashing, therefore
>> because of what I also suspected to be an uncontrolled jump) is happening
>> within the execution of R_ParseVector(). I would think that an issue with
>> the initialization of R is less likely because the project is otherwise
>> used a fair bit and is well covered by automated continuous tests.
>>
>> After looking more into R's gram.c, I suspect that an execution context is
>> required for R_ParseVector() to work properly (to know where to jump
>> in case of error) when the parsing code decides to fail outside what it
>> thinks is a syntax error. If that is the case, this would make
>> R_ParseVector() function well when called from, say, a C extension to an
>> R package, but fail the way I am seeing it fail when called from an
>> embedded R.
>
> Yes, contexts are used internally to handle errors. For external use
> please see Writing R Extensions, section 6.12.

I have wrapped my call to R_ParseVector() in R_tryCatchError(), and this seems to help me overcome the issue. Thanks for the pointer.

Best,

Laurent

> Best
> Tomas
>
>> Best,
>>
>> Laurent
>>
>>> Best,
>>> Tomas
>>>
>>>> We are otherwise aware that the error is not occurring in the R console,
>>>> but can be traced to a call to R_ParseVector() in R's C API
>>>> (https://github.com/rpy2/rpy2/blob/master/rpy2/rinterface_lib/_rinterface_capi.py#L509).
>>>>
>>>> Our specific setup is calling an embedded R from Python, using the cffi
>>>> library. An error on our end was the first possibility considered, but the
>>>> puzzling specificity of the error (as shown below, other parsing errors
>>>> are handled properly) and the difficulty tracing what is happening in
>>>> R_ParseVector() made me ask whether someone on this list had a suggestion
>>>> about the possible issue.
>>>>
>>>> ```
>>>> >>> import rpy2.rinterface as ri
>>>> >>> ri.initr()
>>>> >>> e = ri.parse("list(''=1+")
>>>> ---------------------------------------------------------------------------
>>>> RParsingError                             Traceback (most recent call last)
>>>> >>> e = ri.parse("list(''=123")
>>>> R[write to console]: Error: attempt to use zero-length variable name
>>>> R[write to console]: Fatal error: unable to initialize the JIT
>>>>
>>>> *** stack smashing detected ***: <unknown> terminated
>>>> ```
>>>>
>>>> On Mon, 2 Dec 2019 at 06:37, Tomas Kalibera <tomas.kalibera at gmail.com> wrote:
>>>>
>>>>> Dear Laurent,
>>>>>
>>>>> could you please provide a complete reproducible example where parsing
>>>>> results in a crash of R? Calling parse(text="list(''=123") from R works
>>>>> fine for me (gives Error: attempt to use zero-length variable name).
>>>>>
>>>>> I don't think the problem you observed could be related to the memory
>>>>> leak. The leak is on the heap, not the stack.
>>>>>
>>>>> Zero-length names of elements in a list are allowed. They are not the
>>>>> same thing as zero-length variables in an environment. If you try to
>>>>> convert "lst" from your example to an environment, you would get the
>>>>> error (attempt to use zero-length variable name).
>>>>>
>>>>> Best
>>>>> Tomas
>>>>>
>>>>> On 11/30/19 11:55 PM, Laurent Gautier wrote:
>>>>>
>>>>>> Hi again,
>>>>>>
>>>>>> Besides R_ParseVector()'s possibly inconsistent behavior, R's handling
>>>>>> of zero-length named elements does not seem consistent either:
>>>>>>
>>>>>> ```
>>>>>> > lst <- list()
>>>>>> > lst[[""]] <- 1
>>>>>> > names(lst)
>>>>>> [1] ""
>>>>>> > list("" = 1)
>>>>>> Error: attempt to use zero-length variable name
>>>>>> ```
>>>>>>
>>>>>> Should the parser be made to accept as valid what is otherwise possible
>>>>>> when using `[[<-`?
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Laurent
>>>>>>
>>>>>> On Sat, 30 Nov 2019 at 17:33, Laurent Gautier <lgautier at gmail.com> wrote:
>>>>>>
>>>>>>> I found the following code comment in `src/main/gram.c`:
>>>>>>>
>>>>>>> ```
>>>>>>> /* Memory leak
>>>>>>>
>>>>>>> yyparse(), as generated by bison, allocates extra space for the parser
>>>>>>> stack using malloc(). Unfortunately this means that there is a memory
>>>>>>> leak in case of an R error (long-jump). In principle, we could define
>>>>>>> yyoverflow() to relocate the parser stacks for bison and allocate say on
>>>>>>> the R heap, but yyoverflow() is undocumented and somewhat complicated
>>>>>>> (we would have to replicate some macros from the generated parser here).
>>>>>>> The same problem exists at least in the Rd and LaTeX parsers in tools.
>>>>>>> */
>>>>>>> ```
>>>>>>>
>>>>>>> Could this be related to the issue?
>>>>>>>
>>>>>>> On Sat, 30 Nov 2019 at 14:04, Laurent Gautier <lgautier at gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> The behavior of
>>>>>>>> ```
>>>>>>>> SEXP R_ParseVector(SEXP, int, ParseStatus *, SEXP);
>>>>>>>> ```
>>>>>>>> defined in `src/include/R_ext/Parse.h` appears to be inconsistent
>>>>>>>> depending on the string to be parsed.
>>>>>>>>
>>>>>>>> Trying to parse a string such as `"list(''=1+"` sets the
>>>>>>>> `ParseStatus` to incomplete parsing error, but trying to parse
>>>>>>>> `"list(''=123"` will result in R sending a message to the console
>>>>>>>> (followed by a crash):
>>>>>>>>
>>>>>>>> ```
>>>>>>>> R[write to console]: Error: attempt to use zero-length variable name
>>>>>>>> R[write to console]: Fatal error: unable to initialize the JIT
>>>>>>>> *** stack smashing detected ***: <unknown> terminated
>>>>>>>> ```
>>>>>>>>
>>>>>>>> Is there a reason for the difference in behavior, and is there a
>>>>>>>> workaround?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Laurent
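
The two outcomes reported in this thread, and the effect of wrapping the call, can be mimicked with a toy model. This is only a sketch in plain Python, not the rpy2 or R code: `toy_parse`, `guarded_parse`, `ParseStatus`, and `ToyRError` are hypothetical names, the rules are invented to mirror the two inputs quoted above and do not reproduce R's grammar, and a Python exception stands in for R's long jump. `guarded_parse` plays roughly the role that R_tryCatchError() plays around R_ParseVector().

```python
from enum import Enum


class ParseStatus(Enum):
    OK = "ok"
    INCOMPLETE = "incomplete"
    FAILED = "failed"  # hypothetical: set by the guard, never by the parser


class ToyRError(Exception):
    """Stands in for an R error, which the real C API raises via a long jump."""


def toy_parse(text):
    """Hypothetical parser with two failure channels (not R's grammar).

    Syntax problems come back through the returned status; an empty
    argument name is reported by raising instead.
    """
    stripped = text.rstrip()
    # Channel 1: a dangling operator is a syntax problem -> status value.
    if stripped.endswith(("+", "-", "*", "/")):
        return ParseStatus.INCOMPLETE, None
    # Channel 2: an empty quoted name before '=' -> raised error.
    if "''=" in stripped or '""=' in stripped:
        raise ToyRError("attempt to use zero-length variable name")
    return ParseStatus.OK, stripped


def guarded_parse(text):
    """Run toy_parse() so that a raised error cannot escape to the caller."""
    try:
        status, value = toy_parse(text)
        return status, value, None
    except ToyRError as err:
        return ParseStatus.FAILED, None, str(err)


for source in ("list(''=1+", "list(''=123", "list(a=1)"):
    status, value, message = guarded_parse(source)
    print(source, "->", status.name, message or "")
```

A caller that only inspects the returned status sees the first kind of failure but not the second; the guard folds both into one return value.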
Simon Urbanek
2019-Dec-14 17:04 UTC
[Rd] Inconsistent behavior for the C AP's R_ParseVector() ?
Laurent,

the main point here is that ParseVector(), just like any other R API, has to be called in a correct context since it can raise errors, so the issue was that your C code has a bug of not setting up R correctly (my guess would be you're not creating the initial context necessary in embedded R). There are many different errors; yours is just one of many that can occur: any R API call that does allocation (and parsing obviously does) can cause errors. Note that this is true for pretty much all R API functions.

Cheers,
Simon
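
One way to picture why the calling context matters: an error delivered as a jump needs a registered place to land. The sketch below is a toy model in plain Python and does not describe R's internals; `ContextStack`, `run_protected`, `signal_error`, and `FatalError` are hypothetical names. When no handler has been registered, the only option left is to give up, which loosely mirrors the fatal failure reported in this thread.

```python
class FatalError(Exception):
    """Hypothetical: raised when an error has no context to unwind to."""


class _Unwind(Exception):
    """Internal signal carrying an error message up to the nearest context."""


class ContextStack:
    def __init__(self):
        self._depth = 0  # number of currently active protected contexts

    def run_protected(self, body, handler):
        """Register a context, run body(), and route errors to handler(msg)."""
        self._depth += 1
        try:
            return body()
        except _Unwind as unwind:
            return handler(str(unwind))
        finally:
            self._depth -= 1

    def signal_error(self, message):
        """Jump to the innermost context, or fail hard if there is none."""
        if self._depth == 0:
            raise FatalError("no context to jump to: " + message)
        raise _Unwind(message)


contexts = ContextStack()


def api_call():
    # Any call into the (toy) API may signal an error.
    contexts.signal_error("attempt to use zero-length variable name")


# With a context in place, the error is delivered to the handler.
print(contexts.run_protected(api_call, lambda msg: "handled: " + msg))

# Without one, there is nowhere to land.
try:
    api_call()
except FatalError as err:
    print("fatal:", err)
```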
Laurent Gautier
2019-Dec-14 22:29 UTC
[Rd] Inconsistent behavior for the C AP's R_ParseVector() ?
Hi Simon,

Widespread errors would have caught my attention earlier, as that code uses only one initialization of the embedded R, is used quite a bit, and is covered by quite a few unit tests. This is the only situation I am aware of in which an error occurs.

What is a "correct context", or initial context, that the code should be called from? Searching for "context" in the R-exts manual does not return much.

Best,

Laurent