Hi all,

When I run the following code, R segfaults:

text <- "?"
srcfile <- srcfilecopy("test.r", text)
parse(textConnection(text), srcfile = srcfile)

It doesn't segfault if text is ASCII, or it's not wrapped in
textConnection, or srcfile isn't set.

Hadley
-- 
http://hadley.nz
On 2024-05-28 1:35 p.m., Hadley Wickham wrote:
> Hi all,
>
> When I run the following code, R segfaults:
>
> text <- "?"
> srcfile <- srcfilecopy("test.r", text)
> parse(textConnection(text), srcfile = srcfile)
>
> It doesn't segfault if text is ASCII, or it's not wrapped in
> textConnection, or srcfile isn't set.

I also see the segfault on

R version 4.4.0 (2024-04-24) -- "Puppy Cup"
Copyright (C) 2024 The R Foundation for Statistical Computing
Platform: aarch64-apple-darwin20

Apple shows me this stack trace:

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libsystem_platform.dylib  0x189364904 _platform_strlen + 4
1   libR.dylib                0x10380a954 Rf_mkChar + 20 (envir.c:4076)
2   libR.dylib                0x10385e3ac finalizeData + 1516
3   libR.dylib                0x10385d6dc R_Parse + 924 (gram.c:4215)
4   libR.dylib                0x1038f4a6c do_parse + 1260 (source.c:294)
5   libR.dylib                0x10383ac4c bcEval_loop + 40204 (eval.c:8141)
6   libR.dylib                0x10382356c bcEval + 684 (eval.c:7524)
7   libR.dylib                0x103822c6c Rf_eval + 556 (eval.c:1167)
8   libR.dylib                0x10382582c R_execClosure + 812 (eval.c:2398)
9   libR.dylib                0x103824924 applyClosure_core + 164 (eval.c:2311)
10  libR.dylib                0x103822f08 Rf_applyClosure + 20 (eval.c:2333) [inlined]
11  libR.dylib                0x103822f08 Rf_eval + 1224 (eval.c:1285)
12  libR.dylib                0x10387f8f8 R_ReplDLLdo1 + 440 (main.c:398)
13  R                         0x102d22fa0 run_REngineRmainloop + 260
14  R                         0x102d1a64c -[REngine runREPL] + 124
15  R                         0x102d0dd90 main + 588
16  dyld                      0x188fae0e0 start + 2360

Duncan Murdoch
On 5/28/24 19:35, Hadley Wickham wrote:
> Hi all,
>
> When I run the following code, R segfaults:
>
> text <- "?"
> srcfile <- srcfilecopy("test.r", text)
> parse(textConnection(text), srcfile = srcfile)
>
> It doesn't segfault if text is ASCII, or it's not wrapped in
> textConnection, or srcfile isn't set.

Thanks, this is because the R parser doesn't support non-ASCII UTF-8
outside string literals and comments, plus a missing bounds check. The
"correct" result should be an R error, which I get in a debug build.

The tokenizer ends up with a negative token, and then when the parse
data are being finalized, creating a table of token names, there is an
out-of-bounds access (yytname array). Probably the check should go
right away into the tokenizer.

Tomas

>
> Hadley
>
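
Going by the original report (no segfault when the text is not wrapped in textConnection), one possible stopgap is to hand the text to parse() through its `text =` argument and catch the error. This is only a minimal sketch, not something proposed in the thread: `parse_with_srcfile` is a hypothetical helper name, and the `"\u00d7"` test character is an assumed example of non-ASCII input, not necessarily the character from the report.

```r
# Hypothetical helper (not from the thread): parse `text` with a srcfile
# attached, but pass it via `text =` instead of a textConnection, which
# per the original report avoids the segfault.
parse_with_srcfile <- function(text, filename = "test.r") {
  srcfile <- srcfilecopy(filename, text)
  tryCatch(
    parse(text = text, srcfile = srcfile, keep.source = TRUE),
    # Return the condition object instead of stopping, so the caller
    # can inspect it.
    error = function(e) e
  )
}

# ASCII input parses as usual.
ok <- parse_with_srcfile("x <- 1 + 2")
stopifnot(is.expression(ok), length(ok) == 1L)

# Assumed example of non-ASCII input outside a string literal. The
# expectation, based on the thread, is an ordinary R error rather than
# a crash; the result is only printed here, not asserted.
bad <- parse_with_srcfile("\u00d7")
print(inherits(bad, "error"))
```

This keeps the srcfile attached, so source references are still available for input that parses.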