On Sun, Sep 18, 2016 at 12:34 AM, Peter Langfelder <peter.langfelder at gmail.com> wrote:
> On Sat, Sep 17, 2016 at 2:12 PM, David Winsemius <dwinsemius at comcast.net> wrote:
> > Not entirely clear. If you were intending to just get character output
> > then you could just use:
> >
> > strsplit(txt, ";")
>
> You would want to avoid splitting within character strings
> (print(";")) and in comments (print(2); ls() # This prints 2; then
> lists...) The comment char could also appear in a character string,
> where it does not mean the start of a comment...

Yes, that would be the problem.
Returning to my original post, modifying the example:

x <- "print(2); bar <- \"don't ; use semicolons\"; foo <- '3;4'; ls("

This should result in a character vector of length 4:
[1] "print(2)"                            "bar <- \"don't ; use semicolons\""
[3] "foo <- '3;4'"                        "ls("

even though the last command would cause an error using parse(text = x).

Perhaps this is not that important (I am trying to simulate a normal R
console); I could parse only when the input is syntactically correct.
I was merely curious whether this could be done, likely using regular
expressions (surely strsplit doesn't solve it).

Best,
Adrian

--
Adrian Dusa
University of Bucharest
Romanian Social Data Archive
Soseaua Panduri nr.90
050663 Bucharest sector 5
Romania
On 19/09/2016 7:59 AM, Adrian Dușa wrote:
> Returning to my original post, modifying the example:
>
> x <- "print(2); bar <- \"don't ; use semicolons\"; foo <- '3;4'; ls("
>
> This should result in a character vector of length 4:
> [1] "print(2)"                            "bar <- \"don't ; use semicolons\""
> [3] "foo <- '3;4'"                        "ls("
>
> even though the last command would cause an error using parse(text = x)
>
> I was merely curious if this could be done, likely using regular
> expressions (surely strsplit doesn't solve it).

See the section on "partial parsing" in the ?parse help page.

Duncan Murdoch
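For readers unfamiliar with partial parsing, here is a minimal sketch of what it provides, assuming a single-line input. The text in `txt` and the variable names are illustrative only. When `parse()` is given `text =` together with a `srcfile` object, it still signals an error on bad input, but the tokens read up to that point stay available through `getParseData()`.

```r
# Illustrative input: the last statement is deliberately incomplete
txt <- "print(2); x <- 'a;b'; ls("

# A srcfile object gives parse() a place to store partial parse data
sf <- srcfile("txt")

# parse() signals an error here; tryCatch() returns the condition object
res <- tryCatch(parse(text = txt, srcfile = sf), error = identity)

# Token table for the portion of the text that was read
pd <- getParseData(sf)

# Terminal tokens only: the ';' inside 'a;b' is part of a string token,
# so it does not appear as a separate "';'" token
print(pd[pd$terminal, c("col1", "col2", "token", "text")])
```

The `token` column is what separates a statement-separating semicolon from one that merely occurs inside a string constant.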
Oh yes, completely forgot about partial parsing.

One possible (quick) solution:

txt <- "print(2); bar <- \"don't ; use semicolons\"; foo <- '3;4'; ls("
# srcfile object in which parse() stores the partial parse data
sf <- srcfile("txt")
# the incomplete ls( makes parse() fail; keep the error instead of stopping
tryit <- tryCatch(parse(text = txt, srcfile = sf), error = identity)
gpd <- getParseData(sf)
# columns of the ';' tokens, padded with sentinels at both ends
pos <- c(0, gpd$col1[gpd$token == "';'"], nchar(txt) + 1)
final <- c()
for (i in seq(length(pos) - 1)) {
    final <- c(final, substr(txt, pos[i] + 1, pos[i + 1] - 1))
}
final

Which outputs:
[1] "print(2)"                             " bar <- \"don't ; use semicolons\""
[3] " foo <- '3;4'"                        " ls("

Excellent, thanks very much,
Adrian

On Mon, Sep 19, 2016 at 3:19 PM, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
> See the section on "partial parsing" in the ?parse help page.
>
> Duncan Murdoch
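The partial-parsing approach in this thread can be wrapped in a small helper. The following is only a sketch under stated assumptions: the function name `split_statements` is hypothetical (it does not appear in the thread or in base R), the input is assumed to be a single line (the `col1` values are column positions within a line), and `trimws()` is added to drop the leading spaces left after each semicolon.

```r
# Hypothetical helper; name and interface are illustrative, not from the thread
split_statements <- function(txt) {
  # Assumption: one string, no embedded newlines (col1 is a within-line column)
  stopifnot(is.character(txt), length(txt) == 1L,
            !grepl("\n", txt, fixed = TRUE))

  sf <- srcfile("txt")
  # An error is expected for incomplete input; the tokens read so far
  # are still stored in sf
  tryCatch(parse(text = txt, srcfile = sf), error = identity)
  pd <- getParseData(sf)

  # Columns of the ';' tokens that separate statements
  semi <- if (is.null(pd)) integer(0) else pd$col1[pd$token == "';'"]

  # Cut the text between consecutive separators and trim whitespace
  starts <- c(1L, semi + 1L)
  ends <- c(semi - 1L, nchar(txt))
  trimws(substring(txt, starts, ends))
}

x <- "print(2); bar <- \"don't ; use semicolons\"; foo <- '3;4'; ls("
split_statements(x)
```

For the example from the thread this should return the four-element vector requested originally, without the leading spaces. One limitation to keep in mind: the parse data stops where the parser gives up, so semicolons located after a syntax error are not seen and the remaining text comes back as a single piece.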