On Sun, Sep 18, 2016 at 12:34 AM, Peter Langfelder <peter.langfelder at gmail.com> wrote:

> On Sat, Sep 17, 2016 at 2:12 PM, David Winsemius <dwinsemius at comcast.net> wrote:
> > Not entirely clear. If you were intending to just get character output
> > then you could just use:
> >
> > strsplit(txt, ";")
>
> You would want to avoid splitting within character strings
> (print(";")) and in comments (print(2); ls() # This prints 2; then
> lists...) The comment char could also appear in a character string,
> where it does not mean the start of a comment...

Yes, that would be the problem.
Returning to my original post, modifying the example:

x <- "print(2); bar <- \"don't ; use semicolons\"; foo <- '3;4'; ls("

This should result in a character vector of length 4:

[1] "print(2)"                          "bar <- \"don't ; use semicolons\""
[3] "foo <- '3;4'"                      "ls("

even though the last command would cause an error using parse(text = x).

Perhaps this is not that important (I am trying to simulate a normal R
console), and parse only if it is syntactically correct.
I was merely curious whether this could be done, likely using regular
expressions (surely strsplit doesn't solve it).

Best,
Adrian

--
Adrian Dusa
University of Bucharest
Romanian Social Data Archive
Soseaua Panduri nr.90
050663 Bucharest sector 5
Romania

On 19/09/2016 7:59 AM, Adrian Dușa wrote:

> [...]
> I was merely curious if this could be done, likely using regular
> expressions (surely strsplit doesn't solve it).

See the section on "partial parsing" in the ?parse help page.

Duncan Murdoch

Oh yes, completely forgot about partial parsing. One possible (quick)
solution:
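
txt <- "print(2); bar <- \"don't ; use semicolons\"; foo <- '3;4'; ls("

# parse() fails on the incomplete ls(, but the partial parse data is kept in sf
sf <- srcfile("txt")
tryit <- tryCatch(parse(text = txt, srcfile = sf), error = identity)
gpd <- getParseData(sf)

# columns of the ';' tokens seen by the parser (not those inside strings
# or comments), padded with sentinels at both ends
pos <- c(0, gpd$col1[gpd$token == "';'"], nchar(txt) + 1)

final <- c()
for (i in seq(length(pos) - 1)) {
    final <- c(final, substr(txt, pos[i] + 1, pos[i + 1] - 1))
}
final
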
Which outputs:

[1] "print(2)"                           " bar <- \"don't ; use semicolons\""
[3] " foo <- '3;4'"                      " ls("

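
The same idea can be wrapped in a small function that also strips the leading spaces, so that the result matches the vector asked for in the original post. This is only a sketch: split_commands is a made-up name, not something from base R, and it assumes a single line of ASCII input with no tabs, because col1 is a column position within a line.

```r
# Hypothetical helper (name and interface are assumptions, not from the thread).
# Splits one line of R code at the ';' tokens reported by the parser.
split_commands <- function(txt) {
    # col1 is a within-line column, so restrict to one line without tabs
    stopifnot(is.character(txt), length(txt) == 1L, !grepl("[\n\t]", txt))

    sf <- srcfile("txt")
    # a syntax error is expected for incomplete input; the partial
    # parse data stays attached to sf
    tryCatch(parse(text = txt, srcfile = sf), error = identity)
    gpd <- getParseData(sf)

    # columns of ';' tokens; semicolons inside strings or comments are
    # part of other tokens and therefore not listed here
    semis <- gpd$col1[gpd$token == "';'"]

    starts <- c(1, semis + 1)
    ends <- c(semis - 1, nchar(txt))
    trimws(substring(txt, starts, ends))
}

x <- "print(2); bar <- \"don't ; use semicolons\"; foo <- '3;4'; ls("
res <- split_commands(x)
print(res)
```

substring() is vectorised over the start and end positions, so no loop is needed. Note that a ';' inside braces, as in {a; b}, is still a ';' token and would be split on as well.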
Excellent, thanks very much,
Adrian

On Mon, Sep 19, 2016 at 3:19 PM, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:

> See the section on "partial parsing" in the ?parse help page.
>
> Duncan Murdoch