thr3ads.net - R devel - [Rd] Changes to parser in R-devel [Jul 2012]

Duncan Murdoch

2012-Jul-18 18:31 UTC

[Rd] Changes to parser in R-devel

I have just committed (in r59883) some changes to the R parser based on 
Romain Francois' parser package.  Packages that made use of parser will 
hopefully find that the information in base R gives them what they need 
to work with, but the data is not identical to what parser recorded
(since it was not consistent with some things already in R).  One
reason for the change was that the parser in the parser package was
slightly different from the one in R; the hope is
that by providing the services in R, it will make maintenance easier for 
things like code analysis, pretty printing, etc.

See ?getParseData for details, and if you are maintaining a package that 
depends on parser, feel free to ask me for help in the transition, or 
make suggestions for changes if I've done something that causes you too 
much trouble.
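
For readers who have not used the new function yet, here is a minimal sketch of a call to getParseData(). It assumes a current R in which parse() accepts keep.source = TRUE (the interface in r59883 itself may differ); the sample expression is only illustrative.

```r
# Keep source references, otherwise no parse data is attached to the result.
exprs <- parse(text = "y <- f(x + 1)", keep.source = TRUE)

# One row per token or expression node, with positions, ids and parent ids.
pd <- utils::getParseData(exprs)

# Keep only terminal tokens (the leaves of the parse tree), in source order.
terminals <- pd[pd$terminal, c("line1", "col1", "token", "text")]
terminals <- terminals[order(terminals$line1, terminals$col1), ]
print(terminals)
```

Non-terminal rows have token "expr", and their parent column links the rows into a tree, with parent 0 marking a top-level expression.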

Duncan Murdoch

P.S. to Qiang Li:  as mentioned privately, the goal for this change was 
to reproduce output equivalent to what parser did, so I have not 
incorporated your suggested change to outlaw expressions like "x[[1] ]"
(with an embedded space where it shouldn't be).  After things settle 
down we can consider that change and others.

Yihui Xie

2012-Jul-19 20:41 UTC

I'm not sure if there is a bug somewhere; see this example:

getParseData(parse(text='function(x){}'))

  line1 col1 line2 col2 id parent          token terminal     text
1     1    1     1    8  1     11       FUNCTION     TRUE function
2     1    9     1    9  2     11            '('     TRUE        (
3     1   10     1   10  3      5 SYMBOL_FORMALS     TRUE        x
4     1   11     1   11  4     11            ')'     TRUE        )
5     1   12     1   12  6      8            '{'     TRUE        {
6     1   13     1   13  7      8            '}'     TRUE        }
7     1   12     1   12  5     11            '}'     TRUE        {
8     1   12     1   13  8     11           expr    FALSE
9     1    1     1   13 11      0           expr    FALSE

I get an additional { in the 7th row of the 'text' column.
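
A rough way to test for this kind of inconsistency automatically is sketched below. It assumes that literal tokens are reported in single quotes (as in the output above) and that parse() accepts keep.source = TRUE; the variable names are only illustrative.

```r
pd <- utils::getParseData(parse(text = "function(x){}", keep.source = TRUE))

# Literal tokens are reported in single quotes, e.g. "'{'" or "'('".
is_literal <- pd$terminal & grepl("^'.*'$", pd$token)
expected_text <- gsub("^'|'$", "", pd$token[is_literal])

# Rows whose text disagrees with the quoted token name.
literal_rows <- pd[is_literal, , drop = FALSE]
mismatch <- literal_rows[expected_text != literal_rows$text, , drop = FALSE]

# Structural checks: ids are unique and every parent is 0 or a known id.
ids_unique <- anyDuplicated(pd$id) == 0L
parents_known <- all(pd$parent %in% c(0L, pd$id))

print(mismatch)
```

On a build without the problem, mismatch should have no rows and both structural checks should be TRUE.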

Another problem is that for this empty function below, there will be
an obvious pause if you run it more than once:

getParseData(parse(text='function(){}'))

and you may get wild line/col numbers like this:

   line1 col1     line2 col2 id parent    token terminal     text
1      1    1         1    8  1      9 FUNCTION     TRUE function
2      1    9         1    9  2      9      '('     TRUE        (
3      1   10         1   10  3      9      ')'     TRUE        )
4      1   11         1   11  4      6      '{'     TRUE        {
5      1   12         1   12  5      6      '}'     TRUE        }
6 320024   11 140106360   11 11      9      '}'     TRUE
7      1   11         1   12  6      9     expr    FALSE
8      1    1         1   12  9     11     expr    FALSE
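
Since the bad values only show up on repeated runs, a small loop helps to catch them. The sketch below uses a hypothetical helper, positions_in_bounds(), and assumes plain ASCII source without tabs so that column numbers can be compared with nchar().

```r
# Hypothetical helper: parse `src` several times and check that every
# reported line/column lies inside the source text.
positions_in_bounds <- function(src, runs = 20L) {
  src_lines <- strsplit(src, "\n", fixed = TRUE)[[1]]
  n_lines <- length(src_lines)
  width <- max(nchar(src_lines))
  ok <- logical(runs)
  for (i in seq_len(runs)) {
    pd <- utils::getParseData(parse(text = src, keep.source = TRUE))
    ok[i] <- all(pd$line1 >= 1L, pd$line2 <= n_lines,
                 pd$line1 <= pd$line2,
                 pd$col1 >= 1L, pd$col2 <= width)
  }
  ok
}

result <- positions_in_bounds("function(){}")
print(all(result))
```

Any FALSE in result would point to out-of-range values like the ones in row 6 above.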

What is worse, it can crash R:

 *** caught segfault ***
address 0x9488c20, cause 'memory not mapped'

Traceback:
 1: parse(text = "function(){}")
 2: getSrcref(x)
 3: getSrcfile(x)
 4: getParseData(parse(text = "function(){}"))

> sessionInfo()
R Under development (unstable) (2012-07-18 r59904)
Platform: i686-pc-linux-gnu (32-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base


Regards,
Yihui
--
Yihui Xie <xieyihui at gmail.com>
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA


On Wed, Jul 18, 2012 at 2:31 PM, Duncan Murdoch
<murdoch.duncan at gmail.com> wrote:
> I have just committed (in r59883) some changes to the R parser based on
> Romain Francois' parser package.  Packages that made use of parser will
> hopefully find that the information in base R gives them what they need to
> work with, but the data is not identical to
> what parser recorded (since it was not consistent with some things already
> in R).  One reason for the change was that the parser in the parser package
> was slightly different than the one in R; the hope is that by providing the
> services in R, it will make maintenance easier for things like code
> analysis, pretty printing, etc.
>
> See ?getParseData for details, and if you are maintaining a package that
> depends on parser, feel free to ask me for help in the transition, or make
> suggestions for changes if I've done something that causes you too much
> trouble.
>
> Duncan Murdoch
>
> P.S. to Qiang Li:  as mentioned privately, the goal for this change was to
> reproduce output equivalent to what parser did, so I have not incorporated
> your suggested change to outlaw expressions like "x[[1] ]" (with an
> embedded space where it shouldn't be).  After things settle down we can
> consider that change and others.
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
