Daniel Chen
2019-Jul-12 18:53 UTC
[Rd] Unexpected behaviour when comparing (==) long quoted expressions
Hi everyone:

I'm one of the interns at RStudio this summer working on a project that helps teachers grade student code. I found an unexpected behaviour with the == operator when comparing quote()d expressions.

Example 1:

    u <- quote(tidyr::gather(key = key, value = value, new_sp_m014:newrel_f65, na.rm = TRUE))
    s <- quote(tidyr::gather(key = key, value = value, new_sp_m014:newrel_f65, na.rm = FALSE))
    u == s # TRUE

    u <- quote(tidyr::gather(key = key, value = value, na.rm = TRUE))
    s <- quote(tidyr::gather(key = key, value = value, na.rm = FALSE))
    u == s # FALSE

Example 2:

    u <- quote(f(x123456789012345678901234567890123456789012345678901234567890, 1))
    s <- quote(f(x123456789012345678901234567890123456789012345678901234567890, 2))
    u == s
    #> [1] TRUE

Winston Chang pointed out that the help page for == says:

    Language objects such as symbols and calls are deparsed to character strings before comparison.

and that the source code that does the comparison [1] deparses each language object and then extracts only the first element of the resulting character vector:

    SET_STRING_ELT(tmp, 0, (iS) ? PRINTNAME(x) : STRING_ELT(deparse1(x, 0, DEFAULTDEPARSE), 0));

Is this something that needs to be fixed in the == documentation, or is it an actual bug in the operator?

For more context, the original issue we had is here: https://github.com/rstudio-education/grader/issues/28

Workaround:

You can get around this issue by using all.equal or identical:

    u <- quote(tidyr::gather(key = key, value = value, new_sp_m014:newrel_f65, na.rm = TRUE))
    s <- quote(tidyr::gather(key = key, value = value, new_sp_m014:newrel_f65, na.rm = FALSE))
    u == s # TRUE
    all.equal(u, s) # "target, current do not match when deparsed"
    identical(u, s) # FALSE

Thanks,

Dan

[1] https://github.com/wch/r-source/blob/e647f78cb85282263f88ea30c6337b77a30743d9/src/main/relop.c#L140-L155
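The truncation Winston describes can be seen from R itself by looking at what deparse() returns. The following is only a minimal sketch, not the code path that == runs in C: it assumes deparse()'s default width.cutoff of 60, under which the long call from Example 2 is split into more than one string, so the first pieces of the two results coincide even though the calls differ.

```r
# Example 2 from the message above: two calls that differ only in the last argument.
u <- quote(f(x123456789012345678901234567890123456789012345678901234567890, 1))
s <- quote(f(x123456789012345678901234567890123456789012345678901234567890, 2))

# deparse() breaks long calls into several strings (default width.cutoff = 60).
du <- deparse(u)
ds <- deparse(s)

spans_several_strings <- length(du) > 1          # the call does not fit in one string
first_pieces_match    <- identical(du[1], ds[1]) # only the leading piece is compared here
full_text_matches     <- identical(du, ds)       # all pieces taken together

print(du)
print(c(spans_several_strings, first_pieces_match, full_text_matches))
```

Comparing only du[1] with ds[1] mirrors the "first element" extraction in the C snippet, which is why the differing final argument goes unnoticed.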
Clark Fitzgerald
2019-Jul-15 17:12 UTC
[Rd] Unexpected behaviour when comparing (==) long quoted expressions
Hi Dan,

I wouldn't expect that behavior out of `==` on language objects either.

On a related note, working with R's language objects directly can be clumsy. That was one of the motivations for Nick Ulle to develop the rstatic package.

https://github.com/nick-ulle/rstatic

It lets me write code that's easier to read and reason about, compared to using the standard language objects. It behaves as you hoped here:

> s <- quote(f(x123456789012345678901234567890123456789012345678901234567890, 1))
> u <- quote(f(x123456789012345678901234567890123456789012345678901234567890, 2))
> s == u
[1] TRUE
> s1 = rstatic::to_ast(s)
> u1 = rstatic::to_ast(u)
> s1 == u1
[1] FALSE

Best,
Clark
Martin Maechler
2019-Jul-16 08:35 UTC
[Rd] Unexpected behaviour when comparing (==) long quoted expressions
>>>>> Daniel Chen
>>>>>     on Fri, 12 Jul 2019 13:53:21 -0500 writes:

> Hi everyone:
> I'm one of the interns at RStudio this summer working on a project that
> helps teachers grade student code. I found an unexpected behaviour with
> the |==| operator when comparing |quote|d expressions.

> Example 1:

> |u <- quote(tidyr::gather(key = key, value = value,
> new_sp_m014:newrel_f65, na.rm = TRUE)) s <- quote(tidyr::gather(key =
> key, value = value, new_sp_m014:newrel_f65, na.rm = FALSE)) u == s #
> TRUE u <- quote(tidyr::gather(key = key, value = value, na.rm = TRUE)) s
> <- quote(tidyr::gather(key = key, value = value, na.rm = FALSE)) u == s
> # FALSE |

Unfortunately the above is almost unreadable, as you "forgot" to
click (in the lower right corner of your Gmail interface with
the three vertical dots) "plain text mode".

> Example 2:

> |u <-
> quote(f(x123456789012345678901234567890123456789012345678901234567890,
> 1)) s <-
> quote(f(x123456789012345678901234567890123456789012345678901234567890,
> 2)) u == s #> [1] TRUE |

this is even readable after html - de-html mangling

> Winston Chang pointed out in the help page for |==|:

> Language objects such as symbols and calls are deparsed to character
> strings before comparison.

> and in the source code that does the comparison [1] shows that It
> deparses each language object and then only extracts the first element
> from the resulting character vector:

> |SET_STRING_ELT(tmp, 0, (iS) ? PRINTNAME(x) : STRING_ELT(deparse1(x, 0,
> DEFAULTDEPARSE), 0)); |

> Is this a fix that needs to happen within the |==| documentation? or an
> actual bug with the operator?

This is a good question.

Thank you, Daniel, for providing the link to the source code in
<R>/src/main/relop.c .

Looking at that and its context, I think we (R core) should
reconsider that implementation of '==' which indeed does about
the same thing as deparse {which also truncates at some point by
default; something very very reasonable for error messages, but
undesirable in other cases}.

But I think it's a fair expectation that comparing calls ["language"]
with '==' should compare the full call's syntax even if that may
occasionally be very long.

Martin
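One way to picture a comparison of "the full call's syntax" is to collapse every piece returned by deparse() before comparing. The helper name same_deparsed below is hypothetical and only illustrates the idea at the R level; it is a sketch, not how the operator is or would be implemented in relop.c.

```r
# Hypothetical helper (not part of R): compare two language objects by their
# complete deparsed text rather than by the first deparsed string only.
same_deparsed <- function(a, b) {
  full <- function(x) paste(deparse(x), collapse = "\n")
  identical(full(a), full(b))
}

# Example 1 from the thread: the calls differ only in na.rm.
u <- quote(tidyr::gather(key = key, value = value, new_sp_m014:newrel_f65, na.rm = TRUE))
s <- quote(tidyr::gather(key = key, value = value, new_sp_m014:newrel_f65, na.rm = FALSE))

differs_from_other <- !same_deparsed(u, s)  # the trailing argument is now seen
equals_itself      <- same_deparsed(u, u)

print(c(differs_from_other, equals_itself))
```

Because quote() never evaluates its argument, the example runs without tidyr being installed.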
Daniel Chen
2019-Jul-16 09:11 UTC
[Rd] Unexpected behaviour when comparing (==) long quoted expressions
Hi Martin:

Yes, I totally made things worse (and blundered my first listserv post) when things got converted from markdown...

For posterity (and clarity), I've reproduced the examples that show the unexpected behaviour and the current workaround we've used.

Example 1: expected u == s to return FALSE, but it returns TRUE instead.

    u <- quote(tidyr::gather(key = key, value = value, new_sp_m014:newrel_f65, na.rm = TRUE))
    s <- quote(tidyr::gather(key = key, value = value, new_sp_m014:newrel_f65, na.rm = FALSE))
    u == s # TRUE

    u <- quote(tidyr::gather(key = key, value = value, na.rm = TRUE))
    s <- quote(tidyr::gather(key = key, value = value, na.rm = FALSE))
    u == s # FALSE

Example 2: seems to have more to do with the length of the quoted expression than with the function arguments.

    u <- quote(f(x123456789012345678901234567890123456789012345678901234567890, 1))
    s <- quote(f(x123456789012345678901234567890123456789012345678901234567890, 2))
    u == s
    #> [1] TRUE

Workaround: You can get around this issue by using all.equal or identical

    u <- quote(tidyr::gather(key = key, value = value, new_sp_m014:newrel_f65, na.rm = TRUE))
    s <- quote(tidyr::gather(key = key, value = value, new_sp_m014:newrel_f65, na.rm = FALSE))
    u == s # TRUE
    all.equal(u, s) # "target, current do not match when deparsed"
    identical(u, s) # FALSE

The snippet for the implementation shows that only the first element of the character vector returned by deparse is used (https://github.com/wch/r-source/blob/e647f78cb85282263f88ea30c6337b77a30743d9/src/main/relop.c#L140-L155):

    SET_STRING_ELT(tmp, 0, (iS) ? PRINTNAME(x) :
                   STRING_ELT(deparse1(x, 0, DEFAULTDEPARSE), 0));

- Dan
Tierney, Luke
2019-Jul-16 12:05 UTC
[Rd] [External] Re: Unexpected behaviour when comparing (==) long quoted expressions
On Tue, 16 Jul 2019, Martin Maechler wrote:

> Looking at that and its context, I think we (R core) should
> reconsider that implementation of '==' which indeed does about
> the same thing as deparse {which also truncates at some point by
> default; something very very reasonable for error messages, but
> undesirable in other cases}.
>
> But I think it's a fair expectation that comparing calls ["language"]
> with '==' should compare the full call's syntax even if that may
> occasionally be very long.

Before going there I think we should reconsider whether allowing ==
comparisons on calls is a good idea. We already don't allow it for
expression() objects. It is probably unavoidable to allow symbols
(there are probably lots of things that would break if quote(x) == "x"
did not work and return TRUE), but for calls it makes little sense to
have "f(x)" == quote(f(x)). These are very different objects that
happen to have identical string representations.

For computing on the language, identical() is the right way to go (that
is what is used in the byte code compiler and codetools).

Best,

luke

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney at uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
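Luke's distinction between a call and a string that happens to print the same way can be checked with identical(). This is a small sketch using made-up objects (cl, txt, and the function names f and g are illustrative only); it deliberately avoids == on calls, since that is the behaviour under discussion.

```r
cl  <- quote(f(x))   # a call (language object)
txt <- "f(x)"        # a character string

cl_is_call        <- is.call(cl)
txt_is_character  <- is.character(txt)
same_string_repr  <- identical(deparse(cl), txt)  # the deparsed call is the same text
same_object       <- identical(cl, txt)           # the objects themselves are not the same

# identical() compares the calls themselves, whatever their deparsed length.
u <- quote(g(a, 1))
s <- quote(g(a, 2))
different_calls <- !identical(u, s)
equal_calls     <- identical(u, quote(g(a, 1)))

print(c(cl_is_call, txt_is_character, same_string_repr, same_object,
        different_calls, equal_calls))
```

For code that computes on the language, such as a grading tool comparing a student's call with a reference call, this is the kind of check identical() provides without any detour through character strings.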