Can someone verify whether or not this is a bug?

When I substitute all occurrences of "\\B" with "X", R seems to
correctly place an X at all non-word boundaries (whether or not I
specify perl), but "\\b" does not seem to act on all complement
positions:

> gsub("\\b", "X", "abc def") # nothing done
[1] "abc def"
> gsub("\\B", "X", "abc def") # as expected, I think
[1] "aXbXc dXeXf"
> gsub("\\b", "X", "abc def", perl = TRUE) # not as expected
[1] "abc Xdef"
> gsub("\\B", "X", "abc def", perl = TRUE) # as expected
[1] "aXbXc dXeXf"
> R.version.string # Windows 2000
[1] "R version 2.0.1, 2004-11-27"

>>>>> "Gabor" == Gabor Grothendieck <ggrothendieck@myway.com>
>>>>>     on Wed, 1 Dec 2004 21:05:59 -0500 (EST) writes:

    Gabor> Can someone verify whether or not this is a bug?

    Gabor> When I substitute all occurrences of "\\B" with "X", R
    Gabor> seems to correctly place an X at all non-word
    Gabor> boundaries (whether or not I specify perl), but "\\b"
    Gabor> does not seem to act on all complement positions:

    >> gsub("\\b", "X", "abc def") # nothing done
    Gabor> [1] "abc def"
    >> gsub("\\B", "X", "abc def") # as expected, I think
    Gabor> [1] "aXbXc dXeXf"
    >> gsub("\\b", "X", "abc def", perl = TRUE) # not as expected
    Gabor> [1] "abc Xdef"
    >> gsub("\\B", "X", "abc def", perl = TRUE) # as expected
    Gabor> [1] "aXbXc dXeXf"
    >> R.version.string # Windows 2000
    Gabor> [1] "R version 2.0.1, 2004-11-27"

I agree this looks "unfortunate".

Just to confirm:

1) I get the same on a Linux version.
2) The real perl does behave differently, and as you (and I) would
   have expected:

   $ echo 'abc def'| perl -pe 's/\b/X/g'
   XabcX XdefX
   $ echo 'abc def'| perl -pe 's/\B/X/g'
   aXbXc dXeXf

Also, from what I see, "\b" should behave the same independently of
perl = TRUE or FALSE.

--
Martin

Gabor Grothendieck <ggrothendieck <at> myway.com> writes:

:
: Can someone verify whether or not this is a bug?
:
: When I substitute all occurrences of "\\B" with "X",
: R seems to correctly place an X at all non-word boundaries
: (whether or not I specify perl), but "\\b" does not seem to
: act on all complement positions:
:
: > gsub("\\b", "X", "abc def") # nothing done
: [1] "abc def"
: > gsub("\\B", "X", "abc def") # as expected, I think
: [1] "aXbXc dXeXf"
: > gsub("\\b", "X", "abc def", perl = TRUE) # not as expected
: [1] "abc Xdef"
: > gsub("\\B", "X", "abc def", perl = TRUE) # as expected
: [1] "aXbXc dXeXf"
: > R.version.string # Windows 2000
: [1] "R version 2.0.1, 2004-11-27"

I have found another, possibly related, problem. In the above, \\B
always worked as expected but \\b did not. I have an example where
\\B does not work as expected either.

Note that in the first example below, every letter that is not first
in its word gets prefaced with X, as expected. In the second example,
only alternate letters that are not first in their word get replaced
with X, whereas one would have expected all letters that are not
first in their word to be replaced with X.

R> gsub("\\B", "X", "The Quick Brown Fox") # works as expected
[1] "TXhXe QXuXiXcXk BXrXoXwXn FXoXx"
R> gsub("\\B.", "X", "The Quick Brown Fox", perl = TRUE) # problem
[1] "TXe QXiXk BXoXn FXx"
R> R.version.string # Windows XP
[1] "R version 2.0.1, 2004-11-04"

By the way, do I have to submit a second bug report for this, or is
it possible to add this to the previous one as a comment?