karl at huftis.org
2009-Dec-10 09:20 UTC
[Rd] R on Windows crashes when using certain characters in strings in data frames (PR#14125)
Full_Name: Karl Ove Hufthammer
Version: 2.10.0
OS: Windows XP
Submission from: (NULL) (93.124.134.66)

I have found a rather strange bug in R 2.10.0 on Windows, where the choice of characters used in a string makes R crash (i.e., Windows shows a dialogue saying that the application has a problem and must be closed).

I can reproduce the bug on two separate systems running Windows XP, and with both R 2.10.0 and the latest R 2.10.1 RC.

The following commands trigger the crash for me:

n=1e5
k=10
x=sample(k,n,replace=TRUE)
y=sample(k,n,replace=TRUE)
xy=paste(x,y,sep=" × ")
z=sample(n)
d=data.frame(xy,z)

The last step takes a very long time, and R crashes before it's finished. Note that if I reduce n, the problem disappears. Also, if I change the × (a multiplication symbol) to an x (a letter), the problem also disappears (and the last command takes almost no time to run).

I originally discovered this (or a related?) bug while using 'unique' on a data frame similar to the 'd' data frame defined above, where R would often, but not always, crash.

> sessionInfo()
R version 2.10.0 (2009-10-26)
i386-pc-mingw32

locale:
[1] LC_COLLATE=Norwegian-Nynorsk_Norway.1252
[2] LC_CTYPE=Norwegian-Nynorsk_Norway.1252
[3] LC_MONETARY=Norwegian-Nynorsk_Norway.1252
[4] LC_NUMERIC=C
[5] LC_TIME=Norwegian-Nynorsk_Norway.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base
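A minimal sketch of the reproduction above, written so that it does not depend on the encoding of the script file: the separator uses the escape `\u00D7` for the multiplication sign. The `set.seed()` call and the explicit `stringsAsFactors = TRUE` are additions for illustration and are not part of the original report. In R 2.10.0, `data.frame()` converted character columns to factors by default, and it is assumed here (not confirmed in the report) that this conversion is the slow step.

```r
# Reproduction sketch; set.seed() is an addition so that runs are repeatable.
set.seed(1)
n <- 1e5
k <- 10
x <- sample(k, n, replace = TRUE)
y <- sample(k, n, replace = TRUE)

# "\u00D7" is the multiplication sign; the escape avoids source-encoding issues.
xy <- paste(x, y, sep = " \u00D7 ")
z <- sample(n)

# stringsAsFactors = TRUE mimics the R 2.10.0 default (assumption: the
# factor conversion is the step that was slow and crashed in the report).
d <- data.frame(xy, z, stringsAsFactors = TRUE)

str(d)
```

On an unaffected R version the last step should finish almost immediately, as it does in the report when the separator is a plain 'x'.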
Karl Ove Hufthammer
2009-Dec-11 11:36 UTC
[Rd] R on Windows crashes when using certain characters in strings in data frames (PR#14125)
On Thu, 10 Dec 2009 10:20:09 +0100 (CET) karl at huftis.org <karl at huftis.org> wrote:

> The following commands trigger the crash for me:
>
> n=1e5
> k=10
> x=sample(k,n,replace=TRUE)
> y=sample(k,n,replace=TRUE)
> xy=paste(x,y,sep=" × ")
> z=sample(n)
> d=data.frame(xy,z)

Note: On the R Bug Tracking System Web site, the character causing the problem seems to be incorrectly displayed as a '.', though on the mailing list the correct character is used. The character should be the multiplication symbol, U+00D7, which looks similar to an 'x'. The character does exist in both ISO 8859-1 and Windows-1252.

-- 
Karl Ove Hufthammer
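The note about U+00D7 can be checked from R itself. The sketch below converts the character to the two encodings named above and inspects the bytes; the variable names are illustrative, and it assumes that the platform's iconv accepts the encoding names "latin1" and "CP1252", which is usual but not guaranteed everywhere.

```r
# Multiplication sign, forced to UTF-8 so the conversions below start
# from a known encoding.
mult <- enc2utf8("\u00D7")

code_point <- utf8ToInt(mult)   # Unicode code point as an integer

utf8_bytes   <- charToRaw(mult)
latin1_bytes <- charToRaw(iconv(mult, from = "UTF-8", to = "latin1"))
cp1252_bytes <- charToRaw(iconv(mult, from = "UTF-8", to = "CP1252"))

print(code_point)     # 215, i.e. 0xD7
print(utf8_bytes)     # two bytes: c3 97
print(latin1_bytes)   # one byte: d7
print(cp1252_bytes)   # one byte: d7
```

The character is a single byte, 0xD7, in both 8-bit encodings, so it is a non-ASCII byte in the Windows-1252 locale shown in the sessionInfo() output.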
Duncan Murdoch
2009-Dec-14 12:34 UTC
[Rd] R on Windows crashes when using certain characters in strings in data frames (PR#14125)
On 10/12/2009 4:20 AM, karl at huftis.org wrote:

> Full_Name: Karl Ove Hufthammer
> Version: 2.10.0
> OS: Windows XP
> Submission from: (NULL) (93.124.134.66)
>
>
> I have found a rather strange bug in R 2.10.0 on Windows, where the choice of
> characters used in a string makes R crash (i.e., Windows shows a dialogue saying
> that the application has a problem and must be closed).

This was related to encoding changes. It likely appeared Windows-specific because Windows uses a different default encoding than most Linux systems. I believe it is fixed now in R-devel, and it will soon make it into 2.10.1-patched, but it came too late to make it into today's release.

I believe PR#14114 was the same issue and is also fixed, but I did less testing of it. I'd appreciate it if those who saw either bug in real code test the patches. They should be in today's tarball of R-devel, and did make it into the Windows binary build of R-devel this morning.

Duncan

>
> I can reproduce the bug on two separate systems running Windows XP, and with
> both R 2.10.0 and the latest R 2.10.1 RC.
>
> The following commands trigger the crash for me:
>
> n=1e5
> k=10
> x=sample(k,n,replace=TRUE)
> y=sample(k,n,replace=TRUE)
> xy=paste(x,y,sep=" × ")
> z=sample(n)
> d=data.frame(xy,z)
>
> The last step takes a very long time, and R crashes before it's finished. Note
> that if I reduce n, the problem disappears. Also, if I change the × (a
> multiplication symbol) to an x (a letter), the problem also disappears (and the
> last command takes almost no time to run).
>
> I originally discovered this (or a related?) bug while using 'unique' on a data
> frame similar to the 'd' data frame defined above, where R would often, but not
> always, crash.
>
> > sessionInfo()
> R version 2.10.0 (2009-10-26)
> i386-pc-mingw32
>
> locale:
> [1] LC_COLLATE=Norwegian-Nynorsk_Norway.1252
> [2] LC_CTYPE=Norwegian-Nynorsk_Norway.1252
> [3] LC_MONETARY=Norwegian-Nynorsk_Norway.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=Norwegian-Nynorsk_Norway.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
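For anyone testing the patched builds, one possible check is to time the same steps with an ASCII separator and with the multiplication sign and compare. This is only a sketch: `time_sep()` is a hypothetical helper, not something from the thread or from R, and the use of `enc2native()` assumes that converting to the session's native encoding is a fair stand-in for typing the character at a Windows-1252 console.

```r
# Hypothetical helper (not from the thread): times data.frame() and unique()
# for strings built with the given separator.
time_sep <- function(sep, n = 1e5, k = 10) {
  x <- sample(k, n, replace = TRUE)
  y <- sample(k, n, replace = TRUE)
  xy <- paste(x, y, sep = sep)
  z <- sample(n)
  elapsed <- system.time({
    # stringsAsFactors = TRUE mimics the R 2.10.0 default.
    d <- data.frame(xy, z, stringsAsFactors = TRUE)
    u <- unique(d)
  })[["elapsed"]]
  list(elapsed = elapsed, rows = nrow(d), unique_rows = nrow(u))
}

set.seed(1)
res_ascii <- time_sep(" x ")
# enc2native() converts to the session's native encoding (assumption: this
# approximates the strings that triggered the report).
res_mult <- time_sep(enc2native(" \u00D7 "))

print(res_ascii$elapsed)
print(res_mult$elapsed)
```

On a fixed build both calls should complete, and the two timings should be of the same order; a large gap, or a crash in the second call, would suggest the problem is still present.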