g.russell at eos-solutions.com
2009-Dec-07 11:15 UTC
[Rd] Crash with Unicode and sub (PR#14114)
Full_Name: George Russell
Version: 2.10.0
OS: Windows XP Version 2002 SP 2
Submission from: (NULL) (217.111.3.131)

The following typed into R --vanilla induces a crash:

-- cut here --
gctorture()
u <- intToUtf8(c(rep(1e3,1e2),32,c(rep(1e3,1e2))))
v <- rep(u,1e2)
v <- sub(" ","",v)
v %in% ""
-- cut here --

sessionInfo() says:

-- cut here --
R version 2.10.0 (2009-10-26)
i386-pc-mingw32

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252

attached base packages:
[1] stats graphics grDevices datasets utils methods base
-- cut here --

I apologise for not testing this with R-2.10.1 but as far as I can see there are only source releases available so far, which I am not able to compile.

Best wishes and thanks,
George Russell
g.russell at eos-solutions.com wrote:
> The following typed into R --vanilla induces a crash:
> [...]
> I apologise for not testing this with R-2.10.1 but as far as I can see there are
> only source releases available so far, which I am not able to compile.

2.10.1 RC is available now. Please check.

It does seem to be reproducible in the Windows version, or at least it takes a very long time, but that means running under Wine on SUSE for me. I don't see the effect with the Linux build.

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907
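For readers who want to see what the reproducer actually builds before running it under gctorture(), here is a minimal sketch in R. The helper name check_reproducer_strings is hypothetical and not part of the original report. It deliberately leaves out gctorture(), so it is not expected to trigger the crash by itself; it only checks that the strings have the shape the report implies.

```r
# Hypothetical helper (not from the original report): rebuild the strings
# from the reproducer WITHOUT gctorture() and check their basic properties.
check_reproducer_strings <- function(n_side = 100L, n_rep = 100L) {
  # Code point 1000 (U+03E8) repeated n_side times, a space (32), then again.
  u <- intToUtf8(c(rep(1e3, n_side), 32, rep(1e3, n_side)))
  v <- rep(u, n_rep)
  w <- sub(" ", "", v)   # remove the single space from each element

  list(
    chars_before = nchar(u, type = "chars"),          # characters incl. the space
    chars_after  = unique(nchar(w, type = "chars")),  # characters after sub()
    encoding     = Encoding(u),                       # declared encoding of u
    any_empty    = any(w %in% "")                     # the comparison from the report
  )
}

res <- check_reproducer_strings()
str(res)
```

If these values look sane on a given machine but the original snippet still crashes there, that points at memory handling under gctorture() rather than at the string contents.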
maechler at stat.math.ethz.ch
2009-Dec-08 14:35 UTC
[Rd] Crash with Unicode and sub (PR#14114)
>>>>> "PD" == Peter Dalgaard <P.Dalgaard at biostat.ku.dk>
>>>>>     on Tue, 08 Dec 2009 11:24:50 +0100 writes:

    PD> g.russell at eos-solutions.com wrote:
    >> The following typed into R --vanilla induces a crash:
    >> [...]
    >> I apologise for not testing this with R-2.10.1 but as far as I can see there are
    >> only source releases available so far, which I am not able to compile.

    PD> 2.10.1 RC is available now. Please check.

I just did, on our "Windows Server 2003 R2 \\ Standard x64 edition" with

> sessionInfo()
R version 2.10.1 RC (2009-12-06 r50684)
i386-pc-mingw32

locale:
[1] LC_COLLATE=German_Switzerland.1252  LC_CTYPE=German_Switzerland.1252
[3] LC_MONETARY=German_Switzerland.1252 LC_NUMERIC=C
[5] LC_TIME=German_Switzerland.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base
>

It does "crash", i.e. you get a popup window about an exception with a hex code.
And indeed, I don't see a problem in Linux.

Martin Maechler, ETH Zurich

    PD> It does seem to be reproducible in the Windows version, or at least
    PD> it takes a very long time, but that means running under Wine on SUSE
    PD> for me. I don't see the effect with the Linux build.

    PD> ______________________________________________
    PD> R-devel at r-project.org mailing list
    PD> https://stat.ethz.ch/mailman/listinfo/r-devel
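Since the crash shows up on Windows but not on Linux, one way to narrow it down is to apply gctorture() to one step at a time and to record the platform details next to the outcome. The following is only a sketch under stated assumptions: with_gctorture is a hypothetical helper, not an R function, and the vector is shorter than in the report to keep the run time tolerable. On an affected build either step may still bring R down.

```r
# Hypothetical helper: evaluate `expr` with gctorture switched on, and make
# sure it is switched off again afterwards, even if an error is raised.
with_gctorture <- function(expr) {
  gctorture(TRUE)
  on.exit(gctorture(FALSE))
  force(expr)   # the promise is only evaluated here, i.e. under gctorture
}

u <- intToUtf8(c(rep(1e3, 1e2), 32, rep(1e3, 1e2)))
v <- rep(u, 10)   # shorter than the report's 1e2; an assumption made for speed

# Step 1: only the substitution runs under gctorture.
w <- with_gctorture(sub(" ", "", v))

# Step 2: only the comparison runs under gctorture.
hit <- with_gctorture(w %in% "")

# Platform details worth quoting next to the result.
info <- list(os     = .Platform$OS.type,
             locale = Sys.getlocale("LC_CTYPE"),
             l10n   = l10n_info())
str(info)
```

Running the two steps separately shows whether the failure follows sub() or the later match, which a single gctorture() call at the top of the script cannot distinguish.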
It seems (from the debugger output) that this is corruption in the R memory allocation routines. Such things can usually be tracked down via valgrind and a valgrind-instrumented build of R, but I cannot trigger this on any system with valgrind. I've tried 64- and 32-bit versions, and Latin-1 locales as well as UTF-8.

So I am inclining to think this is Windows-specific. One thing that is specific to Windows is UCS-2 (16-bit) wide characters, which might be the issue. But we simply don't have the tools on Windows that we do on other platforms.

On Wed, 9 Dec 2009, g.russell at eos-solutions.com wrote:

> Hello Peter,
>
> I have now installed R-2.10.1 RC (sessionInfo() says "R version 2.10.1 RC
> (2009-12-06 r50684)", the rest I believe is as before). The following code
> always brings R --vanilla down (with a crash, not a normal exit):
> -- cut here --
> gctorture()
> u <- intToUtf8(c(rep(1e3,1e2),32,c(rep(1e3,1e2))))
> v <- rep(u,1e2)
> v <- sub(" ","",v)
> v %in% ""
> q()
> -- cut here --
>
> I've tried this several times now, with different effects. Sometimes R
> crashes after 'v %in% ""'. Sometimes it survives that command, but crashes
> during the q(). I have also had the error message "Fehler in match(x,
> table, nomatch = 0L) > 0L : Vergleich (6) ist nur für atomare und
> Listentypen möglich" (in English: "Error in match(x, table, nomatch = 0L)
> > 0L : comparison (6) is possible only for atomic and list types") from
> that command (the match seems to be the problem). When I then type q(),
> R still crashes.
>
> Best wishes,
>
> George Russell | KG EOS Holding GmbH & Co
> Tel: +49 40 2850-1574 | g.russell at eos-solutions.com
>
> Peter Dalgaard <P.Dalgaard at biostat.ku.dk> wrote on 08.12.2009 11:24:50:
>
>> g.russell at eos-solutions.com wrote:
>>> The following typed into R --vanilla induces a crash:
>>> [...]
>>
>> 2.10.1 RC is available now. Please check. It does seem to be
>> reproducible in the Windows version, or at least it takes a very long
>> time, but that means running under Wine on SUSE for me. I don't see the
>> effect with the Linux build.

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595