Hello everyone,
I am trying to access the built-in Fortran and C code that ships with R. Can
anyone help me with that? For example, the kmeans clustering algorithm is
written in Fortran; I want to access it and see the Fortran source code.
Can anyone tell me how to do that?
Thanks,
Nitin Kumar
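
A quick way to see which compiled routine an R function calls is to print the
function and look for .C/.Fortran/.Call entry points; the Fortran itself only
ships with the R source distribution, not with a binary install. A minimal
sketch (the file path below is the one used in the 2.x source tree and should
be checked against your version):

    stats::kmeans
    ## the Hartigan-Wong branch contains a call of the form
    ##    .Fortran("kmns", ...)
    ## the corresponding Fortran source lives in the R source tarball at
    ##    src/library/stats/src/kmns.f
    ## source tarballs: http://cran.r-project.org/sources.html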
On Thu, Nov 27, 2008 at 12:00 PM, <r-help-request@r-project.org> wrote:
> Send R-help mailing list submissions to
> r-help@r-project.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> https://stat.ethz.ch/mailman/listinfo/r-help
> or, via email, send a message with subject or body 'help' to
> r-help-request@r-project.org
>
> You can reach the person managing the list at
> r-help-owner@r-project.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of R-help digest..."
>
>
> Today's Topics:
>
> 1. Re: Efficient passing through big data.frame and modifying
> select (Johannes Graumann)
> 2. Re: Running rtest - how to/ help (indian scorpio)
> 3. Re: how to read .sps (SPSS file extension)? (Eik Vettorazzi)
> 4. Re: how to read .sps (SPSS file extension)? (Eik Vettorazzi)
> 5. Re: plotting density for truncated distribution (Chris Andrews)
> 6. construct a vector (axionator)
> 7. Re: construct a vector (Marc Schwartz)
> 8. Re: construct a vector (Richard.Cotton@hsl.gov.uk)
> 9. Re: multiple imputation with fit.mult.impute in Hmisc - how
> to replace NA with imputed value? (Frank E Harrell Jr)
> 10. Re: construct a vector (Wacek Kusnierczyk)
> 11. Re: construct a vector (axionator)
> 12. Re: Question about Kolmogorov-Smirnov Test (Uwe Ligges)
> 13. Re: memory limit (seanpor)
> 14. Re: Very slow: using double apply and cor.test to compute
> correlation p.values for 2 matrices (jim holtman)
> 15. how to check linearity in Cox regression (Terry Therneau)
> 16. Chi-Square Test Disagreement (Andrew Choens)
> 17. Re: eclipse and R (Tobias Verbeke)
> 18. Finding Stopping time (Debanjan Bhattacharjee)
> 19. Re: Very slow: using double apply and cor.test to compute
> correlation p.values for 2 matrices (David Winsemius)
> 20. memory limit (iwalters)
> 21. Needs suggestions for choosing appropriate R packages
> (zhijie zhang)
> 22. plm pakage (valeria pedrina)
> 23. S4 object (Laurina Guerra)
> 24. Re: Very slow: using double apply and cor.test to compute
> correlation p.values for 2 matrices (Jorge Ivan Velez)
> 25. Re: Chi-Square Test Disagreement (Chuck Cleland)
> 26. S4 slot containing either aov or NULL (Thomas Kaliwe)
> 27. odfWeave and XML... on a Mac (Tubin)
> 28. Re: Very slow: using double apply and cor.test to compute
> correlation p.values for 2 matrices (hadley wickham)
> 29. Re: Very slow: using double apply and cor.test to compute
> correlation p.values for 2 matrices (Daren Tan)
> 30. Re: plotting density for truncated distribution (Jeroen Ooms)
> 31. RES: S4 object (Rodrigo Aluizio)
> 32. RES: memory limit (Leandro Marino)
> 33. ts subscripting problem (andrew collier)
> 34. survreg and pweibull (Andrew Beckerman)
> 35. Second y-axis (Dr. Alireza Zolfaghari)
> 36. Re: multiple imputation with fit.mult.impute in Hmisc - how
> to replace NA with imputed value? (Charlie Brush)
> 37. Re: multiple imputation with fit.mult.impute in Hmisc - how
> to replace NA with imputed value? (Frank E Harrell Jr)
> 38. Re: Chi-Square Test Disagreement (Berwin A Turlach)
> 39. Re: Chi-Square Test Disagreement (Andrew Choens)
> 40. Problem with aovlmer.fnc in languageR (Mats Exter)
> 41. Re: S4 slot containing either aov or NULL (Matthias Kohl)
>      42. Re: Chi-Square Test Disagreement (Ted Harding)
> 43. Re: Error in sqlCopy in RODBC (BKMooney)
> 44. Smoothed 3D plots (Jorge Ivan Velez)
> 45. Re: weighted ftable (Andrew Choens)
> 46. Re: Chi-Square Test Disagreement (Andrew Choens)
> 47. Reshape with var as fun.aggregate (locklin.jason@gmail.com)
> 48. Creating a vector based on lookup function (PDXRugger)
> 49. Request for Assistance in R with NonMem (Michael White)
> 50. Re: Smoothed 3D plots (Mark Difford)
> 51. SPSSyntax function (Adrian Dusa)
> 52. R and SPSS (Applejus)
> 53. Re: Creating a vector based on lookup function (Charles C. Berry)
> 54. Re: Very slow: using double apply and cor.test to compute
> correlation p.values for 2 matrices (Thomas Lumley)
> 55. Re: memory limit (Stavros Macrakis)
> 56. Installing packages based on the license (Ted Mendum)
> 57. Re: R and SPSS (Liviu Andronic)
> 58. SVM (simon abai bisrat)
> 59. Estimates of coefficient variances and covariances from a
> multinomial logistic regression? (Grant Gillis)
> 60. Re: Installing packages based on the license (Charles C. Berry)
> 61. Re: Estimates of coefficient variances and covariances from a
> multinomial logistic regression? (Charles C. Berry)
> 62. Re: Installing packages based on the license (Duncan Murdoch)
> 63. Re: memory limit (Henrik Bengtsson)
> 64. Drawing a tree in R (Severin Hacker)
> 65. Re: Reshape with var as fun.aggregate (hadley wickham)
> 66. Re: R and SPSS (Andrew Choens)
> 67. Re: R and SPSS (David Winsemius)
> 68. Boundary value problem (karthik jayasurya)
> 69. Business Data Sets (nmarti)
> 70. Re: How to create a string containing '\/' to be used with
> SED? (ikarus)
> 71. base:::rbind (Fernando Saldanha)
> 72. ggplot2 problem (steve)
> 73. Re: base:::rbind (Gabor Grothendieck)
> 74. Re: ggplot2 problem (Eric)
> 75. Re: ggplot2 problem (steve)
> 76. Re: How to create a string containing '\/' to be used with
> SED? (seanpor)
> 77. as.numeric in data.frame, but only where it is possible (Kinoko)
> 78. Re: R and SPSS (Alain Guillet)
> 79. Re: Error in sqlCopy in RODBC (Dieter Menne)
> 80. Re: Welcome to the "R-help" mailing list (Digest mode)
> (Weijia You)
> 81. Re: Welcome to the "R-help" mailing list (Digest mode)
>        (Gábor Csárdi)
> 82. Regression Problem for loop (ales grill)
> 83. Re: survreg and pweibull solved for any distribution
> (Andrew Beckerman)
> 84. what is there in a numeric (0)? (jass@in.gr)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 26 Nov 2008 11:36:58 +0100
> From: Johannes Graumann <johannes_graumann@web.de>
> Subject: Re: [R] Efficient passing through big data.frame and
>        modifying select fields
> To: "Henrik Bengtsson" <hb@stat.berkeley.edu>
> Cc: R help <R-help@stat.math.ethz.ch>, William Dunlap
> <wdunlap@tibco.com>
> Message-ID: <200811261137.00489.johannes_graumann@web.de>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Marvelous! Thanks guys for your hints and time! Very smooth now!
>
> Joh
>
> On Wednesday 26 November 2008 03:41:49 Henrik Bengtsson wrote:
> > Alright, here's another $.02: using 'use.names=FALSE' in unlist() is
> > much faster than the default 'use.names=TRUE'. /Henrik
> >
> > On Tue, Nov 25, 2008 at 6:40 PM, Henrik Bengtsson <hb@stat.berkeley.edu> wrote:
> > > My $.02: Using argument 'fixed=TRUE' in strsplit() is much faster than
> > > the default 'fixed=FALSE'. /Henrik
> > >
> > > On Tue, Nov 25, 2008 at 1:02 PM, William Dunlap <wdunlap@tibco.com> wrote:
> > >>> -----Original Message-----
> > >>> From: William Dunlap
> > >>> Sent: Tuesday, November 25, 2008 9:16 AM
> > >>> To: 'johannes_graumann@web.de'
> > >>> Subject: Re: [R] Efficient passing through big data.frame and
> > >>> modifying select fields
> > >>>
> > >>> > Johannes Graumann johannes_graumann at web.de
> > >>> > Tue Nov 25 15:16:01 CET 2008
> > >>> >
> > >>> > Hi all,
> > >>> >
> > >>> > I have relatively big data frames (> 10000 rows by 80 columns)
> > >>> > that need to be exposed to "merge". Works marvelously well in
> > >>> > general, but some fields of the data frames actually contain
> > >>> > multiple ";"-separated values encoded as a character string without
> > >>> > defined order, which makes the fields not match each other.
> > >>> >
> > >>> > Example:
> > >>> > > frame1[1,1]
> > >>> >
> > >>> > [1] "some;thing"
> > >>> >
> > >>> > >frame2[2,1]
> > >>> >
> > >>> > [2] "thing;some"
> > >>> >
> > >>> > In order to enable merging/duplicate identification of columns
> > >>> > containing these strings, I wrote the following function, which
> > >>> > passes through the rows one by one, identifies ";"-containing cells,
> > >>> > splits and resorts them.
> > >>> >
> > >>> > ResortCombinedFields <- function(dframe){
> > >>> >    if(!is.data.frame(dframe)){
> > >>> >       stop("\"ResortCombinedFields\" input needs to be a data frame.")
> > >>> >    }
> > >>> >    for(row in seq(nrow(dframe))){
> > >>> >       for(mef in grep(";",dframe[row,])){
> > >>>
> > >>> I needed to add drop=TRUE to the above dframe[row,] for this to work.
> > >>>
> > >>> >          dframe[row,mef] <-
> > >>> >             paste(sort(unlist(strsplit(dframe[row,mef],";"))),collapse=";")
> > >>> >       }
> > >>> >    }
> > >>> >    return(dframe)
> > >>> > }
> > >>> >
> > >>> > works fine, but is horribly inefficient. How might this be
> > >>> > tackled more elegantly?
> > >>> >
> > >>> > Thanks for any input, Joh
> > >>>
> > >>> It is usually faster to loop over columns of a data frame and use
> > >>> row subscripting, if needed, on individual columns. E.g., the following
> > >>> 2 are much quicker on a sample 1000 by 4 dataset I made with
> > >>>
> > >>> dframe <- data.frame(lapply(c(One=1,Two=2,Three=3),
> > >>>       function(i)sapply(1:1000,
> > >>>          function(i)paste(sample(LETTERS[1:5],size=sample(3,size=1),repl=FALSE),
> > >>>                           collapse=";"))),
> > >>>    stringsAsFactors=FALSE)
> > >>> dframe$Four <- sample(LETTERS[1:5], size=nrow(dframe),
> > >>>    replace=TRUE) # no ;'s in column Four
> > >>>
> > >>> The first function, f1, doesn't try to find which rows may
> > >>> need adjusting and the second, f2, does.
> > >>>
> > >>> f1 <- function(dframe){
> > >>>    if(!is.data.frame(dframe)){
> > >>>       stop("\"ResortCombinedFields\" input needs to be a data frame.")
> > >>>    }
> > >>>    for(icol in seq_len(ncol(dframe))){
> > >>>       dframe[,icol] <- unlist(lapply(strsplit(dframe[,icol], ";"),
> > >>>          function(parts) paste(sort(parts), collapse=";")))
> > >>>    }
> > >>>    return(dframe)
> > >>> }
> > >>>
> > >>> f2 <- function(dframe){
> > >>>    if(!is.data.frame(dframe)){
> > >>>       stop("\"ResortCombinedFields\" input needs to be a data frame.")
> > >>>    }
> > >>>    for(icol in seq_len(ncol(dframe))){
> > >>>       col <- dframe[,icol]
> > >>>       irow <- grep(";", col)
> > >>>       if (length(irow)) {
> > >>>          col[irow] <- unlist(lapply(strsplit(col[irow], ";"),
> > >>>             function(parts) paste(sort(parts), collapse=";")))
> > >>>          dframe[,icol] <- col
> > >>>       }
> > >>>    }
> > >>>    return(dframe)
> > >>> }
> > >>>
> > >>> Times were
> > >>>
> > >>> > unix.time(z<-ResortCombinedFields(dframe))
> > >>>
> > >>> user system elapsed
> > >>> 2.526 0.022 2.559
> > >>>
> > >>> > unix.time(f1z<-f1(dframe))
> > >>>
> > >>> user system elapsed
> > >>> 0.509 0.000 0.508
> > >>>
> > >>> > unix.time(f2z<-f2(dframe))
> > >>>
> > >>> user system elapsed
> > >>> 0.259 0.004 0.264
> > >>>
> > >>> > identical(z, f1z)
> > >>>
> > >>> [1] TRUE
> > >>>
> > >>> > identical(z, f2z)
> > >>>
> > >>> [1] TRUE
> > >>
> > >> In R 2.7.0 (April 2008) f1() and f2() both take time proportional
> > >> to nrow(dframe), while your original ResortCombinedFields() takes
> > >> time proportional to the square of nrow(dframe). E.g., for 50,000
> > >> rows ResortCombinedFields takes 4252 seconds while f2 takes 14 seconds.
> > >> It looks like 2.9 acts about the same.
> > >>
> > >> Bill Dunlap
> > >> TIBCO Software Inc - Spotfire Division
> > >> wdunlap tibco.com
> > >>
> > >> ______________________________________________
> > >> R-help@r-project.org mailing list
> > >> https://stat.ethz.ch/mailman/listinfo/r-help
> > >> PLEASE do read the posting guide
> > >> http://www.R-project.org/posting-guide.html and provide commented,
> > >> minimal, self-contained, reproducible code.
>
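
Combining the three tips from this thread - loop over columns rather than
rows, 'fixed=TRUE' in strsplit(), and 'use.names=FALSE' in unlist() - gives a
sketch like the following (untimed here, so treat it as an illustration
rather than a benchmark winner):

    f3 <- function(dframe){
       if(!is.data.frame(dframe)){
          stop("\"f3\" input needs to be a data frame.")
       }
       for(icol in seq_len(ncol(dframe))){
          col <- dframe[,icol]
          irow <- grep(";", col, fixed=TRUE)
          if (length(irow)) {
             # split on a literal ";", sort the parts, and reassemble
             col[irow] <- unlist(lapply(strsplit(col[irow], ";", fixed=TRUE),
                function(parts) paste(sort(parts), collapse=";")),
                use.names=FALSE)
             dframe[,icol] <- col
          }
       }
       return(dframe)
    }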
> -------------- next part --------------
> A non-text attachment was scrubbed...
> Name: not available
> Type: application/pgp-signature
> Size: 835 bytes
> Desc: This is a digitally signed message part.
> URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20081126/f2c08061/attachment-0001.bin>
>
> ------------------------------
>
> Message: 2
> Date: Wed, 26 Nov 2008 16:44:20 +0530
> From: "indian scorpio" <cool.scorpio84@gmail.com>
> Subject: Re: [R] Running rtest - how to/ help
> To: r-help@r-project.org
> Message-ID:
> <bd914bcb0811260314j558328b7vea363462513dc794@mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Well, after trying a couple of things:
> 1. The rtest.java example with the command line argument --zero-init
> gives this error:
> Creating Rengine (with arguments)
> Rengine created, waiting for R
> #
> # An unexpected error has been detected by Java Runtime Environment:
> #
> # EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x6c733b9d, pid=3640,
> tid=5016
> #
> # Java VM: Java HotSpot(TM) Client VM (10.0-b19 mixed mode, sharing
> windows-x86)
> # Problematic frame:
> # C [R.dll+0x33b9d]
> #
> # An error report file with more information is saved as:
> # C:\workspaceVE\XTest\hs_err_pid3640.log
> #
> # If you would like to submit a bug report, please visit:
> # http://java.sun.com/webapps/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #
>
> 2. For rtest2.java the behaviour remains the same, i.e. it terminates
> after "Creating Rengine (with arguments)"; the only difference is a
> window/frame which opens for a fraction of a second.
>
>
>
> ------------------------------
>
> Message: 3
> Date: Wed, 26 Nov 2008 13:00:52 +0100
> From: Eik Vettorazzi <E.Vettorazzi@uke.uni-hamburg.de>
> Subject: Re: [R] how to read .sps (SPSS file extension)?
> To: livio finos <livio.finos@gmail.com>
> Cc: r-help@r-project.org
> Message-ID: <492D3A74.1070508@uke.uni-hamburg.de>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> maybe the importers of the memisc-package will help you, but I never
> tried them, see
>
> help(importers,package="memisc")
>
> At first glance it seems that you have to split your file manually,
> but maybe there is another way.
> hth.
>
> livio finos schrieb:
> > sorry, you are completely right!
> > sps is not the extension for a portable file! Sorry for the time I made
> > you spend.
> > Let me make my problem clearer.
> > I am exporting a dataset from LimeSurvey (a free piece of software for
> > internet surveys). It works very well and allows exporting to different
> > formats such as csv and Excel. That is fine, but what I like about the
> > SPSS format is the variable labels.
> > LimeSurvey declares that it exports to SPSS, but it exports an .sps file,
> > which is actually not a data format but code. I only realized that now,
> > sorry. I attach an extract of the code here below.
> > Do you have any suggestion on how to manage that? I think it would be
> > great if we could improve the interoperability among free software.
> > thanks again..
> > livio
> >
> > NEW FILE.
> > FILE TYPE NESTED RECORD=1(A).
> > - RECORD TYPE 'A'.
> > - DATA LIST LIST / i0(A1) d1(N3) d2(DATETIME20.0) d3(A15) d4(N2)
> > d5(N1) d6(N1) d7(N1) d8(N1) d9(N1) d10(N1) d11(N1) d12(N1) d13(N1)
> > d14(N1) d15(N1) d16(N1) d17(N2) d18(N1) d19(N1) d20(N1) .
> >
> > - RECORD TYPE 'B'.
> > - DATA LIST LIST / i1(A1) d21(N1) d22(N1) d23(N1) d24(N1) d25(N1)
> > d26(N1) d27(N1) d28(N1) d29(N1) d30(N1) d31(N1) d32(N1) d33(N1)
> > d34(N1) d35(N1) d36(N1) d37(N1) d38(A37) d39(N1) d40(N1) .
> >
> > - RECORD TYPE 'C'.
> > - DATA LIST LIST / i2(A1) d41(N1) d42(N1) d43(N1) d44(N1) d45(N1)
> > d46(N1) d47(N1) d48(N1) d49(N1) d50(N1) d51(N1) d52(N1) d53(N1)
> > d54(N1) d55(N1) d56(N1) d57(N1) d58(N1) d59(N1) d60(N1) .
> >
> > - RECORD TYPE 'D'.
> > - DATA LIST LIST / i3(A1) d61(N1) d62(N1) d63(N1) d64(N1) d65(N1)
> > d66(N1) d67(N1) d68(N1) d69(N1) d70(N1) d71(N1) d72(N1) d73(N1)
> > d74(N1) d75(N1) d76(N1) d77(N1) d78(N1) d79(N1) d80(N1) .
> >
> > - RECORD TYPE 'E'.
> > - DATA LIST LIST / i4(A1) d81(N1) d82(N1) d83(N1) d84(N1) d85(N1)
> > d86(N1) d87(N1) d88(N1) d89(N1) d90(N1) d91(N1) d92(N1) d93(N1)
> > d94(N1) d95(N1) d96(N1) d97(N1) d98(N1) d99(N1) d100(N1) .
> >
> > - RECORD TYPE 'F'.
> > - DATA LIST LIST / i5(A1) d101(N1) d102(N1) d103(N1) d104(N1) d105(N1)
> > d106(N1) d107(N1) d108(N1) d109(N1) .
> > END FILE TYPE.
> >
> > BEGIN DATA
> > A '8' '01-01-1980 00:00:00' 'it' '13' '1' '1' '' '1' '' '1' '1' '1' '1' '1' '4' '0' '' '1' '' '1'
> > B '' '1' '1' '1' '1' '0' '' '0' '' '2' '2' '1' '4' '2' '7' '3' '2' '0''2' '3'
> > C '3' '4' '4' '2' '2' '1' '4' '4' '2' '3' '1' '4' '1' '4' '1' '1' '4' '3' '4' '3'
> > D '1' '3' '1' '1' '1' '1' '1' '1' '1' '1' '1' '1' '2' '2' '2' '2' '3' '3' '3' '2'
> > E '2' '3' '2' '2' '0''0''0''0''0''0''0''3' '4' '4' '2' '1' '5' '2' '5' '4'
> > F '1' '2' '2' '2' '2' '1' '2' '2' '5'
> > A '9' '01-01-1980 00:00:00' 'it' '13' '2' '1' '' '1' '' '0' '' '1' '1' '1' '3' '0' '' '1' '' '1'
> > B '' '0' '' '1' '1' '0' '' '0' '' '2' '2' '1' '4' '3' '7' '3' '2' '0''2' '4'
> > C '3' '4' '4' '3' '3' '3' '4' '3' '2' '2' '1' '3' '1' '4' '1' '4' '4' '4' '4' '3'
> > D '1' '2' '1' '1' '1' '1' '1' '1' '1' '1' '1' '1' '3' '1' '3' '3' '3' '3' '3' '3'
> > E '3' '3' '3' '2' '0''0''0''0''0''0''0''3' '4' '2' '2' '5' '3' '5' '2' '5'
> > F '1' '5' '5' '5' '5' '5' '5' '2' '5'
> > A '10' '01-01-1980 00:00:00' 'it' '13' '1' '1' '' '1' '' '1' '2' '0' '' '0' '' '0' '' '1' '' '1'
> > B '' '1' '2' '0' '' '0' '' '0' '' '1' '2' '1' '4' '6' '7' '3' '2' '0''3' '3'
> > C '4' '4' '4' '4' '3' '4' '4' '3' '2' '2' '1' '4' '1' '4' '1' '3' '4' '4' '4' '1'
> > D '1' '1' '1' '1' '1' '1' '1' '1' '1' '1' '1' '3' '3' '3' '3' '3' '3' '3' '3' '3'
> > E '3' '3' '3' '2' '0''0''0''0''0''0''0''5' '4' '5' '2' '5' '5' '5' '5' '5'
> > F '1' '5' '5' '5' '5' '5' '5' '1' '5'
> > END DATA.
> > EXECUTE.
> >
> > *Define Variable Properties.
> > VARIABLE LABELS d1 'Record ID'.
> > VARIABLE LABELS d2 'Data di completamento'.
> > VARIABLE LABELS d3 'Lingua di partenza'.
> > VARIABLE LABELS d4 'Età:'.
> > VARIABLE LABELS d5 'Sesso:'.
> > VARIABLE LABELS d6 '3 - Papà'.
> > VARIABLE LABELS d7 'Com'è composta la tua famiglia? - COMMENT'.
> > VARIABLE LABELS d8 '3 - Mamma'.
> > VARIABLE LABELS d9 'Com'è composta la tua famiglia? - COMMENT'.
> > VARIABLE LABELS d10 '3 - Fratelli n°'.
> > VARIABLE LABELS d11 'Com'è composta la tua famiglia? - COMMENT'.
> > VARIABLE LABELS d12 '3 - Sorelle n°'.
> > VARIABLE LABELS d13 'Com'è composta la tua famiglia? - COMMENT'.
> > VARIABLE LABELS d14 '3 - Nonni n°'.
> > VARIABLE LABELS d15 'Com'è composta la tua famiglia? - COMMENT'.
> > VARIABLE LABELS d16 '3 - Altre figure parentali (zii, cugini, ecc.) n°'.
> > VARIABLE LABELS d17 'Com'è composta la tua famiglia? - COMMENT'.
> > VARIABLE LABELS d18 '4 - Papà'.
> > VARIABLE LABELS d19 'Quali di queste persone vivono in casa con te? - COMMENT'.
> > VARIABLE LABELS d20 '4 - Mamma'.
> > VARIABLE LABELS d21 'Quali di queste persone vivono in casa con te? - COMMENT'.
> > VARIABLE LABELS d22 '4 - Fratelli n°'.
> > VARIABLE LABELS d23 'Quali di queste persone vivono in casa con te? - COMMENT'.
> > VARIABLE LABELS d24 '4 - Sorelle n°'.
> > VARIABLE LABELS d25 'Quali di queste persone vivono in casa con te? - COMMENT'.
> > VARIABLE LABELS d26 '4 - Nonni n°'.
> > VARIABLE LABELS d27 'Quali di queste persone vivono in casa con te? - COMMENT'.
> > VARIABLE LABELS d28 '4 - Altre figure parentali (zii, cugini, ecc.) n°'.
> > VARIABLE LABELS d29 'Quali di queste persone vivono in casa con te? - COMMENT'.
> >
> >
> > *Define Value labels.
> > VALUE LABELS d5
> > 1 "Maschio"
> > 2 "Femmina".
> > VALUE LABELS d6
> > 1 "Sì"
> > 0 "Non selezionato".
> > VALUE LABELS d8
> > 1 "Sì"
> > 0 "Non selezionato".
> >
> >
> >
> >
> > On Tue, Nov 25, 2008 at 10:27 AM, Eik Vettorazzi
> > <E.Vettorazzi@uke.uni-hamburg.de> wrote:
> >
> > Hi Livio,
> > I think you mixed something up. The .sps - files are the syntax
> > files of SPSS, and I think there is no automated way (but I would
> > like to be corrected there) of converting SPSS syntax to R-code.
> > The usual data files of SPSS have the extension .sav. Such files
> > can easily be read by read.spss (package foreign) or spss.get
> > (package Hmisc); if you think the variable labels of SPSS are
> > fancy, the latter approach is possibly more appropriate, because it
> > adds an attribute with this label to each variable.
> > hth.
> >
> >
> >
> > livio finos schrieb:
> >
> > Hi everyone,
> > I'm trying to import a .sps (SPSS portable file) file.
> > The read.spss function (library foreign) doesn't allow importing
> > such files.
> > Should I import it in SPSS and then save it as a .sav file? Is there
> > no other solution available?
> > What I mostly like about SPSS files is that they have variable
> > labels.
> > What I really wish to keep are the variable.labels from the
> > SPSS file; so, if there is a different way to bring them in from the
> > .sps file, that would also be OK
> > (I also have a csv copy, but without the variable.labels
> > obviously).
> > thanks for any answer..
> > livio
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@r-project.org <mailto:R-help@r-project.org>
mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible
> code.
> >
> >
> >
> > --
> > Eik Vettorazzi
> > Institut für Medizinische Biometrie und Epidemiologie
> > Universitätsklinikum Hamburg-Eppendorf
> >
> > Martinistr. 52
> > 20246 Hamburg
> >
> > T ++49/40/42803-8243
> > F ++49/40/42803-7790
> >
> >
>
>
>
> ------------------------------
>
> Message: 4
> Date: Wed, 26 Nov 2008 13:05:21 +0100
> From: Eik Vettorazzi <E.Vettorazzi@uke.uni-hamburg.de>
> Subject: Re: [R] how to read .sps (SPSS file extension)?
> To: livio finos <livio.finos@gmail.com>
> Cc: r-help@r-project.org
> Message-ID: <492D3B81.3030709@uke.uni-hamburg.de>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> sorry for the typo,
>
> help(importer, package="memisc")
>
> will do the trick.
>
> Eik Vettorazzi schrieb:
> > maybe the importers of the memisc-package will help you, but I never
> > tried them, see
> >
> > help(importers,package="memisc")
> >
> > At first glance it seems that you have to split your file manually,
> > but maybe there is another way.
> > hth.
> >
> > livio finos schrieb:
> >> sorry, you are completely right!
> >> sps is not the extension for portable file! sorry for the time I
make
> >> you spend.
> >> I try to make my problem more clear.
> >> I exporting a dataset from limesurvey (a free software for
internet
> >> survey). It works very fine and it allow to export in different
> >> format such as csv and excel. this fine, but what I like from spss
> >> formats is the variables labels.
> >> limesurvey declares to export in spss, but it export in sps format
> >> which is not a format but a code actually. I realize it just now,
> >> sorry. I attach an extract of the code here below.
> >> do you have any suggestion on how to manage that? I think it will
be
> >> great if we can improve the interconnettivity among free software.
> >> thanks again..
> >> livio
> >>
> >> NEW FILE.
> >> FILE TYPE NESTED RECORD=1(A).
> >> - RECORD TYPE 'A'.
> >> - DATA LIST LIST / i0(A1) d1(N3) d2(DATETIME20.0) d3(A15) d4(N2)
> >> d5(N1) d6(N1) d7(N1) d8(N1) d9(N1) d10(N1) d11(N1) d12(N1) d13(N1)
> >> d14(N1) d15(N1) d16(N1) d17(N2) d18(N1) d19(N1) d20(N1) .
> >>
> >> - RECORD TYPE 'B'.
> >> - DATA LIST LIST / i1(A1) d21(N1) d22(N1) d23(N1) d24(N1) d25(N1)
> >> d26(N1) d27(N1) d28(N1) d29(N1) d30(N1) d31(N1) d32(N1) d33(N1)
> >> d34(N1) d35(N1) d36(N1) d37(N1) d38(A37) d39(N1) d40(N1) .
> >>
> >> - RECORD TYPE 'C'.
> >> - DATA LIST LIST / i2(A1) d41(N1) d42(N1) d43(N1) d44(N1) d45(N1)
> >> d46(N1) d47(N1) d48(N1) d49(N1) d50(N1) d51(N1) d52(N1) d53(N1)
> >> d54(N1) d55(N1) d56(N1) d57(N1) d58(N1) d59(N1) d60(N1) .
> >>
> >> - RECORD TYPE 'D'.
> >> - DATA LIST LIST / i3(A1) d61(N1) d62(N1) d63(N1) d64(N1) d65(N1)
> >> d66(N1) d67(N1) d68(N1) d69(N1) d70(N1) d71(N1) d72(N1) d73(N1)
> >> d74(N1) d75(N1) d76(N1) d77(N1) d78(N1) d79(N1) d80(N1) .
> >>
> >> - RECORD TYPE 'E'.
> >> - DATA LIST LIST / i4(A1) d81(N1) d82(N1) d83(N1) d84(N1) d85(N1)
> >> d86(N1) d87(N1) d88(N1) d89(N1) d90(N1) d91(N1) d92(N1) d93(N1)
> >> d94(N1) d95(N1) d96(N1) d97(N1) d98(N1) d99(N1) d100(N1) .
> >>
> >> - RECORD TYPE 'F'.
> >> - DATA LIST LIST / i5(A1) d101(N1) d102(N1) d103(N1) d104(N1)
> >> d105(N1) d106(N1) d107(N1) d108(N1) d109(N1) .
> >> END FILE TYPE.
> >>
> >> BEGIN DATA
> >> A '8' '01-01-1980 00:00:00' 'it'
'13' '1' '1' '' '1' ''
'1' '1' '1'
> >> '1' '1' '4' '0' ''
'1' '' '1'
> >> B '' '1' '1' '1' '1'
'0' '' '0' '' '2' '2'
'1' '4' '2' '7' '3' '2'
> >> '0''2' '3'
> >> C '3' '4' '4' '2' '2'
'1' '4' '4' '2' '3' '1'
'4' '1' '4' '1' '1' '4'
> >> '3' '4' '3'
> >> D '1' '3' '1' '1' '1'
'1' '1' '1' '1' '1' '1'
'1' '2' '2' '2' '2' '3'
> >> '3' '3' '2'
> >> E '2' '3' '2' '2'
'0''0''0''0''0''0''0''3'
'4' '4' '2' '1' '5' '2'
> >> '5' '4'
> >> F '1' '2' '2' '2' '2'
'1' '2' '2' '5'
> >> A '9' '01-01-1980 00:00:00' 'it'
'13' '2' '1' '' '1' ''
'0' '' '1'
> >> '1' '1' '3' '0' ''
'1' '' '1'
> >> B '' '0' '' '1' '1'
'0' '' '0' '' '2' '2'
'1' '4' '3' '7' '3' '2'
> >> '0''2' '4'
> >> C '3' '4' '4' '3' '3'
'3' '4' '3' '2' '2' '1'
'3' '1' '4' '1' '4' '4'
> >> '4' '4' '3'
> >> D '1' '2' '1' '1' '1'
'1' '1' '1' '1' '1' '1'
'1' '3' '1' '3' '3' '3'
> >> '3' '3' '3'
> >> E '3' '3' '3' '2'
'0''0''0''0''0''0''0''3'
'4' '2' '2' '5' '3' '5'
> >> '2' '5'
> >> F '1' '5' '5' '5' '5'
'5' '5' '2' '5'
> >> A '10' '01-01-1980 00:00:00' 'it'
'13' '1' '1' '' '1' ''
'1' '2' '0'
> >> '' '0' '' '0' ''
'1' '' '1'
> >> B '' '1' '2' '0' ''
'0' '' '0' '' '1' '2'
'1' '4' '6' '7' '3' '2'
> >> '0''3' '3'
> >> C '4' '4' '4' '4' '3'
'4' '4' '3' '2' '2' '1'
'4' '1' '4' '1' '3' '4'
> >> '4' '4' '1'
> >> D '1' '1' '1' '1' '1'
'1' '1' '1' '1' '1' '1'
'3' '3' '3' '3' '3' '3'
> >> '3' '3' '3'
> >> E '3' '3' '3' '2'
'0''0''0''0''0''0''0''5'
'4' '5' '2' '5' '5' '5'
> >> '5' '5'
> >> F '1' '5' '5' '5' '5'
'5' '5' '1' '5'
> >> END DATA.
> >> EXECUTE.
> >>
> >> *Define Variable Properties.
> >> VARIABLE LABELS d1 'Record ID'.
> >> VARIABLE LABELS d2 'Data di completamento'.
> >> VARIABLE LABELS d3 'Lingua di partenza'.
> >> VARIABLE LABELS d4 'Et? :'.
> >> VARIABLE LABELS d5 'Sesso:'.
> >> VARIABLE LABELS d6 '3 - Pap? '.
> >> VARIABLE LABELS d7 'Com'?? composta la tua famiglia?
- COMMENT'.
> >> VARIABLE LABELS d8 '3 - Mamma'.
> >> VARIABLE LABELS d9 'Com'?? composta la tua famiglia?
- COMMENT'.
> >> VARIABLE LABELS d10 '3 - Fratelli n??'.
> >> VARIABLE LABELS d11 'Com'?? composta la tua famiglia?
- COMMENT'.
> >> VARIABLE LABELS d12 '3 - Sorelle n??'.
> >> VARIABLE LABELS d13 'Com'?? composta la tua famiglia?
- COMMENT'.
> >> VARIABLE LABELS d14 '3 - Nonni n??'.
> >> VARIABLE LABELS d15 'Com'?? composta la tua famiglia?
- COMMENT'.
> >> VARIABLE LABELS d16 '3 - Altre figure parentali (zii, cugini,
ecc.)
> >> n??'.
> >> VARIABLE LABELS d17 'Com'?? composta la tua famiglia?
- COMMENT'.
> >> VARIABLE LABELS d18 '4 - Pap? '.
> >> VARIABLE LABELS d19 'Quali? di queste persone? vivono in casa
con te?
> >> - COMMENT'.
> >> VARIABLE LABELS d20 '4 - Mamma'.
> >> VARIABLE LABELS d21 'Quali? di queste persone? vivono in casa
con te?
> >> - COMMENT'.
> >> VARIABLE LABELS d22 '4 - Fratelli n??'.
> >> VARIABLE LABELS d23 'Quali? di queste persone? vivono in casa
con te?
> >> - COMMENT'.
> >> VARIABLE LABELS d24 '4 - Sorelle n??'.
> >> VARIABLE LABELS d25 'Quali? di queste persone? vivono in casa
con te?
> >> - COMMENT'.
> >> VARIABLE LABELS d26 '4 - Nonni n??'.
> >> VARIABLE LABELS d27 'Quali? di queste persone? vivono in casa
con te?
> >> - COMMENT'.
> >> VARIABLE LABELS d28 '4 - Altre figure parentali (zii, cugini,
ecc.)
> >> n??'.
> >> VARIABLE LABELS d29 'Quali? di queste persone? vivono in casa
con te?
> >> - COMMENT'.
> >>
> >>
> >> *Define Value labels.
> >> VALUE LABELS d5
> >> 1 "Maschio"
> >> 2 "Femmina".
> >> VALUE LABELS d6
> >> 1 "S??"
> >> 0 "Non selezionato".
> >> VALUE LABELS d8
> >> 1 "S??"
> >> 0 "Non selezionato".
> >>
> >>
> >>
> >>
> >> On Tue, Nov 25, 2008 at 10:27 AM, Eik Vettorazzi
> >> <E.Vettorazzi@uke.uni-hamburg.de
> >> <mailto:E.Vettorazzi@uke.uni-hamburg.de>> wrote:
> >>
> >> Hi Livio,
> >> I think you mixed something up. The .sps - files are the
syntax
> >> files of SPSS, and I think there is no automated way (but I
would
> >> like to be corrected there) of converting SPSS syntax to
R-code.
> >> The usual data files of spss have the extension .sav. Such
files
> >> can easily read by read.spss (package foreign) or spss.get
> >> (package Hmisc), if you think the variable labels of SPSS are
> >> fancy the latter approach is possibly more appropriate,
because it
> >> adds an attribute with this label to each row.
> >> hth.
> >>
> >>
> >>
> >> livio finos schrieb:
> >>
> >> Hi everyone,
> >> I'm trying to import .sps (SPSS portable file) file.
> >> the read.spss function (library foreign) doesn't allow
to
> >> import such files.
> >> should I import in spss and then save as sav file? there
is
> >> not other
> >> solutions available?
> >> what I mostly like from spss file is that they have
variable
> >> labels.
> >> want is really wish to keep are the variable.labels from
the
> >> spss file; so,
> >> if there is a different way to bring them from the sps
file
> >> will be also ok
> >> (I also have a csv copy but without the variable.labels
> >> obviously).
> >> thanks for any answer..
> >> livio
> >>
> >> [[alternative HTML version deleted]]
> >>
> >> ______________________________________________
> >> R-help@r-project.org <mailto:R-help@r-project.org>
mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained,
reproducible
> >> code.
> >>
> >>
> >> -- Eik Vettorazzi
> >> Institut f?r Medizinische Biometrie und Epidemiologie
> >> Universit?tsklinikum Hamburg-Eppendorf
> >>
> >> Martinistr. 52
> >> 20246 Hamburg
> >>
> >> T ++49/40/42803-8243
> >> F ++49/40/42803-7790
> >>
> >>
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
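
For the common case of a real SPSS data file (.sav), the variable labels do
come across; a minimal sketch using the foreign package (the file name
"survey.sav" is hypothetical):

    library(foreign)
    d <- read.spss("survey.sav", use.value.labels=TRUE, to.data.frame=TRUE)
    attr(d, "variable.labels")   # the VARIABLE LABELS, one per column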
>
>
> ------------------------------
>
> Message: 5
> Date: Wed, 26 Nov 2008 04:41:29 -0800 (PST)
> From: Chris Andrews <candrews@buffalo.edu>
> Subject: Re: [R] plotting density for truncated distribution
> To: r-help@r-project.org
> Message-ID: <20699699.post@talk.nabble.com>
> Content-Type: text/plain; charset=us-ascii
>
>
> Another option
>
> mydata <- rnorm(100000)
> mydata <- mydata[mydata>0]
> plot(density(c(mydata, -mydata), from=0))
>
> If you want the area under the curve to be one, you'll need to double the
> density estimate:
>
> dx <- density(c(mydata, -mydata), from=0)
> dx$y <- dx$y * 2
> plot(dx)
>
> Chris
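
The reflection trick is easy to wrap into a helper; a minimal sketch (the
name dens.trunc0 is made up here):

    dens.trunc0 <- function(x, ...) {
        d <- density(c(x, -x), from=0, ...)   # reflect at zero, then estimate
        d$y <- d$y * 2                        # renormalize to unit area on [0, Inf)
        d
    }
    plot(dens.trunc0(mydata))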
>
>
>
> Jeroen Ooms wrote:
> >
> > I am using density() to plot density curves. However, one of my
> > variables is truncated at zero, but has most of its density around zero. I
> > would like to know how to plot this with the density function.
> >
> > The problem is that if I do this the regular way with density(), values near
> > zero automatically get a very low value because there are no observed
> > values below zero. Furthermore there is some density below zero, although
> > there are no observed values below zero.
> >
> > This illustrates the problem:
> >
> > mydata <- rnorm(100000);
> > mydata <- mydata[mydata>0];
> > plot(density(mydata));
> >
> > the 'real' density is exactly the right half of a normal distribution, so
> > truncated at zero. However, using the default options, the line seems to
> > decrease with a nice curve at the left, with some density below zero. This
> > is pretty confusing for the reader. I have tried decreasing the bw, which
> > masks (but does not fix) some of the problem, but then the rest of the
> > curve also loses smoothness. I would like to make a plot of this data that
> > looks like the right half of a normal distribution, while keeping the
> > curve relatively smooth.
> >
> > Is there any way to specify this truncation in the density function, so
> > that it will only use the positive domain to calculate density?
> >
>
> --
> View this message in context:
> http://www.nabble.com/plotting-density-for-truncated-distribution-tp20684995p20699699.html
> Sent from the R help mailing list archive at Nabble.com.
>
>
>
> ------------------------------
>
> Message: 6
> Date: Wed, 26 Nov 2008 14:11:43 +0100
> From: axionator <axionator@gmail.com>
> Subject: [R] construct a vector
> To: "r-help@r-project.org" <r-help@r-project.org>
> Message-ID:
> <97a146780811260511j73c30f1drefba26e3eaf10def@mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Hi all,
> I have an unknown number of vectors (>=2) all of the same length. Out
> of these, I want to construct a new one as follows:
> having vectors u,v and w, the resulting vector z should have entries:
> z[1] = u[1], z[2] = v[1], z[3] = w[1]
> z[4] = u[2], z[5] = v[2], z[6] = w[2]
> ...
> i.e. go through the vectors u,v,w, taking each time the 1st, 2nd, ...
> elements and store them consecutively in z.
> Is there an efficient way in R to do this?
>
> Thanks in advance
> Armin
>
>
>
> ------------------------------
>
> Message: 7
> Date: Wed, 26 Nov 2008 07:33:24 -0600
> From: Marc Schwartz <marc_schwartz@comcast.net>
> Subject: Re: [R] construct a vector
> To: axionator <axionator@gmail.com>
> Cc: "r-help@r-project.org" <r-help@r-project.org>
> Message-ID: <492D5024.30704@comcast.net>
> Content-Type: text/plain; charset=ISO-8859-1
>
> on 11/26/2008 07:11 AM axionator wrote:
> > Hi all,
> > I have an unknown number of vectors (>=2) all of the same length. Out
> > of these, I want to construct a new one as follows:
> > having vectors u,v and w, the resulting vector z should have entries:
> > z[1] = u[1], z[2] = v[1], z[3] = w[1]
> > z[4] = u[2], z[5] = v[2], z[6] = w[2]
> > ...
> > i.e. go through the vectors u,v,w, taking each time the 1st, 2nd, ...
> > elements and store them consecutively in z.
> > Is there an efficient way in R to do this?
> >
> > Thanks in advance
> > Armin
>
> Is this what you want?
>
> u <- 1:10
> v <- 11:20
> w <- 21:30
>
> z <- as.vector(rbind(u, v, w))
>
> > z
> [1] 1 11 21 2 12 22 3 13 23 4 14 24 5 15 25 6 16 26 7 17 27 8
> [23] 18 28 9 19 29 10 20 30
>
>
> Essentially, we are creating a matrix from the 3 vectors:
>
> > rbind(u, v, w)
> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> u 1 2 3 4 5 6 7 8 9 10
> v 11 12 13 14 15 16 17 18 19 20
> w 21 22 23 24 25 26 27 28 29 30
>
> Then coercing that to a vector, taking advantage of the way in which
> matrix elements are stored.
>
> HTH,
>
> Marc Schwartz
>
>
>
> ------------------------------
>
> Message: 8
> Date: Wed, 26 Nov 2008 13:38:23 +0000
> From: Richard.Cotton@hsl.gov.uk
> Subject: Re: [R] construct a vector
> To: axionator <axionator@gmail.com>
> Cc: "r-help@r-project.org" <r-help@r-project.org>,
> r-help-bounces@r-project.org
> Message-ID:
> <
> OF8137852B.42F93AFC-ON8025750D.004AC462-8025750D.004AE18D@hsl.gov.uk>
> Content-Type: text/plain; charset="US-ASCII"
>
> > I have an unknown number of vectors (>=2) all of the same length. Out
> > of these, I want to construct a new one as follows:
> > having vectors u,v and w, the resulting vector z should have entries:
> > z[1] = u[1], z[2] = v[1], z[3] = w[1]
> > z[4] = u[2], z[5] = v[2], z[6] = w[2]
> > ...
> > i.e. go through the vectors u,v,w, taking each time the 1st, 2nd, ...
> > elements and store them consecutively in z.
> > Is there an efficient way in R to do this?
>
> u <- 1:5
> v <- (1:5) + 0.1
> w <- (1:5) + 0.2
> as.vector(rbind(u,v,w))
> # [1] 1.0 1.1 1.2 2.0 2.1 2.2 3.0 3.1 3.2 4.0 4.1 4.2 5.0 5.1 5.2
>
> Regards,
> Richie.
>
> Mathematical Sciences Unit
> HSL
>
>
>
>
>
> ------------------------------
>
> Message: 9
> Date: Wed, 26 Nov 2008 07:38:32 -0600
> From: Frank E Harrell Jr <f.harrell@vanderbilt.edu>
> Subject: Re: [R] multiple imputation with fit.mult.impute in Hmisc -
> how to replace NA with imputed value?
> To: Charlie Brush <cfbrush@ucdavis.edu>
> Cc: r-help@r-project.org
> Message-ID: <492D5158.4090003@vanderbilt.edu>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Charlie Brush wrote:
> > I am doing multiple imputation with Hmisc, and
> > can't figure out how to replace the NA values with
> > the imputed values.
> >
> > Here's a general outline of the process:
> >
> > > set.seed(23)
> > > library("mice")
> > > library("Hmisc")
> > > library("Design")
> > > d <- read.table("DailyDataRaw_01.txt",header=T)
> > > length(d);length(d[,1])
> > [1] 43
> > [1] 2666
> > So for this data set, there are 43 columns and 2666 rows.
> >
> > Here is a piece of data.frame d:
> > > d[1:20,4:6]
> > P01 P02 P03
> > 1 0.1 0.16 0.16
> > 2 NA 0.00 0.00
> > 3 NA 0.60 0.04
> > 4 NA 0.15 0.00
> > 5 NA 0.00 0.00
> > 6 0.7 0.00 0.75
> > 7 NA 0.00 0.00
> > 8 NA 0.00 0.00
> > 9 0.0 0.00 0.00
> > 10 0.0 0.00 0.00
> > 11 0.0 0.00 0.00
> > 12 0.0 0.00 0.00
> > 13 0.0 0.00 0.00
> > 14 0.0 0.00 0.00
> > 15 0.0 0.00 0.03
> > 16 NA 0.00 0.00
> > 17 NA 0.01 0.00
> > 18 0.0 0.00 0.00
> > 19 0.0 0.00 0.00
> > 20 0.0 0.00 0.00
> >
> > These are daily precipitation values at NCDC stations, and
> > NA values at station P01 will be filled using multiple
> > imputation and data from highly correlated stations P02 and P08:
> >
> > > f <- aregImpute(~ I(P01) + I(P02) + I(P08),
> > >        n.impute=10, match='closest', data=d)
> > Iteration 13
> > > fmi <- fit.mult.impute( P01 ~ P02 + P08 , ols, f, d)
> >
> > Variance Inflation Factors Due to Imputation:
> >
> > Intercept P02 P08
> > 1.01 1.39 1.16
> >
> > Rate of Missing Information:
> >
> > Intercept P02 P08
> > 0.01 0.28 0.14
> >
> > d.f. for t-distribution for Tests of Single Coefficients:
> >
> > Intercept P02 P08
> > 242291.18 116.05 454.95
> > > r <- apply(f$imputed$P01,1,mean)
> > > r
> > 2 3 4 5 7 8 16 17 249 250 251
> > 0.002 0.430 0.044 0.002 0.002 0.002 0.002 0.123 0.002 0.002 0.002
> > 252 253 254 255 256 257 258 259 260 261 262
> > 1.033 0.529 1.264 0.611 0.002 0.513 0.085 0.002 0.705 0.840 0.719
> > 263 264 265 266 267 268 269 270 271 272 273
> > 1.489 0.532 0.150 0.134 0.002 0.002 0.002 0.002 0.002 0.055 0.135
> > 274 275 276 277 278 279 280 281 282 283 284
> > 0.009 0.002 0.002 0.002 0.008 0.454 1.676 1.462 0.071 0.002 1.029
> > 285 286 287 288 289 418 419 420 421 422 700
> > 0.055 0.384 0.947 0.002 0.002 0.008 0.759 0.066 0.009 0.002 0.002
> >
> > ------------------------------------------------------------------
> > So far, this is working great.
> > Now, make a copy of d:
> > > dnew <- d
> >
> > And then fill in the NA values in P01 with the values in r
> >
> > For example:
> > > for (i in 1:length(r)){
> > >    dnew$P01[r[i,1]] <- r[i,2]
> > > }
> > This doesn't work, because each 'piece' of r is two numbers:
> > > r[1]
> > 2
> > 0.002
> > > r[1,1]
> > Error in r[1, 1] : incorrect number of dimensions
> >
> > My question: how can I separate the the two items in (for example)
> > r[1] to use the first part as an index and the second as a value,
> > and then use them to replace the NA values with the imputed values?
> >
> > Or is there a better way to replace the NA values with the imputed
> values?
> >
> > Thanks in advance for any help.
> >
>
> You didn't state your goal, or why fit.mult.impute does not do what you
> want. But you can look inside fit.mult.impute to see how it retrieves
> the imputed values. Also see the example in the documentation for transcan,
> which uses the command impute(xt, imputation=1) to retrieve one of the
> multiple imputations.
>
> Note that you can say library(Design) (omit the quotes) to access both
> Design and Hmisc.
>
> Frank
> --
> Frank E Harrell Jr Professor and Chair School of Medicine
> Department of Biostatistics Vanderbilt University
>
>
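
For the direct replacement the original poster asked about: the names of r
are the row indices of the originally missing values, so they can be used to
index P01 directly (a sketch, assuming r was built with apply as above):

    idx <- as.integer(names(r))   # rows of d where P01 was NA
    dnew$P01[idx] <- r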
>
> ------------------------------
>
> Message: 10
> Date: Wed, 26 Nov 2008 14:39:09 +0100
> From: Wacek Kusnierczyk <Waclaw.Marcin.Kusnierczyk@idi.ntnu.no>
> Subject: Re: [R] construct a vector
> To: axionator <axionator@gmail.com>
> Cc: R help <R-help@stat.math.ethz.ch>
> Message-ID: <492D517D.8050108@idi.ntnu.no>
> Content-Type: text/plain; charset=ISO-8859-1
>
> axionator wrote:
> > Hi all,
> > I have an unknown number of vectors (>=2) all of the same length. Out
> > of these, I want to construct a new one as follows:
> > having vectors u,v and w, the resulting vector z should have entries:
> > z[1] = u[1], z[2] = v[1], z[3] = w[1]
> > z[4] = u[2], z[5] = v[2], z[6] = w[2]
> > ...
> > i.e. go through the vectors u,v,w, taking each time the 1st, 2nd, ...
> > elements and store them consecutively in z.
> > Is there an efficient way in R to do this?
> >
> >
>
> suppose you have your vectors collected into a list, say vs; then the
> following will do:
>
> as.vector(do.call(rbind, vs))
>
> vQ
>
>
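
A small usage example of the list version, with vectors as in the original
question:

    vs <- list(u=1:3, v=11:13, w=21:23)
    as.vector(do.call(rbind, vs))
    ## [1]  1 11 21  2 12 22  3 13 23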
>
> ------------------------------
>
> Message: 11
> Date: Wed, 26 Nov 2008 14:43:30 +0100
> From: axionator <axionator@gmail.com>
> Subject: Re: [R] construct a vector
> To: "Wacek Kusnierczyk"
<Waclaw.Marcin.Kusnierczyk@idi.ntnu.no>
> Cc: R help <R-help@stat.math.ethz.ch>
> Message-ID:
> <97a146780811260543l4d5bcaa7td42e123b9f2906a9@mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Thanks, works fine.
>
> Armin
>
>
>
> ------------------------------
>
> Message: 12
> Date: Wed, 26 Nov 2008 14:43:53 +0100
> From: Uwe Ligges <ligges@statistik.tu-dortmund.de>
> Subject: Re: [R] Question about Kolmogorov-Smirnov Test
> To: Ricardo Ríos <ricardo.rios.sv@gmail.com>
> Cc: r-help@r-project.org
> Message-ID: <492D5299.5020206@statistik.tu-dortmund.de>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
>
>
> Ricardo Ríos wrote:
> > Hi wizards
> >
> > I have the following code for a Kolmogorov-Smirnov Test:
> >
> >
> > z<-c(1.6,10.3,3.5,13.5,18.4,7.7,24.3,10.7,8.4,4.9,7.9,12,16.2,6.8,14.7)
> > ks.test(z,"pexp",1/10)$statistic
> >
> > The Kolmogorov-Smirnov statistic is:
> >
> > D
> > 0.293383
> >
> > However, I have calculated the Kolmogorov-Smirnov statistic with the
> > following R code:
> >
> >
> > z<-c(1.6,10.3,3.5,13.5,18.4,7.7,24.3,10.7,8.4,4.9,7.9,12,16.2,6.8,14.7)
> > a<-sort(z)
> > d<- pexp(a, rate = 1/10, lower.tail = TRUE, log.p = FALSE)
> > w=numeric(length = length(a))
> > for(i in 1:length(a)) w[i]=i/15
> > max(abs(w-d))
> >
> > But I have obtained the following result:
> >
> > [1] 0.2267163
> >
> > Why are these results not equal?
>
> w is calculated as follows:
>
> w <- (seq(along=a)-1)/length(a)
> [ {0, ..., n-1} rather than {1, ..., n} ]
>
>
> Uwe Ligges
>
>
> > Thanks in advance
> >
>
>
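
For completeness: ks.test() takes the maximum deviation over both sides of
each jump of the empirical cdf, which reproduces D = 0.293383 for the data
above; a sketch:

    a <- sort(z)
    n <- length(a)
    d <- pexp(a, rate=1/10)
    max(pmax(seq_len(n)/n - d, d - (seq_len(n) - 1)/n))   # 0.293383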
>
> ------------------------------
>
> Message: 13
> Date: Wed, 26 Nov 2008 05:45:29 -0800 (PST)
> From: seanpor <seanpor@acm.org>
> Subject: Re: [R] memory limit
> To: r-help@r-project.org
> Message-ID: <20700590.post@talk.nabble.com>
> Content-Type: text/plain; charset=us-ascii
>
>
> Good afternoon,
>
> The short answer is "yes", the long answer is "it depends".
>
> It all depends on what you want to do with the data. I'm working with
> dataframes of a couple of million lines, on this plain desktop machine, and
> for my purposes it works fine. I read in text files, manipulate them,
> convert them into dataframes, do some basic descriptive stats and tests on
> them, a couple of columns at a time, all quick and simple in R. There are
> some libraries which are set up to handle very large datasets, e.g. biglm
> [1].
>
> If you're using algorithms which require vast quantities of memory, then as
> the previous emails in this thread suggest, you might need R running on
> 64-bit.
>
> If you're working with a problem which is "embarrassingly parallel" [2],
> then there are a variety of solutions - if you're in between, the solutions
> are much more data dependent.
>
> The flip question: how long would it take you to get up and running with the
> functionality (tried and tested in R) you require if you're going to be
> re-working things in C++?
>
> I suggest that you have a look at R, possibly using a subset of your full
> set to start with - you'll be amazed how quickly you can get up and
> running.
>
> As suggested at the start of this email... "it depends"...
>
> Best Regards,
> Sean O'Riordain
> Dublin
>
> [1] http://cran.r-project.org/web/packages/biglm/index.html
> [2] http://en.wikipedia.org/wiki/Embarrassingly_parallel
>
>
> iwalters wrote:
> >
> > I'm currently working with very large datasets that consist of
> > 1,000,000+ rows. Is it at all possible to use R for datasets this size,
> > or should I rather consider C++/Java?
> >
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/increasing-memory-limit-in-Windows-Server-2008-64-bit-tp20675880p20700590.html
> Sent from the R help mailing list archive at Nabble.com.
>
>
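
A concrete illustration of the chunked approach with biglm (made-up data;
biglm's update() method folds in each further chunk without holding the full
dataset in memory):

    library(biglm)
    chunk <- data.frame(y=rnorm(1e5), x1=rnorm(1e5), x2=rnorm(1e5))
    fit <- biglm(y ~ x1 + x2, data=chunk)
    chunk <- data.frame(y=rnorm(1e5), x1=rnorm(1e5), x2=rnorm(1e5))
    fit <- update(fit, chunk)     # repeat for each remaining chunk
    summary(fit)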
>
> ------------------------------
>
> Message: 14
> Date: Wed, 26 Nov 2008 09:14:40 -0500
> From: "jim holtman" <jholtman@gmail.com>
> Subject: Re: [R] Very slow: using double apply and cor.test to compute
> correlation p.values for 2 matrices
> To: "Daren Tan" <daren76@hotmail.com>
> Cc: r-help@stat.math.ethz.ch
> Message-ID:
> <644e1f320811260614i27b26152i566e9da4eae778f2@mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Your time is being taken up in cor.test because you are calling it
> 100,000 times. So grin and bear it with the amount of work you are
> asking it to do.
>
> Here I am only calling it 100 times:
>
> > m1 <- matrix(rnorm(10000), ncol=100)
> > m2 <- matrix(rnorm(10000), ncol=100)
> > Rprof('/tempxx.txt')
> > system.time(cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1,
> >     function(y) { cor.test(x,y)$p.value }) }))
> user system elapsed
> 8.86 0.00 8.89
> >
>
> so my guess is that calling it 100,000 times will take: 100,000 *
> 0.0886 seconds or about 3 hours.
>
> If you run Rprof, you will see it is spending most of its time there:
>
> 0 8.8 root
> 1. 8.8 apply
> 2. . 8.8 FUN
> 3. . . 8.8 apply
> 4. . . . 8.7 FUN
> 5. . . . . 8.6 cor.test
> 6. . . . . . 8.4 cor.test.default
> 7. . . . . . . 2.4 match.arg
> 8. . . . . . . . 1.7 eval
> 9. . . . . . . . . 1.4 deparse
> 10. . . . . . . . . . 0.6 .deparseOpts
> 11. . . . . . . . . . . 0.2 pmatch
> 11. . . . . . . . . . . 0.1 sum
> 10. . . . . . . . . . 0.5 %in%
> 11. . . . . . . . . . . 0.3 match
> 12. . . . . . . . . . . . 0.3 is.factor
> 13. . . . . . . . . . . . . 0.3 inherits
> 8. . . . . . . . 0.2 formals
> 9. . . . . . . . . 0.2 sys.function
> 7. . . . . . . 2.1 cor
> 8. . . . . . . . 1.1 match.arg
> 9. . . . . . . . . 0.7 eval
> 10. . . . . . . . . . 0.6 deparse
> 11. . . . . . . . . . . 0.3 .deparseOpts
> 12. . . . . . . . . . . . 0.1 pmatch
> 11. . . . . . . . . . . 0.2 %in%
> 12. . . . . . . . . . . . 0.2 match
> 13. . . . . . . . . . . . . 0.1 is.factor
> 14. . . . . . . . . . . . . . 0.1 inherits
> 9. . . . . . . . . 0.1 formals
> 8. . . . . . . . 0.5 stopifnot
> 9. . . . . . . . . 0.2 match.call
> 8. . . . . . . . 0.1 pmatch
> 8. . . . . . . . 0.1 is.data.frame
> 9. . . . . . . . . 0.1 inherits
> 7. . . . . . . 1.5 paste
> 8. . . . . . . . 1.4 deparse
> 9. . . . . . . . . 0.6 .deparseOpts
> 10. . . . . . . . . . 0.3 pmatch
> 10. . . . . . . . . . 0.1 any
> 9. . . . . . . . . 0.6 %in%
> 10. . . . . . . . . . 0.6 match
> 11. . . . . . . . . . . 0.5 is.factor
> 12. . . . . . . . . . . . 0.4 inherits
> 13. . . . . . . . . . . . . 0.2 mode
> 7. . . . . . . 0.4 switch
> 8. . . . . . . . 0.1 qnorm
> 7. . . . . . . 0.2 pt
> 5. . . . . 0.1 $
>
> On Tue, Nov 25, 2008 at 11:55 PM, Daren Tan <daren76@hotmail.com> wrote:
> >
> > My two matrices are roughly the sizes of m1 and m2. I tried using two
> > applys and cor.test to compute the correlation p.values. More than an hour,
> > and the code is still running. Please help to make it more efficient.
> >
> > m1 <- matrix(rnorm(100000), ncol=100)
> > m2 <- matrix(rnorm(10000000), ncol=100)
> >
> > cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1, function(y) {
> >     cor.test(x,y)$p.value }) })
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?
>
>
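
For Pearson correlations specifically, the whole grid can be computed
without calling cor.test at all: cor() on the transposed matrices gives
every row-vs-row correlation at once, and two-sided p-values follow from the
same t transform cor.test uses (a sketch, assuming complete data):

    n <- ncol(m1)                        # observations per row
    r <- cor(t(m1), t(m2))               # nrow(m1) x nrow(m2) correlations
    tstat <- r * sqrt((n - 2)/(1 - r^2))
    cor.pvalues <- 2 * pt(-abs(tstat), df=n - 2)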
>
> ------------------------------
>
> Message: 15
> Date: Wed, 26 Nov 2008 08:42:37 -0600 (CST)
> From: Terry Therneau <therneau@mayo.edu>
> Subject: [R] how to check linearity in Cox regression
> To: mmargoli@gettysburg.edu
> Cc: r-help@r-project.org
> Message-ID: <200811261442.mAQEgba26210@hsrnfs-101.mayo.edu>
> Content-Type: TEXT/plain; charset=us-ascii
>
> > On examining non-linearity of Cox coefficients with penalized splines -
> > I have not been able to dig up a completely clear description of the test
> > performed in R or S-plus.
>
> One "iron clad" way to test is to fit a model that has the
variable of
> interest
> "x" as a linear term, then a second model with splines, and do a
likelihood
> ratio test with 2*(difference in log-likelihood) on (difference in df)
> degrees
> of freedom. With a penalized model this test is conservative: the
> chi-square is
> not quite the right distribution, the true dist has the same mean but
> smaller
> variance.
>
> The pspline function uses an evenly spaced set of symmetric basis
> functions. A neat consequence of this is that the Wald test for linear vs
> 'more general' is a test that the coefficients of the spline terms fall in
> a linear series. That is, a linear trend test on the coefficients. This is
> what coxph does. As with the LR test, the chi-square dist is conservative.
> I have not worked at putting in the more correct distribution. See Eilers
> and Marx, Statistical Science 1996.
>
> > And what is the null for the non-linear test?
>
> The linear test is "is a linear term better than nothing"; the non-linear
> one is a sequential test, "is the non-linear better than the linear". The
> second test of course depends on the total number of df you allowed for the
> pspline fit. As a silly example, adding "+ pspline(x, df=200)" would likely
> show that the nonlinear term was not a significant addition, i.e., not
> worth 199 more degrees of freedom.
>
> Terry Therneau
>
>
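
A sketch of the "iron clad" likelihood ratio comparison, using the survival
package's lung data purely for illustration:

    library(survival)
    fit1 <- coxph(Surv(time, status) ~ age, data=lung)                  # linear
    fit2 <- coxph(Surv(time, status) ~ pspline(age, df=4), data=lung)   # spline
    lrt <- 2 * (fit2$loglik[2] - fit1$loglik[2])
    df  <- sum(fit2$df) - length(coef(fit1))
    pchisq(lrt, df=df, lower.tail=FALSE)   # conservative, per the note above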
>
> ------------------------------
>
> Message: 16
> Date: Wed, 26 Nov 2008 14:51:50 +0000
> From: Andrew Choens <andy.choens@gmail.com>
> Subject: [R] Chi-Square Test Disagreement
> To: r-help@r-project.org
> Message-ID: <1227711110.8422.24.camel@chinstrap>
> Content-Type: text/plain
>
> I was asked by my boss to do an analysis on a large data set, and I am
> trying to convince him to let me use R rather than SPSS. I think Sweave
> could make my life much much easier. To get me a little closer to this
> goal, I ran my analysis through R and SPSS and compared the resulting
> values. In all but one case, they were the same. Given the matrix
>
> [,1] [,2]
> [1,] 110 358
> [2,] 71 312
> [3,] 29 139
> [4,] 31 77
> [5,] 13 32
>
> This is the output from R:
> > chisq.test(test29)
>
> Pearson's Chi-squared test
>
> data: test29
> X-squared = 9.593, df = 4, p-value = 0.04787
>
> But the same data in SPSS generates a p value of .051. It's a small but
> important difference. I played around and rescaled things, and tried
> different values for B, but I never could get R to reach .051.
>
> I'd like to know which program is correct - R or SPSS? I know, this is a
> biased place to ask such a question. I also appreciate all input that
> will help me use R more effectively. The difference could be the result
> of my own ignorance.
>
> thanks
> --andy
>
> --
> Insert something humorous here. :-)
>
>
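
One plausible (unverified) explanation: some packages report the
likelihood-ratio chi-square alongside Pearson's, and for this table it lands
right around .05; a sketch:

    test29 <- matrix(c(110, 71, 29, 31, 13, 358, 312, 139, 77, 32), ncol=2)
    chisq.test(test29)                      # Pearson: X-squared = 9.593, p = 0.04787
    E  <- outer(rowSums(test29), colSums(test29))/sum(test29)
    G2 <- 2 * sum(test29 * log(test29/E))   # likelihood-ratio statistic
    pchisq(G2, df=4, lower.tail=FALSE)      # approx. 0.050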
>
> ------------------------------
>
> Message: 17
> Date: Wed, 26 Nov 2008 15:03:28 +0000
> From: "Tobias Verbeke" <tobias.verbeke@telenet.be>
> Subject: Re: [R] eclipse and R
> To: "Ruud H. koning" <r.h.koning@rug.nl>,
r-help@r-project.org
> Cc: statet-user@r-forge.wu-wien.ac.at
> Message-ID: <W117353088718051227711808@nocme1bl6.telenet-ops.be>
> Content-Type: text/plain; charset="us-ascii"
>
> Hi Ruud,
>
> I forwarded your message to the StatET (R in Eclipse) list;
> there might be StatET users with a similar setup as yours
> on that list (and the StatET developer is more likely to
> pick up your question there).
>
> Best,
> Tobias
>
> >Hello, I am trying to install Eclipse and R on an amd64 machine running
> >Suse linux 9.3. I have compiled R 2.8.0 with --enable-R-shlib and it
> >seems that compilation was successful. After starting R, I installed
> >the latest rJava package, from the output:
> >checking whether JRI is requested... yes
> >cp src/libjri.so libjri.so
> >It seems JRI support has been compiled successfully. However, when I try
> >to open R from within Eclipse, I receive an error message:
> >
> >Launching the R Console was cancelled, because it seems starting the
> >Java process/R engine failed.
> >Please make sure that R package 'rJava' with JRI is installed.
> >
> >I can open an R console from the command line, and attach the rJava
> >library without problems. What am I doing wrong here?
> >Thanks, Ruud
> >
> >______________________________________________
> >R-help@r-project.org mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
> >
> >
>
>
>
> ------------------------------
>
> Message: 18
> Date: Wed, 26 Nov 2008 10:04:55 -0500
> From: "Debanjan Bhattacharjee"
<debanjan.bhattacharjee@gmail.com>
> Subject: [R] Finding Stopping time
> To: r-help@r-project.org
> Message-ID:
> <878a627a0811260704x1e1e4cfxbea42f48628d843d@mail.gmail.com>
> Content-Type: text/plain
>
> Can anyone help me solve a problem in my code? I am trying to
> find the stopping index N.
> First I generate random numbers from normals. There is no problem in
> finding the first stopping index.
> Now I want to find the second stopping index using observations starting
> from the one after the first stopping index.
> E.g., if my first stopping index was 5, I want to set the 6th observation from
> the generated normal variables as the first random
> number, and stop at the second stopping index.
>
> This is my code,
>
>
> alpha <- 0.05
> beta <- 0.07
> a <- log((1-beta)/alpha)
> b <- log(beta/(1-alpha))
> theta1 <- 2
> theta2 <- 3
>
> cumsm<-function(n)
> {y<-NULL
> for(i in 1:n)
> {y[i]=x[i]^2}
> s=sum(y)
> return(s)
> }
> psum <- function(p,q)
> {z <- NULL
> for(l in p:q)
> { z[l-p+1] <- x[l]^2}
> ps <- sum(z)
> return(ps)
> }
> smm <- NULL
> sm <- NULL
> N <- NULL
> Nout <- NULL
> T <- NULL
> k<-0
> x <- rnorm(100,theta1,theta1)
> for(i in 1:length(x))
> {
> sm[i] <- psum(1,i)
> T[i] <- ((i/2)*log(theta1/theta2)) +
>         (((theta2-theta1)/(2*theta1*theta2))*sm[i]) - (i*(theta2-theta1)/2)
> if (T[i]<=b | T[i]>=a){N[1]<-i
> break}
>
> }
> for(j in 2:200)
> {
> for(k in (N[j-1]+1):length(x))
> { smm[k] <- psum((N[j-1]+1),k)
> T[k] <- ((k/2)*log(theta1/theta2)) +
>         (((theta2-theta1)/(2*theta1*theta2))*smm[k]) - (k*(theta2-theta1)/2)
> if (T[k]<=b | T[k]>=a){N[j]<-k
> break}
> }
> }
>
> But I cannot get the stopping index after the first one.
>
> Thanks
> --
>
> [[alternative HTML version deleted]]
>
>
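
One way to restructure the loop so the statistic restarts cleanly after each
stop (a sketch; the assumption is that the sample-size term in T must count
observations since the last stop, not the global index, and that the loop
must end when the data run out):

    sprt.stops <- function(x, theta1, theta2, a, b) {
      stops <- integer(0)
      start <- 1
      while (start <= length(x)) {
        s <- 0
        stopped <- FALSE
        for (k in start:length(x)) {
          m <- k - start + 1                # observations since the restart
          s <- s + x[k]^2
          Tk <- (m/2)*log(theta1/theta2) +
                ((theta2 - theta1)/(2*theta1*theta2))*s - m*(theta2 - theta1)/2
          if (Tk <= b || Tk >= a) {         # crossed a boundary: record and restart
            stops <- c(stops, k)
            start <- k + 1
            stopped <- TRUE
            break
          }
        }
        if (!stopped) break                 # ran out of data without a decision
      }
      stops
    }
    sprt.stops(rnorm(100, 2, 2), theta1=2, theta2=3,
               a=log((1 - 0.07)/0.05), b=log(0.07/(1 - 0.05)))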
>
> ------------------------------
>
> Message: 19
> Date: Wed, 26 Nov 2008 10:08:56 -0500
> From: David Winsemius <dwinsemius@comcast.net>
> Subject: Re: [R] Very slow: using double apply and cor.test to compute
> correlation p.values for 2 matrices
> To: "jim holtman" <jholtman@gmail.com>
> Cc: r-help@stat.math.ethz.ch
> Message-ID: <A405ABC6-A6DE-42B2-AFD9-2E2C41F55ABC@comcast.net>
> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>
> He might try rcorr from Hmisc instead. Using your test suite, it gives
> about a 20% improvement on my MacPro:
>
> > m1 <- matrix(rnorm(10000), ncol=100)
> > m2 <- matrix(rnorm(10000), ncol=100)
> > Rprof('/tempxx.txt')
> > system.time(cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1,
> >     function(y) { rcorr(x,y)$P }) }))
> user system elapsed
> 4.221 0.049 4.289
>
> > m1 <- matrix(rnorm(10000), ncol=100)
> > m2 <- matrix(rnorm(10000), ncol=100)
> > Rprof('/tempxx.txt')
> > system.time(cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1,
> function(y) { cor.test(x,y)$p.value }) }))
> user system elapsed
> 5.328 0.038 5.355
>
> I'm not a smart enough programmer to figure out whether there might be
> an even more efficient method that takes advantage of rcorr's implicit
> "looping" through a set of columns to produce an all-combinations
> return.
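>
> [One possibility, as an untested sketch: rcorr() also accepts a single
> matrix and returns all pairwise p-values in one call, so the double
> apply could become one call plus a block extraction:]
>
> library(Hmisc)
> big <- cbind(t(m1), t(m2))      # observations in rows, series in columns
> P <- rcorr(big)$P               # all pairwise p-values at once
> cross <- P[1:nrow(m1), nrow(m1) + seq_len(nrow(m2))]  # m1 rows vs m2 rows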
>
> --
> David Winsemius, MD
> Heritage Labs
>
>
> On Nov 26, 2008, at 9:14 AM, jim holtman wrote:
>
> > Your time is being taken up in cor.test because you are calling it
> > 100,000 times. So grin and bear it with the amount of work you are
> > asking it to do.
> >
> > Here I am only calling it 100 times:
> >
> >> m1 <- matrix(rnorm(10000), ncol=100)
> >> m2 <- matrix(rnorm(10000), ncol=100)
> >> Rprof('/tempxx.txt')
> >> system.time(cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1,
> >> function(y) { cor.test(x,y)$p.value }) }))
> > user system elapsed
> > 8.86 0.00 8.89
> >>
> >
> > so my guess is that calling it 100,000 times will take: 100,000 *
> > 0.0886 seconds, or roughly two and a half hours.
> >
> > If you run Rprof, you will see it is spending most of its time there:
> >
> > 0 8.8 root
> > 1. 8.8 apply
> > 2. . 8.8 FUN
> > 3. . . 8.8 apply
> > 4. . . . 8.7 FUN
> > 5. . . . . 8.6 cor.test
> > 6. . . . . . 8.4 cor.test.default
> > 7. . . . . . . 2.4 match.arg
> > 8. . . . . . . . 1.7 eval
> > 9. . . . . . . . . 1.4 deparse
> > 10. . . . . . . . . . 0.6 .deparseOpts
> > 11. . . . . . . . . . . 0.2 pmatch
> > 11. . . . . . . . . . . 0.1 sum
> > 10. . . . . . . . . . 0.5 %in%
> > 11. . . . . . . . . . . 0.3 match
> > 12. . . . . . . . . . . . 0.3 is.factor
> > 13. . . . . . . . . . . . . 0.3 inherits
> > 8. . . . . . . . 0.2 formals
> > 9. . . . . . . . . 0.2 sys.function
> > 7. . . . . . . 2.1 cor
> > 8. . . . . . . . 1.1 match.arg
> > 9. . . . . . . . . 0.7 eval
> > 10. . . . . . . . . . 0.6 deparse
> > 11. . . . . . . . . . . 0.3 .deparseOpts
> > 12. . . . . . . . . . . . 0.1 pmatch
> > 11. . . . . . . . . . . 0.2 %in%
> > 12. . . . . . . . . . . . 0.2 match
> > 13. . . . . . . . . . . . . 0.1 is.factor
> > 14. . . . . . . . . . . . . . 0.1 inherits
> > 9. . . . . . . . . 0.1 formals
> > 8. . . . . . . . 0.5 stopifnot
> > 9. . . . . . . . . 0.2 match.call
> > 8. . . . . . . . 0.1 pmatch
> > 8. . . . . . . . 0.1 is.data.frame
> > 9. . . . . . . . . 0.1 inherits
> > 7. . . . . . . 1.5 paste
> > 8. . . . . . . . 1.4 deparse
> > 9. . . . . . . . . 0.6 .deparseOpts
> > 10. . . . . . . . . . 0.3 pmatch
> > 10. . . . . . . . . . 0.1 any
> > 9. . . . . . . . . 0.6 %in%
> > 10. . . . . . . . . . 0.6 match
> > 11. . . . . . . . . . . 0.5 is.factor
> > 12. . . . . . . . . . . . 0.4 inherits
> > 13. . . . . . . . . . . . . 0.2 mode
> > 7. . . . . . . 0.4 switch
> > 8. . . . . . . . 0.1 qnorm
> > 7. . . . . . . 0.2 pt
> > 5. . . . . 0.1 $
> >
> > On Tue, Nov 25, 2008 at 11:55 PM, Daren Tan <daren76@hotmail.com>
> > wrote:
> >>
> >> My two matrices are roughly the sizes of m1 and m2. I tried using
> >> two apply and cor.test to compute the correlation p.values. More
> >> than an hour, and the code is still running. Please help to make
> >> it more efficient.
> >>
> >> m1 <- matrix(rnorm(100000), ncol=100)
> >> m2 <- matrix(rnorm(10000000), ncol=100)
> >>
> >> cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1, function(y)
> >> { cor.test(x,y)$p.value }) })
> >>
> >> ______________________________________________
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> >
> >
> > --
> > Jim Holtman
> > Cincinnati, OH
> > +1 513 646 9390
> >
> > What is the problem that you are trying to solve?
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>
>
> ------------------------------
>
> Message: 20
> Date: Wed, 26 Nov 2008 04:42:26 -0800 (PST)
> From: iwalters <iwalters@cellc.co.za>
> Subject: [R] memory limit
> To: r-help@r-project.org
> Message-ID: <20699700.post@talk.nabble.com>
> Content-Type: text/plain; charset=us-ascii
>
>
> I'm currently working with very large datasets that consist of
> 1,000,000+ rows. Is it at all possible to use R for datasets this
> size, or should I rather consider C++/Java?
>
>
> --
> View this message in context:
> http://www.nabble.com/increasing-memory-limit-in-Windows-Server-2008-64-bit-tp20675880p20699700.html
> Sent from the R help mailing list archive at Nabble.com.
>
>
>
> ------------------------------
>
> Message: 21
> Date: Wed, 26 Nov 2008 21:29:28 +0800
> From: "zhijie zhang" <epistat@gmail.com>
> Subject: [R] Needs suggestions for choosing appropriate R packages
> To: R-help@stat.math.ethz.ch
> Message-ID:
> <2fc17e30811260529u1d14f3cbg7c3e075a85753bcc@mail.gmail.com>
> Content-Type: text/plain
>
> Dear all,
> I am planning to fit a multilevel dataset with R. I have found several
> possible packages for my task, such as glmmPQL (MASS), glmm (repeated)
> and glmer (lme4), et al. I am a little confused by these functions.
> Could anybody tell me which function/package is the correct one to
> analyse my dataset?
> My dataset is as follows:
> the response variable P is a binary variable (the subject is a patient
> or not);
> two explanatory variables X1 (age) and X2 (sex);
> this dataset was sampled from three different levels,
> district, school, individual, so it is regarded as a multilevel dataset.
> I hope to fit the 3-level model (Y is a binary variable):
> Logit(Pijk) = (a0+b0k+c0jk) + b1*X1 + b2*X2
> i-individual, first level; j-school, 2nd level; k-district, 3rd level.
> I know that the GLIMMIX procedure in the latest version of SAS (9.2)
> is a choice for that, but unfortunately we do not have the latest
> version.
> R must have similar functions to do that; can anybody give me some
> suggestions or help on analysing my dataset?
> Q1: Which package/function is appropriate for my task? Could you show
> me some example code if possible (see the sketch below)?
> Q2: Logit(Pijk) = (a0+b0k+c0jk) + (b1+b1j)*X1 + b2*X2
> If the random effect is also specified on X1 as above, which
> package/function is possible?
> Thanks a lot.
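>
> [A minimal sketch of what this might look like with lme4; the data
> frame name (mydata) is an assumption, and the nesting syntax should be
> checked against the actual design:]
>
> library(lme4)
> # Q1: random intercepts for district, and for school within district
> m1 <- glmer(P ~ X1 + X2 + (1 | district/school),
>             family = binomial, data = mydata)
> # Q2: additionally, a random slope for X1 at the school level
> m2 <- glmer(P ~ X1 + X2 + (1 | district) + (1 + X1 | district:school),
>             family = binomial, data = mydata)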
>
>
> --
> With Kind Regards,
>
> oooO:::::::::
> (..):::::::::
> :\.(:::Oooo::
> ::\_)::(..)::
> :::::::)./:::
> ::::::(_/::::
> :::::::::::::
> [***********************************************************************]
> ZhiJie Zhang ,PhD
> Dept.of Epidemiology, School of Public Health,Fudan University
> Office:Room 443, Building 8
> Office Tel./Fax.:+86-21-54237410
> Address:No. 138 Yi Xue Yuan Road,Shanghai,China
> Postcode:200032
> Email: epistat@gmail.com
> Website: www.statABC.com
> [***********************************************************************]
> oooO:::::::::
> (..):::::::::
> :\.(:::Oooo::
> ::\_)::(..)::
> :::::::)./:::
> ::::::(_/::::
> :::::::::::::
>
> [[alternative HTML version deleted]]
>
>
>
> ------------------------------
>
> Message: 22
> Date: Wed, 26 Nov 2008 13:10:16 +0100
> From: "valeria pedrina" <valeria.pedrina@gmail.com>
> Subject: [R] plm package
> To: r-help@r-project.org
> Message-ID:
> <e515eba20811260410g59c1d7d3q54fc47224f9ead0b@mail.gmail.com>
> Content-Type: text/plain
>
> Hi everyone, I'm doing a panel data analysis and I want to run three
> estimation methods against my available dataset: pooled OLS, random
> and fixed effects. I have 9 individuals and 5 observations for each
> individual. This is my code; what's wrong?
>
> X <- cbind(y, x)
> X <- data.frame(X)
> ooo <- pdata.frame(X, 9)
> vedo <- plm(y ~ x, data = ooo)
>
> and this is the error (Italian locale):
> Errore in X.m[, coef.within, drop = F] : numero di dimensione errato
> (i.e. "Error in X.m[, coef.within, drop = F] : wrong number of dimensions")
> thanks
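>
> [A sketch of one way the three models could be specified with plm; the
> index construction is an assumption, since the structure of X is not
> shown in the post:]
>
> library(plm)
> pdat <- pdata.frame(data.frame(id = rep(1:9, each = 5),
>                                year = rep(1:5, times = 9),
>                                y = y, x = x),
>                     index = c("id", "year"))
> pooled <- plm(y ~ x, data = pdat, model = "pooling")
> fixed  <- plm(y ~ x, data = pdat, model = "within")
> random <- plm(y ~ x, data = pdat, model = "random")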
>
> [[alternative HTML version deleted]]
>
>
>
> ------------------------------
>
> Message: 23
> Date: Wed, 26 Nov 2008 09:42:37 -0400
> From: "Laurina Guerra" <laurinaguerra@gmail.com>
> Subject: [R] S4 object
> To: <r-help@r-project.org>
> Message-ID: <005101c94fcc$db1d8f40$9158adc0$@com>
> Content-Type: text/plain
>
> Hello, good day! Could someone give me some guidance, in the simplest
> possible terms, on what the S4 object class is and how it works? A
> reply as soon as possible would be appreciated. Thanks!
>
>
> [[alternative HTML version deleted]]
>
>
>
> ------------------------------
>
> Message: 24
> Date: Wed, 26 Nov 2008 10:16:14 -0500
> From: "Jorge Ivan Velez" <jorgeivanvelez@gmail.com>
> Subject: Re: [R] Very slow: using double apply and cor.test to compute
> correlation p.values for 2 matrices
> To: "Daren Tan" <daren76@hotmail.com>
> Cc: R mailing list <r-help@r-project.org>
> Message-ID:
> <317737de0811260716t788a4735g659a908410dc9fc2@mail.gmail.com>
> Content-Type: text/plain
>
> Hi Daren,
> Here is another approach, a little bit faster, taking into account
> that I'm using your original matrices. My session info is at the end.
> I'm using a 2.4 GHz Core 2 Duo processor and 3 GB of RAM.
>
> # Data
> set.seed(123)
> m1 <- matrix(rnorm(100000), ncol=100)
> m2 <- matrix(rnorm(100000), ncol=100)
> colnames(m1)=paste('m1_',1:100,sep="")
> colnames(m2)=paste('m2_',1:100,sep="")
>
> # Combinations
> combs=expand.grid(colnames(m1),colnames(m2))
>
> # ---------------
> # Option 1
> #----------------
> system.time(apply(combs,1,function(x)
> cor.test(m1[,x[1]],m2[,x[2]])$p.value)->pvalues1)
> # user system elapsed
> # 8.12 0.01 8.20
>
> # ---------------
> # Option 2
> #----------------
> require(Hmisc)
> system.time(apply(combs,1,function(x)
> rcorr(m1[,x[1]],m2[,x[2]])$P[2])->pvalues2)
> # user system elapsed
> # 7.00 0.00 7.02
>
>
> HTH,
>
> Jorge
>
>
> # ------------- Session Info ----------------------------
> R version 2.8.0 Patched (2008-11-08 r46864)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
> States.1252;LC_MONETARY=English_United
> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
>
>
> On Tue, Nov 25, 2008 at 11:55 PM, Daren Tan <daren76@hotmail.com> wrote:
>
> >
> > My two matrices are roughly the sizes of m1 and m2. I tried using
> > two apply and cor.test to compute the correlation p.values. More
> > than an hour, and the code is still running. Please help to make it
> > more efficient.
> >
> > m1 <- matrix(rnorm(100000), ncol=100)
> > m2 <- matrix(rnorm(10000000), ncol=100)
> >
> > cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1, function(y) {
> > cor.test(x,y)$p.value }) })
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
> [[alternative HTML version deleted]]
>
>
>
> ------------------------------
>
> Message: 25
> Date: Wed, 26 Nov 2008 10:17:32 -0500
> From: Chuck Cleland <ccleland@optonline.net>
> Subject: Re: [R] Chi-Square Test Disagreement
> To: Andrew Choens <andy.choens@gmail.com>
> Cc: r-help@r-project.org
> Message-ID: <492D688C.5040007@optonline.net>
> Content-Type: text/plain; charset=ISO-8859-1
>
> On 11/26/2008 9:51 AM, Andrew Choens wrote:
> > I was asked by my boss to do an analysis on a large data set, and I
> > am trying to convince him to let me use R rather than SPSS. I think
> > Sweave could make my life much much easier. To get me a little closer
> > to this goal, I ran my analysis through R and SPSS and compared the
> > resulting values. In all but one case, they were the same. Given the
> > matrix
> >
> > [,1] [,2]
> > [1,] 110 358
> > [2,] 71 312
> > [3,] 29 139
> > [4,] 31 77
> > [5,] 13 32
> >
> > This is the output from R:
> >> chisq.test(test29)
> >
> > Pearson's Chi-squared test
> >
> > data: test29
> > X-squared = 9.593, df = 4, p-value = 0.04787
> >
> > But, the same data in SPSS generates a p value of .051. It's a small
> > but important difference. I played around and rescaled things, and
> > tried different values for B, but I never could get R to reach .051.
> >
> > I'd like to know which program is correct - R or SPSS? I know, this
> > is a biased place to ask such a question. I also appreciate all input
> > that will help me use R more effectively. The difference could be the
> > result of my own ignorance.
>
> The SPSS p-value is for the Likelihood Ratio Chi-squared test, not
> Pearson's. For Pearson's Chi-squared test in SPSS (16.0.2), I get
> p=0.04787, so the results do match if you do the same Chi-squared test.
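>
> [To see this from R, a quick sketch of the likelihood-ratio (G)
> statistic that SPSS labels "Likelihood Ratio"; if that diagnosis is
> right, the p-value should land near the 0.051 SPSS reported:]
>
> test29 <- matrix(c(110,358,71,312,29,139,31,77,13,32),
>                  byrow = TRUE, ncol = 2)
> O <- test29
> E <- outer(rowSums(O), colSums(O)) / sum(O)   # expected counts
> G <- 2 * sum(O * log(O / E))                  # G statistic
> pchisq(G, df = 4, lower.tail = FALSE)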
>
> > thanks
> > --andy
>
> --
> Chuck Cleland, Ph.D.
> NDRI, Inc. (www.ndri.org)
> 71 West 23rd Street, 8th floor
> New York, NY 10010
> tel: (212) 845-4495 (Tu, Th)
> tel: (732) 512-0171 (M, W, F)
> fax: (917) 438-0894
>
>
>
> ------------------------------
>
> Message: 26
> Date: Wed, 26 Nov 2008 16:18:33 +0100
> From: Thomas Kaliwe <hamstersquats@web.de>
> Subject: [R] S4 slot containing either aov or NULL
> To: r-help@r-project.org
> Message-ID: <492D68C9.8010304@web.de>
> Content-Type: text/plain; charset=ISO-8859-15; format=flowed
>
> Dear listmembers,
>
> I would like to define a class with a slot that takes either an object
> of class aov or NULL. I have been reading "S4 Classes in 15 pages more
> or less" and "Lecture: S4 classes and methods"
>
> #First I tried with list and NULL
> setClass("listOrNULL")
> setIs("list", "listOrNULL")
> setIs("NULL", "listOrNULL")
> #doesn't work
>
> #defining a union class it works with list and null
> setClassUnion("listOrNULL", c("list", "NULL"))
> setClass("c1", representation(value = "listOrNULL"))
> y1 = new("c1", value = NULL)
> y2 = new("c1", value = list(a = 10))
>
> #but it won't work with aov or null
> setClassUnion("aovOrNULL", c("aov", "NULL"))
> setClass("c1", representation(value = "aovOrNULL"))
> y1 = new("c1", value = NULL)
>
> #trying to assign an aov object to the slot doesn't work
> utils::data(npk, package="MASS")
> npk.aov <- aov(yield ~ block + N*P*K, npk)
> y2 = new("c1", value = npk.aov )
>
> Any ideas?
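>
> [One idea, as an untested sketch: "aov" is an S3 class, so it may need
> to be registered with the S4 system via setOldClass() before it can be
> used in a class union:]
>
> setOldClass(c("aov", "lm"))
> setClassUnion("aovOrNULL", c("aov", "NULL"))
> setClass("c2", representation(value = "aovOrNULL"))
> y1 <- new("c2", value = NULL)
> y2 <- new("c2", value = npk.aov)   # npk.aov as defined above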
>
> Thank you
>
> Thomas Kaliwe
>
>
>
> ------------------------------
>
> Message: 27
> Date: Wed, 26 Nov 2008 07:24:07 -0800 (PST)
> From: Tubin <sredmonson@yahoo.com>
> Subject: [R] odfWeave and XML... on a Mac
> To: r-help@r-project.org
> Message-ID: <20702670.post@talk.nabble.com>
> Content-Type: text/plain; charset=us-ascii
>
>
> I'm trying out odfWeave in a Mac environment and getting some odd
> behavior. Having figured out that the code snippets only work if
> they're in certain fonts, I was able to get R to run a test document
> through and produce an output document. After running it, though, I
> get a warning message:
>
> Warning message:
> In file.remove("styles_2.xml") :
> cannot remove file 'styles_2.xml', reason 'No such file or directory'
>
> This message is interesting given that about 20 lines earlier I see:
> Renaming styles_2.xml to styles.xml
>
> If I run the test doc in results=verbatim mode, I see that warning but
> my results appear in the appropriate places on the output document:
>
> *****output*****
> This is the basic text stuff. Now I will try to input the other stuff:
>
> [1] 5
>
> And this is the after-text.
> *****end*****
>
> If I run the test document in results=xml mode, though, the output is
> blank:
>
> *******output*****
>
> This is the basic text stuff. Now I will try to input the other stuff:
>
>
> And this is the after-text.
> ******end*****
>
> Earlier posts on this forum suggest that the solution may involve
> loading an earlier build of XML. Is that likely to work? And if so -
> stupid question I'm sure, but how do I do that?
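>
> [A sketch of one way to install an older XML build from the CRAN
> source archive; the version number here is purely hypothetical - pick
> one from the Archive/XML directory:]
>
> url <- "http://cran.r-project.org/src/contrib/Archive/XML/XML_1.94-0.1.tar.gz"
> download.file(url, destfile = basename(url))
> install.packages(basename(url), repos = NULL, type = "source")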
>
> Thanks in advance for the time and attention of people more
> experienced than myself...
>
>
> --
> View this message in context:
> http://www.nabble.com/odfWeave-and-XML...-on-a-Mac-tp20702670p20702670.html
> Sent from the R help mailing list archive at Nabble.com.
>
>
>
> ------------------------------
>
> Message: 28
> Date: Wed, 26 Nov 2008 09:33:59 -0600
> From: "hadley wickham" <h.wickham@gmail.com>
> Subject: Re: [R] Very slow: using double apply and cor.test to compute
> correlation p.values for 2 matrices
> To: "jim holtman" <jholtman@gmail.com>
> Cc: r-help@stat.math.ethz.ch
> Message-ID:
> <f8e6ff050811260733x7062a6acm68f8da611a14cb87@mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> On Wed, Nov 26, 2008 at 8:14 AM, jim holtman <jholtman@gmail.com> wrote:
> > Your time is being taken up in cor.test because you are calling it
> > 100,000 times. So grin and bear it with the amount of work you are
> > asking it to do.
> >
> > Here I am only calling it 100 times:
> >
> >> m1 <- matrix(rnorm(10000), ncol=100)
> >> m2 <- matrix(rnorm(10000), ncol=100)
> >> Rprof('/tempxx.txt')
> >> system.time(cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1,
> >> function(y) { cor.test(x,y)$p.value }) }))
> > user system elapsed
> > 8.86 0.00 8.89
> >>
> >
> > so my guess is that calling it 100,000 times will take: 100,000 *
> > 0.0886 seconds, or roughly two and a half hours.
>
> You can make it ~3 times faster by vectorising the testing:
>
> m1 <- matrix(rnorm(10000), ncol=100)
> m2 <- matrix(rnorm(10000), ncol=100)
>
> system.time(cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1,
> function(y) { cor.test(x,y)$p.value })}))
>
>
> system.time({
> r <- apply(m1, 1, function(x) { apply(m2, 1, function(y) { cor(x,y) })})
>
> df <- nrow(m1) - 2
> t <- sqrt(df) * r / sqrt(1 - r ^ 2)
> p <- pt(t, df)
> p <- 2 * pmin(p, 1 - p)
> })
>
>
> all.equal(cor.pvalues, p)
>
>
> You can make cor much faster by stripping away all the error checking
> code and calling the internal c function directly (suggested by the
> Rprof output):
>
>
> system.time({
> r <- apply(m1, 1, function(x) { apply(m2, 1, function(y) { cor(x,y) })})
> })
>
> system.time({
> r2 <- apply(m1, 1, function(x) { apply(m2, 1, function(y) {
> .Internal(cor(x, y, 4L, FALSE)) })})
> })
>
> 1.5 s vs 0.2 s on my computer. Combining both changes gives me a ~25
> times speed-up - I suspect you can do even better if you think about
> what calculations are being duplicated in the computation of the
> correlations.
>
> Hadley
>
> --
> http://had.co.nz/
>
>
>
> ------------------------------
>
> Message: 29
> Date: Wed, 26 Nov 2008 15:37:55 +0000
> From: Daren Tan <daren76@hotmail.com>
> Subject: Re: [R] Very slow: using double apply and cor.test to compute
> correlation p.values for 2 matrices
> To: <r-help@stat.math.ethz.ch>
> Message-ID: <BLU137-W95CAC1950A6096E58D289B20A0@phx.gbl>
> Content-Type: text/plain; charset="gb2312"
>
>
> Out of desperation, I made the following function, which hadley beat
> me to :P. Thanks everyone for the great help.
>
>
> cor.p.values <- function(r, n) {
> df <- n - 2
> STATISTIC <- c(sqrt(df) * r / sqrt(1 - r^2))
> p <- pt(STATISTIC, df)
> return(2 * pmin(p, 1 - p))
> }
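>
> [A usage sketch, assuming rows are the units being correlated as in
> the original post: compute all correlations in a single cor() call on
> the transposed matrices, then convert them with the function above.]
>
> r <- cor(t(m1), t(m2))            # nrow(m1) x nrow(m2) correlations
> p <- cor.p.values(r, ncol(m1))    # n = ncol(m1) paired observations;
> p <- matrix(p, nrow = nrow(m1))   # returned flattened, so reshape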
>
> > Date: Wed, 26 Nov 2008 09:33:59 -0600
> > From: h.wickham@gmail.com
> > To: jholtman@gmail.com
> > Subject: Re: [R] Very slow: using double apply and cor.test to compute
> correlation p.values for 2 matrices
> > CC: daren76@hotmail.com; r-help@stat.math.ethz.ch
> >
> > On Wed, Nov 26, 2008 at 8:14 AM, jim holtman wrote:
> >> Your time is being taken up in cor.test because you are calling it
> >> 100,000 times. So grin and bear it with the amount of work you are
> >> asking it to do.
> >>
> >> Here I am only calling it 100 times:
> >>
> >>> m1 <- matrix(rnorm(10000), ncol=100)
> >>> m2 <- matrix(rnorm(10000), ncol=100)
> >>> Rprof('/tempxx.txt')
> >>> system.time(cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1,
> >>> function(y) { cor.test(x,y)$p.value }) }))
> >> user system elapsed
> >> 8.86 0.00 8.89
> >>>
> >>
> >> so my guess is that calling it 100,000 times will take: 100,000 *
> >> 0.0886 seconds, or roughly two and a half hours.
> >
> > You can make it ~3 times faster by vectorising the testing:
> >
> > m1 <- matrix(rnorm(10000), ncol=100)
> > m2 <- matrix(rnorm(10000), ncol=100)
> >
> > system.time(cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1,
> > function(y) { cor.test(x,y)$p.value })}))
> >
> >
> > system.time({
> > r <- apply(m1, 1, function(x) { apply(m2, 1, function(y) { cor(x,y) })})
> >
> > df <- nrow(m1) - 2
> > t <- sqrt(df) * r / sqrt(1 - r ^ 2)
> > p <- pt(t, df)
> > p <- 2 * pmin(p, 1 - p)
> > })
> >
> >
> > all.equal(cor.pvalues, p)
> >
> >
> > You can make cor much faster by stripping away all the error checking
> > code and calling the internal c function directly (suggested by the
> > Rprof output):
> >
> >
> > system.time({
> > r <- apply(m1, 1, function(x) { apply(m2, 1, function(y) { cor(x,y) })})
> > })
> >
> > system.time({
> > r2 <- apply(m1, 1, function(x) { apply(m2, 1, function(y) {
> > .Internal(cor(x, y, 4L, FALSE)) })})
> > })
> >
> > 1.5 s vs 0.2 s on my computer. Combining both changes gives me a ~25
> > times speed-up - I suspect you can do even better if you think about
> > what calculations are being duplicated in the computation of the
> > correlations.
> >
> > Hadley
> >
> > --
> > http://had.co.nz/
> _________________________________________________________________
> [[elided Hotmail spam]]
>
>
>
> ------------------------------
>
> Message: 30
> Date: Wed, 26 Nov 2008 07:43:22 -0800 (PST)
> From: Jeroen Ooms <j.c.l.ooms@uu.nl>
> Subject: Re: [R] plotting density for truncated distribution
> To: r-help@r-project.org
> Message-ID: <20703469.post@talk.nabble.com>
> Content-Type: text/plain; charset=us-ascii
>
>
> thank you, both solutions are really helpful!
> --
> View this message in context:
> http://www.nabble.com/plotting-density-for-truncated-distribution-tp20684995p20703469.html
> Sent from the R help mailing list archive at Nabble.com.
>
>
>
> ------------------------------
>
> Message: 31
> Date: Wed, 26 Nov 2008 13:42:29 -0200
> From: "Rodrigo Aluizio" <r.aluizio@gmail.com>
> Subject: [R] RES: S4 object
> To: "'Laurina Guerra'" <laurinaguerra@gmail.com>
> Cc: R Help <r-help@r-project.org>
> Message-ID: <492d6eb6.231e640a.0480.76d0@mx.google.com>
> Content-Type: text/plain; charset="us-ascii"
>
> Take a look at the links you will find in this previous post to the list.
>
> http://tolstoy.newcastle.edu.au/R/help/06/01/18259.html
>
> I myself don't know anything about this subject.
>
> Sorry, but you will probably find what you need there.
>
> Best Wishes
>
> Rodrigo.
>
> -----Original Message-----
> From: r-help-bounces@r-project.org [mailto:r-help-bounces@r-project.org]
> On behalf of Laurina Guerra
> Sent: Wednesday, 26 November 2008 11:43
> To: r-help@r-project.org
> Subject: [R] S4 object
>
> Hello, good day! Could someone give me some guidance, in the simplest
> possible terms, on what the S4 object class is and how it works? A
> reply as soon as possible would be appreciated. Thanks!
>
>
> [[alternative HTML version deleted]]
>
>
>
> ------------------------------
>
> Message: 32
> Date: Wed, 26 Nov 2008 13:53:44 -0300
> From: "Leandro Marino" <leandro@cesgranrio.org.br>
> Subject: [R] RES: memory limit
> To: "'iwalters'" <iwalters@cellc.co.za>,
<r-help@r-project.org>
> Message-ID:
> <!&!AAAAAAAAAAAYAAAAAAAAAEadxqYXQLlLmuUnwe+aKQfCgAAAEAAAAI9VmQAZHrdBskVHz8nCW0sBAAAAAA==@cesgranrio.org.br>
>
> Content-Type: text/plain; charset="us-ascii"
>
> It depends on the number of variables. If you are using 2 or 3
> variables, you can do some things.
>
> I suggest you read about the ff and ASOR packages; they manage the
> dataset on disk to do some kind of chunked I/O (see the sketch below).
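>
> [A small sketch of the ff idea - file-backed objects so the data stay
> on disk; the file name and the column used are assumptions:]
>
> library(ff)
> # read a large file into a file-backed data frame, chunk by chunk
> big <- read.table.ffdf(file = "bigdata.csv", header = TRUE, sep = ",")
> dim(big)           # rows live on disk, not in RAM
> summary(big$x[])   # [] pulls a column into memory only when needed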
>
> Regards,
>
> -----Original Message-----
> From: r-help-bounces@r-project.org [mailto:r-help-bounces@r-project.org]
> On behalf of iwalters
> Sent: Wednesday, 26 November 2008 09:42
> To: r-help@r-project.org
> Subject: [R] memory limit
>
>
> I'm currently working with very large datasets that consist of
> 1,000,000+ rows. Is it at all possible to use R for datasets this
> size, or should I rather consider C++/Java?
>
>
> --
> View this message in context:
> http://www.nabble.com/increasing-memory-limit-in-Windows-Server-2008-64-bit-tp20675880p20699700.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
>
> ------------------------------
>
> Message: 33
> Date: Wed, 26 Nov 2008 18:24:32 +0200
> From: "andrew collier" <collierab@gmail.com>
> Subject: [R] ts subscripting problem
> To: r-help@r-project.org
> Message-ID:
> <c642e63c0811260824l7adc65e4tf9177ad2fad01132@mail.gmail.com>
> Content-Type: text/plain
>
> Hi,
>
> I am having trouble getting a particular time series to plot. This is
> what I have:
>
> > class(irradiance)
> [1] "ts"
> > irradiance[1:30]
> 197811 197812 197901 197902 197903 197904 197905 197906
> 1366.679 1366.729 1367.476 1367.739 1368.339 1367.883 1367.916 1367.055
> 197907 197908 197909 197910 197911 197912 198001 198002
> 1367.484 1366.887 1366.935 1367.034 1366.997 1367.310 1367.041 1366.459
> 198003 198004 198005 198006 198007 198008 198009 198010
> 1367.143 1366.553 1366.597 1366.854 1366.814 1366.901 1366.622 1366.669
> 198011 198012 198101 198102 198103 198104
> 1365.874 1366.098 1367.141 1366.239 1366.323 1366.388
> > plot(irradiance[1:30])
> > plot(irradiance)
> Error in dn[[2]] : subscript out of bounds
>
> So, if I plot a subset of the data it works fine, but if I try to plot
> the whole thing it breaks. The ts object was created using:
>
> irradiance = ts(tapply(d$number, f, mean), freq = 12, start = c(1978, 11))
>
> and other ts objects that I have defined using basically the same
> approach work fine.
>
> Any ideas greatly appreciated!
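>
> [One thing worth checking, as a sketch: tapply() returns a named
> array, and the dim/dimnames it carries can confuse plot.ts().
> Stripping them with as.vector() before calling ts() may help:]
>
> irradiance <- ts(as.vector(tapply(d$number, f, mean)),
>                  frequency = 12, start = c(1978, 11))
> plot(irradiance)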
>
> Cheers,
> Andrew.
>
> [[alternative HTML version deleted]]
>
>
>
> ------------------------------
>
> Message: 34
> Date: Wed, 26 Nov 2008 16:27:02 +0000
> From: Andrew Beckerman <a.beckerman@sheffield.ac.uk>
> Subject: [R] survreg and pweibull
> To: r-help@r-project.org
> Message-ID: <876C2357-B2DD-43F9-A803-110628B7A537@sheffield.ac.uk>
> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>
> Dear all -
>
> I have followed the thread, the replies to which were led by Thomas
> Lumley, about using pweibull to generate fitted survival curves for
> survreg models.
>
> http://tolstoy.newcastle.edu.au/R/help/04/11/7766.html
>
> Using the lung data set,
>
> data(lung)
> lung.wbs <- survreg(Surv(time, status) ~ 1, data=lung, dist='weibull')
> curve(pweibull(x, scale=exp(coef(lung.wbs)), shape=1/lung.wbs$scale,
>                lower.tail=FALSE), from=0, to=max(lung$time))
> lines(survfit(Surv(time, status) ~ 1, data=lung), col="red")
>
> Assuming this is correct, why does the inflection point of this curve
> not match up to the exp(scale parameter)? Am I wrong in assuming that
> the scale represents the inflection, and the shape adjusts the shape
> around this point? I think I am.... perhaps confusing the scale and
> the median with the inflection point calculation?
>
> One can visualise the mismatch with:
>
> abline(v=exp(coef(lung.wbs)),lty=2)
> abline(h=0.5,lty=2)
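>
> [For what it's worth: since the Weibull survival function is
> S(t) = exp(-(t/scale)^shape), S(scale) = exp(-1) ~ 0.368 whatever the
> shape, so the scale never sits at S = 0.5. The median is
> scale * log(2)^(1/shape), as in this sketch:]
>
> scale <- exp(coef(lung.wbs))
> shape <- 1/lung.wbs$scale
> med <- scale * log(2)^(1/shape)
> abline(v = med, lty = 3)   # should cross the fitted curve at S = 0.5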
>
> Many thanks for the clarification....
>
> R version 2.8.0 (2008-10-20)
> i386-apple-darwin8.11.1
> locale:
> en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8
> attached base packages:
> [1] splines datasets utils stats graphics grDevices
> methods base
> other attached packages:
> [1] survival_2.34-1 Hmisc_3.4-3 lattice_0.17-15 MASS_7.2-44
> loaded via a namespace (and not attached):
> [1] cluster_1.11.11 grid_2.8.0 tools_2.8.0
>
> Andrew
>
>
>
> ---------------------------------------------------------------------------------
> Dr. Andrew Beckerman
> Department of Animal and Plant Sciences, University of Sheffield,
> Alfred Denny Building, Western Bank, Sheffield S10 2TN, UK
> ph +44 (0)114 222 0026; fx +44 (0)114 222 0002
> http://www.beckslab.staff.shef.ac.uk/
>
> http://www.flickr.com/photos/apbeckerman/
> http://www.warblefly.co.uk
>
>
>
> ------------------------------
>
> Message: 35
> Date: Wed, 26 Nov 2008 16:22:01 +0000
> From: "Dr. Alireza Zolfaghari" <ali.zolfaghari@gmail.com>
> Subject: [R] Second y-axis
> To: R-help <r-help@r-project.org>
> Message-ID:
> <d47fac460811260822o207a95e4gda2b936585139506@mail.gmail.com>
> Content-Type: text/plain
>
> Hi list,
> In the following code, how can I place the percentage label away from
> the numbers on the second y-axis (let's say everything should be
> inside the plot area)?
>
> Thanks
> Alireza
>
> ================
> require(grid)
> vp <- viewport(x=.1, y=.1, width=.6, height=.6, just=c("left", "bottom"))
> pushViewport(vp)
>
> plotDATA = data.frame(
>   Loss = c(1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10),
>   Level = c("AvgAll","AvgAll","AvgAll","AvgAll","AvgAll","AvgAll",
>             "AvgAll","AvgAll","AvgAll","AvgAll","AvgAll","AvgAll",
>             "GUL","GUL","GUL","GUL","GUL","GUL","GUL","GUL"),
>   Line = c(1,2,3,4,5,6,7,8,9,10,11,12,1,2,3,4,5,6,7,8))
> library(lattice)
> xyplot(Loss ~ Line, data=plotDATA, t="p",
>        scales=list(relation="free",
>                    x=list(draw=TRUE, tick.number=12, labels=1:12)),
>        par.settings = list(clip = list(panel = "off")))
> p <- xyplot(Loss ~ Line, data=plotDATA, t="p",
>             scales=list(relation="free", x=list(at = 1:12)),
>             panel=function(x, y, subscripts, groups, ...){
>               panel.xyplot(subset(plotDATA, Level=="AvgAll")$Line,
>                            subset(plotDATA, Level=="AvgAll")$Loss,
>                            col=Lloydscolour(colIncP), lwd=3, origin=0, ...)
>               panel.axis(side = "right", at=unique(plotDATA$Loss),
>                          labels=unique(plotDATA$Loss)/max(plotDATA$Loss)*100,
>                          outside=FALSE, ticks=TRUE, half=FALSE)
>               panel.axis(side = "right", at=median(plotDATA$Loss),
>                          labels="Percentage", outside=FALSE, ticks=FALSE,
>                          half=FALSE, rot=90)
>               panel.axis(side = "right", at=c(4,8), labels=c(200,400),
>                          outside=TRUE, ticks=TRUE, half=FALSE)
>               panel.barchart(subset(plotDATA, Level=="GUL")$Line,
>                              subset(plotDATA, Level=="GUL")$Loss,
>                              box.ratio=1, horizontal=FALSE, stack=TRUE,
>                              reference=TRUE, col="blue",
>                              border="blue") #, origin=0)
>             })
>
> print(p, position = c(0.1, 0.1, 0.9, .9))
> ================
> [[alternative HTML version deleted]]
>
>
>
> ------------------------------
>
> Message: 36
> Date: Wed, 26 Nov 2008 08:39:15 -0800
> From: Charlie Brush <cfbrush@ucdavis.edu>
> Subject: Re: [R] multiple imputation with fit.mult.impute in Hmisc -
> how to replace NA with imputed value?
> To: Frank E Harrell Jr <f.harrell@vanderbilt.edu>
> Cc: r-help@r-project.org
> Message-ID: <492D7BB3.7000809@ucdavis.edu>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Frank E Harrell Jr wrote:
> > Charlie Brush wrote:
> >> I am doing multiple imputation with Hmisc, and
> >> can't figure out how to replace the NA values with
> >> the imputed values.
> >>
> >> Here's a general outline of the process:
> >>
> >> > set.seed(23)
> >> > library("mice")
> >> > library("Hmisc")
> >> > library("Design")
> >> > d <- read.table("DailyDataRaw_01.txt",header=T)
> >> > length(d);length(d[,1])
> >> [1] 43
> >> [1] 2666
> >> So for this data set, there are 43 columns and 2666 rows
> >>
> >> Here is a piece of data.frame d:
> >> > d[1:20,4:6]
> >> P01 P02 P03
> >> 1 0.1 0.16 0.16
> >> 2 NA 0.00 0.00
> >> 3 NA 0.60 0.04
> >> 4 NA 0.15 0.00
> >> 5 NA 0.00 0.00
> >> 6 0.7 0.00 0.75
> >> 7 NA 0.00 0.00
> >> 8 NA 0.00 0.00
> >> 9 0.0 0.00 0.00
> >> 10 0.0 0.00 0.00
> >> 11 0.0 0.00 0.00
> >> 12 0.0 0.00 0.00
> >> 13 0.0 0.00 0.00
> >> 14 0.0 0.00 0.00
> >> 15 0.0 0.00 0.03
> >> 16 NA 0.00 0.00
> >> 17 NA 0.01 0.00
> >> 18 0.0 0.00 0.00
> >> 19 0.0 0.00 0.00
> >> 20 0.0 0.00 0.00
> >>
> >> These are daily precipitation values at NCDC stations, and
> >> NA values at station P01 will be filled using multiple
> >> imputation and data from highly correlated stations P02 and P08:
> >>
> >> > f <- aregImpute(~ I(P01) + I(P02) + I(P08),
> >> n.impute=10,match='closest',data=d)
> >> Iteration 13
> >> > fmi <- fit.mult.impute( P01 ~ P02 + P08 , ols, f, d)
> >>
> >> Variance Inflation Factors Due to Imputation:
> >>
> >> Intercept P02 P08
> >> 1.01 1.39 1.16
> >>
> >> Rate of Missing Information:
> >>
> >> Intercept P02 P08
> >> 0.01 0.28 0.14
> >>
> >> d.f. for t-distribution for Tests of Single Coefficients:
> >>
> >> Intercept P02 P08
> >> 242291.18 116.05 454.95
> >> > r <- apply(f$imputed$P01,1,mean)
> >> > r
> >> 2 3 4 5 7 8 16 17 249 250 251
> >> 0.002 0.430 0.044 0.002 0.002 0.002 0.002 0.123 0.002 0.002 0.002
> >> 252 253 254 255 256 257 258 259 260 261 262
> >> 1.033 0.529 1.264 0.611 0.002 0.513 0.085 0.002 0.705 0.840 0.719
> >> 263 264 265 266 267 268 269 270 271 272 273
> >> 1.489 0.532 0.150 0.134 0.002 0.002 0.002 0.002 0.002 0.055 0.135
> >> 274 275 276 277 278 279 280 281 282 283 284
> >> 0.009 0.002 0.002 0.002 0.008 0.454 1.676 1.462 0.071 0.002 1.029
> >> 285 286 287 288 289 418 419 420 421 422 700
> >> 0.055 0.384 0.947 0.002 0.002 0.008 0.759 0.066 0.009 0.002 0.002
> >>
> >> ------------------------------------------------------------------
> >> So far, this is working great.
> >> Now, make a copy of d:
> >> > dnew <- d
> >>
> >> And then fill in the NA values in P01 with the values in r
> >>
> >> For example:
> >> > for (i in 1:length(r)){
> >> dnew$P01[r[i,1]] <- r[i,2]
> >> }
> >> This doesn't work, because each 'piece' of r is two numbers:
> >> > r[1]
> >> 2
> >> 0.002
> >> > r[1,1]
> >> Error in r[1, 1] : incorrect number of dimensions
> >>
> >> My question: how can I separate the two items in (for example)
> >> r[1] to use the first part as an index and the second as a value,
> >> and then use them to replace the NA values with the imputed values?
> >>
> >> Or is there a better way to replace the NA values with the imputed
> >> values?
> >>
> >> Thanks in advance for any help.
> >>
> >
> > You didn't state your goal, and why fit.mult.impute does not do what
> > you want. But you can look inside fit.mult.impute to see how it
> > retrieves the imputed values. Also see the example in the
> > documentation for transcan, in which the command
> > impute(xt, imputation=1) is used to retrieve one of the multiple
> > imputations.
> >
> > Note that you can say library(Design) (omit the quotes) to access both
> > Design and Hmisc.
> >
> > Frank
> Thanks for your help.
> My goal is to replace the NA values in the (copy of the) data frame
> with the means of the imputed values (which are now in variable 'r').
> fit.mult.impute works fine. I just can't figure out the last step,
> taking the results of fit.mult.impute (which are in variable 'r') and
> replacing the NA values in the (copy of the) data frame.
> A simple for loop doesn't work because the items in 'r' don't look
> like a normal vector, as for example r[1] returns
> 2
> 0.002
> Is there a command to replace the NA values in the data frame with the
> means of the imputed values?
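>
> [On the mechanics, a sketch: r here is a named vector, so the row
> indices are its names() rather than a second column; that is why
> r[1,1] fails. Whether one should do this at all is another matter -
> see Frank Harrell's reply below.]
>
> idx <- as.integer(names(r))   # row numbers of the imputed values
> dnew$P01[idx] <- r            # fill the NAs with the imputed means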
>
> Thanks,
> Charlie
>
>
>
> ------------------------------
>
> Message: 37
> Date: Wed, 26 Nov 2008 10:46:11 -0600
> From: Frank E Harrell Jr <f.harrell@vanderbilt.edu>
> Subject: Re: [R] multiple imputation with fit.mult.impute in Hmisc -
> how to replace NA with imputed value?
> To: Charlie Brush <cfbrush@ucdavis.edu>
> Cc: r-help@r-project.org
> Message-ID: <492D7D53.2070902@vanderbilt.edu>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Charlie Brush wrote:
> > Frank E Harrell Jr wrote:
> >> Charlie Brush wrote:
> >>> I am doing multiple imputation with Hmisc, and
> >>> can't figure out how to replace the NA values with
> >>> the imputed values.
> >>>
> >>> Here's a general outline of the process:
> >>>
> >>> > set.seed(23)
> >>> > library("mice")
> >>> > library("Hmisc")
> >>> > library("Design")
> >>> > d <- read.table("DailyDataRaw_01.txt",header=T)
> >>> > length(d);length(d[,1])
> >>> [1] 43
> >>> [1] 2666
> >>> So for this data set, there are 43 columns and 2666 rows
> >>>
> >>> Here is a piece of data.frame d:
> >>> > d[1:20,4:6]
> >>> P01 P02 P03
> >>> 1 0.1 0.16 0.16
> >>> 2 NA 0.00 0.00
> >>> 3 NA 0.60 0.04
> >>> 4 NA 0.15 0.00
> >>> 5 NA 0.00 0.00
> >>> 6 0.7 0.00 0.75
> >>> 7 NA 0.00 0.00
> >>> 8 NA 0.00 0.00
> >>> 9 0.0 0.00 0.00
> >>> 10 0.0 0.00 0.00
> >>> 11 0.0 0.00 0.00
> >>> 12 0.0 0.00 0.00
> >>> 13 0.0 0.00 0.00
> >>> 14 0.0 0.00 0.00
> >>> 15 0.0 0.00 0.03
> >>> 16 NA 0.00 0.00
> >>> 17 NA 0.01 0.00
> >>> 18 0.0 0.00 0.00
> >>> 19 0.0 0.00 0.00
> >>> 20 0.0 0.00 0.00
> >>>
> >>> These are daily precipitation values at NCDC stations, and
> >>> NA values at station P01 will be filled using multiple
> >>> imputation and data from highly correlated stations P02 and P08:
> >>>
> >>> > f <- aregImpute(~ I(P01) + I(P02) + I(P08),
> >>> n.impute=10,match='closest',data=d)
> >>> Iteration 13
> >>> > fmi <- fit.mult.impute( P01 ~ P02 + P08 , ols, f, d)
> >>>
> >>> Variance Inflation Factors Due to Imputation:
> >>>
> >>> Intercept P02 P08
> >>> 1.01 1.39 1.16
> >>>
> >>> Rate of Missing Information:
> >>>
> >>> Intercept P02 P08
> >>> 0.01 0.28 0.14
> >>>
> >>> d.f. for t-distribution for Tests of Single Coefficients:
> >>>
> >>> Intercept P02 P08
> >>> 242291.18 116.05 454.95
> >>> > r <- apply(f$imputed$P01,1,mean)
> >>> > r
> >>> 2 3 4 5 7 8 16 17 249 250 251
> >>> 0.002 0.430 0.044 0.002 0.002 0.002 0.002 0.123 0.002 0.002 0.002
> >>> 252 253 254 255 256 257 258 259 260 261 262
> >>> 1.033 0.529 1.264 0.611 0.002 0.513 0.085 0.002 0.705 0.840 0.719
> >>> 263 264 265 266 267 268 269 270 271 272 273
> >>> 1.489 0.532 0.150 0.134 0.002 0.002 0.002 0.002 0.002 0.055 0.135
> >>> 274 275 276 277 278 279 280 281 282 283 284
> >>> 0.009 0.002 0.002 0.002 0.008 0.454 1.676 1.462 0.071 0.002 1.029
> >>> 285 286 287 288 289 418 419 420 421 422 700
> >>> 0.055 0.384 0.947 0.002 0.002 0.008 0.759 0.066 0.009 0.002 0.002
> >>>
> >>> ------------------------------------------------------------------
> >>> So far, this is working great.
> >>> Now, make a copy of d:
> >>> > dnew <- d
> >>>
> >>> And then fill in the NA values in P01 with the values in r
> >>>
> >>> For example:
> >>> > for (i in 1:length(r)){
> >>> dnew$P01[r[i,1]] <- r[i,2]
> >>> }
> >>> This doesn't work, because each 'piece' of r is two numbers:
> >>> > r[1]
> >>> 2
> >>> 0.002
> >>> > r[1,1]
> >>> Error in r[1, 1] : incorrect number of dimensions
> >>>
> >>> My question: how can I separate the two items in (for example)
> >>> r[1] to use the first part as an index and the second as a value,
> >>> and then use them to replace the NA values with the imputed values?
> >>>
> >>> Or is there a better way to replace the NA values with the imputed
> >>> values?
> >>>
> >>> Thanks in advance for any help.
> >>>
> >>
> >> You didn't state your goal, and why fit.mult.impute does not do what
> >> you want. But you can look inside fit.mult.impute to see how it
> >> retrieves the imputed values. Also see the example in the
> >> documentation for transcan, in which the command
> >> impute(xt, imputation=1) is used to retrieve one of the multiple
> >> imputations.
> >>
> >> Note that you can say library(Design) (omit the quotes) to access
> >> both Design and Hmisc.
> >>
> >> Frank
> > Thanks for your help.
> > My goal is to replace the NA values in the (copy of the) data frame
> > with the means of the imputed values (which are now in variable 'r').
> > fit.mult.impute works fine. I just can't figure out the last step,
> > taking the results of fit.mult.impute (which are in variable 'r') and
> > replacing the NA values in the (copy of the) data frame.
> > A simple for loop doesn't work because the items in 'r' don't look
> > like a normal vector, as for example r[1] returns
> > 2
> > 0.002
> > Is there a command to replace the NA values in the data frame with the
> > means of the imputed values?
> >
> > Thanks,
> > Charlie
> >
>
> Don't do that, as this would no longer be multiple imputation. If you
> want single conditional mean imputation use transcan.
>
> Frank
>
>
> --
> Frank E Harrell Jr Professor and Chair School of Medicine
> Department of Biostatistics Vanderbilt University
>
>
>
> ------------------------------
>
> Message: 38
> Date: Thu, 27 Nov 2008 00:46:31 +0800
> From: Berwin A Turlach <berwin@maths.uwa.edu.au>
> Subject: Re: [R] Chi-Square Test Disagreement
> To: Andrew Choens <andy.choens@gmail.com>
> Cc: r-help@r-project.org
> Message-ID: <20081127004631.3289e0c3@absentia>
> Content-Type: text/plain; charset=US-ASCII
>
> G'day Andy,
>
> On Wed, 26 Nov 2008 14:51:50 +0000
> Andrew Choens <andy.choens@gmail.com> wrote:
>
> > I was asked by my boss to do an analysis on a large data set, and I am
> > trying to convince him to let me use R rather than SPSS.
>
> Very laudable of you. :)
>
> > This is the output from R:
> > > chisq.test(test29)
> >
> > Pearson's Chi-squared test
> >
> > data: test29
> > X-squared = 9.593, df = 4, p-value = 0.04787
> >
> > But, the same data in SPSS generates a p value of .051. It's a small
> > but important difference.
>
> Chuck explained already the reason for this small difference. I just
> take issue about it being an important difference. In my opinion, this
> difference is not important at all. It would only be important to
> people who are still sticking to arbitrary cut-off points that are
> mainly due to historical coincidences and the lack of computing power
> at those times in history. If somebody tells you that this difference
> is important, ask him or her whether he or she will be willing to
> finance you a room full of calculators (in the sense of Pearson's time)
> and whether he or she wants you to do all your calculations and analyses
> with these calculators in future. Alternatively, you could ask the
> person whether he or she would like the anaesthetist during his or her
> next operation to use chloroform given his or her nostalgic penchant for
> out-dated rituals/methods.
>
> > I played around and rescaled things, and tried different values for
> > B, but I never could get R to reach .051.
>
> Well, I have no problem when using simulated p-values to get something
> close to 0.051; look at the last try. The second one might also be
> noteworthy. Unfortunately, I didn't save the seed beforehand.
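>
> [For reproducibility, one could save or set the seed first - a sketch,
> with an arbitrary value:]
>
> set.seed(42)   # makes the simulated p-value below repeatable
> chisq.test(test29, simulate.p.value = TRUE, B = 20000)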
>
> > test29 <- matrix(c(110,358,71,312,29,139,31,77,13,32), byrow=TRUE,
> > ncol=2)
> > test29
> [,1] [,2]
> [1,] 110 358
> [2,] 71 312
> [3,] 29 139
> [4,] 31 77
> [5,] 13 32
> > chisq.test(test29, simul=TRUE)
>
> Pearson's Chi-squared test with simulated p-value (based on 2000
> replicates)
>
> data: test29
> X-squared = 9.593, df = NA, p-value = 0.04798
>
> > chisq.test(test29, simul=TRUE)
>
> Pearson's Chi-squared test with simulated p-value (based on 2000
> replicates)
>
> data: test29
> X-squared = 9.593, df = NA, p-value = 0.05697
>
> > chisq.test(test29, simul=TRUE, B=20000)
>
> Pearson's Chi-squared test with simulated p-value (based on
> 20000 replicates)
>
> data: test29
> X-squared = 9.593, df = NA, p-value = 0.0463
>
> > chisq.test(test29, simul=TRUE, B=20000)
>
> Pearson's Chi-squared test with simulated p-value (based on
> 20000 replicates)
>
> data: test29
> X-squared = 9.593, df = NA, p-value = 0.0499
>
> > chisq.test(test29, simul=TRUE, B=20000)
>
> Pearson's Chi-squared test with simulated p-value (based on
> 20000 replicates)
>
> data: test29
> X-squared = 9.593, df = NA, p-value = 0.0486
>
> > chisq.test(test29, simul=TRUE, B=20000)
>
> Pearson's Chi-squared test with simulated p-value (based on
> 20000 replicates)
>
> data: test29
> X-squared = 9.593, df = NA, p-value = 0.05125
>
>
> Cheers,
>
> Berwin
>
> =========================== Full address ============================
> Berwin A Turlach Tel.: +65 6516 4416 (secr)
> Dept of Statistics and Applied Probability +65 6516 6650 (self)
> Faculty of Science FAX : +65 6872 3919
> National University of Singapore
> 6 Science Drive 2, Blk S16, Level 7 e-mail: statba@nus.edu.sg
> Singapore 117546 http://www.stat.nus.edu.sg/~statba
>
>
>
> ------------------------------
>
> Message: 39
> Date: Wed, 26 Nov 2008 17:57:52 +0000
> From: Andrew Choens <andy.choens@gmail.com>
> Subject: Re: [R] Chi-Square Test Disagreement
> To: Berwin A Turlach <berwin@maths.uwa.edu.au>
> Cc: r-help@r-project.org
> Message-ID: <1227722272.8422.201.camel@chinstrap>
> Content-Type: text/plain
>
> On Thu, 2008-11-27 at 00:46 +0800, Berwin A Turlach wrote:
> > Chuck explained already the reason for this small difference. I just
> > take issue about it being an important difference. In my opinion,
> > this difference is not important at all. It would only be important
> > to people who are still sticking to arbitrary cut-off points that are
> > mainly due to historical coincidences and the lack of computing power
> > at those time in history. If somebody tells you that this difference
> > is important, ask him or her whether he or she will be willing to
> > finance you a room full of calculators (in the sense of Pearson's
> > time) and whether he or she wants you to do all your calculations and
> > analyses with these calculators in future. Alternatively, you could
> > ask the person whether he or she would like the anaesthetist during
> > his or her next operation to use chloroform given his or her nostalgic
> > penchant for out-dated rituals/methods.
>
> Yes he did and when I realized the source of my confusion I was
> appropriately chastised. I felt like a bit of a fool. Of course, I
> should try comparing apples to apples. Oranges are another thing
> entirely.
>
> As to the importance of the difference, I am of two minds. On the one
> hand I fully agree with you. It is an anachronistic approach. On the
> other hand we don't all have the pleasure of working in a math
> department where such subtleties are well understood.
>
> I work for a consulting firm that advises state and local governments
> (USA). I personally do try to expand my understanding on statistics and
> math (I do not have a degree in math), but my clients do not. When I'm
> working with someone from the government, it is sometimes easier to
> simply tell them that relationship x is significant at a certain level
> of certainty. Although I doubt they could really explain the details,
> they have some basic understanding of what I am talking about.
> Subtleties are sometimes lost on our public servants.
>
> And, since I do work for government, if I ask for a roomful of
> calculators, I might just get them. And really, what am I going to do
> with a roomful of calculators?
>
> --andy
>
>
> --
> Insert something humorous here. :-)
>
>
>
> ------------------------------
>
> Message: 40
> Date: Wed, 26 Nov 2008 17:22:40 +0100
> From: Mats Exter <mats.exter@uni-koeln.de>
> Subject: [R] Problem with aovlmer.fnc in languageR
> To: r-help@r-project.org
> Message-ID: <492D77D0.306@uni-koeln.de>
> Content-Type: text/plain; charset=ISO-8859-15; format=flowed
>
> Dear R list,
>
> I have a recurring problem with the languageR package, specifically the
> aovlmer.fnc function. When I try to run the following code (from R. H.
> Baayen's textbook):
>
>
> # Example 1:
> library(languageR)
> latinsquare.lmer <- lmer(RT ~ SOA + (1 | Word) + (1 | Subject),
> data = latinsquare)
> x <- pvals.fnc(latinsquare.lmer,
> withMCMC = TRUE)
> aovlmer.fnc(latinsquare.lmer,
> mcmc = x$mcmc,
> which = c("SOAmedium", "SOAshort"))
>
>
> I get the following error message (German locale):
>
>
> Fehler in anova(object) : Calculated PWRSS for a LMM is negative
> (i.e. "Error in anova(object) : Calculated PWRSS for a LMM is negative")
>
>
> Invoking traceback yields the following result:
>
>
> > traceback()
> 4: .Call(mer_update_projection, object)
> 3: anova(object)
> 2: anova(object)
> 1: aovlmer.fnc(latinsquare.lmer, mcmc = x$mcmc, which = c("SOAmedium",
>    "SOAshort"))
>
>
> By contrast, the following code (without the aovlmer.fnc command) runs
> without error:
>
>
> # Example 2:
> library(languageR)
> latinsquare.lmer <- lmer(RT ~ SOA + (1 | Word) + (1 | Subject),
> data = latinsquare)
> pvals.fnc(latinsquare.lmer,
> withMCMC = TRUE)
>
>
> Similarly, the following code (without the pvals.fnc command, and
> consequently...
>
> [Message clipped]
> [[alternative HTML version deleted]]