Hello everyone,
I am trying to access the built-in Fortran and C code that ships with R. Can
anyone help me with that? For example, the kmeans clustering algorithm is
written in Fortran; I want to access it and see the Fortran source code.
Can anyone tell me how to do that?
Thanks,
Nitin Kumar
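
A quick way to see which compiled routine an R function calls is to print the
function and look for .C/.Fortran/.Call entry points; the Fortran itself only
ships with the R source distribution, not with a binary install. A minimal
sketch (the file path below is the one used in the 2.x source tree and should
be checked against your version):

    stats::kmeans
    ## the Hartigan-Wong branch contains a call of the form
    ##    .Fortran("kmns", ...)
    ## the corresponding Fortran source lives in the R source tarball at
    ##    src/library/stats/src/kmns.f
    ## source tarballs: http://cran.r-project.org/sources.html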
On Thu, Nov 27, 2008 at 12:00 PM, <r-help-request@r-project.org> wrote:
> Send R-help mailing list submissions to
> r-help@r-project.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> https://stat.ethz.ch/mailman/listinfo/r-help
> or, via email, send a message with subject or body 'help' to
> r-help-request@r-project.org
>
> You can reach the person managing the list at
> r-help-owner@r-project.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of R-help digest..."
>
>
> Today's Topics:
>
> 1. Re: Efficient passing through big data.frame and modifying
> select (Johannes Graumann)
> 2. Re: Running rtest - how to/ help (indian scorpio)
> 3. Re: how to read .sps (SPSS file extension)? (Eik Vettorazzi)
> 4. Re: how to read .sps (SPSS file extension)? (Eik Vettorazzi)
> 5. Re: plotting density for truncated distribution (Chris Andrews)
> 6. construct a vector (axionator)
> 7. Re: construct a vector (Marc Schwartz)
> 8. Re: construct a vector (Richard.Cotton@hsl.gov.uk)
> 9. Re: multiple imputation with fit.mult.impute in Hmisc - how
> to replace NA with imputed value? (Frank E Harrell Jr)
> 10. Re: construct a vector (Wacek Kusnierczyk)
> 11. Re: construct a vector (axionator)
> 12. Re: Question about Kolmogorov-Smirnov Test (Uwe Ligges)
> 13. Re: memory limit (seanpor)
> 14. Re: Very slow: using double apply and cor.test to compute
> correlation p.values for 2 matrices (jim holtman)
> 15. how to check linearity in Cox regression (Terry Therneau)
> 16. Chi-Square Test Disagreement (Andrew Choens)
> 17. Re: eclipse and R (Tobias Verbeke)
> 18. Finding Stopping time (Debanjan Bhattacharjee)
> 19. Re: Very slow: using double apply and cor.test to compute
> correlation p.values for 2 matrices (David Winsemius)
> 20. memory limit (iwalters)
> 21. Needs suggestions for choosing appropriate R packages
> (zhijie zhang)
> 22. plm pakage (valeria pedrina)
> 23. S4 object (Laurina Guerra)
> 24. Re: Very slow: using double apply and cor.test to compute
> correlation p.values for 2 matrices (Jorge Ivan Velez)
> 25. Re: Chi-Square Test Disagreement (Chuck Cleland)
> 26. S4 slot containing either aov or NULL (Thomas Kaliwe)
> 27. odfWeave and XML... on a Mac (Tubin)
> 28. Re: Very slow: using double apply and cor.test to compute
> correlation p.values for 2 matrices (hadley wickham)
> 29. Re: Very slow: using double apply and cor.test to compute
> correlation p.values for 2 matrices (Daren Tan)
> 30. Re: plotting density for truncated distribution (Jeroen Ooms)
> 31. RES: S4 object (Rodrigo Aluizio)
> 32. RES: memory limit (Leandro Marino)
> 33. ts subscripting problem (andrew collier)
> 34. survreg and pweibull (Andrew Beckerman)
> 35. Second y-axis (Dr. Alireza Zolfaghari)
> 36. Re: multiple imputation with fit.mult.impute in Hmisc - how
> to replace NA with imputed value? (Charlie Brush)
> 37. Re: multiple imputation with fit.mult.impute in Hmisc - how
> to replace NA with imputed value? (Frank E Harrell Jr)
> 38. Re: Chi-Square Test Disagreement (Berwin A Turlach)
> 39. Re: Chi-Square Test Disagreement (Andrew Choens)
> 40. Problem with aovlmer.fnc in languageR (Mats Exter)
> 41. Re: S4 slot containing either aov or NULL (Matthias Kohl)
>      42. Re: Chi-Square Test Disagreement (Ted Harding)
> 43. Re: Error in sqlCopy in RODBC (BKMooney)
> 44. Smoothed 3D plots (Jorge Ivan Velez)
> 45. Re: weighted ftable (Andrew Choens)
> 46. Re: Chi-Square Test Disagreement (Andrew Choens)
> 47. Reshape with var as fun.aggregate (locklin.jason@gmail.com)
> 48. Creating a vector based on lookup function (PDXRugger)
> 49. Request for Assistance in R with NonMem (Michael White)
> 50. Re: Smoothed 3D plots (Mark Difford)
> 51. SPSSyntax function (Adrian Dusa)
> 52. R and SPSS (Applejus)
> 53. Re: Creating a vector based on lookup function (Charles C. Berry)
> 54. Re: Very slow: using double apply and cor.test to compute
> correlation p.values for 2 matrices (Thomas Lumley)
> 55. Re: memory limit (Stavros Macrakis)
> 56. Installing packages based on the license (Ted Mendum)
> 57. Re: R and SPSS (Liviu Andronic)
> 58. SVM (simon abai bisrat)
> 59. Estimates of coefficient variances and covariances from a
> multinomial logistic regression? (Grant Gillis)
> 60. Re: Installing packages based on the license (Charles C. Berry)
> 61. Re: Estimates of coefficient variances and covariances from a
> multinomial logistic regression? (Charles C. Berry)
> 62. Re: Installing packages based on the license (Duncan Murdoch)
> 63. Re: memory limit (Henrik Bengtsson)
> 64. Drawing a tree in R (Severin Hacker)
> 65. Re: Reshape with var as fun.aggregate (hadley wickham)
> 66. Re: R and SPSS (Andrew Choens)
> 67. Re: R and SPSS (David Winsemius)
> 68. Boundary value problem (karthik jayasurya)
> 69. Business Data Sets (nmarti)
> 70. Re: How to create a string containing '\/' to be used with
> SED? (ikarus)
> 71. base:::rbind (Fernando Saldanha)
> 72. ggplot2 problem (steve)
> 73. Re: base:::rbind (Gabor Grothendieck)
> 74. Re: ggplot2 problem (Eric)
> 75. Re: ggplot2 problem (steve)
> 76. Re: How to create a string containing '\/' to be used with
> SED? (seanpor)
> 77. as.numeric in data.frame, but only where it is possible (Kinoko)
> 78. Re: R and SPSS (Alain Guillet)
> 79. Re: Error in sqlCopy in RODBC (Dieter Menne)
> 80. Re: Welcome to the "R-help" mailing list (Digest mode)
> (Weijia You)
> 81. Re: Welcome to the "R-help" mailing list (Digest mode)
>        (Gábor Csárdi)
> 82. Regression Problem for loop (ales grill)
> 83. Re: survreg and pweibull solved for any distribution
> (Andrew Beckerman)
> 84. what is there in a numeric (0)? (jass@in.gr)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 26 Nov 2008 11:36:58 +0100
> From: Johannes Graumann <johannes_graumann@web.de>
> Subject: Re: [R] Efficient passing through big data.frame and
>        modifying select fields
> To: "Henrik Bengtsson" <hb@stat.berkeley.edu>
> Cc: R help <R-help@stat.math.ethz.ch>, William Dunlap
> <wdunlap@tibco.com>
> Message-ID: <200811261137.00489.johannes_graumann@web.de>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Marvelous! Thanks guys for your hints and time! Very smooth now!
>
> Joh
>
> On Wednesday 26 November 2008 03:41:49 Henrik Bengtsson wrote:
> > Alright, here's another $.02: using 'use.names=FALSE' in unlist() is
> > much faster than the default 'use.names=TRUE'. /Henrik
> >
> > On Tue, Nov 25, 2008 at 6:40 PM, Henrik Bengtsson <hb@stat.berkeley.edu> wrote:
> > > My $.02: Using argument 'fixed=TRUE' in strsplit() is much faster than
> > > the default 'fixed=FALSE'. /Henrik
> > >
> > > On Tue, Nov 25, 2008 at 1:02 PM, William Dunlap <wdunlap@tibco.com> wrote:
> > >>> -----Original Message-----
> > >>> From: William Dunlap
> > >>> Sent: Tuesday, November 25, 2008 9:16 AM
> > >>> To: 'johannes_graumann@web.de'
> > >>> Subject: Re: [R] Efficient passing through big data.frame and
> > >>> modifying select fields
> > >>>
> > >>> > Johannes Graumann johannes_graumann at web.de
> > >>> > Tue Nov 25 15:16:01 CET 2008
> > >>> >
> > >>> > Hi all,
> > >>> >
> > >>> > I have relatively big data frames (> 10000 rows by 80 columns)
> > >>> > that need to be exposed to "merge". Works marvelously well in
> > >>> > general, but some fields of the data frames actually contain
> > >>> > multiple ";"-separated values encoded as a character string without
> > >>> > defined order, which makes the fields not match each other.
> > >>> >
> > >>> > Example:
> > >>> > > frame1[1,1]
> > >>> >
> > >>> > [1] "some;thing"
> > >>> >
> > >>> > >frame2[2,1]
> > >>> >
> > >>> > [2] "thing;some"
> > >>> >
> > >>> > In order to enable merging/duplicate identification of columns
> > >>> > containing these strings, I wrote the following function, which
> > >>> > passes through the rows one by one, identifies ";"-containing cells,
> > >>> > splits and resorts them.
> > >>> >
> > >>> > ResortCombinedFields <- function(dframe){
> > >>> >    if(!is.data.frame(dframe)){
> > >>> >       stop("\"ResortCombinedFields\" input needs to be a data frame.")
> > >>> >    }
> > >>> >    for(row in seq(nrow(dframe))){
> > >>> >       for(mef in grep(";",dframe[row,])){
> > >>>
> > >>> I needed to add drop=TRUE to the above dframe[row,] for this to work.
> > >>>
> > >>> >          dframe[row,mef] <-
> > >>> >             paste(sort(unlist(strsplit(dframe[row,mef],";"))),collapse=";")
> > >>> >       }
> > >>> >    }
> > >>> >    return(dframe)
> > >>> > }
> > >>> >
> > >>> > works fine, but is horribly inefficient. How might this be
> > >>> > tackled more elegantly?
> > >>> >
> > >>> > Thanks for any input, Joh
> > >>>
> > >>> It is usually faster to loop over columns of a data frame and use
> > >>> row subscripting, if needed, on individual columns. E.g., the following
> > >>> 2 are much quicker on a sample 1000 by 4 dataset I made with
> > >>>
> > >>> dframe <- data.frame(lapply(c(One=1,Two=2,Three=3),
> > >>>       function(i)sapply(1:1000,
> > >>>          function(i)paste(sample(LETTERS[1:5],size=sample(3,size=1),repl=FALSE),
> > >>>                           collapse=";"))),
> > >>>    stringsAsFactors=FALSE)
> > >>> dframe$Four <- sample(LETTERS[1:5], size=nrow(dframe),
> > >>>    replace=TRUE) # no ;'s in column Four
> > >>>
> > >>> The first function, f1, doesn't try to find which rows may
> > >>> need adjusting and the second, f2, does.
> > >>>
> > >>> f1 <- function(dframe){
> > >>>    if(!is.data.frame(dframe)){
> > >>>       stop("\"ResortCombinedFields\" input needs to be a data frame.")
> > >>>    }
> > >>>    for(icol in seq_len(ncol(dframe))){
> > >>>       dframe[,icol] <- unlist(lapply(strsplit(dframe[,icol], ";"),
> > >>>          function(parts) paste(sort(parts), collapse=";")))
> > >>>    }
> > >>>    return(dframe)
> > >>> }
> > >>>
> > >>> f2 <- function(dframe){
> > >>>    if(!is.data.frame(dframe)){
> > >>>       stop("\"ResortCombinedFields\" input needs to be a data frame.")
> > >>>    }
> > >>>    for(icol in seq_len(ncol(dframe))){
> > >>>       col <- dframe[,icol]
> > >>>       irow <- grep(";", col)
> > >>>       if (length(irow)) {
> > >>>          col[irow] <- unlist(lapply(strsplit(col[irow], ";"),
> > >>>             function(parts) paste(sort(parts), collapse=";")))
> > >>>          dframe[,icol] <- col
> > >>>       }
> > >>>    }
> > >>>    return(dframe)
> > >>> }
> > >>>
> > >>> Times were
> > >>>
> > >>> > unix.time(z<-ResortCombinedFields(dframe))
> > >>>
> > >>> user system elapsed
> > >>> 2.526 0.022 2.559
> > >>>
> > >>> > unix.time(f1z<-f1(dframe))
> > >>>
> > >>> user system elapsed
> > >>> 0.509 0.000 0.508
> > >>>
> > >>> > unix.time(f2z<-f2(dframe))
> > >>>
> > >>> user system elapsed
> > >>> 0.259 0.004 0.264
> > >>>
> > >>> > identical(z, f1z)
> > >>>
> > >>> [1] TRUE
> > >>>
> > >>> > identical(z, f2z)
> > >>>
> > >>> [1] TRUE
> > >>
> > >> In R 2.7.0 (April 2008) f1() and f2() both take time proportional
> > >> to nrow(dframe), while your original ResortCombinedFields() takes
> > >> time proportional to the square of nrow(dframe). E.g., for 50,000
> > >> rows ResortCombinedFields takes 4252 seconds while f2 takes 14 seconds.
> > >> It looks like 2.9 acts about the same.
> > >>
> > >> Bill Dunlap
> > >> TIBCO Software Inc - Spotfire Division
> > >> wdunlap tibco.com
> > >>
> > >> ______________________________________________
> > >> R-help@r-project.org mailing list
> > >> https://stat.ethz.ch/mailman/listinfo/r-help
> > >> PLEASE do read the posting guide
> > >> http://www.R-project.org/posting-guide.html and provide commented,
> > >> minimal, self-contained, reproducible code.
>
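
Combining the three tips from this thread - loop over columns rather than
rows, 'fixed=TRUE' in strsplit(), and 'use.names=FALSE' in unlist() - gives a
sketch like the following (untimed here, so treat it as an illustration
rather than a benchmark winner):

    f3 <- function(dframe){
       if(!is.data.frame(dframe)){
          stop("\"f3\" input needs to be a data frame.")
       }
       for(icol in seq_len(ncol(dframe))){
          col <- dframe[,icol]
          irow <- grep(";", col, fixed=TRUE)
          if (length(irow)) {
             # split on a literal ";", sort the parts, and reassemble
             col[irow] <- unlist(lapply(strsplit(col[irow], ";", fixed=TRUE),
                function(parts) paste(sort(parts), collapse=";")),
                use.names=FALSE)
             dframe[,icol] <- col
          }
       }
       return(dframe)
    }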
> -------------- next part --------------
> A non-text attachment was scrubbed...
> Name: not available
> Type: application/pgp-signature
> Size: 835 bytes
> Desc: This is a digitally signed message part.
> URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20081126/f2c08061/attachment-0001.bin>
>
> ------------------------------
>
> Message: 2
> Date: Wed, 26 Nov 2008 16:44:20 +0530
> From: "indian scorpio" <cool.scorpio84@gmail.com>
> Subject: Re: [R] Running rtest - how to/ help
> To: r-help@r-project.org
> Message-ID:
> <bd914bcb0811260314j558328b7vea363462513dc794@mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Well, after trying a couple of things:
> 1. The rtest.java example with the command line argument --zero-init
> gives this error:
> Creating Rengine (with arguments)
> Rengine created, waiting for R
> #
> # An unexpected error has been detected by Java Runtime Environment:
> #
> # EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x6c733b9d, pid=3640,
> tid=5016
> #
> # Java VM: Java HotSpot(TM) Client VM (10.0-b19 mixed mode, sharing
> windows-x86)
> # Problematic frame:
> # C [R.dll+0x33b9d]
> #
> # An error report file with more information is saved as:
> # C:\workspaceVE\XTest\hs_err_pid3640.log
> #
> # If you would like to submit a bug report, please visit:
> # http://java.sun.com/webapps/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #
>
> 2. For rtest2.java the behaviour remains the same, i.e. it terminates
> after "Creating Rengine (with arguments)"; the only difference is a
> window/frame which opens for a fraction of a second.
>
>
>
> ------------------------------
>
> Message: 3
> Date: Wed, 26 Nov 2008 13:00:52 +0100
> From: Eik Vettorazzi <E.Vettorazzi@uke.uni-hamburg.de>
> Subject: Re: [R] how to read .sps (SPSS file extension)?
> To: livio finos <livio.finos@gmail.com>
> Cc: r-help@r-project.org
> Message-ID: <492D3A74.1070508@uke.uni-hamburg.de>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> maybe the importers of the memisc-package will help you, but I never
> tried them, see
>
> help(importers,package="memisc")
>
> At first glance it seems that you have to split your file manually,
> but maybe there is another way.
> hth.
>
> livio finos schrieb:
> > sorry, you are completely right!
> > sps is not the extension for a portable file! Sorry for the time I made
> > you spend.
> > Let me make my problem clearer.
> > I am exporting a dataset from LimeSurvey (a free piece of software for
> > internet surveys). It works very well and allows exporting to different
> > formats such as csv and Excel. That is fine, but what I like about the
> > SPSS format is the variable labels.
> > LimeSurvey declares that it exports to SPSS, but it exports an .sps file,
> > which is actually not a data format but code. I only realized that now,
> > sorry. I attach an extract of the code here below.
> > Do you have any suggestion on how to manage that? I think it would be
> > great if we could improve the interoperability among free software.
> > thanks again..
> > livio
> >
> > NEW FILE.
> > FILE TYPE NESTED RECORD=1(A).
> > - RECORD TYPE 'A'.
> > - DATA LIST LIST / i0(A1) d1(N3) d2(DATETIME20.0) d3(A15) d4(N2)
> > d5(N1) d6(N1) d7(N1) d8(N1) d9(N1) d10(N1) d11(N1) d12(N1) d13(N1)
> > d14(N1) d15(N1) d16(N1) d17(N2) d18(N1) d19(N1) d20(N1) .
> >
> > - RECORD TYPE 'B'.
> > - DATA LIST LIST / i1(A1) d21(N1) d22(N1) d23(N1) d24(N1) d25(N1)
> > d26(N1) d27(N1) d28(N1) d29(N1) d30(N1) d31(N1) d32(N1) d33(N1)
> > d34(N1) d35(N1) d36(N1) d37(N1) d38(A37) d39(N1) d40(N1) .
> >
> > - RECORD TYPE 'C'.
> > - DATA LIST LIST / i2(A1) d41(N1) d42(N1) d43(N1) d44(N1) d45(N1)
> > d46(N1) d47(N1) d48(N1) d49(N1) d50(N1) d51(N1) d52(N1) d53(N1)
> > d54(N1) d55(N1) d56(N1) d57(N1) d58(N1) d59(N1) d60(N1) .
> >
> > - RECORD TYPE 'D'.
> > - DATA LIST LIST / i3(A1) d61(N1) d62(N1) d63(N1) d64(N1) d65(N1)
> > d66(N1) d67(N1) d68(N1) d69(N1) d70(N1) d71(N1) d72(N1) d73(N1)
> > d74(N1) d75(N1) d76(N1) d77(N1) d78(N1) d79(N1) d80(N1) .
> >
> > - RECORD TYPE 'E'.
> > - DATA LIST LIST / i4(A1) d81(N1) d82(N1) d83(N1) d84(N1) d85(N1)
> > d86(N1) d87(N1) d88(N1) d89(N1) d90(N1) d91(N1) d92(N1) d93(N1)
> > d94(N1) d95(N1) d96(N1) d97(N1) d98(N1) d99(N1) d100(N1) .
> >
> > - RECORD TYPE 'F'.
> > - DATA LIST LIST / i5(A1) d101(N1) d102(N1) d103(N1) d104(N1) d105(N1)
> > d106(N1) d107(N1) d108(N1) d109(N1) .
> > END FILE TYPE.
> >
> > BEGIN DATA
> > A '8' '01-01-1980 00:00:00' 'it' '13' '1' '1' '' '1' '' '1' '1' '1' '1' '1' '4' '0' '' '1' '' '1'
> > B '' '1' '1' '1' '1' '0' '' '0' '' '2' '2' '1' '4' '2' '7' '3' '2' '0''2' '3'
> > C '3' '4' '4' '2' '2' '1' '4' '4' '2' '3' '1' '4' '1' '4' '1' '1' '4' '3' '4' '3'
> > D '1' '3' '1' '1' '1' '1' '1' '1' '1' '1' '1' '1' '2' '2' '2' '2' '3' '3' '3' '2'
> > E '2' '3' '2' '2' '0''0''0''0''0''0''0''3' '4' '4' '2' '1' '5' '2' '5' '4'
> > F '1' '2' '2' '2' '2' '1' '2' '2' '5'
> > A '9' '01-01-1980 00:00:00' 'it' '13' '2' '1' '' '1' '' '0' '' '1' '1' '1' '3' '0' '' '1' '' '1'
> > B '' '0' '' '1' '1' '0' '' '0' '' '2' '2' '1' '4' '3' '7' '3' '2' '0''2' '4'
> > C '3' '4' '4' '3' '3' '3' '4' '3' '2' '2' '1' '3' '1' '4' '1' '4' '4' '4' '4' '3'
> > D '1' '2' '1' '1' '1' '1' '1' '1' '1' '1' '1' '1' '3' '1' '3' '3' '3' '3' '3' '3'
> > E '3' '3' '3' '2' '0''0''0''0''0''0''0''3' '4' '2' '2' '5' '3' '5' '2' '5'
> > F '1' '5' '5' '5' '5' '5' '5' '2' '5'
> > A '10' '01-01-1980 00:00:00' 'it' '13' '1' '1' '' '1' '' '1' '2' '0' '' '0' '' '0' '' '1' '' '1'
> > B '' '1' '2' '0' '' '0' '' '0' '' '1' '2' '1' '4' '6' '7' '3' '2' '0''3' '3'
> > C '4' '4' '4' '4' '3' '4' '4' '3' '2' '2' '1' '4' '1' '4' '1' '3' '4' '4' '4' '1'
> > D '1' '1' '1' '1' '1' '1' '1' '1' '1' '1' '1' '3' '3' '3' '3' '3' '3' '3' '3' '3'
> > E '3' '3' '3' '2' '0''0''0''0''0''0''0''5' '4' '5' '2' '5' '5' '5' '5' '5'
> > F '1' '5' '5' '5' '5' '5' '5' '1' '5'
> > END DATA.
> > EXECUTE.
> >
> > *Define Variable Properties.
> > VARIABLE LABELS d1 'Record ID'.
> > VARIABLE LABELS d2 'Data di completamento'.
> > VARIABLE LABELS d3 'Lingua di partenza'.
> > VARIABLE LABELS d4 'Età:'.
> > VARIABLE LABELS d5 'Sesso:'.
> > VARIABLE LABELS d6 '3 - Papà'.
> > VARIABLE LABELS d7 'Com'è composta la tua famiglia? - COMMENT'.
> > VARIABLE LABELS d8 '3 - Mamma'.
> > VARIABLE LABELS d9 'Com'è composta la tua famiglia? - COMMENT'.
> > VARIABLE LABELS d10 '3 - Fratelli n°'.
> > VARIABLE LABELS d11 'Com'è composta la tua famiglia? - COMMENT'.
> > VARIABLE LABELS d12 '3 - Sorelle n°'.
> > VARIABLE LABELS d13 'Com'è composta la tua famiglia? - COMMENT'.
> > VARIABLE LABELS d14 '3 - Nonni n°'.
> > VARIABLE LABELS d15 'Com'è composta la tua famiglia? - COMMENT'.
> > VARIABLE LABELS d16 '3 - Altre figure parentali (zii, cugini, ecc.) n°'.
> > VARIABLE LABELS d17 'Com'è composta la tua famiglia? - COMMENT'.
> > VARIABLE LABELS d18 '4 - Papà'.
> > VARIABLE LABELS d19 'Quali di queste persone vivono in casa con te? - COMMENT'.
> > VARIABLE LABELS d20 '4 - Mamma'.
> > VARIABLE LABELS d21 'Quali di queste persone vivono in casa con te? - COMMENT'.
> > VARIABLE LABELS d22 '4 - Fratelli n°'.
> > VARIABLE LABELS d23 'Quali di queste persone vivono in casa con te? - COMMENT'.
> > VARIABLE LABELS d24 '4 - Sorelle n°'.
> > VARIABLE LABELS d25 'Quali di queste persone vivono in casa con te? - COMMENT'.
> > VARIABLE LABELS d26 '4 - Nonni n°'.
> > VARIABLE LABELS d27 'Quali di queste persone vivono in casa con te? - COMMENT'.
> > VARIABLE LABELS d28 '4 - Altre figure parentali (zii, cugini, ecc.) n°'.
> > VARIABLE LABELS d29 'Quali di queste persone vivono in casa con te? - COMMENT'.
> >
> >
> > *Define Value labels.
> > VALUE LABELS d5
> > 1 "Maschio"
> > 2 "Femmina".
> > VALUE LABELS d6
> > 1 "Sì"
> > 0 "Non selezionato".
> > VALUE LABELS d8
> > 1 "Sì"
> > 0 "Non selezionato".
> >
> >
> >
> >
> > On Tue, Nov 25, 2008 at 10:27 AM, Eik Vettorazzi
> > <E.Vettorazzi@uke.uni-hamburg.de> wrote:
> >
> > Hi Livio,
> > I think you mixed something up. The .sps - files are the syntax
> > files of SPSS, and I think there is no automated way (but I would
> > like to be corrected there) of converting SPSS syntax to R-code.
> > The usual data files of SPSS have the extension .sav. Such files
> > can easily be read by read.spss (package foreign) or spss.get
> > (package Hmisc); if you think the variable labels of SPSS are
> > fancy, the latter approach is possibly more appropriate, because it
> > adds an attribute with this label to each variable.
> > hth.
> >
> >
> >
> > livio finos schrieb:
> >
> > Hi everyone,
> > I'm trying to import a .sps (SPSS portable file) file.
> > The read.spss function (library foreign) doesn't allow importing
> > such files.
> > Should I import it in SPSS and then save it as a .sav file? Is there
> > no other solution available?
> > What I mostly like about SPSS files is that they have variable
> > labels.
> > What I really wish to keep are the variable.labels from the
> > SPSS file; so, if there is a different way to bring them in from the
> > .sps file, that would also be OK
> > (I also have a csv copy, but without the variable.labels
> > obviously).
> > thanks for any answer..
> > livio
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@r-project.org <mailto:R-help@r-project.org>
mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible
> code.
> >
> >
> >
> > --
> > Eik Vettorazzi
> > Institut für Medizinische Biometrie und Epidemiologie
> > Universitätsklinikum Hamburg-Eppendorf
> >
> > Martinistr. 52
> > 20246 Hamburg
> >
> > T ++49/40/42803-8243
> > F ++49/40/42803-7790
> >
> >
>
>
>
> ------------------------------
>
> Message: 4
> Date: Wed, 26 Nov 2008 13:05:21 +0100
> From: Eik Vettorazzi <E.Vettorazzi@uke.uni-hamburg.de>
> Subject: Re: [R] how to read .sps (SPSS file extension)?
> To: livio finos <livio.finos@gmail.com>
> Cc: r-help@r-project.org
> Message-ID: <492D3B81.3030709@uke.uni-hamburg.de>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> sorry for the typo,
>
> help(importer, package="memisc")
>
> will do the trick.
>
> Eik Vettorazzi schrieb:
> > maybe the importers of the memisc-package will help you, but I never
> > tried them, see
> >
> > help(importers,package="memisc")
> >
> > At first glance it seems that you have to split your file manually,
> > but maybe there is another way.
> > hth.
> >
> > livio finos schrieb:
> >> sorry, you are completely right!
> >> sps is not the extension for portable file! sorry for the time I
make
> >> you spend.
> >> I try to make my problem more clear.
> >> I exporting a dataset from limesurvey (a free software for
internet
> >> survey). It works very fine and it allow to export in different
> >> format such as csv and excel. this fine, but what I like from spss
> >> formats is the variables labels.
> >> limesurvey declares to export in spss, but it export in sps format
> >> which is not a format but a code actually. I realize it just now,
> >> sorry. I attach an extract of the code here below.
> >> do you have any suggestion on how to manage that? I think it will
be
> >> great if we can improve the interconnettivity among free software.
> >> thanks again..
> >> livio
> >>
> >> NEW FILE.
> >> FILE TYPE NESTED RECORD=1(A).
> >> - RECORD TYPE 'A'.
> >> - DATA LIST LIST / i0(A1) d1(N3) d2(DATETIME20.0) d3(A15) d4(N2)
> >> d5(N1) d6(N1) d7(N1) d8(N1) d9(N1) d10(N1) d11(N1) d12(N1) d13(N1)
> >> d14(N1) d15(N1) d16(N1) d17(N2) d18(N1) d19(N1) d20(N1) .
> >>
> >> - RECORD TYPE 'B'.
> >> - DATA LIST LIST / i1(A1) d21(N1) d22(N1) d23(N1) d24(N1) d25(N1)
> >> d26(N1) d27(N1) d28(N1) d29(N1) d30(N1) d31(N1) d32(N1) d33(N1)
> >> d34(N1) d35(N1) d36(N1) d37(N1) d38(A37) d39(N1) d40(N1) .
> >>
> >> - RECORD TYPE 'C'.
> >> - DATA LIST LIST / i2(A1) d41(N1) d42(N1) d43(N1) d44(N1) d45(N1)
> >> d46(N1) d47(N1) d48(N1) d49(N1) d50(N1) d51(N1) d52(N1) d53(N1)
> >> d54(N1) d55(N1) d56(N1) d57(N1) d58(N1) d59(N1) d60(N1) .
> >>
> >> - RECORD TYPE 'D'.
> >> - DATA LIST LIST / i3(A1) d61(N1) d62(N1) d63(N1) d64(N1) d65(N1)
> >> d66(N1) d67(N1) d68(N1) d69(N1) d70(N1) d71(N1) d72(N1) d73(N1)
> >> d74(N1) d75(N1) d76(N1) d77(N1) d78(N1) d79(N1) d80(N1) .
> >>
> >> - RECORD TYPE 'E'.
> >> - DATA LIST LIST / i4(A1) d81(N1) d82(N1) d83(N1) d84(N1) d85(N1)
> >> d86(N1) d87(N1) d88(N1) d89(N1) d90(N1) d91(N1) d92(N1) d93(N1)
> >> d94(N1) d95(N1) d96(N1) d97(N1) d98(N1) d99(N1) d100(N1) .
> >>
> >> - RECORD TYPE 'F'.
> >> - DATA LIST LIST / i5(A1) d101(N1) d102(N1) d103(N1) d104(N1)
> >> d105(N1) d106(N1) d107(N1) d108(N1) d109(N1) .
> >> END FILE TYPE.
> >>
> >> BEGIN DATA
> >> A '8' '01-01-1980 00:00:00' 'it'
'13' '1' '1' '' '1' ''
'1' '1' '1'
> >> '1' '1' '4' '0' ''
'1' '' '1'
> >> B '' '1' '1' '1' '1'
'0' '' '0' '' '2' '2'
'1' '4' '2' '7' '3' '2'
> >> '0''2' '3'
> >> C '3' '4' '4' '2' '2'
'1' '4' '4' '2' '3' '1'
'4' '1' '4' '1' '1' '4'
> >> '3' '4' '3'
> >> D '1' '3' '1' '1' '1'
'1' '1' '1' '1' '1' '1'
'1' '2' '2' '2' '2' '3'
> >> '3' '3' '2'
> >> E '2' '3' '2' '2'
'0''0''0''0''0''0''0''3'
'4' '4' '2' '1' '5' '2'
> >> '5' '4'
> >> F '1' '2' '2' '2' '2'
'1' '2' '2' '5'
> >> A '9' '01-01-1980 00:00:00' 'it'
'13' '2' '1' '' '1' ''
'0' '' '1'
> >> '1' '1' '3' '0' ''
'1' '' '1'
> >> B '' '0' '' '1' '1'
'0' '' '0' '' '2' '2'
'1' '4' '3' '7' '3' '2'
> >> '0''2' '4'
> >> C '3' '4' '4' '3' '3'
'3' '4' '3' '2' '2' '1'
'3' '1' '4' '1' '4' '4'
> >> '4' '4' '3'
> >> D '1' '2' '1' '1' '1'
'1' '1' '1' '1' '1' '1'
'1' '3' '1' '3' '3' '3'
> >> '3' '3' '3'
> >> E '3' '3' '3' '2'
'0''0''0''0''0''0''0''3'
'4' '2' '2' '5' '3' '5'
> >> '2' '5'
> >> F '1' '5' '5' '5' '5'
'5' '5' '2' '5'
> >> A '10' '01-01-1980 00:00:00' 'it'
'13' '1' '1' '' '1' ''
'1' '2' '0'
> >> '' '0' '' '0' ''
'1' '' '1'
> >> B '' '1' '2' '0' ''
'0' '' '0' '' '1' '2'
'1' '4' '6' '7' '3' '2'
> >> '0''3' '3'
> >> C '4' '4' '4' '4' '3'
'4' '4' '3' '2' '2' '1'
'4' '1' '4' '1' '3' '4'
> >> '4' '4' '1'
> >> D '1' '1' '1' '1' '1'
'1' '1' '1' '1' '1' '1'
'3' '3' '3' '3' '3' '3'
> >> '3' '3' '3'
> >> E '3' '3' '3' '2'
'0''0''0''0''0''0''0''5'
'4' '5' '2' '5' '5' '5'
> >> '5' '5'
> >> F '1' '5' '5' '5' '5'
'5' '5' '1' '5'
> >> END DATA.
> >> EXECUTE.
> >>
> >> *Define Variable Properties.
> >> VARIABLE LABELS d1 'Record ID'.
> >> VARIABLE LABELS d2 'Data di completamento'.
> >> VARIABLE LABELS d3 'Lingua di partenza'.
> >> VARIABLE LABELS d4 'Et? :'.
> >> VARIABLE LABELS d5 'Sesso:'.
> >> VARIABLE LABELS d6 '3 - Pap? '.
> >> VARIABLE LABELS d7 'Com'?? composta la tua famiglia?
- COMMENT'.
> >> VARIABLE LABELS d8 '3 - Mamma'.
> >> VARIABLE LABELS d9 'Com'?? composta la tua famiglia?
- COMMENT'.
> >> VARIABLE LABELS d10 '3 - Fratelli n??'.
> >> VARIABLE LABELS d11 'Com'?? composta la tua famiglia?
- COMMENT'.
> >> VARIABLE LABELS d12 '3 - Sorelle n??'.
> >> VARIABLE LABELS d13 'Com'?? composta la tua famiglia?
- COMMENT'.
> >> VARIABLE LABELS d14 '3 - Nonni n??'.
> >> VARIABLE LABELS d15 'Com'?? composta la tua famiglia?
- COMMENT'.
> >> VARIABLE LABELS d16 '3 - Altre figure parentali (zii, cugini,
ecc.)
> >> n??'.
> >> VARIABLE LABELS d17 'Com'?? composta la tua famiglia?
- COMMENT'.
> >> VARIABLE LABELS d18 '4 - Pap? '.
> >> VARIABLE LABELS d19 'Quali? di queste persone? vivono in casa
con te?
> >> - COMMENT'.
> >> VARIABLE LABELS d20 '4 - Mamma'.
> >> VARIABLE LABELS d21 'Quali? di queste persone? vivono in casa
con te?
> >> - COMMENT'.
> >> VARIABLE LABELS d22 '4 - Fratelli n??'.
> >> VARIABLE LABELS d23 'Quali? di queste persone? vivono in casa
con te?
> >> - COMMENT'.
> >> VARIABLE LABELS d24 '4 - Sorelle n??'.
> >> VARIABLE LABELS d25 'Quali? di queste persone? vivono in casa
con te?
> >> - COMMENT'.
> >> VARIABLE LABELS d26 '4 - Nonni n??'.
> >> VARIABLE LABELS d27 'Quali? di queste persone? vivono in casa
con te?
> >> - COMMENT'.
> >> VARIABLE LABELS d28 '4 - Altre figure parentali (zii, cugini,
ecc.)
> >> n??'.
> >> VARIABLE LABELS d29 'Quali? di queste persone? vivono in casa
con te?
> >> - COMMENT'.
> >>
> >>
> >> *Define Value labels.
> >> VALUE LABELS d5
> >> 1 "Maschio"
> >> 2 "Femmina".
> >> VALUE LABELS d6
> >> 1 "S??"
> >> 0 "Non selezionato".
> >> VALUE LABELS d8
> >> 1 "S??"
> >> 0 "Non selezionato".
> >>
> >>
> >>
> >>
> >> On Tue, Nov 25, 2008 at 10:27 AM, Eik Vettorazzi
> >> <E.Vettorazzi@uke.uni-hamburg.de
> >> <mailto:E.Vettorazzi@uke.uni-hamburg.de>> wrote:
> >>
> >> Hi Livio,
> >> I think you mixed something up. The .sps - files are the
syntax
> >> files of SPSS, and I think there is no automated way (but I
would
> >> like to be corrected there) of converting SPSS syntax to
R-code.
> >> The usual data files of spss have the extension .sav. Such
files
> >> can easily read by read.spss (package foreign) or spss.get
> >> (package Hmisc), if you think the variable labels of SPSS are
> >> fancy the latter approach is possibly more appropriate,
because it
> >> adds an attribute with this label to each row.
> >> hth.
> >>
> >>
> >>
> >> livio finos schrieb:
> >>
> >> Hi everyone,
> >> I'm trying to import .sps (SPSS portable file) file.
> >> the read.spss function (library foreign) doesn't allow
to
> >> import such files.
> >> should I import in spss and then save as sav file? there
is
> >> not other
> >> solutions available?
> >> what I mostly like from spss file is that they have
variable
> >> labels.
> >> want is really wish to keep are the variable.labels from
the
> >> spss file; so,
> >> if there is a different way to bring them from the sps
file
> >> will be also ok
> >> (I also have a csv copy but without the variable.labels
> >> obviously).
> >> thanks for any answer..
> >> livio
> >>
> >> [[alternative HTML version deleted]]
> >>
> >> ______________________________________________
> >> R-help@r-project.org <mailto:R-help@r-project.org>
mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained,
reproducible
> >> code.
> >>
> >>
> >> -- Eik Vettorazzi
> >> Institut f?r Medizinische Biometrie und Epidemiologie
> >> Universit?tsklinikum Hamburg-Eppendorf
> >>
> >> Martinistr. 52
> >> 20246 Hamburg
> >>
> >> T ++49/40/42803-8243
> >> F ++49/40/42803-7790
> >>
> >>
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
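
For the common case of a real SPSS data file (.sav), the variable labels do
come across; a minimal sketch using the foreign package (the file name
"survey.sav" is hypothetical):

    library(foreign)
    d <- read.spss("survey.sav", use.value.labels=TRUE, to.data.frame=TRUE)
    attr(d, "variable.labels")   # the VARIABLE LABELS, one per column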
>
>
> ------------------------------
>
> Message: 5
> Date: Wed, 26 Nov 2008 04:41:29 -0800 (PST)
> From: Chris Andrews <candrews@buffalo.edu>
> Subject: Re: [R] plotting density for truncated distribution
> To: r-help@r-project.org
> Message-ID: <20699699.post@talk.nabble.com>
> Content-Type: text/plain; charset=us-ascii
>
>
> Another option
>
> mydata <- rnorm(100000)
> mydata <- mydata[mydata>0]
> plot(density(c(mydata, -mydata), from=0))
>
> If you want the area under the curve to be one, you'll need to double the
> density estimate:
>
> dx <- density(c(mydata, -mydata), from=0)
> dx$y <- dx$y * 2
> plot(dx)
>
> Chris
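
The reflection trick is easy to wrap into a helper; a minimal sketch (the
name dens.trunc0 is made up here):

    dens.trunc0 <- function(x, ...) {
        d <- density(c(x, -x), from=0, ...)   # reflect at zero, then estimate
        d$y <- d$y * 2                        # renormalize to unit area on [0, Inf)
        d
    }
    plot(dens.trunc0(mydata))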
>
>
>
> Jeroen Ooms wrote:
> >
> > I am using density() to plot density curves. However, one of my
> > variables is truncated at zero, but has most of its density around zero. I
> > would like to know how to plot this with the density function.
> >
> > The problem is that if I do this the regular way with density(), values near
> > zero automatically get a very low value because there are no observed
> > values below zero. Furthermore there is some density below zero, although
> > there are no observed values below zero.
> >
> > This illustrates the problem:
> >
> > mydata <- rnorm(100000);
> > mydata <- mydata[mydata>0];
> > plot(density(mydata));
> >
> > the 'real' density is exactly the right half of a normal distribution, so
> > truncated at zero. However, using the default options, the line seems to
> > decrease with a nice curve at the left, with some density below zero. This
> > is pretty confusing for the reader. I have tried decreasing the bw, which
> > masks (but does not fix) some of the problem, but then the rest of the
> > curve also loses smoothness. I would like to make a plot of this data that
> > looks like the right half of a normal distribution, while keeping the
> > curve relatively smooth.
> >
> > Is there any way to specify this truncation in the density function, so
> > that it will only use the positive domain to calculate density?
> >
>
> --
> View this message in context:
> http://www.nabble.com/plotting-density-for-truncated-distribution-tp20684995p20699699.html
> Sent from the R help mailing list archive at Nabble.com.
>
>
>
> ------------------------------
>
> Message: 6
> Date: Wed, 26 Nov 2008 14:11:43 +0100
> From: axionator <axionator@gmail.com>
> Subject: [R] construct a vector
> To: "r-help@r-project.org" <r-help@r-project.org>
> Message-ID:
> <97a146780811260511j73c30f1drefba26e3eaf10def@mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Hi all,
> I have an unknown number of vectors (>=2) all of the same length. Out
> of these, I want to construct a new one as follows:
> having vectors u,v and w, the resulting vector z should have entries:
> z[1] = u[1], z[2] = v[1], z[3] = w[1]
> z[4] = u[2], z[5] = v[2], z[6] = w[2]
> ...
> i.e. go through the vectors u,v,w, taking each time the 1st, 2nd, ...
> elements and store them consecutively in z.
> Is there an efficient way in R to do this?
>
> Thanks in advance
> Armin
>
>
>
> ------------------------------
>
> Message: 7
> Date: Wed, 26 Nov 2008 07:33:24 -0600
> From: Marc Schwartz <marc_schwartz@comcast.net>
> Subject: Re: [R] construct a vector
> To: axionator <axionator@gmail.com>
> Cc: "r-help@r-project.org" <r-help@r-project.org>
> Message-ID: <492D5024.30704@comcast.net>
> Content-Type: text/plain; charset=ISO-8859-1
>
> on 11/26/2008 07:11 AM axionator wrote:
> > Hi all,
> > I have an unknown number of vectors (>=2) all of the same length. Out
> > of these, I want to construct a new one as follows:
> > having vectors u,v and w, the resulting vector z should have entries:
> > z[1] = u[1], z[2] = v[1], z[3] = w[1]
> > z[4] = u[2], z[5] = v[2], z[6] = w[2]
> > ...
> > i.e. go through the vectors u,v,w, taking each time the 1st, 2nd, ...
> > elements and store them consecutively in z.
> > Is there an efficient way in R to do this?
> >
> > Thanks in advance
> > Armin
>
> Is this what you want?
>
> u <- 1:10
> v <- 11:20
> w <- 21:30
>
> z <- as.vector(rbind(u, v, w))
>
> > z
> [1] 1 11 21 2 12 22 3 13 23 4 14 24 5 15 25 6 16 26 7 17 27 8
> [23] 18 28 9 19 29 10 20 30
>
>
> Essentially, we are creating a matrix from the 3 vectors:
>
> > rbind(u, v, w)
> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> u 1 2 3 4 5 6 7 8 9 10
> v 11 12 13 14 15 16 17 18 19 20
> w 21 22 23 24 25 26 27 28 29 30
>
> Then coercing that to a vector, taking advantage of the way in which
> matrix elements are stored.
>
> HTH,
>
> Marc Schwartz
>
>
>
> ------------------------------
>
> Message: 8
> Date: Wed, 26 Nov 2008 13:38:23 +0000
> From: Richard.Cotton@hsl.gov.uk
> Subject: Re: [R] construct a vector
> To: axionator <axionator@gmail.com>
> Cc: "r-help@r-project.org" <r-help@r-project.org>,
> r-help-bounces@r-project.org
> Message-ID:
> <
> OF8137852B.42F93AFC-ON8025750D.004AC462-8025750D.004AE18D@hsl.gov.uk>
> Content-Type: text/plain; charset="US-ASCII"
>
> > I have an unknown number of vectors (>=2) all of the same length. Out
> > of these, I want to construct a new one as follows:
> > having vectors u,v and w, the resulting vector z should have entries:
> > z[1] = u[1], z[2] = v[1], z[3] = w[1]
> > z[4] = u[2], z[5] = v[2], z[6] = w[2]
> > ...
> > i.e. go through the vectors u,v,w, taking each time the 1st, 2nd, ...
> > elements and store them consecutively in z.
> > Is there an efficient way in R to do this?
>
> u <- 1:5
> v <- (1:5) + 0.1
> w <- (1:5) + 0.2
> as.vector(rbind(u,v,w))
> # [1] 1.0 1.1 1.2 2.0 2.1 2.2 3.0 3.1 3.2 4.0 4.1 4.2 5.0 5.1 5.2
>
> Regards,
> Richie.
>
> Mathematical Sciences Unit
> HSL
>
>
>
>
>
> ------------------------------
>
> Message: 9
> Date: Wed, 26 Nov 2008 07:38:32 -0600
> From: Frank E Harrell Jr <f.harrell@vanderbilt.edu>
> Subject: Re: [R] multiple imputation with fit.mult.impute in Hmisc -
> how to replace NA with imputed value?
> To: Charlie Brush <cfbrush@ucdavis.edu>
> Cc: r-help@r-project.org
> Message-ID: <492D5158.4090003@vanderbilt.edu>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Charlie Brush wrote:
> > I am doing multiple imputation with Hmisc, and
> > can't figure out how to replace the NA values with
> > the imputed values.
> >
> > Here's a general outline of the process:
> >
> > > set.seed(23)
> > > library("mice")
> > > library("Hmisc")
> > > library("Design")
> > > d <- read.table("DailyDataRaw_01.txt",header=T)
> > > length(d);length(d[,1])
> > [1] 43
> > [1] 2666
> > So for this data set, there are 43 columns and 2666 rows.
> >
> > Here is a piece of data.frame d:
> > > d[1:20,4:6]
> > P01 P02 P03
> > 1 0.1 0.16 0.16
> > 2 NA 0.00 0.00
> > 3 NA 0.60 0.04
> > 4 NA 0.15 0.00
> > 5 NA 0.00 0.00
> > 6 0.7 0.00 0.75
> > 7 NA 0.00 0.00
> > 8 NA 0.00 0.00
> > 9 0.0 0.00 0.00
> > 10 0.0 0.00 0.00
> > 11 0.0 0.00 0.00
> > 12 0.0 0.00 0.00
> > 13 0.0 0.00 0.00
> > 14 0.0 0.00 0.00
> > 15 0.0 0.00 0.03
> > 16 NA 0.00 0.00
> > 17 NA 0.01 0.00
> > 18 0.0 0.00 0.00
> > 19 0.0 0.00 0.00
> > 20 0.0 0.00 0.00
> >
> > These are daily precipitation values at NCDC stations, and
> > NA values at station P01 will be filled using multiple
> > imputation and data from highly correlated stations P02 and P08:
> >
> > > f <- aregImpute(~ I(P01) + I(P02) + I(P08),
> > >        n.impute=10, match='closest', data=d)
> > Iteration 13
> > > fmi <- fit.mult.impute( P01 ~ P02 + P08 , ols, f, d)
> >
> > Variance Inflation Factors Due to Imputation:
> >
> > Intercept P02 P08
> > 1.01 1.39 1.16
> >
> > Rate of Missing Information:
> >
> > Intercept P02 P08
> > 0.01 0.28 0.14
> >
> > d.f. for t-distribution for Tests of Single Coefficients:
> >
> > Intercept P02 P08
> > 242291.18 116.05 454.95
> > > r <- apply(f$imputed$P01,1,mean)
> > > r
> > 2 3 4 5 7 8 16 17 249 250 251
> > 0.002 0.430 0.044 0.002 0.002 0.002 0.002 0.123 0.002 0.002 0.002
> > 252 253 254 255 256 257 258 259 260 261 262
> > 1.033 0.529 1.264 0.611 0.002 0.513 0.085 0.002 0.705 0.840 0.719
> > 263 264 265 266 267 268 269 270 271 272 273
> > 1.489 0.532 0.150 0.134 0.002 0.002 0.002 0.002 0.002 0.055 0.135
> > 274 275 276 277 278 279 280 281 282 283 284
> > 0.009 0.002 0.002 0.002 0.008 0.454 1.676 1.462 0.071 0.002 1.029
> > 285 286 287 288 289 418 419 420 421 422 700
> > 0.055 0.384 0.947 0.002 0.002 0.008 0.759 0.066 0.009 0.002 0.002
> >
> > ------------------------------------------------------------------
> > So far, this is working great.
> > Now, make a copy of d:
> > > dnew <- d
> >
> > And then fill in the NA values in P01 with the values in r
> >
> > For example:
> > > for (i in 1:length(r)){
> > >    dnew$P01[r[i,1]] <- r[i,2]
> > > }
> > This doesn't work, because each 'piece' of r is two numbers:
> > > r[1]
> > 2
> > 0.002
> > > r[1,1]
> > Error in r[1, 1] : incorrect number of dimensions
> >
> > My question: how can I separate the the two items in (for example)
> > r[1] to use the first part as an index and the second as a value,
> > and then use them to replace the NA values with the imputed values?
> >
> > Or is there a better way to replace the NA values with the imputed
> values?
> >
> > Thanks in advance for any help.
> >
>
> You didn't state your goal, or why fit.mult.impute does not do what you
> want. But you can look inside fit.mult.impute to see how it retrieves
> the imputed values. Also see the example in the documentation for transcan,
> which uses the command impute(xt, imputation=1) to retrieve one of the
> multiple imputations.
>
> Note that you can say library(Design) (omit the quotes) to access both
> Design and Hmisc.
>
> Frank
> --
> Frank E Harrell Jr Professor and Chair School of Medicine
> Department of Biostatistics Vanderbilt University
>
>
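
For the direct replacement the original poster asked about: the names of r
are the row indices of the originally missing values, so they can be used to
index P01 directly (a sketch, assuming r was built with apply as above):

    idx <- as.integer(names(r))   # rows of d where P01 was NA
    dnew$P01[idx] <- r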
>
> ------------------------------
>
> Message: 10
> Date: Wed, 26 Nov 2008 14:39:09 +0100
> From: Wacek Kusnierczyk <Waclaw.Marcin.Kusnierczyk@idi.ntnu.no>
> Subject: Re: [R] construct a vector
> To: axionator <axionator@gmail.com>
> Cc: R help <R-help@stat.math.ethz.ch>
> Message-ID: <492D517D.8050108@idi.ntnu.no>
> Content-Type: text/plain; charset=ISO-8859-1
>
> axionator wrote:
> > Hi all,
> > I have an unknown number of vectors (>=2) all of the same length. Out
> > of these, I want to construct a new one as follows:
> > having vectors u,v and w, the resulting vector z should have entries:
> > z[1] = u[1], z[2] = v[1], z[3] = w[1]
> > z[4] = u[2], z[5] = v[2], z[6] = w[2]
> > ...
> > i.e. go through the vectors u,v,w, taking each time the 1st, 2nd, ...
> > elements and store them consecutively in z.
> > Is there an efficient way in R to do this?
> >
> >
>
> suppose you have your vectors collected into a list, say vs; then the
> following will do:
>
> as.vector(do.call(rbind, vs))
>
> vQ
>
>
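
A small usage example of the list version, with vectors as in the original
question:

    vs <- list(u=1:3, v=11:13, w=21:23)
    as.vector(do.call(rbind, vs))
    ## [1]  1 11 21  2 12 22  3 13 23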
>
> ------------------------------
>
> Message: 11
> Date: Wed, 26 Nov 2008 14:43:30 +0100
> From: axionator <axionator@gmail.com>
> Subject: Re: [R] construct a vector
> To: "Wacek Kusnierczyk"
<Waclaw.Marcin.Kusnierczyk@idi.ntnu.no>
> Cc: R help <R-help@stat.math.ethz.ch>
> Message-ID:
> <97a146780811260543l4d5bcaa7td42e123b9f2906a9@mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Thanks, works fine.
>
> Armin
>
>
>
> ------------------------------
>
> Message: 12
> Date: Wed, 26 Nov 2008 14:43:53 +0100
> From: Uwe Ligges <ligges@statistik.tu-dortmund.de>
> Subject: Re: [R] Question about Kolmogorov-Smirnov Test
> To: Ricardo Ríos <ricardo.rios.sv@gmail.com>
> Cc: r-help@r-project.org
> Message-ID: <492D5299.5020206@statistik.tu-dortmund.de>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
>
>
> Ricardo Ríos wrote:
> > Hi wizards
> >
> > I have the following code for a Kolmogorov-Smirnov Test:
> >
> >
> > z<-c(1.6,10.3,3.5,13.5,18.4,7.7,24.3,10.7,8.4,4.9,7.9,12,16.2,6.8,14.7)
> > ks.test(z,"pexp",1/10)$statistic
> >
> > The Kolmogorov-Smirnov statistic is:
> >
> > D
> > 0.293383
> >
> > However, I have calculated the Kolmogorov-Smirnov statistic with the
> > following R code:
> >
> >
> > z<-c(1.6,10.3,3.5,13.5,18.4,7.7,24.3,10.7,8.4,4.9,7.9,12,16.2,6.8,14.7)
> > a<-sort(z)
> > d<- pexp(a, rate = 1/10, lower.tail = TRUE, log.p = FALSE)
> > w=numeric(length = length(a))
> > for(i in 1:length(a)) w[i]=i/15
> > max(abs(w-d))
> >
> > But I have obtained the following result:
> >
> > [1] 0.2267163
> >
> > Why are these results not equal?
>
> w is calculated as follows:
>
> w <- (seq(along=a)-1)/length(a)
> [ {0, ..., n-1} rather than {1, ..., n} ]
>
>
> Uwe Ligges
>
>
> > Thanks in advance
> >
>
>
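
For completeness: ks.test() takes the maximum deviation over both sides of
each jump of the empirical cdf, which reproduces D = 0.293383 for the data
above; a sketch:

    a <- sort(z)
    n <- length(a)
    d <- pexp(a, rate=1/10)
    max(pmax(seq_len(n)/n - d, d - (seq_len(n) - 1)/n))   # 0.293383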
>
> ------------------------------
>
> Message: 13
> Date: Wed, 26 Nov 2008 05:45:29 -0800 (PST)
> From: seanpor <seanpor@acm.org>
> Subject: Re: [R] memory limit
> To: r-help@r-project.org
> Message-ID: <20700590.post@talk.nabble.com>
> Content-Type: text/plain; charset=us-ascii
>
>
> Good afternoon,
>
> The short answer is "yes", the long answer is "it depends".
>
> It all depends on what you want to do with the data. I'm working with
> dataframes of a couple of million lines, on this plain desktop machine, and
> for my purposes it works fine. I read in text files, manipulate them,
> convert them into dataframes, do some basic descriptive stats and tests on
> them, a couple of columns at a time, all quick and simple in R. There are
> some libraries which are set up to handle very large datasets, e.g. biglm
> [1].
>
> If you're using algorithms which require vast quantities of memory, then as
> the previous emails in this thread suggest, you might need R running on
> 64-bit.
>
> If you're working with a problem which is "embarrassingly parallel" [2],
> then there are a variety of solutions - if you're in between, the solutions
> are much more data dependent.
>
> The flip question: how long would it take you to get up and running with the
> functionality (tried and tested in R) you require if you're going to be
> re-working things in C++?
>
> I suggest that you have a look at R, possibly using a subset of your full
> set to start with - you'll be amazed how quickly you can get up and
> running.
>
> As suggested at the start of this email... "it depends"...
>
> Best Regards,
> Sean O'Riordain
> Dublin
>
> [1] http://cran.r-project.org/web/packages/biglm/index.html
> [2] http://en.wikipedia.org/wiki/Embarrassingly_parallel
>
>
> iwalters wrote:
> >
> > I'm currently working with very large datasets that consist of
> > 1,000,000+ rows. Is it at all possible to use R for datasets this size,
> > or should I rather consider C++/Java?
> >
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/increasing-memory-limit-in-Windows-Server-2008-64-bit-tp20675880p20700590.html
> Sent from the R help mailing list archive at Nabble.com.
>
>
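
A concrete illustration of the chunked approach with biglm (made-up data;
biglm's update() method folds in each further chunk without holding the full
dataset in memory):

    library(biglm)
    chunk <- data.frame(y=rnorm(1e5), x1=rnorm(1e5), x2=rnorm(1e5))
    fit <- biglm(y ~ x1 + x2, data=chunk)
    chunk <- data.frame(y=rnorm(1e5), x1=rnorm(1e5), x2=rnorm(1e5))
    fit <- update(fit, chunk)     # repeat for each remaining chunk
    summary(fit)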
>
> ------------------------------
>
> Message: 14
> Date: Wed, 26 Nov 2008 09:14:40 -0500
> From: "jim holtman" <jholtman@gmail.com>
> Subject: Re: [R] Very slow: using double apply and cor.test to compute
> correlation p.values for 2 matrices
> To: "Daren Tan" <daren76@hotmail.com>
> Cc: r-help@stat.math.ethz.ch
> Message-ID:
> <644e1f320811260614i27b26152i566e9da4eae778f2@mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Your time is being taken up in cor.test because you are calling it
> 100,000 times. So grin and bear it with the amount of work you are
> asking it to do.
>
> Here I am only calling it 100 times:
>
> > m1 <- matrix(rnorm(10000), ncol=100)
> > m2 <- matrix(rnorm(10000), ncol=100)
> > Rprof('/tempxx.txt')
> > system.time(cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1,
> >     function(y) { cor.test(x,y)$p.value }) }))
> user system elapsed
> 8.86 0.00 8.89
> >
>
> so my guess is that calling it 100,000 times will take: 100,000 *
> 0.0886 seconds or about 3 hours.
>
> If you run Rprof, you will see it is spending most of its time there:
>
> 0 8.8 root
> 1. 8.8 apply
> 2. . 8.8 FUN
> 3. . . 8.8 apply
> 4. . . . 8.7 FUN
> 5. . . . . 8.6 cor.test
> 6. . . . . . 8.4 cor.test.default
> 7. . . . . . . 2.4 match.arg
> 8. . . . . . . . 1.7 eval
> 9. . . . . . . . . 1.4 deparse
> 10. . . . . . . . . . 0.6 .deparseOpts
> 11. . . . . . . . . . . 0.2 pmatch
> 11. . . . . . . . . . . 0.1 sum
> 10. . . . . . . . . . 0.5 %in%
> 11. . . . . . . . . . . 0.3 match
> 12. . . . . . . . . . . . 0.3 is.factor
> 13. . . . . . . . . . . . . 0.3 inherits
> 8. . . . . . . . 0.2 formals
> 9. . . . . . . . . 0.2 sys.function
> 7. . . . . . . 2.1 cor
> 8. . . . . . . . 1.1 match.arg
> 9. . . . . . . . . 0.7 eval
> 10. . . . . . . . . . 0.6 deparse
> 11. . . . . . . . . . . 0.3 .deparseOpts
> 12. . . . . . . . . . . . 0.1 pmatch
> 11. . . . . . . . . . . 0.2 %in%
> 12. . . . . . . . . . . . 0.2 match
> 13. . . . . . . . . . . . . 0.1 is.factor
> 14. . . . . . . . . . . . . . 0.1 inherits
> 9. . . . . . . . . 0.1 formals
> 8. . . . . . . . 0.5 stopifnot
> 9. . . . . . . . . 0.2 match.call
> 8. . . . . . . . 0.1 pmatch
> 8. . . . . . . . 0.1 is.data.frame
> 9. . . . . . . . . 0.1 inherits
> 7. . . . . . . 1.5 paste
> 8. . . . . . . . 1.4 deparse
> 9. . . . . . . . . 0.6 .deparseOpts
> 10. . . . . . . . . . 0.3 pmatch
> 10. . . . . . . . . . 0.1 any
> 9. . . . . . . . . 0.6 %in%
> 10. . . . . . . . . . 0.6 match
> 11. . . . . . . . . . . 0.5 is.factor
> 12. . . . . . . . . . . . 0.4 inherits
> 13. . . . . . . . . . . . . 0.2 mode
> 7. . . . . . . 0.4 switch
> 8. . . . . . . . 0.1 qnorm
> 7. . . . . . . 0.2 pt
> 5. . . . . 0.1 $
>
> On Tue, Nov 25, 2008 at 11:55 PM, Daren Tan <daren76@hotmail.com> wrote:
> >
> > My two matrices are roughly the sizes of m1 and m2. I tried using two
> > applys and cor.test to compute the correlation p.values. More than an hour,
> > and the code is still running. Please help to make it more efficient.
> >
> > m1 <- matrix(rnorm(100000), ncol=100)
> > m2 <- matrix(rnorm(10000000), ncol=100)
> >
> > cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1, function(y) {
> >     cor.test(x,y)$p.value }) })
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?
>
>
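
For Pearson correlations specifically, the whole grid can be computed
without calling cor.test at all: cor() on the transposed matrices gives
every row-vs-row correlation at once, and two-sided p-values follow from the
same t transform cor.test uses (a sketch, assuming complete data):

    n <- ncol(m1)                        # observations per row
    r <- cor(t(m1), t(m2))               # nrow(m1) x nrow(m2) correlations
    tstat <- r * sqrt((n - 2)/(1 - r^2))
    cor.pvalues <- 2 * pt(-abs(tstat), df=n - 2)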
>
> ------------------------------
>
> Message: 15
> Date: Wed, 26 Nov 2008 08:42:37 -0600 (CST)
> From: Terry Therneau <therneau@mayo.edu>
> Subject: [R] how to check linearity in Cox regression
> To: mmargoli@gettysburg.edu
> Cc: r-help@r-project.org
> Message-ID: <200811261442.mAQEgba26210@hsrnfs-101.mayo.edu>
> Content-Type: TEXT/plain; charset=us-ascii
>
> > On examining non-linearity of Cox coefficients with penalized splines -
> > I have not been able to dig up a completely clear description of the test
> > performed in R or S-plus.
>
> One "iron clad" way to test is to fit a model that has the
variable of
> interest
> "x" as a linear term, then a second model with splines, and do a
likelihood
> ratio test with 2*(difference in log-likelihood) on (difference in df)
> degrees
> of freedom. With a penalized model this test is conservative: the
> chi-square is
> not quite the right distribution, the true dist has the same mean but
> smaller
> variance.
>
> The pspline function uses an evenly spaced set of symmetric basis
> functions. A neat consequence of this is that the Wald test for linear vs
> 'more general' is a test that the coefficients of the spline terms fall in
> a linear series. That is, a linear trend test on the coefficients. This is
> what coxph does. As with the LR test, the chi-square dist is conservative.
> I have not worked at putting in the more correct distribution. See Eilers
> and Marx, Statistical Science 1996.
>
> > And what is the null for the non-linear test?
>
> The linear test is "is a linear term better than nothing"; the non-linear
> one is a sequential test, "is the non-linear better than the linear". The
> second test of course depends on the total number of df you allowed for the
> pspline fit. As a silly example, adding "+ pspline(x, df=200)" would likely
> show that the nonlinear term was not a significant addition, i.e., not
> worth 199 more degrees of freedom.
>
> Terry Therneau
>
>
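
A sketch of the "iron clad" likelihood ratio comparison, using the survival
package's lung data purely for illustration:

    library(survival)
    fit1 <- coxph(Surv(time, status) ~ age, data=lung)                  # linear
    fit2 <- coxph(Surv(time, status) ~ pspline(age, df=4), data=lung)   # spline
    lrt <- 2 * (fit2$loglik[2] - fit1$loglik[2])
    df  <- sum(fit2$df) - length(coef(fit1))
    pchisq(lrt, df=df, lower.tail=FALSE)   # conservative, per the note above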
>
> ------------------------------
>
> Message: 16
> Date: Wed, 26 Nov 2008 14:51:50 +0000
> From: Andrew Choens <andy.choens@gmail.com>
> Subject: [R] Chi-Square Test Disagreement
> To: r-help@r-project.org
> Message-ID: <1227711110.8422.24.camel@chinstrap>
> Content-Type: text/plain
>
> I was asked by my boss to do an analysis on a large data set, and I am
> trying to convince him to let me use R rather than SPSS. I think Sweave
> could make my life much much easier. To get me a little closer to this
> goal, I ran my analysis through R and SPSS and compared the resulting
> values. In all but one case, they were the same. Given the matrix
>
> [,1] [,2]
> [1,] 110 358
> [2,] 71 312
> [3,] 29 139
> [4,] 31 77
> [5,] 13 32
>
> This is the output from R:
> > chisq.test(test29)
>
> Pearson's Chi-squared test
>
> data: test29
> X-squared = 9.593, df = 4, p-value = 0.04787
>
> But the same data in SPSS generates a p value of .051. It's a small but
> important difference. I played around and rescaled things, and tried
> different values for B, but I never could get R to reach .051.
>
> I'd like to know which program is correct - R or SPSS? I know, this is a
> biased place to ask such a question. I also appreciate all input that
> will help me use R more effectively. The difference could be the result
> of my own ignorance.
>
> thanks
> --andy
>
> --
> Insert something humorous here. :-)
>
>
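
One plausible (unverified) explanation: some packages report the
likelihood-ratio chi-square alongside Pearson's, and for this table it lands
right around .05; a sketch:

    test29 <- matrix(c(110, 71, 29, 31, 13, 358, 312, 139, 77, 32), ncol=2)
    chisq.test(test29)                      # Pearson: X-squared = 9.593, p = 0.04787
    E  <- outer(rowSums(test29), colSums(test29))/sum(test29)
    G2 <- 2 * sum(test29 * log(test29/E))   # likelihood-ratio statistic
    pchisq(G2, df=4, lower.tail=FALSE)      # approx. 0.050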
>
> ------------------------------
>
> Message: 17
> Date: Wed, 26 Nov 2008 15:03:28 +0000
> From: "Tobias Verbeke" <tobias.verbeke@telenet.be>
> Subject: Re: [R] eclipse and R
> To: "Ruud H. koning" <r.h.koning@rug.nl>,
r-help@r-project.org
> Cc: statet-user@r-forge.wu-wien.ac.at
> Message-ID: <W117353088718051227711808@nocme1bl6.telenet-ops.be>
> Content-Type: text/plain; charset="us-ascii"
>
> Hi Ruud,
>
> I forwarded your message to the StatET (R in Eclipse) list;
> there might be StatET users with a similar setup as yours
> on that list (and the StatET developer is more likely to
> pick up your question there).
>
> Best,
> Tobias
>
> >Hello, I am trying to install Eclipse and R on an amd64 machine running
> >Suse linux 9.3. I have compiled R 2.8.0 with --enable-R-shlib and it
> >seems that compilation was successful. After starting R, I installed
> >the latest rJava package, from the output:
> >checking whether JRI is requested... yes
> >cp src/libjri.so libjri.so
> >It seems JRI support has been compiled successfully. However, when I try
> >to open R from within Eclipse, I receive an error message:
> >
> >Launching the R Console was cancelled, because it seems starting the
> >Java process/R engine failed.
> >Please make sure that R package 'rJava' with JRI is installed.
> >
> >I can open an R console from the command line, and attach the rJava
> >library without problems. What am I doing wrong here?
> >Thanks, Ruud
> >
> >______________________________________________
> >R-help@r-project.org mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
> >
> >
>
>
>
> ------------------------------
>
> Message: 18
> Date: Wed, 26 Nov 2008 10:04:55 -0500
> From: "Debanjan Bhattacharjee"
<debanjan.bhattacharjee@gmail.com>
> Subject: [R] Finding Stopping time
> To: r-help@r-project.org
> Message-ID:
> <878a627a0811260704x1e1e4cfxbea42f48628d843d@mail.gmail.com>
> Content-Type: text/plain
>
> Can anyone help me solve a problem in my code? I am trying to
> find the stopping index N.
> First I generate random numbers from normals. There is no problem in
> finding the first stopping index.
> Now I want to find the second stopping index using observations starting
> from the one after the first stopping index.
> E.g., if my first stopping index was 5, I want to set the 6th observation from
> the generated normal variables as the first random
> number, and stop at the second stopping index.
>
> This is my code,
>
>
> alpha <- 0.05
> beta <- 0.07
> a <- log((1-beta)/alpha)
> b <- log(beta/(1-alpha))
> theta1 <- 2
> theta2 <- 3
>
> cumsm<-function(n)
> {y<-NULL
> for(i in 1:n)
> {y[i]=x[i]^2}
> s=sum(y)
> return(s)
> }
> psum <- function(p,q)
> {z <- NULL
> for(l in p:q)
> { z[l-p+1] <- x[l]^2}
> ps <- sum(z)
> return(ps)
> }
> smm <- NULL
> sm <- NULL
> N <- NULL
> Nout <- NULL
> T <- NULL
> k<-0
> x <- rnorm(100,theta1,theta1)
> for(i in 1:length(x))
> {
> sm[i] <- psum(1,i)
> T[i] <- ((i/2)*log(theta1/theta2)) +
>         (((theta2-theta1)/(2*theta1*theta2))*sm[i]) - (i*(theta2-theta1)/2)
> if (T[i]<=b | T[i]>=a){N[1]<-i
> break}
>
> }
> for(j in 2:200)
> {
> for(k in (N[j-1]+1):length(x))
> { smm[k] <- psum((N[j-1]+1),k)
> T[k] <- ((k/2)*log(theta1/theta2)) +
>         (((theta2-theta1)/(2*theta1*theta2))*smm[k]) - (k*(theta2-theta1)/2)
> if (T[k]<=b | T[k]>=a){N[j]<-k
> break}
> }
> }
>
> But I cannot get the stopping index after the first one.
>
> Thanks
> --
>
> [[alternative HTML version deleted]]
>
>
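
One way to restructure the loop so the statistic restarts cleanly after each
stop (a sketch; the assumption is that the sample-size term in T must count
observations since the last stop, not the global index, and that the loop
must end when the data run out):

    sprt.stops <- function(x, theta1, theta2, a, b) {
      stops <- integer(0)
      start <- 1
      while (start <= length(x)) {
        s <- 0
        stopped <- FALSE
        for (k in start:length(x)) {
          m <- k - start + 1                # observations since the restart
          s <- s + x[k]^2
          Tk <- (m/2)*log(theta1/theta2) +
                ((theta2 - theta1)/(2*theta1*theta2))*s - m*(theta2 - theta1)/2
          if (Tk <= b || Tk >= a) {         # crossed a boundary: record and restart
            stops <- c(stops, k)
            start <- k + 1
            stopped <- TRUE
            break
          }
        }
        if (!stopped) break                 # ran out of data without a decision
      }
      stops
    }
    sprt.stops(rnorm(100, 2, 2), theta1=2, theta2=3,
               a=log((1 - 0.07)/0.05), b=log(0.07/(1 - 0.05)))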
>
> ------------------------------
>
> Message: 19
> Date: Wed, 26 Nov 2008 10:08:56 -0500
> From: David Winsemius <dwinsemius@comcast.net>
> Subject: Re: [R] Very slow: using double apply and cor.test to compute
> correlation p.values for 2 matrices
> To: "jim holtman" <jholtman@gmail.com>
> Cc: r-help@stat.math.ethz.ch
> Message-ID: <A405ABC6-A6DE-42B2-AFD9-2E2C41F55ABC@comcast.net>
> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>
> He might try rcorr from Hmisc instead. Using your test suite, it gives
> about a 20% improvement on my MacPro:
>
> > m1 <- matrix(rnorm(10000), ncol=100)
> > m2 <- matrix(rnorm(10000), ncol=100)
> > Rprof('/tempxx.txt')
> > system.time(cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1,
> >     function(y) { rcorr(x,y)$P }) }))
> user system elapsed
> 4.221 0.049 4.289
>
> > m1 <- matrix(rnorm(10000), ncol=100)
> > m2 <- matrix(rnorm(10000), ncol=100)
> > Rprof('/tempxx.txt')
> > system.time(cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1,
> function(y) { cor.test(x,y)$p.value }) }))
> user system elapsed
> 5.328 0.038 5.355
>
> I'm not a smart enough programmer to figure out whether there might be
> an even more efficient method that takes advantage of rcorr's implicit
> "looping" through a set of columns to produce an all-combinations
> return.
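>
> [One possibility, as an untested sketch: rcorr() also accepts a single
> matrix and returns all pairwise p-values in one call, so the double
> apply could become one call plus a block extraction:]
>
> library(Hmisc)
> big <- cbind(t(m1), t(m2))      # observations in rows, series in columns
> P <- rcorr(big)$P               # all pairwise p-values at once
> cross <- P[1:nrow(m1), nrow(m1) + seq_len(nrow(m2))]  # m1 rows vs m2 rows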
>
> --
> David Winsemius, MD
> Heritage Labs
>
>
> On Nov 26, 2008, at 9:14 AM, jim holtman wrote:
>
> > Your time is being taken up in cor.test because you are calling it
> > 100,000 times. So grin and bear it with the amount of work you are
> > asking it to do.
> >
> > Here I am only calling it 100 times:
> >
> >> m1 <- matrix(rnorm(10000), ncol=100)
> >> m2 <- matrix(rnorm(10000), ncol=100)
> >> Rprof('/tempxx.txt')
> >> system.time(cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1,
> >> function(y) { cor.test(x,y)$p.value }) }))
> > user system elapsed
> > 8.86 0.00 8.89
> >>
> >
> > so my guess is that calling it 100,000 times will take: 100,000 *
> > 0.0886 seconds, or roughly two and a half hours.
> >
> > If you run Rprof, you will see it is spending most of its time there:
> >
> > 0 8.8 root
> > 1. 8.8 apply
> > 2. . 8.8 FUN
> > 3. . . 8.8 apply
> > 4. . . . 8.7 FUN
> > 5. . . . . 8.6 cor.test
> > 6. . . . . . 8.4 cor.test.default
> > 7. . . . . . . 2.4 match.arg
> > 8. . . . . . . . 1.7 eval
> > 9. . . . . . . . . 1.4 deparse
> > 10. . . . . . . . . . 0.6 .deparseOpts
> > 11. . . . . . . . . . . 0.2 pmatch
> > 11. . . . . . . . . . . 0.1 sum
> > 10. . . . . . . . . . 0.5 %in%
> > 11. . . . . . . . . . . 0.3 match
> > 12. . . . . . . . . . . . 0.3 is.factor
> > 13. . . . . . . . . . . . . 0.3 inherits
> > 8. . . . . . . . 0.2 formals
> > 9. . . . . . . . . 0.2 sys.function
> > 7. . . . . . . 2.1 cor
> > 8. . . . . . . . 1.1 match.arg
> > 9. . . . . . . . . 0.7 eval
> > 10. . . . . . . . . . 0.6 deparse
> > 11. . . . . . . . . . . 0.3 .deparseOpts
> > 12. . . . . . . . . . . . 0.1 pmatch
> > 11. . . . . . . . . . . 0.2 %in%
> > 12. . . . . . . . . . . . 0.2 match
> > 13. . . . . . . . . . . . . 0.1 is.factor
> > 14. . . . . . . . . . . . . . 0.1 inherits
> > 9. . . . . . . . . 0.1 formals
> > 8. . . . . . . . 0.5 stopifnot
> > 9. . . . . . . . . 0.2 match.call
> > 8. . . . . . . . 0.1 pmatch
> > 8. . . . . . . . 0.1 is.data.frame
> > 9. . . . . . . . . 0.1 inherits
> > 7. . . . . . . 1.5 paste
> > 8. . . . . . . . 1.4 deparse
> > 9. . . . . . . . . 0.6 .deparseOpts
> > 10. . . . . . . . . . 0.3 pmatch
> > 10. . . . . . . . . . 0.1 any
> > 9. . . . . . . . . 0.6 %in%
> > 10. . . . . . . . . . 0.6 match
> > 11. . . . . . . . . . . 0.5 is.factor
> > 12. . . . . . . . . . . . 0.4 inherits
> > 13. . . . . . . . . . . . . 0.2 mode
> > 7. . . . . . . 0.4 switch
> > 8. . . . . . . . 0.1 qnorm
> > 7. . . . . . . 0.2 pt
> > 5. . . . . 0.1 $
> >
> > On Tue, Nov 25, 2008 at 11:55 PM, Daren Tan <daren76@hotmail.com>
> > wrote:
> >>
> >> My two matrices are roughly the sizes of m1 and m2. I tried using
> >> two apply and cor.test to compute the correlation p.values. More
> >> than an hour, and the code is still running. Please help to make
> >> it more efficient.
> >>
> >> m1 <- matrix(rnorm(100000), ncol=100)
> >> m2 <- matrix(rnorm(10000000), ncol=100)
> >>
> >> cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1, function(y)
> >> { cor.test(x,y)$p.value }) })
> >>
> >> ______________________________________________
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> >
> >
> > --
> > Jim Holtman
> > Cincinnati, OH
> > +1 513 646 9390
> >
> > What is the problem that you are trying to solve?
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>
>
> ------------------------------
>
> Message: 20
> Date: Wed, 26 Nov 2008 04:42:26 -0800 (PST)
> From: iwalters <iwalters@cellc.co.za>
> Subject: [R] memory limit
> To: r-help@r-project.org
> Message-ID: <20699700.post@talk.nabble.com>
> Content-Type: text/plain; charset=us-ascii
>
>
> I'm currently working with very large datasets that consist of
> 1,000,000+ rows. Is it at all possible to use R for datasets this
> size, or should I rather consider C++/Java?
>
>
> --
> View this message in context:
> http://www.nabble.com/increasing-memory-limit-in-Windows-Server-2008-64-bit-tp20675880p20699700.html
> Sent from the R help mailing list archive at Nabble.com.
>
>
>
> ------------------------------
>
> Message: 21
> Date: Wed, 26 Nov 2008 21:29:28 +0800
> From: "zhijie zhang" <epistat@gmail.com>
> Subject: [R] Needs suggestions for choosing appropriate R packages
> To: R-help@stat.math.ethz.ch
> Message-ID:
> <2fc17e30811260529u1d14f3cbg7c3e075a85753bcc@mail.gmail.com>
> Content-Type: text/plain
>
> Dear all,
> I am planning to fit a multilevel dataset with R. I have found several
> possible packages for my task, such as glmmPQL (MASS), glmm (repeated)
> and glmer (lme4), et al. I am a little confused by these functions.
> Could anybody tell me which function/package is the correct one to
> analyse my dataset?
> My dataset is as follows:
> the response variable P is a binary variable (the subject is a patient
> or not);
> two explanatory variables X1 (age) and X2 (sex);
> this dataset was sampled from three different levels,
> district, school, individual, so it is regarded as a multilevel dataset.
> I hope to fit the 3-level model (Y is a binary variable):
> Logit(Pijk) = (a0+b0k+c0jk) + b1*X1 + b2*X2
> i-individual, first level; j-school, 2nd level; k-district, 3rd level.
> I know that the GLIMMIX procedure in the latest version of SAS (9.2)
> is a choice for that, but unfortunately we do not have the latest
> version.
> R must have similar functions to do that; can anybody give me some
> suggestions or help on analysing my dataset?
> Q1: Which package/function is appropriate for my task? Could you show
> me some example code if possible (see the sketch below)?
> Q2: Logit(Pijk) = (a0+b0k+c0jk) + (b1+b1j)*X1 + b2*X2
> If the random effect is also specified on X1 as above, which
> package/function is possible?
> Thanks a lot.
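>
> [A minimal sketch of what this might look like with lme4; the data
> frame name (mydata) is an assumption, and the nesting syntax should be
> checked against the actual design:]
>
> library(lme4)
> # Q1: random intercepts for district, and for school within district
> m1 <- glmer(P ~ X1 + X2 + (1 | district/school),
>             family = binomial, data = mydata)
> # Q2: additionally, a random slope for X1 at the school level
> m2 <- glmer(P ~ X1 + X2 + (1 | district) + (1 + X1 | district:school),
>             family = binomial, data = mydata)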
>
>
> --
> With Kind Regards,
>
> oooO:::::::::
> (..):::::::::
> :\.(:::Oooo::
> ::\_)::(..)::
> :::::::)./:::
> ::::::(_/::::
> :::::::::::::
> [***********************************************************************]
> ZhiJie Zhang ,PhD
> Dept.of Epidemiology, School of Public Health,Fudan University
> Office:Room 443, Building 8
> Office Tel./Fax.:+86-21-54237410
> Address:No. 138 Yi Xue Yuan Road,Shanghai,China
> Postcode:200032
> Email: epistat@gmail.com
> Website: www.statABC.com
> [***********************************************************************]
> oooO:::::::::
> (..):::::::::
> :\.(:::Oooo::
> ::\_)::(..)::
> :::::::)./:::
> ::::::(_/::::
> :::::::::::::
>
> [[alternative HTML version deleted]]
>
>
>
> ------------------------------
>
> Message: 22
> Date: Wed, 26 Nov 2008 13:10:16 +0100
> From: "valeria pedrina" <valeria.pedrina@gmail.com>
> Subject: [R] plm package
> To: r-help@r-project.org
> Message-ID:
> <e515eba20811260410g59c1d7d3q54fc47224f9ead0b@mail.gmail.com>
> Content-Type: text/plain
>
> Hi everyone, I'm doing a panel data analysis and I want to run three
> estimation methods against my available dataset: pooled OLS, random
> and fixed effects. I have 9 individuals and 5 observations for each
> individual. This is my code; what's wrong?
>
> X <- cbind(y, x)
> X <- data.frame(X)
> ooo <- pdata.frame(X, 9)
> vedo <- plm(y ~ x, data = ooo)
>
> and this is the error (Italian locale):
> Errore in X.m[, coef.within, drop = F] : numero di dimensione errato
> (i.e. "Error in X.m[, coef.within, drop = F] : wrong number of dimensions")
> thanks
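>
> [A sketch of one way the three models could be specified with plm; the
> index construction is an assumption, since the structure of X is not
> shown in the post:]
>
> library(plm)
> pdat <- pdata.frame(data.frame(id = rep(1:9, each = 5),
>                                year = rep(1:5, times = 9),
>                                y = y, x = x),
>                     index = c("id", "year"))
> pooled <- plm(y ~ x, data = pdat, model = "pooling")
> fixed  <- plm(y ~ x, data = pdat, model = "within")
> random <- plm(y ~ x, data = pdat, model = "random")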
>
> [[alternative HTML version deleted]]
>
>
>
> ------------------------------
>
> Message: 23
> Date: Wed, 26 Nov 2008 09:42:37 -0400
> From: "Laurina Guerra" <laurinaguerra@gmail.com>
> Subject: [R] S4 object
> To: <r-help@r-project.org>
> Message-ID: <005101c94fcc$db1d8f40$9158adc0$@com>
> Content-Type: text/plain
>
> Hello, good day! Could someone give me some guidance, in the simplest
> possible terms, on what the S4 object class is and how it works? A
> reply as soon as possible would be appreciated. Thanks!
>
>
> [[alternative HTML version deleted]]
>
>
>
> ------------------------------
>
> Message: 24
> Date: Wed, 26 Nov 2008 10:16:14 -0500
> From: "Jorge Ivan Velez" <jorgeivanvelez@gmail.com>
> Subject: Re: [R] Very slow: using double apply and cor.test to compute
> correlation p.values for 2 matrices
> To: "Daren Tan" <daren76@hotmail.com>
> Cc: R mailing list <r-help@r-project.org>
> Message-ID:
> <317737de0811260716t788a4735g659a908410dc9fc2@mail.gmail.com>
> Content-Type: text/plain
>
> Hi Daren,
> Here is another approach, a little bit faster, taking into account
> that I'm using your original matrices. My session info is at the end.
> I'm using a 2.4 GHz Core 2 Duo processor and 3 GB of RAM.
>
> # Data
> set.seed(123)
> m1 <- matrix(rnorm(100000), ncol=100)
> m2 <- matrix(rnorm(100000), ncol=100)
> colnames(m1)=paste('m1_',1:100,sep="")
> colnames(m2)=paste('m2_',1:100,sep="")
>
> # Combinations
> combs=expand.grid(colnames(m1),colnames(m2))
>
> # ---------------
> # Option 1
> #----------------
> system.time(apply(combs,1,function(x)
> cor.test(m1[,x[1]],m2[,x[2]])$p.value)->pvalues1)
> # user system elapsed
> # 8.12 0.01 8.20
>
> # ---------------
> # Option 2
> #----------------
> require(Hmisc)
> system.time(apply(combs,1,function(x)
> rcorr(m1[,x[1]],m2[,x[2]])$P[2])->pvalues2)
> # user system elapsed
> # 7.00 0.00 7.02
>
>
> HTH,
>
> Jorge
>
>
> # ------------- Session Info ----------------------------
> R version 2.8.0 Patched (2008-11-08 r46864)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
> States.1252;LC_MONETARY=English_United
> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
>
>
> On Tue, Nov 25, 2008 at 11:55 PM, Daren Tan <daren76@hotmail.com> wrote:
>
> >
> > My two matrices are roughly the sizes of m1 and m2. I tried using
> > two apply and cor.test to compute the correlation p.values. More
> > than an hour, and the code is still running. Please help to make it
> > more efficient.
> >
> > m1 <- matrix(rnorm(100000), ncol=100)
> > m2 <- matrix(rnorm(10000000), ncol=100)
> >
> > cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1, function(y) {
> > cor.test(x,y)$p.value }) })
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
> [[alternative HTML version deleted]]
>
>
>
> ------------------------------
>
> Message: 25
> Date: Wed, 26 Nov 2008 10:17:32 -0500
> From: Chuck Cleland <ccleland@optonline.net>
> Subject: Re: [R] Chi-Square Test Disagreement
> To: Andrew Choens <andy.choens@gmail.com>
> Cc: r-help@r-project.org
> Message-ID: <492D688C.5040007@optonline.net>
> Content-Type: text/plain; charset=ISO-8859-1
>
> On 11/26/2008 9:51 AM, Andrew Choens wrote:
> > I was asked by my boss to do an analysis on a large data set, and I
> > am trying to convince him to let me use R rather than SPSS. I think
> > Sweave could make my life much much easier. To get me a little closer
> > to this goal, I ran my analysis through R and SPSS and compared the
> > resulting values. In all but one case, they were the same. Given the
> > matrix
> >
> > [,1] [,2]
> > [1,] 110 358
> > [2,] 71 312
> > [3,] 29 139
> > [4,] 31 77
> > [5,] 13 32
> >
> > This is the output from R:
> >> chisq.test(test29)
> >
> > Pearson's Chi-squared test
> >
> > data: test29
> > X-squared = 9.593, df = 4, p-value = 0.04787
> >
> > But, the same data in SPSS generates a p value of .051. It's a small
> > but important difference. I played around and rescaled things, and
> > tried different values for B, but I never could get R to reach .051.
> >
> > I'd like to know which program is correct - R or SPSS? I know, this
> > is a biased place to ask such a question. I also appreciate all input
> > that will help me use R more effectively. The difference could be the
> > result of my own ignorance.
>
> The SPSS p-value is for the Likelihood Ratio Chi-squared test, not
> Pearson's. For Pearson's Chi-squared test in SPSS (16.0.2), I get
> p=0.04787, so the results do match if you do the same Chi-squared test.
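>
> [To see this from R, a quick sketch of the likelihood-ratio (G)
> statistic that SPSS labels "Likelihood Ratio"; if that diagnosis is
> right, the p-value should land near the 0.051 SPSS reported:]
>
> test29 <- matrix(c(110,358,71,312,29,139,31,77,13,32),
>                  byrow = TRUE, ncol = 2)
> O <- test29
> E <- outer(rowSums(O), colSums(O)) / sum(O)   # expected counts
> G <- 2 * sum(O * log(O / E))                  # G statistic
> pchisq(G, df = 4, lower.tail = FALSE)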
>
> > thanks
> > --andy
>
> --
> Chuck Cleland, Ph.D.
> NDRI, Inc. (www.ndri.org)
> 71 West 23rd Street, 8th floor
> New York, NY 10010
> tel: (212) 845-4495 (Tu, Th)
> tel: (732) 512-0171 (M, W, F)
> fax: (917) 438-0894
>
>
>
> ------------------------------
>
> Message: 26
> Date: Wed, 26 Nov 2008 16:18:33 +0100
> From: Thomas Kaliwe <hamstersquats@web.de>
> Subject: [R] S4 slot containing either aov or NULL
> To: r-help@r-project.org
> Message-ID: <492D68C9.8010304@web.de>
> Content-Type: text/plain; charset=ISO-8859-15; format=flowed
>
> Dear listmembers,
>
> I would like to define a class with a slot that takes either an object
> of class aov or NULL. I have been reading "S4 Classes in 15 pages more
> or less" and "Lecture: S4 classes and methods"
>
> #First I tried with list and NULL
> setClass("listOrNULL")
> setIs("list", "listOrNULL")
> setIs("NULL", "listOrNULL")
> #doesn't work
>
> #defining a union class it works with list and null
> setClassUnion("listOrNULL", c("list", "NULL"))
> setClass("c1", representation(value = "listOrNULL"))
> y1 = new("c1", value = NULL)
> y2 = new("c1", value = list(a = 10))
>
> #but it won't work with aov or null
> setClassUnion("aovOrNULL", c("aov", "NULL"))
> setClass("c1", representation(value = "aovOrNULL"))
> y1 = new("c1", value = NULL)
>
> #trying to assign an aov object to the slot doesn't work
> utils::data(npk, package="MASS")
> npk.aov <- aov(yield ~ block + N*P*K, npk)
> y2 = new("c1", value = npk.aov )
>
> Any ideas?
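>
> [One idea, as an untested sketch: "aov" is an S3 class, so it may need
> to be registered with the S4 system via setOldClass() before it can be
> used in a class union:]
>
> setOldClass(c("aov", "lm"))
> setClassUnion("aovOrNULL", c("aov", "NULL"))
> setClass("c2", representation(value = "aovOrNULL"))
> y1 <- new("c2", value = NULL)
> y2 <- new("c2", value = npk.aov)   # npk.aov as defined above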
>
> Thank you
>
> Thomas Kaliwe
>
>
>
> ------------------------------
>
> Message: 27
> Date: Wed, 26 Nov 2008 07:24:07 -0800 (PST)
> From: Tubin <sredmonson@yahoo.com>
> Subject: [R] odfWeave and XML... on a Mac
> To: r-help@r-project.org
> Message-ID: <20702670.post@talk.nabble.com>
> Content-Type: text/plain; charset=us-ascii
>
>
> I'm trying out odfWeave in a Mac environment and getting some odd
> behavior. Having figured out that the code snippets only work if
> they're in certain fonts, I was able to get R to run a test document
> through and produce an output document. After running it, though, I
> get a warning message:
>
> Warning message:
> In file.remove("styles_2.xml") :
> cannot remove file 'styles_2.xml', reason 'No such file or directory'
>
> This message is interesting given that about 20 lines earlier I see:
> Renaming styles_2.xml to styles.xml
>
> If I run the test doc in results=verbatim mode, I see that warning but
> my results appear in the appropriate places on the output document:
>
> *****output*****
> This is the basic text stuff. Now I will try to input the other stuff:
>
> [1] 5
>
> And this is the after-text.
> *****end*****
>
> If I run the test document in results=xml mode, though, the output is
> blank:
>
> *******output*****
>
> This is the basic text stuff. Now I will try to input the other stuff:
>
>
> And this is the after-text.
> ******end*****
>
> Earlier posts on this forum suggest that the solution may involve
> loading an earlier build of XML. Is that likely to work? And if so -
> stupid question I'm sure, but how do I do that?
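>
> [A sketch of one way to install an older XML build from the CRAN
> source archive; the version number here is purely hypothetical - pick
> one from the Archive/XML directory:]
>
> url <- "http://cran.r-project.org/src/contrib/Archive/XML/XML_1.94-0.1.tar.gz"
> download.file(url, destfile = basename(url))
> install.packages(basename(url), repos = NULL, type = "source")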
>
> Thanks in advance for the time and attention of people more
> experienced than myself...
>
>
> --
> View this message in context:
> http://www.nabble.com/odfWeave-and-XML...-on-a-Mac-tp20702670p20702670.html
> Sent from the R help mailing list archive at Nabble.com.
>
>
>
> ------------------------------
>
> Message: 28
> Date: Wed, 26 Nov 2008 09:33:59 -0600
> From: "hadley wickham" <h.wickham@gmail.com>
> Subject: Re: [R] Very slow: using double apply and cor.test to compute
> correlation p.values for 2 matrices
> To: "jim holtman" <jholtman@gmail.com>
> Cc: r-help@stat.math.ethz.ch
> Message-ID:
> <f8e6ff050811260733x7062a6acm68f8da611a14cb87@mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> On Wed, Nov 26, 2008 at 8:14 AM, jim holtman <jholtman@gmail.com> wrote:
> > Your time is being taken up in cor.test because you are calling it
> > 100,000 times. So grin and bear it with the amount of work you are
> > asking it to do.
> >
> > Here I am only calling it 100 times:
> >
> >> m1 <- matrix(rnorm(10000), ncol=100)
> >> m2 <- matrix(rnorm(10000), ncol=100)
> >> Rprof('/tempxx.txt')
> >> system.time(cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1,
> >> function(y) { cor.test(x,y)$p.value }) }))
> > user system elapsed
> > 8.86 0.00 8.89
> >>
> >
> > so my guess is that calling it 100,000 times will take: 100,000 *
> > 0.0886 seconds, or roughly two and a half hours.
>
> You can make it ~3 times faster by vectorising the testing:
>
> m1 <- matrix(rnorm(10000), ncol=100)
> m2 <- matrix(rnorm(10000), ncol=100)
>
> system.time(cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1,
> function(y) { cor.test(x,y)$p.value })}))
>
>
> system.time({
> r <- apply(m1, 1, function(x) { apply(m2, 1, function(y) { cor(x,y) })})
>
> df <- nrow(m1) - 2
> t <- sqrt(df) * r / sqrt(1 - r ^ 2)
> p <- pt(t, df)
> p <- 2 * pmin(p, 1 - p)
> })
>
>
> all.equal(cor.pvalues, p)
>
>
> You can make cor much faster by stripping away all the error checking
> code and calling the internal c function directly (suggested by the
> Rprof output):
>
>
> system.time({
> r <- apply(m1, 1, function(x) { apply(m2, 1, function(y) { cor(x,y) })})
> })
>
> system.time({
> r2 <- apply(m1, 1, function(x) { apply(m2, 1, function(y) {
> .Internal(cor(x, y, 4L, FALSE)) })})
> })
>
> 1.5 s vs 0.2 s on my computer. Combining both changes gives me a ~25
> times speed-up - I suspect you can do even better if you think about
> what calculations are being duplicated in the computation of the
> correlations.
>
> Hadley
>
> --
> http://had.co.nz/
>
>
>
> ------------------------------
>
> Message: 29
> Date: Wed, 26 Nov 2008 15:37:55 +0000
> From: Daren Tan <daren76@hotmail.com>
> Subject: Re: [R] Very slow: using double apply and cor.test to compute
> correlation p.values for 2 matrices
> To: <r-help@stat.math.ethz.ch>
> Message-ID: <BLU137-W95CAC1950A6096E58D289B20A0@phx.gbl>
> Content-Type: text/plain; charset="gb2312"
>
>
> Out of desperation, I made the following function, which hadley beat
> me to :P. Thanks everyone for the great help.
>
>
> cor.p.values <- function(r, n) {
> df <- n - 2
> STATISTIC <- c(sqrt(df) * r / sqrt(1 - r^2))
> p <- pt(STATISTIC, df)
> return(2 * pmin(p, 1 - p))
> }
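>
> [A usage sketch, assuming rows are the units being correlated as in
> the original post: compute all correlations in a single cor() call on
> the transposed matrices, then convert them with the function above.]
>
> r <- cor(t(m1), t(m2))            # nrow(m1) x nrow(m2) correlations
> p <- cor.p.values(r, ncol(m1))    # n = ncol(m1) paired observations;
> p <- matrix(p, nrow = nrow(m1))   # returned flattened, so reshape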
>
> > Date: Wed, 26 Nov 2008 09:33:59 -0600
> > From: h.wickham@gmail.com
> > To: jholtman@gmail.com
> > Subject: Re: [R] Very slow: using double apply and cor.test to compute
> correlation p.values for 2 matrices
> > CC: daren76@hotmail.com; r-help@stat.math.ethz.ch
> >
> > On Wed, Nov 26, 2008 at 8:14 AM, jim holtman wrote:
> >> Your time is being taken up in cor.test because you are calling it
> >> 100,000 times. So grin and bear it with the amount of work you are
> >> asking it to do.
> >>
> >> Here I am only calling it 100 times:
> >>
> >>> m1 <- matrix(rnorm(10000), ncol=100)
> >>> m2 <- matrix(rnorm(10000), ncol=100)
> >>> Rprof('/tempxx.txt')
> >>> system.time(cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1,
> >>> function(y) { cor.test(x,y)$p.value }) }))
> >> user system elapsed
> >> 8.86 0.00 8.89
> >>>
> >>
> >> so my guess is that calling it 100,000 times will take: 100,000 *
> >> 0.0886 seconds, or roughly two and a half hours.
> >
> > You can make it ~3 times faster by vectorising the testing:
> >
> > m1 <- matrix(rnorm(10000), ncol=100)
> > m2 <- matrix(rnorm(10000), ncol=100)
> >
> > system.time(cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1,
> > function(y) { cor.test(x,y)$p.value })}))
> >
> >
> > system.time({
> > r <- apply(m1, 1, function(x) { apply(m2, 1, function(y) { cor(x,y) })})
> >
> > df <- nrow(m1) - 2
> > t <- sqrt(df) * r / sqrt(1 - r ^ 2)
> > p <- pt(t, df)
> > p <- 2 * pmin(p, 1 - p)
> > })
> >
> >
> > all.equal(cor.pvalues, p)
> >
> >
> > You can make cor much faster by stripping away all the error checking
> > code and calling the internal c function directly (suggested by the
> > Rprof output):
> >
> >
> > system.time({
> > r <- apply(m1, 1, function(x) { apply(m2, 1, function(y) { cor(x,y) })})
> > })
> >
> > system.time({
> > r2 <- apply(m1, 1, function(x) { apply(m2, 1, function(y) {
> > .Internal(cor(x, y, 4L, FALSE)) })})
> > })
> >
> > 1.5 s vs 0.2 s on my computer. Combining both changes gives me a ~25
> > times speed-up - I suspect you can do even better if you think about
> > what calculations are being duplicated in the computation of the
> > correlations.
> >
> > Hadley
> >
> > --
> > http://had.co.nz/
> _________________________________________________________________
> [[elided Hotmail spam]]
>
>
>
> ------------------------------
>
> Message: 30
> Date: Wed, 26 Nov 2008 07:43:22 -0800 (PST)
> From: Jeroen Ooms <j.c.l.ooms@uu.nl>
> Subject: Re: [R] plotting density for truncated distribution
> To: r-help@r-project.org
> Message-ID: <20703469.post@talk.nabble.com>
> Content-Type: text/plain; charset=us-ascii
>
>
> thank you, both solutions are really helpful!
> --
> View this message in context:
> http://www.nabble.com/plotting-density-for-truncated-distribution-tp20684995p20703469.html
> Sent from the R help mailing list archive at Nabble.com.
>
>
>
> ------------------------------
>
> Message: 31
> Date: Wed, 26 Nov 2008 13:42:29 -0200
> From: "Rodrigo Aluizio" <r.aluizio@gmail.com>
> Subject: [R] RES: S4 object
> To: "'Laurina Guerra'" <laurinaguerra@gmail.com>
> Cc: R Help <r-help@r-project.org>
> Message-ID: <492d6eb6.231e640a.0480.76d0@mx.google.com>
> Content-Type: text/plain; charset="us-ascii"
>
> Take a look at the links you will find in this previous post to the list.
>
> http://tolstoy.newcastle.edu.au/R/help/06/01/18259.html
>
> I myself don't know anything about this subject.
>
> Sorry, but you will probably find what you need there.
>
> Best Wishes
>
> Rodrigo.
>
> -----Original Message-----
> From: r-help-bounces@r-project.org [mailto:r-help-bounces@r-project.org]
> On behalf of Laurina Guerra
> Sent: Wednesday, 26 November 2008 11:43
> To: r-help@r-project.org
> Subject: [R] S4 object
>
> Hello, good day! Could someone give me some guidance, in the simplest
> possible terms, on what the S4 object class is and how it works? A
> reply as soon as possible would be appreciated. Thanks!
>
>
> [[alternative HTML version deleted]]
>
>
>
> ------------------------------
>
> Message: 32
> Date: Wed, 26 Nov 2008 13:53:44 -0300
> From: "Leandro Marino" <leandro@cesgranrio.org.br>
> Subject: [R] RES: memory limit
> To: "'iwalters'" <iwalters@cellc.co.za>,
<r-help@r-project.org>
> Message-ID:
> <!&!AAAAAAAAAAAYAAAAAAAAAEadxqYXQLlLmuUnwe+aKQfCgAAAEAAAAI9VmQAZHrdBskVHz8nCW0sBAAAAAA==@cesgranrio.org.br>
>
> Content-Type: text/plain; charset="us-ascii"
>
> It depends on the number of variables. If you are using 2 or 3
> variables, you can do some things.
>
> I suggest you read about the ff and ASOR packages; they manage the
> dataset on disk to do some kind of chunked I/O (see the sketch below).
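>
> [A small sketch of the ff idea - file-backed objects so the data stay
> on disk; the file name and the column used are assumptions:]
>
> library(ff)
> # read a large file into a file-backed data frame, chunk by chunk
> big <- read.table.ffdf(file = "bigdata.csv", header = TRUE, sep = ",")
> dim(big)           # rows live on disk, not in RAM
> summary(big$x[])   # [] pulls a column into memory only when needed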
>
> Regards,
>
> -----Original Message-----
> From: r-help-bounces@r-project.org [mailto:r-help-bounces@r-project.org]
> On behalf of iwalters
> Sent: Wednesday, 26 November 2008 09:42
> To: r-help@r-project.org
> Subject: [R] memory limit
>
>
> I'm currently working with very large datasets that consist of
> 1,000,000+ rows. Is it at all possible to use R for datasets this
> size, or should I rather consider C++/Java?
>
>
> --
> View this message in context:
> http://www.nabble.com/increasing-memory-limit-in-Windows-Server-2008-64-bit-tp20675880p20699700.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
>
> ------------------------------
>
> Message: 33
> Date: Wed, 26 Nov 2008 18:24:32 +0200
> From: "andrew collier" <collierab@gmail.com>
> Subject: [R] ts subscripting problem
> To: r-help@r-project.org
> Message-ID:
> <c642e63c0811260824l7adc65e4tf9177ad2fad01132@mail.gmail.com>
> Content-Type: text/plain
>
> Hi,
>
> I am having trouble getting a particular time series to plot. This is
> what I have:
>
> > class(irradiance)
> [1] "ts"
> > irradiance[1:30]
> 197811 197812 197901 197902 197903 197904 197905 197906
> 1366.679 1366.729 1367.476 1367.739 1368.339 1367.883 1367.916 1367.055
> 197907 197908 197909 197910 197911 197912 198001 198002
> 1367.484 1366.887 1366.935 1367.034 1366.997 1367.310 1367.041 1366.459
> 198003 198004 198005 198006 198007 198008 198009 198010
> 1367.143 1366.553 1366.597 1366.854 1366.814 1366.901 1366.622 1366.669
> 198011 198012 198101 198102 198103 198104
> 1365.874 1366.098 1367.141 1366.239 1366.323 1366.388
> > plot(irradiance[1:30])
> > plot(irradiance)
> Error in dn[[2]] : subscript out of bounds
>
> So, if I plot a subset of the data it works fine, but if I try to plot
> the whole thing it breaks. The ts object was created using:
>
> irradiance = ts(tapply(d$number, f, mean), freq = 12, start = c(1978, 11))
>
> and other ts objects that I have defined using basically the same
> approach work fine.
>
> Any ideas greatly appreciated!
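>
> [One thing worth checking, as a sketch: tapply() returns a named
> array, and the dim/dimnames it carries can confuse plot.ts().
> Stripping them with as.vector() before calling ts() may help:]
>
> irradiance <- ts(as.vector(tapply(d$number, f, mean)),
>                  frequency = 12, start = c(1978, 11))
> plot(irradiance)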
>
> Cheers,
> Andrew.
>
> [[alternative HTML version deleted]]
>
>
>
> ------------------------------
>
> Message: 34
> Date: Wed, 26 Nov 2008 16:27:02 +0000
> From: Andrew Beckerman <a.beckerman@sheffield.ac.uk>
> Subject: [R] survreg and pweibull
> To: r-help@r-project.org
> Message-ID: <876C2357-B2DD-43F9-A803-110628B7A537@sheffield.ac.uk>
> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>
> Dear all -
>
> I have followed the thread, the replies to which were led by Thomas
> Lumley, about using pweibull to generate fitted survival curves for
> survreg models.
>
> http://tolstoy.newcastle.edu.au/R/help/04/11/7766.html
>
> Using the lung data set,
>
> data(lung)
> lung.wbs <- survreg(Surv(time, status) ~ 1, data=lung, dist='weibull')
> curve(pweibull(x, scale=exp(coef(lung.wbs)), shape=1/lung.wbs$scale,
>                lower.tail=FALSE), from=0, to=max(lung$time))
> lines(survfit(Surv(time, status) ~ 1, data=lung), col="red")
>
> Assuming this is correct, why does the inflection point of this curve
> not match up to the exp(scale parameter)? Am I wrong in assuming that
> the scale represents the inflection, and the shape adjusts the shape
> around this point? I think I am.... perhaps confusing the scale and
> the median with the inflection point calculation?
>
> One can visualise the mismatch with:
>
> abline(v=exp(coef(lung.wbs)),lty=2)
> abline(h=0.5,lty=2)
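>
> [For what it's worth: since the Weibull survival function is
> S(t) = exp(-(t/scale)^shape), S(scale) = exp(-1) ~ 0.368 whatever the
> shape, so the scale never sits at S = 0.5. The median is
> scale * log(2)^(1/shape), as in this sketch:]
>
> scale <- exp(coef(lung.wbs))
> shape <- 1/lung.wbs$scale
> med <- scale * log(2)^(1/shape)
> abline(v = med, lty = 3)   # should cross the fitted curve at S = 0.5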
>
> Many thanks for the clarification....
>
> R version 2.8.0 (2008-10-20)
> i386-apple-darwin8.11.1
> locale:
> en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8
> attached base packages:
> [1] splines datasets utils stats graphics grDevices
> methods base
> other attached packages:
> [1] survival_2.34-1 Hmisc_3.4-3 lattice_0.17-15 MASS_7.2-44
> loaded via a namespace (and not attached):
> [1] cluster_1.11.11 grid_2.8.0 tools_2.8.0
>
> Andrew
>
>
>
> ---------------------------------------------------------------------------------
> Dr. Andrew Beckerman
> Department of Animal and Plant Sciences, University of Sheffield,
> Alfred Denny Building, Western Bank, Sheffield S10 2TN, UK
> ph +44 (0)114 222 0026; fx +44 (0)114 222 0002
> http://www.beckslab.staff.shef.ac.uk/
>
> http://www.flickr.com/photos/apbeckerman/
> http://www.warblefly.co.uk
>
>
>
> ------------------------------
>
> Message: 35
> Date: Wed, 26 Nov 2008 16:22:01 +0000
> From: "Dr. Alireza Zolfaghari" <ali.zolfaghari@gmail.com>
> Subject: [R] Second y-axis
> To: R-help <r-help@r-project.org>
> Message-ID:
> <d47fac460811260822o207a95e4gda2b936585139506@mail.gmail.com>
> Content-Type: text/plain
>
> Hi list,
> In the following code, how can I place the percentage label away from
> the numbers on the second y-axis (let's say everything should be
> inside the plot area)?
>
> Thanks
> Alireza
>
> ================
> require(grid)
> vp <- viewport(x=.1, y=.1, width=.6, height=.6, just=c("left", "bottom"))
> pushViewport(vp)
>
> plotDATA = data.frame(
>   Loss = c(1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10),
>   Level = c("AvgAll","AvgAll","AvgAll","AvgAll","AvgAll","AvgAll",
>             "AvgAll","AvgAll","AvgAll","AvgAll","AvgAll","AvgAll",
>             "GUL","GUL","GUL","GUL","GUL","GUL","GUL","GUL"),
>   Line = c(1,2,3,4,5,6,7,8,9,10,11,12,1,2,3,4,5,6,7,8))
> library(lattice)
> xyplot(Loss ~ Line, data=plotDATA, t="p",
>        scales=list(relation="free",
>                    x=list(draw=TRUE, tick.number=12, labels=1:12)),
>        par.settings = list(clip = list(panel = "off")))
> p <- xyplot(Loss ~ Line, data=plotDATA, t="p",
>             scales=list(relation="free", x=list(at = 1:12)),
>             panel=function(x, y, subscripts, groups, ...){
>               panel.xyplot(subset(plotDATA, Level=="AvgAll")$Line,
>                            subset(plotDATA, Level=="AvgAll")$Loss,
>                            col=Lloydscolour(colIncP), lwd=3, origin=0, ...)
>               panel.axis(side = "right", at=unique(plotDATA$Loss),
>                          labels=unique(plotDATA$Loss)/max(plotDATA$Loss)*100,
>                          outside=FALSE, ticks=TRUE, half=FALSE)
>               panel.axis(side = "right", at=median(plotDATA$Loss),
>                          labels="Percentage", outside=FALSE, ticks=FALSE,
>                          half=FALSE, rot=90)
>               panel.axis(side = "right", at=c(4,8), labels=c(200,400),
>                          outside=TRUE, ticks=TRUE, half=FALSE)
>               panel.barchart(subset(plotDATA, Level=="GUL")$Line,
>                              subset(plotDATA, Level=="GUL")$Loss,
>                              box.ratio=1, horizontal=FALSE, stack=TRUE,
>                              reference=TRUE, col="blue",
>                              border="blue") #, origin=0)
>             })
>
> print(p, position = c(0.1, 0.1, 0.9, .9))
> ================
> [[alternative HTML version deleted]]
>
>
>
> ------------------------------
>
> Message: 36
> Date: Wed, 26 Nov 2008 08:39:15 -0800
> From: Charlie Brush <cfbrush@ucdavis.edu>
> Subject: Re: [R] multiple imputation with fit.mult.impute in Hmisc -
> how to replace NA with imputed value?
> To: Frank E Harrell Jr <f.harrell@vanderbilt.edu>
> Cc: r-help@r-project.org
> Message-ID: <492D7BB3.7000809@ucdavis.edu>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Frank E Harrell Jr wrote:
> > Charlie Brush wrote:
> >> I am doing multiple imputation with Hmisc, and
> >> can't figure out how to replace the NA values with
> >> the imputed values.
> >>
> >> Here's a general outline of the process:
> >>
> >> > set.seed(23)
> >> > library("mice")
> >> > library("Hmisc")
> >> > library("Design")
> >> > d <- read.table("DailyDataRaw_01.txt",header=T)
> >> > length(d);length(d[,1])
> >> [1] 43
> >> [1] 2666
> >> So for this data set, there are 43 columns and 2666 rows
> >>
> >> Here is a piece of data.frame d:
> >> > d[1:20,4:6]
> >> P01 P02 P03
> >> 1 0.1 0.16 0.16
> >> 2 NA 0.00 0.00
> >> 3 NA 0.60 0.04
> >> 4 NA 0.15 0.00
> >> 5 NA 0.00 0.00
> >> 6 0.7 0.00 0.75
> >> 7 NA 0.00 0.00
> >> 8 NA 0.00 0.00
> >> 9 0.0 0.00 0.00
> >> 10 0.0 0.00 0.00
> >> 11 0.0 0.00 0.00
> >> 12 0.0 0.00 0.00
> >> 13 0.0 0.00 0.00
> >> 14 0.0 0.00 0.00
> >> 15 0.0 0.00 0.03
> >> 16 NA 0.00 0.00
> >> 17 NA 0.01 0.00
> >> 18 0.0 0.00 0.00
> >> 19 0.0 0.00 0.00
> >> 20 0.0 0.00 0.00
> >>
> >> These are daily precipitation values at NCDC stations, and
> >> NA values at station P01 will be filled using multiple
> >> imputation and data from highly correlated stations P02 and P08:
> >>
> >> > f <- aregImpute(~ I(P01) + I(P02) + I(P08),
> >> n.impute=10,match='closest',data=d)
> >> Iteration 13
> >> > fmi <- fit.mult.impute( P01 ~ P02 + P08 , ols, f, d)
> >>
> >> Variance Inflation Factors Due to Imputation:
> >>
> >> Intercept P02 P08
> >> 1.01 1.39 1.16
> >>
> >> Rate of Missing Information:
> >>
> >> Intercept P02 P08
> >> 0.01 0.28 0.14
> >>
> >> d.f. for t-distribution for Tests of Single Coefficients:
> >>
> >> Intercept P02 P08
> >> 242291.18 116.05 454.95
> >> > r <- apply(f$imputed$P01,1,mean)
> >> > r
> >> 2 3 4 5 7 8 16 17 249 250 251
> >> 0.002 0.430 0.044 0.002 0.002 0.002 0.002 0.123 0.002 0.002 0.002
> >> 252 253 254 255 256 257 258 259 260 261 262
> >> 1.033 0.529 1.264 0.611 0.002 0.513 0.085 0.002 0.705 0.840 0.719
> >> 263 264 265 266 267 268 269 270 271 272 273
> >> 1.489 0.532 0.150 0.134 0.002 0.002 0.002 0.002 0.002 0.055 0.135
> >> 274 275 276 277 278 279 280 281 282 283 284
> >> 0.009 0.002 0.002 0.002 0.008 0.454 1.676 1.462 0.071 0.002 1.029
> >> 285 286 287 288 289 418 419 420 421 422 700
> >> 0.055 0.384 0.947 0.002 0.002 0.008 0.759 0.066 0.009 0.002 0.002
> >>
> >> ------------------------------------------------------------------
> >> So far, this is working great.
> >> Now, make a copy of d:
> >> > dnew <- d
> >>
> >> And then fill in the NA values in P01 with the values in r
> >>
> >> For example:
> >> > for (i in 1:length(r)){
> >> dnew$P01[r[i,1]] <- r[i,2]
> >> }
> >> This doesn't work, because each 'piece' of r is two numbers:
> >> > r[1]
> >> 2
> >> 0.002
> >> > r[1,1]
> >> Error in r[1, 1] : incorrect number of dimensions
> >>
> >> My question: how can I separate the two items in (for example)
> >> r[1] to use the first part as an index and the second as a value,
> >> and then use them to replace the NA values with the imputed values?
> >>
> >> Or is there a better way to replace the NA values with the imputed
> >> values?
> >>
> >> Thanks in advance for any help.
> >>
> >
> > You didn't state your goal, and why fit.mult.impute does not do what
> > you want. But you can look inside fit.mult.impute to see how it
> > retrieves the imputed values. Also see the example in the
> > documentation for transcan, in which the command
> > impute(xt, imputation=1) is used to retrieve one of the multiple
> > imputations.
> >
> > Note that you can say library(Design) (omit the quotes) to access both
> > Design and Hmisc.
> >
> > Frank
> Thanks for your help.
> My goal is to replace the NA values in the (copy of the) data frame
> with the means of the imputed values (which are now in variable 'r').
> fit.mult.impute works fine. I just can't figure out the last step,
> taking the results of fit.mult.impute (which are in variable 'r') and
> replacing the NA values in the (copy of the) data frame.
> A simple for loop doesn't work because the items in 'r' don't look
> like a normal vector, as for example r[1] returns
> 2
> 0.002
> Is there a command to replace the NA values in the data frame with the
> means of the imputed values?
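>
> [On the mechanics, a sketch: r here is a named vector, so the row
> indices are its names() rather than a second column; that is why
> r[1,1] fails. Whether one should do this at all is another matter -
> see Frank Harrell's reply below.]
>
> idx <- as.integer(names(r))   # row numbers of the imputed values
> dnew$P01[idx] <- r            # fill the NAs with the imputed means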
>
> Thanks,
> Charlie
>
>
>
> ------------------------------
>
> Message: 37
> Date: Wed, 26 Nov 2008 10:46:11 -0600
> From: Frank E Harrell Jr <f.harrell@vanderbilt.edu>
> Subject: Re: [R] multiple imputation with fit.mult.impute in Hmisc -
> how to replace NA with imputed value?
> To: Charlie Brush <cfbrush@ucdavis.edu>
> Cc: r-help@r-project.org
> Message-ID: <492D7D53.2070902@vanderbilt.edu>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Charlie Brush wrote:
> > Frank E Harrell Jr wrote:
> >> Charlie Brush wrote:
> >>> I am doing multiple imputation with Hmisc, and
> >>> can't figure out how to replace the NA values with
> >>> the imputed values.
> >>>
> >>> Here's a general outline of the process:
> >>>
> >>> > set.seed(23)
> >>> > library("mice")
> >>> > library("Hmisc")
> >>> > library("Design")
> >>> > d <- read.table("DailyDataRaw_01.txt",header=T)
> >>> > length(d);length(d[,1])
> >>> [1] 43
> >>> [1] 2666
> >>> So for this data set, there are 43 columns and 2666 rows
> >>>
> >>> Here is a piece of data.frame d:
> >>> > d[1:20,4:6]
> >>> P01 P02 P03
> >>> 1 0.1 0.16 0.16
> >>> 2 NA 0.00 0.00
> >>> 3 NA 0.60 0.04
> >>> 4 NA 0.15 0.00
> >>> 5 NA 0.00 0.00
> >>> 6 0.7 0.00 0.75
> >>> 7 NA 0.00 0.00
> >>> 8 NA 0.00 0.00
> >>> 9 0.0 0.00 0.00
> >>> 10 0.0 0.00 0.00
> >>> 11 0.0 0.00 0.00
> >>> 12 0.0 0.00 0.00
> >>> 13 0.0 0.00 0.00
> >>> 14 0.0 0.00 0.00
> >>> 15 0.0 0.00 0.03
> >>> 16 NA 0.00 0.00
> >>> 17 NA 0.01 0.00
> >>> 18 0.0 0.00 0.00
> >>> 19 0.0 0.00 0.00
> >>> 20 0.0 0.00 0.00
> >>>
> >>> These are daily precipitation values at NCDC stations, and
> >>> NA values at station P01 will be filled using multiple
> >>> imputation and data from highly correlated stations P02 and P08:
> >>>
> >>> > f <- aregImpute(~ I(P01) + I(P02) + I(P08),
> >>> n.impute=10,match='closest',data=d)
> >>> Iteration 13
> >>> > fmi <- fit.mult.impute( P01 ~ P02 + P08 , ols, f, d)
> >>>
> >>> Variance Inflation Factors Due to Imputation:
> >>>
> >>> Intercept P02 P08
> >>> 1.01 1.39 1.16
> >>>
> >>> Rate of Missing Information:
> >>>
> >>> Intercept P02 P08
> >>> 0.01 0.28 0.14
> >>>
> >>> d.f. for t-distribution for Tests of Single Coefficients:
> >>>
> >>> Intercept P02 P08
> >>> 242291.18 116.05 454.95
> >>> > r <- apply(f$imputed$P01,1,mean)
> >>> > r
> >>> 2 3 4 5 7 8 16 17 249 250 251
> >>> 0.002 0.430 0.044 0.002 0.002 0.002 0.002 0.123 0.002 0.002 0.002
> >>> 252 253 254 255 256 257 258 259 260 261 262
> >>> 1.033 0.529 1.264 0.611 0.002 0.513 0.085 0.002 0.705 0.840 0.719
> >>> 263 264 265 266 267 268 269 270 271 272 273
> >>> 1.489 0.532 0.150 0.134 0.002 0.002 0.002 0.002 0.002 0.055 0.135
> >>> 274 275 276 277 278 279 280 281 282 283 284
> >>> 0.009 0.002 0.002 0.002 0.008 0.454 1.676 1.462 0.071 0.002 1.029
> >>> 285 286 287 288 289 418 419 420 421 422 700
> >>> 0.055 0.384 0.947 0.002 0.002 0.008 0.759 0.066 0.009 0.002 0.002
> >>>
> >>> ------------------------------------------------------------------
> >>> So far, this is working great.
> >>> Now, make a copy of d:
> >>> > dnew <- d
> >>>
> >>> And then fill in the NA values in P01 with the values in r
> >>>
> >>> For example:
> >>> > for (i in 1:length(r)){
> >>> dnew$P01[r[i,1]] <- r[i,2]
> >>> }
> >>> This doesn't work, because each 'piece' of r is two numbers:
> >>> > r[1]
> >>> 2
> >>> 0.002
> >>> > r[1,1]
> >>> Error in r[1, 1] : incorrect number of dimensions
> >>>
> >>> My question: how can I separate the two items in (for example)
> >>> r[1] to use the first part as an index and the second as a value,
> >>> and then use them to replace the NA values with the imputed values?
> >>>
> >>> Or is there a better way to replace the NA values with the imputed
> >>> values?
> >>>
> >>> Thanks in advance for any help.
> >>>
> >>
> >> You didn't state your goal, and why fit.mult.impute does not do what
> >> you want. But you can look inside fit.mult.impute to see how it
> >> retrieves the imputed values. Also see the example in the
> >> documentation for transcan, in which the command
> >> impute(xt, imputation=1) is used to retrieve one of the multiple
> >> imputations.
> >>
> >> Note that you can say library(Design) (omit the quotes) to access
> >> both Design and Hmisc.
> >>
> >> Frank
> > Thanks for your help.
> > My goal is to replace the NA values in the (copy of the) data frame
> > with the means of the imputed values (which are now in variable 'r').
> > fit.mult.impute works fine. I just can't figure out the last step,
> > taking the results of fit.mult.impute (which are in variable 'r') and
> > replacing the NA values in the (copy of the) data frame.
> > A simple for loop doesn't work because the items in 'r' don't look
> > like a normal vector, as for example r[1] returns
> > 2
> > 0.002
> > Is there a command to replace the NA values in the data frame with the
> > means of the imputed values?
> >
> > Thanks,
> > Charlie
> >
>
> Don't do that, as this would no longer be multiple imputation. If you
> want single conditional mean imputation use transcan.
>
> Frank
>
>
> --
> Frank E Harrell Jr Professor and Chair School of Medicine
> Department of Biostatistics Vanderbilt University
>
>
>
> ------------------------------
>
> Message: 38
> Date: Thu, 27 Nov 2008 00:46:31 +0800
> From: Berwin A Turlach <berwin@maths.uwa.edu.au>
> Subject: Re: [R] Chi-Square Test Disagreement
> To: Andrew Choens <andy.choens@gmail.com>
> Cc: r-help@r-project.org
> Message-ID: <20081127004631.3289e0c3@absentia>
> Content-Type: text/plain; charset=US-ASCII
>
> G'day Andy,
>
> On Wed, 26 Nov 2008 14:51:50 +0000
> Andrew Choens <andy.choens@gmail.com> wrote:
>
> > I was asked by my boss to do an analysis on a large data set, and I am
> > trying to convince him to let me use R rather than SPSS.
>
> Very laudable of you. :)
>
> > This is the output from R:
> > > chisq.test(test29)
> >
> > Pearson's Chi-squared test
> >
> > data: test29
> > X-squared = 9.593, df = 4, p-value = 0.04787
> >
> > But, the same data in SPSS generates a p value of .051. It's a small
> > but important difference.
>
> Chuck explained already the reason for this small difference. I just
> take issue about it being an important difference. In my opinion, this
> difference is not important at all. It would only be important to
> people who are still sticking to arbitrary cut-off points that are
> mainly due to historical coincidences and the lack of computing power
> at those times in history. If somebody tells you that this difference
> is important, ask him or her whether he or she will be willing to
> finance you a room full of calculators (in the sense of Pearson's time)
> and whether he or she wants you to do all your calculations and analyses
> with these calculators in future. Alternatively, you could ask the
> person whether he or she would like the anaesthetist during his or her
> next operation to use chloroform given his or her nostalgic penchant for
> out-dated rituals/methods.
>
> > I played around and rescaled things, and tried different values for
> > B, but I never could get R to reach .051.
>
> Well, I have no problem when using simulated p-values to get something
> close to 0.051; look at the last try. The second one might also be
> noteworthy. Unfortunately, I didn't save the seed beforehand.
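>
> [For reproducibility, one could save or set the seed first - a sketch,
> with an arbitrary value:]
>
> set.seed(42)   # makes the simulated p-value below repeatable
> chisq.test(test29, simulate.p.value = TRUE, B = 20000)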
>
> > test29 <- matrix(c(110,358,71,312,29,139,31,77,13,32), byrow=TRUE,
> > ncol=2)
> > test29
> [,1] [,2]
> [1,] 110 358
> [2,] 71 312
> [3,] 29 139
> [4,] 31 77
> [5,] 13 32
> > chisq.test(test29, simul=TRUE)
>
> Pearson's Chi-squared test with simulated p-value (based on 2000
> replicates)
>
> data: test29
> X-squared = 9.593, df = NA, p-value = 0.04798
>
> > chisq.test(test29, simul=TRUE)
>
> Pearson's Chi-squared test with simulated p-value (based on 2000
> replicates)
>
> data: test29
> X-squared = 9.593, df = NA, p-value = 0.05697
>
> > chisq.test(test29, simul=TRUE, B=20000)
>
> Pearson's Chi-squared test with simulated p-value (based on
> 20000 replicates)
>
> data: test29
> X-squared = 9.593, df = NA, p-value = 0.0463
>
> > chisq.test(test29, simul=TRUE, B=20000)
>
> Pearson's Chi-squared test with simulated p-value (based on
> 20000 replicates)
>
> data: test29
> X-squared = 9.593, df = NA, p-value = 0.0499
>
> > chisq.test(test29, simul=TRUE, B=20000)
>
> Pearson's Chi-squared test with simulated p-value (based on
> 20000 replicates)
>
> data: test29
> X-squared = 9.593, df = NA, p-value = 0.0486
>
> > chisq.test(test29, simul=TRUE, B=20000)
>
> Pearson's Chi-squared test with simulated p-value (based on
> 20000 replicates)
>
> data: test29
> X-squared = 9.593, df = NA, p-value = 0.05125
>
>
> Cheers,
>
> Berwin
>
> =========================== Full address ============================
> Berwin A Turlach Tel.: +65 6516 4416 (secr)
> Dept of Statistics and Applied Probability +65 6516 6650 (self)
> Faculty of Science FAX : +65 6872 3919
> National University of Singapore
> 6 Science Drive 2, Blk S16, Level 7 e-mail: statba@nus.edu.sg
> Singapore 117546 http://www.stat.nus.edu.sg/~statba
>
>
>
> ------------------------------
>
> Message: 39
> Date: Wed, 26 Nov 2008 17:57:52 +0000
> From: Andrew Choens <andy.choens@gmail.com>
> Subject: Re: [R] Chi-Square Test Disagreement
> To: Berwin A Turlach <berwin@maths.uwa.edu.au>
> Cc: r-help@r-project.org
> Message-ID: <1227722272.8422.201.camel@chinstrap>
> Content-Type: text/plain
>
> On Thu, 2008-11-27 at 00:46 +0800, Berwin A Turlach wrote:
> > Chuck explained already the reason for this small difference. I just
> > take issue about it being an important difference. In my opinion,
> > this difference is not important at all. It would only be important
> > to people who are still sticking to arbitrary cut-off points that are
> > mainly due to historical coincidences and the lack of computing power
> > at those time in history. If somebody tells you that this difference
> > is important, ask him or her whether he or she will be willing to
> > finance you a room full of calculators (in the sense of Pearson's
> > time) and whether he or she wants you to do all your calculations and
> > analyses with these calculators in future. Alternatively, you could
> > ask the person whether he or she would like the anaesthetist during
> > his or her next operation to use chloroform given his or her nostalgic
> > penchant for out-dated rituals/methods.
>
> Yes he did and when I realized the source of my confusion I was
> appropriately chastised. I felt like a bit of a fool. Of course, I
> should try comparing apples to apples. Oranges are another thing
> entirely.
>
> As to the importance of the difference, I am of two minds. On the one
> hand I fully agree with you. It is an anachronistic approach. On the
> other hand we don't all have the pleasure of working in a math
> department where such subtleties are well understood.
>
> I work for a consulting firm that advises state and local governments
> (USA). I personally do try to expand my understanding on statistics and
> math (I do not have a degree in math), but my clients do not. When I'm
> working with someone from the government, it is sometimes easier to
> simply tell them that relationship x is significant at a certain level
> of certainty. Although I doubt they could really explain the details,
> they have some basic understanding of what I am talking about.
> Subtleties are sometimes lost on our public servants.
>
> And, since I do work for government, if I ask for a roomful of
> calculators, I might just get them. And really, what am I going to do
> with a roomful of calculators?
>
> --andy
>
>
> --
> Insert something humorous here. :-)
>
>
>
> ------------------------------
>
> Message: 40
> Date: Wed, 26 Nov 2008 17:22:40 +0100
> From: Mats Exter <mats.exter@uni-koeln.de>
> Subject: [R] Problem with aovlmer.fnc in languageR
> To: r-help@r-project.org
> Message-ID: <492D77D0.306@uni-koeln.de>
> Content-Type: text/plain; charset=ISO-8859-15; format=flowed
>
> Dear R list,
>
> I have a recurring problem with the languageR package, specifically the
> aovlmer.fnc function. When I try to run the following code (from R. H.
> Baayen's textbook):
>
>
> # Example 1:
> library(languageR)
> latinsquare.lmer <- lmer(RT ~ SOA + (1 | Word) + (1 | Subject),
> data = latinsquare)
> x <- pvals.fnc(latinsquare.lmer,
> withMCMC = TRUE)
> aovlmer.fnc(latinsquare.lmer,
> mcmc = x$mcmc,
> which = c("SOAmedium", "SOAshort"))
>
>
> I get the following error message (German locale):
>
>
> Fehler in anova(object) : Calculated PWRSS for a LMM is negative
> (i.e. "Error in anova(object) : Calculated PWRSS for a LMM is negative")
>
>
> Invoking traceback yields the following result:
>
>
> > traceback()
> 4: .Call(mer_update_projection, object)
> 3: anova(object)
> 2: anova(object)
> 1: aovlmer.fnc(latinsquare.lmer, mcmc = x$mcmc, which = c("SOAmedium",
>    "SOAshort"))
>
>
> By contrast, the following code (without the aovlmer.fnc command) runs
> without error:
>
>
> # Example 2:
> library(languageR)
> latinsquare.lmer <- lmer(RT ~ SOA + (1 | Word) + (1 | Subject),
> data = latinsquare)
> pvals.fnc(latinsquare.lmer,
> withMCMC = TRUE)
>
>
> Similarly, the following code (without the pvals.fnc command, and
> consequently...
>
> [Message clipped]
> [[alternative HTML version deleted]]