Pushpike J Thilakarathne
2008-Feb-11 12:26 UTC
[R] Testing for differences between groups, need help to find the right test in R. (Kes Knave)
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of r-help-request at r-project.org
Sent: Monday, February 11, 2008 12:00 PM
To: r-help at r-project.org
Subject: R-help Digest, Vol 60, Issue 11

Send R-help mailing list submissions to r-help at r-project.org

To subscribe or unsubscribe via the World Wide Web, visit https://stat.ethz.ch/mailman/listinfo/r-help or, via email, send a message with subject or body 'help' to

[pushpike] Hello, you could use the anova function in R. Regards, Pushpike.

r-help-request at r-project.org

You can reach the person managing the list at r-help-owner at r-project.org

When replying, please edit your Subject line so it is more specific than "Re: Contents of R-help digest..."

Today's Topics:

   1. Re: writing a function (Johannes Hüsing)
   2. Testing for differences between groups, need help to find the right test in R. (Kes Knave)
   3. Using 'sapply' and 'by' in one function (David & Natalia)
   4. Re: Using 'sapply' and 'by' in one function (Gabor Grothendieck)
   5. Do I need to use dropterm()?? (DaniWells)
   6. Re: Using 'sapply' and 'by' in one function (Gabor Grothendieck)
   7. Re: R on Mac PRO does anyone have experience with R on such a platform ? (Rod)
   8. Re: Do I need to use dropterm()?? (Bernard Leemon)
   9. Re: Which package should I use if I estimate a recursive model? (John Fox)
  10. Re: Using 'sapply' and 'by' in one function (hadley wickham)
  11. Error in optim while using fitdistr() function (Jason Q. McClintic)
  12. Re: Using 'sapply' and 'by' in one function (Gabor Grothendieck)
  13. Re: Error while using fitdistr() function or goodfit() function (Jason Q. McClintic)
  14. Re: Error while using fitdistr() function or goodfit() function (Aswad Gurjar)
  15. Re: Using 'sapply' and 'by' in one function (hadley wickham)
  16. building packages for Linux vs. Windows (Erin Hodgess)
  17. Re: building packages for Linux vs. Windows (Duncan Murdoch)
  18. prcomp vs. princomp vs fast.prcomp (Erin Hodgess)
  19. Re: building packages for Linux vs. Windows (Uwe Ligges)
  20. Re: [R-sig-Geo] Comparing spatial point patterns - Syrjala test (jiho)
  21. Re: building packages for Linux vs. Windows ((Ted Harding))
  22. Re: building packages for Linux vs. Windows (John Sorkin)
  23. Re: building packages for Linux vs. Windows (Gabor Grothendieck)
  24. Re: Vector Size (John Kane)
  25. grep etc. (Michael Kubovy)
  26. Re: grep etc. (Gabor Csardi)
  27. data frame question (joseph)
  28. data frame question (joseph)
  29. [OT] good reference for mixed models and EM algorithm (Erin Hodgess)
  30. Re: [OT] good reference for mixed models and EM algorithm (Spencer Graves)
  31. Re: data frame question (Mark Wardle)
  32. Re: data frame question (David Winsemius)
  33. Re: prcomp vs. princomp vs fast.prcomp (Liviu Andronic)
  34. Re: Applying lm to data with combn (Henrique Dallazuanna)
  35. reshape (juli pausas)
  36. Re: reshape (Henrique Dallazuanna)
  37. Re: reshape (Gabor Grothendieck)
  38. Re: Do I need to use dropterm()?? (Bill.Venables at csiro.au)
  39. j and jcross queries (Robert Biddle)
  40. Questions about histograms (Andre Nathan)
  41. Re: Questions about histograms (Duncan Murdoch)
  42. Re: Questions about histograms (Bill.Venables at csiro.au)
  43. Re: Questions about histograms (Andre Nathan)
  44. Re: Regression with time-dependent coefficients (dss)
  45. Using R in a university course: dealing with proposal comments (Arin Basu)
  46. Re: Using R in a university course: dealing with proposal comments (Bill.Venables at csiro.au)
  47. Re: Using R in a university course: dealing with proposal comments (Spencer Graves)
  48. Help with write.csv (Suhaila Zainudin)
  49. tree() producing NA's (Amnon Melzer)
  50. Re: Using R in a university course: dealing with proposal comments (Liviu Andronic)
  51. Re: learning S4 (Robin Hankin)
  52. Re: tree() producing NA's (Prof Brian Ripley)
  53. tree() producing NA's (Amnon Melzer)
  54. Dendrogram for agglomerative hierarchical clustering result (noorpiilur)
  55. The function predict (Carla Rebelo)
  56. Re: Help with write.csv (Richard.Cotton at hsl.gov.uk)
  57. Re: The function predict (Dieter Menne)
  58. Re: Help with write.csv (Suhaila Zainudin)
  59. Re: Dendrogram for agglomerative hierarchical clustering result (Wolfgang Huber)
  60. Conditional rows (Ng Stanley)
  61. image quality (John Lande)
  62. Re: Conditional rows (Gabor Csardi)
  63. Re: Conditional rows (Dimitris Rizopoulos)
  64. R programming style (David Scott)
  65. RGTK2 and glade on Windows - GUI newbie (Anja Kraft)

----------------------------------------------------------------------

Message: 1
Date: Sun, 10 Feb 2008 12:14:59 +0100
From: johannes at huesing.name (Johannes Hüsing)
Subject: Re: [R] writing a function
To: r-help at r-project.org
Message-ID: <20080210111459.GA6275 at huesing.name>
Content-Type: text/plain; charset=utf-8

mohamed nur anisah <nuranisah_mohamed at yahoo.com> [Fri, Feb 08, 2008 at 04:42:41PM CET]:
> Dear lists,
>
> I'm in my process of learning of writing a function. I tried to write a
> simple function of a matrix and a vector. Here are the codes:
>
> mm<-function(m,n){ #matrix function
>   w<-matrix(nrow=m, ncol=n)
>   for(i in 1:m){
>     for(j in 1:n){
>       w[i,j]=i+j
>     }
>   }
>   return(w[i,j])
> }
>

In addition to the other comments, allow me to remark that R provides a lot of convenience functions on vectors that make explicit looping unnecessary. An error such as yours wouldn't have occurred to a more experienced expRt because indices wouldn't turn up in the code at all:

mm <- function(m, n) {
  a <- matrix(nrow=m, ncol=n)
  row(a)+col(a)
}

Greetings

Johannes
--
Johannes Hüsing   There is something fascinating about science. One gets such wholesale returns of conjecture mailto:johannes at huesing.name from such a trifling investment of fact. http://derwisch.wikidot.com (Mark Twain, "Life on the Mississippi")

------------------------------

Message: 2
Date: Sun, 10 Feb 2008 13:37:15 +0100
From: "Kes Knave" <kestrel78 at gmail.com>
Subject: [R] Testing for differences between groups, need help to find the right test in R.
To: r-help at r-project.org
Message-ID: <a81140100802100437w2870ff11yc4b785eba5d03df6 at mail.gmail.com>
Content-Type: text/plain

Dear all,

I have a data set with four different groups, for each group I have several observations (the number of observations in each group is unequal), and I want to test if there are some differences in the values between the groups.

What will be the most proper way to test this in R?

Regards Kes

	[[alternative HTML version deleted]]
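To make Pushpike's suggestion above concrete: with four groups of unequal size, a one-way ANOVA is the usual starting point, and kruskal.test() is a rank-based alternative if the normality and equal-variance assumptions look doubtful. A minimal sketch follows; the data are simulated purely for illustration and are not from the thread.

set.seed(42)
group <- factor(rep(c("A", "B", "C", "D"), times = c(8, 12, 10, 15)))  # unequal group sizes
value <- rnorm(length(group), mean = as.numeric(group))               # made-up responses
d <- data.frame(group, value)

fit <- aov(value ~ group, data = d)
summary(fit)        # overall F-test: any difference between the groups?
TukeyHSD(fit)       # pairwise comparisons, adjusted for multiplicity

kruskal.test(value ~ group, data = d)   # non-parametric alternative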
------------------------------

Message: 3
Date: Sun, 10 Feb 2008 08:19:47 -0500
From: "David & Natalia" <3.14david at gmail.com>
Subject: [R] Using 'sapply' and 'by' in one function
To: r-help at R-project.org
Message-ID: <36d9e5e70802100519u49b27701qe252f0a4b7fc8e01 at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Greetings,

I'm having a problem with something that I think is very simple - I'd like to be able to use the 'sapply' and 'by' functions in 1 function to be able (for example) to get regression coefficients from multiple models by a grouping variable. I think that I'm missing something that is probably obvious to experienced users.

Here's a simple (trivial) example of what I'd like to do:

new <- data.frame(Outcome.1=rnorm(10),Outcome.2=rnorm(10),sex=rep(0:1,5),Pred=rnorm(10))
fxa <- function(x,data) { lm(x~Pred,data=data)$coef }
sapply(new[,1:2],fxa,new) # this yields coefficients for the predictor in separate models

fxb <- function(x) {lm(Outcome.1~Pred,da=x)$coef};
by(new,new$sex,fxb) #yields the coefficient for Outcome.1 for each sex

## I'd like to be able to combine 'sapply' and 'by' to be able to get the regression coefficients for Outcome.1 and Outcome.2 by each sex, rather than running fxb a second time predicting 'Outcome.2' or by subsetting the data - by sex - before I run the function, but the following doesn't work -

by(new,new$sex,FUN=function(x)sapply(x[,1:2],fxa,new))
'Error in model.frame.default(formula = x ~ Pred, data = data, drop.unused.levels = TRUE) :
 variable lengths differ (found for 'Pred')'

##I understand the error message - the length of 'Pred' is 10 while the length of each sex group is 5, but I'm not sure how to correctly write the 'by' function to use 'sapply' inside it. Could someone please point me in the right direction? Thanks very much in advance

David S Freedman, CDC (Atlanta USA) [definitely not the well-known statistician, David A Freedman, in Berkeley]

------------------------------

Message: 4
Date: Sun, 10 Feb 2008 08:43:37 -0500
From: "Gabor Grothendieck" <ggrothendieck at gmail.com>
Subject: Re: [R] Using 'sapply' and 'by' in one function
To: "David & Natalia" <3.14david at gmail.com>
Cc: r-help at r-project.org
Message-ID: <971536df0802100543k7d622feelfc0332219b0116b7 at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

By passing new to fxa via the second argument of fxa, new is not being subsetted hence the error. Try this:

by(new, new$sex, function(x) sapply(x[1:2], function(y) coef(lm(y ~ Pred, x))))

Actually, you can do the above without sapply as lm can take a matrix for the dependent variable:

by(new, new$sex, function(x) coef(lm(as.matrix(x[1:2]) ~ Pred, x)))

On Feb 10, 2008 8:19 AM, David & Natalia <3.14david at gmail.com> wrote:
> Greetings,
>
> I'm having a problem with something that I think is very simple - I'd
> like to be able to use the 'sapply' and 'by' functions in 1 function
> to be able (for example) to get regression coefficients from multiple
> models by a grouping variable. I think that I'm missing something
> that is probably obvious to experienced users.
>
> Here's a simple (trivial) example of what I'd like to do:
>
> new <- data.frame(Outcome.1=rnorm(10),Outcome.2=rnorm(10),sex=rep(0:1,5),Pred=rnorm(10))
> fxa <- function(x,data) { lm(x~Pred,data=data)$coef }
> sapply(new[,1:2],fxa,new) # this yields coefficients for the
> predictor in separate models
>
> fxb <- function(x) {lm(Outcome.1~Pred,da=x)$coef};
> by(new,new$sex,fxb) #yields the coefficient for Outcome.1 for each sex
>
> ## I'd like to be able to combine 'sapply' and 'by' to be able to get
> the regression coefficients for Outcome.1 and Outcome.2 by each sex,
> rather than running fxb a second time predicting 'Outcome.2' or by
> subsetting the data - by sex - before I run the function, but the
> following doesn't work -
>
> by(new,new$sex,FUN=function(x)sapply(x[,1:2],fxa,new))
> 'Error in model.frame.default(formula = x ~ Pred, data = data,
> drop.unused.levels = TRUE) :
> variable lengths differ (found for 'Pred')'
>
> ##I understand the error message - the length of 'Pred' is 10 while
> the length of each sex group is 5, but I'm not sure how to correctly
> write the 'by' function to use 'sapply' inside it. Could someone
> please point me in the right direction? Thanks very much in advance
>
> David S Freedman, CDC (Atlanta USA) [definitely not the well-known
> statistician, David A Freedman, in Berkeley]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

------------------------------

Message: 5
Date: Sun, 10 Feb 2008 05:40:08 -0800 (PST)
From: DaniWells <dani_wells210 at hotmail.com>
Subject: [R] Do I need to use dropterm()??
To: r-help at r-project.org
Message-ID: <15396151.post at talk.nabble.com>
Content-Type: text/plain; charset=us-ascii

Hello,

I'm having some difficulty understanding the usage of the "dropterm()" function in the MASS library. What exactly does it do? I'm very new to R, so any pointers would be very helpful. I've read many definitions of what dropterm() does, but none seem to stick in my mind or click with me.

I've coded everything fine for an interaction that runs as follows: two sets of data (one for North aspect, one for Southern Aspect) and have a logscale on the x axis, with survival on the y. After calculating my anova results i have all significant results (ie aspect = sig, logscale of sunlight sig, and aspect:llight = sig).

When i have all significant results in my ANOVA table, do i need dropterm(), or is that just to remove insignificant terms?

Many thanks,

Dani
--
View this message in context: http://www.nabble.com/Do-I-need-to-use-dropterm%28%29---tp15396151.html
Sent from the R help mailing list archive at Nabble.com.
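For readers who, like Dani, are unsure what dropterm() actually reports, here is a minimal sketch. The data are simulated purely so the call is reproducible; the variable names (aspect, llight, survival) are invented stand-ins for her design, not her data.

library(MASS)

set.seed(1)
seedlings <- data.frame(aspect = rep(c("North", "South"), each = 50),
                        llight = rep(log(seq(10, 1000, length.out = 50)), 2))
seedlings$survival <- rbinom(100, 1, plogis(-2 + 0.4 * seedlings$llight))

fit <- glm(survival ~ aspect * llight, family = binomial, data = seedlings)

## dropterm() refits the model once for each term that can be removed without
## breaking marginality (here only the aspect:llight interaction) and compares
## the fits -- it is a model-comparison tool, not something that has to be run
## when every term is already significant.
dropterm(fit, test = "Chisq")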
------------------------------ Message: 6 Date: Sun, 10 Feb 2008 09:25:43 -0500 From: "Gabor Grothendieck" <ggrothendieck at gmail.com> Subject: Re: [R] Using 'sapply' and 'by' in one function To: "David & Natalia" <3.14david at gmail.com> Cc: r-help at r-project.org Message-ID: <971536df0802100625v488c8374v970a93a726596f92 at mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 Actually thinking about this, not only do you not need sapply but you don't even need by: new2 <- transform(new, sex = factor(sex)) coef(lm(as.matrix(new2[1:2]) ~ sex/Pred - 1, new2)) On Feb 10, 2008 8:43 AM, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:> By passing new to fxa via the second argument of fxa, new is not being > subsetted hence the error. Try this: > > by(new, new$sex, function(x) sapply(x[1:2], function(y) coef(lm(y ~Pred, x)))> > Actually, you can do the above without sapply as lm can take a matrix > for the dependent variable: > > by(new, new$sex, function(x) coef(lm(as.matrix(x[1:2]) ~ Pred, x))) > > > On Feb 10, 2008 8:19 AM, David & Natalia <3.14david at gmail.com> wrote: > > Greetings, > > > > I'm having a problem with something that I think is very simple -I'd> > like to be able to use the 'sapply' and 'by' functions in 1 function > > to be able (for example) to get regression coefficients frommultiple> > models by a grouping variable. I think that I'm missing something > > that is probably obvious to experienced users. > > > > Here's a simple (trivial) example of what I'd like to do: > > > > new <-data.frame(Outcome.1=rnorm(10),Outcome.2=rnorm(10),sex=rep(0:1,5),Pred=r norm(10))> > fxa <- function(x,data) { lm(x~Pred,data=data)$coef } > > sapply(new[,1:2],fxa,new) # this yields coefficients for the > > predictor in separate models > > > > fxb <- function(x) {lm(Outcome.1~Pred,da=x)$coef}; > > by(new,new$sex,fxb) #yields the coefficient for Outcome.1 for eachsex> > > > ## I'd like to be able to combine 'sapply' and 'by' to be able toget> > the regression coefficients for Outome.1 and Outcome.2 by each sex, > > rather than running fxb a second time predicting 'Outcome.2' or by > > subsetting the data - by sex - before I run the function, but the > > following doesn't work - > > > > by(new,new$sex,FUN=function(x)sapply(x[,1:2],fxa,new)) > > 'Error in model.frame.default(formula = x ~ Pred, data = data, > > drop.unused.levels = TRUE) : > > variable lengths differ (found for 'Pred')' > > > > ##I understand the error message - the length of 'Pred' is 10 while > > the length of each sex group is 5, but I'm not sure how to correctly > > write the 'by' function to use 'sapply' inside it. Could someone > > please point me in the right direction? Thanks very much in advance > > > > David S Freedman, CDC (Atlanta USA) [definitely not the well-know > > statistician, David A Freedman, in Berkeley] > > > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html> > and provide commented, minimal, self-contained, reproducible code. > > >------------------------------ Message: 7 Date: Sun, 10 Feb 2008 15:39:27 +0100 From: Rod <villegas.ro at gmail.com> Subject: Re: [R] R on Mac PRO does anyone have experience with R on such a platform ? 
To: "Maura E Monville" <maura.monville at gmail.com> Cc: r-help at r-project.org Message-ID: <29cf68350802100639g5541eb22qaecfae854c3c0d3e at mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 On Feb 10, 2008 2:29 AM, Maura E Monville <maura.monville at gmail.com> wrote:> I saw there exists an R version for Mac/OS. > I'd like to hear from someone who is running R on a Mac/OS beforeventuring> on getting the following computer system. > I am in the process of choosing a powerful laptop 17" MB PRO > 2.6GHZ(dual-core) 4GBRAM .... > > Thank you so much, > -- > Maura E.M > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html> and provide commented, minimal, self-contained, reproducible code. >You can see the R MacOSX FAQ, http://cran.es.r-project.org/bin/macosx/RMacOSX-FAQ.html. Also you can post in the Mac R list (R-sig-mac) Rod. ------------------------------ Message: 8 Date: Sun, 10 Feb 2008 07:39:59 -0700 From: "Bernard Leemon" <bernie.leemon at gmail.com> Subject: Re: [R] Do I need to use dropterm()?? To: DaniWells <dani_wells210 at hotmail.com> Cc: r-help at r-project.org Message-ID: <5543f5e40802100639kc82afccl574927e5aba31873 at mail.gmail.com> Content-Type: text/plain Hi Dani, it would be better to start with a question you are trying to ask of your data rather than trying to figure out what a particular function does. with your variables and model, even if the component terms were not significant, they must in the model or the product of sunlight and aspect will NOT represent the interaction. also note that the tests of your components are probably not what you think they are. in general, tests of components of interactions test the simple effect of that variable when the other variable is 0. hence, your 'significant' result for aspect pertains to when log sunlight is 0, which probably isn't what you want to be testing. what the significant effect for sunlight means depends on how aspect was coded. you should check to see what code was used to know what zero means. gary mcclelland colorado On Sun, Feb 10, 2008 at 6:40 AM, DaniWells <dani_wells210 at hotmail.com> wrote:> > Hello, > > I'm having some difficulty understanding the useage of the"dropterm()"> function in the MASS library. What exactly does it do? I'm very new toR,> so > any pointers would be very helpful. I've read many definitions of what > dropterm() does, but none seem to stick in my mind or click with me. > > I've coded everything fine for an interaction that runs as follows:two> sets > of data (one for North aspect, one for Southern Aspect) and have a > logscale > on the x axis, with survival on the y. After calculating my anovaresults> i > have all significant results (ie aspect = sig, logscale of sunlight sig, > and aspect:llight = sig). > > When i have all significant results in my ANOVA table, do i need > dropterm(), > or is that just to remove insignificant terms? > > Many thanks, > > Dani > -- > View this message in context: >http://www.nabble.com/Do-I-need-to-use-dropterm%28%29---tp15396151p15396 151.html> Sent from the R help mailing list archive at Nabble.com. 
> > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >[[alternative HTML version deleted]] ------------------------------ Message: 9 Date: Sun, 10 Feb 2008 09:41:45 -0500 From: "John Fox" <jfox at mcmaster.ca> Subject: Re: [R] Which package should I use if I estimate a recursive model? To: "'Yongfu He'" <uaedmontonhe at hotmail.com> Cc: r-help at r-project.org Message-ID: <000301c86bf3$0f264390$2d72cab0$@ca> Content-Type: text/plain; charset="us-ascii" Dear Yongfu He, If you mean a recursive structural-equation model, then if you're willing to assume normally distributed errors, equation-by-equation OLS regression, using lm(), will give you the full-information maximum-likelihood estimates of the structural coefficients. You could also use the sem() function in the sem package, but, aside from getting a test of over-identifying restrictions (assuming that the model is overidentified), there's not much reason to do so -- you'll get the same estimates. I hope this helps, John -------------------------------- John Fox, Professor Department of Sociology McMaster University Hamilton, Ontario, Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox> -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r- > project.org] On Behalf Of Yongfu He > Sent: February-09-08 9:16 PM > To: r-help at r-project.org > Subject: [R] Which package should I use if I estimate a recursive > model? > > > Dear All: > > I want to estimate a simple recursive mode in R. Which package shouldI> use? Thank you very much in advance. > > Yongfu He > _________________________________________________________________ > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code.------------------------------ Message: 10 Date: Sun, 10 Feb 2008 09:12:11 -0600 From: "hadley wickham" <h.wickham at gmail.com> Subject: Re: [R] Using 'sapply' and 'by' in one function To: "Gabor Grothendieck" <ggrothendieck at gmail.com> Cc: r-help at r-project.org, David & Natalia <3.14david at gmail.com> Message-ID: <f8e6ff050802100712m330ac081l432eb8cf9855d012 at mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 On Feb 10, 2008 8:25 AM, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:> Actually thinking about this, not only do you not need sapply but you > don't even need by: > > new2 <- transform(new, sex = factor(sex)) > coef(lm(as.matrix(new2[1:2]) ~ sex/Pred - 1, new2))Although that's a very slightly different model, as it assumes that both sexes have the same error variance. Hadley -- http://had.co.nz/ ------------------------------ Message: 11 Date: Sun, 10 Feb 2008 09:15:15 -0600 From: "Jason Q. McClintic" <jqmcclintic at stthomas.edu> Subject: [R] Error in optim while using fitdistr() function To: "Aswad Gurjar" <aswadgurjar at gmail.com> Cc: r-help at r-project.org Message-ID: <47AF1503.7080408 at stthomas.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed I get the digest, so I apologize if this is a little late. 
For your situation (based on the description and what I think your code is doing, more on that below), it looks like you are modeling a Poisson flow where the number of hits per unit time is a random integer with some mean value. If I understand your code correctly, you are trying to put your data into k bins of width f<-(max(V1)-min(V1))/k. In that case I would think something like this would work more efficiently: m<-min(V1); k<-floor(1 + log2(length(V1))); f<-(max(V1)-min(V1))/k; binCount<-NULL; for(i in seq(length=k)){ binIndex<-which((m+(i-1)*f<V1)&(V1<m+i*f)); binCount[i]<-sum(V2[binIndex]); }; where i becomes the index of time intervals. Hope it helps. Sincerely, Jason Q. McClintic r-help-request at r-project.org wrote:> Send R-help mailing list submissions to > r-help at r-project.org > > To subscribe or unsubscribe via the World Wide Web, visit > https://stat.ethz.ch/mailman/listinfo/r-help > or, via email, send a message with subject or body 'help' to > r-help-request at r-project.org > > You can reach the person managing the list at > r-help-owner at r-project.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of R-help digest..." >------------------------------ Message: 12 Date: Sun, 10 Feb 2008 10:15:42 -0500 From: "Gabor Grothendieck" <ggrothendieck at gmail.com> Subject: Re: [R] Using 'sapply' and 'by' in one function To: "hadley wickham" <h.wickham at gmail.com> Cc: r-help at r-project.org, David & Natalia <3.14david at gmail.com> Message-ID: <971536df0802100715se131ecbx10e9ce2ca9091427 at mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 On Feb 10, 2008 10:12 AM, hadley wickham <h.wickham at gmail.com> wrote:> On Feb 10, 2008 8:25 AM, Gabor Grothendieck <ggrothendieck at gmail.com>wrote:> > Actually thinking about this, not only do you not need sapply butyou> > don't even need by: > > > > new2 <- transform(new, sex = factor(sex)) > > coef(lm(as.matrix(new2[1:2]) ~ sex/Pred - 1, new2)) > > Although that's a very slightly different model, as it assumes that > both sexes have the same error variance. >But the output are the coefficients and they are identical. ------------------------------ Message: 13 Date: Sun, 10 Feb 2008 10:28:00 -0600 From: "Jason Q. McClintic" <jqmcclintic at stthomas.edu> Subject: Re: [R] Error while using fitdistr() function or goodfit() function To: Aswad Gurjar <aswadgurjar at gmail.com> Cc: r-help at r-project.org Message-ID: <47AF2610.2050602 at stthomas.edu> Content-Type: text/plain; charset=UTF-8; format=flowed Try changing your method to "ML" and try again. I tried the run the first example from the documentation and it failed with the same error. Changing the estimation method to ML worked. @List: Can anyone else verify the error I got? I literally ran the following two lines interactively from the example for goodfit: dummy <- rnbinom(200, size = 1.5, prob = 0.8) gf <- goodfit(dummy, type = "nbinomial", method = "MinChisq") and got back Warning messages: 1: In pnbinom(q, size, prob, lower.tail, log.p) : NaNs produced 2: In pnbinom(q, size, prob, lower.tail, log.p) : NaNs produced Again, I hope this helps. Sincerely, Jason Q. McClintic Aswad Gurjar wrote:> Hello, > > Thanks for help.But I am facing different problem. 
> > I have 421 readings of time and no of requests coming at perticulartime.Basically I have data with interval of one minute and corresponding no of requests.It is discrete in nature.I am collecting data from 9AM to 4PM.But some of readings are coming as 0.When I plotted histogram of data I could not get shape of any standard distribution.Now,my aim is to find distribution which is "best fit" to my data among standard ones.> > So there was huge data.That's why I tried to collect data into no ofbins.That was working properly.Whatever code you have given is working properly too.But your code is more efficient.Now,problem comes at next stage.When I apply fitdistr() for continuous data or goodfit() for discrete data I get following error.I am not able to remove that error.Please help me if you can.> Errors are as follows: > library(vcd) > gf<-goodfit(binCount,type= "nbinomial",method= "MinChisq") > Warning messages: > 1: NaNs produced in: pnbinom(q, size, prob, lower.tail, log.p) > 2: NaNs produced in: pnbinom(q, size, prob, lower.tail, log.p) > 3: NaNs produced in: pnbinom(q, size, prob, lower.tail, log.p) > 4: NaNs produced in: pnbinom(q, size, prob, lower.tail, log.p) > 5: NaNs produced in: pnbinom(q, size, prob, lower.tail, log.p) >> summary(gf) > > Goodness-of-fit test for nbinomial distribution > > X^2 df P(> X^2) > Pearson 9.811273 2 0.007404729 > Warning message: > Chi-squared approximation may be incorrect in: summary.goodfit(gf) > > for another distribution: > gf<-goodfit(binCount,type= "poisson",method= "MinChisq") > Warning messages: > 1: NA/Inf replaced by maximum positive value in: optimize(chi2,range(count))> 2: NA/Inf replaced by maximum positive value in: optimize(chi2,range(count))> 3: NA/Inf replaced by maximum positive value in: optimize(chi2,range(count))> 4: NA/Inf replaced by maximum positive value in: optimize(chi2,range(count))> 5: NA/Inf replaced by maximum positive value in: optimize(chi2,range(count))> 6: NA/Inf replaced by maximum positive value in: optimize(chi2,range(count))> 7: NA/Inf replaced by maximum positive value in: optimize(chi2,range(count))> 8: NA/Inf replaced by maximum positive value in: optimize(chi2,range(count))> Goodness-of-fit test for poisson distribution > > X^2 df P(> X^2) > Pearson 1.660931e+115 3 0 > Warning message: > Chi-squared approximation may be incorrect in: summary.goodfit(gf) > > > Aswad > On 2/10/08, Jason Q. McClintic < jqmcclintic at stthomas.edu> wrote: > > I get the digest, so I apologize if this is a little late. > > For your situation (based on the description and what I think yourcode> is doing, more on that below), it looks like you are modeling aPoisson> flow where the number of hits per unit time is a random integer with > some mean value. > > If I understand your code correctly, you are trying to put your data > into k bins of width f<-(max(V1)-min(V1))/k. In that case I wouldthink> something like this would work more efficiently: > > m<-min(V1); > k<-floor(1 + log2(length(V1))); > f<-(max(V1)-min(V1))/k; > binCount<-NULL; > for(i in seq(length=k)){ > binIndex<-which((m+(i-1)*f<V1)&(V1<m+i*f)); > binCount[i]<-sum(V2[binIndex]); > }; > > where i becomes the index of time intervals. > > Hope it helps. > > Sincerely, > > Jason Q. 
McClintic > > r-help-request at r-project.org wrote: >> Send R-help mailing list submissions to >> r-help at r-project.org >> >> To subscribe or unsubscribe via the World Wide Web, visit >> https://stat.ethz.ch/mailman/listinfo/r-help >> or, via email, send a message with subject or body 'help' to >> r-help-request at r-project.org >> >> You can reach the person managing the list at >> r-help-owner at r-project.org >> >> When replying, please edit your Subject line so it is more specific >> than "Re: Contents of R-help digest..." >> > > >------------------------------ Message: 14 Date: Sun, 10 Feb 2008 22:20:17 +0530 From: "Aswad Gurjar" <aswadgurjar at gmail.com> Subject: Re: [R] Error while using fitdistr() function or goodfit() function To: jqmcclintic at stthomas.edu Cc: r-help at r-project.org Message-ID: <86d768f00802100850y6c39f232k51c0add42b18c95 at mail.gmail.com> Content-Type: text/plain Hello, Thanks that helped for poisson. When I changed method to ML it worked for poisson but when I used that for nbinomial I got errors.But why is this happening? gf<-goodfit(binCount,type= "poisson") summary(gf) Goodness-of-fit test for poisson distribution X^2 df P(> X^2) Likelihood Ratio 2730.24 3 0 gf<-goodfit(binCount,type= "nbinomial") Warning messages: 1: NaNs produced in: dnbinom(x, size, prob, log) 2: NaNs produced in: dnbinom(x, size, prob, log) summary(gf) Goodness-of-fit test for nbinomial distribution X^2 df P(> X^2) Likelihood Ratio 64.53056 2 9.713306e-15 But how can I interpret above result? When I was using goodfit using method "MinChisq" I was getting some P value.More the P value among goodness of fit tests for different distributions (poisson,binomial,nbinomial) better the fit would be.Am I correct?If I am wrong correct me. But now with ML method how can I decide which distribution is best fit? Thank You. Aswad On 2/10/08, Jason Q. McClintic <jqmcclintic at stthomas.edu> wrote:> > Try changing your method to "ML" and try again. I tried the run the > first example from the documentation and it failed with the sameerror.> Changing the estimation method to ML worked. > > @List: Can anyone else verify the error I got? I literally ran the > following two lines interactively from the example for goodfit: > > dummy <- rnbinom(200, size = 1.5, prob = 0.8) > gf <- goodfit(dummy, type = "nbinomial", method = "MinChisq") > > and got back > > Warning messages: > 1: In pnbinom(q, size, prob, lower.tail, log.p) : NaNs produced > 2: In pnbinom(q, size, prob, lower.tail, log.p) : NaNs produced > > Again, I hope this helps. > > Sincerely, > > Jason Q. McClintic > > Aswad Gurjar wrote: > > Hello, > > > > Thanks for help.But I am facing different problem. > > > > I have 421 readings of time and no of requests coming at perticular > time.Basically I have data with interval of one minute andcorresponding> no of requests.It is discrete in nature.I am collecting data from 9AMto> 4PM.But some of readings are coming as 0.When I plotted histogram ofdata> I could not get shape of any standard distribution.Now,my aim is tofind> distribution which is "best fit" to my data among standard ones. > > > > So there was huge data.That's why I tried to collect data into noof> bins.That was working properly.Whatever code you have given is working > properly too.But your code is more efficient.Now,problem comes at next > stage.When I apply fitdistr() for continuous data or goodfit() for > discrete data I get following error.I am not able to remove that > error.Please help me if you can. 
> > Errors are as follows: > > library(vcd) > > gf<-goodfit(binCount,type= "nbinomial",method= "MinChisq") > > Warning messages: > > 1: NaNs produced in: pnbinom(q, size, prob, lower.tail, log.p) > > 2: NaNs produced in: pnbinom(q, size, prob, lower.tail, log.p) > > 3: NaNs produced in: pnbinom(q, size, prob, lower.tail, log.p) > > 4: NaNs produced in: pnbinom(q, size, prob, lower.tail, log.p) > > 5: NaNs produced in: pnbinom(q, size, prob, lower.tail, log.p) > >> summary(gf) > > > > Goodness-of-fit test for nbinomial distribution > > > > X^2 df P(> X^2) > > Pearson 9.811273 2 0.007404729 > > Warning message: > > Chi-squared approximation may be incorrect in: summary.goodfit(gf) > > > > for another distribution: > > gf<-goodfit(binCount,type= "poisson",method= "MinChisq") > > Warning messages: > > 1: NA/Inf replaced by maximum positive value in: optimize(chi2, > range(count)) > > 2: NA/Inf replaced by maximum positive value in: optimize(chi2, > range(count)) > > 3: NA/Inf replaced by maximum positive value in: optimize(chi2, > range(count)) > > 4: NA/Inf replaced by maximum positive value in: optimize(chi2, > range(count)) > > 5: NA/Inf replaced by maximum positive value in: optimize(chi2, > range(count)) > > 6: NA/Inf replaced by maximum positive value in: optimize(chi2, > range(count)) > > 7: NA/Inf replaced by maximum positive value in: optimize(chi2, > range(count)) > > 8: NA/Inf replaced by maximum positive value in: optimize(chi2, > range(count)) > > Goodness-of-fit test for poisson distribution > > > > X^2 df P(> X^2) > > Pearson 1.660931e+115 3 0 > > Warning message: > > Chi-squared approximation may be incorrect in: summary.goodfit(gf) > > > > > > Aswad > > On 2/10/08, Jason Q. McClintic < jqmcclintic at stthomas.edu> wrote: > > > > I get the digest, so I apologize if this is a little late. > > > > For your situation (based on the description and what I think yourcode> > is doing, more on that below), it looks like you are modeling aPoisson> > flow where the number of hits per unit time is a random integer with > > some mean value. > > > > If I understand your code correctly, you are trying to put your data > > into k bins of width f<-(max(V1)-min(V1))/k. In that case I wouldthink> > something like this would work more efficiently: > > > > m<-min(V1); > > k<-floor(1 + log2(length(V1))); > > f<-(max(V1)-min(V1))/k; > > binCount<-NULL; > > for(i in seq(length=k)){ > > binIndex<-which((m+(i-1)*f<V1)&(V1<m+i*f)); > > binCount[i]<-sum(V2[binIndex]); > > }; > > > > where i becomes the index of time intervals. > > > > Hope it helps. > > > > Sincerely, > > > > Jason Q. McClintic > > > > r-help-request at r-project.org wrote: > >> Send R-help mailing list submissions to > >> r-help at r-project.org > >> > >> To subscribe or unsubscribe via the World Wide Web, visit > >> https://stat.ethz.ch/mailman/listinfo/r-help > >> or, via email, send a message with subject or body 'help' to > >> r-help-request at r-project.org > >> > >> You can reach the person managing the list at > >> r-help-owner at r-project.org > >> > >> When replying, please edit your Subject line so it is more specific > >> than "Re: Contents of R-help digest..." 
> >> > > > > > > >[[alternative HTML version deleted]] ------------------------------ Message: 15 Date: Sun, 10 Feb 2008 10:57:11 -0600 From: "hadley wickham" <h.wickham at gmail.com> Subject: Re: [R] Using 'sapply' and 'by' in one function To: "Gabor Grothendieck" <ggrothendieck at gmail.com> Cc: r-help at r-project.org, David & Natalia <3.14david at gmail.com> Message-ID: <f8e6ff050802100857i6a7953d2mcaef414f4696c0b4 at mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1> > Although that's a very slightly different model, as it assumes that > > both sexes have the same error variance. > > > > But the output are the coefficients and they are identical.For the sake of an example I'm sure that David simply omitted the part of his analysis where he looked at the standard errors as well ;) Hadley -- http://had.co.nz/ ------------------------------ Message: 16 Date: Sun, 10 Feb 2008 12:07:56 -0600 From: "Erin Hodgess" <erinm.hodgess at gmail.com> Subject: [R] building packages for Linux vs. Windows To: r-help at r-project.org Message-ID: <7acc7a990802101007p21c21247hd461eb0ec3fc6759 at mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 Hi R People: I sure that this is a really easy question, but here goes: I'm trying to build a package that will run on both Linux and Windows. However, there are several commands in a section that will be different in Linux than they are in Windows. Would I be better off just to build two separate packages, please? If just one is needed, how could I determine which system is running in order to use the correct command, please? Thanks in advance, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodgess at gmail.com ------------------------------ Message: 17 Date: Sun, 10 Feb 2008 13:20:55 -0500 From: Duncan Murdoch <murdoch at stats.uwo.ca> Subject: Re: [R] building packages for Linux vs. Windows To: Erin Hodgess <erinm.hodgess at gmail.com> Cc: r-help at r-project.org Message-ID: <47AF4087.2030501 at stats.uwo.ca> Content-Type: text/plain; charset=ISO-8859-1; format=flowed On 10/02/2008 1:07 PM, Erin Hodgess wrote:> Hi R People: > > I sure that this is a really easy question, but here goes: > > I'm trying to build a package that will run on both Linux and Windows. > > However, there are several commands in a section that will be > different in Linux than they are in Windows. > > Would I be better off just to build two separate packages, please? > If just one is needed, how could I determine which system is running > in order to use the correct command, please?You will find it much easier to build just one package. You can use .Platform or (for more detail) Sys.info() to find out what kind of system you're running on. Remember that R doesn't just run on Linux and Windows: there's also MacOSX, and other Unix and Unix-like systems (Solaris, etc.). Duncan Murdoch ------------------------------ Message: 18 Date: Sun, 10 Feb 2008 12:25:22 -0600 From: "Erin Hodgess" <erinm.hodgess at gmail.com> Subject: [R] prcomp vs. princomp vs fast.prcomp To: r-help at r-project.org Message-ID: <7acc7a990802101025u3fd6f87cldcb81972bd327c7f at mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 Hi R People: When performing PCA, should I use prcomp, princomp or fast.prcomp, please? thanks. 
Erin
--
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodgess at gmail.com

------------------------------

Message: 19
Date: Sun, 10 Feb 2008 19:28:49 +0100
From: Uwe Ligges <ligges at statistik.tu-dortmund.de>
Subject: Re: [R] building packages for Linux vs. Windows
To: Erin Hodgess <erinm.hodgess at gmail.com>
Cc: r-help at r-project.org, Duncan Murdoch <murdoch at stats.uwo.ca>
Message-ID: <47AF4261.5050701 at statistik.tu-dortmund.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Duncan Murdoch wrote:
> On 10/02/2008 1:07 PM, Erin Hodgess wrote:
>> Hi R People:
>>
>> I sure that this is a really easy question, but here goes:
>>
>> I'm trying to build a package that will run on both Linux and Windows.
>>
>> However, there are several commands in a section that will be
>> different in Linux than they are in Windows.
>>
>> Would I be better off just to build two separate packages, please?
>> If just one is needed, how could I determine which system is running
>> in order to use the correct command, please?
>
> You will find it much easier to build just one package.
>
> You can use .Platform or (for more detail) Sys.info() to find out what
> kind of system you're running on. Remember that R doesn't just run on
> Linux and Windows: there's also MacOSX, and other Unix and Unix-like
> systems (Solaris, etc.).

Erin,

moreover, R has a nice facility that allows you to write one function for Windows, another one for Mac and a third one for Unix-alikes and place them in subfolders of ./R with the right names, see the Writing R Extensions manual.

Uwe Ligges

> Duncan Murdoch
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

------------------------------

Message: 20
Date: Sun, 10 Feb 2008 19:30:03 +0100
From: jiho <jo.irisson at gmail.com>
Subject: Re: [R] [R-sig-Geo] Comparing spatial point patterns - Syrjala test
To: bayesianlogic at acm.org, R Help <r-help at stat.math.ethz.ch>, r-sig-geo at stat.math.ethz.ch
Message-ID: <DDDEA575-82FC-4472-BF34-37973BB4EDFA at gmail.com>
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes

Hi,

I went ahead and implemented something. However:
- I cannot guarantee it gives correct results since, unfortunately, the data used in Syrjala 1996 is not published along with the paper. To avoid mistakes, I started by coding things in a fast and simple way and then tried to optimize the code. At least all versions give the same results.
- As expected, the test is still quite slow since it relies on permutations to compute the p.value. The successive optimizations allowed me to go from 73 to 13 seconds on my machine, but 13 seconds is still a long time. Furthermore, I don't know how the different versions would scale according to the number of points (I only tested with one dataset).

I'm not very good at "thinking vector" so if someone could look at this and further improve it, I would welcome patches. Maybe the only real solution would be to go the Fortran way and link some code to R, but I did not want to wander in such scary places ;)

The code and test data are here: http://cbetm.univ-perp.fr/irisson/svn/distribution_data/tetiaroa/trunk/data/lib_spatial.R

Warning: it probably uses non canonical S syntax, sorry for those with sensitive eyes.
On 2008-February-10 , at 17:02 , Jan Theodore Galkowski wrote:> I'm also interested here in comparing spatial point patterns. So, if > anyone finds any further R-based, or S-plus-based work on the > matter, or > any more recent references, might you please include me in the > distribution list? > > Thanks much!Begin forwarded message:> From: jiho <jo.irisson at gmail.com> > Subject: Comparing spatial point patterns - Syrjala test > > Dear Lists, > > At several stations distributed regularly in space[1], we sampled > repeatedly (4 times) the abundance of organisms and measured > environmental parameters. I now want to compare the spatial > distribution of various species (and test wether they differ or > not), or to compare the distribution of a particular organism with > the distribution of some environmental variable. > Syrjala's test[2] seems to be appropriate for such comparisons. The > hamming distance is also used (but it is not associated with a > test). However, as far as I understand it, Syrjala's test only > compares the distribution gathered during one sampling event, while > I have four successive repeats and: > - I am interested in comparing if, on average, the distributions are > the same > - I would prefer to keep the information regarding the variability > of the abundances in time, rather than just comparing the means, > since the abundances are quite variable. > > Therefore I have two questions for all the knowledgeable R users on > these lists: > - Is there a package in which Syrjala's test is implemented for R? > - Is there another way (a better way) to test for such differences? > > Thank you very much in advance for your help. > > [1] http://jo.irisson.free.fr/work/research_tetiaroa.html > [2]http://findarticles.com/p/articles/mi_m2120/is_n1_v77/ai_18066337/pg_7 JiHO --- http://jo.irisson.free.fr/ ------------------------------ Message: 21 Date: Sun, 10 Feb 2008 18:39:40 -0000 (GMT) From: (Ted Harding) <Ted.Harding at manchester.ac.uk> Subject: Re: [R] building packages for Linux vs. Windows To: Erin Hodgess <erinm.hodgess at gmail.com>, r-help at r-project.org Message-ID: <XFMail.080210183940.Ted.Harding at manchester.ac.uk> Content-Type: text/plain; charset=iso-8859-1 On 10-Feb-08 18:07:56, Erin Hodgess wrote:> Hi R People: > > I sure that this is a really easy question, but here goes: > > I'm trying to build a package that will run on both Linux and Windows. > > However, there are several commands in a section that will be > different in Linux than they are in Windows. > > Would I be better off just to build two separate packages, please? > If just one is needed, how could I determine which system is running > in order to use the correct command, please? > > Thanks in advance, > ErinThere is the "version" (a list) variable: version # platform i486-pc-linux-gnu # arch i486 # os linux-gnu # system i486, linux-gnu # status Patched # major 2 # minor 4.0 # year 2006 # month 11 # day 25 # svn rev 39997 # language R from which you can extract the "os" component: version$os # [1] "linux-gnu" I don;t know what this says on a Windows system, but it surely won't mention Linux! So testing this wil enable you to set a flag, e.g. Linux<-ifelse(length(grep("linux",version$os))>0, TRUE, FALSE) if(Linux){window<-function(...) X11(...)} else {window<-function(...) windows(...)} Hoping this helps, Ted. 
-------------------------------------------------------------------- E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk> Fax-to-email: +44 (0)870 094 0861 Date: 10-Feb-08 Time: 18:39:29 ------------------------------ XFMail ------------------------------ ------------------------------ Message: 22 Date: Sun, 10 Feb 2008 13:49:24 -0500 From: "John Sorkin" <jsorkin at grecc.umaryland.edu> Subject: Re: [R] building packages for Linux vs. Windows To: "Erin Hodgess" <erinm.hodgess at gmail.com>, <ted.harding at manchester.ac.uk>, <r-help at r-project.org> Message-ID: <47AF00B2.91DF.00CB.0 at grecc.umaryland.edu> Content-Type: text/plain; charset=US-ASCII On my widows XP computer, W>From my windows XP system running R 2.6.1: > version_ platform i386-pc-mingw32 arch i386 os mingw32 system i386, mingw32 status major 2 minor 6.1 year 2007 month 11 day 26 svn rev 43537 language R version.string R version 2.6.1 (2007-11-26) John Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing)>>> Ted Harding <Ted.Harding at manchester.ac.uk> 2/10/2008 1:39 PM >>>On 10-Feb-08 18:07:56, Erin Hodgess wrote:> Hi R People: > > I sure that this is a really easy question, but here goes: > > I'm trying to build a package that will run on both Linux and Windows. > > However, there are several commands in a section that will be > different in Linux than they are in Windows. > > Would I be better off just to build two separate packages, please? > If just one is needed, how could I determine which system is running > in order to use the correct command, please? > > Thanks in advance, > ErinThere is the "version" (a list) variable: version # platform i486-pc-linux-gnu # arch i486 # os linux-gnu # system i486, linux-gnu # status Patched # major 2 # minor 4.0 # year 2006 # month 11 # day 25 # svn rev 39997 # language R from which you can extract the "os" component: version$os # [1] "linux-gnu" I don;t know what this says on a Windows system, but it surely won't mention Linux! So testing this wil enable you to set a flag, e.g. Linux<-ifelse(length(grep("linux",version$os))>0, TRUE, FALSE) if(Linux){window<-function(...) X11(...)} else {window<-function(...) windows(...)} Hoping this helps, Ted. -------------------------------------------------------------------- E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk> Fax-to-email: +44 (0)870 094 0861 Date: 10-Feb-08 Time: 18:39:29 ------------------------------ XFMail ------------------------------ ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Confidentiality Statement: This email message, including any attachments, is for th...{{dropped:6}} ------------------------------ Message: 23 Date: Sun, 10 Feb 2008 13:56:58 -0500 From: "Gabor Grothendieck" <ggrothendieck at gmail.com> Subject: Re: [R] building packages for Linux vs. 
Windows
To: "Erin Hodgess" <erinm.hodgess at gmail.com>
Cc: r-help at r-project.org
Message-ID: <971536df0802101056x666d879ap23f0e739d4d661a8 at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

On Feb 10, 2008 1:20 PM, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
> On 10/02/2008 1:07 PM, Erin Hodgess wrote:
> > Hi R People:
> >
> > I sure that this is a really easy question, but here goes:
> >
> > I'm trying to build a package that will run on both Linux and Windows.
> >
> > However, there are several commands in a section that will be
> > different in Linux than they are in Windows.
> >
> > Would I be better off just to build two separate packages, please?
> > If just one is needed, how could I determine which system is running
> > in order to use the correct command, please?
>
> You will find it much easier to build just one package.
>
> You can use .Platform or (for more detail) Sys.info() to find out what
> kind of system you're running on. Remember that R doesn't just run on
> Linux and Windows: there's also MacOSX, and other Unix and Unix-like
> systems (Solaris, etc.).

Just to be a bit more definite try this:

if (.Platform$OS.type == "windows") cat("I am on Windows\n") else cat("I am not on Windows\n")

------------------------------

Message: 24
Date: Sun, 10 Feb 2008 14:03:23 -0500 (EST)
From: John Kane <jrkrideau at yahoo.ca>
Subject: Re: [R] Vector Size
To: Oscar A <oscaroseroruiz at yahoo.com>, r-help at r-project.org
Message-ID: <776237.75246.qm at web32803.mail.mud.yahoo.com>
Content-Type: text/plain; charset=iso-8859-1

You just have too large a vector for your memory. There is not much you can do with an object of 500 MB. You have over 137 million values. What are you trying to do with this vector?

--- Oscar A <oscaroseroruiz at yahoo.com> wrote:
>
> Hello everybody!!
> I'm from Colombia (South America) and I'm new on R.
> I've been trying to generate all of the possible combinations for a 6
> number combination with numbers that ranges from 1 to 53.
>
> I've used the following commands:
>
> datos<-c(1:53)
> M<-matrix(data=(combn(datos,6,FUN=NULL,simplify=TRUE)),nrow=22957480,ncol=6,byrow=TRUE)
>
> Once the commands are executed, the program shows the following:
>
> Error: CANNOT ALLOCATE A VECTOR OF SIZE 525.5 Mb
>
> How can I fix this problem?
> --
> View this message in context: http://www.nabble.com/Vector-Size-tp15366901p15366901.html
> Sent from the R help mailing list archive at Nabble.com.
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained,
> reproducible code.
>
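A quick back-of-the-envelope check, added here for illustration, shows where the 525.5 Mb in the error message comes from: combn(1:53, 6) has to hold choose(53, 6) combinations of 6 integers, at 4 bytes per integer.

choose(53, 6)                    # 22957480 combinations
choose(53, 6) * 6 * 4 / 2^20     # ~525.5 Mb just for the combn() result
                                 # (the transposed copy in M needs roughly the same again)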
------------------------------

Message: 25
Date: Sun, 10 Feb 2008 14:14:48 -0500
From: Michael Kubovy <kubovy at virginia.edu>
Subject: [R] grep etc.
To: r-help at stat.math.ethz.ch
Message-ID: <A7B5B7A6-B154-4414-9C6D-53D86CC761D9 at virginia.edu>
Content-Type: text/plain; charset=US-ASCII; format=flowed

Dear R-helpers,

How do I transform
v <- c('insd-otsd', 'sppr-unsp')
into
c('insd--otsd', 'sppr--unsp')
?
_____________________________
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS: P.O.Box 400400 Charlottesville, VA 22904-4400
Parcels: Room 102 Gilmer Hall
McCormick Road Charlottesville, VA 22903
Office: B011 +1-434-982-4729
Lab: B019 +1-434-982-4751
Fax: +1-434-982-4766
WWW: http://www.people.virginia.edu/~mk9y/

------------------------------

Message: 26
Date: Sun, 10 Feb 2008 20:24:45 +0100
From: Gabor Csardi <csardi at rmki.kfki.hu>
Subject: Re: [R] grep etc.
To: Michael Kubovy <kubovy at virginia.edu>
Cc: r-help at stat.math.ethz.ch
Message-ID: <20080210192445.GA25143 at localdomain>
Content-Type: text/plain; charset=us-ascii

sub("-", "--", v, fixed=TRUE)

See ?sub.

Gabor

On Sun, Feb 10, 2008 at 02:14:48PM -0500, Michael Kubovy wrote:
> Dear R-helpers,
>
> How do I transform
> v <- c('insd-otsd', 'sppr-unsp')
> into
> c('insd--otsd', 'sppr--unsp')
> ?
> _____________________________
> Professor Michael Kubovy
> University of Virginia
> Department of Psychology
> USPS: P.O.Box 400400 Charlottesville, VA 22904-4400
> Parcels: Room 102 Gilmer Hall
> McCormick Road Charlottesville, VA 22903
> Office: B011 +1-434-982-4729
> Lab: B019 +1-434-982-4751
> Fax: +1-434-982-4766
> WWW: http://www.people.virginia.edu/~mk9y/
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Csardi Gabor <csardi at rmki.kfki.hu>    UNIL DGM
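Worth a small addition here: sub() replaces only the first match in each string, which is all Michael's example needs. If a string could contain more than one hyphen, gsub() is the global equivalent:

gsub("-", "--", c("insd-otsd", "a-b-c"), fixed = TRUE)
# [1] "insd--otsd" "a--b--c"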
------------------------------

Message: 27
Date: Sun, 10 Feb 2008 11:38:13 -0800 (PST)
From: joseph <jdsandjd at yahoo.com>
Subject: [R] data frame question
To: r-help at r-project.org
Cc: r-help at r-project.org
Message-ID: <109232.80965.qm at web36905.mail.mud.yahoo.com>
Content-Type: text/plain

Hello

I have 2 data frames df1 and df2. I would like to create a new data frame new_df which will contain only the common rows based on the first 2 columns (chrN and start). The column score in the new data frame should be replaced with a column containing the average score (average_score) from df1 and df2.

df1= data.frame(chrN= c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2", "chr2"),
                start= c(23, 82, 95, 108, 95, 108, 121),
                end= c(33, 92, 105, 118, 105, 118, 131),
                score= c(3, 6, 2, 4, 9, 2, 7))

df2= data.frame(chrN= c("chr1", "chr2", "chr2", "chr2", "chr2"),
                start= c(23, 50, 95, 20, 121),
                end= c(33, 60, 105, 30, 131),
                score= c(9, 3, 7, 7, 3))

new_df= data.frame(chrN= c("chr1", "chr2", "chr2"),
                   start= c(23, 95, 121),
                   end= c(33, 105, 131),
                   average_score= c(6, 8, 5))

Thank you for your help

Joseph

	[[alternative HTML version deleted]]

------------------------------

Message: 29
Date: Sun, 10 Feb 2008 13:49:33 -0600
From: "Erin Hodgess" <erinm.hodgess at gmail.com>
Subject: [R] [OT] good reference for mixed models and EM algorithm
To: r-help at r-project.org
Message-ID: <7acc7a990802101149t43eca289ob7335a2a905421a6 at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Dear R People:

Sorry for the off-topic. Could someone recommend a good reference for using the EM algorithm on mixed models, please?

I've been looking and there are so many of them. Perhaps someone here can narrow things down a bit.

Thanks in advance,
Sincerely,
Erin
--
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodgess at gmail.com

------------------------------

Message: 30
Date: Sun, 10 Feb 2008 12:32:48 -0800
From: Spencer Graves <spencer.graves at pdf.com>
Subject: Re: [R] [OT] good reference for mixed models and EM algorithm
To: Erin Hodgess <erinm.hodgess at gmail.com>
Cc: r-help at r-project.org
Message-ID: <47AF5F70.7070901 at pdf.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Hi, Erin:

Have you looked at Pinheiro and Bates (2000) Mixed-Effects Models in S and S-Plus (Springer)? As far as I know, Doug Bates has been the leading innovator in this area for the past 20 years. Pinheiro was one of his graduate students. The 'nlme' package was developed by him or under his supervision, and 'lme4' is his current development platform. The "~R\library\scripts" subdirectory contains "ch01.R", "ch02.R", etc. = script files to work the examples in the book (where "~R" = your R installation directory).

There are other good books, but I recommend you start with Pinheiro and Bates.

Spencer Graves

Erin Hodgess wrote:
> Dear R People:
>
> Sorry for the off-topic. Could someone recommend a good reference for
> using the EM algorithm on mixed models, please?
>
> I've been looking and there are so many of them. Perhaps someone here
> can narrow things down a bit.
>
> Thanks in advance,
> Sincerely,
> Erin
>

------------------------------

Message: 31
Date: Sun, 10 Feb 2008 20:44:17 +0000
From: "Mark Wardle" <mark at wardle.org>
Subject: Re: [R] data frame question
To: joseph <jdsandjd at yahoo.com>
Cc: r-help at r-project.org
Message-ID: <b59a37130802101244k25262e48gd5968d6a77483537 at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

On 10/02/2008, joseph <jdsandjd at yahoo.com> wrote:
> Hello
> I have 2 data frames df1 and df2. I would like to create a
> new data frame new_df which will contain only the common rows based on the first 2
> columns (chrN and start).
The column score in the new data frame > should > be replaced with a column containing the average score (average_score)from df1> and df2.Try this: (avoiding underscores) new.df <- merge(df1, df2, by=c('chrN','start')) new.df$average.score <- apply(df3[,c('score.x','score.y')], 1, mean, na.rm=T) As always, interested to see whether it can be done in one line... -- Dr. Mark Wardle Specialist registrar, Neurology Cardiff, UK ------------------------------ Message: 32 Date: Sun, 10 Feb 2008 20:52:03 +0000 (UTC) From: David Winsemius <dwinsemius at comcast.net> Subject: Re: [R] data frame question To: r-help at stat.math.ethz.ch Message-ID: <Xns9A40A1148E360dNOTwinscomcast at 80.91.229.13> joseph <jdsandjd at yahoo.com> wrote in news:109232.80965.qm at web36905.mail.mud.yahoo.com:> I have 2 data frames df1 and df2. I would like to create a > new data frame new_df which will contain only the common rows based > on the first 2 columns (chrN and start). The column score in the new > data frame should > be replaced with a column containing the average score > (average_score) from df1 and df2. >> df1= data.frame(chrN= c("chr1", "chr1", "chr1", "chr1", "chr2", > "chr2", "chr2"), > start= c(23, 82, 95, 108, 95, 108, 121), > end= c(33, 92, 105, 118, 105, 118, 131), > score= c(3, 6, 2, 4, 9, 2, 7)) > > df2= data.frame(chrN= c("chr1", "chr2", "chr2", "chr2" , "chr2"), > start= c(23, 50, 95, 20, 121), > end= c(33, 60, 105, 30, 131), > score= c(9, 3, 7, 7, 3))Clunky to be sure, but this should worked for me: df3 <- merge(df1,df2,by=c("chrN","start") #non-match variables get auto-relabeled df3$avg.scr <- with(df3, (score.x+score.y)/2) # or mean( ) df3 <- df3[,c("chrN","start","avg.scr")] #drops the variables not of interest df3 chrN start avg.scr 1 chr1 23 6 2 chr2 121 5 3 chr2 95 8 -- David Winsemius ------------------------------ Message: 33 Date: Sun, 10 Feb 2008 22:20:10 +0100 From: "Liviu Andronic" <landronimirc at gmail.com> Subject: Re: [R] prcomp vs. princomp vs fast.prcomp To: "Erin Hodgess" <erinm.hodgess at gmail.com> Cc: r-help at r-project.org Message-ID: <68b1e2610802101320n1dd12b9aqa40da24ff1a2d7c6 at mail.gmail.com> Content-Type: text/plain; charset=UTF-8 On 2/10/08, Erin Hodgess <erinm.hodgess at gmail.com> wrote:> When performing PCA, should I use prcomp, princomp or fast.prcomp,please? You can take a look here [1] and here [2] for some short references.>From the first page: "Principal Components Analysis (PCA) is availablein prcomp() (preferred) and princomp() in standard package stats." There are also - at least - FactoMineR, psych and ade4 that provide PCA funtions. I imagine that it would much depend on what you want to do. Liviu [1] http://cran.miscellaneousmirror.org/src/contrib/Views/Environmetrics.htm l [2] http://cran.r-project.org/src/contrib/Views/Psychometrics.html ------------------------------ Message: 34 Date: Sun, 10 Feb 2008 19:40:53 -0200 From: "Henrique Dallazuanna" <wwwhsd at gmail.com> Subject: Re: [R] Applying lm to data with combn To: AliR <aaliraja at gmail.com> Cc: r-help at r-project.org Message-ID: <da79af330802101340h2b53b0cfwa04c666ac7eb53d0 at mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 I think that what you want do is stepwise, see step function On 09/02/2008, AliR <aaliraja at gmail.com> wrote:> > Thank you, can you suggest wht is the shortest way to store thecombination> with min residual error term? 
> > > > AliR wrote: > > > > http://www.nabble.com/file/p15359204/test.data.csv > > http://www.nabble.com/file/p15359204/test.data.csv test.data.csv > > > > Hi, > > > > I have used apply to have certian combinations, but when I try touse> > these combinations I get the error > > [Error in eval(expr, envir, enclos) : object "X.GDAXI" not found].being a> > novice I donot understand that after applying combination to thedata I> > cant access it and use lm on these combinations. The data frameeither is> > no longer a matrix, how can I access the data and make it work forlm!!> > > > Any help please! > > > > > > > > > > > > > > fruit = read.csv(file="test.data.csv",head= TRUE, sep=",")# read itin> > matrix format > > > > #fruit =read.file(row.names=1)$data > > > > mD =head(fruit[, 1:5])# only first five used in combinations > > #X.SSMII = head(fruit[, 6])# Keep it for referebce > > nmax = NULL > > n = ncol(mD)# dont take the last column for reference purpose > > if(is.null(nmax)) nmax = n > > > > mDD = apply(combn(5, 1),1, FUN= function(y) mD[, y])# to > > > > > > > > fg = lm( X.SSMII ~ X.GDAXI + X.FTSE + X.FCHI + X.IBEX, data = mDD)#> > regress on combos > > > > s = cbind(s, Residuals = residuals(fg))# take residuals > > > > print(mD) > > > > > > -- > View this message in context:http://www.nabble.com/Applying-lm-to-data-with-combn-tp15359204p15391159 .html> Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html> and provide commented, minimal, self-contained, reproducible code. >-- Henrique Dallazuanna Curitiba-Paran?-Brasil 25? 25' 40" S 49? 16' 22" O ------------------------------ Message: 35 Date: Sun, 10 Feb 2008 22:58:01 +0100 From: "juli pausas" <pausas at gmail.com> Subject: [R] reshape To: R-help <r-help at stat.math.ethz.ch> Message-ID: <a17009720802101358s62b54220yf3e00878b3fa8537 at mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 Dear colleagues, I'd like to reshape a datafame in a long format to a wide format, but I do not quite get what I want. Here is an example of the data I've have (dat): sp <- c("a", "a", "a", "a", "b", "b", "b", "c", "d", "d", "d", "d") tr <- c("A", "B", "B", "C", "A", "B", "C", "A", "A", "B", "C", "C") code <- c("a1", "a2", "a2", "a3", "a3", "a3", "a4", "a4", "a4", "a5", "a5", "a6") dat <- data.frame(id=1:12, sp=sp, tr=tr, val=31:42, code=code) and below is what I'd like to obtain. That is, I'd like the tr variable in different columns (as a timevar) with their value (val). sp code tr.A tr.B tr.C a a1 31 NA NA a a2 NA 32 NA a a2 NA 33 NA ** a a3 NA NA 34 b a3 35 36 NA b a4 NA NA 37 c a4 38 NA NA d a4 39 NA NA d a5 NA 40 41 d a6 NA NA 42 Using reshape: reshape(dat[,2:5], direction="wide", timevar="tr", idvar=c("code","sp" )) I'm getting very close. The only difference is in the 3rd row (**), that is when sp and code are the same I only get one record. Is there a way to get all records? Any idea? 
Thank you very much for any help Juli Pausas -- http://www.ceam.es/pausas ------------------------------ Message: 36 Date: Sun, 10 Feb 2008 20:20:05 -0200 From: "Henrique Dallazuanna" <wwwhsd at gmail.com> Subject: Re: [R] reshape To: "juli pausas" <pausas at gmail.com> Cc: R-help <r-help at stat.math.ethz.ch> Message-ID: <da79af330802101420h47bdf79fg69b4f57fe223d64b at mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 reshape(dat, direction="wide", timevar="tr", idvar=c("id", "code","sp" ))[,2:6] But, I don't understand why you use reshape On 10/02/2008, juli pausas <pausas at gmail.com> wrote:> Dear colleagues, > I'd like to reshape a datafame in a long format to a wide format, but > I do not quite get what I want. Here is an example of the data I've > have (dat): > > sp <- c("a", "a", "a", "a", "b", "b", "b", "c", "d", "d", "d", "d") > tr <- c("A", "B", "B", "C", "A", "B", "C", "A", "A", "B", "C", "C") > code <- c("a1", "a2", "a2", "a3", "a3", "a3", "a4", "a4", "a4", "a5", > "a5", "a6") > dat <- data.frame(id=1:12, sp=sp, tr=tr, val=31:42, code=code) > > and below is what I'd like to obtain. That is, I'd like the tr > variable in different columns (as a timevar) with their value (val). > > sp code tr.A tr.B tr.C > a a1 31 NA NA > a a2 NA 32 NA > a a2 NA 33 NA ** > a a3 NA NA 34 > b a3 35 36 NA > b a4 NA NA 37 > c a4 38 NA NA > d a4 39 NA NA > d a5 NA 40 41 > d a6 NA NA 42 > > Using reshape: > > reshape(dat[,2:5], direction="wide", timevar="tr", idvar=c("code","sp"))> > I'm getting very close. The only difference is in the 3rd row (**), > that is when sp and code are the same I only get one record. Is there > a way to get all records? Any idea? > > Thank you very much for any help > > Juli Pausas > > -- > http://www.ceam.es/pausas > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html> and provide commented, minimal, self-contained, reproducible code. >-- Henrique Dallazuanna Curitiba-Paran?-Brasil 25? 25' 40" S 49? 16' 22" O ------------------------------ Message: 37 Date: Sun, 10 Feb 2008 18:31:48 -0500 From: "Gabor Grothendieck" <ggrothendieck at gmail.com> Subject: Re: [R] reshape To: "juli pausas" <pausas at gmail.com> Cc: R-help <r-help at stat.math.ethz.ch> Message-ID: <971536df0802101531q5f297f58y302c4b3a1bed4575 at mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 This isn't really well defined. Suppose we have two rows that both have a, a2 and a value for B. Now suppose we have another row with a,a2 but with a value for C. Does the third row go with the first one? the second one? a new row? both the first and the second? Here is one possibility but without a good definition of the problem we don't know whether its answering the problem that is intended. In the code below we assume that all dat rows that have the same sp value and the same code value are adjacent and if a tr occurs among those dat rows that is equal to or less than the prior row in factor level order then the new dat row must start a new output row else not. Thus within an sp/code group we assign each row a 1 until we get a tr that is less than the prior row's tr and then we start assigning 2 and so on. This is the new column seq below. We then use seq as part of our id.var in reshape. For the particular example in your post this does give the same answer. 
f <- function(x) cumsum(c(1, diff(x) <= 0)) dat$seq <- ave(as.numeric(dat$tr), dat$sp, dat$code, FUN = f) reshape(dat[-1], direction="wide", timevar="tr", idvar=c("code","sp","seq" ))[-3] On Feb 10, 2008 4:58 PM, juli pausas <pausas at gmail.com> wrote:> Dear colleagues, > I'd like to reshape a datafame in a long format to a wide format, but > I do not quite get what I want. Here is an example of the data I've > have (dat): > > sp <- c("a", "a", "a", "a", "b", "b", "b", "c", "d", "d", "d", "d") > tr <- c("A", "B", "B", "C", "A", "B", "C", "A", "A", "B", "C", "C") > code <- c("a1", "a2", "a2", "a3", "a3", "a3", "a4", "a4", "a4", "a5", > "a5", "a6") > dat <- data.frame(id=1:12, sp=sp, tr=tr, val=31:42, code=code) > > and below is what I'd like to obtain. That is, I'd like the tr > variable in different columns (as a timevar) with their value (val). > > sp code tr.A tr.B tr.C > a a1 31 NA NA > a a2 NA 32 NA > a a2 NA 33 NA ** > a a3 NA NA 34 > b a3 35 36 NA > b a4 NA NA 37 > c a4 38 NA NA > d a4 39 NA NA > d a5 NA 40 41 > d a6 NA NA 42 > > Using reshape: > > reshape(dat[,2:5], direction="wide", timevar="tr", idvar=c("code","sp"))> > I'm getting very close. The only difference is in the 3rd row (**), > that is when sp and code are the same I only get one record. Is there > a way to get all records? Any idea? > > Thank you very much for any help > > Juli Pausas > > -- > http://www.ceam.es/pausas > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html> and provide commented, minimal, self-contained, reproducible code. >------------------------------ Message: 38 Date: Mon, 11 Feb 2008 09:55:56 +1000 From: <Bill.Venables at csiro.au> Subject: Re: [R] Do I need to use dropterm()?? To: <dani_wells210 at hotmail.com>, <r-help at r-project.org> Message-ID: <B998A44C8986644EA8029CFE6396A924F53908 at exqld2-bne.nexus.csiro.au> Content-Type: text/plain; charset="us-ascii" dropterm() is a tool for model building, not primarily for significance testing. As the name suggests, it tells you what the effect would be were you to "drop" each *accessible* "term" in the model as it currently stands. By default it displays the effect on AIC of dropping each term, in turn, from the model. If you request them, though, it can also give you test statistics and significance probabilities. If there is an "A:B" interaction in the model, the main effects "A" or "B", if present, are not considered until a decision has been made on including "A:B". The meaning of "A:B" in a model is not absolute: it is conditional on which main effect terms you have there as well. This is one reason why the process is ordered in this way, but the main reason is the so-called 'marginality' issue. If you do ask for test statistics and significance probabilities, you get almost a SAS-style "Type III" anova table, with the important restriction noted above: you will not get main effect terms shown along with interactions. If you want the full SAS, uh, version, there are at least two possibilities. 1. Use SAS. 2. Use John Fox's Anova() function from the 'car' package, along with his excellent book, which should explain how to avoid shooting yourself in the foot over this. (This difference of opinion on what should sensibly be done, by the way, predates R by a long shot. My first exposure to it were with the very acrimonious disputes between Nelder and Kempthorne in the mid 70's. 
It has remained a cross-Atlantic dispute pretty well ever since, with the latest shot being the paper by Lee and Nelder in 2004. Curiously, the origin of the software can almost be determined by the view taken on this issue, with Genstat going one way and SAS, SPSS, ... the other. S-PLUS was a late comer...but I digress!) Bill Venables. Bill Venables CSIRO Laboratories PO Box 120, Cleveland, 4163 AUSTRALIA Office Phone (email preferred): +61 7 3826 7251 Fax (if absolutely necessary): +61 7 3826 7304 Mobile: +61 4 8819 4402 Home Phone: +61 7 3286 7700 mailto:Bill.Venables at csiro.au http://www.cmis.csiro.au/bill.venables/ -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of DaniWells Sent: Sunday, 10 February 2008 11:40 PM To: r-help at r-project.org Subject: [R] Do I need to use dropterm()?? Hello, I'm having some difficulty understanding the useage of the "dropterm()" function in the MASS library. What exactly does it do? I'm very new to R, so any pointers would be very helpful. I've read many definitions of what dropterm() does, but none seem to stick in my mind or click with me. I've coded everything fine for an interaction that runs as follows: two sets of data (one for North aspect, one for Southern Aspect) and have a logscale on the x axis, with survival on the y. After calculating my anova results i have all significant results (ie aspect = sig, logscale of sunlight sig, and aspect:llight = sig). When i have all significant results in my ANOVA table, do i need dropterm(), or is that just to remove insignificant terms? Many thanks, Dani -- View this message in context: http://www.nabble.com/Do-I-need-to-use-dropterm%28%29---tp15396151p15396 151.html Sent from the R help mailing list archive at Nabble.com. ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. ------------------------------ Message: 39 Date: Sun, 10 Feb 2008 19:37:40 -0500 From: Robert Biddle <robert_biddle at carleton.ca> Subject: [R] j and jcross queries To: r-help at r-project.org Message-ID: <47AF98D4.7010907 at carleton.ca> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Hi: I have a query related to the J and Jcross functions in the SpatStat package. I use J to finding indications of clustering in my data, and Jcross to look for dependence between point patterns. I use the envelope function to do Monte Carlo tests to look for significance. So far so good. My question is how I can test to see if tests are significantly different. For example, if find J of pattern X and J of pattern Y, how could I determine the liklihood that those results come from different processes? Similarly, if I find J of marks X and Y, and X and Z, how could I determine the liklihood that Y and Z come from different processes? I would appreciate advice. Cheers Robert Biddle ------------------------------ Message: 40 Date: Sun, 10 Feb 2008 23:14:28 -0200 From: Andre Nathan <andre at digirati.com.br> Subject: [R] Questions about histograms To: r-help at r-project.org Message-ID: <1202692468.5515.23.camel at homesick> Content-Type: text/plain Hello I'm doing some experiments with the various histogram functions and I have a two questions about the "prob" option and binning. 
First, here's a simple plot of my data using the default hist() function:> hist(data[,1], prob = TRUE, xlim = c(0, 35))http://go.sneakymustard.com/tmp/hist.jpg My first question is regarding the resulting plot from hist.scott() and hist.FD(), from the MASS package. I'm setting prob to TRUE in these functions, but as it can be seen in the images below, the value for the first bar of the histogram is well above 1.0. Shouldn't the total area be 1.0 in the case of prob = TRUE?> hist.scott(data[,1], prob = TRUE, xlim=c(0, 35))http://go.sneakymustard.com/tmp/scott.jpg> hist.FD(data[,1], prob = TRUE, xlim=c(0, 35))http://go.sneakymustard.com/tmp/FD.jpg Is there anything I can do to "fix" these plots? My second question is related to binning. Is there a function or package that allows one to use logarithmic binning in R, that is, create bins such that the length of a bin is a multiple of the length of the one before it? Pointers to the appropriate docs are welcome, I've been searching for this and couldn't find any info. Best regards, Andre ------------------------------ Message: 41 Date: Sun, 10 Feb 2008 20:36:01 -0500 From: Duncan Murdoch <murdoch at stats.uwo.ca> Subject: Re: [R] Questions about histograms To: Andre Nathan <andre at digirati.com.br> Cc: r-help at r-project.org Message-ID: <47AFA681.70905 at stats.uwo.ca> Content-Type: text/plain; charset=ISO-8859-1; format=flowed On 10/02/2008 8:14 PM, Andre Nathan wrote:> Hello > > I'm doing some experiments with the various histogram functions and I > have a two questions about the "prob" option and binning. > > First, here's a simple plot of my data using the default hist() > function: > >> hist(data[,1], prob = TRUE, xlim = c(0, 35)) > > http://go.sneakymustard.com/tmp/hist.jpg > > My first question is regarding the resulting plot from hist.scott()and> hist.FD(), from the MASS package. I'm setting prob to TRUE in these > functions, but as it can be seen in the images below, the value forthe> first bar of the histogram is well above 1.0. Shouldn't the total area > be 1.0 in the case of prob = TRUE? > >> hist.scott(data[,1], prob = TRUE, xlim=c(0, 35))It looks to me as though the area is one. The first bar is about 3.6 units high, and about 0.2 units wide: area is 0.72. There are no gaps between bars in an R histogram, so the gaps you see in this jpg are bars with zero height. Duncan Murdoch> > http://go.sneakymustard.com/tmp/scott.jpg > >> hist.FD(data[,1], prob = TRUE, xlim=c(0, 35)) > > http://go.sneakymustard.com/tmp/FD.jpg > > Is there anything I can do to "fix" these plots? > > My second question is related to binning. Is there a function orpackage> that allows one to use logarithmic binning in R, that is, create bins > such that the length of a bin is a multiple of the length of the one > before it? > > Pointers to the appropriate docs are welcome, I've been searching for > this and couldn't find any info. 
> > Best regards, > Andre > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html> and provide commented, minimal, self-contained, reproducible code.------------------------------ Message: 42 Date: Mon, 11 Feb 2008 11:38:41 +1000 From: <Bill.Venables at csiro.au> Subject: Re: [R] Questions about histograms To: <andre at digirati.com.br>, <r-help at r-project.org> Message-ID: <B998A44C8986644EA8029CFE6396A924F5391E at exqld2-bne.nexus.csiro.au> Content-Type: text/plain; charset="us-ascii" Andre, Regarding your first question, it is by no means clear there is anything to fix, in fact I'm sure there is nothing to fix. The fact that the height of any bar is greater than one is irrelevant - the width of the bar is much less than one, as is the product of height by width. Area is height x width, not just height.... Regarding the second question - logarithmic breaks. I'm not aware of anything currently available to do this, but the tools are there for you to do it yourself. The 'breaks' argument to hist allows you to specify your breaks explicitly (among other things) so it's just a matter of setting up the logarithmic (or, more precisely, 'geometric progression') bins yourself and relaying them on to hist. Bill Venables CSIRO Laboratories PO Box 120, Cleveland, 4163 AUSTRALIA Office Phone (email preferred): +61 7 3826 7251 Fax (if absolutely necessary): +61 7 3826 7304 Mobile: +61 4 8819 4402 Home Phone: +61 7 3286 7700 mailto:Bill.Venables at csiro.au http://www.cmis.csiro.au/bill.venables/ -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Andre Nathan Sent: Monday, 11 February 2008 11:14 AM To: r-help at r-project.org Subject: [R] Questions about histograms Hello I'm doing some experiments with the various histogram functions and I have a two questions about the "prob" option and binning. First, here's a simple plot of my data using the default hist() function:> hist(data[,1], prob = TRUE, xlim = c(0, 35))http://go.sneakymustard.com/tmp/hist.jpg My first question is regarding the resulting plot from hist.scott() and hist.FD(), from the MASS package. I'm setting prob to TRUE in these functions, but as it can be seen in the images below, the value for the first bar of the histogram is well above 1.0. Shouldn't the total area be 1.0 in the case of prob = TRUE?> hist.scott(data[,1], prob = TRUE, xlim=c(0, 35))http://go.sneakymustard.com/tmp/scott.jpg> hist.FD(data[,1], prob = TRUE, xlim=c(0, 35))http://go.sneakymustard.com/tmp/FD.jpg Is there anything I can do to "fix" these plots? My second question is related to binning. Is there a function or package that allows one to use logarithmic binning in R, that is, create bins such that the length of a bin is a multiple of the length of the one before it? Pointers to the appropriate docs are welcome, I've been searching for this and couldn't find any info. Best regards, Andre ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
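As a concrete sketch of the geometric-progression breaks Bill describes (the exponential sample and the ratio 1.5 below are placeholders, since the original data[,1] is not available):

set.seed(1)
x <- rexp(1000, rate = 0.3)        # stand-in for data[,1]; illustrative only

r <- 1.5                           # each bin is r times as wide as the one before it
k <- ceiling(log(max(x) / min(x)) / log(r))
brk <- min(x) * r^(0:k)            # geometric-progression breaks covering range(x)

h <- hist(x, breaks = brk, prob = TRUE)

sum(h$density * diff(h$breaks))    # total area under the bars; should be 1

The last line also bears on the first question: individual bars can be taller than 1, but height times width still sums to 1.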
------------------------------ Message: 43 Date: Mon, 11 Feb 2008 00:14:29 -0200 From: Andre Nathan <andre at digirati.com.br> Subject: Re: [R] Questions about histograms To: r-help at r-project.org Message-ID: <1202696069.5515.26.camel at homesick> Content-Type: text/plain Thanks Bill and Duncan for your replies, I understand it now. Somehow I didn't notice the width of the bars was < 1.0. Andre On Mon, 2008-02-11 at 11:38 +1000, Bill.Venables at csiro.au wrote:> Andre, > > Regarding your first question, it is by no means clear there isanything> to fix, in fact I'm sure there is nothing to fix. The fact that the > height of any bar is greater than one is irrelevant - the width of the > bar is much less than one, as is the product of height by width. Area > is height x width, not just height.... > > Regarding the second question - logarithmic breaks. I'm not aware of > anything currently available to do this, but the tools are there foryou> to do it yourself. The 'breaks' argument to hist allows you tospecify> your breaks explicitly (among other things) so it's just a matter of > setting up the logarithmic (or, more precisely, 'geometricprogression')> bins yourself and relaying them on to hist. > > > > > Bill Venables > CSIRO Laboratories > PO Box 120, Cleveland, 4163 > AUSTRALIA > Office Phone (email preferred): +61 7 3826 7251 > Fax (if absolutely necessary): +61 7 3826 7304 > Mobile: +61 4 8819 4402 > Home Phone: +61 7 3286 7700 > mailto:Bill.Venables at csiro.au > http://www.cmis.csiro.au/bill.venables/ > > -----Original Message----- > From: r-help-bounces at r-project.org[mailto:r-help-bounces at r-project.org]> On Behalf Of Andre Nathan > Sent: Monday, 11 February 2008 11:14 AM > To: r-help at r-project.org > Subject: [R] Questions about histograms > > Hello > > I'm doing some experiments with the various histogram functions and I > have a two questions about the "prob" option and binning. > > First, here's a simple plot of my data using the default hist() > function: > > > hist(data[,1], prob = TRUE, xlim = c(0, 35)) > > http://go.sneakymustard.com/tmp/hist.jpg > > My first question is regarding the resulting plot from hist.scott()and> hist.FD(), from the MASS package. I'm setting prob to TRUE in these > functions, but as it can be seen in the images below, the value forthe> first bar of the histogram is well above 1.0. Shouldn't the total area > be 1.0 in the case of prob = TRUE? > > > hist.scott(data[,1], prob = TRUE, xlim=c(0, 35)) > > http://go.sneakymustard.com/tmp/scott.jpg > > > hist.FD(data[,1], prob = TRUE, xlim=c(0, 35)) > > http://go.sneakymustard.com/tmp/FD.jpg > > Is there anything I can do to "fix" these plots? > > My second question is related to binning. Is there a function orpackage> that allows one to use logarithmic binning in R, that is, create bins > such that the length of a bin is a multiple of the length of the one > before it? > > Pointers to the appropriate docs are welcome, I've been searching for > this and couldn't find any info. > > Best regards, > Andre > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
> >------------------------------ Message: 44 Date: Sun, 10 Feb 2008 18:12:01 -0800 (PST) From: dss <dsstoffer at gmail.com> Subject: Re: [R] Re gression with time-dependent coefficients To: r-help at r-project.org Message-ID: <15404399.post at talk.nabble.com> Content-Type: text/plain; charset=UTF-8 There's an example in our text: http://www.stat.pitt.edu/stoffer/tsa2 Time Series Analysis and Its Applications ... it's Example 6.12, Stochastic Regression. The example is about bootstrapping state space models, but MLE is part of the example. The code for the example is on the page for the book... click on "R Code (Ch 6)" on the blue bar at the top. When you get to the Chapter 6 code page, scroll down to "Code to duplicate Example 6.12 [?6.7]". -- View this message in context: http://www.nabble.com/Regression-with-time-dependent-coefficients-tp1531 5302p15404399.html Sent from the R help mailing list archive at Nabble.com. ------------------------------ Message: 45 Date: Mon, 11 Feb 2008 16:40:30 +1300 From: "Arin Basu" <arin.basu at gmail.com> Subject: [R] Using R in a university course: dealing with proposal comments To: r-help at r-project.org Message-ID: <af62876a0802101940o3eb23129hf287f4b514c603d3 at mail.gmail.com> Content-Type: text/plain; charset=WINDOWS-1252 Hi All, I am scheduled to teach a graduate course on research methods in health sciences at a university. While drafting the course proposal, I decided to include a brief introduction to R, primarily with an objective to enable the students to do data analysis using R. It is expected that enrolled students of this course have all at least a formal first level introduction to quantitative methods in health sciences and following completion of the course, they are all expected to either evaluate, interpret, or conduct primary research studies in health. The course would be delivered over 5 months, and R was proposed to be taught as several laboratory based hands-on sessions along with required readings within the coursework. The course proposal went to a few colleagues in the university for review. I received review feedbacks from them; two of them commented about inclusion of R in the proposal. In quoting parts these mails, I have masked the names/identities of the referees, and have included just part of the relevant text with their comments. Here are the comments: Comment 1: "In my quick glance, I did not see that statistics would be taught, but I did see that R would be taught. Of course, R is a statistics programme. I worry that teaching R could overwhelm the class. Or teaching R would be worthless, because the students do not understand statistics. " (Prof LR) Comment 2: Finally, on a minor point, why is "R" the statistical software being used? SPSS is probably more widely available in the workplace ? certainly in areas of social policy etc. " (Prof NB) I am interested to know if any of you have faced similar questions from colleagues about inclusion of R in non-statistics based university graduate courses. If you did and were required to address these concerns, how you would respond? TIA, Arin Basu ------------------------------ Message: 46 Date: Mon, 11 Feb 2008 14:27:39 +1000 From: <Bill.Venables at csiro.au> Subject: Re: [R] Using R in a university course: dealing with proposal comments To: <arin.basu at gmail.com>, <r-help at r-project.org> Message-ID: <B998A44C8986644EA8029CFE6396A924F53943 at exqld2-bne.nexus.csiro.au> Content-Type: text/plain; charset="us-ascii" Comment 1 raises a real issue. R is just a tool. 
Too often people do confuse the tool with the real skill that the people who use it should have. There are plenty of questions on R-help that demonstrate this confusion. It's well worth keeping in mind and acting upon if you can see a problem emerging, but I would not take it quite at face value and abandon R on those grounds. Comment 2 is one of those comments that belongs to a very particular period of time, one that passes as we look on. It reminds me of the time I tried to introduce some new software into my courses, (back in the days when I was a teacher, long, long ago...). The students took to it like ducks to water, but my colleagues on the staff were very slow to [[elided trailing spam]] Bill Venables CSIRO Laboratories PO Box 120, Cleveland, 4163 AUSTRALIA Office Phone (email preferred): +61 7 3826 7251 Fax (if absolutely necessary): +61 7 3826 7304 Mobile: +61 4 8819 4402 Home Phone: +61 7 3286 7700 mailto:Bill.Venables at csiro.au http://www.cmis.csiro.au/bill.venables/ -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Arin Basu Sent: Monday, 11 February 2008 1:41 PM To: r-help at r-project.org Subject: [R] Using R in a university course: dealing with proposal comments Hi All, I am scheduled to teach a graduate course on research methods in health sciences at a university. While drafting the course proposal, I decided to include a brief introduction to R, primarily with an objective to enable the students to do data analysis using R. It is expected that enrolled students of this course have all at least a formal first level introduction to quantitative methods in health sciences and following completion of the course, they are all expected to either evaluate, interpret, or conduct primary research studies in health. The course would be delivered over 5 months, and R was proposed to be taught as several laboratory based hands-on sessions along with required readings within the coursework. The course proposal went to a few colleagues in the university for review. I received review feedbacks from them; two of them commented about inclusion of R in the proposal. In quoting parts these mails, I have masked the names/identities of the referees, and have included just part of the relevant text with their comments. Here are the comments: Comment 1: "In my quick glance, I did not see that statistics would be taught, but I did see that R would be taught. Of course, R is a statistics programme. I worry that teaching R could overwhelm the class. Or teaching R would be worthless, because the students do not understand statistics. " (Prof LR) Comment 2: Finally, on a minor point, why is "R" the statistical software being used? SPSS is probably more widely available in the workplace - certainly in areas of social policy etc. " (Prof NB) I am interested to know if any of you have faced similar questions from colleagues about inclusion of R in non-statistics based university graduate courses. If you did and were required to address these concerns, how you would respond? TIA, Arin Basu ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
------------------------------ Message: 47 Date: Sun, 10 Feb 2008 22:06:17 -0800 From: Spencer Graves <spencer.graves at pdf.com> Subject: Re: [R] Using R in a university course: dealing with proposal comments To: Bill.Venables at csiro.au Cc: r-help at r-project.org, arin.basu at gmail.com Message-ID: <47AFE5D9.4010600 at pdf.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed R is "just a tool", but so is English. R is the platform of choice for an increasing portion of people involved in new statistical algorithm development. R is not yet the de facto standard for nearly all serious research internationally, to the extent that English is. However, I believe that is only a matter of time. There will always be a place for software with a nicer graphical user interface, etc., than R. For an undergraduate course, it may be wise to stick with SPSS, SAS, Minitab, etc. Are you teaching graduate students to solve yesterday's problems or tomorrow's? Much of my work in 2007 was in Matlab, because I am working with colleagues who use only Matlab. Matlab has better debugging tools. However, R now has well over 1,000 contributed packages, and r-help and r-sig-x provide better support and extensibility than you will likely get from commercial software. Twice in the past year, an executive said I should get some Matlab toolbox. In the first case, after thinking about it for a few days, I finally requested and received official permission from a Vice President. From that point, it took roughly a week to get a quote from Mathsoft, then close to two weeks to get approval from our Chief Financial Officer, then a few more days to actually get the software. With R, that month long process is reduced to seconds: I download the package and try it. This has allowed me to do things today that I only dreamed of doing a few years ago. Moreover, R makes it much easier for me to learn new statistical techniques. When I'm not sure I understand the math, I can trace through a worked example in R, and the uncertainties almost always disappear. For that, 'debug(fun)' helps a lot. If I want to try something different, I don't have to start from scratch to develop code to perform an existing analysis. I now look for companion R code before I decide to buy a book or when I prioritize how much time I will spend with different books or articles: If something has companion R code, I know I can learn much quicker how to use, modify and extend the statistical tools discussed. Spencer Graves Bill.Venables at csiro.au wrote:> Comment 1 raises a real issue. R is just a tool. Too often people do > confuse the tool with the real skill that the people who use it should > have. There are plenty of questions on R-help that demonstrate this > confusion. It's well worth keeping in mind and acting upon if you can > see a problem emerging, but I would not take it quite at face valueand> abandon R on those grounds. > > Comment 2 is one of those comments that belongs to a very particular > period of time, one that passes as we look on. It reminds me of the > time I tried to introduce some new software into my courses, (back in > the days when I was a teacher, long, long ago...). 
The students tookto> it like ducks to water, but my colleagues on the staff were very slowto [[elided trailing spam]]> > > Bill Venables > CSIRO Laboratories > PO Box 120, Cleveland, 4163 > AUSTRALIA > Office Phone (email preferred): +61 7 3826 7251 > Fax (if absolutely necessary): +61 7 3826 7304 > Mobile: +61 4 8819 4402 > Home Phone: +61 7 3286 7700 > mailto:Bill.Venables at csiro.au > http://www.cmis.csiro.au/bill.venables/ > > -----Original Message----- > From: r-help-bounces at r-project.org[mailto:r-help-bounces at r-project.org]> On Behalf Of Arin Basu > Sent: Monday, 11 February 2008 1:41 PM > To: r-help at r-project.org > Subject: [R] Using R in a university course: dealing with proposal > comments > > Hi All, > > I am scheduled to teach a graduate course on research methods in > health sciences at a university. While drafting the course proposal, I > decided to include a brief introduction to R, primarily with an > objective to enable the students to do data analysis using R. It is > expected that enrolled students of this course have all at least a > formal first level introduction to quantitative methods in health > sciences and following completion of the course, they are all expected > to either evaluate, interpret, or conduct primary research studies in > health. The course would be delivered over 5 months, and R was > proposed to be taught as several laboratory based hands-on sessions > along with required readings within the coursework. > > The course proposal went to a few colleagues in the university for > review. I received review feedbacks from them; two of them commented > about inclusion of R in the proposal. > > In quoting parts these mails, I have masked the names/identities of > the referees, and have included just part of the relevant text with > their comments. Here are the comments: > > Comment 1: > > "In my quick glance, I did not see that statistics would be taught, > but I did see that R would be taught. Of course, R is a statistics > programme. I worry that teaching R could overwhelm the class. Or > teaching R would be worthless, because the students do not understand > statistics. " (Prof LR) > > Comment 2: > > Finally, on a minor point, why is "R" the statistical software being > used? SPSS is probably more widely available in the workplace - > certainly in areas of social policy etc. " (Prof NB) > > I am interested to know if any of you have faced similar questions > from colleagues about inclusion of R in non-statistics based > university graduate courses. If you did and were required to address > these concerns, how you would respond? > > TIA, > Arin Basu > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html> and provide commented, minimal, self-contained, reproducible code. >------------------------------ Message: 48 Date: Mon, 11 Feb 2008 15:48:21 +0800 From: "Suhaila Zainudin" <suhaila.zainudin at gmail.com> Subject: [R] Help with write.csv To: r-help at r-project.org Message-ID: <9254175c0802102348x311bcd5bgbc368fdf402d0355 at mail.gmail.com> Content-Type: text/plain Dear all, I am new to R. 
I am using the impute package with data contained in csv file. I have followed the example in the impute package as follows:> mydata = read.csv("sample_impute.csv", header = TRUE) > mydata.expr <- mydata[-1,-(1:2)] > mydata.imputed <- impute.knn(as.matrix(mydata.expr))The impute is succesful. Then I try to write the imputation results (mydata.imputed) to a csv file such as follows..> write.csv(mydata.imputed, file = "sample_imputed.csv")Error in data.frame(data = c(-0.07, -1.22, -0.09, -0.6, 0.65, -0.36, 0.25, : arguments imply differing number of rows: 18, 1, 0 I need help understanding the error message and overcoming the [[elided trailing spam]] [[alternative HTML version deleted]] ------------------------------ Message: 49 Date: Mon, 11 Feb 2008 09:49:11 +0200 From: "Amnon Melzer" <amnon.melzer at eighty20.co.za> Subject: [R] tree() producing NA's To: <r-help at R-project.org> Message-ID: <20080211074936.A6ECB3647CC at smtpauth.cybersmart.co.za> Content-Type: text/plain Hi Hoping someone can help me (a newbie). I am trying to construct a tree using tree() in package tree. One of the fields is a factor field (owner), with many levels. In the resulting tree, I see many NA's (see below), yet in the actual data there are none.> rr200.tr <- tree(backprof ~ ., rr200)> rr200.tr1) root 200 1826.00 -0.2332 ... [snip] ... 5) owner: Cliveden Stud,NA,NA,NA,NA,NA,NA,NA,NA 10 14.25 1.5870 * 3) owner: B E T Partnership,Flaming Sambuca Syndicate,NA,NA,NA,NA,NA,NA,NA,NA 11 384.40 10.5900 6) decodds < 12 5 74.80 6.3000 * 7) decodds > 12 6 140.80 14.1700 * Can anyone tell me why this happens and what I can do about it? Regards Amnon [[alternative HTML version deleted]] ------------------------------ Message: 50 Date: Mon, 11 Feb 2008 08:56:00 +0100 From: "Liviu Andronic" <landronimirc at gmail.com> Subject: Re: [R] Using R in a university course: dealing with proposal comments To: "Arin Basu" <arin.basu at gmail.com> Cc: r-help at r-project.org Message-ID: <68b1e2610802102356j6a09ac74hec634c60fc0fdee5 at mail.gmail.com> Content-Type: text/plain; charset=UTF-8 Hello Arin, If your future students do not know statistics, you might consider buffering their introduction to R with the help of a GUI package, such as Rcmdr (if functionality is missing, you could add it yourself via the plugin infrastructure). Another way to help students would be to direct them to easy to use and straight-forward resources, like this [1], this [2] or this [3]. On the "why not SPSS" point, I would imagine the answer is quality and price, and all the corollary arguments (say, you can use it at home or during the weekend, etc). No more than my two cents. Liviu [1] http://oit.utk.edu/scc/RforSAS&SPSSusers.pdf [2] http://www.statmethods.net/index.html [3] http://zoonek2.free.fr/UNIX/48_R/all.html On 2/11/08, Arin Basu <arin.basu at gmail.com> wrote:> Comment 1: > > "In my quick glance, I did not see that statistics would be taught, > but I did see that R would be taught. Of course, R is a statistics > programme. I worry that teaching R could overwhelm the class. Or > teaching R would be worthless, because the students do not understand > statistics. " (Prof LR) > > Comment 2: > > Finally, on a minor point, why is "R" the statistical software being > used? SPSS is probably more widely available in the workplace ? > certainly in areas of social policy etc. 
" (Prof NB)------------------------------ Message: 51 Date: Mon, 11 Feb 2008 08:12:28 +0000 From: Robin Hankin <r.hankin at noc.soton.ac.uk> Subject: Re: [R] learning S4 To: cgenolin at u-paris10.fr Cc: r-help at r-project.org Message-ID: <1A5ED531-1210-42EA-B153-2979FDE38E82 at noc.soton.ac.uk> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Christophe you might find the Brobdingnag package on CRAN helpful here. I wrote the package partly to teach myself S4; it includes a vignette that builds the various S4 components from scratch, in a step-by-step annotated "cookbook". HTH rksh On 8 Feb 2008, at 15:30, cgenolin at u-paris10.fr wrote:> Hi the list. > > I try to learn the S4 programming. I find the wiki and several doc. > But > I still have few questions... > > 1. To define 'representation', we can use two syntax : > - representation=list(temps = 'numeric',traj = 'matrix') > - representation(temps = 'numeric',traj = 'matrix') > Is there any difference ? > 2. 'validityMethod' check the intialisation of a new object, but not > the latter > modifications. Is it possible to set up a validation that check > every > modifications ? > 3. When we use setMethod('initialize',...) does the validityMethod > become un-used ? > 4. Is it possible to set up several initialization processes ? One > that build an objet from a data.frame, one from a matrix... > > Thanks > > Christophe > > ---------------------------------------------------------------- > Ce message a ete envoye par IMP, grace a l'Universite Paris 10 > Nanterre > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html> and provide commented, minimal, self-contained, reproducible code.-- Robin Hankin Uncertainty Analyst and Neutral Theorist, National Oceanography Centre, Southampton European Way, Southampton SO14 3ZH, UK tel 023-8059-7743 ------------------------------ Message: 52 Date: Mon, 11 Feb 2008 08:32:20 +0000 (GMT) From: Prof Brian Ripley <ripley at stats.ox.ac.uk> Subject: Re: [R] tree() producing NA's To: Amnon Melzer <amnon.melzer at eighty20.co.za> Cc: r-help at r-project.org Message-ID: <alpine.LFD.1.00.0802110829270.17998 at gannet.stats.ox.ac.uk> Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Take a look at the levels of 'owner'. On Mon, 11 Feb 2008, Amnon Melzer wrote:> Hi > > > > Hoping someone can help me (a newbie). > > > > I am trying to construct a tree using tree() in package tree. One ofthe> fields is a factor field (owner), with many levels. In the resultingtree, I> see many NA's (see below), yet in the actual data there are none.You are misinterpreting this: those are level names. Using a tree with a factor with many levels is a very bad idea: it takes a long time to compute (unless the response is binary) and almost surely overfits.> > >> rr200.tr <- tree(backprof ~ ., rr200) > >> rr200.tr > > 1) root 200 1826.00 -0.2332 > > ... > > [snip] > > ... 
> > 5) owner: Cliveden Stud,NA,NA,NA,NA,NA,NA,NA,NA 10 14.25 1.5870*> > 3) owner: B E T Partnership,Flaming Sambuca > Syndicate,NA,NA,NA,NA,NA,NA,NA,NA 11 384.40 10.5900 > > 6) decodds < 12 5 74.80 6.3000 * > > 7) decodds > 12 6 140.80 14.1700 * > > > > Can anyone tell me why this happens and what I can do about it?Well, you could follow the request at the footer of this and every R-help message.> > > Regards > > > > Amnon > > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html> and provide commented, minimal, self-contained, reproducible code. >-- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 ------------------------------ Message: 53 Date: Mon, 11 Feb 2008 09:31:19 +0200 From: "Amnon Melzer" <amnon.melzer at eighty20.co.za> Subject: [R] tree() producing NA's To: <r-help at R-project.org> Message-ID: <20080211073146.894BC364B3E at smtpauth.cybersmart.co.za> Content-Type: text/plain Hi Hoping someone can help me (a newbie). I am trying to construct a tree using tree() in package tree. One of the fields is a factor field (owner), with many levels. In the resulting tree, I see many NA's (see below), yet in the actual data there are none.> rr200.tr <- tree(backprof ~ ., rr200)> rr200.tr1) root 200 1826.00 -0.2332 ... [snip] ... 5) owner: Cliveden Stud,NA,NA,NA,NA,NA,NA,NA,NA 10 14.25 1.5870 * 3) owner: B E T Partnership,Flaming Sambuca Syndicate,NA,NA,NA,NA,NA,NA,NA,NA 11 384.40 10.5900 6) decodds < 12 5 74.80 6.3000 * 7) decodds > 12 6 140.80 14.1700 * Can anyone tell me why this happens and what I can do about it? Regards Amnon [[alternative HTML version deleted]] ------------------------------ Message: 54 Date: Mon, 11 Feb 2008 01:08:48 -0800 (PST) From: noorpiilur <noorpiilur at yahoo.com> Subject: [R] Dendrogram for agglomerative hierarchical clustering result To: r-help at r-project.org Message-ID: <3b26ade0-722e-49b6-ae6f-f61a9a23672a at e4g2000hsg.googlegroups.com> Content-Type: text/plain; charset=ISO-8859-1 Hey group, I have a problem of drawing dendrogram as the result of my program written in C. My algorithm is a approximation algorithm for single linkage method. AS a result I will get the following data: [Average distance] [cluster A] [cluster B] For example: 42.593141 1 26 42.593141 4 6 42.593141 123 124 42.593141 4 113 74.244206 1 123 74.244206 4 133 74.244206 1 36 So far I have used C to generate a bitmap output but I would like to use the computed result as an input for R to just draw the dendrogram. As I'm new to R any help is appreciated. Thanks, Risto ------------------------------ Message: 55 Date: Mon, 11 Feb 2008 09:14:52 +0000 From: Carla Rebelo <crebelo at liaad.up.pt> Subject: [R] The function predict To: r-help at R-project.org Message-ID: <47B0120C.1080204 at liaad.up.pt> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Good Morning! May you help me? I need to understand the function predict. I need to understand the algorithm implemented, the calculations associated. Where can I find this information? Thank You! 
------------------------------ Message: 56 Date: Mon, 11 Feb 2008 09:37:33 +0000 From: Richard.Cotton at hsl.gov.uk Subject: Re: [R] Help with write.csv To: "Suhaila Zainudin" <suhaila.zainudin at gmail.com> Cc: r-help at r-project.org, r-help-bounces at r-project.org Message-ID: <OF1C4897CE.9DE82508-ON802573EC.0033CF5D-802573EC.0034D4AC at hsl.gov.uk> Content-Type: text/plain; charset="US-ASCII"> I am new to R. I am using the impute package with data contained incsv> file. > I have followed the example in the impute package as follows: > > > mydata = read.csv("sample_impute.csv", header = TRUE) > > mydata.expr <- mydata[-1,-(1:2)] > > mydata.imputed <- impute.knn(as.matrix(mydata.expr)) > > The impute is succesful. > > Then I try to write the imputation results (mydata.imputed) to a csvfile> such as follows.. > > > write.csv(mydata.imputed, file = "sample_imputed.csv") > Error in data.frame(data = c(-0.07, -1.22, -0.09, -0.6, 0.65, -0.36,0.25,> : > arguments imply differing number of rows: 18, 1, 0When you use write.csv, the object that you are writing to a file must look something like a data frame or a matrix, i.e. a rectangle of data. The error message suggests that different columns of the thing you are trying to write have different numbers of rows. This means that mydata.imputed isn't the matrix it is supposed to be. You'll have to do some detective work to figure out what mydata.imputed really is. Try this: mydata.imputed class(mydata.imputed) dim(mydata.imputed) Then you need to see why mydata.imputed isn't a matrix. Here there are two possibilities 1. There are some lines of code that you didn't tell us about, where you overwrote mydata.imputed with another value. 2. The impute wasn't as successful as you thought. Regards, Richie. Mathematical Sciences Unit HSL ------------------------------------------------------------------------ ATTENTION: This message contains privileged and confidential inform...{{dropped:20}} ------------------------------ Message: 57 Date: Mon, 11 Feb 2008 09:43:26 +0000 (UTC) From: Dieter Menne <dieter.menne at menne-biomed.de> Subject: Re: [R] The function predict To: r-help at stat.math.ethz.ch Message-ID: <loom.20080211T093944-159 at post.gmane.org> Content-Type: text/plain; charset=us-ascii Carla Rebelo <crebelo <at> liaad.up.pt> writes:> May you help me? I need to understand the function predict. I need to > understand the algorithm implemented, the calculations associated.Where> can I find this information?In the documentation: "predict is a generic function for predictions from the results of various model fitting functions. The function invokes particular methods which depend on the class of the first argument." There is no information available for the default predict function, but there is information for the predict.XXX implementations mentioned further below: See Also predict.glm, predict.lm, predict.loess, predict.nls, predict.poly, predict.princomp, predict.smooth.spline. For time-series prediction, predict.ar, predict.Arima, predict.arima0, predict.HoltWinters, predict.StructTS. For details, you should look into the examples provided with predict.lm (as the simplest starter), and the code. 
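Dieter's point about dispatch can be seen directly with a toy fit (the data frame below is made up purely for illustration):

d   <- data.frame(x = 1:10, y = 2 * (1:10) + rnorm(10))
fit <- lm(y ~ x, data = d)
class(fit)                         # "lm": this is the class predict() dispatches on

predict(fit, newdata = data.frame(x = c(2.5, 7.5)))  # runs predict.lm() behind the scenes

methods(predict)                   # lists the available predict methods
getS3method("predict", "lm")       # prints the code that actually runs here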
Dieter ------------------------------ Message: 58 Date: Mon, 11 Feb 2008 17:50:16 +0800 From: "Suhaila Zainudin" <suhaila.zainudin at gmail.com> Subject: Re: [R] Help with write.csv To: Richard.Cotton at hsl.gov.uk Cc: r-help at r-project.org, r-help-bounces at r-project.org Message-ID: <9254175c0802110150g56d1ae42t610c1f693326a211 at mail.gmail.com> Content-Type: text/plain Thanks for the reply Richie. Finally I tried the following,> write.csv(mydata.imputed$data, file = "mydata_imputed.csv")and it worked. I guess I need to refer to the data portion (using $) of the [[elided trailing spam]] -- Suhaila Zainudin PhD Candidate Universiti Teknologi Malaysia [[alternative HTML version deleted]] ------------------------------ Message: 59 Date: Mon, 11 Feb 2008 10:18:04 +0000 From: Wolfgang Huber <huber at ebi.ac.uk> Subject: Re: [R] Dendrogram for agglomerative hierarchical clustering result To: noorpiilur <noorpiilur at yahoo.com> Cc: r-help at r-project.org Message-ID: <47B020DC.1020202 at ebi.ac.uk> Content-Type: text/plain; charset=ISO-8859-1 Hi Risto, You could try example("dendrogram") best wishes Wolfgang noorpiilur scripsit:> Hey group, > > I have a problem of drawing dendrogram as the result of my program > written in C. My algorithm is a approximation algorithm for single > linkage method. AS a result I will get the following data: > > [Average distance] [cluster A] [cluster B] > > For example: > 42.593141 1 26 > 42.593141 4 6 > 42.593141 123 124 > 42.593141 4 113 > 74.244206 1 123 > 74.244206 4 133 > 74.244206 1 36 > > So far I have used C to generate a bitmap output but I would like to > use the computed result as an input for R to just draw the dendrogram. > > As I'm new to R any help is appreciated. > > Thanks, > Risto >------------------------------ Message: 60 Date: Mon, 11 Feb 2008 18:22:09 +0800 From: "Ng Stanley" <stanleyngkl at gmail.com> Subject: [R] Conditional rows To: r-help <r-help at r-project.org> Message-ID: <462b7fdd0802110222x65accc33v1dbe43cb69227743 at mail.gmail.com> Content-Type: text/plain Hi, Given a simple example, test <- matrix(c(0.1, 0.2, 0.1, 0.2, 0.1, 0.1, 0.3, 0.1, 0.1), 3, 3) How to generate row indexes for which their corresponding row values are less than or equal to 0.2 ? For this example, row 2 and 3 are the correct ones. Thanks [[alternative HTML version deleted]] ------------------------------ Message: 61 Date: Mon, 11 Feb 2008 11:18:05 +0100 From: "John Lande" <john.lande77 at gmail.com> Subject: [R] image quality To: r-help at r-project.org Message-ID: <c2ebc3880802110218k6e11b250m7d22ff117dd14a90 at mail.gmail.com> Content-Type: text/plain dear all, I am writing a sweave documentation for my analysis, and I am plotting huge scatter plot data for microarray. unlucly this take a lot of resource to my pc because of the quality of the image which is to high (I see the PC get stuck for each single spot). how can I overcome this problem? is there a way to make lighter image? john [[alternative HTML version deleted]] ------------------------------ Message: 62 Date: Mon, 11 Feb 2008 11:27:29 +0100 From: Gabor Csardi <csardi at rmki.kfki.hu> Subject: Re: [R] Conditional rows To: Ng Stanley <stanleyngkl at gmail.com> Cc: r-help <r-help at r-project.org> Message-ID: <20080211102729.GA25781 at localdomain> Content-Type: text/plain; charset=us-ascii which(apply(test<=0.2, 1, all)) See ?which, ?all, and in particular ?apply. 
Gabor On Mon, Feb 11, 2008 at 06:22:09PM +0800, Ng Stanley wrote:> Hi, > > Given a simple example, test <- matrix(c(0.1, 0.2, 0.1, 0.2, 0.1,0.1, 0.3,> 0.1, 0.1), 3, 3) > > How to generate row indexes for which their corresponding row valuesare> less than or equal to 0.2 ? For this example, row 2 and 3 are thecorrect> ones. > > Thanks > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html> and provide commented, minimal, self-contained, reproducible code.-- Csardi Gabor <csardi at rmki.kfki.hu> UNIL DGM ------------------------------ Message: 63 Date: Mon, 11 Feb 2008 11:41:00 +0100 From: "Dimitris Rizopoulos" <dimitris.rizopoulos at med.kuleuven.be> Subject: Re: [R] Conditional rows To: "Ng Stanley" <stanleyngkl at gmail.com> Cc: r-help at r-project.org Message-ID: <000a01c86c9a$97f86160$0e40210a at www.domain> Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original do you mean the following: test <- matrix(c(0.1, 0.2, 0.1, 0.2, 0.1, 0.1, 0.3, 0.1, 0.1), 3, 3) ind <- rowSums(test <= 0.2) == nrow(test) ind which(ind) I hope it helps. Best, Dimitris ---- Dimitris Rizopoulos Ph.D. Student Biostatistical Centre School of Public Health Catholic University of Leuven Address: Kapucijnenvoer 35, Leuven, Belgium Tel: +32/(0)16/336899 Fax: +32/(0)16/337015 Web: http://med.kuleuven.be/biostat/ http://www.student.kuleuven.be/~m0390867/dimitris.htm ----- Original Message ----- From: "Ng Stanley" <stanleyngkl at gmail.com> To: "r-help" <r-help at r-project.org> Sent: Monday, February 11, 2008 11:22 AM Subject: [R] Conditional rows> Hi, > > Given a simple example, test <- matrix(c(0.1, 0.2, 0.1, 0.2, 0.1, > 0.1, 0.3, > 0.1, 0.1), 3, 3) > > How to generate row indexes for which their corresponding row values > are > less than or equal to 0.2 ? For this example, row 2 and 3 are the > correct > ones. > > Thanks > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm ------------------------------ Message: 64 Date: Mon, 11 Feb 2008 23:47:39 +1300 (NZDT) From: David Scott <d.scott at auckland.ac.nz> Subject: [R] R programming style To: r-help at stat.math.ethz.ch Message-ID: <alpine.LRH.1.00.0802112343200.27944 at stat12.stat.auckland.ac.nz> Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII I am aware of one (unofficial) guide to style for R programming: http://www1.maths.lth.se/help/R/RCC/ from Henrik Bengtsson. Can anyone provide further pointers to good style? Views on Bengtsson's ideas would interest me as well. 
David Scott _________________________________________________________________ David Scott Department of Statistics, Tamaki Campus The University of Auckland, PB 92019 Auckland 1142, NEW ZEALAND Phone: +64 9 373 7599 ext 86830 Fax: +64 9 373 7000 Email: d.scott at auckland.ac.nz Graduate Officer, Department of Statistics Director of Consulting, Department of Statistics ------------------------------ Message: 65 Date: Mon, 11 Feb 2008 11:52:25 +0100 (CET) From: "Anja Kraft" <anja.kraft at mail.uni-oldenburg.de> Subject: [R] RGTK2 and glade on Windows - GUI newbie To: r-help at r-project.org Message-ID: <4391.134.106.150.54.1202727145.squirrel at webmail.uni-oldenburg.de> Content-Type: text/plain;charset=utf-8 Hallo, I'd like to write a GUI (first choice with GTK+). I've surfed through the R- an Omegahat-Pages, because I'd like to use RGTK2, GTK 2.10.11 in combination with glade on Windows XP (perhaps later Unix, Mac). I've found a lot of different information. Because of the information I'm not sure, if this combination is running on Windows XP and I'm unsure how it works. Is there anyone, who has experience with this combination (if it works) and could tell me, where I could find something like a tutorial, how this combination is used together and how it works? Thank you very much, Anja Kraft ------------------------------ _______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. End of R-help Digest, Vol 60, Issue 11 ************************************** Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm