Warranty on Accuracy, Precision, Legality, ... of R in Research

(These questions may well have been raised.)

What is the implied warranty of using R for research & publications, consulting, etc.? Alternately, how does one obtain such a warranty?

Your answers will be much appreciated. Perhaps you can point me to some websites which discussed this subject in the past.

Thanks & regards
- Bert
(Bertram K. C. Chan, PhD)

----- Original Message ----
From: "r-help-request@r-project.org" <r-help-request@r-project.org>
To: r-help@r-project.org
Sent: Monday, September 22, 2008 3:00:04 AM
Subject: R-help Digest, Vol 67, Issue 23

Send R-help mailing list submissions to
    r-help@r-project.org

To subscribe or unsubscribe via the World Wide Web, visit
    https://stat.ethz.ch/mailman/listinfo/r-help
or, via email, send a message with subject or body 'help' to
    r-help-request@r-project.org

You can reach the person managing the list at
    r-help-owner@r-project.org

When replying, please edit your Subject line so it is more specific than "Re: Contents of R-help digest..."

Today's Topics:

   1. Calculating interval for conditional/unconditional correlation matrix (Ana Kolar)
   2. How to plot "greater than" symbol on the x-axis (Li, Bingshan)
   3. Re: How to plot "greater than" symbol on the x-axis (John Fox)
   4. Re: Design lrm function (milicic.marko)
   5. Re: How to plot "greater than" symbol on the x-axis (Li, Bingshan)
   6. Re: How to plot "greater than" symbol on the x-axis (John Fox)
   7. Re: periodicity validation (stephen sefick)
   8. Task View for Chemometrics and Computational Physics (Katharine Mullen)
   9. Re: Variable Selection for data reduction and discriminant anlaysis (Katharine Mullen)
  10. Re: How to plot "greater than" symbol on the x-axis (Henrik Bengtsson)
  11. Re: How to plot "greater than" symbol on the x-axis (Henrik Bengtsson)
  12. Re: How to plot "greater than" symbol on the x-axis (Gabor Grothendieck)
  13. Symmetric matrix (Megh Dal)
  14. Re: Symmetric matrix (Jorge Ivan Velez)
  15. Re: Symmetric matrix (Dimitris Rizopoulos)
  16. R Map using SAS data (Junjie Zhang)
  17. Re: How to plot "greater than" symbol on the x-axis (Li, Bingshan)
  18. Re: Symmetric matrix (Peter Dalgaard)
  19. Re: removing a word, the following space and the next word (Rolf Turner)
  20. Re: fitting a hyperbole (Rolf Turner)
  21. Re: Unexpected behaviour when testing for independence, with multiple factors (Javier Acuña)
  22. Re: Unexpected behaviour when testing for independence with multiple factors (Javier Acuña)
  23. r format questions (DS)
  24. design question on piping multiple data sets from 1 file into R (DS)
  25. color for lattice box plots (Tom Bonen)
  26. suppress legend in ggplot(data, aes(y=Y, x=X,fill=Z))? (Tom Bonen)
  27. Re: selecting from a series of integers with pre-determined probabilities (Bert Gunter)
  28. Multiple plots per window (p@fo76.org)
  29. glmer -- extracting standard errors and other statistics (John Poulsen)
  30. Re: r format questions (jim holtman)
  31. Re: r format questions (jim holtman)
  32. Re: Variable Selection for data reduction and discriminant anlaysis (gcam032)
  33. Re: Multiple plots per window (p@fo76.org)
  34. Re: glmer -- extracting standard errors and other statistics (Weiss, Bernd)
  35. Why isn't R recognising integers as numbers? (Ted Byers)
  36. Re: Why isn't R recognising integers as numbers? (jim holtman)
  37. Re: Multiple plots per window (Gabor Grothendieck)
  38. Re: Why isn't R recognising integers as numbers? (Marc Schwartz)
  39. Re: Why isn't R recognising integers as numbers? (Ted Byers)
  40. Re: Why isn't R recognising integers as numbers? (Ted Byers)
  41. Re: Why isn't R recognising integers as numbers? (Marc Schwartz)
  42. Re: Calculating interval for conditional/unconditional correlation matrix (Moshe Olshansky)
  43. Re: How to plot "greater than" symbol on the x-axis (Bingshan Li)
  44. Re: PDF fonts problem (Paul Murrell)
  45. Help for R (Mac)
  46. Hmisc and Ubuntu (aptitude install) (Matthew Pettis)
  47. adding layers in ggplot2 (data and code included) (Juliet Hannah)
  48. Warnings in fitdistr() from MASS. (Rolf Turner)
  49. Re: Why isn't R recognising integers as numbers? (Peter Dalgaard)
  50. Re: adding layers in ggplot2 (data and code included) (Eric)
  51. Time series (ts) questions. (rkevinburton@charter.net)
  52. Matrix balancing on margins (PALMIER Patrick - CETE NP/INFRA/TRF)
  53. Re: Variable Selection for data reduction and discriminant anlaysis (Mark Difford)
  54. Manage huge database (José E. Lozano)
  55. Re: Manage huge database (Barry Rowlingson)
  56. Re: Symmetric matrix (Martin Maechler)
  57. Re: Manage huge database (Yihui Xie)
  58. Re: Manage huge database (José E. Lozano)
  59. Re: Manage huge database (José E. Lozano)
  60. Re: how to keep up with R? (Robin Hankin)
  61. Re: Why isn't R recognising integers as numbers? ((Ted Harding))
  62. Re: Manage huge database (Barry Rowlingson)

----------------------------------------------------------------------

Message: 1
Date: Sun, 21 Sep 2008 03:05:40 -0700 (PDT)
Subject: [R] Calculating interval for conditional/unconditional correlation matrix
To: R <r-help@r-project.org>
Message-ID: <880098.14013.qm@web50610.mail.re2.yahoo.com>
Content-Type: text/plain

Hi there,

Could anyone please help me to understand what should be done in order not to get this error message:

Error: evaluation nested too deeply: infinite recursion / options(expressions=)?

Here is my code:

determinant<- function(x){det(matrix(c(1.0,0.2,0.5,0.8,0.2,1.0,0.5,0.6,0.5,0.5,0.5,1.0,x,0.8,0.6,x,1.0),ncol=4,byrow=T))}

matrix<- function(x){(matrix(c(1.0,0.2,0.5,0.8,0.2,1.0,0.5,0.6,0.5,0.5,0.5,1.0,x,0.8,0.6,x,1.0),ncol=4,byrow=T))}

conditional<-function(x,varcov){
varcov<-matrix(x)
sigmaxx<-varcov[3,3]
sigmaxz<-varcov[3,1:2]
sigmayy<-varcov[4,4]
sigmayz<-varcov[4,1:2]
sigmazx<-varcov[1:2,3]
sigmazy<-varcov[1:2,4]
sigmazz<-varcov[1:2,1:2]
(x-sigmaxz%*%solve(sigmaZZ)%*%sigmazy)/sqrt((sigmaxx-sigmaxz%*%solve(sigmaZZ)%*%sigmazx)*(sigmayy-sigmayz%*%solve(sigmaZZ)%*%sigmazy))}

interval<-uniroot(determinant,lower = min(c(0,1)), upper = max(c(0,1)))

I tried also with the code below, but got the same Error message.

lower.bound<-uniroot(determinant,c(0,0.5))$root
upper.bound<-uniroot(determinant,c(0.51,1))$root

Ana

------------------------------

Message: 2
Date: Sat, 20 Sep 2008 23:37:22 -0500
From: "Li, Bingshan" <bli1@bcm.tmc.edu>
Subject: [R] How to plot "greater than" symbol on the x-axis
To: <r-help@R-project.org>
Message-ID: <99FAE9C1DAA75C4BAB3C1441228F95D130C1E7@BCMEVS14.ad.bcm.edu>
Content-Type: text/plain

Hello everyone,

I want to plot a "greater than" symbol (the "_" under ">") on the x-axis in the labels. Is it possible to do it?

Thanks.

Bingshan

------------------------------

Message: 3
Date: Sun, 21 Sep 2008 09:38:13 -0400
From: "John Fox" <jfox@mcmaster.ca>
Subject: Re: [R] How to plot "greater than" symbol on the x-axis
To: "'Li, Bingshan'" <bli1@bcm.tmc.edu>
Cc: r-help@r-project.org
Message-ID: <000c01c91bef$506f3990$f14dacb0$@ca>
Content-Type: text/plain; charset="us-ascii"

Dear Bingshan,

It isn't entirely clear what you want to do. I think that you want the "greater-than-or-equal-to" symbol, not "greater than," but by itself or in an expression? For the first, xlab=expression("" >= ""), and for the second, e.g., xlab=expression(x >= x[min]).
More generally, see ?plotmath.

I hope this helps,
John

------------------------------
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox

------------------------------

Message: 4
Date: Sun, 21 Sep 2008 07:56:29 -0700 (PDT)
From: "milicic.marko" <milicic.marko@gmail.com>
Subject: Re: [R] Design lrm function
To: r-help@r-project.org
Message-ID: <879ed981-d735-41ac-89c1-87a0251b9f06@34g2000hsh.googlegroups.com>
Content-Type: text/plain; charset=ISO-8859-1

Thanks Frank.

On Sep 20, 2:53 am, Frank E Harrell Jr <f.harr...@vanderbilt.edu> wrote:
> milicic.marko wrote:
> > Hi,
> >
> > Is it possible to get ROC and accuracy ratio/gini straight out of the
> > Design package?
> >
> > Thanks
>
> The print method for lrm prints the ROC area (labeled "C"). lrm does
> not print the other 2 measures you listed. It computes a generalized
> R^2 (much more powerful than all the other measures) and rank indexes
> other than C.
>
> --
> Frank E Harrell Jr   Professor and Chair           School of Medicine
>                      Department of Biostatistics   Vanderbilt University

------------------------------

Message: 5
Date: Sun, 21 Sep 2008 10:23:53 -0500
From: "Li, Bingshan" <bli1@bcm.tmc.edu>
Subject: Re: [R] How to plot "greater than" symbol on the x-axis
To: "John Fox" <jfox@mcmaster.ca>
Cc: r-help@r-project.org
Message-ID: <99FAE9C1DAA75C4BAB3C1441228F95D130C1EA@BCMEVS14.ad.bcm.edu>
Content-Type: text/plain

Hi John,

Yes, you are right. I meant "greater-than-or-equal". According to your suggestion, I can plot the symbol only. But what I want is to have >=1, >=2 and so on as labels on xaxis. I did not make it work. Do you know how to make it? The expression("">=1"") did not work, and paste(expression("">=""), 1) did not work either.

Thanks a lot!

Bingshan

------------------------------

Message: 6
Date: Sun, 21 Sep 2008 12:14:04 -0400
From: "John Fox" <jfox@mcmaster.ca>
Subject: Re: [R] How to plot "greater than" symbol on the x-axis
To: "'Li, Bingshan'" <bli1@bcm.tmc.edu>
Cc: r-help@r-project.org
Message-ID: <000701c91c05$110711e0$331535a0$@ca>
Content-Type: text/plain; charset="us-ascii"

Dear Bingshan,

You can use xlab=expression("" >= "1"), xlab=expression("" >= 1), or expression(NA >= 1), etc. The point is that >= is a binary operator, so a well formed expression needs both a left- and right-hand operand.

John

------------------------------

Message: 7
Date: Sun, 21 Sep 2008 12:23:02 -0400
From: "stephen sefick" <ssefick@gmail.com>
Subject: Re: [R] periodicity validation
To: "yuankun shi" <shiyuankun.debian@gmail.com>, "R-help Mailing List" <r-help@r-project.org>
Message-ID: <c502a9e10809210923m10728682wf4f8d41e75dce71e@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

Alright, this is what you want to do:

install.packages("fields", dependencies=TRUE)

tim.colors is in this package, and it has a blue-to-red color scheme, blue being the lowest and red being the highest. This color scheme makes sense to me and is something that people (read: engineers) familiar with Matlab or the like will understand.

Use the Morlet wavelet. It is well localized (effectively compactly supported), which means that it quickly goes to zero once it gets out of the scale that it is fitting, making it good for a localized fit.

What you are looking at is the modulus (absolute value) of the convolution of the wavelet with the signal at a particular scale (kind of like frequency in Fourier analysis) on the y-axis, through time (local fitting) on the x-axis. You are trying to find periodicity? I kind of think of wavelet analysis as the partitioning of the variance of the signal into continuous scales.

Because of how the algorithm is calculated, the scale is in log2(value of the time series), so to get to your time units (which you set in the deltat or frequency argument when you create a time series with ts()) use 2^(value of the scale).

I hope this helps

Stephen

2008/9/21 yuankun shi <shiyuankun.debian@gmail.com>:
> Thanks, I have succeeded to do this: first wavCWTPeaks to get every peak's
> coordinate, then calculated their horizontal distance, finally bkde output
> the distance's distribution, that's what I want.
> On the contrary, the picture of wavCWT seems hard to understand, I am not sure
> what the y axis and the color mean. Could you do me a favor?
>
> 2008/9/19 stephen sefick <ssefick@gmail.com>
>>
>> I would suggest wavelet analysis-
>> library wmtsa
>> wavCWT
>> This will tell you if there is a periodicity localized in time, which
>> Fourier analysis cannot tell you- if the variance is not constant
>> through time then you should use this.
>>
>> 2008/9/19 yuankun shi <shiyuankun.debian@gmail.com>:
>> > I have spent lots of time to download the code you have mentioned. But
>> > all of them are not what I wanted, except the latest one, which I have
>> > not found anywhere.
>> > Maybe I have not made my problem clear, sorry for that.
>> > I have a data series; it consists of time and rate. Plotting rate vs
>> > time, I found it has periodicity to some extent. The rate rises and
>> > falls with time, but not with fixed cycle and fixed amplitude.
>> > So I am wondering, are there any tools to get the cycle? And furthermore,
>> > to draw its density picture?
>> > Since there is bkde in package KernSmooth, the 2nd is not strictly
>> > needed.
>> >
>> > 2008/9/11 stephen sefick <ssefick@gmail.com>
>> >>
>> >> all of the functions that I listed are time series tools for looking
>> >> at what I think you want. This can be done; you just have to
>> >> understand the methodology. So, look at some of the things that I
>> >> suggested. If these don't help then I don't understand what you want,
>> >> and it is necessary for you to help me figure out what it is that you
>> >> want.
>> >> good luck
>> >>
>> >> 2008/9/11 yk <shiyuankun.debian@gmail.com>:
>> >> > The data I mentioned above is oscillating vs time, but there is no
>> >> > observable fixed cycle if I just plot this data.
>> >> > How to get the average cycle, or the most probable range of cycle
>> >> > with statistical methods?
>> >> > I don't know how to achieve it by R, is there any command?
>> >> >
>> >> > On Sep 11, 10:52 am, "stephen sefick" <ssef...@gmail.com> wrote:
>> >> >> ?spectrum
>> >> >> ?acf
>> >> >> ?ccf
>> >> >> library(wmtsa)
>> >> >> ?wavCWT
>> >> >> library(sowas)
>> >> >> ?wsp
>> >> >>
>> >> >> you could also look at lagged plots to look for periodicity.
>> >> >> if you elaborate on the problem and include executable sample code
>> >> >> you will probably receive more help.
>> >> >>
>> >> >> On Wed, Sep 10, 2008 at 10:02 PM, yk <shiyuankun.deb...@gmail.com>
>> >> >> wrote:
>> >> >> > There is a series of data containing time in fixed steps and energy
>> >> >> > varying with time; how to test its periodicity? In R, it seems
>> >> >> > there is no direct tool, since I have searched the R manual for
>> >> >> > "periodic" and have not found any related topic.
>> >> >> > Thanks a lot

--
Stephen Sefick
Research Scientist
Southeastern Natural Sciences Academy

Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals.

-K. Mullis

------------------------------

Message: 8
Date: Sun, 21 Sep 2008 18:39:55 +0200 (CEST)
From: Katharine Mullen <kate@few.vu.nl>
Subject: [R] Task View for Chemometrics and Computational Physics
To: r-help@r-project.org
Message-ID: <Pine.GSO.4.56.0809211836020.3492@laurel.few.vu.nl>
Content-Type: TEXT/PLAIN; charset=US-ASCII

Dear All,

A new task view "ChemPhys" on chemometrics and computational physics is available on CRAN (http://cran.r-project.org/web/views/ChemPhys.html).
It describes packages and functions that are of use in modeling chemical/physical systems.

Suggestions and comments regarding this task view are welcome. If you think a new category, package or function should be added, please mail.

best regards,
Kate Mullen

----
Katharine Mullen
mail: Department of Physics and Astronomy, Faculty of Sciences
Vrije Universiteit Amsterdam, de Boelelaan 1081
1081 HV Amsterdam, The Netherlands
room: T.1.06
tel: +31 205987870
fax: +31 205987992
e-mail: kate@nat.vu.nl
homepage: http://www.nat.vu.nl/~kate/

------------------------------

Message: 9
Date: Sun, 21 Sep 2008 18:43:37 +0200 (CEST)
From: Katharine Mullen <kate@few.vu.nl>
Subject: Re: [R] Variable Selection for data reduction and discriminant anlaysis
To: Gareth Campbell <gcam032@gmail.com>
Cc: R Help <r-help@r-project.org>
Message-ID: <Pine.GSO.4.56.0809211841530.3492@laurel.few.vu.nl>
Content-Type: TEXT/PLAIN; charset=US-ASCII

There are some pointers to packages for variable selection in the task view for Chemometrics and Computational Physics at http://cran.r-project.org/web/views/ChemPhys.html

On Sun, 21 Sep 2008, Gareth Campbell wrote:
> Hello all,
>
> I'm dealing with geochemical analyses of some rocks.
>
> If I use the full composition (31 elements or variables), I can get
> reasonable separation of my 6 sources. Then when I go on to do LDA with the
> 6 groups, I get excellent separation.
>
> I feel like I should be reducing the variables to those that are providing
> the most discrimination between the groups, as this is important information
> for me. I struggle to interpret the PCA plot in a way that helps me (due to
> the large number of elements). So I'm trying to do some sort of step-wise
> variable selection.
>
> I would love to hear from someone (possibly a geochemist or similar) who
> does this regularly to determine the best course of action in R to do this.
>
> Thanks very much
>
> --
> Gareth Campbell
> PhD Candidate
> The University of Auckland
>
> P +649 815 3670
> M +6421 256 3511
> E gareth.campbell@esr.cri.nz
> gcam032@gmail.com

------------------------------

Message: 10
Date: Sun, 21 Sep 2008 09:52:29 -0700
From: "Henrik Bengtsson" <hb@stat.berkeley.edu>
Subject: Re: [R] How to plot "greater than" symbol on the x-axis
To: "Li, Bingshan" <bli1@bcm.tmc.edu>
Cc: r-help@r-project.org, John Fox <jfox@mcmaster.ca>
Message-ID: <59d7961d0809210952j7a8ffb0epdad6b839aba452c9@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

What have you tried this far and what part does not work? If you forget for a moment the fact that you want to have ">=1", ">=2", ... can you do what you want with plain "1", "2", ...? Telling us that helps us help you.

Are you asking for the labels on the *tick marks* on the axis? Right now it sounds like you are asking for *the label* on the x axis, but the part that you want multiple ones is confusing.

plot(1:10, xlab="1");

is different from:

plot(1:10, axes=FALSE);
axis(side=1, at=1:10, labels=1:10);

To add ">=" to the latter case, this works:

bquote("" >= 1:10)
labels <- lapply(1:10, FUN=function(x) substitute("" >= t, list(t=x)));
plot(1:10, axes=FALSE);
axis(side=1, at=1:10, labels=labels);

------------------------------

Message: 11
Date: Sun, 21 Sep 2008 09:53:39 -0700
From: "Henrik Bengtsson" <hb@stat.berkeley.edu>
Subject: Re: [R] How to plot "greater than" symbol on the x-axis
To: "Li, Bingshan" <bli1@bcm.tmc.edu>
Cc: r-help@r-project.org, John Fox <jfox@mcmaster.ca>
Message-ID: <59d7961d0809210953n1727e462q2f19b37266689348@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

On Sun, Sep 21, 2008 at 9:52 AM, Henrik Bengtsson <hb@stat.berkeley.edu> wrote:
> To add ">=" to the latter case, this works:
>
> bquote("" >= 1:10)
> labels <- lapply(1:10, FUN=function(x) substitute("" >= t, list(t=x)));
> plot(1:10, axes=FALSE);
> axis(side=1, at=1:10, labels=labels);

Oops. Forget about the bquote() - cut'n'paste error ...and I don't know how to get rid of the "" preceding each tick label. Maybe someone else knows.

/Henrik

------------------------------

Message: 12
Date: Sun, 21 Sep 2008 13:08:09 -0400
From: "Gabor Grothendieck" <ggrothendieck@gmail.com>
Subject: Re: [R] How to plot "greater than" symbol on the x-axis
To: "Li, Bingshan" <bli1@bcm.tmc.edu>
Cc: r-help@r-project.org, John Fox <jfox@mcmaster.ca>
Message-ID: <971536df0809211008l559eec03ub016e3fcd71682f@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

On Sun, Sep 21, 2008 at 11:23 AM, Li, Bingshan <bli1@bcm.tmc.edu> wrote:
> Hi John,
>
> Yes, you are right. I meant "greater-than-or-equal". According to your
> suggestion, I can plot the symbol only. But what I want is to have >=1, >=2
> and so on as labels on xaxis. I did not make it work. Do you know how to
> make it? The expression("">=1"") did not work, and
> paste(expression("">=""), 1) did not work either.
>

Try this:

plot(1:10, xaxt = "n")
for(i in 1:10) axis(1, i, bquote(phantom(0) >= .(i)))

------------------------------

Message: 13
Date: Sun, 21 Sep 2008 10:47:47 -0700 (PDT)
Subject: [R] Symmetric matrix
To: r-help@stat.math.ethz.ch
Message-ID: <139940.18844.qm@web58102.mail.re3.yahoo.com>
Content-Type: text/plain; charset=us-ascii

I have the following matrix:

a = matrix(rnorm(36), 6)

Now I want to replace the lower-triangular elements with its upper-triangular elements. That is, I want to make a symmetric matrix from a. I have tried with the lower.tri() and upper.tri() functions, but did not get the desired result. Can anyone please tell me how to do that?

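A minimal sketch of the task asked about above, assuming the goal is to copy the upper triangle of a onto its lower triangle. The names s, b and ind and the set.seed() call are illustrative additions, not part of the original post. Because lower.tri() and upper.tri() both select elements in column-major order, assigning one triangle directly to the other pairs up the wrong positions; going through t() avoids that.

```r
set.seed(1)                      # assumption: fixed seed, only for reproducibility
a <- matrix(rnorm(36), 6)

# Mirror the upper triangle onto the lower triangle via the transpose:
# t(s)[i, j] is s[j, i], so each lower element receives its mirror image.
s <- a
ind <- lower.tri(s)
s[ind] <- t(s)[ind]
isSymmetric(s)                   # TRUE

# Direct triangle-to-triangle assignment: both sides are read in
# column-major order, so from the third element on the pairs no longer match.
b <- a
b[lower.tri(b)] <- b[upper.tri(b)]
isSymmetric(b)                   # FALSE for this random matrix
```

The upper triangle and the diagonal of s are left exactly as they were in a; only the lower triangle is overwritten.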
------------------------------ Message: 14 Date: Sun, 21 Sep 2008 13:54:19 -0400 From: "Jorge Ivan Velez" <jorgeivanvelez@gmail.com> Subject: Re: [R] Symmetric matrix Cc: r-help@stat.math.ethz.ch Message-ID: <317737de0809211054u1485f494l166e21f6b30e4123@mail.gmail.com> Content-Type: text/plain Dear Megh, Try this: a = matrix(rnorm(36), 6) a[upper.tri(a)]<-a[lower.tri(a)] a HTH, Jorge> I have following matrix : > > a = matrix(rnorm(36), 6) > > Now I want to replace the lower-triangular elements with it's > upper-triangular elements. That is I want to make a symmetric matrix from a. > I have tried with lower.tri() and upper.tri() function, but got desired > result. Can anyone please tell me how to do that? > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >[[alternative HTML version deleted]] ------------------------------ Message: 15 Date: Sun, 21 Sep 2008 19:58:44 +0200 From: Dimitris Rizopoulos <d.rizopoulos@erasmusmc.nl> Subject: Re: [R] Symmetric matrix Cc: r-help@stat.math.ethz.ch Message-ID: <48D68B54.70409@erasmusmc.nl> Content-Type: text/plain; charset=ISO-8859-1; format=flowed try the following a <- matrix(rnorm(36), 6) ind <- lower.tri(a) a[ind] <- t(a)[ind] a I hope it helps. Best, Dimitris Megh Dal wrote:> I have following matrix : > > a = matrix(rnorm(36), 6) > > Now I want to replace the lower-triangular elements with it's upper-triangular elements. That is I want to make a symmetric matrix from a. I have tried with lower.tri() and upper.tri() function, but got desired result. Can anyone please tell me how to do that? 
-- 
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014

------------------------------

Message: 16
Date: Sun, 21 Sep 2008 14:26:05 -0400
From: Junjie Zhang <thujacky@hotmail.com>
Subject: [R] R Map using SAS data
To: <r-help@r-project.org>
Message-ID: <BAY105-W522471E0693BD6F2FD69BDDC480@phx.gbl>
Content-Type: text/plain

Hi there,

I'd like to plot some maps. Is it possible for me to use SAS map data in R? Thank you.

Best,
Junjie

------------------------------

Message: 17
Date: Sun, 21 Sep 2008 11:40:35 -0500
From: "Li, Bingshan" <bli1@bcm.tmc.edu>
Subject: Re: [R] How to plot "greater than" symbol on the x-axis
To: "John Fox" <jfox@mcmaster.ca>
Cc: r-help@r-project.org
Message-ID: <99FAE9C1DAA75C4BAB3C1441228F95D130C1EB@BCMEVS14.ad.bcm.edu>
Content-Type: text/plain

Hi John,

It works perfectly. Thank you so much for the help!

Have a great day.

Bingshan

-----Original Message-----
From: John Fox [mailto:jfox@mcmaster.ca]
Sent: Sun 9/21/2008 11:14 AM
To: Li, Bingshan
Cc: r-help@r-project.org
Subject: RE: [R] How to plot "greater than" symbol on the x-axis

Dear Bingshan,

You can use xlab=expression("" >= "1"), xlab=expression("" >= 1), or expression(NA >= 1), etc. The point is that >= is a binary operator, so a well formed expression needs both a left- and right-hand operand.
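
A short sketch of that point using base graphics only. The data and the name labs are made up for illustration; the empty string "" plays the role of the invisible left-hand operand.

```r
# expression(>= 1) is a syntax error: >= needs an operand on each side.
# Supplying "" on the left gives a well-formed call that draws as ">= 1".
labs <- lapply(1:5, function(i) bquote("" >= .(i)))   # .(i) substitutes the value of i

plot(1:5, xaxt = "n", xlab = expression(x >= x[min])) # suppress the default x-axis
axis(1, at = 1:5, labels = as.expression(labs))       # one plotmath label per tick
```

Building all labels first and making a single axis() call is a design choice of this sketch; a loop with one axis() call per tick produces the same picture.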
John ------------------------------ John Fox, Professor Department of Sociology McMaster University Hamilton, Ontario, Canada web: socserv.mcmaster.ca/jfox> -----Original Message----- > From: Li, Bingshan [mailto:bli1@bcm.tmc.edu] > Sent: September-21-08 11:24 AM > To: John Fox > Cc: r-help@r-project.org > Subject: RE: [R] How to plot "greater than" symbol on the x-axis > > Hi John, > > Yes, you are right. I meant "greater-than-or-equal". According to your > suggestion, I can plot the symbol only. But what I want is to have >=1, >=2 > and so on as labels on xaxis. I did not make it work. Do you know how tomake> it? The expression("">=1"") did not work, and paste(expression("">=""), 1) > did not work either. > > Thanks a lot! > > Bingshan > > > -----Original Message----- > From: John Fox [mailto:jfox@mcmaster.ca] > Sent: Sun 9/21/2008 8:38 AM > To: Li, Bingshan > Cc: r-help@r-project.org > Subject: RE: [R] How to plot "greater than" symbol on the x-axis > > Dear Bingshan, > > It isn't entirely clear what you want to do. I think that you want the > "greater-than-or-equal-to" symbol, not "greater than," but by itself or in > an expression? For the first, xlab=expression("" >= ""), and for thesecond,> e.g., xlab=expression(x >= x[min]). More generally, see ?plotmath. > > I hope this helps, > John > > ------------------------------ > John Fox, Professor > Department of Sociology > McMaster University > Hamilton, Ontario, Canada > web: socserv.mcmaster.ca/jfox > > > -----Original Message----- > > From: r-help-bounces@r-project.org [mailto:r-help-bounces@r-project.org] > On > > Behalf Of Li, Bingshan > > Sent: September-21-08 12:37 AM > > To: r-help@r-project.org > > Subject: [R] How to plot "greater than" symbol on the x-axis > > > > > > Hello everyone, > > > > I want to plot a "greater than" symbol (the "_" under ">") on the x-axis > in > > the labels. Is it possible to do it? > > > > Thanks. 
> > > > Bingshan > > > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-help@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > >[[alternative HTML version deleted]] ------------------------------ Message: 18 Date: Sun, 21 Sep 2008 21:10:07 +0200 From: Peter Dalgaard <p.dalgaard@biostat.ku.dk> Subject: Re: [R] Symmetric matrix To: Jorge Ivan Velez <jorgeivanvelez@gmail.com> Message-ID: <48D69C0F.3000706@biostat.ku.dk> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Jorge Ivan Velez wrote:> Dear Megh, > Try this: > > a = matrix(rnorm(36), 6) > a[upper.tri(a)]<-a[lower.tri(a)] > a > > > HTH, > >If you look carefully, you'll see that it doesn't work! Dimitris had the better idea. -- O__ ---- Peter Dalgaard ?ster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk) FAX: (+45) 35327907 ------------------------------ Message: 19 Date: Mon, 22 Sep 2008 08:39:44 +1200 From: Rolf Turner <r.turner@auckland.ac.nz> Subject: Re: [R] removing a word, the following space and the next word To: jim holtman <jholtman@gmail.com> Cc: r-help@r-project.org, Bob Green <bgreen@dyson.brisnet.org.au> Message-ID: <1503F860-54A4-402C-B6B5-C76A88EF2D5E@auckland.ac.nz> Content-Type: text/plain; charset=US-ASCII; format=flowed On 21/09/2008, at 5:15 AM, jim holtman wrote:>> x <- 'Mr Jones ate lunch and Mr Smith was tied' >> gsub('(Mr\\.*)\\s+\\w+', "\\1 <file://0.0.0.1/> xxxx", x) > [1] "Mr xxxx ate lunch and Mr xxxx was tied"I don't get what the bit <file://0.0.0.1/> is about. If I do (just) gsub('(Mr\\.*)\\s+\\w+', "\\1 xxxx", x) I get the desired result, i.e. 
[1] "Mr xxxx ate lunch and Mr xxxx was tied" cheers, Rolf Turner ###################################################################### Attention:\ This e-mail message is privileged and confid...{{dropped:9}} ------------------------------ Message: 20 Date: Mon, 22 Sep 2008 08:58:00 +1200 From: Rolf Turner <r.turner@auckland.ac.nz> Subject: Re: [R] fitting a hyperbole To: Peter Dalgaard <p.dalgaard@biostat.ku.dk> Cc: R-help Forum <r-help@r-project.org> Message-ID: <DAE7D8B9-0991-4DF7-8336-F22CA23E1254@auckland.ac.nz> Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed On 21/09/2008, at 10:38 AM, Peter Dalgaard wrote:> stephen sefick wrote: >> I am not sure if I am exaggerating or not read title as hyperbola >> >> On Sat, Sep 20, 2008 at 2:20 PM, stephen sefick >> <ssefick@gmail.com> wrote: >> >>> I have got a data set that is Gross Primary Productivity ~ Total >>> Suspended Solids it is a hyperbola just like: >>> plot(1/c(1:1000)) >>> >>> how do I model this relationship so that I can get all of the neat >>> things that lm gives residuals etc. etc. so that I can see if my >>> eyeball model stands up. Thanks for any help, pointers, or good >>> things to read. >>> > Well, it depends on the exact model you want to fit and the error > characteristics. > > There's a straightforward linear model in the transformed x: > lm(y ~ I(1/x)) > > but there are also transformed models like > > lm(1/y ~ x) > > or > > lm(log(y) ~ log(x)) > > but of course, y, 1/y, and log(y) can't all be homoscedastic normal > variates. Going beyond the linearized models, you can use nls(), as in > > nls(y~ a/(x-b), start=c(a=1,b=0)) > > (which is linear for 1/y, but assumes that y rather than 1/y has > constant variance.)Nicely expressed. Succinct, clear, to the point, comprehensive. I wish I'd said that! (And that's not hyperbole. 
:-) ) So much more helpful than some postings I've seen recently to the effect of ``Go away and read a book on this topic.'' cheers, Rolf ###################################################################### Attention:\ This e-mail message is privileged and confid...{{dropped:9}} ------------------------------ Message: 21 Date: Sun, 21 Sep 2008 17:01:18 -0400 From: " Javier Acu?a " <javier.acuna.o@gmail.com> Subject: Re: [R] Unexpected behaviour when testing for independence, with multiple factors To: bolker@ufl.edu, r-help@r-project.org Message-ID: <e10c29610809211401i6e3d7792p22b72172993aae9a@mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1>Ben Bolker <bolker <at> ufl.edu> writes: > >I would try > >fligner.test(dT ~ Topology:Drift:lambda) > >in response to: > >Javier Acuna <javier.acuna.o <at> gmail.com> writes:> > Hi, I'm a new user of R. My background is Electrical Engineering, so > please bear with me if this is a silly question. > > I'm trying to assess whether the results of an experiment satisfy the > hypothesis of homoscedasticity (my ultimate goal is to use ANOVA). > > The result of the experiment is mean delay (dT), which depends on > three factors, topology, drift, and lambda. The first two factors are > categorical (with 4 levels each) and the last one is numerical, with > two levels. > > A sample of my data is as follows: > > dT Topology Drift lambda > 258.789 Tree b1 .43 > 244.195 Tree b1 .43 > 115.961 Tree b2 .3 > 115.183 Tree b2 .3 > > I would like to separate dT in the 32 samples (4x4x2), and test if the > variance of each sample is equal to the other 31 samples. 
> I tried using fligner.test and bartlett.test, but either test seems to
> only work for one factor:
>
> > fligner.test( dT ~ Topology + Drift + lambda)
>
> Fligner-Killeen test of homogeneity of variances
>
> data: dT by Topology by Drift by lambda
> Fligner-Killeen:med chi-squared = 15.4343, df = 2, p-value = 0.0004451
>
> > fligner.test( dT ~ Topology )
>
> Fligner-Killeen test of homogeneity of variances
>
> data: dT by Topology
> Fligner-Killeen:med chi-squared = 15.4343, df = 2, p-value = 0.0004451
>
> As I see from the previous two outputs, fligner.test only takes into
> account the first factor. Similar results are obtained for
> bartlett.test.

I tried what you suggested, Ben, but I'm still puzzled by the output. In this case, I obtain different results with different orderings of the factors:

> fligner.test( dT ~ Dims : Topology :Drift )

Fligner-Killeen test of homogeneity of variances

data: dT by Dims by Topology by Drift
Fligner-Killeen:med chi-squared = 195.2067, df = 1, p-value < 2.2e-16

> fligner.test( dT ~ Topology :Drift:Dims )

Fligner-Killeen test of homogeneity of variances

data: dT by Topology by Drift by Dims
Fligner-Killeen:med chi-squared = 15.4343, df = 2, p-value = 0.0004451

I don't know what to do now; any help would be really appreciated.

Best Regards
Javier

----------------------------------------------------
Javier Acuna
Electrical Engineering Grad Student
Universidad de Chile
javier.acuna.o@gmail.com

------------------------------

Message: 22
Date: Sun, 21 Sep 2008 17:05:06 -0400
From: "Javier Acuña" <javier.acuna.o@gmail.com>
Subject: Re: [R] Unexpected behaviour when testing for independence with multiple factors
To: "Michael Dewey" <info@aghmed.fsnet.co.uk>
Cc: r-help@r-project.org
Message-ID: <e10c29610809211405y3607a7d9s2ba865460aca38e8@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Michael, so you're suggesting that I should do:

aux <- interaction( Topology, Drift, lambda)

and then

fligner.test(dT~aux)

Is that correct?
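
A sketch of that approach on simulated data. The data frame d, its level names, the number of replicates, and the seed are all invented for illustration; only interaction() and fligner.test() come from the thread.

```r
set.seed(42)                                  # assumed seed for repeatability
d <- expand.grid(Topology = c("Tree", "Ring", "Mesh", "Star"),
                 Drift    = c("b1", "b2", "b3", "b4"),
                 lambda   = c(0.3, 0.43),
                 rep      = 1:5)              # 5 made-up replicates per cell
d$dT <- rnorm(nrow(d), mean = 200, sd = 20)   # simulated delays, equal variance

# Collapse the three factors into a single factor with one level per cell
d$cell <- interaction(d$Topology, d$Drift, d$lambda, drop = TRUE)
nlevels(d$cell)                               # number of cells: 4 x 4 x 2

res <- fligner.test(dT ~ cell, data = d)      # one grouping factor on the right
res$parameter                                 # df = number of groups - 1
```

With a single combined factor on the right-hand side, the degrees of freedom give a quick check that every cell is being treated as its own group.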
On Thu, Sep 18, 2008 at 8:32 AM, Michael Dewey <info <at> aghmed.fsnet.co.uk> wrote:> At 16:03 17/09/2008, Javier Acu?a wrote: >> >> Hi, I'm a new user of R. My background is Electrical Engineering, so >> please bear with me if this is a silly question. > > For future reference you might find > ?interaction > helpful as another tool in your box. > > >> I'm trying to assess whether the results of an experiment satisfy the >> hypothesis of homoscedasticity (my ultimate goal is to use ANOVA). > > It is hard to resist quoting Box (1953, Biometrika, 40, p333) that these > tests are '... like putting to sea in a rowing boat to find out whether > conditions are safe for an ocean liner to leave port' > >> The result of the experiment is mean delay (dT), which depends on >> three factors, topology, drift, and lambda. The first two factors are >> categorical (with 4 levels each) and the last one is numerical, with >> two levels. >> >> A sample of my data is as follows: >> >> dT Topology Drift lambda >> 258.789 Tree b1 .43 >> 244.195 Tree b1 .43 >> 115.961 Tree b2 .3 >> 115.183 Tree b2 .3 >> >> I would like to separate dT in the 32 samples (4x4x2), and test if the >> variance of each sample is equal to the other 31 samples. >> I tried using fligner.test and bartlett.test, but either test seems to >> only work for one factor: >> >> > fligner.test( dT ~ Topology + Drift + lambda) >> >> Fligner-Killeen test of homogeneity of variances >> >> data: dT by Topology by Drift by lambda >> Fligner-Killeen:med chi-squared = 15.4343, df = 2, p-value = 0.0004451 >> >> > fligner.test( dT ~ Topology ) >> >> Fligner-Killeen test of homogeneity of variances >> >> data: dT by Topology >> Fligner-Killeen:med chi-squared = 15.4343, df = 2, p-value = 0.0004451 >> >> As I see from the previous two outputs, fligner.test only takes into >> account the first factor. Similar results are obtained for >> bartlett.test. 
>>
>> At this point I don't know if I'm using the test incorrectly or
>> something else. I would really appreciate any help. I'm using R
>> version 2.7.2 (2008-08-25) in Windows XP.
>>
>> Many thanks in advance
>> Javier
>>
>> ----------------------------------------------------
>> Javier Acuna
>> Electrical Engineering Grad Student
>> Universidad de Chile
>> javier.acuna.o@gmail.com
>
> Michael Dewey
> http://www.aghmed.fsnet.co.uk
>

------------------------------

Message: 23
Date: Sun, 21 Sep 2008 18:03:52 -0400
From: "DS" <ds5j@excite.com>
Subject: [R] r format questions
To: r-help@R-project.org
Message-ID: <20080921180352.6760@web005.roc2.bluetie.com>
Content-Type: text/plain

Hi,

1) I have noticed that when I use the aggregate function it outputs row numbers in the results. For example:

aggregate by product

  group.1   Aggregate
1 ProductA  1000400.00
2 ProductB 23232323.00
3 Missing    232323.00

Is there a way to suppress the numbers in front of the aggregate output? I checked and they don't look like columns when I do a summary, so I can't -1 them away.

2) Is there an easy way to then take my aggregate matrix and format the sum with $ and commas? E.g., instead of 10000 it should show $10,000.00.

I am trying to create a report and am piping the aggregate into an xtable and feeding it to R2HTML.

thanks
Dhruv

------------------------------

Message: 24
Date: Sun, 21 Sep 2008 18:06:51 -0400
From: "DS" <ds5j@excite.com>
Subject: [R] design question on piping multiple data sets from 1 file into R
To: r-help@R-project.org
Message-ID: <20080921180651.12712@web006.roc2.bluetie.com>
Content-Type: text/plain

Hi,

I have 8 separate queries that I use to get time series information; each deals with a different set of time series.
I run my queries, save the output as a csv file, and then format the data into graphs in Excel.

I wanted to know if there is an elegant and clean way to read in 1 csv file but to read the separate matrices on different rows into separate R data objects.

If this is easy then I can read the 8 datasets in the csv file into 8 R objects and pipe them to time series objects for graphs.

thanks
Dhruv

------------------------------

Message: 25
Date: Sun, 21 Sep 2008 18:09:01 -0400
From: "Tom Bonen" <tom.bonen@googlemail.com>
Subject: [R] color for lattice box plots
To: r-help@r-project.org
Message-ID: <8316adf50809211509l7471b151x282a1cc19fd6a4ba@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

hi,

i have a figure with many boxplots and want to differentiate one group of the boxplots by colour of the box. so for example:

X <- replicate(3,rnorm(100))
bwplot(X[,1]~as.factor(X[,2]>1)|X[,3]>0)
# this gives four boxplots, i'd like to give 1 and 3 a different colour than 2 and 4
# i tried
bwplot(X[,1]~as.factor(X[,2]>1)|X[,3]>0,groups=as.factor(X[,2]>1))

but that does not change the display. how can i change the colour for groups with bwplot?

thanks.
tom

------------------------------

Message: 26
Date: Sun, 21 Sep 2008 18:25:32 -0400
From: "Tom Bonen" <tom.bonen@googlemail.com>
Subject: [R] suppress legend in ggplot(data, aes(y=Y, x=X,fill=Z))?
To: r-help <r-help@r-project.org>
Message-ID: <8316adf50809211525q53e22a3dp5992e1c05c303ab9@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

hi,

is there any way to suppress the legend in ggplot(data, aes(y=Y, x=X,fill=Z)) ? i'd like the values to be displayed in different colors as specified by fill= and this works just fine.
but i do not want to have the legend on the right that is automactially created when fill is specified. thanks, tom ------------------------------ Message: 27 Date: Sun, 21 Sep 2008 15:33:56 -0700 From: Bert Gunter <gunter.berton@gene.com> Subject: Re: [R] selecting from a series of integers withpre-determined probabilities To: "'John Sorkin'" <jsorkin@grecc.umaryland.edu>, <r-help@r-project.org> Message-ID: <000901c91c3a$225eaa40$6501a8c0@gne.windows.gene.com> Content-Type: text/plain; charset="us-ascii" ?sample. -- Bert Gunter -----Original Message----- From: r-help-bounces@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of John Sorkin Sent: Saturday, September 20, 2008 12:43 PM To: r-help@r-project.org Subject: [R] selecting from a series of integers withpre-determined probabilities R 2.6 Windows XP I need to select from the integers 1,2,3,4,5 with some pre-determined probability, e.g. probability of selecting 5 80%, probability of selecting 1 or 2 or 3 or 4 20%. Any suggestions for how I might accomplish this? I need to do it very efficiently as I will be doing it 500,000 times. Thanks John John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Confidentiality Statement: This email message, including any attachments, is for th...{{dropped:8}} ------------------------------ Message: 28 Date: Mon, 22 Sep 2008 00:34:51 +0200 From: p@fo76.org Subject: [R] Multiple plots per window To: R Help <r-help@r-project.org> Message-ID: <20080922003451.2qhs63jhhwow48s0@webmail.openit.de> Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes"; format="flowed" Hi all, I'm currently working through "The Analysis of Time Series" by Chris Chatfield. 
In order to also get a better understanding of R, I play around with the examples and Exercises (no homework or assignement, just selfstudy!!). Exercise 2.1 gives the following dataset (sales figures for 4 week intervals):> sales2.1.dataframe1995 1996 1997 1998 1 153 133 145 111 2 189 177 200 170 3 221 241 187 243 4 215 228 201 178 5 302 283 292 248 6 223 255 220 202 7 201 238 233 163 8 173 164 172 139 9 121 128 119 120 10 106 108 81 96 11 86 87 65 95 12 87 74 76 53 13 108 95 74 94 I want to plot the histograms/densities for all four years in one window. After trying out a couple of things, I finally ended up with the following (it took me two hours - Ouch!): sales2.1 <- c(153,189,221,215,302,223,201,173,121,106,86,87,108, 133,177,241,228,283,255,238,164,128,108,87,74,95, 145,200,187,201,292,220,233,172,119,81,65,76,74, 111,170,243,178,248,202,163,139,120,96,95,53,94) sales2.1.matrix <- sales2.1 dim(sales2.1.matrix) <- c(4,13) sales2.1.dataframe <- as.data.frame(sales2.1.matrix) names(sales2.1.dataframe) <- c("1995","1996","1997","1998") X11() split.screen(c(2,2)) for (i in 1:4) { screen(i) hist(sales2.1.dataframe[[i]], probability=T, xlim=c(0,400), ylim=c(0,0.006), main=names(sales2.1.dataframe)[i], xlab="Sales") lines(density(sales2.1.dataframe[[i]])) } close.screen(all=TRUE) Although I'm happy that I finally got something that is pretty close to what I wanted, I'm not sure whether this is the best or most elegant way to do it. How would you do it? What functions/packages should I look into, in order to improve these plots? 
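
One common base-graphics alternative to split.screen() is par(mfrow = ...). A minimal sketch, assuming the data are laid out with 13 rows (periods) and 4 columns (years) as in the printed table; the names sales and sales.df are chosen here for brevity.

```r
sales <- c(153,189,221,215,302,223,201,173,121,106,86,87,108,
           133,177,241,228,283,255,238,164,128,108,87,74,95,
           145,200,187,201,292,220,233,172,119,81,65,76,74,
           111,170,243,178,248,202,163,139,120,96,95,53,94)

# matrix() fills column by column, so each block of 13 values becomes one year
sales.df <- as.data.frame(matrix(sales, nrow = 13, ncol = 4))
names(sales.df) <- c("1995", "1996", "1997", "1998")

op <- par(mfrow = c(2, 2))                 # 2 x 2 grid of panels; keep old settings
for (yr in names(sales.df)) {
  hist(sales.df[[yr]], probability = TRUE,
       xlim = c(0, 400), ylim = c(0, 0.006),
       main = yr, xlab = "Sales")
  lines(density(sales.df[[yr]]))           # overlay a kernel density estimate
}
par(op)                                    # restore the previous settings
```

layout() is another base option when the panels should have unequal sizes.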
Thanks in advance for your comments and suggestions,

Peter

------------------------------

Message: 29
Date: Sun, 21 Sep 2008 18:41:22 -0400
From: John Poulsen <jpoulsen@zoo.ufl.edu>
Subject: [R] glmer -- extracting standard errors and other statistics
To: r-help@r-project.org
Message-ID: <48D6CD92.10305@zoo.ufl.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Hello,

I am using glmer() from the lme4 package to run generalized linear mixed models. However, I am having a problem extracting the standard errors for the fixed effects.

I have used:

summary(model)$coef
fixed.effects(model)
coef(model)

to get out the parameter estimates, but do not seem able to extract the se's.

Anybody have a solution?

Thanks,
John

------------------------------

Message: 30
Date: Sun, 21 Sep 2008 18:52:10 -0400
From: "jim holtman" <jholtman@gmail.com>
Subject: Re: [R] r format questions
To: DS <ds5j@excite.com>
Cc: r-help@r-project.org
Message-ID: <644e1f320809211552s76b1447fg312e57f7b9dba3a3@mail.gmail.com>
Content-Type: text/plain

You have to explicitly ask that they not be printed:

> x <- aggregate(state.x77, list(Region = state.region), mean)
> x
         Region Population   Income Illiteracy Life Exp    Murder  HS Grad    Frost      Area
1     Northeast   5495.111 4570.222   1.000000 71.26444  4.722222 53.96667 132.7778  18141.00
2         South   4208.125 4011.938   1.737500 69.70625 10.581250 44.34375  64.6250  54605.12
3 North Central   4803.000 4611.083   0.700000 71.76667  5.275000 54.51667 138.8333  62652.00
4          West   2915.308 4702.615   1.023077 71.23462  7.215385 62.00000 102.1538 134463.00
> print(x, row.names=FALSE)
        Region Population   Income Illiteracy Life Exp    Murder  HS Grad    Frost      Area
     Northeast   5495.111 4570.222   1.000000 71.26444  4.722222 53.96667 132.7778  18141.00
         South   4208.125 4011.938   1.737500 69.70625 10.581250 44.34375  64.6250  54605.12
 North Central   4803.000 4611.083   0.700000 71.76667  5.275000 54.51667 138.8333  62652.00
          West   2915.308 4702.615   1.023077 71.23462  7.215385 62.00000 102.1538 134463.00
>

On Sun, Sep 21, 2008 at 6:03 PM, DS <ds5j@excite.com> wrote:
> Hi,
>
> 1) I have noticed that when I use the aggregate function it outputs
> row numbers in the results. For example:
> aggregate by product
>
>   group.1   Aggregate
> 1 ProductA  1000400.00
> 2 ProductB 23232323.00
> 3 Missing    232323.00
>
> Is there a way to suppress the numbers in front of the aggregate output? I
> checked and they don't look like columns when I do a summary, so I can't -1
> them away.
>
> 2) Is there an easy way to then take my aggregate matrix and format
> the sum with $ and commas? E.g., instead of 10000 it should show
> $10,000.00.
>
> I am trying to create a report and am piping the aggregate into an xtable
> and feeding it to R2HTML.
>
> thanks
> Dhruv
>

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
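
A small sketch covering both parts of the quoted question, assuming a toy data frame named sales that reuses the figures from the post (the data frame and its column names are invented here). print(..., row.names = FALSE) hides the row numbers, and formatC() with format = "f" keeps two decimals alongside the thousands separator.

```r
sales <- data.frame(Group     = c("ProductA", "ProductB", "Missing"),
                    Aggregate = c(1000400, 23232323, 232323))

# Fixed notation with 2 decimals and comma separators, then prepend "$"
sales$Aggregate <- paste0("$", formatC(sales$Aggregate, format = "f",
                                       digits = 2, big.mark = ","))

print(sales, row.names = FALSE)   # suppress the leading row numbers
```

The formatted column is character, not numeric, so this step belongs at the very end, just before the table is handed to the report.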
[[alternative HTML version deleted]] ------------------------------ Message: 31 Date: Sun, 21 Sep 2008 18:56:17 -0400 From: "jim holtman" <jholtman@gmail.com> Subject: Re: [R] r format questions To: DS <ds5j@excite.com> Cc: r-help@r-project.org Message-ID: <644e1f320809211556r2023e1c3od632186ebe3d0447@mail.gmail.com> Content-Type: text/plain answer to your second question:> paste("$", format(1234567.77, big.mark=','), sep='')[1] "$1,234,568">you will have to go through each column you want and explicitly do it:> xRegion Population Income Illiteracy Life Exp Murder HS Grad Frost Area 1 Northeast 5495.111 4570.222 1.000000 71.26444 4.722222 53.96667 132.7778 18141.00 2 South 4208.125 4011.938 1.737500 69.70625 10.581250 44.34375 64.6250 54605.12 3 North Central 4803.000 4611.083 0.700000 71.76667 5.275000 54.51667 138.8333 62652.00 4 West 2915.308 4702.615 1.023077 71.23462 7.215385 62.00000 102.1538 134463.00> x$Population <- paste("$", format(x$Population, big.mark=','), sep='') > xRegion Population Income Illiteracy Life Exp Murder HS Grad Frost Area 1 Northeast $5,495.111 4570.222 1.000000 71.26444 4.722222 53.96667 132.7778 18141.00 2 South $4,208.125 4011.938 1.737500 69.70625 10.581250 44.34375 64.6250 54605.12 3 North Central $4,803.000 4611.083 0.700000 71.76667 5.275000 54.51667 138.8333 62652.00 4 West $2,915.308 4702.615 1.023077 71.23462 7.215385 62.00000 102.1538 134463.00>On Sun, Sep 21, 2008 at 6:03 PM, DS <ds5j@excite.com> wrote:> Hi, > > 1) I have noticed that when I use the aggregate function it outputs > numbers in the results. for example: > aggregate by product > > group.1 Aggregate > 1 ProductA 1000400.00 > 2 ProductB 23232323.00 > 3 Missing 232323.00 > > is there a way to suppress the numbers infront of aggregate outputs. I > checked and they don't look like columns when I do a summary so I can't -1 > them away. > > 2) is there an easy way to then take my aggregate matrix and then format > the sum wtih $ and commas. 
for e.g instead 10000 it should show > $10,000.00? > > I am trying to create a report and am piping the aggregate into an xtable > and feeding it R2html. > > thanks > Dhruv > > ------------------------------------------------------------ > Medical Billing and Coding Training > > ools. > > http://tagline.excite.com/fc/JkJQPTgMxYL8ba16zHPqHis4q6x4p3rbpaGcEAQIui8YyxCoQBNUxa/ > [[alternative HTML version deleted]] > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html> > and provide commented, minimal, self-contained, reproducible code. >-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? [[alternative HTML version deleted]] ------------------------------ Message: 32 Date: Sun, 21 Sep 2008 16:00:47 -0700 (PDT) From: gcam032 <gcam032@gmail.com> Subject: Re: [R] Variable Selection for data reduction and discriminant anlaysis To: r-help@r-project.org Message-ID: <19599461.post@talk.nabble.com> Content-Type: text/plain; charset=us-ascii Thanks Mark, I failed to mention that i'm working within a compositional framework. I didn't want to confuse things. My data is transformed to the clr or alr under Aitchison geometry, so I am essentially working in Euclidean space. Has anyone had experience doing stepwise LDA?? I can't for the life of me find any help online about where to start. Thanks Gareth quote author="Mark Difford"> Hi Gareth,>> If I use the full composition (31 elements or variables), I can get >> reasonable separation of my 6 sources.A word of advice: You need to be exceptionally careful when analyzing compositional data. Taking compositions puts your data values into a constrained/bounded space (generally called a simplex) so that most standard statistical procedures (i.e. 
anything that uses a Euclidean metric, and most do) deliver erroneous results. Pearson wrote a paper on this long ago, but it's generally been ignored (except by Aitchison and the Spanish School of mathematical statisticians). The problem is comparatively well known to geologists, who work with compositional much of the time. R has a very good package for analysing this data-type: see the compositions package (a new release seems iminent). You will be able to get most of the main references from it. (The authors of the package also have a newly-released article in one of the Elsevier journals [unfor. my bib+ are elsewhere so I cannot give details]). You could start by Wiki'ing your way to "compositional data". HTH, Mark. Gareth Campbell wrote:> > Hello all, > > I'm dealing with geochemical analyses of some rocks. > > If I use the full composition (31 elements or variables), I can get > reasonable separation of my 6 sources. Then when I go onto do LDA with > the > 6 groups, I get excellent separation. > > I feel like I should be reducing the variables to thos that are providing > the most discrimination between the groups as this is important > information > for me. I struggle to interpret the PCA plot in a way that helps me (due > to > the large number of elements). So I'm trying to do some sort of step-wise > variable selection. > > I would love to hear from someone (possibly a geochemist or similar) who > does this regularly to determine the best course of action in R to do > this. 
> > > Thanks very much > > > -- > Gareth Campbell > PhD Candidate > The University of Auckland > > P +649 815 3670 > M +6421 256 3511 > E gareth.campbell@esr.cri.nz > gcam032@gmail.com > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > >-- View this message in context: http://www.nabble.com/Variable-Selection-for-data-reduction-and-discriminant-anlaysis-tp19591270p19599461.html Sent from the R help mailing list archive at Nabble.com. ------------------------------ Message: 33 Date: Mon, 22 Sep 2008 01:19:48 +0200 From: p@fo76.org Subject: Re: [R] Multiple plots per window To: r-help@r-project.org Message-ID: <20080922011948.xdupj04hyv4w4c08@webmail.openit.de> Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes"; format="flowed" sorry, as Mark Leeds pointed out to me, the row/column numbers where mixed up in my example... happens when you cut & paste like mad from your history... it should read as follows: sales2.1 <- c(153,189,221,215,302,223,201,173,121,106,86,87,108, 133,177,241,228,283,255,238,164,128,108,87,74,95, 145,200,187,201,292,220,233,172,119,81,65,76,74, 111,170,243,178,248,202,163,139,120,96,95,53,94) sales2.1.matrix <- sales2.1 dim(sales2.1.matrix) <- c(13,4) sales2.1.dataframe <- as.data.frame(sales2.1.matrix) names(sales2.1.dataframe) <- c("1995","1996","1997","1998") Peter Quoting p@fo76.org:> Hi all, > > I'm currently working through "The Analysis of Time Series" by Chris > Chatfield. In order to also get a better understanding of R, I play > around with the examples and Exercises (no homework or assignement, > just selfstudy!!). 
> > Exercise 2.1 gives the following dataset (sales figures for 4 week > intervals): > >> sales2.1.dataframe > 1995 1996 1997 1998 > 1 153 133 145 111 > 2 189 177 200 170 > 3 221 241 187 243 > 4 215 228 201 178 > 5 302 283 292 248 > 6 223 255 220 202 > 7 201 238 233 163 > 8 173 164 172 139 > 9 121 128 119 120 > 10 106 108 81 96 > 11 86 87 65 95 > 12 87 74 76 53 > 13 108 95 74 94 > > I want to plot the histograms/densities for all four years in one window. > After trying out a couple of things, I finally ended up with the following > (it took me two hours - Ouch!): > > sales2.1 <- c(153,189,221,215,302,223,201,173,121,106,86,87,108, > 133,177,241,228,283,255,238,164,128,108,87,74,95, > 145,200,187,201,292,220,233,172,119,81,65,76,74, > 111,170,243,178,248,202,163,139,120,96,95,53,94) > sales2.1.matrix <- sales2.1 > dim(sales2.1.matrix) <- c(4,13) > sales2.1.dataframe <- as.data.frame(sales2.1.matrix) > names(sales2.1.dataframe) <- c("1995","1996","1997","1998") > > X11() > split.screen(c(2,2)) > for (i in 1:4) > { > screen(i) > hist(sales2.1.dataframe[[i]], > probability=T, > xlim=c(0,400), > ylim=c(0,0.006), > main=names(sales2.1.dataframe)[i], > xlab="Sales") > lines(density(sales2.1.dataframe[[i]])) > } > close.screen(all=TRUE) > > Although I'm happy that I finally got something that is pretty close > to what I wanted, I'm not sure whether this is the best or most elegant > way to do it. How would you do it? What functions/packages should I > look into, in order to improve these plots? 
>
> Thanks in advance for your comments and suggestions,
>
> Peter

------------------------------

Message: 34
Date: Mon, 22 Sep 2008 01:49:32 +0200
From: "Weiss, Bernd " <bernd.weiss@uni-koeln.de>
Subject: Re: [R] glmer -- extracting standard errors and other statistics
To: jpoulsen@zoo.ufl.edu, r-help@r-project.org
Message-ID: <48D6DD8C.8030405@uni-koeln.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

John Poulsen wrote:
> Hello,
>
> I am using glmer() from lmer(lme4) to run generalized linear mixed
> models. However, I am having a problem extracting the standard errors
> for the fixed effects.
>
> I have used:
>
> summary(model)$coef
> fixed.effects(model)
> coef(model)
>
> to get out the parameter estimates, but do not seem able to extract the
> se's.
>
> Anybody have a solution?

You need to extract the variance-covariance matrix of the fixed effects; the standard errors are the square roots of its diagonal:

library(lme4)
# cbpp is an example data set shipped with lme4
gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             family = binomial, data = cbpp)
# standard errors of the fixed-effect estimates
sqrt(diag(vcov(gm1)))

HTH,

Bernd

------------------------------

Message: 35
Date: Sun, 21 Sep 2008 18:01:57 -0700 (PDT)
From: Ted Byers <r.ted.byers@gmail.com>
Subject: [R] Why isn't R recognising integers as numbers?
To: r-help@r-project.org
Message-ID: <19600308.post@talk.nabble.com>
Content-Type: text/plain; charset=us-ascii

I have a number of files containing anywhere from a few dozen to a few thousand integers, one per record.

The statement "refdata18 = read.csv("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv", header = TRUE,na.strings="")" works fine, and if I type refdata18, I get the integers displayed, one value per record (along with a record number).
However, when I try " fitdistr(refdata18,"negative binomial")", or hist.scott(refdata18, prob = TRUE), I get an error: Error in fitdistr(refdata18, "negative binomial") : 'x' must be a non-empty numeric vector Or Error in hist.default(x, nclass.scott(x), prob = prob, xlab = xlab, ...) : 'x' must be numeric How can it not recognise integers as numbers? Thanks Ted -- View this message in context: http://www.nabble.com/Why-isn%27t-R-recognising-integers-as-numbers--tp19600308p19600308.html Sent from the R help mailing list archive at Nabble.com. ------------------------------ Message: 36 Date: Sun, 21 Sep 2008 21:12:49 -0400 From: "jim holtman" <jholtman@gmail.com> Subject: Re: [R] Why isn't R recognising integers as numbers? To: "Ted Byers" <r.ted.byers@gmail.com> Cc: r-help@r-project.org Message-ID: <644e1f320809211812t7a82ac5dy98208d60b3007ef8@mail.gmail.com> Content-Type: text/plain best guess is that they are not integers. Do 'str' on your object and it probably says they are 'factors'. This is probably due to some of your data being non-numeric. Try using 'colClasses' on read.csv to specify what the column should contain. Also try "scan" after skipping the first record if it is a header:> scan("", what=0L) # bad input after specifying integer1: 1 2 3 4 5: 1 v 5: Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : scan() expected 'an integer', got 'v'> scan("", what=0L) # good input1: 1 2: 2 3: 3 4: Read 3 items [1] 1 2 3>On Sun, Sep 21, 2008 at 9:01 PM, Ted Byers <r.ted.byers@gmail.com> wrote:> > I have a number of files containing anywhere from a few dozen to a few > thousand integers, one per record. > > The statement "refdata18 > read.csv("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv", header > TRUE,na.strings="")" works fine, and if I type refdata18, I get the > integers > displayed, one value per record (along with a record number). 
However, > when > I try " fitdistr(refdata18,"negative binomial")", or hist.scott(refdata18, > prob = TRUE), I get an error: > > Error in fitdistr(refdata18, "negative binomial") : > 'x' must be a non-empty numeric vector > Or > Error in hist.default(x, nclass.scott(x), prob = prob, xlab = xlab, ...) : > 'x' must be numeric > > How can it not recognise integers as numbers? > > Thanks > > Ted > -- > View this message in context: > http://www.nabble.com/Why-isn%27t-R-recognising-integers-as-numbers--tp19600308p19600308.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html> > and provide commented, minimal, self-contained, reproducible code. >-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? [[alternative HTML version deleted]] ------------------------------ Message: 37 Date: Sun, 21 Sep 2008 21:21:21 -0400 From: "Gabor Grothendieck" <ggrothendieck@gmail.com> Subject: Re: [R] Multiple plots per window To: p@fo76.org Cc: r-help@r-project.org Message-ID: <971536df0809211821u2b99348ai36b1952bc127695d@mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 Here are two ways: one using classic graphics and one much shorter way using lattice. ggplot2 would be a another short way (not shown). Lines <- "1995 1996 1997 1998 153 133 145 111 189 177 200 170 221 241 187 243 215 228 201 178 302 283 292 248 223 255 220 202 201 238 233 163 173 164 172 139 121 128 119 120 106 108 81 96 86 87 65 95 87 74 76 53 108 95 74 94" # read in data and remove the X from the column names s <- read.table(textConnection(Lines), header = TRUE) names(s) <- sub("X", "", names(s)) # 1. 
using classic graphics # find overall ranges of x and y h <- lapply(s, hist, probability = TRUE) ylim <- range(unlist(lapply(h, "[[", "density"))) xlim <- range(unlist(lapply(h, "[[", "breaks"))) # plot opar <- par(mfrow = c(2, 2)) for(i in 1:length(s)) { hist(s[[i]], main = names(s)[i], probability = TRUE, xlab = "Sales", xlim = xlim, ylim = ylim) lines(density(s[[i]])) } par(opar) # 2. using lattice its a bit easier library(lattice) histogram( ~ values | ind, stack(s), type = "density", panel = function(...) { panel.histogram(...) panel.densityplot(...) } ) On Sun, Sep 21, 2008 at 7:19 PM, <p@fo76.org> wrote:> sorry, as Mark Leeds pointed out to me, the row/column numbers where > mixed up in my example... happens when you cut & paste like mad from > your history... it should read as follows: > > sales2.1 <- c(153,189,221,215,302,223,201,173,121,106,86,87,108, > 133,177,241,228,283,255,238,164,128,108,87,74,95, > 145,200,187,201,292,220,233,172,119,81,65,76,74, > 111,170,243,178,248,202,163,139,120,96,95,53,94) > > sales2.1.matrix <- sales2.1 > dim(sales2.1.matrix) <- c(13,4) > > sales2.1.dataframe <- as.data.frame(sales2.1.matrix) > names(sales2.1.dataframe) <- c("1995","1996","1997","1998") > > Peter > > Quoting p@fo76.org: > >> Hi all, >> >> I'm currently working through "The Analysis of Time Series" by Chris >> Chatfield. In order to also get a better understanding of R, I play >> around with the examples and Exercises (no homework or assignement, >> just selfstudy!!). >> >> Exercise 2.1 gives the following dataset (sales figures for 4 week >> intervals): >> >>> sales2.1.dataframe >> >> 1995 1996 1997 1998 >> 1 153 133 145 111 >> 2 189 177 200 170 >> 3 221 241 187 243 >> 4 215 228 201 178 >> 5 302 283 292 248 >> 6 223 255 220 202 >> 7 201 238 233 163 >> 8 173 164 172 139 >> 9 121 128 119 120 >> 10 106 108 81 96 >> 11 86 87 65 95 >> 12 87 74 76 53 >> 13 108 95 74 94 >> >> I want to plot the histograms/densities for all four years in one window. 
>> After trying out a couple of things, I finally ended up with the following >> (it took me two hours - Ouch!): >> >> sales2.1 <- c(153,189,221,215,302,223,201,173,121,106,86,87,108, >> 133,177,241,228,283,255,238,164,128,108,87,74,95, >> 145,200,187,201,292,220,233,172,119,81,65,76,74, >> 111,170,243,178,248,202,163,139,120,96,95,53,94) >> sales2.1.matrix <- sales2.1 >> dim(sales2.1.matrix) <- c(4,13) >> sales2.1.dataframe <- as.data.frame(sales2.1.matrix) >> names(sales2.1.dataframe) <- c("1995","1996","1997","1998") >> >> X11() >> split.screen(c(2,2)) >> for (i in 1:4) >> { >> screen(i) >> hist(sales2.1.dataframe[[i]], >> probability=T, >> xlim=c(0,400), >> ylim=c(0,0.006), >> main=names(sales2.1.dataframe)[i], >> xlab="Sales") >> lines(density(sales2.1.dataframe[[i]])) >> } >> close.screen(all=TRUE) >> >> Although I'm happy that I finally got something that is pretty close >> to what I wanted, I'm not sure whether this is the best or most elegant >> way to do it. How would you do it? What functions/packages should I >> look into, in order to improve these plots? >> >> Thanks in advance for your comments and suggestions, >> >> Peter >> >> ______________________________________________ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >------------------------------ Message: 38 Date: Sun, 21 Sep 2008 20:44:50 -0500 From: Marc Schwartz <marc_schwartz@comcast.net> Subject: Re: [R] Why isn't R recognising integers as numbers? 
To: Ted Byers <r.ted.byers@gmail.com> Cc: r-help@r-project.org Message-ID: <48D6F892.3080901@comcast.net> Content-Type: text/plain; charset=ISO-8859-1 on 09/21/2008 08:01 PM Ted Byers wrote:> I have a number of files containing anywhere from a few dozen to a few > thousand integers, one per record. > > The statement "refdata18 > read.csv("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv", header > TRUE,na.strings="")" works fine, and if I type refdata18, I get the integers > displayed, one value per record (along with a record number). However, when > I try " fitdistr(refdata18,"negative binomial")", or hist.scott(refdata18, > prob = TRUE), I get an error: > > Error in fitdistr(refdata18, "negative binomial") : > 'x' must be a non-empty numeric vector > Or > Error in hist.default(x, nclass.scott(x), prob = prob, xlab = xlab, ...) : > 'x' must be numeric > > How can it not recognise integers as numbers? > > Thanks > > Ted'refdata18' is a data frame and the two functions are expecting a numeric vector. If you use: fitdistr(refdata18[, 1], "negative binomial") or hist(refdata18[, 1]) you should get a suitable result, presuming that the first column in the data frame is a numeric vector. Use: str(refdata18) to get a sense for the structure of the data frame, including the column names, which you could then use, instead of the above index based syntax. HTH, Marc Schwartz ------------------------------ Message: 39 Date: Sun, 21 Sep 2008 18:56:48 -0700 (PDT) From: Ted Byers <r.ted.byers@gmail.com> Subject: Re: [R] Why isn't R recognising integers as numbers? To: r-help@r-project.org Message-ID: <19600695.post@talk.nabble.com> Content-Type: text/plain; charset=us-ascii Thanks Jim, Alas, it wasn't this. Here is the output from both of your suggestions:> refdata18 = read.csv("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv", > header = TRUE,na.strings="") > str(refdata18)'data.frame': 341 obs. 
of 1 variable: $ X0: int 0 0 0 0 0 0 0 0 0 0 ...> scan("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv", what=0L)Read 342 items [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [26] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [51] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [76] 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 [101] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 [126] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 [151] 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 [176] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 [201] 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 [226] 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 [251] 6 6 6 6 6 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7 7 [276] 7 7 7 8 8 8 8 9 9 9 9 9 9 9 9 9 10 10 10 10 10 10 10 10 10 [301] 11 11 11 11 11 11 11 11 11 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 [326] 12 12 12 18 18 18 18 18 18 18 18 18 18 18 18 18 18 Thanks anyway. Ted>jholtman wrote:> > best guess is that they are not integers. Do 'str' on your object and it > probably says they are 'factors'. This is probably due to some of your > data > being non-numeric. Try using 'colClasses' on read.csv to specify what the > column should contain. Also try "scan" after skipping the first record if > it is a header: > >> scan("", what=0L) # bad input after specifying integer > 1: 1 2 3 4 > 5: 1 v > 5: > Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, > : > scan() expected 'an integer', got 'v' >> scan("", what=0L) # good input > 1: 1 > 2: 2 > 3: 3 > 4: > Read 3 items > [1] 1 2 3 >> > > On Sun, Sep 21, 2008 at 9:01 PM, Ted Byers <r.ted.byers@gmail.com> wrote: > >> >> I have a number of files containing anywhere from a few dozen to a few >> thousand integers, one per record. 
>> >> The statement "refdata18 >> read.csv("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv", header >> TRUE,na.strings="")" works fine, and if I type refdata18, I get the >> integers >> displayed, one value per record (along with a record number). However, >> when >> I try " fitdistr(refdata18,"negative binomial")", or >> hist.scott(refdata18, >> prob = TRUE), I get an error: >> >> Error in fitdistr(refdata18, "negative binomial") : >> 'x' must be a non-empty numeric vector >> Or >> Error in hist.default(x, nclass.scott(x), prob = prob, xlab = xlab, ...) >> : >> 'x' must be numeric >> >> How can it not recognise integers as numbers? >> >> Thanks >> >> Ted >> -- >> View this message in context: >> http://www.nabble.com/Why-isn%27t-R-recognising-integers-as-numbers--tp19600308p19600308.html >> Sent from the R help mailing list archive at Nabble.com. >> >> ______________________________________________ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html> >> and provide commented, minimal, self-contained, reproducible code. >> > > > > -- > Jim Holtman > Cincinnati, OH > +1 513 646 9390 > > What is the problem that you are trying to solve? > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > >-- View this message in context: http://www.nabble.com/Why-isn%27t-R-recognising-integers-as-numbers--tp19600308p19600695.html Sent from the R help mailing list archive at Nabble.com. 
------------------------------ Message: 40 Date: Sun, 21 Sep 2008 19:09:29 -0700 (PDT) From: Ted Byers <r.ted.byers@gmail.com> Subject: Re: [R] Why isn't R recognising integers as numbers? To: r-help@r-project.org Message-ID: <19600803.post@talk.nabble.com> Content-Type: text/plain; charset=us-ascii Thanks Marc, That was it. For the last 30 years, I'd write my own code, in FORTRAN, C++, or even Java, to do whatever statistical analysis I needed. When at the office, sometimes I could use SAS, but that hasn't been an option for me in years. This is the first time I have had to load real data into R (instead of generating random data to use while playing with some of the stats functions, or manually typing dummy data). I take it, then, that the result of loading data is a data frame, and not just a matrix or array. Using something like "refdata18[, 1]" feels rather alien, but I'm sure I'll quickly get used to it. I'd seen it before in the R docs, but it didn't register that I had to use it to get the functions of most interest to me to recognise my data as a vector of numbers, given I'd provided only a vector of integers as input. Thanks Ted Marc Schwartz wrote:> > on 09/21/2008 08:01 PM Ted Byers wrote: >> I have a number of files containing anywhere from a few dozen to a few >> thousand integers, one per record. >> >> The statement "refdata18 >> read.csv("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv", header >> TRUE,na.strings="")" works fine, and if I type refdata18, I get the >> integers >> displayed, one value per record (along with a record number). However, >> when >> I try " fitdistr(refdata18,"negative binomial")", or >> hist.scott(refdata18, >> prob = TRUE), I get an error: >> >> Error in fitdistr(refdata18, "negative binomial") : >> 'x' must be a non-empty numeric vector >> Or >> Error in hist.default(x, nclass.scott(x), prob = prob, xlab = xlab, ...) >> : >> 'x' must be numeric >> >> How can it not recognise integers as numbers? 
>> >> Thanks >> >> Ted > > 'refdata18' is a data frame and the two functions are expecting a > numeric vector. > > If you use: > > fitdistr(refdata18[, 1], "negative binomial") > > or > > hist(refdata18[, 1]) > > you should get a suitable result, presuming that the first column in the > data frame is a numeric vector. > > Use: > > str(refdata18) > > to get a sense for the structure of the data frame, including the column > names, which you could then use, instead of the above index based syntax. > > HTH, > > Marc Schwartz > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > >-- View this message in context: http://www.nabble.com/Why-isn%27t-R-recognising-integers-as-numbers--tp19600308p19600803.html Sent from the R help mailing list archive at Nabble.com. ------------------------------ Message: 41 Date: Sun, 21 Sep 2008 21:49:14 -0500 From: Marc Schwartz <marc_schwartz@comcast.net> Subject: Re: [R] Why isn't R recognising integers as numbers? To: Ted Byers <r.ted.byers@gmail.com> Cc: r-help@r-project.org Message-ID: <48D707AA.6040909@comcast.net> Content-Type: text/plain; charset=ISO-8859-1 on 09/21/2008 09:09 PM Ted Byers wrote:> Thanks Marc, > > That was it. > > For the last 30 years, I'd write my own code, in FORTRAN, C++, or even Java, > to do whatever statistical analysis I needed. When at the office, sometimes > I could use SAS, but that hasn't been an option for me in years. > > This is the first time I have had to load real data into R (instead of > generating random data to use while playing with some of the stats > functions, or manually typing dummy data). > > I take it, then, that the result of loading data is a data frame, and not > just a matrix or array. 
Using something like "refdata18[, 1]" feels rather > alien, but I'm sure I'll quickly get used to it. I'd seen it before in the > R docs, but it didn't register that I had to use it to get the functions of > most interest to me to recognise my data as a vector of numbers, given I'd > provided only a vector of integers as input.<snip> Ted, If you read the 'Value' section of ?read.csv, it indicates that the function returns a data frame. It is important to fully read the help page for new functions so that you understand both how they are used and the result(s) of their actions, including the 'Notes' section, which can include further details, including gotchas and idiosyncrasies. A data frame will be the result of read.csv() even if the data source is a single column. Think of a data frame in the same way as a spreadsheet or database table with one or more columns and one or more rows. The unique aspect of a data frame is that each column can be a different data type, though that need not be the case. Thus, you still need to identify the column within the data frame that you wish to manipulate/analyze further. There are various ways of doing this, which are covered in Chapter 6 of "An Introduction to R" on Lists and Data Frames. Some involve the use of indices, others using a column name, as appropriate. There will be situations where they can be interchangeable and others where one method will be superior to the other. Time and experience will provide insight and intuition. There are a myriad of ways of reading data into R and these are covered in the Data Import/Export manual. Not all result in a data frame, but in general and perhaps most commonly, that will be the result. 
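A minimal sketch of the point above, using made-up inline data rather than Ted's file (the column name "count" and the values are assumptions for illustration only). It shows that read.csv() returns a data frame even for a single column, and that the column can be pulled out by index or by name before it is handed to a function that expects a numeric vector:

```r
# Hypothetical one-column file contents, supplied inline via 'text ='
csv_text <- "count\n0\n0\n1\n2\n5"

d <- read.csv(text = csv_text, header = TRUE)

class(d)   # a data frame, even with a single column
str(d)     # shows the column name and its type

# Three equivalent ways to get the column as a plain vector
by_index <- d[, 1]
by_name  <- d$count
by_list  <- d[["count"]]

is.numeric(d)          # the data frame itself is not a numeric vector
is.numeric(by_index)   # the extracted column is
mean(by_index)
```

Extracting by name tends to be easier to read back later; extracting by index is convenient when the column name was generated automatically.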
HTH, Marc ------------------------------ Message: 42 Date: Sun, 21 Sep 2008 19:54:28 -0700 (PDT) Subject: Re: [R] Calculating interval for conditional/unconditional correlation matrix Message-ID: <19678.53863.qm@web32203.mail.mud.yahoo.com> Content-Type: text/plain; charset=utf-8 Hi Ana, There are two problems: First of all, if you want your matrix to have 4 columns it's number of elem[[elided Yahoo spam]] Secondly, and this is what causes your error message, you should not call your second function matrix. Call it matrix1, my_matrix, whatever. Otherwise R thinks that you are calling your matrix function within itself.> Subject: [R] Calculating interval for conditional/unconditional correlation matrix > To: "R" <r-help@r-project.org> > Received: Sunday, 21 September, 2008, 8:05 PM > Hi there, > > Could anyone please help me to understand what should be > done in order not to get this error message: Error: > evaluation nested too deeply: infinite recursion / > options(expressions=)? > > Here is my code: > > determinant<- > function(x){det(matrix(c(1.0,0.2,0.5,0.8,0.2,1.0,0.5,0.6,0.5,0.5,0.5,1.0,x,0.8,0.6,x,1.0),ncol=4,byrow=T))} > > matrix<- > function(x){(matrix(c(1.0,0.2,0.5,0.8,0.2,1.0,0.5,0.6,0.5,0.5,0.5,1.0,x,0.8,0.6,x,1.0),ncol=4,byrow=T))} > > > conditional<-function(x,varcov){ > varcov<-matrix(x) > sigmaxx<-varcov[3,3] > sigmaxz<-varcov[3,1:2] > sigmayy<-varcov[4,4] > sigmayz<-varcov[4,1:2] > sigmazx<-varcov[1:2,3] > sigmazy<-varcov[1:2,4] > sigmazz<-varcov[1:2,1:2] > > (x-sigmaxz%*%solve(sigmaZZ)%*%sigmazy)/sqrt((sigmaxx-sigmaxz%*%solve(sigmaZZ)%*%sigmazx)*(sigmayy-sigmayz%*%solve(sigmaZZ)%*%sigmazy))} > > interval<-uniroot(determinant,lower = min(c(0,1)), upper > = max(c(0,1))) > > I tried also with the code below, but got the same Error > message. 
> > lower.bound<-uniroot(determinant,c(0,0.5))$root > upper.bound<-uniroot(determinant,c(0.51,1))$root > >[[elided Yahoo spam]]> > Ana > > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, > reproducible code.------------------------------ Message: 43 Date: Sun, 21 Sep 2008 18:26:20 -0500 From: Bingshan Li <bli1@bcm.tmc.edu> Subject: Re: [R] How to plot "greater than" symbol on the x-axis To: Gabor Grothendieck <ggrothendieck@gmail.com> Cc: r-help@r-project.org, John Fox <jfox@mcmaster.ca> Message-ID: <48D6D81C.40509@bcm.tmc.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Hi Gabor, This works. This is exactly what I want. According to John Fox's reply, I used expression(NA>=1) and it also worked. Thanks for the kind and clever help. Bingshan Gabor Grothendieck wrote:> On Sun, Sep 21, 2008 at 11:23 AM, Li, Bingshan <bli1@bcm.tmc.edu> wrote: > >> Hi John, >> >> Yes, you are right. I meant "greater-than-or-equal". According to your suggestion, I can plot the symbol only. But what I want is to have >=1, >=2 and so on as labels on xaxis. I did not make it work. Do you know how to make it? The expression("">=1"") did not work, and paste(expression("">=""), 1) >> did not work either. >> >> > > Try this: > > plot(1:10, xaxt = "n") > for(i in 1:10) axis(1, i, bquote(phantom(0) >= .(i))) >------------------------------ Message: 44 Date: Mon, 22 Sep 2008 14:42:15 +1200 From: Paul Murrell <p.murrell@auckland.ac.nz> Subject: Re: [R] PDF fonts problem To: Mihalicza P?ter <mihalicza.peter@eski.hu> Cc: r-help@r-project.org Message-ID: <48D70607.40301@stat.auckland.ac.nz> Content-Type: text/plain; charset=UTF-8 Hi Mihalicza P?ter wrote:> Dear Dr. 
Murrell,
> [[elided Yahoo spam]]
>
> Paul Murrell wrote:
>> Hi
>>
>>>
>>> #CMS
>>> pdf("tryfont-cms.pdf", family="CMS")
>>> grid.text("gg\u151hh\uF6ii\uF3jj kk\u171ll\uFCmm\uFAnn")
>>> dev.off()
>>> #u151 and u171 don't show, though the other accented ones do
>>>
>>> embedFonts("tryfont-cms.pdf",
>>>            outfile="tryfont-cms-embed.pdf",
>>>            fontpaths="/cm-super/afm/")
>>> #after embedding the same "slipping" occurs
>>
>> The 'fontpaths' argument describes where the PFB files are, not where
>> the AFM files are. So this is probably failing to embed the fonts
>> because it can't find the fonts. Does it work if you change to
>> something like ...
>>
>> embedFonts("tryfont-cms.pdf",
>>            outfile="tryfont-cms-embed.pdf",
>>            fontpaths="cm-super/pfb/")
>>
>> Paul
>>
> This solved my problem, so I am really very grateful! I am not too
> familiar with font protocols.
> Just for the sake of knowledge: if my embedFonts specification should
> not have made any difference, why did the output pdf differ from the
> one before embedding?

Your embedFonts() specification (especially your 'fontpaths' argument) *did* make a difference. This function calls ghostscript to perform the embedding and if ghostscript cannot find the PFB files it cannot embed the font. If the PDF file does not have embedded fonts, the PDF reader will use (substitute) its own fonts and the result can look awful.

Paul

> Thanks again,
> Peter

--
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
paul@stat.auckland.ac.nz
http://www.stat.auckland.ac.nz/~paul/

------------------------------

Message: 45
Date: Mon, 22 Sep 2008 11:59:13 +0800 (CST)
Subject: [R] Help for R
To: r-help@r-project.org
Message-ID: <339826.80972.qm@web15908.mail.cnb.yahoo.com>
Content-Type: text/plain

Dear R users,

I've just started learning R and I'm having a problem with it.
I was told the following when I tried to run R:

Error in loadNamespace(package, c(which.lib.loc, lib.loc), keep.source = keep.source) :
  in 'matlab' methods specified for export, but none defined: sum, size, padarray, flipud, fliplr
Error: package/namespace load failed for 'matlab'

Then I tried "package/load in package/matlab"; however, the same message as above was shown to me.

I would appreciate any help and suggestions.

Thanks.

Kai

------------------------------

Message: 46
Date: Sun, 21 Sep 2008 23:08:42 -0500
From: "Matthew Pettis" <matthew.pettis@gmail.com>
Subject: [R] Hmisc and Ubuntu (aptitude install)
To: r-help@r-project.org
Message-ID: <82ba77b80809212108y6baf7850i8b7d76c54bad160c@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Hi,

I'm trying to get the Hmisc module on my Ubuntu Hardy Heron install. I tried getting Hmisc from within R by issuing the standard 'install.packages' command, but it said I needed 'gfortran' to compile. I thought I could circumvent this by using 'aptitude' to get the package 'r-cran-hmisc', but when I got it, the package had critical missing parts (got 404s). So, I'll be trying to go back and download 'gfortran', but can anybody tell me if this aptitude ubuntu package should be kept up to date and is just currently overlooked?

Thanks,
Matt

--
It is from the wellspring of our despair and the places that we are broken that we come to repair the world.
-- Murray Waas ------------------------------ Message: 47 Date: Mon, 22 Sep 2008 00:47:25 -0400 From: "Juliet Hannah" <juliet.hannah@gmail.com> Subject: [R] adding layers in ggplot2 (data and code included) To: r-help@r-project.org Message-ID: <93d6f2a80809212147o5c2e8d4co316396bad5f6217e@mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 Here is some sample data: mydata <- read.table(textConnection("Est Group Tri 0 0 4.639644 1 0 4.579189 2 0 4.590714 0 1 4.443696 1 1 4.588243 2 1 4.650505 0 2 4.296608 1 2 4.826036 2 2 4.765386"),header=TRUE); closeAllConnections(); I can form two plots, scatter and lines, as follows: p <- ggplot(mydata, aes(x=Est, y=Tri)) p + geom_point(aes(colour=factor(Group),shape=factor(Group))) and p+ geom_smooth(aes(group=factor(Group),color=factor(Group)),method=lm,se=F). However, I am unable to have the plots together. I obtain the following error:> p + geom_point(aes(colour=factor(Group),shape=factor(Group)))+geom_smooth(aes(group=factor(Group),color=factor(Group)),method=lm,se=F)Error in `[.data.frame`(df, , var) : undefined columns selected Thanks, Juliet ------------------------------ Message: 48 Date: Mon, 22 Sep 2008 17:30:47 +1200 From: Rolf Turner <r.turner@auckland.ac.nz> Subject: [R] Warnings in fitdistr() from MASS. To: R-help Forum <r-help@r-project.org> Message-ID: <80B3F5B3-0EC0-4E72-BEA9-5B40098294AE@auckland.ac.nz> Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed For a lark, I experimented a bit with the data from Ted Byers' recent postings. The result of fitdistr() seemed sensible, but I was bothered by the warnings about NaNs that arose. Warnings always make me nervous. 
Explicitly this is what I did: TXT <- "0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7 7 7 7 7 8 8 8 8 9 9 9 9 9 9 9 9 9 10 10 10 10 10 10 10 10 10 11 11 11 11 11 11 11 11 11 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 18 18 18 18 18 18 18 18 18 18 18 18 18 18" x <- scan(textConnection(TXT)) closeAllConnections() try.x <- fitdistr(x,"negative binomial") Two warnings about NaNs being produced resulted. Digging into the code with browser() revealed that in the optimization process negative values of "size" were tried on occasion, and this was giving the NaNs. Basically I'm sending this out so that maybe those who are like me and are made nervous by warnings will be able to search the archives and find reassurance that all is actually well. To keep the warnings from the door, one can set an argument "lower" in the call to fitdistr, e.g. eps <- sqrt(.Machine$double.eps) fitdistr(x,"negative binomial",lower=c(eps,eps)) Note that setting lower=c(0,0) doesn't work --- you get an *error* to the [[elided Yahoo spam]] I also tried building my own local version of fitdistr() which had if(distname == "negative binomial" & is.null(Call$lower)) Call$lower <- rep(sqrt(.Machine$double.eps),2) just after the assignment ``Call$hessian <- TRUE''. This *seemed* to work (i.e. prevent those nervous-making warnings and still give the right answer). 
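The mechanism described above can be reproduced in base R. What follows is a hedged sketch, not the code inside fitdistr(): it uses simulated counts rather than Ted's data, a hand-written negative log-likelihood (the name nll is made up for illustration), and optim() with the "L-BFGS-B" method so that lower bounds can be imposed on both parameters.

```r
set.seed(1)
# Simulated counts for illustration only; not the data posted above
x <- rnbinom(300, size = 1, mu = 3)

# A negative 'size' is outside the parameter space: dnbinom() returns NaN
# (with a warning), which is what an unconstrained optimiser can run into
bad <- suppressWarnings(dnbinom(1, size = -1, mu = 2))
is.nan(bad)

# Negative log-likelihood in the (size, mu) parametrisation
nll <- function(p) -sum(dnbinom(x, size = p[1], mu = p[2], log = TRUE))

# Box constraints keep both parameters strictly positive
eps <- sqrt(.Machine$double.eps)
fit <- optim(c(1, mean(x)), nll, method = "L-BFGS-B", lower = c(eps, eps))
fit$par   # estimates of size and mu
```

With the bounds in place the optimiser never evaluates the likelihood at a negative size, so the NaN warnings have no way to arise from that source.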
cheers, Rolf Turner ###################################################################### Attention:\ This e-mail message is privileged and confid...{{dropped:9}} ------------------------------ Message: 49 Date: Mon, 22 Sep 2008 08:01:23 +0200 From: Peter Dalgaard <p.dalgaard@biostat.ku.dk> Subject: Re: [R] Why isn't R recognising integers as numbers? To: Ted Byers <r.ted.byers@gmail.com> Cc: r-help@r-project.org Message-ID: <48D734B3.5090107@biostat.ku.dk> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Ted Byers wrote:> Thanks Jim, > > Alas, it wasn't this. Here is the output from both of your suggestions: > > >> refdata18 = read.csv("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv", >> header = TRUE,na.strings="") >> str(refdata18) >> > 'data.frame': 341 obs. of 1 variable: > $ X0: int 0 0 0 0 0 0 0 0 0 0 ... >Ummm, is there a header line or not? If there isn't, read.csv is going to eat the first observation thinking it is a name (and since it is non-syntactic add an X in front). The scan command looks fine, you just should have assigned it somewhere, x <- scan(......) 
and then fitdistr(x, ....)>> scan("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv", what=0L) >> > Read 342 items > [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 > 0 0 > [26] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 > 0 0 > [51] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 > 0 0 > [76] 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 > 1 1 > [101] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 > 1 1 > [126] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 > 1 1 > [151] 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 > 2 2 > [176] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 > 3 3 > [201] 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 > 4 4 > [226] 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 > 6 6 > [251] 6 6 6 6 6 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 > 7 7 > [276] 7 7 7 8 8 8 8 9 9 9 9 9 9 9 9 9 10 10 10 10 10 10 10 > 10 10 > [301] 11 11 11 11 11 11 11 11 11 12 12 12 12 12 12 12 12 12 12 12 12 12 12 > 12 12 > [326] 12 12 12 18 18 18 18 18 18 18 18 18 18 18 18 18 18 > >-- O__ ---- Peter Dalgaard ?ster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk) FAX: (+45) 35327907 ------------------------------ Message: 50 Date: Sun, 21 Sep 2008 23:10:53 -0700 From: Eric <rmailbox@justemail.net> Subject: Re: [R] adding layers in ggplot2 (data and code included) To: r-help@r-project.org Message-ID: <48D736ED.20904@justemail.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed The way you've attempted to get this result seems to align with the way R "should" work, but it fails in this case. 
The fix is to break things up a little bit:

    p <- ggplot(mydata, aes(x=Est, y=Tri))
    p <- p + geom_point(aes(colour=factor(Group),shape=factor(Group)))
    p <- p + geom_smooth(aes(group=factor(Group),color=factor(Group)),method=lm,se=F)
    p

Eric

Juliet Hannah wrote:
> Here is some sample data:
>
> mydata <- read.table(textConnection("Est Group Tri
> 0 0 4.639644
> 1 0 4.579189
> 2 0 4.590714
> 0 1 4.443696
> 1 1 4.588243
> 2 1 4.650505
> 0 2 4.296608
> 1 2 4.826036
> 2 2 4.765386"),header=TRUE);
> closeAllConnections();
>
> I can form two plots, scatter and lines, as follows:
>
> p <- ggplot(mydata, aes(x=Est, y=Tri))
> p + geom_point(aes(colour=factor(Group),shape=factor(Group)))
>
> and
>
> p+ geom_smooth(aes(group=factor(Group),color=factor(Group)),method=lm,se=F).
>
> However, I am unable to have the plots together.
>
> I obtain the following error:
>
>> p + geom_point(aes(colour=factor(Group),shape=factor(Group)))+geom_smooth(aes(group=factor(Group),color=factor(Group)),method=lm,se=F)
>
> Error in `[.data.frame`(df, , var) : undefined columns selected
>
> Thanks,
>
> Juliet
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

------------------------------

Message: 51
Date: Sun, 21 Sep 2008 23:17:27 -0700
From: <rkevinburton@charter.net>
Subject: [R] Time series (ts) questions.
To: r-help@r-project.org
Message-ID: <20080922021727.S31BZ.130479.root@mp16>
Content-Type: text/plain; charset=utf-8

I have been working with the base time series object (ts) and I had a couple of questions that hopefully this group can help me with:

1) What is the best way to append an observation to an existing time series? Suppose I have a time series:

    t <- ts(1:12, frequency=5)

This would generate two complete cycles and one remainder.
Now I would like to append an observation to this time series. I could use 'c', but then I would need to rebuild the whole time series and I would need to know the frequency etc. I would like some operation like '+' that would simply append the value to the end of the time series (incrementing the last time value so that things like cycle() still output the correct values), but alas t + 10 is already taken by an equally useful operation: adding 10 to each element in the time series (rather than, in this case, appending ts(10,frequency) with a time value of 13 to the time series).

2) What is the best way to get the last time value in a time series? I can do something like:

    (start(t)[2] - 1) + (end(t)[1]-1) * frequency(t) + end(t)[2]

But there has to be an easier way.

Thank you.

Kevin

------------------------------

Message: 52
Date: Mon, 22 Sep 2008 08:43:07 +0200
From: "PALMIER Patrick - CETE NP/INFRA/TRF" <Patrick.Palmier@developpement-durable.gouv.fr>
Subject: [R] Matrix balancing on margins
To: r-help@r-project.org
Message-ID: <48D73E7B.4020702@developpement-durable.gouv.fr>
Content-Type: text/plain

Hello,

Is there any package in R for balancing a matrix? I want to estimate a matrix from:

* an initial matrix (1 everywhere, for example)
* row margins
* column margins
* a distance class vector (each cell of the matrix belongs to a distance class)

and I want the distribution over distance classes to be preserved.

How can I do such a thing? Is there an existing function, or should I write an iterative script myself?
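
Editor's sketch of the "iterative script" asked about above. The usual name for this is iterative proportional fitting (IPF): scale the rows to their target sums, then the columns, then (as a third constraint) the cells of each distance class, and repeat until the margins stop changing. Everything below is an assumption made for illustration: the function name balance_matrix, its arguments, and the small 3 x 3 example are not from any package, and the targets are taken from a made-up matrix so that they are guaranteed to be consistent with each other.

```r
# Iterative proportional fitting on row sums, column sums and,
# optionally, totals per distance class. Illustrative code only.
balance_matrix <- function(seed, row_target, col_target,
                           dist_class = NULL, class_target = NULL,
                           tol = 1e-8, max_iter = 5000) {
  m <- seed
  for (iter in seq_len(max_iter)) {
    # Scale each row so that the row sums match row_target
    m <- m * (row_target / rowSums(m))
    # Scale each column so that the column sums match col_target
    m <- sweep(m, 2, col_target / colSums(m), `*`)
    if (!is.null(dist_class)) {
      # Scale every cell by the factor of the distance class it belongs to
      class_sum <- tapply(as.vector(m), as.vector(dist_class), sum)
      f <- as.vector(class_target[names(class_sum)]) / as.vector(class_sum)
      names(f) <- names(class_sum)
      m <- m * unname(f[as.character(dist_class)])
    }
    # Stop once the row and column margins are reproduced
    err <- max(abs(rowSums(m) - row_target), abs(colSums(m) - col_target))
    if (err < tol) break
  }
  m
}

# Made-up example: targets are derived from 'observed' so they are consistent
observed     <- matrix(c(10, 4, 1, 5, 8, 3, 2, 6, 9), nrow = 3)
dist_class   <- abs(outer(1:3, 1:3, "-")) + 1   # classes 1, 2, 3 by |i - j|
row_target   <- rowSums(observed)
col_target   <- colSums(observed)
class_target <- tapply(as.vector(observed), as.vector(dist_class), sum)

seed <- matrix(1, 3, 3)                          # "1 everywhere"
fit  <- balance_matrix(seed, row_target, col_target, dist_class, class_target)
round(fit, 3)
```

The three sets of targets must share the same grand total, otherwise no matrix can satisfy them all and the loop will simply run to max_iter.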
Thanks -- *Patrick PALMIER** **Centre d'Études Techniques de l'Équipement Nord - Picardie Département Infrastructures */*Trafic -- Socio-économie */2, rue de Bruxelles, BP 275 59019 Lille cedex FRANCE Tél: +33 (0) 3 20 49 60 70 Fax: +33 (0) 3 20 49 63 69 [[alternative HTML version deleted]] ------------------------------ Message: 53 Date: Sun, 21 Sep 2008 23:48:47 -0700 (PDT) From: Mark Difford <mark_difford@yahoo.co.uk> Subject: Re: [R] Variable Selection for data reduction and discriminant anlaysis To: r-help@r-project.org Message-ID: <19602702.post@talk.nabble.com> Content-Type: text/plain; charset=us-ascii Hi Gareth,>> My data is transformed to the clr or alr under Aitchison geometry, so I >> am essentially working >> in Euclidean space.Great: glad to hear it.>> Has anyone had experience doing stepwise LDA?? I can't for the life of >> me find any help >> online about where to start.A better option might be this: Trevor Hastie and a student of his have recently put out a paper that does a step-up from penalized discriminant analysis based, I think, on Trevor's sparse principal component analysis method (in his elasticnet package). http://www-stat.stanford.edu/~hastie/Papers/sda_line.pdf You can get R-code to do the analysis on the first author's website; there's a link in the paper. Bye, Mark. gcam032 wrote:> > Thanks Mark, > > I failed to mention that i'm working within a compositional framework. I > didn't want to confuse things. My data is transformed to the clr or alr > under Aitchison geometry, so I am essentially working in Euclidean space. > > Has anyone had experience doing stepwise LDA?? I can't for the life of me > find any help online about where to start. > > Thanks > > Gareth > > > quote author="Mark Difford"> > Hi Gareth, > >>> If I use the full composition (31 elements or variables), I can get >>> reasonable separation of my 6 sources. > > A word of advice: You need to be exceptionally careful when analyzing > compositional data. 
Taking compositions puts your data values into a > constrained/bounded space (generally called a simplex) so that most > standard statistical procedures (i.e. anything that uses a Euclidean > metric, and most do) deliver erroneous results. Pearson wrote a paper on > this long ago, but it's generally been ignored (except by Aitchison and > the Spanish School of mathematical statisticians). > > The problem is comparatively well known to geologists, who work with > compositional much of the time. R has a very good package for analysing > this data-type: see the compositions package (a new release seems > iminent). You will be able to get most of the main references from it. > (The authors of the package also have a newly-released article in one of > the Elsevier journals [unfor. my bib+ are elsewhere so I cannot give > details]). > > You could start by Wiki'ing your way to "compositional data". > > HTH, Mark. > > > > Gareth Campbell wrote: >> >> Hello all, >> >> I'm dealing with geochemical analyses of some rocks. >> >> If I use the full composition (31 elements or variables), I can get >> reasonable separation of my 6 sources. Then when I go onto do LDA with >> the >> 6 groups, I get excellent separation. >> >> I feel like I should be reducing the variables to thos that are providing >> the most discrimination between the groups as this is important >> information >> for me. I struggle to interpret the PCA plot in a way that helps me (due >> to >> the large number of elements). So I'm trying to do some sort of >> step-wise >> variable selection. >> >> I would love to hear from someone (possibly a geochemist or similar) who >> does this regularly to determine the best course of action in R to do >> this. 
>> >> >> Thanks very much >> >> >> -- >> Gareth Campbell >> PhD Candidate >> The University of Auckland >> >> P +649 815 3670 >> M +6421 256 3511 >> E gareth.campbell@esr.cri.nz >> gcam032@gmail.com >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> >> > >-- View this message in context: http://www.nabble.com/Variable-Selection-for-data-reduction-and-discriminant-anlaysis-tp19591270p19602702.html Sent from the R help mailing list archive at Nabble.com. ------------------------------ Message: 54 Date: Mon, 22 Sep 2008 08:50:21 +0200 From: " Jos? E. Lozano " <lozalojo@jcyl.es> Subject: [R] Manage huge database To: <r-help@stat.math.ethz.ch> Message-ID: <47A455630022D55E@mtacsbs.csbs.jcyl.es> (added by postmaster@jcyl.es) Content-Type: text/plain Hello, Recently I have been trying to open a huge database with no success. It’s a 4GB csv plain text file with around 2000 rows and over 500,000 columns/variables. I have try with The SAS System, but it reads only around 5000 columns, no more. R hangs up when opening. Is there any way to work with “parts” (a set of columns) of this database, since its impossible to manage it all at once? Is there any way to establish a link to the csv file and to state the columns you want to fetch every time you make an analysis? I’ve been searching the net, but found little about this topic. Best regards, Jose Lozano [[alternative HTML version deleted]] ------------------------------ Message: 55 Date: Mon, 22 Sep 2008 08:08:20 +0100 From: "Barry Rowlingson" <b.rowlingson@lancaster.ac.uk> Subject: Re: [R] Manage huge database To: " Jos? E. 
Lozano " <lozalojo@jcyl.es>
Cc: r-help@stat.math.ethz.ch
Message-ID: <d8ad40b50809220008r73daa11fi5d6b845fc1ca3d04@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

2008/9/22 José E. Lozano <lozalojo@jcyl.es>:
> Recently I have been trying to open a huge database with no success.
>
> It's a 4GB csv plain text file with around 2000 rows and over 500,000
> columns/variables.

I wouldn't call a 4GB csv text file a 'database'.

> Is there any way to work with "parts" (a set of columns) of this database,
> since its impossible to manage it all at once?

Yes, use a database. A real database.

> Is there any way to establish a link to the csv file and to state the
> columns you want to fetch every time you make an analysis?

No, but you can establish a link to a database. You want a database. A real relational database.

> I've been searching the net, but found little about this topic.

Try:

http://cran.r-project.org/doc/manuals/R-data.html#Relational-databases

Barry

------------------------------

Message: 56
Date: Mon, 22 Sep 2008 09:16:52 +0200
From: Martin Maechler <maechler@stat.math.ethz.ch>
Subject: Re: [R] Symmetric matrix
To: Dimitris Rizopoulos <d.rizopoulos@erasmusmc.nl>
Message-ID: <18647.18020.828088.828816@stat.math.ethz.ch>
Content-Type: text/plain; charset=us-ascii

>>>>> "DR" == Dimitris Rizopoulos <d.rizopoulos@erasmusmc.nl>
>>>>>     on Sun, 21 Sep 2008 19:58:44 +0200 writes:

    DR> try the following
    DR> a <- matrix(rnorm(36), 6)
    DR> ind <- lower.tri(a)
    DR> a[ind] <- t(a)[ind]
    DR> a

Yes, indeed, it needs the t(.) trick.
Note that the 'Matrix' package has a function forceSymmetric(.) to do this for you (faster, using C code):

    A <- forceSymmetric(Matrix(rnorm(36), 6))

is all you'd need {if you can afford to trash half of the random numbers generated}

Martin Maechler, ETH Zurich

    DR> I hope it helps.
DR> Best, DR> Dimitris DR> Megh Dal wrote: >> I have following matrix : >> >> a = matrix(rnorm(36), 6) >> >> Now I want to replace the lower-triangular elements with it's upper-triangular elements. That is I want to make a symmetric matrix from a. I have tried with lower.tri() and upper.tri() function, but got desired result. Can anyone please tell me how to do that? ------------------------------ Message: 57 Date: Mon, 22 Sep 2008 15:35:04 +0800 From: "Yihui Xie" <xieyihui@gmail.com> Subject: Re: [R] Manage huge database To: " Jos? E. Lozano " <lozalojo@jcyl.es> Cc: r-help@stat.math.ethz.ch Message-ID: <89b6b8c90809220035o3f702624p34cb83000ad6b39f@mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 Hi, You can treat it as a database and use ODBC to fetch data from the CSV file using SQL. See the package RODBC for details about database connections. (I have dealt with similar problems before with RODBC) Regards, Yihui -- Yihui Xie <xieyihui@gmail.com> Phone: +86-(0)10-82509086 Fax: +86-(0)10-82509086 Mobile: +86-15810805877 Homepage: http://www.yihui.name School of Statistics, Room 1037, Mingde Main Building, Renmin University of China, Beijing, 100872, China On Mon, Sep 22, 2008 at 2:50 PM, Jos? E. Lozano <lozalojo@jcyl.es> wrote:> Hello, > > > > Recently I have been trying to open a huge database with no success. > > > > It's a 4GB csv plain text file with around 2000 rows and over 500,000 > columns/variables. > > > > I have try with The SAS System, but it reads only around 5000 columns, no > more. R hangs up when opening. > > > > Is there any way to work with "parts" (a set of columns) of this database, > since its impossible to manage it all at once? > > > > Is there any way to establish a link to the csv file and to state the > columns you want to fetch every time you make an analysis? > > > > I've been searching the net, but found little about this topic. 
> > > > Best regards, > > Jose Lozano > > > [[alternative HTML version deleted]] >------------------------------ Message: 58 Date: Mon, 22 Sep 2008 09:49:09 +0200 From: " Jos? E. Lozano " <lozalojo@jcyl.es> Subject: Re: [R] Manage huge database To: "'Yihui Xie'" <xieyihui@gmail.com> Cc: r-help@stat.math.ethz.ch Message-ID: <47A455630022DC1E@mtacsbs.csbs.jcyl.es> (added by postmaster@jcyl.es) Content-Type: text/plain; charset="iso-8859-1" Hello, Yihui> You can treat it as a database and use ODBC to fetch data from the CSV > file using SQL. See the package RODBC for details about database > connections. (I have dealt with similar problems before with RODBC)Thanks for your tip, I have used RODBC before to read data from MSAccess and MSExcel files, but never I imagined it could work for non-database files such as csv. I will check the RODBC documentation. Best Regards, Jose Lozano ------------------------------------------ Jose E. Lozano Alonso Observatorio de Salud P?blica. Direccion General de Salud P?blica e I+D+I. Junta de Castilla y Le?n. Direccion: Paseo de Zorrilla, n?1. Despacho 3103. CP 47071. Valladolid. ------------------------------ Message: 59 Date: Mon, 22 Sep 2008 10:02:18 +0200 From: " Jos? E. Lozano " <lozalojo@jcyl.es> Subject: Re: [R] Manage huge database To: "'Barry Rowlingson'" <b.rowlingson@lancaster.ac.uk> Cc: r-help@stat.math.ethz.ch Message-ID: <47A455630022DD57@mtacsbs.csbs.jcyl.es> (added by postmaster@jcyl.es) Content-Type: text/plain; charset="iso-8859-1"> I wouldn't call a 4GB csv text file a 'database'.Obviously, a csv it's not a database itself, I tried to mean (though it seems I was not understood) that I had a huge database, exported to csv file by the people who created it (and I don?t have any idea of the original format of the database).> Yes, use a database. A real database.I've used MSAccess and there is a limit of 255 columns, as far as I know, so there is no way of import it. 
Obviously, I won't buy an Oracle license to read this file, so: what database system allows a table with 500000 variables? MySQL? Do I have to split the file into smaller parts, import them as separate tables, and relate them all using an index field?

> No, but you can establish a link to a database. You want a database.
> A real relational database.

> Try:
> http://cran.r-project.org/doc/manuals/R-data.html#Relational-databases

It didn't help, sorry. I knew perfectly well what a relational database is (and I humbly consider myself an advanced user of MSAccess+VBA, only that I've never faced this problem with variables); you should not suppose everyone's stupid, though...

Thanks for your help,

Best regards
Jose Lozano

------------------------------

Message: 60
Date: Fri, 22 Aug 2008 09:15:20 +0100
From: Robin Hankin <rksh1@cam.ac.uk>
Subject: Re: [R] how to keep up with R?
To: a.ramasamy@imperial.ac.uk
Cc: r-help <r-help@stat.math.ethz.ch>, Barry Rowlingson <b.rowlingson@lancaster.ac.uk>
Message-ID: <48AE7598.7060209@cam.ac.uk>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Adaikalavan Ramasamy wrote:
> I agree! The best way to learn (and remember for longer) is to teach
> someone else about it.
>
> And there is no reason not to repeat some of the analysis done on SAS
> with R. That way you can verify your outputs or compare the
> presentations. If you consistently find differences in the outputs,
> then trying to figure out the reason may lead you to better understand
> the methods (e.g. different optimization or estimation procedures).

My take on this: I have repeatedly found that it is surprisingly easy to improve on existing (non-R) implementations of statistical and non-statistical computation, when working in R.
Something about the structure of the language, something about the package mechanism, something about R-help, something about R-core, something about open-source, something about JSS or R-news, whatever it is, there is SOMETHING ABOUT R which lends itself to straightforward production of quality software. And that something is missing from other programming languages, IMO. rksh> Regards, Adai > > > > Barry Rowlingson wrote: >> 2008/9/19 Wensui Liu <liuwensui@gmail.com>: >>> Dear Listers, >>> >>> I've been a big fan of R since graduate school. After working in the >>> industry for years, I haven't had many opportunities to use R and am >>> mainly >>> using SAS. However, I am still forcing myself really hard to stay >>> close to R >>> by reading R-help and books and writing R code by myself for fun. >>> But by and >>> by, I start realizing I have hard time to keep up with R and am >>> afraid that >>> I would totally forget how to program in R. >>> >>> I really like it and am very unwilling to give it up. Is there any >>> idea how >>> I might keep touch with R without using it in work on daily basis? I >>> really >>> appreciate it. >>>-- Robin K. S. Hankin Senior Research Associate Cambridge Centre for Climate Change Mitigation Research (4CMR) Faculty of Economics The University of Cambridge rksh1@cam.ac.uk 01223-764877 ------------------------------ Message: 61 Date: Mon, 22 Sep 2008 09:30:52 +0100 (BST) From: (Ted Harding) <Ted.Harding@manchester.ac.uk> Subject: Re: [R] Why isn't R recognising integers as numbers? To: Ted Byers <r.ted.byers@gmail.com> Cc: r-help@r-project.org Message-ID: <XFMail.080922093052.Ted.Harding@manchester.ac.uk> Content-Type: text/plain; charset=iso-8859-1 Hi Ted (from Ted), Just to clarify Marc's comments about dataframes in more basic terms. If you read in data with read.csv() the result returned by the function is a dataframe. This is a specialised kind of list, which you can think of as a list of "columns" all of the same length. 
You can think of each "column" as a vector of elements, all of which must be of the same type within the column, though the type can vary (e.g. numeric, factor, character) between columns.

When you display a dataframe, it looks like a matrix, though in R terms it is not really a matrix; it is a list, where each component of the list is a "column".

Of course a dataframe, like any list, might have only one component. But it is still a list -- and the actual contents are only available "one layer down", after you have extracted that component by some means (e.g. by using the "$" extractor).

Simple example:

  L <- c(1,2,3,4)   ## vector
  L
  # [1] 1 2 3 4
  L.df <- data.frame(L=L)   ## Dataframe with 1 component named "L"
  L.df
  #   L
  # 1 1
  # 2 2
  # 3 3
  # 4 4
  L.df$L   ## Extract the component named "L"
  # [1] 1 2 3 4   ## Compare with the result of 'L' above

  # Try a regression on L (this works):
  lm(L ~ 1)
  # Call:
  # lm(formula = L ~ 1)
  # Coefficients:
  # (Intercept)
  #         2.5

  # Try a regression on L.df (this doesn't work):
  lm(L.df ~ 1)
  # Error in model.frame.default(formula = L.df ~ 1,
  #   drop.unused.levels = TRUE) :
  #   invalid type (list) for variable 'L.df'

  # But it does after you refer to the component L by name:
  lm(L.df$L ~ 1)
  # Call:
  # lm(formula = L.df$L ~ 1)
  # Coefficients:
  # (Intercept)
  #         2.5

  # or:
  lm(L ~ 1, data=L.df)
  # Call:
  # lm(formula = L ~ 1, data = L.df)
  # Coefficients:
  # (Intercept)
  #         2.5

  # But you can (for a dataframe, not a general list) use an
  # "index" method of extraction *as if* it were a matrix
  # (even though it isn't):
  L.df[,1]
  # [1] 1 2 3 4
  L.df[3,1]
  # [1] 3

  # But compare with:
  L.df[1]
  #   L
  # 1 1
  # 2 2
  # 3 3
  # 4 4

which is essentially the same as L.df itself (e.g. lm(L.df[1] ~ 1) will fail in exactly the same way as lm(L.df ~ 1) did).
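
Editor's sketch, complementing the example above with two quick checks. The first part confirms that a one-column data frame is a list and not a numeric vector, and that "[[" extracts the column just as "$" does. The second part reproduces, on four made-up values, the header issue raised earlier in this thread: with header=TRUE the first value is consumed as a column name. The object names (counts, txt, with_header, no_header) are illustrative assumptions.

```r
# A one-column data frame with made-up counts
counts <- data.frame(count = c(0L, 0L, 1L, 2L))

is.list(counts)                  # the data frame itself is a list of columns
is.numeric(counts)               # the container is not a numeric vector
is.numeric(counts[["count"]])    # the extracted column is
identical(counts[["count"]], counts$count)

# The same four values as headerless CSV text
txt <- "0\n0\n1\n2"
with_header <- read.csv(text = txt, header = TRUE)    # first value becomes a name
no_header   <- read.csv(text = txt, header = FALSE)   # all four values are data

names(with_header)
c(nrow(with_header), nrow(no_header))
```

This is why str() reported a column called X0 and one observation fewer than scan() read from the same file.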
The dataframe structure exists in R because so much data is typically in the row by column (case by variables) layout such as you get in spreadsheets and associated CSV files, and it is very useful to be able to get into this layout directly (and refer to the variables by name, as above). The full generality of a 'list' can also be useful for encapsulating data of a less strictly structured kind, but that is another (longer) story!

Hoping this helps.
Ted.

On 22-Sep-08 02:09:29, Ted Byers wrote:
> Thanks Marc,
> That was it.
>
> For the last 30 years, I'd write my own code, in FORTRAN, C++,
> or even Java, to do whatever statistical analysis I needed.
> When at the office, sometimes I could use SAS, but that hasn't
> been an option for me in years.
>
> This is the first time I have had to load real data into R
> (instead of generating random data to use while playing with
> some of the stats functions, or manually typing dummy data).
>
> I take it, then, that the result of loading data is a data
> frame, and not just a matrix or array. Using something like
> "refdata18[, 1]" feels rather alien, but I'm sure I'll quickly
> get used to it. I'd seen it before in the R docs, but it didn't
> register that I had to use it to get the functions of most
> interest to me to recognise my data as a vector of numbers,
> given I'd provided only a vector of integers as input.
>
> Thanks
>
> Ted
>
>
> Marc Schwartz wrote:
>>
>> on 09/21/2008 08:01 PM Ted Byers wrote:
>>> I have a number of files containing anywhere from a few dozen to a
>>> few thousand integers, one per record.
>>>
>>> The statement "refdata18 =
>>> read.csv("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv", header =
>>> TRUE,na.strings="")" works fine, and if I type refdata18, I get the
>>> integers displayed, one value per record (along with a record number).
>>> However, >>> when >>> I try " fitdistr(refdata18,"negative binomial")", or >>> hist.scott(refdata18, >>> prob = TRUE), I get an error: >>> >>> Error in fitdistr(refdata18, "negative binomial") : >>> 'x' must be a non-empty numeric vector >>> Or >>> Error in hist.default(x, nclass.scott(x), prob = prob, xlab = xlab, >>> ...) >>> : >>> 'x' must be numeric >>> >>> How can it not recognise integers as numbers? >>> >>> Thanks >>> >>> Ted >> >> 'refdata18' is a data frame and the two functions are expecting a >> numeric vector. >> >> If you use: >> >> fitdistr(refdata18[, 1], "negative binomial") >> >> or >> >> hist(refdata18[, 1]) >> >> you should get a suitable result, presuming that the first column in >> the >> data frame is a numeric vector. >> >> Use: >> >> str(refdata18) >> >> to get a sense for the structure of the data frame, including the >> column >> names, which you could then use, instead of the above index based >> syntax. >> >> HTH, >> >> Marc Schwartz >> >> ______________________________________________ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> >> > > -- > View this message in context: > http://www.nabble.com/Why-isn%27t-R-recognising-integers-as-numbers--tp1 > 9600308p19600803.html > Sent from the R help mailing list archive at Nabble.com. 
> > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.-------------------------------------------------------------------- E-Mail: (Ted Harding) <Ted.Harding@manchester.ac.uk> Fax-to-email: +44 (0)870 094 0861 Date: 22-Sep-08 Time: 09:30:47 ------------------------------ XFMail ------------------------------ ------------------------------ Message: 62 Date: Mon, 22 Sep 2008 09:41:30 +0100 From: "Barry Rowlingson" <b.rowlingson@lancaster.ac.uk> Subject: Re: [R] Manage huge database To: " Jos? E. Lozano " <lozalojo@jcyl.es> Cc: r-help@stat.math.ethz.ch Message-ID: <d8ad40b50809220141l5274bf8fw29d36784de519eab@mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 2008/9/22 Jos? E. Lozano <lozalojo@jcyl.es>:>> I wouldn't call a 4GB csv text file a 'database'.> It didn't help, sorry. I perfectly knew what a relational database is (and I > humbly consider myself an advanced user on working with MSAccess+VBA, only > that I've never face this problem with variables), you should not suppose > everyone's stupid, though...[[elided Yahoo spam]] A bit more googling tells me both MySQL and PostgreSQL have limits of a few thousand on the number of columns in a table, not a few hundred thousand. An insightful comment on one mailing list is: "Of course, the real bottom line is that if you think you need more than order-of-a-hundred columns, your database design probably needs revision anyway ;-)" So, how much "design" is in this data? If none, and what you've basically got is a 2000x500000 grid of numbers, then maybe a more raw binary-type format will help - HDF or netCDF? Although I'm not sure how much R support for reading slices of these formats exists, you may be able to use an external utility to write slices out on demand. 
Random access to parts of these files is pretty fast.

http://cran.r-project.org/web/packages/RNetCDF/index.html
http://cran.r-project.org/web/packages/hdf5/index.html

Thinking back to your 4GB file with 1,000,000,000 entries, that's only 3 bytes per entry (+1 for the comma). What is this data? There may be more efficient ways to handle it.

Hope *that* helps...

Barry

------------------------------

End of R-help Digest, Vol 67, Issue 23
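
Editor's sketch for the "Manage huge database" thread above: one base-R way to fetch only a set of columns from a CSV file is the colClasses argument of read.csv(), where the string "NULL" tells R to skip a column. The file here is a tiny stand-in written to a temporary path, and the names path, wide, cols, keep, classes and part are illustrative assumptions. R still has to scan every line of the file, so this is likely to save memory rather than reading time, and it has not been tried on anything like 500,000 columns.

```r
# Write a small stand-in for the wide file (6 columns instead of 500,000)
path <- tempfile(fileext = ".csv")
wide <- as.data.frame(matrix(1:24, nrow = 4,
                             dimnames = list(NULL, paste0("v", 1:6))))
write.csv(wide, path, row.names = FALSE)

# Read a single row just to learn the column names
cols <- names(read.csv(path, nrows = 1))

# "NULL" drops a column while reading; NA lets read.csv() guess the type
keep    <- c("v2", "v5")
classes <- ifelse(cols %in% keep, NA, "NULL")
part    <- read.csv(path, colClasses = classes)
part

# The back-of-envelope figure from the thread:
# bytes per entry in a 4 GB file holding 2000 x 500000 values
bytes_per_entry <- 4e9 / (2000 * 500000)
bytes_per_entry
```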
Marc Schwartz
2008-Sep-22 21:07 UTC
[R] Warranty on Accuracy, Precision, Legality, ... of R in Research
on 09/22/2008 11:26 AM Bert Chan wrote:
> Warranty on Accuracy, Precision, Legality, ... of R in Research
>
> (These questions may well have been raised.)
>
> What is the implied warranty of using R for research & publications, consulting, etc.?
>
> Alternately, how does one obtain such a warranty?
>
> Your answers will be much appreciated.
>
> Perhaps you can point me to some websites which discussed this subject in the past.
>
> Thanks & regards -
>
> Bert
>
> (Bertram K. C. Chan, PhD)

As per the banner that appears whenever you start up R:

"R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details."

The suitability of R for any particular application is entirely up to the user.

Legally, there is nothing preventing you from using R for such applications relative to the license under which R is made available.

You did not indicate the specific type of research you have in mind, but if it might be in the domain of clinical trials, please review:

http://www.r-project.org/doc/R-FDA.pdf

HTH,

Marc Schwartz