thr3ads.net - R help - [R] R-help Digest, Vol 67, Issue 23 [Sep 2008]


Bert Chan

2008-Sep-22 16:26 UTC

[R] R-help Digest, Vol 67, Issue 23

Warranty on Accuracy, Precision, Legality, ... of R in Research

(These questions may well have been raised.)

What is the implied warranty of using R for research & publications,
consulting, etc.?

Alternately, how does one obtain such a warranty?

Your answers will be much appreciated.

Perhaps you can point me to some websites which discussed this subject in the
past.

Thanks & regards -

Bert

(Bertram K. C. Chan, PhD)


----- Original Message ----
From: "r-help-request@r-project.org"
<r-help-request@r-project.org>
To: r-help@r-project.org
Sent: Monday, September 22, 2008 3:00:04 AM
Subject: R-help Digest, Vol 67, Issue 23

Send R-help mailing list submissions to
    r-help@r-project.org

To subscribe or unsubscribe via the World Wide Web, visit
    https://stat.ethz.ch/mailman/listinfo/r-help
or, via email, send a message with subject or body 'help' to
    r-help-request@r-project.org

You can reach the person managing the list at
    r-help-owner@r-project.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of R-help digest..."


Today's Topics:

   1. Calculating interval for conditional/unconditional
      correlation matrix (Ana Kolar)
   2. How to plot "greater than" symbol on the x-axis (Li, Bingshan)
   3. Re: How to plot "greater than" symbol on the x-axis (John Fox)
   4. Re: Design lrm function (milicic.marko)
   5. Re: How to plot "greater than" symbol on the x-axis
      (Li, Bingshan)
   6. Re: How to plot "greater than" symbol on the x-axis (John Fox)
   7. Re: periodicity validation (stephen sefick)
   8. Task View for Chemometrics and Computational Physics
      (Katharine Mullen)
   9. Re: Variable Selection for data reduction and discriminant
      anlaysis (Katharine Mullen)
  10. Re: How to plot "greater than" symbol on the x-axis
      (Henrik Bengtsson)
  11. Re: How to plot "greater than" symbol on the x-axis
      (Henrik Bengtsson)
  12. Re: How to plot "greater than" symbol on the x-axis
      (Gabor Grothendieck)
  13. Symmetric matrix (Megh Dal)
  14. Re: Symmetric matrix (Jorge Ivan Velez)
  15. Re: Symmetric matrix (Dimitris Rizopoulos)
  16. R Map using SAS data (Junjie Zhang)
  17. Re: How to plot "greater than" symbol on the x-axis
      (Li, Bingshan)
  18. Re: Symmetric matrix (Peter Dalgaard)
  19. Re: removing a word, the following space and the next word
      (Rolf Turner)
  20. Re: fitting a hyperbole (Rolf Turner)
  21. Re: Unexpected behaviour when testing for independence, with
      multiple factors (Javier Acuña)
  22. Re: Unexpected behaviour when testing for independence with
      multiple factors (Javier Acuña)
  23. r format questions (DS)
  24. design question on piping multiple data sets from 1 file into
      R (DS)
  25. color for lattice box plots (Tom Bonen)
  26. suppress legend in ggplot(data, aes(y=Y, x=X,fill=Z))? (Tom Bonen)
  27. Re: selecting from a series of integers with pre-determined
      probabilities (Bert Gunter)
  28. Multiple plots per window (p@fo76.org)
  29. glmer -- extracting standard errors and other statistics
      (John Poulsen)
  30. Re: r format questions (jim holtman)
  31. Re: r format questions (jim holtman)
  32. Re: Variable Selection for data reduction and discriminant
      anlaysis (gcam032)
  33. Re: Multiple plots per window (p@fo76.org)
  34. Re: glmer -- extracting standard errors and other statistics
      (Weiss, Bernd )
  35.  Why isn't R recognising integers as numbers? (Ted Byers)
  36. Re: Why isn't R recognising integers as numbers? (jim holtman)
  37. Re: Multiple plots per window (Gabor Grothendieck)
  38. Re: Why isn't R recognising integers as numbers? (Marc Schwartz)
  39. Re: Why isn't R recognising integers as numbers? (Ted Byers)
  40. Re: Why isn't R recognising integers as numbers? (Ted Byers)
  41. Re: Why isn't R recognising integers as numbers? (Marc Schwartz)
  42. Re: Calculating interval for conditional/unconditional
      correlation matrix (Moshe Olshansky)
  43. Re: How to plot "greater than" symbol on the x-axis
      (Bingshan Li)
  44. Re: PDF fonts problem (Paul Murrell)
  45. Help for R (Mac)
  46. Hmisc and Ubuntu (aptitude install) (Matthew Pettis)
  47. adding layers in ggplot2 (data and code included) (Juliet Hannah)
  48. Warnings in fitdistr() from MASS. (Rolf Turner)
  49. Re: Why isn't R recognising integers as numbers? (Peter Dalgaard)
  50. Re: adding layers in ggplot2 (data and code included) (Eric)
  51. Time series (ts) questions. (rkevinburton@charter.net)
  52. Matrix balancing on margins (PALMIER Patrick - CETE NP/INFRA/TRF)
  53. Re: Variable Selection for data reduction and discriminant
      anlaysis (Mark Difford)
  54. Manage huge database (José E. Lozano)
  55. Re: Manage huge database (Barry Rowlingson)
  56. Re: Symmetric matrix (Martin Maechler)
  57. Re: Manage huge database (Yihui Xie)
  58. Re: Manage huge database (José E. Lozano)
  59. Re: Manage huge database (José E. Lozano)
  60. Re: how to keep up with R? (Robin Hankin)
  61. Re: Why isn't R recognising integers as numbers? ( (Ted Harding))
  62. Re: Manage huge database (Barry Rowlingson)


----------------------------------------------------------------------

Message: 1
Date: Sun, 21 Sep 2008 03:05:40 -0700 (PDT)

Subject: [R] Calculating interval for conditional/unconditional
    correlation matrix
To: R <r-help@r-project.org>
Message-ID: <880098.14013.qm@web50610.mail.re2.yahoo.com>
Content-Type: text/plain

Hi there,

Could anyone please help me to understand what should be done in order not to
get this error message: Error: evaluation nested too deeply: infinite recursion
/ options(expressions=)?

Here is my code:

determinant<-
function(x){det(matrix(c(1.0,0.2,0.5,0.8,0.2,1.0,0.5,0.6,0.5,0.5,0.5,1.0,x,0.8,0.6,x,1.0),ncol=4,byrow=T))}

matrix<-
function(x){(matrix(c(1.0,0.2,0.5,0.8,0.2,1.0,0.5,0.6,0.5,0.5,0.5,1.0,x,0.8,0.6,x,1.0),ncol=4,byrow=T))}


conditional<-function(x,varcov){
    varcov<-matrix(x)
    sigmaxx<-varcov[3,3]
    sigmaxz<-varcov[3,1:2]
    sigmayy<-varcov[4,4]
    sigmayz<-varcov[4,1:2]
    sigmazx<-varcov[1:2,3]
    sigmazy<-varcov[1:2,4]
    sigmazz<-varcov[1:2,1:2]
   
(x-sigmaxz%*%solve(sigmaZZ)%*%sigmazy)/sqrt((sigmaxx-sigmaxz%*%solve(sigmaZZ)%*%sigmazx)*(sigmayy-sigmayz%*%solve(sigmaZZ)%*%sigmazy))}

interval<-uniroot(determinant,lower = min(c(0,1)), upper = max(c(0,1)))

I tried also with the code below, but got the same Error message.

lower.bound<-uniroot(determinant,c(0,0.5))$root
upper.bound<-uniroot(determinant,c(0.51,1))$root


Ana



      



------------------------------

Message: 2
Date: Sat, 20 Sep 2008 23:37:22 -0500
From: "Li, Bingshan" <bli1@bcm.tmc.edu>
Subject: [R] How to plot "greater than" symbol on the x-axis
To: <r-help@R-project.org>
Message-ID:
    <99FAE9C1DAA75C4BAB3C1441228F95D130C1E7@BCMEVS14.ad.bcm.edu>
Content-Type: text/plain


Hello everyone,

I want to plot a "greater than" symbol (the "_" under
">") on the x-axis in the labels. Is it possible to do it?

Thanks.

Bingshan




------------------------------

Message: 3
Date: Sun, 21 Sep 2008 09:38:13 -0400
From: "John Fox" <jfox@mcmaster.ca>
Subject: Re: [R] How to plot "greater than" symbol on the x-axis
To: "'Li, Bingshan'" <bli1@bcm.tmc.edu>
Cc: r-help@r-project.org
Message-ID: <000c01c91bef$506f3990$f14dacb0$@ca>
Content-Type: text/plain;    charset="us-ascii"

Dear Bingshan,

It isn't entirely clear what you want to do. I think that you want the
"greater-than-or-equal-to" symbol, not "greater than," but by itself or in
an expression? For the first, xlab=expression("" >= ""), and for the second,
e.g., xlab=expression(x >= x[min]). More generally, see ?plotmath.
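
The two forms mentioned above can be sketched as follows. This is only an illustration; the object names lab_symbol and lab_relation are made up here and are not part of the original reply.

```r
# Two axis-label expressions; plotmath renders `>=` as the
# greater-than-or-equal-to symbol.
lab_symbol   <- expression("" >= "")      # the symbol by itself
lab_relation <- expression(x >= x[min])   # the symbol inside an expression

# Use either one as the x-axis label.
plot(1:10, xlab = lab_relation)
```

Swapping in xlab = lab_symbol should draw the bare symbol instead.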

I hope this helps,
John 

------------------------------
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox


------------------------------

Message: 4
Date: Sun, 21 Sep 2008 07:56:29 -0700 (PDT)
From: "milicic.marko" <milicic.marko@gmail.com>
Subject: Re: [R] Design lrm function
To: r-help@r-project.org
Message-ID:
    <879ed981-d735-41ac-89c1-87a0251b9f06@34g2000hsh.googlegroups.com>
Content-Type: text/plain; charset=ISO-8859-1

Thanks Frank.

On Sep 20, 2:53 am, Frank E Harrell Jr <f.harr...@vanderbilt.edu> wrote:
> milicic.marko wrote:
> > Hi,
> >
> > Is it possible to get ROC and accuracy ratio/gini straight out of the
> > Design package?
> >
> > Thanks
>
> The print method for lrm prints the ROC area (labeled "C"). lrm does
> not print the other 2 measures you listed. It computes a generalized
> R^2 (much more powerful than all the other measures) and rank indexes
> other than C.
>
> --
> Frank E Harrell Jr   Professor and Chair   School of Medicine
>                      Department of Biostatistics   Vanderbilt University
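
As a rough sketch of how the ROC area "C" relates to the Gini coefficient, assuming the accuracy ratio/Gini asked about is the usual rescaling 2C - 1 (Somers' Dxy): the toy data and variable names below are invented for illustration, and the code uses base R only, not the Design package.

```r
# Toy data (made up for illustration): binary outcome and predicted score.
y     <- c(0, 0, 1, 1)
score <- c(0.1, 0.4, 0.35, 0.8)

# C index (ROC area) via the rank-sum (Wilcoxon/Mann-Whitney) formula.
n1 <- sum(y == 1)
n0 <- sum(y == 0)
c_index <- (sum(rank(score)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

# Gini / accuracy ratio as the rescaled C index (Somers' Dxy).
gini <- 2 * c_index - 1
```

For this toy data the C index is 0.75 and the resulting Gini value is 0.5.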


------------------------------

Message: 5
Date: Sun, 21 Sep 2008 10:23:53 -0500
From: "Li, Bingshan" <bli1@bcm.tmc.edu>
Subject: Re: [R] How to plot "greater than" symbol on the x-axis
To: "John Fox" <jfox@mcmaster.ca>
Cc: r-help@r-project.org
Message-ID:
    <99FAE9C1DAA75C4BAB3C1441228F95D130C1EA@BCMEVS14.ad.bcm.edu>
Content-Type: text/plain

Hi John,

Yes, you are right. I meant "greater-than-or-equal". According to your
suggestion, I can plot the symbol only. But what I want is to have >=1,
>=2 and so on as labels on xaxis. I did not make it work. Do you know how to
make it? The expression("">=1"") did not work, and
paste(expression("">=""), 1) did not work either.

Thanks a lot!

Bingshan





------------------------------

Message: 6
Date: Sun, 21 Sep 2008 12:14:04 -0400
From: "John Fox" <jfox@mcmaster.ca>
Subject: Re: [R] How to plot "greater than" symbol on the x-axis
To: "'Li, Bingshan'" <bli1@bcm.tmc.edu>
Cc: r-help@r-project.org
Message-ID: <000701c91c05$110711e0$331535a0$@ca>
Content-Type: text/plain;    charset="us-ascii"

Dear Bingshan,

You can use xlab=expression("" >= "1"), xlab=expression("" >= 1), or
expression(NA >= 1), etc. The point is that >= is a binary operator, so a
well-formed expression needs both a left- and right-hand operand.
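
A small check of the point about operands, as a sketch: the helper name parses is hypothetical and simply reports whether a piece of text is syntactically valid R.

```r
# `>=` is a binary operator, so text with only a right-hand operand
# does not parse. `parses` is a made-up helper for this illustration.
parses <- function(txt) {
  !inherits(try(parse(text = txt), silent = TRUE), "try-error")
}

parses('>= 1')      # FALSE: no left-hand operand
parses('"" >= 1')   # TRUE: empty string as the left-hand operand
parses('NA >= 1')   # TRUE
```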

John

------------------------------
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox



------------------------------

Message: 7
Date: Sun, 21 Sep 2008 12:23:02 -0400
From: "stephen sefick" <ssefick@gmail.com>
Subject: Re: [R] periodicity validation
To: "yuankun shi" <shiyuankun.debian@gmail.com>,
    "R-help Mailing List" <r-help@r-project.org>
Message-ID:
    <c502a9e10809210923m10728682wf4f8d41e75dce71e@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

Alright, this is what you want to do.
install.packages("fields", dependencies=TRUE)
tim.colors is in this package, and it has a blue-to-red color scheme,
blue being the lowest and red being the highest.  This color scheme
makes sense to me and is something that people (read: engineers)
familiar with Matlab or the like will understand.

Use the Morlet wavelet: it is effectively compactly supported, which means
that it quickly goes to zero once it gets out of the scale that it is
fitting, making it good for a localized fit.

What you are looking at is the modulus (absolute value) of the
convolution of the wavelet with the signal at a particular scale (kind
of like frequency in Fourier analysis) on the y-axis through time
(local fitting) on the x-axis.  You are trying to find periodicity?
I kind of think of wavelet analysis as the partitioning of the variance of
the signal into continuous scales.

Because of how the algorithm calculates it, the scale is in log2(value of
the time series), so to get to your time units (which you set in the deltat
or frequency argument when you create a time series with ts()) compute
2^(value of the scale).
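
The back-transformation described above amounts to the following sketch. The scale values are hypothetical numbers read off the y-axis of a plot, assumed to be on a log2 scale as described.

```r
# Hypothetical scale values read off the y-axis of the wavelet plot,
# assumed here to be on a log2 scale.
log2_scale <- c(1, 2.5, 4)

# Back-transform to the time units of the series.
period <- 2^log2_scale
period
```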

I hope this helps

Stephen

2008/9/21 yuankun shi <shiyuankun.debian@gmail.com>:
> Thanks, I have succeeded to do this, first wavCWTPeaks to get every peaks'
> coordinate, then calculated their horizontal distance, finally, bkde output
> the distance's distribution, that's what I want.
> On the contrary, picture of wavCWT seems hard to understand, I am not sure
> what the y axis and the color mean. Could you do me a favor?
>
> 2008/9/19 stephen sefick <ssefick@gmail.com>
>>
>> I would suggest wavelet analysis-
>> library wmtsa
>> wavCWT
>> This will tell you if there is a periodicity localized in time which
>> fourier analysis cannot tell you- if the variance is not constant
>> through time then you should use this.
>>
>> 2008/9/19 yuankun shi <shiyuankun.debian@gmail.com>:
>> > I have spent lots of time to download the code you have mentioned. But
>> > all of them is not I wanted, except the latest one, I have not found it
>> > anywhere.
>> > Maybe I have not make my problem clearly, sorry for that.
>> > I have a series data, it consists of time and rate. To plot rate vs time
>> > in picture, I found it has periodicity to some extent. The rate rise and
>> > fall with time, but not with fixed cycle and fixed amplitude.
>> > So I am wondering, is there any tools to get the cycle? and furthermore,
>> > to draw it's density picture?
>> > Since there is bkde in package KernSmooth, so the 2nd is not strict
>> > needed
>> >
>> > 2008/9/11 stephen sefick <ssefick@gmail.com>
>> >>
>> >> all of the functions that I listed are time series tools for looking
>> >> at what I think you want.   this can be done you just have to
>> >> understand the methodology.  So, look at some of the things that I
>> >> suggested,  If these don't help then I don't understand what you want,
>> >> and it is necessary for you to help me figure out what it is that you
>> >> want.
>> >> good luck
>> >>
>> >> 2008/9/11 yk <shiyuankun.debian@gmail.com>:
>> >> > The data I mentioned above is oscillating vs time, but there are not
>> >> > observable fixed cycle if I just plot this data.
>> >> > How to get the average cycle, or the most probable range of cycle
>> >> > with statistical methods?
>> >> > I don't know how to achieve it by R, is there any command?
>> >> >
>> >> > On Sep 11, 10:52 am, "stephen sefick" <ssef...@gmail.com> wrote:
>> >> >> ?spectrum
>> >> >> ?acf
>> >> >> ?ccf
>> >> >> library(wmtsa)
>> >> >> ?wavCWT
>> >> >> library(sowas)
>> >> >> ?wsp
>> >> >>
>> >> >> you could also look at lagged plots to look for periodicity.
>> >> >> if you elaborate on the problem and include executable sample code
>> >> >> you will probably receive more help.
>> >> >>
>> >> >> On Wed, Sep 10, 2008 at 10:02 PM, yk <shiyuankun.deb...@gmail.com>
>> >> >> wrote:
>> >> >> > There is a series of data contains time in fixed step and energy
>> >> >> > varying with time, how to test its periodicity? In R, it seems
>> >> >> > there is no direct tools since I have search the R manual with
>> >> >> > periodic and I have not found any related topic.
>> >> >> > Thanks a lot


-- 
Stephen Sefick
Research Scientist
Southeastern Natural Sciences Academy

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods. We are mammals, and have not exhausted the
annoying little problems of being mammals.

    -K. Mullis

------------------------------

Message: 8
Date: Sun, 21 Sep 2008 18:39:55 +0200 (CEST)
From: Katharine Mullen <kate@few.vu.nl>
Subject: [R] Task View for Chemometrics and Computational Physics
To: r-help@r-project.org
Message-ID: <Pine.GSO.4.56.0809211836020.3492@laurel.few.vu.nl>
Content-Type: TEXT/PLAIN; charset=US-ASCII

Dear All,

A new task view "ChemPhys" on chemometrics and computational physics is
available on CRAN (http://cran.r-project.org/web/views/ChemPhys.html).
It describes packages and functions that are of use in modeling
chemical/physical systems.

Suggestions and comments regarding this task view are welcome.  If you
think a new category, package or function should be added, please mail.

best regards,
Kate Mullen

----
Katharine Mullen
mail: Department of Physics and Astronomy, Faculty of Sciences
Vrije Universiteit Amsterdam, de Boelelaan 1081
1081 HV Amsterdam, The Netherlands
room: T.1.06
tel: +31 205987870
fax: +31 205987992
e-mail: kate@nat.vu.nl
homepage: http://www.nat.vu.nl/~kate/



------------------------------

Message: 9
Date: Sun, 21 Sep 2008 18:43:37 +0200 (CEST)
From: Katharine Mullen <kate@few.vu.nl>
Subject: Re: [R] Variable Selection for data reduction and
    discriminant anlaysis
To: Gareth Campbell <gcam032@gmail.com>
Cc: R Help <r-help@r-project.org>
Message-ID: <Pine.GSO.4.56.0809211841530.3492@laurel.few.vu.nl>
Content-Type: TEXT/PLAIN; charset=US-ASCII

There are some pointers to packages for variable selection in the task
view for Chemometrics and Computational Physics at
http://cran.r-project.org/web/views/ChemPhys.html

On Sun, 21 Sep 2008, Gareth Campbell wrote:
> Hello all,
>
> I'm dealing with geochemical analyses of some rocks.
>
> If I use the full composition (31 elements or variables), I can get
> reasonable separation of my 6 sources.  Then when I go onto do LDA with the
> 6 groups, I get excellent separation.
>
> I feel like I should be reducing the variables to those that are providing
> the most discrimination between the groups as this is important information
> for me.  I struggle to interpret the PCA plot in a way that helps me (due to
> the large number of elements).  So I'm trying to do some sort of step-wise
> variable selection.
>
> I would love to hear from someone (possibly a geochemist or similar) who
> does this regularly to determine the best course of action in R to do this.
>
>
> Thanks very much
>
>
> --
> Gareth Campbell
> PhD Candidate
> The University of Auckland
>
> P +649 815 3670
> M +6421 256 3511
> E gareth.campbell@esr.cri.nz
> gcam032@gmail.com
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


------------------------------

Message: 10
Date: Sun, 21 Sep 2008 09:52:29 -0700
From: "Henrik Bengtsson" <hb@stat.berkeley.edu>
Subject: Re: [R] How to plot "greater than" symbol on the x-axis
To: "Li, Bingshan" <bli1@bcm.tmc.edu>
Cc: r-help@r-project.org, John Fox <jfox@mcmaster.ca>
Message-ID:
    <59d7961d0809210952j7a8ffb0epdad6b839aba452c9@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

What have you tried this far and what part does not work?  If you
forget for a moment the fact that you want to have ">=1", ">=2", ...
can you do what you want with plain "1", "2", ...?  Telling us that
helps us help you.

Are you asking for the labels on the *tick marks* on the axis?  Right
now it sounds like you are asking for *the label* on the x axis, but
the part that you want multiple ones is confusing.

plot(1:10, xlab="1");

is different from:

plot(1:10, axes=FALSE);
axis(side=1, at=1:10, labels=1:10);

To add ">=" to the latter case, this works:

bquote("" >= 1:10)
labels <- lapply(1:10, FUN=function(x) substitute("" >= t, list(t=x)));
plot(1:10, axes=FALSE);
axis(side=1, at=1:10, labels=labels);




------------------------------

Message: 11
Date: Sun, 21 Sep 2008 09:53:39 -0700
From: "Henrik Bengtsson" <hb@stat.berkeley.edu>
Subject: Re: [R] How to plot "greater than" symbol on the x-axis
To: "Li, Bingshan" <bli1@bcm.tmc.edu>
Cc: r-help@r-project.org, John Fox <jfox@mcmaster.ca>
Message-ID:
    <59d7961d0809210953n1727e462q2f19b37266689348@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

On Sun, Sep 21, 2008 at 9:52 AM, Henrik Bengtsson <hb@stat.berkeley.edu> wrote:
> What have you tried this far and what part does not work?  If you
> forget for a moment the fact that you want to have ">=1", ">=2", ...
> can you do what you want with plain "1", "2", ...?  Telling us that
> helps us help you.
>
> Are you asking for the labels on the *tick marks* on the axis?  Right
> now it sounds like you are asking for *the label* on the x axis, but
> the part that you want multiple ones is confusing.
>
> plot(1:10, xlab="1");
>
> is different from:
>
> plot(1:10, axes=FALSE);
> axis(side=1, at=1:10, labels=1:10);
>
> To add ">=" to the latter case, this works:
>
> bquote("" >= 1:10)
> labels <- lapply(1:10, FUN=function(x) substitute("" >= t, list(t=x)));
> plot(1:10, axes=FALSE);
> axis(side=1, at=1:10, labels=labels);

Oops.  Forget about the bquote() - cut'n'paste error  ...and I don't
know how to get rid of the "" preceding each tick label.  Maybe
someone else knows.

/Henrik


------------------------------

Message: 12
Date: Sun, 21 Sep 2008 13:08:09 -0400
From: "Gabor Grothendieck" <ggrothendieck@gmail.com>
Subject: Re: [R] How to plot "greater than" symbol on the x-axis
To: "Li, Bingshan" <bli1@bcm.tmc.edu>
Cc: r-help@r-project.org, John Fox <jfox@mcmaster.ca>
Message-ID:
    <971536df0809211008l559eec03ub016e3fcd71682f@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

On Sun, Sep 21, 2008 at 11:23 AM, Li, Bingshan <bli1@bcm.tmc.edu> wrote:
> Hi John,
>
> Yes, you are right. I meant "greater-than-or-equal". According to your
> suggestion, I can plot the symbol only. But what I want is to have >=1,
> >=2 and so on as labels on xaxis. I did not make it work. Do you know how
> to make it? The expression("">=1"") did not work, and
> paste(expression("">=""), 1) did not work either.
>
Try this:

plot(1:10, xaxt = "n")
for(i in 1:10) axis(1, i, bquote(phantom(0) >= .(i)))
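
A loop-free variant of the same idea might look like the sketch below. It
builds all the tick labels first and passes them to a single axis() call.
The names ticks and tick_labels are made up for illustration; phantom(0)
plays the same role as in the loop above (an invisible left-hand operand).

```r
# One plotmath expression per tick. phantom(0) is drawn as blank space,
# so only ">= i" is visible, yet >= still has both of its operands.
ticks <- 1:10
tick_labels <- as.expression(
  lapply(ticks, function(i) bquote(phantom(0) >= .(i)))
)

plot(ticks, xaxt = "n")                       # suppress the default x-axis
axis(side = 1, at = ticks, labels = tick_labels)
```

Note that axis() may silently skip labels that would overlap, so on a
narrow device not every tick will necessarily be labelled.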



------------------------------

Message: 13
Date: Sun, 21 Sep 2008 10:47:47 -0700 (PDT)

Subject: [R] Symmetric matrix
To: r-help@stat.math.ethz.ch
Message-ID: <139940.18844.qm@web58102.mail.re3.yahoo.com>
Content-Type: text/plain; charset=us-ascii

I have following matrix :

a = matrix(rnorm(36), 6)

Now I want to replace the lower-triangular elements with its
upper-triangular elements. That is, I want to make a symmetric matrix from a. I
have tried the lower.tri() and upper.tri() functions, but did not get the
desired result. Can anyone please tell me how to do that?



------------------------------

Message: 14
Date: Sun, 21 Sep 2008 13:54:19 -0400
From: "Jorge Ivan Velez" <jorgeivanvelez@gmail.com>
Subject: Re: [R] Symmetric matrix

Cc: r-help@stat.math.ethz.ch
Message-ID:
    <317737de0809211054u1485f494l166e21f6b30e4123@mail.gmail.com>
Content-Type: text/plain

Dear Megh,
Try this:

a = matrix(rnorm(36), 6)
a[upper.tri(a)]<-a[lower.tri(a)]
a

HTH,


Jorge




> I have following matrix :
>
> a = matrix(rnorm(36), 6)
>
> Now I want to replace the lower-triangular elements with it's
> upper-triangular elements. That is I want to make a symmetric matrix from
a.
> I have tried with lower.tri() and upper.tri() function, but got desired
> result. Can anyone please tell me how to do that?
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
    [[alternative HTML version deleted]]



------------------------------

Message: 15
Date: Sun, 21 Sep 2008 19:58:44 +0200
From: Dimitris Rizopoulos <d.rizopoulos@erasmusmc.nl>
Subject: Re: [R] Symmetric matrix

Cc: r-help@stat.math.ethz.ch
Message-ID: <48D68B54.70409@erasmusmc.nl>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

try the following

a <- matrix(rnorm(36), 6)
ind <- lower.tri(a)
a[ind] <- t(a)[ind]
a


I hope it helps.

Best,
Dimitris


Megh Dal wrote:
> I have following matrix :
> 
> a = matrix(rnorm(36), 6)
> 
> Now I want to replace the lower-triangular elements with it's
> upper-triangular elements. That is I want to make a symmetric matrix from a. I
> have tried with lower.tri() and upper.tri() function, but got desired result.
> Can anyone please tell me how to do that?
> 
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
-- 
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014



------------------------------

Message: 16
Date: Sun, 21 Sep 2008 14:26:05 -0400
From: Junjie Zhang <thujacky@hotmail.com>
Subject: [R] R Map using SAS data
To: <r-help@r-project.org>
Message-ID: <BAY105-W522471E0693BD6F2FD69BDDC480@phx.gbl>
Content-Type: text/plain

Hi there,

I'd like to plot some maps.  Is it possible for me to use SAS map data in R?
Thank you.

Best,
Junjie

    [[alternative HTML version deleted]]



------------------------------

Message: 17
Date: Sun, 21 Sep 2008 11:40:35 -0500
From: "Li, Bingshan" <bli1@bcm.tmc.edu>
Subject: Re: [R] How to plot "greater than" symbol on the x-axis
To: "John Fox" <jfox@mcmaster.ca>
Cc: r-help@r-project.org
Message-ID:
    <99FAE9C1DAA75C4BAB3C1441228F95D130C1EB@BCMEVS14.ad.bcm.edu>
Content-Type: text/plain


Hi John,

It works perfectly. Thank you so much for the help! Have a great day.

Bingshan

-----Original Message-----
From: John Fox [mailto:jfox@mcmaster.ca]
Sent: Sun 9/21/2008 11:14 AM
To: Li, Bingshan
Cc: r-help@r-project.org
Subject: RE: [R] How to plot "greater than" symbol on the x-axis

Dear Bingshan,

You can use xlab=expression("" >= "1"),
xlab=expression("" >= 1), or
expression(NA >= 1), etc. The point is that >= is a binary operator, so a
well formed expression needs both a left- and right-hand operand.

John

------------------------------
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox

> -----Original Message-----
> From: Li, Bingshan [mailto:bli1@bcm.tmc.edu]
> Sent: September-21-08 11:24 AM
> To: John Fox
> Cc: r-help@r-project.org
> Subject: RE: [R] How to plot "greater than" symbol on the x-axis
> 
> Hi John,
> 
> Yes, you are right. I meant "greater-than-or-equal". According to your
> suggestion, I can plot the symbol only. But what I want is to have >=1, >=2
> and so on as labels on xaxis. I did not make it work. Do you know how to make
> it? The expression("">=1"") did not work, and paste(expression("">=""), 1)
> did not work either.
> 
> Thanks a lot!
> 
> Bingshan
> 
> 
> -----Original Message-----
> From: John Fox [mailto:jfox@mcmaster.ca]
> Sent: Sun 9/21/2008 8:38 AM
> To: Li, Bingshan
> Cc: r-help@r-project.org
> Subject: RE: [R] How to plot "greater than" symbol on the x-axis
> 
> Dear Bingshan,
> 
> It isn't entirely clear what you want to do. I think that you want the
> "greater-than-or-equal-to" symbol, not "greater than," but by itself or in
> an expression? For the first, xlab=expression("" >= ""), and for the second,
> e.g., xlab=expression(x >= x[min]). More generally, see ?plotmath.
> 
> I hope this helps,
>  John
> 
> ------------------------------
> John Fox, Professor
> Department of Sociology
> McMaster University
> Hamilton, Ontario, Canada
> web: socserv.mcmaster.ca/jfox
> 
> > -----Original Message-----
> > From: r-help-bounces@r-project.org [mailto:r-help-bounces@r-project.org] On
> > Behalf Of Li, Bingshan
> > Sent: September-21-08 12:37 AM
> > To: r-help@r-project.org
> > Subject: [R] How to plot "greater than" symbol on the x-axis
> >
> >
> > Hello everyone,
> >
> > I want to plot a "greater than" symbol (the "_" under ">") on the x-axis in
> > the labels. Is it possible to do it?
> >
> > Thanks.
> >
> > Bingshan
> >
> >       [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 
> 
> 
> 




    [[alternative HTML version deleted]]



------------------------------

Message: 18
Date: Sun, 21 Sep 2008 21:10:07 +0200
From: Peter Dalgaard <p.dalgaard@biostat.ku.dk>
Subject: Re: [R] Symmetric matrix
To: Jorge Ivan Velez <jorgeivanvelez@gmail.com>

Message-ID: <48D69C0F.3000706@biostat.ku.dk>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Jorge Ivan Velez wrote:
> Dear Megh,
> Try this:
>
> a = matrix(rnorm(36), 6)
> a[upper.tri(a)]<-a[lower.tri(a)]
> a
>
> HTH,
>
If you look carefully, you'll see that it doesn't work! Dimitris had the
better idea.
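
A small check along these lines may make the difference visible. It is
only a sketch on a fixed 4 x 4 integer matrix (chosen here so the result is
reproducible; the names wrong and right are illustrative). Logical indexing
walks the matrix column by column, so the upper and lower triangles are
visited in different orders and the direct assignment pairs up the wrong
cells. Indexing the transpose with the same mask pairs cell (i, j) with
cell (j, i).

```r
a <- matrix(1:16, 4)

# Direct triangle-to-triangle assignment: the two triangles are traversed
# in different (column-major) orders, so values land in the wrong cells.
wrong <- a
wrong[upper.tri(wrong)] <- wrong[lower.tri(wrong)]

# Transpose-based assignment: the same mask is applied to a and t(a),
# so the lower cell (i, j) receives the upper cell (j, i).
right <- a
ind <- lower.tri(right)
right[ind] <- t(right)[ind]

isSymmetric(wrong)   # FALSE
isSymmetric(right)   # TRUE
```

For a 3 x 3 matrix the two traversal orders happen to coincide, which is
one way the problem can go unnoticed on a small test case.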

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)              FAX: (+45) 35327907



------------------------------

Message: 19
Date: Mon, 22 Sep 2008 08:39:44 +1200
From: Rolf Turner <r.turner@auckland.ac.nz>
Subject: Re: [R] removing a word, the following space and the next
    word
To: jim holtman <jholtman@gmail.com>
Cc: r-help@r-project.org, Bob Green <bgreen@dyson.brisnet.org.au>
Message-ID: <1503F860-54A4-402C-B6B5-C76A88EF2D5E@auckland.ac.nz>
Content-Type: text/plain; charset=US-ASCII; format=flowed


On 21/09/2008, at 5:15 AM, jim holtman wrote:
>> x <- 'Mr Jones ate lunch and Mr Smith was tied'
>> gsub('(Mr\\.*)\\s+\\w+', "\\1 <file://0.0.0.1/>
xxxx", x)
> [1] "Mr xxxx ate lunch and Mr xxxx was tied"
I don't get what the bit

    <file://0.0.0.1/>

is about.  If I do (just)

    gsub('(Mr\\.*)\\s+\\w+', "\\1 xxxx", x)

I get the desired result, i.e.


[1] "Mr xxxx ate lunch and Mr xxxx was tied"
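
For readers unfamiliar with the pattern, here is an annotated sketch of the
same call. The variable names out and out_dot are illustrative, and the
second input with "Mr." is an added example rather than part of the
original question.

```r
x <- "Mr Jones ate lunch and Mr Smith was tied"

# (Mr\\.*)  capture "Mr" plus any number of literal dots (group 1)
# \\s+      one or more whitespace characters
# \\w+      the following word (the surname), which gets replaced
# "\\1 xxxx" puts back group 1, then a space and the placeholder
out <- gsub("(Mr\\.*)\\s+\\w+", "\\1 xxxx", x)
out
# [1] "Mr xxxx ate lunch and Mr xxxx was tied"

# The optional dot means "Mr." is handled as well.
out_dot <- gsub("(Mr\\.*)\\s+\\w+", "\\1 xxxx", "Mr. Jones left")
out_dot
# [1] "Mr. xxxx left"
```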

    cheers,

        Rolf Turner


    




------------------------------

Message: 20
Date: Mon, 22 Sep 2008 08:58:00 +1200
From: Rolf Turner <r.turner@auckland.ac.nz>
Subject: Re: [R] fitting a hyperbole
To: Peter Dalgaard <p.dalgaard@biostat.ku.dk>
Cc: R-help Forum <r-help@r-project.org>
Message-ID: <DAE7D8B9-0991-4DF7-8336-F22CA23E1254@auckland.ac.nz>
Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed


On 21/09/2008, at 10:38 AM, Peter Dalgaard wrote:
> stephen sefick wrote:
>> I am not sure if I am exaggerating or not read title as hyperbola
>>
>> On Sat, Sep 20, 2008 at 2:20 PM, stephen sefick  
>> <ssefick@gmail.com> wrote:
>>
>>> I have got a data set that is Gross Primary Productivity ~ Total
>>> Suspended Solids it is a hyperbola just like:
>>> plot(1/c(1:1000))
>>>
>>> how do I model this relationship so that I can get all of the neat
>>> things that lm gives residuals etc. etc. so that I can see if my
>>> eyeball model stands up.  Thanks for any help, pointers, or good
>>> things to read.
>>>
> Well, it depends on the exact model you want to fit and the error  
> characteristics.
>
> There's a straightforward linear model in the transformed x:
> lm(y ~ I(1/x))
>
> but there are also transformed models like
>
> lm(1/y ~ x)
>
> or
>
> lm(log(y) ~ log(x))
>
> but of course, y, 1/y, and log(y) can't all be homoscedastic normal  
> variates. Going beyond the linearized models, you can use nls(), as in
>
> nls(y~ a/(x-b), start=c(a=1,b=0))
>
> (which is linear for 1/y, but assumes that y rather than 1/y has  
> constant variance.)
Nicely expressed.  Succinct, clear, to the point, comprehensive.  I  
wish I'd said that!

(And that's not hyperbole. :-) )

So much more helpful than some postings I've seen recently to the  
effect of ``Go away
and read a book on this topic.''
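
To make the quoted advice concrete, here is a rough sketch on simulated
data (the data, the seed and the names fit_lm and fit_nls are assumptions
for illustration, not the original poster's data). It fits the linear
model in 1/x and the nls() form quoted above, both of which give access to
residuals and the usual summaries.

```r
set.seed(1)
x <- seq(1, 20, length.out = 100)
y <- 2 / x + rnorm(100, sd = 0.01)   # simulated hyperbola y = a/x with a = 2

# Linear in the transformed predictor 1/x; assumes constant variance in y.
fit_lm <- lm(y ~ I(1/x))

# Nonlinear least squares with a horizontal shift b; also constant variance in y.
fit_nls <- nls(y ~ a / (x - b), start = c(a = 1, b = 0))

coef(fit_lm)
coef(fit_nls)
head(residuals(fit_lm))              # residuals, as with any lm fit
```

summary(), plot(fit_lm) and predict() then work as usual for checking
whether the eyeballed model stands up.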

    cheers,

        Rolf




------------------------------

Message: 21
Date: Sun, 21 Sep 2008 17:01:18 -0400
From: "Javier Acuña" <javier.acuna.o@gmail.com>
Subject: Re: [R] Unexpected behaviour when testing for independence,
    with multiple factors
To: bolker@ufl.edu, r-help@r-project.org
Message-ID:
    <e10c29610809211401i6e3d7792p22b72172993aae9a@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
>Ben Bolker <bolker <at> ufl.edu> writes:
>
>I would try
>
>fligner.test(dT ~ Topology:Drift:lambda)
>
>in response to:
>
>Javier Acuna <javier.acuna.o <at> gmail.com> writes:
>
> Hi, I'm a new user of R. My background is Electrical Engineering, so
> please bear with me if this is a silly question.
>
> I'm trying to assess whether the results of an experiment satisfy the
> hypothesis of homoscedasticity (my ultimate goal is to use ANOVA).
>
> The result of the experiment is mean delay (dT), which depends on
> three factors, topology, drift, and lambda. The first two factors are
> categorical (with 4 levels each) and the last one is numerical, with
> two levels.
>
> A sample of my data is as follows:
>
> dT     Topology        Drift   lambda
> 258.789        Tree    b1      .43
> 244.195        Tree    b1      .43
> 115.961        Tree    b2      .3
> 115.183        Tree    b2      .3
>
> I would like to separate dT in the 32 samples (4x4x2), and test if the
> variance of each sample is equal to the other 31 samples.
> I tried using fligner.test and bartlett.test, but either test seems to
> only work for one factor:
>
> > fligner.test( dT ~ Topology + Drift + lambda)
>
>         Fligner-Killeen test of homogeneity of variances
>
> data:  dT by Topology by Drift by lambda
> Fligner-Killeen:med chi-squared = 15.4343, df = 2, p-value = 0.0004451
>
> > fligner.test( dT ~ Topology )
>
>         Fligner-Killeen test of homogeneity of variances
>
> data:  dT by Topology
> Fligner-Killeen:med chi-squared = 15.4343, df = 2, p-value = 0.0004451
>
> As I see from the previous two outputs, fligner.test only takes into
> account the first factor. Similar results are obtained for
> bartlett.test.
I tried what you suggested Ben, but I'm still puzzled by the output.
In this case, I obtain different results with different ordering of
the factors:
> fligner.test( dT ~ Dims : Topology :Drift )
        Fligner-Killeen test of homogeneity of variances

data:  dT by Dims by Topology by Drift
Fligner-Killeen:med chi-squared = 195.2067, df = 1, p-value < 2.2e-16
> fligner.test( dT ~ Topology :Drift:Dims  )
        Fligner-Killeen test of homogeneity of variances

data:  dT by Topology by Drift by Dims
Fligner-Killeen:med chi-squared = 15.4343, df = 2, p-value = 0.0004451


I don't know what to do now, any help would be reaaally appreciated.

Best Regards
Javier

----------------------------------------------------
Javier Acuna
Electrical Engineering Grad Student
Universidad de Chile
javier.acuna.o@gmail.com



------------------------------

Message: 22
Date: Sun, 21 Sep 2008 17:05:06 -0400
From: "Javier Acuña" <javier.acuna.o@gmail.com>
Subject: Re: [R] Unexpected behaviour when testing for independence
    with    multiple factors
To: "Michael Dewey" <info@aghmed.fsnet.co.uk>
Cc: r-help@r-project.org
Message-ID:
    <e10c29610809211405y3607a7d9s2ba865460aca38e8@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Michael, so you're suggesting that I should do:

aux <- interaction( Topology, Drift, lambda)
and then
fligner.test(dT~aux)

Is that correct?
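
A minimal sketch of that approach on made-up data, for anyone following
the thread: the data frame d, the level "Ring", the seed and the column
name cell are all assumptions for illustration. The three factors are
combined into one grouping factor, so every factor combination becomes its
own group in the test.

```r
set.seed(42)
# Hypothetical balanced design: 2 x 2 x 2 cells, 6 replicates each.
d <- expand.grid(Topology = c("Tree", "Ring"),
                 Drift    = c("b1", "b2"),
                 lambda   = c(0.3, 0.43),
                 rep      = 1:6)
d$dT <- rnorm(nrow(d), mean = 200, sd = 20)

# One factor with a level for every Topology/Drift/lambda combination.
d$cell <- interaction(d$Topology, d$Drift, d$lambda)

res <- fligner.test(dT ~ cell, data = d)
nlevels(d$cell)   # 8 groups
res$parameter     # df = number of groups - 1 = 7
```

Checking that the reported df equals the number of cells minus one is a
quick way to confirm that all factors were actually used.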

On Thu, Sep 18, 2008 at 8:32 AM, Michael Dewey <info <at> aghmed.fsnet.co.uk> wrote:
> At 16:03 17/09/2008, Javier Acuña wrote:
>>
>> Hi, I'm a new user of R. My background is Electrical Engineering, so
>> please bear with me if this is a silly question.
>
> For future reference you might find
> ?interaction
> helpful as another tool in your box.
>
>
>> I'm trying to assess whether the results of an experiment satisfy the
>> hypothesis of homoscedasticity (my ultimate goal is to use ANOVA).
>
> It is hard to resist quoting Box (1953, Biometrika, 40, p333) that these
> tests are '... like putting to sea in a rowing boat to find out whether
> conditions are safe for an ocean liner to leave port'
>
>> The result of the experiment is mean delay (dT), which depends on
>> three factors, topology, drift, and lambda. The first two factors are
>> categorical (with 4 levels each) and the last one is numerical, with
>> two levels.
>>
>> A sample of my data is as follows:
>>
>> dT      Topology        Drift   lambda
>> 258.789 Tree    b1      .43
>> 244.195 Tree    b1      .43
>> 115.961 Tree    b2      .3
>> 115.183 Tree    b2      .3
>>
>> I would like to separate dT in the 32 samples (4x4x2), and test if the
>> variance of each sample is equal to the other 31 samples.
>> I tried using fligner.test and bartlett.test, but either test seems to
>> only work for one factor:
>>
>> > fligner.test( dT ~ Topology + Drift + lambda)
>>
>>        Fligner-Killeen test of homogeneity of variances
>>
>> data:  dT by Topology by Drift by lambda
>> Fligner-Killeen:med chi-squared = 15.4343, df = 2, p-value = 0.0004451
>>
>> > fligner.test( dT ~ Topology )
>>
>>        Fligner-Killeen test of homogeneity of variances
>>
>> data:  dT by Topology
>> Fligner-Killeen:med chi-squared = 15.4343, df = 2, p-value = 0.0004451
>>
>> As I see from the previous two outputs, fligner.test only takes into
>> account the first factor. Similar results are obtained for
>> bartlett.test.
>>
>> At this point I don't know if I'm using the test incorrectly or
>> something else. I would really appreciate any help. I'm using R
>> version 2.7.2 (2008-08-25) in Windows XP.
>>
>> Many thanks in advance
>> Javier
>>
>> ----------------------------------------------------
>> Javier Acuna
>> Electrical Engineering Grad Student
>> Universidad de Chile
>> javier.acuna.o@gmail.com
>
> Michael Dewey
> http://www.aghmed.fsnet.co.uk
>
>


------------------------------

Message: 23
Date: Sun, 21 Sep 2008 18:03:52 -0400
From: "DS" <ds5j@excite.com>
Subject: [R] r format questions
To: r-help@R-project.org
Message-ID: <20080921180352.6760@web005.roc2.bluetie.com>
Content-Type: text/plain

Hi,

1)   I have noticed that when I use the aggregate function it outputs row
numbers in the results. For example:
aggregate by product

    group.1       Aggregate
1    ProductA   1000400.00
2    ProductB   23232323.00
3    Missing      232323.00

Is there a way to suppress the numbers in front of aggregate outputs?  I checked
and they don't look like columns when I do a summary, so I can't -1 them
away.

2) Is there an easy way to then take my aggregate matrix and format the sum
with $ and commas? E.g., instead of 10000 it should show
$10,000.00.

I am trying to create a report and am piping the aggregate into an xtable and
feeding it R2html.

thanks
Dhruv

    [[alternative HTML version deleted]]



------------------------------

Message: 24
Date: Sun, 21 Sep 2008 18:06:51 -0400
From: "DS" <ds5j@excite.com>
Subject: [R] design question on piping multiple data sets from 1 file
    into R
To: r-help@R-project.org
Message-ID: <20080921180651.12712@web006.roc2.bluetie.com>
Content-Type: text/plain

Hi,
   I have 8 separate queries that I use to get time series information; each
deals with a different set of time series.

  I run my queries, save the output as a csv file, and then format the
data into graphs in Excel.

  I wanted to know if there is an elegant and clean way to read in 1 csv file
but to read the separate matrices on different rows into separate R data
objects.

  If this is easy then I can read the 8 datasets in the csv file into 8 R
objects and pipe them to time series objects for graphs.

thanks
Dhruv

    [[alternative HTML version deleted]]



------------------------------

Message: 25
Date: Sun, 21 Sep 2008 18:09:01 -0400
From: "Tom Bonen" <tom.bonen@googlemail.com>
Subject: [R] color for lattice box plots
To: r-help@r-project.org
Message-ID:
    <8316adf50809211509l7471b151x282a1cc19fd6a4ba@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

hi,

i have a figure with many boxplots and want to differentiate one group
of the boxplots by colour of the box. so for example:

X <- replicate(3,rnorm(100))
bwplot(X[,1]~as.factor(X[,2]>1)|X[,3]>0)

# this gives four boxplots, i'd like to give 1 and 3 a different
colour than 2 and 3

# i tried
bwplot(X[,1]~as.factor(X[,2]>1)|X[,3]>0,groups=as.factor(X[,2]>1))

but that does not change the display? how can i change the colour for
groups with bwplot? thanks.

tom



------------------------------

Message: 26
Date: Sun, 21 Sep 2008 18:25:32 -0400
From: "Tom Bonen" <tom.bonen@googlemail.com>
Subject: [R] suppress legend in ggplot(data, aes(y=Y, x=X,fill=Z))?
To: r-help <r-help@r-project.org>
Message-ID:
    <8316adf50809211525q53e22a3dp5992e1c05c303ab9@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

hi,

is there any way to suppress the legend in ggplot(data, aes(y=Y,
x=X,fill=Z)) ? i'd like the values to be displayed in different colors
as specified by fill= and this works just fine. but i do not want to
have the legend on the right that is automatically created when fill
is specified.

thanks,
tom



------------------------------

Message: 27
Date: Sun, 21 Sep 2008 15:33:56 -0700
From: Bert Gunter <gunter.berton@gene.com>
Subject: Re: [R] selecting from a series of integers
    withpre-determined    probabilities
To: "'John Sorkin'" <jsorkin@grecc.umaryland.edu>,
    <r-help@r-project.org>
Message-ID: <000901c91c3a$225eaa40$6501a8c0@gne.windows.gene.com>
Content-Type: text/plain;    charset="us-ascii"

?sample.

-- Bert Gunter
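
For the case described in the question below, a call might look like the
following sketch. It assumes the 20% is split evenly over 1 to 4 (5% each),
which the question does not actually state; the names p and draws are
illustrative.

```r
set.seed(123)                          # only to make the sketch reproducible
p <- c(0.05, 0.05, 0.05, 0.05, 0.80)   # assumed: 1-4 share the 20% equally

# All 500,000 draws in one vectorised call, with replacement.
draws <- sample(1:5, size = 500000, replace = TRUE, prob = p)

round(prop.table(table(draws)), 3)     # empirical proportions, close to p
```

Drawing everything in a single call rather than in a loop is what keeps
this efficient.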

-----Original Message-----
From: r-help-bounces@r-project.org [mailto:r-help-bounces@r-project.org] On
Behalf Of John Sorkin
Sent: Saturday, September 20, 2008 12:43 PM
To: r-help@r-project.org
Subject: [R] selecting from a series of integers withpre-determined
probabilities

R 2.6
Windows XP

I need to select from the integers 1,2,3,4,5 with some pre-determined
probability, e.g. probability of selecting 5 80%, probability of selecting 1
or  2 or  3 or 4 20%. Any suggestions for how I might accomplish this? I
need to do it very efficiently as I will be doing it 500,000 times.
Thanks
John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

Confidentiality Statement:
This email message, including any attachments, is for th...{{dropped:8}}



------------------------------

Message: 28
Date: Mon, 22 Sep 2008 00:34:51 +0200
From: p@fo76.org
Subject: [R] Multiple plots per window
To: R Help <r-help@r-project.org>
Message-ID: <20080922003451.2qhs63jhhwow48s0@webmail.openit.de>
Content-Type: text/plain;    charset=ISO-8859-1;    DelSp="Yes";
    format="flowed"

Hi all,

I'm currently working through "The Analysis of Time Series" by Chris
Chatfield. In order to also get a better understanding of R, I play
around with the examples and exercises (no homework or assignment,
just self-study!!).

Exercise 2.1 gives the following dataset (sales figures for 4 week
intervals):
> sales2.1.dataframe
   1995 1996 1997 1998
1   153  133  145  111
2   189  177  200  170
3   221  241  187  243
4   215  228  201  178
5   302  283  292  248
6   223  255  220  202
7   201  238  233  163
8   173  164  172  139
9   121  128  119  120
10  106  108   81   96
11   86   87   65   95
12   87   74   76   53
13  108   95   74   94

I want to plot the histograms/densities for all four years in one window.
After trying out a couple of things, I finally ended up with the following
(it took me two hours - Ouch!):

sales2.1 <- c(153,189,221,215,302,223,201,173,121,106,86,87,108,
133,177,241,228,283,255,238,164,128,108,87,74,95,
145,200,187,201,292,220,233,172,119,81,65,76,74,
111,170,243,178,248,202,163,139,120,96,95,53,94)
sales2.1.matrix <- sales2.1
# 13 four-week periods (rows) by 4 years (columns), filled column by column
dim(sales2.1.matrix) <- c(13,4)
sales2.1.dataframe <- as.data.frame(sales2.1.matrix)
names(sales2.1.dataframe) <- c("1995","1996","1997","1998")

X11()
split.screen(c(2,2))
for (i in 1:4)
{
     screen(i)
     hist(sales2.1.dataframe[[i]],
         probability=T,
         xlim=c(0,400),
         ylim=c(0,0.006),
         main=names(sales2.1.dataframe)[i],
         xlab="Sales")
     lines(density(sales2.1.dataframe[[i]]))
}
close.screen(all=TRUE)

Although I'm happy that I finally got something that is pretty close
to what I wanted, I'm not sure whether this is the best or most elegant
way to do it. How would you do it? What functions/packages should I
look into, in order to improve these plots?

Thanks in advance for your comments and suggestions,

Peter
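
One possible alternative, sketched here rather than offered as the
definitive answer, is par(mfrow = ...), which lays out successive base
plots in a grid without split.screen(). The data are the sales figures
above; the object names sales and op are illustrative.

```r
sales <- data.frame(
  "1995" = c(153,189,221,215,302,223,201,173,121,106,86,87,108),
  "1996" = c(133,177,241,228,283,255,238,164,128,108,87,74,95),
  "1997" = c(145,200,187,201,292,220,233,172,119,81,65,76,74),
  "1998" = c(111,170,243,178,248,202,163,139,120,96,95,53,94),
  check.names = FALSE                  # keep the years as column names
)

op <- par(mfrow = c(2, 2))             # 2 x 2 grid, filled row by row
for (yr in names(sales)) {
  hist(sales[[yr]], freq = FALSE,      # density scale, so the curve fits
       xlim = c(0, 400), ylim = c(0, 0.006),
       main = yr, xlab = "Sales")
  lines(density(sales[[yr]]))
}
par(op)                                # restore the previous layout
```

layout() offers finer control over panel sizes, and the lattice package
(histogram(), densityplot()) is another route to multi-panel displays.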



------------------------------

Message: 29
Date: Sun, 21 Sep 2008 18:41:22 -0400
From: John Poulsen <jpoulsen@zoo.ufl.edu>
Subject: [R] glmer -- extracting standard errors and other statistics
To: r-help@r-project.org
Message-ID: <48D6CD92.10305@zoo.ufl.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Hello,

I am using glmer() from lmer(lme4) to run generalized linear mixed 
models.  However, I am having a problem extracting the standard errors 
for the fixed effects.

I have used:

summary(model)$coef
fixed.effects(model)
coef(model)

to get out the parameter estimates, but do not seem able to extract the 
se's.

Anybody have a solution?

Thanks,
John



------------------------------

Message: 30
Date: Sun, 21 Sep 2008 18:52:10 -0400
From: "jim holtman" <jholtman@gmail.com>
Subject: Re: [R] r format questions
To: DS <ds5j@excite.com>
Cc: r-help@r-project.org
Message-ID:
    <644e1f320809211552s76b1447fg312e57f7b9dba3a3@mail.gmail.com>
Content-Type: text/plain

You have to explicitly ask that they not be printed:
> x <- aggregate(state.x77, list(Region = state.region), mean)
> x
         Region Population   Income Illiteracy Life Exp    Murder  HS Grad    Frost      Area
1     Northeast   5495.111 4570.222   1.000000 71.26444  4.722222 53.96667 132.7778  18141.00
2         South   4208.125 4011.938   1.737500 69.70625 10.581250 44.34375  64.6250  54605.12
3 North Central   4803.000 4611.083   0.700000 71.76667  5.275000 54.51667 138.8333  62652.00
4          West   2915.308 4702.615   1.023077 71.23462  7.215385 62.00000 102.1538 134463.00
> print(x, row.names=FALSE)
        Region Population   Income Illiteracy Life Exp    Murder  HS Grad    Frost      Area
     Northeast   5495.111 4570.222   1.000000 71.26444  4.722222 53.96667 132.7778  18141.00
         South   4208.125 4011.938   1.737500 69.70625 10.581250 44.34375  64.6250  54605.12
 North Central   4803.000 4611.083   0.700000 71.76667  5.275000 54.51667 138.8333  62652.00
          West   2915.308 4702.615   1.023077 71.23462  7.215385 62.00000 102.1538 134463.00
>

On Sun, Sep 21, 2008 at 6:03 PM, DS <ds5j@excite.com> wrote:
> Hi,
>
> 1)   I have noticed that when I use the aggregate function it outputs
> numbers in the results. for example:
> aggregate by product
>
>    group.1       Aggregate
> 1    ProductA   1000400.00
> 2    ProductB   23232323.00
> 3    Missing      232323.00
>
> is there a way to suppress the numbers infront of aggregate outputs.  I
> checked and they don't look like columns when I do a summary so I can't -1
> them away.
>
> 2) is there an easy way to then take my aggregate matrix and then format
> the sum wtih $ and commas. for e.g instead 10000 it should show
> $10,000.00?
>
> I am trying to create a report and am piping the aggregate into an xtable
> and feeding it R2html.
>
> thanks
> Dhruv
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

    [[alternative HTML version deleted]]



------------------------------

Message: 31
Date: Sun, 21 Sep 2008 18:56:17 -0400
From: "jim holtman" <jholtman@gmail.com>
Subject: Re: [R] r format questions
To: DS <ds5j@excite.com>
Cc: r-help@r-project.org
Message-ID:
    <644e1f320809211556r2023e1c3od632186ebe3d0447@mail.gmail.com>
Content-Type: text/plain

answer to your second question:

> paste("$", format(1234567.77, big.mark=','), sep='')
[1] "$1,234,568"
>

you will have to go through each column you want and explicitly do it:

> x
         Region Population   Income Illiteracy Life Exp    Murder  HS Grad    Frost      Area
1     Northeast   5495.111 4570.222   1.000000 71.26444  4.722222 53.96667 132.7778  18141.00
2         South   4208.125 4011.938   1.737500 69.70625 10.581250 44.34375  64.6250  54605.12
3 North Central   4803.000 4611.083   0.700000 71.76667  5.275000 54.51667 138.8333  62652.00
4          West   2915.308 4702.615   1.023077 71.23462  7.215385 62.00000 102.1538 134463.00
> x$Population <- paste("$", format(x$Population, big.mark=','), sep='')
> x
         Region Population   Income Illiteracy Life Exp    Murder  HS Grad    Frost      Area
1     Northeast $5,495.111 4570.222   1.000000 71.26444  4.722222 53.96667 132.7778  18141.00
2         South $4,208.125 4011.938   1.737500 69.70625 10.581250 44.34375  64.6250  54605.12
3 North Central $4,803.000 4611.083   0.700000 71.76667  5.275000 54.51667 138.8333  62652.00
4          West $2,915.308 4702.615   1.023077 71.23462  7.215385 62.00000 102.1538 134463.00
>
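
If exactly two decimal places are wanted, as in the $10,000.00 asked for in
the question, formatC() with a fixed number of digits is one option. This
is a sketch only; the helper name as_dollars is made up for illustration.

```r
# Hypothetical helper: fixed two decimals, comma as thousands separator.
as_dollars <- function(v) {
  paste0("$", formatC(v, format = "f", digits = 2, big.mark = ","))
}

as_dollars(10000)
# [1] "$10,000.00"
as_dollars(c(1000400, 23232323, 232323))
# [1] "$1,000,400.00"  "$23,232,323.00" "$232,323.00"
```

The result is a character vector, so it is best applied as the last step
before the table is handed to xtable or printed.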

On Sun, Sep 21, 2008 at 6:03 PM, DS <ds5j@excite.com> wrote:
> Hi,
>
> 1)   I have noticed that when I use the aggregate function it outputs
> numbers in the results. for example:
> aggregate by product
>
>    group.1       Aggregate
> 1    ProductA   1000400.00
> 2    ProductB   23232323.00
> 3    Missing      232323.00
>
> is there a way to suppress the numbers infront of aggregate outputs.  I
> checked and they don't look like columns when I do a summary so I can't -1
> them away.
>
> 2) is there an easy way to then take my aggregate matrix and then format
> the sum wtih $ and commas. for e.g instead 10000 it should show
> $10,000.00?
>
> I am trying to create a report and am piping the aggregate into an xtable
> and feeding it R2html.
>
> thanks
> Dhruv
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

    [[alternative HTML version deleted]]



------------------------------

Message: 32
Date: Sun, 21 Sep 2008 16:00:47 -0700 (PDT)
From: gcam032 <gcam032@gmail.com>
Subject: Re: [R] Variable Selection for data reduction and
    discriminant anlaysis
To: r-help@r-project.org
Message-ID: <19599461.post@talk.nabble.com>
Content-Type: text/plain; charset=us-ascii


Thanks Mark,

I failed to mention that I'm working within a compositional framework.  I
didn't want to confuse things.  My data are transformed to the clr or alr
under Aitchison geometry, so I am essentially working in Euclidean space.

Has anyone had experience doing stepwise LDA?  I can't for the life of me
find any help online about where to start.

Thanks

Gareth


Mark Difford wrote:
Hi Gareth,
>> If I use the full composition (31 elements or variables), I can get
>> reasonable separation of my 6 sources.
A word of advice: You need to be exceptionally careful when analyzing
compositional data. Taking compositions puts your data values into a
constrained/bounded space (generally called a simplex) so that most standard
statistical procedures (i.e. anything that uses a Euclidean metric, and most
do) deliver erroneous results. Pearson wrote a paper on this long ago, but
it's generally been ignored (except by Aitchison and the Spanish School of
mathematical statisticians).

The problem is comparatively well known to geologists, who work with
compositional data much of the time. R has a very good package for analysing
this data type: see the compositions package (a new release seems imminent).
You will be able to get most of the main references from it. (The authors of
the package also have a newly released article in one of the Elsevier journals
[unfortunately my bib+ are elsewhere so I cannot give details]).

You could start by Wiki'ing your way to "compositional data".

HTH, Mark.
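
For readers unfamiliar with the clr transform mentioned in Gareth's reply, here is a minimal base-R sketch. The helper name `clr_transform` and the toy matrix `comp` are illustrative assumptions, not taken from the thread; the compositions package mentioned above ships its own, more complete implementation.

```r
# clr: log of each part minus the row mean of the logs (illustrative helper)
clr_transform <- function(x) {
  lx <- log(x)
  sweep(lx, 1, rowMeans(lx))   # subtract each row's mean log from that row
}

# Toy 2 x 3 compositions; each row sums to 1 (made-up numbers)
comp <- matrix(c(0.2, 0.3, 0.5,
                 0.1, 0.6, 0.3), ncol = 3, byrow = TRUE)
z <- clr_transform(comp)
rowSums(z)   # zero up to rounding error
```

Because every clr row sums to zero, the transformed variables are linearly dependent. That is worth keeping in mind before feeding them to a method such as LDA, which works with the inverse of a covariance matrix.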



Gareth Campbell wrote:
> Hello all,
> 
> I'm dealing with geochemical analyses of some rocks.
> 
> If I use the full composition (31 elements or variables), I can get
> reasonable separation of my 6 sources.  Then when I go on to do LDA with
> the 6 groups, I get excellent separation.
> 
> I feel like I should be reducing the variables to those that are providing
> the most discrimination between the groups, as this is important
> information for me.  I struggle to interpret the PCA plot in a way that
> helps me (due to the large number of elements).  So I'm trying to do some
> sort of step-wise variable selection.
> 
> I would love to hear from someone (possibly a geochemist or similar) who
> does this regularly to determine the best course of action in R to do
> this.
> 
> 
> Thanks very much
> 
> 
> -- 
> Gareth Campbell
> PhD Candidate
> The University of Auckland
> 
> P +649 815 3670
> M +6421 256 3511
> E gareth.campbell@esr.cri.nz
> gcam032@gmail.com
> 


-- 
View this message in context:
http://www.nabble.com/Variable-Selection-for-data-reduction-and-discriminant-anlaysis-tp19591270p19599461.html
Sent from the R help mailing list archive at Nabble.com.



------------------------------

Message: 33
Date: Mon, 22 Sep 2008 01:19:48 +0200
From: p@fo76.org
Subject: Re: [R] Multiple plots per window
To: r-help@r-project.org
Message-ID: <20080922011948.xdupj04hyv4w4c08@webmail.openit.de>
Content-Type: text/plain;    charset=ISO-8859-1;    DelSp="Yes";
    format="flowed"

Sorry, as Mark Leeds pointed out to me, the row/column numbers were
mixed up in my example... happens when you cut & paste like mad from
your history... it should read as follows:

sales2.1 <- c(153,189,221,215,302,223,201,173,121,106,86,87,108,
133,177,241,228,283,255,238,164,128,108,87,74,95,
145,200,187,201,292,220,233,172,119,81,65,76,74,
111,170,243,178,248,202,163,139,120,96,95,53,94)

sales2.1.matrix <- sales2.1
dim(sales2.1.matrix) <- c(13,4)

sales2.1.dataframe <- as.data.frame(sales2.1.matrix)
names(sales2.1.dataframe) <-
c("1995","1996","1997","1998")

Peter
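
The reason c(13,4) is the right shape: assigning dim() fills the matrix column by column, and the sales vector lists the 13 values for 1995 first, then the 13 for 1996, and so on. A small sketch with made-up numbers (the names `x` and `m` are illustrative only):

```r
x <- 1:6
m <- x
dim(m) <- c(3, 2)      # 3 rows, 2 columns, filled column by column
m[, 1]                 # the first three values end up in column 1

# matrix() fills the same way by default, so the two are interchangeable
identical(m, matrix(x, nrow = 3, ncol = 2))
```

With the dimensions swapped to c(2, 3), consecutive values would be split across columns instead, which is the mix-up corrected above.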

Quoting p@fo76.org:
> Hi all,
>
> I'm currently working through "The Analysis of Time Series" by Chris
> Chatfield. In order to also get a better understanding of R, I play
> around with the examples and exercises (no homework or assignment,
> just self-study!!).
>
> Exercise 2.1 gives the following dataset (sales figures for 4 week
> intervals):
>
>> sales2.1.dataframe
>    1995 1996 1997 1998
> 1   153  133  145  111
> 2   189  177  200  170
> 3   221  241  187  243
> 4   215  228  201  178
> 5   302  283  292  248
> 6   223  255  220  202
> 7   201  238  233  163
> 8   173  164  172  139
> 9   121  128  119  120
> 10  106  108   81   96
> 11   86   87   65   95
> 12   87   74   76   53
> 13  108   95   74   94
>
> I want to plot the histograms/densities for all four years in one window.
> After trying out a couple of things, I finally ended up with the following
> (it took me two hours - Ouch!):
>
> sales2.1 <- c(153,189,221,215,302,223,201,173,121,106,86,87,108,
> 133,177,241,228,283,255,238,164,128,108,87,74,95,
> 145,200,187,201,292,220,233,172,119,81,65,76,74,
> 111,170,243,178,248,202,163,139,120,96,95,53,94)
> sales2.1.matrix <- sales2.1
> dim(sales2.1.matrix) <- c(4,13)
> sales2.1.dataframe <- as.data.frame(sales2.1.matrix)
> names(sales2.1.dataframe) <- c("1995","1996","1997","1998")
>
> X11()
> split.screen(c(2,2))
> for (i in 1:4)
> {
>     screen(i)
>     hist(sales2.1.dataframe[[i]],
>         probability=T,
>         xlim=c(0,400),
>         ylim=c(0,0.006),
>         main=names(sales2.1.dataframe)[i],
>         xlab="Sales")
>     lines(density(sales2.1.dataframe[[i]]))
> }
> close.screen(all=TRUE)
>
> Although I'm happy that I finally got something that is pretty close
> to what I wanted, I'm not sure whether this is the best or most elegant
> way to do it. How would you do it? What functions/packages should I
> look into, in order to improve these plots?
>
> Thanks in advance for your comments and suggestions,
>
> Peter
>


------------------------------

Message: 34
Date: Mon, 22 Sep 2008 01:49:32 +0200
From: "Weiss, Bernd " <bernd.weiss@uni-koeln.de>
Subject: Re: [R] glmer -- extracting standard errors and other
    statistics
To: jpoulsen@zoo.ufl.edu, r-help@r-project.org
Message-ID: <48D6DD8C.8030405@uni-koeln.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

John Poulsen wrote:
> Hello,
> 
> I am using glmer() from lmer(lme4) to run generalized linear mixed 
> models.  However, I am having a problem extracting the standard errors 
> for the fixed effects.
> 
> I have used:
> 
> summary(model)$coef
> fixed.effects(model)
> coef(model)
> 
> to get out the parameter estimates, but do not seem able to extract the
> se's.
> 
> Anybody have a solution?
> 
You need to extract the variance-covariance matrix:

library(lme4)

gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             family = binomial, data = cbpp)

sqrt(diag(vcov(gm1)))


HTH,

Bernd
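
To show why the square root of the diagonal of vcov() gives the standard errors, here is a sketch that runs with base R only. It uses an ordinary glm() on the built-in mtcars data as a stand-in (an assumption made so that the example does not depend on lme4); the same sqrt(diag(vcov(...))) idiom is what the reply above applies to the glmer fit.

```r
# Stand-in model: plain logistic regression, not a mixed model
fit <- glm(am ~ wt, family = binomial, data = mtcars)

# Standard errors from the variance-covariance matrix of the estimates
se_vcov <- sqrt(diag(vcov(fit)))

# For glm, the summary table reports the same numbers
se_summary <- coef(summary(fit))[, "Std. Error"]
all.equal(se_vcov, se_summary)
```

Whether coef(summary(...)) returns a comparable table for a mixed-model fit depends on the lme4 version in use, so the vcov() route is the safer one to rely on.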



------------------------------

Message: 35
Date: Sun, 21 Sep 2008 18:01:57 -0700 (PDT)
From: Ted Byers <r.ted.byers@gmail.com>
Subject: [R]  Why isn't R recognising integers as numbers?
To: r-help@r-project.org
Message-ID: <19600308.post@talk.nabble.com>
Content-Type: text/plain; charset=us-ascii


I have a number of files containing anywhere from a few dozen to a few
thousand integers, one per record.

The statement "refdata18 =
read.csv("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv", header =
TRUE,na.strings="")" works fine, and if I type refdata18, I get the integers
displayed, one value per record (along with a record number).  However, when
I try "fitdistr(refdata18,"negative binomial")", or hist.scott(refdata18,
prob = TRUE), I get an error:

Error in fitdistr(refdata18, "negative binomial") : 
  'x' must be a non-empty numeric vector
Or
Error in hist.default(x, nclass.scott(x), prob = prob, xlab = xlab, ...) : 
  'x' must be numeric

How can it not recognise integers as numbers?

Thanks

Ted
-- 
View this message in context:
http://www.nabble.com/Why-isn%27t-R-recognising-integers-as-numbers--tp19600308p19600308.html
Sent from the R help mailing list archive at Nabble.com.



------------------------------

Message: 36
Date: Sun, 21 Sep 2008 21:12:49 -0400
From: "jim holtman" <jholtman@gmail.com>
Subject: Re: [R] Why isn't R recognising integers as numbers?
To: "Ted Byers" <r.ted.byers@gmail.com>
Cc: r-help@r-project.org
Message-ID:
    <644e1f320809211812t7a82ac5dy98208d60b3007ef8@mail.gmail.com>
Content-Type: text/plain

Best guess is that they are not integers.  Do 'str' on your object and it
probably says they are 'factors'.  This is probably due to some of your data
being non-numeric.  Try using 'colClasses' on read.csv to specify what the
column should contain.  Also try "scan" after skipping the first record if
it is a header:

> scan("", what=0L)  # bad input after specifying integer
1: 1 2 3 4
5: 1 v
5:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  scan() expected 'an integer', got 'v'
> scan("", what=0L)  # good input
1: 1
2: 2
3: 3
4:
Read 3 items
[1] 1 2 3

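
A minimal sketch of the 'colClasses' suggestion, assuming a file with one stray non-numeric record. The contents of `csv_text` are made up for illustration, and the text= argument is used only so the example needs no file on disk.

```r
csv_text <- "count\n0\n1\nv\n3\n"   # made-up contents with one bad record

# Without colClasses the stray 'v' silently changes the column type
bad <- read.csv(text = csv_text)
class(bad$count)   # not "integer": factor or character, depending on R version

# With colClasses the same input fails loudly at read time
res <- tryCatch(read.csv(text = csv_text, colClasses = "integer"),
                error = function(e) e)
inherits(res, "error")
```

Declaring the column type up front turns a silent type change into an immediate error that points at the offending value.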


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?




------------------------------

Message: 37
Date: Sun, 21 Sep 2008 21:21:21 -0400
From: "Gabor Grothendieck" <ggrothendieck@gmail.com>
Subject: Re: [R] Multiple plots per window
To: p@fo76.org
Cc: r-help@r-project.org
Message-ID:
    <971536df0809211821u2b99348ai36b1952bc127695d@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Here are two ways: one using classic graphics and one much
shorter way using lattice.  ggplot2 would be another short way
(not shown).

Lines <- "1995 1996 1997 1998
  153  133  145  111
  189  177  200  170
  221  241  187  243
  215  228  201  178
  302  283  292  248
  223  255  220  202
  201  238  233  163
  173  164  172  139
  121  128  119  120
  106  108   81   96
   86   87   65   95
   87   74   76   53
  108   95   74   94"

# read in data and remove the X from the column names
s <- read.table(textConnection(Lines), header = TRUE)
names(s) <- sub("X", "", names(s))

# 1. using classic graphics

# find overall ranges of x and y
h <- lapply(s, hist, probability = TRUE)
ylim <- range(unlist(lapply(h, "[[", "density")))
xlim <- range(unlist(lapply(h, "[[", "breaks")))

# plot
opar <- par(mfrow = c(2, 2))
for(i in 1:length(s)) {
    hist(s[[i]], main = names(s)[i], probability = TRUE,
        xlab = "Sales", xlim = xlim, ylim = ylim)
    lines(density(s[[i]]))
}
par(opar)

# 2. using lattice it's a bit easier

library(lattice)
histogram( ~ values | ind, stack(s), type = "density",
    panel = function(...) {
        panel.histogram(...)
        panel.densityplot(...)
    }
)





------------------------------

Message: 38
Date: Sun, 21 Sep 2008 20:44:50 -0500
From: Marc Schwartz <marc_schwartz@comcast.net>
Subject: Re: [R] Why isn't R recognising integers as numbers?
To: Ted Byers <r.ted.byers@gmail.com>
Cc: r-help@r-project.org
Message-ID: <48D6F892.3080901@comcast.net>
Content-Type: text/plain; charset=ISO-8859-1

on 09/21/2008 08:01 PM Ted Byers wrote:
> I have a number of files containing anywhere from a few dozen to a few
> thousand integers, one per record.
> 
> The statement "refdata18 =
> read.csv("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv", header =
> TRUE,na.strings="")" works fine, and if I type refdata18, I get the integers
> displayed, one value per record (along with a record number).  However, when
> I try "fitdistr(refdata18,"negative binomial")", or hist.scott(refdata18,
> prob = TRUE), I get an error:
> 
> Error in fitdistr(refdata18, "negative binomial") : 
>   'x' must be a non-empty numeric vector
> Or
> Error in hist.default(x, nclass.scott(x), prob = prob, xlab = xlab, ...) : 
>   'x' must be numeric
> 
> How can it not recognise integers as numbers?
> 
> Thanks
> 
> Ted

'refdata18' is a data frame and the two functions are expecting a
numeric vector.

If you use:

  fitdistr(refdata18[, 1], "negative binomial")

or

  hist(refdata18[, 1])

you should get a suitable result, presuming that the first column in the
data frame is a numeric vector.

Use:

  str(refdata18)

to get a sense for the structure of the data frame, including the column
names, which you could then use, instead of the above index based syntax.

HTH,

Marc Schwartz
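
The distinction between the data frame and the vector inside it can be checked directly. This sketch uses a small made-up data frame, `refdata`, as a stand-in for the poster's refdata18; the column name X0 is borrowed from the str() output he posts later in the thread.

```r
refdata <- data.frame(X0 = c(0L, 0L, 1L, 2L, 5L))  # hypothetical stand-in

is.numeric(refdata)         # FALSE: a data frame is a list of columns
is.numeric(refdata[, 1])    # TRUE: the column itself is an integer vector

# Index-based and name-based extraction return the same vector
identical(refdata[, 1], refdata$X0)
identical(refdata[[1]], refdata[["X0"]])

# Functions that expect a numeric vector now accept the input
h <- hist(refdata$X0, plot = FALSE)
sum(h$counts)
```

Name-based extraction (refdata$X0) tends to be easier to read later than a positional index, at the cost of depending on the column name staying the same.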



------------------------------

Message: 39
Date: Sun, 21 Sep 2008 18:56:48 -0700 (PDT)
From: Ted Byers <r.ted.byers@gmail.com>
Subject: Re: [R] Why isn't R recognising integers as numbers?
To: r-help@r-project.org
Message-ID: <19600695.post@talk.nabble.com>
Content-Type: text/plain; charset=us-ascii


Thanks Jim,

Alas, it wasn't this.  Here is the output from both of your suggestions:
> refdata18 = read.csv("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv",
+     header = TRUE,na.strings="")
> str(refdata18)
'data.frame':   341 obs. of  1 variable:
 $ X0: int  0 0 0 0 0 0 0 0 0 0 ...
> scan("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv", what=0L)
Read 342 items
  [1]  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 [26]  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 [51]  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 [76]  0  0  0  0  0  0  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
[101]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
[126]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
[151]  1  1  1  1  1  1  1  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2
[176]  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  3  3  3  3  3  3  3  3  3  3
[201]  3  3  3  3  3  3  3  3  3  3  3  3  3  4  4  4  4  4  4  4  4  4  4  4  4
[226]  4  4  4  4  4  4  4  5  5  5  5  5  5  5  5  5  6  6  6  6  6  6  6  6  6
[251]  6  6  6  6  6  6  6  6  6  6  6  6  6  6  7  7  7  7  7  7  7  7  7  7  7
[276]  7  7  7  8  8  8  8  9  9  9  9  9  9  9  9  9 10 10 10 10 10 10 10 10 10
[301] 11 11 11 11 11 11 11 11 11 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12
[326] 12 12 12 18 18 18 18 18 18 18 18 18 18 18 18 18 18

Thanks anyway.

Ted
-- 
View this message in context:
http://www.nabble.com/Why-isn%27t-R-recognising-integers-as-numbers--tp19600308p19600695.html
Sent from the R help mailing list archive at Nabble.com.



------------------------------

Message: 40
Date: Sun, 21 Sep 2008 19:09:29 -0700 (PDT)
From: Ted Byers <r.ted.byers@gmail.com>
Subject: Re: [R] Why isn't R recognising integers as numbers?
To: r-help@r-project.org
Message-ID: <19600803.post@talk.nabble.com>
Content-Type: text/plain; charset=us-ascii


Thanks Marc,

That was it. 

For the last 30 years, I'd write my own code, in FORTRAN, C++, or even Java,
to do whatever statistical analysis I needed.  When at the office, sometimes
I could use SAS, but that hasn't been an option for me in years.

This is the first time I have had to load real data into R (instead of
generating random data to use while playing with some of the stats
functions, or manually typing dummy data).

I take it, then, that the result of loading data is a data frame, and not
just a matrix or array.  Using something like "refdata18[, 1]" feels rather
alien, but I'm sure I'll quickly get used to it.  I'd seen it before in the
R docs, but it didn't register that I had to use it to get the functions of
most interest to me to recognise my data as a vector of numbers, given I'd
provided only a vector of integers as input.

Thanks

Ted


-- 
View this message in context:
http://www.nabble.com/Why-isn%27t-R-recognising-integers-as-numbers--tp19600308p19600803.html
Sent from the R help mailing list archive at Nabble.com.



------------------------------

Message: 41
Date: Sun, 21 Sep 2008 21:49:14 -0500
From: Marc Schwartz <marc_schwartz@comcast.net>
Subject: Re: [R] Why isn't R recognising integers as numbers?
To: Ted Byers <r.ted.byers@gmail.com>
Cc: r-help@r-project.org
Message-ID: <48D707AA.6040909@comcast.net>
Content-Type: text/plain; charset=ISO-8859-1

on 09/21/2008 09:09 PM Ted Byers wrote:
> Thanks Marc,
> 
> That was it. 
> 
> For the last 30 years, I'd write my own code, in FORTRAN, C++, or even Java,
> to do whatever statistical analysis I needed.  When at the office, sometimes
> I could use SAS, but that hasn't been an option for me in years.
> 
> This is the first time I have had to load real data into R (instead of
> generating random data to use while playing with some of the stats
> functions, or manually typing dummy data).
> 
> I take it, then, that the result of loading data is a data frame, and not
> just a matrix or array.  Using something like "refdata18[, 1]" feels rather
> alien, but I'm sure I'll quickly get used to it.  I'd seen it before in the
> R docs, but it didn't register that I had to use it to get the functions of
> most interest to me to recognise my data as a vector of numbers, given I'd
> provided only a vector of integers as input.
<snip>

Ted,

If you read the 'Value' section of ?read.csv, it indicates that the
function returns a data frame. It is important to fully read the help
page for new functions so that you understand both how they are used and
the result(s) of their actions, including the 'Notes' section, which can
include further details, including gotchas and idiosyncrasies.

A data frame will be the result of read.csv() even if the data source is
a single column. Think of a data frame in the same way as a spreadsheet
or database table with one or more columns and one or more rows. The
unique aspect of a data frame is that each column can be a different
data type, though that need not be the case.

Thus, you still need to identify the column within the data frame that
you wish to manipulate/analyze further. There are various ways of doing
this, which are covered in Chapter 6 of "An Introduction to R" on Lists
and Data Frames. Some involve the use of indices, others using a column
name, as appropriate. There will be situations where they can be
interchangeable and others where one method will be superior to the
other. Time and experience will provide insight and intuition.

There are a myriad of ways of reading data into R and these are covered
in the Data Import/Export manual. Not all result in a data frame, but in
general and perhaps most commonly, that will be the result.

HTH,

Marc
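
One detail in Ted's earlier output fits the point about reading the help page closely: str() reported 341 observations in a column named X0, while scan() read 342 items. That pattern suggests (as an inference from the posted output, not something confirmed in the thread) that the file has no header row and the first value was consumed as the column name. A sketch with made-up data, where `csv_text` and the column name `count` are illustrative assumptions:

```r
csv_text <- "0\n0\n1\n2\n"   # four values, no header line (made-up data)

# header = TRUE turns the first value into a column name
with_header <- read.csv(text = csv_text, header = TRUE)
names(with_header)
nrow(with_header)

# header = FALSE keeps every value; col.names supplies a name explicitly
no_header <- read.csv(text = csv_text, header = FALSE, col.names = "count")
nrow(no_header)
```

Either call returns a data frame, so the column still has to be extracted (for example no_header$count) before passing it to a function that expects a vector.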



------------------------------

Message: 42
Date: Sun, 21 Sep 2008 19:54:28 -0700 (PDT)

Subject: Re: [R] Calculating interval for conditional/unconditional
    correlation matrix

Message-ID: <19678.53863.qm@web32203.mail.mud.yahoo.com>
Content-Type: text/plain; charset=utf-8

Hi Ana,

There are two problems:

First of all, if you want your matrix to have 4 columns, its number of
elem[[elided Yahoo spam]]

Secondly, and this is what causes your error message, you should not call your
second function matrix. Call it matrix1, my_matrix, whatever. Otherwise R thinks
that you are calling your matrix function within itself.
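
A sketch of the renaming fix, under stated assumptions. The names `corr_matrix` and `det_of_x` are hypothetical. The vector in the quoted code below has 17 values, so one 0.5 is dropped here on the assumption that a symmetric 4 x 4 matrix with x in positions [3,4] and [4,3] was intended. The root brackets are the ones from the poster's second attempt.

```r
# Renamed from 'matrix' so that base::matrix() is no longer masked
corr_matrix <- function(x) {
  matrix(c(1.0, 0.2, 0.5, 0.8,
           0.2, 1.0, 0.5, 0.6,
           0.5, 0.5, 1.0, x,
           0.8, 0.6, x,   1.0),   # 16 values: assumed intended layout
         ncol = 4, byrow = TRUE)
}

# Determinant as a function of the unknown correlation x
det_of_x <- function(x) det(corr_matrix(x))

# The determinant changes sign once inside each bracket
lower.bound <- uniroot(det_of_x, c(0, 0.5))$root
upper.bound <- uniroot(det_of_x, c(0.51, 1))$root
c(lower.bound, upper.bound)
```

Under these assumptions the determinant is negative at both 0 and 1, so a single uniroot() call over the whole interval would stop with an error about end points of the same sign; two brackets are needed. R is also case-sensitive, so the sigmaZZ in the quoted conditional() would have to match the sigmazz defined a line earlier.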




> Subject: [R] Calculating interval for conditional/unconditional
>     correlation matrix
> To: "R" <r-help@r-project.org>
> Received: Sunday, 21 September, 2008, 8:05 PM
> Hi there,
> 
> Could anyone please help me to understand what should be
> done in order not to get this error message: Error:
> evaluation nested too deeply: infinite recursion /
> options(expressions=)?
> 
> Here is my code:
> 
> determinant<-
> function(x){det(matrix(c(1.0,0.2,0.5,0.8,0.2,1.0,0.5,0.6,0.5,0.5,0.5,1.0,x,0.8,0.6,x,1.0),ncol=4,byrow=T))}
> 
> matrix<-
> function(x){(matrix(c(1.0,0.2,0.5,0.8,0.2,1.0,0.5,0.6,0.5,0.5,0.5,1.0,x,0.8,0.6,x,1.0),ncol=4,byrow=T))}
> 
> 
> conditional<-function(x,varcov){
>     varcov<-matrix(x)
>     sigmaxx<-varcov[3,3]
>     sigmaxz<-varcov[3,1:2]
>     sigmayy<-varcov[4,4]
>     sigmayz<-varcov[4,1:2]
>     sigmazx<-varcov[1:2,3]
>     sigmazy<-varcov[1:2,4]
>     sigmazz<-varcov[1:2,1:2]
>    
>     (x-sigmaxz%*%solve(sigmaZZ)%*%sigmazy)/sqrt((sigmaxx-sigmaxz%*%solve(sigmaZZ)%*%sigmazx)*(sigmayy-sigmayz%*%solve(sigmaZZ)%*%sigmazy))}
> 
> interval<-uniroot(determinant,lower = min(c(0,1)), upper
> = max(c(0,1)))
> 
> I tried also with the code below, but got the same Error
> message.
> 
> lower.bound<-uniroot(determinant,c(0,0.5))$root
> upper.bound<-uniroot(determinant,c(0.51,1))$root
> 
> 
> 
> Ana
> 
> 
> 
>      


------------------------------

Message: 43
Date: Sun, 21 Sep 2008 18:26:20 -0500
From: Bingshan Li <bli1@bcm.tmc.edu>
Subject: Re: [R] How to plot "greater than" symbol on the x-axis
To: Gabor Grothendieck <ggrothendieck@gmail.com>
Cc: r-help@r-project.org, John Fox <jfox@mcmaster.ca>
Message-ID: <48D6D81C.40509@bcm.tmc.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Hi Gabor,

This works. This is exactly what I want. According to John Fox's reply, 
I used expression(NA>=1) and it also worked. Thanks for the kind and 
clever help.

Bingshan


Gabor Grothendieck wrote:
> On Sun, Sep 21, 2008 at 11:23 AM, Li, Bingshan <bli1@bcm.tmc.edu> wrote:
>  
>> Hi John,
>>
>> Yes, you are right. I meant "greater-than-or-equal". According to your
>> suggestion, I can plot the symbol only. But what I want is to have >=1,
>> >=2 and so on as labels on the x-axis. I did not make it work. Do you
>> know how to do it? The expression(">=1") did not work, and
>> paste(expression(">="), 1) did not work either.
>>
>>    
>
> Try this:
>
> plot(1:10, xaxt = "n")
> for(i in 1:10) axis(1, i, bquote(phantom(0) >= .(i)))
>
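
The loop above can also be written without a loop by building all the labels as one expression vector. This is an alternative sketch, not what was posted in the thread; the name `labs` is illustrative, and pdf(NULL) is used only so the example runs without opening a graphics window.

```r
pdf(NULL)   # null device: draw without creating a file or window

plot(1:10, xaxt = "n")

# One plotmath expression per tick, e.g. phantom(0) >= 3
labs <- parse(text = paste0("phantom(0) >= ", 1:10))
axis(1, at = 1:10, labels = labs)

invisible(dev.off())
```

As in the loop version, phantom(0) supplies an invisible left-hand operand so that the relational operator is a complete expression. When all labels are drawn in a single axis() call, R may skip labels that would overlap, which is one practical difference from drawing them one at a time.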


------------------------------

Message: 44
Date: Mon, 22 Sep 2008 14:42:15 +1200
From: Paul Murrell <p.murrell@auckland.ac.nz>
Subject: Re: [R] PDF fonts problem
To: Mihalicza P?ter <mihalicza.peter@eski.hu>
Cc: r-help@r-project.org
Message-ID: <48D70607.40301@stat.auckland.ac.nz>
Content-Type: text/plain; charset=UTF-8

Hi


Mihalicza Péter wrote:
> Dear Dr. Murrell,
> 
> 
> Paul Murrell wrote:
>> Hi
>>
>>>
>>> #CMS
>>> pdf("tryfont-cms.pdf", family="CMS")
>>> grid.text("gg\u151hh\uF6ii\uF3jj kk\u171ll\uFCmm\uFAnn")
>>> dev.off()
>>> #u151 and u171 doesn't show, though the other accented ones do
>>>
>>> embedFonts("tryfont-cms.pdf",
>>> outfile="tryfont-cms-embed.pdf",
>>> fontpaths="/cm-super/afm/")
>>> #after embedding the same "slipping" occurs
>>
>> The 'fontpaths' argument describes where the PFB files are, not where
>> the AFM files are.  So this is probably failing to embed the fonts 
>> because it can't find the fonts.  Does it work if you change to 
>> something like ...
>>
>>  embedFonts("tryfont-cms.pdf",
>>            outfile="tryfont-cms-embed.pdf",
>>            fontpaths="cm-super/pfb/")
>>
>> Paul
>>
>>
> This solved my problem, so I am really very grateful! I am not too
> familiar with font protocols.
> Just for the sake of knowledge: if my embedFonts specification should
> not have made any difference, why did the output pdf differ from the
> one before embedding?

Your embedFonts() specification (especially your 'fontpaths' argument)
*did* make a difference.  This function calls ghostscript to perform the
embedding and if ghostscript cannot find the PFB files it cannot embed
the font.  If the PDF file does not have embedded fonts, the PDF reader
will use (substitute) its own fonts and the result can look awful.
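
One way to catch this kind of mistake early is to check that the font directory really contains PFB files before handing it to embedFonts(). The following is only a sketch: embed_with_pfb_check() is a hypothetical helper, not part of grDevices, and the file names in the commented usage line are the ones from this thread.

```r
# Hypothetical helper (not part of grDevices): refuse to call embedFonts()
# unless the font directory actually contains PFB files.
embed_with_pfb_check <- function(infile, outfile, font_dir) {
  pfb_files <- list.files(font_dir, pattern = "\\.pfb$", ignore.case = TRUE)
  if (length(pfb_files) == 0) {
    stop("no .pfb files found in '", font_dir, "'")
  }
  # embedFonts() calls ghostscript, which must be installed separately.
  grDevices::embedFonts(infile, outfile = outfile, fontpaths = font_dir)
}

# Usage (commented out because it needs ghostscript and the cm-super fonts):
# embed_with_pfb_check("tryfont-cms.pdf", "tryfont-cms-embed.pdf",
#                      "cm-super/pfb/")
```

With the AFM directory from the original call, the helper would stop with an error instead of quietly producing a PDF without embedded fonts.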

Paul

> Thanks again,
> Peter
> 
> 
> 
-- 
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
paul@stat.auckland.ac.nz
http://www.stat.auckland.ac.nz/~paul/



------------------------------

Message: 45
Date: Mon, 22 Sep 2008 11:59:13 +0800 (CST)

Subject: [R] Help for R
To: r-help@r-project.org
Message-ID: <339826.80972.qm@web15908.mail.cnb.yahoo.com>
Content-Type: text/plain

Dear R users,

  I've just started learning R and I'm having a problem with it. I was
told the following when I tried to run R:

  Error in loadNamespace(package, c(which.lib.loc, lib.loc), keep.source = keep.source) :
        in 'matlab' methods specified for export, but none defined: sum, size, padarray, flipud, fliplr
  Error: package/namespace load failed for 'matlab'

  Then I tried "package/load in package/matlab"; however, the same
message as above was shown.

  I would appreciate any help and suggestions. Thanks.
  
  Kai

      
    [[alternative HTML version deleted]]



------------------------------

Message: 46
Date: Sun, 21 Sep 2008 23:08:42 -0500
From: "Matthew Pettis" <matthew.pettis@gmail.com>
Subject: [R] Hmisc and Ubuntu (aptitude install)
To: r-help@r-project.org
Message-ID:
    <82ba77b80809212108y6baf7850i8b7d76c54bad160c@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Hi,

I'm trying to get the Hmisc module on my Ubuntu Hardy Heron install.
I tried getting Hmisc from within R by issuing the standard
'install.packages' command, but it said I needed 'gfortran' to
compile.  I thought I could circumvent this by using 'aptitude' to get
the package 'r-cran-hmisc', but when I got it, the package had
critical missing parts (got 404s).  So, I'll be trying to go back and
download 'gfortran', but can anybody tell me if this aptitude ubuntu
package should be kept up to date and is just currently overlooked?

Thanks,
Matt

-- 
It is from the wellspring of our despair and the places that we are
broken that we come to repair the world.
-- Murray Waas



------------------------------

Message: 47
Date: Mon, 22 Sep 2008 00:47:25 -0400
From: "Juliet Hannah" <juliet.hannah@gmail.com>
Subject: [R] adding layers in ggplot2 (data and code included)
To: r-help@r-project.org
Message-ID:
    <93d6f2a80809212147o5c2e8d4co316396bad5f6217e@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Here is some sample data:

mydata <- read.table(textConnection("Est Group    Tri
       0        0 4.639644
       1        0 4.579189
       2        0 4.590714
       0        1 4.443696
       1        1 4.588243
       2        1 4.650505
       0        2 4.296608
       1        2 4.826036
       2        2 4.765386"),header=TRUE);
  closeAllConnections();

I can form two plots, scatter and  lines, as follows:

p <- ggplot(mydata, aes(x=Est, y=Tri))
p + geom_point(aes(colour=factor(Group),shape=factor(Group)))

and

p+ geom_smooth(aes(group=factor(Group),color=factor(Group)),method=lm,se=F).

However, I am unable to have the plots together.

I obtain the following error:
> p + geom_point(aes(colour=factor(Group),shape=factor(Group)))+geom_smooth(aes(group=factor(Group),color=factor(Group)),method=lm,se=F)
Error in `[.data.frame`(df, , var) : undefined columns selected

Thanks,

Juliet



------------------------------

Message: 48
Date: Mon, 22 Sep 2008 17:30:47 +1200
From: Rolf Turner <r.turner@auckland.ac.nz>
Subject: [R] Warnings in fitdistr() from MASS.
To: R-help Forum <r-help@r-project.org>
Message-ID: <80B3F5B3-0EC0-4E72-BEA9-5B40098294AE@auckland.ac.nz>
Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed


For a lark, I experimented a bit with the data from Ted Byers' recent
postings.  The result of fitdistr() seemed sensible, but I was bothered
by the warnings about NaNs that arose.  Warnings always make me nervous.

Explicitly this is what I did:

TXT <- "0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
         0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  
0 0 0 0 0
         0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1  
1 1 1 1 1
         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1  
1 1 1 1 1
         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2  
2 2 2 2 2
         2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3  
3 3 3 3 3
         3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5  
5 5 5 5 5
         6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7  
7 7 7 7 7
         7 7 8 8 8 8 9 9 9 9 9 9 9 9 9 10 10 10 10 10 10 10 10 10 11  
11 11 11
         11 11 11 11 11 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12  
12 12 12
         12 18 18 18 18 18 18 18 18 18 18 18 18 18 18"

x <- scan(textConnection(TXT))
closeAllConnections()
try.x <- fitdistr(x,"negative binomial")

Two warnings about NaNs being produced resulted.

Digging into the code with browser() revealed that in the optimization
process negative values of "size" were tried on occasion, and this was
giving the NaNs.

Basically I'm sending this out so that maybe those who are like me and are
made nervous by warnings will be able to search the archives and find
reassurance that all is actually well.

To keep the warnings from the door, one can set an argument "lower" in the
call to fitdistr, e.g.

    eps <- sqrt(.Machine$double.eps)
    fitdistr(x,"negative binomial",lower=c(eps,eps))

Note that setting lower=c(0,0) doesn't work --- you get an *error* to 
the
[[elided Yahoo spam]]

I also tried building my own local version of fitdistr() which had

    if(distname == "negative binomial" && is.null(Call$lower))
             Call$lower <- rep(sqrt(.Machine$double.eps),2)

just after the assignment ``Call$hessian <- TRUE''.  This *seemed*
to work (i.e. prevent those nervous-making warnings and still give
the right answer).
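
The mechanism can be reproduced in isolation, as a minimal sketch. It first shows that the negative binomial density returns NaN for a negative "size", and then repeats the bounded fit. The vector x is rebuilt from the frequencies of the values in TXT above; the MASS call is guarded in case the package is not installed.

```r
# dnbinom() is only defined for size > 0; a negative trial value yields NaN
# together with a "NaNs produced" warning (suppressed here).
bad <- suppressWarnings(dnbinom(3, size = -1, mu = 2))
is.nan(bad)

# The same 342 counts as in TXT, rebuilt compactly from their frequencies.
x <- rep(c(0:12, 18),
         times = c(81, 76, 33, 23, 19, 9, 23, 14, 4, 9, 9, 9, 19, 14))

# A small positive lower bound keeps the optimizer away from size <= 0.
eps <- sqrt(.Machine$double.eps)
if (requireNamespace("MASS", quietly = TRUE)) {
  fit <- MASS::fitdistr(x, "negative binomial", lower = c(eps, eps))
  print(fit)
}
```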

    cheers,

        Rolf Turner





------------------------------

Message: 49
Date: Mon, 22 Sep 2008 08:01:23 +0200
From: Peter Dalgaard <p.dalgaard@biostat.ku.dk>
Subject: Re: [R] Why isn't R recognising integers as numbers?
To: Ted Byers <r.ted.byers@gmail.com>
Cc: r-help@r-project.org
Message-ID: <48D734B3.5090107@biostat.ku.dk>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Ted Byers wrote:
> Thanks Jim,
>
> Alas, it wasn't this.  Here is the output from both of your suggestions:
>
>> refdata18 = read.csv("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv",
>> header = TRUE,na.strings="")
>> str(refdata18)
>
> 'data.frame':   341 obs. of  1 variable:
>  $ X0: int  0 0 0 0 0 0 0 0 0 0 ...

Ummm, is there a header line or not? If there isn't, read.csv is going
to eat the first observation thinking it is a name (and since it is
non-syntactic add an X in front).

The scan command looks fine, you just should have assigned it somewhere,
x <- scan(......) and then fitdistr(x, ....)

>> scan("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv", what=0L)
>
> Read 342 items
>   [1]  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 
> 0  0
>  [26]  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 
> 0  0
>  [51]  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 
> 0  0
>  [76]  0  0  0  0  0  0  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
> 1  1
> [101]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
> 1  1
> [126]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
> 1  1
> [151]  1  1  1  1  1  1  1  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2 
> 2  2
> [176]  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  3  3  3  3  3  3  3  3 
> 3  3
> [201]  3  3  3  3  3  3  3  3  3  3  3  3  3  4  4  4  4  4  4  4  4  4  4 
> 4  4
> [226]  4  4  4  4  4  4  4  5  5  5  5  5  5  5  5  5  6  6  6  6  6  6  6 
> 6  6
> [251]  6  6  6  6  6  6  6  6  6  6  6  6  6  6  7  7  7  7  7  7  7  7  7 
> 7  7
> [276]  7  7  7  8  8  8  8  9  9  9  9  9  9  9  9  9 10 10 10 10 10 10 10
> 10 10
> [301] 11 11 11 11 11 11 11 11 11 12 12 12 12 12 12 12 12 12 12 12 12 12 12
> 12 12
> [326] 12 12 12 18 18 18 18 18 18 18 18 18 18 18 18 18 18
>
>  
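
The header effect described above (341 observations from read.csv against 342 items from scan) can be reproduced on a tiny in-memory file. This is a sketch with made-up data; csv_text and the column name "count" are illustrative only.

```r
csv_text <- "0\n0\n1\n2\n5"   # five observations, no header line

# header = TRUE consumes the first value as a column name ("0" becomes "X0").
with_header <- read.csv(text = csv_text, header = TRUE)
names(with_header)   # "X0"
nrow(with_header)    # 4

# header = FALSE keeps every observation.
no_header <- read.csv(text = csv_text, header = FALSE, col.names = "count")
nrow(no_header)      # 5

# scan() returns a plain integer vector, ready for fitdistr(x, ...).
x <- scan(text = csv_text, what = 0L, quiet = TRUE)
```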
-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)              FAX: (+45) 35327907



------------------------------

Message: 50
Date: Sun, 21 Sep 2008 23:10:53 -0700
From: Eric <rmailbox@justemail.net>
Subject: Re: [R] adding layers in ggplot2 (data and code included)
To: r-help@r-project.org
Message-ID: <48D736ED.20904@justemail.net>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed


The way you've attempted to get this result seems to align with the way 
R "should" work, but it fails in this case.
The fix is to break things up a little bit:

p <- ggplot(mydata, aes(x=Est, y=Tri))
p <- p + geom_point(aes(colour=factor(Group),shape=factor(Group)))
p <- p + 
geom_smooth(aes(group=factor(Group),color=factor(Group)),method=lm,se=F)
p
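
Another arrangement that may be worth trying, offered as a sketch rather than a confirmed fix for the error above: convert Group to a factor once in the data, so that no aes() call needs factor(Group). The ggplot2 part is guarded in case the package is not installed, and the behaviour may differ between ggplot2 versions.

```r
mydata <- read.table(header = TRUE, text = "
Est Group Tri
0 0 4.639644
1 0 4.579189
2 0 4.590714
0 1 4.443696
1 1 4.588243
2 1 4.650505
0 2 4.296608
1 2 4.826036
2 2 4.765386")

# Convert once, so both layers can refer to the plain column name.
mydata$Group <- factor(mydata$Group)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(mydata, aes(x = Est, y = Tri, colour = Group)) +
    geom_point(aes(shape = Group)) +
    geom_smooth(method = "lm", se = FALSE)
  # print(p) would draw the combined plot on the active device
}
```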


Eric



Juliet Hannah wrote:
> Here is some sample data:
>
> mydata <- read.table(textConnection("Est Group    Tri
>        0        0 4.639644
>        1        0 4.579189
>        2        0 4.590714
>        0        1 4.443696
>        1        1 4.588243
>        2        1 4.650505
>        0        2 4.296608
>        1        2 4.826036
>        2        2 4.765386"),header=TRUE);
>   closeAllConnections();
>
> I can form two plots, scatter and  lines, as follows:
>
> p <- ggplot(mydata, aes(x=Est, y=Tri))
> p + geom_point(aes(colour=factor(Group),shape=factor(Group)))
>
> and
>
> p + geom_smooth(aes(group=factor(Group),color=factor(Group)),method=lm,se=F).
>
> However, I am unable to have the plots together.
>
> I obtain the following error:
>
>  
>> p + geom_point(aes(colour=factor(Group),shape=factor(Group)))+geom_smooth(aes(group=factor(Group),color=factor(Group)),method=lm,se=F)
> Error in `[.data.frame`(df, , var) : undefined columns selected
>
> Thanks,
>
> Juliet
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


------------------------------

Message: 51
Date: Sun, 21 Sep 2008 23:17:27 -0700
From: <rkevinburton@charter.net>
Subject: [R] Time series (ts) questions.
To: r-help@r-project.org
Message-ID: <20080922021727.S31BZ.130479.root@mp16>
Content-Type: text/plain; charset=utf-8

I have been working with the base time series object (ts) and I had a couple of
questions that hopefully this group can help me with:

1) What is the best way to append an observation to an existing time series?
Suppose I have a time series:

t <- ts(1:12, frequency=5)

This would generate two complete cycles and one remainder. Now I would like to
append an observation to this time series. I could use 'c', but then I
would need to rebuild the whole time series and I would need to know the
frequency etc. I would like some operation like '+' that would simply
append the value to the end of the time series (incrementing the 'last' time
value so things like cycle() still output the correct values), but alas

t + 10

is already taken by an equally useful operation: adding 10 to each element in
the time series (rather than, in this case, appending ts(10,frequency) with a
time value of 13 to the time series).

2) What is the best way to get the last time value in a time series? I can do
something like:

(start(t)[2] - 1) + (end(t)[1]-1) * frequency(t) + end(t)[2]

But there has to be an easier way.
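
Both questions can be approached with functions from the stats package, as a minimal sketch. append_ts() is a hypothetical helper written for this illustration, and the series is called y rather than t so that base::t() is not masked.

```r
y <- ts(1:12, frequency = 5)

# Hypothetical helper: append values while keeping start and frequency.
append_ts <- function(series, values) {
  ts(c(series, values), start = start(series), frequency = frequency(series))
}

y2 <- append_ts(y, 10)
end(y2)                  # c(3, 3): third cycle, third position
cycle(y2)[length(y2)]    # 3

# The last time value of the original series, in three forms:
tsp(y)[2]                # end time in time units
end(y)                   # the same instant as c(cycle, position)
time(y)[length(y)]       # time stamp of the final observation
```

The helper still rebuilds the series internally, but it reads the start and frequency from the existing object, so the caller does not need to know them.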

Thank you.

Kevin



------------------------------

Message: 52
Date: Mon, 22 Sep 2008 08:43:07 +0200
From: "PALMIER Patrick - CETE NP/INFRA/TRF"
    <Patrick.Palmier@developpement-durable.gouv.fr>
Subject: [R] Matrix balancing on margins
To: r-help@r-project.org
Message-ID: <48D73E7B.4020702@developpement-durable.gouv.fr>
Content-Type: text/plain

Hello,

Is there any package in R for balancing a matrix?

I want to estimate a matrix given

    * an initial matrix (1 everywhere, for example)
    * row margins
    * column margins
    * a distance class vector (each cell of the matrix belongs to a
      distance class), and I want the distribution across distance
      classes to be preserved

How can I do such a thing?
Is there any function already existing or should I compute an iterative 
script myself?
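
An iterative script of this kind can be sketched with iterative proportional fitting, scaling rows, columns and distance classes in turn until the margins match. This is only one possible approach, assumed here for illustration; balance_matrix() is a hypothetical helper, the toy targets are taken from a made-up matrix so that they are mutually consistent, and class_target must be named by class label.

```r
# Hypothetical helper: iterative proportional fitting on three sets of totals.
# cls is a matrix assigning each cell to a distance class.
balance_matrix <- function(seed, row_target, col_target, cls, class_target,
                           tol = 1e-10, max_iter = 5000) {
  m <- seed
  for (iter in seq_len(max_iter)) {
    m <- m * (row_target / rowSums(m))                # match row totals
    m <- sweep(m, 2, col_target / colSums(m), "*")    # match column totals
    class_sums <- tapply(as.vector(m), as.vector(cls), sum)
    scale_k <- class_target[names(class_sums)] / as.vector(class_sums)
    m <- m * unname(scale_k[as.character(cls)])       # match class totals
    err <- max(abs(rowSums(m) - row_target), abs(colSums(m) - col_target))
    if (err < tol) break
  }
  m
}

# Toy data: targets come from a known matrix, so a solution exists.
truth <- matrix(c(4, 2, 1,
                  2, 5, 3,
                  1, 2, 6), nrow = 3, byrow = TRUE)
cls <- abs(row(truth) - col(truth)) + 1     # distance class 1, 2 or 3
class_tab <- tapply(as.vector(truth), as.vector(cls), sum)
class_target <- setNames(as.vector(class_tab), names(class_tab))

fit <- balance_matrix(matrix(1, 3, 3), rowSums(truth), colSums(truth),
                      cls, class_target)
```

The result matches all three sets of totals but is not necessarily equal to the matrix the targets were taken from.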

Thanks

-- 

Patrick PALMIER
Centre d'Études Techniques de l'Équipement Nord - Picardie
Département Infrastructures
Trafic -- Socio-économie
2, rue de Bruxelles, BP 275
59019 Lille cedex
FRANCE
Tél: +33 (0) 3 20 49 60 70
Fax: +33 (0) 3 20 49 63 69


    [[alternative HTML version deleted]]



------------------------------

Message: 53
Date: Sun, 21 Sep 2008 23:48:47 -0700 (PDT)
From: Mark Difford <mark_difford@yahoo.co.uk>
Subject: Re: [R] Variable Selection for data reduction and
    discriminant anlaysis
To: r-help@r-project.org
Message-ID: <19602702.post@talk.nabble.com>
Content-Type: text/plain; charset=us-ascii


Hi Gareth,

>> My data is transformed to the clr or alr under Aitchison geometry, so I
>> am essentially working in Euclidean space.

Great: glad to hear it.
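
For readers who have not met the clr (centred log-ratio) transform, here is a minimal sketch in base R. clr_transform() is a hypothetical helper written for this illustration, not the function from the compositions package, and the two compositions are made up.

```r
# Hypothetical helper: centred log-ratio transform, one composition per row.
clr_transform <- function(comp) {
  log_comp <- log(comp)
  log_comp - rowMeans(log_comp)   # subtract each row's mean log
}

comp <- matrix(c(0.2, 0.3, 0.5,
                 0.1, 0.6, 0.3), nrow = 2, byrow = TRUE)
z <- clr_transform(comp)
rowSums(z)   # each row sums to zero, up to rounding
```

Because only ratios matter, rescaling a composition (for example from proportions to percentages) leaves the transformed values unchanged.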

>> Has anyone had experience doing stepwise LDA??  I can't for the life of
>> me find any help online about where to start.

A better option might be this: Trevor Hastie and a student of his have
recently put out a paper that does a step-up from penalized discriminant
analysis based, I think, on Trevor's sparse principal component analysis
method (in his elasticnet package).

http://www-stat.stanford.edu/~hastie/Papers/sda_line.pdf

You can get R-code to do the analysis on the first author's website;
there's a link in the paper.

Bye, Mark.


gcam032 wrote:
>
> Thanks Mark,
>
> I failed to mention that I'm working within a compositional framework. I
> didn't want to confuse things.  My data is transformed to the clr or alr
> under Aitchison geometry, so I am essentially working in Euclidean space.
>
> Has anyone had experience doing stepwise LDA??  I can't for the life of me
> find any help online about where to start.
> 
> Thanks
> 
> Gareth
> 
> 
> Mark Difford wrote:
> Hi Gareth,
> 
>>> If I use the full composition (31 elements or variables), I can get
>>> reasonable separation of my 6 sources.
> 
> A word of advice: You need to be exceptionally careful when analyzing
> compositional data. Taking compositions puts your data values into a
> constrained/bounded space (generally called a simplex) so that most
> standard statistical procedures (i.e. anything that uses a Euclidean
> metric, and most do) deliver erroneous results. Pearson wrote a paper on
> this long ago, but it's generally been ignored (except by Aitchison and
> the Spanish School of mathematical statisticians).
> 
> The problem is comparatively well known to geologists, who work with
> compositional data much of the time. R has a very good package for analysing
> this data-type: see the compositions package  (a new release seems
> imminent). You will be able to get most of the main references from it.
> (The authors of the package also have a newly-released article in one of
> the Elsevier journals [unfor. my bib+ are elsewhere so I cannot give
> details]).
> 
> You could start by Wiki'ing your way to "compositional data".
> 
> HTH, Mark.
> 
> 
> 
> Gareth Campbell wrote:
>> 
>> Hello all,
>> 
>> I'm dealing with geochemical analyses of some rocks.
>> 
>> If I use the full composition (31 elements or variables), I can get
>> reasonable separation of my 6 sources.  Then when I go on to do LDA with
>> the 6 groups, I get excellent separation.
>>
>> I feel like I should be reducing the variables to those that are providing
>> the most discrimination between the groups as this is important
>> information for me.  I struggle to interpret the PCA plot in a way that
>> helps me (due to the large number of elements).  So I'm trying to do some
>> sort of step-wise variable selection.
>>
>> I would love to hear from someone (possibly a geochemist or similar) who
>> does this regularly to determine the best course of action in R to do
>> this.
>> 
>> 
>> Thanks very much
>> 
>> 
>> -- 
>> Gareth Campbell
>> PhD Candidate
>> The University of Auckland
>> 
>> P +649 815 3670
>> M +6421 256 3511
>> E gareth.campbell@esr.cri.nz
>> gcam032@gmail.com
>> 
>>     [[alternative HTML version deleted]]
>> 
>> 
>> 
> 
> 


-- 
View this message in context:
http://www.nabble.com/Variable-Selection-for-data-reduction-and-discriminant-anlaysis-tp19591270p19602702.html
Sent from the R help mailing list archive at Nabble.com.



------------------------------

Message: 54
Date: Mon, 22 Sep 2008 08:50:21 +0200
From: "José E. Lozano" <lozalojo@jcyl.es>
Subject: [R] Manage huge database
To: <r-help@stat.math.ethz.ch>
Message-ID: <47A455630022D55E@mtacsbs.csbs.jcyl.es> (added by
    postmaster@jcyl.es)
Content-Type: text/plain

Hello,

Recently I have been trying to open a huge database with no success.

It’s a 4GB csv plain text file with around 2000 rows and over 500,000
columns/variables.

I have tried The SAS System, but it reads only around 5000 columns, no
more. R hangs when opening it.

Is there any way to work with “parts” (a set of columns) of this database,
since it’s impossible to manage it all at once?

Is there any way to establish a link to the csv file and to state the
columns you want to fetch every time you make an analysis?

I’ve been searching the net, but found little about this topic.

Best regards,

Jose Lozano


    [[alternative HTML version deleted]]



------------------------------

Message: 55
Date: Mon, 22 Sep 2008 08:08:20 +0100
From: "Barry Rowlingson" <b.rowlingson@lancaster.ac.uk>
Subject: Re: [R] Manage huge database
To: "José E. Lozano" <lozalojo@jcyl.es>
Cc: r-help@stat.math.ethz.ch
Message-ID:
    <d8ad40b50809220008r73daa11fi5d6b845fc1ca3d04@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

2008/9/22 José E. Lozano <lozalojo@jcyl.es>:
> Recently I have been trying to open a huge database with no success.
>
> It's a 4GB csv plain text file with around 2000 rows and over 500,000
> columns/variables.

I wouldn't call a 4GB csv text file a 'database'.

> Is there any way to work with "parts" (a set of columns) of this database,
> since its impossible to manage it all at once?

Yes, use a database. A real database.

> Is there any way to establish a link to the csv file and to state the
> columns you want to fetch every time you make an analysis?

No, but you can establish a link to a database. You want a database.
A real relational database.

> I've been searching the net, but found little about this topic.

Try:
http://cran.r-project.org/doc/manuals/R-data.html#Relational-databases
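
Short of a database, read.csv() can at least skip unwanted columns while parsing, through its colClasses argument. The sketch below uses a small temporary file with made-up column names; whether the approach stays practical with 500,000 columns has not been checked here.

```r
# A small stand-in for the real file.
csv_path <- tempfile(fileext = ".csv")
full <- data.frame(id = 1:3, a = c(1.5, 2.5, 3.5),
                   b = c("x", "y", "z"), c = 4:6)
write.csv(full, csv_path, row.names = FALSE)

# Read only the header row to learn the column names.
header <- names(read.csv(csv_path, nrows = 1))

# "NULL" tells read.csv to skip a column; NA lets it guess the type.
wanted <- c("id", "c")
col_classes <- ifelse(header %in% wanted, NA, "NULL")

subset_df <- read.csv(csv_path, colClasses = col_classes)
names(subset_df)   # "id" "c"
```

The file is still scanned in full on every call, so this saves memory rather than reading time.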

Barry



------------------------------

Message: 56
Date: Mon, 22 Sep 2008 09:16:52 +0200
From: Martin Maechler <maechler@stat.math.ethz.ch>
Subject: Re: [R] Symmetric matrix
To: Dimitris Rizopoulos <d.rizopoulos@erasmusmc.nl>

Message-ID: <18647.18020.828088.828816@stat.math.ethz.ch>
Content-Type: text/plain; charset=us-ascii

>>>>> "DR" == Dimitris Rizopoulos <d.rizopoulos@erasmusmc.nl>
>>>>>     on Sun, 21 Sep 2008 19:58:44 +0200 writes:
    DR> try the following
    DR> a <- matrix(rnorm(36), 6)
    DR> ind <- lower.tri(a)
    DR> a[ind] <- t(a)[ind]
    DR> a

Yes, indeed, it needs the t(.) trick.
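
Why the transpose is needed can be seen by comparing it with the direct assignment, as a small sketch. The names naive and sym are illustrative, and the seed is arbitrary.

```r
set.seed(42)
a <- matrix(rnorm(36), 6)

# Direct assignment: both index sets are walked column by column, so the
# i-th lower-triangle cell is generally not the mirror of the i-th upper one.
naive <- a
naive[lower.tri(naive)] <- naive[upper.tri(naive)]

# Transposing first lines the two triangles up cell by cell.
sym <- a
ind <- lower.tri(sym)
sym[ind] <- t(sym)[ind]

isSymmetric(naive)   # FALSE
isSymmetric(sym)     # TRUE
```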

Note that 'Matrix' package has a function  forceSymmetric(.) to
do this for you (faster, using C code):

A <- forceSymmetric(Matrix(rnorm(36), 6))

is all you'd need {if can afford to trash half of the random
                      numbers generated}

Martin Maechler, ETH Zurich


    DR> I hope it helps.

    DR> Best,
    DR> Dimitris


    DR> Megh Dal wrote:
    >> I have following matrix :
    >> 
    >> a = matrix(rnorm(36), 6)
    >> 
    >> Now I want to replace the lower-triangular elements with its
    >> upper-triangular elements. That is, I want to make a symmetric matrix
    >> from a. I have tried the lower.tri() and upper.tri() functions, but did
    >> not get the desired result. Can anyone please tell me how to do that?



------------------------------

Message: 57
Date: Mon, 22 Sep 2008 15:35:04 +0800
From: "Yihui Xie" <xieyihui@gmail.com>
Subject: Re: [R] Manage huge database
To: "José E. Lozano" <lozalojo@jcyl.es>
Cc: r-help@stat.math.ethz.ch
Message-ID:
    <89b6b8c90809220035o3f702624p34cb83000ad6b39f@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Hi,

You can treat it as a database and use ODBC to fetch data from the CSV
file using SQL. See the package RODBC for details about database
connections. (I have dealt with similar problems before with RODBC)

Regards,
Yihui
--
Yihui Xie <xieyihui@gmail.com>
Phone: +86-(0)10-82509086 Fax: +86-(0)10-82509086
Mobile: +86-15810805877
Homepage: http://www.yihui.name
School of Statistics, Room 1037, Mingde Main Building,
Renmin University of China, Beijing, 100872, China



On Mon, Sep 22, 2008 at 2:50 PM, José E. Lozano <lozalojo@jcyl.es> wrote:
> Hello,
>
> Recently I have been trying to open a huge database with no success.
>
> It's a 4GB csv plain text file with around 2000 rows and over 500,000
> columns/variables.
>
> I have tried The SAS System, but it reads only around 5000 columns, no
> more. R hangs when opening it.
>
> Is there any way to work with "parts" (a set of columns) of this database,
> since it's impossible to manage it all at once?
>
> Is there any way to establish a link to the csv file and to state the
> columns you want to fetch every time you make an analysis?
>
> I've been searching the net, but found little about this topic.
>
> Best regards,
>
> Jose Lozano
>
>        [[alternative HTML version deleted]]
>


------------------------------

Message: 58
Date: Mon, 22 Sep 2008 09:49:09 +0200
From: "José E. Lozano" <lozalojo@jcyl.es>
Subject: Re: [R] Manage huge database
To: "'Yihui Xie'" <xieyihui@gmail.com>
Cc: r-help@stat.math.ethz.ch
Message-ID: <47A455630022DC1E@mtacsbs.csbs.jcyl.es> (added by
    postmaster@jcyl.es)
Content-Type: text/plain;    charset="iso-8859-1"

Hello, Yihui

> You can treat it as a database and use ODBC to fetch data from the CSV
> file using SQL. See the package RODBC for details about database
> connections. (I have dealt with similar problems before with RODBC)

Thanks for your tip. I have used RODBC before to read data from MSAccess and
MSExcel files, but I never imagined it could work for non-database files
such as csv.

I will check the RODBC documentation.

Best Regards,
Jose Lozano

------------------------------------------
Jose E. Lozano Alonso
Observatorio de Salud Pública.
Direccion General de Salud Pública e I+D+I.
Junta de Castilla y León.
Direccion: Paseo de Zorrilla, nº1. Despacho 3103. CP 47071. Valladolid.



------------------------------

Message: 59
Date: Mon, 22 Sep 2008 10:02:18 +0200
From: "José E. Lozano" <lozalojo@jcyl.es>
Subject: Re: [R] Manage huge database
To: "'Barry Rowlingson'" <b.rowlingson@lancaster.ac.uk>
Cc: r-help@stat.math.ethz.ch
Message-ID: <47A455630022DD57@mtacsbs.csbs.jcyl.es> (added by
    postmaster@jcyl.es)
Content-Type: text/plain;    charset="iso-8859-1"

> I wouldn't call a 4GB csv text file a 'database'.

Obviously, a csv is not a database itself. What I tried to say (though it
seems I was not understood) is that I had a huge database, exported to a csv
file by the people who created it (and I don't have any idea of the original
format of the database).

> Yes, use a database. A real database.

I've used MSAccess and there is a limit of 255 columns, as far as I know, so
there is no way of importing it. Obviously, I won't buy an Oracle license to
read this file, so: what database system allows a table with 500000 variables?
MySQL? Do I have to split the file into smaller parts, import them into tables, and
relate them all using an index field?
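
One layout that avoids the column limit altogether, offered only as a sketch of the idea and not as something suggested in this thread, is a "long" table with one row per (record, variable) pair. Such a table has three columns however many variables there are. The data and names below are made up.

```r
# A tiny "wide" table standing in for the real file.
wide <- data.frame(id = 1:2,
                   v1 = c(0.1, 0.2),
                   v2 = c(1.1, 1.2),
                   v3 = c(2.1, 2.2))

value_cols <- setdiff(names(wide), "id")

# One row per (id, variable) pair; unlist() stacks the columns in order.
long <- data.frame(
  id       = rep(wide$id, times = length(value_cols)),
  variable = rep(value_cols, each = nrow(wide)),
  value    = unlist(wide[value_cols], use.names = FALSE)
)

# Fetching a set of variables becomes a row filter (a WHERE clause in SQL).
long[long$variable %in% c("v1", "v3"), ]
```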

> No, but you can establish a link to a database. You want a database.
> A real relational database.
> Try:
> http://cran.r-project.org/doc/manuals/R-data.html#Relational-databases

It didn't help, sorry. I know perfectly well what a relational database is
(and I humbly consider myself an advanced user of MSAccess+VBA; it is only
that I've never faced this problem with variables). You should not suppose
everyone's stupid, though...

Thanks for your help,
Best regards
Jose Lozano



------------------------------

Message: 60
Date: Fri, 22 Aug 2008 09:15:20 +0100
From: Robin Hankin <rksh1@cam.ac.uk>
Subject: Re: [R] how to keep up with R?
To: a.ramasamy@imperial.ac.uk
Cc: r-help <r-help@stat.math.ethz.ch>,    Barry Rowlingson
    <b.rowlingson@lancaster.ac.uk>
Message-ID: <48AE7598.7060209@cam.ac.uk>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Adaikalavan Ramasamy wrote:
> I agree! The best way to learn (and remember for longer) is to teach
> someone else about it.
>
> And there is no reason not to repeat some of the analysis done on SAS
> with R. That way you can verify your outputs or compare the
> presentations. If you consistently find differences in the outputs,
> then trying to figure out the reason may lead you to better understand
> the methods (e.g. different optimization or estimation procedures).
>
My take on this:

I have repeatedly found that it is surprisingly easy to improve on
existing (non-R) implementations of statistical and non-statistical
computation, when working in R.

Something about the structure of the language, something about the
package mechanism, something about R-help, something about R-core,
something about open-source, something about JSS or R-news, whatever it
is, there is SOMETHING ABOUT R which lends itself to straightforward
production of quality software.  And that something is missing from other
programming languages, IMO.



rksh


> Regards, Adai
>
>
>
> Barry Rowlingson wrote:
>> 2008/9/19 Wensui Liu <liuwensui@gmail.com>:
>>> Dear Listers,
>>>
>>> I've been a big fan of R since graduate school. After working in the
>>> industry for years, I haven't had many opportunities to use R and am
>>> mainly using SAS. However, I am still forcing myself really hard to stay
>>> close to R by reading R-help and books and writing R code by myself for
>>> fun. But by and by, I start realizing I have a hard time keeping up with
>>> R and am afraid that I would totally forget how to program in R.
>>>
>>> I really like it and am very unwilling to give it up. Is there any idea
>>> how I might keep in touch with R without using it at work on a daily
>>> basis? I would really appreciate it.
>>>
-- 
Robin K. S. Hankin
Senior Research Associate
Cambridge Centre for Climate Change Mitigation Research (4CMR)
Faculty of Economics
The University of Cambridge
rksh1@cam.ac.uk
01223-764877



------------------------------

Message: 61
Date: Mon, 22 Sep 2008 09:30:52 +0100 (BST)
From: (Ted Harding) <Ted.Harding@manchester.ac.uk>
Subject: Re: [R] Why isn't R recognising integers as numbers?
To: Ted Byers <r.ted.byers@gmail.com>
Cc: r-help@r-project.org
Message-ID: <XFMail.080922093052.Ted.Harding@manchester.ac.uk>
Content-Type: text/plain; charset=iso-8859-1

Hi Ted (from Ted),
Just to clarify Marc's comments about dataframes in more basic terms.

If you read in data with read.csv() the result returned by the function
is a dataframe. This is a specialised kind of list, which you can think
of as a list of "columns" all of the same length. You can think of each
"column" as a vector of elements, all of which must be of the same type
within the column, though the type can vary (e.g. numeric, factor,
character) between columns. When you display a dataframe, it looks like
a matrix, though in R terms it is not really a matrix; it is a list,
where each component of the list is a "column".

Of course a dataframe, like any list, might have only one component.
But it is still a list -- and the actual contents are only available
"one layer down", after you have extracted that component by some
means (e.g. by using the "$" extractor). Simple example:

  L <- c(1,2,3,4)         ## vector
  L
# [1] 1 2 3 4
  L.df <- data.frame(L=L) ## Dataframe with 1 component named "L"
  L.df
#   L
# 1 1
# 2 2
# 3 3
# 4 4
  L.df$L                  ## Extract the component named "L"
# [1] 1 2 3 4             ## Compare with the result of 'L' above

# Try a regression on L (this works):
  lm(L ~ 1)
# Call:
# lm(formula = L ~ 1)
# Coefficients:
# (Intercept)  
#         2.5  

# Try a regression on L.df (this doesn't work):
  lm(L.df ~ 1)
# Error in model.frame.default(formula = L.df ~ 1,
#   drop.unused.levels = TRUE) : 
#   invalid type (list) for variable 'L.df'

# But it does after you refer to the component L by name:
  lm(L.df$L ~ 1)
# Call:
# lm(formula = L.df$L ~ 1)
# Coefficients:
# (Intercept)  
#         2.5  

# or:
  lm(L ~ 1, data=L.df)
# Call:
# lm(formula = L ~ 1, data = L.df)
# Coefficients:
# (Intercept)  
#         2.5  

# But you can (for a dataframe, not a general list) use an "index"
# method of extraction *as if* it were a matrix (even though it isn't):

  L.df[,1]
# [1] 1 2 3 4
  L.df[3,1]
# [1] 3

# But compare with:
  L.df[1]
#   L
# 1 1
# 2 2
# 3 3
# 4 4

which is essentially the same as L.df itself (e.g. lm(L.df[1] ~ 1)
will not work in exactly the same way as lm(L.df ~ 1) didn't work).
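
The distinction can also be checked directly with the type predicates, which is essentially the test that functions such as fitdistr() apply to their input. A short sketch, reusing the one-column dataframe from above with integer values:

```r
L.df <- data.frame(L = c(1L, 2L, 3L, 4L))

is.list(L.df)             # TRUE:  a dataframe is a list of columns
is.numeric(L.df)          # FALSE: the list itself is not a numeric vector
is.numeric(L.df$L)        # TRUE:  the extracted column is
is.data.frame(L.df[1])    # TRUE:  single brackets keep the dataframe wrapper

# Three equivalent ways to reach the underlying vector:
identical(L.df$L, L.df[[1]])
identical(L.df$L, L.df[, 1])
```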

The dataframe structure exists in R because so much data is typically
in the row by column (case by variables) layout such as you get in
spreadsheets and associated CSV files, and it is very useful to be
able to get into this layout directly (and refer to the variables
by name, as above).

The full generality of a 'list' can also be useful for encapsulating
data of a less strictly structured kind, but that is another (longer)
story!

Hoping this helps.
Ted.


On 22-Sep-08 02:09:29, Ted Byers wrote:
> Thanks Marc,
> That was it.
>
> For the last 30 years, I'd write my own code, in FORTRAN, C++,
> or even Java, to do whatever statistical analysis I needed.
> When at the office, sometimes I could use SAS, but that hasn't
> been an option for me in years.
>
> This is the first time I have had to load real data into R
> (instead of generating random data to use while playing with
> some of the stats functions, or manually typing dummy data).
>
> I take it, then, that the result of loading data is a data
> frame, and not just a matrix or array. Using something like
> "refdata18[, 1]" feels rather alien, but I'm sure I'll quickly
> get used to it.  I'd seen it before in the R docs, but it didn't
> register that I had to use it to get the functions of most
> interest to me to recognise my data as a vector of numbers,
> given I'd provided only a vector of integers as input.
> 
> Thanks
> 
> Ted
> 
> 
> Marc Schwartz wrote:
>> 
>> on 09/21/2008 08:01 PM Ted Byers wrote:
>>> I have a number of files containing anywhere from a few dozen to a
>>> few thousand integers, one per record.
>>> 
>>> The statement "refdata18 <-
>>> read.csv("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv", header =
>>> TRUE,na.strings="")" works fine, and if I type refdata18, I get the
>>> integers displayed, one value per record (along with a record number).
>>> However, when I try "fitdistr(refdata18,"negative binomial")", or
>>> hist.scott(refdata18, prob = TRUE), I get an error:
>>> 
>>> Error in fitdistr(refdata18, "negative binomial") : 
>>>   'x' must be a non-empty numeric vector
>>> Or
>>> Error in hist.default(x, nclass.scott(x), prob = prob, xlab = xlab, ...) : 
>>>   'x' must be numeric
>>> 
>>> How can it not recognise integers as numbers?
>>> 
>>> Thanks
>>> 
>>> Ted
>> 
>> 'refdata18' is a data frame and the two functions are expecting a
>> numeric vector.
>> 
>> If you use:
>> 
>>   fitdistr(refdata18[, 1], "negative binomial")
>> 
>> or
>> 
>>   hist(refdata18[, 1])
>> 
>> you should get a suitable result, presuming that the first column in
>> the data frame is a numeric vector.
>> 
>> Use:
>> 
>>   str(refdata18)
>> 
>> to get a sense for the structure of the data frame, including the
>> column names, which you could then use, instead of the above index
>> based syntax.
>> 
>> HTH,
>> 
>> Marc Schwartz
>> 
> 
> -- 
> View this message in context:
> http://www.nabble.com/Why-isn%27t-R-recognising-integers-as-numbers--tp19600308p19600803.html
> Sent from the R help mailing list archive at Nabble.com.
--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding@manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 22-Sep-08                                       Time: 09:30:47
------------------------------ XFMail ------------------------------



------------------------------

Message: 62
Date: Mon, 22 Sep 2008 09:41:30 +0100
From: "Barry Rowlingson" <b.rowlingson@lancaster.ac.uk>
Subject: Re: [R] Manage huge database
To: "José E. Lozano" <lozalojo@jcyl.es>
Cc: r-help@stat.math.ethz.ch
Message-ID:
    <d8ad40b50809220141l5274bf8fw29d36784de519eab@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

2008/9/22 José E. Lozano <lozalojo@jcyl.es>:
>> I wouldn't call a 4GB csv text file a 'database'.
> It didn't help, sorry. I perfectly knew what a relational database is (and I
> humbly consider myself an advanced user on working with MSAccess+VBA, only
> that I've never faced this problem with variables), you should not suppose
> everyone's stupid, though...
[[elided Yahoo spam]]

A bit more googling tells me both MySQL and PostgreSQL have limits of
a few thousand on the number of columns in a table, not a few hundred
thousand. An insightful comment on one mailing list is:

"Of course, the real bottom line is that if you think you need more than
order-of-a-hundred columns, your database design probably needs revision
anyway ;-)"

So, how much "design" is in this data? If none, and what you've
basically got is a 2000x500000 grid of numbers, then maybe a more raw
binary-type format will help - HDF or netCDF? Although I'm not sure
how much R support for reading slices of these formats exists, you may
be able to use an external utility to write slices out on demand.
Random access to parts of these files is pretty fast.

http://cran.r-project.org/web/packages/RNetCDF/index.html
http://cran.r-project.org/web/packages/hdf5/index.html
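
To illustrate why random access to a binary layout is cheap, here is a
rough sketch using only base R connections. It is not HDF or netCDF,
just a plain file of fixed-width integers written column by column; the
tiny 4 x 5 grid and the helper read_column() are made up for
illustration:

```r
# Hypothetical small stand-in for the 2000 x 500000 grid: 4 rows x 5 columns.
n_row <- 4L
n_col <- 5L
grid <- matrix(seq_len(n_row * n_col), nrow = n_row, ncol = n_col)

# Write the grid column by column as 4-byte integers.
path <- tempfile(fileext = ".bin")
con <- file(path, open = "wb")
writeBin(as.integer(grid), con, size = 4L)
close(con)

# Read one column without reading the rest of the file:
# jump straight to its byte offset, then read n_row values.
read_column <- function(path, j, n_row, size = 4L) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  seek(con, where = (j - 1) * n_row * size, origin = "start")
  readBin(con, what = "integer", n = n_row, size = size)
}

col3 <- read_column(path, 3L, n_row)
col3   # the third column of the grid
```

Because every entry has the same width, the position of any slice is
simple arithmetic, which is the property the HDF and netCDF formats
also exploit (with a lot more metadata on top).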

Thinking back to your 4GB file with 1,000,000,000 entries, that's
only 3 bytes per entry (+1 for the comma). What is this data? There
may be more efficient ways to handle it.
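
The arithmetic behind that estimate, as a quick check (assuming "4GB"
means 4 x 10^9 bytes and the 2000x500000 grid mentioned above):

```r
n_entries <- 2000 * 500000        # grid size mentioned earlier in this message
file_bytes <- 4e9                 # assumption: "4GB" taken as 4 x 10^9 bytes
bytes_per_entry <- file_bytes / n_entries
bytes_per_entry                   # 4: about 3 characters of data plus 1 separator
```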

Hope *that* helps...

Barry



------------------------------

_______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

End of R-help Digest, Vol 67, Issue 23
**************************************

Marc Schwartz

2008-Sep-22 21:07 UTC

[R] Warranty on Accuracy, Precision, Legality, ... of R in Research

on 09/22/2008 11:26 AM Bert Chan wrote:
> Warranty on Accuracy, Precision, Legality, ... of R in Research
> 
> (These questions may well have been raised.)
> 
> What is the implied warranty of using R for research & publications,
> consulting, etc.?
> 
> Alternately, how does one obtain such a warranty?
> 
> Your answers will be much appreciated.
> 
> Perhaps you can point me to some websites which discussed this subject in
> the past.
> 
> Thanks & regards -
> 
> Bert
> 
> (Bertram K. C. Chan, PhD)

As per the banner that appears whenever you start up R:

"R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details."

The suitability of R for any particular application is entirely up to
the user. Legally, there is nothing preventing you from using R for such
applications relative to the license under which R is made available.

You did not indicate the specific type of research you have in mind, but
if it might be in the domain of clinical trials, please review:

  http://www.r-project.org/doc/R-FDA.pdf

HTH,

Marc Schwartz
