Warranty on Accuracy, Precision, Legality, ... of R in Research
(These questions may well have been raised before.)
What warranty, if any, is implied when R is used for research & publications,
consulting, etc.?
Alternatively, how does one obtain such a warranty?
Your answers will be much appreciated.
Perhaps you can point me to some websites where this subject has been discussed
in the past.
Thanks & regards -
Bert
(Bertram K. C. Chan, PhD)
----- Original Message ----
From: "r-help-request@r-project.org"
<r-help-request@r-project.org>
To: r-help@r-project.org
Sent: Monday, September 22, 2008 3:00:04 AM
Subject: R-help Digest, Vol 67, Issue 23
Send R-help mailing list submissions to
r-help@r-project.org
To subscribe or unsubscribe via the World Wide Web, visit
https://stat.ethz.ch/mailman/listinfo/r-help
or, via email, send a message with subject or body 'help' to
r-help-request@r-project.org
You can reach the person managing the list at
r-help-owner@r-project.org
When replying, please edit your Subject line so it is more specific
than "Re: Contents of R-help digest..."
Today's Topics:
1. Calculating interval for conditional/unconditional
correlation matrix (Ana Kolar)
2. How to plot "greater than" symbol on the x-axis (Li, Bingshan)
3. Re: How to plot "greater than" symbol on the x-axis (John Fox)
4. Re: Design lrm function (milicic.marko)
5. Re: How to plot "greater than" symbol on the x-axis (Li,
Bingshan)
6. Re: How to plot "greater than" symbol on the x-axis (John Fox)
7. Re: periodicity validation (stephen sefick)
8. Task View for Chemometrics and Computational Physics
(Katharine Mullen)
9. Re: Variable Selection for data reduction and discriminant
anlaysis (Katharine Mullen)
10. Re: How to plot "greater than" symbol on the x-axis
(Henrik Bengtsson)
11. Re: How to plot "greater than" symbol on the x-axis
(Henrik Bengtsson)
12. Re: How to plot "greater than" symbol on the x-axis
(Gabor Grothendieck)
13. Symmetric matrix (Megh Dal)
14. Re: Symmetric matrix (Jorge Ivan Velez)
15. Re: Symmetric matrix (Dimitris Rizopoulos)
16. R Map using SAS data (Junjie Zhang)
17. Re: How to plot "greater than" symbol on the x-axis (Li,
Bingshan)
18. Re: Symmetric matrix (Peter Dalgaard)
19. Re: removing a word, the following space and the next word
(Rolf Turner)
20. Re: fitting a hyperbole (Rolf Turner)
21. Re: Unexpected behaviour when testing for independence, with
multiple factors (Javier Acuña)
22. Re: Unexpected behaviour when testing for independence with
multiple factors (Javier Acuña)
23. r format questions (DS)
24. design question on piping multiple data sets from 1 file into
R (DS)
25. color for lattice box plots (Tom Bonen)
26. suppress legend in ggplot(data, aes(y=Y, x=X,fill=Z))? (Tom Bonen)
27. Re: selecting from a series of integers with pre-determined
probabilities (Bert Gunter)
28. Multiple plots per window (p@fo76.org)
29. glmer -- extracting standard errors and other statistics
(John Poulsen)
30. Re: r format questions (jim holtman)
31. Re: r format questions (jim holtman)
32. Re: Variable Selection for data reduction and discriminant
anlaysis (gcam032)
33. Re: Multiple plots per window (p@fo76.org)
34. Re: glmer -- extracting standard errors and other statistics
(Weiss, Bernd )
35. Why isn't R recognising integers as numbers? (Ted Byers)
36. Re: Why isn't R recognising integers as numbers? (jim holtman)
37. Re: Multiple plots per window (Gabor Grothendieck)
38. Re: Why isn't R recognising integers as numbers? (Marc Schwartz)
39. Re: Why isn't R recognising integers as numbers? (Ted Byers)
40. Re: Why isn't R recognising integers as numbers? (Ted Byers)
41. Re: Why isn't R recognising integers as numbers? (Marc Schwartz)
42. Re: Calculating interval for conditional/unconditional
correlation matrix (Moshe Olshansky)
43. Re: How to plot "greater than" symbol on the x-axis (Bingshan
Li)
44. Re: PDF fonts problem (Paul Murrell)
45. Help for R (Mac)
46. Hmisc and Ubuntu (aptitude install) (Matthew Pettis)
47. adding layers in ggplot2 (data and code included) (Juliet Hannah)
48. Warnings in fitdistr() from MASS. (Rolf Turner)
49. Re: Why isn't R recognising integers as numbers? (Peter Dalgaard)
50. Re: adding layers in ggplot2 (data and code included) (Eric)
51. Time series (ts) questions. (rkevinburton@charter.net)
52. Matrix balancing on margins (PALMIER Patrick - CETE NP/INFRA/TRF)
53. Re: Variable Selection for data reduction and discriminant
anlaysis (Mark Difford)
54. Manage huge database (José E. Lozano)
55. Re: Manage huge database (Barry Rowlingson)
56. Re: Symmetric matrix (Martin Maechler)
57. Re: Manage huge database (Yihui Xie)
58. Re: Manage huge database (José E. Lozano)
59. Re: Manage huge database (José E. Lozano)
60. Re: how to keep up with R? (Robin Hankin)
61. Re: Why isn't R recognising integers as numbers? (Ted Harding)
62. Re: Manage huge database (Barry Rowlingson)
----------------------------------------------------------------------
Message: 1
Date: Sun, 21 Sep 2008 03:05:40 -0700 (PDT)
Subject: [R] Calculating interval for conditional/unconditional
correlation matrix
To: R <r-help@r-project.org>
Message-ID: <880098.14013.qm@web50610.mail.re2.yahoo.com>
Content-Type: text/plain
Hi there,
Could anyone please help me understand what I should do to avoid this
error message?
Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
Here is my code:
determinant<-
function(x){det(matrix(c(1.0,0.2,0.5,0.8,0.2,1.0,0.5,0.6,0.5,0.5,0.5,1.0,x,0.8,0.6,x,1.0),ncol=4,byrow=T))}
matrix<-
function(x){(matrix(c(1.0,0.2,0.5,0.8,0.2,1.0,0.5,0.6,0.5,0.5,0.5,1.0,x,0.8,0.6,x,1.0),ncol=4,byrow=T))}
conditional<-function(x,varcov){
varcov<-matrix(x)
sigmaxx<-varcov[3,3]
sigmaxz<-varcov[3,1:2]
sigmayy<-varcov[4,4]
sigmayz<-varcov[4,1:2]
sigmazx<-varcov[1:2,3]
sigmazy<-varcov[1:2,4]
sigmazz<-varcov[1:2,1:2]
(x-sigmaxz%*%solve(sigmaZZ)%*%sigmazy)/sqrt((sigmaxx-sigmaxz%*%solve(sigmaZZ)%*%sigmazx)*(sigmayy-sigmayz%*%solve(sigmaZZ)%*%sigmazy))}
interval<-uniroot(determinant,lower = min(c(0,1)), upper = max(c(0,1)))
I tried also with the code below, but got the same Error message.
lower.bound<-uniroot(determinant,c(0,0.5))$root
upper.bound<-uniroot(determinant,c(0.51,1))$root
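A minimal sketch of the interval computation, under assumptions that the post does not confirm. The helper names `make_varcov` and `det_varcov` are hypothetical and were chosen so that no base function such as `matrix` or `determinant` is masked; whether masking is what triggers the reported error has not been verified here. The matrix is assumed to be the symmetric 4 x 4 matrix with the unknown correlation x in positions [3, 4] and [4, 3], since the vector in the post has 17 entries, one more than a 4 x 4 matrix holds.

```r
# Hypothetical helper names, chosen so that no base function is masked.
make_varcov <- function(x) {
  # Assumed symmetric 4 x 4 matrix; x sits in positions [3, 4] and [4, 3].
  matrix(c(1.0, 0.2, 0.5, 0.8,
           0.2, 1.0, 0.5, 0.6,
           0.5, 0.5, 1.0, x,
           0.8, 0.6, x,   1.0), ncol = 4, byrow = TRUE)
}
det_varcov <- function(x) det(make_varcov(x))

# For this assumed matrix the determinant is positive only between its two
# roots, so each search interval has to bracket exactly one sign change.
lower.bound <- uniroot(det_varcov, c(0, 0.5))$root
upper.bound <- uniroot(det_varcov, c(0.51, 1))$root
```

With these assumptions the determinant is negative at both 0 and 1, so a single search over the whole interval from 0 to 1 would not bracket a root; the two half-intervals from the post do.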
Ana
------------------------------
Message: 2
Date: Sat, 20 Sep 2008 23:37:22 -0500
From: "Li, Bingshan" <bli1@bcm.tmc.edu>
Subject: [R] How to plot "greater than" symbol on the x-axis
To: <r-help@R-project.org>
Message-ID:
<99FAE9C1DAA75C4BAB3C1441228F95D130C1E7@BCMEVS14.ad.bcm.edu>
Content-Type: text/plain
Hello everyone,
I want to plot a "greater than" symbol (with the "_" under the ">") in the
labels on the x-axis. Is it possible to do it?
Thanks.
Bingshan
------------------------------
Message: 3
Date: Sun, 21 Sep 2008 09:38:13 -0400
From: "John Fox" <jfox@mcmaster.ca>
Subject: Re: [R] How to plot "greater than" symbol on the x-axis
To: "'Li, Bingshan'" <bli1@bcm.tmc.edu>
Cc: r-help@r-project.org
Message-ID: <000c01c91bef$506f3990$f14dacb0$@ca>
Content-Type: text/plain; charset="us-ascii"
Dear Bingshan,
It isn't entirely clear what you want to do. I think that you want the
"greater-than-or-equal-to" symbol, not "greater than," but by itself or in
an expression? For the first, xlab=expression("" >= ""), and for the second,
e.g., xlab=expression(x >= x[min]). More generally, see ?plotmath.
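As a small illustration of the two cases, here is a sketch using a throwaway scatterplot of 1:10; the variable names `lab_symbol` and `lab_expr` are made up for this example.

```r
# Case 1: the symbol by itself; two empty strings serve as the operands.
lab_symbol <- expression("" >= "")
plot(1:10, xlab = lab_symbol)

# Case 2: the symbol inside a larger expression; x[min] is drawn as a subscript.
lab_expr <- expression(x >= x[min])
plot(1:10, xlab = lab_expr)
```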
I hope this helps,
John
------------------------------
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox
------------------------------
Message: 4
Date: Sun, 21 Sep 2008 07:56:29 -0700 (PDT)
From: "milicic.marko" <milicic.marko@gmail.com>
Subject: Re: [R] Design lrm function
To: r-help@r-project.org
Message-ID:
<879ed981-d735-41ac-89c1-87a0251b9f06@34g2000hsh.googlegroups.com>
Content-Type: text/plain; charset=ISO-8859-1
Thanks Frank.
On Sep 20, 2:53 am, Frank E Harrell Jr <f.harr...@vanderbilt.edu> wrote:
> milicic.marko wrote:
> > Hi,
> >
> > Is it possible to get ROC and accuracy ratio/gini straight out of the
> > Design package?
> >
> > Thanks
>
> The print method for lrm prints the ROC area (labeled "C"). lrm does
> not print the other 2 measures you listed. It computes a generalized
> R^2 (much more powerful than all the other measures) and rank indexes
> other than C.
>
> --
> Frank E Harrell Jr   Professor and Chair           School of Medicine
>                      Department of Biostatistics   Vanderbilt University
------------------------------
Message: 5
Date: Sun, 21 Sep 2008 10:23:53 -0500
From: "Li, Bingshan" <bli1@bcm.tmc.edu>
Subject: Re: [R] How to plot "greater than" symbol on the x-axis
To: "John Fox" <jfox@mcmaster.ca>
Cc: r-help@r-project.org
Message-ID:
<99FAE9C1DAA75C4BAB3C1441228F95D130C1EA@BCMEVS14.ad.bcm.edu>
Content-Type: text/plain
Hi John,
Yes, you are right. I meant "greater-than-or-equal". Following your
suggestion, I can plot the symbol by itself. But what I want is to have >=1,
>=2 and so on as labels on the x-axis, and I could not make that work. Do you
know how to do it? The expression("">=1"") did not work, and
paste(expression("">=""), 1) did not work either.
Thanks a lot!
Bingshan
------------------------------
Message: 6
Date: Sun, 21 Sep 2008 12:14:04 -0400
From: "John Fox" <jfox@mcmaster.ca>
Subject: Re: [R] How to plot "greater than" symbol on the x-axis
To: "'Li, Bingshan'" <bli1@bcm.tmc.edu>
Cc: r-help@r-project.org
Message-ID: <000701c91c05$110711e0$331535a0$@ca>
Content-Type: text/plain; charset="us-ascii"
Dear Bingshan,
You can use xlab=expression("" >= "1"), xlab=expression("" >= 1), or
expression(NA >= 1), etc. The point is that >= is a binary operator, so a
well-formed expression needs both a left- and right-hand operand.
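The point about operands can be checked without drawing anything. This is only a sketch; the names `ok` and `bad` are made up for the example.

```r
# With both operands present, ">=" parses into a call to the `>=` function
# with two arguments.
ok <- expression("" >= 1)
is.call(ok[[1]])
length(ok[[1]])   # the function name plus its two operands

# Without a left-hand operand the text cannot be parsed at all; here the
# parse error is caught and its message is kept instead.
bad <- tryCatch(parse(text = ">= 1"), error = function(e) conditionMessage(e))
```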
John
------------------------------
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox
------------------------------
Message: 7
Date: Sun, 21 Sep 2008 12:23:02 -0400
From: "stephen sefick" <ssefick@gmail.com>
Subject: Re: [R] periodicity validation
To: "yuankun shi" <shiyuankun.debian@gmail.com>, "R-help
Mailing List"
<r-help@r-project.org>
Message-ID:
<c502a9e10809210923m10728682wf4f8d41e75dce71e@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8
Alright, this is what you want to do.
install.packages("fields", dependencies=TRUE)
tim.colors is in this package, and it has a blue-to-red color scheme,
blue being the lowest and red being the highest. This color scheme
makes sense to me and is a common one that people (read: engineers)
familiar with Matlab or the like will understand.
Use the Morlet wavelet: it is well localized, which means that it
quickly goes to zero once it gets outside the scale that it is fitting,
making it good for a localized fit.
What you are looking at is the modulus (absolute value) of the
convolution of the wavelet with the signal at a particular scale (kind
of like frequency in Fourier analysis) on the y-axis, through time
(local fitting) on the x-axis. You are trying to find periodicity?
I kind of think of wavelet analysis as the partitioning of the variance of
the signal into continuous scales.
Because of how the algorithm does its calculation, the scale is plotted as
log2 of the value in your time units (which you set in the deltat
or frequency argument when you create a time series with ts()), so to get
back to your time units, compute 2^(value of the scale).
I hope this helps
Stephen
2008/9/21 yuankun shi <shiyuankun.debian@gmail.com>:
> Thanks, I have succeeded in doing this: first wavCWTPeaks to get the
> coordinates of every peak, then calculate their horizontal distances, and
> finally bkde outputs the distribution of the distances; that's what I want.
> On the other hand, the picture from wavCWT seems hard to understand; I am
> not sure what the y axis and the color mean. Could you do me a favor?
>
> 2008/9/19 stephen sefick <ssefick@gmail.com>
>>
>> I would suggest wavelet analysis-
>> library wmtsa
>> wavCWT
>> This will tell you if there is a periodicity localized in time, which
>> fourier analysis cannot tell you- if the variance is not constant
>> through time then you should use this.
>>
>> 2008/9/19 yuankun shi <shiyuankun.debian@gmail.com>:
>> > I have spent lots of time downloading the code you mentioned. But none
>> > of it is what I wanted, except the latest one, which I have not found
>> > anywhere.
>> > Maybe I have not stated my problem clearly, sorry for that.
>> > I have a data series; it consists of time and rate. Plotting rate vs
>> > time, I found that it has periodicity to some extent. The rate rises and
>> > falls with time, but not with a fixed cycle or a fixed amplitude.
>> > So I am wondering, are there any tools to get the cycle? And
>> > furthermore, to draw its density picture?
>> > Since there is bkde in package KernSmooth, the 2nd is not strictly
>> > needed.
>> >
>> > 2008/9/11 stephen sefick <ssefick@gmail.com>
>> >>
>> >> all of the functions that I listed are time series tools for looking
>> >> at what I think you want. this can be done, you just have to
>> >> understand the methodology. So, look at some of the things that I
>> >> suggested. If these don't help then I don't understand what you want,
>> >> and it is necessary for you to help me figure out what it is that you
>> >> want.
>> >> good luck
>> >>
>> >> 2008/9/11 yk <shiyuankun.debian@gmail.com>:
>> >> > The data I mentioned above is oscillating vs time, but there is no
>> >> > observable fixed cycle if I just plot this data.
>> >> > How to get the average cycle, or the most probable range of cycle,
>> >> > with statistical methods?
>> >> > I don't know how to achieve it in R; is there any command?
>> >> >
>> >> > On Sep 11, 10:52 am, "stephen sefick" <ssef...@gmail.com> wrote:
>> >> >> ?spectrum
>> >> >> ?acf
>> >> >> ?ccf
>> >> >> library(wmtsa)
>> >> >> ?wavCWT
>> >> >> library(sowas)
>> >> >> ?wsp
>> >> >>
>> >> >> you could also look at lagged plots to look for periodicity.
>> >> >> if you elaborate on the problem and include executable sample code
>> >> >> you will probably receive more help.
>> >> >>
>> >> >> On Wed, Sep 10, 2008 at 10:02 PM, yk <shiyuankun.deb...@gmail.com>
>> >> >> wrote:
>> >> >> > There is a series of data containing time in fixed steps and
>> >> >> > energy varying with time; how to test its periodicity? In R, it
>> >> >> > seems there are no direct tools, since I have searched the R
>> >> >> > manual for "periodic" and have not found any related topic.
>> >> >> > Thanks a lot
>
--
Stephen Sefick
Research Scientist
Southeastern Natural Sciences Academy
Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods. We are mammals, and have not exhausted the
annoying little problems of being mammals.
-K. Mullis
------------------------------
Message: 8
Date: Sun, 21 Sep 2008 18:39:55 +0200 (CEST)
From: Katharine Mullen <kate@few.vu.nl>
Subject: [R] Task View for Chemometrics and Computational Physics
To: r-help@r-project.org
Message-ID: <Pine.GSO.4.56.0809211836020.3492@laurel.few.vu.nl>
Content-Type: TEXT/PLAIN; charset=US-ASCII
Dear All,
A new task view "ChemPhys" on chemometrics and computational physics is
available on CRAN (http://cran.r-project.org/web/views/ChemPhys.html).
It describes packages and functions that are of use in modeling
chemical/physical systems.
Suggestions and comments regarding this task view are welcome. If you
think a new category, package or function should be added, please mail.
best regards,
Kate Mullen
----
Katharine Mullen
mail: Department of Physics and Astronomy, Faculty of Sciences
Vrije Universiteit Amsterdam, de Boelelaan 1081
1081 HV Amsterdam, The Netherlands
room: T.1.06
tel: +31 205987870
fax: +31 205987992
e-mail: kate@nat.vu.nl
homepage: http://www.nat.vu.nl/~kate/
------------------------------
Message: 9
Date: Sun, 21 Sep 2008 18:43:37 +0200 (CEST)
From: Katharine Mullen <kate@few.vu.nl>
Subject: Re: [R] Variable Selection for data reduction and
discriminant anlaysis
To: Gareth Campbell <gcam032@gmail.com>
Cc: R Help <r-help@r-project.org>
Message-ID: <Pine.GSO.4.56.0809211841530.3492@laurel.few.vu.nl>
Content-Type: TEXT/PLAIN; charset=US-ASCII
There are some pointers to packages for variable selection in the task
view for Chemometrics and Computational Physics at
http://cran.r-project.org/web/views/ChemPhys.html
On Sun, 21 Sep 2008, Gareth Campbell wrote:
> Hello all,
>
> I'm dealing with geochemical analyses of some rocks.
>
> If I use the full composition (31 elements or variables), I can get
> reasonable separation of my 6 sources. Then when I go on to do LDA with the
> 6 groups, I get excellent separation.
>
> I feel like I should be reducing the variables to those that are providing
> the most discrimination between the groups, as this is important information
> for me. I struggle to interpret the PCA plot in a way that helps me (due to
> the large number of elements). So I'm trying to do some sort of step-wise
> variable selection.
>
> I would love to hear from someone (possibly a geochemist or similar) who
> does this regularly to determine the best course of action in R to do this.
>
>
> Thanks very much
>
>
> --
> Gareth Campbell
> PhD Candidate
> The University of Auckland
>
> P +649 815 3670
> M +6421 256 3511
> E gareth.campbell@esr.cri.nz
> gcam032@gmail.com
------------------------------
Message: 10
Date: Sun, 21 Sep 2008 09:52:29 -0700
From: "Henrik Bengtsson" <hb@stat.berkeley.edu>
Subject: Re: [R] How to plot "greater than" symbol on the x-axis
To: "Li, Bingshan" <bli1@bcm.tmc.edu>
Cc: r-help@r-project.org, John Fox <jfox@mcmaster.ca>
Message-ID:
<59d7961d0809210952j7a8ffb0epdad6b839aba452c9@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
What have you tried so far, and what part does not work? If you
forget for a moment the fact that you want to have ">=1", ">=2", ...,
can you do what you want with plain "1", "2", ...? Telling us that
helps us help you.
Are you asking for the labels on the *tick marks* on the axis? Right
now it sounds like you are asking for *the label* on the x axis, but
the part about wanting multiple ones is confusing.
plot(1:10, xlab="1");
is different from:
plot(1:10, axes=FALSE);
axis(side=1, at=1:10, labels=1:10);
To add ">=" to the latter case, this works:
bquote("" >= 1:10)
labels <- lapply(1:10, FUN=function(x) substitute("" >= t, list(t=x)));
plot(1:10, axes=FALSE);
axis(side=1, at=1:10, labels=labels);
------------------------------
Message: 11
Date: Sun, 21 Sep 2008 09:53:39 -0700
From: "Henrik Bengtsson" <hb@stat.berkeley.edu>
Subject: Re: [R] How to plot "greater than" symbol on the x-axis
To: "Li, Bingshan" <bli1@bcm.tmc.edu>
Cc: r-help@r-project.org, John Fox <jfox@mcmaster.ca>
Message-ID:
<59d7961d0809210953n1727e462q2f19b37266689348@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
On Sun, Sep 21, 2008 at 9:52 AM, Henrik Bengtsson <hb@stat.berkeley.edu> wrote:
> What have you tried so far, and what part does not work? If you
> forget for a moment the fact that you want to have ">=1", ">=2", ...,
> can you do what you want with plain "1", "2", ...? Telling us that
> helps us help you.
>
> Are you asking for the labels on the *tick marks* on the axis? Right
> now it sounds like you are asking for *the label* on the x axis, but
> the part about wanting multiple ones is confusing.
>
> plot(1:10, xlab="1");
>
> is different from:
>
> plot(1:10, axes=FALSE);
> axis(side=1, at=1:10, labels=1:10);
>
> To add ">=" to the latter case, this works:
>
> bquote("" >= 1:10)
> labels <- lapply(1:10, FUN=function(x) substitute("" >= t, list(t=x)));
> plot(1:10, axes=FALSE);
> axis(side=1, at=1:10, labels=labels);
Oops. Forget about the bquote() - cut'n'paste error ...and I don't
know how to get rid of the "" preceding each tick label. Maybe
someone else knows.
/Henrik
------------------------------
Message: 12
Date: Sun, 21 Sep 2008 13:08:09 -0400
From: "Gabor Grothendieck" <ggrothendieck@gmail.com>
Subject: Re: [R] How to plot "greater than" symbol on the x-axis
To: "Li, Bingshan" <bli1@bcm.tmc.edu>
Cc: r-help@r-project.org, John Fox <jfox@mcmaster.ca>
Message-ID:
<971536df0809211008l559eec03ub016e3fcd71682f@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
On Sun, Sep 21, 2008 at 11:23 AM, Li, Bingshan <bli1@bcm.tmc.edu> wrote:
> Hi John,
>
> Yes, you are right. I meant "greater-than-or-equal". According to your
> suggestion, I can plot the symbol only. But what I want is to have >=1,
> >=2 and so on as labels on xaxis. I did not make it work. Do you know how to
> make it? The expression("">=1"") did not work, and
> paste(expression("">=""), 1) did not work either.
>
Try this:
plot(1:10, xaxt = "n")
for(i in 1:10) axis(1, i, bquote(phantom(0) >= .(i)))
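A vectorized variant of the same idea, as a rough sketch: build all the labels first and hand them to a single axis() call. The name labs and the off-screen pdf(NULL) device are illustrative assumptions here, not part of the reply above.

```r
# One plotmath label per tick. phantom(0) supplies the left-hand operand
# that >= needs, without drawing anything visible.
labs <- as.expression(lapply(1:10, function(i) bquote(phantom(0) >= .(i))))

pdf(NULL)                       # off-screen device (assumption: non-interactive run)
plot(1:10, xaxt = "n")          # suppress the default x-axis
axis(1, at = 1:10, labels = labs)
invisible(dev.off())
```

In an interactive session the pdf(NULL) and dev.off() lines can simply be dropped.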
------------------------------
Message: 13
Date: Sun, 21 Sep 2008 10:47:47 -0700 (PDT)
Subject: [R] Symmetric matrix
To: r-help@stat.math.ethz.ch
Message-ID: <139940.18844.qm@web58102.mail.re3.yahoo.com>
Content-Type: text/plain; charset=us-ascii
I have the following matrix:
a = matrix(rnorm(36), 6)
Now I want to replace the lower-triangular elements with its
upper-triangular elements. That is, I want to make a symmetric matrix from a. I
have tried the lower.tri() and upper.tri() functions, but did not get the desired result.
Can anyone please tell me how to do that?
------------------------------
Message: 14
Date: Sun, 21 Sep 2008 13:54:19 -0400
From: "Jorge Ivan Velez" <jorgeivanvelez@gmail.com>
Subject: Re: [R] Symmetric matrix
Cc: r-help@stat.math.ethz.ch
Message-ID:
<317737de0809211054u1485f494l166e21f6b30e4123@mail.gmail.com>
Content-Type: text/plain
Dear Megh,
Try this:
a = matrix(rnorm(36), 6)
a[upper.tri(a)]<-a[lower.tri(a)]
a
HTH,
Jorge
> I have following matrix :
>
> a = matrix(rnorm(36), 6)
>
> Now I want to replace the lower-triangular elements with its
> upper-triangular elements. That is, I want to make a symmetric matrix from
> a. I have tried the lower.tri() and upper.tri() functions, but did not get
> the desired result. Can anyone please tell me how to do that?
>
------------------------------
Message: 15
Date: Sun, 21 Sep 2008 19:58:44 +0200
From: Dimitris Rizopoulos <d.rizopoulos@erasmusmc.nl>
Subject: Re: [R] Symmetric matrix
Cc: r-help@stat.math.ethz.ch
Message-ID: <48D68B54.70409@erasmusmc.nl>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
try the following
a <- matrix(rnorm(36), 6)
ind <- lower.tri(a)
a[ind] <- t(a)[ind]
a
I hope it helps.
Best,
Dimitris
Megh Dal wrote:
> I have the following matrix:
>
> a = matrix(rnorm(36), 6)
>
> Now I want to replace the lower-triangular elements with its
> upper-triangular elements. That is, I want to make a symmetric matrix from
> a. I have tried the lower.tri() and upper.tri() functions, but did not get
> the desired result. Can anyone please tell me how to do that?
>
--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
------------------------------
Message: 16
Date: Sun, 21 Sep 2008 14:26:05 -0400
From: Junjie Zhang <thujacky@hotmail.com>
Subject: [R] R Map using SAS data
To: <r-help@r-project.org>
Message-ID: <BAY105-W522471E0693BD6F2FD69BDDC480@phx.gbl>
Content-Type: text/plain
Hi there,
I'd like to plot some maps. Is it possible for me to use SAS map data in R?
Thank you.
Best,
Junjie
------------------------------
Message: 17
Date: Sun, 21 Sep 2008 11:40:35 -0500
From: "Li, Bingshan" <bli1@bcm.tmc.edu>
Subject: Re: [R] How to plot "greater than" symbol on the x-axis
To: "John Fox" <jfox@mcmaster.ca>
Cc: r-help@r-project.org
Message-ID:
<99FAE9C1DAA75C4BAB3C1441228F95D130C1EB@BCMEVS14.ad.bcm.edu>
Content-Type: text/plain
Hi John,
It works perfectly. Thank you so much for the help! Have a great day.
Bingshan
-----Original Message-----
From: John Fox [mailto:jfox@mcmaster.ca]
Sent: Sun 9/21/2008 11:14 AM
To: Li, Bingshan
Cc: r-help@r-project.org
Subject: RE: [R] How to plot "greater than" symbol on the x-axis
Dear Bingshan,
You can use xlab=expression("" >= "1"), xlab=expression("" >= 1), or
expression(NA >= 1), etc. The point is that >= is a binary operator, so a
well formed expression needs both a left- and right-hand operand.
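The binary-operator point can be checked directly at the parser level. This is only a small sketch; the names ok and bad are made up for illustration.

```r
# With operands on both sides, the label parses into a valid expression.
ok <- expression("" >= 1)

# With no left-hand operand, the parser rejects the text outright;
# tryCatch() captures the error object instead of stopping.
bad <- tryCatch(parse(text = ">= 1"), error = function(e) e)

is.expression(ok)        # TRUE
inherits(bad, "error")   # TRUE
```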
John
------------------------------
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox
------------------------------
Message: 18
Date: Sun, 21 Sep 2008 21:10:07 +0200
From: Peter Dalgaard <p.dalgaard@biostat.ku.dk>
Subject: Re: [R] Symmetric matrix
To: Jorge Ivan Velez <jorgeivanvelez@gmail.com>
Message-ID: <48D69C0F.3000706@biostat.ku.dk>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Jorge Ivan Velez wrote:
> Dear Megh,
> Try this:
>
> a = matrix(rnorm(36), 6)
> a[upper.tri(a)]<-a[lower.tri(a)]
> a
>
> HTH,
>
If you look carefully, you'll see that it doesn't work! Dimitris had the
better idea.
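A quick way to see the difference, sketched with isSymmetric() on a seeded matrix (the seed and the names b and d are illustrative assumptions). Both sides of the first assignment are read in column-major order, so the k-th lower-triangle value lands in the k-th upper-triangle slot, which is generally not its mirror position.

```r
set.seed(1)
a <- matrix(rnorm(36), 6)

# First suggestion: pairs elements by position in column-major order,
# e.g. b[2, 3] receives a[4, 1] rather than a[3, 2].
b <- a
b[upper.tri(b)] <- b[lower.tri(b)]
isSymmetric(b)   # FALSE

# Second suggestion: t(d)[i, j] is d[j, i], so each lower-triangle cell
# receives its mirror image from the upper triangle.
d <- a
ind <- lower.tri(d)
d[ind] <- t(d)[ind]
isSymmetric(d)   # TRUE
```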
--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)              FAX: (+45) 35327907
------------------------------
Message: 19
Date: Mon, 22 Sep 2008 08:39:44 +1200
From: Rolf Turner <r.turner@auckland.ac.nz>
Subject: Re: [R] removing a word, the following space and the next
word
To: jim holtman <jholtman@gmail.com>
Cc: r-help@r-project.org, Bob Green <bgreen@dyson.brisnet.org.au>
Message-ID: <1503F860-54A4-402C-B6B5-C76A88EF2D5E@auckland.ac.nz>
Content-Type: text/plain; charset=US-ASCII; format=flowed
On 21/09/2008, at 5:15 AM, jim holtman wrote:
>> x <- 'Mr Jones ate lunch and Mr Smith was tied'
>> gsub('(Mr\\.*)\\s+\\w+', "\\1 <file://0.0.0.1/> xxxx", x)
> [1] "Mr xxxx ate lunch and Mr xxxx was tied"
I don't get what the bit
<file://0.0.0.1/>
is about. If I do (just)
gsub('(Mr\\.*)\\s+\\w+', "\\1 xxxx", x)
I get the desired result, i.e.
[1] "Mr xxxx ate lunch and Mr xxxx was tied"
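For reference, a self-contained version of the cleaned-up call with the pattern spelled out in comments. The second input, with "Mr.", is a made-up example to show what the \\.* part is for.

```r
x <- 'Mr Jones ate lunch and Mr Smith was tied'

# (Mr\\.*)  captures "Mr" plus any trailing dots as group 1
# \\s+\\w+  matches the following whitespace and the next word
# "\\1 xxxx" keeps group 1 and replaces the word with xxxx
res <- gsub('(Mr\\.*)\\s+\\w+', "\\1 xxxx", x)
res       # "Mr xxxx ate lunch and Mr xxxx was tied"

res_dot <- gsub('(Mr\\.*)\\s+\\w+', "\\1 xxxx", 'Mr. Jones ate lunch')
res_dot   # "Mr. xxxx ate lunch"
```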
cheers,
Rolf Turner
------------------------------
Message: 20
Date: Mon, 22 Sep 2008 08:58:00 +1200
From: Rolf Turner <r.turner@auckland.ac.nz>
Subject: Re: [R] fitting a hyperbole
To: Peter Dalgaard <p.dalgaard@biostat.ku.dk>
Cc: R-help Forum <r-help@r-project.org>
Message-ID: <DAE7D8B9-0991-4DF7-8336-F22CA23E1254@auckland.ac.nz>
Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed
On 21/09/2008, at 10:38 AM, Peter Dalgaard wrote:
> stephen sefick wrote:
>> I am not sure if I am exaggerating or not read title as hyperbola
>>
>> On Sat, Sep 20, 2008 at 2:20 PM, stephen sefick
>> <ssefick@gmail.com> wrote:
>>
>>> I have got a data set that is Gross Primary Productivity ~ Total
>>> Suspended Solids it is a hyperbola just like:
>>> plot(1/c(1:1000))
>>>
>>> how do I model this relationship so that I can get all of the neat
>>> things that lm gives residuals etc. etc. so that I can see if my
>>> eyeball model stands up. Thanks for any help, pointers, or good
>>> things to read.
>>>
> Well, it depends on the exact model you want to fit and the error
> characteristics.
>
> There's a straightforward linear model in the transformed x:
> lm(y ~ I(1/x))
>
> but there are also transformed models like
>
> lm(1/y ~ x)
>
> or
>
> lm(log(y) ~ log(x))
>
> but of course, y, 1/y, and log(y) can't all be homoscedastic normal
> variates. Going beyond the linearized models, you can use nls(), as in
>
> nls(y~ a/(x-b), start=c(a=1,b=0))
>
> (which is linear for 1/y, but assumes that y rather than 1/y has
> constant variance.)
Nicely expressed. Succinct, clear, to the point, comprehensive. I
wish I'd said that!
(And that's not hyperbole. :-) )
So much more helpful than some postings I've seen recently to the
effect of ``Go away and read a book on this topic.''
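To make the quoted suggestions concrete, here is a minimal sketch on simulated data. The data-generating values (a = 2, b = -1, noise sd 0.01) are assumptions chosen for illustration and have nothing to do with the productivity data in the thread; the two model calls are the ones quoted above.

```r
set.seed(42)
x <- 1:100
y <- 2 / (x + 1) + rnorm(length(x), sd = 0.01)   # hypothetical hyperbola plus noise

# Linear model in the transformed predictor 1/x; the usual lm tools
# (summary(), residuals(), plot()) all apply to it.
fit_lm <- lm(y ~ I(1/x))

# Nonlinear least squares on the original scale, with the starting
# values from the quoted post.
fit_nls <- nls(y ~ a / (x - b), start = c(a = 1, b = 0))

coef(fit_lm)
coef(fit_nls)
head(residuals(fit_nls))
```

Which fit is appropriate still depends on the error structure, as the quoted text says; the sketch only shows the mechanics.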
cheers,
Rolf
------------------------------
Message: 21
Date: Sun, 21 Sep 2008 17:01:18 -0400
From: "Javier Acuña" <javier.acuna.o@gmail.com>
Subject: Re: [R] Unexpected behaviour when testing for independence,
with multiple factors
To: bolker@ufl.edu, r-help@r-project.org
Message-ID:
<e10c29610809211401i6e3d7792p22b72172993aae9a@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
>Ben Bolker <bolker <at> ufl.edu> writes:
>
>I would try
>
>fligner.test(dT ~ Topology:Drift:lambda)
>
>in response to:
>
>Javier Acuna <javier.acuna.o <at> gmail.com> writes:
>
> Hi, I'm a new user of R. My background is Electrical Engineering, so
> please bear with me if this is a silly question.
>
> I'm trying to assess whether the results of an experiment satisfy the
> hypothesis of homoscedasticity (my ultimate goal is to use ANOVA).
>
> The result of the experiment is mean delay (dT), which depends on
> three factors, topology, drift, and lambda. The first two factors are
> categorical (with 4 levels each) and the last one is numerical, with
> two levels.
>
> A sample of my data is as follows:
>
> dT Topology Drift lambda
> 258.789 Tree b1 .43
> 244.195 Tree b1 .43
> 115.961 Tree b2 .3
> 115.183 Tree b2 .3
>
> I would like to separate dT in the 32 samples (4x4x2), and test if the
> variance of each sample is equal to the other 31 samples.
> I tried using fligner.test and bartlett.test, but either test seems to
> only work for one factor:
>
> > fligner.test( dT ~ Topology + Drift + lambda)
>
> Fligner-Killeen test of homogeneity of variances
>
> data: dT by Topology by Drift by lambda
> Fligner-Killeen:med chi-squared = 15.4343, df = 2, p-value = 0.0004451
>
> > fligner.test( dT ~ Topology )
>
> Fligner-Killeen test of homogeneity of variances
>
> data: dT by Topology
> Fligner-Killeen:med chi-squared = 15.4343, df = 2, p-value = 0.0004451
>
> As I see from the previous two outputs, fligner.test only takes into
> account the first factor. Similar results are obtained for
> bartlett.test.
I tried what you suggested Ben, but I'm still puzzled by the output.
In this case, I obtain different results with different ordering of
the factors:
> fligner.test( dT ~ Dims : Topology :Drift )
Fligner-Killeen test of homogeneity of variances
data: dT by Dims by Topology by Drift
Fligner-Killeen:med chi-squared = 195.2067, df = 1, p-value < 2.2e-16
> fligner.test( dT ~ Topology :Drift:Dims )
Fligner-Killeen test of homogeneity of variances
data: dT by Topology by Drift by Dims
Fligner-Killeen:med chi-squared = 15.4343, df = 2, p-value = 0.0004451
I don't know what to do now, any help would be reaaally appreciated.
Best Regards
Javier
----------------------------------------------------
Javier Acuna
Electrical Engineering Grad Student
Universidad de Chile
javier.acuna.o@gmail.com
------------------------------
Message: 22
Date: Sun, 21 Sep 2008 17:05:06 -0400
From: "Javier Acuña" <javier.acuna.o@gmail.com>
Subject: Re: [R] Unexpected behaviour when testing for independence
with multiple factors
To: "Michael Dewey" <info@aghmed.fsnet.co.uk>
Cc: r-help@r-project.org
Message-ID:
<e10c29610809211405y3607a7d9s2ba865460aca38e8@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Michael, so you're suggesting that I should do:
aux <- interaction( Topology, Drift, lambda)
and then
fligner.test(dT~aux)
Is that correct?
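A sketch of that approach on simulated data, in case it helps to see it run end to end. The data frame d, its level names other than Tree, b1 and b2, and the five replicates per cell are all assumptions for illustration, not the experiment's data.

```r
set.seed(1)
# Hypothetical balanced design: 4 topologies x 4 drifts x 2 lambdas, 5 runs each.
d <- expand.grid(Topology = c("Tree", "Ring", "Mesh", "Star"),
                 Drift    = c("b1", "b2", "b3", "b4"),
                 lambda   = c(0.3, 0.43),
                 run      = 1:5)
d$dT <- rnorm(nrow(d), mean = 200, sd = 20)

# One grouping factor with a level for every Topology/Drift/lambda combination.
d$aux <- interaction(d$Topology, d$Drift, d$lambda)
nlevels(d$aux)    # 32

res <- fligner.test(dT ~ aux, data = d)
res$parameter     # df = 31, i.e. all 32 cells are compared
```

Checking that the reported df equals the number of cells minus one is a simple way to confirm that every factor combination entered the test.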
On Thu, Sep 18, 2008 at 8:32 AM, Michael Dewey <info <at> aghmed.fsnet.co.uk> wrote:
> At 16:03 17/09/2008, Javier Acuña wrote:
>>
>> Hi, I'm a new user of R. My background is Electrical Engineering, so
>> please bear with me if this is a silly question.
>
> For future reference you might find
> ?interaction
> helpful as another tool in your box.
>
>
>> I'm trying to assess whether the results of an experiment satisfy the
>> hypothesis of homoscedasticity (my ultimate goal is to use ANOVA).
>
> It is hard to resist quoting Box (1953, Biometrika, 40, p333) that these
> tests are '... like putting to sea in a rowing boat to find out whether
> conditions are safe for an ocean liner to leave port'
>
>> The result of the experiment is mean delay (dT), which depends on
>> three factors, topology, drift, and lambda. The first two factors are
>> categorical (with 4 levels each) and the last one is numerical, with
>> two levels.
>>
>> A sample of my data is as follows:
>>
>> dT Topology Drift lambda
>> 258.789 Tree b1 .43
>> 244.195 Tree b1 .43
>> 115.961 Tree b2 .3
>> 115.183 Tree b2 .3
>>
>> I would like to separate dT in the 32 samples (4x4x2), and test if the
>> variance of each sample is equal to the other 31 samples.
>> I tried using fligner.test and bartlett.test, but either test seems to
>> only work for one factor:
>>
>> > fligner.test( dT ~ Topology + Drift + lambda)
>>
>> Fligner-Killeen test of homogeneity of variances
>>
>> data: dT by Topology by Drift by lambda
>> Fligner-Killeen:med chi-squared = 15.4343, df = 2, p-value = 0.0004451
>>
>> > fligner.test( dT ~ Topology )
>>
>> Fligner-Killeen test of homogeneity of variances
>>
>> data: dT by Topology
>> Fligner-Killeen:med chi-squared = 15.4343, df = 2, p-value = 0.0004451
>>
>> As I see from the previous two outputs, fligner.test only takes into
>> account the first factor. Similar results are obtained for
>> bartlett.test.
>>
>> At this point I don't know if I'm using the test incorrectly or
>> something else. I would really appreciate any help. I'm using R
>> version 2.7.2 (2008-08-25) in Windows XP.
>>
>> Many thanks in advance
>> Javier
>>
>> ----------------------------------------------------
>> Javier Acuna
>> Electrical Engineering Grad Student
>> Universidad de Chile
>> javier.acuna.o@gmail.com
>
> Michael Dewey
> http://www.aghmed.fsnet.co.uk
>
>
------------------------------
Message: 23
Date: Sun, 21 Sep 2008 18:03:52 -0400
From: "DS" <ds5j@excite.com>
Subject: [R] r format questions
To: r-help@R-project.org
Message-ID: <20080921180352.6760@web005.roc2.bluetie.com>
Content-Type: text/plain
Hi,
1) I have noticed that when I use the aggregate function it outputs numbers in
the results. For example:
aggregate by product
group.1 Aggregate
1 ProductA 1000400.00
2 ProductB 23232323.00
3 Missing 232323.00
Is there a way to suppress the numbers in front of aggregate outputs? I checked
and they don't look like columns when I do a summary, so I can't -1 them
away.
2) Is there an easy way to then take my aggregate matrix and format the sum
with $ and commas? E.g., instead of 10000 it should show
$10,000.00.
I am trying to create a report and am piping the aggregate into an xtable and
feeding it to R2HTML.
thanks
Dhruv
------------------------------
Message: 24
Date: Sun, 21 Sep 2008 18:06:51 -0400
From: "DS" <ds5j@excite.com>
Subject: [R] design question on piping multiple data sets from 1 file
into R
To: r-help@R-project.org
Message-ID: <20080921180651.12712@web006.roc2.bluetie.com>
Content-Type: text/plain
Hi,
I have 8 separate queries that I use to get time series information, each
dealing with a different set of time series.
I run my queries, save the output as a csv file, and then format the
data into graphs in Excel.
I wanted to know if there is an elegant and clean way to read in 1 csv file
but to read the separate matrices on different rows into separate R data
objects.
If this is easy then I can read the 8 datasets in the csv file into 8 R
objects and pipe them to time series objects for graphs.
thanks
Dhruv
------------------------------
Message: 25
Date: Sun, 21 Sep 2008 18:09:01 -0400
From: "Tom Bonen" <tom.bonen@googlemail.com>
Subject: [R] color for lattice box plots
To: r-help@r-project.org
Message-ID:
<8316adf50809211509l7471b151x282a1cc19fd6a4ba@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
hi,
i have a figure with many boxplots and want to differentiate one group
of the boxplots by colour of the box. so for example:
library(lattice)
X <- replicate(3,rnorm(100))
bwplot(X[,1]~as.factor(X[,2]>1)|X[,3]>0)
# this gives four boxplots, i'd like to give 1 and 3 a different
# colour than 2 and 4
# i tried
bwplot(X[,1]~as.factor(X[,2]>1)|X[,3]>0,groups=as.factor(X[,2]>1))
but that does not change the display? how can i change the colour for
groups with bwplot? thanks.
tom
------------------------------
Message: 26
Date: Sun, 21 Sep 2008 18:25:32 -0400
From: "Tom Bonen" <tom.bonen@googlemail.com>
Subject: [R] suppress legend in ggplot(data, aes(y=Y, x=X,fill=Z))?
To: r-help <r-help@r-project.org>
Message-ID:
<8316adf50809211525q53e22a3dp5992e1c05c303ab9@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
hi,
is there any way to suppress the legend in ggplot(data, aes(y=Y,
x=X,fill=Z)) ? i'd like the values to be displayed in different colors
as specified by fill= and this works just fine. but i do not want to
have the legend on the right that is automatically created when fill
is specified.
thanks,
tom
------------------------------
Message: 27
Date: Sun, 21 Sep 2008 15:33:56 -0700
From: Bert Gunter <gunter.berton@gene.com>
Subject: Re: [R] selecting from a series of integers with pre-determined
probabilities
To: "'John Sorkin'" <jsorkin@grecc.umaryland.edu>, <r-help@r-project.org>
Message-ID: <000901c91c3a$225eaa40$6501a8c0@gne.windows.gene.com>
Content-Type: text/plain; charset="us-ascii"
?sample.
-- Bert Gunter
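A minimal sketch of what ?sample offers for this. It assumes the 20% is split evenly over 1 to 4 (the question below does not say how it is divided), and the names probs and draws are illustrative.

```r
set.seed(123)

# Assumed weights: 5 with probability 0.80, and 1 to 4 with 0.05 each.
probs <- c(0.05, 0.05, 0.05, 0.05, 0.80)

# One vectorized call produces all 500,000 draws; no loop is needed.
draws <- sample(1:5, size = 500000, replace = TRUE, prob = probs)

prop.table(table(draws))   # observed proportions, close to probs
```

Drawing everything in one call rather than 500,000 separate calls is what keeps this fast.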
-----Original Message-----
From: r-help-bounces@r-project.org [mailto:r-help-bounces@r-project.org] On
Behalf Of John Sorkin
Sent: Saturday, September 20, 2008 12:43 PM
To: r-help@r-project.org
Subject: [R] selecting from a series of integers with pre-determined
probabilities
R 2.6
Windows XP
I need to select from the integers 1,2,3,4,5 with some pre-determined
probability, e.g. probability of selecting 5 80%, probability of selecting 1
or 2 or 3 or 4 20%. Any suggestions for how I might accomplish this? I
need to do it very efficiently as I will be doing it 500,000 times.
Thanks
John
John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
------------------------------
Message: 28
Date: Mon, 22 Sep 2008 00:34:51 +0200
From: p@fo76.org
Subject: [R] Multiple plots per window
To: R Help <r-help@r-project.org>
Message-ID: <20080922003451.2qhs63jhhwow48s0@webmail.openit.de>
Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes";
format="flowed"
Hi all,
I'm currently working through "The Analysis of Time Series" by Chris
Chatfield. In order to also get a better understanding of R, I play
around with the examples and exercises (no homework or assignment,
just self-study!!).
Exercise 2.1 gives the following dataset (sales figures for 4 week
intervals):
> sales2.1.dataframe
1995 1996 1997 1998
1 153 133 145 111
2 189 177 200 170
3 221 241 187 243
4 215 228 201 178
5 302 283 292 248
6 223 255 220 202
7 201 238 233 163
8 173 164 172 139
9 121 128 119 120
10 106 108 81 96
11 86 87 65 95
12 87 74 76 53
13 108 95 74 94
I want to plot the histograms/densities for all four years in one window.
After trying out a couple of things, I finally ended up with the following
(it took me two hours - Ouch!):
sales2.1 <- c(153,189,221,215,302,223,201,173,121,106,86,87,108,
133,177,241,228,283,255,238,164,128,108,87,74,95,
145,200,187,201,292,220,233,172,119,81,65,76,74,
111,170,243,178,248,202,163,139,120,96,95,53,94)
sales2.1.matrix <- sales2.1
dim(sales2.1.matrix) <- c(4,13)
sales2.1.dataframe <- as.data.frame(sales2.1.matrix)
names(sales2.1.dataframe) <- c("1995","1996","1997","1998")
X11()
split.screen(c(2,2))
for (i in 1:4)
{
screen(i)
hist(sales2.1.dataframe[[i]],
probability=T,
xlim=c(0,400),
ylim=c(0,0.006),
main=names(sales2.1.dataframe)[i],
xlab="Sales")
lines(density(sales2.1.dataframe[[i]]))
}
close.screen(all=TRUE)
Although I'm happy that I finally got something that is pretty close
to what I wanted, I'm not sure whether this is the best or most elegant
way to do it. How would you do it? What functions/packages should I
look into, in order to improve these plots?
Thanks in advance for your comments and suggestions,
Peter
------------------------------
Message: 29
Date: Sun, 21 Sep 2008 18:41:22 -0400
From: John Poulsen <jpoulsen@zoo.ufl.edu>
Subject: [R] glmer -- extracting standard errors and other statistics
To: r-help@r-project.org
Message-ID: <48D6CD92.10305@zoo.ufl.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Hello,
I am using glmer() from lmer(lme4) to run generalized linear mixed
models. However, I am having a problem extracting the standard errors
for the fixed effects.
I have used:
summary(model)$coef
fixed.effects(model)
coef(model)
to get out the parameter estimates, but do not seem able to extract the
se's.
Anybody have a solution?
Thanks,
John
------------------------------
Message: 30
Date: Sun, 21 Sep 2008 18:52:10 -0400
From: "jim holtman" <jholtman@gmail.com>
Subject: Re: [R] r format questions
To: DS <ds5j@excite.com>
Cc: r-help@r-project.org
Message-ID:
<644e1f320809211552s76b1447fg312e57f7b9dba3a3@mail.gmail.com>
Content-Type: text/plain
You have to explicitly ask that they not be printed:
> x <- aggregate(state.x77, list(Region = state.region), mean)
> x
         Region Population   Income Illiteracy Life Exp    Murder  HS Grad    Frost      Area
1     Northeast   5495.111 4570.222   1.000000 71.26444  4.722222 53.96667 132.7778  18141.00
2         South   4208.125 4011.938   1.737500 69.70625 10.581250 44.34375  64.6250  54605.12
3 North Central   4803.000 4611.083   0.700000 71.76667  5.275000 54.51667 138.8333  62652.00
4          West   2915.308 4702.615   1.023077 71.23462  7.215385 62.00000 102.1538 134463.00
> print(x, row.names=FALSE)
        Region Population   Income Illiteracy Life Exp    Murder  HS Grad    Frost      Area
     Northeast   5495.111 4570.222   1.000000 71.26444  4.722222 53.96667 132.7778  18141.00
         South   4208.125 4011.938   1.737500 69.70625 10.581250 44.34375  64.6250  54605.12
 North Central   4803.000 4611.083   0.700000 71.76667  5.275000 54.51667 138.8333  62652.00
          West   2915.308 4702.615   1.023077 71.23462  7.215385 62.00000 102.1538 134463.00
On Sun, Sep 21, 2008 at 6:03 PM, DS <ds5j@excite.com> wrote:
> Hi,
>
> 1) I have noticed that when I use the aggregate function it outputs
> numbers in the results. for example:
> aggregate by product
>
> group.1 Aggregate
> 1 ProductA 1000400.00
> 2 ProductB 23232323.00
> 3 Missing 232323.00
>
> is there a way to suppress the numbers infront of aggregate outputs. I
> checked and they don't look like columns when I do a summary so I can't -1
> them away.
>
> 2) is there an easy way to then take my aggregate matrix and then format
> the sum wtih $ and commas. for e.g instead 10000 it should show
> $10,000.00?
>
> I am trying to create a report and am piping the aggregate into an xtable
> and feeding it R2html.
>
> thanks
> Dhruv
>
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?
------------------------------
Message: 31
Date: Sun, 21 Sep 2008 18:56:17 -0400
From: "jim holtman" <jholtman@gmail.com>
Subject: Re: [R] r format questions
To: DS <ds5j@excite.com>
Cc: r-help@r-project.org
Message-ID:
<644e1f320809211556r2023e1c3od632186ebe3d0447@mail.gmail.com>
Content-Type: text/plain
answer to your second question:
> paste("$", format(1234567.77, big.mark=','), sep='')
[1] "$1,234,568"
you will have to go through each column you want and explicitly do it:
> x
         Region Population   Income Illiteracy Life Exp    Murder  HS Grad    Frost      Area
1     Northeast   5495.111 4570.222   1.000000 71.26444  4.722222 53.96667 132.7778  18141.00
2         South   4208.125 4011.938   1.737500 69.70625 10.581250 44.34375  64.6250  54605.12
3 North Central   4803.000 4611.083   0.700000 71.76667  5.275000 54.51667 138.8333  62652.00
4          West   2915.308 4702.615   1.023077 71.23462  7.215385 62.00000 102.1538 134463.00
> x$Population <- paste("$", format(x$Population, big.mark=','), sep='')
> x
         Region Population   Income Illiteracy Life Exp    Murder  HS Grad    Frost      Area
1     Northeast $5,495.111 4570.222   1.000000 71.26444  4.722222 53.96667 132.7778  18141.00
2         South $4,208.125 4011.938   1.737500 69.70625 10.581250 44.34375  64.6250  54605.12
3 North Central $4,803.000 4611.083   0.700000 71.76667  5.275000 54.51667 138.8333  62652.00
4          West $2,915.308 4702.615   1.023077 71.23462  7.215385 62.00000 102.1538 134463.00
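If exactly two decimals are wanted, as in the $10,000.00 from the question, formatC() with format = "f" is one option. This is a sketch that swaps formatC() in for the format() call above; the helper name dollars and the data frame agg (rebuilt from the numbers in the question) are made up for illustration.

```r
# Hypothetical helper: fixed two decimals, comma as thousands separator.
dollars <- function(x) {
  paste0("$", formatC(x, format = "f", digits = 2, big.mark = ","))
}

dollars(10000)        # "$10,000.00"
dollars(1234567.77)   # "$1,234,567.77"

# Applied to a column, then printed without row numbers.
agg <- data.frame(Group.1   = c("ProductA", "ProductB", "Missing"),
                  Aggregate = c(1000400, 23232323, 232323))
agg$Aggregate <- dollars(agg$Aggregate)
print(agg, row.names = FALSE)
```

Note that the column becomes character after formatting, so any arithmetic has to happen before this step.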
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?
------------------------------
Message: 32
Date: Sun, 21 Sep 2008 16:00:47 -0700 (PDT)
From: gcam032 <gcam032@gmail.com>
Subject: Re: [R] Variable Selection for data reduction and
discriminant anlaysis
To: r-help@r-project.org
Message-ID: <19599461.post@talk.nabble.com>
Content-Type: text/plain; charset=us-ascii
Thanks Mark,
I failed to mention that i'm working within a compositional framework. I
didn't want to confuse things. My data is transformed to the clr or alr
under Aitchison geometry, so I am essentially working in Euclidean space.
Has anyone had experience doing stepwise LDA?? I can't for the life of me
find any help online about where to start.
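For readers who have not met the clr (centred log-ratio) transform mentioned above, a minimal base-R sketch: each row is logged and then centred by its own mean log. The function name clr and the two-row matrix comp are illustrative assumptions, not output from any package.

```r
# Centred log-ratio: log of each part minus the row's mean log,
# assuming all parts are strictly positive.
clr <- function(x) {
  lx <- log(x)
  sweep(lx, 1, rowMeans(lx))   # subtract each row's mean from that row
}

# Two made-up compositions, each row summing to 1.
comp <- matrix(c(0.2, 0.3, 0.5,
                 0.1, 0.6, 0.3), nrow = 2, byrow = TRUE)
z <- clr(comp)

rowSums(z)   # zero up to rounding: clr rows always sum to zero
```

The zero row sums mean the clr covariance matrix is singular, which is worth keeping in mind before feeding clr-transformed data to LDA.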
Thanks
Gareth
Mark Difford wrote:
Hi Gareth,
>> If I use the full composition (31 elements or variables), I can get
>> reasonable separation of my 6 sources.
A word of advice: You need to be exceptionally careful when analyzing
compositional data. Taking compositions puts your data values into a
constrained/bounded space (generally called a simplex) so that most standard
statistical procedures (i.e. anything that uses a Euclidean metric, and most
do) deliver erroneous results. Pearson wrote a paper on this long ago, but
it's generally been ignored (except by Aitchison and the Spanish School of
mathematical statisticians).
The problem is comparatively well known to geologists, who work with
compositional data much of the time. R has a very good package for analysing this
data-type: see the compositions package (a new release seems imminent). You
will be able to get most of the main references from it. (The authors of the
package also have a newly-released article in one of the Elsevier journals
[unfor. my bib+ are elsewhere so I cannot give details]).
You could start by Wiki'ing your way to "compositional data".
HTH, Mark.
Gareth Campbell wrote:
>
> Hello all,
>
> I'm dealing with geochemical analyses of some rocks.
>
> If I use the full composition (31 elements or variables), I can get
> reasonable separation of my 6 sources. Then when I go on to do LDA with the
> 6 groups, I get excellent separation.
>
> I feel like I should be reducing the variables to those that are providing
> the most discrimination between the groups as this is important information
> for me. I struggle to interpret the PCA plot in a way that helps me (due to
> the large number of elements). So I'm trying to do some sort of step-wise
> variable selection.
>
> I would love to hear from someone (possibly a geochemist or similar) who
> does this regularly to determine the best course of action in R to do this.
>
>
> Thanks very much
>
>
> --
> Gareth Campbell
> PhD Candidate
> The University of Auckland
>
> P +649 815 3670
> M +6421 256 3511
> E gareth.campbell@esr.cri.nz
> gcam032@gmail.com
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
--
View this message in context:
http://www.nabble.com/Variable-Selection-for-data-reduction-and-discriminant-anlaysis-tp19591270p19599461.html
Sent from the R help mailing list archive at Nabble.com.
------------------------------
Message: 33
Date: Mon, 22 Sep 2008 01:19:48 +0200
From: p@fo76.org
Subject: Re: [R] Multiple plots per window
To: r-help@r-project.org
Message-ID: <20080922011948.xdupj04hyv4w4c08@webmail.openit.de>
Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes";
format="flowed"
Sorry, as Mark Leeds pointed out to me, the row/column numbers were
mixed up in my example... happens when you cut & paste like mad from
your history... it should read as follows:
sales2.1 <- c(153,189,221,215,302,223,201,173,121,106,86,87,108,
133,177,241,228,283,255,238,164,128,108,87,74,95,
145,200,187,201,292,220,233,172,119,81,65,76,74,
111,170,243,178,248,202,163,139,120,96,95,53,94)
sales2.1.matrix <- sales2.1
dim(sales2.1.matrix) <- c(13,4)
sales2.1.dataframe <- as.data.frame(sales2.1.matrix)
names(sales2.1.dataframe) <-
c("1995","1996","1997","1998")
Peter
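
The reason the dimensions matter is that R fills a matrix column by column. The 52 values are listed year by year, so c(13,4) puts each year in its own column, while c(4,13) does not. A small check, using the same sales2.1 vector as above (the names m_ok and m_bad are made up for this sketch):

```r
sales2.1 <- c(153,189,221,215,302,223,201,173,121,106,86,87,108,
              133,177,241,228,283,255,238,164,128,108,87,74,95,
              145,200,187,201,292,220,233,172,119,81,65,76,74,
              111,170,243,178,248,202,163,139,120,96,95,53,94)

# Column-major fill: 13 rows x 4 columns gives one column per year
m_ok <- matrix(sales2.1, nrow = 13, ncol = 4)
colnames(m_ok) <- c("1995", "1996", "1997", "1998")

# 4 rows x 13 columns: the first column holds only the first 4 values of 1995
m_bad <- matrix(sales2.1, nrow = 4, ncol = 13)

m_ok[, "1995"]   # the first 13 values of sales2.1
m_ok[1, ]        # first period of each year
m_bad[, 1]       # 153 189 221 215
```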
Quoting p@fo76.org:
> Hi all,
>
> I'm currently working through "The Analysis of Time Series" by Chris
> Chatfield. In order to also get a better understanding of R, I play
> around with the examples and exercises (no homework or assignment,
> just self-study!).
>
> Exercise 2.1 gives the following dataset (sales figures for 4 week
> intervals):
>
>> sales2.1.dataframe
> 1995 1996 1997 1998
> 1 153 133 145 111
> 2 189 177 200 170
> 3 221 241 187 243
> 4 215 228 201 178
> 5 302 283 292 248
> 6 223 255 220 202
> 7 201 238 233 163
> 8 173 164 172 139
> 9 121 128 119 120
> 10 106 108 81 96
> 11 86 87 65 95
> 12 87 74 76 53
> 13 108 95 74 94
>
> I want to plot the histograms/densities for all four years in one window.
> After trying out a couple of things, I finally ended up with the following
> (it took me two hours - Ouch!):
>
> sales2.1 <- c(153,189,221,215,302,223,201,173,121,106,86,87,108,
> 133,177,241,228,283,255,238,164,128,108,87,74,95,
> 145,200,187,201,292,220,233,172,119,81,65,76,74,
> 111,170,243,178,248,202,163,139,120,96,95,53,94)
> sales2.1.matrix <- sales2.1
> dim(sales2.1.matrix) <- c(4,13)
> sales2.1.dataframe <- as.data.frame(sales2.1.matrix)
> names(sales2.1.dataframe) <- c("1995","1996","1997","1998")
>
> X11()
> split.screen(c(2,2))
> for (i in 1:4)
> {
> screen(i)
> hist(sales2.1.dataframe[[i]],
> probability=T,
> xlim=c(0,400),
> ylim=c(0,0.006),
> main=names(sales2.1.dataframe)[i],
> xlab="Sales")
> lines(density(sales2.1.dataframe[[i]]))
> }
> close.screen(all=TRUE)
>
> Although I'm happy that I finally got something that is pretty close
> to what I wanted, I'm not sure whether this is the best or most elegant
> way to do it. How would you do it? What functions/packages should I
> look into, in order to improve these plots?
>
> Thanks in advance for your comments and suggestions,
>
> Peter
>
------------------------------
Message: 34
Date: Mon, 22 Sep 2008 01:49:32 +0200
From: "Weiss, Bernd " <bernd.weiss@uni-koeln.de>
Subject: Re: [R] glmer -- extracting standard errors and other
statistics
To: jpoulsen@zoo.ufl.edu, r-help@r-project.org
Message-ID: <48D6DD8C.8030405@uni-koeln.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
John Poulsen wrote:
> Hello,
>
> I am using glmer() from the lme4 package to run generalized linear mixed
> models. However, I am having a problem extracting the standard errors
> for the fixed effects.
>
> I have used:
>
> summary(model)$coef
> fixed.effects(model)
> coef(model)
>
> to get out the parameter estimates, but do not seem able to extract the
> se's.
>
> Anybody have a solution?
>
You need to extract the variance-covariance matrix:
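library(lme4)
gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             family = binomial, data = cbpp)
# square roots of the diagonal of the variance-covariance matrix
# give the standard errors of the fixed effects
sqrt(diag(vcov(gm1)))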
HTH,
Bernd
------------------------------
Message: 35
Date: Sun, 21 Sep 2008 18:01:57 -0700 (PDT)
From: Ted Byers <r.ted.byers@gmail.com>
Subject: [R] Why isn't R recognising integers as numbers?
To: r-help@r-project.org
Message-ID: <19600308.post@talk.nabble.com>
Content-Type: text/plain; charset=us-ascii
I have a number of files containing anywhere from a few dozen to a few
thousand integers, one per record.
The statement "refdata18 =
read.csv("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv", header =
TRUE,na.strings="")" works fine, and if I type refdata18, I get the integers
displayed, one value per record (along with a record number). However, when
I try "fitdistr(refdata18,"negative binomial")", or hist.scott(refdata18,
prob = TRUE), I get an error:
Error in fitdistr(refdata18, "negative binomial") :
'x' must be a non-empty numeric vector
Or
Error in hist.default(x, nclass.scott(x), prob = prob, xlab = xlab, ...) :
'x' must be numeric
How can it not recognise integers as numbers?
Thanks
Ted
--
View this message in context:
http://www.nabble.com/Why-isn%27t-R-recognising-integers-as-numbers--tp19600308p19600308.html
Sent from the R help mailing list archive at Nabble.com.
------------------------------
Message: 36
Date: Sun, 21 Sep 2008 21:12:49 -0400
From: "jim holtman" <jholtman@gmail.com>
Subject: Re: [R] Why isn't R recognising integers as numbers?
To: "Ted Byers" <r.ted.byers@gmail.com>
Cc: r-help@r-project.org
Message-ID:
<644e1f320809211812t7a82ac5dy98208d60b3007ef8@mail.gmail.com>
Content-Type: text/plain
Best guess is that they are not integers. Do 'str' on your object and it
probably says they are 'factors'. This is probably due to some of your data
being non-numeric. Try using 'colClasses' on read.csv to specify what the
column should contain. Also try "scan" after skipping the first record if
it is a header:
> scan("", what=0L) # bad input after specifying integer
1: 1 2 3 4
5: 1 v
5:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
scan() expected 'an integer', got 'v'
> scan("", what=0L) # good input
1: 1
2: 2
3: 3
4:
Read 3 items
[1] 1 2 3
>
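
The 'colClasses' and 'str' suggestions can be tried without a file. The sketch below uses inline text in place of a CSV (the column name "count" and the values are made up) and shows, under those assumptions, how a stray non-numeric entry changes the column type and how colClasses turns that into an error.

```r
# Inline text stands in for a file; the values are made up
good <- read.csv(text = "count\n0\n1\n2", colClasses = "integer")
str(good)

# A stray non-numeric entry: without colClasses the column is not numeric
# (character in R >= 4.0.0, factor by default in older versions)
bad <- read.csv(text = "count\n0\n1\nv")
class(bad$count)

# With colClasses = "integer" the bad entry raises an error instead;
# tryCatch() captures the message so the script keeps running
res <- tryCatch(read.csv(text = "count\n0\n1\nv", colClasses = "integer"),
                error = function(e) conditionMessage(e))
res
```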
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?
------------------------------
Message: 37
Date: Sun, 21 Sep 2008 21:21:21 -0400
From: "Gabor Grothendieck" <ggrothendieck@gmail.com>
Subject: Re: [R] Multiple plots per window
To: p@fo76.org
Cc: r-help@r-project.org
Message-ID:
<971536df0809211821u2b99348ai36b1952bc127695d@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Here are two ways: one using classic graphics and one much
shorter way using lattice. ggplot2 would be another short way
(not shown).
Lines <- "1995 1996 1997 1998
153 133 145 111
189 177 200 170
221 241 187 243
215 228 201 178
302 283 292 248
223 255 220 202
201 238 233 163
173 164 172 139
121 128 119 120
106 108 81 96
86 87 65 95
87 74 76 53
108 95 74 94"
# read in data and remove the X from the column names
s <- read.table(textConnection(Lines), header = TRUE)
names(s) <- sub("X", "", names(s))
# 1. using classic graphics
# find overall ranges of x and y
h <- lapply(s, hist, probability = TRUE)
ylim <- range(unlist(lapply(h, "[[", "density")))
xlim <- range(unlist(lapply(h, "[[", "breaks")))
# plot
opar <- par(mfrow = c(2, 2))
for(i in 1:length(s)) {
  hist(s[[i]], main = names(s)[i], probability = TRUE,
       xlab = "Sales", xlim = xlim, ylim = ylim)
  lines(density(s[[i]]))
}
par(opar)
# 2. using lattice it's a bit easier
library(lattice)
histogram( ~ values | ind, stack(s), type = "density",
  panel = function(...) {
    panel.histogram(...)
    panel.densityplot(...)
  }
)
------------------------------
Message: 38
Date: Sun, 21 Sep 2008 20:44:50 -0500
From: Marc Schwartz <marc_schwartz@comcast.net>
Subject: Re: [R] Why isn't R recognising integers as numbers?
To: Ted Byers <r.ted.byers@gmail.com>
Cc: r-help@r-project.org
Message-ID: <48D6F892.3080901@comcast.net>
Content-Type: text/plain; charset=ISO-8859-1
on 09/21/2008 08:01 PM Ted Byers wrote:
> I have a number of files containing anywhere from a few dozen to a few
> thousand integers, one per record.
>
> The statement "refdata18 =
> read.csv("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv", header =
> TRUE,na.strings="")" works fine, and if I type refdata18, I get the integers
> displayed, one value per record (along with a record number). However, when
> I try "fitdistr(refdata18,"negative binomial")", or hist.scott(refdata18,
> prob = TRUE), I get an error:
>
> Error in fitdistr(refdata18, "negative binomial") :
> 'x' must be a non-empty numeric vector
> Or
> Error in hist.default(x, nclass.scott(x), prob = prob, xlab = xlab, ...) :
> 'x' must be numeric
>
> How can it not recognise integers as numbers?
>
> Thanks
>
> Ted
'refdata18' is a data frame and the two functions are expecting a
numeric vector.
If you use:
fitdistr(refdata18[, 1], "negative binomial")
or
hist(refdata18[, 1])
you should get a suitable result, presuming that the first column in the
data frame is a numeric vector.
Use:
str(refdata18)
to get a sense for the structure of the data frame, including the column
names, which you could then use, instead of the above index based syntax.
HTH,
Marc Schwartz
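
A minimal sketch of the distinction, using a made-up data frame in place of the CSV contents (the name refdata and its values are invented; the column name X0 mirrors the str() output reported in this thread):

```r
# Made-up counts standing in for the contents of the CSV file
refdata <- data.frame(X0 = c(0L, 0L, 1L, 1L, 2L, 3L, 6L, 12L, 18L))

is.numeric(refdata)          # FALSE: a data frame is a list of columns
is.numeric(refdata[, 1])     # TRUE: the column itself is an integer vector
is.numeric(refdata$X0)       # TRUE: same column, selected by name
is.numeric(refdata[["X0"]])  # TRUE: same column, list-style extraction

# Functions that want a numeric vector accept the extracted column;
# plot = FALSE only computes the histogram without drawing it
h <- hist(refdata$X0, plot = FALSE)
sum(h$counts)                # equals the number of observations
```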
------------------------------
Message: 39
Date: Sun, 21 Sep 2008 18:56:48 -0700 (PDT)
From: Ted Byers <r.ted.byers@gmail.com>
Subject: Re: [R] Why isn't R recognising integers as numbers?
To: r-help@r-project.org
Message-ID: <19600695.post@talk.nabble.com>
Content-Type: text/plain; charset=us-ascii
Thanks Jim,
Alas, it wasn't this. Here is the output from both of your suggestions:
> refdata18 = read.csv("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv",
+ header = TRUE,na.strings="")
> str(refdata18)
'data.frame': 341 obs. of 1 variable:
$ X0: int 0 0 0 0 0 0 0 0 0 0 ...
> scan("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv", what=0L)
Read 342 items
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[26] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[51] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[76] 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[101] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[126] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[151] 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[176] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3
[201] 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4
[226] 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6
[251] 6 6 6 6 6 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7 7
[276] 7 7 7 8 8 8 8 9 9 9 9 9 9 9 9 9 10 10 10 10 10 10 10 10 10
[301] 11 11 11 11 11 11 11 11 11 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12
[326] 12 12 12 18 18 18 18 18 18 18 18 18 18 18 18 18 18
Thanks anyway.
Ted
--
View this message in context:
http://www.nabble.com/Why-isn%27t-R-recognising-integers-as-numbers--tp19600308p19600695.html
Sent from the R help mailing list archive at Nabble.com.
------------------------------
Message: 40
Date: Sun, 21 Sep 2008 19:09:29 -0700 (PDT)
From: Ted Byers <r.ted.byers@gmail.com>
Subject: Re: [R] Why isn't R recognising integers as numbers?
To: r-help@r-project.org
Message-ID: <19600803.post@talk.nabble.com>
Content-Type: text/plain; charset=us-ascii
Thanks Marc,
That was it.
For the last 30 years, I'd write my own code, in FORTRAN, C++, or even Java,
to do whatever statistical analysis I needed. When at the office, sometimes
I could use SAS, but that hasn't been an option for me in years.
This is the first time I have had to load real data into R (instead of
generating random data to use while playing with some of the stats
functions, or manually typing dummy data).
I take it, then, that the result of loading data is a data frame, and not
just a matrix or array. Using something like "refdata18[, 1]" feels rather
alien, but I'm sure I'll quickly get used to it. I'd seen it before in the
R docs, but it didn't register that I had to use it to get the functions of
most interest to me to recognise my data as a vector of numbers, given I'd
provided only a vector of integers as input.
Thanks
Ted
--
View this message in context:
http://www.nabble.com/Why-isn%27t-R-recognising-integers-as-numbers--tp19600308p19600803.html
Sent from the R help mailing list archive at Nabble.com.
------------------------------
Message: 41
Date: Sun, 21 Sep 2008 21:49:14 -0500
From: Marc Schwartz <marc_schwartz@comcast.net>
Subject: Re: [R] Why isn't R recognising integers as numbers?
To: Ted Byers <r.ted.byers@gmail.com>
Cc: r-help@r-project.org
Message-ID: <48D707AA.6040909@comcast.net>
Content-Type: text/plain; charset=ISO-8859-1
on 09/21/2008 09:09 PM Ted Byers wrote:
> Thanks Marc,
>
> That was it.
>
> For the last 30 years, I'd write my own code, in FORTRAN, C++, or even Java,
> to do whatever statistical analysis I needed. When at the office, sometimes
> I could use SAS, but that hasn't been an option for me in years.
>
> This is the first time I have had to load real data into R (instead of
> generating random data to use while playing with some of the stats
> functions, or manually typing dummy data).
>
> I take it, then, that the result of loading data is a data frame, and not
> just a matrix or array. Using something like "refdata18[, 1]" feels rather
> alien, but I'm sure I'll quickly get used to it. I'd seen it before in the
> R docs, but it didn't register that I had to use it to get the functions of
> most interest to me to recognise my data as a vector of numbers, given I'd
> provided only a vector of integers as input.
<snip>
Ted,
If you read the 'Value' section of ?read.csv, it indicates that the
function returns a data frame. It is important to fully read the help
page for new functions so that you understand both how they are used and
the result(s) of their actions, including the 'Notes' section, which can
include further details, including gotchas and idiosyncrasies.
A data frame will be the result of read.csv() even if the data source is
a single column. Think of a data frame in the same way as a spreadsheet
or database table with one or more columns and one or more rows. The
unique aspect of a data frame is that each column can be a different
data type, though that need not be the case.
Thus, you still need to identify the column within the data frame that
you wish to manipulate/analyze further. There are various ways of doing
this, which are covered in Chapter 6 of "An Introduction to R" on Lists
and Data Frames. Some involve the use of indices, others using a column
name, as appropriate. There will be situations where they can be
interchangeable and others where one method will be superior to the
other. Time and experience will provide insight and intuition.
There are a myriad of ways of reading data into R and these are covered
in the Data Import/Export manual. Not all result in a data frame, but in
general and perhaps most commonly, that will be the result.
HTH,
Marc
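
One detail in Ted's earlier output fits this description of read.csv(): str() reported 341 observations in a column named X0, while scan() read 342 items from the same file. A plausible reading (an assumption, since the file itself is not shown) is that the file has no header line, so header = TRUE used the first value as the column name. The sketch below reproduces that effect with made-up inline data.

```r
txt <- "0\n0\n1\n2"   # made-up stand-in for a headerless one-column file

with_header    <- read.csv(text = txt, header = TRUE)
without_header <- read.csv(text = txt, header = FALSE)

names(with_header)     # "X0": the first value became the column name
nrow(with_header)      # 3: one observation fewer than the file has lines
names(without_header)  # "V1": default name when there is no header
nrow(without_header)   # 4

# Either way the result is a data frame; pull out the column to get a vector
x <- without_header[[1]]
is.numeric(x)
```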
------------------------------
Message: 42
Date: Sun, 21 Sep 2008 19:54:28 -0700 (PDT)
Subject: Re: [R] Calculating interval for conditional/unconditional
correlation matrix
Message-ID: <19678.53863.qm@web32203.mail.mud.yahoo.com>
Content-Type: text/plain; charset=utf-8
Hi Ana,
There are two problems:
First of all, if you want your matrix to have 4 columns its number of
elem[[elided Yahoo spam]]
Secondly, and this is what causes your error message, you should not call your
second function matrix. Call it matrix1, my_matrix, whatever. Otherwise R thinks
that you are calling your matrix function within itself.
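
A sketch of both points, under stated assumptions: the helper names build_corr() and det_corr() are hypothetical, and the 16 entries below assume that the intended matrix is symmetric with 1.0 on the diagonal (the vector in the original post has 17 entries, so one of them is presumably a typo).

```r
# build_corr() is a hypothetical name; the point is only that it is not `matrix`
build_corr <- function(x) {
  vals <- c(1.0, 0.2, 0.5, 0.8,
            0.2, 1.0, 0.5, 0.6,
            0.5, 0.5, 1.0, x,
            0.8, 0.6, x,   1.0)    # assumed intended entries: 16 values
  stopifnot(length(vals) == 16)    # 4 columns need a multiple of 4 entries
  matrix(vals, ncol = 4, byrow = TRUE)
}

# Determinant as a function of the unknown correlation x
det_corr <- function(x) det(build_corr(x))

m <- build_corr(0.3)
dim(m)
isSymmetric(m)
det_corr(0.3)
```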
> Subject: [R] Calculating interval for conditional/unconditional correlation matrix
> To: "R" <r-help@r-project.org>
> Received: Sunday, 21 September, 2008, 8:05 PM
> Hi there,
>
> Could anyone please help me to understand what should be
> done in order not to get this error message: Error:
> evaluation nested too deeply: infinite recursion /
> options(expressions=)?
>
> Here is my code:
>
> determinant<-function(x){det(matrix(c(1.0,0.2,0.5,0.8,0.2,1.0,0.5,0.6,0.5,0.5,0.5,1.0,x,0.8,0.6,x,1.0),ncol=4,byrow=T))}
>
> matrix<-function(x){(matrix(c(1.0,0.2,0.5,0.8,0.2,1.0,0.5,0.6,0.5,0.5,0.5,1.0,x,0.8,0.6,x,1.0),ncol=4,byrow=T))}
>
>
> conditional<-function(x,varcov){
> varcov<-matrix(x)
> sigmaxx<-varcov[3,3]
> sigmaxz<-varcov[3,1:2]
> sigmayy<-varcov[4,4]
> sigmayz<-varcov[4,1:2]
> sigmazx<-varcov[1:2,3]
> sigmazy<-varcov[1:2,4]
> sigmazz<-varcov[1:2,1:2]
>
> (x-sigmaxz%*%solve(sigmaZZ)%*%sigmazy)/sqrt((sigmaxx-sigmaxz%*%solve(sigmaZZ)%*%sigmazx)*(sigmayy-sigmayz%*%solve(sigmaZZ)%*%sigmazy))}
>
> interval<-uniroot(determinant,lower = min(c(0,1)), upper = max(c(0,1)))
>
> I tried also with the code below, but got the same Error
> message.
>
> lower.bound<-uniroot(determinant,c(0,0.5))$root
> upper.bound<-uniroot(determinant,c(0.51,1))$root
>
>
[[elided Yahoo spam]]>
> Ana
------------------------------
Message: 43
Date: Sun, 21 Sep 2008 18:26:20 -0500
From: Bingshan Li <bli1@bcm.tmc.edu>
Subject: Re: [R] How to plot "greater than" symbol on the x-axis
To: Gabor Grothendieck <ggrothendieck@gmail.com>
Cc: r-help@r-project.org, John Fox <jfox@mcmaster.ca>
Message-ID: <48D6D81C.40509@bcm.tmc.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Hi Gabor,
This works. This is exactly what I want. According to John Fox's reply,
I used expression(NA>=1) and it also worked. Thanks for the kind and
clever help.
Bingshan
Gabor Grothendieck wrote:
> On Sun, Sep 21, 2008 at 11:23 AM, Li, Bingshan <bli1@bcm.tmc.edu> wrote:
>
>> Hi John,
>>
>> Yes, you are right. I meant "greater-than-or-equal". According to your
>> suggestion, I can plot the symbol only. But what I want is to have >=1, >=2
>> and so on as labels on the x-axis. I could not make it work. Do you know how
>> to do it? The expression(">=1") did not work, and
>> paste(expression(">="), 1) did not work either.
>>
>
>
> Try this:
>
> plot(1:10, xaxt = "n")
> for(i in 1:10) axis(1, i, bquote(phantom(0) >= .(i)))
>
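
The same labels can also be built first and passed to a single axis() call. This is only a sketch of that variant: the names labs and lab_expr and the axis title are made up, while bquote() and phantom() are used exactly as in the loop quoted above.

```r
# Build one plotmath label per tick: ">= 1", ">= 2", ...
# phantom(0) supplies an invisible left-hand operand for the >= operator
labs <- lapply(1:10, function(i) bquote(phantom(0) >= .(i)))

# axis() accepts an expression vector as labels
lab_expr <- as.expression(labs)

plot(1:10, xaxt = "n", xlab = "Threshold")
axis(1, at = 1:10, labels = lab_expr)
```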
------------------------------
Message: 44
Date: Mon, 22 Sep 2008 14:42:15 +1200
From: Paul Murrell <p.murrell@auckland.ac.nz>
Subject: Re: [R] PDF fonts problem
To: Mihalicza Péter <mihalicza.peter@eski.hu>
Cc: r-help@r-project.org
Message-ID: <48D70607.40301@stat.auckland.ac.nz>
Content-Type: text/plain; charset=UTF-8
Hi
Mihalicza Péter wrote:
> Dear Dr. Murrell,
>
[[elided Yahoo spam]]
>
> Paul Murrell wrote:
>> Hi
>>
>>>
>>> #CMS
>>> pdf("tryfont-cms.pdf", family="CMS")
>>> grid.text("gg\u151hh\uF6ii\uF3jj kk\u171ll\uFCmm\uFAnn")
>>> dev.off()
>>> #u151 and u171 don't show, though the other accented ones do
>>>
>>> embedFonts("tryfont-cms.pdf",
>>> outfile="tryfont-cms-embed.pdf",
>>> fontpaths="/cm-super/afm/")
>>> #after embedding the same "slipping" occurs
>>
>> The 'fontpaths' argument describes where the PFB files are, not where
>> the AFM files are. So this is probably failing to embed the fonts
>> because it can't find the fonts. Does it work if you change to
>> something like ...
>>
>> embedFonts("tryfont-cms.pdf",
>> outfile="tryfont-cms-embed.pdf",
>> fontpaths="cm-super/pfb/")
>>
>> Paul
>>
>>
> This solved my problem, so I am really very grateful! I am not too
> familiar with font protocols.
> Just for the sake of knowledge: if my embedFonts specification should
> not have made any difference, why did the output PDF differ from the
> one before embedding?
Your embedFonts() specification (especially your 'fontpaths' argument)
*did* make a difference. This function calls ghostscript to perform the
embedding and if ghostscript cannot find the PFB files it cannot embed
the font. If the PDF file does not have embedded fonts, the PDF reader
will use (substitute) its own fonts and the result can look awful.
Paul
> Thanks again,
> Peter
>
>
>
--
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
paul@stat.auckland.ac.nz
http://www.stat.auckland.ac.nz/~paul/
------------------------------
Message: 45
Date: Mon, 22 Sep 2008 11:59:13 +0800 (CST)
Subject: [R] Help for R
To: r-help@r-project.org
Message-ID: <339826.80972.qm@web15908.mail.cnb.yahoo.com>
Content-Type: text/plain
Dear R users,
I've just started learning R and I'm having a problem with it. I got the
following error when I tried to run R:
Error in loadNamespace(package, c(which.lib.loc, lib.loc), keep.source =
keep.source) :
in 'matlab' methods specified for export, but none defined: sum,
size, padarray, flipud, fliplr
Error: package/namespace load failed for 'matlab'
Then I tried "package/load in package/matlab", however, the same
message was shown as above.
I would appreciate any help and suggestions. Thanks.
Kai
------------------------------
Message: 46
Date: Sun, 21 Sep 2008 23:08:42 -0500
From: "Matthew Pettis" <matthew.pettis@gmail.com>
Subject: [R] Hmisc and Ubuntu (aptitude install)
To: r-help@r-project.org
Message-ID:
<82ba77b80809212108y6baf7850i8b7d76c54bad160c@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Hi,
I'm trying to get the Hmisc module on my Ubuntu Hardy Heron install.
I tried getting Hmisc from within R by issuing the standard
'install.packages' command, but it said I needed 'gfortran' to
compile. I thought I could circumvent this by using 'aptitude' to get
the package 'r-cran-hmisc', but when I got it, the package had
critical missing parts (got 404s). So, I'll be trying to go back and
download 'gfortran', but can anybody tell me if this aptitude ubuntu
package should be kept up to date and is just currently overlooked?
Thanks,
Matt
--
It is from the wellspring of our despair and the places that we are
broken that we come to repair the world.
-- Murray Waas
------------------------------
Message: 47
Date: Mon, 22 Sep 2008 00:47:25 -0400
From: "Juliet Hannah" <juliet.hannah@gmail.com>
Subject: [R] adding layers in ggplot2 (data and code included)
To: r-help@r-project.org
Message-ID:
<93d6f2a80809212147o5c2e8d4co316396bad5f6217e@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Here is some sample data:
mydata <- read.table(textConnection("Est Group Tri
0 0 4.639644
1 0 4.579189
2 0 4.590714
0 1 4.443696
1 1 4.588243
2 1 4.650505
0 2 4.296608
1 2 4.826036
2 2 4.765386"),header=TRUE);
closeAllConnections();
I can form two plots, scatter and lines, as follows:
p <- ggplot(mydata, aes(x=Est, y=Tri))
p + geom_point(aes(colour=factor(Group),shape=factor(Group)))
and
p + geom_smooth(aes(group=factor(Group),color=factor(Group)),method=lm,se=F)
However, I am unable to have the plots together.
I obtain the following error:
> p +
geom_point(aes(colour=factor(Group),shape=factor(Group)))+geom_smooth(aes(group=factor(Group),color=factor(Group)),method=lm,se=F)
Error in `[.data.frame`(df, , var) : undefined columns selected
Thanks,
Juliet
------------------------------
Message: 48
Date: Mon, 22 Sep 2008 17:30:47 +1200
From: Rolf Turner <r.turner@auckland.ac.nz>
Subject: [R] Warnings in fitdistr() from MASS.
To: R-help Forum <r-help@r-project.org>
Message-ID: <80B3F5B3-0EC0-4E72-BEA9-5B40098294AE@auckland.ac.nz>
Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed
For a lark, I experimented a bit with the data from Ted Byers' recent
postings. The result of fitdistr() seemed sensible, but I was bothered
by the warnings about NaNs that arose. Warnings always make me nervous.
Explicitly this is what I did:
TXT <- "0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2
2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3
3 3 3 3 3
3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5
5 5 5 5 5
6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7
7 7 7 7 7
7 7 8 8 8 8 9 9 9 9 9 9 9 9 9 10 10 10 10 10 10 10 10 10 11
11 11 11
11 11 11 11 11 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12
12 12 12
12 18 18 18 18 18 18 18 18 18 18 18 18 18 18"
x <- scan(textConnection(TXT))
closeAllConnections()
try.x <- fitdistr(x,"negative binomial")
Two warnings about NaNs being produced resulted.
Digging into the code with browser() revealed that in the optimization
process negative values of "size" were tried on occasion, and this was
giving the NaNs.
Basically I'm sending this out so that maybe those who are like me and
are made nervous by warnings will be able to search the archives and
find reassurance that all is actually well.
To keep the warnings from the door, one can set an argument "lower" in
the call to fitdistr, e.g.
eps <- sqrt(.Machine$double.eps)
fitdistr(x,"negative binomial",lower=c(eps,eps))
Note that setting lower=c(0,0) doesn't work --- you get an *error* to
the
[[elided Yahoo spam]]
I also tried building my own local version of fitdistr() which had
if(distname == "negative binomial" & is.null(Call$lower))
Call$lower <- rep(sqrt(.Machine$double.eps),2)
just after the assignment ``Call$hessian <- TRUE''. This *seemed*
to work (i.e. prevent those nervous-making warnings and still give
the right answer).
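The origin of the NaNs can be reproduced in base R without MASS. The sketch below is only an illustration, not part of fitdistr() itself: it assumes the optimizer stepped to a negative "size" (the value -0.5 is made up) and shows that the negative binomial density is then NaN, while a small positive bound keeps it finite.

```r
# Negative binomial density at a trial value the optimizer might visit.
# size = -0.5 is a made-up example of an out-of-range step.
bad <- suppressWarnings(dnbinom(3, size = -0.5, mu = 2))
is.nan(bad)      # the density is undefined for negative size

# A small positive lower bound, as in the "lower" argument above,
# keeps the density well defined.
eps <- sqrt(.Machine$double.eps)
ok <- dnbinom(3, size = eps, mu = 2)
is.finite(ok)
```

This is why a strictly positive "lower" silences the warnings without changing the fitted values in this case.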
cheers,
Rolf Turner
------------------------------
Message: 49
Date: Mon, 22 Sep 2008 08:01:23 +0200
From: Peter Dalgaard <p.dalgaard@biostat.ku.dk>
Subject: Re: [R] Why isn't R recognising integers as numbers?
To: Ted Byers <r.ted.byers@gmail.com>
Cc: r-help@r-project.org
Message-ID: <48D734B3.5090107@biostat.ku.dk>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Ted Byers wrote:
> Thanks Jim,
>
> Alas, it wasn't this. Here is the output from both of your suggestions:
>
>> refdata18 = read.csv("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv",
>>     header = TRUE,na.strings="")
>> str(refdata18)
>>
> 'data.frame': 341 obs. of 1 variable:
> $ X0: int 0 0 0 0 0 0 0 0 0 0 ...
>
Ummm, is there a header line or not? If there isn't, read.csv is going
to eat the first observation thinking it is a name (and since it is
non-syntactic add an X in front).
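A minimal sketch of this effect, using made-up toy data rather than the original file: with header = TRUE the first value is consumed as a column name, which would explain 341 observations against the 342 items that scan() reads.

```r
# Three observations, no header line (toy data, not the original file).
csv_text <- "0\n0\n1\n"

# Default behaviour: the first observation is taken as the column name.
with_header <- read.csv(text = csv_text, header = TRUE)
names(with_header)   # "X0"
nrow(with_header)    # 2

# Telling read.csv() there is no header keeps every observation.
no_header <- read.csv(text = csv_text, header = FALSE)
names(no_header)     # "V1"
nrow(no_header)      # 3
```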
The scan command looks fine, you just should have assigned it somewhere,
x <- scan(......) and then fitdistr(x, ....)
>> scan("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv", what=0L)
>>
> Read 342 items
> [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> [26] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> [51] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> [76] 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
> [101] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
> [126] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
> [151] 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
> [176] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3
> [201] 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4
> [226] 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6
> [251] 6 6 6 6 6 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7 7
> [276] 7 7 7 8 8 8 8 9 9 9 9 9 9 9 9 9 10 10 10 10 10 10 10 10 10
> [301] 11 11 11 11 11 11 11 11 11 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12
> [326] 12 12 12 18 18 18 18 18 18 18 18 18 18 18 18 18 18
>
>
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk) FAX: (+45) 35327907
------------------------------
Message: 50
Date: Sun, 21 Sep 2008 23:10:53 -0700
From: Eric <rmailbox@justemail.net>
Subject: Re: [R] adding layers in ggplot2 (data and code included)
To: r-help@r-project.org
Message-ID: <48D736ED.20904@justemail.net>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
The way you've attempted to get this result seems to align with the way
R "should" work, but it fails in this case.
The fix is to break things up a little bit:
p <- ggplot(mydata, aes(x=Est, y=Tri))
p <- p + geom_point(aes(colour=factor(Group),shape=factor(Group)))
p <- p +
geom_smooth(aes(group=factor(Group),color=factor(Group)),method=lm,se=F)
p
Eric
Juliet Hannah wrote:
> Here is some sample data:
>
> mydata <- read.table(textConnection("Est Group Tri
> 0 0 4.639644
> 1 0 4.579189
> 2 0 4.590714
> 0 1 4.443696
> 1 1 4.588243
> 2 1 4.650505
> 0 2 4.296608
> 1 2 4.826036
> 2 2 4.765386"),header=TRUE);
> closeAllConnections();
>
> I can form two plots, scatter and lines, as follows:
>
> p <- ggplot(mydata, aes(x=Est, y=Tri))
> p + geom_point(aes(colour=factor(Group),shape=factor(Group)))
>
> and
>
> p + geom_smooth(aes(group=factor(Group),color=factor(Group)),method=lm,se=F)
>
> However, I am unable to have the plots together.
>
> I obtain the following error:
>
>
>> p + geom_point(aes(colour=factor(Group),shape=factor(Group)))+geom_smooth(aes(group=factor(Group),color=factor(Group)),method=lm,se=F)
>>
> Error in `[.data.frame`(df, , var) : undefined columns selected
>
> Thanks,
>
> Juliet
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
------------------------------
Message: 51
Date: Sun, 21 Sep 2008 23:17:27 -0700
From: <rkevinburton@charter.net>
Subject: [R] Time series (ts) questions.
To: r-help@r-project.org
Message-ID: <20080922021727.S31BZ.130479.root@mp16>
Content-Type: text/plain; charset=utf-8
I have been working with the base time series object (ts) and I had a couple of
questions that hopefully this group can help me with:
1) What is the best way to append an observation to an existing time series?
Suppose I have a time series:
t <- ts(1:12, frequency=5)
This would generate two complete cycles and a partial third one. Now I would
like to append an observation to this time series. I could use 'c', but then I
would need to rebuild the whole time series, and I would need to know the
frequency etc. I would like some operation like '+' that would simply
append the value to the end of the time series (incrementing the last time
value so that things like cycle() still output the correct values), but alas
t + 10
is already taken by an equally useful operation: adding 10 to each element of
the time series (rather than, in this case, appending ts(10,frequency) with a
time value of 13 to the time series).
2) What is the best way to get the last time value in a time series? I can do
something like:
(start(t)[2] - 1) + (end(t)[1]-1) * frequency(t) + end(t)[2]
But there has to be an easier way.
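One possible approach to both questions, sketched with base R only. It assumes that "last time value" means the time of the final observation, and it rebuilds the series from its own start and frequency so that neither has to be known in advance; the name t2 is introduced here for illustration.

```r
t <- ts(1:12, frequency = 5)

# 1) Append one observation by rebuilding the series from its own
#    attributes (c() on a ts returns a plain vector).
t2 <- ts(c(t, 10), start = start(t), frequency = frequency(t))
end(t2)                  # 3 3: third cycle, third position
cycle(t2)[length(t2)]    # 3

# 2) Time of the last observation in the original series.
tsp(t)[2]                # 3.2, i.e. 1 + 11/5
end(t)                   # 3 2: the same point as (cycle, position)
```

The rebuild copies the data, so for very long series or frequent appends a different data structure may be preferable.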
Thank you.
Kevin
------------------------------
Message: 52
Date: Mon, 22 Sep 2008 08:43:07 +0200
From: "PALMIER Patrick - CETE NP/INFRA/TRF"
<Patrick.Palmier@developpement-durable.gouv.fr>
Subject: [R] Matrix balancing on margins
To: r-help@r-project.org
Message-ID: <48D73E7B.4020702@developpement-durable.gouv.fr>
Content-Type: text/plain
Hello,
Is there any package in R for balancing a matrix on its margins?
I want to estimate a matrix given:
* an initial matrix (1 everywhere, for example)
* row margins
* column margins
* a distance class vector (each cell of the matrix belongs to a
distance class), and I want the distribution over distance classes
to be preserved
How can I do this?
Is there an existing function, or should I write an iterative
script myself?
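For reference, a minimal sketch of the iterative route using base R only. It balances on the row and column margins (iterative proportional fitting) and does not handle the distance class constraint, which would need a further scaling step inside the loop. The function name ipf and all target values are made up for illustration.

```r
# Iterative proportional fitting on two margins.
# Assumption: row_target and col_target have the same grand total.
ipf <- function(seed, row_target, col_target, tol = 1e-10, max_iter = 1000) {
  stopifnot(isTRUE(all.equal(sum(row_target), sum(col_target))))
  m <- seed
  for (iter in seq_len(max_iter)) {
    m <- sweep(m, 1, row_target / rowSums(m), "*")  # match row totals
    m <- sweep(m, 2, col_target / colSums(m), "*")  # match column totals
    # Columns match exactly after the second step; check the rows.
    if (max(abs(rowSums(m) - row_target)) < tol) break
  }
  m
}

seed <- matrix(1, nrow = 2, ncol = 3)   # "1 everywhere" initial matrix
fit <- ipf(seed, row_target = c(40, 60), col_target = c(20, 30, 50))
rowSums(fit)
colSums(fit)
```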
Thanks
--
Patrick PALMIER
Centre d'Études Techniques de l'Équipement Nord - Picardie
Département Infrastructures
Trafic -- Socio-économie
2, rue de Bruxelles, BP 275
59019 Lille cedex
FRANCE
Tél: +33 (0) 3 20 49 60 70
Fax: +33 (0) 3 20 49 63 69
[[alternative HTML version deleted]]
------------------------------
Message: 53
Date: Sun, 21 Sep 2008 23:48:47 -0700 (PDT)
From: Mark Difford <mark_difford@yahoo.co.uk>
Subject: Re: [R] Variable Selection for data reduction and
discriminant anlaysis
To: r-help@r-project.org
Message-ID: <19602702.post@talk.nabble.com>
Content-Type: text/plain; charset=us-ascii
Hi Gareth,
>> My data is transformed to the clr or alr under Aitchison geometry, so I
>> am essentially working
>> in Euclidean space.
Great: glad to hear it.
>> Has anyone had experience doing stepwise LDA?? I can't for the life of
>> me find any help online about where to start.
A better option might be this: Trevor Hastie and a student of his have
recently put out a paper that does a step-up from penalized discriminant
analysis based, I think, on Trevor's sparse principal component analysis
method (in his elasticnet package).
http://www-stat.stanford.edu/~hastie/Papers/sda_line.pdf
You can get R-code to do the analysis on the first author's website;
there's a link in the paper.
Bye, Mark.
gcam032 wrote:
>
> Thanks Mark,
>
> I failed to mention that i'm working within a compositional framework. I
> didn't want to confuse things. My data is transformed to the clr or alr
> under Aitchison geometry, so I am essentially working in Euclidean space.
>
> Has anyone had experience doing stepwise LDA?? I can't for the life of me
> find any help online about where to start.
>
> Thanks
>
> Gareth
>
>
> Mark Difford wrote:
> Hi Gareth,
>
>>> If I use the full composition (31 elements or variables), I can get
>>> reasonable separation of my 6 sources.
>
> A word of advice: You need to be exceptionally careful when analyzing
> compositional data. Taking compositions puts your data values into a
> constrained/bounded space (generally called a simplex) so that most
> standard statistical procedures (i.e. anything that uses a Euclidean
> metric, and most do) deliver erroneous results. Pearson wrote a paper on
> this long ago, but it's generally been ignored (except by Aitchison and
> the Spanish School of mathematical statisticians).
>
> The problem is comparatively well known to geologists, who work with
> compositional data much of the time. R has a very good package for analysing
> this data-type: see the compositions package (a new release seems
> imminent). You will be able to get most of the main references from it.
> (The authors of the package also have a newly-released article in one of
> the Elsevier journals [unfor. my bib+ are elsewhere so I cannot give
> details]).
>
> You could start by Wiki'ing your way to "compositional data".
>
> HTH, Mark.
>
>
>
> Gareth Campbell wrote:
>>
>> Hello all,
>>
>> I'm dealing with geochemical analyses of some rocks.
>>
>> If I use the full composition (31 elements or variables), I can get
>> reasonable separation of my 6 sources. Then when I go onto do LDA with
>> the 6 groups, I get excellent separation.
>>
>> I feel like I should be reducing the variables to those that are providing
>> the most discrimination between the groups as this is important
>> information for me. I struggle to interpret the PCA plot in a way that
>> helps me (due to the large number of elements). So I'm trying to do some
>> sort of step-wise variable selection.
>>
>> I would love to hear from someone (possibly a geochemist or similar) who
>> does this regularly to determine the best course of action in R to do
>> this.
>>
>>
>> Thanks very much
>>
>>
>> --
>> Gareth Campbell
>> PhD Candidate
>> The University of Auckland
>>
>> P +649 815 3670
>> M +6421 256 3511
>> E gareth.campbell@esr.cri.nz
>> gcam032@gmail.com
>>
>> [[alternative HTML version deleted]]
>>
>>
>>
>
>
--
View this message in context:
http://www.nabble.com/Variable-Selection-for-data-reduction-and-discriminant-anlaysis-tp19591270p19602702.html
Sent from the R help mailing list archive at Nabble.com.
------------------------------
Message: 54
Date: Mon, 22 Sep 2008 08:50:21 +0200
From: " José E. Lozano " <lozalojo@jcyl.es>
Subject: [R] Manage huge database
To: <r-help@stat.math.ethz.ch>
Message-ID: <47A455630022D55E@mtacsbs.csbs.jcyl.es> (added by
postmaster@jcyl.es)
Content-Type: text/plain
Hello,
Recently I have been trying to open a huge database with no success.
It’s a 4GB csv plain text file with around 2000 rows and over 500,000
columns/variables.
I have tried the SAS System, but it reads only around 5000 columns, no
more. R hangs when opening the file.
Is there any way to work with “parts” (a set of columns) of this database,
since it’s impossible to manage it all at once?
Is there any way to establish a link to the csv file and to state the
columns you want to fetch every time you make an analysis?
I’ve been searching the net, but found little about this topic.
Best regards,
Jose Lozano
[[alternative HTML version deleted]]
------------------------------
Message: 55
Date: Mon, 22 Sep 2008 08:08:20 +0100
From: "Barry Rowlingson" <b.rowlingson@lancaster.ac.uk>
Subject: Re: [R] Manage huge database
To: " José E. Lozano " <lozalojo@jcyl.es>
Cc: r-help@stat.math.ethz.ch
Message-ID:
<d8ad40b50809220008r73daa11fi5d6b845fc1ca3d04@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
2008/9/22 José E. Lozano <lozalojo@jcyl.es>:
> Recently I have been trying to open a huge database with no success.
>
> It's a 4GB csv plain text file with around 2000 rows and over 500,000
> columns/variables.
I wouldn't call a 4GB csv text file a 'database'.
> Is there any way to work with "parts" (a set of columns) of this database,
> since its impossible to manage it all at once?
Yes, use a database. A real database.
> Is there any way to establish a link to the csv file and to state the
> columns you want to fetch every time you make an analysis?
No, but you can establish a link to a database. You want a database.
A real relational database.
> I've been searching the net, but found little about this topic.
Try:
http://cran.r-project.org/doc/manuals/R-data.html#Relational-databases
Barry
------------------------------
Message: 56
Date: Mon, 22 Sep 2008 09:16:52 +0200
From: Martin Maechler <maechler@stat.math.ethz.ch>
Subject: Re: [R] Symmetric matrix
To: Dimitris Rizopoulos <d.rizopoulos@erasmusmc.nl>
Message-ID: <18647.18020.828088.828816@stat.math.ethz.ch>
Content-Type: text/plain; charset=us-ascii
>>>>> "DR" == Dimitris Rizopoulos <d.rizopoulos@erasmusmc.nl>
>>>>> on Sun, 21 Sep 2008 19:58:44 +0200 writes:
DR> try the following
DR> a <- matrix(rnorm(36), 6)
DR> ind <- lower.tri(a)
DR> a[ind] <- t(a)[ind]
DR> a
Yes, indeed, it needs the t(.) trick.
Note that 'Matrix' package has a function forceSymmetric(.) to
do this for you (faster, using C code):
A <- forceSymmetric(Matrix(rnorm(36), 6))
is all you'd need {if you can afford to trash half of the random
numbers generated}
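To see why the t(.) trick is needed, here is a small base R sketch with a deterministic 4 x 4 matrix instead of rnorm(). Both triangles are read in column-major order, so copying the upper triangle straight into the lower one puts some values in the wrong cells; the names wrong and right are introduced only for this illustration.

```r
a <- matrix(1:16, 4)     # deterministic stand-in for matrix(rnorm(36), 6)
ind <- lower.tri(a)

# Naive attempt: upper.tri() values arrive in column-major order,
# which does not line up with the lower-triangle positions.
wrong <- a
wrong[ind] <- a[upper.tri(a)]
isSymmetric(wrong)       # FALSE

# With t(a), position [i, j] of the lower triangle receives a[j, i].
right <- a
right[ind] <- t(a)[ind]
isSymmetric(right)       # TRUE
```

For 2 x 2 and 3 x 3 matrices the naive version happens to give a symmetric result, which can hide the problem in small tests.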
Martin Maechler, ETH Zurich
DR> I hope it helps.
DR> Best,
DR> Dimitris
DR> Megh Dal wrote:
>> I have following matrix :
>>
>> a = matrix(rnorm(36), 6)
>>
>> Now I want to replace the lower-triangular elements with its
>> upper-triangular elements. That is, I want to make a symmetric matrix from a.
>> I have tried the lower.tri() and upper.tri() functions, but did not get the
>> desired result. Can anyone please tell me how to do that?
------------------------------
Message: 57
Date: Mon, 22 Sep 2008 15:35:04 +0800
From: "Yihui Xie" <xieyihui@gmail.com>
Subject: Re: [R] Manage huge database
To: " José E. Lozano " <lozalojo@jcyl.es>
Cc: r-help@stat.math.ethz.ch
Message-ID:
<89b6b8c90809220035o3f702624p34cb83000ad6b39f@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Hi,
You can treat it as a database and use ODBC to fetch data from the CSV
file using SQL. See the package RODBC for details about database
connections. (I have dealt with similar problems before with RODBC)
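Apart from ODBC, base R can also skip columns while reading, which may be worth trying first. The sketch below uses a made-up five-column temporary file as a stand-in for the real CSV; the column names and the vector wanted are hypothetical, and whether this is fast enough for a 4GB file with 500,000 columns is untested.

```r
# Toy file standing in for the real CSV (5 columns instead of 500,000).
path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:3, a = 4:6, b = 7:9, c = 10:12, d = 13:15),
          path, row.names = FALSE)

# In colClasses, "NULL" skips a column and NA lets read.csv() guess the type.
wanted <- c("id", "c")
header <- names(read.csv(path, nrows = 1))
classes <- ifelse(header %in% wanted, NA, "NULL")
part <- read.csv(path, colClasses = classes)
names(part)              # "id" "c"

unlink(path)             # remove the temporary file
```

Only the selected columns are kept in memory, although the whole file is still scanned on every read.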
Regards,
Yihui
--
Yihui Xie <xieyihui@gmail.com>
Phone: +86-(0)10-82509086 Fax: +86-(0)10-82509086
Mobile: +86-15810805877
Homepage: http://www.yihui.name
School of Statistics, Room 1037, Mingde Main Building,
Renmin University of China, Beijing, 100872, China
On Mon, Sep 22, 2008 at 2:50 PM, José E. Lozano <lozalojo@jcyl.es> wrote:
> Hello,
>
> Recently I have been trying to open a huge database with no success.
>
> It's a 4GB csv plain text file with around 2000 rows and over 500,000
> columns/variables.
>
> I have try with The SAS System, but it reads only around 5000 columns, no
> more. R hangs up when opening.
>
> Is there any way to work with "parts" (a set of columns) of this database,
> since its impossible to manage it all at once?
>
> Is there any way to establish a link to the csv file and to state the
> columns you want to fetch every time you make an analysis?
>
> I've been searching the net, but found little about this topic.
>
> Best regards,
>
> Jose Lozano
>
> [[alternative HTML version deleted]]
>
------------------------------
Message: 58
Date: Mon, 22 Sep 2008 09:49:09 +0200
From: " José E. Lozano " <lozalojo@jcyl.es>
Subject: Re: [R] Manage huge database
To: "'Yihui Xie'" <xieyihui@gmail.com>
Cc: r-help@stat.math.ethz.ch
Message-ID: <47A455630022DC1E@mtacsbs.csbs.jcyl.es> (added by
postmaster@jcyl.es)
Content-Type: text/plain; charset="iso-8859-1"
Hello, Yihui
> You can treat it as a database and use ODBC to fetch data from the CSV
> file using SQL. See the package RODBC for details about database
> connections. (I have dealt with similar problems before with RODBC)
Thanks for your tip. I have used RODBC before to read data from MSAccess and
MSExcel files, but I never imagined it could work for non-database files
such as csv.
I will check the RODBC documentation.
Best Regards,
Jose Lozano
------------------------------------------
Jose E. Lozano Alonso
Observatorio de Salud Pública.
Direccion General de Salud Pública e I+D+I.
Junta de Castilla y León.
Direccion: Paseo de Zorrilla, nº1. Despacho 3103. CP 47071. Valladolid.
------------------------------
Message: 59
Date: Mon, 22 Sep 2008 10:02:18 +0200
From: " José E. Lozano " <lozalojo@jcyl.es>
Subject: Re: [R] Manage huge database
To: "'Barry Rowlingson'" <b.rowlingson@lancaster.ac.uk>
Cc: r-help@stat.math.ethz.ch
Message-ID: <47A455630022DD57@mtacsbs.csbs.jcyl.es> (added by
postmaster@jcyl.es)
Content-Type: text/plain; charset="iso-8859-1"
> I wouldn't call a 4GB csv text file a 'database'.
Obviously, a csv is not a database in itself. What I meant (though it
seems I was not understood) is that I had a huge database, exported to a csv
file by the people who created it (and I don't have any idea of the original
format of the database).
> Yes, use a database. A real database.
I've used MSAccess and there is a limit of 255 columns, as far as I know, so
there is no way of importing it. Obviously, I won't buy an Oracle license to
read this file, so: what database system allows a 500000-variable table?
MySQL? Do I have to split the file into smaller parts, import them as tables,
and relate them all using an index field?
> No, but you can establish a link to a database. You want a database.
> A real relational database.
> Try:
> http://cran.r-project.org/doc/manuals/R-data.html#Relational-databases
It didn't help, sorry. I knew perfectly well what a relational database is
(and I humbly consider myself an advanced user of MSAccess+VBA, only
that I've never faced this problem with so many variables). You should not
suppose everyone's stupid, though...
Thanks for your help,
Best regards
Jose Lozano
------------------------------
Message: 60
Date: Fri, 22 Aug 2008 09:15:20 +0100
From: Robin Hankin <rksh1@cam.ac.uk>
Subject: Re: [R] how to keep up with R?
To: a.ramasamy@imperial.ac.uk
Cc: r-help <r-help@stat.math.ethz.ch>, Barry Rowlingson
<b.rowlingson@lancaster.ac.uk>
Message-ID: <48AE7598.7060209@cam.ac.uk>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Adaikalavan Ramasamy wrote:
> I agree! The best way to learn (and remember for longer) is to teach
> someone else about it.
>
> And there is no reason not to repeat some of the analysis done on SAS
> with R. That way you can verify your outputs or compare the
> presentations. If you consistently find differences in the outputs,
> then trying to figure out the reason may lead you to better understand
> the methods (e.g. different optimization or estimation procedures).
>
My take on this:
I have repeatedly found that it is surprisingly easy to improve on
existing (non-R) implementations of statistical and non-statistical
computation, when working in R.
Something about the structure of the language, something about the
package mechanism, something about R-help, something about R-core,
something about open-source, something about JSS or R-news, whatever it
is, there is SOMETHING ABOUT R which lends itself to straightforward
production of quality software. And that something is missing from other
programming languages, IMO.
rksh
> Regards, Adai
>
>
>
> Barry Rowlingson wrote:
>> 2008/9/19 Wensui Liu <liuwensui@gmail.com>:
>>> Dear Listers,
>>>
>>> I've been a big fan of R since graduate school. After working in the
>>> industry for years, I haven't had many opportunities to use R and am
>>> mainly using SAS. However, I am still forcing myself really hard to stay
>>> close to R by reading R-help and books and writing R code by myself for
>>> fun. But by and by, I start realizing I have hard time to keep up with R
>>> and am afraid that I would totally forget how to program in R.
>>>
>>> I really like it and am very unwilling to give it up. Is there any
>>> idea how I might keep touch with R without using it in work on daily
>>> basis? I really appreciate it.
>>>
--
Robin K. S. Hankin
Senior Research Associate
Cambridge Centre for Climate Change Mitigation Research (4CMR)
Faculty of Economics
The University of Cambridge
rksh1@cam.ac.uk
01223-764877
------------------------------
Message: 61
Date: Mon, 22 Sep 2008 09:30:52 +0100 (BST)
From: (Ted Harding) <Ted.Harding@manchester.ac.uk>
Subject: Re: [R] Why isn't R recognising integers as numbers?
To: Ted Byers <r.ted.byers@gmail.com>
Cc: r-help@r-project.org
Message-ID: <XFMail.080922093052.Ted.Harding@manchester.ac.uk>
Content-Type: text/plain; charset=iso-8859-1
Hi Ted (from Ted),
Just to clarify Marc's comments about dataframes in more basic terms.
If you read in data with read.csv() the result returned by the function
is a dataframe. This is a specialised kind of list, which you can think
of as a list of "columns" all of the same length. You can think of each
"column" as a vector of elements, all of which must be of the same type
within the column, though the type can vary (e.g. numeric, factor,
character) between columns. When you display a dataframe, it looks like
a matrix, though in R terms it is not really a matrix; it is a list,
where each component of the list is a "column".
Of course a dataframe, like any list, might have only one component.
But it is still a list -- and the actual contents are only available
"one layer down", after you have extracted that component by some
means (e.g. by using the "$" extractor). Simple example:
L <- c(1,2,3,4) ## vector
L
# [1] 1 2 3 4
L.df <- data.frame(L=L) ## Dataframe with 1 component named "L"
L.df
# L
# 1 1
# 2 2
# 3 3
# 4 4
L.df$L ## Extract the component named "L"
# [1] 1 2 3 4 ## Compare with the result of 'L' above
# Try a regression on L (this works):
lm(L ~ 1)
# Call:
# lm(formula = L ~ 1)
# Coefficients:
# (Intercept)
# 2.5
# Try a regression on L.df (this doesn't work):
lm(L.df ~ 1)
# Error in model.frame.default(formula = L.df ~ 1,
# drop.unused.levels = TRUE) :
# invalid type (list) for variable 'L.df'
# But it does after you refer to the component L by name:
lm(L.df$L ~ 1)
# Call:
# lm(formula = L.df$L ~ 1)
# Coefficients:
# (Intercept)
# 2.5
# or:
lm(L ~ 1, data=L.df)
# Call:
# lm(formula = L ~ 1, data = L.df)
# Coefficients:
# (Intercept)
# 2.5
# But you can (for a dataframe, not a general list) use an "index"
# method of extraction *as if* it were a matrix (even though it isn't):
L.df[,1]
# [1] 1 2 3 4
L.df[3,1]
# [1] 3
# But compare with:
L.df[1]
# L
# 1 1
# 2 2
# 3 3
# 4 4
which is essentially the same as L.df itself (e.g. lm(L.df[1] ~ 1)
will fail in exactly the same way as lm(L.df ~ 1) did).
The dataframe structure exists in R because so much data is typically
in the row by column (case by variables) layout such as you get in
spreadsheets and associated CSV files, and it is very useful to be
able to get into this layout directly (and refer to the variables
by name, as above).
The full generality of a 'list' can also be useful for encapsulating
data of a less strictly structured kind, but that is another (longer)
story!
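The "one layer down" point can be checked directly with a few base R predicates. This short sketch reuses the L.df example from above and only adds the double-bracket extractor, which was not shown earlier.

```r
L.df <- data.frame(L = c(1, 2, 3, 4))

is.list(L.df)                  # TRUE: a dataframe is a list of columns
is.data.frame(L.df[1])         # TRUE: single brackets keep the list wrapper
is.numeric(L.df[[1]])          # TRUE: double brackets extract the column
identical(L.df[[1]], L.df$L)   # TRUE: same vector as the "$" extractor
```

Functions that expect a numeric vector, such as fitdistr() or hist(), therefore need L.df[[1]], L.df$L or L.df[, 1] rather than L.df or L.df[1].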
Hoping this helps.
Ted.
On 22-Sep-08 02:09:29, Ted Byers wrote:
> Thanks Marc,
> That was it.
>
> For the last 30 years, I'd write my own code, in FORTRAN, C++,
> or even Java, to do whatever statistical analysis I needed.
> When at the office, sometimes I could use SAS, but that hasn't
> been an option for me in years.
>
> This is the first time I have had to load real data into R
> (instead of generating random data to use while playing with
> some of the stats functions, or manually typing dummy data).
>
> I take it, then, that the result of loading data is a data
> frame, and not just a matrix or array. Using something like
> "refdata18[, 1]" feels rather alien, but I'm sure I'll quickly
> get used to it. I'd seen it before in the R docs, but it didn't
> register that I had to use it to get the functions of most
> interest to me to recognise my data as a vector of numbers,
> given I'd provided only a vector of integers as input.
>
> Thanks
>
> Ted
>
>
> Marc Schwartz wrote:
>>
>> on 09/21/2008 08:01 PM Ted Byers wrote:
>>> I have a number of files containing anywhere from a few dozen to a
>>> few thousand integers, one per record.
>>>
>>> The statement "refdata18 =
>>> read.csv("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv", header =
>>> TRUE,na.strings="")" works fine, and if I type refdata18, I get the
>>> integers displayed, one value per record (along with a record number).
>>> However, when I try "fitdistr(refdata18,"negative binomial")", or
>>> hist.scott(refdata18, prob = TRUE), I get an error:
>>>
>>> Error in fitdistr(refdata18, "negative binomial") :
>>> 'x' must be a non-empty numeric vector
>>> Or
>>> Error in hist.default(x, nclass.scott(x), prob = prob, xlab = xlab, ...) :
>>> 'x' must be numeric
>>>
>>> How can it not recognise integers as numbers?
>>>
>>> Thanks
>>>
>>> Ted
>>
>> 'refdata18' is a data frame and the two functions are expecting a
>> numeric vector.
>>
>> If you use:
>>
>> fitdistr(refdata18[, 1], "negative binomial")
>>
>> or
>>
>> hist(refdata18[, 1])
>>
>> you should get a suitable result, presuming that the first column in
>> the data frame is a numeric vector.
>>
>> Use:
>>
>> str(refdata18)
>>
>> to get a sense for the structure of the data frame, including the
>> column names, which you could then use, instead of the above index based
>> syntax.
>>
>> HTH,
>>
>> Marc Schwartz
>>
>>
>>
>
> --
> View this message in context:
> http://www.nabble.com/Why-isn%27t-R-recognising-integers-as-numbers--tp19600308p19600803.html
> Sent from the R help mailing list archive at Nabble.com.
>
--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding@manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 22-Sep-08 Time: 09:30:47
------------------------------ XFMail ------------------------------
------------------------------
Message: 62
Date: Mon, 22 Sep 2008 09:41:30 +0100
From: "Barry Rowlingson" <b.rowlingson@lancaster.ac.uk>
Subject: Re: [R] Manage huge database
To: " José E. Lozano " <lozalojo@jcyl.es>
Cc: r-help@stat.math.ethz.ch
Message-ID:
<d8ad40b50809220141l5274bf8fw29d36784de519eab@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
2008/9/22 José E. Lozano <lozalojo@jcyl.es>:
>> I wouldn't call a 4GB csv text file a 'database'.
> It didn't help, sorry. I perfectly knew what a relational database is (and I
> humbly consider myself an advanced user on working with MSAccess+VBA, only
> that I've never face this problem with variables), you should not suppose
> everyone's stupid, though...
[[elided Yahoo spam]]
A bit more googling tells me both MySQL and PostgreSQL have limits of
a few thousand on the number of columns in a table, not a few hundred
thousand. An insightful comment on one mailing list is:
"Of course, the real bottom line is that if you think you need more than
order-of-a-hundred columns, your database design probably needs revision
anyway ;-)"
So, how much "design" is in this data? If none, and what you've
basically got is a 2000x500000 grid of numbers, then maybe a more raw
binary-type format will help - HDF or netCDF? Although I'm not sure
how much R support for reading slices of these formats exists, you may
be able to use an external utility to write slices out on demand.
Random access to parts of these files is pretty fast.
http://cran.r-project.org/web/packages/RNetCDF/index.html
http://cran.r-project.org/web/packages/hdf5/index.html
Thinking back to your 4GB file with 1,000,000,000 entries, that's
only 3 bytes per entry (+1 for the comma). What is this data? There
may be more efficient ways to handle it.
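The arithmetic behind that estimate, as a short base R sketch. It assumes "4GB" means 4 * 1024^3 bytes and uses the rounded figures of 2000 rows and 500,000 columns quoted in the thread.

```r
entries <- 2000 * 500000           # 1e9 cells in the grid
file_bytes <- 4 * 1024^3           # assumption: "4GB" read as 4 GiB
bytes_per_entry <- file_bytes / entries
bytes_per_entry                    # about 4.3, including the separating comma
bytes_per_entry - 1                # about 3.3 bytes of data per cell
```

At roughly three characters per value, the cells can only hold short codes or small numbers, which is what suggests a more compact binary layout could work.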
Hope *that* helps...
Barry
------------------------------
_______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
End of R-help Digest, Vol 67, Issue 23
**************************************
[[alternative HTML version deleted]]
Marc Schwartz
2008-Sep-22 21:07 UTC
[R] Warranty on Accuracy, Precision, Legality, ... of R in Research
on 09/22/2008 11:26 AM Bert Chan wrote:
> Warranty on Accuracy, Precision, Legality, ... of R in Research
>
> (These questions may well have been raised.)
>
> What is the implied warranty of using R for research & publications,
> consulting, etc.?
>
> Alternately, how does one obtain such a warranty?
>
> Your answers will be much appreciated.
>
> Perhaps you can point me to some websites which discussed this subject
> in the past.
>
> Thanks & regards -
>
> Bert
>
> (Bertram K. C. Chan, PhD)
As per the banner that appears whenever you start up R:
"R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details."
The suitability of R for any particular application is entirely up to
the user. Legally, there is nothing preventing you from using R for such
applications relative to the license under which R is made available.
You did not indicate the specific type of research you have in mind, but
if it might be in the domain of clinical trials, please review:
http://www.r-project.org/doc/R-FDA.pdf
HTH,
Marc Schwartz