thr3ads.net - R help - [R] Getting started with R [Jan 2002]

Jay Pfaffman

2002-Jan-15 17:34 UTC

[R] Getting started with R

I've got a background in computer science & have been using Linux for
nearly a decade.  I'm working on a Ph.D. in education and technology
and I essentially live in emacs and do all of my writing in LaTeX.
To me R seems like the perfect stats package.  Unfortunately, the
learning curve is killing me.  I feel like if I'd waded through
pulling down menus in SPSS or SAS I could have gotten a bit more done
by now, but I don't want to use those programs.

What I'd like is a cookbook of a few basic procedures.  I think I'm
more interested in the R code than I am statistical explication,
though I don't object to the latter.  Is Venables and Ripley "MASS"
going to do that for me or would "S Programming" be more appropriate?
In my cursory look through the sample chapter from Nolan and Speed I
saw no S-plus/S/R code whatsoever.

One thing I'm trying to do right now is certainly trivial, but I can't
quite get it going.  Hopefully I'm not sounding too much like I'm
asking you to do my homework.

In a perception study, I've got three within-subject conditions, A, B,
and C.  Each condition has 4 trials with 2 times and an angle
(actually an error measurement between the actual angle and the one
the subjects pointed to).  All I want is to get the stuff that
summary() gives split out by condition.  It might also be nice to
split it out between subjects as well to look at, and possibly correct
for individual differences, (which might be difficult with so few
trials?).  My data columns are as follows:

A B C (with 0 or 1 to indicate condition, would a single column with
1-3 be better?)

t1, t2, angle-error

Surely fewer than 10 lines of R could yield me these results and maybe
a couple pretty graphs.

In another study where I'm looking at motivation and hobbies, which I
have almost no idea how to analyze (which suggests I might have chosen
a bad design & that a problem like this probably doesn't belong in my
"cookbook") I've had people rank a set of 25 characteristics of their
activities or motivations (5 in each of 5 categories) and would like
to see if any patterns are emerging there.  My data start out as an
ordered list of these cards (1-25); I futzed in a spreadsheet to get
two columns, the motivation number and its rank.  If I could avoid
using the spreadsheet, that'd be nice.  

Thanks.

-- 
Jay Pfaffman                           pfaffman at relaxpc.com
+1-415-821-7507 (H)                    +1-415-810-2238 (M)
http://relax.ltc.vanderbilt.edu/~pfaffman/
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Jonathan Baron

2002-Jan-15 18:25 UTC

[R] Getting started with R

I think our "Notes on R for psychology..." was written just
for you.  I believe it has examples much like the ones you
describe.  (It is a little out of date because it has not been
revised to keep up with newer versions of R ... yet.)

Your particular problems might be solved with by(), or
apply(), or tapply().  It is a little hard for me to tell
because I'm not sure how you've laid out your data.  In
particular, there is no code for subject, yet you say it
is a within-subject design.

For your second example, if you have data for many subjects
(not clear from your description), you might try things in
the mva package.  The biplot function for principal
components is particularly nice.  You might want to put
the data into a one-row-per-subject matrix or data-frame,
in both cases (although the layout you seem to have has
other advantages, as we explain).
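
A minimal sketch of the principal-components suggestion, assuming a one-row-per-subject matrix of ranks. The data here are simulated and the names ranks, pca, m1..m25 and s1..s20 are made up for illustration. In current R releases the functions that used to be in mva live in the stats package, so prcomp() and biplot() are available without loading anything.

```r
# Simulated rankings: 20 hypothetical subjects each rank 25 cards.
set.seed(42)
ranks <- t(replicate(20, sample(25)))
colnames(ranks) <- paste0("m", 1:25)   # one column per motivation card
rownames(ranks) <- paste0("s", 1:20)   # one row per subject

# Principal components of the subject-by-card matrix.
pca <- prcomp(ranks)

# Subjects are drawn as points and cards as arrows on the first two components.
biplot(pca)
```

With real data, cards whose arrows point the same way tend to be ranked similarly by the same people, which is one rough way to look for patterns.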

Our notes are in the contributed documents section, and at
http://finzi.psych.upenn.edu
and there is also a reference card in both places (also a
little out of date - mostly needing a second page for
graphics).

As for your question about whether to use one code for each
condition or 1-3 to indicate conditions, either is good, but
you probably want to make the 1-3 code a _factor_, which is
what is sometimes called a categorical variable.
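
As a rough illustration of the factor advice, here is one way the 0/1 columns or a 1-3 code could be turned into a factor. The data frame trials and its values are invented for the example; only the column names A, B, C come from the original question.

```r
# Hypothetical data: three 0/1 indicator columns, one row per trial.
trials <- data.frame(
  A = c(1, 0, 0, 1),
  B = c(0, 1, 0, 0),
  C = c(0, 0, 1, 0),
  angle.error = c(30, 15, 0, 22)
)

# max.col() gives, for each row, the index of the column with the largest
# value; with exactly one 1 per row that is the active condition.
indicators <- as.matrix(trials[, c("A", "B", "C")])
stopifnot(all(rowSums(indicators) == 1))  # guard against invalid rows
trials$condition <- factor(c("A", "B", "C")[max.col(indicators)],
                           levels = c("A", "B", "C"))

# The same result starting from a numeric 1-3 code.
code <- c(1, 2, 3, 1)
condition.from.code <- factor(code, levels = 1:3, labels = c("A", "B", "C"))
```

Once the condition is a factor, functions such as tapply(), by(), plot() and lm() treat it as a grouping variable rather than as a number.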

Jon Baron

Andrew Perrin

2002-Jan-15 19:02 UTC

[R] Getting started with R

If I understand what you're asking, it's essentially the same thing I
asked the list for a week or so ago.  

First, if A, B, and C conditions are mutually exclusive, then yes, I would
suggest working with a single variable with three values. As a rule of
thumb (more about database theory than statistics) you should avoid
designing data structures that can hold invalid data.

I quote below the responses from Rossini and Lumley to my original query:
On 9 Jan 2002, A.J. Rossini wrote:
> >>>>> "AP" == Andrew Perrin <andrew_perrin at unc.edu> writes:
>
>     AP> I'd like to get summary statistics (really just a mean would
>     AP> be fine) for a vector in a data frame, but split based on the
>     AP> value of another vector.  That is, I have a data frame
>     AP> (hcd.df) with variables datecat (which is always 1 or 2) and
>     AP> auth.sum (-8..+8).  I've used xtabs to get chi-square
>     AP> comparisons, but what I need now is a simple mean of auth.sum
>     AP> where datecat is 1 and another where datecat is 2. Thanks for
>     AP> any advice.
>
> Something like :
>
>         lapply(split(hcd.df$auth.sum,hcd.df$datecat),mean)
>
Or
    tapply(hcd.df$auth.sum, hcd.df$datecat, mean)

or (in 1.4.0)

    with(hcd.df, tapply(auth.sum, datecat, mean))


        -thomas

Thomas Lumley                   Asst. Professor, Biostatistics
tlumley at u.washington.edu        University of Washington, Seattle

-----

In your case, I'd say something like:

tapply(df$angle, df$condition, summary)

is probably right.
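
A small sketch of how that call might look on data laid out one row per trial, extended to split by subject as well. The data are simulated, and the data frame d, the subject column and the centred column are assumptions for illustration, since the original post does not show a subject code.

```r
# Made-up data: 2 subjects x 3 conditions x 4 trials, one row per trial.
set.seed(1)
d <- expand.grid(trial = 1:4,
                 condition = c("A", "B", "C"),
                 subject = c("s1", "s2"))
d$condition <- factor(d$condition)
d$subject <- factor(d$subject)
d$angle.error <- round(rnorm(nrow(d), mean = 10, sd = 5), 1)

# summary() of angle.error within each condition (one summary per level).
by.condition <- tapply(d$angle.error, d$condition, summary)

# Mean angle.error for every subject x condition cell (a 2 x 3 matrix).
cell.means <- tapply(d$angle.error, list(d$subject, d$condition), mean)

# Centre each subject's scores on that subject's own mean, a simple way to
# set aside overall individual differences before comparing conditions.
d$centred <- d$angle.error - ave(d$angle.error, d$subject)
```

Printing by.condition shows the usual summary() output once per condition; cell.means is handy for spotting subjects who differ a lot from the rest.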

----------------------------------------------------------------------
Andrew J Perrin - andrew_perrin at unc.edu - http://www.unc.edu/~aperrin
 Assistant Professor of Sociology, U of North Carolina, Chapel Hill
      269 Hamilton Hall, CB#3210, Chapel Hill, NC 27599-3210 USA




Warnes, Gregory R

2002-Jan-15 20:15 UTC

[R] Getting started with R

 >  -----Original Message-----
 >  From: Jay Pfaffman [mailto:pfaffman at relaxpc.com]
 >  Sent: Tuesday, January 15, 2002 12:35 PM
 >  To: r-help at stat.math.ethz.ch
 >  Subject: [R] Getting started with R

 > [...]
 >  What I'd like is a cookbook of a few basic procedures.  I think I'm
 >  more interested in the R code than I am statistical explication,
 >  though I don't object to the latter.  Is Venables and Ripley "MASS"
 >  going to do that for me or would "S Programming" be more appropriate?

Have you looked at http://cran.r-project.org/doc/manuals/R-intro.pdf ?

 > [...]
 >  In a perception study, I've got three within-subject 
 >  conditions, A, B,
 >  and C.  Each condition has 4 trials with 2 times and an angle
 >  (actually an error measurement between the actual angle and the one
 >  the subjects pointed to).  All I want is to get the stuff that
 >  summary() gives split out by condition.  It might also be nice to
 >  split it out between subjects as well to look at, and 
 >  possibly correct
 >  for individual differences, (which might be difficult with so few
 >  trials?).  My data columns are as follows:
 >  
 >  A B C (with 0 or 1 to indicate condition, would a single column with
 >  1-3 be better?)

It is probably easier to deal with a single 'factor' variable with three
levels 'A','B', and 'C'.

 >  
 >  t1, t2, angle-error
 >  
 >  Surely fewer than 10 lines of R could yield me these 
 >  results and maybe
 >  a couple pretty graphs.
 >  
If you create a data file named myfile.csv containing :

Condition,t1,t2,angle.error
A, 10, 12, 30
B, 12, 6, 15
C, 9, 16, 0
...

you can read it in using

	> mydata <- read.csv("myfile.csv")

Then to get summaries of each of the variables separated by conditions do
something like

	> by(mydata, mydata$Condition, summary)

Some plots:

	> plot(t1 ~ Condition, data=mydata)
	> plot(t2 ~ Condition, data=mydata)
	> plot(angle.error ~ Condition, data=mydata)

Run a regression model testing if t1 depends on condition:

	> summary(lm(t1 ~ Condition, data=mydata))

 >  In another study where I'm looking at motivation and 
 >  hobbies, which I
 >  have almost no idea how to analyze (which suggests I might 
 >  have chosen
 >  a bad design & that a problem like this probably doesn't 
 >  belong in my
 >  "cookbook") I've had people rank a set of 25 
 >  characteristics of their
 >  activities or motivations (5 in each of 5 categories) and would like
 >  to see if any patterns are emerging there.  My data start out as an
 >  ordered list of these cards (1-25); I futzed in a spreadsheet to get
 >  two columns, the motivation number and its rank.  If I could avoid
 >  using the spreadsheet, that'd be nice.  

You would need to give quite a bit more information before one could suggest
a reasonable method of analyzing this data.  Still, it seems that the easiest
format to handle the data for statistical analysis would be one row per
participant, with one variable (column) for each activity or motivation. 
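
One possible way to build that layout directly from the ordered card lists, without a spreadsheet. This is only a sketch: the list orderings, the subject names and the five-card decks are invented, and it assumes each subject's cards are recorded most important first.

```r
# Hypothetical input: each subject's cards in the order they were ranked,
# most important first (5 cards here instead of 25 to keep it short).
orderings <- list(
  s1 = c(3, 1, 5, 2, 4),
  s2 = c(2, 3, 1, 4, 5)
)

# For a permutation, order() gives its inverse: element j of the result is
# the position (rank) at which card j appears in the ordered list.
rank.matrix <- t(sapply(orderings, order))
colnames(rank.matrix) <- paste0("card", seq_len(ncol(rank.matrix)))

# One row per participant, one column per card, values are ranks.
rank.data <- as.data.frame(rank.matrix)
```

For subject s1, card 3 was placed first, so rank.data["s1", "card3"] is 1.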

-Greg


