From: Martin Maechler <maechler at stat.math.ethz.ch>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <17739.46076.735981.117358 at stat.math.ethz.ch>
Date: Fri, 3 Nov 2006 22:26:20 +0100
To: Barry Rowlingson <B.Rowlingson at lancaster.ac.uk>
Cc: Sarah Goslee <sarah.goslee at gmail.com>,
r-help at stat.math.ethz.ch,
Sarah White <swhite at sgul.ac.uk>
Subject: Re: [R] Translation of R code required
In-Reply-To: <454B8CAE.6060602 at lancaster.ac.uk>
References: <454B80F6.2040605 at sgul.ac.uk>
<efb536d50611030955o33ce3ba6sb7342648963f34d6 at mail.gmail.com>
<454B8521.60707 at sgul.ac.uk>
<efb536d50611031012k7594bbdw3727d20f1c617740 at mail.gmail.com>
<454B8CAE.6060602 at lancaster.ac.uk>
X-Mailer: VM 7.19 under Emacs 21.4.1
Reply-To: Martin Maechler <maechler at stat.math.ethz.ch>
>>>>> "BaRow" == Barry Rowlingson <B.Rowlingson at lancaster.ac.uk>
>>>>> on Fri, 03 Nov 2006 18:38:38 +0000 writes:
BaRow> Sarah Goslee wrote:
>> Since this step works,
>>
>> mental$Rx <- factor(mental$Rx, levels=c("VS","IPS"))
>>
>> I think that some of the names are right, but perhaps
>> that one is spelled differently (Centre vs centre, maybe,
>> since R is case-sensitive?).
BaRow> How do you know that step works? If the dataframe
BaRow> 'mental' doesn't have an 'Rx' column then mental$Rx
BaRow> will be 'NULL', and the factor() function will make a
BaRow> factor out of that...
>> x=data.frame(foo=1:10)
BaRow> so x is a data frame with one column, called
BaRow> 'foo'. Now let's make levels from a non-existent
BaRow> column, 'bar':
>> x$bar=factor(x$bar,levels=c('x','y'))
BaRow> No complaints... But try printing 'x' now (by
BaRow> typing x at the command line)... Ick!
Hmm, what version of R would that be?
In any recent version, I get
> x <- data.frame(foo = 1:10)
> x$bar <- factor(x$bar,levels=c('x','y'))
Error in as.vector(x, mode) : invalid argument 'mode'
But I want to make another point:
For about a year now, for "serious" data analysis using data frames,
I've been advocating the slightly more clumsy but much
less error-prone ``column indexing by name'' instead of the
quick-and-dirty "$" selection:
> x$bar
NULL
> x[,"bar"]
Error in "[.data.frame"(x, , "bar") : undefined columns selected
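
To make the contrast concrete, here is a minimal sketch in base R. The helper name get_col is hypothetical (it is not part of R), and the wording of its error message is my own choice; the point is only that a misspelled column name should stop the script rather than quietly produce NULL.

```r
x <- data.frame(foo = 1:10)

# "$" on a non-existent column returns NULL without complaint
dollar_result <- x$bar

# "[" with a column name signals an error; capture its message for inspection
bracket_result <- tryCatch(x[, "bar"],
                           error = function(e) conditionMessage(e))

# Hypothetical helper: check the name first, then extract the column
get_col <- function(df, name) {
  if (!name %in% names(df))
    stop("no column named '", name, "' in data frame")
  df[[name]]
}

foo_values <- get_col(x, "foo")
```

Note that the helper checks names(df) explicitly instead of relying on "[[", since x[["bar"]] also returns NULL for a data frame.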
BaRow> Try things one line at a time and check the objects
BaRow> created or modified are sensible.
Indeed!
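
One way to do that checking, sketched under loud assumptions: the real 'mental' data frame is not shown in this thread, so the one below is a made-up stand-in, and the column name "centre" is only the spelling guessed at earlier in the thread. The idea is to compare the names a script expects against names() before touching any column, and to verify each result right after it is created.

```r
# Made-up stand-in for the poster's data (assumption, not the real 'mental')
mental <- data.frame(Rx = c("VS", "IPS", "VS"),
                     centre = c("a", "b", "a"))

# Names the script expects; "Centre" is deliberately capitalised
wanted <- c("Rx", "Centre")

# Which expected names are absent?
missing_cols <- setdiff(wanted, names(mental))

# Case-insensitive comparison to spot near misses such as Centre/centre
near_misses <- names(mental)[tolower(names(mental)) %in% tolower(missing_cols)]

# Perform one step, then check the modified object before moving on
mental$Rx <- factor(mental$Rx, levels = c("VS", "IPS"))
stopifnot(is.factor(mental$Rx),
          nlevels(mental$Rx) == 2,
          !anyNA(mental$Rx))
```

If missing_cols is non-empty, near_misses usually points straight at the spelling problem.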