I must express thanks to Peter Konings, Gary Collins, David Barron,
Prof. Brian Ripley, Vladimir Eremeev, and Michael Dewey (I hope I did
not leave anyone out), all of whom suggested I use the subset parameter
of lm to restrict the subjects included in my regression. R is a special
programming language and statistics package, both because of its
wonderful features (thank you, R developers) and, equally
importantly, because of the community of people who willingly give
their time and knowledge to help other users. Many thanks.
If any R developers are out there, may I suggest that the help page for
lm include more information (perhaps an example) on how one uses the
subset option. The current documentation states:
  subset: an optional vector specifying a subset of observations to be used
  in the fitting process.
Although I read the help page, I could not get subset to work until the
kind people mentioned above sent me examples.
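For anyone who finds this thread later, here is a minimal sketch of the kind of example that helped me. The data frame dat and its values are made up for illustration (they are not my study data); only the column names diff, Age, Race and Meno follow my original question.

```r
# Hypothetical data standing in for the real study data
set.seed(1)
dat <- data.frame(
  diff = rnorm(30),
  Age  = round(runif(30, 40, 60)),
  Race = factor(rep(c("A", "B"), length.out = 30)),
  Meno = rep(c("PRE", "POST", "PERI"), times = c(12, 10, 8))
)

# subset is evaluated inside `data`, so the column name Meno
# can be used directly in the logical condition
fit_pre <- lm(diff ~ Age + Race, data = dat, subset = Meno == "PRE")

nobs(fit_pre)         # number of rows actually used in the fit
df.residual(fit_pre)  # rows used minus number of estimated coefficients
```

Checking nobs() or the residual degrees of freedom is a quick way to confirm that only the intended rows went into the fit.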
Again, many thanks to one and all!
John
John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC,
University of Maryland School of Medicine Claude D. Pepper OAIC,
University of Maryland Clinical Nutrition Research Unit, and
Baltimore VA Center Stroke of Excellence
University of Maryland School of Medicine
Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
jsorkin@grecc.umaryland.edu
>>> Prof Brian Ripley <ripley@stats.ox.ac.uk> 1/18/2007 3:38 AM >>>
On Thu, 18 Jan 2007, David Barron wrote:
> Why not use the subset option? Something like:
>
> lm(diff ~ Age + Race, data=data, subset=data$Meno=="PRE")
>
> should do the trick, and be much easier to read!
And
lm(diff ~ Age + Race, data = data, subset = (Meno=="PRE"))
would be easier still.
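
As a rough check that the two subset forms above select the same rows, the sketch below fits both on made-up data and also compares them with subsetting the data frame directly. The data frame dat and its values are hypothetical; only the column names come from the thread.

```r
# Hypothetical data; only the column names follow the thread
set.seed(1)
dat <- data.frame(
  diff = rnorm(30),
  Age  = round(runif(30, 40, 60)),
  Race = factor(rep(c("A", "B"), length.out = 30)),
  Meno = rep(c("PRE", "POST", "PERI"), times = c(12, 10, 8))
)

# Form 1: logical vector built explicitly from the data frame
fit_a <- lm(diff ~ Age + Race, data = dat, subset = dat$Meno == "PRE")

# Form 2: bare column name, evaluated inside `data`
fit_b <- lm(diff ~ Age + Race, data = dat, subset = (Meno == "PRE"))

# Form 3: subset the data frame itself, keeping plain names in the formula
fit_c <- lm(diff ~ Age + Race, data = dat[dat$Meno == "PRE", ])

# All three should give the same coefficients
all.equal(coef(fit_a), coef(fit_b))
all.equal(coef(fit_b), coef(fit_c))
```

Form 2 is shorter because the condition is looked up in data first, so the data frame name does not need to be repeated.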
>
> On 18/01/07, John Sorkin <jsorkin@grecc.umaryland.edu> wrote:
>> I am having trouble selecting rows of a dataframe that will be included
>> in a regression. I am trying to select those rows for which the variable
>> Meno equals PRE. I have used the code below:
>>
>>
>> difffitPre <- lm(data[,"diff"] ~ data[,"Age"] + data[,"Race"], data = data[data[,"Meno"]=="PRE",])
You are missing a comma in data = data[<...>, ]
>> summary(difffitPre)
>>
>> The output from the summary indicates that more than 76 rows are
>> included in the regression:
>>
>> Residual standard error: 2.828 on 76 degrees of freedom
>>
>> where in fact only 22 rows should be included, as can be seen from the
>> following:
>>
>>
>> print(length(data[data[,"Meno"]=="PRE","Meno"]))
>> [1] 22
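
A likely explanation for the mismatch, sketched here on made-up data: a formula term written as data[,"diff"] names the data frame explicitly, so it is evaluated against the full data frame found in the calling environment, and the subsetted data frame passed through the data argument is never consulted for those terms. The data frame dat below is hypothetical and stands in for the poster's data.

```r
# Hypothetical data; only the column names follow the thread
set.seed(1)
dat <- data.frame(
  diff = rnorm(30),
  Age  = round(runif(30, 40, 60)),
  Race = factor(rep(c("A", "B"), length.out = 30)),
  Meno = rep(c("PRE", "POST", "PERI"), times = c(12, 10, 8))
)

# Same pattern as the original call: the formula indexes the full
# data frame, so the row selection in data= has no effect on the fit
fit_bad <- lm(dat[, "diff"] ~ dat[, "Age"] + dat[, "Race"],
              data = dat[dat[, "Meno"] == "PRE", ])

# Plain column names are looked up in data=, so the selection applies
fit_ok <- lm(diff ~ Age + Race, data = dat[dat[, "Meno"] == "PRE", ])

nobs(fit_bad)  # equals nrow(dat): every row was used
nobs(fit_ok)   # equals the number of PRE rows
```

Using bare column names in the formula is what lets either data= or subset= control which rows are fitted.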
>>
>> I would appreciate any help in modifying the data= parameter of the lm
>> so that I include only those subjects for which Meno=PRE.
>>
>> R 2.3.1
>> Windows XP
>>
>> Thanks,
>> John
--
Brian D. Ripley, ripley@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595