Hi,
I am not sure that the way you are using subset is correct. The help page
(http://stat.ethz.ch/R-manual/R-devel/library/base/html/subset.html)
shows the usage subset(data, condition) rather than
subset=data==condition. Also, the approach I am describing uses a different
format. For example, in your data, Group1 and Group2 are separate columns,
each with the same values for the independent variables. Normally, different
groups (or a factor with multiple levels) are stored in the same
column, like this:
> dat2
   ID Group Mem   Gen Chance MSELGM MSELVR MSELFM MSELRL MSELEL ADOS Age
1   1     1  75  50.0     50     53     52     62     57     56    3  25
2   2     1  75  12.5     50     46     48     47     52     55    2  30
3   3     1  25  37.5     50     48     43     52     63     63    3  24
4   4     1  25  37.5     50     51     62     52     59     54    0  31
5   5     1  50  87.5     50     45     58     42     46     43    6  31
6   6     1 100 100.0     50     45     80     49     69     63    1  31
7   7     2  75  50.0     50     53     52     62     57     56    3  25
8   8     2  75  12.5     50     46     48     47     52     55    2  30
9   9     2  25  37.5     50     48     43     52     63     63    3  24
10 10     2  25  37.5     50     51     62     52     59     54    0  31
11 11     2  50  87.5     50     45     58     42     46     43    6  31
12 12     2 100 100.0     50     45     80     49     69     63    1  31
> dat3<-subset(dat2,Group==1)
> dat4<-subset(dat2,Group==2)
> dat4
   ID Group Mem   Gen Chance MSELGM MSELVR MSELFM MSELRL MSELEL ADOS Age
7   7     2  75  50.0     50     53     52     62     57     56    3  25
8   8     2  75  12.5     50     46     48     47     52     55    2  30
9   9     2  25  37.5     50     48     43     52     63     63    3  24
10 10     2  25  37.5     50     51     62     52     59     54    0  31
11 11     2  50  87.5     50     45     58     42     46     43    6  31
12 12     2 100 100.0     50     45     80     49     69     63    1  31
> fit1<-lm(Gen~MSELEL,data=dat3)
> fit2<-lm(Gen~MSELEL,data=dat4)
> cor.test(dat3$Gen, dat3$MSELEL, method="pearson")
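With the groups in one column, the per-group fits and correlations can also be run in one pass with split() and lapply(). This is only a minimal sketch: the data frame is rebuilt from the dat2 printout with just the three columns needed, and the names by_group, fits and cors are illustrative.

```r
# Rebuild the relevant columns of dat2 from the printout above.
dat2 <- data.frame(
  Group  = rep(1:2, each = 6),
  Gen    = rep(c(50, 12.5, 37.5, 37.5, 87.5, 100), times = 2),
  MSELEL = rep(c(56, 55, 63, 54, 43, 63), times = 2)
)

# One data frame per level of Group.
by_group <- split(dat2, dat2$Group)

# Fit the same regression and run the same correlation test in each group.
fits <- lapply(by_group, function(d) lm(Gen ~ MSELEL, data = d))
cors <- lapply(by_group, function(d) {
  cor.test(d$Gen, d$MSELEL, method = "pearson")
})

# Compare coefficients and correlation estimates across groups.
sapply(fits, coef)
sapply(cors, function(ct) ct$estimate)
```

Referring to columns by name (d$Gen, d$MSELEL) rather than by position keeps the code valid if the column order changes.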
In the sample dataset that you showed here, both groups give the same
correlation and regression results, because the values of the dependent
and independent variables do not differ between the groups.
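On why subset= appears to work in lm() but not in cor.test(): as far as I can tell, lm() has a subset argument, whereas the default cor.test(x, y, ...) method does not, so the extra argument falls into "..." and is ignored. The formula method of cor.test() does accept subset. A minimal sketch, assuming the same three columns rebuilt from the dat2 printout (ct_all and ct_g1 are illustrative names):

```r
# Rebuild the relevant columns of dat2 from the printout above.
dat2 <- data.frame(
  Group  = rep(1:2, each = 6),
  Gen    = rep(c(50, 12.5, 37.5, 37.5, 87.5, 100), times = 2),
  MSELEL = rep(c(56, 55, 63, 54, 43, 63), times = 2)
)

# Default method: 'subset' is not a formal argument here, so it is
# absorbed by '...' and the test silently uses all 12 rows.
ct_all <- cor.test(dat2$Gen, dat2$MSELEL, method = "pearson",
                   subset = dat2$Group == 1)

# Formula method: 'subset' is evaluated inside 'data', as in lm().
ct_g1 <- cor.test(~ Gen + MSELEL, data = dat2, subset = Group == 1,
                  method = "pearson")

# Degrees of freedom (n - 2) show how many rows each test used.
ct_all$parameter
ct_g1$parameter
```

The degrees of freedom are the quickest check: they reflect 12 rows in the first call and 6 in the second.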
I hope this helps.
A.K.
----- Original Message -----
From: jacaranda tree <myjacaranda at yahoo.com>
To: "R-help at r-project.org" <R-help at r-project.org>
Cc:
Sent: Sunday, June 3, 2012 11:51 AM
Subject: [R] a question about subsetting
Hi all,
I started using R about 3 weeks ago, and now I've pretty much figured out
how to do the types of statistical modeling, graphs, tables etc. that I
frequently use (with zero background in computer languages or other statistical
packages that are similar to R, like S or SAS!). So it's been quite a
rewarding process so far, and I thank all you R gurus for your generous
help!
That being said, my question is about applying a model or an analysis to
different groups based on a grouping variable. Below are the first six rows of my
data:
  ID Group1 Group2 Mem   Gen Chance MSELGM MSELVR MSELFM MSELRL MSELEL ADOS Age
1  1      1      1  75  50.0     50     53     52     62     57     56    3  25
2  2      1      1  75  12.5     50     46     48     47     52     55    2  30
3  3      1      1  25  37.5     50     48     43     52     63     63    3  24
4  4      1      1  25  37.5     50     51     62     52     59     54    0  31
5  5      1      1  50  87.5     50     45     58     42     46     43    6  31
6  6      1      1 100 100.0     50     45     80     49     69     63    1  31
Group1: First grouping variable
Group2: Second grouping variable
Mem: Memory trial
Gen: Generalization trial
MSEL: Mullen Scales of Early Learning (a scale measuring various skills in
little children). GM: Gross Motor Scale, VR: Visual Reception, FM: Fine Motor,
RL: Receptive Language, EL: Expressive Language.
ADOS: An autism-specific measure.
First I wanted to do correlations between Generalization (variable Gen) and
expressive language (MSELEL) for each level of Group1. For this, I used the
lapply or by functions, which work just fine. Here is the code with lapply:
lapply(split(mydata, mydata$Group1), function(x){cor.test(x[,5], x[,11], method = "pearson")})
Then I did regression. My DV is the variable Gen, and the IV is MSELEL. And
again I wanted to do this for each group. Here is the code I came up with for
each group:
fit1<-lm(Gen~ MSELEL, data=mydata, subset=mydata$Group1==1)
fit2<-lm(Gen~MSELEL, data=mydata, subset=mydata$Group1==2)
This works fine for regression, but when I used the "subset" argument
with the correlation, e.g. cor.test(mydata$Gen, mydata$MSELEL,
method="pearson", subset=mydata$Group1==1), it did not work. It just
did the correlation for the entire sample and then used this for both groups. I
was just curious as to why subset works with regression, but not with
correlation. Any thoughts?
Thanks,