Hi!
I have a dataset with some 300+ variables and 2000+ records. I'd like to grind
through a bunch of analyses on the variables by using a script, but can't
figure out how to refer to variable names properly. For some of the simpler
stuff I use various "apply" functions, but for others (like t-tests etc.) I need
by command procedures. I've tried various flavors of
"for(var in names(Dataset)){...}" but this does not work consistently. Actually,
"for(var in names(Dataset)){print(var)}" seems to work perfectly, giving a list of
variable names, but "for(var in names(Dataset)){mean(var, na.rm=T)}" or
"for(var in names(Dataset)){glm(var~var1+var2+var3....)}" do not.
Any suggestions about how best to go about this?
Thanks
Jon
Hi,
perhaps this is what you want?
> df <- as.data.frame(matrix(runif(100),ncol=10))
> form <- as.formula(paste(names(df)[length(df)], "~ ."))
> lm(form,data=df)
Call:
lm(formula = form, data = df)

Coefficients:
(Intercept)           V1           V2           V3           V4
     -1.367       -2.920        3.631       -7.259       -3.704
         V5           V6           V7           V8           V9
      4.225        3.049        4.522        2.496       -0.578
> form <- as.formula(paste(names(df)[length(df)],
+     "~",paste(names(df)[3],names(df)[4],sep="+")))
> lm(form,data=df)

Call:
lm(formula = form, data = df)

Coefficients:
(Intercept)           V3           V4
      0.652        0.360       -0.448
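As a possible alternative to pasting the formula together by hand, base R's reformulate() builds it from a character vector of term names. This is only a sketch on the same kind of random data as above; the names response and predictors, and the set.seed() call, are made up for illustration.

```r
# Sketch only: same style of toy data as above, seeded so it can be rerun.
set.seed(1)
df <- as.data.frame(matrix(runif(100), ncol = 10))   # columns V1 ... V10

response   <- names(df)[length(df)]   # last column, "V10" (illustrative name)
predictors <- names(df)[3:4]          # "V3" and "V4"   (illustrative name)

# reformulate() turns the character vectors into the formula V10 ~ V3 + V4
form <- reformulate(predictors, response = response)
fit  <- lm(form, data = df)
print(form)
print(coef(fit))
```

Because the predictor names are an ordinary character vector, the same line should work whether there are two predictors or two hundred.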
Regards,
Christian
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
For the mean() example, I believe this should work (untested)
for (var in names(Dataset)) print( mean( Dataset[[var]] , na.rm=TRUE ) )
or
for (var in names(Dataset)) print( mean( Dataset[,var] , na.rm=TRUE ) )
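The reason mean(var, na.rm=T) fails in the original loop is that var holds the column's name as a character string, not the column itself, so it has to be used as an index into Dataset. A loop-free sketch of the same idea follows; the small Dataset here is invented for illustration, and the is.numeric filter is my assumption about how one might want to skip non-numeric columns.

```r
# Sketch only: a tiny made-up Dataset with one non-numeric column.
Dataset <- data.frame(a = c(1, 2, NA),
                      b = c(4, 5, 6),
                      g = c("x", "y", "z"))

# Keep only the numeric columns, since mean() of a character column
# returns NA with a warning.
num_cols <- names(Dataset)[sapply(Dataset, is.numeric)]

# One mean per numeric column, returned as a named vector.
means <- sapply(Dataset[num_cols], mean, na.rm = TRUE)
print(means)
```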
But it's harder with lm, glm, and friends. For them I think maybe you
can do it by constructing a formula object, see ?formula.
Maybe something like,
tmpf <- as.formula( paste( var, ' ~var1 + var2 + var3') )
glm( tmpf , data=Dataset)
inside the loop, but I haven't done this and am not an expert.
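Spelled out as a full loop, it might look something like the sketch below. The data and the column names out1 and out2 are invented, and the lists predictors, responses and fits are just illustrative names. Note that data is passed by name, because the second positional argument of glm() is family rather than data.

```r
# Sketch only: made-up data with three predictors and two outcome columns.
set.seed(42)
Dataset <- data.frame(var1 = rnorm(50), var2 = rnorm(50), var3 = rnorm(50),
                      out1 = rnorm(50), out2 = rnorm(50))

predictors <- c("var1", "var2", "var3")
responses  <- setdiff(names(Dataset), predictors)   # every other column

fits <- list()
for (var in responses) {
  # Build e.g. out1 ~ var1 + var2 + var3 from the column name in var
  tmpf <- as.formula(paste(var, "~", paste(predictors, collapse = " + ")))
  fits[[var]] <- glm(tmpf, data = Dataset)          # default gaussian family
}

# One column of coefficients per response variable
coefs <- sapply(fits, coef)
print(coefs)
```

Storing the fits in a named list keeps each model available afterwards, e.g. summary(fits[["out1"]]), instead of only printing inside the loop.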
Here's a quick example:
> foo <- data.frame( x=1:10, y=rnorm(10) )
> ick <- as.formula( ' x ~ y')
> lm(ick,foo)

Call:
lm(formula = ick, data = foo)

Coefficients:
(Intercept)            y
      5.895        2.158

## compare with:
> lm(x~y,foo)

Call:
lm(formula = x ~ y, data = foo)

Coefficients:
(Intercept)            y
      5.895        2.158
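The same approach should also carry over to the t-tests mentioned in the question, since t.test() has a formula method. A rough sketch, assuming a two-level grouping column; the column name group and the data are invented for illustration:

```r
# Sketch only: made-up data with a two-level factor and two numeric columns.
set.seed(7)
foo <- data.frame(group = factor(rep(c("a", "b"), each = 10)),
                  x = rnorm(20),
                  y = rnorm(20))

# For each variable name, build "<name> ~ group" and keep the p-value.
pvals <- sapply(c("x", "y"), function(v) {
  f <- as.formula(paste(v, "~ group"))
  t.test(f, data = foo)$p.value
})
print(pvals)
```

With 300+ variables this runs a lot of tests, so something like p.adjust(pvals) may be worth a look afterwards.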
But it's a question that comes up periodically on r-help, so I'd also
suggest searching the archives.
-Don
At 10:50 AM -0400 6/18/09, Jon Erik Ween wrote:
> [...]
--
--------------------------------------
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062