thr3ads.net - R help - [R] regression analysis in R [Oct 2012]


eliza botto

2012-Oct-26 20:00 UTC

[R] regression analysis in R

Dear useRs,
I have about 27 descriptor vectors, each with 703 elements. I want to do the
following:
1. Regress a dependent vector, say B, which has the same number of elements,
on each of these 27 vectors individually.
2. Find the 10 best regression results when B is regressed on combinations of
any 4 descriptors. More precisely, in the first step we regressed the dependent
vector on each descriptor individually; now we want R to randomly combine the
descriptors in sets of 4, regress B on each set, and show which 10 combinations
of descriptors give the best regression results.
I hope I am clear. I know the second part is trickier, but I will be extremely
happy if you can answer either of the above questions.
Thanks in advance,
Eliza

arun

2012-Oct-26 21:47 UTC


[R] regression analysis in R

Hi,
Maybe this helps.
set.seed(8)
mat1<-matrix(sample(150,90,replace=FALSE),ncol=9,nrow=10)
dat1<-data.frame(mat1)
set.seed(10)
B<-sample(150:190,10,replace=FALSE)

res1<-lapply(dat1,function(x) lm(B~as.matrix(x)))
#or
res1<-lapply(dat1,function(x) lm(B~x))

res1Summary<-lapply(res1,summary)
#to get the coefficients
res1SummaryCoef<-lapply(res1,function(x) summary(x)$coefficients)
res1SummaryCoef[1:3]
#$X1
#                Estimate Std. Error   t value     Pr(>|t|)
#(Intercept)  150.1303702 8.45536736 17.755630 1.035959e-07
#as.matrix(x)   0.2126583 0.09304937  2.285436 5.163141e-02
#
#$X2
#                  Estimate Std. Error     t value     Pr(>|t|)
#(Intercept)  168.219302287  6.9904434 24.06418202 9.479720e-09
#as.matrix(x)  -0.002386046  0.1146838 -0.02080544 9.839104e-01
#
#$X3
#               Estimate Std. Error   t value     Pr(>|t|)
#(Intercept)  180.303999  8.6675156 20.802270 2.990115e-08
#as.matrix(x)  -0.157268  0.1021179 -1.540064 1.621101e-01


#to get pvalue of Fstatistic
res1pvalueF<-lapply(res1,function(x)
pf(summary(x)$fstatistic[1],summary(x)$fstatistic[2],summary(x)$fstatistic[3],lower.tail=FALSE))
#to get r.squared value
res1rSquare<-lapply(res1,function(x) summary(x)$r.squared)

#2nd part
#Create some new datasets using random combination of columns from dat1
dat2<-dat1[,sample(names(dat1),4)]
dat3<-dat1[,sample(names(dat1),4)]
dat4<-dat1[,sample(names(dat1),4)]
dat5<-dat1[,sample(names(dat1),4)]
dat6<-dat1[,sample(names(dat1),4)]
head(dat2)
#  X7  X3  X8  X5
#1 85  30 113 100
#2 89  53 115  32
#3 74  79  63  54
#4 57  28  52  94
#5  6  84 135 132
#6  5 123 146 127
head(dat3)
#   X8  X2  X6  X3
#1 113  64  14  30
#2 115  13   7  53
#3  63  60  15  79
#4  52  75  34  28
#5 135  19 107  84
#6 146 126  27 123

#create a list of dataframes
list1<-list(dat2,dat3,dat4,dat5,dat6)
res2<-lapply(list1,function(x) lm(B~as.matrix(x)))
res2rSquare<-lapply(res2,function(x) summary(x)$r.squared)
unlist(res2rSquare)
#[1] 0.8444332 0.6316695 0.6971695 0.7322519 0.4328805

To select the best model from combinations of descriptors, you can also look at
stepwise elimination, or compare models by their AIC or BIC values.
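
With 27 descriptors there are only choose(27, 4) = 17550 possible sets of 4, so rather than sampling a few sets at random it may be feasible to fit all of them and keep the 10 best. The following is a minimal sketch under stated assumptions: the data are simulated stand-ins for the real descriptors (`desc`, `B` and the coefficients are made up for illustration), and the models are ranked by R-squared. Because every model here has the same number of terms and the same observations, ranking by AIC or BIC would give the same order.

```r
# Simulated stand-in for the real data: 27 descriptors, 703 observations.
set.seed(1)
n <- 703
p <- 27
desc <- matrix(rnorm(n * p), nrow = n,
               dimnames = list(NULL, paste0("X", seq_len(p))))
# Hypothetical response that depends on four of the descriptors.
B <- 2 * desc[, "X3"] - desc[, "X7"] + 0.5 * desc[, "X12"] +
  1.5 * desc[, "X20"] + rnorm(n)

# Every set of 4 descriptor names, one set per column.
combos <- combn(colnames(desc), 4)

# R-squared of each 4-descriptor model with an intercept.
# lm.fit() works on a design matrix and avoids formula overhead.
tss <- sum((B - mean(B))^2)
r2 <- apply(combos, 2, function(v) {
  fit <- lm.fit(cbind(1, desc[, v]), B)
  1 - sum(fit$residuals^2) / tss
})

# Ten best combinations, highest R-squared first.
top <- order(r2, decreasing = TRUE)[1:10]
best10 <- data.frame(
  descriptors = apply(combos[, top], 2, paste, collapse = " + "),
  r.squared = r2[top]
)
print(best10)
```

To use this on real data, replace `desc` with a numeric matrix holding the 27 descriptor columns and `B` with the observed dependent vector.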

A.K.







______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Rui Barradas

2012-Oct-26 22:40 UTC


[R] regression analysis in R

Hello,

Using the same example, at the end, add the following lines to have the 
models ordered by AIC.


aic <- lapply(res2, AIC)
idx <- order(unlist(aic))
lapply(list1[idx], names)

And if there are more than 10 models and you want the 10 best:

best10 <- idx[1:10]
lapply(list1[best10], names)
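
As a self-contained variant of the lines above, the sketch below puts the AIC and BIC of each model into one table next to the descriptor names. It makes two assumptions that are not in the original example: a seed (`set.seed(123)`) is added so the random column subsets are repeatable, and `head()` is used in place of `idx[1:10]`, because with only 5 models `idx[1:10]` would be padded with `NA`.

```r
# Rebuild the toy data from the earlier example.
set.seed(8)
dat1 <- data.frame(matrix(sample(150, 90, replace = FALSE), ncol = 9, nrow = 10))
set.seed(10)
B <- sample(150:190, 10, replace = FALSE)

# Five random 4-column subsets (seed added here so the draw is repeatable).
set.seed(123)
list1 <- replicate(5, dat1[, sample(names(dat1), 4)], simplify = FALSE)
res2 <- lapply(list1, function(x) lm(B ~ as.matrix(x)))

# One row per model: which descriptors it used and its AIC and BIC.
ranking <- data.frame(
  descriptors = sapply(list1, function(x) paste(names(x), collapse = " + ")),
  AIC = sapply(res2, AIC),
  BIC = sapply(res2, BIC)
)
ranking <- ranking[order(ranking$AIC), ]

# head() returns at most 10 rows and never pads with NA.
print(head(ranking, 10))
```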

Hope this helps,

Rui Barradas

Peter Ehlers

2012-Oct-26 23:36 UTC


[R] regression analysis in R

I hope that you're doing _exploratory_ data analysis.
Have a look at the 'leaps' package. It might be suitable.
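
A minimal sketch of how 'leaps' might be applied to this problem, assuming the package is installed from CRAN. The data are simulated for illustration (`X`, `B` and the coefficients are made up). `regsubsets()` with `nvmax = 4` and `nbest = 10` keeps the 10 best subsets of each size up to 4, and the size-4 rows are then picked out of its summary. Since the search looks at many subsets, the reported fit of the winners is optimistic, which is why this is best treated as exploratory.

```r
# Assumes the CRAN package 'leaps' is installed; the block is skipped otherwise.
if (requireNamespace("leaps", quietly = TRUE)) {
  # Simulated stand-in for the real data: 27 descriptors, 703 observations.
  set.seed(1)
  n <- 703
  p <- 27
  X <- matrix(rnorm(n * p), nrow = n,
              dimnames = list(NULL, paste0("X", seq_len(p))))
  B <- 2 * X[, "X3"] - X[, "X7"] + 0.5 * X[, "X12"] +
    1.5 * X[, "X20"] + rnorm(n)

  # Search subsets of up to 4 descriptors, keeping 10 per size.
  fit <- leaps::regsubsets(x = X, y = B, nvmax = 4, nbest = 10)
  s <- summary(fit)

  # s$which is a logical matrix; its first column is the intercept.
  inmodel <- s$which[, -1, drop = FALSE]
  size4 <- which(rowSums(inmodel) == 4)
  best4 <- data.frame(
    descriptors = unname(apply(inmodel[size4, , drop = FALSE], 1,
      function(z) paste(colnames(inmodel)[z], collapse = " + "))),
    r.squared = s$rsq[size4],
    bic = s$bic[size4]
  )
  print(best4)
}
```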

Peter Ehlers
