Hello,

With due respect, I hope you are having a nice time. I would like to ask about some commands in R regarding variable selection in linear regression.

R has a built-in function called "step" which selects variables according to AIC. Let's say I have data [y, x1, x2, x3, x4]. We start with y ~ b0. I compute the partial F test for each variable and choose the one with the maximum partial F to enter the model, say x4 with partial F = 58.02377. Our next model is therefore y ~ b0 + b4*x4.

My questions:

1. How should I write the code so that x4 is added to the model in the next step?

2. The formula for the partial F test is

    F* = [(SSE(reduced model) - SSE(full model)) / (dfR - dfF)] / [SSE(full model) / dfF]

which can be written more simply as

    F* = MSR(xi | x1, x2, ..., xi-1, xi+1) / MSE(x1, x2, ..., xi-1, xi, xi+1)

If I would like to use the simplified formula, how can I write it for every xi not yet in the model, conditional on the x's already in the model? For example, to choose among the remaining variables (x1, x2, x3) after x4 has been selected:

    F* = MSR(x3 | x4) / MSE(x3, x4)

Below I attach my simple code:

    library(MASS)  # needed for ginv()

    p <- dim(mydata)[2]
    d <- p - 1
    n <- dim(mydata)[1]
    x <- as.matrix(mydata[, 2:p])
    y <- as.matrix(mydata[, 1])

    # intercept-only model y ~ b0
    X <- as.matrix(rep(1, n))
    b <- lm(y ~ 1, data = mydata)$coefficients
    yhat <- X %*% b
    res <- y - yhat
    sigma.hat <- sqrt(sum(res^2) / (n - ncol(X)))
    cv <- sigma.hat^2 * ginv(t(X) %*% X)
    se <- sqrt(diag(cv))

    pc   <- matrix(0, nrow = 1, ncol = d)
    resF <- matrix(0, nrow = n, ncol = d)
    pf   <- matrix(0, nrow = 1, ncol = d)

    # reduced model: the same for every candidate, so fit it once
    resR <- lm(y ~ 1, data = mydata)$residuals
    sseR <- sum(resR^2)
    dfR <- n - 1
    dfF <- n - 2

    for (j in 1:d) {
      pc[, j] <- cor(x = x[, j], y = mydata[, 1])
      # full model: intercept plus candidate x[, j]
      resF[, j] <- lsfit(x[, j], y)$residuals
      sseF <- sum(resF[, j]^2)
      pf[, j] <- ((sseR - sseF) / (dfR - dfF)) / (sseF / dfF)
    }

    max.pf <- max(pf)
    max.pc <- max(pc)

Thank you, and I look forward to hearing some replies.
Sincerely,
Iba
Universiti Putra Malaysia

--
View this message in context: http://r.789695.n4.nabble.com/variable-selection-in-linear-regression-tp3578795p3578795.html
Sent from the R help mailing list archive at Nabble.com.
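One possible way to approach both questions is sketched below. It is only an illustration under stated assumptions, not a definitive answer: the data frame is simulated to stand in for mydata, and the F-to-enter threshold f_in is a hypothetical value chosen for the example. The sketch relies on add1() with test = "F", which reports the partial F of each candidate given the terms already in the model, and on update(), which adds the chosen variable for the next step. The last part computes F* = MSR(x3 | x4) / MSE(x3, x4) by hand and via anova() so the two can be compared.

```r
# Simulated data standing in for mydata (assumption: column 1 is y).
set.seed(1)
n <- 50
mydata <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
mydata$y <- 1 + 2 * mydata$x4 + 0.8 * mydata$x2 + rnorm(n)
mydata <- mydata[, c("y", "x1", "x2", "x3", "x4")]

all_vars <- names(mydata)[-1]      # every candidate predictor
fit <- lm(y ~ 1, data = mydata)    # start from the intercept-only model
f_in <- 4                          # hypothetical F-to-enter threshold

repeat {
  # Candidates are the predictors not yet in the current model.
  remaining <- setdiff(all_vars, attr(terms(fit), "term.labels"))
  if (length(remaining) == 0) break

  # Partial F of each candidate, conditional on the terms already in fit.
  cand <- add1(fit, scope = reformulate(remaining), test = "F")
  cand <- cand[rownames(cand) != "<none>", , drop = FALSE]

  best <- which.max(cand[["F value"]])
  if (cand[["F value"]][best] < f_in) break   # no candidate clears the threshold

  # Question 1: add the selected variable to the model for the next step.
  fit <- update(fit, as.formula(paste(". ~ . +", rownames(cand)[best])))
}

selected <- attr(terms(fit), "term.labels")
print(selected)

# Question 2: partial F for x3 given x4, by hand and with anova().
reduced <- lm(y ~ x4, data = mydata)
full    <- lm(y ~ x4 + x3, data = mydata)
sseR <- deviance(reduced)          # SSE of the reduced model
sseF <- deviance(full)             # SSE of the full model
dfR  <- df.residual(reduced)
dfF  <- df.residual(full)
F_manual <- ((sseR - sseF) / (dfR - dfF)) / (sseF / dfF)
F_anova  <- anova(reduced, full)$F[2]
print(c(manual = F_manual, anova = F_anova))
```

The candidate list is rebuilt on every pass because add1() stops with an error when the scope contains no terms beyond those already in the model.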