Umesh Rosyara
2011-Feb-25 17:02 UTC
[R] help please ..simple question regarding output the p-value inside a function and lm
Dear R community members and R experts,

I am stuck at a point; I tried with my colleagues and we could not work it out. Sorry, I need your help. Here is my data (created just to show the example):

    # generating a dataset just to show what my dataset looks like; here I have x variables
    # x1 ... x1000, plus ind and y
    ind <- c(1:100)
    y <- rnorm(100, 10,2)
    set.seed(201)
    P <- vector()
    dataf1 <- as.data.frame(matrix(rep(NA, 100000), nrow=100))
    dataf <- data.frame (dataf1, ind,y)
    names(dataf) <- (c(paste("x",1:1000, sep=""),"ind", "y"))
    for(i in 1:1000) {
      dataf[,i] <- rnorm(100)
    }

My intention was to fit models in the following fashion: y ~ x1 + x2, y ~ x3 + x4, y ~ x5 + x6, ..., y ~ x999 + x1000 (to the end of the data frame). Please note that I want to avoid fitting y ~ x2 + x3 or y ~ x4 + x5; that is, I am selecting two x variables at a time, to the end.

Question: how can I do this and put it inside a user function, as I tried to work out below?

    # defining function for lm model
    mylm <- function (mydata,nvar) {
      y <- NULL
      P1 <- vector (mode="numeric", length = nvar)
      P2 <- vector (mode="numeric", length = nvar)
      for(i in 1: nvar) {
        print(P1[i] <- summary(lm(mydata$y ~ mydata[,i]) + mydata[,i+1]$coefficients[2,4]))
        print(P2[i] <- summary(lm(mydata$y ~ mydata[,i]) + mydata[,i+1]$coefficients[2,5]))
        print(plot(nvar, P1))
        print(plot(nvar, P2))
      }
    }
    # applying the function to mydata
    mylm (dataf, 1000)

It does not work. The following is the error message:

    Error in model.frame.default(formula = mydata$y ~ mydata[, i], drop.unused.levels = TRUE) :
      invalid type (NULL) for variable 'mydata$y'

Please help!

Thanks;
Umesh R
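A few things in the posted function look likely to cause trouble regardless of the reported error: the closing parenthesis of `lm(...)` comes before `+ mydata[,i+1]`, so the second predictor never enters the model; `$coefficients` is applied to a data column rather than to the `summary()` result; the coefficient table returned by `summary.lm` has only four columns, so `[2,5]` does not exist; and looping over `1:nvar` would fit overlapping pairs. The following is a minimal sketch of one way to fit the non-overlapping pairs, not necessarily what the list recommended. The helper name `pair_pvalues` and the reduced 10-column data set are assumptions made for illustration.

```r
# Hypothetical helper (not from the original post): fit y ~ x_i + x_(i+1)
# for non-overlapping column pairs (1,2), (3,4), ... and collect the
# p-values of the two slopes from each fit.
pair_pvalues <- function(mydata, nvar, response = "y") {
  stopifnot(nvar %% 2 == 0)
  first <- seq(1, nvar, by = 2)          # 1, 3, 5, ...
  res <- t(sapply(first, function(i) {
    vars <- names(mydata)[c(i, i + 1)]
    # reformulate() builds the formula y ~ x_i + x_(i+1) from column names
    fit <- lm(reformulate(vars, response = response), data = mydata)
    # Column 4 of the coefficient table is Pr(>|t|);
    # rows 2 and 3 are the two slopes (row 1 is the intercept).
    summary(fit)$coefficients[2:3, 4]
  }))
  data.frame(first  = names(mydata)[first],
             second = names(mydata)[first + 1],
             P1 = res[, 1], P2 = res[, 2],
             row.names = NULL)
}

# Small stand-in for the 1000-column data set (10 predictors, assumed sizes).
set.seed(201)
dataf <- as.data.frame(matrix(rnorm(100 * 10), nrow = 100))
names(dataf) <- paste0("x", 1:10)
dataf$ind <- 1:100
dataf$y <- rnorm(100, 10, 2)

pv <- pair_pvalues(dataf, 10)
print(pv)
```

With the full data set the call would be `pair_pvalues(dataf, 1000)`, and the collected values could then be plotted once after the loop, for example with `plot(pv$P1)`, instead of plotting inside every iteration.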
Umesh Rosyara
2011-Feb-26 17:07 UTC
[R] help please ..simple question regarding output the p-value inside a function and lm
Hi Jorge and R users,

Thank you so much for the responses. Your input helped me a lot and could potentially help me solve one more problem, but I got an error message. I am sorry to ask again, but it would be great if you could spot my problem at a quick look. I hope this will not cost a lot of your time, as it is based on your idea.

    # Just data
    X1 <- c(1,3,4,2,2)
    X2 <- c(2,1,3,1,2)
    X3 <- c(4,3,2,1,1)
    X4 <- c(1,1,1,2,3)
    X5 <- c(3,2,1,1,2)
    X6 <- c(1,1,2,2,3)
    odataframe <- data.frame(X1,X2,X3,X4,X5,X6)

My objective here is to sort the values within each pair of variables (X1 and X2, X3 and X4, X5 and X6, and so on) in such a way that the second column in the pair is always higher than the first one (X2 > X1, X4 > X3, X6 > X5, and so on). Here is my attempt:

    nmrk <- 3
    nvar <- 2*nmrk
    lapply(1:nvar, function(ind){
      # indices for the variables we need
      a <- seq(1, nvar, by = 2)
      b <- seq(2, nvar, by = 2)
      # sorting column
      tx[, a[ind]] = ifelse(odataframe[, a[ind]] < odataframe[,b[ind]], odataframe[, a[ind]], odataframe[, b[ind]])
      tx[, b[ind]] = ifelse(odataframe[, b[ind]] > dataframe[,a[ind]], odataframe[,b[ind]], odataframe[,a[ind]])
      df1 <- transform( odataframe, odataframe[, a[ind]]= tx[, a[ind]], odataframe[, b[ind]]= tx[, b[ind]]))
    }

I got the following error:

    Error: unexpected '=' in:
    "tx[, b[ind]] = ifelse(odataframe[, b[ind]] > dataframe[,a[ind]], odataframe[,b[ind]], odataframe[,a[ind]])
    df1 <- transform( odataframe, odataframe[, a[ind]]="

Thanks;
Umesh R
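The parse error comes from the `transform()` call: its arguments have to be written as `name = value`, and an indexing expression such as `odataframe[, a[ind]]` cannot serve as the name. Beyond that, `tx` is never created, `dataframe` looks like a typo for `odataframe`, the parentheses of the `lapply()` call are unbalanced, and `1:nvar` runs past the end of `a` and `b`, which each have only `nmrk` elements. As a minimal sketch of one alternative, assuming element-wise `pmin()`/`pmax()` is acceptable in place of `ifelse()`, and with `sorted`, `lo` and `hi` being names introduced only for this example:

```r
# The example data from the post.
odataframe <- data.frame(
  X1 = c(1, 3, 4, 2, 2), X2 = c(2, 1, 3, 1, 2),
  X3 = c(4, 3, 2, 1, 1), X4 = c(1, 1, 1, 2, 3),
  X5 = c(3, 2, 1, 1, 2), X6 = c(1, 1, 2, 2, 3)
)

nmrk <- 3                      # number of column pairs
nvar <- 2 * nmrk
a <- seq(1, nvar, by = 2)      # first column of each pair: 1, 3, 5
b <- seq(2, nvar, by = 2)      # second column of each pair: 2, 4, 6

sorted <- odataframe           # work on a copy; the original stays intact
for (k in seq_len(nmrk)) {     # loop over pairs, not over all columns
  lo <- pmin(odataframe[[a[k]]], odataframe[[b[k]]])  # row-wise smaller value
  hi <- pmax(odataframe[[a[k]]], odataframe[[b[k]]])  # row-wise larger value
  sorted[[a[k]]] <- lo
  sorted[[b[k]]] <- hi
}
print(sorted)
```

Note that rows where both values of a pair are equal (for example the fifth row of X1 and X2) stay equal, so the result guarantees second >= first rather than a strict inequality.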