Hi everyone,

Suppose I have a data frame named "mydata", created as below:

> n=c(5,5,5,5)   #number of trials
> x1=c(2,3,1,3)  #number of successes
> x2=c(5,5,5,5)  #number of successes
> x3=c(0,0,0,0)  #number of successes
> x4=c(5,0,5,0)  #number of successes
> mydata=data.frame(n,x1,x2,x3,x4)
> mydata
  n x1 x2 x3 x4
1 5  2  5  0  5
2 5  3  5  0  0
3 5  1  5  0  5
4 5  3  5  0  0

For my (binomial) modeling purposes, I cannot have a data frame that contains all-success columns, all-failure columns, or columns made up only of all-success and all-failure values. That is, I need to delete x2, x3 and x4 from my data frame.

I can delete x2 and x3 as follows:

mydata = t(subset(t(mydata), rowSums(t(mydata)) > 0))
mydata = t(subset(t(mydata), rowSums(t(mydata)) < 20)) #where 20=4*5

How can I subset my data to remove x4, whose elements are all either the number of trials or zero? Can I give a single logical condition in the subset code to skip all such rows (i.e. skipping x2, x3, and x4 at once)?

*** I am doing this for a very large data frame (1000s of columns as responses) in a simulation study, but here I have explained it with a simple case.

Thank you for your kindness!
David L Carlson
2012-Aug-02 04:56 UTC
[R] sub setting a data frame with binomial responses
If I understand you correctly, you want to exclude columns where all successes equal trials, all successes equal 0, or successes are a mixture of trials and 0 with no in-between values. You did not make it clear whether the number of trials can vary, but in your example it does not. Given that, all three criteria can be consolidated into a single statement:

> mydata <- structure(list(n = c(5, 5, 5, 5), x1 = c(2, 3, 1, 3),
+     x2 = c(5, 5, 5, 5), x3 = c(0, 0, 0, 0), x4 = c(5, 0, 5, 0)),
+     .Names = c("n", "x1", "x2", "x3", "x4"),
+     row.names = c(NA, -4L), class = "data.frame")
> mydata
  n x1 x2 x3 x4
1 5  2  5  0  5
2 5  3  5  0  0
3 5  1  5  0  5
4 5  3  5  0  0
> idx <- sapply(mydata[,-1], function(x) all(x %in% c(0, 5)))
> idx <- c(TRUE, !idx) # add TRUE to include the first column
> mydata[, idx]
  n x1
1 5  2
2 5  3
3 5  1
4 5  3

----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352
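
The reply above assumes the number of trials is fixed at 5. If the trial count can differ by row, the same idea can be sketched by comparing each response column against the n column instead of a constant. The data frame mydata2 and its values are made up for illustration and are not from the original post:

```r
# Hypothetical data: trial counts differ by row (not from the original post)
mydata2 <- data.frame(n  = c(5, 4, 5, 3),
                      x1 = c(2, 3, 1, 3),
                      x2 = c(5, 4, 5, 3),
                      x3 = c(0, 0, 0, 0),
                      x4 = c(5, 0, 5, 0))

# TRUE for a response column whose every entry is either 0 or that row's n
degenerate <- vapply(mydata2[-1],
                     function(x) all(x == 0 | x == mydata2$n),
                     logical(1))

keep <- c(TRUE, !degenerate)            # always keep the trials column n
result <- mydata2[, keep, drop = FALSE] # drop = FALSE keeps a data frame
```

Here vapply() is used rather than sapply() only so that the result is guaranteed to be a logical vector; either should work for this check.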
arun kirshna [via R]
2012-Aug-02 06:22 UTC
[R] sub setting a data frame with binomial responses
Hi,

Just two variants of David's solution:

idx<-apply(mydata[-1],2,function(x) any(!x %in% c(0,5)))
idx
   x1    x2    x3    x4 
 TRUE FALSE FALSE FALSE 
idx<-c(TRUE,idx)
mydata[,idx]

#second
# any(), not all(): keep a column if at least one value is neither 0 nor 5
idx<-apply(mydata[-1],2,function(x) any(x!=0 & x!=5))
mydata[,c(TRUE,idx)]
  n x1
1 5  2
2 5  3
3 5  1
4 5  3

A.K.
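
Since the original question mentions thousands of response columns, one possible alternative to looping over columns is a single matrix comparison followed by colSums(). This is only a sketch and assumes all response columns are numeric; the names resp, informative and filtered are chosen here for illustration:

```r
mydata <- data.frame(n  = c(5, 5, 5, 5),
                     x1 = c(2, 3, 1, 3),
                     x2 = c(5, 5, 5, 5),
                     x3 = c(0, 0, 0, 0),
                     x4 = c(5, 0, 5, 0))

# Response columns as a numeric matrix (assumes every response is numeric)
resp <- as.matrix(mydata[-1])

# mydata$n is recycled down each column, so every entry is compared with
# its own row's trial count; count entries strictly between 0 and n
informative <- colSums(resp > 0 & resp < mydata$n) > 0

# Keep n plus every column with at least one in-between value
filtered <- mydata[, c(TRUE, informative), drop = FALSE]
```

The recycling step relies on mydata$n having exactly one value per row, which holds whenever n is a column of the same data frame.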