thr3ads.net - R help - [R] sub setting a data frame with binomial responses [Aug 2012]

john.james

2012-Aug-01 15:19 UTC

[R] sub setting a data frame with binomial responses

Hi everyone,
Suppose I have a data frame named "mydata", created as below:
> n=c(5,5,5,5) #number of trials
> x1=c(2,3,1,3) #number of successes
> x2=c(5,5,5,5) #number of successes
> x3=c(0,0,0,0) #number of successes
> x4=c(5,0,5,0) #number of successes
> mydata=data.frame(n,x1,x2,x3,x4)
> mydata
  n x1 x2 x3 x4
1 5  2  5  0  5
2 5  3  5  0  0
3 5  1  5  0  5
4 5  3  5  0  0
But for my modeling purposes (binomial), I cannot have a data frame with
columns that are all successes, all failures, or only a mixture of
all-success and all-failure values.
That is, I need to delete x2, x3 and x4 from my data frame.
I can delete x2 and x3 as follows:
mydata = t(subset(t(mydata), rowSums(t(mydata)) > 0))
mydata = t(subset(t(mydata), rowSums(t(mydata)) < 20)) #where 20=4*5

How can I subset my data to remove x4, whose elements are all either the
number of trials or zero?
Can I give a single logical condition in the subset code to skip all such
rows of the transposed data (i.e. skipping x2, x3, and x4 at once)?

*** I am doing this for a very large data frame (1000s of response columns)
in a simulation study; the example here is a simplified case.

Thank you for your kindness!




--
View this message in context:
http://r.789695.n4.nabble.com/sub-setting-a-data-frame-with-binomial-responses-tp4638702.html
Sent from the R help mailing list archive at Nabble.com.

David L Carlson

2012-Aug-02 04:56 UTC

If I understand you correctly, you want to exclude columns where all successes
equal trials, all successes equal 0, or successes are a mixture of trials and 0
with no in-between values. You did not make it clear whether the number of
trials can vary, but in your example it does not. Given that, all three criteria
can be consolidated into a single statement:
> mydata <- structure(list(n = c(5, 5, 5, 5), x1 = c(2, 3, 1, 3),
+   x2 = c(5, 5, 5, 5), x3 = c(0, 0, 0, 0), x4 = c(5, 0, 5, 0)),
+   .Names = c("n", "x1", "x2", "x3", "x4"), row.names = c(NA, -4L),
+   class = "data.frame")
> mydata
  n x1 x2 x3 x4
1 5  2  5  0  5
2 5  3  5  0  0
3 5  1  5  0  5
4 5  3  5  0  0
> idx <- sapply(mydata[,-1], function(x) all(x %in% c(0, 5)))
> idx <- c(TRUE, !idx) # add TRUE to include the first column
> mydata[, idx]
  n x1
1 5  2
2 5  3
3 5  1
4 5  3
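
If the number of trials can differ from row to row, the hard-coded 5 could be replaced by a comparison against the n column. The following is only a sketch under that assumption: the helper name drop_degenerate and its trials argument are made up for illustration and do not come from the original question, and the code assumes there are no missing values.

```r
# Hypothetical helper (not from the thread): drop every response column whose
# values are all either 0 or equal to that row's number of trials.
drop_degenerate <- function(df, trials = "n") {
  n <- df[[trials]]
  resp <- setdiff(names(df), trials)
  # TRUE for a column if at least one value is neither 0 nor the trial count
  keep <- vapply(df[resp], function(x) any(x != 0 & x != n), logical(1))
  df[, c(trials, resp[keep]), drop = FALSE]
}

mydata <- data.frame(n  = c(5, 5, 5, 5),
                     x1 = c(2, 3, 1, 3),
                     x2 = c(5, 5, 5, 5),
                     x3 = c(0, 0, 0, 0),
                     x4 = c(5, 0, 5, 0))
result <- drop_degenerate(mydata)
names(result)
```

With the example data this keeps only n and x1, the same result as the statement above; drop = FALSE keeps the result a data frame even if a single column survives.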

----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

arun kirshna [via R]

2012-Aug-02 06:22 UTC

Hi,

Just two variants of David's solution:
idx<-apply(mydata[-1],2,function(x) any(!x %in% c(0,5)))
idx
   x1    x2    x3    x4 
 TRUE FALSE FALSE FALSE 
idx<-c(TRUE,idx)
mydata[,idx]
#second: keep a column if any value is neither 0 nor 5
idx<-apply(mydata[-1],2,function(x) any(ifelse(x!=0 & x!=5,TRUE,FALSE )))
mydata[,c(TRUE,idx)]
  n x1
1 5  2
2 5  3
3 5  1
4 5  3
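
For a data frame with thousands of response columns, one possible alternative to the per-column apply() call is a single comparison on the whole response matrix followed by colSums(). This is only a sketch, assuming numeric responses with no missing values and the trial counts in the first column n; the names resp, interior, keep and reduced are introduced here for illustration, and no timing was done to confirm that it is faster.

```r
mydata <- data.frame(n  = c(5, 5, 5, 5),
                     x1 = c(2, 3, 1, 3),
                     x2 = c(5, 5, 5, 5),
                     x3 = c(0, 0, 0, 0),
                     x4 = c(5, 0, 5, 0))

# Response columns as a matrix (everything except the first column, n)
resp <- as.matrix(mydata[-1])

# Matrices are stored column by column, so recycling mydata$n (one value per
# row) compares each entry with the trial count of its own row.
interior <- resp != 0 & resp != mydata$n

# Keep a column if at least one entry is neither 0 nor the trial count
keep <- colSums(interior) > 0
reduced <- mydata[, c(TRUE, keep), drop = FALSE]
names(reduced)
```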
A.K.



