Hello,

I'm working with a dataset that has 2 columns and 1000 entries. Column 1 has either value 0 or 1, column 2 has values between 0 and 10. I would like to count how often Column 1 has the value 1 while Column 2 has a value greater than 5.

This is my attempt, which works but doesn't seem to be very efficient, especially when testing different values or columns.

count=0
for (i in 1:1000) { if(dataset[i,2]>5 && ind[i,1]==1) { count=count+1}}

I'm looking for a more efficient/elegant way to do this!

Thanks!

--
View this message in context: http://r.789695.n4.nabble.com/Count-based-on-2-conditions-Beginner-Question-tp4643282.html
Sent from the R help mailing list archive at Nabble.com.
On 16.09.2012 12:41, SirRon wrote:
> count=0
> for (i in 1:1000) { if(dataset[i,2]>5 && ind[i,1]==1) { count=count+1}}

You probably want

count <- sum(dataset[,2]>5 & ind[,1]==1)

Uwe Ligges
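A minimal sketch of why the loop collapses into one line: `&` compares element by element and returns a logical vector, and sum() counts each TRUE as 1. The data frame `toy` and its values are invented for illustration, and both columns are assumed to live in the same object.

```r
# Toy data invented for illustration; not the poster's dataset
toy <- data.frame(flag  = c(1, 0, 1, 1, 0, 1),
                  score = c(7, 9, 3, 6, 2, 10))

# Element-wise AND: one TRUE/FALSE per row
hit <- toy$flag == 1 & toy$score > 5

# TRUE is treated as 1 when summed, so this is the row count
count_vec <- sum(hit)

# The original loop, written against the same toy data, for comparison
count_loop <- 0
for (i in seq_len(nrow(toy))) {
  if (toy[i, "score"] > 5 && toy[i, "flag"] == 1) {
    count_loop <- count_loop + 1
  }
}

print(c(vectorized = count_vec, loop = count_loop))
```

Note the difference between the operators: `&&` is meant for a single TRUE/FALSE inside if(), while `&` works on whole vectors, which is what the one-liner needs.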
Hello,

Since the logical values FALSE/TRUE are coerced to the integers 0/1 in arithmetic, you can use this:

set.seed(5712)  # make it reproducible
n <- 1e3
x <- data.frame(A = sample(0:1, n, TRUE), B = sample(0:10, n, TRUE))

count <- sum(x$A == 1 & x$B > 5)  # 207

Hope this helps,

Rui Barradas
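Since the original question mentions testing different values or columns, the same sum() idea can be wrapped in a small function. This is only a sketch: the helper name `count_rows`, its arguments, and the toy data are hypothetical and not from the thread.

```r
# Hypothetical helper (name and arguments invented for illustration):
# count rows where `flag_col` equals `value` and `value_col` exceeds `threshold`
count_rows <- function(df, flag_col, value_col, value = 1, threshold = 5) {
  # na.rm = TRUE leaves rows with missing values out of the count
  sum(df[[flag_col]] == value & df[[value_col]] > threshold, na.rm = TRUE)
}

# Toy data invented for illustration
toy <- data.frame(A = c(1, 0, 1, 1, 0, 1),
                  B = c(7, 9, 3, 6, 2, 10))

n_default <- count_rows(toy, "A", "B")                            # A == 1 and B > 5
n_other   <- count_rows(toy, "A", "B", value = 0, threshold = 8)  # A == 0 and B > 8

print(c(n_default = n_default, n_other = n_other))
```

Passing column names as strings and indexing with `[[` keeps the function usable on any data frame without hard-coding `x$A` or `x$B`.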
Hi,

Try this:

library(plyr)  # count() is assumed to be plyr::count(); base R has no count()
set.seed(1)
dat1 <- data.frame(col1=sample(0:1,1000,replace=TRUE), col2=sample(0:10,1000,replace=TRUE))
count(dat1$col1==1 & dat1$col2>5)[2,2]  # row 2 holds the frequency of TRUE
#[1] 209

A.K.
On Sep 16, 2012, at 3:41 AM, SirRon wrote:
> count=0
> for (i in 1:1000) { if(dataset[i,2]>5 && ind[i,1]==1) { count=count+1}}
>
> I'm looking for a more efficient/elegant way to do this!

I see others have given you a solution using the vectorized sum function. I would have reached for 'table' and done it thusly:

table( one=dataset[,1], GT5=dataset[ , 2] > 5 )

--
David Winsemius, MD
Alameda, CA, USA
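The table() call returns all four combinations at once; the single count asked for can then be picked out by its dimension names. A small sketch on made-up data (`toy` and its values are invented for illustration):

```r
# Toy data invented for illustration
toy <- data.frame(flag  = c(1, 0, 1, 1, 0, 1),
                  score = c(7, 9, 3, 6, 2, 10))

# 2 x 2 contingency table: rows are the 0/1 column, columns are score > 5
tab <- table(one = toy$flag, GT5 = toy$score > 5)
print(tab)

# Index by the character labels of the levels, not by position
n_hits <- tab["1", "TRUE"]
print(n_hits)
```

Indexing with the labels "1" and "TRUE" is a little safer than `tab[2, 2]`, because it does not depend on the order of the levels. It does assume both labels occur in the data; if no row satisfied the condition, the "TRUE" column would be absent.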
Stephen Politzer-Ahles
2012-Sep-17 12:09 UTC
[R] Count based on 2 conditions [Beginner Question]
Most of your counting needs can be handled elegantly with the xtabs() function (cross-tabulation). This'll work a lot faster than an iterative method. For your data I would suggest something like this:

# Create a column indicating whether or not the value in Col2 is above 5
dataset$Col2greaterthan5 <- dataset$Col2 > 5

# Cross-tabulate
xtabs(~ Col1 + Col2greaterthan5, dataset)
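If the cross-tabulation is needed as data rather than as a printed table, as.data.frame() turns it into one row per combination with a `Freq` column. A sketch under the assumption that the columns are named `Col1` and `Col2` as in the suggestion above; the data frame `toy` and its values are invented for illustration:

```r
# Toy data invented for illustration; column names follow the suggestion above
toy <- data.frame(Col1 = c(1, 0, 1, 1, 0, 1),
                  Col2 = c(7, 9, 3, 6, 2, 10))

# Indicator column, then the cross-tabulation
toy$Col2greaterthan5 <- toy$Col2 > 5
xt <- xtabs(~ Col1 + Col2greaterthan5, data = toy)

# Long format: one row per (Col1, Col2greaterthan5) combination
long <- as.data.frame(xt)
print(long)

# The count for Col1 == 1 and Col2 > 5
n_hits <- long$Freq[long$Col1 == "1" & long$Col2greaterthan5 == "TRUE"]
print(n_hits)
```

The classifying columns come back as factors, which is why the comparison uses the strings "1" and "TRUE".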