> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Erik Iverson
> Sent: Sunday, August 08, 2010 11:27 PM
> To: Alexander Eggel
> Cc: r-help at r-project.org
> Subject: Re: [R] Extract values from data frame in R
>
> On 08/09/2010 01:16 AM, Alexander Eggel wrote:
> > Using R, I would like to find out which Samples (S1, S2, S3, S4, S5) fulfill
> > the following criteria: contain minimally one value (x, y or z) bigger than
> > 4. Any ideas? Thanks, Alex.
> >
> >> data
> >   Sample    x    y    z
> > 1     S1 -0.3  5.3  2.5
> > 2     S2  0.4  0.2 -1.2
> > 3     S3  1.2 -0.6  3.2
> > 4     S4  4.3  0.7  5.7
> > 5     S5  2.4  4.3  2.3
>
> Untested:
>
> Sample[apply(Sample[-1], 1, function(x) any(x) > 4)), "Sample"]
The any(x)>4 should be any(x>4), as in:
> f1 <- function(data) data[apply(data[-1], 1, function(x) any(x > 4)), "Sample"]
Note that operating on a data.frame a column at a time is
often faster than operating on it a row at a time. E.g.,
> f2 <- function(data) with(data, Sample[x>4 | y>4 | z>4])
> makeData <- function (nrow, seed){
    if (!missing(seed))
        set.seed(seed)
    data.frame(Sample = sample(paste("S", 1:5, sep = ""), replace = TRUE,
        size = nrow), x = rgamma(nrow, 4), y = rgamma(nrow, 5),
        z = rgamma(nrow, 3))
}
> z <- makeData(10000, seed=73)
> system.time(v1 <- f1(z))
   user  system elapsed
   0.27    0.00    0.25
> system.time(v2 <- f2(z))
   user  system elapsed
   0.00    0.01    0.01
> identical(v1, v2)
[1] TRUE
> length(v1)
[1] 8390
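If the number of value columns is not fixed at three, the same column-at-a-time idea can be written without naming x, y and z. The following is only a sketch: anyAbove, threshold and idCol are names made up for illustration, and it assumes every column other than the id column is numeric. It uses the five rows from the original question.

```r
# The five-row data frame from the original question.
data <- data.frame(
  Sample = c("S1", "S2", "S3", "S4", "S5"),
  x = c(-0.3, 0.4, 1.2, 4.3, 2.4),
  y = c(5.3, 0.2, -0.6, 0.7, 4.3),
  z = c(2.5, -1.2, 3.2, 5.7, 2.3),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not from the thread): return the ids of rows in
# which at least one non-id column exceeds the threshold.
# Assumes all non-id columns are numeric.
anyAbove <- function(data, threshold = 4, idCol = "Sample") {
  valueCols <- data[setdiff(names(data), idCol)]
  # One logical vector per column, combined with element-wise OR.
  hit <- Reduce(`|`, lapply(valueCols, function(col) col > threshold))
  data[[idCol]][hit]
}

res <- anyAbove(data)
print(res)
```

For the data above this picks S1 (y = 5.3), S4 (x = 4.3, z = 5.7) and S5 (y = 4.3).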
(I prefer that non-apply approach because apply often
causes trouble when used with data.frames -- it is only
safe when all columns are numeric.)
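To illustrate the kind of trouble meant here: apply() converts a data.frame to a matrix first, and a matrix can hold only one type, so a single character column turns every value into a string. The small example below is a sketch with made-up values, not data from the thread.

```r
# Two rows with one character column and two numeric columns.
d <- data.frame(Sample = c("S1", "S2"),
                x = c(0.4, 1.2),
                y = c(0.2, -0.6),
                stringsAsFactors = FALSE)

# apply() works on as.matrix(d); with a character column present
# the whole matrix becomes character.
m <- as.matrix(d)
print(mode(m))

# Including the Sample column: every comparison is now a string
# comparison against "4". Labels such as "S1" typically sort after
# "4", so rows can be flagged even though no number exceeds 4.
wrong <- apply(d, 1, function(row) any(row > 4))

# Dropping the non-numeric column first keeps the comparison numeric.
right <- apply(d[-1], 1, function(row) any(row > 4))

print(wrong)
print(right)
```

No value in d is larger than 4, so right is FALSE for both rows; wrong need not agree with it.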
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com