Data frames are lists. Each column of the data frame is a component of the
list. So in, e.g.
lapply(data, function(x) x)
the function would receive each column of the data frame in turn.
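A small sketch to confirm this, using a made-up data frame (the name df and its columns are assumptions for illustration only, not from the original post):

```r
# Hypothetical example data, for illustration only
df <- data.frame(id = 1:3, value = c(3.2, 3.0, 3.1))

# lapply() visits the columns one at a time and returns a list
# with one component per column, named after the columns.
col_classes <- lapply(df, class)
col_lengths <- lapply(df, length)

str(col_classes)
str(col_lengths)
```

Each component of the result corresponds to one column, so the function never sees a row or the whole frame.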
To apply a function to each row of the data frame (which may need some care,
since apply() first converts the data frame to a matrix) one tool you can use
is apply(...)
apply(data, 1, function(x) ...)
The form of the result will depend on the value of the function. If the
function returns a vector of length greater than one, the result of apply is
a matrix, and those vectors form its *columns*, not its rows.
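Here is a rough sketch of both points, the orientation of the result and the conversion to a matrix. The data frames df and df2 are hypothetical and exist only for this illustration:

```r
# Hypothetical all-numeric data frame, for illustration only
df <- data.frame(a = 1:3, b = c(10, 20, 30))

# The function returns a vector of length 2 for each row...
res <- apply(df, 1, function(x) c(total = sum(x), largest = max(x)))

# ...so the result has one *column* per row of df: a 2 x 3 matrix.
dim(res)

# Transpose to get one row per row of df.
t(res)

# apply() converts the data frame with as.matrix() first, so a
# character column turns every value in the row into a character string.
df2 <- data.frame(a = 1:3, g = c("A", "A", "B"))
row_classes <- apply(df2, 1, class)
row_classes
```

If the rows mix numeric and character columns, this conversion is usually the part that needs care.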
For the normalization problem, here is one way to do it:
data <- within(data, norm <- as.vector(value /
    tapply(value, group, sum)[as.character(group)]))
Warning 1: the second of these assignment operators may not be replaced by
'=', because within() would then treat it as a named argument rather than an
assignment.
Warning 2: untested code!
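Another route to the same normalization is ave(), which applies a function within each group and returns the values in the original row order. This is only a sketch, using the six rows shown in the question's table as the data:

```r
# The example table from the question
data <- data.frame(id = 1:6,
                   group = c("A", "A", "A", "B", "B", "B"),
                   value = c(3.2, 3.0, 3.1, 5.5, 6.0, 6.2))

# ave() applies FUN to the values of each group separately and
# puts the results back in the original row order.
data$norm <- ave(data$value, data$group, FUN = function(x) x / sum(x))

# Check: the normalized values should sum to 1 within each group.
tapply(data$norm, data$group, sum)
```

No explicit loop or lapply call is needed, since the grouping is handled inside ave().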
________________________________________
From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] On
Behalf Of Noah Silverman [noah at smartmediacorp.com]
Sent: 28 February 2010 12:37
To: r-help at r-project.org
Subject: [R] lapply with data frame
I'm a bit confused on how to use lapply with a data.frame.
For example.
lapply(data, function(x) print(x))
WHAT exactly is passed to the function? Is it each ROW in the data
frame, one by one, or each column, or the entire frame in one shot?
What I want to do is apply a function to each row in the data frame. Is
lapply the right way?
A second application is to normalize a column value by group. For
example, if I have the following table:
id  group  value  norm
1   A      3.2
2   A      3.0
3   A      3.1
4   B      5.5
5   B      6.0
6   B      6.2
etc...
The long version would be:
data$norm <- NA_real_  # make sure the column exists before filling it
for (g in unique(data$group)) {
    idx <- data$group == g  # rows belonging to the current group
    data$norm[idx] <- data$value[idx] / sum(data$value[idx])
}
There must be a faster way to do this with lapply. (Ideally, I'd then
use mclapply to run on multi-cores and really crank up the speed.)
Any suggestions?
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.