Dear Ken,
At 05:51 PM 13/09/2001 -0400, kestickler at netscape.net wrote:
>Hi,
>
>i have a dataframe such as:
>
>       Exp1  Exp2  Exp3
>name1  12.6  78.0  45.6
>name2  11.9  19.0  21.0
>name3  10.0  14.0  17.0
>...
>...
>...
>
>Real datasets might be quite large - 20,000 rows by 100 columns
>
>I want to calculate metrics such as the variation *row-wise*. So, var for
>name1, var for name 2, var for name3 etc.
>
>Can someone kindly guide me on how best to code this?
The size of the dataset may prove to be a problem, but in principle this
kind of calculation can be done with the apply function: apply(df, 1, var),
where df is the data frame containing your data.
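As a minimal sketch on the three rows quoted above (the object names row_var, m and row_var2 are only illustrative, and all columns are assumed to be numeric, since apply coerces the data frame to a matrix):

```r
# Toy data frame built from the three example rows in the question
df <- data.frame(
  Exp1 = c(12.6, 11.9, 10.0),
  Exp2 = c(78.0, 19.0, 14.0),
  Exp3 = c(45.6, 21.0, 17.0),
  row.names = c("name1", "name2", "name3")
)

# Row-wise sample variance; the result is a numeric vector named by row
row_var <- apply(df, 1, var)
print(row_var)

# A vectorised alternative that may be faster on large data:
# subtract each row's mean, square, sum across the row, divide by n - 1
m <- as.matrix(df)
row_var2 <- rowSums((m - rowMeans(m))^2) / (ncol(m) - 1)
print(all.equal(row_var, row_var2))
```

The second form avoids calling var once per row, which might matter for something like 20,000 rows, though I have not timed it on data of that size.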
>Also, once such a metric has been calculated for each row, how best to
>store the results such that when (for instance) the results are sorted, i
>can access the row names along with the (ordered) variance value?
You can simply create a new variable in the data frame, e.g.,
df$var <- apply(df, 1, var)
When you sort a variable in a data frame, usually the row names don't show
in the result. But something like the following should work (again, if the
size of the problem isn't too large for your resources):
names(df$var) <- rownames(df); sort(df$var)
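A small sketch of the same idea on the three example rows, with two ways to keep the row names attached to the sorted variances. The names exp_cols, sorted_var and df_sorted are made up for illustration; restricting the calculation to the Exp columns is my assumption, so that re-running the line does not fold an existing var column into the variance.

```r
# Toy data frame built from the three example rows in the question
df <- data.frame(
  Exp1 = c(12.6, 11.9, 10.0),
  Exp2 = c(78.0, 19.0, 14.0),
  Exp3 = c(45.6, 21.0, 17.0),
  row.names = c("name1", "name2", "name3")
)

# Compute the variance over the measurement columns only
exp_cols <- c("Exp1", "Exp2", "Exp3")
df$var <- apply(df[, exp_cols], 1, var)

# Option 1: a sorted vector of variances, named by row
sorted_var <- sort(setNames(df$var, rownames(df)))
print(sorted_var)

# Option 2: reorder the whole data frame by variance;
# the row names travel with their rows
df_sorted <- df[order(df$var), ]
print(df_sorted)
```

The second option is probably the more convenient one if you also want to see the original measurements next to each variance.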
I hope that this helps,
John
-----------------------------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: jfox at mcmaster.ca
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox
-----------------------------------------------------