Henrique Dallazuanna
2011-Feb-01 13:44 UTC
[R] Problems with sample means and standard deviations
Try this:

vapply(replicate(y, sample(listMV, 5), simplify = FALSE),
       function(x) c(mean(x), sd(x)), c(Mean = 0, Sd = 0))

On Tue, Feb 1, 2011 at 7:56 AM, Titta <peltokyla@gmail.com> wrote:
> Hi,
>
> I am writing a program that takes samples y times from listMV and saves the
> result to the list MVdata. The problem is that I need the sample mean and
> standard deviation for each sample (y of them) and for all samples together.
> How can I do that? Mean() and sd() won't work.
>
> Thanks already,
> Titta
>
> > listMV<-c(1.182101983,1.249382648,1.374104215,1.336153877,1.331386231,1.319032094,
> + 1.311126545,1.221740863,1.298848481,1.241727379,1.339273873,1.386809408,
> + 1.355919009,1.321051409,1.256459148,1.284277166,1.300219992,1.377359149,
> + 1.231984488,1.308793786,1.319114185,1.417506978,1.310797119,1.230818679,
> + 1.229165322,1.320724049,1.342038449,1.201942636,1.334793202,1.30065893,
> + 1.409992259,1.369055222,1.214696135,1.228829414,1.273789905,1.328549897,
> + 1.201871417,1.272051102,1.381760814,1.482881264,1.35225819,1.171344013,
> + 1.235416322,1.25905681,1.34637339,1.188881698,1.221856048,1.302875505,
> + 1.43703543,1.434648007,1.246797867,1.236886744,1.308768636,1.253534504,
> + 1.246544401,1.347202456,1.253535584,1.442176865,1.40847141,1.241578938,
> + 1.238772941,1.30662151,1.326978911,1.237433784,1.308488464,1.274562848,
> + 1.452806933,1.486559719,1.237405035,1.175760893,1.316972548,1.313807387,
> + 1.224698176,1.239616142,1.259846334,1.423991194,1.406917943,1.25118274,
> + 1.200447065,1.237256663,1.237398053
> + )
> > y<-3
> > MVdata=c()
> > for(i in 1:y){
> + s<-sample(listMV,size=5, replace=FALSE)
> + MVdata[[length(MVdata)+length(y)]]<-s}
> >
> > MVdata
> [[1]]
> [1] 1.434648 1.256459 1.237405 1.259057 1.334793
>
> [[2]]
> [1] 1.221856 1.201871 1.320724 1.231984 1.259846
>
> [[3]]
> [1] 1.182102 1.214696 1.310797 1.237405 1.308794
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
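The one-liner above returns the per-sample statistics but discards the samples themselves, so the pooled mean and standard deviation the question also asks for cannot be computed afterwards. A minimal sketch of one way to keep both, assuming a shortened stand-in for listMV (first ten values only) and a fixed seed, neither of which is in the original post:

```r
# Stand-in for the poster's listMV: first ten values only, to keep the sketch short
listMV <- c(1.182101983, 1.249382648, 1.374104215, 1.336153877, 1.331386231,
            1.319032094, 1.311126545, 1.221740863, 1.298848481, 1.241727379)
y <- 3        # number of samples, as in the original post
set.seed(1)   # assumption: fixed seed so the run is repeatable

# Keep the samples in a list so they can be reused for the pooled statistics
samples <- replicate(y, sample(listMV, 5), simplify = FALSE)

# One column per sample; rows are named Mean and Sd by the vapply() template
stats <- vapply(samples, function(x) c(mean(x), sd(x)), c(Mean = 0, Sd = 0))

# Pooled statistics over every sampled value
pooled  <- unlist(samples)
overall <- c(Mean = mean(pooled), Sd = sd(pooled))

print(stats)
print(overall)
```

Storing the result of replicate() first is the only change from the one-liner; the vapply() call itself is the same.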
Dennis Murphy
2011-Feb-01 17:34 UTC
[R] Problems with sample means and standard deviations
Hi:

Here are a few ways to do this. A useful approach is to use replicate() to
generate the samples, flatten the resulting matrix into a data frame and call
one or more packages that are well capable of handling multiple outputs per
data subset.

Step 1: Generate the samples and rearrange into a data frame.

# 400 samples of size 5 w/o replacement from listMV
u <- replicate(400, sample(listMV, 5))
df <- data.frame(sample = rep(1:ncol(u), each = nrow(u)), y = as.vector(u))

df is now a 2000 x 2 data frame with two columns: sample, a sample number
indicator, and y, the values sampled from listMV. (The columns have to be
named explicitly, because the code below refers to them as sample and y.)
You have more options when the data are arranged in this form.

Step 2: Get the means and standard deviations per sample.

The three packages I use below take somewhat different approaches. To use
them, they need to be installed first from CRAN if you don't already have
them. Fortunately, all three packages have documentation available; doBy and
data.table have package vignettes while plyr has its own website:
http://had.co.nz/plyr/

All three packages are designed to process data quickly in a groupwise
fashion, which is why I created an indicator for sample number.

# ----------------------
(a) Package doBy:

library(doBy)
# Create a function to return the mean and standard deviation of a variable
g <- function(x) c(mean = mean(x), sd = sd(x))
# Apply the function g() above to the y values in each sample
ww <- summaryBy(y ~ sample, data = df, FUN = g)   # returns a 400 x 3 data frame
head(ww, 3)
  sample   y.mean       y.sd
1      1 1.242163 0.04225226
2      2 1.301827 0.07032729
3      3 1.332400 0.02500223

# The overall mean and standard deviation are obtained as follows:
> summaryBy(y ~ 1, data = df, FUN = g)
    y.mean       y.sd
1 1.299855 0.07606458

# A nice generalization is that summaryBy() can take multiple responses
# on the left side of the formula and return a column of means and standard
# deviations for each by group.
# -----------------
(b) Package plyr:

library(plyr)
w <- ddply(df, .(sample), summarise, m = mean(y), s = sd(y))
> dim(w)
[1] 400   3
> head(w)
  sample        m          s
1      1 1.242163 0.04225226
2      2 1.301827 0.07032729
3      3 1.332400 0.02500223

The summarise argument creates a new data frame for the summary functions
defined by m and s. The summarise() function can also be called on its own
for overall summaries:

> summarise(df, m = mean(y), s = sd(y))
         m          s
1 1.299855 0.07606458

# -------------------------------
(c) Package data.table:

library(data.table)
dt <- data.table(sample = rep(1:400, each = 5), y = as.vector(u))
w2 <- dt[, list(m = mean(y), s = sd(y)), by = 'sample']
> dim(w2)
[1] 400   3
> head(w2)
     sample        m          s
[1,]      1 1.242163 0.04225226
[2,]      2 1.301827 0.07032729
[3,]      3 1.332400 0.02500223

The overall mean and standard deviation are straightforward:

dt[, list(m = mean(y), s = sd(y))]
            m          s
[1,] 1.299855 0.07606458

# -----------------------------

This is not an exhaustive list, as there are other ways to do the same thing,
which others may take the opportunity to show you.

HTH,
Dennis

On Tue, Feb 1, 2011 at 1:56 AM, Titta <peltokyla@gmail.com> wrote:
> [original message quoted in full in the reply above]
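For readers without any of the three packages installed, the same groupwise summaries can be sketched in base R. This is only an illustration of the long-format idea above, not something proposed in the thread; the vector pool is a hypothetical stand-in for listMV and the seed is an assumption, so the numbers will not match the output shown in the messages.

```r
set.seed(42)                                # assumption: fixed seed for a repeatable run
pool <- seq(1.17, 1.49, length.out = 81)    # hypothetical stand-in for listMV
u <- replicate(400, sample(pool, 5))        # 5 x 400 matrix, one column per sample

# Long format: one row per sampled value, with named columns
df <- data.frame(sample = rep(seq_len(ncol(u)), each = nrow(u)),
                 y = as.vector(u))

# Per-sample mean and standard deviation from the long-format data frame
per_sample <- data.frame(
  sample = seq_len(ncol(u)),
  m = as.vector(tapply(df$y, df$sample, mean)),
  s = as.vector(tapply(df$y, df$sample, sd))
)

# The same numbers taken directly from the matrix, as a cross-check
m_check <- colMeans(u)
s_check <- apply(u, 2, sd)

# Overall mean and standard deviation across all sampled values
overall <- c(m = mean(df$y), s = sd(df$y))

print(head(per_sample, 3))
print(overall)
```

When the samples all have the same size, working on the matrix u with colMeans() and apply() is the shortest route; the long-format data frame pays off once group sizes differ or several variables are summarised at once.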