thr3ads.net - R help - [R] Problems with sample means and standard deviations [Feb 2011]


Titta

2011-Feb-01 09:56 UTC

[R] Problems with sample means and standard deviations

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:
<https://stat.ethz.ch/pipermail/r-help/attachments/20110201/fe2362c4/attachment.pl>

Henrique Dallazuanna

2011-Feb-01 13:44 UTC


[R] Problems with sample means and standard deviations

Try this:

vapply(replicate(y, sample(listMV, 5), simplify = FALSE),
       function(x) c(mean(x), sd(x)), c(Mean = 0, Sd = 0))
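
For readers following along, here is a minimal sketch of what that call returns and of one way to get the figure for all samples together, which the original post also asked about. The short listMV is a stand-in (the first ten values from the post), and the names samples, stats and pooled are illustrative assumptions, not part of the reply above.

```r
# Stand-in data: the first ten values of listMV from the original post
listMV <- c(1.182101983, 1.249382648, 1.374104215, 1.336153877, 1.331386231,
            1.319032094, 1.311126545, 1.221740863, 1.298848481, 1.241727379)
y <- 3          # number of samples, as in the original post
set.seed(1)     # only so that the sketch is repeatable

# Keep each sample as a list element (simplify = FALSE returns a list)
samples <- replicate(y, sample(listMV, 5), simplify = FALSE)

# One column per sample; the row names Mean and Sd come from FUN.VALUE
stats <- vapply(samples, function(x) c(mean(x), sd(x)), c(Mean = 0, Sd = 0))

# All samples together: flatten the list before summarising
pooled <- c(Mean = mean(unlist(samples)), Sd = sd(unlist(samples)))
```

stats["Mean", ] then holds the per-sample means and pooled the values over all 15 sampled numbers. Calling mean() directly on a list returns NA with a warning, which may be why the original attempt did not work.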


On Tue, Feb 1, 2011 at 7:56 AM, Titta <peltokyla@gmail.com> wrote:
> Hi,
>
> I am writing a program that takes samples y times from listMV and saves the
> result to list MVdata. The problem is that I need the sample mean or standard
> deviation for each sample (y times) and for all samples together. How can I
> do that? Mean() and sd() won't work.
>
> Thanks already,
> Titta
>
> >
>
>
listMV<-c(1.182101983,1.249382648,1.374104215,1.336153877,1.331386231,1.319032094,1.311126545,1.221740863,1.298848481,1.241727379,1.339273873,1.386809408,1.355919009,1.321051409,1.256459148,1.284277166,1.300219992,1.377359149,1.231984488,1.308793786,1.319114185,1.417506978,1.310797119,1.230818679,1.229165322,1.320724049,1.342038449,1.201942636,1.334793202,1.30065893,1.409992259,1.369055222,1.214696135,1.228829414,1.273789905,1.328549897,1.201871417,1.272051102,1.381760814,1.482881264,1.35225819,1.171344013,1.235416322,1.25905681,1.34637339,1.188881698,1.221856048,1.302875505,1.43703543,1.434648007,1.246797867,1.236886744,1.308768636,1.253534504,1.246544401,1.347202456,1.253535584,1.442176865,1.40847141,1.241578938,1.238772941,1.30662151,1.326978911,1.237433784,1.308488464,1.274562848,1.452806933,1.486559719,1.237405035,1.175760893,1.316972548,1.313807387,1.224698176,1.239616142,1.259846334,1.423991194,1.406917943,1.25118274,1.200447065,1.237256663,1.237398053
> + )
> > y<-3
> > MVdata=c()
> > for(i in 1:y){
> + s<-sample(listMV,size=5, replace=FALSE)
> + MVdata[[length(MVdata)+length(y)]]<-s}
> >
> > MVdata
> [[1]]
> [1] 1.434648 1.256459 1.237405 1.259057 1.334793
>
> [[2]]
> [1] 1.221856 1.201871 1.320724 1.231984 1.259846
>
> [[3]]
> [1] 1.182102 1.214696 1.310797 1.237405 1.308794
>
>        [[alternative HTML version deleted]]
>
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


Dennis Murphy

2011-Feb-01 17:34 UTC


[R] Problems with sample means and standard deviations

Hi:

Here are a few ways to do this.

A useful approach is to use replicate() to generate the samples, flatten the
resulting matrix into a data frame and call one or more packages that are
well capable of handling multiple outputs per data subset.

Step 1: Generate the samples and rearrange into a data frame.

# 400 samples of size 5 without replacement from listMV
u <- replicate(400, sample(listMV, 5))
# Name the columns so that the calls below can refer to 'sample' and 'y'
df <- data.frame(sample = rep(1:ncol(u), each = nrow(u)), y = as.vector(u))

df is now a 2000 x 2 data frame with two columns: one a sample number
indicator, the other the values sampled from listMV. You have more options
when the data are arranged in this form.
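
As one illustration of what this long layout allows, here is a minimal base R sketch that needs no extra packages. It uses a short stand-in for listMV (the first ten values from the original post) and only four samples to keep it small; the column names sample and y match those used by the calls later in this message, and means, sds and per_sample are illustrative names.

```r
# Stand-in data: the first ten values of listMV from the original post
listMV <- c(1.182101983, 1.249382648, 1.374104215, 1.336153877, 1.331386231,
            1.319032094, 1.311126545, 1.221740863, 1.298848481, 1.241727379)
set.seed(1)                              # only so that the sketch is repeatable
u  <- replicate(4, sample(listMV, 5))    # 5 x 4 matrix, one column per sample
df <- data.frame(sample = rep(seq_len(ncol(u)), each = nrow(u)),
                 y = as.vector(u))

# Group the y values by sample number and summarise each group
means <- tapply(df$y, df$sample, mean)
sds   <- tapply(df$y, df$sample, sd)

# Collect the results in one data frame, one row per sample
per_sample <- data.frame(sample = as.integer(names(means)),
                         m = as.vector(means),
                         s = as.vector(sds))
```

The same grouping idea is what the three packages below apply, with less typing once there are several summary functions or several variables.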

Step 2: Get the means and standard deviations per sample.

The three packages I use below use somewhat different approaches. To use
them, they need to be installed first from CRAN if you don't already have
them. Fortunately, all three packages have documentation available; doBy and
data.table have package vignettes while plyr has its own website:
http://had.co.nz/plyr/
All three packages are designed to process data quickly in a groupwise
fashion, which is why I created an indicator for sample number.
# ----------------------

(a) Package doBy:

library(doBy)

# Create a function to return the mean and standard deviation of a variable
g <- function(x) c(mean = mean(x), sd = sd(x))

# Apply the function g() above to the y values in each sample
ww <- summaryBy(y ~ sample, data = df, FUN = g)   # returns a 400 x 3 data frame
head(ww, 3)
  sample   y.mean       y.sd
1      1 1.242163 0.04225226
2      2 1.301827 0.07032729
3      3 1.332400 0.02500223

# The overall mean and standard deviation are obtained as follows:
> summaryBy(y ~ 1, data = df, FUN = g)
    y.mean       y.sd
1 1.299855 0.07606458

# A nice generalization is that summaryBy() can take multiple responses
# on the left side of the formula and return a column of means and standard
# deviations for each by group.
# -----------------

(b) package plyr:

library(plyr)
w <- ddply(df, .(sample), summarise, m = mean(y), s = sd(y))
> dim(w)
[1] 400   3
> head(w, 3)
  sample        m          s
1      1 1.242163 0.04225226
2      2 1.301827 0.07032729
3      3 1.332400 0.02500223

The summarise argument creates a new data frame for the summary functions
defined by m and s. There is a separate summarise() function for overall
summaries:
> summarise(df, m = mean(y), s = sd(y))
         m          s
1 1.299855 0.07606458

# -------------------------------
(c) Package data.table:

library(data.table)
dt <- data.table(sample = rep(1:400, each = 5), y = as.vector(u))
w2 <- dt[, list(m = mean(y), s = sd(y)), by = 'sample']
> dim(w2)
[1] 400   3
> head(w2, 3)
     sample        m          s
[1,]      1 1.242163 0.04225226
[2,]      2 1.301827 0.07032729
[3,]      3 1.332400 0.02500223

The overall mean and standard deviation are straightforward to compute:

dt[, list(m = mean(y), s = sd(y))]
            m          s
[1,] 1.299855 0.07606458

# -----------------------------

This is not an exhaustive list, as there are other ways to do the same
thing, which others may take the opportunity to show you.
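
One such alternative, sketched here under the assumption that all samples have the same size, is to skip the data frame and work on the matrix returned by replicate() directly. The short listMV is a stand-in (the first ten values from the original post), and per_sample and overall are illustrative names.

```r
# Stand-in data: the first ten values of listMV from the original post
listMV <- c(1.182101983, 1.249382648, 1.374104215, 1.336153877, 1.331386231,
            1.319032094, 1.311126545, 1.221740863, 1.298848481, 1.241727379)
set.seed(1)                             # only so that the sketch is repeatable
u <- replicate(4, sample(listMV, 5))    # 5 x 4 matrix, one column per sample

# Column-wise summaries: each column of u is one sample
per_sample <- data.frame(sample = seq_len(ncol(u)),
                         m = colMeans(u),
                         s = apply(u, 2, sd))

# All samples together: treat the matrix as one long vector
overall <- c(m = mean(as.vector(u)), s = sd(as.vector(u)))
```

This is compact, but it only works while every sample has the same length; with unequal sample sizes a list or a long data frame with a sample indicator is the safer layout.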

HTH,
Dennis


