thr3ads.net - R help - [R] Why mean is not working in by? [Dec 2015]


Dimitri Liakhovitski

2015-Dec-08 23:08 UTC

[R] Why mean is not working in by?

Sorry, I omitted the first line:

myvars <- c("Sepal.Length", "Sepal.Width")
by(data = iris[myvars], INDICES = iris["Species"], FUN = summary)
by(data = iris[myvars], INDICES = iris["Species"], FUN = sum)
by(data = iris[myvars], INDICES = iris["Species"], FUN = var)
by(data = iris[myvars], INDICES = iris["Species"], FUN = max)
by(data = iris[myvars], INDICES = iris["Species"], FUN = min)

by(data = iris[myvars], INDICES = iris["Species"], FUN = sd)
by(data = iris[myvars], INDICES = iris["Species"], FUN = mean)

The first lines are doing what I expected them to do: for each level
of the factor "Species" they gave me a summary, a sum, a variance, a
max, a min for each of the 2 variables in question (myvars).
I expected by to generate the sd and the mean for the 2 variables in
question for each level of "Species".

On Tue, Dec 8, 2015 at 5:50 PM, Sarah Goslee <sarah.goslee at gmail.com> wrote:
> Hi Dimitri,
>
> I changed this into a reproducible example (we don't know what myvars
> is). Assuming length(myvars) > 1, I'm not convinced that your first
> five lines "work" either: what do you expect?
>
> I get:
>
> > by(data = iris[, -5], INDICES = iris["Species"], FUN = min)
> Species: setosa
> [1] 0.1
> ------------------------------------------------------------------
> Species: versicolor
> [1] 1
> ------------------------------------------------------------------
> Species: virginica
> [1] 1.4
>
> But was expecting:
>
> > aggregate(iris[,-5], by=iris[,"Species", drop=FALSE], FUN=min)
>      Species Sepal.Length Sepal.Width Petal.Length Petal.Width
> 1     setosa          4.3         2.3          1.0         0.1
> 2 versicolor          4.9         2.0          3.0         1.0
> 3  virginica          4.9         2.2          4.5         1.4
>
>
>
> aggregate(iris[,-5], by=iris[,"Species", drop=FALSE], FUN=sd)
> aggregate(iris[,-5], by=iris[,"Species", drop=FALSE], FUN=mean)
>
> provide the answers I would expect. If you want clearer advice, you
> need to provide an actually reproducible example, and tell us more
> about what you expect to get.
>
> Sarah
>
>
> On Tue, Dec 8, 2015 at 5:30 PM, Dimitri Liakhovitski
> <dimitri.liakhovitski at gmail.com> wrote:
>> Hello!
>> Could you please explain why the first 5 lines work but the last 2 lines don't?
>> Thank you!
>>
>> by(data = iris[myvars], INDICES = iris["Species"], FUN = summary)
>> by(data = iris[myvars], INDICES = iris["Species"], FUN = sum)
>> by(data = iris[myvars], INDICES = iris["Species"], FUN = var)
>> by(data = iris[myvars], INDICES = iris["Species"], FUN = max)
>> by(data = iris[myvars], INDICES = iris["Species"], FUN = min)
>>
>> by(data = iris[myvars], INDICES = iris["Species"], FUN = sd)
>> by(data = iris[myvars], INDICES = iris["Species"], FUN = mean)
>>
>> --
>> Dimitri Liakhovitski
>>


-- 
Dimitri Liakhovitski

William Dunlap

2015-Dec-08 23:17 UTC

by() calls FUN with a data.frame as the argument.  summary(), sum(), etc.
have methods that work on data.frames but sd() and mean() do not.

aggregate() calls its FUN with each column of a data.frame as the argument.
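
To make the distinction concrete, here is a minimal sketch (not part of the original reply) of how by() can still be used for per-column means and standard deviations: since FUN receives a whole data frame for each group, it can be wrapped so that it works column by column. The object names group_means, group_sds, group_means2 and pooled_sum are illustrative only.

```r
myvars <- c("Sepal.Length", "Sepal.Width")

# by() hands FUN one data frame per group, so apply mean()/sd()
# to each column of that data frame explicitly.
group_means <- by(iris[myvars], iris["Species"], function(d) sapply(d, mean))
group_sds   <- by(iris[myvars], iris["Species"], function(d) sapply(d, sd))

# colMeans() accepts a data frame directly, so it can be passed as FUN unchanged.
group_means2 <- by(iris[myvars], iris["Species"], colMeans)

# By contrast, sum() on a data frame pools all columns,
# giving a single number per group rather than one per variable.
pooled_sum <- by(iris[myvars], iris["Species"], sum)

group_means
```

Note that sum(), max() and min() return one value per group here, pooled over both columns, which is the behaviour Sarah pointed out earlier in the thread.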

Bill Dunlap
TIBCO Software
wdunlap at tibco.com


Dimitri Liakhovitski

2015-Dec-08 23:18 UTC

Got it - thank you, everybody!
by() splits the data into one data frame per group and passes each to FUN.
Lesson: use aggregate() when the function should be applied to each column.
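
As a short sketch of that lesson applied to the two variables from the question (the names agg_mean, agg_sd and agg_mean_f are illustrative, and the formula call is just an alternative spelling of the same request):

```r
myvars <- c("Sepal.Length", "Sepal.Width")

# aggregate() applies FUN to each column separately within each group.
agg_mean <- aggregate(iris[myvars], by = iris["Species"], FUN = mean)
agg_sd   <- aggregate(iris[myvars], by = iris["Species"], FUN = sd)

# Equivalent request using the formula interface.
agg_mean_f <- aggregate(cbind(Sepal.Length, Sepal.Width) ~ Species,
                        data = iris, FUN = mean)

agg_mean
agg_sd
```

Each result is an ordinary data frame with one row per species and one column per variable.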


-- 
Dimitri Liakhovitski
