thr3ads.net - R help - [R] "by" question [Jun 2009]


David Hugh-Jones

2009-Jun-24 16:08 UTC

[R] "by" question

Hello all

I have a big data frame and I regularly want to break it down into subsets,
calculate some new data, and add it back to the data frame.

At the moment my technique seems a bit ugly and embarrassing. Something
like:

result <- by(mydata, mydata$some_factor, function (x) {
  # do something to create a vector v with length(v) == nrow(x)
  return(v)
})
# now result has a big list, argh... how do I put it neatly back into the
# mydata data frame?
for (i in unique(mydata$some_factor)) {
  mydata$newvar[mydata$some_factor == i] <- result[[i]]
}

What should I be doing instead of this?

David Hugh-Jones
Post-doctoral Researcher
Max Planck Institute of Economics, Jena
http://davidhughjones.googlepages.com


jim holtman

2009-Jun-24 16:15 UTC

How about something like this:
> x
   id      data
1   1 0.7773207
2   3 0.9606180
3   2 0.4346595
4   3 0.7125147
5   2 0.3999944
6   2 0.3253522
7   2 0.7570871
8   3 0.2026923
9   3 0.7111212
10  2 0.1216919
> # compute running sum for each ID
> x$run <- ave(x$data, x$id, FUN=cumsum)
> x
   id      data       run
1   1 0.7773207 0.7773207
2   3 0.9606180 0.9606180
3   2 0.4346595 0.4346595
4   3 0.7125147 1.6731327
5   2 0.3999944 0.8346539
6   2 0.3253522 1.1600060
7   2 0.7570871 1.9170932
8   3 0.2026923 1.8758249
9   3 0.7111212 2.5869462
10  2 0.1216919 2.0387851
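
For anyone who wants to run this, here is a minimal sketch that rebuilds the data frame from the values printed above. It also covers the case in the original question, where the function receives the whole subset rather than a single column. That second part is an assumption about what was wanted, not something from the reply: it uses base R's split() and unsplit(), and center_in_group is a hypothetical stand-in for "do something to create a vector v".

```r
# Data copied from the printed example above
x <- data.frame(
  id   = c(1, 3, 2, 3, 2, 2, 2, 3, 3, 2),
  data = c(0.7773207, 0.9606180, 0.4346595, 0.7125147, 0.3999944,
           0.3253522, 0.7570871, 0.2026923, 0.7111212, 0.1216919)
)

# Case 1: the per-group calculation needs only one column, so ave() is enough
x$run <- ave(x$data, x$id, FUN = cumsum)

# Case 2: the calculation needs the whole subset (possibly several columns).
# center_in_group is a hypothetical example function; it must return a
# vector with one element per row of the subset it is given.
center_in_group <- function(d) d$data - mean(d$data)

pieces <- lapply(split(x, x$id), center_in_group)  # one vector per group
x$centered <- unsplit(pieces, x$id)                # back in original row order

print(x)
```

ave() only passes a vector to FUN, so it fits when one column is enough. The split()/unsplit() pair does the same job as the by() call plus the for loop, without the manual reassembly.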



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

