I have data that looks like this:

start  end  value
1      4    2
5      8    1
9      10   0

I want to transform the data so that it becomes:

startend  value
1         2
2         2
3         2
4         2
5         1
6         1
7         1
8         1
9         0
10        0

----
I've written a for loop that can do the transformation BUT I need to do this on very large datasets (millions of rows). Does anyone know of an R package that has a function that can do this transformation?

Any help is much appreciated!

Thanks!
--
View this message in context: http://r.789695.n4.nabble.com/help-with-simple-but-massive-data-transformation-tp2989850p2989850.html
Sent from the R help mailing list archive at Nabble.com.
ONKELINX, Thierry
2010-Oct-11 14:43 UTC
[R] help with simple but massive data transformation
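This should be easy with apply():

do.call(rbind, apply(dataset, 1, function(x) {
  # x is one row as a named vector: start, end, value
  # [[ ]] drops the names so they do not leak into the new data frame
  data.frame(startend = x[[1]]:x[[2]], value = x[[3]])
}))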
Untested!
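
For anyone who wants to check the row-by-row idea on the sample data from the question, here is a minimal sketch. It swaps apply() for lapply() over the row indices, which always returns a list, so do.call(rbind, ...) gets one data frame per input row. The helper name expand_row is made up for illustration, and the sketch assumes end >= start in every row.

```r
# Sample data from the original question
dataset <- data.frame(start = c(1, 5, 9),
                      end   = c(4, 8, 10),
                      value = c(2, 1, 0))

# Hypothetical helper: expand row i into one row per integer in start..end
expand_row <- function(i) {
  data.frame(startend = seq(dataset$start[i], dataset$end[i]),
             value    = dataset$value[i])
}

# lapply() always returns a list, so rbind() receives one data frame per row
expanded <- do.call(rbind, lapply(seq_len(nrow(dataset)), expand_row))
print(expanded)
```

This builds one small data frame per input row, so it is probably too slow for millions of rows; it is mainly useful for confirming the expected result.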
----------------------------------------------------------------------------
ir. Thierry Onkelinx
Research Institute for Nature and Forest
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data.
~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey
Gabor Grothendieck
2010-Oct-11 14:48 UTC
[R] help with simple but massive data transformation
On Mon, Oct 11, 2010 at 10:16 AM, clee <cheelee7 at gmail.com> wrote:
> I've written a for loop that can do the transformation BUT I need to do this
> on very large datasets (millions of rows). Does anyone know of an R package
> that has a function that can do this transformation?

A very similar question was just asked recently. See this:

https://stat.ethz.ch/pipermail/r-help/2010-October/255791.html

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
David Winsemius
2010-Oct-11 14:51 UTC
[R] help with simple but massive data transformation
On Oct 11, 2010, at 10:16 AM, clee wrote:
> I have data that looks like this:
>
> start end value
> 1 4 2
> 5 8 1
> 9 10 0
>
> I want to transform the data so that it becomes:

do.call("rbind", apply(dta, 1, function(.r)
    matrix(c(seq(.r[1], .r[2]),
             vals = rep(.r[3], .r[2] - .r[1] + 1)),
           ncol = 2)))

      [,1] [,2]
 [1,]    1    2
 [2,]    2    2
 [3,]    3    2
 [4,]    4    2
 [5,]    5    1
 [6,]    6    1
 [7,]    7    1
 [8,]    8    1
 [9,]    9    0
[10,]   10    0

David Winsemius, MD
West Hartford, CT
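
Since the question mentions millions of rows, a fully vectorized variant may be worth trying. The sketch below uses only base R (rep() and sequence()) and is not taken from any of the replies above. It assumes the columns are named start, end and value as in the question, that start and end are whole numbers, and that end >= start in every row. One thing to watch with the apply() version: if every range happened to have the same length, apply() would simplify its result to a matrix instead of a list, and do.call("rbind", ...) would then fail.

```r
# Sample data from the original question
dta <- data.frame(start = c(1, 5, 9),
                  end   = c(4, 8, 10),
                  value = c(2, 1, 0))

# Number of output rows produced by each input row (assumes end >= start)
n <- dta$end - dta$start + 1

# rep() repeats each start and value n times; sequence(n) yields 1..n per row,
# so start + offset - 1 walks from start to end inside each block
result <- data.frame(
  startend = rep(dta$start, n) + sequence(n) - 1,
  value    = rep(dta$value, n)
)
print(result)
```

No per-row objects are created here, so the cost is dominated by allocating the output vectors.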