Hi,

I have the following data frame, where COL2 is a start date and COL3 an end date:

COL1   COL2    COL3
A      40462   40482
B      40462   40478

I would like to split each time frame (up to 3 weeks here) into weeks, like this:

COL1   COL2    COL3    COL4
A      40462   40468   1
A      40469   40475   1
A      40476   40482   1
B      40462   40468   1
B      40469   40475   1
B      40476   40478   0.428

COL4 indicates whether the period between COL2 and COL3 is a full 7 days (value 1) or shorter (the fraction of a week it covers). In the example above, the last split for B contains only 3 days, so the value in COL4 is 3/7.

I can't figure out how to do this. Can someone help me out?

Thanks in advance,
Bert
Dear Bert,
Use the plyr package to do the magic
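library(plyr)
dataset <- data.frame(COL1 = c("A", "B"), COL2 = 40462,
                      COL3 = c(40482, 40478))
# For each COL1, build one COL4 value per week of the time frame
tmp <- ddply(dataset, "COL1", function(x){
    delta <- with(x, 1 + COL3 - COL2)     # number of days, both ends included
    rows <- rep(1, delta %/% 7)           # a 1 for every complete week
    if(delta %% 7 > 0){
        rows <- c(rows, (delta %% 7) / 7) # remaining days as a fraction of a week
    }
    data.frame(COL4 = rows)
})
# Attach COL4 to the original rows; COL2 and COL3 remain the overall start and end
merge(dataset, tmp)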
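Note that merge(dataset, tmp) above repeats the full COL2 and COL3 on every row, so the weekly start and end dates asked for in the question still have to be derived. A minimal base R sketch of one way to do that, assuming COL3 >= COL2 in every row; the names n_weeks, idx, week, start, end and res are illustrative and do not come from the thread:

```r
dataset <- data.frame(COL1 = c("A", "B"), COL2 = 40462,
                      COL3 = c(40482, 40478))

# Number of (possibly partial) weeks per row, counting both end days
n_weeks <- ceiling((dataset$COL3 - dataset$COL2 + 1) / 7)

# Repeat each row index once per week, and number the weeks 0, 1, 2, ...
idx  <- rep(seq_len(nrow(dataset)), n_weeks)
week <- sequence(n_weeks) - 1

# Weekly start, and weekly end capped at the overall end date
start <- dataset$COL2[idx] + 7 * week
end   <- pmin(start + 6, dataset$COL3[idx])

res <- data.frame(COL1 = dataset$COL1[idx], COL2 = start, COL3 = end,
                  COL4 = (end - start + 1) / 7)
res
```

Because everything is vectorised over the rows, this sketch does not need plyr and does not rely on COL1 being unique.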
HTH,
Thierry
----------------------------------------------------------------------------
ir. Thierry Onkelinx
Research Institute for Nature and Forest
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be
> -----Original message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On behalf of Bert Jacobs
> Sent: Monday 11 October 2010 11:26
> To: r-help at r-project.org
> Subject: [R] Split rows depending on time frame
On Mon, Oct 11, 2010 at 5:25 AM, Bert Jacobs <bert.jacobs at figurestofacts.be> wrote:
> Where COL4 is an identifier if the timeframe between COL2 and COL3 is
> exactly 7 days or shorter.
>
> In the example above for B the last split contains only 3 days so the value
> in COL 4 is 3/7

Try this:

DF <- data.frame(COL1 = c("A", "B"), COL2 = 40462, COL3 = c(40482, 40478))
do.call("rbind", by(DF, DF$COL1, function(x) with(x, {
    COL2 <- seq(COL2, COL3, 7)
    COL3 <- pmin(COL2 + 6, COL3)
    COL4 <- (COL3 - COL2 + 1) / 7
    data.frame(COL1, COL2, COL3, COL4)
})))

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com