Hi,

I have the following data frame, where COL2 is a start date and COL3 an end date:

COL1  COL2   COL3
A     40462  40482
B     40462  40478

I would like to split the above time frame of 3 weeks into weeks, like this:

COL1  COL2   COL3   COL4
A     40462  40468  1
A     40469  40475  1
A     40476  40482  1
B     40462  40468  1
B     40469  40475  1
B     40476  40478  0.428

COL4 indicates whether the time frame between COL2 and COL3 is exactly 7 days (value 1) or shorter (the fraction of a week covered). In the example above, the last split for B contains only 3 days, so the value in COL4 is 3/7.

I can't figure out how to do the above. Is there someone who can help me out?

Thanks in advance,
Bert

Dear Bert,

Use the plyr package to do the magic:

    library(plyr)
    dataset <- data.frame(COL1 = c("A", "B"), COL2 = 40462, COL3 = c(40482, 40478))
    tmp <- ddply(dataset, "COL1", function(x){
      # inclusive number of days between start and end
      delta <- with(x, 1 + COL3 - COL2)
      # one row with value 1 per full week
      rows <- rep(1, delta %/% 7)
      # plus one row holding the fraction of a week for any leftover days
      if(delta %% 7 > 0){
        rows <- c(rows, (delta %% 7) / 7)
      }
      data.frame(COL4 = rows)
    })
    # merge() joins on COL1, so every row repeats the original COL2 and COL3
    merge(dataset, tmp)

HTH,

Thierry

------------------------------------------------------------------------
ir. Thierry Onkelinx
Research Institute for Nature and Forest
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
Thierry.Onkelinx at inbo.be
www.inbo.be

> -----Original message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On behalf of Bert Jacobs
> Sent: Monday 11 October 2010 11:26
> To: r-help at r-project.org
> Subject: [R] Split rows depending on time frame
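
The ddply() reply above counts the weekly rows and fills COL4, but the weekly start and end days requested in the question are not computed. Following the same counting idea, a minimal base-R sketch of how those bounds might be derived without a grouping step is shown here. The variable names (delta, n, idx, k, weekly) are illustrative, and the sketch assumes COL3 is never earlier than COL2.

```r
dataset <- data.frame(COL1 = c("A", "B"), COL2 = 40462, COL3 = c(40482, 40478))

delta <- with(dataset, COL3 - COL2 + 1)   # inclusive length in days
n <- ceiling(delta / 7)                   # weekly rows needed per original row
idx <- rep(seq_len(nrow(dataset)), n)     # original row index for each new row
k <- sequence(n) - 1                      # week offset within each original row

start <- dataset$COL2[idx] + 7 * k        # first day of each weekly slice
end <- pmin(start + 6, dataset$COL3[idx]) # last day, capped at the original end

weekly <- data.frame(COL1 = dataset$COL1[idx], COL2 = start, COL3 = end,
                     COL4 = (end - start + 1) / 7)
print(weekly)
```

Because rep() and sequence() expand all rows at once, no per-group function call is needed; whether that is preferable to a grouped approach depends mostly on taste and data size.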

On Mon, Oct 11, 2010 at 5:25 AM, Bert Jacobs <bert.jacobs at figurestofacts.be> wrote:
> Where COL4 is an identifier if the timeframe between COL2 and COL3 is
> exactly 7 days or shorter.
>
> In the example above for B the last split contains only 3 days so the value
> in COL 4 is 3/7

Try this:

    DF <- data.frame(COL1 = c("A", "B"), COL2 = 40462, COL3 = c(40482, 40478))

    do.call("rbind", by(DF, DF$COL1, function(x) with(x, {
      # weekly start days, stepping by 7 from the original start
      COL2 <- seq(COL2, COL3, 7)
      # weekly end days, capped at the original end
      COL3 <- pmin(COL2 + 6, COL3)
      # fraction of a full week covered by each slice
      COL4 <- (COL3 - COL2 + 1) / 7
      data.frame(COL1, COL2, COL3, COL4)
    })))

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
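
One way to check a split like the one above is to confirm that the COL4 values of each group add up to the total length of the original time frame in weeks. The following is only a sketch of such a check: it reuses the by() code from the reply, and the names weekly, split_total and direct_total are introduced here for illustration.

```r
DF <- data.frame(COL1 = c("A", "B"), COL2 = 40462, COL3 = c(40482, 40478))

# Same split as in the reply above, stored in a variable.
weekly <- do.call("rbind", by(DF, DF$COL1, function(x) with(x, {
  COL2 <- seq(COL2, COL3, 7)
  COL3 <- pmin(COL2 + 6, COL3)
  COL4 <- (COL3 - COL2 + 1) / 7
  data.frame(COL1, COL2, COL3, COL4)
})))

# Total weeks per group, summed over the split rows ...
split_total <- tapply(weekly$COL4, weekly$COL1, sum)
# ... and computed directly from the original start and end days.
direct_total <- with(DF, setNames((COL3 - COL2 + 1) / 7, COL1))

# Stops with an error if the two totals disagree for any group.
stopifnot(isTRUE(all.equal(as.vector(split_total[names(direct_total)]),
                           as.vector(direct_total))))
```

all.equal() is used rather than == because the fractional weeks are floating-point values.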