ListeRs,

Within the last two months, I thought I saw mention of an R function that would create a new data frame composed of duplicates or multiple copies of rows of an input data frame, given one or several columns of values indicating how many times each row should be copied. As a simple example, given a data frame:

> in.df
  x y
1 A 1
2 B 2
3 C 3

"func.name (in.df, in.df$y)" would produce something like:

  x y
1 A 1
2 B 2
3 B 2
4 C 3
5 C 3
6 C 3

For the life of me, I can't remember what that function was called, nor can I find it using help.search or RSiteSearch on terms like "duplicate" or "copy". Perhaps it had another primary purpose, but this was a side-effect or secondary capability. Was I hallucinating, or does this exist as a function in base R? Or will I have to make one with "rep"?

Thanks in advance!

e.

-- 
Eric Archer, Ph.D.
NOAA-SWFSC
8604 La Jolla Shores Dr.
La Jolla, CA 92037
858-546-7121,7003(FAX)
eric.archer at noaa.gov

"Lighthouses are more helpful than churches." - Benjamin Franklin
"Cogita tute" - Think for yourself
Sundar Dorai-Raj
2006-Feb-10 20:45 UTC
[R] Creating multiple copies of rows in data frames
Eric Archer wrote:
> ListeRs,
>
> Within the last two months, I thought I saw mention of an R function
> that would create a new data frame composed of duplicates or multiple
> copies of rows of an input data frame given one or several columns of
> values indicating how many times each row should be copied. As a simple
> example, given a dataframe:
>
> > in.df
>   x y
> 1 A 1
> 2 B 2
> 3 C 3
>
> "func.name (in.df, in.df$y)" would produce something like:
>
>   x y
> 1 A 1
> 2 B 2
> 3 B 2
> 4 C 3
> 5 C 3
> 6 C 3
>
> For the life of me, I can't remember what that function was called, nor
> can I find it using help.search or RSiteSearch on terms like "duplicate"
> or "copy". Perhaps it had another primary purpose, but this was a
> side-effect or secondary capability. Was I hallucinating, or does this
> exist as a function in base R? Or, will I have to make one with "rep"?
> Thanks in advance!
>
> e.
>

As you mentioned, ?rep should be sufficient:

    in.df <- data.frame(x = LETTERS[1:3], y = 1:3)
    out.df <- in.df[rep(seq(nrow(in.df)), in.df$y), ]

Then, if you want to rename the row.names:

    row.names(out.df) <- seq(nrow(out.df))

--sundar
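
For anyone wanting the "func.name" from the original question, the rep() indexing above can be wrapped in a small helper. This is only a sketch: the name expand_rows and its argument names are made up for illustration and are not an existing base R function. It assumes the counts are non-negative whole numbers, one per row, and it uses seq_len() rather than seq(), because seq(0) returns c(1, 0) while seq_len(0) returns an empty index.

```r
# Hypothetical helper; the name expand_rows is not from the thread or base R.
# Repeats row i of `df` exactly times[i] times, using rep() on the row indices.
expand_rows <- function(df, times) {
  # Assumption: one non-negative count per row of df
  stopifnot(length(times) == nrow(df), all(times >= 0))

  # Build the repeated row index, e.g. counts 1, 2, 3 give 1 2 2 3 3 3
  idx <- rep(seq_len(nrow(df)), times = times)

  # drop = FALSE keeps the result a data frame even with a single column
  out <- df[idx, , drop = FALSE]

  # Reset row names to 1..n instead of "2", "2.1", "3", ...
  rownames(out) <- NULL
  out
}

in.df <- data.frame(x = LETTERS[1:3], y = 1:3)
out.df <- expand_rows(in.df, in.df$y)
print(out.df)
```

With the example data this gives the six rows A, B, B, C, C, C requested in the original post. A count of 0 simply drops that row, since rep() then emits no index for it.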