Kirsten Beyer
2007-Aug-22 20:28 UTC
[R] Need a variant of rbind for datasets with different numbers of columns
Hello. I am looking for a function that will allow me to paste rows together without regard for the numbers of columns in the datasets to be joined. The only columns where it matters if they are aligned correctly are at the beginning - the rest of the columns represent differing numbers of ICD9 (disease) codes reported by each person (record) at a health visit. They are in no particular order.

For example, a result would look like this:

patient    ICD91  ICD92  ICD93
patient A  12345  6789   1543
patient B  3469   9090
patient C  1234

I am trying to accomplish this inside a loop which first identifies the codes associated with the person and then joins them to the person. I have the code working so that it can create a row for each person, but I can't figure out how to join these rows together! FYI, my dataset has 200,000+ people.

Thanks
jim holtman
2007-Aug-23 00:26 UTC
[R] Need a variant of rbind for datasets with different numbers of columns
Where is the data coming from, since it has a variable number of columns in each row? Is it coming from a text file? If so, you can use the "fill=TRUE" option when reading to fill out empty columns. You need to provide at least a subset of the data so we can see what you are working with.

--
Jim Holtman
Cincinnati, OH

What is the problem you are trying to solve?
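A minimal sketch of the "fill=TRUE" suggestion, assuming the records sit in a whitespace-delimited text file and are read with read.table. The inline sample text and the single-letter patient labels are hypothetical stand-ins modelled on the example in the question, not the poster's actual data.

```r
# Hypothetical sample standing in for the text file; short rows simply
# have fewer ICD9 codes than the header has columns.
txt <- "patient ICD91 ICD92 ICD93
A 12345 6789 1543
B 3469 9090
C 1234"

# fill = TRUE pads rows that have fewer fields than the widest row with NA.
# For a real file, pass the file path instead of the text argument.
codes <- read.table(text = txt, header = TRUE, fill = TRUE)
print(codes)
```

Note that read.table works out the number of columns from the first five lines of input unless col.names is supplied, so if a wider row may appear later in the file it is safer to pass col.names explicitly.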
hadley wickham
2007-Aug-23 13:55 UTC
[R] Need a variant of rbind for datasets with different numbers of columns
You might try rbind.fill in the reshape package.

Hadley

--
http://had.co.nz/
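A rough sketch of the idea behind rbind.fill, written in base R so that it runs without extra packages: add any missing columns as NA, then bind the rows. This is an illustration under assumptions, not the package's implementation. The one-row data frames and the helper name pad are hypothetical. The optional last step assumes the plyr package, which also exports rbind.fill, is installed.

```r
# Hypothetical one-row data frames, one per patient, as a loop might build them
rows <- list(
  data.frame(patient = "A", ICD91 = 12345, ICD92 = 6789, ICD93 = 1543),
  data.frame(patient = "B", ICD91 = 3469, ICD92 = 9090),
  data.frame(patient = "C", ICD91 = 1234)
)

# Union of all column names, in order of first appearance
all_cols <- unique(unlist(lapply(rows, names)))

# pad() is a hypothetical helper: add each missing column as NA,
# then put the columns in a common order
pad <- function(df) {
  for (col in setdiff(all_cols, names(df))) df[[col]] <- NA
  df[all_cols]
}

# Bind once at the end rather than growing a data frame inside the loop
combined <- do.call(rbind, lapply(rows, pad))
print(combined)

# If plyr is available, rbind.fill accepts the list of data frames directly
if (requireNamespace("plyr", quietly = TRUE)) {
  print(plyr::rbind.fill(rows))
}
```

With 200,000+ records, collecting the per-person rows in a list and binding them in a single call is likely to be much faster than calling rbind on every pass through the loop.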