Kate Ignatius
2014-Aug-16 19:42 UTC
[R] data.table/ifelse conditional new variable question
Hi,

I have a data.table question (as well as an if-else statement query).

I have a large list of families (the file has 935 individuals, sorted by family, with families of varying sizes). At the moment the file has the columns:

    SampleID FamilyID Relationship

To avoid having to make a pedigree file by hand, i.e. adding a PaternalID and a MaternalID one by one, I want to try to write a script that will quickly do this for me (I eventually want to run this through a program such as plink). Is there a way to use data.table (maybe in conjunction with ifelse) to do this effectively?

An example of the file is something like:

    Family.ID  Sample.ID  Relationship
    14         62         sibling
    14         94         father
    14         63         sibling
    14         59         mother
    17         6004       father
    17         6003       mother
    17         6005       sibling
    17         368        sibling
    130        202        mother
    130        203        father
    130        204        sibling
    130        205        sibling
    130        206        sibling
    222        9          mother
    222        45         sibling
    222        34         sibling
    222        10         sibling
    222        11         sibling
    222        18         father

But the goal is to have a file like this:

    Family.ID  Sample.ID  Relationship  PID   MID
    14         62         sibling       94    59
    14         94         father        0     0
    14         63         sibling       94    59
    14         59         mother        0     0
    17         6004       father        0     0
    17         6003       mother        0     0
    17         6005       sibling       6004  6003
    17         368        sibling       6004  6003
    130        202        mother        0     0
    130        203        father        0     0
    130        204        sibling       203   202
    130        205        sibling       203   202
    130        206        sibling       203   202
    222        9          mother        0     0
    222        45         sibling       18    9
    222        34         sibling       18    9
    222        10         sibling       18    9
    222        11         sibling       18    9
    222        18         father        0     0

I've tried searching for this but with no luck. I'd greatly appreciate any help, even if it's just a link to a great example/solution!

Thanks!
Jorge I Velez
2014-Aug-16 22:48 UTC
[R] data.table/ifelse conditional new variable question
Dear Kate,

Assuming you have nuclear families, one option would be:

    # Read the example data (the text= argument avoids opening a connection
    # that would have to be closed afterwards)
    x <- read.table(text = "Family.ID Sample.ID Relationship
    14 62 sibling
    14 94 father
    14 63 sibling
    14 59 mother
    17 6004 father
    17 6003 mother
    17 6005 sibling
    17 368 sibling
    130 202 mother
    130 203 father
    130 204 sibling
    130 205 sibling
    130 206 sibling
    222 9 mother
    222 45 sibling
    222 34 sibling
    222 10 sibling
    222 11 sibling
    222 18 father", header = TRUE)

    # One data frame per family
    xs <- with(x, split(x, Family.ID))

    res <- do.call(rbind, lapply(xs, function(l) {
      # Parents keep 0 for both IDs
      l$PID <- l$MID <- 0
      father <- with(l, Relationship == 'father')
      mother <- with(l, Relationship == 'mother')
      # Siblings get the Sample.ID of the family's father and mother;
      # this requires exactly one father and one mother row per family
      l$PID[l$Relationship == 'sibling'] <- l$Sample.ID[father]
      l$MID[l$Relationship == 'sibling'] <- l$Sample.ID[mother]
      l
    }))
    res

HTH,
Jorge.-
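Since the original question asked about data.table and ifelse, the same per-family lookup can also be sketched along those lines. This is only a minimal sketch, not something posted in the thread: it assumes the data.table package is installed, that each family has at most one father and one mother row, and it uses a hypothetical object name `ped` with two of the example families.

```r
library(data.table)  # assumption: data.table is installed

# Two of the example families; `ped` is a hypothetical name
ped <- as.data.table(read.table(text = "Family.ID Sample.ID Relationship
14 62 sibling
14 94 father
14 63 sibling
14 59 mother
222 9 mother
222 45 sibling
222 34 sibling
222 10 sibling
222 11 sibling
222 18 father", header = TRUE))

# Within each family, siblings get the father's / mother's Sample.ID;
# every other row gets 0. The [1] yields NA when a parent row is missing.
ped[, `:=`(
  PID = ifelse(Relationship == "sibling",
               Sample.ID[Relationship == "father"][1], 0L),
  MID = ifelse(Relationship == "sibling",
               Sample.ID[Relationship == "mother"][1], 0L)
), by = Family.ID]

ped[]
```

If a family lacks a father or mother row, the siblings' PID or MID comes out as NA here rather than raising an error, so those rows would still need to be checked before the file is passed on to another program.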