I have data in the following form:

  ID  COUPON0  COUPON1  COUPON2  COUPON3
   1        1     1000     1001     1002
   2        2       NA       NA       NA
   3     1000     1003       NA     1004
   4     1001       NA     1005       NA
   5     1002       NA       NA       NA
  12     1003       NA       NA     1006
   7     1005       NA       NA       NA
   8     1004     1007       NA       NA
   9     1006       NA       NA       NA
  26     1007       NA       NA       NA

I would like to convert this into an adjacency matrix like the following:

       1   2   3   4   5  12   7   8   9  26
   1   0   0   1   1   1   0   0   0   0   0
   2   0   0   0   0   0   0   0   0   0   0
   3   0   0   0   0   0   1   0   1   0   0
   4   0   0   0   0   0   0   1   0   0   0
   5   0   0   0   0   0   0   0   0   0   0
  12   0   0   0   0   0   0   0   0   1   0
   7   0   0   0   0   0   0   0   0   0   0
   8   0   0   0   0   0   0   0   0   0   1
   9   0   0   0   0   0   0   0   0   0   0
  26   0   0   0   0   0   0   0   0   0   0

The actual data contains about 570 rows and 7 "coupon" columns.
COUPON0 is a unique coupon number submitted by each participant.
COUPON1-COUPON7 are unique coupon numbers distributed to other
participants. About 15 participants were "seeds" who distributed coupon
numbers but did not receive a coupon from another participant. Many
participants (including some seeds) did not distribute any coupons.

Any ideas about how to make this conversion would be greatly appreciated.

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 452-1424 (M, W, F)
fax: (917) 438-0894

The generalized inner product function which I posted in response to
another query earlier today:

  https://www.stat.math.ethz.ch/pipermail/r-help/2006-February/087120.html

can also solve this problem:

  f <- function(x, y) length(na.omit(unlist(intersect(x,y)))) > 0
  inner(dd[,-1], dd[,-1], f) - diag(nrow(dd))  # inner from cited post

On 2/18/06, Chuck Cleland <ccleland at optonline.net> wrote:
> I have data in the following form:
> [...]
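
For readers without the cited post to hand, the pairwise idea can be sketched in a self-contained way. This is not the `inner` function from that post, whose definition is not reproduced here: `pairwise_rows` and `gave_to` are hypothetical names introduced only for illustration, and `dd` is assumed to be the example data from the original message. Note that the test `f` above is symmetric in its arguments, whereas this sketch uses a directional test so that the result matches the requested matrix.

```r
# Example data from the original post, stored as dd (assumed name)
dd <- data.frame(
  ID      = c(1, 2, 3, 4, 5, 12, 7, 8, 9, 26),
  COUPON0 = c(1, 2, 1000, 1001, 1002, 1003, 1005, 1004, 1006, 1007),
  COUPON1 = c(1000, NA, 1003, NA, NA, NA, NA, 1007, NA, NA),
  COUPON2 = c(1001, NA, NA, 1005, NA, NA, NA, NA, NA, NA),
  COUPON3 = c(1002, NA, 1004, NA, NA, 1006, NA, NA, NA, NA)
)

# Hypothetical helper (NOT the inner() from the cited post):
# apply f to every ordered pair (row i of a, row j of b)
pairwise_rows <- function(a, b, f) {
  out <- matrix(0L, nrow(a), nrow(b))
  for (i in seq_len(nrow(a))) {
    for (j in seq_len(nrow(b))) {
      out[i, j] <- as.integer(f(a[i, ], b[j, ]))
    }
  }
  out
}

# Directional test: did the participant in row x distribute the coupon
# that the participant in row y submitted (first element, COUPON0)?
gave_to <- function(x, y) y[1] %in% x[-1]

m <- as.matrix(dd[, -1])  # drop ID, keep COUPON0..COUPON3
adj2 <- pairwise_rows(m, m, gave_to)
dimnames(adj2) <- list(as.character(dd$ID), as.character(dd$ID))
adj2
```

Because no participant distributes their own COUPON0, the diagonal is already zero and nothing needs to be subtracted. The double loop makes one call to `gave_to` per pair of rows, which should be manageable for about 570 rows.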

> I have data in the following form:
> [...]
> I would like to convert this into an adjacency matrix like the following:

Here's one solution:

  a <- data.frame(
    ID      = c(1, 2, 3, 4, 5, 12, 7, 8, 9, 26),
    COUPON0 = c(1, 2, 1000, 1001, 1002, 1003, 1005, 1004, 1006, 1007),
    COUPON1 = c(1000, NA, 1003, NA, NA, NA, NA, 1007, NA, NA),
    COUPON2 = c(1001, NA, NA, 1005, NA, NA, NA, NA, NA, NA),
    COUPON3 = c(1002, NA, 1004, NA, NA, 1006, NA, NA, NA, NA)
  )
  names(a) <- tolower(names(a))

  # Make a look up table from coupon to id
  coupontoid <- a$id
  names(coupontoid) <- a$coupon0

  # Convert a to a more normal (in the sense of
  # database normalisation) form
  # (many other ways to do this)
  library(reshape)
  am <- melt(a, id=c("id", "coupon0"))

  # Remap coupon number to id
  map <- data.frame(src=am$id, dest=coupontoid[as.character(am$value)])

  # Convert to adjacency matrix
  xtabs(~ src + dest, map)

  # Force all levels to display
  map$src <- factor(map$src, levels=unique(a$id))
  map$dest <- factor(map$dest, levels=unique(a$id))
  xtabs(~ src + dest, map)

Regards,

Hadley
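
The same coupon-to-id lookup can also be sketched in base R alone, in case the reshape package is not available. This is only one possible variant and assumes, as in the example data, that every coupon0 value is unique and that all columns other than id and coupon0 hold distributed coupons; the names `given`, `src`, `dest` and `adj` are introduced here for illustration.

```r
# Example data from the original post (lower-case names)
a <- data.frame(
  id      = c(1, 2, 3, 4, 5, 12, 7, 8, 9, 26),
  coupon0 = c(1, 2, 1000, 1001, 1002, 1003, 1005, 1004, 1006, 1007),
  coupon1 = c(1000, NA, 1003, NA, NA, NA, NA, 1007, NA, NA),
  coupon2 = c(1001, NA, NA, 1005, NA, NA, NA, NA, NA, NA),
  coupon3 = c(1002, NA, 1004, NA, NA, 1006, NA, NA, NA, NA)
)

# Start from an all-zero matrix labelled by participant id
n <- nrow(a)
ids <- as.character(a$id)
adj <- matrix(0L, n, n, dimnames = list(ids, ids))

# Distributed coupons: every column except id and coupon0
given <- as.matrix(a[, setdiff(names(a), c("id", "coupon0"))])

# For each distributed coupon, the row whose coupon0 equals it
# (NA where the coupon is missing or was never submitted)
dest <- match(given, a$coupon0)
# Row index of the distributor, in the same column-major order
src <- as.vector(row(given))

# Mark an edge from distributor to recipient
ok <- !is.na(dest)
adj[cbind(src[ok], dest[ok])] <- 1L
adj
```

Indexing the matrix with a two-column (row, column) matrix sets all edges in one assignment, so no explicit loop over the 570 or so rows should be needed.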