I have data in the following form:
ID  COUPON0  COUPON1  COUPON2  COUPON3
 1        1     1000     1001     1002
 2        2       NA       NA       NA
 3     1000     1003       NA     1004
 4     1001       NA     1005       NA
 5     1002       NA       NA       NA
12     1003       NA       NA     1006
 7     1005       NA       NA       NA
 8     1004     1007       NA       NA
 9     1006       NA       NA       NA
26     1007       NA       NA       NA
I would like to convert this into an adjacency matrix like the following:
    1  2  3  4  5 12  7  8  9 26
 1  0  0  1  1  1  0  0  0  0  0
 2  0  0  0  0  0  0  0  0  0  0
 3  0  0  0  0  0  1  0  1  0  0
 4  0  0  0  0  0  0  1  0  0  0
 5  0  0  0  0  0  0  0  0  0  0
12  0  0  0  0  0  0  0  0  1  0
 7  0  0  0  0  0  0  0  0  0  0
 8  0  0  0  0  0  0  0  0  0  1
 9  0  0  0  0  0  0  0  0  0  0
26  0  0  0  0  0  0  0  0  0  0
The actual data contains about 570 rows and 7 "coupon" columns.
COUPON0 is a unique coupon number submitted by each participant.
COUPON1-COUPON7 are unique coupon numbers distributed to other
participants. About 15 participants were "seeds" who distributed coupon
numbers but did not receive a coupon from another participant. Many
participants (including some seeds) did not distribute any coupons.
Any ideas about how to make this conversion would be greatly appreciated.
--
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 452-1424 (M, W, F)
fax: (917) 438-0894
The generalized inner product function which I posted in response to
another query earlier today:

https://www.stat.math.ethz.ch/pipermail/r-help/2006-February/087120.html

can also solve this problem:

f <- function(x, y) length(na.omit(unlist(intersect(x,y)))) > 0
inner(dd[,-1], dd[,-1], f) - diag(nrow(dd)) # inner from cited post
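For readers without the cited post at hand, here is a minimal, self-contained sketch of the same pairwise idea. It does not use the inner function from that post: pairwise_apply is a hypothetical helper written only for this example, and the predicate gives is an assumed directed variant that asks whether the COUPON0 of row j appears among the coupons distributed by row i, so the result has the orientation of the requested matrix.

```r
# Example data from the original post
dd <- data.frame(
  ID      = c(1, 2, 3, 4, 5, 12, 7, 8, 9, 26),
  COUPON0 = c(1, 2, 1000, 1001, 1002, 1003, 1005, 1004, 1006, 1007),
  COUPON1 = c(1000, NA, 1003, NA, NA, NA, NA, 1007, NA, NA),
  COUPON2 = c(1001, NA, NA, 1005, NA, NA, NA, NA, NA, NA),
  COUPON3 = c(1002, NA, 1004, NA, NA, 1006, NA, NA, NA, NA)
)

# Hypothetical helper (not the 'inner' from the cited post):
# apply f to every pair (row i of a, row j of b)
pairwise_apply <- function(a, b, f) {
  out <- matrix(0, nrow(a), nrow(b))
  for (i in seq_len(nrow(a))) {
    for (j in seq_len(nrow(b))) {
      out[i, j] <- f(a[i, ], b[j, ])
    }
  }
  out
}

# Directed predicate (assumption): 1 if the coupon submitted by y
# (its first element, COUPON0) is among the coupons distributed by x
gives <- function(x, y) as.numeric(y[1] %in% x[-1])

coupons <- as.matrix(dd[, -1])          # drop the ID column
adj2 <- pairwise_apply(coupons, coupons, gives)
dimnames(adj2) <- list(dd$ID, dd$ID)    # label rows and columns by ID
adj2
```

The double loop makes n^2 calls to the predicate, which should be acceptable for roughly 570 rows but is not meant to be fast.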
Here's one solution:

a <- data.frame(
  ID = c(1, 2, 3, 4, 5, 12, 7, 8, 9, 26),
  COUPON0 = c(1, 2, 1000, 1001, 1002, 1003, 1005, 1004, 1006, 1007),
  COUPON1 = c(1000, NA, 1003, NA, NA, NA, NA, 1007, NA, NA),
  COUPON2 = c(1001, NA, NA, 1005, NA, NA, NA, NA, NA, NA),
  COUPON3 = c(1002, NA, 1004, NA, NA, 1006, NA, NA, NA, NA)
)
names(a) <- tolower(names(a))

# Make a look up table from coupon to id
coupontoid <- a$id
names(coupontoid) <- a$coupon0

# Convert a to a more normal (in the sense of
# database normalisation) form
# (many other ways to do this)
library(reshape)
am <- melt(a, id=c("id", "coupon0"))

# Remap coupon number to id
map <- data.frame(src=am$id, dest=coupontoid[as.character(am$value)])

# Convert to adjacency matrix
xtabs(~ src + dest, map)

# Force all levels to display
map$src <- factor(map$src, levels=unique(a$id))
map$dest <- factor(map$dest, levels=unique(a$id))
xtabs(~ src + dest, map)

Regards,

Hadley
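The same lookup-and-remap steps can also be sketched in base R alone, without the reshape package. This is only an illustrative alternative, not the method from either reply: the names dd, ids, given, src, dest and adj are chosen for this example, and it assumes every value in COUPON0 is unique, as stated in the original post.

```r
# Example data from the original post
dd <- data.frame(
  ID      = c(1, 2, 3, 4, 5, 12, 7, 8, 9, 26),
  COUPON0 = c(1, 2, 1000, 1001, 1002, 1003, 1005, 1004, 1006, 1007),
  COUPON1 = c(1000, NA, 1003, NA, NA, NA, NA, 1007, NA, NA),
  COUPON2 = c(1001, NA, NA, 1005, NA, NA, NA, NA, NA, NA),
  COUPON3 = c(1002, NA, 1004, NA, NA, 1006, NA, NA, NA, NA)
)

# Empty adjacency matrix with IDs as row and column labels
ids <- as.character(dd$ID)
adj <- matrix(0L, nrow(dd), nrow(dd), dimnames = list(ids, ids))

# Distributed coupons only (drop ID and COUPON0)
given <- as.matrix(dd[, -(1:2)])
used  <- !is.na(given)

# Row index of the distributing participant for each coupon handed out
src  <- row(given)[used]
# Row index of the participant who submitted that coupon as COUPON0
dest <- match(given[used], dd$COUPON0)

# Skip coupons that were handed out but never submitted by anyone
ok <- !is.na(dest)
adj[cbind(src[ok], dest[ok])] <- 1L
adj
```

Indexing the matrix with a two-column matrix of (row, column) pairs sets all edges in one assignment, and every ID appears in the result even if it has no edges.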