I have data that looks like this:

Friend1, Friend2
A, B
A, C
B, A
C, D

And I'd like to generate some more rows and another column. In the new column I'd like to add a 1 beside all the existing rows. That bit's easy enough.

Then I'd like to add rows for all the possible directed combinations of rows not included in the existing data. So for the above I think that would be:

A, D
D, A
B, C
C, B
B, D
C, A
D, B
D, C

and then put a 0 in the column beside these.

Can anyone suggest how to do this?

I'm using R version 2.15.3.

Thank you,

Thomas Chesney

This message and any attachment are intended solely for the addressee and may contain confidential information. If you have received this message in error, please send it back to me, and immediately delete it. Please do not use, copy or disclose the information contained in this message or in any attachment. Any views or opinions expressed by the author of this email do not necessarily reflect the views of the University of Nottingham.

This message has been checked for viruses but the contents of an attachment may still contain software viruses which could damage your computer system, you are advised to perform your own checks. Email communications with the University of Nottingham may be monitored as permitted by UK legislation.
You could use the data.table package:

require(data.table)
DT <- data.table(Friend1 = sample(LETTERS, 10, replace = TRUE),
                 Friend2 = sample(LETTERS, 10, replace = TRUE),
                 Indicator = 1)
# every combination of the values seen in each column
ALL <- data.table(unique(expand.grid(DT)))
setkey(ALL)
# not-join: keep only the combinations that are not already in DT
OTHERS <- ALL[!DT]
OTHERS[, Indicator := 0]
# existing rows (Indicator 1) followed by the missing combinations (Indicator 0)
RESULT <- rbind(DT, OTHERS)

Best
Simon
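As a rough sketch, the same not-join idea can be applied to the four rows from the question rather than to random letters. It assumes a data.table version that provides `CJ()` and the `on=` argument, that the names should be pooled across both columns, and that self-pairs such as A, A are not wanted; `friends` is a name introduced here for illustration only.

```r
library(data.table)

# The four directed pairs from the question, flagged with 1.
DT <- data.table(Friend1 = c("A", "A", "B", "C"),
                 Friend2 = c("B", "C", "A", "D"),
                 Indicator = 1)

# Assumption: every name seen in either column may appear in both columns.
friends <- sort(unique(c(DT$Friend1, DT$Friend2)))

# All ordered pairs, without self-pairs such as (A, A).
ALL <- CJ(Friend1 = friends, Friend2 = friends)[Friend1 != Friend2]

# Not-join: pairs that are absent from the existing data get a 0.
OTHERS <- ALL[!DT, on = c("Friend1", "Friend2")]
OTHERS[, Indicator := 0]

RESULT <- rbind(DT, OTHERS)
```

With four names there are twelve ordered pairs, so `RESULT` should hold the four existing rows plus the eight rows listed in the question.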
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Hi Thomas,
It depends on whether every name should be allowed to appear in both columns.
To build the pairs from all the names found in either column, you could try
something like this:
isAllDifferent <- function(z) !any(duplicated(z))
myData <- data.frame(Friend1 = c("a", "a", "b", "c"),
                     Friend2 = c("b", "c", "a", "d"),
                     stringsAsFactors = FALSE)
# everyone who appears in either column
friends <- unique(unlist(myData, use.names = FALSE))
# all ordered pairs of friends, then drop self-pairs such as (a, a)
allCombs <- do.call(expand.grid, rep(list(friends), ncol(myData)))
colnames(allCombs) <- colnames(myData)
allCombs <- allCombs[apply(allCombs, 1, isAllDifferent), ]
# 1 if the pair is in the original data, 0 otherwise
output <- cbind(allCombs,
                included = 1 * (do.call(paste, allCombs) %in% do.call(paste, myData)))
output[order(output$included, decreasing = TRUE), ]
   Friend1 Friend2 included
2        b       a        1
5        a       b        1
9        a       c        1
15       c       d        1
3        c       a        0
4        d       a        0
7        c       b        0
8        d       b        0
10       b       c        0
12       d       c        0
13       a       d        0
14       b       d        0
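For exactly two columns, the row-wise `apply()` can probably be swapped for a vectorised comparison. The sketch below does that and then checks the 0-rows against the eight pairs listed in the question (lower-cased to match the data above). The names `absent` and `expected` are introduced here for illustration, and the pasted keys assume the names contain no spaces.

```r
myData <- data.frame(Friend1 = c("a", "a", "b", "c"),
                     Friend2 = c("b", "c", "a", "d"),
                     stringsAsFactors = FALSE)

# Everyone who appears in either column.
friends <- unique(unlist(myData, use.names = FALSE))

# All ordered pairs; the vectorised != drops self-pairs without apply().
allCombs <- expand.grid(Friend1 = friends, Friend2 = friends,
                        stringsAsFactors = FALSE)
allCombs <- allCombs[allCombs$Friend1 != allCombs$Friend2, ]

# Flag the pairs that already exist in the data.
allKeys <- paste(allCombs$Friend1, allCombs$Friend2)
allCombs$included <- 1 * (allKeys %in% paste(myData$Friend1, myData$Friend2))

# Compare the missing pairs with the list given in the question.
absent <- allKeys[allCombs$included == 0]
expected <- c("a d", "d a", "b c", "c b", "b d", "c a", "d b", "d c")
matchesQuestion <- setequal(absent, expected)
```

This is only a shortcut for the two-column case; the `isAllDifferent()` version generalises to more columns.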
If each column should only contain the values already present in that column,
you could try something like this (note that it keeps self-pairs such as a, a
and never puts d in Friend1):
myData <- data.frame(Friend1 = c("a", "a", "b", "c"),
                     Friend2 = c("b", "c", "a", "d"),
                     new = 1)
newData <- expand.grid(Friend1 = unique(myData$Friend1),
                       Friend2 = unique(myData$Friend2))
output <- merge(myData, newData, all = TRUE)
output$new[is.na(output$new)] <- 0
output
   Friend1 Friend2 new
1        a       a   0
2        a       b   1
3        a       c   1
4        a       d   0
5        b       a   1
6        b       b   0
7        b       c   0
8        b       d   0
9        c       a   0
10       c       b   0
11       c       c   0
12       c       d   1
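To see how this differs from the list in the question, here is a small sketch that drops the self-pairs from the merged result and checks which names can start a pair. `noSelf` is a name introduced here for illustration, and `stringsAsFactors = FALSE` is added as an assumption so the comparison works on plain character columns.

```r
myData <- data.frame(Friend1 = c("a", "a", "b", "c"),
                     Friend2 = c("b", "c", "a", "d"),
                     new = 1,
                     stringsAsFactors = FALSE)

# Per-column grid: Friend1 only ranges over a, b, c; Friend2 over a, b, c, d.
newData <- expand.grid(Friend1 = unique(myData$Friend1),
                       Friend2 = unique(myData$Friend2),
                       stringsAsFactors = FALSE)

output <- merge(myData, newData, all = TRUE)
output$new[is.na(output$new)] <- 0

# Remove self-pairs such as (a, a), which the question does not list.
noSelf <- output[output$Friend1 != output$Friend2, ]

# d never appears as Friend1, so pairs starting with d cannot be produced.
dCanStartPair <- "d" %in% noSelf$Friend1
```

Here `noSelf` has nine rows rather than twelve, because the three pairs starting with d are out of reach when each column keeps only its own values.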
I hope this helps.
Best wishes
Chris
Chris Campbell, PhD
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Thomas
Sent: 01 November 2013 09:32
To: r-help at r-project.org
Subject: [R] Combinations of values in two columns
Hi,
You may try:
dat1 <- read.table(text="
Friend1,Friend2
A,B
A,C
B,A
C,D",sep=",",header=TRUE,stringsAsFactors=FALSE)
indx <- as.vector(outer(unique(dat1[,1]), unique(dat1[,2]), paste))
res <- cbind(setNames(read.table(text=indx, sep="", header=FALSE, stringsAsFactors=FALSE),
                      paste0("Friend", 1:2)),
             New=1*(indx %in% as.character(interaction(dat1, sep=" "))))
A.K.
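The `outer()` call above pairs the unique values of each column separately. If the names should instead be pooled across both columns and self-pairs left out, as the list in the question suggests, a variation might look like the sketch below. The names `friends`, `pairs` and `parts` are introduced here for illustration, and the split on a space assumes no name contains one.

```r
dat1 <- data.frame(Friend1 = c("A", "A", "B", "C"),
                   Friend2 = c("B", "C", "A", "D"),
                   stringsAsFactors = FALSE)

# Pool the names from both columns.
friends <- sort(unique(unlist(dat1, use.names = FALSE)))

# Square matrix of "X Y" labels; the diagonal holds the self-pairs.
pairs <- outer(friends, friends, paste)
indx <- pairs[row(pairs) != col(pairs)]

# Split each label back into its two names.
parts <- strsplit(indx, " ", fixed = TRUE)
res <- data.frame(Friend1 = vapply(parts, `[`, character(1), 1),
                  Friend2 = vapply(parts, `[`, character(1), 2),
                  New = 1 * (indx %in% paste(dat1$Friend1, dat1$Friend2)),
                  stringsAsFactors = FALSE)
```

Indexing the matrix with `row(pairs) != col(pairs)` is just one way to drop the diagonal; filtering the finished data frame on `Friend1 != Friend2` would do the same job.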