I stumbled onto this working on an update to coxph. The last 6 lines
below are the question, the rest create a test data set.

tmt585% R
R version 2.12.2 (2011-02-25)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-unknown-linux-gnu (64-bit)

# Lines of code from survival/tests/singtest.R
> library(survival)
Loading required package: splines
> test1 <- data.frame(time=  c(4, 3,1,1,2,2,3),
+                     status=c(1,NA,1,0,1,1,0),
+                     x=     c(0, 2,1,1,1,0,0))
>
> temp <- rep(0:3, rep(7,4))
>
> stest <- data.frame(start  = 10*temp,
+                     stop   = 10*temp + test1$time,
+                     status = rep(test1$status,4),
+                     x      = c(test1$x+ 1:7, rep(test1$x,3)),
+                     epoch  = rep(1:4, rep(7,4)))
>
> fit1 <- coxph(Surv(start, stop, status) ~ x * factor(epoch), stest)

## New lines
> temp1 <- fit1$linear.predictor
> temp2 <- as.matrix(temp1)
> match(temp1, unique(temp1))
[1] 1 2 3 4 4 5 6 7 7 7 6 6 6 8 8 8 6 6 6 9 9 9 6 6
> match(temp2, unique(temp2))
[1]  1  2  3  4  4  5  6  7  7  7  6  6  6 NA NA NA  6  6  6  8  8  8  6  6

-----------------------

I've solved it for my code by not calling match on a 1-column matrix.
In general, however, should I be using some other paradigm for this "map
to unique" operation? For example match(as.character(x),
unique(as.character(x))) ?

Terry T
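The workaround described above (do not call match on a 1-column matrix) can be sketched without the survival package. This is only an illustration: `lp` is a hypothetical stand-in for fit1$linear.predictor, built so that its first two values differ only in the last bit of the double.

```r
# Hypothetical stand-in for the linear predictor (an assumption, not the
# original data): lp[2] is the next representable double after lp[1].
lp <- c(1, 1 + 2e-16, 2, 1)

# "Map to unique" on a plain numeric vector: each element is mapped to the
# position of its value among the distinct values, compared exactly.
idx_vec <- match(lp, unique(lp))

# For a one-column matrix, drop the dim attribute first so that unique()
# uses the default (vector) method instead of unique.matrix().
m <- as.matrix(lp)
v <- as.vector(m)
idx_mat <- match(v, unique(v))

print(idx_vec)
print(idx_mat)
```

Both index vectors agree, because the matrix is turned back into a plain vector before unique() is called.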
On Wed, Mar 09, 2011 at 08:48:10AM -0600, Terry Therneau wrote:
> I stumbled onto this working on an update to coxph. The last 6 lines
> below are the question, the rest create a test data set.
[...]
> ## New lines
> > temp1 <- fit1$linear.predictor
> > temp2 <- as.matrix(temp1)
> > match(temp1, unique(temp1))
> [1] 1 2 3 4 4 5 6 7 7 7 6 6 6 8 8 8 6 6 6 9 9 9 6 6
> > match(temp2, unique(temp2))
> [1]  1  2  3  4  4  5  6  7  7  7  6  6  6 NA NA NA  6  6  6  8  8  8  6  6
>
> -----------------------
>
> I've solved it for my code by not calling match on a 1-column matrix.
> In general, however, should I be using some other paradigm for this "map
> to unique" operation? For example match(as.character(x),
> unique(as.character(x))) ?

Let me suggest an alternative, which is consistent with unique() on
numeric vectors and uses a transformation of the column using rank().
For example,

  temp3 <- as.matrix(rank(temp1, ties.method="max"))
  match(temp3, unique(temp3))

  [1] 1 2 3 4 4 5 6 7 7 7 6 6 6 8 8 8 6 6 6 9 9 9 6 6

Can this be used in your code?

Petr Savicky.
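A small sketch of the rank() idea on made-up data may help. The vector `v` below is hypothetical (it is not the coxph output), and the comparison with match() on the plain vector is added only as a check.

```r
# Hypothetical values (an assumption for illustration): v[2] differs from
# v[1] only in the last bit of the double representation.
v <- c(1, 1 + 2e-16, 1, 3)

# Tied values share one rank and distinct values get distinct ranks, so the
# ranks are small whole numbers that identify each distinct value.
r <- rank(v, ties.method = "max")

# The ranks survive a conversion to text, so a one-column matrix of ranks
# can be passed to unique() and match().
r_mat <- as.matrix(r)
idx_rank <- match(r_mat, unique(r_mat))

# Reference result: exact comparison on the plain numeric vector.
idx_direct <- match(v, unique(v))

print(idx_rank)
print(idx_direct)
```

The two index vectors coincide here, which is the sense in which the rank transformation is consistent with unique() on a numeric vector.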
Simon Urbanek
2011-Mar-09 19:11 UTC
[Rd] unique.matrix issue [Was: Anomaly with unique and match]
match() is a red herring here -- it is really a very specific thing that
has to do with the fact that you're running unique() on a matrix. Also
it's much easier to reproduce:

> x=c(1,1+0.2e-15)
> x
[1] 1 1
> sprintf("%a",x)
[1] "0x1p+0"               "0x1.0000000000001p+0"
> unique(x)
[1] 1 1
> sprintf("%a",unique(x))
[1] "0x1p+0"               "0x1.0000000000001p+0"
> unique(matrix(x,2))
     [,1]
[1,]    1

This comes from the fact that unique.matrix uses a string representation:
since it has to take into account all values of a row/column, it pastes
all values into one string, and for these two numbers that string is the
same:

> as.character(x)
[1] "1" "1"

Cheers,
Simon

On Mar 9, 2011, at 9:48 AM, Terry Therneau wrote:
> I stumbled onto this working on an update to coxph. The last 6 lines
> below are the question, the rest create a test data set.
[...]
> I've solved it for my code by not calling match on a 1-column matrix.
> In general, however, should I be using some other paradigm for this "map
> to unique" operation? For example match(as.character(x),
> unique(as.character(x))) ?
>
> Terry T
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
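The string-representation point also bears on the match(as.character(x), ...) idea from the original question. The sketch below is an illustration only: the names `key15` and `key17` are made up, and "%.17g" is one possible format that keeps enough digits to tell the two doubles apart, not something proposed in the thread.

```r
# Two doubles that differ only in the last bit.
x <- c(1, 1 + 2e-16)
stopifnot(x[1] != x[2])

# as.character() keeps about 15 significant digits, so both values collapse
# to the same string and a text-based "map to unique" merges them.
key15 <- as.character(x)
idx15 <- match(key15, unique(key15))

# 17 significant digits are enough to keep distinct doubles distinct.
key17 <- sprintf("%.17g", x)
idx17 <- match(key17, unique(key17))

print(key15)
print(idx15)
print(key17)
print(idx17)
```

With the 15-digit keys both elements map to the same index, while the 17-digit keys keep them apart, as exact comparison on the numeric vector does.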