peter.gedeck@pharma.novartis.com
2003-Jun-18 18:38 UTC
[Rd] Private: Problem with tapply/lapply and sample (PR#3286)
Full_Name: Peter Gedeck
Version: R1.6.2 and R1.7.0
OS: Windows XP
Submission from: (NULL) (194.191.169.72)


Hello,

I marked the bug report Private, as I don't want my email address on the web
server. The problem that I found is best explained using an example.

index <- 1:6
cluster <- c(1,1,1,2,2,3)
tapply(index,cluster,sample)

gives

$"1"
[1] 2 1 3
$"2"
[1] 4 5
$"3"
[1] 3 2 4 1 6 5

The result for 1 and 2 is correct, the last line should however be
$"3"
[1] 6
and not return a sample containing all data. Other functions seem to work
correctly,

> tapply(index,cluster,min)
1 2 3
1 4 6
> tapply(index,cluster,max)
1 2 3
3 5 6
> tapply(index,cluster,sum)
1 2 3
6 9 6

Another example which maybe gives an indication where the problem lies is

> index <- 1:3
> cluster <- c(1,2,3)
> tapply(index,cluster,sample)
$"1"
[1] 1
$"2"
[1] 2 1
$"3"
[1] 1 3 2

The results should be 1, 2 and 3. I tried to identify where the error occurs,
but I think the error happens in .Internal(lapply(X,FUN)) in lapply.

Regards,

Peter
Peter Dalgaard BSA
2003-Jun-18 18:46 UTC
[Rd] Private: Problem with tapply/lapply and sample (PR#3286)
peter.gedeck@pharma.novartis.com writes:

> I marked the bug report Private, as I don't want my email address on the web
> server.

That doesn't actually work. There used to be a checkbox for it but it
was removed a while ago. I see I forgot to remove the text that
explained how it worked (or didn't work: generally very few people got
to see those reports and that's why the feature got removed.)

I'll fix the web page shortly.

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)             FAX: (+45) 35327907
Prof Brian Ripley
2003-Jun-18 18:50 UTC
[Rd] Private: Problem with tapply/lapply and sample (PR#3286)
You have already broadcast your address, and it is not reasonable to
expect help anonymously.

In this case, the bug is in your reading of the help page for sample,
quite possibly because you didn't read it. Perhaps this might help your
comprehension:

mysample <- function(x,...)
{
    print(x)
    sample(x, ...)
}
> tapply(index,cluster,mysample)
[1] 1 2 3
[1] 4 5
[1] 6
$"1"
[1] 1 3 2
$"2"
[1] 5 4
$"3"
[1] 1 3 6 2 5 4

> sample(6)
[1] 5 2 3 4 1 6

Now look up what sample(6) does.

On Wed, 18 Jun 2003 peter.gedeck@pharma.novartis.com wrote:

> Full_Name: Peter Gedeck
> Version: R1.6.2 and R1.7.0
> OS: Windows XP
> Submission from: (NULL) (194.191.169.72)
>
>
> Hello,
>
> I marked the bug report Private, as I don't want my email address on the web
> server. The problem that I found is best explained using an example.
>
> index <- 1:6
> cluster <- c(1,1,1,2,2,3)
> tapply(index,cluster,sample)
>
> gives
>
> $"1"
> [1] 2 1 3
> $"2"
> [1] 4 5
> $"3"
> [1] 3 2 4 1 6 5
>
> The result for 1 and 2 is correct, the last line should however be
> $"3"
> [1] 6
> and not return a sample containing all data.

That is exactly what you asked for. Working as documented is not a bug.

> Other functions seem to work
> correctly,
>
> > tapply(index,cluster,min)
> 1 2 3
> 1 4 6
> > tapply(index,cluster,max)
> 1 2 3
> 3 5 6
> > tapply(index,cluster,sum)
> 1 2 3
> 6 9 6
>
> Another example which maybe gives an indication where the problem lies is

The finger is still pointing at you!

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
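
For readers who hit the same surprise: sample(x) with a single number x >= 1 is treated as sample(1:x), so the group containing only 6 is shuffled as 1:6. The following is a minimal sketch of one way to sample from the elements of each group regardless of group size. It is not part of the thread; the helper name resample is an illustrative choice, and the sketch assumes a current R release in which sample.int() is available (it was not in the R 1.6.2 / 1.7.0 versions mentioned above).

```r
# Hypothetical helper, not from the thread: permute the elements of x by
# sampling positions, so a length-one x is returned unchanged.
resample <- function(x, ...) x[sample.int(length(x), ...)]

index   <- 1:6
cluster <- c(1, 1, 1, 2, 2, 3)

# sample(6) is shorthand for sample(1:6), hence six values come back.
length(sample(6))        # 6

# With resample, each group is permuted within itself.
res <- tapply(index, cluster, resample)
res[["3"]]               # 6
sort(res[["1"]])         # 1 2 3
```

Sampling positions rather than values sidesteps the single-number shorthand entirely; the original mysample() diagnostic above remains useful for seeing what tapply actually passes to FUN.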