Søren Højsgaard
2005-Sep-27 17:12 UTC
[R] Using unsplit - unsplit does not seem to reverse the effect of split
In data OME in MASS I would like to extract the first 5 observations per subject (=ID). So I do

library(MASS)
OMEsub <- split(OME, OME$ID)
OMEsub <- lapply(OMEsub, function(x) x[1:5,])
unsplit(OMEsub, OME$ID)

- which results in

[[1]]
[1] 1 1 1 1 1
[[2]]
[1] 30 30 30 30 30
[[3]]
[1] low low low low low
Levels: N/A high low
[[4]]
[1] 35 35 40 40 45
[[5]]
[1] coherent incoherent coherent incoherent coherent
Levels: coherent incoherent
[[6]]
[1] 1 4 0 1 2

............

[[1094]]
[1] 4 5 5 5 2
[[1095]]
[1] 100 100 100 100 100
[[1096]]
[1] 18 18 18 18 18
[[1097]]
[1] N/A N/A N/A N/A N/A
Levels: N/A high low
There were 50 or more warnings (use warnings() to see the first 50)

warnings()
Warning messages:
1: number of items to replace is not a multiple of replacement length
2: number of items to replace is not a multiple of replacement length
3: number of items to replace is not a multiple of replacement length
....

According to documentation unsplit is the reverse of split, but I must be missing a point somewhere... Can anyone help? Thanks in advance.

Søren
Peter Dalgaard
2005-Sep-27 17:49 UTC
[R] Using unsplit - unsplit does not seem to reverse the effect of split
Søren Højsgaard <Soren.Hojsgaard at agrsci.dk> writes:

> In data OME in MASS I would like to extract the first 5 observations per subject (=ID). So I do
>
> library(MASS)
> OMEsub <- split(OME, OME$ID)
> OMEsub <- lapply(OMEsub,function(x)x[1:5,])
> unsplit(OMEsub, OME$ID)
>
> [...]
>
> According to documentation unsplit is the reverse of split, but I must be missing a point somewhere... Can anyone help? Thanks in advance.

It only works if the first argument is or could have resulted from a split on the second argument. That is clearly not the case when you are creating subvectors.

I have on occasion wanted an unsplit that worked without the 2nd argument as in

unsplit(l, rep(seq(along=l), sapply(l,length)) )

but if you think about it, it's not really doing anything that do.call("c",l) or do.call("rbind",l) won't do.

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907
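A minimal sketch of the point above, using a made-up vector `x` and grouping factor `f` (both hypothetical, not taken from the thread). It assumes plain vectors: unsplit() restores the original order only while the list still has the shape that split() on the same factor would produce. Once each piece has been truncated, the rep()/sapply() form and plain concatenation give the same values.

```r
# Toy data (hypothetical): six values in two interleaved groups.
x <- c(10, 20, 30, 40, 50, 60)
f <- factor(c("a", "b", "a", "b", "a", "b"))

pieces <- split(x, f)            # list(a = c(10, 30, 50), b = c(20, 40, 60))
roundtrip <- unsplit(pieces, f)  # pieces still match f, so this restores x

# After truncating each piece, the list no longer matches f ...
short <- lapply(pieces, function(v) v[1:2])

# ... so build a grouping vector from the pieces themselves,
# or simply concatenate; both give the same values.
via_unsplit <- unsplit(short, rep(seq(along = short), sapply(short, length)))
via_c <- do.call("c", short)
```

Note that do.call("c", short) keeps element names derived from the list names, while the unsplit() form returns an unnamed vector; compare with unname(via_c) if names matter.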
Marc Schwartz (via MN)
2005-Sep-27 17:58 UTC
[R] Using unsplit - unsplit does not seem to reverse the effect of split
On Tue, 2005-09-27 at 19:12 +0200, Søren Højsgaard wrote:

> In data OME in MASS I would like to extract the first 5 observations
> per subject (=ID). So I do
>
> library(MASS)
> OMEsub <- split(OME, OME$ID)
> OMEsub <- lapply(OMEsub,function(x)x[1:5,])
> unsplit(OMEsub, OME$ID)
>
> [...]
>
> According to documentation unsplit is the reverse of split, but I must
> be missing a point somewhere... Can anyone help? Thanks in advance.

If you read the documentation for split/unsplit, you will also note that in the Details section it says:

  'unsplit' works only with lists of vectors

as opposed to lists of data frames, which is the result of your split() operation.

Also note that in the Value section, it indicates:

  'unsplit' returns a vector for which 'split(x, f)' equals 'value'

as opposed to unsplit returning a data frame.

Thus, use:

OME1 <- do.call("rbind", OMEsub)

where OME1 will be the result of rbind()'ing the data frames in the OMEsub list.

See ?do.call for more information.

HTH,

Marc Schwartz
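The rbind() approach can be sketched on a small made-up data frame; the names `df`, `id`, and `y` are hypothetical stand-ins for OME and its columns, and 2 rows per group stand in for 5. The sketch swaps x[1:2, ] for head(x, 2), which is a substitution and not what the thread used: indexing past the end of a short group pads the result with NA rows, whereas head() returns only the rows that exist.

```r
# Toy data (hypothetical): subject 2 has only one observation.
df <- data.frame(id = c(1, 1, 1, 2, 3, 3),
                 y  = c(5, 6, 7, 8, 9, 10))

by_id <- split(df, df$id)                        # list of data frames, one per id

# Row indexing past the end of a group produces an all-NA row.
padded <- by_id[["2"]][1:2, ]

# head() keeps at most the first 2 rows of each group, with no padding.
first2 <- lapply(by_id, function(x) head(x, 2))

# Stack the per-subject data frames back into a single data frame.
result <- do.call("rbind", first2)
rownames(result) <- NULL                         # drop the "1.1", "1.2", ... row names
```

Whether NA padding is wanted depends on the analysis; if every subject should contribute exactly the same number of rows, the original x[1:5, ] form may be the intended behaviour.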