davclark@nyu.edu
2005-Feb-07 21:20 UTC
[Rd] Incorrect behavior for ordering timepoints in "reshape" (PR#7669)
Full_Name: Dav Clark
Version: 2.0.1
OS: OS X 10.3
Submission from: (NULL) (128.122.87.35)

When the timepoints that reshape uses (in direction="long") are negative or
fractional, the time label is assigned incorrectly. It is easier to give an
example than to describe the problem abstractly:

Assume you have a data.frame header with values related to peri-stimulus time
like this:

"HRF -5" "HRF -2.5" "HRF 0" "HRF 2.5" ... "HRF 10"

And you give reshape a split argument of a space " ".

Then the labels will be assigned strangely, based on alphabetical ordering. So
the above list order maps to:

-2.5, -5, 0, 10, ... 2.5

Items under the "HRF -5" column in wide format receive a -2.5 label, items under
"HRF 2.5" receive a label of 10, and so on.

Somewhere, the time labels are being used before conversion to numbers. But,
reshape returns an error if it is not possible to convert the timepoints to
numeric! So obviously, more functionality could be provided, or at least the
documentation should reflect the current shortfall.

For completeness, here is a minimal example demonstrating the bug:

df <- data.frame(id="S1", V1="from -2", V2="from -1")
names(df)[2:3] <- c("vals.-2", "vals.-1")
df
reshape(df, direction="long", varying=2:3)

Thanks!
Dav
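
The alphabetical ordering described above can be reproduced outside reshape. The following is only a minimal sketch, not reshape's own code: it uses just the five labels quoted in the report (the elided ones are left out), and it assumes R 3.3.0 or later so that method = "radix" is available, which sorts character vectors in C-locale order regardless of the session locale.

```r
# Time labels as they appear after splitting the column names on " ".
# Only the labels quoted in the report are used; the elided ones are omitted.
labels <- c("-5", "-2.5", "0", "2.5", "10")

# Sorting the labels while they are still text gives character-by-character
# (alphabetical) order. method = "radix" forces C-locale collation.
as_text <- sort(labels, method = "radix")

# Converting to numeric first gives the intended chronological order.
as_num <- sort(as.numeric(labels))

print(as_text)
print(as_num)
```

The text sort puts "-2.5" before "-5" and "10" before "2.5", which matches the ordering in the report; the numeric sort returns the labels in the original column order.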
Peter Dalgaard
2005-Feb-08 00:42 UTC
[Rd] Incorrect behavior for ordering timepoints in "reshape" (PR#7669)
davclark@nyu.edu writes:

> Full_Name: Dav Clark
> Version: 2.0.1
> OS: OS X 10.3
> Submission from: (NULL) (128.122.87.35)
>
>
> When the timepoints that reshape uses (in direction="long") are negative or
> fractional, the time label is assigned incorrectly. It is easier to give an
> example than to describe the problem abstractly:
>
> Assume you have a data.frame header with values related to peri-stimulus time
> like this:
>
> "HRF -5" "HRF -2.5" "HRF 0" "HRF 2.5" ... "HRF 10"
>
> And you give reshape a split argument of a space " ".
>
> Then the labels will be assigned strangely, based on alphabetical ordering. So
> the above list order maps to:
>
> -2.5, -5, 0, 10, ... 2.5
>
> Items under the "HRF -5" column in wide format receive a -2.5 label, items under
> "HRF 2.5" receive a label of 10, and so on.
>
> Somewhere, the time labels are being used before conversion to numbers. But,
> reshape returns an error if it is not possible to convert the timepoints to
> numeric! So obviously, more functionality could be provided, or at least the
> documentation should reflect the current shortfall.
>
> For completeness, here is a minimal example demonstrating the bug:
>
> df <- data.frame(id="S1", V1="from -2", V2="from -1")
> names(df)[2:3] <- c("vals.-2", "vals.-1")
> df
> reshape(df, direction="long", varying=2:3)

Hmm, this looks messed up even without the negatives. The guess()
function inside reshape always sorts before converting to numeric, so
you get the 1 10 11 2 3 4 5 6 7 8 9 effect, but what is worse: the
sorting decouples the values from the variable names, as demonstrated
by modifying your example slightly

> reshape(df, direction="long", varying=3:2)
      id time    vals
S1.-1 S1   -1 from -1
S1.-2 S1   -2 from -2

I'm not at all sure I understand what was supposed to happen here,
perhaps the sort in

    varying <- unique(nn[, 1])
    times <- sort(unique(nn[, 2]))

is a thinko? Over to Thomas, I think.

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)             FAX: (+45) 35327907
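
The decoupling described in the reply can be illustrated with the column names from the minimal example. This is a hedged sketch only: the regexpr/substring splitting, the order()-based permutation, and the variable names are illustrative assumptions and do not reproduce reshape's internals or any fix that was actually applied. It again assumes R 3.3.0 or later for method = "radix".

```r
# Column names from the minimal example, in varying = 2:3 order.
cols <- c("vals.-2", "vals.-1")

# Split each name at its first "." into a variable part and a time part.
# (Illustrative splitting only; this is not the guess() function.)
dot <- as.integer(regexpr(".", cols, fixed = TRUE))
time_txt <- substring(cols, dot + 1)

# Sorting the time labels on their own, as text, breaks the link between
# a label and the column it came from.
decoupled <- sort(time_txt, method = "radix")
mislabelled <- setNames(as.numeric(decoupled), cols)

# Converting to numeric first and applying one permutation to both the
# labels and the column names keeps each label with its own column.
times <- as.numeric(time_txt)
ord <- order(times)
coupled <- setNames(times[ord], cols[ord])

print(mislabelled)
print(coupled)
```

In the first result the column "vals.-2" is paired with -1, which is the mislabelling from the report; in the second, each column keeps the time taken from its own name.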