Adrian Dragulescu
2009-Nov-04 19:44 UTC
[Rd] inconsistent behavior for logical vectors when using apply (" TRUE")
Hello,

> X <- data.frame(letters=letters[1:3], flag=c(TRUE, FALSE, TRUE))
> X
  letters  flag
1       a  TRUE
2       b FALSE
3       c  TRUE
> apply(X, 1, as.list)
[[1]]
[[1]]$letters
[1] "a"

[[1]]$flag
[1] " TRUE"


[[2]]
[[2]]$letters
[1] "b"

[[2]]$flag
[1] "FALSE"


[[3]]
[[3]]$letters
[1] "c"

[[3]]$flag
[1] " TRUE"

Notice how TRUE becomes " TRUE" and FALSE becomes "FALSE". Not sure why TRUE gets an extra whitespace in front.

Checked with R-2.10.0, but can reproduce the behavior as far back as R-2.8.1.

Adrian Dragulescu

> sessionInfo()
R version 2.10.0 (2009-10-26)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.10.0
Greg Snow
2009-Nov-04 20:02 UTC
The apply function was meant to work on matrices and arrays. When you use it on a data frame, the frame is first converted to a matrix. Since your data frame has columns of different modes, the logical column is converted to character and the resulting matrix has the single mode character. That is what you are seeing.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111
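One way to sidestep the matrix conversion described above is to loop over row indices instead of calling apply(). The following is only a minimal sketch and was not proposed in the thread; stringsAsFactors = FALSE is an assumption added so that the letters column is character rather than factor.

```r
# Same data as in the original post; stringsAsFactors = FALSE is an
# assumption added here so that 'letters' is character rather than factor.
X <- data.frame(letters = letters[1:3],
                flag = c(TRUE, FALSE, TRUE),
                stringsAsFactors = FALSE)

# Index the rows directly. Each X[i, ] is a one-row data frame, so every
# column keeps its own type and nothing passes through as.matrix().
rows <- lapply(seq_len(nrow(X)), function(i) as.list(X[i, ]))

# Inspect the first row: 'flag' is still a logical value, not a string.
str(rows[[1]])
```

The trade-off is that row-by-row indexing of a data frame is slower than apply() on a matrix, which probably only matters for large inputs.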
Adrian Dragulescu
2009-Nov-04 20:23 UTC
Well documented in ?as.matrix. Ignore my previous post.
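The coercion that ?as.matrix describes can be checked directly. This is a small sketch added for illustration; the data frame Y is a hypothetical variant with only integer and logical columns and does not appear in the original post.

```r
X <- data.frame(letters = letters[1:3], flag = c(TRUE, FALSE, TRUE))

# apply() operates on the result of as.matrix(X). With a non-numeric
# column present, the whole matrix becomes character.
m <- as.matrix(X)
typeof(m)

# Hypothetical variant: only integer and logical columns. No character
# conversion is needed, so the logical values become numbers instead
# (TRUE as 1, FALSE as 0).
Y <- data.frame(id = 1:3, flag = c(TRUE, FALSE, TRUE))
as.matrix(Y)
```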
Tony Plate
2009-Nov-04 20:25 UTC
This happens in as.matrix(), which gets called by apply(). When you've got a mixed-mode data frame like this, as.matrix() converts everything to character. However, the rules it uses for each column are not entirely consistent about whether columns are space-padded so that every element has the same number of characters. As it works out, logical columns are passed through format(), which space-pads by default; factor columns are passed through as.vector(); and character columns are left alone. The result is that some columns come out space-padded and some don't, depending on their original mode.

To get greater control of this, you theoretically should be able to do something like apply(as.matrix(format(X, justify="none")), ...), except that format() seems to ignore the justify argument for logical vectors, e.g.:

> format(c(T,F,T))
[1] " TRUE" "FALSE" " TRUE"
> format(c(T,F,T), justify="none")
[1] " TRUE" "FALSE" " TRUE"

If it's really important for you to get this to work the way you want, you can convert the logical column of the data frame using as.character (see the end of the example below).
Here's an example that shows probably far more than you wanted to know:

> X <- data.frame(letters=letters[1:3], flag=c(TRUE, FALSE, TRUE), codef=c("a","ab","abcd"), codec=I(c("x", "xy", "xyz")))
> sapply(X, class)
  letters      flag     codef     codec
 "factor" "logical"  "factor"    "AsIs"
> as.matrix(X)
     letters flag    codef  codec
[1,] "a"     " TRUE" "a"    "x"
[2,] "b"     "FALSE" "ab"   "xy"
[3,] "c"     " TRUE" "abcd" "xyz"
> unclass(format(X))
$letters
[1] "a" "b" "c"

$flag
[1] " TRUE" "FALSE" " TRUE"

$codef
[1] "a"    "ab"   "abcd"

$codec
[1] "x"   "xy"  "xyz"

attr(,"row.names")
[1] "1" "2" "3"
> unclass(format(X, justify="left"))
$letters
[1] "a" "b" "c"

$flag
[1] " TRUE" "FALSE" " TRUE"

$codef
[1] "a   " "ab  " "abcd"

$codec
[1] "x  " "xy " "xyz"

attr(,"row.names")
[1] "1" "2" "3"
>
> # The only way I can see to get the logical column converted to character without padding:
> X1 <- X
> X1$flag <- as.character(X1$flag)
> as.matrix(X1)
     letters flag    codef  codec
[1,] "a"     "TRUE"  "a"    "x"
[2,] "b"     "FALSE" "ab"   "xy"
[3,] "c"     "TRUE"  "abcd" "xyz"
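The as.character workaround can be generalized to every logical column of a data frame. The helper below is a hypothetical sketch (the name logicals_to_character is made up here and is not a base R function), assuming the goal is simply unpadded "TRUE"/"FALSE" strings in the resulting matrix.

```r
# Hypothetical helper: convert each logical column to character before the
# data frame reaches as.matrix(), so format() never gets to pad the values.
logicals_to_character <- function(df) {
  df[] <- lapply(df, function(col) {
    if (is.logical(col)) as.character(col) else col
  })
  df
}

X <- data.frame(letters = letters[1:3], flag = c(TRUE, FALSE, TRUE))
m <- as.matrix(logicals_to_character(X))
m[, "flag"]

# The padding itself comes from format(), which pads to a common width,
# whereas as.character() does not.
format(c(TRUE, FALSE, TRUE))
as.character(c(TRUE, FALSE, TRUE))
```

With this, apply(logicals_to_character(X), 1, as.list) should give "TRUE" without the leading space, although the values are still character strings rather than logicals.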