(1)

> a = c("a","b")
> mode(a)
[1] "character"
> b = c(1,2)
> mode(b)
[1] "numeric"
> c = data.frame(a,b)
> mode(c$a)
[1] "numeric"

(2)

> a = c("a","a","b","b","c")
> levels(as.factor(a))
[1] "a" "b" "c"
> levels(as.factor(a[1:3]))
[1] "a" "b"
> a = as.factor(a)
> levels(a)
[1] "a" "b" "c"
> levels(a[1:3])
[1] "a" "b" "c"

Any explanation would be helpful. Thanks.

(1)

> a <- c("a", "b")
> str(a)
 chr [1:2] "a" "b"
> b <- c(1,2)
> str(b)
 num [1:2] 1 2
> c <- data.frame(a, b)
> str(c)
'data.frame': 2 obs. of 2 variables:
 $ a: Factor w/ 2 levels "a","b": 1 2
 $ b: num 1 2
> mode(c$a)
[1] "numeric"
> c2 <- data.frame(a, b, stringsAsFactors=FALSE)
> str(c2)
'data.frame': 2 obs. of 2 variables:
 $ a: chr "a" "b"
 $ b: num 1 2
> mode(c2$a)
[1] "character"

(2)

> a <- c("a", "a", "b", "b", "c")
> levels(as.factor(a))
[1] "a" "b" "c"
> b <- a[1:3]
> b
[1] "a" "a" "b"
> levels(as.factor(b))
[1] "a" "b"
> a <- as.factor(a)
> a
[1] a a b b c
Levels: a b c
> a[1:3]
[1] a a b
Levels: a b c

--
Sarah Goslee
http://www.functionaldiversity.org

On Sat, 22 Jan 2011 06:16:43 -0800 (PST)
"analyst41 at hotmail.com" <analyst41 at hotmail.com> wrote:

> (1)
>
> > a = c("a","b")
> > mode(a)
> [1] "character"
> > b = c(1,2)
> > mode(b)
> [1] "numeric"
> > c = data.frame(a,b)
> > mode(c$a)
> [1] "numeric"

R> str(c)
'data.frame': 2 obs. of 2 variables:
 $ a: Factor w/ 2 levels "a","b": 1 2
 $ b: num 1 2

Character vectors are turned into factors by default by data.frame().
OTOH:

R> c = data.frame(a,b, stringsAsFactors=FALSE)
R> mode(c$a)
[1] "character"

> (2)
>
> > a = c("a","a","b","b","c")
> > levels(as.factor(a))
> [1] "a" "b" "c"
> > levels(as.factor(a[1:3]))
> [1] "a" "b"
> > a = as.factor(a)
> > levels(a)
> [1] "a" "b" "c"
> > levels(a[1:3])
> [1] "a" "b" "c"

Subsetting factors does not get rid of no-longer used levels by default.
OTOH:

R> levels(a[1:3, drop=TRUE])
[1] "a" "b"

or

R> levels(factor(a[1:3]))
[1] "a" "b"

HTH.

Cheers,

        Berwin

========================== Full address ===========================
Berwin A Turlach                      Tel.: +61 (8) 6488 3338 (secr)
School of Maths and Stats (M019)            +61 (8) 6488 3383 (self)
The University of Western Australia   FAX : +61 (8) 6488 1028
35 Stirling Highway
Crawley WA 6009                e-mail: berwin at maths.uwa.edu.au
Australia                          http://www.maths.uwa.edu.au/~berwin
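
The "numeric" result in (1) follows from how a factor is stored: integer codes plus a "levels" attribute. The lines below are a minimal sketch of that, not part of the original replies. The name df is chosen here for illustration, and stringsAsFactors = TRUE is passed explicitly as an assumption so that the result does not depend on the installed R version (in R 4.0.0 and later the default is FALSE).

```r
# Assumption: stringsAsFactors = TRUE is set explicitly so the conversion
# to factor happens regardless of the R version's default.
a <- c("a", "b")
b <- c(1, 2)
df <- data.frame(a, b, stringsAsFactors = TRUE)

class(df$a)         # "factor"
typeof(df$a)        # "integer": the underlying storage of a factor
mode(df$a)          # "numeric": mode() reports integer storage as numeric
as.integer(df$a)    # 1 2: the integer codes
levels(df$a)        # "a" "b": the labels the codes point to
as.character(df$a)  # "a" "b": back to the original strings
```

So mode() describes the storage, while class() is what tells you the column is a factor.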

My explanation for (2):

When a character vector is coerced to a factor, its current levels are stored with it. Taking a subvector of the factor does not change those levels, so levels(a[1:3]) is still

[1] "a" "b" "c"

in the last line.

If you want to drop the unused levels, you need to tell R:

> levels(a[1:3, drop=TRUE])
[1] "a" "b"

________________
Moritz Grenke
http://www.360mix.de

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of analyst41 at hotmail.com
Sent: Saturday, 22 January 2011 15:17
To: r-help at r-project.org
Subject: [R] two apparent anomalies
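
A short sketch of why the retained level matters in practice, and of one more way to drop it. This is an illustration added here, not code from the thread; the name sub is hypothetical, and droplevels() is assumed to be available, as it is in current versions of R.

```r
a <- factor(c("a", "a", "b", "b", "c"))
sub <- a[1:3]

levels(sub)              # "a" "b" "c": the unused level "c" is kept
table(sub)               # "c" still appears, with a count of 0

# Three ways to discard levels that no longer occur in the data
levels(droplevels(sub))         # "a" "b"
levels(factor(sub))             # "a" "b"
levels(a[1:3, drop = TRUE])     # "a" "b"
```

Keeping the unused level is often what you want, for example when tables from several subsets need the same columns.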

> Thanks for both responses.
> is there a difference between the "as.factor" and "factor" commands
> and also between "as.data.frame" and "data.frame"?

If you type ?as.data.frame and ?data.frame you can see that there are differences. data.frame() can take more than one "data parameter".

# producing different results:
data.frame(c(1,2,3), c("hello","world","!"))
as.data.frame(c(1,2,3), c("hello","world","!"))

There are differences (in parameters) between "as.factor" and "factor" as well. Type ?factor

________________
Moritz Grenke
http://www.360mix.de
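
The differences can be made concrete with a small sketch. The variable names (f, x, y, d1, d2) are chosen here for illustration. The comments describe what the help pages document: as.factor() returns an existing factor unchanged, and the second positional argument of as.data.frame() is row.names.

```r
# as.factor() vs factor() on something that is already a factor
f <- factor(c("a", "a", "b", "b", "c"))[1:3]
levels(as.factor(f))   # "a" "b" "c": returned as is, unused level kept
levels(factor(f))      # "a" "b": rebuilt from the values actually present

# factor() also takes levels (and labels); as.factor() has no such arguments
factor(c("lo", "hi", "lo"), levels = c("lo", "hi"))

# data.frame() vs as.data.frame() with two vectors
x <- c(1, 2, 3)
y <- c("hello", "world", "!")

d1 <- data.frame(x, y)      # both vectors become columns
d2 <- as.data.frame(x, y)   # only x is converted; y is matched to row.names

dim(d1)        # 3 rows, 2 columns
dim(d2)        # 3 rows, 1 column
rownames(d2)   # the strings from y
```

In short, the as.* functions coerce a single object, while factor() and data.frame() construct a new one from their arguments.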