Teng Sun wrote:
> Suppose I have two columns of entries, how can I get the union of the
> two columns? Please note: I input my columns through excel. These
> entries have text format in excel. Also, out of curiosity, how can I
> find out the data type of a data frame ?
df <- data.frame(n1 = c("apple","orange","soda","red","white",""),
                 n2 = c("soda","apple","green","yellow","blue","white"),
                 x = rnorm(6))
> str(df)
'data.frame': 6 obs. of 3 variables:
$ n1: Factor w/ 6 levels "","apple","orange",..: 2 3 5 4 6 1
$ n2: Factor w/ 6 levels "apple","blue",..: 4 1 3 6 2 5
$ x : num -0.0932 -2.0714 -0.9539 0.7249 -0.7039 ...
> lapply(df, class)
$n1
[1] "factor"
$n2
[1] "factor"
$x
[1] "numeric"
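
A more compact variant of the same check, as a minimal sketch: sapply() collapses the per-column classes into one named character vector. The name `fruit` is made up for this illustration, and stringsAsFactors = TRUE is set explicitly on the assumption that your R version may not convert strings to factors by default.

```r
# Hypothetical data frame mirroring the example above. stringsAsFactors = TRUE
# is an assumption made here so the text columns become factors in any R version.
fruit <- data.frame(n1 = c("apple", "orange", "soda", "red", "white", ""),
                    n2 = c("soda", "apple", "green", "yellow", "blue", "white"),
                    x  = rnorm(6),
                    stringsAsFactors = TRUE)

# Class of the object itself
class(fruit)

# Class of each column, returned as a named character vector
col_classes <- sapply(fruit, class)
print(col_classes)

# Names of the columns that are stored as factors
is_factor <- sapply(fruit, is.factor)
factor_cols <- names(fruit)[is_factor]
print(factor_cols)
```

Knowing which columns are factors tells you where an as.character() conversion is needed before treating the values as strings.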
>> a <- read.csv("book1.csv")
>> a
>       n1     n2
> 1  apple   soda
> 2 orange  apple
> 3   soda  green
> 4    red yellow
> 5  white   blue
> 6         white
>
>> union(a$n1,a$n2)
> [1] 2 3 5 4 6 1
>
> I want the actual names instead of the indexes.
You are getting the union of the factors' underlying integer codes
rather than the union of the strings. Convert the columns to
character first:
> union(as.character(df$n1), as.character(df$n2))
[1] "apple" "orange" "soda"
[4] "red" "white" ""
[7] "green" "yellow" "blue"
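
If the empty string from the blank Excel cell is not wanted in the result, it can be dropped afterwards. The following is only a sketch: the vectors `n1` and `n2` and the name `all_names` are stand-ins for your columns, not anything from your file.

```r
# Stand-ins for the two factor columns read from the spreadsheet
n1 <- factor(c("apple", "orange", "soda", "red", "white", ""))
n2 <- factor(c("soda", "apple", "green", "yellow", "blue", "white"))

# Union of the labels, not of the integer codes
all_names <- union(as.character(n1), as.character(n2))

# Drop the empty string left by the blank cell
all_names <- all_names[all_names != ""]
print(all_names)

# Alternative (not run here, since it needs the file): keep the columns as
# character from the start, so no conversion is needed later.
# a <- read.csv("book1.csv", stringsAsFactors = FALSE)
# union(a$n1, a$n2)
```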
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894