Teng Sun wrote:
> Suppose I have two columns of entries, how can I get the union of the
> two columns? Please note: I input my columns through excel. These
> entries have text format in excel. Also, out of curiosity, how can I
> find out the data type of a data frame ?
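# stringsAsFactors = TRUE keeps the text columns as factors, as shown below
# (since R 4.0.0 the default is FALSE, which would give character columns)
df <- data.frame(n1 = c("apple","orange","soda","red","white",""),
                 n2 = c("soda","apple","green","yellow","blue","white"),
                 x = rnorm(6),
                 stringsAsFactors = TRUE)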
> str(df)
'data.frame':   6 obs. of  3 variables:
 $ n1: Factor w/ 6 levels "","apple","orange",..: 2 3 5 4 6 1
 $ n2: Factor w/ 6 levels "apple","blue",..: 4 1 3 6 2 5
 $ x : num  -0.0932 -2.0714 -0.9539  0.7249 -0.7039 ...
> lapply(df, class)
$n1
[1] "factor"
$n2
[1] "factor"
$x
[1] "numeric"
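
If a more compact summary is wanted, sapply() collapses the same
information into a named character vector. A minimal sketch on a smaller,
made-up data frame (the name `small` and its values are only for
illustration, not from the original question):

```r
# Hypothetical two-column data frame; stringsAsFactors = TRUE forces the
# text column to be a factor regardless of the R version's default.
small <- data.frame(n1 = c("apple", "orange"),
                    x  = c(1.5, 2.5),
                    stringsAsFactors = TRUE)

# One class name per column, returned as a named character vector
types <- sapply(small, class)
print(types)

# Names of the columns that are stored as factors
factor_cols <- names(small)[sapply(small, is.factor)]
print(factor_cols)
```

Note that sapply() only simplifies to a vector when every column has a
single class; lapply() is the safer choice otherwise.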
>> a <- read.csv("book1.csv")
>> a
>       n1     n2
> 1  apple   soda
> 2 orange  apple
> 3   soda  green
> 4    red yellow
> 5  white   blue
> 6         white
> 
>> union(a$n1,a$n2)
> [1] 2 3 5 4 6 1
> 
> I want the actual names instead of the indexes.
  You are getting the union of the factors' underlying integer codes
rather than the union of the strings.  Convert each column to character
first:
> union(as.character(df$n1), as.character(df$n2))
[1] "apple"  "orange" "soda"
[4] "red"    "white"  ""
[7] "green"  "yellow" "blue"
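
Another option is to keep the columns as character from the start, so no
conversion is needed, and to drop the empty string if the blank cell
should not count as an entry. A sketch under those assumptions (`df2` and
`u` are names made up here; the data are the same as in the example
above):

```r
# Same entries as above, but stored as character vectors, not factors.
# When reading from a file, read.csv("book1.csv", stringsAsFactors = FALSE)
# should have the same effect.
df2 <- data.frame(n1 = c("apple","orange","soda","red","white",""),
                  n2 = c("soda","apple","green","yellow","blue","white"),
                  stringsAsFactors = FALSE)

# Union of the two character columns
u <- union(df2$n1, df2$n2)

# Remove the empty string produced by the blank cell
u <- u[nzchar(u)]
print(u)
```

Whether the blank should be dropped depends on what it means in your
spreadsheet, so treat that last step as optional.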
-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894