raz
2014-Sep-11 15:49 UTC
[R] create new column by replacing multiple unique values in existing column
Hi,

I have the following data frame:

dat1 <- read.table(text="a,b
1,A1
2,A1
3,A1
4,A1
5,A1
6,A2
7,A2
8,A2
9,A2
10,A2
11,B1
12,B1
13,B1
14,B1
15,B1",sep=",",header=T)

I would like to add a new column, dat1$new, based on column "b" (dat1$b), in which each value is replaced according to its unique value, e.g. "A1" becomes "1", "A2" becomes "2", and so on (this is only part of a large table). Ideally I could change all unique values in dat1 to the numbers 1:n, where n is the number of unique values. If not, how do I change the values ("A1","A2","B1") to (1,2,3) in a new column?

Thanks a lot,

Raz
David L Carlson
2014-Sep-11 16:06 UTC
[R] create new column by replacing multiple unique values in existing column
Note that in the data you sent, b is a factor:

> str(dat1)
'data.frame': 15 obs. of 2 variables:
 $ a: int 1 2 3 4 5 6 7 8 9 10 ...
 $ b: Factor w/ 3 levels "A1","A2","B1": 1 1 1 1 1 2 2 2 2 2 ...

So all you need is

> dat1$new <- as.numeric(dat1$b)
> table(dat1$new)

1 2 3
5 5 5
> table(dat1$b)

A1 A2 B1
 5  5  5

If b is not a factor in your table, make it one:

?factor

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
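
The "make it one" step can be sketched as follows, assuming b arrives as plain character strings rather than a factor (the default since R 4.0.0, where stringsAsFactors became FALSE). The data frame is rebuilt here with data.frame() to hold the same 15 rows, and the column name new_by_appearance is only illustrative.

```r
# Same 15 rows as the posted data, with b deliberately kept as character
# (assumption: this mirrors what read.table() returns in R >= 4.0.0).
dat1 <- data.frame(a = 1:15,
                   b = rep(c("A1", "A2", "B1"), each = 5),
                   stringsAsFactors = FALSE)

# factor() sorts the unique values to build its levels, so the codes
# follow sorted order: A1 -> 1, A2 -> 2, B1 -> 3.
dat1$new <- as.integer(factor(dat1$b))

# Alternative: number the values by order of first appearance instead.
# (Hypothetical column name, not part of the original thread.)
dat1$new_by_appearance <- match(dat1$b, unique(dat1$b))

table(dat1$b, dat1$new)
```

With this data the two numberings coincide because the values already appear in sorted order; in a larger table they can differ, so it may be worth deciding which ordering is wanted.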
John McKown
2014-Sep-11 16:16 UTC
[R] create new column by replacing multiple unique values in existing column
Please change your email client to post only in plain text, no HTML. Thanks.

And, lucky you. You __already__ have what you want in the table. Try the following:

print(dat1$b)

Hmm, you just get what you're expecting: things like A1 and B1 and so on. Now try:

print(as.integer(dat1$b))

Whoa! You have unique integers based on the values in column b! That's because the data in column b is a __factor__, and as.integer() returns each value's integer level code instead of its label. If column b is not a factor, you can make it one with a simple:

dat1$b <- as.factor(dat1$b)

If you really want it in a separate column for some reason:

dat1$new <- as.integer(dat1$b)

But then you are responsible for keeping columns b and new "in sync". Keeping or making column b a factor lets you use as.integer(), and you are GOLDEN!

John McKown
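
One way to avoid the "in sync" problem is to keep only the factor and derive the codes when needed. The sketch below assumes b is already a factor; the data frame is rebuilt with data.frame() to hold the same 15 rows, and the names codes and code are illustrative only.

```r
# Same 15 rows as the posted data, with b stored as a factor.
dat1 <- data.frame(a = 1:15,
                   b = factor(rep(c("A1", "A2", "B1"), each = 5)))

# Lookup table pairing each level label with its integer code.
# (Hypothetical object and column names.)
codes <- data.frame(b = levels(dat1$b),
                    code = seq_along(levels(dat1$b)))
print(codes)

# Codes computed on demand from the factor; nothing extra is stored.
int_codes <- as.integer(dat1$b)

# Going back from integer codes to the original labels.
labels_back <- levels(dat1$b)[int_codes]
```

Because the codes are recomputed from b each time, editing b cannot leave a stale copy behind.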