Fridolin
2012-Aug-27 10:09 UTC
[R] Changing entries of column of type "factor"/Adding a new level to a factor
What is a smart way to change an entry inside a column of a data frame or matrix which is of type "factor"?

Here is my script incl. input data:

> #set working directory:
> setwd("K:/R")
>
> #read in data:
> input<-read.table("Exampleinput.txt", sep="\t", header=TRUE)
>
> #check data:
> input
   Ind      M1      M2      M3
1    1   96/98 120/120     0/0
2    2 102/108 120/124 305/305
3    3  96/108 120/120     0/0
4    4     0/0 116/120 300/305
5    5  96/108 120/130 300/305
6    6   98/98 116/120 300/305
7    7  98/108 120/120 305/305
8    8  98/108 120/120 305/305
9    9  98/102 120/124 300/300
10  10 108/108 120/120 305/305
> str(input)
'data.frame': 10 obs. of 4 variables:
 $ Ind: int 1 2 3 4 5 6 7 8 9 10
 $ M1 : Factor w/ 8 levels "0/0","102/108",..: 5 2 4 1 4 8 7 7 6 3
 $ M2 : Factor w/ 4 levels "116/120","120/120",..: 2 3 2 1 4 1 2 2 3 2
 $ M3 : Factor w/ 4 levels "0/0","300/300",..: 1 4 1 3 3 3 4 4 2 4
>
> #replace 0/0 by 999/999:
> for (r in 1:10)
+ for (c in 2:4)
+ if (input[r,c]=="0/0") input[r,c]<-"999/999"
Warning messages:
1: In `[<-.factor`(`*tmp*`, iseq, value = "999/999") :
  invalid factor level, NAs generated
2: In `[<-.factor`(`*tmp*`, iseq, value = "999/999") :
  invalid factor level, NAs generated
3: In `[<-.factor`(`*tmp*`, iseq, value = "999/999") :
  invalid factor level, NAs generated
> input
   Ind      M1      M2      M3
1    1   96/98 120/120    <NA>
2    2 102/108 120/124 305/305
3    3  96/108 120/120    <NA>
4    4    <NA> 116/120 300/305
5    5  96/108 120/130 300/305
6    6   98/98 116/120 300/305
7    7  98/108 120/120 305/305
8    8  98/108 120/120 305/305
9    9  98/102 120/124 300/300
10  10 108/108 120/120 305/305

I want to replace all "0/0" with "999/999". My code should work for columns of type "character" and "integer". But to make it work for a "factor" column, I guess I would first need to add the new level "999/999". How do I add a new level?

--
View this message in context: http://r.789695.n4.nabble.com/Changing-entries-of-column-of-type-factor-Adding-a-new-level-to-a-factor-tp4641402.html
PIKAL Petr
2012-Aug-27 14:22 UTC
[R] Changing entries of column of type "factor"/Adding a new level to a factor
Hi,

you could save yourself time by reading the help page for factor.

> What is a smart way to change an entry inside a column of a dataframe
> or matrix which is of type "factor"?

A data.frame is not a matrix.

Each factor has a levels attribute:

levels(input$M1)

In M1 the level "0/0" should be the first one, so

levels(input$M1)[1] <- "999/999"

The easiest way is probably to cycle through the columns of your data (here columns stands for the indices of the factor columns, presumably 2:4 as in your loop):

for (i in columns) levels(input[,i])[levels(input[,i])=="0/0"] <- "999/999"

Regards
Petr
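The level-renaming loop suggested above can be sketched as follows. This is only an illustration under stated assumptions: the example data are typed inline rather than read from Exampleinput.txt, `stringsAsFactors = TRUE` is set explicitly so the columns are factors as in the original `str()` output, and `marker_cols` is a hypothetical name standing in for `columns`.

```r
# Rebuild the example data from the original post (assumption: typed inline
# here instead of reading Exampleinput.txt).
input <- data.frame(
  Ind = 1:10,
  M1 = c("96/98", "102/108", "96/108", "0/0", "96/108",
         "98/98", "98/108", "98/108", "98/102", "108/108"),
  M2 = c("120/120", "120/124", "120/120", "116/120", "120/130",
         "116/120", "120/120", "120/120", "120/124", "120/120"),
  M3 = c("0/0", "305/305", "0/0", "300/305", "300/305",
         "300/305", "305/305", "305/305", "300/300", "305/305"),
  stringsAsFactors = TRUE  # force factor columns, as in the original post
)

# Hypothetical name for the columns to process.
marker_cols <- c("M1", "M2", "M3")

for (i in marker_cols) {
  lv <- levels(input[[i]])
  # Rename the level "0/0" wherever it exists; columns without it are unchanged.
  lv[lv == "0/0"] <- "999/999"
  levels(input[[i]]) <- lv
}
```

Renaming a level changes every entry that carried it in one step, so no row-by-row loop is needed and no NAs are produced. Matching on the level name, rather than on position 1, also keeps columns such as M2, which has no "0/0" level, intact.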
William Dunlap
2012-Aug-27 14:28 UTC
[R] Changing entries of column of type "factor"/Adding a new level to a factor
One way is to read in your text data as character columns with read.table(stringsAsFactors=FALSE,...), then fiddle with the values, and finally make factors out of the columns that you want to be factors, specifying the levels explicitly.

(Do you really want those number/number things to be treated as factors? The orders of their levels will be weird unless you list them all when making the factor.)

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
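A minimal sketch of the read-as-character approach described above. Assumptions: the data are supplied through the `text` argument of `read.table` with whitespace separators instead of the tab-separated file from the original post, `marker_cols` is a hypothetical helper name, and ordering the levels by first appearance is just one possible explicit choice.

```r
# Example data from the original post, supplied inline (assumption: the real
# script would read Exampleinput.txt with sep = "\t").
txt <- "Ind M1 M2 M3
1 96/98 120/120 0/0
2 102/108 120/124 305/305
3 96/108 120/120 0/0
4 0/0 116/120 300/305
5 96/108 120/130 300/305
6 98/98 116/120 300/305
7 98/108 120/120 305/305
8 98/108 120/120 305/305
9 98/102 120/124 300/300
10 108/108 120/120 305/305"

# Step 1: read the marker columns as character, not factor.
input <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)

# Hypothetical name for the columns to process.
marker_cols <- c("M1", "M2", "M3")

# Step 2: plain character replacement, no factor levels involved.
input[marker_cols] <- lapply(input[marker_cols], function(x) {
  x[x == "0/0"] <- "999/999"
  x
})

# Step 3: convert to factors with explicitly specified levels
# (here: order of first appearance, an illustrative choice).
input[marker_cols] <- lapply(input[marker_cols], function(x) {
  factor(x, levels = unique(x))
})
```

Because the replacement happens while the columns are still character vectors, the "invalid factor level" warning cannot occur, and the final level order is whatever is passed to `levels`.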
David Winsemius
2012-Aug-27 16:19 UTC
[R] Changing entries of column of type "factor"/Adding a new level to a factor
On Aug 27, 2012, at 3:09 AM, Fridolin wrote:

> I want to replace all "0/0" by "999/999". My code should work for columns of
> type "character" and "integer". But to make it work for a "factor"-column I
> would need to add the new level of "999/999" at first, I guess. How do I add
> a new level?

?levels

levels(input$M1) <- c(levels(input$M1), "999/999")

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
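To illustrate how appending a level makes the original assignment work, here is a small sketch on a standalone factor. The vector `m1` is a hypothetical stand-in for the first rows of `input$M1`, and the final `droplevels()` call is an optional extra step that is not part of the reply above.

```r
# Hypothetical stand-in for part of input$M1.
m1 <- factor(c("96/98", "102/108", "96/108", "0/0", "96/108"))

# Append the new level; existing entries keep their values.
levels(m1) <- c(levels(m1), "999/999")

# The assignment that previously produced NAs now succeeds,
# because "999/999" is a valid level.
m1[m1 == "0/0"] <- "999/999"

# Optional: drop the now-unused level "0/0".
m1 <- droplevels(m1)
```

After the level has been added to each factor column, the element-wise loop from the original post should run without the "invalid factor level" warning.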