David Romano
2012-Nov-15 11:19 UTC
[R] using ifelse to remove NA's from specific columns of a data frame containing strings and numbers
Hi everyone,

I have a data frame one of whose columns is a character vector and the rest are numeric. While debugging a script, I noticed that an ifelse call seems to be coercing the character column to a numeric column and producing unintended values as a result. Roughly, here's what I tried to do:

df: a data frame with, say, the first column as a character column and the second and third columns numeric.

also: NA's occur only in the numeric columns, and if they occur in one, they occur in the other as well.

I wanted to replace the NA's in column 2 with 0's and the ones in column 3 with 1's, so first I did this:

> na.replacements <- ifelse(col(df)==2,0,1)

Then I used a second ifelse call to try to remove the NA's as I wanted, first by doing this:

> clean.df <- ifelse(is.na(df), na.replacements, df)

which produced a list of lists vaguely resembling df, with the NA's mostly intact, and so then I tried this:

> clean.df <- ifelse(is.na(df), na.replacements, unlist(df))

which seems to work if all the columns are numeric, but otherwise changes strings to numbers.

I can't make enough sense of the help documentation to clear this up, but my guess is that the "yes" and "no" values passed to ifelse need to be vectors, in which case it seems I'll have to use another approach entirely. Even if that is not the case and lists are acceptable, I'm not sure how to convert a mixed-mode data frame into a vector-like list of elements (which I would hope would work).

I'd be grateful for any suggestions!

Thanks,
David Romano
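The behaviour described in this post can be reproduced with a small example. The sketch below uses a made-up three-row data frame (the names `df`, `name`, `x` and `y` are assumptions, not the poster's data) to show what `ifelse()` and `unlist()` do with mixed column types. It is only an illustration of the coercion, not a recommended way to replace the NA's.

```r
# Hypothetical stand-in for the data frame described in the post.
df <- data.frame(name = c("a", "b", "c"),
                 x = c(1.5, NA, 3),
                 y = c(2, NA, 4),
                 stringsAsFactors = FALSE)

# is.na() on a data frame returns a logical matrix, one cell per entry.
test <- is.na(df)

# Passing the data frame itself as 'no' makes ifelse() recycle it as a
# list of whole columns, so the result is a list rather than a data frame.
first_try <- ifelse(test, 0, df)
is.list(first_try)

# unlist() must return a single atomic vector, so mixed columns are
# coerced to one common type: here everything becomes character.
flat <- unlist(df)
class(flat)

# If the first column is a factor (the data.frame() default before R 4.0),
# unlist() falls back to its integer codes, which is one way strings can
# turn into numbers.
df_factor <- df
df_factor$name <- factor(df_factor$name)
flat_factor <- unlist(df_factor)
class(flat_factor)
```

Because `ifelse()` builds one vector shaped like its test, it cannot return a data frame whose columns keep different types, so working on one column at a time sidesteps the problem.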
Stendera, Sonja, Dr.
2012-Nov-15 14:33 UTC
[R] using ifelse to remove NA's from specific columns of a data frame containing strings and numbers
Hi everyone,

Please take me off this list! The unsubscribe function does not work.

Thanks!

Best wishes,
Sonja
Bert Gunter
2012-Nov-15 14:46 UTC
[R] using ifelse to remove NA's from specific columns of a data frame containing strings and numbers
David:

You seem to be getting lost in basic R tasks. Have you read the Intro to R tutorial? If not, do so, as this should tell you how to do what you need. If so, re-read the sections on indexing ("["), replacement, and NA's. Also read about character vectors and factors.

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
arun
2012-Nov-15 16:25 UTC
[R] using ifelse to remove NA's from specific columns of a data frame containing strings and numbers
Hi,

df1 <- read.table(text="
col1 col2 col3
A 15.5 8.5
A 8.5 7.5
A NA NA
B 8.0 6.0
B NA NA
B 9.0 10.0
", sep="", header=TRUE, stringsAsFactors=FALSE)
str(df1)
#'data.frame':    6 obs. of  3 variables:
# $ col1: chr  "A" "A" "A" "B" ...
# $ col2: num  15.5 8.5 NA 8 NA 9
# $ col3: num  8.5 7.5 NA 6 NA 10

# replace the NA's by indexing each numeric column directly
df1$col2[is.na(df1$col2)] <- 0
df1$col3[is.na(df1$col3)] <- 1
df1
#  col1 col2 col3
#1    A 15.5  8.5
#2    A  8.5  7.5
#3    A  0.0  1.0
#4    B  8.0  6.0
#5    B  0.0  1.0
#6    B  9.0 10.0

# or if you want to use ifelse() from the original df1
ifelse(is.na(df1$col2), 0, df1$col2)
#[1] 15.5  8.5  0.0  8.0  0.0  9.0
ifelse(is.na(df1$col3), 1, df1$col3)
#[1]  8.5  7.5  1.0  6.0  1.0 10.0

A.K.
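The column-by-column replacement in this reply extends naturally to any number of columns. The following is a minimal sketch, assuming a helper called `replace_na_by_column` (a hypothetical name, not a base R function) that takes a named vector of replacement values. The data frame is rebuilt by hand from the values shown in this reply.

```r
# Hypothetical helper, not part of base R: for each named column, replace
# its NA's with the matching value. Other columns are left untouched.
replace_na_by_column <- function(data, replacements) {
  for (nm in names(replacements)) {
    missing <- is.na(data[[nm]])
    data[[nm]][missing] <- replacements[[nm]]
  }
  data
}

# Same values as the df1 example in this reply.
df1 <- data.frame(col1 = c("A", "A", "A", "B", "B", "B"),
                  col2 = c(15.5, 8.5, NA, 8, NA, 9),
                  col3 = c(8.5, 7.5, NA, 6, NA, 10),
                  stringsAsFactors = FALSE)

# NA's in col2 become 0 and NA's in col3 become 1.
clean <- replace_na_by_column(df1, c(col2 = 0, col3 = 1))
```

Since only the named columns are modified, the character column is never passed through any coercion.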
soon yi
2012-Nov-15 19:29 UTC
[R] using ifelse to remove NA's from specific columns of a data frame containing strings and numbers
# Data (stringsAsFactors=FALSE keeps id as a character column)
df <- data.frame(id=letters[1:10], var1=rnorm(10,10,5), var2=rnorm(10,5,2), var3=rnorm(10,1,1), stringsAsFactors=FALSE)

# Missing values
df$var1[2] <- NA
df$var2[c(2,6)] <- NA
df$var3[c(2,5)] <- NA

# Replacement value for each column, by position: 0, 1, 2, 3
na.replace <- seq_len(ncol(df)) - 1

# lapply() rather than sapply(), so each column keeps its own type
df[] <- lapply(seq_along(df), function(ii) {
  ifelse(is.na(df[[ii]]), na.replace[ii], df[[ii]])
})
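One thing to watch with a column-wise apply on a mixed data frame is the choice between `sapply()` and `lapply()`. The sketch below, on a made-up two-column data frame (all names here are assumptions for illustration), shows that `sapply()` simplifies its results into a single matrix with one type, while `lapply()` returns one list element per column.

```r
# Hypothetical example data: one character column, one numeric column.
df <- data.frame(id = c("a", "b", "c"),
                 var1 = c(1.5, NA, 3),
                 stringsAsFactors = FALSE)

# sapply() simplifies equal-length results into one matrix, and a matrix
# has a single type, so the numeric column is converted to character.
as_matrix <- sapply(df, function(column) column)

# lapply() returns a list with one element per column, types intact.
as_list <- lapply(df, function(column) column)

# Assigning the list back with df[] keeps the data frame structure.
df[] <- lapply(df, function(column) {
  if (is.numeric(column)) column[is.na(column)] <- 0
  column
})
```

The `is.numeric()` check is a design choice for this sketch: it leaves non-numeric columns alone, which matches the original question, where NA's occur only in the numeric columns.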