Hello,
I have a big data.frame; a piece of it is shown below.

       a          b c          d
1  58009 2010-11-02 0         NA
2 114761         NA 1 2008-11-05
3 184440         NA 1 2009-12-08
4 189372         NA 0         NA
5 105286         NA 0         NA
6 186717         NA 0         NA
7 189106         NA 0         NA
8 127306         NA 0         NA
9 157342 2011-04-25 0         NA

I want to replace the NA values in b with "20011-07-28" where c==0. I use RStudio and I'm a novice.

--
View this message in context: http://r.789695.n4.nabble.com/conditional-data-replace-recode-change-or-whatsoever-tp3714715p3714715.html
Sent from the R help mailing list archive at Nabble.com.
Romain DOUMENC
2011-Aug-03 08:28 UTC
[R] conditional data replace (recode, change or whatsoever)
Please do your homework before asking the list: see An Introduction to R, chapter 7.

On 03.08.2011 10:05, zcatav wrote:
> I want to replace b[NA] values with "20011-07-28" where c==0.
Petr PIKAL
2011-Aug-03 09:18 UTC
[R] Odp: conditional data replace (recode, change or whatsoever)
Hi

> I want to replace b[NA] values with "20011-07-28" where c==0.

I believe there are better solutions, but I would do it in two steps.

Select the rows where c==0 (see also FAQ 7.31):

    sel <- which(big.data.frame$c == 0)

Change the NA values in column b based on sel:

    big.data.frame$b[sel][is.na(big.data.frame$b[sel])] <- "20011-07-28"

Beware of data types: AFAIK R cannot accept "20011-07-28" as a date.

Regards
Petr
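The two steps above can be sketched on a small made-up data frame. The name `df`, the sample rows, and the assumption that column b is stored as class Date are illustrative only; the original data were not posted in reproducible form. The replacement is given as a Date value with the corrected year, so the column keeps its class.

```r
# Hypothetical sample data; column b is assumed to be of class Date.
df <- data.frame(
  a = c(58009, 114761, 189372, 157342),
  b = as.Date(c("2010-11-02", NA, NA, "2011-04-25")),
  c = c(0, 1, 0, 0)
)

# Step 1: positions of the rows where c == 0.
sel <- which(df$c == 0)

# Step 2: within those rows only, fill the missing dates.
# Assigning a Date (not a character string) keeps the column class.
df$b[sel][is.na(df$b[sel])] <- as.Date("2011-07-28")

print(df)
```

Row 2 (c == 1) is left as NA, because it is never part of `sel`.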
zcatav
2011-Aug-03 11:02 UTC
[R] Odp: conditional data replace (recode, change or whatsoever)
Petr Pikal wrote:
> change NA values in b column based on sel
> big.data.frame$b[sel][is.na(big.data.frame$b[sel])]<-"20011-07-28"
>
> Beware of data types AFAIK R can not accept "20011-07-28" as a date.

Thanks, it works like a charm. The date format in the replacement value was just a typo.
R. Michael Weylandt <michael.weylandt@gmail.com>
2011-Aug-03 11:10 UTC
[R] conditional data replace (recode, change or whatsoever)
As others have noted, this is discussed in many free R tutorials, but if you want to do it in one line, I think this should do it:

    X[is.na(X[,"b"]) & (X[,"c"]==0), "b"] <- "2011-07-28"   # where X is the name of the data frame

It's a somewhat convoluted line of code, but if you read it inside out the logic is clear: find those rows where column b is NA and c is 0 by searching all rows of the relevant columns (the X[,something] syntax), select those rows and the b column, and put the desired date in those slots.

Let me know if I can clarify this further. I changed the date, assuming a typo on your end.

Welcome and good luck getting started with R,

Michael Weylandt

On Aug 3, 2011, at 4:05 AM, zcatav <zcatav at gmail.com> wrote:
> I want to replace b[NA] values with "20011-07-28" where c==0.
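Reading the one-liner "inside out" may be easier if the row selection is stored first. This is a minimal sketch on made-up rows: the sample values, the helper name `fill`, and the Date class of column b are assumptions for illustration.

```r
# Hypothetical sample data (not the original poster's full data set).
X <- data.frame(
  a = c(58009, 114761, 189372, 157342),
  b = as.Date(c("2010-11-02", NA, NA, "2011-04-25")),
  c = c(0, 1, 0, 0)
)

# Logical mask: TRUE only where b is missing AND c equals 0.
fill <- is.na(X[, "b"]) & (X[, "c"] == 0)
print(fill)

# Use the mask as the row index and "b" as the column index.
X[fill, "b"] <- as.Date("2011-07-28")
print(X)
```

Note that the function is `is.na()` in lower case; R is case sensitive.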
Your suggestion works perfectly, as I pointed out in my previous message. Now I have another question about data editing. I tried this code:

    X[X[,"c"]==1,"b"]<-X[,"d"]

and it results in an error:

    `[<-.data.frame`(`*tmp*`, X[, "c"] == 1, "b", value = c(NA, :
      replacement has 9 rows, data has 2

Logically, I selected 2 rows with X[,"c"]==1. Then I want to replace b in those rows with the value of d from the same row, as in X[,"b"]<-X[,"d"]. What is wrong?
Gabor Grothendieck
2011-Aug-03 12:42 UTC
[R] conditional data replace (recode, change or whatsoever)
On Wed, Aug 3, 2011 at 8:09 AM, zcatav <zcatav at gmail.com> wrote:
> Logically i selected 2 rows with X[,"c"]==1. Than i want to replace in that
> rows its own data from "d" to "b" with X[,"b"]<-X[,"d"]. What is wrong?

Also check out transform and ifelse, e.g.

    transform(X, b = ifelse(is.na(b) & c == 0, "2011-07-28", b))

    transform(X, b = ifelse(c == 1, d, c))

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
David Winsemius
2011-Aug-03 13:08 UTC
[R] conditional data replace (recode, change or whatsoever)
On Aug 3, 2011, at 8:09 AM, zcatav wrote:
> Logically i selected 2 rows with X[,"c"]==1. Than i want to replace
> in that rows its own data from "d" to "b" with X[,"b"]<-X[,"d"].
> What is wrong?

You need to apply the same logical test/selection to the rows of the RHS as you are applying on the LHS. Possibly:

    X[ X[,"c"]==1, "b"] <- X[ X[,"c"]==1, "d"]

(No data, not tested code.)

--
David Winsemius, MD
West Hartford, CT
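The length mismatch behind the error can be seen on a small example. This is a sketch under assumptions: the sample rows and the helper name `rows` are made up, and both date columns are assumed to be of class Date.

```r
# Hypothetical sample data with two rows where c == 1.
X <- data.frame(
  b = as.Date(c("2010-11-02", NA, NA, NA)),
  c = c(0, 1, 1, 0),
  d = as.Date(c(NA, "2008-11-05", "2009-12-08", NA))
)

# Compute the row selection once and reuse it on both sides.
rows <- X[, "c"] == 1

# The left side addresses sum(rows) cells, so the right side must
# supply the same number of values, not the whole column d.
print(sum(rows))
print(length(X[, "d"]))
print(length(X[rows, "d"]))

X[rows, "b"] <- X[rows, "d"]
print(X)
```

Storing the selection in one variable also avoids the risk of the two conditions drifting apart when the code is edited later.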
Gabor Grothendieck wrote:
> Also check out transform and ifelse, e.g.
>
> transform(X, b = ifelse(is.na(b) & c == 0, "2011-07-28", b))
>
> transform(X, b = ifelse(c == 1, d, c))

The first line,

    transform(X, b = ifelse(is.na(b) & c == 0, "2011-07-28", b))

gives the following result. The data at [1,b] and [9,b] are not handled as Date.

       a          b c          d
1  58009      14915 0       <NA>
2 114761       <NA> 1 2008-11-05
3 184440       <NA> 1 2009-12-08
4 189372 2011-07-28 0       <NA>
5 105286 2011-07-28 0       <NA>
6 186717 2011-07-28 0       <NA>
7 189106 2011-07-28 0       <NA>
8 127306 2011-07-28 0       <NA>
9 157342      15089 0       <NA>

And the second line,

    transform(X, b = ifelse(c == 1, d, c))

gives the following result. The data in [,b] are completely lost.

       a     b c          d
1  58009     1 0       <NA>
2 114761 14188 1 2008-11-05
3 184440 14586 1 2009-12-08
4 189372     1 0       <NA>
5 105286     1 0       <NA>
6 186717     1 0       <NA>
7 189106     1 0       <NA>
8 127306     1 0       <NA>
9 157342     1 0       <NA>

I think this solution is not suitable for me.
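The bare numbers in the output above are consistent with how `ifelse()` is documented to work: the result takes its attributes from the test vector, so the Date class of the inputs is dropped and the internal day counts (days since 1970-01-01) are shown. The following sketch, on made-up rows, reproduces the effect and shows two possible ways around it; the names `raw` and `b_fixed` are hypothetical.

```r
# Hypothetical sample data; b is assumed to be of class Date.
X <- data.frame(
  b = as.Date(c("2010-11-02", NA, NA)),
  c = c(0, 1, 0)
)

# ifelse() builds its result from the logical test vector, so the
# "Date" class is lost and plain day counts come back.
raw <- ifelse(is.na(X$b) & X$c == 0, as.Date("2011-07-28"), X$b)
print(raw)
print(class(raw))

# Option 1: convert the day counts back to Date.
X$b_fixed <- as.Date(raw, origin = "1970-01-01")

# Option 2: skip ifelse() and assign through a logical index,
# which leaves the class of the column untouched.
X$b[is.na(X$b) & X$c == 0] <- as.Date("2011-07-28")
print(X)
```

Both options should give the same dates here; the second one is the indexing approach already suggested earlier in this thread.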
David Winsemius wrote:
> You need to apply the same logical test/selection on the rows of the
> RHS as you are doing on the LHS.
> Possibly:
>
> X[ X[,"c"]==1, "b"] <- X[ X[,"c"]==1, "d"]

This solution was suggested by R. Michael Weylandt and it works great.
zcatav <zcatav <at> gmail.com> writes:
> Your suggestion works perfect as i pointed previous message. Now have another
> question about data editing. I try this code:
> X[X[,"c"]==1,"b"]<-X[,"d"]
> and results with error: `[<-.data.frame`(`*tmp*`, X[, "c"] == 1, "b", value
> = c(NA, :
>   replacement has 9 rows, data has 2
>
> Logically i selected 2 rows with X[,"c"]==1. Than i want to replace in that
> rows its own data from "d" to "b" with X[,"b"]<-X[,"d"]. What is wrong?

Is the corrected matrix-style assignment equivalent and/or preferred to:

    X$b[X$c==1] <- X$d[X$c==1]

??

I assume this goes back to the various indexing methods for a data frame: a vector that is a column of a data frame vs. a data frame that happens to be one column of a larger data frame. On a very large data set, is one preferable for speed? One for memory use? I tend to index using the $ operator often, and if I should quit, let me know!!

Thanks,
Justin
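Whether the two forms give the same result can at least be checked on a small sample. This is only a sketch: the data and the names `X1` and `X2` are made up, and it says nothing about speed or memory use on large data, which would have to be measured (for example with `system.time()`).

```r
# Hypothetical sample data.
X <- data.frame(
  b = as.Date(c("2010-11-02", NA, NA, NA)),
  c = c(0, 1, 1, 0),
  d = as.Date(c(NA, "2008-11-05", "2009-12-08", NA))
)

# Form 1: matrix-style indexing on the data frame.
X1 <- X
X1[X1[, "c"] == 1, "b"] <- X1[X1[, "c"] == 1, "d"]

# Form 2: extract the column vectors with $ and index those.
X2 <- X
X2$b[X2$c == 1] <- X2$d[X2$c == 1]

# Compare the two results on this sample.
print(all.equal(X1, X2))
```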