Hi,

I have two dataframes:

The first, df1, contains some missing data:

  cola colb colc cold cole
1   NA    5    9   NA   17
2   NA    6   NA   14   NA
3    3   NA   11   15   19
4    4    8   12   NA   20

The second, df2, contains the following:

  cola colb colc cold cole
1  1.4  0.8 0.02  1.6  0.6

I want all missing data in df1$cola to be replaced by the value of df2$cola, then the missing data in df1$colb to be replaced with the corresponding value in df2$colb, etc.

I can get this to work column by column with single input lines, but as my original dataset is a lot larger I want to create a loop and can't work out how.

The single line command is:

    df1$cola[is.na(df1$cola)]<-df2$cola

I've tried a replace function within a loop but get error messages:

    list<-colnames(df1)

    for (i in list) {
    r<-replace(df1$i,df1$i[is.na(df1$i)],df2$i)
    }

with error messages of:

    Warning messages:
    1: In is.na(mymat$snp) :
      is.na() applied to non-(list or vector) of type 'NULL'

Can anyone help me with this?

Thanks

--
View this message in context: http://r.789695.n4.nabble.com/Help-with-loop-tp4636140.html
Sent from the R help mailing list archive at Nabble.com.

Hello,

A one-liner could be

    df1 <- read.table(text="
      cola colb colc cold cole
    1   NA    5    9   NA   17
    2   NA    6   NA   14   NA
    3    3   NA   11   15   19
    4    4    8   12   NA   20
    ", header=TRUE)

    df2 <- read.table(text="
      cola colb colc cold cole
    1  1.4  0.8 0.02  1.6  0.6
    ", header=TRUE)

    sapply(names(df1), function(nm) {df1[[nm]][is.na(df1[[nm]])] <- df2[[nm]]; df1[[nm]]})

Avoid loops, use *apply.

Hope this helps,

Rui Barradas

On 11-07-2012 15:11, paulalou wrote:
> I'm wanting all missing data in df1$cola to be replaced by the value of
> df2$cola. Then the missing data in df1$colb to be replaced with the
> corresponding value in df2$colb etc.
> [...]
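
One caveat on the one-liner above: sapply() simplifies its result to a numeric matrix and leaves df1 itself unchanged, because the assignment happens on a copy inside the anonymous function. A minimal sketch that keeps a data frame instead, assuming df2 has exactly one row and the same column names as df1 (fill_from and filled are hypothetical names, not from the thread):

```r
df1 <- data.frame(cola = c(NA, NA, 3, 4),
                  colb = c(5, 6, NA, 8),
                  colc = c(9, NA, 11, 12),
                  cold = c(NA, 14, 15, NA),
                  cole = c(17, NA, 19, 20))
df2 <- data.frame(cola = 1.4, colb = 0.8, colc = 0.02, cold = 1.6, cole = 0.6)

# Hypothetical helper: replace the NAs in one column with a single fill value.
fill_from <- function(col, fill) {
  col[is.na(col)] <- fill
  col
}

# Map() pairs each column of df1 with the same-named column of df2.
# Assigning into filled[] keeps the data.frame class and the column names.
filled <- df1
filled[] <- Map(fill_from, df1, df2[names(df1)])
filled
```

Indexing df2 by names(df1) guards against the two data frames listing their columns in a different order.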

I think I just learned this myself:

Don't put the $ extension in the bracket:

    df1$cola[is.na(df1$cola)] <- df2$cola

Instead, substitute using quoted names in double brackets:

    df1[["cola"]][is.na(df1[["cola"]])] <- df2[["cola"]]

Then the "cola"s can be replaced by a variable holding the column name.

Maybe this will help.

--
Charles Stangor
Professor and Associate Chair
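
To make the point about `$` concrete: `$` takes its argument literally, so df1$i looks for a column actually named "i" and returns NULL, which is where the "type 'NULL'" warning in the question comes from. A minimal sketch of the loop with `[[` instead, using the example data from the question (the names nm and missing_rows are mine, and df2 is assumed to have one row):

```r
df1 <- data.frame(cola = c(NA, NA, 3, 4),
                  colb = c(5, 6, NA, 8),
                  colc = c(9, NA, 11, 12),
                  cold = c(NA, 14, 15, NA),
                  cole = c(17, NA, 19, 20))
df2 <- data.frame(cola = 1.4, colb = 0.8, colc = 0.02, cold = 1.6, cole = 0.6)

# `[[` evaluates nm and uses the column name stored in it.
for (nm in colnames(df1)) {
  missing_rows <- is.na(df1[[nm]])
  # The single value from df2 is recycled over all missing rows.
  df1[[nm]][missing_rows] <- df2[[nm]]
}
df1
```

Unlike the replace() attempt, this assigns back into df1 on every pass, so the result is not overwritten by the next column.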

Hi,

Try this:

    func1<-function(x,y,z)
     {ifelse(is.na(y[[x]]),z[[x]],y[[x]])}
    dat3<-data.frame(lapply(colnames(df1),function(x) func1(x,df1,df2)))
    colnames(dat3)<-colnames(df1)
    dat3
      cola colb  colc cold cole
    1  1.4  5.0  9.00  1.6 17.0
    2  1.4  6.0  0.02 14.0  0.6
    3  3.0  0.8 11.00 15.0 19.0
    4  4.0  8.0 12.00  1.6 20.0

    #or
    sapply(colnames(df1),function(x) func1(x,df1,df2))

A.K.

----- Original Message -----
From: paulalou <pls28 at medschl.cam.ac.uk>
To: r-help at r-project.org
Sent: Wednesday, July 11, 2012 10:11 AM
Subject: [R] Help with loop

Hello there,

I'm an R beginner and got plunged into this. I guess my attempts are hopeless so far, so I won't even show them.

I want to write a loop which prints all erroneous values. My definition of erroneous: if the current counts (partridge counts in a hunting district) differ from last year's counts by more than 50 percent and the absolute values differ by more than 5 animals, I want R to print these values. I have a grouping variable for the district, "D", the year "Y" and the counts "C".

Example table:

D  Y     C
a  2005  10
a  2006  0
a  2007  9
b  2005  1
b  2006  0
b  2007  1
c  2005  5
c  2006  NA
c  2007  4

Although the difference in a and b is 100 percent, I would doubt a's population breakdown, whereas district b is credible. To confuse things, I want the loop to skip missing values and instead look at the year after.

Any help is very much appreciated!

Thanks,
Katrin
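
One way to read the rule above, as a rough sketch rather than a definitive answer: within each district, drop the missing counts so that every remaining count is compared with the most recent non-missing one, then flag rows where the absolute change exceeds 5 animals and the relative change exceeds 50 percent. The names counts, flag_jumps and flagged are hypothetical, and measuring the percentage against the earlier count is an assumption; under it, a rise from 0 is an infinite relative change and is flagged whenever the absolute change is large enough.

```r
counts <- data.frame(
  D = rep(c("a", "b", "c"), each = 3),
  Y = rep(2005:2007, times = 3),
  C = c(10, 0, 9, 1, 0, 1, 5, NA, 4)
)

# Hypothetical helper: flag suspicious year-to-year jumps within one district.
flag_jumps <- function(d) {
  d <- d[!is.na(d$C), ]                    # skip years with missing counts
  d <- d[order(d$Y), ]                     # make sure years are in sequence
  prev <- c(NA, d$C)[seq_len(nrow(d))]     # previous non-missing count
  abs_diff <- abs(d$C - prev)
  rel_diff <- abs_diff / prev              # assumption: relative to earlier count
  d$prev <- prev
  # which() drops the NA comparisons produced by each district's first year.
  d[which(abs_diff > 5 & rel_diff > 0.5), ]
}

# Apply the helper per district and stack the flagged rows.
flagged <- do.call(rbind, lapply(split(counts, counts$D), flag_jumps))
print(flagged)
```

On the example table this flags district a in 2006 and 2007 and nothing in districts b or c, which matches the intuition that a's collapse is doubtful while b's is credible.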