Hi:
Here's one approach:
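f <- function(df) {
    # per-stratum ratio sum(y)/sum(x), using only the complete rows
    rs <- with(na.exclude(df), tapply(y, strata, sum)/tapply(x, strata, sum))
    # rows with missing y, filled in as x * (ratio for that row's stratum)
    u <- transform(subset(df, is.na(y)), y = x * rs[strata])
    # put the filled-in values back into df at positions u$id
    transform(df, y = replace(y, u$id, u$y))
}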
f(df)
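As a sanity check on what the three filled-in values should be for your example data, here is the arithmetic written out by hand. This is only a sketch; the names obs, r1, r2 and expected are mine and are not part of the function above.

```r
# Example data from the original post (quoted below)
df <- data.frame(id = 1:10,
                 strata = rep(c(1, 2), c(5, 5)),
                 y = c(10, 12, 10, NA, 15, 70, NA, NA, 55, 100),
                 x = c(3, 4, 5, 7, 4, 10, 12, 8, 3, 15))

obs <- df[!is.na(df$y), ]   # rows where y is observed

# Ratio sum(y)/sum(x) in each stratum, written out explicitly
r1 <- sum(obs$y[obs$strata == 1]) / sum(obs$x[obs$strata == 1])  # 47/16
r2 <- sum(obs$y[obs$strata == 2]) / sum(obs$x[obs$strata == 2])  # 225/28

# Values for id 4 (x = 7), id 7 (x = 12) and id 8 (x = 8)
expected <- c(7 * r1, 12 * r2, 8 * r2)
expected   # roughly 20.56, 96.43, 64.29
```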
The function works as follows:
(1) Using only the rows of the data frame where y is not missing, find the
    sum(y)/sum(x) ratio in each stratum. rs is a vector whose length is the
    number of strata. (Hopefully, all of your x-sums are nonzero...) If you
    have missing x values in your real data, you need to think about how
    you want to handle them.
(2) In a sub-data frame u containing the rows with missing y, replace each
    missing y with the value of x times the value of rs for its stratum.
    (rs[strata] picks the ratio by position, which works here because your
    strata are coded 1, 2.)
(3) Replace the missing y's in df with the y's from u, using the id numbers
    in u as positions. (subset() keeps the id column, BTW, which is what
    makes this possible; it assumes id equals the row number, as it does
    in your example.)
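If your real data have id values that are not the row numbers, strata that are not coded 1, 2, ..., or missing x values, a variant along the following lines may be safer. This is a minimal sketch, not a tested replacement: the name impute_ratio is made up for illustration, and dropping rows with missing x from the ratio is just one possible choice. Zero x-sums are not treated specially.

```r
# Hypothetical variant of f(); the function name is illustrative only.
impute_ratio <- function(df) {
    # Estimate the ratios from rows where both y and x are observed
    # (assumption: rows with a missing x are simply left out of the sums)
    ok <- !is.na(df$y) & !is.na(df$x)
    rs <- tapply(df$y[ok], df$strata[ok], sum) /
          tapply(df$x[ok], df$strata[ok], sum)

    # Fill the missing y's, looking up each ratio by stratum name
    # rather than by position; rows are selected by a logical index,
    # so the id column is not used at all
    miss <- is.na(df$y)
    df$y[miss] <- df$x[miss] * as.vector(rs[as.character(df$strata[miss])])
    df
}

# Example data from the original post
df <- data.frame(id = 1:10,
                 strata = rep(c(1, 2), c(5, 5)),
                 y = c(10, 12, 10, NA, 15, 70, NA, NA, 55, 100),
                 x = c(3, 4, 5, 7, 4, 10, 12, 8, 3, 15))
out <- impute_ratio(df)
out
```

A row with a missing y and a missing x still ends up with a missing y, since NA times anything is NA.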
HTH,
Dennis
On Tue, Jan 25, 2011 at 9:40 AM, andrija djurovic <djandrija@gmail.com> wrote:
> Hello R user,
>
> I have the following data frame:
>
> df=data.frame(id=c(1:10),strata=rep(c(1,2),c(5,5)),y=c(
> 10,12,10,NA,15,70,NA,NA,55,100),x=c(3,4,5,7,4,10,12,8,3,15))
>
> and I would like to replace the NA's as follows:
>
> the first NA with
> tapply(na.exclude(df)$y,na.exclude(df)$strata,sum)[1]*7/
>   tapply(na.exclude(df)$x,na.exclude(df)$strata,sum)[1]
> where 7 is the value of x (id=4) in stratum 1 where y=NA
>
> the second NA with
> tapply(na.exclude(df)$y,na.exclude(df)$strata,sum)[2]*12/
>   tapply(na.exclude(df)$x,na.exclude(df)$strata,sum)[2]
> where 12 is the value of x (id=7) in stratum 2 where y=NA
>
> the third NA with
> tapply(na.exclude(df)$y,na.exclude(df)$strata,sum)[2]*8/
>   tapply(na.exclude(df)$x,na.exclude(df)$strata,sum)[2]
> where 8 is the value of x (id=8) in stratum 2 where y=NA.
>
> So, I would like to replace the NA's within each stratum in the way
> explained above.
>
> Does anyone know how to do this?
>
> thanks in advance
>
> Andrija
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help