Hi to all.

I've been trying to calculate weight-for-age z-scores with the y2z
command (AGD package).

However, I keep getting strange results.

My hypothesis is that missings are the problem.

My dataframe looks like this:

data <- structure(list(sex = structure(c(3L, 3L, 3L, 2L, 3L), .Label = c("",
"F", "M"), class = "factor"), weight = c(8.5, 8.2, 9, NA, 5.8),
age = c(8, 9, 12, 9, 1)), .Names = c("sex", "weight", "age"
), class = "data.frame", row.names = c(NA, 5L))

Weight is in kg and age in months.
I will use WHO curves for children younger than 2 years of age.

z-score calculation:

library(AGD)
data$zeta <- y2z(y = data$weight, x = data$age/12, sex = data$sex,
                 ref = get("who.wgt"))

I get:

Warning message:
In `split<-.default`(`*tmp*`, f, drop = drop, value = value) :
  number of items to replace is not a multiple of replacement length

data$zeta
[1]     NA     NA     NA -0.124     NA

However a for loop seems to work.

for (i in 1:5) {
  data$zeta[i] <- y2z(y = data$weight[i],
                      x = data$age[i]/12,
                      sex = data$sex[i],
                      ref = get("who.wgt"))
}

data$zeta
[1] -0.124 -0.751 -0.635     NA  2.002

Is there a workaround so that I don't have to use a for loop?
na.action doesn't work either.

Thanks.

Martin
It looks like the y2z() function strips NA's so that the vector lengths do not match any longer. The simplest workaround is to remove the NA's. You could do that by using data2 <- na.omit(data) to strip the observations with NA if they will not be used in the rest of the analysis.

If you want to preserve the NAs in the data frame, this seems to work:

> nomiss <- complete.cases(data)
> data$zeta[nomiss] <- with(data[nomiss, ], y2z(weight, age/12, sex, ref=who.wgt))
> data
  sex weight age   zeta
1   M    8.5   8 -0.124
2   M    8.2   9 -0.751
3   M    9.0  12 -0.635
4   F     NA   9     NA
5   M    5.8   1  2.002

David L. Carlson
Department of Anthropology
Texas A&M University

-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Martin Canon
Sent: Sunday, October 25, 2015 7:03 AM
To: R help <r-help at r-project.org>
Subject: [R] y2z question (AGD)

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
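The diagnosis and the complete.cases() workaround can be tried without the AGD package installed. The sketch below is only an illustration under stated assumptions: drop_na_scale() is a hypothetical stand-in that drops NA inputs and therefore returns a shorter vector than it was given. It is not how y2z() is implemented, and the values it returns are not WHO z-scores.

```r
# Hypothetical stand-in (NOT the AGD implementation): it drops NA inputs,
# so its result is shorter than its input whenever NAs are present.
drop_na_scale <- function(y) {
  y <- y[!is.na(y)]
  (y - mean(y)) / sd(y)
}

d <- data.frame(weight = c(8.5, 8.2, 9, NA, 5.8),
                age    = c(8, 9, 12, 9, 1))

# The length mismatch that causes trouble on direct assignment:
n_out <- length(drop_na_scale(d$weight))   # 4, while nrow(d) is 5

# Rows with no missing value in any column
nomiss <- complete.cases(d)

# Pre-fill the result column with NA, then assign only into complete rows
d$z <- NA_real_
d$z[nomiss] <- drop_na_scale(d$weight[nomiss])
```

Pre-filling the column with NA_real_ is a choice made for this sketch; it keeps the result independent of whatever an earlier failed attempt may have left in the column.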
Thank you, David.

This solves the problem.

Regards,

Martin

On Sun, Oct 25, 2015 at 12:53 PM, David L Carlson <dcarlson at tamu.edu> wrote:
> If you want to preserve the NAs in the data frame, this seems to work:
>
> > nomiss <- complete.cases(data)
> > data$zeta[nomiss] <- with(data[nomiss, ], y2z(weight, age/12, sex,
> ref=who.wgt))
You can also use the na.exclude function, which is like na.omit except that the "na.action" attribute it attaches, telling which rows were removed, has class "exclude". Then use the naresid function, which uses that attribute to insert NA's into the right places in the output of a function that only works properly on NA-less data. E.g.,

cleanData <- na.exclude(data)
data$zeta <- naresid(attr(cleanData, "na.action"),
                     with(cleanData, y2z(weight, age/12, sex, ref=who.wgt)))
data
#  sex weight age   zeta
#1   M    8.5   8 -0.124
#2   M    8.2   9 -0.751
#3   M    9.0  12 -0.635
#4   F     NA   9     NA
#5   M    5.8   1  2.002

It is a little wordy, but it does handle both vector and matrix data.

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Sun, Oct 25, 2015 at 10:53 AM, David L Carlson <dcarlson at tamu.edu> wrote:
> It looks like the y2z() function strips NA's so that the vector lengths do not match any longer. The simplest workaround is to remove the NA's.
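The na.exclude()/naresid() pattern can likewise be checked in base R alone. This is a minimal sketch, assuming the same hypothetical stand-in drop_na_scale() in place of y2z(); only na.exclude(), attr() and naresid() are the real functions under discussion, and the numbers produced are not WHO z-scores.

```r
# Hypothetical stand-in for a function that only behaves on NA-free input
# (NOT the AGD implementation).
drop_na_scale <- function(y) {
  y <- y[!is.na(y)]
  (y - mean(y)) / sd(y)
}

d <- data.frame(weight = c(8.5, 8.2, 9, NA, 5.8),
                age    = c(8, 9, 12, 9, 1))

# Drop incomplete rows; the removed row numbers are kept in "na.action"
clean   <- na.exclude(d)
omitted <- attr(clean, "na.action")

# Compute on the NA-free rows only (4 values)
z_clean <- drop_na_scale(clean$weight)

# Pad the result back to the original length, NA where rows were dropped
d$z <- naresid(omitted, z_clean)
```

The padding relies on the attribute having class "exclude"; with the attribute produced by na.omit(), naresid() would return its input unchanged and the lengths would again disagree.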