Iulia Dumitru
2022-Aug-02 09:18 UTC
[R] I don't understand the result of `svyboxplot` function from the Survey package
After following the example given here: https://www.rdocumentation.org/packages/survey/versions/4.1-1/topics/svyhist for `svyboxplot` I get the result in the attached image. This is a box plot of the `enroll` variable from the stratified dataset `apistrat`, grouped by `stype`: E (elementary school), M (middle school) and H (high school).

If I use the `svyby` function to group the data by `stype` and find the mean for each group, I get this result:

> svyby(~enroll, ~stype, dstrat, svymean)
  stype  enroll       se
E     E  416.78 16.41740
H     H 1320.70 91.70781
M     M  832.48 54.52157

Clearly the means are very different from each other. Then why don't the box plots show this? I don't know how to interpret the plot. Could someone please offer some insight on this? Thank you!
Anthony Damico
2022-Aug-02 11:49 UTC
[R] I don't understand the result of `svyboxplot` function from the Survey package
hi, nice catch! i'm ccing the author of the survey package because this might be an issue. when i run the examples in ?svyboxplot, i also see all three boxes in the exact same place. it seems like the svyby() call inside of svyboxplot does something unexpected when svyquantile gets passed using keep.var = FALSE and ci = FALSE.

library(survey)
data(api)
dstrat <- svydesign(id = ~1, strata = ~stype, weights = ~pw, data = apistrat, fpc = ~fpc)

# looks OK
svyby(~enroll, ~stype, dstrat, svyquantile, quantiles = c(0, 0.25, 0.5, 0.75, 1), na.rm = TRUE)

# returns each result three times in an unexpected configuration..
# svyboxplot then grabs the repeated information from the first six columns
svyby(~enroll, ~stype, dstrat, svyquantile, ci = FALSE, keep.var = FALSE, quantiles = c(0, 0.25, 0.5, 0.75, 1), na.rm = TRUE)
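As a cross-check while the svyby() behaviour is being looked at, the per-group quantiles can also be computed without svyby() at all, by subsetting the design one school type at a time. This is only a minimal sketch and not a fix for svyboxplot: it reuses the dstrat design from the example, and the loop variable `s` and the `probs` vector are names introduced here for illustration.

```r
library(survey)
data(api)

# Same stratified design as in the example above
dstrat <- svydesign(id = ~1, strata = ~stype, weights = ~pw,
                    data = apistrat, fpc = ~fpc)

# Quantiles a box plot is built from (min, quartiles, max)
probs <- c(0, 0.25, 0.5, 0.75, 1)

# Compute the quantiles separately for each school type,
# bypassing svyby() entirely
for (s in unique(as.character(apistrat$stype))) {
  cat("stype =", s, "\n")
  print(svyquantile(~enroll, subset(dstrat, stype == s),
                    quantiles = probs, na.rm = TRUE))
}
```

If the three groups print clearly different quartiles here, that would point at the svyby() step rather than at the data or the design.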