Cade, Brian
2017-Dec-05 20:17 UTC
[R] warnings about factor levels dropped from predict.glm
I am helping a student with some logistic regression analyses, and we are seeing an inconsistent warning about factor levels being dropped when running predict.glm(, newdata = ournewdata) on the fitted logistic regression model object. We have checked multiple times that the factor levels are defined the same way in both data sets (the one used to estimate the model and the newdata) and that values occur for all factor levels in both data sets.

When I run these commands on my version of R (3.2.5) on Windows 7, I do not get the warnings. When the student runs them on her version of R (I am not sure which version she has) on her Mac, she gets the warnings constantly. I've checked some records manually by doing the algebra, and predict.glm() is correctly incorporating the factor levels on my machine.

Any thoughts?

Brian

Brian S. Cade, PhD
U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO 80526-8818
email: cadeb at usgs.gov <brian_cade at usgs.gov>
tel: 970 226-9326
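The check described above (same declared levels in both data sets, and every level actually observed) could be scripted roughly as follows. This is only a sketch: the data frames `train` and `newdat`, the column `habitat`, and the helper `compare_factors()` are hypothetical names made up for illustration, not anything from the original analysis.

```r
# Hypothetical stand-ins for the model data and the newdata.
train  <- data.frame(y = c(0, 1, 0, 1, 1, 0),
                     habitat = factor(c("a", "b", "c", "a", "b", "c")))
newdat <- data.frame(habitat = factor(c("a", "b", "b"),
                                      levels = c("a", "b", "c")))

# For every factor in fit_data that also appears in new_data, report whether
# the declared levels match and which declared levels never occur in new_data.
compare_factors <- function(fit_data, new_data) {
  shared <- intersect(names(fit_data), names(new_data))
  is_fac <- vapply(shared, function(v) is.factor(fit_data[[v]]), logical(1))
  shared <- shared[is_fac]
  lapply(setNames(shared, shared), function(v) {
    list(
      same_levels   = identical(levels(fit_data[[v]]), levels(new_data[[v]])),
      unused_in_new = setdiff(levels(new_data[[v]]),
                              unique(as.character(new_data[[v]])))
    )
  })
}

res <- compare_factors(train, newdat)
str(res)
```

In this toy example the levels are declared identically, but level "c" never occurs in `newdat`, which is the kind of mismatch worth ruling out on both machines.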
Bert Gunter
2017-Dec-05 20:37 UTC
[R] warnings about factor levels dropped from predict.glm
A guess (treat accordingly):

Different BLAS versions are in use on the two different machines/versions. In one, near singularities are handled, and in the other they are not, percolating up to warnings at the R level.

You can check this by seeing whether the estimated fit is the same on the 2 machines. If so, ignore the above.

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
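One way to compare the estimated fit across the two machines is to save the coefficients on one, transfer the file, and compare within numerical tolerance on the other. The sketch below simulates that round trip in a single session; the simulated data, the model formula, and the temporary file are assumptions for illustration only.

```r
# Simulated stand-in data; the real model and data would be used instead.
set.seed(1)
d <- data.frame(x = rnorm(50), g = factor(rep(c("a", "b"), 25)))
d$y <- rbinom(50, 1, plogis(0.5 * d$x))

# Machine 1: fit the model and write the coefficients to a file.
fit1 <- glm(y ~ x + g, family = binomial, data = d)
coef_file <- tempfile(fileext = ".rds")
saveRDS(coef(fit1), coef_file)

# Machine 2: refit the same model and compare against the saved coefficients.
fit2 <- glm(y ~ x + g, family = binomial, data = d)
same_fit <- isTRUE(all.equal(readRDS(coef_file), coef(fit2)))
same_fit

# NA coefficients would indicate terms dropped because of singularities.
any_aliased <- anyNA(coef(fit2))
any_aliased
```

all.equal() compares within a small numerical tolerance, so tiny floating-point differences between platforms would not be flagged, while a genuinely different fit would be.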
Marc Schwartz
2017-Dec-05 21:13 UTC
[R] warnings about factor levels dropped from predict.glm
Hi,

I suspect that the warning may be coming from stats::model.frame.default(), with text along the lines of:

"contrasts dropped from factor YOUR.FACTOR.NAME due to missing levels"

You might want to see if the student has a ~/.Rprofile file that modifies default options regarding contrasts, etc.

Check whether the structure of the data frames in use differs between the two systems, specifically any contrast-related attributes on the relevant data frame columns. See ?str.

Have the student open an R session from the macOS terminal and run R using:

R --vanilla

to see whether the same warnings appear on her system. If they do not, it suggests that her .Rprofile file has something non-default in it, and/or that there is a .RData file in her working directory containing saved workspace objects that cause a conflict, as that file is loaded by default with a new R session.

Regards,

Marc Schwartz
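The checks suggested above (session-level contrast options and contrast attributes on individual columns) could be run on both machines with something like the following. It is a rough sketch only: the data frame `d`, its columns, and the helper `has_contrasts()` are hypothetical, and the sum-to-zero contrasts are set here purely to have something to detect.

```r
# Session-level default; in an unmodified session this is
# unordered = "contr.treatment", ordered = "contr.poly".
getOption("contrasts")

# Hypothetical data frame with an explicit contrasts attribute on one factor.
d <- data.frame(g = factor(c("a", "b", "c")), x = 1:3)
contrasts(d$g) <- contr.sum(3)

# Flag columns that carry their own "contrasts" attribute.
has_contrasts <- function(df) {
  vapply(df, function(col) !is.null(attr(col, "contrasts")), logical(1))
}

flags <- has_contrasts(d)
flags

# str() also displays such attributes, as suggested via ?str.
str(d)
```

Running the same lines on both systems, ideally in a session started with R --vanilla, and comparing the output should show whether a column-level attribute or a non-default option differs.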