Thanks Bill for your quick reply. I tried your solution and it did work for the simple user defined function xploly. But when I try with other function, it gave me error again: OPoly<-function(x,degree=1,weight=1){ weight=round(weight,0)# weight need to be integer if(length(weight)!=length(x))weight=rep(1,length(x)) p=poly(4*(rep(x,weight)-mean(range(x)))/diff(range(x)),degree) Z<-(t(t(p[cumsum(weight),])*sqrt(attr(p,"coefs")$norm2[-seq(2)]))[,degree]) class(Z)<-"OPoly";Z } ##this OPoly is an FORTRAN orthogonal polynomial routine, it first maps the x to range[-2,2] then do QR, then return the results with sqrt(norm2). Comparing with poly, this transformation will make the model coefficients within a similar range as other variables, the R poly routine will usually give you a very large coefficients. I did not find such routine in R, so I have to define this as user defined function. ####### I also have following function as you suggested: makepredictcall.OPoly<-function(var,call) { if (is.call(call)) { if (identical(call[[1]], quote(OPoly))) { if (!is.null(tmp <- attr(var, "coefs"))) { call$coefs <- tmp } } } call } But I still got error for following:> g3=glm(lot1 ~ log(u) + OPoly(u,1), data = clotting, family = Gamma)> predict(g3,dc)Error in poly(4 * (rep(x, weight) - mean(range(x)))/diff(range(x)), degree) :missing values are not allowed in 'poly' I thought it might be due to the /diff(range(x) in the function. But even I remove that part, it will still give me error. Any idea? Many thanks in advance. Alex On Thu, Jul 16, 2015 at 2:09 PM, William Dunlap <wdunlap at tibco.com> wrote:> Read about the 'makepredictcall' generic function. There is a method, > makepredictcall.poly(), for poly() that attaches the polynomial > coefficients > used during the fitting procedure to the call to poly() that predict() > makes. > You ought to supply a similar method for your xpoly(), and xpoly() needs > to return an object of a a new class that will cause that method to be used. > > E.g., > > xpoly <- function(x,degree=1,...){ ret <- poly(x,degree=degree,...); > class(ret) <- "xpoly" ; ret } > makepredictcall.xpoly <- function (var, call) > { > if (is.call(call)) { > if (identical(call[[1]], quote(xpoly))) { > if (!is.null(tmp <- attr(var, "coefs"))) { > call$coefs <- tmp > } > } > } > call > } > > g2 <- glm(lot1 ~ log(u) + xpoly(u,1), data = clotting, family = Gamma) > predict(g2,dc) > # 1 2 3 4 > 5 > #-0.01398928608 -0.01398928608 -0.01398928608 -0.01398928608 > #-0.01398928608 > # 6 7 8 9 > #-0.01398928608 -0.01398928608 -0.01398928608 -0.01398928608 > > You can see the effects of makepredictcall() in the 'terms' component > of glm's output. The 'variables' attribute of it gives the original > function > calls and the 'predvars' attribute gives the calls to be used for > prediction: > > attr(g2$terms, "variables") > list(lot1, log(u), xpoly(u, 1)) > > attr(g2$terms, "predvars") > list(lot1, log(u), xpoly(u, 1, coefs = list(alpha = 40, norm2 = c(1, > 9, 8850)))) > > > > Bill Dunlap > TIBCO Software > wdunlap tibco.com > > On Thu, Jul 16, 2015 at 12:35 PM, Kunshan Yin <yinkunshan at gmail.com> > wrote: > >> Hello, I have a question about the formula and the user defined function: >> >> I can do following: >> ###Case 1: >> > clotting <- data.frame( >> + u = c(5,10,15,20,30,40,60,80,100), >> + lot1 = c(118,58,42,35,27,25,21,19,18), >> + lot2 = c(69,35,26,21,18,16,13,12,12)) >> > g1=glm(lot1 ~ log(u) + poly(u,1), data = clotting, family = Gamma) >> > dc=clotting >> > dc$u=1 >> > predict(g1,dc) >> 1 2 3 4 5 >> 6 7 8 9 >> -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 >> -0.01398929 -0.01398929 -0.01398929 >> >> However, if I just simply wrap the poly as a user defined function ( in >> reality I would have my own more complex function) then I will get error: >> ###Case 2: >> > xpoly<-function(x,degree=1){poly(x,degree)} >> > g2=glm(lot1 ~ log(u) + xpoly(u,1), data = clotting, family = Gamma) >> > predict(g2,dc) >> Error in poly(x, degree) : >> 'degree' must be less than number of unique points >> >> It seems that the predict always treat the user defined function in the >> formula with I(). My question is how can I get the results for Case2 >> same >> as case1? >> >> Anyone can have any idea about this? >> >> Thank you very much. >> >> Alex >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > >[[alternative HTML version deleted]]
OPoly<-function(x,degree=1,weight=1){ weight=round(weight,0)# weight need to be integer if(length(weight)!=length(x))weight=rep(1,length(x)) p=poly(4*(rep(x,weight)-mean(range(x)))/diff(range(x)),degree) Z<-(t(t(p[cumsum(weight),])*sqrt(attr(p,"coefs")$norm2[- seq(2)]))[,degree]) class(Z)<-"OPoly";Z } You need to make OPoly to have optional argument(s) that give the original-regressor-dependent information to OPoly and then have it return, as attributes, the value of those arguments. makepredictcall will take the attributes and attach them to the call in predvars so predict uses values derived from the original regressors, not value derived from the data to be predicted from. Take a look at a pair like makepredictcall.scale() and scale() for an example: scale has optional arguments 'center' and 'scale' that it returns as attributes and makepredictcall.scale adds those to the call to scale that it is given. Thus when you predict, the scale and center arguments come from the original data, not from the data you are predicting from. Bill Dunlap TIBCO Software wdunlap tibco.com On Thu, Jul 16, 2015 at 3:43 PM, Kunshan Yin <yinkunshan at gmail.com> wrote:> Thanks Bill for your quick reply. > > I tried your solution and it did work for the simple user defined function > xploly. But when I try with other function, it gave me error again: > > OPoly<-function(x,degree=1,weight=1){ > weight=round(weight,0)# weight need to be integer > if(length(weight)!=length(x))weight=rep(1,length(x)) > p=poly(4*(rep(x,weight)-mean(range(x)))/diff(range(x)),degree) > > Z<-(t(t(p[cumsum(weight),])*sqrt(attr(p,"coefs")$norm2[-seq(2)]))[,degree]) > class(Z)<-"OPoly";Z > } > > ##this OPoly is an FORTRAN orthogonal polynomial routine, it first maps > the x to range[-2,2] then do QR, then return the results with sqrt(norm2). > Comparing with poly, this transformation will make the model coefficients > within a similar range as other variables, the R poly routine will usually > give you a very large coefficients. I did not find such routine in R, so I > have to define this as user defined function. > ####### > > I also have following function as you suggested: > > makepredictcall.OPoly<-function(var,call) > { > if (is.call(call)) { > if (identical(call[[1]], quote(OPoly))) { > if (!is.null(tmp <- attr(var, "coefs"))) { > call$coefs <- tmp > } > } > } > call > } > > > But I still got error for following: > > > g3=glm(lot1 ~ log(u) + OPoly(u,1), data = clotting, family = Gamma) > > > predict(g3,dc)Error in poly(4 * (rep(x, weight) - mean(range(x)))/diff(range(x)), degree) : > missing values are not allowed in 'poly' > > I thought it might be due to the /diff(range(x) in the function. But even > I remove that part, it will still give me error. Any idea? > > Many thanks in advance. > > Alex > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Jul 16, 2015 at 2:09 PM, William Dunlap <wdunlap at tibco.com> wrote: > >> Read about the 'makepredictcall' generic function. There is a method, >> makepredictcall.poly(), for poly() that attaches the polynomial >> coefficients >> used during the fitting procedure to the call to poly() that predict() >> makes. >> You ought to supply a similar method for your xpoly(), and xpoly() needs >> to return an object of a a new class that will cause that method to be used. >> >> E.g., >> >> xpoly <- function(x,degree=1,...){ ret <- poly(x,degree=degree,...); >> class(ret) <- "xpoly" ; ret } >> makepredictcall.xpoly <- function (var, call) >> { >> if (is.call(call)) { >> if (identical(call[[1]], quote(xpoly))) { >> if (!is.null(tmp <- attr(var, "coefs"))) { >> call$coefs <- tmp >> } >> } >> } >> call >> } >> >> g2 <- glm(lot1 ~ log(u) + xpoly(u,1), data = clotting, family = Gamma) >> predict(g2,dc) >> # 1 2 3 4 >> 5 >> #-0.01398928608 -0.01398928608 -0.01398928608 -0.01398928608 >> #-0.01398928608 >> # 6 7 8 9 >> #-0.01398928608 -0.01398928608 -0.01398928608 -0.01398928608 >> >> You can see the effects of makepredictcall() in the 'terms' component >> of glm's output. The 'variables' attribute of it gives the original >> function >> calls and the 'predvars' attribute gives the calls to be used for >> prediction: >> > attr(g2$terms, "variables") >> list(lot1, log(u), xpoly(u, 1)) >> > attr(g2$terms, "predvars") >> list(lot1, log(u), xpoly(u, 1, coefs = list(alpha = 40, norm2 = c(1, >> 9, 8850)))) >> >> >> >> Bill Dunlap >> TIBCO Software >> wdunlap tibco.com >> >> On Thu, Jul 16, 2015 at 12:35 PM, Kunshan Yin <yinkunshan at gmail.com> >> wrote: >> >>> Hello, I have a question about the formula and the user defined function: >>> >>> I can do following: >>> ###Case 1: >>> > clotting <- data.frame( >>> + u = c(5,10,15,20,30,40,60,80,100), >>> + lot1 = c(118,58,42,35,27,25,21,19,18), >>> + lot2 = c(69,35,26,21,18,16,13,12,12)) >>> > g1=glm(lot1 ~ log(u) + poly(u,1), data = clotting, family = Gamma) >>> > dc=clotting >>> > dc$u=1 >>> > predict(g1,dc) >>> 1 2 3 4 5 >>> 6 7 8 9 >>> -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 >>> -0.01398929 -0.01398929 -0.01398929 >>> >>> However, if I just simply wrap the poly as a user defined function ( in >>> reality I would have my own more complex function) then I will get >>> error: >>> ###Case 2: >>> > xpoly<-function(x,degree=1){poly(x,degree)} >>> > g2=glm(lot1 ~ log(u) + xpoly(u,1), data = clotting, family = Gamma) >>> > predict(g2,dc) >>> Error in poly(x, degree) : >>> 'degree' must be less than number of unique points >>> >>> It seems that the predict always treat the user defined function in the >>> formula with I(). My question is how can I get the results for Case2 >>> same >>> as case1? >>> >>> Anyone can have any idea about this? >>> >>> Thank you very much. >>> >>> Alex >>> >>> [[alternative HTML version deleted]] >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide >>> http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >> >> >[[alternative HTML version deleted]]
This might do what you want: OPoly <- function(x, degree=1, weight=1, coefs=NULL, rangeX=NULL){ weight <- round(weight,0)# weight need to be integer if(length(weight)!=length(x)) { weight <- rep(1,length(x)) } if (is.null(rangeX)) { rangeX <- range(x) } p <- poly(4*(rep(x,weight)-mean(rangeX))/diff(rangeX), degree=degree, coefs=coefs) # why t(t(...))? That strips the attributes. Z <- t( t(p[cumsum(weight),]) * sqrt(attr(p,"coefs")$norm2[-seq(2)]) )[, degree, drop=FALSE] class(Z) <- "OPoly" attr(Z, "coefs") <- attr(p, "coefs") attr(Z, "rangeX") <- rangeX Z } makepredictcall.OPoly<-function(var,call) { if (is.call(call)) { if (identical(call[[1]], quote(OPoly))) { if (!is.null(tmp <- attr(var, "coefs"))) { call$coefs <- tmp } if (!is.null(tmp <- attr(var, "rangeX"))) { call$rangeX <- tmp } call$weight <- 1 # weight not relevant in predictions } } call } d <- data.frame(Y=1:8, X=log(1:8), Weight=1:8) fit <- lm(data=d, Y ~ OPoly(X, degree=2, weight=Weight)) predict(fit)[c(3,8)] predict(fit, newdata=data.frame(X=d$X[c(3,8)])) # same result Bill Dunlap TIBCO Software wdunlap tibco.com On Thu, Jul 16, 2015 at 4:39 PM, William Dunlap <wdunlap at tibco.com> wrote:> OPoly<-function(x,degree=1,weight=1){ > weight=round(weight,0)# weight need to be integer > if(length(weight)!=length(x))weight=rep(1,length(x)) > p=poly(4*(rep(x,weight)-mean(range(x)))/diff(range(x)),degree) > Z<-(t(t(p[cumsum(weight),])*sqrt(attr(p,"coefs")$norm2[- > seq(2)]))[,degree]) > class(Z)<-"OPoly";Z > } > > You need to make OPoly to have optional argument(s) that give > the original-regressor-dependent information to OPoly and then > have it return, as attributes, the value of those arguments. > makepredictcall > will take the attributes and attach them to the call in predvars so > predict uses values derived from the original regressors, not value derived > from the data to be predicted from. > > Take a look at a pair like makepredictcall.scale() and scale() for an > example: > scale has optional arguments 'center' and 'scale' that it returns as > attributes > and makepredictcall.scale adds those to the call to scale that it is given. > Thus when you predict, the scale and center arguments come from the > original data, not from the data you are predicting from. > > > > > > > Bill Dunlap > TIBCO Software > wdunlap tibco.com > > On Thu, Jul 16, 2015 at 3:43 PM, Kunshan Yin <yinkunshan at gmail.com> wrote: > >> Thanks Bill for your quick reply. >> >> I tried your solution and it did work for the simple user defined >> function xploly. But when I try with other function, it gave me error again: >> >> OPoly<-function(x,degree=1,weight=1){ >> weight=round(weight,0)# weight need to be integer >> if(length(weight)!=length(x))weight=rep(1,length(x)) >> p=poly(4*(rep(x,weight)-mean(range(x)))/diff(range(x)),degree) >> >> Z<-(t(t(p[cumsum(weight),])*sqrt(attr(p,"coefs")$norm2[-seq(2)]))[,degree]) >> class(Z)<-"OPoly";Z >> } >> >> ##this OPoly is an FORTRAN orthogonal polynomial routine, it first maps >> the x to range[-2,2] then do QR, then return the results with sqrt(norm2). >> Comparing with poly, this transformation will make the model coefficients >> within a similar range as other variables, the R poly routine will usually >> give you a very large coefficients. I did not find such routine in R, so I >> have to define this as user defined function. >> ####### >> >> I also have following function as you suggested: >> >> makepredictcall.OPoly<-function(var,call) >> { >> if (is.call(call)) { >> if (identical(call[[1]], quote(OPoly))) { >> if (!is.null(tmp <- attr(var, "coefs"))) { >> call$coefs <- tmp >> } >> } >> } >> call >> } >> >> >> But I still got error for following: >> >> > g3=glm(lot1 ~ log(u) + OPoly(u,1), data = clotting, family = Gamma) >> >> > predict(g3,dc)Error in poly(4 * (rep(x, weight) - mean(range(x)))/diff(range(x)), degree) : >> missing values are not allowed in 'poly' >> >> I thought it might be due to the /diff(range(x) in the function. But >> even I remove that part, it will still give me error. Any idea? >> >> Many thanks in advance. >> >> Alex >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> On Thu, Jul 16, 2015 at 2:09 PM, William Dunlap <wdunlap at tibco.com> >> wrote: >> >>> Read about the 'makepredictcall' generic function. There is a method, >>> makepredictcall.poly(), for poly() that attaches the polynomial >>> coefficients >>> used during the fitting procedure to the call to poly() that predict() >>> makes. >>> You ought to supply a similar method for your xpoly(), and xpoly() needs >>> to return an object of a a new class that will cause that method to be used. >>> >>> E.g., >>> >>> xpoly <- function(x,degree=1,...){ ret <- poly(x,degree=degree,...); >>> class(ret) <- "xpoly" ; ret } >>> makepredictcall.xpoly <- function (var, call) >>> { >>> if (is.call(call)) { >>> if (identical(call[[1]], quote(xpoly))) { >>> if (!is.null(tmp <- attr(var, "coefs"))) { >>> call$coefs <- tmp >>> } >>> } >>> } >>> call >>> } >>> >>> g2 <- glm(lot1 ~ log(u) + xpoly(u,1), data = clotting, family = Gamma) >>> predict(g2,dc) >>> # 1 2 3 4 >>> 5 >>> #-0.01398928608 -0.01398928608 -0.01398928608 -0.01398928608 >>> #-0.01398928608 >>> # 6 7 8 9 >>> #-0.01398928608 -0.01398928608 -0.01398928608 -0.01398928608 >>> >>> You can see the effects of makepredictcall() in the 'terms' component >>> of glm's output. The 'variables' attribute of it gives the original >>> function >>> calls and the 'predvars' attribute gives the calls to be used for >>> prediction: >>> > attr(g2$terms, "variables") >>> list(lot1, log(u), xpoly(u, 1)) >>> > attr(g2$terms, "predvars") >>> list(lot1, log(u), xpoly(u, 1, coefs = list(alpha = 40, norm2 = c(1, >>> 9, 8850)))) >>> >>> >>> >>> Bill Dunlap >>> TIBCO Software >>> wdunlap tibco.com >>> >>> On Thu, Jul 16, 2015 at 12:35 PM, Kunshan Yin <yinkunshan at gmail.com> >>> wrote: >>> >>>> Hello, I have a question about the formula and the user defined >>>> function: >>>> >>>> I can do following: >>>> ###Case 1: >>>> > clotting <- data.frame( >>>> + u = c(5,10,15,20,30,40,60,80,100), >>>> + lot1 = c(118,58,42,35,27,25,21,19,18), >>>> + lot2 = c(69,35,26,21,18,16,13,12,12)) >>>> > g1=glm(lot1 ~ log(u) + poly(u,1), data = clotting, family = Gamma) >>>> > dc=clotting >>>> > dc$u=1 >>>> > predict(g1,dc) >>>> 1 2 3 4 5 >>>> 6 7 8 9 >>>> -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 >>>> -0.01398929 -0.01398929 -0.01398929 >>>> >>>> However, if I just simply wrap the poly as a user defined function ( in >>>> reality I would have my own more complex function) then I will get >>>> error: >>>> ###Case 2: >>>> > xpoly<-function(x,degree=1){poly(x,degree)} >>>> > g2=glm(lot1 ~ log(u) + xpoly(u,1), data = clotting, family = Gamma) >>>> > predict(g2,dc) >>>> Error in poly(x, degree) : >>>> 'degree' must be less than number of unique points >>>> >>>> It seems that the predict always treat the user defined function in the >>>> formula with I(). My question is how can I get the results for Case2 >>>> same >>>> as case1? >>>> >>>> Anyone can have any idea about this? >>>> >>>> Thank you very much. >>>> >>>> Alex >>>> >>>> [[alternative HTML version deleted]] >>>> >>>> ______________________________________________ >>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see >>>> https://stat.ethz.ch/mailman/listinfo/r-help >>>> PLEASE do read the posting guide >>>> http://www.R-project.org/posting-guide.html >>>> and provide commented, minimal, self-contained, reproducible code. >>>> >>> >>> >> >[[alternative HTML version deleted]]