thr3ads.net - R help - [R] User defined function within a formula [Jul 2015]

If this information is useful, please help other people find it:
Share via:

Kunshan Yin

2015-Jul-16 22:43 UTC

[R] User defined function within a formula

Thanks Bill for your quick reply.

I tried your solution and it did work for the simple user defined function
xploly. But when I try with other function, it gave me error again:

OPoly<-function(x,degree=1,weight=1){
  weight=round(weight,0)# weight need to be integer
  if(length(weight)!=length(x))weight=rep(1,length(x))
  p=poly(4*(rep(x,weight)-mean(range(x)))/diff(range(x)),degree)

Z<-(t(t(p[cumsum(weight),])*sqrt(attr(p,"coefs")$norm2[-seq(2)]))[,degree])
  class(Z)<-"OPoly";Z
}

##this OPoly is an FORTRAN orthogonal polynomial routine, it first maps the
x to range[-2,2] then do QR, then return the results with sqrt(norm2).
Comparing with poly, this transformation will make the model coefficients
within a similar range as other variables, the R poly routine will usually
give you a very large coefficients. I did not find such routine in R, so I
have to define this as user defined function.
#######

I  also have following function as you suggested:

makepredictcall.OPoly<-function(var,call)
{
  if (is.call(call)) {
    if (identical(call[[1]], quote(OPoly))) {
      if (!is.null(tmp <- attr(var, "coefs"))) {
        call$coefs <- tmp
      }
    }
  }
  call
}


But I still got error for following:
> g3=glm(lot1 ~ log(u) + OPoly(u,1), data = clotting, family = Gamma)
> predict(g3,dc)Error in poly(4 * (rep(x, weight) -
mean(range(x)))/diff(range(x)), degree) :  missing values are not allowed in 'poly'

I thought it might be due to the /diff(range(x) in the function.  But even
I remove that part, it will still give me error. Any idea?

Many thanks in advance.

Alex























On Thu, Jul 16, 2015 at 2:09 PM, William Dunlap <wdunlap at tibco.com>
wrote:
> Read about the 'makepredictcall' generic function.  There is a
method,
> makepredictcall.poly(), for poly() that attaches the polynomial
> coefficients
> used during the fitting procedure to the call to poly() that predict()
> makes.
> You ought to supply a similar method for your xpoly(), and xpoly() needs
> to return an object of a a new class that will cause that method to be
used.
>
> E.g.,
>
> xpoly <- function(x,degree=1,...){ ret <- poly(x,degree=degree,...);
> class(ret) <- "xpoly" ; ret }
> makepredictcall.xpoly <- function (var, call)
> {
>     if (is.call(call)) {
>         if (identical(call[[1]], quote(xpoly))) {
>             if (!is.null(tmp <- attr(var, "coefs"))) {
>                 call$coefs <- tmp
>             }
>         }
>     }
>     call
> }
>
> g2 <- glm(lot1 ~ log(u) + xpoly(u,1), data = clotting, family = Gamma)
> predict(g2,dc)
> #             1              2              3              4
>  5
> #-0.01398928608 -0.01398928608 -0.01398928608 -0.01398928608
> #-0.01398928608
> #             6              7              8              9
> #-0.01398928608 -0.01398928608 -0.01398928608 -0.01398928608
>
> You can see the effects of makepredictcall() in the 'terms'
component
> of glm's output.  The 'variables' attribute of it gives the
original
> function
> calls and the 'predvars' attribute gives the calls to be used for
> prediction:
>    > attr(g2$terms, "variables")
>    list(lot1, log(u), xpoly(u, 1))
>   > attr(g2$terms, "predvars")
>   list(lot1, log(u), xpoly(u, 1, coefs = list(alpha = 40, norm2 = c(1,
>   9, 8850))))
>
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Thu, Jul 16, 2015 at 12:35 PM, Kunshan Yin <yinkunshan at
gmail.com>
> wrote:
>
>> Hello, I have a question about the formula and the user defined
function:
>>
>> I can do following:
>> ###Case 1:
>> > clotting <- data.frame(
>> +     u = c(5,10,15,20,30,40,60,80,100),
>> +     lot1 = c(118,58,42,35,27,25,21,19,18),
>> +     lot2 = c(69,35,26,21,18,16,13,12,12))
>> > g1=glm(lot1 ~ log(u) + poly(u,1), data = clotting, family = Gamma)
>> > dc=clotting
>> > dc$u=1
>> > predict(g1,dc)
>>           1           2           3           4           5
>> 6           7           8           9
>> -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929
>> -0.01398929 -0.01398929 -0.01398929
>>
>> However, if I just simply wrap the poly as a user defined function ( in
>> reality I would have my own more complex function)  then I will get
error:
>> ###Case 2:
>> > xpoly<-function(x,degree=1){poly(x,degree)}
>> > g2=glm(lot1 ~ log(u) + xpoly(u,1), data = clotting, family =
Gamma)
>> > predict(g2,dc)
>> Error in poly(x, degree) :
>>   'degree' must be less than number of unique points
>>
>> It seems that the predict always treat the user defined function in the
>> formula with I().  My question is how can I get the  results for Case2
>> same
>> as case1?
>>
>> Anyone can have any idea about this?
>>
>> Thank you very much.
>>
>> Alex
>>
>>         [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
	[[alternative HTML version deleted]]

William Dunlap

2015-Jul-16 23:39 UTC

head link

[R] User defined function within a formula

OPoly<-function(x,degree=1,weight=1){
  weight=round(weight,0)# weight need to be integer
  if(length(weight)!=length(x))weight=rep(1,length(x))
  p=poly(4*(rep(x,weight)-mean(range(x)))/diff(range(x)),degree)
  Z<-(t(t(p[cumsum(weight),])*sqrt(attr(p,"coefs")$norm2[-
seq(2)]))[,degree])
  class(Z)<-"OPoly";Z
}

You need to make OPoly to have optional argument(s) that give
the original-regressor-dependent information to OPoly and then
have it return, as attributes, the value of those arguments.
 makepredictcall
will take the attributes and attach them to the call in predvars so
predict uses values derived from the original regressors, not value derived
from the data to be predicted from.

Take a look at a pair like makepredictcall.scale() and scale() for an
example:
scale has optional arguments 'center' and 'scale' that it
returns as
attributes
and makepredictcall.scale adds those to the call to scale that it is given.
Thus when you predict, the scale and center arguments come from the
original data, not from the data you are predicting from.






Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Jul 16, 2015 at 3:43 PM, Kunshan Yin <yinkunshan at gmail.com>
wrote:
> Thanks Bill for your quick reply.
>
> I tried your solution and it did work for the simple user defined function
> xploly. But when I try with other function, it gave me error again:
>
> OPoly<-function(x,degree=1,weight=1){
>   weight=round(weight,0)# weight need to be integer
>   if(length(weight)!=length(x))weight=rep(1,length(x))
>   p=poly(4*(rep(x,weight)-mean(range(x)))/diff(range(x)),degree)
>
>
Z<-(t(t(p[cumsum(weight),])*sqrt(attr(p,"coefs")$norm2[-seq(2)]))[,degree])
>   class(Z)<-"OPoly";Z
> }
>
> ##this OPoly is an FORTRAN orthogonal polynomial routine, it first maps
> the x to range[-2,2] then do QR, then return the results with sqrt(norm2).
> Comparing with poly, this transformation will make the model coefficients
> within a similar range as other variables, the R poly routine will usually
> give you a very large coefficients. I did not find such routine in R, so I
> have to define this as user defined function.
> #######
>
> I  also have following function as you suggested:
>
> makepredictcall.OPoly<-function(var,call)
> {
>   if (is.call(call)) {
>     if (identical(call[[1]], quote(OPoly))) {
>       if (!is.null(tmp <- attr(var, "coefs"))) {
>         call$coefs <- tmp
>       }
>     }
>   }
>   call
> }
>
>
> But I still got error for following:
>
> > g3=glm(lot1 ~ log(u) + OPoly(u,1), data = clotting, family = Gamma)
>
> > predict(g3,dc)Error in poly(4 * (rep(x, weight) -
mean(range(x)))/diff(range(x)), degree) :
>   missing values are not allowed in 'poly'
>
> I thought it might be due to the /diff(range(x) in the function.  But even
> I remove that part, it will still give me error. Any idea?
>
> Many thanks in advance.
>
> Alex
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> On Thu, Jul 16, 2015 at 2:09 PM, William Dunlap <wdunlap at
tibco.com> wrote:
>
>> Read about the 'makepredictcall' generic function.  There is a
method,
>> makepredictcall.poly(), for poly() that attaches the polynomial
>> coefficients
>> used during the fitting procedure to the call to poly() that predict()
>> makes.
>> You ought to supply a similar method for your xpoly(), and xpoly()
needs
>> to return an object of a a new class that will cause that method to be
used.
>>
>> E.g.,
>>
>> xpoly <- function(x,degree=1,...){ ret <-
poly(x,degree=degree,...);
>> class(ret) <- "xpoly" ; ret }
>> makepredictcall.xpoly <- function (var, call)
>> {
>>     if (is.call(call)) {
>>         if (identical(call[[1]], quote(xpoly))) {
>>             if (!is.null(tmp <- attr(var, "coefs"))) {
>>                 call$coefs <- tmp
>>             }
>>         }
>>     }
>>     call
>> }
>>
>> g2 <- glm(lot1 ~ log(u) + xpoly(u,1), data = clotting, family =
Gamma)
>> predict(g2,dc)
>> #             1              2              3              4
>>  5
>> #-0.01398928608 -0.01398928608 -0.01398928608 -0.01398928608
>> #-0.01398928608
>> #             6              7              8              9
>> #-0.01398928608 -0.01398928608 -0.01398928608 -0.01398928608
>>
>> You can see the effects of makepredictcall() in the 'terms'
component
>> of glm's output.  The 'variables' attribute of it gives the
original
>> function
>> calls and the 'predvars' attribute gives the calls to be used
for
>> prediction:
>>    > attr(g2$terms, "variables")
>>    list(lot1, log(u), xpoly(u, 1))
>>   > attr(g2$terms, "predvars")
>>   list(lot1, log(u), xpoly(u, 1, coefs = list(alpha = 40, norm2 = c(1,
>>   9, 8850))))
>>
>>
>>
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com
>>
>> On Thu, Jul 16, 2015 at 12:35 PM, Kunshan Yin <yinkunshan at
gmail.com>
>> wrote:
>>
>>> Hello, I have a question about the formula and the user defined
function:
>>>
>>> I can do following:
>>> ###Case 1:
>>> > clotting <- data.frame(
>>> +     u = c(5,10,15,20,30,40,60,80,100),
>>> +     lot1 = c(118,58,42,35,27,25,21,19,18),
>>> +     lot2 = c(69,35,26,21,18,16,13,12,12))
>>> > g1=glm(lot1 ~ log(u) + poly(u,1), data = clotting, family =
Gamma)
>>> > dc=clotting
>>> > dc$u=1
>>> > predict(g1,dc)
>>>           1           2           3           4           5
>>> 6           7           8           9
>>> -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929
-0.01398929
>>> -0.01398929 -0.01398929 -0.01398929
>>>
>>> However, if I just simply wrap the poly as a user defined function
( in
>>> reality I would have my own more complex function)  then I will get
>>> error:
>>> ###Case 2:
>>> > xpoly<-function(x,degree=1){poly(x,degree)}
>>> > g2=glm(lot1 ~ log(u) + xpoly(u,1), data = clotting, family =
Gamma)
>>> > predict(g2,dc)
>>> Error in poly(x, degree) :
>>>   'degree' must be less than number of unique points
>>>
>>> It seems that the predict always treat the user defined function in
the
>>> formula with I().  My question is how can I get the  results for
Case2
>>> same
>>> as case1?
>>>
>>> Anyone can have any idea about this?
>>>
>>> Thank you very much.
>>>
>>> Alex
>>>
>>>         [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more,
see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>
	[[alternative HTML version deleted]]

William Dunlap

2015-Jul-16 23:51 UTC

head link

[R] User defined function within a formula

This might do what you want:

OPoly <- function(x, degree=1, weight=1, coefs=NULL, rangeX=NULL){
  weight <- round(weight,0)# weight need to be integer
  if(length(weight)!=length(x)) {
    weight <- rep(1,length(x))
  }
  if (is.null(rangeX)) {
      rangeX <- range(x)
  }
  p <- poly(4*(rep(x,weight)-mean(rangeX))/diff(rangeX), degree=degree,
coefs=coefs)
  # why t(t(...))?  That strips the attributes.
  Z <- t( t(p[cumsum(weight),]) *
sqrt(attr(p,"coefs")$norm2[-seq(2)]) )[,
degree, drop=FALSE]
  class(Z) <- "OPoly"
  attr(Z, "coefs") <- attr(p, "coefs")
  attr(Z, "rangeX") <- rangeX
  Z
}

makepredictcall.OPoly<-function(var,call)
{
  if (is.call(call)) {
    if (identical(call[[1]], quote(OPoly))) {
      if (!is.null(tmp <- attr(var, "coefs"))) {
        call$coefs <- tmp
      }
      if (!is.null(tmp <- attr(var, "rangeX"))) {
        call$rangeX <- tmp
      }
      call$weight <- 1 # weight not relevant in predictions
    }
  }
  call
}

d <- data.frame(Y=1:8, X=log(1:8), Weight=1:8)
fit <- lm(data=d, Y ~ OPoly(X, degree=2, weight=Weight))
predict(fit)[c(3,8)]
predict(fit, newdata=data.frame(X=d$X[c(3,8)])) # same result


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Jul 16, 2015 at 4:39 PM, William Dunlap <wdunlap at tibco.com>
wrote:
> OPoly<-function(x,degree=1,weight=1){
>   weight=round(weight,0)# weight need to be integer
>   if(length(weight)!=length(x))weight=rep(1,length(x))
>   p=poly(4*(rep(x,weight)-mean(range(x)))/diff(range(x)),degree)
>   Z<-(t(t(p[cumsum(weight),])*sqrt(attr(p,"coefs")$norm2[-
> seq(2)]))[,degree])
>   class(Z)<-"OPoly";Z
> }
>
> You need to make OPoly to have optional argument(s) that give
> the original-regressor-dependent information to OPoly and then
> have it return, as attributes, the value of those arguments.
>  makepredictcall
> will take the attributes and attach them to the call in predvars so
> predict uses values derived from the original regressors, not value derived
> from the data to be predicted from.
>
> Take a look at a pair like makepredictcall.scale() and scale() for an
> example:
> scale has optional arguments 'center' and 'scale' that it
returns as
> attributes
> and makepredictcall.scale adds those to the call to scale that it is given.
> Thus when you predict, the scale and center arguments come from the
> original data, not from the data you are predicting from.
>
>
>
>
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Thu, Jul 16, 2015 at 3:43 PM, Kunshan Yin <yinkunshan at
gmail.com> wrote:
>
>> Thanks Bill for your quick reply.
>>
>> I tried your solution and it did work for the simple user defined
>> function xploly. But when I try with other function, it gave me error
again:
>>
>> OPoly<-function(x,degree=1,weight=1){
>>   weight=round(weight,0)# weight need to be integer
>>   if(length(weight)!=length(x))weight=rep(1,length(x))
>>   p=poly(4*(rep(x,weight)-mean(range(x)))/diff(range(x)),degree)
>>
>>
Z<-(t(t(p[cumsum(weight),])*sqrt(attr(p,"coefs")$norm2[-seq(2)]))[,degree])
>>   class(Z)<-"OPoly";Z
>> }
>>
>> ##this OPoly is an FORTRAN orthogonal polynomial routine, it first maps
>> the x to range[-2,2] then do QR, then return the results with
sqrt(norm2).
>> Comparing with poly, this transformation will make the model
coefficients
>> within a similar range as other variables, the R poly routine will
usually
>> give you a very large coefficients. I did not find such routine in R,
so I
>> have to define this as user defined function.
>> #######
>>
>> I  also have following function as you suggested:
>>
>> makepredictcall.OPoly<-function(var,call)
>> {
>>   if (is.call(call)) {
>>     if (identical(call[[1]], quote(OPoly))) {
>>       if (!is.null(tmp <- attr(var, "coefs"))) {
>>         call$coefs <- tmp
>>       }
>>     }
>>   }
>>   call
>> }
>>
>>
>> But I still got error for following:
>>
>> > g3=glm(lot1 ~ log(u) + OPoly(u,1), data = clotting, family =
Gamma)
>>
>> > predict(g3,dc)Error in poly(4 * (rep(x, weight) -
mean(range(x)))/diff(range(x)), degree) :
>>   missing values are not allowed in 'poly'
>>
>> I thought it might be due to the /diff(range(x) in the function.  But
>> even I remove that part, it will still give me error. Any idea?
>>
>> Many thanks in advance.
>>
>> Alex
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On Thu, Jul 16, 2015 at 2:09 PM, William Dunlap <wdunlap at
tibco.com>
>> wrote:
>>
>>> Read about the 'makepredictcall' generic function.  There
is a method,
>>> makepredictcall.poly(), for poly() that attaches the polynomial
>>> coefficients
>>> used during the fitting procedure to the call to poly() that
predict()
>>> makes.
>>> You ought to supply a similar method for your xpoly(), and xpoly()
needs
>>> to return an object of a a new class that will cause that method to
be used.
>>>
>>> E.g.,
>>>
>>> xpoly <- function(x,degree=1,...){ ret <-
poly(x,degree=degree,...);
>>> class(ret) <- "xpoly" ; ret }
>>> makepredictcall.xpoly <- function (var, call)
>>> {
>>>     if (is.call(call)) {
>>>         if (identical(call[[1]], quote(xpoly))) {
>>>             if (!is.null(tmp <- attr(var, "coefs"))) {
>>>                 call$coefs <- tmp
>>>             }
>>>         }
>>>     }
>>>     call
>>> }
>>>
>>> g2 <- glm(lot1 ~ log(u) + xpoly(u,1), data = clotting, family =
Gamma)
>>> predict(g2,dc)
>>> #             1              2              3              4
>>>  5
>>> #-0.01398928608 -0.01398928608 -0.01398928608 -0.01398928608
>>> #-0.01398928608
>>> #             6              7              8              9
>>> #-0.01398928608 -0.01398928608 -0.01398928608 -0.01398928608
>>>
>>> You can see the effects of makepredictcall() in the 'terms'
component
>>> of glm's output.  The 'variables' attribute of it gives
the original
>>> function
>>> calls and the 'predvars' attribute gives the calls to be
used for
>>> prediction:
>>>    > attr(g2$terms, "variables")
>>>    list(lot1, log(u), xpoly(u, 1))
>>>   > attr(g2$terms, "predvars")
>>>   list(lot1, log(u), xpoly(u, 1, coefs = list(alpha = 40, norm2 =
c(1,
>>>   9, 8850))))
>>>
>>>
>>>
>>> Bill Dunlap
>>> TIBCO Software
>>> wdunlap tibco.com
>>>
>>> On Thu, Jul 16, 2015 at 12:35 PM, Kunshan Yin <yinkunshan at
gmail.com>
>>> wrote:
>>>
>>>> Hello, I have a question about the formula and the user defined
>>>> function:
>>>>
>>>> I can do following:
>>>> ###Case 1:
>>>> > clotting <- data.frame(
>>>> +     u = c(5,10,15,20,30,40,60,80,100),
>>>> +     lot1 = c(118,58,42,35,27,25,21,19,18),
>>>> +     lot2 = c(69,35,26,21,18,16,13,12,12))
>>>> > g1=glm(lot1 ~ log(u) + poly(u,1), data = clotting, family
= Gamma)
>>>> > dc=clotting
>>>> > dc$u=1
>>>> > predict(g1,dc)
>>>>           1           2           3           4           5
>>>> 6           7           8           9
>>>> -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929
-0.01398929
>>>> -0.01398929 -0.01398929 -0.01398929
>>>>
>>>> However, if I just simply wrap the poly as a user defined
function ( in
>>>> reality I would have my own more complex function)  then I will
get
>>>> error:
>>>> ###Case 2:
>>>> > xpoly<-function(x,degree=1){poly(x,degree)}
>>>> > g2=glm(lot1 ~ log(u) + xpoly(u,1), data = clotting, family
= Gamma)
>>>> > predict(g2,dc)
>>>> Error in poly(x, degree) :
>>>>   'degree' must be less than number of unique points
>>>>
>>>> It seems that the predict always treat the user defined
function in the
>>>> formula with I().  My question is how can I get the  results
for Case2
>>>> same
>>>> as case1?
>>>>
>>>> Anyone can have any idea about this?
>>>>
>>>> Thank you very much.
>>>>
>>>> Alex
>>>>
>>>>         [[alternative HTML version deleted]]
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and
more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible
code.
>>>>
>>>
>>>
>>
>
	[[alternative HTML version deleted]]

R help - Jul 2015 - User defined function within a formula

[R] User defined function within a formula

[R] User defined function within a formula

[R] User defined function within a formula