I apologize for my earlier posting that, unbeknownst to me before, apparently was not in the correct format for this list. Hopefully this attempt will go through, and no-one will hold the newbie mistake against me. I could really use some help in writing a new glm link function in order to run an analysis of daily nest survival rates. I've struggled with this for weeks now, and can at least display the contents of the glm function, but I'm afraid I can't figure out even how to get started at modifiying the appropriate section (fairly new at R, complete beginner in writing functions in R!). Essentially, all I will be doing is running a logistic regression, but with a different link function. The link function is a modification of the logit link: g(theta) = natural log( (theta ^(1/t)) / (1- (theta ^(1/t)) ) ; where t is the length of the interval between nest checks. Could anyone help? I hope the answer is rather simple, since this just adds the exponent (1/t) to the logit link function; but I have yet to figure out how to do this. Thanks in advance for any help. cheers, Jessi Brown Program in Ecology, Evolution, and Conservation Biology University of Nevada-Reno
Prof Brian Ripley
2006-Apr-17 08:25 UTC
[R] second try; writing user-defined GLM link function
I've not seen a response to either of these. It is fairly simple. You need to copy 'binomial' to your own function, and edit it to allow the name you choose for your link function. Then you need to copy 'make.link', and edit it to add an entry under your chosen name. I'd copy the probit one (the logit one is internal for speed). Note that to avoid namespace issues, you do need to copy (and probably rename) the functions you change. On Sat, 15 Apr 2006, Jessi Brown wrote:> I apologize for my earlier posting that, unbeknownst to me before, > apparently was not in the correct format for this list. Hopefully this > attempt will go through, and no-one will hold the newbie mistake > against me. > > I could really use some help in writing a new glm link function in > order to run an analysis of daily nest survival rates. I've struggled > with this for weeks now, and can at least display the contents of the > glm function, but I'm afraid I can't figure out even how to get > started at modifiying the appropriate section (fairly new at R, > complete beginner in writing functions in R!). > > Essentially, all I will be doing is running a logistic regression, but > with a different link function. The link function is a modification of > the logit link: > g(theta) = natural log( (theta ^(1/t)) / (1- (theta ^(1/t)) ) ; > where t is the length of the interval between nest checks. > > Could anyone help? I hope the answer is rather simple, since this just > adds the exponent (1/t) to the logit link function; but I have yet to > figure out how to do this. > > Thanks in advance for any help. > > cheers, Jessi Brown > Program in Ecology, Evolution, and Conservation Biology > University of Nevada-Reno-- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
Richard Chandler
2006-Apr-17 10:59 UTC
[R] second try; writing user-defined GLM link function
If you're not tied to glm(), you could use optim() which might allow you some more flexibility. I modified the code from Venables and Ripley (2002) pg. 445 to do this same thing a while back. If you use it you should double check it with a statistician. #link function le.link <- function(theta, expos) {log(theta^(1/expos)/(1-theta^ (1/expos)))} #inverse link le.inv.link <- function(theta, expos) {((exp(theta)/(1+exp(theta))) ^expos)} LogExpos <- function(formula, data=NULL, wt=rep(1, length(y)), start = rep(0, p), expos, ...) { x <- model.matrix(formula, data=data) y <- model.frame(formula, data=data)[,1] fmin <- function(beta, X, y, w) { p <- le.inv.link(X %*% beta, expos=expos) -sum(2*w*ifelse(y, log(p), log(1-p))) } gmin <- function(beta, X, y, w) { eta <- X %*% beta; p <- le.inv.link(eta, expos=expos) as.vector(-2 * (w*dlogis(eta) * ifelse(y, 1/p, -1/(1- p)))) %*% X } if(is.null(dim(x))) dim(x) <- c(length(x), 1) dn <- dimnames(x)[[2]] if(!length(dn)) dn <- paste("Var", 1:ncol(x), sep="") p <- ncol(x) if(is.factor(y)) y <- (unclass(y) !=1) fit <- optim(start, fmin, gmin, X = x, y = y, w = wt, method="BFGS", hessian=T, ...) names(fit$par) <- dn fit$se <- sqrt(diag(solve(fit$hessian))) names(fit$se) <- dn z <- cbind(fit$par, fit$se); colnames(z) <- c ("Estimate", "std.err.") cat("\nCoefficients:\n"); print(z) cat("\nResidual Deviance:", format(fit$value), "\n") cat("\nConvergence message:", fit$convergence, "\n") invisible(fit) } #Example set.seed(1) yy <- rbinom(100, 1, .5) x1 <- rnorm(100, 1) x2 <- rnorm(100, 1) time <- rpois(100, 3) time <- ifelse(time==0, 1, time) dat <- data.frame(yy, x1, x2, time) LogExpos(yy ~ x1 + x2, dat, expos=time) #Output Coefficients: Estimate std.err. (Intercept) 1.25483347 0.2122538 x1 0.09026530 0.1430969 x2 -0.02418791 0.1195096 Residual Deviance: 158.4612 Convergence message: 0 Good luck, Richard -- Richard Chandler, M.S. candidate Department of Natural Resources Conservation UMass Amherst (413)545-1237
I was a little hesitant to post to everyone until I figured out why there is a discrepancy in the intercept estimates when compared to the same model run in SAS vs. R. Everything else comes out correctly, including the other coefficient estimates... so perhaps it is just the numerical method used. I think glm in R is using IWLS, and SAS is using ML. If anyone has another idea feel free to let me know. Here is my modified binomial family, now called logexposure. Since there is no other choice for the link for this new logexposure "family", I have just embedded the make.link inside the logexposure family. Good luck and I'd appreciate any comments. logexposure<-function (link="logit",ExposureDays) { logexp<-make.link("logit") logexp$linkfun<-function(mu,expos=ExposureDays) { log((mu^(1/expos))/(1-mu^(1/expos))) } logexp$linkinv<-function(eta,expos=ExposureDays) { (exp(eta)/(1+exp(eta)))^expos } logexp$mu.eta<-function(eta,expos=ExposureDays) { (expos*exp(eta*expos)*(1+exp(eta))^(-expos-1)) } variance <- function(mu) mu * (1 - mu) validmu <- function(mu) all(mu > 0) && all(mu < 1) dev.resids <- function(y, mu, wt) .Call("binomial_dev_resids", y, mu, wt, PACKAGE = "stats") aic <- function(y, n, mu, wt, dev) { m <- if (any(n > 1)) n else wt -2 * sum(ifelse(m > 0, (wt/m), 0) * dbinom(round(m * y), round(m), mu, log = TRUE)) } initialize <- expression({ if (NCOL(y) == 1) { if (is.factor(y)) y <- y != levels(y)[1] n <- rep.int(1, nobs) if (any(y < 0 | y > 1)) stop("y values must be 0 <= y <= 1") mustart <- (weights * y + 0.5)/(weights + 1) m <- weights * y if (any(abs(m - round(m)) > 0.001)) warning("non-integer successes in a binomial glm!") } else if (NCOL(y) == 2) { if (any(abs(y - round(y)) > 0.001)) warning("non-integer counts in a binomial glm!") n <- y[, 1] + y[, 2] y <- ifelse(n == 0, 0, y[, 1]/n) weights <- weights * n mustart <- (n * y + 0.5)/(n + 1) } else stop("for the binomial family,", " y must be a vector of 0 and 1's\n", "or a 2 column matrix where col 1 is", " no. successes and col 2 is no. failures") }) structure(list(family="logexposure", link=logexp, linkfun=logexp$linkfun, linkinv=logexp$linkinv, variance=variance, dev.resids=dev.resids, aic=aic, mu.eta=logexp$mu.eta, initialize=initialize, validmu=validmu, valideta=logexp$valideta), class = "family") } #Example nestdata<-read.table("http://www.branta.org/ShafferChatNestData/chat.txt") chat.glm.logexp<-glm(survive/trials~parastat+nest_ht*patsize, family=logexposure(ExposureDays=nestdata$expos),data=nestdata) # if you have MASS installed library(MASS) chat.step<-stepAIC(chat.glm.logexp,scope=list(upper=~parastat+nest_ht*patsize,lower=~1)) chat.step$anova summary(chat.step) #Not run: # note this compares to following results from SAS : # Criteria For Assessing Goodness Of Fit # # Criterion DF Value Value/DF # Deviance 289 193.9987 0.6713 # Scaled Deviance 289 193.9987 0.6713 # Pearson Chi-Square 289 537.8609 1.8611 # Scaled Pearson X2 289 537.8609 1.8611 # Log Likelihood -96.9994 # Algorithm converged. # Analysis Of Parameter Estimates # # Standard Wald 95% Chi- #Parameter DF Estimate Error Limits Square Pr > ChiSq ##Intercept 1 2.6973 0.2769 2.1546 3.2399 94.92 <.0001 #parastat0 1 -1.0350 0.5201 -2.0544 -0.0155 3.96 0.0466 #parastat1 0 0.0000 0.0000 0.0000 0.0000 . . #patsizelarge 1 1.0844 0.5094 0.0861 2.0827 4.53 0.0333 #patsizesmall 0 0.0000 0.0000 0.0000 0.0000 . . #Scale 0 1.0000 0.0000 1.0000 1.0000 # #NOTE: The scale parameter was held fixed. Mark Herzog, Ph.D. Program Leader, San Francisco Bay Research Wetland Division, PRBO Conservation Science 4990 Shoreline Highway 1 Stinson Beach, CA 94970 (415) 893-7677 x308 mherzog at prbo.org Jessi Brown wrote:> I apologize for my earlier posting that, unbeknownst to me before, > apparently was not in the correct format for this list. Hopefully this > attempt will go through, and no-one will hold the newbie mistake > against me. > > I could really use some help in writing a new glm link function in > order to run an analysis of daily nest survival rates. I've struggled > with this for weeks now, and can at least display the contents of the > glm function, but I'm afraid I can't figure out even how to get > started at modifiying the appropriate section (fairly new at R, > complete beginner in writing functions in R!). > > Essentially, all I will be doing is running a logistic regression, but > with a different link function. The link function is a modification of > the logit link: > g(theta) = natural log( (theta ^(1/t)) / (1- (theta ^(1/t)) ) ; > where t is the length of the interval between nest checks. > > Could anyone help? I hope the answer is rather simple, since this just > adds the exponent (1/t) to the logit link function; but I have yet to > figure out how to do this. > > Thanks in advance for any help. > > cheers, Jessi Brown > Program in Ecology, Evolution, and Conservation Biology > University of Nevada-Reno > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html > > >