Philippe Grosjean

2004-Oct-17 13:55 UTC

### [R] Errors while compiling packages with namespace?

Hello,

I am trying to set up namespaces for packages. It works for several of them, except one whose compilation fails (under Windows XP & R 2.0.0):

    ---------- Making package svViews ------------
      adding build stamp to DESCRIPTION
      installing NAMESPACE file and metadata
    Error in parse(file, n, text, prompt) : syntax error on line 21
    Execution halted
    make[2]: *** [nmspace] Error 1
    make[1]: *** [all] Error 2
    make: *** [pkg-svViews] Error 2
    *** Installation of svViews failed ***

This kind of error tells me that there is something wrong in my code, making it impossible to parse, isn't it? However, when I source the code in R, everything is fine. Also, this package compiled without errors before I introduced NAMESPACE. The only changes I made in the code (besides adding NAMESPACE) were to eliminate all "require(....)" calls and to replace them with "import" and "importFrom" statements in NAMESPACE. So I suppose this is due to a wrong or missing "import" or "importFrom" directive. Does anybody have another suggestion?

My question is: how do I find the error in my code, given this message: "syntax error on line 21" while installing NAMESPACE? Obviously, it is not on "line 21" of any of my original code files in ./R (because I can source them all without error). At this point, I am completely lost. Any help would be welcome. This package contains several hundred lines of code, and NAMESPACE is quite complex:

    importFrom(svMisc, listCustoms, getTemp)
    importFrom(R2HTML, HTML, HTMLhr, HTMLInsertGraph, HTMLli, HTML.cormat)
    importFrom(utils, browseURL, methods)
    importFrom(lattice, lset)
    importFrom(MASS, lda)

    import(svIO, graphics, grDevices, stats)

    export(guiViewsCmd,
           guiViewsCSS,
           guiViewsCSSChange,
           guiViewsDir,
           guiViewsDisplay,
           guiViewsFile,
           report,
           reportGraph,
           viewHTMLinit)

    S3method(view, default)
    S3method(view, data.frame)
    S3method(view, function)
    S3method(view, matrix)
    S3method(view, princomp)
    S3method(view, trellis)
    S3method(view, ts)

Thank you.
Best,

Philippe Grosjean

Prof. Philippe Grosjean
Numerical Ecology of Aquatic Systems
Mons-Hainaut University, Pentagone
Academie Universitaire Wallonie-Bruxelles
6, av du Champ de Mars, 7000 Mons, Belgium
phone: + 32.65.37.34.97, fax: + 32.65.37.33.12
email: Philippe.Grosjean at umh.ac.be
web: http://www.umh.ac.be/~econum

The error message points to line 21 of the NAMESPACE file:

    S3method(view, function)

The NAMESPACE file is parsed by the R parser, so this is a syntax error, since function is a reserved word. Put quotes around it and you should be OK.

luke

-- 
Luke Tierney
University of Iowa, Department of Statistics and Actuarial Science
241 Schaeffer Hall, Iowa City, IA 52242
Phone: 319-335-3386, Fax: 319-335-3017
email: luke at stat.uiowa.edu
WWW: http://www.stat.uiowa.edu
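The diagnosis can be reproduced at the R prompt. The following is a minimal sketch, not taken from the thread: it feeds the offending directive and a quoted variant to `parse()`, which is the function named in the build error. The variable names `bad` and `good` are illustrative only.

```r
# The directive as written in the NAMESPACE file: `function` is a reserved
# word, so the parser expects a function definition and fails.
bad <- try(parse(text = "S3method(view, function)"), silent = TRUE)
inherits(bad, "try-error")

# Quoting the class name turns the line into an ordinary call that parses.
good <- parse(text = 'S3method(view, "function")')
good[[1]]
```

Running `parse(file = "NAMESPACE")` from the package directory should report the same kind of syntax error, together with the line number inside the NAMESPACE file, without going through a full build.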

Hi, all:

I find that survreg {survival} provides many distributions such as weibull, lognormal, etc. But I wonder why it doesn't support the gamma distribution, since it should be a good distribution in lifetime analysis. Can anybody figure out the reason?

I've tried to implement the likelihood function of progressively censored data for the gamma distribution and to use optim() to solve for the parameters. The log-likelihood function L contains some integrations. I use tryCatch() to capture the error when an integration diverges, and return Inf. But if two consecutive calls to the objective function return Inf, optim() raises an error:

    Error in optim(c(ga, 1/la), fr, method = "BFGS") :
      non-finite finite-difference value [1]

What can I do except choose better initial values?

The last question: by its name, "survreg" does its job by regression, but why does p.75 of Tableman, Kim (2004) say that "We use the S function survReg to fit parametric models (with the MLE approach)..."? Does survreg use the regression or the MLE approach?

Thanks for your help.

[1] Mara Tableman, Jong Sung Kim, Survival Analysis Using S, Chapman & Hall/CRC, 2004

On Sun, 17 Oct 2004, Kuan-Ta Chen wrote:

> Hi, all:
>
> I find survreg {survival} has provided many distributions such as weibull,
> lognormal, etc. But I wonder why it doesn't have the support for gamma
> distribution since it should be a good distr. in lifetime analysis. Can
> anybody figure out the reason?

I suspect Dr Therneau had no need of it: it is not commonly a good distribution in medical applications. He did however provide a way for users to specify other distributions: see ?survreg.distributions.

> I've tried to implement the likelihood function of progressively censored
> data for gamma distr. and use optim() to solve the parameters. The
> log-likelihood function L contains some integrations. I use tryCatch() to

It should not contain numerical integrations: all you need is dgamma and pgamma to specify the log-likelihood.

> capture the error when integration lead to divergence and return Inf.
> But if consequent two calls to the objective function return Inf, optim()
> will raise errors:
>
> Error in optim(c(ga, 1/la), fr, method = "BFGS") :
>   non-finite finite-difference value [1]
>
> What can I do except for choosing better initial values?

It seems very unlikely that the log-likelihood really is Inf, and so you need to calculate it more carefully. Finite-differencing numerical integrations is almost bound to be unstable, and you can write down the log-likelihood and its first derivative in terms of functions available in R.

> The last question, by its name "survreg", survreg does its job by
> regression,
> but why p.75 in Tableman, Kim (2004) said that "We use the S function
> survReg to fit parametric models (with the MLE approach)...". Does survreg
> use regression or MLE approach?

What do you understand by these? There is no such thing as a 'regression approach'. survreg fits a linear regression model to log survival times, by maximum likelihood.

Note that 'regression' is often used to mean fitting by OLS, but also often used to mean a linear model for a mean effect. I suggest you find a less confusing text.

-- 
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK, Fax: +44 1865 272595

Kuan-Ta Chen <kuan <at> ilife.cx> writes:

: I find survreg {survival} has provided many distributions such as weibull,
: lognormal, etc. But I wonder why it doesn't have the support for gamma
: distribution since it should be a good distr. in lifetime analysis. Can
: anybody figure out the reason?

I have not tried it myself but you might want to check out:

http://www.math.mun.ca/~ypeng/research/gfcure/index.htm

On Sun, 17 Oct 2004, Kuan-Ta Chen wrote:

> Hi, all:
>
> I find survreg {survival} has provided many distributions such as weibull,
> lognormal, etc. But I wonder why it doesn't have the support for gamma
> distribution since it should be a good distr. in lifetime analysis. Can
> anybody figure out the reason?

Presumably the actual reason is that Terry Therneau didn't need to use the Gamma model. However, all the distributions in survreg are location-scale families, which the Gamma is not, so the basic algorithm would have to be different.

> I've tried to implement the likelihood function of progressively censored
> data for gamma distr. and use optim() to solve the parameters. The
> log-likelihood function L contains some integrations.

It shouldn't have to: we do have pgamma() built in (and digamma, trigamma, etc. for derivatives).

> I use tryCatch() to
> capture the error when integration lead to divergence and return Inf.
> But if consequent two calls to the objective function return Inf, optim()
> will raise errors:
>
> Error in optim(c(ga, 1/la), fr, method = "BFGS") :
>   non-finite finite-difference value [1]
>
> What can I do except for choosing better initial values?

Choose better initial values. You should be able to get quite good initial values for regression coefficients by using survreg on a lognormal distribution, since Gamma and lognormal distributions agree pretty well except in the extreme tails. You could then try getting the shape parameter by matching the variance of the Gamma to the variance of the fitted lognormal.

> The last question, by its name "survreg", survreg does its job by
> regression,
> but why p.75 in Tableman, Kim (2004) said that "We use the S function
> survReg to fit parametric models (with the MLE approach)...". Does survreg
> use regression or MLE approach?

Its job *is* regression. It uses maximum likelihood to fit a regression model.

-thomas
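The starting-value suggestion can be read in more than one way. The sketch below takes one possible reading, which is an assumption and not something stated in the thread: pick the gamma shape and scale whose mean and variance equal those of a fitted lognormal on the original time scale. The helper name `gamma_start_from_lognormal` and the numeric inputs are hypothetical.

```r
# Hypothetical helper: gamma (shape, scale) with the same mean and variance
# as a lognormal distribution with parameters meanlog and sdlog.
gamma_start_from_lognormal <- function(meanlog, sdlog) {
  m <- exp(meanlog + sdlog^2 / 2)    # lognormal mean
  v <- (exp(sdlog^2) - 1) * m^2      # lognormal variance
  # For a gamma: mean = shape * scale, variance = shape * scale^2.
  c(shape = m^2 / v, scale = v / m)
}

# Illustrative values only. With no covariates they might be taken from
#   fit <- survival::survreg(Surv(time, event) ~ 1, dist = "lognormal")
#   meanlog <- coef(fit)[1]; sdlog <- fit$scale
start <- gamma_start_from_lognormal(meanlog = 1, sdlog = 0.5)
start
```

Under this reading the implied shape depends only on `sdlog`, since the squared coefficient of variation is `exp(sdlog^2) - 1` for the lognormal and `1 / shape` for the gamma.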

One million thanks to Prof. Ripley and Prof. Lumley. I think I now understand survreg with the gamma distribution better. But one of my problems remains: in the text of Lee, Wang (2003), there are two "kinds" of parametric fitting: 1) fitting of survival distributions (like regular probability distribution fitting) and 2) regression model fitting (mostly assuming an accelerated failure time model). survreg {survival} provides model fitting of (2). But I still have a problem regarding (1): I am trying to estimate the parameters of a gamma distribution for some data.

For regular gamma distribution fitting, we could use fitdistr (MASS), or optim()/mle() with a log-likelihood composed of dgamma()/pgamma(). But because the data contain (randomly) censored observations, the log-likelihood function must be modified to include the effect of the duration of the censored observations.

To clarify, I've excerpted the log-likelihood function and the two equations for gamma and lambda obtained by taking first derivatives. Unfortunately, both the log-likelihood function and the equations contain integrations, and I can't eliminate them analytically. That's the reason why I used integration in optim() and always got errors (since I have no clue how to handle divergent integrations).

The excerpt is from Lee, Wang (2003) p.193 (sorry, I don't have another way to show the complicated equations): http://kuan.ilife.cx/gammamle.jpg

The authors suggest using a numerical method to solve the equations, and I have no idea how to eliminate the integrations from these equations before calling optim(). Please give me some hints, thanks.

Lee, Wang (2003): Elisa T. Lee, John Wenyu Wang, Statistical Methods for Survival Data Analysis, 3rd edition, 2003

Best regards,
Kuan-Ta Chen

----- Original Message -----
From: "Thomas Lumley" <tlumley at u.washington.edu>
To: "Kuan-Ta Chen" <kuan at ilife.cx>
Cc: <r-help at stat.math.ethz.ch>
Sent: Monday, October 18, 2004 11:48 PM
Subject: Re: [R] Survreg with gamma distribution

On Tue, 19 Oct 2004, Kuan-Ta Chen wrote:

> One million thanks to Prof. Ripley and Prof. Lumley. I think I now have more
> understanding regarding survreg with gamma distribution. But one of my
> problems is still there: in the text of Lee, Wang (2003), there are two
> "kinds" of parametric fitting: 1) fitting of survival distributions (like
> regular probability distribution fitting) 2) regression model fitting
> (mostly assume an accelerated failure time model). Survreg {survival}
> provides model fitting of (2). But I still have one problem regarding (1):
> try to estimate the parameters of gamma distributions for some data.

There aren't really two separate kinds: 1 is a special case of 2, so survreg() can do 1.

> For regular gamma distr. fitting, we could use fitdistr (MASS) or use
> optim()/mle() with log-likelihood composed by dgamma()/pgamma(). But because
> the data contains (randomly) censored observations, the log-likelihood
> function must be modified to include the effect of duration of censored
> observations.

Yes. The loglikelihood contribution is

    pgamma(x,shape,scale=scale,lower.tail=FALSE,log.p=TRUE)

for a censored observation and

    dgamma(x,shape,scale=scale,log=TRUE)

for an uncensored observation. No integration necessary.

You might want to work with log(shape) and log(scale) instead, to avoid the boundaries at 0.

E.g., if your data were in variables "times" and "status":

    ll <- function(logshape, logscale) {
        # Negative log-likelihood. status must be TRUE (1) for a censored
        # observation here, the reverse of the Surv() event convention.
        -sum(ifelse(status,
                    pgamma(times, exp(logshape), scale=exp(logscale),
                           lower.tail=FALSE, log.p=TRUE),
                    dgamma(times, exp(logshape), scale=exp(logscale), log=TRUE)
        ))
    }

This works in mle() without too much sensitivity to starting values.

-thomas
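To make the recipe above concrete, here is a minimal sketch on simulated data, using `mle()` from the stats4 package. Everything about the data is an assumption for illustration: the true shape and scale, the uniform censoring times, and the variable names `censored` and `nll` are not from the thread.

```r
library(stats4)

set.seed(1)
n <- 200
event_time  <- rgamma(n, shape = 2, scale = 3)  # assumed true parameters
censor_time <- runif(n, 0, 20)                  # assumed censoring mechanism
times    <- pmin(event_time, censor_time)       # observed durations
censored <- event_time > censor_time            # TRUE where right-censored

# Negative log-likelihood on the log(shape), log(scale) parameterisation:
# log survivor function for censored times, log density for observed events.
nll <- function(logshape = 0, logscale = 0) {
  shape <- exp(logshape)
  scale <- exp(logscale)
  -sum(ifelse(censored,
              pgamma(times, shape, scale = scale,
                     lower.tail = FALSE, log.p = TRUE),
              dgamma(times, shape, scale = scale, log = TRUE)))
}

fit <- mle(nll, start = list(logshape = 0, logscale = 0))
est <- c(shape = exp(coef(fit)[["logshape"]]),
         scale = exp(coef(fit)[["logscale"]]))
est
```

The log parameterisation keeps the optimiser away from the boundary at zero, so the default starting point (shape and scale both 1) is usually enough for data of this kind.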

On Mon, Oct 18, 2004 at 08:48:40AM -0700, Thomas Lumley wrote:

> On Sun, 17 Oct 2004, Kuan-Ta Chen wrote:
>
> >Hi, all:
> >
> >I find survreg {survival} has provided many distributions such as weibull,
> >lognormal, etc. But I wonder why it doesn't have the support for gamma
> >distribution since it should be a good distr. in lifetime analysis. Can
> >anybody figure out the reason?
>
> Presumably the actual reason is because Terry Therneau didn't need to use
> the Gamma model. However, all the distributions in survreg are
> location-scale families,

But only after a time transformation (usually the log transformation) in most cases (exponential, Weibull, lognormal, ...)

> which the Gamma is not, so the basic algorithm
> would have to be different.

which also holds for the Gamma; log(Gamma) is a location-scale family. So the basic algorithm should work after all? (Haven't tried it myself, though.)

Göran

-- 
Göran Broström, tel: +46 90 786 5223
Department of Statistics, fax: +46 90 786 6614
Umeå University, http://www.stat.umu.se/egna/gb/
SE-90187 Umeå, Sweden, e-mail: gb at stat.umu.se

On Mon, 18 Oct 2004, Göran Broström wrote:

> On Mon, Oct 18, 2004 at 08:48:40AM -0700, Thomas Lumley wrote:
>> However, all the distributions in survreg are
>> location-scale families,
>
> But only after a time transformation (usually the log transformation) in
> most cases (exponential, Weibull, lognormal, ...)
>
>> which the Gamma is not, so the basic algorithm
>> would have to be different.
>
> which also holds for the Gamma; log(Gamma) is a location-scale family. So
> the basic algorithm should work after all? (Haven't tried it myself,
> though.)

I don't think the log(Gamma) is a location-scale family (though I may be missing something). For fixed shape parameter it is a location family, but not a scale family as the shape parameter varies:

a) In survreg() the extreme-value [log(Weibull)] distributions are the location-scale family that contain the log(Exponential).

b) The standardised skewness of log(Gamma) random variables varies with the shape parameter (by simulation), though not with the scale parameter.

-thomas
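Point (b) is easy to check by simulation. The following is a sketch under assumed settings: the grid of shapes and scales, the sample size and the helper names `std_skewness` and `sim_skew` are illustrative choices, not from the thread. The `theory` line relies on the standard result that the cumulants of a log-gamma variable of order two and above are polygamma functions of the shape alone.

```r
set.seed(123)

# Sample standardised skewness: mean cubed z-score.
std_skewness <- function(x) {
  z <- (x - mean(x)) / sd(x)
  mean(z^3)
}

# Skewness of log(X) for X ~ Gamma(shape, scale), estimated from n draws.
sim_skew <- function(shape, scale, n = 1e5) {
  std_skewness(log(rgamma(n, shape = shape, scale = scale)))
}

shapes <- c(0.5, 2, 10)
scales <- c(0.1, 1, 10)

# Rows index the shape, columns index the scale.
skew <- outer(shapes, scales, Vectorize(sim_skew))
dimnames(skew) <- list(shape = as.character(shapes),
                       scale = as.character(scales))
round(skew, 2)

# Theoretical skewness of log(Gamma): depends on the shape only.
theory <- psigamma(shapes, 2) / psigamma(shapes, 1)^1.5
round(theory, 2)
```

Within each row the values should agree up to simulation noise, while they change markedly from row to row. A location-scale family would have the same standardised skewness for every member, so the variation across shapes is what rules log(Gamma) out.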

On Mon, Oct 18, 2004 at 12:57:09PM -0700, Thomas Lumley wrote:

> On Mon, 18 Oct 2004, Göran Broström wrote:
>
> >On Mon, Oct 18, 2004 at 08:48:40AM -0700, Thomas Lumley wrote:
> >>However, all the distributions in survreg are
> >>location-scale families,
> >
> >But only after a time transformation (usually the log transformation) in
> >most cases (exponential, Weibull, lognormal, ...)
> >
> >>which the Gamma is not, so the basic algorithm
> >>would have to be different.
> >
> >which also holds for the Gamma; log(Gamma) is a location-scale family. So
> >the basic algorithm should work after all? (Haven't tried it myself,
> >though.)
>
> I don't think the log(Gamma) is a location-scale family (though I may be
> missing something).

You are not missing anything, but I was, apparently; I have always thought of a shape parameter as follows: if the cdf of an rv X can be written as

    F(x) = G((x/s)^p),

then (s, p) is a scale-shape parameter. In that case, the log transform (of X) gives a location-scale family of distributions.

Obviously, the gamma cdf is not of the scale-shape form above, and so the log transform does not give a location-scale family. I apologize for the misinformation.

Göran
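The step from the scale-shape form to a location-scale family can be written out as follows. This is a sketch of the standard argument, not part of the original message; the symbols $\mu$, $\sigma$, $H$ and the gamma shape $k$ are introduced here for illustration.

```latex
% Scale-shape form for X > 0, and the cdf of Y = log X:
\begin{align*}
F(x) &= G\bigl((x/s)^p\bigr), \\
P(\log X \le y) &= F(e^{y})
  = G\bigl(\exp\{p\,(y - \log s)\}\bigr)
  = H\!\left(\frac{y - \mu}{\sigma}\right),
\end{align*}
% with H(z) = G(e^z), location mu = log s and scale sigma = 1/p.
%
% Weibull: G(u) = 1 - e^{-u}, so H(z) = 1 - \exp(-e^{z}) (extreme value).
%
% Gamma with shape k and scale s:
\begin{equation*}
P(\log X \le y) = \frac{\gamma\bigl(k,\, e^{\,y - \log s}\bigr)}{\Gamma(k)},
\end{equation*}
% where the shape k is not a rescaling of y: log s is a location
% parameter, but k changes the form of the distribution itself.
```

So the log of a Weibull variable has location $\log s$ and scale $1/p$, while for the gamma only the location part survives, which matches the conclusion reached in the thread.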

Many thanks to Prof. Lumley. You are right, the likelihood function can be written with the built-in gamma distribution functions. Now the MLE works completely fine. :-)

Best Regards,
Kuan-Ta Chen

----- Original Message -----
From: "Thomas Lumley" <tlumley at u.washington.edu>
To: "Kuan-Ta Chen" <kuan at ilife.cx>
Cc: <r-help at stat.math.ethz.ch>
Sent: Tuesday, October 19, 2004 1:37 AM
Subject: Re: [R] Survreg with gamma distribution
