Colin Bleay wrote:

> last week i sent an e-mail about dealing with errors thrown up from a
> glm.nb model carried out on multiple random datasets.
>
> every so often a dataset is created which results in the following error
> after a call to glm.nb:
>
> "Error: NA/NaN/Inf in foreign function call (arg 1)
> In addition: Warning message:
> Step size truncated due to divergence"
>
> I am at a loss as to how to deal with this.
>
> firstly because the dataset that is generated, although throwing an error
> when the glm.nb model is applied, is a valid dataset. so how do i
> incorporate this dataset in my results (results being descriptive stats on
> the coefficients from the multiple datasets) i.e. should coefficients be
> set to zero?

Almost surely, setting the coefficients equal to 0 is the wrong thing to do. What the right thing is depends on the answer to ``lastly''.

Setting the coefficients to be NA in this case (i.e. effectively throwing away such cases) is also wrong, but not quite as wrong as setting them equal to 0.

> secondly, how do i capture and deal with the error. is it possible to
> construct an "if" statement so that "if error, do this, if not continue"

This should be do-able using try(). Something like:

    c.list <- list()
    save.bummers <- list()
    K <- 0
    for(i in 1:42) {
        repeat {
            X <- generate.random.data.set()
            Y <- try(glm.nb(X,whatever))
            if(inherits(Y,"try-error")) {
                # fit failed: keep the offending data set and draw again
                K <- K+1
                save.bummers[[K]] <- X
            } else break
        }
        # extract the fitted coefficients
        c.list[[i]] <- coef(Y)
    }

This should give you a sample of 42 coefficient vectors from the ``successful'' data sets, and a list of all the (a random number of) data sets that yielded a lack of success. You can then take the data sets stored in save.bummers and experiment with them to see what is causing the problem.

> lastly, i am unsure as to what characteristics of a dataset would result in
> these errors in the glm.nb?

Here I have to heed the advice (attributed to a ``great art historian'') from George F. Simmons' wonderful book on elementary differential equations: ``A fool he who gives more than he has.''

cheers,

Rolf Turner
rolf at math.unb.ca
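For anyone wanting to run the try() pattern above end to end, here is a minimal sketch in base R. The names make_data() and fit_model() are hypothetical stand-ins that do not come from the thread: fit_model() is a Poisson glm() that is made to fail on purpose for some data sets, so that the error branch is exercised. In real use it would presumably be replaced by the actual glm.nb() call.

```r
# Sketch only: make_data() and fit_model() are hypothetical stand-ins,
# not part of the original thread.
set.seed(1)

# Generate one random data set with a count response.
make_data <- function(n = 50) {
  x <- rnorm(n)
  data.frame(x = x, y = rpois(n, lambda = exp(0.5 + 0.3 * x)))
}

# Stand-in fitter: a Poisson GLM that deliberately signals an error
# on some data sets, mimicking an occasional fitting failure.
fit_model <- function(d) {
  if (mean(d$y) > 1.9) stop("simulated fitting failure")
  glm(y ~ x, family = poisson, data = d)
}

n_wanted <- 10
coefs    <- vector("list", n_wanted)
failed   <- list()

for (i in seq_len(n_wanted)) {
  repeat {
    d   <- make_data()
    fit <- try(fit_model(d), silent = TRUE)
    if (!inherits(fit, "try-error")) break
    # Fit failed: keep the data set for later inspection and draw again.
    failed[[length(failed) + 1]] <- d
  }
  coefs[[i]] <- coef(fit)
}

# One row of coefficients per successful fit.
coef_mat <- do.call(rbind, coefs)

# Share of generated data sets that could not be fitted.
fail_rate <- length(failed) / (n_wanted + length(failed))
```

Since the successful fits are not a random sample of all generated data sets, it is probably worth reporting fail_rate alongside any summary of coef_mat.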
CR Bleay, School Biological Sciences
2004-Jul-06 13:42 UTC
[R] Re: errors in randomization test
dear rolf,

thank you for the assistance, i did not know how to catch the errors from try.

of course the new code has thrown up a new error:

"Error in terms.default(object) : no terms component"

which i have to resolve.

cheers,

colin

--On Tuesday, July 6, 2004 9:23 am -0300 Rolf Turner <rolf at math.unb.ca> wrote:

> [...]

----------------------
Dr Colin Bleay
Dept. Biological Sciences,
University of Bristol,
Woodlands rd.,
Bristol, BS8 1UG. UK
Tel: 44 (0)117 928 7470
Fax: 44 (0)117
CR Bleay, School Biological Sciences
2004-Jul-08 14:35 UTC
[R] Re: errors in randomization test
Dear Rolf,

I tried using your code, however i have found that the whole routine is still stopped by the call to glm.nb for certain datasets before it enters the "if" statement.

is there any way to ensure that this does not occur?

cheers,

colin

--On Tuesday, July 6, 2004 9:23 am -0300 Rolf Turner <rolf at math.unb.ca> wrote:

> [...]

----------------------
Dr Colin Bleay
Dept. Biological Sciences,
University of Bristol,
Woodlands rd.,
Bristol, BS8 1UG. UK
Tel: 44 (0)117 928 7470
Fax: 44 (0)117
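The thread does not say why the loop still halts, so the following is only a sketch of one thing to try, assuming the failure is an ordinary R error raised inside the fitting call. It wraps the fit in tryCatch() with an explicit error handler that returns NULL, so the caller can test the result with is.null() before doing anything else with it. The helper safe_fit() and the two stand-in fitters are hypothetical and not from the thread.

```r
# Hypothetical helper (not from the thread): run a fitting function and
# return NULL instead of stopping when it signals an error.
safe_fit <- function(fit_fun, data) {
  tryCatch(
    fit_fun(data),
    error = function(e) {
      message("fit failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Stand-in fitters: one that always fails, one that always succeeds.
always_fails <- function(d) stop("NA/NaN/Inf in foreign function call (arg 1)")
works        <- function(d) lm(y ~ x, data = d)

d <- data.frame(x = 1:10, y = c(2, 4, 5, 4, 6, 7, 9, 8, 10, 12))

bad  <- safe_fit(always_fails, d)   # NULL, and execution continues
good <- safe_fit(works, d)          # a fitted "lm" object

# Only touch the fit once it is known to have succeeded.
if (!is.null(good)) good_coefs <- coef(good)
```

If the routine stops even with a wrapper like this, the stop is presumably happening somewhere outside the wrapped call, for example in a later line that uses the failed result.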