tk@tariqkhan.org
2004-Aug-26 13:58 UTC
[R] EM norm package (NA/NaN/Inf in foreign function call (arg 2))
Greetings!

I am bootstrapping, and at each step of the loop I use EM from the norm package to fill in missing data in a financial time series. For the most part EM works fine for me, but the following error is guaranteed to appear before I reach the 200th scenario:

    Iterations of EM:
    1...2...3........348...349...Error: NA/NaN/Inf in foreign function call (arg 2)

The following code should replicate the error by downloading the dataset from the internet (it is not too big):

    library(norm)
    df<-download.file("http://www.tariqkhan.org/R/DataFromExcel.csv",
                      "C:/Program Files/R/d.csv")
    mat<-as.matrix(read.table("C:/Program Files/R/d.csv", sep = ","))
    s<-prelim.norm(mat)
    rngseed(1234567)
    thetahat<-em.norm(s, maxits = 1000, criterion = 0.0035)

    Iterations of EM:
    1...2...3........348...349...Error: NA/NaN/Inf in foreign function call (arg 2)

Someone else on the list found that using scale() helped with em.norm, but for me it only increased the number of iterations before giving the same error.

I don't get it. Insights into what I can do to solve this would be much appreciated!

Details:
Norm Package version 1.0.9;
R version 1.9.0;
Windows XP Pro 2002 SP1;
384MB RAM, Pentium 4 CPU 2.40 GHz
(Ted Harding)
2004-Aug-26 18:03 UTC
[R] EM norm package (NA/NaN/Inf in foreign function call (ar
On 26-Aug-04 tk at tariqkhan.org wrote:
> The following code should replicate the error by downloading
> the dataset from the internet (it is not too big):
>
> library(norm)
> df<-download.file("http://www.tariqkhan.org/R/DataFromExcel.csv",
> "C:/Program Files/R/d.csv")
> mat<-as.matrix(read.table("C:/Program Files/R/d.csv", sep = ","))

I downloaded the dataset in my own way (a 51x26 matrix with 166 missing values, right?) and then ran:

    mat <- as.matrix(read.csv("DataFromExcel.csv"))

> s<-prelim.norm(mat)
> rngseed(1234567)

You don't need to set rngseed at this stage, since em.norm does not require it; but never mind, it is needed if you go on to do imputations.

> thetahat<-em.norm(s, maxits = 1000, criterion = 0.0035)
>
> Iterations of EM:
> 1...2...3........348...349...Error: NA/NaN/Inf in foreign function call
> (arg 2)

I did not get this result: using the same command, em.norm terminated normally after 82 iterations.

You can get your error message when a [nearly] singular matrix is generated in the course of em.norm, since it has to invert a matrix to compute the expected values of the missing components of the sufficient statistics.

Having set rngseed as above, I then did

    mat.imp<-imp.norm(s,thetahat,mat)

after which

    svd(mat.imp)$d
     [1] 8.343633e+04 2.321644e-01 1.751089e-01 1.275187e-01
     [5] 1.116023e-01 8.807676e-02 8.006840e-02 6.198593e-02
     [9] 6.002220e-02 5.918019e-02 5.617467e-02 4.797701e-02
    [13] 4.631037e-02 4.239089e-02 3.917043e-02 3.786447e-02
    [17] 3.007310e-02 2.704916e-02 2.397084e-02 2.025846e-02
    [21] 1.681492e-02 1.336568e-02 9.161890e-03 6.042817e-03
    [25] 4.795948e-03 6.187377e-10

shows that the imputed matrix is close to 1-dimensional and very nearly singular: the largest singular value is 8e+04, the next 10 are O(0.1), the next 14 are O(0.01), and the last one is O(1e-09). So there is the potential for singularity problems. However, as I say, I did not encounter any, so the behaviour you observe is a bit puzzling.
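To illustrate the kind of check described above, here is a minimal sketch in base R. It uses a made-up matrix, not the posted dataset: the dimensions, noise level, and the deliberately dependent sixth column are all assumptions chosen only to reproduce the same pattern of one dominant singular value and one near-zero singular value.

```r
# Made-up stand-in for an imputed data matrix; NOT the DataFromExcel.csv data.
set.seed(1)
n <- 51
x <- matrix(rnorm(n * 5, mean = 100, sd = 0.05), nrow = n, ncol = 5)
x <- cbind(x, x[, 1] + x[, 2])  # sixth column is an exact linear combination

# Singular values of the raw matrix: one large value from the common level,
# several small ones from the noise, and one near zero from the dependence.
d <- svd(x)$d
cond_raw <- d[1] / d[length(d)]

# Centering removes the common level, so what remains reflects the
# covariance structure of the columns.
xc <- scale(x, center = TRUE, scale = FALSE)
dc <- svd(xc)$d
cond_centered <- dc[1] / dc[length(dc)]

# Numerical rank of the centered matrix; a value below ncol(x)
# signals (near-)collinear columns.
rank_centered <- qr(xc)$rank

print(signif(d, 3))
print(c(raw = cond_raw, centered = cond_centered))
print(rank_centered)
```

If the same check on real data reports a rank below the number of columns, one plausible next step (an assumption, not something tested on this dataset) is to look for a column that is a sum or rescaling of others and drop it before calling prelim.norm.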
I observe that if I set "criterion = 0.000699" or greater (compared with your 0.0035), then em.norm terminates normally in 479 cycles or fewer, while for "criterion = 0.000698" or less it goes the full 1000 cycles. But still no error message. However, this does suggest that the maximum is not too well defined.

I'm using norm version 1.0-9, like you, with R version 1.8.1 on Linux (so I did dos2unix on DataFromExcel.csv as well, but that shouldn't matter). Apart from the version of R, the only difference between us is that you're running on Windows rather than Linux, but hopefully that shouldn't matter either.

Hmmm.

Best wishes,
Ted.

> Someone else on the list found that using scale() helped with
> em.norm, but for me it only increased the number of iterations
> before giving the same error.
>
> I don't get it. Insights into what I can do to solve this would
> be much appreciated!
>
> Details:
> Norm Package version 1.0.9;
> R version 1.9.0;
> Windows XP Pro 2002 SP1;
> 384MB RAM, Pentium 4 CPU 2.40 GHz

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 167 1972
Date: 26-Aug-04 Time: 19:03:53
------------------------------ XFMail ------------------------------