Before reading water chemistry into a data frame I removed all missing
data. Yet when I try to run cenros() to summarize a specific chemical I get
an error that I do not understand:

with( subset(chem, param=='Ag'), cenros(quant,ceneq1) )
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
   NA/NaN/Inf in 'y'

I would like to learn what I did incorrectly so I can avoid these errors
in the future.

The data frame structure is

str(chem)
'data.frame':   120309 obs. of  8 variables:
  $ site    : Factor w/ 65 levels ";Influent","D-1",..: 2 2 2 2 2 2 ...
  $ sampdate: Date, format: "2007-12-12" "2007-12-12" ...
  $ preeq0  : logi  TRUE TRUE TRUE TRUE TRUE TRUE ...
  $ param   : Factor w/ 37 levels "Ag","Al","Alk_tot",..: 1 2 8 17 3 9 ...
  $ quant   : num  0 0.106 1 231 231 0.011 0 0.002 0 100 ...
  $ ceneq1  : logi  FALSE FALSE TRUE FALSE FALSE FALSE ...
  $ floor   : num  0 0.106 0 231 231 0.011 0 0 0 100 ...
  $ ceiling : Factor w/ 3909 levels "0.000","0.000)",..: 1 116 841 1771 ...

I ran dput() on the data frame but cannot make sense of the output (a 5.5M
ASCII text file).

Pointers appreciated.

Rich

On 07/11/2012 05:54 AM, Rich Shepard wrote:
> Yet when I try to run cenros() to summarize a specific chemical I get
> an error that I do not understand:
>
> with( subset(chem, param=='Ag'), cenros(quant,ceneq1) )
> Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
>    NA/NaN/Inf in 'y'

Hi Rich,

I don't have the NADA package, but I suspect that the cenros function is
doing something like dividing by zero. With that much data, it may be hard
to pinpoint where this is occurring. I would cut my data in half, run it,
cut the remainder in half, run it, and so on until the error goes away.
With any luck, the last slice of data that was removed won't be too large
to work out which value is causing the problem and what the problem is.

Jim
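
The halving approach Jim describes could be sketched roughly as below. This
is only an illustration under stated assumptions: NADA is not loaded, so
fit_fails() is a hypothetical stand-in that fits lm() on log(quant), which
fails with the same lm.fit message when a value is zero. The toy data frame
and the names fit_fails and bisect_rows are invented for the example; with
the real data one would call cenros(quant, ceneq1) inside the tryCatch()
instead.

```r
# Hypothetical stand-in for the failing call: TRUE if the fit errors.
# lm() on log(quant) stops with "NA/NaN/Inf in 'y'" when quant contains 0.
fit_fails <- function(d) {
  tryCatch({
    lm(log(quant) ~ 1, data = d)
    FALSE
  }, error = function(e) TRUE)
}

# Repeatedly halve the rows, keeping a half that still triggers the error.
bisect_rows <- function(d, fails) {
  while (nrow(d) > 1) {
    n      <- nrow(d)
    half   <- n %/% 2
    first  <- d[seq_len(half), , drop = FALSE]
    second <- d[(half + 1):n, , drop = FALSE]
    if (fails(first)) {
      d <- first
    } else if (fails(second)) {
      d <- second
    } else {
      break  # neither half fails alone; inspect the remaining slice by hand
    }
  }
  d
}

# Toy data (invented), containing a single zero.
toy <- data.frame(quant = c(0.106, 1, 231, 0, 0.011))
culprit <- bisect_rows(toy, fit_fails)
print(culprit)
```

If several rows are bad, this finds one of them per run; dropping it and
repeating would find the others.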

And my "easy" but not very useful answer is that this particular subset
probably violates some assumption of the cenros() model.

I myself would start with simple inspections of the data, such as

  with( subset(chem, param=='Ag'), table(ceneq1) )
  with( subset(chem, param=='Ag'), qqnorm(quant) )
  with( subset(chem, param=='Ag'), range(quant) )

in the hopes that something pops out.

Do you have any zeros in quant? (see ?cenros)

-Don

-- 
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062

On 7/10/12 12:54 PM, "Rich Shepard" <rshepard at appl-ecosys.com> wrote:

> Yet when I try to run cenros() to summarize a specific chemical I
> get an error that I do not understand:
>
> with( subset(chem, param=='Ag'), cenros(quant,ceneq1) )
> Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
>    NA/NaN/Inf in 'y'
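
Don's question about zeros can be checked directly. The sketch below uses a
small invented data frame with the same column names as chem; the values
are made up. It assumes, as I understand the default in ?cenros, that the
fit is done on log-transformed concentrations, so a zero in quant would
become -Inf and produce the "NA/NaN/Inf in 'y'" message. Note that the
str() output in the original post already shows the first row as param
level 1 ("Ag") with quant 0, so this looks like a plausible cause, though
I have not confirmed it against the real data.

```r
# Toy stand-in for chem (invented values, same column names as the post).
chem <- data.frame(
  param  = factor(c("Ag", "Ag", "Ag", "Al", "Ag")),
  quant  = c(0, 0.002, 0.011, 0.106, 0),
  ceneq1 = c(TRUE, FALSE, FALSE, FALSE, TRUE)
)

ag <- subset(chem, param == "Ag")

# How many values are exactly zero, or already non-finite?
n_zero <- sum(ag$quant == 0)
n_bad  <- sum(!is.finite(ag$quant))

# How many become non-finite after a log transform (the assumed default)?
n_bad_log <- sum(!is.finite(log(ag$quant)))

# Cross-tabulate zeros against the censoring flag.
print(table(zero = ag$quant == 0, censored = ag$ceneq1))
```

If n_zero is positive for the real Ag subset, the zeros probably need to be
recoded as censored values at the reporting limit before calling cenros(),
but that is a data decision only the person who knows the lab results can
make.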