R friends,

In a matrix of 1s and 0s, I'm getting a singularity error. Any helpful ideas?

lm(formula = activity ~ metaF + metaCl + metaBr + metaI + metaMe +
    paraF + paraCl + paraBr + paraI + paraMe)

Residuals:
       Min         1Q     Median         3Q        Max
-4.573e-01 -7.884e-02  3.469e-17  6.616e-02  2.427e-01

Coefficients: (1 not defined because of singularities)
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   7.9173     0.1129  70.135  < 2e-16 ***
metaF        -0.3973     0.2339  -1.698 0.115172
metaCl            NA         NA      NA       NA
metaBr        0.3454     0.1149   3.007 0.010929 *
metaI         0.4827     0.2339   2.063 0.061404 .
metaMe        0.3654     0.1149   3.181 0.007909 **
paraF         0.7675     0.1449   5.298 0.000189 ***
paraCl        0.3400     0.1449   2.347 0.036925 *
paraBr        1.0200     0.1449   7.040 1.36e-05 ***
paraI         1.3327     0.2339   5.697 9.96e-05 ***
paraMe        1.2191     0.1573   7.751 5.19e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2049 on 12 degrees of freedom
Multiple R-squared: 0.9257,     Adjusted R-squared: 0.8699
F-statistic: 16.61 on 9 and 12 DF,  p-value: 1.811e-05
On 26-Feb-09 12:58:49, Bob Gotwals wrote:
> R friends,
>
> In a matrix of 1s and 0s, I'm getting a singularity error. Any helpful
> ideas?

From the degrees of freedom in your output, it seems you are fitting
10 binary variables to only about 22 observations (12 residual degrees
of freedom plus the 10 coefficients that were actually estimated).
In such circumstances, it is not unlikely that the matrix of 0s and 1s
representing the binary variables would have at least one column which
can be represented as a linear combination of the others (which is what
"1 not defined because of singularities" means).

Get more data, or use fewer variables!

Or, also worth considering, check whether there are relationships
"in the real world" between your 10 variables which would tend to
generate such linear dependence.

Ted.

> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 26-Feb-09                                       Time: 15:07:40
------------------------------ XFMail ------------------------------
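The linear dependence described above can be checked directly on the design matrix. The following is only a sketch on made-up data, since the original data set was not posted: the data frame d, its values, and the assumption that every compound carries exactly one of three meta substituents are all hypothetical. It compares the rank of the model matrix with its number of columns and asks alias() which column is a combination of the others.

```r
# Hypothetical data (the poster's data are not available).
# Every compound has exactly one meta substituent, so
# metaF + metaCl + metaBr equals the intercept column.
meta <- factor(rep(c("F", "Cl", "Br"), length.out = 12))
d <- data.frame(
  metaF  = as.numeric(meta == "F"),
  metaCl = as.numeric(meta == "Cl"),
  metaBr = as.numeric(meta == "Br"),
  paraF  = rep(c(0, 1), each = 6)
)
set.seed(42)
d$activity <- 8 + 0.3 * d$metaBr + 0.8 * d$paraF + rnorm(12, sd = 0.2)

fit <- lm(activity ~ metaF + metaCl + metaBr + paraF, data = d)

# A rank smaller than the number of columns means at least one
# column is a linear combination of the others.
X <- model.matrix(fit)
design_rank <- qr(X)$rank
n_columns   <- ncol(X)

# Terms that lm() could not estimate (reported as NA in summary()).
na_terms <- names(coef(fit))[is.na(coef(fit))]

# alias() shows how the dropped term depends on the retained ones.
print(alias(fit))
```

In this made-up design the rank is one less than the number of columns, which is the situation that "1 not defined because of singularities" reports.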
It looks like your data do not contain enough information to estimate
the parameter for metaCl. Maybe because metaCl is identical to one of
the other variables, or is constant.

HTH,

Thierry

----------------------------------------------------------------------------
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

-----Original message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On behalf of Bob Gotwals
Sent: Thursday 26 February 2009 13:59
To: r-help at r-project.org
Subject: [R] Singularity in a regression?

R friends,

In a matrix of 1s and 0s, I'm getting a singularity error. Any helpful ideas?
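A quick way to test the two possibilities mentioned above (a constant column, or a column identical to another one) is sketched here. The data frame X and its values are invented for illustration. As far as lm()'s pivoting goes, a term is set to NA when it is a linear combination of the terms that precede it in the formula, so an NA for metaCl, the second predictor, would be consistent with metaCl being constant (for instance all zeros) or determined by metaF alone.

```r
# Hypothetical 0/1 predictors: metaCl is all zeros (constant) and
# metaI is an exact copy of metaF.
X <- data.frame(
  metaF  = c(1, 0, 0, 0, 1, 0),
  metaCl = c(0, 0, 0, 0, 0, 0),
  metaBr = c(0, 1, 0, 1, 0, 0),
  metaI  = c(1, 0, 0, 0, 1, 0)
)

# Columns with a single distinct value carry no information
# beyond the intercept.
is_constant   <- vapply(X, function(col) length(unique(col)) == 1, logical(1))
constant_cols <- names(X)[is_constant]

# Columns that repeat an earlier column exactly.
duplicate_cols <- names(X)[duplicated(as.list(X))]

# Column sums show how many compounds carry each substituent.
print(colSums(X))
print(constant_cols)
print(duplicate_cols)
```

These checks only catch the two simplest cases; a column that is a combination of several others needs a rank check on the full model matrix.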
I saw Ted's reply and it is certainly sensible. I would wonder whether
the model ought to be recast so that the scientific question is clearer.
You are obviously studying the effect of different substitutions (F, Cl,
Br, I, Me) and different positions around an aromatic ring (meta, para).
Why not consider the order of electrophilicity (or possibly size) and the
position as two different variables, one ordered and the other binary?

After recoding, your formula might then look like

    activity ~ electro + position

or possibly

    activity ~ electro + size + position

and you would be less likely to run into difficulties with collinearity.
You would also have some science in your model rather than casting
aimlessly about in the data. If your ordering is sensible, you end up
testing with 2 or 3 degrees of freedom.

-- 
David Winsemius

On Feb 26, 2009, at 7:58 AM, Bob Gotwals wrote:
> R friends,
>
> In a matrix of 1s and 0s, I'm getting a singularity error. Any
> helpful ideas?
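The recoding suggested above might look something like the sketch below. Everything in it is assumed for illustration: the compounds, the activity values, the restriction to the four halogens, the idea that each compound carries a single substituent, and the integer score electro_rank (I < Br < Cl < F) standing in for the ordered variable. The methyl group is left out because its place in such an ordering would be a further assumption.

```r
# Hypothetical compounds, one substituent each; activity values are made up.
d <- data.frame(
  substituent = c("F", "Cl", "Br", "I", "F", "Cl", "Br", "I"),
  position    = factor(c("meta", "meta", "meta", "meta",
                         "para", "para", "para", "para")),
  activity    = c(7.5, 8.1, 8.3, 8.4, 8.7, 8.3, 8.9, 9.2)
)

# Assumed ordering of the halogens, coded as an integer score so that
# the substituent effect costs one degree of freedom.
halogen_order  <- c("I", "Br", "Cl", "F")
d$electro_rank <- match(d$substituent, halogen_order)

# Two predictors replace the ten indicator columns.
fit <- lm(activity ~ electro_rank + position, data = d)
print(summary(fit))

n_na_coefs  <- sum(is.na(coef(fit)))
residual_df <- df.residual(fit)
```

With a score for the substituent and a two-level factor for position, the model spends 2 degrees of freedom on predictors, and adding a separate size variable would make it 3, provided that variable is not just the same ordering reversed.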
If collinearity exists, one solution is a regularized version of
regression. There are different regularization methods, such as ridge,
LASSO and elastic net. For example, the MASS package provides ridge
regression (lm.ridge).

Alex

On Thu, Feb 26, 2009 at 1:58 PM, Bob Gotwals <gotwals@ncssm.edu> wrote:
> R friends,
>
> In a matrix of 1s and 0s, I'm getting a singularity error. Any helpful
> ideas?
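A minimal sketch of the ridge suggestion, using lm.ridge() from MASS on made-up data. The data frame d, its values, and the penalty lambda = 0.1 are all assumptions chosen for illustration, not a recommendation. The design is deliberately singular (the three meta indicators sum to the intercept), so lm() leaves one coefficient as NA while the ridge fit returns a value for every term.

```r
library(MASS)

# Hypothetical data with an exact linear dependence:
# metaF + metaCl + metaBr equals the intercept column.
meta <- factor(rep(c("F", "Cl", "Br"), length.out = 12))
d <- data.frame(
  metaF  = as.numeric(meta == "F"),
  metaCl = as.numeric(meta == "Cl"),
  metaBr = as.numeric(meta == "Br"),
  paraF  = rep(c(0, 1), each = 6)
)
set.seed(42)
d$activity <- 8 + 0.3 * d$metaBr + 0.8 * d$paraF + rnorm(12, sd = 0.2)

# Ordinary least squares: one term cannot be estimated.
ols_fit <- lm(activity ~ metaF + metaCl + metaBr + paraF, data = d)
n_ols_na <- sum(is.na(coef(ols_fit)))

# Ridge regression with an arbitrary, illustrative penalty.
ridge_fit  <- lm.ridge(activity ~ metaF + metaCl + metaBr + paraF,
                       data = d, lambda = 0.1)
ridge_coef <- coef(ridge_fit)
print(ridge_coef)
```

The penalty makes the estimates unique by shrinking them, but it does not add information: coefficients of exactly dependent columns still cannot be interpreted separately, and a column that is constant would need to be removed before fitting.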