Dear r-users,

I am looking for an implementation of the Naive Bayes classifier for a
multi-class classification problem. I cannot even find a Naive Bayes
classifier for two classes, though I cannot believe it is not available.
Can anyone help me?

Uschi

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject!) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On Wed, 16 May 2001, Ursula Sondhauss wrote:

> I am looking for an implementation of the Naive Bayes classifier for a
> multi-class classification problem. I can not even find the Naive Bayes
> classifier for two classes, though I can not believe it is not
> available. Can anyone help me?

Hard to believe but likely true. However, as I understand this, it applies
to a (K+1)-way contingency table, with K explanatory factors and one
response, and the `naive Bayes' model is a particular model for that
table. If you want a classifier, you only need the conditional
distribution of the response given the explanatory factors, and that is a
main-effects-only multiple logistic model. Now the *estimation* procedures
may be slightly different (`naive Bayes' is not fully defined), but if
that does not matter, use multinom() in package nnet to fit this.

A book on Graphical Modelling (e.g. Whittaker or Edwards) may help
elucidate the connections.

Let me stress *as I understand this* here.

--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK, Fax: +44 1865 272595
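A minimal sketch of the multinom() suggestion above. The data frame `d`,
its factor names, and the class rule are all made up for illustration;
only the main-effects formula and the multinom() call itself come from
the posting.

```r
# Fit a main-effects-only multinomial logistic model with nnet::multinom().
# The synthetic data frame 'd' stands in for a table of K explanatory
# factors plus one response factor.
library(nnet)

set.seed(42)
d <- data.frame(f1 = factor(sample(c("a", "b"), 200, replace = TRUE)),
                f2 = factor(sample(c("x", "y", "z"), 200, replace = TRUE)))
d$class <- factor(ifelse(d$f1 == "a", "C1",
                         ifelse(d$f2 == "x", "C2", "C3")))

# Main effects only: class ~ f1 + f2, no interaction terms
fit <- multinom(class ~ f1 + f2, data = d, trace = FALSE)
head(predict(fit, type = "class"))
```

predict(fit, newdata = ..., type = "class") then gives the classifier.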
As I understand Naive Bayes, it is essentially a finite mixture model for
multivariate categorical distributions where the variables are independent
in each component of the mixture. That is, I believe it to be a synonym
for Latent Class analysis. I believe the Fraley/Raftery package mclust may
include this sort of model, and possibly other packages. Certainly these
models may be expressed in the language of graphical models. Whether or
not this would be useful for estimation purposes I am uncertain.

Murray Jorgensen

At 04:28 PM 16-05-01 +0100, Prof Brian Ripley wrote:
[...]

Murray Jorgensen, Department of Statistics, U of Waikato, Hamilton, NZ
-----[+64-7-838-4773]---------------------------[maj at waikato.ac.nz]-----
"Doubt everything or believe everything: these are two equally convenient
strategies. With either we dispense with the need to think."
- Henri Poincare
http://www.stats.waikato.ac.nz/Staff/maj.html
On Thu, 17 May 2001, Murray Jorgensen wrote:

> As I understand Naive Bayes it is essentially a finite mixture model
> for multivariate categorical distributions where the variables are
> independent in each component of the mixture. [...]

mclust (R-version) only fits Gaussian mixtures, with and without Poisson
noise.

Christian

***********************************************************************
Christian Hennig
University of Hamburg, Faculty of Mathematics - SPST/ZMS
(Schwerpunkt Mathematische Statistik und Stochastische Prozesse,
Zentrum fuer Modellierung und Simulation)
Bundesstrasse 55, D-20146 Hamburg, Germany
Tel: x40/42838 4907, privat x40/631 62 79
hennig at math.uni-hamburg.de, http://www.math.uni-hamburg.de/home/hennig/
#######################################################################
I recommend www.boag.de
The "naive Bayes" classifier I've seen discussed in various
machine-learning papers and books is as described by David Meyer in his
posting, except that class (mixture component) membership is known in the
training data. So it's "supervised" -- classes aren't "latent". The
estimation is usually just via "plug-in":

1. Compute marginal frequencies within class.
2. Multiply these together as if the variables (say x) were independent
   within class to get an "estimate" of the class-conditional
   probabilities p(x | c).
3. Via Bayes' rule get the (x-)conditional probabilities over class
   (posterior class probabilities) p(c | x). (Actually you don't need to
   divide here, since the denominator is a common factor in the
   quantities to be compared to get the classifier.)
4. To classify x, find the class c maximizing p(c | x) (or minimizing
   the sum of L(c,i)*p(i|x) over i if L(,) is a given loss function).

Often step 1 is replaced by Bayesian estimates of the marginal
probabilities to prevent 0 estimates and reduce variance. In case you
don't find an R implementation, I hope the above is helpful.

A final remark: while the expression for the posterior probabilities is
the same as for logistic regression (as Brian Ripley pointed out), the
estimation is different -- even in large samples -- when the model is
incorrect (as it is anticipated to be by the "naive" qualifier). Tom
Mitchell's talk at the SIAM Data Mining conference had an example of
this, citing large gains in performance by switching from the naive
Bayes approach to maximizing the logistic regression likelihood.
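The four steps above can be sketched in R for factor-valued predictors.
The function names nb_train/nb_predict and the smoothing constant are
made up for this illustration; they are not from any package.

```r
nb_train <- function(x, y, smooth = 1) {
  # x: data frame of factors; y: factor of class labels.
  # Step 1, with the Laplace-style smoothing mentioned at the end:
  # per-class marginal frequencies of each predictor.
  prior <- table(y) / length(y)
  tabs <- lapply(x, function(col) {
    tab <- table(y, col) + smooth
    tab / rowSums(tab)                 # rows: classes, columns: levels
  })
  list(prior = prior, tabs = tabs, classes = levels(y))
}

nb_predict <- function(model, newx) {
  # Steps 2-4: multiply the marginals (sums in log space), add the log
  # prior, skip the common divisor p(x), and pick the maximising class.
  apply(newx, 1, function(row) {
    logpost <- log(as.numeric(model$prior))
    for (j in seq_along(model$tabs))
      logpost <- logpost + log(model$tabs[[j]][, row[[j]]])
    model$classes[which.max(logpost)]
  })
}

# Tiny made-up example: one perfectly predictive factor.
d <- data.frame(f = factor(c("u", "u", "v", "v")))
y <- factor(c("a", "a", "b", "b"))
m <- nb_train(d, y)
nb_predict(m, d)
```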
Reid Huntsinger

-----Original Message-----
From: David Meyer [mailto:david.meyer at ci.tuwien.ac.at]
Sent: Thursday, May 17, 2001 5:32 AM
To: Murray Jorgensen
Cc: Ursula Sondhauss; r-help at stat.math.ethz.ch
Subject: Re: [R] Naive Bayes Classifier

Murray Jorgensen wrote:

> As I understand Naive Bayes it is essentially a finite mixture model
> for multivariate categorical distributions where the variables are
> independent in each component of the mixture. [...]

You could also try lca() in package e1071.

-d

--
Mag. David Meyer
Vienna University of Technology
Department for Statistics, Probability Theory and Actuarial Mathematics
Wiedner Hauptstrasse 8-10, A-1040 Vienna/AUSTRIA
Tel.: (+431) 58801/10772
mail: david.meyer at ci.tuwien.ac.at
While not an R implementation, BAYDA, a stand-alone Java program, does
Naive Bayes classification. The project web page is at:

http://www.cs.helsinki.fi/research/cosco/Projects/NONE/SW/

Pretty easy IMO to port data back and forth.

Best,
Mark Hall
Johann Petrak
2001-May-18 13:51 UTC
[R] Hashes in R? Multivariate random numbers? Cholesky decomposition?
Are there hashes in R? Or a package that implements hashes?

I am also looking for multivariate Gaussian random numbers. That brings
me to: is there a Cholesky decomposition of a matrix?

Johann
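The hashes part of the question is not answered in the thread below; for
reference, the usual idiom is an environment, which R stores as a hash
table. A sketch (names and values made up):

```r
# new.env(hash = TRUE) creates an environment backed by a hash table;
# assign/get/exists/ls give insert, lookup, membership test, and keys.
h <- new.env(hash = TRUE)
assign("apple", 1, envir = h)
assign("pear", 2, envir = h)

exists("apple", envir = h)   # TRUE
get("pear", envir = h)       # 2
ls(h)                        # the stored keys
```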
Prof Brian Ripley
2001-May-18 13:59 UTC
[R] Hashes in R? Multivariate random numbers? Cholesky decomposition?
On Fri, 18 May 2001, Johann Petrak wrote:

[...]

> I am also looking for multivariate gaussian random numbers.

mvrnorm in package MASS

> That brings me to: is there cholesky decomposition of
> a matrix?

Well, there is a Choleski decomposition (as he apparently spelt it):
?chol.

--
Brian D. Ripley, ripley at stats.ox.ac.uk
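The two pointers above can be combined: mvrnorm() draws multivariate
normals directly, and chol() lets you do the same by hand. The mean
vector and covariance matrix here are made-up examples.

```r
library(MASS)   # for mvrnorm()

set.seed(1)
mu    <- c(0, 5)
Sigma <- matrix(c(2, 1,
                  1, 2), nrow = 2)

# Direct draw of 1000 multivariate Gaussian rows
x <- mvrnorm(n = 1000, mu = mu, Sigma = Sigma)

# By hand: chol() returns the upper-triangular R with t(R) %*% R == Sigma,
# so z %*% R has covariance Sigma when z has iid standard-normal entries.
R <- chol(Sigma)
z <- matrix(rnorm(1000 * 2), ncol = 2)
y <- sweep(z %*% R, 2, mu, "+")
```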