Hi all,

I have built a Bayesian network from discrete data using the bnlearn package.

When I try to run the cpquery function on this data it returns NaN for some cases.

Running cpquery in debug mode for such a case (n=10^5, method="lw") creates the following output:

    generated a grand total of 1e+05 samples.
    > event has a probability mass of 14982.37 out of NaN (p = NaN).
    [1] NaN

The cpquery command takes the following structure:

    cpquery(fullFitted, event = (C1_class == "Med"),
            evidence = list(GK_class = "ModHi",
                            GTh_class = "Lo",
                            GU_class = "Lo",
                            El_class = "Hi",
                            E50_class = "Med",
                            E150_class = "Med"),
            n = 10^5, method = "lw", debug = TRUE)

Similarly, when I try to run the predict method on the same data, it returns the following warning:

    Warning message:
    In map.prediction(node = node, fitted = object, data = data, n = extra.args$n, :
      dropping 38073 observations because generated samples are NAs.

Could you advise me why these queries are generating NaN values, and how they might be resolved?

The session info is as follows:

    sessionInfo()
    R version 3.4.1 (2017-06-30)
    Platform: x86_64-w64-mingw32/x64 (64-bit)
    Running under: Windows >= 8 x64 (build 9200)

    Matrix products: default

    locale:
    [1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252  LC_MONETARY=English_Australia.1252
    [4] LC_NUMERIC=C                       LC_TIME=English_Australia.1252

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
    [1] bnlearn_4.2

    loaded via a namespace (and not attached):
    [1] compiler_3.4.1 tools_3.4.1

Many thanks in advance,

Ross

Dear Ross,

This usually happens because you have parameters with a value of NaN in your network: the data you estimate the network from are sparse and you are using maximum likelihood estimates, so conditional distributions for parent configurations that never occur in the data cannot be estimated.

You should either 1) use simpler networks for which you can estimate all conditional distributions from the data or 2) use posterior estimates for the parameters.

Cheers,
    Marco

On 13 July 2017 at 06:29, Ross Chapman <rosspjchapman at gmail.com> wrote:
> Could you advise me why these queries are generating NaN values, and how
> they might be resolved.
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Marco Scutari, Ph.D.
Lecturer in Statistics, Department of Statistics
University of Oxford, United Kingdom
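
For readers following along, here is a minimal sketch of the second suggestion in R. It is not taken from the thread: the toy data, the variable names A, B and C, and the choice of iss = 1 are all assumptions made for illustration. It fits the same network twice with bn.fit(), once with maximum likelihood and once with method = "bayes", checks each fitted node for NaN entries in its conditional probability table, and then runs a likelihood-weighting query on the Bayesian fit.

```r
library(bnlearn)

set.seed(42)

# Hypothetical toy data: the parent configuration A = "hi", B = "hi"
# never occurs, so P(C | A = "hi", B = "hi") has no observations.
n <- 200
A <- factor(sample(c("lo", "hi"), n, replace = TRUE), levels = c("lo", "hi"))
B <- factor(ifelse(A == "hi", "lo", sample(c("lo", "hi"), n, replace = TRUE)),
            levels = c("lo", "hi"))
C <- factor(sample(c("lo", "med", "hi"), n, replace = TRUE),
            levels = c("lo", "med", "hi"))
toy <- data.frame(A = A, B = B, C = C)

# Hypothetical structure: C depends on both A and B.
dag <- model2network("[A][B|A][C|A:B]")

# Helper (illustrative): which nodes have NaN entries in their CPT?
nan.nodes <- function(fitted) {
  sapply(names(fitted), function(node) any(is.nan(fitted[[node]]$prob)))
}

# Maximum likelihood: unobserved parent configurations give NaN.
fit.mle <- bn.fit(dag, toy, method = "mle")
has.nan.mle <- nan.nodes(fit.mle)
print(has.nan.mle)

# Posterior estimates: the imaginary sample size (iss) acts as a prior,
# so every conditional distribution is defined. iss = 1 is an assumption.
fit.bayes <- bn.fit(dag, toy, method = "bayes", iss = 1)
has.nan.bayes <- nan.nodes(fit.bayes)
print(has.nan.bayes)

# The query that conditions on the unobserved configuration.
p <- cpquery(fit.bayes, event = (C == "med"),
             evidence = list(A = "hi", B = "hi"),
             method = "lw", n = 10^4)
print(p)
```

With posterior estimates, an unobserved parent configuration falls back to the prior (uniform in this sketch), so the answer to such a query reflects the prior rather than the data. A larger iss pulls all estimates further towards uniform.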

Dear Marco,

Thanks for your helpful comments. Using the posterior estimates seems to have fixed the problem.

Ross

From: Marco Scutari [mailto:marco.scutari at gmail.com]
Sent: Thursday, 13 July 2017 7:35 PM
To: Ross Chapman <rosspjchapman at gmail.com>
Cc: r-help <r-help at r-project.org>
Subject: Re: [R] bnlearn and cpquery

> You should either 1) use simpler networks for which you can estimate all
> conditional distributions from the data or 2) use posterior estimates for
> the parameters.