I have the joint distribution of three discrete random variables z1, z2 and z3 which is captured by "z" and "prob" as described below. For example, the probability for z1=0.46667, z2=-1 and z3=-1 is 2.752e-13. Also, the probability adds up to 1.> head(z) z1 z2 z3[1,] -0.46667 -1.0000 -1.0000 [2,] -0.33333 -0.9333 -0.9333 [3,] -0.20000 -0.8667 -0.8667 [4,] -0.06667 -0.8000 -0.8000 [5,] 0.06667 -0.7333 -0.7333 [6,] 0.20000 -0.6667 -0.6667> prob[1:5][1] 2.752e-13 3.210e-12 1.348e-11 2.656e-11 2.656e-11> sum(prob)[1] 1 I want to put the distribution into a joint probability table. I use the following code. But the probability no longer adds up to 1.> z1 <- sort(unique(z[,1])); z2 <- sort(unique(z[,2])); z3 <- sort(unique(z[,3]))> P <- array(0, dim=c(length(z1),length(z2),length(z3)), dimnames=list(A=z1, H=z2, M=z3))> > for (i in 1:(dim(z)[1])){+ ind <- z[i,]+ P[dimnames(P)$A==ind[1], dimnames(P)$H==ind[2], dimnames(P)$M==ind[3]] <- prob[i] + }> sum(P)[1] 37.5The problem is when we look z1 as below, there are lot of repeated values.> unique(z[,1]) [1] -0.46667 -0.33333 -0.20000 -0.06667 0.06667 0.20000 0.33333 0.46667 -0.53333 -0.40000[11] -0.26667 -0.13333 0.00000 0.13333 0.26667 0.40000 0.53333 -0.60000 -0.06667 0.06667 [21] 0.60000 -0.66667 -0.13333 0.13333 0.66667 -0.73333 -0.60000 -0.20000 0.20000 0.60000 [31] 0.73333 -0.80000 -0.53333 -0.26667 0.26667 0.53333 0.80000 -0.86667 -0.46667 -0.33333 [41] 0.33333 0.46667 0.86667 -0.93333 -0.40000 0.40000 0.93333 -1.00000 -0.46667 -0.33333 [51] 0.33333 0.46667 1.00000 -0.53333 -0.26667 0.26667 0.53333 -0.20000 0.20000 -0.66667 [61] -0.13333 0.13333 0.66667 -0.73333 -0.06667 0.06667 0.73333 -0.20000 0.20000 -0.13333 [71] 0.13333 -0.06667 0.06667 -0.06667 0.06667 Is there a way to fix this? Any idea and suggestions? Thanks very much!! Hanna [[alternative HTML version deleted]]
A mess, as you persist in using html. -- Bert Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) On Fri, Jul 28, 2017 at 8:17 AM, li li <hannah.hlx at gmail.com> wrote:> I have the joint distribution of three discrete random variables z1, z2 and > z3 which is captured by "z" > and "prob" as described below. > > For example, the probability for z1=0.46667, z2=-1 and z3=-1 is 2.752e-13. > Also, the probability adds up to 1. > >> head(z) z1 z2 z3 > [1,] -0.46667 -1.0000 -1.0000 > [2,] -0.33333 -0.9333 -0.9333 > [3,] -0.20000 -0.8667 -0.8667 > [4,] -0.06667 -0.8000 -0.8000 > [5,] 0.06667 -0.7333 -0.7333 > [6,] 0.20000 -0.6667 -0.6667> prob[1:5][1] 2.752e-13 3.210e-12 > 1.348e-11 2.656e-11 2.656e-11> sum(prob)[1] 1 > > > I want to put the distribution into a joint probability table. I use the > following code. But the probability no longer adds up to 1. > >> z1 <- sort(unique(z[,1])); z2 <- sort(unique(z[,2])); z3 <- sort(unique(z[,3]))> P <- array(0, dim=c(length(z1),length(z2),length(z3)), dimnames=list(A=z1, H=z2, M=z3))> > for (i in 1:(dim(z)[1])){+ ind <- z[i,]+ P[dimnames(P)$A==ind[1], dimnames(P)$H==ind[2], dimnames(P)$M==ind[3]] <- prob[i] + }> sum(P)[1] 37.5 > > > The problem is when we look z1 as below, there are lot of repeated values. > >> unique(z[,1]) [1] -0.46667 -0.33333 -0.20000 -0.06667 0.06667 0.20000 0.33333 0.46667 -0.53333 -0.40000 > [11] -0.26667 -0.13333 0.00000 0.13333 0.26667 0.40000 0.53333 > -0.60000 -0.06667 0.06667 > [21] 0.60000 -0.66667 -0.13333 0.13333 0.66667 -0.73333 -0.60000 > -0.20000 0.20000 0.60000 > [31] 0.73333 -0.80000 -0.53333 -0.26667 0.26667 0.53333 0.80000 > -0.86667 -0.46667 -0.33333 > [41] 0.33333 0.46667 0.86667 -0.93333 -0.40000 0.40000 0.93333 > -1.00000 -0.46667 -0.33333 > [51] 0.33333 0.46667 1.00000 -0.53333 -0.26667 0.26667 0.53333 > -0.20000 0.20000 -0.66667 > [61] -0.13333 0.13333 0.66667 -0.73333 -0.06667 0.06667 0.73333 > -0.20000 0.20000 -0.13333 > [71] 0.13333 -0.06667 0.06667 -0.06667 0.06667 > > > > Is there a way to fix this? Any idea and suggestions? Thanks very much!! > > Hanna > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.
Most likely, previous computations have ended up giving slightly different values of say 0.13333. A pragmatic way out is to round to, say, 5 digits before applying unique. In this particular case, it seems like all numbers are multiples of 1/30, so another idea could be to multiply by 30, round, and divide by 30. -pd> On 28 Jul 2017, at 17:17 , li li <hannah.hlx at gmail.com> wrote: > > I have the joint distribution of three discrete random variables z1, z2 and > z3 which is captured by "z" > and "prob" as described below. > > For example, the probability for z1=0.46667, z2=-1 and z3=-1 is 2.752e-13. > Also, the probability adds up to 1. > >> head(z) z1 z2 z3 > [1,] -0.46667 -1.0000 -1.0000 > [2,] -0.33333 -0.9333 -0.9333 > [3,] -0.20000 -0.8667 -0.8667 > [4,] -0.06667 -0.8000 -0.8000 > [5,] 0.06667 -0.7333 -0.7333 > [6,] 0.20000 -0.6667 -0.6667> prob[1:5][1] 2.752e-13 3.210e-12 > 1.348e-11 2.656e-11 2.656e-11> sum(prob)[1] 1 > > > I want to put the distribution into a joint probability table. I use the > following code. But the probability no longer adds up to 1. > >> z1 <- sort(unique(z[,1])); z2 <- sort(unique(z[,2])); z3 <- sort(unique(z[,3]))> P <- array(0, dim=c(length(z1),length(z2),length(z3)), dimnames=list(A=z1, H=z2, M=z3))> > for (i in 1:(dim(z)[1])){+ ind <- z[i,]+ P[dimnames(P)$A==ind[1], dimnames(P)$H==ind[2], dimnames(P)$M==ind[3]] <- prob[i] + }> sum(P)[1] 37.5 > > > The problem is when we look z1 as below, there are lot of repeated values. > >> unique(z[,1]) [1] -0.46667 -0.33333 -0.20000 -0.06667 0.06667 0.20000 0.33333 0.46667 -0.53333 -0.40000 > [11] -0.26667 -0.13333 0.00000 0.13333 0.26667 0.40000 0.53333 > -0.60000 -0.06667 0.06667 > [21] 0.60000 -0.66667 -0.13333 0.13333 0.66667 -0.73333 -0.60000 > -0.20000 0.20000 0.60000 > [31] 0.73333 -0.80000 -0.53333 -0.26667 0.26667 0.53333 0.80000 > -0.86667 -0.46667 -0.33333 > [41] 0.33333 0.46667 0.86667 -0.93333 -0.40000 0.40000 0.93333 > -1.00000 -0.46667 -0.33333 > [51] 0.33333 0.46667 1.00000 -0.53333 -0.26667 0.26667 0.53333 > -0.20000 0.20000 -0.66667 > [61] -0.13333 0.13333 0.66667 -0.73333 -0.06667 0.06667 0.73333 > -0.20000 0.20000 -0.13333 > [71] 0.13333 -0.06667 0.06667 -0.06667 0.06667 > > > > Is there a way to fix this? Any idea and suggestions? Thanks very much!! > > Hanna > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
Thank you Peter. Rounding fixed the problem. 2017-07-28 12:15 GMT-04:00 peter dalgaard <pdalgd at gmail.com>:> Most likely, previous computations have ended up giving slightly different > values of say 0.13333. A pragmatic way out is to round to, say, 5 digits > before applying unique. In this particular case, it seems like all numbers > are multiples of 1/30, so another idea could be to multiply by 30, round, > and divide by 30. > > -pd > > > On 28 Jul 2017, at 17:17 , li li <hannah.hlx at gmail.com> wrote: > > > > I have the joint distribution of three discrete random variables z1, z2 > and > > z3 which is captured by "z" > > and "prob" as described below. > > > > For example, the probability for z1=0.46667, z2=-1 and z3=-1 is > 2.752e-13. > > Also, the probability adds up to 1. > > > >> head(z) z1 z2 z3 > > [1,] -0.46667 -1.0000 -1.0000 > > [2,] -0.33333 -0.9333 -0.9333 > > [3,] -0.20000 -0.8667 -0.8667 > > [4,] -0.06667 -0.8000 -0.8000 > > [5,] 0.06667 -0.7333 -0.7333 > > [6,] 0.20000 -0.6667 -0.6667> prob[1:5][1] 2.752e-13 3.210e-12 > > 1.348e-11 2.656e-11 2.656e-11> sum(prob)[1] 1 > > > > > > I want to put the distribution into a joint probability table. I use the > > following code. But the probability no longer adds up to 1. > > > >> z1 <- sort(unique(z[,1])); z2 <- sort(unique(z[,2])); z3 <- > sort(unique(z[,3]))> P <- array(0, dim=c(length(z1),length(z2),length(z3)), > dimnames=list(A=z1, H=z2, M=z3))> > for (i in 1:(dim(z)[1])){+ ind <- > z[i,]+ P[dimnames(P)$A==ind[1], dimnames(P)$H==ind[2], > dimnames(P)$M==ind[3]] <- prob[i] + }> sum(P)[1] 37.5 > > > > > > The problem is when we look z1 as below, there are lot of repeated > values. > > > >> unique(z[,1]) [1] -0.46667 -0.33333 -0.20000 -0.06667 0.06667 > 0.20000 0.33333 0.46667 -0.53333 -0.40000 > > [11] -0.26667 -0.13333 0.00000 0.13333 0.26667 0.40000 0.53333 > > -0.60000 -0.06667 0.06667 > > [21] 0.60000 -0.66667 -0.13333 0.13333 0.66667 -0.73333 -0.60000 > > -0.20000 0.20000 0.60000 > > [31] 0.73333 -0.80000 -0.53333 -0.26667 0.26667 0.53333 0.80000 > > -0.86667 -0.46667 -0.33333 > > [41] 0.33333 0.46667 0.86667 -0.93333 -0.40000 0.40000 0.93333 > > -1.00000 -0.46667 -0.33333 > > [51] 0.33333 0.46667 1.00000 -0.53333 -0.26667 0.26667 0.53333 > > -0.20000 0.20000 -0.66667 > > [61] -0.13333 0.13333 0.66667 -0.73333 -0.06667 0.06667 0.73333 > > -0.20000 0.20000 -0.13333 > > [71] 0.13333 -0.06667 0.06667 -0.06667 0.06667 > > > > > > > > Is there a way to fix this? Any idea and suggestions? Thanks very much!! > > > > Hanna > > > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/ > posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > -- > Peter Dalgaard, Professor, > Center for Statistics, Copenhagen Business School > Solbjerg Plads 3, 2000 Frederiksberg, Denmark > Phone: (+45)38153501 > Office: A 4.23 > Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com > > > > > > > > > >[[alternative HTML version deleted]]