I have a data frame called e, dim is 27,3, the first 5 lines look like this:

     V1   V2   V3    V4
1  1673 0.36 0.08 Smith
2   167 0.36 0.08 Allen
3    99 0.37 0.06 Allen
4   116 0.38 0.07 Allen
5    95 0.41 0.08 Allen

I am trying to calculate the proportion/percentage of V1 which would have values > 0.42 if V2 was the mean of a normal distribution with V1 people and a standard deviation of V3. The loop works, but only for 4 iterations, and then stops. I can't understand why. The code and the output are below.

output <- rep(NA, 27)
for (i in 1:length(e))
{
  x <- rnorm(n=e[i,1], mean=e[i,2], sd=e[i,3])
  n <- e[i,1]
  v <- x>0.42
  q <- (sum(v)/n)*100
  output[i] <- q
}

> output
 [1] 22.23551 27.54491 25.25253 19.82759       NA       NA       NA       NA       NA
[10]       NA       NA       NA       NA       NA       NA       NA       NA       NA
[19]       NA       NA       NA       NA       NA       NA       NA       NA       NA
There's a seeming inconsistency in this question: you provide an example of a data frame with 4 columns but say it is 27x3. Still, I think your problem comes from a misunderstanding of what length(e) calculates. For a data frame it returns the number of columns. Hence if you have a 27x4 data frame (which you appear to), the loop runs only four times and fills only the first four elements of output. You'd probably rather use NROW(e).

As an aside, for these sorts of loops seq_along() is usually a very good choice, but it doesn't work here because seq_along(e) also counts columns; seq_len(NROW(e)) is the row-wise equivalent.

On another note, why don't you just do the calculation analytically and save yourself some trouble?

# Something like
with(e, pnorm(0.42, V2, V3, lower.tail = FALSE) * 100)

Michael

On Sat, Oct 22, 2011 at 7:33 PM, Philip Robinson <philip.c.robinson at gmail.com> wrote:
> I have a data frame called e, dim is 27,3, the first 5 lines look like this:
> [...]
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
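The analytic route uses the upper tail of the normal CDF, which in R is pnorm() with lower.tail = FALSE. Below is a minimal sketch of that calculation. It assumes the data frame consists of only the five rows shown in the question (the full 27-row data is not available), and the name pct_above is made up for illustration.

```r
# Rebuild the five rows shown in the question; the remaining 22 rows are unknown.
e <- data.frame(
  V1 = c(1673, 167, 99, 116, 95),
  V2 = c(0.36, 0.36, 0.37, 0.38, 0.41),
  V3 = c(0.08, 0.08, 0.06, 0.07, 0.08),
  V4 = c("Smith", "Allen", "Allen", "Allen", "Allen")
)

# Expected percentage of each group above 0.42 under Normal(mean = V2, sd = V3).
# pnorm() is vectorised, so no loop is needed. V1 (the group size) does not
# enter the expected percentage at all.
pct_above <- with(e, 100 * pnorm(0.42, mean = V2, sd = V3, lower.tail = FALSE))
round(pct_above, 2)
```

Unlike the simulated percentages, these values do not change from run to run, and the simulated ones should scatter around them, more tightly for the larger groups.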
On 11-10-22 7:33 PM, Philip Robinson wrote:
> I have a data frame called e, dim is 27,3, the first 5 lines look like this:
>
>      V1   V2   V3    V4
> 1  1673 0.36 0.08 Smith
> 2   167 0.36 0.08 Allen
> 3    99 0.37 0.06 Allen
> 4   116 0.38 0.07 Allen
> 5    95 0.41 0.08 Allen

That doesn't look like 3 columns, it looks like 4.

> output <- rep(NA, 27)
> for (i in 1:length(e))

The length of a data frame is the number of columns. Use nrow(e) for the number of rows.

Duncan Murdoch
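To make the difference between length() and nrow() concrete, here is a small sketch with the loop bound corrected. It assumes only the five rows shown in the question, so output has five elements rather than 27. The seed is arbitrary and is there only so that reruns give the same numbers.

```r
# Five rows from the question; the real data frame has 27 rows.
e <- data.frame(
  V1 = c(1673, 167, 99, 116, 95),
  V2 = c(0.36, 0.36, 0.37, 0.38, 0.41),
  V3 = c(0.08, 0.08, 0.06, 0.07, 0.08),
  V4 = c("Smith", "Allen", "Allen", "Allen", "Allen")
)

length(e)  # number of columns, which is why the original loop stopped early
nrow(e)    # number of rows, which is what the loop should iterate over

set.seed(1)  # arbitrary seed, for reproducibility only
output <- rep(NA_real_, nrow(e))
for (i in seq_len(nrow(e))) {
  x <- rnorm(n = e[i, 1], mean = e[i, 2], sd = e[i, 3])
  output[i] <- 100 * mean(x > 0.42)  # same as (sum(x > 0.42) / n) * 100
}
output
```

seq_len(nrow(e)) is used instead of 1:nrow(e) because it yields an empty sequence for a zero-row data frame, whereas 1:0 would iterate over 1 and 0.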
Hi:

Here are a couple of ways, using the data snippet you provided as the input data frame e. Start by defining the function, which outputs a percentage:

    f <- function(n, mean, sd) {
        s <- rnorm(n, mean = mean, sd = sd)
        round(100 * sum(s > 0.42)/length(s), 4)
    }

(1) Use the plyr package and its mdply() function. Note that the columns of epars have the same names as the arguments of f.

    library('plyr')
    # Assumed construction (not shown in the original post): the first three
    # columns of e, renamed to match the arguments of f.
    epars <- setNames(e[, 1:3], c("n", "mean", "sd"))
    mdply(epars, f)

         n mean   sd      V1
    1 1673 0.36 0.08 24.0287
    2  167 0.36 0.08 23.9521
    3   99 0.37 0.06 18.1818
    4  116 0.38 0.07 22.4138
    5   95 0.41 0.08 40.0000

(2) Split the columns of e into vectors and use mapply():

    ssize <- e[, 1]
    mns <- e[, 2]
    sds <- e[, 3]
    mapply(f, ssize, mns, sds)

    [1] 21.3987 19.1617 22.2222 32.7586 45.2632

mapply() just returns the percentages. The percentages differ between the two calls because each call draws new random samples.

HTH,
Dennis
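Since the simulated percentages change on every call, it can help to fix the random seed when the numbers need to be reproduced. A minimal sketch follows, again assuming only the five rows shown in the question. The seed value 42 and the names run1 and run2 are arbitrary choices for illustration.

```r
# Five rows from the question; the real data frame has 27 rows.
e <- data.frame(
  V1 = c(1673, 167, 99, 116, 95),
  V2 = c(0.36, 0.36, 0.37, 0.38, 0.41),
  V3 = c(0.08, 0.08, 0.06, 0.07, 0.08)
)

# Simulated percentage of n draws from Normal(mean, sd) that exceed 0.42.
f <- function(n, mean, sd) {
  s <- rnorm(n, mean = mean, sd = sd)
  round(100 * sum(s > 0.42) / length(s), 4)
}

# Resetting the seed before each call makes the two runs draw the same samples.
set.seed(42)
run1 <- mapply(f, e$V1, e$V2, e$V3)
set.seed(42)
run2 <- mapply(f, e$V1, e$V2, e$V3)

identical(run1, run2)
```

Without the set.seed() calls, run1 and run2 would generally differ, as the two outputs above do.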