Rui Barradas
2021-Jan-22 23:07 UTC
[R] How to generate SE for the proportion value using a randomization process in R?
Hello,
Something like this, using base package boot?
library(boot)
bootprop <- function(data, index){
d <- data[index, ]
sum(d[["BothTimes"]], na.rm = TRUE)/sum(d[["Time1"]],
na.rm = TRUE)
}
R <- 1e3
set.seed(2020)
b <- boot(daT, bootprop, R)
b
b$t0 # original
sd(b$t) # bootstrapped estimate of the SE of the sample prop.
hist(b$t, freq = FALSE)
Hope this helps,
Rui Barradas
?s 21:57 de 22/01/21, Marna Wagley escreveu:> Hi All,
> I was trying to estimate standard error (SE) for the proportion value using
> some kind of randomization process (bootstrapping or jackknifing) in R, but
> I could not figure it out.
>
> Is there any way to generate SE for the proportion?
>
> The example of the data and the code I am using is attached for your
> reference. I would like to generate the value of proportion with a SE using
> a 1000 times randomization.
>
> dat<-structure(list(Sample = structure(c(1L, 12L, 13L, 14L, 15L, 16L,
> 17L, 18L, 19L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L), .Label =
c("id1",
> "id10", "id11", "id12", "id13",
"id14", "id15", "id16", "id17",
> "id18", "id19", "Id2", "id3",
"id4", "id5", "id6", "id7",
"id8",
> "id9"), class = "factor"), Time1 = c(0L, 1L, 1L, 1L,
0L, 0L,
> 1L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 1L, 0L, 1L, 0L), Time2 = c(1L,
> 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L,
> 1L, 1L)), .Names = c("Sample", "Time1",
"Time2"), class = "data.frame",
> row.names = c(NA,
> -19L))
> daT<-data.frame(dat %>%
> mutate(Time1.but.not.in.Time2 = case_when(
> Time1 %in% "1" & Time2 %in% "0" ~
"1"),
> Time2.but.not.in.Time1 = case_when(
> Time1 %in% "0" & Time2 %in% "1" ~
"1"),
> BothTimes = case_when(
> Time1 %in% "1" & Time2 %in% "1" ~
"1")))
> daT
> summary(daT)
>
> cols.num <-
c("Time1.but.not.in.Time2","Time2.but.not.in.Time1",
> "BothTimes")
> daT[cols.num] <- sapply(daT[cols.num],as.numeric)
> summary(daT)
> ProportionValue<-sum(daT$BothTimes, na.rm=T)/sum(daT$Time1, na.rm=T)
> ProportionValue
> standard error??
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
Marna Wagley
2021-Jan-23 07:47 UTC
[R] How to generate SE for the proportion value using a randomization process in R?
Dear Rui,
I was wondering whether we have to square root of SD to find SE, right?
bootprop <- function(data, index){
d <- data[index, ]
sum(d[["BothTimes"]], na.rm = TRUE)/sum(d[["Time1"]],
na.rm = TRUE)
}
R <- 1e3
set.seed(2020)
b <- boot(daT, bootprop, R)
b
b$t0 # original
sd(b$t) # bootstrapped estimate of the SE of the sample prop.
sd(b$t)/sqrt(1000)
pandit*(1-pandit)
hist(b$t, freq = FALSE)
On Fri, Jan 22, 2021 at 3:07 PM Rui Barradas <ruipbarradas at sapo.pt>
wrote:
> Hello,
>
> Something like this, using base package boot?
>
>
> library(boot)
>
> bootprop <- function(data, index){
> d <- data[index, ]
> sum(d[["BothTimes"]], na.rm =
TRUE)/sum(d[["Time1"]], na.rm = TRUE)
> }
>
> R <- 1e3
> set.seed(2020)
> b <- boot(daT, bootprop, R)
> b
> b$t0 # original
> sd(b$t) # bootstrapped estimate of the SE of the sample prop.
> hist(b$t, freq = FALSE)
>
>
> Hope this helps,
>
> Rui Barradas
>
> ?s 21:57 de 22/01/21, Marna Wagley escreveu:
> > Hi All,
> > I was trying to estimate standard error (SE) for the proportion value
> using
> > some kind of randomization process (bootstrapping or jackknifing) in
R,
> but
> > I could not figure it out.
> >
> > Is there any way to generate SE for the proportion?
> >
> > The example of the data and the code I am using is attached for your
> > reference. I would like to generate the value of proportion with a SE
> using
> > a 1000 times randomization.
> >
> > dat<-structure(list(Sample = structure(c(1L, 12L, 13L, 14L, 15L,
16L,
> > 17L, 18L, 19L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L), .Label >
c("id1",
> > "id10", "id11", "id12",
"id13", "id14", "id15", "id16",
"id17",
> > "id18", "id19", "Id2", "id3",
"id4", "id5", "id6", "id7",
"id8",
> > "id9"), class = "factor"), Time1 = c(0L, 1L, 1L,
1L, 0L, 0L,
> > 1L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 1L, 0L, 1L, 0L), Time2 = c(1L,
> > 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L,
> > 1L, 1L)), .Names = c("Sample", "Time1",
"Time2"), class = "data.frame",
> > row.names = c(NA,
> > -19L))
> > daT<-data.frame(dat %>%
> > mutate(Time1.but.not.in.Time2 = case_when(
> > Time1 %in% "1" & Time2 %in% "0"
~ "1"),
> > Time2.but.not.in.Time1 = case_when(
> > Time1 %in% "0" & Time2 %in% "1"
~ "1"),
> > BothTimes = case_when(
> > Time1 %in% "1" & Time2 %in% "1"
~ "1")))
> > daT
> > summary(daT)
> >
> > cols.num <-
c("Time1.but.not.in.Time2","Time2.but.not.in.Time1",
> > "BothTimes")
> > daT[cols.num] <- sapply(daT[cols.num],as.numeric)
> > summary(daT)
> > ProportionValue<-sum(daT$BothTimes, na.rm=T)/sum(daT$Time1,
na.rm=T)
> > ProportionValue
> > standard error??
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
[[alternative HTML version deleted]]