[markdown format]

I'm glad to introduce the new package aVirtualTwins. This package is an adaptation of the VirtualTwins method of subgroup identification from [Foster, J. C., Taylor, J. M.G. and Ruberg, S. J. (2011)](http://onlinelibrary.wiley.com/doi/10.1002/sim.4322/abstract).

### Explanation

Virtual Twins was created to find subgroups of patients with an enhanced treatment effect in a randomized clinical trial, if any exist. In theory, the method can be used for binary and continuous outcomes. This package only deals with a binary outcome in a two-arm clinical trial. Virtual Twins can of course also be applied to A/B testing.

Virtual Twins is based on random forests and regression/classification trees.

### Quick preview

Here's an example of aVirtualTwins use for subgroup discovery with a well-known dataset (_sepsis_):

_Sepsis_ contains simulated data on 470 subjects with a binary outcome `survival`, which stores the survival status of each patient after 28 days of treatment: 1 for subjects who died after 28 days and 0 otherwise. There are 11 covariates, listed below, all of which are numerical variables.
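To make the idea under Explanation more concrete before the package example, here is a minimal sketch of the "twin" step on simulated data. It is not the package's implementation: a logistic regression with a treatment interaction is swapped in for the random forest, the tree step is left out, and every name in it (`dat`, `fit`, `p1`, `p0`, `difft`) is an illustrative assumption.

```r
# Sketch of the "virtual twin" step on simulated data (not the sepsis data).
# Assumption: glm() stands in for the random forest used by aVirtualTwins.
set.seed(42)
n   <- 400
trt <- rbinom(n, 1, 0.5)   # two-arm randomisation
x   <- rnorm(n)            # a single numeric covariate

# Simulated truth: treatment raises the event probability only when x > 0
p   <- plogis(-0.5 + 1.5 * trt * (x > 0))
y   <- rbinom(n, 1, p)
dat <- data.frame(y = y, trt = trt, x = x)

# Outcome model with a treatment-by-covariate interaction
fit <- glm(y ~ trt * x, family = binomial, data = dat)

# Predict every subject's outcome probability under both arms (the "twins")
p1 <- predict(fit, newdata = transform(dat, trt = 1), type = "response")
p0 <- predict(fit, newdata = transform(dat, trt = 0), type = "response")

# Estimated individual treatment effect: difference between the twin predictions
difft <- p1 - p0

# Average estimated effect on each side of x = 0
tapply(difft, dat$x > 0, mean)
```

In the method itself, a tree is then fitted on these estimated differences to turn them into readable subgroup rules. The sketch stops at the differences to stay within base R.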
```r
library(aVirtualTwins)

# Load data
data(sepsis)

# Format data
vt.obj <- vt.data(dataset         = sepsis,
                  outcome.field   = "survival",
                  treatment.field = "THERAPY",
                  interactions    = TRUE)
## "1" will be the favorable outcome

# View of data
head(sepsis)
##   survival THERAPY PRAPACHE    AGE BLGCS ORGANNUM   BLIL6  BLLPLAT
## 1        0       1       19 42.921    15        1  301.80 191.0000
## 2        1       1       48 68.818    11        2  118.90 264.1565
## 3        0       1       20 68.818    15        2   92.80 123.0000
## 4        0       1       19 33.174    14        2 1232.00 244.0000
## 5        0       1       48 46.532     3        4 2568.00  45.0000
## 6        0       0       21 56.098    14        1  162.65 137.0000
##    BLLBILI BLLCREAT TIMFIRST BLADL blSOFA
## 1 2.913416 1.000000    17.17     0   5.00
## 2 0.400000 1.100000    17.17     5  10.00
## 3 5.116471 1.000000    10.00     1   7.50
## 4 3.142092 1.200000    17.17     0   6.25
## 5 4.052668 3.000000    10.00     0  12.00
## 6 0.500000 4.662556    10.00     0   8.75

# Print incidences of sepsis data
vt.obj$getIncidences()
## $table
##            trt
## resp        0     1     sum
##   0         101   188   289
##   1         52    129   181
##   sum       153   317   470
##   Incidence 0.34  0.407 0.385
##
## $rr
## [1] 1.197059

# First step: create random forest model
vt.for <- vt.forest(forest.type  = "one",
                    vt.data      = vt.obj,
                    interactions = TRUE,
                    ntree        = 500)

# Second step: find rules in data
vt.trees <- vt.tree(tree.type = "class",
                    vt.difft  = vt.for,
                    threshold = quantile(vt.for$difft, seq(.5, .8, .1)),
                    maxdepth  = 2)

# Print results
vt.sbgrps <- vt.subgroups(vt.trees)
knitr::kable(vt.sbgrps)
```

|       | Subgroup                    | Subgroup size | Treatment event rate | Control event rate | Treatment sample size | Control sample size | RR (resub) | RR (snd) |
|-------|-----------------------------|---------------|----------------------|--------------------|-----------------------|---------------------|------------|----------|
| tree1 | PRAPACHE>=26.5              | 157           | 0.752                | 0.327              | 105                   | 52                  | 2.300      | 1.774    |
| tree3 | PRAPACHE>=26.5 & AGE>=51.74 | 120           | 0.897                | 0.31               | 78                    | 42                  | 2.894      | 1.924    |

aVirtualTwins can be found on
[CRAN](https://cran.r-project.org/package=aVirtualTwins) and [GitHub](https://github.com/prise6/aVirtualTwins).

Feel free to contribute.

Francois.

_______________________________________________
R-packages mailing list
R-packages at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages