adiamond@fas.harvard.edu
2005-Apr-05 04:51 UTC
[R] exclusion rules for propensity score matching (pattern rec)
Dear R-list,

I have 6 different sets of samples. Each sample has about 5000 observations, with each observation comprising 150 baseline covariates (X), 125 of which are dichotomous. Roughly 20% of the observations in each sample are "treatment" units and the rest are "control" units.

I am doing propensity score matching. I have already estimated propensity scores (predicted probabilities) using logistic regression, and in each sample I am going to have to exclude approximately 100 treated observations for which I cannot find matching control observations (because the scores for these treated units are outside the support of the scores for control units).

In each sample, I must identify an exclusion rule that is interpretable on the scale of the X's, that excludes these unmatchable treated observations, and that excludes as FEW of the remaining treated observations as possible. (The reason is that I want to be able to explain, in terms of the X's, who the individuals are that I am making causal inferences about.)

I've tried some simple stuff over the past few days and nothing's worked. Is there an R package or algorithm, or even an estimation strategy, that anyone could recommend? (I am really hoping so!)

Thank you,

Alexis Diamond
Frank E Harrell Jr
2005-Apr-05 14:36 UTC
[R] exclusion rules for propensity score matching (pattern rec)
Exclusion can be based on the non-overlap regions of the propensity score; it should not be done in the individual covariate space. I tend to look at the 10th smallest and 10th largest values of the propensity score in each of the two treatment groups when making the decision.

You will need to exclude non-overlap regions whether you use matching or covariate adjustment for the propensity score, but covariate adjustment (using, e.g., regression splines in the logit of the propensity score) is often a better approach once you've been careful about non-overlap.
Frank Harrell

-- 
Frank E Harrell Jr
Professor and Chair
School of Medicine
Department of Biostatistics
Vanderbilt University
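
For readers who want to try the overlap check described in the reply, here is a minimal sketch. It is written in Python with only the standard library rather than R, and it rests on one reading of the advice that is an assumption, not something the thread spells out: take the 10th smallest and 10th largest propensity score within each group, then keep only units whose score lies between the larger of the two lower values and the smaller of the two upper values. The function names (`kth_extremes`, `overlap_region`) and the simulated Beta-distributed scores are hypothetical and purely illustrative; real scores would come from the fitted logistic regression.

```python
import random


def kth_extremes(scores, k=10):
    """Return the k-th smallest and k-th largest values in scores."""
    ordered = sorted(scores)
    if len(ordered) < k:
        raise ValueError("need at least k scores in each group")
    return ordered[k - 1], ordered[-k]


def overlap_region(treated, control, k=10):
    """Assumed rule: intersect the two groups' k-th-extreme ranges."""
    t_lo, t_hi = kth_extremes(treated, k)
    c_lo, c_hi = kth_extremes(control, k)
    return max(t_lo, c_lo), min(t_hi, c_hi)


# Simulated propensity scores (illustrative only). Group sizes follow the
# question: about 5000 observations per sample, roughly 20% treated.
random.seed(1)
treated = [random.betavariate(4, 3) for _ in range(1000)]
control = [random.betavariate(2, 5) for _ in range(4000)]

region = overlap_region(treated, control, k=10)
lo, hi = region

# Keep only units whose score falls inside the overlap region.
kept_treated = [p for p in treated if lo <= p <= hi]
kept_control = [p for p in control if lo <= p <= hi]
n_excluded_treated = len(treated) - len(kept_treated)

print(f"overlap region: [{lo:.3f}, {hi:.3f}]")
print(f"treated excluded: {n_excluded_treated} of {len(treated)}")
print(f"controls excluded: {len(control) - len(kept_control)} of {len(control)}")
```

The rule here is defined entirely on the propensity scale, in line with the reply's point that exclusion should not be done in the individual covariate space. The choice of `k=10` mirrors the reply; it is a judgment call rather than a fixed threshold.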