Hello all.

I am doing one part of an evaluation of a mandatory welfare-to-work programme in the UK. As with all evaluations, the problem is to determine what would have happened if the initiative had not taken place. In our case, we have a number of pilot areas and no possibility of random assignment. Therefore we have been given control areas.

My problem is to select for survey individuals in the control areas who match as closely as possible the randomly selected sample of action area participants. As I understand the methodology, the procedure is to run a logistic regression to determine the odds of a case being in the sample, across both action and control areas, and then choose for control sample the control area individual whose odds of being in the sample are closest to an actual sample member.

So far, I have been following the multinomial logistic regression example in Fox's Companion to Applied Regression.

Firstly, I would like to know if the predict() is producing odds ratios (or probabilities) for being in the sample, which is what I am aiming for.

Secondly, how do I get rownames (my unique identifier) into the output from predict()? My input may be faulty somehow and the wrong rownames being picked up. I need to export back to the database to sort and match in names, addresses and phone numbers for my selected samples.

My code is as follows:

    londonpsm <- sqlFetch(channel, "London_NW_london_pilots_elig", rownames=ORCID)
    attach(londonpsm)
    mod.multinom <- multinom(sample ~ AGE + DISABLED + GENDER + ETHCODE + NDYPTOT + NDLTUTOT + LOPTYPE)
    lonoutput <- predict(mod.multinom, sample, type='probs')
    london2 <- data.frame(lonoutput)

The logistic regression seems to work, although summary() says that it is not a matrix. The output looks like odds ratios, but I would like to know whether this is so.

Thank you

Paul Bivand
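The matching step described in the question (pick, for each action-area sample member, the control-area individual with the closest score) might be sketched roughly as below. This is only an illustration under stated assumptions: the data frames `action` and `control`, their `id` and `pscore` columns, and the simulated scores are all hypothetical, and greedy matching without replacement is just one of several possible matching rules, not necessarily the one the evaluation requires.

```r
# Hypothetical propensity scores; in practice these would come from the
# fitted logistic regression rather than from runif().
set.seed(42)
action  <- data.frame(id = paste0("A", 1:5),  pscore = runif(5),
                      stringsAsFactors = FALSE)
control <- data.frame(id = paste0("C", 1:20), pscore = runif(20),
                      stringsAsFactors = FALSE)

# Greedy nearest-neighbour matching without replacement:
# each action-area case takes the closest control case not yet used.
available <- rep(TRUE, nrow(control))
matched   <- character(nrow(action))
for (i in seq_len(nrow(action))) {
  d <- abs(control$pscore - action$pscore[i])  # distance on the score
  d[!available] <- Inf                         # exclude controls already taken
  j <- which.min(d)
  matched[i]   <- control$id[j]
  available[j] <- FALSE
}

# One row per action-area case with the identifier of its matched control
pairs <- data.frame(action_id = action$id, control_id = matched,
                    stringsAsFactors = FALSE)
print(pairs)
```

Because the identifiers are carried through in `pairs`, the result could be exported back to the database and joined to names and addresses there.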
Prof Brian Ripley
2003-Jun-04 06:45 UTC
[R] Logistic regression problem: propensity score matching
1) Why are you using multinom when this is not a multinomial logistic regression? You could just use a binomial glm.

2) The second argument to predict() is `newdata'. `sample' is an R function, so what did you mean to have there? I think the predictions should be a named vector if `sample' is a data frame.

3) There are many more examples of such things (and more explanation) in Venables & Ripley's MASS (the book).

On Wed, 4 Jun 2003, Paul Bivand wrote:

> I am doing one part of an evaluation of a mandatory welfare-to-work
> programme in the UK.
> As with all evaluations, the problem is to determine what would have
> happened if the initiative had not taken place.
> In our case, we have a number of pilot areas and no possibility of
> random assignment.
> Therefore we have been given control areas.
> My problem is to select for survey individuals in the control areas who
> match as closely as possible the randomly selected sample of action area
> participants.
> As I understand the methodology, the procedure is to run a logistic
> regression to determine the odds of a case being in the sample, across
> both action and control areas, and then choose for control sample the
> control area individual whose odds of being in the sample are closest to
> an actual sample member.
>
> So far, I have been following the multinomial logistic regression example in
> Fox's Companion to Applied Regression.
> Firstly, I would like to know if the predict() is producing odds ratios
> (or probabilities) for being in the sample, which is what I am aiming
> for.

You asked for `probs', so you got probabilities.

> Secondly, how do I get rownames (my unique identifier) into the
> output from predict() - my input may be faulty somehow and the wrong
> rownames being picked up - as I need to export back to database to sort
> and match in names, addresses and phone numbers for my selected samples.
>
> My code is as follows:
> londonpsm <- sqlFetch(channel, "London_NW_london_pilots_elig",
> rownames=ORCID)
> attach(londonpsm)
> mod.multinom <- multinom(sample ~ AGE + DISABLED + GENDER + ETHCODE +
> NDYPTOT + NDLTUTOT + LOPTYPE)
> lonoutput <- predict(mod.multinom, sample, type='probs')
> london2 <- data.frame(lonoutput)
>
> The Logistic regression seems to work, although summary() says that it is
> not a matrix.

What is `it'?

> The output looks like odds ratios, but I would like to know whether this
> is so.

No.

--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
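For readers following along, the suggestion of a binomial glm could look something like the sketch below. It is a minimal illustration, not the poster's actual analysis: the data frame `dat` is simulated, the response is renamed to a hypothetical `in_sample` so that it does not share a name with the R function `sample`, and only two of the covariates from the question are imitated. It shows that `predict(..., type = "response")` returns probabilities as a vector named by the row names of the data, which is one way to keep the unique identifier attached to each score.

```r
# Simulated stand-in for the londonpsm table; all values are made up.
set.seed(1)
n <- 200
dat <- data.frame(
  in_sample = rbinom(n, 1, 0.3),                 # 1 = action-area sample member
  AGE       = sample(18:60, n, replace = TRUE),
  GENDER    = factor(sample(c("F", "M"), n, replace = TRUE)),
  row.names = paste0("ID", seq_len(n))           # unique identifier as row names
)

# Binomial logistic regression; passing data= avoids attach() and name clashes.
fit <- glm(in_sample ~ AGE + GENDER, family = binomial, data = dat)

# Fitted probabilities of being in the sample (not odds, not odds ratios).
p <- predict(fit, type = "response")

# Odds and log-odds can be derived from the probabilities if they are wanted.
odds     <- p / (1 - p)
log_odds <- predict(fit, type = "link")

# p is a named vector, so the identifiers travel with the scores.
out <- data.frame(id = names(p), pscore = unname(p), stringsAsFactors = FALSE)
print(head(out))
```

The `out` data frame has one row per individual with its identifier in an ordinary column, which is usually easier to write back to a database than row names.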