Eileen Meyer
2010-Nov-03 00:38 UTC
[R] package 'np' and point estimation with multiple predictors
(Disclaimer: I'm in physics, not stats...)

I have a multivariate problem: one variable, call it R1, and three "predictor" variables, P1, P2, P3. My goal is to take a load of training data (I know R1, P1, P2, P3 for about 700 points in total) and then predict R1 for a new set of data for which I have all the predictors. Simple, no?

I understand how to calculate bandwidths, and I have a kind of bastardized way of getting the conditional distribution, i.e., f(R1|P1=0.8,P2=0.2,P3=2), using fitted(npudens(bw=bw,edat=newdata)) evaluated over a vector of R1 values. I have then been using this "density" to get a maximum likelihood estimate of R1. I have no idea if that is really valid, and if anyone wants to yell at me, go ahead; I want to do this the correct way, and I'm sure I'm making it harder than it is.

Moving past that, the technical problem I am facing is getting a prediction interval from this. There's npqreg, and I get how it works when you have one predictor, but what happens when you have many? What I want is the 0.05 and 0.95 quantiles for a given P1, P2, P3, to use as my prediction interval.

Thanks,
EM