Mark Hempelmann
2005-Sep-09 15:43 UTC
[R] regression with restrictions - optimization problem
Dear WizaRds! I am sorry to ask for some help, but I have come to a complete stop in my efforts. I hope, though, that some of you might find the problem quite interesting to look at. I have been trying to estimate parameters for lotteries, the so called utility of chance, i.e. the "felt" probability compared to a rational given probability. A real brief example: Given is a lottery payoff matrix of the type x1 x2 ... x10 median 1000 5000 ... 5000 3750 0 1000 ... 5000 2250 etc. The actual data frame consists of 11 columns and 28 rows. Each entry x1 ... x10 gives the amount of money resp. the utility of that amount you receive playing the lottery. The probability for each column is 10%. The median represents the empirical answers of players where the person is indifferent if they prefer to receive the lottery or the sum of money as a sure payoff. I try to determine the probability people feel instead of the known 10% probability of each column payoff entry. But here's the catch: People also give different utilities to each amount of money, which basically gives us some sort of function like this: u(x1...x10) = u(x1)*pi(p1) + u(x2)*pi(p2) +...+u(x10)*pi(p10)=y u() - unknown utility function pi() - unknown probability function y - empirical answer p1..p10 - probabilities, here always 0.1 To keep it simple, I set u(0)=0 and u(5000)=5000 and vary u(1000) between a start and end point. On each cycle R computes the regression coefficients that serve as the pi(p) estimators for every 10% step. Then I minimize the residual sum of squares which should give the best estimators for every 10% step. How can I possibly calculate a "smooth" pi(p) curve, a curve that should look like an "S", plotted against the cumulative 10% probabilities? I only have my ten estimators. How can I possibly tell R the necessary restrictions of nonnegative estimators and their sum to equal one? Here is my quite naive approach: a70 <- matrix(c(1000,5000,5000,5000,2150, 0,1000,5000,5000,1750, 0,0,1000,5000,1150, 0,0,0,1000,200, 1000,1000,5000,5000,2050, 0,1000,1000,5000,1972), ncol=5, byrow=T) colnames(a70)=c(paste("x", 1:4, sep=""), "med") a70 <- as.data.frame(a70) start=800; end=2000 step=10; u1000=start-step u1000 <- u1000+step # varying the 1000 entry a70[a70==1000] <- u1000 reg70 <- lm(a70$med ~ -1+x1+x2+x3+x4, data=a70) res <- sum( (reg70$residuals^2) ) for (i in 1:( (end-start)/step) ){ a70[a70==u1000] <- u1000+step u1000 <- u1000+step reg70 <- lm(a70$med ~ -1+x1+x2+x3+x4, data=a70) if (res >= sum( (reg70$residuals^2) )) { res <- sum( (reg70$residuals^2) ) print(paste("cycle", i, "u1000=", u1000, "RSS=", res)) final70 <- a70 finalreg <- reg70 } } print(final70) summary(finalreg) Maybe a better approach works with optim(stats) or dfp(Bhat), but I have no idea how to correctly approach such a restricted optimization problem. Thank you su much for your help and support. Mark Hempelmann
Spencer Graves
2005-Sep-17 02:55 UTC
[R] regression with restrictions - optimization problem
I have not seen any replies, so I will offer a comment: 1. You speak of x1, x2, ..., x10, but your example includes only x1+x2+x3+x4. I'm confused. If you could still use help with this, could you please simplify your example further so there was only x1+x2, say? Can you solve the problem with only x1? If no, state your problem in terms only of x1. The answer to that may include a fairly obvious generalization to x10. If not, that can become a follow-on question. 2. What is your objective function? What do you want to maximize or minimize? Are your y's ("med"?), e.g., some model plus normal error? If yes, then some kind of (nonlinear) regression might be appropriate. If no, then it might be best to start by writing an objective function. If you have only one parameter to estimate to minimize the objective function, then you can just compute the objective function over a range for the parameter, create a plot, and be done. If you want some refinement of that, please look at "optimize". If you only two parameters, you can create contour plots, and use "optim" to refine the result. For more unknowns, "optim" can be fairly useful. Good luck. spencer graves Mark Hempelmann wrote:> Dear WizaRds! > > I am sorry to ask for some help, but I have come to a complete stop in > my efforts. I hope, though, that some of you might find the problem > quite interesting to look at. > > I have been trying to estimate parameters for lotteries, the so called > utility of chance, i.e. the "felt" probability compared to a rational > given probability. A real brief example: Given is a lottery payoff > matrix of the type > > x1 x2 ... x10 median > 1000 5000 ... 5000 3750 > 0 1000 ... 5000 2250 etc. > > The actual data frame consists of 11 columns and 28 rows. > > Each entry x1 ... x10 gives the amount of money resp. the utility of > that amount you receive playing the lottery. The probability for each > column is 10%. The median represents the empirical answers of players > where the person is indifferent if they prefer to receive the lottery or > the sum of money as a sure payoff. > > I try to determine the probability people feel instead of the known 10% > probability of each column payoff entry. But here's the catch: > > People also give different utilities to each amount of money, which > basically gives us some sort of function like this: > u(x1...x10) = u(x1)*pi(p1) + u(x2)*pi(p2) +...+u(x10)*pi(p10)=y > u() - unknown utility function > pi() - unknown probability function > y - empirical answer > p1..p10 - probabilities, here always 0.1 > > To keep it simple, I set u(0)=0 and u(5000)=5000 and vary u(1000) > between a start and end point. On each cycle R computes the regression > coefficients that serve as the pi(p) estimators for every 10% step. > Then I minimize the residual sum of squares which should give the best > estimators for every 10% step. > > How can I possibly calculate a "smooth" pi(p) curve, a curve that should > look like an "S", plotted against the cumulative 10% probabilities? I > only have my ten estimators. How can I possibly tell R the necessary > restrictions of nonnegative estimators and their sum to equal one? Here > is my quite naive approach: > > a70 <- matrix(c(1000,5000,5000,5000,2150, 0,1000,5000,5000,1750, > 0,0,1000,5000,1150, 0,0,0,1000,200, 1000,1000,5000,5000,2050, > 0,1000,1000,5000,1972), ncol=5, byrow=T) > colnames(a70)=c(paste("x", 1:4, sep=""), "med") > a70 <- as.data.frame(a70) > > start=800; end=2000 > step=10; u1000=start-step > > u1000 <- u1000+step # varying the 1000 entry > a70[a70==1000] <- u1000 > reg70 <- lm(a70$med ~ -1+x1+x2+x3+x4, data=a70) > res <- sum( (reg70$residuals^2) ) > > for (i in 1:( (end-start)/step) ){ > a70[a70==u1000] <- u1000+step > u1000 <- u1000+step > reg70 <- lm(a70$med ~ -1+x1+x2+x3+x4, data=a70) > if (res >= sum( (reg70$residuals^2) )) { > res <- sum( (reg70$residuals^2) ) > print(paste("cycle", i, "u1000=", u1000, "RSS=", res)) > final70 <- a70 > finalreg <- reg70 > } > } > print(final70) > summary(finalreg) > > Maybe a better approach works with optim(stats) or dfp(Bhat), but I > have no idea how to correctly approach such a restricted optimization > problem. > Thank you su much for your help and support. > > Mark Hempelmann > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html-- Spencer Graves, PhD Senior Development Engineer PDF Solutions, Inc. 333 West San Carlos Street Suite 700 San Jose, CA 95110, USA spencer.graves at pdf.com www.pdf.com <http://www.pdf.com> Tel: 408-938-4420 Fax: 408-280-7915