Hi,

I am looking to perform a discrete mean-variance optimisation: specifically, to maximise the ratio of portfolio mean to portfolio standard deviation for a portfolio of several hundred stocks, where the holding in each stock must be drawn from a small finite set of integer position sizes which includes zero.

I don't think any of the standard R optimisation functions are ideally suited to this particular task, but perhaps there is a way to tailor them to this purpose, or does anyone know of any alternative R algorithms which would address this problem well?

Phil

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
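For concreteness, the objective described above (portfolio mean over portfolio standard deviation, with every holding drawn from a small finite set of integers) can be sketched as a short R function. The inputs `mu`, `Sigma`, and the position set below are illustrative assumptions, not data from the thread:

```r
## Score a discrete portfolio: mean return divided by standard deviation.
## 'w' is a vector of integer positions, 'mu' the expected returns, and
## 'Sigma' the return covariance matrix (all illustrative here).
portfolio_score <- function(w, mu, Sigma) {
  m <- sum(w * mu)                        # portfolio mean
  s <- sqrt(drop(t(w) %*% Sigma %*% w))   # portfolio standard deviation
  if (s == 0) return(-Inf)                # the all-zero portfolio is not scored
  m / s
}

## Toy example: 4 stocks, positions restricted to the finite set -2:2
set.seed(1)
mu    <- c(0.05, 0.03, 0.07, 0.02)
A     <- matrix(rnorm(16), 4, 4)
Sigma <- crossprod(A) / 4                 # a valid (positive-definite) covariance
sizes <- -2:2
w     <- sample(sizes, 4, replace = TRUE)
portfolio_score(w, mu, Sigma)
```

Any of the combinatorial optimisers discussed later in the thread can then be pointed at this score (negated, if the optimiser minimises).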
Cheers Jim,

Your response is exactly what I was hoping for, confirming my suspicion that (for once) R does not have a ready-made function for the purpose, and that what I actually need is one of the various combinatorial optimisation algorithms dotted around the web, such as simulated annealing, genetic algorithms, taboo search, memetic algorithms, neural networks, et al.

I did manage to cobble together some R code for a taboo search, but I'm no expert in optimisation so I'm not convinced that it is working as efficiently as it should be, and would therefore be very interested in your simulated annealing code, as I'll bet it works better than mine! The SPSA algorithm is also very interesting, especially if it really is as efficient as the website claims, so if you have any code for this then again it would be very welcome; otherwise I'll have a go at translating the MATLAB code in the overview document into R.

Thanks very much for your response - it's already been very helpful.

Phil

-----Original Message-----
From: Jim_Garrett at bd.com [mailto:Jim_Garrett at bd.com]
Sent: 22 October 2002 14:45
To: Phil Saunders
Subject: Re: [R] Combinatorial Optimisation

Hello Phil,

There are a number of possibilities that are not hard to implement, though I don't know of anything out-of-the-box. You will probably have to do some coding.

First, if you can implement a "steepest descent" algorithm, in which, beginning from some candidate point, you move to a neighboring point if your criterion suggests it's better, simply run this algorithm many times, beginning each run from a different randomly-selected starting point. Simple probability theory indicates that this will converge (in the number of runs) to the true solution. In the "consider the neighbors" step, you could consider all neighbors and pick the best, or just pick one at random (which might actually do better--it will be less likely to get stuck in poor local optima).
In many problems this works very quickly; it's just not very sexy.

Second, you can generalize this a bit and try "simulated annealing", in which neighboring candidate points are randomly selected. Whereas steepest descent will never move to a point that's worse, simulated annealing does so with a probability that decreases as the candidate point looks worse: if the candidate point is just a little worse, the algorithm moves with moderate to high probability, but if the point is a lot worse, the algorithm will make the move with small probability. Furthermore, the scaling of "badness" becomes continuously harsher, so that eventually the algorithm will not move to even slightly worse points. I have R code to do this, but it requires that you write some routines that are specific to your problem, mainly defining your loss function and the function that runs the random selection of neighbors (and hence implicitly defines what a neighbor is). If you'd like to try this out, let me know. As with steepest descent, I would try several simulated annealing runs, but simulated annealing is more likely to get a good optimum in a small number of runs.

One nice thing about simulated annealing is that you can handle complex constraints by coding the point-selection routine never to select an impermissible point; however, this could cause the optimization to be trapped. Another nice feature is that (depending again on the point-selection routine) neighbors involve single-component changes, and if you can write your loss function so that it can be updated rather than recomputed from scratch, simulated annealing can be relatively fast.

Third, there's an extremely efficient continuous optimizer that is not as well known as it ought to be, called Simultaneous Perturbation Stochastic Approximation (SPSA); there's a web site at www.jhuapl.edu/spsa/. I've adapted it to subset selection (in the context of determining which regression inputs to use).
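The scheme Jim describes (random neighbor selection, temperature-scaled acceptance of worse moves, single-component changes, with multiple restarts advisable) can be sketched in R as below. This is not Jim's code; the toy loss and neighbor functions are illustrative stand-ins for the problem-specific routines he mentions, written in minimisation form (negate a criterion you want to maximise):

```r
## Minimal simulated-annealing sketch: minimises 'loss' over discrete points.
## 'neighbour' returns a random neighbouring candidate and implicitly
## defines the neighbourhood structure.
anneal <- function(start, loss, neighbour, n_iter = 5000,
                   temp0 = 1, cooling = 0.999) {
  cur  <- start
  curL <- loss(cur)
  best <- cur
  bestL <- curL
  temp <- temp0
  for (i in seq_len(n_iter)) {
    cand  <- neighbour(cur)
    candL <- loss(cand)
    ## always accept improvements; accept worse points with probability
    ## exp(-(candL - curL) / temp), which shrinks as the temperature cools
    if (candL <= curL || runif(1) < exp(-(candL - curL) / temp)) {
      cur  <- cand
      curL <- candL
    }
    if (curL < bestL) { best <- cur; bestL <- curL }
    temp <- temp * cooling
  }
  list(par = best, value = bestL)
}

## Toy usage: recover a target integer vector, neighbours differing in a
## single component, with all components restricted to the finite set -2:2.
target    <- c(2, -1, 0, 1)
loss      <- function(w) sum((w - target)^2)
neighbour <- function(w) {
  i <- sample(length(w), 1)
  w[i] <- sample(setdiff(-2:2, w[i]), 1)
  w
}
set.seed(42)
res <- anneal(rep(0, 4), loss, neighbour)
res$value
```

With 5000 iterations on this small toy problem the best loss found is essentially always 0; a real portfolio problem would use the discrete position vector as `w` and the (negated) mean/sd criterion as `loss`.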
A potential downside is that it will randomly partition the components and apply the loss function to each half, so initially roughly half of the components would be used. Suppose, however, that the optimal solution involves 5 components and you have 95 others: it will calculate the loss function on the 95, which in some situations can be computationally difficult or mathematically impossible. I have some code that I could probably generalize pretty well; again, you'll have to write the loss function. Again, let me know if you'd like to give it a try.

Finally, as you probably already know, everybody and their brother has some combinatorial optimization algorithm. Some that come to mind are ant searches, genetic algorithms, and taboo search. They will all probably work if used correctly, but I can't offer much help on them!

Again, I'm happy to offer code, though it's not going to be an out-of-the-box solution. I can get you simulated annealing code faster than SPSA code. Let me know if you're interested.

Good luck,

Jim Garrett
Becton Dickinson Diagnostic Systems
Baltimore, Maryland, USA
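The core SPSA update Jim refers to (see www.jhuapl.edu/spsa/) is short enough to sketch in R. The gain-sequence constants below follow commonly recommended forms but are assumptions here, as is the toy quadratic loss; this is a sketch of the basic continuous algorithm, not Jim's subset-selection adaptation:

```r
## Bare-bones SPSA: gradient-free stochastic minimisation of 'loss'.
## Both loss evaluations per step perturb ALL components simultaneously
## with a random +/-1 (Rademacher) direction vector.
spsa <- function(theta, loss, n_iter = 1000,
                 a = 0.1, c = 0.1, A = 100, alpha = 0.602, gamma = 0.101) {
  p <- length(theta)
  for (k in seq_len(n_iter)) {
    ak    <- a / (k + A)^alpha                    # decaying step size
    ck    <- c / k^gamma                          # decaying perturbation size
    delta <- sample(c(-1, 1), p, replace = TRUE)  # Rademacher perturbation
    ghat  <- (loss(theta + ck * delta) - loss(theta - ck * delta)) /
             (2 * ck * delta)                     # simultaneous gradient estimate
    theta <- theta - ak * ghat
  }
  theta
}

## Toy usage: minimise a quadratic; the iterates converge near c(1, 2, 3)
set.seed(1)
est <- spsa(c(0, 0, 0), function(x) sum((x - c(1, 2, 3))^2))
round(est, 2)
```

Only two loss evaluations are needed per iteration regardless of dimension, which is the source of SPSA's efficiency on high-dimensional problems.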
> Date: Thu, 24 Oct 2002 08:58:08 +0200 (CEST)
> From: Detlef Steuer <Detlef.Steuer@unibw-hamburg.de>
> Subject: RE: [R] Combinatorial Optimisation
>
> This thread missed me somehow.
>
> If not mentioned before:
> simulated annealing is implemented in optim().

The current implementation of simulated annealing in optim() has been designed for a continuous parameter space: the next candidate point in the parameter space is generated from a Gaussian Markov kernel with scale proportional to the actual temperature. However, it is quite simple to change this for a combinatorial optimization.

In the R source tree, see the file src/main/optim.c, function samin. What you need to change are the four lines that are responsible for generating a new candidate point:

    for (i = 0; i < n; i++)
        dp[i] = scale * t * norm_rand(); /* random perturbation */
    for (i = 0; i < n; i++)
        ptry[i] = p[i] + dp[i];          /* new candidate point */

You could, e.g., replace them by a call to an R function which randomly selects a new candidate point from a finite set of points. Hence, to quickly get SA for combinatorial optimization:

- Copy and paste samin from optim.c.
- Change the four lines.
- Write an R wrapper for this new function.

best
Adrian

PS TO R-DEVEL: What about having this behaviour in optim (default as is, optionally a function argument which selects a candidate point)?

--
Dr. Adrian Trapletti              Phone : +41 (0) 1 994 5631
Trapletti Statistical Computing   Mobile: +41 (0)76 370 5631
Wildsbergstrasse 31               Fax   : +41 (0) 1 994 5631
CH-8610 Uster                     Email : mailto:a.trapletti@bluewin.ch
Switzerland                       WWW   : http://trapletti.homelinux.com

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
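As it turns out, the interface Adrian proposes in his PS is available in current versions of R: for method = "SANN", optim() uses its gr argument as the function that generates the next candidate point. A toy sketch over a finite position set (the target vector and loss below are illustrative, not from the thread):

```r
## optim(method = "SANN") with a user-supplied candidate generator:
## 'gr' receives the current point and returns a random neighbour,
## here restricted to components in the finite set -2:2.
target <- c(2, -1, 0, 1)
loss   <- function(w) sum((w - target)^2)

neighbour <- function(w) {
  i <- sample(length(w), 1)               # change a single component...
  w[i] <- sample(setdiff(-2:2, w[i]), 1)  # ...to another allowed value
  w
}

set.seed(1)
res <- optim(rep(0, 4), fn = loss, gr = neighbour, method = "SANN",
             control = list(maxit = 5000))
res$value
```

No C-level changes are needed for this use; the combinatorial structure lives entirely in the candidate-generation function.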
Adrian,

Thanks for the idea. If you were willing to contribute a patch to do this we would be happy to incorporate it.

Brian

On Fri, 25 Oct 2002, Adrian Trapletti wrote:

> You could, e.g., replace them by a call to an R function which randomly
> selects a new candidate point from a finite set of points. Hence, to
> quickly get SA for combinatorial optimization:

(Using an R function here is not entirely trivial, hence the request above.)

> - Copy and paste samin from optim.c.
> - Change the four lines.
> - Write an R wrapper for this new function.

--
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel: +44 1865 272861 (self)
1 South Parks Road,                    +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax: +44 1865 272595