I see that some of the speed patches that I posted have been incorporated into the current development version (e.g., my patch-for, patch-evalList, and patch-vec-arith). My patch for speeding up x^2 has been addressed in an inadvisable way, however.

My patch was a simple addition of four lines of code that speeds up squaring of real vectors by a factor of about six (for vectors of length 10000), by just converting x^2 to x*x. Modifications in the current development version (r52936 and r52937) attempt to address this issue in a different way, but they produce a smaller speedup. My modification is about 2.6 times faster than the current development version (on an Intel/Linux system). Similarly, in the current development version, x*x is still about 2.6 times faster than x^2. Furthermore, the modification in the current development version slows down raising to the power 0.5 (square roots) by about 4%.

I think there's no reason not to just use my patch. One could also put in a similar modification to speed up squaring of integer vectors, but I think that is a much less common operation than squaring of real vectors, which arises all the time when computing squared residuals, squared Euclidean distances, etc.