Hi all,
I want to make sure I am getting the most out of my 8-core machine
running R. I have a distance function like this:
distance <- function(pointX, pointY, vecX, vecY) {
  # convert everything to radians
  pointX <- pointX * pi / 180
  pointY <- pointY * pi / 180
  vecX <- vecX * pi / 180
  vecY <- vecY * pi / 180
  # data@coords[, 2] = lat; data@coords[, 1] = lon
  delta_lon <- pointX - vecX
  # length checks, left commented out for debugging:
  # print(sprintf("Len Pt X: %d Len Pt Y: %d", length(pointX), length(pointY)))
  # print(sprintf("Len Vec X: %d Len Vec Y: %d", length(vecX), length(vecY)))
  # print(sprintf("Len delta_lon: %d", length(delta_lon)))
  R <- 6372.8  # radius of the earth (km)
  # spherical law of cosines:
  # acos(sin(lat1)*sin(lat2) + cos(lat1)*cos(lat2)*cos(delta_lon))
  cos_angle <- sin(pointY) * sin(vecY) + cos(pointY) * cos(vecY) * cos(delta_lon)
  # clamp to [-1, 1] so rounding error cannot push acos() into NaN
  delta_rad <- acos(pmin(pmax(cos_angle, -1), 1))
  delta_km <- R * delta_rad
  return(delta_km)
}
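For scale, this is roughly how I call it. The coordinates below are made
up purely to illustrate the sizes involved:

set.seed(1)
vecX <- runif(5e5, -180, 180)   # ~500k longitudes
vecY <- runif(5e5, -90, 90)     # ~500k latitudes
system.time(d <- distance(-71.06, 42.36, vecX, vecY))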
vecX and vecY are very large, about 500k elements or more, and I want to
know the most efficient way to write this function. I assume I don't even
need lapply or one of the apply functions here, because calls like
cos(vector) and sin(vector) should give me all of the multithreading I
could ask for. Since R is a functional language, I had assumed each of
these vectorized operations would be run in parallel with optimal
settings. Am I mistaken, and if so, how can I get multithreading by
default for base operations applied to vectors?
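In case the answer is that base R stays single-threaded here, this is the
kind of explicit chunking I would fall back on. It is only a sketch using
the parallel package (multicore's mclapply on older R); the point and the
chunk count are made up:

library(parallel)   # provides mclapply (fork-based, so not on Windows)
n_cores <- 8
# split the indices into one chunk per core
chunks <- split(seq_along(vecX), cut(seq_along(vecX), n_cores, labels = FALSE))
res <- mclapply(chunks,
                function(idx) distance(-71.06, 42.36, vecX[idx], vecY[idx]),
                mc.cores = n_cores)
delta_km <- unlist(res, use.names = FALSE)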
Currently I also have MPI set up for some work I am doing. Should I use
it heavily here and assume it is doing what I want? And which free BLAS
should I be using on Ubuntu 11.10?
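In case it affects the BLAS answer: my understanding is that element-wise
calls like sin() and cos() never touch the BLAS, which only covers linear
algebra, so I have been comparing candidate BLAS builds with a plain
matrix multiply instead. The sizes here are arbitrary:

# matrix multiply exercises whatever BLAS R is linked against;
# timings before/after swapping libraries should show the difference
m <- matrix(rnorm(1000 * 1000), nrow = 1000)
system.time(m %*% m)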
Thank you,
~Ben