Ryan van Laar
2009-May-14 18:28 UTC
[R] KNN script: Identity of specific K samples chosen?
I am currently doing some prediction work using the knn function in the 'class' package. Does anyone know a way of having R return the IDs (sample IDs, or column IDs of the training matrix) of the 'k' samples that the algorithm chooses as nearest to a given test sample?

I have searched and read everything I can find about the function, but have found nothing beyond the option to output the proportion of the 'k' samples that agree with the final prediction (which is also quite useful). I would still like the IDs of the 'k' samples, for further lookup and reporting steps in my process.

Thanks in advance for any advice.

Ryan
> I am currently doing some prediction work using the knn script in the
> 'class' package. Does anyone know a way of having R return the IDs (sample
> IDs, or column IDs of the training matrix) of the 'k' samples that are
> chosen by the algorithm as being nearest to a given test sample?

You would probably have to modify the source code to get the specific neighbors.

> I have searched/read everything I can about the script, however have not
> found anything other than the ability to output the proportion of 'k'
> samples agreeing with the final prediction (which is also quite useful).

The knn3 function in the caret package will return the votes per class for each test sample.

> However I would still like the 'k' sample ID's, for further lookup and
> reporting steps in my process.

Since this sounds like a research project, a better idea might be to use the proxy package (or some other) to compute the neighbors yourself and do what you wish with them.

Max
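For anyone finding this thread later, the "compute the neighbors yourself" suggestion could look roughly like the sketch below. It uses base R only rather than the proxy package, assumes Euclidean distance, and assumes samples are in rows (as class::knn expects), so a matrix with samples in columns would need t() first. The helper name nearest_ids and the toy data are made up for illustration. Ties are broken by row order here, which is not necessarily how class::knn treats them, so the IDs may differ from knn's internal choice when distances are exactly equal.

```r
# Hypothetical helper (not part of 'class', 'caret' or 'proxy'):
# returns the row names of the k training samples nearest to each test sample.
# Assumptions: Euclidean distance, samples in rows, ties broken by row order.
nearest_ids <- function(train, test, k = 3) {
  train <- as.matrix(train)
  test <- as.matrix(test)
  if (is.null(rownames(train))) rownames(train) <- seq_len(nrow(train))
  per_test <- lapply(seq_len(nrow(test)), function(i) {
    # distance from test sample i to every training sample
    d <- sqrt(colSums((t(train) - test[i, ])^2))
    rownames(train)[order(d)[seq_len(k)]]
  })
  # one row per test sample, one column per neighbour rank (nearest first)
  ids <- matrix(unlist(per_test), ncol = k, byrow = TRUE)
  rownames(ids) <- rownames(test)
  ids
}

# Toy data, invented for this example
train <- rbind(s1 = c(0, 0), s2 = c(1, 0), s3 = c(0, 1),
               s4 = c(5, 5), s5 = c(6, 5))
test <- rbind(t1 = c(0.2, 0.1), t2 = c(5.4, 5.2))

nn <- nearest_ids(train, test, k = 2)
print(nn)
```

Each row of nn holds the training sample IDs for one test sample, so they can be used directly to look up annotations, e.g. by indexing a sample table with nn["t1", ].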