On 08.09.2011 14:19, ?????? ?????? wrote:
> Hello.
>
> I found the behavior of the knn() function
> (http://stat.ethz.ch/R-manual/R-devel/library/class/html/knn.html)
> to be very strange.
> Consider this toy example:
>> library(class)
>> train<- matrix(nrow=5000,ncol=2,data=rnorm(10000,0,1))
>> test<- matrix(nrow=10,ncol=2,data=rnorm(20,0,1))
>> cl<- rep(c(0,1),2500)
>> knn(train,test,cl,1)
> [1] 1 1 0 0 1 0 1 1 0 1
> Levels: 0 1
>
> It works properly if you pass any number of nearest neighbours (the 4th
> parameter) from 1 to 499.
> But if you run it with a number of nearest neighbours >= 500, there is an
> error:
>> knn(train,test,cl,500)
> error in knn(train, test, cl, 500) : too many ties in knn
>
> This happens no matter what data you have. Even if you run it with an odd
> number of nearest neighbours, say 501 (so there just can't be any ties),
> you get exactly the same error.
> Am I missing something?
Yes, the source code. In the source package, ./src/class.c, line 89:
#define MAX_TIES 1000
That means the author (who is on a well-deserved vacation and may not
answer at once) decided that it is extremely unlikely that someone is
going to run knn with such an extreme number of neighbours k.
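To check where the limit bites on whatever version of class is installed, a minimal sketch along the following lines may help. It rebuilds the toy data from the question and records, for a few values of k, whether knn() completes or which error it raises. The helper name knn_status and the seed are assumptions made for illustration; the results depend on the installed version, so none are shown here.

```r
library(class)

set.seed(1)  # arbitrary seed, only for reproducibility
train <- matrix(rnorm(10000), nrow = 5000, ncol = 2)
test  <- matrix(rnorm(20), nrow = 10, ncol = 2)
cl    <- factor(rep(c(0, 1), 2500))

# knn_status() is a hypothetical helper: it returns "ok" if knn() completes
# for the given k, or the error message if knn() fails.
knn_status <- function(k) {
  tryCatch({
    knn(train, test, cl, k = k)
    "ok"
  }, error = function(e) conditionMessage(e))
}

ks <- c(1, 499, 500, 501)
status <- vapply(ks, knn_status, character(1))
print(data.frame(k = ks, status = status))
```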
If you really have an application where this makes sense, just edit the
source code and increase that number, then install the package from
sources yourself.
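As a rough sketch of the last step: once the class sources are unpacked and MAX_TIES in src/class.c has been edited, the modified copy can be installed from within R. The path in src_dir is hypothetical and must be adjusted, and a working C compiler toolchain is assumed because the package contains compiled code.

```r
# Hypothetical location of the unpacked and edited 'class' source directory
src_dir <- file.path("~", "src", "class")

if (dir.exists(src_dir)) {
  # repos = NULL with type = "source" installs from a local source directory
  install.packages(src_dir, repos = NULL, type = "source")
} else {
  message("Set src_dir to the unpacked 'class' sources before installing.")
}
```

Restart R (or detach and reload the package) afterwards so that the rebuilt version is the one in use.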
Uwe Ligges
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.