Hello,
I am hoping you can help me with a question concerning kmeans clustering
in R. I am working with the following data-set (abbreviated):
      BMW Ford Infiniti Jeep Lexus Chrysler Mercedes Saab Porsche Volvo
 [1,]   6    8        2    8     4        5        4    4       7     7
 [2,]   8    7        4    6     4        1        6    7       8     5
 [3,]   8    2        4    6     3        2        7    4       4     4
 [4,]   7    4        4    6     6        1        6    3       5     5
 [5,]   6    2        4    5     5        1        3    3       6     3
 [6,]   6    7        3    6     5        1        8    4       8     2
 [7,]   1    6        6    7     5        2        6    6       5     6
 [8,]   3    6        6    4     5        1        4    2       1     1
 [9,]   6    7        5    8     4        1        6    6       8     5
[10,]   6    7        5    9     3        1        2    5       1     8
When I try to scale my data and perform kmeans clustering, I get the
following errors:
> new <- scale(new)
Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric
> cl <- kmeans(new, 4)
Error in switch(nmeth, { : NA/NaN/Inf in foreign function call (arg 1)
In addition: Warning message:
In switch(nmeth, { : NAs introduced by coercion
This is confusing to me since all of the data is numeric and there are
no missing values. Is there something I need to do to my data to prepare
it for kmeans? I have tried many matrix transformations but nothing has
worked so far.
Your help is much appreciated.
Thanks,
jordan
--
Jordan van Rijn
vanrijn9 at fastmail.fm
On 9 May 2008, at 09:12, Jordan van Rijn wrote:
> When I try to scale my data and perform kmeans clustering, I get the
> following errors:
> new <- scale(new)
> Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric

Probably the data is stored as factor instead of numeric. Try coercing
by as.numeric(new)

hth, Ingmar

Ingmar Visser
Department of Psychology, University of Amsterdam
Roetersstraat 15
1018 WB Amsterdam
The Netherlands
t: +31-20-5256723
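A caution on the coercion suggested above, as a minimal sketch with made-up values (the objects f and m are hypothetical, not the poster's data): if a column really is a factor, as.numeric() returns the internal level codes rather than the printed numbers, and on a matrix it also drops the dimensions. Converting through character, or changing the storage mode in place, avoids both problems.

```r
# Hypothetical ratings that ended up stored as a factor.
f <- factor(c("6", "8", "10", "7"))

# as.numeric() on a factor gives the internal level codes, not the values.
codes <- as.numeric(f)

# Converting to character first recovers the printed values.
vals <- as.numeric(as.character(f))

# For a character matrix, as.numeric() would drop the dim attribute;
# changing the storage mode keeps the dimensions and the column names.
m <- matrix(c("6", "8", "8", "7"), nrow = 2,
            dimnames = list(NULL, c("BMW", "Ford")))
storage.mode(m) <- "double"
```

Comparing codes with vals shows the difference: the codes follow the sorted order of the level labels, so they need not match the original numbers at all.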
Unfortunately, your data is *not* numeric. That is what the first
error message, "'x' must be numeric", is telling you, and you should
believe it. It might look numeric, but it isn't, which is why Ingmar
mentioned you might have factors instead of numbers.
Your challenge is to discover why. The "why" will depend on how you
brought the data into R.
Assuming 'new' is a matrix (which it appears to be), here are some
ways to find out more about your data object:
is.numeric(new)
is.factor(new)
class(new)
mode(new)
str(new)
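To show what those checks might report, here is a small sketch that assumes 'new' is a character matrix, which is only one of the ways this error can arise. The values are a made-up subset, not the real data. It reproduces the scale() error, converts the matrix, and then runs kmeans().

```r
# Hypothetical stand-in for 'new': numbers stored as character strings.
new <- matrix(c("6", "8", "8", "7", "6", "6",
                "8", "7", "2", "4", "2", "7",
                "2", "4", "4", "4", "4", "3"),
              nrow = 6,
              dimnames = list(NULL, c("BMW", "Ford", "Infiniti")))

is.numeric(new)   # FALSE
mode(new)         # "character"

# scale() fails on non-numeric input; capture the message instead of stopping.
msg <- tryCatch(scale(new), error = function(e) conditionMessage(e))

# Convert a copy, then check that no value turned into NA along the way.
num <- new
storage.mode(num) <- "double"
stopifnot(!anyNA(num))

set.seed(1)   # kmeans() starts from randomly chosen centers
cl <- kmeans(scale(num), centers = 2)
```

If the anyNA() check fails on the real data, at least one entry is not a plain number, and that entry is the thing to track down in the input.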
I'd suggest taking another look at your input data and making very
sure there are only numbers in it. If it was a text file you read
into R with some function, inspect the text file carefully. Also,
check the help pages for the method you used to load the data into R,
and see if you can find out what kinds of things cause data to be
interpreted as other than numeric.
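As an illustration of how a single stray entry can change a whole column, here is a sketch using read.table() on made-up text. The contents of txt and the "n/a" marker are assumptions for the example; the real file may have a different problem.

```r
# Hypothetical file contents: one entry ("n/a") is not a number.
txt <- "BMW Ford Jeep
6 8 8
8 n/a 6
8 2 6"

d <- read.table(text = txt, header = TRUE)

# One class per column; Ford comes back as character
# (or as factor in R versions before 4.0).
col_classes <- sapply(d, class)

# List the entries that cannot be converted to numbers.
ford_chr <- as.character(d$Ford)
bad <- ford_chr[is.na(suppressWarnings(as.numeric(ford_chr)))]

# Declaring the marker as missing lets the column be read as numeric.
d2 <- read.table(text = txt, header = TRUE, na.strings = c("NA", "n/a"))
```

Looking at col_classes on the real data should point to the affected columns, and the same conversion test can then list the offending entries in each.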
-Don
At 12:12 AM -0700 5/9/08, Jordan van Rijn wrote:
>______________________________________________
>R-help at r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
--
--------------------------------------
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062