Hi all,
I want to use Self Organizing Map in R for my data. I want my training set to be the following subset of my data:

    subdf=subset(df,Country%in%c("US","FR"))

Next I should change this subset to a matrix, but I get the following error:

    data_train_matrix=as.matrix(scale(subdf))
    error in colMeans(x,na.rm=TRUE):'x' must be numeric

Can anyone help me to solve that?
Thanks for any help
Elahe

You did not send a sample of your data using dput. Before doing that, I suggest peeling apart your troublesome line of code yourself:

    str( as.matrix( scale( subdf ) ) )
    str( scale( subdf ) )
    str( subdf )

And then think about what the scale function does. Does it make sense to ask it to scale character or factor data? Could you perhaps exclude some of the columns that don't belong in the scaled data?
--
Sent from my phone. Please excuse my brevity.

On June 1, 2016 7:39:30 AM PDT, "ch.elahe via R-help" <r-help at r-project.org> wrote:
>Hi all,
>I want to use Self Organizing Map in R for my data. I want my training
>set to be the following subset of my data:
>
> subdf=subset(df,Country%in%c("US","FR"))
>
>next I should change this subset to a matrix but I get the following
>error:
>
> data_train_matrix=as.matrix(scale(subdf))
> error in colMeans(x,na.rm=TRUE):'x' must be numeric
>
>Can anyone help me to solve that?
>Thanks for any help
>Elahe
>
>______________________________________________
>R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

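A minimal sketch of the "exclude the columns that don't belong" suggestion, assuming the only goal is to scale the numeric columns of the subset. The toy `df` below is hypothetical and merely stands in for the poster's data; `num_cols` is an illustrative name, not something from the thread.

```r
# Hypothetical toy data standing in for the poster's df (not from the thread)
df <- data.frame(
  Country = factor(c("US", "FR", "DE", "US")),
  BR      = c(320L, 384L, 256L, 320L),
  BW      = c(150L, 191L, 98L, 150L)
)

# Same subsetting step as in the original post
subdf <- subset(df, Country %in% c("US", "FR"))

# Inspect the column types first: Country is a factor, BR and BW are integers
str(subdf)

# Logical vector marking which columns are numeric
num_cols <- sapply(subdf, is.numeric)

# Scale only the numeric columns; drop = FALSE keeps a data frame
# even if just one numeric column remains
data_train_matrix <- scale(subdf[, num_cols, drop = FALSE])

str(data_train_matrix)
```

Because `scale` already returns a matrix, the extra `as.matrix` call from the original line is not needed in this sketch.
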
Hi Elahe,

if you look at your subdf, you will see that the column Country - which is not numeric - is still present. You might have other non-numeric columns, but this I cannot tell.

scale expects a numeric matrix. You give it a data.frame, which is silently cast to a matrix. A matrix can only have one type - unlike the data.frame - so the presence of the non-numeric columns results in a matrix of type character. Calculating means of characters is not possible, hence the error.

You need your data.frame to consist only of numeric types - then scale will proceed without complaints.

Best wishes,
Ulrik

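The coercion described above can be reproduced on a small example. This is only an illustrative sketch: the data frame `toy` and the names `m` and `res` are made up for the demonstration and are not from the thread.

```r
# Hypothetical two-column data frame: one character column, one numeric column
toy <- data.frame(
  Country = c("US", "FR", "US"),
  BR      = c(320, 384, 320)
)

# Converting the mixed data frame to a matrix forces a single type
m <- as.matrix(toy)
typeof(m)  # "character"

# scale() performs the same conversion internally, so taking column means fails.
# tryCatch captures the error message instead of stopping the script.
res <- tryCatch(scale(toy), error = function(e) conditionMessage(e))
res
```

Checking `typeof(as.matrix(subdf))` on the real data is a quick way to see whether any non-numeric column is still present.
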
Hi Jeff,
Thanks for your reply. My df contains Protocols and their Parameters, and I want to use SOM to see if I can find some clusters in customers' use of protocols. Some of these Parameters are factors and some are numeric. I want to make a subset of some protocols and give them to SOM as the training set, and keep some for the test set. Is it possible to also include those factors in the SOM training set?

    $ Protocol : Factor w/ 132 levels "_unknown","A5. SAG TSE T2 FS",..: 5
    $ BR : int 320 320 384 384 384 320 256 320 384 38
    $ BW : int 150 150 191 191 98 150 150 150
    $ COUNTRY : Factor w/ 35 levels "AE","AT","AU",..: 10 10
    $ FSM : Factor w/ 2 levels "strong","weak": 2 2

On Wednesday, June 1, 2016 7:59 AM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
>You did not send sample of your data, using dput. Before doing that, I
>suggest peeling apart your troublesome line of code yourself:
>
> str( as.matrix( scale( subdf ) ) )
> str( scale( subdf ) )
> str( subdf )
>
>And then think about what the scale function does. Does it make sense to
>ask it to scale character or factor data? Could you perhaps exclude some
>of the columns that don't belong in the scaled data?
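One possible way to carry factor information into a numeric training matrix is to expand each factor into 0/1 indicator columns. This is not something proposed in the thread, only a hedged sketch of a common encoding; the toy data, the choice to scale just the numeric columns, and the names `num_part`, `dummies` and `train` are all assumptions made for illustration.

```r
# Hypothetical toy data mimicking a few of the columns shown in the str() output
toy <- data.frame(
  BR  = c(320, 384, 256, 320),
  BW  = c(150, 191, 98, 150),
  FSM = factor(c("weak", "strong", "weak", "weak"))
)

# Scale only the numeric columns
num_part <- scale(toy[, c("BR", "BW")])

# Expand the factor into one 0/1 indicator column per level;
# "- 1" removes the intercept so every level gets its own column
dummies <- model.matrix(~ FSM - 1, data = toy)

# Combine into a single all-numeric matrix
train <- cbind(num_part, dummies)

str(train)
```

Whether indicator columns are meaningful for a distance-based method such as SOM, and how they should be weighted against the scaled numeric columns, is a modelling decision. A factor with many levels, such as Protocol with 132, would add one column per level under this encoding.
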