Dear R-users,
I am trying to run a function from the package “adabag” (e.g. boosting.cv)
in order to determine a suitable number of clusters, which I would then specify
in my KMeans clustering.
(I got this idea from: http://www.statsoft.com/TEXTBOOK/stcluan.html)
However, I have a problem with the “formula” parameter of
e.g. boosting.cv: I am not familiar with these formulas, and my search in the
R user guide did not really help me.
I have 2 columns in my dataset corresponding to 2 “response”
variables (x = coordinates along PCA axis 1 and y = coordinates along PCA axis
2 of my raw dataset), and it is on these variables that I would like to run
boosting.cv. I therefore tried to define a formula with 2 response variables
and no terms, but this case is not really explained in the R guide and I am not
even sure it is possible.
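As a point of reference, here is a minimal sketch of how R splits a two-sided formula into its parts, using base R only. The column names Dim.1, Dim.2 and KMeans are the ones from this post; the object names f_one, f_two, lhs_one and lhs_two are purely illustrative.

```r
# A formula has the shape  response ~ terms.
# Column names are written bare; the data frame is supplied separately.
f_one <- KMeans ~ Dim.1 + Dim.2   # one response, two terms
f_two <- Dim.1 + Dim.2 ~ KMeans   # compound left-hand side

# Element 2 of a two-sided formula is the left-hand side,
# element 3 is the right-hand side.
lhs_one <- as.character(f_one[[2]])   # a single name
lhs_two <- as.character(f_two[[2]])   # the call Dim.1 + Dim.2, split into pieces

print(lhs_one)
print(lhs_two)

# all.vars() lists every variable the formula refers to, response first
print(all.vars(f_one))
```

If a function picks out the response with something like data[, as.character(formula[[2]])], then f_two hands it several strings (the operator plus both operands) rather than one column name, so a compound left-hand side is unlikely to work with such a function.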
I also tried running my KMeans clustering first, assigning
the cluster number to each row and using this number (recoded with as.character) as
a term, but that does not work either.
A quick overview of the code and error messages, if needed:
> form <- as.formula(COORDI_PCA1n2$Dim.1 + COORDI_PCA1n2$Dim.2 ~ KMeans)
> boosting.cv(form, COORDI_PCA1n2)
Error in `[.data.frame`(data, , as.character(formula[[2]])) :
  undefined columns selected
> boosting.cv(form, COORDI_PCA1n2[, 1:2])
Error in `[.data.frame`(data, , as.character(formula[[2]])) :
  undefined columns selected
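For comparison, here is a minimal sketch of a call with a single class column on the left of the formula. It assumes that boosting.cv expects exactly one response and that this response should be a factor. The data frame toy, its simulated values, the choice of 3 clusters, and the small v and mfinal settings are illustrative placeholders, not taken from the real dataset. The adabag part only runs if the package is installed.

```r
set.seed(1)

# Simulated stand-in for the two PCA coordinate columns (hypothetical data)
toy <- data.frame(
  Dim.1 = c(rnorm(30, -3), rnorm(30, 0), rnorm(30, 3)),
  Dim.2 = c(rnorm(30, -3), rnorm(30, 3), rnorm(30, -3))
)

# Cluster first, then store the label as a factor rather than as character
km <- kmeans(toy[, c("Dim.1", "Dim.2")], centers = 3, nstart = 10)
toy$KMeans <- factor(km$cluster)

# Bare column names in the formula; the data frame goes in `data`
form <- KMeans ~ Dim.1 + Dim.2

# Every variable named in the formula must be a column of the data passed in
print(all(all.vars(form) %in% names(toy)))

if (requireNamespace("adabag", quietly = TRUE)) {
  library(adabag)
  cv <- boosting.cv(form, data = toy, v = 5, mfinal = 10)
  print(cv$confusion)   # cross-validated confusion matrix
  print(cv$error)       # cross-validated error rate
}
```

Two points in this sketch differ from the calls above. The formula uses bare column names instead of COORDI_PCA1n2$..., and the data frame passed in contains the label column as well as the two coordinates. Subsetting to columns 1:2 would presumably drop the label.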
If anyone has the time to point out where I am going wrong, I would be grateful.
With regards