thr3ads.net - R help - [R] r-square for cluster [Feb 2011]


Yan Boulanger

2011-Feb-21 18:48 UTC

[R] r-square for cluster

Dear forumites,

It seems that there is no straightforward way to calculate the R2 of a cluster
solution in R, so I would like to know whether I am right in calculating an
R2-like statistic for a given clustering solution. I have several cluster
solutions for the same data set, and I would like to know which one gives the
highest R2. My data (5 variables) are scaled to mean 0 and standard deviation 1.
These are the commands I used to calculate R2 for one cluster solution:

# Total sum of squares over all variables
SSTot <- (nrow(grid40km.datascale) - 1) * sum(apply(grid40km.datascale, 2, var))

SStot_grid40km <- NULL
for (i in 1:22) # there are 22 clusters
{
  # Observations in cluster i, taken from the same scaled data as SSTot
  # (assumes grid40km.datascale keeps the column names X1 to X5)
  data_group <- subset(grid40km.datascale, grid40km.cluster == i,
                       select = c(X1, X2, X3, X4, X5))
  # SS over all variables for this cluster
  SSgroup <- (nrow(data_group) - 1) * sum(apply(data_group, 2, var))
  SStot_grid40km <- append(SStot_grid40km, SSgroup, after = length(SStot_grid40km))
}
# Within-cluster SS (??) as the sum of the SS of all clusters
ssw_grid40km <- sum(SStot_grid40km)
ssbetween_grid40km <- SSTot - ssw_grid40km
RSQ_grid40km2 <- ssbetween_grid40km / SSTot  # R-square

Am I right? Does this correspond to SAS's R2?
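
One possible sanity check for the between/total decomposition, as a minimal
sketch only: it uses the built-in iris data and a 3-cluster kmeans() solution
(both are assumptions for illustration, not the grid40km data or its 22
clusters), and the helper ss() and the names r2_manual and r2_kmeans are
hypothetical. It compares the hand-computed ratio with betweenss/totss as
returned by kmeans(). It says nothing about how SAS defines its R2.

```r
# Illustration on iris, not on the grid40km data.
x  <- scale(iris[, 1:4])              # centre to mean 0, scale to sd 1
set.seed(1)                           # kmeans() uses random starts
km <- kmeans(x, centers = 3, nstart = 10)
cl <- km$cluster

# Sum of squared deviations from the column means, over all columns.
# Equal to (n - 1) * sum of column variances, but gives 0 instead of NA
# for a cluster that contains a single observation.
ss <- function(m) sum(sweep(m, 2, colMeans(m))^2)

ss_total  <- ss(x)
ss_within <- sum(sapply(split(seq_len(nrow(x)), cl),
                        function(idx) ss(x[idx, , drop = FALSE])))
r2_manual <- (ss_total - ss_within) / ss_total

# kmeans() stores the same decomposition in its return value
r2_kmeans <- km$betweenss / km$totss

all.equal(r2_manual, r2_kmeans)
```

If the two values agree, the hand computation reproduces the usual
between-cluster SS over total SS ratio for that solution.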

Many thanks,

Yan

 

Ressources Naturelles Canada
Service Canadien des Forêts - Centre de Foresterie des Laurentides
1055, rue du PEPS
CP 10380, Succ. Ste-Foy
Québec, QC, G1V 4C7
Tel. : +001 418 649-6859
Fax :  +001 418 648-5849 
email : Yan.Boulanger@nrcan.gc.ca
 




 		 	   		  