All,

I'm relatively new to R, having used it so far only for some simple statistics and plotting. However, I'm not new to programming by any measure. I've been looking at the various packages available for clustering, factor analysis, etc., and I need advice on which ones to focus on and how to apply them.

I have a data set whose columns hold both quantitative and qualitative (non-numeric) attributes. I would like to perform two operations on this data: identify correlations between attributes, and cluster the records by attribute.

All of the clustering algorithms that I've looked at so far are based on numerical distance functions, and it's not clear to me how I'd apply them to qualitative attributes. It's not appropriate to simply convert discrete qualitative attributes (e.g., native language) to numerical values or to independent columns with binary values. Is there a package that provides a suitable algorithm, or one that can be adapted to this purpose?

I can wrap my head around the problem of looking for cross-correlation between the attributes, but would appreciate any insight into how to do it most efficiently and how to present the results.

Thank you.
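
To make the concern about integer coding concrete, here is a minimal sketch in base R. The three language values are hypothetical and chosen only for illustration; the point is that the numeric codes, and therefore the distances, depend on nothing more than the alphabetical order of the factor levels.

```r
# Hypothetical toy data: three records with a made-up native-language attribute.
native_language <- factor(c("English", "French", "Mandarin"))

# as.integer() on a factor returns the level index; levels are sorted
# alphabetically by default, so the numbers carry no real meaning.
codes <- as.integer(native_language)
names(codes) <- as.character(native_language)
print(codes)

# Euclidean distance on the codes: English-Mandarin comes out twice as far
# apart as English-French, purely as an artifact of the coding.
print(dist(codes))
```

Reordering the factor levels would change these distances without changing the data, which is why a distance function defined directly on mixed attribute types seems to be what is needed.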