R community,

I am trying to cluster large omics datasets (10,000-20,000 variables). With datasets of this size, PC memory is an issue. I am using a custom distance metric and am able to generate a dissimilarity matrix in sparse format. To cluster using, for example, hierarchical clustering (hclust or fastcluster::hclust), I need to supply the data as a distance object. I can use as.dist() to achieve this, but doing so expands the sparse matrix to its full form, which quickly consumes all the memory on most desktop PCs.

My questions are then:

1. Is there a clustering tool that can take a sparse dissimilarity matrix as input directly, without expanding it?
2. Alternatively, is there a sparse distance object format that I can't seem to find (an alternative to as.dist(), for example)?

Any advice is appreciated.

Corey
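P.S. To put rough numbers on the memory issue, here is a minimal sketch of the arithmetic, assuming 8-byte doubles and ignoring object overhead. A dense "dist" object stores the n(n-1)/2 lower-triangle entries, and a full n x n matrix stores n^2 entries. The helper names dist_bytes and full_bytes are hypothetical and only for illustration; they are not from any package.

```r
# Approximate dense storage cost for n variables, assuming 8-byte doubles.
# dist_bytes() and full_bytes() are hypothetical helpers, not package functions.
dist_bytes <- function(n) n * (n - 1) / 2 * 8  # lower triangle, as held by a "dist" object
full_bytes <- function(n) n^2 * 8              # full n x n numeric matrix

n <- c(10000, 20000)

# Tabulate both costs in GiB for the dataset sizes mentioned in the post.
mem <- data.frame(
  variables = n,
  dist_GiB  = round(dist_bytes(n) / 2^30, 2),
  full_GiB  = round(full_bytes(n) / 2^30, 2)
)
print(mem)
```

These figures cover only a single copy of the data; any temporary copies made during conversion or clustering would add to them.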