I have a large Excel file with data in it. I converted it to CSV format and imported the data set into R using the following command:

mldata <- read.csv("c:\\temp\\mldata.csv", header=T)

All the column names and the rows seem to be correct.

Now that I have this object, I need to perform hclust. I used the following:

hc <- hclust(dist(mldata), method="single")

I get the following error:

> hc <- hclust(dist(mldata),"ave")
Error: cannot allocate vector of size 622668 Kb
In addition: Warning messages:
1: NAs introduced by coercion
2: Reached total allocation of 479Mb: see help(memory.size)

Can anyone please point out where I'm going wrong?

Thank you,
Ramya
Try traceback(). My bet is that the error is in dist(), and you do not have space for the distance matrix. If so you need to reassess what you are doing.

On Sat, 8 Nov 2003, ramya sundaram wrote:

> Now that I have this object, I need to perform hclust. I used the following
> hc <- hclust(dist(mldata), method="single")
>
> I get the following error
>
> > hc <- hclust(dist(mldata),"ave")
> Error: cannot allocate vector of size 622668 Kb
> In addition: Warning messages:
> 1: NAs introduced by coercion
> 2: Reached total allocation of 479Mb: see help(memory.size)
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
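A rough way to check this bet before calling dist() is to estimate how much memory the distance matrix would need. The sketch below assumes that a dist object holds one 8-byte double for each of the n(n-1)/2 pairs of rows and ignores any object overhead. The helper name dist_size_kb and the row counts are illustrative only; the post does not say how many rows mldata has.

```r
# Rough estimate of the memory needed by dist() for n observations.
# Assumption: one 8-byte double per pair of rows, no object overhead.
dist_size_kb <- function(n) {
  n_pairs <- n * (n - 1) / 2   # length of the resulting dist object
  n_pairs * 8 / 1024           # bytes converted to Kb
}

dist_size_kb(1000)    # about 3902 Kb: harmless
dist_size_kb(12626)   # about 622669 Kb: roughly the size in the error message
```

On the real data, dist_size_kb(nrow(mldata)) gives the figure to compare against the 479Mb allocation limit reported in the warning. The value 12626 is a back-calculation from the error message, not a number given by the poster.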
On 8 Nov 2003 at 13:23, ramya sundaram wrote:

The error message should be clear: you don't have sufficient memory. Buy more RAM, or see to it that R uses virtual memory. AFAIK, the way to do this can depend on the OS, and you didn't tell us which one you use. Read ?Memory.

But first think about whether this could possibly help: if n is the number of observations, a dist object has length n(n-1)/2. Compute this and see whether what you are trying to do makes sense! If not, try

library(cluster)

and look at ?clara ("Clustering Large Applications"). That will not work with factors, but from your code below it seems you don't have factors.

Kjetil Halvorsen

> I imported this dataset to R using the following command
> mldata <- read.csv("c:\\temp\\mldata.csv", header=T)
>
> Now that I have this object, I need to perform hclust. I used the following
> hc <- hclust(dist(mldata), method="single")
>
> I get the following error
>
> > hc <- hclust(dist(mldata),"ave")
> Error: cannot allocate vector of size 622668 Kb
> In addition: Warning messages:
> 1: NAs introduced by coercion
> 2: Reached total allocation of 479Mb: see help(memory.size)
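For reference, a minimal sketch of the clara() route suggested above, run on synthetic data because the contents of mldata are not shown in the thread. Note that clara() is a partitioning (k-medoids) method, not a hierarchical one, so it is a substitute for hclust() rather than an equivalent. The matrix x, the seed, and the choice k = 3 are all assumptions made for illustration.

```r
library(cluster)  # recommended package, normally installed alongside R

set.seed(1)
# Synthetic stand-in for mldata: 5000 rows, 4 numeric columns (hypothetical).
x <- matrix(rnorm(5000 * 4), ncol = 4)

# k = 3 is arbitrary here; clara() needs the number of clusters up front.
# It clusters small subsamples, so it never builds the full
# n(n-1)/2 distance matrix.
fit <- clara(x, k = 3, samples = 5)

table(fit$clustering)   # number of rows assigned to each cluster
fit$medoids             # one representative row per cluster
```

Since clara() needs numeric input, it may be worth checking sapply(mldata, is.numeric) on the real data first; the "NAs introduced by coercion" warning suggests at least one column is not purely numeric.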