Carolina Bello
2013-Feb-08  23:25 UTC
[R] vegdist Error en double(N * (N - 1)/2) : tamaño del vector especificado es muy grande
From: caro bello <caro.bello58@gmail.com>
To: r-help@r-project.org
Cc:
Date: Fri, 8 Feb 2013 15:18:40 -0800 (PST)
Subject: vegdist Error en double(N * (N - 1)/2) : tamaño del vector
especificado es muy grande
Hi,
I have a problem with the vegdist function. I want to calculate a Jaccard
distance matrix from binary data.
The problem is that I have a matrix of 138037 rows (sites) and 89 columns
(species). My script is:
    library(vegan)  # provides vegdist()
    rm(list = ls(all = TRUE))
    gc()  # to clear anything left hidden in memory
    memory.limit(size = 100000)  # it gives 1 Tera from HDD in case RAM memory is over
    DF <- as.data.frame(MODELOS)
    DF <- na.omit(DF)
    DISTAN <- vegdist(DF[, 2:ncol(DF)], "jaccard")
Almost immediately it produces the error: Error en double(N * (N - 1)/2) :
tamaño del vector especificado es muy grande (in English: "vector size
specified is too large").
I think this is a memory error, but I don't know why it occurs, since I have a
PC with 32 GB of RAM and a 1 TB HDD.
I also tried to build a distance matrix with the dist function from the proxy
package. I did:
    library(proxy)
    vector <- dist(DF, method = "Jaccard")
It starts to run, but when it reaches 10 GB of RAM a window announces that R
has encountered an error and will close; it then closes and starts a new session.
I really don't know what is going on, much less how to solve it. Can anybody
help me?
Thanks,
Carolina Bello IAVH-COLOMBIA
Prof Brian Ripley
2013-Feb-09  09:51 UTC
[R] vegdist Error en double(N * (N - 1)/2) : tamaño del vector especificado es muy grande
Suppose N = 138037 (you haven't really told us). A dissimilarity half-matrix
would have 9 billion elements. The maximum size of a vector in current
versions of R is 2 billion. You will be able to get further in R-devel
(3.0.0-to-be) with a 64-bit version of R, although as you appear to be using
Windows it will be very slow, and 32GB of RAM is not enough to even store that
object.

What do you propose to do with a distance matrix on 140,000 objects? I think
you need to re-think whatever that is.

--
Brian D. Ripley
Professor of Applied Statistics, University of Oxford
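The numbers in this reply can be checked with a few lines of R. This is only a sketch of the arithmetic, assuming N = 138037 as stated in the question and 8 bytes per stored double. The helper n_pairs and the toy matrix are hypothetical names used for illustration, and base R's dist(method = "binary"), which is the Jaccard distance for 0/1 data, stands in for vegdist so that no extra package is needed.

```r
# Number of pairwise dissimilarities stored for N sites (lower triangle only)
n_pairs <- function(N) N * (N - 1) / 2

N <- 138037                      # rows reported in the question
pairs <- n_pairs(N)              # roughly 9.5e9 elements
limit <- .Machine$integer.max    # 2^31 - 1, the vector length limit before R 3.0.0

exceeds_limit <- pairs > limit   # TRUE, so the vector cannot even be allocated
gib_needed <- pairs * 8 / 2^30   # 8 bytes per double; far more than 32 GB

# Toy check (assumed data): a dist object really holds N * (N - 1) / 2 values
set.seed(1)
toy <- matrix(rbinom(10 * 89, 1, 0.3), nrow = 10)  # 10 made-up sites, 89 species
d <- dist(toy, method = "binary")
length(d) == n_pairs(nrow(toy))
```

Because the element count exceeds the length limit, the allocation is refused up front regardless of how much RAM or disk is available, which would explain why the error appears almost immediately.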