Hi,
An SVD of a 771 x 5677 matrix should be fine; it took about 30 seconds and
very little memory on my workstation. The problem most likely occurs when
you transform tdm2 into a matrix: the tdm2 object has a much greater size
than 771 x 5677, and so does tdm_matrix. Without a reproducible example we
cannot help you very well. Furthermore, I have no clue what needs to be
extracted from tdm2 as input for the SVD, because I have no experience
with the tm package.
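As a rough check (base R only; the timings are from my workstation, so only
indicative), a dense matrix of the reported size is only about 33 Mb and its
SVD runs without trouble, which is why I suspect the real objects are larger
than you think; comparing object.size() of your objects against it should
show where the memory goes:

## dense matrix of the reported dimensions, filled with random numbers
m <- matrix(rnorm(771 * 5677), nrow = 771, ncol = 5677)
format(object.size(m), units = "Mb")    # about 33 Mb
system.time(s <- svd(m))                # roughly 30 seconds here

## compare with the real objects from your session
format(object.size(tdm2), units = "Mb")
dim(as.matrix(tdm2))
format(object.size(as.matrix(tdm2)), units = "Mb")

One more thing to check: the warnings mention a total allocation of 3583 Mb,
which looks like a 32-bit R session; 64-bit R would let you use much more of
the 12 GB.

If the dense matrix really does turn out to be too large, one possible route
(untested with tm output on my side, and it assumes the Matrix and irlba
packages are available) is to keep the data sparse and compute only the
leading singular vectors, which is usually all that LSA needs:

library(Matrix)
library(irlba)
## build a sparse matrix from the triplet representation in tdm2
tdm_sparse <- sparseMatrix(i = tdm2$i, j = tdm2$j, x = tdm2$v,
                           dims = c(tdm2$nrow, tdm2$ncol),
                           dimnames = tdm2$dimnames)
## truncated SVD: only the first 100 singular triplets, for example
svd_part <- irlba(tdm_sparse, nv = 100)

The nv value is only an example; pick however many dimensions you want to
keep for the LSA.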
Good luck,
Paul
On 09/13/2011 10:24 AM, vioravis wrote:
> I am trying to perform Singular Value Decomposition (SVD) on a Term
> Document Matrix I created using the 'tm' package. Eventually I want to do
> a Latent Semantic Analysis (LSA).
>
> There are 5677 documents with 771 terms (the TDM is 771 x 5677). When I try
> to do the SVD, it runs out of memory. I am using a 12 GB dual-core machine
> with Windows XP and don't think I can increase the memory anymore. Are
> there any other memory-efficient methods to find the SVD?
>
> The term-document matrix is obtained using:
>
> tdm2 <-
> TermDocumentMatrix(tr1,control=list(weighting=weightTf,minWordLength=3))
> str(tdm2)
>
> List of 6
> $ i : int [1:6438] 202 729 737 278 402 621 654 718 157 380 ...
> $ j : int [1:6438] 1 2 3 7 7 7 7 8 10 10 ...
> $ v : num [1:6438] 8 5 6 9 5 7 5 6 5 7 ...
> $ nrow : int 771
> $ ncol : int 5677
> $ dimnames:List of 2
> ..$ Terms: chr [1:771] "access" "accessori" "accumul" "acoust" ...
> ..$ Docs : chr [1:5677] "1" "2" "3" "4" ...
> - attr(*, "class")= chr [1:2] "TermDocumentMatrix" "simple_triplet_matrix"
> - attr(*, "Weighting")= chr [1:2] "term frequency" "tf"
>
> The SVD is calculated using:
>
>> tdm_matrix <- as.matrix(tdm2)
>> svd_out<-svd(tdm_matrix)
> Error: cannot allocate vector of size 767.7 Mb
> In addition: Warning messages:
> 1: In matrix(0, n, np) :
> Reached total allocation of 3583Mb: see help(memory.size)
> 2: In matrix(0, n, np) :
> Reached total allocation of 3583Mb: see help(memory.size)
> 3: In matrix(0, n, np) :
> Reached total allocation of 3583Mb: see help(memory.size)
> 4: In matrix(0, n, np) :
> Reached total allocation of 3583Mb: see help(memory.size)
>
>
> Thank you.
>
> Ravi
>
>
>
--
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494
http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770