Displaying 13 results from an estimated 13 matches for "textmatrix".

2007 Aug 18
2
Problem with lsa package (data.frame) on Windows XP
Dear R team, The following piece of code (to use the lsa package) works fine on my mac os x, but when I run the same code on Windows XP, it doesn't work any more. ### code: library("lsa") matrix1 = textmatrix("C:\\Documents and Settings\\tine stalmans.TINE. 000\\LSA\\cuentos\\", stemming=TRUE, language="spanish", minWordLength=2, minDocFreq=1, stopwords=NULL, vocabulary=NULL) print(matrix1,bag_lines = 3, bag_cols = 3) matrix1 = lw_bintf(matrix1) * gw_idf(matrix1) space = lsa(matri...
2012 Feb 22
0
LSA package: problem with textmatrix()
I have a problem with the textmatrix() function of the LSA package whenever I specify 'removeNumbers=TRUE'. The data for the function are stored in a directory LSAwork which consists of a series of files that houses the text in column form. As long as removeNumbers = FALSE or it is not present the textmatrix function works j...
2006 Oct 03
1
new to R: don't understand errors
..., I still get the errors. So I am wondering if it might be something in the files themselves... At any rate I routinely get these two errors. The first is generated when I include a minDocFreq=x, and it looks a little like this when I run it: > data(stopwords_en) > CCauto = textmatrix( "CultureMineTXT" , minWordLength=3, minDocFreq=50, stopwords=stopwords_en) > Error in data.frame(docs = basename(file), terms = names(tab), Freq = tab, : > arguments imply differing number of rows: 1, 0 If I remove the minDocFreq, I get a differen...
2008 Mar 25
0
Error "... x must be atomic" when using lsa (latent semantic analysis) package
...s to be related to the number of documents being processed. Here's the code I'm running (after loading the lsa and rstem packages), and the error message: > SnippetsPath <- "c:\\OED\\AuditExplain\\" # path where to find text snippets > data(stopwords_en) > tdm <- textmatrix(SnippetsPath, stopwords=stopwords_en) I get this error message with ~ 280 documents: "Error in sort( unique.default(x), na.last = TRUE) : 'x' must be atomic" The error won't occur if I reduce the number of documents (say to 220, for instance). I'm not clear if this is...
2008 Mar 25
0
Solution to: Error "... x must be atomic" when using lsa (latent semantic analysis) package
...s to be related to the number of documents being processed. Here's the code I'm running (after loading the lsa and rstem packages), and the error message: > SnippetsPath <- "c:\\OED\\AuditExplain\\" # path where to find text snippets > data(stopwords_en) > tdm <- textmatrix(SnippetsPath, stopwords=stopwords_en) I get this error message with ~ 280 documents: "Error in sort( unique.default(x), na.last = TRUE) : 'x' must be atomic" The error won't occur if I reduce the number of documents (say to 220, for instance). I'm not clear if this is...
2007 Mar 15
2
Cannot allocate vector size of... ?
Hello all, I've been working with R & Fridolin Wild's lsa package a bit over the past few months, but I'm still pretty much a novice. I have a lot of files that I want to use to create a semantic space. When I begin to run the initial textmatrix( ), it runs for about 3-4 hours and eventually gives me an error. It's always "ERROR: cannot allocate vector size of xxx Kb". I imagine this might be my computer running out of memory, but I'm not sure. So I thought I would send this to the community at large for any help/thoughts. I...
2003 Sep 11
1
Customised legend in lattice
Hi List, Am trying to customize a legend in trellis: Draws 2x5 lines in 5 colors and 2 linetypes. I would like to add two more items to the legend showing the key for the line types above the colored legend. Any suggestions welcome - thanks Herry ############################# #Following example code: library(gregmisc) trellis.device(bg="white") i1=0 i2=-1.89767506 i3=-1.17087085
2012 Dec 13
3
Combined Marimekko/heatmap
Hi all, I'm trying to figure out a way to create a data graphic that I haven't ever seen an example of before, but hopefully there's an R package out there for it. The idea is to essentially create a heatmap, but to allow each column and/or row to be a different width, rather than having uniform column and row height. This is sort of like a Marimekko chart in appearance, except that
2012 Jan 13
4
Troubles with stemming (tm + Snowball packages) under MacOS
Dear all, I have some trouble using the stemming algorithm provided by the tm (text mining) + Snowball packages. Here is my config: MacOS 10.5 R 2.12.0 / R 2.13.1 / R 2.14.1 (I have tried several versions) I have installed all the needed packages (tm, rJava, RWeka, Snowball) + dependencies. I have deactivated AWT (as written in
2006 Oct 04
0
FW: new to R: don't understand errors
...yway group these low-frequency terms in a lower order factor. Of course you will still get an error if you use documents that are completely empty, so delete all 0-byte documents beforehand. I am thinking about what to do with this sanitizing part. It is not a good idea to integrate that into the textmatrix method -- it would slow things down tremendously. So what about this idea: does it make sense to provide a sanitizing collection of methods that help to select the files you want to work with (copy them to a different directory or just return a list with the filenames of the ones that are "go...
2012 Feb 26
2
tm_map help
...Completion, dictionary=dictCorpus) myDtm <- TermDocumentMatrix(myCorpus, control = list(minWordLength = 1)) m <- as.matrix(myDtm) v <- sort(rowSums(m), decreasing=TRUE) myNames <- names(v) d <- data.frame(word=myNames, freq=v) wordcloud(d$word, d$freq, min.freq=minFreq) list(freq=v, TextMatrix=myDtm) } qantas=hashTag("#qantas", 7)
2005 Nov 08
0
sorting during xtabs? sorting by "individual" order?
...# and create a dataframe F1frame = data.frame( docs="F1", terms=names(F1tab), Freq = F1tab, row.names = NULL) F2frame = data.frame( docs="F2", terms = names(F2tab), Freq = F2tab, row.names = NULL) (2) textmatrix function ... to be bound together for every file and to be converted with xtabs into a document term matrix: dummy = list(F1frame, F2frame) dtm = t(xtabs(Freq ~ ., data = do.call("rbind", dummy))) => docs terms F1 F2...
2007 Aug 21
2
Partial comparison in string vector
...r. Uwe Ligges Walter Rojas wrote: > Dear R team, > > The following piece of code (to use the lsa package) works fine on my > mac os x, but when I run the same code on Windows XP, it doesn't work > any more. > > ### code: > library("lsa") > matrix1 = textmatrix("C:\\Documents and Settings\\tine stalmans.TINE. > 000\\LSA\\cuentos\\", stemming=TRUE, language="spanish", > minWordLength=2, minDocFreq=1, stopwords=NULL, vocabulary=NULL) > print(matrix1,bag_lines = 3, bag_cols = 3) > matrix1 = lw_bintf(matrix1) * gw_idf(matrix...