Displaying 13 results from an estimated 13 matches for "textmatrix".
2007 Aug 18
2
Problem with lsa package (data.frame) on Windows XP
Dear R team,
The following piece of code (to use the lsa package) works fine on my
mac os x, but when I run the same code on Windows XP, it doesn't work
any more.
### code:
library("lsa")
matrix1 = textmatrix("C:\\Documents and Settings\\tine stalmans.TINE.000\\LSA\\cuentos\\",
stemming=TRUE, language="spanish",
minWordLength=2, minDocFreq=1, stopwords=NULL, vocabulary=NULL)
print(matrix1,bag_lines = 3, bag_cols = 3)
matrix1 = lw_bintf(matrix1) * gw_idf(matrix1)
space = lsa(matri...
2012 Feb 22
0
LSA package: problem with textmatrix()
I have a problem with the textmatrix() function of the LSA package whenever I specify 'removeNumbers=TRUE'. The data for the function are stored in a directory LSAwork which consists of a series of files that houses the text in column form. As long as removeNumbers = FALSE or it is not present the textmatrix function works j...
2006 Oct 03
1
new to R: don't understand errors
..., I still get the errors. So I am wondering
if it might be something in the files themselves...
At any rate I routinely get these two errors. The first is generated
when I include a minDocFreq=x, and it looks a little like this when I
run it:
> data(stopwords_en)
> CCauto = textmatrix( "CultureMineTXT" , minWordLength=3,
minDocFreq=50, stopwords=stopwords_en)
> Error in data.frame(docs = basename(file), terms = names(tab),
Freq = tab, :
> arguments imply differing number of rows: 1, 0
If I remove the minDocFreq, I get a differen...
2008 Mar 25
0
Error "... x must be atomic" when using lsa (latent semantic analysis) package
...s to be related to the number of documents being
processed. Here's the code I'm running (after loading the lsa and rstem
packages), and the error message:
> SnippetsPath <- "c:\\OED\\AuditExplain\\" # path where to find text snippets
> data(stopwords_en)
> tdm <- textmatrix(SnippetsPath, stopwords=stopwords_en)
I get this error message with ~ 280 documents: "Error in sort(
unique.default(x), na.last = TRUE) : 'x' must be atomic"
The error won't occur if I reduce the number of documents (say to 220, for
instance). I'm not clear if this is...
2008 Mar 25
0
Solution to: Error "... x must be atomic" when using lsa (latent semantic analysis) package
...s to be related to the number of documents being
processed. Here's the code I'm running (after loading the lsa and rstem
packages), and the error message:
> SnippetsPath <- "c:\\OED\\AuditExplain\\" # path where to find text snippets
> data(stopwords_en)
> tdm <- textmatrix(SnippetsPath, stopwords=stopwords_en)
I get this error message with ~ 280 documents: "Error in sort(
unique.default(x), na.last = TRUE) : 'x' must be atomic"
The error won't occur if I reduce the number of documents (say to 220, for
instance). I'm not clear if this is...
2007 Mar 15
2
Cannot allocate vector size of... ?
Hello all,
I've been working with R & Fridolin Wild's lsa package a bit over the
past few months, but I'm still pretty much a novice. I have a lot of
files that I want to use to create a semantic space. When I begin to run
the initial textmatrix( ), it runs for about 3-4 hours and eventually
gives me an error. It's always "ERROR: cannot allocate vector size of
xxx Kb". I imagine this might be my computer running out of memory, but
I'm not sure. So I thought I would send this to the community at large for any
help/thoughts.
I...
2003 Sep 11
1
Customised legend in lattice
Hi List,
Am trying to customize a legend in trellis: Draws 2x5 lines in 5 colors and
2 linetypes. I would like to add two more items to the legend showing the
key for the line types above the colored legend.
Any suggestions welcome - thanks Herry
#############################
#Following example code:
library(gregmisc)
trellis.device(bg="white")
i1=0
i2=-1.89767506
i3=-1.17087085
2012 Dec 13
3
Combined Marimekko/heatmap
Hi all,
I'm trying to figure out a way to create a data graphic that I haven't ever seen an example of before, but hopefully there's an R package out there for it. The idea is to essentially create a heatmap, but to allow each column and/or row to be a different width, rather than having uniform column and row height. This is sort of like a Marimekko chart in appearance, except that
2012 Jan 13
4
Troubles with stemming (tm + Snowball packages) under MacOS
Dear all,
I have some troubles using the stemming algorithm provided by the tm
(text mining) + Snowball packages.
Here is my config:
MacOS 10.5
R 2.12.0 / R 2.13.1 / R 2.14.1 (I have tried several versions)
I have installed all the needed packages (tm, rJava, rWeka, Snowball)
+ dependencies. I have deactivated AWT (like written in
2006 Oct 04
0
FW: new to R: don't understand errors
...yway group these low-frequency terms
in a lower order factor. Of course you will still get
an error if you use documents that are completely empty,
so delete all 0-byte documents beforehand.
I am thinking about what to do with this sanitizing part.
It is not a good idea to integrate that into the
textmatrix method -- it would slow things down
tremendously.
So what about this idea: does it make sense to provide a
sanitizing collection of methods that help to select the
files you want to work with (copy them to a different
directory or just return a list with the filenames of
the ones that are "go...
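A minimal sketch of the kind of sanitizing helper suggested above: it only filters out empty (0-byte) files before anything is handed to textmatrix. The function name nonempty_files and the demo file names are hypothetical and not part of the lsa package.

```r
# Hypothetical helper (not part of the lsa package): return the paths of
# all non-empty regular files in a directory.
nonempty_files <- function(dir) {
  paths <- list.files(dir, full.names = TRUE)
  info <- file.info(paths)
  # Keep entries that are not directories and have a size above 0 bytes.
  paths[!info$isdir & info$size > 0]
}

# Demo in a temporary directory with one non-empty and one empty file.
demo_dir <- file.path(tempdir(), "sanitize_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines("some text here", file.path(demo_dir, "doc1.txt"))
file.create(file.path(demo_dir, "empty.txt"))

keep <- nonempty_files(demo_dir)
print(basename(keep))
```

The surviving files could then be copied to a separate directory with file.copy(), or, assuming the installed version of textmatrix accepts a vector of file names (worth checking on its help page), passed to it directly.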
2012 Feb 26
2
tm_map help
...Completion, dictionary=dictCorpus)
myDtm <- TermDocumentMatrix(myCorpus, control = list(minWordLength = 1))
m <- as.matrix(myDtm)
v <- sort(rowSums(m), decreasing=TRUE)
myNames <- names(v)
d <- data.frame(word=myNames, freq=v)
wordcloud(d$word, d$freq, min.freq=minFreq)
list(freq=v, TextMatrix=myDtm)
}
qantas=hashTag("#qantas", 7)
2005 Nov 08
0
sorting during xtabs? sorting by "individual" order?
...# and create a dataframe
F1frame = data.frame( docs="F1", terms=names(F1tab),
Freq = F1tab, row.names = NULL)
F2frame = data.frame( docs="F2", terms = names(F2tab),
Freq = F2tab, row.names = NULL)
(2) textmatrix function
... to be bound together for every file and to be
converted with xtabs into a document term matrix:
dummy = list(F1frame, F2frame)
dtm = t(xtabs(Freq ~ ., data = do.call("rbind", dummy)))
=>
docs
terms F1 F2...
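A self-contained sketch of the xtabs step quoted above, using made-up term counts for two hypothetical documents. The contents of F1tab and F2tab are assumptions for illustration, and as.vector() is used here so that Freq is a plain count column; the original snippet passes the table itself.

```r
# Made-up term counts for two hypothetical documents.
F1tab <- table(c("cat", "dog", "cat"))
F2tab <- table(c("dog", "bird"))

# One long-format data frame per document: (docs, terms, Freq).
# as.vector() drops the table class so Freq is a plain count column.
F1frame <- data.frame(docs = "F1", terms = names(F1tab),
                      Freq = as.vector(F1tab), row.names = NULL)
F2frame <- data.frame(docs = "F2", terms = names(F2tab),
                      Freq = as.vector(F2tab), row.names = NULL)

# Stack the per-document frames and cross-tabulate the counts, then
# transpose so that terms are rows and documents are columns.
dummy <- list(F1frame, F2frame)
dtm <- t(xtabs(Freq ~ ., data = do.call("rbind", dummy)))
print(dtm)
```

Terms missing from a document come out as 0 in that document's column, which is what makes the result usable as a document term matrix.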
2007 Aug 21
2
Partial comparison in string vector
...r.
Uwe Ligges
Walter Rojas wrote:
> Dear R team,
>
> The following piece of code (to use the lsa package) works fine on my
> mac os x, but when I run the same code on Windows XP, it doesn't work
> any more.
>
> ### code:
> library("lsa")
> matrix1 = textmatrix("C:\\Documents and Settings\\tine stalmans.TINE.000\\LSA\\cuentos\\",
> stemming=TRUE, language="spanish",
> minWordLength=2, minDocFreq=1, stopwords=NULL, vocabulary=NULL)
> print(matrix1,bag_lines = 3, bag_cols = 3)
> matrix1 = lw_bintf(matrix1) * gw_idf(matrix...