Displaying 9 results from an estimated 9 matches similar to: "recursively count the words occurrence in the text files"
2012 Mar 23
1
how to cluster rows of words in a text file
Hi:
I am trying to cluster the rows of a text file with kmeans.
I load the data as follows:
file1 <- read.csv("somefile.csv")
The file contains the following lines of words:
> file1
1 word1 word3 word4 word1
2 word1 word4 word3 word1
3 word4 word2 word4 word3
4 word4 word2 word1 word3
5 word2 word2 word4 word2
file_as_matrix <- as.matrix(file1);
Now,
2005 Nov 08
0
sorting during xtabs? sorting by "individual" order?
Hi all,
while refactoring a package (before it is released),
I ran across the following problem.
I have two directories with different text files,
I want to read the first and construct a document-term
matrix from it (every term=word in a row, every file in
a column, occurrence frequencies form the values).
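The layout described above (terms in rows, files in columns, counts as values) can be sketched with base R. This is only an illustration under stated assumptions: the two "files" and their words are made up, and the poster's package may build the matrix differently.

```r
# Two toy "files", each already split into words (hypothetical data).
docs <- list(
  a.txt = c("word1", "word3", "word1"),
  b.txt = c("word2", "word3")
)

# Long format: one row per word occurrence, tagged with its file.
long <- data.frame(
  term = unlist(docs, use.names = FALSE),
  file = rep(names(docs), lengths(docs))
)

# Terms in rows, files in columns, occurrence counts as values.
tdm <- xtabs(~ term + file, data = long)
print(tdm)
```

Rows of the resulting table come out in the sorted order of the term levels, which is presumably where the sorting question in the subject line comes from.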
The second directory contains different files. It
needs to be read in to also
2009 Nov 03
3
reading tokens
Greetings,
I am not familiar with processing text in R. Can someone tell me how to
read each line of words as separate elements in a list?
For example, I would like to turn:
word1 word2 word3
word2 word4
into a list of length two with three character elements in the first list
and two elements in the second. I know that this should be easy, but I am a
little confused by the text functions.
Thanks in
2011 Sep 26
2
findAssocs()
I am trying to find the math behind the "tm" package's findAssocs().
?findAssocs does not say anything besides "association" and "correlate".
Usually entering "findAssocs" at the CLI gives the code for an R
function, but in this case I obtain:
function (x, term, corlimit)
UseMethod("findAssocs", x)
<environment: namespace:tm>
Any ideas?
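The output above is what R prints for an S3 generic: only the dispatcher is shown, and the actual code lives in class-specific methods, which can usually be listed with methods("findAssocs") and inspected with getAnywhere(). As a rough sketch of what such a method might compute, assuming the reported value is a Pearson correlation between term columns (an assumption, not confirmed by this thread), with made-up counts:

```r
# Toy document-term matrix: rows are documents, columns are terms
# (made-up counts, for illustration only).
dtm <- matrix(
  c(2, 1, 0,
    1, 1, 1,
    0, 0, 2,
    1, 2, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("doc", 1:4), c("word1", "word2", "word3"))
)

# Pearson correlation of word1's counts with every other term's counts.
assoc <- cor(dtm)["word1", ]
assoc <- assoc[names(assoc) != "word1"]

# Keep only associations at or above a lower limit, sorted descending.
corlimit <- 0.1  # hypothetical threshold
print(sort(assoc[assoc >= corlimit], decreasing = TRUE))
```

Here dtm, assoc and the threshold value are illustrative names and numbers, not taken from the tm package.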
2003 Nov 03
2
upgrade 2.2.8a -> 3.0 Debian DOS long filename problem
Hi,
I tried to upgrade from Samba 2.2.8a to 3.0.
Generally speaking it worked fine, and speed went up tremendously,
BUT
since then the DOS conversion of long file names is freaking me out.
Long file names used to produce a readable short version plus ~1 or ~x,
where x stands for a number.
Now long file names produce a cryptic 8-letter name plus extension.
listing in a DOS box
2017 Jul 07
1
How does findAssocs() calculate the correlation value ??
hi:
I want to know the math behind the "tm" package findAssocs().
I found that someone had asked this question before and received a good explanation from Rick:
http://r.789695.n4.nabble.com/findAssocs-td3845751.html
But I still don't understand how to calculate the correlation value between the two vectors.
For example:
# Correlation word2 with word3
2007 Jul 07
2
Extending/Modifying QueryParser
Hi,
I've implemented synonym searching in my rails application but have
an idea I'd like to implement but can't figure out how to do. The
idea is that I'd like to give the end user the choice on whether to
search for the synonym of a word or not. Preferably by extending the
query language to parse a construct similar to '%word1' and
1999 Jul 20
1
MS Word from samba share
A few days ago I sent a message titled "A samba question or a mutt question".
I got a few replies but they suggested wide work-arounds to the problem. I
have done some more work and it is clear it is not anything to do with
the mail side of what I was trying. The question or problem I have is
essentially this:-
How do you double click on a MS Word icon to bring the document into
MS Word
2010 Apr 04
1
How to add a column to dtm showing a part from directory source?
Hello Experts,
I'm new to R and having trouble with my graduation project. I have 20
subfolders containing almost 20000 txt files. What I need to do is create a
dtm and add a column to it showing the "class" information of the txt files.
My directory source is like "C:\\R\\20news-18828\\comp.graphics" for the
comp.graphics subfolder. I need to take only