thr3ads.net - R help - [R] Help with cleaning a corpus [Apr 2011]

vintersorg123

2011-Apr-18 14:00 UTC

[R] Help with cleaning a corpus

Hi!

I created a corpus and started cleaning it with this piece of code:

txt <- tm_map(txt, removeWords, stopwords("spanish"))
txt <- tm_map(txt, stripWhitespace)
txt <- tm_map(txt, tolower)
txt <- tm_map(txt, removeNumbers)
txt <- tm_map(txt, removePunctuation)

But something happened: some of the documents in the corpus became empty,
which is a problem when I try to build a document-term matrix with tf-idf weighting.
Is there any way to automatically remove a document if it becomes empty?

Or, to do it manually, how can I get the length of every document?

Hope you can help me! Thanks a lot.

Greetings!
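
For readers with the same problem, here is a minimal sketch of one way to measure document lengths and drop the empty ones. It is not an answer from the original thread. The helpers count_words and is_nonempty are hypothetical names, not part of tm, and the toy documents are made up for illustration. The tm part assumes a recent tm release (0.6 or later) that provides content(), content_transformer() and tm_filter(); it is skipped if tm is not installed.

```r
# Hypothetical helper: number of whitespace-separated tokens in each
# element of a character vector (0 for empty or blank strings).
count_words <- function(texts) {
  tokens <- strsplit(trimws(texts), "[[:space:]]+")
  vapply(tokens, function(tok) sum(nzchar(tok)), integer(1))
}

# Hypothetical helper: TRUE for elements that still contain a token.
is_nonempty <- function(texts) count_words(texts) > 0

# Plain strings standing in for documents after cleaning.
cleaned <- c("casa grande", "   ", "", "perro")
count_words(cleaned)                     # 2 0 0 1
kept <- cleaned[is_nonempty(cleaned)]    # "casa grande" "perro"

# The same idea applied to a tm corpus (assumes tm >= 0.6).
if (requireNamespace("tm", quietly = TRUE)) {
  library(tm)
  txt <- VCorpus(VectorSource(c("La casa es grande", "de la 123", "El perro")))

  # Lowercasing first lets removeWords also catch capitalised stopwords;
  # stripWhitespace goes last to tidy the gaps left by the other steps.
  txt <- tm_map(txt, content_transformer(tolower))
  txt <- tm_map(txt, removeWords, stopwords("spanish"))
  txt <- tm_map(txt, removeNumbers)
  txt <- tm_map(txt, removePunctuation)
  txt <- tm_map(txt, stripWhitespace)

  # Length of every document, in words. A document may span several
  # lines, so the per-line counts are summed.
  doc_len <- vapply(seq_along(txt),
                    function(i) sum(count_words(content(txt[[i]]))),
                    integer(1))

  # Keep only the documents that still contain at least one word.
  txt <- tm_filter(txt, FUN = function(doc) sum(count_words(content(doc))) > 0)
}
```

Filtering the corpus before building the document-term matrix keeps the corpus and the matrix rows aligned. Dropping all-zero rows from the matrix afterwards would be another option.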


--
View this message in context:
http://r.789695.n4.nabble.com/Help-with-cleaning-a-corpus-tp3457649p3457649.html
Sent from the R help mailing list archive at Nabble.com.