Displaying 1 result from an estimated 1 matches for "iter_terms_for_field".
2010 Oct 08
1
Get a list of all terms in an indexed corpus
Hello,
I have a corpus that I have indexed with xapian/xappy and I would now
like to generate a corpus-specific list of stopwords. (This is a
technical corpus, so a typical stopword list wouldn't be helpful.)
My first thought was to ask the xapian database for a list of terms
followed by their frequency. My intuition is that I could probably bring
together a list of stopwords by examining