Andreas Marienborg wrote:
> I was wondering if it is possible to figure out "popular terms" in a
> given set of documents (not the entire database, but lets say the 1000
> last articles).
You want to read the documentation for the Enquire::get_eset() method.
This takes a list of "relevant documents" (as an RSet object), and
returns a list of terms. The returned terms are ordered by a
weighting function that rewards terms which occur more frequently in the
documents in the RSet than in the corpus as a whole.
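To give a feel for what such a weighting rewards, here is a small pure-Python sketch. It is not Xapian's actual expand weighting: it swaps in a simple smoothed ratio (share of the chosen documents containing a term, divided by the share of all documents containing it). The function name popular_terms and the whitespace tokenisation are assumptions made for the illustration.

```python
from collections import Counter


def popular_terms(relevant_docs, corpus_docs, max_terms=10):
    """Rank terms by how over-represented they are in relevant_docs.

    Illustrative ratio only; this is NOT the formula Xapian uses.
    """
    def doc_freq(docs):
        # Count how many documents each term appears in.
        counts = Counter()
        for doc in docs:
            counts.update(set(doc.lower().split()))
        return counts

    rel_df = doc_freq(relevant_docs)
    all_df = doc_freq(corpus_docs)
    n_rel, n_all = len(relevant_docs), len(corpus_docs)

    scores = {}
    for term, r in rel_df.items():
        # Share of relevant docs with the term, divided by the
        # (add-one smoothed) share of corpus docs with the term.
        scores[term] = (r / n_rel) / ((all_df[term] + 1) / (n_all + 1))

    # Highest score first; ties broken alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))[:max_terms]
```

A term such as "the", which appears in every document, scores low even though it is frequent in the chosen set, while a term concentrated in the chosen set rises to the top.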
In a sense, this method is the dual of the get_mset() method:
get_mset() returns a list of documents given a list of terms (the
query), whereas get_eset() returns a list of terms given a list of
documents.
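As a rough sketch of how this might look with the Python bindings (the C++ calls have the same names): the helper name top_terms, the db_path argument and the docids list are hypothetical, and the document ids are assumed to exist in the database (for example, the ids of the last 1000 articles, however your application tracks them). Check the API documentation for your Xapian version before relying on the details.

```python
def top_terms(db_path, docids, max_terms=20):
    """Return (term, weight) pairs suggested by Enquire.get_eset().

    db_path and docids are placeholders supplied by the caller.
    """
    # Imported here so this file loads even where the bindings are absent.
    import xapian

    db = xapian.Database(db_path)
    enquire = xapian.Enquire(db)

    # Mark the chosen documents as the "relevant" set.
    rset = xapian.RSet()
    for docid in docids:
        rset.add_document(docid)

    # Ask for at most max_terms terms; the ESet is ordered best first.
    eset = enquire.get_eset(max_terms, rset)
    return [(item.term, item.weight) for item in eset]
```

Depending on the version of the bindings, item.term may be a byte string rather than text, so you may need to decode it before display.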
If you want to dig into the code of omega, you'll find that the
implementation of the topterms functionality there uses this method.
--
Richard