Displaying 3 results from an estimated 3 matches for "wdf_upper_bound".
2013 Sep 02
Backend for Lucene format indexes-How to get doclength
On Mon, Sep 02, 2013 at 09:21:48AM +0800, jiangwen jiang wrote:
> TfIdfWeight and BM25(b=0) also need wdf_upper_bound, which does not exist in
> Lucene backends.
If you don't provide an implementation of wdf_upper_bound(), the default
is to use the collection frequency of the term, so provided that
information is available in the lucene files, the lack of
wdf_upper_bound information isn't a show stopper....
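The fallback described above can be illustrated with a minimal sketch. It is not Xapian's code: the `TermStats` class and its fields are hypothetical, made up for illustration. It only shows why the collection frequency is a valid (if loose) upper bound on wdf: no single document can contain more occurrences of a term than the whole collection does.

```python
from typing import Dict, Optional


class TermStats:
    """Hypothetical per-term statistics; not a real Xapian class."""

    def __init__(self, wdf_by_doc: Dict[int, int],
                 stored_wdf_bound: Optional[int] = None):
        # Maps document id -> within-document frequency (wdf) of the term.
        self.wdf_by_doc = wdf_by_doc
        # A tighter bound, if the backend happens to store one.
        self.stored_wdf_bound = stored_wdf_bound

    def collection_freq(self) -> int:
        # Total occurrences of the term across all documents.
        return sum(self.wdf_by_doc.values())

    def wdf_upper_bound(self) -> int:
        # Prefer a backend-provided bound when one is available ...
        if self.stored_wdf_bound is not None:
            return self.stored_wdf_bound
        # ... otherwise fall back to the collection frequency.
        return self.collection_freq()


stats = TermStats({1: 3, 2: 1, 7: 5})
bound = stats.wdf_upper_bound()
print(bound)
```

The bound is exact only when the term occurs in a single document; in general it overestimates, which costs some efficiency in the matcher but does not affect correctness.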
2013 Aug 26
Backend for Lucene format indexes-How to get doclength
On Mon, Aug 26, 2013 at 09:41:07AM +0800, jiangwen jiang wrote:
> > For now, using weighting schemes which don't use document length is
> > probably the simplest answer.
>
> There's a tf-idf weighting scheme on svn master; is it suitable for a Lucene
> backend?
Yes - TfIdfWeight doesn't ever use the document length (at least with
the normalisations currently
2013 Mar 11
Implementation of the PL2 weighting scheme of the DFR Framework
...will always be negative, as in his thesis Amati states that
for the PL2 model, collection frequency of term << collection
size, and so lambda will always be less than one.) So, in
order to find the upper bound, I simply substituted wdf=1 for L
and used wdf = wdf_upper_bound for P and multiplied them by
using upper doc length bound and lower doc length
for wdfn of L and P respectively. However, this does not give
that tight a bound. Not a word has been spoken about
upper bounds on DFR weights in Amati's thesis or in his
papers on DFR. I eve...