search for: wdf_upper_bound

Displaying 3 results from an estimated 3 matches for "wdf_upper_bound".

2013 Sep 02
2
Backend for Lucene format indexes-How to get doclength
On Mon, Sep 02, 2013 at 09:21:48AM +0800, jiangwen jiang wrote:
> TfIdfWeight and BM25(b=0) also need wdf_upper_bound, it is not exists in
> Lucene backends.

If you don't provide an implementation of wdf_upper_bound(), the default is to use the collection frequency of the term, so provided that information is available in the lucene files, the lack of wdf_upper_bound information isn't a show stopper....
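
The fallback described in this reply can be sketched roughly as follows. This is a toy illustration, not Xapian's backend code: the function name, its parameters, and the `postings` dictionary are all hypothetical. It relies only on the fact that a term cannot occur more often in one document than it does in the whole collection, so the collection frequency is always a valid (if loose) upper bound on wdf.

```python
from typing import Optional


def wdf_upper_bound(collection_freq: int, stored_bound: Optional[int] = None) -> int:
    """Return an upper bound on a term's within-document frequency (wdf).

    Hypothetical helper: `stored_bound` stands for a tighter bound that a
    backend might record; when it is missing, fall back to the term's
    collection frequency.
    """
    if stored_bound is not None:
        return stored_bound
    return collection_freq


# Toy postings for a single term: document id -> wdf (made-up numbers).
postings = {1: 3, 2: 1, 3: 5}

# Collection frequency is the total number of occurrences of the term.
collection_freq = sum(postings.values())

# The fallback bound can never be smaller than the largest real wdf.
fallback = wdf_upper_bound(collection_freq)
assert max(postings.values()) <= fallback
```

The fallback is safe but may be far from tight for common terms, which only affects how aggressively a matcher could prune, not the correctness of the weights.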
2013 Aug 26
2
Backend for Lucene format indexes-How to get doclength
On Mon, Aug 26, 2013 at 09:41:07AM +0800, jiangwen jiang wrote:
> > For now, using weighting schemes which don't use document length is
> > probably the simplest answer.
>
> There's tf-idf weighting scheme on svn master, is it suitable for lucene
> backend?

Yes - TfIdfWeight doesn't ever use the document length (at least with the normalisations currently
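
To see why a tf-idf style scheme can work without document lengths, here is a minimal sketch of a textbook tf-idf weight. It is an assumption-laden illustration rather than the TfIdfWeight implementation: the function name, the choice of raw wdf, and the `log(N / termfreq)` form of idf are picked for the example only.

```python
import math


def tfidf_weight(wdf: int, doc_count: int, term_freq: int) -> float:
    """Textbook tf-idf weight for one term in one document.

    wdf       -- occurrences of the term in this document
    doc_count -- number of documents in the collection
    term_freq -- number of documents that contain the term
    No document length appears anywhere in the formula.
    """
    idf = math.log(doc_count / term_freq)
    return wdf * idf


# Made-up statistics: 100 documents, 10 of which contain the term.
weight = tfidf_weight(wdf=2, doc_count=100, term_freq=10)

# An upper bound on the weight follows from an upper bound on wdf alone.
max_weight = tfidf_weight(wdf=9, doc_count=100, term_freq=10)
assert weight <= max_weight
```

Because the weight in this form grows with wdf, plugging a wdf upper bound into the same formula gives an upper bound on the weight, which is where wdf_upper_bound comes in.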
2013 Mar 11
1
Implementation of the PL2 weighting scheme of the DFR Framework
...will always be negative, as in his thesis Amati states that for the PL2 model, collection frequency of term << collection size, and so lambda will always be less than one.) So, in order to find the upper bound, I simply substituted wdf=1 for L and used wdf = wdf_upper_bound for P, and multiplied them, using the upper doc length bound and the lower doc length bound for the wdfn of L and P respectively. However, this does not give that tight a bound. Not a word has been spoken about upper bounds on DFR weights in Amati's thesis or in his papers on DFR. I eve...