thr3ads.net - search: "wdf_upper

Displaying 3 results from an estimated 3 matches for "wdf_upper_bound".

Backend for Lucene format indexes-How to get doclength

2013 Sep 02

On Mon, Sep 02, 2013 at 09:21:48AM +0800, jiangwen jiang wrote:
> TfIdfWeight and BM25(b=0) also need wdf_upper_bound, it is not exists in
> Lucene backends.

If you don't provide an implementation of wdf_upper_bound(), the default is to use the collection frequency of the term, so provided that information is available in the lucene files, the lack of wdf_upper_bound information isn't a show stopper....
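The fallback described in this reply can be illustrated with a minimal sketch. It is not the library's code: the function name, the `postings` layout (document id mapped to wdf for a single term), and the `stored_bound` parameter are all assumptions made for illustration. It only shows why the collection frequency is always a valid, if loose, upper bound on wdf: it is the sum of every wdf for the term, so no single wdf can exceed it.

```python
from typing import Dict, Optional


def wdf_upper_bound(postings: Dict[int, int],
                    stored_bound: Optional[int] = None) -> int:
    """Return an upper bound on the wdf of one term.

    `postings` maps document id -> wdf for the term (hypothetical layout).
    `stored_bound` stands in for a bound the backend may or may not provide.
    """
    if stored_bound is not None:
        # The backend supplied its own bound, so use it directly.
        return stored_bound
    # Fallback: the collection frequency, i.e. the sum of all wdf values.
    # Since wdf values are non-negative, each one is <= this sum.
    return sum(postings.values())


# Hypothetical postings for one term: three documents.
postings = {1: 3, 7: 1, 9: 5}
fallback = wdf_upper_bound(postings)
```

The fallback is safe but can be far from tight when a term occurs in many documents, which is the trade-off a backend accepts by not storing a dedicated bound.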

Backend for Lucene format indexes-How to get doclength

2013 Aug 26

On Mon, Aug 26, 2013 at 09:41:07AM +0800, jiangwen jiang wrote:
> > For now, using weighting schemes which don't use document length is
> > probably the simplest answer.
>
> There's tf-idf weighting scheme on svn master, is it suitable for lucene
> backend?

Yes - TfIdfWeight doesn't ever use the document length (at least with the normalisations currently

Implementation of the PL2 weighting scheme of the DFR Framework

2013 Mar 11

...will always be negative, as in his thesis Amati states that for the PL2 model the collection frequency of a term << collection size, and so lambda will always be less than one.) So, in order to find the upper bound, I simply substituted wdf = 1 for L and used wdf = wdf_upper_bound for P, and multiplied them, using the upper and lower doc length bounds for the wdfn of L and P respectively. However, this does not give that tight a bound. Not a word has been spoken about upper bounds on DFR weights in Amati's thesis or in his papers on DFR. I eve...
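The bounding idea in this snippet can be sketched as follows. This is an illustration under stated assumptions, not the poster's implementation: it assumes the commonly stated PL2 form, with wdfn = wdf * log2(1 + c * avg_len / doc_len), L = 1 / (wdfn + 1), P = wdfn * log2(wdfn / lambda) + (lambda - wdfn) * log2(e) + 0.5 * log2(2 * pi * wdfn), and weight = L * P. All function and parameter names are hypothetical. The bound also relies on P being increasing over the wdfn range and non-negative at its maximum, which holds when wdfn stays above lambda.

```python
import math


def wdfn(wdf, doc_len, avg_len, c=1.0):
    """Length-normalised wdf (assumed PL2 normalisation)."""
    return wdf * math.log2(1.0 + c * avg_len / doc_len)


def pl2_parts(t, lam):
    """Return the (L, P) factors of the assumed PL2 weight for wdfn = t."""
    l_part = 1.0 / (t + 1.0)
    p_part = (t * math.log2(t / lam)
              + (lam - t) * math.log2(math.e)
              + 0.5 * math.log2(2.0 * math.pi * t))
    return l_part, p_part


def pl2_weight(wdf, doc_len, avg_len, lam, c=1.0):
    """Assumed PL2 weight of a term in one document."""
    l_part, p_part = pl2_parts(wdfn(wdf, doc_len, avg_len, c), lam)
    return l_part * p_part


def pl2_loose_upper_bound(wdf_upper_bound, len_lower, len_upper,
                          avg_len, lam, c=1.0):
    """Bound L and P separately, then multiply the two maxima.

    L is largest at the smallest wdfn: wdf = 1 in the longest document.
    P is largest at the largest wdfn: wdf = wdf_upper_bound in the
    shortest document (assuming wdfn > lam throughout).
    """
    t_min = wdfn(1, len_upper, avg_len, c)
    t_max = wdfn(wdf_upper_bound, len_lower, avg_len, c)
    l_max, _ = pl2_parts(t_min, lam)
    _, p_max = pl2_parts(t_max, lam)
    return l_max * p_max


# Hypothetical collection statistics, chosen only for illustration.
bound = pl2_loose_upper_bound(wdf_upper_bound=10, len_lower=20,
                              len_upper=400, avg_len=100.0, lam=0.01)
```

Because the two maxima are reached at opposite ends of the wdfn range, no single document attains both, so the product overshoots the largest achievable weight. That matches the remark that the bound is not very tight.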
