James Kim
2007-Apr-06  07:36 UTC
[Ferret-talk] Count frequency of term in a specific document?
Is there any way to count the frequency of a specific term in one document? I can't find any method for this. Do you know of one?
-- Posted via http://www.ruby-forum.com/.
David Balmain
2007-Apr-06  08:19 UTC
[Ferret-talk] Count frequency of term in a specific document?
On 4/6/07, James Kim <sjoonk at gmail.com> wrote:
> Is there any way to count the frequency of specific term in one
> document?
> I can't find any method... Do you?

Hi James,

Caleb and I answered your previous post but here is my answer again. You can find the frequency without storing term-vectors. Simply use the TermDocEnum and skip to the document you are interested in.

    tde = index.reader.term_docs_for(:field, 'term')
    tde.skip_to(100)
    # now check that we are at the correct document. If there are no
    # instances of 'term' in document 100 then it will skip to the next
    # document with an instance of the term 'term'
    frequency = tde.doc == 100 ? tde.freq : 0
    puts "frequency of field:term in document 100 is #{frequency}"

Here is a full working example:

    require 'rubygems'
    require 'ferret'

    index = Ferret::I.new
    index << 'one'
    index << 'one two one three one four one' # doc 1
    index << 'one'
    index << 'no 1s' # doc 3
    index << 'one'

    def get_frequency(index, doc_num, term, field = :id)
      tde = index.reader.term_docs_for(field, term)
      tde.skip_to(doc_num)
      return tde.doc == doc_num ? tde.freq : 0
    end

    puts get_frequency(index, 1, 'one') #=> 4
    puts get_frequency(index, 3, 'one') #=> 0

--
Dave Balmain
http://www.davebalmain.com/
James Kim
2007-Apr-07  09:17 UTC
[Ferret-talk] Count frequency of term in a specific document?
Thanks! One more question. I'd like to count the frequency of every term in a specific document. The solution above counts the frequency of a single term. Is there any way to gather the frequencies of all terms in a specific document?
-- Posted via http://www.ruby-forum.com/.
David Balmain
2007-Apr-07  09:50 UTC
[Ferret-talk] Count frequency of term in a specific document?
On 4/7/07, James Kim <sjoonk at gmail.com> wrote:
>
> Thanks!
>
> One more. I'd like to count all term's frequency on a specific document.
> Upper solution is for count one term's frequency. Is there any way to
> gather all term's frequency of a specific document?

Yes, for this you need to store term-vectors with positions. That will allow you to count the frequency of all terms in the document.

--
Dave Balmain
http://www.davebalmain.com/
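
To illustrate why a term vector with positions is enough to answer this, here is a minimal plain-Ruby sketch. It does not use Ferret's API; the helper names term_positions and term_frequencies and the simple word tokenizer are assumptions made up for this example. The idea is that a term vector with positions maps each term in one document to the list of positions where it occurs, so the frequency of a term is just the length of that list.

```ruby
# Toy model of a per-document term vector with positions (not Ferret's API).
# Maps each term to the list of token positions at which it occurs.
def term_positions(text)
  positions = Hash.new { |hash, term| hash[term] = [] }
  # Hypothetical tokenizer: lowercase the text and split on word characters.
  text.downcase.scan(/\w+/).each_with_index do |term, pos|
    positions[term] << pos
  end
  positions
end

# The frequency of each term is the number of positions recorded for it.
def term_frequencies(text)
  term_positions(text).map { |term, pos_list| [term, pos_list.size] }.to_h
end

freqs = term_frequencies('one two one three one four one')
freqs.each { |term, count| puts "#{term}: #{count}" }
```

With a real Ferret index the positions would come from the term vector stored for the document and field rather than from re-tokenizing the text, but the counting step would be the same.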