Displaying 20 results from an estimated 500 matches similar to: "How can I count frequency of terms in a document?"

2007 Apr 06
3
Count frequency of term in a specific document?
Is there any way to count the frequency of a specific term in one document? I can't find any method... Do you? -- Posted via http://www.ruby-forum.com/.
2007 Jun 12
5
index browser inconsistent with IndexReader
Hi, We have an index of around 1M web pages as part of our web app. The app uses ferret by way of RDig to perform searches. We have noticed anecdotally that some searches don't work the way we thought they should, as if documents were missing from the index. Yesterday we came upon a concrete instance of this. Our documents have several fields, one of which is called :keywords and
2007 Mar 20
2
Strange Results For Term Frequencies
I would like to thank all the people who have contributed to this very fine project. Great work! I've encountered some strange results while examining the term frequency of one of my indexed documents. The indexed terms seem to vary for the very same document depending on the presence or absence of completely unrelated operations in the code, so the resulting term frequency changes, too.
2006 Nov 22
2
crash while retrieving term vectors
This program reliably crashes for me (usually a segfault): require 'rubygems' require 'ferret' reader=Ferret::Index::IndexReader.new ARGV fields=reader.field_infos.fields reader.max_doc.times{|n| fields.each{|field| reader.term_vector(n,field) } unless reader.deleted?(n) print "."; STDOUT.flush } As you can see, it just goes through
2007 Feb 16
8
term vector blues
I have a lot of crashes when I try to use term vectors. Here's an example, which crashes pretty consistently. This problem seems to be somewhat sensitive to platform... people on other OSes and ruby versions have reported no error. I have seen this with ferret 0.10.13 and 0.10.14 on debian stable using ruby 1.8.2, but I have observed the same problem on various other systems as
2006 Oct 11
6
Indexing problem 10.9/10.10
Sorry if this is a repost- I wasn't sure if the www.ruby-forum.com list works for postings. I've been having trouble with indexing a large amount of documents (2.4M). Essentially, I have one process that is following the tutorial dumping documents to an index stored on the file system. If I open the index with another process, and run the size() method it is stuck at a number
2006 May 12
4
validates_uniqueness_of with a condition
Hi, I need to check the uniqueness of an attribute (a doc number) using a condition (a specific year). I've found validates_uniqueness_of :number, but I need to tell it I just want to check a specific year. I've found :scope but I haven't really understood its meaning. Can I scope on a specific year? Thanks, Enrico -- "The only thing necessary for the
2006 May 26
8
Comparing two documents in the index
I want to compare two documents in the index (i.e. retrieve the cosine similarity/score between two documents' term vectors). Is this possible using the standard Ferret functionality? Thanks in advance, Jeroen Bulters -- Posted via http://www.ruby-forum.com/.
2005 Feb 25
2
Bug in TermIterator::skip_to() ?
Hi all, I've been toying with xapian (mostly using the Python bindings) and I think I've hit a bug in the TermIterator::skip_to() method (or maybe in QuartzAllTermsList::skip_to()). I've attached a c++ source file that demonstrates the issue. In short, if you have a WritableDatabase, ask for the all-terms TermIterator with db.allterms_begin(), and then skip_to() a word that is itself
2007 Mar 09
5
highlighting problem
Hi, I've been having a problem getting highlighting to work with aaf. I have a class defined as follows: class Link < ActiveRecord::Base acts_as_ferret :fields => { :description => { :store => :yes } } end I get back the correct results when I do Link.find_by_contents, however, I'd like to highlight them. If I do something like iterate through the list of
2005 Nov 26
3
Get number of found documents
Hi David again. I would say that Ferret works great with Rails. And now I am trying to create pagination. Because the site could have millions of documents, I need to create page links such as "Page #100". A rather usual situation. But to create these links I need to know how many documents Ferret found in the index. For now I am doing it with the following code index =
2006 Jun 04
20
Proposal of some radical changes to API
Hey guys, Now that the Lucy[1] project has Apache approval and is about to begin, the onus is no longer on Ferret to strive for Lucene compatibility. (We'll be doing that in Lucy). So I'm starting to think about ways to improve Ferret's API. The first part that needs to be improved is the Document API. It's annoying having to type all the attributes to
2006 Oct 12
3
Ferret::StateError while using acts_as_ferret
I'm fairly new to ferret / aaf and finding it much easier to use than HyperEstraier (which I migrated from). However, I am getting a few errors and I need to figure out if they're problems with my usage of ferret or a bug I should report. I'm currently running Ferret 0.10.11 with acts_as_ferret (latest via svn external) and 3 times today I've seen the
2005 Dec 02
4
How to get the count of matching documents
I'm trying to generate a rails pagination helper for some ferret search results, and I need to know how many total matches there are to my search query. I don't see an obvious way of finding this. Any help would be appreciated. Thanks, Carl Youngblood
2006 Nov 10
2
A new attack
Log report is reporting a lot of these lately.. following is just a short snippet from the beginning on one server. WARNING!!!! Possible Attack: Attempt from 104.29.broadband2.iol.cz [83.208.29.104] with: command=HELO/EHLO, count=3 : 1 Time(s) Attempt from 106.7.broadband7.iol.cz [88.102.7.106] with: command=HELO/EHLO, count=3 : 1 Time(s) Attempt from
2006 Sep 05
4
Ferret 0.10.2 - Index#search_each() and :num_docs
Hi, I seem to be having trouble getting more than 10 hits from Index#search_each since upgrading to 0.10.2 (ie, this was working in 0.9.4). Maybe a bug, as the #search_each doesn't seem to use the options parameter any more? Thanks, Neville =========================================== require 'rubygems' require 'ferret' p Ferret::VERSION idx =
2008 Mar 27
6
Problems pinging PC on tunnel
Hello! I have set up a tunnel between a FreeBSD machine and Windows Vista. The tunnel is established, but when I try to ping either end, the ping fails. I have temporarily switched off firewalls on both machines, no luck. Here is client tinc.conf on Vista: Name = lenovo_client ConnectTo = lenovo_server Interface = tinctap Subnet = 10.20.40.0/24 Server tinc.conf on FreeBSD: Device=/dev/tap0
2007 Apr 28
6
Determine how many documents a term occurs in
Is there a fast way to determine how many documents a term occurs in, besides iterating through every document with TermDocEnum? -- Best regards, Stian Grytøyr
2007 Mar 01
2
FerretHash
Dave, thank you so much for the 0.11 release(s). You have solved many problems for me. As part of my appreciation for your good works, I am offering up for public consideration a silly little class that I wrote. (Code is below.) This class offers a simplified Hash-like interface to (a very restricted subset of) Ferret. Hence I call it FerretHash. FerretHash comes with its very own pet Ferret
2006 Aug 28
12
Help with Multiple Readers, 1 Writer scenario
Hi, I'm building a web server application using Ferret [thanks so much Dave], Mongrel and Camping which works fine servicing one request at a time, but serialises searches if more than one request arrives, so I'd like some advice please about the best way to use multiple readers and one writer. Some background ... query requests which in my case are always read only, arrive via