Displaying 20 results from an estimated 30000 matches similar to: "Key-Value extraction methods"
2007 Aug 23
4
index all but search in some fields
Hi,
I'd like to index most of my model fields, but limit the search to a
(changing) subset of these fields.
2006 Apr 22
2
Ferret C Indexer Error: Fields not stored in index?
Hello,
I am trying to get Ferret's C indexer to work on OpenSUSE 10 and
fastcgi.
Indexing documents appears to work correctly but when I try to display
the results I receive the following error:
ActionView::TemplateError (undefined method `string_value' for
stored/uncompressed,indexed,tokenized,<title:Revit.jpg>:Ferret::Document::Field)
on line #17 of
2006 Feb 07
1
setting of :key to :id in cFerret
Hi Dave,
I've been reading this post below back in December 2005.
Is it possible to set :key to :id in cFerret like suggested below?
Thanks,
Mac
On 12/3/05, Carl Youngblood <carl at youngbloods.org> wrote:
> I seem to be getting the same document multiple times in my search
> results. I'm wondering if
2006 Sep 22
1
Query Objects vs. Query Strings
Hi ..
I tried to build some query objects to get some documents from my
index.. without success.. Is something wrong here?
q = Ferret::Search::BooleanQuery.new
q1 = Ferret::Search::TermQuery.new(:type, "movie")
q2 = Ferret::Search::TermQuery.new(:name, "Indiana")
q.add_query(q1, :should)
q.add_query(q2, :should)
Indexer.index.search_each(q) do |doc, score| puts doc end
0
2007 Aug 23
0
Language support in ferret
Hi,
I am using Ferret 0.10.9. I have indexed a whole set of data using the
standard tokenizer and stem filter. It stems well for English
characters, but when I enter any non-English character the whole application
crashes, although the index doesn't get corrupted. Instead of crashing
it should at least show no results. Am I missing something?
2006 Apr 05
3
Missing Ferret 0.9.0 Field methods
The following instance methods seem to be missing from the
Ferret::Document::Field class in Ferret 0.9.0 using compiled C
extensions: #string_value, #reader_value and #binary_value. (They are in
the pure Ruby implementation.)
I got round it by mixing in hacked versions of the pure Ruby methods
(@data replaced by self.data).
Many thanks for Ferret; it's shaping up to be a killer app
2007 May 10
0
Large index performance = 8x decrease
hi,
i'm indexing a really large db table (~4.2 million rows). i've noticed
that after ~2M records, index performance decreases by almost an order
of magnitude. full dataset graph here:
http://i122.photobucket.com/albums/o244/spokeo/indexer-data.jpg
here are a couple of best-fit lines that represent the data points:
0-2M : y = 78.65655x + 144237.5
2.5M+ : y = 10.79832x +
2006 Sep 26
4
Some documents not found
I'm a Ferret newbie, so hopefully I'm missing something simple :)
I am using ferret to index data about 36,000 products from a MySQL
database. The index has one document for each product, with these
important fields:
id: the id (unique) of the product record in the database
content: a concatenation of several bits of information from the product
and associated records
I
2008 Mar 05
0
Index Searcher Causes GC Memory Error: "irb: double free or corruption"
My Linux Ruby application uses Ferret 0.11.4. I created my own IndexSearcher class to wrap
a Searcher over multiple directories. If I do not call searcher.close, exiting
runner/console or runner/server produces this system error:
*** glibc detected *** irb: double free or corruption (fasttop): 0x0a51d6c0 ***
======= Backtrace: =========
/lib/libc.so.6[0x638ac1]
2006 Jul 24
0
error searching for a boolean query
Hey ..
i'm not sure if the trac is currently maintained, so i'll post this here
as well, just to make sure :)
http://ferret.davebalmain.com/trac/ticket/94
i get a segfault on certain queries.. i guess that's a problem with the
query parser..
>> Indexer.index.search( "american~0.6 AND NOT type:Language" )
*** glibc detected *** double free or corruption
2007 Jun 07
0
Unique :key not maintained after add_indexes?
Hi,
When adding an index to another one using add_indexes I get duplicates
even though I use the :key attribute. For example:
def test_add_indexes_uniqueness
index1 = Ferret::Index::Index.new(:key => :id)
index2 = Ferret::Index::Index.new(:key => :id)
# Add two items with same id
index1 << {:id => 23, :data => "This is the data..."}
2001 Jan 27
1
//server/sharename%user question
I have a problem I'm trying to solve. We're using Windows 98 machines as
the client computers. Every user has a home directory and can access
their files through the [homes] interface.
I would like to be able to access multiple home directories at the same
time in the case where 2 people may be working on a project and need
access to both their home directories at the same time.
2008 Apr 20
1
Picolena, a ferret+rails documents search engine
Hi everybody!
I am proud to present to you a small project I have been working on for a
while:
Picolena, a documents search engine written in Rails.
( http://picolena.devjavu.com/ ).
It obviously uses Ferret for indexing and searching, and adds some plain
text extractors in order to index OpenOffice.org, PDF and MS Office
documents (and some others as well).
Everything is packed in a gem (gem install
2005 Dec 14
2
undefined method `add' for Ferret::Search::BooleanQuery
Up to now in my ferret development I have been using simple
single-word strings as my search queries. I just now am trying to
increase the complexity of my queries. When I was passing a single
word with no spaces in my index searches, like so:
count = index.search_each('testing') do |d, s|
...
end
everything worked fine. But now when I do something like this:
count =
2006 Mar 25
1
RDig - ferret-based website crawler/indexer
Hi!
RDig is a small tool to build a Ferret index for the contents of a
website or intranet. It contains a simple HTTP crawler and some support
for extracting textual content from the fetched pages.
I built this to implement a site-wide search for a recent project
that combined a Rails application with lots of static html files
generated by a CMS.
Any feedback is very welcome!
Rubyforge
2007 Jan 27
0
concurrency errors adding to a keyed index
Hi,
I'm adding some news articles to a keyed Ferret 0.10.14 index and
encountering quite serious instability when concurrently reading and
writing to the index, even with just 1 writer and 1 reader
process.
If I recreate the index without a key, concurrent reading and writing
seem to work fine (and indexing is about 10 times quicker :)
I'm testing by running my indexing
2006 Mar 31
3
undefined method `<=>' for :id:Symbol
Upgrading to 0.9.0, I have the following error. Anybody?
c:/ruby/lib/ruby/gems/1.8/gems/ferret-0.9.0/lib/ferret/index/term.rb:35:in
`<=>': undefined method `<=>' for :id:Symbol
(NoMethodError)
from
c:/ruby/lib/ruby/gems/1.8/gems/ferret-0.9.0/lib/ferret/index/term_infos_io.rb:263:in
`get_index_offset'
from
2007 Jan 22
1
Ferret-talk Digest, Vol 15, Issue 8
Hi everyone, thank you for the help last time.
A quick question, through rereading the ferret tutorial I realized
that by adding :key => :id to the index loading, I could access my
documents through index["11"], in addition to using the doc_id from
Ferret through index[122]... This is great, and saves me a line or two
in a lot of places in my code. However, is there a way of extracting
2005 Dec 02
1
Compile error on FreeBSD 4.10 gcc 2.95.4
FYI, I tried installing Ferret on my FreeBSD virtual server and got this:
retango# gem install ferret --include-dependencies
Attempting local installation of 'ferret'
Local gem file not found: ferret*.gem
Attempting remote installation of 'ferret'
Updating Gem source index for: http://gems.rubyforge.org
Building native extensions. This could take a while...
2007 Aug 03
0
StandardTokenizer Doesn't Support token_stream method
According to the Analyzer doc and the StandardTokenizer doc:
http://ferret.davebalmain.com/api/classes/Ferret/Analysis/Analyzer.html
http://ferret.davebalmain.com/api/classes/Ferret/Analysis/StandardTokenizer.html
I ought to be able to construct a StandardTokenizer like this:
t = StandardTokenizer.new(true) # true to downcase tokens
and then later:
stream = token_stream(