search for: nutch

Displaying 11 results from an estimated 11 matches for "nutch".

2008 Jan 25
5
Ferret+Lucene Index
We use Nutch and Lucene for our heavy-duty text analysis jobs but I'm trying to use Ferret to do some experiments. I understood that Ferret used the same index format as Lucene but I cannot look into a Lucene index with Ferret and cannot read a Ferret index with Luke (the Lucene index browser). Am I doing som...
2006 Mar 25
1
RDig - ferret-based website crawler/indexer
Hi! RDig is a small tool to build a Ferret index for the contents of a website or intranet. It contains a simple HTTP crawler and some support for extracting textual content from the fetched pages. I built this to implement a site-wide search for a recent project that combined a Rails application with lots of static html files generated by a CMS. Any feedback is very welcome! Rubyforge
2008 Jun 28
3
Commercial Rails CMS
Hi: Are there any commercial Rails based CMS? I searched but I have not found any. Any pointer? Cheers Rob
2006 May 15
16
Ferret not able to read a Lucene Index?
Hi all, Having problems trying to get Ferret to read an index generated by Lucene. Am I right in thinking Ferret should be able to read a Lucene generated index no problem? Using the code snippets detailed in http://www.ruby-forum.com/topic/64099#new Any advice gratefully received. Many Thanks, Steven
2005 Dec 02
43
ANN: acts_as_ferret
Hi all This week I have worked with Rails and Ferret to test Ferret's (and Lucene's) capabilities. I decided to make a mixin for ActiveRecord as it seemed the simplest possible solution and I ended up making this into a plugin. For more info on Ferret see: http://ferret.davebalmain.com/trac/ The plugin is functional but could easily be refined. Anyway I want to share it with you. Regard it as a
2005 Dec 02
43
ANN: acts_as_ferret
Hi all This week I have worked with Rails and Ferret to test Ferret's (and Lucene's) capabilities. I decided to make a mixin for ActiveRecord as it seemed the simplest possible solution and I ended up making this into a plugin. For more info on Ferret see: http://ferret.davebalmain.com/trac/ The plugin is functional but could easily be refined. Anyway I want to share it with you. Regard it as a
2007 Apr 21
5
Thinking of using aaf- looking for advice
Hi- I'm technical lead at Lingr (http://www.lingr.com), a chatroom-based social networking site. We've currently got several million user utterances stored in MySQL, and we're looking to build a local search functionality. I've played around with aaf and I really like it, but I have some questions. 1. Is anyone out there using aaf to index a corpus of this
2010 Jun 16
6
clustered file system of choice
Hi all, I am just trying to consider my options for storing a large mass of data (tens of terabytes of files) and one idea is to build a clustered FS of some kind. Has anybody had any experience with that? Any recommendations? Thanks in advance for any and all advice. Boris.
2005 Dec 19
17
Indexing so slow......
I am indexing over 10,000 rows of data. It is very slow when indexing rows 100, 1,000, and 10,000, and more than an hour has now passed at row 10,000. How can I make it faster? Here is my code: ================== doc = Document.new doc << Field.new("id", t.id, Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("title", t.title,
2006 Jan 02
11
aligning Ferret's IndexSearcher.search API with Lucene's
Recently I've been revisiting some of my search code. With a greater understanding of how Java Lucene implements its search methods, I realized that one level of abstraction is not present in the Ferret classes/methods. Here are the relevant method signatures: Ferret's search methods: in Ferret::Index::Index: search(query, options = {}) -> returns a TopDocs
2006 Jun 04
20
Proposal of some radical changes to API
Hey guys, Now that the Lucy[1] project has Apache approval and is about to begin, the onus is no longer on Ferret to strive for Lucene compatibility. (We'll be doing that in Lucy). So I'm starting to think about ways to improve Ferret's API. The first part that needs to be improved is the Document API. It's annoying having to type all the attributes to