Andrew S. Townley
2011-Feb-23 13:57 UTC
[Ferret-talk] Custom highlighter/match vector access?
Hi everyone,

I know from the archives things have kinda slowed down on ferret and there's an effort ongoing with lucy, but I was wondering if anyone had discovered a way to enumerate the matches of a particular field in the document and get the offsets?

With what I'm trying to do, ferret will be indexing large portions of structured information, but I really don't want to store it all in the ferret index just to have highlighting. My understanding (I'm still new at this) is that if you index and store the match offsets, you can do this without storing the full text of the field.

Ideally, what I'd like is to expose the contents of the C MatchRange structure as an array of Ruby hash objects so that I could then use those offsets in the actual data store to create my own highlighted extracts (or something along those lines).

Short of adding a hacked version of searcher_highlight to the C API to do this and creating a corresponding wrapped Ruby version, is there any way to get to this information right now from the Ruby API? Alternatively, is there another/better way to do this besides storing the whole field values and using the built-in highlighter?

Any advice or pointers would be really appreciated.

Cheers,

ast
--
Andrew S. Townley <ast at atownley.org>
http://atownley.org
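
To make the offset-based extract idea concrete, here is a rough Ruby sketch of the consuming side only. Everything in it is an assumption for illustration: the `highlight_extracts` helper, the `:start`/`:end` hash keys, and the choice of character offsets with an exclusive end are hypothetical and are not part of ferret's Ruby API. It only shows what could be done with the offsets once they are available, with the text itself coming from the external data store.

```ruby
# Hypothetical helper: build highlighted extracts from match offsets.
#
# text    - the field value fetched from the external data store (not ferret)
# matches - array of hashes such as { :start => 16, :end => 19 }; the keys and
#           the "character offsets, exclusive end" convention are assumptions
# context - number of characters of surrounding text to keep on each side
def highlight_extracts(text, matches, pre: "<b>", post: "</b>", context: 20)
  matches.sort_by { |m| m[:start] }.map do |m|
    # Clamp the extract window to the bounds of the text.
    from = [m[:start] - context, 0].max
    to   = [m[:end] + context, text.length].min

    before = text[from...m[:start]]
    hit    = text[m[:start]...m[:end]]
    after  = text[m[:end]...to]

    "#{before}#{pre}#{hit}#{post}#{after}"
  end
end

# Example usage with made-up offsets.
sample_text    = "The quick brown fox jumps over the lazy dog"
sample_matches = [{ :start => 16, :end => 19 }]
sample_result  = highlight_extracts(sample_text, sample_matches, context: 6)
```

If the offsets turned out to be byte offsets rather than character offsets, the slicing would need `String#byteslice` instead of `String#[]`.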