Hi Folks,
I've just released Ferret 0.10.13 (skip 0.10.12, it was a bad build).
There are two interesting additions in this release. You can now
access the Filter#bits method of the built-in filters, so you can
use them in your own code, possibly within your own custom filters.
For example, you could implement a custom filter like so:
class MultiFilter < Hash
  # AND together the bit vectors of all the filters stored in this Hash.
  def bits(index_reader)
    bit_vector = Ferret::Utils::BitVector.new.not!  # start with all bits set
    self.values.each {|filter| bit_vector.and!(filter.bits(index_reader))}
    bit_vector
  end
end
And you would use it like this:
mf = MultiFilter.new
mf[:category] = category_filter
mf[:country] = country_filter
# run the query with the filter
index.search(query, :filter => mf)
# filters can be changed and deleted
mf[:category] = new_category_filter
mf.delete(:country)
index.search(query, :filter => mf)
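Since Filter#bits just returns a Ferret::Utils::BitVector, you can combine
filters in other ways too. Here is a minimal sketch of a union (OR) filter
along the same lines; it assumes BitVector provides an or! counterpart to
the and! used above, so check it against your Ferret version:

class AnyFilter < Array
  # Matches a document if any one of the contained filters matches it.
  def bits(index_reader)
    bit_vector = Ferret::Utils::BitVector.new  # start with no bits set
    each {|filter| bit_vector.or!(filter.bits(index_reader))}
    bit_vector
  end
end

any = AnyFilter.new
any << category_filter << country_filter
index.search(query, :filter => any)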
The other major addition is a MappingFilter (< TokenFilter). This can
be used to transform your text from UTF-8 to ASCII, for example. I
posted an example of how to do this earlier today. However, using the
mapping filter you can apply a list of string mappings rather
than just character mappings. Obviously you could achieve this with a
list of "String#gsub!"s but MappingFilter will compile the mappings
into a DFA so it will be a *lot* faster. Here is an example:
include Ferret::Analysis
# A custom analyzer that maps accented European characters down to
# their plain ASCII equivalents before indexing.
class EuropeanAnalyzer
  MAPPING = {
    ['à', 'á', 'â', 'ã', 'ä',
     'å', 'ā', 'ă', 'ą']      => 'a',
    ['ò', 'ó', 'ô', 'ö']      => 'o',
    ['è', 'é', 'ê', 'ë',
     'ē', 'ę', 'ě', 'ĕ']      => 'e',
    ['ù', 'ú', 'û']           => 'u',
    ['ç']                     => 'c'
  }

  def token_stream(field, string)
    return MappingFilter.new(StandardTokenizer.new(string), MAPPING)
  end
end
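To use an analyzer like this, pass it to the index when you create it.
Here is a rough usage sketch (the field name and document are made up
for illustration):

index = Ferret::Index::Index.new(:analyzer => EuropeanAnalyzer.new)
index << {:content => "crème brûlée"}
# The accented term is indexed as "creme", so a plain ASCII query matches.
puts index.search("content:creme").total_hits

Note that this analyzer only tokenizes and maps characters; if you also
want case-insensitive matching, wrap the tokenizer in a LowerCaseFilter
before the MappingFilter.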
Happy Ferreting and check the Ferret homepage[1] if you are able to contribute.
Cheers,
Dave
[1] http://ferret.davebalmain.com/trac/