To answer my own question: the problem was that I didn't zero-pad the
numbers in my query the same way the IntegerAnalyzer pads them at index time.
# Adapted from the "Ferret" ebook at
# http://www.oreilly.com/catalog/9780596527853/index.html
require 'ferret'

module Ferret::Analysis
  # Range comparison is done in lexical order, not numeric order,
  # so every number is emitted as a zero-padded token of fixed width.
  class IntegerTokenizer
    def initialize(num, width)
      @num = num.to_i
      @width = width
    end

    # Returns the single padded token once, then nil.
    def next
      token = Token.new("%0#{@width}d" % @num, 0, @width) if @num
      @num = nil
      return token
    end

    def text=(text)
      @num = text.to_i
    end
  end

  class IntegerAnalyzer
    def initialize(width)
      @width = width
    end

    def token_stream(field, input)
      return IntegerTokenizer.new(input, @width)
    end
  end
end
include Ferret::Analysis
analyzer = PerFieldAnalyzer.new(StandardAnalyzer.new)
# :num is the name of the field that should use this analyzer
analyzer[:num] = IntegerAnalyzer.new(3)
index = Ferret::Index::Index.new(:analyzer => analyzer)
docs = [
  {:num => 5},
  {:num => 15},
  {:num => 30}
]
docs.each { |d| index << d }
>> puts index.search('num:[001 020]')
TopDocs: total_hits = 2, max_score = 1.000000 [
0 "": 1.000000
1 "": 1.000000
]
=> nil
>> puts index.search('num:[010 100]')
TopDocs: total_hits = 2, max_score = 1.000000 [
1 "": 1.000000
2 "": 1.000000
]
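
For anyone hitting the same thing, here is a minimal sketch in plain Ruby (no Ferret needed) of why the padding matters and how the query bounds can be padded to match. The helpers pad_num and padded_range_query are hypothetical names of my own, not part of Ferret, and the width of 3 is assumed to match the width given to IntegerAnalyzer above.

```ruby
# Hypothetical helper (not part of Ferret): zero-pad a number to a fixed
# width so that string comparison agrees with numeric comparison.
# Assumes non-negative integers that fit within the given width.
def pad_num(num, width)
  format("%0#{width}d", num.to_i)
end

# Hypothetical helper: build a range query string whose bounds are padded
# the same way as the indexed tokens.
def padded_range_query(field, low, high, width)
  "#{field}:[#{pad_num(low, width)} #{pad_num(high, width)}]"
end

nums = [5, 15, 30, 100]

# Lexical order of the raw strings does not match numeric order.
unpadded = nums.map(&:to_s).sort
# Lexical order of the padded strings does match numeric order.
padded = nums.map { |n| pad_num(n, 3) }.sort

puts unpadded.inspect                      # ["100", "15", "30", "5"]
puts padded.inspect                        # ["005", "015", "030", "100"]
puts padded_range_query(:num, 1, 20, 3)    # num:[001 020]
```

The resulting string can be passed to index.search as in the session above. The same comparison presumably explains why the unpadded num:[2 100] in the original post found nothing: as strings, "2" sorts after "100", so the lower bound is above the upper bound.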

Yaxm Yaxm wrote:
> Hi,
> it looks like Ferret still compares numeric fields by lexical ordering,
> not numerical ordering. I am using Ferret 0.11.4 (I tried on both Linux
> and Windows; the results are the same).
>
>
> index = Ferret::Index::Index.new()
> docs = [
>   {:num => 1, :data => "yes"},
>   {:num => 1, :data => "no"},
>   {:num => 10, :data => "yes"},
>   {:num => 10, :data => "no"},
>   {:num => 100, :data => "yes"},
>   {:num => 100, :data => "no"},
>   {:num => 1000, :data => "yes"},
>   {:num => 1000, :data => "no"}
> ]
>
> >> puts index.process_query('data:yes AND num:[10 100]')
> +data:yes +num:[10 100]
> => nil
> >> puts index.search('d:data:yes AND num:[10 100]')
> TopDocs: total_hits = 2, max_score = 1.777895 [
> 2 "": 1.777895
> 4 "": 1.777895
> ]
> => nil
> >> puts index.process_query('data:yes AND num:[2 100]')
> num:"data yes <> num 2 100"~4
> => nil
> >> puts index.process_query('num:[2 100]')
> num:"num 2 100"~2
> => nil
> >> puts index.search('num:[2 100]')
> TopDocs: total_hits = 0, max_score = 0.000000 [
> ]
> => nil
> >> puts index.process_query('num:>2')
> num:{2>
> => nil
> >> puts index.search('num:>2')
> TopDocs: total_hits = 0, max_score = 0.000000 [
> ]
> => nil
>
>
> According to the release note for Ferret 0.10.6 at
> http://rubyforge.org/forum/forum.php?forum_id=9058, "Range queries just
> work. No need to pad numbers or format dates correctly."
>
> Is this a new bug?
>
> Thanks.
> Yaxm
--
Posted via http://www.ruby-forum.com/.