thr3ads.net - Xapian discuss - [Xapian-discuss] range query for terms [Mar 2015]


张吉财

2015-Mar-14 13:25 UTC

[Xapian-discuss] range query for terms

First, thank you, Xapian!

Then I'd like to ask whether it is possible to do a range query on terms (like
the range query on values), or whether only wildcard (right truncation) matching
is available.

The use case is searching IP addresses between "10.10.0.0" and "10.10.255.255".
The user wants:
1.   query "10.10.10.10" < ip < "10.10.10.12" gives "10.10.10.11"
2.   query "*.*.*.10" gives all IP addresses ending with 10.

How can I achieve this?

Olly Betts

2015-Mar-15 19:36 UTC


On Sat, Mar 14, 2015 at 09:25:24PM +0800, 张吉财 wrote:
> then I'd like to ask if it is possible to do a range query on
> terms(like the range query on values), or if it is just a
> wildcard(right truncation) match.

Currently only right truncation is supported.

> the case is searching ip address between "10.10.0.0" and "10.10.255.255"
> the user want :
> 1.   query "10.10.10.10" < ip < "10.10.10.12" gives "10.10.10.11"
> 2.   query "*.*.*.10" gives all ip addresses ended with 10.
>
> how can I achieve this?

You should really consider using a value for this, as such wildcards can
expand to an awful lot of terms - "*.*.*.10" potentially matches 16
million terms.  With a value, there's only one thing to check for every
candidate document.
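
A minimal sketch of the value approach, assuming each address is stored in a value slot as 4 big-endian bytes so that byte-wise comparison agrees with numeric order (plain dotted strings do not sort that way). The helper names ip_to_sortable and exclusive_range_bounds are made up for this illustration and are not part of Xapian:

```python
import ipaddress

def ip_to_sortable(ip):
    """Pack a dotted-quad IPv4 address into 4 big-endian bytes.

    Byte-wise comparison of the result matches numeric order of the
    addresses, which plain dotted strings do not give you.
    """
    return ipaddress.IPv4Address(ip).packed

def exclusive_range_bounds(low, high):
    """Convert the exclusive query low < ip < high into inclusive bounds."""
    begin = ipaddress.IPv4Address(low) + 1
    end = ipaddress.IPv4Address(high) - 1
    return begin.packed, end.packed

def in_range(ip, begin, end):
    """The single per-document check: is the stored value within bounds?"""
    return begin <= ip_to_sortable(ip) <= end

begin, end = exclusive_range_bounds("10.10.10.10", "10.10.10.12")
print(in_range("10.10.10.11", begin, end))
print(in_range("10.10.10.12", begin, end))
```

With the Python bindings, the packed bytes would presumably be stored with add_value() on the document and the two bounds passed to an OP_VALUE_RANGE query on the same slot; that wiring is left out here so the sketch runs without Xapian installed.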

But if you only actually have a small number of IP addresses and really
want to use terms, you can just iterate allterms from the Database
object and build an OP_SYNONYM query from all the matching terms.  In
1.2.x, that's exactly how OP_WILDCARD is implemented (in master
OP_WILDCARD expansion is delayed until we process the Query tree, which
means we can avoid creating Query objects for every term in the
wildcard).
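
A rough sketch of the term-filtering step described above, assuming the IP addresses are indexed as terms with a prefix. The prefix "XIP" and the helper names octets_match and matching_terms are hypothetical, and the term list here is a plain Python list standing in for what iterating allterms would yield:

```python
def octets_match(pattern, ip):
    """Match a dotted pattern such as "*.*.*.10" against an address.

    Each of the four parts is either "*" (any octet) or a literal octet.
    """
    p_parts = pattern.split(".")
    ip_parts = ip.split(".")
    if len(p_parts) != 4 or len(ip_parts) != 4:
        return False
    return all(p == "*" or p == o for p, o in zip(p_parts, ip_parts))

def matching_terms(all_terms, pattern, prefix=""):
    """Return the terms that carry the prefix and match the pattern."""
    matched = []
    for term in all_terms:
        if not term.startswith(prefix):
            continue
        if octets_match(pattern, term[len(prefix):]):
            matched.append(term)
    return matched

# Stand-in for the terms of a small database (hypothetical data).
terms = ["XIP10.10.10.10", "XIP10.10.10.11", "XIP192.168.0.10", "Zfoo"]
print(matching_terms(terms, "*.*.*.10", prefix="XIP"))
```

The resulting list is what would then be combined into a single OP_SYNONYM query. This only makes sense when the number of distinct IP terms is small, as noted above.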

Cheers,
    Olly

张吉财

2015-Mar-29 11:07 UTC


Thank you, Olly!

I tried to build a picture of how indexing and querying relate to B-tree
block access on disk, but I think I got it all mixed up and failed.
Now I am trying to index docs in JSON format, and have a question about
prefix mapping. Take a JSON doc like:    {"starttime":1111,"endtime":2222}
I am considering mapping each prefix to a slot number in one of two ways:
1. starttime --> 0, endtime --> 1
2. starttime --> hash(starttime), endtime --> hash(endtime), where hash(key)
is a random int, which may be very sparse but unique, for example, using the
BKDR hash.

After a simple test, both ways seemed to work well. Can I use the second way (so
I do not have to maintain a mapping)? Are there performance issues?
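
To make the two options concrete, here is a small sketch of both mappings. It assumes the common BKDR formulation (seed 131, result masked to 31 bits); the function names are made up for this example. A hash is not guaranteed to be unique, so the hashed variant checks for collisions explicitly:

```python
def bkdr_hash(key, seed=131):
    """BKDR string hash over the UTF-8 bytes of key, masked to 31 bits."""
    h = 0
    for byte in key.encode("utf-8"):
        h = (h * seed + byte) & 0xFFFFFFFF
    return h & 0x7FFFFFFF

def sequential_slots(keys):
    """Way 1: assign slots 0, 1, 2, ... in order; the mapping must be kept."""
    return {key: slot for slot, key in enumerate(keys)}

def hashed_slots(keys, hash_fn=bkdr_hash):
    """Way 2: derive each slot from the key itself; no mapping to maintain.

    Raises ValueError if two keys land on the same slot.
    """
    slots = {}
    seen = {}
    for key in keys:
        slot = hash_fn(key)
        if slot in seen:
            raise ValueError("slot collision: %r and %r" % (seen[slot], key))
        seen[slot] = key
        slots[key] = slot
    return slots

keys = ["starttime", "endtime"]
print(sequential_slots(keys))
print(hashed_slots(keys))
```

This only shows how the slot numbers would be derived; it says nothing about how sparse slot numbers affect storage or speed.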

