Displaying 18 results from an estimated 18 matches for "set_weighting_scheme".
2017 Apr 09
3
Omega: Missing support for newer weighting schemes
..."bm25 1 0.8", see the first word is "bm25"
> > and get a BM25Weight object, then call parse_params("1 0.8") on it to
> > create the correct Weight object (broadly similar to how unserialise()
> > is handled).
>
> If I followed correctly, since the set_weighting_scheme method in
> omega/weight.cc already does exactly that, do you suggest adding a
> parse_params method in each weighting scheme class to create the specific
> object? -- in which case the set_weighting_scheme method in omega/weight.cc
> would then use parse_params method to create the spe...
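The dispatch discussed in this thread (read the first word of a scheme string such as "bm25 1 0.8", find the matching class, and hand it the rest of the string) can be sketched in plain Python. This is only an illustration under stated assumptions: BM25Sketch, REGISTRY and scheme_from_string are hypothetical names, not Xapian or Omega APIs, and the parameters are stored as an uninterpreted tuple of floats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BM25Sketch:
    """Hypothetical stand-in for a weighting scheme class (not Xapian's)."""
    params: tuple = ()

    @classmethod
    def parse_params(cls, text: str) -> "BM25Sketch":
        # Each class parses its own parameter string; an empty string
        # means "use the defaults".
        return cls(tuple(float(tok) for tok in text.split()))


# Hypothetical registry mapping a scheme name to a class with parse_params().
REGISTRY = {"bm25": BM25Sketch}


def scheme_from_string(scheme: str):
    """Split 'name params...' and let the named class build the object."""
    name, _, params = scheme.strip().partition(" ")
    try:
        cls = REGISTRY[name.lower()]
    except KeyError:
        raise ValueError(f"unknown weighting scheme: {name!r}") from None
    return cls.parse_params(params)
```

With this shape, supporting a new scheme means adding one registry entry rather than another branch in the caller.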
2017 Apr 12
4
Omega: Missing support for newer weighting schemes
...i Olly -- the following piece of tested code in omega/weight.cc hopefully
achieves what we intend to do. It works fine for all tests. Please let me
know what you think.
if (startswith(scheme, "pl2")) {
    const char *p = scheme.c_str() + 3;
    if (*p == '\0') {
        enq.set_weighting_scheme(Xapian::BM25Weight());
        return;
    }
    if (C_isspace(*p)) {
        Xapian::Registry reg;
        const Xapian::Weight * wt =
            reg.get_weighting_scheme("Xapian::PL2Weight");
        enq.set_weighting_scheme(*wt->set_parameter_values(p));
        return;
    }
}
Although, I...
2017 Apr 08
2
Omega: Missing support for newer weighting schemes
On Sat, Apr 08, 2017 at 09:11:22PM +0100, James Aylett wrote:
> On 8 Apr 2017, at 19:15, Vivek Pal <vivekpal.dtu at gmail.com> wrote:
>
> >> and the details of which weighting schemes were available in which version
> >> isn't a key part of the $set command itself.
> >
> > Do you suggest dropping that piece of information out? Since the reason behind
2017 Apr 13
2
Omega: Missing support for newer weighting schemes
...pian::UnimplementedError
> error. I wonder if you meant the same method?
That's the default implementation - each subclass overrides that with an
actual implementation (at least if it wants to work with remote databases).
> > The code in omega would just be:
> >
> > enq.set_weighting_scheme(Xapian::Weight::parse_params(scheme));
>
> I wonder if we could do something more like:
>
> enq.set_weighting_scheme(Xapian::Registry::get_weighting_scheme(name).parse_params(params));
>
> where, Xapian::Registry::get_weighting_scheme(name) returns a weighting scheme
> objec...
2011 Aug 11
3
Fwd: Re: what is the fastest way to fetch results which are sorted by timestamp ?
...sorted by timestamp ?
Date: Thu, 11 Aug 2011 01:06:36 +0800
From: ??? <panjunyong at gmail.com>
To: Tim Brody <tdb2 at ecs.soton.ac.uk>
On Wed, Aug 10, 2011 at 6:39 PM, Tim Brody <tdb2 at ecs.soton.ac.uk> wrote:
> Hi,
>
> In terms of the enquiry, do you mean this?:
> set_weighting_scheme(Xapian::BoolWeight());
> set_docid_order(Xapian::Enquire::DESCENDING);
>
>
In my test, it is more than 10 times slower than:
set_weighting_scheme(Xapian::BoolWeight());
set_docid_order(Xapian::Enquire::ASCENDING);
Why?
What's the most efficient process to build multiple Xapian inde...
2018 Mar 30
2
sorting large msets
Hello, is there a way to optimize sorting by certain values
for queries which return a huge amount of results?
For example, I just want a simple query that gives me the 200
most recent emails out of millions. The elapsed time for
get_mset increases as the number of documents ($n * 2000)
increases.
I suppose I could store a pre-sorted set using SQLite or
similar. Thanks in advance for any
2018 Mar 31
2
sorting large msets
...or queries which return a huge amount of results?
> [...]
> > $enquire->set_sort_by_value_then_relevance(0, 1);
>
> If you're just wanting the 200 newest, it'll be faster not to calculate
> weights, so:
>
> $enquire->set_sort_by_value(0, 1);
> $enquire->set_weighting_scheme(new Xapian::BoolWeight());
>
> For me, this drops the time from ~0.075 seconds to ~0.067 seconds (with
> xapian-core 1.4.5).
Thanks, I can see how that helps.
> But even 0.075 seconds doesn't really seem "slow" to me. What times
> are you seeing? If it's much s...
2011 Aug 09
3
what is the fastest way to fetch results which are sorted by timestamp ?
I want to use Xapian as my search engine, calling add_boolean_term(something) and add_value(0, sortable_serialise(get_timestamp())) on each doc.
I search with enquire.set_weighting_scheme(xapian.BoolWeight()) and enquire.set_sort_by_value(0, True) to ensure that the results are sorted by the timestamp.
This method is OK, but is there a faster way to do that, since I have millions of records?
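Sorting by a value slot relies on the stored bytes comparing in the same order as the numbers they represent, which is why the timestamp is serialised before being passed to add_value. A minimal sketch of that idea, assuming non-negative integer timestamps and a fixed-width big-endian encoding; encode_timestamp is a hypothetical helper and this is not the encoding that sortable_serialise itself uses:

```python
import struct


def encode_timestamp(ts: int) -> bytes:
    """Pack a non-negative integer as 8 big-endian bytes (illustrative only)."""
    return struct.pack(">Q", ts)


# Made-up timestamps, deliberately of different decimal lengths.
timestamps = [1312909200, 5, 1312995600, 70000]

# Comparing the encoded bytes gives the same order as comparing the integers.
by_bytes = sorted(timestamps, key=encode_timestamp)
```

Plain decimal strings would not have this property ("5" sorts after "1312909200"), which is the problem a sortable encoding avoids.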
2013 Oct 23
2
performance on document.get_data()
...# value: 0:date, 1:site
# data: json message which contains: author, url, message(30 words)
Do you have any ideas for improving the search performance, especially
doc.get_data()?
my code snippet
database = xapian.Database("%s/athena" % DATA_PATH)
enquire = xapian.Enquire(database)
enquire.set_weighting_scheme(xapian.BM25Weight())
query = parse(keywords)
enquire.set_query(query)
matches = enquire.get_mset(start, 200)
matches.fetch()
result = [json.loads(match.document.get_data()) for match in matches]
2011 Jun 01
1
Relevance, weighting and searching by specifically weighted text
...these to the doc using index_text with the second parameter passing in a
weighting.
I've been asked if it's possible to narrow queries to just a particular
field (translating into fields with a specific weight) within the index.
I've had a look at the reference for the
Xapian_Enquire::set_weighting_scheme method and cannot see a way where
altering the parameters of the BM25 weighting scheme can achieve this.
Is there a way this can be done?
Thanks,
Justin
--
Redwire Design Limited
54 Maltings Place
169 Tower Bridge Road
London SE1 3LJ
www.redwiredesign.com
[ 020 7403 1444 ] - voice
[ 020 7...
2018 Mar 30
0
sorting large msets
...ptimize sorting by certain values
> for queries which return a huge amount of results?
[...]
> $enquire->set_sort_by_value_then_relevance(0, 1);
If you're just wanting the 200 newest, it'll be faster not to calculate
weights, so:
$enquire->set_sort_by_value(0, 1);
$enquire->set_weighting_scheme(new Xapian::BoolWeight());
For me, this drops the time from ~0.075 seconds to ~0.067 seconds (with
xapian-core 1.4.5).
If I use xapian git master (still using the glass backend) then it's
~0.051 seconds with weights and ~0.045 seconds without.
If I use the new (but still in development) hone...
2018 Apr 03
0
sorting large msets
...at, Mar 31, 2018 at 12:58:19AM +0000, Eric Wong wrote:
> Olly Betts <olly at survex.com> wrote:
> > If you're just wanting the 200 newest, it'll be faster not to calculate
> > weights, so:
> >
> > $enquire->set_sort_by_value(0, 1);
> > $enquire->set_weighting_scheme(new Xapian::BoolWeight());
> >
> > For me, this drops the time from ~0.075 seconds to ~0.067 seconds (with
> > xapian-core 1.4.5).
>
> Thanks, I can see how that helps.
>
> > But even 0.075 seconds doesn't really seem "slow" to me. What times
>...
2010 Aug 23
1
Sort ordering
Using MultiValueSorter, I can sort by key1, key2, relevance; or relevance, key1, key2.
But AFAIK, I can't sort by key1, relevance, key2. Unless I spool out the entire result set or write some C++.
I wonder if we need a new 'sort by' function that accepts any combination of keys and relevance in any order? The function would make its own optimisations (i.e. is relevance first or
2011 Mar 08
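The "spool out the entire result set" fallback mentioned in the Sort ordering post above can be sketched in plain Python: collect every match, then sort on a composite key. This is an illustration only; Hit and sort_key1_relevance_key2 are hypothetical names, and the fields are assumed to have been copied out of the matches beforehand.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """Hypothetical record holding what was read from one match."""
    docid: int
    key1: str
    relevance: float
    key2: str


def sort_key1_relevance_key2(hits: List[Hit]) -> List[Hit]:
    # key1 ascending, then relevance descending, then key2 ascending.
    return sorted(hits, key=lambda h: (h.key1, -h.relevance, h.key2))


# Made-up example data.
hits = [
    Hit(1, "b", 0.9, "x"),
    Hit(2, "a", 0.2, "y"),
    Hit(3, "a", 0.7, "z"),
    Hit(4, "a", 0.7, "m"),
]
ordered = sort_key1_relevance_key2(hits)
```

The cost is that every match has to be retrieved before sorting, which is the overhead the post is hoping a built-in function could avoid.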
1
MSet order
Hello
I defined a weighting scheme to simulate a kind of "euclidean" distance.
To test it, I used a database with 1000 documents.
If I run :
enquire.set_weighting_scheme(MyWeight());
Xapian::MSet matches = enquire.get_mset(0, 1000);
I have a correct list of results.
But if I run Xapian::MSet matches = enquire.get_mset(0, 10);
I don't have the top-10 results.
If I run Xapian::MSet matches = enquire.get_mset(0, 20);
I don't have the top-20 results and it'...
2011 Aug 10
0
xapian enquire.set_docid_order(Xapian::Enquire::DESCENDING so slow!
...sys.argv[1]
terms = sys.argv[2:]
try:
    database = xapian.Database(db_path)
    terms = ' '.join(terms)
    qp = xapian.QueryParser()
    qp.set_database(database)
    qp.set_default_op(0) #0:OP_AND; 1:OP_OR default
    query = qp.parse_query(terms)
    enquire = xapian.Enquire(database)
    enquire.set_weighting_scheme(xapian.BoolWeight())
    enquire.set_query(query)
    enquire.set_docid_order(enquire.DESCENDING)
    matches = enquire.get_mset(0,10)
    print "%i results found . " % matches.get_matches_estimated()
    print "Results 1-%i:" % matches.size()
    for m in matches:
        print "rand=...
2012 Apr 20
1
Implementing the tf-idf weighting scheme
...implement Tf_idfWeight.
Here is the git diff patch:
https://gist.github.com/2422049
I think the next thing to do is register this scheme with Xapian and write
some tests to see whether or not it works?
I've grepped the current BM25Weight, TradWeight and BoolWeight, and found
clues about Enquire::set_weighting_scheme(). But more needs to be
done to understand it.
Best,
Jiuding
2005 Jun 29
2
Sort by docid
Hello,
I wonder if there is a way to cause Xapian to order a result set purely by
docid. In other words, once the result set has been determined, I'd like the
results to be returned to me ordered by their docid, as opposed to by their
match relevance.
The problem at hand is that I'm building a search engine for a mailing list
and I would like to return matches sorted by date; ordering by
2012 Jul 17
1
Can not use custom weight scheme with python binding
...(self):
        return "Tinker"
    def serialize(self):
        return ""
    def get_sumpart(*args):
        return 1
    def get_maxpart(*args):
        return 1
    def get_sumextra(*args):
        return 0
    def get_maxextra(*args):
        return 0
... ...
enquire.set_weighting_scheme(TinkerWeight())
But it throws this error:
*in method 'Enquire_set_weighting_scheme', argument 2 of type
'Xapian::Weight const &'*
Could anyone help me to solve this? Thanks a lot.