search for: boolweight

Displaying 19 results from an estimated 19 matches for "boolweight".

2011 Aug 11
3
Fwd: Re: what is the fastest way to fetch results which are sorted by timestamp ?
...hu, 11 Aug 2011 01:06:36 +0800 From: ??? <panjunyong at gmail.com> To: Tim Brody <tdb2 at ecs.soton.ac.uk> On Wed, Aug 10, 2011 at 6:39 PM, Tim Brody <tdb2 at ecs.soton.ac.uk> wrote: > Hi, > > In terms of the enquiry, do you mean this?: > set_weighting_scheme(Xapian::BoolWeight()); > set_docid_order(Xapian::Enquire::DESCENDING); > > In my test, it is more than 10 times slower than : set_weighting_scheme(Xapian::BoolWeight()); set_docid_order(Xapian::Enquire::ASCENDING); Why? What's the most efficient process to build multiple Xapian indexes? Can > the &...
2018 Mar 30
2
sorting large msets
Hello, is there a way to optimize sorting by certain values for queries which return a huge amount of results? For example, I just want a simple query that gives me the 200 most recent emails out of millions. The elapsed time for get_mset increases as the number of documents ($n * 2000) increases. I suppose I could store a pre-sorted set using SQLite or similar. Thanks in advance for any
2018 Mar 31
2
sorting large msets
...ount of results? > [...] > > $enquire->set_sort_by_value_then_relevance(0, 1); > > If you're just wanting the 200 newest, it'll be faster not to calculate > weights, so: > > $enquire->set_sort_by_value(0, 1); > $enquire->set_weighting_scheme(new Xapian::BoolWeight()); > > For me, this drops the time from ~0.075 seconds to ~0.067 seconds (with > xapian-core 1.4.5). Thanks, I can see how that helps. > But even 0.075 seconds doesn't really seem "slow" to me. What times > are you seeing? If it's much slower, I'd make sur...
2005 Jun 29
2
Sort by docid
Hello, I wonder if there is a way to cause Xapian to order a result set purely by docid. In other words, once the result set has been determined, I'd like the results to be returned to me ordered by their docid, as opposed to by their match relevance. The problem at hand is that I'm building a search engine for a mailing list and I would like to return matches sorted by date; ordering by
2011 Aug 09
3
what is the fastest way to fetch results which are sorted by timestamp ?
What is the fastest way to fetch results sorted by timestamp? I want to use Xapian as my search engine: I call add_boolean_term(something) and add_value(0, sortable_serialise(get_timestamp())) on each document, then search with enquire.set_weighting_scheme(xapian.BoolWeight()) and enquire.set_sort_by_value(0, True) to ensure that the results are sorted by timestamp. This method works, but is there a faster way to do it, since I have millions of records?
2018 Mar 30
0
sorting large msets
...> for queries which return a huge amount of results? [...] > $enquire->set_sort_by_value_then_relevance(0, 1); If you're just wanting the 200 newest, it'll be faster not to calculate weights, so: $enquire->set_sort_by_value(0, 1); $enquire->set_weighting_scheme(new Xapian::BoolWeight()); For me, this drops the time from ~0.075 seconds to ~0.067 seconds (with xapian-core 1.4.5). If I use xapian git master (still using the glass backend) then it's ~0.051 seconds with weights and ~0.045 seconds without. If I use the new (but still in development) honey backend it's ~0.0...
2018 Apr 03
0
sorting large msets
...000, Eric Wong wrote: > Olly Betts <olly at survex.com> wrote: > > If you're just wanting the 200 newest, it'll be faster not to calculate > > weights, so: > > > > $enquire->set_sort_by_value(0, 1); > > $enquire->set_weighting_scheme(new Xapian::BoolWeight()); > > > > For me, this drops the time from ~0.075 seconds to ~0.067 seconds (with > > xapian-core 1.4.5). > > Thanks, I can see how that helps. > > > But even 0.075 seconds doesn't really seem "slow" to me. What times > > are you seeing? I...
2017 Dec 15
5
How to get the serialise score returned in Xapian::KeyMaker->operator().
Hi all, I am a Xapian user and I have run into a problem. After using boolean terms to get some candidate documents (still too many), we want to sort them with a self-defined function used in Xapian::KeyMaker->operator(). But how can I get the serialised score from the Xapian::MSetIterator object? The C++ code looks like this: class SortKeyMaker : public Xapian::KeyMaker { std::string
2010 Aug 23
1
Sort ordering
Using MultiValueSorter, I can sort by key1, key2, relevance; or relevance, key1, key2. But AFAIK, I can't sort by key1, relevance, key2. Unless I spool out the entire result set or write some C++. I wonder if we need a new 'sort by' function that accepts any combination of keys and relevance in any order? The function would make its own optimisations (ie is relevance first or
2011 Aug 10
0
xapian enquire.set_docid_order(Xapian::Enquire::DESCENDING so slow!
...v[2:] try: database = xapian.Database(db_path) terms = ' '.join(terms) qp = xapian.QueryParser() qp.set_database(database) qp.set_default_op(0) #0:OP_AND; 1:OP_OR default query = qp.parse_query(terms) enquire = xapian.Enquire(database) enquire.set_weighting_scheme(xapian.BoolWeight()) enquire.set_query(query) enquire.set_docid_order(enquire.DESCENDING) matches = enquire.get_mset(0,10) print "%i results found . " % matches.get_matches_estimated() print "Results 1-%i:" % matches.size() for m in matches: print "rand= %-4d docid=%-8i&q...
2012 Apr 20
1
Implementing the tf-idf weighting scheme
...le tf_idf.cc in ../weight in the repo, to implement Tf_idfWeight. Here is the git diff patch: https://gist.github.com/2422049 I think the next thing to do is to register this scheme with Xapian and write some tests to see whether or not it works. I've grepped the current BM25Weight, TradWeight and BoolWeight, and found clues about Enquire::set_weighting_scheme(), but more work is needed to understand it. Best, Jiuding
2017 Dec 16
0
How to get the serialise score returned in Xapian::KeyMaker->operator().
...y isn't currently exposed via the public API. It's available internally and it seems like it ought to be accessible but there's no accessor method for it - I can add one but that won't help for existing releases. A possible workaround (and perhaps a better approach) would be to set BoolWeight as the weighting scheme, then feed in your score as a weight using a PostingSource. Then it's available via get_weight() on the MSetIterator object: https://getting-started-with-xapian.readthedocs.io/en/latest/advanced/postingsource.html You may find that's faster because it'll mean...
2006 Jul 25
2
weight scheme with document values
Hi guys, I recently used Xapian to sort some documents by the distance between 2 points. I implemented a MatchDecider, which works well. I have now tried to implement a Weight scheme to put my documents in ascending order of distance... The information needed to calculate the distance is stored in values in the document. How can I access document values from Weight so that I can add some sum_extra weight?
2012 Mar 05
1
Interested in IR, Getting started with Xapian
Hi everyone, I'm Akshay, an Information Science undergrad from Bangalore. I'm interested in Information Retrieval and I'd like to contribute to Xapian as a part of GSoC and later to feed my interests. I liked the idea of adding more weighting schemes (Project #2). I did a project last semester on Document Retrieval on Hadoop using TF-IDF and Cosine Similarity (the query had to be a
2018 Jan 22
2
How to get the serialise score returned in Xapian::KeyMaker->operator().
>A possible workaround (and perhaps a better approach) would be to >set BoolWeight as the weighting scheme, then feed in your score as >a weight using a PostingSource. Then it's available via get_weight() >on the MSetIterator object: > >https://getting-started-with-xapian.readthedocs.io/en/latest/advanced/postingsource.html > >You may find that's faster...
2007 Mar 21
1
scoring question
Hi All I have just realized that if I set a query like 'green jelly bean' xapian will turn that query into 'green OR jelly OR bean' This causes documents containing just one of the words to be considered a 100% hit. The behavior I would like to see is that each word gives a 33.3% hit, so that a document containing all 3 words gets placed above a document with only 1 or 2
2011 Apr 21
1
Installing Search::xapian
...ib/man3/Search::Xapian::DatabaseLockError.3pm Manifying blib/man3/Search::Xapian::RuntimeError.3pm Manifying blib/man3/Search::Xapian::TermIterator.3pm Manifying blib/man3/Search::Xapian::TradWeight.3pm Manifying blib/man3/Search::Xapian::DatabaseCorruptError.3pm Manifying blib/man3/Search::Xapian::BoolWeight.3pm Manifying blib/man3/Search::Xapian::DocNotFoundError.3pm Manifying blib/man3/Search::Xapian::LogicError.3pm Manifying blib/man3/Search::Xapian::Stem.3pm Manifying blib/man3/Search::Xapian::PostingIterator.3pm Manifying blib/man3/Search::Xapian::PositionIterator.3pm Manifying blib/man3/Search::X...
2017 Mar 15
2
xapian core missing link to math on MSYS2
...ueryparser/.libs/queryparser_internal.o queryparser/.libs/termgenerator.o queryparser/.libs/termgenerator_internal.o unicode/.libs/description_append.o unicode/.libs/unicode-data.o unicode/.libs/utf8itor.o weight/.libs/bb2weight.o weight/.libs/bm25plusweight.o weight/.libs/bm25weight.o weight/.libs/boolweight.o weight/.libs/coordweight.o weight/.libs/dlhweight.o weight/.libs/dphweight.o weight/.libs/ifb2weight.o weight/.libs/ineb2weight.o weight/.libs/inl2weight.o weight/.libs/lmweight.o weight/.libs/pl2plusweight.o weight/.libs/pl2weight.o weight/.libs/tfidfweight.o weight/.libs/tradweight.o weight/.li...
2006 Dec 06
1
Bug and patch for +terms with wildcards
...postlist_from_query(query->subqs[1], matcher, is_bool), matcher, db->get_doccount()); + case Xapian::Query::OP_MATCH_NOTHING: { + Assert(query->subqs.size() == 0); + LeafPostList *pl = new EmptyPostList(); + pl->set_termweight(new Xapian::BoolWeight()); + RETURN(pl); + } } Assert(false); RETURN(NULL); Index: tests/queryparsertest.cc =================================================================== --- tests/queryparsertest.cc (revision 7552) +++ tests/queryparsertest.cc (working copy) @@ -655,7 +655,7 @@...