search for: wdfs

Displaying 12 results from an estimated 12 matches for "wdfs".

2023 Aug 27
1
DatabaseModifiedError while iterating on mset
...odifiedErrors while inside a > Xapian::MSetIterator loop. > > I assume ->get_document is a place where it gets thrown; > but once a document is retrieved, can iterating through > terms in one document (using TermIterator) also throw DB modified? If you only look at the terms and wdfs then you could only get DatabaseModifiedError on the call to create the TermIterator since the list of terms and wdfs is stored in a single entry per document which is fetched when the iterator is created (it is conceivable this might be different for a new database backend in the future I suppose)...
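A minimal sketch of the retry pattern this reply implies: since the term list is fetched in one go when the iterator is created, only that creation step needs guarding. The exception class and the `fetch_terms` and `reopen` callables below are hypothetical stand-ins so the sketch runs without Xapian; in real code they would correspond to the binding's DatabaseModifiedError, to re-fetching the document and creating its TermIterator, and to reopening the database.

```python
class DatabaseModifiedError(Exception):
    """Local stand-in for Xapian's DatabaseModifiedError (hypothetical)."""


def with_retry(operation, reopen, max_attempts=3):
    """Run operation(); on DatabaseModifiedError call reopen() and try again."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except DatabaseModifiedError:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            reopen()


# Demo with a fake fetch that fails once, then succeeds.
calls = {"fetch": 0, "reopen": 0}


def fetch_terms():
    # Stands in for: get the document, create the term iterator, copy (term, wdf) pairs.
    calls["fetch"] += 1
    if calls["fetch"] == 1:
        raise DatabaseModifiedError("revision changed")
    return [("Sipod", 5), ("video", 1)]


def reopen():
    # Stands in for reopening the database at the latest revision.
    calls["reopen"] += 1


terms = with_retry(fetch_terms, reopen)
print(terms, calls)
```

Copying the (term, wdf) pairs out inside the guarded call keeps the later per-term work outside the retry loop, which matches the reply's point for current backends only; it notes a future backend might behave differently.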
2014 Mar 11
2
[GSOC 2014] Indexing INEX dataset
...sure to change 5 to 1 otherwise divide the final count > statistics by 5 . :) We really need to resolve any instances where letor requires code in other parts of Xapian to be patched. In this case, possibly the bias on the title should be done differently, but won't this just mean both the wdfs and the field length for the S prefix are 5 times larger, and it won't matter? Cheers, Olly
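A small arithmetic sketch of the point raised here, under the assumption that the field length for a prefix is simply the sum of the wdfs indexed under it (the snippet suggests this but does not spell it out). The `index_text` helper below is a hypothetical toy, not Xapian's TermGenerator; it only mimics adding a fixed wdf increment per occurrence.

```python
from collections import Counter


def index_text(words, wdf_inc):
    """Toy indexer: add wdf_inc per occurrence; return (wdfs, field_length)."""
    wdfs = Counter()
    for word in words:
        wdfs[word] += wdf_inc
    # Assumption: field length is the total wdf for the field.
    return wdfs, sum(wdfs.values())


title = "ipod video ipod case".split()

wdfs1, len1 = index_text(title, 1)
wdfs5, len5 = index_text(title, 5)

# Length-normalised wdf for each term under both increments.
ratio1 = {term: wdfs1[term] / len1 for term in wdfs1}
ratio5 = {term: wdfs5[term] / len5 for term in wdfs5}

print(len1, len5)        # field lengths differ by the increment factor
print(ratio1 == ratio5)  # normalised values are unchanged
```

If a feature depends only on wdf divided by field length, the factor of 5 cancels; features that use raw wdfs or raw lengths would still see the difference.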
2023 Aug 28
1
DatabaseModifiedError while iterating on mset
...> Xapian::MSetIterator loop. > > > > I assume ->get_document is a place where it gets thrown; > > but once a document is retrieved, can iterating through > > terms in one document (using TermIterator) also throw DB modified? > > If you only look at the terms and wdfs then you could only get > DatabaseModifiedError on the call to create the TermIterator since the > list of terms and wdfs is stored in a single entry per document which > is fetched when the iterator is created (it is conceivable this might > be different for a new database backend in t...
2014 Mar 17
2
[GSOC 2014] Indexing INEX dataset
Hi Olly, Wouldn't setting the weight of terms in title back to normal (e.g. 5 to 1) by below line, automatically adjust the wdfs and field lengths? indexer.index_text(title, 5, "S"); -> indexer.index_text(title, 1, "S"); if it does not then we should include that part in the patch too. I like to create a patch for xapian-letor for resolving common code of xapian. Cheers, Parth. On Wed, Mar 12...
2009 Jul 15
2
XAPIAN_FLUSH_THRESHOLD
I'm playing around with a machine that has 2 GB of memory. I'm indexing about 5GB of data, averaging 2MB per document. The documents are plain text. I notice that omindex's memory footprint gets bigger and bigger, then the machine starts to swap and it all slows down to a crawl. Regarding export XAPIAN_FLUSH_THRESHOLD, I know the default is 10000. Am I right in saying that for my setup
2023 Aug 23
1
DatabaseModifiedError while iterating on mset
I'm already retrying the ->get_mset operations; but now I'm wondering where I'd hit DatabaseModifiedErrors while inside a Xapian::MSetIterator loop. I assume ->get_document is a place where it gets thrown; but once a document is retrieved, can iterating through terms in one document (using TermIterator) also throw DB modified? I'm dumping multiple terms per-document to a
2011 Mar 07
1
Set Term Frequency for a Query
Hello, I have a problem when trying to define a query and setting for each term its "term frequency" with the classical constructor Xapian::Query<http://xapian.org/docs/apidoc/html/classXapian_1_1Query.html#f396e213df0d8bcffa473a75ebf228d6>(const std::string &tname_,
2014 Mar 11
2
[GSOC 2014] Indexing INEX dataset
On Tue, Mar 11, 2014 at 12:02:15PM +0100, Parth Gupta wrote: > During the indexing with omindex, only you need to make sure is indexing > with prefix 'S' for title as explained here in Letor documentation: > xapian-letor/docs/letor.rst > > Previously when I edited omindex.cc it was modified as can be seen >
2014 Mar 22
2
[GSOC 2014] Indexing INEX dataset
..., Parth. On Thu, Mar 20, 2014 at 2:35 AM, Olly Betts <olly at survex.com> wrote: > On Mon, Mar 17, 2014 at 09:07:29PM +0100, Parth Gupta wrote: > > Wouldn't setting the weight of terms in title back to normal (e.g. 5 to > 1) > > by below line, automatically adjust the wdfs and field lengths? > > > > indexer.index_text(title, 5, "S"); -> indexer.index_text(title, 1, "S"); > > > > if it does not then we should include that part in the patch too. I like > to > > create a patch for xapian-letor for resolving common...
2012 Mar 31
1
Project: Posting list encoding improvements
Hi Xapianers: My name is Weixian Zhou, a Computer Science student at the University at Buffalo, State University of New York. I am interested in the project of posting list encoding improvements and weighting schemes. I have some questions about them. 1) After reading the comments in brass_postlist.cc, I am still not very clear about the detailed structure of the postings list. If you can provide some simple
2006 Apr 02
1
About field weight
Hi, I've been disappointed today with some of my Xapian results. Here is the issue : - I am searching for the terms "ipod vidéo 60" (with the OR operator) - the first results sorted by relevance are : (name / description) 1. Etui en cuir Shinnorie EZgoing pour iPod avec vidéo 60 Go - Blanc Etui en cuir Shinnorie EZgoing pour iPod avec vidéo 60 Go - Noir - With the SA1
2007 Jan 24
1
how to properly extend s3 data.frames with s4 classes?
..."row.names": # > character(0) # > Warning message: # > missing package slot (.GlobalEnv) in object of class # > "WrappedDataframe" (package info added) in: initialize(value, ...) # # OBS! Now there is # (i) a slot "row.names" -- which is wrong # since WDFs aren't supposed to have any slots; # (ii) an odd warning about another missing slot # (presumably called "package" but the message is # somewhat ambiguous). # # But at least # new(WDF, tdf) # # yields: # # > $x # > [1] 1 2 # > # > $y # > [1] TRUE FALSE # > #...