search for: allterm

Displaying 10 results from an estimated 10 matches for "allterm".

2013 Oct 30
2
Lucene 3.6.2 backend for xapian (#25)
...multi-database support, but I think otherwise we'll end up duplicating a lot of that machinery in the Lucene backend anyway. I've not looked at the Lucene file structure with this in mind yet though - do you see any obvious problems with this approach? > Xapian::TermIterator it = db_in.allterms_begin(); > This method traverse all terms in the first segment, then the second > segment, until the last segment. Iteration over all terms should return the terms in sorted order (by byte value) and without duplicates, neither of which is achieved by handling each segment in turn like this...
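The ordering requirement in this snippet (all terms returned in byte order, with no duplicates, even though each segment stores its own sorted term list) can be illustrated with a small k-way merge. This is only a sketch under assumptions: the segments are plain in-memory lists of byte strings and `merged_terms` is a hypothetical helper, not part of Xapian or Lucene.

```python
import heapq
from itertools import groupby


def merged_terms(segments):
    """Yield each term once, in byte order, from several sorted segments.

    Hypothetical helper: each segment is assumed to be an iterable of
    bytes objects that is already sorted by byte value.
    """
    # heapq.merge lazily combines already-sorted inputs into one sorted stream.
    merged = heapq.merge(*segments)
    # groupby collapses runs of equal adjacent terms, which removes duplicates.
    for term, _ in groupby(merged):
        yield term


# Three made-up segments with overlapping terms.
segments = [
    [b"apple", b"cat", b"zebra"],
    [b"banana", b"cat"],
    [b"apple", b"dog"],
]
result = list(merged_terms(segments))
print(result)
```

Walking the segments one after another would instead return `apple, cat, zebra, banana, cat, ...`, which is neither sorted nor free of duplicates.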
2007 Feb 09
1
Fetching document content by Q term in Python
Hello, I'd like to be able to retrieve the index's stored copy of the document text and tried the following: terms = self.db.allterms() terms.skip_to('Q' + uri.encode('utf-8')) term = terms.next() doc = self.db.get_document(term[1]) print doc.get_data() I just wildly guessed that [1] was the docid, but of course it isn't. So the question is, how do I get a docid out of a term? Or if I'm...
2015 Mar 14
2
range query for terms
First, thank you, xapian! Then I'd like to ask if it is possible to do a range query on terms (like the range query on values), or if it is just a wildcard (right truncation) match. The case is searching IP addresses between "10.10.0.0" and "10.10.255.255". The user wants: 1. query "10.10.10.10" < ip < "10.10.10.12" gives "10.10.10.11" 2. query
2004 Jan 30
0
Two apparent bugs in aov(y~ *** -1 + Error(***)), with suggested (PR#6510)
...e { ## helmert contrasts can be helpful: do we want to force them? ## this version does for the Error model. opcons <- options("contrasts") options(contrasts=c("contr.helmert", "contr.poly")) on.exit(options(opcons)) allTerms <- Terms errorterm <- attr(Terms, "variables")[[1 + indError]] eTerm <- deparse(errorterm[[2]], width = 500, backtick = TRUE) intercept <- attr(Terms, "intercept") ecall <- lmcall ecall$formula <- as.formula...
2004 Feb 02
0
Two apparent bugs in aov(y~ *** -1 + Error(***)), with (PR#6520)
...ontrasts can be helpful: do we want to force them? > ## this version does for the Error model. > opcons <- options("contrasts") > options(contrasts=c("contr.helmert", "contr.poly")) > on.exit(options(opcons)) > allTerms <- Terms > errorterm <- attr(Terms, "variables")[[1 + indError]] > eTerm <- deparse(errorterm[[2]], width = 500, backtick = TRUE) > intercept <- attr(Terms, "intercept") > ecall <- lmcall > ecall$formula <...
2010 Oct 08
1
Get a list of all terms in an indexed corpus
Hello, I have a corpus that I have indexed with xapian/xappy and I would now like to generate a corpus-specific list of stopwords. (This is a technical corpus, so a typical stopword list wouldn't be helpful.) My first thought was to ask the xapian database for a list of terms followed by their frequency. My intuition is that I could probably bring together a list of stopwords by examining
2015 Mar 29
1
range query for terms
...to an awful lot of terms - "*.*.*.10" potentially matches 16 million terms. With a value, there's only one thing to check for every candidate document. But if you only actually have a small number of IP addresses and really want to use terms, you can just iterate allterms from the Database object and build an OP_SYNONYM query from all the matching terms. In 1.2.x, that's exactly how OP_WILDCARD is implemented (in master OP_WILDCARD expansion is delayed until we process the Query tree, which means we can avoid creating Query objects for every te...
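The "iterate all terms and keep the matching ones" step suggested in this snippet might look roughly like the following. This is a sketch under assumptions: a plain Python list stands in for the database's term iterator, `terms_in_ip_range` is a hypothetical helper, and combining the selected terms into an OP_SYNONYM query is left out. The bounds are exclusive, following the `"10.10.10.10" < ip < "10.10.10.12"` example from the original question.

```python
import ipaddress


def terms_in_ip_range(all_terms, low, high):
    """Return the terms that parse as IP addresses strictly between low and high.

    Hypothetical helper: all_terms is any iterable of strings; here it
    stands in for iterating every term in the index.
    """
    lo = ipaddress.ip_address(low)
    hi = ipaddress.ip_address(high)
    matches = []
    for term in all_terms:
        try:
            addr = ipaddress.ip_address(term)
        except ValueError:
            # Not an IP address, so it cannot fall in the range.
            continue
        # Compare numerically rather than as strings, so that
        # "10.10.9.200" is not mistaken for a value above "10.10.10.10".
        if lo < addr < hi:
            matches.append(term)
    return matches


# Made-up term list for illustration.
all_terms = ["10.10.10.10", "10.10.10.11", "10.10.10.12", "10.10.9.200", "hello"]
selected = terms_in_ip_range(all_terms, "10.10.10.10", "10.10.10.12")
print(selected)
```

The selected terms are the ones that would then be OR-ed together into a single query, which is practical only when the number of distinct addresses in the index is small.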
2005 Feb 25
2
Bug in TermIterator::skip_to() ?
Hi all, I've been toying with xapian (mostly using the Python bindings) and I think I've hit a bug in the TermIterator::skip_to() method (or maybe in QuartzAllTermsList::skip_to()). I've attached a c++ source file that demonstrates the issue. In short, if you have a WritableDatabase, ask for the all-terms TermIterator with db.allterms_begin(), and then skip_to() a word that is itself a term, the iterator sometimes stays at the beginning of the term list....
2012 Apr 14
1
[xapian] a bug fixed in brass_database.cc
Hi all, I fixed a bug in brass_database.cc. The bug is: *FIXME: this should be done by checking memory usage, not the number of* *changes. We could also look at the amount of data the inverter object* *currently holds.* I also modified the simpleindex.cc so that it now supports batch files indexing. -- Weixian Zhou Department of Computer Science and Engineering University at Buffalo, SUNY
2001 Dec 17
1
environments again
In a previous message I was not clear enough in my query. I have the following program: tst<- function() { x <- c(32.7,32.3,31.5,32.1,29.7,29.1,35.7,35.9,33.1, 36.0,34.2,31.2,31.8,28.0,29.2,38.2,37.8,31.9, 32.5,31.1,29.7) g <- rep(1:7,rep(3,7)) s <- rep(1:3,7) cat(" Only x and g \n") aov1(x,g) cat("\n\n Now x, g and s \n") aov1(x,g,s=s) }