Hello,

I am trying to build a script in Python to add documents to a database.
After that I am trying to use Omega to search this database, but Omega
cannot retrieve any information from it. I have read omindex.cc, but I
think I am doing something wrong or something is missing.

This is the script I am using to index:

------- index.py -------
import xapian
db = xapian.WritableDatabase("teste01", xapian.DB_CREATE_OR_OPEN)
doc = xapian.Document()

record = """caption=Test page
sample=This is a test
size=4554
url=http://www.test.com
"""

doc.set_data(record)
doc.add_term("Ttext/html")
doc.add_term("Hhttp://www.test.com")

doc.add_posting(record, 1)
db.add_document(doc)
------ EOF -------

When I read the document back with the get_data() function, I can
retrieve the following information:

---
caption=Test page
sample=This is a test
size=4554
url=http://www.test.com
---

But Omega doesn't return anything when I submit a 'test' search. Can
someone point out where the mistake is? Am I using the add_term function
the right way?

Thanks

Christiano

On Wed, 2005-08-10 at 15:12 -0300, Christiano Anderson wrote:
> Hello,
>
> I am trying to build a script in Python to add documents to a database.
> After that I am trying to use Omega to search this database, but
> Omega cannot retrieve any information from it. I have read omindex.cc,
> but I think I am doing something wrong or something is missing.
>
> This is the script I am using to index:
>
> ------- index.py -------
> import xapian
> db = xapian.WritableDatabase("teste01", xapian.DB_CREATE_OR_OPEN)
> doc = xapian.Document()
>
> record = """caption=Test page
> sample=This is a test
> size=4554
> url=http://www.test.com
> """
>
> doc.set_data(record)
> doc.add_term("Ttext/html")
> doc.add_term("Hhttp://www.test.com")
>
> doc.add_posting(record, 1)
> db.add_document(doc)
> ------ EOF -------
>
> When I read the document back with the get_data() function, I can
> retrieve the following information:
>
> ---
> caption=Test page
> sample=This is a test
> size=4554
> url=http://www.test.com
> ---

This _looks_ o.k. (but make sure your newlines are really '\n').

> But Omega doesn't return anything when I submit a 'test' search.

How do you perform your 'test' search? Testing with Omega is a bit
tricky, since Omega will parse your query string (and most likely stem
the terms ...). Why don't you add:

  doc.add_term("Rtuxtux")

and then search for "Tuxtux"? Note: the capital letter will force the
query parser to treat the term as a "raw" term and prepend the 'R'
prefix (read omega/docs/termprefixes.txt if this isn't clear).

HTH

Ralf Mattes

> Can someone point out where the mistake is? Am I using the add_term
> function the right way?
>
> Thanks
>
> Christiano
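
The raw-term convention described in this reply can be written down as a tiny helper. This is only a sketch of the mapping as stated above (capitalised query word becomes the lowercased word with an 'R' prefix). The function name raw_term is hypothetical and not part of Xapian or Omega.

```python
def raw_term(word):
    """Return the 'R'-prefixed term for a capitalised query word.

    Assumption taken from the reply above: the word is lowercased
    and the 'R' prefix is prepended. This helper is illustrative only.
    """
    return "R" + word.lower()


# The term to add to the document so that a search for "Tuxtux" can match.
probe = raw_term("Tuxtux")
print(probe)  # prints Rtuxtux
```

The resulting string is what would be passed to doc.add_term() in index.py.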

On Wed, Aug 10, 2005 at 03:12:12PM -0300, Christiano Anderson wrote:
> This is the script I am using to index:
>
> ------- index.py -------
> import xapian
> db = xapian.WritableDatabase("teste01", xapian.DB_CREATE_OR_OPEN)
> doc = xapian.Document()
>
> record = """caption=Test page
> sample=This is a test
> size=4554
> url=http://www.test.com
> """
>
> doc.set_data(record)
> doc.add_term("Ttext/html")
> doc.add_term("Hhttp://www.test.com")
>
> doc.add_posting(record, 1)
> db.add_document(doc)
> ------ EOF -------

You don't want to add the entire record as a single posting; you should
split it into individual terms first. As written, the index contains one
term equal to the whole record, so a query for 'test' has nothing to
match. See indextext.cc in the omega distribution for an example
algorithm that fits well with Xapian::QueryParser (xapian.QueryParser in
Python :-).

J

-- 
/--------------------------------------------------------------------------\
  James Aylett                                                  xapian.org
  james@tartarus.org                               uncertaintydivision.org
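
A minimal sketch of the "split it into individual terms" advice, in Python. It is not the algorithm from indextext.cc: the tokenising rule (lowercased runs of ASCII letters and digits, no stemming) is a simplifying assumption, and the helper names iter_postings and index_text are hypothetical. The only Xapian call it relies on is Document.add_posting(term, position), which index.py already uses.

```python
import re

# Assumed tokenising rule: runs of ASCII letters and digits.
WORD_RE = re.compile(r"[A-Za-z0-9]+")


def iter_postings(text, start_pos=1):
    """Yield (term, position) pairs, one per word in text."""
    for offset, match in enumerate(WORD_RE.finditer(text)):
        yield match.group().lower(), start_pos + offset


def index_text(doc, text, start_pos=1):
    """Add one posting per word to doc and return the next free position.

    doc is any object with an add_posting(term, position) method,
    such as a xapian.Document.
    """
    count = 0
    for term, pos in iter_postings(text, start_pos):
        doc.add_posting(term, pos)
        count += 1
    return start_pos + count
```

In index.py this would replace doc.add_posting(record, 1) with something like pos = index_text(doc, "Test page") followed by index_text(doc, "This is a test", pos), so that 'test' becomes a term of its own. Whether the stored terms also need stemming to match what Omega's query parser produces is left open here.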