jarrod roberson
2006-Jan-19 18:44 UTC
[Xapian-discuss] Python Binding - match[xapian.MSET_DOCUMENT].get_data() doesn't return anything!
I am working on creating an OS X Spotlight-like application. The first task is to index fully qualified paths; I want to be able to search for filenames first, as an exercise to learn Xapian and the Python bindings.

I tried using Xapwrap by divmod.org, but that didn't pan out: I could not get the actual data back after a search. A search would return a document uid, but I never could get .get_document().get_data() to return anything. So I decided to just use the "raw" Python bindings, and tried the simpleindex and simplesearch Python example programs.

I think that in both cases (Xapwrap and the default Xapian bindings) indexing is happening, but I can't really tell, because I can't get any search results to confirm anything. With the Xapian Python bindings used directly, I can't get the search to work. Granted, the simplesearch example program is broken, so I am groping in the dark on how to get the search to return a list of documents and have get_data() actually return something.

I guess what I need is some simple example code that will allow me to do the following.
Given some data like

/this/is/a/fully/qualified/path/to/a/filename

how do I create a document and add it to an index so that I can search for it by 'filename'?

This is what I am doing to create documents and add them to the index:

#!/usr/bin/python
# indexer.py
import sys
import xapian

# setup the file to index
fileToIndex = sys.argv[1]
if len(sys.argv) >= 3:
    maxRecordsToIndex = int(sys.argv[2])
else:
    maxRecordsToIndex = 0
recordCount = -1

# setup the xapian database
try:
    db = xapian.WritableDatabase('/tmp/index', xapian.DB_CREATE_OR_OPEN)
    # index the file
    for line in file(fileToIndex):
        doc = xapian.Document()
        doc.set_data(line)
        db.add_document(doc)
        # my input file is 70GB of data, this is to make testing faster
        recordCount = recordCount + 1
        if maxRecordsToIndex > -1 and recordCount >= maxRecordsToIndex:
            break
        elif recordCount % 1000 == 0:
            print 'print processed %s records so far!' % recordCount
    print 'processed %s records' % recordCount
except Exception, e:
    print'Exception: %s' % str(e)
    sys.exit(1)

And this is what I am doing to try to get the data back from a search. The problem is that I can't get it to find anything. Given the example data above, when I run

python searcher.py /tmp/index filename

I get 0 records found!

#!/usr/local/bin/python
# searcher.py
import sys
import xapian

if len(sys.argv) < 3:
    print "usage: %s <path to database> <search terms>" % sys.argv[0]
    sys.exit(1)

try:
    database = xapian.Database(sys.argv[1])
    enquire = xapian.Enquire(database)
    query = xapian.Query(sys.argv[2])
    print "Performing query `%s'" % query.get_description()
    enquire.set_query(query)
    matches = enquire.get_mset(0, 10)
    print "%i results found" % matches.get_matches_estimated()
    for match in matches:
        print "ID %i %i%% [%s]" % (match[xapian.MSET_DID], match[xapian.MSET_PERCENT], match[xapian.MSET_DOCUMENT].get_data())
except Exception, e:
    print "Exception: %s" % str(e)
    sys.exit(1)
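
One possible reason for the empty result set, offered as a sketch and not as a confirmed diagnosis: Xapian matches a query against the terms attached to each document, while the string passed to set_data() is stored opaquely and is not itself searched. The indexer above never attaches any terms, so a query for 'filename' would have nothing to match. The Python below shows one way to derive terms from a path and attach them with Document.add_posting(). The helper names path_terms and make_document are hypothetical, the split-on-non-alphanumerics and lowercasing rule is an assumption and not Xapian's own tokenization, and the xapian module is passed in as a parameter only so that the sketch can be read and run without the bindings installed.

```python
import re


def path_terms(path):
    """Split a path into lowercase terms, one per run of letters or digits.

    Assumption: this tokenization is only an illustration; any consistent
    scheme works as long as queries use the same form of the term.
    """
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", path)]


def make_document(xapian, path):
    """Build a document that keeps the path as data and is findable by term.

    `xapian` is the imported xapian module (hypothetical calling convention,
    chosen so this file does not need the bindings to be importable).
    """
    path = path.rstrip("\n")      # lines read from a file end with a newline
    doc = xapian.Document()
    doc.set_data(path)            # what get_data() returns after a match
    for position, term in enumerate(path_terms(path)):
        doc.add_posting(term, position)   # what a query is matched against
    return doc


if __name__ == "__main__":
    print(path_terms("/this/is/a/fully/qualified/path/to/a/filename"))
```

Under these assumptions the indexer loop would call db.add_document(make_document(xapian, line)) in place of building the document inline, and the searcher would need to pass the term in the same lowercase form, for example xapian.Query('filename').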