F. Bos
2005-Aug-04 21:07 UTC
[Xapian-discuss] Using Omega and/or Xapian and how to get started
Hello,

I'm totally new to Xapian/Omega. I've spent the last couple of hours reading any documentation I could find, as well as quite a few postings from the mailing list archives. I'm interested in using Xapian/Omega in combination with a website to optimize searching. However, I need some help to get started.

I've been building my website using PHP/HTML, with MySQL as the database system. On the website I'm building there will be several kinds of documents that can be searched:

- Forums
- User pages (containing information about a site user)
- Group pages (containing information on a group of users on the site)

All these pages are created dynamically using PHP/MySQL. Almost all the information on these pages is gathered from the database at the moment a visitor opens a certain page. I have considered searching directly in the MySQL database, but I found that the processing time would be unacceptable due to the large sets of data. So I want to use Xapian/Omega instead.

My question now is: what's the best approach for setting up a search engine with a Xapian/Omega backend?

So far I've learned that I need to get all the data I want to search into a Xapian database (in documents), indexed in a way that I can optimize my search demands. I've looked at the PHP examples that are included with the Xapian bindings and I do roughly understand how these work. However, I don't understand when the index script will have to run. For example: do I need to run an index script directly after I insert a new forum post, in order to insert the search terms that are in the post into the Xapian database and for the post to become searchable? Won't this slow down the posting procedure a lot?

Another thing I don't understand (this probably sounds stupid) is that when I use the Xapian PHP bindings/functions to index and search, I don't see where Omega comes in?! Or can I also use Omega from within PHP? I don't think I've really got the hang of how Omega interacts with Xapian and how Omega interacts with PHP (if the latter is the case at all).

Last question: does Xapian need a dedicated server, or can I also get reasonable performance when I have the MySQL database, the Xapian database and the web server software on the same server?

I know these questions are rather basic, but I hope that someone can answer them so that I can get started. As I said, I'm totally new to this but I'm very anxious to learn.

Thanks,
Floris
James Aylett
2005-Aug-05 11:14 UTC
[Xapian-discuss] Using Omega and/or Xapian and how to get started
On Thu, Aug 04, 2005 at 10:06:49PM +0200, F. Bos wrote:

> So far I've learned that I need to get all the data I want to search in a
> Xapian database (in documents), indexed in a way that I can optimize my
> search demands. I've looked at the PHP examples that are included with the
> Xapian bindings and I do roughly understand how these work. However I don't
> understand when the index script will have to run. For example: do I need to
> run an index script directly after I inserted a new forum post in order to
> insert the search terms that are in the post in the Xapian database and for
> the post to become searchable? Won't this slow down the posting procedure a
> lot?

Hi, Floris. You have a choice: either you update the Xapian database immediately on a new forum post (or similar), or you batch up the changes and apply them later. The former will slow down the posting procedure a little, but may not necessarily slow it down a lot - it depends on a whole range of things. It's probably easier to implement than batched processing, however. So unless you're concerned that your site gets high volume and might not be able to sustain an 'online' approach (updating the database as part of the post operation), you might want to try this first. Much of the code could be reused in a batch approach if that became necessary later.

> Another thing I don't understand (this probably sounds stupid) is that when
> I use the Xapian PHP bindings/functions to index and search, I don't see
> where Omega comes in?! Or can I also use Omega from within PHP? I think I
> don't really got the hang of how Omega interacts with Xapian and how Omega
> interacts with PHP (if the latter is the case at all).

You don't have to use Omega if you're using Xapian directly from PHP - indeed, there isn't much point. However, you can quite happily use Omega from PHP without directly accessing Xapian; this has a number of advantages, mostly that Omega has a whole load of features you'd otherwise have to implement for yourself. If you want to go down that route, PHP would call Omega to generate a file that contains the results (perhaps as XML), and would then process it into the form to be displayed to the user. Indexing in this scenario would probably be best done by creating scriptindex input files and running scriptindex on them, either from PHP or in a batch fashion from a cron job or similar.

> Last question: does Xapian need a dedicated server or can I also get
> reasonable performance when I have the Mysql database, the Xapian database
> and the web server software on the same server?

Again, it depends on a lot of factors. You should be able to get pretty far with everything on the one server, but it really depends on the machine. If it's a recent Intel or AMD-based server, even at the lower end, you should do fine - as with many performance situations, you have to be careful not to optimise too soon. Often, upgrading a single system will give you better performance than splitting onto multiple systems, in the short term; and if your machine is powerful enough to cope with initial demand all by itself, splitting across multiple machines will slow things down until you get quite a bit more demand.

There's a document on Xapian scalability that may prove helpful: <http://xapian.org/docs/scalability.html>.

Hope this helps,
James
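As a rough illustration of the "scriptindex input files" route mentioned above, here is a minimal sketch of how a batch job might dump new posts into that format. It is written in Python rather than PHP purely for illustration. It assumes the scriptindex input format as described in the Omega documentation: one `name=value` line per field, a blank line between records, and an `=` inserted after each newline inside a value. The field names (`id`, `title`, `body`) and the sample posts are hypothetical; the index script that tells scriptindex what to do with each field is not shown.

```python
import io


def format_record(fields):
    """Format one record as scriptindex input text.

    Each field becomes a 'name=value' line. A newline inside a value is
    followed by '=', so continuation lines start with '='.
    """
    lines = []
    for name, value in fields.items():
        if "=" in name or "\n" in name:
            raise ValueError("invalid field name: %r" % (name,))
        # Normalise line endings before escaping embedded newlines.
        text = str(value).replace("\r\n", "\n").replace("\r", "\n")
        lines.append(name + "=" + text.replace("\n", "\n="))
    return "\n".join(lines) + "\n"


def write_dump(records, out):
    """Write records to a file-like object, separated by blank lines."""
    for i, record in enumerate(records):
        if i:
            out.write("\n")  # a blank line separates records
        out.write(format_record(record))


if __name__ == "__main__":
    # Hypothetical rows, e.g. forum posts fetched from MySQL since the last run.
    new_posts = [
        {"id": "post-1", "title": "Hello", "body": "line one\nline two"},
        {"id": "post-2", "title": "Second", "body": "x"},
    ]
    buf = io.StringIO()
    write_dump(new_posts, buf)
    print(buf.getvalue(), end="")
```

A cron job could write such a dump to a file and then run scriptindex on it against the database, which keeps the indexing cost out of the posting request itself.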
Floris Bos
2005-Aug-09 19:17 UTC
[Xapian-discuss] Using Omega and/or Xapian and how to get started
Thanks again! Things are starting to make sense to me, and I hope I'll get a chance to test the things you suggest. I seem to have problems installing Xapian/Omega; I didn't think installing would be a problem, but apparently it is. I think I'll post a topic on that too, and I hope some of the more experienced Xapian/Omega users can help me out there.