Kevin SoftDev
2006-Feb-25 20:21 UTC
[Xapian-discuss] How to index html pages with real URL links
Xapians,

I was able to index my web pages sitting on hard disk using Omega. The performance is stunning, something I have not seen before, and I have done substantial work with Lucene, MySQL 5.0 FreeText and MS SQL 2005 FreeText.

However, I was not able to figure out how to index HTML pages so that the stored link is a meaningful URL like http://pacific-design.com/programming.html instead of the hard disk location like /home/kevin/public_html/programming.html

Thanks,
Kevin
Olly Betts
2006-Feb-25 22:14 UTC
[Xapian-discuss] How to index html pages with real URL links
On Sat, Feb 25, 2006 at 12:21:33PM -0800, Kevin SoftDev wrote:
> However I was not able to figure out how to index html pages and instead of
> having stored link from hard disk location like
> /home/kevin/public_html/programming.html get some meaningful URL link
> instead like http://pacific-design.com/programming.html

Just tell omindex the URL which corresponds to the start directory, e.g.:

    omindex --db /path/to/database --url http://pacific-design.com \
        /home/kevin/public_html

Or if the search itself runs on pacific-design.com, you might prefer to omit the hostname from all URLs:

    omindex --db /path/to/database --url / /home/kevin/public_html

Cheers,
    Olly
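
For readers who want to see the mapping spelled out, here is a minimal Python sketch of the idea behind --url: the path of each file relative to the start directory is appended to the base URL. This is an illustration only, not omindex's actual code; the function name file_to_url and its parameters are hypothetical, and the percent-encoding step is an assumption about how special characters would typically be handled.

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def file_to_url(file_path, start_dir, base_url):
    """Hypothetical helper: map a file under start_dir to a URL under base_url."""
    # Path of the file relative to the indexed start directory.
    # relative_to() raises ValueError if file_path is not under start_dir.
    rel = PurePosixPath(file_path).relative_to(start_dir)
    # Join with exactly one slash and percent-encode unsafe characters
    # (quote() leaves "/" unescaped by default, so subdirectories survive).
    return base_url.rstrip("/") + "/" + quote(rel.as_posix())


if __name__ == "__main__":
    path = "/home/kevin/public_html/programming.html"
    root = "/home/kevin/public_html"
    print(file_to_url(path, root, "http://pacific-design.com"))
    print(file_to_url(path, root, "/"))
```

With the paths from this thread, the first call corresponds to the --url http://pacific-design.com case and the second to the host-relative --url / case.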