Hi.

I am writing a Python equivalent of omindex. We are currently using scriptindex, but I wanted to use omindex instead and extend it to work with our internal file format, without having to compile code if possible.

I have tried to keep the code as close as possible to the native omindex code, but I am a bit confused: why exactly does omindex take a base directory AND a subdirectory? In other words, what scenario cannot be covered by just passing in a directory and a corresponding mapping to the URL?

* I _know_ I read something about this somewhere, but could not find it now that I need it. If someone has a link, that would help as well.
** I am wondering if the tool would be interesting/usable to anyone else - please let me know if so. Right now I am only aiming for HTML support, but once this works, adding support for other file formats should be as easy as adding it to the native omindex, if not more so (I hope ;) )

Thanks,
Srijon.
Srijon,

I'd be happy to work on a Python omindex.
On Tue, May 19, 2009 at 02:00:21PM +0100, Srijon Biswas wrote:
> I have tried to keep the code as close to possible to the omindex native
> code, but am facing a bit of confusion: what exactly is the reason for
> omindex to take in a base directory AND a subdirectory. In other words, what
> scenario is possible that cannot be covered by just passing in a directory
> and a corresponding mapping to the url.

This is the `sites` concept of omindex. I believe these days it's mainly useful when your URI space maps onto several different parts of your file system. You can also use it to index multiple different websites (i.e. on different hostnames) in one database.

I think it also makes it a little easier to generate correct URIs, although this could be done in another way if it were the only issue.

J

-- 
James Aylett
talktorex.co.uk - xapian.org - uncertaintydivision.org
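For anyone following the thread, the mapping James describes could be sketched in Python roughly as below. This is only an illustration of the idea, not how omindex itself is implemented: the helper names (`url_for_file`, `url_for_any_site`), the `SITES` list, and every path and hostname in it are made up for the example.

```python
from pathlib import PurePosixPath
from urllib.parse import quote

# Hypothetical "sites": each URL prefix is served from a different
# part of the file system, possibly on a different hostname.
SITES = [
    ("http://www.example.org/press/", "/data/press-releases"),
    ("http://www.example.org/", "/srv/www/example"),
    ("http://docs.example.org/", "/home/docs/public_html"),
]


def url_for_file(base_url, base_dir, file_path):
    """Map a file under base_dir to the matching URL under base_url.

    Raises ValueError if file_path is not inside base_dir.
    """
    rel = PurePosixPath(file_path).relative_to(PurePosixPath(base_dir))
    # Percent-encode the relative path; "/" is kept as the separator.
    return base_url.rstrip("/") + "/" + quote(rel.as_posix())


def url_for_any_site(sites, file_path):
    """Try each (base_url, base_dir) pair in order and use the first match."""
    for base_url, base_dir in sites:
        try:
            return url_for_file(base_url, base_dir, file_path)
        except ValueError:
            continue
    raise ValueError(f"{file_path} is not under any configured site")


if __name__ == "__main__":
    print(url_for_any_site(SITES, "/data/press-releases/2009/may.html"))
    print(url_for_any_site(SITES, "/home/docs/public_html/faq.html"))
```

With a single directory-to-URL mapping, the three trees above would need three separate databases or a common parent directory; with one (URL, directory) pair per site they can all share one database.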
Hi Srijon,

> I have tried to keep the code as close to possible to the omindex native
> code, but am facing a bit of confusion: what exactly is the reason for
> omindex to take in a base directory AND a subdirectory. In other words, what
> scenario is possible that cannot be covered by just passing in a directory
> and a corresponding mapping to the url.
>
> * I _know_ there was something I read somewhere about this - but could not
> find it when I needed it (now). If someone has a link, that would help as
> well.

As James said, it's the "sites" concept of omindex, which is described on the Omega overview page: http://www.xapian.org/docs/omega/overview.html

I actually used this to split the indexing of large directory trees. When omindex had to process a large tree in one single pass, it ate up all my computer's memory and then failed to index some documents. Indexing one subdirectory at a time avoided that, and it proved useful.

Eric

ATIS Uher S.A.
CH 2046 Fontaines
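One way a Python port could reproduce the splitting Eric describes is to turn each top-level subdirectory into its own (base directory, subdirectory) pass. The sketch below is an assumption about how that might look, not Eric's actual setup; `indexing_passes` is a hypothetical helper, and files sitting directly in the base directory would still need a pass of their own.

```python
import os


def indexing_passes(base_dir):
    """Return (base_dir, subdir) pairs, one per top-level subdirectory.

    Each pair is meant to be indexed in a separate run, so that no single
    run has to walk the whole tree.
    """
    with os.scandir(base_dir) as entries:
        # Only real directories; symlinks are skipped to avoid loops.
        subdirs = sorted(
            entry.name for entry in entries
            if entry.is_dir(follow_symlinks=False)
        )
    return [(base_dir, name) for name in subdirs]


if __name__ == "__main__":
    # Each pair would be handed to one indexing run.
    for base, sub in indexing_passes("."):
        print(f"pass: base={base!r} subdir={sub!r}")
```

Because every pass keeps the same base directory, the URL generated for a file stays the same whether the tree is indexed in one run or in several.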
Hi Eric, James.

Apparently (and obviously) I did not look hard enough :) I was searching the wiki for this. Thanks for the link and the explanations.

Thanks,
Srijon.

---------- Forwarded message ----------
From: "Eric Voisard" <eric.voisard at atisuher.ch>
To: <xapian-discuss at lists.xapian.org>
Date: Wed, 20 May 2009 14:27:41 +0200
Subject: Re: [Xapian-discuss] omindex options