On Thu, Dec 13, 2012 at 08:05:38AM -0500, Will Partain wrote:
> Hi, all -- I want to do Plain Old Omindex'ing *but* the mapping
> between my documents' filenames and the URLs where I hope search
> users to find them is, uh..., strange. The simplest thing (to
> me) would be to run omindex for each document, e.g.
>
> omindex --no-delete -U /cool-url-1 /funky/doc/file-blah.pdf
> omindex --no-delete -U /cool-url-7 /doc/funky/ohmy/blah-file.txt
> ... and so on...
>
> Of course, this doesn't work because the pathnames don't signify
> directories. I'm guessing the same thing can be done with
> 'scriptindex' -- but I really want what just plain old omindex
> does.

Running omindex once for each document will be slow. If you have a lot
of documents, you really want to batch updates for good indexing
performance.

> A horrible? way might be to copy each document into a temp
> directory and run omindex -- but I'm guessing the URLs would come
> out wrong (it would append the filename onto the end).

I'd just symlink them all into a temporary directory structure and use
-f so omindex will follow the symlinks - e.g.:

$ mkdir tmp
$ ln -s /home/olly/git/survex/doc/manual.pdf tmp/cool-url-1
$ ln -s /home/olly/tmp.txt tmp/cool-url-7
$ ./omindex --db cool-url.db -f tmp
$ delve cool-url.db -1a|grep U
U/cool-url-1
U/cool-url-7
This will work so long as your omindex was built with libmagic (which is
optional in 1.2.x, but a hard requirement on trunk) and libmagic can
detect the filetype from the contents of the file.
Cheers,
Olly