Olly (looking at commit logs, I think this is your dept :-)

For apps which re/index files frequently and need format conversion, I'd like to propose a patch for one of...

Omindex library (thread safe):

  Omindex::init(options) // struct Omindex::options { ... }
    initialize mime_map, store default options
  session = new Omindex::Session(db_pathname)
    user threads use different sessions
  session.index_files(list, options) // list & return value are vector of { char * url, * file_path, * file_ext }
    perform a transaction for all files in list; create & return skip_list
  session.index_directory(url, dir_name, options)
    perform a transaction for all files in a directory tree; return skip_list

  main() moves to omindex_main.cc
    process command line, call Omindex::init(), proceed normally

  SWIG & Node.js bindings

Omindex daemon mode:

  The initial directory pass is optional.
  Listen on a domain socket; for each connection, start a thread with a WritableDatabase and read JSON-formatted messages.
  Perform a transaction for each message.
  Respond with a skipped list.

The library is more flexible, I think... Would you accept a patch for this?
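For readers following the thread, the proposed library interface could be sketched roughly as below. This is only an illustration of the shape described in the message: the fields of `Options`, the `FileEntry` struct name, `default_options()` and `indexed_count()` are assumptions, and the sketch makes no Xapian calls (comments mark where a real transaction would begin and commit).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

namespace Omindex {

// Hypothetical options struct; the proposal leaves its fields unspecified.
struct Options {
    std::map<std::string, std::string> mime_map;  // file extension -> MIME type
};

// One entry of the input list and of the returned skip list.
struct FileEntry {
    std::string url;
    std::string file_path;
    std::string file_ext;
};

// Process-wide defaults, meant to be set once by init() before threads start.
inline Options& default_options() {
    static Options opts;
    return opts;
}

inline void init(const Options& options) {
    default_options() = options;
}

// One session per user thread. A real version would own a
// Xapian::WritableDatabase opened on db_pathname.
class Session {
public:
    explicit Session(std::string db_pathname)
        : db_pathname_(std::move(db_pathname)) {
        assert(!db_pathname_.empty());
    }

    // Handle every file in `list` as one transaction; return the skipped ones.
    std::vector<FileEntry> index_files(const std::vector<FileEntry>& list,
                                       const Options& options) {
        std::vector<FileEntry> skip_list;
        // A real version would call begin_transaction() here.
        for (const FileEntry& f : list) {
            if (options.mime_map.count(f.file_ext) == 0) {
                skip_list.push_back(f);  // unknown format: report it back
                continue;
            }
            ++indexed_;  // placeholder for format conversion + indexing
        }
        // A real version would call commit_transaction() here.
        return skip_list;
    }

    std::size_t indexed_count() const { return indexed_; }

private:
    std::string db_pathname_;
    std::size_t indexed_ = 0;
};

}  // namespace Omindex
```

A caller would run `Omindex::init()` once, create one `Session` per thread, and pass each batch of files to `index_files()`, retrying or logging whatever comes back in the skip list.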
On Mon, Oct 17, 2011 at 10:25 PM, Liam <xapian at networkimprov.net> wrote:

> Olly (looking at commit logs, I think this is your dept :-)
>
> For apps which re/index files frequently and need format conversion, I'd
> like to propose a patch for one of...
>
> Omindex library (thread safe):
>
>   Omindex::init(options) // struct Omindex::options { ... }
>     initialize mime_map, store default options
>   session = new Omindex::Session(db_pathname)
>     user threads use different sessions
>   session.index_files(list, options) // list & return value are vector of
>   { char * url, * file_path, * file_ext }
>     perform a transaction for all files in list; create & return skip_list
>   session.index_directory(url, dir_name, options)
>     perform a transaction for all files in a directory tree; return
>     skip_list
>
>   main() moves to omindex_main.cc
>     process command line, call Omindex::init(), proceed normally
>
>   SWIG & Node.js bindings
>
> Omindex daemon mode:
>
>   The initial directory pass is optional.
>   Listen on a domain socket; for each connection, start a thread with a
>   WritableDatabase and read JSON-formatted messages.
>   Perform a transaction for each message.
>   Respond with a skipped list.
>
> The library is more flexible, I think... Would you accept a patch for this?

Hoping for some feedback on this...

Also realized that the Omindex::Session object could be a subclass of WritableDatabase.

My aim here is to leverage the format conversion and other logic in omindex without requiring a process invocation and a recursive directory scan.
James Aylett
2011-Oct-23 18:40 UTC
[Xapian-discuss] patch proposal: omindex library or daemon
On 17 Oct 2011, at 22:25, Liam wrote:

> For apps which re/index files frequently and need format conversion, I'd
> like to propose a patch for one of...
>
> Omindex library (thread safe)
> Omindex daemon mode

Omindex doesn't do a huge amount; there's a fairly simple system for figuring out which external helper to use for a given file (which is mostly libmagic and a MIME type to helper map), and there's some infrastructure for defending against runaway helpers which would probably want to be more flexible or even totally different in a library context, and wouldn't even be quite the same in a daemon context. The rest is a directory tree walker, a couple of internal handlers, and a pretty small amount of code that directly interfaces with Xapian.

I guess I'm not entirely sure which bits of omindex you think are most valuable to pull into other systems.

Given the niche use of it, if this was interesting I'd probably support a refactor and intermediate build step that generates a library used to create omindex, rather than installing another dynamic library for all users who won't need this. That would enable you to reuse bits of omindex as needed.

> SWIG & Node.js bindings

Certainly we'd be happy to see work done to make Xapian available to Node users - that would be awesome. I know very little about Node & V8, but I assume we'd be looking at object & function templates in V8 to gain access to Xapian, and then some Node helpers to make it easier for developers to use in an idiomatic fashion?

J

-- 
 James Aylett
 talktorex.co.uk - xapian.org - devfort.com - spacelog.org
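The "MIME type to helper map" mentioned in this reply can be pictured as a small lookup table. The sketch below is an assumption about its general shape, not omindex's actual code: the `HelperMap` class and its method names are hypothetical, and in omindex the MIME type would come from the file extension or libmagic rather than being passed in directly.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical lookup table from MIME type to external helper command.
class HelperMap {
public:
    // Register the command used to convert files of the given MIME type.
    void add(const std::string& mime_type, const std::string& command) {
        assert(!mime_type.empty() && !command.empty());
        helpers_[mime_type] = command;
    }

    // Return the helper command, or an empty string when no helper is
    // known and the file should go on the skip list.
    std::string find(const std::string& mime_type) const {
        auto it = helpers_.find(mime_type);
        return it == helpers_.end() ? std::string() : it->second;
    }

private:
    std::map<std::string, std::string> helpers_;
};
```

A library built from omindex's sources could expose a table like this so callers can register their own helpers, which is one of the reusable pieces the refactor suggested above would make available.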