Gaurav, James and I are attending the GSoC reunion in San Jose, California, USA this weekend, so if anyone else is around, do come and say hello.

We're also going to be holding a documentation sprint on Monday and Tuesday next week in San Francisco. If anyone else wants to take part (remotely would work) let me know. The only plan we have so far is to come up with a plan over the weekend.

Cheers,
Olly
On Thu, Oct 23, 2014, at 03:16 PM, Olly Betts wrote:
> Gaurav, James and I are attending the GSoC reunion in San Jose,
> California, USA this weekend, so if anyone else is around, do come and
> say hello.
>
> We're also going to be holding a documentation sprint on Monday and
> Tuesday next week in San Francisco. If anyone else wants to take
> part (remotely would work) let me know. The only plan we have so far
> is to come up with a plan over the weekend.

Since I'm in town for once in forever, I'd love to meet up just to say hello - and also to see if we can get the FastMail patches for snippet support either into Xapian or an alternative implementation that does what's needed.

Neil and I from FastMail are around for the Inbox Love conference on Wednesday.

I've just been in Pittsburgh (sitting in the airport on my way out now) working on the Cyrus IMAPd 2.5 release, and the major thing blocking us adding full text search based on Xapian was that it requires a patched/forked version of the library, and we don't want to force that on people - so we've punted the whole feature down the road.

Xapian works fantastically for FastMail, but we can't really expect everyone to run patched versions with no upgrade path - so I'd like to figure out a path to getting back onto mainline. Unfortunately Greg, who wrote the patches, isn't working at FastMail any more - so I'm stuck maintaining it myself!

Cheers,
Bron.

--
Bron Gondwana
brong at fastmail.fm
On Thu, Oct 23, 2014 at 04:58:49PM -0400, Bron Gondwana wrote:
> Since I'm in town for once in forever, I'd love to meet up just to say
> hello - and also to see if we can get the FastMail patches for snippet
> support either into Xapian or an alternative implementation that does
> what's needed.

So the current state is that Mihai's GSoC snippet project is merged to trunk. This builds a language model from the top ranked documents and then uses that to select segments of text to form a snippet.

Gaurav's been doing some testing of this on a real system, and we found that it can select text containing none of the query terms, and this seems to happen a bit too often. So he tried adding some code to select based on both the language model and query terms, which worked quite nicely, but the speed wasn't so great. And while profiling, he found that dropping the language model completely didn't affect the snippets much, but was significantly faster.

We haven't quite decided where this leaves us (we only got to this point about a week ago) - the language model is conceptually a nice idea, and I'm generally a fan of approaches with a sound theoretical basis, but if it takes significant time without giving the sort of snippets we want, it's not a good solution. I've not looked at Greg's snippet patches for a while, but perhaps they're actually a better starting point after all.

> Neil and I from FastMail are around for the Inbox Love conference on
> Wednesday.

OK, I'll talk to you off list to see if we can arrange something.

Cheers,
Olly
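
For readers following the thread, the query-term-only selection described above (choosing a text segment purely by which query terms it contains, with no language model) might look roughly like the sketch below. This is an illustration only, not the code from Xapian trunk, Gaurav's experiments, or Greg's patches: the function name `best_snippet`, the fixed word window, and the "count distinct query terms" score are all assumptions made for the example.

```python
import re


def best_snippet(text, query_terms, window=12):
    """Return (snippet, score) for the best fixed-size word window.

    Hypothetical helper: the score is simply the number of distinct
    query terms found in the window; no language model is involved.
    """
    words = text.split()
    terms = {t.lower() for t in query_terms}
    # Normalise words for matching: strip surrounding punctuation, lowercase.
    norm = [re.sub(r"^\W+|\W+$", "", w).lower() for w in words]

    best_start, best_score = 0, -1
    # Slide the window over the text; always evaluate at least one window.
    for start in range(max(1, len(words) - window + 1)):
        seen = {w for w in norm[start:start + window] if w in terms}
        if len(seen) > best_score:  # strict '>' keeps the earliest best window
            best_start, best_score = start, len(seen)

    return " ".join(words[best_start:best_start + window]), best_score


if __name__ == "__main__":
    doc = "The quick brown fox jumps over the lazy dog near the river bank today"
    print(best_snippet(doc, ["lazy", "river"], window=5))
```

A score of 0 means no window contained a query term, which is the case the thread describes as happening too often with the language-model approach; a caller could use that to fall back to the start of the document.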